Naming conventions
As your project grows, keeping things organized becomes more important. A clearly organized project makes it easier to find the workflows, pipelines and other project artefacts, and makes your project easier to maintain overall.
Your naming convention should cover all aspects of your project. For Apache Hop, that means conventions for your workflows, pipelines, transforms, actions and metadata items. There’s more to your project than just Apache Hop, though, and the other areas are no exception: input and output files, database tables and so on will be a lot easier to manage if they are named clearly, cleanly and consistently.
For larger projects, a formal naming conventions document helps to centrally manage the naming conventions, and helps to avoid confusion when different team members use their own naming conventions interchangeably.
A naming convention should be maintained, updated, enforced and verified periodically. Consider automating the validation of your naming conventions, e.g. through scripts in commit hooks.
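As a minimal sketch of such an automated check, the Python function below flags Hop pipeline (.hpl) and workflow (.hwf) files whose file name breaks a naming rule. The rule itself (lowercase words separated by hyphens or underscores) and the function name are assumptions made for illustration, not a convention prescribed by Apache Hop.

```python
import re
from pathlib import Path

# Hypothetical convention: lowercase words separated by hyphens or underscores.
NAME_PATTERN = re.compile(r"^[a-z0-9]+(?:[-_][a-z0-9]+)*$")

# File extensions of Hop pipelines and workflows.
HOP_EXTENSIONS = {".hpl", ".hwf"}


def find_badly_named_files(paths):
    """Return the Hop files in `paths` whose base name breaks NAME_PATTERN."""
    violations = []
    for raw in paths:
        path = Path(raw)
        # Only pipelines and workflows are checked; other files are ignored.
        if path.suffix in HOP_EXTENSIONS and not NAME_PATTERN.match(path.stem):
            violations.append(str(raw))
    return violations
```

In a pre-commit hook, you could pass the output of `git diff --cached --name-only` to this function and reject the commit when the returned list is not empty.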
Transform and action names
Clearly named transforms and actions make your pipelines and workflows a lot easier to understand.
By default, an action or transform is named after its type. This makes it easy to understand what the transform does, but tells you nothing about the purpose it has in your workflow or pipeline.
"Filter rows" (or god forbid, "Filter rows 2 2" or similar names you get after copy/pasting transforms) doesn’t tell you anything. A short but descriptive transform name like "start_date < today" tells you exactly what is going on in a filter transform.

For example, for input and output files, you could use the filename you’re reading from or writing to.
You can use (copy/paste) any Unicode character in the name of a transform or action, and even newlines are allowed.
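Default and copy/pasted names can also be spotted automatically. The sketch below assumes you have already collected the transform or action names from your pipelines and workflows, and that you supply the set of default type names yourself; the function name and the suffix heuristic are hypothetical and may need tuning for your project.

```python
import re

# One or more trailing numeric suffixes, such as " 2" or " 2 2".
COPY_SUFFIX = re.compile(r"(?: \d+)+$")


def looks_like_default_name(name, type_names):
    """Return True if `name` is a default type name, optionally followed
    by the numeric suffixes that appear after copy/pasting."""
    base = COPY_SUFFIX.sub("", name.strip())
    return base in type_names


# Assumption: the caller provides the default names to look for.
defaults = {"Filter rows"}
print(looks_like_default_name("Filter rows 2 2", defaults))
print(looks_like_default_name("start_date < today", defaults))
```

Because the numeric suffix is only stripped before comparing against the known default names, a descriptive name that happens to end in a number is not flagged.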
Metadata
The names of metadata items, such as relational database connections, should immediately tell you what data they contain or what their purpose is.
Metadata item names shouldn’t contain technical or environment details.
For example, if your CRM system runs in a PostgreSQL database, "CRM" is fine as a name. The connection is already configured for a PostgreSQL database, so that doesn’t need to be repeated in the name. Environment information should be configured in your project lifecycle environments, so there’s no need to include "dev", "test" or "prod" in your connection names.
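A check along these lines could be sketched as follows. The token lists and the function name are assumptions for illustration only; extend them to match the environments and database platforms used in your own project.

```python
import re

# Hypothetical token lists; adjust them to your own environments and platforms.
ENVIRONMENT_TOKENS = {"dev", "test", "prod"}
TECHNOLOGY_TOKENS = {"postgresql", "postgres", "oracle", "mysql"}


def forbidden_tokens(name):
    """Return the sorted environment or technology tokens found in a
    metadata item name, comparing whole words case-insensitively."""
    words = set(re.split(r"[^a-z0-9]+", name.lower()))
    return sorted(words & (ENVIRONMENT_TOKENS | TECHNOLOGY_TOKENS))


for candidate in ["CRM", "crm_postgresql_dev"]:
    print(candidate, forbidden_tokens(candidate))
```

Matching whole words rather than substrings keeps a name like "product_catalog" from being flagged just because it starts with "prod".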