Pipeline
Description
The Pipeline action runs a previously-defined pipeline within a workflow.
This action is the access point from your workflow to your actual data processing activity (pipeline).
Usage
An example of a common workflow includes getting FTP files, checking that a required target database table exists, running a pipeline that populates that table, and e-mailing an error log if the pipeline fails. For this example, the Pipeline action defines which pipeline to run to populate the table.
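The control flow of this example workflow can be sketched in Python. This is only a model of the ordering described above, not Hop code: the four callables are hypothetical stand-ins for the workflow actions, and stopping quietly when the FTP or table check fails is an assumption of the sketch.

```python
from typing import Callable


def run_workflow(
    get_ftp_files: Callable[[], bool],
    table_exists: Callable[[], bool],
    run_pipeline: Callable[[], bool],
    mail_error_log: Callable[[], None],
) -> bool:
    """Model the example workflow; each callable is a hypothetical action."""
    # Assumption: if an earlier step fails, the workflow simply stops.
    if not get_ftp_files():
        return False
    if not table_exists():
        return False
    # The Pipeline action: run the pipeline that populates the table.
    if run_pipeline():
        return True
    # Failure path: the pipeline failed, so e-mail the error log.
    mail_error_log()
    return False
```

In this model the error log is mailed only when the pipeline step itself reports failure, which matches the example in the text.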
See also:
- The Workflow action that executes a sub-workflow from a workflow.
- The Workflow Executor transform that executes a workflow from a pipeline.
- The Pipeline Executor transform that executes a sub-pipeline from a pipeline.
Options
General
Option | Description |
---|---|
Action name | Name of the action. |
Pipeline | Specify your pipeline by entering its path or clicking Browse. The selected pipeline is automatically converted to a path relative to your |
Run Configuration | A pipeline can run under different pipeline run configurations. Select the run configuration that controls where and how the pipeline is executed. |
Options tab
Option | Description |
---|---|
Execute for every result row | Runs the pipeline once for every result row from a previous pipeline (or workflow) in the current workflow. |
Clear results rows before execution | Makes sure the results rows are cleared before the pipeline starts. |
Clear results files before execution | Makes sure the results files are cleared before the pipeline starts. |
Wait for remote pipeline to finish | If you selected Server as your environment type, choose this option to block the workflow until the pipeline finishes running on the server. |
Follow local abort to remote pipeline | If you selected Server as your environment type, choose this option to send the local abort signal remotely. |
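To illustrate the difference the "Execute for every result row" option makes, here is a minimal Python model. It is a sketch of the behavior described in the table, not Hop's implementation: `run_pipeline_action` and the `pipeline` callable are hypothetical names, and the idea that each per-row run sees only its own row is an assumption of this sketch.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def run_pipeline_action(
    result_rows: List[Row],
    pipeline: Callable[[List[Row]], object],
    execute_for_every_row: bool = False,
) -> List[object]:
    """Return one entry per pipeline execution (hypothetical model)."""
    if execute_for_every_row:
        # One execution per result row from the previous action.
        # Assumption: each run receives only its own row.
        return [pipeline([row]) for row in result_rows]
    # Option off: a single execution that sees all result rows.
    return [pipeline(result_rows)]


rows = [{"file": "a.csv"}, {"file": "b.csv"}, {"file": "c.csv"}]
count_rows = len  # stand-in pipeline that reports how many rows it saw

print(run_pipeline_action(rows, count_rows))                              # [3]
print(run_pipeline_action(rows, count_rows, execute_for_every_row=True))  # [1, 1, 1]
```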
Logging tab
By default, if you do not set logging, Apache Hop will take generated log entries and create a log record inside the workflow. For example, suppose a workflow has three pipelines to run and you have not set logging. The pipelines will not log information to other files, locations, or special configurations. In this instance, the workflow runs and logs information into its master workflow log.
In most instances, it is acceptable for logging information to be available in the workflow log. For example, if you have load dimensions, you want logs for your load dimension runs to display in the workflow logs. If there are errors in the pipelines, they will be displayed in the workflow logs. However, if you want all your log information kept in one place, you must set up logging.
Option | Description |
---|---|
Specify logfile | Specifies a separate logging file for running this pipeline. |
Name | Specifies the directory and base name of the log file (C:\logs for example). |
Extension | Specifies the file name extension (.log or .txt for example). |
Log level | Specifies the logging level for running the pipeline. See Logging for more details. |
Append logfile | Appends the logfile as opposed to creating a new one. |
Create parent folder | Creates a parent folder for the log file if it does not exist. |
Include date in filename | Adds the system date to the filename with format YYYYMMDD (_20051231). |
Include time in filename | Adds the system time to the filename with format HHMMSS (_235959). |
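The way the Name, Extension, and date/time options combine into a log file name can be sketched as follows. The `_YYYYMMDD` and `_HHMMSS` suffixes come from the table above; the helper name `build_log_filename`, the order (date before time), and joining the extension with a single dot are assumptions of this sketch rather than documented behavior.

```python
from datetime import datetime


def build_log_filename(
    name: str,
    extension: str,
    when: datetime,
    include_date: bool = False,
    include_time: bool = False,
) -> str:
    """Compose a log file name from the Logging tab options (hypothetical helper)."""
    filename = name
    if include_date:
        # System date in YYYYMMDD form, e.g. _20051231
        filename += "_" + when.strftime("%Y%m%d")
    if include_time:
        # System time in HHMMSS form, e.g. _235959
        filename += "_" + when.strftime("%H%M%S")
    # Accept the extension with or without a leading dot (".log" or "log").
    ext = extension.lstrip(".")
    if ext:
        filename += "." + ext
    return filename


stamp = datetime(2005, 12, 31, 23, 59, 59)
print(build_log_filename("/tmp/logs/load", ".log", stamp, True, True))
# /tmp/logs/load_20051231_235959.log
```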
Parameters tab
Pass parameters downstream: On the Parameters tab, select the Pass parameter values to sub pipeline checkbox. The parameter must already exist in the pipeline (in the pipeline properties, for example); alternatively, you can specify new parameters on the Parameters tab. The Parameters tab allows you to override existing parameter values or NULL them by leaving the value empty.
Pass field values upstream: The sub pipeline requires a Copy rows to result transform to send a row upstream. This requires a row to exist in the sub pipeline. Note that rows do not exist in a workflow, but you can use a Get variables transform in a subsequent sub pipeline to use the first sub pipeline’s field values.
Use Set variables if you want to pass a single value upstream from a pipeline to the workflow and act upon that variable. In this case, you can choose a scope of “valid in the parent workflow”.
Option | Description |
---|---|
Copy results to parameters | Copies the results from a previous pipeline as parameters of the pipeline using the Copy rows to result transform. |
Pass parameter values to sub pipeline | Pass all parameters of the workflow down to the sub-pipeline. |
Parameter | Specify the parameter name passed to the pipeline. |
Stream column name | Specify the field of an incoming record from a previous pipeline as the parameter. |
Value | Specify pipeline parameter values through one of the following actions: |
Get Parameters | Get the existing parameters already associated with the pipeline. |
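The relationship between the Parameter, Stream column name, and Value columns can be modeled with a short Python sketch. It is illustrative only: `resolve_parameters` is a hypothetical helper, and giving the stream column precedence over a static value is an assumption. The rule that an empty value yields NULL follows the description of the Parameters tab above.

```python
from typing import Dict, List, Optional, Tuple

# (parameter, stream column name, value), one tuple per row of the grid
ParameterDef = Tuple[str, str, str]


def resolve_parameters(
    definitions: List[ParameterDef],
    row: Dict[str, object],
) -> Dict[str, Optional[str]]:
    """Resolve the parameter values handed to the pipeline (hypothetical model)."""
    resolved: Dict[str, Optional[str]] = {}
    for name, column, value in definitions:
        if column:
            # Take the value from a field of the incoming result row.
            field = row.get(column)
            resolved[name] = None if field is None else str(field)
        elif value:
            # Use the static value entered in the Value column.
            resolved[name] = value
        else:
            # An empty value sets the parameter to NULL.
            resolved[name] = None
    return resolved


definitions = [
    ("TABLE", "table_name", ""),  # filled from the stream column
    ("MODE", "", "full"),         # static value
    ("LIMIT", "", ""),            # left empty, so NULL
]
print(resolve_parameters(definitions, {"table_name": "dim_customer"}))
# {'TABLE': 'dim_customer', 'MODE': 'full', 'LIMIT': None}
```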