CrateDB Bulk Loader

Description

The CrateDB Bulk Loader transform loads data from Apache Hop to CrateDB using two different approaches: CrateDB's HTTP endpoint or the COPY command.
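To make the first approach concrete, the sketch below shows the kind of request HTTP Endpoint mode results in: rows are sent as bulk arguments of a parameterized INSERT. This is an illustration, not Hop's internal implementation; the host localhost:4200 and the table doc.sales (id, amount) are assumptions made for the example.

```python
import requests

rows = [[1, 9.99], [2, 19.50], [3, 4.25]]  # one inner list per pipeline row

payload = {
    "stmt": "INSERT INTO doc.sales (id, amount) VALUES (?, ?)",
    "bulk_args": rows,  # rows are sent in chunks of the configured Batch size
}
resp = requests.post("http://localhost:4200/_sql", json=payload, timeout=30)
resp.raise_for_status()
print(resp.json()["results"])  # one row count per bulk argument
```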

Supported Engines

Hop Engine: Supported
Spark: Maybe Supported
Flink: Maybe Supported
Dataflow: Maybe Supported

The CrateDB Bulk Loader is linked to the CrateDB database type. When COPY mode is used, it fetches the JDBC driver from the hop/lib/jdbc folder.
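In COPY mode, the work ultimately boils down to executing a COPY FROM statement against CrateDB; the transform itself issues it over JDBC. The sketch below runs an equivalent statement through the HTTP endpoint only to stay self-contained; the table and file path are hypothetical.

```python
import requests

# CrateDB resolves file:// paths on its own node, not on the Hop machine.
copy_stmt = (
    "COPY doc.sales FROM 'file:///data/cratedb/in/sales.csv' "
    "WITH (format = 'csv')"
)
resp = requests.post("http://localhost:4200/_sql", json={"stmt": copy_stmt}, timeout=60)
resp.raise_for_status()
print(resp.json())  # the rowcount reports how many rows were imported
```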

General Options

Transform name: Name of the transform.

Target schema: The name of the target schema to write data to. This field is mandatory because CrateDB needs to know which schema to write to (doc and blob are the default schemas in CrateDB).

Target table: The name of the target table to write data to.

Main Options

Connection: Name of the database connection on which the target table resides.

Use HTTP Endpoint: Choose the mode used to load data into CrateDB. Supported options are HTTP Endpoint and COPY; when HTTP Endpoint is selected, the COPY options are disabled, and vice versa.

Batch size: HTTP mode writes in batches. The number of rows to send to CrateDB in a single batch must be set, as there is no default value.

Specify database fields: Specify the mapping between the database columns and the stream fields (see the Fields section below).

Stream to file: Write the current pipeline stream to a file in the local filesystem or in S3 before performing the COPY load.

Local folder: Local folder where the files used by the COPY command are stored.

As per the CrateDB documentation, CrateDB retrieves files from its nodes' filesystem (file:// scheme). In most cases, however, Hop runs on a different machine than CrateDB, so the most common solution in such scenarios is to map the remote folder (CrateDB) to a local one (Hop), for example via volumes.

In the Local folder field, you specify the local folder where the file will be written. The file is written to the local filesystem, which is linked to the remote filesystem (e.g. via a Docker volume); a sketch of this mapping follows the Read from file option below.

Leave the field empty in other scenarios (e.g. when writing to S3).

Read from file: Do not stream the contents of the current pipeline; instead, perform the COPY load from a pre-existing file in the local filesystem or in S3. The only supported format is CSV (comma-delimited).
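The following minimal sketch illustrates that folder mapping. The mount from /data/hop/out (Hop side) to /data/cratedb/in (CrateDB side) is a hypothetical Docker volume, and the table doc.sales is made up for the example; note how the file is written to the local path but referenced in COPY through the remote one.

```python
import csv
import requests

LOCAL_FOLDER = "/data/hop/out"      # value of the "Local folder" field (hypothetical)
REMOTE_FOLDER = "/data/cratedb/in"  # the same directory as seen by the CrateDB node

# 1. Hop side: the pipeline stream is written to a CSV file in the local folder.
with open(f"{LOCAL_FOLDER}/sales.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "amount"])
    writer.writerows([[1, 9.99], [2, 19.50]])

# 2. CrateDB side: COPY references the *remote* path of the very same file.
stmt = f"COPY doc.sales FROM 'file://{REMOTE_FOLDER}/sales.csv' WITH (format = 'csv')"
requests.post("http://localhost:4200/_sql", json={"stmt": stmt}, timeout=60).raise_for_status()
```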

AWS Authentication

Use AWS system variables: When selected, picks up the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY values from your operating system's environment variables.

AWS_ACCESS_KEY_ID: (if Use AWS system variables is unchecked) Specify a value or variable for your AWS_ACCESS_KEY_ID.

AWS_SECRET_ACCESS_KEY: (if Use AWS system variables is unchecked) Specify a value or variable for your AWS_SECRET_ACCESS_KEY.
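For illustration, these credentials end up inline in the s3:// URI of the COPY statement, which is the form CrateDB accepts. A minimal sketch, assuming the two environment variables above are set; the bucket and object key are hypothetical, and the values are URL-encoded because the URI syntax requires it.

```python
import os
from urllib.parse import quote

access = quote(os.environ["AWS_ACCESS_KEY_ID"], safe="")
secret = quote(os.environ["AWS_SECRET_ACCESS_KEY"], safe="")

stmt = (
    f"COPY doc.sales FROM 's3://{access}:{secret}@my-bucket/exports/sales.csv' "
    "WITH (format = 'csv')"
)
print(stmt)
```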

HTTP Authentication

At the moment, Hop only supports the Basic and Bearer authentication methods.

HTTP Login: The username for HTTP authentication.

HTTP password: The password for HTTP authentication.
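A minimal sketch of both methods against the HTTP endpoint, with hypothetical credentials; the HTTP Login and HTTP password fields feed the Basic variant.

```python
import requests

payload = {"stmt": "SELECT 1"}

# Basic authentication: username/password as configured above.
requests.post("http://localhost:4200/_sql", json=payload, auth=("hop", "secret"))

# Bearer authentication: a token in the Authorization header instead.
token = "my-token"  # hypothetical bearer token
requests.post(
    "http://localhost:4200/_sql",
    json=payload,
    headers={"Authorization": f"Bearer {token}"},
)
```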

Fields

Map the current stream fields to the CrateDB table’s columns.
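For illustration, a minimal sketch of how such a mapping determines the column list and the value order of the generated INSERT; all field and column names are hypothetical.

```python
# Stream field -> table column (hypothetical names).
mapping = {"customer_id": "id", "total": "amount"}

columns = ", ".join(mapping.values())
placeholders = ", ".join("?" for _ in mapping)
stmt = f"INSERT INTO doc.sales ({columns}) VALUES ({placeholders})"
print(stmt)  # INSERT INTO doc.sales (id, amount) VALUES (?, ?)
```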