REST Client transform Icon REST Client

Description

The REST Client transform enables you to consume RESTfull services. Representational State Transfer (REST) is a key design idiom that embraces a stateless client-server architecture in which the web services are viewed as resources and can be identified by their URLs.

The REST client transform can use a pre-defined REST connection or use full URLs directly.
When using a REST connection, the URL (hard-coded or accepted from a field) will be considered a path relative to the base URL defined in the REST connection. Header values that are specified in the Headers tab of the transform will overwrite any headers with the same name that were defined in the REST connection.

When used without a REST connection, the full URL needs to be specified.

When pagination is enabled on this transform and the selected REST connection defines a non-NONE pagination strategy, Hop issues multiple HTTP requests per incoming row and can split each page into multiple output rows (see the Pagination tab). At Basic log level, each fetched page is logged (REST pagination: fetched results page …).

Example

The REST client Transform returns a "result" field (can change the name), and the field is often used in the next transform. For example, it can be read in by a JSON input transform that extracts the fields specified on the Fields tab.

Supported Engines

Hop Engine

Supported

Spark

Maybe Supported

Flink

Maybe Supported

Dataflow

Maybe Supported

Options

General Tab

The General tab is where you enter basic connection information for accessing a resource.

Option Description

Transform name

Name of this transform as it appears in the pipeline workspace

REST Connection

The (optional) REST connection to use for the base URL and authentication/authorization header name and value.

URL

Indicates the path to a resource

Accept URL from field

Designates the path to a resource is defined from a field

URL field name

Indicates the field from which the path to a resource is defined

HTTP method

Indicates how the transform interacts with a resource---options are either GET, PUT, DELETE, POST, HEAD, or OPTIONS

Get Method from field

Designates the GET method is defined from a field

Method fieldname

Indicates the field from which the GET method is defined

Body field

Contains the request body for POST, PUT, and DELETE methods. Body field only accepts a previous field, not a hard coded value.

Application type

Designates what type of application a resource is---options are either TEXT PLAIN, XML, JSON, OCTET STREAM, XHTML, FORM URLENCODED, ATOM XML, SVG XML, or TEXT XML

Connection timeout

Indicates the timeout until a connection is established (milliseconds)

Read timeout

Indicates the timeout for waiting for reading data (milliseconds)

Result fieldname

Designates the name of the result output field

HTTP status code fieldname

Designates the name of the HTTP status code field

Response time (milliseconds) fieldname

Designates the name of the response time field

Authentication Tab

If necessary, enter authentication details for a resource in the Authentication tab.

Hop allows using NULL parameters/variables for both the HTTP Login and HTTP Password.
Option Description

HTTP Login

Indicates the username required to access a resource

HTTP Password

Indicates the password associated with the provided HTTP Login username

Preemptive

Option to send the authentication credentials before a server gives an unauthorized response. Currently always true, due to bug #4196.

Proxy Host

Indicates the name of a proxy host

Proxy Port

Indicates the port number of a proxy host

SSL Tab

The SSL tab is where you provide authentication details for accessing a resource that requires SSL certificate authentication.

Option Description

Truststore file

Indicates the location of a truststore file

Truststore password

Indicates the password associated with the provided truststore file

Headers Tab

The Headers tab enables you to define the content of any HTTP headers using an existing field. Populate the list of fields by clicking the Get fields button.

To figure out what Headers are required, you can use Postman and remove as many headers as possible for the Request to still work. You should not need to use the Headers with value “<calculated when request is sent>” in Postman. You do not need to manually add an Authorization header if you used the Authentication tab.
Option Description

Field

The field from incoming Hop stream that contains the header information. Add the header value field name to the Field column.

Name

The name of the outgoing Hop field from this transform. Add the header key field name to the Name column.

Parameters Tab

The Parameters tab enables you to define parameter values for POST, GET, PUT, and DELETE requests.

Option Description

Parameter

The field from incoming Hop stream that contains the parameter information

Parameter

The name of the outgoing Hop field from this transform

Matrix Parameters tab

Use the Matrix Parameters tab to define matrix parameter values for POST, PUT, DELETE, and PATCH requests.

Option Description

Parameter

The field from the incoming Hop stream that contains the matrix parameter information

Parameter

The name of the outgoing Hop field from this transform

Pagination tab

Use this tab to loop over paged REST APIs. Paging behaviour (Link header, offset/limit, page number, cursor) is defined on the REST connection metadata. This tab controls whether the loop runs and how each HTTP response is turned into Hop rows.

Option Description

Enable pagination loop for incoming rows

When enabled and the REST connection uses a non-NONE pagination strategy, Hop performs multiple HTTP requests per input row until the API has no more pages or a safeguard limit is reached. When disabled, or when the connection pagination strategy is NONE, Hop performs a single request (legacy behaviour).

Maximum page requests per input row

Hard cap on HTTP iterations per input row while following link headers, cursors, offsets, or page numbers. Leave blank or 0 to use the built-in safeguard (default 256).

Optional result split JsonPath/XPath

When set and Application type is JSON or XML, Hop emits one output row per JsonPath or XPath match instead of one row with the full response body. Typical for top-level arrays ($[]) or nested collections ($.results[]). Supports ${VAR} and %%VAR%% substitution. This field does not apply Hop’s $[HH] hexadecimal byte decoding, so bracketed JsonPath selectors (for example $[*]) are preserved.

Paging stops when:

  • the connection strategy reports no further page (no rel="next" link, empty page, or missing cursor),

  • HTTP status is outside the 2xx range,

  • the maximum page request limit is reached, or

  • for LINK_HEADER, the next URL was already fetched (cycle guard).

When a split path is configured but a page yields no matching elements, Hop fails with a clear error unless the response is a recognised empty paged collection (for example {"results":[]}).

GitHub’s public REST API requires a User-Agent header. Add it on the Headers tab (or use a Bearer token on the connection for higher rate limits). A 401 response with No Auth usually means an invalid Authorization header was sent (for example an unresolved ${GH_TOKEN} placeholder).

Samples

rest-client-github-releases-loop.hpl — lists apache/hop GitHub releases via LINK_HEADER pagination (per_page=5), splits $[*], and parses fields with a JSON Input transform. Connection metadata: github-releases.json (in the REST transform samples project).