Get Data From XML


The Get Data From XML transform provides the ability to read data from any type of XML file using XPath specifications.

Get Data From XML can read data from three kinds of sources (files, streams, and URLs) in two modes: files and URLs can be defined statically or supplied dynamically.


Files Tab

The Files tab is where you define the location of the XML files you want to read. The table below lists the options on the Files tab.

Option Description

Transform name

Name of the transform.

XML Source from field

  • XML source is defined in a field : the previous transform provides XML data in a field of the input stream.

  • XML source is a filename : the previous transform provides filenames in a field of the input stream. These files are read.

  • Read source as URL : the previous transform provides URLs in a field of the input stream. These URLs are read.

  • Get XML source from a field : specify the field from which to read the XML, filename, or URL.

File or directory

Specifies the location and/or name of the input XML file. Note: Click Add to add the file/directory/wildcard combination to the list of selected files (grid) below.

Regular expression

Specifies the regular expression you want to use to select the files in the directory specified in the previous option.

Selected Files

Contains a list of selected files (or wildcard selections) and a property specifying whether each file is required. If a file is required and is not found, an error is generated; otherwise, the file name is skipped.

Show filename(s)…​

Displays a list of all files that will be loaded based on the currently selected file definitions.
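As a rough illustration of how a directory plus a regular expression selects files, and of how the required flag behaves, the sketch below mimics that selection in Python. It is not Hop's implementation. The function name select_files, the sample file names, and the choice to match the expression against the whole file name are assumptions made for this example.

```python
import re
import tempfile
from pathlib import Path


def select_files(directory, pattern, required=False):
    """Return the sorted file names in `directory` that match `pattern`.

    Assumption: the regular expression is applied to the file name only
    (not the full path) and has to match the whole name.
    """
    folder = Path(directory)
    if not folder.is_dir():
        if required:
            raise FileNotFoundError(f"Required location not found: {directory}")
        return []  # not required: the entry is skipped
    regex = re.compile(pattern)
    matches = sorted(
        p.name for p in folder.iterdir()
        if p.is_file() and regex.fullmatch(p.name)
    )
    if required and not matches:
        raise FileNotFoundError(f"No file matches {pattern!r} in {directory}")
    return matches


# Hypothetical sample directory, created only for this demonstration.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("orders_2023.xml", "orders_2024.xml", "readme.txt"):
        Path(tmp, name).write_text("<root/>")
    selected = select_files(tmp, r"orders_.*\.xml", required=True)
    print(selected)
```

The printed list plays the role of the Show filename(s) preview: it shows which files the current definition would pick up before anything is parsed.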

Content Tab

Option Description


  • Loop XPath : For every node that matches the "Loop XPath" in the XML file(s), the transform outputs one row of data. This is the main specification used to flatten the XML file(s). You can use the "Get XPath nodes" button to search for the possible repeating nodes in the XML document. Note that this can take a while if the XML document is large.

  • Encoding : the XML file encoding to use in case none is specified in the XML documents. (yes, those still exist)

  • Namespace aware : check this to make the XML document namespace aware.

  • Ignore comments : Ignore all comments in the XML document while parsing.

  • Validate XML : Validate the XML prior to parsing.

  • Use token : Use a token when you want to dynamically replace part of an XPath field value. A token is written between @_ and - (@_fieldname-). Tokens are a Hop feature and are not related to XML parsing. Please see Example 1 to see how it works.

  • Ignore empty file : an empty file is not a valid XML document. Check this if you want to ignore those altogether.

  • Do not raise an error if no file : Do not raise an error when no files are found.

  • Limit : Limits the number of rows to this number (zero (0) means all rows).

  • Prune path to handle large files: almost the same value as the "Loop XPath" property with some exceptions, see Get Data from XML - Handling Large Files for more details. Note that you can use this parameter to avoid multiple HTTP URL requests.
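To make the Loop XPath and Limit options above more concrete, here is a minimal sketch of the "one output row per matching node" idea using Python's standard xml.etree.ElementTree module. It only illustrates the flattening concept and is not how Hop evaluates XPath. The sample document, the flatten function, and the output field names are hypothetical. ElementTree supports only a subset of XPath and evaluates paths relative to the root element, so a loop path such as /orders/order is written as ./order here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with three repeating <order> nodes.
XML_DOC = """<orders>
  <order id="1"><customer>Ann</customer><total>10.50</total></order>
  <order id="2"><customer>Bob</customer><total>7.25</total></order>
  <order id="3"><customer>Cy</customer><total>3.00</total></order>
</orders>"""


def flatten(xml_text, loop_path, limit=0):
    """Emit one row (a dict) per node matching `loop_path`.

    A `limit` of 0 means "all rows", mirroring the Limit option.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for node in root.findall(loop_path):
        rows.append({
            "id": node.get("id"),                  # attribute of the loop node
            "customer": node.findtext("customer"),  # child element text
        })
        if limit and len(rows) >= limit:
            break
    return rows


rows = flatten(XML_DOC, "./order")
limited = flatten(XML_DOC, "./order", limit=2)
print(len(rows), len(limited))
for row in rows:
    print(row)
```

Choosing a different loop path changes the row count: a path that matches only the single root element would yield one row, which is why the loop path should point at the repeating node.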

Additional fields

  • Include filename in output? : Allows you to specify a field name to include the file name (String) in the output of this transform.

  • Rownum in output? : Allows you to specify a field name to include the row number (Integer) in the output of this transform.

Add to result filename

  • Add files to result filename : Adds the names of the XML files read to the result of this pipeline. A unique list is kept in memory and can be used by the next action in a workflow, for example in another pipeline.
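The @_fieldname- token syntax mentioned among the Content tab options can be pictured as a simple text substitution on the XPath before it is evaluated. The sketch below shows that idea in Python. It is an assumption-laden illustration rather than Hop's actual logic: the function substitute_tokens, the sample row, the sample XPath, and the decision to leave unknown tokens untouched are all hypothetical.

```python
import re

# A token starts with "@_" and ends at the next "-", e.g. "@_lang-".
TOKEN = re.compile(r"@_(.+?)-")


def substitute_tokens(xpath, row):
    """Replace each @_fieldname- token with the value of that field in `row`.

    Assumption: tokens naming a field that is not in the row are left as-is.
    """
    def replace(match):
        field_name = match.group(1)
        if field_name in row:
            return str(row[field_name])
        return match.group(0)

    return TOKEN.sub(replace, xpath)


# Hypothetical incoming row and field XPath.
row = {"lang": "en"}
resolved = substitute_tokens("title[@lang='@_lang-']", row)
print(resolved)
```

Because the closing delimiter is a plain hyphen, a field whose name itself contains a hyphen would be cut short by this pattern, so the sketch assumes hyphen-free field names.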

Fields Tab

Option Description


Name

The name of the output field.

XPath

The path to the element node or attribute to read.

Element

The element type to read: Node or Attribute.

Type

The data type to convert to.

Format

The format or conversion mask to use in the data type conversion.

Length

The length of the output data type.

Precision

The precision of the output data type.

Currency

The currency symbol to use during data type conversion.

Decimal

The numeric decimal symbol to use during data type conversion.

Group

The numeric grouping symbol to use during data type conversion.

Trim type

The type of trimming to use during data type conversion.

Repeat

Repeat the column value of the previous row if the column value is empty (null).
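The field options above can be sketched as a small per-row extraction routine: each field has a path, an element type (Node or Attribute), a target data type, and an optional repeat flag. The Python example below is only an approximation under stated assumptions. The FIELDS layout, the read_rows function, and the sample catalog are hypothetical, attribute paths are given without a leading @, values are always trimmed on both sides, and format masks, length, and precision are not modeled.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical sample: the second and third items have no category value.
XML_DOC = """<catalog>
  <item sku="A1"><category>Tools</category><price>12.50</price></item>
  <item sku="A2"><category></category><price>3.20</price></item>
  <item sku="B7"><price>8.00</price></item>
</catalog>"""

# Hypothetical field definitions: (name, path, element, converter, repeat)
FIELDS = [
    ("sku", "sku", "Attribute", str, False),
    ("category", "category", "Node", str, True),
    ("price", "price", "Node", Decimal, False),
]


def read_rows(xml_text, loop_path, fields):
    """Build one row per loop node, applying type conversion and repeat."""
    root = ET.fromstring(xml_text)
    rows, previous = [], {}
    for node in root.findall(loop_path):
        row = {}
        for name, path, element, convert, repeat in fields:
            if element == "Attribute":
                raw = node.get(path)        # attribute on the loop node
            else:
                raw = node.findtext(path)   # text of a child element
            raw = raw.strip() if raw is not None else None
            if not raw:
                # Empty or missing: reuse the previous row's value if asked to.
                row[name] = previous.get(name) if repeat else None
            else:
                row[name] = convert(raw)
        rows.append(row)
        previous = row
    return rows


field_rows = read_rows(XML_DOC, "./item", FIELDS)
for row in field_rows:
    print(row)
```

With the repeat flag set on category, the second and third rows inherit "Tools" from the first row instead of carrying a null, which is the behavior the Repeat option describes.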

Metadata Injection Support

All fields of this transform support metadata injection. You can use this transform with ETL Metadata Injection to pass metadata to your pipeline at runtime.