Get Data From XML
Description
The Get Data From XML transform reads data from any type of XML file using XPath specifications. It can read data from three kinds of sources (files, streams, and URLs) in two modes (static or dynamic).
Options
Files Tab
The files tab is where you define the location of the XML files from which you want to read. The table below contains options associated with the Files tab.
Option | Description |
---|---|
Transform name | Name of the transform. This must be unique within the pipeline. |
XML Source from field | |
XML source is defined in a field | The XML data to use is present in a field in the input stream. |
XML source is a filename | The XML data is available in and read from a file. |
Read source as URL | The XML data is read from the location specified by a URL. |
Get XML source from a field | Specifies the field that contains the XML data, filename, or URL. |
File or directory | Specifies the location and/or name of the input XML file. |
Add | Click to add the file/directory/wildcard combination to the list of selected files (grid) below. |
Browse | Click to browse to the file’s location. |
Regular expression | Specifies a regular expression used to select multiple files in the directory specified in the previous option. |
Selected Files | Contains a list of selected files (or wildcard selections) and a property specifying whether each file is required. If a required file is not found, an error is generated; otherwise, the file name is skipped. |
Delete | Click to remove the selected file in the table. |
Edit | Click to modify the selected file in the table. |
Show filename(s)… | Displays a list of all files that will be loaded based on the currently selected file definitions. |
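The Regular expression option can be pictured with a small Python sketch. It assumes the expression must match the whole file name, which is an assumption rather than something stated in the table above; the function names and the sample file names are hypothetical.

```python
import re
from pathlib import Path


def matching_names(names, pattern):
    """Keep the names that the regular expression matches in full."""
    regex = re.compile(pattern)
    return sorted(name for name in names if regex.fullmatch(name))


def select_files(directory, pattern):
    """Apply the same filter to the regular files found in a directory."""
    names = [entry.name for entry in Path(directory).iterdir() if entry.is_file()]
    return matching_names(names, pattern)


# Hypothetical directory listing used only for illustration.
candidates = ["orders_2023.xml", "orders_2024.xml", "orders.xml.bak", "readme.txt"]
selected = matching_names(candidates, r"orders_\d{4}\.xml")
print(selected)  # ['orders_2023.xml', 'orders_2024.xml']
```

Previewing the match this way is roughly what Show filename(s)… is for: checking which files a definition selects before running the pipeline.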
Content Tab
Option | Description |
---|---|
Settings | |
Loop XPath | For every matching entry in the XML file(s) or data, one row of data is added to the output. This is the main specification used to flatten the XML file(s). You can use the "Get XPath nodes" button to search for the possible repeating nodes in the XML document. If the XML document is large, this search can take a while. |
Encoding | The encoding to use for the XML file when none is specified in the XML document itself. |
Namespace aware? | Enable if the XML document requires namespaces to be considered while parsing. |
Ignore comments? | Enable to ignore all comments in the XML document while parsing. |
Validate XML? | Enable to validate the XML prior to parsing. |
Use token | Enable to use tokens, which are replaced dynamically in an XPath field value. A token is written between @_ and - (for example, @_fieldname-). See Example 1 for how it works. |
Ignore empty file | Enable to ignore any files with no content. These are not valid XML documents. |
Do not raise an error if no files | Enable to do nothing if no files are found. Otherwise, an error is returned. |
Limit | Specify a maximum number of rows to return. Zero (0) returns all rows. |
Prune path to handle large files | Specifies a path, similar to the Loop XPath, used to process chunks of data from the XML file. Each matching value defines a chunk of data that is read and processed. Use the prune path to speed up processing of large files. You can also use this parameter to avoid multiple HTTP URL requests. You can also do this using the XML Input Stream (StAX) transform. |
Additional fields | |
Include filename in output? | Allows you to specify a field name to include the file name (String) in the output of this transform. |
Filename fieldname | The name of the output field that receives the file name. |
Rownum in output? | Allows you to specify a field name to include the row number (Integer) in the output of this transform. |
Rownum fieldname | The name of the output field that receives the row number. |
Add to result filename | Enable to add the processed files to the result filenames. |
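The idea behind the prune path, processing one chunk at a time instead of loading the whole document, can be sketched in Python with `xml.etree.ElementTree.iterparse`. This is an analogous technique and not the transform's own implementation. The helper name `iter_chunks` and the sample document are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def iter_chunks(source, chunk_tag):
    """Yield each <chunk_tag> element as soon as its end tag has been parsed."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == chunk_tag:
            yield elem
            # Drop the chunk's children and attributes before reading the next one.
            elem.clear()


# Hypothetical input; a real source would be a large file opened in binary mode.
xml_data = io.BytesIO(
    b"<orders>"
    b"<order id='1'><total>10</total></order>"
    b"<order id='2'><total>25</total></order>"
    b"</orders>"
)

totals = [(order.get("id"), order.findtext("total")) for order in iter_chunks(xml_data, "order")]
print(totals)  # [('1', '10'), ('2', '25')]
```

Each `order` element plays the role of one chunk: it is handled and then cleared, so its contents do not stay in memory once the next chunk is read.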
Fields Tab
Option | Description |
---|---|
Name | The name of the output field |
XPath | The path to the element node or attribute to read |
Element | The element type to read: Node or Attribute |
Result Type | Either "Value of" or "Single node" |
Type | The data type to convert to |
Format | The format or conversion mask to use in the data type conversion |
Length | The maximum number of characters in the output data value |
Precision | The number of decimal places used to display numbers in the output data |
Currency | The currency symbol to use for monetary values. |
Decimal | The numeric decimal symbol to use for floating-point numbers. |
Group | The numeric grouping symbol to use for separating thousands in the data |
Trim type | How whitespace characters are removed from values, either from the left (trims leading spaces), right (trims trailing spaces), both (trims all whitespace), or none (no trimming is done) |
Repeat | Repeat the column value of the previous row if the column value is empty (null) |
Get fields | Click to populate the table with fields from the input stream. |
Select fields from snippet | Click to populate the table with fields corresponding to a Loop XPath and an XML document, both of which must be provided in the popup dialog. |
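How the Loop XPath and the field definitions combine to flatten a document can be sketched in Python with the standard library. This is a minimal illustration, not the transform's implementation: the sample XML, the `flatten` helper, and the `(name, path, element)` tuples are hypothetical, and `ElementTree` supports only a subset of XPath, so the loop path is written relative to the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
XML = """<library>
  <book id="b1"><title> Dune </title><price>9.99</price></book>
  <book id="b2"><title>Emma</title></book>
</library>"""

LOOP_XPATH = "./book"  # one output row per matching node
FIELDS = [
    # (output field name, path relative to the loop node, element type)
    ("book_id", "id", "attribute"),
    ("title", "title", "node"),
    ("price", "price", "node"),
]


def flatten(xml_text, loop_xpath, fields, limit=0):
    """Return one dict per loop node; limit=0 returns all rows."""
    root = ET.fromstring(xml_text)
    rows = []
    for node in root.findall(loop_xpath):
        row = {}
        for name, path, element in fields:
            value = node.get(path) if element == "attribute" else node.findtext(path)
            # Trim on both sides; a missing node or attribute yields None.
            row[name] = value.strip() if value is not None else None
        rows.append(row)
        if limit and len(rows) >= limit:
            break
    return rows


rows = flatten(XML, LOOP_XPATH, FIELDS)
for row in rows:
    print(row)
```

The second book has no `price` element, so its row carries an empty (null) value for that field, which is the situation the Repeat option addresses.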
Additional Output Fields Tab
Option | Description |
---|---|
Short filename field | The field used to store the file name, without the path or file extension. |
Extension field | The field used to store the file extension. |
Path field | The field used to store the path to the file. |
Size field | The field used to store the file size. |
Is hidden field | The field used to specify whether the file is hidden. |
Last modification field | The field used to store the date the file was last modified. |
Uri field | The field used to store the XML document’s source URL. |
Root uri field | The field used to store the XML document’s namespace URL, taken from the root element |
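To make the file-related fields concrete, the following Python sketch derives comparable values for a local file. It is an illustration under stated assumptions: the dictionary keys and the `file_metadata` helper are hypothetical, the extension is returned without its leading dot, and "hidden" follows the POSIX convention of a leading dot in the file name. The transform's own rules may differ, and the root URI field is not covered.

```python
from datetime import datetime
from pathlib import Path


def file_metadata(filename):
    """Collect values comparable to the additional output fields for one file."""
    path = Path(filename).resolve()
    stat = path.stat()
    return {
        "short_filename": path.stem,                # name without path or extension
        "extension": path.suffix.lstrip("."),       # assumption: no leading dot
        "path": str(path.parent),
        "size": stat.st_size,                       # size in bytes
        "is_hidden": path.name.startswith("."),     # assumption: POSIX dot-file rule
        "last_modified": datetime.fromtimestamp(stat.st_mtime),
        "uri": path.as_uri(),                       # file:// URI of the source
    }
```

Calling `file_metadata("orders.xml")` on an existing file returns a dictionary whose `short_filename` is `orders` and whose `extension` is `xml`.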