XML Output (Advanced)
Options
The dialog is organized into three tabs: File, Content and XML Tree.
File tab
| Option | Description |
|---|---|
| Transform name | Name of the transform. |
| Output | Where to send the XML: Write to file, Output XML as field, or Write to file and output XML as field (both). Stored in pipeline XML as codes |
| XML output field | Name of the field that receives the completed XML document (one value per split when splitting is enabled). Used when Output is Output XML as field or both. |
| Include input fields in output | When Output includes an XML field: if enabled (default), each emitted row contains all input fields plus the XML field; if disabled, only the XML field is emitted (narrow stream useful for chaining). |
| Filename | Base name of the output XML file (without extension). VFS URIs are supported. Required when Output writes to a file. |
| Extension | File extension (without the leading dot). Defaults to |
| Encoding | Character encoding for the output file. Defaults to |
| Include transform copy number in filename | Append the transform copy number to the filename. |
| Include date in filename | Append the system date ( |
| Include time in filename | Append the system time ( |
| Specify custom date/time format | Use a custom date/time pattern instead of the date/time toggles above. |
| Date/time format | Java |
| Split every N rows | Maximum rows per file before rolling over to a new split, or per completed XML field segment when Output includes an XML field. |
| Zip output file | Wrap each output file in a zip archive (one entry per file). Generated XSDs are written next to the archive, not inside it. |
| Do not open new file at start | Defer file creation until the first input row is received. |
| Do not create file if no rows | Delete the output file at the end of the run if no rows were ever written. |
| Add filename to result | Add the produced file(s) to the pipeline’s result file list (only after at least one row is written). |
| Show file name(s) … | Pops up a list with sample filenames built from the current settings. |
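The Split every N rows option can be pictured as cutting the row stream into chunks of at most N rows, each chunk becoming one file or one XML field value. The following is a minimal Python sketch of that idea only; `split_rows` is a hypothetical helper, not part of the transform, and it assumes N is a positive integer.

```python
from itertools import islice

def split_rows(rows, n):
    """Yield lists of at most n rows each (hypothetical helper; assumes n > 0)."""
    it = iter(rows)
    # Pull up to n rows at a time until the input is exhausted.
    while chunk := list(islice(it, n)):
        yield chunk

chunks = list(split_rows(range(5), 2))
print(chunks)
```

How the transform names the resulting splits and how splitting interacts with open group elements is not covered by this sketch.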
Content tab
| Option | Description |
|---|---|
| Compact | Suppress whitespace and EOL between elements; useful for byte-size-sensitive output. |
| Blank line after XML declaration | Add a blank line right after the |
| Emit empty elements | Emit an open/close tag pair for an element that has no value and no children. |
| Emit attribute when value is null | Emit an attribute even when its source value is |
| Emit attribute when no field is mapped | Emit an attribute that has no mapped field, using its default value. |
| Trim leading/trailing whitespace | Trim text values before emitting them. |
| Default decimal separator | Default decimal separator for numeric values; per-node settings still take precedence. |
| Default grouping separator | Default grouping separator for numeric values; per-node settings still take precedence. |
| Generate sibling XSD file | Write a sibling |
| DOCTYPE root element / system / public identifier | Emit a |
| XSL stylesheet href / type | Emit an |
XML Tree tab
The XML Tree tab is the visual designer for the output structure. The left pane lists the input fields received from the previous transform; the right pane is split between the target tree (top) and the property pane (bottom) for the currently selected node.
Working with the tree
- Click Get fields to (re)load the input fields from the previous transform.
- Drag a field from the left pane and drop it onto an element in the tree. A new child element is created with that field name and `mappedField` pre-filled.
- Use the toolbar above the tree (or the right-click menu) to:
  - **+ Element / + Attribute / + Fragment**: add a child node of the chosen kind under the selected element.
  - **Delete**: remove the selected node and its descendants (the root cannot be deleted).
  - **Up / Down**: reorder the selected node among its siblings.
  - **Loop**: toggle the loop flag. Exactly one element in the tree must carry it; switching the loop on a different node automatically clears it elsewhere.
  - **Group-by**: toggle the group-by flag on an ancestor of the loop element.
- Selecting a node populates the Properties form below the tree. Edits propagate to the model immediately.
Node properties
| Property | Description |
|---|---|
| Name | Local name of the element or attribute. |
| Namespace URI | Optional XML namespace URI. When set on the root element, it becomes the default namespace and is also written into the generated XSD as the |
| Kind | |
| Mapped field | Input field whose value provides this node’s content. For attributes and elements it sets the value; for nodes flagged |
| Default value | Static text used when |
| Format / Length / Precision / Currency / Decimal / Grouping | Per-node value-meta overrides used when converting the field value to a string. Per-node settings take precedence over the global Default decimal/grouping separator. |
| Loop | Marks this element as the row-loop element. Exactly one element must carry the flag. |
| Group-by | Marks this element as a group-by ancestor of the loop. Consecutive rows with equal |
| Force create | Output this node even when the value is |
| Remove outer wrapper (duplicate parent tag) | For |
Chaining and output-to-field
When Output is Output XML as field or both, the transform adds the configured XML output field to the stream for each completed document (or each split). A second XML Output (Advanced) transform can map that field with a DocumentFragment node. Use Remove outer wrapper on the fragment if the inner XML already has a root tag that would duplicate the parent element in the target tree.
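To illustrate what removing an outer wrapper means for a chained fragment, here is a minimal Python sketch. It assumes the option drops the fragment's own root tag and keeps that root's content; `strip_outer_wrapper` is a hypothetical helper written for this illustration and is not the transform's implementation.

```python
import xml.etree.ElementTree as ET

def strip_outer_wrapper(fragment_xml: str) -> str:
    """Return the content of the fragment's root element without the root tag itself."""
    root = ET.fromstring(fragment_xml)
    # Keep any text directly inside the root, then each child serialized in order.
    parts = [root.text or ""]
    parts.extend(ET.tostring(child, encoding="unicode") for child in root)
    return "".join(parts)

inner = "<order><item>foo</item><item>bar</item></order>"
unwrapped = strip_outer_wrapper(inner)
print(unwrapped)
```

If the target tree already has an `order` element as the parent of the fragment node, inserting `unwrapped` instead of `inner` avoids a nested `<order><order>…</order></order>` pair.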
Group-by behaviour
For the group-by mechanism to collapse correctly, the input rows must already be sorted by the group-by key(s). Use a Sort Rows transform upstream if needed. When the key changes, the open group element is closed and a new one is opened with the new key.
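The sorting requirement follows from the fact that only consecutive rows are compared. The Python sketch below mimics that behaviour with `itertools.groupby`, which also groups only adjacent equal keys; the row tuples and the `group_consecutive` helper are illustrative assumptions, not part of the transform.

```python
from itertools import groupby

def group_consecutive(rows, key):
    """Collapse runs of consecutive rows with the same key into (key, rows) pairs."""
    return [(k, list(grp)) for k, grp in groupby(rows, key=key)]

sorted_rows = [(1, "foo"), (1, "bar"), (2, "baz")]
unsorted_rows = [(1, "foo"), (2, "baz"), (1, "bar")]

sorted_groups = group_consecutive(sorted_rows, key=lambda r: r[0])
unsorted_groups = group_consecutive(unsorted_rows, key=lambda r: r[0])

# Sorted input yields one group per key; unsorted input reopens key 1 a second time.
print([k for k, _ in sorted_groups])
print([k for k, _ in unsorted_groups])
```

With unsorted input, key 1 appears in two separate groups, which corresponds to two separate group elements in the XML output.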
XSD generation
When Generate sibling XSD file is enabled, the transform writes a .xsd schema next to each output file (or split). The schema:
- declares one global element matching the root of the configured tree;
- nests complex types corresponding to elements with children or attributes;
- sets `maxOccurs="unbounded"` on the loop element and on every group-by ancestor;
- renders attributes as `xs:attribute` declarations (with `use="required"` when the source node is `Force create`);
- renders document-fragment nodes as `<xs:any processContents="skip"/>` placeholders;
- maps Hop value types to XSD built-ins as follows: integer → `xs:long`, number/big-number → `xs:decimal`, date/timestamp → `xs:dateTime`, boolean → `xs:boolean`, binary → `xs:base64Binary`, everything else → `xs:string`;
- uses the root node’s namespace as the schema’s `targetNamespace` (and `elementFormDefault="qualified"`) when set.
The XSD is written outside zip archives and is added to the pipeline’s result file list when Add filename to result is enabled.
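The type mapping described above can be expressed as a simple lookup with a fallback. This Python sketch only restates the documented mapping; the lowercase key names and the `xsd_type_for` function are hypothetical and do not correspond to identifiers in Hop.

```python
# Illustrative keys only; they are not Hop's internal type names.
XSD_TYPES = {
    "integer": "xs:long",
    "number": "xs:decimal",
    "bignumber": "xs:decimal",
    "date": "xs:dateTime",
    "timestamp": "xs:dateTime",
    "boolean": "xs:boolean",
    "binary": "xs:base64Binary",
}

def xsd_type_for(hop_type: str) -> str:
    """Return the XSD built-in for a value type, falling back to xs:string."""
    return XSD_TYPES.get(hop_type.lower(), "xs:string")

print(xsd_type_for("Integer"))
print(xsd_type_for("String"))
```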
Memory profile
The transform uses StAX streaming and buffers only the XML state of the currently open path of group elements. Memory use therefore scales with the largest single group, O(largest group), rather than with the whole document, O(document).
Example: orders with grouped items
Input rows (already sorted by orderId):
| orderId | itemName | price |
|---|---|---|
| 1 | foo | 1.50 |
| 1 | bar | 2.00 |
| 2 | baz | 3.25 |
Tree:
- `orders` (root, element)
  - `order` (element, group-by, mapped field = `orderId`)
    - `id` (attribute, mapped field = `orderId`)
    - `item` (element, loop)
      - `name` (element, mapped field = `itemName`)
      - `price` (element, mapped field = `price`, format = `0.00`)
Output:
<?xml version="1.0" encoding="UTF-8"?>
<orders>
<order id="1">
<item><name>foo</name><price>1.50</price></item>
<item><name>bar</name><price>2.00</price></item>
</order>
<order id="2">
<item><name>baz</name><price>3.25</price></item>
</order>
</orders>
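For readers who want to check the expected structure outside Hop, the example can be approximated in Python. This is a sketch under stated assumptions: it builds the whole tree in memory with `xml.etree.ElementTree` rather than streaming as the transform does, it omits the XML declaration and line breaks, and `build_orders` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from itertools import groupby

rows = [
    {"orderId": 1, "itemName": "foo", "price": 1.5},
    {"orderId": 1, "itemName": "bar", "price": 2.0},
    {"orderId": 2, "itemName": "baz", "price": 3.25},
]

def build_orders(rows):
    """Build <orders> with one <order> per run of equal orderId and one <item> per row."""
    root = ET.Element("orders")
    for order_id, group in groupby(rows, key=lambda r: r["orderId"]):
        # Group-by element: opened once per run of consecutive equal keys.
        order = ET.SubElement(root, "order", {"id": str(order_id)})
        for row in group:
            # Loop element: emitted once per input row.
            item = ET.SubElement(order, "item")
            ET.SubElement(item, "name").text = row["itemName"]
            # Two fixed decimals, standing in for the 0.00 format mask.
            ET.SubElement(item, "price").text = f"{row['price']:.2f}"
    return root

xml_text = ET.tostring(build_orders(rows), encoding="unicode")
print(xml_text)
```

The printed string contains the same elements, attributes, and values as the output above, on a single line.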