Neo4j

Description

Neo4j is an open source graph database which you can download from www.neo4j.com/download-center

You can use it to represent information with nodes and relationships in a property graph. Neo4j doesn’t use indexes which allows it to traverse large graphs really quickly with so-called graph algorithms. For more information on these unique graph algorithms see: Neo4j Graph Algorithms

Execution lineage

You can use Neo4j to store logging and execution lineage of your workflows and pipelines. The way you do this is simply by setting the variable NEO_LOGGING_CONNECTION to the name of the Neo4j Connection where you want the logging and lineage to be written to.

The Neo4j plugin offers a separate perspective to query this logging and lineage information. For example, it allows you to quickly jump to the place where an error occurred. This neat trick is performed by asking the database to find the shortest path between and execution node where an error occurred and without children and the "grand parent" node. The path you get is the exact path that was followed from for example the "grand parent" workflow to the exact transform where an error occurred.