Documentation

Data Model

[insert data schema image here]

There are three classes of nodes, each with its own set of relationships.

    • 'Real' nodes and relationships. These nodes represent things that exist in reality, such as people, companies, addresses and SIC codes. Relationships reflect real connections, such as 'DIRECTOR_OF' or 'LOCATED_AT'.
      • Companies
      • Company Office Holders
        • Company Officers
        • Corporate Company Officers
      • Address
      • SIC code


    • 'Event' nodes and relationships. Date-time stamps exist as nodes, which supports time series functions and analysis. Event relationships reflect what happened, such as 'DIRECTOR_DISQUALIFIED' or 'INCORPORATED_ON'.
      • Event


    • 'Audit' nodes and relationships. Audit nodes support auditing by detailing the source of, and any changes to, the Real and Event nodes.
      • Source
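
As a rough illustration of how the Real and Event classes might be queried, the sketch below matches on two of the relationship types named above. Only the relationship types come from this page. The stored direction of each relationship and the variable names (officer, company, event) are assumptions, so the patterns are written undirected and without node labels.

```cypher
// 'Real' relationship: pairs of nodes joined by DIRECTOR_OF.
// The pattern is undirected because the stored direction is not stated here.
// Run each statement separately.
MATCH (officer)-[:DIRECTOR_OF]-(company)
RETURN officer, company
LIMIT 10;

// 'Event' relationship: nodes joined to an event node by INCORPORATED_ON.
MATCH (company)-[:INCORPORATED_ON]-(event)
RETURN company, event
LIMIT 10;
```

Once the actual node labels are known, adding them to the patterns should make the queries cheaper and the results easier to read.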


Access

There is no front end and no user or access management, so be prepared for some limitations.

  • Query the data via the standard Neo4j browser.
  • We'll issue you with a user name and password (you will be prompted to change the password on first logon).
  • Access is also restricted by IP address, so you'll need to provide yours; a static IP is preferable.
  • All users will be 'Readers' and only able to execute read-only queries.
  • Data can't be exported.
  • There is no API access (yet).


Cypher Queries

Our data assets are built using Neo4j and its native query language, Cypher. If you are familiar with Cypher, knowing the names of the nodes and relationships and the data model is all you need to get started. If you are new to it, please see the Examples.
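
If the names are not to hand, Neo4j's built-in schema procedures can list them. This is a minimal sketch, assuming these procedures are available to Reader accounts on this instance, which has not been confirmed here.

```cypher
// List every node label in the database.
CALL db.labels();

// List every relationship type (for example DIRECTOR_OF, LOCATED_AT).
CALL db.relationshipTypes();

// Draw the schema as a small graph of labels and relationship types.
CALL db.schema.visualization();
```

Run each statement separately in the Neo4j browser. All three read metadata only, so they should be light on the database.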

WARNING. The Graph Project is not a finished product; you will be accessing data and performing queries in a very untested environment. What you do will affect other users, so here are a few tips on being considerate to them and to Mark, who will have to fix any issues. This is a large database and some types of query will crash it.

  • Prefix your queries with EXPLAIN before live execution if you are unsure of the implications. EXPLAIN returns the execution plan without running the query, which gives an indication of the likely load.
  • Suffix your queries with LIMIT to curtail the number of results returned.
  • Create a sub graph to test any algorithm or complex / multi-step query (see Algorithms below).
  • If your query is hanging, run :QUERIES and you may be able to kill it off.
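
The first two tips can be combined as in the sketch below. The LOCATED_AT relationship type comes from the data model above; the variable names, the undirected pattern and the limit of 25 are illustrative assumptions.

```cypher
// Step 1: inspect the plan. With EXPLAIN the query is planned but not executed.
EXPLAIN
MATCH (a)-[:LOCATED_AT]-(b)
RETURN a, b
LIMIT 25;

// Step 2: if the plan looks reasonable, run the same query for real,
// keeping LIMIT in place to cap the number of rows returned.
MATCH (a)-[:LOCATED_AT]-(b)
RETURN a, b
LIMIT 25;
```

Look out for operators in the plan that scan every node or build large intermediate results, as these are the queries most likely to cause trouble on a database of this size.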


Virtualisation

Algorithms