In earlier sections we discussed loading knowledge bases into Ergo. Since Ergo has a rich syntax, parsing general Ergo sentences can be time-consuming, so loading large knowledge bases this way does not scale. To enable fast loading of data into Ergo, one must limit the format of what can be loaded. The fast-loading format imposes the following restrictions:
- Multi-line comments (/* ... */) and preprocessor instructions such as #define and #include are not allowed.
- The percent sign (%) is used as a comment marker, like // in the regular Ergo files.
- Fast-loadable files use the .P extension (for they, in fact, contain enhanced Prolog-style data).
The data is loaded into special data structures, called storage containers, and is queried using a special primitive. The fast-loading primitives are:
fastload{Filename,StorageContainerName}. This loads the data from the given file into a storage container (denoted by an alphanumeric symbolic constant).
fastquery{Storage,ErgoQuery}. This is the query primitive.
fasterase{StorageContainerName}. This erases all data from the given storage container.
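Put together, a typical fast-loading session might look as follows. This is a sketch: the file name facts.P, the container name mystore, and the fact pattern p(?X,?Y) are all hypothetical.

```
?- fastload{'facts.P', mystore}.    // load the file into container mystore
?- fastquery{mystore, p(?X,?Y)}.    // query the loaded facts
?- fasterase{mystore}.              // discard the container's contents
```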
More details can be found in the ErgoAI Reasoner User's Manual, section "Fast Loading Vast Amounts of Data."
This package provides the primitives to connect to SQL databases and to execute queries on them. Please see the Guide to ErgoAI Packages, section "Querying SQL Databases."
There are two ways to access RDF and OWL stores from Ergo: the ErgoRDF/OWL Tool, accessible from ErgoAI Studio, and the ErgoOWL API, which accesses OWL knowledge bases via calls embedded in Ergo rules and queries.
The ErgoOWL API contains primitives that allow Ergo to load OWL files and present them in the form of Ergo facts. This API is described in detail in the Guide to ErgoAI Packages. The most commonly used API call is the query of the form
System[rdf_load(Filename, Module)]@\owl.
For example:
System[rdf_load('wine.owl', MyModule)]@\owl.
This will load the OWL file wine.owl (assumed to be in OWL/XML) into the Ergo module MyModule, where the RDF triples are represented via the Ergo frame syntax. They can be queried as follows:
?Subject[?Property->?Object]@MyModule.
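Since the loaded triples are ordinary Ergo frames, they can be combined with other Ergo features. For example, here is a sketch that counts the distinct subjects among the loaded triples using Ergo's count aggregate (the module name MyModule is as in the example above):

```
?- ?N = count{?S | ?S[?P -> ?O]@MyModule}.
```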
The above call guesses the format of the input file from the file name extension (.owl, .rdf, .nt, .ttl, etc.). The Guide to ErgoAI Packages also describes other calls, which give the user more control over loading and provide additional query and RDF graph manipulation primitives.
The ErgoRDF/OWL Tool provides a manual way of doing what the ErgoOWL API does automatically. Therefore, it is not the preferred way to work with RDF. One might choose to use this tool only in the very rare cases when there is a need for direct access to the intermediate files produced by Ergo during the import process (for instance, when one intends to use these intermediate files as a starting point for a future knowledge base, add rules directly to those files, etc.).
The tool is accessible from the ErgoAI Studio's Tools menu. It will open a window that lets one select the different options for translating RDF and OWL files into Ergo knowledge bases.
Note: This tool interface requires Java 8 or later.
Input: Once the tool's window is open, select the desired input type depending on the RDF format used by the input file (XML, JSON, n-triples, or n-quads files or directories). You will be asked to select the input file or directory. Once you select a file, the "Input file name" label will show the selected file. Note: if n-quads is selected as the input type, then n-triples will also be tolerated in the input. In that case, the output graph name, entered below, will be used; if none is entered, "main" will be used as the default graph name.
Output: Select the output predicate type: n-quads or n-triples, where n-triples is the default option. If the output is in the form of n-quads then the graph name can also be specified. The default graph name is "main".
Select the output format: the fastload syntax, the native Ergo predicates, or frames. The fastload format can be loaded an order of magnitude faster, which is crucial for large files. This is done using the fastload{file,storage_container} command, as explained earlier in this section. Querying is done via the fastquery{module,query} command. However, the fastload format is not suitable for human consumption and requires the aforesaid special primitives, which add a layer of complexity. This format is used by the ErgoOWL API internally, which hides the complexity of fast-loading completely, making the interface simple and straightforward. For this reason, the ErgoRDF/OWL Tool is rarely used with this format.
The native Ergo syntax is significantly less scalable than fastload, but it is good for human consumption. It is therefore used for demonstrations and cases such as the aforementioned use of the intermediate files to start new knowledge bases.
Next, edit the default IRI prefixes, if necessary.
Finally, click on "Import RDF/OWL" to perform the conversion. The "Status" label will show "Loading file ..." until the result of the translation becomes available. The output file will be created in the same directory as the input file, so you must have write permission for that directory. The output file name will be inputFileName.ergo for the native predicates or frames formats, and inputFileName.P for the fastload format. The output pane will show the first 1000 lines of the translation -- the output file can be opened and loaded independently in the Editor.
Once the translation process completes, the result is shown in the OWL and Ergo content text areas on the right side of the screen, and the system will now be ready to import another RDF/OWL file.
The following example shows a translation of the W3C wine ontology (http://www.w3.org/TR/owl-guide/wine.rdf) into an Ergo file using the ErgoRDF/OWL tool:
One can now load the Ergo file with the fastload primitive as follows:
fastload{'C:\\Users\\Paul\\Desktop\\wine.rdf.P', owl}.
and query it using
fastquery{owl, ?X}.
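When the storage container is no longer needed, its contents can be discarded with the fasterase primitive described earlier:

```
fasterase{owl}.
```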
There are two ways to query SPARQL endpoints in Ergo: through the ErgoSPARQL Tool accessible from the ErgoAI Studio's Tools menu and directly from within Ergo rules and queries.
The ErgoSPARQL Tool, accessible from the ErgoAI Studio's Tools menu, will open a window that can be used to query SPARQL endpoints and put the results into Ergo files.
In order to use the tool, enter/edit the SPARQL endpoint, user credentials, IRI prefixes, and a SPARQL query.
Select the output format: the fastload syntax or the native Ergo format. The default option is the fastload format, which supports quick loading of large data sets into Ergo. This is done as explained earlier in this section. The fastload format is preferred, as it is much more scalable.
Change the output predicate name (called node by default), if needed. Note: whether you select the fastload or the native Ergo format, the output will be a set of facts of the form node(arg1,arg2,arg3,...). The main difference between the two forms of output is in how the IRI prefixes are specified.
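For illustration, suppose the tool's fastload-format output was saved in a file result.P and the SPARQL query selected three variables (both the file name and the arity are hypothetical here). The saved node facts could then be loaded and queried as follows:

```
fastload{'result.P', sparqlstore}.
fastquery{sparqlstore, node(?A1,?A2,?A3)}.   // one argument per selected variable
```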
Click on "Query SPARQL endpoint" to send a query to the endpoint. You will be asked to choose an output file. The "Status" label will show "Querying SPARQL endpoint ..." until the result is retrieved and saved.
Once the query terminates, the output is placed in the Ergo content text area on the right side of the screen and the system will now be ready for the next query.
The following example queries the SPARQL DBpedia endpoint for the first 100 triples.
To query SPARQL endpoints directly from Ergo rules and queries, the ErgoSPARQL API provides several primitives, described below.
A complete description of this API is given in the Guide to ErgoAI Packages, and some representative commands are shown below where we use DBpedia and Wikidata as examples.
NOTE: all examples below are shown as if they are calls executed from within rule bodies and queries. To execute them as standalone commands from within a file, they must be prefixed with a ?- query symbol, as usual. For instance:
?- System[open(DBpedia_ConnectionID, 'http://dbpedia.org/sparql', '', '')]@\sparql.
Opening a connection to DBpedia:
System[open(DBpedia_ConnectionID, 'http://dbpedia.org/sparql', '', '')]@\sparql.
Examples of queries to DBpedia:
Query[select(DBpedia_ConnectionID,'
# Companies that have Aerospace as an industry.
# Show also, if available, these companies' English labels.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX onto: <http://dbpedia.org/ontology/>
PREFIX res: <http://dbpedia.org/resource/>
PREFIX prop: <http://dbpedia.org/property/>
SELECT ?uri ?string
WHERE
{
?uri rdf:type onto:Company .
?uri prop:industry res:Aerospace .
OPTIONAL {?uri rdfs:label ?string . FILTER (lang(?string) = "en") }
}
LIMIT 10
')->?Res]@\sparql.
Query[select(DBpedia_ConnectionID,'
# Cities that have an urban population > 2000000 or a total population > 2000000.
# Show also, if available, these cities' English labels.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX prop: <http://dbpedia.org/property/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX onto: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?uri ?string
WHERE
{
?uri rdf:type onto:City.
{ ?uri prop:population ?population. }
UNION
{ ?uri prop:populationUrban ?population. }
FILTER (xsd:integer(?population) > 2000000) .
OPTIONAL {?uri rdfs:label ?string . FILTER (lang(?string) = "en") }
}
LIMIT 10
')->?Res]@\sparql.
Closing the DBpedia connection:
System[close(DBpedia_ConnectionID)]@\sparql.
Opening a connection to Wikidata:
System[open(Wikidata_ConnectionID, 'https://query.wikidata.org/sparql', '', '')]@\sparql.
Examples of queries to Wikidata:
Query[select(Wikidata_ConnectionID,['
# All airports in Wikidata
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT * WHERE {
?place wdt:P31/wdt:P279* wd:Q1248784 .
?place rdfs:label ?placelabel . FILTER (lang(?placelabel) = "en")
}
LIMIT 100
'])->?Res]@\sparql.
Query[select(Wikidata_ConnectionID,'
# All banks in WikiData
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
?s wdt:P31 wd:Q22687 .
?s rdfs:label ?slabel . FILTER (lang(?slabel) = "en")
}
LIMIT 100
')->?Res]@\sparql.
Query[select(Wikidata_ConnectionID,'
# All banks with their country
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?banklabel ?countrylabel WHERE {
?bank wdt:P31 wd:Q22687 .
?bank wdt:P159 ?headquarters .
?headquarters wdt:P17 ?country .
?country rdfs:label ?countrylabel . FILTER (lang(?countrylabel) = "en")
?bank rdfs:label ?banklabel . FILTER (lang(?banklabel) = "en")
}
LIMIT 100
')->?Res]@\sparql.
Closing the Wikidata connection:
System[close(Wikidata_ConnectionID)]@\sparql.
If you run the second and third query above, you will likely notice that the number of actual answers returned is less than the specified limit of 100. This is because, by default, SPARQL may return duplicate answers that were produced by joining different triples. Thus, among the 100 answers returned, 60 might be duplicates of some other answers.
Ergo, in contrast, eliminates the duplicates when it presents the answers to the user, which accounts for the difference. If you wish all SPARQL answers to be distinct, add the DISTINCT keyword, as in the first query.
This API enables one to write Java applications that can talk to Ergo. Namely, the API allows Java applications to start an Ergo instance in a subprocess, load knowledge bases into it, send queries to that instance of Ergo, and process the answers.
There are two interfaces here: the "low-level interface" and the "high-level interface." The high-level interface is easier to use, but it is very restrictive and still experimental. Most users use the low-level interface, and we recommend focusing on it.
The APIs are described in the Guide to ErgoAI Packages, section "Java-to-Ergo Interfaces." See also an example in ErgoAI Example Bank.
This API lets Ergo queries invoke Java as a subprocess and perform operations on Java objects. This works both when Ergo runs as a Studio application and as a command-line application. In the future, this will allow sending messages to arbitrary Java objects, which will turn Ergo into a scripting language for Java applications. For now, however, this API is rather limited: it lets Ergo pop up text windows and write to them, pop up various dialog boxes, and perform some other operations.
This API is described in the Guide to ErgoAI Packages, section "Ergo-to-Java Interface."
This API enables Python programs to query Ergo knowledge bases. It is described in the Guide to ErgoAI Packages, section "Python-to-Ergo Interface." See also an example in the ErgoAI Example Bank.
Ergo can be invoked and queried from within C and C++. A detailed description is in the ErgoAI Example Bank. This feature is available in the development version of Ergo and will appear in the next release.