Reference 0.11.2

Methods and class

The pyDatalog module has the following methods :

    • assert_fact(predicate_name, *terms) : asserts predicate_name(terms[0], terms[1], ...)
    • retract_fact(predicate_name, *terms) : retracts predicate_name(terms[0], terms[1], ...)
    • load(code) : where code is a string containing a datalog program, as described in the section below. This method is used to add facts and clauses to the datalog database.
    • program() : a function decorator that loads the datalog program contained in the decorated function.
    • predicate() : a function decorator that declares a predicate resolver written in python
    • ask(query) : where query is a string containing one or more literals joined by the & operator. It returns an instance of Answer, or None if the query has no answer.
    • clear() : removes all facts and clauses from the datalog database.

The Answer class returned by ask(query) contains the following attributes and methods:

    • name : name of the predicate that was queried
    • arity : arity of the predicate
    • answers : a list of tuples that satisfy the query. The length of each tuple is the same as the arity.
    • __eq__(other) : facilitates comparison to another set of tuples
    • __str__() : returns a printable representation of the answers

Grammar of a pyDatalog program

In theory, the code string of a pyDatalog program can contain any python code, as defined in the official grammar of Python. However, function and variable names (that are not reserved by python) are considered Datalog symbols, have special meaning, and should appear only in statements that follow a subset of the python grammar, as defined below.

The terminal symbols in this grammar are defined in BNF as follows :

    • simple_predicate ::= [a-zA-Z_] [0-9a-zA-Z_]*
    • constant ::= [a-z] [0-9a-zA-Z_]* | python literals
    • variable ::= [A-Z_] [0-9a-zA-Z_]*
    • Note : words starting with _pyD_ are reserved for pyDatalog
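The naming rules above can be sketched with regular expressions. The helpers classify and is_predicate_name below are hypothetical and not part of pyDatalog; they assume that names may use the full ASCII letter ranges (a-z, A-Z), and they do not cover Python literals used as constants.

```python
import re

# Assumption: identifiers use the full ASCII letter ranges.
PREDICATE = re.compile(r'[a-zA-Z_][0-9a-zA-Z_]*\Z')
CONSTANT = re.compile(r'[a-z][0-9a-zA-Z_]*\Z')
VARIABLE = re.compile(r'[A-Z_][0-9a-zA-Z_]*\Z')


def is_predicate_name(name):
    """True if name is a syntactically valid, non-reserved predicate name."""
    return bool(PREDICATE.match(name)) and not name.startswith('_pyD_')


def classify(symbol):
    """Return 'reserved', 'constant', 'variable' or 'invalid' for a term name."""
    if symbol.startswith('_pyD_'):
        return 'reserved'  # prefix reserved for pyDatalog itself
    if CONSTANT.match(symbol):
        return 'constant'  # starts with a lowercase letter
    if VARIABLE.match(symbol):
        return 'variable'  # starts with an uppercase letter or underscore
    return 'invalid'


for name in ('bill', 'X', '_tmp', '_pyD_x', '1abc'):
    print(name, classify(name))
```

The reserved prefix is checked first because a name such as _pyD_x would otherwise match the variable pattern.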

Please note:

    • "=" defines a logic formula, while "==" appears in a fact, clause or query and must always be surrounded by parentheses
    • an aggregate function can only appear in the head of a clause
    • an inequality must be surrounded by parentheses, and can only appear in the body of a clause
    • although the order of pyDatalog statements is indifferent, the order of literals within a body is significant:
      • an expression used as a key of a function must be bound by a previous literal (otherwise no result is returned)
      • the right hand side of X==expr must be bound by a previous literal (otherwise, no result is returned)
      • the right hand side of p[X]<expr must be bound (otherwise, no result is returned)
      • the left and right hand sides of X<expr comparisons must be bound (otherwise, an error is raised)
      • the variables in a negated body must be either bound by a previous literal, or not used later in the body
    • the head of a clause can only contain constants or variables (but no expressions). Each variable in the head must also appear in the body

Aggregate functions:

    • len (P[X]==len(Y)) <= body : P[X] is the count of values of Y (associated to X by the body of the clause)
    • sum (P[X]==sum(Y, for_each=Z)) <= body : P[X] is the sum of Y for each Z. (Z is used to distinguish possibly identical Y values)
    • concat (P[X]==concat(Y, order_by=Z, sep=',')) <= body : same as 'sum' but for strings. The strings are sorted by Z, and separated by ','.
    • min, max (P[X]==min(Y, order_by=Z)) <= body : P[X] is the minimum (or maximum) of Y sorted by Z.
    • rank (P[X]==rank(for_each=Y, order_by=Z)) <= body : P[X] is the sequence number of X in the list of Y values when the list is sorted by Z.
    • running_sum (P[X]==running_sum(N, for_each=Y, order_by=Z)) <= body : P[X] is the sum of the values of N over every Y that comes before X, or is equal to X, when the Y values are sorted by Z.
    • The named arguments must be specified in the given order. X and the named arguments can be a list of variables (instead of just one variable), to represent more complex grouping. Variables in order_by arguments can be preceded by '-' for descending sort order. If the aggregation function does not depend on a variable, use a constant (e.g. P[None] == len(Y)).
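To make the role of for_each and order_by more concrete, here is a plain-Python emulation of the sum and concat aggregates as they are described above. It does not call pyDatalog; the sample solutions and the names total and names are made up for illustration, and the grouping logic is an assumption based on the descriptions in this section.

```python
from collections import defaultdict

# Hypothetical solutions (X, Y, Z) produced by a clause body.
# A set is used because Datalog results contain no duplicate tuples.
solutions = {
    ('team_a', 10, 'ann'),
    ('team_a', 10, 'bob'),  # same Y as 'ann', kept apart by a different Z
    ('team_a', 5, 'cid'),
    ('team_b', 7, 'dan'),
}

# Group the distinct (Y, Z) pairs by X.
pairs_by_x = defaultdict(set)
for x, y, z in solutions:
    pairs_by_x[x].add((y, z))

# Emulates (P[X]==sum(Y, for_each=Z)) <= body
total = {x: sum(y for y, _ in pairs) for x, pairs in pairs_by_x.items()}

# Emulates (P[X]==concat(Z, order_by=Z, sep=',')) <= body
names = {x: ','.join(sorted(z for _, z in pairs))
         for x, pairs in pairs_by_x.items()}

print(total)
print(names)
```

Without Z to tell them apart, the two identical Y values of team_a would collapse into one and the sum would be 15 instead of 25.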

Beware that, when loading a datalog program, a symbol could become a constant. For example,

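from pyDatalog import pyDatalog

@pyDatalog.program()
def _():
    + a(i)  # i has no Python value yet, so it is the constant 'i'
    for i in range(3):
        + b(i)  # i is now an integer: asserts b(0), b(1) and b(2)

print(pyDatalog.ask("a('i')")) # prints a set with 1 element : the ('i',) tuple
print(pyDatalog.ask("b(X)")) # prints a set with 3 elements, each containing one element : 0, 1 or 2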

In + a(i), i has not been assigned a value yet, so it is read as the Datalog constant 'i'. The for loop then assigns an integer to i, which is inserted as a constant in + b(i).

In-line queries

Classes inheriting from pyDatalog.Mixin can participate in in-line queries. An in-line query follows the syntax of a body (see above), where:

    • variables must be declared using X=pyDatalog.Variable() or X,Y,Z=pyDatalog.variables(3)
    • literals and functions must be prefixed by the class name, and have arguments of type pyDatalog.Variable or python literals. (for example, A.p(X,A.b) is not allowed)
    • the in keyword must be replaced by a call to ._in( ), e.g. a.p[X]._in((1,2)). not in must be replaced by ._not_in( )

The result of an in-line query behaves like a python list, with a column for each variable, in order of their appearance in the query. Additionally, it supports the >= X operator, which returns the first value of variable X in the result of the query.

After an in-line query, each variable in the query behaves like a list of its values in the result. Additionally, the first value (or None) can be obtained with .v() (e.g. X.v()).