pyDatalog can be mixed with SQLAlchemy to provide powerful query facilities on relational databases, as shown in this example.
The first step is to create a Base class that combines pyDatalog and SQLAlchemy capability, as follows :
from sqlalchemy.ext.declarative import declarative_base# define a base class with SQLAlchemy and pyDatalog capabilitiesBase = declarative_base(cls=pyDatalog.Mixin, metaclass=pyDatalog.sqlMetaMixin)The second step is to associate a SQLAlchemy session to this Base class, with the appropriate configuration. pyDatalog will use this session to fetch data from the databases.
from sqlalchemy import create_enginefrom sqlalchemy.orm import sessionmaker# create database in memory
engine = create_engine('sqlite:///:memory:', echo=False)# open a session on a database, then associate it to the Base classSession = sessionmaker(bind=engine)session = Session()Base.session = sessionClasses that inherit from Base will now have both pyDatalog and SQLAlchemy capability. There are 2 ways to define the attributes of those classes : explicitly, or by inspecting the existing database tables. The first approach enables the creation of the tables in the relational database.
from sqlalchemy import Column, Integer, String, ForeignKeyclass Employee(Base): # Employee inherits from the Base class __tablename__ = 'employee' # data are stored in the Employee table name = Column(String, primary_key=True) manager_name = Column(String, ForeignKey('employee.name')) salary = Column(Integer)# now create the tableBase.metadata.create_all(engine)The second approach can be used to reverse engineer an existing database, using __table_args__ :
class Employee(Base): __tablename__ = 'employee' __table_args__ = {'autoload':True} # autoload the modelIn both cases, relations can be defined between tables.
The Employee class can now be defined with logic clauses and used in in-line queries, as explained in the previous tutorial. For example :
# who has a salary of 6300 ?X = pyDatalog.Variable()Employee.salary[X] == 6300print(X) # prints [Employee: Mary]pyDatalog uses the latest in-session data :
X[0].salary = 3000 # modify in-session dataEmployee.salary[X] == 3000print(X) # prints [Employee: Mary]
Different classes in a python program can inherit from different Base classes, each having a session on different databases : pyDatalog queries will join data from the relevant databases, performing multi-database queries effortlessly. SQLAlchemy will issue the SQL statements in the proper dialect for each database. Data changes to instances of the classes can be persisted using SQLAlchemy's transaction facilities. See SQLAlchemy's documentation and tutorial for more details.
Queries on non-relational databases can be implemented using predicate resolvers written in python. They can be mixed with queries on relational databases using conjunctive queries or logic clauses.
FoundationDB has a nice tutorial for their pyDatalog binding. The example below illustrates a binding to MongoDB. (see the source).
The first step in the example is to create records in a MongoDB database.
from pymongo import Connectionconnection = Connection()db = connection.Employeesprofiles = db.profilesprofiles_to_insert = [{"name": "John", "diploma": "MSc."}, {"name": "Mary", "diploma": "EE"}, {"name": "Sam", "diploma": "MBA"}]profiles.insert(profiles_to_insert)For this to work, the mongoDB database must be started on the machine. See pymongo tutorial.
The second step is to add a predicate resolver for diploma to the Employee class used in the section above, as explained in the following tutorial.
from SQLAlchemy import Employee # import the SQLAlchemy exampledef _pyD_diploma2(cls, employee, diploma): global profiles if employee.is_const(): r = profiles.find_one({"name": employee.id.name}) if r: yield (employee, r["diploma"]) return raise AttributeError# attach the resolver to the Employee classEmployee._pyD_diploma2 = classmethod(_pyD_diploma2) This resolver search profiles by name. A search by diploma would be resolved in a similar way, using find().
It then becomes easy to query both databases transparently:
from pyDatalog import pyDatalogX, N, Diploma = pyDatalog.variables(3)# Who has a salary of 6800 and a MSc. diploma(Employee.salary[X]==6800) & (Employee.diploma[X]=="MSc.")print(X) # prints [Employee: John]Should diploma be later stored in the relational database, only the python resolver would need to be changed.