Universal Models
and the
Chase Procedure
Universal Models
and the
Chase Procedure
Instructors:
Phokion G. Kolaitis
University of California Santa Cruz and IBM Research
email: kolaitis@ucsc.edu
Andreas Pieris
University of Cyprus and University of Edinburgh
email: apieris@inf.ed.ac.uk
Category: Advanced Course
Dates: July 22 - 26, 2024
Time: 9:00am - 10:30am
Room: Conference Hall - Building 12, NCSR "Demokritos"
Many database problems that involve rule-like constraints, e.g., the problems of logical implication of constraints, computing data exchange solutions, and ontological query answering (to name a few) can be solved by simply exhibiting a universal model of a given database D and a given set of constraints C. Intuitively, a universal model is a representative of all the models of the knowledge-base (D, C), that is, the instances that contain D and satisfy every constraint in C.
The chase procedure is a fundamental tool for building universal models; it takes as input a database D and a set C of rule-like constraints, adds new tuples to D or modifies tuples as dictated by the rules in C, and repeats this process until either all constraints in C are satisfied or it is determined that no universal model exists. After discussing the central notion of universal model, we will introduce the main variants of the chase procedure (restricted chase and semi-oblivious chase), will show that they indeed produce universal models, and will explore their key differences.
We will then focus on the challenge of the non-termination of the chase procedure under tuple-generating dependencies and equality-generating dependencies. In particular, we will discuss the undecidability of deciding whether the chase procedure terminates, and identify broad syntactic conditions that are sufficient and, in some cases, also necessary, for the termination of the chase procedure.
The course will begin with a review of relational databases and queries over relational databases. After this, two large classes of rule-like constraints will be introduced and illustrated: tuple-generating dependencies (tgds) and equality-generating dependencies (egds). Rigorous semantics to tgds and egds will be given using the notion of homomorphism, a fundamental notion that will be used throughout the course.
The notions of a universal model of a knowledge-base and of the certain answers of a query over a knowledge-base will be introduced and illustrated. Two application scenarios, namely ontology-based data access and data exchange, will be described in some depth.
The restricted chase procedure for tgds and egds will be described and its main properties will be investigated. This includes defining the notion of applicable triggers for tgds and egds, the notions of finite and infinite chase sequence, and the result of the chase procedure. The main theorem about the restricted chase asserts that, given a knowledge-base, either this procedure builds a universal model or it determines that no model of the given knowledge-base exists.
Some limitations of the restricted chase will also be discussed and illustrated. The semi-oblivious chase will then be described and investigated.
After this, the focus will be on the chase termination problem, which comes in several different flavors. For arbitrary tgds and egds, the chase termination problem and its variants is undecidable; in fact, this is true even if we focus only on tgds.
To cope with the undecidability of the chase termination, syntactic restrictions on the tgds will be imposed. In particular, the notion of weak acyclicity will be introduced and its properties will be investigated. Settings in which weak acyclicity is both a necessary and sufficient condition for chase termination will also be explored.
Slides
References
A list of research articles, survey papers, and monographs related to the course.
Phokion Kolaitis is a Distinguished Research Professor at the University of California Santa Cruz and a Principal Research Staff Member at the IBM Almaden Research Center. His research interests include principles of database systems, logic in computer science, and computational complexity. For additional information, please visit his personal website.
Andreas Pieris is an Assistant Professor at the University of Cyprus and an Associate Professor at the University of Edinburgh. His research interests are database theory with emphasis on knowledge-enriched and uncertain data, knowledge representation and reasoning, and logic in computer science. For additional information, please visit his personal website.