Version: 3 December 2011
ERROL (Entity Relationship Role Oriented Language; Markowitz and Raz 1983a) is a declarative database query and manipulation language for the Entity-relationship model (ERM). It is applicable to any data model onto which the ERM can be mapped, which includes virtually any general-purpose database data model. It is based on the capability of ER diagrams to be described accurately by simple Natural language (NL) sentences. A specification of a complex operation upon an ERM database can be described accurately by a complex and/or compound NL sentence constructed from the simple sentences describing the respective ER diagram. An ERROL expression mimics such an NL sentence, with one-to-one correspondence between ERROL subexpressions and NL subsentences: An ERROL expression can look like the corresponding NL sentence, or at least like a similar, equivalent one. This allows very complex queries to be written in ERROL by simple conversion from their NL specifications. It also allows straightforward checking that an ERROL expression meets a complex NL specification. With such characteristics it can be a foundation for future Data management languages, more convenient for humans to use than existing languages, which may need complex expressions even for moderately complex NL expressions (e.g., SQL; see Example below).
ERROL is also applicable to newer applications such as querying the Semantic web using ontologies. It is also applicable to semantic data models other than the ERM, e.g., Object-Role Modeling (ORM), which has many similarities to the ERM. ERROL has some similarities to the later Gellish (in particular to Gellish English), a formal language with a strong connection to natural languages, and can use its dictionaries.
Reshaped relational algebra (RRA; Markowitz and Raz 1983b, 1984; Raz 1986), with operators that follow the semantics of respective major NL constructs, has been developed to support ERROL over relational databases. It is used both to specify ERROL's semantics concisely and accurately, and to implement ERROL effectively over relational databases.
ERROL and its RRA translation expose and exploit the connection between the way humans reason and talk about needed information, possibly very complex, and the database operations needed in order to compute this information from the database data. A sequence of such RRA operations is generated automatically from an ERROL expression by an ERROL-to-RRA compiler. The compiler output has been applied directly to a relational database, and also has a translation to SQL, the standard interface, for a straightforward, "regular" execution by SQL database systems, to take advantage of their query optimization.
While Natural language (NL) can be vague and ambiguous, ERROL (Markowitz and Raz 1983a) is accurate, with well-defined semantics, and thus provides a check for NL queries. With these properties ERROL is a good compromise between NL as a query language and other database languages: Long experience with database languages shows that simple queries are easy to phrase with both NL and most database languages. However, some complicated queries may be hard to phrase accurately in most languages, including NL. In most cases NL is the natural query tool for humans, but because it is sometimes vague and ambiguous, the meaning of a complex NL query is often not accurately defined, and a dialogue that includes feedback in a well-defined formal language is needed to verify the intended meaning when NL is used. In practice a query specification is typically first thought of and formulated in NL, and then translated by an expert to the desired database language. Validating translation correctness (semantic equivalence) is usually difficult for complex queries and relies completely on the expert's understanding of the NL specification. A wrong interpretation can be very costly, both in execution time and in its implications. Being very close to NL, and identical to it in most cases, ERROL eases these difficulties.
It is worth noting that the use of English in ERROL differs in nature from its use in other computer languages like COBOL and SQL: While the latter use English-like commands, in ERROL a predicate (in a query or other manipulation) is specified by an English-like expression.
Though originally motivated by the linguistic aspect of ERM, ERROL has been found to provide a powerful paradigm beyond this aspect: Specification of a complex query predicate by Navigation in an ER Diagram ("query by navigation"; the emphasis is on navigation in an ER Diagram, versus any other data structure). Even without exploiting many NL elements, the navigation itself together with basic schema elements provides accurate specification. As a matter of fact the navigation is independent of any specific NL, and presents the semantic relations (Relationships and comparisons) between objects of interest (Entities and attributes, and resulting objects by arithmetic operations, aggregations, and logic derivations), where the ER diagram provides a limited form of semantic network. All the needed information for a complete specification over a given schema exists in a skeleton ERROL specification which includes only ER Diagram elements, and may include constants, aggregate functions, logical connectives, arithmetic operations and comparison operators. Using such language-independent representation, a specification can be easily machine-translated accurately among different natural languages, using small numbers of each language's syntactic constructs, and having an ER diagram described by respective simple sentences in different languages. Specification reconstruction in English (that can be re-processed correctly by the ERROL compiler) from the language-independent representation has been done successfully (within the ERROL System project; see the Brief history section below).
Reshaped relational algebra (RRA; Markowitz and Raz 1983b, 1984) has been developed to support ERROL over relational databases. RRA is equivalent to Relational algebra (RA) in expressive power: each operator of either algebra can be expressed by the operators of the other (Raz 1986). A strong correlation exists between ERROL constructs (and the corresponding NL constructs) and RRA operators. This is used both to specify ERROL's semantics concisely and accurately, and to implement ERROL effectively over relational databases. For any ERROL specification (expression; query or other manipulation) a corresponding RRA expression is derived in a straightforward way. This RRA expression computes the specification over any relational database with a schema that is semantically relevant to the specification (possibly through schema transformation for ERM compatibility with the specification, when needed; the computation result is a relation).
ERROL is a declarative language: a specification of what is requested is given, unlike in a procedural language, which specifies the way to compute it (SQL, for example, comprises portions of each type). An ERROL expression describes a navigation hypertree (a hypertree is an acyclic hypergraph), generated by navigating in the ER diagram. Names and constants are nodes. Entities and Relationships define one type of edge, sets of attribute names. A connecting ERROL construct defines a second type of edge, a "regular" tree (dual node) edge. (See (Raz 1987) below for more detail.) The navigation may include jumps in the diagram (in case of comparisons) or repetitions over the same sections of the diagram (if the query specification requires repeated utilization of the same diagram elements). A complex specification of a hypertree computation by a long NL sentence may suffer from ambiguities due to unclear connections between sentence parts. ERROL solves this problem by inserting separators (primarily parentheses) when needed, to uniquely define an expression's hypertree with the correct intended meaning.
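The notion of an acyclic hypergraph can be illustrated with a minimal Python sketch. It assumes alpha-acyclicity (tested by the GYO reduction) as the notion of acyclicity, which the source does not specify, and it models only the first edge type (sets of attribute names). The function name and the attribute names are hypothetical and are not taken from the ERROL papers.

```python
from typing import Iterable, List, Set


def is_alpha_acyclic(hyperedges: Iterable[Iterable[str]]) -> bool:
    """GYO reduction: a hypergraph is alpha-acyclic iff it reduces to nothing."""
    edges: List[Set[str]] = [set(e) for e in hyperedges]
    changed = True
    while changed:
        changed = False
        # Rule 1: drop every node that occurs in exactly one hyperedge.
        counts = {}
        for e in edges:
            for v in e:
                counts[v] = counts.get(v, 0) + 1
        for e in edges:
            lonely = {v for v in e if counts[v] == 1}
            if lonely:
                e -= lonely  # in-place update of the set stored in `edges`
                changed = True
        # Rule 2: drop one hyperedge that is empty or contained in another.
        for i, e in enumerate(edges):
            if not e or any(i != j and e <= f for j, f in enumerate(edges)):
                del edges[i]
                changed = True
                break
    return not edges


# Hypothetical hyperedges: each entity or relationship is a set of attribute names.
navigation = [
    {"dept_name"},                                # DEPARTMENT
    {"dept_name", "item_name", "supplier_name"},  # REQUEST (ternary relationship)
    {"item_name", "color"},                       # ITEM
    {"supplier_name"},                            # SUPPLIER
]
triangle = [{"a", "b"}, {"b", "c"}, {"a", "c"}]   # a cyclic counterexample

navigation_is_hypertree = is_alpha_acyclic(navigation)
triangle_is_hypertree = is_alpha_acyclic(triangle)
```

Under these assumptions the navigation over DEPARTMENT, REQUEST, ITEM and SUPPLIER reduces completely, while the three pairwise-overlapping edges of the triangle do not.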
Structured English (SE; Raz 1987; not to be confused with Structured English used for pseudocode) is an extension of a subset of the English language and an enhancement of the initial ERROL. In SE several "syntactic sugar" elements of ERROL have been made more flexible, and further flexibility with word usage and sentence structure has been introduced, to get it closer to NL. However, the original correspondence between NL and respective ERROL basic constructs has been preserved. In what follows no distinction is made between ERROL and SE.
The linguistic aspect of the Entity relationship model, and the way it is utilized by ERROL for navigation in an ER Diagram, are demonstrated by the following example:
See additional examples in a section below.
Main article: Reshaped relational algebra
The Reshaped relational algebra (RRA; Markowitz and Raz 1983b, 1984; an alternative formulation can be found in Raz 1986) has been developed to support ERROL over relational databases. RRA is equivalent to the Relational Algebra (RA) in expressive power (each operator of either algebra can be expressed by the operators of the other; Raz 1986), and has strong analogies to some basic NL constructs. As such it is well suited for describing the semantics of ERROL, as well as for implementing ERROL over relational databases. For any ERROL specification (expression; query or other manipulation) a corresponding RRA expression computes the specification over a relational database. The main difference between RRA and RA is that natural join and projection operations are embedded in various RRA operators.
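To make the idea of an operator with an embedded natural join and projection more concrete, the following minimal Python sketch composes the two standard RA operations into one function. It is not a definition of any actual RRA operator (those are given in the cited papers); the helper `join_then_project`, the relation contents and the attribute names are all hypothetical.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]


def natural_join(r: List[Row], s: List[Row]) -> List[Row]:
    """Standard RA natural join: pair rows that agree on all shared attributes."""
    joined = []
    for a in r:
        for b in s:
            shared = a.keys() & b.keys()
            if all(a[k] == b[k] for k in shared):
                joined.append({**a, **b})
    return joined


def project(r: List[Row], attrs: Iterable[str]) -> List[Row]:
    """Standard RA projection, eliminating duplicate rows (set semantics)."""
    attrs = list(attrs)
    seen = set()
    projected = []
    for row in r:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            projected.append(dict(zip(attrs, key)))
    return projected


def join_then_project(r: List[Row], s: List[Row], attrs: Iterable[str]) -> List[Row]:
    """Hypothetical composite operator: the join and projection happen in one step."""
    return project(natural_join(r, s), attrs)


# Hypothetical relations for illustration only.
ITEM = [{"item": "bolt", "color": "Red"}, {"item": "nut", "color": "Blue"}]
REQUEST = [
    {"dept": "D1", "item": "bolt", "supplier": "S1"},
    {"dept": "D2", "item": "bolt", "supplier": "S2"},
    {"dept": "D1", "item": "nut", "supplier": "S1"},
]

# Which departments request items of which colors?
dept_colors = join_then_project(REQUEST, ITEM, ["dept", "color"])
```

The caller never names the join attribute: it is found from the shared attribute name `item`, which is the kind of automatic connection that proper renaming is meant to enable.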
An ERROL expression (or respective hypertree) can be translated in a straightforward way to an RRA expression. An RRA expression is a partial order (typically with several compatible sequences) of RRA operations, which provides a procedure for computing the value of the ERROL expression over a relational database (i.e., computing the relation that results from the query or data manipulation) by manipulating its relations. Each ERROL subexpression type has a corresponding RRA operator. The subexpression variables (entity, relationship, and attribute names) and constants comprise the operator's parameters. The operators' order is determined by the hypertree structure of the (possibly parenthesized) ERROL expression. By proper renaming, the join operation embedded in RRA operators automatically connects corresponding attributes in entities and relationships, and resolves references by using a single entity-identifying temporary name for all occurrences of that entity in the query that have the same reference symbol (occurrences of the same entity with different reference symbols are given different temporary names, as are occurrences of the same entity with no reference symbol).
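The relation between a partial order of operations and its compatible sequences can be sketched in Python with the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later). The operation names and their dependencies below are hypothetical placeholders, not output of the ERROL-to-RRA compiler.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical partial order: each operation maps to the operations
# whose results it needs before it can run.
depends_on: Dict[str, Set[str]] = {
    "op_items": set(),
    "op_depts": set(),
    "op_connect": {"op_items", "op_depts"},
    "op_result": {"op_connect"},
}


def one_compatible_sequence(deps: Dict[str, Set[str]]) -> List[str]:
    """Return one linear order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


def is_compatible(sequence: List[str], deps: Dict[str, Set[str]]) -> bool:
    """Check that a sequence runs each operation once, after its prerequisites."""
    position = {op: i for i, op in enumerate(sequence)}
    return (
        len(sequence) == len(deps)
        and set(sequence) == set(deps)
        and all(position[p] < position[op] for op, preds in deps.items() for p in preds)
    )


sequence = one_compatible_sequence(depends_on)
```

Here `op_items` and `op_depts` are unordered with respect to each other, so both `op_items, op_depts, op_connect, op_result` and `op_depts, op_items, op_connect, op_result` are compatible sequences of the same partial order.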
RRA expression simplification, and consistency checking together with subexpression contradiction and tautology identification, can be done using RRA axioms and theorems (Raz 1986). RRA expression computation optimization can be done similarly to the ways it is done for Relational algebra (RA).
Both RRA and the initial version of ERROL, with all the linguistic constructs needed for queries, and including implementation guidelines, were developed by Victor M. Markowitz as the subject of his M.Sc. thesis (Markowitz 1983) at the Technion - Israel Institute of Technology (advisor: Yoav Raz). Further ERROL enhancements and implementation, including ERROL to RRA translation (M.Sc. project of Reuven Cohen; Cohen 1984), were done by Yoav Raz together with graduate students. In 1984 Yoav Raz, Victor Markowitz, and Reuven Cohen won the Computer Science Award of ILA – The Israeli Information Technology Association.
The ERROL System (Raz et al. 1984) implements database queries and manipulations over a relational database using SE and RRA. It was developed during the years 1982-1988 at the Technion, Israel, using UNIX, Lex, YACC, and Ingres, and further enhanced at UCSD (see output examples in Raz 1987).
The examples below are given primarily in their skeleton representation, or close to it, rather than in their closer-to-NL forms (e.g., set operations rather than quantifiers), to hint more clearly at their RRA translations.
This example relates to a factory database. The portion of the database relevant to this example (and other examples below) has departments, items in stock, and suppliers of these items as entities. Departments REQUEST (order) items from suppliers. Suppliers SUPPLY items to departments. Both last sentences above define ternary (three-way) relationships.
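One way to picture the two ternary relationships is as sets of triples, as in the following minimal Python sketch. The attribute names, the sample data and the helper `unfulfilled` are hypothetical and serve only to show that each relationship instance ties together a department, an item and a supplier at once.

```python
from typing import NamedTuple, Set


class Request(NamedTuple):
    """A department REQUESTs an item from a supplier."""
    department: str
    item: str
    supplier: str


class Supply(NamedTuple):
    """A supplier SUPPLIES an item to a department."""
    supplier: str
    item: str
    department: str


# Hypothetical relationship instances.
REQUEST = {
    Request("D1", "bolt", "S1"),
    Request("D1", "nut", "S2"),
    Request("D2", "bolt", "S2"),
}
SUPPLY = {
    Supply("S1", "bolt", "D1"),
    Supply("S2", "bolt", "D2"),
}


def unfulfilled(requests: Set[Request], supplies: Set[Supply]) -> Set[Request]:
    """Requests with no matching supply of the same item by the same supplier."""
    supplied = {(s.department, s.item, s.supplier) for s in supplies}
    return {r for r in requests if (r.department, r.item, r.supplier) not in supplied}


pending = unfulfilled(REQUEST, SUPPLY)
```

Because all three participants appear in every triple, a question such as "which requests has the named supplier not yet fulfilled for this department" can be answered without combining separate binary relationships.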
(from example 2 in (Raz 1987) )
Using the schema in Example 1 above consider the following imaginary complex query:
A possible straightforward skeleton ERROL expression for this query is the following:
or the following:
Using the schema in Example 1 above:
If half the number of "Red" items requested by any single department is considered, the NL sentence is slightly changed:
A possible ERROL query is the following:
Using the schema in Example 1 above:
Using the schema in Example 1 above: