CLAIRE 4 Programming Language
Welcome
"The Art of Elegant Programming"
CLAIRE at a glance
CLAIRE 4.1.2 is now a "beta" release" - August 22nd, 2024
A Simple Object-Oriented Data Model
We first briefly describe the CLAIRE object-oriented data model. Classes are defined with a list of typed slots, and only single inheritance is supported. Each class is a first-class object of the system and is accessible at runtime (an optional list of instances is stored for each class). The <: operator is used to define a class (because subclasses are subtypes in CLAIRE); it consists of appending a list of new slots to the slots of the parent class. Slots support default values and unknown values, i.e., there is a special unknown object that represents the absence of value. The following two examples illustrate class definition; the second one shows a parameterized class (the usual Stack class, parameterized by the type of its elements).
Point <: object(x:integer = 0, y:integer = 0)
Stack <: object(of:type, content:list = nil)
The second example is slightly surprising, compared to other approaches such as template classes, because it implicitly rely on the fact that types are first-class objects (hence the type). Thus we attach to each stack the type that must contain the objects that are pushed in the stack through the slot of. The responsibility of checking that objects in the content of the stack belong to this type is left to the programmer. On the other hand, parameterized signatures, as we shall see later, can express this dependency to support the strong typing of stack operations. The CLAIRE data type complements the simplicity of the class hierarchy. A type is either a class, a constant set of objects, the union or the intersection of two types, a parameterized class, an integer interval or a typed subset of list, array or set. A parameterized class is the subset of the class such that the parameter belongs to a given type. For instance, Stack[of:{integer,float}] is the set of stacks whose type parameter is integer or float. A stack from this parameterized type will contains either integers (if its of parameter is integer) or floats, but not both. A stack that contains either would be represented by the parameterized type Stack[of:(integer ∪ float)] There are three kinds of primitive collections: arrays (constant length and ordered), lists (dynamic length, ordered) and sets (dynamic length, no order and no duplicates). Primitive collection can either be typed, with a read-only type attribute of, similar to the Stack example, or un-typed and immutable. This means that only typed lists or sets support updates, which makes sense since we need a type to guarantee the type-safety of updates. Immutable lists (or sets) receive a dynamic type when they are created.
Consequently, both list and set types come in two flavors: either strongly typed (list, à la C++), or supporting inclusion polymorphism (list[t]), i.e., list[t1] ≼ list[t2] ⇔ t1 ≼ t2. The type list contains typed lists l whose type parameter of(l) is precisely t. A list object is a list that contains objects that belong to t, including instances of subtypes of t. On the other hand, the type list[t] contains both typed lists l whose type parameter of(l) is a subtype of t, as well as immutable lists whose member belong to t. This applies equally to set and set[t]. The distinction between these two kinds of list/set types is a key feature in CLAIRE. Since list ≼ list[t], we can write generic list methods, that apply both to mutable and dynamic, immutable lists. The type list[t]is interesting because of inclusion polymorphism, and because read operations on lists of type list[t] can be statically type-checked. On the other hand, list is monomorphic (list ≼ list ⇒ t1 = t2), but both read and write operations can be statically type-checked. Let us briefly illustrate this on an example. Consider three classes A ≤ B ≤ C and the following methods:
foo(x:list[C], y:B) : void -> x[1] := y
bar(x:list, y:B) : void -> x[1] := y
Although B is a subclass of C, the insertion in the foo method into the list x is not safe, since x may be a list object; on the other hand, bar is safe but less flexible. Thus, the method foo relies on dynamic typing at run-time to guarantee the safety of the update, whereas the method bar is compiled more efficiently. In addition, the tuple type is used for constant, heterogeneous, typed “array-like” (i.e., indexed) containers; for instance, tuple(integer,integer,float) represents the set of tuples with exactly three elements: two integers and a float. For example, tuples are used in CLAIRE to return multiple values from a function. We can summarize the richness of type expressions with the following list:
classes are types (everything is an object in CLAIRE, that belongs to a class)
such are parameterized classes (classes with a type filter, such as stack<integer>)
integer intervals are types
constant sets are types
a union of two types is a type, so is the intersection (because of the lattice structure)
if t is a type, subtype[t] is a type, which yields parameterized lists and sets,
for full polymorphism (supporting read/write operations)
Types are conceptually represented as expressions that denote sets of objects, and can be used everywhere in a program since they are reified. Types as objects are themselves typed with the subtype[...] type; for example, subtype[integer] contains all types that are included in integer. This set-oriented view is supported by the fact that type subsumption is defined by the inclusion of the set extensions for all possible evolution of the object database. Types form an inclusion lattice with syntactical greater lower bound and subsumption (subtyping). The relationship between the type lattice and the powerset of the object domain is the basis for using an abstract interpretation scheme. CLAIRE inherits the SMALLTALK strategy of considering everything as an object, from a class. This implies that primitive types such as integer, floats, strings, Boolean are seen as primitive classes in CLAIRE.
2. Polymorphism
Methods in CLAIRE are simply overloaded functions, where each parameter contributes to the overloading and each parameter can be typed with any CLAIRE type. For instance, attaching a method to a class union is useful to palliate the absence of multiple inheritance. Suppose that a class A is a natural candidate for multiple inheritance with parents B and C. If we chose B as the unique parent of A, we can define the methods for C that we want to be inherited by A on the explicit union (A U C). This is not equivalent because of the lack of extensibility (adding a new subclass to (A U C) is only possible through the addition of a subclass to A or C). However, it is close enough for most practical cases. Methods may be attached anywhere in the type lattice, not simply to the class hierarchy (more precisely, anywhere in the lattice of type Cartesian products). This makes CLAIRE closer to a language like Haskell than to C++. A method is called a restriction of the global function which itself is called a property. It is important to notice that the class hierarchy is considered as a “law” given by the user (seen as set inclusion) and is not based on type substitution. As a consequence, there are no constraints imposed on the user when adding new methods. Covariance is enforced implicitly but not imposed explicitly (adding a new method on a subdomain may augment the range of the function on a larger domain). If we define:
foo(x:integer) : (0 .. 5) -> ... // ... stands for “some expression”
foo(x:(1 .. 10)) : (5 .. 10) -> ...
the range of foo on integer is actually the union (0 .. 10). The range in the first definition is taken as a type restriction for this first definition, not for “foo on any integer” which is defined by the two restrictions. The use of a "type lattice" (i.e., closed under intersection) does not prevent the occurrence of “inheritance conflicts” where two restrictions have conflicting definition on overlapping signature. If one domain is included in the other it is assumed to take precedence (such is the case for the foo example, so foo(3) uses the second definition). Otherwise, a warning is issued. The lattice structure guarantees that it is possible to solve the inheritance conflict with a proper, more specialized, definition, since the intersection of overlapping signatures can always be described with types. Parameterized types can be used to produce parameterized signature. That means that we can use the value of a parameter, when it is a type, inside the signature, as in the following example (for each operation op, x :op y is syntactical sugar for x := (x op y)):
push(s:Stack, y:X) : void -> s.content :add y
The use of the type variable X states that the second argument must belong to the type contained in the slot of from the first argument (the stack). Note that there is an explicit difference between stack[of:{integer}] and stack[of:subtype[integer]]. The first is the set of stacks with range parameter integer (exactly), whereas the second can be thought of as the stacks of integers (stacks whose range parameter is a subtype of integer). There is no parametric subtyping with the first (i.e. stack[of:{integer}] ≼ stack[of:{number}] is false) as it should be. However, there is parametric subtyping with the second (i.e. stack[of:subtype[integer]] ≼ stack[of:subtype[number]]). To alleviate notations, stack<integer> is syntactical sugar for stack[of:{integer}]
Another important feature is the fact that CLAIRE supports higher-order functions. Properties, that are function names, may be passed as arguments to a method and they can, in turn, be applied to functions. For instance, one can write:
exp(f:property, x:any, n:integer) : integer // exponentiation
-> if (n = 0) f(x) else f(exp(f, x, n - 1))
This example will not be type-checked by the compiler and CLAIRE will rely on dynamic typing. On the other hand, most methods are statically type-checked and the C++ or Go code produced by the compiler is very close to hand-written code (on classical benchmarks for procedural languages, using CLAIRE is similar to using C++, as is shown here. Last, second-order types can be attached to methods. A second-order type is a method annotation, defined by a lambda abstraction, which represents the relationship between the type of the input arguments and the result. More precisely, a second-order type is a function such that, if it is applied to any valid type tuple for the input arguments, its result is a valid type for the result of the original function. Let us consider the two following examples:
Identity(x:any) : type[x] -> x
top(s:Stack) : type[x] -> last(s.content)
The first classical example states that Identity is its own second-order type. The second one asserts that the top of a Stack always belong to the type of the Stack elements (of parameter). In the expression e that we introduce with the type[e] construct, we can use the types of the input variables directly through the variables themselves, we may also use the extra type variables that were introduced in a parameterized signature (as in the top example) and we may use the basic CLAIRE operators on types such as ∪ (type union), ∩ (type intersection) and some useful operations such as member. member is used in particular to describe the range of the enumeration method set!. This method returns a set, whose members belong to the input class c by definition. Thus, we know that they must belong to the type member(X) for any type X to which c belongs (If c is a type, member(c) is the minimal type that contains all possible members of c). This translates into the following definition:
set!(c:class) : type[set[member(c)]] -> instances(c)
In the same way that overloading becomes more powerful with a richer type system, we have introduced tables to represent binary relations. Tables extend traditional arrays by allowing the domain (and the range) to be any type. When the domain is not an integer interval (the usual case for arrays) it is implemented through hashing. The following examples illustrate CLAIRE tables:
square[x:(0 .. 20)] : integer := x * x
dist[x:(0 .. 10), y:(0 .. 10)] : float := 0.0
creator[c:class] : string := “who created that class”
color[s:{car, house, table}] : set := {}
married[t:tuple(person,person)] : tuple(date,place) U {nil} := nil
The combination of objects, relations (with explicit and implicit inverses), heterogeneous sets and lists makes CLAIRE a good language for knowledge representation. For instance, entity-relationship modeling is straightforward.
3. Set and Logic Programming
Sets play an important role in CLAIRE. We saw that each type represents a set and may be used as such. A set is used through testing membership, iteration, or combining with another set and a set operation. Membership is tested using the operation ∈ (the ASCII representation is ‘%’). Iteration is performed with the for x in S e(x) control structure. We have also introduced two convenient expressions:
• exists(x in S | P(x)) returns true if there exists an x in S such that P(x) is true, and false otherwise
• some(x in S | P(x)) returns an x from S such that P(x) is true if one exists, and unknown else (as in ALMA [AS97]).
For instance, we may write:
x ∈ (string U list)
for y in (1 .. 10) U (20 .. 30)
(if exists(c in (1 .. 10) | ok?(c + y)) choose(y))
when x := some(t in TASK | completed?(t)) in register(x) // t was found
else error(“no completed task was found”)
Although primitive data types (such as string or list) may be considered as sets, they are seen as infinite sets, thus trying to enumerate them will yield an error. New sets can be formed through selection (e.g., {x in S | P(x)}) or set image (e.g., {f(x) | x in S}). When duplicates should not be removed, a list image can be formed with list{f(x) | x in S}. Using a straightforward syntax leads to straightforward examples:
{x in Person | x.age > 18}
exists(p in {x.father | x in (man U woman)} | p.age < 20)
list{x.age | x in Person}
for x in {f(y) | y in (1 .. 10)} print(x)
sum(list{y.salary | y in {y in Person | y.dept = sales}})
In addition to sets, CLAIRE uses a fragment of an object-oriented logic to support inference rules associated to events, which will be presented in the "table & rules" reference pages. This logic may be seen as an object-oriented extension of binary DATALOG with object methods, set expressions and negation. The following expression is the logic part of a CLAIRE rule:
edge(x,y) | exists(z, edge(x,z) & path(z,y)) => x.path :add y
The first extension is the introduction of interpreted functions, which are represented by object methods. These methods are either comparison methods or binary operations. For computing x.tax from x.salary, we may write:
exists(z, z = x.salary & z > 30000 & z < 50000 & y = 3000 + (z - 30000) * 0.28 ) => x.tax := y
The second extension is the introduction of set expressions. Here is a simple example that uses a set expression to represent the set of children who are still minors:
z = 2 + size({y in Person | y ∈ x.children & y.age <= 18}) => x.family_size := z T
he last extension is the introduction of negation (using not(P(x,y))). Although this object-oriented logic supports a top-down complete resolution strategy (with stratified negation), it is only used in CLAIRE to provide forward-chaining inference rules. A rule is defined by linking a logical condition to a conclusion, which is any CLAIRE expression. Here is a simple example:
compute_salary(x:Person[age:(18 .. 40)]) :: rule(
x.salary < average_salary({y in Person | x.dept = y.dept}) => x.next_salary := x.salary * 1.1)
To complete this brief overview of CLAIRE, we need to mention the “versioning” mechanism, designed for tree search exploration, like branch & bound. A version (also called a world, using the AI terminology) is a virtual copy of the state of the objects. The goal is to be able to return to a previously stored state if a hypothetical branch fails during problem solving. As a consequence, worlds in CLAIRE are organized into a stack and only two operations are allowed: one for copying the current state of the database and another for returning to the previous state. The part of the objects that supports these defeasible updates is defined by the programmer and may include slots, tables, global variables or specific data structures such as lists or arrays (defeasible is an AI term that means “backtrack-able”). Each time we ask CLAIRE to create a new world, CLAIRE saves the status of defeasible slots (and tables, variables, ...). Worlds are represented with numbers, and various operations are provided to query the current world, create and return to previous worlds, with or without “forgetting” the most recent updates. The use of worlds is also encapsulated with the branch(e) control structure which creates a world, evaluates the expression e and returns true if the result of e is true, and otherwise backtracks to the previous world and returns false. Using these programming features makes writing search algorithms, such as branch-and-bound, much simpler in CLAIRE than with traditional imperative languages such as Java or C++. For instance, here is a simple program fragment that solves the “n queens” problem:
D :: (1 .. SIZE)
column[n:D] : D := unknown
possible[x:D, y:D] : boolean := true
event(column)
store(column, possible)
r1(x:D, y:D) :: rule(exists(z:D, column[z] = y) => possible[x,y] := false)
r2(x:D, y:D) :: rule(exists(z:D, column[z] + z = y + x) => possible[x,y] := false)
r3(x:D, y:D) :: rule(exists(z:D, column[z] - z = y - x) => possible[x,y] := false)
[queens(n:(0 .. SIZE)) : boolean
-> (if (n = 0) true
else exists(p in D | (possible[n,p] & branch( (column[n] := p, queens(n - 1)) )))]
In this program, queens(n) returns true if it is possible to place n queens. The search tree (these embedded choices) is represented by the stack of the recursive calls to the method queens. At each level of the tree, each time a decision is made (an assignment to the column table), a new world is created so that we can backtrack whether this hypothesis (this branch of the tree) leads to a failure.