Collections and Instructions

3.1 Lists, Sets and Tuples

claire provides two easy means of manipulating collections of objects: sets and lists. Lists are ordered, possibly heterogeneous, collections. To create a list, one must use the list(...) instruction: it admits any number of arguments and returns the list of its arguments. Each argument to the list(...) constructor is evaluated.

list(a,b,c,d) list(1,2 + 3) list()

Sets are collections without order and without duplicates. Sets are created similarly with the set(...) constructor :

set(a,b,c) set(1,2 + 3)

A major feature of CLAIRE is the fact that lists or sets may be typed. This means that each bag (set or list) may have a type slot named of, which contains a type to which all members of the list must belong. This type is optional, as is illustrated by the previous examples, where no typing was given for the lists or sets. To designate a type for a new list or a new set, we use a slightly different syntax:

list<thing>(a,b,c,d) list<integer>(1,2 + 3) list<float>()

set<thing>(a,b,c) set<integer>(1, 2 + 3)

Typing a list or a set is a way to ensure that adding new values to them will not violate typing assumptions, which could happen in earlier versions of CLAIRE. Insertion is now always a destructive operation (add(l,x) returns the list l, that has been augmented with the value x at its end).

Since typing is mandatory in order to assume type-safe updates onto a list or a set, if no type is provided, CLAIRE will forbid any future update: the list or the set is then a “read-only” structure. This is the major novelty in CLAIRE 3.2: there is a difference between:

list(a,b,c,d) set(1,2 + 3) list{i | i in (1 ..2)}

which are read-only structures, and

list<thing>(a,b) set<integer>(1,2 + 3) list<integer>{i | i in (1 ..2)}

which are structures that can be updated (modified).

List or set types can be arbitrarily complex, to represent complex list types such as list of lists of integers (cf. Section4). However, it is recommended to use a global constant to represent a complex type that is used as a list type, as follows:

MyList :: list<integer>

set<MyList>(list<integer>(1), list<integer>(2,3))

Constant sets are valid claire types and can be built using the following syntax:

{a,b,c,d} {3, 8}

The expressions inside a constant set expression are not evaluated and should be primitive entities, such as integers or strings, named objects or global constants. Constant sets are constant, which means that inserting a new value is forbidden and will provoke an error.

A set can also be formed by selection. The result can either be a set with {x in a | P(x)}, or a list with list{x in a | P(x)}, when one wants to preserve the order of a and keep the duplicates if a was a list. Similarly, one may decide to create a typed or an un-typed list or set, by adding the additional type information between angular brackets. For instance, here are two samples with and without typing:

{x in class | (thing % x.ancestors) }

list{x in (0 .. 14) | x mod 2 = 0}

set<class>{x in class | (thing % x.ancestors) }

list<integer>{x in (0 .. 14) | x mod 2 = 0}

When does one need to add typing information to a list or a set ? A type is needed when new insertions need to be made, for instance when the list or set is meant to be stored in an object’s slot which is itself typed.

Also, the image of a set via a function can be formed. Here again, the result can either be a set with {f(x)| x in a} or a list with list{f(x) | x in a}, when one wants to preserve the order of a and the duplicates.

{(x ^ 2) | x in (0 .. 10)}

list<integer>{size(x.slots) | x in class}

For example, we have the traditional average_salary method:

average_salary(s:set[man]) : float -> (sum(list{m.sal | m in s}) / size(s))

Last, two usual constructions are offered in claire to check a Boolean expression universally (forall) or existentially (exists). A member of a set that satisfies a condition can be extracted (a non-deterministic choice) using the some construct: some(x in a | f(x)). For instance, we can write:

exists(x in (1 .. 10) | x > 2) ;; returns true

some(x in (1 .. 10) | x > 2) ;; returns 3 in most implementations

exists(x in class | length(x.ancestors) > 10)

The difference between exists and some is that the first always returns a boolean, whereas the second returns one of the objects that satisfy the condition (if there exists one) and unknown otherwise. It is very often used in conjunction with when (cf. next section), as in the following example:

when x := some(x in man | rich?(x)) in

(borrow_from(x,1000), ...)

else printf("There is no one from whom to borrow! ")

Conversely, the Boolean expression forall(x in a | f(x)) returns true if and only if f(x) is true for all members of the set a. The two following examples returns false (because of 1):

forall(x in (1 .. 10) | x > 2)

forall(x in (1 .. n) | exists( y in (1 .. x) | y * y > x))

Definition: A list is an ordered collection of objects that is organized into an extensible array, with an indexed access to its members. A list may contain duplicates, which are multiple occurrence of the same object. A set is a collection of objects without duplicates and without any user-defined order. The existence of a system-dependent order is language-dependent and should not be abused. The concept of bag in CLAIRE is the unifier between lists and sets : a collection of objects with possible duplicates and without order.

A read-only (untyped) list can also be thought as tuples of values. For upward compatibility reasons, the expression tuple(a1,…,an) is equivalent to list(a1,…,an):

tuple(1,2,3), tuple(1,2.0,”this is heterogeneous”)

Since it is a read-only list, a tuple cannot be changed once it is created, neither through addition of a new member (using the method add) or through the exchange of a given member (using the nth= method). CLAIRE offers an associated data type, as explained in Section 4.2. For instance, the following expressions are true:

tuple(1,2,3) % tuple(integer,integer,integer)

tuple(1,2,3) % tuple(0 .. 1, 0 .. 10, 0 .. 100)

tuple(1,2.0,”this is heterogeneous”) % tuple(any,any,any)

Typed tuples are used to return multiple values from a method (cf. Section 4.1). Because a tuple is a bag, it supports membership, iteration and indexed access operations. However, there is yet another data structure in CLAIRE for homogeneous arrays of fixed length, called arrays. Arrays are similar to lists but their size is fixed once they are created and they must be assigned a subtype (a type for the members of the array) that cannot change. Because of these strong constraints, CLAIRE can provide an implementation that is more efficient (memory usage and access time) than the implementation of bags. However, the use of arrays is considered an advanced feature of CLAIRE since everything that is done with an array may also be done with a list. Arrays are described in Section 3.7.

3.2 Blocks

Parentheses can be used to group a sequence of instructions into one. In this case, the returned value is the value of the last instruction.

(x := 3, x := 5)

Parentheses can also be used to explicitly build an expression. In the case of boolean evaluation (for example in an if), any expression is considered as true except false, empty sets and empty lists.

(1 + 2) * 3 if (x = 2 & l)

Local variables can be introduced in a block with the let construct. These variables can be typed, but it is not mandatory (claire will use type inference to provide with a reasonable type). On the other hand, unlike languages such as C++, you always must provide an initialization value when you define a variable. A let instruction contains a sequence of variable definitions and, following the in keyword, a body (another instruction). The scope of the local variable is exactly that body and the value of the let instruction is the value returned by this body.

let x := 1, y := 3 in (z := x + y, y := 0)

Notice that CLAIRE uses := to represent assignment and = to represent equality. The compiler will issue a warning if a statement (x = y) is used where an assignment was probably meant (this is the case when the value of the assignment is not needed, such as in x := 1, y = 3, z := 4).

The value of local variables can be changed with the same syntax as an update to an object: the syntax :op is allowed for all operations op.

x := x + 1, x :+ 1, x :/ 2, x :^ 2

The name of a local variable can be any identifier, including the name of an existing object or variable. In that case, the new variable overrides the older definition within the scope of the let. While this may prove useful in a few cases, it should be used sparingly since it yields to code that is hard to read. A rule of thumb is to avoid mixing the name of variables and the name of properties since it often produces errors that are hard to catch (the property cannot be accessed any more once a variable with the same name is defined). The control structure when is a special form of let, which only evaluates the body if the value of the local variable (unique) is not unknown (otherwise, the returned value is unknown). This is convenient to use slots that are not necessarily defined as in the following example

when f := get(father,x) in printf(“his father is ~S\n”,f)

The default behavior when the value is unknown can be specified using the else keyword. The statement following the else keyword will be evaluated and its value will be returned when the value of the local variable is unknown.

when f := get(father,x) in printf(“his father is ~S\n”,f)

else printf(“his father is not known at the present time \n”)

Local variables can also be introduced as a pattern, that is, a tuple of variables. In that case, the initial value must be a tuple of the right length. For instance, one could write:

let (x,y,z) := tuple(1,2,3) in x + y + z

The tuple of variable is simply introduced as a sequence of variables surrounded by two parentheses. The most common use of this form is to assign the multiple values returned by a function with range tuple, as we shall see in the next section. If we suppose that f is a method that returns a tuple with arity 2, then the two following forms are equivalent:

let (x1,x2) := f() in ...

let l := f(), x1 := l[1], x2 := l[2] in ...

Tuples of variables can also be assigned directly within a block as in the following example

(x1,x2) := tuple(x2,x1)

Although this is mostly used for assigning the result of tuple-valued functions without any useless allocation, it is interesting to note that the previous example will be compiled into a nice value-exchange interaction without any allocation (the compiler is smart enough to determine that the list “ list(x2,x1) ” is not used as such).

The key principle of lexical variables is that they are local to the “ let ” in which they are defined. claire supports another similar type of block, which is called a temporary slot assignment. The idea is to change the value of a slot but only locally, within a given expression. This is done as follows:

let x.r := y in e

changes the value of r(x) to y, executes e and then restore r(x) to its previous value. It is strictly equivalent to

let old_v := x.r in (x.r := y, let result := e in (x.r := old_v, result)

CLAIRE provides automatic type inference for variables that are defined in a let so that explicit typing is not necessary in most of the cases. Here are a few rules to help you decide if you need to add an explicit type to your variable or even cast a special type for the value that is assigned to the variable:

(a) Type inference will provide a type to a Let variable only if they do not have one already.

(b) when you provide a type in let x:t := y, the compiler will check that the value y belong to t and will issue a warning and/or insert a run-time type-check accordingly.

(c) if you want to force the type that is inferred to something smaller than what CLAIRE thinks for y, you must use a cast:

let x := (y as t2) in ...

To summarise,

  • in most cases CLAIRE range inference works, so you write let x := y in ...

· you use let x:t := y to weaken the type inference, mostly because you want to put something of a different type later,

  • you use let x := (y as t) to narrow the type inferred by CLAIRE.


3.3 Conditionals


if statements have the usual syntax (if <test> x else y) with implicit nesting (else if). The <test> expression is evaluated and the instruction x is evaluated if the value is different from false, nil or {} (cf. Section 2.4). Otherwise, the instruction y is evaluated, or the default value false is returned if no else part was provided.

if (x = 1) x := f(x,y)

else if (x > 1) x := g(x,y)

else (x := 3, f(x,y))

if (let y := 3 in x + y > 4 / x) print(x)

If statements must be inside a block, which means that if they are not inside a sequence surrounded by parenthesis, they must be themselves surrounded by parenthesis (thus forming a block).

case is a set-based switch instruction: claire tests the branching sets one after another, executes the instruction associated with the first set that contains the object and exits the case instruction without any further testing. Hence, the default branch is associated with the set any. As for an if, the returned value is nil if no branch of the case is relevant.

case x ({1} x + 1, {2,3} x + 2, any x + 3)

case x (integer (x := 3, print(x)), any error("~I is no good\n",x))

Note that the compiler will not accept a modification of the variable that is not consistent with the branch of the case (such as case x ({1} x := 2)). The expression on which the switching is performed is usually a variable, but can be any expression. However, it should not produce any side effect since it will be evaluated many times.

Starting with CLAIRE 3.3, only Boolean expressions should be used in the <test> expression of a conditional statement. The implicit coercion of any expression into a Boolean is still supported but should not be used any longer. The compiler will issue a warning if a non-Boolean expression is used in an If.

3.4 Loops

claire supports two types of loops: iteration and conditional loops (while and until). Iteration is uniquely performed with the for statement, it can be performed on any collection:

for x in (1 .. 3) a[x] := a[x + 3]

for x in list{x in class | size(x.ancestors) >= 4} printf("~S \n",x)

A collection here is taken in a very general sense, i.e., an object that can be seen as a set through the enumeration method set!. This includes all claire types but is not restricted since this method can be defined on new user classes that inherit from the collection root. For instance, set!(n:integer) returns the subset of (0 .. 29) that is represented by the integer n taken as a bit-vector. To tell CLAIRE that her new class is a collection, the user must define it as a subclass of collection. If x is a collection, then

· for z in x

· (z % x)

are supported. When defining a new subclass of collection, the methods set! and % must be defined for this new class, and it is also advisable to define size and iterate to get compiler speed-ups (if size is not defined, an implicit call to set! is made). Other collection handling methods, such as add, delete, etc may be defined freely if needed.

Notice that it is possible that the expression being evaluated inside the loop modifies the set itself, such as in

for x in {y in S | P(y)} P(x) := false

Because the CLAIRE compiler tries to optimize iteration using lazy evaluation, there is no guarantee about the result of the previous statement. In this case, it is necessary to use an explicit copy as follows:

for x in copy({y in S | P(y)}) P(x) := false

The iteration control structure plays a major role in claire. It is possible to optimize its behavior by telling claire how to iterate a new subclass (C) of collection. This is done through adding a new restriction of the property iterate for this class C, which tells how to apply a given expression to all members of an instance of C. This may avoid the explicit construction of the equivalent set which is performed through the set! method. This optimization aspect is described in Section 4.6.

Conditional loops are also standard (the exit condition is executed before each loop in a while and after each loop in a until),

while (x > 0) x :+ 1

until (x = 12) x :+ 1

while not(i = size(l)) (l[i] := 1, i :+ 1)

The value of a loop is false. However, loops can be exited with the break(x) instruction, in which case the return value is the value of x.

for x in class (if (x % subtype[integer]) break(x))

There is one restriction with the use of break: it cannot be used to escape from a try … catch block. This situation will provoke an error at compile-time.

3.5 Instantiation

Instantiation is the mechanism of creating a new object of a given class; instantiation is done by using the class as a selector and by giving a list of "<slot>=<value>" pairs as arguments.

complex(re = 0.0, im = 1.0)

person(age = 0, father = john)

Recall that the list of instances of a given class is only kept for “instanced” classes (a class is “instanced”, that is maintains its extension, if has been declared as such or if it inherits from the thing class). The creation of a new instance of a class yields to a function call to the method close. Objects with a name are represented by the class thing, hence descendants of thing (classes that inherit from thing) can be given a name with the definition operation ::. These named objects can later be accessed with their name, while objects with no name offer no handle to manipulate them after their creation outside of their block (objects with no name are usually attached to a local variable with a let whenever any other operation other than the creation itself is needed)

paul :: person(age = 10, father = peter)

Notice that the identifier used as the name of an object is a constant that cannot be changed. Thus, it is different from creating a global variable (cf. Section 6.4) that would contain an object as in :

aGoodGuy:person :: person(age = 10, father = peter)

Additionally, there is a simpler way of instantiating parameterized classes by dropping the slot names. All values of the parameter slots must be provided in the exact order that was used to declare the list of parameters. For instance, we could use :

complex(0.0,1.0), stack(integer)

The previously mentioned instantiation form only applies to a parameterized class. It is possible to instantiate a class that is given as a parameter (say, the variable v) using the new method. New(v) creates an instance of the class v and new(v,s) creates a named instance of the class v (assumed to be a subclass of thing) with the name s.

3.6 Exception Handling

Exceptions are a useful feature of software development: they are used to describe an exceptional or wrong behavior of a block. Exception can be raised, to signal this behavior and are caught by exception handlers that surround the code where the exceptional behavior happened. Exceptions are claire objects (a descendent from the class exception) and can contain information in slots. The class exception is an “ephemeral” class, so the list of instances is not kept. In fact, raising an exception e is achieved by creating an instance of the class e. Then, the method close is called: the normal flow of execution is aborted and the control is passed to the previously set dynamic handler. A handler is created with the following instruction.

try <expression> catch <class> <expression>

For instance we could write

try 1 / x catch any (printf("1/~A does not exists",x),0)

A handler "try e catch c f", associated with a class c, will catch all exceptions that may occur during the evaluation of e as long as they belong to c. Otherwise the exception will be passed to the previous dynamic handler (and so on). When a handler "catches" an exception, it evaluates the "f" part and its value is returned. The last exception that was raised can be accessed directly with the exception!() method. Also, as noticed previously, the body of a handler cannot contain a break statement that refers to a loop defined outside the handler.

The most common exceptions are errors and there is a standard way to create an error in claire using the error(s:string,l:listargs) instruction. This instruction creates an error object that will be printed using the string s and the arguments in l, as in a printf statement (cf. Section 6). Here are a few examples.

error("stop here")

error("the value of price(~S) is ~S !",x,price(x))

Another very useful type of exception is contradiction. claire provides a class contradiction and a method contradiction!() for creating new contradictions. This is very commonly used for hypothetical reasoning with forms like (worlds are explained in section 5.4) :

try ( choice(), ; create a new world

... ; performs an update that may cause a contradiction

catch contradiction (backtrack(), ; return to previous world

...

In fact, this is such a common pattern that CLAIRE provides a special instruction, branch(x), which evaluates an expression inside a temporary world and returns a boolean value, while detecting possible contradiction. The statement branch(x) is equivalent to

try ( choice(),

if x true else (backtrack(), false)

catch contradiction (backtrack(), false)

If we want to find a value for the slot x.r among a set x.possible that does not cause a contradiction (through rule propagation) we can simply write :

when y := some(y in x.possible | branch(x.r = y)) in x.r := y

else contradiction!()

3.7 Arrays

An array can be seen as a fixed-size list, with a member type (the slot name is of), which tells the type of all the members of the array. Because of the fixed size, the compiler is able to generate faster code than when using lists, so lists should be used when the collection shrinks and grows, and an array may be used otherwise. This is especially true for arrays of floats, which are handled in a special (and efficient) way by the compiler.

Arrays are simpler than lists, and only a few operations are supported. Therefore, more complex operations such as append often require a cast to list (list!). An array is created explicitly with the make_array property :

let l := make_array(10,float,0.0) in

l[1] := l[3] + l[4]

Note that the of type must be given explicitly (it can be retrieved with member_type(l)), as well as a default value (0.0 in the previous example). An array is printed as [0.0,0.0, …, 0.0], similarly to a list but with surrounding brackets. Operations on arrays are described in the API and include copying, casting a bag into an array (array!), defeasible update on arrays using store, and returning the length of the array with length. An array can also be made from a list using array!, which is necessary to create arrays that contain complex objects (such as arrays of arrays). For instance,

Matrix :: array!(list<float[]>{ make_array(10,float,0.0) | i in (1 .. 10)})

is correct, while the following will not work because the internal one-dimension array will be shared for all columns.

Matrix :: make_array(10,float[],make_array(10,float,0.0))

Since they are collections, arrays can be iterated, thus all iteration structures (image, selection, ...) can be used.


3.8 Map Sets

Dictionaries are now first-class citizens of CLAIRE. A map_set is a set of association pairs (x:y) such that the value of the dictionary for entry x is y. Map sets have been introduced in CLAIRE4 to leverage the Go underlying implementation. They are collections, so they can be iterated.

Map sets are typed, with a domain (to which x belongs) and a range (to which y belongs). The common usage is to create an empty directory using map! :

D :: map!(string,integer)

put(D,”toto”,4)

get(D,”toto”)

CLAIRE also supports the explicit creation of a map set from value pairs:

MD :: map<string,integer>(“toto”:4,”ludicrous”:9)

set!(MD) // returns {4,9}