GADTs in OCAML

Synopsis


This page gives information on the original implementation of GADTs in OCaml, now merged in trunk.

    svn checkout http://caml.inria.fr/svn/ocaml/trunk

Examples are in the subdirectory

  testsuite/tests/typing-gadts

which you can access directly at the following URL

  http://caml.inria.fr/cgi-bin/viewvc.cgi/ocaml/trunk/testsuite/tests/typing-gadts/

Status


Fully implemented but still needs more testing. Please test and find bugs or ask questions. The -principal flag needs the most testing. Works with both camlp4 and camlp5.

About the implementation


The two following documents give some insights on the implementation and its theoretical background.

Important note: the above documents are about the previous version of GADTs. On 2011-11-24, a new version was merged in trunk, with a different approach to ambiguity. In particular the new approach is compatible with objects and polymorphic variants.

Example code


module Exp = struct
  type _ t = 
    | IntLit : int -> int t
    | BoolLit : bool -> bool t
    | Pair : 'a t * 'b t -> ('a * 'b) t
    | App : ('a -> 'b) t * 'a t -> 'b t
    | Abs : ('a -> 'b) -> ('a -> 'b) t 

  let rec eval : type s . s t -> s = function
    | IntLit x -> x
    | BoolLit y -> y
    | Pair (x,y) -> (eval x,eval y)
    | App (f,a) -> (eval f) (eval a)
    | Abs f -> f
end

module List = struct
  type zero
  type _ t = 
    | Nil : zero t
    | Cons : 'a * 'b t -> ('a * 'b) t

  let head (type s)  : (s * _) t -> s  =
    function Cons (a,b) -> a

  let tail (type s) : (_ * s) t -> s t =
    function Cons (a,b) -> b

  let rec length : type a . a t -> int = function
  | Nil -> 0
  | Cons (a,b) -> 1 + length b
end

Intuition


To gain some intuition, we must first understand what I refer to as a newtype. In the expression:

fun (type s) -> e

the type s is a newtype, and is treated like a type constructor in e. The type s can take on other types, but only within a branch. For example, in the expression: 

let eval (type s) (x:s t) : s =
   match x with
   | IntLit x -> x
   | BoolLit x -> x

then the newtype s will have type int in the branch 

IntLit x -> x 

and bool in the branch 

BoolLit x -> x

Constructor level existential types


In the Exp example, we could define a type:

type anyt = Anyt : 'a t -> anyt

let eval_anyt (Anyt x) =
  try
    eval x ; 
    print_endline "evaluation succeeded"
  with _ -> 
    print_endline "evaluation failed"

Syntax extensions


Summary

  • Type parameters can be omitted in type declarations
  • Constructors can be annotated with a general type
  • Syntactic sugar is provided for mixing polymorphic recursion and newtype declarations

Type parameters

The following declaration is now legal:

type (_,_) t = u

and is equivalent to

type ('a,'b) t = u

if 'a and 'b are not in u. For example:

type _ t = int

is the same as

type 'a t = int

Furthermore, unnamed arguments can have a variance marker

type +_ t = Foo : 'a t -> 'a t

Constructors

Constructors can now be given a more general type:

type _ t = 
     | IntLit : int -> int t
     | BoolLit : bool -> bool t

You can also mix and match with normal constructors:

type 'a t = 
     | IntLit : int -> int t
     | BoolLit : bool -> bool t
     | Lit of 'a

Mixing polymorphic recursion and newtypes

Consider the evaluator:

type _ t = 
     | IntLit : int -> int t
     | BoolLit : bool -> bool t

We can write an evaluator as follows:

let eval : 'a . 'a t -> 'a = 
    fun (type s) (x : s t) -> 
          ((match x with
             | IntLit x -> x 
             | BoolLit x -> x) : s)

However, this can now be written in the simpler form:

let eval : type a . a t -> a = fun x ->
  match x with
  | IntLit x -> x 
  | BoolLit x -> x

or, even more simply:

let eval : type a . a t -> a = function
   | IntLit x -> x
   | BoolLit x -> x

You can also quantify multiple arguments:

let eval : type a b. (a t * b t) -> a = function
  | IntLit x, _ -> x
  | BoolLit x, _ -> x

To see why this extension is necessary, consider the following example:

let rec f (type s) (x : s) = f x (* this does not type check *)

Here we try to call f with an argument of new type s. If we do not know that f is polymorphic in its argument, we get a type error.

Exhaustiveness check


Exhaustiveness check is done, to a great extent, by trying many possibilities. For example:

type _ t = 
    | BoolLit : bool -> bool t
    | IntLit : int -> int t

let same_type : type s . s t * s t -> bool = function
  | BoolLit _, BoolLit _ -> true
  | IntLit _, IntLit _ -> true

would pass the exhaustiveness check. However:

type _ t = 
  | BoolLit : bool -> bool t
  | IntLit : int -> int t
  | Impossible : float t -> float t

let same_type : type s . s t * s t -> bool = function
  | BoolLit _, BoolLit _ -> true
  | IntLit _, IntLit _ -> true

would not, even though it is impossible to construct a value with the Impossible constructor.

Sometimes, there isn't enough information to tell two types apart and this can effect exhaustiveness checking. For example:

module M : sig type t type u end = struct
  type t = int
  type u = bool
end

type _ t = 
  | BoolLit : M.u -> M.u t
  | IntLit : M.t -> M.t t

let same_type : type s . s t * s t -> bool = function
  | BoolLit _, BoolLit _ -> true
  | IntLit _, IntLit _ -> true

would fail the exhaustiveness check because the compiler does not know that M.u can never be equal to M.t.

Unused match cases


There are two cases for unused matches:

Duplicate variants

The following unused match case is caught:

type _ t = A : int t

let foo =
  function A -> () | A -> ()

Redundant underscore

Unfortunately, the following unused match case is _not_ caught:

type _ t = A : int t | B : bool t

let foo : int t -> unit = 
  function A -> () | _ -> ()

Variance


Type constructors with GADT data constructors can be variant in all their type parameters which cannot introduce new equations.

For example, the following is correct:     

type (_,+_) foo = Bar : 'a -> (int,'a) foo

but the following isn't:

type (+_,_) foo = Bar : 'a -> (int,'a) foo

Justification

To see why this is the case, consider the following example (taken from Jacques Garrigue):

type -'a t = C : < m : int > -> < m : int > t

type eval : type a . a t -> a = fun (C x) -> x

let a = C (object method m = 5 end)

let b = (a :> < m : int ; n : bool > t)

let c = eval b

Limitations


Polymorphic variants

Patterns containing polymorphic variant patterns are not handled as GADTs. 
This is because propagating type information in patterns is required for GADTs, but might change types inferred for polymorphic variant patterns.

For example, the following code is not allowed:

let rec eval : type s . [`A] * s t -> unit = function
  | `A, IntLit _ -> ()
  | `A, BoolLit _ -> ()

Generation order

Within a pattern, constraints are generated, and can only be used, from left to right in tuples and in label definition order for records. For example:

let check : type s . s t * s -> bool = function 
  | BoolLit true, false -> false
  | IntLit 5, 6 -> false

would type but the function:

let check' : type s . s * s t -> bool = function 
  | true, BoolLit false -> false
  | 5, IntLit 6 -> false

would not. The reason is that patterns are checked in this order during runtime so that check' might compare integers with booleans.

Note: for record patterns, left to right must be understood according to the definition order.

Propagation

OCaml currently does not always propagate the type information. This means that the user will occasionally have to annotate his types.

Currently, this code doesn't compile:

let rec baz : type s . s t -> s =
  fun (type z) ->
  function
    IntLit x -> x
  | BoolLit y -> y

so you need to add the annotation:

let rec baz : type s . s t -> s =
  fun (type z) ->
  ((function
      IntLit x -> x
    | BoolLit y -> y) : s t -> s)

or, equivalently:

let rec baz : type s z . s t -> s =
  function
    IntLit x -> x
  | BoolLit y -> y

Exhaustiveness check

The exhaustiveness check is not complete. However, this is not new to O'Caml. Consider:

type a = {field : 'a. 'a list}

let foo = function {field = []} -> ()

This function is exhaustive, but it is flagged as being non-exhaustive. 

With GADTs, the problem becomes a little more severe. Consider:

type _ u = Int : int u | Bool : bool u
type _ v = VInt : int v

let foo =
  function Int, VInt -> ()

Even though foo is exhaustive it is flagged as being non-exhaustive. As a general rule, when matching on a tuple, the exhaustiveness check is more effective if you keep your wildcard patterns ( _ ) at the right most position in the tuple. 

Contact



rathereasy @@@-@@@ gmail.com
garrigue @@@-@@@ math.nagoya-u.ac.jp