This project implements a Hindley Milner based type inference system with equi-recursive types.
The language that is dealt with in this project is a simple typed λ -calculus with constants. This can be interpreted as a generalization of Simply typed lambda calculus (STLC), with an addition of let expressions.
The initial type environment contains the following primitives — booleans, integers, pairs, fst, snd, inr, inl, match, unit, unitCase, fix point combinator, conditionals, zero, pred, and times. The following are their corresponding types.
Primitives | Type |
---|---|
true, false |
bool |
1, 2, 3, ... |
int |
pair |
|
fst |
|
snd |
|
inr |
|
inl |
|
match |
|
unit |
unit |
unitCase |
|
fix |
|
cond |
|
pred |
|
zero |
|
times |
Among the above primitives, true, fals
, 1, 2, 3, ...
and unit
are basic types.
pair
, fst
and snd
are provided for constructing and decomposing Product Type.
inl
, inr
and match
are provided for constructing and decomposing Sum Type.
unit
, and unitCase
are provided for constructing and decomposing Unit Type.
With sum type, product type and unit type, we can construct equirecursive types such as list
: , and binary tree
: .
The fix point combinator fix
is provided for constructing recursive expressions.
Additional primitives can be introduced for the purpose of constructing more complex expressions. For example, pred
, zero
, and times
are introduced for constructing a factorial let binding.
Below are the inference rules for the type system.
Name | Rule |
---|---|
Identifier | |
Abstraction | |
Application | |
Let Binding | |
Generalization | |
Specialization |
Unification in the process of inference is to merge two different types or to instantiate a type variable. This is the core of the type inference. This project implements the unification algorithm by treating every type as a node in a forest and employing a union-find data structure, which is highly efficient. Pseudo code:
boolean unify ( Node m, Node n ) {
s = find ( m );
t = find ( n );
if ( s = t )
return true;
else if ( nodes s and t represent the same basic type )
return true ;
else if ( s and t are the same op-node ) {
union ( s, t );
return unify ( s.left, t.left ) and unify ( s.right and t.right );
}
else if ( s or t represents a variable) {
union ( s, t );
return true ;
}
else
return false ;
}
Note that the pseudo code does not consider the direction of the union. However, the direction is important when implementing the union.
Specifically, we have two cases where the direction is enforced.
- If one of the representative members for the equivalence classes of
s
andt
is a non-variable node (e.g. op-node and basic types), union makes that nonvariable node be the representative for the merged equivalence class. - In unifying a non-generic variable to a term, all the variables contained in that term should become non-generic.
Other than that, union assumes to merge the node provided in the first argument to the node provided in the second argument.
The following are examples of the successfully type inference and some internal failures (due to the nature of Hindley-Milner type system).
(let factorial = (fix (fn self => (fn n => (((cond (zero n)) 1) ((times n) (self (pred n))))))) in (factorial 5)) : int
(let f = (fn x => x) in ((pair (f 4)) (f true))) : (int * bool)
(let g = (fn f => 5) in (g g)) : int
(fn g => (let f = (fn x => g) in ((pair (f 3)) (f true)))) : (j -> (j * j))
(fn f => (fn g => (fn arg => (g (f arg))))) : ((n -> o) -> ((o -> p) -> (n -> p)))
(fn x => ((pair (x 3)) (x true))) : Type mismatch: bool != int
((pair (f 4)) (f true)) : Undefined symbol f
(inl unit) : (unit + u)
This project is based on the Python code by Robert Smallshire, the C++ code by Jared Hoberock, and the paper "Basic Polymorphic Typechecking" by Cardelli.