Tuesday, April 5, 2011

Learning the Lambda Calculus, second order types

In a previous post we looked at the simply typed lambda calculus, which corresponds to a propositional logic where we can only form statements with implications such as "p implies q" written as p -> q with the meaning of "a function from terms of type p to terms of type q". If we allow universal quantification over types we get the second order lambda calculus aka the polymorphic lambda calculus aka System F. The name System F is, of course, awesome, but there is a more powerful system with an even more awesome name- The Calculus of Constructions. Later on in this post we will see something in the PLC (polymorphic lambda calculus) that is common in programming languages, but rarely given its true name (type constructors as types with higher order kinds).

System F does not correspond to what you normally hear called second order logic, which may be called second order predicate logic, but rather to (the implication fragment of) second order propositional logic. Interestingly Haskell's concept of a type class gives us predicates, which makes it more like first order predicate logic, which is awesome.

So, we have two tools- implication in the form of lambda abstractions written as ->, and universal quantification in the form of type abstraction. In the simply typed lambda calculus we could specify the identity function as (\x:a.x) read as "for x of type a, x" which has no effect on its arguments. The problem with this is that this function only takes arguments of type "a", so we have to define a different identity function for each type, even though all identity function look exactly the same. Universal quantification to the rescue! Now we can define the identity function once and for all, introducing some new notation, as (\a:*.\x:a.x) which is read as "given a type called a and a term of type a, return that term". I don't have a good way of showing in text the difference between abstraction of types and abstraction of values, but the first argument's type is * which is the type of all types. I kind of like using "\" for both term and type abstractions as they follow mostly the same rules. This makes "*" a kind of second order type, and it is called a kind. We pronounce it "star". To use this term, we must first supply some type, then some term of that type, and will get back our term unchanged like so- (\a:*.\x:a.x)Integer 3 =beta 3. In logic this would look like "forall a, a -> a" meaning that all things imply themselves.

The nice thing about this situation is that we can define functions that are the same for all types. This gives what is called parametric polymorphism, similar to Java's Generics. In Haskell the type system is a series of extensions on the second order lambda calculus, so we don't have some system tacked on to the language but rather a built-in and well understood form of parametric polymorphism.

I want to mention that Haskell's type system isn't quite System F, but rather the Hindley-Milner system, which restricts the locations that the universal quantification can occur in order to allow type inference. This means that the compiler can determine types for you, which is hugely convenient and useful, while imposing a very small restriction to the set of valid programs.

Lets look at one more interesting thing about this system. For the term (\a:*.\x:a.x) we must supply a type. What if we supply the terms own type? Well, its type is written (in my own ascii art interpretation of the actual symbols) as (\a:*.\x:a.x) : (\a:*.a->a). Therefore we have that (\a:*.\x:a.x)(\a:*.\x:a.x) =beta (\b:(\a:*.a->a).\x:b.b) : (\b:*->*.\x:b.b) : (* -> *) -> (* -> *). This is pretty hard to see just looking at the terms, especially as I don't have all the symbols I would like (like both lower and upper case lambdas, product types as the big pi symbol, and the box called "sort" which is the type of all kinds (which brings us to the fourth layer of the type system)). The important thing to see here is that the kind of the argument is not *, but rather (* -> *), and the kind of the resulting term is (* -> *) -> (* -> *). The kind * -> * is something that needs a type and returns a type. This is what I was talking about in the beginning of this post as something we see in programming. The array type in most languages (such as c, c++, Java, Haskell) is not itself a type. There is no such thing as an array, there are only arrays of some type of object. We can have arrays of ints, written int[], or of strings, written String[] (in Java), or of Objects, written Object[]. This means that arrays need a type, which we can call A, and return a type, an array of As. The term we just saw has a more complex kind, and it happens that supplying the identity function its own type give a kind of second order identity- the identity on type constructors.

In Haskell we can have kinds formed with * and ->, which may remind us of something. It is very similar to an untyped lambda calculus, as different occurrences of * may be different types.

I think that enough for now. Having mentioned the Calculus of Constructions, maybe I will post about it later on.

2 comments:

  1. Did you mean it's close to the Typed Lambda Calculus (with one type)? It's not close to the untyped calculus because you can't return arbitrary type functions.

    ReplyDelete
  2. I was thinking that * is like an arbitrary term and -> a beta abstraction, so its like an untyped LC because the * can be anything and not just a correctly typed term. Your right though, its more like a simply typed LC with only one ground type.
    Thanks for reading, btw!

    ReplyDelete