Wednesday, March 23, 2011

Learning the Lambda Calculus, the Y combinator

The Y combinator is many people's favorite combinator. Given some term in the LC, the Y combinator gives us a fixed point of that term (as will any other fixpoint combinator - Y is just a well known one). This is pretty spectacular: given some function, f, we want a term x such that f(x) = x. Some functions have many fixpoints - abs (the absolute value function) has every non-negative number as a fixpoint, since abs(n) = n for all n >= 0 - but the fact that *all* lambda terms have at least one is very surprising. In ordinary math there is certainly no general fixpoint operator (f(x) = x + 1 has no fixpoint at all), but in the untyped LC there is, and Y is one of them. Some tutorials will try to explain Y in a programming language, but I actually think it is clearer as a lambda term. The reason I want to go over this a little is that it gives us some of the original motivation for type systems as a way of forbidding such terms, which is where I want to end up.

So what does this term look like? Like other combinators it has no free variables, but it is definitely more complex than the terms we have seen so far. The term is (\f.(\x.f(xx))(\x.f(xx))). Notice the two inner terms are the same, and f is the function we are asking for the fixpoint of. I will sometimes write the term out entirely and sometimes just write Y. If this really gives us a fixpoint, that means that if we choose some term, f, and write out Yf (Y applied to f), we should get some result, which I will just call (Yf), such that f(Yf) = Yf - which is exactly the condition we need for (Yf) to be a fixpoint of f.
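As a quick aside (not in the original derivation below), the term can be transcribed almost literally into Python. One caveat: Python evaluates eagerly, so the literal Y diverges; the sketch below uses the eta-expanded variant (often called the Z combinator), which delays the self-application until it is needed. The names Z, fact_step, and fact are my own, for illustration.

```python
# Eta-expanded Y combinator for a strict language:
# Z = \f.(\x.f(\v.xxv))(\x.f(\v.xxv))
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# A non-recursive "step" function whose fixpoint is factorial.
# Its argument `self` stands for "the function being defined".
fact_step = lambda self: lambda n: 1 if n == 0 else n * self(n - 1)

# Taking the fixpoint ties the knot, with no explicit recursion anywhere.
fact = Z(fact_step)
```

Calling fact(5) now gives 120, even though neither Z nor fact_step mentions itself by name.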

I really think the best way to see this is to try it out, so let's dive in. Call the input function g (just to distinguish it from the f in the way I wrote Y): we want to reduce (\f.(\x.f(xx))(\x.f(xx)))g. Well, first off let's do a beta reduction: (\f.(\x.f(xx))(\x.f(xx)))g =beta (\x.g(xx))(\x.g(xx)). That wasn't so hard! Now it looks like another beta reduction is in order: (\x.g(xx))(\x.g(xx)) =beta g((\x.g(xx))(\x.g(xx))). This is an interesting term. Notice that the right side (the thing g is applied to) is (\x.g(xx))(\x.g(xx)), which is exactly what we got after the first beta reduction. In other words, Yg reduces to (\x.g(xx))(\x.g(xx)), so we can substitute Yg back into our current form: g((\x.g(xx))(\x.g(xx))) = g(Yg). Wow - it looks like we have taken Yg and reduced it to g(Yg)! And since beta reduction preserves equality of terms, we have just shown exactly the condition we needed: Yg = g(Yg).
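The equation Yg = g(Yg) can also be observed concretely. A small sketch under one assumption: since Python is strict, I use the eta-expanded variant of Y (so the self-application is delayed); the names Z and g below are mine. Both sides of the fixpoint equation then compute the same function.

```python
# Strict (eta-expanded) variant of the Y combinator.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# The "g" we take the fixpoint of: a factorial step function.
g = lambda self: lambda n: 1 if n == 0 else n * self(n - 1)

fixpoint = Z(g)    # the term Yg
unfolded = g(Z(g)) # the term g(Yg)

# The fixpoint equation Yg = g(Yg) holds observationally:
# both sides agree on every input we try.
assert all(fixpoint(n) == unfolded(n) for n in range(10))
```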

If you have ever repeatedly taken the cosine of some fixed starting number, you found that as you kept applying cosine to the result of the previous application it gets closer and closer to a fixed number (about 0.739), at which point cosine(x) = x and no further change occurs. Similarly, notice that not only does Yg = g(Yg), but g(Yg) = g(g(Yg)) = g(g(g(Yg))), and so on. See the pattern? The fixpoint Yg is g applied an infinite number of times! While with cosine we started somewhere (some fixed starting number), with the Y combinator it is "g"s all the way down.
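The cosine experiment is easy to reproduce; a minimal sketch (starting point 1.0 and the iteration count are arbitrary choices of mine):

```python
import math

# Repeatedly apply cosine to the previous result.
x = 1.0
for _ in range(100):
    x = math.cos(x)

# After enough iterations x stops changing (to machine precision):
# we have landed on the unique fixed point of cosine, where cos(x) == x.
print(x)  # roughly 0.739085
```

The convergence happens from any starting number, which is what makes cosine a friendly first example of a fixpoint.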

With such an amazing little function, why would anyone want to forbid it? It turns out that while it is useful (it encodes the idea of recursion very nicely, and has other nice properties we won't get to), it has one problem: its reduction diverges! Yg reduces to g(Yg), which reduces to g(g(Yg)), and so on forever, so there is no normal form to stop at. This brings us to a fundamental aspect of computing: sometimes computations "blow up" or take forever to complete, and there is no way to determine in general exactly when they will do so. The Y combinator is amazing because it is a great example of how a useful thing for computing (recursion) can go wrong.
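The divergence is easy to see in a strict language. A small sketch (names are mine): transcribing the Y combinator literally into Python, the inner self-application xx is evaluated immediately, so any application of Y loops until the stack overflows.

```python
# The literal Y combinator, with no eta-expansion to delay evaluation:
# Y = \f.(\x.f(xx))(\x.f(xx))
Y = lambda f: (lambda x: f(x(x)))(lambda x: f(x(x)))

diverged = False
try:
    # Even a trivial argument never gets the chance to run:
    # x(x) keeps unfolding before f is ever applied.
    Y(lambda r: r)
except RecursionError:
    diverged = True
```

After this runs, diverged is True: the computation "blew up" exactly as the text describes, cut short only by Python's recursion limit.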

So what if we wanted to disallow any term that could blow up in this way? We would have to give up some useful forms of computation, as there is no system that is both universal (able to compute all computable things) and guaranteed never to blow up. To get rid of the undesirable situations we must also forbid some useful ones. This is where type systems come in, and where we can get into some more meaty subjects.

This post is getting long, so I just want to point out one last thing: the type of the Y combinator is (a -> a) -> a for all a, as given any function of one argument it returns a value (the (least) fixpoint of the function). In logic this would be read as "(a implies a) implies a", which would mean that anything that implies itself is true. Perhaps this doesn't seem so bad - a true thing implies itself and it is true. But a false thing also implies itself (a false thing implying a false thing is true), so this rule would let us conclude every proposition, false ones included. This means that if the LC is to serve as a system of logic we must avoid this term, as it is contradictory. This is the original motivation for type systems: the untyped lambda calculus as a logic leads to paradoxes, and we must restrict it to avoid them. This also shows us a profoundly deep connection between logic and computing: the unbounded computation of the Y combinator is closely tied to the fact that it encodes a contradiction.
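One way to see what a type system salvages: the fully general type (a -> a) -> a is the dangerous one, but the instance where "a" is itself a function type is harmless and is exactly ordinary recursion. A sketch of that restricted fixpoint in Python (the name fix and the type hints are mine; unlike Y, this cheats by using Python's own recursion, which is the point - the typed version is a controlled, built-in facility rather than a free-floating paradoxical term):

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")

# Function-level fixpoint operator. Its type,
# ((A -> B) -> (A -> B)) -> (A -> B),
# is the instance of (a -> a) -> a where "a" is a function type.
def fix(f: Callable[[Callable[[A], B]], Callable[[A], B]]) -> Callable[[A], B]:
    # Eta-expansion delays the recursive call until an argument arrives.
    return lambda x: f(fix(f))(x)

# The same factorial step function as before, now via the typed fixpoint.
fact = fix(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))
```

Here fact(5) is 120, and the definition of fix is nothing but the fixpoint equation fix(f) = f(fix(f)) written down directly.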
