Monday, December 20, 2010

Memoizing Functions

I am reading "Fun with Type Functions" by Simon Peyton Jones, Oleg Kiselyov and Chung-chieh Shan. It is an amazing paper: pleasant to read, interesting, and on a very cool topic. One thing they mention in passing is memoization in a referentially transparent language, which is very nice because you can be sure that no side effects could result, so you can always safely memoize results. I was so stunned by how awesome this technique is that I literally sat there in awe for several minutes before reading on. This is an area of research I've been meaning to look at because it gets mentioned now and again, and now I have to get serious about looking this stuff up.
The idea of memoization is to turn a function into a data structure. You can often save time by saving the results of a function for future calls with the same input, so you will not have to recompute them. The way I was taught to do this, in Java since that is my university's teaching language, was to create a HashMap and, each time the function is called, check whether the arguments are already in the map. If so, return the stored result; if not, compute the result, store it in the map, and return it. This is all well and good for simple little ACM programming competition problems (which is how I've used it), but in general it is kind of a design problem in Java, as we have no higher-order functions.

Python has its decorator syntax, which lets you wrap a function in a memoizing decorator, making the memoization transparent to the user. This is pretty cool, but what about languages where a function can't keep some state that it modifies when called?

In Haskell, the situation is pretty amazing. Instead of building the structure piece by piece as the function is called, you create the whole thing all at once, sort of. The paper gives two functions: one that turns a function into a table storing its input/output pairs, and one that turns such a table back into a function. This means that if we go from function to table to function, we get a function with the same type as the original, but one that memoizes its results. With laziness, the table can even be infinitely large, holding all possible input/output pairs for the function, and yet it will take a finite amount of memory and will fill itself out as it is needed. A sketch of this round trip is below.
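Here is a minimal sketch of what that round trip might look like. The class and names here (Memo, Table, toTable, fromTable, memo) are my own reconstruction from memory of the paper's memoization section, so treat this as an illustration rather than the paper's exact code:

    {-# LANGUAGE TypeFamilies #-}

    -- One method per direction: a function becomes a table of its
    -- results, and a table becomes a function that looks results up.
    class Memo a where
      data Table a :: * -> *
      toTable   :: (a -> r) -> Table a r   -- function to table
      fromTable :: Table a r -> (a -> r)   -- table to function

    -- For Bool the table is just a pair: one lazily evaluated slot
    -- for each possible input.
    instance Memo Bool where
      data Table Bool r = TBool r r
      toTable f = TBool (f True) (f False)
      fromTable (TBool t f) b = if b then t else f

    -- For (non-negative) Ints the table is an infinite lazy list:
    -- slot n holds f n, and it is only computed if someone asks.
    instance Memo Int where
      data Table Int r = TInt [r]
      toTable f = TInt (map f [0 ..])
      fromTable (TInt xs) n = xs !! n

    -- Function -> table -> function: same type as the original, but
    -- each result is computed at most once and then shared.
    memo :: Memo a => (a -> r) -> (a -> r)
    memo = fromTable . toTable

For instance, routing the recursive calls of the doubly recursive Fibonacci through the table means each fib n is computed only once rather than exponentially many times (the list-based lookup is still linear per call; a real table would be a trie, as in the paper):

    -- The recursive calls go through 'fib', and hence through the
    -- shared lazy table, so each entry is filled in at most once.
    fib :: Int -> Integer
    fib = memo fib'
      where
        fib' 0 = 0
        fib' 1 = 1
        fib' n = fib (n - 1) + fib (n - 2)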
This is (to me) just stunning, and I don't think I have done the topic justice in this explanation. I may post about it more as I read about it, but for now, anyone interested should read the paper. The actual topic of the paper is type families in Haskell, which are also a very cool feature.
