I am reading "Fun with Type Functions" by Simon Peyton Jones, Oleg Kiselyov and Chung-chieh Shan. It is an amazing paper: pleasant to read, interesting, and on a very cool topic. One thing they mention in passing is memoization in referentially transparent languages, which is very nice because you can be sure that no side effects could result, and so you can safely memoize results. I was so stunned by how awesome this technique is that I literally sat there in awe for several minutes before reading on. This is an area of research I've been meaning to look at because it gets mentioned now and again, and now I have to get serious about looking this stuff up.

The idea of memoization is to turn a function into a data structure. You can often save time by saving the results of a function for future calls with the same input, so you will not have to recompute them. The way I was taught to do this, which was in Java since that is my university's teaching language, was to create a HashMap and, each time the function is called, check whether the parameters are in the map. If so, return the stored result; if not, compute the result, store it in the map, and return it. This is all well and good for simple little ACM programming competition problems (which is how I've used it), but in general it is kind of a design problem in Java, as we have no higher-order functions. Python has its decorator syntax, which allows you to wrap a function in a memoizing decorator and makes the memoization transparent to the user. This is pretty cool, but what if functions can't keep some state that they modify when called?

In Haskell, the situation is pretty amazing. Instead of creating the structure piece by piece as the function is called, you create the whole thing all at once, sort of. The authors define two functions: one that turns a function into a table storing its input/output pairs, and one that turns a table storing those pairs back into a function. This means that if we go from function to table to function, we get a function with the same type as the original, but one that memoizes its results. With laziness, the table can even be infinitely large, holding all possible input/output pairs for the function, and it will still take a finite amount of memory and fill itself in as it's needed.
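To make the function-to-table-to-function round trip concrete, here is a minimal sketch. It is not the paper's type-family version: it is specialized to non-negative `Int` arguments and uses a lazy infinite list as the table, and the names `Table`, `toTable`, `fromTable` and `memo` are my own for illustration.

```haskell
-- A lazily built table of results, one entry per natural number 0, 1, 2, ...
-- (Assumption: a plain list stands in for the paper's table types.)
newtype Table a = Table [a]

-- Turn a function into a table of all its results. Thanks to laziness,
-- no entry is computed until somebody looks it up.
toTable :: (Int -> a) -> Table a
toTable f = Table (map f [0 ..])

-- Turn a table back into a function by indexing into it.
-- Negative arguments are not supported: (!!) raises an error for them.
fromTable :: Table a -> (Int -> a)
fromTable (Table xs) n = xs !! n

-- Function -> table -> function: same type as the input, but results are
-- cached in the table once they have been forced.
memo :: (Int -> a) -> (Int -> a)
memo = fromTable . toTable

-- The recursive calls go through the memoized 'fib', so each Fibonacci
-- number is computed at most once.
fib :: Int -> Integer
fib = memo fib'
  where
    fib' 0 = 0
    fib' 1 = 1
    fib' n = fib (n - 1) + fib (n - 2)
```

Loading this in GHCi and evaluating `fib 50` returns immediately, whereas the naive doubly recursive definition would take a very long time. List indexing makes each lookup linear in the argument, so a real implementation would want a tree-shaped table instead.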

This is (to me) just stunning, and I don't think I have done the topic justice in this explanation. I may post about it more as I read about it, but for now, anyone interested should read the paper. The actual topic of the paper is type families in Haskell, which are also a very cool feature.

## Monday, December 20, 2010

## Sunday, December 19, 2010

### Sudoku in Haskell

I'm currently writing a Sudoku solver in Haskell for fun. The board is a list (rows) of lists (squares) of lists (possible numbers for that square), and solving proceeds by eliminating possibilities as much as possible, finding the square with the fewest possibilities, and trying each one as if it were the only possibility for that square. I implemented this in Python a while ago, so I wanted to try it out in my new favorite language.

I've noticed that I get a lot of benefit out of Haskell for this type of program. The List monad is exactly what I need for nondeterministic selection of possibilities for a square. I used the derivative of the list type (aka a zipper) to move around in the board. I don't have to make a copy of the board before changing it (as there are no side effects). And laziness allows the first solution to be the one that is used without calculating the others (which will fail, as there should only be one solution), so I don't need to add some way to exit the recursive solver when a solution is found. I might look into ways to do this kind of exit, such as with a continuation, but for now I don't even have to try at all. Haskell is nice enough to do this for me, in a way.

It's not done yet, so back to work!
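Here is a small sketch of the List monad and laziness working together. It is not my actual solver: it is cut down to a single row with an "all values distinct" rule, and the names `Square`, `assignRow` and `firstSolution` are made up for this example.

```haskell
import Control.Monad (guard)
import Data.Maybe (listToMaybe)

-- Each square carries its remaining candidate values.
type Square = [Int]

-- Choose one candidate per square so that no value repeats in the row.
-- The list monad tries every candidate in turn; 'guard' abandons a branch
-- as soon as the chosen value clashes with an earlier choice.
assignRow :: [Square] -> [[Int]]
assignRow = go []
  where
    go chosen [] = return (reverse chosen)
    go chosen (candidates : rest) = do
      v <- candidates
      guard (v `notElem` chosen)
      go (v : chosen) rest

-- Laziness means only as much of the search runs as is needed to produce
-- the first element of the result list; no explicit early exit is required.
firstSolution :: [Square] -> Maybe [Int]
firstSolution = listToMaybe . assignRow
```

For example, `firstSolution [[1,2],[1],[2,3]]` backtracks out of the choice `1` for the first square and settles on `Just [2,1,3]`, while a row with no valid assignment such as `[[1],[1]]` gives `Nothing`.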

## Monday, December 13, 2010

### A Case for Imperative Reasoning

I've just finished watching a talk (linked below) about operational semantics and the difficulty of specifying certain semantics in a purely declarative way. I really enjoyed the speaker, and I feel like I have a more complex view of the difference between imperative and declarative styles now. As someone who loves functional programming, it was interesting to see a reasonable case made about the challenges it faces in certain areas of specification. You have to watch a little bit into the talk to get to the good parts, btw, as his opening statements about functional programming are not the interesting part.

The basic idea was that there are operational parts of an algorithm that may need to be specified for an implementation to be considered an instance of the algorithm (I'm taking a particular view of the notion of an algorithm, and this is discussed briefly in the talk), such as the requirement that a sort be in-place. I have no idea how we would state this directly in a functional language, except perhaps to build a functional description of an imperative process and somehow ensure that it is carried out in a way that preserves the intended semantics.

The real reason I posted this is not just that it is interesting but that it is a rare thing to see an argument for imperative programming. I like this kind of thing because it adds a level of complexity and sophistication to the regular sort of arguments for or against one language/paradigm or another. I wonder, though: he mentions that he uses intuitionistic logic in his current work, and as we all know the typed lambda calculus, and by extension Haskell, corresponds to an intuitionistic logic (through the Curry-Howard isomorphism), with programs as proofs of theorems in this logic. How this is related (if it indeed is related at all) I have no idea.

The link:

http://channel9.msdn.com/Blogs/Charles/C9-Conversations-Yuri-Gurevich-Abstraction-Algorithms-and-Logic

## Sunday, December 12, 2010

### Updates on Haskell Evolutionary Algorithms Library

My Haskell implementation of RGEP (Robust Gene Expression Programming) is going really well! It's called Heal, the Haskell Evolutionary Algorithm Library, and it's the latest iteration in my quest for a library to do my research with, one where I can change anything I want at any time to test new algorithms, genetic operators, etc.

I've decided to change everything to vectors, hoping to get some performance improvements. While I don't have any direct comparisons, I just ran a 100-individual population with individuals having 128 codons, each codon of length 3. That is a largish dataset, and if I remember correctly my Java implementation took about 24 hours to do similar work (note that that was a pretty poor implementation on my part). My new library is about a third the size, supports PGEP, RGEP, and GAs, with an easy way to add more, and ran in 6 minutes (giving me a pretty poor answer to the symbolic regression problem I'm testing with, but that's not Haskell's fault). Wow! I also had the profiler running, and a full 87% of the time was spent garbage collecting, leaving only about 40 seconds to run my code. It allocated a total of 21 gigabytes during the entire run.

Now, I obviously must do something about this because that is an absurd amount of time in GC. The problem is I don't really know how to do deeper profiling in Haskell, and I'm having trouble with the most direct approach (running the executable with +RTS -p). So far the "shotgun" approach of adding strictness annotations is not working. Since every individual needs to be evaluated all the way every generation, laziness is only helping for some stuff, and I don't really need it for the main population, so I don't think throwing in bangs will hurt, but so far it hasn't helped at all. In fact, with only 8 Megabytes for the stack, the program stack overflows, so I gave it 15 and it runs fine.
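As a note to self on what strictness annotations actually buy, here is a toy sketch, not code from Heal; the function names are made up. A lazy accumulator can build a long chain of unevaluated thunks that is only forced at the very end, which is one common cause of stack overflows, while a bang pattern or `foldl'` forces the accumulator at every step. Also worth remembering: `+RTS -p` generally only produces a profile if the program was compiled with profiling enabled (`-prof`, plus `-fprof-auto` in recent GHC versions), `+RTS -s` prints a GC summary without a profiling build, and `+RTS -K15m` is the flag that raises the stack limit.

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.List (foldl')

-- Lazy left fold: without optimization this can build one thunk per element
-- before any addition is performed.
sumLazy :: [Double] -> Double
sumLazy = foldl (+) 0

-- Strict left fold: the accumulator is forced at every step.
sumStrict :: [Double] -> Double
sumStrict = foldl' (+) 0

-- The same idea with explicit bang patterns on two accumulators
-- (running sum and element count).
mean :: [Double] -> Double
mean = go 0 0
  where
    go :: Double -> Int -> [Double] -> Double
    go !s !n [] = if n == 0 then 0 else s / fromIntegral n
    go !s !n (x : xs) = go (s + x) (n + 1) xs
```

Bangs like these only help where thunks really are piling up in an accumulator, which may be why sprinkling them at random has not changed anything for me yet.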

In addition to being much faster, this library is much nicer looking (though I need to clean it up pretty badly), and I have delighted in using the various features of Haskell. I'm not even talking about the advanced features; the normal everyday ones are amazing.

On to profiling! I really want to see where all the time is being taken up. I feel like there is a lot of opportunity for improvement in time, space, and in just general style and organization.

## Saturday, December 11, 2010

### From #Haskell

I found these in the quotes of the week section of the Haskell news site:

data Neither a b = Left | Right

But in another sense, functional programmers are applied logicians who spend all their time proving trivial theorems in interesting ways in an inconsistent intuitionist logic.

unsafeCoerce is just a generalization of id

It's also the same thing as the Yoneda lemma. That's the thing about maths. Everything is actually the same.

programming languages can be divided into two broad classes: functional and dysfunctional

Schroedinger's cat is really in a thunk not a box

it's neat how you learn haskell because you are drawn in by the purely functional paradigm, and then you find loads more things like algebraic data types, monad abstractions, arrows and applicative, lack of objects... so that when people say "well, it's not haskell, but at least X is functional", it's just not the same at all

peyton `simon` jones

Here's the short guide to Haskell for OO programmers: Haskell isn't at all an OO language.

I'm starting to believe that learning haskell is mostly about carefully crafting small and clever functions and then finding out that they are already part of the standard library.

the first rule of fix club is "the first rule of fix club is "the first rule of fix club is...

Since NaN /= NaN, I think, we should decipher 'NaN' as 'Not a NaN'

Haskell is an even 'redder' pill than Lisp or Scheme

Haskell really is amazing. Laziness, type classes, monads, functors (and other category theoretic goodness), applicative functors, algebraic data types, pattern matching, currying, and a huge number of other things.

## Thursday, December 9, 2010

### Species and Functors and Types, Oh My!

I very much enjoyed the paper "Species and Functors and Types, Oh My!", which has a good talk by the author at:

http://vimeo.com/16753644

There are other good papers, including "Species: making analytic functors practical for functional programming" and "AN INVITATION TO COMBINATORIAL SPECIES."

Awesome subject, great papers, good talk, good times.

## Wednesday, December 8, 2010

### Proof of Correctness for RGEP's Editing Mechanism

RGEP's editing stage consists of two parts: removing the non-coding region, if one exists, and removing the extra operators, if any exist. I have convinced myself that the mechanism works, but I have not shown this rigorously until now.

To prove that RGEP's editing mechanism will always produce valid individuals, consider each stage in turn. The first stage removes the "non-coding" region. It is well known that expressions in Polish notation can be parenthesized without ambiguity. If we add parentheses and find symbols to the right of the right-most closing parenthesis, then removing those symbols corresponds to removing the non-coding region. Note that we can account for expressions with too many operators by closing their parentheses at the end of the expression.

In the second stage we can assume that all expressions are either complete or over-specified, as expressions with extra symbols at the end will have been removed in the first stage. The correctness of this stage is proven by cases:

Case 1: The expression is a valid prefix expression. In this case the postfix traversal of the reverse of the expression will be exactly the same as the prefix traversal of the expression, by definition.

Case 2: The individual has too many operators to be considered a valid prefix expression. In this case the postfix evaluation of the reverse of the individual will encounter either a list of operators, which would all cause a stack underflow and will therefore be ignored, leaving an empty stack as the result, or an individual with one or more terminals.

Operators before the first terminal will cause a stack underflow and will be ignored. Operators after the terminal may result in a change in the stack, but cannot empty the stack. At the end of the evaluation the top of the stack will be considered the result of the evaluation. Note that there must be only one thing on the stack in this case- if there were terminals that would end up on the stack then they would have been removed in the first editing stage.
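The two stages can be sketched in a few lines. This is an illustration under simplifying assumptions, not the code from my library: the only operator is the binary `+`, the only terminal is `1`, individuals are plain strings, and the names `dropNonCoding`, `evalReversed` and `edit` are invented here.

```haskell
import Data.Maybe (listToMaybe)

-- Expression trees over the binary operator '+' and the terminal '1'.
data Expr = One | Plus Expr Expr
  deriving (Eq, Show)

-- Stage 1: keep symbols up to the end of the first complete prefix
-- expression and drop the rest (the non-coding region). 'open' counts the
-- argument slots still waiting to be filled; an over-specified individual
-- never reaches zero and is returned unchanged.
dropNonCoding :: String -> String
dropNonCoding = go (1 :: Int)
  where
    go 0 _ = []
    go _ [] = []
    go open (s : rest)
      | s == '+' = s : go (open + 1) rest
      | otherwise = s : go (open - 1) rest

-- Stage 2: postfix evaluation of the reversed individual. An operator that
-- finds fewer than two items on the stack would underflow, so it is ignored.
-- The top of the final stack is the result; an empty stack gives Nothing.
evalReversed :: String -> Maybe Expr
evalReversed = listToMaybe . foldl step [] . reverse
  where
    step (l : r : stack) '+' = Plus l r : stack
    step stack '+' = stack -- stack underflow: ignore this operator
    step stack _ = One : stack

-- Both editing stages in sequence.
edit :: String -> Maybe Expr
edit = evalReversed . dropNonCoding
```

With these definitions, `edit "+11+1"` drops the trailing `+1` and yields `Just (Plus One One)`, the over-specified `"++1+11"` loses its extra leading operator and yields `Just (Plus One (Plus One One))`, and an individual made only of operators such as `"+++"` yields `Nothing`.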

We can also prove that RGEP's prefix-with-editing can encode all individuals that PGEP's prefix notation can, and that RGEP is a strict superset of Karva-with-head-tail-distinction. First, both PGEP and RGEP use prefix notation, so any individual in PGEP, which is by definition valid, is therefore valid in RGEP. Assuming the correctness of RGEP's editing mechanism, RGEP has some redundancy in its encoding: some expressions will be reduced to valid prefix expressions, which must be smaller than before editing (remember that all editing in RGEP deletes symbols), and must therefore be expressible as a prefix notation individual by padding them with symbols to meet the fixed length requirements.

To show that RGEP is a strict superset of Karva, we first show that all Karva individuals have a prefix equivalent. A Karva individual encodes a tree, and as prefix notation can express any tree and requires no more symbols than Karva, we must be able to reorder the symbols in a valid Karva expression to get the same tree as a prefix expression. To show that RGEP > Karva-with-head-tail, we need only show a single case of a prefix expression that is valid in RGEP and not expressible in Karva notation. For individuals of length 7, with only the + operator and the terminal 1, the following individual is valid in RGEP: +1+1+11. The corresponding tree is too right-unbalanced to express in Karva notation (by inspection) if we restrict the operators to only the first three places, as we must in this situation. Therefore RGEP > Karva in the sense that it can express more trees for symbol lists of fixed length. I'm sure this can be generalized to arbitrary lengths by constructing similar right-unbalanced trees, but the point is still valid: such a tree exists in at least one situation.
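The "by inspection" step can also be checked by brute force. The sketch below makes the following assumptions: Karva individuals are decoded breadth-first (level by level), the head is the first three symbols, the tail is four terminals, and only tree shape is compared. All names are invented for this check.

```haskell
-- Shape of an expression tree over the binary operator '+' and terminal '1'.
data Tree = Leaf | Node Tree Tree
  deriving (Eq, Show)

-- Read one complete prefix expression; returns the tree and unused symbols.
parsePrefix :: String -> Maybe (Tree, String)
parsePrefix ('1' : rest) = Just (Leaf, rest)
parsePrefix ('+' : rest) = do
  (l, rest1) <- parsePrefix rest
  (r, rest2) <- parsePrefix rest1
  Just (Node l r, rest2)
parsePrefix _ = Nothing

arity :: Char -> Int
arity '+' = 2
arity _ = 0

-- Split a Karva string into breadth-first levels: the root, then as many
-- symbols as the previous level has argument slots, and so on.
levels :: String -> [String]
levels = go 1
  where
    go 0 _ = []
    go n syms =
      let (level, rest) = splitAt n syms
       in level : go (sum (map arity level)) rest

-- Attach the trees of the level below to the symbols of this level.
buildLevel :: String -> [Tree] -> [Tree]
buildLevel [] _ = []
buildLevel ('+' : syms) (l : r : below) = Node l r : buildLevel syms below
buildLevel ('1' : syms) below = Leaf : buildLevel syms below
buildLevel _ _ = error "malformed Karva individual"

decodeKarva :: String -> Tree
decodeKarva s = case foldr buildLevel [] (levels s) of
  (t : _) -> t
  [] -> error "empty individual"

-- Every length-7 individual: any of '+' or '1' in the three head positions,
-- followed by a tail of four terminals.
karvaIndividuals :: [String]
karvaIndividuals = [h ++ "1111" | h <- sequence (replicate 3 "+1")]

-- The right-unbalanced tree encoded by the RGEP individual +1+1+11.
rightChain :: Maybe Tree
rightChain = fst <$> parsePrefix "+1+1+11"

-- Does any Karva individual decode to that tree?
expressibleInKarva :: Bool
expressibleInKarva =
  any (\k -> Just (decodeKarva k) == rightChain) karvaIndividuals
```

Under these assumptions `expressibleInKarva` evaluates to `False`: none of the eight possible heads produces a chain of three operators, because the third operator would have to sit in the tail.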
