Monday, December 3, 2012

Robust Gene Expression Programming + Let Expressions

I've been thinking about (and implementing) Robust Gene Expression Programming recently. I've posted about this technique before, but as a refresher it can be thought of as either a Genetic Algorithm that evolves expressions in a given (unityped) language or as a postfix form of linear Genetic Programming (with the intent that the language is purely functional rather than imperative as linear GPs sometimes are). One of the things I noticed while developing this algorithm was that there is no reason one couldn't include stack manipulation operators into the expression language, regardless of the language one is interested in, and more importantly these operators add power to the algorithm essentially for free. Recently I've realized that "let" expressions could be similarly added, and that they may improve the algorithms ability to find useful programs in a nicely clean way.

Adding stack operators and "let" as an extension to a language is motivated by the observation that useful programs consist of useful subexpressions. This means that program with good subexpressions should propagate throughout generations allowing them to get mixed into individuals and so on. There are several schemes in the Gene Expression Programming world for helping useful subexpressions get reused. I tend to like simple schemes when dealing with these algorithms, and I think it is pretty cool that a stack operator like "dup" can be added to, say, a language for arithmetic expressions to duplicate it in the resulting expression tree, allowing it to be reused. Something like "tuck" could perform a similar function of duplicating subexpressions, and for allowing the expression tree to be essentially edited to move subtrees around and combine them in different ways.

Now for the point of this post- in addition to adding stack operators I think one could easily add "let" expressions to any expression language. Again we get them essentially for free in the sense that we can add them without changing the type of the expression language (they can be expanded out before the expression is evaluated for fitness). The scheme I have in mind involves adding variables to the terminals of the language and, for each variable added, a "let" operator for that variable taking two expressions and replacing the first expression for all occurrences of the variable in the second term. Variables that occur free in an term (not bound by a surrounding "let") are ignored during evaluation so they do not disrupt the usual evaluation (which is set up to handle problematic cases like this and still result in a valid expression).

Note that I've been calling these "let" expressions rather then lambda because they never form actual functions but must rather be immediately applied to a value to be included in the expression tree. One could do otherwise by designing an expression language with lambda terms, but as an extension to a language we must enforce the rule that each expression has the same type. Since there is only one type, and the "let" terms are immediately evaluated, these are more like "let" then lambda.

The main problem with this idea that I see is the somewhat ad-hoc way we have to decide beforehand how many variables we want to have. I don't have a big problem with this, and I think for many problems a small number would be sufficient to add some value, but it does feel a bit odd. On the other hand, this strategy is much simpler than some other ideas that have been looked at, and sometimes one exchanges simplicity for perfection.

It will probably be a while before I get around to implementing this to see if it actual helps at all, but I like that RGEP seems to have some simple, optional, and lightweight ways to extend itself without changing the core algorithm. Now if I could only figure out a clean way to allow more then one type...

2 comments:

  1. I just wanted you to know that I read this entire post. I didn't understand much of it but my brain would like to think that it did, although if you could see the visual butchery my mind has applied to your explanation...well, let's say I'd rather you didn't. Imagine the best while understanding the worst. To give you credit where it is due however, I did (I think) sort of understand what the term "Robust Gene Expression Programming."

    ReplyDelete
  2. Aww, thanks Jenevieve! I do wonder what you must have been thinking, though...

    ReplyDelete