Sunday, February 13, 2011

Stack Effects

In concatenative languages like Forth every function (called a word) operates on an implicit stack. This means that instead of specifying the type signature, Int -> Int -> Int for a function like +, we specify the stack effect, which in forth would be a comment of the form ( n m -- n+m ). Stack effects are necessary as a word can have more then one result, for example the word dup has the stack comment ( n -- n n ) duplicating the top of the stack.
In RGEP the postfix notation allows us to use words from Forth as operators, where in previous GEPs operators had to be simple functions. This means that they need to be aware of their stack effect and not just their arity (the number of arguments they take). Since RGEP operators must preserve the closure property, the stack effect is essentially just the number of items required on the stack to execute the word and the number of items left after execution. To hold this data as well as the function for the operator I am using this type:
data OP a = OP { eats::Int, leaves::Int, applyOp::[a] -> [a], name::String }
So an OP, an operator, eats an integer number of argument from the stack and leaves an integer number of arguments as well. The operator itself is the function [a] -> [a] acting on the stack. I don't know how to embed the stack effect in the type system, but I would not be surprised if it was doable (I think I've read about ways of doing this) but unfortunately that would not be enough- I really do need to know how arguments the operators take and leave for other reasons in my evolutionary algorithm library.
The name is what you might expect- its the name of the operator. I think it is reasonable that all operators be named, and it makes printing them must nicer if they know who they are.

The nicest thing about arbitrary stack effects is that I can include functions like dup, drop, over, tuck, nip, rot, and drop in any problem, as they are polymorphic in their type. I hope that this will prove an easy way of promoting the reuse of substructures in RGEP. Other methods of promoting substructure reuse are Ferriera's version of ADFs (automatically defined functions) and Xin Li's work on subexpression libraries. I really like Xin Li's work and I think that the subexpression library approach is probably a good thing, but in the spirit of RGEP I want to propose a much simpler way to reuse subexpressions. Using the postfix operators, especially dup, over, and tuck reorganizes the expression tree such that certain subexpressions will appear many times. This has the disadvantage that the subexpressions may have only one actual copy in the individual, and so are easy to disrupt, but on the other hand it is trivial to add this method to RGEP (In Heal, my library, you just have to add them to the operator list) and if they promote subexpressions at all then it is worth trying them out. This is much simpler than keeping track of common subexpressions and saving them in a library, adding additional terminals for each expression, and expanding and contracting those terminals.

No comments:

Post a Comment