Monday, September 6, 2010

Self Emergence of Substructures

There is a paper by the creators of PGEP called "Direct Evolution of Hierarchical Solutions with Self-Emergent Substructures." This paper describes a system in which expression trees can be extracted from an individual (in the elite group) and collapsed into a single terminal, which is added to a library of substructures. Similar techniques appear to exist in other fields, such as static and dynamic super instructions in forth, which are known to improve the efficiency of native code compiler.

This idea has some interesting possibilities, methinks. For one, it allows substructures to combine in new and interesting ways. Also, it may allow greater depth to appear in expression trees, which is a problem in regular GEP in certain cases (a problem where many terminals should appear in the optimal solution would require very long individuals to ensure proper expression tree sizes to include necessary terminals). I would like to investigate this area of GEP research as part of my masters, presumably using BPGEP as the underlying method.

One possibility that I may try is a simpler way to add this technique to GEP then that proposed in the paper. It is not overly complex, but I think maybe a simpler way would be easy to digest for people using GEP, and may be able to keep the primary advantages. My adviser warned me against overly complex methods, where a great deal of computation time is taken just keeping track of things and updating all the moving parts. One of my favorite things about PGEP is its relative simplicity (comparing to GEP), and I don't want to add too much more to it other than the binary encoding.

Another possibility is to allow a great deal of subtree saving and collapsing, and to have relatively small individuals. The idea would be to recreate the conditions of Messy Genetic Algorithms, where small partial solutions are built up to create working solutions. The population would have small individuals, which would become more complex as the number of saved subexpressions grew. I would have to account for uncontrolled growth (as appears in some forms of GP) as the depth would increase with each "level" of saved subexpression. Such a method could use the same biological justification the MGAs do.

No comments:

Post a Comment