To fulfill this goal I've chosen to keep the binary encode, with the change that sequences of bits that do not encode a symbol are thrown out in the gene expression phase, and the mutation, crossover and notation from PGEP. I've replaced roulette wheel selection with 2-tournament selection. This works by choosing 2 (or more generally k) random individuals from the population and having them compete in a tournament. The tournament consists of choosing the best individual with probability p, the second best with less probability, the third (if there is one) with a smalled probability, etc. This is basically a roulette wheel on a ranked subpopulation. I've changed the selection because there is no convincing reason to use roulette wheel (the original GEP paper chose it with very little justification), tournament selection is apparently popular in GA literature, it removes the dependence that selection in roulette wheel has on the scale of the fitness function, and it can support both minimization and maximization problems. Roulette wheel only supports maximization, and so the fitness function is often changed to turn a minimization to a maximization. This is messy and may have unintended consequences. Note that I'm keeping elitism- it seems to help a lot.
The other change is in the editing mechanism. I don't want to have to check for invalid individuals after each genetic operation, and I consider it a problem with implementation that they have to pass that kind of information to the genetic operations. I would rather keep in the spirit of GPM systems and simply do editing to ensure that individuals encode trees. To do this we can just remove the noncoding region, flip the individual, and do a postfix evaluation where operations that would cause a stack underflow are ignored. There is another way to do this by moving backwards through the list but the postfix interpretation is very nice and concise.
I like this editing mechanism because it doesn't introduce any new material to the individual. In fact, all the editing that is done- bits to symbols, removing the noncoding region, and the postfix traversal- will only remove elements, which I consider a nice consistent approach. The only problem with this is the possibility of removing all symbols and ending up with an empty individuals. I don't like this, but I am willing to say that these individuals just end up with the worst possible fitness.
Thats the main points I'm making here- I also investigate other notations and editing mechanism to explain why they are not satisfying. I would like to also investigate adding some of the features that different GP variations have, like typed programs, recursion, iteration, and lambda terms that allow it to solve problems that are otherwise difficult for GPs. I've read, and I agree, that a huge amount of power is to be had by adding nesting and program reuse in evolved programs. This would be a test of RGEPs simplicity in the sense that if it is simply it may be easy to add to. On the other hand, it may be to simple to easily add to, and I expect a lot of complexity will be added to make these kinds of modification. I doubt I will be able to do more then one, so if I could find a nice way to remove the closure property I would be pretty happy. I expect that many of the variation of GEP that already exist are easy to add to RGEP, like ephemeral constants.
No comments:
Post a Comment