Ferriera has shown that the dynamics of GEP are different from that of a GA in her book, Gene Expression Programming- Mathematical Modeling by an Artificial Intelligence. The best fitness predictably is pretty consistent (as elitism is in use) but the average fitness skips around quite a lot. It is pretty clear that this is because the operators defined by Ferreira are very disruptive, so that the algorithm explores more then exploits. It can produce huge variations in a single mutation, if the structure of the tree is changed enough. This is fine in a way- elitism ensures we have at least a pretty good solution, and selection will ensure it has a good presence in the population, and most of the operators will search a wide area of the solution space around this individual.
The problem is that there are so many operators, and they are so disruptive that we end up with little more than a random search (at worst). The problems that the technique is applied to in the book are mostly fairly simple, and I wonder if the randomness of the search is just ensuring that good solutions are found because of the smallish size of the solution space. For much larger solution spaces, that randomness may result in a lot of wasted effort (exploring when we should be exploiting what we know is good).
Other operators have simplified GEP since it was created (rotation and point mutation as the only mutations, and single and double cross as the only crosses) and PGEP seems to be better able to keep a subexpressions structure even after small changes to the individual. I'm wondering if the population dynamics have changed when applying this operations and variations of GEP. If I learn anything about this in the remaining part of my capstone I'll definitely post it here.
No comments:
Post a Comment