Sunday, June 13, 2010

Preservation of Substructures

I've convinced myself that one of the most important parts of GEP is the ability to encode knowledge about a problem domain in the form of a tree. While knowledge can be potentially encoded in many ways, and trees may be unable to encode some forms of knowledge (potentially limiting the ability of GEP to solve problems when there is no mapping (or maybe no easy or convenient mapping) between the solution space and a space of expressions encoded in trees), they are very nice for many things and have a nice notation (such as prefix or karva) that allow the whole genotype->phenotype thing that is also one of the most important parts of GEP.

I've been thinking of the different types of things one can do with trees and the ways trees can evaluate to solutions to different combinatorial optimization problems, and I've realized that I would like to be able to improve the ability of GEP to preserve structures (that is to say, subtrees). PGEP does this already, but I'm sure there are some other things we can do, such as saving useful substructures and making them terminal (which has already been done (Direct Evolution of Hierarchical Solutions with Self-Emergent Substructures by Xin Li)), or some other mechanism.

At the moment, I can't think of any new or interesting mechanism to try out, but I think there are a couple of forms that such a mechanism might take:
1. A new structure (like PGEP) that tends towards preserving subtrees in the presence of small disruptions to the linear structure of the gene.
2. A new operator that reorganizes the gene or otherwise tends to help substructures survive. I've been trying to think of a reorganizing operator for a while that would transform trees (especially ones made of commutative operators) into ones whose linear representation has the minimal overall length of substructures. I may have to talk about this more in another post.
Another example is Xin Li's (et all) operator that makes successful subtrees into new terminals.
3. A change to current operators to make them more aware of the structure of the genes. I plan to introduce a rotational mutation in BPGEP that does this sort of thing.

I may have more to say about making subtrees into terminals later, as the increased size of the terminal set may be offset by BGEP, making that a particularly effective strategy in BPGEP.

1 comment:

  1. On the topic of substructure aware (or substructure reenforcing) operators, I suddenly understood the other day what the motivation behind all the RIS, IS and other operators in the original GEP research were about. They hoped to move substructures around. I would rather avoid that kind of complexity and direct awareness of tree structure (rotation is much better in this way).

    ReplyDelete