Its Computer Science Time: Structure of GPM Systems

I've been trying to understand the structure of Evolutionary Algorithms in generals, and noticed that they are very complex and very little can be said about them in general. In a previous post I talked about how much variation exists in different EA systems, and that it is hard to say anything at all about them in general. In light of this, I don't plan on making a general EA system, but rather a program for a particular type of EA- a GPM system.

GPM (Genotype-Phenotype Mapping) systems are something that I've only just recently discovered. They are the basis for all linear genetic programs and GEP in particular can be viewed as a specialization of the very general GPM system. This idea introduces exactly the analogy I was planning to make about the biological process of gene expression from base pairs to RNA, editing of the RNA, and then the creation of proteins. In a way it is nice that this is already introduced, as it gives me a vocabulary and justification for my thesis.

GPM systems consist of some linear encoding of things (symbols from some alphabet, integers, bits, real numbers, whatever). The evaluation process is where the difference between GA and GPM appears, although a GA could be considered a GPM where the translation is not performed (GPM generalizes GA in that sense). The process involves grouping base pairs into codons (such as a group of bits in BGEP) which are translated into a RNA strand (a list of operators and terminals in GEP) which can then be edited before the expression process. Editing is justified by the editing done to RNA before protein creation, though as I understand it the process is actually not all that related to the process found in actual gene expression. This editing is performed to make sure that the gene can be expressed as a valid phenotype. In other words it modifies the individual until it can be translated into a program. This stage may involve making sure the genetic information satisfies other constraints as well. Some editing operations may be replacing symbols, deleting symbols, and adding symbols.

After editing the program must be created. This may be very involved- wrapping it in a main function and compiling for example- or very simple- evaluating the prefix expression in PGEP. The generality here is very surprising- in the papers by Wolfgang Banzhaf he evolves a C program, which is quite a feat, considering the complexity of the syntax. Unfortunately it requires a very complex editing stage, which can add all sorts of information to the individuals such that it hardly resembles the original genetic information at all.

Using GPM as a basis, and drawing from GEP, PGEP, and BGEP, I hope to introduce a simple (conceptually and in implementation), effective, unconstrained, general GPM system that I am calling BPGEP (Binary Prefix Gene Expression Programming). With the structure suggested by GPM I will be making this EA system I'm writing in Haskell a GPM system, allowing individuals to by only lists, populations to be lists (at least, for now they will have to be lists), operators to do the usual mutation, crossover, selection, etc, as well as a way of setting up translation between encodings.

Its Computer Science Time

Thursday, October 21, 2010

Structure of GPM Systems

No comments:

Post a Comment