Wednesday, January 2, 2013

Generating Initial Population for a Genetic Algorithm with Markov Chains

When running a Genetic Algorithm, a lot of time can be lost if the initial population is purely random. How much depends on many factors, such as the encoding of solutions or the distribution of good solutions in the solution space. One common case is when some solutions are not in the feasible region, and generating random solutions results in a population consisting mostly of invalid candidates.

One way to remedy this is to generate individuals encoding feasible solutions to start with. Sometimes this is as easy as walking a grammar while making random decisions about how to expand non-terminal symbols, but sometimes no such structure is available, or the solutions seen in practice follow a more complex distribution than can easily be described this way (some expansions may be much more likely than others, for example). A sketch of the grammar-walking approach is given below.
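To make the grammar-walking idea concrete, here is a minimal sketch in Python. The toy grammar, the symbol names, and the depth limit are all made up for illustration; the only point is that expanding non-terminals with random choices always yields a syntactically valid individual.

```python
import random

# A toy grammar for arithmetic expressions (hypothetical, for illustration).
# Non-terminals map to lists of possible expansions; any symbol not in the
# grammar is treated as a terminal.
GRAMMAR = {
    "Expr": [["Term", "+", "Expr"], ["Term", "-", "Expr"], ["Term"]],
    "Term": [["x"], ["y"], ["0"], ["1"]],
}

def random_sentence(symbol="Expr", max_depth=10):
    """Expand a non-terminal by making random choices until only
    terminal symbols remain."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal symbol, nothing to expand
    if max_depth <= 0:
        # Fall back to the last (non-recursive) production to bound depth.
        production = GRAMMAR[symbol][-1]
    else:
        production = random.choice(GRAMMAR[symbol])
    result = []
    for s in production:
        result.extend(random_sentence(s, max_depth - 1))
    return result

# Every individual generated this way is a valid sentence of the grammar.
population = [" ".join(random_sentence()) for _ in range(100)]
```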

In this situation, one possibility is to train a Markov chain on known solutions, perhaps sampled from a physical process or taken from a prepared training set. The resulting chain would generate solutions similar to those found in practice while still being random.
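Here is a minimal sketch of that idea, assuming solutions are encoded as sequences of symbols. A first-order chain is trained by counting transitions (with start and end markers), then sampled by choosing each next symbol with probability proportional to its count. The training set `known` is hypothetical.

```python
import random
from collections import defaultdict

START, END = "<s>", "</s>"

def train_chain(sequences):
    """Count symbol-to-symbol transitions in the known solutions,
    including transitions from a start marker and to an end marker."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        prev = START
        for sym in list(seq) + [END]:
            counts[prev][sym] += 1
            prev = sym
    return counts

def sample(counts, max_len=50):
    """Walk the chain from the start marker, picking each next symbol
    in proportion to how often it followed the previous one."""
    seq, prev = [], START
    while len(seq) < max_len:
        symbols, weights = zip(*counts[prev].items())
        sym = random.choices(symbols, weights=weights)[0]
        if sym == END:
            break
        seq.append(sym)
        prev = sym
    return seq

# Hypothetical training set of known-feasible solutions.
known = [["a", "b", "a", "c"], ["a", "c", "c"], ["b", "a", "c"]]
chain = train_chain(known)
initial_population = [sample(chain) for _ in range(100)]
```

A first-order chain only captures pairwise structure; if solutions have longer-range dependencies, the same scheme works with a higher-order chain keyed on the last few symbols instead of just the previous one.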

This is not without drawbacks: it assumes we know something about the solution space and that the samples used accurately cover it. It should improve convergence time, but it will not explore as much of the solution space as a randomly generated population would. Of course, these things are always about trade-offs, and this seems like a nice tool to have in one's toolbox.

One last note: it seems like this could also be used to generate new values for a training set given an initial set. This might prevent some over-fitting, assuming we are able to evaluate the new samples accurately.
