Saturday, February 19, 2011

Decision Tree Induction

To show that GEP can be used to solve many types of problems, its inventor, Candida Ferreira, presents a wide range of tasks: symbolic regression, neural network design, block stacking, the traveling salesman problem, and many others. Some of these require significant modification of the original algorithm, but others can be solved simply by choosing the operators and terminals carefully.

One problem that is fairly simple to set up with GEP is the creation of decision trees. These are exactly what they sound like: tree structures that make decisions. Each internal node corresponds to an attribute, and given a value for that attribute, the node returns the result of the subtree corresponding to that value. This means we can fill in the attribute values and ask the tree for an answer, and one of its leaves will ultimately be selected as the result.
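To make that concrete, here is a minimal sketch of how such a tree might be represented and evaluated. The class names, attribute names, and structure are my own illustration, not code from the thesis.

```python
class Leaf:
    """A leaf holding a final decision value, e.g. 'play' or "don't play"."""
    def __init__(self, value):
        self.value = value

    def decide(self, attributes):
        return self.value


class Node:
    """An internal node that tests one attribute and delegates to the
    subtree matching that attribute's value."""
    def __init__(self, attribute, branches):
        self.attribute = attribute   # e.g. 'outlook'
        self.branches = branches     # dict: attribute value -> subtree

    def decide(self, attributes):
        value = attributes[self.attribute]
        return self.branches[value].decide(attributes)


# Example: a tree that decides based only on the outlook.
tree = Node('outlook', {
    'sunny':    Leaf("don't play"),
    'overcast': Leaf('play'),
    'rain':     Leaf("don't play"),
})
print(tree.decide({'outlook': 'overcast'}))   # -> 'play'
```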

To do this in GEP, or in my case RGEP, we simply turn each attribute into an operator with n children, where n is the number of values the attribute can take, and add one terminal for each value of the decision variable. I'm only doing a simple example case; Ferreira adds some additional complexity when creating decision trees with continuous-valued attributes.
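Here is a rough sketch of that setup, building on the Node and Leaf classes above. The attribute names come from the standard play-tennis example, and the postfix-style decoding is just one simple way to illustrate how operators and terminals could be turned into a tree; it is not the exact genotype-to-tree mapping from GEP or my thesis.

```python
# Each attribute becomes an operator whose arity is its number of values.
operators = {
    'outlook':     ['sunny', 'overcast', 'rain'],   # arity 3
    'temperature': ['hot', 'mild', 'cool'],         # arity 3
    'humidity':    ['high', 'normal'],              # arity 2
    'windy':       ['true', 'false'],               # arity 2
}

# Each value of the decision variable becomes a terminal.
terminals = ["play", "don't play"]


def decode_postfix(symbols):
    """Decode a postfix symbol list into a tree with a stack:
    terminals push leaves, attribute-operators pop one subtree per value."""
    stack = []
    for sym in symbols:
        if sym in terminals:
            stack.append(Leaf(sym))
        else:
            values = operators[sym]
            children = [stack.pop() for _ in values]
            children.reverse()   # restore left-to-right order
            stack.append(Node(sym, dict(zip(values, children))))
    return stack.pop()


# 'humidity' has arity 2: high -> don't play, normal -> play.
tree = decode_postfix(["don't play", 'play', 'humidity'])
print(tree.decide({'humidity': 'normal'}))   # -> 'play'
```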

Since I'm only doing a proof of concept for my thesis, I'm going to try a simple example: deciding whether or not to play tennis based on the weather. I have already written some very basic code to create decision trees. I considered more complex schemes, but since this is a very minor part of my thesis, I ended up adding the ability to create the operators and terminals, as well as a fitness evaluator for decision trees, in 5 lines of code. Pretty simple!
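For a sense of what that evaluator does, here is a sketch where fitness is simply the number of training cases a tree classifies correctly. The tiny dataset is a fragment of the usual play-tennis example, and the scoring scheme is an assumption for illustration, not the exact code from my thesis.

```python
# Each case pairs a set of attribute values with the expected decision.
training_cases = [
    ({'outlook': 'sunny',    'humidity': 'high',   'windy': 'false'}, "don't play"),
    ({'outlook': 'overcast', 'humidity': 'high',   'windy': 'true'},  'play'),
    ({'outlook': 'rain',     'humidity': 'normal', 'windy': 'false'}, 'play'),
    ({'outlook': 'rain',     'humidity': 'high',   'windy': 'true'},  "don't play"),
]


def fitness(tree):
    """Count how many training cases the tree gets right."""
    return sum(1 for attrs, expected in training_cases
               if tree.decide(attrs) == expected)
```

Evolved trees would then compete on this score, with the full play-tennis table standing in for real training data.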
