Wednesday, December 8, 2010

Proof of Correctness for RGEP's Editing Mechanism

RGEP's editing stage consists of two parts- removing the non-coding region if one exists, and removing the extra operators, if any exist. I have convinced myself that the mechanism works, but I have not shown this rigorously until now.

To prove that RGEP's editing mechanism will always produce valid individuals, consider each stage in turn. The first stage removes the "non-coding" region. It is well known that expressions in polish notation can be parenthesized without ambiguity. If we add parenthesis and find symbols to the right of the right-most closing parenthesis then removing those symbols corresponds to removing the non-coding region. Note that we can account for expressions with too many operators by closing their parenthesis at the end of the expression.

In the second stage we can assume that all expressions are either complete or over-specified, as expressions with extra symbols at the end will have been removed in the first stage. The correctness of this stage is proven by cases:
Case 1- the expression is a valid prefix expression. In this case the postfix traversal of the reverse of the expression will be exactly the same as the prefix traversal of the expression, by definition.
Case 2- The individual has too many operators to be considered a valid prefix expression. In this case the postfix evaluation of the reverse of the individual will encounter either a list of operators, which would all cause a stack underflow and will therefore be ignored, leaving an empty stack as the result, or an individual with one or more terminal.
Operators before the first terminal will cause a stack underflow and will be ignored. Operators after the terminal may result in a change in the stack, but cannot empty the stack. At the end of the evaluation the top of the stack will be considered the result of the evaluation. Note that there must be only one thing on the stack in this case- if there were terminals that would end up on the stack then they would have been removed in the first editing stage.

We can also prove that RGEP's prefix-with-editing can encode all individuals that PGEP's prefix notation can, and that RGEP is a strict superset of Karva-with-head-tail-distinction. First, both PGEP and RGEP use prefix notation,
so any individual in PGEP, which is by definition valid, is therefore valid in RGEP. Assuming the correctness of RGEP's editing mechanism, RGEP has some redundancy in its encoding- some expressions will be reduced to valid prefix expressions which must be smaller then before editing (remember that all editing in RGEP deletes symbols), and must therefore be expressible as an prefix notation individual by padding it with symbols to meet the fixed length requirements.
To show that RGEP is a strict superset of Karva we first show that all Karva individuals have a prefix equivalent. A Karva individual encodes a tree, and as prefix notation can express any tree and requires no more symbols then Karva, we must be able to reorder the symbols in a valid Karva expression to get the same tree in prefix expression. To show that RGEP > Karva-with-head-tail we must only show a single case of a prefix expression that is valid in RGEP and not expressible in Karva. notation For individuals of length 7, with only the + operator and the terminal 1, the following individual is valid in RGEP- +1+1+11. The corresponding tree is too right unbalanced to express in Karva notation (by inspection) if we restrict the operators to only the first three places, as we must in this situation. Therefore RGEP > Karva in the sense that it can express more trees for symbol lists of fixed length. I'm sure this can be generalized to arbitrary lengths by constructing similar right-unbalanced trees, but the point is still valid- such a tree exists in at least one situation.

2 comments:

  1. Arg! I only consider the case of operators with up to 2 arguments! For the case of operators with arity>2 it is possible that some arguments are left on the stack. I'm okay with that, as I'm considering allowing arbitrary stack effects and just popping the top anyway.

    ReplyDelete
  2. +1+1+11 in karva = +1+1+11 in prefix

    They(postfix/prefix/karva) are equivalent.

    Karva is not that good as described by C.Ferreira, but she is not a fool either.

    ReplyDelete