Tuesday, August 21, 2012

Generality is Correctness

At work I came across a situation where the length (and other properties) of a data structure was hard-coded in many similar functions across a project, where each function acts on a different instance of the structure. At first I was reluctant to implement my own version of the function in this way. Lacking a convenient way to abstract this function I'm okay with reimplementing it in my own code, but it is possible for the length to be calculated at run-time instead of hard-coded.

I reasoned that all the functions be implemented by calculating the length, as it would make them more generic. The function would work correctly if the sizes changed, or if it were using in a different situation then the one originally intended. In this sense the more generic the function the fewer properties it has, and the more likely the features it does have will be correct. Being very generic should make a function easy to understand (there are fewer things to be aware of) but should also restrict its behavior to exclude extra information (the length used in that function is part of its behavior not described by its type) or corner cases. An more extreme example is Parametric Polymorphism, where there are even derivable laws that such functions must satisfy, but it seems like this concept occurs in all levels of design.

In the end we are hard-coding all of these sizes. This is also a form of correctness, as all function do one and only one thing, and can be used in only one way. The main reason for doing it this way is that we can show that throughout the project the sizes are used in a consistent way and that nowhere does the size get misused. This means that we will not see a bug that a size is calculated incorrectly and ends up overwriting memory, getting memory access exceptions, or taking a huge amount of time to process a structure that appears larger than it is. On the other hand, these functions are specific to a single use, and it is difficult to make global changes to them if we needed to. In this particular case that is okay, but it makes maintenance more difficult.

The main thing I've learned here is that the tradeoffs where I work are not the ones I'm used to. We do not build small, reusable components, and there are not many mechanisms for abstraction or specification. Documentation becomes extremely important, and there are often subtle bugs (in my code included) involving memory management.

1 comment:

  1. Yes agreed. I have also seen this trend. In an academic pursuit one can create a scenario where things are modularized and easily maintainable. In the real world, not all solutions are elegant. Good discussion, keep up the good work!

    ReplyDelete