Refactoring

I have observed that rewriting (or 'refactoring') code, and generally tidying it up, is a good thing. (I believe that this, rather than pair programming or even test-first, is the biggest win for Extreme Programming.) It is possible to go further: writing, debugging, and testing code, and then leaving it unaltered, is generally a bad thing. Such code seldom possesses quality.

Code that has been rewritten, particularly if it has been rewritten several times, generally does possess quality. It has proof of habitability, because you (or someone else) were able to return to it and clean it up (cf. Alexander). It also proves that you (or someone else) cared enough about its quality to return to it and improve it (cf. Pirsig).

However, code rewriting is actively discouraged in industry because of deadlines and pressure to be 'first to market', and because code that works (however Heath-Robinson it might be) is not perceived to be a problem. At Redmond, code that doesn't work isn't perceived to be much of a problem either -- proof that you can get away with worse than Worse is Better.

This is a mistake.

Code that merely works is less habitable than code which has been repeatedly improved by rewriting. It will result in higher maintenance costs in the long term, and contribute to the 'crisis in software' which is largely manufactured by various practices prevalent in the industry.

Another common mistake is to assume that rewriting can be avoided by careful application of a methodology, particularly the top-down methodologies recommended by various so-called software gurus. These result in less reusable code, in redundant functionality, and in higher costs if you get things wrong. And you inevitably will get things wrong: the best data structure is seldom obvious, and it's virtually guaranteed that, if you think hard enough, you will arrive at a better one than your original choice. When you do, you'll want to rewrite your code (unless, of course, you don't care about its quality), but this will be hampered by its inflexibility and by the bureaucracy associated with having to rewrite design documents.
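To make the point about data structures concrete, here is a minimal sketch in Python. Everything in it (the phone-book data and the names lookup_v1, build_index and lookup_v2) is hypothetical and invented for illustration. The first version stores entries in a list because that was the obvious choice at the time; the rewrite switches to a dictionary once it becomes clear that lookup by name is the common operation.

```python
# First attempt: a list of (name, phone) pairs, searched linearly.
def lookup_v1(entries, name):
    for entry_name, phone in entries:
        if entry_name == name:
            return phone
    return None


# Rewrite: build a dictionary once, then look names up directly.
# Assumes names are unique; with duplicates, dict() keeps the last
# pair whereas lookup_v1 returns the first.
def build_index(entries):
    return dict(entries)


def lookup_v2(index, name):
    return index.get(name)


# Hypothetical sample data.
entries = [("ada", "555-0100"), ("alan", "555-0101")]
index = build_index(entries)

print(lookup_v1(entries, "alan"))
print(lookup_v2(index, "alan"))
```

Both versions give the same answers for this data, but the second is shorter at the point of use and does not slow down as the list grows. The rewrite is cheap here only because the code is small and nothing else depends on the list representation.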

In fact, there is much to be said for bottom-up programming and design, and documenting your code after it's written and stable (though of course you should comment the code while you're writing it).

To find ways of improving your code, you have to be familiar with more than one way of doing most things. That means a familiarity with different styles of programming, different algorithms (Knuth), and different approaches (The Right Thing vs. Worse is Better vs. Good enough is best).

Finally, a specific example mentioned in Quality Without A Name: Minix has been rewritten again and again, and it shows.

Bloat is bad

The more lines of code you have to maintain, the higher the maintenance cost -- the number of bugs per line of code is not far off constant. This means that if you can cut down the overall size of the code, it will reduce maintenance costs, and either free programmers to develop new products in an expanding economy, or make maintenance possible at all in a declining one if redundancies ever become necessary. Both help to improve the long-term survival of your company.
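The arithmetic behind this can be sketched in a few lines of Python. The defect density and the code sizes below are hypothetical figures chosen only to make the sums easy, not measurements, and expected_defects is a name invented for this example.

```python
def expected_defects(lines_of_code, defects_per_kloc):
    """Estimate defects, assuming a constant defect density per 1000 lines."""
    return lines_of_code * defects_per_kloc / 1000


# Hypothetical figures, for illustration only.
DEFECTS_PER_KLOC = 10

before = expected_defects(50_000, DEFECTS_PER_KLOC)  # before trimming the code
after = expected_defects(35_000, DEFECTS_PER_KLOC)   # after removing 15,000 lines

print(before, after, before - after)  # 500.0 350.0 150.0
```

Under the constant-density assumption, removing 30% of the code removes 30% of the expected bugs, whatever the density actually is.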

If you can't find a use for a procedure or data structure now, there's a good chance nobody else will either. Even if it's obviously useful, defer its implementation until it's actually going to be used.

Bill Gates Was Right

Bill Gates has often been ridiculed for once saying that "640K ought to be enough for anybody". But, for most purposes, he was right. The essential functionality of MS Office applications was present on early IBM PCs in software such as VisiCalc (a spreadsheet system) and WordStar (a word processor).


© Copyright Donald Fisk 2003