Jan Veen

Review of Refactoring

At the end of last year I finally took the time to read one of the classics, Martin Fowler's "Refactoring" [1], namely the rather old but mostly still valid 1999 edition. Here I want to share some of my thoughts about it.

What it is

In my own words, refactoring is the process of transforming program code into a different but computationally equivalent version that is easier for humans to understand. Refactoring is never an end in itself. Fowler recommends integrating refactoring tightly into the daily development process with his metaphor of the "Two Hats". First there is an intent to change the code, be it a reported bug or a new feature (why else would you want to change it?). The programmer looks at the code and checks whether the change is easy to make. If so, they put on the programmer's hat, add tests for the change and implement the desired behaviour, touching existing code only as far as the current change really requires.

If, however, the change is difficult to make, stand back. What is it that makes the change so difficult? Does it require touching many different places in a similar way? Would you like to reuse existing functionality, but it is heavily entangled with something unrelated? Do you recognise that your change is only a subtle difference in specific behaviour, but you are unable to "inject" the difference where appropriate? All these questions allude to the presence of so-called code smells, and a large part of the book consists of recipes for recognising and removing these smells by performing very small, strictly structured steps, the refactorings, until the change becomes easy to make again. So the programmer puts on the refactoring hat and refactors. Testing is really important here: each refactoring makes only a small modification, and after each step the tests are run to ensure nothing has broken in between. The ultimate goal is understandability. Every piece of code should have a clear and concise intent that jumps right out at the reader.
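
To make this concrete, here is a minimal sketch of the classic Extract Method refactoring in Java (my own example in the spirit of the book's; the Order and Invoice types are assumptions of mine):

    import java.util.List;

    class Order {
        private final double amount;
        Order(double amount) { this.amount = amount; }
        double getAmount() { return amount; }
    }

    class Invoice {
        private final List<Order> orders;
        Invoice(List<Order> orders) { this.orders = orders; }

        // Before the refactoring, the summing loop was inlined here
        // and its intent was buried:
        //
        //   void printOwing() {
        //       double outstanding = 0.0;
        //       for (Order order : orders) outstanding += order.getAmount();
        //       System.out.println("amount: " + outstanding);
        //   }

        // After Extract Method, the calculation has a name that
        // states its intent at a glance.
        void printOwing() {
            System.out.println("amount: " + calculateOutstanding());
        }

        double calculateOutstanding() {
            double outstanding = 0.0;
            for (Order order : orders) {
                outstanding += order.getAmount();
            }
            return outstanding;
        }
    }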

But what about performance?

Apart from refactoring itself, the book expresses some other opinions I want to comment on. The first is "Don't optimise for performance while refactoring". Some of the refactorings appear to make the code "do more", e.g. "Replace Temp with Query", which replaces every read of a local variable with the expression (almost always a method call) that computes the variable's value. The rationale is that local variables often lead to complex interconnections in the code and make other refactorings, e.g. extracting a method, harder to perform. Here Fowler values understandability of the code much higher than a potential (!) gain in performance from saving a computation, and most of the time he is right. For the few cases where performance really matters, the code can easily be refactored again to compute faster. In most cases, though, you will have saved a lot of time that you can invest in work other than improving code that is already fast enough. Also, never improve performance without measurements. You use unit tests while refactoring for understandability, and you use measurements while refactoring for performance. Without measurements, how else would you know your code runs faster?
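
As a rough illustration, this is what the refactoring might look like (my own Java sketch, not the book's code; the Product type and its fields are made up):

    // Replace Temp with Query: the temp is replaced by a method call.
    class Product {
        private final int quantity;
        private final double itemPrice;

        Product(int quantity, double itemPrice) {
            this.quantity = quantity;
            this.itemPrice = itemPrice;
        }

        // Before: a temp held the intermediate value.
        //
        //   double getPrice() {
        //       double basePrice = quantity * itemPrice;
        //       return basePrice * discountFactor();
        //   }

        // After: every read of the temp becomes a method call. The
        // expression is recomputed on each call, i.e. the code "does
        // more", but the local variable and its couplings are gone.
        double getPrice() {
            return basePrice() * discountFactor();
        }

        private double basePrice() {
            return quantity * itemPrice;
        }

        private double discountFactor() {
            return basePrice() > 1000 ? 0.95 : 0.98;
        }
    }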

Strategy Pattern at work

I also appreciated the introductory code example, where I saw the State/Strategy pattern in action. It was the typical case of replacing a switch statement with a polymorphic method call. However, the value the switch statement dispatched on was subject to change during the lifetime of the containing object. In such a case you extract the polymorphic behaviour into a separate class hierarchy, the Strategy. The original object then delegates its method call to this Strategy hierarchy through a member variable of that type, and changing the object's behaviour at runtime simply means assigning a different Strategy to it.
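
From memory, the setup looks roughly like this (a simplified Java sketch along the lines of the book's movie-rental example; the names and charge formulas are approximations, not the book's exact code):

    // The Strategy hierarchy extracted from the former switch statement.
    abstract class Price {
        abstract double getCharge(int daysRented);
    }

    class RegularPrice extends Price {
        double getCharge(int daysRented) {
            return 2 + Math.max(0, daysRented - 2) * 1.5;
        }
    }

    class NewReleasePrice extends Price {
        double getCharge(int daysRented) {
            return daysRented * 3;
        }
    }

    class Movie {
        private Price price; // the Strategy; a movie's pricing can change

        Movie(Price initial) { this.price = initial; }

        // Before: getCharge() switched on an int price code that could
        // change during the object's lifetime. After: the call is
        // delegated to the Price hierarchy.
        double getCharge(int daysRented) {
            return price.getCharge(daysRented);
        }

        // Changing behaviour at runtime is just an assignment.
        void setPrice(Price price) { this.price = price; }
    }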

Null - but safe

Another interesting pattern, not entirely new to me but one I was explicitly confronted with here for the first time, is the Null Object. Instead of passing around pesky null values, you introduce a subtype of the possibly-null type that implements a harmless dummy version of it. This removes quite a lot of bloat: readability improves because unnecessary if statements disappear, and the risk of a null dereference at runtime disappears with them.
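
A minimal Java sketch of the idea (the Customer and Site names are my own, not the book's):

    interface Customer {
        String getName();
        boolean isNull();
    }

    class RealCustomer implements Customer {
        private final String name;
        RealCustomer(String name) { this.name = name; }
        public String getName() { return name; }
        public boolean isNull() { return false; }
    }

    // The Null Object implements a harmless dummy version of the type,
    // so callers never receive an actual null.
    class MissingCustomer implements Customer {
        public String getName() { return "occupant"; }
        public boolean isNull() { return true; }
    }

    class Site {
        private final Customer customer;

        Site(Customer customer) {
            // Substitute the Null Object once, at the boundary.
            this.customer = (customer == null) ? new MissingCustomer() : customer;
        }

        // Callers need no 'if (customer != null)' guard and cannot
        // trigger a NullPointerException here.
        Customer getCustomer() { return customer; }
    }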

Collaborative refactorings?

A question to which I unfortunately couldn't get a direct answer is the problem of collaboration. If you work in a team and decide to refactor a piece of code, you can make people really angry when they have to resolve merge conflicts just because you renamed a method or moved some code around. It would have been nice to see some discussion of this problem, but maybe the 1999 edition is simply a bit too old: branching version control systems were not very common at the time.

Good old times

Finally I want to point out that the 1999 edition has some sections that really appear outdated today. Fowler describes the nature of a unit test by suggesting that each class (in his Java case) should have a test() method which performs a complete self-check of the class's functionality. The most important point is that such tests are self-checking: there must be a binary verdict on the outcome. Apparently he was arguing against the then-common notion of a test whose possibly verbose output had to be checked manually for correctness. Those times are (hopefully?) long gone, and a binary PASS/FAIL decision on tests is widely accepted today.

The book also contains a short introduction to an interesting new Java testing library called JUnit. It automatically discovers a set of tests, runs them and reports nicely on their results (a sketch of the style follows below). It even came with a small GUI window with a progress bar and a run button, which was really convenient in a time before IDEs.

Talking about IDEs, tool support was barely present in those days. You couldn't just select a section of code and extract it into its own method; all these steps had to be performed by hand. This is also why the book describes each refactoring in such detail: you never know when you might be confronted with a situation (possibly a new language #Rust) where tool support is only emerging and not yet at the level one expects today. It was really fun to read these sections and compare the progress made since then.
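
For comparison, a self-checking test in the JUnit style of that era looked roughly like this (a sketch reusing the Movie classes from the Strategy example above; the test itself is made up):

    import junit.framework.TestCase;

    // JUnit discovers the test* methods of a TestCase, runs them and
    // reports a binary PASS/FAIL for each; no verbose output has to
    // be inspected manually.
    public class MovieTest extends TestCase {
        public void testNewReleaseChargesPerDay() {
            Movie movie = new Movie(new NewReleasePrice());
            assertEquals(9.0, movie.getCharge(3), 0.001);
        }
    }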

Of course I know that there is a newer edition of the book, but for reasons I cannot fully articulate (maybe it has something to do with nostalgia) I preferred to read the old version. Maybe I will someday take the time to read the new edition (which, incidentally, uses JavaScript in its examples).

[1] M. Fowler, K. Beck et al.: Refactoring: Improving the Design of Existing Code. Addison-Wesley Professional, June 28th 1999.