Too Much Incremental Refactoring

by Jeff Langr

August 14, 2008

Having reasonably comprehensive tests is great. I currently have a good-sized codebase for an application I’m working on. In one key subsystem, representing about 1/6 of the whole application so far, there are about 25 classes and 86 code-level tests. Most of the classes are small, one to three methods. Coverage is in the very high 90s. Sixteen of the 86 tests are necessarily “slow” tests (centered around persistence), so I run them in a separate suite. Combined, they take 3/4 of a second; the “fast” tests run almost instantly.

I’ve been very happy with the design and implementation of this subsystem so far, with the exception of one central class for which the tests are a bit ugly. Up until now, I’ve been able to introduce significant new features in very short order, with minimal shotgun surgery and no fear of breaking anything (the sixteen integration tests are a key part of eliminating this fear).

But I just introduced one would-be small change that introduced significant pain. There were two causes of the pain: one, I chose to use an array to represent one significant collection, and two, I chose to expose that implementation through parameter passing. Why did I choose an array? Well, it represented a fixed collection of data for a class, which means it was easiest to code as an array (think initialization). It also made for effective tests, particularly given the expressiveness of varargs.

My new change pushed me to prefer a parameterized collection instead of an array. So I bit the bullet and figured I’d just incrementally refactor my way away from the array. Worked great! In an incremental fashion–with no one step taking more than 5 minutes before getting a green bar (and most taking 30 seconds)–I got to my desired solution steady and sure. I made a couple simple mistakes along the way, and relished the way the red bar immediately informed me of my stupidity. The problem, though, was that there were far too many of these wonderfully short steps, adding up to about an hour of work. “Ok, made the change here…ohhh, there, it ripples to a half dozen other methods!”

The lesson that I was reminded of was that I would have been much better off having encapsulated the collection early on. Instead of passing about Widget[] or List<Widget>, this change would have been considerably simpler had I created a type named WidgetList. I would have had to expose some overloaded “bridge” methods while I fixed the tests that used varargs, but in 20/20 hindsight, the change would have taken me maybe 10 minutes.

Of course I knew better, even in the heady moments of thinking the arrays made things a bit simpler. There’s plenty of literature that covers this, and I’ve read it. I’ve chastised others for the same thing plenty of times, even recently. But for whatever reason, I didn’t bother. Laziness, not paying enough attention, not caring enough. Of course it’s a violation of encapsulation, but technically I can also look at my problem as having allowed the duplication of my choice for the collection’s implementation to run rampant throughout the system.

A pair would probably have squashed the potential headache early on. Maybe. Take a look at your code–where are you passing around exposed collections?

Share your comment

Jeff Langr

About the Author

Jeff Langr has been building software for 40 years and writing about it heavily for 20. You can find out more about Jeff, learn from the many helpful articles and books he's written, or read one of his 1000+ combined blog (including Agile in a Flash) and public posts.