I do TDD mostly to support the ability to keep my code and design clean, with high confidence that I’m not breaking something in the process. Most developers who do not practice TDD refactor inadequately, because of the fear based on the very real likelihood of breaking the software. I can’t blame ’em for the fear. I’ve helped developers in environments where defects related to downtime had the potential to cost the company millions of dollars per minute.
“Inadequate? How dare you!” Yes, inadequate. The best TADers (test-after developers) claim around 70%-75% coverage. The problem is that there’s usually some good, chewy, gristly meat in that 25% to 30% that isn’t covered, and is thus not easily or safely maintained. Costs go up. The system degrades in quality by definition.
In contrast, I have very high confidence that I can move code about when doing TDD. I can also improve the structure and expressiveness of a system, making it far easier to maintain.
Duplication is sooo easy to introduce in a system. It’s harder to spot, and even harder to fix if you don’t have fast tests providing high levels of coverage. I’ve seen one real case where introducing close-to-comprehensive unit tests on a system resulted in shrinkage down to about 1/3 its original size over a reasonably short period of time. And with most systems I’ve encountered, I can scan through the source code and quickly spot rampant bits of unnecessary duplication.
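To make that concrete, here is a minimal sketch of the kind of duplication I mean. It is not taken from any real system: the function names, the discount percentages, and the use of integer cents are all invented for illustration. Two price calculations started life as copy-and-paste twins; the tests at the bottom run in milliseconds and pass both before and after the shared rule is extracted, which is what makes the extraction feel safe.

```python
# Hypothetical example: names and discount rules are made up for illustration.

def _discounted(amount_cents, percent_off):
    """Shared rule, extracted from two formerly copy-pasted functions."""
    return amount_cents * (100 - percent_off) // 100


def member_price(amount_cents):
    # Before the extraction this line read: amount_cents * (100 - 10) // 100
    return _discounted(amount_cents, 10)


def employee_price(amount_cents):
    # Before the extraction this line read: amount_cents * (100 - 25) // 100
    return _discounted(amount_cents, 25)


# Fast tests that pin the behavior. They were green with the duplicated
# versions and must stay green after the helper is introduced.
def test_member_price():
    assert member_price(10000) == 9000


def test_employee_price():
    assert employee_price(8000) == 6000
```

Without those two tests, even an extraction this small relies on reading the code carefully and hoping; with them, a mistake in the helper shows up on the next test run.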
Good code structure and expressiveness is also lacking in most systems. If you’ve been a developer for more than a few weeks, it’s almost a certainty that you’ve spent way too much time slogging through a multi-hundred or multi-thousand line long function, trying to understand how it works and how to fix it without breaking something else. In a well-structured (cohesive) system, the time to pinpoint and fix things can be reduced by almost an order of magnitude.
TADers simply don’t eliminate duplication and clean up the code to this extent. It’s not a goal of TAD.
Which would you rather maintain? The TAD system, or a system that’s half the size and double the expressiveness?
There are many other reasons TDD allows me to go faster than TAD. The converse of my “why TAD sucks” reasons should hint at many of them.
Comments
Shmoo July 14, 2011 at 2:01 am
Hi, Jeff,
Thanks for another interesting article. You seldom fail to deliver.
I realise that it was written a while ago but I hope you don’t mind some comments.
“I do TDD mostly to support the ability to keep my code and design clean, with high confidence that I’m not breaking something in the process.”
Well, your reasons for doing TDD are your own, but most view TDD as a design process; what gives you confidence that you’re not breaking something is not TDD but the unit tests that TDD provides, which TAD also provides, so you’re not making a distinction here between TDD and TAD (as your title might suggest).
“Most developers who do not practice TDD refactor inadequately, because of the fear based on the very real likelihood of breaking the software.”
Could you quote your source for this statement? Or is it from your own, anecdotal experience?
It is an odd statement simply because, by definition, refactoring does not entail a very real likelihood of breaking software.
Here’s what that chap Fowler says: Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behaviour. Its heart is a series of small behaviour preserving transformations. Each transformation (called a ‘refactoring’) does little, but a sequence of transformations can produce a significant restructuring.
Of course, Fowler can’t define “small,” but I think most would agree that any refactoring that entailed a very real likelihood of breaking software is too big. This point stands independent of the process used, so it’s odd that you think that there is a correlation between misunderstanding refactoring and the set of developers who don’t practice TDD.
““Inadequate? How dare you!” Yes, inadequate. The best TADers (test-after developers) claim around 70%-75% coverage.”
Forgive me if I’ve misunderstood, but you seem here to attempt to justify your claim that TADers don’t refactor adequately by pointing to their low (test) coverage. Refactoring, however, has nothing to do with coverage percentages.
You seem to suggest that before refactoring, the developer checks the coverage percentage and based on that decides whether to proceed with a refactoring. Jeff, you’re a seasoned programmer: you know this doesn’t happen, so I must have misunderstood your point here.
Much more likely, when deciding whether to refactor the developer will first check whether unit tests exist for the section to be refactored and if not the developer will write them (regardless of the current coverage). The developer will then proceed with the refactoring.
This again is independent of process: this is just current best-practice for handling legacy code (defined as code without unit tests).
“The problem is that there’s usually some good, chewy, gristly meat in that 25% to 30% that isn’t covered, and is thus not easily or safely maintained. Costs go up. The system degrades in quality by definition. In contrast, I have very high confidence that I can move code about when doing TDD. I can also improve the structure and expressiveness of a system, making it far easier to maintain.”
You’re switching between maintenance and refactoring here, but normal practice for handling legacy code states that if any code is to be maintained or its structure improved after it has been written, unit tests are written for that code before maintenance and re-structuring begins. This again fails to distinguish between TAD and TDD.
Or are you claiming that costs do not, “Go up,” for TDD systems? Do TDD systems, “Degrade in quality,” less than TAD systems? If you claim this, you simply must quote source: the first at least would be revolutionary for software development.
“Duplication is sooo easy to introduce in a system. It’s harder to spot, and even harder to fix if you don’t have fast tests providing high levels of coverage.”
Why is duplication harder to spot with fast tests rather than slow ones?
How does duplication correlate with test execution speed? My tests take 54 seconds to run: how much duplication do I have? How much harder is it to answer that question if my tests now take 57 seconds to run?
Similarly with coverage: I have 75% test coverage today but only 73% tomorrow, how does this increase the difficulty of spotting duplication?
“I’ve seen one real case where introducing close-to-comprehensive unit tests on a system resulted in shrinkage down to about 1/3 its original size over a reasonably short period of time. And with most systems I’ve encountered, I can scan through the source code and quickly spot rampant bits of unnecessary duplication.”
I put it to you that it’s your scanning of the code that reveals duplication, not the writing of unit tests per se. And I notice that you introduced those close-to-comprehensive unit tests on an existing system (so the tests were TADed): this fails to make a distinction between TDD and TAD.
(By the way, what’s necessary duplication?)
“Good code structure and expressiveness is also lacking in most systems. If you’ve been a developer for more than a few weeks, it’s almost a certainty that you’ve spent way too much time slogging through a multi-hundred or multi-thousand line long function, trying to understand how it works and how to fix it without breaking something else. In a well-structured (cohesive) system, the time to pinpoint and fix things can be reduced by almost an order of magnitude.”
Fine, but this doesn’t distinguish TAD from TDD.
“TADers simply don’t eliminate duplication and clean up the code to this extent. It’s not a goal of TAD.”
To what extent? You have not put a case for anything eliminating duplication except your scanning and have mentioned no cleaning-up of code except misunderstood refactoring.
TAD, furthermore, is not the all-inclusive set of principles by which TADers write software: there are countless others. Eliminating duplication may not be a goal of TAD, but it is a goal of the DRY principle, to which most TADers/TDDers/Waterfallers, etc., also adhere. Just because it’s not a goal of TAD doesn’t mean that TADers don’t do it. Ensuring that database accesses don’t deadlock when scaled to tens of thousands of accesses per second is not a goal of TDD but that doesn’t mean that TDDers write bad database-interaction code.
“Which would you rather maintain? The TAD system, or a system that’s half the size and double the expressiveness?”
Oh, come now, Jeff, a false dichotomy? Can I play, too? How about this, “Which would you rather maintain? The TDD system, or a system that’s half the size and double the expressiveness?” Or how about, “Which would you rather eat? A vat of radioactive zombie flesh, or a healthy, delicious meal cooked by a great French chef whose daughter has just given him his first bouncing, baby grandson?”
“There are many other reasons TDD allows me to go faster than TAD.”
Other reasons?! And correct me if I’m wrong, but is this the first reference to a speed comparison between TDD and TAD in this article, apart from the title? Here, in the conclusion?
“The converse of my “why TAD sucks” reasons should hint at many of them.”
Jeff, that referenced article has less going for it than this one; oil tankers daily navigate narrower channels than those presented in that article.
Thanks again,
Shmoo.
Jeff Langr July 14, 2011 at 8:52 am
Greetings Schmoo,
Many thanks for the lengthy response. Too lengthy to tackle in one response, so I’ll take a more incremental approach.
“Well, your reasons for doing TDD are your own, but most view TDD as a design process; what gives you confidence that you’re not breaking something is not TDD but the unit tests that TDD provides”
Yes. Note that I said I do TDD to “support” the ability to refactor, alluding to the same thing you’re saying. I must apologize–in an effort to keep the blog post short and to my point, I took plenty of liberties by not going into excruciating supporting detail. And you’re right–I’m not making a distinction in this sentence you referred to between TDD and TAD, despite the title.
“by definition, refactoring does not entail a very real likelihood of breaking software.”
This is a great point, and one I’ve noted myself before: If you follow what Fowler says, either you have sufficient test coverage, or you’re taking such small, safe steps that nothing could possibly break.
So I must confess again to being loose with my terms here. Would you be so kind as to at least accept that for purposes of my post, I mean to define the term “refactoring” in the general spirit of “I’m attempting to transform the code and retain the behavior, although I might not have sufficient tests or be doing it quite safely enough”?
If you’re ok with that, then it changes the meaning of your statement that “refactoring does not entail a very real likelihood of breaking software.” I certainly agree with you if you’re sticking to Fowler’s definition. But otherwise, I’m sure you’d agree that attempting to transform code with insufficient coverage has a great likelihood of breaking things. I must (anecdotally only) relate that I’ve even seen many tiny transform attempts create insidious little bugs.
Thanks so far; more forthcoming.
Jeff Langr July 14, 2011 at 9:33 am
“you seem here to attempt to justify your claim that TADers don’t refactor adequately by pointing to their low (test) coverage.”
I believe you have me here, Schmoo. The problem is that I’m arguing two different things, and it shows. In the first paragraph, I say that “Most developers who do not practice TDD refactor inadequately.” Anecdotal it may be, that’s what I’ve seen time and time again, so I believe it. But this blog is about TDD vs TAD–not TDD vs (TDD + people who don’t practice either). So let’s discuss the TAD aspect–which is what the second paragraph is about.
“You seem to suggest that before refactoring, the developer checks the coverage percentage and based on that decides whether to proceed with a refactoring.”
My apologies if you got this impression. I was not suggesting that.
“Much more likely, when deciding whether to refactor the developer will first check whether unit tests exist for the section to be refactored and if not the developer will write them”
I think you’re right in that this is the proper thing to do if you are practicing TAD. However–and this is the first point I am saddened that I must disagree–I don’t think there’s anything to back up that this happens in practice. My anecdotal experience in this area is that the developers don’t write the tests as much as they should. Sometimes they write the tests, but often the tests weren’t written in the first place because they were too difficult to write after-the-fact, and life isn’t any easier for them now. Instead, they proceed to change the code, run some manual (or integration-level) tests, and go with that.
“This again is independent of process: this is just current best-practice for handling legacy code (defined as code without unit tests).” I agree.
With respect to this contention: “The problem is that there’s usually some good, chewy, gristly meat in that 25% to 30% that isn’t covered, and is thus not easily or safely maintained. Costs go up.”, I’m hinging it off of my previous discussion–it becomes slower if the uncovered areas are not kept sufficiently clean as they are developed.
It does, as you suggest, all depend on proper behavior.
So let me take a step back and see if we can get even closer to some level of agreement: TDD or TAD, theoretically, can produce exactly the same, very positive results. I must suspect you probably back the notion that TDD is overkill, and TAD is a better choice because you are writing just enough tests and not too many.
As far as studies about TDD systems degrading in quality less than TAD (or vice versa), there are unfortunately no such studies, and honestly I don’t think a single study or two would prove much of anything anyway. (There are studies that do compare quality in TDD systems versus non-TDD systems, but I don’t take much stock in those either.)
Code does degrade in quality in areas that remain untested, though. Instead of making often dramatic changes required to design code correctly, programmers will usually take the path of least change in order to avoid breakage. Usually this is to the detriment of the design.
“Why is duplication harder to spot with fast tests rather than slow ones?”
This is a poorly constructed sentence on my part; I meant for the “fast” to attach to the second part of the contention. In any case, the fast should have been stricken entirely–that’s an allusion to not having unit tests at all, which does make it slower to fix things.
Jeff Langr July 14, 2011 at 9:48 am
“TDD or TAD, theoretically, can produce exactly the same, very positive results. I must suspect you probably back the notion that TDD is overkill, and TAD is a better choice because you are writing just enough tests and not too many.”
Where was I going with this? Oh yeah.
Adherence to best practice WRT TAD and reality are two different things. I suspect the bulk of the average team wants little to do with TAD or TDD. They just want to build their code and ship it so they can keep their job; unit testing to them is simply something they are told they have to do.
The teams that are left are the ones that care about software development. I know you do, and I know I do. I’ve certainly seen TDD do wonders in such a team that “cares,” and you’ve no doubt seen great successes with a TAD approach. (And of course I’ve been around long enough to see great successes with neither.) Which one is better for you and me? I’m always the first to say that it’s entirely up to team consensus–pick the approach that everyone agrees is right for them and their circumstance. So if TAD has worked well for you, go for it.
For the less-passionate teams, however, I’ve seen TAD turn into completely worthless efforts (and I’ve also seen similar results when TDD was introduced, too, but not under my watch). I have had success with taking non-unit-testing teams under the TDD wing, and watching it help shape their understanding of design, refactoring, and so on. Those teams might some day consider a TAD approach–when they have transitioned to a team that can be trusted to make the right choices at the right time.
Jeff Langr July 14, 2011 at 9:55 am
“I put it to you that it’s your scanning of the code that reveals duplication, not the writing of unit tests per se”
Yes, you are right. That’s what I had intended to say, despite my badly constructed earlier sentence. A dramatically reduced system size makes it a lot less time consuming to scan the code, and also easier to spot duplication. The reduced system size comes from the habit of attempting to shrink the code because you have unit tests. Most TDDers have this habit.
Jeff Langr July 14, 2011 at 9:57 am
“(By the way, what’s necessary duplication?)”
Duplication that crummy frameworks or languages require, for one. There are also cases where removing too much duplication makes for overly complex, poorly expressive code.
Jeff Langr July 14, 2011 at 10:02 am
Hi Schmoo,
“‘Which would you rather maintain? The TAD system, or a system that’s half the size and double the expressiveness?’ Oh, come now, Jeff, a false dichotomy?”
Show me what you consider a great example of a TAD system that does not contain significant amounts of duplication, and then let’s talk from there.
Jeff
Shmoo July 15, 2011 at 3:20 am
Hi, Jeff,
I am deeply humbled by your responses.
They have completely taken the wind from my too-oft childishly peevish sails.
Your points are so considerately and maturely made that I feel guilty for having posted my rant in the first place. Looks like it’s the shame cupboard under the stairs for me this afternoon.
Thank you,
Shmoo, head bowed.
Jeff Langr July 18, 2011 at 10:55 am
Hi Schmoo,
Many thanks for your kind words. I appreciated your original post as well–thanks and keep the feedback coming!
Regards,
Jeff
Amruth February 4, 2019 at 9:21 am
Hi Jeff,
Very lengthy explanation and useful.
Can you please suggest some game with which we can motivate the team to follow TDD?
Hope to hear back from you.
Jeff Langr February 4, 2019 at 4:27 pm
Greetings Amruth–
Thanks for the message!
When I was working with about half a dozen other coaches at Ford, we would get in a room, pair up, and choose a coding challenge at random from the site Codewars.com. It became a race to see who could complete the code challenge first, but you had to use TDD. We reviewed it as a group afterward to talk about which solutions were better, as well.
There are a pile of sites to help you come up with good exercises; exercism.io is one. Cyber-dojo.com is another.
I worked with a group who mostly programmed as a mob throughout their day; the official day started at 9am. At 8am they would pull up a kata as a warm-up exercise to get in the mood and have a little fun.
You might also look into the concept of randoris: https://medium.com/connected/teaching-code-with-randori-51ac9a7fe7be These are similar to mobbing, but with a slightly different structure. A goal is to make the experience entertaining.
There are people who run exercises using Legos (for example, see http://www.gargoylesoftware.com/ex/lego_tdd). I wouldn’t find this effective for programmers, although perhaps it can help non-programmers understand what TDD is about.
I would probably suggest running competitions only after the team has done some exploration of exercises as a group, either through mobbing or randoris.
It might help to make sure you are clear on why you (or someone else) think they should be doing TDD. People have to understand what possible benefits it can bring to them as individuals (and not just for the team’s benefit). For me, TDD has:
– sped me up, particularly long term and as the project scales
– helped me learn more about design
– prevented me from shipping dumb, embarrassing defects
– provided gratification and continual senses of completion with each passing test
– allowed me to keep code from getting worse and worse
– created nice documentation on what the system does, meaning that I’m not spending a lot of time having to analyze the code to know what it does
TDD represents a completely different way of working. If people have been programming for a number of years without TDD, they’ve learned to be good at it–so why should they learn TDD if they’re already reasonably successful?
Doing TDD can also be frustrating if you don’t know how to do it well; you really need to ensure someone is there to help you work through things. If you have one person who knows TDD well and no one else does, mob programming can be a lot of fun and ensure everyone comes up to speed at the same time. (Pairing can also help.) If your system is particularly challenging to test-drive, that can be a reason why they don’t want to do TDD–they try and they get frustrated, so they give up. If it’s a boring system where nothing’s happening (data is just getting pulled from the server, for instance), TDD can seem pointless (and might even be so).
Mostly I do TDD because I enjoy it. I like programming, I like the satisfaction of the passing tests, I like cleaning up my code so that it’s more easily maintained and understandable. People who like programming a lot will appreciate TDD more, once they understand how to do it.
It doesn’t come overnight. I teach two- and three-day TDD classes. The first day, everyone is skeptical. We work through a few exercises, and I try to let the exercises ease the anxiety and address some of their concerns. Over the two or three days, we continue to do exercises. By the end of the second day (or into the third day if we have one), most students are intrigued enough to want to try to continue doing it. Some students get pretty excited by that point.
As with anything that is challenging to learn, TDD takes time, particularly since it represents a new habit, a completely different way of working. Someone needs to be there for people as they are learning, to support them and to answer their questions and concerns.