Interview, “How Unit Testing Improves Code Quality in TDD”


Hardik Shah presents an article, “How Unit Testing Improves Code Quality in TDD.” The first part of the post, hosted at Simform’s site, describes test-driven development (TDD) and unit testing, and also demonstrates a brief example of creating code using TDD.

After presenting some recommendations for practicing TDD, Shah provides three “case studies,” which seem closer to interviews about industry experts’ perceptions of TDD. The three folks involved include J.B. Rainsberger, Frederico Goncalves, and myself.

WeDoTDD: Companies & Teams Practicing Test-Driven Development

Take a moment and head on over to WeDoTDD.com: a site that posts detailed information about companies, people, and teams who practice and teach TDD. You’ll pick up some great ideas about what works with TDD in real teams.

You’ll find an in-depth interview with me talking about my TDD experiences as an individual and also with respect to working with my customers. I enjoyed answering the questions, and some of them forced me to think hard about why I do things the way I do.

New article at Gurock, “Test-Driven Development: A Love Story?”

Test-Driven Development: A Love Story? (22-May-2018)
In this blog post for Gurock, I wax rhapsodic about TDD… well, maybe not. I don’t love TDD itself; I love the things that TDD enables me to do. I talk about what those things are, along with a bit of my personal history about how I got here.

New article at Ranorex, “Convincing Your Team to Adopt TDD”

Convincing Your Team to Adopt TDD (15-May-2018)
In this blog post for Ranorex, you’ll pick up a number of ideas about how to approach introducing TDD into your team. While TDD is a simple practice to learn (though a challenge to master), getting developers to think they might want to even consider it is not an easy task. I’ll also talk about why resistance is to be expected.

Clean Tests: Common Concerns, Part 2: “I’ll Have Twice As Much Code to Maintain”

The unit tests you produce in TDD represent an investment. You’ll want to ensure they continue to return on value and not turn into a liability.


Breaking Dawn - Loch Rusky
“Breaking Dawn – Loch Rusky,” courtesy John McSporran.License.

The concerns:

  • The tests will have their own defects.
  • I’ll have twice as much code to maintain.
  • The tests are handcuffs.
  • Testing everything (via TDD) results in diminishing returns.

I’ll have twice as much code to maintain

It’s true, do TDD and you’ll likely have as much test code as production code, if not more. However, most sizable non-test-driven systems contain over twice the code needed. In other words: if C represents the ideal amount of production code for a system’s implementation, and TC represents test code, then 2C ≈ C + TC. That’s a wash, to me. I’ll take the version that comes with the tests, thank you.

2C? Twice the code needed? How’s that? The short answer: In non-test-driven systems, there’s no high-confidence way to keep code cruft from building up.

The TDD Bet

I worked with a customer in the Great Lakes area some years ago. They’d deployed a production application prior to my arrival; I was told its initial non-test-driven release was around 180,000 lines of Java code. The team, dedicated to TDD, had slowly but surely added unit tests as they test-drove new features, meanwhile cleaning up what they could in the legacy code. By my arrival, they’d shrunk the codebase to around 70,000 lines; by the time I’d left, they’d shrunk it perhaps another 10,000 lines.

Granted, much of the bloat was a large swath of dead and generated code they’d spotted. But a significant amount was purely unnecessary code.

I’d take the TDD Bet on almost any sizable legacy codebase: Given enough time to add tests and clean things up, it could be reduced to under half its size. A codebase with 100,000 lines of bloated code can transform into a codebase with 50,000 lines of production code and 50,000 more of tests.

Where Does This Bloat Come From?

I’m guilty of producing bloated code on my first pass of implementing some small solution. It’s much like writing (for example, these blog entries): I’ll slam out a few sentences in an attempt to spew what’s in my head onto the computer screen. Then I’ll read what I wrote, trying to put myself into the shoes of another reader: Does it make sense? Is it as punchy as it could be? I usually find numerous opportunities for cleanup!

There’s nothing wrong with slapping out a sloppy paragraph as long as you clean it up later. The same approach works for coding, with the caveat that you don’t want to wait too long–“later” might never come because something new demands your attention. When doing TDD, we ensure we clean up the code every few minutes, using the confidence that a passing test suite gives us.

Without TDD, we don’t do nearly enough of these little bits of clean-up. It’s far too unsafe: The chance of breaking other stuff is way too high. By definition, our non-test-driven code stinks a little bit more every few minutes.

Duplicate code rears its ugly head all the time. Duplication can happen in many forms–small bits of logic appearing dozens of times throughout the codebase, different algorithms to accomplish the same purpose, rote mechanisms where more abstract concepts would suffice.

I watched a developer copy-and-paste an entire 100+ line function in order to introduce a small, 5-line-long code variance. The right thing, of course, would be to use delegation, polymorphism, whatever, as long as it didn’t mean duplicating logic. And the developer knew that, as we might hope. “But I don’t want to change that function. I don’t want to be the one who crashes production because I changed some working code in order to make the system ‘cleaner.'”

Rather than introduce design concepts that minimize redundancy when they must deal with behavior variances, developers make what they think is the least impactful change out of fear. For these programmers, the notion of cleaner provides no immediate, obvious value. We know clean will make their job easier over time, but fear now intimidates more.

So, yes, with TDD, you’ll have as much test code as production code to maintain, but overall the amount of code will not be significantly different.

More importantly, you won’t fear your own code.

 

New article at Gurock, “Succeeding With Test-Driven Development at a Distributed Start-up”

Succeeding With Test-Driven Development at a Distributed Start-up (17-Jan-2018)
In this blog post for Gurock’s blog, I talk about my experiences doing test-driven development (TDD) at a small startup: Where did we succeed, where did we fail, and what would I change?

Clean Tests: Common Concerns, Part 1

The unit tests you produce in TDD represent an investment. You’ll want to ensure they continue to return on value and not turn into a liability.


Magnetospheric Multiscale
“Magnetospheric Multiscale,” courtesy NASA Goddard Space Flight Center.License.

As a first step, find the right frame of mind. Start by thinking about your unit tests as part of your production system, not as an adjunct effort or afterthought. We consider the diagnostic devices built into complex machines (e.g. automobiles) as part of the product. Software is a complex machine. Your tests provide diagnostics on the health of your software product.

Here are some common concerns about TDD:

  • The tests will have their own defects.
  • I’ll have twice as much code to maintain.
  • The tests are handcuffs.
  • Testing everything (via TDD) results in diminishing returns.

I’ll tackle each one of these in a separate post, offering some thoughts about how to reduce your anxiety level. (No doubt I’ll add other concerns along the way.) Then I’ll move on to how to craft tests that are more sustainable and anxiety-free.

The tests will have their own defects.

It is absolutely possible for a unit test to be defective. Sometimes it’s not verifying what you think it is, sometimes you’re not executing the code you think you are, sometimes there’s not even an assert, and sometimes things aren’t in the expected state to start with.

Strictly adhering to the TDD cycle (red-green-refactor) almost always eliminates this as a real-world problem. You must always see the completed test fail at least once, and for the right reason (pay attention to the failure messages!), before making it pass. And writing just enough code to make it pass should help reinforce that it was *that code* that passed the test.

Other ways to ensure your tests start and stay honest:

  • Pair or mob (or review the tests carefully with some other form of review).
  • Minimize use of complex test doubles, such as fakes.
  • Avoid partial mocks at all costs.
  • Don’t mock past immediate collaborators.
  • Employ contract tests.
  • Fix the design issues that demand complex test setup.
  • Stick to single-behavior tests.
  • Design tests for clarity. Emphasize test abstraction and employ AAA.
  • Always read the tests carefully as your point of entry to understanding.
  • Improve each test, as warranted, each time you touch it.

 

Next: The tests are handcuffs.

TDD and Speculative Design

Student: “I know that I’ll need a HashMap somewhere in my very near coding future. Why shouldn’t I just introduce it now?”

The seeing eye
“The seeing eye,” courtesy Valerie Everett.License.

Jeff: “This is one of the better questions about TDD, and one of the most common concerns about how it’s practiced. Do you really need a HashMap? Maybe you won’t by the time you’re done. By introducing this speculative design element now, you’d be creating complexity prematurely. You will incur additional cost every time you revisit the code in the interim. You might even end up with the unnecessary complexity forever.”

Student: “But I know I’ll need it in about 20 minutes.”

Jeff: “Maybe. Sometimes I’ve found that my most-cocksure design speculations were wrong, and TDD drove out a simpler solution. But for now, let’s say you’re right. With TDD, you’re learning how to think and build a solution more incrementally–a skill that can improve your ability to accommodate new, never-before-conceived behaviors into the codebase.”

Student: “Right, but this whole incremental thing requires some reworking of code. You told us that we start with the solution for the simplest case, which for this example requires only a hardcoded return value. Then we write another test that requires us to replace the hardcoding with a scalar. Then another test that requires segregating data, which I can do using a hash-based collection. Seems wasteful to write some code, only to throw it away a few minutes later.”

Jeff: “We’re trying to adhere to the rhythmic TDD discipline of red-greeen-refactor. Part of the success of TDD hinges on seeing a failing (red) test for a new behavior. One important reason: If you never see a test fail, you have little clue that it’s a legitimate test. When you write a more generalized solution than you need, writing a next test that fails first may be near impossible. Ultimately, not following the cycle leads to you writing fewer tests.”

Student: “Remind me, why is TDD and adhering to its cycle important?”

Jeff: “Foremost, the cycle lets us know we’re on track with building the correct code every few minutes. Also, as developers, we’re good about constantly creating lots of cruft–code that’s not as easily understood or maintained. In fact, cruft buildup is where a significant portion of your development costs rise. The cycle gives you continual opportunities to clean up the junk as soon as you create it–before it’s too late. The tests minimize your fear of making the necessary changes.”

“If you stick to the cycle, and write a failing test for each new behavior, you’ll end up with a set of tests for virtually all bits of logic you intended. These tests will document all behaviors you chose to drive into the system. Well-crafted tests can save everyone the countless hours otherwise needed to understand how the system behaves over time and as it grows.”

Student: “So, it’s also important that the cycles are short.”

Jeff: “Yes. It’s generally easier to find and fix problems when you find out about them sooner. Also, if you take larger steps, you’ll create more code than you’ll likely clean during the refactor step.”

Student: “But what about the wasted bits of code, the rework?”

Jeff: “You’re making a tradeoff. You are the tortoise, taking an incremental, steady route as opposed to the hare, who takes larger leaps requiring unpredictable amounts of route correction. The amount of code rework is often trivial, and if you learn to always take on the next-smallest case, the amount of new code is also small. You’ll learn to get good at taking small, incremental steps.”

Student: “Yeah, but rework?”

Jeff: “Let’s not forget that debugging cycles and defects represent rework costs that we rarely account for, and they are significant, even monstrous, on most development efforts. There’s also considerable cost when, however rarely, your speculation is wrong.”

Student: “Right, but I know more than just what’s needed for the current test case. Shouldn’t I create a design that accommodates all of that knowledge?”

Jeff: “It’s always valuable to create a design model given the needs you know, whether in your head or on the whiteboard. But don’t force that design into the system until the behaviors described by the tests demand it. And don’t spend a lot of effort on the details.

“As you test-drive, you can shift the design in the direction of your speculative model, as long as you’re keeping the code as clean as possible. Beck’s simple design rule #4 can help tell you when you’ve gone overboard: Avoid any additional moving parts or complexity than needed to pass the current set of tests.”

Student: “I would rather build the design to what I know I need over the next few hours.”

Jeff: “That’s your prerogative, though I suggest starting with the smaller, more incremental steps. It’s the better way to learn TDD: It’ll keep you honest, and generally result in more tests that describe the behavior you’re test-driving into the system.

“Once you’re comfortable with the TDD rhythm, you might take larger steps. I sometimes do. Much of the time, I end up ok, but every once in a while, I get bitten. That pushes me back in the direction of taking smaller steps.”

Student: “Right, but I know I need the HashMap. Here, see, I’m just about done, and there it is.”

Jeff: “Nice. But it turns out that the PO changed her mind. When you come in tomorrow, she’ll explain that we need to track things transactionally. The HashMap is the wrong abstraction for this–we want a time series collection. You’re going to have more rework than if you’d kept the design simple.”

Student: “Isn’t this rationale for spending just a bit more time on pinning down requirements up front?”

Jeff: ” One of the reasons we do agile is to allow for changes in direction based on feedback–from the marketplace, from the PO, from what we learn. TDD provides a cyclic mechanism that supports the cycle of agile and its goal of producing high-quality software that meets the current, changing needs of the customer.”

“TDD is all about continual, just-in-time, sufficient design. I like to say that design is so important, you can’t just consider it once. You have to address it continually.”

Student: “Sounds like it’s all your opinion. I’d like to hear from someone else.”

Jeff: Tim Ottinger also suggested that TDD gives you the ability to abandon the code after any integration step (which could be as often as every R-G-R cycle)–knowing it’s as clean and complete as the tests indicate.

“Tim also hinted at something else important: None of us are perfect, and many of our predictions about design are flat-out wrong. TDD provides continual humility lessons for most developers, helping us reign our natural hubris at minimal cost.”


Note: this contrived dialog is based on real questions and challenges from actual TDD students (and still coming–I had much the same discussion two weeks ago).

TDD Kata: Risk Card Sets

Tired of the same old programming katas? Give this one a try!

Risk “IXS_4046,” courtesy Leon Brocard. License.

In the game of Risk, you receive a card at the end of your turn if you captured at least one territory during that turn. If you end up with a proper set (to be defined shortly), you can redeem in on a later turn for bonus armies.

Cards either contain one of three symbols–horseman (H), cannon (C), soldier (S)–or are Jokers (J) that can act as any one of the three symbols.

This Kata has two parts. Please do Part 1 imagining that you know nothing about Part 2, i.e. minimally like a good TDDer should.

Part 1: Is the collection of cards a valid set?

Determine whether or not a collection of three (selected) cards represents a valid set that can be turned in. A valid set contains three cards either all with the same symbol, or each will different symbols.

The user has selected exactly three valid cards–the UI constrains how many cards can be selected–so your solution should concern itself with no other possibilities.

Examples of valid sets:

H-H-H
C-S-H
C-S-J (with the Joker acting as an H)
S-S-J (with the Joker acting as an S)

Examples of invalid sets:

H-H-C
S-S-H

Part 2: Does the collection of cards contain a valid set?

Determine whether or not a collection of 0 through 5+ cards (i.e. all the cards that a user holds) contains a valid set.

Five cards logically must always contain a valid set. The only case where a four card set is invalid is when it contains only two symbols, e.g. H-H-C-C.

Discussion / Suggestions

This is not a long kata: My first implementation (in Java, I’ll try another language next time) that covers both parts is essentially a complex conditional with four conditional expressions, probably 20 minutes of effort. It did bring up a couple interesting questions as I ran through it.

  • How long did it take the first time through? The second time?
  • If you lead others through this kata: How long does it take for someone new to TDD?
  • Was this too difficult? Too easy?
  • Functional approaches (LINQ, Java 8 streams, etc) seem to be the way to go. If you produce a more imperative solution, how many lines was your code? How readable by another developer?
  • How much do your implementation choices leak into the test names? Do your test names describe it more from an implementation stance, in other words, or are they essentially a restatement of the Risk rules?
  • Did you extract your conditional expressions into intention-declaring functions, or do you assume the ability of a developer to understand their intent? If you extracted conditionals, to what extent did you sacrifice performance to improve readability / composition?
  • How many tests did you end up with for each part? Does it make sense to eliminate any (for the sake of keeping documentation simple) now that you have a complete solution?
  • In Part 1, were there tests you felt compelled to write for confidence, but that passed immediately (i.e. you were unable to follow R-G-Refactor)? How might you have avoided this?
  • Did some of your tests for Part 2 pass immediately? If so, what would have been a simpler solution for Part 1?
  • When you built Part 2, did you factor it to take advantage of the method created in Part 1, or did you have the method created in Part 1 take advantage of the method created in Part 2? Or neither?

I’d love to hear from you if you try this kata–please report back any interesting findings!

I was triggered to create this Kata after I saw a complex open source implementation (in a working Risk app) that required several dozens of lines of Java code. You can do much better.

My first implementation appears here.

Agile Java Rewrite?

I’ve had the idea to write a book on refactoring for some time, and even produced a proposal a couple months ago. The focus would be “refactoring in the context of TDD,” aka continual design. I only got halfway through a sample chapter, which the publisher requests. (Writing the sample also reassures me it’s the book I want to and can write.) This past week I heard that a Big Name is currently working on a refactoring book rewrite. Hmm, maybe this project is not something that I should be doing.

Agile Java has been in print since 2005. It still sells copies–not a lot–and I also still receive thanks for the book from many folks who found it a great way to learn (TDD + OO + Java from the ground up). I feel a bit guilty that people still use a 12+year-old-book featuring a version of Java end-of-lifed for about 8 years! Time for a rewrite, as some have requested?

A dozen years later, what would change?

  • New approach based on new versions of Java: The book was written to coincide with Java 5 and its dramatic language changes, which also made it easier for a simpler, more-OO focus than previously would have been possible. With Java 9 on the horizon (July 2017, maybe?), a new version would find lambdas being introduced fairly early to allow new developers to focus on the value of more-functional approaches to solving problems.
  • Potential new “Additional Lesson” chapters focusing on full-stack considerations (HTTP and JPA, for example)… or maybe this whole notion of covering peripheral technologies gets dropped. Yeah, let’s do that, and keep this to a somewhat-reasonable length.
  • Updated for the latest version of JUnit (5).
  • Use of fluent assertions (probably Hamcrest with mention of other frameworks).
  • Emphasis on one behavior per test, as well as the value of tests-as-documentation.

Otherwise, the tone and general flow would remain the same. Hopefully not the page length: Few folks want to read 750-page books anymore.

I’m the sort to find the stuff I wrote a while back to be lacking, so I might be tempted to rewrite a few things…

Atom