- Consulting / Coaching
- Jeff’s Blog
Should We Automate All of Our Feature Scenarios? (8-Feb-2018)
This post for Ranorex’s blog talks about how behavior-driven development (BDD) can help us come to consensus. I share a story about a customer who found value in BDD, even though we were unable to automate their “specs by example.”
The unit tests you produce in TDD represent an investment. You’ll want to ensure they continue to return on value and not turn into a liability.
As a first step, find the right frame of mind. Start by thinking about your unit tests as part of your production system, not as an adjunct effort or afterthought. We consider the diagnostic devices built into complex machines (e.g. automobiles) as part of the product. Software is a complex machine. Your tests provide diagnostics on the health of your software product.
Here are some common concerns about TDD:
I’ll tackle each one of these in a separate post, offering some thoughts about how to reduce your anxiety level. (No doubt I’ll add other concerns along the way.) Then I’ll move on to how to craft tests that are more sustainable and anxiety-free.
It is absolutely possible for a unit test to be defective. Sometimes it’s not verifying what you think it is, sometimes you’re not executing the code you think you are, sometimes there’s not even an assert, and sometimes things aren’t in the expected state to start with.
Strictly adhering to the TDD cycle (red-green-refactor) almost always eliminates this as a real-world problem. You must always see the completed test fail at least once, and for the right reason (pay attention to the failure messages!), before making it pass. And writing just enough code to make it pass should help reinforce that it was *that code* that passed the test.
Other ways to ensure your tests start and stay honest:
Next: The tests are handcuffs.
We occasionally hear dramatic stories about defects that killed or cost multi-millions or billions(!) of dollars. Yet these fiascos likely don’t change our programming habits. Maybe we don’t worry since our application isn’t critical enough. Or maybe we’ve learned to accept defects as a necessary evil.
What’s the defect count on your primary system? Chances are good you have hundreds if not thousands of defects. We know that these defects represent some problem, but calculating the true cost they represent is extremely difficult, so we don’t ever bother. Let’s mull over some of the cost increases that defects create.
Fixing a defect represents increased opportunity costs (time spent not delivering new revenue-producing solutions). Anecdotal evidence (from teams I’ve worked with, associates, and Google-ing about) suggests that a typical team might spend anywhere from 20% of time to 60% or more on rework demanded by fixing defects.
We’d like to think that when we fix a problem, we can move on… but chances are decent that in fixing a problem, we unwittingly created other defects of similarly undetermined cost.
We must pay to “support” the errant behavior in production. How many support folks does your application demand? A customer of mine boasted very few production defects, but they only meant problems that prevented their customer from accomplishing their goals. It turns out that they had a very fat book of workarounds and a support staff of over 40 (for an application that might have demanded maybe 10 folks at any other shop). Extra support folks cost money, and so does their manager, their computers, and so on.
Ahh but we also have support tiers. What’s the cost of stressing out the team and managers in the middle of the night or over weekends with production issues? Many developers look at “being on support rotation” as a given, necessary evil, but its stress toll can be significant. If you burn out your people with off-hours support, the good ones may move on—a potentially devastating cost.
But wait… I mentioned customers, so…
We know these things all cost *something*, but we will probably never truly know how much.
Defects aren’t our only quality-related cost sink. I’ll discuss the cost of poorly constructed code later.
“It means doing exactly what you said you were going to do—that’s really the definition of quality. That way you save the expenses of doing things over again, and you don’t disappoint anybody.” – Philip Crosby as quoted in Industry Week, “Philip Crosby: Quality is Still Free,” by Tim Stevens, June 19, 1995.
Student: “I know that I’ll need a HashMap somewhere in my very near coding future. Why shouldn’t I just introduce it now?”
Jeff: “This is one of the better questions about TDD, and one of the most common concerns about how it’s practiced. Do you really need a HashMap? Maybe you won’t by the time you’re done. By introducing this speculative design element now, you’d be creating complexity prematurely. You will incur additional cost every time you revisit the code in the interim. You might even end up with the unnecessary complexity forever.”
Student: “But I know I’ll need it in about 20 minutes.”
Jeff: “Maybe. Sometimes I’ve found that my most-cocksure design speculations were wrong, and TDD drove out a simpler solution. But for now, let’s say you’re right. With TDD, you’re learning how to think and build a solution more incrementally–a skill that can improve your ability to accommodate new, never-before-conceived behaviors into the codebase.”
Student: “Right, but this whole incremental thing requires some reworking of code. You told us that we start with the solution for the simplest case, which for this example requires only a hardcoded return value. Then we write another test that requires us to replace the hardcoding with a scalar. Then another test that requires segregating data, which I can do using a hash-based collection. Seems wasteful to write some code, only to throw it away a few minutes later.”
Jeff: “We’re trying to adhere to the rhythmic TDD discipline of red-greeen-refactor. Part of the success of TDD hinges on seeing a failing (red) test for a new behavior. One important reason: If you never see a test fail, you have little clue that it’s a legitimate test. When you write a more generalized solution than you need, writing a next test that fails first may be near impossible. Ultimately, not following the cycle leads to you writing fewer tests.”
Student: “Remind me, why is TDD and adhering to its cycle important?”
Jeff: “Foremost, the cycle lets us know we’re on track with building the correct code every few minutes. Also, as developers, we’re good about constantly creating lots of cruft–code that’s not as easily understood or maintained. In fact, cruft buildup is where a significant portion of your development costs rise. The cycle gives you continual opportunities to clean up the junk as soon as you create it–before it’s too late. The tests minimize your fear of making the necessary changes.”
“If you stick to the cycle, and write a failing test for each new behavior, you’ll end up with a set of tests for virtually all bits of logic you intended. These tests will document all behaviors you chose to drive into the system. Well-crafted tests can save everyone the countless hours otherwise needed to understand how the system behaves over time and as it grows.”
Student: “So, it’s also important that the cycles are short.”
Jeff: “Yes. It’s generally easier to find and fix problems when you find out about them sooner. Also, if you take larger steps, you’ll create more code than you’ll likely clean during the refactor step.”
Student: “But what about the wasted bits of code, the rework?”
Jeff: “You’re making a tradeoff. You are the tortoise, taking an incremental, steady route as opposed to the hare, who takes larger leaps requiring unpredictable amounts of route correction. The amount of code rework is often trivial, and if you learn to always take on the next-smallest case, the amount of new code is also small. You’ll learn to get good at taking small, incremental steps.”
Student: “Yeah, but rework?”
Jeff: “Let’s not forget that debugging cycles and defects represent rework costs that we rarely account for, and they are significant, even monstrous, on most development efforts. There’s also considerable cost when, however rarely, your speculation is wrong.”
Student: “Right, but I know more than just what’s needed for the current test case. Shouldn’t I create a design that accommodates all of that knowledge?”
Jeff: “It’s always valuable to create a design model given the needs you know, whether in your head or on the whiteboard. But don’t force that design into the system until the behaviors described by the tests demand it. And don’t spend a lot of effort on the details.
“As you test-drive, you can shift the design in the direction of your speculative model, as long as you’re keeping the code as clean as possible. Beck’s simple design rule #4 can help tell you when you’ve gone overboard: Avoid any additional moving parts or complexity than needed to pass the current set of tests.”
Student: “I would rather build the design to what I know I need over the next few hours.”
Jeff: “That’s your prerogative, though I suggest starting with the smaller, more incremental steps. It’s the better way to learn TDD: It’ll keep you honest, and generally result in more tests that describe the behavior you’re test-driving into the system.
“Once you’re comfortable with the TDD rhythm, you might take larger steps. I sometimes do. Much of the time, I end up ok, but every once in a while, I get bitten. That pushes me back in the direction of taking smaller steps.”
Student: “Right, but I know I need the HashMap. Here, see, I’m just about done, and there it is.”
Jeff: “Nice. But it turns out that the PO changed her mind. When you come in tomorrow, she’ll explain that we need to track things transactionally. The HashMap is the wrong abstraction for this–we want a time series collection. You’re going to have more rework than if you’d kept the design simple.”
Student: “Isn’t this rationale for spending just a bit more time on pinning down requirements up front?”
Jeff: ” One of the reasons we do agile is to allow for changes in direction based on feedback–from the marketplace, from the PO, from what we learn. TDD provides a cyclic mechanism that supports the cycle of agile and its goal of producing high-quality software that meets the current, changing needs of the customer.”
“TDD is all about continual, just-in-time, sufficient design. I like to say that design is so important, you can’t just consider it once. You have to address it continually.”
Student: “Sounds like it’s all your opinion. I’d like to hear from someone else.”
Jeff: Tim Ottinger also suggested that TDD gives you the ability to abandon the code after any integration step (which could be as often as every R-G-R cycle)–knowing it’s as clean and complete as the tests indicate.
“Tim also hinted at something else important: None of us are perfect, and many of our predictions about design are flat-out wrong. TDD provides continual humility lessons for most developers, helping us reign our natural hubris at minimal cost.”
Tired of the same old programming katas? Give this one a try!
In the game of Risk, you receive a card at the end of your turn if you captured at least one territory during that turn. If you end up with a proper set (to be defined shortly), you can redeem in on a later turn for bonus armies.
Cards either contain one of three symbols–horseman (H), cannon (C), soldier (S)–or are Jokers (J) that can act as any one of the three symbols.
This Kata has two parts. Please do Part 1 imagining that you know nothing about Part 2, i.e. minimally like a good TDDer should.
Determine whether or not a collection of three (selected) cards represents a valid set that can be turned in. A valid set contains three cards either all with the same symbol, or each will different symbols.
The user has selected exactly three valid cards–the UI constrains how many cards can be selected–so your solution should concern itself with no other possibilities.
Examples of valid sets:
C-S-J (with the Joker acting as an H)
S-S-J (with the Joker acting as an S)
Examples of invalid sets:
Determine whether or not a collection of 0 through 5+ cards (i.e. all the cards that a user holds) contains a valid set.
Five cards logically must always contain a valid set. The only case where a four card set is invalid is when it contains only two symbols, e.g. H-H-C-C.
This is not a long kata: My first implementation (in Java, I’ll try another language next time) that covers both parts is essentially a complex conditional with four conditional expressions, probably 20 minutes of effort. It did bring up a couple interesting questions as I ran through it.
I’d love to hear from you if you try this kata–please report back any interesting findings!
I was triggered to create this Kata after I saw a complex open source implementation (in a working Risk app) that required several dozens of lines of Java code. You can do much better.
My first implementation appears here.
With the goal of delivering quality software, I can help you with:
Want to hear more? Call 719-287-GEEK or use the Contact Me form to the left.