Clean Tests: Common Concerns, Part 1

The unit tests you produce in TDD represent an investment. You’ll want to ensure they continue to return on value and not turn into a liability.


Magnetospheric Multiscale
“Magnetospheric Multiscale,” courtesy NASA Goddard Space Flight Center.License.

As a first step, find the right frame of mind. Start by thinking about your unit tests as part of your production system, not as an adjunct effort or afterthought. We consider the diagnostic devices built into complex machines (e.g. automobiles) as part of the product. Software is a complex machine. Your tests provide diagnostics on the health of your software product.

Here are some common concerns about TDD:

  • The tests will have their own defects.
  • I’ll have twice as much code to maintain.
  • The tests are handcuffs.
  • Testing everything (via TDD) results in diminishing returns.

I’ll tackle each one of these in a separate post, offering some thoughts about how to reduce your anxiety level. (No doubt I’ll add other concerns along the way.) Then I’ll move on to how to craft tests that are more sustainable and anxiety-free.

The tests will have their own defects.

It is absolutely possible for a unit test to be defective. Sometimes it’s not verifying what you think it is, sometimes you’re not executing the code you think you are, sometimes there’s not even an assert, and sometimes things aren’t in the expected state to start with.

Strictly adhering to the TDD cycle (red-green-refactor) almost always eliminates this as a real-world problem. You must always see the completed test fail at least once, and for the right reason (pay attention to the failure messages!), before making it pass. And writing just enough code to make it pass should help reinforce that it was *that code* that passed the test.

Other ways to ensure your tests start and stay honest:

  • Pair or mob (or review the tests carefully with some other form of review).
  • Minimize use of complex test doubles, such as fakes.
  • Avoid partial mocks at all costs.
  • Don’t mock past immediate collaborators.
  • Employ contract tests.
  • Fix the design issues that demand complex test setup.
  • Stick to single-behavior tests.
  • Design tests for clarity. Emphasize test abstraction and employ AAA.
  • Always read the tests carefully as your point of entry to understanding.
  • Improve each test, as warranted, each time you touch it.

Next: The tests are handcuffs.

TDD and Speculative Design

Student: “I know that I’ll need a HashMap somewhere in my very near coding future. Why shouldn’t I just introduce it now?”

The seeing eye
“The seeing eye,” courtesy Valerie Everett.License.

Jeff: “This is one of the better questions about TDD, and one of the most common concerns about how it’s practiced. Do you really need a HashMap? Maybe you won’t by the time you’re done. By introducing this speculative design element now, you’d be creating complexity prematurely. You will incur additional cost every time you revisit the code in the interim. You might even end up with the unnecessary complexity forever.”

Student: “But I know I’ll need it in about 20 minutes.”

Jeff: “Maybe. Sometimes I’ve found that my most-cocksure design speculations were wrong, and TDD drove out a simpler solution. But for now, let’s say you’re right. With TDD, you’re learning how to think and build a solution more incrementally–a skill that can improve your ability to accommodate new, never-before-conceived behaviors into the codebase.”

Student: “Right, but this whole incremental thing requires some reworking of code. You told us that we start with the solution for the simplest case, which for this example requires only a hardcoded return value. Then we write another test that requires us to replace the hardcoding with a scalar. Then another test that requires segregating data, which I can do using a hash-based collection. Seems wasteful to write some code, only to throw it away a few minutes later.”

Jeff: “We’re trying to adhere to the rhythmic TDD discipline of red-greeen-refactor. Part of the success of TDD hinges on seeing a failing (red) test for a new behavior. One important reason: If you never see a test fail, you have little clue that it’s a legitimate test. When you write a more generalized solution than you need, writing a next test that fails first may be near impossible. Ultimately, not following the cycle leads to you writing fewer tests.”

Student: “Remind me, why is TDD and adhering to its cycle important?”

Jeff: “Foremost, the cycle lets us know we’re on track with building the correct code every few minutes. Also, as developers, we’re good about constantly creating lots of cruft–code that’s not as easily understood or maintained. In fact, cruft buildup is where a significant portion of your development costs rise. The cycle gives you continual opportunities to clean up the junk as soon as you create it–before it’s too late. The tests minimize your fear of making the necessary changes.”

“If you stick to the cycle, and write a failing test for each new behavior, you’ll end up with a set of tests for virtually all bits of logic you intended. These tests will document all behaviors you chose to drive into the system. Well-crafted tests can save everyone the countless hours otherwise needed to understand how the system behaves over time and as it grows.”

Student: “So, it’s also important that the cycles are short.”

Jeff: “Yes. It’s generally easier to find and fix problems when you find out about them sooner. Also, if you take larger steps, you’ll create more code than you’ll likely clean during the refactor step.”

Student: “But what about the wasted bits of code, the rework?”

Jeff: “You’re making a tradeoff. You are the tortoise, taking an incremental, steady route as opposed to the hare, who takes larger leaps requiring unpredictable amounts of route correction. The amount of code rework is often trivial, and if you learn to always take on the next-smallest case, the amount of new code is also small. You’ll learn to get good at taking small, incremental steps.”

Student: “Yeah, but rework?”

Jeff: “Let’s not forget that debugging cycles and defects represent rework costs that we rarely account for, and they are significant, even monstrous, on most development efforts. There’s also considerable cost when, however rarely, your speculation is wrong.”

Student: “Right, but I know more than just what’s needed for the current test case. Shouldn’t I create a design that accommodates all of that knowledge?”

Jeff: “It’s always valuable to create a design model given the needs you know, whether in your head or on the whiteboard. But don’t force that design into the system until the behaviors described by the tests demand it. And don’t spend a lot of effort on the details.

“As you test-drive, you can shift the design in the direction of your speculative model, as long as you’re keeping the code as clean as possible. Beck’s simple design rule #4 can help tell you when you’ve gone overboard: Avoid any additional moving parts or complexity than needed to pass the current set of tests.”

Student: “I would rather build the design to what I know I need over the next few hours.”

Jeff: “That’s your prerogative, though I suggest starting with the smaller, more incremental steps. It’s the better way to learn TDD: It’ll keep you honest, and generally result in more tests that describe the behavior you’re test-driving into the system.

“Once you’re comfortable with the TDD rhythm, you might take larger steps. I sometimes do. Much of the time, I end up ok, but every once in a while, I get bitten. That pushes me back in the direction of taking smaller steps.”

Student: “Right, but I know I need the HashMap. Here, see, I’m just about done, and there it is.”

Jeff: “Nice. But it turns out that the PO changed her mind. When you come in tomorrow, she’ll explain that we need to track things transactionally. The HashMap is the wrong abstraction for this–we want a time series collection. You’re going to have more rework than if you’d kept the design simple.”

Student: “Isn’t this rationale for spending just a bit more time on pinning down requirements up front?”

Jeff: ” One of the reasons we do agile is to allow for changes in direction based on feedback–from the marketplace, from the PO, from what we learn. TDD provides a cyclic mechanism that supports the cycle of agile and its goal of producing high-quality software that meets the current, changing needs of the customer.”

“TDD is all about continual, just-in-time, sufficient design. I like to say that design is so important, you can’t just consider it once. You have to address it continually.”

Student: “Sounds like it’s all your opinion. I’d like to hear from someone else.”

Jeff: Tim Ottinger also suggested that TDD gives you the ability to abandon the code after any integration step (which could be as often as every R-G-R cycle)–knowing it’s as clean and complete as the tests indicate.

“Tim also hinted at something else important: None of us are perfect, and many of our predictions about design are flat-out wrong. TDD provides continual humility lessons for most developers, helping us reign our natural hubris at minimal cost.”


Note: this contrived dialog is based on real questions and challenges from actual TDD students (and still coming–I had much the same discussion two weeks ago).

TDD Kata: Risk Card Sets

Tired of the same old programming katas? Give this one a try!

Risk “IXS_4046,” courtesy Leon Brocard. License.

In the game of Risk, you receive a card at the end of your turn if you captured at least one territory during that turn. If you end up with a proper set (to be defined shortly), you can redeem in on a later turn for bonus armies.

Cards either contain one of three symbols–horseman (H), cannon (C), soldier (S)–or are Jokers (J) that can act as any one of the three symbols.

This Kata has two parts. Please do Part 1 imagining that you know nothing about Part 2, i.e. minimally like a good TDDer should.

Part 1: Is the collection of cards a valid set?

Determine whether or not a collection of three (selected) cards represents a valid set that can be turned in. A valid set contains three cards either all with the same symbol, or each will different symbols.

The user has selected exactly three valid cards–the UI constrains how many cards can be selected–so your solution should concern itself with no other possibilities.

Examples of valid sets:

H-H-H
C-S-H
C-S-J (with the Joker acting as an H)
S-S-J (with the Joker acting as an S)

Examples of invalid sets:

H-H-C
S-S-H

Part 2: Does the collection of cards contain a valid set?

Determine whether or not a collection of 0 through 5+ cards (i.e. all the cards that a user holds) contains a valid set.

Five cards logically must always contain a valid set. The only case where a four card set is invalid is when it contains only two symbols, e.g. H-H-C-C.

Discussion / Suggestions

This is not a long kata: My first implementation (in Java, I’ll try another language next time) that covers both parts is essentially a complex conditional with four conditional expressions, probably 20 minutes of effort. It did bring up a couple interesting questions as I ran through it.

  • How long did it take the first time through? The second time?
  • If you lead others through this kata: How long does it take for someone new to TDD?
  • Was this too difficult? Too easy?
  • Functional approaches (LINQ, Java 8 streams, etc) seem to be the way to go. If you produce a more imperative solution, how many lines was your code? How readable by another developer?
  • How much do your implementation choices leak into the test names? Do your test names describe it more from an implementation stance, in other words, or are they essentially a restatement of the Risk rules?
  • Did you extract your conditional expressions into intention-declaring functions, or do you assume the ability of a developer to understand their intent? If you extracted conditionals, to what extent did you sacrifice performance to improve readability / composition?
  • How many tests did you end up with for each part? Does it make sense to eliminate any (for the sake of keeping documentation simple) now that you have a complete solution?
  • In Part 1, were there tests you felt compelled to write for confidence, but that passed immediately (i.e. you were unable to follow R-G-Refactor)? How might you have avoided this?
  • Did some of your tests for Part 2 pass immediately? If so, what would have been a simpler solution for Part 1?
  • When you built Part 2, did you factor it to take advantage of the method created in Part 1, or did you have the method created in Part 1 take advantage of the method created in Part 2? Or neither?

I’d love to hear from you if you try this kata–please report back any interesting findings!

I was triggered to create this Kata after I saw a complex open source implementation (in a working Risk app) that required several dozens of lines of Java code. You can do much better.

My first implementation appears here.

Agile Java Rewrite?

I’ve had the idea to write a book on refactoring for some time, and even produced a proposal a couple months ago. The focus would be “refactoring in the context of TDD,” aka continual design. I only got halfway through a sample chapter, which the publisher requests. (Writing the sample also reassures me it’s the book I want to and can write.) This past week I heard that a Big Name is currently working on a refactoring book rewrite. Hmm, maybe this project is not something that I should be doing.

Agile Java has been in print since 2005. It still sells copies–not a lot–and I also still receive thanks for the book from many folks who found it a great way to learn (TDD + OO + Java from the ground up). I feel a bit guilty that people still use a 12+year-old-book featuring a version of Java end-of-lifed for about 8 years! Time for a rewrite, as some have requested?

A dozen years later, what would change?

  • New approach based on new versions of Java: The book was written to coincide with Java 5 and its dramatic language changes, which also made it easier for a simpler, more-OO focus than previously would have been possible. With Java 9 on the horizon (July 2017, maybe?), a new version would find lambdas being introduced fairly early to allow new developers to focus on the value of more-functional approaches to solving problems.
  • Potential new “Additional Lesson” chapters focusing on full-stack considerations (HTTP and JPA, for example)… or maybe this whole notion of covering peripheral technologies gets dropped. Yeah, let’s do that, and keep this to a somewhat-reasonable length.
  • Updated for the latest version of JUnit (5).
  • Use of fluent assertions (probably Hamcrest with mention of other frameworks).
  • Emphasis on one behavior per test, as well as the value of tests-as-documentation.

Otherwise, the tone and general flow would remain the same. Hopefully not the page length: Few folks want to read 750-page books anymore.

I’m the sort to find the stuff I wrote a while back to be lacking, so I might be tempted to rewrite a few things…

The Compulsive Coder, Episode 8: You Might Not Like This

Things I think I like to do in code that others might find off-putting or even stupid. Convince me that I’m wrong. Please. Comments and castigations welcome.

Annoyed “Annoyed,” courtesy Feliciano Guimarães. License.
  • Ignore access modifiers for fields in tests. We’re always taught to make things private when possible. For Java, I simply omit the private keyword–it’s one less piece of clutter, and these are tests, which should have no clients.
  • Start my test class names with A or An. Examples: AnExpirationCalculator, APantry, AnItem. This means that I have a naming convention for tests that is vetted by sheer readability. For example: APantry.listsItemsExpiringToday(). I’m actually ambivalent and even inconsistent about this; if enough people beat me up, I might change, but I kind of like it the more I do it.
  • Eschew safety braces. Why type if (x) { doSomething(); } when if (x) doSomething(); will suffice? “But what if you forget to add the braces when needed (ed: and you don’t remember to format the code)?” Has yet to be a problem, particularly since I’m doing TDD.
  • Put single-statement if‘s (and for other similar control flow statements) on a single line. In fact, single-line if‘s make the use of safety braces look even stupider, as the previous bullet demonstrates.
  • import packagename.*; … because I figure one could always hover or automatically get the imports to show explicitly if needed. Lately though I’m feeling a bit repentant and showing explicit imports… for pedantic reasons?
  • Largely ignore code coverage and other static-code-analysis-tool-generated metrics. My sh*t does stink sometimes, but I know when it does and when it doesn’t.
  • Use Java. Well, most of the shops I interact with still use it. It is our LCD language.

Maybe I need to pair more frequently again.

More on That First TDD Exercise

incrementalism “Kata,” courtesy Brett Jordan. License.

A few weeks ago, I iterated characteristics for a very-first exercise for TDD newbies. But once you’ve settled on a great introductory TDD example, how do you deliver it? What do you talk about or demonstrate, and what do the students do on their own?

I used to initiate TDD training by first doing a demo, then letting students do their own similar exercise (or even the same one) on their own. When demonstrating my example, I’d first talk about the TDD cycle, then launch into some coding. During the demo, I’d discuss the mechanics of their unit-testing tool and assertion framework of choice, what a test looked like, what sorts of things to focus on during refactoring, and so on.

I got good at coding while talking and looking at the students. So what? The problem was, I was blathering and they were watching, providing a great opportunity for them to start tuning out–a very bad thing for people who likely are anxious about TDD in the first place. I tried a bit of choreographing: I’d demo bit-by-bit, waiting for students to catch up on coding the same bit. It worked, but still seemed like too much was getting covered all at once.

My goal instead became to get students into their own, simplest possible exercise as quickly as possible. Using a technique I learned from James Grenning long ago, students are given a complete set of tests for the first exercise. Initially the tests are all disabled or commented out. The tests are ordered to best promote consistent incremental growth of the solution.

The student’s job is to uncomment one test at a time, watch it fail, get it to pass, refactor, and then move on to the next test. My primary instruction to them is to ensure that they write no more code than needed to get the current test to pass. This is a hard and important lesson to learn and understand! My secondary instruction is to tell them to clean the code up with each passing test (yet they never do enough of it, regardless of instruction!).

If I’ve done my job particularly well in terms of ordering the tests, I can even introduce a tool that forces the issue of minimal implementations: The TDD Smackdown Tool is a simple test runner that generates test failure when students write code that causes any of the still-commented-out tests to pass. It’s not foolproof, but might prevent them from coding too much before you have time to get to their machine to, well, smack ’em down.

By simply following the instructions and getting tests to pass, students learn a lot by osmosis. A second exercise, without the training wheels of already-written tests, goes far more smoothly. They can look at the first exercise for what a unit test must look like.

No technique is perfect. Some possible challenges / discussions:
– “Since I have all the tests, I know what the complete solution will require, so why not code it already?”
– “Would I normally write all the tests at once?”

Using this Grenning Method, the focus shifts to one core element: how to incrementally grow a solution using TDD. Discussions can focus on the red-green-refactor cycle, emphasizing refactoring and “red first” / minimal implementation.

Selecting a First TDD Exercise

Kata “Kata,” courtesy Peter Gordon. License.

Introducing developers to TDD? You want to carefully choose a first exercise. Too short, too long, too simple, too complex, too toy-like, too close to their problems, too whatever, and you might have chosen a turn-off. First impressions matter!

Of course, you can’t please everybody, but you can usually find a good middle ground. Here are some criteria for selecting an appropriate first TDD exercise.

Not frivolous. While very many of us love games and trivialities, there are enough developers that don’t. Their takeaway: “TDD is just for toys, not the ‘professional’ development that I do.” The Bowling Game? Save it and other games for a second or later exercise.
“Real” enough. A dozen years ago, I used to demonstrate a stack example in Java. Its implementation was a trivial four methods that simply delegated to an underlying list. Same takeaway: “Why would I bother building such a thing? TDD is just for toy examples.”
Time frame: Not too short, not too long. For a very first exercise, something that takes learners about 25-40 minutes to implement as a pair is ideal. (For experienced test-drivers like you or me, this translates to about 10-15 minutes as an individual and 20 minutes in a pair.) Any longer and they’ll get frustrated; much shorter and they won’t have had a chance to get into the rhythm.
Promotes incrementalism. Part of TDD’s power is in its ability to help you incrementally grow an implementation in a series of small steps. A ~45 minute example as suggested might involve 5-7 individual tests. As a counter-example: Looking for a Benefactor, a challenge at the kata site CodeWars.com, provides a decent challenge. However, the end-implementation is fully codeable with a second test, and there’s perhaps only one additional (denegerate) test to try.
Allows alternate solutions. Just about everything does, but if there’s one and only one decent way to code it, find something that allows for more interesting choices.
Mild problem solving required. You don’t want learners to bog down on thinking terribly hard about the problem itself. Yes, they need to think, but not so much that they’re distracted away from the core concepts and technique of TDD. For this reason, problems requiring “interesting” math are probably not a great idea for the average IT developer. Think simple.
Promotes refactoring. The ability to refactor with confidence is perhaps the primary reason to do TDD, so this criterion is quite important. You want strong reminders at refactoring steps that the code will get out of hand without incremental cleanup. It’s likely that problems promoting incrementalism and requiring “mild problem solving” will provide enough refactoring opportunities.
Allows mistakes. Mistakes are great learning opportunities! You want the possibility for an average learner to make up to a few stupid (or not so stupid) mistakes when implementing a solution. It’s even better if they are also likely to make mistakes when refactoring.
Minimizes esoteric language knowledge. A good developer shouldn’t have to Google something to figure out how to implement a solution.
Potentially expandable. While there’s something to be said for “done,” having an example you can build on later, or add to for fast learners who are ahead of others, can be worthwhile.
Fun enough. Not a toy example, but something that programmers might enjoy enough to quickly relate to and dig into.

Knowing your audience can help you refine the above to chose an example even better suited to their interests!

One example of an exercise that I’ve found to work well: acronym generator, also at CodeWars. It can be done with or without regex string splitting, and the regex is simple if that’s the route the developer chooses. There are ample additional requirements you could introduce (salutations, suffixes, resolve clashes, etc.).

Mind you, this is all for a first example. Subsequent exercises can be as meaty, long, goofy, or tricky as needed!

The Compulsive Coder, Episode 7: Please AAA

Q. What behavior does the following JUnit test describe? (In other words, what’s an appropriate name for the test?)

        MockInitialContext mic = MockInitialContextFactory.getContext();
        mic.map.put(ClassicConstants.JNDI_CONTEXT_NAME, "tata");

        LoggerFactory.getLogger(ContextDetachingSCLTest.class);

        ContextJNDISelector selector = (ContextJNDISelector) ContextSelectorStaticBinder.getSingleton().getContextSelector();
        Context context = selector.getLoggerContext();
        assertEquals("tata", context.getName());
        System.out.println(selector.getContextNames());
        assertEquals(2, selector.getCount());

Maybe it helps, just a bit, if I tell you that the test name is testCreateContext.

Why must I spend time reading through each line of a test like this to understand what it’s really trying to prove?

The test designer might have made the central point of the test obvious, by visually chunking the test into its Arrange-Act-Assert (AAA) parts:

        MockInitialContext mic = MockInitialContextFactory.getContext();
        mic.map.put(ClassicConstants.JNDI_CONTEXT_NAME, "tata");
        LoggerFactory.getLogger(ContextDetachingSCLTest.class);
        ContextJNDISelector selector = (ContextJNDISelector) ContextSelectorStaticBinder.getSingleton().getContextSelector();

        Context context = selector.getLoggerContext();

        assertEquals("tata", context.getName());
        System.out.println(selector.getContextNames());
        assertEquals(2, selector.getCount());

Well, I think that’s right at least, given what we know.

(Yes, the word “test” in the name duplicates the requisite @Test annotation. And yes, there are other problems with the test, e.g. printing anything to the console. But it’s excruciatingly simple to first ensure the structure of a test is clear.)

Standards are only that if they are constantly adhered to.


Previous Compulsive Coder blog entry: this Duplication Is Killing Me
Next Compulsive Coder blog entry:

The Compulsive Coder, Episode 6: this Duplication is Killing Me

A few months ago, I got caught up in a series of contentious debates about duplication in test code. I’m the sort who believes that most of our design heuristics–simple design in this case–fall victim to the reality of the pirate’s code: “The code is more what you’d call ‘guidelines’ than actual rules.” Which is OK by me; rules are made to be broken.

this
Image courtesy drivethrucafe; license

Unit tests are different beasts than production code. They serve a different purpose. For the most part, they can and should follow the same standards as production code. But since one of their primary purposes is to provide summary documentation, there are good reasons to adopt differing standards. (I solicited some important distinctions on Twitter a few months ago; this will be fodder for an upcoming post.)

With respect to Beck’s four rules of simple design, many have contended that expressiveness slightly outweighs elimination of duplication, irrespective of being production code or tests. With production code, I’m on the fence.

But for tests? The documentation aspect of tests is more relevant than their adherence to OO design concepts. Yes, duplication across tests can cause some future headaches, but the converse–tests that read poorly–is a more typical and bigger time-waster. Yes, get rid of duplication, absolutely, but not if it causes the readability of the tests to suffer.

My dissenting converser was adamant that eliminating duplication was king, citing Beck himself in Test Driven Development by Example (2002), on page one, no doubt: Step four in the rhythm of TDD says you run all the tests, and step five, “Refactor to remove duplication.” In that summary of steps, Beck suggests no other reasons to refactor! I can understand how a literalist-slash-originalist might be content with putting their foot down on this definition, never mind that “refactoring” has a broader definition anyway, never mind Beck’s simple design to the contrary, and never mind 14 years passage in which we’ve discussed, reflected, and refined our understanding of TDD.

Why a contentious debate? The dissenter was promoting a framework; they had created some hooks designed to simplify the creation tests for that framework. A laudable goal, except for the fact that I had no clue what the tests were actually proving. That just wads my undies.

Where am I going with all this? Oh yeah. So as I watched the dissenter scroll through some of their tests & prod code onscreen, I noted this repeated throughout:

public void doSomeStuff() {
   this.someField = someMethod();
   int x = this.someField + 42;
   // etc.
}

Apparently their standard is to always identify field usage by scoping it with this. (I bit my lip, because at that point I was weary of the protracted debate, and figured I could write about it later. Here I am.)

“Hey, over-using this might be a good idea, because we can more easily identify where fields are used, which is important information!”

OK, I’m offended that duplication-monists don’t view the use of this in this case (i.e. not to disambiguate) as unnecessary duplication. It is. Squash it. Instead, use the power of your sophisticated IDEA or Eclipse IDEA, and colorize fields appropriately. (I like orange.) They’ll stand out like sore thumbs.


Previous Compulsive Coder blog entry: Extract Method Flow
Next Compulsive Coder blog entry: Please AAA

A Story Isn’t a Feature

Your business doesn’t want to hear about how you “delivered stories.”

story
Image source: AJC1, Flickr

The business wants you to deliver value, in the form of what they most likely refer to as features.

A story should trigger a conversation around a feature request. Your software product doesn’t contain stories, it contains features that help users accomplish goals.

Wikipedia describes a user story as “one or more sentences” that capture a user’s needs. Those sentences represent a brief description or summary based on the planning and preparation so far, and they are reminders of the conversation that you must continue to have around the feature you will build. The sentences are not the feature. They aren’t even the full story, in most cases, as it often takes the requester several more oral sentences or even paragraphs to say what they really want.

Why Does This Matter?

If you think about stories as a simplified form of a requirements document (“a couple sentences”), you’re constraining agile to be a small tweak to how you did things before, with a trendier name. That diminishes the most important aspect of how agile can help you deliver software successfully: Through continual collaboration and oral communication. Without understanding of the importance of collaboration, stories end up being another waterfall-like hand-off document.

On another front: Too many teams waste time over the wording of a story. It’s a discussion point, not an artifact. The “standard” story template should help you clarify feature requests by asking you to think about who needs them and why. But otherwise spending time on precise word-smithing of a story writeup misses the point. You’re going to discuss it, and you can iron out any ambiguities as you do.

Stories provide only minimal value as an artifact. The ensuing discussion is where the details come out. Unfortunately, most of us can’t fully recall details from a conversation a half year prior, or even a half month prior. Granted, the details end up embodied in the software, but source code more often than not does a poor job of expressing such details concisely and simply. Even programmers must sometimes spend hours to determine just what the code does for the smallest of features.

We need a better artifact than the few sentences that represent a story. Many successful teams have found that the best way to capture system behaviors is in the form of acceptance tests, best derived by use of acceptance test-driven development (ATDD) or BDD. The acceptance tests supplant the nebulous conversation with real meat. A test’s name describes the goal it accomplishes, and its narrative provides an example that quickly relates the details of system behavior. The set of test names describes the complete set of designed system behaviors. The tests capture the important detail behind the discussion triggered by the story.

A story is not a feature, and it should always trigger a conversation.

Atom