The 3rd in a series of blog posts in which I describe TDD katas & exercises that I’ve used for training purposes.
Other posts in the series:
The name normalizer transforms a name from its typical western form (for example, “Henry David Thoreau”, where the surname appears last) into surname-comma-first form (for example, “Thoreau, Henry D.”), presumably to facilitate sorting a list of names by last name. Here’s an ordered list of tests to drive an incremental derivation of the name normalizer:
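For readers who want a concrete picture of where the early tests might lead, here is a minimal JavaScript sketch. The function names and structure are my own illustration, not the starter repo’s solution, and it handles only the basic cases (mononyms, two-part names, and middle names reduced to initials):

```javascript
// A minimal sketch of a name normalizer, covering only the early tests:
// mononyms pass through, two-part names flip to "Last, First", and
// middle names are reduced to an initial. Illustrative only.
function normalize(name) {
  const parts = name.trim().split(' ')
  if (parts.length === 1) return name.trim()

  const first = parts[0]
  const last = parts[parts.length - 1]
  const middles = parts.slice(1, -1)
  const initials = middles.map(m => ' ' + initial(m)).join('')
  return `${last}, ${first}${initials}`
}

function initial(name) {
  // a one-letter middle "name" (e.g. Harry S Truman) gets no period
  return name.length === 1 ? name : `${name[0]}.`
}

console.log(normalize('Henry David Thoreau')) // Thoreau, Henry D.
```

A declarative shape like this–small, intention-revealing functions–is roughly what I hope students end up with after refactoring.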
Additional potential features:
This is one of those exercises that could probably keep you busy for a full morning if you really wanted–there are plenty more interesting rules about names throughout the world.
Duration**: 30-45 minutes
This has become my go-to first TDD exercise; I have students do it using “TDD paint by numbers,” i.e. they are provided with the tests already written (they then uncomment and implement them, one by one). TDD paint-by-numbers allows you to launch very quickly into the first exercise, with minimal need for up-front discussion or explanation. In particular, you can avoid any discussion about the testing or assertion framework.
Often I will show students a horrible implementation of the name normalizer before discussing their exercise, then ask them what sorts of problems it exhibits. I ask how safe they would feel if they had to add a new feature.
Recently I have been starting this exercise with support for the first ~3 tests already coded. First, this allows me to set the stage for what I hope their code looks like (highly declarative). Second, since they aren’t coding the first test, I can defer the typical angst about providing a hard-coded first implementation. (That discussion comes up and is addressed in a second exercise.) Third, it suggests that TDD isn’t just for “from-scratch” things. Fourth, it allows me to reiterate the point about the safety of making incremental additions to existing code with good tests.
Additional behaviors for which students may want tests (depending on their implementation or confidence level) can include appending suffixes to mononyms or stripping spaces from mononyms.
If I’ve shown students the horrible implementation first, talked a bit about declarative coding or programming by intention, and then stayed on top of them as they do their exercises, it’s possible they will produce reasonably refactored code for this exercise. (But you know how people are…)
Without these caveats, the solutions are generally a big mess. I’ve sometimes gone the route of letting them produce a mess, then re-running the exercise as a demo or in a mob, in which case I press the issue about appropriate refactoring.
My GitHub page contains a repository for Name Normalizer, in which you can find some starter tests in various programming languages. Others have already contributed; please feel free to do so. If you poke around at the branches, you will find some sample solutions as well (not all languages come with solutions–feel free, too, to provide one).
** For pairing TDD novices. Factors that can impact the duration include:
The 2nd in a series of blog posts in which I describe TDD katas & exercises that I’ve used for training purposes.
Other posts in the series:
A data structure that supports storing multiple values at a key. Think “English language dictionary,” which can contain multiple definitions for a single word.
dictionary.put('a word', '1st definition') // or dictionary['a word'] = '1st definition'
dictionary.put('a word', '2nd definition')
var definitions = dictionary['a word']
expect(definitions).toEqual(['1st definition', '2nd definition'])
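A minimal sketch of where the exercise tends to end up–each key maps to an array of values. The class shape and method names here are my own illustration, not a prescribed solution:

```javascript
// A minimal multimap sketch: each key maps to an array of values.
// Illustrative only; student solutions will vary.
class MultiMap {
  constructor() {
    this.map = new Map()
  }

  put(key, value) {
    if (!this.map.has(key)) this.map.set(key, [])
    this.map.get(key).push(value)
  }

  get(key) {
    // answer an empty array for keys never stored
    return this.map.get(key) || []
  }
}

const dictionary = new MultiMap()
dictionary.put('set', 'a collection of distinct things')
dictionary.put('set', 'to place something somewhere')
console.log(dictionary.get('set').length) // 2
```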
Duration**: 45 minutes
I used the Multimap as an introductory TDD exercise for a while many years ago. Working backward in turn, this exercise replaced my use of test-driving a stack; I moved off of that due to the negative feedback it garnered for its over-simplicity (which seemed to reinforce resistance to TDD).
Test-driving a multimap works well as a first exercise (or second exercise if the first was done using “TDD paint by numbers,” in which case you’d want to discuss a starter test list with the group). It provides a couple of opportunities for the students to create mildly interesting defects. It is “real” enough that I’ve found a few occasions to use it in production code.
Note that some languages and / or frameworks (e.g. Guava for Java, or C++11 and later) already provide a Multimap implementation. In this case, test-driving a multimap can still work as an exercise, though it could also provoke thoughts of “why are we wasting our time building something already readily available?”
Some possibilities for extra credit / extending the exercise:
** For pairing TDD novices. Factors that can impact the duration include:
The 1st in a series of blog posts in which I describe TDD katas & exercises that I’ve used for training purposes.
Other posts in the series:
When training on TDD, I agonize most about the exercises. Students should be challenged enough to remain engaged, but not so much that they become mired in the problem solution, to the detriment of learning important TDD concepts. I’ve written a few blog posts about what I thought made for a good exercise–not overly trivial, not too short, not too long, not too mathematical, etc.–as well as how to run the session for a very first TDD exercise:
You’ll find piles of katas and coding exercises listed at numerous sites, including https://exercism.io, http://codekata.com, https://www.codewars.com, https://github.com/emilybache, and https://projecteuler.net/archives.
It would be nice if every exercise came with training notes: information that would help you understand whether and when it’s appropriate for your teaching context–how long it will take, which key themes the exercise helps impart, what challenges to look for, and so on.
Over this and the next handful of blog posts, I’ll provide this information for some TDD exercises that I’ve devised.
The duration I suggest for each is roughly what it takes in a classroom setting where students are pairing. The duration can be impacted by a number of factors:
In the remainder of this post, I’ll describe the stock portfolio exercise.
In subsequent posts, I’ll cover: soundex, Risk card sets, name normalizer, MultiMap.
An in-memory module to track personal stock purchases. The stories:
Duration: 45-50 minutes
For me, this is usually the students’ second exercise; the prior exercise is some form of TDD “paint by numbers” (where the tests were already written for them). Since they are writing their own tests for the first time, a core focus is on helping them understand where to start and what tests to write next.
Generally we talk through the first 5 or so tests as a group; I write these test names on the whiteboard and we discuss their potential implementation. It’s possible to write the first handful or so of tests–all of them around concepts of emptiness and symbol count–without having to introduce a key-value structure. (Even when they get around to the test that says the symbol count shouldn’t increase for a repeat purchase for a symbol, they can use a set before they need to introduce the key-value store.)
Students will likely come up with some additional validation test cases–negative numbers, nulls, etc. That’s ok though not very interesting. A key case to include and discuss is what the portfolio answers when asked for shares of a symbol not yet purchased (it should probably be 0). A test that not everyone will think of is what should happen to the count of unique symbols when all shares of a symbol have been sold.
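Here is a sketch of where the first handful of tests might lead. The method names (isEmpty, symbolCount, purchase, sell, sharesOf) are my own choices, not prescribed by the exercise, and this version jumps ahead to the key-value store for brevity:

```javascript
// A sketch of a portfolio satisfying the early tests: emptiness,
// symbol count, repeat purchases, unpurchased symbols answering 0,
// and selling out removing the symbol. Illustrative only.
class Portfolio {
  constructor() {
    this.holdings = new Map()
  }

  isEmpty() {
    return this.symbolCount() === 0
  }

  symbolCount() {
    return this.holdings.size
  }

  purchase(symbol, shares) {
    this.holdings.set(symbol, this.sharesOf(symbol) + shares)
  }

  sell(symbol, shares) {
    const remaining = this.sharesOf(symbol) - shares
    if (remaining === 0)
      this.holdings.delete(symbol) // selling all shares removes the symbol
    else
      this.holdings.set(symbol, remaining)
  }

  sharesOf(symbol) {
    return this.holdings.get(symbol) || 0 // unpurchased symbols answer 0
  }
}
```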
One big plus of this exercise is that it supports many additional stories if needed. Here are two key ones:
There are good opportunities for refactoring in this kata. Similarities between purchasing and selling should result in some useful common abstractions. Also, once the time series becomes a thing, it’s probably a good point to introduce another module / class.
Overall I’ve found that the stock portfolio exercise works very well. It provides enough opportunities for students to trip up, and starts to bang in the concept of incrementalism.
Two Rules for Mobbing Success (25-Apr-2019)
In this blog post for Ranorex, I provide two key rules for making mob programming a successful and enjoyable way to work. You’ll figure out most of the rest of the guidelines for mobbing yourself: Things like respect for your teammates, guidelines about people coming and going, and so on. But these two rules–while simple to employ–aren’t necessarily obvious, and they’ll make all the difference in the world between a tedious, protracted mobbing session and something you’ll actually enjoy quite a bit.
To some seasoned developers, test-driven development (TDD) can initially seem like the dumbest thing ever. Once you’ve written a failing test, you are supposed to write only as much code as needed to make the test pass. No speculation about what you think you need in the future–a week from now, an hour from now, or even ten minutes from now. Per Bob Martin’s three laws of TDD, write no more production code than “sufficient to pass the one failing unit test.”
“But I know I will want a hashmap in the next test or two, because I’ll have a bunch of keys and values…”
You’re a seasoned developer, so you’re probably right most of the time when you say you’ll need it. And yes, if following TDD, for now you must provide a simpler implementation. “That seems stupid; providing the simpler solution now means that I’ll be reworking it later to create the right one.”
As with most things in computing, TDD is a tradeoff. It trades your current way of working–likely a code-test-fix cycle–for a test-code/fix cycle.
In a code-test-fix cycle, you write what you think the proper code is, you design and run tests (whether they are manual or automated), and you fix any problems that you discover. The duration of each step (code, test, and fix) usually varies dramatically, anywhere from minutes to hours (and sometimes longer). The tests you write usually cover a large subset of the behaviors in the code you just wrote. A perfect developer ends up with no fix cycle segment in code-test-fix (aka test-after development or TAD).
In a test-code/fix cycle, you define completion criteria for the code to be written in the form of automated tests. You write the code you think meets the behavior demonstrated in the tests, and fix any attempts that do not make the tests pass. You also clean the code. The duration of each step (test, code/fix, and refactor, also known as red-green-refactor in TDD) is fairly consistent and very short–ideally no more than 5 minutes. A perfect developer ends up writing code forthwith that passes the test.
A key distinction of test-code/fix, then, is that the test you write determines the scope of the code to write. Your goal is to code only logic for the behavior within the scope of this test. If your code is insufficient, the tests do not pass.
Any more code than specified by the tests falls outside the scope that they define. You can choose to write additional tests to describe (and vet) these “excess” behaviors, but you are now out of the TDD rhythm: such tests pass the first time you execute them–you never see them fail.
You can of course choose not to write additional tests, in which case some unknown amount of the excess behaviors will be untested. It is a choice, but at this point you are no longer doing TDD, by definition.
So what if you’re not doing TDD? So what if you’re not testing everything? Breaking the TDD rhythm (by writing code in excess of the tests defined) carries the same ramifications as doing TAD in general, ones that you’ve already learned to accept as a seasoned developer.
Confidence in code correctness is but one reason to practice TDD, however. The tests TDD creates can also document the voluminous choices you make as a developer. When you choose to add behaviors without providing tests, you encode this behavior in a way that is often not easily decoded: it can take a long time to uncover intent in the midst of any volume of code.
Well-designed tests can immediately impart the choices you make about the behaviors you designed into the code. They act as trustworthy documentation on the intended capabilities of the system.
Even with well-designed tests that document choices made in the code, you can produce code that resists easy comprehension. You’re not likely a perfect writer: When you first write anything (an article, an email, a blog post, a tweet, etc.), you often bloat it with unnecessary words, or create text that’s difficult to decipher. Good writing is a process of getting your thoughts down, then returning to edit these thoughts for clarity.
And so it is with code. Even if you excel at writing the correct (test-passing) code out of the gate, chances are good that it’s a little or even a lot messy. Perhaps it is code that others find difficult to understand. Perhaps it duplicates other concepts already in your system. Perhaps it violates your team standards. Perhaps it could be written more concisely (maybe using a construct you’re vaguely aware of, but you wanted to get the code working first). Perhaps you realize a slightly-better name for the variable you chose, particularly once you re-read the code to yourself.
Getting ideas down in some form, then cleaning them up, is how most of us do and should work. The realization of prose on paper, or code in an editor, makes it easier for you (and others) to see the messiness in all its glory. Once it’s in our face, we know that we should clean it up.
TDD builds this editing process into the cycle. Once you produce code that works, you can immediately and safely shape it into something that will help decrease the cost of its maintenance.
Back to test-after: Adding untested code reduces your confidence about making changes to that code. Consciously or otherwise, you will similarly reduce the amount of code editing you do.
Suppose you’re tasked with building a stock portfolio. Along with supporting the ability to purchase shares of symbols (e.g. AAPL or IBM), the portfolio should answer the number of distinct symbols.
You’re an experienced developer. “I’m going to create this hashmap now to capture the number of shares purchased for each stock symbol, because that’s the solution I’m going to end up with.”
If the only test you’ve written so far is around purchasing shares for a given stock symbol, the following potential tests pass as soon as you run them–if you even think to write them:
The immediately-passing tests put you out of the realm of TDD. So yes, to answer this section’s titular question, your speculative hashmap represents excess code.
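To make the excess concrete, here are two implementations that both pass that single test. The class and method names are my own illustration; the second speculates with a hashmap, and everything the hashmap adds (support for multiple symbols) is behavior no failing test has driven:

```javascript
// Both classes pass the one existing test: purchasing shares of a
// single symbol. Illustrative only.
class SimplePortfolio {
  purchase(symbol, shares) {
    this.shares = shares // sufficient for the one test
  }

  sharesOf(symbol) {
    return this.shares
  }
}

class SpeculativePortfolio {
  constructor() {
    this.holdings = new Map()
  }

  purchase(symbol, shares) {
    this.holdings.set(symbol, shares) // untested generality
  }

  sharesOf(symbol) {
    return this.holdings.get(symbol)
  }
}
```

From the perspective of the one failing test, the two are indistinguishable; the multi-symbol support in the second exists only on speculation.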
As suggested earlier, most of the time you’re probably right about the generalities you think you need. When you’re right, it may seem like a waste of time to incrementally rework code (by starting with a specific solution and generalizing it a bit with each test). Still, the incremental solution keeps you on a rhythm, creates documentation for all intents in the system, prevents you from injecting defects, and allows you to keep the code clean incrementally.
Every once in a while, however, you aren’t right about the generalities needed. In the cases where you aren’t right, the incorrect and unneeded generality will cost you in the interim: The additional, unnecessary complexity can increase the effort required to read and maintain the code, over and over again across the lifetime of the system. (It can also raise questions about “why” and intent that can be hard to answer.) And when it comes time to support new behavior, it will usually take longer with an overly complex implementation than the simplest possible one.
You might view the incremental path that TDD promotes as a means of exploration. Seeking a simpler solution might open your mind up to other possibilities–things that you might not dream up if you race to the more comfortable, habitual solution that seems like it’s a foregone conclusion.
For the stock portfolio, a hashmap might seem like the proper representation, but it turns out that a time series is better suited to historical data and can also result in simpler code.
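A quick sketch of the time-series idea, under my own naming assumptions: record every transaction, then derive current holdings from the history. Historical questions (purchases over time, auditing) then come nearly for free:

```javascript
// A time-series portfolio sketch: holdings are derived from the
// transaction history rather than stored in a hashmap. Illustrative only.
class Portfolio {
  constructor() {
    this.transactions = []
  }

  purchase(symbol, shares) {
    this.transactions.push({ symbol, shares })
  }

  sell(symbol, shares) {
    this.transactions.push({ symbol, shares: -shares })
  }

  sharesOf(symbol) {
    // current holdings fall out of a simple reduction over the history
    return this.transactions
      .filter(t => t.symbol === symbol)
      .reduce((total, t) => total + t.shares, 0)
  }
}
```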
It’s possible you’re claiming foul right about now: “If I had all the requirements up front for the portfolio, particularly ones around tracking purchase history and auditing, I might have come up with the best possible design.” Maybe. It’s also possible that your predisposition to certain kinds of solutions might have led you to a design that boxed you in to a constrained and inflexible solution.
A key goal of agile software development is to support and embrace change. With each iteration, a product owner can introduce new features–things never previously conceived. These interests can come about as a result of feedback from a number of events, including changes in the marketplace, changes to what a specific customer seeks, technology advances, and education regarding better techniques.
For example, no major U.S. airline carrier had supported baggage fees before 2008; it’s likely that few airlines had ever imagined them. When American Airlines introduced baggage fees in May of that year, the other carriers scrambled to incorporate a feature that their systems likely didn’t support so easily.
The TDD cycle in a sense is a microcosm of a well-executed agile process:
Both TDD and agile iterative development are feedback-driven: A key goal for each is the ability to change directions easily if new information demands it.
In agile, you don’t build support for features that the product owner doesn’t ask for. Similarly, to succeed with TDD, you must adhere to and trust a core rule of TDD: Once you’ve watched a test fail, you may write only the code necessary to make the tests pass.
With the goal of delivering quality software, I can help you with:
Want to hear more? Call 719-287-GEEK or use the Contact Me form to the left.