TDD Kata: Roman number converter

by Jeff Langr

September 20, 2011

I’ve introduced the whole-number-to-Roman-numeral kata in my test-driven development training, not out of sadistic interest, but because I think it offers some excellent lessons about TDD. I do find it a bit amusing that good developers often struggle badly with the derivation of the algorithm, even after I give them my guidelines for the exercise:

  • Don’t assume that your tests must go in order 1, 2, 3, 4, 5, 6, 7, … Seek the test that would make for the next simplest increment of code. Seek patterns–if you’ve figured out the pattern for 1, 2, 3, then what’s the next series of numbers that follows the same pattern?

  • When you refactor incrementally, make similar code look exactly alike, and then get rid of the duplication.

  • Do not prematurely optimize code. Avoid code-expressiveness “optimizations”: use while loops and simple decrement statements instead of for loops and post-decrement operators, and let them stand until your algorithm is complete.

After they complete the exercise, we discuss Uncle Bob’s Transformation Priority Premise (TPP), and I ask them to re-run the exercise for homework (and to get them used to the idea of a kata) with the TPP in mind.

The first time I attempted the kata some years ago, I got lost in the weeds. I didn’t give up; I scrapped a good amount of code, thought a bit harder about the ordering of tests, and eventually made my way to the finish line. I noted that at least a couple of insights were required, and came up with the guidelines above to help students stay on track.

“Let’s start at the beginning, a very good place to start:”

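       // Assumes JUnit 4 with static imports of org.junit.Assert.assertThat
       // and org.hamcrest.CoreMatchers.is.
       @Test
       public void convertIntToRoman() {
          assertThat(convert(1), is("I"));
       }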

The implementation is the topmost TPP transform, ({}->nil):

       String convert(int arabic) {
          return null;
       }

Quickly following, the passing implementation (nil->constant):

       String convert(int arabic) {
          return "I";
       }

Moving along to the next test:

    assertThat(convert(2), is("II"));

There might be a way to get a higher-level transform, but (unconditional->if) seems simplest to me:

       String convert(int arabic) {
          if (arabic == 2) return "II";
          return "I";
       }

The conversion for 3 indicates a repeating pattern:

       assertThat(convert(3), is("III"));

…which begs for a loop (if->while):

       String convert(int arabic) {
          String roman = "";
          while (arabic-- > 0)
             roman += "I";
          return roman;
       }

Performance geeks: never mind the use of String vs StringBuilder/StringBuffer. I want to get the algorithm working first, and all that rot clutters up the code a bit.

Hmm. I didn’t follow my guideline, and used post-decrement where I probably should have just introduced a separate statement to decrement the arabic variable. (Oh and another sin, I’m manipulating the argument. Ah well.)
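For comparison, here is a minimal sketch of the same loop written the way the guideline suggests: the decrement is its own statement, and the argument is copied into a local first. The class name `SeparateDecrementSketch` and the variable `remaining` are made up for illustration; they are not part of the kata code.

```java
// Hypothetical variant of the loop above; all names are illustrative only.
class SeparateDecrementSketch {
   static String convert(int arabic) {
      int remaining = arabic;   // work on a copy instead of the argument
      String roman = "";
      while (remaining > 0) {
         roman += "I";
         remaining -= 1;        // plain decrement as its own statement
      }
      return roman;
   }
}
```

It behaves the same as the post-decrement version for 1, 2, and 3; the only difference is that the loop condition no longer has a side effect.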

Now a bit of thought. One, two, three, ten! No, four sir! No, ten. I have support for I, II, III. If I seek a similar pattern, it’s likely to do the least to my code in terms of transforms required, so what if I head to X, XX, and XXX?

       assertThat(convert(10), is("X"));

I can get that to work with a simple (unconditional->if), but it’s horrid:

       String convert(int arabic) {
          if (arabic == 10) return "X";
          String roman = "";
          while (arabic-- > 0)
             roman += "I";
          return roman;
       }

That’s ok–adding a test for 20:

       assertThat(convert(20), is("XX"));

…starts to make things look better, as I introduce (if->while) again:

       String convert(int arabic) {
          String roman = "";
          while (arabic >= 10) {
             roman += "X";
             arabic -= 10;
          }
          while (arabic-- > 0)
             roman += "I";
          return roman;
       }

Those two while loops look an awful lot alike. I can make them look almost exactly alike–I replace my post-decrement (see, shouldn’t have introduced it yet) with a separate statement at the end of the loop body, and adjust my comparison expression:

       String convert(int arabic) {
          String roman = "";
          while (arabic >= 10) {
             roman += "X";
             arabic -= 10;
          }
          while (arabic >= 1) {
             roman += "I";
             arabic -= 1;
          }
          return roman;
       }

It’s clear that these while loops differ only by their data. A bit of insight suggests introducing a table-lookup, allowing the implementation to be factored to the following:

       static int[] ARABIC_DIGITS = { 10, 1 };
       static String[] ROMAN_DIGITS = { "X", "I" };
    
       String convert(int arabic) {
          String roman = "";
          for (int i = 0; i < ARABIC_DIGITS.length; i++) {
             while (arabic >= ARABIC_DIGITS[i]) {
                roman += ROMAN_DIGITS[i];
                arabic -= ARABIC_DIGITS[i];
             }
          }
          return roman;
       }

(In Java, I once tried building the above as a map, but since the algorithm demands traversing the digits in reverse (descending) order, it ends up being a good deal more code than the simple but slightly obnoxious paired arrays.)
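For the curious, a map-based version might look something like the sketch below. It is only one possible shape, not the code from that earlier attempt: it assumes a `TreeMap` ordered with `Comparator.reverseOrder()` so that iteration visits the largest value first, and the class name `MapTableSketch` is invented for illustration.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical map-based variant of the two-entry table.
class MapTableSketch {
   // Keys sort largest-first, so iteration visits 10 and then 1.
   static final Map<Integer, String> DIGITS =
      new TreeMap<>(Comparator.reverseOrder());
   static {
      DIGITS.put(1, "I");
      DIGITS.put(10, "X");
   }

   static String convert(int arabic) {
      String roman = "";
      for (Map.Entry<Integer, String> digit : DIGITS.entrySet()) {
         while (arabic >= digit.getKey()) {
            roman += digit.getValue();
            arabic -= digit.getKey();
         }
      }
      return roman;
   }
}
```

The pairing of value and symbol is now explicit, at the cost of the imports, the static initializer, and the `Map.Entry` plumbing.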

Huh. Look at that. A test for 30 should pass too, and it does.

But is it a bit too much implementation? Maybe–now that I support I’s and X’s, I want to be able to combine them, yet it looks like the following tests will pass right off the bat:

       assertThat(convert(11), is("XI"));
       assertThat(convert(33), is("XXXIII"));

And they do! I wonder if there’s a way I could have coded so those tests failed first, but I’ve not thought about it much. Maybe I could have put the second loop in an else block. Ideas?
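One way to picture the else-block idea, purely as speculation spelled out in code: in the sketch below the I loop runs only when no X was emitted, so a test for 11 would fail first (it yields "X") and force the next step. The class name `ElseBlockSketch` is made up, and this is not a step the kata actually took.

```java
// Hypothetical intermediate step: the I loop lives in an else block.
class ElseBlockSketch {
   static String convert(int arabic) {
      String roman = "";
      if (arabic >= 10) {
         while (arabic >= 10) {
            roman += "X";
            arabic -= 10;
         }
      } else {
         // Reached only when the number contained no tens at all.
         while (arabic >= 1) {
            roman += "I";
            arabic -= 1;
         }
      }
      return roman;
   }
}
```

With this shape, 1 through 3 and 10, 20, 30 still pass, while 11 and 33 fail until the else is removed.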

What about 5?

      assertThat(convert(5), is("V"));

It’s just another digit; adding support for 5 to the table gets this test to pass:

       static int[] ARABIC_DIGITS = { 10, 5, 1 };
       static String[] ROMAN_DIGITS = { "X", "V", "I" };

And now these little combinations involving 5 also pass:

       assertThat(convert(8), is("VIII"));
       assertThat(convert(27), is("XXVII"));

I have to imagine similar tests involving L, C, D, and M should all pass too.

Instead, I’ll hit that pesky one, 4:

       assertThat(convert(4), is("IV"));

The joy of four is that if I try to get there with subtraction, I’m in a world of hurt. This is where students usually go really, really wrong. But, wait… insight! What if I consider that “IV” is just another digit, like any other Roman digit, except that it requires two characters instead of one to represent?

       static int[] ARABIC_DIGITS = { 10, 5, 4, 1 };
       static String[] ROMAN_DIGITS = { "X", "V", "IV", "I" };

Voilà. At this point, I’ve covered the various scenarios; now it’s just a matter of fleshing out the table with support for all of the Roman digits. How about I just write a couple of tests that fire on all cylinders?

          assertThat(convert(2499), is("MMCDXCIX"));
          assertThat(convert(3949), is("MMMCMXLIX"));
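Those two tests need more than the remaining letters: every subtractive pair has to become a table entry, just as "IV" did. As a hedged sketch (the class and method names are invented for illustration, and the loop takes its table as parameters only so that two tables can be compared), here is what the four-entry table does with 9, and what changes once "IX" is added as one more two-character digit:

```java
// Hypothetical side-by-side comparison of two versions of the table.
class SubtractiveDigitsSketch {
   // Same greedy loop as the kata, with the table passed in.
   static String convert(int arabic, int[] arabicDigits, String[] romanDigits) {
      String roman = "";
      for (int i = 0; i < arabicDigits.length; i++) {
         while (arabic >= arabicDigits[i]) {
            roman += romanDigits[i];
            arabic -= arabicDigits[i];
         }
      }
      return roman;
   }

   // Table as it stands after the test for 4: no entry for 9 yet.
   static String withoutNine(int arabic) {
      return convert(arabic,
         new int[] { 10, 5, 4, 1 },
         new String[] { "X", "V", "IV", "I" });
   }

   // Table with 9 added as one more two-character digit.
   static String withNine(int arabic) {
      return convert(arabic,
         new int[] { 10, 9, 5, 4, 1 },
         new String[] { "X", "IX", "V", "IV", "I" });
   }
}
```

With the shorter table, 9 comes out as "VIV"; adding the 9/"IX" pair fixes it, and the same reasoning applies to 40, 90, 400, and 900.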

The final “table:”

    static int[] ARABIC_DIGITS =
    { 1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1 };
    static String[] ROMAN_DIGITS =
    { "M","CM","D","CD","C","XC","L","XL","X","IX","V","IV","I" };

We haven’t had to touch the algorithm for a while. It still holds up!

       String convert(int arabic) {
          String roman = "";
          for (int i = 0; i < ARABIC_DIGITS.length; i++) {
             while (arabic >= ARABIC_DIGITS[i]) {
                roman += ROMAN_DIGITS[i];
                arabic -= ARABIC_DIGITS[i];
             }
          }
          return roman;
       }
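To see why the loop needs no further changes, it can help to watch which table entries it consumes. The sketch below is the same greedy loop with a hypothetical `trace` method (and class name) added for illustration; it records each digit as it is subtracted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tracing variant of the final algorithm.
class ConversionTraceSketch {
   static final int[] ARABIC_DIGITS =
      { 1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1 };
   static final String[] ROMAN_DIGITS =
      { "M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I" };

   // Returns one "value->digit" entry per subtraction, in the order performed.
   static List<String> trace(int arabic) {
      List<String> steps = new ArrayList<>();
      for (int i = 0; i < ARABIC_DIGITS.length; i++) {
         while (arabic >= ARABIC_DIGITS[i]) {
            steps.add(ARABIC_DIGITS[i] + "->" + ROMAN_DIGITS[i]);
            arabic -= ARABIC_DIGITS[i];
         }
      }
      return steps;
   }
}
```

For 2499 the recorded steps are 1000, 1000, 400, 90, and 9, which spell out MMCDXCIX: the loop always takes the largest table entry that still fits.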

I did a search for code that solves this challenge, and had to laugh at the ugliness of a number of the implementations. There were some clever but convoluted ones, some awful ones, and a few that had the same implementation as the one here.

TDD can’t derive algorithms, so they say. I take a different view. An algorithm is simply a “step-by-step procedure for calculations,” per Wikipedia. And TDD is great at helping drive out the vast majority of step-by-step procedures.

Oh–they mean complex algorithms. I see. That’s ok: most of what we must implement on a day-to-day basis can be done with reasonably simple algorithms. In fact, I’d warrant that far less than one percent of any given system requires anything but simple algorithms that any journeyman programmer can easily devise. (And I’ve seen some very “complex” systems, such as BaBar and the Sabre GDS.)

Complex algorithms do require leaps, often achieved only through the right insights, sometimes gained only after extensive thought about the problem, and sometimes never reached by any single mind. TDD will not necessarily help your mind achieve those insights. However, its incremental nature can in some cases give you enough pause and isolation on a small piece of the problem that the needed insight almost screams at you.

Comments


trams March 18, 2013 at 9:30 am

Fantastic tutorial for thinking in the TDD mindset. The algorithm is quite elegant, and I wouldn’t have come up with a solution like that without help. I would have done it to perform subtraction for the IV, IX, etc. cases.

I wrote a rather convoluted solution for the reverse case (roman to arabic), I’ll see if I can simplify it based on your method.

Cheers!


Jeff Langr March 18, 2013 at 10:06 am

Thanks trams,

You might also look at Corey Haines’ implementation of the algorithm. He drives toward a more functional approach, probably not something I would push for C++, C#, or Java code.

Jeff


Curtis Cooley June 19, 2013 at 9:44 am

I ran this Kata at a client. I paired with one of the developers and we came to almost this exact algorithm. We did use a Map in Java by passing in a Comparator to the constructor of TreeMap. A tad more code, but it fixed the parallel array issue.


Jeff Langr June 19, 2013 at 12:25 pm

Thanks Curtis! The TreeMap is nice and just a bit of an extra pain. 🙂

Jeff


Hoto April 20, 2014 at 3:25 am

I love your version of this kata. Thanks. At first I didn’t believe this algorithm works. I wrote 41 tests. Still working.


Timo Meinen May 16, 2014 at 3:45 am

If you want to use a map you can simply use a LinkedHashMap, which keeps the order.


Jeff Langr May 16, 2014 at 8:44 am

Thanks Timo! I think I’d built it a long while back with a regular map and then later with (perhaps a) tree map. Or maybe it was a LinkedHashMap. I vaguely remember having to build a comparator to help traverse it in reverse order. I’d probably use something like that in production code; it seems like a bunch of extra code in the kata. There’s also this: http://stackoverflow.com/questions/8893231/how-to-traverse-linked-hash-map-in-reverse


Ben January 30, 2017 at 1:58 am

This is tricky to do with incremental TDD and your solution is probably the most common out there. Thanks for the post.

Mine’s completely different, but then again, I violated the open/closed principle to get there.

rgds Ben



Jeff Langr

About the Author

Jeff Langr has been building software for 40 years and writing about it heavily for 20. You can find out more about Jeff, learn from the many helpful articles and books he's written, or read one of his 1000+ combined blog (including Agile in a Flash) and public posts.