Not Giving Up on vi(m)

I wrote this post in reaction to James McKay’s blog entry, “:q!,” in which he describes abandoning vim. It’s a story familiar to me, though fortunately I never abandoned vim for good. I do have to agree that IntelliSense and other wonders of modern IDEs make them the preferred environment for things like Java, but I still get a lot of use out of vim.

In the past, I’d be forced to use vim when, say, I had to do a small amount of my work on Solaris servers, but then I’d be back on Windows shortly thereafter. I loaded Red Hat at home many years ago but it didn’t stick. Then I tried Ubuntu… a couple of times, and Un*x finally stuck. Yet I was still only occasionally using vim, which meant I was a perpetual newb, at a “level 1” of vim capability (I’ll sadly call the scale the VCM, or vim capability model).

VCM level 1 is the ability to get in and out of the tool and do basic editing (search, page forward, change/replace text, occasional use of the dot operator, etc.). I remember that I too thought the only way to delete many lines was via count-based commands (e.g., 10dd). At VCM level 1, vim beeps at you a lot, and you don’t understand why people swear by it.

What changed things was being forced to use vim daily for an extended period of time. I had the opportunity to work with Tim Ottinger a bit, and we would pair from time to time. Tim is the author of a great article on vim, “Use vim Like a Pro.” I won’t mention pairing after this paragraph, but I will say that without the ability to pair with someone at a higher VCM level, I might never have advanced much from my level 1 proficiency.

I’m making up the VCM based on my history, of course, but it might just work for others, too. VCM level 2 is marked by use of these critical elements of vim:

  • regular use of the dot operator
  • use of hjkl for cursor movement
  • ability to manage buffers, registers, and split windows
  • ability to visually mark code
  • regular use of f and t in combination with other commands; use of other context-based movement commands
  • use of ctrl-n for auto completion
  • awareness and occasional use of macros
  • occasional use of help facility to learn something new
  • regular (no pun intended) use of regex
  • use of ctags to aid code navigation

Reaching level 2 was important: At level 1, I simply thought vim was a frustrating, not-very-powerful tool. At level 2, I am fairly effective when editing, and faster at many tasks than most people in their GUI editors. I also see that there is much more to learn and master.

Tim is probably at VCM level 4: mastery of most, if not all, of vim’s features. I suppose that marks level 3 as the point where you have ingrained all the major facilities (i.e., nearly everything that Tim covers in “Use vim Like a Pro”), are trying to ingrain something new on a regular basis (I’ve now habituated the use of ~ to toggle case, for example), have considerably customized your .vimrc, and are making an active attempt to do everything in the most efficient way possible. I am getting into level 3 now, but I haven’t used vim heavily in a while.

During my fortunate opportunities to learn vim from Tim, I discovered there were a few things about vim that he didn’t know. Shocker! It’s an extensive tool, but I guess what that means is that there’s a VCM level 5. Let’s define that as mastering VCM Level 4 plus everything that Tim doesn’t know now.

TDD Kata: Roman number converter

I’ve introduced the whole-number-to-Roman-numeral kata in my test-driven development training, not out of sadistic interest, but because I think it offers some excellent lessons about TDD. I do find it a bit amusing that good developers often struggle badly with the derivation of the algorithm, even after I give them my guidelines for the exercise:

  • Don’t assume that your tests must go in order 1, 2, 3, 4, 5, 6, 7, … Seek the test that would make for the next simplest increment of code. Seek patterns–if you’ve figured out the pattern for 1, 2, 3, then what’s the next series of numbers that follows the same pattern?
  • When you refactor incrementally, make similar code look exactly alike, and then get rid of the duplication.
  • Do not prematurely optimize code. Avoid code-expressiveness “optimizations”: use while loops and simple decrement statements instead of for loops and post-decrement operators, and let them stand until your algorithm is complete.

After they complete the exercise, we discuss Uncle Bob’s Transformation Priority Premise (TPP), and I ask them to re-run the exercise for homework (and to get them used to the idea of a kata) with the TPP in mind.

The first time I attempted the kata some years ago, I got lost in the weeds. I didn’t give up; I scrapped a good amount of code, thought a bit harder about the ordering of tests, and eventually made my way to the finish line. I noted that at least a couple of insights were required, and came up with the guidelines above to help students stay on track.

“Let’s start at the beginning, a very good place to start:”

   @Test
   public void convertIntToRoman() {
      assertThat(convert(1), is("I"));
   }

The implementation is the topmost TPP transform, ({}->nil):

   String convert(int arabic) {
      return null;
   }

Quickly following, the passing implementation (nil->constant):

   String convert(int arabic) {
      return "I";
   }

Moving along to the next test:

   assertThat(convert(2), is("II"));

There might be a way to get a higher-level transform, but (unconditional->if) seems simplest to me:

   String convert(int arabic) {
      if (arabic == 2) return "II";
      return "I";
   }

The conversion for 3 indicates a repeating pattern:

   assertThat(convert(3), is("III"));

…which begs for a loop (if->while):

   String convert(int arabic) {
      String roman = "";
      while (arabic-- > 0)
         roman += "I";
      return roman;
   }

Performance geeks: never mind the use of String vs StringBuilder/StringBuffer. I want to get the algorithm working first, and all that rot clutters up the code a bit.

Hmm. I didn’t follow my guideline, and used post-decrement where I probably should have just introduced a separate statement to decrement the arabic variable. (Oh and another sin, I’m manipulating the argument. Ah well.)
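For comparison, here is a minimal sketch of what the loop might look like had I followed my own guideline: the decrement is a separate statement, and the argument is copied to a local rather than manipulated. The class name RepeatedOnes and the variable name remaining are mine, introduced only for illustration.

```java
class RepeatedOnes {
   // Same behavior as the post-decrement loop, but the decrement is its own
   // statement and the argument is copied instead of being modified.
   static String convert(int arabic) {
      String roman = "";
      int remaining = arabic; // hypothetical local; leaves the argument alone
      while (remaining > 0) {
         roman += "I";
         remaining -= 1;
      }
      return roman;
   }

   public static void main(String[] args) {
      System.out.println(convert(3)); // prints III
   }
}
```

The loop body is a line longer, but it already has the shape that the later duplication removal depends on.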

Now a bit of thought. One, two, three, ten! No, four sir! No, ten. I have support for I, II, III. If I seek a similar pattern, it’s likely to do the least to my code in terms of transforms required, so what if I head to X, XX, and XXX?

   assertThat(convert(10), is("X"));

I can get that to work with a simple (unconditional->if), but it’s horrid:

   String convert(int arabic) {
      if (arabic == 10) return "X";
      String roman = "";
      while (arabic-- > 0)
         roman += "I";
      return roman;
   }

That’s ok–adding a test for 20:

   assertThat(convert(20), is("XX"));

…starts to make things look better, as I introduce (if->while) again:

   String convert(int arabic) {
      String roman = "";
      while (arabic >= 10) {
         roman += "X";
         arabic -= 10;
      }
      while (arabic-- > 0)
         roman += "I";
      return roman;
   }

Those two while loops look an awful lot alike. I can make them look almost exactly alike–I replace my post-decrement (see, shouldn’t have introduced it yet) with a separate statement at the end of the loop body, and adjust my comparison expression:

   String convert(int arabic) {
      String roman = "";
      while (arabic >= 10) {
         roman += "X";
         arabic -= 10;
      }
      while (arabic >= 1) {
         roman += "I";
         arabic -= 1;
      }
      return roman;
   }

It’s clear that these while loops differ only by their data. A bit of insight suggests introducing a table lookup, allowing the implementation to be refactored to the following:

   static int[] ARABIC_DIGITS = { 10, 1 };
   static String[] ROMAN_DIGITS = { "X", "I" };

   String convert(int arabic) {
      String roman = "";
      for (int i = 0; i < ARABIC_DIGITS.length; i++) {
         while (arabic >= ARABIC_DIGITS[i]) {
            roman += ROMAN_DIGITS[i];
            arabic -= ARABIC_DIGITS[i];
         }
      }
      return roman;
   }

(In Java, I once tried building the above as a map, but since the algorithm demands traversing the digits in reverse order, it ends up being a good deal more code than the simple but slightly obnoxious paired arrays.)
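For the curious, here is one possible map-based sketch. It is not the version I tried back then; it assumes a LinkedHashMap populated from largest digit to smallest, so that iteration order matches the order the algorithm needs. The class name MapBasedConverter and the variable name remaining are made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MapBasedConverter {
   // LinkedHashMap iterates in insertion order, so the digits must be
   // added from largest to smallest for the algorithm to work.
   static final Map<Integer, String> DIGITS = new LinkedHashMap<>();
   static {
      DIGITS.put(10, "X");
      DIGITS.put(1, "I");
   }

   static String convert(int arabic) {
      String roman = "";
      int remaining = arabic;
      for (Map.Entry<Integer, String> digit : DIGITS.entrySet()) {
         while (remaining >= digit.getKey()) {
            roman += digit.getValue();
            remaining -= digit.getKey();
         }
      }
      return roman;
   }

   public static void main(String[] args) {
      System.out.println(convert(20)); // prints XX
   }
}
```

Even in this form, the static initializer and the entry-set plumbing add noise that the paired arrays avoid, and the ordering requirement becomes an easy-to-miss detail of how the map is filled.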

Huh. Look at that. A test for 30 should pass too, and it does.

But is it a bit too much implementation? Maybe–now that I support I’s and X’s, I want to be able to combine them, yet it looks like the following tests will pass right off the bat:

   assertThat(convert(11), is("XI"));
   assertThat(convert(33), is("XXXIII"));

And they do! I wonder if there’s a way I could have coded so those tests failed first, but I’ve not thought about it much. Maybe I could have put the second loop in an else block. Ideas?
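To make the else-block idea concrete, here is a rough sketch applied to the earlier two-loop version, before the table was introduced. It is only a thought experiment, and the names ElseBlockConverter and remaining are hypothetical. With the second loop in an else block, 10 and 20 still pass, but 11 comes back as "X", so the test for 11 would fail first and force the combination.

```java
class ElseBlockConverter {
   // Deliberately under-implemented: tens and ones are mutually exclusive,
   // so combined values such as 11 are not yet supported.
   static String convert(int arabic) {
      String roman = "";
      int remaining = arabic;
      if (remaining >= 10) {
         while (remaining >= 10) {
            roman += "X";
            remaining -= 10;
         }
      } else {
         while (remaining >= 1) {
            roman += "I";
            remaining -= 1;
         }
      }
      return roman;
   }

   public static void main(String[] args) {
      System.out.println(convert(11)); // prints X, not XI
   }
}
```

Whether that extra if counts as a simpler transform than going straight to two sequential loops is debatable; it adds a conditional that the next test immediately removes.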

What about 5?

   assertThat(convert(5), is("V"));

It’s just another digit; adding support for 5 to the table gets this test to pass:

   static int[] ARABIC_DIGITS = { 10, 5, 1 };
   static String[] ROMAN_DIGITS = { "X", "V", "I" };

And now these little combinations involving 5 also pass:

   assertThat(convert(8), is("VIII"));
   assertThat(convert(27), is("XXVII"));

I have to imagine similar tests involving L, C, D, and M should all pass too.
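A quick way to check that hunch is to extend the table with the remaining single-letter digits and try a few values that need no subtractive forms. This is only a sketch: the class name SingleLetterDigits, the variable name remaining, and the sample values are my own choices for illustration.

```java
class SingleLetterDigits {
   // Single-letter digits only; subtractive forms such as IV are not handled.
   static final int[] ARABIC_DIGITS = { 1000, 500, 100, 50, 10, 5, 1 };
   static final String[] ROMAN_DIGITS = { "M", "D", "C", "L", "X", "V", "I" };

   static String convert(int arabic) {
      String roman = "";
      int remaining = arabic;
      for (int i = 0; i < ARABIC_DIGITS.length; i++) {
         while (remaining >= ARABIC_DIGITS[i]) {
            roman += ROMAN_DIGITS[i];
            remaining -= ARABIC_DIGITS[i];
         }
      }
      return roman;
   }

   public static void main(String[] args) {
      System.out.println(convert(1666)); // prints MDCLXVI
   }
}
```

The algorithm itself is untouched; only the data grew.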

Instead, I’ll hit that pesky one, 4:

   assertThat(convert(4), is("IV"));

The joy of four is that if I try to get there with subtraction, I’m in a world of hurt. This is where students usually go really, really wrong. But, wait… insight! What if I consider that “IV” is just another digit, like any other Roman digit, except that it requires two characters instead of one to represent?

   static int[] ARABIC_DIGITS = { 10, 5, 4, 1 };
   static String[] ROMAN_DIGITS = { "X", "V", "IV", "I" };

Voila. At this point I’ve covered the various scenarios; now it’s just a matter of fleshing out the table with support for all of the Roman digits. How about I just write a couple of tests that fire on all cylinders?

   assertThat(convert(2499), is("MMCDXCIX"));
   assertThat(convert(3949), is("MMMCMXLIX"));

The final “table:”

   static int[] ARABIC_DIGITS =
      { 1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1 };
   static String[] ROMAN_DIGITS =
      { "M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I" };

We haven’t had to touch the algorithm for a while. It still holds up!

   String convert(int arabic) {
      String roman = "";
      for (int i = 0; i < ARABIC_DIGITS.length; i++) {
         while (arabic >= ARABIC_DIGITS[i]) {
            roman += ROMAN_DIGITS[i];
            arabic -= ARABIC_DIGITS[i];
         }
      }
      return roman;
   }
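With the algorithm settled, the String-versus-StringBuilder question I waved off earlier can be revisited. Here is one way the finished converter might look as a self-contained class, assuming a StringBuilder for accumulation and a local copy of the argument. The class name RomanConverter and the variable name remaining are mine, not part of the kata.

```java
class RomanConverter {
   static final int[] ARABIC_DIGITS =
      { 1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1 };
   static final String[] ROMAN_DIGITS =
      { "M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I" };

   // Walks the table from largest to smallest, appending each Roman digit
   // as many times as its Arabic value fits into what remains.
   static String convert(int arabic) {
      StringBuilder roman = new StringBuilder();
      int remaining = arabic;
      for (int i = 0; i < ARABIC_DIGITS.length; i++) {
         while (remaining >= ARABIC_DIGITS[i]) {
            roman.append(ROMAN_DIGITS[i]);
            remaining -= ARABIC_DIGITS[i];
         }
      }
      return roman.toString();
   }

   public static void main(String[] args) {
      System.out.println(convert(2499)); // prints MMCDXCIX
      System.out.println(convert(3949)); // prints MMMCMXLIX
   }
}
```

The structure is identical to the version above; the change is purely mechanical, which is the point of deferring it until the tests were all green.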

I did a search for code that solves this challenge, and had to laugh at the ugliness of a number of the implementations. There were some clever but convoluted ones, some awful ones, and a few that had the same implementation as the one here.

TDD can’t derive algorithms, so they say. I take a different view. An algorithm is simply a “step-by-step procedure for calculations,” per Wikipedia. And TDD is great at helping drive out the vast majority of step-by-step procedures.

Oh–they mean complex algorithms. I see. That’s ok: Most of what we must implement on a day-to-day basis can be done with reasonably simple algorithms. In fact, I’d warrant that far less than a percent of any given system requires anything but simple algorithms that any journeyman programmer can easily devise. (And I’ve seen some very “complex” systems, such as BaBar and the Sabre GDS.)

Complex algorithms do require leaps achieved often only by appropriate insights, sometimes gained only after extensive thought about the problem, and sometimes never reached by a given single mind. TDD will not necessarily help your mind achieve those insights. However, its incremental nature can in some cases give you enough pause and isolation on a small piece of the problem, to the point where the insight needed almost screams at you.

Unit Tests Are Still a Waste of Time

Per a recent Dr. Dobb’s article, the only reason to write unit tests is as a “convenience … to track down bugs faster.” If that’s the only value proposition, then I must agree with the author that writing unit tests is a “luxury” that only infrequently makes sense.

You’d be far better off investing the time in improving your integration tests or (much better) bolstering the tests that demonstrate customer acceptance criteria. They run much more slowly, but provide more value than writing unit tests after you code: They not only help prevent you from creating defects in the system, they can also be a key negotiation point, and they can act as living documentation on the behaviors your system supports.

Fortunately, writing tests first can create myriad other benefits that outweigh their cost: reduced system size, tests that improve developer understanding of system behaviors, more decoupled/cohesive designs, cleaner code, and the ability to change code safely and rapidly as it needs to be changed. Yes, there’s some churn along the way, but it’s usually mild (and milder as the quality of design improves). These benefits are why I’ve done TDD for a dozen years now, having witnessed some fantastic successes along the way.
