Blogging

I’m convinced that blogs are a very self-indulgent forum, a cheap way for people to get up on a soapbox and spout stuff that people wouldn’t pay to read. I’ve never claimed to not be self-indulgent, however, so don’t expect me to shut down this blog. (I did shut down my Agile Java forum because of all the spam jerks out there.)

Going forward, I’m going to try to move these blog entries more into the realm of the provocative and honest. I’ve been challenged with the notion that I’m a zealot, that I’m an arrogant, elitist agilist. Perhaps some of that’s true.

I’m certain that I’m a bit arrogant. But some of that is necessary in order to have the confidence to run a business on your own. I’m definitely not so arrogant that I would presume what I’m doing is the only right way to do things, and that other people are doing it “wrong.” I’m also pretty humble about my own abilities.

And I’m certain that I’m pretty enthusiastic about TDD. Why? To sell more consulting and/or training? I’d be lying if I said it has nothing to do with generating business.

But that’s not really why I promote TDD. I’ve done it for seven years now. Some of that was either short-term consulting or in a training setting. But I’ve also had longer stints during that time span (7 and 12 months) where I did hardcore TDD, and on sizeable systems (500,000-800,000 loc systems). And most of the consulting work I’ve done has been in helping developers work on typical, larger systems in Fortune 500 companies. I say this because I’ve seen people suggest I’ve never worked on a real project, anything above 30,000 loc. Whatever.

What this work has generally revealed is that most developers are clueless about quality code and good design. But it’s also revealed that these aren’t that difficult concepts–as long as the developer is willing to learn. It only takes a few days before things start clicking with most developers. I worked with one sharp, wonderful guy for about 4 days’ worth of pairing, doing TDD, and refactoring. I asked for a quote from him about his experience, one that I could take to his VP (and justify my cost). He thought for a minute and said, “I’ve learned more about software development in the past few days than I have in the past two years.”

I’ve also worked with many developers who just don’t care that much about what they do for a living. And I’ve worked with many developers who already know how to build perfect software. Not. There’s some adage about old dogs and new tricks. This industry seems to birth a lot of dogs who happen to already know everything.

Ultimately, I do this because I love developing software. I’ve also learned that my code always starts out sucking, that there’s always a better way to do it, and that I can get to that better way pretty easily–particularly if I’m doing TDD. I’ve also noted that just about everyone I’ve paired with demonstrates the same.

And, here’s my zealot out-of-touch agilist statement: I love doing TDD, because I’ve gotten it to work, for me and for many developers I’ve had the pleasure of doing business with. If you don’t like it, go back to the same old way you’ve always been slogging through code. I don’t give a hoot. And don’t forget to write a self-reassuring blog about how TDD can’t possibly work as well as I’ve touted it.

Open Note to Google

I thank Google one final time for having me speak last Tuesday (June 6). The talk I gave was entitled “Speeding up Development With TDD.” The abstract suggested I was giving a brief live coding demo, which I did. It was intended as an introductory talk, and I’m not sure how more might have been expected given the abstract. Nonetheless, I apologize for the miscommunication.

Given a 60 minute time frame, choosing an example that works for demonstrating technique means choosing a simple class. Unfortunately, choosing a simple example suggests that perhaps this TDD thing is limited to only academic examples. I did try to explain this, using a few anecdotes about real companies getting real results from the practice.

I have learned from my public drubbing, given to me by a Google employee in a blog post. I’m always very enthusiastic about TDD, based on real results I’ve seen, but I still do try to temper the message (which is why I included a slide that talks about “real costs” of doing TDD). Obviously I need to provide more disclaimers in the future.

I agree with the message that the TDD community needs to work harder to provide more “real world” examples. Thank you for suggesting it.

I do have a few messages for Google.

  • Tests == specs. Ok, I’ll concede that this is silly as a literal comparison. I presented this notion as one of my “mindsets,” something that helps me think in terms of some of the goals I should be able to get out of them. The word “mindset” was on the title of the slide I presented.
  • If it’s not testable, it’s useless. Once again, this was on the slide called “mindsets.” It’s a direction to head in. It’s intended to get you to think in the opposite direction, instead of making excuses why not to test. I’m sorry, I can’t back down much on this statement. The fact that Google can ship code that just happens to work suggests that maybe they’re better programmers than the rest of us. But this notion that it’s ok to design most code in a way that it can’t be tested is ludicrous. Note that I never said that every test had to be a unit test, or even automated. But the more of them that are, the better. Most code can be unit tested if designed well. Most.
  • Your code is just like everyone else’s code. I’ll take the Google challenge, and bet that I could take almost any Google system (except for the few where they’re doing TDD well), and pare it down to 2/3 or less of the original amount of code, while at the same time dramatically improving its maintainability.
  • Every company thinks their systems are uniquely complex. I’m willing to bet, however, that the bulk of the requirements for Google systems are probably no more complex than systems anywhere else. I heard the excuse at Google that “some things are just too hard to test.” Every time I hear similar excuses, I’ve seen poorly designed systems. Certainly, there will be small amounts of my system for which the cost of coding unit tests outweighs the benefits. But if any significant portion of my system is too hard to test, then I’m doing a poor job of managing complexity using basic OO design concepts.
  • Someone in the audience suggested that popping off a stack was a good way to test creation of a stack. This is a perfect example of a complete misunderstanding of what TDD is about. I said (it’s on the tape) that coding such a test into “testCreate” was inappropriate, that it represents different behavior and should be placed in another test. I probably wasn’t clear enough on this point. I never dismissed the thought in order to maintain a “script.”
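
To make the stack point concrete, here’s a rough sketch of the separation I had in mind. It is not the code from the talk: SimpleStack, SimpleStackTest, and the check helper are hypothetical names, and the tests are plain static methods so that the sketch runs without JUnit on the classpath. In a JUnit suite, each one would be an ordinary test method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stack, just enough to hang the tests on.
class SimpleStack {
   private final List<Integer> items = new ArrayList<>();

   boolean isEmpty() { return items.isEmpty(); }
   int size() { return items.size(); }
   void push(int value) { items.add(value); }

   int pop() {
      if (items.isEmpty())
         throw new IllegalStateException("stack is empty");
      return items.remove(items.size() - 1);
   }
}

class SimpleStackTest {
   // Creation test: says only what a brand-new stack looks like.
   static void testCreate() {
      SimpleStack stack = new SimpleStack();
      check(stack.isEmpty(), "new stack is empty");
      check(stack.size() == 0, "new stack has size 0");
   }

   // Popping is different behavior, so it lives in its own test.
   static void testPopReturnsMostRecentPush() {
      SimpleStack stack = new SimpleStack();
      stack.push(1);
      stack.push(2);
      check(stack.pop() == 2, "pop returns the last value pushed");
      check(stack.size() == 1, "pop removes the value");
   }

   static void testPopOnEmptyStackFails() {
      try {
         new SimpleStack().pop();
         throw new AssertionError("expected pop on empty stack to fail");
      }
      catch (IllegalStateException expected) {
      }
   }

   // Stand-in for JUnit's assertTrue.
   static void check(boolean condition, String message) {
      if (!condition)
         throw new AssertionError(message);
   }
}
```

Reading only the test names, you learn three separate facts about the stack. Folding the pop into testCreate would hide two of them.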

I learned a few things from this opportunity. I can only hope Google did too.

Comments:

Maybe it would be worth picking up a different example that is more complex and easier to get wrong.

Build something more complicated like a BTree based upon groups of integers or something, storing integers only, and interior pointers as integers and show how it can be tested? I think it’s easier to make mistakes with a BTree or a similar design than a Stack.

As well as this, it allows you to show how to structure code into the pure (more testable) and impure parts and show how this aids testing.

Hi,

I think the difficulty in TDD exercises tends to be that the exercises themselves need to be done in a quick amount of time in order to keep things going with a class. IMHO it just requires a little abstract thinking to translate those techniques and ideas to “the real world”.

Well, see you at Carfax next week. 😉

– James

Are Tests Specs?

I’ve presented TDD concepts many times to many different audiences. Usually I can answer questions about coding, but sometimes I’m thinking about something else. I tend to think better in front of a keyboard and some code, and sometimes I can’t visualize what an audience member is asking. That happened to me last week, when an audience member said you couldn’t effectively test blah-blah-blah and have that test work as a “readable specification,” because blah-blah-blah.

At the time, the best I could think of to say is that sometimes there are things that just aren’t effective to test in TDD. That’s true. Some algorithms take way too long to execute with any set of data. Sometimes there are concepts that just don’t lend themselves well to being coded as “specs by example.” But these examples are rare.

I can always write unit tests that are named well and read well from start to finish. By reading both the test name and statements that comprise it, for a set of tests, I have a comprehensive understanding of what the class under test is capable of doing. Further, I have examples that show me how to work with the class under test. And if all its unit tests are passing (which they should pretty much always do), I know that the example code will actually work.

In retrospect, I figured out what the audience member was asking. It was as simple as exponentiation (somehow I heard it as something more complex at the time–my failure). The argument was that it would just be simpler to write a single comment that says, “this method raises x to the power y.”

That’s a deceptive example. The one-line comment isn’t a specification, it’s a summary description. If you didn’t already know what “raising x to the power y” really meant, that comment would be useless to you. But we all think we know what exponentiation is about. So using exponentiation as an example sounds like a good argument against using tests to express specifications. Seemingly, it’s simpler to just provide a short comment.

In fact, I doubt most people could recite all the specifics required to completely document exponentiation. Here they are, from Sun’s own javadoc for the Math.pow function.

    Returns the value of the first argument raised to the power of the second argument. Special cases:

        * If the second argument is positive or negative zero, then the result is 1.0.
        * If the second argument is 1.0, then the result is the same as the first argument.
        * If the second argument is NaN, then the result is NaN.
        * If the first argument is NaN and the second argument is nonzero, then the result is NaN.
        * If
              o the absolute value of the first argument is greater than 1 and the second argument is positive infinity, or
              o the absolute value of the first argument is less than 1 and the second argument is negative infinity,
          then the result is positive infinity.
        * If
              o the absolute value of the first argument is greater than 1 and the second argument is negative infinity, or
              o the absolute value of the first argument is less than 1 and the second argument is positive infinity,
          then the result is positive zero.
        * If the absolute value of the first argument equals 1 and the second argument is infinite, then the result is NaN.
        * If
              o the first argument is positive zero and the second argument is greater than zero, or
              o the first argument is positive infinity and the second argument is less than zero,
          then the result is positive zero.
        * If
              o the first argument is positive zero and the second argument is less than zero, or
              o the first argument is positive infinity and the second argument is greater than zero,
          then the result is positive infinity.
        * If
              o the first argument is negative zero and the second argument is greater than zero but not a finite odd integer, or
              o the first argument is negative infinity and the second argument is less than zero but not a finite odd integer,
          then the result is positive zero.
        * If
              o the first argument is negative zero and the second argument is a positive finite odd integer, or
              o the first argument is negative infinity and the second argument is a negative finite odd integer,
          then the result is negative zero.
        * If
              o the first argument is negative zero and the second argument is less than zero but not a finite odd integer, or
              o the first argument is negative infinity and the second argument is greater than zero but not a finite odd integer,
          then the result is positive infinity.
        * If
              o the first argument is negative zero and the second argument is a negative finite odd integer, or
              o the first argument is negative infinity and the second argument is a positive finite odd integer,
          then the result is negative infinity.
        * If the first argument is finite and less than zero
              o if the second argument is a finite even integer, the result is equal to the result of raising the absolute value of the first argument to the power of the second argument
              o if the second argument is a finite odd integer, the result is equal to the negative of the result of raising the absolute value of the first argument to the power of the second argument
              o if the second argument is finite and not an integer, then the result is NaN.
        * If both arguments are integers, then the result is exactly equal to the mathematical result of raising the first argument to the power of the second argument if that result can in fact be represented exactly as a double value.

    (In the foregoing descriptions, a floating-point value is considered to be an integer if and only if it is finite and a fixed point of the method ceil or, equivalently, a fixed point of the method floor. A value is a fixed point of a one-argument method if and only if the result of applying the method to the value is equal to the value.)

    A result must be within 1 ulp of the correctly rounded result. Results must be semi-monotonic.

    Parameters:
        a - the base.
        b - the exponent.
    Returns:
        the value a^b.

All that blather, and it’s still a poor specification! Why? Because it doesn’t define what it means to “raise an argument to the power of a second argument.” You have to already know what that means. It’s like defining a word by using that word itself in the definition.

In most code we write, we’re not encapsulating a simple math API call or a single call to some already known quantity. We’re building new classes and methods that each do very different, very unique things that we probably can’t guess from a glib one-line description. The javadoc for Math.pow should really say something like: “returns the identity element, 1, multiplied by the base, as many times as indicated by the exponent.” That’s a mathematically correct definition (not counting the exceptional cases).

So I took a few minutes and built a test class that I think acts as a readable specification for how exponentiation works. I chose to support exponentiation for integers, not floating point numbers. As such, I chose to also omit support for negative exponents. Otherwise the return value would need to be either a float or a fractional abstraction. I didn’t feel like dealing with that–yet. (Want to see it? Let me know.)

Here are the tests:

package util;

import junit.framework.*;

public class MathUtilTest extends TestCase {
   static final int LARGE_NUMBER = 10000000;

   public void testSquares() {
      for (int i = 1; i < 10; i++)
         assertEquals(i + " squared:", i * i, MathUtil.power(i, 2));
   }

   public void testCubes() {
      for (int i = 1; i < 10; i++)
         assertEquals(i + " cubed:", i * i * i, MathUtil.power(i, 3));
   }

   public void testLargerExponents() {
      assertEquals(16, MathUtil.power(2, 4));
      assertEquals(256, MathUtil.power(2, 8));
      assertEquals(65536, MathUtil.power(2, 16));
   }

   public void testNegativeBases() {
      assertEquals(-2, MathUtil.power(-2, 1));
      assertEquals(4, MathUtil.power(-2, 2));
      assertEquals(-8, MathUtil.power(-2, 3));
      assertEquals(16, MathUtil.power(-2, 4));
   }

   public void testAnythingRaisedToZeroIsAlwaysOne() {
      assertEquals(1, MathUtil.power(-2, 0));
      assertEquals(1, MathUtil.power(-1, 0));
      assertEquals(1, MathUtil.power(0, 0));
      assertEquals(1, MathUtil.power(1, 0));
      assertEquals(1, MathUtil.power(2, 0));
      assertEquals(1, MathUtil.power(LARGE_NUMBER, 0));
   }

   public void testZeroRaisedToAnyPositiveIsAlwaysZero() {
      assertEquals(0, MathUtil.power(0, 1));
      assertEquals(0, MathUtil.power(0, 2));
      assertEquals(0, MathUtil.power(0, LARGE_NUMBER));
   }

   public void testOneRaisedToAnythingIsAlwaysOne() {
      assertEquals(1, MathUtil.power(1, 1));
      assertEquals(1, MathUtil.power(1, 2));
      assertEquals(1, MathUtil.power(1, LARGE_NUMBER));
   }

   public void testNegativeZeroExponentIsOne() {
      assertEquals(1, MathUtil.power(1, -0));
   }

   public void testNegativeExponentsUnsupported() {
      try {
         MathUtil.power(1, -1);
         fail("should not be supported");
      }
      catch (UnsupportedOperationException expected) {

      }
   }

   public void testOverflow() {
      try {
         MathUtil.power(3, 100);
         fail("expected overflow");
      }
      catch (IntegerOverflowException expected) {
      }
   }
}

(The class IntegerOverflowException is an empty subclass of RuntimeException.)

Now that I’ve presented the tests, here’s the code for the resulting power function:

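package util;

public class MathUtil {
   // Raises base to a non-negative integer exponent by repeated multiplication.
   public static int power(int base, int exponent) {
      if (exponent < 0)
         throw new UnsupportedOperationException();
      if (exponent == 0)
         return 1;
      // accumulate in a long so that leaving the int range is detectable
      long result = 1;
      for (int i = 0; i < exponent; i++) {
         result *= base;
         // negative bases can overflow in either direction
         if (result > Integer.MAX_VALUE || result < Integer.MIN_VALUE)
            throw new IntegerOverflowException();
      }
      return (int)result;
   }
}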

I built the production code incrementally, in accordance with each new bit of unit test code I wrote.

The tests certainly are not exhaustive. They are good enough (a) to give me confidence that the code works, and (b) to describe what exponentiation is all about. I’m sure there’s room for improvement in these tests, in how they read and in their completeness.
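
One gap worth naming: testOverflow only pushes past the top of the int range, and a negative base can also overflow downward. As a sketch of one way to cover both directions, assuming Java 8 or later, the loop can lean on Math.multiplyExact, which throws ArithmeticException when an int product doesn’t fit. The class name ExactPower is hypothetical, and this variant reports overflow with ArithmeticException rather than IntegerOverflowException.

```java
class ExactPower {
   // Integer exponentiation that refuses to return a wrapped-around value.
   static int power(int base, int exponent) {
      if (exponent < 0)
         throw new UnsupportedOperationException("negative exponents unsupported");
      int result = 1;
      for (int i = 0; i < exponent; i++)
         result = Math.multiplyExact(result, base); // throws ArithmeticException on overflow
      return result;
   }
}
```

A test for the downward case could use power(-8, 11): the true result is -2^33, which is below Integer.MIN_VALUE, so the call should fail rather than return a value.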

Still, these tests have a clear advantage over javadoc: they don’t require the reader to interpret a lot of English gobbledygook. Simple code examples speak far larger volumes about what I really want to know. The examples that these tests present tell me how to use the power function, and about the results it produces. That’s most of what I need. And for most of the real programmers I know, that’s what they would prefer.

Having said all that, more frequently I encounter tests that don’t do such a good job of explaining themselves. They contain lots of magic numbers, their names don’t tell me what’s important, the tests contain complex logic, they run on for several screens, and so on. Ultimately I have to do a lot of reading between the lines in order to figure out what the test is all about.
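
Here’s a small sketch of the difference, again as plain static methods so it runs without JUnit. The class ReadabilityExample, its stand-in power function, and the check helper are all made up for illustration. Both styles pass; only one of them tells the reader anything.

```java
class ReadabilityExample {
   // Hard to read: the name says nothing and the numbers are unexplained.
   static void test1() {
      check(power(7, 0) == 1);
      check(power(0, 5) == 0);
      check(power(-2, 3) == -8);
   }

   // Easier to read: one behavior per test, stated in the name.
   static void testAnythingRaisedToZeroIsOne() {
      int anyBase = 7;
      check(power(anyBase, 0) == 1);
   }

   static void testZeroRaisedToAPositiveExponentIsZero() {
      int anyPositiveExponent = 5;
      check(power(0, anyPositiveExponent) == 0);
   }

   static void testOddPowerOfNegativeBaseIsNegative() {
      int oddExponent = 3;
      check(power(-2, oddExponent) == -8);
   }

   // Minimal stand-in for the function under test (no overflow handling).
   static int power(int base, int exponent) {
      int result = 1;
      for (int i = 0; i < exponent; i++)
         result *= base;
      return result;
   }

   // Stand-in for JUnit's assertTrue.
   static void check(boolean condition) {
      if (!condition)
         throw new AssertionError();
   }
}
```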

Are tests specs? Yes, they can be, although in a small number of cases it’s probably better to just write a brief English specification. But I’m doing test-driven development anyway, for other benefits that include fewer defects and better design. If I’m going to expend that effort, why shouldn’t I also strive to make the tests as useful as they can be?

I often imagine that someone offers me a choice between two systems. The first system has profuse comments, lots of paper specifications, and exhaustive UML models. But it has no tests, or maybe some poorly written tests. This first system is typical of most of the systems out there (except that most systems don’t even have good models).

The second system I can choose from has no such documentation. It contains no source code comments, no paper specifications, and no UML models. But it does have comprehensive, well-written unit tests (it was probably built using TDD). I’ll take the latter any day, with glee.

 

Comments:

I once listened to a lecture from a famous late mathematician. He introduced the “proof” that he was going to show to us with the words

“I will show you that the result holds for 3. Then you’ll see that it holds for all choices of 3”

This was funny. It was also deep, in that a fully formal proof with “n” in place of 3 would have been more correct, but less clear. And compelling belief in the truth of the theorem is what a proof is all about.

So this is what I would have answered in your place: “I’ll test for 0, then for 1, 2, then for 3, and add a comment saying that it is expected to work for all n > 3,” or something like this. TDD is not about “trying to break” production code. It’s (also) about communicating, and understanding. When I see that the production code does not depend on “3” being the input, I’ll be confident that 4 and 4000 also work.

 

Excellent post! I have started doing this wholeheartedly by citing actual test code into the documentation. This gives readers both a narrative (where I can give background on and justifications for the API’s design), and hard proof that what I am saying is what the system actually does. And I like the fact that writing both the docs and these use-case tests first forces me to really think things through from the user’s point of view.

 

Another blog poster suggested these tests were just as long as the written spec. Maybe they are. That’s missing the point. The idea is that it’s possible to use examples to demonstrate the spec–not that the test code could somehow magically compress the amount of specification detail. And yes, it’s not truly specification, but it is far easier for developers to figure out.

 

>>Math.pow should really say something like: “returns the identity element, 1, multiplied by the base, as many times as indicated by the exponent.”

That’s actually not correct. E.g. pow(2, 2.5) is not an exceptional case, and you can’t multiply something two and a half times.

Regards

 

that’s why it says “something like”
