C++11: Using Lambdas to Support a Times-Repeat Loop

More often than not, explicit for loops that execute n times need access to the loop index. Sometimes they don’t, in which case the for-loop structure seems mildly tedious. It’s certainly idiomatic, and you’ve seen ’em all a million times:

for (auto i = 0; i < 5; i++) {
   // code to repeat here
}

Still, having seen them all a million times, you’ve encountered many variations, either accidental or purposeful:

for (auto i = 1; i < 5; i++) {
   // code to repeat here
}
for (auto i = 0; i <= 5; i++) {
   // code to repeat here
}
for (auto i = 5; i >= 0; i++) {
   // code to repeat here
}
for (auto i = 0; i < 5; i += 2) {
   // code to repeat here
}

The result: A for loop is something you must always pause to inspect. It’s too easy for your brain to miss one of these variants.

I’d much rather be able to simply send a timesRepeat message to an integer, as I can do in Smalltalk (Ruby has a similar message, times):

   5 timesRepeat: [ "code to repeat here" ]

In C++, overloading the multiplication operator would allow me to code something like:

   5 * [] { /* code to repeat here */ }

(The [] is the capture clause, which indicates that the following expression is a lambda, and also specifies how variables might be captured. In this case, the empty capture clause [] indicates that no variables are captured.)

A lambda is implicitly convertible to the function type (std::function), so I can support the times-repeat expression with the following:

void operator*(int number, function<void (void)> func) {
   for (int i = 0; i < number; i++) func();
}

In other words, simply translate the lambda into the corresponding for loop, iterating number times, and calling func() each time, where number represents the receiver (5 in the example usage above).

A simple test demonstrates that the construct works:

unsigned int totalGlobal;

TEST(Lambdas, CanSupportASimplifiedTimesRepeatLoop) {
   5 * [] { totalGlobal++; };

   ASSERT_THAT(totalGlobal, Eq(5));
}

(Use of the global variable totalGlobal allows writing the lambda expression without requiring variable capture.)

The resulting expression is succinct and, even better than idiomatic, obvious. I don’t think I’d go as far as to say that you should replace all for loops, just the ones where access to the index is not required.

You could replace them all, of course. Here’s the result, with a couple of small tweaks to the first solution.

void operator*(int number, function<void (int)> func) {
   for (int i = 0; i < number; i++) func(i);
}

TEST(Lambdas, CanUnnecessarilyReplaceAForLoop) {
   unsigned int total{0};

   4 * [&] (int i) { total += i; };

   ASSERT_THAT(total, Eq(0 + 1 + 2 + 3));
}

The operator overload now simply passes the loop index i to the function being executed.

In the test, I now choose to use a local variable to capture the total. The capture clause of [&] indicates that any variable accessed by the lambda function is captured by reference. The function now also includes a parameter list, specifying in this case the index parameter of i.
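
The difference between capture modes is easy to get wrong, so here is a small stand-alone sketch. It is not from the original tests, and the function names are made up for illustration. It contrasts [&], which works on the caller’s variable, with [=], which copies the value at the point where the lambda is created.

```cpp
// Hypothetical helper names; plain C++11, no test framework required.

// [&] captures by reference: the lambda updates the caller's variable.
int sumThroughReferenceCapture() {
    int total = 0;
    auto add = [&](int i) { total += i; };
    add(1);
    add(2);
    return total;  // the local was modified through the reference
}

// [=] captures by value: the lambda keeps a copy made when it was created.
int readThroughValueCapture() {
    int base = 10;
    auto plusOne = [=] { return base + 1; };
    base = 100;  // not visible to the lambda's copy
    return plusOne();
}
```

With by-value capture the lambda also cannot modify its copy unless it is declared mutable, which is one reason the accumulating test above needs [&].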

I doubt I’d promote use of the second form (lambda with indexer) in production code. The first (times repeat) seems succinct and appropriate.
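
If overloading operator* on int feels too clever for your codebase, the same idea fits in a named helper. This is only a sketch: timesRepeat is a hypothetical name borrowed from the Smalltalk message, and the template parameter is my assumption, chosen to avoid committing to function as the argument type.

```cpp
// Hypothetical named alternative to the operator* overload.
// Accepts any callable taking no arguments (lambda, function pointer, functor).
template <typename Func>
void timesRepeat(int number, Func func) {
    for (int i = 0; i < number; i++) func();
}

// Example use: counts how many times the lambda ran.
int countInvocations(int number) {
    int calls = 0;
    timesRepeat(number, [&] { calls++; });
    return calls;
}
```

The call site reads timesRepeat(5, [] { /* code to repeat here */ }); which gives up a little brevity in exchange for a name a reader can search for.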

C++ Via TFL: The Range-Based For Loop

C++ catches up with constructs common in Java and C# with the new range-based for loop.


A simple example should suffice to demonstrate its common use.

TEST(ARangeBasedForLoop, CanIterateOverAVector) {
   vector<int> collectedNumbers;
   vector<int> numbers{1, 2, 3};

   for (int each: numbers)
      collectedNumbers.push_back(each);

   ASSERT_THAT(collectedNumbers, Eq(vector<int>{1, 2, 3}));
}

It’s always nice to know the appropriate names for the syntactical parts of things. Here’s an overly clever test that shows what things are called in the range-based for loop.

TEST(ARangeBasedForLoop, HasAppropriateNamesForItsSyntacticalParts) {
   typedef int type_specifier_seq;
   vector<int> expression{1, 2, 3};
   vector<int> collectedNumbers;
   auto statement = [&] (int each) -> void { collectedNumbers.push_back(each); };

   for (type_specifier_seq simple_declarator: expression) statement(simple_declarator);

   ASSERT_THAT(collectedNumbers, Eq(vector<int>{1, 2, 3}));
}

The range-based for loop supports iterating over a number of common things. Here are a few.

TEST(ARangeBasedForLoop, CanIterateOverAnArray) {
   vector<int> collectedNumbers;
   int numbers[]{1, 2, 3};

   for (int each: numbers)
      collectedNumbers.push_back(each);

   ASSERT_THAT(collectedNumbers, Eq(vector<int>{1, 2, 3}));
}

TEST(ARangeBasedForLoop, CanIterateOverAMap) {
   map<int, string> dictionary{{1, "uno"}, {2, "dos"}};
   map<int, string> collectedDictionary;

   for (pair<int, string> pair: dictionary)
      collectedDictionary[pair.first] = pair.second;

   ASSERT_THAT(collectedDictionary, Eq(map<int, string>{{1, "uno"}, {2, "dos"}}));
}

TEST(ARangeBasedForLoop, CanIterateOverAString) {
   vector<char> collectedChars;

   for (char each: "abc")
      collectedChars.push_back(each);

   // the string literal's terminating null character is part of the range
   ASSERT_THAT(collectedChars, Eq(vector<char>{ 'a', 'b', 'c', 0 }));
}

I’ve shown explicit type information in the prior examples. You should usually prefer auto instead.

TEST(ARangeBasedForLoop, ShouldPreferAnInferencingTypeSpecifier) {
   vector<int> collectedNumbers;

   for (auto each: {1, 2, 3})
      collectedNumbers.push_back(each);

   ASSERT_THAT(collectedNumbers, Eq(vector<int>{ 1, 2, 3}));
}

If you want to update each element as you iterate…

TEST(ARangeBasedForLoop, CanAccessEachElementByReference) {
   vector<string> strings{"a", "b"};

   for (auto& each: strings)
      each += each;

   ASSERT_THAT(strings, Eq(vector<string>{ "aa", "bb" }));
}

Or, if you want to avoid copying each element but also want to disallow modifying it:

TEST(ARangeBasedForLoop, CanAccessEachElementByConstReference) {
   vector<string> strings{"a", "b"};
   vector<string> collectedStrings;

   for (const auto& each: strings)
      // each += each; // fails compilation
      collectedStrings.push_back(each);

   ASSERT_THAT(strings, Eq(vector<string>{ "a", "b" }));
}
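
One way to check the copying claim for yourself is with a type that counts its own copies. The sketch below is not part of the original tests; Tracked and both function names are hypothetical.

```cpp
#include <vector>

// Hypothetical element type that counts how often it is copy-constructed.
struct Tracked {
    static int copies;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
};
int Tracked::copies = 0;

// Iterating with "auto each" copies every element into the loop variable.
int copiesWhenIteratingByValue() {
    std::vector<Tracked> items(3);
    Tracked::copies = 0;  // ignore anything done while building the vector
    for (auto each : items) { (void)each; }
    return Tracked::copies;
}

// Iterating with "const auto& each" binds a reference and copies nothing.
int copiesWhenIteratingByConstReference() {
    std::vector<Tracked> items(3);
    Tracked::copies = 0;
    for (const auto& each : items) { (void)each; }
    return Tracked::copies;
}
```

For small types such as int the copy is harmless; the difference starts to matter for strings, containers, and other types that allocate.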

The range-based for loop can iterate over anything that implements begin() and end(), each returning an iterator object (something that implements operator!=, operator*, and operator++). Here is a simple class that supports iteration across a range of numbers (also supporting the ability to skip elements).

class IntSequence {
public:
   IntSequence(int start, int stop, int by=1): start_{start}, stop_{stop}, by_{by} {}

   class iterator {
   public:
      iterator(int value, int by) : current_(value), by_(by) {}

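      // Deliberately not a true inequality test: stepping by more than one
      // can jump past stop, so "not done" means "still at or before stop".
      bool operator!=(const iterator& rhs) const {
         return current_ <= rhs.current_;
      }

      int& operator*() { return current_; }

      // Pre-increment returns a reference to this iterator, not a copy.
      iterator& operator++() {
         current_ += by_;
         return *this;
      }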
   private:
      int current_;
      int by_;
   };
   iterator begin() { return iterator(start_, by_); }
   iterator end() { return iterator(stop_, by_); }

private:
   int start_;
   int stop_;
   int by_;
};

And here’s an example demonstrating use:

TEST(ARangeBasedForLoop, CanIterateOverAnythingImplementingBeginAndEnd) {
   vector<int> collectedNumbers;
   int start{3};
   int stop{10};
   int by{2};
   IntSequence sequence(start, stop, by);

   for (auto each: sequence)
      collectedNumbers.push_back(each);

   ASSERT_THAT(collectedNumbers, Eq(vector<int>{3, 5, 7, 9}));
}

(This example could be construed as contrived: Why not simply use a regular ol’ for loop? Most of the time, that idiom is simple and probably preferred, but you might find value in the ability to pass around a sequence concept to other functions, or to serialize it, or otherwise use it where having an object abstraction might simplify code.)
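
It can help to see roughly what the compiler does with a range-based for loop, since that explains why only operator!=, operator*, and operator++ are required. The sketch below is an approximation of the expansion rather than the standard’s exact wording, and HalfOpenRange is a hypothetical type written just for this illustration.

```cpp
#include <vector>

// Hypothetical minimal range: yields first, first + 1, ..., last - 1.
struct HalfOpenRange {
    struct iterator {
        int value;
        bool operator!=(const iterator& rhs) const { return value != rhs.value; }
        int operator*() const { return value; }
        iterator& operator++() { ++value; return *this; }
    };

    int first;
    int last;

    iterator begin() const { return iterator{first}; }
    iterator end() const { return iterator{last}; }
};

// The range-based form.
std::vector<int> collectWithRangeFor(const HalfOpenRange& range) {
    std::vector<int> collected;
    for (auto each : range)
        collected.push_back(each);
    return collected;
}

// Approximately what the loop above expands to.
std::vector<int> collectByHand(const HalfOpenRange& range) {
    std::vector<int> collected;
    for (auto it = range.begin(), stop = range.end(); it != stop; ++it) {
        auto each = *it;
        collected.push_back(each);
    }
    return collected;
}
```

Both functions produce the same sequence, and the hand-written version uses the iterator only through the three operators named above.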

Note: Code built using gcc 4.7.2 under Ubuntu.

C++11: Sum Across a Collection of Objects Using a Lambda or a Range-Based For Loop

I’m always disappointed when my Google search doesn’t turn up useful results on the first page. Often, an answer to a C++ question takes me to a StackOverflow page, or to a cplusplus page, where I usually find what I want. This time I didn’t find what I wanted on the first page (the StackOverflow link wasn’t quite it), hence this blog post.



I’ve just started a series on test-focused learning (TFL) for the new C++11 features. I’m jumping the gun a little here since I’ve not yet covered lambdas (or auto, or the range-based for, or rvalue references), but I wanted searchers to have (what I think is) a better answer on the first page.

The story: Iterate a vector of objects, adding into a sum by dereferencing a member on each object, then return the sum from a function.

Here’s the declaration for the Item class:

class Item {
   public:
      Item(int cost) : cost_{cost} {}
      int Cost() { return cost_; }
   private:
      int cost_;
};

Here’s the assertion (I coded this test-first, of course):

ASSERT_THAT(
   TotalCost({Item(5), Item(10), Item(15)}), 
   Eq(5 + 10 + 15));

You can get this test to pass in three statements using the range-based for loop.

int TotalCost(vector<Item>&& items) {
   int total{0};
   for (auto item: items) total += item.Cost();
   return total;
}

That’s clean and simple to understand; the syntax barely needs explanation.

Here’s the implementation using accumulate and a lambda.

int TotalCost(vector<Item>&& items) {
   return accumulate(items.begin(), items.end(), 0, 
     [] (int total, Item item) { return total + item.Cost(); });
}

That’s it. If you’re not familiar with accumulate (I wasn’t, hence my Google search), it takes a range, an initial value, and a function. If you’re not familiar with using lambdas to declare functions:

  • [] declares that the lambda requires no capture of other variables
  • between () is the parameter list for the arguments that accumulate passes to the function
  • between the {} is the function’s implementation
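
For anyone who wants to try the accumulate version outside a test harness, here is a self-contained sketch with the includes it needs. It makes two assumptions that differ from the code above: Cost() is declared const, and TotalCost takes its argument by const reference so that the lambda can take each Item by const reference too.

```cpp
#include <numeric>
#include <vector>

// Stand-in for the Item class above, with a const accessor (an assumption)
// so that it can be called through a const reference.
class Item {
public:
    explicit Item(int cost) : cost_{cost} {}
    int Cost() const { return cost_; }
private:
    int cost_;
};

// Fold the costs into a running total, starting from 0.
// accumulate calls the lambda once per element as lambda(totalSoFar, element).
int TotalCost(const std::vector<Item>& items) {
    return std::accumulate(items.begin(), items.end(), 0,
        [](int total, const Item& item) { return total + item.Cost(); });
}
```

The type of the initial value (here the int 0) is also the type of the running total, which is worth remembering when summing doubles.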

Which do you prefer (or neither), and why?

C++11 Via TFL (Test-Focused Learning): Uniform Initialization

I’ve been working with the new features of C++11 for many months as I write a book on TDD in C++. The updates to the language make for a far-more satisfying experience, particularly in terms of helping me write clean code and tests.

I haven’t written any Java code for over a half-year, and I don’t miss it one bit (I do miss the IDEs a bit, although I’m enjoying working in vim again, particularly with a few new tips picked up from Use Vim Like a Pro and Practical Vim). New language features such as lambdas and type inferencing represent a leap-frogging that shines a little shame on Oracle’s efforts.


Over a series of upcoming blog entries, I will be demonstrating many of the new language features in C++ via test-focused learning (TFL). This entry: uniform initialization, the new scheme for universally-consistent initialization that also simplifies the effort to initialize collections, arrays, and POD types.

One goal of this blog series is to see how well the tests can communicate for themselves. Prepare for a lot of test code (presume they all pass, unless otherwise noted) and little blather. Please feel free to critique by posting comments; there’s always room for improvement around the clarity of tests (particularly regarding naming strategy). Note that TFL and TDD have slightly different goals; accordingly, I’ve relaxed some of the TDD conventions I might otherwise follow (such as one assert per test).

The Basics

Note: Your compiler may not be fully C++11-compliant. The examples shown here were built and tested with gcc 4.7.2 under Ubuntu. The unit testing tool is Google Mock (which supports the Hamcrest-like matchers used here, and includes Google Test).

TEST(BraceInitialization, SupportsNumericTypes) {
   int x{42};
   ASSERT_THAT(x, Eq(42));

   double y{12.2};
   ASSERT_THAT(y, DoubleEq(12.2));
}

TEST(BraceInitialization, SupportsStrings) {
   string s{"Jeff"};
   ASSERT_THAT(s, Eq("Jeff"));
}

TEST(BraceInitialization, SupportsCollectionTypes) {
   vector<string> names {"alpha", "beta", "gamma" };
   ASSERT_THAT(names, ElementsAre("alpha", "beta", "gamma"));
}

TEST(BraceInitialization, SupportsArrays) {
   int xs[] {1, 1, 2, 3, 5, 8};
   ASSERT_THAT(xs, ElementsAre(1, 1, 2, 3, 5, 8));
}

Those tests are simple enough. Maps are supported too:

TEST(BraceInitialization, SupportsMaps) {
   map<string,unsigned int> heights {
      {"Jeff", 176}, {"Mark", 185}
   };

   ASSERT_THAT(heights["Jeff"], Eq(176));
   ASSERT_THAT(heights["Mark"], Eq(185));
}

Explicit initialization of collections isn’t nearly as prevalent in production code as it is in tests. I’m tackling uniform initialization first because I’m so much happier with my resulting tests. The ability to create an initialized collection in a single line is far more expressive than the cluttered, old-school way.

TEST(OldSchoolCollectionInitialization, SignificantlyCluttersTests) {
   vector<string> names;

   names.push_back("alpha");
   names.push_back("beta");
   names.push_back("gamma");

   ASSERT_THAT(names, ElementsAre("alpha", "beta", "gamma"));
}

No Redundant Type Specification!

Uniform initialization eliminates the need to redundantly specify type information when you need to pass lists.

TEST(BraceInitialization, CanBeUsedOnConstructionInitializationList) {
   struct ReportCard {
      string grades[5];
      ReportCard() : grades{"A", "B", "C", "D", "F"} {}
   } card;

   ASSERT_THAT(card.grades, ElementsAre("A", "B", "C", "D", "F"));
}

TEST(BraceInitialization, CanBeUsedForReturnValues) {
   struct ReportCard {
      vector<string> gradesForAllClasses() {
         string science{"A"};
         string math{"B"};
         string english{"B"};
         string history{"A"};
         return {science, math, english, history};
      }
   } card;

   ASSERT_THAT(card.gradesForAllClasses(), ElementsAre("A", "B", "B", "A"));
}

TEST(BraceInitialization, CanBeUsedForArguments) {
   struct ReportCard {
      vector<string> subjects_;

      void addSubjects(vector<string> subjects) {
         subjects_ = subjects;
      }
   } card;

   card.addSubjects({"social studies", "art"});

   ASSERT_THAT(card.subjects_, ElementsAre("social studies", "art"));
}

Direct Class Member Initialization

Joyfully (it’s about time), C++ supports directly initializing at the member level:

TEST(BraceInitialization, CanBeUsedToDirectlyInitializeMemberVariables) {
   struct ReportCard {
      string grades[5] {"A", "B", "C", "D", "F"};
   } card;

   ASSERT_THAT(card.grades, ElementsAre("A", "B", "C", "D", "F"));
}

Class member initialization essentially translates to the corresponding mem-init. Be careful if you have both: the constructor’s mem-init wins, and the member-level initializer is ignored for that constructor.

TEST(MemInit, OverridesMemberVariableInitialization) {
   struct ReportCard {
      string schoolName{"Trailblazer Elementary"};
      ReportCard() : schoolName{"Chipeta Elementary"} {}
   } card;

   ASSERT_THAT(card.schoolName, Eq("Chipeta Elementary"));
}

Temporary Type Name

TEST(BraceInitialization, EliminatesNeedToSpecifyTempTypeName) {
   struct StudentScore {
      StudentScore(string name, int score) 
         : name_(name), score_(score) {}
      string name_;
      int score_;
   };
   struct ReportCard {
      vector<StudentScore> scores_;
      void AddStudentScore(StudentScore score) {
         scores_.push_back(score);
      }
   } card;

   // old school: card.AddStudentScore(StudentScore("Jane", 93));
   card.AddStudentScore({"Jane", 93}); 

   auto studentScore = card.scores_[0];
   ASSERT_THAT(studentScore.name_, Eq("Jane"));
   ASSERT_THAT(studentScore.score_, Eq(93));
}

Be careful that use of this feature does not diminish readability.

Defaults

TEST(BraceInitialization, WillDefaultUnspecifiedElements) {
   int x{};
   ASSERT_THAT(x, Eq(0));

   double y{};
   ASSERT_THAT(y, Eq(0.0));  

   bool z{};
   ASSERT_THAT(z, Eq(false));

   string s{};
   ASSERT_THAT(s, Eq(""));
}

TEST(BraceInitialization, WillDefaultUnspecifiedArrayElements) {
   int x[3]{};
   ASSERT_THAT(x, ElementsAre(0, 0, 0));

   int y[3]{100, 101};
   ASSERT_THAT(y, ElementsAre(100, 101, 0));
}

TEST(BraceInitialization, UsesDefaultConstructorToDeriveDefaultValue) {
   struct ReportCard {
      string school_;
      ReportCard() : school_("Trailblazer") {}
      ReportCard(string school) : school_(school) {}
   };

   ReportCard card{};

   ASSERT_THAT(card.school_, Eq("Trailblazer"));
}

Odds & Ends

TEST(BraceInitialization, CanIncludeEqualsSign) {
   int i = {99};
   ASSERT_THAT(i, Eq(99));
}

… but why bother?

It’s always nice when a new language feature makes it a little harder to make the dumb mistakes that we all tend to make from time to time (and sometimes, such dumb mistakes are the most devastating).

TEST(BraceInitialization, AvoidsNarrowingConversionProblem) {
   int badPi = 3.1415927;
   ASSERT_THAT(badPi, Eq(3));

   int pi{3.1415927}; // emits warning by default
//   ASSERT_THAT(pi, Eq(3.1415927));
}

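Compiling the AvoidsNarrowingConversionProblem test with gcc 4.7.2 results in the following warning (the C++11 standard treats narrowing inside braces as ill-formed, so other compilers may reject the code outright):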

warning: narrowing conversion of ‘3.1415926999999999e+0’ from ‘double’ to ‘int’ inside { } [-Wnarrowing]

Recommendation: use the gcc compiler switch:

-Werror=narrowing

…which will instead cause compilation to fail.
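
To make the distinction concrete without relying on compiler diagnostics, here is a small sketch that compiles cleanly. The function names are hypothetical. It shows the silent truncation that the old form allows, the explicit cast that braces require for the same conversion, and a widening conversion that braces accept as-is.

```cpp
// Hypothetical helper names; each function isolates one conversion.

// Copy-initialization with = silently truncates a double to an int.
int truncateWithEquals(double value) {
    int result = value;
    return result;
}

// With braces the same conversion is narrowing, so the intent must be
// spelled out with a cast before the braces will accept it.
int truncateWithBraces(double value) {
    int result{static_cast<int>(value)};
    return result;
}

// Widening (int to long) loses no information, so braces accept it directly.
long widenWithBraces(int value) {
    long result{value};
    return result;
}
```

Writing int result{value}; in the second function is exactly the case that triggers the narrowing diagnostic shown above.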

Use With Auto

TEST(BraceInitialization, IsProbablyNotWhatYouWantWhenUsingAuto) {
   auto x{9};
   ASSERT_THAT(x, A<const initializer_list<int>>());
   // in other words, the following assignment passes compilation. Thus x is *not* an int.
   const initializer_list<int> y = x;
}

Note that this is the behavior under C++11 as published and as implemented by gcc 4.7.2. A later change to the language (proposal N3922, adopted for C++17 and applied retroactively by newer compilers) makes auto x{9} deduce int instead, so expect this test to fail to build on current toolchains.

The Most Vexing Parse?

It’s C++. That means there are always tricky bits to avoid.

TEST(BraceInitialization, AvoidsTheMostVexingParse) {
   struct IsbnService {
      IsbnService() {}
      string address_{"http://example.com"};
   };

   struct Library {
      IsbnService service_;
      Library(IsbnService service) : service_{service} {}
      string Lookup(const string& isbn) { return "book name"; }
   };

   Library library(IsbnService()); // declares a function(!)
//   auto name = library.Lookup("123"); // does not compile

   Library libraryWithBraceInit{IsbnService()};
   auto name = libraryWithBraceInit.Lookup("123"); 

   ASSERT_THAT(name, Eq("book name"));
}

All the old forms of initialization in C++ will still work. Your best bet, though, is to take advantage of uniform initialization and use it at every opportunity. (I’m still habituating, so you’ll see occasional old-school initialization in my code.)

A Smackdown Tool for Overeager TDDers


I’ve always prefaced my first test-driven development (TDD) exercises by saying something like, “Make sure you write no more code than necessary to pass your test. Don’t put in data structures you don’t need, for example.” This pleading typically comes on the tail of a short demo where I’ve mentioned the word incremental numerous times.

But most people don’t listen well, and do instead what they’ve been habituated to do.

With students in shu mode, it’s ok for instructors to be emphatic and dogmatic, smacking students upside the head when they break the rules for an exercise. It’s impossible to properly learn TDD if you don’t follow the sufficiency rule, whether deliberately or not. Trouble is, it’s tough for me to smack the heads of a half-dozen pairs all at once, and some people tend to call in HR when you hit them.

The whole issue of incrementalism is such an important concept that I’ve introduced a new starting exercise to provide me with one more opportunity to push the idea. The natural tendency of students to jump to an end solution is one of the harder habits to break (and a frequent cause of students’ negative first reaction when they actually try TDD).

I present a meaty first example (latest: the Soundex algorithm) where all the tests are marked as ignored or disabled, a great idea I learned from James Grenning. In Java, the students are to un-@Ignore tests one-by-one, simply getting them to pass, until they’ve gotten all tests to pass. The few required instructions are in the test file, meaning they can be hitting this exercise about two minutes after class begins.

Problem is, students have a hard time not breaking rules, and always tend to implement too much. As I walk around, I catch them, but it’s often a little too late. Telling them that they need to scrap their code and back up isn’t what they want to hear.

So, I built a custom test-runner that will instead fail their tests if they code too much, acting as a virtual head-smacking Jeff. (I built a similar tool for C++ that I’ve used successfully in a couple C++ classes.)

Here’s the (hastily built) code:

import org.junit.*;
import org.junit.internal.*;
import org.junit.internal.runners.model.*;
import org.junit.runner.*;
import org.junit.runner.notification.*;
import org.junit.runners.*;
import org.junit.runners.model.*;

public class IncrementalRunner extends BlockJUnit4ClassRunner {

   public IncrementalRunner(Class<?> klass)
         throws InitializationError {
      super(klass);
   }

   @Override
   protected void runChild(
         FrameworkMethod method, RunNotifier notifier) {
      EachTestNotifier eachNotifier = 
         derivedMakeNotifier(method, notifier);
      if (method.getAnnotation(Ignore.class) != null) {
         runIgnoredTest(method, eachNotifier);
         return;
      }

      eachNotifier.fireTestStarted();
      try {
         methodBlock(method).evaluate();
      } catch (AssumptionViolatedException e) {
         eachNotifier.addFailedAssumption(e);
      } catch (Throwable e) {
         eachNotifier.addFailure(e);
      } finally {
         eachNotifier.fireTestFinished();
      }
   }

   private void runIgnoredTest(
         FrameworkMethod method, EachTestNotifier eachNotifier) {
      eachNotifier.fireTestStarted();
      runExpectingFailure(method, eachNotifier);
      eachNotifier.fireTestFinished();
   }

   private EachTestNotifier derivedMakeNotifier(
         FrameworkMethod method, RunNotifier notifier) {
      Description description = describeChild(method);
      return new EachTestNotifier(notifier, description);
   }

   private void runExpectingFailure(
         final FrameworkMethod method, EachTestNotifier notifier) {
      if (runsSuccessfully(method)) 
         notifier.addFailure(
            new RuntimeException("You've built too much, causing " + 
                                 "this ignored test to pass."));
   }

   private boolean runsSuccessfully(final FrameworkMethod method) {
      try {
         methodBlock(method).evaluate();
         return true;
      } catch (Throwable e) {
         return false;
      }
   }
}

(Note: this code is written for JUnit 4.5 due to client version constraints.)

All the custom runner does is run tests that were previously @Ignored, and expect them to fail. (I think I was forced into completely overriding runChild to add my behavior in runIgnoredTest, but I could be wrong. Please let me know if you’re aware of a simpler way.) To use the runner, you simply annotate your test class with @RunWith(IncrementalRunner.class).

To effectively use the tool, you must provide students with a complete set of tests that supply a definitive means of incrementally building a solution. For any given test, there must be a possible implementation that doesn’t cause any later test to pass. It took me a couple tries to create a good sequence for the Soundex solution.

The tool is neither foolish-proof nor clever-proof; a small bit of monkeying about and a willingness to deliberately cheat will get around it quite easily. (There are probably a half-dozen ways to defeat the mechanism: For example, students could un-ignore tests prematurely, or they could simply turn off the custom test-runner.) But as long as they are not devious, the test failure from building too much gets in their face and smacks them when I’m not around.

If you choose to try this technique, please drop me a line and let me know how it went!

TDD for C++ Programmers

Recently I’ve been getting a good number of calls from C++ shops interested in doing TDD, despite my heavy Java background. I took on some of the business and had to turn away some to avoid being swamped. Many other folks I know (name-dropping time!), including Tim Ottinger, James Grenning, and JB Rainsberger among others, have also reported doing C++ work recently.

Is TDD finally taking hold in C++ shops? Does TDD even make sense for C++? I think so, and two current customers believe they’ve been seeing great benefits come from applying it. Building and delivering a C++ TDD course recently helped me come back up to speed in the language to the point where I was comfortably familiar with all of the things I hated about it. 🙂 It makes no sense to take such a difficult language and stab at it without the protection of tests.

I’ve been simultaneously writing more (after a typical winter writing freeze) and looking at Erlang–a much cooler language, challenging in a different kind of way. Meanwhile, my editor at PragProg has been asking for new book ideas. Here were some of my thoughts:

  • Refactoring 2012
  • Modern OO Design (not template metaprogramming!) / Simple Design
  • Object-Oriented Design “In a Flash” (card deck, like Agile in a Flash)

No matter how hard I try to run screaming from C++, there it is right behind me. It’s indeed a powerful language, and there are gobs and gobs of code written in it, and it’s about time we started trying to figure out how to make the best of it. It’s not going away in my lifetime. I also think C++ programmers are not well-served in terms of writings on TDD out there.

So… I decided it was going to be TDD in C++. Tim Ottinger and I put together and just sent out a proposal for a book tentatively named TDD for C++ Programmers (with a catchy subtitle, no doubt). We hope there’s enough demand and interest to get the proposal accepted. If all goes well, we’ll be soliciting reviewers in a few weeks.

I look forward to writing again with Tim! More in an upcoming blog post about our collaborative writing experience.

Violating Standards in Tests

Should your test code be subject to the same standards as your production code? I believe there should be different sets of standards. I am collecting interesting idioms that are normally shunned in good production code, but are acceptable and advantageous in test code.

An obvious example is the use of macros in tools like James Grenning’s CppUTest. The testing framework (an updated version of CppUnitLite) requires programmers to use macros to identify test functions:

TEST(HelloWorld, PrintOk)
{
   printHelloWorld();
   STRCMP_EQUAL("Hello World!\n", buffer);
}

No self-respecting C++ programmer uses macros anymore to replace functions; per Scott Meyers and many others, it’s fraught with all sorts of problems. But in CppUTest, the use of macros greatly simplifies the work of a developer by eliminating their need to manually register the name of a new test with a suite.
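
To show why a macro helps here, the following is a minimal sketch of how a TEST-style macro can register a test as a side effect of defining it. It is not CppUTest’s actual implementation; SKETCH_TEST, Registrar, testRegistry, and runAllTests are all hypothetical names invented for this illustration.

```cpp
#include <string>
#include <vector>

// One registered test: a display name plus the function to call.
struct TestCase {
    std::string name;
    void (*run)();
};

// A function-local static avoids static-initialization-order problems.
inline std::vector<TestCase>& testRegistry() {
    static std::vector<TestCase> tests;
    return tests;
}

// Constructing a Registrar at namespace scope registers a test before main().
struct Registrar {
    Registrar(const char* name, void (*run)()) {
        testRegistry().push_back(TestCase{name, run});
    }
};

// Declares the test function, registers it, then opens its definition.
#define SKETCH_TEST(group, name)                                         \
    void group##_##name##_body();                                        \
    static Registrar group##_##name##_registrar(#group "." #name,        \
                                                &group##_##name##_body); \
    void group##_##name##_body()

static int checksRun = 0;

SKETCH_TEST(HelloWorld, PrintOk) { checksRun += 1; }
SKETCH_TEST(HelloWorld, PrintTwice) { checksRun += 2; }

// Runs every registered test and reports how many there were.
int runAllTests() {
    for (const auto& test : testRegistry()) test.run();
    return static_cast<int>(testRegistry().size());
}
```

Without the macro, each new test would need both a function definition and a separate line adding it to the registry, which is the step people forget.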

Another example is the use of import static in Java. The general rule of thumb, suggested by Sun themselves, is to not overuse import static. Deleting the type information for statically scoped methods and fields can obscure understanding of code. It’s considered appropriate only when use of static elements is pervasive. For example, most developers faced with coding any real math do an import static on the java.lang.Math functions.

However, I use static import frequently from my tests:

import static org.junit.Assert.*;
import org.junit.*;
import static util.StringUtil.*;

public class StringUtilCommaDelimitTest {
   @Test public void degenerate() {
      assertEquals("", commaDelimit(new String[] {}));
   }

   @Test public void oneEntry() {
      assertEquals("a", commaDelimit(new String[] {"a"}));
   }
   ...
}

Developers unquestioningly use import static for JUnit 4 assertions, as they are pervasive. But the additional use here is for the method commaDelimit, which is defined as static in the target class StringUtil. More frequently, I’ll have the test refer to (statically defined) constants on the target class. For a test reader, it becomes obvious where that referenced constant would be defined.

What other coding standards are appropriate for tests only, and not for production code?

Test Abstraction

I’m staring at a single CppUnit test function spanning hundreds of source lines. The test developer inserted visual indicators to help me pick out the eight test cases it covers:

//++++++++++++++++++++++++++++++++++++++++++++++++++

Each of these cases is brief: four to eight lines of data setup, followed by an execution statement enclosed in a CPPUNIT_ASSERT. Of course they could be broken up into eight separate test functions, but otherwise they are reasonable.

Prior to the eight tests there are two hundred lines of setup code. Most of the initialization sets data to reasonable default values so that the application code won’t crash and burn while being exercised.

I don’t know enough about the test to judge it in terms of its appropriateness as a “unit” test. It seems more integration test than anything. But perhaps all I would need to do is cleverly divorce the target function from all of those data setup dependencies, and break it up into eight separate test functions.

The aggregation of tests is typical, and no doubt comes from a compulsion to not waste all those 200 lines of work! The bigger problem I have is the function’s lack of abstraction. Uncle Bob always says, “abstraction is elimination of the irrelevant and amplification of the essential.” When it comes down to understanding tests, it is usually a matter of how good a job the developer did at abstracting intent. Two hundred lines of detailed setup do not exhibit abstraction!

For a given unit test, I always want to know why a given assertion should hold true, based on the setup context. The lengthy object construction and initialization should be encapsulated in another method, perhaps createDefaultMarket(). Relevant pieces of data can be layered atop the Market object: applyGroupDiscountRate(0.10), applyRestrictionCode(), etc. Not only does it help explain the data differences and correlate the setup with the result, it makes it easier to read the test, and easier to write new tests (reuse!).
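
As a rough sketch of what that layering might look like, consider the following. Only the names createDefaultMarket, applyGroupDiscountRate, applyRestrictionCode, and Market come from the text above; the fields, the pricing rule, and the remaining function names are invented purely so the example runs.

```cpp
// Hypothetical Market with just enough behavior to illustrate the idea.
struct Market {
    double basePrice;
    double groupDiscountRate;
    bool restricted;

    Market& applyGroupDiscountRate(double rate) {
        groupDiscountRate = rate;
        return *this;
    }

    Market& applyRestrictionCode() {
        restricted = true;
        return *this;
    }

    // Invented rule: restricted markets sell nothing; others apply the discount.
    double priceFor(int quantity) const {
        if (restricted) return 0.0;
        return quantity * basePrice * (1.0 - groupDiscountRate);
    }
};

// All the irrelevant-but-required setup lives in one place.
Market createDefaultMarket() {
    Market market{100.0, 0.0, false};
    return market;
}

// Each test body now states only what differs from the default.
double discountedPriceForTwo() {
    Market market = createDefaultMarket();
    market.applyGroupDiscountRate(0.10);
    return market.priceFor(2);
}

double restrictedPriceForTwo() {
    Market market = createDefaultMarket();
    market.applyRestrictionCode();
    return market.priceFor(2);
}
```

A reader can now see at a glance that a 10% group discount is the reason for the expected price, without wading through two hundred lines of defaults.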

I often get blank stares when I ask developers to make their tests more readable. Would they respond better to requests to improve their use of abstraction?
