Don’t Stop Reading…

When writing, be careful of presenting an opposing point first with the notion that you’ll make your counterpoint in a subsequent paragraph or page. That first pitch is often as far as some readers get. Royce’s article “Managing the Development of Large Software Systems” led to decades of waterfall for this reason.

Recently, a pair-developer expressed concern over the use of multiple returns in a short method. I said I knew about the long-standing rule but that it created little value for very short methods. Two days later, I heard the same concern during another pairing session. Twice in one week? Curious!

I asked where they’d read about the rule. “Uncle Bob’s Clean Code” was the surprising response. “Hand me your copy!” I insisted, a bit worried, and quickly located the text in the functions chapter. The very last paragraph on page 48, starting a new section, states: “Dijkstra rules of structured programming… [say] …every function … should have one entry and one exit. Following these rules means that there should only be one return statement in a function…” Oh! Yeah, I learned that 30 years ago. Really, Bob?

Page 49, top paragraph: “”While we are sympathetic to the goals and disciplines of structured programming, these rules serve little benefit when functions are very small. It is only in larger functions that such rules provide significant benefits.”

Oh, ok. Whew! Crisis of faith averted.

Readers: Inattentive reading can be embarrassing!
Writers: Don’t assume readers will get to your point. Make it first.

Cute Little Abstractions

I’m refactoring several classes of over 2000 lines each. These are classic god-classes that mediate a handful of stubby little data-centric classes around them. I reduced the first, primary domain class by over 1100 lines, to under 1500, by simply pushing responsibilities around. Adding tests after the fact (using WELC techniques, of course) gave me the confidence to refactor. In just a few hours, I’ve been able to completely eliminate about 700 lines so far out of the 1100 that were shifted elsewhere. I’m loving it!

After cleaning up some really ugly code, I ended up with better (not great) code:

private boolean containsThreeOptionsSameType() {
   Map<String, Integer> types = new HashMap<String, Integer>();
   for (Option option : options)
      increment(types, option.getType());
   return count(types, Option.ALPHA) >= 3 || 
          count(types, Option.BETA) >= 3 || 
          count(types, Option.GAMMA) >= 3 || 
          count(types, Option.DELTA) >= 3;
}

private int count(Map<String, Integer> map, String key) {
   if (!map.containsKey(key))
      return 0;
   return map.get(key);
}

private void increment(Map<String, Integer> map, String key) {
   if (!map.containsKey(key))
      map.put(key, 1);
   else
      map.put(key, map.get(key) + 1);
}

I could clean up the complex conditional on the return statement (perhaps calling a method to loop through all types). Also, the count and increment methods contain a bit of duplication.

More importantly, the count and increment methods represent a missed abstraction. This is very common: We tend to leave little, impertinent, SRP-snubbing, OCP-violating methods lying about, cluttering up our classes.

(Aside: I’m often challenged during reviews about exposing things so I can test them. Long-term, entrenched developers are usually very good about blindly following the simple rule about making things as private as possible. But they’re perfectly willing to throw OO concept #1, cohesion, completely out the window.)

I decided to test-drive the soon-to-be-no-longer-missing abstraction, CountingSet. I ended up with an improved implementation of increment by eliminating the redundancy I spotted above:

import java.util.*;

public class CountingSet<T> {
   private Map<T, Integer> map = new HashMap<T, Integer>();

   public int count(T key) {
      if (!map.containsKey(key))
         return 0;
      return map.get(key);
   }

   public void add(T key) {
      map.put(key, count(key) + 1);
   }
}

Some developers would be appalled: “You built a cruddy little class with just four lines of real code? And for only one client?” And further: “It’s only a subset of a counting set, where are all the other methods?”

Sure thing, buddy. Here’s my client:

private boolean containsThreeOptionsSameType() {
   CountingSet<String> types = new CountingSet<String>();
   for (Option option : options)
      types.add(option.getType());
   return types.count(Option.ALPHA) >= 3 || 
          types.count(Option.BETA) >= 3 || 
          types.count(Option.GAMMA) >= 3 || 
          types.count(Option.DELTA) >= 3;
}

My client now only contains intent. The small ugliness with the parameterized map and boxing is now “information hidden.” I also have a simple reusable construct–I know I’ve had a similar need several times before. As for building and/or using a fully-functional CountingSet, there’s something to be said for creating “adapter” classes with the smallest possible, and most meaningful, interface.

Good enough for now. Don’t hesitate to create your own cute little abstractions!

Convention and Case

I’m learning Grails. I found a good tutorial written by Jason Rudolph. It does a great job of covering a broad range of the typical things you’d want to do in putting up a quick web site.

Unfortunately, it’s for an older version of Grails, but so far that hasn’t been a barrier; most of the changes I need to make are self-explanatory. I’m almost finished, and feel I have a good grasp on using Grails.

Still, I struggled for about two hours last night on a problem in going through the tutorial. Here’s a form I defined.

<g:form controller="user" method="post">
    ...
    <g:actionSubmit value="Log In"/>
    ...
</g:form>

And here’s some of the controller:

class UserController extends BaseController {
  def beforeInterceptor =
 [action:this.&auth, except: ['login', 'logout']]

  def index = { redirect(action:list,params:params) }

  def allowedMethods = [delete:'POST', save:'POST', update:'POST']

  def login = {
   // ... 
  }
  // ...
}

I think that’s all that’s relevant to show. Those of you who know Grails know how the form knows which controller method to call; I wasn’t paying enough attention and missed the simple cause of my problem, which was that a simple form submit from the login page kept generating a 404, URL not found. It showed me the request URI:

RequestURI=/racetrack/user/index

Yet that is a valid URI (index redirects, of course). In fact, if I copied the URL grails generated, and pasted it into the address bar, the proper page came up. I tried a few things, including explicitly specifying the action:

<g:form controller="user" method="post" action="login">

No dice, same problem:

HTTP ERROR: 404

NOT_FOUND

RequestURI=/racetrack/user/login

Frustrated, I tried a few other things. Still no dice. I looked at the HTML provided in the book and looked to ensure I typed it correctly (I almost always type my own sample code rather than paste it, I learn better that way). Looks pretty much the same. Web search, no exact match on my same problem; found a couple odd things that it might be, tried ’em, not the problem.

When in doubt, really make sure you have exactly the same thing. Whitespace and case. Case couldn’t possibly matter on button text, could it? Well, yes, especially when you follow this notion of programming by convention or programming by default or whatever you want to call it. An actionSubmit defines the controller method, either explicitly or implicitly. Explicitly, you can simply say:

<g:actionSubmit value="Log In" action="login"/>

In the absence of the action attribute, it uses the value attribute. Apparently, it lowercases the first letter for you, but then simply removes spaces, not lower-casing any subsequent letters. Thus it was looking for a controller action named “logIn,” apparently. The tutorial code has it typed as “Log in.” I typed it “Log In” (I thought it looked nicer). Harumph.

Dumb on my part, for not paying enough attention to how the form was supposed to figure out which controller method to call (I hadn’t figured this out yet, and thought this was another “by convention” element, perhaps from somewhere else). Dumb on Groovy’s part, for not showing me the URI with the improper casing.

Lesson for me: Pay more attention. Lesson for tools that play by convention: If you’re going to do something as clever as use button text to define an action, make sure you are picky about case everywhere, and make it clear from whence that action name came (i.e. better error messages). Lesson for tutorial writers: It’s hard work to write a great tutorial. A good one gets you through all the happy path circumstances. A great one helps you out with all the dumb mistakes you’ll inevitably make.

Violating Standards in Tests

Should your test code be subject to the same standards as your production code? I believe there should be different sets of standards. I am collecting interesting idioms that are normally shunned in good production code, but are acceptable and advantageous in test code.

An obvious example is the use of macros in tools like James Grenning’s CppUTest. The testing framework (an updated version of CppUnitLite) requires programmers to use macros to identify test functions:

TEST(HelloWorld, PrintOk)
{
   printHelloWorld();
   STRCMP_EQUAL("Hello World!n", buffer); 
}

No self-respecting C++ programmer uses macros anymore to replace functions; per Scott Meyers and many others, it’s fraught with all sorts of problems. But in CppUTest, the use of macros greatly simplifies the work of a developer by eliminating their need to manually register the name of a new test with a suite.

Another example is the use of import static in Java. The general rule of thumb, suggested by Sun themselves, is to not overuse import static. Deleting the type information for statically scoped methods and fields can obscure understanding of code. It’s considered appropriate only when use of static elements is pervasive. For example, most developers faced with coding any real math do an import static on the java.lang.Math functions.

However, I use static import frequently from my tests:

import static org.junit.Assert.*;
import org.junit.*;
import static util.StringUtil.*;

public class StringUtilCommaDelimitTest {
   @Test public void degenerate() {
      assertEquals("", commaDelimit(new String[] {}));
   }

   @Test public void oneEntry() {
      assertEquals("a", commaDelimit(new String[] {"a"}));
   }
   ...
}

Developers unquestioningly use import static for JUnit 4 assertions, as they are pervasive. But the additional use here is for the method commaDelimit, which is defined as static in the target class StringUtil. More frequently, I’ll have the test refer to (statically defined) constants on the target class. For a test reader, it becomes obvious where that referenced constant would be defined.

What other coding standards are appropriate for tests only, and not for production code?

Test Abstraction

I’m staring at a single CppUnit test function spanning hundreds of source lines. The test developer inserted visual indicators to help me pick out the eight test cases it covers:

//++++++++++++++++++++++++++++++++++++++++++++++++++

Each of these cases is brief: four to eight lines of data setup, followed by a execution statement enclosed in a CPPUNIT_ASSERT. Of course they could be broken up into eight separate test functions, but otherwise they are reasonable.

Prior to the eight tests there are two hundred lines of setup code. Most of the initialization sets data to reasonable default values so that the application code won’t crash and burn while being exercised.

I don’t know enough about the test to judge it in terms of its appropriateness as a “unit” test. It seems more integration test than anything. But perhaps all I would need to do is cleverly divorce the target function from all of those data setup dependencies, and break it up into eight separate test functions.

The aggregation of tests is typical, and no doubt comes from a compulsion to not waste all those 200 lines of work! The bigger problem I have is the function’s lack of abstraction. Uncle Bob always says, “abstraction is elimination of the irrelevant and amplification of the essential.” When it comes down to understanding tests, it is usually a matter of how good a job the developer was at abstracting intent. Two hundreds of lines of detailed setup does not exhibit abstraction!

For a given unit test, I always want to know why a given assertion should hold true, based on the setup context. The lengthy object construction and initialization should be encapsulated in another method, perhaps createDefaultMarket(). Relevant pieces of data can be layered atop the Market object: applyGroupDiscountRate(0.10), applyRestrictionCode(), etc. Not only does it help explain the data differences and correlate the setup with the result, it makes it easier to read the test, and easier to write new tests (reuse!).

I often get blank stares when I ask developers to make their tests more readable. Would they respond better to requests to improve their use of abstraction?

Code to Discuss

A good software team knows to talk openly about the code. One suggestion is to get together once a day, maybe at day end, to just talk about or look at some code, even if you are already pairing. The get-together need not take long nor be formal in any manner. Fifteen to thirty minutes should be sufficient. Toss some code up on the wall using an LCD projector and have at it.

Initially, such discussions can be touchy, as critique can be mistakenly viewed as insults or attacks. Over time, the regular exercise of discussing code should form a group that better understands how to communicate effectively with each other.

What to talk about? I brainstormed some ideas for a specific set of teams, things I knew they needed to discuss, and provide that starter list here. Given the almost limitless number of reasons to talk about the code, there is little excuse to forego the forum.

  • “here’s something cool that I did today!”
  • “how could I better express this test?”
  • “here is a mocking technique I found valuable”
  • “this test didn’t make sense to me”
  • “this area of the code concerns me”
  • test granularity
  • test naming
  • BDD
  • how TDD changed my design
  • naming standards
  • coding standards
  • warning standards
  • JUnit 4.4+: Hamcrest matchers, theories, etc.
  • “we’re not writing enough tests”
  • code coverage (taking a look at the actual lines exercised by tests)
  • WTFs in the code or test. Can be amusing, but watch the toes!
  • test execution speed. Splitting into slow/fast suites.
  • identifying and speeding up slow tests
  • test smells
  • various forms of test double (see XUnit Patterns)
  • simple design vs SOLID
  • “we’re not refactoring our production code enough”
  • “we’re not refactoring our tests enough”
  • object mothers for tests
  • “the build is a pain in the rear”
  • looking at a new mock tool like Mockito

And so on. Perhaps get a volunteer (or two) to “open” a topic each week. They talk for a few minutes, and then you open things up for group discussion.

Scaring You Straight: Legacy Madness

legacyMadness

I saw this classic comic book cover and couldn’t resist. There could be a whole True Coding Crime series.

Comments and Footnotes

Will I ever stop railing on about comments? I don’t think so. As long as it’s possible to write code in a less than clear manner, people will always be compelled to explain their coding deficiencies. And I will always be compelled to resist.

As a writer, I try to ensure that the main text in a book or article is clear to the reader. Every rare once in a while (perhaps a couple dozen times in a book), I feel compelled to write extra information, something that supplements the text. I do so in the form of a footnote. The footnote provides non-essential information. It’s something that might help add some insights to what I just wrote, perhaps, or a humorous but otherwise distracting note.

My editors would never let me use footnotes to re-explain a passage of text. A coder attempting to salvage bad code with comments is like me adding footnotes to try and explain poorly written text. If the mainline text didn’t explain it well, why should a footnote explain it any better?

Footnotes and comments are ultimately a distraction. Ever try to read a book with footnotes on almost every page (and I’m not talking citations or references)? You may as well be reading two books, in some cases. As writers, we owe it to our readers write more clearly and to minimize these distractions.

 

Comments:

Excellent analogy Jeff.

Today I was asked what I think about assertThat for jUnit4 framework. It brings nothing more than a little bit more verbosity for tests, but won’t help people write better tests. Those , whom tests are of a poor readability and quality will remain the same with a new constructions, too.

Cryptics

In posting messages about the new for-each loop syntax in Java 5.0, many developers have complained about its crypticism. A for-each loop might look like:

for (Employee employee: department)
 terminate(employee);

You would read this for-each loop as, “for each employee (of type Employee) in department).” Many developers would have preferred something like:

for (Employee employee in department)
 terminate(employee);

But alas, Sun resists new Java keywords like “in” because they have the potential to break gobs of existing code.

It’s not that big a deal to me, since I view the loop construct as idiomatic. Once I saw it and it was explained to me, or perhaps twice, it became ingrained. Subsequent encounters required no such explanation or even translation time.

Nonetheless, the new for-each loop is a slightly cryptic construct. Not quite as bad as the ternary operator (conditional ? true-value : false-value), but still not obvious the first time. Some developers can’t stand it.

My take is that Java is a fairly cryptic language to start with, particularly if you have no background in C-based languages (C, C++, Objective C, C#, and whatever else). The old C-style for loop isn’t obvious either. Typical C code is rife with odd characters such as *, %, &, {, }, and |.

This obvious observation led me to create Jeff’s hypothesis of cryptics:

Jeff’s Definition of Cryptics: The crypticism of a language is directly proportional to the mean number of shift keystrokes required.

It’s a pretty useless definition, granted. But taking a look at a couple languages that appear on opposite ends of the spectrum shows that there is some validity. Curly braces in C-based languages { use the shift key, as does * (pointer dereference). The new generics stuff in 1.5 uses angle brackets < and >, both requiring use of shift. The tertiary operator uses both : and ?. Strings are set off by use of the ” character. And C uses lots of parentheses.

Smalltalk, in contrast, uses brackets ([ and ]) to set off blocks. No shift-combination necessary! Boolean operators are spelled out instead of using & and |. Strings are set off by use of the ‘ character. The major exception to the rule is the use of the colon to designate a selector (but this also suggests that perhaps you should pass fewer parameters to your methods).

Of course, crypticism is related to how easy it is to read text, not write it. But it’s no coincidence that the cryptics require shift-combinations on a standard keyboard or typewriter. These are the characters that make things difficult to read when used to excess.

I was curious enough to pose the idea to Dan Ingalls, who had a lot to do with the design of early Smalltalk implementations. He indicated that indeed there was a conscious effort to reduce the number of shift keystrokes required. This was due to the fact that one goal for Smalltalk was that children would be able to program in it.

Lots of shift-key combinations = lots of crypticism. I’m still amazed at the fact that modern languages retain so much of the unnecessary constructs and ugliness of a language that is 30 years old.

Atom