Essay – Extreme Enthusiasm

Bureaucratic tests

matteo — Mon, 28 Mar 2016 16:00:53 +0000

The TDD cycle should be fast! We should be able to repeat the red-green-refactor cycle every few minutes. This means that we should work in very small steps. Kent Beck in fact is always talking about “baby steps.” So we should learn how to make progress towards our goal in very small steps, each one taking us a little bit further. Great! How do we do that?

Example 1: Testing that “it’s an object”

In the quest for “small steps”, I sometimes see recommendations that we write things like these:

it("should be an object", function() {
  assertThat(typeof chat.userController === 'object')
});

which, of course, we can pass by writing

chat.userController = {}

What is the next “baby step”?

it("should be a function", function() {
  assertThat(typeof chat.userController.login === 'function')
});

And, again, it’s very easy to make this pass.

chat.userController = { login: function() {} }

I think these are not the right kind of “baby steps”. These tests give us very little value.

Where is the value in a test? In my view, a test gives you two kinds of value:

Verification value, where I get assurance that the code does what I expect. This is the tester’s perspective.
Design feedback, where I get information on the quality of my design. And this is the programmer’s perspective.

I think that in the previous two tests, we didn’t get any verification value, as all we were checking is the behaviour of the typeof operator. And we didn’t get any design feedback either. We checked that we have an object with a method; this does not mean much, because any problem can be solved with objects and methods. It’s a bit like judging a book by checking that it contains written words. What matters is what the words mean. In the case of software, what matters is what the objects do.

Example 2: Testing UI structure

Another example: there are tutorials that suggest that we test an Android app’s UI with tests like this one:

public void testMessageGravity() throws Exception {
  TextView myMessage = 
    (TextView) getActivity().findViewById(R.id.myMessage);
  assertEquals(Gravity.CENTER, myMessage.getGravity());
}

Which, of course, can be made to pass by adding one line to a UI XML file:

android:gravity="center"
/>

What have we learned from this test? Not much, I’m afraid.

Example 3: Testing a listener

This last example is sometimes seen in GUI/MVC code. We are developing a screen of some sort, and we try to make progress towards the goal of “when I click this button, something interesting happens.” So we write something like this:

@Test
public void buttonShouldBeConnectedToAction() {
    assertEquals(1, button.getActionListeners().length);
    assertTrue(button.getActionListeners()[0] 
                 instanceof ActionThatDoesSomething);
}

Once again, this test does not give us much value.

Bureaucracy

The above tests are all examples of what Keith Braithwaite calls “pseudo-TDD”:

1. Think of a solution
2. Imagine a bunch of classes and functions that you just know you’ll need to implement (1)
3. Write some tests that assert the existence of (2)
[… go read Keith’s article for the rest of his thoughts on the subject.]

In all of the above examples, we start by thinking of a line of production code that we want to write. Then we write a test that asserts that that line of code exists. This test does nothing but give us permission to write that line of code: it’s just bureaucracy!

Then we write the line of code, and the test passes. What have we accomplished? A false sense of progress; a false sense of “doing the right thing”. In the end, all we did was waste time.

Sometimes I hear developers claim that they took longer to finish, because they had to write the tests. To me, this is nonsense: I write tests to go faster, not slower. Writing useless tests slows me down. If I feel that testing makes me slower, I should probably reconsider how I write those tests: I’m probably writing bureaucratic tests.

Valuable tests

Bureaucratic tests are about testing a bit of solution (that is, a bit of the implementation of a solution). Valuable tests are about solving a little bit of the problem. Bureaucratic tests are usually testing structure; valuable tests are always about testing behaviour. The right way to do baby steps is to break down the problem into small bits (not the solution). If you want to do useful baby steps, start by writing a list of all the tests that you think you will need.

In Test-Driven Development: by Example, Kent Beck attacks the problem of implementing multi-currency money starting with this to-do list:

$5 + 10 CHF = $10 if rate is 2:1
$5 * 2 = $10

Note that these tests are nothing but small slices of the problem. In the course of developing the solution, many more tests are added to the list.
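As an illustration, the second item on that list can be turned into a test almost word for word. The following is only a sketch in Ruby (the book itself uses Java); the Dollar class and its times method are names assumed for this example.

```ruby
# Hypothetical sketch: Dollar and times are names assumed for illustration.
class Dollar
  attr_reader :amount

  def initialize(amount)
    @amount = amount
  end

  # Multiplying returns a new Dollar; the receiver is left unchanged.
  def times(multiplier)
    Dollar.new(amount * multiplier)
  end

  # Two Dollar objects are equal when their amounts are equal.
  def ==(other)
    other.is_a?(Dollar) && amount == other.amount
  end
end

# The test is a slice of the problem: "$5 * 2 = $10".
five = Dollar.new(5)
raise "expected $5 * 2 to equal $10" unless five.times(2) == Dollar.new(10)
```

Nothing in the test mentions how the amount is stored; it only states something the customer could have asked for.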

Now you are probably wondering what I would do, instead of the bureaucratic tests that I presented above. In each case, I would start with a simple example of what the software should do. What are the responsibilities of the userController? Start there. For instance:

it("logs in an existing user", function() {
  var user = { nickname: "pippo", password: "s3cr3t" }
  chat.userController.addUser(user)

  expect(chat.userController.login("pippo", "s3cr3t")).toBe(user)
});

In the case of the Android UI, I would probably test it by looking at it; the looks of the UI have no behaviour that I can test with logic. My test passes when the UI “looks OK”, and that I can only test by looking at it (see also Robert Martin’s opinion on when not to TDD). I suppose that some of it can be automated with snapshot testing, which is a variant of the “golden master” technique.

In the case of the GUI button listener, I would not test it directly. I would probably write an end-to-end test that proves that when I click the button, something interesting happens. I would probably also have more focused tests on the behaviour that is being invoked by the listener.

Conclusions

Breaking down a problem into baby steps means that we break the problem to solve, not the solution, into very small pieces. Our tests should always speak about bits of the problem; that is, about things that the customer actually asked for. Sometimes we need to start by solving an arbitrarily simplified version of the original problem, like Kent Beck and Bill Wake do in this article I found enlightening; but it’s always about testing the problem, not the solution!

A fundamental design move

matteo — Thu, 30 May 2013 16:19:44 +0000

Pardon me if the content of this post looks obvious to you. It took me years to understand this!

When doing incremental design, I found that there is one fundamental move that makes design emerge. It is

I recognize that one thing has more than one responsibility;
then I separate the responsibilities by splitting the thing into two or more smaller things;
and I connect the smaller things with the proper degree of coupling.

For instance, in the description of the domain of the game Monopoly we have this sentence:

A player takes a turn by rolling two dice and moving around the board accordingly.

A straightforward translation of this into code is

class Player {
  // ...
  public void takeTurn() {
    int result = Math.random(6) + Math.random(6);
    this.position = (this.position + result) % 40;
  }      
}

Can you see that Player#takeTurn() does two things? You can see this from the beginning by the wording: “by rolling two dice *and* moving around the board”.

You can also see it when you try to write the test:

@Test public void playerTakesTurn() {
  Player player = new Player();
  assertEquals(0, player.position()); // initial position
  
  player.takeTurn();
  assertEquals(???, player.position()); // how should I know???
}

We can’t write the last assertion because we have no control over the dice.

The standard way to solve this is to move the responsibility of extracting a random result to a separate class.

@Test public void playerTakesTurn() {
  Dice dice = new Dice();
  Player player = new Player(dice);
  //...
}

class Player {
  public void takeTurn() {
    int result = this.dice.roll();
    this.position = (this.position + result) % 40;
  }      
}

This still does not solve our problem, since Dice#roll() will still produce a random result that we have no control over. But now we have the option of making the coupling between Player and Dice weaker, by making Dice an interface and passing a fake Dice implementation that will return the result that we want:

@Test public void playerTakesTurn() {
  Dice dice = new FakeDiceReturning(7);
  Player player = new Player(dice);
  assertEquals(0, player.position()); // initial position

  player.takeTurn();
  assertEquals(7, player.position()); // now it's easy
}

This design move, that is almost forced by the need to write a clear test, has a number of advantages:

Now we have the option to pass different dice implementations. If the rules ever called for using different kinds of dice (for instance, 8-sided dice) we would not need to change the Player class at all.
Now we can test the Dice implementation separately. We might thus discover the bug in the above random number generation code. (Did you see it?) :-)
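The same move can be sketched in Ruby, where no explicit interface is needed: Player only requires something that responds to roll. All class names below (FixedDice, TwoDice, Player) are assumptions made for this sketch, not code from a real Monopoly implementation.

```ruby
# Hypothetical sketch: all class names here are assumptions for illustration.

# A fake dice that always returns the same total, for use in tests.
class FixedDice
  def initialize(result)
    @result = result
  end

  def roll
    @result
  end
end

# Rolls two dice with a configurable number of sides.
class TwoDice
  def initialize(sides = 6, random = Random.new)
    @sides = sides
    @random = random
  end

  def roll
    # rand(1..n) returns an integer between 1 and n, both included.
    @random.rand(1..@sides) + @random.rand(1..@sides)
  end
end

# Player depends only on "something that responds to roll".
class Player
  BOARD_SIZE = 40
  attr_reader :position

  def initialize(dice)
    @dice = dice
    @position = 0
  end

  def take_turn
    @position = (@position + @dice.roll) % BOARD_SIZE
  end
end

player = Player.new(FixedDice.new(7))
player.take_turn
puts player.position # prints 7
```

Switching to 8-sided dice is then a matter of passing TwoDice.new(8) to Player, and TwoDice can be tested on its own.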

So, to summarize, the fundamental design move is in three steps:

You realize a portion of code has two responsibilities, perhaps because the test is hard to write.
You separate the portion of code in two parts; this is usually some form of “extract method”.
You’re not finished yet, because you still have to decide what kind of coupling you want between the two parts. Usually, a simple extract method will produce a coupling that is too tight. More work is required to ease the coupling to the degree that you want.

It is simply a consequence of the Single Responsibility Principle; but it can also be seen as an application of the “clarity of intent” rule in Kent Beck’s 4 Rules of Simple Design.

Now the interesting thing is that this design move applies not just to methods, but to classes, modules, services and applications! Whenever you see two things inside one, separate them. If you apply the right degree of coupling, you will end up with a system that is simpler.

Design tip: Watch how you cut!

matteo — Mon, 09 Jul 2012 09:14:11 +0000

The number one design mistake I see people make is writing code that is overspecific. By that I mean code that can only be used in one place and can never be reused. Programmers should always try to write code that is generic, reusable and independent from the context where it is used.

Design is (mainly) about breaking the solution into small, manageable pieces. However, how you do the cut is important; if you cut poorly, you end up with a worse mess than when you had a single monolithic piece.

This is an example I saw last week. It is Rails-specific but I think non-Rails readers will be able to follow the logic. The application had a general HTML layout that contained something like this:


    <%= yield :right_sidebar %>
  
  
    
    <%= yield %>

Ignore for a second the use of a table for layout (that requires a separate post!) Can you see the design flaw?

Yes, the flaw is that all pages that want to add something to the right sidebar must remember to wrap it in <td>...</td>. Like this:

<% content_for :right_sidebar do %>
  <td>
  BANNER GOES HERE
  </td>
<% end %>

In practice the code that goes in the right sidebar is dependent on the context where it is used. The solution in this case is simple: keep the context in the calling code, not in the called code. Like this:


  
    
    <%= yield %>

    
    
      <%= yield :right_sidebar %>

This has the problem that if the page does not want to put anything in the right banner, it will still have an empty td element. Fortunately, Rails has a clean solution for this:


    <% if content_for? :right_sidebar %>
    
    <% end %>
  
  
    
    <%= yield %>

    
      <%= yield :right_sidebar %>

The code that uses the right sidebar is now simplified to

<% content_for :right_sidebar do %>
  BANNER GOES HERE
<% end %>

And this code will not have to change when the application layout is changed to something that does not use tables.

Watch how you cut!

Formalism versus Object Thinking

matteo — Sun, 22 Jan 2012 16:08:48 +0000

The book Object Thinking makes a very good explanation of the divide between two different, antagonistic modes of thinking. The formalist tradition (in software) values logic and mathematics as design tools. A formalist thinks that documents have objective, intrinsic meaning. Think “design documents”; think “specifications”.

The empirical tradition (the book calls it hermeneutics, but I will stick to this less formal name :-) values experimentation. Empiricists hold that the meaning of a document is a shared, temporary convention between the author and the readers. Think “user story”; think CRC cards; think “quick design session on the whiteboard.”

The empiricists brought us Lisp; the formalists brought us Haskell. The formalists brought us Algol, Pascal, Ada. The empiricists brought us C, Perl, Smalltalk.

Empiricists like to explain things with anthropomorphism: “this object knows this and wants to talk to that other object…” The formalists detest anthropomorphism; see these quotes from Dijkstra.

As a former minor student of the best formalist tradition there is, and a current student of the Object Thinking tradition, I think I’m qualified to comment. Please don’t take my notes as meaning that the formalist tradition sucks; I certainly don’t think this. I’m interested in highlighting differences. I think a good developer should learn from both schools.

Formalists aim to bring clarity of thought by leveraging mathematical thinking.

Object thinking aims to bring clarity of thought by leveraging spatial reasoning, metaphor, intuition, and other modes of thinking.

It is well known that mathematical thinking is powerful. It’s also more difficult to learn and use. One example that was a favourite of Dijkstra is the problem of covering a chessboard with dominoes when two opposite corners of the chessboard have been removed. If we try to prove that it’s impossible by “trying” to do it or simulating it, we’d quickly get bogged down. On the other hand, there’s a very simple and nice proof that shows that it’s impossible. Once you get that idea, you have power :-)

An even more striking example is in this note from Dijkstra on the proof method called “pigeonhole principle”. Dijkstra finds that the name “pigeon-hole principle” is unfortunate, as is the idea to imagine “holes” and a process of filling them with “pigeons” until you find that some pigeon has no hole. The process is vivid and easy to understand; yet it is limiting. Dijkstra shows in this note how to define the principle in a simpler and more powerful way:

For a non-empty, finite bag of numbers, the maximum value is at least the average value.

This formulation is simple (but not easy!). Armed with this formulation, Dijkstra explains how he used this principle to solve on the spot a combinatorial problem about Totocalcio that a colleague of his could not solve with pen and paper. He also explains how he used it to solve a generalization of the problem, which would not be easy to prove with the “object-oriented” version of the principle.
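To see how the familiar statement follows from this formulation, here is a short derivation (a sketch, not taken from Dijkstra’s note). Assume n pigeons are distributed over k holes, and let c_i be the number of pigeons in hole i; these symbols are introduced only for this example.

```latex
% The bag of numbers is c_1, ..., c_k; its sum is n, so its average is n/k.
\[
  \max_{1 \le i \le k} c_i \;\ge\; \frac{1}{k}\sum_{i=1}^{k} c_i \;=\; \frac{n}{k}
\]
% With more pigeons than holes the average exceeds 1,
% and since every c_i is an integer, the maximum is at least 2.
\[
  n > k \;\Longrightarrow\; \frac{n}{k} > 1 \;\Longrightarrow\; \max_{1 \le i \le k} c_i \ge 2
\]
```

So some hole holds at least two pigeons, and no process of “filling holes” had to be imagined.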

I think this note presents the contrast between formalism and empiricism vividly. If you put in the effort to internalize the formal tool, that which was difficult becomes easy, and you can solve a whole new level of problems.

On the other hand, the formalists do not always win :-) Formalists reject the idea of making tests the cornerstone of software development. In my opinion they are squarely wrong; examples are the primary tools to do software development, and you can’t even understand if a specification is correct until you *test* it with examples.

The one thing that both camps have in common is that they are both minority arts. Real OOP is almost as rare as Dijkstra-style program derivation. The common industrial practice is whateverism :-)

On knowing what you’re doing

matteo — Sun, 09 Oct 2011 19:12:17 +0000

I’d like to share this quote from E.W.Dijkstra:

When the design of the THE Multiprogramming System neared its completion the University’s EL X8 was getting installed, but it had not been paid yet, and we hardly had access to it because the manufacturer had to put it to the disposal of an American software house that was supposed to write a COBOL implementation for the EL X8. They were program testing all the time, and we let it be known that if occasionally we could have the virgin machine for a few minutes, we would appreciate it. They were nice guys, and a few times per week we would get an opportunity for our next test run. We would enter the machine room with a small roll of punched paper tape, and a few minutes later we would leave the machine room with the output we wanted. I remember it vividly because when they realized what we were achieving, our minimal usage of the machine became more and more frustrating for them. I don’t think their COBOL implementation was ever completed.
EWD1303, My recollections of operating systems design

I can picture Dijkstra and his colleagues working with paper and blackboards and thinking hard about how they were writing their software. They came into the room and their software just worked. And I can picture the Cobol crew in a furious vicious circle of code-and-fix; their growing frustration and despair. That was in the mid-1960s: no books on software design existed back then. For that matter, no books on compiler writing existed either.

Could the Cobol crew have done better? Absolutely. In 1957, the FORTRAN team led by John Backus delivered a compiler that reportedly generated code almost as good as hand-written code. How could the poor Cobol crew have done something as good or at least good enough?

My answer: by designing their software. By breaking the thing into parts (modules) and developing each one separately. By using analogies from other engineering disciplines; by using metaphors. I know this is all obvious to us in 2011 as we all know about module decomposition and the use of metaphors and coupling and cohesion. But is it really obvious? Really?

What I see is that software teams today can *still* be divided in two kinds: the Dijkstra-crew kind and the Cobol-crew kind. Those who are in control and produce reasonably good software within reasonable resources and whose code is reasonably clean; and those who toil away late hours and produce late and buggy software, with no design, or perhaps with an ineffective, we-don’t-really-believe-in-it design and crappy code.

Which do you want to be? What will you do to become it?

Design by Contract vs. Test-Driven Development

matteo — Sat, 28 May 2011 09:54:25 +0000

A great many years ago I was fascinated by Bertrand Meyer’s book “Object-Oriented Software Construction.” One of the many remarkable things in that book is the idea of “Design By Contract”, where you specify what a method does by means of a logical precondition and postcondition. Consider the square root function:

  pre: x ≥ 0
  post: abs(y*y - x) < epsilon

This is a very good specification:

It’s efficiently executable.
The intent is clear.
It gives no hint about how to implement it, i.e., it does not contain design ideas.

Now I’m reading the Scrumban book by Corey Ladas. One thing Corey says is that Test-Driven Development is good, but not as good as Design By Contract; in fact, he says, TDD might be a stepping stone to DBC.

I have never met someone who does DBC. This by itself does not mean a lot, as I’m not widely travelled, in working experience. I’d be quite interested in reading experiences about this. Anyway I suspect that there are fundamental reasons why it’s not widely practiced. My hypothesis is that the square root is an exceptionally good DBC example, but most functions are not as easily specified by contract.

Consider for example the function “specified” by the following TDD-style tests:

  it "returns empty string for empty string" do
    assert_equal "", squeeze("")
  end
  
  it "replaces sequences of equal characters with one" do
    assert_equal "abca", squeeze("aaabbccccaa")
  end

I think it’s quite clear how we should implement this function, even though we are given just two examples of its behaviour. How would we specify it with DBC? I’m not sure what is the best way. I thought about this for a long time (believe me) and this is the best I came up with:

pre: true // any string is valid input
post: 
  let s be the input string.
  let s' be the output string.
  (∀ i: s'[i-1] ≠ s'[i])
  ∧
  (∀ i,j: i < j ⇒ 
    (∃ k,l: k < l ∧ s'[i] = s[k] ∧ s'[j] = s[l]))

You read it like this:

The output string does not contain consecutive equal values (that was easy!)
Two distinct values in the output string must have been present in the input string and in the same order.

Now is this a correct specification? Can I *prove* that this specification is correct? Well, this is a specification, so it’s supposed to be self-evidently correct. But is it? I don’t think it is self evident at all. I think it takes a bit of thought to understand it. It’s not easy to find a way to prove it correct. One way is to look for counterexamples, that is, find a pair (s,s’) that satisfies the specification yet contradicts my intuitive notion of what “squeeze” should do. (In fact, I can think of at least two examples that prove that this specification has holes. Can you find them?) In other words, it turns out that the only way to convince myself that this specification is correct is by testing it on carefully chosen examples!
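To try the specification on examples, it helps to make it executable. The following Ruby predicate is a deliberately naive sketch that transcribes the two clauses literally, favouring fidelity over speed; the name satisfies_squeeze_spec? is an assumption made for this example.

```ruby
# Hypothetical helper: checks an (input, output) pair against the
# postcondition above. The method name is an assumption for illustration.
def satisfies_squeeze_spec?(input, output)
  s = input.chars
  t = output.chars

  # First clause: no two consecutive characters of the output are equal.
  no_repeats = t.each_cons(2).all? { |a, b| a != b }

  # Second clause: for every i < j in the output there exist k < l in the
  # input with output[i] == input[k] and output[j] == input[l].
  # combination(2) over ascending indices yields exactly the pairs i < j.
  order_kept = (0...t.size).to_a.combination(2).all? do |i, j|
    (0...s.size).to_a.combination(2).any? do |k, l|
      t[i] == s[k] && t[j] == s[l]
    end
  end

  no_repeats && order_kept
end

puts satisfies_squeeze_spec?("aaabbccccaa", "abca") # prints true
puts satisfies_squeeze_spec?("aaabbccccaa", "abba") # prints false
```

With a predicate like this, looking for counterexamples becomes a matter of feeding it pairs and comparing the answer with intuition.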

* * *

Well it’s not true that I can’t find a clearer specification. Consider this other one:

  pre: true
  post: let squeeze behave like f where
    f([]) = []
    f([x]) = [x]
    f([x,x|rest]) = f([x|rest])
    f([x,y|rest]) = x + f([y|rest]) if x ≠ y

The notation [x,y|rest] means “a string that begins with character x, then character y, then 0 or more other characters.” My observations follow.

This is a purely mathematical specification of function squeeze, as pure as the previous one. It’s arguably clearer than the previous specification. I think this one is correct. I could test it against a few cases just to make sure, but it seems OK to me.

This spec can be executed efficiently, while checking an input-output pair against the other spec requires quadratic time.

But there’s a problem; this spec is actually a program. Once I have this spec I can use *this* for an implementation and work no more. I already have my implementation.
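To make the point concrete, here is the same definition transcribed into Ruby, clause by clause. It is a sketch that assumes plain strings; note that Ruby also ships a built-in String#squeeze that does the same job.

```ruby
# A direct transcription of f into Ruby.
def squeeze(string)
  # f([]) = [] and f([x]) = [x]
  return string if string.length <= 1

  rest = squeeze(string[1..-1])
  if string[0] == string[1]
    rest               # f([x,x|rest]) = f([x|rest])
  else
    string[0] + rest   # f([x,y|rest]) = x + f([y|rest]) if x ≠ y
  end
end

puts squeeze("aaabbccccaa") # prints abca
```

The “specification” runs as it is, which is exactly why it no longer counts as a specification independent of the implementation.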

In conclusion, this is my objection to DBC: I suspect that the square root is an exception; most methods are too complex to specify with pure logic, or require us to write a functional program that solves the problem. In both cases we are left with the need to test our specification on specific examples.

This is why I think TDD is for most purposes more effective than DBC. In general, concrete examples are by orders of magnitude simpler to write and more self-evident than universal statements. Most universal statements will have to be tested against examples anyway, or we wouldn’t be confident in their correctness.

Long live examples!

Anti-FOR tips from the Yahtzee Kata

matteo — Sat, 14 May 2011 15:47:20 +0000

Again on the Kata Yahtzee, which I blogged about some time ago.

If you have not solved the kata at least once, please stop reading this! Get back when you have.

* * *

Good to see you again! Now that you solved it, you probably know that the naive solution takes many “for” loops. Let D be the player dice, represented as an array of die results, e.g., D=(1,6,1,6,4). The naive rules for sixes would be


    def sixes_score 
      sum = 0
      for d in D
        if d == 6
          sum += 6
        end
      end
      return sum
    end

This solution involves searching for sixes and adding up. Why do we need to search? We need to search because there are many different D that are worth exactly the same for the sixes rule. For instance, both D=(1,2,3,6,6) and D=(6,6,1,2,3) are worth 12.

One way to avoid the search is to represent the die rolls in a canonical form, that is a form where two equivalent results are represented in the same way. The obvious way to obtain a canonical form is to sort D; but in this particular case, we’d still need to search for sixes.

An alternative canonical form would be to count how many results we have: for instance D=(1,1,3,6,6) would be represented as C = [2,0,1,0,0,2], that is “two times 1, one time 3, two times 6.” The rule for sixes becomes


    def sixes_score
      return 6*C[6]
    end

(Note: we take C to be a 1-based array. It’s easy to make one in Java or Ruby: use an array of length 7 and ignore index 0.)
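As a runnable sketch of this representation, the counts can be built once from the dice and then indexed directly. The helper name counts is an assumption for this example, and the methods take the dice as a parameter instead of using the global D.

```ruby
# Hypothetical helper: builds the 1-based counts array C from the dice D.
def counts(dice)
  c = Array.new(7, 0) # index 0 is unused, so c[6] is the number of sixes
  dice.each { |d| c[d] += 1 }
  c
end

def sixes_score(dice)
  6 * counts(dice)[6]
end

p counts([1, 1, 3, 6, 6])      # prints [0, 2, 0, 1, 0, 0, 2]
p sixes_score([1, 6, 1, 6, 4]) # prints 12
```

The loop has not vanished: it now lives in one place, where the canonical form is built, and the rules themselves no longer search.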

The straight rules are also easy, because C is a canonical form for them as well:


    def small_straight_score
      if C == [1,1,1,1,1,0] then 15 else 0
    end
    def large_straight_score
      if C == [0,1,1,1,1,1] then 20 else 0
    end

Now, how does this help us with the other rules? Take the Yahtzee rule, for instance. The naive solution


    def really_naive_yahtzee_score   
      for (i=1; i < 5 ; i++)
        return 0 if D[i-1] ≠ D[i]
      end
      return 50
    end

can be slightly improved by


    def slightly_less_naive_yahtzee_score     
      for d in D
        return 0 if D[0] ≠ d
      end
      return 50
    end

Using C does not improve much as we still have to search:


    def still_naive_yahtzee_score
      for c in C
        return 50 if c == 5
      end
      return 0
    end

This is because there are many different C that are equivalent with respect to the Yahtzee rule: for instance C = [0,0,0,5,0,0] and C = [0,5,0,0,0,0]. Can we apply the same reasoning and find another canonical representation? Why yes! If we sort C = [0,0,0,5,0,0] in descending order to obtain S = [5,0,0,0,0,0], the Yahtzee rule becomes very simple:


    def cool_yahtzee_score
      if S[0] == 5 then 50 else 0
    end

Many other rules are immediately codified this way:


    def four_of_a_kind_score
      if S[0] ≥ 4 then sum(D) else 0
    end
    
    def full_house_score
      if S[0] == 3 ∧ S[1] == 2 then 25 else 0
    end
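The rules based on S can be written as valid Ruby along these lines. This is a sketch: the helper sorted_counts is a name assumed for this example, and the dice are passed as a parameter.

```ruby
# Hypothetical helpers; the method names are assumptions for illustration.

# S: how many times each face 1..6 occurs, sorted in descending order.
def sorted_counts(dice)
  (1..6).map { |face| dice.count(face) }.sort.reverse
end

def yahtzee_score(dice)
  sorted_counts(dice)[0] == 5 ? 50 : 0
end

def four_of_a_kind_score(dice)
  sorted_counts(dice)[0] >= 4 ? dice.sum : 0
end

def full_house_score(dice)
  s = sorted_counts(dice)
  s[0] == 3 && s[1] == 2 ? 25 : 0
end

p sorted_counts([4, 4, 4, 4, 4])        # prints [5, 0, 0, 0, 0, 0]
p four_of_a_kind_score([2, 2, 2, 2, 5]) # prints 13
p full_house_score([3, 3, 3, 4, 4])     # prints 25
```

Sorting throws away which face was rolled and keeps only how the dice group together, which is all these rules care about.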

The pair rule is a bit more challenging: it is not part of the “official” rules but it makes for interesting coding :o). The rule is “Pair: The player scores the sum of the two highest matching dice. For example, 3, 3, 3, 4, 4 gives 8.” Using C requires searching. Using S would be no good (can you see why?)

Again: can you find a canonical representation for the pair rule so that we don’t have to search? Hint: remove “noise” to reveal information.

Conclusion

It’s important to find ways to remove IFs. It’s also important to find ways to remove FORs! I blogged about this before.

We used two ways to remove FORs here:

Use canonical representations, like C and S;
Hide them in well-known functions like sort and sum.

Our search for canonical forms helps us develop a language for reasoning effectively about our problem domain. In fact we are building a little theory.

Zero is a number

matteo — Fri, 06 Aug 2010 13:45:05 +0000

I won’t bore you with the story of how long it took for people to recognize that zero is a number. Without zero it would be difficult to explain what is the value of, say, 3 minus 3; we’d be forced to say that it’s a “meaningless” expression. Funny huh? Yet some developers seem to be stuck in medieval thinking in this respect.

Have you ever seen code like this?


public List findAllEmployeesByDepartment(int departmentId) {
  String sql = "select * from employees where department_id = ?";
  ResultSet rs = select(sql, departmentId);
  if (rs.size() == 0) {
    return null;
  } else {
    // ... convert the recordset to a List and return it
  }
}

This developer seems to think that an empty List is not a regular list, so he thinks he should return a special value like null to signal that the query returned no values. This is totally unnecessary. No, I take it back: this is totally wrong. You are forcing all callers of findAllEmployeesByDepartment to check for null. Not only that; this code seems to say that it’s a totally unnatural and unexpected thing for this query to return no rows. Soon developers will forget to check for null, and the application will throw NullPointerExceptions.
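For contrast, here is a sketch of the same query written so that “no rows” is just the empty collection. It is in Ruby and uses an in-memory array as a stand-in for the database; the EMPLOYEES constant and its contents are assumptions made for this example.

```ruby
# Hypothetical in-memory stand-in for the employees table.
EMPLOYEES = [
  { name: "Ada",   department_id: 1 },
  { name: "Grace", department_id: 1 },
  { name: "Linus", department_id: 2 }
].freeze

# Returns an Array in every case; "no rows" is simply the empty Array.
def find_all_employees_by_department(department_id)
  EMPLOYEES.select { |e| e[:department_id] == department_id }
end

# Callers need no special case: mapping over [] yields [].
p find_all_employees_by_department(1).map { |e| e[:name] }  # prints ["Ada", "Grace"]
p find_all_employees_by_department(99).map { |e| e[:name] } # prints []
```

The caller code is the same for one, many or zero employees, and there is no nil to forget about.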

A related example is:


Foo[] foos = ...;
if (foos.length > 0) {
  for (int i=0; i < foos.length; i++) {
    // do something with foos[i]
  }
}

Here the developer thinks that they have to treat the case of an empty array separately. In fact the IF is totally unnecessary. If the array is empty, the loop would execute zero times anyway. Java (and C) arrays use asymmetric bounds, which make it easier to write code that does not need to treat a zero-size interval as a special case.
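The same holds in other languages with half-open ranges. As a small Ruby sketch (the function name total is an assumption), the range 0...length includes 0 and excludes length, so for an empty array it contains no indices at all:

```ruby
# Sums the elements with an explicit index loop over the half-open
# range 0...length (0 included, length excluded).
def total(numbers)
  sum = 0
  (0...numbers.length).each { |i| sum += numbers[i] }
  sum
end

p total([3, 4, 5]) # prints 12
p total([])        # prints 0
```

No guard is needed for the empty case: the loop body runs zero times and the initial value is already the right answer.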

In conclusion: empty collections are perfectly valid collections, and empty arrays are perfectly valid arrays. It’s a good idea to write code that doesn’t treat “zero” as a special case.

This post is part of a series on development fundamentals.

TDD is no substitute for knowing what you are doing

matteo — Tue, 29 Jun 2010 11:26:53 +0000

Know your stuff

A while ago we had a fun evening at the Milano XPUG writing a Sudoku solver. I blogged about my solution. I’m not particularly proud of it, in retrospect. The code and the tests are not obvious. I can’t read any of it and be certain that it works. It does not speak.

It is true that solving puzzles like Sudoku is quite different from what application programmers do every day at work. Why is that? The problems that we solve in business applications do not have that mathematical crispness that puzzles have. Perhaps it’s because we’re not good enough at analyzing them and expressing them abstractly. That would explain why business code is so long, convoluted and expensive.

Anyway, the point I want to make is that it is not satisfying to use the tests in TDD as a crutch for constructing haphazard code that, with a kick here and a few hammer blows there, seems to work. The point of TDD is to *design* code; and a good design shows how and why a solution works.

I often see people doing katas that involve problems with well-known solutions. We usually disregard, forget, or ignore the well-known solution! And we keep writing tests and production code until we rig together something that passes the tests. It’s painful. I know. I too did some of that.

TDD does not work well when we don’t know what we’re doing. Some high-profile XPers failed to ship when TDDing their way with unfamiliar APIs or disregarding known solutions. TDD is no substitute for analyzing a problem, and finding abstractions that make it easy to solve. TDD without thinking and analyzing and abstracting is not fun!

It’s for this reason that there is the XP practice of “spiking solutions”, that is, take time to learn how to do something, make experiments, then apply what you learned. If you know how to do things, you will not spend hours discussing with your pair; you and your pair will grab the keyboard from each other, as Francesco Cirillo often says.

A better way

Consider Sudoku again. Peter Norvig solves it in two different ways by using known techniques. The first solution is depth-first search, which is guaranteed to terminate as the graph of Sudoku states is acyclic. The other is by constraint propagation. If I were to do the exercise again, I would try to make the analysis apparent from the code.

Say we want to solve it by depth-first search. That entails two sub-problems:

a) writing a depth-first algorithm
b) writing something that enumerates legal moves in a given Sudoku board

I would start by testing the depth-first search algorithm. I would drive it with an abstract “tree” implementation. This means I could concentrate on the search algorithm without being distracted by the complex details of the Sudoku rules.
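A minimal sketch of what that could look like in Ruby follows. The search knows nothing about Sudoku: it only needs a goal test and a way to list the children of a node. The function name, the lambda-based interface and the toy tree are all assumptions made for this example.

```ruby
# Hypothetical sketch: a generic depth-first search. It only needs a goal
# test and a way to list a node's children, both passed in as lambdas.
def depth_first_search(node, goal, children)
  return node if goal.call(node)

  children.call(node).each do |child|
    found = depth_first_search(child, goal, children)
    return found unless found.nil?
  end
  nil
end

# An abstract tree used only to drive the tests.
TREE = {
  root: [:a, :b],
  a:    [:a1, :a2],
  b:    [:b1],
  a1:   [],
  a2:   [],
  b1:   []
}.freeze

children = ->(node) { TREE.fetch(node) }

p depth_first_search(:root, ->(n) { n == :b1 }, children)   # prints :b1
p depth_first_search(:root, ->(n) { n == :nope }, children) # prints nil
```

Once this is trusted, solving Sudoku reduces to supplying a children function that enumerates legal moves and a goal test for a full board.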

Then I would test-drive the generation of next-moves from a Sudoku position. That could also be done incrementally. Can we imagine a simplified Sudoku board? The full Sudoku is too complex for the first test! A year ago I would have started by defining a 9 by 9 array of numbers, but now the sheer boredom of defining it would stop me. Is there a better way?

Relax. Think. Dream.

Think about the game terminology. As Norvig says, the game is about units (either a row, a column or a box). A unit with no empty spaces has no descendant moves. A unit where a number is missing has the full unit as a descendant move. A unit where two numbers are missing… You get the point.
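One possible reading of this, as a sketch: a unit is an array of n squares holding the digits 1..n, with 0 standing for an empty square. The encoding, the names `missingDigits` and `unitMoves`, and the choice to fill only the first empty square at each step are assumptions of mine, not something taken from Norvig’s code.

```javascript
// Digits in 1..n that do not yet appear in the unit (n = unit.length).
function missingDigits(unit) {
  const digits = [];
  for (let d = 1; d <= unit.length; d++) {
    if (!unit.includes(d)) digits.push(d);
  }
  return digits;
}

// Descendant moves of a unit: fill the first empty square (marked 0)
// with each missing digit. A full unit has no descendants.
function unitMoves(unit) {
  const empty = unit.indexOf(0);
  if (empty === -1) return [];
  return missingDigits(unit).map((d) => {
    const next = unit.slice(); // copy, so the original unit is untouched
    next[empty] = d;
    return next;
  });
}
```

With three-square units, `unitMoves([1, 2, 3])` has no moves, `unitMoves([1, 0, 3])` has the full unit as its only move, and a unit with two holes has two moves, each of which still has one hole to fill.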

Then work out how to combine descendant moves of two units that share a square. Think of a row and a column. If the common square is empty, then valid values for that square must be valid for both units…
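A sketch of that combination rule, under the same assumed encoding (arrays of digits, 0 for an empty square): the candidates for the shared square are the digits unused in both units. The names `unusedDigits` and `sharedSquareCandidates` are hypothetical.

```javascript
// Digits in 1..n not yet used in a unit (0 marks an empty square).
function unusedDigits(unit) {
  const digits = [];
  for (let d = 1; d <= unit.length; d++) {
    if (!unit.includes(d)) digits.push(d);
  }
  return digits;
}

// Candidates for an empty square shared by a row and a column:
// a digit qualifies only if it is unused in both units.
function sharedSquareCandidates(row, column) {
  const unusedInColumn = new Set(unusedDigits(column));
  return unusedDigits(row).filter((d) => unusedInColumn.has(d));
}
```

For a row `[0, 2, 0, 4]` and a column `[0, 3, 0, 0]`, the row still needs 1 and 3, the column still needs 1, 2 and 4, so the shared square can only hold 1.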

The point is to work the problem incrementally. Try smaller scales first. Try 2×2 boards. Make the size of units and the board parametric. Add the constraint rules one by one, so that you can test them separately.
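Here is a sketch of what a parametric board could look like. I assume square boxes of side `boxSize`, so the board side is `boxSize * boxSize`; the function name `boardUnits` and the `[row, column]` representation of a square are mine. With this particular parametrization the smallest non-trivial board is 4×4 (`boxSize = 2`); a literal 2×2 board would need rectangular boxes, or no boxes at all.

```javascript
// All units (rows, columns, boxes) of a board whose boxes are
// boxSize x boxSize. Each unit is an array of [row, column] squares.
function boardUnits(boxSize) {
  const side = boxSize * boxSize;
  const range = (n) => Array.from({ length: n }, (_, i) => i);

  const rows = range(side).map((r) => range(side).map((c) => [r, c]));
  const columns = range(side).map((c) => range(side).map((r) => [r, c]));

  const boxes = [];
  for (const boxRow of range(boxSize)) {
    for (const boxCol of range(boxSize)) {
      const box = [];
      for (const r of range(boxSize)) {
        for (const c of range(boxSize)) {
          box.push([boxRow * boxSize + r, boxCol * boxSize + c]);
        }
      }
      boxes.push(box);
    }
  }
  return [...rows, ...columns, ...boxes];
}
```

The tests can then be written against `boardUnits(2)`, which has 12 units of 4 squares each, and the real game is just `boardUnits(3)`, with 27 units of 9 squares.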

Conclusions

One important principle to apply is “separation of concerns”. Enumerating moves is one problem, and search strategy is another. By solving them separately, our tests become clearer and more to the point. We gain confidence that we know how and why our code works.

Another way to see this is to decompose a problem into smaller problems; prove with tests that you can solve the subproblems, then prove with tests that you can solve the composition of the subproblems.

When you have a problem that is of fixed size 42, turn that constant into a parameter and solve the problem for N=1, N=2, … Imagine if the Sudoku board were 100×100 instead of 9×9; would you define a 100×100 matrix in your tests? Turning these constants into parameters makes your code more general and your tests clearer, while making the problem *easier* to solve!

To summarize, what I think is important is

Learn data structures, algorithms, known solutions, the proper way of doing things.
Apply separation of concerns.
Solving a slightly more general problem is sometimes much easier than solving the actual problem.
It’s more fun to work when you know what you’re doing!

Update

Charlie Poole recently posted this on the TDD mailing list (Emphasis is mine):

I’ve written elsewhere that I believe attempting to get TDD to “drive” the invention of a new algorithm reflects an incorrect understanding of what TDD is for.

TDD allows us to express intent (i.e. design) in a testable manner and to move from intention to implementation very smoothly – I know of no better way.

OTOH, you have to start out with an intent. In this context, I think that means you need to have some idea of the algorithm you want to implement. TDD will help you implement it and even refine the details of the idea. Writing tests may also inspire you to have further ideas, to deepen the ones you started with or to abandon ideas that are not working out.

Vlad Levin blogs thoughtfully:

one of the first rules I teach my students when I am doing a TDD workshop or teaching a course is precisely that TDD is not an algorithm generator! Solving sudoku is just the kind of problem you want to find an algorithm for first, then implement that algorithm

[…]

So what is the purpose of TDD then? One goal of TDD is to reduce the need to determine ahead of time which classes and methods you’re going to implement for an entire story. There’s a large body of shared experience in the developer community that trying to anticipate such things tends to lead to paralysis where nothing useful gets done and/or produces bloated, over-designed code. Instead, you can develop one aspect of the story at a time, using each test to keep yourself moving forward and refactoring the design as you go along

A semi-forgotten design principle

matteo — Fri, 11 Sep 2009 08:41:36 +0000

The common wisdom is that Ruby is slow and Java is fast. In general, it’s true. But is it always? Look at this simple test.


$ cat hello.rb 
puts "Hello world!"
$ ruby hello.rb 
Hello world!
$ time ruby hello.rb 
Hello world!

real	0m0.008s
user	0m0.004s
sys	0m0.003s
$

So it looks like it takes 8ms to run a simple “hello, world” in Ruby. How does Java compare to this?


$ cat Hello.java 
public class Hello {
    public static void main(String ... args) {
        System.out.println("Hello, world!");
    }
}
$ 
$ javac Hello.java 
$ java Hello
Hello, world!
$ time java Hello
Hello, world!

real	0m0.122s
user	0m0.061s
sys	0m0.028s
$

Even if we ignore the time it takes to compile the Java program, it looks like running “Hello, world” in Java takes about 15 times longer than in Ruby (122ms against 8ms). This is due to the long startup time of the Java Virtual Machine. The times you see here are taken on my MacBook Pro; they will be different on other operating systems, but not much different.

So what, you will say? “The startup time is not important! As soon as the JVM is up and running, Java can run circles around Ruby.”

I don’t agree that startup times are not important. The startup time for Java becomes much worse when you run complex applications. A vanilla Tomcat with no web applications installed takes about one minute to start up. Compare with WEBrick, the Ruby web server, which is up and running with my web application in 3 seconds. The difference in startup times makes all the difference in the world when you’re developing software. It takes at least one minute, often much longer, to start up a Java application so that I can try it. There are times when you’re developing an application when you need to test it after each tiny change. It’s very difficult to do that in Java. The problem is made much worse by the fact that in general Java “containers” can’t reload changed classes without a restart. (WEBrick can do that.)

What, you will say? “Matteo gave up TDD! He tests applications manually by clicking around like a monkey!” No, really, it’s not like this. I always write production code with TDD. That does not mean that you *never* test your stuff manually on the live application. Quite the opposite: there is a danger, with new converts to unit testing, that we trust our tests too much. I’ve seen people declare a story “finished” when all the unit tests are green, without ever checking if it really works! And of course, if you never test it manually, it will not work. There is a need for manual testing (some call it exploratory testing), even if you’re Kent Beck or Misko Hevery.

So I hope you’ll agree with me that short startup times are important for developers. But there are other implications. The fact that it takes a lot of time to start up a Java application means that Java developers are trained to write applications in a single JVM process. For instance, we often see dozens of web applications running in a single Tomcat. If you need concurrent operations, the solution is always to run more threads within the same process. And there is a big problem with this.

The operating system’s concept of a “process” is a very useful one. A “process” is a bundle of threads and resources: memory, open files, network connections, and the like. A process in Unix or Windows is a watertight compartment. When a process terminates, *all* of its resources are released. A process cannot easily corrupt the state of another process. A process can be given limits on how much memory or CPU it can take. It’s very useful to organize a concurrent application as a set of cooperating operating system processes. That’s the way the Apache HTTP server works, and it is remarkably reliable software. It’s also one of the smart ideas in the Chrome browser, to run each tab in a separate process.
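As a small illustration of the watertight compartment, here is a sketch using Node.js and its standard `child_process` module (my choice for the example; the same idea applies in any language). The helper name `runInChildProcess` is hypothetical. Each snippet runs in its own operating system process, so a crash in one is reported to the parent as an exit code and nothing more.

```javascript
const { spawnSync } = require("child_process");

// Run a snippet of JavaScript in a separate operating system process
// and report how it ended. process.execPath is the running Node binary.
function runInChildProcess(source) {
  const result = spawnSync(process.execPath, ["-e", source], {
    encoding: "utf8",
  });
  return { exitCode: result.status, output: result.stdout.trim() };
}

// One child finishes normally, the other dies with an uncaught error.
const finished = runInChildProcess('console.log("done")');
const crashed = runInChildProcess('throw new Error("boom")');

// The parent is still alive and can decide what to do about the failure.
console.log(finished.exitCode, finished.output);
console.log(crashed.exitCode !== 0 ? "child failed, parent unaffected" : "child ok");
```

The crashing child takes its memory and open files with it; the parent sees only a non-zero exit code.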

It’s a good design principle to have many small modules communicating through well-defined interfaces, rather than a single monolith where all the threads can interact in unforeseen ways. It’s also the way of Unix to design applications as collections of small communicating processes. Which makes me wonder how Sun could ever get us to believe that it’s a good idea to put all of our eggs in a single, huge process. Should not Sun be the champion of the Unix way? But I digress.

In conclusion, I claim that designing applications as small cooperating operating system processes is a good principle. Current Java practice runs against this, but it need not.