Kent Beck’s Simplification strategy

I finally got round to watching Kent Beck’s video on Responsive Design. I found it very interesting, speaking as someone who has been doing test-driven development for years, and who for years has watched people work with TDD.

I’d like to comment on an interesting part of this video, the one where he talks about the Four Strategies, which are: Leap, Parallel, Stepping Stone and Simplification. Carlo Pescio says he is unsure about the difference between Stepping Stone and Simplification; but I think I grok what Kent means. I think that Simplification is a crucial strategy that is enabled by TDD; it would be too difficult and expensive to do without tests. On the other hand, if you do TDD, you need to understand Simplification, otherwise your tests will not sing.

So what is the Simplification strategy? Kent says (around 00:58):

I can imagine that I want to get over here, so what is the least I can do that would be progress? Suppose I want a big multi-threaded application, so I think, I can’t do a big multi-threaded application in one step. It’s too big a step, it’s not a safe step. So what am I going to do? Well, there are different possibilities, but the one I do all the time is to say, “Well, which of those am I going to eliminate? Is it going to be a multi-threaded application doing something trivial, or is it going to be a single-threaded application doing something more substantial?” […] I do this *further* than most people that I encounter. Someone says “I want a Sudoku solver and I do a 16 by 16 grid” and I say “Well, how about a 1 by 1 grid?”

I think that the Sudoku example nails it. Let’s see where the “one-cell Sudoku” leads. What would the test look like? I suppose it would be something like this:

def test_solve_one_cell_sudoku
  sudoku = Sudoku.new(1)
  sudoku.solve
  assert_equal "1", sudoku[0,0]
end

Any number would solve the one-cell Sudoku, so I arbitrarily chose “1” as the solution. What would the next test look like? I would write a two-cell Sudoku. Mind you, not a 2-by-2 Sudoku; that would contain four cells! Just a non-square, two-cell Sudoku:

def test_solve_two_cells_sudoku
  sudoku = Sudoku.new(1, 2) # one row, two columns
  sudoku.solve
  assert_equal "1", sudoku[0,0]
  assert_equal "2", sudoku[0,1]
end

Again, any two numbers would solve the two-cell Sudoku, as long as they are different. So I arbitrarily chose “1” and “2”.

Now, these arbitrary choices are starting to bug me. If it’s true that any other two distinct numbers would do, why do I over-constrain the test? By forcing “1” and “2” I’m placing an extra constraint that might make things more difficult later. I could make my tests less brittle by specifying exactly what I want:

def test_solve_one_cell_sudoku
  sudoku = Sudoku.new(1)
  sudoku.solve
  assert_is_digit sudoku[0,0]
end

def assert_is_digit(string)
  # anchored so that only a single digit matches, not any string containing one
  assert_match(/\A\d\z/, string)
end

and

def test_solve_two_cells_sudoku
  sudoku = Sudoku.new(1, 2) # one row, two columns
  sudoku.solve
  assert_is_digit sudoku[0,0]
  assert_is_digit sudoku[0,1]
  assert_not_equal sudoku[0,0], sudoku[0,1]
end

OK, now the tests are less brittle, but they are also much less clear! I like my tests to be very simple examples of what the production code does. I don’t like it when I have to think about what the tests mean.

Big flash!

I now understand that the indeterminacy of the original tests comes from the fact that the solver can choose from an alphabet of symbols that is larger than the Sudoku problem. If I constrain the first test to “choose” its symbol from the alphabet that contains only the “1” symbol, then the indeterminacy disappears.

def test_solve_one_cell_sudoku
  sudoku = Sudoku.new(["1"], 1)
  sudoku.solve
  assert_equal "1", sudoku[0,0]
end

Same for the second test:

def test_solve_two_cells_sudoku
  sudoku = Sudoku.new(["1", "2"], 1, 2)
  sudoku.solve
  assert_equal "1", sudoku[0,0]
  assert_equal "2", sudoku[0,1]
end

So the Simplification strategy led me to discover a concept, the alphabet, that I was previously ignoring. Continuing along this route, I will probably be led to “discover” the concept of constraint, that will be a useful Stepping Stone to solve the whole Sudoku; not to mention that constraints will be useful to solve Sudoku variants as well. On the other hand, if I try to solve the whole 9×9 Sudoku problem from the start, I will probably end up writing procedural crap, just as I previously did :-)
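To make this concrete, here is a minimal sketch of production code that would make the one-cell and two-cell tests above pass. The class layout, the default for the column count and the error handling are my assumptions for illustration; the only things taken from the tests are the constructor arguments, `solve` and the `[row, column]` accessor.

```ruby
# Hypothetical sketch: just enough code for the one-cell and two-cell tests.
# It is deliberately incomplete and only handles a single row.
class Sudoku
  # alphabet: the symbols the solver may use, e.g. ["1", "2"]
  # rows, columns: grid size; columns defaults to rows (an assumption)
  def initialize(alphabet, rows, columns = rows)
    @alphabet = alphabet
    @rows = rows
    @columns = columns
    @cells = Array.new(rows) { Array.new(columns) }
  end

  # Fills the single row with the first symbols of the alphabet, in order.
  # All symbols in the alphabet are assumed to be distinct.
  def solve
    raise NotImplementedError, "only single-row grids so far" unless @rows == 1
    raise ArgumentError, "alphabet too small" if @columns > @alphabet.size

    @cells[0] = @alphabet.first(@columns)
    self
  end

  def [](row, column)
    @cells[row][column]
  end
end
```

Because the alphabet has exactly as many symbols as the row has cells, there is nothing arbitrary left for the solver to choose, which is why the tests can go back to plain `assert_equal`.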

Let me give another example of Simplification: suppose that you have to write a batch command that produces a report out of web access log files. This is not a theoretical example; I had to do that more than once myself, and the team I’m currently coaching had to solve this exact problem. Suppose that the output we want looks like this:

date          2xx-results    3xx-results   4xx-results   5xx-results
29/Jul/2011          1223             23           456            12
01/Aug/2011          1212             24            11           123

As a TDD beginner I would have started with testing an empty report (that is easy to do, but does not teach you much) and then, for a second test, a one-line report like this:

date          2xx-results    3xx-results   4xx-results   5xx-results
29/Jul/2011             1              0             0             0

The nasty thing is that this second test contains most of the complexity of the whole problem. It’s too big a step. If I try to solve it in one Leap, it will probably lead to procedural crap. It is better to use the Child Test pattern (from TDD By Example, p. 143). But I don’t have much guidance on how to choose my Child Test. What objects do I need? As a TDD beginner I would often come up with silly objects that would be just procedural crap disguised by objects.

Here is where I would do best to use a Strategy. The Stepping Stone strategy could help: for instance, I could invent a DSL for web access reports. If I did that, then writing my original report would be easy!

Or I could use the Simplification strategy: start with a one-column, one-row report. That would give me the guidance I need to find the next small, safe step. I prepared a kata for this problem; I will probably publish it on GitHub some day.
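One possible reading of the one-column, one-row report is a report with a single cell: the number of 2xx results in the log. The sketch below is only an illustration of that first step. The `AccessReport` name is hypothetical, and I am assuming the logs are in Common Log Format, where the status code follows the quoted request line.

```ruby
# Hypothetical sketch of a one-cell report: how many 2xx results are in a log.
# Assumes Common Log Format; no dates and no other columns yet.
class AccessReport
  # Captures the three-digit status code that follows the quoted request
  STATUS = /" (\d{3}) /

  def initialize(log_lines)
    @log_lines = log_lines
  end

  # The single cell of the simplified report, as a string
  def to_s
    @log_lines.count { |line| success?(line) }.to_s
  end

  private

  def success?(line)
    match = STATUS.match(line)
    !match.nil? && match[1].start_with?("2")
  end
end

log = [
  '10.0.0.1 - - [29/Jul/2011:10:00:00 +0200] "GET / HTTP/1.1" 200 512',
  '10.0.0.1 - - [29/Jul/2011:10:00:05 +0200] "GET /missing HTTP/1.1" 404 128'
]
puts AccessReport.new(log).to_s # prints 1
```

From here the next small steps suggest themselves: a second column for 3xx results, then a second row for a second date.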

3 Responses to “Kent Beck’s Simplification strategy”

  1. Luca Minudel Says:

    Hei Matteo,

    while many are in a hurry to voice their opinions or criticize someone else’s ideas, it is fascinating to see that there are still people who are willing to understand first.

    I haven’t found it easy from the video to really get the definitions of all the design strategies Kent Beck described.

    At this link you can also find the slides. They helped me a little, but I’m still not sure I perfectly understood Kent’s ideas:
    http://blogs.ugidotnet.org/luKa/archive/0001/01/01/design-strategies-kent-beck-presentation.aspx

    I’m eager to read more posts from you on this topic. A second point of view will help me for sure.

  2. Kevin Rutherford Says:

    Hi Matteo, These are really nice examples, and definitely help to clarify Kent’s explanations and approach.

    In the logfile example (having also done it myself several times :), my first cut is to check that I can summarize the input correctly. So one column first, with increasing complexity in the file. Only then would I think about splitting out the other columns.

    Keep up the great writing — thanks!

  3. matteo Says:

    Hi Luca and Kevin,

    thanks for your kind comments. I just published a followup to this post :)
