Kent Beck’s Simplification strategy
I finally got round to watching Kent Beck’s video on Responsive Design. I found it very interesting, as someone who has been doing test-driven development for years, and who for years has watched people work with TDD.
I’d like to comment on an interesting part of this video, the one where he talks about the Four Strategies, which are: Leap, Parallel, Stepping Stone and Simplification. Carlo Pescio says he is unsure about the difference between Stepping Stone and Simplification; but I think I grok what Kent means. I think that Simplification is a crucial strategy that is enabled by TDD; it would be too difficult and expensive to do without tests. On the other hand, if you do TDD, you need to understand Simplification, otherwise your tests will not sing.
So what is the Simplification strategy? Kent says (around 00:58):
I can imagine that I want to get over here, so what is the least I can do that would be progress? Suppose I want a big multi-threaded application, so I think, I can’t do a big multi-threaded application in one step. It’s too big a step, it’s not a safe step. So what am I going to do? Well, there are different possibilities, but the one I do all the time is to say, “Well, which of those am I going to eliminate? Is it going to be a multi-threaded application doing something trivial, or is it going to be a single-threaded application doing something more substantial?” […] I do this *further* than most people that I encounter. Someone says “I want a Sudoku solver and I do a 16 by 16 grid” and I say “Well, how about a 1 by 1 grid?”
I think that the Sudoku example nails it. Let’s see where the “one-cell Sudoku” leads. What would the test look like? I suppose it would be something like:

def test_solve_one_cell_sudoku
  sudoku = Sudoku.new(1)
  sudoku.solve
  assert_equal "1", sudoku[0,0]
end
Any number would solve the one-cell Sudoku, so I arbitrarily chose “1” as the solution. What would the next test look like? I would write a two-cell Sudoku. Mind you, not a 2-by-2 Sudoku; that would contain four cells! Just a non-square, two-cell Sudoku:

def test_solve_two_cells_sudoku
  sudoku = Sudoku.new(1, 2) # one row, two columns
  sudoku.solve
  assert_equal "1", sudoku[0,0]
  assert_equal "2", sudoku[0,1]
end
Again, any two numbers would solve the two-cell Sudoku, as long as they are different. So I arbitrarily chose “1” and “2”.
Now, these arbitrary choices are starting to bug me. If it’s true that any other two distinct numbers would do, why do I over-constrain the test? By forcing “1” and “2” I’m placing an extra constraint that might make things more difficult later. I could make my tests less brittle by specifying exactly what I want:
def test_solve_one_cell_sudoku
  sudoku = Sudoku.new(1)
  sudoku.solve
  assert_is_digit sudoku[0,0]
end

def assert_is_digit(string)
  # Anchored so that only a single digit matches
  assert_match(/\A\d\z/, string)
end
and
def test_solve_two_cells_sudoku
  sudoku = Sudoku.new(1, 2) # one row, two columns
  sudoku.solve
  assert_is_digit sudoku[0,0]
  assert_is_digit sudoku[0,1]
  assert_not_equal sudoku[0,0], sudoku[0,1]
end

OK, now the tests are less brittle, but they are also much less clear! I like my tests to be very simple examples of what the production code does. I don’t like it when I have to think about what the tests mean…
Big flash!
I now understand that the indeterminacy of the original tests comes from the fact that the solver can choose from an alphabet of symbols that is larger than the Sudoku problem needs. If I constrain the first test to “choose” its symbol from the alphabet that contains only the “1” symbol, then the indeterminacy disappears.

def test_solve_one_cell_sudoku
  sudoku = Sudoku.new(["1"], 1)
  sudoku.solve
  assert_equal "1", sudoku[0,0]
end
Same for the second test:
def test_solve_two_cells_sudoku
  sudoku = Sudoku.new(["1", "2"], 1, 2)
  sudoku.solve
  assert_equal "1", sudoku[0,0]
  assert_equal "2", sudoku[0,1]
end
So the Simplification strategy led me to discover a concept, the alphabet, that I was previously ignoring. Continuing along this route, I will probably be led to “discover” the concept of constraint, which will be a useful Stepping Stone to solve the whole Sudoku; not to mention that constraints will be useful to solve Sudoku variants as well. On the other hand, if I try to solve the whole 9×9 Sudoku problem from the start, I will probably end up writing procedural crap, just as I previously did :-)
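To make this concrete, here is a minimal sketch of production code that would pass the two alphabet-based tests above. It is only one possible shape, not the code from the video or from my kata: the default for the columns argument, the backtracking approach and the private allowed? helper are all assumptions made for illustration. The only constraint it knows about is “no repeated symbol in a row or column”; there are no boxes yet.

```ruby
# Hypothetical sketch: one way to pass the two alphabet-based tests.
class Sudoku
  # columns defaults to rows (an assumption), so Sudoku.new(["1"], 1)
  # builds a grid with a single cell.
  def initialize(alphabet, rows, columns = rows)
    @alphabet = alphabet
    @rows = rows
    @columns = columns
    @cells = Array.new(rows) { Array.new(columns) }
  end

  def [](row, column)
    @cells[row][column]
  end

  # Fills the cells one by one with backtracking.
  # Returns true when every cell has been assigned a symbol.
  def solve(index = 0)
    return true if index == @rows * @columns
    row, column = index.divmod(@columns)
    @alphabet.each do |symbol|
      next unless allowed?(symbol, row, column)
      @cells[row][column] = symbol
      return true if solve(index + 1)
    end
    @cells[row][column] = nil
    false
  end

  private

  # The only constraint so far: no repeated symbol in a row or a column.
  def allowed?(symbol, row, column)
    return false if @cells[row].include?(symbol)
    @cells.none? { |cells_in_row| cells_in_row[column] == symbol }
  end
end
```

Because the symbols are tried in the order given by the alphabet, the one-row, two-column case is filled with “1” and then “2”, which is what the second test expects. The allowed? method is the place where the idea of a constraint starts to show up.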
Let me give another example of Simplification: suppose that you have to write a batch command that produces a report out of web access log files. This is not a theoretical example; I had to do that more than once myself, and the team I’m currently coaching had to solve this exact problem. Suppose that the output we want looks like this:
date         2xx-results  3xx-results  4xx-results  5xx-results
29/Jul/2011  1223         23           456          12
01/Aug/2011  1212         24           11           123

As a TDD beginner I would have started with testing an empty report (that is easy to do, but does not teach you much) and then, for a second test, a one-line report like this:

date         2xx-results  3xx-results  4xx-results  5xx-results
29/Jul/2011  1            0            0            0

The nasty thing is that this second test contains most of the complexity of the whole problem. It’s too big a step. If I try to solve it in one Leap, it will probably lead to procedural crap. It is better to use the Child Test pattern (from TDD By Example, p. 143). But I don’t have much guidance on how to choose my Child Test. What objects do I need? As a TDD beginner I would often come up with silly objects that were just procedural crap disguised as objects.
Here is where I would do best to use a Strategy. The Stepping Stone strategy could help: for instance, I could invent a DSL for web access reports. If I did that, then writing my original report would be easy!
Or I could use the Simplification strategy: start with a one-column, one-row report. That would give me the guidance I need to find the next small, safe step. I prepared a kata for this problem; I will probably publish it on github some day.
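As an illustration of how small that first step can be, here is a sketch of a one-column, one-row report: a single number, the count of 2xx responses. Everything in it is an assumption made for the example, not the content of the kata: the AccessReport class, the count_2xx method, and the choice of Common Log Format for the input lines.

```ruby
# Hypothetical sketch: class name, method name and log format are assumptions.
class AccessReport
  def initialize(log_lines)
    @log_lines = log_lines
  end

  # Counts the responses whose status code is in the 2xx range.
  def count_2xx
    @log_lines.count { |line| status_of(line).between?(200, 299) }
  end

  private

  # In Common Log Format the status code follows the quoted request.
  # Lines without a recognizable status code yield 0 and are not counted.
  def status_of(line)
    line[/" (\d{3}) /, 1].to_i
  end
end

lines = [
  '127.0.0.1 - - [29/Jul/2011:10:00:00 +0200] "GET / HTTP/1.1" 200 512',
  '127.0.0.1 - - [29/Jul/2011:10:00:01 +0200] "GET /x HTTP/1.1" 404 128',
]
puts AccessReport.new(lines).count_2xx
```

With the two sample lines the script prints 1. From here the next small steps suggest themselves: group the count by date to get more rows, then add the 3xx, 4xx and 5xx columns one at a time.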
August 12th, 2012 at 00:57
Hei Matteo,
while many are in a hurry to voice their opinions or criticize someone else’s ideas, it is fascinating to see that there are still people who are willing to understand first.
I haven’t found it easy from the video to really get the definitions of all the design strategies Kent Beck described.
At the following link you can also find the slides. They helped me a little, but I’m still not sure I fully understood Kent’s ideas:
http://blogs.ugidotnet.org/luKa/archive/0001/01/01/design-strategies-kent-beck-presentation.aspx
I’m eager to read more posts from you on this topic. A second point of view will surely help me.
August 12th, 2012 at 16:36
Hi Matteo, These are really nice examples, and definitely help to clarify Kent’s explanations and approach.
In the logfile example (having also done it myself several times :), my first cut is to check that I can summarize the input correctly. So one column first, with increasing complexity in the file. Only then would I think about splitting out the other columns.
Keep up the great writing — thanks!
August 15th, 2012 at 15:32
Hi Luca and Kevin,
thanks for your kind comments. I just published a followup to this post :)