Archive for the 'Expressive Code' Category

Anti-FOR tips from the Yahtzee Kata

Saturday, May 14th, 2011

Again on the Kata Yahtzee, that I blogged about some time ago.

If you have not solved the kata at least once, please stop reading this! Get back when you have.

*         *

Good to see you again! Now that you solved it, you probably know that the naive solution takes many “for” loops. Let D be the player dice, represented as an array of die results, e.g., D=(1,6,1,6,4). The naive rules for sixes would be

    def sixes_score 
      sum = 0
      for d in D
        if d == 6
          sum += 6
      return sum

This solution involves searching for sixes and adding up. Why do we need to search? We need to search because there are many different D that are worth exactly the same for the sixes rule. For instance, both D=(1,2,3,6,6) and D=(6,6,1,2,3) are worth 12.


When refactoring is no use

Sunday, November 7th, 2010

TDD is about design. It’s not about testing.

Last week an interesting thing happened while I was working with a team. They had been writing a new application with TDD since last June, and they were successful in that the application was nearly ready, on time and on budget. But there was one thing that was worrying them.

They had been following an article on the Model-View-Controller pattern, and had structured their application accordingly. There was a controller for every window, and there was a model too. (It’s not a web application, it’s Swing.) The view object was extremely thin, as it only contained code for constructing and laying out widgets; absolutely no other logic in there. The controller hooked listeners to the appropriate UI widgets.

So everything was OK, no? A clear architecture, a clear separation of concerns, everything tested. There was a problem: the controllers were big objects, with intricate logic. The tests were difficult to read and difficult to write. It was difficult to understand what was being tested. My worry was that maintenance of that GUI could quickly turn into a mess. Also my customer was worried about that. The fact is, whenever anything had to be changed about the logic of that window, the controller would have to be changed, and the view would have to be changed, and the model would have to be changed. We had three big monoliths that were involved in everything. This ran against the open-closed principle.

My first reaction was to call upon my refactoring skills. Take the unreadable tests and make them readable. Use extract method and expressive names to make them read like a story about what was being tested. Some success there, but not a clear victory. It was still difficult to find the code being tested, as those were end-to-end tests. Then I found the bit that contained the important logic and wrote proper unit tests for it. Some success, as I was able to test with precision all the corner cases that the end-to-end test was not exercising. But still not a clear victory. Then I started to work on the big controller. I took the worst one with me in my hotel room, and I spent a chunk of my evening trying to refactor it. Extract method, rename, extract class. Extract, extract, extract. It was still big and ugly. With a bit of polish, but still big and ugly. Still running against the OCP.

The next day I pulled a different set of tools. I remembered the lessons in Francesco Cirillo‘s Emergent Design Workshop. I showed the team how to use Francesco’s technique, which is to start with a design of the communicating objects on the whiteboard. After one day of practice with this technique, we attacked the real problem. We started designing the main application window from scratch.

The team was surprised, and I was surprised as well, because the big controller evaporated. It was clear, when working at the whiteboard, that it was not necessary. The model object evaporated too. We designed the main scenarios, and we saw that a simple design could accommodate all of them. At first I sketched the design for them, but they were soon criticizing my design and changing it to suit their needs and tastes. Then we started coding; it went smoothly because the design was already done, and we were taking advantage of all the low-level objects that they had implemented already.

We learned is that when the design is wrong, no amount of refactoring moves will get you out of the hole.

TDD is no substitute for knowing what you are doing

Tuesday, June 29th, 2010

Know your stuff

A while ago we had a fun evening at the Milano XPUG writing a Sudoku solver. I blogged about my solution. I’m not particularly proud of it, in retrospect. The code and the tests are not obvious. I can’t read any of it and be certain that it works. It does not speak.

It is true that solving puzzles like Sudoku is quite different from what application programmers do everyday at work. Why is it that? The problems that we solve in business applications do not have that mathematical crispness that puzzles have. Perhaps it’s because we’re not good enough at analyzing them and expressing them abstractly. That would explain why business code is so long, convoluted and expensive.

Anyway, the point I want to make is that it is not satisfying to use the tests in TDD as a crutch for constructing hapazard code that, with a kick here and a few hammer blows there seem to work. The point of TDD is to *design* code; and a good design shows how and why a solution works.

I often see people doing katas that involve problems with well-known solutions. We usually disregard, forget, or ignore the well-known solution! And we keep writing tests and production code until we rig together something that passes the tests. It’s painful. I know. I too did some of that.

TDD does not work well when we don’t know what we’re doing. Some high-profile XPers failed to ship when TDDing their way with unfamiliar APIs or disregarding known solutions. TDD is no substitute for analyzing a problem, and finding abstractions that make it easy to solve. TDD without thinking and analyzing and abstracting is not fun!.

It’s for this reason that there is the XP practice of “spiking solutions”, that is, take time to learn how to do something, make experiments, then apply what you learned. If you know how to do things, you will not spend hours discussing with your pair; you and your pair will grab the keyboard from each other, as Francesco Cirillo often says.

A better way

Consider Sudoku again. Peter Norvig solves it in two different ways by using known techniques. The first solution is depth-first search, which is gueranteed to terminate as the graph of Sudoku states is acyclic. The other is by constraint propagation. If I were to do the exercise again, I would try to make the analysis apparent from the code.

Say we want to solve it by depth-first search. That entails two sub-problems:

  • a) writing a depth-first algorithm
  • b) writing something that enumerates legal moves in a given Sudoku board

I would start by testing the depth-first search algorithm. I would drive it with an abstract “tree” implementation. This means I could concentrate on the search algorithm without being distracted by the complex details of the Sudoku rules.

Then I would test-drive the generation of next-moves from a Sudoku position. That could also be done incrementally. Can we imagine a simplified Sudoku board? The full Sudoku is too complex for the first test! A year ago I would have started by defining a 9 by 9 array of numbers, but now the sheer boredom of defining it would stop me. Is there a better way?

Relax. Think. Dream.

Think about the game terminology. As Norvig says, the game is about units (either a row, a column or a box). A unit with no empty spaces has no descendant moves. A unit where a number is missing has the full unit as a descendant move. A unit where two numbers are missing… You get the point.

Then work out how to combine descendant moves of two units that share a square. Think a row and a column. If the common square is empty, than valid solutions for that square must be valid for both units…

The point is to work the problem incrementally. Try smaller scales first. Try 2×2 boards. Make the size of units and the board parametric. Add the constraint rules one by one, so that you can test them separately.


One important principle to apply is “separation of concerns”. Enumerating moves is a problem, and search strategy is another. By solving them separately, our tests become more clear and to the point. We gain confidence that we know how and why our code works.

Another way to see this is to decompose a problem in smaller problems; prove with tests that you can solve the subproblems, then prove with tests that you can solve the composition of the subproblems.

When you have a problem that is of fixed size 42, turn that constant into a parameter and solve the problem for N=1, N=2, … Imagine if the Sudoku board was 100×100 instead of 9×9; would you define a 100×100 matrix in your tests? Turning these constants into parameters make your code more general, your tests more clear, while making the problem *easier* to solve!

To summarize, what I think is important is

  • Learn data structures, algorithms, known solutions, the proper way of doing things.
  • Apply separation of concerns.
  • Solving a slightly more general problem sometimes is much easier than solving the actual problem
  • It’s more fun to work when you know what you’re doing!


Charlie Poole recently posted this on the TDD mailing list (Emphasis is mine):

I’ve written elsewhere that I believe attempting to get TDD to “drive” the invention of a new algorithm reflects an incorrect understanding of what TDD is for.

TDD allows us to express intent (i.e. design) in a testable manner and to move from intention to implementation very smoothly – I know of no better way.

OTOH, you have to start out with an intent. In this context, I think that means you need to have some idea of the algorithm you want to implement. TDD will help you implement it and even refine the details of the idea. Writing tests may also inspire you to have further ideas, to deepen the ones you started with or to abandon ideas that are not working out.

Vlad Levin blogs thoughtfully:

one of the first rules I teach my students when I am doing a TDD workshop or teaching a course is precisely that TDD is not an algorithm generator! Solving sudoku is just the kind of problem you want to find an algorithm for first, then implement that algorithm


So what is the purpose of TDD then? One goal of TDD is to reduce the need to determine ahead of time which classes and methods you’re going to implement for an entire story. There’s a large body of shared experience in the developer community that trying to anticipate such things tends to lead to paralysis where nothing useful gets done and/or produces bloated, over-designed code. Instead, you can develop one aspect of the story at a time, using each test to keep yourself moving forward and refactoring the design as you go along

The geometry lab — an exercise

Tuesday, June 15th, 2010

Last week I was traning a team on XP techniques. We tried the following exercise:

I want you people to build me a Swing application that computes the area of a square with a given side length.

I asked for an estimate. The devs were nervous, someone said “impossible!” :-) someone said 5 hours. I played the part of the project-manager-who-was-once-a-developer and said “come on, five hours?? I could do that in 10 minutes in my sleep. What’s so difficult about it”? Then I reasoned with them that if we keep our estimates too comfortable, our business opportunities may fly out of the window. They agreed on a 2 hours estimate.

They proceeded to implement the feature. The three devs rotated every 7 minutes. This was a good slot size; everyone was involved, even the junior one who is rarely given the keyboard. The feature was done in one and a half hour. Then I said

Cool. Now we need to compute the area of a triangle of a given base and height. How much time for this?

The devs estimated 1.5 hours. It was delivered on time. Now the fun part started. The team wrote the application in the “usual” way, by writing new code for the new window. No effort was spent, at this time, to reduce duplication. I pointed out that

We’re going to need to implement many more of these geometry formulae. Make it so that it is trivial to add others.

The team came up with a design where the Swing window object is generic and can be customized to support the input for any formula that requires a variable number of inputs with different names. They thought they could do it in 2 hours. It took 4. At some point we wasted a lot of time on Swing layouts, trying to fathom the mysteries of GroupLayout. I gave some help here. Then we were done! Stepping again in my role of customer I said

Very well. The next feature we need is to compute the area of a circle from the radius.

It was done in 10 minutes. The customer was very satisfied, and so were the devs.

What have we learned?

  • I have learned the power of letting the team come up with their own design. It’s difficult for me, an xp-trainer-who-was-once-a-developer, to give up giving guidance on design. But time and again, I have seen the damage of doing so: the team follows my design, gets bogged down, does not learn.
  • We have learned how hard it is to make the code easy to change. It would have been easy to declare we were “done” after the area of triangle was working. But we were not really “done” from the point of view of TDD. Remember, the cycle is red-green-REFACTOR, and by “refactor” what is really meant is “remove duplication”.
  • Once you get to clean, refactored code, the cost of changes drops. And it’s a pleasure to work with!
  • The decision to invest time in making the code generic might seem difficult. After all, you can get skilled at copy-pasting Swing code and writing many copies of the Swing form class. But then you are left with gobs of code. And good luck applying a different graphic layout to them all! My answer is that we should get skilled at writing flexible code. It took us 4 hours to make the code generic. Next time they have to do something similar, it will take less.

    Copying-and-pasting is a dead end; there is a limit at how skilled you can become at it, and there is certainly a big problem in the quality of the code you deliver. Learning to do good, clean, flexible code never ends. It’s a path where you can get to write better and better code. Which path would you rather be on?

Very cool, guys. This geometry app rocks!!

Design problem #2

Monday, June 14th, 2010

This is a subset of the Back to the Checkout kata by Dave Thomas, which I used many times as a TDD training exercise.

Suppose you have a PriceRules object that knows that the prices of items. Its responsibility is to know the following table:

Item Unit Price Special Price
A 50 3 for 130
B 30 2 for 45
C 20
D 15

Then you have a Cart object that knows which items a customer is trying to buy. For instance, a given cart could contain the list [A, A, C, A, D].

The problem is to compute the total that the customer has to pay, out of a collaboration between (at least) the cart and the pricerRules objects. It seems easy, but there is a catch: you are forbidden to use getters. All methods must return “void”, in Java terms. Design the messages that are exchanged between the objects and produce the desired result. (In the example, it would be 165). Have fun!

Software Design problems, anyone?

Sunday, June 13th, 2010

You learn math by solving problems. Problems frame the way you learn, give you a tangible proof that you’re progressing, give you a sense of meaning and achievement. How do you learn physics? By studying the books of course, but then solving physics problems is very important. How do you learn to play deep games such as chess or go? By playing, mostly. And then by pondering and solving problems. Electronics? Chemistry? Building science? Genetics? The books on these subjects are full of problems.

How do you learn good software design? I don’t know. The books that I’ve read explain principles, and provide examples. Rarely I’ve seen books that contain problems, exercises, or challenges. (Notable exceptions: William Wake’s Refactoring Workbook and Ka Iok Tong’s Essential Skills for Agile Development.)

I propose that we assemble a collection of problems meant to develop and discuss software design. A good problem for this goal would problem *not* have a single correct answer, for design and engineering are always a matter of compromises. A good problem should be a means to discuss the various choices and tradeoff, and worse and better ways to solve it. A good problem should be a small framework.

Let’s start! Here is a problem that I find interesting. The good old Fizz-Buzz problem goes like this:

Write a program that prints the numbers in order from 1 to 100, with the exception that when a number is a multiple of 3, it prints “Fizz”. When a number is a multiple of 5, it prints “Buzz”. And when a number is multiple of both, it prints “FizzBuzz”. In all other cases, it just prints the decimal representation of the number.

There is an obvious way to solve this exercise, of course. It’s a very simple problem, from the point of view of programming. I would have the student solve it however they like. Most solutions contain a 3-way IF. I would then ask students to remove duplication. Early XP books were strong on removing duplication, for a good reason. It takes a bit of training to see how much duplication can creep in even such a small bit of programming.

The usual objection I get at this point is that it makes no sense to go this deep in removing duplication for such a small and trivial example. They also will say that the 3-IFs version is more readable than any version where duplication is removed. This is the crux of the matter.

I then continue the exercise by adding the requirement that

For multiples of 7, the program prints “Bang”.

Easy, they say. Add a fourth IF. Not so fast, I say :-)

For multiples of 7 and 3, the program prints “FizzBang”. For multiples of 5 and 7, the program prints “BuzzBang”. For multiples of 3, 5, 7, the program prints “FizzBuzzBang”!

Now we have an exploding number of IFs. If the next requirement is of the same sort as this one, we see how the IF-chain solution becomes untenable :-) Now solve this!


  • I got the idea of using FizzBuzz as a design example from Giordano Scalzo, who presented it at the Milano XPUG and posted a solution on slideshare
  • Other sources of problems, in no particular order: the Refactoring to patterns book by Joshua Kerievsky. The list of katas by Dave Thomas. The Refactoring in Ruby book by William Wake and Mike Rutherford. The Ruby Quiz site. I’m not merely looking for programming problems. I’m looking for design problems. The difference is that I don’t just want a problem that requires a correct or efficient solution. I want a problem that requires a solution that is easy to understand and change.

Antonio Ganci on design

Sunday, April 18th, 2010

My friend and former Sourcesense collegue Antonio Ganci posted an article (Italian) that describes exactly what I meant in A Frequently Asked TDD Question and its followup. It’s about what you can accomplish if you’re really good at design.

It’s in Italian, so I will report here in English the main points: Antonio is the main developer for an application. He had the most crazy change requests from his customers, like:

  • All user-visible text should be uppercase,
  • There should be an offline mode for working at home
  • Users with Vista or newer OS should see a WPF user interface, the others should get normal Windows Forms
  • Change rounding from 2 to 4 digits everywhere
  • Log every data modification

He reports that some of these changes were done in less than half an hour. Quite a feat, and Antonio seems proud of his design. I think he has good reasons to be proud!

Antonio does not say how he did that, but we can guess that the key is the Once and Only Once rule. If there’s a single place where you print/compute/store a number, it’s very easy to change precision. If you have some sort of builder to generate the user interface, it’s not terribly expensive to generate a WPF interface instead of a Windows Forms one.

Once and Only Once: there should be a single authoritative place in the code where each concept is represented. How do you get to OAOO? One part of the story is to remove duplication. Never write the same concept in two places, that is, keep DRY. The other part is to write expressive code: don’t write “a+b”, write what you mean by “a+b”: what does it mean in terms of the application? That’s the Once, and the Only Once.

Bravo Antonio.

Report of the first run of the OCP kata

Tuesday, February 23rd, 2010

Two weeks ago we had our first meeting of the Milano Coding Dojo. It was great fun, and I was honored to see Giordano had prepared such a good presentation mentioning, among other things, the “OCP Kata” of my earlier blog post. The “Open Closed Principle” says that we should be able to add new feature by adding code, not by changing existing code (with an exception made for the place where the objects are created; after all, for the new class to be used, it must be instantiated somewhere.) The OCP Kata is a set of rules, to be used in a training session, that force us to apply the OCP.

So this was not only the first test-drive of this Dojo, but also of the OCP Kata. How did it go?

We worked randori-style on the Yathzee kata. My impressions follow.

On the OCP Kata rules

The OCP Kata was an influence only for the first test (forced us to use an explicit factory) and the second test (forced us to apply the OCP). After that, the OCP rules did not fire, as the problem was naturally easy to be solved in OCP style. After all, it was the implementation of a series of scoring rules for the Yathzee game. Once you have the scoring rules machinery in place, everything else can be completed just by adding a new class (and modifying the factory).

One class, many uses

We must always keep an eye on the design. The complexity of the code kept going up, until we worked hard at removing duplication. The OCP rules do not produce a good design by themselves. Early in the kata, the rule for “twos” was the same as the rule for “threes” with 3 in place of 2. The solution was to create a SingleNumberRule that takes the number in the constructor. We avoided making two classes, when a single class could be used in different context with different configuration.

The driving force was removing duplication.

More duplication

Later, we had a lot of duplication between the “pair” rule, and the “double pair” rule. The code that looks for a pair is needed in both rules. An old-school OO programmer would have made the two rules derive from a common, abstract base class. The abstract base class would be a repository for shared methods. Modern OO programmers know to use inheritance only as a last resort. So what could we do to remove duplication without inheritance? One key observation was that most of that duplicated code was looking heavily into the array of rolls. When you have code that uses heavily a data structure, it’s a good idea to move both data structure and code in an object.

The natural name for that object is “hand”, so we created a Hand class that wraps the array of die rolls. The duplicated code disappeared.

The driving forces were removing duplication and avoiding direct access to data.

Finding abstractions

The code in the Hand class was still not good enough. It was full of loops. There was no flash of insight here, we just applied a few “extract method”s that moved each loop in its own little method. Once we did that, we realized that some loops depended on another one that counts the occurrences of each number in the hand. For instance, the occurrences in the hand (1, 1, 3, 3, 4) are (2, 0, 2, 1, 0, 0). This is a key abstraction in this domain.

The other abstraction that is needed to implement the pair rule is “find me the highest pair”, which is just max{i | occurrences(i) ≥ 2}. (It is not enough to score *any* pair. It must be the highest pair, if more are present.)

To implement the “double pair” rule, we need a way to say “find the second highest pair”. One way to say this is that if the highest pair is, say, 4, we must look for the highest pair that is less then 4. The method we need is

    public int highestPairLessThen(int n) {
       return max{i | occurrences(i) ≥ 2 && i < n};

Now the two pairs rule was easy to implement:

    public int highestPair() {
      return highestPairLessThen(7);
    public int secondHighestPair() {
      return highestPairLessThen(highestPair());

The solution here was to find the right abstractions, and implement complex things in terms of simple things. It’s a bit of functional programming in the small.


The goal of good design is to have simple building blocks that can be combined together to create complex things. When we are at the object-talking-to-other-objects level, the OCP principles guides us to invent object abstractions. When we are in the small, within-the-object level, it’s good to apply some mathematical thinking. It’s not deep, difficult mathematics. It’s just a game of finding the right definitions, and using them to express complex things in terms of simpler things.

Update: cleaned up HTML, added headings

The OCP kata

Tuesday, January 12th, 2010

Read the first chapter of the Patterns book, it’s all there, where it says “favor composition over inheritance”,

said Jacopo. We were chatting about the Open/Closed Principle, and how I read about it in Meyer in 1991, yet it didn’t “click” for me back then.

Now I see how the OCP is key to writing code that can be changed easily, which is the chief technical goal of an agile team. I wondered, is there a way to teach and learn the OCP? Is there a kata to learn OCP?

It’s unfortunate that most common coding katas result in “single-object-oriented programming”. The famous bowling score example by Robert Martin, for instance, is usually solved by creating *one* object. This might be fine for learning how to write simple code. It’s not so good for learning how to do object-oriented design.

So I invented this little exercise to practice and learn OCP.

Take any coding problem. The bowling score, the string evaluator, the supermarket checkout, you name it. Then follow these instructions.

0. Write the first failing test. Then write a factory that returns an object, or an aggregate of objects, that make the test pass.

The factory should be limited to creating objects and linking them together. No conditionals allowed.

1. Write the next failing test.

2. Can you make it pass by changing the factory and/or creating a new class and nothing else? If yes, great! Go back to 1. If not, refactor until you can.

The refactoring should bring the system to a state where it’s possible to implement the next test just by changing the aggregate of objects that is returned by the factory. Be careful not to implement new functionality; the current test should still fail.

For instance, take the bowling score problem. The first test is

  @Test public void gutterGame() throws Exception {
    BowlingGame game = new BowlingGameFactory().create();
    for (int i=0; i<20; i++) {
    assertEquals(0, game.score());

The code to make this pass is

  class BowlingGameFactory {
    public BowlingGame create() {
      return new BowlingGame();
  class BowlingGame {
    public void roll(int n) {}
    public int score() {
      return 0;

Nothing strange here. Now the second test is

  @Test public void allOnesGame() throws Exception {
    BowlingGame game = new BowlingGameFactory().create();
    for (int i=0; i<20; i++) {
    assertEquals(20, game.score());

The simplest code that makes both tests pass would be to change BowlingGame to accumulate rolls in a variable. But our rules stop us from doing that; we must find a way to implement the new functionality with a new object. I think about it for a few minutes, and all I can think of is to delegate to another object the accumulation of rolls. I will call this role “Rolls”. Cool! This forces me to invent a new design idea. But I must be careful not to add new functionality, so I will just write a Rolls object that always returns 0.

  interface Rolls {
    void add(int n);
    int sum();
  class BowlingGame {
    private final Rolls rolls;
    public BowlingGame(Rolls rolls) {
      this.rolls = rolls;
    public void roll(int n) {

    public int score() {
      return rolls.sum();
  class BowlingGameFactory {
    public BowlingGame create() {
      Rolls zero = new Rolls() {
        public void add(int n) {}
        public int sum() { return 0; }
      return new BowlingGame(zero);

This passes the first test, and still fails the second. In order to pass the second test, all I have to do is provide a real implementation of Rolls and change the factory.

  class Accumulator implements Rolls {
    void add(int n) { ... }
    int sum() { ... }
  class BowlingGameFactory {
    public BowlingGame create() {
      return new BowlingGame(new Accumulator());

And so on. The point here is to think about how to

  1. compose functionality out of existing objects, and
  2. avoid reworking existing code.



24/12/2013 Chris F Carroll‘s Gilded Mall Kata is a new exercise that can be done with the OCP kata rules. Thanks Chris!

TDD is not finished until the code speaks

Saturday, April 25th, 2009

A problem, and solutions that don’t seem right

I recently asked a few people to solve a little programming problem. In this problem, a number of towns are connected by one-way roads that have different distances. The programmer must write code that answers questions such as:

  • What is the distance of the path A-B-C-D?
  • How many paths are there from A to C that are exactly 4 steps long?
  • What is the distance of the shortest path from A to D?

The graph

I reviewed three different solutions; they were all valid, reasonably well written. Two had proper unit tests, while the third was checked by prints in a "main". The authors tried hard to write a "good" solution. There were no long methods; all the logic was broken down in small methods. And yet, I was not pleased with the results.