About Kent Beck’s Stepping Stone strategy

Most papers in computer science describe how their author learned what someone else already knew. — Peter Landin

And blog posts too. — Matteo

This is a follow-up to my previous post.

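In his Responsive Design presentation, Kent Beck talks about the Stepping Stone strategy. He describes two kinds of stepping stones. The first is about building abstractions that might make your work easier; the second, discussed further down, is about solving a more general problem than the one you have.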

The first kind of Stepping Stone

He talks about the abstractions that allow Google people to perform their magic, like BigTable, which is built upon Google File System, which is built upon a Distributed Lock Manager. I can relate to this view of “design”. It’s powerful. I like this quote from Daniel Jackson:

Part of what allowed thousands of engineers to build scalable systems at Google is that really smart engineers like Jeff Dean and Sanjay Ghemawat built simple but versatile abstractions like MapReduce, SSTable, protocol buffers, and the like. Part of what allowed Facebook engineering to scale up is the focus on similarly core abstractions like Thrift, Scribe, and Hive. And part of what allows designers to build products effectively at Quora is that Webnode and Livenode are fairly easy to understand and build on top of.

Keeping core abstractions simple and general reduces the need for custom solutions and increases the team’s familiarity and expertise with the common abstractions. The growing popularity and reliability of systems like Memcached, Redis, MongoDB, etc. have reduced the need to build custom storage and caching systems. Funneling the team’s focus onto a small number of core abstractions rather than fragmenting it over many ad-hoc solutions means that common libraries get more robust, monitoring gets more intelligent, performance characteristics get better understood, and tests get more comprehensive. All of this helps contribute to a simpler system with reduced operational burden.

The interesting thing about this strategy is that it’s the one that you *don’t* do when you TDD. Kent Beck is saying, in effect, “you can do plenty of work with TDD, but sometimes it’s beneficial to stop doing TDD, and try instead to guess that you will need some intermediate or related thing. Then you build that something, and the original problem that you had might be much easier to solve.”

Let’s have an example, shall we? The Sudoku Solver, for instance. What Stepping Stones could be useful? From Artificial Intelligence classes taken long ago, and from Peter Norvig’s excellent article on the subject, I know that two ways to solve a Sudoku are depth-first search and constraint propagation. If I wanted to try depth-first search, my Stepping Stone would be the depth-first search algorithm itself.

Aside. What is depth-first search? Depth-first search means that you model your problem as a search through the space of possible moves. From whatever position you’re in, you try the first possible move, which brings you to a new position.

  1. If the new position is a winning position, you’re done.
  2. If the new position is not a winning one, you make the first possible move from here, and repeat.
  3. If there are no possible moves in the new position, then you back-track to the previous one, and try the second possible move.

This algorithm is very simple. It works for a huge variety of problems; all you need is that you can enumerate moves from any position. It does not always find a solution; for instance, on some problems it can enter an infinite loop. But this can’t happen with Sudoku, so depth-first is an excellent algorithm to try with Sudoku.
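The three steps above fit in a few lines of Ruby. This is only a sketch of how I might write it: the names `depth_first_search`, `moves` and `solved` are my own assumptions, and the toy problem at the end is made up just to exercise the code.

```ruby
# Generic depth-first search over positions.
# `moves` and `solved` are hypothetical callables supplied by the caller:
#   moves.call(position)  -> array of positions reachable in one move
#   solved.call(position) -> true when the position is a winning one
def depth_first_search(position, moves:, solved:)
  return position if solved.call(position)

  moves.call(position).each do |next_position|
    found = depth_first_search(next_position, moves: moves, solved: solved)
    return found if found
  end

  nil # every move from here failed (or there were none): back-track
end

# Toy problem (made up for illustration): pick 5s and 3s that add up to 11.
target = 11
moves  = lambda do |picked|
  [5, 3].map { |n| picked + [n] }.select { |candidate| candidate.sum <= target }
end
solved = ->(picked) { picked.sum == target }

p depth_first_search([], moves: moves, solved: solved) # prints [5, 3, 3]
```

Note that the search knows nothing about the problem being solved; everything problem-specific lives in the two callables.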

The depth-first algorithm is easy to test-drive. Once I have this algorithm, the original problem “solve Sudoku” is reduced to “from any given Sudoku board, enumerate possible moves”, which is much simpler.

Not only is the problem made simpler; it’s also easier to check that I implemented it correctly.
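To give an idea of how small the reduced problem is, here is a sketch of “enumerate possible moves” in Ruby. The board representation (a flat array of 81 integers, row by row, with 0 for an empty cell) and the names `candidates` and `possible_moves` are assumptions of mine, not something taken from the presentation.

```ruby
# Digits that can legally go in the cell at `index`.
# Assumes `board` is a flat array of 81 integers, 0 meaning "empty".
def candidates(board, index)
  row, col = index.divmod(9)
  box_row = row / 3 * 3
  box_col = col / 3 * 3
  used = []
  9.times do |i|
    used << board[row * 9 + i]                              # same row
    used << board[i * 9 + col]                              # same column
    used << board[(box_row + i / 3) * 9 + box_col + i % 3]  # same 3x3 box
  end
  (1..9).to_a - used
end

# All boards reachable in one move: fill the first empty cell
# with each digit that does not clash with its row, column or box.
def possible_moves(board)
  index = board.index(0)
  return [] if index.nil? # no empty cell left, so no moves

  candidates(board, index).map do |digit|
    board.dup.tap { |next_board| next_board[index] = digit }
  end
end
```

Only the first empty cell is considered at each step. That is enough for a depth-first search, because every solution has to put some digit in that cell sooner or later.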

What about the second way to solve Sudoku, that is constraint propagation? This brings us to the next section.

The second kind of Stepping Stone

The other kind of Stepping Stone that Kent Beck mentions in the video is related to the old Lisp maxim

Q. How do you solve a difficult problem?
A. You implement a language in which the problem is trivial to solve.

This is the opposite of the Simplification strategy. Sometimes, the easiest way to solve a problem is to generalize it. Let’s look at an example. The HTTP access log report problem from my last post would be easy to solve with SQL. Assuming that the access log lines are in a database table, I could do

-- MySQL-flavoured SQL: if(condition, a, b) and backtick-quoted aliases
select date,
  sum(if(status >= 200 and status < 300, 1, 0)) as `2xx_results`,
  sum(if(status >= 300 and status < 400, 1, 0)) as `3xx_results`,
  sum(if(status >= 400 and status < 500, 1, 0)) as `4xx_results`,
  sum(if(status >= 500 and status < 600, 1, 0)) as `5xx_results`
from access_log
group by date

How’s that for simplicity? But why couldn’t I write a similar program in Ruby? I would start by imagining what the solution would look like:

  sum(:"2xx_results") {|x| (200...300) === x.status ? 1 : 0 }.
  sum(:"3xx_results") {|x| (300...400) === x.status ? 1 : 0 }.
  sum(:"4xx_results") {|x| (400...500) === x.status ? 1 : 0 }.
  sum(:"5xx_results") {|x| (500...600) === x.status ? 1 : 0 }

Then I could implement the various needed features of Report one by one. That would give me a nice list of small, safe steps to implement.
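To make this concrete, here is one way the first few features of Report might turn out. It is a sketch under my own assumptions: the `group_by` and `rows` methods, the `LogLine` struct and the sample data are all invented for illustration, and a real version would grow test by test.

```ruby
# Hypothetical Report class: collects column definitions through a
# chainable interface, then computes one result row per group.
class Report
  def initialize(lines)
    @lines = lines
    @group_key = nil
    @columns = {}
  end

  # Name of the attribute to group by (like SQL's "group by").
  def group_by(key)
    @group_key = key
    self
  end

  # Defines a column whose value is the sum of the block over each group.
  def sum(name, &block)
    @columns[name] = block
    self
  end

  # One hash per group: the group key plus every summed column.
  def rows
    @lines.group_by { |line| line.public_send(@group_key) }.map do |key, group|
      totals = @columns.transform_values { |block| group.sum(&block) }
      { @group_key => key }.merge(totals)
    end
  end
end

# Made-up stand-in for a parsed access log line.
LogLine = Struct.new(:date, :status)

access_log = [
  LogLine.new("day 1", 200),
  LogLine.new("day 1", 404),
  LogLine.new("day 2", 500)
]

report = Report.new(access_log).group_by(:date).
  sum(:"2xx_results") { |x| (200...300) === x.status ? 1 : 0 }.
  sum(:"4xx_results") { |x| (400...500) === x.status ? 1 : 0 }

report.rows.each { |row| puts row.inspect }
```

Each method is tiny, and each one can be test-driven on its own, which is the point of the exercise.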

It seems a paradox that solving a more general problem may be easier, but we can see from this example how this can be true. Another example is in my post Feeling like carrying bags of sand.

Getting back to the Sudoku solver, a possible Stepping Stone of the second variety could be to implement a simple constraint propagation system. If you’re interested, there is a nice discussion in Structure and Interpretation of Computer Programs.


Stepping Stone is an antidote to the tendency to carry YAGNI too far, to the point that we refuse to write anything that is even slightly more general than what is exactly required. We’re not violating YAGNI when it’s simpler to write a Domain-Specific Language than to solve a complicated problem directly.

XP warns against premature abstraction. Let’s leave aside for the moment the observation that generalization is not the same thing as abstraction. It is true that by creating abstractions you run the risk of creating things that turn out not to be needed. Kent Beck warns against this in the video. Yet, I believe Stepping Stones are a very important dimension to design, one that is not used when we do TDD and that should be considered whenever we have a difficult programming problem.

Further reading

The idea of “layers of abstractions” is (I believe) due to Edsger Dijkstra. See his paper The Structure of the “THE”-Multiprogramming System. It’s interesting to compare how his idea of powerful abstractions differs from the weaker concept of “layers” in business applications.

5 Responses to “About Kent Beck’s Stepping Stone strategy”

  1. Luca Minudel Says:

    So far this matches my understanding from the video and slides.

    As I understood it, those design strategies (simplify or place a stepping stone) can be applied at different levels of granularity (from methods to subsystems to complete systems).

    As I understood it, those design strategies apply at coding time (as in your posts/examples) and also at run time, in the sense that they also apply to the iterative development and release strategy.

    Does this match your understanding too, or am I pushing these ideas outside the original author’s intention?

  2. Howard Says:

    There are a few hidden issues when drawing an analogy with things that google can do and things that everyone else can do.

    1) If you are wildly successful and have a great deal of money, you can build amazing infrastructure. It is easier to talk about technology that has a revolutionary effect, like “bigtable, GFS, etc”, when you use google as the author. It is harder to build something like “bigtable” when you can’t afford to hire a Jeff Dean and give him the resources he needs to create new technology. (Even if you can afford him, there aren’t enough Jeffs to hire). Furthermore, Google hires developers who build development tools, which is fantastic. But not everyone can afford that and it may not matter; youtube didn’t have that luxury before google bought them. You have to balance providing value to your customers with developer productivity. But really there is something subtle with google. When you hire enough people in the top 1% you can accomplish things that others can’t match, and it is a mistake to try before you secure a clear way to pay for it. In hindsight, bigtable seems simple enough for someone who is in the top 10% to implement. You have to ask yourself if she could do something that has the same effect as bigtable today. Could someone in the top 1% do it without others in the 1% around him to interact with?

    2) It is easy to spend so much time building a tool to solve your problem that you run out of time (the market window) to solve the problem. I once saw a group of developers who convinced their manager to allow them to write a C++ IDE so it would be easier for them to write the product. Similarly, it can be wasteful to rebuild part of oracle inside of ruby so that you have a similar abstraction. I know you’re thinking “Everyone knows that” but I don’t think they do know how hard it is when they quote google as an example.

    Some people underestimate how hard it is to build key infrastructure. At google the bigtable team was tiny because they only accepted the smartest developers they could find (the top 5%). When bigtable failed, it was not simple to isolate the problem. In one case, there was a bigtable bug that caused a corruption every billion or so transactions. The team found it quickly and rolled out a patch within a few hours. This is not technology your average developer can just pick up and run with.

    You see the tip of the iceberg when you see what google does; what is hidden is the huge amount of money they spend to hire people who can pull it off. You don’t want to try to keep up with them in hiring people in the top 10%. You don’t want to try to build the next revolutionary infrastructural technology. Google has a blind spot. They are only interested in solving really hard problems. There are plenty of billion dollar easy problems that google is too smart to consider. Twitter is a good example.

  3. itoctopus Says:

    What you’re basically saying is that one should re-use other people’s code as a black box which will facilitate development and increase productivity.

    There is a quote that I read some time ago (I can’t remember who said it; it might have been an IBM or a Microsoft engineer): “Don’t trust code that you haven’t written yourself”. While of course this is not practical, it can tell a lot about the mentality of many developers.

    Thank you for sharing!

  4. matteo Says:

    @Luca: Good point! These strategies apply at many levels. About applying them to the release cycle: do you mean that you might spend an entire release on infrastructure work? I’m not sure about that. Agile thinking tells me we should always release value for the customer. Perhaps what you say is true for some situations (e.g., you’re on an internal project and you’re delivering value to internal customers). I suppose it depends on context…

    @Howard: what you say is true, but you should also consider that:

    * many pieces of infrastructure are already there for the taking; take Redis or Hadoop or Memcached for instance.

    * The Stepping Stone strategy applies at all levels (see Luca’s comment). It’s not only architecture level. The examples I provide are much smaller than that.

    * Stepping Stone is not the only strategy that you should use. Other ideas in XP in particular and agile methods in general would prevent you from wasting too much on useless infrastructure. For instance: Spike Solution, Continuous Delivery, and the ever-important You Ain’t Gonna Need It.

    @itoctopus: I’m saying that you should either find or write yourself the stepping stones that might benefit you. It’s wise to look for quality in all things, so I would not just grab any open source product; I would only use top quality third-party infrastructure.

  5. Kent Beck: Best Practices for Software Design with Low Feature Latency and High Throughput « The Holy Java Says:

    […] About Kent Beck’s Stepping Stone strategy by Matteo Vaccari – the two types of them, examples (I’m not sure I agree with his interpretation of the simplification strategy though) […]
