Emmanuel Gaillot on Software Craftsmanship

May 25th, 2010

Excellent post by Emmanuel. My favourite quote:

I envision a future in which programmers are the conscious repositories of a body of knowledge. A future in which they regain their craft, instead of tweaking frameworks they don’t understand. A future, eventually, in which programmers say “no” to demands at odds with their ethics.

Antonio Ganci on design

April 18th, 2010

My friend and former Sourcesense colleague Antonio Ganci posted an article (Italian) that describes exactly what I meant in A Frequently Asked TDD Question and its followup. It’s about what you can accomplish if you’re really good at design.

It’s in Italian, so I will report the main points here in English: Antonio is the main developer of an application. He received the craziest change requests from his customers, like:

  • All user-visible text should be uppercase
  • There should be an offline mode for working at home
  • Users with Vista or newer OS should see a WPF user interface, the others should get normal Windows Forms
  • Change rounding from 2 to 4 digits everywhere
  • Log every data modification

He reports that some of these changes were done in less than half an hour. Quite a feat, and Antonio seems proud of his design. I think he has good reasons to be proud!

Antonio does not say how he did that, but we can guess that the key is the Once and Only Once rule. If there’s a single place where you print/compute/store a number, it’s very easy to change precision. If you have some sort of builder to generate the user interface, it’s not terribly expensive to generate a WPF interface instead of a Windows Forms one.

Once and Only Once: there should be a single authoritative place in the code where each concept is represented. How do you get to OAOO? One part of the story is to remove duplication. Never write the same concept in two places, that is, keep DRY. The other part is to write expressive code: don’t write “a+b”, write what you mean by “a+b”: what does it mean in terms of the application? That’s the Once, and the Only Once.
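To make the rounding example concrete, here is a minimal Java sketch. Antonio’s article does not show his code, so the class name `Amounts`, its methods and the choice of `BigDecimal` are all assumptions made up for illustration. The point is only that the precision lives in one place.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical example: the one authoritative place that knows the precision.
final class Amounts {
    // "Change rounding from 2 to 4 digits everywhere" means editing this constant only.
    static final int DIGITS = 4;

    private Amounts() {}

    // Every computation that stores a number is assumed to round through here.
    static BigDecimal round(BigDecimal value) {
        return value.setScale(DIGITS, RoundingMode.HALF_UP);
    }

    // Every piece of user-visible text is assumed to format numbers through here.
    static String display(BigDecimal value) {
        return round(value).toPlainString();
    }
}
```

If the rest of the application never calls `setScale` directly, a request to change precision touches a single line.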

Bravo Antonio.

On Jeff Sutherland’s “Shock therapy” paper

April 13th, 2010

I read the paper by Jeff Sutherland and others, Shock Therapy: A Bootstrap for a Hyper-Productive Scrum. There are some things I like in this paper, and some things that I don’t.

I like it that

  • The coach insists on a proper definition of done, holding the team accountable for the quality of the work.
  • The coach insists that only properly expressed stories are inserted in the team backlog. No vague stories without acceptance criteria.
  • The coach insists that Scrum training is done beforehand, and all parties involved attend, both business and developers.
  • The coach insists that all parts of Scrum are implemented, with no “scrumbut.” The coach provides reasonable defaults for the (few) degrees of freedom of the Scrum framework.
  • The coach works by helping the team solve problems by themselves, with the goal of getting the team to a point where they don’t need the coach anymore.

I don’t like

  • The theme of forcefully enforcing the rules that pervades the paper. “Resistance is futile”… Bah.
  • Measuring team maturity with velocity alone. Velocity is a highly suspect measure. It depends on the story estimates, which are decided by the team, so it is partly subjective. Did they estimate *all* the stories before starting work? Did they change *any* of the estimates later?
  • Velocity is also suspect because it bears no strict relation to return on investment. You might be very fast in developing software that does not give you an iota of profit. This does not seem to be a concern in this paper.
  • Thirdly, it’s very easy to be faster than what we did in the first iteration. In my experience, in the first iteration a lot of effort goes into building a “walking skeleton” of the system, learning about the problem domain and project technologies, and so on.
  • Fourth, it’s unclear what it means to compare “velocities” of different teams. Who did the estimates that are used to compare velocities? And how do you compare velocities with teams that are not even doing Scrum?
  • Then I have issues with how the paper says nonchalantly “ATDD was used” as if practicing ATDD was easy. In my experience, which is also what many trainers say, it takes at least two or three months to begin to be proficient with TDD, let alone ATDD. How much experience and training did the developers have with ATDD? Does the “shock therapy” work well even when the team members are new to TDD and other XP engineering practices?
  • No mention is made of the quality of the code produced by the teams. Was the high velocity bought at the expense of introducing technical debt? Was code quality even measured in any way? The paper does not say.

In conclusion, I found a few useful ideas in this paper, but I think it leaves a lot of questions unanswered. I have no problem believing that Jeff Sutherland can achieve very good results with his teams. I find the paper does not prove at all that there is a magic formula that guarantees high productivity.

Answering Luca’s comment

March 20th, 2010

Luca Minudel commented on my previous post. My answer has got so long that it became a full post.

Hi Luca,

I’m a fan of the Growing Object-Oriented Software book, but I haven’t adopted the mockist style yet. I know that I have seen lots of instances of mock misuse, leading to unreadable and brittle tests. So I’m afraid it depends a lot on who’s doing the training :-) It’s true that the GOOS book promotes good object-orientation.

My point in my previous post was not to advocate a particular style of training. It was simply to answer a common question: “If we work incrementally without design upfront, as all the books on TDD advocate, is there a risk of getting into a situation where adding some features becomes unreasonably expensive?” This fear is often related to non-functional requirements.

I think this fear has merit, and it’s too easy for new converts to TDD (including me) to confidently say “we have the tests, we can refactor the code to make it do whatever is needed.” The bug in this reasoning is that we should not code expecting to do (what often amounts to) major refactoring. We should code with the objective of producing a system where implementing new stuff is so easy that it feels like composing Lego bricks, or like putting in the last piece of a puzzle, when it slides comfortably into its place.

Getting to a design that is this good is the real goal of TDD and XP. I can’t say in all honesty that I can achieve this level of goodness in major applications. But I’m beginning to see how one could do that.

So the point of my post is that you can afford to work in a totally incremental way, only if you are dead serious about keeping your code squeaky clean. Which is not easy to do, given the time pressures we always have. But then again, this is nothing new, this is what the XP books have been saying from the beginning, isn’t it? Only it is surprising to discover how really clean your code should be.

Coming back to your comment, it seems you are concerned with ways to help a team do the right thing and produce good code. My experience is that the first step is to know your material well. So if you’re accomplished at producing good code in the mockist style, more power to you! Your team will pick it up from you.

It’s interesting that you found it easier to communicate good style using the mockist approach rather than teaching principles like OCP and Demeter. I’ll have to think about it.

Dear reader: if you haven’t read Growing Object-Oriented Software yet, I invite you to do so. It’s not simply a book about “mocks”. It’s a book about a style of software development, deeply and beautifully object-oriented. For a taste of the authors’ style, I suggest you read their paper Mock Roles, Not Objects (pdf).

A frequently asked TDD question

March 18th, 2010

Today Tommaso and I facilitated the kickoff of a new project for a new team. One question that came up was:

We know that before the project is finished, we will have to profile all buttons and links so that users will not see options they are not authorized to use. Should we keep this in mind as we write the code today? Or should we rigidly adhere to YAGNI and defer all work on profiling until the profiling story is chosen by our customer?

This question comes up quite often. An answer that is often heard from agilists is

Never! Today you should only work for today’s story! The profiling story might never be chosen, after all, and even if it is chosen, you will refactor your code to accommodate the new functionality. YAGNI, my friend!

Tommaso rightly objected that this point of view is wrong. When the time for link profiling comes, the application is usually at a late stage of development. Refactoring all links and buttons to include profiling takes a lot of work. And we knew this functionality was coming, so we can’t even invoke the “customer changing their mind” excuse.

Agile development can only work if we can keep down the cost of adding functionality even late in the development cycle. This should apply to unforeseen changes, and even more so to features that we always knew were needed. If a change requires major reworking, then we clearly did something wrong.

What do I suggest then? Should we build in infrastructure right at the beginning for all major features? Should we do design upfront to protect us from this sort of mistake?

Well, there is no clear-cut answer. A naive reading of the TDD book will lead you to believe that just applying the TDD mantra

  1. Red—Write a little test that doesn’t work, and perhaps doesn’t even compile at first.
  2. Green—Make the test work quickly, committing whatever sins necessary in the process.
  3. Refactor—Eliminate all of the duplication created in merely getting the test to work.

Kent Beck, Test Driven Development by Example

will lead you slowly but surely, almost automatically, to a well-written system that makes it easy to add changes at all times. Well, there is a huge misunderstanding here. There is nothing “almost automatic” in this process. If you read the book carefully, it will also say things like “Our designs must consist of many highly cohesive, loosely coupled components”.

Just how highly cohesive and how loosely coupled our systems must be is left to the judgment of the reader, who in most cases is not skilled enough to imagine just how cohesive and decoupled people like Kent Beck mean when they say “cohesive and decoupled”. Most people don’t understand how serious people like Kent Beck are when they say “eliminate all of the duplication”.

Because if you’re not fanatical about removing duplication, about striving for code that is highly cohesive and decoupled, there is little chance of achieving code that is easy to change over time.

My answer is that if I am serious about removing duplication, then I will *not* have code like <a href='foo'>bar</a> written in more than one place, let alone in hundreds of places. I will have a single function, a single method, a single place where the html link is generated. Then when the time comes to add profiling, or HTTPS, or Ajax, or anything else that affects how my links should work, I will have a *single* place to change. The change will not be too expensive.
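As an illustration, here is a minimal Java sketch of such a single place. The class name `Links`, the `isAuthorized` predicate and the decision to render nothing for an unauthorized link are all hypothetical; they only show where the profiling rule would go once the story is chosen.

```java
import java.util.function.Predicate;

// Hypothetical sketch: the only place in the application that generates HTML links.
final class Links {
    // Assumed profiling rule: tells whether the current user may follow a link.
    private final Predicate<String> isAuthorized;

    Links(Predicate<String> isAuthorized) {
        this.isAuthorized = isAuthorized;
    }

    // Returns the link markup, or an empty string when the user is not authorized.
    String link(String href, String text) {
        if (!isAuthorized.test(href)) {
            return "";
        }
        return "<a href='" + escape(href) + "'>" + escape(text) + "</a>";
    }

    // Minimal HTML escaping so the generated markup stays well-formed.
    private static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("'", "&#39;")
                .replace("\"", "&quot;");
    }
}
```

Before the profiling story arrives, the application would construct `new Links(href -> true)`. When it arrives, only the predicate changes, and none of the call sites do.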

If I don’t think I can work with the level of discipline it takes to really remove duplication, then perhaps I’ll be better off stopping pretending I’m really doing TDD, and start doing some design upfront.

Lazy proxy in Ruby

March 11th, 2010

I’m a total newbie when it comes to Ruby evaluation tricks, so when I learned this today I felt it was a good thing to share :-)

The problem: speeding up a Rails application. When all is said and done, you need to cache page fragments in order to speed up an application significantly. For instance: you start with

class ProductsController < ApplicationController
  def category
    @products = Product.find_by_category(params[:id])
  end
end

...

<div id="products">
  <% for product in @products do %>
    <!-- some complicated html code -->
  <% end %>
</div>

and then add fragment caching in the view with

<% cache "category-#{params[:id]}" do %>
  <div id="products">
    <% for product in @products do %>
      <!-- some complicated html code -->
    <% end %>
  </div>
<% end %>

OK, this speeds up view rendering. But we are still executing the query in the controller, to obtain a list of products we are not even using. The standard Rails solution to this is

  class ProductsController < ApplicationController
    def category
      unless fragment_exist? "category-#{params[:id]}"
        @products = Product.find_by_category(params[:id])
      end
    end
  end

This is nice enough. But one thing worries me: there might be a race condition between the “unless fragment_exist?” test and the call to “cache” in the view. If the cron job that cleans the cache directory runs between the two, the user will see an error.

I thought to myself, wouldn’t it be nice to give the view a lazy proxy in place of the array of results? The lazy proxy will only execute the query if it is needed. The controller becomes:

class ProductsController < ApplicationController
  def category
    @products = LazyProxy.new do
      Product.find_by_category(params[:id])
    end
  end
end

The LazyProxy magic is surprisingly simple:

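require 'delegate'

# Wraps a block and runs it only when the wrapped object is first used.
class LazyProxy < Delegator
  def initialize(&block)
    # No call to super: the target does not exist until the block runs.
    @block = block
  end

  # Delegator forwards every call to the object returned here.
  def __getobj__
    @delegate ||= @block.call
  end
end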

The block given to the constructor is saved, and not used immediately. The Delegator class from the standard library delegates all calls to the object returned by the __getobj__ method. The “||=” trick makes sure that the result of @block.call will be saved in an instance variable, so that the query is executed at most once.

So the idea is that the view will be given a lazy proxy for a query. If the fragment exists, the view code will not be evaluated and the proxy will not be used. No query. If the fragment does not exist, the lazy proxy is used and a query is executed. There is no race condition, for there is no test to see if the fragment exists.
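For readers who think in Java, the same “run the query at most once, and only if somebody asks” behaviour can be sketched with a memoizing `Supplier`. This is an illustration of the idea, not a translation of the Rails code: the class name `Lazy` is an assumption, the caller must call `get()` explicitly instead of relying on transparent delegation, and the sketch is not thread-safe.

```java
import java.util.function.Supplier;

// Hypothetical sketch of a lazy, memoizing wrapper around an expensive computation.
final class Lazy<T> implements Supplier<T> {
    private final Supplier<T> source;
    private T value;
    private boolean loaded;

    Lazy(Supplier<T> source) {
        this.source = source;
    }

    // Runs the wrapped computation on first use and remembers the result.
    @Override
    public T get() {
        if (!loaded) {
            value = source.get();
            loaded = true;
        }
        return value;
    }
}
```

If nobody calls `get()`, which corresponds to the cached-fragment case, the wrapped computation never runs.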

What do you think?

Update One additional advantage of the lazy proxy is that you no longer need to make sure that the fragment key is the same on both view and controller.

First lesson of the Tecnologia e Applicazioni Internet course

March 7th, 2010

Summary: first lesson with my new class. Teaching TDD, letting the students get a glimpse of how skilled they are in programming; which is unfortunately not much.

For the second year I am teaching Tecnologia e Applicazioni Internet (Internet Technology and Applications) at Insubria. My goal for this course is to teach how to develop web applications, applying the technical practices of Extreme Programming. In particular, I would like to teach Test-Driven Development and the principles of Object-Oriented Design, as I understand them. My meta-goal for this course is for each student to become twice as good at programming.

For this course I use Java and not Rails. The reason for this choice is that Rails, although it is head and shoulders above every Java web framework, is nonetheless a framework, and as such it is a crutch: it lets you stand up, but it certainly doesn’t help when you want to learn to walk by yourself, let alone run. To learn to walk by yourself you must learn object-oriented programming.

In the classroom

For the first lesson I explained TDD on its own, without the complication of servlets. I made an effort to think of an example small enough to demo in front of the students in one pomodoro or a little more. I decided to build a “command-line calculator”, that is, a program that, given a string like “2 + 3” as a command-line argument, prints “5.0” on standard output.

The first test I wrote:

@Test
public void twoAndThreeIsFive() throws Exception {
	Calculator calculator = new Calculator();
	double result = calculator.add(2, 3);
	assertEquals(5.0, result, EPSILON);
}

Simple enough to get passing. But it was not sufficient, because on the command line the arguments arrive as strings and not as already-parsed integers. So I wrote a second test, which made a Parser class emerge:

@Test
public void willParseTwoAndFive() throws Exception {
	Calculator calculator = new Calculator();
	String result = new Parser(calculator).calculate("2 + 5");
	assertEquals("7.0", result);
}

Why create a second class at this point? Wouldn’t it have been enough to put a “parse” method in the Calculator class? I could have, but then the “add” method would have become a method for the class’s internal use. How would I have tested it? I would have had to throw away the test on add, or keep add “public” even though it is really only needed internally. Or use some ugly trick like giving “add” protected or package visibility.

Instead, by keeping Calculator as a class of its own that only does arithmetic, while Parser takes care of reading and writing strings, I can keep “add” as a public method of Calculator. Martin would say that I applied the “Single Responsibility Principle.” For me what was decisive was thinking “otherwise I’ll have to test a private method”.

Then I still wasn’t satisfied. What we do in TDD is develop a happy island of object-oriented code, which at some point must collide with the procedural reality of the outside world. In this case the “outside world” is the main, which must create and invoke our objects. For me it is fundamental that the main contains no logic, only the creation of a certain number of objects, wired together. If I make Parser#calculate return the result, then the main is left with the responsibility of invoking System.out.println() to print it.

My goal is to reduce the logic in the main to a minimum, so that the main, which is by its nature harder to unit test, is so simple as to be obviously correct. Or at any rate, so that I can be sure that if the main fails, it always fails, and if it works, it always works. That way I can be reasonably certain that if my main contains an error, I will notice. Wiring errors, as Hevery calls them, are easy to find.

So I applied the “Tell, don’t Ask” principle, and passed the OutputStream to the Parser as a collaborator.

@Test
public void willParseTwoAndFive() throws Exception {
	Calculator calculator = new Calculator();
	OutputStream stream = new ByteArrayOutputStream();
	new Parser(calculator, stream).calculate("2 + 5");
	assertEquals("7.0\n", stream.toString());
}

This way it is as if I were telling Parser, “this is your string to calculate, this is the stream where you must write the result, now sort it out yourself, I don’t want to know anything about it.”
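The production code is not shown in this post, so what follows is only one possible sketch that would satisfy the tests above. The parsing strategy (splitting on whitespace and handling only “number + number”) and the class name `CalculatorMain` are assumptions made up for illustration.

```java
import java.io.OutputStream;
import java.io.PrintStream;

// Does arithmetic and nothing else.
class Calculator {
    double add(double a, double b) {
        return a + b;
    }
}

// Adapts strings to the number-based Calculator and writes the result itself.
class Parser {
    private final Calculator calculator;
    private final PrintStream out;

    Parser(Calculator calculator, OutputStream stream) {
        this.calculator = calculator;
        this.out = new PrintStream(stream, true);
    }

    // Assumption: the expression always has the shape "<number> + <number>".
    void calculate(String expression) {
        String[] tokens = expression.trim().split("\\s+");
        double left = Double.parseDouble(tokens[0]);
        double right = Double.parseDouble(tokens[2]);
        // An explicit "\n" keeps the output identical on every platform.
        out.print(calculator.add(left, right) + "\n");
    }
}

// The main contains no logic: it only creates the objects and wires them together.
class CalculatorMain {
    public static void main(String[] args) {
        new Parser(new Calculator(), System.out).calculate(args[0]);
    }
}
```

With this shape, everything that can go wrong in `main` is a wiring error, which shows up the first time the program runs.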

A pattern that can be recognized in this design is an embryonic version of collecting parameter. In general I try to avoid having methods that return data. It is usually more effective to tell objects to do things, rather than ask them for data. This is the “tell, don’t ask” principle.

We can also see the Parser object as an adapter: it adapts the interface of Calculator, based on numbers, to the needs of main, which works with strings.

All this, of course, does not mean that to program you must at every turn look through the big pattern manual for one or more patterns to shove into your code. On the contrary, what I did was write the code that seemed most appropriate to solve my problem, and then, thinking it over, I recognized some patterns in what I had written.

In the lab

The second part of the lesson took place in the lab. I proposed a simple exercise: write a program that concatenates the lines of two files one by one, a bit like the Unix paste(1) command does. I saw right away that for most of the students this exercise was too difficult, so I quickly moved on to suggesting, as a first test, a simplified version of the problem.

@Test
public void pasteLinesFromArrays() throws Exception {
	List a = Arrays.asList("aa", "bb");
	List b = Arrays.asList("xx", "zz");
	List result = new ArrayList();

	Concatenator concatenator = new Concatenator();
	concatenator.concatenate(result, a, b);

	assertEquals(Arrays.asList("aaxx", "bbzz"), result);
}

My students are in the third year of the three-year Computer Science degree. In our degree programme, the reference programming language is Java. Unfortunately, I had to observe that for the great majority of my roughly 40 students, writing the code that makes this test pass is a difficult problem. And nobody (it seems to me) was able to extend the code to make the second test pass as well:

@Test
public void listsCanBeOfDifferentLength() throws Exception {
	List a = Arrays.asList("a");
	List b = Arrays.asList("b", "c");
	List result = new ArrayList();

	Concatenator concatenator = new Concatenator();
	concatenator.concatenate(result, a, b);

	assertEquals(Arrays.asList("ab", "c"), result);
}
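For reference, here is one possible Concatenator that passes both tests. It is a sketch, not the solution discussed in class: treating a missing line as an empty string is an assumption derived from the expected result ("ab", "c") in the second test.

```java
import java.util.List;

class Concatenator {
    // Appends to result the pairwise concatenation of the lines of a and b.
    // When one list is shorter, its missing lines count as empty strings.
    void concatenate(List<String> result, List<String> a, List<String> b) {
        int longest = Math.max(a.size(), b.size());
        for (int i = 0; i < longest; i++) {
            String left = i < a.size() ? a.get(i) : "";
            String right = i < b.size() ? b.get(i) : "";
            result.add(left + right);
        }
    }
}
```

Note that `result` is passed in and filled by the method, in the collecting parameter style used for the calculator.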

I don’t know what to think. These exercises seem to me of a difficulty comparable to the famous FizzBuzz “problem”, which is used in job interviews to separate those who cannot program at all from those who might be able to do something. In the third year I would expect something more. I am trying to remember myself in my third year at university. I am sure I would have been able to solve this problem.

But it doesn’t matter. Next Friday I will continue with exercises of this kind. Little by little we will improve. I am sure that, by the end of the course, the students who keep attending will reach the goal of becoming (at least) twice as good at programming.

Next speaking engagements

February 25th, 2010

I’m happy to say that the Birthday Greetings Kata session that I did with Antonio at XP Days Benelux was selected for a second run at the Mini XP Day! I hope to see you in Eindhoven, The Netherlands, on April 26.

My other speaking engagement is a Coaching Workshop, co-organized with Simone Casciaroli, that will happen at Better Software in Firenze, 5-6 May. Simone and I were going to present this at the Agile Day 2009, but Simone was hit by flu and could not come.

Report of the first run of the OCP kata

February 23rd, 2010

Two weeks ago we had our first meeting of the Milano Coding Dojo. It was great fun, and I was honored to see Giordano had prepared such a good presentation mentioning, among other things, the “OCP Kata” of my earlier blog post. The “Open Closed Principle” says that we should be able to add new features by adding code, not by changing existing code (with an exception made for the place where the objects are created; after all, for the new class to be used, it must be instantiated somewhere). The OCP Kata is a set of rules, to be used in a training session, that force us to apply the OCP.

So this was not only the first test-drive of this Dojo, but also of the OCP Kata. How did it go?

We worked randori-style on the Yahtzee kata. My impressions follow.

On the OCP Kata rules

The OCP Kata was an influence only for the first test (it forced us to use an explicit factory) and the second test (it forced us to apply the OCP). After that, the OCP rules did not fire, as the problem was naturally easy to solve in OCP style. After all, it was the implementation of a series of scoring rules for the Yahtzee game. Once you have the scoring rules machinery in place, everything else can be completed just by adding a new class (and modifying the factory).

One class, many uses

We must always keep an eye on the design. The complexity of the code kept going up, until we worked hard at removing duplication. The OCP rules do not produce a good design by themselves. Early in the kata, the rule for “twos” was the same as the rule for “threes” with 3 in place of 2. The solution was to create a SingleNumberRule that takes the number in the constructor. We avoided making two classes, when a single class could be used in different contexts with different configurations.

The driving force was removing duplication.
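The dojo code is not reproduced in this post, so the following is only a sketch of what such a rule could look like. The `Rule` interface, the `score` signature and the scoring formula (the sum of the dice showing the configured number) are assumptions.

```java
// Hypothetical common interface for all scoring rules.
interface Rule {
    int score(int[] rolls);
}

// One class covers "ones", "twos", "threes" and so on: the number is configuration.
class SingleNumberRule implements Rule {
    private final int number;

    SingleNumberRule(int number) {
        this.number = number;
    }

    // Assumed scoring: the sum of all dice that show the configured number.
    @Override
    public int score(int[] rolls) {
        int total = 0;
        for (int roll : rolls) {
            if (roll == number) {
                total += number;
            }
        }
        return total;
    }
}
```

The factory would then create `new SingleNumberRule(2)` for “twos” and `new SingleNumberRule(3)` for “threes”, with no second class needed.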

More duplication

Later, we had a lot of duplication between the “pair” rule and the “double pair” rule. The code that looks for a pair is needed in both rules. An old-school OO programmer would have made the two rules derive from a common, abstract base class. The abstract base class would be a repository for shared methods. Modern OO programmers know to use inheritance only as a last resort. So what could we do to remove duplication without inheritance? One key observation was that most of that duplicated code was looking heavily into the array of rolls. When you have code that heavily uses a data structure, it’s a good idea to move both data structure and code into an object.

The natural name for that object is “hand”, so we created a Hand class that wraps the array of die rolls. The duplicated code disappeared.

The driving forces were removing duplication and avoiding direct access to data.

Finding abstractions

The code in the Hand class was still not good enough. It was full of loops. There was no flash of insight here; we just applied a few “extract method”s that moved each loop into its own little method. Once we did that, we realized that some loops depended on another one that counts the occurrences of each number in the hand. For instance, the occurrences in the hand (1, 1, 3, 3, 4) are (2, 0, 2, 1, 0, 0). This is a key abstraction in this domain.

The other abstraction that is needed to implement the pair rule is “find me the highest pair”, which is just max{i | occurrences(i) ≥ 2}. (It is not enough to score *any* pair. It must be the highest pair, if more are present.)

To implement the “double pair” rule, we need a way to say “find the second highest pair”. One way to say this is that if the highest pair is, say, 4, we must look for the highest pair that is less than 4. The method we need is

    public int highestPairLessThen(int n) {
       return max{i | occurrences(i) ≥ 2 && i < n};
    }

Now the two pairs rule was easy to implement:

    public int highestPair() {
      return highestPairLessThen(7);
    }

    public int secondHighestPair() {
      return highestPairLessThen(highestPair());
    }

The solution here was to find the right abstractions, and implement complex things in terms of simple things. It’s a bit of functional programming in the small.
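The set notation above is not valid Java, so here is one way the Hand class might be written out in full. It is a sketch under two assumptions that the post does not state: dice faces run from 1 to 6, and 0 is returned when no qualifying pair exists. The method names, including the spelling of highestPairLessThen, follow the snippets above.

```java
// Wraps the array of die rolls together with the code that inspects it.
class Hand {
    private final int[] rolls;

    Hand(int... rolls) {
        this.rolls = rolls.clone();
    }

    // How many dice in the hand show the given face.
    int occurrences(int face) {
        int count = 0;
        for (int roll : rolls) {
            if (roll == face) {
                count++;
            }
        }
        return count;
    }

    // max{i | occurrences(i) >= 2 && i < n}, or 0 when the set is empty (assumption).
    int highestPairLessThen(int n) {
        for (int face = Math.min(n - 1, 6); face >= 1; face--) {
            if (occurrences(face) >= 2) {
                return face;
            }
        }
        return 0;
    }

    int highestPair() {
        return highestPairLessThen(7);
    }

    int secondHighestPair() {
        return highestPairLessThen(highestPair());
    }
}
```

For the hand (1, 1, 3, 3, 4) from the example above, the highest pair is 3 and the second highest pair is 1.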

Conclusions

The goal of good design is to have simple building blocks that can be combined together to create complex things. When we are at the object-talking-to-other-objects level, the OCP guides us to invent object abstractions. When we are in the small, within-the-object level, it’s good to apply some mathematical thinking. It’s not deep, difficult mathematics. It’s just a game of finding the right definitions, and using them to express complex things in terms of simpler things.

Update: cleaned up HTML, added headings

Neither Shore nor Larsen

February 6th, 2010

Summary: the Shore + Larsen course due in Milan this month is cancelled due to not enough attendance. Too bad.

What a pity! XP Labs wrote to me that the courses by James Shore and Diana Larsen that were supposed to take place this month have been cancelled because of an insufficient number of registrations. It would have been a unique opportunity for those of us in the Milan area to learn from these authors. I am a big fan of Shore and Warden’s handbook.

So I signed up for Rebecca Wirfs-Brock’s courses, which will be in March. I really hope that we manage to reach the required number! The authors of Growing Object-Oriented Software cite Rebecca as the originator of the design style they favour.