Another look at the anti-IF campaign

Last year, the Italian XP pioneer Francesco Cirillo launched his famous “anti-IF” campaign. The idea is bring down the complexity of programs by using object-oriented design; the most important trick in that bag is to replace case analysis by polymorphism. For instance, to replace

  if (shape.type == SQUARE)
    area = shape.getSide() * shape.getSide();
  else if (shape.type == TRIANGLE)
    area = (shape.getBase() * shape.getHeight())/2.0;
  else if (shape.type == CIRCLE)
    ...

by the much nicer

  area = shape.getArea();

There are many other useful tricks from OO design. But I’d like to talk about another set of tricks for eliminating IFs, namely the bag of tricks of algorithm design.

Boolean values are values too.

One simple trick is to replace

  if (foo and bar or baz)
    return true;
  else
    return false;

by

  return foo and bar or baz;

Now, some people prefer the first form, because they prefer operational reasoning over reasoning over boolean values (see the comments to a previous post on this subject). I much prefer the firstsecond, for one thing because it’s shorter. Concision is power. And I try to avoid operational reasoning as much as I can. If we push operational reasoning to the extreme we get:

  if (foo) {
    if (bar) {
      return true;
    }
  }
  if (baz) {
    return true;
  } else {
    return false;
  }

This should be equivalent to the previous two examples. But it’s not immediately obvious that it is. It requires much more thinking, and thinking is something I must save for important things.

Encapsulate IFs in well-known functions

Suppose you must find the maximum value of two numbers. Please don’t write

  if (x > y)
    return x;
  return y;

It’s much better to write

  return max(x, y);

Try to extend these two examples to the maximum of three numbers and see which one turns out clearer. The fact is that while we stay in the realm of expressions, we can exploit many nice properties, such that max is associative; in the realm of IFs it’s much more difficult to reason. You must simulate execution in your head, and that takes energy away from you; energy you might spend on other, more important problems.

Another example: replace

  if (x < 0) return -x;
  return x;

by

  return abs(x);

If your language does not provide an abs function, you can define it nicely as max(x, -x).

Zero is a perfectly good number

This code returns the sum of the elements of an array.

// ugly and wrong
int arraySum(int[] array) {
  int sum = array[0];
  for (int i=1; i < array.length; i++) {
    sum += array[i];
  }
  return sum;
}

Can you spot the error?

*
*     *

Yes, it breaks for empty arrays. If we keep it this way we'll be forced to pull ifs in the calling code, in all places where we call this method:

  // very ugly 
  if (x.length > 0) {
    s = arraySum(x);
  } else {
    ???
  }

This is ugly, and verbose. And it's not entirely clear what should happen when the length is 0. Should we give up and throw an exception? It's much better to stop treating 0 elements as an error. An empty array is a perfectly good array, and the sum of 0 elements is 0.

  // still ugly 
  int arraySum(int[] array) {
    if (array.length == 0) {
      return 0;      
    }
    int sum = array[0];
    for (int i=1; i < array.length; i++) {
      sum += array[i];
    }
    return sum;
  }  

Now it's correct, but there's no need to treat an empty array as a special case:

  // good
  int arraySum(int[] array) {
    int sum = 0;
    for (int i=0; i < array.length; i++) {
      sum += array[i];
    }
    return sum;
  }  

Now if the array is empty, the loop will be executed 0 times, since 0 < 0 is false. The result will be correct for all N ≥ 0.

Whenever you have to apply a binary operator such as "+" to a set of values, you should ask yourself what should be the answer for 0 elements. If the binary operator has a unit element, that is an element that leaves the other operand unchanged:

  0 + n = n + 0 = n, for all n
  1 * n = n * 1 = n, for all n
  max(−∞, n) = max(n, −∞) = n, for all n

the result of applying the operator to a set of 0 operands should be the unit element. The sum of 0 numbers is 0, the product of 0 numbers is 1, the maximum of 0 numbers is −∞. Why is that? Because it's the only logical answer. Think about this: if you split array A with 10 elements in two parts, A0 and A1, with 5 elements each, you expect that sumArray(A) be equal to sumArray(A0) + sumArray(A1). You expect exactly the same result if you split A so that A0 has 3 elements and A1 has 7. It's only natural to expect the same result if you split A such that A0 has 0 elements.

5 Responses to “Another look at the anti-IF campaign”

  1. Fabrizio Giudici Says:

    Excellent post. Of course I think that “I much prefer the first” should be read as “I much prefer the SECOND”, right? :-)

  2. matteo Says:

    Thank you! And, right :-)

  3. Luca Minudel Says:

    These suggestions now are listed at this wiki with others (in italian) http://wiki.ugidotnet.org/default.aspx/UGIdotNETWiki/IfNon%C3%A8objectoriented.html

  4. Franco Lombardo Says:

    Nice work, Matteo. I’m guessing if this post, like the one about the genesis of Agile mini project, where you wrote The participants followed the TDD rules, with poor results. I learned that TDD does not necessarily result in good code nor much functionality. does express in someway your disillusion about TDD and OO myths.

  5. matteo Says:

    @Franco: the way you express it here you make it look like I’m disillusioned with OOP and TDD. I really am not. In fact, I’m enthusiastic about TDD and OOP.

    I just do not think that TDD is like a magic dust that you can sprinkle on your work and make it suddenly right. Not so; like many good things, becoming proficient with TDD requires time, work, and dedication. The poor results of that evening could also be the result of me trying to have you do too much in too little time, with people who had a full day of work on their shoulder.

    Still, what are the alternatives to TDD? Formal methods and program derivation are too difficult. Unless you’re doing aerospace-grade software, it’s not practically possible to apply them. At least I can’t apply them successfully to my work; and while I’m certainly not the brightest star when it comes to math reasoning, I’m probably average, so that most other programmers would have the same problems that I have.

    What other method there is? “Guessing a correct solution and expressing it in correct code that is also readable and mantainable” is not really a method, it’s a goal. How do you get there reliably? A few gifted individuals can do that. I know I can’t, and even if I did, I could not teach this to others. It’s really code and fix: if you’re good you can make it work, sometimes.

    What else? The conventional advice is that you do “design” before coding. I have no objection to that; I do design in my head before coding, I often discuss design with collegues so that we all understand what we’re trying to do. You just have to be ready to change the design when you find it’s not adequate. A design is like a plan, and “no plan survives contact with the enemy”, that is, when you try to execute your design you realize it’s not adequate. So, no matter how much design you do, you are still left with coding as a difficult technical activity.

    I think TDD is the most exciting and useful advance in programming technique since OOP. The technique is still evolving and I’m ready to believe that it’s not the end of the story, and some day something better will be found to replace it. That day, I hope I’ll be around to be excited about it.

    In any case, I did not mean to say that the OO techniques for decreasing program complexity are less or more valuable than the ones I present here. What I’m telling you here is just a different angle.

Leave a Reply