Archive for October, 2009

My impressions of the Emergent Design Workshop

Thursday, October 29th, 2009

OK, so I did participate in the Emergent Design Workshop by Francesco Cirillo. This is the second time I have attended a workshop with Francesco. The other one was about coaching and agile process management. This one is about the technicalities of making the Agile thing work for real in the code. It’s never easy to work with Francesco; if you do attend this workshop, be prepared to challenge everything you know.

In my particular case, I knew I didn’t know object-oriented design well. OK, I did read about the design patterns, and I did read some of Robert Martin’s writings, but never got really into this stuff. Yet somehow, I thought I could get away with not knowing this stuff deeply. This workshop changed this; now I realize more fully the amount of stuff I didn’t know, and why it’s very important for me to learn this.

Even more importantly, I learned to see why a certain kind of semi-procedural code disguised as object-oriented is not satisfactory, and not fun. I gained a new set of eyes and a higher standard of criticism for code. What I learned resonates with what I wrote earlier about “code that speaks.” It turns out I was on the right track there; the goal is to have code like Lego bricks: objects that you can combine to obtain the desired results, and code that can withstand changes in specification without becoming more complex. Above all, what I’m grateful to Francesco for is getting back to the fun of working with software as objects.

A database for every developer

Saturday, October 17th, 2009

A database for every developer. No, *two* databases for every developer.

This is a fundamental of project organization that many projects get wrong. Every development workstation should be equipped with a full local development environment, with a local copy of the database software, and a one-command way to recreate the databases from scratch.

Why *two* databases? Well, one is for exploratory testing of the application we’re building. The other one is for automatic unit tests.

Why *local*? Because whenever the database server is not local, it becomes difficult to add a new workstation, it’s impossible to work when you’re not in the office, and you must depend on other people to fix your database problems.

The software that we write should *not depend* on the data sources that live outside our development workstation. To this end it’s a good start to have simple scripts that allow you to rebuild your database, so that you know you can experiment, change everything, make mistakes, and you’re still able to get back to a known working situation in a flash.

Why is it important that I can rebuild the databases with *one command*? Because if it takes more than one command, it’s too complicated and I’m likely to make mistakes. Because it’s too easy to fall into the trap of not knowing exactly which steps are needed to set up a new database instance. If you have a single script that does the job, that script is also a living, always up-to-date document that describes how to recreate the database from scratch.

The benefits are not just in development; when the time comes to release our software in production, you can see how helpful it is to have a script that is able to set up the database with no effort. In fact, all database maintenance operations should be automated. It’s one of the principles explained so well in The Pragmatic Programmer, a very good book.

For example, this is a typical script that I use in my non-Rails projects:

#!/bin/bash

src=src/main/sql
dbname=myapp_development
dbname_test=myapp_test
dbuser=myapp_user
dbpassword=myapp_password

# Usually no changes needed beyond this point

if [ ! -d "$src" ]; then
  echo "Run this script from the main directory"
  exit 1
fi
read -s -p "mysql root password? (type return for no password) " MYSQL_ROOT_PASSWORD

if [ "$MYSQL_ROOT_PASSWORD" != "" ]; then
    MYSQL_ROOT_PASSWORD=-p$MYSQL_ROOT_PASSWORD
fi

# --force skips the interactive confirmation, as for the test database below
mysqladmin -uroot $MYSQL_ROOT_PASSWORD --force drop $dbname
mysqladmin -uroot $MYSQL_ROOT_PASSWORD --force drop $dbname_test
mysqladmin -uroot $MYSQL_ROOT_PASSWORD create $dbname
mysqladmin -uroot $MYSQL_ROOT_PASSWORD create $dbname_test
echo "$dbname created"
echo "grant all on $dbname.* to '$dbuser'@localhost identified by '$dbpassword';" \
     | mysql -uroot $MYSQL_ROOT_PASSWORD $dbname
echo "grant all on $dbname_test.* to '$dbuser'@localhost identified by '$dbpassword';" \
     | mysql -uroot $MYSQL_ROOT_PASSWORD $dbname_test
echo "$dbuser authorized"
cat $src/???_*.sql | mysql -u$dbuser -p$dbpassword $dbname 
cat $src/???_*.sql | mysql -u$dbuser -p$dbpassword $dbname_test 
echo "schema loaded"

This handy little script will create the development and test databases, and load all the SQL scripts. I like to name SQL scripts like 001_create_foobar_table.sql and 002_add_frobniz_column_to_foobar.sql, so that they are loaded in sequence. It’s a simple way to develop the database schema incrementally. I may talk about it in another post.

Two floats are never equal

Saturday, October 17th, 2009

While we are on the topic of floating point fundamentals, there is another thing to remember: it is always a mistake to compare two floating-point numbers for equality.

It all boils down to the simple fact that floating-point arithmetic is not exact. It is meant for approximate calculations with engineering or scientific measurements, which are inexact to begin with. In fact, the results of floating-point arithmetic are almost never equal to the “true” value you would get by using exact real arithmetic.

Therefore, wherever you see something like x == 0.0, you can be fairly sure that it’s a mistake. Whatever computation produces the value of x, it’s unlikely to ever produce exactly 0.0.
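A minimal sketch of the problem, assuming nothing more than plain Java doubles (the class and method names are made up for illustration). The expression below is zero in exact arithmetic, yet the exact comparison fails, while the difference from zero is tiny:

```java
public class FloatEqualityDemo {
    // Hypothetical example value: mathematically 1.0 - 0.9 - 0.1 is zero,
    // but 0.9 and 0.1 have no exact binary representation.
    static double residue() {
        return 1.0 - 0.9 - 0.1;
    }

    public static void main(String[] args) {
        double x = residue();
        // The exact comparison reports that x is not zero...
        System.out.println("x == 0.0 ? " + (x == 0.0));
        // ...even though x is extremely close to zero.
        System.out.println("|x| <= 1e-9 ? " + (Math.abs(x) <= 1e-9));
    }
}
```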

The proper way to compare floating points is equality within some tolerance. For instance:

boolean approximatelyEqual(double a, double b, double epsilon) {
  return Math.abs(a - b) <= epsilon;
}

The above code works for most applications. It does not handle the cases where the inputs are NaN or infinities. I’m no expert in floating-point arithmetic, so I will not give advice about this. For reference, I copy here the following code from JUnit:

static public void assertEquals(String message, double expected,
    double actual, double delta) {
  if (Double.compare(expected, actual) == 0)
    return;
  if (!(Math.abs(expected - actual) <= delta))
    failNotEquals(message, new Double(expected), new Double(actual));
}  

The purpose of the compare call is to have the test pass when the two numbers are both NaN.
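To see why the plain tolerance check is not enough for NaN, here is a small sketch (the class and method names are hypothetical). The == operator never considers NaN equal to anything, itself included, whereas Double.compare treats two NaN values as equal:

```java
public class NaNCompareDemo {
    // Comparison with the == operator: always false when either side is NaN.
    static boolean sameByOperator(double a, double b) {
        return a == b;
    }

    // Comparison with Double.compare: two NaN values compare as equal.
    static boolean sameByCompare(double a, double b) {
        return Double.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        double nan = Double.NaN;
        System.out.println("NaN == NaN ? " + sameByOperator(nan, nan));
        System.out.println("Double.compare(NaN, NaN) == 0 ? " + sameByCompare(nan, nan));
        // The tolerance check also fails for NaN, because NaN - NaN is NaN
        // and any ordered comparison with NaN is false.
        System.out.println("|NaN - NaN| <= 0.001 ? " + (Math.abs(nan - nan) <= 0.001));
    }
}
```

The same reasoning seems to apply to two equal infinities: their difference is NaN, so only the compare call lets the assertion pass.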

Money is not a float

Saturday, October 17th, 2009

One suggestion I took to heart is that in order to be great, you need to work on fundamentals. It’s no good to be up to date with the latest and greatest, be they Agile techniques or new technologies, if you’re weak on fundamentals.

So I’m starting a collection of fundamentals, that is certainly not going to be comprehensive. Rather, it’s a random collection of things that I think are fundamental, yet many experienced developers get wrong.

Let us start with a surprising discovery: did you know that the number 1/10 cannot be represented in a finite way in base 2? Yep, it turns out that in base 2 the number 1/10 is periodic, much like the number 1/3 has no finite representation in base 10. But what is the implication for us?

The implication comes when we make the mistake of representing money as a floating-point number. Suppose you encode the amount of “ten cents” as the floating-point number 0.10. Now look at this program, and guess what happens when it runs.

  public class MoneyIsNotAFloat {
    public static void main(String[] args) {
      double tenCents = 0.1;
      double sum = 0.0;
      for (int i=0; i<10; i++) { 
        sum += tenCents;
        System.out.println("0.1 * " + (i+1) + " = " + sum);
      }
    }
  }  

(Hint: 0.1 times 10 equals… 0.9999999999999999).

And this is not a Java problem. The same happens with any language, for it’s a matter of floating point arithmetic.

The simple fact is that floating-point arithmetic is not exact, therefore it should not be used for representing money!
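To see where the error comes from, one can ask Java for the exact value stored in the double 0.1. This is only a sketch, with made-up class and method names; it relies on the BigDecimal(double) constructor, which converts the binary value exactly, and on the BigDecimal(String) constructor, which gives the true decimal one tenth:

```java
import java.math.BigDecimal;

public class ExactValueOfTenCents {
    // The exact decimal expansion of the double nearest to 0.1.
    static BigDecimal storedValue() {
        return new BigDecimal(0.1);
    }

    // The true decimal value one tenth.
    static BigDecimal trueValue() {
        return new BigDecimal("0.1");
    }

    public static void main(String[] args) {
        System.out.println("stored: " + storedValue());
        System.out.println("true:   " + trueValue());
        // A positive result means the stored value is slightly more than 1/10.
        System.out.println("stored vs true: " + storedValue().compareTo(trueValue()));
    }
}
```

The stored value is a long decimal that is close to, but not exactly, one tenth; the small errors accumulate as the loop adds it up.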

What to use then? One simple solution is to use a plain int to represent an amount of cents. Integer arithmetic is exact. A 32-bit int holds up to about 21 million currency units when counting cents, which is enough for many applications; a 64-bit long gives far more room. If you’re worried about overflow, use a BigDecimal type. Java has one, and most modern languages do too. (Just a note: if you use a Java BigDecimal, remember that you should not compare them with “equals”, because it also compares the scale, so 1.0 and 1.00 are not equal; you must use “compareTo”. Go figure.)
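Here is a sketch of both suggestions, under a few assumptions of mine: amounts are counted in cents in a long, the format helper handles only non-negative amounts, and all names are hypothetical. It also shows the BigDecimal comparison pitfall, where the relevant methods in Java are equals and compareTo:

```java
import java.math.BigDecimal;
import java.util.Locale;

public class MoneyInCents {
    // Adds ten cents ten times, using exact integer arithmetic.
    static long sumTenCentsTenTimes() {
        long tenCents = 10;
        long sum = 0;
        for (int i = 0; i < 10; i++) {
            sum += tenCents;
        }
        return sum;
    }

    // Hypothetical formatter; assumes a non-negative amount of cents.
    static String format(long cents) {
        return String.format(Locale.ROOT, "%d.%02d", cents / 100, cents % 100);
    }

    public static void main(String[] args) {
        // Exactly 100 cents, with no rounding error.
        System.out.println(format(sumTenCentsTenTimes()));

        BigDecimal a = new BigDecimal("1.0");
        BigDecimal b = new BigDecimal("1.00");
        // equals also compares the scale, so these two are not "equal"...
        System.out.println("equals:    " + a.equals(b));
        // ...while compareTo compares only the numeric value.
        System.out.println("compareTo: " + a.compareTo(b));
    }
}
```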

Throwing the baby out with the bathwater

Wednesday, October 14th, 2009

I was talking with a customer recently, who complimented us because the software we are producing is stable. He also said that one important goal of this project is to write something that is easy to maintain and extend.

At the same time, he was adamant that we should stop doing things like pair programming, and estimating user stories with the whole team, because these things “slow us down”.

Ahem. If you want good quality code, you had better apply these practices, for they are meant for producing quality code. :-) Of course we will continue doing all these practices, for we think it’s best for the project.