A semi-forgotten design principle

The common wisdom is that Ruby is slow and Java is fast. In general, it’s true. But is it always? Look at this simple test.

$ cat hello.rb 
puts "Hello world!"
$ ruby hello.rb 
Hello world!
$ time ruby hello.rb 
Hello world!

real	0m0.008s
user	0m0.004s
sys	0m0.003s

So it looks like it takes 8ms to run a simple “hello, world” in Ruby. How does Java compare to this?

$ cat Hello.java 
public class Hello {
    public static void main(String ... args) {
        System.out.println("Hello, world!");
    }
}
$ javac Hello.java 
$ java Hello
Hello, world!
$ time java Hello
Hello, world!

real	0m0.122s
user	0m0.061s
sys	0m0.028s

Even if we ignore the time it takes to compile the Java program, running “Hello, world” in Java takes about 15 times longer than in Ruby (122ms against 8ms). This is due to the long startup time of the Java Virtual Machine. These times were taken on my MacBook Pro; they will differ on other operating systems, but not by much.
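
A single run of `time` is noisy at this scale, so one way to check the comparison is to average several launches. What follows is only a minimal Ruby sketch: `average_startup_seconds` is a hypothetical helper, not part of any library, and it measures wall-clock time around each launch, so process creation is included in the figure.

```ruby
require "rbconfig"

# Hypothetical helper: launch `command` (an array of program and arguments)
# `runs` times and return the average wall-clock seconds per launch.
def average_startup_seconds(command, runs: 10)
  total = 0.0
  runs.times do
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    # Discard the child's output so terminal I/O does not skew the timing.
    launched = system(*command, out: File::NULL, err: File::NULL)
    raise "could not run #{command.first}" if launched.nil?
    total += Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end
  total / runs
end

# Time the Ruby version using the interpreter that runs this script.
ruby_hello = [RbConfig.ruby, "-e", 'puts "Hello world!"']
printf("ruby: %.1f ms per launch\n", average_startup_seconds(ruby_hello) * 1000)
```

Passing `["java", "Hello"]` instead would time the Java version, assuming `java` is on your PATH and `Hello.class` is in the current directory.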

So what, you will say? “The startup time is not important! As soon as the JVM is up and running, Java can run circles around Ruby.”

I don’t agree that startup times are unimportant. Java’s startup time becomes much worse when you run complex applications. A vanilla Tomcat with no web applications installed takes about one minute to start up. Compare that with WEBrick, the Ruby web server, which is up and running with my web application in 3 seconds. That difference matters enormously when you’re developing software. It takes at least one minute, often much longer, to start up a Java application so that you can try it. There are times during development when you need to test the application after each tiny change, and that is very difficult to do in Java. The problem is made much worse by the fact that, in general, Java “containers” can’t reload changed classes without a restart. (WEBrick can do that.)

What, you will say? “Matteo gave up TDD! He tests applications manually by clicking around like a monkey!” No, really, it’s not like this. I always write production code with TDD. That does not mean that you *never* test your stuff manually on the live application. Quite the opposite: there is a danger, with new converts to unit testing, that we trust our tests too much. I’ve seen people declare a story “finished” when all the unit tests are green, without ever checking if it really works! And of course, if you never test it manually, it will not work. There is a need for manual testing (some call it exploratory testing), even if you’re Kent Beck or Miško Hevery.

So I hope you’ll agree with me that short startup times are important for developers. But there are other implications. Because it takes a long time to start up a Java application, Java developers are trained to write applications that run in a single JVM process. For instance, we often see dozens of web applications running in a single Tomcat. If you need concurrent operations, the solution is always to run more threads within the same process. And there is a big problem with this.

The operating system’s concept of a “process” is a very useful one. A “process” is a bundle of threads and resources: memory, open files, network connections, and the like. A process in Unix or Windows is a watertight compartment. When a process terminates, *all* of its resources are released. A process cannot easily corrupt the state of another process. A process can be given limits on how much memory or CPU it can take. It’s very useful to organize a concurrent application as a set of cooperating operating system processes. That’s the way the Apache HTTP Server works, and it is a remarkably reliable piece of software. It’s also one of the smart ideas in the Chrome browser, to run each tab in a separate process.
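
To make the “watertight compartment” idea concrete, here is a minimal Ruby sketch of a parent process that hands work to child processes and talks to them only through a pipe and an exit status. The helper name `run_in_child` is hypothetical, and the workers are just one-line Ruby snippets chosen for illustration. One worker succeeds and one crashes on purpose; the parent carries on either way.

```ruby
require "rbconfig"

# Hypothetical helper: run a snippet of Ruby in a separate OS process.
# Returns the child's standard output and its Process::Status.
def run_in_child(code)
  reader, writer = IO.pipe
  pid = Process.spawn(RbConfig.ruby, "-e", code, out: writer, err: File::NULL)
  writer.close            # the parent keeps only the read end of the pipe
  output = reader.read    # returns once the child has closed its end
  reader.close
  _, status = Process.wait2(pid)
  [output, status]
end

ok_output, ok_status = run_in_child("puts 6 * 7")
bad_output, bad_status = run_in_child('abort "boom"')  # worker fails on purpose

puts "healthy worker said: #{ok_output.strip}"
puts "healthy worker succeeded: #{ok_status.success?}"
puts "crashed worker succeeded: #{bad_status.success?}"
```

The crashed worker costs the parent nothing but a failed exit status: the operating system reclaims the worker's memory and file descriptors when it exits, and the parent's own state is never touched.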

It’s a good design principle to have many small modules communicating through well-defined interfaces, rather than a single monolith where all the threads can interact in unforeseen ways. It’s also the Unix way to design applications as collections of small communicating processes. Which makes me wonder how Sun could ever get us to believe that it’s a good idea to put all of our eggs in a single, huge process. Should not Sun be the champion of the Unix way? But I digress.

In conclusion, I claim that designing applications as small cooperating operating system processes is a good principle. Current Java practice runs against this, but it need not.
