Work with asymmetric bounds

Let’s talk about ranges of integers. Quick: how many numbers are in the range 16 ≤ x ≤ 37, that is in the set {16, 17, …, 37}?

If you answered 21 then you got it wrong: the correct answer is 22. When you express a range inclusive of both ends, the formula to compute the number of elements is

size of [lower, upper] = upper - lower + 1

The “+1” in the above formula is the source of many “off-by-one” or fenceposts errors. Can we get rid of the “+1”? Yes, there is a way. Use asymmetric bounds, including the lower bound and excluding the upper bound. The formula becomes

size of [lower, upper) = upper - lower

I’m using square and round brackets as a shorthand notation for ranges with inclusive and exclusive bounds:

  [a, b] = { x | a ≤ x ≤ b }
  [a, b) = { x | a ≤ x < b }

Another reason in favor of asymmetric bounds: how do you express an empty range? With symmetric inclusive ranges, you can’t: the smallest range you can express contains one element. For instance [3, 3] contains only the number 3. With asymmetric bounds, you can express an empty range with, for instance, [3, 3), which is empty.

Asymmetric bounds work very well in conjunction with counting from zero. Java and C arrays are indexed from 0, so that the index range is [0, N). The upper bound is now equal to the number of elements! Expressing it with symmetric bounds gives [0, N-1], which is ugly because of the “-1”.

If you need to iterate on a C or Java array, the standard pattern is

  int[] array = new int[N];
  for (int i=0; i<N; i++) { ... }

You see how the body is executed when i==0, through i==N-1, but not when i==N, since we are using i<N as a boundary condition. How many times is the loop is executed? Exactly N! It’s now easy to see that the iteration is performed the correct number of times.

Asymmetric bounds can be concatenated easily. Suppose you have a function that paginates results on a web page. You are on the page which contains results 30, 31, … 39. If you express the range with two parameters from and to, the page url would look like

  http://example.com/results?from=30&to=40

And the links to the next and previous page would be

  <a href="/results?from=20&to=30">previous page</a>
  <a href="/results?from=40&to=50">next page</a>

where the bounds for the next or previous page are easily computed by adding or subtracting the page size from the current bounds. Compare with

  http://example.com/results?from=30&to=39
  <a href="/results?from=20&to=29">previous page</a>
  <a href="/results?from=40&to=49">next page</a>

it’s more difficult to check that the bounds are correct here: you have to think where the “-1” must be applied.

So, if you define “+” as the operation of concatenation of ranges, you get this nice little law:

  [a, b) + [b, c) = [a, c)

It’s this sort of properties that make it easy to combine software objects together. This helps towards the goal of programming with lego-like bricks.

In conclusion: express ranges with asymmetric bounds! You will avoid off-by-one errors, get rid of “+1″s from your code, and make your programs more modular.

References

The book C Traps and Pitfalls by Andrew Koenig has a good explanation of asymmetric bounds.

Leave a Reply