Anti-FOR tips from the Yahtzee Kata

Again on the Kata Yahtzee, that I blogged about some time ago.

If you have not solved the kata at least once, please stop reading this! Get back when you have.


*         *
*

Good to see you again! Now that you solved it, you probably know that the naive solution takes many “for” loops. Let D be the player dice, represented as an array of die results, e.g., D=(1,6,1,6,4). The naive rules for sixes would be

    def sixes_score 
      sum = 0
      for d in D
        if d == 6
          sum += 6
        end
      end
      return sum
    end
  

This solution involves searching for sixes and adding up. Why do we need to search? We need to search because there are many different D that are worth exactly the same for the sixes rule. For instance, both D=(1,2,3,6,6) and D=(6,6,1,2,3) are worth 12.

One way to avoid the search is to represent the die rolls in a canonical form, that is a form where two equivalent results are represented in the same way. The obvious way to obtain a canonical form is to sort D; but in this particular case, we’d still need to search for sixes.

An alternative canonical form would be to count how many results we have: for instance D=(1,1,3,6,6) would be represented as C = [2,0,1,0,0,2], that is “two times 1, one time 3, two times 6.” The rule for sixes becomes

    def sixes_score
      return 6*C[6]
    end
  

(Note: we take C to be a 1-based array. It’s easy to make one in Java or Ruby: use an array of length 7 and ignore index 0.)

The straight rules are also easy, because C is a canonical form for them as well:

    def small_straight_score
      if C == [1,1,1,1,1,0] then 15 else 0
    end
    def large_straight_score
      if C == [0,1,1,1,1,1] then 20 else 0
    end
  

Now, how does this help us with the other rules? Take the Yahtzee rule, for instance. The naive solution

    def really_naive_yahtzee_score   
      for (i=1; i < 5 ; i++)
        return 0 if D[i-1] ≠ D[i]
      end
      return 50
    end
  

can be slightly improved by

    def slighly_less_naive_yahtzee_score     
      for d in D
        return 0 if D[0] ≠ d
      end
      return 50
    end
  

Using C does not improve much as we still have to search:

    def still_naive_yahtzee_score
      for c in C
        return 50 if c == 5
      end
      return 0
    end
  

This is because there are many different C that are equivalent with respect to the Yahtzee rule: for instance C = [0,0,0,5,0] and C=[0,5,0,0,0]. Can we apply the same reasoning and find another canonical representation? Why yes! If we sort C = [0,0,0,5,0] to obtain S = {5,0,0,0,0} the yahtzee rule becomes very simple:

    def cool_yahtzee_score
      if S[0] == 5 then 50 else 0
    end
  

Many other rules are immediately codified this way:

    def four_of_a_kind_score
      if S[0] ≥ 4 then sum(D) else 0
    end
    
    def full_house_score
      if S[0] == 3 ∧ S[1] == 2 then 25 else 0
    end
  

The pair rule is a bit more challenging: it is not part of the “official” rules but it make for interesting coding :o). The rule is “Pair: The player scores the sum of the two highest matching dice. For example, 3, 3, 3, 4, 4 gives 8.” Using C requires searching. Using S would be no good (can you see why?)

Again: can you find a canonical representation for the pair rule so that we don’t have to search? Hint: remove “noise” to reveal information.

Conclusion

It’s important to find ways to remove IFs. It’s also important to find ways to remove FORs! I blogged about this before.

We used two ways to remove FORs here:

  1. Use canonical representations, like C and S;
  2. Hide them in well-known functions like sort and sum.

Our search for canonical forms helps us develop a language for reasoning effectively about our problem domain. In fact we are building a little theory.

Leave a Reply