Anti-FOR tips from the Yahtzee Kata
Again on the Kata Yahtzee, that I blogged about some time ago.
If you have not solved the kata at least once, please stop reading this! Get back when you have.
* *
*
Good to see you again! Now that you solved it, you probably know that the naive solution takes many “for” loops. Let D be the player dice, represented as an array of die results, e.g., D=(1,6,1,6,4). The naive rules for sixes would be
def sixes_score sum = 0 for d in D if d == 6 sum += 6 end end return sum end
This solution involves searching for sixes and adding up. Why do we need to search? We need to search because there are many different D that are worth exactly the same for the sixes rule. For instance, both D=(1,2,3,6,6) and D=(6,6,1,2,3) are worth 12.
One way to avoid the search is to represent the die rolls in a canonical form, that is a form where two equivalent results are represented in the same way. The obvious way to obtain a canonical form is to sort D; but in this particular case, we’d still need to search for sixes.
An alternative canonical form would be to count how many results we have: for instance D=(1,1,3,6,6) would be represented as C = [2,0,1,0,0,2], that is “two times 1, one time 3, two times 6.” The rule for sixes becomes
def sixes_score return 6*C[6] end
(Note: we take C to be a 1-based array. It’s easy to make one in Java or Ruby: use an array of length 7 and ignore index 0.)
The straight rules are also easy, because C is a canonical form for them as well:
def small_straight_score if C == [1,1,1,1,1,0] then 15 else 0 end def large_straight_score if C == [0,1,1,1,1,1] then 20 else 0 end
Now, how does this help us with the other rules? Take the Yahtzee rule, for instance. The naive solution
def really_naive_yahtzee_score for (i=1; i < 5 ; i++) return 0 if D[i-1] ≠ D[i] end return 50 end
can be slightly improved by
def slighly_less_naive_yahtzee_score for d in D return 0 if D[0] ≠ d end return 50 end
Using C does not improve much as we still have to search:
def still_naive_yahtzee_score for c in C return 50 if c == 5 end return 0 end
This is because there are many different C that are equivalent with respect to the Yahtzee rule: for instance C = [0,0,0,5,0] and C=[0,5,0,0,0]. Can we apply the same reasoning and find another canonical representation? Why yes! If we sort C = [0,0,0,5,0] to obtain S = {5,0,0,0,0} the yahtzee rule becomes very simple:
def cool_yahtzee_score if S[0] == 5 then 50 else 0 end
Many other rules are immediately codified this way:
def four_of_a_kind_score if S[0] ≥ 4 then sum(D) else 0 end def full_house_score if S[0] == 3 ∧ S[1] == 2 then 25 else 0 end
The pair rule is a bit more challenging: it is not part of the “official” rules but it make for interesting coding :o). The rule is “Pair: The player scores the sum of the two highest matching dice. For example, 3, 3, 3, 4, 4 gives 8.” Using C requires searching. Using S would be no good (can you see why?)
Again: can you find a canonical representation for the pair rule so that we don’t have to search? Hint: remove “noise” to reveal information.
Conclusion
It’s important to find ways to remove IFs. It’s also important to find ways to remove FORs! I blogged about this before.
We used two ways to remove FORs here:
- Use canonical representations, like C and S;
- Hide them in well-known functions like sort and sum.
Our search for canonical forms helps us develop a language for reasoning effectively about our problem domain. In fact we are building a little theory.