A random collection of unique integers is a requirement for most card games and for many other applications as well. This is part three of a four part series about different ways to create a random selection of unique integers. Both the efficiency and the statistical distribution are analyzed with graphical results using the StopWatch object. The series starts with an article that demonstrates the common, but flawed methods found doing a Bingle search. (You can start at the beginning using this link.)
This article, however, is a much more polished program provided by "Chris" that makes the whole thing much more object oriented and also demonstrates that the "shuffle" method doesn't really provide a fair deal. Page 1 explains the card dealing, including the Fisher-Yates method Chris suggests, and page 2 shows why Chris's program object oriented architecture is better. But Chris also improved on his own program in a fourth article in this series.
From the point of view of card dealing, or providing a truely random array of integers, Chris's program has two major improvements:
- He suggests using the Fisher-Yates algorithm.
- He demonstrates randomness by showing the distribution of the cards
To be honest, I didn't know about Fisher-Yates (and neither did any of the articles I found in my initial Bingle search). But then, In the first edition of his book, possibly the greatest authority in software, Donald Knuth, reinvented it because he didn't know about it either.
The way to understand Fisher-Yates is to think about the way a hand in real deck of cards is dealt. When each card is dealt, it's not in the deck anymore and the probability of drawing that particular card again is zero. The shuffle method in the first article doesn't have that property. As the algorithm proceeds through the deck, the same card might be handled several times. As it turns out, this introduces a subtle bias in the way the cards are distributed. The "check for duplicates" method does produce a statistically random deal, but it's very inefficient. Fisher-Yates solves both of these problems.
- Initialize the array from 1 to the number of elements.
- Pick a random integer from the array.
- Save that integer in the next position in the output array.
- Replace the integer just selected with the last of the remaining cards so the array is still continuous.
- Go back to the first step until all integers in the array have been selected.
(Note: A slightly more efficient method that doesn't require the array to be initialized exists, but it's more confusing.)
Here's Chris's code to do this. (Chris's code is explained on the next page.)
Dim deck = Enumerable.Range(0, numberOfCards).ToArray Dim shuffled As New List(Of Integer) Dim drawnCard As Integer For i = 0 To numberOfCards - 1 drawnCard = rng.Next(numberOfCards - i) ' Get the card from that index shuffled.Add(deck(drawnCard)) ' and replace that card with ' the last of the 'remaining' cards deck(drawnCard) = deck(numberOfCards - (i + 1)) Next GetCards = shuffled.ToArray
Chris also shows the distribution of the integers - whether one set of integers is selected more often than another. This immediately presents a problem. With 5 cards, there are 5 factorial - 120 - unique sets of integers to track. Chris apologizes for using a "simple brute force enumeration to compare the values rather than anything fancy and elegant that handled n cards". But he did code up something at least a little bit elegant that handled 3 cards. Since I discuss the code on the next page, just the result is enough to make the point here. (-1 -1 -1 is an array with duplicates such as 2 2 2. You can see that we don't have any.)
Timings for 100000 iterations -> Check For Dup took 3308.4771 ms -> Array Shuffle took 458.7001 ms -> Lambda took 1461.3899 ms -> Fisher Yates took 515.5012 ms Spread Check for 100000 iterations -> Check For Dup - 0 1 2 : 16790 times - 0 2 1 : 16578 times - 1 0 2 : 16642 times - 1 2 0 : 16673 times - 2 0 1 : 16666 times - 2 1 0 : 16651 times - -1 -1 -1 : 0 times -> Array Shuffle - 0 1 2 : 14626 times - 0 2 1 : 18740 times - 1 0 2 : 18666 times - 1 2 0 : 18404 times - 2 0 1 : 14940 times - 2 1 0 : 14624 times - -1 -1 -1 : 0 times -> Lambda - 0 1 2 : 16779 times - 0 2 1 : 16711 times - 1 0 2 : 16777 times - 1 2 0 : 16668 times - 2 0 1 : 16505 times - 2 1 0 : 16560 times - -1 -1 -1 : 0 times -> Fisher Yates - 0 1 2 : 16632 times - 0 2 1 : 16484 times - 1 0 2 : 16712 times - 1 2 0 : 16809 times - 2 0 1 : 16868 times - 2 1 0 : 16495 times - -1 -1 -1 : 0 times
It seems apparent that ...
- Check for duplicates takes way too long if you're doing more than just a few calculations.
- The commonly recommended shuffle in the first article isn't really a random distribution.