On LinkedIn a fellow named Francois Dewaste posts links to some Scala-related articles he runs across, and I often check them out. Recently he shared an article about a Monte Carlo simulation in Scala, written by Alvin Alexander. I found the problem interesting, so I want to share my own Scala solution, which is written using a more functional style, and also a mathematical interpretation of the results.

First I’ll repeat the basic problem:

If two people who are each right on a test 80% of the time (but whose answers are uncorrelated) have the same answer for a particular question, what is the probability that their answer is right?

You’ll get much more out of this post if you spend a minute thinking about the problem. How would you model it mathematically? How would you explore the problem using software?

Apparently several statisticians he asked said that the answer was 80%. Presumably their thinking was along the lines that since their answers are uncorrelated, the fact that two (or three or …) agree means nothing. But that’s wrong.

If you’re like me, your immediate answer will be “there isn’t enough information.” Clearly you would have to know how many possible answers there are, because that will affect the probability that they have the same wrong answer. If there are only two possible answers, then if they are wrong they agree 100% of the time; if there are an infinity of possible answers, then if they are wrong they agree 0% of the time. Looking at his code, though, it was clear that he meant for there to only be two possible answers to each question.

OK — ready for the answers?

### First the Simulation

I know most folks are here to look at the Scala code, so here it is. To make it a bit easier to follow, bold font indicates where the parameters of the simulation are used.

```
object MonteCarlo extends App {
// Get the simulation parameters from the command line.
val Array(numQuestions, numChoices, numPlayers, pctCorrect, numMustAgree)
= args map (_.toInt)
// The choices will be 0 .. numChoices-1; call the last choice correct.
val correctAnswer = numChoices - 1
// Generates an answer with a pctCorrect chance of being correct.
def genAnswer =
if (util.Random.nextInt(100) < pctCorrect)
correctAnswer
else // pick a wrong answer from 0 to correctAnswer-1
util.Random.nextInt(correctAnswer)
// For each question, generate player answers and look for consensus.
// Where consensus is achieved, yield whether or not it is correct.
// The result is an array, with one element for each consensus reached,
// containing true if the answer they agreed on was correct.
val correctnessOfConsensus =
for { i <- 1 to numQuestions
(answer,instances) <- Array.fill(numPlayers)(genAnswer) groupBy identity
if instances.size >= numMustAgree
} yield answer == correctAnswer
// Print how often the consensus answer was correct.
val timesAgreed = correctnessOfConsensus.size
val timesRight = correctnessOfConsensus count identity // num true
println( s"Consensus meant correctness $timesRight out of $timesAgreed times." )
if (timesAgreed > 0) {
println( f"That's ${ 100.0 * timesRight / timesAgreed }%.1f%%." )
}
}
```

It’s interesting to note that because of Scala’s type inference there isn’t a single type specification in the entire program, in spite of the fact that there are eleven bound variables. The call to toInt() is a conversion, not a type specification, and it would be needed in Python or Ruby as well.

The for-comprehension is the heart of the program: it repeatedly (once for each question) generates the players’ answers with the specified probability of being correct, figures out how many times each answer was used, picks out the cases where the required number of people agreed, and yields a Boolean indicating whether or not the agreed-upon value was correct. Note that there can be more than one consensus for a given question; for example, if you use ten players, five of which must agree, then you can achieve consensus twice for a single question. Twelve players can reach consensus four times on a question if only three have to agree. The for-comprehension (with its underlying flatmap) makes all of that trivial.

Now let’s run it for the parameters of the original problem, but with a million questions so we get a more accurate probability estimate:

$scala MonteCarlo 1000000 2 2 80 2Consensus meant correctness 640039 out of 679984 times. That's 94.1%.

And they were also interested in the probability of the answer being correct if 3 out of 3 had it:

$scala MonteCarlo 1000000 2 3 80 3Consensus meant correctness 512213 out of 520220 times. That's 98.5%.

Those are essentially the same answers Alvin got.

### Now the Math

So what’s behind these numbers? In the above runs (where everyone must agree), there are three possibilities for each question:

- A. They all agree and they are correct.
- B. They all agree and they are incorrect.
- C. They don’t all agree.

We don’t care about C; what we are interested in is

p(A) ------------- p(A) + p(B)

Since their answers are uncorrelated, the probability that two players will both be right is the product of the probabilities that each is right. Since each is right 80% of the time, p(A) = square(0.8) = 0.64, and p(B) = square(0.2) = 0.04. So the expression above is 0.64 / ( 0.64 + 0.04 ) = 0.941 (roughly).

With three people, p(A) = cube(0.8) = 0.512 and p(B) = cube(0.2) = 0.008, so the probability that the consensus is correct is 0.512 / ( 0.512 + 0.008 ) = 0.9846 (roughly).

Oh, and remember the point about the number of possible answers mattering? Let’s say that instead of 2 possibilities there are a million:

$scala MonteCarlo 1000000 1000000 2 80 2Consensus meant correctness 640167 out of 640167 times. That's 100.0%.

With so many possible answers, the players almost never agree if they are incorrect. So, if they *do* agree, it’s because they are right.

There you have it! Interesting problem, Alvin — thanks!

## One Trackback

[…] Ruby and Python. We just discussed an instance of the former; here are two examples of the latter: A and […]