38  Beliefs exercises

38.1 Judging a fund manager

You want to know if a fund manager is skilled as you believe skilled management can lead to outperformance. The following is known:

  • 20% of fund managers are skilled.

  • If a fund manager is skilled, the probability that they outperform the market is 80%.

  • If a fund manager is unskilled, the probability that they outperform the market is 40%.

You observe an outperforming fund manager. What is the probability that the fund manager is skilled?

We calculate the solution using Bayes’ rule.

P(\text{skilled}|\text{outperform})=\frac{P(\text{outperform}|\text{skilled})P(\text{skilled})}{P(\text{outperform})}

We can calculate P(outperform) using the law of total probability.

\begin{align*} P(\text{outperform})&=P(\text{outperform}|\text{skilled})P(\text{skilled})+P(\text{outperform}|\text{unskilled})P(\text{unskilled}) \\[6pt] &=0.8\times 0.2+0.4\times 0.8 \\[6pt] &=0.48 \end{align*}

Inputting this into Bayes’ rule, we get:

\begin{align*} P(\text{skilled}|\text{outperform})&=\frac{P(\text{outperform}|\text{skilled})P(\text{skilled})}{P(\text{outperform})} \\[6pt] &=\frac{0.8\times 0.2}{0.48} \\[6pt] &=0.333 \end{align*}

A fund manager who outperforms is skilled with 0.333 probability. You cannot neglect that low base rate of skilled managers.

38.2 Detecting a terrorist

Every month 100 million people fly on commercial airlines. Imagine 10 of them are terrorists.

Airport security are able to correctly identify that a person is a terrorist in 99% of cases and a non-terrorist in 99.9% of cases.

a) A person is identified by airport security as a terrorist. Using Bayes’ rule, what is the probability that they are a terrorist?

We can use Bayes’ Theorem to calculate the conditional probability:

\begin{aligned} P(\text{terrorist | identified}) &= \dfrac{P(\text{identified | terrorist})P(\text{terrorist})}{P(\text{identified})} \\[12 pt] &= \dfrac{0.99\times 0.0000001}{0.99\times 0.0000001+0.001\times 0.999999} \\[12 pt] &= 0.000099 \end{aligned}

Or approximately 1 in 10,000.

b) Intuitive responses to questions of this type tend to involve much higher probabilities. Discuss how intuitive responses could err due to confusion of conditional probabilities.

If a person confuses P(\text{terrorist | identified}) with P(\text{identified | terrorist}) they will wrongly assume the probability that someone identified as a terrorist is a terrorist is 99%. This is a common explanation for mistakes of this nature: e.g. identification of cabs problem discussed in Section 35.3.1.

c) State and solve the question in part a) in terms of natural frequencies.

Number of passengers: 100,000,000

Number of terrorists: 10

Number of terrorists identified as terrorists: 10\times 0.999 \approx 10

Number of non-terrorists identified as terrorists: 0.001\times 100,000,000=100,000

Proportion of people identified as terrorists who are terrorists = 10/(10+100000) \approx 1 \text{ in } 10000

38.3 Rolling a die

You have two six-sided dice:

  • one die is fair with the numbers 1 through to 6 occurring with equal probability

  • the other die is loaded and always rolls a 5 or a 6 with equal probability.

You pull one die out of your pocket and roll it. You did not check which die it was before you rolled. (Assume you could have pulled either die out of your pocket with equal probability.)

a) The die shows a six. What is the probability that it is the loaded die?

We are asked to calculate P(\text{D1 loaded}|6). To calculate this, I will use Bayes’ rule.

\begin{align*} P(\text{D1 loaded}|6)&=\frac{P(6|\text{D1 loaded})P(\text{D1 loaded})}{P(6)} \\[12pt] &=\frac{P(6|\text{D1 loaded})P(\text{D1 loaded})}{P(6|\text{D1 loaded})P(\text{D1 loaded})+P(6|\text{D1 fair})P(\text{D1 fair})} \\[12pt] &=\frac{0.5\times 0.5}{0.5\times 0.5+\frac{1}{6}\times 0.5} \\[12pt] &=0.75 \end{align*}

The first die is the loaded die with 75% probability.

b) You pull the other die out of your pocket and roll it. It shows a 5. What is the updated probability that the first die is the loaded die?

We are asked to calculate P(\text{D1 loaded}|\text{D2 } 5). We have a prior probability of the first die being loaded of 0.75, which comes from the answer in the previous question.

Putting this into Bayes’ rule:

\begin{align*} P(\text{D1 loaded}|\text{D2 } 5)&=\frac{P(\text{D2 } 5|\text{D1 loaded})P(\text{D1 loaded})}{P(\text{D2 } 5)} \\[12pt] &=\frac{P(\text{D2 } 5|\text{D1 loaded})P(\text{D1 loaded})}{P(\text{D2 } 5|\text{D1 loaded})P(\text{D1 loaded})+P(\text{D2 } 5|\text{D1 fair})P(\text{D1 fair})} \\[12pt] &=\frac{\frac{1}{6}\times 0.75}{\frac{1}{6}\times 0.75+0.5\times 0.25} \\[12pt] &=0.5 \end{align*}

Each die is loaded with a 50% probability.

You could reach the same result by calculating the probability that the second die is fair given a 5. This is the same as the probability that the first die is loaded given a 5 on the second die. This approach gives the same result as the above calculation, but you need only think about one die.

\begin{align*} P(\text{D2 fair}|5)&=\frac{P(5|\text{D2 fair})P(\text{D2 fair})}{P(5)} \\[12pt] &=\frac{P(5|\text{D2 fair})P(\text{D2 fair})}{P(5|\text{D2 fair})P(\text{D2 fair})+P(5|\text{D2 loaded})P(\text{D2 loaded})} \\[12pt] &=\frac{\frac{1}{6}\times 0.75}{\frac{1}{6}\times 0.75+0.5\times 0.25} \\[12pt] &=0.5 \end{align*}

38.4 Male or female

These two related questions come from magazine columnist Marilyn vos Savant.

a) A shopkeeper says she has two new baby beagles to show you, but she doesn’t know whether they’re male, female, or a pair. You tell her that you want only a male, and she telephones the fellow who’s giving them a bath. “Is at least one a male?” she asks him. “Yes!” she informs you with a smile. What is the probability that the other one is a male?

The prior probabilities are:

\begin{align*} P(MM)&=0.25 \\[6pt] P(MF)&=P(FM)=0.25 \\[6pt] P(FF)&=0.25 \end{align*}

Using Bayes’ rule:

\begin{align*} P(MM|M)&=\frac{P(M|MM)P(MM)}{P(M)} \\[6pt] &=\frac{P(M|MM)P(MM)}{\begin{aligned}P(&M|MM)P(MM)+P(M|MF)P(MF)\\&+P(M|FM)P(FM)+P(M|FF)P(FF)\end{aligned}} \\[6pt] &=\frac{1\times\frac{1}{4}}{1\times\frac{1}{4}+1\times\frac{1}{4}+1\times\frac{1}{4}+0\times\frac{1}{4}} \\[6pt] &=\frac{\frac{1}{4}}{\frac{3}{4}} \\[6pt] &=\frac{1}{3} \end{align*}

b) Say that a woman and a man (who are unrelated) each have two children. We know that at least one of the woman’s children is a boy and that the man’s oldest child is a boy. Can you explain why the chances that the woman has two boys do not equal the chances that the man has two boys?

For the woman, the prior probabilities before learning she has a boy are:

\begin{align*} P(BB)&=0.25 \\[6pt] P(BG)&=P(GB)=0.25 \\[6pt] P(GG)&=0.25 \end{align*}

Using Bayes’ rule:

\begin{align*} P(BB|B)&=\frac{P(B|BB)P(BB)}{P(B)} \\[12pt] &=\frac{P(B|BB)P(BB)}{\begin{aligned}P(&B|BB)P(BB)+P(B|BG)P(BG)\\&+P(B|GB)P(GB)+P(B|GG)P(GG)\end{aligned}} \\[12pt] &=\frac{1\times\frac{1}{4}}{1\times\frac{1}{4}+1\times\frac{1}{4}+1\times\frac{1}{4}+0\times\frac{1}{4}} \\[12pt] &=\frac{\frac{1}{4}}{\frac{3}{4}} \\[12pt] &=\frac{1}{3} \end{align*}

For the man, the prior probabilities before learning his eldest is a boy are:

\begin{align*} P(BB)&=0.25 \\[6pt] P(BG)&=0.25 \\[6pt] P(GB)&=0.25 \\[6pt] P(GG)&=0.25 \end{align*}

Using Bayes’ rule:

\begin{align*} P(BB|B)&=\frac{P(B|BB)P(BB)}{P(B)} \\[12pt] &=\frac{P(B|BB)P(BB)}{\begin{aligned}P(&B|BB)P(BB)+P(B|BG)P(BG)\\&+P(B|GB)P(GB)+P(B|GG)P(GG)\end{aligned}} \\[12pt] &=\frac{1\times\frac{1}{4}}{1\times\frac{1}{4}+1\times\frac{1}{4}+0\times\frac{1}{4}+0\times\frac{1}{4}} \\[12pt] &=\frac{\frac{1}{4}}{\frac{2}{4}} \\[12pt] &=\frac{1}{2} \end{align*}

Note: questions such are these are typically sensitive to unstated assumptions, particularly around the procedure used to elicit the information around the sex of the child. Due to this, part a) is probably less vulnerable to alternative assumptions than b) as it contains information about the elicitation procedure in the question.

38.5 Luggage

You have taken a flight and are worried that your luggage will not be on the flight. You know that 20% of bags have not been arriving with the passenger.

The following table of conditional probabilities (from Pearl & Mackenzie (2018)) gives the probability that your bag will be on the luggage carousel conditional on the bag being on the flight and how long you have been waiting at the carousel.

Bag on plane Time elapsed Carousel=true
False 0 0
False 1 0
False 2 0
False 3 0
False 4 0
False 5 0
False 6 0
False 7 0
False 8 0
False 9 0
False 10 0
True 0 0
True 1 10
True 2 20
True 3 30
True 4 40
True 5 50
True 6 60
True 7 70
True 8 80
True 9 90
True 10 100

a) You have been waiting 5 minutes for your bag and it has not arrived. What is the probability that your bag was not on the flight?

\begin{align*} P(\text{false}|\text{not arrived after 5})&=\frac{P(\text{not arrived after 5}|\text{false})P(\text{false})}{P(\text{not arrived after 5})} \\[12pt] &=\frac{P(\text{not arrived after 5}|\text{false})P(\text{false})}{\begin{aligned}P(&\text{not arrived after 5}|\text{false})P(\text{false})\\&+P(\text{not arrived after 5}|\text{true})P(\text{true})\end{aligned}} \\[12pt] &=\frac{1\times 0.2}{1\times 0.2+0.5\times 0.8} \\[12pt] &=0.333 \end{align*}

The probability that the bag was not on the flight is 33%.

b) You have been waiting 9 minutes for your bag and it has not arrived. What is the probability that your bag was not on the flight?

\begin{align*} P(\text{false}|\text{not arrived after 9})&=\frac{P(\text{not arrived after 9}|\text{false})P(\text{false})}{P(\text{not arrived after 9})} \\[12pt] &=\frac{P(\text{not arrived after 9}|\text{false})P(\text{false})}{\begin{aligned}P(&\text{not arrived after 9}|\text{false})P(\text{false})\\&+P(\text{not arrived after 9}|\text{true})P(\text{true})\end{aligned}} \\[12pt] &=\frac{1\times 0.2}{1\times 0.2+0.1\times 0.8} \\[12pt] &=0.714 \end{align*}

The probability that the bag was not on the flight is 71%.

c) You have been waiting 10 minutes for your bag and it has not arrived. What is the probability that your bag was not on the flight?

\begin{align*} P(\text{false}|\text{not arrived after 10})&=\frac{P(\text{not arrived after 10}|\text{false})P(\text{false})}{P(\text{not arrived after 10})} \\[12pt] &=\frac{P(\text{not arrived after 10}|\text{false})P(\text{false})}{\begin{aligned}P(&\text{not arrived after 10}|\text{false})P(\text{false})\\&+P(\text{not arrived after 10}|\text{true})P(\text{true})\end{aligned}} \\[12pt] &=\frac{1\times 0.2}{1\times 0.2+0\times 0.8} \\[12pt] &=1 \end{align*}

The probability that the bag was not on the flight is 100%.

38.6 Murder?

In a now famous case, Sally Clark was convicted of murder after the death of her two sons. The defence argued both children had died of sudden infant death syndrome. The prosecution called statistical evidence that the chance of a SIDS death was 1 in 8543, so the chance of two children dying of SIDS was 1 in 73 million (8543\times 8543).

The conviction was overturned on a second appeal. For what reasons could the following simple calculation be misleading?

\begin{align*} P(\text{SIDS death}\land\text{SIDS death})&=P(\text{SIDS death})\times P(\text{SIDS death}) \\[12pt] &=\frac{1}{8543}\times\frac{1}{8543} \\[12pt] &=\frac{1}{7.3\times 10^{7}} \end{align*}

Reason 1: The probability of a 2nd SIDS death given first SIDS death is not independent. e.g. SIDS deaths are related due to genetics, family environment. That is, the relevant probability is:

P(\text{SIDS death}|\text{genetics, family environment, etc})

This would mean that:

P(\text{2 SIDS deaths in same family})>(P(\text{1 SIDS death}))^2

The appropriate calculation is:

P(\text{SIDS death}\land\text{SIDS death})=P(1^{st}\text{ SIDS death})\times P(2^{nd}\text{ SIDS death}|1^{st}\text{ SIDS death})

Reason 2: We also need to consider the probability of alternative possibility - i.e. murder. We want calculate:

P(\text{murder}|{2\text{ deaths}})=\frac{P(2\text{ deaths}|\text{murder})P(\text{murder})}{P(2\text{ deaths})}

If murder itself is also unlikely, then it is not correct to simply attribute the alternative to SIDS the residual probability.

38.7 A fire alarm

You know the following statistics about fire:

  • The probability of your house catching fire on any particular day is 1 in 10,000

  • Your fire alarm correctly detects a house fire 95% of the time

  • The probability that your fire alarm sounds on a day when there is no fire (a false alarm) is 1 in 100.

a) Your alarm goes off. What is the probability that your house is on fire?

\begin{align*} P(\text{fire}|\text{alarm})&=\frac{P(\text{alarm}|\text{fire})P(\text{fire})}{P(\text{alarm})} \\[12pt] &=\frac{P(\text{alarm}|\text{fire})P(\text{fire})}{P(\text{alarm}|\text{fire})P(\text{fire})+P(\text{alarm}|\neg\text{fire})P(\neg\text{fire})} \\[12pt] &=\frac{0.95\times 0.0001}{0.95\times 0.0001+0.01\times 0.9999} \\[12pt] &=0.0094 \end{align*}

The probability of a fire if the alarm goes off is 0.94%.

b) Many people given this problem estimate the probability of the house being on fire as close to 95%. Provide one possible explanation for this error.

People often confuse P(A|B) with P(B|A). In this case, such confusion would lead them to conclude that P(\text{fire}|\text{alarm})=P(\text{alarm}|\text{fire})=95\%. You might also think of this as anchoring on the 95% and insufficiently adjusting from there.

Alternatively, people sometimes act as though they have assumed uniform priors: e.g. 50:50 as to whether a fire or not.

In that case:

\begin{align*} P(\text{fire}|\text{alarm})&=\frac{P(\text{alarm}|\text{fire})P(\text{fire})}{P(\text{alarm})} \\[12pt] &=\frac{P(\text{alarm}|\text{fire})P(\text{fire})}{P(\text{alarm}|\text{fire})P(\text{fire})+P(\text{alarm}|\neg\text{fire})P(\neg\text{fire})} \\[12pt] &=\frac{0.95\times 0.5}{0.95\times 0.5+0.01\times 0.5} \\[12pt] &=0.9896 \end{align*}

c) Express and solve this problem using natural frequencies.

  • Your house will catch fire on 100 out of 1,000,000 days. (You could choose any base number of days - I chose 1 million as gives round numbers for the following items.)

  • Your fire alarm will correctly detect a house fire on 95 of those days.

  • You will have a false alarm on 9,999 out of the 999,900 days without fire.

\begin{align*} P(\text{fire}|\text{alarm})&=\frac{95}{95+9999} \\[12pt] &=0.0094 \end{align*}

38.8 The law of small numbers

Lincoln observes performance by fund manager Neville. Neville may be a skilled, mediocre or unskilled manager:

  • A skilled fund manager has a 75% chance of beating the market each quarter.

  • A mediocre fund manager has a 50% chance of beating the market each quarter.

  • An unskilled fund manager has a 25% chance of beating the market each quarter.

Lincoln knows these odds.

The performance of a fund manager is independent from quarter to quarter.

Consider the model we used to examine behaviour involving a belief in the law of small numbers whereby the decision maker acts as though the process has the character of balls being drawn out of an urn without replacement. Lincoln develops his beliefs using this model with N = 12.

a) Lincoln thinks Neville is mediocre. What does Lincoln believe is the probability that Neville beats the market in the first quarter?

Lincoln thinks the realisation of Neville’s performance is like drawing from an urn with N=12 balls. Because he believes Neville is mediocre, he thinks half (6) of the balls are good-performance balls and half (6) are bad-performance balls. The likelihood of drawing a good ball (G) the first quarter is 6/12.

\begin{align*} \hat P(G)&=\frac{G}{N} \\[12pt] &=\frac{6}{12} \\[12pt] &=0.5 \end{align*}

b) Neville beats the market in the first quarter. What does Lincoln believe is the probability he does it again in the second quarter?

In Lincoln’s mind, the balls are not replaced once drawn. If Neville has a good first quarter, Lincoln believes that only five good balls remain. Therefore, Lincoln believes that the probability of Neville having a good second quarter is 5/11.

\begin{align*} \hat P(GG|G)&=\frac{G-1}{N-1} \\[12pt] &=\frac{5}{11} \end{align*}

c) Neville beats the market again. What does Lincoln believe is the probability that he will do so in the third quarter?

After two good quarters, Lincoln believes only four good balls remain. Therefore, the probability of Neville having another good quarter is 4/10.

\begin{align*} \hat P(GGG|GG)&=\frac{G-2}{N-2} \\[12pt] &=\frac{4}{10} \end{align*}

d) Lincoln observes Jill, who he believes is a skilled fund manager. What does Lincoln believe is the probability of her having 10 consecutive periods of out-performance?

Lincoln believes that in 12 quarters Jill will have nine quarters of out-performance. As a result, he does not believe it is possible for her to have ten consecutive periods of out-performance. After nine periods, only three balls are left in the urn. None of those balls are good.

\begin{align*} \hat P(GGGGGGGGGG|GGGGGGGGG)&=\frac{G-9}{N-9} \\[12pt] &=\frac{0}{3} \\[12pt] &=0 \end{align*}

e) What psychological bias does Lincoln’s behaviour reflect? Explain.

Each time Neville has a good quarter, Lincoln thinks it is less likely that Neville will have another. This is an example of gambler’s fallacy. Lincoln thinks that Neville’s sequence of performances should be the typical sequence of a mediocre fund manager, with the same number of good and bad quarters. This leads Lincoln to expect bad quarters to be more likely after a sequence of good quarters.

Similarly for Jill, Lincoln expects a string of success to correct itself and her record to revert to the average for a skilled manager.

38.9 Heuristics

For each of the following experiments from Tverksy and Kahneman (1974), explain what heuristic may be leading to the belief or decision.

a) Tverksy and Kahneman (1974) write:

Subjects were shown brief personality descriptions of several individuals, allegedly sampled at random from a group of 100 professionals-engineers and lawyers. The subjects were asked to assess, for each description, the probability that it belonged to an engineer rather than to a lawyer. In one experimental condition, subjects were told that the group from which the descriptions had been drawn consisted of 70 engineers and 30 lawyers. In another condition, subjects were told that the group consisted of 30 engineers and 70 lawyers. The odds that any particular description belongs to an engineer rather than to a lawyer should be higher in the first condition, where there is a majority of engineers, than in the second condition, where there is a majority of lawyers. Specifically, it can be shown by applying Bayes’ rule that the ratio of these odds should be (.7/.3^2), or 5.44, for each description. In a sharp violation of Bayes’ rule, the subjects in the two conditions produced essentially the same probability judgments.

This behaviour might be explained by representativeness. Tverksy and Kahneman (1974) write:

[S]ubjects evaluated the likelihood that a particular description belonged to an engineer rather than to a lawyer, by the degree to which this description was representative of the two stereotypes, with little or no regard for the prior probabilities of the categories.

The subjects used prior probabilities correctly when they had no other information. In the absence of a personality sketch, they judged the probability that an unknown individual is an engineer to be .7 and .3, respectively, in the two base-rate conditions. However, prior probabilities were effectively ignored when a description was introduced, even when this description was totally uninformative.

b) Tverksy and Kahneman (1974) write:

Suppose one samples a word (of three letters or more) at random from an English text. Is it more likely that the word starts with r or that r is the third letter? … [M]ost people judge words that begin with a given consonant to be more numerous than words in which the same consonant appears in the third position. They do so even for consonants, such as r or k, that are more frequent in the third position than in the first.

This behaviour might be explained by availability. Tverksy and Kahneman (1974) write:

People approach this problem by recalling words that begin with r (road) and words that have r in the third position (car) and assess the relative frequency by the ease with which words of the two types come to mind. Because it is much easier to search for words by their first letter than by their third letter, most people judge words that begin with a given consonant to be more numerous than words in which the same consonant appears in the third position.

c) Tverksy and Kahneman (1974) write:

Two groups of high school students estimated, within 5 seconds, a numerical expression that was written on the blackboard. One group estimated the product

8\times 7\times 6\times 5\times 4\times 3\times 2\times 1

while another group estimated the product

1\times 2\times 3\times 4\times 5\times 6\times 7\times 8

The median estimate for the ascending sequence was 512, while the median estimate for the descending sequence was 2,250. The correct answer is 40,320.

This behaviour might be explained by anchoring and adjustment. Tverksy and Kahneman (1974) write:

To rapidly answer such questions, people may perform a few steps of computation and estimate the product by extrapolation or adjustment. Because adjustments are typically insufficient, this procedure should lead to underestimation. Furthermore, because the result of the first few steps of multiplication (performed from left to right) is higher in the descending sequence than in the ascending sequence, the former expression should be judged larger than the latter.

d) Tverksy and Kahneman (1974) write:

In considering tosses of a coin for heads or tails … people regard the sequence H-T-H-T-T-H to be more likely than the sequence H-H-H-T-T-T … [or] … the sequence H-H-H-H-T-H.

This behaviour might be explained by representativeness. Tverksy and Kahneman (1974) write:

People expect that a sequence of events generated by a random process will represent the essential characteristics of that process even when the sequence is short. …

[P]eople expect that the essential characteristics of the process will be represented, not only globally in the entire sequence, but also locally in each of its parts. A locally representative sequence, however, deviates systematically from chance expectation: it contains too many alternations and too few runs. Another consequence of the belief in local representativeness is the well-known gambler’s fallacy. After observing a long run of red on the roulette wheel. for example, most people erroneously believe that black is now due, presumably because the occurrence of black will result in a more representative sequence than the occurrence of an additional red.

38.10 Overconfidence

Consider the following three statements. Suppose that each statement is an instance of overconfidence. For each statement name and define the form of overconfidence that provides the best explanation for the students’ beliefs.

a) 90% of students believe they will score above the class average in the final exam.

Overplacement.

Overplacement is the erroneous relative judgement that we are better than others.

b) 90% of students believe they will receive a high distinction.

Overestimation.

Overestimation is the belief that we can perform at a level beyond that which we realistically can.

c) Arthur believes with 90% probability that he will score between 74% and 76% in the final exam.

Overprecision.

Overprecision is the tendency to believe that our predictions or estimates are more accurate than they actually are.

38.11 Lethal events

When people are asked the frequency of lethal events, they are often inaccurate. The following table lists those events most subject to under- or over-estimation of the frequency.

Most overestimated Most underestimated

All accidents

Motor vehicle accidents

Tornadoes

Flood

All cancer

Fire and flames

Venomous bite or sting

Homicide

Diabetes

Stomach cancer

Stroke

Tuberculosis

Asthma

Emphysema

What heuristic could lead to this pattern of overestimation and underestimation? Why?

The most overestimated events tend to be vivid events that are often the subject of news. The most underestimated are much less vivid and likely receive less coverage.

This pattern could be driven by the availability heuristic. When using the availability heuristic, people judge the frequency of events by the ease with which instances of those events come to mind.

When asked to estimate the frequency of vivid events often in the news, instances of those events will easily come to mind. The availability heuristic will lead these events to be judged more probable.

Conversely, people will find it harder to call to mind those events which are less vivid and newsworthy, leading them to judge those events as being less frequent.