Brighter Thinking
A Level Further Mathematics for AQA Statistics Student Book (AS/A Level) Stephen Ward and Paul Fannon
Contents
Introduction
How to use this book
1 Discrete random variables
  Section 1: Average and spread of a discrete random variable
  Section 2: Expectation and variance of transformations of discrete random variables
  Section 3: The discrete uniform distribution
2 Poisson distribution
  Section 1: Using the Poisson model
  Section 2: Using the Poisson distribution in hypothesis tests
3 Chi-squared tests
  Section 1: Contingency tables
  Section 2: Yates' correction
4 Continuous distributions
  Section 1: Continuous random variables
  Section 2: Expectation and variance of continuous random variables
  Section 3: Expectation and variance of functions of a random variable
  Section 4: Sums of independent random variables
  Section 5: Linear combinations of normal variables
  Section 6: Cumulative distribution functions
  Section 7: Piecewise-defined probability density functions
  Section 8: Rectangular distribution
  Section 9: Exponential distribution
  Section 10: Combining discrete and continuous random variables
Focus on … Proof 1
Focus on … Problem solving 1
Focus on … Modelling 1
5 Further hypothesis testing
  Section 1: t-tests
  Section 2: Errors in hypothesis testing
6 Confidence intervals
  Section 1: Confidence intervals
  Section 2: Confidence intervals for the mean when the population variance is unknown
Focus on … Proof 2
Focus on … Problem solving 2
Focus on … Modelling 2
Cross-topic review exercise
Practice paper 1
Practice paper 2
Formulae
Answers
Glossary
Index
Acknowledgements
© Cambridge University Press 2017 The third party copyright material that appears in this sample may still be pending clearance and may be subject to change.
1 Discrete random variables
In this chapter you will learn how to:
• predict the mean, mode, median and variance of a discrete random variable
• understand how a linear transformation of a variable changes the mean and variance
• prove and use the formulae for expectation and variance of a special distribution called the uniform distribution
• recognise when it is appropriate to use a uniform distribution.
If you are following the A Level course, you will also learn how to:
• calculate the mean of a discrete random variable after a non-linear transformation.
Before you start…
You should know how to use the rules of probability.
1 Two events A and B are independent. If P(A) = 0.4 and P(B) = 0.3, find P(A and B).
A Level Mathematics Student Book 1, Chapter 21

You should know how to find probabilities of discrete random variables.
2 P(X = x) = kx for x = 1, 2, 3. Find the value of k.
A Level Mathematics Student Book 1, Chapter 20

You should know how to find the mean, variance and standard deviation of data, including familiarity with formulae involving sigma notation.
3 Find the variance of 2, 5 and 8.
Further Mathematics Student Book 1, Chapter 11

You should know how to calculate sums of powers of n.
4 Find and simplify an expression for $\sum_{r=1}^{n} r(r-1)$.
A Level Mathematics Student Book 1, Chapter 21

What are discrete random variables?
A random variable is a variable which can change every time it is observed – such as the outcome when you roll a dice. A discrete random variable can only take certain values. In A Level Mathematics Student Book 1, Chapter 21, you covered the probability distributions of discrete random variables – a table or rule giving a list of all possible outcomes along with their probabilities.
Many real-life situations follow probability distributions – such as the velocity of a molecule in a waterfall or the amount of tax paid by an individual. It is extremely difficult to make a prediction about a single observation, but it turns out that you can predict remarkably accurately the overall behaviour of many millions of observations. In this chapter you will see how you can predict the mean and variance of a discrete random variable.
Tip Discrete variables don’t have to take integer values. However the possible distinct values can be listed, though the list can be infinite. For example: If X is the standard UK shoe size of a random adult member of the public, X takes values 2, 2.5, 3, 3.5 up to 15.5 and is a discrete random variable. If Y is the exact foot length of a random adult member of the public (in cm), Y takes values in the interval [20, 35] and is a continuous random variable.
Section 1: Average and spread of a discrete random variable
The most commonly used measure of the average of a random variable is the expectation. It is a value representing the mean result if the variable were to be measured an infinite number of times.

Tip
The expectation of a random variable does not need to be a value which the variable can actually take.

Key point 1.1
The expectation of a random variable X is written E(X) and calculated as
$E(X) = \sum x_i p_i$

Tip
The subscript i in the formula in Key point 1.1 is just a counter referring to each possible value and its associated probability.

You do not need to be able to prove this result, but you might find it helpful to see this proof.
PROOF 1
The mean of n pieces of discrete data is
$\bar{x} = \frac{1}{n} \sum f_i x_i$
Start from the definition of the mean.
$= \sum \frac{f_i}{n} x_i$
Since $\frac{1}{n}$ is constant you can take it into the sum.
If n is large, $\frac{f_i}{n}$ will tend towards the probability of $x_i$ happening, therefore
$\bar{x} = \mu = \sum x_i p_i$
When the sample size tends to infinity, the sample mean $\bar{x}$ becomes the true population mean, $\mu$.
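The idea behind Proof 1 can be illustrated with a short simulation. This is only a sketch: it assumes a fair six-sided dice as the random variable, and the function name `sample_mean` and the fixed seed are choices made for this illustration.

```python
import random

# Assumed example: a fair six-sided dice, so each value has probability 1/6.
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

# Expectation from Key point 1.1: the sum of x_i * p_i.
expectation = sum(x * p for x, p in zip(values, probs))

def sample_mean(n, seed=0):
    """Mean of n simulated rolls (the seed is fixed so runs are repeatable)."""
    rng = random.Random(seed)
    rolls = rng.choices(values, weights=probs, k=n)
    return sum(rolls) / n

# The sample mean should settle near the expectation as n grows.
for n in (100, 10_000, 100_000):
    print(n, sample_mean(n), expectation)
```

For small n the sample mean can be noticeably different from the expectation; the agreement improves as n increases, which is what the proof describes.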
WORKED EXAMPLE 1.1
The random variable X has a probability distribution as shown in the table. Calculate E(X).
x        | 1    | 2   | 3    | 4   | 5   | 6
P(X = x) | 1/10 | 1/4 | 1/10 | 1/4 | 1/5 | 1/10

$E(X) = 1 \times \frac{1}{10} + 2 \times \frac{1}{4} + 3 \times \frac{1}{10} + 4 \times \frac{1}{4} + 5 \times \frac{1}{5} + 6 \times \frac{1}{10} = \frac{7}{2}$
Use the values from the distribution in the formula in Key point 1.1.
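The calculation in Worked example 1.1 can be checked with a few lines of Python. This is a minimal sketch: storing the distribution as a dictionary and the function name `expectation` are choices made here, not notation from the book.

```python
from fractions import Fraction

def expectation(dist):
    """E(X): the sum of x * P(X = x) over a dict {value: probability}."""
    return sum(x * p for x, p in dist.items())

# Distribution from Worked example 1.1, kept as exact fractions.
dist = {
    1: Fraction(1, 10), 2: Fraction(1, 4), 3: Fraction(1, 10),
    4: Fraction(1, 4), 5: Fraction(1, 5), 6: Fraction(1, 10),
}

# A valid probability distribution must have total probability 1.
assert sum(dist.values()) == 1

print(expectation(dist))  # 7/2
```

Using exact fractions avoids rounding, so the result matches the hand calculation exactly.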
As well as knowing the expected average, you may also be interested in how far away from the average you can expect an outcome to be. The variance, $\sigma^2$, of a random variable is a value representing the degree of variation that would be seen if the variable were to be repeatedly measured an infinite number of times. It is a measure of how spread out the variable is.
Fast forward You will see in Section 2 how to find expectations of other functions of X.
Key point 1.2
The variance of a random variable X is written Var(X) and calculated as
$\mathrm{Var}(X) = E(X^2) - [E(X)]^2$ where $E(X^2) = \sum x_i^2 p_i$

The quantity $\sum x_i^2 p_i$ is the expected value of $X^2$, read as ‘the mean of the squares’. This variance formula is often read as ‘the mean of the squares minus the square of the mean’.

Did you know?
Standard deviation – the square root of variance – is a much more meaningful representation of the spread of a variable. So why is variance used at all? The answer is purely to do with mathematical elegance. It turns out that the algebra of variance is far neater than the algebra of standard deviations.

WORKED EXAMPLE 1.2
Calculate Var(X) for the probability distribution in Worked example 1.1.
From Worked example 1.1: E(X) = 3.5
$E(X^2) = 1^2 \times \frac{1}{10} + 2^2 \times \frac{1}{4} + 3^2 \times \frac{1}{10} + 4^2 \times \frac{1}{4} + 5^2 \times \frac{1}{5} + 6^2 \times \frac{1}{10} = 14.6$
Use the values from the distribution in the formula in Key point 1.2.
$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = 14.6 - 12.25 = 2.35$

Tip
Many calculators can simplify this process. You normally have to treat the values of the random variable as data and the probabilities as the frequency.
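The variance in Worked example 1.2 can also be checked in code, in the same spirit as the calculator approach. The sketch below assumes the distribution is stored as a dictionary; the function name `variance` is a choice made for this illustration.

```python
from fractions import Fraction

def variance(dist):
    """Var(X) = E(X^2) - [E(X)]^2 for a dict {value: probability}."""
    mean = sum(x * p for x, p in dist.items())
    mean_of_squares = sum(x ** 2 * p for x, p in dist.items())
    return mean_of_squares - mean ** 2

# Distribution from Worked example 1.1.
dist = {
    1: Fraction(1, 10), 2: Fraction(1, 4), 3: Fraction(1, 10),
    4: Fraction(1, 4), 5: Fraction(1, 5), 6: Fraction(1, 10),
}

print(variance(dist))         # 47/20
print(float(variance(dist)))  # 2.35
```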
Two other less commonly used measures of average are the mode and the median. For data, the mode is the most common result and this extends to variables.
Key point 1.3
The mode of a random variable X is the value of X associated with the largest probability.
For data, the median is the value which has half the data values below it and half above it. You can interpret this in terms of probabilities.
Key point 1.4
The median, M, of a random variable X is any value which has P(X ≤ M) ≥ 0.5 and P(X ≥ M) ≥ 0.5. If there are two possible values, you have to find their mean.
When there are two possible values and you have to take their mean, the median will take a value different from any observed value of the random variable.

WORKED EXAMPLE 1.3
For the distribution in Worked example 1.1 find:
a the mode
b the median.

a The largest probability is 1/4 so there are two modes: 2 and 4.

b
x        | 1    | 2    | 3    | 4   | 5   | 6
P(X = x) | 1/10 | 1/4  | 1/10 | 1/4 | 1/5 | 1/10
P(X ≤ x) | 0.1  | 0.35 | 0.45 | 0.7 | 0.9 | 1
You can create a table of P(X ≤ x).
So the median is 4.
Look for the first value which has a value of P(X ≤ x) greater than or equal to 0.5. You could also check that P(X ≥ x) ≥ 0.5 but this is not necessary here.
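The mode and median rules in Key points 1.3 and 1.4 can be sketched in Python as follows. The function names `modes` and `median` are hypothetical names chosen for this illustration, and the median function only considers the listed values of the variable as candidates.

```python
from fractions import Fraction

def modes(dist):
    """All values whose probability equals the largest probability."""
    top = max(dist.values())
    return [x for x in sorted(dist) if dist[x] == top]

def median(dist):
    """Values M with P(X <= M) >= 0.5 and P(X >= M) >= 0.5.

    If two values qualify, their mean is returned (Key point 1.4).
    """
    half = Fraction(1, 2)
    candidates = []
    for m in sorted(dist):
        at_most = sum(p for x, p in dist.items() if x <= m)
        at_least = sum(p for x, p in dist.items() if x >= m)
        if at_most >= half and at_least >= half:
            candidates.append(m)
    return sum(candidates) / len(candidates)

# Distribution from Worked example 1.1.
dist = {
    1: Fraction(1, 10), 2: Fraction(1, 4), 3: Fraction(1, 10),
    4: Fraction(1, 4), 5: Fraction(1, 5), 6: Fraction(1, 10),
}

print(modes(dist))   # [2, 4]
print(median(dist))  # 4.0
```

For a distribution such as P(X = 1) = P(X = 2) = 0.5, both values satisfy the conditions and the function returns their mean, 1.5.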
A probability distribution can also be described by a function.

WORKED EXAMPLE 1.4
W is a random variable which can take values −0.5, 1.5, 2.5 and k, where k > 0.
$P(W = w) = \frac{w^2}{29}$
a Find the value of k.
b Find the expected mean of W.
c Find the standard deviation of W.

a $\frac{(-0.5)^2}{29} + \frac{1.5^2}{29} + \frac{2.5^2}{29} + \frac{k^2}{29} = 1$
Use the fact that the total probability must add up to 1.
$0.25 + 2.25 + 6.25 + k^2 = 29$
$k^2 = 20.25$
k = 4.5 since k > 0.

b $E(W) = -0.5 \times \frac{(-0.5)^2}{29} + 1.5 \times \frac{1.5^2}{29} + 2.5 \times \frac{2.5^2}{29} + 4.5 \times \frac{4.5^2}{29} \approx 3.79$
Use Key point 1.1.

c To find the standard deviation you first need to find the variance which means you need to find $E(W^2)$ and use Key point 1.2.
$E(W^2) = (-0.5)^2 \times \frac{(-0.5)^2}{29} + 1.5^2 \times \frac{1.5^2}{29} + 2.5^2 \times \frac{2.5^2}{29} + 4.5^2 \times \frac{4.5^2}{29} \approx 15.7$
$\mathrm{Var}(W) = 15.7 - 3.79^2 \approx 1.28$
Although you only write down three significant figures in the working, make sure you use the full accuracy from your calculator to find the final answer.
So $\sigma \approx \sqrt{1.28} = 1.13$ (3 s.f.)
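The working in Worked example 1.4 can be reproduced numerically. The sketch below assumes the value k = 4.5 found in part a and keeps full floating-point accuracy until the final rounding, as the example recommends; the variable names are choices made for this illustration.

```python
import math

k = 4.5  # value found in part a
values = [-0.5, 1.5, 2.5, k]

# P(W = w) = w^2 / 29 for each possible value.
probs = [w ** 2 / 29 for w in values]
assert math.isclose(sum(probs), 1.0)  # total probability is 1

mean = sum(w * p for w, p in zip(values, probs))          # E(W)
mean_sq = sum(w ** 2 * p for w, p in zip(values, probs))  # E(W^2)
var = mean_sq - mean ** 2                                 # Key point 1.2
sd = math.sqrt(var)

# Round only at the end, to 3 significant figures.
print(round(mean, 2), round(mean_sq, 1), round(var, 2), round(sd, 2))
# 3.79 15.7 1.28 1.13
```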
WORK IT OUT 1.1
Find the variance of X, the random variable defined by this distribution.
x        | 0   | 1   | 2
P(X = x) | 0.2 | 0.3 | 0.5
Which is the correct solution? Can you identify the errors made in the incorrect solutions?

A
E(X) = 1 × 0.3 + 2 × 0.5 = 1.3
$E(X^2) = 1 \times 0.3 + 4 \times 0.5 = 2.3$
$\mathrm{Var}(X) = 2.3 - 1.3^2 = 0.61$

B
$E(X) = \frac{0 + 1 + 2}{3} = 1$
$E(X^2) = \frac{0^2 + 1^2 + 2^2}{3} = \frac{5}{3}$
$\mathrm{Var}(X) = \frac{5}{3} - 1^2 = \frac{2}{3}$

C
E(X) = 0 × 0.2 + 1 × 0.3 + 2 × 0.5 = 1.3
$E(X^2) = 0^2 \times 0.2^2 + 1^2 \times 0.3^2 + 2^2 \times 0.5^2 = 1.09$
$\mathrm{Var}(X) = 1.09 - 1.3^2 = -0.6$
EXERCISE 1A
1 Calculate the expectation, mode, median, variance and standard deviation of each of these random variables.
a i
x        | 1   | 2   | 3   | 4
P(X = x) | 0.4 | 0.3 | 0.2 | 0.1
a ii
x        | 8   | 9   | 10  | 11
P(X = x) | 0.4 | 0.3 | 0.2 | 0.1
b i
x        | 10  | 20  | 30  | 40
P(X = x) | 0.4 | 0.3 | 0.2 | 0.1
b ii
x        | 80  | 90  | 100 | 110
P(X = x) | 0.4 | 0.3 | 0.2 | 0.1
c i
w        | 0.1 | 0.2 | 0.3  | 0.4
P(W = w) | 0.4 | 0.1 | 0.25 | 0.25
c ii
v        | 0.1 | 0.2 | 0.3 | 0.4
P(V = v) | 0.5 | 0.3 | 0.1 | 0.1
d i $P(X = x) = \frac{x^2}{14}$, x = 1, 2, 3
d ii $P(X = x) = \frac{1}{x}$, x = 2, 3, 6
2 A discrete random variable X is given by P(X = x) = k(x + 1) for x = 2, 3, 4, 5, 6.
a Show that k = 0.04
b Find E(X).

3 A discrete random variable V has the probability distribution shown and E(V) = 5.1.
v        | 1   | 2   | 5   | 8   | p
P(V = v) | 0.2 | 0.3 | 0.1 | 0.1 | q
a Find the values of p and q.
b Find the median of V.

4 A discrete random variable X has its probability given by P(X = x) = k(x + 3), for x = 0, 1, 2, 3.
a Show that $k = \frac{1}{18}$.
b Find the exact value of E(X).

5 The probability distribution of a discrete random variable X is defined by P(X = x) = kx(4 − x), x = 1, 2, 3.
a Find the value of k.
b Find E(X).
c Find the standard deviation of X.

6 A fair six-sided dice, with sides numbered 1, 1, 2, 2, 2, 5, is thrown. Find the mean and variance of the score.

7 The table shows the probability distribution of a discrete random variable X.
x        | 0   | 1 | 2 | 3
P(X = x) | 0.1 | p | q | 0.2
a Given that E(X) = 1.5, find the values of p and of q.
b Find the standard deviation of X.

8 A biased dice with four faces is used in a game. A player pays 5 counters to roll the dice. The table shows the possible scores on the dice, the probability of each score and the number of counters the player receives in return for each score.
Score                             | 1   | 2   | 3   | 4
Probability                       | 1/2 | 1/4 | 1/5 | 1/20
Number of counters player receives | 4   | 5   | 15  | n
Find the value of n in order for the player to get an expected profit of 3.25 counters per roll.
9 Two fair dice labelled with the values 1 to 6 are thrown. The random variable D is the difference between the larger and the smaller score, or zero if they are the same.
a Copy and complete this table to show the probability distribution of D.
d        | 0   | 1    | 2 | 3 | 4 | 5
P(D = d) | 1/6 | 5/18 |   |   |   |
b Find E(D).
c Find Var(D).
d Find the median of D.
e Find P(D > E(D)).

10 a In a game a player pays an entrance fee of £n. He then selects one number from 1, 2, 3 or 4 and rolls three standard dice. If his chosen number appears on all three dice he wins four times the entrance fee. If his number appears on exactly two of the dice he wins three times the entrance fee. If his number appears on exactly one dice he wins £1. If his number does not appear on any of the dice he wins nothing. Copy and complete the probability table.
Profit (£)  | −n    |   |   |
Probability | 27/64 |   |   |
b The game organiser wants to make a profit over many plays of the game. Given that he must charge a whole number of pence, what is the minimum amount the organiser must charge?

11 Viewers are asked to rate a new film on a three point scale. Their marks are modelled by the random variable S as shown.
s        | 1   | 2 | 3
P(S = s) | 0.3 | a | b
a The mean, median and mode of S are all equal. Find the variance of S.
b Two independent viewers of the film are both asked their opinion.
i What is the probability that their total score is more than 4?
ii Show that the expectation of their total score is 4.

12 The number of books borrowed by each person who visits a library is modelled using the random variable B.
b        | 0   | 1   | 2   | 3   | 4
P(B = b) | 0.2 | 0.3 | 0.3 | 0.1 | 0.1
a Find the mean of B.
b Show that the expectation of B is larger than the median of B.
c Show that the standard deviation of B is less than the median of B.
d 10 people visited the library during an audit period. The number of books they borrowed is independent of each other. Find:
i the probability that exactly three people borrow no books
ii the expected number of people who borrow no books.
Section 2: Expectation and variance of transformations of discrete random variables
Linear transformations
You may have noticed a link between question 1 parts a and b in Exercise 1A. The distributions were very similar but in part b all the x-values were multiplied by 10. All the averages and the standard deviations were also multiplied by 10 but the variances were multiplied by 100. This is an example of a transformation.

The most common type of transformation is a linear transformation. This is where the new variable (Y) is found from the old variable (X) by multiplying by a constant and/or adding on a constant. You might do this if you change the units of measurement. This kind of change is also known as ‘linear coding’.

If you know the original mean and variance and how the data was transformed, you can use a shortcut to find the mean and variance of the new data.

Key point 1.5
If X is a random variable, and Y is a new random variable such that Y = aX + b then
E(Y) = aE(X) + b
$\mathrm{Var}(Y) = a^2 \mathrm{Var}(X)$

Fast forward
You will prove Key point 1.5 after you have developed a little more theory.

This means that the standard deviation of Y, $\sigma_Y$, is $|a| \sigma_X$. This makes sense as multiplying the data by a does change how spread out they are, but adding on b does not change the spread.

WORKED EXAMPLE 1.5
A random variable X has expectation 7 and variance 100. Y is a transformation of X given by Y = 100 − 2X. Find:
a the expectation of Y
b the standard deviation of Y.

a E(Y) = 100 − 2 × E(X) = 100 − 2 × 7 = 86
This is just a direct application of Key point 1.5.

b $\mathrm{Var}(Y) = (-2)^2 \mathrm{Var}(X) = 4 \times 100 = 400$
$\sigma_Y = \sqrt{400} = 20$
To find the standard deviation you need to first find the variance of Y using Key point 1.5.

Common error
It is easy to get confused with the minus sign in the transformations in Worked example 1.5. Remember that variances and standard deviations are never negative.
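Key point 1.5 can be checked numerically by transforming every value of a distribution and recomputing the mean and variance directly. In the sketch below the distribution is an assumed example (the one from Worked example 1.1, since Worked example 1.5 does not give a full distribution), and the function names are hypothetical.

```python
from fractions import Fraction

def expectation(dist):
    return sum(x * p for x, p in dist.items())

def variance(dist):
    mean = expectation(dist)
    return sum(x ** 2 * p for x, p in dist.items()) - mean ** 2

def linear_transform(dist, a, b):
    """Distribution of Y = aX + b (a must be non-zero so values stay distinct)."""
    return {a * x + b: p for x, p in dist.items()}

# Assumed illustrative distribution, taken from Worked example 1.1.
x_dist = {
    1: Fraction(1, 10), 2: Fraction(1, 4), 3: Fraction(1, 10),
    4: Fraction(1, 4), 5: Fraction(1, 5), 6: Fraction(1, 10),
}

a, b = -2, 100  # the transformation Y = 100 - 2X
y_dist = linear_transform(x_dist, a, b)

# Direct calculation compared with the shortcut from Key point 1.5.
print(expectation(y_dist), a * expectation(x_dist) + b)
print(variance(y_dist), a ** 2 * variance(x_dist))
```

Both lines print a matching pair of values. Note that the variance is multiplied by a² = 4, so the negative sign of a has no effect on it.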
Non-linear transformations
You can also apply non-linear transformations to X, such as $X^2$, $\sin X$ or $\frac{1}{2X + 3}$. When you do this there is no shortcut to find the mean and variance of the transformed variable. You need to adapt Key point 1.1.

Consider the random variable X = outcome on a fair dice. If $Y = X^2$, you can construct the probability distribution for Y:
x | 1   | 2   | 3   | 4   | 5   | 6
y | 1   | 4   | 9   | 16  | 25  | 36
P | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6

The probability of Y being 9 is just the same as the probability of X being 3. So
$E(Y) = 1 \times \frac{1}{6} + 4 \times \frac{1}{6} + 9 \times \frac{1}{6} + 16 \times \frac{1}{6} + 25 \times \frac{1}{6} + 36 \times \frac{1}{6}$,
i.e. it is $\sum x_i^2 p_i$.

Key point 1.6
If X is a random variable and g is a function applied to X, then
$E(g(X)) = \sum g(x_i) p_i$
WORKED EXAMPLE 1.6
The random variable X has this distribution:
x        | 30   | 45  | 60
P(X = x) | 0.25 | 0.5 | 0.25
If Y = sin X°, find:
a E(Y)
b Var(Y)

a E(Y) = sin 30° × 0.25 + sin 45° × 0.5 + sin 60° × 0.25 ≈ 0.695
Apply Key point 1.6

b To find Var(Y) you need $E(Y^2)$ which is $E(\sin^2 X)$
$E(Y^2) = \sin^2 30° \times 0.25 + \sin^2 45° \times 0.5 + \sin^2 60° \times 0.25 = 0.5$
$\mathrm{Var}(Y) \approx 0.5 - 0.695^2 = 0.0169$ (3 s.f.)
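Key point 1.6 translates directly into code: apply g to each value before weighting by its probability. The sketch below reproduces Worked example 1.6; the function name `expect` and the helper `sin_deg` are names chosen for this illustration.

```python
import math

def expect(dist, g=lambda x: x):
    """E(g(X)): the sum of g(x) * P(X = x) over a dict {value: probability}."""
    return sum(g(x) * p for x, p in dist.items())

def sin_deg(x):
    """Sine of an angle given in degrees."""
    return math.sin(math.radians(x))

# Distribution from Worked example 1.6 (x is measured in degrees).
dist = {30: 0.25, 45: 0.5, 60: 0.25}

e_y = expect(dist, sin_deg)                    # E(Y)
e_y2 = expect(dist, lambda x: sin_deg(x) ** 2) # E(Y^2)
var_y = e_y2 - e_y ** 2                        # Key point 1.2

print(round(e_y, 3), round(e_y2, 3), round(var_y, 4))
# 0.695 0.5 0.0169
```

Because the code keeps full accuracy for E(Y) before squaring, it follows the advice to round only the final answer.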
You can use Key point 1.6 to prove Key point 1.5.

PROOF 2
Let Y = aX + b. Then:
$E(Y) = \sum (a x_i + b) p_i$
Apply Key point 1.6 to the function g(x) = ax + b.
$= a \sum x_i p_i + b \sum p_i$
You can separate out a sum into its different terms, taking out constant factors.
$= aE(X) + b$
Use the fact that $\sum p_i = 1$ for any probability distribution and the definition of expectation from Key point 1.1.

You have now established the first part of Key point 1.5. Considering $E(Y^2)$ to get to the variance:
$E(Y^2) = \sum (a x_i + b)^2 p_i$
Apply Key point 1.6 to the function $g(x) = (ax + b)^2$ and expand the brackets.
$= \sum (a^2 x_i^2 + 2ab x_i + b^2) p_i$
$= a^2 \sum x_i^2 p_i + 2ab \sum x_i p_i + b^2 \sum p_i$
You can separate out a sum into its different terms, taking out constant factors.
$= a^2 E(X^2) + 2ab E(X) + b^2$
Use the fact that $\sum p_i = 1$ for any probability distribution, and the definitions of E(X) and $E(X^2)$.

Using the definition of variance from Key point 1.2:
$\mathrm{Var}(Y) = E(Y^2) - [E(Y)]^2$
$= a^2 E(X^2) + 2ab E(X) + b^2 - (aE(X) + b)^2$
$= a^2 E(X^2) + 2ab E(X) + b^2 - a^2 [E(X)]^2 - 2ab E(X) - b^2$
Expand the brackets and lots of terms cancel!
$= a^2 E(X^2) - a^2 [E(X)]^2$
$= a^2 (E(X^2) - [E(X)]^2)$
Taking out a factor of $a^2$ leaves the expression for Var(X) from Key point 1.2.
$= a^2 \mathrm{Var}(X)$
This completes the proof.
EXERCISE 1B
1
E(X ) = 9 and Var( X ) = 25. Find E(Y ) and Var(Y ) if: a i Y = 3 X
ii Y = 4 X
b i Y = X − 1
ii Y = X − 2
c i Y = 2 X − 1
ii Y = 3 X − 5
d i Y = 10 − 3 X e i Y = X − 1 4
ii Y = 8 − 2 X ii Y = X + 5 10
2
X follows this distribution: x
1
2
3
P( X = x )
0.5
0.4
0.1
Find E(Y ) and Var(Y ) if: ii Y = X 3
b i Y = 1 X c i Y = X
ii Y = 12 X ii Y = X + 1
d i Y = e X
ii Y = ln X
3
a i Y = X 2
Stephen goes on a 30 mile bike ride every weekend. The distance until he stops for a picnic is modelled by X , where E( X ) = 20 and Var( X ) = 16. Y is the distance remaining after his picnic. Find E(Y ) and Var(Y ).
4
The rule for converting between degrees Celsius (C ) and degrees Fahrenheit ( F ) is:
F = 1.8C + 32.
When a bread oven is operating it has expected temperature 200 °C with standard deviation 5 °C. Find the expected temperature and standard deviation in degrees Fahrenheit. 5
The random variable X has expectation 10 and variance 25. If Y = aX + b, find the values of a and b so that the expectation of Y is zero and the standard deviation is 1.
6
X is a discrete random variable where E( X ) = 10 and E(X 2 ) = 200. Y is a transformation of X such that Y = X + 2. Find E(Y ) and the standard deviation of Y .
7
a X is a discrete random variable satisfying P( X = x ) = kx for x = 1, 2, 3, 4. Find the value of k.
b Find E(X).
c Find Var(X).
d Find $E\left(\frac{1}{X}\right)$.
e Find $\mathrm{Var}\left(\frac{1}{X}\right)$.

8 The discrete random variable X has a distribution given by $P(X = x) = \frac{k}{x + 1}$ for x = 1, 2, 3, ..., n.
a Find, in terms of n and k, E(X + 1).
b Hence find, in terms of n and k, E(X).
9 A discrete random variable X has equal expectation and standard deviation. Y is a transformation of X such that Y = aX − b. Prove that it is only possible for the expectation of Y to equal the variance of Y if $b < \frac{1}{4}$.

10 The St Petersburg Paradox describes a game where a fair coin is tossed repeatedly until a head is found. You win $2^n$ pounds if the first head occurs on the nth toss. How much should you pay to play this game?
Section 3: The discrete uniform distribution
You have already met some special distributions which occur so often that they are named; for example the binomial and the normal distributions. Another very common distribution is the discrete uniform distribution.

This is a distribution where all the whole numbers from 1 to n are equally likely and it is given the symbol U(n). For example, U(6) gives the distribution of the outcomes on a fair six-sided dice.

Key point 1.7
If a random variable X follows a uniform distribution X ~ U(n) then $P(X = x) = \frac{1}{n}$ for x = 1, 2, ..., n.

If you identify a random variable as following a uniform distribution you can immediately write down the expectation and variance.

Key point 1.8
If a random variable X follows a uniform distribution X ~ U(n) then $E(X) = \frac{n + 1}{2}$ and $\mathrm{Var}(X) = \frac{n^2 - 1}{12}$.
You can prove the result in Key point 1.8 using your knowledge of sums of powers of integers. PROOF 3
Rewind You met the rules for working with indices in A Level Mathematics Student Book 1, Chapter 2.
If X ~ U(n) then $E(X) = \frac{n + 1}{2}$ and $\mathrm{Var}(X) = \frac{n^2 - 1}{12}$.

$E(X) = \sum_{r=1}^{n} r \times \frac{1}{n}$
r denotes the possible values of X, which are 1, 2, … , n.
$= \frac{1}{n} \sum_{r=1}^{n} r$
$\frac{1}{n}$ is a constant, you can take it out of the sum.
$= \frac{1}{n} \times \frac{n(n + 1)}{2}$
Use the result for the sum of the first n positive integers: $\sum_{r=1}^{n} r = \frac{n(n + 1)}{2}$
$= \frac{n + 1}{2}$

$E(X^2) = \sum_{r=1}^{n} r^2 \times \frac{1}{n}$
All the values of X need to be squared.
$= \frac{1}{n} \sum_{r=1}^{n} r^2$
$= \frac{1}{n} \times \frac{n(n + 1)(2n + 1)}{6}$
Use the result for $\sum_{r=1}^{n} r^2$.
$= \frac{(n + 1)(2n + 1)}{6}$

$\mathrm{Var}(X) = E(X^2) - [E(X)]^2$
Use the formula for variance.
$= \frac{(n + 1)(2n + 1)}{6} - \left(\frac{n + 1}{2}\right)^2$
$= \frac{n + 1}{2} \left(\frac{2n + 1}{3} - \frac{n + 1}{2}\right)$
$= \frac{n + 1}{2} \left(\frac{4n + 2}{6} - \frac{3n + 3}{6}\right)$
$= \frac{n + 1}{2} \times \frac{n - 1}{6}$
$= \frac{n^2 - 1}{12}$
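A proof covers every n, but it can still be reassuring to check the formulae in Key point 1.8 by brute force for a range of values. The sketch below uses exact fractions; the function name `uniform_moments` and the range of n tested are choices made for this illustration.

```python
from fractions import Fraction

def uniform_moments(n):
    """Return (E(X), Var(X)) for X ~ U(n), computed directly from the definition."""
    p = Fraction(1, n)  # each of 1, 2, ..., n has probability 1/n
    mean = sum(r * p for r in range(1, n + 1))
    mean_of_squares = sum(r * r * p for r in range(1, n + 1))
    return mean, mean_of_squares - mean ** 2

# Compare the direct calculation with the formulae for n = 1 to 50.
for n in range(1, 51):
    mean, var = uniform_moments(n)
    assert mean == Fraction(n + 1, 2)
    assert var == Fraction(n * n - 1, 12)

print(uniform_moments(6))  # (Fraction(7, 2), Fraction(35, 12))
```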
In Section 2 you saw how to find the expectation and variance of a linear transformation of a discrete random variable. You can find the expectation and variance of a linear transformation of a discrete uniform distribution in the same way.

WORKED EXAMPLE 1.7
The discrete random variable Y is equally likely to take any even value between 10 and 20 inclusive. Find the variance of Y.

Y = 2X + 8 where X ~ U(6)
The values of Y are 10, 12, 14, …, 20. These can be written as Y = 2X + 8, where X = 1, 2, …, 6. So Y is a linear transformation of X ~ U(6).
$\mathrm{Var}(X) = \frac{35}{12}$
Apply Key point 1.8.
$\mathrm{Var}(Y) = 2^2 \mathrm{Var}(X) = \frac{35}{3}$
Apply Key point 1.5.
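The answer to Worked example 1.7 can be confirmed without any transformation, by working directly with the six equally likely values of Y. This is a minimal sketch; the variable names are choices made for this illustration.

```python
from fractions import Fraction

# Y is equally likely to be any even value from 10 to 20 inclusive.
y_values = list(range(10, 21, 2))  # [10, 12, 14, 16, 18, 20]
p = Fraction(1, len(y_values))

mean_y = sum(y * p for y in y_values)
var_y = sum(y ** 2 * p for y in y_values) - mean_y ** 2

# Shortcut from the worked example: Y = 2X + 8 with X ~ U(6).
var_x = Fraction(6 ** 2 - 1, 12)  # Key point 1.8
shortcut_var_y = 2 ** 2 * var_x   # Key point 1.5

print(mean_y, var_y, shortcut_var_y)  # 15 35/3 35/3
```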
EXERCISE 1C
1
Find the mean and variance of these distributions. a i U(5)
ii U(8)
b i U(2 x )
ii U( x − 1)
2
A fair spinner has sides labelled 2, 4, 6, 8,10. Find the expected mean and standard deviation of the results of the spinner.
3
A fair dice has sides labelled 0,1, 2, 3, 4, 5. Find the expectation and standard deviation of the outcome of the dice.
4
a The random variable Y is equally likely to take any integer value between −n and n. Show that this can be written as aX + b where X ~ U(2n + 1).
b Hence find the variance of Y.
5
A string of 100 Christmas lights starts with a plug then contains a light every 4 cm from the plug. One light is broken. Assuming all bulbs are equally likely to break, what is the expected mean and variance of the distance of the broken light from the plug?
6
The random variable X is equally likely to take the value of any odd number between 1 and 99 inclusive. Find the variance of X .
7
The discrete random variable Y takes values m , m + 1, m + 2, ..., m + n. Find the expectation and variance of Y .
8
X ~ U (n) and Var(X ) = E(X ) + 4 . Find n.
9 A random number, X, is chosen from the fractions $\frac{1}{n}, \frac{2}{n}, \frac{3}{n}, \ldots, 1$. Prove that $E(X) > \frac{1}{2}$ but $\mathrm{Var}(X) < \frac{1}{12}$.

10 X ~ U(n). Prove that 6Var(X) is always divisible by E(X).

Checklist of learning and understanding
• The expectation of a random variable X is written E(X) and calculated as $E(X) = \sum x_i p_i$
• The variance of a random variable X is written Var(X) and calculated as $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$ where $E(X^2) = \sum x_i^2 p_i$
• The mode of X is the value of X associated with the largest probability.
• The median, M, is any value which has P(X ≤ M) ≥ 0.5 and P(X ≥ M) ≥ 0.5. If there are two possible values, you have to find their mean.
• If Y = aX + b then E(Y) = aE(X) + b and $\mathrm{Var}(Y) = a^2 \mathrm{Var}(X)$
• $E(g(X)) = \sum g(x_i) p_i$
• A uniform distribution models situations where all discrete outcomes are equally likely.
• If X ~ U(n) then $P(X = x) = \frac{1}{n}$ for x = 1, 2, ..., n and $E(X) = \frac{n + 1}{2}$ and $\mathrm{Var}(X) = \frac{n^2 - 1}{12}$.
Mixed practice 1
1 A discrete random variable has E(X) = 3 and $E(X^2) = 12$. Find Var(2X + 1). Choose from these options.
A 6
B 7
C 12
D 13

2 A discrete random variable X has a distribution defined by $P(X = x) = \frac{k}{x}$ for x = 1, 2, 3. Find E(X). Choose from these options.
A $\frac{6}{11}$
B 1
C $\frac{18}{11}$
D 2

3 A drawer contains three white socks and five black socks. Two socks are drawn without replacement. B is the number of black socks drawn.
a Find the probability distribution of B.
b Find E(B).

4 A fair six-sided dice is thrown once. The random variable X is calculated as half the result if the dice shows an even number or one higher than the result if the dice shows an odd number.
a Write down a table representing the probability distribution of X.
b Find E(X).
c Find Var(X).
d Find the mode of X.
e Find the median of X.

5 a X ~ U(13). Find the expectation and variance of X.
b Y is the discrete random variable which is equally likely to take any integer value between 14 and 26. Find E(Y) and Var(Y).
c Z is the discrete random variable which is equally likely to take any even value between 2 and 26. Find E(Z) and Var(Z).

6 The random variable X follows this distribution:
x        | 1 | 2 | 3
P(X = x) | a | b | 0.6
a Write down the median of X.
b If E(X) = 2.5, find the value of a and b.
c Hence find $E(X^2)$ and show that Var(X) = 0.45.
7
X is a discrete random variable with E( X ) = 10 and Var( X ) = 16. Y = 12 − X . Find E(Y ) and the standard deviation of Y .
8
The random variable X has expectation 12 and variance 100. If Y = aX + b, find the values of a and b so that the expectation of Y is 10 and the standard deviation is 20.
9
X is a discrete random variable which can take the values 1 or 2. a If E( X ) = 1.2, find the standard deviation of X . b Y = 3 X + 4. Find E(Y ) and Var(Y ).
© Cambridge University Press 2017 The third party copyright material that appears in this sample may still be pending clearance and may be subject to change.
15
10 A fair dice is thrown until a 6 has been thrown or three throws have been made. T is the discrete random variable 'number of throws taken'.
a Write down, in tabular form, the distribution of T.
b Find E(T).
c Find the median of T.
d The number of points awarded in the game, P, is given by 4 − T. Find the variance of P.
11 a A four-sided dice labelled with the values 1 to 4 is rolled twice. Write down, in a table, the probability distribution of S, the sum of the two rolls.
b Find E(S) and Var(S).
c A four-sided dice is rolled once and the score, X, is twice the result. Find the mean and variance of X.
12 The discrete random variable X follows the U(9) distribution. µ is the expectation of X and σ² is the variance of X. Find P(µ − σ < X < µ + σ).
13 X is a discrete random variable satisfying P(X = x) = kx for x = 1, 2, 3, ..., n. Find, in terms of n:
a k
b E(X)
c Var(X)
d E(1/X)
14 A box contains a large number of pea pods. The number of peas in a pod can be modelled by the random variable X. The probability distribution of X is shown here:
x          2 or fewer   3     4     5     6     7     8 or more
P(X = x)   0            0.1   0.2   a     0.3   b     0
a Two pods are picked randomly from the box. Find the probability that the number of peas in each pod is at most 4.
b It is given that E( X ) = 5.1.
i Determine the values of a and b.
ii Hence show that Var( X ) = 1.29.
iii Some children play a game with the pods, randomly picking a pod and scoring points depending on the number of peas in the pod. For each pod picked, the number of points scored, N, is found by doubling the number of peas in the pod and then subtracting 5. Find the mean and the standard deviation of N. [© AQA 2014]
15 A random variable X has E(Xⁿ) = n. Find Var(Xⁿ) in terms of n.
16 In a card game a pack of 52 standard playing cards is used. The cards are dealt one at a time until the Queen of Spades (a unique card in the pack) is revealed.
a What is the expected mean and standard deviation of the number of cards until the Queen of Spades is revealed?
b In the game the player scores n² points if the Queen of Spades is the nth card revealed. Find the expected number of points scored.
17 In a computer game, players try to collect five treasures. The number of treasures that Isaac collects in one play of the game is represented by the discrete random variable X. The probability distribution of X is defined by
\[
P(X = x) = \begin{cases} \dfrac{1}{x + 2} & x = 1, 2, 3, 4 \\ k & x = 5 \\ 0 & \text{otherwise} \end{cases}
\]
a i Show that k = 1/20.
ii Calculate the value of E(X).
iii Show that Var(X) = 1.5275.
iv Find the probability that Isaac collects more than 2 treasures.
b The number of points that Isaac scores for collecting treasures is Y where Y = 100X − 50. Calculate the mean and the standard deviation of Y.
[© AQA 2014]
2 Poisson distribution
In this chapter you will learn how to:
• use the conditions required for a Poisson distribution to model a situation
• use the Poisson formula and how to calculate Poisson probabilities
• calculate the mean, variance and standard deviation of a Poisson variable
• use the distribution of the sum of independent Poisson distributions
• carry out a hypothesis test of a population mean from a single observation from a Poisson distribution.
Before you start…
A Level Mathematics Student Book 1, Chapter 21
You should know how to work with the binomial distribution.
1 If X ~ B(4, 0.25), find P(X = 2).
A Level Mathematics Student Book 2, Chapter 20
You should know how to work with conditional probability.
2 If P(A ∩ B) = 0.4 and P(B) = 0.6, find P(A | B).
Chapter 1
You should know how to find the expectation and variance of discrete random variables.
3 Find E(X) and Var(X) for this distribution:
x          2     4
P(X = x)   0.4   0.6
A Level Mathematics Student Book 1, Chapter 22
You should know how to carry out hypothesis tests on the binomial distribution.
4 A coin is tossed 12 times and 9 tails are observed. Use a two-tailed test to determine at the 5% significance level if this coin is biased.
What is the Poisson distribution?
When you are waiting for a bus there are two possible outcomes – at any given moment the bus either arrives or it doesn’t. You can try modelling this situation using a binomial distribution, but it is not clear what an individual trial is. Instead you have an average rate of success – the number of buses that arrive in a fixed time period. There are many situations in which you know the average rate of events within a given space or time, in contexts ranging from commercial, such as the number of calls through a telephone exchange per minute, to biological, such as the number of clover plants seen per square metre in a pasture. If the events can be considered independent of each other (so that the probability of each event is not affected by what has already been seen), the number of events in a fixed space or time interval can be modelled using the Poisson distribution.
Section 1: Using the Poisson model
The Poisson distribution is commonly used when these conditions hold:
• The events occur singly (one at a time).
• The events are independent of each other.
• The average rate of events (conventionally called lambda, λ) is constant.
If these conditions are satisfied then the discrete random variable 'number of events', X, follows the Poisson distribution with mean λ. You write this as X ∼ Po(λ).
Tip
If a question mentions average rate of success, or events occurring at a constant rate, you should use the Poisson distribution. If you can identify a fixed number of trials, you should use the binomial distribution.
The Poisson distribution can also be a useful approximate model for discrete random variables in other situations. However, if the stated conditions are not met this can only be established by looking empirically at data.
Once you have identified that a situation follows a Poisson distribution, you can use facts about the probability of a certain number of events, the expected number of events and the expected variance.
Key point 2.1
If a random variable X follows a Poisson distribution X ~ Po(λ) then:
\[
P(X = x) = \frac{e^{-\lambda}\lambda^{x}}{x!} \quad \text{for } x = 0, 1, 2, \dots
\]
E(X) = λ
Var(X) = λ
Common error Remember that 0! = 1, not 0.
These formulae will be given in your formula book.
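The formulae in Key point 2.1 can be checked numerically. The following is a minimal Python sketch rather than anything from the formula book: the function name `poisson_pmf` and the cut-off of 100 terms are illustrative assumptions. It evaluates P(X = x) directly from the formula, then confirms that the probabilities sum to 1 and that the mean and variance both come out as λ for the Po(1.2) distribution.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam), straight from the formula in Key point 2.1."""
    return exp(-lam) * lam**x / factorial(x)

lam = 1.2
# Truncate the infinite sum: terms beyond x = 100 are negligibly small for this lam.
xs = range(101)
total = sum(poisson_pmf(x, lam) for x in xs)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
variance = sum(x**2 * poisson_pmf(x, lam) for x in xs) - mean**2

print(round(total, 6), round(mean, 6), round(variance, 6))
```

Changing `lam` to any other positive value, whole number or not, should give the same agreement between the mean and the variance.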
Notice that the values of the mean and variance are equal for the Poisson distribution. This is something you can look out for when determining if data is likely to fit a Poisson model, although in itself it is not sufficient to decide, since there are other distributions with this feature. A typical Poisson distribution, the Po(1.2) distribution, is shown here:
[Figure: bar chart of the Po(1.2) distribution, with p on the vertical axis (0 to 0.4) and x on the horizontal axis (0 to 6)]
Notice that:
• the mean rate does not have to be a whole number
• the distribution is not symmetric
• the graph, in theory, should continue on to infinite values of X, but the probabilities of very large values of X get very small.
WORKED EXAMPLE 2.1
Recordable accidents occur in a factory at an average rate of 7 every year, independently of each other. Find the probability that in a given year exactly 3 recordable accidents occurred.
Let X be the number of accidents in a year:
Define the random variable.
X ~ Po(7)
Give the probability distribution.
\[
P(X = 3) = \frac{e^{-7}\,7^{3}}{3!} = 0.0521 \text{ (3 s.f.)}
\]
Write down the probability required, and calculate the answer.
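As a check on the arithmetic in Worked example 2.1, the probability can be evaluated term by term. The intermediate values below are rounded for display only.

```latex
\[
P(X = 3) = \frac{e^{-7}\,7^{3}}{3!}
         \approx \frac{0.000\,911\,9 \times 343}{6}
         \approx \frac{0.312\,8}{6}
         \approx 0.0521
\]
```

A probability of about 5% is plausible here: with a mean of 7 accidents a year, seeing as few as 3 is unusual but far from impossible.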
The Poisson distribution is scalable. For example, if the number of butterflies seen on a flower in 10 minutes follows a Poisson distribution with mean λ, then the number of butterflies seen on a flower in 20 minutes follows a Poisson distribution with mean 2λ, the number of butterflies seen on a flower in 5 minutes follows a Poisson distribution with mean λ/2, and so on.
Tip
Learn how to use your calculator to find Poisson probabilities, P(X = x), and cumulative probabilities, P(X ≤ x).
WORKED EXAMPLE 2.2
If there are, on average, 12 buses per hour arriving at a bus stop, find the probability that there are more than 6 buses in 30 minutes.
Let X be the number of buses in 30 minutes:
Define the random variable.
X ~ Po(6)
Give the probability distribution.
P(X > 6) = 1 − P(X ≤ 6)
Write down the probability required. To use your calculator you must relate this probability to P(X ≤ k).
= 0.394 (3 s.f.) from your calculator.
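Where a calculator is not to hand, the same cumulative probability can be built up from the Poisson formula. The sketch below is illustrative only: the helper names `poisson_pmf` and `poisson_cdf` are assumptions for this example, not functions from any particular calculator or library. It rescales the hourly rate to a 30-minute interval and then uses P(X > 6) = 1 − P(X ≤ 6).

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam), summing the individual probabilities."""
    return sum(poisson_pmf(x, lam) for x in range(k + 1))

rate_per_hour = 12
interval_hours = 0.5                      # 30 minutes
lam = rate_per_hour * interval_hours      # scaled mean: 6 buses per 30 minutes

p_more_than_6 = 1 - poisson_cdf(6, lam)
print(f"{p_more_than_6:.3f}")  # prints 0.394
```

For comparison, the single probability P(X = 6) for the same distribution is about 0.161, so it is worth checking that a calculator is set to the cumulative function rather than the individual one.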
The scalability of the Poisson distribution is a consequence of a more general result. If two independent variables both follow a Poisson distribution then so does their sum.
Key point 2.2
If X and Y are independent random variables following Poisson distributions such that X ~ Po(λ) and Y ~ Po(µ), and Z = X + Y, then Z ~ Po(λ + µ).
Although you do not need to know the proof of the result in Key point 2.2, it does show an interesting link with the binomial expansion.
PROOF 4
Consider all the different ways in which Z can take the value z. If X = 0 then Y = z. If X = 1 then Y = z − 1, etc.
\[
P(Z = z) = P(X = 0)\,P(Y = z) + P(X = 1)\,P(Y = z - 1) + \dots + P(X = z)\,P(Y = 0)
\]
Rewrite in sigma notation to keep the expression shorter.
\[
= \sum_{r=0}^{z} P(X = r)\,P(Y = z - r)
\]
Use the formula for the Poisson distribution.
\[
= \sum_{r=0}^{z} \frac{\lambda^{r} e^{-\lambda}}{r!} \times \frac{\mu^{z-r} e^{-\mu}}{(z - r)!}
\]
You can take out factors of e^{−λ} and e^{−µ} from the sum since they are constants.
\[
= e^{-\lambda} e^{-\mu} \sum_{r=0}^{z} \frac{\lambda^{r}}{r!} \times \frac{\mu^{z-r}}{(z - r)!}
\]
You are close to having a binomial coefficient. Multiply by z! in the sum to get to this, but then you have to divide by z! too.
\[
= \frac{e^{-(\lambda + \mu)}}{z!} \sum_{r=0}^{z} \frac{z!}{r!\,(z - r)!}\,\lambda^{r} \mu^{z-r}
\]
Replace the factorials with a binomial coefficient.
\[
= \frac{e^{-(\lambda + \mu)}}{z!} \sum_{r=0}^{z} \binom{z}{r} \lambda^{r} \mu^{z-r}
\]
You can recognise the sum as a binomial expansion.
\[
= \frac{e^{-(\lambda + \mu)} (\lambda + \mu)^{z}}{z!}
\]
This is a Poisson distribution with mean λ + µ.
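The result of the proof can also be checked numerically. The sketch below is a sanity check, not a replacement for the algebra: the function names and the rates 4.2 and 3.1 are illustrative choices. It computes P(Z = z) by summing P(X = r) P(Y = z − r) over r, exactly as in the second line of the proof, and compares the answer with the Po(λ + µ) formula.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

def sum_pmf(z, lam, mu):
    """P(X + Y = z) for independent X ~ Po(lam), Y ~ Po(mu), by summing over r."""
    return sum(poisson_pmf(r, lam) * poisson_pmf(z - r, mu) for r in range(z + 1))

lam, mu = 4.2, 3.1   # illustrative rates, chosen only for this check
for z in range(6):
    direct = sum_pmf(z, lam, mu)           # sum over all ways of making z
    combined = poisson_pmf(z, lam + mu)    # single Poisson with mean lam + mu
    print(z, round(direct, 6), round(combined, 6))
```

The two columns should agree for every z. The sum inside `sum_pmf` multiplies the two probabilities together, which is only valid when X and Y are independent.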
WORKED EXAMPLE 2.3
Hywel receives an average of 4.2 emails and 3.1 texts each hour. These are the only types of messages he receives.
a Assuming that the emails and texts each form an independent Poisson distribution, find the probability that he receives more than 4 messages in an hour.
b Explain why the assumption that the emails and texts form independent Poisson distributions is unlikely to be true.
a Z = 'Number of messages per hour'
Z ∼ Po(7.3)
You can use Key point 2.2 to combine the two Poisson distributions.
P(Z > 4) = 1 − P(Z ≤ 4) = 0.853 (3 s.f.)
You need to write the required probability in terms of a cumulative probability to use the calculator function.
b The rate of arrival of messages is unlikely to be constant – there will probably be more at some times of the day than at others. Within each distribution messages are not likely to be independent as they may occur as part of a conversation. The two distributions are also probably not independent of each other, as times when more emails arrive might be similar to times when more texts might arrive.
Common error
Sometimes people think that the mean rate in a Poisson distribution has to be a whole number. This is not the case.
WORK IT OUT 2.1
The number of errors in a computer code is believed to follow a Poisson distribution with a mean of 2.1 errors per 100 lines of code. Find the probability that there are more than 2 errors in 200 lines of code. Which is the correct solution? Can you identify the errors made in the incorrect solutions?
A If X is the number of errors in 200 lines then X ~ Po(4.2).
P(X > 2) = 1 − P(X ≤ 2) = 1 − (P(X = 0) + P(X = 1) + P(X = 2)) ≈ 0.790
B If X is the number of errors in 100 lines then X ~ Po(2.1). More than 2 errors in 200 lines is equivalent to more than 1 error in 100 lines, so you need P(X > 1) = 1 − P(X ≤ 1) = 1 − (P(X = 0) + P(X = 1)) = 0.620
C X ~ Po(4.2).
P(X > 2) = 1 − P(X < 2) = 1 − P(X = 1) + P(X = 0) ≈ 0.952
EXERCISE 2A
1
State the distribution of the variable in each of these situations.
a Cars pass under a motorway bridge at an average rate of 6 per 10 second period.
i The number of cars passing under the bridge in one minute.
ii The number of cars passing under the bridge in 15 seconds.
b Leaks occur in water pipes at an average rate of 12 per kilometre.
i The number of leaks in 200 m.
ii The number of leaks in 10 km.
c 12 worms are found on average in a 1 m² area of a garden.
i The number of worms found in a 0.3 m² area.
ii The number of worms found in a 2 m by 2 m area of garden.
2
Calculate these probabilities.
a If X ∼ Po(2):
i P(X = 3)
ii P(X = 1).
b If Y ∼ Po(1.4):
i P(Y ≤ 3)
ii P(Y ≤ 1).
c If Z ∼ Po(7.9):
i P(Z < 6)
ii P(Z < 10).
d If X ∼ Po(5.9):
i P(X ≥ 3)
ii P(X > 1).
e If X ∼ Po(11.4):
i P(8 < X < 11)
ii P(8 ≤ X ≤ 12).
3
A random variable X follows a Poisson distribution with mean 1.7. Copy and complete this table of probabilities, giving results to 3 significant figures:
x          1     2     3     4     >4
P(X = x)
4
From a particular observatory, shooting stars are observed in the night sky at an average rate of one every five minutes. Assuming that this rate is constant and that shooting stars occur (and are observed) independently of each other, what is the probability that more than 20 are seen over a period of one hour?
5
When examining blood from a healthy individual under a microscope, a haematologist knows he should see on average four white blood cells in each high power field. Find the probability that blood from a healthy individual will show: a seven white blood cells in a single high power field b a total of 28 white blood cells in six high power fields, selected independently.
6
A wire manufacturer is looking for flaws. Experience suggests that there are on average 1.8 flaws per metre in the wire. a Determine the probability that there is exactly one flaw in one metre of the wire. b Determine the probability that there is at least one flaw in 2 metres of the wire.
7
The random variable X has a Poisson distribution with mean 5. Calculate:
a P(X ≤ 5)
b P(3 < X ≤ 5)
c P(X ≠ 4)
d P(3 < X ≤ 5 | X ≤ 5).
8
The number of eagles observed in a forest in one day follows a Poisson distribution with mean 1.4.
a Find the probability that more than three eagles will be observed on a given day.
b Given that at least one eagle is observed on a particular day, find the probability that exactly two eagles are seen that day.
9
The random variable X follows a Poisson distribution. Given that P(X ≥ 1) = 0.6, find:
a the mean of the distribution
b P(X > 2).
10 Let X be a random variable with a Poisson distribution, such that P( X > 2) = 0.3. Use technology to estimate P( X < 2), giving your answer to three significant figures.
11 The number of emails Sarah receives per day follows a Poisson distribution with mean 6. Let D be the number of emails received in one day and W the number of emails received in a seven-day week. a Calculate P( D = 6) and P(W = 42).
b Find the probability that Sarah receives 6 emails every day in a seven-day week. c Explain why this is not the same as P(W = 42).
12 The number of mistakes a teacher makes while marking homework has a Poisson distribution with a mean of 1.6 errors per piece of homework. a Find the probability that there are at least two marking errors in a randomly chosen piece of homework.
b Find the most likely number of marking errors occurring in a piece of homework. Justify your answer. c Find the probability that in a class of 12 students fewer than half of them have errors in their marking.
13 A car company has two limousines that it hires out by the day. The number of requests per day has a Poisson distribution with mean 1.3 requests per day. a Find the probability that neither limousine is hired on any given day. b Find the probability that some requests have to be denied on any given day. c If each limousine is to be used equally, on how many days in a period of 365 days would you expect a particular limousine to be in use?
14 The random variable X follows a Poisson distribution with mean λ. If P(X = 2) = P(X = 0) + P(X = 1), find the exact value of λ.
15 The random variable Y follows a Poisson distribution with mean λ.
a Show that
\[
P(Y = y + 2) = \frac{\lambda^{2}}{(y + 1)(y + 2)}\,P(Y = y).
\]
b Given that λ = 6√2, find the value of y such that P(Y = y + 2) = P(Y = y).
Section 2: Using the Poisson distribution in hypothesis tests
If it is known that a variable follows a Poisson distribution you can use data to make inferences about the value of the mean. To do this you use a hypothesis test. Work out the p-value – the probability of getting the observed result or more extreme assuming that the null hypothesis is true. You can then compare this to the significance level to determine whether or not to reject the null hypothesis.
For a one-tailed test, compare the calculated probability to the significance level directly. For a two-tailed test you usually find the probability of one tail and compare it to half of the significance level.
WORKED EXAMPLE 2.4
The number of telephone calls received by a company follows a Poisson distribution. Over long experience it is thought that the mean is 8 calls per hour. After a redesign of their website it is found that they got 14 calls in an hour. Test at the 5% significance level if this provides significant evidence of a change in the mean number of calls per hour.
H0 : λ = 8
H1 : λ ≠ 8
It is a two-tailed test because you are looking for a change in either direction.
If X ∼ Po(8)
P(X ≥ 14) = 1 − P(X ≤ 13) = 0.0342
Calculate the probabilities assuming that H0 is true.
This is more than 2.5% so do not reject H0.
Compare the upper tail to half of the significance value, since this is a two-tailed test. If you want the p-value, double the probability to get a p-value of 6.84%.
There is insufficient evidence to suggest that the mean number of calls has changed from 8 per hour.
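The calculation in Worked example 2.4 can be reproduced with a short script. This is a minimal sketch under stated assumptions: the names `poisson_cdf`, `lam_0`, `observed` and `alpha` are illustrative, and only the upper tail is computed because the observation lies above the mean under H0.

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam), summing the individual probabilities."""
    return sum(exp(-lam) * lam**x / factorial(x) for x in range(k + 1))

lam_0 = 8          # mean under H0
observed = 14      # calls seen in one hour
alpha = 0.05       # significance level

upper_tail = 1 - poisson_cdf(observed - 1, lam_0)   # P(X >= 14) = 1 - P(X <= 13)
reject = upper_tail < alpha / 2                     # two-tailed: compare with 2.5%

print(f"P(X >= {observed}) = {upper_tail:.4f}, reject H0: {reject}")
```

For a one-tailed test the comparison would be with `alpha` itself rather than `alpha / 2`, which is why the choice of alternative hypothesis has to be made before looking at the data.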
WORK IT OUT 2.2
X is the random variable 'number of absences per day in a school'. It is thought to follow a Poisson distribution with mean 12. Following a change in the registration system the number of absences over five days was 40. Test at the 5% significance level if the change in the registration system has affected the average rate of absences. Which is the correct solution? Can you identify the errors made in the incorrect solutions?
A
H0 : µ = 12, H1 : µ ≠ 12
Under H0, X ∼ Po(12). If there are 40 absences over five days this is a rate of eight per day, so you need P(X ≤ 8) = 0.155. This is more than 5% so you cannot reject H0. The average rate is 12 absences per day.
B
H0 : λ = 12, H1 : λ ≠ 12
Let Y be the number of absences in five days. Under H0, Y ∼ Po(60). P(Y ≤ 40) = 0.003 98. Since this is a two-tailed test you must double this to get a p-value of 0.797%. This is less than the significance level so you can reject H0. There is evidence at the 5% significance level that the average rate has changed from 12 absences per day.
C
H0 : λ = 12, H1 : λ < 12
Y ∼ Po(60). P(Y ≤ 40) = 0.003 98 so reject H0.
EXERCISE 2B
1
Conduct these hypothesis tests based on the given observation. You may assume that the data follows a Poisson distribution. Use the 5% significance level.
a i H0 : λ = 4, H1 : λ ≠ 4, x = 8
ii H0 : λ = 10, H1 : λ ≠ 10, x = 16
b i H0 : λ = 10.2, H1 : λ ≠ 10.2, x = 2
ii H0 : λ = 8.3, H1 : λ ≠ 8.3, x = 1
c i H0 : λ = 6.9, H1 : λ > 6.9, x = 11
ii H0 : λ = 4.6, H1 : λ < 4.6, x = 2
d i H0 : λ = 5.1, H1 : λ < 5.1, x = 1
ii H0 : λ = 4.8, H1 : λ > 4.8, x = 10
2
Find the critical region (the set of values for which the null hypothesis is rejected) at the 5% significance level if:
a i H0 : λ = 5, H1 : λ > 5
ii H0 : λ = 5, H1 : λ < 5
b i H0 : λ = 6.2, H1 : λ > 6.2
ii H0 : λ = 6.2, H1 : λ < 6.2
c i H0 : λ = 8.7, H1 : λ ≠ 8.7
ii H0 : λ = 11.4, H1 : λ ≠ 11.4
3
It is known that a sample of radium emits 7.5 alpha particles per millisecond. A second sample of the same size and shape emits 2 alpha particles in a millisecond. Test at the 5% significance level if this sample has the same emission rate as radium.
4
a Over a long period it is believed that the number of cars travelling past a traffic light follows a Poisson distribution with mean 6.3 cars per minute. After some roadworks it is thought that the number of cars passing is lower. In a one minute observation only 3 cars pass the traffic light. Find the p-value of this observation and hence decide at the 10% significance level if the roadworks have caused a decrease in traffic levels.
b Suggest two reasons why a Poisson distribution may not be appropriate.
5
a The number of accidents per month, X, on a road is studied. The mean number of accidents per month is 10.3 with standard deviation 3.1. Explain why this supports the suggestion that the number of accidents follows a Poisson distribution.
b Assume that X does indeed follow a Poisson distribution. It is thought that adding a speed camera will reduce the average number of accidents from 10.3. In the month after the camera was added there were 4 accidents. Test at the 1% significance level if this is evidence of a reduction in the average number of accidents.
6
The number of mistakes in nine pieces of a student's homework is shown:
4, 6, 7, 9, 9, 10, 11, 12, 13
a Estimate the mean and standard deviation of the number of mistakes, based upon this data.
b Hence explain why the Poisson distribution is a plausible model.
c After a study skills session the student produced a piece of work with 4 mistakes. You may assume that the number of mistakes does follow a Poisson distribution. Test at the 10% significance level if the mean number of mistakes is lower than the value found in part a.
7
The number of bees visiting a flower is thought to follow a Poisson distribution with mean 5 per minute.
a Describe in context two conditions which must be met for the Poisson distribution to be an appropriate model for the arrival of bees.
b After a new hedge has been planted it is thought that the number of bees arriving will increase. In 5 minutes 33 bees visit the flower. Test at the 10% significance level if there is evidence that the number of bees has increased.
8
The number of leaks in a pipe is known to follow a Poisson distribution with mean 3.8 leaks per km. After changing the water pressure an inspection of 10 km of pipe found 31 leaks. Has there been a change in the mean number of leaks? Test using 5% significance.
9
It is known from long experience that earthquakes occur in a particular town once every four months. Environmentalists believe that a change in the way oil is extracted from a well will increase the number of earthquakes. They monitor the activity for one year and six earthquakes occur.
a Test at the 5% significance level if the number of earthquakes has increased from the long-term trend, stating your p-value.
b They continue to monitor earthquake activity and the following year six earthquakes also occur. Test at the 5% significance level if the number of earthquakes has increased from the long-term trend, stating your p-value.
10 The discrete random variable X follows Po(λ). A single observation is used to test H0 : λ = a against H1 : λ < a. What is the smallest value of a for which H0 will be rejected at the 5% significance level when the observation is X = 0?
Checklist of learning and understanding
• The Poisson distribution is commonly used when these conditions hold:
  • the events must occur singly (one at a time)
  • the events must be independent of each other
  • the average rate of events (conventionally called λ) must be constant.
• If X ∼ Po(λ) then:
  • \( P(X = x) = \dfrac{e^{-\lambda}\lambda^{x}}{x!} \) for x = 0, 1, 2, ...
  • E(X) = λ
  • Var(X) = λ
• If X ∼ Po(λ) and Y ∼ Po(µ) are independent and Z = X + Y, then Z ~ Po(λ + µ).
• You can use the Poisson distribution to conduct a hypothesis test to see if an observation suggests that the mean rate has changed.
Mixed practice 2
1
The number of complaints in a shop in any hour while it is open follows a Poisson distribution with mean 1.5 per hour. Find the probability that in a 3 hour shift there are fewer than 6 complaints, giving your answer to three significant figures. Choose from these options.
A 0.0141
B 0.171
C 0.703
D 0.996
2
A random variable X follows a Poisson distribution with standard deviation 2. Find P(X = 3) to three significant figures. Choose from these options.
A 0.115
B 0.180
C 0.195
D 0.433
3
The random variable R is the number of robins who visit a bird table each hour. The random variable T is the number of thrushes who visit a bird table each hour. These are the only types of birds who visit the table. It is believed that R ∼ Po(1.5) and T ∼ Po(2.0).
B is the random variable 'Number of birds visiting the table each hour'.
a Stating a necessary assumption, write down the distribution of B.
b Find the probability that no birds visit the table in one hour.
c Find P(1 < B ≤ 6).
4
X is the random variable 'number of burgers ordered per hour in a restaurant'. It is thought that X ∼ Po(4.1).
a Write down two conditions required for the Poisson distribution to model data.
b Find P(1 < X ≤ 10).
c During a 'happy hour' special offer the number of burgers sold increased to 12. Test at the 5% significance level if the special offer has increased the average rate of burgers ordered from 4.1.
5
Salah is sowing flower seeds in his garden. He scatters seeds randomly so that the number of seeds falling on any particular region is a random variable with a Poisson distribution, with mean value proportional to the area. He intends to sow fifty thousand seeds over an area of 2 m².
a Calculate the expected number of seeds falling on a 1 cm² region.
b Calculate the probability that a given 1 cm² area receives no seeds.
6
a If X ∼ Po(10), write down E(X ) and Var(X ).
b Hence find P(E( X ) − σ < X < E( X ) + σ ) where σ is the expected standard deviation of X .
7
Seven observations of the random variable X , the number of power surges per day in a power cable, are shown: 1,1, 2, 3, 4, 4, 6 a Estimate the mean and standard deviation of X , based upon these observations. b Use your answer to part a to explain why the Poisson distribution is a plausible model for X . c When a new brand of cable is used it is observed that there are 22 power surges in five days. Does this suggest that the new brand has a different average rate of power surges to your answer in part a? Use a 5% significance level.
8
A receptionist at a hotel answers on average 35 phone calls a day. a Find the probability that on a particular day she will answer more than 40 phone calls. b Find the probability that she will answer more than 35 phone calls every day during a five-day week.
9
During the month of August in Bangalore, India, there are on average 11 rainy days. a Find the probability that there are fewer than seven rainy days during the month of August in a particular year. b Find the probability that in ten consecutive years, exactly five have fewer than seven rainy days in August.
10 The random variable X follows a Poisson distribution. Given that P(X ≥ 1) = 0.4, find:
a the mean of the distribution
b P(2 < X < 6).
11 a Given that X ∼ Po(m) and P(X = 0) = 0.305, find the value of m.
b Y ∼ Po(k). Find the possible values of k such that P(Y = 1) = 0.2.
c If W ∼ Po(λ) and P(W = w + 1) = P(W = w), express w in terms of λ.
12 A geyser erupts randomly. The eruptions at any given time are independent of one another and can be modelled using a Poisson distribution with mean 20 per day.
a Determine the probability that there will be exactly one eruption between 10 a.m. and 11 a.m.
b Determine the probability that there are more than 22 eruptions during one day.
c Determine the probability that there are no eruptions in the 30 minutes Naomi spends watching the geyser.
d Find the probability that the first eruption of a day occurs between 3 a.m. and 4 a.m.
e If each eruption produces 12 000 litres of water, find the expected volume of water produced in a week.
f Determine the probability that there will be at least one eruption in at least six out of the eight hours the geyser is open for public viewing.
g Given that there is at least one eruption in an hour, find the probability that there is exactly one eruption.
13 In a particular town, rainstorms occur at an average rate of two per week and can be modelled using a Poisson distribution.
a What is the probability of at least eight rainstorms occurring during a particular four-week period?
b Given that the probability of at least one rainstorm occurring in a period of n complete weeks is greater than 0.99, find the least possible value of n.
14 Patients arrive at random at an emergency room in a hospital at the rate of 14 per hour throughout the day.
a Find the probability that exactly four patients will arrive at the emergency room between 18:00 and 18:15.
b Given that fewer than 15 patients arrive in one hour, find the probability that more than 12 arrive.
© Cambridge University Press 2017 The third party copyright material that appears in this sample may still be pending clearance and may be subject to change.
29
A Level Further Mathematics for AQA Statistics Student Book 15 It is thought that X ∼ Po(3.2). A single observation of X takes the value 1. Does this provide evidence at the 5% significance level that the average rate has decreased? Support your answer by writing down the p-value of the observation. 16 Based on long experience a gardener knows that birds tend to arrive in his garden at an average rate of 10 per hour. a State two assumptions required to model the birds’ arrival using a Poisson distribution. Are these reasonable assumptions? b If these assumptions do hold, find the probability of observing more than 15 birds in an hour. The gardener plants some new flowers. He wants to know if this changes the birds’ behaviour.
pl e
c If λ is the true average rate of arrival of birds after the new flowers have been planted, write down suitable null and alternative hypotheses for answering the gardener’s question. d If 15 birds are observed in an hour, what is the conclusion of the test at 5% significance?
17 A water company believes that pipes have 3 leaks per km, following a Poisson distribution. After increasing water pressure they are concerned that there are more leaks. They find 10 leaks in a 2 km section of pipe. Does this provide evidence at the 5% significance level that the mean number of leaks has increased?
18 A shop has four copies of the magazine Ballroom Dancing delivered each week. Any unsold copies are returned. The demand for the magazine follows a Poisson distribution with mean 3.2 requests per week.
a Calculate the probability that the shop cannot meet the demand in a given week.
b Find the most probable number of magazines sold in one week.
c Find the expected number of magazines sold in one week.
d Determine the smallest number of copies of the magazine that should be ordered each week to ensure that the demand is met with a probability of at least 98%.
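In question 18 the number of magazines sold is the demand capped at the number of copies in stock, so its distribution is a Poisson distribution with all the probability above the cap collected at the cap. The Python sketch below illustrates one way to set this up; the names `sales`, `smallest_order` and the constants are illustrative and not taken from the text.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

LAM = 3.2    # mean weekly demand
STOCK = 4    # copies delivered each week

# (a) Demand cannot be met when it exceeds the stock.
p_unmet = 1 - poisson_cdf(STOCK, LAM)

# Sales S = min(demand, STOCK): Poisson probabilities below the cap,
# with P(demand >= STOCK) collected at the cap.
sales = {s: poisson_pmf(s, LAM) for s in range(STOCK)}
sales[STOCK] = 1 - poisson_cdf(STOCK - 1, LAM)

# (b) Most probable number sold, and (c) expected number sold.
most_probable_sales = max(sales, key=sales.get)
expected_sales = sum(s * p for s, p in sales.items())

# (d) Smallest order size k with P(demand <= k) >= target.
def smallest_order(lam, target):
    k = 0
    while poisson_cdf(k, lam) < target:
        k += 1
    return k

order_98 = smallest_order(LAM, 0.98)

print(p_unmet, most_probable_sales, expected_sales, order_98)
```

The expected number sold comes out below the mean demand of 3.2, because any demand above four copies is lost.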
19 Annette is a senior typist and makes an average of 2.5 mistakes per letter. Bruno is a trainee typist and makes an average of 4.1 mistakes per letter. Assume that the number of mistakes made by any typist follows a Poisson distribution.
a Calculate the probability that on a particular letter:
i Annette makes exactly three mistakes
ii Bruno makes exactly three mistakes.
b Annette types 80% of all the letters.
i Find the probability that a randomly chosen letter contains exactly three mistakes.
ii Given that a letter contains exactly three mistakes, find the probability that it was typed by Annette.
c Annette and Bruno type one letter each. Given that the two letters contain a total of three mistakes, find the probability that Annette made more mistakes than Bruno.
20 The number of worms in a square metre in a forest satisfies the distribution Po(1). A scientist samples many metre-squared areas but only records areas where some worms are observed. What is the mean value of her observations?
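Questions 19(b) and 20 both involve conditioning: the first on who typed the letter, the second on the count being non-zero. The Python sketch below shows one possible numerical treatment. The variable names are illustrative, and the infinite sum in question 20 is truncated at 60 terms, which is an assumption that the remaining tail is negligible for a mean of 1.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Q19(b): Annette types 80% of letters, Bruno the remaining 20%.
p_annette, p_bruno = 0.8, 0.2
p3_given_annette = poisson_pmf(3, 2.5)
p3_given_bruno = poisson_pmf(3, 4.1)

# (i) Total probability of exactly three mistakes.
p_three = p_annette * p3_given_annette + p_bruno * p3_given_bruno
# (ii) Bayes' theorem: P(Annette | three mistakes).
p_annette_given_three = p_annette * p3_given_annette / p_three

# Q20: mean of X given X >= 1 for X ~ Po(1).
LAM = 1.0
p_nonzero = 1 - poisson_pmf(0, LAM)
truncated_mean = sum(k * poisson_pmf(k, LAM) for k in range(1, 60)) / p_nonzero

# Closed form for comparison: E(X | X >= 1) = lam / (1 - exp(-lam)).
closed_form = LAM / (1 - math.exp(-LAM))

print(p_three, p_annette_given_three, truncated_mean, closed_form)
```

The closed form follows because the zero term contributes nothing to E(X), so only the normalising probability changes when the zero counts are discarded.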
21 Mohammed is offered a week’s trial with a view to being permanently employed to service bicycles in Robyn’s bicycle shop. The number of bicycles brought in to be serviced can be modelled by a Poisson distribution with mean 2.6 per day.
a Find the probability that, on Mohammed’s first day, the number of bicycles brought in to be serviced is:
i 2 or fewer
ii more than 3
iii exactly 4.
b Before starting work, Mohammed told his mother that he hoped that, during his first week (5 days), the number of bicycles brought in to be serviced would be:
• at least 10, otherwise Robyn might decide that there was not enough work to justify permanently employing him
• not more than 15, so that he would not have to work too hard.
Find the probability that Mohammed’s hopes will be met.
[© AQA 2011]
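For question 21, the daily mean of 2.6 scales to a weekly mean of 5 × 2.6 = 13, and Mohammed's hopes correspond to the weekly count lying between 10 and 15 inclusive. A minimal Python sketch of these calculations follows; the helper and variable names are illustrative only.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

DAILY_MEAN = 2.6

# (a) First-day probabilities.
p_a_i = poisson_cdf(2, DAILY_MEAN)        # 2 or fewer
p_a_ii = 1 - poisson_cdf(3, DAILY_MEAN)   # more than 3
p_a_iii = poisson_pmf(4, DAILY_MEAN)      # exactly 4

# (b) Five working days: the weekly count is Poisson with mean 13.
weekly_mean = 5 * DAILY_MEAN
# P(10 <= X <= 15) = P(X <= 15) - P(X <= 9).
p_b = poisson_cdf(15, weekly_mean) - poisson_cdf(9, weekly_mean)

print(p_a_i, p_a_ii, p_a_iii, p_b)
```

The subtraction uses P(X ≤ 9) rather than P(X ≤ 10) because "at least 10" includes the value 10.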
22 At a Roman site, coins are found at an average rate of 1 coin per 10 m². Assume that the number of coins found can be modelled by a Poisson distribution.
a Determine the probability that, in an area of 10 m²:
i at most 2 coins are found
ii exactly 4 coins are found.
b Determine the probability that more than 8 coins are found in an area of 100 m².
c Bronze brooches are less common than coins at this site, and are found at an average rate of 1 brooch per 50 m². The number of these brooches found is independent of the number of coins found. Assume that the number of bronze brooches found can also be modelled by a Poisson distribution.
i Determine the probability that the total number of coins and bronze brooches found in an area of 100 m² is at least 15.
ii Sometimes, Romans buried a hoard of several coins together. They did not usually bury several bronze brooches together. State, with a reason, which of
• the number of coins found or
• the number of bronze brooches found
is likely to be better modelled by a Poisson distribution. [© AQA 2013]
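Question 22(c)(i) relies on the fact that the sum of two independent Poisson variables is itself Poisson, with mean equal to the sum of the two means. The Python sketch below computes the required tail probability directly and then checks it against an explicit sum over the two separate counts. The helper and variable names are illustrative and not from the text.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

# (a) One coin per 10 square metres: mean 1 in an area of 10 square metres.
p_a_i = poisson_cdf(2, 1)
p_a_ii = poisson_pmf(4, 1)

# (b) Mean 10 coins in 100 square metres; more than 8 means at least 9.
p_b = 1 - poisson_cdf(8, 10)

# (c)(i) In 100 square metres: coins have mean 10, brooches mean 2.
coin_mean = 100 / 10
brooch_mean = 100 / 50
p_direct = 1 - poisson_cdf(14, coin_mean + brooch_mean)

# Check: sum over the number of coins c, with at most 14 - c brooches.
p_total_at_most_14 = sum(
    poisson_pmf(c, coin_mean) * poisson_cdf(14 - c, brooch_mean)
    for c in range(15)
)
p_check = 1 - p_total_at_most_14

print(p_a_i, p_a_ii, p_b, p_direct, p_check)
```

The agreement between `p_direct` and `p_check` depends on the independence assumption stated in the question; it would not be expected to hold if coins and brooches tended to be found together.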