Describing distributions
We describe distributions for quantitative data in terms of shape, location and spread.
Shape
See this from a histogram, stemplot or bar chart. A distribution may be unimodal (a single peak) or bimodal (two peaks).
Unimodal distributions may be symmetric or skewed.
Symmetric: the two halves of the distribution are roughly mirror images of each other. (NB: not necessarily symmetric about zero.)
Right (positively) skewed: the longer tail is on the right.
Left (negatively) skewed: the longer tail is on the left.
Note we are looking at overall shape, not at fine detail.
Example: Difference in reaction times before and after drinking a double whisky. [Histogram of the differences, with 'Difference' on the horizontal axis.]
Outliers
Newcomb's data have 2 outliers. He decided to exclude the most extreme one (−44) and calculate the speed of light from the other 65 observations.
It can be dangerous to ignore outliers: they may be telling us something important. They should be checked to make sure they are not recording errors. If not, further investigation may reveal some reason for the outlying observations.
Location
Around what value is the distribution centred? There are several measures of central tendency.
(i) The mean of observations $x_1, x_2, \ldots, x_n$ is
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$
It is very sensitive to outliers: one extreme observation can have a large effect on x̄.
Mean of 10, 20, 30, 40, 50 is 30.
Mean of 10, 20, 30, 40, 150 is 50.
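As a minimal illustration (not from the original slides), the sample mean and its sensitivity to a single extreme value can be checked in a few lines of Python:

```python
def mean(xs):
    """Sample mean: sum of the observations divided by their number."""
    return sum(xs) / len(xs)

print(mean([10, 20, 30, 40, 50]))    # 30.0
print(mean([10, 20, 30, 40, 150]))   # 50.0  (one extreme value shifts the mean a lot)
```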
Because it is influenced by extreme observations, we say the mean is not resistant.
The mean of skewed data is pulled towards the long tail, so it may not be a good description of the location of the bulk of the data. (Example from a skewed histogram: mean = 30.013, median = 26.806.)
(ii) The median is the point with half the observations above it, half below.
To find the median M, order the observations from smallest to largest, $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$.
If n is odd, M is the middle observation, the value at position (n + 1)/2:
$$M = x_{\left(\frac{n+1}{2}\right)}$$
If n is even, M is halfway between the observations at positions n/2 and (n/2) + 1:
$$M = \tfrac{1}{2}\left(x_{(n/2)} + x_{(n/2+1)}\right)$$
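A short Python sketch of this odd/even rule (an illustrative addition, not part of the original notes):

```python
def median(xs):
    """Median via the odd/even position rule described above."""
    s = sorted(xs)                    # x_(1) <= x_(2) <= ... <= x_(n)
    n = len(s)
    if n % 2 == 1:                    # n odd: the middle observation
        return s[(n + 1) // 2 - 1]    # position (n+1)/2, shifted to a 0-based index
    mid = n // 2                      # n even: average the two middle observations
    return (s[mid - 1] + s[mid]) / 2

print(median([22, 25, 34, 35, 41, 41, 46]))   # 35 (n = 7, odd)
```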
The median is easy to find from a stemplot.
Example: Petrol consumption (mpg). Stemplot (leaves in tenths):
27, 28 | 3
29, 30 | 7
31, 32 | 4 4 8
33, 34 | 3 4
35, 36 | 1 3 4 5
37, 38 | 2
n = 12, x(6) = 33.3, x(7) = 33.4, so M = (33.3 + 33.4)/2 = 33.35 mpg.
The median is a resistant measure of location.
Median of 10, 20, 30, 40, 50 is 30.
Median of 10, 20, 30, 40, 150 is 30.
The median is easy to find for small samples, but for large samples ordering the data is time consuming. The mean is quicker to calculate.
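To make the comparison concrete, here is a small check (illustrative only) using Python's standard-library statistics module:

```python
import statistics

a = [10, 20, 30, 40, 50]
b = [10, 20, 30, 40, 150]   # same data but with one extreme observation

print(statistics.mean(a), statistics.median(a))   # 30, 30
print(statistics.mean(b), statistics.median(b))   # 50, 30  (the median is unchanged)
```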
(iii) The trimmed mean is a more resistant measure than the mean. To calculate the 100p% trimmed mean, for 0 < p < 0.5, discard the smallest 100p% and the largest 100p% of the observations and compute the mean of the remaining 100(1 − 2p)%.
Example: Newcomb light data. The mean is x̄ = 26.21, but we know the data include outliers. The sample size is n = 66, so for the 5% trimmed mean we discard the largest and smallest 66 × 0.05 = 3.3 observations. That is, discard the smallest 3 values (−44, −2, 16) and the largest 3 values (37, 39, 40), then compute the mean of the remaining 60 values.
5% trimmed mean = 27.40 (in coded units)
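A rough Python sketch of this procedure, assuming the observations are held in a list called data (variable names are illustrative):

```python
def trimmed_mean(data, p):
    """100p% trimmed mean: drop the smallest and largest 100p% of the
    observations (rounded down to whole observations) and average the rest."""
    s = sorted(data)
    k = int(len(s) * p)              # e.g. 66 * 0.05 = 3.3, so discard 3 from each end
    kept = s[k:len(s) - k] if k > 0 else s
    return sum(kept) / len(kept)

# Usage sketch: trimmed_mean(newcomb, 0.05) would drop the three smallest and
# three largest of the 66 coded observations before averaging the remaining 60.
```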
Which measure?
It depends on the shape of the distribution, and on what we're interested in. The mean is not resistant, but it contains more information than the median, since it is computed from all the values.
(a) Symmetric data: Mean = Median. (Example histogram: mean = 20, median = 20.)
(b) Positively skewed data: Mean > Median. (Example: mean = 9.1, median = 7.7.)
(c) Negatively skewed data: Mean < Median. (Example: mean = 0.75, median = 0.77.)
In general, the mean is used for approximately symmetric distributions (easier to calculate, contains more information) and the median for skewed data. It can be misleading to use the 'wrong' measure, e.g. for an income distribution; see the Gismo handout.
For a bimodal distribution, the mean and median may both be misleading, having values which rarely occur.
Example: ages of cyclists in fatal accidents (USA, 2009). [Histogram of frequency per 10-year interval against age, 0–90 years.]
The mean age of cyclists killed in accidents was x̄ = 41 years, but this is not very informative. It is best to say where the modes (peaks) occur: the modal age groups are 10–15 years and 45–54 years.
Spread
Again, there are several measures.
(i) The standard deviation (SD) is a measure of spread about the mean. It is the square root of the variance.
Variance:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
SD:
$$s = \sqrt{\text{Variance}}$$
(n is the number of observations, x̄ is the sample mean.)
When all observations have the same value (no spread), then s = 0. Otherwise s > 0.
It can be shown that
$$s^2 = \frac{1}{n-1}\left[\sum x_i^2 - \frac{1}{n}\Bigl(\sum x_i\Bigr)^2\right].$$
This is a better formula to use when computing by calculator.
Example: Petrol consumption data. n = 12, Σxᵢ = 398.8, Σxᵢ² = 13342.74.
$$s^2 = \frac{13342.74 - \frac{1}{12}(398.8)^2}{12 - 1} = 8.117 \text{ (mpg)}^2, \qquad s = \sqrt{8.117} = 2.85 \text{ mpg}$$
Standard deviation, unlike variance, is in the same units as the original observations.
The University exam calculator will give s and s² directly.
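A quick Python check of this computation (illustrative; the petrol totals are those quoted above):

```python
import math

n = 12
sum_x = 398.8        # sum of the 12 petrol consumption values (mpg)
sum_x2 = 13342.74    # sum of their squares

var = (sum_x2 - sum_x**2 / n) / (n - 1)   # computational formula for s^2
sd = math.sqrt(var)
print(round(var, 3), round(sd, 2))        # 8.117 (mpg)^2 and 2.85 mpg
```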
The deviations (xᵢ − x̄) must sum to zero (by definition of x̄), so we only have (n − 1) independent deviations. We say there are n − 1 degrees of freedom. Hence the division by (n − 1), rather than n, in calculating s².
The SD is only used when the mean is used as the measure of location. The SD is not a resistant measure of spread.
(ii) The range is the difference between the largest and smallest values, x(n) − x(1). It is not resistant, and not very useful.
(iii) The interquartile range (IQR) is the range of the middle 50% of the data.
The rth percentile is the value with r% of the observations at or below it.
The median is the 50th percentile, observation (n + 1)/2.
The lower quartile Q1 is the 25th percentile, observation (n + 1)/4.
The upper quartile Q3 is the 75th percentile, observation 3(n + 1)/4.
IQR = Q3 − Q1.
The IQR is resistant, and is often used when the median is used as the measure of location.
Example: 22, 25, 34, 35, 41, 41, 46. Here n = 7, so
M = x(4) = 35, Q1 = x(2) = 25, Q3 = x(6) = 41, IQR = 41 − 25 = 16.
Example: Newcomb light data, n = 66.
$$M = \text{obs. } 33\tfrac{1}{2} = \tfrac{1}{2}\bigl(x_{(33)} + x_{(34)}\bigr) = \tfrac{1}{2}(27 + 27) = 27$$
$$Q_1 = \text{obs. } 16\tfrac{3}{4} = \tfrac{1}{4}x_{(16)} + \tfrac{3}{4}x_{(17)} = \tfrac{1}{4} \times 24 + \tfrac{3}{4} \times 24 = 24$$
$$Q_3 = \text{obs. } 50\tfrac{1}{4} = \tfrac{3}{4}x_{(50)} + \tfrac{1}{4}x_{(51)} = \tfrac{3}{4} \times 31 + \tfrac{1}{4} \times 31 = 31$$
Interquartile range = 31 − 24 = 7 (coded units)
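A Python sketch of the '(n + 1) position' rule for percentiles used above (illustrative only; note that software packages often use slightly different quartile conventions):

```python
def quantile(xs, p):
    """Value at position (n + 1) * p in the ordered data, interpolating
    linearly between neighbouring order statistics."""
    s = sorted(xs)
    n = len(s)
    pos = (n + 1) * p        # 1-based position, possibly fractional
    k = int(pos)             # whole part of the position
    frac = pos - k           # fractional part used for interpolation
    if k < 1:
        return s[0]
    if k >= n:
        return s[-1]
    return (1 - frac) * s[k - 1] + frac * s[k]

data = [22, 25, 34, 35, 41, 41, 46]
q1, m, q3 = quantile(data, 0.25), quantile(data, 0.5), quantile(data, 0.75)
print(q1, m, q3, q3 - q1)   # Q1 = 25, M = 35, Q3 = 41, IQR = 16
```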
Boxplot
A boxplot is another simple way to display data. The box extends from the lower quartile to the upper quartile. In its simplest form, the whiskers extend from the smallest value to the largest value. The line within the box indicates the median value.
Example: 22, 25, 34, 35, 41, 41, 46. From above, M = 35, Q1 = 25, Q3 = 41. [Boxplot of these values, drawn on a 0–50 scale.]
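For example, a boxplot of this simplest kind could be drawn with matplotlib (a sketch, assuming matplotlib is installed; whis=(0, 100) makes the whiskers run to the minimum and maximum rather than applying an outlier rule):

```python
import matplotlib.pyplot as plt

data = [22, 25, 34, 35, 41, 41, 46]
plt.boxplot(data, whis=(0, 100))   # box = quartiles, line = median, whiskers = min/max
plt.ylim(0, 50)
plt.show()
```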
Boxplots show location, spread and skewness (lack of symmetry). [Example boxplots of symmetric, right-skewed and left-skewed samples, drawn on a common 0–30 scale.]
Boxplots do not show whether the data are unimodal or not, so you should really only use a boxplot if you believe the data are unimodal.
Boxplots are particularly useful for comparing samples: side-by-side boxplots allow us to compare many samples at once.
Often, outliers are separated off and plotted as individual points on the boxplot. Various rules exist for deciding what counts as an 'outlier'.
One popular rule for separating off outliers is as follows. The INNER FENCES are at a distance 1.5 × box length above or below the appropriate quartile. Any data values outside the inner fences are regarded as outliers, and are drawn individually on the plot. WHISKERS then extend to the largest and smallest data values which fall within the inner fences.
Example: Newcomb light data. n = 66, M = 27, Q1 = 24, Q3 = 31, IQR = 7.
Lower inner fence = 24 − 1.5 × 7 = 13.5
Upper inner fence = 31 + 1.5 × 7 = 41.5
The data values outside the inner fences are −44 and −2. These outlying values are plotted as separate points, and the whiskers extend to the smallest and largest of the remaining data values, that is, to 16 and 40.
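A small Python sketch of this outlier rule (illustrative; the quartiles are those computed above for the coded Newcomb data):

```python
q1, q3 = 24, 31                  # lower and upper quartiles (coded units)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr     # 13.5
upper_fence = q3 + 1.5 * iqr     # 41.5

# Any observation outside the fences is plotted individually as an outlier.
def outliers(data):
    return [x for x in data if x < lower_fence or x > upper_fence]

# e.g. outliers(newcomb) would return [-44, -2] for the coded Newcomb observations.
```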
Changing the units of measurement
A linear transformation $y_i = a + bx_i$ results in:
(a) Shape: no change.
(b) Location: mean, median and quartiles are all transformed in the same way as an individual observation, e.g. $\bar{y} = a + b\bar{x}$.
(c) Spread: SD and IQR are multiplied by |b|, e.g. $s_y = |b|\,s_x$.
Example: Celsius to Fahrenheit. x = °C, y = °F, a = 32, b = 9/5.
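As a quick illustration (not from the original notes), the effect on the mean and SD can be verified numerically for the Celsius-to-Fahrenheit conversion, using hypothetical temperatures:

```python
import statistics

celsius = [12.0, 15.5, 18.0, 21.0, 25.5]         # hypothetical temperatures in degrees C
fahrenheit = [32 + 9 / 5 * c for c in celsius]   # y = a + b*x with a = 32, b = 9/5

# Location transforms like an individual observation; spread is scaled by |b|.
print(statistics.mean(fahrenheit), 32 + 9 / 5 * statistics.mean(celsius))   # equal (up to rounding)
print(statistics.stdev(fahrenheit), 9 / 5 * statistics.stdev(celsius))      # equal (up to rounding)
```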