Spreadsheets, Calculators, and Graphing Lynn Stallings Marj Economopoulos Kennesaw State University
Have you wondered what all these graphing options are in spreadsheets?
Column Bar Line Pie XY (Scatter) Area Doughnut
Radar Surface Bubble Stock Cylinder Cone Pyramid
What about these graphing options in calculators?
Scatter plot XY Line Histogram Box [and whisker] Plot with and without outliers Normal Probability Plot (Shows if the distribution is normal.)
Let’s talk about Standards – What should we teach about graphing?
Common Graphs – Bar, Line, Area, Pie
Less Common Graphs – Doughnut, Radar, Bubble
Appropriate, Inappropriate, and Misleading Graphs (Good, Bad, and Ugly)
What makes a good graph?
A picture is worth …. Expanded role of graphs & charts New feature in Atlanta Journal & Constitution Editorial page
NCTM PSSM on Graphing
In grades 6-8 all students should
Select, create, and use appropriate graphical representations of data, including histograms, box plots, and scatterplots.
Discuss and understand the correspondence between data sets and their graphical representations, especially histograms, stem-and-leaf plots, box plots, and scatterplots.
Make conjectures about possible relationships between two characteristics of a sample on the basis of scatterplots of the data and approximate lines of fit.
What does the American Statistical Association say? The American Statistical Association set up a group to write Guidelines for Assessment and Instruction in Statistics Education (GAISE).
For the Curriculum Framework developed by this group, see http://www.amstat.org/education/gaise/.
Standards mention some graphs that you teach, but may not have studied in school* . . . Both of the following were created by John Tukey, a Princeton statistician. His 1977 book Exploratory Data Analysis made them popular. Both are commonly taught in middle school mathematics. Box-and-whisker Stem-and-leaf *At least, not if you’re my age.
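For readers who have not built a stem-and-leaf plot before, here is a minimal Python sketch of the idea. It assumes whole-number data, uses the tens digit as the stem and the ones digit as the leaf, and runs on made-up test scores; the function name `stem_and_leaf` is only illustrative.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group whole numbers by tens (stem) and list the ones digits (leaves)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    # Print every stem between the smallest and largest so gaps stay visible.
    lines = []
    for stem in range(min(plot), max(plot) + 1):
        leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}".rstrip())
    return lines

scores = [72, 85, 91, 68, 85, 77, 94, 60, 88, 73]  # hypothetical test scores
for line in stem_and_leaf(scores):
    print(line)
```

Because the leaves keep the original digits, the plot shows the shape of the distribution without losing any individual value.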
Histogram Frequency distribution
An extension of stem & leaf Tally marks in a chart format Class intervals depend on the data
Using Excel to build histograms A data analysis tool in Excel
Select data
Class intervals are called “Bins”
Excel will calculate the frequencies
Need to know what the bin ranges mean
An example follows
The role of class intervals (bins)
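To see how the choice of bins changes a histogram, the following Python sketch counts frequencies for two different sets of bins. It assumes each bin number is an inclusive upper limit and that values above the last bin fall into an overflow row labeled "More", which is how we understand Excel's Histogram tool to behave; the data and the function name `bin_frequencies` are made up for illustration.

```python
from bisect import bisect_left

def bin_frequencies(data, bins):
    """Count values per bin, treating each bin number as an inclusive upper limit."""
    bins = sorted(bins)
    counts = [0] * (len(bins) + 1)  # the final slot is the overflow ("More") row
    for x in data:
        # bisect_left returns the index of the first bin limit that is >= x
        counts[bisect_left(bins, x)] += 1
    labels = [f"<= {b}" for b in bins] + ["More"]
    return list(zip(labels, counts))

scores = [72, 85, 91, 68, 85, 77, 94, 60, 88, 73, 70, 100]  # hypothetical data
print(bin_frequencies(scores, [60, 70, 80, 90]))  # narrow class intervals
print(bin_frequencies(scores, [75, 90]))          # wide class intervals
```

The same twelve values look quite different under the two bin choices, which is why students need to know what each range means before reading the graph.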
Bar, Line, Area Which to use when? Vertical vs. horizontal Does it matter?
A population example
Do you feel crowded?
[Figure: US Population, column (bar) chart. Horizontal axis: Years, 1790 to 2006. Vertical axis: Millions, 0 to 300.]
[Figure: US Population, line chart. Horizontal axis: Years, 1790 to 2006. Vertical axis: Millions, 0 to 300.]
[Figure: US Population, cylinder chart. Horizontal axis: Years, 1790 to 2000. Vertical axis: Millions, 0 to 300.]
Does this make sense?
[Figure: US Population, pie chart with slices labeled 1 through 14.]
Pie Charts What communicates clearly?
[Figure: Plain M&M Color Distribution, pie chart. Legend: brown, yellow, red, blue, orange, green.]
http://us.mms.com/us/about/products/milkchocolate/
What about this pie chart?
[Figure: Plain M&M Color Distribution, pie chart. Legend: brown, yellow, red, blue, orange, green.]
Which gives you a better picture of the percent of each color you would find in a bag of M&Ms?
[Figure: two charts titled Plain M&M Color Distribution. Legend: brown, yellow, red, blue, orange, green. Percentage labels recovered from the slide: 13%, 16%, 14%, 20%, 13%, 24%.]
Pie Charts
Require proportional reasoning.
Display data as a percentage of the whole.
Are visually appealing.
Don’t communicate exact numerical data.
Make it hard to compare two data sets.
Are usually best for 3-7 categories.
Should be used with discrete data.
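The proportional reasoning behind a pie chart can be sketched in a few lines of Python: each category's count becomes a percentage of the whole and a central angle out of 360 degrees. The candy counts below are invented for illustration and are not the manufacturer's figures; `pie_slices` is a hypothetical helper name.

```python
def pie_slices(counts):
    """Return {label: (percent of whole, central angle in degrees)}."""
    total = sum(counts.values())
    return {
        label: (100 * n / total, 360 * n / total)
        for label, n in counts.items()
    }

# Hypothetical counts from one bag of 50 candies (made up for this sketch).
bag = {"brown": 6, "yellow": 7, "red": 7, "blue": 12, "orange": 10, "green": 8}

for label, (percent, angle) in pie_slices(bag).items():
    print(f"{label:>6}: {percent:4.1f}% of the bag, {angle:5.1f} degrees")
```

Students can check their own hand-drawn pie charts the same way: the percentages should total 100 and the angles should total 360.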
Let’s look at a few of the unusual graphing options in spreadsheets.
Column Bar Line Pie XY (Scatter) Area Doughnut
Radar Surface Bubble Stock Cylinder Cone Pyramid
Doughnuts?
[Figure: doughnut chart, Demographics of Georgia (inner) and Atlanta Public School System (outer) Students. Categories: Asian, Black, Hispanic, Multiracial, White. Percentage labels recovered from the slide: 1%, 8%, 1%, 4%, 3%, 38%, 49%, 2%, 8%, 86%.]
A doughnut graph shows how each data item contributes, as a percentage, to the whole. It’s a pie chart with a hole. It may be useful for comparing two groups, but it draws them as rings of unequal area, so it may mislead.
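The unequal-area problem can be checked with a little geometry. This Python sketch assumes a hole of radius 1 and two rings that are each 1 unit wide (these radii are assumptions chosen for easy arithmetic, not measurements from the slide) and compares the area drawn for the same share in each ring.

```python
import math

def ring_area(inner_radius, outer_radius):
    """Area of a ring (annulus) between two radii."""
    return math.pi * (outer_radius ** 2 - inner_radius ** 2)

def slice_area(share, inner_radius, outer_radius):
    """Area drawn for a category that makes up `share` (0 to 1) of its ring."""
    return share * ring_area(inner_radius, outer_radius)

# Assumed layout: hole radius 1, inner ring from 1 to 2, outer ring from 2 to 3.
inner = slice_area(0.5, 1, 2)
outer = slice_area(0.5, 2, 3)
print(f"A 50% share covers {inner:.2f} square units in the inner ring")
print(f"and {outer:.2f} square units in the outer ring,")
print(f"so the outer group looks {outer / inner:.2f} times as large.")
```

Under these assumptions the outer ring has five-thirds the area of the inner ring, even though both rings represent 100% of their group.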
Radar Graphs When you create a Radar chart you have a separate axis for each category of data. It basically has the appearance of spokes on a bike tire. When does it help to see data arranged this way? This example is about learning styles. http://www.learning-styles-online.com/
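One way to see what "a separate axis for each category" means is to compute where each point lands. The Python sketch below spaces the spokes evenly around a circle and places each score along its own spoke; the category names and scores are placeholders, not data from the learning-styles site, and `radar_points` is a hypothetical helper.

```python
import math

def radar_points(scores, max_score):
    """Return (label, x, y) for each category, one evenly spaced spoke apiece."""
    n = len(scores)
    points = []
    for i, (label, score) in enumerate(scores.items()):
        # First spoke points straight up; later spokes go clockwise.
        angle = math.pi / 2 - 2 * math.pi * i / n
        r = score / max_score  # distance from the center, scaled to 0..1
        x = round(r * math.cos(angle), 3)
        y = round(r * math.sin(angle), 3)
        points.append((label, x, y))
    return points

# Placeholder categories and scores, invented for this sketch.
scores = {"Category A": 10, "Category B": 5, "Category C": 10, "Category D": 5}
for label, x, y in radar_points(scores, max_score=10):
    print(f"{label}: x = {x}, y = {y}")
```

Connecting the points in order gives the familiar radar polygon, so the shape depends on the order in which the categories are listed.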
Bubble Charts
A bubble chart is basically just an XY (scatter) chart that represents an additional data series in the area of the point.
[Figure: bubble chart, Selected Georgia School Systems. Horizontal axis: County Population (2005), -200,000 to 1,200,000. Vertical axis: Square Area, 0 to 1000. Labeled bubbles: Clinch, 1,317; Chatham, 32,842; Gwinnett, 143980; Fulton, 79192; Dekalb, 99544; also Forsyth, Dade, Bibb, Muscogee, Richmond, Cobb, Sumter, Oglethorpe, Clayton.]
Big enough to see . . . Selected Georgia School Systems
The area of the circle represents the school system’s enrollment.
[Figure: bubble chart. Horizontal axis: County Population (2005), -200,000 to 1,200,000. Vertical axis: Sq. Area, -200 to 1000. Labeled bubbles: Ware, 6098; Burke, 4342; Dade, 2449; Clayton, 51948; Fulton, 79192; Dekalb, 99544; Cobb, 105526; Gwinnett, 143980; Muscogee, 32490; also Sumter, Oglethorpe, Bibb, Richmond, Forsyth, Chatham.]
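Because the bubble's area, not its radius, carries the enrollment, the radius has to grow with the square root of the value. The Python sketch below shows that scaling for three of the enrollments labeled on the slide; the maximum radius of 40 and the function name `bubble_radius` are arbitrary choices for illustration.

```python
import math

def bubble_radius(value, max_value, max_radius):
    """Radius that makes the bubble's AREA proportional to value."""
    return max_radius * math.sqrt(value / max_value)

# Enrollments as labeled on the slide.
enrollment = {"Gwinnett": 143980, "Fulton": 79192, "Dade": 2449}
largest = max(enrollment.values())

for name, n in enrollment.items():
    r = bubble_radius(n, largest, max_radius=40)  # 40 is an arbitrary size
    print(f"{name}: radius {r:.1f}, share of largest area {n / largest:.1%}")
```

If the radius itself were made proportional to enrollment, the areas would grow with the square of the value and large systems would look far bigger than they are.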
A Sixth Grade Text Introduction to graphs Misleading graphs Role of scale, equal intervals Begin comparisons at zero line
Stock market
Growth vs. Returns Are these appropriate?
Some common errors . . .
The ratio of the heights of bars within each category does not reflect the actual ratio.
There is an implied precision that is unrealistic.
The percentages are computed incorrectly. A doubling of costs is only a 100% increase.
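Two of these errors are easy to check with arithmetic. The Python sketch below computes a percent increase and the ratio of drawn bar heights when the value axis does not start at zero; the numbers and the helper names `percent_increase` and `apparent_ratio` are made up for illustration.

```python
def percent_increase(old, new):
    """Percent change from old to new, relative to old."""
    return 100 * (new - old) / old

def apparent_ratio(a, b, axis_start=0):
    """Ratio of drawn bar heights (b to a) when the axis begins at axis_start."""
    return (b - axis_start) / (a - axis_start)

print(percent_increase(50, 100))    # a doubling of costs: 100.0
print(apparent_ratio(80, 100))      # axis starts at zero: 1.25
print(apparent_ratio(80, 100, 75))  # axis starts at 75: 5.0
```

With the axis starting at 75, a value that is 25% larger is drawn five times as tall, which is the kind of distortion the first bullet describes.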
Two groups comparison Questionnaire Statements ???
Huh?
Too many comparisons but global trends
What’s wrong here?
The 3-D effects make it difficult to read the bars.
The non-horizontal scale artificially increases the lower-income bars compared to the upper-income bars.
Some of the bars are missing a percentage.
The interval sizes change. For example, all but the last two use $10,000.
What’s wrong here?
It is not clear from the horizontal axis where 1980 starts and ends.
The 3-D tilting makes the back lines look steeper even if they have the same slope.
Do you think that workforce participation rates have been falling for women? [Hint - look at the scale.]
It is a nice picture of a bus and a bus stop. Are they relevant?
[Figure: workforce participation rates for four groups (Women > 25, Women 15-24, Men 15-24, Men > 25), ‘79 to ‘83. Scale: 40% to 80%.]
Is this Better?
Correct? Effective? Is a certain choice of graph ever wrong for a set of data? If so, what is an example?
Are there times where you may make a choice among several types of graphs? If so, what criteria should you use?
To think about . . . “Excellence in statistical graphics consists of complex ideas communicated with clarity, precision, and efficiency.” (Tufte)
[Figure: diagram with the labels Correct, Incorrect, Clear, and Ineffective.]
What are the characteristics of excellent displays of data?
Graphical displays should
Show the data
Encourage the viewer to think about the context
Avoid data distortion
Present many numbers efficiently
Make large data sets coherent
Encourage the eye to compare different pieces of data
Reveal the data at several levels of detail
Serve a reasonable, clear purpose
Be closely integrated with statistical and verbal descriptions of the data
Resources:
Examples of bad graphs: http://www.stat.sfu.ca/~cschwarz/Stat-201/Handouts/
http://www.shodor.org/interactivate/activities/ and then select STATISTICS
Huff, D. (1982). How to lie with statistics. Norton.
Jones, G. E. (2000). How to lie with charts. Authors Choice Press.
Tufte, E. R. (2006). The visual display of quantitative information. Graphics Press.
Thank you! Enjoy the rest of your stay in Atlanta! Lynn, lstalling@kennesaw.edu Marj, meconomo@kennesaw.edu PowerPoint will be at http://ksuweb.kennesaw.edu/~lstallin