

Research in Comparative and International Education, Volume 3, Number 2, 2008. www.wwwords.co.uk/RCIE
http://dx.doi.org/10.2304/rcie.2008.3.2.202


Trends in Mathematics and Science Study (TIMSS): international accountability and implications for science instruction

JONATHAN M. ECKERT
Peabody College, Vanderbilt University, USA

ABSTRACT International educational comparisons are relatively recent phenomena and have long been the source of international debate. Since the formation of the International Association for the Evaluation of Educational Achievement (IEA) in the 1950s, assessments have grown increasingly valid and reliable. TIMSS is an exemplary attempt at cross-national comparison. In trying to link results to practice, TIMSS released video studies that attempt to compare the teaching methodologies employed in science in various countries. Much has been made of the TIMSS reports and the pedagogy displayed in the video studies, but one question remains unanswered: what are the implications for science instruction of cross-national comparisons? In countries such as the USA and the United Kingdom, the TIMSS reports have been the basis of much criticism of science instruction. This article explores the implications of the TIMSS reports and offers recommendations for the use of cross-national comparisons to impact classroom instruction.

International educational comparisons are relatively recent phenomena and have long been the source of international debate. The International Association for the Evaluation of Educational Achievement (IEA) formed in the 1950s with headquarters in Sweden. The IEA published its first report in 1959 from a pilot survey of a test covering reading comprehension, geography, science, mathematics, and non-verbal ability. Since this first assessment, there have been 33 subsequent cross-national studies of academic achievement, 29 of which have been sponsored by the IEA (Heyneman, 2004; Heyneman & Lykins, 2008).

Many of the problems and criticisms of international comparisons were significant challenges from the beginning. There were issues in achieving comparable samples, misleading forms of reporting, and differences in the degree of curriculum match. In addition, there were difficulties in agreeing on common definitions, methodologies, sampling, and data management (Postlethwaite, 1999; Chromy, 2002; Heyneman, 2004).

From the inception of the IEA in the 1950s until the Second International Science Study (SISS) in 1982, the IEA was a somewhat informal organization. According to the SISS results, US science students were mediocre performers and US education was lagging behind that of other industrial democracies. The National Center for Education Statistics (NCES) and the National Science Foundation (NSF) questioned these findings on the grounds that the sample was not representative. To avoid this problem in the future, the NCES and the NSF concluded that any study must ensure that the data meet normal standards of reliability and validity and must accurately represent the student population. The Third International Mathematics and Science Study (TIMSS) therefore became a high federal priority; the goal was to make the USA first in mathematics and science (Vinovskis, 1999). In many ways, TIMSS was a Herculean undertaking. The study was the most expensive and far-reaching project ever attempted by the IEA.

The ripple effects from the initial and subsequent studies are evident today (Ravitch, 2003; Baker & LeTendre, 2005; Heyneman & Lykins, 2008).

TIMSS

In 1994-95, TIMSS was the largest and most ambitious study of international student achievement to date. The study comprised students in the third, fourth, seventh, and eighth grades, as well as the final year of secondary school, in 41 countries. Students were tested in mathematics and science. Moreover, extensive information about teaching and learning was collected from students, teachers, and school principals. Combined, TIMSS tested and gathered contextual data for more than half a million students. The TIMSS results were released in 1996 and 1997 with the intent of providing policy makers and practitioners with valuable information about mathematics and science instruction.

All of the 41 countries participated in testing at Population 2 (grades 7 and 8), which was the core of TIMSS (Beaton et al, 1996). The tests were developed through an international consensus process involving input from science and measurement specialists (Garden & Orpwood, 1996). The test for Population 2 contained 135 science items, one-quarter of which were in the free-response format, requiring students to generate and write their own responses. The remaining questions used a multiple-choice format. The assessment was designed to reflect current thinking and priorities within the field of science, and the items underwent an iterative development and review process, including a pilot test in 43 countries (Beaton et al, 1996). Table I shows the distribution of questions in each subject area. Table II disaggregates the questions based on performance expectations.

Content category         Number of items   Multiple-choice items   Free-response items
Earth Science                  22                   17                      5
Life Science                   40                   31                      9
Physics                        40                   28                     12
Chemistry                      19                   15                      4
Environmental Issues           14                   11                      3
Total                         135                  102                     33

Table I. Distribution of science items by content reporting category: Population 2. Source: IEA Third International Mathematics and Science Study (TIMSS) 1994-95.

Performance expectation                                   Number of items   Multiple-choice items   Free-response items
Understanding simple information                                55                   53                      2
Understanding complex information                               39                   29                     10
Theorizing, analyzing, and solving problems                     28                    9                     19
Using tools, routine procedures, and science processes           8                    8                      0
Investigating the natural world                                  5                    3                      2

Table II. Distribution of science items by performance expectations: Population 2. Source: IEA Third International Mathematics and Science Study (TIMSS) 1994-95.

This breakdown is similar to those for Populations 1 and 3, as well as to the subsequent TIMSS assessments in 1999, 2003, and 2007. While TIMSS has a secondary focus on problem solving, the assessment's primary focus is on scientific knowledge. To ensure broad subject matter coverage without overburdening individual students, TIMSS clustered the mathematics and science items into 26 groups distributed across eight booklets, so that each student completed a single 90-minute booklet containing both mathematics and science items. The tests were prepared in English and translated into 30 additional languages, with multiple checks to ensure that validity and reliability were not lost in translation (Adams & Gonzalez, 1996).
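
This cluster-and-booklet design is a form of matrix sampling: the item pool is covered by the sample as a whole, while no single student answers every item. The following sketch illustrates the idea only; the actual TIMSS cluster-to-booklet assignments are specified in the technical documentation (Adams & Gonzalez, 1996), and the round-robin rotation and the number of clusters per booklet below are illustrative assumptions.

    # Illustrative matrix sampling: 26 item clusters rotated across 8 booklets.
    # The real TIMSS booklet map is documented in Adams & Gonzalez (1996);
    # this sketch only shows how every cluster can appear in some booklet
    # while each student sees just a fraction of the full item pool.
    NUM_CLUSTERS = 26
    NUM_BOOKLETS = 8

    def build_booklets(clusters_per_booklet=5):
        """Assign clusters to booklets round-robin so coverage stays balanced."""
        booklets = [[] for _ in range(NUM_BOOKLETS)]
        position = 0
        for booklet in booklets:
            for _ in range(clusters_per_booklet):
                booklet.append(position % NUM_CLUSTERS + 1)  # cluster ids 1..26
                position += 1
        return booklets

    if __name__ == "__main__":
        for i, booklet in enumerate(build_booklets(), start=1):
            print(f"Booklet {i}: clusters {booklet}")
        # 8 booklets x 5 clusters = 40 slots, so all 26 clusters appear somewhere
        # (some twice), yet no student faces the full 135-item science pool.

The point of such a design is that population-level score estimates remain defensible even though no individual booklet covers the full assessment framework.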

Half of the science items from the 1995 and 1999 TIMSS assessments have been released to encourage their use in instruction and assessment. Sixty-seven of the 135 items on each test have not been disclosed, so those items could be used on future assessments (IEA, 1995, 1999). A challenge for a study of this nature is finding culturally appropriate, technically accurate, and grade-level accessible items that will paint a valid picture of academic progress (Garden & Orpwood, 1996; Chromy, 2002; Rotberg, 2006). Figure 1 is a released item from Population 2's 1995 TIMSS assessment.

Figure 1. A life science item from the 1995 TIMSS assessment of Population 2 (item 111): What features do all insects have?

         Number of legs   Number of body parts
    A.         2                   4
    B.         4                   2
    C.         6                   3
    D.         8                   3

Source: TIMSS Population 2 Item Pool. Copyright © 1994 by IEA, The Hague.

The international average of students responding correctly to this item at the eighth-grade level was 45% (IEA, 1995). The majority of the TIMSS multiple-choice items resemble this format, with its efficient use of words. This economy increases the reliability and validity of the assessment, because an item that is universally accessible can accurately assess the knowledge of the student. Given the resources expended, the expertise involved, and the openness of the development process and results, TIMSS has a great deal of well-deserved international credibility. Its results therefore generate a great deal of debate, as well as angst in countries that do not compare favorably to others (Postlethwaite, 1999; Schmidt, 2003; Phillips, 2007; Heyneman & Lykins, 2008).

In 1995, Singapore was the top-performing country in mathematics and science in both the seventh and eighth grades. Internationally, gender differences were pervasive in science, with boys outperforming girls particularly in physics, chemistry, and earth science (Mullis et al, 2000). Home factors, such as educational resources, books in the home, and parents' education, were strongly related to science achievement in every TIMSS country. The USA placed seventeenth in average science achievement in eighth grade, a slight drop from its twelfth-place finish in fourth grade, and was one of only four countries above the international average in fourth grade to drop to average or below at eighth grade. On the end-of-secondary-school assessment, the USA scored at or near the bottom in every category, from science literacy to physics (Beaton et al, 1996; Martin et al, 1997; Mullis et al, 1998).

In 1999, the IEA launched the Third International Mathematics and Science Study – Repeat (TIMSS-R) for eighth-grade students only. The assessment included 38 nations; the USA was one of 23 countries to repeat the test for eighth-grade students. The purpose of TIMSS-R was twofold. First, TIMSS-R allowed countries like the USA to compare the achievement of their eighth-grade students four years after the initial TIMSS assessment relative to the international average of the 17 nations that participated in both assessments. Second, TIMSS-R included a videotape study of eighth-grade mathematics and science teaching in seven nations. The intent of this ambitious videotape study was to illuminate the teaching techniques occurring in science classrooms around the world, with the ultimate goal of identifying factors that might enhance student learning opportunities and, by extension, student achievement (Stigler et al, 1999; Roth et al, 2006).

The relative performance of students on TIMSS-R was similar to their performance on TIMSS. US eighth-grade students finished eighteenth, just behind Bulgarian eighth-grade students, and there was no statistically significant change in the performance of US eighth-grade students from 1995 to 1999. Boys continued to achieve at higher levels than girls in science in a majority of the countries, including the USA, although there was no such disparity on the mathematics portion of the assessment. In a relative comparison of the achievement of the cohort of fourth-grade students in 1995 to eighth grade in 1999, the US cohort's performance was lower than it had been for fourth-grade students four years earlier (Martin et al, 2000; Gonzalez et al, 2000).

Five countries in the video study (Australia, the Czech Republic, Japan, the Netherlands, and the USA) were analyzed at length, creating a rich body of data (Stigler et al, 1999; Roth et al, 2006). The video study included 439 eighth-grade science lessons randomly sampled from the five participating countries. Of the five countries, the US average score was 513, compared to an average score of 544 for the other four countries. While many factors contribute to these scores, several findings about curriculum, teacher actions, and student actions warrant further investigation. First, in the USA, 23% of science instructional time is devoted to motivating students, a figure equal to the percentages of the other four countries combined. Second, 27% of science instructional time in the USA involves lessons with no conceptual link to key science standards, again a higher percentage than the other four countries' percentages combined.

The five countries have distinctive approaches to science instruction. Japan, the top-performing country in the video study, focuses science instruction on inquiry-oriented, inductive lessons that seek to connect ideas and evidence. However, this finding points to one of the criticisms of the TIMSS video study: the Japanese sample did not include juku schools, the private schools that the majority of secondary students attend after the typical state school day (Brown, 1999). The success of Japanese students on TIMSS therefore cannot simply be attributed to the type of instruction provided in the state schools in the study. The Czech Republic, a close second to Japan, focuses instruction on talking about science through whole-class presentations and discussions. Science classes in the Netherlands focus on independent science learning through homework, independent investigation, and reading. Australian eighth-grade classrooms attempt to make connections between main ideas and real-life issues. US classrooms implement a variety of activities and techniques in an attempt to communicate concepts (Roth et al, 2006). Despite these distinctive approaches, the four higher-achieving countries were alike in two ways: all four have high content standards and expectations for student learning, and, instead of exposing students to a variety of pedagogical approaches, they appeared to share common instructional approaches that were content-focused (Roth et al, 2006).

Four years later, the IEA administered the Trends in Mathematics and Science Study (TIMSS) with similar results. The study assessed students from 46 countries in eighth grade and students from 25 countries in fourth grade. Singapore once again headed the list for student achievement. Boys outperformed girls in science in 33 of the participating nations, although girls made statistically significant improvement from 1999 to 2003. Home context was once again a key factor for science learning: there was a positive relationship between parents' level of education, expectations for student learning, and science achievement, as well as a positive correlation between the number of books in the home, computer usage, and science achievement.
Finally, in all countries except the USA and Australia, science curricula are defined at the national level (Martin et al, 2004).

US students show some improvement relative to other countries when results from 1995, 1999, and 2003 are compared. At the fourth-grade level, the USA moved from twelfth place in 1995 to sixth place in 2003, although its average score actually dropped; the average US score was 536, compared to the international average of 489. US eighth-grade students moved from eighteenth to ninth, with an average score of 527 compared to the international average of 473. However, there continues to be a significant gap in science achievement between white students in the USA and black and Hispanic students (Gonzalez et al, 2004).

Implications for Science Instruction

US students comprise approximately 5% of the world's student population in primary and secondary schools. In order to achieve a comprehensive view of education, results from other countries must be considered. Results from the 2007 TIMSS study are due on December 8, 2008, and the test follows a similar format and participation sample (Mullis et al, 2005). While cross-national comparisons can be valuable and informative, caution must be taken, and local analytic work must hold more sway for local policy (Heyneman, 2004). Although TIMSS is an exemplary attempt at a representative, valid, and reliable assessment, history and research show the many pitfalls of international statistics (Heyneman, 1993, 1999; Guthrie & Hanson, 1995; Brown, 1999).

Often educators, policy makers, and, more frequently, journalists will obtain results from a study like TIMSS and report findings that imply causality without examining the full data set or waiting for multivariate analysis (Brown, 1999). Postlethwaite (1999) emphasizes the need for time and longitudinal analysis, which makes direct comparisons of the effectiveness of classroom practices quite difficult. Yet results from cross-national comparisons such as TIMSS will be used in policy debate to guide the allocation of resources.

Experts are divided on how these resources should best be allocated. The apparent discrepancy between the findings of Coleman et al (1966) and Heyneman & Loxley (1983) is one such example that continues today (Gamoran & Long, 2006). In the USA, there has been a great deal of support for Coleman's finding that schools are ineffective and inefficient at raising levels of student performance because of factors outside the school's influence. However, the 'Heyneman–Loxley' effect suggests that the explanatory power of a student's social background varies by country and, in poorer countries, is exceeded by the power of school quality. When examined properly, with multivariate analysis of new cross-national data, current studies support the 'Heyneman–Loxley' effect of low impact of social background and high impact of school quality in low-income countries, and the reverse in high-income countries (Baker et al, 2003; Gamoran & Long, 2006).
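
One hedged way to formalize this contrast (the notation below is mine, not that of Heyneman & Loxley or Baker et al): within each country c, regress student achievement on family background and school quality,

\[
Y_{ic} = \beta_{0c} + \beta_{1c}\,\mathrm{SES}_{ic} + \beta_{2c}\,\mathrm{Quality}_{ic} + \varepsilon_{ic},
\]

and compare the shares of explained variance attributable to each block. The 'Heyneman–Loxley' claim is then that the school-quality share exceeds the background share in low-income countries, with the ordering reversed in high-income countries. Note that these are within-country regressions; correlating national averages across countries would answer a different question.
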
In a similar fashion, experts continue to disagree on how money should be spent from state to state and district to district. Many policy makers use the TIMSS data to argue that more money should be spent on science education in the USA in order to maintain its position as the world's most powerful economy. However, some experts argue that spending more money on education has produced little in the way of increased student achievement over the past three decades (Hanushek, 1981, 1996; Heyneman & Lykins, 2008), while others argue that there is a return on investment in education in the USA (Hedges et al, 1994). Here lies the ecological fallacy discussed by Brown (1999) and Postlethwaite (1999): the mistaken assumption that between-country correlations necessarily imply within-country correlations.

However, Baker (2003) does point out some valid criticisms of the manner in which American school systems distribute school resources. He finds that American schools are less able to educate the most disadvantaged students; nearly any measure, including international assessments, documents the Black–White and socio-economic achievement gaps (Ferguson, 1998; Jencks & Phillips, 1998; Gamoran, 2001). Additionally, Baker uses cross-national comparisons to infer that the reason for US policy difficulties is a public governance structure in which elections for education posts determine policy direction. Political issues like sex education, prayer in schools, and class size become voter issues, which splinter reform efforts.

The splintering of the US curriculum has also been well documented. William Schmidt (2003) called the US curriculum 'a mile wide, and an inch deep' in response to the poor performance of US students on international comparisons. Schmidt highlights the poor performance of US students on the 1995 twelfth-grade science literacy assessment: only four of the 22 participating countries (Hungary, Lithuania, Cyprus, and South Africa) had significantly lower scores than US twelfth-grade students. He cites problematic course-taking patterns and a middle school curriculum that lacks coherence and rigor as possible causes. The findings from the 1999 TIMSS video study substantiate this claim, as the other, higher-achieving countries had more rigorous standards and more content-focused instruction (Roth et al, 2006).

However, conclusions based on international comparisons of curriculum should be drawn with caution. Brown (1999) asserts that there are issues with the way experts like Stigler examined the video study, and with subsequent findings such as the aggregate number of hours Japanese students have been exposed to mathematics. While TIMSS surveys the amount of time devoted to mathematics in school each day, at no point does it estimate the time pupils spend on mathematics through homework and private-tuition juku schools. A major factor in the Far East is the high frequency of private tuition, which doubles the amount of time Japanese students devote to mathematics compared to their English counterparts (Brown, 1996).

TIMSS is also a rich resource for data on teachers and instructional practices. The TIMSS video study was a tremendous undertaking and is data rich. However, care must be taken to ensure that science education does not become a prescriptive, scripted proposition.

One could safely infer that US teachers need to help students make connections to the deeper science concepts taught around the world, and that expectations and standards should be higher. US teachers seem to focus more on motivation and activities than on the deeper understandings essential to science instruction. The reasons for this could be many, but data from TIMSS seem to indicate that this emphasis is less than productive. Even so, in a comprehensive study of all the participating countries and their most effective schools, no classroom or teacher characteristics consistently correlated with increased student achievement (Martin et al, 2000), which illustrates the difficulty of generalizing from cross-national studies.

Policy makers and educators often use TIMSS data to illustrate the gap in science education between boys and girls. Gender differences on the 1995 TIMSS showed that in one-third of the participating countries, a significant gap was already present in fourth grade. By eighth grade, the gap was wider and evident in two-thirds of the participating countries. By the twelfth-grade assessment, the gap was prevalent in nearly every country, with males performing at significantly higher levels (Mullis et al, 2000). Although the gap has narrowed on subsequent TIMSS assessments, it still exists. As with data on the Black–White achievement gap, cross-national data from TIMSS and the Program for International Student Assessment (PISA) help policy makers triangulate evidence to support policy changes that enhance opportunities for girls in science education. If for no other reason, this is ample support for continuing TIMSS and PISA.

Some experts, like Iris Rotberg (1990, 2006), will always criticize the validity of cross-national comparisons. Is it fair to compare a large and complex nation such as the USA with smaller, more homogeneous nations? If states had been treated as nations in 1991, five of the 10 highest-scoring nations in 13-year-old mathematics proficiency would have been US states (National Center for Education Statistics, 1996). Unfortunately for proponents of this logic, a recently released study shows that even this defense no longer holds. The report describes state and international education indicators for mathematics and science using state data collected by the 2005 and 2007 National Assessment of Educational Progress (NAEP) and 2003 TIMSS data; using statistical linking, both studies are expressed in the same metric. On a positive note, most states perform as well as or better than most foreign countries. Unfortunately, the highest-achieving states within the USA are still significantly below the highest-achieving countries. North Dakota, the top-performing state on the eighth-grade NAEP test, would rank fifth overall when compared to other countries (Phillips, 2007).
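
Phillips describes the statistical linking only in general terms; one common form of such linking (an assumption here, not necessarily the exact method Phillips used) is linear moderation, which expresses a NAEP score N on the TIMSS scale by matching the means and standard deviations of a population with results on both assessments:

\[
\hat{T} = \mu_T + \frac{\sigma_T}{\sigma_N}\,(N - \mu_N),
\]

where \mu_N, \sigma_N and \mu_T, \sigma_T are the NAEP and TIMSS means and standard deviations for the linking population. A state's NAEP average can then be read as an estimated TIMSS score and ranked alongside participating countries.
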
Further data from the Organization for Economic Cooperation and Development's (OECD) 2006 PISA complete the triangulation of data from NAEP and TIMSS. Unlike TIMSS, PISA focuses only on content and process knowledge as it relates to science and does not examine home and school factors; it assesses problem solving and knowledge gained both inside and outside the classroom. On December 4, Education Week published the all-too-familiar headline, 'U.S. Students Fall Short in Math and Science' (Cavanagh, 2007). Thirty developed nations took part in the 2006 assessment, which measured the performance of 15-year-old students regardless of grade level. US students scored an average of 489, 11 points below the international average of 500 among industrialized nations; Finland attained the top average of 563. The USA ranked below 16 of the 30 industrialized nations that participated. US students scored just below the international mean on questions that asked them to identify scientific issues, but 14 points below the mean on explaining phenomena scientifically (PISA, 2007). These findings echo the 2003 PISA and TIMSS results, calling into serious question the depth of scientific understanding of American students.

Future Research and Possible Pitfalls

There will always be issues with international comparative data because of the difficulty of achieving comparable samples, misleading forms of reporting, differences in the degree of curriculum match, omission of results that would undermine policy, the ecological fallacy, and uncontrolled factors (Brown, 1999; Postlethwaite, 1999). Ample evidence is available of previous attempts by international organizations that have not met the standards of rigor that allow findings to be generalized (Guthrie & Hanson, 1995; Puryear, 1995; Heyneman, 1999). However, significant progress has been made with both PISA and TIMSS toward representative, reliable, and valid results for mathematics and science achievement (Smith & Baker, 2001; Heyneman & Lykins, 2008).

The 2007 TIMSS will release its results on December 8, 2008. In addition, the IEA is planning an Advanced TIMSS study in 2008 for students in advanced mathematics and science; nine countries have already agreed to participate. The USA has chosen not to take part: the commissioner of NCES cited budget and staffing constraints as the reasons (Viadero, 2007). This lack of participation is unlikely to become a trend, as pressure for accountability continues to increase from both the public and politicians.

With the ever-increasing pressure on academic achievement, there is some concern as to whether researchers and policy makers are asking the right questions. Given the heavy emphasis on academic proficiency created by the No Child Left Behind legislation and international testing, Coleman et al (1966) and Heyneman (2005) might ask whether these are the right issues to address. Coleman et al saw students' backgrounds as the primary reason for school success or failure. Heyneman sees the primary purpose of public schooling as social cohesion. While TIMSS attempts to account somewhat for students' home factors, there is no focus on the role of the educational institution as a bastion of social cohesion. In contrast to Heyneman and Coleman, Phillips concludes from his analysis of NAEP and TIMSS that the USA needs to substantially increase the scientific and mathematical competency of the general population. This may itself contribute to social cohesion, as it is his hope that citizens can better understand and reach consensus on the world's most pressing problems through improved scientific understanding. Additionally, he cites the need for more people working in the scientific disciplines in order to compete better in a global economic environment. To this end, indicators of scientific and mathematical progress are needed early in the education pipeline; one such approach could be linking NAEP and TIMSS results through his common metric (Phillips, 2007).

The founders of the IEA wanted to see the world of education as a single laboratory. This did not mean that education was uniform or that identical principles or intervention strategies would emerge. Instead, they were energized by the possibility that people could learn by looking at themselves through information systematically collected from around the world (Heyneman, in Baker, 2003). This should be the goal of cross-national comparisons. Heyneman eloquently describes the purpose of assessments like TIMSS: 'The appropriate function of cross-national work is to inform us, it is not to direct us. It is the "elephant" of education research. Cross-national education research is a marvelous and informative instrument, but it would be irresponsible of us to expect the elephant to fly' (Heyneman, 2004, p. 351).

References

Adams, R. & Gonzalez, E. (1996) TIMSS Test Design, in M.O. Martin & D.L. Kelly (Eds) Third International Mathematics and Science Study Technical Report, Volume I: Design and Development. Chestnut Hill: Boston College.
Baker, D.P. (2003) Should America Be More Like Them? Cross-National High School Achievement and U.S. Policy, in D. Ravitch (Ed.) Brookings Papers on Education Policy, 309-339. Washington, DC: The Brookings Institution.
Baker, D.P. & LeTendre, G. (2005) National Differences, Global Similarities: world culture and the future of schooling. Stanford: Stanford University Press.
Baker, D.P., Goesling, B. & LeTendre, G. (2003) Socioeconomic Status, School Quality and National Economic Development: a cross-national analysis of the 'Heyneman–Loxley' effect, Comparative Education Review, 46(3), 291-313. http://dx.doi.org/10.1086/341159
Beaton, A., Mullis, I., Martin, M., et al (1996) Science Achievement in the Middle School Years: IEA's Third International Mathematics and Science Study (TIMSS). Chestnut Hill: Boston College.
Brown, M. (1996) FIMS and SIMS: the first two IEA International Mathematics Surveys, Assessment in Education, 3, 181-200.
Brown, M. (1999) Problems of Interpreting International Comparative Data, in B. Jaworski & D. Phillips (Eds) Comparing Standards Internationally: research and practice in mathematics and beyond, 183-207. Oxford: Symposium Books.

Cavanagh, S. (2007) U.S. Students Fall Short in Math and Science, Education Week, 12 April. http://www.edweek.org/
Chromy, R. (2002) Sampling Issues in Design, Conduct and Interpretation of International Comparative Studies of School Achievement, in National Research Council (Ed.) Methodological Advances in Cross-National Surveys of Educational Achievement, 80-117. Washington, DC: National Academy Press.
Coleman, J.S., Campbell, E.Q., Hobart, C.J., et al (1966) Equality of Educational Opportunity. Washington, DC: US Government Printing Office, Department of Health, Education, and Welfare.
Ferguson, R.F. (1998) Can Schools Narrow the Black–White Test Score Gap? in C. Jencks & M. Phillips (Eds) The Black–White Test Score Gap. Washington, DC: The Brookings Institution.
Gamoran, A. (2001) American Schooling and Educational Inequality, Sociology of Education (extra issue), 135-153. http://dx.doi.org/10.2307/2673258
Gamoran, A. & Long, D.A. (2006) Equality of Educational Opportunity: a 40-year retrospective. WCER Working Paper No. 2006-9. Madison: University of Wisconsin-Madison, Wisconsin Center for Education Research. http://www.wcer.wisc.edu/publications/workingPapers/papers.php
Garden, R.A. & Orpwood, G. (1996) Development of the TIMSS Achievement Test, in M.O. Martin & D.L. Kelly (Eds) Third International Mathematics and Science Study Technical Report, Volume I: Design and Development. Chestnut Hill: Boston College.
Garden, R.A., Lie, S., Robitaille, D.F., et al (2006) TIMSS Advanced 2008 Assessment Frameworks. Chestnut Hill: International Association for the Evaluation of Educational Achievement.
Gonzalez, P., Calsyn, C., Jocelyn, L., et al (2000) Highlights from the Third International Mathematics and Science Study – Repeat (TIMSS-R) 1999 (NCES 2001-027). U.S. Department of Education, National Center for Education Statistics. Washington, DC: US Government Printing Office.
Gonzales, P., Guzmán, J.C., Partelow, L., et al (2004) Highlights from the Trends in International Mathematics and Science Study (TIMSS) 2003 (NCES 2005-005). U.S. Department of Education, National Center for Education Statistics. Washington, DC: US Government Printing Office.
Guthrie, J.W. & Hanson, J.S. (Eds) (1995) World Wide Education Statistics: enhancing UNESCO's role. Washington, DC: National Academy of Sciences.
Hanushek, E. (1981) Throwing Money at Schools, Journal of Policy Analysis and Management, 1(1), 19-41. http://dx.doi.org/10.2307/3324107
Hanushek, E. (1996) A More Complete Picture of School Resource Policies, Review of Educational Research, 66(3), 397-409.
Hedges, L., Laine, R. & Greenwald, R. (1994) Does Money Matter? A Meta-analysis of Studies of the Effects of Differential School Inputs on Student Outcomes, Educational Researcher, 23(3), 5-14.
Heyneman, S.P. (1993) Educational Quality and the Crisis of Education Research, International Review of Education, 39(6), 511-517. http://dx.doi.org/10.1007/BF01261533
Heyneman, S.P. (1999) The Sad Story of UNESCO's Education Statistics, International Journal of Educational Development, 19, 65-74. http://dx.doi.org/10.1016/S0738-0593(98)00068-6
Heyneman, S.P. (2004) The Use of Cross-National Comparisons to Shape Education Policy, Curriculum Inquiry, 34(3), 345-353. http://dx.doi.org/10.1111/j.1467-873X.2004.00299.x
Heyneman, S.P. (2005) Student Background and School Achievement: what is the right question? American Journal of Education, 112(1), 1-9. http://dx.doi.org/10.1086/444512
Heyneman, S.P. & Loxley, W. (1983) The Effect of Primary School Quality on Academic Achievement in Twenty-Nine High- and Low-Income Countries, American Journal of Sociology, 88(6), 1162-1194. http://dx.doi.org/10.1086/227799
Heyneman, S.P. & Lykins, C.R. (2008) The Evolution of Comparative and International Education Statistics, in H.F. Ladd & E.B. Fiske (Eds) Handbook of Research in Education Finance and Policy, 105-127. New York: Routledge.
International Association for the Evaluation of Educational Achievement (IEA) (1995) Third International Mathematics and Science Study: released items for Population 2. http://timss.bc.edu/timss1995i/TIMSSPDF/BSItems.pdf (accessed November 21, 2007).
International Association for the Evaluation of Educational Achievement (IEA) (1999) Third International Mathematics and Science Study – Repeat: released items for Population 2. http://timss.bc.edu/timss1999i/pdf/t99science_items.pdf (accessed December 11, 2007).
Jencks, C. & Phillips, M. (1998) The Black–White Test Score Gap: an introduction, in C. Jencks & M. Phillips (Eds) The Black–White Test Score Gap. Washington, DC: The Brookings Institution.

Martin, M., Mullis, I., Beaton, A., et al (1997) Science Achievement in the Primary School Years: IEA's Third International Mathematics and Science Study (TIMSS). Chestnut Hill: International Association for the Evaluation of Educational Achievement.
Martin, M., Mullis, I., Gonzalez, E., et al (2000) TIMSS 1999 International Science Report. Chestnut Hill: International Association for the Evaluation of Educational Achievement.
Martin, M., Mullis, I., Gregory, K., Hoyle, C. & Shen, C. (2000) Effective Schools in Science and Mathematics. Chestnut Hill: International Association for the Evaluation of Educational Achievement.
Martin, M., Mullis, I., Gonzales, E. & Chrostowski, S. (2004) TIMSS 2003 International Science Report. Chestnut Hill: International Association for the Evaluation of Educational Achievement.
Mullis, I., Martin, M., Beaton, A., et al (1998) Mathematics and Science Achievement in the Final Year of Secondary School. Chestnut Hill: International Association for the Evaluation of Educational Achievement.
Mullis, I., Martin, M., Fierros, E., Goldberg, A. & Stemler, S. (2000) Gender Differences in Achievement. Chestnut Hill: International Association for the Evaluation of Educational Achievement.
Mullis, I., Martin, M., Ruddock, G., et al (2005) TIMSS 2007 Assessment Frameworks. Chestnut Hill: International Association for the Evaluation of Educational Achievement.
National Center for Education Statistics (1996) Education in States and Nations: indicators comparing U.S. states with other industrialized countries in 1991. Washington, DC: US Department of Education.
Phillips, G. (2007) Chance Favors the Prepared Mind: mathematics and science indicators for comparing states and nations. Washington, DC: American Institutes for Research.
Postlethwaite, T.N. (1999) Overview of Issues in International Achievement Studies, in B. Jaworski & D. Phillips (Eds) Comparing Standards Internationally: research and practice in mathematics and beyond, 23-61. Oxford: Symposium Books.
Program for International Student Assessment (PISA) (2007) Findings from the 2006 PISA. http://www.pisa.oecd.org/dataoecd/15/13/39725224.pdf (accessed December 7, 2007).
Puryear, J. (1995) International Education Statistics and Research: status and problems, International Journal of Educational Development, 15(1), 79-91. http://dx.doi.org/10.1016/0738-0593(94)E0015-G
Ravitch, D. (Ed.) (2003) Brookings Papers on Education Policy. Washington, DC: The Brookings Institution.
Rotberg, I. (1990) I Never Promised You First Place, Phi Delta Kappan, 72(4), 296-303.
Rotberg, I. (2006) Assessment around the World, Educational Leadership, 64(3), 58-63.
Roth, K., Druker, S., Garnier, H., et al (2006) Highlights from the TIMSS 1999 Video Study of Eighth-Grade Science Teaching (NCES 2006-017). U.S. Department of Education, National Center for Education Statistics. Washington, DC: US Government Printing Office.
Schmidt, W.H. (2003) Too Little Too Late: American high schools in international context, in D. Ravitch (Ed.) Brookings Papers on Education Policy, 253-309. Washington, DC: The Brookings Institution.
Smith, T.M. & Baker, D.P. (2001) Worldwide Growth and Institutionalization of Statistical Indicators for Education Policy-Making, Peabody Journal of Education, 76, 141-153. http://dx.doi.org/10.1207/S15327930PJE763&4_9
Stigler, J.W., Gonzales, P., Kawanaka, T., Knoll, S. & Serrano, A. (1999) The TIMSS Videotape Classroom Study: methods and findings from an exploratory research project on eighth-grade mathematics instruction in Germany, Japan, and the United States, Education Statistics Quarterly, 1(2). Washington, DC: National Center for Education Statistics.
Viadero, D. (2007) U.S. Poised to Sit Out TIMSS: physics, advanced math gauged in global study, Education Week, August 1.
Vinovskis, M.A. (1999) The Road to Charlottesville: the 1989 education summit. Washington, DC: National Goals Panel.

JONATHAN M. ECKERT is a doctoral candidate at Peabody College, Vanderbilt University in the Department of Leadership, Policy, and Organizations. He has been an educator for 12 years and is currently a middle school science instructor, curriculum writer, consultant, and test development specialist working in both the public and private sectors. Correspondence: Jonathan M. Eckert, Peabody College, Vanderbilt University, 3167 Langley Drive, Franklin, TN 37064, USA (jonathan.m.eckert@vanderbilt.edu)


