Merit Pay: Is it now time to pull back high stakes testing along with ‘Merit Pay’ as a reward?
Dr. Hans Andrews
This is a multi-part article. This is part 2; further installments will appear in upcoming issues.
Some union responses to merit pay and evaluation
Mary Futrell (1986) was interviewed on her views on merit pay when she was president of the National Education Association. She stated the Association’s opposition to merit pay because it had not previously worked and had proved to be discriminatory (pp. 59-60).
Another former union president, Albert Shanker (1985) of the American Federation of Teachers (AFT), pointed the finger at administrators for not doing the job of removing incompetent teachers and for allowing them to remain in their schools.
Review of Literature
Merit pay historical and recent research
A study by the Educational Research Service (ERS) in 1979 determined that, of the 3,000 American schools surveyed, only 6.4 percent had attempted merit pay for their teachers. Most had dropped the programs after trying them out. The main reasons cited for dropping the programs were
(1) difficulty in figuring out how to evaluate teachers fairly;
(2) teachers’ dislike of merit pay; and
(3) declining teacher morale. In numerous schools surveyed, faculty unions had negotiated merit pay language out of their contracts.
An extensive study surveyed 1,000 current classroom teachers and summarized their feedback in the areas of
(1) teacher evaluation;
(2) how teachers want to spend their time at work;
(3) teacher retention; and
(4) financial incentives preferred (Education for Excellence, 2018).
Teacher evaluation: The top three teacher priorities were
(a) measures of student academic growth over time = 64%;
(b) students’ daily work/projects/portfolios = 45%; and
(c) classroom observation by administrators = 35%.
Students’ standardised test scores were selected by only 10% of the teachers, ranking second from the bottom among the ways teachers felt they should be evaluated.
How teachers want to spend more of their time at work: The top three choices selected here were
(a) collaborating with other teachers = 80%;
(b) participating in professional development = 56%; and
(c) communicating with parents/guardians = 50%.
Collaboration among teachers has been strained in large numbers of schools that have introduced merit pay.
Teacher retention: Teachers identified
(1) a higher salary as their number one suggestion to motivate teachers, stem attrition, and reward teachers for excellence = 75%;
(2) less standardised testing = 34%; and
(3) more supportive administrators = 21%.
The second item, less standardised testing, is consistent with the teacher evaluation findings above, where standardised test scores were selected as a source of evaluation by only 10% of the teachers.
Financial incentives preferred: Merit pay was not listed by any of the teachers among the financial incentives they favor.
Recent outcomes on Texas and Tennessee merit pay plans
A few studies conducted in recent years show the outcomes of merit and/or performance pay programs.
The National Center on Performance Incentives summarized the Texas Educator Excellence Grant (TEEG), which involved 50,000 Texas teachers in 1,148 schools in the state. Based on the outcomes for 140,000 or more students on the Texas Assessment of Knowledge and Skills, the center concluded there was no systematic evidence that TEEG had an impact on student achievement gains. The Texas State Teachers Association said, “we predicted the program would be a flop, and that’s what it turned out to be” (Stulz, 2009).
Tennessee enrolled 300 middle school mathematics teacher volunteers in a three-year randomized experiment at Vanderbilt University. The idea behind this program was to find out whether large monetary incentives could produce significant boosts in student test scores and also encourage these teachers to become more effective in their teaching.
OUTCOME:
There were only two small positive findings: small positive results for fifth graders in the second and third years. On the other hand, there were no positive effects for students in grades 6-8 in any of the three years of the experiment. The program was ended after the three years (Education Week, September 21, 2010).
Some international experiences: Japan and England
Japan
Shinbun (2011) discussed some concerns that have arisen since Japan adopted merit pay as part of its teacher evaluation back in 2011. The latest results indicated that it was not working effectively. One principal said he gave all his teachers a ‘C’ rating because to do anything less would require him to present an evidence-based case. The ‘C’ was the middle grade between ‘A’ and ‘F’ for ranking his teachers. The article also described this as a compromise aimed at reconciliation with the teachers’ union, which was not in favor of differentiating teachers by effectiveness. In the Nagano Department of Education, a total of 16,767 out of 17,000 teachers received ‘C’ as their evaluation grade.
England
The chief inspector of schools was put in the position of having to ‘consider whether there is a correlation between the quality of teaching and salary progression.’ The Office for Standards in Education announced that salaries for teachers could be frozen after school inspections are made. Government officials in some areas wanted the reforms to discourage weak teachers from staying in the teaching field in England. The National Union of Teachers general secretary, Christine Blower, saw this movement as detrimental to the teaching profession, saying, ‘teaching is a collegial profession and this is a divisive, unrealistic and simplistic way of looking at how schools work.’
Another report, Great Teachers: attracting, training and retaining the best, came out in May of 2012 from the Education Select Committee. Its recommendations for performance-related pay centered on increasing the attainment of students by rewarding and retaining the most ‘effective’ teachers in the profession.
Gates Funded Teacher Evaluation Reforms Program
Kraft (2018) provided summary points from a RAND Corporation report on the Gates-Funded Teacher Evaluation Reforms program, including that ‘the results are disappointing’. The Gates Intensive Partnership (IP) program selected three school districts and four charter management organizations. It ran from 2009 to 2015 with a funding commitment of US$575 million. The total cost over the seven years of the program was more than US$1 billion.
The schools were selected for their strong commitment to major reforms in teacher recruitment, better screening of candidates, improved evaluation of teachers and new compensation plans for teachers.
The RAND report found that formative and summative evaluation of teachers, one of the major tenets of the program, proved to be too difficult for the schools and their administrators. The evaluations identified very few teachers who would be rated below the Proficient level. In fact, as the program moved ahead, more teachers received the Proficient rating. This was in direct conflict with what was expected from a stronger and more efficient evaluation system. Administrators appeared reluctant to move toward dismissing less-than-proficient teachers and to significantly improve their evaluation processes.
Another key goal was to boost the retention of teachers through the awarding of bonuses and career ladder options. The findings provided evidence that the modest bonuses of a few thousand dollars did not provide the motivation for teachers to improve. Significantly higher merit pay rewards had been promised in the program.
Race to the Top (RTTT) Tenets and Findings
The Race to the Top (RTTT) program in the United States included payment by results, based on improvements in test scores, as a major tenet of its reward system. It offered pay increases in the form of merit pay for those teachers whose students showed significant improvement on the high stakes tests being instituted by states and the federal government. It appears that little research on the success of merit pay systems in America had been used to arrive at this merit pay initiative.
It will take several more years to identify more completely the overall results of the Race to the Top (RTTT) program, which ran from 2009 to 2016. Initial results reported by Aldeman (2017) showed some positive moves in the area of teacher performance evaluations. These appear to have contributed to improved student achievement in some school districts, but progress was uneven.
1. The changes were unpopular and caused some backlash, both over these changes and over others made in the last few decades.
2. In the New Teacher Project (TNTP), 99 percent of the teachers were rated as ‘satisfactory’, and performance was ignored altogether when making decisions about recruitment, professional development, promotion, pay, or dismissal.
3. Teacher evaluation was required to include objective measures of student achievement in the form of test scores. The number of states with such evaluation plans nearly tripled, from 15 to 43, in the six years starting in 2009. Teacher evaluations were now used in tenure decisions in 23 states versus 0 states previously.
4. The incentives promised (merit pay) were smaller than expected as they were shared among larger numbers of teachers and principals which made them less effective.
5. The program was not communicated effectively to teachers and principals, and there was mass confusion about who would be eligible for the awards and how large the awards would be.
6. While the requirements succeeded in making states adopt evaluation reforms there were few levers that would make states implement them well.
7. The percentage of teachers receiving a less-than-satisfactory rating hardly budged in most states after these new systems began being implemented.
8. This program overestimated the field’s capacity to improve teacher evaluation systems. Few people have been adequately trained to help provide evaluation and feedback to teachers on how to improve.
9. Few teachers were dismissed with the new systems.
10. Parents started to fuel a backlash against testing in general and many parents started to ‘opt out’ of statewide testing for their children.
Hess (2018), in his review of former U.S. Secretary of Education Arne Duncan’s book on his work during the Obama administration, is very critical of the accomplishments that Duncan outlined:
National Assessment of Educational Progress (NAEP) scores had been showing significant growth over the previous decade. Seven years later these NAEP scores were stagnating. The latest research showed teacher evaluation improvements to be a failure. The NAEP gains actually fell between 2013 and 2017. Duncan was highly criticized by former Secretary of Education Lamar Alexander for tying teacher evaluation to student test scores.
Major Challenge to the High Stakes Testing Movement
On the heels of these mixed results and failures of both the Obama administration’s Race to the Top and the Gates Effective Teaching Initiative came a book challenging the whole testing movement in the United States: The Testing Charade: Pretending to Make Schools Better, by Koretz (2017). This Harvard University researcher sums up what dominates everyday life in schools today with the simple answer of ‘tests’ (p. 21). He presents his concerns about how much time is spent on testing and test preparation, including time taken away from other school subjects and given over to test preparation. Koretz also shows how Don Campbell’s law has worked its way into American school systems over the years. Campbell’s law is stated as follows:
The more any quantitative social indicator is used in social decision making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor (p. 30).
To show the warnings about testing that schools need to heed, Koretz quoted E. F. Lindquist, who was one of the first developers of standardised testing. In 1951 Lindquist stated the following:
The widespread and continued use of a test will, in itself, tend to reduce the correlation between the test series and the criterion series (the later behaviour, outside of the testing situation, that is our real concern) for the population involved. Because of the nature and potency of the rewards and penalties associated in actual practice with high and low achievement test scores of students, the behaviour measured by a widely used test tends in itself to become the real objective of instruction, to the neglect of the (different) behaviour with which the ultimate objective is concerned (p. 39).
The result is score inflation, according to Koretz. He also reported how the emphasis on testing has caused widespread ‘cheating’ by teachers, administrators, and others concerned with trying to show testing increases.
Koretz suggested schools get back to evaluating teachers on their teaching practices and go back into the classrooms with ‘unscheduled observations’ where the actual teaching can be evaluated. The testing, Koretz mentioned, essentially replaces what should be the curriculum for courses.
Part 3 in May Issue.