The president of the Colorado Education Association recently went on the record at a House Education hearing as saying that teachers should not be paid on the basis of effectiveness because they “all do the same job.” CEA’s refusal to acknowledge the obvious fact that there are effectiveness differences between teachers that must be recognized in meaningful ways defies reason. Yet this refusal shouldn’t surprise us in the context of union strategy in the wider debate over testing, teacher evaluation, and tenure reform.
While the testing controversy has brought about a strange convergence of two traditional enemies—conservatives and teachers unions—the two groups differ significantly when it comes to why they believe testing is a problem. Conservative testing angst has its roots in concerns over data privacy, federal involvement in education, and parental rights—all valid concerns that we need to work through moving forward. The unions, on the other hand, have a singular goal: The preservation of their interests in the face of what they view as an existential threat.
Like enemy soldiers accidentally finding themselves in the same trench during a heated battle, conservatives and teachers unions have been pushed by pressing issues into a temporary ceasefire. But the sides do not share common goals or values, and that ceasefire is unlikely to hold once this brief moment of respite passes. In the meantime, inadvertently handing over teacher tenure, one of the most crucial pieces of education reform, is not wise.
The following is an examination of motivations, arguments, and dangers as the 2015 testing debate lurches forward.
The Union Perspective
Despite constant reminders of “strength” and “unity,” teachers unions are in decline in the United States. Teachers union membership nationwide has now fallen under 50 percent for the first time in history. At the same time, the unions are fighting a quiet but potentially devastating internal civil war between national-level unions and their increasingly militant local and state affiliates.
Across the nation, states are rolling out evaluation systems that tie student learning outcomes to teacher evaluations. And just last year, a California judge handed down an earth-shaking decision in Vergara v. California that destroyed the state’s teacher tenure statute, one of the unions’ most critical planks, on the basis that it disproportionately harms vulnerable student populations by keeping ineffective teachers in classrooms. After examining evidence of the effects of ineffective teachers on students, the judge stated in his opinion that “The evidence is compelling. Indeed, it shocks the conscience.”
For those unfamiliar with the concept, teacher tenure is a statutory provision that provides near-complete job security once a teacher has been employed for a certain period of time. In the past, Colorado simply required three years of continuous employment as a teacher in order to qualify for tenure, which in our state is known as “non-probationary status.” Once a teacher is granted non-probationary status, it often becomes extremely difficult for districts to let him or her go without first completing an arduous and expensive type of due process, even if that teacher is not effective in the classroom. During the Vergara case in California, it was posited that an average of only 2.2 teachers were dismissed for unsatisfactory performance each year out of California’s total teaching force of 275,000. That amounts to a statewide performance-based dismissal rate of 0.0008 percent per year. In 2009, a large study showed that there had been zero formal dismissals in Denver Public Schools during a three-year period.
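The dismissal-rate arithmetic above can be checked with a short Python sketch. The function name is hypothetical and introduced only for illustration; the two inputs are the figures cited from the Vergara case.

```python
def dismissal_rate_percent(dismissals_per_year: float, teaching_force: int) -> float:
    """Return annual dismissals as a percentage of the total teaching force."""
    return dismissals_per_year / teaching_force * 100


# Figures cited in the text: 2.2 dismissals per year among 275,000 teachers.
rate = dismissal_rate_percent(2.2, 275_000)
print(f"{rate:.4f} percent")  # prints "0.0008 percent"
```

Put another way, that is roughly one performance-based dismissal per 125,000 teachers per year.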
A statutory focus on seniority-based job protection also trickles into local district union contracts, where last-in, first-out (LIFO) policies force layoff decisions to be made on the basis of years of service in a district rather than effectiveness. In Jefferson County School District, Article 34-2-5A of the negotiated agreement between the teachers union and the district states that “displacement” decisions between teachers of equal seniority will be decided not by effectiveness, but by “a flip of a coin between the teachers involved by a disinterested third party.” (Note: The agreement was signed before the relevant section of SB 191 took effect.)
While it is certainly true that the majority of teachers are doing a great job, policies like these defy even the most basic logic. Even so, teachers unions fight savagely to defend them. Now, however, the tides are beginning to turn.
Passed in 2010, Colorado’s Senate Bill 191 (SB 191) significantly altered the landscape surrounding teacher effectiveness and tenure. The bill was unanimously supported by Republicans, though it caused deep rifts in the Democratic Party. Not surprisingly, SB 191 was vehemently opposed by the Colorado Education Association, the state’s powerful teachers union. SB 191 had four primary effects:
- Requiring that 50 percent of teacher and principal effectiveness ratings be tied to multiple measures of student academic growth
- Requiring that teacher effectiveness ratings be tied to the earning or loss of non-probationary status (non-probationary status could be earned after three consecutive years of demonstrated effectiveness and lost after two years of demonstrated ineffectiveness)
- Requiring the “mutual consent” of both a teacher and a principal when placing the teacher into a new school
- Requiring effectiveness ratings be a significant factor in layoff decisions, with seniority considered after effectiveness instead of the other way around
Although SB 191’s full implementation has been delayed, it is now entering the final phase of its rollout. The threat of SB 191 to union interests is underscored by the more recent Vergara ruling, which emanated from a state long known as a bastion of teachers union control. Unions today correctly perceive that they are facing the most serious threat they’ve ever seen to their outsized dominion in the education space. Tenure may well be their Waterloo.
A Backdoor Assault on Tenure Reform
The unions’ fight against tenure reform is complicated by the fact that tenure as it currently exists is not a popular concept. A nationally representative Education Next poll found that only 32 percent of the public supported teacher tenure. Adding a brief statement about the arguments for and against teacher tenure saw that figure sink to 26 percent. These figures should not come as a surprise; the concept of permanently keeping one’s job simply by virtue of time served as opposed to performance is completely foreign to most Americans. Add in the fact that public education is a taxpayer-funded system, and you have all the makings of a political witch’s brew.
Recognizing the political minefield, unions have largely chosen to skirt the issue of tying tenure to effectiveness ratings. Instead, they have opted for more general attacks on “high-stakes” testing and evaluation. Sensing that this approach may not be enough, the unions have also opened another front by taking the battle over evaluations to the courts in a number of states. Here in Colorado, CEA is leading a legal charge against SB 191’s mutual consent provision. So far, that effort has been unsuccessful.
Meanwhile, the National Education Association and its state and local affiliates have thrown their weight behind efforts to push opt-outs across the country—often alongside conservatives who have more legitimate concerns about data privacy and federal involvement in education. At least one large opt-out group is calling for the nation’s largest union to get even more involved. In Jefferson County, the local union president vowed to opt his own kids out of this spring’s tests and encouraged others to do the same. A large number of opt-outs would have a massive impact on the quality of the state’s accountability data, which would likely allow the unions to finally discredit the use of any state testing data in evaluations once and for all.
To be clear, parents’ rights to opt their children out of assessments should be codified in statute. The state’s role is not and should never be forcing parents to do things against their will. However, any statutory provision outlining parental opt-out rights should stop short of encouraging opt-outs. Policymakers should also be wary of creating incentives for teachers to “encourage” certain students or groups of students not to take state assessments in order to influence their scores or ratings. While teachers should not be penalized for parental opt-outs when such decisions are out of their control, teachers with large numbers of opt-outs should be required to demonstrate that they made a good faith effort to test every child. Anything less invites systematic bias—bias that will be exploited by the unions to further dismantle any effort to meaningfully evaluate teachers using student growth data.
What we are witnessing is an all-out backdoor assault on tenure reform. Rather than support a deeply unpopular position directly, the unions have opted instead to do everything they can to undermine any meaningful evaluation system upon which tenure decisions could be made. They would like nothing more than to see education systems safely settle on subjective evaluation systems in which nearly 100 percent of teachers are rated as meeting or exceeding state standards for effectiveness—a phenomenon known as rating inflation that has plagued teacher evaluations for years and is evident even in Colorado’s new evaluation pilot program, which does not yet include student learning data.
A system that does not meaningfully differentiate performance makes it impossible to reward great teachers, build smarter compensation systems, or dismiss ineffective teachers. That is true even if the system is accompanied by statutory changes linking tenure to performance. Thus, if such a system were preserved, unions could claim to have played ball on tenure reform while simultaneously retaining what would amount to a system not appreciably different in practice from the one we had before. As an added bonus, unions at all levels could exploit the commonly held perception that “high-stakes” evaluations tied to subjective observations by administrators are an open invitation for abuse, thereby using fear to strengthen their recruitment efforts and ossify local collective bargaining power.
The union tactic of attacking tenure reform’s underlying structures instead of defending tenure itself has proven useful. It has enabled unions to form strange alliances with their traditional political foes: Conservatives. It speaks powerfully to the depth of concern held by many conservatives over data collection and federal involvement in education that they would be willing to ally with teachers unions despite the unions’ traditionally overwhelming support for Democrats. During Colorado’s 2014 election cycle, teachers unions gave more than 99 percent of their political contributions to Democratic candidates. In more practical terms, for every one dollar given to a Republican, 134 dollars were given to Democrats.
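The two contribution figures above are consistent with each other, as a quick Python sketch shows. It assumes, for illustration only, that contributions went solely to Democratic and Republican candidates; the function name is hypothetical.

```python
def share_percent(part: float, other: float) -> float:
    """Return `part` as a percentage of the combined total of `part` and `other`."""
    return part / (part + other) * 100


# Ratio cited in the text: 134 dollars to Democrats per 1 dollar to Republicans.
# Assumption: no contributions went to any other party or to unaffiliated candidates.
democratic_share = share_percent(134, 1)
print(f"{democratic_share:.1f} percent")  # prints "99.3 percent"
```

A 134-to-1 ratio therefore corresponds to a Democratic share just above the 99 percent mark cited.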
But while the stars may have aligned for this brief moment in time, it is unlikely that all will be well once the testing debate is settled—particularly if the end result is the gutting of reform efforts and accountability often championed by conservative education reformers or the loss of comparable data used to drive school choice and help parents make informed educational decisions for their kids.
The Underlying Arguments
As the unions’ tactical narrative takes deeper root, it is becoming increasingly difficult to sort rhetoric from reality. Consequently, it is helpful to look briefly at the situation’s underlying arguments.
The concept of holding teachers accountable for student learning has its roots in two well-established facts: Effective teachers are the single most important school-related factor in a child’s learning, and strictly subjective evaluation systems tend to exhibit rating inflation when it comes to effectiveness ratings. In more practical terms, this means we know that putting an effective teacher in every classroom is critically important, but we also know we are not doing a good job of meaningfully defining effectiveness with strictly subjective evaluations. The use of multiple measures of student learning in teacher evaluations, coupled with tenure reform, is meant to bridge the divide between these two issues. By tying objective measures of student learning and growth to teacher effectiveness ratings, we can begin to push further toward the goal of having a truly effective teacher in every single classroom.
The counterargument is that student learning data are not suitable for use in “high-stakes” evaluations. This argument typically comes in two parts: The assertion that a large majority of people oppose the use of student learning data in teacher evaluations, and the assertion that student data should never, ever be used in personnel evaluations that have real consequences attached to them. But is that really what the evidence says?
Despite arguments that there is little public support for tying student learning data to teacher evaluations, the aforementioned 2014 Education Next poll found that 60 percent of public respondents supported the idea of requiring “adequate progress on state tests” as a condition of teacher tenure. The response to this is often to tout another national poll, conducted by Phi Delta Kappa and Gallup, in which only 38 percent of respondents favored a requirement that “teacher evaluations include how well a teacher’s students perform on standardized tests.” These two pieces of evidence present a curious case of contradiction between nationally representative polls, which raises serious methodological questions. (This observer strongly suspects that wording may have played a significant role in the disparity.)
Colorado-specific polling conducted as the unions launched their anti-SB 191 lawsuit found very high levels of support for both tenure reform and the incorporation of student learning data in evaluations. At the national level, the 2014 PDK/Gallup poll also found that 82 percent of respondents thought tying teacher evaluations to salaries or bonuses was important. A whopping 94 percent felt the same about the notion that documented ineffectiveness could lead to dismissal. Clearly, the public is rather comfortable with what the unions have termed “high-stakes” evaluation. Still, in light of some conflicting information, perhaps the fairest representation of the public’s opinion on the use of test scores in evaluation is “unclear.”
The question, then, falls largely into the academic realm. Unfortunately, much of the rhetoric in this area is used carelessly in black-and-white terms, with critics and supporters alike playing off one another to create a false dichotomy in which student learning data is either flawless or entirely useless in teacher evaluations. The reality is more nuanced.
For instance, critics of partially data-driven evaluations often cite a report from the American Statistical Association (ASA) questioning the use of value-added statistical models (VAMs) in teacher evaluations. VAMs are sophisticated regression models that control for variables in student achievement that fall outside a teacher’s control. (For the sake of simplicity, we will forgo a discussion of the even deeper nuance involved in examining differences between value-added models and the Colorado Growth Model and the effects those differences can have on end results.)
The ASA report raises some valid concerns and cautions, though it should be noted that a research-based, point-by-point response to the ASA report by researchers at Harvard and Columbia argues that many of the ASA’s statistical concerns have been addressed by more contemporary research in the area. In fact, some of that recent research has found that teacher effectiveness as determined by VAMs has substantial impacts on students’ future outcomes, including income. In any case, the general thrust of the ASA report is not that VAMs should never be used; it is that they should not be carelessly overused or overemphasized in high-stakes environments. This warning is echoed by the authors of the Harvard response, who write: “The fact that classroom observation and VAM are both imperfect measures underscores why educators and policymakers are likely to make better decisions if they are based on multiple measures of job performance rather than any stand-alone metric.”
The Colorado Growth Model differs from the VAMs discussed by ASA. However, Colorado’s own technical guidance on the use of growth model data in evaluations uses similar language to both the ASA report and the Harvard response, stating that “[median growth percentiles] are most likely to be useful as a basis for evaluating teachers when this information is properly balanced against other information about student performance (e.g., evidence from student learning objectives) and direct observations of teaching practice.”
The Measures of Effective Teaching study, a massive, multi-year examination of evaluation systems conducted by research institutions like the RAND Corporation, Harvard, and Stanford, also found that the best results were achieved by combining student data thoughtfully with other, more subjective measures like classroom observations and student surveys. (Note that while some aspects of the MET study have been critiqued, it is the largest random-assignment study conducted on teacher evaluation systems.)
If you are beginning to see a pattern emerge, you’re on the right track. Taken as a whole, research in this area suggests that the best evaluation systems will take into account a mixture of student data and subjective observations rather than basing teacher ratings solely on one type of measurement. Even within the portion of teacher evaluations based on data, the incorporation of multiple high-quality measures in addition to state test-based data can be enormously helpful in creating a more complete picture.
We should be thankful, then, that while SB 191 and its associated rules require that 50 percent of teacher evaluations be based on multiple measures of student growth, they do not prescribe exactly how these measures should be weighted. State assessment data and Colorado Growth Model data must be incorporated when appropriate. However, there is no requirement for how these data should be weighted or even if they should be applied individually (results from only one teacher) or collectively (results from all teachers teaching at that grade level and in that subject in that school). Additional measures—student learning objective results based on pre- and post-tests at the course level, district assessments, school performance frameworks, and even teacher-developed assessments—can also be used, giving teachers, schools, and districts the flexibility to design assessment systems that work best for them. Indeed, some districts have been developing and utilizing such systems for quite some time.
Moving Forward Instead of Looking Back
None of this is to say that the implementation of new evaluation systems will be flawless. There are still residual concerns surrounding the possible effects of “high-stakes” evaluation on behavior (and the potential data distortion those behavioral changes could cause), and the ideal weights in each district have yet to be determined. Many districts are still in the process of determining appropriate student learning objectives and collective measures for teachers in non-core subject areas. Perhaps more fundamentally, there are very serious and important debates to be had about which state test(s) should be used in Colorado’s accountability system; PARCC probably needs to ride off into the sunset as soon as possible.
Other evaluation issues will likely also arise, just as they do with nearly every major policy change in nearly every area. Yet the best way to build stronger evaluation systems is to refine them through practice. As Colorado’s evaluation system enters full implementation, we should provide ample opportunities at both the district and state level for educators to give feedback on the system as it operates in reality. We should also support continued research into ways to further strengthen the system. By pressing ahead thoughtfully instead of reverting to the status quo, we can ensure that we are working toward improvement instead of stagnation. We know that where we came from wasn’t working, but we cannot reliably speak to the quality of where we’re going until we actually get there.
It should also be noted that no evaluation system is perfect. Even so, many millions of Americans go to work every day in countless industries, fields, and occupations. The overwhelming majority of these Americans are evaluated on a regular basis, and those evaluations carry real consequences. Many of the evaluations are influenced by factors outside employees’ control. Few people are fortunate enough to work in a field in which objective data is even available, and fewer still have seen millions spent on researching, developing, and implementing evaluation systems that are painstakingly analyzed and agonized over in the interest of maximizing fairness. For most, this is simply the way the world functions, and a great many employers would tell you that “high-stakes” evaluations are a critical part of maintaining a great workforce. Viewed in this light, tenure in the absence of meaningful performance evaluation is a relic of a bygone era long since abandoned by the rest of a competitive, market-driven America. If we truly want what’s best for our children, it’s time for education to follow suit.
As the debate over testing and evaluation grows increasingly complex, the strange alignment of conservatives and teachers unions has resulted in an immensely confusing political landscape. On the surface, it appears that both sides want the same things: Opt-out rights for parents, a decrease in testing time, and less onerous tests. Dig a little deeper, though, and problems with that narrative begin to emerge. These problems—and the stark differences in motivation they illustrate—merit careful consideration. Now more than ever, it is critically important that decisions are made from sound philosophical and policy positions rather than on the basis of political expediency.
Ross Izard is an education policy analyst at the Independence Institute, a free-market think tank in Denver.