Tuesday, May 17, 2011

Good Test Scores = Good Teachers?




Yesterday JS Online ran a story about Senate Bill 95: "School boards across Wisconsin could use teacher evaluations - which rely in part on the results of students' standardized state test scores - as part of the reason for dismissing and disciplining educators, according to legislation considered by the Assembly and Senate education committees Monday."

This is similar to legislation moving through statehouses around the country, built on the "value-added modeling" (VAM) approach.

Like New York, where test scores will account for 40% of teachers' evaluations.

And Georgia, where 50% of the evaluations are based on student test scores.



But do good test scores = good teachers?  Do good teachers = good test scores?  Is it really that simple an equation - so simple that teachers should be evaluated based on how well their students do on a test?

According to an article by the Economic Policy Institute, there are problems with using test scores to evaluate teachers.  "A review of the technical evidence leads us to conclude that, although standardized test scores of students are one piece of information for school leaders to use to make judgments about teacher effectiveness, such scores should be only a part of an overall comprehensive evaluation. Some states are now considering plans that would give as much as 50% of the weight in teacher evaluation and compensation decisions to scores on existing tests of basic skills in math and reading. Based on the evidence, we consider this unwise."

I would encourage everyone to read the entire study, which can be found here.  Here's one excerpt:
"For a variety of reasons, analyses of VAM results have led researchers to doubt whether the methodology can accurately identify more and less effective teachers. VAM estimates have proven to be unstable across statistical models, years, and classes that teachers teach. One study found that across five large urban districts, among teachers who were ranked in the top 20% of effectiveness in the first year, fewer than a third were in that top group the next year, and another third moved all the way down to the bottom 40%. Another found that teachers’ effectiveness ratings in one year could only predict from 4% to 16% of the variation in such ratings in the following year. Thus, a teacher who appears to be very ineffective in one year might have a dramatically different result the following year. The same dramatic fluctuations were found for teachers ranked at the bottom in the first year of analysis. This runs counter to most people’s notions that the true quality of a teacher is likely to change very little over time and raises questions about whether what is measured is largely a “teacher effect” or the effect of a wide variety of other factors."

And another:
"Adopting an invalid teacher evaluation system and tying it to rewards and sanctions is likely to lead to inaccurate personnel decisions and to demoralize teachers, causing talented teachers to avoid high-needs students and schools, or to leave the profession entirely, and discouraging potentially effective teachers from entering it. Legislatures should not mandate a test-based approach to teacher evaluation that is unproven and likely to harm not only teachers, but also the children they instruct."

Roger Tilles, a member of the New York State Board of Regents, recently made this position statement.  In it, he said, "If these value-added techniques were applied to other professions as they are being applied to teachers, it would mean that dentists would be evaluated not on their skills but only on how many cavities a dentist’s patients get in a year, or with a doctor on how many times his patients get sick in a year. Similarly, police are not evaluated on the number of crimes committed on their beat, nor fire personnel on number of fires in their jurisdiction. We would all acknowledge that such rating systems are at best incomplete."

And another study by RAND:  "Finally, our analysis and simulations demonstrate that VAM-based rankings of teachers are highly unstable, and that only large differences in estimated impact are likely to be detectable given the effects of sampling error and other sources of uncertainty. Interpretations of differences among teachers based on VAM estimates should be made with extreme caution."
