“A VAM (Value-Added Model) score may provide teachers and administrators with information on their students’ performance and identify areas where improvement is needed, but it does not provide information on how to improve the teaching.” American Statistical Association

Today, I spent a little time looking over the American Statistical Association’s “ASA Statement on Using Value-Added Models for Educational Assessment.” That statement serves as a reminder to school leaders regarding what these models can and cannot do. Here in North Carolina and in other states, as school leaders begin looking at No Child Left Behind waiver-imposed value-added rankings of teachers, they would do well to remind themselves of the cautions described by the ASA last April. Here are some pointed reminders from that statement:
- “Estimates from VAMs should always be accompanied by measures of precision and a discussion of the assumptions and possible limitations of the model. These limitations are particularly relevant if VAMs are used for high-stakes purposes.”
- “VAMs are generally based on standardized test scores, and do not directly measure potential teacher contributions toward other student outcomes.”
- “VAMs typically measure correlation, not causation: Effects—positive or negative—attributed to a teacher may actually be caused by other factors that are not captured in the model.”
- “Under some conditions, VAM scores and rankings can change substantially when a different model or test is used, and a thorough analysis should be undertaken to evaluate the sensitivity of estimates to different models.”
- “Most VAM studies find that teachers account for about 1% to 14% of the variability in test scores, and that the majority of opportunities for quality improvement are found in the system-level conditions.”
- “Ranking teachers by their VAM scores can have unintended consequences that reduce quality.”
- “The measure of student achievement is typically a score on a standardized test, and VAMs are only as good as the data fed into them.”
- “Most VAMs predict only performance on the test and not necessarily long-range learning outcomes.”
- “The VAM scores themselves have large standard errors, even when calculated using several years of data.”
Value-added ratings should never be used to inform school leaders about teacher quality. There are just too many problems. In the spirit of reviewing VAM data with teachers, here are my top nine reminders or cautions about using value-added data in judging teacher quality:
1. Remember the limitations of the data. Though many states and companies providing VAM data fail to provide extensive explanations and discussion of the limitations of their particular value-added model, rest assured those limitations are there. It is common to hide these limitations in statistical lingo and jargon, but as a school leader, you would do well to read the fine print, do your own research, and understand value-added modeling for yourself. Once you understand the limitations of VAMs, you will be reluctant to make high-stakes decisions based on such data.
2. Remember that VAMs are based on imperfect standardized test scores. No tests directly measure teacher contributions to student learning. In fact, in many states, the tests used in VAMs were never intended to be used to judge teacher quality. For example, the ACT is used in some VAMs to determine teacher quality, but it was not designed for that purpose. As you review your VAM data, keep in mind the imperfect testing system your state has. That should give you pause before concluding that VAM data tells you anything flawless about a teacher’s quality.
3. Because VAMs measure correlation not causation, remind yourself as you look at a teacher’s VAM data that he or she alone did not cause those scores or that data. There are many, many other things that could have had a hand in those scores. No matter what promises statistics companies or policymakers make, remember that VAMs are as imperfect as the tests, the teacher, the students, and the system. VAM data should not be used to make causal inferences about the quality of teaching.
4. Remember that different VAM models produce different rankings. Even choosing one model over another reflects subjective judgment. For example, some states choose VAMs that do not control for other variables, such as student demographic background, because they feel that doing so makes an excuse for lower performance among low-socioeconomic students. That is a subjective value judgment about which VAM to use. Because of this subjective judgment, VAMs aren’t perfectly objective. Not all VAM models are equal.
5. Remind yourself that most VAM studies find that teachers account for about 1% to 14% of the variability in test scores. This means that teachers may not have as much control over test scores as many of those using VAMs to determine teacher quality assume. In a perfect manufacturing system where teachers are responsible for churning out test scores, VAMs make sense. Our schools are far from perfect, and there are many, many things out there impacting scores. Teaching is not a manufacturing process, nor will it ever be.
6. Remind yourself that should you use VAMs in a high-stakes manner, you may actually decrease the quality of student learning and harm the climate of your school. Turning your school into a place where only test scores matter, where teaching to the test is everybody’s business, is a real possibility should you place too much emphasis on VAM data. Schools that obsess about test scores aren’t fun places for anybody, teachers or students. Balanced views of VAM data, as well as of test data, are important.
7. Remember that all VAM models are only as good as the data fed into them. In practical terms, remember the imperfect nature of all standardized tests as you discuss VAM data. Even though states don’t always acknowledge the limitations of their tests, that doesn’t mean you can’t. Keep the imperfect nature of tests and VAMs in mind always. Perhaps then, you won’t use the data unfairly.
8. Remember that VAMs only predict performance on a single test. They do not tell you anything about the long-range impact of that teacher on student performance.
9. Finally, VAMs can have large standard errors. Without getting entangled in statistical lingo, suffice it to say that VAMs themselves are imperfect. Keep that in mind when reviewing the data with teachers.
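For readers who want to see in numbers what large standard errors can do to a ranking, here is a minimal Python sketch. The teacher names, scores, and standard errors are invented for illustration and do not come from any real VAM, and the 1.96 multiplier assumes the estimates are approximately normal.

```python
# Hypothetical VAM estimates and standard errors: (estimate, standard error).
# These numbers are made up for illustration only.
teachers = {
    "Teacher A": (1.5, 1.2),
    "Teacher B": (0.2, 1.1),
    "Teacher C": (-0.9, 1.3),
}

Z_95 = 1.96  # normal-approximation multiplier for a 95% interval (an assumption)


def interval(estimate, std_error, z=Z_95):
    """Return the (low, high) bounds of an approximate 95% confidence interval."""
    margin = z * std_error
    return (estimate - margin, estimate + margin)


def intervals_overlap(a, b):
    """Return True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


intervals = {name: interval(est, se) for name, (est, se) in teachers.items()}

for name, (low, high) in intervals.items():
    estimate = teachers[name][0]
    print(f"{name}: estimate {estimate:+.1f}, 95% interval [{low:+.2f}, {high:+.2f}]")

# Compare the highest-ranked and lowest-ranked teachers in this toy example.
top = intervals["Teacher A"]
bottom = intervals["Teacher C"]
print("Top and bottom intervals overlap:", intervals_overlap(top, bottom))
```

With these made-up numbers, the intervals for the highest-ranked and lowest-ranked teachers overlap, so the ranking by itself cannot confidently separate them. Overlapping intervals are only a rough check, not a formal statistical test, but they show why a ranking should not be read without its measures of precision.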
The improper use of VAM data by school leaders can downright harm education. It can turn schools into places where in-depth learning matters less than test content. It can turn teaching into a scripted process of just covering the content. It can turn schools from places of high engagement into places where no one really wants to be. School leaders can prevent that by keeping VAM data in proper perspective, as the “ASA Statement on Using Value-Added Models for Educational Assessment” does.