The best of a bad lot: Weighing updates to annual testing
22 Jul | By Dale Chu
One of the most frustrating aspects of the future-of-assessment conversation is the absence of specifics. So much of the talk is about the shortcomings of the current testing regime (e.g., too punitive, too expensive) and how we need something “better” (e.g., less time-intensive, more instructionally useful, more innovative). But what options are practicably available that are actually “better”? It’s hard to pin down an answer, and to the extent one is offered, it’s often the lukewarm suggestion of through-year testing, which leaves a lot to be desired.
Not anymore. Last month, Bellwether performed a great service by publishing a report that helps to fill in the blanks on “better.” Titled Multiple Choices: Weighing Updates to State Summative Assessments, the report examines two key policy goals—reducing the testing footprint and increasing instructional relevance—that have emerged as part of the push for new forms of testing. The authors concretely (praise Bellwether!) outline the current universe of alternative models that might help advance these goals. To wit, there are six potential approaches: (1) reducing test length; (2) matrix sampling of items; (3) sampling of students; (4) grade-band testing; (5) performance assessment; and (6) through-year assessment.
The report is worth reading in its entirety for a deep dive into the tradeoffs of each approach, along with a helpful glossary and background information. But spoiler alert: the upshot can be found on pages 24 and 25 in the policy recommendations section, where the six approaches are scored as either “not recommended” or “recommended, with conditions.” It’s worth noting that none of the approaches is simply “recommended.” In fact, most of them (four) are not recommended at all: reducing test length, matrix sampling, sampling of students, and grade-band testing. Reducing overall test length has been of particular interest to many, but the fact of the matter is that most states already employ the shortest exam possible without running afoul of what’s required, and cutting further would result in an unacceptable loss of essential information (e.g., student subscores).
That leaves partial matrix-sampling of items and through-year assessment as the two that are recommended with conditions. But based on the authors’ rationale, it would be more accurate to say that these are “not recommended, unless certain conditions are met.” The juice isn’t worth the squeeze for partial matrix-sampling of items, and the tradeoffs of through-year testing are not warranted if comparability is lost—to say nothing of the fact that many through-year models are not much of a departure from what states already use.
So where does this leave the state of play? The report’s title employs a clever play on words, but the recommendations, or lack thereof, suggest that there really aren’t many choices to begin with when it comes to annual assessments. At least, not any good ones. Until a viable alternative emerges, states and districts should hold fast to the status quo on annual state testing while the pursuit of new measurement solutions continues in earnest.