07 Jun State tests that are useful for both instruction and accountability: A conversation with New Meridian’s Arthur VanderVeen (Part I)
Arthur VanderVeen is the founder and CEO of New Meridian, an assessment design and development company. Today, it works with over 2,500 districts in five states. Because of New Meridian’s leadership in response to the call for better measures, I wanted to take a closer look under the hood of their efforts. In part one of this two-part interview, Arthur talks about the process and possibilities of developing large-scale assessments that are useful for both instruction and accountability. Here’s what he said.
Dale Chu: Tell us a little about New Meridian’s state-level pilots in Montana and Louisiana. How are the designs similar/different?
Arthur VanderVeen: We have partnered with two forward-thinking state education leaders—Superintendents Elsie Arntzen in Montana and Cade Brumley in Louisiana—who are trying to make state assessments more relevant for teachers, students, and families.
In both states, the priority has been to better support classroom teachers with more instructionally useful information about students’ learning, so both are piloting a through-year model that can be aligned to the taught curriculum. Our through-year design is different from other models in that it’s composed of a bank of mini-assessments—or “testlets”—that can be flexibly aligned to the local curriculum. The testlets provide instructionally valuable feedback that teachers can use in the classroom. We then progressively aggregate these scores throughout the year to provide a growth measure and, finally, a summative measure of grade-level standards.
Dale: As I understand it, the pilots you’re working on are curriculum agnostic. Is that right?
Arthur: Yes, in the sense that the testlets can be flexibly aligned to any district’s scope and sequence and still aggregate up to a comparable, statewide summative score. We’ve taken this approach to achieve three outcomes: 1) allow for local control over curriculum choice; 2) reinforce the curriculum by aligning assessments to what’s taught in the classroom; and 3) provide a reliable, comparable measure of students’ mastery of grade-level standards irrespective of chosen curriculum.
We embarked on this work because teachers were telling us that the misalignment of their interim assessments with their taught curriculum was frustrating them and their students. Interim assessments generally measure the same end-of-year test blueprint multiple times during the year; students get frustrated when they are tested on concepts they haven’t yet had an opportunity to learn. And the skills-based score reports create pressure on teachers to drill discrete skills to prepare for the summative test, rather than reinforcing the high-quality curriculum plan.
Our approach solves these problems by designing the testlets to measure instructionally coherent clusters of standards that can be easily aligned to how most green-rated curricula introduce and develop key concepts and skills. We review high-quality curricula to determine the optimal size for these standards clusters to support maximum flexibility while also measuring the breadth and depth of grade-level standards by year end. Because we’re standards first and curriculum second, we are both curriculum agnostic and curriculum aligned.
Dale: Is it really possible to marry the dual purposes of instruction and accountability vis-à-vis testing?
Arthur: Short answer? Yes. And here’s why: In almost every classroom in this country, teachers give and grade assignments throughout the year. Then, come May or June, they convert those assignment grades into a final, summative class grade. They may use a variety of “scoring models” (average all grades, weight later grades, include a final end-of-year test) to arrive at that final course grade.
Through-year assessment systems are similar: We take multiple measures of what students learn throughout the year. Teachers use that information to differentiate and personalize ongoing instruction. We aggregate those measures into a final summative measure of their grade-level standards mastery.
There’s an important difference, however, between through-year assessments and teachers’ gradebooks. We put our testlets on a common scale so we can provide a standard, comparable measure of students’ progress toward grade-level standards mastery. This comparability is critical for an accountability measure.
There is another important consideration when trying to marry instruction and accountability: We are eager to understand how teachers and students react to regular, short, periodic assessments of their learning throughout the year when they know these assessments also contribute to their overall summative score. Will the accountability component overshadow the formative value of the instructional component? Our hypothesis, which we’re testing through teacher and student interviews as part of our pilot research studies, is that multiple opportunities throughout the year closely aligned to what has just been taught—with opportunities for retakes—will reduce the overall stress currently associated with a single high-stakes, end-of-year assessment.
Dale: Many want the “iron triangle” of state tests: better, faster, and cheaper. Which two of the three can folks reasonably expect?
Arthur: Call me an optimist, but I believe we can have all three, so long as we are willing to think differently and embrace new approaches.
We have to continue to push the boundaries on quality. I started New Meridian in 2016 to enable the then-PARCC states to maintain their commitment to the highest-quality assessments while realizing economies of scale through collaboration. Our assessments focus on deep engagement with content, critical thinking, evidence-based reasoning, effective written communication, and mathematical modeling and reasoning. These are the skills that matter for preparing students for success in college and career.
Testing time has always been an issue, but this can be addressed by through-year models that rely on shorter, less-intrusive testing. The assessments in our Louisiana and Montana pilots, for example, can be administered in a single class period and scored immediately. It is simply a faster system that reduces overall testing time.
As for costs, we introduced a new approach to the market through our licensing model. It’s ridiculous how much money individual states spend developing custom content year after year to assess essentially the same learning standards. We provide a cost-effective model whereby states can license high-quality test items developed by multiple state partners, sharing the costs of custom development. We then work with educators within each state to review, select, and approve state-specific assessment designs drawn from our large banks of premium test content. The result? We can provide states with the highest-quality assessments more cost-effectively.
Stay tuned for part two of this interview, which will be released on June 12.
This interview has been lightly edited for clarity.