State tests that are useful for both instruction and accountability: A conversation with New Meridian’s Arthur VanderVeen (Part II)

Arthur VanderVeen is the founder and CEO of New Meridian, an assessment design and development company. Today, it works with over 2,500 districts in five states. Because of New Meridian’s leadership in response to the call for better measures, I wanted to take a closer look under the hood of their efforts. In the second part of this two-part interview, Arthur peers into his crystal ball and discusses the future of state assessments, including the leveraging of artificial intelligence. Here’s what he said.

Dale Chu: Looking ahead, what role do you see technology (e.g., AI) playing when it comes to assessing students?

Arthur VanderVeen: There’s no doubt that artificial intelligence will have a major impact on assessments in coming years. It is both exciting and a little bit scary. What matters, of course, is how it is applied. If we are going to embrace AI in areas such as item development, scoring, data analytics, and reporting, for example, we must ensure that we maintain the highest technical and ethical standards and ensure that technology is serving us in the right ways.

We’re looking at AI in two areas: test development and reporting analytics. On the development side, we’ve all seen how rapidly ChatGPT can summarize texts and generate cogent responses to a myriad of questions. We believe that generative capability can exponentially reduce test development costs. On the analytics side, we are excited by the possibilities of how AI can analyze student response patterns at scale to inform how the introduction and development of concepts and skills can be optimized for better learning outcomes.

Dale: You recently sat down with Rick Hess to talk about New Meridian’s work. Was there anything that you guys didn’t cover—or perhaps left on the cutting room floor—that might be worth sharing?

Arthur: One interesting area we did not discuss is the importance of community involvement in test design and development. Students, families, and teachers generally don’t support summative assessments because they don’t receive much value from the time and effort those assessments require. As we’re in the process of creating a new assessment system, we wanted to hear from “people most proximate to the problem” what they would value in a better assessment system. In both Louisiana and Montana, we conducted extensive surveys and interviews with students, parents, teachers, and administrators to understand the local educational context and how a statewide assessment can accommodate that. One message that came through loud and clear was to make assessments more instructionally useful. We have to connect the dots between classroom instruction and state assessments.

One of the most exciting things we did this year was to convene teachers from both states together to better understand pain points and aspirations for next-generation assessments. We also conducted joint item writing workshops where teachers shared their perspectives on how to measure key concepts and skills, expectations for what their students should know and be able to do, levels of rigor, etc. It was a professionally enriching opportunity for teachers and tremendously valuable in informing our test designs.

Dale: Can you speak to other projects or initiatives that you are particularly excited about?

Arthur: We are excited about states’ growing commitment to better align their science assessments with three-dimensional science standards. With the advent of Next Generation Science Standards, states have invested heavily in curriculum and professional development over the last decade, but assessment has proven to be a bigger challenge. We created the New Meridian Science Exchange to give states new options. Contributing states share their NGSS-aligned test items to a secure bank; we broker making those test items available to other states who select items to match their blueprint and pay a licensing fee back to the contributing states.

Licensing test items can lower states’ costs for developing three-dimensional science assessments and the time it takes to field a new science assessment program. Using items from our Science Exchange, we helped Maine design and develop a new program in a matter of months.

Finally, we engage leading science assessment experts to conduct a thorough quality review of each contributed scenario and item cluster and provide feedback to states so they can improve their science assessment quality.

Another area we’re excited about is enhanced reporting. We’re working with local districts in Illinois to tease out more information from the statewide summative assessment scores. We presented test items representing different points on the reporting scale to panels of local educators and worked with them to develop instructionally useful descriptions of what students can and cannot yet do at each score level. Combining the exemplar items with detailed performance level descriptors makes the summative score more meaningful to teachers, students, and families. Districts will next begin looking at mapping instructional resources to these descriptors.

Dale: What’s your prediction for what state summative tests will look like ten years from now?

Arthur: I believe in 10 years state summative assessment systems will be much more balanced, combining micro-assessments to support personalized, competency-based learning (PCBL) models throughout the year and more holistic performance-based assessments to support development of durable skills like problem-solving, critical thinking, teamwork, flexibility, adaptability, and creativity. Micro-assessments align more closely to students’ learning progress, giving timely feedback on key concepts and skills as they develop. Performance assessments give students opportunities to pull those developing skills into more authentic, real-world tasks. The combination is very student-centered and focused on learning.

The technical challenges to achieve this vision are real: Just as we’re wrestling with how to combine through-year assessments into a comparable, reliable summative score, finer-grained PCBL micro-assessments will make this even more challenging. And reliably scoring performance assessments at a reasonable cost also presents real challenges. I believe AI offers real promise for solving many of these challenges.

Dale: What are the best resources for readers who want to learn more about New Meridian’s assessment efforts?

Arthur: We have a great deal of information on our website, including full pages that explain our through-year system and how the New Meridian Science Exchange can provide a new approach to science assessment. I would also point people to our assessment literacy modules, which take you on a step-by-step journey toward a better understanding of modern assessment, including videos and short quizzes. And of course, we welcome conversations with anyone who wants to discuss the future of assessment. We believe that we’re better as a community if we work together to make assessment a useful tool for modern classrooms.

This interview has been lightly edited for clarity.
