16 Sep The new testing landscape: A conversation with FutureEd’s Lynn Olson
By Dale Chu
Lynn Olson is an award-winning writer and editor, and a senior fellow at FutureEd, a think tank at Georgetown University’s McCourt School of Public Policy. She recently released a new report called The New Testing Landscape: State Assessments Under ESSA, which includes a scan of state testing systems across the country, a close look at half a dozen state systems, and interviews with testing experts including state assessment directors and commercial test publishers. I recently talked with Lynn about her new report and what it tells us about the current state of affairs on testing. Here’s what she said.
Dale Chu: After scanning state testing programs and talking with more than twenty experts in the field, did you find anything that surprised you?
Lynn Olson: I was surprised by how fragmented the state testing landscape has become, particularly given the desire among states in the past decade to collaborate on high-quality, standards-aligned assessments that would do a better job of demonstrating what students know and can do.
Dale: The SAT and ACT are well-known brands to parents, and an increasing number of states have embraced these college admissions tests as their high school assessment. What are the long-term consequences of this decision?
Lynn: We don’t know for certain, but there are several concerns. First, studies suggest these tests are not fully aligned to the standards states have identified as important for students to master by the time they graduate high school and, in some cases, the test content is pegged to lower grade levels. So, it could encourage high schools to water down or narrow expectations for students. Second, there’s concern that the tests favor more affluent students. Research has found a strong relationship between family income and SAT and ACT scores. And students from well-off families often have access to test-prep classes and tutors to help boost their scores, although both ACT and the College Board have tried to make test preparation more accessible to students from low-income families. So, relying on these tests could exacerbate rather than reduce educational inequities.
Dale: A different kind of shift is occurring in grades 3 through 8. A number of states are redesigning their tests to reflect individual state standards and meet accountability requirements under the Every Student Succeeds Act. How has this decision shaped transparency, comparability, and quality?
Lynn: A recent study by the National Center for Education Statistics found that cut scores—or the benchmark for what states consider proficient—have risen over time. So, the good news is states are expecting more of students, though most state performance standards still correspond more with the basic than the proficient level on the National Assessment of Educational Progress in grades 4 and 8.
But it’s much harder to get a handle on test quality because there’s less transparency. PARCC and Smarter Balanced released a large number of test items and publicly released information about test quality and alignment to standards. But for political and cost reasons, states are being less transparent. It’s also harder to compare student performance across states when states are using so many different tests.
Dale: Studies have shown that the quality and rigor of state tests improved with the advent of the PARCC and Smarter Balanced testing consortia in 2010. As states move away from the consortia, what will it take to ensure the state tests continue to evolve and not backslide?
Lynn: Today, only a dozen states remain part of the Smarter Balanced assessment consortium (and DC is the only jurisdiction giving the full PARCC exam), although several states plan to use PARCC items in their new tests in the 2019-20 school year. I think two things are critical to maintain transparency, quality, and alignment to standards, beyond federal enforcement: First, we need to have a different conversation with parents and teachers about the value of state testing systems so that they demand better tests. And, second, we need more powerful incentives for innovation and improvement than now exist under the federal innovation demonstration pilot.
Dale: How has computer-based testing transformed the testing landscape? What’s the next frontier in online testing as states look to innovate with limited budgets?
Lynn: In 2011, only five states required that students take state tests online. Today, most states give their annual tests online as the default option. This has enabled faster scoring and reporting of results; more accommodations that make the tests accessible to students with disabilities and to English learners; the use of computer-adaptive tests, where questions get harder or easier based on a student’s initial responses; and technology-enabled performance tasks that ask students to draw, write, conduct lab experiments, and solve multi-step problems—all of which can provide richer information and make the tests more engaging for students. I think one area of innovation is using AI and other advances to score students’ writing, so that more authentic writing assignments could be incorporated into state tests.
Dale: Increasing equity has long been a hallmark in standardized testing. Please describe how standardized testing has helped address this issue in America’s schools. What more needs to be done on the testing front to keep states and districts mindful of equity?
Lynn: Several federal requirements have been key to addressing equity issues: States must assess all students on challenging academic standards; provide appropriate accommodations for students who need them; and report outcomes by subgroups of students, such as students of color, low-income students, students with disabilities, and students learning English.
Having this data enables states, districts, and schools—as well as parents and advocacy groups—to identify achievement gaps between historically disadvantaged students and their more advantaged peers and to begin closing them. Comparable data lets us know which districts or schools are underserving all students or certain groups of students and the types of supports and improvement strategies that hopefully result in improved outcomes.
Having said that, there are also concerns about the continued relationship between family income and test results, and whether existing tests perpetuate rather than ameliorate inequities. And there are concerns that communities furthest away from opportunity haven’t really been brought into the conversation—including about how these measures are used or misused.
Dale: In what ways has the Trump administration’s Education Department and its handling of ESSA plans left its mark on state assessment policies?
Lynn: Given the administration’s strong focus on pushing authority down to the state and local level, states haven’t felt a lot of pressure around issues of quality, transparency, and comparability across states. Limited money and incentives to innovate mean only a few states have applied for and received approval to develop alternatives to their existing state tests.
Dale: Can you talk a little bit about the innovative assessment models more specifically, and the numerous states that are exploring an NWEA-like through-course assessment? Do we have confidence in this approach? What do we need to do to ensure the rolled-up, end-of-year score is reflective of student performance?
Lynn: To date, four states—Georgia, Louisiana, New Hampshire, and North Carolina—have been approved under the federal Innovative Assessment Demonstration Authority. What all four states have in common is an attempt to pilot assessments that would be given more frequently during the school year and be more instructionally useful to students and teachers.
Louisiana is piloting English tests that would ask students to use the social studies texts they’re already reading in classrooms in order to demonstrate and build their background knowledge, which is crucial to reading comprehension. New Hampshire is piloting competency-based assessments in which teachers work together to develop rich performance tasks, some of which are in common across participating districts, and which can be given at multiple points during the school year. North Carolina wants to roll up interim assessments into a summative assessment rating. And one of Georgia’s two models is exploring an NWEA-like through-course assessment that would roll up computer-adaptive tests given over the course of the school year into an end-of-year score that is reflective of student performance.
Some of the issues to pay attention to are whether these tests cover the breadth and depth of the standards over the course of the school year; how comparable the quality and rigor of the tasks and the scoring are, in a case like New Hampshire; and whether computer-adaptive tests can ensure all students have access to grade-level content, rather than only being taught—and assessed—based on current performance, with no hope of ever getting exposed to grade-level work. I think all of these innovations are worth exploring, but it will be important to be vigilant about evaluations and results, so that we can learn from them.
This interview has been lightly edited for clarity.