Validating standardized testing

Get plenty of sleep. Eat a healthy breakfast. Every spring, school children are sent home with these directives to help them score well on standardized tests. The fear that their children will be held back or fail to meet graduation requirements can send parents into a panic.

Teachers and school administrators also feel the pressure for students to post high test scores. A provision of No Child Left Behind requires schools to report adequate yearly progress (AYP) based on those scores. If a school does not meet AYP standards in consecutive years, it can be forced to transfer students, replace teachers or be completely reorganized.

What might surprise many is that people at the state level also feel pressure when it comes to high-stakes testing. States spend a great deal of time and money creating the tests, but a team of educational researchers including Karen Samuelsen of the University of Georgia’s College of Education has discovered that many people at the state level feel unsure about how to validate the tests they are developing.

“Ensuring that a test is fair is a lot more complicated than people might think,” said Samuelsen, an assistant professor in the department of educational psychology and instructional technology.

Samuelsen and Robert Lissitz, a professor of education and director of the Maryland Assessment Research Center for Education Success at the University of Maryland, have created a framework with a new vocabulary and a new way of thinking about validity that makes the validation of educational assessments a less daunting task.

The results of their work were featured in Educational Researcher, the official journal of the American Educational Research Association. The article is followed by commentary from well-known scholars in the field of measurement and by Lissitz and Samuelsen’s response to the comments. The commentaries range from total agreement to total disagreement, and some even expand on Lissitz and Samuelsen’s framework.

Their article has raised controversy because some experts in the field view it as an attack on renowned psychometrician Samuel J. Messick. His work at the Educational Testing Service examining construct validity is the basis for the testing standards of the National Council on Measurement in Education, American Educational Research Association and American Psychological Association. While Lissitz and Samuelsen believe in Messick’s insistence on a variety of sources of validity evidence, conversations with those in state and local education showed Messick’s unitary theory of validity left them puzzled as to how to provide that evidence.

Samuelsen says she has received only positive feedback since the article was published. State governments have found the new framework useful, and others in educational measurement have reported that it has informed their decision making. Those working in the area of classroom assessment have also shown interest in how to help teachers better understand the validation process. Samuelsen is hopeful that their framework will continue to be adapted by others, with input from those on the front lines of educational assessment.