Region III Comprehensive Center George Washington University

Standards and Assessments

Center for Equity and Excellence in Education

Research and Best Practices

Summary of Article Published in American Educational Research Journal, Summer 1994, Vol. 31, No. 2, pp. 231-262

 

Assessment for Measurement or Standards: The Peril and Promise of Large-Scale Assessment Reform

by Catherine Taylor


Which assessment model for standards-based reform?

As states and districts pursue reform and experiment with ways to test its success, tension has arisen from the demand that assessments serve two incompatible purposes: first, that they measure students' achievement of standards of performance; and second, that they provide relative measurements of students, schools, districts, and states on scales of achievement. Taylor uses the term standards model for assessments that serve the first purpose and measurement model for assessments that serve the second. In practice, however, the measurement model has often been used to serve both purposes in an attempt to streamline educational testing. Such practice undermines the goals of standards-based reform. School reform efforts will be supported only if new assessment systems are developed using an assessment model that is in harmony with the goals of reform.

To support the goals of standards-based reform, Taylor makes the case for the use of a standards model that employs a variety of performance-based assessments. By clarifying the uses and abuses of the measurement model, which has been the foundation of norm-referenced testing for the past 60 years, she underscores the dangers of accepting this model to gauge progress in meeting performance standards. Norm-referenced standardized achievement tests are designed to differentiate among students and rank them, not to provide solid information about how well students are learning various objectives. The cry for performance-based assessments is partly a consequence of inappropriate uses of norm-referenced achievement tests. However, using performance-based tests for high-stakes purposes will not eliminate the negative consequences of high-stakes testing if performance tests continue to accept the assumptions of the measurement model. A different model is needed.

In contrast to the measurement model, the standards model suggests a very different set of assumptions, including:

  • Public educational standards can be set and met;
  • Most students can internalize and achieve the standards;
  • Very different student performances and exhibitions can and will reflect the same standards;
  • Educators can be trained to internalize the standards and be fair and consistent judges of diverse student performances.
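The contrast between the two models can be illustrated with a small sketch. It is not taken from Taylor's article: the student names, scores, and cut score are invented for illustration, and a single numeric cut score is a deliberate simplification of the judged performances the standards model envisions. Under a norm-referenced reading, someone always ranks at the bottom of the group; under a standards-referenced reading, every student can meet the standard.

```python
from typing import Dict

def percentile_ranks(scores: Dict[str, float]) -> Dict[str, float]:
    """Norm-referenced view: percent of the group scoring strictly below each student."""
    n = len(scores)
    return {
        name: 100.0 * sum(other < s for other in scores.values()) / n
        for name, s in scores.items()
    }

def meets_standard(scores: Dict[str, float], cut_score: float) -> Dict[str, bool]:
    """Standards-referenced view: does each student reach the fixed criterion?"""
    return {name: s >= cut_score for name, s in scores.items()}

# Hypothetical data: every student performs well in absolute terms.
scores = {"Ana": 82, "Ben": 85, "Cruz": 88, "Dee": 91}
HYPOTHETICAL_CUT = 80  # assumed criterion, not from the article

ranks = percentile_ranks(scores)
met = meets_standard(scores, HYPOTHETICAL_CUT)

for name in scores:
    print(f"{name}: percentile rank {ranks[name]:.0f}, meets standard: {met[name]}")
```

In this sketch the ranking still places one student at the bottom of the group even though all four clear the criterion, which is the kind of information gap the summary attributes to norm-referenced tests.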

Test development and the standards model

Large-scale assessments serve two important purposes: (1) providing accountability information about schools and districts, and (2) establishing a consistent standard of measurement for students. "Unless both of these assessment needs are met through the standards model, efforts to replace assessments based on the measurement model will fail. For this reason, the most critical aspect of the work for the standards model is that of identifying the essential performances in given disciplines, establishing standards and criteria for those performances, obtaining examples of performances that reflect those standards and criteria, and communicating all this to the public."

Large-scale test development using the standards model begins with the standards-setting process. Stakeholders articulate values and expectations that translate into tangible educational outcomes. The next steps are establishing benchmark performances and defining performance criteria at each developmental level. Once the criteria are determined, they are supplemented with examples of student work that reflect the criteria and represent the desired quality of work for each level. Various types of performance assessments (performance tasks, performance examinations, and portfolios) can be used to reflect different aspects of standards. In order to help students improve, rubrics should be developed that help teachers evaluate students who do not yet meet the standards. Taylor provides details on the development of rubrics and their uses.

The next stage, working with students to help them internalize and achieve the standards, is perhaps the most difficult. Teachers and students work together to design performances that meet the performance criteria. Students then practice these performances to bring them up to the standards. Performance examinations are not given until both teachers and students determine the students are ready to perform well on the exam. Taylor also discusses obtaining evidence for the validity and reliability of assessments based on standards.

Implications for educational reform

In discussing the implications of the measurement and standards models for educational reform, Taylor points out that decision makers must be informed about the choices to be made and the assumptions underlying each choice. "Do we continue creating instruments that are designed to rank and compare students, or do we want assessment systems that give us clear ideas about whether students are achieving complex learning targets? ... Do we believe our schools are supposed to sort students to find the brightest and the best, or do we believe that our democracy will be stronger if we foster the creativity and capacity of every individual? A true choice between models requires public debate over these very questions. In choosing between the standards model and the measurement model, we will have made an implicit statement about what we believe to be the purpose of schools."