Recommendations for Teachers
Teachers who have begun to use alternative assessment in their classrooms
are good sources for ideas and guidance. The following recommendations were
made by teachers in Virginia after they spent six months developing and
implementing alternative assessment activities in their classrooms.
1. Start small. Follow someone else's example in the beginning, or
do one activity in combination with a traditional test.
2. Develop clear rubrics. Realize that developing an effective
rubric (rating scale with several categories) for judging student
products and performances is harder than carrying out the
activity. Standards and expectations must be clear. Benchmarks
for levels of performance are essential. Characteristics of
typical student products and performances may be used to generate
performance assessment rubrics and standards for the class.
3. Expect to use more time at first. Developing and evaluating
alternative assessments and their rubrics requires additional
time until you and your students become comfortable with the
process.
4. Adapt existing curriculum. Plan assessment as you plan
instruction, not as an afterthought.
5. Have a partner. Sharing ideas and experiences with a colleague is
beneficial to teachers and to students.
6. Make a collection. Look for examples of alternative assessments
or activities that could be modified for your students and keep a
file readily accessible.
7. Assign a high value (grade) to the assessment. Students need to
see the experience as being important and worth their time. Make
expectations clear in advance.
8. Expect to learn by trial and error. Be willing to take risks and
learn from mistakes, just as we expect students to do. The best
assessments are developed over time and with repeated use.
9. Try peer assessment activities. Relieve yourself of some grading
responsibilities and increase students' evaluation skills and
accountability by involving them in administering assessments.
10. Don't give up. If the first tries are not as successful as you
had hoped, remember, this is new to the students, too. They can
help you refine the process. Once you have tried an alternative
assessment, reflect and evaluate the activities. Ask yourself
some questions. What worked? What needs modification? What
would I do differently? Would I use this activity again? How
did the students respond? Did the end results justify the time
spent? Did students learn from the activity?
Virginia Education Association and the
Appalachia Educational Laboratory (1992)
A LONG OVERVIEW OF ALTERNATIVE ASSESSMENT
Prepared by Lawrence Rudner, ERIC/AE and Carol Boston, ACCESS ERIC
So, what's all the hoopla about? Federal commissions have endorsed
performance assessment. It's been discussed on C-SPAN and in a number of
books and articles. Full issues of major education journals, including
Educational Leadership (April 1989 and May 1992) and Phi Delta Kappan
(February 1993), have been devoted to performance assessment. A
surprisingly large number of organizations are actively involved in
developing components of a performance assessment system. Chances are good
that one or more of your professional associations is in the middle of
debating goals and standards right now.
Is this just the latest bandwagon? Another short-term fix? Probably not.
The performance assessment movement encompasses much more than a technology
for testing students. It requires examining the purposes of education,
identifying skills we want students to master, and empowering teachers.
Even without an assessment component, these activities can only be good for
education. You can be certain they will have an impact on classrooms.
This article describes performance assessments, weighs their advantages and
disadvantages as instructional tools and accountability measures, and
offers suggestions to teachers and administrators who want to use
performance assessments to improve teaching and learning.
Key Features of Performance Assessment
The Office of Technology Assessment (OTA) of the U.S. Congress (1992)
provides a simple, yet insightful, definition of performance assessment:
testing that requires a student to create an answer or a product
that demonstrates his or her knowledge or skills.
A wide variety of assessment techniques fall within this broad definition.
Several are described in Table 1. One key feature of all performance
assessments is that they require students to be active participants.
Rather than choosing from presented options, as in traditional multiple-
choice tests, students are responsible for creating or constructing their
responses. These may vary in complexity from writing short answers or
essays to designing and conducting experiments or creating comprehensive
portfolios. It is important to note that proponents of "authentic
assessment" make distinctions among the various types of performance
assessments, preferring those that have meaning and value in themselves to
those that are meaningful primarily in an academic context. In a chemistry
class, for example, students might be asked to identify the chemical
composition of a premixed solution by applying tests for various
properties, or they might take samples from local lakes and rivers and
identify pollutants. Both assessments would be performance-based, but the
one involving the real-world problem would be considered more authentic.
Testing has traditionally focused on whether students get the right
answers; how they arrive at their answers has been considered important
only during test development. When students take a standardized
mathematics test, for example, there is no way to distinguish among those
who select the correct answer because they truly understand the problem,
those who understand the problem but make a careless calculation mistake,
and those who have no idea how to do the work but simply guess correctly.
Performance assessments, on the other hand, require students to demonstrate
knowledge or skills; therefore, the process by which they solve problems
becomes important. To illustrate, if high school juniors are asked to
demonstrate their understanding of interest rates by comparison shopping
for a used-car loan and identifying the best deal in a report, a teacher
can easily see if they understand the concept of interest, know how to
calculate it, and perform mathematical operations accurately.
In performance assessment, items directly reflect intended outcomes.
Whereas a traditional test might ask students about grammar rules, a
performance assessment would have them demonstrate their understanding of
English grammar by editing a poorly written passage. A traditional auto
mechanics test might include questions about a front-end alignment; a
performance assessment would have students do one.
Performance assessments can also measure skills that have not traditionally
been measured in large groups of students: skills such as integrating
knowledge across disciplines, contributing to the work of a group, and
developing a plan of action when confronted with a novel situation. Grant
Wiggins (1990) captures their potential nicely:
Do we want to evaluate student problem-posing and problem-solving
in mathematics? Experimental research in science? Speaking,
listening, and facilitating a discussion? Doing document-based
historical inquiry? Thoroughly revising a piece of imaginative
writing until it "works" for the reader? Then let our assessment
be built out of such exemplary intellectual challenges.
What's Wrong With the Way We've Been Doing It?
Many tests used in state and local assessments, as well as the Scholastic
Aptitude Test and the National Assessment of Educational Progress, have
been criticized for failing to provide the information we need about
students and their ability to meet specific curricular objectives. Critics
contend that these tests, as currently formulated, often assess only a
narrow range of the curriculum; focus on aptitudes, not specific curriculum
objectives; and emphasize minimum competencies, thus creating little
incentive for students to excel. Further, they yield results that are
analyzed primarily on the national, state, and district levels rather than
used to improve the performance of individual pupils or schools.
The true measure of performance assessment must, however, lie in its
ability to assess desired skills, not in the alleged inadequacy of other
forms of assessment.
Here We Go Again?
You might ask, "Is performance assessment really new?" Good classroom
teachers have used projects and portfolios for years, preparing numerous
activities requiring students to blend skills and insights across
disciplines. Performance assessment has been particularly common in
vocational education, the military, and business. ERIC has used
"performance tests" as a descriptor since 1966.
What is new is the widespread interest in the potential of performance
assessment. Many superintendents, state legislators, governors, and
Washington officials see high-stakes performance tests as a means to
motivate students to learn and schools to teach concepts and skills that
are more in line with today's expectations. This perspective will be
called the motivator viewpoint in this article. Many researchers,
curriculum specialists, and teachers, on the other hand, see performance
assessment as empowering teachers by providing them with better
instructional tools and a new emphasis on teaching more relevant skills, a
perspective that will be referred to as the empowerment viewpoint here.
Proponents of both viewpoints agree on the need to change assessment
methods but differ in their views about how assessment information should
be used.
On the Value of Performance Assessments
Advocates of the motivator and empowerment viewpoints concur that
performance assessments can form a solid foundation for improving schools
and increasing what students know and can do. However, the two groups
frame the advantages differently. Their positions are sketched here
briefly, then developed more fully in the sections that follow.
The motivators emphasize that performance-based assessments, if instituted
on a district, state, or national level, will allow us to monitor the
effectiveness of schools and teachers and track students' progress toward
achieving national educational goals (see "Standards, Assessments, and the
National Education Goals" on pp. X X). According to the motivator
viewpoint, performance assessments will make the educational system more
accountable for results. Proponents expect them to do the following:
* prompt schools to focus on important, performance-based outcomes;
* provide sound data on achievement, not just aptitude;
* allow valid comparisons among schools, districts, and states; and
* yield results for every important level of the education system,
from individual children to the nation as a whole.
Those in the empowerment camp, on the other hand, tend to focus on how
performance assessments will improve teaching and learning at the classroom
level. Instructional objectives in most subject areas are being redefined
to include more practical applications and more emphasis on synthesis and
integration of content and skills. Performance assessments that are
closely tied to this new curriculum can give teachers license to emphasize
important skills that traditionally have not been measured. Performance
assessments can also provide teachers with diagnostic information to help
guide instruction. The outcomes-based education (OBE) movement supports
instructional activities closely tied to performance assessment tasks.
Under OBE, students who do not demonstrate the level of accomplishment
their local communities and school districts have agreed upon receive
additional instruction to bring them up to the level.
High-Stakes Performance Assessments as Motivators
One of the most significant events in recent education history occurred in September
1989, when President George Bush held an education summit in
Charlottesville, Virginia, with the nation's governors. Together, the
participants hammered out six far-reaching national education goals,
effectively acknowledging that education issues transcend state and local
levels to affect the democratic and economic foundations of the entire
country. In a closing statement, participants announced,
We unanimously agree that there is a need for the first time in this
nation's history to have specific results-oriented goals. We
recognize the need for ... accountability for outcome-related results.
Consensus is now building among state legislators, governors, members of
Congress, Washington officials, and the general public regarding the
desirability and feasibility of some sort of voluntary national assessment
system linked with high national standards in such subject areas as
mathematics, science, English, history, geography, and the arts. A number
of professional organizations have received funding to coordinate the
development of such standards (see sidebar on p. X). The groundbreaking
work of the National Council of Teachers of Mathematics (NCTM) serves as a
model for this process: NCTM developed its Standards in 1989 and is
now developing curriculum frameworks and assessment guidelines to match it
(see "From Standards to Assessment" on p. X).
The National Council on Education Standards and Testing (NCEST), an
advisory group formed by Congress and the President in response to national
and state interest in national standards and assessments, describes the
motivational effect of a national system of assessments in its 1992 report,
Raising Standards for American Education:
National standards and a system of assessments are
desirable and feasible mechanisms for raising
expectations, revitalizing instruction, and
rejuvenating educational reform efforts for all
American schools and students (p. 8).
Envision, if you will, the enormous potential of an assessment that
perfectly and equitably measures the right skills. NCEST believes that
developing standards and high-quality assessments has "the potential to
raise learning expectations at all levels of education, better target human
and fiscal resources for educational improvement, and help meet the needs
of an increasingly mobile population". This is a shared vision. At least
a half-dozen groups have begun calling for a national assessment system or
developing instrumentation during the past two years (see "Calls for New
Assessments").
According to NCEST, student standards must be "world class" and include the
"specification of content (what students should know and be able to do) and
the level of performance students are expected to attain" (p. 3). NCEST
envisions standards that include substantive content together with
complex problem-solving and higher-order thinking skills. Such standards
would reflect "high expectations, not expectations of minimal competency"
(p. 13). NCEST believes in the motivation potential of these world-class
standards, stating that they will "raise the ceiling for students who are
currently above average" and "lift the floor for those students who now
experience the least success in school" (p. 4).
Acknowledging that tests tend to influence curriculum, NCEST suggests that
assessments should be developed to reflect the new high standards. Such
assessments would not be immediately associated with high stakes. However,
once issues of validity, reliability, and fairness have been resolved,
these assessments "could be used for such high-stakes purposes as high
school graduation, college admission, continuing education, or
certification for employment. Assessments could also be used by states and
localities as the basis for system accountability" (p. 27).
The U.S. already has one national assessment in place, the National
Assessment of Educational Progress (NAEP). Since 1969, NAEP, sponsored by
the U.S. Department of Education, has been used to assess what our nation's
children know in a variety of curriculum areas, including mathematics,
reading, science, writing, U.S. history, and geography. Historically, NAEP
has been a multiple-choice test administered to random samples of fourth-,
eighth-, and twelfth-graders in order to report on the educational progress
of our nation as a whole. As interest in accountability has grown, NAEP
has begun to conduct trial state-level assessments. NAEP is also
increasing the number of performance-based tasks to better reflect what
students can do (see "Performance-Based Aspects of the National Assessment
of Educational Progress" on p. x). The National Council on Education
Standards and Testing envisions that large-scale sample assessments such as
NAEP will be one component of a national system of assessments, to be
coupled with assessments that can provide results for individual students.
Supporters argue that a system of national assessments would improve
education by giving parents and students more accurate, relevant, and
comparable data and encouraging students to strive for world-class
standards of achievement. Critics of a national assessment system are
equally visible. The National Education Association and other professional
associations have argued that high-stakes national assessments will not
improve schooling and could easily be harmful. They are particularly
concerned that students with disabilities, students whose native language
is not English, and students and teachers attending schools with minimal
resources will be penalized under such a system. Fearing that a national
assessment system might not be a good model and could short-circuit current
reform efforts, the National Center for Fair and Open Testing, or FairTest,
testified that the federal dollars would be better spent in support of
existing reform efforts.
Performance Assessment for Teacher Empowerment
An enormous amount of activity is taking place in the area of establishing
national standards and a system of assessments. The assessments are
expected to encompass performance-based tasks that call on students to
demonstrate what they can do. They may well have strong accountability
features and be used eventually to make high-stakes decisions. Should
building principals and classroom teachers get excited about performance
assessment now? Absolutely. Viewed in its larger context, performance
assessment can play an important part in the school reform/restructuring
movement:
Performance assessment can be seen as a lever to
promote the changes needed for the assessment to be
maximally useful. Among these changes are a
redefinition of learning and a different conception of
the place of assessment in the education process.
In order to implement performance assessment fully, administrators and
teachers must have a clear picture of the skills they want students to
master and a coherent plan for how students are going to master those
skills. They need to consider how students learn and what instructional
strategies are most likely to be effective. Finally, they need to be
flexible in using assessment information for diagnostic purposes to help
individual students achieve. This level of reflection is consistent with
the best practices in education. As Joan Herman, Pamela Aschbacher, and
Lynn Winters note in their important book, A Practical Guide to Alternative
Assessment:
No longer is learning thought to be a one-way
transmission from teacher to students, with the teacher
as lecturer and the students as passive receptacles.
Rather, meaningful instruction engages students
actively in the learning process. Good teachers draw
on and synthesize discipline-based knowledge, knowledge
of student learning, and knowledge of child
development. They use a variety of instructional
strategies, from direct instruction to coaching, to
involve their students in meaningful activities . . .
and to achieve specific learning goals (p. 12).
Quality performance assessment is a key part of this vision because "good
teachers constantly assess how their students are doing, gather evidence of
problems and progress, and adjust their instructional plans accordingly".
Properly implemented, performance assessment offers an opportunity to align
curriculum and teaching efforts with the important skills we wish children
to master. Cognitive learning theory, which emphasizes that knowledge is
constructed and that learners vary, provides some insight into what an
aligned curriculum might look like (see Implications from Learning Theory).
The Case for Authentic Assessment.
ERIC Digest. ED328611 TM016142
American Institutes for Research, Washington, DC.; ERIC Clearinghouse on
Tests, Measurement, and Evaluation, Washington, DC. Dec 1990
Mr. Wiggins, a researcher and consultant on school reform issues, is a
widely-known advocate of authentic assessment in education. This digest is
based on materials that he prepared for the California Assessment Program.
WHAT IS AUTHENTIC ASSESSMENT?
Assessment is authentic when we directly examine student performance on
worthy intellectual tasks. Traditional assessment, by contrast, relies on
indirect or proxy 'items'--efficient, simplistic substitutes from which we
think valid inferences can be made about the student's performance at those
valued challenges.
Do we want to evaluate student problem-posing and problem-solving in
mathematics? experimental research in science? speaking, listening, and
facilitating a discussion? doing document-based historical inquiry?
thoroughly revising a piece of imaginative writing until it "works" for the
reader? Then let our assessment be built out of such exemplary intellectual
challenges.
Further comparisons with traditional standardized tests will help to
clarify what "authenticity" means when considering assessment design and
use:
* Authentic assessments require students to be effective performers
with acquired knowledge. Traditional tests tend to reveal only whether the
student can recognize, recall or "plug in" what was learned out of context.
This may be as problematic as inferring driving or teaching ability from
written tests alone. (Note, therefore, that the debate is not "either-or":
there may well be virtue in an array of local and state assessment
instruments as befits the purpose of the measurement.)
* Authentic assessments present the student with the full array of
tasks that mirror the priorities and challenges found in the best
instructional activities: conducting research; writing, revising and
discussing papers; providing an engaging oral analysis of a recent
political event; collaborating with others on a debate, etc. Conventional
tests are usually limited to paper-and-pencil, one-answer questions.
* Authentic assessments attend to whether the student can craft
polished, thorough and justifiable answers, performances or products.
Conventional tests typically only ask the student to select or write
correct responses--irrespective of reasons. (There is rarely an adequate
opportunity to plan, revise and substantiate responses on typical tests,
even when there are open-ended questions.)
* Authentic assessment achieves validity and reliability by emphasizing
and standardizing the appropriate criteria for scoring such (varied)
products; traditional testing standardizes objective "items" and, hence,
the (one) right answer for each.
* "Test validity" should depend in part upon whether the test simulates
real-world "tests" of ability. Validity on most multiple-choice tests is
determined merely by matching items to the curriculum content (or through
sophisticated correlations with other test results).
* Authentic tasks involve "ill-structured" challenges and roles that
help students rehearse for the complex ambiguities of the "game" of adult
and professional life. Traditional tests are more like drills, assessing
static and too-often arbitrarily discrete or simplistic elements of those
activities.
Beyond these technical considerations, the move to reform assessment is
based upon the premise that assessment should primarily support the needs
of learners. Thus, secretive tests composed of proxy items and scores that
have no obvious meaning or usefulness undermine teachers' ability to
improve instruction and students' ability to improve their performance. We
rehearse for and teach to authentic tests--think of music and military
training--without compromising validity.
The best tests always teach students and teachers alike the kind of
work that most matters; they are enabling and forward-looking, not just
reflective of prior teaching. In many colleges and all professional
settings the essential challenges are known in advance--the upcoming
report, recital, Board presentation, legal case, book to write, etc.
Traditional tests, by requiring complete secrecy for their validity, make
it difficult for teachers and students to rehearse and gain the confidence
that comes from knowing their performance obligations. (A known challenge
also makes it possible to hold all students to higher standards).
WHY DO WE NEED TO INVEST IN THESE LABOR-INTENSIVE FORMS OF ASSESSMENT?
While multiple-choice tests can be valid indicators or predictors of
academic performance, too often our tests mislead students and teachers
about the kinds of work that should be mastered. Norms are not standards;
items are not real problems; right answers are not rationales.
What most defenders of traditional tests fail to see is that it is the
form, not the content of the test that is harmful to learning;
demonstrations of the technical validity of standardized tests should not
be the issue in the assessment reform debate. Students come to believe that
learning is cramming; teachers come to believe that tests are
after-the-fact, imposed nuisances composed of contrived
questions--irrelevant to their intent and success. Both parties are led to
believe that right answers matter more than habits of mind and the
justification of one's approach and results.
A move toward more authentic tasks and outcomes thus improves teaching
and learning: students have greater clarity about their obligations (and
are asked to master more engaging tasks), and teachers can come to believe
that assessment results are both meaningful and useful for improving
instruction.
If our aim is merely to monitor performance, then conventional testing
is probably adequate. If our aim is to improve performance across the board,
then the tests must be composed of exemplary tasks, criteria and standards.
WON'T AUTHENTIC ASSESSMENT BE TOO EXPENSIVE AND TIME-CONSUMING?
The costs are deceptive: while the scoring of judgment-based tasks
seems expensive when compared to multiple-choice tests (about $2 per
student vs. 1 cent) the gains to teacher professional development, local
assessing, and student learning are many. As states like California and New
York have found (with their writing and hands-on science tests) significant
improvements occur locally in the teaching and assessing of writing and
science when teachers become involved and invested in the scoring process.
If costs prove prohibitive, sampling may well be the appropriate
response--the strategy employed in California, Vermont and Connecticut in
their new performance and portfolio assessment projects. Whether through a
sampling of many writing genres, where each student gets one prompt only;
or through sampling a small number of all student papers and school-wide
portfolios; or through assessing only a small sample of students, valuable
information is gained at a minimum cost.
And what have we gained by failing to adequately assess all the
capacities and outcomes we profess to value simply because it is
time-consuming, expensive, or labor-intensive? Most other countries
routinely ask students to respond orally and in writing on their major
tests--the same countries that outperform us on international comparisons.
Money, time and training are routinely set aside to ensure that assessment
is of high quality. They also correctly assume that high standards depend
on the quality of day-to-day local assessment--further offsetting the
apparent high cost of training teachers to score student work in regional
or national assessments.
WILL THE PUBLIC HAVE ANY FAITH IN THE OBJECTIVITY AND RELIABILITY OF
JUDGMENT-BASED SCORES?
We forget that numerous state and national testing programs with a high
degree of credibility and integrity have for many years operated using
judgment-based scoring, among them:
* the New York Regents exams, parts of which have included essay
questions since their inception--and which are scored locally (while
audited by the state);
* the Advanced Placement program which uses open-ended questions and
tasks, including not only essays on most tests but the performance-based
tests in the Art Portfolio and Foreign Language exams;
* state-wide writing assessments in two dozen states, where model
papers, training of readers, "blind" reading of papers, and procedures to
prevent bias and drift ensure adequate reliability;
* the National Assessment of Educational Progress (NAEP), the
Congressionally-mandated assessment, which uses numerous open-ended test
questions and writing prompts (and successfully piloted a hands-on test of
science);
* newly-mandated performance-based and portfolio-based state-wide
testing in Arizona, California, Connecticut, Kentucky, Maryland, and New
York.
Though the scoring of standardized tests is not subject to significant
error, the procedures by which items are chosen and the manner in which
norms or cut-scores are established are often quite subjective--and
typically immune from public scrutiny and oversight.
Genuine accountability does not avoid human judgment. We monitor and
improve judgment through training sessions, model performances used as
exemplars, audit and oversight policies as well as through such basic
procedures as having disinterested judges review student work "blind" to
the name or experience of the student--as occurs routinely throughout the
professional, athletic and artistic worlds in the judging of performance.
Authentic assessment also has the advantage of providing parents and
community members with directly observable products and understandable
evidence concerning their students' performance; the quality of student
work is more discernible to laypersons than when we must rely on
translations of talk about stanines and renorming.
Ultimately, as the researcher Lauren Resnick has put it, "What you
assess is what you get; if you don't test it, you won't get it." To improve
student performance we must recognize that essential intellectual abilities
are falling through the cracks of conventional testing.
Archbald, D. & Newmann, F. (1989) "The Functions of Assessment and the
Nature of Authentic Academic Achievement," in Berlak (ed.) Assessing
Achievement: Toward the development of a New Science of Educational
Testing. Buffalo, NY: SUNY Press.
Frederiksen, J. & Collins, A. (1989) "A Systems Approach to Educational
Testing," Educational Researcher, 18, 9 (December).
National Commission on Testing and Public Policy (1990) From Gatekeeper
to Gateway: Transforming Testing in America. Chestnut Hill, MA: NCTPP.
Wiggins, G. (1989) "A True Test: Toward More Authentic and Equitable
Assessment," Phi Delta Kappan, 70, 9 (May).
Wolf, D. (1989) "Portfolio Assessment: Sampling Student Work,"
Educational Leadership 46, 7, pp. 35-39 (April).
Alternatives to Standardized Educational Assessment.
ED312773 EA021431 ERIC Digest Series Number EA 40.
Bowers, Bruce C.
ERIC Clearinghouse on Educational Management, Eugene, Oreg. 1989
An American educator who was examining the British educational system
once asked a headmaster why so little standardized testing took place in
British schools. "My dear fellow," came the reply, "In Britain we are of
the belief that, when a child is hungry, he should be fed, not weighed."
This anecdote suggests the complementary question: "Why is it that we do so
much standardized testing in the United States?"
WHAT ARE THE MAIN USES OF STANDARDIZED TESTING IN AMERICAN PUBLIC SCHOOLS?
Advocates of standardized testing assert that it simply achieves more
efficiently and fairly many of the purposes for which grading and other
traditional assessment procedures were designed. Even critics of
standardized testing acknowledge that it has filled a vacuum. As Grant
Wiggins (1989a) puts it, "Mass assessment resulted from legitimate concern
about the failure of schools to set clear, justifiable, and consistent
standards to which it would hold its graduates and teachers accountable."
Standardized testing is currently used to fulfill (1) the
administrative function of providing comparative scores for individual
students so that placement decisions can be made; (2) the guidance function
of indicating a student's strengths or weaknesses so that he or she may
make appropriate decisions regarding a future course of study; and, more
recently, (3) the accountability function of using student scores to assess
the effectiveness of teachers, schools, and even entire districts (Robinson
and Craver 1989).
WHAT PROBLEMS HAVE ARISEN AS A RESULT OF WIDESPREAD USE OF STANDARDIZED
TESTING?
The phrase "test-driven curriculum" (Livingston, Castle, and Nations
1989) captures the essence of the major controversy surrounding
standardized testing. When test scores are used on a comparative basis not
only to determine the educational fate of individual students, but also to
assess the relative "quality" of teachers, schools, and school districts,
it is no wonder that "teaching to the test" is becoming a common practice
in our nation's schools. This would not necessarily be a problem if
standardized tests provided a comprehensive, in-depth assessment of the
knowledge and skills that indicate mastery of a given subject matter.
However, the main purpose of standardized testing is to sort large numbers
of students in as efficient a manner as possible. This limited goal, quite
naturally, gives rise to short-answer, multiple-choice questions. When
tests are constructed in this manner, active skills that can and should be
taught in schools, such as writing, speaking, acting, drawing,
constructing, and repairing, are automatically relegated to second-class
status.
WHAT ALTERNATIVES TO STANDARDIZED TESTING HAVE BEEN SUGGESTED?
It is reasonable to assume that the demand for test results that can be
compared across student populations will remain strong. The critical
question is whether such results can be obtained from tests that attempt a
more comprehensive assessment of student abilities than the present
standardized tests are capable of providing. An ancillary, but equally
critical, question is whether such tests are too costly to be widely used.
Suggested alternatives are based on the concept of a
"performance-based" assessment. Depending on the subject matter being
tested, the performance may consist of demonstrating any of the active
skills mentioned above. For example, in the area of writing, drawing, or
any of the "artistic expression" skills, it has been suggested that a
"portfolio assessment," involving the ongoing evaluation of a cumulative
collection of creative works, is the best approach (Wolf 1989). For
subjects that require the organization of facts and theories into an
integrated and persuasive whole (for example, sciences and social
sciences), an assessment modeled after the oral defense required of
doctoral candidates has been suggested (Wiggins 1989a).
A third approach, which might be termed the "problem solving model,"
can be adapted to almost any knowledge-based discipline. It involves the
presentation of a problematic scenario that can be resolved only through
the application of certain major principles (theories, formulae) that are
central to the discipline under examination (Archbald and Newmann 1988).
CAN PERFORMANCE-BASED ASSESSMENTS BE USED TO COMPARE STUDENTS ACROSS
DIFFERENT SETTINGS?
Performance-based assessment is more easily scored using a
criterion-referenced, rather than a norm-referenced, approach. Instead of
placing a student's score along a normal distribution of scores from
students all taking the same test, a criterion-referenced approach focuses
on whether a student's performance meets a criterion level, normally
reflecting mastery of the skills being tested.
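The contrast between the two approaches can be sketched in a few lines of Python. This is a hypothetical illustration only: the scores, the use of z-scores as the norm-referenced scale, and the mastery cutoff of 80 are all invented for the example, not drawn from the digest.

```python
# Hypothetical illustration: the same raw scores interpreted two ways.
from statistics import mean, stdev

scores = [62, 70, 75, 75, 80, 84, 88, 91, 95]
cutoff = 80  # invented criterion level standing in for "mastery"

# Norm-referenced: each score is placed relative to the distribution of
# all test-takers (here expressed as a z-score).
mu, sigma = mean(scores), stdev(scores)
z_scores = [round((s - mu) / sigma, 2) for s in scores]

# Criterion-referenced: each score is judged only against the fixed
# criterion, regardless of how the rest of the group performed.
mastery = [s >= cutoff for s in scores]

print(z_scores)  # relative standing within the group
print(mastery)   # [False, False, False, False, True, True, True, True, True]
```

Note the difference in interpretation: under the norm-referenced reading, the student scoring 80 is merely average for the group, while under the criterion-referenced reading that same student has met the mastery criterion.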
How can such an assessment be reliably compared to similar assessments
made by other teachers in other settings? It has been suggested that
American educators adopt the "exemplary system" being called for in Great
Britain. In this system, teachers involved in scoring meet regularly "to
compare and balance results on their own and national tests" (Wiggins
1989b), thus increasing reliability across settings. Clearly, however, such
an approach (similar to the approach currently in use for the scoring of
Advanced Placement essay exams) could be prohibitively expensive if carried
out on a large scale. A key question is whether the costs associated with
this labor-intensive scoring system would be offset by the presumed
instructional gains obtained from an assessment model that rewarded a more
thorough and holistic approach to instruction.
HAVE THERE BEEN ANY STATEWIDE EFFORTS TO PROVIDE ALTERNATIVES TO
STANDARDIZED TESTING?
California has probably made the greatest effort in this direction,
beginning in 1987 with its statewide writing test and continuing with its
current development of performance-based assessment in science and history
(Massey 1989). The Connecticut Assessment of Educational Progress Program
uses a variety of performance tasks in its assessment of science, foreign
languages, and business education (Baron 1989). (However, this assessment
includes only a sample of students at any given grade level, and the
subjects for which performance tasks are required change from year to
year.) Vermont education officials are currently seeking
legislative approval for funds to pursue a portfolio assessment approach in
addition to the current standardized testing (Massey 1989).
WHAT IS THE PROGNOSIS FOR A GENERAL SHIFT AWAY FROM STANDARDIZED TESTING
AND TOWARD PERFORMANCE-BASED TESTING?
In psychometric terms, the tradeoff in such a shift is to sacrifice
reliability for validity. That is, performance-based tests do not lend
themselves to a cost- and time-efficient method of scoring that, in
addition, provides reliable results. On the other hand, they actually test
what the educational system is presumably responsible for teaching, namely,
the skills prerequisite for performing in the real world. The additional
costs involved in producing reliable results across different settings for
performance-based tests are unknown.
The question is whether a majority of educators will echo the
sentiments of George Madaus, director of the Center for the Study of
Testing, Evaluation, and Educational Policy, who believes that
performance-based testing "is not efficient; it's expensive; it doesn't
lend itself to mass testing with quick turnaround time--but it's the way to
go" (Brandt 1989).
Archbald, Doug A., and Fred M. Newmann. "Beyond Standardized Testing:
Assessing Authentic Academic Achievement in the Secondary School." Reston,
VA: National Association of Secondary School Principals, 1988. 65 pages. ED
Baron, Joan B. "Performance Testing in Connecticut." EDUCATIONAL
LEADERSHIP 46, 7 (April 1989): 8. EJ 387 136.
Brandt, Ron. "On Misuse of Testing: A Conversation with George Madaus."
EDUCATIONAL LEADERSHIP 46, 7 (April 1989): 26-29. EJ 387 140.
Livingston, Carol, Sharon Castle, and Jimmy Nations. "Testing and
Curriculum Reform: One School's Experience." EDUCATIONAL LEADERSHIP 46, 7
(April 1989): 23-25. EJ 387 139.
Massey, Mary. "States Move to Improve Assessment Picture." ASCD UPDATE
31, 2 (March 1989): 7.
Ralph, John, and M. Christine Dwyer. "Making the Case: Evidence of
Program Effectiveness in Schools and Classrooms." Washington, D.C.: U.S.
Department of Education, Office of Educational Research and Improvement,
November 1988. 54 pages.
Robinson, Glen E., and James M. Craver. "Assessing and Grading Student
Achievement." Arlington, VA: Educational Research Service, 1989. 198 pages.
Wiggins, Grant. "A True Test: Toward More Authentic and Equitable
Assessment." PHI DELTA KAPPAN 70, 9 (May 1989a): 703-13. EJ 388 723.
Wiggins, Grant. "Teaching to the (Authentic) Test." EDUCATIONAL
LEADERSHIP 46, 7 (April 1989b): 41-47.
Wolf, Dennie P. "Portfolio Assessment: Sampling Student Work."
EDUCATIONAL LEADERSHIP 46, 7 (April 1989): 35-39. EJ 387 143.