Essay questions (page 2 of 2)

Writing for a computerized audience is, to some critics' thinking, an absurd waste of time that can only warp the educational process. (ETS is also considering making E-rater available to score practice essays for students preparing to take the GMAT.) "Even before this, the pressure was there to teach to the test," says Baron. If students know their essays are being graded by a machine that can parse semantics and syntax, they "will learn to write for the formula."

Critics have long lodged similar complaints against all standardized testing, arguing that the tests measure students' ability to take tests, not their ability to learn and produce ideas. Some also maintain that the tests often include subtle racial, class or gender biases -- benefiting students who are white, middle-class and male.

But could we use technology to eliminate bias from the grading process, and to promote fairness and consistency? "In essence [the technology] is doing what a person is trained to do when they're doing holistic grading," says Darrell Laham, chief scientist for Knowledge Analysis Technologies, which developed the Intelligent Essay Assessor. "You see samples of what an excellent essay is supposed to look like, or a medium essay, or a very bad essay. With a person, their criteria may shift a little bit." The software, on the other hand, is 100 percent consistent: "You give it the same set of parameters, and it will always give the same results."

But Monty Neill, executive director of FairTest -- an advocacy group that fights for fairness in standardized testing -- says the software's lack of bias doesn't mean electronic grading will be free of prejudice: It all depends on how the software is programmed. "If you're looking for things that are not really relevant but are associated with a particular demographic group, then certainly that would reinforce a bias," he says.

A question assuming knowledge of stock dividends, for example, could penalize test-takers whose families have never owned securities. But Neill agrees that a computerized grading system, properly programmed, could eliminate other forms of bias. "You might have someone who identifies black writing as automatically bad, whereas a machine might not," he says.

Laham says the best way to escape grading bias is to choose the model essays with care: "The underlying comparison set of essays should represent the population that the grades are meant to represent." To be fair to the test-takers, he says, his Intelligent Essay Assessor is designed to know its limits and not give a student a poor mark when the software can't "read" an essay, for stylistic or other reasons. "What the technology will do when it sees an essay that is completely unlike what it has seen before is to flag it and tell a teacher to look at it ... It won't be able to grade it, but it will know it can't grade it."
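The "know its limits" behavior Laham describes can be illustrated with a small sketch. This is not the Intelligent Essay Assessor's actual method, which the article does not detail; it swaps in plain word-overlap cosine similarity, and the function names, the comparison set and the 0.2 cutoff are all hypothetical. The idea is only that an essay too unlike every scored sample gets handed to a teacher rather than given a grade.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercased bag-of-words counts for one essay."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def grade_or_flag(essay, scored_samples, min_similarity=0.2):
    """Return ("graded", score) or ("flagged", None).

    scored_samples: non-empty list of (text, score) pairs; a hypothetical
        stand-in for the comparison set of model essays.
    min_similarity: hypothetical cutoff; below it the essay is flagged
        for a teacher instead of being scored.
    """
    essay_vec = word_counts(essay)
    # Find the most similar model essay and borrow its score.
    best_sim, best_score = max(
        (cosine(essay_vec, word_counts(text)), score)
        for text, score in scored_samples
    )
    if best_sim < min_similarity:
        return ("flagged", None)
    return ("graded", best_score)


# Made-up comparison set, for illustration only.
samples = [
    ("Dividends are paid to shareholders from company profits", 5),
    ("Stocks go up", 2),
]
on_topic = grade_or_flag("Company profits are paid to shareholders as dividends", samples)
unfamiliar = grade_or_flag("xylophone zebra quartz", samples)
```

In this toy version the first essay shares most of its words with a model essay and takes that essay's score, while the second shares none and is flagged. A real system would need a far richer notion of similarity than shared words.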

E-rater examines 50 linguistic features, including transitional phrases, vocabulary and the ratio of complement clauses to the total number of sentences. "For each essay, about eight to 12 of the features turn out to be particularly predictive and explain why an essay should get a certain score," says Jill Burstein, a developmental scientist who invented the E-rater prototype and led the ETS development team.

E-rater is surprisingly consistent with human graders. The E-rater scores agree with scores given by a human grader about 90 percent of the time -- or as often as a second human reader would, according to ETS statistics. And when a second human grader does score a disputed essay, he or she agrees with E-rater about 97 percent of the time. In other words, the electronic graders seem to do the job about as well as their human counterparts.
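To make the agreement figures concrete, here is a minimal sketch of how such a rate could be computed. The article does not say how ETS defines "agree," so the `tolerance` parameter (exact match by default) is an assumption, and the scores below are invented for illustration rather than taken from ETS data.

```python
def agreement_rate(scores_a, scores_b, tolerance=0):
    """Fraction of essays on which two graders' scores agree.

    tolerance: largest score difference still counted as agreement.
        This is an assumption; the article does not define "agree".
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both graders must score the same essays")
    if not scores_a:
        raise ValueError("need at least one essay")
    agree = sum(1 for a, b in zip(scores_a, scores_b) if abs(a - b) <= tolerance)
    return agree / len(scores_a)


# Invented scores for ten essays; only the last one is disputed.
human_scores = [4, 5, 3, 6, 2, 4, 5, 3, 4, 5]
machine_scores = [4, 5, 3, 6, 2, 4, 5, 3, 4, 2]
rate = agreement_rate(human_scores, machine_scores)
```

Here the two graders match on nine of ten essays, so `rate` is 0.9. The same function applied to two human graders' scores would give the baseline against which ETS compares E-rater.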

Computerized grading could cut student fees by $5 to $10 per test, according to ETS; readers who score the GMATs currently earn $23.75 per hour. And at Knowledge Analysis Technologies, Laham argues that essay-grading software can improve education by helping to eliminate multiple-choice testing. His company's Web site says: "Students need many more opportunities to put their knowledge into words and find out how well they've done and how to do better"; and Laham asserts that student writing, even when written for a computerized reader, demonstrates "a much deeper level of learning" than multiple-choice exams do.

But he is conscious of his product's limitations. "When you start getting into the creativity types of things, that's not really our focus," says Laham. "This technology is not appropriate for looking at term papers where every student is writing on a unique topic. We see it as a way to provide students with the opportunity to write and revise their writing and to get immediate feedback that they simply can't have right now. A person can't always look at what a student produces."

University of Illinois professor Baron still criticizes the system, however, saying he received surprisingly good grades when he submitted completely off-topic essays to an online demonstration of the Intelligent Essay Assessor. "If you don't care about what might be in the text that doesn't match your template, then I suppose you can go ahead and use it," he says. "But it seems to me that it's also an insult to the writer. You're asking these test-takers to write connected prose, but you're having it graded by an entity that has no sense of what's good about connected prose and how to evaluate it." (Laham defends the product, saying that the version of IEA currently online does not yet have the system's full battery of validity checks.)

Meanwhile, won't students rebel against computerized readers?

Test-takers haven't been troubled by the electronic grading of GMATs, says McHale. "We were expecting more negative reaction, but we've had minimal complaints, and just a single response of 'I don't want a computer grading my essay,' which someone wrote in one of their essays." Part of the reason for the subdued response may be that a person still reads each submission -- a procedure that McHale expects to continue. "For the large-scale, high-stakes kind of testing that we do, I don't see a human reader being taken out of the loop," he says. "The small discrepancies that we do see could be very creative responses that we really do want to allow in the testing."

So far, there's no plan to employ E-rater as a judge of literary merit or creative writing, but ETS is researching the possibility of computerized grading for the Test of English as a Foreign Language and the Graduate Record Examinations. The GMAT was the first to employ the software because the test had already phased out handwritten essays in favor of keyboarded essays.

While it's unlikely that computerized grading will ever replace the careful eye of a teacher, technology proponents like Laham say it can be a great addition to the current academic system. "The reality is that teachers can't read enough to provide the student with enough feedback," says Laham.

So instead of comparing the software to a human reader -- where it can't help but appear a poor substitute -- Laham argues critics should view electronic grading as a great benefit to students who want to write more than their teachers can read. Dismissing the technology's detractors, Laham says, "There aren't as many of the critics as there are teachers who want this system."
salon.com | May 25, 1999

 

- - - - - - - - - - - -

About the writer
Christopher Ott is a writer in Madison, Wis.


Copyright © 2000 Salon.com All rights reserved.