'Newsblaster' software scans, summarizes news
At Columbia University, researchers are working on software that can do what members of the Fourth Estate often struggle with: Write a lead paragraph. Newsblaster's version isn't as jazzy as the typical wordsmith's, but its ability to synthesize a dozen articles and summarize them is enough to give a journalist pause - and to offer readers a new way to follow the news.
Here's how it works: The program reads through a number of online news sites (including CNN, Reuters, ABC News, Fox News, The Los Angeles Times, and The Washington Post), separates the news into categories and events, and then looks for themes. Eventually it spots repeated phrases and picks out sentences to paste together into a summary.
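For readers curious what "spotting repeated phrases and picking out sentences" might look like in practice, here is a minimal sketch in Python. It is not Newsblaster's actual code, which the article does not describe in detail. It assumes a deliberately naive approach: treat a "phrase" as a pair of adjacent non-trivial words, keep the phrases that show up in at least two articles, and greedily choose the sentences that cover the most of them. The function names, the stopword list, and the sample articles are all invented for illustration, and the step of sorting news into categories and events is left out.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = frozenset(
    {"a", "an", "and", "for", "in", "is", "of", "on", "the", "to", "was", "were", "will"}
)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def phrases(sentence):
    """Return the set of adjacent content-word pairs in a sentence."""
    words = [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]
    return set(zip(words, words[1:]))


def summarize(articles, max_sentences=2):
    """Pick sentences that cover the phrases repeated across articles."""
    # Count how many different articles each phrase appears in.
    article_count = Counter()
    for article in articles:
        seen = set()
        for sentence in split_sentences(article):
            seen |= phrases(sentence)
        article_count.update(seen)
    repeated = {p for p, n in article_count.items() if n >= 2}

    # Pair every sentence with the repeated phrases it contains.
    candidates = [
        (sentence, phrases(sentence) & repeated)
        for article in articles
        for sentence in split_sentences(article)
    ]

    # Greedily take the sentence that adds the most not-yet-covered phrases.
    chosen, covered = [], set()
    for _ in range(max_sentences):
        best = max(candidates, key=lambda c: len(c[1] - covered), default=None)
        if best is None or not (best[1] - covered):
            break
        chosen.append(best[0])
        covered |= best[1]
    return " ".join(chosen)


if __name__ == "__main__":
    # Made-up sample articles, loosely modeled on the example in this story.
    sample = [
        "Congress filed a lawsuit against the White House on Friday. "
        "The weather in Washington was cold.",
        "The lawsuit against the White House was unprecedented. "
        "Investigators want energy task force records.",
        "The White House will fight to protect energy task force records. "
        "Markets were flat.",
    ]
    print(summarize(sample))
```

Run on the three made-up articles, the sketch keeps the sentences about the lawsuit and the task force records and drops the ones about the weather and the markets, because only the former share phrases with other articles. A real system would need far sturdier sentence splitting (the abbreviation "Corp." alone would trip this one) and a smarter way to order the chosen sentences.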
Here's part of Newsblaster's take on the General Accounting Office lawsuit against Vice President Cheney, for example:
"The investigative arm of Congress filed an unprecedented lawsuit against the White House on Friday, demanding to learn the role that energy companies including Enron Corp. played in developing the Bush administration's energy policy. In its coming court fight with congressional investigators, the White House will argue that releasing records from its energy task force meetings would undercut the Constitution's separation of powers, weakening the presidency and possibly national security."
Researchers say Newsblaster shouldn't make journalists nervous. "What it does is take what journalists have already written and provides a way to easily browse," explains Kathleen McKeown, the computer science professor overseeing the project. "We could not write the stories ourselves."
She and her team are still working out how to identify situations where two sources contradict each other and how to track news across days to document what is truly "new." They decided to start running Newsblaster (www.cs.columbia.edu/nlp/newsblaster) last Sept. 16: They felt their system was ready, and they wanted to archive the events of Sept. 11.
The project "can be useful to people in the government," says Professor McKeown. "It could also be useful if you're in a company and tracking a particular event at a particular time." And, of course, it could potentially be of service to the average reader - or grade-schooler. "Right now," she says, "my 11-year-old daughter uses it when she has to do her homework on current events."