Mining data to nab terrorists: fair?

Digital minutiae could be used to track terror networks, but mining them could also produce false positives.

"Given the nature of the work we do, it would be irresponsible to comment on actual or alleged operational issues; therefore, we have no information to provide," NSA spokesman Don Weber wrote in an e-mail. "However, it is important to note that NSA takes its legal responsibilities seriously and operates within the law."

Data mining as a counterterrorism tool has grown enormously in recent years, according to the Government Accountability Office.

"Don't think of all these records as separate things," says Mr. Schneier of Counterpane Internet Security. "If analysis tools are being used across this collection, it's really just one big sea of data being analyzed and crosschecked."

In the weeks after 9/11, Valdis Krebs scoured news reports and the Internet for tidbits about the hijackers, plugging their meetings, travels, and relationships into a computer program he had created. At last, it spit out a chart detailing the terrorists' links to one another.

It revealed, for instance, that two of the 9/11 hijackers - Nawaf Alhazmi and Khalid Almihdhar - were at the center of a spider web linking those accused of bombing the USS Cole in 2000.

What Mr. Krebs, a Cleveland-based expert in social network analysis, did in a small way was what he and others say US agencies are trying to do on a much larger scale - by piling up mountains of data to sift for unseen patterns that reveal hidden terrorist networks.

"This can be a very powerful tool," he says, if the technique is used to "drill down" into known terrorists for their associates and those they call and connect with.
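
To make the "drill down" idea concrete, here is a minimal sketch in Python. It is not Krebs's program, whose workings the article does not describe; it only illustrates the general technique of starting from known individuals and following contact links outward for a limited number of hops. The function name `drill_down`, the `max_hops` parameter, and the contact records are all hypothetical.

```python
from collections import deque


def drill_down(contacts, seeds, max_hops):
    """Breadth-first search outward from known individuals.

    contacts: dict mapping each person to the set of people they contact.
    seeds:    the known individuals to start from.
    max_hops: how many links away from a seed the search may go.
    Returns a dict of {person: number of hops from the nearest seed}.
    """
    hops = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue  # do not expand past the hop limit
        for other in contacts.get(person, ()):
            if other not in hops:
                hops[other] = hops[person] + 1
                queue.append(other)
    return hops


# Hypothetical contact records (placeholders, not real data).
records = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")]

# Build an undirected contact graph from the records.
contacts = {}
for a, b in records:
    contacts.setdefault(a, set()).add(b)
    contacts.setdefault(b, set()).add(a)

# Start from one known individual and look two links out.
reachable = drill_down(contacts, seeds=["A"], max_hops=2)
```

In this toy graph the search reaches only B and C from A. D sits three links away and is left out, and E and F, who have no tie to A at all, are never examined. That is the contrast with the "top down" approach, which would scan every record regardless of any link to a known suspect.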

But Krebs and other experts are doubtful about the opposite method, a "top down" approach in which the bulk of data on innocent Americans is sifted for terrorists.

It's a massive job, and accuracy is made more difficult by the size of the database, he says. Just one AT&T database - dubbed "Hawkeye," some of whose call records on millions of customers are alleged to have been handed over to the NSA - contained more than 300 terabytes of information, according to an Electronic Frontier Foundation (EFF) lawsuit filed in January. That's about 15 times the information in the Library of Congress.

"By looking at all this data, they're making the problem far more difficult for themselves," says Krebs. "Instead of looking for a likely haystack and the needles that might be there, the NSA is vastly expanding the haystack by requesting everybody's phone records."

Krebs worries about false positives. He offers the example of people living in the US who haven't called each other in years and who suddenly start calling, e-mailing, and planning a trip to Washington, D.C. Maybe they're even calling overseas. "It would be easy for a computer to flag this social network as a terror cell, when it's really a group of Vietnam vets planning a reunion at the Vietnam Memorial."

"There's a million probabilities where it could generate a false positive," Krebs says. "How deep are you going to dig on each one of these to find out they're only vets? By then, you've dug deep into the lives of innocent folks you shouldn't have wasted the time on."
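
The arithmetic behind the false-positive worry can be sketched in a few lines of Python. Every number below is invented for illustration and does not come from the article or from any agency: a population of 200 million phone users, 1,000 actual targets among them, and a screening tool that catches 99 percent of real targets while wrongly flagging 0.1 percent of everyone else. The function `flagged_breakdown` is likewise hypothetical.

```python
def flagged_breakdown(population, true_targets, hit_rate, false_alarm_rate):
    """Split the people a screening tool flags into real targets and false alarms.

    hit_rate:         fraction of real targets the tool flags.
    false_alarm_rate: fraction of innocent people the tool flags anyway.
    Returns (true_hits, false_alarms, share of flagged people who are real targets).
    """
    true_hits = true_targets * hit_rate
    false_alarms = (population - true_targets) * false_alarm_rate
    precision = true_hits / (true_hits + false_alarms)
    return true_hits, false_alarms, precision


# All figures are assumptions chosen for illustration only.
true_hits, false_alarms, precision = flagged_breakdown(
    population=200_000_000,
    true_targets=1_000,
    hit_rate=0.99,
    false_alarm_rate=0.001,
)
```

Under these made-up numbers, the tool flags roughly 200,000 innocent people alongside fewer than 1,000 real targets, so fewer than 1 in 200 of those flagged is an actual target. Even a tool that is rarely wrong about any one person produces mostly false alarms when the haystack is this large.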

Besides the issue of whether such data collection is useful, legal experts wonder whether the rapidly expanding uses of computer power are tearing a hole in the legal fabric of the nation.

Orin Kerr, a law professor at George Washington University, wrote in an online legal analysis that the newly reported NSA program "touches on at least five laws" - the Fourth Amendment, the Foreign Intelligence Surveillance Act, the Pen Register statute, the Stored Communications Act, and the Communications Act of 1934.