Mining data to nab terrorists: fair?
Digital minutiae could be used to track terror networks, but mining them could also produce false positives.
What can the United States government really glean from the phone-call histories - records of who called whom, when, and for how long - of millions of Americans?
After all, it's the same information that has long been available to authorities armed with a subpoena, though not sought en masse until after the 9/11 terror attacks. Its value, say computer experts and others, is that it can be used to identify a "social network" of interconnected people - including, perhaps, would-be terrorists.
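To make the "social network" idea concrete, here is a minimal sketch of how who-called-whom records could be turned into a map of contacts and contacts-of-contacts. It is an illustration only, not a description of any government system: the record layout, the names, and the helper functions `build_call_graph` and `friends_of_friends` are all assumptions invented for this example.

```python
from collections import defaultdict

# Hypothetical call records: (caller, callee, duration_in_seconds).
# Every name and value here is invented for illustration.
CALLS = [
    ("alice", "bob", 120),
    ("bob", "carol", 45),
    ("carol", "dave", 300),
    ("alice", "erin", 60),
]

def build_call_graph(calls):
    """Map each person to the set of people they called or were called by."""
    graph = defaultdict(set)
    for caller, callee, _duration in calls:
        graph[caller].add(callee)
        graph[callee].add(caller)
    return graph

def friends_of_friends(graph, person):
    """People two calls away from `person`, excluding direct contacts."""
    direct = graph.get(person, set())
    second = set()
    for contact in direct:
        second |= graph.get(contact, set())
    return second - direct - {person}

graph = build_call_graph(CALLS)
print(sorted(graph["alice"]))                      # direct contacts
print(sorted(friends_of_friends(graph, "alice")))  # contacts of contacts
```

Even in this toy data, "alice" is tied to "carol" without the two ever having spoken, which is the kind of indirect link the analysts describe.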
"From phone records you can learn who are my friends - and who their friends are - what services I use, where I shop," says Johannes Gehrke, a computer scientist at Cornell University who has written search algorithms for government analysis programs. "Our social interactions leave a digital trail. [Phone-record analysis] is government learning about human behavior from analyzing that trail."
Moreover, they assert, phone records are just one part of a much larger government effort to analyze the digital minutiae of American life in the hope of uncovering terrorist networks buried within it. Potentially invasive, such counterterror activity aims to build databases that can be cross-referenced in the hope of matching patterns, relationships, and activities that bear investigating, experts say.
"You should presume that phone numbers are being collated with Internet records, credit-card records, everything," says Bruce Schneier, a security technologist with Counterpane Internet Security in Mountain View, Calif.
Cross-indexing phone records can reveal a social profile of friends and acquaintances and a geographic profile. Each individual in that chain might then be cross-indexed against his or her retail purchases, credit history, e-mail, medical records, airline reservations, Social Security number, fingerprints - anything that can be digitized and stored in databases, assuming that the government has access to them. Such activity is potentially invasive, many experts acknowledge, but will it work?
"In the commercial and consumer world, data mining has seemed quite successful," says George Cybenko, an engineering professor at Dartmouth College. "We don't have that many new technologies in our repertoire to address this new terrorist threat. So we have to explore them."
Others argue that this "social network analysis" - the computer data-mining technique rapidly gaining ground in intelligence and law-enforcement circles - carries legal and practical risks. As the effort gains momentum, it may collide with law, invade Americans' privacy, undermine civil liberties, or grow to such an enormous size that it may actually make finding terrorists harder by producing too many "false positives," some say.
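The false-positive concern comes down to base-rate arithmetic, which a short sketch can illustrate. Every figure below (population size, number of real targets, detection rate, false-alarm rate) is an assumption chosen only to show the shape of the problem; none comes from the reporting, and `flagged_counts` is a hypothetical helper.

```python
def flagged_counts(population, true_targets, hit_rate, false_alarm_rate):
    """Return (true hits, false alarms, share of flags that are real)."""
    innocents = population - true_targets
    true_hits = true_targets * hit_rate
    false_alarms = innocents * false_alarm_rate
    precision = true_hits / (true_hits + false_alarms)
    return true_hits, false_alarms, precision

# Assumed, purely illustrative inputs:
true_hits, false_alarms, precision = flagged_counts(
    population=200_000_000,   # hypothetical number of people in the database
    true_targets=1_000,       # hypothetical number of real suspects
    hit_rate=0.99,            # assumed chance a real suspect is flagged
    false_alarm_rate=0.001,   # assumed chance an innocent person is flagged
)
print(f"real suspects flagged:   {true_hits:,.0f}")
print(f"innocent people flagged: {false_alarms:,.0f}")
print(f"share of flags that are real: {precision:.2%}")
```

Under these assumed numbers, a screen that wrongly flags only one innocent person in a thousand would still flag roughly 200,000 innocent people alongside about 990 real suspects, so fewer than 1 in 100 flags would point to an actual target.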
A USA Today report last week indicated that the National Security Agency is well along that technological path. Using data from phone company computers, the NSA has been building a gargantuan database containing the detailed calling histories of most Americans, the report said.
The idea is for computer algorithms to identify digital calling patterns that authorities expect a terrorist might follow.
While congressmen of both parties fumed last week, President Bush defended government surveillance activity in general, saying it "strictly targets Al Qaeda and its known affiliates." Americans' privacy is being "fiercely protected," he said.
Contacted by the Monitor, the NSA did not offer a response to specific questions about its role as described by the USA Today report.
"Given the nature of the work we do, it would be irresponsible to comment on actual or alleged operational issues; therefore, we have no information to provide," NSA spokesman Don Weber wrote in an e-mail. "However, it is important to note that NSA takes its legal responsibilities seriously and operates within the law."
Data mining as a counterterrorism tool has grown enormously in recent years, according to the Government Accountability Office.
"Don't think of all these records as separate things," says Mr. Schneier of Counterpane Internet Security. "If analysis tools are being used across this collection, it's really just one big sea of data being analyzed and crosschecked."
In the weeks after 9/11, Valdis Krebs scoured news reports and the Internet for tidbits about the hijackers, plugging their meetings, travels, and relationships into a computer program he had created. At last, it spit out a chart detailing the terrorists' links to one another.
It revealed, for instance, that two of the 9/11 hijackers - Nawaf Alhazmi and Khalid Almihdhar - were at the center of a spider web linking those accused of bombing the USS Cole in 2000.
What Mr. Krebs, a Cleveland-based expert in social network analysis, did in a small way was what he and others say US agencies are trying to do on a much larger scale - by piling up mountains of data to sift for unseen patterns that reveal hidden terrorist networks.
"This can be a very powerful tool," he says, if the technique is used to "drill down" into known terrorists for their associates and those they call and connect with.
But Krebs and other experts are doubtful about the opposite method, a "top down" approach in which the bulk of data on innocent Americans is sifted for terrorists.
It's a massive job, and accuracy is made more difficult by the size of the database, he says. Just one AT&T database - dubbed "Hawkeye," some of whose call records on millions of customers are alleged to have been handed over to the NSA - contained more than 300 terabytes of information, according to an Electronic Frontier Foundation (EFF) lawsuit filed in January. That's about 15 times the information in the Library of Congress.
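The "drill down" approach can be sketched as a bounded search outward from a known starting point. The example below is a simplified assumption about how such a search might work, not Krebs's program or any agency's method; the contact graph, the names, and the `drill_down` function are invented for illustration.

```python
from collections import deque

# Hypothetical contact graph; every name is invented for illustration.
CONTACTS = {
    "suspect": {"a", "b"},
    "a": {"suspect", "c"},
    "b": {"suspect"},
    "c": {"a", "d"},
    "d": {"c"},
    "x": {"y"},
    "y": {"x"},
}

def drill_down(contacts, seeds, max_hops):
    """Breadth-first search outward from known seeds, stopping at max_hops.

    Returns a dict mapping each reached person to their hop distance.
    """
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        person = queue.popleft()
        if distance[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in contacts.get(person, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[person] + 1
                queue.append(neighbor)
    return distance

reached = drill_down(CONTACTS, ["suspect"], max_hops=2)
print(sorted(reached.items()))
print(len(reached), "of", len(CONTACTS), "people examined")
```

Starting from one known name and stopping after two hops, the search touches only the people connected to that name and never looks at "x" or "y" at all. A "top down" sweep, by contrast, would begin with every record in the database.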
"By looking at all this data, they're making the problem far more difficult for themselves," says Krebs. "Instead of looking for a likely haystack and the needles that might be there, the NSA is vastly expanding the haystack by requesting everybody's phone records."
Krebs worries about false positives. He offers the example of people living in the US who haven't called each other in years, who all of a sudden start calling and e-mailing and planning a trip to Washington, D.C. Maybe they're even calling overseas. "It would be easy for a computer to flag this social network as a terror cell, when it's really a group of Vietnam vets planning a reunion at the Vietnam Memorial."
"There's a million probabilities where it could generate a false positive," Krebs says. "How deep are you going to dig on each one of these to find out they're only vets? By then, you've dug deep into the lives of innocent folks you shouldn't have wasted the time on."
Besides the issue of whether such data collection is useful, legal experts wonder whether the rapidly expanding uses of computer power are tearing a hole in the legal fabric of the nation.
Orin Kerr, a law professor at George Washington University, wrote in an online legal analysis that the newly reported NSA program "touches on at least five laws" - the Fourth Amendment, the Foreign Intelligence Surveillance Act, the Pen Register statute, the Stored Communications Act, and the Communications Act of 1934.
"This is nothing but an illegal fishing expedition into the private records of every American," says Kevin Bankston, EFF staff attorney. "No order could be valid that asked for every single phone record in America."
But Charles Fried, a Harvard law professor, dismisses what he sees as a lot of huffing and puffing by civil libertarians amid precious little legal protection for such phone records.
"Nobody has made the case so far that this violates any law," he says. "What violations of civil liberties are involved? This is just an exercise in mindless labeling, trotting out the word and applying it to something you don't like."
The government is defending its ability to collect such data. On Friday, the Justice Department sought dismissal of the EFF lawsuit, announcing it would "assert the military and state secrets privilege," according to a "statement of interest" it filed in federal court in San Francisco.
Traditionally, phone companies have closely guarded phone records, which are protected by the laws cited by Professor Kerr. But the USA Today report named AT&T, Verizon, and BellSouth as all having voluntarily handed over years of call-history data to the NSA. Qwest, worried about legal fallout, reportedly declined to share its data.
Contacted by the Monitor about the USA Today report, AT&T, Verizon, and BellSouth issued statements. BellSouth said it "does not provide any confidential customer information to the NSA or any governmental agency without proper legal authority." Verizon said it "acts in full compliance with the law and we are committed to safeguarding our customers' privacy."
Responding to Monitor questions about the sharing of at least parts of its Hawkeye database with the NSA, AT&T noted its "long history of vigorously protecting customer privacy. Our customers expect, deserve, and receive nothing less than our fullest commitment to their privacy."
AT&T also noted that "we also have an obligation to assist law enforcement and other government agencies responsible for protecting the public welfare, whether it be an individual or the security interests of the entire nation."
Many Americans see giving up some of their civil liberties or privacy as necessary to aid the war on terror. Opinion polls show Americans are split over an NSA program the president has acknowledged authorizing - one that permits the agency to eavesdrop, without getting warrants, on communications from abroad.
In a poll released Saturday, Newsweek found that 53 percent of Americans say the NSA's surveillance program "goes too far in invading people's privacy." Forty-one percent, the poll showed, see it as a vital tool for combatting terrorism.
Last week's revelation "makes me feel terrible, like my privacy is being invaded," says Brandi Dawson, a receptionist from Somerville, Mass. "The fact they have access to all these records, even in the fight on terror, that's going too far."
But some say giving up calling records - and some privacy - may be sad but worth it, even if computers misidentify them and they end up being investigated by the government.
"It doesn't really bother me because I have nothing to hide," says Dale Wyman, a computer network engineer eating lunch in the mall at the foot of the Prudential Tower in Boston's Back Bay.
"I personally would rather have a false-positive come at me than be sitting here and having a building come down on me because of a terrorist."