Privacy online: OkCupid study raises new questions about 'public' data
Information on OkCupid can be accessed by any registered user on the site. But does that mean the information is "public"? And if so, is it ok for researchers to distribute it?
When you sign up for a dating website, you are making your information available for other users to see. But does that mean your information is “public”?
Experts are now mulling this question after researchers released a data set of nearly 70,000 users from the online dating site OkCupid.
The researchers, Emil Kirkegaard and Julius Daugbjerg Bjerrekær – at Aarhus University in Denmark – used a "scraper," or a browser extension designed to collect data from web pages, to collect the data. In other words, they collected the data without OkCupid’s permission, breaking the site’s terms of usage and the Computer Fraud and Abuse Act, as an OkCupid spokesman told Vox.
[The university says Kirkegaard, a student, was not working on behalf of the university, and that "his actions are entirely his own responsibility."]
The data was uploaded to Open Science Framework, an online forum that encourages researchers to share data for easy collaboration, but it has since been removed. The scraped data revealed many user details including user name, age, gender, religion, and detailed information about users' habits and preferences.
When asked whether the researchers took measures to anonymize the data, Mr. Kirkegaard, the lead researcher, responded, “No. Data is already public.”
“Some may object to the ethics of gathering and releasing this data,” the researchers wrote in their paper, Vox reported. “However, all the data found in the dataset are or were already publicly available, so releasing this dataset merely presents it in a more useful form.”
But even if the data is available to other users, should it be shared publicly? Some experts don’t think so. While OkCupid lets registered users view profiles of other users on the site, that doesn't justify anyone releasing this information to the public, they say.
In this case, the researchers breached the ethics of social science research, which require researchers to obtain consent from subjects and to ensure confidentiality before they publicly share personal information.
“It is our responsibility as scholars to ensure our research methods and processes remain rooted in long-standing ethical practices,” wrote Michael Zimmer, a privacy and social media scholar, in 2010. “Concerns over consent, privacy and anonymity do not disappear simply because subjects participate in online social networks; rather, they become even more important.”
Mr. Zimmer, an associate professor in the School of Information Studies at the University of Wisconsin, was writing in response to the announcement from Pete Warden, a former Apple engineer, that he had collected information from 210 million Facebook profiles and planned to share it all publicly. Mr. Warden later deleted the information, after Facebook threatened legal action.
The OkCupid profiles include very personal information on everything from political views to sexual habits. OkCupid asks its users hundreds of questions to help its algorithm generate better matches. In other words, it tells users that the more questions they answer, the more likely they are to get a perfect match.
Though the researchers didn’t release real names with the data, just profile user names, that is not considered maintaining confidentiality, say experts. One Twitter user claimed that he could link some bits of data to actual names of more than 10,000 users on OkCupid.
The release has reignited a debate about online privacy that tech companies have repeatedly grappled with. In the age of WikiLeaks and Edward Snowden’s revelations, some experts contend that it’s becoming increasingly hard to keep up with privacy rights.
Some 55 percent of experts polled by the Pew Research Internet Project in 2014 predicted that policymakers and technologists “would not be able to create a basic, unified privacy-rights infrastructure by 2025,” the Christian Science Monitor reported.
And Americans are skeptical about online privacy, too. A separate Pew poll found that 91 percent of Americans believe that consumers have lost control over how their personal information is collected.
Another recent study revealed that concerns over privacy have deterred many Americans from online activities including using social media sites, shopping, and online banking.
[Editor's note: The original post incorrectly listed Oliver Nordbjerg as a researcher, and implied that Aarhus University in Denmark was supporting the research.]