Stanford researchers reveal surprisingly sensitive information from phone metadata

Original press release was issued by Stanford University.

The collection of phone metadata (numbers dialed and length of calls), as controversial as it is on general principle, has never really been the subject of significant public backlash as a serious breach of privacy. As a society, we tend not to regard what government agencies can learn from our metadata as sensitive private information, which is why this information can be accessed without a warrant.

This might soon change in light of research by a team at Stanford University, who were able to easily infer surprisingly sensitive and accurate personal information – such as health details – from metadata alone. Additionally, the reach of such surveillance has been shown to be larger than previously thought: following metadata “hops” from one person’s communications can involve thousands of people. The findings provide the first empirical data on the privacy properties of telephone metadata.

The researchers set out to fill knowledge gaps within the National Security Agency’s current phone metadata program, which has drawn conflicting assertions about its privacy impacts. The law currently treats call content and metadata separately and makes it easier for government agencies to obtain metadata, in part because it assumes that it shouldn’t be possible to infer specific sensitive details about people based on metadata alone.

“I was somewhat surprised by how successfully we inferred sensitive details about individuals,” said study co-author Patrick Mutchler, a graduate student at Stanford. “It feels intuitive that the businesses you call say something about yourself. But when you look at how effectively we were able to identify that a person likely had a medical condition, which we consider intensely private, that was interesting.”

From a small selection of the users, the Stanford researchers were able to infer, for instance, that a person who placed several calls to a cardiologist, a local drugstore and a cardiac arrhythmia monitoring device hotline likely suffers from cardiac arrhythmia. Another study participant likely owns an AR semiautomatic rifle, based on frequent calls to a local firearms dealer that prominently advertises AR semiautomatic rifles and to the customer support hotline of a major firearm manufacturer that produces these rifles.

The computer scientists built an app that retrieved the previous call and text message metadata from more than 800 volunteers’ smartphone logs. In total, participants provided records of more than 250,000 calls and 1.2 million texts. The researchers then used a combination of inexpensive automated and manual processes to illustrate both the extent of the reach – how many people would be involved in a scan of a single person – and the level of sensitive information that can be gleaned about each user.

By extrapolating participant data, the researchers estimated that the NSA’s current authorities could allow for surveilling roughly 25,000 individuals – and possibly more – starting from just one “seed” phone user.
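
To make the idea of “hops” concrete, the short Python sketch below counts how many phone numbers can be reached from a seed within a given number of hops. It is only an illustration under stated assumptions, not the researchers’ method: the function name `reach_within_hops`, the `max_hops` parameter and the toy call graph are all hypothetical, and a plain breadth-first search stands in for whatever extrapolation the study actually used.

```python
from collections import deque


def reach_within_hops(call_graph, seed, max_hops):
    """Return every number reachable from `seed` in at most `max_hops` hops.

    `call_graph` maps a phone number to the numbers it has contacted.
    This is a plain breadth-first search over a hypothetical toy graph.
    """
    seen = {seed}
    frontier = deque([(seed, 0)])
    while frontier:
        number, hops = frontier.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for contact in call_graph.get(number, ()):
            if contact not in seen:
                seen.add(contact)
                frontier.append((contact, hops + 1))
    seen.discard(seed)  # the seed itself is not counted as "reached"
    return seen


# Hypothetical toy data: a shared business number links otherwise unrelated people.
toy_graph = {
    "seed": ["alice", "pizza_shop"],
    "alice": ["seed", "bob"],
    "pizza_shop": ["seed", "carol", "dave"],
    "bob": ["alice", "erin"],
    "carol": ["pizza_shop"],
    "dave": ["pizza_shop"],
    "erin": ["bob"],
}

one_hop = reach_within_hops(toy_graph, "seed", 1)
two_hops = reach_within_hops(toy_graph, "seed", 2)
print(len(one_hop), len(two_hops))
```

In this toy graph the seed reaches two numbers in one hop and five in two, because a single call to a widely contacted business pulls in that business’s other callers. The same effect at real-world scale is what makes the reach grow so quickly.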

Although the results are not necessarily surprising, the researchers said that the raw, empirical data provide a better-informed starting point for future conversations between privacy interest groups and policymakers.

“If we’re going to pick a sweet spot as society, where we want the privacy vs. security tradeoff to lie, it’s important to understand the implications of the policies that we have,” Mutchler said. “In this paper, we have empirical data, which I think will help people make informed decisions.”