You may think you are anonymous on the web, but you really aren’t. Stanford and Princeton researchers have shown that anonymous browsing history and Twitter usage are enough to reveal a user’s real identity.
The research team launched what they call the “Footprints Project” over the summer, inviting people to take part in an online experiment by disclosing their browser history, including information about active Twitter usage. Based on that information alone, they correctly identified 11 out of 13 people on the first day of operation.
The Footprints experiment ended in October, having studied almost 300 users and accurately identified 80% of them.
“This is kind of scary,” says Stanford undergraduate Ansh Shukla, a senior studying mathematics, who is working on the project with Stanford Engineering assistant professor Sharad Goel and Stanford computer science PhD student Jessica Su.
“You should kind of go into the internet assuming that everything you go to someone might learn about someday,” Shukla says.
How did it work? Volunteers who participated in Footprints gave the researchers permission to gather the names of any websites that a participant clicked on through Twitter while using Google Chrome. This unique set of links acts as a fingerprint. To find the matching user, the researchers crawled through millions of Twitter profiles to see which accounts each one follows.
So imagine that Jane Doe, John Smith and Susie Q all participated anonymously, and that each of these three volunteers follows 100 Twitter accounts. All three might follow the official Stanford Engineering Twitter account. But Jane and John also follow the New York Times’ Twitter account for their news, while Susie instead follows the Los Angeles Times as her newspaper of choice. Researchers can then deduce that the person who visited links tweeted from Stanford Engineering and the New York Times is more likely to be Jane or John, not Susie.
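To make the Jane, John and Susie example concrete, here is a minimal sketch in Python. It is not the researchers' actual method, only a simple overlap count: each candidate is scored by how many links in the anonymous history were tweeted by accounts that candidate follows. The account handles, link names and function names are all made up for illustration.

```python
# Hypothetical data: links tweeted by each account (invented for illustration).
TWEETED_LINKS = {
    "StanfordEng": {"stanford.edu/a", "stanford.edu/b"},
    "nytimes": {"nytimes.com/x", "nytimes.com/y"},
    "latimes": {"latimes.com/p", "latimes.com/q"},
}

# Hypothetical follow lists for the three volunteers in the example.
FOLLOWS = {
    "Jane Doe": {"StanfordEng", "nytimes"},
    "John Smith": {"StanfordEng", "nytimes"},
    "Susie Q": {"StanfordEng", "latimes"},
}


def feed_links(user):
    """Return every link tweeted by the accounts this user follows."""
    links = set()
    for account in FOLLOWS[user]:
        links |= TWEETED_LINKS.get(account, set())
    return links


def rank_candidates(history):
    """Rank users by how many links in the history appear in their feed."""
    scores = {user: len(history & feed_links(user)) for user in FOLLOWS}
    # Highest overlap first; ties are broken alphabetically by name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# An anonymous history with Stanford Engineering and New York Times links.
history = {"stanford.edu/a", "nytimes.com/x", "nytimes.com/y"}
print(rank_candidates(history))
```

With this toy data, Jane and John tie with three matching links each, while Susie scores only one. That mirrors the article's point: the history narrows the field to Jane or John, and it would take more links, or more distinctive follow lists, to tell those two apart.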
And you can bet that many advertisers and internet companies are already using similar techniques to learn everything they can about individual internet users. Even without linking an online user with their real name, companies can cross-reference databases to learn very interesting – and lucrative – information. You may have heard of cookies, but it just doesn’t end there.
It goes to show that our digital footprints are considerably more vulnerable and exploitable than we often think, and clearing cookies and browsing history does not help. It also takes more than checking the “do not track” setting in a browser. An average user on an average machine is, quite frankly, an open book, and it’s up to us whether or not we mind.