How the content of your tweets can give your location away

Twitter is a vast source of unstructured textual data that can be used for location-based advertising and entertainment. However, all but a tiny fraction of Twitter users opt out of geo-tagging. Kisung Lee, Raghu K. Ganti, Mudhakar Srivatsa, and Ling Liu from the College of Computing at the Georgia Institute of Technology and IBM T. J. Watson Research Center in the USA set out to tackle the sparseness of location data.

Their research paper won the Best Paper Award at last year’s edition of MOBIQUITOUS.

The research presents ‘a methodological approach to increasing the number of geotagged tweets by predicting the fine-grained location of those tweets in which their location can be inferred with high confidence’, based only on their textual data. Their experiments showed that their three-step technique (Filtering-Ranking-Validating) can increase the number of geo-tagged tweets 4.8 times, and place 34% of predicted tweets within 250 m of their actual location. They conclude by suggesting several ways in which the framework could be extended.
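To make the "within 250 m" figure concrete, here is a minimal Python sketch of how such a share could be computed from pairs of predicted and actual coordinates. It is an illustration only, not the authors' evaluation code: the function names `haversine_m` and `share_within`, the use of the haversine great-circle distance, and the sample coordinates are all assumptions made for this example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def share_within(predicted, actual, threshold_m=250.0):
    """Fraction of predictions that fall within threshold_m of the true location.

    predicted and actual are equal-length lists of (lat, lon) tuples.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        return 0.0
    hits = sum(
        1 for p, a in zip(predicted, actual)
        if haversine_m(p[0], p[1], a[0], a[1]) <= threshold_m
    )
    return hits / len(predicted)


# Hypothetical sample: three predicted tweet locations and their true locations.
predicted = [(40.7580, -73.9855), (40.7580, -73.9855), (40.7590, -73.9855)]
actual = [(40.7580, -73.9855), (40.7680, -73.9855), (40.7580, -73.9855)]
share = share_within(predicted, actual)
```

In this made-up sample, two of the three predictions land within 250 m of the true location; the middle one is off by roughly a kilometre and does not count.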

Importantly, the researchers note that the value of geo-tagged tweets goes beyond advertising: they can be used to detect unexpected events, such as earthquakes, robberies or gunshots, and ‘notify the right people instantly’. They also pre-empt skeptics' concerns about privacy by explaining that their framework can warn a concerned Twitter user about potential threats to their location privacy if it detects that the location of a given tweet can be predicted.

Do the benefits really outweigh the potential risks? Read the full paper and judge for yourself.