According to IBM, 2.5 quintillion bytes of data are created every day. In 2012, the US administration announced the Big Data Research and Development Initiative to exploit big data for enhancing research and innovation. For big data management and analysis, cloud computing offers one of the most convenient computing and storage infrastructures available today. However, the use of the cloud further complicates the problem of data privacy.
Even when collected data is anonymized by removing identifiers such as names or social security numbers, linking it with other data may allow the individuals it describes to be re-identified. In addition, because organizations such as governmental agencies often need to collaborate on security tasks, data sets are exchanged across organizations and end up available to many different parties. The big question is how to share and analyse big data in a manner that preserves privacy.
Researchers Elisa Bertino and Bharath Kumar Samanthula from the Department of Computer Science at Purdue University in West Lafayette, Indiana, USA, are developing efficient and effective privacy-enhancing techniques for the management and analysis of big data in cloud computing. They presented their work at COLLABORATECOM 2014, the 10th IEEE International Conference on Collaborative Computing: Networking, Applications and Worksharing. While the first phase of their research focused on encrypted data, they intend to investigate other data transformation techniques, such as anonymization, in future work.
The whole paper can be found here.