Before Big Data mining, we need to ensure privacy–but how?

First things first, but not necessarily in that order!

— words of wisdom from Doctor Who

The tweet below got me thinking about the importance of privacy.

Data mining demands sound privacy policies in age of ‘big data’ http://t.co/1N0gZhz #datamining #BI #bigdata

 

After all, in this day and age, with access to all sorts of disparate information, it becomes easier and easier to uniquely identify a person by their behaviors and external patterns. In Dr. Latanya Sweeney’s groundbreaking paper k-Anonymity: a model for protecting privacy, she noted the following startling observations:

  • Based on 1990 census data, over 80% of the US population was likely to be uniquely identifiable from just three attributes: 5-digit ZIP code, birth date, and gender
  • By linking the Massachusetts voter list with health insurance records that had been released as anonymized, she was able to identify the medical records of then-Governor William Weld
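
To make the re-identification risk concrete, here is a minimal sketch of how one might measure it on a table of records. Everything in it is an assumption for illustration: the field names (zip, birth_date, gender, diagnosis), the helper names, and the toy rows are made up and do not come from Sweeney's paper. It counts how many records are unique on the three quasi-identifiers and reports the k for which the table is k-anonymous (the size of the smallest group sharing the same quasi-identifier values).

```python
from collections import Counter

# Hypothetical quasi-identifier columns, mirroring the three attributes above.
QUASI_IDENTIFIERS = ("zip", "birth_date", "gender")


def group_sizes(records, keys=QUASI_IDENTIFIERS):
    """Count how many records share each combination of quasi-identifiers."""
    return Counter(tuple(r[k] for k in keys) for r in records)


def unique_fraction(records, keys=QUASI_IDENTIFIERS):
    """Fraction of records whose quasi-identifier combination appears only once."""
    sizes = group_sizes(records, keys)
    unique = sum(1 for r in records if sizes[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


def k_of(records, keys=QUASI_IDENTIFIERS):
    """Largest k for which the table is k-anonymous: the smallest group size."""
    return min(group_sizes(records, keys).values())


# Toy, made-up rows (not real people or real data).
records = [
    {"zip": "02138", "birth_date": "1950-01-02", "gender": "M", "diagnosis": "A"},
    {"zip": "02138", "birth_date": "1962-03-14", "gender": "F", "diagnosis": "B"},
    {"zip": "02139", "birth_date": "1962-03-14", "gender": "F", "diagnosis": "C"},
    {"zip": "02139", "birth_date": "1962-03-14", "gender": "F", "diagnosis": "D"},
]

print(unique_fraction(records))  # share of rows that stand alone
print(k_of(records))             # 1 means at least one row is unique
```

A row that is unique on these columns can be matched against any other list that carries the same columns plus a name, which is the kind of linkage described above.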

 

With the power of Big Data, it is easy to forget that the more we dig and mine, the more we potentially invade privacy. So now more than ever, we need to make use of privacy mechanisms such as k-anonymity, or privacy-preserving histograms that add noise controlled by a parameter epsilon, as described in Analyzing Data while Protecting Privacy – A Case Study.
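
As a rough illustration of the epsilon-noise idea, the sketch below adds Laplace noise to each bin of a histogram before it is released. It is a sketch under stated assumptions, not necessarily the mechanism used in the case study: the function names are hypothetical, the bins are assumed to be fixed in advance rather than derived from the data, and the noise scale of 1/epsilon assumes that adding or removing one person changes exactly one bin count by 1.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws,
    each with mean `scale`."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_histogram(values, bins, epsilon, rng=None):
    """Return {bin: true count + Laplace noise} for a fixed list of bins.

    Assumption: each person contributes to exactly one bin, so the noise
    scale is 1 / epsilon. Smaller epsilon means more noise.
    """
    rng = rng or random.Random()
    counts = Counter(values)
    scale = 1.0 / epsilon
    return {b: counts.get(b, 0) + laplace_noise(scale, rng) for b in bins}


# Made-up example: age brackets of five hypothetical people.
ages = ["20-29", "30-39", "30-39", "40-49", "30-39"]
bins = ["20-29", "30-39", "40-49", "50-59"]

print(noisy_histogram(ages, bins, epsilon=0.5))
```

The released counts are no longer exact (they can even be negative or fractional), which is the price paid for making any single person's presence harder to infer.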
