First things first, but not necessarily in that order!
— words of wisdom from Doctor Who
The tweet below got me thinking about the importance of privacy.
Data mining demands sound privacy policies in age of ‘big data’ http://t.co/1N0gZhz #datamining #BI #bigdata
After all, in this day and age, with access to all sorts of disparate information, it becomes easier and easier to uniquely identify a person by their behaviors and external patterns. In Dr. Latanya Sweeney’s groundbreaking paper k-Anonymity: a model for protecting privacy, she noted the following startling observations:
- Based on the 1990 census, over 80% of the US population was uniquely identifiable from just three attributes: 5-digit ZIP code, birth date, and gender
- By combining the Massachusetts voter list with healthcare records that had been stripped of explicit identifiers, she was able to identify the medical records of then-Governor William Weld
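To make the two observations concrete, here is a minimal sketch of how one might measure re-identification risk on a table. The records, the `generalize` rule, and all function names are hypothetical and made up for illustration; they are not taken from Sweeney's paper. The idea is to group rows by the three quasi-identifiers, see how many rows sit alone in their group, and report the smallest group size, which is the "k" in k-anonymity.

```python
from collections import Counter

# Hypothetical toy records: (zip_code, birth_date, gender). Not real data.
records = [
    ("02138", "1961-02-11", "F"),
    ("02139", "1964-09-30", "F"),
    ("02139", "1972-05-04", "M"),
    ("02141", "1978-12-19", "M"),
    ("02141", "1978-12-19", "M"),
]

def unique_fraction(rows):
    """Fraction of rows whose quasi-identifier combination occurs only once."""
    sizes = Counter(rows)
    return sum(1 for row in rows if sizes[row] == 1) / len(rows)

def k_anonymity(rows):
    """Smallest group size: every row is hidden among at least k rows."""
    return min(Counter(rows).values())

def generalize(row):
    """Illustrative coarsening: 3-digit ZIP prefix and decade of birth."""
    zip_code, birth_date, gender = row
    return (zip_code[:3] + "**", birth_date[:3] + "0s", gender)

print("unique fraction (raw):", unique_fraction(records))
print("k (raw):", k_anonymity(records))
print("k (generalized):", k_anonymity([generalize(r) for r in records]))
```

On this toy table, three of the five raw rows are unique, so k is 1. Coarsening the ZIP code and birth date merges the rows into larger groups and raises k, at the cost of less precise data.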
With the power of Big Data, it is easy to forget that the more we dig and mine, the more we potentially invade privacy. So now more than ever, we need to make use of privacy mechanisms such as k-anonymity or privacy-preserving histograms, such as those built with epsilon noise in Analyzing Data while Protecting Privacy – A Case Study.
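As a rough illustration of the histogram idea, the sketch below adds random noise to each bin count before releasing it. It assumes that "epsilon noise" refers to the Laplace mechanism from differential privacy, with each person contributing to exactly one bin so that the noise scale is 1/epsilon. That reading is an assumption, and the function name `noisy_histogram` and the sample data are hypothetical rather than taken from the case study.

```python
import random
from collections import Counter

def noisy_histogram(values, bins, epsilon, rng=None):
    """Return bin counts with Laplace(0, 1/epsilon) noise added to each.

    Assumes each individual falls into exactly one bin (sensitivity 1)
    and that `bins` is fixed in advance, not derived from the data.
    """
    rng = rng or random.Random()
    counts = Counter(values)
    noisy = {}
    for b in bins:
        # The difference of two Exp(rate=epsilon) draws is Laplace
        # distributed with scale 1/epsilon.
        noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
        noisy[b] = counts.get(b, 0) + noise
    return noisy

# Hypothetical age brackets for a handful of people.
brackets = ["18-29", "30-44", "30-44", "45-64", "30-44", "18-29"]
all_bins = ["18-29", "30-44", "45-64", "65+"]

print(noisy_histogram(brackets, all_bins, epsilon=0.5))
```

A smaller epsilon means more noise and stronger privacy, while a larger epsilon keeps the counts closer to the truth. Note that empty bins such as "65+" also receive noise, so the released histogram does not reveal which bins were truly empty.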