In what may be a world first, the UK’s Information Commissioner’s Office is seeking feedback on a proposed “Data Anonymisation Code of Practice.” But just how anonymous is de-identified data, and when can “Big Data” processors be confident they are dealing with non-personally identifiable data? Sue Gold reports.
Topic: Privacy
Who: The Information Commissioner's Office (ICO)
Where: Wilmslow, Cheshire, UK
When: 31 May 2012
Law stated as at: 6 July 2012
Background:
Anonymisation is of particular relevance now, given the increased amount of information being made publicly available through open data initiatives and through individuals posting their own personal data online.
Removing personally identifiable information from individuals' digital profiles so that the information can then be used in aggregate is now common practice. The debate has grown over whether such information can ever truly be anonymised in practice.
This concern has been highlighted by developing technical capabilities that allow individuals to be identified from supposedly "anonymous" information, which in turn raises questions about how to handle such information safely.
What happened:
The ICO has recognised the need for further clarification in this area and has launched a public consultation on a new anonymisation code of practice. The consultation will close on 23 August 2012, and a final version of the code is due for publication in September. The code is intended to demonstrate that the effective anonymisation of personal data is possible, and can help to ensure the availability of rich data resources whilst protecting individuals' privacy.
The code supports the Information Commissioner's view that the Data Protection Act 1998 should not be used as a barrier to prevent the anonymisation of personal data, given that anonymisation is ultimately intended to safeguard individuals' privacy.
The ICO would also like to hear from organisations interested in bidding for a funding allocation of £15,000 to create, develop and support a professional network for sharing expertise concerning anonymisation techniques and data release.
The proposed code explains the implications of anonymising personal data, and of disclosing data which has been anonymised. It provides good practice advice for all organisations that need to convert personal data into a form in which the individuals are no longer identifiable.
It also contains a number of examples illustrating some of the techniques that can be used to anonymise personal data, including the use of encryption algorithms. In one example, supermarkets want to anonymise data to share with a third party carrying out research into the correlation between what the public eat and diabetes. The code proposes using an encryption algorithm to generate unique reference numbers.
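As a rough illustration of how unique reference numbers might be generated, the following Python sketch replaces a loyalty card number with a derived reference. It is not taken from the ICO's draft code. It substitutes a keyed hash (HMAC-SHA256) for the "encryption algorithm" the code mentions, and the field names, sample records and function names are all hypothetical.

```python
import hashlib
import hmac
import secrets


def make_reference(card_number: str, secret_key: bytes) -> str:
    """Derive a stable reference number from a card number (hypothetical helper)."""
    digest = hmac.new(secret_key, card_number.encode("utf-8"), hashlib.sha256)
    # Keep the first 16 hex characters as a shorter reference number.
    return digest.hexdigest()[:16]


def pseudonymise(records, secret_key):
    """Return copies of the records with the card number replaced by a reference."""
    output = []
    for record in records:
        output.append({
            "ref": make_reference(record["card_number"], secret_key),
            "basket": record["basket"],
        })
    return output


# Assumption: the key is held by the supermarket and never shared with the researcher.
secret_key = secrets.token_bytes(32)

# Hypothetical purchase records, invented for illustration only.
purchases = [
    {"card_number": "1234-5678", "basket": ["bread", "cola"]},
    {"card_number": "9999-0000", "basket": ["apples"]},
    {"card_number": "1234-5678", "basket": ["crisps"]},
]

shared = pseudonymise(purchases, secret_key)
```

In this sketch the same card number always produces the same reference, so a researcher could link one shopper's purchases over time without seeing the card number itself. Whether the result is anonymised in the sense the code intends would still depend on what other information is released alongside it and on who has access to the key.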
Additional points for consideration:
Consent. Personal data could be anonymised without the individual's consent if that anonymisation is necessary for the purposes of the legitimate interests pursued by the organisation in question. This "legitimate interests" justification must be weighed against the impact such anonymisation would have on the interests of the data subject.
"Motivated Intruder". The code also introduces the "motivated intruder" test to help determine whether anonymised information would allow individuals to be re-identified. The test assumes that the "motivated intruder" is reasonably competent at using the internet, publicly available information and investigatory techniques to identify someone, but is not assumed to have any specialist knowledge such as computer hacking skills.
"Educated guess". The code states that the possibility of making an educated guess about someone's identity may present a privacy risk but not a data protection one. It nevertheless recommends caution and consideration of the possible impact on individuals.
Why this matters:
The definition of personal data has become wider in recent years, as seen in the debate concerning IP addresses. Enhanced technology that facilitates the identification of individuals from aggregated data has called into question whether data can ever be truly anonymised.
This draft code reopens this debate and may be of relevance in a number of areas, including the collection of analytics using aggregated data. Linked to this discussion is the proposal from the EU's Council of Ministers on 27 June 2012 outlining some revisions to the draft Data Protection Regulation. The revisions include a proposal that information should not be regarded as personal information "if identification requires a disproportionate amount of time, effort or material resources".
The Council also proposed that anonymised information that does not allow individuals to be identified should fall outside the scope of data protection law.
This is an important area to keep under review. It is of growing importance for organisations that carry out analytics or aggregate information, including through the use of web analytics.