The government welcomes views on the following issues:

Q1.5.18. Please share your views on the effectiveness and proportionality of data protection tools, provisions and definitions to address profiling issues and their impact on specific groups (as described in the section on public trust in the use of data-driven systems), including whether or not you think it is necessary for the government to address this in data protection legislation.


Q1.5.19. Please share your views on what, if any, further legislative changes the government can consider to enhance public scrutiny of automated decision-making and to encourage the types of transparency that demonstrate accountability (e.g. revealing the purposes and training data behind algorithms, as well as looking at their impacts).

Q1.5.20. Please share your views on whether data protection is the right legislative framework to evaluate collective data-driven harms for a specific AI use case, including detail on which tools and/or provisions could be bolstered in the data protection framework, or which other legislative frameworks are more appropriate.

113. Further work is underway, as part of the National AI Strategy and Centre for Data Ethics and Innovation’s AI Assurance workstream, to assess the need for broader algorithmic impact assessments. The responses to the above questions will inform that work. This consultation document also covers algorithmic transparency in the public sector specifically, detailed in section 4.4.

1.6 Data Minimisation and Anonymisation

114. The UK's data protection legislation requires that personal data be adequate, relevant and limited to what is necessary, or not excessive, in relation to the purposes for which it is processed; this is commonly known as the data minimisation principle. It requires, for example, that organisations choose processing methods that achieve their ends without using more personal data than necessary.
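In practice, the principle means processing only the fields a given purpose requires. The sketch below is a minimal illustration in Python; the record, purpose and field names are hypothetical and not drawn from the consultation:

```python
# Hypothetical record holding more fields than the stated purpose
# (sending an appointment reminder) actually requires.
record = {
    "name": "A. Example",
    "email": "a.example@example.com",
    "appointment": "2021-10-01 09:30",
    "date_of_birth": "1980-01-01",       # unnecessary for a reminder
    "home_address": "1 Example Street",  # unnecessary for a reminder
}

# Data minimisation: retain only the fields the purpose needs.
FIELDS_NEEDED = {"name", "email", "appointment"}
minimised = {k: v for k, v in record.items() if k in FIELDS_NEEDED}
print(minimised)  # date of birth and address are never processed further
```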

115. Data minimisation techniques, such as pseudonymisation, can be applied to safeguard personal data, which in turn may allow for such data to be shared in safer ways. Sharing data openly but safely can be highly valuable - for example, in the research community it can allow for cross-validation of scientific results, significantly improving the reliability of findings.
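One common pseudonymisation technique is keyed hashing (HMAC): the direct identifier is replaced with a pseudonym that cannot be reversed without a secret key, while records belonging to the same person can still be linked for analysis. The sketch below assumes the key is held separately from the shared dataset; the identifiers and key are invented for illustration:

```python
import hashlib
import hmac

# Hypothetical secret key, stored separately from the shared dataset. The key
# is the 'additional information' without which re-identification fails.
SECRET_KEY = b"held-apart-from-the-shared-data"

def pseudonymise(identifier: str) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256).

    The same input always yields the same pseudonym, so records belonging
    to one person can still be linked within the dataset.
    """
    return hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

# A shared copy carries pseudonyms in place of direct identifiers.
shared_record = {"participant": pseudonymise("patient-1234567"), "outcome": "recovered"}
print(shared_record)
```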

116. Personal data is defined as any information relating to an identified or identifiable individual. There is a spectrum of identifiability: safeguarding a dataset with data minimisation techniques should make it harder to use, on its own or in combination with other information, to identify a person either directly or indirectly. For example, pseudonymised data is personal data which has been put through a process so that it cannot be used to identify an individual without additional information. Personal data may also undergo anonymisation, so that an individual is not, or is no longer, identifiable.
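By way of illustration only (the consultation does not prescribe any particular technique), one family of approaches moves data along this spectrum by generalising quasi-identifiers, and measures the result with the 'k' of k-anonymity: the size of the smallest group of records sharing the same quasi-identifier values. All records below are hypothetical:

```python
from collections import Counter

# Hypothetical records: full date of birth and full postcode are
# quasi-identifiers that may single a person out in combination.
records = [
    {"dob": "1980-03-14", "postcode": "SW1A 1AA", "condition": "asthma"},
    {"dob": "1980-07-02", "postcode": "SW1A 2BB", "condition": "diabetes"},
    {"dob": "1981-11-30", "postcode": "SW1A 3CC", "condition": "asthma"},
]

def generalise(record: dict) -> dict:
    """Coarsen quasi-identifiers: keep year of birth and outward postcode only."""
    return {
        "birth_year": record["dob"][:4],
        "postcode_area": record["postcode"].split()[0],
        "condition": record["condition"],
    }

generalised = [generalise(r) for r in records]

# Smallest group sharing the same quasi-identifier values: the 'k' in
# k-anonymity. A larger k means each record hides in a bigger crowd,
# though k alone is not a legal test of anonymity.
groups = Counter((r["birth_year"], r["postcode_area"]) for r in generalised)
k = min(groups.values())
print(f"k = {k}")
```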

117. The distinction between anonymised and pseudonymised data is important because it delimits the scope of data protection legislation, including the UK GDPR. Pseudonymised data falls within the scope of data protection legislation, whereas anonymous data does not.

118. Determining whether personal data is anonymous may be complex; organisations must make a context-specific decision, taking into account various risks and external factors. The thresholds for determining anonymisation are arguably unclear, even for sectors where anonymisation is a fundamental part of data use and sharing, such as the research community. These thresholds are dynamic and change, for example, with the development of new techniques both to safeguard data and to re-identify it.

119. As a result, organisations may incorrectly classify data as anonymous and therefore fail to protect personal data adequately. Alternatively, organisations may apply anonymisation techniques more extensively than is necessary to mitigate risks to an individual's data protection rights, potentially compromising the value of the dataset for re-use. Over-application of anonymisation may also be driven by the perceived risk of enforcement action in the event that a dataset is treated as anonymous but is later used, possibly by a malicious actor, to identify a person.

120. The government believes more could be done to help organisations understand what is required to anonymise data effectively. Organisations that have carried out appropriate due diligence to ensure that the data is reasonably unlikely to be re-identified should be able to realise its benefits. A proposed approach is set out in the following sections.

Clarifying the circumstances in which data will be regarded as anonymous

121. The UK's data protection legislation only regulates information relating to an identifiable living individual: it does not regulate anonymous information. At the moment, the legislation does not include a clear test for determining when data will be regarded as anonymous. There is, however, some guidance in the ICO's code of practice on anonymisation and in the recitals to the UK GDPR to help in determining whether data is anonymous. In the interests of certainty, the government is proposing to place such a test onto the face of legislation, and is considering two options:

a. Placing Recital 26 of the UK GDPR onto the face of legislation. Recital 26 states that to determine whether a person is identifiable, account should be taken of ‘all the means reasonably likely to be used’ to (re-)identify the person, including all objective factors, such as costs and time required for identification, in light of available technology at the time of the processing and technological developments.

b. Creating a statutory test based on the wording of the Explanatory Report accompanying the Council of Europe's modernised Convention 108 (Convention for the Protection of Individuals with regard to Automatic Processing of Personal Data) (Convention 108+). This explains that data will be regarded as anonymous when it is impossible to re-identify the data subject, or if such re-identification would require unreasonable time, effort or resources, assessed on a case-by-case basis taking into consideration the purpose of the processing and objective criteria such as the cost, the benefits of identification, the type of controller, the technology and so on, noting that this may change in light of technological and other developments.[34]

122. Either of these options would provide a clearer test for determining when data will be regarded as anonymous, and the government welcomes views on their benefits and risks.

Clarifying that the test for anonymisation is a relative one

123. In addition to the proposals outlined above, the government is considering legislation to confirm that the question of whether data is anonymous is relative to the means available to the data controller to re-identify it. The Court of Justice of the European Union (CJEU) favoured a relative approach when assessing whether dynamic IP addresses constitute personal data in Breyer v Germany. The Court noted it was relevant to ask what means of identification were available to the relevant controller processing the data, including where the relevant means of identification were in the hands of a third party from whom the controller could reasonably obtain them. If the data controller has no way of obtaining the means of identification, the data is not identifiable in the hands of the controller (and so is anonymous in that particular controller's hands even though it may be identifiable in another controller's hands).

124. The confirmation of a relative test for re-identification would not change the current position that the test may apply differently over time (owing to technological developments, for example), so organisations would continue to be expected to carry out appropriate due diligence to remain compliant. Furthermore, it would maintain the current position that the status of data - that is, whether it is anonymous or not - can be different depending on whose hands it is in. Organisations would therefore still have to consider whether data could be re-identified when shared with third parties. However, a relative test could give organisations more confidence to anonymise data and use it more innovatively within their own organisations, or when shared with organisations that adhere to similar standards on anonymisation and re-identification.

125. Greater use of effective anonymisation could help to better protect individuals' personal information, reduce risks for organisations and provide the opportunity for broader economic and societal benefits through an increase in the availability of data. Forthcoming guidance from the ICO on anonymisation, pseudonymisation and privacy-enhancing technologies (PETs) will also look to help organisations build confidence in the use of de-identification measures and any subsequent decisions to share data with third parties. The government is not currently planning any further legislative reform to better support use and standards of PETs beyond the proposals on anonymisation outlined above, but would welcome views on whether there are other areas it should explore.

The government welcomes views on the following questions:

Q1.6.1. To what extent do you agree with the proposal to clarify the test for when data is anonymous by giving effect to the test in legislation?

○ Strongly agree
○ Somewhat agree
○ Neither agree nor disagree
○ Somewhat disagree
○ Strongly disagree
