The General Data Protection Regulation expands the definition of personal data

By Efrén Díaz Díaz

The General Data Protection Regulation (GDPR) will soon be directly applicable in EU countries and will apply to all personal data about human beings, or 'natural persons' as we're called in jurisprudence. Businesses within and working with EU countries, or providing services to people in the EU, will need to take these new rules into account in their business practice.

Adopted earlier this year, the GDPR applies from 25 May 2018. It extends the definition of personal data in its Article 4 to: "any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person".

Under this definition, any identifier from which a person can be directly or indirectly identified counts as 'personal data'. Examples include an ID number, a personal address, an IP address, a telephone number, fingerprints, retinal patterns, hand geometry, a drug identification number or images of individuals captured by a video surveillance system.

Some data is definitely personal data

Your current location is personal data. Where you live is personal data. So is your route to work. But sometimes these things can still be published as open data to unlock greater value for companies, public bodies and society in general.

The GDPR says that controllers or processors can only publish datasets containing personal data as open data with the consent of the data subject, or on some other legitimate basis under Article 6 (for example, compliance with a legal obligation).

Data can also be published if it is anonymised, but this is tricky and laborious. Controllers and processors are required to provide sufficient guarantees and implement appropriate technical and organisational measures to meet the requirements of the GDPR and ensure the protection of the rights of the data subject. The Information Commissioner’s Office and the UK Anonymisation Network have put together detailed guidance about anonymising datasets.

Some data is definitely not personal data

A register of countries, for example, is clearly not personal data. An address database is not personal data if it contains no information about owners or occupants, such as individual names, house prices or identifying details of tenants or landlords. Most spatial information, such as maps, road networks and cadastral boundaries, is not personal data so long as it does not include information about the ownership of those areas. Bus timetables would not be considered personal data if all that a dataset consists of is the general times and routes of buses.

These datasets can be combined with many other kinds of data, including personal data, to create value, but they are not personal data in themselves.

Sometimes it can be really hard to tell

This is where geospatial data comes in. Geospatial data is a valuable attribute in many datasets, but it can make identification of individuals easier.

You need to think about what’s in your dataset. A personalised bus timetable, such as an individual's commute, may be considered personal data. This is because the start and end points can reveal a home and work address, which may be enough to identify the person. The same applies to a route cycled on the weekend or a weekly run.

Some services have been designed to combine anonymisation techniques and user consent to allow people to choose to publish personal data containing location information. For example, Strava allows people to publish their running or cycling activities.

The app says that: "Strava allows you to make any individual activity private. You can also create a privacy zone perimeter around any address like your home, office, or any place you tend to start activities from that you’d like to keep private. You can make your profile viewable only by signed in Strava members, and abbreviate your last name for more anonymity. You can also require approval before allowing someone to follow you."

This type of privacy-by-design and privacy-by-default is a pattern that other organisations could adapt and use to create open data from personal and geospatial data.
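
As a rough illustration of the privacy zone idea, the Python sketch below drops every point of a GPS track that falls within a chosen radius of a private address. It is not Strava's implementation: the function names `haversine_m` and `apply_privacy_zones`, the data layout and the radius are all assumptions made for this example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def apply_privacy_zones(track, zones):
    """Return the track without any point that lies inside a privacy zone.

    track: list of (lat, lon) tuples (hypothetical layout).
    zones: list of (lat, lon, radius_m) tuples chosen by the data subject.
    """
    return [
        (lat, lon)
        for lat, lon in track
        if all(
            haversine_m(lat, lon, zone_lat, zone_lon) > radius_m
            for zone_lat, zone_lon, radius_m in zones
        )
    ]
```

A simple circular cut like this can still hint at the hidden address, because the published track stops at the edge of the circle. It should be read as a starting point for discussion rather than as a sufficient anonymisation measure.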

Considering scale and linkability

Whether some data is personal data depends on the ease of linking it to a person. For geospatial data, this can be a function of scale. For example, if you could zoom into a satellite photo of Europe and identify a person in their back garden, then the photo is personal data. If the scale of the spatial information does not enable someone to identify a natural person then it can be published as open data.

Geospatial data will be considered personal data if it is linked to a person, is clearly about their property, or has an impact on that person. An example is the value of a particular house. Although it looks like simple information about an object and not about a person, it may be personal data under the definition in the GDPR, because it could be combined with the identity of the occupier to infer information about their income or tax liabilities. Data protection rules will clearly not apply if this information, which is already open data in certain countries, is aggregated or anonymised and published to summarise real estate prices in larger geographic areas.
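
To make the aggregation idea concrete, here is a minimal Python sketch that summarises individual house prices by larger area and leaves out any area with too few records. The function name `summarise_prices`, the record layout and the threshold of five are assumptions for illustration only; the GDPR does not prescribe a minimum count, and a suitable threshold depends on the dataset.

```python
from collections import defaultdict
from statistics import median


def summarise_prices(records, min_count=5):
    """Summarise prices per area, omitting areas with fewer than min_count records.

    records: iterable of (area_code, price) pairs (hypothetical layout).
    min_count: illustrative threshold, not a legal requirement.
    """
    prices_by_area = defaultdict(list)
    for area_code, price in records:
        prices_by_area[area_code].append(price)

    return {
        area_code: {"count": len(prices), "median_price": median(prices)}
        for area_code, prices in prices_by_area.items()
        if len(prices) >= min_count  # small groups could point to one household
    }
```

Publishing only the per-area summary, and leaving out thinly populated areas, makes it harder to link a figure back to one house and its occupier, although it does not by itself guarantee anonymity.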

Meanwhile, in Spain the National Data Protection Authority is investigating one of the biggest mobile phone operators for collecting and using people’s geolocation data without consent. This is clearly personal data: it could show whether users regularly visit the headquarters of a political party, a union or a place of worship. It needs to be used in a way that complies with data protection regulations.

Not all geospatial data is personal data, but some is

The GDPR has expanded and clarified the definition of personal data and the responsibilities of organisations that hold and process it.

Determining whether geospatial data is personal data can be tricky, but it is very important, especially if you want to publish that data as open data. Before publishing, it is crucial to ascertain whether the geo-information is personal data and whether it can legally be published as open data. Geospatial data on its own is not usually considered personal data. But when a dataset also includes personal data, the whole dataset becomes personal data: the GDPR applies, and the dataset needs to be anonymised before it is published openly.

The GDPR has big implications for both the technological and the legal aspects of geo-information. However, a dataset's privacy impact can be assessed before it is published as open data, taking into account the legal issues involved, particularly those relating to privacy and the protection of personal data in geo-information databases.

Efrén Díaz Díaz is a lawyer at Bufete Mas y Calvet.
