Proposed Amendments to Individual Pupil Information Prescribed Persons Regulations

1. Do you agree with the proposal to widen the purposes for which data from the National Pupil Database can be shared? Please explain the reasons for your answer.

Disagree

The ODI supports the release of open data: data free for unrestricted use. It also supports the collection and careful analysis of data that can contribute to evidence-based policy formation.

The ODI does not support the release of personal data as open data, or the use of personal data for commercial ends, unless the affected individuals have both

  • been informed of that use; and
  • given explicit consent

This proposal does not include the publication of open data: although organisations will not have to pay for the data, the Department for Education imposes stringent restrictions on access to it.

The proposal widens the set of purposes for which data about individual pupils can be made available in two ways: the goal of the reuse (improving children’s well-being as well as promoting their education) and the beneficiaries of the results (commercial organisations providing information and data-based products as well as researchers). The ODI supports the former but not the latter, and would support the change to the wording of the regulation if it said:

persons conducting research into the well-being or educational achievements of children in England and who require individual pupil information for that purpose

As well as the regulation itself, the Department for Education maintains a separate set of terms and conditions for the use of the National Pupil Database; any requests for the data are assessed by the Data and Statistics Division and the Data Management Advisory Panel against both the regulation and the terms. We understand that the terms to which organisations sign up state that recipients of the data must not attempt to identify individuals within the dataset; organisations that have access to the data are subject to the Data Protection Act, and breaching the terms would expose those organisations to prosecution, negative publicity and substantial financial penalties. However, commercial organisations take calculated risks, and if the likelihood of a breach being detected is low (or the company is able to absorb any penalties) and the benefits are high, we believe there is a high risk that they will break those terms of use.

We are concerned that the level of anonymisation used within the data extracts created from the National Pupil Database is not sufficient to prevent personal data from being discovered about individuals held within the database. For example, the year of birth, school attended and subjects taken (and grades achieved) of an individual are often easily locatable through an individual’s CV, and act as a fingerprint for an individual within the database, enabling their record to be identified even when “identifiable” information is omitted.
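As a minimal illustration of this fingerprinting risk, the sketch below counts how many records in a small extract are unique on year of birth, school and results alone. The records, field names and function names are hypothetical and are not drawn from the National Pupil Database; the sketch only shows the general technique of checking uniqueness on a combination of indirect identifiers.

```python
from collections import Counter

# Hypothetical extract: names and addresses have already been removed.
records = [
    {"birth_year": 1995, "school": "School A", "results": [("Maths", "A"), ("Physics", "B")]},
    {"birth_year": 1995, "school": "School A", "results": [("Maths", "A"), ("Physics", "B")]},
    {"birth_year": 1995, "school": "School A", "results": [("Maths", "A"), ("Latin", "A")]},
    {"birth_year": 1996, "school": "School B", "results": [("Maths", "C"), ("Art", "A")]},
]


def fingerprint(record):
    """Combine the indirect identifiers that might appear on a CV into one key."""
    return (record["birth_year"], record["school"], tuple(sorted(record["results"])))


def uniquely_identifiable(rows):
    """Return the rows whose fingerprint is shared with no other row."""
    counts = Counter(fingerprint(row) for row in rows)
    return [row for row in rows if counts[fingerprint(row)] == 1]


exposed = uniquely_identifiable(records)
print(f"{len(exposed)} of {len(records)} records are unique on these fields")
```

In this toy extract, anyone who knows the year of birth, school and results of one of the unique individuals could pick out that individual's full record, even though no name is present.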

We believe that the Data and Statistics Division and the Data Management Advisory Panel will do their best to safeguard the hundreds of thousands of children and young people whose personal details are stored within this database. However, the fact is that the more widely the data is distributed, the more likely it is that a security breach will occur. The number of people affected by such a breach, and the amount of information that would be made available about them, make the proposed widened availability too much of a risk.

We also do not see any evidence in the consultation for extra steps being taken to inform parents, children and young people about each additional use of the data, nor provision to obtain individual consent about this use. We believe informed consent should be a prerequisite for commercial use of personal data and we do not believe that there is any implied consent to such usage for the data that is currently held within the National Pupil Database. Our response to Question 4 below describes the measures we believe the Department for Education should put into place to obtain this consent if the National Pupil Database is made available for commercial purposes.

In addition, we believe that the media and the public will perceive the widened usage of the National Pupil Database as an example of the government carrying through its Open Data policy objectives. Although the proposal does not make the National Pupil Database available as open data, the distinction between "open data" and "free data you have to apply to access" is rather subtle, particularly when it is justified in the same way (to promote economic growth through the development of the information industry). We are concerned that negative publicity and feeling about commercial usage of this data, particularly if there were a security breach, will have a negative knock-on effect on attitudes towards open data, which would set back our own work and impact other global initiatives that are driving positive change.

The National Pupil Database does contain data that could be useful for third-party applications (see the response to Question 2 below). However, the vast majority of these applications can be created based on aggregate data from which (if appropriately treated to control levels of disclosure) it is harder to identify individuals.
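One common form of disclosure control for aggregate data is small-cell suppression, in which counts below a chosen threshold are withheld. The sketch below is illustrative only: the function name, the input layout and the threshold of 5 are our assumptions, not a description of any procedure the Department for Education uses.

```python
from collections import Counter


def aggregate_with_suppression(rows, min_cell_size=5):
    """Count pupils per (school, subject, grade) and withhold small counts.

    rows is an iterable of (school, subject, grade) tuples, one per pupil
    result. Counts below min_cell_size are replaced with None so that very
    small groups are not disclosed. The threshold of 5 is an assumed value.
    """
    counts = Counter(rows)
    return {
        cell: (n if n >= min_cell_size else None)
        for cell, n in counts.items()
    }


# Hypothetical pupil-level input: six pupils with one grade, two with another.
sample = [("School A", "Maths", "A")] * 6 + [("School A", "Maths", "E")] * 2
table = aggregate_with_suppression(sample)
for cell, n in sorted(table.items()):
    print(cell, "suppressed" if n is None else n)
```

Suppression of this kind makes it harder, though not impossible, to infer facts about individuals; a real release would also need to consider whether suppressed values can be recovered from published totals.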

For these reasons, we believe that the individual-level data within the National Pupil Database should remain restricted in its availability, and the scope of the purposes for which the data can be shared should remain limited to research, but with the widened goal of pupil well-being. We do, however, encourage the Department for Education to release more detailed aggregate data than it currently makes available, as open data.

2. How could you or your organisation potentially use the data?

The Open Data Institute supports start-ups and other data consumers to discover and use open data. There are a number of uses to which aggregations derived from the National Pupil Database could be put. For example:

  • property-finder applications could use school-level aggregations of attainment in individual subjects to help parents find a school or nursery that is suitable to their child’s individual strengths and weaknesses
  • property-finder applications could use an aggregation based on pupil addresses to identify the implicit catchment area of a school, which would help parents identify where to move to get into a particular school
  • a service could be created that matched schools based on their intake and enabled them to share best practices to help drive up the quality of teaching within schools

The aggregations these applications require could be generated by the Department for Education. None of these services require access to individual-level data.

3. What do you see as the benefits of widening the purposes for which data can be shared?

We can see that there are many potential benefits in widening the purposes for which individual-level data from the National Pupil Database can be shared in a secure way (not as open data), particularly for the organisations that could gain insight by having access to the data, leading to better targeting of products and services and to internal efficiencies. But we see many risks in providing that access, particularly to commercial organisations, as described in our answer to Question 1, and feel these outweigh the benefits.

We also see great benefits, which crucially do not have the same risks, in providing as open data suitably anonymised aggregate information sourced from the National Pupil Database. This could lead to economic growth, as new businesses are built on top of the available data, as well as social and environmental benefit if services are developed that improve the teaching provided to children and reduce the environmental impact of schools.

We also see benefits in enabling researchers to have access to the data to research children’s well-being (not just their educational achievement). Such research could help inform evidence-based policy by identifying interventions that will benefit children and young people in the future.

4. Do you have any other comments you would like to make about the proposals in this consultation document?

If the Department for Education follows through on this proposal to widen the purposes of use of the National Pupil Database, it will need to invest additional resources in the approvals process around requests for access, including:

  • assessing the security controls in the companies making these requests
  • ongoing audits of the technical and procedural measures that are put into place within the Department and those companies
  • preventing and dealing with security breaches
  • prosecuting organisations that breach the terms of use

We think that these resources would be better spent in building a robust mechanism for publishing useful low-level aggregate data from the National Pupil Database as open data. Publishing open data would enable new companies to easily use the aggregate data to create the kinds of services described in our answer to Question 2, reducing the workload both on those companies and Department for Education staff.

As part of the UK Anonymisation Network, the ODI would be very interested in exploring, with the Department for Education and some of the startups that we are currently incubating, how such aggregate data could be created without exposing personal information.

If the Department for Education decides to continue with the proposal to widen the purposes for which the National Pupil Database is made available, we would encourage the Department for Education to consult with the UK Anonymisation Network and explore methods of limiting the personal information that is made available to companies that make it through the approvals process.

The Department for Education should make available as open data the details of who has access to information from the National Pupil Database, which information they have access to, and for what purposes.

The Department for Education should seek the consent of the children and young people whose details are made available (or their parents) for the new uses to which the data will be put. Consent should be obtained specifically for the widened use of the data, particularly where the scope is extended to commercial purposes (previous consent for the purpose of research should not be taken as implied consent). Consent should be sought from all those affected, not just the current cohort of school and nursery children.

It is unlikely that companies will be interested in all the data held by the National Pupil Database; to limit exposure, the Department for Education should continue to make available to companies only the information that they need (which may involve subsetting the data to particular records and/or only releasing requested fields for those records).

5. Please let us have your views on responding to this consultation (e.g. the number and type of questions, whether it was easy to find, understand, complete etc.).

The ODI is extremely grateful for the opportunity to respond to this consultation, and thanks the Department for Education for alerting us to it. We would encourage future consultations to enable responses through a web form, to facilitate the submission of responses through third-party websites and the capture of those responses. We would also encourage the republication of consultation responses as open data (subject to permission from the responder).