Open Data and the National Pupil Database

The defining characteristic of open data is that everyone can use it.

Data that you have to pay for to reuse isn’t open data because people who can’t afford to pay can’t use it.

Data that you can only access after going through an application process, and where that application might be refused, isn’t open data because people whose applications fail can’t use it.

Data that you can’t use within a business isn’t open data because people who want to make money from using the data can’t use it.

These are some of the fundamental lines between closed and open data.

Looking at how the Department for Education is describing the wider sharing of extracts of the National Pupil Database, I’m reminded that we need to keep restating these lines.

The National Pupil Database contains records about every child and young person who has been through England’s educational system since 1995. This is personal and sensitive data, and quite rightly the Department for Education has stringent controls and limitations around who can access the individual-level data. They control what parts of the database others can get hold of and how it can be used. Those who want access have to apply, and if the application fails they can’t have it.

So the National Pupil Database is not open data.

But when the National Pupil Database was first listed on data.gov.uk, it was incorrectly described as open data and licensed under the Open Government Licence. The National Pupil Database also appears within the Department for Education’s Open Data Strategy document as being available under the Open Government Licence. This is misleading and potentially damaging; as we argue in our response to the Department for Education’s consultation, access to the National Pupil Database should be tightly restricted.

However, the Department for Education does make available a wide range of useful aggregations based on analysis of the data within the National Pupil Database. There is no need to apply to get hold of the aggregated data; you can just download it from the Department for Education website. This is open data, and three-star at that!

To its credit, the Department for Education has corrected some of these errors, but its original confusion highlights the need for us to describe more clearly what open data means and how and when to use it. Doing this is a fundamental part of our mission at the Open Data Institute.

Thanks to Phil Booth and Owen Boswarva for the pointers that contributed to this post.