Seven principles to help us strengthen our data infrastructure

Since publication, these principles have evolved into one of the guides that we use in our work. The latest version is available for anyone to use. If you want help in using the principles then contact us at [email protected]

Good infrastructure is simply there when we need it – we know it’s working when we don’t need to think about it. CC BY 2.0, uploaded by Hefin Owen.

The UK Government has launched a public consultation on moving the operations of the Land Registry for England and Wales to the private sector. The Land Registry registers the ownership of land and property. Land ownership data is a valuable part of any country’s data infrastructure. We will be responding to the consultation and hope to persuade the government to avoid the mistakes it made when it lost control of address data.

We had already begun devising our own high-level principles to shape how we approach policies, research and tools for data infrastructure over the coming year. We developed the principles through internal workshops and from the lessons we have learnt over the years. The Land Registry consultation will be one of the ways that we test and iterate the principles (set out below). We wanted to share them in draft to hear other people’s feedback or ideas for other potential uses. We will publish the next iteration of the principles in two months’ time.

Data infrastructure should be boring and reliable

Data is infrastructure. It underpins innovation, transparency, accountability, businesses, public services, and civil society.

The data in our infrastructure exists on a spectrum, from closed to shared to open. Each part of the spectrum is useful; not all data will be open or closed. For example, while few people want to publish medical records openly, societies accept that medical records will always be shared with a doctor in a time of need.

Society is not currently treating data as infrastructure. We are not giving it the same importance as our road, railway and energy networks were given in the industrial revolution – and are still given now. Good infrastructure is simply there when we need it. We know our data infrastructure is working when it is boring – when we don’t need to think about it.

At the moment, too much of our data infrastructure is unreliable. It doesn’t work easily or doesn’t work at all. Data innovators struggle to get hold of data, to work out what they can use it for, and to know whether it is of reasonable quality or will continue to be maintained. The time and effort that goes into fixing data infrastructure as and when these potholes and dead-ends are discovered could instead be spent building services or finding insights that improve those services.

Our data infrastructure could contribute more value to our economies and our lives than it already does. We should take every opportunity to strengthen it.

Design principles for data infrastructure

A data infrastructure consists of data assets, the organisations that operate and maintain them and guides describing how to use and manage the data. Trustworthy data infrastructure is sustainably funded, and is managed in a way that maximises data use and value by meeting the needs of society. Data infrastructure includes technology, processes and organisations.

These principles for data infrastructure complement our draft principles for personal data, which will help to build services and find insights in ways that people can understand and trust. Data infrastructure is a framework that allows those services and insights to flourish. These design principles should be used when that framework is being assessed or built. We believe that they will help create data infrastructure that generates impact and justifies investment.

1. Be purposeful

There should be a clear focus and remit for any part of our data infrastructure, whether it be a data asset, a piece of technology, a process guide, or an organisation that maintains data. That clear focus and remit should always meet a need. The governance of the data infrastructure should have sufficient control to ensure that the focus and remit are met.

2. Benefit all stakeholders

Data infrastructure is used by citizens, businesses, civil society and governments. Some of our data infrastructure is used by every sector of society, while some is only used by part of a single sector. Data infrastructure should be designed so that all stakeholders can benefit from the services and insights enabled by the data. This will require people building data infrastructure to understand their stakeholders’ differing needs and find shared goals.

3. Invest based on need

We should think strategically and focus investment where it is justified. We should build infrastructure that is reliable and fit for purpose for the people who need it. Sometimes that might be every sector of society. Sometimes it might be a small group of specialised users. We do not build superhighways where we only need country lanes, but sometimes we upgrade our country lanes when we learn that the extra traffic will create sufficient positive impact.

4. Be as open as possible

Data infrastructure should be as open as possible. In the most impactful and valuable data infrastructure, open data is maximised and data sharing is minimised, while what is private remains private. The same is true of the organisations, technology and processes in our data infrastructure. Open source and collaborative maintenance models can create trust and reduce cost. Everyone should be able to contribute to and help maintain the data infrastructure. Everyone should be able to use as much data as possible for any purpose. This will encourage open innovation.

5. Build trust through openness

The organisations in our data infrastructure should operate transparently. People who publish or generate data should understand how their data forms part of our data infrastructure. We should publish information about how data is being used and how it has been collected as open data.

6. Evolve and be dynamic

Expect change to happen, measure impact and learn from mistakes. We should plan for data infrastructure to adapt to meet changing needs, whether they be to support an innovative idea in one area or reduce demand in another; to help collect a new source of data or take advantage of new technology. Sometimes data will no longer be in regular use and should be archived for future historians.

7. Build for the web

Data infrastructure should work as part of the open and decentralised web. Use persistent URLs, linked identifiers and open standards. Reuse, build on or link to things that already exist. We should build this web of data openly and in collaboration with the growing network of data publishers and data users.
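
As a rough illustration of what persistent URLs, linked identifiers and open standards can look like in practice, here is a minimal Python sketch that describes a dataset as JSON-LD using the schema.org Dataset vocabulary. It is one possible approach, not a recommended implementation. The function name describe_dataset and every example.org URL are hypothetical placeholders that we have made up for this example.

```python
import json


def describe_dataset(persistent_url, name, license_url, same_as, csv_url):
    """Build a JSON-LD description of a dataset using schema.org terms.

    All arguments are supplied by the caller; nothing here is looked up
    or validated against a real registry.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        # Persistent URL: a stable identifier that should not change
        # even if the files behind it move.
        "@id": persistent_url,
        "url": persistent_url,
        "name": name,
        "license": license_url,
        # Linked identifiers: point at existing records for the same
        # thing rather than minting a disconnected copy.
        "sameAs": list(same_as),
        # Open standard format for the downloadable data itself.
        "distribution": [
            {
                "@type": "DataDownload",
                "contentUrl": csv_url,
                "encodingFormat": "text/csv",
            }
        ],
    }


# Hypothetical example values (example.org is reserved for documentation).
record = describe_dataset(
    persistent_url="https://example.org/id/dataset/land-parcels",
    name="Land parcels (illustrative)",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    same_as=["https://example.org/catalogue/land-parcels"],
    csv_url="https://example.org/data/land-parcels.csv",
)

print(json.dumps(record, indent=2))
```

Publishing a small description like this alongside the data lets other publishers link to the dataset by its identifier instead of copying it, which is the kind of reuse this principle is asking for.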

Peter Wells is an Associate at the ODI. Follow @peterkwells on Twitter.