R&D: Open data publishing

Data publishers use tools to publish data. Key considerations around data publishing are quality, speed, automation and cost. Find out how we are auditing current tools, reviewing user needs and offering solutions through open source prototypes.

Improving data publishing: why is it important?

Data publishing gets data to people. To be of value, published data needs to be high quality, timely and affordable. However, research shows that only a fraction of the potential value of open data to the EU has so far been realised.

One of the ways to address this – as well as improving data access and usability – is to look at how open data is currently being published, identify problem areas, and develop potential solutions.

This project considers when, why and how organisations publish open data and suggests improvements to current processes.

Data publishing in practice

The Office for National Statistics (ONS) keeps a publicly accessible calendar of recent and scheduled publications, which gives a transparent overview of its publishing process. The ONS also lists a contact email address alongside each dataset, which gives users a feedback loop.

Our approach

As part of our research we audited over 30 data publishing tools and focused on what support we could provide to open data publishers to meet both their own needs and those of data users. Our work explores those needs and suggests new solutions through open source prototypes.

To explore how a good, usable publishing tool could work, we experimented with Octopub, a tool we developed previously to publish data onto the GitHub platform. We used it to prototype new approaches to pre-publishing tasks and quality assurance, as well as to the process of publishing itself. We have now released a new version of the tool and will continue to use Octopub as an experimentation platform.

We are also working with partners to develop other data publishing tools, such as Lintol (an open data validation tool, a bit like a grammar checker for open data) and the Frictionless Data Project, a project to create better open data publication workflows.
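To make the "grammar checker for open data" idea concrete, here is a minimal sketch of the kind of checks a validation tool might run on a CSV file before it is published. It is not Lintol's or Frictionless Data's actual API: the function name validate_csv, the three checks and the sample data are all assumptions chosen for illustration, and it uses only the Python standard library.

```python
import csv
import io


def validate_csv(text):
    """Return a list of (row_number, message) problems found in CSV text.

    Hypothetical helper for illustration only; real validation tools
    run many more checks than the three shown here.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return [(0, "file is empty")]

    problems = []
    header = rows[0]

    # Check 1: column names must be non-blank and unique.
    seen = set()
    for name in header:
        if not name.strip():
            problems.append((1, "blank column name"))
        elif name in seen:
            problems.append((1, f"duplicate column name: {name}"))
        seen.add(name)

    for row_number, row in enumerate(rows[1:], start=2):
        # Check 2: each data row must have as many cells as the header.
        if len(row) != len(header):
            problems.append(
                (row_number, f"expected {len(header)} cells, found {len(row)}")
            )
        # Check 3: flag empty values in the cells that are present.
        for name, value in zip(header, row):
            if not value.strip():
                problems.append((row_number, f"empty value in column '{name}'"))

    return problems
```

Calling validate_csv on the text of a file returns an empty list when none of these checks fail. A publishing tool could run a step like this before upload and show the row numbers and messages to the publisher, in the same way a grammar checker underlines problems before a document is sent.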

Key outputs

Report: What data publishers need. Here we examined how people and organisations create, gather, clean, process, describe, vet and publish open data for others to access, use and share. 

Publishing high-quality open data can still be costly and ad hoc. The ODI is building and improving publishing tools to help speed up and automate the process. Read more about open data publishing tools and processes.

An ODI experiment, Octopub offers a simple way to prepare and check a dataset, and to publish it online on the GitHub platform. Find out more about Octopub.

Lintol: a new system for data validation and cleanup – a bit like a grammar checker for open data.

The Frictionless Data Project: an initiative to create better open data publication workflows.

Background and funding

This work is part of a three-year innovation programme running to March 2020, with funding of £2m each year from Innovate UK, the UK’s innovation agency.

Through our R&D programme, we aim to shape future services and promote productivity and growth with cutting-edge expertise.