Prototyping an AI-ready National Data Library
The National Data Library (NDL) is a proposal that lies at the heart of the UK government’s AI strategy. It is intended to support UK innovation by making the large volumes of public data collected across the UK, including economic, scientific, and environmental datasets held by public bodies, easily accessible to domestic startups and institutions. Combined, this data can form the foundation for academic research, entrepreneurship, and better public services, especially now that AI can make use of it.
However, while the NDL’s potential is clear and government funding is in place, it has remained mostly abstract: a policy idea rather than a practical implementation. The Department for Science, Innovation and Technology has been gathering evidence over the past year, but few public steps have been taken towards building the NDL.
We believe that, unless these first steps are taken, the UK risks losing momentum, not only on the NDL initiative but also in its broader role in the global AI ecosystem.
In this work, we put our words into action by drawing on our institutional expertise and experience to design and implement an AI-ready prototype of the NDL. We collected open datasets from six governmental sources, processed and restructured them according to our framework for AI-ready data, then demonstrated how the resulting data product could be used. In doing so, we identified some of the challenges the government will face and drew three key insights that should be taken into account when the NDL initiative reaches its development stage:
- It’s not difficult to get started; to build a successful prototype, our team worked with limited resources and methodologies, many of them based on open-source infrastructure.
- Public sector data repositories will hold the NDL back if they are not AI-ready; we recommend that any NDL work addresses improving the state of public sector datasets before they are made accessible through the NDL.
- AI agents find government data hard to use, often choosing to look elsewhere; for cutting-edge AI agents to be incentivised to use only trusted, authoritative government data sources, we believe it is the government’s responsibility to ensure those data sources are robustly accessible and AI-ready.
Our prototype, NDL-lite, is a tangible, affordable and fast first step that moves the NDL away from concept and towards reality. By publishing both the prototype and the code underlying its creation, we hope to provide a focal point for redesign and iteration towards a fully functional NDL.
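To make the collect, process and restructure steps more concrete, here is a minimal sketch of how dataset records from several sources might be screened before entering a shared catalogue. It is an illustration only, not the NDL-lite code: the source names, the metadata fields, the list of accepted formats and the function names are all hypothetical assumptions, and the actual criteria of the AI-ready data framework may differ.

```python
from dataclasses import dataclass, field

# Hypothetical metadata fields and formats, chosen for illustration only.
# They are assumptions, not the criteria of the AI-ready data framework.
REQUIRED_FIELDS = ("title", "description", "licence", "publisher", "format")
MACHINE_READABLE_FORMATS = {"csv", "json", "parquet"}


@dataclass
class DatasetRecord:
    source: str                      # where the dataset was collected from
    metadata: dict                   # raw metadata as published by the source
    issues: list = field(default_factory=list)


def check_ai_readiness(record):
    """Return a list of issues; an empty list means all checks passed."""
    issues = []
    for name in REQUIRED_FIELDS:
        if not str(record.metadata.get(name, "")).strip():
            issues.append(f"missing {name}")
    fmt = str(record.metadata.get("format", "")).lower()
    if fmt and fmt not in MACHINE_READABLE_FORMATS:
        issues.append(f"format '{fmt}' is not machine-readable")
    return issues


def build_catalogue(records):
    """Split records into those ready for the catalogue and those needing work."""
    ready, needs_work = [], []
    for record in records:
        record.issues = check_ai_readiness(record)
        (needs_work if record.issues else ready).append(record)
    return ready, needs_work


# Two made-up records standing in for datasets from different public bodies.
records = [
    DatasetRecord("example-source-a", {
        "title": "Air quality readings",
        "description": "Hourly readings from monitoring stations",
        "licence": "OGL-3.0",
        "publisher": "Example Body",
        "format": "CSV",
    }),
    DatasetRecord("example-source-b", {
        "title": "Annual report tables",
        "format": "pdf",
    }),
]

ready, needs_work = build_catalogue(records)
for record in needs_work:
    print(record.source, "->", "; ".join(record.issues))
```

In this sketch the second record is held back because it lacks a description, licence and publisher and is published as a PDF, which reflects the second insight above: datasets need to be improved at the source before they are made accessible through a shared library.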