Why is open data a public good?
We often refer to open data as a public good, but what does this mean? And what does it imply about how our national information infrastructure should be managed?
When I say that open data is a public good, sometimes people reply, “but people can do bad things with open data (so it’s not really good)” or “but it is more easily used by people who are data literate (so it’s not really public)”. Both statements are true, but they are based on a misunderstanding of what a ‘public good’ is. Wikipedia says:
In economics, a public good is a good that is both non-excludable and non-rivalrous in that individuals cannot be effectively excluded from use and where use by one individual does not reduce availability to others.
In other words, a public good is something that you can’t stop anyone using, and that doesn’t get used up. The examples of public goods that people tend to use are “clean air”, “lighthouses” or “public parks”. Open data also fits this economic definition.
The objections to describing open data as a public good apply equally to public parks. People can do immoral or unlawful things in public parks: they can let their dogs make a mess, deal drugs, mug people. We don’t deal with this misuse by closing public parks or having identity checks at every entrance, but through the same laws and social norms that apply elsewhere. Similarly, misuse of open data such as providing out-of-date flood alerts or misrepresenting statistics can and should be addressed through laws and social norms, not through restricting access to that data.
Public parks, like other public goods, don’t benefit everyone equally: those who live close by and those with children or dogs benefit more than those who live further away or those with impaired mobility. Lighthouses benefit those who own and man ships, and their families, far more than anyone else. Similarly, open data might bring most benefits to those for whom the data is relevant, those who are data literate, or those who already have lots of data. But from an economic point of view, these public goods are non-excludable: once they are available, it’s impossible to prevent others from benefiting from them.
Public goods cost money to create and maintain, but because it’s not just the people who pay for them who benefit from them, it can be hard to get enough people to contribute to their maintenance. This is known as the free rider problem: it’s the feeling “why should I pay when they’re not?” If enough people feel that way, contributions to maintenance fall away, the public good falls into disrepair, and everyone loses.
When it comes to open data, the fear that free riding will cause open data to disappear can become a problem in itself. On several occasions I’ve heard people say they would rather pay for data because doing so reassures them that the data will continue to be available long-term. Who would invest in developing a product or service reliant on a resource that could disappear at any time?
We can learn about how open data should be paid for by looking at how other public goods are maintained. There are several methods:
Government: Funding by government is the usual solution to the free rider problem. It removes people’s individual choices over contributing to public goods: we pay our taxes; government uses the money to create public goods; democracy and accountability act as controls over which public goods are created and maintained.
Collaboratives: Groups can club together to create a public good that all the members benefit from. To make this sustainable, and avoid members reneging on their commitment to contribute (while still being able to benefit from the public good the other members continue to maintain), such groups usually need to have a contractual obligation to ongoing contributions.
Cross-subsidy: When the group that is the primary beneficiary of a public good also has to pay for a private good (something people outside the group don’t benefit from), a portion of the payment for the private good can be used to subsidise the public good.
Volunteering: Volunteering can serve as a powerful mechanism for maintaining public goods. It usually needs to be supplemented with other types of contribution (eg you can’t pay for servers with volunteer time), and it requires an infrastructure that actively provides volunteers with something useful to do.
Social norms: Public goods can be maintained simply through social pressure: creating the public good becomes the Thing To Do (and failing to maintain it is disapproved of); contributing to a public good either becomes a normal cost of doing business or is a target for charitable donations.
For open data to thrive as a public good, we will need to draw on all these models. What you think about how much the taxpayer should fund public goods such as open data will depend on your political outlook. But as the role of the state in society changes, we will certainly see changes in the way open data is maintained: away from government, as public services are mutualised or privatised; towards the crowd, as the internet enables collaboration.
We need to learn which governance models work well together; what guarantees are needed for reusers to trust the supply of open data; and what range of roles government, and everyone else in society, can play. This is especially necessary for the governance of open national information infrastructure, something that ODI will be particularly focused on this year.
Image: Flickr (CC BY-SA 2.0) - Nestor Galina