Up and running for just two years, OpenCorporates is already the largest open database of companies in the world. The project, which up until a few weeks ago was manned by just two people, has data for 49 million companies and aims to have an entry for every single company in the world.
OpenCorporates is the brainchild of former journalist Chris Taggart and Rob McKinnon – both of whom have been running open data projects for a number of years. Chris explains the OpenCorporates set-up: ‘I run it, the others write code and deal with data. We take messy data from government websites, company registers, official filings and data released under the Freedom of Information Act, clean it up and, using clever code, make it available to people.’
The OpenCorporates team was inspired to bring greater clarity to corporate data. Chris explains: ‘when you look at global government data that relates to companies, it is often unclear, incomplete, inaccurate or hasn’t been kept up to date. We also found that the same companies appeared on different government registers without being linked, so there was duplication and valuable connections in the data had been missed. We wanted to change that and we knew there would be a market for it.’
The team began with just three company registers: UK, Bermuda and Jersey. Chris says: ‘From those little bits of data, we now have nearly 50 million companies, with tens of millions of bits of accompanying data. We are being used by all sorts of people and organisations, from journalists to audit companies to tax offices.’
The OpenCorporates project operates what it calls a ‘sharealike licence’, whereby all its data is free for anyone to use. In return, any product of that data must also be open for others to use. Organisations that don’t want to give back data pay OpenCorporates a fee instead. Although Chris and Rob hadn’t projected any income for the first two years, they already have paying customers who are eager for access to their data.
The data and its challenges
OpenCorporates mainly uses company registers, but also takes data from a wide range of other published datasets, both national and global. Chris continues: ‘every night we import the London Gazette, the Belfast Gazette and the Edinburgh Gazette, which is where official insolvency notices are published. Every day we look for the latest Health and Safety Executive enforcement notices and download the latest world trademark register.’ The team also sources data from the UK’s Financial Services Authority, the US’s Central Contractor Registration system, and a wide variety of companies. Chris says: ‘we’ve just started looking at mining licences in Kenya. Basically, any data to do with companies, we want it!’
In terms of challenges, handling such vast amounts of data has caused some problems of technical scale, though Chris says the team has been able to work its way around these. As for many open data companies, the main issue is the release of government data. Chris says: ‘governments have stopped seeing company data as a public register and now see it as a source of income. There are critical datasets that need to be released which are holding things back.’
The ODI and the future for open data
Chris is enthusiastic about the future of open data: ‘I think this is a terrifically important and exciting time. When you look at what the UK has done in terms of open data, I can’t think of any better environment to be starting an open data business. We’ve definitely got a head start here and a lot of civic-minded hackers that really get open data.’
The team also feels that, as well as publishing open data, governments need to start consuming it themselves. Chris adds: ‘government can save money by buying open data rather than closed data’.
In terms of the impact the ODI will make, the OpenCorporates team says its greatest benefit is garnering a critical mass of open data talent. ‘Bringing people together, so you meet other people who are doing similar things, where they can help you, or you can help them’. Being able to utilise the ODI workspace is also really useful for OpenCorporates, particularly as they are taking on new staff and Chris spends a lot of time overseas: ‘I don’t need to worry about whether they know the code for the office alarm!’, Chris says.
Success and the future
There are two achievements that stand out from the rest for the OpenCorporates team. The first was finding out that one of the big audit companies was using their data on a daily basis. The second was being invited to join an advisory panel for some Financial Stability Board work. Chris continues: ‘we had been going for a year, a tiny company, two of us, getting most of our data from government websites, and yet the Financial Stability Board wanted us on its advisory panel for work on a global legal identifier for entities in the financial sector. It showed that we were doing something unique and valued, and that governments are now starting to understand that open data is an essential part of their future.’
Looking ahead, the team feels there is real potential to develop business opportunities with City of London firms. Chris says, ‘open data provides businesses with intelligence on competitors, suppliers and themselves. It’s a huge industry predicated on having an edge, so our data is invaluable.’
OpenCorporates wants to be the OpenStreetMap of the corporate world. Chris ends by saying: ‘we aren’t trying to supplant traditional data suppliers but I think that in five years’ time, those companies will be taking data from us. I foresee every journalist, every newspaper using us every single day – we will be the first place you go to for information about companies. We have just scratched the surface – in five years’ time our data will be 100 times the size it is now!’
Visit OpenCorporates at: www.opencorporates.com
Media enquiries: 07990 804805