How Facebook, Apple and Microsoft are contributing to an openly licensed map of the world

Multinational organisations are collaborating in the open to build an openly licensed map of the world: OpenStreetMap. Here’s what we’ve learned so far about what makes this kind of collaboration work best

OpenStreetMap, launched in 2004, has grown into one of the most successful collaboratively maintained open datasets in the world. Today, contributors to the maps include not just keen local mappers, but also a diverse mix of non-governmental organisations, humanitarian organisations and commercial organisations, including large multinationals.

At the State of the Map conference in Milan, the teams from Microsoft, Apple and Facebook presented their projects, describing how they are working with communities.

There were many common factors that made these projects successful. These included:

  • Understanding the policies of the global and local OpenStreetMap community and aligning the company’s own policies with them, to see how they can work together towards common goals
  • Publicly documenting their plans and goals so that the community was clear about what problems were being addressed and why
  • Becoming contributing members of the community, making sure that each contributor has a clear affiliation and has been trained to work effectively with the community
  • Engaging with the community wherever they are: on mailing lists, through comments on updates, via GitHub or through a mix of social messaging applications
  • Listening to the needs of the community and supporting them to fix problems with their local maps, eg helping to fix local data quality problems, like incomplete or inaccurate areas, and not just focusing on their own objectives
  • Being responsive to questions and concerns when raised, and importantly taking on feedback from local expert contributors to help improve how they work
  • Supporting the community in advocating for more open data from local governments and other organisations, and helping them solve data licensing issues
  • Publishing data and code under open licences so they can be used as resources by the community
  • Taking on some of the maintenance and mentoring work to help support the community, eg by reviewing edits and helping to improve quality issues that might otherwise be left to the community mappers

Each of these organisations is meeting their own business objectives by contributing to OpenStreetMap. And it’s also helped them to better serve their customers and users.

Working in the open

These approaches go beyond the evolving guidelines of the OSM community and demonstrate a commitment to working in the open, in collaboration and partnership with the OpenStreetMap community. Any of these organisations has the resources to create an independent initiative, but they have instead chosen to engage with an existing project.

Working in the open, to deliver equitable value from data, can help large organisations like Facebook to build trust with their users.

There are perhaps some lessons here for other collaborative, community-owned open data initiatives. For example, be clear about how and where you would like commercial support and contributions, and about the means by which organisations can engage with your work.

For example, OpenStreetMap UK was formed to make it easier for organisations of all types to engage with the UK OpenStreetMap community. Having a clear contact point can help organisations who are not used to working in the open to reach out to existing communities.

Our national and global data infrastructure can be stronger if we use open data, open source and open innovation as a tool for collaboration. OpenStreetMap is proving to be a great example of how this can work in practice.

State of the Map

At the ODI we work with companies and governments to build an open, trustworthy data ecosystem. To do that we need to build a sustainable and well-managed data infrastructure that creates the best social and economic value from data for everyone.

As part of our R&D project on open geospatial data, we attended the State of the Map conference in Milan. Our goal was to learn more about the OpenStreetMap ecosystem and how a diverse mix of communities, organisations and governments are collaborating to create an open map of the world.

You can read more about what we learned in our previous blog post. In this post we wanted to share a bit more about how large organisations like Microsoft, Apple, Facebook and Telenav are working in the open with the OpenStreetMap community.

If you work for a commercial organisation working with OpenStreetMap data, especially in the UK, we’d love to hear about your experiences working with the OpenStreetMap community, the data and the value you’re creating from openly licensed geospatial data. Contact Leigh Dodds

Image: by Pixabay (cc by 1.0)

Our research and development work on geospatial data and mapping

The problem with social numbers driving self-worth

A tweet from grime artist JME prompted Head of Content Anna Scott to consider how data can skew the presentation and interpretation of art, friendship and society

It’s a familiar scenario. I’m on a bus home to South London from a couple of post-work drinks with a friend, observing the people around me, listening to music, reading a little and, inevitably, scrolling through things on my phone with no particular intention.

Passively sated in this half-conscious, automatic act, it’s jarring when something pops up that points out its absurdity. Scrolling through Twitter, no doubt motivated – however consciously – by the prospect of a few new followers or a few more likes or retweets, a video pops up from the grime artist JME with a simple message: “Get rid of counters on social media please.”

It’s an interesting ask from someone with just shy of a million Twitter followers himself.

“We need to have a full year on no stats, no visible stats, no friend counter, no like counter, no view counter. No numbers. It’s social media – it’s social. We don’t socialise with numbers,” JME explains to camera.

“When I meet my friends to go and eat I don’t ring each group […] and go to the group with [the] most people there. I go to meet my friends that I like because I like them, not because they’ve got more friends with them […] We don’t have numbers when we’re being social in real life. Online we’re governed by numbers and it’s so hard to ignore them.”

JME goes on to admit that even he finds it hard to promote a song he loves on YouTube, if it has only 50 views. He adds that brands – who’ve become so accustomed to working without scrutiny with ‘influencers’ with the highest numbers of followers – should be digging deeper to find who they really want to work with and why that is.

It’s a refreshing angle, and strangely comforting to hear someone talk so honestly about the tension between the opposing feelings of validation and comfort, and alienation and self-awareness, that come with living our lives online.

It put me in mind of conversations we have a lot at the ODI about ‘data and the self’, both informally and through projects we work on, like our Data as Culture and Research and Development programmes.

One of the best parts of working in an open office with inquisitive people is that rich discussions happen all the time between us on the desk, in meetings and over our instant messaging platform, Slack. When I suggested writing a piece about this in a message on Slack to my editorial assistant (and friend) Steffica Warwick this morning, it struck a chord.

“I think there’s also something in there about authenticity,” Steff said.

“We want more data, because we want an accurate picture of how the world really is. We need it to help us understand the world better – to identify and solve problems. So it’s so dangerous when data is used to skew the world around us.

“Having a huge amount of Instagram likes implies that people like the content, but that’s not true when the followers have been paid, or are bots, or have lots of money pushing a marketing campaign behind them. And it makes people feel bad about themselves, but it’s not an accurate reflection of reality. And it also shouldn’t matter, because our obsession with quantifying our lives to measure happiness or success (in ways we never did before) puts us at risk of neglecting quality.”

Our Data as Culture art programme, curated by Julie Freeman and Hannah Redler Hawes, commissions artists and works that use data as an art material.

In its current exhibition, entitled ‘😹 LMAO’, works have been selected for their playful yet critical approach to data and its uses. Irreverent, provocative, unconventional and plain silly, they ask us to challenge our preconceptions of data, and consider the humanity behind our technologies. Participating artists poke fun at the ineptitude of Google’s image search capabilities or the expectation that ‘big data’ will predict the future.

For me, the work that particularly stirs that same uncanny tension of comfort and self-awareness is Ceiling Cat, Franco and Eva Mattes’ physical Internet meme (not least because I sit directly underneath it).

The half-hidden real cat’s omnipresence ensures that we remember to reflect on both sides of the data story. As the artists say: “It’s a taxidermy cat peeking through a hole in the ceiling, always watching you. It’s cute and scary at the same time, like the internet.”

For our Research and Development programme, we’ve been exploring multiple scenarios of imagined futures to help us understand how we move towards a world where people, organisations and communities use data to make better decisions, and are protected against any of its harmful impacts. Some of these are pretty dystopian, with a view to stimulating discussion and debate about where we’d like to get to.

One of these imagined futures is a reputation barometer: a service that measures and monitors an individual’s standing by giving an overall score for their reputation. This reputation score would be informed by behaviour in renting or letting properties on peer-to-peer accommodation platforms. It could be used by other users of the platform or services outside the sector to make decisions about whether and how to interact with that person.

We explored how the reputation could be represented, for example as a whole number on a scale that ranges from 0 to 600. This ‘score’ is assigned to an individual through the assessment of a number of data points, building a picture of an individual’s trustworthiness.
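To make the idea concrete, here is a minimal sketch in Python of how such a score might be derived. Everything in it is an assumption for illustration: the data point names, the weights and the `reputation_score` function are hypothetical, and the scenario does not specify how the data points would actually be combined. The sketch takes a weighted average of data points (each rated between 0 and 1) and maps it onto a whole number from 0 to 600.

```python
def reputation_score(data_points, weights, scale_max=600):
    """Combine rated data points into a whole number from 0 to scale_max.

    data_points: dict of name -> rating between 0.0 and 1.0 (hypothetical inputs)
    weights:     dict of name -> relative importance (hypothetical values)
    """
    total_weight = sum(weights[name] for name in data_points)
    if total_weight == 0:
        return 0
    weighted_sum = sum(data_points[name] * weights[name] for name in data_points)
    fraction = weighted_sum / total_weight
    # Clamp so that out-of-range inputs cannot push the score off the scale.
    fraction = min(1.0, max(0.0, fraction))
    return round(fraction * scale_max)


# Illustrative values only: these names and numbers are not from the scenario.
weights = {"reviews": 3, "cancellations": 1, "payment_history": 2}
host = {"reviews": 0.9, "cancellations": 0.5, "payment_history": 1.0}

score = reputation_score(host, weights)
print(score)
```

Even in this toy version, the choice of data points and weights decides what counts as trustworthy behaviour, which is where much of the debate about such a service would sit.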

Image: An illustration we commissioned from design collective Du.st to portray the Reputation Barometer potential future.

If you’re reminded of the ‘Nosedive’ episode of Black Mirror – where people can rate each other for every interaction they have, impacting their socioeconomic status – I’m with you.

We will be exploring some of these concepts, and related issues of how we measure progress towards diversity and fair representation, at our ODI Summit in November this year.

The theme of the summit is around ‘data and value’ – how we can create value (whether economic, social or environmental) with data, as well as embed our values within data.

Along with equity and fairness in broadening data’s benefits for society, we will look at how we can improve trust in data and tech, and how to be ethical as well as innovative. We’ll also be asking how and why we measure diversity. What are the unintended consequences that can come with particular methods, and how can they be worked on? How can the insights we gain, either as organisations, governments or communities, be communicated and acted upon constructively? How can we quantify people while respecting their individuality, community, and equality?

We’re keen to hear your ideas for people and concepts to include. Please share any you have with us at [email protected], and feel free to tweet me at @anna_d_scott.

Data’s value: how and why should we measure it?

Is it possible to measure the value of data? Many now recognise how important data is, but how it should be governed and regulated is often confused by a lack of consensus on how it can be valued

By Ben Snaith, in collaboration with Peter Wells and Anna Scott

The ODI has a longstanding interest in the challenge of how to value data, alongside our mission to build an open, trustworthy data ecosystem.

On 18 July, we held a workshop – gathering people from national and multinational public sector organisations, academia, big businesses, startups, philanthropic funders and venture capitalists – to explore this hotly debated subject. The workshop was hosted by the ODI CEO Jeni Tennison and Diane Coyle from the Bennett Institute for Public Policy. Attendees included entrepreneur and writer/curator of the Exponential View newsletter Azeem Azhar; Jonathan Haskel from Imperial College Business School – who recently co-wrote a book on the intangible economy; and Will Page, Director of Economics at Spotify.

Stifled innovation and unaddressed problems for citizens

Trust in data itself has been undermined by recent global events. In light of the changing landscape, there has been debate around how to protect citizens from data exploitation, while continuing to get value from data. This dilemma can affect everything from whether there is a list of postal addresses in a country, freely available for people and companies to use and innovate with, to how we respond to situations such as the recent Facebook/Cambridge Analytica scandal.

Until society makes more progress in learning how to value data, innovation will continue to be stifled and problems that affect citizens and consumers will remain unaddressed.

A 2017 report on the transport sector, produced by the ODI and Deloitte, illustrates why data sharing is so important. It states that an estimated £15bn is not being realised due to three main reasons: siloed thinking; a fear of breaching privacy, security and safety; and a belief that the costs of sharing data outweigh the benefits. That last belief would be easier to challenge if we had a better understanding of how to value data.

The recent Wendy Hall and Jérôme Pesenti independent review of AI for the UK government calls for data trusts, meant here as proven and trusted frameworks and agreements, to ensure that data exchanges are “secure and mutually beneficial” for all stakeholders – including organisations and citizens. The UK government is already acting upon this recommendation, but without knowing how to determine the value of data, how can we expect it to be fairly distributed?

We convened the workshop to enable people to share their perspectives on the value of data and the policy implications, and address open research questions. We hoped the discussions and insights would also inform current research at the ODI and the Bennett Institute, and identify research questions for future work. We discussed various timely issues, from the changing business landscape to data’s use determining its value.

The most valuable companies now rely on data

The six most valuable companies in the world are now technology companies that rely upon data, while the companies dislodged from the top are now attempting to catch up and will need data to do that.

Data network effects and the AI lock-in loop are affecting market competition by creating new barriers to entry. These two effects are inherently linked: the data network effect is when a product becomes smarter the more it is used and the more data it receives from users. The AI lock-in loop is the idea that this better product will then attract more users and therefore keep exploiting the network effects to improve. The loop will continue and make it increasingly difficult for new entrants to join the market.
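The loop described above can be illustrated with a toy simulation in Python. This is a sketch under stated assumptions, not a model of any real market: the function name `simulate_lock_in`, the starting figures and the rule that a product's pull on new users grows faster than its quality (the `exponent` parameter) are all hypothetical. Quality is assumed to equal the amount of usage data a product has accumulated.

```python
def simulate_lock_in(incumbent_data=2000.0, entrant_data=1000.0,
                     new_users_per_round=1000.0, rounds=10, exponent=2.0):
    """Return the incumbent's share of new users in each round.

    Toy assumptions (not from the article):
    - a product's quality equals the usage data it has accumulated
      (the data network effect);
    - new users split between products in proportion to quality ** exponent,
      so better products attract disproportionately more users
      (the lock-in loop);
    - each new user contributes one unit of data to the product they choose.
    """
    shares = []
    for _ in range(rounds):
        incumbent_pull = incumbent_data ** exponent
        entrant_pull = entrant_data ** exponent
        incumbent_share = incumbent_pull / (incumbent_pull + entrant_pull)
        shares.append(incumbent_share)
        # More users means more data, which feeds back into quality next round.
        incumbent_data += new_users_per_round * incumbent_share
        entrant_data += new_users_per_round * (1.0 - incumbent_share)
    return shares


shares = simulate_lock_in()
for round_number, share in enumerate(shares, start=1):
    print(f"round {round_number}: incumbent wins {share:.1%} of new users")
```

Under these assumptions the incumbent's share of new users rises every round, which is the barrier to entry the paragraph describes. With `exponent=1.0` the split stays constant instead, so how strong the loop is depends on how sharply users respond to quality.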

The firms that were first able to capitalise on these effects – such as Uber, Netflix and Facebook – are now in strong market positions.

While older algorithmic approaches tend to reach a performance limit (at which point adding more data is futile), newer AI methods – such as deep learning and neural networks – seem to continue to improve, no matter the volume of data added (as shown in the diagram below). This is one of the factors driving the push for increased data collection. It will also continue to affect which data is seen as most valuable, which is likely to be the data most useful for deep learning.

Image: Andrew Ng, from a post on lessons learned from his deep learning course

Data’s value is more than the sum of its parts

There are many issues with the oft-quoted adage that ‘data is the new oil’. The two differ in many characteristics: data is superabundant compared to the finite oil supply, and is non-rivalrous in nature.

Data, as an intangible asset, is characterised by externalities or spillovers, whereby its production or consumption benefits or harms third parties. Managed well – and in a way that reduces harm – data can be an important public good, which can help businesses to innovate and grow. Data is more valuable when it is used than when simply hoarded.

There is a paradox to be resolved that is central to the issue of valuing data. A single piece of data only becomes useful – and therefore valuable – once combined with other data. The Strava heatmap demonstrates this. Individual fitness routes don’t tell us much apart from an individual’s habits, but when every Strava user’s routes are combined we can learn a great deal about the popularity of certain areas and general trends across a population. This is valuable to local communities who can better support running infrastructure and planning, but the map could also be used to identify secret military bases across the world. Hence, with the power of stewarding large datasets comes great responsibility.
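A small Python sketch shows why the combined routes say more than any single one. It is not how Strava builds its heatmap: the `build_heatmap` function, the user names and the coordinates (whole metres from an arbitrary origin) are hypothetical, chosen only to show routes being binned into grid cells and counted across users.

```python
from collections import Counter


def build_heatmap(routes, cell_size=100):
    """Count how many distinct users passed through each grid cell.

    routes:    dict of user -> list of (x, y) points, in whole metres
               from an arbitrary origin (hypothetical sample format)
    cell_size: width of a square grid cell in metres
    """
    counts = Counter()
    for points in routes.values():
        # A set ensures each user is counted once per cell, however many
        # of their points fall inside it.
        cells = {(x // cell_size, y // cell_size) for x, y in points}
        counts.update(cells)
    return counts


# Illustrative routes only.
routes = {
    "user_a": [(0, 0), (120, 40), (250, 60)],
    "user_b": [(130, 20), (260, 90), (400, 400)],
    "user_c": [(110, 10), (150, 80)],
}

heatmap = build_heatmap(routes)
print(heatmap.most_common(2))
```

Each route on its own touches only a few cells, but the combined counts show which cell every user passes through. The same aggregation also exposes cells visited by a single user, which is the kind of detail that makes stewardship of the combined dataset a responsibility.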

But if data cannot be valued as the sum of its parts – how can it be valued? Some have suggested that data marketplaces, such as Ocean Protocol, will be key to addressing this challenge.

But not every country has a market economy; different countries place different emphases on the competing rights of individuals, companies, communities and governments to data.

Governments and businesses will use the same data for different purposes. To put it another way, they have different desired outcomes, economic systems and social contracts. The Chinese government’s mechanisms to value data seem different to those of the USA and European nations. As China’s influence grows, will this have an effect on how other countries value data?

The overall benefits of treating data as infrastructure are increasingly felt. Firms that invest in improving their data infrastructure will gain a competitive advantage as the next generation of public and private services increasingly rely upon data.

Jonathan Haskel’s previous work helps us to understand how to quantify the value of data to the economy by counting up the costs it incurs. Yet it is much more difficult to quantify the effect on GDP of investment in data than it is to quantify investment in more tangible assets, such as machinery – this is yet more value that is not fully captured. Similarly, Diane Coyle had previously concluded that current GDP estimates were failing to include the full extent of digital activities, or measure the further value when data is resold or reused. Moving forward, new methods will have to be adopted to get a truer estimate of the value of data.

The shifting value of consumer data: from CDs to Spotify

Spotify’s new dashboard for musicians lets them view data, such as their listening figures, number of followers, the age, gender and location of their listeners and much more. This has value for the artist, who can use it to make better decisions – such as when to tour, who to collaborate with and when to release new music. This was data that was previously either uncollected or held closely within music companies; now it can create more value. Because this data is only available to the artists, and not to the public, its value is difficult to calculate.

The methods of measuring the value of the music sector have also changed. When music was listened to through CDs, cassettes and vinyl, analysis could inform us who was buying what album at which store, but little more. The analysis of streaming data offers the opportunity to get a much richer understanding of how music is being consumed. A 2012 study found that the UK music industry was worth £3.2bn more than previously thought. But there are still challenges in how comparisons between online and physical consumption are made and how to best use this new source of data.

We must collaborate to find new methods of measuring data’s value

We are just at the start of realising the full potential of data for society and the economy, and this will be driven by vastly increasing volumes of data being collected, stored and used. It is, therefore, increasingly important to find a way to measure data’s full value, as it is clear that our current methods are insufficient.

For this change to happen, we need to work collaboratively and push for engagement between people with different expertise – this is an issue that needs to be addressed by policymakers, economists, technologists, business and wider civil society.

Jeni and Diane will be developing a research agenda to explore the ideas raised further, with a particular focus on the measurement and regulatory aspects of valuing data.

If you want to discuss this topic more please let us know at [email protected]

Can data infrastructure help fix travel woes in the north of England?

Transport is in crisis in the north of England. Cancelled, overcrowded and slow trains cause misery for millions of people. Both central and local governments agree that we need greater investment in, and local control over, transport services and infrastructure.

But we need more than physical transport infrastructure. Investment in data infrastructure (invisible but vital) is also needed to both solve the current problems and unlock the next generation of transport jobs and services.

UK transport industry challenges

Currently in the UK we lack access to good quality data about: national transport investment; the availability and capacity of transport services; the volume of desired and actual passenger journeys; congestion on roads; and even the price of bus journeys and which company offers them.

We need access to robust data to make good decisions – ranging from which route to take to work, through to public sector decisions about where and how to target investment.

Transport companies are not sharing data effectively, which has a detrimental impact on services. People who want to travel, whether it be from one side of Manchester to another or a longer trip from Newcastle to Blackpool, struggle to find the quickest and most cost-effective route. If they choose to travel by public transport then they want to buy a single ticket regardless of whether this involves a combination of metro, train, bus or taxi. They just want to get from A to B.

In our 2017 paper, The case for government involvement to incentivise data sharing in the UK intelligent mobility sector, co-authored with Deloitte and the Transport Systems Catapult, we found that organisations weren’t sharing the necessary data to enable effective, joined-up services. We reported that unless action was taken, the UK would lose £15bn of potential benefits by 2025. Some of this action could be taken by individual companies, but some needed public sector support.

We also explored the ‘human elements’ of sharing transport data in our report Personal data in transport: exploring a framework for the future. We found that alongside inspiring innovation and the creation of better services, data sharing highlights the need for organisations to address critical questions of trust, ethics, equity and engagement in how data is used. This will become increasingly important as people grow more aware of data issues and as they gain more control over data about them.

Meanwhile, other countries are taking steps forward:

  • France is at the forefront of driverless metro technology as detailed in our report Transport data in the UK and France.
  • In the USA and China, governments and companies are investing heavily in driverless cars, with trials taking place in multiple cities.
  • In Germany, which has its own strong car industry, the Ethics Committee of the Federal Transport Ministry has published a code on automated driving, describing the role of data and providing guidance on vehicle behaviour.
  • New York’s Taxi and Limousine Commission has introduced rules requiring rideshare companies, like Uber and Lyft, to share detailed data about their journeys so that it can improve road planning.

The UK is currently behind the curve on realising the benefits of data in transport, and alongside the poor service provision, we risk missing out on the next generation of jobs and the tax revenue that comes with them.

Tackling these challenges requires investment in data infrastructure

Data is an emerging, if invisible, form of infrastructure that every sector of the economy relies on. Good infrastructure is there when we need it but, at the moment, too much of our data infrastructure is unreliable, inaccessible, siloed or is not freely available. Data innovators struggle to get hold of data and to work out how they can best use it, while individuals do not feel that they are in control of how data about them is used or shared.

Without a reliable, maintained and far-reaching rail and road network, the ability of individuals and businesses to move around is restricted, communities become isolated, and the transfer of knowledge and ideas becomes difficult. It is the same with a restricted data infrastructure: innovation is restricted; services become biased around pockets of information; and communities can remain unrepresented.

Open data is the foundation of this emerging vital infrastructure. Much of the data that helps with our public debates – such as the information about spend and congestion that helps us make investment decisions, and our personal decisions, such as the price of a bus journey – should be open for all of us to use.

To realise these benefits, we need private sector transport providers, central government and local transport authorities to open up more data.

For over 10 years, Transport for London has been openly publishing data (timetables, service status and disruption information), driving operational efficiencies, increasing use of the service and generating £130m of economic benefits in job creation and faster journeys. Encouraging similar initiatives across the UK will need more involvement from private sector transport providers, who are more dominant outside London. The benefits should be clear, but if private sector providers cannot see them then government may need to intervene, either directly or by giving more powers to local regulators.

Other parts of data infrastructure need a different kind of investment. Open standards for data describe not just the format of data but also the rules by which it can be shared, and with whom. We need the sector to work together to create better standards for transport data. Those standards might describe how personal data can be collected, shared and used. They should be accompanied by guidelines on the ethical questions to be asked and publicly debated as new services are being designed and built.

Accompanying both open data and open standards for shared data is the need to invest in data governance. Governance is needed to help align investment in data infrastructure with what people need, to ensure that data is made available in a way that creates equitable outcomes, and to help ensure that data is not misused. We are all still working out at what level – city, sector, national and global – this data governance is needed, but the UK’s existing transport regulators, like many other regulators, are generally unfamiliar with this type of data governance work and will need more skills if they are to take on the role.

It is investments and interventions like this that will tackle the data challenges that we identified in our reports and help both create new jobs and improve our transport services.

The north led the way on transport in the Industrial Revolution. Can it do the same again?

Investing in data infrastructure alongside physical transport infrastructure is essential in the 21st century. It can:

  • unlock innovation.
  • create a share in what the Transport Systems Catapult estimates to be a £900bn industry by 2025.
  • help to create better services, such as buying a single ticket, regardless of how many transport providers we use on a journey.
  • create more trust in new technologies like ride sharing services and driverless cars.
  • help us to make better transport investment decisions.

This requires a brave and fearless approach. Citizens want and deserve better services but many citizens are also fearful of data issues. Companies also instinctively want to keep data to themselves and are often uneasy about making data more accessible.

We need to be open with data and open minded to ideas; to win over passengers’ trust; to give them more control over and access to data; and look to a future where complaining about delayed trains and badly scheduled buses is no longer a British pastime.

When George Stephenson was designing and building The Rocket in Newcastle in 1829, he was helping launch a new era of transport services – the railway – but he was also building on the innovations of previous generations. The Rocket was tested on tracks that had been built for horse-drawn wagons carrying coal from mines to ships. The railways amplified the benefits of the Industrial Revolution, led to a wave of change that improved people’s lives, connected cities together and led to massive job creation in the north of England.

We are now living through a similar wave of change created by the invention of computers, the internet and the web. The north needs greater investment in transport infrastructure but this won’t deliver all of the potential benefits. Can the north repeat its 19th century trick in the 21st century, but this time by investing in data infrastructure alongside its transport infrastructure?

Whether you’re a transport business, a transport regulator or the inventor of the next Rocket, if you want help with tackling your own transport data challenges then get in touch with the ODI at [email protected]

First published on Transport-Network.co.uk

State of the Map 2018: what we learned about open geospatial data

As part of our R&D project exploring open geospatial data, we decided to attend this year’s State of the Map conference.

It gave us the opportunity to strengthen ties with both the UK and international OpenStreetMap (OSM) community while learning more about the ecosystem that has built up around the project. We were amongst more than 400 attendees, representing over 150 organisations from 56 countries.

Key themes

Several themes came up in the workshops, discussions and presentations we went to over the course of the weekend.

Working towards greater diversity and inclusivity. One of the OSM values is inclusivity. The community works to improve the diversity of its contributors (currently mainly degree-educated European males). The barriers to achieving a fully diverse and representative community came up in the talks. The potential for OSM and the ability to crowdsource other types of geospatial data, eg to tackle social gender and accessibility issues, were powerful examples of the data’s breadth of use.

Many types of organisation are interested in OSM. While the diversity of OSM contributors needs improving, there was a diverse mix of organisations represented by attendees. These ranged from social enterprises and startups to government organisations and large corporates. The OSM community were encouraged to connect with and learn from the broader open data and open source communities.

The power of the local community. The unique selling point of OSM is its foundation on local knowledge. Through local knowledge, insight and mapping, the data gets more accurate. As the quality improves, maps will attract an increased level of trust and wider use. The OSM community is actively exploring how machine learning can be used to improve the data’s speed, quality and detail. But their approach is to use these new tools to support local knowledge while encouraging continued community ownership of the data and ensuring sustainability.

Humanitarian uses. One of the first applications of OSM for humanitarian aid was following the Haiti earthquake in 2010. The power of OSM in countries with poor national mapping is vast, and ranges from enhancing knowledge-sharing and community empowerment in Mozambique and supporting the elimination of malaria in Botswana, to tackling FGM in Tanzania.

Transport applications. Many of the sessions called for additional community support and discussion around improving the quality of OSM to better support transport applications, eg marking entrances to bus/train stations and interchanges. Others described how they were enhancing OSM for use in long-term transportation planning or navigating the complexity of freight movements. Those sharing their experience found common issues in global data standards to support updates.

What we observed

Hearing and discussing experiences at State of the Map sparked a number of thoughts for us.

  1. The OSM community is truly open – all of the talks during the event involved a mixture of open source, open data and open forms of collaboration. The community is using a variety of means to work together and in partnership with larger organisations. During an open debate about the future of OSM, we saw the community consider how it can communicate its value better.
  2. Crowdsourced does not mean low quality – there are many tools, techniques and processes behind the scenes to validate the data input to OSM. To the new or occasional contributor, these processes aren’t immediately obvious. There’s a lot to be learned about how the community monitors and tries to improve the quality of the map. The community and its wider ecosystem are constantly creating and improving their data-quality tools, allowing them to deal with a continuous stream of contributions. For example, Mapbox introduced the quality tools that allow them to review 80,000 changes each day. In their view, “OSM data is eventually consistent”.
  3. OSM is not just for hobbyists or humanitarian applications – there’s a rich ecosystem of organisations and communities around OSM. Startups and small businesses spoke about how they were using OSM data or creating new products and services around it. It was also great to hear how Microsoft, Apple, Facebook & Telenav are all contributing to and using OSM in their mapping applications.
  4. We can do more to communicate the value of OpenStreetMap – the variety of sectors and important services and tools now reliant on OpenStreetMap shows its value to society, the economy and the environment. The OpenStreetMap community is keen to explain the value of the data, by demonstrating its flexibility to create more custom, tailored maps, for example.

We were welcomed into this community with open arms and we have come away feeling inspired. The conference gave us a lot of insight into OpenStreetMap and its community and we’ll be using this to help shape the next phases of our project.

If you’re an SME or startup based in the UK that uses or contributes to OpenStreetMap, we would love to hear from you and learn from your experiences. Contact Deborah Yates at [email protected].

The ODI responds to House of Lords call for evidence into internet regulation

In March 2018 the House of Lords Select Committee on Communications invited contributions to its inquiry on the regulation of the internet.

Our response is focussed on data and openness. Our vision is for people, organisations and communities to use data to make better decisions and be protected from any harmful impacts. We work with governments and businesses around the world to deliver on this vision.

Beyond ownership: we need better control and rights over data

The new UK Centre for Data Ethics and Innovation is starting to take shape. It recently appointed a chair, Roger Taylor, and published a consultation to help determine its role, activities and operating model. The consultation is open until 5 September 2018. We’ll be submitting our thoughts and would encourage other people to do the same.

The centre also asked our CEO, Jeni Tennison, to write a short discussion paper about intellectual property and data ownership. This is what she wrote for them.

A paper for the UK Centre for Data Ethics and Innovation

Data is a new form of intangible infrastructure that underpins every sector of our society and economy. It increasingly supports the fundamental services and systems that enable our economies to function, allow us to communicate, and improve our lives. Like other infrastructure, we have to design our data infrastructure to meet our societal and economic needs. That includes designing the laws and institutions that govern who can own — or more broadly control — our data infrastructure, and what limits are placed on them.

There are five types of overlapping rights over data in legislation:

  • IP (intellectual property) rights
  • data protection rights
  • data access rights for people and businesses
  • government’s right to data to carry out its democratic responsibilities
  • citizens’ rights to data about their governments

These rights have different motivations, implementations and challenges.

Intellectual property rights (IPR) grant controls to data creators and maintainers. They were invented to encourage people to innovate by ensuring they can benefit financially from the intangible assets they create. While copyright over creative works is recognised globally, only the European Union and a few other countries grant database rights to those who invest in creating and maintaining databases. More traditional IPR tends to expire after a period of time, during which it is expected that the creator has received adequate reward for this investment; database rights are designed to reward ongoing maintenance and curation of the intangible asset. Creators get these rights automatically, by law; except for some special circumstances, others must get permission from the rights holder (through a licence) to use that intellectual property.

There are several challenges in applying existing intellectual property regimes to data. The cost of investment in creating some forms of data is minimal but at the same time, data is most valuable when it is combined with other data. If accessible, data will frequently validate, augment or be brought together with other datasets in ways that are unanticipated and that unlock value that would never otherwise be realised. The resulting data may itself be valuable as input for another process. But a lack of clarity in legislation, and of case law to provide precedents, makes it hard to know whether particular uses of data are lawful or not, and the degree to which the holder of rights in one set of data can assert rights in data derived from it. Equally, new technology models, such as the Internet of Things and Artificial Intelligence, combine ownership of physical things, data IPR and software IPR in ways that may not benefit consumers and that make it difficult to determine how to fairly reward multiple creators and maintainers. It is unclear whether the existing IPR framework around data is really stimulating innovation, and whether it is rewarding value creation or value extraction.

Data protection rights give controls to individuals to allow or prevent the collection and use of data. They are designed to enforce human rights, primarily the right of privacy, and have been strengthened as the use and capability of technology have advanced. Their aim is to protect citizens and consumers, and reduce bad use of data and bias. The most recent iterations of data protection rights in the EU and UK are the General Data Protection Regulation (GDPR) and the 2018 Data Protection Act (DPA). Following the recent Facebook/Cambridge Analytica scandal there have been calls for these rights to be strengthened further to give individuals ownership of data about them in a model akin to property rights. This would be a significant departure from Europe’s current rights framework and create new difficulties. Proposed data ownership models, current data protection legislation and most online services take a hyper-individualistic approach. They give control to single individuals while most data is about multiple people, whether friends and families in social media data or the groups of people that can be identified and targeted by data analytics techniques.

Data access rights aim to encourage competition and innovation by giving individuals and organisations the right to access data within a defined framework. The GDPR and DPA both contain rights of data access and portability for individuals. These data rights enable people to use data themselves or provide it to trusted third parties who can perform analysis or build services using it. There are also data access rights within specific sectors such as banking, in which payments providers must provide access to data under the EU Payment Services Directive 2 (PSD2). The UK open banking initiative complements PSD2 in the retail banking sector and the UK Government is exploring building a similar initiative in the energy sector. There is still work to do to understand the risks of data access rights, their impact on the market, and government’s role in making them work.

Governments have rights to data to allow them to perform their democratic responsibilities to their citizens. For example, health services can access data to reduce the chance and impact of outbreaks of infectious diseases; national statistics bodies can access data from businesses to produce statistics; police forces can access data to investigate crimes; and the intelligence services can access data to reduce the chance of terrorism. In line with democratic norms, government’s rights to data must be subject to democratic scrutiny both before they are put in place and during their operation and use.

Citizens have rights to data held by governments to allow them to perform their democratic responsibilities by scrutinising government and holding them to account. These rights also provide access to data infrastructure controlled by the public sector. These rights are embodied in legislation such as the Freedom of Information Act (FOI) and Reuse of Public Sector Information (PSI) Regulations. They help to create a public data infrastructure that is as open as possible, support open democracy and encourage citizen engagement with government. These rights are challenged by government’s desire to control the flow of information about policy, and by the legacy business models of public data infrastructure stewards who have historically been encouraged to make revenues by selling the data they hold.

Recommendation

Each of these types of data rights has different motivations – such as protecting people, encouraging innovation, and supporting the democratic process – and challenges. Some of them lack clarity in themselves while others overlap or conflict.

A major challenge for the future, and one that the Centre for Data Ethics and Innovation should help the UK take a lead on, is how to grow them into a coherent data rights framework, with a corresponding monitoring, resolution and enforcement regime, that supports ethical innovation and greater use of data to make better decisions while managing any harmful impacts.

What’s next

These are important issues and ones that many countries around the world are grappling with. If you’d like to discuss them then do drop us a mail to [email protected]

Sharing data for good in the peer-to-peer accommodation sector

People feel more comfortable sharing data about them on peer-to-peer accommodation platforms if they understand the personal or societal benefits in doing so, a survey commissioned by the ODI reveals

The results of the survey follow the ODI’s work in early 2018, looking at how data can be used better across the peer-to-peer accommodation sector, and are particularly relevant for those interested in understanding the impacts of the sector, such as the Scottish Government’s recent response to the report of the Scottish Expert Advisory Panel on the Collaborative Economy.

Data sharing for good over profit

The research reveals that the level of comfort people have about sharing information depends on how that data will be used. We commissioned a UK-wide survey in March 2018 which was conducted online by YouGov. The data has been published here under an open licence.

People were more comfortable knowing that data about them is shared with local councils or other public services if it has a positive impact or improves society. People were most comfortable sharing data about them: to better understand fire or other health and safety risks (58%); to use for planning public services (47%); and to help local councils understand the availability of local housing to rent or buy (45%).

There was also strong support for sharing data to ensure that citizens’ duties were being fulfilled. Half of the respondents (50%) would be happy to share data about them to ensure that hosts are paying the right level of tax, and 56% would do so to ensure that both guests and hosts are complying with local laws. One example where this kind of data sharing may make a difference is to support local interventions focused on limiting the amount of time a property can be rented for, such as the 90-day rule in London; Berlin, where hosts can only rent out their primary residence; or Toronto, where hosts may only rent out their primary residence as well as one additional property.

However, respondents were slightly less comfortable sharing data about them to help people make a profit, such as using data about them for planning and licensing decisions (eg for new hotels, clubs and bars). Nearly half (46%) of respondents were not comfortable sharing data about themselves in this context (compared to 44% who would be comfortable sharing this information).

These findings are particularly relevant for cities around the world that are looking to understand the impacts of the peer-to-peer accommodation sector. Research we conducted through our peer-to-peer accommodation project focused on how a variety of data from distinct contexts can help to understand the positive and negative impacts of the peer-to-peer accommodation sector on society.

One of the things we investigated was ‘data observatories’ and whether they could help to create more informed debate, which should help lead to better decisions about the sector.

Other key findings

With the introduction of the right to data portability, we were interested in the type of information people who rented properties through peer-to-peer accommodation platforms would be comfortable sharing with websites and hosts. The information renters were most comfortable sharing was the ratings they had received from other peer-to-peer accommodation websites (45%), while one-third of renters would be comfortable sharing information they held about other guests.

People were less comfortable sharing information about themselves which wasn’t directly involved in renting short-term accommodation through a platform. Only 23% were comfortable sharing information from other peer-to-peer platforms such as Uber or eBay, 13% were comfortable sharing information about their criminal record, and only 8% would share information about their credit score. Nearly one-third (31%) of people surveyed would not be comfortable sharing any information with websites and hosts.

We also asked what types of things, such as perks or offers, might encourage individuals to share information about themselves. The biggest draws for people were preferential rates and discounts (37%), having a wider choice of platforms (28%) and having access to properties that met their specific needs (22%). Two-fifths (40%) of people surveyed would not share information about themselves to get better access to perks and offers.

About the survey

All figures, unless otherwise stated, are from YouGov Plc. The total sample size was 739 adults. Fieldwork was undertaken between 2 and 14 March 2018. The survey was carried out online.

Get in touch

If you’d like to talk to us about the poll, data skills or to see if there’s an opportunity for us to collaborate, please do get in touch.

Digital Preservation Awards 2018: ODI sponsors award to celebrate new tools and approaches

The ODI is sponsoring an award at this year’s Digital Preservation Awards, which recognise the very best work in digital preservation across all sectors. Entries close on Monday 30 July

The particular award we are sponsoring is the Award for the Most Outstanding Digital Preservation Initiative in Commerce, Industry and the Third sector. This award is aimed at encouraging and recognising the adoption of digital preservation tools and approaches in institutions that are not libraries, museums and other ‘memory institutions’. The award includes a cash prize of £1,000, a trophy and certificates.

Additionally, shortlisted entrants will each receive a ticket to this year’s ODI Summit hosted by our founders, inventor of the web Sir Tim Berners-Lee and artificial intelligence (AI) expert Sir Nigel Shadbolt, and our CEO Jeni Tennison.

The ODI works with companies and governments to build an open, trustworthy data ecosystem. An important part of this ecosystem is preserving our digital heritage and providing access to it as openly as possible.

“I’ve been passionate about digital preservation for over a decade, and was ecstatic when DPC decided to add the industry award in 2016 to promote and encourage work in the commercial sector,” says ODI Learning Lead and Awards Judge David Tarrant. “At the ODI, we are passionate about everyone getting value from data. In order for this to happen, we must encourage and reward activities in both the public and private sectors.”

The ODI is already proactive in the area of digital preservation. Through the ARCHANGEL project, we are investigating the role of emerging technology, such as the blockchain, in guaranteeing the integrity of digital archives.

Through sponsoring the award, the ODI is looking to celebrate those in e-commerce, industry and the third sector who are leading the way in preserving our digital heritage. Last year’s winner was HSBC for its ‘Global Digital Archive System (GDA)’, which documents the development of the organisation since 1865, and many of the banks acquired by HSBC and its predecessors over the years. The records also shed light on the social, economic and political history of the communities and countries where HSBC has done business.

The ODI’s Commercial Director, David Beardmore, says: “We are incredibly excited to sponsor this award that celebrates the engagement of industry in important digital preservation activities to steward data, both now and for the future. These stewards play a key role in our value chain of creating positive impact from data.”

Entries for the awards close on Monday 30 July. The awards ceremony will be held in Amsterdam as part of an international conference, hosted by the Dutch Digital Heritage Network and the Amsterdam Museum on World Digital Preservation Day, Thursday 29 November.

Framing for our thinking about trust

By Rachel Wilson, Senior Software Developer

What gets in the way of trust?

For five years now the ODI has been working to encourage greater access to data in ways that protect privacy, and in the hundreds of conversations we’ve had with those that hold data, we occasionally come across resistance and concerns.

We know that at the heart of these concerns is often a fear of the unforeseen, or something else that gets in the way of trust. Could it be that it is not only infrastructure, but also trust, that is essential for an open future for data?

An essential factor

Trust is an essential factor of life and as humans we have an intuitive understanding that it is a complicated issue. How each of us determines who to trust, and to what extent, is shaped by many aspects: it depends on our culture, history, values, relationships, motivations, perceptions and so on.

Organisations are people

Since organisations, governments, businesses and institutions are made of people, the systems, processes and mechanisms we design will in some way reflect the people who have created them. This includes the mechanisms to share our resources in a way that gives us confidence that the benefits will be distributed fairly.

For example, some people are concerned that if Ordnance Survey (the national mapping agency in the UK) makes map and survey data open, ultimately only Google will benefit, at the expense of both Ordnance Survey and UK-based businesses.

Therefore we would like to understand how and why trust is built, maintained and lost, and how this relates to sharing and openness.

Handwritten poster showing the ODI research team's workings on the elements that make up trust in the context of data sharing

Can trust be defined?

Trust appears to be an intuited, subjective and often nuanced topic. We can find many examples of how trust has been lost between parties, and we can point to many elements that are intended to increase trust – be that certificates, third-party intermediaries or regulation.

But can we point to a reason why we trust, why it is lost, and why a mechanism may work in a given situation but not in another? How does trust relate to the ecosystems we build?

Fortunately, during early research we came across an interesting academic paper (A general definition of trust by Kieron O’Hara) that formally describes trust in a way which we have found to be useful in framing our thinking. It defines trust in the following terms:

  • Partners in a relationship
  • Claims made about future behaviour
  • Context about which claims are made
  • The ‘audiences’ who receive benefits from behaviour
  • Inputs into the judgements we make when deciding who to trust and to what extent

In team conversations about our other current research into data sharing models we have found this definition to be very helpful. It has given us a shared vocabulary, particularly when exploring how trust has been lost or how it might be increased within a system.

A basis for conversation

It’s a fascinating topic and we are keen to have this conversation with a wider audience, especially since we are discovering that those involved in the design of data access models are attempting to address issues of trust.

So we are planning to summarise the paper for readers without an academic background and relate the concepts within to examples in the data ecosystem. We’d like to produce something that we can refer to – as a complement to our other research – that defines some shared vocabulary as a basis for ongoing conversations.

We trust you’ll be back to find out more. If you’d like to join the conversation, please get in touch.