The popular, worldwide platform “Far.far.from.home.com” is constantly striving to add value to their original offer by being more than a simple peer-to-peer property broker. They are a platform that would like their users to get the best experience possible out of the available rentals, and through their own research they have discovered that users want to take part in local community life and activities.
To help users do that, they’ve decided to highlight local sports and activity centres.
This data is published by Sport England as an open dataset, in this case as a complete data file. Because Sport England does not provide a searchable API, anyone who wants to search the data needs to download the data file and store it in a local database. For the latest corrections and updates to the data, the platform can call an API and merge the results into its local database.
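The merge step might look something like the following minimal sketch. It assumes a local SQLite database and a batch of updates that has already been fetched and parsed into dictionaries. The table layout and field names (`site_id`, `last_updated` and so on) are illustrative assumptions rather than Sport England’s actual schema, and the API call itself is left out.

```python
import sqlite3


def init_db(conn):
    """Create the local table that holds the downloaded data file."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sites ("
        "site_id TEXT PRIMARY KEY, name TEXT, postcode TEXT, "
        "latitude REAL, longitude REAL, last_updated TEXT)"
    )


def merge_updates(conn, updates):
    """Merge a batch of corrections and updates into the local copy.

    Each update is a dict keyed by the (assumed) column names. A record
    whose site_id already exists is overwritten; a new one is inserted.
    """
    conn.executemany(
        "INSERT OR REPLACE INTO sites "
        "(site_id, name, postcode, latitude, longitude, last_updated) "
        "VALUES (:site_id, :name, :postcode, :latitude, :longitude, :last_updated)",
        updates,
    )
    conn.commit()
```

Keying the merge on a stable site identifier means the same routine can load the initial data file and apply later corrections.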
The data for each site offering physical activities contains the following and more:
- Location information such as address and postcode for searching, and latitude/longitude for placing on a map
- Access information such as provision of disabled parking, toilets and changing facilities
- A list and details of facilities such as gyms, pools, football pitches, tennis courts and many more facility types
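To make the shape of a record concrete, here is one way the platform might model a site internally. This is a sketch only: the class and field names are assumptions made for illustration and do not reflect the actual column names in the Sport England dataset.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Facility:
    facility_type: str  # e.g. "gym", "pool", "tennis court"
    details: str = ""


@dataclass
class Site:
    # Location information, for searching and for placing on a map
    site_id: str
    name: str
    address: str
    postcode: str
    latitude: float
    longitude: float
    # Access information
    disabled_parking: bool = False
    disabled_toilets: bool = False
    disabled_changing: bool = False
    # Facilities offered at the site
    facilities: List[Facility] = field(default_factory=list)

    def has_facility(self, facility_type: str) -> bool:
        """Return True if the site lists at least one facility of this type."""
        return any(f.facility_type == facility_type for f in self.facilities)
```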
The user experience
Each property has a details page that includes a ‘Nearby’ section. The property would be highlighted on a map, along with shops, bars, restaurants, and sports centres.
The map would be interactive, with users able to click on each ‘pin’ to open a panel of detailed information about the facilities, taken directly from the dataset.
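Choosing which sports centres to pin for a given property could be done with a simple distance filter over the local copy of the data. The sketch below uses the haversine formula for great-circle distance. The function names, the dictionary keys and the 2 km default radius are assumptions made for illustration, and a production system would more likely rely on a spatial index in the database.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearby_sites(property_lat, property_lon, sites, radius_km=2.0):
    """Return the sites within radius_km of the property, nearest first.

    Each site is a dict with (assumed) "latitude" and "longitude" keys.
    """
    hits = []
    for site in sites:
        distance = haversine_km(
            property_lat, property_lon, site["latitude"], site["longitude"]
        )
        if distance <= radius_km:
            hits.append((distance, site))
    hits.sort(key=lambda pair: pair[0])
    return [site for _, site in hits]
```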
Responsibility for data quality and impact on user experience
At this point the platform might want to consider who – from the user’s point of view – is considered liable or responsible for the accuracy of the data.
In the case of a large platform with millions of users worldwide, adopting open data can feel like it carries a significant risk. Incorrect data shown as information on the platform could lead to some unpleasant situations: what if the address for a facility has changed, and the platform recommends its users go jogging in what is now a toxic waste plant? As they consider the potential value of this dataset and how it might benefit users, the platform’s lawyers and product managers will undoubtedly weigh these potential outcomes when deciding how to mitigate the risks.
The first mitigation strategy is for the platform to clearly state the provenance of the data. Doing this right can be a challenge: in the typically information-rich interfaces of online platforms, it can be hard to get people to notice, let alone understand, that some information comes from, and is under the control of, a third party. But not only is stating provenance a fair way for the platform to give credit to the source of the open data, it can, when done right, also help direct comments to the right channels should the users of the platform notice missing or incorrect data. This is why, for example, the BBC visibly credits the source of information when integrating open content and open data in their online services.
Demonstrating provenance does not fully address the concern that the platform may be showing incorrect or otherwise problematic information based on their integration of this open data.
Here the platform has a choice of mechanisms to adopt. Assuming that the platform adds an interface for its users to report incorrect information, several options are available:
- They could simply hide incorrect data to minimise the impact on others, and the effect on their reputation for quality
- They could inform the user that the data is managed by someone else and redirect them to the stewards of the data, in anticipation of receiving a correction via the Sport England API at a later point. In effect this conveys that they are exempting themselves from liability and responsibility
- They could correct their local copy of the data, or update it from the source, where it might have already been fixed
- They could take it upon themselves to make sure the data is fixed at the source, assuming the source has feedback mechanisms to support this. We explore this in prototype two
- They could also provide a mechanism for integrating corrections supplied to them back into the original data source, thereby connecting the user and the third party
Adding contextual information can help to build trust and create transparency. This includes indicating when the data was last updated, using high-quality sources such as data published at source by organisations like Sport England, and checking the provenance of data supplied by intermediaries.
In this example, Far.far.from.home has opted for a mix of the above: the user can ‘Report information as incorrect’, and the platform automatically hides that particular location while they make efforts to rectify the mistake, potentially at source. This way other users will not see the incorrect information and the platform won’t get in trouble for displaying inaccurate data to users. Platforms might also have policies in place to avoid abuse between competitors, for example where a company repeatedly flags a competitor’s location as incorrect in an attempt to gain more business for themselves.
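The report-and-hide behaviour described here, together with a very simple guard against repeated flagging, could be sketched as follows. The class name, method names and the one-report-per-user-per-site limit are all hypothetical. A real abuse policy would likely be more nuanced, for instance by weighing reports from several independent users.

```python
from collections import defaultdict


class ReportQueue:
    """Tracks 'Report information as incorrect' flags and hidden locations."""

    def __init__(self, max_reports_per_user_per_site=1):
        self.hidden = set()   # site ids currently hidden from the map
        self.pending = []     # reports waiting to be checked
        self._counts = defaultdict(int)
        self._max = max_reports_per_user_per_site

    def report_incorrect(self, user_id, site_id, comment=""):
        """Record a report and hide the site; ignore repeat flags by one user."""
        key = (user_id, site_id)
        if self._counts[key] >= self._max:
            return False  # assumed policy: repeat flags are not acted on
        self._counts[key] += 1
        self.hidden.add(site_id)
        self.pending.append(
            {"user_id": user_id, "site_id": site_id, "comment": comment}
        )
        return True

    def resolve(self, site_id):
        """Show the site again once the data has been checked or corrected."""
        self.hidden.discard(site_id)
        self.pending = [r for r in self.pending if r["site_id"] != site_id]

    def visible(self, site_ids):
        """Filter a list of site ids down to those that may be shown."""
        return [s for s in site_ids if s not in self.hidden]
```

Because the per-user count persists after a report is resolved, the same account cannot hide the same location a second time in this sketch.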
What happens next, as illustrated here, is unknown to the user: they do not know if the feedback is sent to a member of platform staff to check, passed on to the third party to check, or permanently deleted from Far.far.from.home’s database. Is the user expected to notify the third party themselves if they wish the data to be accurate for future users, or users of other platforms? In the second prototype (detailed below), we will explore a different approach.
Prototype one walkthrough: https://14dim6.axshare.com/#c2
Benefits of integrating third party data
- Third parties have specialist knowledge and understand sophisticated domains, beyond accommodation – the peer-to-peer platform can concentrate on what they do best while gaining value from the third party’s expertise and services
- The user has access to information that they wouldn’t see or know about in other circumstances, for relatively little effort on the part of the platform
- Integrating data enhances the visibility and utility of that data to end users, giving an incentive to data stewards to publish more and keep what they publish up to date, which in turn makes it more useful and usable for both the platform and others
Challenges to integrating third-party data
- Platforms will need to synchronise with the open data on a regular basis, both to keep up to date and to maintain a local copy that can be searched quickly
- The data is not managed by the platform, leaving the possibility of presenting errors and inconsistencies for which the platform may appear liable. Furthermore, the platform can’t directly change the source data in the case of mistakes and may encounter difficulties keeping locally corrected data both correct and synchronised with the original source
- Different third parties make their data available in different ways, using different formats and API conventions. The platform will need to take different technical approaches depending on the data provider
- Without providers adhering to standards, it would be difficult to combine data. For example, different providers may use different identifiers for locations, or use different data models to describe the same domains
- Platforms have no control over when and how the third party might change the format or availability of data that they rely on, but their customers might come to rely on and expect that data to be present. The platform will need to have contingencies in place for unexpected interruptions to the provision of data, or its gradual obsolescence if the third party stops investing in its maintenance
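One of these challenges, keeping locally corrected data in step with the original source, can be illustrated with a small sketch. It assumes the platform stores its corrections as per-field overrides alongside a snapshot of what the source said when each correction was made. The function name and the rule it applies (keep the local correction until the source changes that field, then trust the source) are assumptions chosen for illustration, not a recommended policy.

```python
def sync_site(upstream_new, upstream_old, local_overrides):
    """Combine a fresh upstream record with the platform's local corrections.

    upstream_new:    the record as just fetched from the source
    upstream_old:    the record as it was when the corrections were made
    local_overrides: {field: corrected_value} held by the platform

    Returns (record_to_display, overrides_still_in_force).
    """
    merged = dict(upstream_new)
    remaining = {}
    for field_name, corrected in local_overrides.items():
        if upstream_new.get(field_name) == upstream_old.get(field_name):
            # The source has not touched this field, so keep our correction.
            merged[field_name] = corrected
            remaining[field_name] = corrected
        # Otherwise the source changed the field: trust it, drop the override.
    return merged, remaining
```

For example, if the platform corrected an address locally and the source later publishes its own fix, the next synchronisation would display the source value and discard the local override.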
We have attempted to explore the challenges of feedback and corrections in our second prototype.