
Read the report: Towards the ethical use of synthetic data in health research

Synthetic data has gained traction in recent years as a way to increase the availability of data that is considered valuable but may be sensitive, in limited supply, or both. It is part of a suite of tools and practices often referred to as privacy-enhancing technologies. These technologies can be used to analyse sensitive data that organisations find difficult to share, or that would otherwise need to be kept closed.

These technologies can therefore be particularly helpful in health research, where sensitive patient data is valuable but also subject to greater protections. But while promising studies demonstrate the potential of synthetic data to address data challenges within health research, technical and ethical limitations still restrict its broader use in certain contexts.

This research focused on these limitations in an attempt to identify opportunities where additional guidance and/or policy could allow synthetic data to be used more widely to enable health research. We took an ethics-based approach, considering prevailing attitudes and efforts towards using synthetic data within health research, to determine what can be considered optimal in terms of responsible use in the absence of definitive guidance.

This report was made possible by the generous support of the Responsible AI team at GSK.