Data Decade: Who looks after the data?

Wed Jun 29, 2022

By Astha Kapoor, co-founder of Aapti Institute; Dr Wen Hwa Lee, CEO and Chief Scientist at Action Against AMD; and Jack Hardinges, Head of Programmes at the ODI

As part of the Data Decade, at the Open Data Institute (ODI), we are exploring how data surrounds and shapes our world through 10 stories from different data perspectives. The third, Who looks after the data?, looks at the different forms of data stewardship – from individuals and communities, to institutions – and what the future has in store.

Listen to the podcast now

You can also listen and subscribe to the ODI’s podcast on your preferred platform: Apple Podcasts | Spotify

This is Data Decade, a podcast by the ODI.

Emma Thwaites: Hello! Welcome to Data Decade, the podcast from the ODI. I’m Emma Thwaites, and across this series we’re looking at the last 10 years of data, and also the next decade ahead, and the transformational possibilities for data in future.

So far, we’ve looked at data in arts and culture, and also at how data is shaping our cities and infrastructure. But in this episode, we’re tackling the question: who looks after the data? Access to the right data can help us tackle the biggest challenges we face – from the early detection of diseases to addressing climate change. But collecting, using and sharing data to meet these ambitious aims has to be done thoughtfully and safely. The past 10 years have given rise to new ways of managing and sharing data.

And so in this episode, we’re going to take a look at the different forms of stewardship – from individuals and communities, to institutions – and see what the future has in store.

Welcome to Data Decade.

Emma Thwaites: And joining me for this episode of Data Decade, a great panel to talk about data ownership. With me is Dr Wen Hwa Lee, the CEO and Chief Scientist at Action Against Age-Related Macular Degeneration; Jack Hardinges, the Head of Programmes at the ODI; and down the line from Bangalore in India, Astha Kapoor, the co-founder of the Aapti Institute, a Bangalore-based tech and society research firm.

Welcome everyone and great to have you all with us.

Jack Hardinges: Hi, Emma.

Wen Hwa Lee: Delighted to be here.

Astha Kapoor: Same.

Emma Thwaites: Good to see you. So, uh, Jack, I’m gonna come to you first actually. We talk about stewarding and sharing data – that’s the language that is used in the data community, if you like. What do those terms actually mean? And what sorts of things are we talking about?

Jack Hardinges: Sure. So, at the ODI, we tend to talk about data stewardship and we talk about it in quite a functional way. So the real basics of collecting, maintaining, and sharing data.

And I mean, depending on who you are, that could be wildly exciting or incredibly boring, but actually the more important and interesting idea here is in stewarding data and performing those tasks, people are making really important decisions about who can access data for what purposes and under what conditions.

And that’s really, really important as data is economic, political and social power. So how it’s stewarded ultimately affects the nature of the value that we can derive from data, the scale of our data economies, the impact it has on society. So although it’s quite a foundational act, and on the ground can look quite functional, the impacts of data stewardship are vast, and that’s really why it’s such a foundational, important idea, not only to us at ODI, but also increasingly others in this data world.

Emma Thwaites: You said something really interesting there actually, um, basically data is power. Just say a bit more about that. What does that actually mean?

Jack Hardinges: So one way of interpreting that is from an economic point of view: lots of new technologies, including artificial intelligence and machine learning, are predicated on access to data. So in order to train and build these technologies and these systems, organisations, people, researchers need access to data.

And so if you hold data as an organisation, um, then you can determine the nature and shape of these products and services. So you may be in a better position to develop them yourselves. I mean, at the ODI, we tend to talk about open approaches to data. So whereby the value of data increases as we connect it to other data, and to other organisations and to other people.

And so actually for us, a lot of our work in- on stewardship is about finding this balance between making data available and open and protecting it and ensuring that its harmful impacts don’t spread across economies and societies. And so from an economic point of view, access to data is power.

And from a social point of view, it’s interesting in that if we were recording this podcast, maybe five or 10 years ago, I don’t think we’d be talking about the social impact of data and how it’s stewarded. So we wouldn’t be talking about the Netflix documentaries on data and the data that’s collected about us as we traverse the web. So I think we’ve woken up to the, the social implications and potentially even the impacts on democracies of data.

And last year, I think in recent years, we’ve also seen politicians and civil society, and in the UK, our civil servants wake up to the inherent politics of data. So how data is stewarded affects, um, what we can understand about our democracies and about, um, systems of power.

So from all of those different lenses, um, the way that data is stewarded is important.

Emma Thwaites: Absolutely. I mean, I guess therefore, the hands in which that stewardship is invested, that’s really important – the kinds of organisations that steward data.

Astha, I wonder what your perspective is on that, you know, the kinds of organisations that steward data, who those organisations are, who they should be. And why we’re talking about them now, why this is a timely discussion to have at this moment in time.

Astha Kapoor: So, I’ll start with maybe the last question of why are we talking about this now. I think a few things are happening in terms of the timing of this conversation.

One is, of course, as Jack said, things have moved rather rapidly in the last five years or so. And so, you know, we know that data is ubiquitous. We know that we are generating it in every activity that we’re involved in online – and even offline, I think, you know, uh, whether we’re buying groceries, whether we’re recording this podcast, a lot of these things are generating data that we don’t necessarily have a sense of.

So there’s this, uh, digital exhaust or data exhaust that’s increased. And you know, many other forums have talked about this. The pandemic has contributed to it. So, you know, children are at school, healthcare, you know, all of that is now more datafied than it was even in, you know, 2019.

So there’s that. The other thing is that there’s an increased perception of harm coming from data that people are slowly starting to sort of awaken to. So, you know, we understand – and the big one was of course the Cambridge Analytica movement where people realised that their data could be used to manipulate their decisions, which is actually quite a frightening realisation in a certain sense. But other things such as, um, you know, algorithmic decision-making on, on platforms, uh, such as Uber, Grab, Lyft, um, surveillance of workers in various places.

And all of that is actually making it clear that data can be used to cause harm to individuals in their livelihoods and their everyday life. Um, and the third is I think that there’s also an increased conversation, which relates to all of this, is that there are new regulations that are coming up.

So of course, the EU with the GDPR has led the way. But other countries are also contemplating how to govern, regulate, the collection and use of data. AI policies are coming up, uh, globally. So I think all of that put together makes this conversation quite timely. And I think that we are also seeing the idea of data stewardship being instantiated in different policy documents.

In terms of the kinds of organisations that are data stewards, I would think that all organisations at the moment in some shape or form are collecting and using data. But the question is, are they all stewards in the way that, uh, you know, we define it and also in the way that Jack defined it? Likely not. Uh, we think that the responsible stewardship of data is the differentiator.

And I think that there is no single form of a data steward, a data steward can come in different governance structures. You could have a cooperative, or a trust, or a collaborative – it depends on how you govern it. But different institutions can become stewards based on what their purpose is, what the community wants from this steward, and the kind of value they are aiming to generate.

And I think that that’s just something that we’re still discovering and learning much more about. And I think if we look ahead – and since this is, you know, about the next decade of data – I imagine that we will see a multitude of different governance structures for data stewards going forward, and more organisations coming up in this form.

Emma Thwaites: So it’s definitely an emergent field and quite experimental in nature at this point. Jack, I know you were really interested in some of the points that Astha made.

Jack Hardinges: Yeah. So one of the things looking forward to the next 10 years that I’m excited about is exploring this diversity, or if you like, species of organisations and institutions that we need to steward data effectively.

So stewardship is this foundational idea; the organisations and the institutions I described, the ones we need to practise it, are really, for me, the kind of exciting open question. And I mean, some of these are here already. So ODI have been keen to recognise some really important and foundational institutions out there that in some cases have been stewarding data responsibly for, for decades.

The one I love is UK Biobank, which, as a steward of genetic data in the UK, has held onto data on more than half a million people and has played this role of enabling access to that data for vital research and development, but also ensured that that data, as far as we’re aware, hasn’t leaked, and hasn’t gotten into the wrong hands or been abused.

So in some cases, those, those institutions are here and the future’s here already. But in other, other circumstances then, there’s really kind of an open door, a blank page to explore new institutional forms, new types of organisations that we need to emerge if we are to kind of unlock the positive value of data and mitigate against those, those negative impacts.

Emma Thwaites: That’s a really neat segue actually, ’cause I’m gonna come to you next, Lee.

You know, the, the link with, with healthcare there was a bit of a gift – and science as well, of course – because this is your domain. This is where you work. And I know you see enormous potential for data stewardship and some of the sort of mechanisms of data institutions that we’ve heard about so far.

Can you give us some examples of what those things look like in practice, particularly in your area of expertise?

Wen Hwa Lee: Yeah, of course. And perhaps even before that, I’d just like to set up the framework of discussion.

I think it’s important to acknowledge that many groups have been working and stewarding data responsibly, as Jack already mentioned. Healthcare is one of these examples. And then of course, nobody listening would be surprised if I say that everything we have today, that we use in our medicines, in our treatment, et cetera, all comes from intel or data or information gathered from many different people, who have then already almost granted, if you like, the stewardship of that data to use for the betterment of society in forms of new treatments or new medicines.

So there’s a lot of people who already subscribe to that notion without really framing it in the way we perceive it today. I think the crucial difference between now and before is that, with the digitisation of more and more aspects of our lives, the traditional walls separating those disciplines, those walls of ‘this pertains only to medical doctors’, ‘this pertains only to, uh, smart watch manufacturers’, those walls are crumbling, and they are crumbling down at such a speed that not even the experts are keeping up.

And now that creates the problem, right? The problem is not the data itself, it’s not only the stewardship itself. Really it’s the scale and the potential to amplify any good use, but also misuse.

Coming back to that point, I think we need to acknowledge, if we want development of better, better treatment, faster for when we need it, we really have to embrace this culture of quick innovation. But at the same time, be really cognisant that there’s a lot of fear-mongering out there on the back of possible breach, possible misuse, possible harm.

So when we define what is important to society, I think we have to really balance that matter of what is important for you, what is important for me and what is important for the society?

So a concrete example of what we are trying to do, and it builds back on what Jack said about, you know, UK Biobank: there’s another large initiative in the UK called Health Data Research UK, which is trying to really look at all the digital health data that exists in the entire country.

Now, how do we make that available quickly? How do we decide with the society, with the public and the patients, what is the best use?

So this is great work that we have been very lucky to be working on with Jack here at the ODI, to set up what we are calling the INSIGHT Data Trust Advisory Board – DataTAB. And that is a specific panel which brings in members of the public and patients, mixed with the experts, who are now making decisions or helping to make decisions on the different data use applications that are coming in to use eye scans.

I work in the area where we are concerned about eye health. So we need to make that treasure trove of eye scans available for the future development of treatments and drugs.

Emma Thwaites: So just, can we go back a point from, from that and just talk us through what that structure looks like? So what is the data who, who holds the data? You know, who are the organis- or what are the organisations that want to get access to that data? And what’s the mechanism for making decisions about whether they should, or they shouldn’t have access to it? Who kind of who, who are the, the decision-makers in that process? Like the mechanics of how it works, I think is quite interesting to people who might not, who might be listening to this and don’t know.

Wen Hwa Lee: Right. So that’s a very interesting question, Emma. So what happens is that this is initially a collaboration between two large NHS foundation trusts. Each of them will hold data separately. And we know that data is only even more powerful if we can bring them together to look at the complete picture. It doesn’t help looking at just pockets of data.

So these two large NHS foundation trusts, which have access to a lot of eye scans. One of them is Moorfields Eye Hospital, and the other one is University Hospitals Birmingham. So they decided to come together and find a way to make data that exists in two different compartments available at one single request.

What are these requests? These could be scientists interested in finding a cure for – biased, but – age-related macular degeneration or glaucoma, or any of the eye conditions. So they apply and try to use the data. So what happens is that during that application, there’s a series of questions and that goes through almost like a technical check.

The technical check is done by the NHS foundation trusts themselves. Are they bona fide researchers? Are they with an institution which can be audited, for instance, right? So all those background checks, if you like. And then the remainder of it, when they say there is a value for the patients, there is a route into patients, or this is why we are doing this research, that goes into this panel, which is called the DataTAB.

And the DataTAB, we have 10 members. And the idea is that we didn’t want to find only 10 seats, one representing each organisation, because we’ll never get the public view of that. We’ve asked people with a lot of experience in different ways, in different forms to come and participate as citizens. So, if you like, there could be one person who has the eye condition, but the application that comes in is not on their eye condition.

So for that proposal, they’re not a patient. But they understand the dynamics of treatment or what do they need. But also this person could be an expert in data, so they can interpret and give their view on whether the data they’re requesting is sufficient or insufficient. So all these additional parameters can be discussed and we form a consensus, and then that is then fed back into the data controllers themselves who then can make the final decision.

Emma Thwaites: See, the thing that I find really interesting about this model is the involvement of the citizen. The citizen as data steward, I think, is really fascinating.

And Astha, I wanted to ask you about that actually. And you know, how you perceive the role of the citizen in a domain, which actually until fairly recently, I think has been seen as incredibly technical and difficult and lots of things going on in, in a black box. And these initiatives now, in different kinds of data institutions, to award the role of data steward to the citizen.

What are your thoughts on that?

Astha Kapoor: Yeah, I think to me that participation is very much at the core of the idea of data stewardship and, you know, we at Aapti have been talking about bottom-up data stewardship, which is anchored in enabling participation. Which is also, I mean, you know, the example Lee has just described is, is very interesting because, you know, citizens are able to give their preferences and decide where data is going to be deployed.

And we’re seeing ways in which this is able to happen. I mean, there are ways in which citizens can decide together. So, you know, through voting, et cetera. There are other mechanisms in which citizens can actually delegate some of this decision to a trustee or a board of trustees to decide on their behalf.

And I think that this is where the conversation of data stewardship deviates from the way data is being governed till now. Um, and I know that we are very much, you know, it’s a, it’s a short journey on data governance questions, but up until now, what was seen as sufficient is you inform people on how their data is being used and you ask them to consent.

And oftentimes, you know, we all know this from our own lives, that every time that cookie, you know, notification pops up, we’re all inclined to press what is the easiest option without necessarily processing what the information is and the implications it has.

And that’s an efficient, and a quick way of dealing with questions of data governance, but not always the most helpful to the citizens or people themselves, because there are many, many, you know, questions of harm that come up.

So the idea of participation and actually breaking down some of this information, making sure that people understand what data is being collected about them, how it’s being used, where it’s being used, making sure that consent is dynamic – so it’s not like, oh, you pressed the cookie notification and you said yes, like, seven days ago, and that still is applicable. Uh, making sure that it’s granular. So if I consent to X doesn’t automatically mean that I consent to Y. So if I consent to give my doctor certain kinds of health data, doesn’t mean that I automatically consent to giving the pharma companies similar kinds of data.

So all of those things need to be broken down. And that’s where we see the role of the steward becoming increasingly important. Like I said, it’s possible to participate at every instance, which can be quite cumbersome for people because this then becomes one more thing we have to think about, or it can be done through a trusted intermediary – and I use ‘trusted’ in quotes because there’s a lot of, uh, debate around that question – but who is empowered and enabled and accountable to decide on this. And it can be, like I said, the board members of a cooperative that have a responsibility towards their members. It can be a board of trustees, you know, Jack and I worked significantly on that issue as well.

So there are these different mechanisms. Collaboratives can set up their own rule systems and decide what data can and cannot be shared. We are increasingly realising that data is not about individuals. It is about communities. It is a, it is relational. It is a collective question.

So it’s only natural that we are not isolated in our digital journeys. And have to decide together what happens to our data, which is already, uh, you know, we all use it as a plural and not the singular data, which is no, nobody uses it because there is no, you know, singular data.

Emma Thwaites: Fantastic. Uh, there’s a lot to unpack and comment on there, but I know Lee, you, you wanted to come back in on a couple of those points.

Wen Hwa Lee: Yes, yes. And again, thank you Astha, excellent point.

There has been a shift. I don’t know whether you’ve seen this or not, right? We landed at the concept of citizen because we didn’t want this to be yet another ‘patient’ group. Why? Because the moment you put a strong label, you’re creating antagonism, you’re creating them and us. Who says a doctor cannot be a patient? And who says lay-people cannot be a doctor or an expert in one subject, and be back to the lay-person in another subject?

So we try to build this at DataTAB in a way where if somebody feels that they have the voice to represent that group, and put the question on the table, they don’t have to wait for somebody else locked behind seven doors to write an answer to them. It’s right there on the table. They can ask the expert, they can ask the regulators, for instance, somebody with regulatory experience, and then the decision’s made.

And we always try to capture that learning. And that’s something, you know, Jack and the team have been doing. And try to share this as soon as possible, so we can all learn in the process.

Emma Thwaites: Jack?

Jack Hardinges: Yeah. One, one of the things that I’m interested in. So I think what, what Lee and Astha both described and, and tapped into is this, this trend in data stewardship of increasing the participation of citizens, consumers, individuals, groups, communities in the process, and breaking beyond that binary of, of consent or not.

And I think there’s, there’s a long way to go with that work and, and a lot more data systems, services, organisations that could benefit from involving those affected in the stewardship of data. There’s, there’s a long, long way to go on that. However, what I’m quite interested in, in the spirit of, of looking forward is the, the limits to that.

So as an individual, to what extent can I be involved in each and every decision about data about me, that might be collected on the web or in, in the process of visiting the hospital? And so in the same way, I had kind of complete agency of, of, without going too deep into free will, I had the agency to choose my route here to, to record the podcast. But in other scenarios, I defer or I delegate decision-making and authority to others.

So I trust my doctor to prescribe to me the medicine, I’m not involved in each and every deliberation necessarily. I, I might choose to be more involved, but not completely. So what I’m quite interested in are, are the limits of this trend of participation and bottom-up stewardship and instead thinking about how in different contexts, the different types of participation we need.

And so having a slightly more nuanced conversation is, as the next 10 years progress around the degrees – or even not even thinking of them as, as degrees, but approaches to participation, um which suit different people in different contexts, and at different times. I think the current trend of more participation is good and, and necessary, in response to some of the ills that we’ve described.

But actually longer-term, I can see it being slightly more nuanced and finding the right fit for the right context. I can see that happening.

Emma Thwaites: You completely anticipated my final question, which is my favourite question of the podcast actually, because this is where I get to ask everyone to make themselves hostages to fortune.

I would like you all to give me your predictions for the next 10 years of data stewardship and who looks after our data and how.

I’m gonna come to Astha first if I can.

Astha Kapoor: So I think that the next 10 years of data stewardship will see more and more organisations in more and more forms of governance, uh, for what we now understand as data stewards.

I think that in an ideal scenario, me as an individual will have multiple data stewards, for my health data, for my mobility data, who are safeguarding my interests, uh, and, and allowing me to engage with questions of data.

And so I think that that’s the ideal scenario. I think that governments and policies are also going to create more space for these institutions. And I also think that the private sector will have to interface much, much more with these citizen bodies, in order to navigate both the distribution of value and also the prevention of harm.

I’m very optimistic. And I hope that Jack and Lee share my optimism as well.

Emma Thwaites: So, Lee, what is in your crystal ball?

Wen Hwa Lee: Wow, wow. Well I think Astha looked into it!

But just to top that up, I do think that Astha is absolutely right. There will be a broader range of different institutions, different flavours. We should be even thinking about almost professionalisation – why not? – of this task, which is so important in the future. Cause if we don’t do this, we will just be swamped by all the different competing aspects. Rather than devolving control, we will be removing that from us.

Now, more than that, my hope and dream in 10 years, is that a lot of the things that, you know, Astha’s doing, we are trying to do with Jack, the HDR UK – we are trying to make it very transparent.

So my hope is that the society can start to see that there’s no corner we can hide under, they have transparency. And then, hopefully, the trust will come. People will start trusting these newly arising institutions. So I think I’m hoping that data institutions can show this is the way, the trust can be built, and we will start to see those institutions thriving for many applications.

So, that’s my wishful thinking.

Emma Thwaites: Sounds like sunny uplands to me, that’s for sure. Jack, you only get half the time because you, you partially answered this question off your own bat.

Jack Hardinges: Well, I’m gonna have half the time, but hopefully squeeze two ideas in. If the mic doesn’t, doesn’t cut out.

So one, I think for the short term and, and I say it as I’m wrestling with it as we speak, is this idea of the, how two quite different ideas converge over the next couple of, of years. So we’ve spoken about institution building, which is almost inherently centralising resources, people, capital, and, and other resources and, and making those responsible, with some of the more radical ideas on the web of decentralisation and decentralisation of resources.

I’m really intrigued as to how those quite different ideas battle in some context or converge in others. So that for me over the next two or three years will be interesting.

Then out into maybe even beyond the 10 year horizon and into the next generation, this codifying of the duties and responsibilities of data stewards.

So that could be at the institutional level. Might we need registers of them in the same way we have registers of charities? Might our registers of charities need to evolve to recognise stewardship? And then also at the individual level. So professionalising what it means for individuals within organisations to, to be responsible stewards of data as well.

So that codifying over the, the long run and the slightly shorter one, the, the, yeah, the convergence of, of data institutions and decentralisation. I’m, I’m, I’m really intrigued about those two.

Emma Thwaites: Well, we shall wait to find out. This has been a fascinating conversation and I’d love to say a big thank you to all of you, to Astha for joining us from Bangalore, to Lee and to Jack.

Thank you very much.

Jack Hardinges: Thank you, Emma.

Astha Kapoor: Thank you.

Wen Hwa Lee: See you in 10 years.

Emma Thwaites: So that brings us to the end of this episode of Data Decade, and a really interesting one too, looking at data ownership and data stewardship. Thanks again to our guests, with some fascinating insights.

As ever, if you want to find out more about anything you’ve heard in this episode, head over to theodi.org, where we continue the conversation around the last 10 years of data, and also, what the next decade has in store for us.

And if you’ve enjoyed the podcast, please do subscribe for updates. In the next episode, we’ll be looking at trust and misinformation.

I’m Emma Thwaites, and this has been Data Decade from the ODI.

Access to the right data can help us tackle the biggest challenges we face – from the early detection of diseases to addressing climate change. But collecting, using and sharing data to meet these ambitious aims has to be done thoughtfully and safely. The past 10 years have given rise to new ways of managing and sharing data.

How do we make sure data isn’t monopolised and hoarded? How do we ensure individuals are comfortable with how data about them might be used? And how do we make sure data can be used to help empower people and communities in the age of algorithmic decision making?

What is data stewardship?

At the ODI, we believe that part of the answer lies in the way that data is stewarded. Data stewardship – collecting, maintaining and sharing it – is the foundational activity in the lifecycle of data. In stewarding data, you are making decisions about who can access data and for what reasons.

How data is stewarded can have a huge effect – for example, organisations that hold data could determine the future of products and services that are reliant on data, like AI or machine learning.

“Data is economic, political and social power. So how it’s stewarded ultimately affects the nature of the value that we can derive from data, the scale of our data economies, and the impact it has on society.”

– Jack Hardinges, Head of Programmes, ODI

Responsible data stewardship

Over the last decade, we’ve generated more and more data, with every online (and even offline) activity. The pandemic has accelerated this, with the ‘datafication’ of services from healthcare to education. Alongside this, there’s an increased perception of harm around data – for example with the Cambridge Analytica scandal – and new regulations are emerging around its use.

All organisations deal with data, but not all are what we would consider to be ‘data stewards’. The differentiator is responsible data stewardship. These organisations embrace sharing data for the betterment of society while mitigating the harms that could come with data sharing, and there are different forms this can take – from data institutions, to data trusts and data cooperatives.

We can already see examples of this in healthcare. Take UK Biobank, for example, which stewards genetic data from half a million UK participants. Research can apply to access the database to advance the prevention, diagnosis and treatment of serious and life-threatening illnesses. Or, there’s HDR UK’s INSIGHT Data Trust Advisory Board (INSIGHT DataTAB), which brings together the public and experts to make decisions around the different applications of eye scan data.

“If we want the development of better treatments fast, we really have to embrace this culture of quick innovation, but at the same time, be really cognisant that there’s a lot of fear mongering out there on the back of possible breach, possible misuse, possible harm.”

– Dr Wen Hwa Lee, CEO and Chief Scientist, Action Against AMD

Bringing people into data governance

One idea to emerge from the last decade of data stewardship is that of the individual as a data steward. Participatory models, which we refer to as ‘bottom-up’ models, empower individuals to make decisions about what data is used, by whom and for what reasons. For example, BitsaboutMe enables people to make individual decisions about their consumer data, and healthbank allows people to take more control over their health data.

This is where data stewardship deviates from traditional data governance processes – where people are informed on how their data will be used, and are given a binary ‘yes’ or ‘no’ for consent. On the other hand, participatory models embrace the idea that consent is dynamic and granular – consenting to x doesn’t automatically mean you consent to y. For example, you might consent to sharing your data with your doctor, but not with pharmaceutical companies.

These models might empower people to make decisions on an individual basis, as a collective (for example, through group voting), or by delegating the decision-making to someone else.

There will of course be limits to these models. For example, individual decision-making may give people more agency over their data, but it can also be cumbersome because of the number of decisions that need to be made.

As we move into the future, we may see combinations or new models develop to contend with these limitations. For example, in some scenarios, individuals may want full agency, whereas in others they may want to rely on delegated or collective decision-making.

Future trends

As we look to the next decade, new trends in data stewardship and sharing will emerge.

It’s likely we’ll see a multitude of different structures for data stewardship develop over the next 10 years. These may build on existing models in some sectors – like in healthcare, with UK Biobank and INSIGHT DataTAB – but in other areas, there’s a blank page to explore new types of organisations that will unlock the value of data while mitigating risks. It will also be interesting to see how these burgeoning institutions contend with other emerging ideas – for example, the decentralisation of resources, which may seem at odds with building institutions that pool and share data.

“I think that in an ideal scenario, me as an individual will have multiple data stewards – for my health data, for my mobility data – who are safeguarding my interests and allowing me to engage with questions of data in the ways that I want. In some places I just want to be informed and other places I want to be actively engaging.”

– Astha Kapoor, co-founder, Aapti Institute

We may also see the codifying of data stewardship. This may be at an individual level, where roles necessary for data stewardship are identified and professionalised; but also at an institutional level – for example, might we need registers for data stewards, in the same way we register charities?

Governments and policymakers will need to make space for these new types of institutions, and private sector organisations will have to interact with these citizen-led and citizen-governed bodies to distribute value and mitigate harm that may come along with these new institutions.

As these institutions emerge, we hope that in the next decade we’ll work together to make them more transparent, fostering trust between them and the public. With greater transparency, society will see that there’s no corner for these institutions to hide in, and that it can trust them to steward data responsibly and in the public’s best interests.
