At the Open Data Institute, we’ve been working on defining responsible data stewardship between June 2022 and March 2023, with the support of the Patrick J. McGovern Foundation. This blog was collaboratively written with Professors Tom Jackson and Ian Hodgkinson from the Loughborough Business School at Loughborough University.
The internet and climate change: a structural paradox
Data and digital technologies play a pivotal role in tackling climate change, serving as catalysts towards achieving the Sustainable Development Goals (SDGs). Data-driven and AI technologies are already being used to tackle different impacts of climate change. This includes enhancing our capacity to monitor and analyse environmental variables, such as resource depletion, droughts and pollution, and to predict deforestation patterns and temperature fluctuations more comprehensively. Data stewardship is an important part of ensuring that data is available to support the fight against the climate crisis. Data enables more informed decision making and better targeted interventions, as well as improving and enriching scientific research. This was illustrated by the case of Nepal, where effective data sharing played a key role after the 2015 earthquake. Collaborative data sharing between the private, public and third sectors emerged from this tragic event, helping to mitigate the aftermath of the disaster. Given the current climate crisis, similar catastrophes may become more commonplace, and collaborative data sharing will play a key role in averting and mitigating these crises.
However, too little is known (and done) about how data and technology contribute to the crisis. The internet creates 1.6 billion metric tons of greenhouse gas emissions per year. There is a growing recognition of the internet's cost to the climate, including the impact of online video streaming. This is intrinsically linked to the way we collect, use and share data as individuals, as organisations and as society more broadly. Questions of environmental sustainability of data and technology, like energy-intensive AI models, are gaining traction in the public debate.
As our reliance on data and technology increases, the environmental impacts of their use and development are likely to become more and more significant. Governments and organisations creating these technologies and stewarding large amounts of data around the world will need to act to mitigate these risks. This could be through stronger international policy agreements related to climate change, or via responsible data sharing in times of natural disasters or climate-related catastrophes.
This blog will explore the concept of responsibility, which pertains to being accountable for actions and their effects on others, a common principle that applies both to data stewardship (defined as ‘the collection, maintenance and sharing of data’) and to the environment. There is even a commonality between how people use the language of responsibility and stewardship for both the planet and data.
Environmental sustainability is an important consideration for organisations who want to steward data responsibly
At the ODI, we’ve been working on the topic of responsible data stewardship, which seeks to explore how organisations stewarding data can do so ‘responsibly’. In our work we found responsibility to be a multidimensional concept, which should include a commitment to sustainability.
We consider responsible data stewardship as ‘an iterative, systemic process of ensuring that data is collected, used and shared for public benefit, mitigating the ways that data can produce harm, and addressing how it can redress structural inequalities.’ Responsible data stewardship involves a holistic outlook that goes beyond merely addressing privacy and security concerns to encompass ecological obligations and the pursuit of sustainability.
- Environmental sustainability and public benefit. Realising the benefits of data stewardship is an important part of achieving net zero. For instance, various organisations are stewarding repositories of climate data, data collaboratives are being developed to mitigate climate change, participatory methods are being adopted for environmental monitoring, and citizen science initiatives are collecting new data sets on air pollution and other climate-related issues.
- Environmental sustainability and mitigating harms. Embedding environmental sustainability within responsible data stewardship means examining the unexpected outcomes of data processing, including the environmental harms generated by data handling or AI model training. Some of the mitigations for these harms are explored further below.
- Environmental sustainability and redressing structural inequalities. There are many inequalities in tackling the climate crisis. For example, North America and Europe have been responsible for half of all accumulated global greenhouse gas (GHG) emissions since 1850, yet the impacts of climate change are not felt equally around the world. There are further tensions in the emphasis on using technology to tackle the climate crisis, which ultimately relies upon the destruction of communities in the Global South to extract the resources necessary to develop these technologies. Resources like the Global Material Flows Database and Follow the Oil enable us to better understand these relationships, and empower affected communities to fight back. It is important for organisations stewarding data to acknowledge and question the prevailing power dynamics and imbalances within data ecosystems to ensure a just transition.
Overview of practices aiming to mitigate the digital impact on the planet
Projections indicate that the data industry will contribute a larger share of carbon emissions than the automotive, airline and energy sectors combined. It is therefore critical that governments and organisations not only tackle traditional carbon emissions but also address digital decarbonisation. This section provides some examples of how organisations are mitigating the impact of data on the planet.
Measuring the digital carbon footprint for future decarbonisation strategies
Measuring and monitoring the environmental impact of data stewardship is the first step in any decarbonisation strategy at an organisational level. Several initiatives have been launched to address this challenge.
While a variety of tools exist to evaluate the carbon footprint associated with digital activities like data storage, they predominantly operate in a 'reactive' manner, assessing and gauging CO2 impacts after execution. A case in point is cloud providers that offer CO2 assessments of a customer's cloud footprint through a dashboard. Though valuable, these tools primarily facilitate retrospective action, such as reducing dark data, only after the greenhouse gas emissions have been generated.
Recognising the challenge confronting organisations and decision-makers who want a more proactive approach to digital sustainability, Loughborough University have developed a tool that enables organisational teams to forecast the data CO2 footprint of any upcoming project.
The Data Carbon Ladder is an empirically driven diagnostic instrument, offering a structured approach encompassing five pivotal considerations for data teams during the project planning phase:
- Dis(aggregation) Strategy: Evaluating how the new data will be aggregated or disaggregated, influencing the resulting CO2 output. This involves decisions related to data importing, or integration with existing datasets.
- Dataset Size: Computing the carbon score of the proposed dataset, based on its size (measured in megabytes, gigabytes, terabytes, petabytes).
- Data Velocity: Assessing the data's velocity against a project’s requirements for real-time information (measured in megabytes, gigabytes, terabytes, petabytes over one month).
- Storage Strategy: Defining the intended data storage approach, which could include no storage, storage in a data centre, or on-premises storage.
- Analytics Type: Choosing the category of data analytics to be performed—ranging from low carbon impact (descriptive analytics) to moderately carbon-intensive (prescriptive, predictive) and highly carbon-intensive (cognitive analytics).
By employing these decision guidelines, project teams can swiftly ascertain the data CO2 footprint of their proposed initiatives. This exercise reveals potential CO2 hotspots, enabling teams to reflect on how data can be harnessed more effectively for their organisations while meeting their ecological responsibilities. The Data Carbon Ladder offers an opportunity to embrace a proactive stance in digital sustainability efforts, fostering environmentally conscious decision-making and shaping a more responsible digital landscape.
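To make the idea concrete, the five considerations above could be sketched as a simple scoring exercise. The sketch below is purely illustrative and is not the Data Carbon Ladder itself: the tier numbers, the hotspot threshold and every name (`ProjectPlan`, `ladder_scores`, `hotspots`) are assumptions made up for this example. Only the ordering of the analytics categories comes from the description above; the relative ordering of the aggregation and storage options is a guess.

```python
from dataclasses import dataclass

# Placeholder tiers on a shared 0-4 scale. These are illustrative assumptions,
# NOT the published Data Carbon Ladder scores.
AGGREGATION_TIER = {"aggregated": 1, "disaggregated": 2}  # ordering is a guess
UNIT_TIER = {"MB": 1, "GB": 2, "TB": 3, "PB": 4}  # larger unit, higher tier
STORAGE_TIER = {"none": 0, "data_centre": 2, "on_premises": 2}  # stored options tied
ANALYTICS_TIER = {"descriptive": 1, "predictive": 2, "prescriptive": 2, "cognitive": 3}

HOTSPOT_THRESHOLD = 3  # hypothetical cut-off for flagging a step


@dataclass(frozen=True)
class ProjectPlan:
    """Answers a data team might give during project planning."""
    aggregation: str   # key of AGGREGATION_TIER
    dataset_unit: str  # unit of the total dataset size (key of UNIT_TIER)
    monthly_unit: str  # unit of the data arriving in one month (key of UNIT_TIER)
    storage: str       # key of STORAGE_TIER
    analytics: str     # key of ANALYTICS_TIER


def ladder_scores(plan: ProjectPlan) -> dict[str, int]:
    """Look up the placeholder tier for each of the five considerations."""
    return {
        "aggregation": AGGREGATION_TIER[plan.aggregation],
        "dataset_size": UNIT_TIER[plan.dataset_unit],
        "data_velocity": UNIT_TIER[plan.monthly_unit],
        "storage": STORAGE_TIER[plan.storage],
        "analytics": ANALYTICS_TIER[plan.analytics],
    }


def hotspots(scores: dict[str, int]) -> list[str]:
    """Return the considerations whose tier reaches the hotspot threshold."""
    return sorted(step for step, tier in scores.items() if tier >= HOTSPOT_THRESHOLD)


# Example: a terabyte-scale dataset, gigabytes of new data per month,
# stored in a data centre and analysed with cognitive analytics.
plan = ProjectPlan("aggregated", "TB", "GB", "data_centre", "cognitive")
scores = ladder_scores(plan)
print("Total score:", sum(scores.values()))
print("Potential hotspots:", hotspots(scores))
```

The point of a sketch like this is the planning conversation it prompts rather than the numbers: a team can see before any data is collected which choices drive the estimate, and ask whether a smaller dataset or a less intensive type of analytics would still meet the project's needs.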
Discussing the movement towards data minimisation and digital sobriety
The concept of data minimisation, primarily considered as a principle to adhere to within GDPR, holds promise as a way to reduce the climate impact of data processing and the practices associated with it. This principle goes beyond privacy concerns and encompasses the idea that “less is more” when it comes to collecting and storing data. Data is everywhere: every connected individual or organisation handles vast amounts of data on a daily basis. But is all of this data necessary and useful? Tools like the Data Carbon Ladder are helping organisations to understand what data is useful, and more sustainable data practices are becoming more common, for example using fewer technological devices internally, using a shared platform and reducing the number of email exchanges. There are also voices from the private sector advocating for more sustainable data practices. For example, The Economist argues for eliminating storage waste, realising the value of small/dark data, and optimising networks and data transmission.
Data removal or erasure goes even further than data minimisation: it means deleting non-relevant data to minimise the storage required by a given activity. This could be an interesting avenue for encouraging organisations to think about how they use data and about the environmental impact of that use and storage. For example, the RAD tipsheet, a practical guide put together by The Engine Room, could be repurposed for environmental purposes.
Policy and governmental frameworks should integrate efforts for digital decarbonisation
Governments and international organisations have a key role to play in supporting and encouraging stakeholders to steward data for environmental purposes and in the most sustainable way possible. The literature on the topic indicates that data stewardship is not taken into account in sustainability frameworks, such as ESG. Similarly, environmental sustainability is not always embedded within data governance frameworks. This is an important moment to act in this space. For example, B-Corp is launching a second round of public consultation as part of the review of their certification requirements. This presents an opportune moment to champion the inclusion of a criterion addressing the ecological cost of data and digital technologies.
A cultural shift is needed to address digital decarbonisation
Achieving environmental sustainability in responsible data stewardship requires a cultural shift. There is a lack of understanding of the value that responsible data stewardship could bring to addressing climate change.
Public perceptions of the environmental impact of the meat or airline industries seem much more developed and structured than those of the environmental cost of data and digital technologies. Research, documentaries and public awareness campaigns have multiplied in the last decade, helping to sensitise public opinion and policy to the environmental cost of those industries. Widespread understanding of the true environmental cost of data stewardship and related practices would be a powerful tool in achieving a cultural shift by:
- Embedding understanding of the scale of the phenomenon, thereby driving ambitious policy-making.
- Sharing good practices that could inspire and educate other organisations across an ecosystem.
- Mitigating harmful scenarios where organisations transition from offline to online services, inadvertently causing adverse environmental consequences.
- Boosting innovation in sustainable technologies and data practices.
- Enhancing collaborative efforts, e.g. data sharing partnerships and research collaborations, towards reducing the digital carbon footprint.
It is therefore essential that sustainability is included in all dialogue around digital and data ethics.
Conclusion
The risks to environmental sustainability will only increase as data and technology continue to proliferate. Data stewardship can be seen as a way to address climate change. However, this approach must also acknowledge that data practices generate carbon emissions, adding to the larger issue. There is therefore an urgent need to connect these elements concretely and to integrate them increasingly into public environmental discourse.
Embedding environmentally friendly data stewardship practices requires involvement and commitment from various groups, including academia, third-sector research, policy-makers and the public, as the Climate Change AI initiative demonstrates.
If you advocate for climate action within civil society, private companies, policy-making or the tech sector and understand the impacts of data on our environment, come and speak to us about Responsible Data Stewardship by sending us an email at [email protected]. Feel free to share your valuable insights or any responsible practices you may have come across.