Case study: How Aidspan use open data to track health spending
This case study was supported by the Partnership for Open Data, funded by the World Bank.
Authors: Stephane Boyera and Carlos Iglesias
- Executive summary
- About Aidspan
- Aidspan's data approach
- Aidspan’s consumption of data
- Data supplied by Aidspan
- Tools and technology
- Key success factors
- Challenges for watchdogs like Aidspan
Executive summary
Aidspan is an international NGO that serves as an independent watchdog for the Global Fund, a partnership that includes various public and private donors, governments, international organisations and civil society organisations whose role is to fight HIV/AIDS, tuberculosis and malaria around the world. The objective of this study is to explore how Aidspan has been using open data to increase its impact and to provide a series of analysis and visualisation tools to its community.
It also contains recommendations for how Aidspan can extract more value from open data. The vast majority of datasets used in the various Web tools offered by Aidspan come from the Global Fund, with some coming from other well-known organisations such as the International Aid Transparency Initiative (IATI) or the World Bank. The datasets that are integrated in the Aidspan web site and tools represent only a subset of the overall sources of information used by Aidspan analysts to build their analysis and reports. The case study focuses only on the subset of datasets used as open data.
Aidspan provides different tools on its website to visualise information related to Global Fund grants and grantees, grant performance and donors. Internally, Aidspan uses a homemade tool – the Aidspan Portal Workbench (APW) – to integrate the various datasets and carry out analysis. This tool is not yet public but will be released soon. Thanks to this data-driven approach, Aidspan is a very innovative watchdog organisation that efficiently supports the Global Fund community at large, not only with evidence-based analysis but also by providing tools to help various stakeholder groups (grantees, grant applicants, technical partners and so on) engage with the Global Fund.
However, Aidspan faces a number of challenges. These include:
- Community building: While Aidspan is well-connected to the Global Fund community, it is difficult for it to engage with its broader audience and understand the needs of various actors in terms of datasets or specific visualisations.
- Capacity building: Aidspan does not have the resources to integrate all the datasets they are currently using. Moreover, without an open data background, it is difficult for members of Aidspan staff to take full advantage of the open data community resources and some key elements, like data licenses, are missing.
- Data quality and completeness: While the quality of data has been improving over the years, especially since the new Global Fund Data Site was put in place, exploiting other sources is still difficult. Some sources are not structured, or are not in a machine-readable format.
These challenges lead to a series of recommendations:
- Increase the use of external data sources: For now, the open data tools provided by Aidspan primarily use Global Fund data. It would be useful to cover more datasets to expand the analysis capabilities of Aidspan's audience.
- Publishing your own data: The APW is currently an internal tool only. It would be useful for the community if it were made public and the number of datasets published were then expanded.
- Expand your activities: Aidspan could expand its activities to cover not only financial dimensions of the Global Fund, but also cover activities and impact.
- Build internal open data capacity: Aidspan should develop its internal capacity in open data and could also help develop the open data capacities of national-level watchdog organisations.
- Increase knowledge of your customers: Aidspan could use open data tools online to engage with its audience to understand new needs, requirements and how to support them better.
Finally, the report contains recommendations for other actors, such as the Global Fund community at large, watchdog organisations in general, data producers or publishers or the open data community in general. These recommendations include how they can take advantage of the Aidspan example, or how they can support organisations like Aidspan more effectively.
The study explores how Aidspan, a watchdog organisation monitoring the Global Fund, has been using open data to increase its impact and provide a series of useful tools to its community.
In terms of methodology, the study was conducted in three phases, including preliminary desk research, a series of interviews with representatives from Aidspan and its various stakeholder groups and a final analysis consolidating the findings and proposing a series of recommendations. The report is structured in three major parts: a short Aidspan profile and its open data approach; key findings of the study; and finally the recommendations and conclusion.
About Aidspan
Aidspan is an international non-governmental organisation that serves as an independent watchdog for the Global Fund, a partnership that includes various public and private donors, governments, international organisations and civil society organisations and whose role is to fight AIDS, Tuberculosis and Malaria in the world. The Global Fund was founded in 2002, and is a grant-making organisation investing nearly US$4bn a year to support programmes in more than 140 countries.
Aidspan was founded by Bernard Rivers in 2002, just after the launch of the Global Fund. Its mission is:
To serve as an independent watchdog of the Global Fund and its grant implementers through providing information, analysis and advice, facilitating critical debate and promoting greater transparency, accountability, effectiveness and impact.
Over its twelve-year history, Aidspan has developed various activities to realise its mission. Since its creation, Aidspan has been focusing its action on exploiting available data, processing them and extracting valuable analysis to make them meaningful to the Global Fund community at large. In recent years they have also been developing and releasing various data analysis and visualisation tools to support the different Global Fund stakeholders and to build capacities of local country-level watchdog organisations.
Although people at Aidspan are the primary users of the data they produce during their daily work, its audience also includes all other stakeholders of the Global Fund. These consist of the Global Fund’s Board and Secretariat; Global Fund donors; grant applicants and grant recipients. It also covers other NGOs and international organisations working as observers and activists, as well as researchers and media interested in exploiting Global Fund data in different ways. Finally, its audience includes technical partners of the Global Fund – eg UNAIDS; WHO; STOP TB; Roll Back Malaria; UNICEF; and UNITAID. This is a relatively small community and a large part of it is already aware of Aidspan and its work.
Aidspan currently provides information at the national level for more than 120 countries where the Global Fund operates. Aidspan is not yet able to provide data at the sub-national level, due to its limited internal resources and capacities, as well as the additional effort required to tap into national datasets that need heavy quality checks and may not be properly formatted.
Nevertheless, in order to be able to provide more specific insight at the country level, Aidspan is starting to work more closely with a range of other local partners in countries, including activists on health issues; professional associations; health workers; health rights campaigners; budget tracking and transparency organisations; and other local communities.
Aidspan's data approach
While the underlying spirit of the Aidspan approach has much in common with open data principles, Aidspan staff were driven by the desire to facilitate data access. They began years before the open data concept was developed, and until very recently haven’t been connected to the open data community at all. The same applies for the Global Fund itself, where transparency and accountability were key founding principles and the publication of their data has been part of their core activities.
Aidspan has, over a period of time, evolved in its use of Global Fund data. In the very beginning, Aidspan analysts tended to use a small amount of statistical data from spreadsheets and manually process the datasets to extract key elements that were then reported in Aidspan publications.
Subsequently, they moved to a full programmatic access to data in order to increase the amount of information collected and to perform a more in-depth and regular analysis.
Aidspan automated this process and developed an internal tool, the Aidspan Portal Workbench (APW), to ease the work of its analysts. At the same time, Aidspan decided to make the data available on its website in multiple ways. Finally, Aidspan is now in the process of releasing this internal tool openly to facilitate further data exploration and analysis by external users.
It is important to note that the datasets integrated in the Aidspan website and tools represent only a subset of the overall sources of information used by its analysts to build their daily reports. In the sections below, this case study focuses only on the datasets that are used on the Aidspan website and tools.
Aidspan’s consumption of data
On the data demand side, Aidspan primarily uses data from the Global Fund Data Site, although it also uses some other Global Fund data, such as that published on the IATI Registry. It also uses some country economic indicators (Gross National Income and economic classification) from the World Bank open databases along with other disease-specific data, but this is limited to the most recent feature related to donor analysis.
Aidspan is now exploring the integration of other valuable data providers, such as the World Health Organisation, the United States Agency for International Development or the US Congress, as part of its future plan not only to focus on following the money, but also to start exploring the overall impact of programmes at the activities level. This is particularly hard due to the lack of access to reliable structured data and the limited resources in the Aidspan team to integrate such data (and check their quality) when they are available.
In order to perform appropriate analysis, data must be very granular, and this level of detail is almost impossible to obtain from the different sources Aidspan has been assessing. Aidspan also needs to deal with more specific data sources that are not always properly structured, such as national or health budgets. In other cases, when the data is available, it is still difficult to make the connection back to the original Global Fund grants.
Data supplied by Aidspan
Aidspan provides two main ways of interacting with data:
- The different tools available on their website, with data and analysis based on the Global Fund data sources, including:
- Grant Portfolio: a tool to explore total grant amounts by country, disease or regions, as well as country-level details.
- Grant Performance Analysis: a tool that allows for the analysis of grant performance according to the Global Fund ratings scale by a number of different criteria: region, country, disease component, health system, recipients and so on.
- Donor information: a tool providing information about donor pledges and contributions by the different types of donors, with an Aidspan 'generosity score' added in.
These tools are intended to help the average user gain access to, and an understanding of, the original data.
- Aidspan's own 'backend' set of utilities (the APW) is not available to the general public yet, but will be partially made public soon (it is currently being tested by some partner organisations and wider publication is expected for 2015). The original intention was to load the Global Fund data into Aidspan's own databases so that it could easily transform and adapt the data, create secondary data from different calculations and get more elaborate insights in general (eg with respect to specific country or regional profiles, or different disease profiles and spending or disbursement data analysis).
This tool also allows data export in a series of machine-readable formats (csv, xml and xls) and its data structure and storage database have been designed to support easy plugin of new data sources. Once available, other stakeholders will be able to access Aidspan raw data and use them to create ad-hoc reports or products.
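As an illustration of what export to machine-readable formats can look like, the following is a minimal sketch using only the Python standard library. It is not the APW's actual code: the record fields, identifiers and figures are hypothetical and invented for the example, and only csv and xml output are shown.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical records: field names and values are made up for illustration
# and do not come from Aidspan or the Global Fund.
RECORDS = [
    {"grant_id": "G-1", "country": "Country A", "disbursed_usd": 1200000},
    {"grant_id": "G-2", "country": "Country B", "disbursed_usd": 850000},
]


def to_csv(records):
    """Serialise a list of flat dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_xml(records, root_tag="grants", row_tag="grant"):
    """Serialise the same records to XML, one child element per field."""
    root = ET.Element(root_tag)
    for record in records:
        row = ET.SubElement(root, row_tag)
        for key, value in record.items():
            ET.SubElement(row, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


csv_text = to_csv(RECORDS)
xml_text = to_xml(RECORDS)
```

Keeping the records as plain dictionaries means that one internal representation can feed several export formats, which is one possible way to support the 'easy plugin' of new sources and outputs described above.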
Apart from these data and related tools, Aidspan also publishes other material, such as the different reports and papers, guides, or newsletters currently available on its website as unstructured data, mostly .pdf or text files. Staff also rely on an internal content management system that is used to store documents and references for their daily work.
As a result, not only can people read the various reports and analyses that Aidspan staff publish but they could also conduct their own analysis using the same data as Aidspan, challenge their conclusions, or run their own secondary research on specific regions or themes. People are already using this data in a variety of ways:
- Learning what is going on in the field and detecting trends
- Acquiring deeper knowledge on how to build successful grant proposals
- Identifying opportunities on finishing grants
- Evaluating risks for a given grant, a specific country, or with specific partners
- Identifying potential competitors in the same area
Tools and technology
All Aidspan tools for data processing have been developed internally after initial conceptualisation of the organisation's specific needs. Other visual frameworks and libraries are then used for the graphical presentation on top of its content management system. A future plan starting in 2015 includes the use of geolocalised presentation of information.
Aidspan usually gathers all the data from the interfaces provided by the original data sources, using a variety of techniques and standards (web services, APIs, scripts, JSON, OData and so on). They try to align with the technologies their main data providers are using.
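To make this kind of programmatic access concrete, here is a minimal sketch of collecting rows from a paginated JSON web service. It assumes the OData v4 JSON convention, in which each page carries its rows in a `value` array and an optional `@odata.nextLink` pointing to the next page; the source does not say which OData version or endpoints Aidspan uses, so the URLs and field names below are hypothetical, and the HTTP call is replaced by a stand-in function so that the example runs offline.

```python
import json


def collect_odata_rows(fetch_page, first_url):
    """Follow '@odata.nextLink' pointers and gather every row found in 'value'.

    fetch_page is any callable that takes a URL and returns the response
    body as JSON text; in real use it could wrap an HTTP client.
    """
    rows = []
    url = first_url
    while url:
        payload = json.loads(fetch_page(url))
        rows.extend(payload.get("value", []))
        url = payload.get("@odata.nextLink")  # None on the last page
    return rows


# Stand-in for a real service: hypothetical URLs, fields and figures.
FAKE_PAGES = {
    "https://example.org/odata/Grants": json.dumps({
        "value": [{"grantId": "G-1", "amount": 100}],
        "@odata.nextLink": "https://example.org/odata/Grants?$skip=1",
    }),
    "https://example.org/odata/Grants?$skip=1": json.dumps({
        "value": [{"grantId": "G-2", "amount": 250}],
    }),
}


def fake_fetch(url):
    """Return the canned response body for a URL instead of calling a server."""
    return FAKE_PAGES[url]


rows = collect_odata_rows(fake_fetch, "https://example.org/odata/Grants")
```

Passing the fetch function in as a parameter keeps the paging logic independent of any particular HTTP library, which fits the stated aim of aligning with whatever technology each data provider uses.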
Key success factors
This section lists the key success factors of Aidspan in its current activities.
Aidspan is a great example of a progressive watchdog that, while being completely independent, is working together with the organisation they are monitoring on improving data quality and facilitating access. Such a positive relationship is beneficial for both parties, as it makes the monitoring work easier and increases the value of the organisation being monitored.
Aidspan is not perceived as a problem by the Global Fund, but quite the opposite: it is seen as an asset that demonstrates the Global Fund's transparency and the quality of its work, and that helps the Global Fund to be more useful and efficient by improving the way it relates to its partners. This good and stable relationship between staff working on the datasets is key for mutual trust and success.
Adding value to data
Aidspan’s major success is transforming complex data from the Global Fund to a comprehensible and accessible format for the audience. From the perspectives of grantees and other Aidspan data consumers, it is far simpler and easier now for them to access and use the data. They can easily track how their grants are performing or where they stand in terms of budget and spending.
Aidspan has also had a relevant impact on the quality of the fund’s data by relentlessly commenting on its accuracy and its implications. There are several examples of data quality improving because Aidspan pushed the Global Fund for better information and more complete datasets. One of the most relevant comes from mid-2013, when testing of the new web services led Aidspan to detect that about half of all the grant data reported on the old website had been incorrect, by anything from a few dollars up to $300m.
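The source does not describe how this comparison was carried out. As a rough sketch of the general idea, cross-checking the same figures from two sources can be as simple as the following, where the grant identifiers and amounts are hypothetical and the function name is illustrative only.

```python
from decimal import Decimal


def find_discrepancies(old_figures, new_figures, tolerance=Decimal("0")):
    """List grants whose amounts differ between two sources by more than tolerance.

    Both arguments map a grant identifier to an amount. Only grants present
    in both sources are compared.
    """
    issues = []
    for grant_id in sorted(old_figures.keys() & new_figures.keys()):
        difference = new_figures[grant_id] - old_figures[grant_id]
        if abs(difference) > tolerance:
            issues.append(
                (grant_id, old_figures[grant_id], new_figures[grant_id], difference)
            )
    return issues


# Hypothetical figures, invented for the example.
OLD_SITE = {"G-1": Decimal("1000.00"), "G-2": Decimal("500.00"), "G-3": Decimal("750.00")}
NEW_SERVICE = {"G-1": Decimal("1000.00"), "G-2": Decimal("503.50"), "G-4": Decimal("90.00")}

issues = find_discrepancies(OLD_SITE, NEW_SERVICE)

# Grants that appear in only one source are worth reporting separately.
only_in_one = sorted(OLD_SITE.keys() ^ NEW_SERVICE.keys())
```

Using `Decimal` rather than floating point avoids rounding noise, which matters when the differences of interest can be as small as a few dollars.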
Being innovative as data-driven activists
Aidspan is also proud to be an organisation made up not just of activists but also of data analysts. The founder of Aidspan is an economist by education and passionate about investigative journalism, having spent a significant part of his life exploring and utilising public domain datasets to extract information that is easily understandable for the public.
This rigorous scientific approach also helps to build the legitimacy of watchdog organisations, providing a more professional and neutral image of the organisations working on such activities. Aidspan is now also starting to help other organisations to be more efficient in their watchdog function by questioning and evaluating data.
Challenges for watchdogs like Aidspan
This section summarises the key challenges that Aidspan is currently facing.
A balanced open data ecosystem requires both data supply and demand. Releasing data could be costly in terms of resources and time. People and organisations are likely to support these activities only if they can see a real sustaining impact. Partnerships are useful in allowing Aidspan to do more than its small team can achieve by themselves.
On the other hand, while qualitatively Aidspan knows well what its potential audience is, it does not know exactly which assets are being used by its users, how well their needs are currently being covered or what their future expectations are. There is also a general lack of feedback or requirements from Aidspan's audience, which might never have requested specific data analysis or new tools.
For a long time, Aidspan has tried to engage with its community through different online tools and some participatory events, but such initiatives have not yet delivered promising results. There is currently no specific interaction within the community to address these questions, apart from data enquiries and requests received from time to time, or during casual meetings at workshops and other events.
Aidspan is a small organisation that has a mix of different roles and capacities in its team covering the skills they need, including IT and data analysis. They also rely on the support of some external consultants for more specific tasks when required, for example, analysis of complex health systems. At present, the team is more limited by lack of time than any lack of skills.
For this reason, it could be useful to have local country teams that are able to do their own analysis in order to compare with official data from the Global Fund in search of problems and discrepancies. This is a field that Aidspan has started to explore, but working with communities of practice and local watchdogs is sometimes difficult due to the weak capacity to analyse data. Getting access to proper support and financing to address this issue is also problematic for an NGO like Aidspan.
The quality of available data was a major challenge in the early days and remains a problem due to the difficulty in checking or triangulating data from the Global Fund. Data was usually hard to use (eg only available within scanned documents), with different quality problems and often incomplete. Now the situation is different, given that the quality of data has been improving over the years, especially since the new Global Fund Data website was put in place. That made data quicker and easier to access or see problems with. Also, the predisposition and quick reaction from the Global Fund towards the feedback received made the quality leap possible.
The move from the old reports and spreadsheets approach to the current dynamic Global Fund data store brought an unexpected consequence with some loss of transparency in practice, given that many data users do not hold the required skills that such programmatic access to data requires (and were better able to cope with, for example, csv or excel files). This has probably reinforced Aidspan’s role, although the Global Fund has also reacted by incorporating new ways to explore and reuse the data, such as the data analysts and the data explorer to accommodate a wider range of users.
Aidspan has proven a very useful actor in improving access to data from the Global Fund, but there are still several important missing pieces, such as the different activities at grant level (deliverables, beneficiaries, audits, etc) or some grantee information (work, performance, etc). Not all recipients are listed on the websites, eg principal recipients are listed, but not sub-recipients. Data that the Global Fund currently publishes mostly comes directly from its business management software, but the data is first filtered and approved to avoid publishing any sensitive information.
More globally, Aidspan faces some limitations with regard to data availability, given that even when there are multiple potential sources it is not always easy to get structured data. Another challenge in this regard comes from the granularity of current data: sub-recipient registers are not available for public scrutiny, which prevents Aidspan from analysing at that level. This is a very important issue for an organisation that wants to be more relevant at the local level.
Finally, the adoption of the new Global Fund funding model brings a series of new indicators for Aidspan to incorporate in its analytical tools. In addition, there were no standardised key performance indicators for Global Fund data up to now, but it is now moving towards annual standardised indicators that will make comparisons possible and therefore open up very interesting new opportunities for data integration.
Following the findings above, we present here a list of recommendations that would help Aidspan to embrace a more complete open data approach. We also include a section with additional recommendations for other actors in the Global Fund ecosystem.
Recommendations for Aidspan
Based on the findings of the study, we have identified five major open data-related recommendations that should help Aidspan to increase its impact and better serve its audience. Those recommendations are:
- Use more external data sources through exploring datasets available at the national level and from international organisations.
- Publish your own data, including internal secondary data and a structured library of publications.
- Expand activities to include specific views on data that are still missing, visualisations related to the new funding model and information related to programme activities.
- Build internal open data capacity including open data tools, principles and recommendations such as data licenses; metadata standards; data extraction and transformation tools; visualisation concepts and so on.
- Increase knowledge about your customers with a two-sided approach, keeping track of current users and increasing community engagement through direct dialogue.
Increase the use of external data sources
Today, most of Aidspan’s analysis is based on Global Fund data. This is largely due to internal capacity and the time required to bring new sources on board, plus the quality gap that will need to be addressed. Nevertheless, with the increasing number of governments and organisations engaging in the open data movement and making their data assets available in a way that enables re-use, it should be easier to integrate more datasets in the near future. In that respect, there are two areas in which Aidspan should engage:
Explore datasets at the national level
With countries all over the world embracing the open data movement, this is a great opportunity for Aidspan to import more data from national repositories and later at the subnational level. Various public bodies (eg agencies and ministries) could be new useful sources of additional datasets for Aidspan. The same approach as the one put in place with the Global Fund should be developed again in this case, establishing positive dialogue with the various ministries.
Furthermore, most national open data initiatives are launched today in conjunction with activities aiming at developing capacity in civil society and developers’ communities. This is a great opportunity to create partnerships between Aidspan, an identified local watchdog organisation, and local open data experts to develop relationships with various ministries and exploit existing data. Here, the authors recommend that Aidspan follow the Code-for-Africa model to bring local open data expertise into organisations.
Explore international organisations
The second area for Aidspan to explore is using data sources from international organisations as an integral part of the APW. This might include both UN organisations and other associations like the International Budget Partnership or Pepfar for HIV data.
Many of these organisations are currently releasing huge quantities of data and are also cataloguing large datasets available in various regions of the world. As an example of the potential in this area, Aidspan is planning to exploit data collected by the WHO at the country level about activities and usage of Global Fund grants for tuberculosis. This data is already available and could be used today to map not only money but also activities in the field.
Publish your own data
While Aidspan is primarily a data consumer, it is also starting to explore the possibilities of releasing its own data, with immediate plans for making its data management tool partially available to the public. Further steps in this direction may include releasing all its publications as open data with proper metadata to increase findability, exploration and re-use by others, as well as publishing as much as possible of the various secondary data that Aidspan is producing from Global Fund data.
In the same way, Aidspan could offer local watchdog organisations or researchers publishing analysis on Global Fund activities a place to publish their own data, increasing the visibility of these local organisations as well as bringing added value to Aidspan customers while creating a more complete knowledge database.
Expand your activities
Aidspan has primarily focused on 'following the money' which means, in the case of the Global Fund, reporting on various aspects of grants. The various interviews conducted for this study highlighted the demand for more information in this area as well as in others, with some specific new needs:
One is related to specific views that are missing. For example, it seems that grant recipients or grant applicants have difficulties identifying actors working with the Global Fund in a given country or across various countries or regions. In this case, for example, a per-beneficiary (grantee) view is missing.
The second specific area is related to the new funding model. According to a few interviewees, the new funding model of the Global Fund requires new indicators to properly monitor activities. These indicators should be added to the current set of tools and visualisations.
Apart from the financial aspect, almost all interviewees mentioned the need to have indicators and information related to programme activities, not only monetary information.
Build open data capacities
The development of open data capacities should also be an integral part of the support to country-level watchdog organisations. At the local level, the development of capacities may also encompass tools for data collection by means of mobile technologies, or mobile services in general, to help local organisations gather data, citizen reports and human stories similar to the one published by Aidspan for World AIDS Day.
Know your customers
Aidspan, while releasing lots of tools and publications, has limited knowledge about its audience: who the users of its various assets are, what they look for when visiting Aidspan, their patterns of usage and needs, or what they are currently missing and would like to see developed.
All these dimensions are critical information for Aidspan to prioritise the various items in its roadmap or to adapt the tools to the various profiles of visitors. From our perspective, there are two main (not mutually exclusive) options to identify the different users and their behaviour in a given community that are presented below:
Keep track of your users
One approach could be to apply methods to analyse and detect usage patterns on the Aidspan website by various types of stakeholders. Apart from the usual analytics tools and user tracking techniques, there are other tools that are relatively easy to set up and which require only some web development, providing additional interesting information on Aidspan’s audience:
- Organising various surveys, either at random on the Aidspan website, or using registration information and mailing.
- Enabling people to easily use some of Aidspan’s data and tools on their own websites through an embedded widgets approach (see an example on the Land Portal).
Engage with your community
The second option proposes a different way to approach the audience. Today, Aidspan’s website is primarily informative and our suggestion is to add a community-driven component. This would involve engaging the various stakeholder groups in different online and face-to-face events to have direct dialogues, for example:
- Online debates: this is more than just opening comments on publications, and consists of moderated time-bound and subject-specific online debates such as the ones at e-agriculture.org or Land Portal websites.
- Blog space for the community at large, allowing people to publish and share their own views on relevant topics.
- A dedicated web area for local watchdog organisations, giving them a federated structure that meets on the Aidspan website. This means that all watchdog organisations supported by Aidspan could be hosted by Aidspan to form a network and central knowledge base of national Global Fund watchdog organisations.
- Face-to-face events such as hackathons, boot camps and other peer-learning sessions in various places and at events where groups within Aidspan's audience might meet.
Recommendations for other actors
Many different groups could take advantage of Aidspan’s experience to increase the impact of their work. These actors include:
- Members of the Global Fund community at large, who could participate in dialogues with Aidspan and help them to identify new requirements and data resources.
- Watchdog organisations in general, which might be inspired by the Aidspan example to go further in their monitoring role.
- Data producers or publishers who could realise the added value of getting into the open with regards to quality and transparency.
- The open data community at large who could look beyond the usual community members and collaborate with other types of organisations.
Global Fund community
Concerning the Global Fund community at large, we feel it would be helpful to have greater engagement with Aidspan at least at two complementary levels:
- Participating in dialogues with Aidspan and the community about the type of information, visualisations or aggregation that are most useful. This is the best way to secure new tools in the future that will best serve the Global Fund’s needs.
- Helping Aidspan to identify new sources of data that could be exploited by Aidspan to do a better job in monitoring activities and impact of the Global Fund.
Aidspan should be an inspiring example for other watchdog organisations in general, as it goes beyond the usual work of these organisations. By moving from manual analysis of data to machine processes, Aidspan identified various issues and was able to report this to the data producer. The detection of these errors is critical to provide a fair analysis of the data.
By providing its own analysis and its tools, data and visualisations that empower other members of the community to do their own analysis, Aidspan is itself transparent in its work, and increases its impact by allowing others to complement their monitoring work.
Finally, Aidspan shows that a positive relationship with the entity it monitors can have more impact than a relationship of conflict. It allows the watchdog to motivate the producer of data to adapt to their needs, and fix the issues identified without compromising their independence or their critical view on the entity they watch.
The case study shows clearly that releasing data is a very powerful way of being transparent and accountable. However, beyond this it shows that by allowing others to exploit your data in a machine-readable way, you allow them to detect issues and incoherency. This in return allows you to increase the quality of your data, and identify issues in various internal processes such as data collection or data curation.
In this case open data release could be seen as a critical quality assurance process. However, for this to take place, it is essential to work together with potential re-users of data to understand the data, formats and level of disaggregation they need.
The open data community
It is essential for its growth and impact that the open data community works towards supporting organisations like Aidspan and the Global Fund.
The only real way to demonstrate open data impact on society is to engage with existing actors and raise awareness of its potential. Building a library of case studies and promoting them is a positive step in that direction. Collaborating to provide tools and resources and build capacities within these organisations seems to be the natural next step.
Aidspan is a great example of how a watchdog can exploit existing data to analyse and provide insights on how the organisations it watches are performing.
While Aidspan is doing a similar job as traditional watchdog organisations, ie collecting data, analysing them or advising their communities (see for example the Bretton Woods project), its approach is very innovative.
It empowers its community by providing tools and data for people to do their own analysis easily and detect trends in specific regions or on specific topics. In that way, it is a shining example of how to make data easily accessible and valuable to all.
This case study also highlights areas where Aidspan could develop further and increase its impact through a greater use of open data. Further engagement with its audience may uncover needs for new tools or new datasets that will further increase the value of Aidspan in its community.
We would like to thank the following people whose contribution to the study was invaluable: Dr Kate Macintyre, CEO of Aidspan, Bernard Rivers, founder of Aidspan, Kelvin Kinyua, Aidspan Senior Systems Officer, Angela Kageni, Aidspan Senior Outreach Programme Officer, Murad Hirji from the Global Fund, Trevor Mwiu from World Vision UK, and Dr Christian Gunneberg from the World Health Organisation.
We would also like to thank William Gerry, Liz Carolan and Emma Truswell from the ODI, who led and organised this study.