Open data in Mexico: how Data Squads jump-started national publishing

Part of the National Digital Strategy of Mexico. This case study and the involvement of the Open Data Institute in the Data Squads were supported by the Partnership for Open Data, funded by the World Bank.

Authors: Paulina Bustos Arellano, Ricardo Daniel Alanis Tamez, William Gerry and Codeando Mexico



Introduction

In 42 days between May and July 2014, the Mexican Data Squads programme led to the release of 100 datasets, coordinating 10 federal agencies and greatly advancing the Mexican National Digital Strategy. It led to the development of datos.gob.mx, and is becoming an example of how quickly a government can make strides in open data with the right leadership.

Here we present a short case study about how the Data Squads were formed, what they did and the lessons learned from the programme, aiming to inform other governments considering similar initiatives. Its purpose is to tell the story of the initiative, rather than to evaluate its success or give advice on next steps. The case study was informed by interviews with key actors in the Data Squads, and by examination of the written materials that guided the development of the programme.


Background: Mexico’s National Digital Strategy

In October 2013, Mexico launched its National Digital Strategy to promote the use of Information and Communications Technology. The primary goal of the strategy is to achieve a ‘digital Mexico’ in which technology facilitates economic development and improves the quality of citizens’ lives.

The strategy has five objectives:

  • transforming the culture surrounding digital technology within government
  • developing a digital economy
  • providing quality education
  • implementing universal and effective health services
  • improving citizen security

To effect a change in culture in the Mexican government, the strategy introduces open government principles, of which open data is a key component. It aims to use open data to create openness in government, develop a national geospatial strategy and introduce an open data element to the public policy process.

Mexico has worked on implementing its open data programme through three different strands. First, an open data portal (catalogo.datos.gob.mx) has been successfully developed to the Beta version for wide release to the public. Secondly, the government has begun to establish an open data policy for Mexico, which will detail how federal agencies must implement open data principles into their processes and activities. Finally, the Data Squad programme was piloted in summer 2014, spurring the release of a large number of datasets in a very short period.


Data Squads: Programme design

The main objective of the Data Squads programme is to fast-track the delivery of open data by federal agencies by overcoming technical, legal, organizational and political barriers. The Data Squads are groups of open data experts organised by the President’s Office to build the capacity of federal agencies to release open data.

Programme structure

The programme comprises three main actors: management, the Data Squad and federal agencies.

Management - The National Digital Strategy team receives its mandate from the Office of the President of Mexico. This is crucial, as the team has the explicit backing of the President: when coordinating a wide range of federal agencies, lack of authority would be fatal to the programme. Members of the strategy team are responsible for the implementation of open data in the federal government.

The Data Squad - The squad was drawn from two agencies in Mexico: the National Digital Strategy and the Information and Documentation Technology Fund (INFOTEC). The squad is divided into three areas: information architecture and security, public policy, and legal issues. Each area is staffed by experts in that field, so that obstacles can be overcome quickly according to the requirements of each federal agency.

Federal agencies - Mexico’s federal government consists of 273 federal agencies and organisations. The Data Squads are focused on accelerating the delivery of open data from these agencies, as well as ensuring that departments understand and can follow the open data policy.

Ten agencies participated in the initial Data Squads initiative. Some agencies were invited by the National Digital Strategy team to participate in the programme, based on their preparedness for open data and their ownership of critical datasets; others applied to join the programme. Two liaison officers worked to develop relationships with the agencies: one focused on the technical field, whilst the other concentrated on data collection and availability.


Programme timeline

The squad programme guided agencies through a series of seminars delivered over six weeks, building from an introduction to open data policy to a final stage supporting the use of data. The programme was built with the help of experts from the Open Data Institute. The following process was used to prepare chosen agencies for data releases:

  • Week 1: Introduction: This session communicated the objectives and agenda of the programme, along with information on open data.

  • Week 2: Planning: Prior to this session, the Squad explored the systems and web pages controlled by the agency which related to the collection, analysis and publication of data. The objective was to identify the most valuable datasets and develop an action plan for quickly delivering them as open data.

  • Week 3: Opening data: This interactive session was geared towards progressing datasets to an open format, and ensuring that data was delivered using web services. The Data Squad gave attendees examples of materials and tools (e.g. OpenRefine) to analyse, clean and standardise data.

  • Week 4: Automation: This session’s objective was to create the agency’s strategy for delivering data (e.g. setup of web services), which included exploring Mexico’s open data portal and standard delivery methods.

  • Week 5: Publication: During this session the method of publication was finalised and agreed. The Data Squad demonstrated a tool they had created (ADELA) to accelerate the publication process.

  • Week 6: Usage: This session demonstrated examples of data products, and then developed a list of products the agencies could build using their data.
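
As a rough illustration of the kind of cleaning and standardising covered in the Week 3 session, the sketch below tidies a small CSV extract in Python. It is not the tooling the Data Squads used (the sessions relied on tools such as Open Refine); the function names, the cleaning rules and the sample data are all hypothetical and chosen only for illustration.

```python
import csv
import io
import unicodedata


def standardise_header(name: str) -> str:
    """Lower-case a column name, strip accents and join words with underscores."""
    decomposed = unicodedata.normalize("NFKD", name.strip())
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return "_".join(without_accents.lower().split())


def clean_csv(raw_text: str) -> str:
    """Return CSV text with standardised headers, trimmed cells and no blank rows."""
    reader = csv.reader(io.StringIO(raw_text))
    # Trim stray whitespace around every cell.
    rows = [[cell.strip() for cell in row] for row in reader]
    # Drop rows that are empty or contain only empty cells.
    rows = [row for row in rows if any(row)]
    if not rows:
        return ""
    # Assumption: the first remaining row is the header row.
    rows[0] = [standardise_header(header) for header in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical sample extract, not taken from any agency dataset.
    sample = " Estación , Nivel Agua \n A1 , 3.5 \n\n"
    print(clean_csv(sample))
```

Keeping the header rule in its own function means the same naming convention could be applied across files from different agencies, which is the sort of standardisation the session aimed at.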


Piloting the programme

Agencies

The National Digital Strategy Coordinator selected agencies to engage with in the pilot phase of the squad's programme. Invitations were sent to agency heads, who directed the appropriate division to work with the squads. Most agencies were represented either by their technology or statistics divisions.

The following agencies participated in the pilot cohort:

  • CONAGUA: National Water Commission
  • CONAPO: National Population Council
  • CONEVAL: National Council for Social Policy Evaluation
  • NAFIN: National Development Bank
  • PEMEX: Mexican Oil Company
  • SAGARPA: Ministry of Agriculture
  • SALUD: Ministry of Health
  • SCT: Ministry of Communications and Transportation
  • SEDESOL: Ministry of Social Development
  • SEP: Ministry of Education.

Datasets released and results

The datasets that were developed and released through the activities of the squad and agencies can be found at catalogo.datos.gob.mx.

The following table summarises the quantity and quality of these datasets. Data quality was assessed using the 5-star rating framework set out by Sir Tim Berners-Lee. The type of data released is categorised according to the groupings of the Open Data Barometer (ODB). The grade for each agency's implementation was calculated using the Open Data Barometer methodology.

| Agency | No. of datasets | 5-star evaluation | Categories of open data | Implementation grade (max 100) |
|---|---|---|---|---|
| CONAGUA | 7 | 3 to 4 | National Environment Statistics, Detailed Government Budget, Map Data | 87 |
| CONAPO | 1 | 3 | Detailed Census Data | 95 |
| CONEVAL | 9 | 3 | Performance Data | 90 |
| NAFIN | 1 | 3 | Detailed Government Expenditure | 90 |
| PEMEX | 51 | 4 | International Trade, Performance Data and Detailed Government Budget | 95 |
| SAGARPA | 18 | 3 | Map Data, International Trade, Land Ownership, Detailed Government Spend, Company Registrar and Legislation | 95 |
| SALUD | 3 | 3 | Detailed Government Budget and Performance Data | 95 |
| SCT | 6 | 3 | Public Transport Timetables, Performance Data, Detailed Government Spend, Detailed Government Budget, Legislation | 95 |
| SEDESOL | 4 | 2 to 3 | Detailed Census Data, Performance Data, Detailed Government Spend, Company Register | 87 |
| SEP | 5 | 3 | Detailed Census Data, Performance Data, Detailed Government Budget | 95 |

Note: Some agencies went beyond the publication of datasets: for example, a tool to spread weather alerts was built by the Meteorological National System (part of CONAGUA) and Google using the Data Squads-generated API.

The programme advocated that datasets published through the Data Squads should achieve a rating of at least 3 stars. The Open Data Barometer indicator point system was used as a metric for measuring the success of implementation: 65 of a maximum of 100 points are achieved by uploading the data to the open data platform. The remaining 35 points depend on the agency ensuring that the uploaded datasets continue to be released in bulk, kept up to date and maintained in a sustainable way, with the final 5 of those points awarded for linked data (5-star quality), a challenging but desirable step.
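
To make the star ratings concrete, the sketch below encodes the five cumulative levels of the 5-star scheme and checks a dataset against the programme's 3-star target. It is only an illustration: the `Dataset` fields and function names are hypothetical, and it does not reproduce how the Data Squads or the Open Data Barometer actually scored datasets.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a dataset, one flag per star level."""
    open_licence: bool      # 1 star: on the web under an open licence
    machine_readable: bool  # 2 stars: structured data (e.g. a spreadsheet, not a scan)
    open_format: bool       # 3 stars: non-proprietary format (e.g. CSV rather than XLS)
    uses_uris: bool         # 4 stars: things are identified with URIs
    linked: bool            # 5 stars: linked to other data to provide context


def star_rating(dataset: Dataset) -> int:
    """Count stars cumulatively: a level only counts if all lower levels are met."""
    levels = (
        dataset.open_licence,
        dataset.machine_readable,
        dataset.open_format,
        dataset.uses_uris,
        dataset.linked,
    )
    stars = 0
    for met in levels:
        if not met:
            break
        stars += 1
    return stars


def meets_programme_target(dataset: Dataset, minimum: int = 3) -> bool:
    """Check a dataset against the 3-star minimum the programme advocated."""
    return star_rating(dataset) >= minimum


if __name__ == "__main__":
    # Hypothetical example: an openly licensed CSV file without URIs or links.
    csv_release = Dataset(True, True, True, False, False)
    print(star_rating(csv_release), meets_programme_target(csv_release))
```

Because the levels are cumulative, a dataset that uses URIs but is published in a proprietary format still rates only 2 stars in this sketch.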

Consolidating change

A concern of some agency heads was that data quality was undermined by the rapid implementation schedule of the Data Squads programme. In our interviews, they argued that a more stable, permanent programme would enable the release of consistently high-quality data, which would be more valuable to users.

Feedback from agencies

Open data is still an emerging concept, so it is important to examine the experience of the actors involved. Appendix A summarises the agencies' experiences. The agencies have been divided into three clusters to categorise the different working approaches to implementing open data in government:

Cluster 1: Agencies with established data and ICT infrastructure

"PEMEX is a big agency and people don't understand it, this program helps because it promotes PEMEX work and makes it visible to the world" - Fernando Rodríguez Rivera, PEMEX

Some agencies, such as PEMEX, already have a strong ICT infrastructure. The programme must ensure that those agencies build value from the data they already collect and control. Feedback from these agencies suggested looking for partnerships with other countries (e.g. creating international standards) and looking for value beyond opening data (by creating data products).

Cluster 2: Agencies with transparency and statistics infrastructure

"We didn't call it Open Data, but we already had the infrastructure to share data to inform our constituents" - Enrique Minor, CONEVAL

For most agencies, involvement with the Data Squads was their initial engagement with open data ideas. However, many of these agencies already had processes in place to meet transparency and statistics requirements. It was important for the Squads to approach these agencies with an understanding of the work already done in these fields.

Cluster 3: Agencies with organisational and resource limitations

"Our agency depends on the Secretary of Governance, thus, we need to work with their agenda and resources of this Secretary in order to accomplish the Open Data policy goals" - Sergio Iván Velarde Villalobos, CONAPO

Some agencies in the federal administration are not independent and lack resources to carry out an open data initiative. The National Population Council (CONAPO), for example, depends on the Secretary of Governance: it lacked adequate resources to conduct the program on its own.

It is important that the Squads recognise these obstacles and adapt to circumstances.


Conclusion and lessons learned

This methodology provides early success stories or "quick wins", building momentum and enabling leaders to bypass some bureaucratic obstacles. The programme was effective at expanding the supply of open data, making a high number of datasets available for use and re-use.

Several governments in Latin America have approached the Mexican government and the Open Data Institute to discuss deploying the Data Squads methodology in their countries to provide a rapid implementation of their open data initiatives. This shows that there is a demand for programmes that swiftly unlock the value of open data.

The Data Squad programme provides a valuable tool that spreads open data principles quickly and enables rapid unlocking of datasets. The relatively cheap and impactful nature of the programme suggests that it may also be repeatable in other contexts.


Acknowledgments

Thank you to the following people who helped us with scheduling interviews and participated in them:

  • Ania Calderón - Director of Digital Innovation at National Digital Strategy
  • Carlos Castro Correa - Director of Open Data Analysis at National Digital Strategy
  • M.C Ebenezer Hasdai Sánchez Pacheco - Open Data Squad Technical link
  • Rodolfo Wilhelmy - Director of Open Data at National Digital Strategy
  • Enrique Zapata - Director of Open Data Policy at National Digital Strategy

Agency interviewees:

  • Fernando Rodríguez Riera - PEMEX
  • Sergio Iván Velarde Villalobos - CONAPO
  • Carlos Arriola Alva - SAGARPA
  • Enrique Minor - CONEVAL
  • Dr. Carlos Patiño - SCT
  • José Antonio Orozco - SEDESOL

Appendix A: Summary of experiences (where interviews were possible)

| Agency | Background | Benefits | Challenges | Recommendations |
|---|---|---|---|---|
| CONAPO | New to open data | Visibility to Transparency topics for constituents | Organizational challenges, related to resources. | It is important for Squads to analyze resources of the agencies in which the program is being implemented. |
| CONEVAL | Existing infrastructure for publishing data in their site | Supporting the National Digital Strategy | Different approaches to open data | The Squad should fully understand the policies for which the agencies are accountable for. |
| PEMEX | Had a big infrastructure for statistics and reporting data | Data gives visibility to PEMEX operations to citizens. | The current PEMEX infrastructure allowed to smoothly meet with the program. | Bring suggestions of usage of data from other countries. |
| SAGARPA | Had published data for a long time to improve the way people cultivate food in Mexico. | Change perspective on "data property" inside the agency. | Found the goals of the open data team difficult to follow. For example, they created their own Open Data Portal, because at the beginning they believed they had to. | Build from the work in the agency. |
| SCT | No previous work was performed, but some knowledge of open data within the leadership. | The agency is left with the adequate tools to meet the requirements of the Open Data Policy. | Include all the organization inside the agency in the program. | The Squad should include expertise in the work that the agency performs. |
| SEDESOL | No previous work on open data, but protocols were in place for publishing information to citizens. | Standardize conversations and tools around open data inside the agency. | Large amounts of information | Diagnose the agency work before starting with the program. |