October 31

Frank Pichel / Lindsay Ferris

Pausing the Data Revolution to Ask a Few Questions

Google us.

You’ll learn where both of us live, went to school and worked. You can even learn our political affiliation.

All of this is thanks to the ongoing revolution of open data.

Never before has so much information about our fellow humans been so readily available.

But, as highlighted earlier this month at the International Open Data Conference in Madrid, the data we have can be skewed and does not always reflect global priorities. We have far more data from developed countries than developing countries. And the data that we have from the developing world – even in the rare cases when it is a complete and accurate dataset – is not easily accessed.

So, for example, you can learn our political affiliation if you dig online for a few minutes. But days at your laptop will not yield reliable information about who has legal rights to vast swaths of land across dozens of countries in sub-Saharan Africa. And the information that is available often doesn’t reflect on-the-ground realities.

As the data revolution continues, we must ask:

  • How do we prioritize what data is collected?
  • What data is opened, and to whom is it made accessible?
  • And who should make these decisions?

The answers to these key questions will have a far-reaching impact on global and national government priorities and decision-making.

We believe the answers to these questions should be the product of discussions among three groups: open data experts, domain specialists who can provide a nuanced understanding of their field, and data end users, including communities, governments and corporations.

While our first instinct in collecting and opening datasets is to prioritize those that have the most demand and potential for re-use, questions of whose demand we should prioritize remain.

The most powerful potential users – governments and corporations – might be those most capable of demanding and making the most use of land datasets. But the whole movement of open data is predicated on the notion that open data is the great equalizer – that giving farmers in Africa data can empower them.

“Open data ensures that power will be shared – and that the world we change will, with luck, become a fairer and more democratic one,” wrote Joel Gurin of the Center for Open Data Enterprise in the Guardian.

Currently, those we want to empower with data don’t have the same power and voice to demand data that meets their needs. Consider farmers in Africa who want to know who has rights to a certain plot of land that they may want to buy and invest in. Even more fundamentally, they may want a public legal record of their own rights to land so that they can securely invest in it and improve their harvests and their lives.

Four key themes can help guide our discussions on how we prioritize efforts to collect various open datasets to maximize the social good:

  • Increased coordination: Several instruments already measure the “openness” of government datasets, including the Global Open Data Index, the Open Data Barometer and the Open Data Inventory. We must recognize that different indexes can focus on different data needs and progress toward different social goods. Increased coordination between these varied indexes can minimize overlap and provide an opportunity to examine open data from different angles. Each index should look to fill knowledge gaps left by the others. For example, while the Global Open Data Index prioritizes datasets based on civil society data needs, the Open Data Barometer could focus on datasets that stimulate increased interoperability between government agencies.
  • Create measurement standards: Together, we should agree on our approach to prioritizing datasets we believe all governments should be asked to collect and release. This requires collaboration between sector experts and data experts, but also a discussion that includes all actors – from powerful governments and corporations to representatives of indigenous communities and other vulnerable actors. This group should also offer operational guidance focused on maximizing social good.
  • Impact and demand: The aim is always to promote the datasets whose release will have the most impact. However, what kind of impact a release will have, and whether that impact will actually come to fruition, is hard to predict, especially across a global context. Therefore, we should first prioritize the outcomes we’d like to see and then measure how far releasing a dataset will move us toward these desired outcomes.
  • Addressing power imbalances: Often the civil society organizations that create indexes have their own aims, including a desire to get data into the hands of those who might not have the ability to pay for subscription datasets. Therefore, we should strive to collect and make accessible datasets that will alleviate, rather than exacerbate, information asymmetries.


Frank Pichel is co-founder and chief program officer, and Lindsay Ferris is the Open Data in Land Fellow at Cadasta Foundation.

Photo illustration courtesy of Open Data Watch, via Facebook.


