Big Data in a Time of Crisis: Maximizing its Value – And Avoiding its Risks – In the Fight Against COVID-19
It is estimated that approximately 2 million people around the world have been infected by the coronavirus, and the numbers continue to grow. Yet the data being reported are primarily coming from national governments, which are – for the most part – basing them on who has been tested, which likely reveals only a fraction of the scale of the pandemic. There’s a critical need for real-time reported data from hospitals, health clinics and outreach facilities, based not only on testing, but on symptoms and other key determinants. In addition, we need to know in real time who lacks access to a health clinic or other basic services, and which clinics don’t have personal protective equipment (PPE), ventilators or beds – as well as where the most vulnerable people live, their age, gender and other crucial demographic information. If there is one thing this pandemic has exposed, it is the acute weakness of the world’s data systems. Even in supposedly “developed” countries, we’re struggling to record vital pieces of health information, and even when we are, we’re all using different definitions.
With more timely, accurate health system reporting we could be sending urgent supplies, drugs, ventilators and PPE rapidly to the right areas. Additionally, with more disaggregated population data, we could target interventions at vulnerable, elderly and high-risk populations.
The solution to all this is investment in modern and robust data systems that can produce real-time data. Now more than ever, it is clear that data should not be an afterthought, and it is not just about monitoring and evaluation. Data is a crucial part of making informed, timely decisions. Building high-functioning data systems is like putting foundations under a house: without them, the walls are bound to come down.
Innovations for Better Data
To address this growing need, many governments are turning towards innovation aimed at generating and analyzing big data. By partnering with the private sector and other data innovators, such as universities and NGOs, governments are finding new ways to understand where and how people live, to assess their health and well-being, and to evaluate their needs, demands and access to services. Thanks to emerging technologies, they are able to do this in a timelier, more reliable way.
During the current pandemic, there are multiple examples of big data and artificial intelligence being used to inform planning and decisions. For example, Taiwan has used national health insurance data, combined with immigration and customs data, to track citizens’ travel histories and possible COVID-19 symptoms. Meanwhile, the European Commission has asked telecommunications companies, like Orange and Deutsche Telekom, to share their data to help countries track the virus’s spread and mobility patterns. And O2, another mobile network provider, is currently in talks with the UK government about generating anonymous heatmaps to track citizens’ movements and disease transmission. Additionally, researchers in several countries are examining traffic and pollution data to understand how cities are being affected by quarantines and movement restrictions.
The use of satellite imagery is not as new and shiny as big data or AI, but it’s equally crucial, as it can be used to map individuals’ access to services and infrastructure across populations. Groups like Flowminder, as well as members of SDSN TReNDS’ POPGRID Data Collaborative, are using their satellite-based population estimates to identify where the most vulnerable people are, and whether they have physical access to healthcare services. These methods are especially useful during the current pandemic because they enable us to reach individuals who may otherwise be missed due to the geographic and temporal constraints often posed by traditional data-collection methods, like censuses and household surveys.
Avoiding the Downsides of Big Data
While these innovations are hugely promising, they certainly have their critics. Chief statisticians fear the declining authority of statistics and often argue that these new data-driven methods, which are neither overseen by the UN Statistical Commission nor vetted with painstaking methodological precision, are sub-par proxies that will compromise the quality of national statistics. These views undoubtedly have merit, but it’s better to have additional data on which to base decisions – even if it’s not as rigorously vetted – than to depend exclusively on information that comes from official sources, like censuses and surveys, which may not be up to date.
However, as we rush to respond to the pandemic and to find quick solutions to the complex challenges it has raised, we must not forget the long-term implications of gathering more and better data. Giving governments, private companies or civil society groups access to highly personalized data – even if anonymized – is risky at the best of times (as highlighted in a recent Economist article, and by my colleague Tom Orrell of DataReady). We must carefully balance the need to innovate against individuals’ privacy, especially when the legal rules need to be bent out of necessity.
That’s why there’s a critical need for a clear policy framework that lays out how data can be used, by whom and for what purposes. Moreover, if new data-generating measures are enacted during a crisis – like COVID-19 – there must be parameters that establish their duration and when they’ll be rolled back (in line with the UN Siracusa Principles). The International Committee of the Red Cross’ data protection guide for the humanitarian sector and the Organisation for Economic Co-operation and Development’s guidelines on the protection of data privacy (among others) offer important international standards and guidelines that governments should adopt during this time of uncertainty. And above all, any innovation should be pursued transparently, so citizens know what data-gathering approach is being trialed, and how it will affect their personal information. Now, more than ever, freedom of information acts and other checks and balances are crucial, and countries reneging on these measures need to be held to account by the UN and others.
Every organization and business, big or small, wants to help address COVID-19, and real-time data can play a pivotal role in this. It is right for governments to try to capitalize on offers of assistance from data-focused enterprises – every helping hand matters if we are to improve the quality and timeliness of national health and information systems. But unless new agreements and processes are transparent, and citizens retain the right to monitor what information is being used and how, we are likely to end up sacrificing our long-term data rights for short-term benefits.
Jessica Espey is a Senior Advisor to the United Nations Sustainable Development Solutions Network (SDSN), and is the Director of SDSN TReNDS.