Innovating for Sustainable Data Collection: Generating ‘Gold Standard’ Survey Data – For 75% Less Cost
Editor’s note: This article is part of the NextBillion series “Big Data: Big Risks, Big Opportunities,” one of three special series we’re running this year. Learn more about NextBillion’s 2020 series here.
Traditional face-to-face surveys are regarded as the gold standard for financial inclusion surveys, as they use probability sampling methods and trained interviewers to administer the surveys to people in their homes. However, sending surveyors out into rural and excluded communities is an expensive proposition. A mixed-mode methodology that uses more cost-effective data collection modes, such as SMS (text messaging) and computer-assisted telephone interviewing (CATI), can cost as much as 75% less than a typical face-to-face survey. The other major limitation of traditional face-to-face surveys is that they tend to be very long and can cause cognitive fatigue in respondents, which reduces the quality of the data collected.
That’s why the insight2impact facility set out to find a more sustainable approach to acquiring limited but high-quality financial inclusion data. We approached that task with the goal of generating insights into people’s behaviour and perceptions that were just as accurate and relevant as the data acquired by traditional face-to-face surveys – and to do so at a much lower cost. (This project is just one element of insight2impact’s work, which aims to positively impact the entire data landscape to improve financial inclusion outcomes.)
Generating Accurate Financial Inclusion Data at a Lower Cost
To reduce data collection costs, we had to consider various technological modes for delivering the survey: We eventually decided on SMS as our primary digital mode, as it allowed us to achieve the widest reach in the populations we were interested in. We had pre-piloted standalone SMS surveys in 2017 and knew that they resulted in skewed estimates, so we needed to fine-tune our approach for our second and third rounds of piloting. In 2018 we decided that a mixed-mode data collection strategy with statistical modeling to correct for sampling and mode bias would be required to produce estimates that were credible.
We began collecting data via SMS and CATI in eight countries. As the face-to-face Financial Inclusion Insights survey datasets were available in all eight countries, we did not have to collect additional face-to-face data: We used these datasets as the reference surveys against which we would evaluate our results. In fact, any large-scale face-to-face survey, such as FinMark Trust’s FinScope, could have been used for this comparison – but we decided on the Financial Inclusion Insights surveys, as they also contained the digital financial service indicators that we were interested in. In the end, we had three datasets for each of the eight countries: face-to-face (FTF), SMS and CATI. To make a long story short, we showed that by combining a small sample of face-to-face data with the SMS data, and using multilevel regression and poststratification (MRP) to model the results, we could calculate estimates for the digital financial service indicators that were comparable to the full indicators in the Financial Inclusion Insights gold standard reference survey. This approach also reduced the cost of collecting these indicators by almost 75% – a major success for us.
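For readers unfamiliar with poststratification, the idea can be sketched in a few lines of Python. This is only an illustration and not the model we used: the multilevel regression is replaced here by a much simpler shrinkage estimator that pulls each cell's estimate toward the overall sample mean, and the cell names, responses, population shares and the `prior_strength` parameter are all hypothetical.

```python
from collections import defaultdict


def poststratify(responses, population_shares, prior_strength=2.0):
    """Estimate a population proportion from a skewed sample.

    responses: list of (cell, outcome) pairs, outcome is 0 or 1.
    population_shares: dict mapping cell -> share of the population
        (assumed to come from a census or reference survey; sums to 1).
    prior_strength: hypothetical tuning value; larger values pull small
        cells more strongly toward the overall sample mean.
    """
    overall = sum(y for _, y in responses) / len(responses)

    counts = defaultdict(int)
    positives = defaultdict(int)
    for cell, y in responses:
        counts[cell] += 1
        positives[cell] += y

    estimate = 0.0
    for cell, share in population_shares.items():
        denom = counts[cell] + prior_strength
        if denom == 0:
            # No data and no pooling: fall back to the overall mean.
            cell_estimate = overall
        else:
            # Partial pooling: a crude stand-in for a multilevel model.
            cell_estimate = (positives[cell] + prior_strength * overall) / denom
        # Poststratification: weight each cell by its population share.
        estimate += share * cell_estimate
    return estimate


# Hypothetical SMS sample that over-represents urban respondents.
sample = [("urban", 1)] * 6 + [("urban", 0)] * 2 + [("rural", 0)] * 2
shares = {"urban": 0.3, "rural": 0.7}  # hypothetical population shares

raw_mean = sum(y for _, y in sample) / len(sample)
adjusted = poststratify(sample, shares)
print(f"raw sample mean: {raw_mean:.3f}, poststratified: {adjusted:.3f}")
```

In this toy example the raw sample mean overstates usage because urban respondents are over-represented; reweighting the cell estimates by population shares pulls the estimate down. A real MRP model would estimate the cell values with a multilevel regression over many demographic variables rather than a single fixed shrinkage constant.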
Lessons Learned for Sustainable Data Collection
The most important thing we learned in this work is a lesson that’s applicable to anyone who’s trying to innovate: Listen to the naysayers and then ignore them! We were told many times by fellow researchers that our approach wouldn’t work, due to the inherent inaccuracies the data collection mode would produce – not to mention the differences expected due to mode effects. But our view was that even if we failed, we would still have advanced our thinking around these new methodologies. If we had only tried to incrementally improve on methods that had been proven to work in the past, we would never have generated new knowledge.
The other key lesson we took from the pilots is that it’s essential for innovators to surround themselves with a quality team. Innovation doesn’t happen in a vacuum, with one person dreaming up a creative solution all alone. Our innovations developed organically, as we grappled as a group with the various technical challenges we encountered throughout the three-year process. We were lucky to have great donor programme managers, as well as a global network of partners with the specialist skills we needed.
An important objective for the programme was to get our new measurement frameworks and data collection innovations adopted, so we would advocate for them in various forums. The most difficult thing we encountered in those efforts was resistance from fellow researchers, when we asked them to innovate their measurement frameworks or to consider alternative data collection strategies. The resistance typically centered around a reluctance to lose the ability to make direct comparisons to previously collected data, to remove questions from surveys, or to accept that alternative data sources or shorter surveys might produce better quality data. In addressing this reluctance, we found that if you collaborate actively and practically with researchers from different organizations on an actual survey or research problem, you are more likely to achieve success – as opposed to just standing at a pulpit and preaching about how things should be done differently. Advocacy will only get you so far. It’s true that this practical collaborative approach is time-consuming and resource-intensive, but the result was well worth it for us and the organisations to which we introduced the innovations.
Developing a 360-Degree View of Financial Lives
Our data team’s view has always been that you should match the research problem to the most appropriate data source, whether that’s survey data, supply-side data, transactional data or alternative data sources like social media. We now emphasize the importance of a data ecosystem, where all types of data and research methods have a role to play in building a picture of financial inclusion. To develop a rich picture of how people live their financial lives, we needed a 360-degree view that can only be attained by consulting different data sources and using multiple research methods.
To be clear, we’re not suggesting that large-scale face-to-face data collection should be phased out. Our contention is that comprehensive demand-side surveys will remain very important for diagnostic purposes: After all, the focus of this data is on the poor, who often have no digital footprint – so the only way to make them visible is to collect survey data. However, a key learning from this project is that researchers should not be afraid to use new data sources and research methods, even if they are not currently perfect. No data is ever perfect – you just need to define the boundaries of that data to understand to what extent it can be generalised.
We still have quite a bit of work to do: For starters, we need to write up our findings for the eight-country digital financial services mixed mode pilots in a digestible form. We are also planning to collaborate with partners to produce an academically publishable article on our approach. With this kind of work, it is always valuable to have it pass academic scrutiny. We had a panel of four academics that reviewed our work throughout the process. We even had Andrew Gelman, the modern-day father of MRP, join in our final review session – and to our delight he was impressed with our work.
We have another round of piloting running now, which explores the use of mixed mode data collection and MRP to collect gender-specific financial inclusion data in Kenya, Tanzania, Uganda and Pakistan. There is a huge gap in the availability of good gender data, and we believe our low-cost mixed mode method will help close some of this gap.
Going forward, a key focus for us will be to disseminate the work as widely as possible during the final months of the programme. This process is off to a good start, as we have already had a development partner ask us to design a mixed mode survey to measure headline financial inclusion indicators in several countries. We believe that this promising reception demonstrates the demand for new, innovative data collection methods that generate robust results in a sustainable manner. This advancement could have a substantial impact not only on the gathering of financial inclusion data, but also in collecting Sustainable Development Goal data, and data that speaks to the intersection of financial systems and the real economy.
Photo courtesy of Mirko Grisendi.