In the early 2010s, World Bank Group President Jim Yong Kim set the goal of universal access to financial services by 2020. In this article, I look at why this goal is still so important and how machine learning can help achieve it, even if a little late.
An estimated 1.7 billion people, over one-fifth of the global population, do not have access to a bank account, including 80% of those living on less than $2 a day. This means no structured savings, so in an emergency such as a failed crop, school fees cannot be paid. Children who drop out of school are then far less able to leave the poverty cycle. Approximately 95 million unbanked adults in Sub-Saharan Africa must travel regularly to collect cash payments for agricultural products, costing time and money. Closer to home, banking allows people to buy a house, to pay for gas and electricity efficiently, and to save money through discounts on recurring payments. Many of us don’t even notice these benefits. But in the UK, 1.2 million people are unbanked and pay a ‘banking poverty premium’ of £485 a year.
22% of Africa’s working-age population owns a business, the highest entrepreneurship rate in the world, and small to medium-sized companies make up the bulk of the economy. But without financial services, growth is difficult. Take Jenipher, a real-life case study based in Nairobi, Kenya. She has been running a food stall for decades and has put her three sons through school. Although her business does well and covers her expenses, she cannot get a business loan: traditional banks have no reason to trust that she will pay the money back. The data that usually signals this trustworthiness, such as how many loans you have previously paid off and your savings balance, does not work in Jenipher’s favour. The alternatives are poor: loan sharks charge interest rates of 300%, and family loans aren’t large enough. By using a mobile financial technology app, of the kind that has driven unprecedented growth in financial inclusion in the developing world, she was able to receive a loan. She used the money to start two more food stalls and a restaurant, hiring people and invigorating her local economy. This is the power of machine learning.
Machine learning is all about training models from data, and it is the reason Jenipher was able to secure a loan. Data comes in two forms, discrete and continuous. Discrete data (e.g. colour) can only take certain values, whereas continuous data (e.g. temperature) can take any value in a given range. Let’s say we have photos of wombats and butterflies. In one type of machine learning, we ask a human expert to label each photo W or B, and then feed this labelled data to our algorithm. The algorithm is trained so that it can label previously unseen data (i.e. correctly identify a new butterfly photo). This is called supervised learning (think of the supervision as coming from the human expert, who can label perfectly), and since the labels are discrete, this algorithm is also a classification algorithm (we’re classifying data into two separate categories). A supervised learning algorithm that predicts continuous values is called a regression algorithm. For example, we can take some butterfly wing silhouettes and have an expert label them with corresponding numerical air resistance values. A regression model trained on these would be able to label new silhouettes with air resistance values. Note that there are no categories in regression algorithms, just numbers that can take any value.
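To make the difference between classification and regression concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the measurements are invented, each photo is assumed to have already been reduced to two numbers (body length and weight), and each silhouette to a single wing area. The nearest-centroid classifier and the straight-line fit are simply two of the easiest supervised methods to write down, not the methods any real system necessarily uses.

```python
from statistics import mean

# --- Classification: the expert's labels are discrete ("W" or "B") ---

def train_centroids(samples, labels):
    """Average the feature vectors of each label to get one centroid per class."""
    centroids = {}
    for label in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == label]
        centroids[label] = tuple(mean(col) for col in zip(*rows))
    return centroids

def classify(centroids, sample):
    """Return the label whose centroid is closest to the sample."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], sample))

# Invented training data: (body length in cm, weight in kg) for each photo.
animal_features = [(95.0, 28.0), (105.0, 32.0), (6.0, 0.0005), (8.0, 0.0006)]
animal_labels = ["W", "W", "B", "B"]  # labels supplied by the human expert

centroids = train_centroids(animal_features, animal_labels)
new_photo_label = classify(centroids, (7.0, 0.0005))  # a previously unseen photo

# --- Regression: the expert's labels are continuous numbers ---

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar

# Invented training data: wing area of each silhouette and its air resistance.
wing_areas = [10.0, 20.0, 30.0, 40.0]
air_resistance = [1.0, 2.0, 3.0, 4.0]

slope, intercept = fit_line(wing_areas, air_resistance)
predicted = slope * 25.0 + intercept  # a number, not a category

print(new_photo_label, predicted)
```

The classifier can only ever answer W or B, while the regression model can return any number along the fitted line.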
In unsupervised learning algorithms, there are no labels. Let’s say we gave the photos of wombats and butterflies to a group of people who had never encountered either before. They would not be able to name them, but they would be able to sort them into two groups based on similar properties, and maybe even into subgroups based on other distinguishing features. This is called clustering, and the resulting groups can later be used to train a classification algorithm. Going back to our continuous data example of silhouettes, an unsupervised learning algorithm may be able to simplify each silhouette down to one feature (such as horizontal length) to make them easier to compare - this is called dimensionality reduction. Pattern recognition (e.g. do silhouettes with longer horizontal lengths also have longer vertical lengths?) is also a key part of unsupervised learning algorithms that take continuous data as inputs.
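The sorting-into-groups idea can be sketched with k-means, one common clustering method, again in plain Python. As before, the numbers are invented and the photos are assumed to have been reduced to two measurements each. The second half checks the pattern mentioned above, whether longer silhouettes also tend to be taller, using a hand-written Pearson correlation.

```python
from statistics import mean

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, iterations=10):
    """Group unlabelled points into k clusters and return a cluster index per point."""
    # Assumption: start from the first k points so the example is deterministic.
    centroids = list(points[:k])
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (unchanged if the cluster is empty).
        centroids = [
            tuple(mean(col) for col in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return [min(range(k), key=lambda i: sq_dist(p, centroids[i])) for p in points]

# Invented, unlabelled data: (body length in cm, weight in kg) for each photo.
photos = [(95.0, 28.0), (6.0, 0.0005), (105.0, 32.0), (8.0, 0.0006)]
assignments = k_means(photos, k=2)  # group numbers only; no names attached

def pearson(xs, ys):
    """Pearson correlation: close to 1 when the two lengths rise together."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Invented silhouette measurements (arbitrary units).
horizontal = [4.0, 6.0, 8.0, 10.0]
vertical = [2.1, 2.9, 4.2, 4.8]
r = pearson(horizontal, vertical)

print(assignments, r)
```

Note that the algorithm never learns the words wombat or butterfly; it only reports which photos belong together.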
With access to a few key data points, a semi-supervised learning algorithm was able to build Jenipher a new kind of credit score. She made regular calls to a few close contacts, had regular travel patterns, and had a large network of contacts. After thousands of data points were fed to the algorithm, Jenipher’s loan application was accepted. Repayment rates for the service are above 90% - in line with traditional banks. To quote Jim Yong Kim: ‘Access to financial services can serve as a bridge out of poverty.’ The question is: who, or what, decides who can cross?
Sources:
https://www.worldbank.org/en/topic/financialinclusion/overview
World Bank Global Findex Database 2017
https://www.un.org/africarenewal/magazine/january-2010/abolishing-fees-boosts-african-schooling
Demirgüç-Kunt, Asli, Leora Klapper, Dorothe Singer, Saniya Ansar, and Jake Hess. 2018. The Global Findex Database 2017: Measuring Financial Inclusion and the Fintech Revolution. Washington, DC: World Bank
https://www.theguardian.com/money/2019/apr/22/britons-without-bank-account-pay-poverty-premium
African Development Bank, Organisation for Economic Co-operation and Development, United Nations Development Programme (2017)
https://www.ted.com/talks/shivani_siroya_a_smart_loan_for_people_with_no_credit_history_yet