When the World Bank recently surveyed small businesses in the Middle East and North Africa (MENA) on the biggest challenges they face, a consistent theme was the difficulty in accessing credit. Indeed, the International Finance Corporation has estimated that the gap between supply and demand for credit in the region is an astounding $240 billion.

The causes of this gap are multifaceted, but one reason is the difficulty for creditors — of which the region faces no shortage — in assessing the credit risk of potential borrowers. Only one in seven adults in the MENA region has a credit history recorded by a credit bureau, compared to two-thirds of adults in OECD countries, according to the Doing Business index.

When creditors cannot distinguish between strong and weak borrowers, they resort to high prices and strict collateral arrangements that lock many borrowers out of the market. Thus the problem persists.

Fortunately, many innovative companies are working to change this. Around the world, novel techniques are emerging to assess borrowers’ creditworthiness using online third-party data, behavioural analytics, or direct access to digital financial records, often in combination with sophisticated machine-learning techniques.

While these innovative approaches are promising as a whole, if they are to help close the credit gap in the MENA region, there are a number of pitfalls they will have to avoid.

Among the most eye-catching innovations for assessing credit risk are those which rely on social media or other online data about potential borrowers. Given the near-ubiquitous reach of social media platforms — Facebook, for instance, has over 2 billion users globally — they may seem to offer a highly portable, almost universal solution to the credit assessment problem.

Indeed, many companies have attempted to make use of social media data, including Big Data Scoring, a pioneer in the field, which built a credit scoring model relying exclusively on Facebook data. However, it has gradually become clear that social media data may be less useful than initially hoped, and Facebook’s terms of service now explicitly ban the use of its data to decide whether “to approve or reject an application or how much interest to charge on a loan”.

Big Data Scoring itself now says that social media data is “not suitable for credit scoring”.

That said, other approaches relying on online third-party data for credit assessment still seem viable. Companies like Tala make credit decisions based on data collected from applicants’ smartphones — a method which has fewer technical and legal restrictions, especially on Android devices.

But collecting detailed data on applicants’ contacts, messages, phone calls and locations, while potentially quite useful for consumer credit decisions, can also be seen as unnecessarily invasive. Recent privacy scandals plaguing Facebook, whose data on millions of users unintentionally fell into the hands of the political research firm Cambridge Analytica and was then used improperly, may make consumers increasingly wary of such intrusiveness.

In addition, much of the credit gap facing the MENA region revolves not just around individual consumers, but around the financing needs of small and medium enterprises (SMEs). The personal data of whoever happens to submit a business loan application, whether collected from social media or elsewhere, is unlikely to be particularly useful for assessing the creditworthiness of a firm beyond a certain size.

Fortunately, other companies are developing novel approaches to assess the credit risk of businesses with little or no credit history.

One of the most promising approaches focuses on connecting directly to the financial data of an applicant’s business. In developed markets, companies like Kabbage allow businesses to link their bank accounts and other financial data with their loan application, enabling a direct assessment of the financial health and cash flows of the applicant business.

Of course, this approach is difficult to replicate in many developing markets, since banks typically do not provide APIs and businesses often do not keep digital financial records. However, some companies, like liwwa in Jordan, have developed technology to bridge this gap, allowing offline records such as bank statements to be easily digitised and included in more advanced credit models.

This allows data-driven approaches to be applied in emerging markets that don’t yet have a well-developed digital infrastructure.

Regardless of what data the assessment is based on, whenever credit scoring is performed by machine learning or AI rather than carried out by a human analyst, several novel risks emerge. For instance, many popular machine learning methods, such as artificial neural networks, are “black-box” models, meaning that the decisions they make cannot be easily interpreted or understood.

This is especially problematic for credit decisions, where it is crucial to be able to explain to applicants, regulators and other stakeholders why a particular application was rejected or approved. It is also possible that such models inadvertently discriminate against minority groups, who may be underrepresented in the training data the model uses.

However, there are ways to address these concerns. Data scientists working on credit assessment should eschew the use of deep learning models like neural networks and instead rely on easily interpretable models, whose decisions can be understood and explained.

Despite the hype surrounding deep learning, there is little to suggest that these approaches outperform simpler models when it comes to assessing credit risk. In addition, models should be trained using solid financial data as inputs, instead of relying on tangentially related factors that may unintentionally discriminate against underrepresented applicants.
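
To make the idea of an interpretable model concrete, here is a minimal sketch of a logistic-regression-style scorecard in Python. Everything in it is an assumption made for illustration: the feature names, the coefficients and the applicant are invented, and a real model would learn its weights from historical repayment data. What it shows is that each input adds a visible, separate amount to the score, so a rejection can be traced to specific financial factors.

```python
import math

# Hypothetical coefficients for illustration only; a real model would
# learn these from historical repayment data.
INTERCEPT = -1.0
WEIGHTS = {
    "months_in_business": 0.03,  # longer operating history raises the score
    "cash_flow_margin": 4.0,     # net cash flow divided by revenue
    "debt_to_revenue": -2.5,     # heavier existing debt lowers the score
}


def explain(applicant):
    """Return each feature's additive contribution to the log-odds."""
    return {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}


def repayment_probability(applicant):
    """Apply the logistic function to the intercept plus all contributions."""
    log_odds = INTERCEPT + sum(explain(applicant).values())
    return 1.0 / (1.0 + math.exp(-log_odds))


# An invented applicant: three years in business, 15% cash flow margin,
# existing debt equal to 40% of annual revenue.
applicant = {
    "months_in_business": 36,
    "cash_flow_margin": 0.15,
    "debt_to_revenue": 0.40,
}
contributions = explain(applicant)
probability = repayment_probability(applicant)

# Print features from the most negative to the most positive contribution.
for name, value in sorted(contributions.items(), key=lambda item: item[1]):
    print(f"{name:>20}: {value:+.2f}")
print(f"estimated repayment probability: {probability:.2f}")
```

Because the score is a simple sum, the factor that hurt an application most is just the smallest contribution, which is the kind of explanation an applicant or regulator can follow.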

The MENA region faces a difficult challenge in addressing the gap between credit supply and credit demand. Innovations in credit assessment can help.

But to reap their full benefit, we must be thoughtful about how they are designed and transparent about how they are used.

The writer is Chief Data Scientist at liwwa.