Project snapshot
This case study describes how we helped a prominent international banking group dramatically improve credit scoring accuracy by redesigning its neural network architecture around a unified multi-modal approach, supporting its entry into a high-risk international market.
Client
International banking group with headquarters in NYC, offering digital banking, payments, wealth management, and personal and business loans across multiple markets.
Solution
Unified multi-modal neural network that processes embeddings from multiple data sources instead of separate mono-modal networks, enabling higher accuracy predictions for loan default assessment.
Business function
Risk assessment and credit scoring
Industry
Finance & banking
Challenge
Improve the reliability of AI-based credit scoring when entering a high-risk international market with limited credit histories and different regulatory requirements.
Result
A 1.8-point Gini uplift using the existing data sources, and a 2.6-point uplift after new data sources were integrated.
Client background
The client is a prominent international banking group with headquarters in New York City, offering digital banking, payments, wealth management, and personal and business loans across multiple markets.
The bank was planning to enter the Indian market with retail banking products, facing several critical challenges:
The company needed an advanced AI-driven credit scoring solution to accurately assess borrower reliability and improve loan default prediction in markets with limited credit data, while maintaining profitability.
Potential threat: If the solution was not implemented, the client would face higher loan default rates, increased financial losses, and potential failure to establish successful operations in the new market.
The unified multi-modal neural network presented challenges in integrating three independent data sources and optimizing the model architecture. The Xenoss team developed solutions to improve training efficiency and prediction accuracy.
The original approach treated three data sources (card transactions, debit account transactions, credit history) as mutually independent, using separate neural networks and logistic regression for final scoring.
Solution: We developed a unified multi-modal architecture that processes embeddings from all data sources simultaneously, allowing the model to identify correlations between different data types that were previously missed in the independent approach.
Initial experiments with direct raw data input to a unified model resulted in long training times and difficulty identifying the impact of individual improvements under time constraints.
Solution: We implemented an embedding-based approach where each data source is transformed into vectorized representations before feeding to the main neural network, enabling faster training and better parallelization of development processes.
Not all clients have complete data across all three sources (card transactions, debit accounts, credit history), creating gaps in the scoring process.
Solution: We designed constant embedding mechanisms that use average embeddings for missing data sources, ensuring the model can still make accurate predictions even when some client data is unavailable.
Supporting multiple models instead of a single end-to-end solution created development and maintenance complexity while requiring fast iteration cycles.
Solution: Our modular architecture allows for independent improvement of individual data source models while maintaining the unified prediction layer, enabling parallel development and faster experimentation with new approaches.
The original architecture made it difficult to incorporate additional data sources without major system redesign, limiting future expansion capabilities.
Solution: We built a flexible framework where new data sources can be easily added as additional embedding inputs, allowing the bank to enhance scoring accuracy by incorporating new data types without architectural changes.
Balancing model complexity with inference speed requirements while maintaining high accuracy standards for real-time credit scoring decisions.
Solution: We optimized the multilayer perceptron architecture for fast inference while preserving the quality improvements from multi-modal data fusion, ensuring production-ready performance with significant accuracy gains.
Discovery phase findings
The bank used three data sources to calculate the client’s default probability:
1. Card transactions: Credit score calculation based on card transaction data
2. Debit account transactions: Credit score calculation based on debit account data
3. Credit history: Credit score calculation based on credit history
Each data source had a separate neural network model to calculate credit scores based on the input data sequence. The bank team used logistic regression to mix the scores received from these three methods and get the final credit score:
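The original blending step can be sketched as follows. This is a minimal illustration, not the bank's actual model: the scores, weights, and bias below are hypothetical, and the real logistic regression was fitted on historical outcomes.

```python
import math

def logistic_blend(scores, weights, bias):
    """Combine per-source credit scores into a final default probability
    via logistic regression: sigmoid(w . scores + b)."""
    z = sum(w * s for w, s in zip(weights, scores)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical per-source scores (card, debit account, credit history)
scores = [0.62, 0.48, 0.71]
# Hypothetical coefficients of the fitted logistic regression
weights = [1.2, 0.8, 1.5]
bias = -1.7

probability_of_default = logistic_blend(scores, weights, bias)
print(probability_of_default)
```

Each source contributes only a single scalar to this blend, which is the limitation the unified approach below addresses.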
Improvement idea
In the original approach, the data sources were treated as mutually independent: each source produced its own score through a separate model. For example, one score was calculated from card transactions, another from credit history, and so on.
What if a combination of factors from different data sources correlates with the likelihood of a user defaulting on a loan?
If this hypothesis is correct, treating all existing data sources as a unified input for a single neural network can give us higher precision.
Unified multi-modal neural network with embedded input
The alternative idea was to transform source data sets into embeddings and feed these embeddings to the main fully connected neural network. An embedding contains much more information than a single scalar, and with enough data, this approach allows for better model training.
Pros & Cons of the approach
Pros
Fast training: Such a model is trained quite quickly due to its architectural simplicity and the absence of recurrent layers
Parallelization: This approach accelerates development of the overall solution, since the individual per-source models can be improved in parallel
Cons
Multi-model support: The challenge is that we must support a set of models instead of just one end-to-end model.
Architectural highlights
Vectorized data source as input
The model takes the embeddings of each data source (card transactions, debit account transactions, credit history, etc.) as input. There can be any number of sources.
Constant embedding for missed data sources
If there is no embedding for one of the data sources for the client, the model takes the average constant embedding corresponding to the data source as input.
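This fallback can be sketched as a simple lookup with a default. The sketch below is illustrative: the embedding dimension, source names, and zero-valued averages are assumptions, not the production values; in practice the constant embeddings would be averages precomputed over the training population.

```python
EMBED_DIM = 8

# Hypothetical per-source average ("constant") embeddings,
# precomputed over the training population (zeros here for illustration).
AVERAGE_EMBEDDINGS = {
    "card_transactions": [0.0] * EMBED_DIM,
    "debit_account": [0.0] * EMBED_DIM,
    "credit_history": [0.0] * EMBED_DIM,
}

def resolve_embeddings(client_embeddings):
    """Return one embedding per data source, substituting the source's
    average embedding when the client has no data for that source."""
    return {
        source: client_embeddings.get(source, average)
        for source, average in AVERAGE_EMBEDDINGS.items()
    }

# A client with card data only: the other two sources fall back to averages.
resolved = resolve_embeddings({"card_transactions": [1.0] * EMBED_DIM})
```

Because the model always receives a full set of embeddings, no special missing-data branch is needed downstream.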
MLP classification
The embedding of each data source is fed to the input of a multilayer perceptron (MLP). Then, their outputs are concatenated to obtain a generalized vector representation of the client. This resulting embedding is fed to the input of the MLP classification layer to form the final prediction.
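The forward pass described above can be sketched as follows. All layer sizes, the number of sources, and the randomly initialized weights are illustrative assumptions; the real model's architecture and trained parameters are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, weights, biases):
    """Forward pass through a small MLP with ReLU hidden activations."""
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(x @ W + b, 0.0)
    return x @ weights[-1] + biases[-1]

def make_mlp(sizes):
    """Randomly initialized layer parameters, for illustration only."""
    weights = [rng.normal(0.0, 0.1, (a, b)) for a, b in zip(sizes, sizes[1:])]
    biases = [np.zeros(b) for b in sizes[1:]]
    return weights, biases

EMBED_DIM, HIDDEN, SOURCES = 16, 8, 3

# One MLP per data source maps its embedding to a hidden vector.
source_mlps = [make_mlp([EMBED_DIM, HIDDEN]) for _ in range(SOURCES)]
# The classification head consumes the concatenated hidden vectors.
head = make_mlp([HIDDEN * SOURCES, HIDDEN, 1])

def predict_default(source_embeddings):
    """Per-source MLPs -> concatenation -> MLP classification layer."""
    hidden = [mlp(e, *m) for e, m in zip(source_embeddings, source_mlps)]
    client_vector = np.concatenate(hidden)   # generalized client embedding
    logit = mlp(client_vector, *head)[0]
    return 1.0 / (1.0 + np.exp(-logit))      # default probability

p = predict_default([rng.normal(size=EMBED_DIM) for _ in range(SOURCES)])
```

The concatenation step is what lets the classification layer pick up cross-source correlations that the original independent-score design could not see.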
Target metric: Gini
When evaluating a credit scoring model, we used the Gini metric. It shows us how well the model ranks clients in predicting client default.
Gini = 100% × (2 × ROC AUC − 1)
Increasing the Gini value allows the bank to issue more loans at the same risk level. Since a larger loan volume directly increases profit, each Gini point is converted into profit when calculating the model's financial effect. Depending on the scale of the bank's lending operations, one additional Gini point can be worth anywhere from hundreds of thousands to hundreds of millions of dollars in extra profit.
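The metric can be computed directly from ROC AUC. Here is a self-contained sketch in pure Python; the labels and scores below are hypothetical examples, not client data.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison: the probability that a randomly
    chosen positive (defaulter) is scored higher than a randomly chosen
    negative, counting ties as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

def gini(labels, scores):
    """Gini = 100% * (2 * ROC AUC - 1)."""
    return 100.0 * (2.0 * roc_auc(labels, scores) - 1.0)

# 1 = defaulted, 0 = repaid; scores are hypothetical model outputs.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.3, 0.5, 0.4, 0.2, 0.8, 0.6, 0.1]
print(gini(labels, scores))
```

A Gini of 100 means perfect ranking of defaulters above non-defaulters; 0 means the model ranks no better than chance.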
1.8-point Gini metric uplift (existing data)
The unified multi-modal neural network achieved significant improvement in credit scoring accuracy using the same three data sources (card transactions, debit account data, credit history) without adding new information.
2.6-point Gini uplift (new data sources)
When new data sources were integrated into the modular architecture, the model demonstrated even higher performance gains, showcasing the scalability and flexibility of the embedding-based approach.
Improved loan default prediction accuracy
The embedding model preserves the quality of individual models on data sources while producing higher output quality by mixing input data from different sources, enabling better risk assessment for high-risk markets.
Want to build your own solution?
Contact us