A scoring model's metrics, shown as numbers and graphs, do not necessarily give a correct and complete picture of its real profitability in monetary terms. How can you avoid the common situation in which improving the metrics reduces the company's income? Dmitry Melnik, Head of Risk (CRO) at TAMGA, shares his practical experience in improving credit scoring.
Improving the current scoring system is a challenge that every data scientist faces, whether they work for a vendor supplying the scoring or in an in-house team. For developers, the task is likely to be formulated as "Increase the approval rate without increasing delinquency" or "Decrease delinquency without decreasing the approval rate", but the ultimate goal, as we all realize, is to increase the profitability of the model.
Suppose the problem is solved: the updated scoring system shows impressive progress on these metrics. In a report, it might look something like this:
That said, it is worth defining the terminology immediately.
Overdue rate (delinquency) is the number of unrepaid loans divided by the number of issued loans.
Approval rate is the number of approved loan applications divided by the number of applications received.
Let's count the money
If we look deeper, it may turn out that while one indicator, the percentage of overdue loans, is improving, the amount of delinquency in monetary terms is increasing. How is this possible? The reasons should be sought in the quality of the scoring model and the parameters it uses.
For example, in our practice we had a case in which the model was quite accurate in predicting delinquency for applications for small amounts, but its accuracy dropped significantly for applications for larger amounts. In such a situation, delinquency in monetary terms can worsen even while the share of overdue loans falls, because the model's errors are concentrated in the largest loans.
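The effect is easy to reproduce on a toy portfolio. The sketch below is only an illustration: the loan amounts and outcomes are made up, and the function names are hypothetical. It computes delinquency in units and in money for two invented portfolios, one where defaults fall on small loans and one where a single large loan defaults.

```python
# Each loan is (amount_issued, defaulted). All figures are invented for illustration.
old_model_loans = [(100, True), (100, True), (100, False), (100, False), (1000, False)]
new_model_loans = [(100, False), (100, False), (100, False), (100, False), (1000, True)]


def delinquency_in_units(loans):
    """Share of issued loans that were not repaid."""
    return sum(1 for _, defaulted in loans if defaulted) / len(loans)


def delinquency_in_money(loans):
    """Share of issued money that was not repaid."""
    issued = sum(amount for amount, _ in loans)
    lost = sum(amount for amount, defaulted in loans if defaulted)
    return lost / issued


for name, loans in [("old", old_model_loans), ("new", new_model_loans)]:
    print(f"{name}: units={delinquency_in_units(loans):.1%}, "
          f"money={delinquency_in_money(loans):.1%}")
```

With these numbers, delinquency in units halves from 40% to 20%, while delinquency in money rises from about 14% to about 71%.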
Having analyzed the quality of the model in different segments, the following picture can be obtained:
Suppose, however, that this is not our case: we have studied the model in different slices, and it predicts applications for large and small amounts equally well. We plot delinquency in both units and money, and see that delinquency has decreased in both representations. In addition, we see an improvement in the yield graph:
The X-axis shows the number of days since the loan was disbursed, and the Y-axis shows the yield in percent. The graph shows that the new scoring model yields more than the old one. Thus, the delinquency has decreased and the yield has increased.
Yield = (Total amount of loan payments - Amount of loan disbursed) / Amount of loan disbursed * 100%
In this formula, the total loan payments are the total amount of money received from the borrower in principal and interest payments, and the loan amount disbursed is the amount of money that was loaned out.
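As a minimal sketch, the yield formula can be written as a small function. The function name and the sample figures are assumptions made for illustration only.

```python
def loan_yield(total_payments: float, amount_disbursed: float) -> float:
    """Yield in percent: (payments received - amount disbursed) / amount disbursed * 100."""
    if amount_disbursed <= 0:
        raise ValueError("amount_disbursed must be positive")
    return (total_payments - amount_disbursed) / amount_disbursed * 100.0


# Hypothetical portfolio: 100,000 disbursed, 112,000 received in principal and interest.
print(f"{loan_yield(112_000, 100_000):.1f}%")

# A portfolio that has not yet returned the money it lent has a negative yield.
print(f"{loan_yield(90_000, 100_000):.1f}%")
```

To build the yield graph by days since disbursement, the same function would be applied to the payments accumulated up to each day.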
Additional metrics change the picture
When the new scoring model is applied, a situation may arise in which the approval rate is unchanged, delinquency has decreased, and yield has increased, yet the company earns less than before.
The reason is that comparing the score with the cut-off threshold is not the final decision on granting a loan. At most lenders, the final stage of decision-making also determines the amount of credit that can be granted. When the decision on an application is favorable, the client may be approved for the requested amount, for a smaller amount, or for a larger one. This means that the approval rate in units cannot be interpreted unambiguously.
In such a situation, it is better to express the approval rate in monetary terms, using the average check principle, as the ratio of the approved loan amount to the amount requested in the application. Measured this way, the approval rate paints a very different picture:
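The difference between the two views of the approval rate can be sketched as follows. The applications and amounts are invented, and the field names `requested` and `approved` are assumptions chosen for the example.

```python
# Hypothetical applications: amount requested and amount approved (0 means rejected).
applications = [
    {"requested": 1000, "approved": 1000},
    {"requested": 1000, "approved": 400},
    {"requested": 1000, "approved": 200},
    {"requested": 1000, "approved": 0},
]


def approval_rate_units(apps):
    """Approved applications divided by applications received."""
    return sum(1 for a in apps if a["approved"] > 0) / len(apps)


def approval_rate_money(apps):
    """Approved amount divided by requested amount."""
    return sum(a["approved"] for a in apps) / sum(a["requested"] for a in apps)


print(f"units: {approval_rate_units(applications):.0%}")
print(f"money: {approval_rate_money(applications):.0%}")
```

In this toy data three of four applications are approved, so the rate in units is 75%, but only 1,600 of the 4,000 requested is approved, so the rate in money is 40%.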
One might ask: does this practice of determining the loan amount not reduce profits? In fact, quite the opposite is true. Studies show that changing the loan amount affects the default rate for customers with the same risk and the same score. The method of determining the loan amount is an effective tool for managing the default rate. The graphs of default rate depending on score for the same model speak for themselves:
The graph on the left shows applications that were approved for the desired loan amount, and the graph on the right shows applications whose amount was reduced depending on the score. Note that the model and the score are the same, but the final decision, the amount of credit that can be approved, was made differently. This approach allows us to reduce delinquency quickly by setting the approved amount from a table keyed by score.
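One possible way to tie the approved amount to the score is a simple lookup table. The sketch below is purely hypothetical: the score bands, the shares, and the function name are invented for illustration and are not the policy described in the article.

```python
from bisect import bisect_right

# Hypothetical policy: lower bound of each score band and the share of the
# requested amount approved for scores in that band.
SCORE_LOWER_BOUNDS = [0, 500, 650, 800]
APPROVED_SHARE = [0.0, 0.4, 0.7, 1.0]


def approved_amount(score: int, requested: float) -> float:
    """Return the amount approved for a given score and requested amount."""
    # bisect_right finds the first band whose lower bound exceeds the score;
    # the band the score belongs to is the one just before it.
    band = max(bisect_right(SCORE_LOWER_BOUNDS, score) - 1, 0)
    return requested * APPROVED_SHARE[band]


for score in (450, 600, 700, 850):
    print(score, approved_amount(score, 1000))
```

Changing the table changes the exposure per score band without retraining the model, which is why this lever acts quickly.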
But let's get back to the true purpose of the scoring change - to increase profits.
Let's say the model evaluates applications for large and small amounts equally well, resulting in higher approval rates in both units and monetary terms, with no increase in delinquency. And the company still makes less money than before. Why does this happen?
It's simple: the approval rate is not the same as the loan disbursement rate. When a company approves a client for a loan, it does not mean that the client will necessarily take it. In practice, if customers with a good credit history have their loan amount significantly reduced, they may refuse the approved loan and go elsewhere, where they will be approved for the desired amount immediately. The result is an outflow of customers. Thus, two other indicators are significant: the conversion rate of approved applications and the disbursement rate.
Approved applications conversion ratio is the ratio of the number of loans issued to the number of approved applications, expressed as a percentage.
Disbursement rate is the ratio of the sum of loans disbursed to the total sum requested in the applications.
A high approval rate combined with a strict policy of determining the amount of credit can lead to a low conversion of applications into actual loan disbursements. That is, less money will be disbursed out of an equal number of applications received. Nevertheless, the delinquency rate and approval rate will look better than in the previous version of the system.
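The following sketch illustrates this with two invented policies applied to the same number of applications. All figures and names are hypothetical; "disbursement rate" here means the money actually disbursed divided by the money requested in the applications.

```python
def funnel(received, approved, issued, requested_sum, disbursed_sum):
    """Compute funnel ratios from raw counts and sums."""
    return {
        "approval_rate": approved / received,              # in units
        "conversion": issued / approved,                   # approved -> issued
        "disbursement_rate": disbursed_sum / requested_sum,  # in money
    }


# Old policy: fewer approvals, amounts close to what clients asked for.
old_policy = funnel(received=1000, approved=500, issued=400,
                    requested_sum=1_000_000, disbursed_sum=400_000)

# New policy: more approvals, but amounts are cut and many clients walk away.
new_policy = funnel(received=1000, approved=600, issued=300,
                    requested_sum=1_000_000, disbursed_sum=210_000)

for name, metrics in [("old", old_policy), ("new", new_policy)]:
    print(name, {k: f"{v:.0%}" for k, v in metrics.items()})
```

In this example the new policy shows a higher approval rate in units, yet conversion falls and roughly half as much money is disbursed from the same 1,000 applications.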
Application of lessons learned and the end result
Suppose that all of the above indicators in our decision-making system have been improved. Even so, we are still unable to determine exactly how much we earn. Yield shows how repaid loans compensate for non-performing loans and is expressed as a percentage. But the same percentage of a million and of a hundred gives completely different amounts.
After painstaking work, we have come to a situation where we can carefully track various metrics in our decision-making system, we have a lot of graphs and data, but we can't clearly say how much we earn, and we can't compare models in an A/B test because we can't express their quality with a single number. So we ultimately have a hard time answering the question of whether we've improved or worsened our system from the perspective of the "make more money" goal.
We decided to try to express the effectiveness of the decision-making system with a single number so that two systems can be compared. We have four indicators that fully describe our system. To reduce their number, we merged them in stages:
Stage #1:
1) Yield
2) Approval rate in units
3) Average check
4) Conversion rate of approved applications
Stage #2:
1) Yield
2) Volume of approved loan portfolio in monetary terms
3) Conversion rate of approved applications
Stage #3:
1) Yield
2) Volume of loan portfolio disbursed (number of approved loans multiplied by approved application conversion ratio and average check)
Reducing the number of indicators makes the results more illustrative, but we still need one integral indicator that makes it easy to select the right model. Revenue per application is such an indicator. It combines yield, the approval rate in units, the average check, and the conversion of approved applications into actual disbursements.
Revenue per application is the sum of all payments received, including interest, minus the amount disbursed, divided by the number of applications received.
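A minimal sketch of this indicator, using invented figures for two models, is shown below. It also checks that the direct definition agrees with the product of the four Stage #1 indicators, under the assumption that the average check is the average amount disbursed per issued loan and that yield is taken as a fraction of the disbursed amount. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ModelResult:
    applications: int   # applications received
    approved: int       # applications approved
    issued: int         # loans actually taken by clients
    disbursed: float    # money lent out
    payments: float     # principal and interest received back

    def revenue_per_application(self) -> float:
        """(Payments - amount disbursed) / applications received."""
        return (self.payments - self.disbursed) / self.applications

    def from_components(self) -> float:
        """Same value rebuilt from approval rate, conversion, average check, yield."""
        approval_rate = self.approved / self.applications
        conversion = self.issued / self.approved
        average_check = self.disbursed / self.issued
        yield_fraction = (self.payments - self.disbursed) / self.disbursed
        return approval_rate * conversion * average_check * yield_fraction


# Invented A/B test: model B has a higher approval rate and a higher yield.
model_a = ModelResult(applications=1000, approved=500, issued=400,
                      disbursed=400_000, payments=440_000)
model_b = ModelResult(applications=1000, approved=600, issued=300,
                      disbursed=210_000, payments=239_400)

for name, model in [("A", model_a), ("B", model_b)]:
    print(name, round(model.revenue_per_application(), 2))
```

In this made-up comparison, model B looks better on approval rate (60% against 50%) and on yield (14% against 10%), yet it earns less per application received, which is the kind of outcome the single indicator is meant to expose.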
This way it is possible, with minimal inputs, to answer the key question for the business without being distracted by secondary indicators, which can be misleading.