Credit scoring 101
scoring
credit score
machine learning
What is a credit score, and why is it useful?
A credit score is a numerical expression, based on a statistical analysis of a person’s credit files, that represents the creditworthiness of an individual. The higher the credit score, the more confident a lender can be of the applicant's creditworthiness.
Uses of a credit score
Lenders, such as banks and credit card companies, use credit scores to:
- Evaluate the potential risk posed by lending money to consumers and mitigate losses due to bad debt.
- Determine who qualifies for a loan, at what interest rate, and what credit limits.
- Determine which applicants are likely to bring in the most revenue.
- Assess risk quickly and automatically.
3 factors that make up a credit score
- Five Cs of credit analysis
- Historical loan data
- External data
Credit score for each stage
- The Applicant Scoring Model (A-Score)
- The Transaction / Performance Scoring Model (B-Score)
- The Collection Scoring Model (C-Score)
How do we do it? With CRISP-DM, of course
What are we targeting during prediction?
Past experience is captured in historical data, and the most important variable in that data is the target variable. It is the variable we want the model to explain or predict using the rest of the variables.
The target variable is usually binary:
- It can be 0 for performing applicants and 1 for defaulted applicants.
- An applicant can be labeled as defaulted when they are late on their first payment (First Payment Default (FPD)).
- Alternatively, an applicant can be labeled as defaulted when they are more than 30/60/90 days late during their active tenor (Days Past Due (DPD)).
In binary form, applicants in some sort of default are often called bad applicants, and all others good applicants.
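As a minimal sketch of the labeling rules above, the snippet below derives a binary target from payment records. The `LoanRecord` fields, the `label_default` helper, and the 30-day default threshold are illustrative assumptions, not part of any specific dataset or standard.

```python
from dataclasses import dataclass


@dataclass
class LoanRecord:
    # Hypothetical fields, chosen only for illustration.
    first_payment_days_late: int  # days late on the first installment
    max_days_past_due: int        # worst DPD observed during the active tenor


def label_default(loan: LoanRecord, dpd_threshold: int = 30, use_fpd: bool = False) -> int:
    """Return 1 for a defaulted (bad) applicant and 0 for a performing (good) one."""
    if use_fpd:
        # First Payment Default: late on the very first payment.
        return int(loan.first_payment_days_late > 0)
    # Days Past Due: more than `dpd_threshold` days late at any point.
    return int(loan.max_days_past_due > dpd_threshold)


loans = [
    LoanRecord(first_payment_days_late=0, max_days_past_due=0),
    LoanRecord(first_payment_days_late=5, max_days_past_due=12),
    LoanRecord(first_payment_days_late=0, max_days_past_due=95),
]

targets_dpd30 = [label_default(loan) for loan in loans]
targets_dpd90 = [label_default(loan, dpd_threshold=90) for loan in loans]
targets_fpd = [label_default(loan, use_fpd=True) for loan in loans]
```

Note that the same applicant can be good under one definition and bad under another (the second loan is an FPD default but not a 30 DPD default), so the definition should be fixed before modelling starts.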
3 types of model
- Simplest approach: rule-based
- Simple approach: scorecard
- Machine learning approach
Machine learning as a black box
Machine learning has great potential for improving products, processes and research. But computers usually do not explain their predictions, which is a barrier to the adoption of machine learning.
We can use local and/or global methods to interpret the behavior of a machine learning model.
What do we expect as the output?
By using those models, we expect to get:
- Overall score value
- Probability to pay / to default
- Scorecard points
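To illustrate how a probability of default can be turned into an overall score, here is a minimal sketch using the common "points to double the odds" (PDO) scaling. The choice of this scaling, and the anchor values (600 points at good:bad odds of 50:1, 20 points to double the odds), are assumptions made for the example; the function name is hypothetical.

```python
import math


def probability_to_score(p_default: float, base_score: float = 600.0,
                         base_odds: float = 50.0, pdo: float = 20.0) -> float:
    """Map a probability of default to a score via log-odds scaling.

    All default parameter values are illustrative assumptions.
    """
    factor = pdo / math.log(2)                          # points per unit of log-odds
    offset = base_score - factor * math.log(base_odds)  # anchors base_odds at base_score
    odds_good = (1.0 - p_default) / p_default           # good:bad odds
    return offset + factor * math.log(odds_good)


p_default = 1 / 51               # good:bad odds of 50:1
p_pay = 1 - p_default            # probability to pay
score = probability_to_score(p_default)
score_double_odds = probability_to_score(1 / 101)  # good:bad odds of 100:1
```

With these assumed anchors, odds of 50:1 map to about 600 points and odds of 100:1 to about 620, so a higher score means a lower probability of default.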
Especially from an ML model, we also need interpretability outputs such as:
- Feature importance
- Feature contribution
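The difference between the two outputs can be sketched with a linear (logistic-regression style) model: a feature's contribution is local, computed for one applicant, while its importance is global, summarized over all applicants. The coefficients, feature names, and applicants below are made up for illustration, and this is only one simple way to compute such values; other interpretation methods exist.

```python
from statistics import mean

# Hypothetical coefficients on the log-odds of default.
coefficients = {"utilization": 2.0, "months_on_book": -0.05}

# Hypothetical applicants.
applicants = [
    {"utilization": 0.2, "months_on_book": 48},
    {"utilization": 0.9, "months_on_book": 6},
    {"utilization": 0.4, "months_on_book": 24},
]

# Average value of each feature, used as the reference point.
feature_means = {f: mean(a[f] for a in applicants) for f in coefficients}


def local_contributions(applicant):
    """Per-feature shift in log-odds of default relative to the average applicant."""
    return {f: coefficients[f] * (applicant[f] - feature_means[f])
            for f in coefficients}


def global_importance():
    """Mean absolute local contribution of each feature over all applicants."""
    return {f: mean(abs(local_contributions(a)[f]) for a in applicants)
            for f in coefficients}


contrib_second = local_contributions(applicants[1])
importance = global_importance()
```

Here `contrib_second` explains why one particular applicant scores worse than average (high utilization and a short history both push the log-odds of default up), while `importance` ranks features by how much they move predictions overall.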
Developing credit score