Project Everest

[Proposed Experiment] SoCon Fiji - Alternate Data; Inputs for Credit Risk Model

Lean Phase: MVP Prototyping / Solution

Assumption: People who regularly pay their other important bills (water bills, phone bills) on time are much more likely to make loan repayments on time. Gathering this data from individuals will let SoCon create inputs for a credit risk model and determine for ourselves whether these individuals are going to repay on time (i.e. not default).

Time Period:
6 weeks

Success Metric:
Whether each identified form of data, and the assumption behind it, correlates directly with an individual’s ability to pay back a microfinance loan
Correlation on default rate → default rate on measurement

To see whether these inputs are valid and correlated with the ability to pay debt
Positive Outcome: We find an input that correlates highly with the ability to repay the loan, at which point we include it in the final credit risk model.
Negative Outcome: Opposite of above

Green Light: To include and finalise the entirety of the credit risk model

Success point: 80% ability to pay based on relevant data/variable

Orange Light: Find other inputs that are more relevant

Red Light: 60% ability to pay based on relevant data/variable
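
The thresholds above could be applied per variable along the lines of the sketch below. It rests on one possible reading that the proposal does not confirm: the percentage is the share of test borrowers showing a given indicator (for example, water bills paid on time) who then repay their loan on time. The function name, parameter names, and the treatment of the band between 60% and 80% as orange are assumptions made for illustration.

```python
def traffic_light(repaid_on_time: int, total_loans: int,
                  green_at: float = 0.80, red_at: float = 0.60) -> str:
    """Classify one input variable by the on-time repayment rate of borrowers who show it.

    Assumed reading (not stated in the proposal): a rate of 80% or more is green,
    60% or less is red, and anything in between is orange.
    """
    if total_loans <= 0:
        raise ValueError("total_loans must be positive")
    rate = repaid_on_time / total_loans
    if rate >= green_at:
        return "green"
    if rate <= red_at:
        return "red"
    return "orange"


# Hypothetical counts, for illustration only.
print(traffic_light(17, 20))
print(traffic_light(14, 20))
print(traffic_light(10, 20))
```

If the 80% is instead meant as the share of variables that correlate with repayment, the same function would need different inputs, so the reading should be fixed before the experiment starts.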

Experiment Build:
Research and compile input variables (water bills, phone bills, etc.) → find ideally 20 different potential variables
Go out and get the information from public and private sources
Come back and input it into the Excel model
Provide loans to 20 test subjects 
Track repayments and evaluate the variables, eliminating those that show no correlation and promoting those that do

Organise and group the data based on the info we received (weighting inputs accordingly)
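
The track-and-evaluate step could be sketched as below, assuming each variable is recorded as a yes/no flag per borrower (1 if the bill was paid on time, 0 if not) and the outcome is a yes/no flag for on-time loan repayment. The sketch uses the phi coefficient, which is the Pearson correlation for two binary variables. The proposal does not name a correlation measure, so this choice, the field names, and the sample records are all illustrative assumptions, not real data.

```python
from math import sqrt


def phi_coefficient(flags, repaid):
    """Phi coefficient between two equal-length sequences of 0/1 values.

    Returns None when either sequence is constant, because the
    correlation is undefined in that case.
    """
    if len(flags) != len(repaid):
        raise ValueError("sequences must have the same length")
    pairs = list(zip(flags, repaid))
    n11 = sum(1 for f, r in pairs if f and r)          # flag yes, repaid yes
    n10 = sum(1 for f, r in pairs if f and not r)      # flag yes, repaid no
    n01 = sum(1 for f, r in pairs if not f and r)      # flag no, repaid yes
    n00 = sum(1 for f, r in pairs if not f and not r)  # flag no, repaid no
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return None
    return (n11 * n00 - n10 * n01) / denom


def rank_variables(records, variables, outcome="repaid_on_time"):
    """Order variables from strongest to weakest absolute correlation with the outcome."""
    scores = {
        v: phi_coefficient([r[v] for r in records], [r[outcome] for r in records])
        for v in variables
    }
    # Undefined correlations (None) are sorted to the end.
    return sorted(scores.items(),
                  key=lambda kv: -1.0 if kv[1] is None else abs(kv[1]),
                  reverse=True)


# Made-up records for six hypothetical borrowers.
records = [
    {"water_on_time": 1, "phone_on_time": 1, "repaid_on_time": 1},
    {"water_on_time": 1, "phone_on_time": 0, "repaid_on_time": 1},
    {"water_on_time": 1, "phone_on_time": 1, "repaid_on_time": 1},
    {"water_on_time": 0, "phone_on_time": 0, "repaid_on_time": 0},
    {"water_on_time": 0, "phone_on_time": 1, "repaid_on_time": 0},
    {"water_on_time": 0, "phone_on_time": 0, "repaid_on_time": 1},
]

for name, score in rank_variables(records, ["water_on_time", "phone_on_time"]):
    print(name, score)
```

Variables near the top of the ranking would be candidates to promote and those near zero candidates to eliminate, although with only 20 loans any single coefficient will be noisy.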

edited on 8th October 2018, 09:10 by Seif Zakri Stacey

Seif Zakri Stacey 10 months ago

It is still unclear what your MVP is and the value it is delivering (the emotional delta created). It is also unclear how your success metrics relate to the assumption you’re testing, and how the share of people exhibiting the mentioned variables relates to the default rate. For example, when it says 80% ability to pay, does that mean 80% exhibit these variables and therefore have the ability to pay? If so, I don’t see how it validates your assumption.


Seif Zakri Stacey 10 months ago

Sustains: Assumptions are good

Improves: Is the time period long enough? Do people really fully pay back their loans within 6 weeks or is it longer?

What types of variables would you be testing, and is there an individual success point for each variable? Or is the 80% around the idea that 80% of the variables chosen positively correlate with ability to pay back loans?

Could a part of your experiment build be seeing how this is done commercially at the moment and seeing if there is any software/technology already available?


Kurt Michl 10 months ago


Improves:
Need more info on the build.
Wouldn't the variable be "on-time payment" on the different bills?
Which Excel model?

How will you select the 20 test subjects?
Is there not capacity to have a precursory model that eliminates candidates who have paid late or never paid? But then this experiment would be one-sided, in that it would not test whether people with bad "credit" would still repay the loan, so maybe it is better to select randomly? The selection method and the purpose of the design need to be specified.

Also, most importantly, a sample of 20 is likely too small to give statistically significant results if you want to regress on up to 20 variables. Maybe consider choosing the 2-5 most important variables so that the sample size can stay small.
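
A rough way to see the sample-size problem, assuming an ordinary least-squares linear regression with an intercept (the comment does not say which kind of regression is meant): the residual degrees of freedom are the number of subjects minus the number of variables minus one. When that figure is zero or negative, the model can fit the sample exactly and nothing is left over to estimate error. The function name below is made up for illustration.

```python
def residual_degrees_of_freedom(n_subjects: int, n_variables: int) -> int:
    """Observations left after fitting an intercept plus one coefficient per variable."""
    return n_subjects - n_variables - 1


# 20 test subjects, as in the proposal, with different numbers of variables.
for n_variables in (20, 5, 2):
    df = residual_degrees_of_freedom(20, n_variables)
    verdict = "cannot estimate error" if df <= 0 else "some room to estimate error"
    print(n_variables, df, verdict)
```

With 20 variables the figure is negative, while 2-5 variables leave 14 to 17 degrees of freedom, which is still thin but workable for a first pass.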


Kurt Michl 10 months ago

Maybe easier to break it down into a series of smaller experiments? Six weeks is a long time when there is so much to do upfront.


Rose Martin 10 months ago

Loving the concept of this! But it needs to be fleshed out more and written more clearly.

Some questions/points of recommendation I have:
- You're trying to find 20 different variables and are providing 20 loans, so are you attempting to test all 20 variables across these 20 loans? Doubling (or tripling, etc.) up on variables could skew the results, and you may not be able to determine which result was due to which variable. My advice would be to definitely do the research and find a large number of variables, say 20, which 'could' work in theory, but it may be better to test one at a time. So, in your research, maybe look into some microfinance models which have used this type of data before, and see which types of alternative data were most effective there.
- The green light says to 'finalise entirety of credit risk model' - but surely this is unreasonable considering you're only looking at alternative data sources here. Ideally an 'entire' credit risk model will include much more than this.
- Time period. Depending on the types of loans you give, you may need more time for this. BUT we also have to work within the scope of the project. For this experiment you are only testing things to do with credit risk, so I think giving cash loans with a shorter repayment time may be a better use of your time in country (e.g. in July we gave cash loans for rocket stoves with only 3-4 week repayment periods). After all, you're not testing anything to do with how they repay, just that they CAN repay. So if you can swing it, the shorter the repayment period the better.

Hope this helps guys, let me know if you disagree/want to discuss any of the above points!
