Modeling Credit Using ML

My AutoML software, Black Tree AutoML, can already predict credit outcomes with no specialization at all. But it just dawned on me that, with a bit of work, you can use any clustering and classification system to model credit in a meaningful way. First, let’s define the relevant properties of a credit, which are its assets and its liabilities; for simplicity, we’ll include the equity capital of the credit in its liabilities. This allows us to express a credit, at a given moment in time t, as a vector h(t) = (a_1, \ldots, a_k; l_1, \ldots, l_m), where each a_i is the value of the assets of type i owned by the credit, and each l_j is the value of the liabilities of type j owed by the credit. Because this representation is so abstract, it allows you to consider not only corporates, but also SPVs and individuals.
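To make the representation concrete, here is a minimal Python sketch of h(t). The class name CreditState, its field names, and the example figures are illustrative assumptions, not part of Black Tree AutoML. Because equity is counted among the liabilities, total assets should equal total liabilities, which gives a simple sanity check on any recorded state.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CreditState:
    """State h(t) of a credit at one moment in time (hypothetical type)."""
    assets: Tuple[float, ...]       # a_1, ..., a_k: value held per asset type
    liabilities: Tuple[float, ...]  # l_1, ..., l_m: value owed per liability type, equity included

    def as_vector(self) -> Tuple[float, ...]:
        # Concatenate assets and liabilities into (a_1, ..., a_k; l_1, ..., l_m).
        return self.assets + self.liabilities

    def is_balanced(self, tol: float = 1e-9) -> bool:
        # With equity included in liabilities, the two sides should match.
        return abs(sum(self.assets) - sum(self.liabilities)) <= tol


# Made-up example: two asset types, two liability types (debt and equity).
h_t = CreditState(assets=(120.0, 30.0), liabilities=(90.0, 60.0))
vector = h_t.as_vector()
balanced = h_t.is_balanced()
```

The same structure works for a corporate, an SPV, or an individual; only the choice of asset and liability types changes.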

Now let’s posit a dataset of credits S = \{h_1, \ldots, h_M\} that were sampled over time. That is, each h_i is actually a time series for a given credit, and we can evaluate h_i(t) for any t within some ordinal interval, though you could also consider specific periods of time. The overall gist is that we have observed and recorded the state of each credit over time, in the form of the vector h(t) = (a_1, \ldots, a_k; l_1, \ldots, l_m). We can therefore pull all credits that are sufficiently similar to some new input credit h(t), which will produce a cluster of similar credits. Because our dataset contains time-series data for each of the credits returned in the cluster, we can form possible future paths for h(t). This allows us to say, as a general matter, what the future of h will look like, given its present state h(t). Moreover, we can easily construct a probability of default, again using the cluster (for example, as the fraction of resolved credits in the cluster that defaulted), since every credit in the cluster either paid or didn’t, though as a practical matter you could have some unknowns as well (i.e., credits that are still outstanding).
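As a rough illustration of the clustering step and the default estimate, here is a minimal Python sketch. It assumes "sufficiently similar" means Euclidean distance within a radius delta, which is only one possible choice, since any clustering system could be substituted. The names CreditHistory, find_cluster, and default_probability, and all of the figures, are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vector = Tuple[float, ...]


@dataclass
class CreditHistory:
    """Time series h_i(t) of one credit, plus its outcome (hypothetical type)."""
    states: List[Vector]       # h_i(0), h_i(1), ...
    defaulted: Optional[bool]  # True = defaulted, False = paid, None = still outstanding


def find_cluster(query: Vector, dataset: List[CreditHistory], delta: float) -> List[Tuple[int, int]]:
    # Return (credit index, time index) for every recorded state within delta of the query.
    return [
        (i, t)
        for i, credit in enumerate(dataset)
        for t, state in enumerate(credit.states)
        if math.dist(query, state) <= delta
    ]


def default_probability(cluster: List[Tuple[int, int]], dataset: List[CreditHistory]) -> Optional[float]:
    # Count each credit once, and ignore credits whose outcome is still unknown.
    credit_ids = {i for i, _ in cluster}
    resolved = [dataset[i].defaulted for i in credit_ids if dataset[i].defaulted is not None]
    if not resolved:
        return None  # no resolved credits, so no estimate
    return sum(resolved) / len(resolved)


# Made-up data: each state is (a_1; l_1, l_2) with equity counted as l_2.
dataset = [
    CreditHistory([(100.0, 80.0, 20.0), (95.0, 85.0, 10.0)], defaulted=True),
    CreditHistory([(102.0, 79.0, 23.0), (110.0, 75.0, 35.0)], defaulted=False),
    CreditHistory([(101.0, 81.0, 20.0), (99.0, 84.0, 15.0)], defaulted=None),
    CreditHistory([(500.0, 100.0, 400.0), (520.0, 90.0, 430.0)], defaulted=False),
]

cluster = find_cluster((100.0, 80.0, 20.0), dataset, delta=5.0)
pd_estimate = default_probability(cluster, dataset)
```

Here the cluster contains one state from each of the first three credits; one of those credits defaulted, one paid, and one is still outstanding, so the estimate is taken over the two resolved credits only.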

Applying this process repeatedly to the initial state of some credit h(t_0), we will construct a set of possible future paths for h(t_0), given that initial state. Specifically, first we find the cluster associated with h(t_0). Then, we find the next state for each credit in that cluster. So, e.g., if the state x(t_j) of some credit x is in the cluster associated with h(t_0), we find the next recorded state of x in the dataset, which we can represent as x(t_{j+1}). We do this for all such x, then repeat the process, treating each next state as a new input, and continue as desired. This will produce a dataset of possible future paths for the credit, which can grow exponentially as a function of time, and at each ordinal interval of time, there will be some probability of default based upon the dataset.
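The repeated expansion can be sketched in Python as follows. This is only an illustration under stated assumptions: the cluster is again taken to be all recorded states within a Euclidean radius delta, paths with no similar observed state are simply dropped, and the function names and figures are hypothetical. The per-step default probability is omitted to keep the sketch short.

```python
import math
from typing import List, Tuple

Vector = Tuple[float, ...]


def next_states(state: Vector, dataset: List[List[Vector]], delta: float) -> List[Vector]:
    # For every recorded state x(t_j) within delta of `state`, collect x(t_{j+1}).
    successors = []
    for series in dataset:
        for t in range(len(series) - 1):  # the last state has no successor
            if math.dist(state, series[t]) <= delta:
                successors.append(series[t + 1])
    return successors


def expand_paths(h0: Vector, dataset: List[List[Vector]], delta: float, steps: int) -> List[List[Vector]]:
    # Start from the single path [h(t_0)] and branch once per successor at each step.
    paths = [[h0]]
    for _ in range(steps):
        extended = []
        for path in paths:
            for nxt in next_states(path[-1], dataset, delta):
                extended.append(path + [nxt])
        paths = extended  # paths with no successors are dropped
    return paths


# Made-up time series: each state is (a_1; l_1, l_2) with equity counted as l_2.
dataset = [
    [(100.0, 80.0, 20.0), (105.0, 80.0, 25.0), (110.0, 80.0, 30.0)],
    [(101.0, 80.0, 21.0), (98.0, 82.0, 16.0), (95.0, 85.0, 10.0)],
    [(104.0, 81.0, 23.0), (108.0, 80.0, 28.0), (112.0, 79.0, 33.0)],
]

h0 = (100.0, 80.0, 20.0)
one_step = expand_paths(h0, dataset, delta=3.0, steps=1)
two_steps = expand_paths(h0, dataset, delta=3.0, steps=2)
```

With this toy data the initial state branches into two paths after one step and three after two. With a larger dataset or a wider radius, the branching factor at each step is the cluster size, which is where the exponential growth comes from.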

My AutoML software is typically really accurate, so I would wager that if you use it for the clustering step, you’re going to get great answers, and probably make a lot of money as a consequence. That’s another great reason to buy my software, which is comically better than everyone else’s.