My main categorization algorithm that underlies Prometheus generates mutually exclusive categories. But we can use a similar method to generate categories that aren’t mutually exclusive. Specifically, we can generate a value δ, and then ask, as a general matter, whether two elements of a dataset are within δ of each other. Represented visually, we can assign each data point in the dataset a vertex in a discrete graph, and if two data points are within δ of each other, then we connect them with an edge.
We can generate δ using my main categorization algorithm, which will produce an in-context value of δ, or we can instead use another technique I introduced previously that measures the local consistency of data. Using the local consistency technique, if we have N elements in our dataset, we would produce an N × N matrix, where entry (i, j) is 1 only if data points i and j are within δ of each other.
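As a concrete illustration of the matrix just described, here is a minimal sketch that builds the N × N binary matrix for a given δ. Note that the metric is an assumption on my part: the post doesn’t fix one, so Euclidean distance is used below, and the function name `delta_matrix` is mine, not the author’s.

```python
import numpy as np

def delta_matrix(X, delta):
    """Binary N x N matrix whose (i, j) entry is 1 exactly when rows
    i and j of X lie within delta of each other. Euclidean distance
    is an assumption here; the post does not fix a metric."""
    diffs = X[:, None, :] - X[None, :, :]      # pairwise differences, shape (N, N, d)
    dists = np.sqrt((diffs ** 2).sum(axis=2))  # pairwise Euclidean distances
    return (dists <= delta).astype(int)
```

Since every point is within δ of itself, the diagonal is all ones, and the matrix is symmetric because distance is.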
We would then iterate through different values of δ, and select the value that generates the greatest change in the entropy of the matrix. For an explanation of why this would work, you can have a look at my main research paper on AI.
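One plausible reading of this selection step is sketched below; it is not the author’s script. In particular, “entropy of the matrix” is interpreted here as the entropy of the Bernoulli distribution over the matrix’s 0/1 entries, and when the largest entropy change falls between two consecutive candidates, the later candidate is returned; the original may define both differently.

```python
import numpy as np

def delta_matrix(X, delta):
    # Binary matrix: entry (i, j) is 1 iff rows i and j of X are
    # within delta of each other (Euclidean distance assumed).
    diffs = X[:, None, :] - X[None, :, :]
    return (np.sqrt((diffs ** 2).sum(axis=2)) <= delta).astype(int)

def matrix_entropy(M):
    # Entropy of the matrix, read here as the entropy of the 0/1
    # distribution over its entries -- one plausible interpretation.
    p = M.mean()
    if p in (0.0, 1.0):
        return 0.0
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def select_delta(X, deltas):
    # Score each candidate delta by the entropy of its matrix, then
    # return the candidate at which the entropy changes the most
    # relative to the previous candidate.
    ents = np.array([matrix_entropy(delta_matrix(X, d)) for d in deltas])
    return deltas[int(np.argmax(np.abs(np.diff(ents)))) + 1]
```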
This will produce a category associated with each element of the dataset, where another element is a member of that category only if it is within δ of the element that defines the category.
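In matrix terms, the category defined by element i is simply the set of column indices with a 1 in row i. A small sketch of this reading (the helper name `categories` is mine):

```python
import numpy as np

def categories(M):
    """One category per element: element i's category is the set of
    indices j whose data points lie within delta of point i, i.e.
    the positions of the 1-entries in row i of the binary matrix."""
    return [set(np.flatnonzero(row)) for row in M]
```

Because the matrix is symmetric, membership is mutual, but the categories can still overlap, which is exactly why they are not mutually exclusive.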
We can use the resultant matrix to define a graph that is associated with the dataset. This graph will show, visually, which data points are sufficiently similar to be connected by an edge, which in turn allows for the quantization of distance by path length between all elements of the dataset.
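The path-length quantization of distance can be sketched with an ordinary breadth-first search over the graph the matrix defines; this is a generic BFS, not the author’s script, and `path_lengths` is a name I’ve introduced for illustration.

```python
from collections import deque
import numpy as np

def path_lengths(M, source):
    """Hop counts from `source` to every vertex of the graph whose
    adjacency matrix is the binary matrix M, via breadth-first
    search; -1 marks vertices with no path to `source`."""
    n = M.shape[0]
    dist = [-1] * n
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in range(n):
            # Follow edges (ignoring the self-loop on the diagonal)
            # to vertices not yet reached.
            if M[u, v] and v != u and dist[v] == -1:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist
```

Two points in the same tight cluster end up one hop apart, while points only reachable through intermediaries get proportionally larger path lengths, which is the sense in which path length quantizes distance.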
The script to generate this categorization / graph is available on my ResearchGate blog.
There are other variations on this theme that we could use, like calculating a measure of the entropy of the graph itself, rather than the entropy of the matrix that defines it, but this approach works, and so, that’s it for now.
I plan to use this technique in a real-time deep learning algorithm called Ayin ( ע ) that I’m currently working on, which should (hopefully) be ready in the next few days. I will also further optimize my existing algorithms to make maximum use of vectorization, which I hope will push performance over the edge, allowing for truly real-time deep learning on consumer devices.