I’ve updated my mass-scale classification algorithms to be easier to read and more consistent with my prior algorithms. The runtime and the accuracy both seem to be about the same as those of the algorithms I introduced in my other recent articles. The difference is that this code is much easier to read, and the dataset doesn’t need to be sorted beforehand.
Attached is a simple command line script that calls the new algorithms on a dataset of 104,866 Euclidean 3-vectors that together comprise two statistical spheres. The classification task is to correctly cluster the points within the spheres, which in this case took 4.0479 minutes, with an accuracy of 100%. As you can see, though, there is significant local clustering: the spheres are plainly subdivided into smaller clusters, 254 in this case. Note that points in the same cluster are colored the same in the image above.
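To make it concrete how an accuracy of 100% can coexist with hundreds of clusters, here is a minimal Python sketch. It is not the attached script or my classification algorithm. The names `make_sphere`, `grid_cluster`, and `cluster_purity_accuracy` are hypothetical, the grid bucketing is only a stand-in clustering step, and scoring each cluster against its majority sphere is one assumed reading of "accuracy".

```python
import random
from collections import Counter, defaultdict


def make_sphere(center, spread, n, rng):
    """Sample n 3-vectors with Gaussian spread around center (an assumed 'statistical sphere')."""
    return [tuple(rng.gauss(c, spread) for c in center) for _ in range(n)]


def grid_cluster(points, cell):
    """Stand-in clustering (NOT the article's algorithm): bucket points by grid cell."""
    return [tuple(int(c // cell) for c in p) for p in points]


def cluster_purity_accuracy(cluster_ids, true_labels):
    """Fraction of points whose label matches the majority label of their cluster."""
    members = defaultdict(list)
    for cid, label in zip(cluster_ids, true_labels):
        members[cid].append(label)
    # In each cluster, the points carrying the majority label count as correct.
    correct = sum(Counter(labels).most_common(1)[0][1] for labels in members.values())
    return correct / len(true_labels)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 1000                # points per sphere (far fewer than the 104,866 in the dataset)
points = make_sphere((0.0, 0.0, 0.0), 1.0, n, rng) + make_sphere((20.0, 0.0, 0.0), 1.0, n, rng)
true_labels = [0] * n + [1] * n

cluster_ids = grid_cluster(points, cell=1.0)
accuracy = cluster_purity_accuracy(cluster_ids, true_labels)
num_clusters = len(set(cluster_ids))
print(f"clusters: {num_clusters}, accuracy: {accuracy:.2%}")
```

Under this measure, subdividing a sphere into many small clusters costs nothing so long as no cluster contains points from both spheres, which is consistent with the local clustering visible in the image.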