Permutation-Based Clustering

I wrote this algorithm a while back and revisited it because I plan to offer it in the Pro Version of my AutoML Software, Black Tree. The original version was supervised; this one is unsupervised, included for completeness. The nice thing about this algorithm is that the clusters are guaranteed to be non-empty. In short, it first generates a few copies of the original dataset and permutes each of them. It then runs nearest neighbor on increasingly large subsets of each permuted copy, which generates unique matches on each iteration. Finally, an external function optimizes the clusters after the fact. I believe this is all of the underlying code, but if anything is missing, you can email me at charles@blacktreeautoml.com or have a look at my last library upload. Functions there should still have similar names, so just sort and look for anything that’s missing.
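
To make the steps above concrete, here is a rough Python sketch of one way they could fit together. It is not the Black Tree code: the function name `permutation_clusters`, the parameters `num_copies` and `seed`, the choice of Euclidean distance, and the rule of growing each subset by one row at a time are all assumptions made for illustration. The external optimization step is left out entirely, since its details aren't described here.

```python
import math
import random


def permutation_clusters(points, num_copies=3, seed=0):
    """Hypothetical sketch: return one set of row indices per input row."""
    rng = random.Random(seed)
    n = len(points)
    # Each cluster starts with its own row, so no cluster can be empty.
    clusters = [{i} for i in range(n)]
    for _ in range(num_copies):
        # One permuted copy of the dataset, represented by shuffled indices.
        order = list(range(n))
        rng.shuffle(order)
        # Grow the subset one row at a time (an assumption), and match the
        # newest row to its nearest neighbor among the rows already present.
        for size in range(2, n + 1):
            newest = order[size - 1]
            earlier = order[:size - 1]
            nearest = min(
                earlier,
                key=lambda j: math.dist(points[newest], points[j]),
            )
            clusters[newest].add(nearest)
    return clusters


# Example usage on a tiny two-dimensional dataset.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
clusters = permutation_clusters(points, num_copies=3, seed=1)
for row, members in enumerate(clusters):
    print(row, sorted(members))
```

Because every copy is shuffled differently, the same row is compared against a different subset in each copy, so it can pick up a different match each time. In this sketch, matches made while a subset is still small can be far apart, and nothing here prunes them.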
