N-Dimensional Mass-Clustering

As noted in previous articles, I’ve turned my attention to applying A.I. to thermodynamics, which requires the analysis of enormous datasets of Euclidean vectors. Though arguably not necessary, I generalized a clustering algorithm I’ve been working on to N dimensions, and it is preposterously efficient:

The attached example takes a dataset of 150,000 10-dimensional vectors and clusters them in about 54.2 seconds, with no errors.

The example is a simple two-class dataset, but nonetheless the efficiency is remarkable, and eclipses anything I’ve ever heard of, including my own earlier work.

This is particularly useful in thermodynamics, since it lets you cluster points as a function of their distance from an origin, say the center of mass, or the center of some volume. As a result, it allows you to quickly identify the perimeter of a set of points: simply take the cluster furthest from the center, which will be last in the list.
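
To make the perimeter idea concrete, here is a minimal Python sketch. It is not the vectorized_clustering_EMC_N function itself, and it makes no attempt to match its speed. It assumes a much simpler rule: sort the points by distance from a chosen center, and start a new cluster wherever the gap between consecutive distances exceeds a threshold. The names radial_clusters, random_shell_point and delta are hypothetical, as is the toy dataset of two concentric shells.

```python
import math
import random

def radial_clusters(points, center, delta):
    """Group point indices into shells by distance from `center`.

    Points are sorted by radius; a new cluster starts whenever the gap
    between consecutive radii exceeds `delta` (a hypothetical threshold,
    not a parameter of the original algorithm). Clusters are returned
    innermost first, so the last one is the perimeter.
    """
    radii = [math.dist(p, center) for p in points]
    order = sorted(range(len(points)), key=radii.__getitem__)
    clusters = [[order[0]]]
    for prev, cur in zip(order, order[1:]):
        if radii[cur] - radii[prev] > delta:
            clusters.append([])
        clusters[-1].append(cur)
    return clusters

def random_shell_point(rng, dim, radius):
    """Return a point at distance `radius` from the origin, in a random direction."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [radius * x / norm for x in v]

rng = random.Random(0)
dim = 10

# Toy two-class data (an assumption for illustration only):
# an inner shell near radius 1 and an outer shell near radius 5.
inner = [random_shell_point(rng, dim, 1.0 + rng.uniform(-0.1, 0.1)) for _ in range(200)]
outer = [random_shell_point(rng, dim, 5.0 + rng.uniform(-0.1, 0.1)) for _ in range(200)]
points = inner + outer

clusters = radial_clusters(points, center=[0.0] * dim, delta=1.0)
perimeter = clusters[-1]  # indices of the points furthest from the center
print(len(clusters), len(perimeter))
```

Here the center is the origin, but any reference point works, such as the center of mass. Because the clusters come back ordered by radius, the perimeter is always the final entry in the list.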

So this algorithm is also likely to have applications in object tracking, since you can compress the object to the points at its perimeter, and then track only those points.

This is years into my work, and rather than re-explain things, if you’re interested in how this process works, I’d suggest reading my original paper on machine learning and information theory.

This is another instance of the same ideas in information theory, but in this case, making use of radical compression.

Command Line:

8_1CMNDLINE

Function Code:

vectorized_clustering_EMC_N

Full library:

ResearchGate
