I was on a Meetup video conference when someone mentioned a dataset that doesn’t cluster well, the UCI Sonar Dataset, so naturally I did a bit of work on it. It turns out the dataset contains very few clusters to begin with. Specifically, only about 34% of the rows are contained in spherical clusters, and the average cluster size over all rows is about 0.8 elements per cluster. That suggests you’re not going to get good clustering out of this dataset whatever algorithm you use, because, as a matter of geometry, there are no real clusters to find. Nearest Neighbor nonetheless performs reasonably well, with an accuracy of 82.692%.
Code attached.
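
For readers who want a feel for the two measurements without opening the attachment, here is a minimal sketch in plain Python. It is not the attached code, and several things in it are my assumptions for illustration: a "spherical cluster" is taken to mean the other rows within a fixed Euclidean radius of a given row, the radius is simply hard-coded, accuracy is measured leave-one-out rather than on a held-out test set, and the five rows are toy data, not Sonar rows. The function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def spherical_cluster_sizes(rows, radius):
    """For each row, count the OTHER rows within `radius` of it.

    Assumption: this is one simple reading of a "spherical cluster";
    the row itself is not counted, so an isolated row has size 0.
    """
    return [
        sum(1 for j in range(len(rows))
            if j != i and euclidean(rows[i], rows[j]) <= radius)
        for i in range(len(rows))
    ]


def nearest_neighbor_label(rows, labels, i):
    """Label of the row closest to rows[i], excluding row i itself."""
    best_j = min(
        (j for j in range(len(rows)) if j != i),
        key=lambda j: euclidean(rows[i], rows[j]),
    )
    return labels[best_j]


def leave_one_out_accuracy(rows, labels):
    """Fraction of rows whose nearest other row carries the same label."""
    hits = sum(
        nearest_neighbor_label(rows, labels, i) == labels[i]
        for i in range(len(rows))
    )
    return hits / len(rows)


# Toy data (hypothetical): two tight pairs and one isolated row.
rows = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
labels = ["R", "R", "M", "M", "R"]

sizes = spherical_cluster_sizes(rows, radius=0.5)
clustered_fraction = sum(1 for s in sizes if s > 0) / len(rows)
mean_size = sum(sizes) / len(rows)
accuracy = leave_one_out_accuracy(rows, labels)

print("cluster sizes:", sizes)
print("fraction of rows in a cluster:", clustered_fraction)
print("mean cluster size:", mean_size)
print("leave-one-out 1-NN accuracy:", accuracy)
```

The point of the toy example is only that the two measurements are independent: a row can have almost nothing inside its sphere and still have a nearest neighbor of the right class, which is why a low cluster count and a decent Nearest Neighbor accuracy are not in conflict.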