Calculation of Error in my Deck

While working on my embedding algorithm, I realized it’s probably fairer to report the error for my core clustering algorithm (not the others) differently than I do in my deck. I disclosed exactly how I calculate error, so it’s not a lie, but it’s not right either. Even though the clusters are not mutually exclusive, the better answer is to say that the error is given as follows:

1 - (numerrors/clustersize),

whereas in the deck, I say the error is given by,

1 - (numerrors/numrows).

The latter measure does capture something, since the number of rows is the number of opportunities to make an error. But that’s not what you want in this context: what matters is the accuracy of the cluster itself, regardless of how many rows the dataset has.
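To make the difference concrete, here is a minimal Python sketch of the two measures. The function names and the example numbers (a 20-row cluster with 5 errors, drawn from a 1,000-row dataset) are made up for illustration and are not taken from the deck. Note that both formulas, as written, evaluate to one minus the fraction of errors, so a higher value means a more accurate cluster.

```python
def per_cluster_measure(num_errors: int, cluster_size: int) -> float:
    """1 - (numerrors / clustersize): normalizes by the size of the cluster."""
    if cluster_size <= 0:
        raise ValueError("cluster_size must be positive")
    return 1 - num_errors / cluster_size


def per_dataset_measure(num_errors: int, num_rows: int) -> float:
    """1 - (numerrors / numrows): normalizes by the size of the whole dataset."""
    if num_rows <= 0:
        raise ValueError("num_rows must be positive")
    return 1 - num_errors / num_rows


# Hypothetical numbers, chosen only to show how far apart the two can be.
num_errors, cluster_size, num_rows = 5, 20, 1000

print(per_cluster_measure(num_errors, cluster_size))  # normalized by the cluster
print(per_dataset_measure(num_errors, num_rows))      # normalized by the dataset
```

With these numbers, a quarter of the cluster is wrong, yet dividing by the number of rows makes the result look nearly perfect, because the denominator grows with the dataset rather than with the cluster.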

