Vectorized Image Preprocessing

In a previous article, I introduced an algorithm that quickly partitions an image into rectangular regions and calculates the average color of each region, which is then used for image classification. When I applied it to the MNIST dataset, I got good results, but accuracy hit an apparent ceiling of about 87% that I couldn’t breach, regardless of how many rows of the dataset I used. My original MNIST preprocessing algorithm produced much better results with the same prediction algorithm, which raises a question:

Why is this one not as good?

The answer, I realized, is that I didn’t remove the background of the image. The background is not exactly the same in every image, yet it makes up a substantial portion of each one, so in the aggregate it generates a significant amount of noise that interferes with the prediction algorithm.

As a solution, I simply zeroed out every dimension whose standard deviation is less than the average standard deviation across all dimensions. The idea is that, in the context of image processing, a dimension that doesn’t vary much across the rows of a dataset is probably background. This won’t always be correct, but it holds for the MNIST dataset, because the background is in roughly the same place in every image and roughly equal in luminosity, so the background dimensions should have significantly lower standard deviations than the average across all dimensions. It turns out this works quite well, producing accuracy on par with my previous preprocessing algorithm (93.35%, given 2,500 rows). The difference is that this is a general approach that should work for any simple image dataset with a roughly uniform background, whereas my previous preprocessing algorithm was specific to the MNIST dataset.
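To make the thresholding step concrete, here is a minimal sketch in Python with NumPy. It is not the code behind the results above. The function name `suppress_low_variance_dimensions`, the toy dataset, and the use of the population standard deviation are all assumptions made for illustration. Each row is one flattened image and each column is one dimension.

```python
import numpy as np

def suppress_low_variance_dimensions(data):
    """Zero out every column whose standard deviation is below the
    average standard deviation across all columns.

    Hypothetical helper for illustration: rows are observations
    (flattened images), columns are dimensions.
    """
    data = np.asarray(data, dtype=float)
    col_std = data.std(axis=0)      # per-dimension std (population std, an assumption)
    threshold = col_std.mean()      # average std across all dimensions
    keep = col_std >= threshold     # False marks a presumed background dimension
    # Broadcast the column mask over every row; masked columns become 0.
    return np.where(keep, data, 0.0), keep

# Toy dataset: columns 0 and 2 barely change across rows (background-like),
# while columns 1 and 3 vary a lot (foreground-like).
toy = np.array([
    [10,   0, 11, 200],
    [10, 255, 10,   0],
    [11,   0, 10, 100],
    [10, 255, 11,  50],
])

cleaned, keep = suppress_low_variance_dimensions(toy)
print(keep)
print(cleaned)
```

In this toy example the two nearly constant columns fall below the average standard deviation and are set to zero, while the two high-variance columns pass through unchanged. Because the mask is computed once over the whole dataset, the same `keep` vector would presumably also be applied to any new rows before prediction.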

The region identified as background across the MNIST dataset, shown in black (left), and a sample image from the dataset (right).
