Local Learning Algorithms

The scientific paper “Local Learning Algorithms” describes several methods for speeding up the classification process by using only the training data closest to the item being classified. “Closest” here refers to proximity in the input space: each training example is represented as a point in n-dimensional space, where each dimension corresponds to some feature of the data.
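To make “closest” concrete, a minimal sketch of how proximity between two points in that n-dimensional input space can be measured, assuming the common choice of Euclidean distance (the paper's exact metric is not specified here; the vectors are illustrative):

```python
import math

# Two training examples represented as points in 3-dimensional input space.
# Each coordinate corresponds to one feature of the data.
a = (1.0, 2.0, 3.0)
b = (2.0, 4.0, 6.0)

# Euclidean distance: sqrt((1-2)^2 + (2-4)^2 + (3-6)^2) = sqrt(14)
distance = math.dist(a, b)
print(distance)
```

The smaller this value, the “closer” two examples are, and the more weight a local learning method gives one when classifying the other.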

The paper devotes considerable attention to the k nearest neighbor (kNN) algorithm. With kNN, an unknown sample is compared against the k training samples closest to it. So if you have some item to be classified and k = 3, you would find the 3 closest samples and classify the item as whichever class is most prevalent among those neighbors.
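The kNN procedure just described can be sketched as follows. This is a minimal illustrative implementation, not the paper's code; the toy data set, labels, and function name are assumptions:

```python
import math
from collections import Counter

def knn_classify(sample, training_data, k=3):
    """Classify `sample` by majority vote among its k nearest training points.

    training_data: list of (point, label) pairs, where each point is a
    tuple of coordinates in the n-dimensional input space.
    """
    # Sort training points by Euclidean distance to the sample,
    # then keep only the k closest.
    neighbors = sorted(
        training_data,
        key=lambda pair: math.dist(sample, pair[0]),
    )[:k]
    # Majority vote over the labels of those k neighbors.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy 2-D example: two clusters, labeled "A" and "B".
train = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
         ((5, 5), "B"), ((5, 6), "B"), ((6, 5), "B")]
print(knn_classify((0.5, 0.5), train, k=3))  # "A"
```

Note that no model is built ahead of time: all the work happens at classification time, using only the stored training data near the query point.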

The researchers developed their own local learning algorithm and tested it against kNN and other local learning algorithms. Like kNN, it initially classifies each sample using information from its k closest neighbors. It then compares the highest output (i.e. the chosen classification) with the second-highest output; if the difference falls below a given threshold, the pattern is rejected. Testing on a set of handwritten numbers, the researchers found that their algorithm's raw error was lower than that of every other classification method they tried; the only method that performed better was human classification.
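The rejection rule described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the researchers' code: the scores, labels, threshold value, and function name are all hypothetical, standing in for whatever per-class outputs their classifier produces:

```python
def classify_with_rejection(scores, threshold):
    """Return the top-scoring class, or None if the decision is too close.

    scores: dict mapping class label -> output score (e.g. per-class
    votes or classifier outputs; illustrative, not the paper's values).
    """
    # Rank classes from highest output to lowest.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best, second = ranked[0], ranked[1]
    # Reject the pattern when the top two outputs are too close to call.
    if best[1] - second[1] < threshold:
        return None
    return best[0]

# Clear winner: "3" beats "8" by 0.7, well above the threshold.
print(classify_with_rejection({"3": 0.9, "8": 0.2, "5": 0.1}, threshold=0.3))  # "3"
# Ambiguous case: margin of 0.05 is below the threshold, so reject.
print(classify_with_rejection({"3": 0.5, "8": 0.45}, threshold=0.3))           # None
```

Rejecting ambiguous patterns trades coverage for accuracy: the classifier answers fewer patterns, but the answers it does give carry a larger margin of confidence.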