K-Nearest Neighbors: dangerously simple

Yeah, Mathbabe’s got it right: People who use kNN often don’t think about these things.

For those who aren’t familiar with this technique, here’s a description from Zhi-Hua Zhou in Ensemble Methods: Foundations and Algorithms (section 1.2.5):

“The k-nearest neighbor (kNN) algorithm relies on the principle that objects similar in the input space are also similar in the output space. It is a lazy learning approach since it does not have an explicit training process, but simply stores the training set instead. For a test instance, a k-nearest neighbor learner identifies the k instances from the training set that are closest to the test instance. Then, for classification, the test instance will be classified to the majority class among the k instances; while for regression, the test instance will be assigned the average value of the k instances.”
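To make the quoted description concrete, here is a minimal sketch of that procedure (my own illustrative code, not Zhou's, assuming Euclidean distance and plain Python tuples for instances):

```python
# Minimal kNN sketch: predict by majority vote (classification) or
# average (regression) over the k closest stored training points.
from collections import Counter
import math

def knn_predict(train_X, train_y, x, k=3, regression=False):
    """Predict the label (or value) for x from its k nearest training points."""
    # Lazy learning: there is no fitting step; the stored training set
    # itself is the "model". Compute the distance to every instance.
    dists = sorted((math.dist(xi, x), yi) for xi, yi in zip(train_X, train_y))
    neighbors = [y for _, y in dists[:k]]
    if regression:
        return sum(neighbors) / k                        # average of k values
    return Counter(neighbors).most_common(1)[0][0]       # majority class
```

This brute-force version scans the whole training set per query, which is exactly the cost the post is warning about; real implementations use spatial indexes (k-d trees, ball trees) to speed up the neighbor search.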

About ecoquant

See https://wordpress.com/view/667-per-cm.net/ Retired data scientist and statistician. Now working on projects in quantitative ecology and, specifically, the phenology of Bryophyta and technical methods for their study, notably macrophotography. Some photos of mine: https://www.flickr.com/photos/198372469@N03/
This entry was posted in big data, data science, evidence, machine learning.