k-Nearest Neighbors in plain English

Here is how it works in plain English:

We have a training set: data points with known (normalized) features, and their classifications:

data points: [(feature1, feature2, feature3, …), (f1, f2, f3, …), …]

corresponding labels/classes: [category1, category2, …]

For any new data point t:

1. Calculate the distance from t to each data point in the training set.

2. Sort the distances and take the k nearest (most similar) data points.

3. Take a majority vote among those k points' labels; the winning label becomes t's label/class.
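The steps above can be sketched in Python. This is a minimal illustration, not a library implementation: the function names are made up here, and Euclidean distance is one common choice among several.

```python
from collections import Counter
import math

def knn_classify(train_points, train_labels, t, k=3):
    """Classify a new point t by majority vote among its k nearest training points."""
    # Step 1: Euclidean distance from t to every training point
    distances = [math.dist(t, p) for p in train_points]
    # Step 2: indices of the k smallest distances (the k nearest neighbors)
    nearest = sorted(range(len(train_points)), key=lambda i: distances[i])[:k]
    # Step 3: majority vote over those neighbors' labels
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy usage: two well-separated clusters
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["red", "red", "red", "blue", "blue", "blue"]
print(knn_classify(points, labels, (0.5, 0.5), k=3))  # "red"
```

Sorting all distances is O(n log n) per query, which is fine for small training sets; real libraries speed this up with k-d trees or similar index structures.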

 

The idea seems simple, but it is quite powerful. One example is handwriting recognition: given enough training samples of the digits 1, 2, …, 9, we can recognize new handwritten digits with good accuracy.

 

From Wikipedia:

  • In k-NN classification, the output is a class membership. An object is classified by a majority vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of that single nearest neighbor.
  • In k-NN regression, the output is the property value for the object. This value is the average of the values of its k nearest neighbors.
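The regression variant described above is a small change to the classifier: instead of voting on labels, average the neighbors' numeric values. A minimal sketch, again assuming Euclidean distance and an unweighted average (the function name is illustrative):

```python
import math

def knn_regress(train_points, train_values, t, k=3):
    """Predict a numeric value for t as the mean of its k nearest neighbors' values."""
    # Distance from t to every training point
    distances = [math.dist(t, p) for p in train_points]
    # Indices of the k nearest training points
    nearest = sorted(range(len(train_points)), key=lambda i: distances[i])[:k]
    # Average the neighbors' values instead of taking a majority vote
    return sum(train_values[i] for i in nearest) / k

# Toy usage: predict a value near the cluster at 0, 1, 2
points = [(0,), (1,), (2,), (10,)]
values = [0.0, 1.0, 2.0, 10.0]
print(knn_regress(points, values, (1.0,), k=3))  # 1.0
```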

 

 

References:

https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm

http://www.saedsayad.com/k_nearest_neighbors.htm
