The K-Nearest Neighbors ("KNN") method is a [[Supervised Learning]] technique that predicts the value of an unknown observation $\hat y$ based on the $k$ most similar observations in the dataset.

$ \widehat{y}_{a} = \frac{ \displaystyle \sum _{b \, \in \, \text{KNN}(a)} \text{sim}(a,b) \, y_{b}}{ \displaystyle \sum _{b \, \in \, \text{KNN}(a)} \text{sim}(a,b)} $

Let $\text{KNN}(a)$ be the set of $k$ observations that are most similar to observation $a$. We call these observations "the neighbors" and denote them with $b$. To get $\hat y_{a}$, we take the weighted average of the neighbors' labels $y_b$, where the weighting function $\text{sim}(a,b)$ is a measure of similarity between $a$ and $b$. Note that a distance measure must first be converted into a similarity (for example, by taking its inverse) before it can serve as a weight, since larger distances should mean smaller weights.

The similarity function can be defined in many different ways. Two popular examples are listed below, followed by a short code sketch:

- *Euclidean distance:* Measures the straight-line distance between two points in the feature space. Smaller distances indicate higher similarity. $ \lVert x_a - x_b \rVert $
- *Cosine similarity:* Measures the cosine of the angle between two vectors in the feature space. A value of 1 indicates identical vectors, while 0 indicates orthogonality (no similarity). $ \cos(\theta) = \frac{x_a \cdot x_b}{\lVert x_a \rVert \, \lVert x_b \rVert} $
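
As a minimal sketch of the prediction formula above, assuming a small NumPy setup with hypothetical `X_train`/`y_train` arrays and cosine similarity as the weighting function $\text{sim}(a,b)$:

```python
import numpy as np

def knn_predict(X_train, y_train, x_new, k=5):
    """Weighted KNN prediction using cosine similarity as sim(a, b)."""
    # Cosine similarity between x_new and every training observation.
    sims = X_train @ x_new / (
        np.linalg.norm(X_train, axis=1) * np.linalg.norm(x_new)
    )
    # Indices of the k most similar observations: KNN(a).
    neighbors = np.argsort(sims)[-k:]
    # Weighted average of the neighbors' labels y_b.
    return np.sum(sims[neighbors] * y_train[neighbors]) / np.sum(sims[neighbors])

# Tiny illustrative dataset (hypothetical values).
X_train = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
y_train = np.array([10.0, 12.0, 20.0, 22.0])
print(knn_predict(X_train, y_train, np.array([2.5, 2.5]), k=2))
```

To use Euclidean distance instead, one would typically weight each neighbor by something like the inverse distance (e.g., $1 / (\lVert x_a - x_b \rVert + \varepsilon)$), so that closer neighbors receive larger weights.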