Cosine Similarity - SoojungHong/StatisticalMind GitHub Wiki
Cosine similarity is a measure of similarity between two non-zero vectors of an inner product space: it is the cosine of the angle between them. The cosine of 0° is 1, and it is less than 1 for any angle in the interval (0, π] radians. It is thus a judgment of orientation and not magnitude: two vectors with the same orientation have a cosine similarity of 1, two vectors oriented at 90° relative to each other have a similarity of 0, and two vectors diametrically opposed have a similarity of -1, independent of their magnitude. Cosine similarity is particularly used in positive space, where the outcome is neatly bounded in [0,1]. The name derives from the term "direction cosine": in positive space, unit vectors are maximally "similar" if they're parallel and maximally "dissimilar" if they're orthogonal (perpendicular). This is analogous to the cosine, which is unity (maximum value) when the segments subtend a zero angle and zero (uncorrelated) when the segments are perpendicular.
The cosine of the angle between two non-zero vectors can be derived from the Euclidean dot product formula:
a ⋅ b = ‖ a ‖ ‖ b ‖ cos θ
cos θ = (a ⋅ b) / (‖ a ‖ ‖ b ‖)
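The formula can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation; the function name `cosine_similarity` and the sample vectors are chosen here for the example. It raises an error for zero vectors, since the definition only covers non-zero vectors.

```python
import math


def cosine_similarity(a, b):
    """Return cos θ = (a · b) / (‖a‖ ‖b‖) for two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    # Euclidean dot product a · b
    dot = sum(x * y for x, y in zip(a, b))
    # Euclidean norms ‖a‖ and ‖b‖
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical sample vectors illustrating the three reference cases.
same_direction = cosine_similarity([1, 2], [2, 4])   # same orientation, different magnitude
orthogonal = cosine_similarity([1, 0], [0, 1])       # 90° apart
opposite = cosine_similarity([1, 2], [-1, -2])       # diametrically opposed
```

Up to floating-point rounding, the three results are 1, 0, and -1. The first case shows that magnitude does not matter: doubling a vector leaves the similarity unchanged.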
The bounds of -1 and 1 (or 0 and 1 in positive space) apply for any number of dimensions, and cosine similarity is most commonly used in high-dimensional positive spaces. For example, in information retrieval and text mining, each term is notionally assigned a different dimension, and a document is characterised by a vector in which the value of each dimension is the number of times that term appears in the document. Cosine similarity then gives a useful measure of how similar two documents are likely to be in terms of their subject matter.
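As a rough sketch of the document case, the snippet below builds term-count vectors with `collections.Counter` and compares them. The whitespace tokenisation, the helper names `term_vector` and `document_similarity`, and the sample sentences are all assumptions made for illustration; real systems typically add steps such as stop-word removal or term weighting.

```python
import math
from collections import Counter


def term_vector(text):
    """Map each term to the number of times it appears (naive whitespace split)."""
    return Counter(text.lower().split())


def document_similarity(doc_a, doc_b):
    """Cosine similarity between the term-count vectors of two documents."""
    counts_a = term_vector(doc_a)
    counts_b = term_vector(doc_b)
    # Only terms present in both documents contribute to the dot product.
    shared_terms = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared_terms)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cannot compare an empty document")
    return dot / (norm_a * norm_b)


# Hypothetical sample documents.
related = document_similarity("the cat sat on the mat", "the cat lay on the rug")
unrelated = document_similarity("stock prices fell", "the cat sat")
```

Because term counts are never negative, the result always falls in [0,1]: documents with no terms in common score 0, and documents with identical term proportions score 1 (up to rounding).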