Cosine Similarity


Cosine similarity, as its name suggests, measures the similarity between two (or more) vectors using the cosine of the angle between them.  In this example, the vectors are in two-dimensional space, but the results generalise easily to higher dimensions.

Vector Theory

A vector has length and direction.  The vector’s absolute position is not relevant: two vectors may be the same even though their starting positions are different.  For example, the vectors in the first slide have different starting positions.  They can all be moved to the same starting position at the (0,0) origin by focussing only on their movements in the x and y directions.

For example, Vector b moves 1 unit in the x direction and 1 unit in the y direction, so this vector can be translated to (1,1) with respect to the (0,0) origin.
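
As a minimal sketch of this translation in R, assume vector b starts at (1, 0.5) and ends at (2, 1.5), which are the values used for observation B later in this post. The names bStart and bEnd are illustrative only.

```r
# start and end points of vector b (illustrative names, not from the original code)
bStart <- c(1, 0.5)
bEnd   <- c(2, 1.5)

# movement in the x and y directions, i.e. the vector rooted at the (0,0) origin
b <- bEnd - bStart
print(b)
```

Subtracting the start point from the end point leaves only the movement, which is all that matters for the comparisons that follow.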

Vector Length

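The length of a vector in two-dimensional space is the length of the hypotenuse of the right-angled triangle formed by its x and y components. The length of a vector is called its “norm”. For a vector v = (v1, v2), it is defined as:

||v|| = sqrt(v1^2 + v2^2)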
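
To make the norm concrete, the following sketch computes it for vector b (1,1) from the previous section. The variable name normB is an assumption made for illustration.

```r
# vector b rooted at the origin
b <- c(1, 1)

# norm: square each component, add them up, take the square root
normB <- sqrt(sum(b^2))
print(normB)  # approximately 1.414, the square root of 2
```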


Unit Vectors

To scale different vectors to a common length, each vector is divided by its length (its norm).  This ensures that the resultant vector has a length of one and keeps its original direction.  The result is called the unit vector:

v_unit = v / ||v||
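
A short sketch of this scaling for vector b (1,1) is given below. The name unitB is illustrative, and the example assumes the vector is not the zero vector, since that would mean dividing by zero.

```r
# vector b rooted at the origin
b <- c(1, 1)

# divide the vector by its norm to get the unit vector
unitB <- b / sqrt(sum(b^2))
print(unitB)                # both components are approximately 0.7071

# the unit vector has a length of one (up to floating point rounding)
print(sqrt(sum(unitB^2)))
```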


Dot Product

The dot product of two vectors is calculated by multiplying the corresponding components of the two vectors and then adding the results. For two vectors u = (u1, u2) and v = (v1, v2):

u · v = (u1 × v1) + (u2 × v2)


Assuming that z is (2,0), the dot product of b (1,1) and z (2,0) is (1 × 2) + (1 × 0) = 2.
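
The same calculation can be sketched in R. The name dotBZ is illustrative only.

```r
# vectors b and z rooted at the origin
b <- c(1, 1)
z <- c(2, 0)

# multiply corresponding components, then add the results
dotBZ <- sum(b * z)
print(dotBZ)  # 2
```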

Cosine Similarity

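Now that the dot product and norm have been defined, the cosine similarity of two vectors is simply the dot product of their unit vectors.  Equivalently, it is the dot product of the two vectors divided by the product of their norms:

cos(θ) = (u · v) / (||u|| × ||v||)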


Given that vector b moves up and to the right by equal amounts, it would be expected that this vector is at 45 degrees to the x axis. If the x axis is represented by z (2,0), the cosine similarity between b and z is 2 / (1.4142 × 2) = 0.7071.  The inverse cosine of this value is 0.7854 radians, or 45 degrees.
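
The calculation for b and z can be checked with a small sketch. The helper name cosineSim is hypothetical and is not part of the code in the next section.

```r
# cosineSim is an illustrative helper: dot product divided by the product of the norms
cosineSim <- function(u, v) {
  sum(u * v) / (sqrt(sum(u^2)) * sqrt(sum(v^2)))
}

b <- c(1, 1)
z <- c(2, 0)

simBZ    <- cosineSim(b, z)
angleRad <- acos(simBZ)          # inverse cosine, in radians
angleDeg <- angleRad * 180 / pi  # convert radians to degrees

print(round(simBZ, 4))     # 0.7071
print(round(angleRad, 4))  # 0.7854
print(round(angleDeg))     # 45
```

Note that the length of z does not affect the result: any vector along the positive x axis gives the same similarity, because only direction is compared.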

Implementation in R

A collection of vectors can be implemented as a matrix, with one row per vector.  The following shows how the raw scores from slide 1 can be implemented as two matrices. The matrix mRaw contains the end points of the original vectors, while mOr contains their respective starting points (origins).  The vectors are then rooted at the (0,0) origin by subtracting mOr from mRaw (as seen in slide two).

#column names
feature <- c('feat.1', 'feat.2')
#row names
observation <- c('A', 'B', 'C', 'D', 'E')
#define a 5 x 2 matrix of raw values:
mRaw <- matrix(c(1,2,2,1.5,1.5,-1.5,-2,-1,-2, 1.5),nrow = 5, byrow = TRUE)
#define a 5 x 2 matrix of origin values:
mOr <- matrix(c(0.5,1,1,0.5,0.5,-1,-0.5,-0.5,0,0.5),nrow = 5, byrow = TRUE)
#assign row and col names
dimnames(mRaw) <- list(observation, feature)
dimnames(mOr) <- list(observation, feature)
#calculate deviation from origin
mDevOr <- mRaw - mOr

Now that the vectors have a common origin, their norms are calculated and, from these, the unit vectors (slide 3).  The cosine similarity between every pair of rows is then calculated by multiplying the matrix of unit vectors by its own transpose. The results are then converted to degrees.

#calculate norms
normDevOr <- apply(mDevOr, 1, function(x) {sqrt(sum(x^2))})
#create unit vectors (vectors / norms)
mUnit <- mDevOr / normDevOr
#create similarity matrix
mDevSim <- mUnit %*% t(mUnit)
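#convert similarity to degrees
#(values are clamped to [-1, 1] first, as rounding error can otherwise make acos return NaN)
mDegrees <- round(acos(pmin(pmax(mDevSim, -1), 1)) * 180 / pi, 0)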

Interpreting Cosine Similarity

As shown in slide 3, vectors a and b are similar, with an angle of 18 degrees between them. Vectors a and c have a similarity of 0.00, which converts to an angle of 90 degrees; vectors with a similarity of 0.00 are called orthogonal.  Finally, vectors c and e have a similarity of -1.00, which converts to an angle of 180 degrees; they point in opposite directions.  The matrix of angles created by the last line of code is shown below:

#show similarity matrix (in degrees)
mDegrees
    A   B   C   D   E
A   0  18  90 135  90
B  18   0  72 153 108
C  90  72   0 135 180
D 135 153 135   0  45
E  90 108 180  45   0
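
As a cross-check on individual entries of this matrix, the sketch below recomputes three of the angles directly from the deviation vectors (mRaw minus mOr) for observations A, B, C and E. The helper name angleDeg and the variable names are illustrative only.

```r
# deviation-from-origin vectors (raw values minus origins) for A, B, C and E
vecA <- c(0.5, 1)
vecB <- c(1, 1)
vecC <- c(1, -0.5)
vecE <- c(-2, 1)

# angleDeg is an illustrative helper: angle between two vectors, in degrees
angleDeg <- function(u, v) {
  sim <- sum(u * v) / (sqrt(sum(u^2)) * sqrt(sum(v^2)))
  sim <- min(max(sim, -1), 1)  # guard against rounding just outside [-1, 1]
  acos(sim) * 180 / pi
}

print(round(angleDeg(vecA, vecB)))  # 18
print(round(angleDeg(vecA, vecC)))  # 90
print(round(angleDeg(vecC, vecE)))  # 180
```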

Github Code and Data

The complete code is listed above but can also be obtained from GitHub here.