NlpTools API
Class

NlpTools\Similarity\CosineSimilarity

class CosineSimilarity implements SimilarityInterface, DistanceInterface

Given two vectors, compute cos(theta), where theta is the angle between the two vectors in an N-dimensional vector space.

cos(theta) = (A•B) / (|A| |B|)
where '•' denotes the inner product and |A| is the Euclidean norm of A

Since the vectors are meant to be feature vectors, the value of
each vector in each dimension is simply the frequency of that
feature. Moreover, a frequency of occurrence cannot be negative, so
no vector coefficient is negative and the angle will
always be between 0 and pi/2.
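
The formula above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the function name `cosine_sketch` is hypothetical, it assumes both vectors are passed as feature name => frequency maps, and it assumes PHP 7.4 or later for the arrow functions.

```php
<?php
// Hypothetical helper, not part of NlpTools: cosine of the angle between
// two feature vectors given as feature-name => frequency maps.
function cosine_sketch(array $a, array $b): float
{
    // Inner product A•B: only features present in both vectors contribute.
    $dot = 0.0;
    foreach ($a as $feature => $value) {
        if (isset($b[$feature])) {
            $dot += $value * $b[$feature];
        }
    }

    // Euclidean norms |A| and |B|.
    $normA = sqrt(array_sum(array_map(fn($v) => $v * $v, $a)));
    $normB = sqrt(array_sum(array_map(fn($v) => $v * $v, $b)));

    // The cosine is undefined when either vector has zero length.
    if ($normA == 0.0 || $normB == 0.0) {
        throw new InvalidArgumentException('Cosine is undefined for a zero vector');
    }

    return $dot / ($normA * $normB);
}
```

With non-negative frequencies the result stays in the range 0 to 1: vectors that share no feature give 0, and vectors pointing in the same direction give 1.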

If the current key of the passed array is not the number 0, the feature
vector is assumed to have been passed as a mapping from feature name
to value, like the following:
array(
'feature_1'=>1,
'feature_2'=>0.55,
'feature_3'=>12.7,
....
)
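
The key check described above could be sketched as follows. This is an illustration only: `to_feature_map_sketch` is a hypothetical name, the sketch inspects the first key (via `array_key_first`, PHP 7.3+) rather than the current key, and it assumes that a plain list holds one entry per feature occurrence, so that counting the entries yields the frequencies. The page itself does not state how a plain list is handled.

```php
<?php
// Hypothetical helper, not part of NlpTools: normalise either input form
// to a feature-name => frequency map.
function to_feature_map_sketch(array $v): array
{
    // An empty vector has no features in either form.
    if ($v === []) {
        return [];
    }

    // First key is the integer 0: treat the array as a plain list.
    // ASSUMPTION: each list entry is one occurrence of a feature, so the
    // frequency of a feature is the number of times it appears.
    if (array_key_first($v) === 0) {
        return array_count_values($v);
    }

    // Otherwise the array is already a feature-name => value mapping.
    return $v;
}
```

Under these assumptions, `to_feature_map_sketch(array('a', 'b', 'a'))` yields a map in which 'a' has frequency 2 and 'b' has frequency 1, while a mapping such as `array('feature_1' => 1, 'feature_2' => 0.55)` is returned unchanged.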

Methods

float similarity(array $A, array $B)

dist($A, $B)

Details

at line 45
public float similarity(array $A, array $B)

Parameters

array $A Either a feature vector (feature name => value) or a plain vector
array $B Either a feature vector (feature name => value) or a plain vector

Return Value

float The cosine of the angle between the two vectors

at line 84
public dist($A, $B)

Parameters

$A
$B