class GroupAverage extends HeapLinkage
In single linkage clustering the new distance of the merged cluster with cluster i is the average distance of all points in cluster x to i and y to i.
The average distance is efficiently computed by assuming that every point from
every other point in each cluster have the same distance (the average distance).
Then the computation is simply a weighted average of the average distances.
Methods
initializeStrategy(DistanceInterface $d, array $docs)
Initialize the distance matrix and any other data structure needed to calculate the merges later. |
||
array |
getNextMerge()
Return the pair of clusters x,y to be merged. |
Details
at line 19
public
initializeStrategy(DistanceInterface $d, array $docs)
Initialize the distance matrix and any other data structure needed to calculate the merges later.
at line 37
public array
getNextMerge()
Return the pair of clusters x,y to be merged.
1. Extract the pair with the smallest distance
2. Recalculate the distance of the merged cluster with every other cluster
3. Merge the clusters (by labeling one as removed)
4. Reheap