abstract class HeapLinkage implements MergeStrategyInterface
HeapLinkage is an abstract merge strategy.
It creates a pairwise distance matrix. It then uses a heap to compute
efficiently the minimum in the distance matrix. Then the two clusters
with the minimum distance are merged and the distance of the new cluster
with every other cluster i is recomputed. This recomputation is delegated
to the children classes through the abstract function newDistance().
The class uses an SplFixedArray that is filled with the lower triangle of
the distance matrix. This is done to save memory. The index of a pair x,y
is computed as follows:
1. if x>y swap x,y
2. index = y*(y-1)/2 + x
Methods
initializeStrategy(DistanceInterface $d, array $docs)
Initialize the distance matrix and any other data structure needed to calculate the merges later. |
||
array |
getNextMerge()
Return the pair of clusters x,y to be merged. |
Details
at line 44
public
initializeStrategy(DistanceInterface $d, array $docs)
Initialize the distance matrix and any other data structure needed to calculate the merges later.
at line 78
public array
getNextMerge()
Return the pair of clusters x,y to be merged.
1. Extract the pair with the smallest distance
2. Recalculate the distance of the merged cluster with every other cluster
3. Merge the clusters (by labeling one as removed)
4. Reheap