NlpTools API
Class

NlpTools\Analysis\Idf

class Idf implements ArrayAccess

Idf implements the inverse document frequency measure.

Idf is a measure of whether a term T is common or rare across
a set of documents.
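
To make the measure concrete, here is a minimal sketch of how idf values could be computed for a small corpus. It assumes the common definition idf(t) = log(D / df(t)), where D is the number of documents and df(t) is the number of documents containing t; this page does not state the exact formula the class uses. The helper `computeIdf()` and the sample documents are hypothetical and not part of NlpTools.

```php
<?php
// Sketch only: computeIdf() is a hypothetical helper, not part of NlpTools.
// Assumption: idf(t) = log(D / df(t)) using the natural logarithm.

/**
 * @param array<int, string[]> $documents One list of tokens per document
 * @return array<string, float> Map of token => idf
 */
function computeIdf(array $documents): array
{
    // Document frequency: in how many documents does each token occur?
    $df = [];
    foreach ($documents as $tokens) {
        // array_unique() makes each token count at most once per document
        foreach (array_unique($tokens) as $token) {
            $df[$token] = ($df[$token] ?? 0) + 1;
        }
    }

    // Invert the document frequency and take the logarithm
    $documentCount = count($documents);
    $idf = [];
    foreach ($df as $token => $count) {
        $idf[$token] = log($documentCount / $count);
    }

    return $idf;
}

$idf = computeIdf([
    ['the', 'cat', 'sat'],
    ['the', 'dog', 'barked'],
    ['the', 'cat', 'slept'],
]);

// 'the' occurs in all 3 documents, 'cat' in 2, 'dog' in 1
printf("%.3f %.3f %.3f\n", $idf['the'], $idf['cat'], $idf['dog']);
// prints: 0.000 0.405 1.099
```

Under this definition a token that appears in every document scores 0, and the score grows as the token becomes rarer.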

Idf implements the ArrayAccess interface, so it should be used
as a read-only array with tokens as keys and idf values
as values.

Methods

__construct(TrainingSet $tset, FeatureFactoryInterface $ff = null)

float offsetGet(string $token)

Implements the array access interface.

bool offsetExists(string $token)

Implements the array access interface.

offsetSet($token, $value)

Will not be implemented.

offsetUnset($token)

Will not be implemented.

Details

at line 27
public __construct(TrainingSet $tset, FeatureFactoryInterface $ff = null)

Parameters

TrainingSet $tset The set of documents for which we will compute the idf
FeatureFactoryInterface $ff An optional feature factory that translates the document data to single tokens

at line 61
public float offsetGet(string $token)

Implements the array access interface.

Returns the computed idf for a known token, or
the logarithm of the number of documents for a token that was not
seen before.

Parameters

string $token The token to return the idf for

Return Value

float The idf
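
The fallback for unseen tokens described above can be sketched as follows. `lookupIdf()` is a hypothetical stand-in for this method, not the library's source, and the idf values are assumed to follow log(D / df). Under that assumption, log(D) is the score of a token that occurs in exactly one document, so an unseen token is treated like the rarest token in the corpus.

```php
<?php
// Sketch only: lookupIdf() is a hypothetical stand-in for Idf::offsetGet().

/**
 * @param array<string, float> $idf           Precomputed token => idf map
 * @param int                  $documentCount Number of documents in the corpus (D)
 */
function lookupIdf(array $idf, int $documentCount, string $token): float
{
    // Known token: return the stored value.
    // array_key_exists() is used so that a stored 0.0 is still returned.
    if (array_key_exists($token, $idf)) {
        return $idf[$token];
    }

    // Unseen token: fall back to the logarithm of the document count
    return log($documentCount);
}

// Example values for a corpus of 3 documents (assumed, for illustration)
$known = ['the' => 0.0, 'dog' => log(3)];

printf("%.4f\n", lookupIdf($known, 3, 'the'));     // prints: 0.0000
printf("%.4f\n", lookupIdf($known, 3, 'unicorn')); // prints: 1.0986
```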

at line 77
public bool offsetExists(string $token)

Implements the array access interface.

Returns true if the token exists
in the corpus.

Parameters

string $token The token to look for in the corpus

Return Value

bool True if the token exists in the corpus

at line 86
public offsetSet($token, $value)

Will not be implemented.

Throws \BadMethodCallException because
one should not be able to alter the idf values directly.

Parameters

$token
$value

at line 95
public offsetUnset($token)

Will not be implemented.

Throws \BadMethodCallException because
one should not be able to alter the idf values directly.

Parameters

$token
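
The read-only behaviour of offsetSet and offsetUnset can be illustrated with a minimal, self-contained class. `ReadOnlyScores` is a hypothetical example written for PHP 8, not the NlpTools source; it only shows how an ArrayAccess implementation can allow reads while rejecting writes with \BadMethodCallException.

```php
<?php
// Sketch only: ReadOnlyScores is a hypothetical class, not the NlpTools source.

final class ReadOnlyScores implements ArrayAccess
{
    /** @var array<string, float> */
    private array $scores;

    public function __construct(array $scores)
    {
        $this->scores = $scores;
    }

    public function offsetExists(mixed $token): bool
    {
        return isset($this->scores[$token]);
    }

    public function offsetGet(mixed $token): float
    {
        return $this->scores[$token];
    }

    public function offsetSet(mixed $token, mixed $value): void
    {
        // Writes are rejected so the values cannot be altered directly
        throw new BadMethodCallException('The values are read-only');
    }

    public function offsetUnset(mixed $token): void
    {
        // Removal is rejected for the same reason
        throw new BadMethodCallException('The values are read-only');
    }
}

$scores = new ReadOnlyScores(['cat' => 0.405]);

try {
    $scores['cat'] = 1.0;   // calls offsetSet()
} catch (BadMethodCallException $e) {
    echo "write rejected\n";
}

try {
    unset($scores['cat']);  // calls offsetUnset()
} catch (BadMethodCallException $e) {
    echo "unset rejected\n";
}
```

Reading with `$scores['cat']` and checking with `isset($scores['cat'])` keep working, since they map to offsetGet and offsetExists.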