NlpTools API
Class

NlpTools\Models\Maxent

class Maxent extends LinearModel

Maxent is a model that assigns a weight to each feature so that, taken together, the weights maximize the conditional log likelihood of the training data.

Because it does so without making any assumptions about the data,
it is called a maximum entropy (maximum ignorance) model.

Constants

INITIAL_PARAM_VALUE

Methods

__construct(array $l)

from LinearModel
float getWeight(string $feature)

Get the weight for a given feature

from LinearModel
array getWeights()

Get all the weights as an array.

from LinearModel
void train(FeatureFactoryInterface $ff, TrainingSet $tset, MaxentOptimizerInterface $opt)

Calculate all the features for every possible class.

float P(array $classes, FeatureFactoryInterface $ff, DocumentInterface $d, string $class)

Calculate the probability that document $d belongs to class $class, given a set of possible classes, a feature factory and the model's weights l[i].

CLogLik(TrainingSet $tset, FeatureFactoryInterface $ff)

Not implemented yet.

dumpWeights()

Simply print_r the weights.

Details

in LinearModel at line 18
public __construct(array $l)

Parameters

array $l

in LinearModel at line 28
public float getWeight(string $feature)

Get the weight for a given feature

Parameters

string $feature The feature for which the weight will be returned

Return Value

float The weight

in LinearModel at line 39
public array getWeights()

Get all the weights as an array.

Return Value

array The weights as an associative array

at line 29
public void train(FeatureFactoryInterface $ff, TrainingSet $tset, MaxentOptimizerInterface $opt)

Calculate all the features for every possible class.

Then pass this
information to the optimizer, which finds the weights that satisfy the
constraints and maximize the entropy.

Parameters

FeatureFactoryInterface $ff The feature factory
TrainingSet $tset A collection of training documents
MaxentOptimizerInterface $opt The optimizer; it must be a maxent optimizer

Return Value

void
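
To illustrate what "finding the weights" can mean, here is a minimal, self-contained sketch that does not use NlpTools. It swaps in plain batch gradient ascent on the conditional log likelihood, which is not necessarily what the library's optimizers do. The names firedFeatures, classDistribution and fitWeights, the "class:token" feature scheme and the toy data are all assumptions made for this example.

```php
<?php
// Hypothetical training data: [class, tokens] pairs.
$trainingSet = [
    ['spam', ['buy']],
    ['ham', ['hello']],
];
$classes = ['spam', 'ham'];

// Hypothetical feature function: one "class:token" feature per token.
function firedFeatures(string $class, array $tokens): array
{
    return array_map(fn ($t) => "{$class}:{$t}", $tokens);
}

// P(class | tokens) for every class under the current weights.
function classDistribution(array $classes, array $weights, array $tokens): array
{
    $scores = [];
    foreach ($classes as $c) {
        $sum = 0.0;
        foreach (firedFeatures($c, $tokens) as $f) {
            $sum += $weights[$f] ?? 0.0; // unseen features weigh nothing
        }
        $scores[$c] = exp($sum);
    }
    $total = array_sum($scores);
    return array_map(fn ($s) => $s / $total, $scores);
}

// Plain gradient ascent: for each feature, the gradient of the conditional
// log likelihood is (observed count) - (count expected under the model).
function fitWeights(array $trainingSet, array $classes, float $step, int $iterations): array
{
    $weights = [];
    for ($i = 0; $i < $iterations; $i++) {
        $gradient = [];
        foreach ($trainingSet as [$class, $tokens]) {
            // Observed counts: features fired by the true class.
            foreach (firedFeatures($class, $tokens) as $f) {
                $gradient[$f] = ($gradient[$f] ?? 0.0) + 1.0;
            }
            // Expected counts: features of every class, weighted by P.
            foreach (classDistribution($classes, $weights, $tokens) as $c => $p) {
                foreach (firedFeatures($c, $tokens) as $f) {
                    $gradient[$f] = ($gradient[$f] ?? 0.0) - $p;
                }
            }
        }
        foreach ($gradient as $f => $g) {
            $weights[$f] = ($weights[$f] ?? 0.0) + $step * $g;
        }
    }
    return $weights;
}

$weights = fitWeights($trainingSet, $classes, 0.5, 50);
```

In this toy run the weight of "spam:buy" grows while "ham:buy" shrinks, so the fitted model prefers "spam" for a document containing "buy". The step size and iteration count are arbitrary choices for the sketch.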

at line 78
public float P(array $classes, FeatureFactoryInterface $ff, DocumentInterface $d, string $class)

Calculate the probability that document $d belongs to class $class, given a set of possible classes, a feature factory and the model's weights l[i].

Parameters

array $classes The set of possible classes
FeatureFactoryInterface $ff The feature factory
DocumentInterface $d The document
string $class A class for which we calculate the probability

Return Value

float The probability that document $d belongs to class $class
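
The usual maxent form of this probability is a normalized exponential of the summed feature weights. The following is a minimal sketch under that assumption, written without NlpTools. The $features closure stands in for a feature factory, and maxentScore, maxentProbability and the "class:token" feature names are hypothetical.

```php
<?php
// Hypothetical stand-in for the model's weights l[i], keyed by feature name.
$weights = [
    'spam:buy' => 1.0,
    'ham:hello' => 1.0,
];

// Hypothetical stand-in for a feature factory: returns the names of the
// features that fire for a (class, document) pair.
$features = function (string $class, array $tokens): array {
    $fired = [];
    foreach ($tokens as $token) {
        $fired[] = "{$class}:{$token}";
    }
    return $fired;
};

// Score a class as exp(sum of the weights of its active features).
function maxentScore(array $weights, callable $features, array $tokens, string $class): float
{
    $sum = 0.0;
    foreach ($features($class, $tokens) as $f) {
        $sum += $weights[$f] ?? 0.0; // unseen features contribute nothing
    }
    return exp($sum);
}

// Normalize the score of $class over the scores of all candidate classes.
function maxentProbability(array $classes, array $weights, callable $features, array $tokens, string $class): float
{
    $total = 0.0;
    foreach ($classes as $c) {
        $total += maxentScore($weights, $features, $tokens, $c);
    }
    return maxentScore($weights, $features, $tokens, $class) / $total;
}

$pSpam = maxentProbability(['spam', 'ham'], $weights, $features, ['buy'], 'spam');
$pHam = maxentProbability(['spam', 'ham'], $weights, $features, ['buy'], 'ham');
```

Here $pSpam equals e / (e + 1), roughly 0.73, because only the "spam:buy" feature carries weight for this document, and $pSpam + $pHam is 1.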

at line 99
public CLogLik(TrainingSet $tset, FeatureFactoryInterface $ff)

Not implemented yet.

Simply put:
result += log( $this->P(..., ..., ...) ) for every doc in TrainingSet

Parameters

TrainingSet $tset
FeatureFactoryInterface $ff

Exceptions

Exception
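
Since the method itself is not implemented, the description above can be sketched outside the library as follows. This is an illustration only: classProbability is a hypothetical stand-in for P(), the training set is a plain array of [class, tokens] pairs rather than a TrainingSet, and the "class:token" feature scheme is assumed.

```php
<?php
// Hypothetical probability function standing in for P(): a softmax over
// per-class scores, where a score is the sum of the weights of the
// "class:token" features that fire.
function classProbability(array $classes, array $weights, array $tokens, string $class): float
{
    $scores = [];
    foreach ($classes as $c) {
        $sum = 0.0;
        foreach ($tokens as $t) {
            $sum += $weights["{$c}:{$t}"] ?? 0.0;
        }
        $scores[$c] = exp($sum);
    }
    return $scores[$class] / array_sum($scores);
}

// Conditional log likelihood: add log P(class | document) for every
// labelled document in the training set.
function conditionalLogLikelihood(array $trainingSet, array $classes, array $weights): float
{
    $result = 0.0;
    foreach ($trainingSet as [$class, $tokens]) {
        $result += log(classProbability($classes, $weights, $tokens, $class));
    }
    return $result;
}

$trainingSet = [
    ['spam', ['buy', 'now']],
    ['ham', ['hello']],
];
$classes = ['spam', 'ham'];

// With no weights every class is equally likely, so each document adds log(0.5).
$untrained = conditionalLogLikelihood($trainingSet, $classes, []);

// Weights that favor the correct class raise the likelihood toward 0.
$trained = conditionalLogLikelihood($trainingSet, $classes, ['spam:buy' => 2.0, 'ham:hello' => 2.0]);
```

The value is always at most 0, and better-fitting weights move it closer to 0, which is why training maximizes it.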

at line 108
public dumpWeights()

Simply print_r the weights.

Useful for debugging when
working with small training sets and few features.