NlpTools API
Class

NlpTools\Models\FeatureBasedNB

class FeatureBasedNB implements MultinomialNBModelInterface

Implements the MultinomialNBModelInterface by training on a TrainingSet, using a FeatureFactoryInterface to extract features and additive smoothing to estimate the probabilities.
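A minimal usage sketch. It assumes the library's companion classes TokensDocument, DataAsFeatures and MultinomialNBClassifier (not described on this page); the labels and tokens are made up:

    use NlpTools\Models\FeatureBasedNB;
    use NlpTools\Documents\TrainingSet;
    use NlpTools\Documents\TokensDocument;
    use NlpTools\FeatureFactories\DataAsFeatures;
    use NlpTools\Classifiers\MultinomialNBClassifier;

    // Build a labelled training set from pre-tokenized documents
    $tset = new TrainingSet();
    $tset->addDocument('spam', new TokensDocument(array('buy', 'pills', 'now')));
    $tset->addDocument('ham', new TokensDocument(array('meeting', 'at', 'noon')));

    // Train the model, using the tokens themselves as features
    $ff = new DataAsFeatures();
    $model = new FeatureBasedNB();
    $model->train($ff, $tset);

    // Classify a new document with the trained model
    $cls = new MultinomialNBClassifier($ff, $model);
    $label = $cls->classify(array('spam', 'ham'), new TokensDocument(array('buy', 'now')));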

Methods

__construct()

float getPrior(string $class)

Return the prior probability P(c) of class $class as computed from the training data.

float getCondProb(string $term, string $class)

Return the conditional probability of a term for a given class.

array train_with_context(array $train_ctx, FeatureFactoryInterface $ff, TrainingSet $tset, integer $a_smoothing = 1)

Train on the given set and fill the model's variables.

array train(FeatureFactoryInterface $ff, TrainingSet $tset, integer $a_smoothing = 1)

Train on the given set and fill the model's variables.

__sleep()

Save only the probabilities so the trained model can be reused after serialization.

Details

at line 21
public __construct()

at line 35
public float getPrior(string $class)

Return the prior probability P(c) of class $class as computed from the training data.

Parameters

string $class

Return Value

float prior probability
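
For illustration, using the NDocs[c] / NDocs estimate described under train() (the counts and label below are hypothetical):

    // If 3 of the 4 training documents were labelled "spam":
    $p = $model->getPrior('spam'); // 0.75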

at line 47
public float getCondProb(string $term, string $class)

Return the conditional probability of a term for a given class.

Parameters

string $term The term (word, feature id, ...)
string $class The class

Return Value

float
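
A short sketch of querying a trained model (the term and label are made up; the exact values depend on the training data):

    // Smoothed estimate of P(term | class)
    $p = $model->getCondProb('buy', 'spam');

    // Because of the additive smoothing, a term never seen in "spam"
    // still receives a small non-zero probability instead of 0.
    $q = $model->getCondProb('some-unseen-term', 'spam');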

at line 70
public array train_with_context(array $train_ctx, FeatureFactoryInterface $ff, TrainingSet $tset, integer $a_smoothing = 1)

Train on the given set and fill the model's variables.

Use the provided training context to update the counts as if this training set were appended to the previous one that produced the context.

It can be used for incremental training. It is not meant to be used with the same training set twice.

Parameters

array $train_ctx The previous training context
FeatureFactoryInterface $ff A feature factory to compute features from a training document
TrainingSet $tset The training set
integer $a_smoothing The parameter for additive smoothing. Defaults to add-one smoothing.

Return Value

array A training context to be used for further incremental training, although keeping it is not strictly necessary since the changes also happen in place
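
A minimal incremental-training sketch, assuming $firstBatch and $secondBatch are two separate TrainingSet instances and using the library's DataAsFeatures feature factory:

    // Train on the first batch and keep the returned context
    $ff = new DataAsFeatures();
    $model = new FeatureBasedNB();
    $ctx = $model->train($ff, $firstBatch);

    // Later, fold newly labelled documents into the same model
    // without retraining on the first batch
    $model->train_with_context($ctx, $ff, $secondBatch);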

at line 112
public array train(FeatureFactoryInterface $ff, TrainingSet $tset, integer $a_smoothing = 1)

Train on the given set and fill the model's variables.

priors[c] = NDocs[c] / NDocs
condprob[t][c] = ( count(t in c) + 1 ) / sum( count(t' in c) + 1, for every t' )
unknown[c] = condprob['a word that does not exist in c'][c] (so that count(t in c) == 0)

More information on the algorithm can be found at
http://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-classification-1.html
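
A small worked instance of the add-one estimate above (all counts are hypothetical, and the sum is taken over the whole vocabulary):

    // "buy" occurs 3 times in class "spam", all terms occur 10 times in "spam"
    // in total, and the vocabulary contains 5 distinct terms:
    // condprob['buy']['spam'] = (3 + 1) / (10 + 5)
    $condprob = (3 + 1) / (10 + 5); // ~0.267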

Parameters

FeatureFactoryInterface $ff A feature factory to compute features from a training document
TrainingSet $tset The training set
integer $a_smoothing The parameter for additive smoothing. Defaults to add-one smoothing.

Return Value

array A training context to be used for incremental training via train_with_context

at line 192
public __sleep()

Save only the probabilities so the trained model can be reused after serialization.
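
Since __sleep() keeps only the probability arrays, a trained model can be persisted with PHP's native serialization; a sketch (the file name is made up):

    // Save a trained model and restore it later
    file_put_contents('nb_model.dat', serialize($model));
    $model = unserialize(file_get_contents('nb_model.dat'));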