class FeatureBasedNB implements MultinomialNBModelInterface
Implements a multinomial naive Bayes model (MultinomialNBModelInterface) by training on a TrainingSet with a FeatureFactoryInterface and additive smoothing. A usage sketch follows the method list below.
Methods
__construct()

float getPrior(string $class)
Return the prior probability of class $class, P(c), as computed from the training data.

float getCondProb(string $term, string $class)
Return the conditional probability of a term for a given class.

array train_with_context(array $train_ctx, FeatureFactoryInterface $ff, TrainingSet $tset, integer $a_smoothing = 1)
Train on the given set and fill the model's variables using a previous training context (incremental training).

array train(FeatureFactoryInterface $ff, TrainingSet $tset, integer $a_smoothing = 1)
Train on the given set and fill the model's variables.

__sleep()
Save only the probabilities for reuse.
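A minimal usage sketch. The tokenizer, document, feature factory, and classifier classes below are other NlpTools components; the labels, documents, and the composer autoload path are illustrative assumptions, not part of this page:

<?php
// Assumes NlpTools is installed via composer (path is illustrative).
require __DIR__.'/vendor/autoload.php';

use NlpTools\Tokenizers\WhitespaceTokenizer;
use NlpTools\Documents\TokensDocument;
use NlpTools\Documents\TrainingSet;
use NlpTools\FeatureFactories\DataAsFeatures;
use NlpTools\Models\FeatureBasedNB;
use NlpTools\Classifiers\MultinomialNBClassifier;

$tok = new WhitespaceTokenizer();

// Toy training data (made-up labels and documents).
$tset = new TrainingSet();
$tset->addDocument('spam', new TokensDocument($tok->tokenize("win free money now")));
$tset->addDocument('ham', new TokensDocument($tok->tokenize("let us meet for lunch tomorrow")));

// Train with the default additive smoothing (a_smoothing = 1).
$model = new FeatureBasedNB();
$model->train(new DataAsFeatures(), $tset);

// Use the trained model through a multinomial NB classifier.
$cls = new MultinomialNBClassifier(new DataAsFeatures(), $model);
echo $cls->classify(array('spam', 'ham'), new TokensDocument($tok->tokenize("free money"))), "\n";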
Details
at line 21
public
__construct()
at line 35
public float
getPrior(string $class)
Return the prior probability of class $class, P(c), as computed from the training data.
at line 47
public float
getCondProb(string $term, string $class)
Return the conditional probability of a term for a given class.
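A short sketch of how these two accessors are typically combined into a naive Bayes score for a document (the classes, terms, and already trained $model are assumed from the usage sketch above; a classifier such as MultinomialNBClassifier does this for you):

// Score each class in log space to avoid numeric underflow.
$classes = array('spam', 'ham');
$terms = array('free', 'money');   // features extracted from the document

$best = null;
$bestScore = -INF;
foreach ($classes as $c) {
    $score = log($model->getPrior($c));
    foreach ($terms as $t) {
        // Terms never seen in class $c fall back to the smoothed
        // "unknown" probability computed during training (see train()).
        $score += log($model->getCondProb($t, $c));
    }
    if ($score > $bestScore) {
        $bestScore = $score;
        $best = $c;
    }
}
echo "Predicted: $best\n";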
at line 70
public array
train_with_context(array $train_ctx, FeatureFactoryInterface $ff, TrainingSet $tset, integer $a_smoothing = 1)
Train on the given set and fill the model's variables.
Use the training context provided to update the counts as if the training set were appended to the one that produced the context. This allows incremental training. It is not meant to be used with the same training set twice.
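A hedged sketch of incremental training, assuming train() returns a training context array compatible with $train_ctx, and that $firstBatch and $secondBatch are two disjoint TrainingSet instances (both names are made up):

$model = new FeatureBasedNB();

// Initial training pass; keep the returned context for later updates.
$ctx = $model->train(new DataAsFeatures(), $firstBatch);

// Later, update the model with new documents only (never the same set twice).
$ctx = $model->train_with_context($ctx, new DataAsFeatures(), $secondBatch);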
at line 112
public array
train(FeatureFactoryInterface $ff, TrainingSet $tset, integer $a_smoothing = 1)
Train on the given set and fill the model's variables.

priors[c] = NDocs[c] / NDocs
condprob[t][c] = ( count(t in c) + 1 ) / sum( count(t' in c) + 1, for every t' in the vocabulary )
unknown[c] = condprob[t][c] for any term t that does not occur in c (i.e. count(t in c) == 0)
More information on the algorithm can be found at
http://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-classification-1.html
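A toy illustration of the smoothed estimate above, with hypothetical counts for one class and the default $a_smoothing = 1 (the assumption that larger values of $a_smoothing replace the +1 follows from the parameter name, not from this page):

// Hypothetical term counts for class c over a 3-term vocabulary.
$counts = array('free' => 3, 'money' => 2, 'meeting' => 0);
$a = 1;                                   // additive smoothing weight
$total = array_sum($counts);              // 5 term occurrences in c
$vocab = count($counts);                  // 3 distinct terms

// condprob['free'][c] = (3 + 1) / (5 + 3) = 0.5
$condprobFree = ($counts['free'] + $a) / ($total + $a * $vocab);

// unknown[c]: a term with count 0 gets (0 + 1) / (5 + 3) = 0.125
$unknown = (0 + $a) / ($total + $a * $vocab);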
at line 192
public
__sleep()
Save only the probabilities so that the model can be reused without retraining.
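A sketch of persisting and restoring a trained model with PHP's native serialization, which is what __sleep() hooks into (the file name is illustrative):

// Persist the learned priors and conditional probabilities.
file_put_contents('nb.model', serialize($model));

// Later (e.g. in another request): restore and classify without retraining.
$model = unserialize(file_get_contents('nb.model'));
$cls = new MultinomialNBClassifier(new DataAsFeatures(), $model);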