Classifiers
In machine learning, classification is the problem of identifying to which of a set of categories (classes) a new observation belongs. (Wikipedia)
Interface
The Classifier interface contains a single function, classify. It receives the set of categories (classes) and a Document, which holds all the data of a new observation, and returns the predicted class for that Document.
    interface ClassifierInterface
    {
        /**
         * Decide which class C, member of $classes, $d fits best.
         *
         * @param array $classes A set of classes
         * @param Document $d A Document
         * @return string A class
         */
        public function classify(array $classes, Document $d);
    }
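To illustrate the contract, here is a minimal sketch of a class that satisfies the interface. The stand-in `Document` class with its public `$tokens` property and the `KeywordClassifier` are hypothetical, invented for this example; they are not the library's own classes. The interface is re-declared only so the snippet runs on its own.

```php
<?php
// Hypothetical stand-in for a Document: just a bag of tokens.
class Document
{
    public $tokens;

    public function __construct(array $tokens)
    {
        $this->tokens = $tokens;
    }
}

// Re-declared here only to keep the sketch self-contained.
interface ClassifierInterface
{
    public function classify(array $classes, Document $d);
}

// Hypothetical classifier: picks the class whose keyword list
// shares the most tokens with the document.
class KeywordClassifier implements ClassifierInterface
{
    private $keywords;

    public function __construct(array $keywords)
    {
        $this->keywords = $keywords;
    }

    public function classify(array $classes, Document $d)
    {
        $best = null;
        $bestScore = -1;
        foreach ($classes as $class) {
            $words = isset($this->keywords[$class]) ? $this->keywords[$class] : array();
            // Count the document tokens that appear in this class's keyword list.
            $score = count(array_intersect($d->tokens, $words));
            if ($score > $bestScore) { // the first class wins ties
                $bestScore = $score;
                $best = $class;
            }
        }
        return $best;
    }
}

$clf = new KeywordClassifier(array(
    'spam' => array('buy', 'cheap', 'pills'),
    'ham'  => array('meeting', 'lunch'),
));
$predicted = $clf->classify(
    array('spam', 'ham'),
    new Document(array('cheap', 'watches', 'buy', 'now'))
);
```

Any object that implements classify in this way can be used wherever a classifier is expected, regardless of how it reaches its decision.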
Feature based linear classifier
This classifier needs a Feature Factory and a Linear Model.
For a given document and a class C_i, the feature vector is computed. The linear combination of that vector with the weights of the Linear Model gives the vote for C_i. The class C with the highest vote is the predicted class.
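The voting procedure can be sketched as follows. This is a simplified illustration, not the library's implementation: `extractFeatures`, `vote` and `classifyLinear` are hypothetical names, the feature function stands in for a Feature Factory, and the weights are made-up values standing in for a trained Linear Model.

```php
<?php
// Hypothetical feature function: one named feature per (class, token) pair.
function extractFeatures($class, array $tokens)
{
    $features = array();
    foreach ($tokens as $token) {
        $features[] = $class . ':' . $token;
    }
    return $features;
}

// Vote for a class: each feature that fires adds its weight.
// Features without a weight in the model contribute nothing.
function vote(array $weights, $class, array $tokens)
{
    $sum = 0.0;
    foreach (extractFeatures($class, $tokens) as $feature) {
        if (isset($weights[$feature])) {
            $sum += $weights[$feature];
        }
    }
    return $sum;
}

// Predicted class: the one with the highest vote (the first class wins ties).
function classifyLinear(array $weights, array $classes, array $tokens)
{
    $best = null;
    $bestVote = -INF;
    foreach ($classes as $class) {
        $v = vote($weights, $class, $tokens);
        if ($v > $bestVote) {
            $bestVote = $v;
            $best = $class;
        }
    }
    return $best;
}

// Made-up weights, for illustration only.
$weights = array(
    'spam:cheap'  => 1.5,
    'spam:buy'    => 0.75,
    'ham:meeting' => 2.0,
    'ham:buy'     => 0.25,
);
$linearPrediction = classifyLinear(
    $weights,
    array('spam', 'ham'),
    array('buy', 'cheap', 'watches')
);
```

Because the features depend on both the document and the candidate class, a single weight table can serve all classes.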
Multinomial Naive Bayes Classifier
This classifier also needs a Feature Factory. In addition it needs a Multinomial Naive Bayes Model.
You can find a thorough explanation of this method by the Stanford NLP department.
In short, the probability of a document d belonging to the class c is computed using the prior probabilities of the classes and the assumption that each indicator random variable (the features, in our case) is conditionally independent of every other, given the class.
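Under that assumption, the score of a class is its prior multiplied by the conditional probability of every feature in the document, and the class with the highest score is predicted. The sketch below computes this in log space to avoid numerical underflow. It is an illustration only: `logScore` and `classifyNaiveBayes` are hypothetical names, the probabilities are made up rather than estimated from data, and the fixed fallback probability for unseen features is an assumption standing in for proper smoothing.

```php
<?php
// Log-space score of a class: log P(c) + sum of log P(f|c) over the features.
function logScore(array $priors, array $condProbs, $class, array $features)
{
    $score = log($priors[$class]);
    foreach ($features as $f) {
        // Assumption: unseen features get a small fixed probability
        // instead of zero; a trained model would use real smoothing.
        $p = isset($condProbs[$class][$f]) ? $condProbs[$class][$f] : 1e-6;
        $score += log($p);
    }
    return $score;
}

// Predicted class: the one with the highest log score (the first class wins ties).
function classifyNaiveBayes(array $priors, array $condProbs, array $classes, array $features)
{
    $best = null;
    $bestScore = -INF;
    foreach ($classes as $class) {
        $s = logScore($priors, $condProbs, $class, $features);
        if ($s > $bestScore) {
            $bestScore = $s;
            $best = $class;
        }
    }
    return $best;
}

// Made-up probabilities, for illustration only.
$priors = array('spam' => 0.4, 'ham' => 0.6);
$condProbs = array(
    'spam' => array('cheap' => 0.3,  'meeting' => 0.01),
    'ham'  => array('cheap' => 0.02, 'meeting' => 0.2),
);
$nbPrediction = classifyNaiveBayes(
    $priors,
    $condProbs,
    array('spam', 'ham'),
    array('cheap', 'cheap', 'meeting')
);
```

A feature that occurs several times in the document contributes its log probability once per occurrence, which is what makes the model multinomial.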