Classes | NlpTools API

NlpTools\Analysis\FreqDist	Extracts the frequency distribution of keywords.
NlpTools\Analysis\Idf	Idf implements the inverse document frequency measure.
NlpTools\Classifiers\ClassifierInterface
NlpTools\Classifiers\FeatureBasedLinearClassifier	Classify using a linear model.
NlpTools\Classifiers\MultinomialNBClassifier	Uses a multinomial NB model to classify a document.
NlpTools\Clustering\CentroidFactories\CentroidFactoryInterface
NlpTools\Clustering\CentroidFactories\Euclidean	Computes the euclidean centroid of the provided sparse vectors
NlpTools\Clustering\CentroidFactories\Hamming	This class computes the centroid of the hamming distance between two strings that are the binary representations of two integers (the strings are supposed to only contain the characters 1 and 0).
NlpTools\Clustering\CentroidFactories\MeanAngle	MeanAngle computes the unit vector with angle the average of all the given vectors.
NlpTools\Clustering\Clusterer
NlpTools\Clustering\Hierarchical	This class implements hierarchical agglomerative clustering.
NlpTools\Clustering\KMeans	This clusterer uses the KMeans algorithm for clustering documents.
NlpTools\Clustering\MergeStrategies\CompleteLink	In complete linkage clustering the new distance of the merged cluster with cluster i is the maximum distance of either cluster x to i or y to i.
NlpTools\Clustering\MergeStrategies\GroupAverage	In group average clustering the new distance of the merged cluster with cluster i is the average distance of all points in cluster x to i and y to i.
NlpTools\Clustering\MergeStrategies\HeapLinkage	HeapLinkage is an abstract merge strategy.
NlpTools\Clustering\MergeStrategies\MergeStrategyInterface	In hierarchical agglomerative clustering each document starts in its own cluster and then it is subsequently merged with the "closest" cluster.
NlpTools\Clustering\MergeStrategies\SingleLink	In single linkage clustering the new distance of the merged cluster with cluster i is the smallest distance of either cluster x to i or y to i.
NlpTools\Documents\DocumentInterface	A Document is a representation of a Document to be classified.
NlpTools\Documents\RawDocument	RawDocument simply encapsulates a PHP variable.
NlpTools\Documents\TokensDocument	Represents a bag of words (tokens) document.
NlpTools\Documents\TrainingDocument	A TrainingDocument is a document that "decorates" any other document to add the real class of the document.
NlpTools\Documents\TrainingSet	A collection of TrainingDocument objects.
NlpTools\Documents\WordDocument	A Document that represents a single word but with a context of a larger document.
NlpTools\Exceptions\InvalidExpression	Used by the tokenization, primarily
NlpTools\FeatureFactories\DataAsFeatures
NlpTools\FeatureFactories\FeatureFactoryInterface
NlpTools\FeatureFactories\FunctionFeatures	An implementation of FeatureFactoryInterface that takes any number of callables (function names, closures, array($object,'func_name'), etc.) and calls them consecutively using the return value as a feature's unique string.
NlpTools\Models\FeatureBasedNB	Implement a MultinomialNBModel by training on a TrainingSet with a FeatureFactoryInterface and additive smoothing.
NlpTools\Models\Lda	Topic discovery with latent Dirichlet allocation using Gibbs sampling.
NlpTools\Models\LinearModel	This class represents a linear model of the following form f(x_vec) = l1*x1 + l2*x2 + l3*x3 ...
NlpTools\Models\Maxent	Maxent is a model that assigns a weight for each feature such that all the weights maximize the Conditional Log Likelihood of the training data.
NlpTools\Models\MultinomialNBModelInterface	Interface that describes a NB model.
NlpTools\Optimizers\ExternalMaxentOptimizer	This class enables the use of a program written in a different language to optimize our model and return the weights for use in PHP.
NlpTools\Optimizers\FeatureBasedLinearOptimizerInterface
NlpTools\Optimizers\GradientDescentOptimizer	Implements gradient descent with fixed step.
NlpTools\Optimizers\MaxentGradientDescent	Implement a gradient descent algorithm that maximizes the conditional log likelihood of the training data.
NlpTools\Optimizers\MaxentOptimizerInterface	Marker interface to use with the Maxent model for type checking
NlpTools\Random\Distributions\AbstractDistribution
NlpTools\Random\Distributions\Dirichlet	Implement a k-dimensional Dirichlet distribution using draws from k gamma distributions and then normalizing.
NlpTools\Random\Distributions\Gamma	Implement the gamma distribution.
NlpTools\Random\Distributions\Normal
NlpTools\Random\Generators\FromFile	Return floats from a file.
NlpTools\Random\Generators\GeneratorInterface	An interface for pseudo-random number generators.
NlpTools\Random\Generators\MersenneTwister	A simple wrapper over the built-in mt_rand() function.
NlpTools\Similarity\CosineSimilarity	Given two vectors compute cos(theta) where theta is the angle between the two vectors in a N-dimensional vector space.
NlpTools\Similarity\DistanceInterface	Distance should return a number proportional to how dissimilar the two instances are (with any metric).
NlpTools\Similarity\Euclidean	This class computes the very simple euclidean distance between two vectors ( sqrt(sum((a_i-b_i)^2)) ).
NlpTools\Similarity\HammingDistance	This class implements the hamming distance of two strings or sets.
NlpTools\Similarity\JaccardIndex	http://en.wikipedia.org/wiki/Jaccard_index
NlpTools\Similarity\Simhash	Simhash is an implementation of the locality sensitive hash function families proposed by Moses Charikar using the Earth Mover's Distance http://www.cs.princeton.edu/courses/archive/spring04/cos598B/bib/CharikarEstim.pdf
NlpTools\Similarity\SimilarityInterface	Similarity should return a number that is proportional to how similar those two instances are (with any metric).
NlpTools\Stemmers\GreekStemmer	This stemmer is an implementation of the stemmer described by G.
NlpTools\Stemmers\LancasterStemmer	A word stemmer based on the Lancaster stemming algorithm.
NlpTools\Stemmers\PorterStemmer	Copyright 2013 Katharopoulos Angelos <katharas@gmail.com>
NlpTools\Stemmers\RegexStemmer	This stemmer removes affixes according to a regular expression.
NlpTools\Stemmers\Stemmer	http://en.wikipedia.org/wiki/Stemming
NlpTools\Tokenizers\ClassifierBasedTokenizer	A tokenizer that uses a classifier (of any type) to determine if there is an "end of word" (EOW).
NlpTools\Tokenizers\PennTreeBankTokenizer	PennTreeBank Tokenizer Based on http://www.cis.upenn.edu/~treebank/tokenizer.sed
NlpTools\Tokenizers\RegexTokenizer	Regex tokenizer tokenizes text based on a set of regexes
NlpTools\Tokenizers\TokenizerInterface
NlpTools\Tokenizers\WhitespaceAndPunctuationTokenizer	Simple white space tokenizer.
NlpTools\Tokenizers\WhitespaceTokenizer	Simple white space tokenizer.
NlpTools\Utils\ClassifierBasedTransformation	Classify whatever is passed in the transform and pass it a different set of transformations based on the class.
NlpTools\Utils\EnglishVowels	Helper Vowel class, determines if the character at a given index is a vowel
NlpTools\Utils\Normalizers\English	For English we simply transform to lower case using mb_strtolower.
NlpTools\Utils\Normalizers\Greek	To normalize Greek text we use mb_strtolower to transform to lower case and then replace every accented character with its non-accented counterpart and the final ς with σ.
NlpTools\Utils\Normalizers\Normalizer	The Normalizer's purpose is to transform any word from any one of the possible writings to a single writing consistently.
NlpTools\Utils\StopWords	Stop Words are words which are filtered out because they carry little to no information.
NlpTools\Utils\TransformationInterface	TransformationInterface represents any type of transformation to be applied upon documents.
NlpTools\Utils\VowelsAbstractFactory	Factory wrapper for Vowels
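
The three merge strategies listed under NlpTools\Clustering\MergeStrategies differ only in how the distance from a newly merged cluster (x and y combined) to another cluster i is updated. The following is a minimal, language-neutral sketch of those update rules in Python. It does not use or mirror the NlpTools API: the function name merged_distance, its parameters, and the strategy labels are hypothetical, and the group average rule assumes that d_xi and d_yi are already average distances, so they are weighted by cluster size.

```python
def merged_distance(strategy, d_xi, d_yi, size_x=1, size_y=1):
    """Distance from the merged cluster (x and y) to another cluster i.

    d_xi, d_yi     -- current distances of clusters x and y to cluster i
    size_x, size_y -- number of points in x and y (used by group average only)
    All names here are illustrative and are not part of NlpTools.
    """
    if strategy == "single":
        # Single link: keep the smallest of the two distances.
        return min(d_xi, d_yi)
    if strategy == "complete":
        # Complete link: keep the largest of the two distances.
        return max(d_xi, d_yi)
    if strategy == "group_average":
        # Group average: size-weighted mean of the two average distances.
        return (size_x * d_xi + size_y * d_yi) / (size_x + size_y)
    raise ValueError("unknown strategy: " + strategy)


if __name__ == "__main__":
    for name in ("single", "complete", "group_average"):
        print(name, merged_distance(name, 2.0, 5.0, size_x=1, size_y=3))
```

Single link tends to produce elongated, chained clusters, complete link favours compact ones, and group average sits between the two.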