NlpTools\Analysis\FreqDist | Extract the Frequency distribution of keywords |
NlpTools\Analysis\Idf | Idf implements the inverse document frequency measure. |
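The Idf measure is conventionally idf(t) = log(N / df(t)), where df(t) is the number of documents containing term t. A minimal Python sketch of the idea (NlpTools itself is PHP; names and signatures here are illustrative, not the library's API):

```python
import math

def idf(term, documents):
    """Inverse document frequency: log(N / df), where df is the number
    of documents (token lists) that contain the term."""
    df = sum(1 for doc in documents if term in doc)
    return math.log(len(documents) / df) if df else 0.0

docs = [["a", "b"], ["a", "c"], ["a", "d"]]
```

A term in every document ("a") gets idf 0; rarer terms score higher.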
NlpTools\Classifiers\ClassifierInterface | Interface implemented by all classifiers. |
NlpTools\Classifiers\FeatureBasedLinearClassifier | Classify using a linear model. |
NlpTools\Classifiers\MultinomialNBClassifier | Use a multinomial NB model to classify a document. |
NlpTools\Clustering\CentroidFactories\CentroidFactoryInterface | Interface for the centroid factories used by the clusterers. |
NlpTools\Clustering\CentroidFactories\Euclidean | Computes the Euclidean centroid of the provided sparse vectors. |
NlpTools\Clustering\CentroidFactories\Hamming | This class computes the centroid, with respect to the Hamming distance, of strings that are binary representations of integers (the strings are supposed to contain only the characters 1 and 0). |
NlpTools\Clustering\CentroidFactories\MeanAngle | MeanAngle computes the unit vector with angle the average of all the given vectors. |
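The Euclidean centroid of a set of sparse vectors is simply their component-wise mean. A Python sketch of the idea (illustrative, not the PHP API; dicts stand in for sparse vectors):

```python
def euclidean_centroid(vectors):
    """Component-wise mean of sparse vectors (dict of dimension -> weight);
    dimensions missing from a vector count as zero."""
    sums = {}
    for v in vectors:
        for k, w in v.items():
            sums[k] = sums.get(k, 0.0) + w
    n = len(vectors)
    return {k: s / n for k, s in sums.items()}
```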
NlpTools\Clustering\Clusterer | Base class for all clusterers. |
NlpTools\Clustering\Hierarchical | This class implements hierarchical agglomerative clustering. |
NlpTools\Clustering\KMeans | This clusterer uses the KMeans algorithm for clustering documents. |
NlpTools\Clustering\MergeStrategies\CompleteLink | In complete linkage clustering the new distance of the merged cluster to cluster i is the maximum of the distances of cluster x to i and cluster y to i. |
NlpTools\Clustering\MergeStrategies\GroupAverage | In group average clustering the new distance of the merged cluster to cluster i is the average distance of all points in clusters x and y to cluster i. |
NlpTools\Clustering\MergeStrategies\HeapLinkage | HeapLinkage is an abstract merge strategy. |
NlpTools\Clustering\MergeStrategies\MergeStrategyInterface | In hierarchical agglomerative clustering each document starts in its own cluster and then it is subsequently merged with the "closest" cluster. |
NlpTools\Clustering\MergeStrategies\SingleLink | In single linkage clustering the new distance of the merged cluster to cluster i is the minimum of the distances of cluster x to i and cluster y to i. |
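The three merge strategies differ only in how the distance from the merged cluster {x, y} to another cluster i is updated. A Python sketch of the three update rules (illustrative names, not the library's API):

```python
def merged_distance(d_xi, d_yi, n_x, n_y, strategy):
    """Distance from the merged cluster {x, y} to a cluster i, given the
    old distances d(x, i), d(y, i) and the cluster sizes n_x, n_y."""
    if strategy == "single":          # the closest pair survives
        return min(d_xi, d_yi)
    if strategy == "complete":        # the farthest pair survives
        return max(d_xi, d_yi)
    if strategy == "group_average":   # size-weighted mean of the distances
        return (n_x * d_xi + n_y * d_yi) / (n_x + n_y)
    raise ValueError("unknown strategy: %s" % strategy)
```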
NlpTools\Documents\DocumentInterface | A Document is a representation of a Document to be classified. |
NlpTools\Documents\RawDocument | RawDocument simply encapsulates a PHP variable. |
NlpTools\Documents\TokensDocument | Represents a bag of words (tokens) document. |
NlpTools\Documents\TrainingDocument | A TrainingDocument is a document that "decorates" any other document to add the real class of the document. |
NlpTools\Documents\TrainingSet | A collection of TrainingDocument objects. |
NlpTools\Documents\WordDocument | A Document that represents a single word but with a context of a larger document. |
NlpTools\Exceptions\InvalidExpression | Used primarily by the tokenizers. |
NlpTools\FeatureFactories\DataAsFeatures | A feature factory that returns the document's data as the features. |
NlpTools\FeatureFactories\FeatureFactoryInterface | Interface implemented by all feature factories. |
NlpTools\FeatureFactories\FunctionFeatures | An implementation of FeatureFactoryInterface that takes any number of callables (function names, closures, array($object,'func_name'), etc.) and calls them consecutively using the return value as a feature's unique string. |
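A FunctionFeatures-style factory simply runs each callable and collects the non-null return values as the document's feature names. An illustrative Python sketch (the real class is PHP; the two example callables are hypothetical):

```python
def function_features(callables, klass, doc):
    """Call each feature callable with (class, document); every non-None
    return value becomes a feature's unique string."""
    features = []
    for f in callables:
        name = f(klass, doc)
        if name is not None:
            features.append(name)
    return features

# two illustrative feature callables (made up for this sketch)
starts_upper = lambda c, d: "starts_upper" if d[0][0].isupper() else None
doc_length = lambda c, d: "len=%d" % len(d)
```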
NlpTools\Models\FeatureBasedNB | Implement a MultinomialNBModel by training on a TrainingSet with a FeatureFactoryInterface and additive smoothing. |
NlpTools\Models\Lda | Topic discovery with latent Dirichlet allocation using Gibbs sampling. |
NlpTools\Models\LinearModel | This class represents a linear model of the following form f(x_vec) = l1*x1 + l2*x2 + l3*x3 ... |
NlpTools\Models\Maxent | Maxent is a model that assigns a weight for each feature such that all the weights maximize the Conditional Log Likelihood of the training data. |
NlpTools\Models\MultinomialNBModelInterface | Interface that describes a NB model. |
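The additive smoothing mentioned for FeatureBasedNB amounts to Laplace-smoothed multinomial NB estimates. A compact Python sketch of the training step (illustrative names, not the library API):

```python
from collections import Counter

def train_nb(training_set, alpha=1.0):
    """Train a multinomial Naive Bayes model with additive (Laplace)
    smoothing. training_set is a list of (class, feature list) pairs."""
    class_docs = Counter()
    feature_counts = {}
    vocab = set()
    for klass, features in training_set:
        class_docs[klass] += 1
        feature_counts.setdefault(klass, Counter()).update(features)
        vocab.update(features)
    n_docs = sum(class_docs.values())
    priors = {c: class_docs[c] / n_docs for c in class_docs}
    cond = {}
    for c, counts in feature_counts.items():
        # smoothed P(feature | class) = (count + alpha) / (total + alpha * |V|)
        denom = sum(counts.values()) + alpha * len(vocab)
        cond[c] = {f: (counts[f] + alpha) / denom for f in vocab}
    return priors, cond
```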
NlpTools\Optimizers\ExternalMaxentOptimizer | This class enables the use of a program written in a different language to optimize our model and return the weights for use in PHP. |
NlpTools\Optimizers\FeatureBasedLinearOptimizerInterface | Interface for optimizers of feature-based linear models. |
NlpTools\Optimizers\GradientDescentOptimizer | Implements gradient descent with fixed step. |
NlpTools\Optimizers\MaxentGradientDescent | Implement a gradient descent algorithm that maximizes the conditional log likelihood of the training data. |
NlpTools\Optimizers\MaxentOptimizerInterface | Marker interface to use with the Maxent model for type checking |
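Fixed-step gradient descent, the scheme GradientDescentOptimizer implements, is just the repeated update w ← w − step · ∇f(w) (for Maxent the objective is the negated conditional log likelihood, so descent maximizes it). A minimal, illustrative Python sketch:

```python
def gradient_descent(grad, w0, step=0.1, iterations=200):
    """Fixed-step gradient descent: repeatedly apply w <- w - step * grad(w)."""
    w = list(w0)
    for _ in range(iterations):
        w = [wi - step * gi for wi, gi in zip(w, grad(w))]
    return w
```

For example, minimizing f(w) = (w − 3)², whose gradient is 2(w − 3), converges to w ≈ 3.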
NlpTools\Random\Distributions\AbstractDistribution | Base class for the probability distributions. |
NlpTools\Random\Distributions\Dirichlet | Implement a k-dimensional Dirichlet distribution using draws from k gamma distributions and then normalizing. |
NlpTools\Random\Distributions\Gamma | Implement the gamma distribution. |
NlpTools\Random\Distributions\Normal | Implements the normal (Gaussian) distribution. |
NlpTools\Random\Generators\FromFile | Returns floats read from a file. |
NlpTools\Random\Generators\GeneratorInterface | An interface for pseudo-random number generators. |
NlpTools\Random\Generators\MersenneTwister | A simple wrapper over the built-in mt_rand() function. |
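The Dirichlet construction described above (k gamma draws, then normalize) fits in a few lines. A Python sketch using the stdlib gamma sampler random.gammavariate (illustrative, not the PHP implementation):

```python
import random

def dirichlet(alphas, rng=random):
    """Sample a point from Dirichlet(alphas): draw Gamma(alpha_i, 1)
    variates and normalize them so they sum to one."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]
```

The result is a point on the k-dimensional probability simplex.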
NlpTools\Similarity\CosineSimilarity | Given two vectors compute cos(theta) where theta is the angle between the two vectors in a N-dimensional vector space. |
NlpTools\Similarity\DistanceInterface | Distance should return a number proportional to how dissimilar the two instances are (with any metric). |
NlpTools\Similarity\Euclidean | This class computes the plain Euclidean distance between two vectors, sqrt(sum((a_i - b_i)^2)). |
NlpTools\Similarity\HammingDistance | This class implements the Hamming distance of two strings or sets. |
NlpTools\Similarity\JaccardIndex | Computes the Jaccard index of two sets (http://en.wikipedia.org/wiki/Jaccard_index). |
NlpTools\Similarity\Simhash | Simhash is an implementation of the locality sensitive hash function families proposed by Moses Charikar (http://www.cs.princeton.edu/courses/archive/spring04/cos598B/bib/CharikarEstim.pdf). |
NlpTools\Similarity\SimilarityInterface | Similarity should return a number that is proportional to how similar those two instances are (with any metric). |
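The cosine similarity and Jaccard index definitions above are short enough to sketch directly. Illustrative Python, with dicts standing in for sparse vectors (not the PHP API):

```python
import math

def cosine_similarity(a, b):
    """cos(theta) between two sparse vectors given as dicts:
    dot(a, b) / (|a| * |b|)."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb)

def jaccard_index(a, b):
    """|A intersection B| / |A union B| for two sets of tokens."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)
```

Parallel vectors score 1.0, orthogonal vectors 0.0; disjoint sets get Jaccard 0.0.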
NlpTools\Stemmers\GreekStemmer | This stemmer is an implementation of the stemmer described by G. |
NlpTools\Stemmers\LancasterStemmer | A word stemmer based on the Lancaster stemming algorithm. |
NlpTools\Stemmers\PorterStemmer | An implementation of the Porter stemming algorithm (copyright 2013 Angelos Katharopoulos <katharas@gmail.com>). |
NlpTools\Stemmers\RegexStemmer | This stemmer removes affixes according to a regular expression. |
NlpTools\Stemmers\Stemmer | Base class for all stemmers (http://en.wikipedia.org/wiki/Stemming). |
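A RegexStemmer-style stemmer just deletes whatever a regular expression matches. A toy Python sketch (the suffix pattern here is a made-up example, not the library's default):

```python
import re

# made-up example pattern: strip a trailing "ing", "ed" or "s"
SUFFIXES = re.compile(r"(ing|ed|s)$")

def regex_stem(word):
    """Remove an affix matched by a regular expression."""
    return SUFFIXES.sub("", word)
```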
NlpTools\Tokenizers\ClassifierBasedTokenizer | A tokenizer that uses a classifier (of any type) to determine if there is an "end of word" (EOW). |
NlpTools\Tokenizers\PennTreeBankTokenizer | Penn Treebank tokenizer, based on http://www.cis.upenn.edu/~treebank/tokenizer.sed |
NlpTools\Tokenizers\RegexTokenizer | Tokenizes text based on a set of regular expressions. |
NlpTools\Tokenizers\TokenizerInterface | Interface implemented by all tokenizers. |
NlpTools\Tokenizers\WhitespaceAndPunctuationTokenizer | Simple tokenizer that splits on white space and punctuation. |
NlpTools\Tokenizers\WhitespaceTokenizer | Simple white space tokenizer. |
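A whitespace-and-punctuation tokenizer can be approximated with a single regular expression: words become tokens and each punctuation character becomes a token of its own. An illustrative Python sketch (not the library's exact token rules):

```python
import re

def whitespace_punct_tokenize(text):
    """Match runs of word characters, or any single character that is
    neither a word character nor white space (i.e. punctuation)."""
    return re.findall(r"\w+|[^\w\s]", text)
```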
NlpTools\Utils\ClassifierBasedTransformation | Classifies whatever is passed to transform() and applies a different set of transformations depending on the predicted class. |
NlpTools\Utils\EnglishVowels | Helper vowel class; determines whether the character at a given index is a vowel. |
NlpTools\Utils\Normalizers\English | For English we simply transform to lower case using mb_strtolower. |
NlpTools\Utils\Normalizers\Greek | To normalize Greek text we use mb_strtolower to transform to lower case, then replace every accented character with its non-accented counterpart and the final ς with σ. |
NlpTools\Utils\Normalizers\Normalizer | The Normalizer's purpose is to transform any word from any one of the possible writings to a single writing consistently. |
NlpTools\Utils\StopWords | Stop Words are words which are filtered out because they carry little to no information. |
NlpTools\Utils\TransformationInterface | TransformationInterface represents any type of transformation to be applied upon documents. |
NlpTools\Utils\VowelsAbstractFactory | Factory wrapper for Vowels |
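Normalization (lower-casing plus accent stripping) and stop-word filtering combine naturally. A Python sketch of both ideas (illustrative; the library's normalizers use mb_strtolower and per-language replacement tables):

```python
import unicodedata

def normalize(word):
    """Lower-case and strip combining accents to obtain one
    canonical writing for a word."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def filter_stop_words(tokens, stop_words):
    """Drop tokens whose normalized form is a stop word, since they
    carry little to no information."""
    return [t for t in tokens if normalize(t) not in stop_words]
```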