NlpTools API
Namespace

NlpTools\Tokenizers

ClassifierBasedTokenizer A tokenizer that uses a classifier (of any type) to determine if there is an "end of word" (EOW).
PennTreeBankTokenizer Penn Treebank tokenizer, based on http://www.cis.upenn.edu/~treebank/tokenizer.sed
RegexTokenizer Tokenizes text based on a set of regexes.
WhitespaceAndPunctuationTokenizer Simple tokenizer that splits on white space and punctuation.
WhitespaceTokenizer Simple white space tokenizer.
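
The two whitespace-based entries above differ only in how they treat punctuation, which is easier to see with a small example. The following is a minimal, self-contained sketch in plain PHP. It does not use NlpTools, and the function names `sketchWhitespaceTokens` and `sketchWhitespaceAndPunctuationTokens` are hypothetical helpers invented for this illustration. The regular expressions are an assumption about what such tokenizers typically do, not the library's actual patterns.

```php
<?php
// Hypothetical helpers for illustration only; they are not part of NlpTools,
// and the patterns below are assumptions, not the library's own regexes.

// Split on runs of whitespace; punctuation stays attached to its word.
function sketchWhitespaceTokens(string $text): array
{
    return preg_split('/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
}

// Split on whitespace and emit each punctuation character as its own token.
function sketchWhitespaceAndPunctuationTokens(string $text): array
{
    // Either a run of characters that are neither punctuation nor whitespace,
    // or a single punctuation character.
    preg_match_all('/[^\p{P}\s]+|\p{P}/u', $text, $matches);
    return $matches[0];
}

$text = "Hello, world! It works.";

print_r(sketchWhitespaceTokens($text));
// tokens: "Hello,", "world!", "It", "works."

print_r(sketchWhitespaceAndPunctuationTokens($text));
// tokens: "Hello", ",", "world", "!", "It", "works", "."
```

Whether punctuation should be a separate token depends on the downstream task: keeping "world!" intact preserves the surface form, while splitting it lets "world" match the same vocabulary entry wherever it appears.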

Interfaces

TokenizerInterface