NlpTools API
Class

NlpTools\Tokenizers\PennTreeBankTokenizer

class PennTreeBankTokenizer extends WhitespaceTokenizer

Penn Treebank tokenizer, based on the sed script at http://www.cis.upenn.edu/~treebank/tokenizer.sed
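The referenced tokenizer.sed is a sed script, that is, a sequence of regular-expression substitutions applied to the text. Because this class extends WhitespaceTokenizer, one plausible reading is that the substitutions insert spaces around the pieces to be separated and the result is then split on whitespace. The sketch below illustrates that idea only; the function name `sketchTreebankTokenize` and its two rules are hypothetical, far smaller than the real rule set, and not taken from the NlpTools source.

```php
<?php
// Simplified illustration only: NOT the NlpTools implementation.
// sketchTreebankTokenize() is a hypothetical helper with two sample rules.
function sketchTreebankTokenize(string $str): array
{
    // Rule 1: surround common punctuation marks with spaces.
    $str = preg_replace('/([,;:?!])/', ' $1 ', $str);

    // Rule 2: split the contraction "n't" from its verb ("don't" -> "do n't").
    $str = preg_replace("/(\w)(n't)\b/i", '$1 $2', $str);

    // Final step: split on runs of whitespace, dropping empty pieces.
    return preg_split('/\s+/', trim($str), -1, PREG_SPLIT_NO_EMPTY);
}

print_r(sketchTreebankTokenize("I don't know, really"));
```

With these two rules the sample sentence yields the tokens I, do, n't, know, a comma, and really. The real script covers many more cases (quotes, brackets, other contractions), so the sketch should not be used to predict the class's exact output.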

Constants

PATTERN

Methods

array tokenize(string $str)

Splits the given text into tokens, delegating the processing to internal helper functions

__construct()

Details

at line 30
public array tokenize(string $str)

Splits the given text into tokens, delegating the processing to internal helper functions

Parameters

string $str The text to tokenize

Return Value

array The list of tokens from the string
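A minimal usage sketch based on the signatures listed on this page: the constructor takes no arguments and tokenize() accepts a string and returns an array. The autoloader path and the sample sentence are assumptions; the example presumes NlpTools was installed with Composer.

```php
<?php
// Assumption: NlpTools is installed via Composer and autoloadable from here.
require 'vendor/autoload.php';

use NlpTools\Tokenizers\PennTreeBankTokenizer;

// The constructor takes no arguments, as documented above.
$tokenizer = new PennTreeBankTokenizer();

// tokenize() takes the text and returns the list of tokens as an array.
$tokens = $tokenizer->tokenize("They can't come, but we'll manage.");

// Print one token per line to inspect how the text was split.
foreach ($tokens as $token) {
    echo $token, PHP_EOL;
}
```

Printing the tokens one per line makes it easy to compare the result against a plain whitespace split of the same sentence.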

at line 21
public __construct()