class WhitespaceAndPunctuationTokenizer implements TokenizerInterface
Simple whitespace tokenizer.
Breaks on whitespace or at word
boundaries (e.g. dots, commas).
Whitespace is not included in the tokens.
Every punctuation character is a single token.
Methods
array tokenize(string $str)
Break a character sequence into a token sequence.
Details
at line 13
public array tokenize(string $str)
Break a character sequence into a token sequence.
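The behaviour described for this class (split on whitespace, drop the whitespace, emit each punctuation character as its own token) can be illustrated with a minimal sketch. This is not the library's implementation: the function name `sketchTokenize` and the regular expression are assumptions made for illustration, and the real `tokenize()` may treat edge cases such as symbols or control characters differently.

```php
<?php

/**
 * Hypothetical stand-in for tokenize(); an illustration only,
 * not the code of WhitespaceAndPunctuationTokenizer.
 *
 * @return string[] tokens in order of appearance
 */
function sketchTokenize(string $str): array
{
    // First alternative: a run of characters that are not punctuation (\p{P}),
    // separators (\p{Z}) or control characters (\p{C}), i.e. a "word".
    // Second alternative: exactly one punctuation character.
    // Whitespace matches neither alternative, so it never appears in a token.
    $pattern = '/[^\p{P}\p{Z}\p{C}]+|\p{P}/u';

    if (preg_match_all($pattern, $str, $matches) === false) {
        return []; // assumed fallback, e.g. for invalid UTF-8 input
    }

    return $matches[0];
}
```

Under these assumptions, `sketchTokenize("Hello, world!")` yields the four tokens `Hello`, `,`, `world` and `!`, and a run of dots such as `...` yields three separate `.` tokens.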