NlpTools API
Class

NlpTools\Tokenizers\RegexTokenizer

class RegexTokenizer implements TokenizerInterface

Regex tokenizer tokenizes text based on a set of regexes

Methods

__construct(array $patterns)

Initialize the Tokenizer

array tokenize(string $str)

Break a character sequence to a token sequence

Details

at line 18
public __construct(array $patterns)

Initialize the Tokenizer

Parameters

array $patterns The regular expressions

at line 39
public array tokenize(string $str)

Break a character sequence to a token sequence

Parameters

string $str The string to be tokenized

Return Value

array The tokens