ngram

Generates character n-grams for fuzzy and substring matching.
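
To show why character n-grams help with fuzzy and substring matching, here is a minimal Python sketch. It is not this template's implementation; the function names `char_ngrams` and `jaccard` are hypothetical, and Jaccard overlap is just one common way to score similarity between two n-gram sets.

```python
def char_ngrams(text, n):
    """Return the set of all contiguous character substrings of length n."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a, b, n=2):
    """Jaccard similarity between the n-gram sets of two strings."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a and not grams_b:
        # Both strings are shorter than n: fall back to exact comparison.
        return 1.0 if a == b else 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# A transposition typo still shares the bigrams "se" and "ch".
print(jaccard("search", "serach"))  # 0.25
# Unrelated words share no bigrams.
print(jaccard("search", "banana"))  # 0.0
# Every bigram of a substring also occurs in the full word.
print(char_ngrams("arc", 2) <= char_ngrams("search", 2))  # True
```

A misspelled term keeps some of its n-grams, so it can still match partially, and a substring's n-grams are always a subset of the n-grams of the word that contains it.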

Options

| Option | Type | Default | Description |
|---|---|---|---|
| MINGRAM | integer | 2 | Minimum n-gram length |
| MAXGRAM | integer | 3 | Maximum n-gram length |
| PRESERVEORIGINAL | boolean | false | Emit original token alongside n-grams |
| INPUTTYPE | string | 'utf8' | Input encoding: 'binary', 'utf8' |
| STARTMARKER | string | | Prefix marker at n-gram boundary |
| ENDMARKER | string | | Suffix marker at n-gram boundary |
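
The effect of MINGRAM, MAXGRAM, and PRESERVEORIGINAL can be sketched in Python as follows. This is an illustration under assumptions, not the template's actual code: the function `ngram_tokens` is hypothetical, the emission order (by start position, then by length) is a guess, and STARTMARKER, ENDMARKER, and INPUTTYPE are left out because their exact semantics are not described here.

```python
def ngram_tokens(token, min_gram=2, max_gram=3, preserve_original=False):
    """Generate character n-grams with lengths from min_gram to max_gram.

    Defaults mirror the documented option defaults (2, 3, false).
    Output order is an assumption: by start position, then by length.
    """
    if min_gram < 1 or max_gram < min_gram:
        raise ValueError("require 1 <= min_gram <= max_gram")

    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(token):
                grams.append(token[start:start + size])

    # Assumed behavior: append the whole token unless it is already an n-gram.
    if preserve_original and token not in grams:
        grams.append(token)
    return grams


print(ngram_tokens("data"))
# ['da', 'dat', 'at', 'ata', 'ta']
print(ngram_tokens("data", min_gram=1, max_gram=2))
# ['d', 'da', 'a', 'at', 't', 'ta', 'a']
print(ngram_tokens("data", preserve_original=True))
# ['da', 'dat', 'at', 'ata', 'ta', 'data']
```

A wider range between the minimum and maximum length yields more n-grams per token, which generally improves recall at the cost of a larger index.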

Examples

CREATE TEXT SEARCH DICTIONARY ngram_dict (
    TEMPLATE = 'ngram',
    MINGRAM = 2,
    MAXGRAM = 3
);

Unigrams and bigrams

CREATE TEXT SEARCH DICTIONARY unigram_dict (
    TEMPLATE = 'ngram',
    MINGRAM = 1,
    MAXGRAM = 2
);

See also