norm

The norm template normalizes the whole input and returns it as a single token, without splitting it into words. Depending on the options, normalization can convert case and fold away accents for the given LOCALE; with the defaults (CASE = 'none', ACCENT = true) neither is changed. Because the entire value becomes one token, it behaves like a normalized keyword: two strings match only if they are equal after normalization.

Use it for exact-match or keyword columns — tags, codes, names, enum-like values — that should still compare case-insensitively or accent-insensitively, rather than for free-text search. For per-word tokenization with the same normalization, use text.

Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| LOCALE | string | | ICU locale |
| CASE | string | 'none' | Case conversion: 'none', 'lower', 'upper' |
| ACCENT | boolean | true | Preserve accent marks (false folds them away) |

Tokenization

norm always emits exactly one token: the input with case and accents normalized per the options. Spaces and punctuation are kept verbatim — the value is never split. The table below shows how the same input transforms under different option combinations.

| Input | CASE | ACCENT | Output token |
|-------|------|--------|--------------|
| CAFÉ | 'lower' | false | cafe |
| CAFÉ | 'none' | true (default) | CAFÉ (unchanged) |
| café | 'upper' | true | CAFÉ |
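The single-token behaviour can be checked directly. The sketch below is illustrative only: the dictionary name norm_phrase_demo and the sample input are made up for this example, and the options simply mirror those documented above. It counts the tokens returned for an input that contains spaces and a comma.

```sql
-- Hypothetical dictionary name; the options are the ones documented above.
CREATE TEXT SEARCH DICTIONARY norm_phrase_demo (
    template = 'norm',
    locale = 'en_US.UTF-8',
    case = 'lower',
    accent = false
);

-- ts_lexize returns a text[] of tokens; array_length counts them.
-- If norm keeps spaces and punctuation in one token, as described above,
-- token_count should be 1 even though the input has three words and a comma.
SELECT array_length(
           ts_lexize('norm_phrase_demo', 'Crème Brûlée, Large'),
           1
       ) AS token_count;
```

With the text template the same input would presumably yield several tokens, which is the practical difference between the two.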

Because two values collide only when their normalized forms are identical, a norm dictionary with CASE = 'lower' and ACCENT = false makes CAFÉ, Café and cafe all match.

Query
CREATE TEXT SEARCH DICTIONARY norm_dict (
    template = 'norm',
    locale = 'en_US.UTF-8',
    case = 'lower',
    accent = false
);

SELECT ts_lexize('norm_dict', 'CAFÉ');
Result
 ts_lexize
-----------
 {cafe}
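To see what "match" means for a keyword column, the following sketch reuses norm_dict from the query above against a hypothetical menu_items table (the table, its columns and its rows are assumptions made for illustration). It compares normalized tokens instead of raw strings. Calling ts_lexize in a WHERE clause is only a way to demonstrate the comparison, not a recommendation for how to index such a column.

```sql
-- Hypothetical table and data, for illustration only.
CREATE TABLE menu_items (id int, name text);
INSERT INTO menu_items VALUES
    (1, 'CAFÉ'),
    (2, 'Café'),
    (3, 'cafe'),
    (4, 'cafeteria');

-- Two values match only if their normalized tokens are identical.
-- If norm_dict behaves as described above, ids 1 to 3 all normalize to
-- the same token as 'café', while 'cafeteria' stays distinct.
SELECT id, name
FROM menu_items
WHERE ts_lexize('norm_dict', name) = ts_lexize('norm_dict', 'café');
```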

Uppercase normalization, accents preserved

Folding to upper case while keeping accent marks turns café into CAFÉ:

Query
CREATE TEXT SEARCH DICTIONARY norm_upper (
    template = 'norm',
    locale = 'en_US.UTF-8',
    case = 'upper',
    accent = true
);

SELECT ts_lexize('norm_upper', 'café');
Result
 ts_lexize
-----------
 {CAFÉ}
