norm

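The norm template normalizes the whole input and returns it as a single token without splitting it into words. Normalization is controlled by the options: CASE optionally converts the value to lower or upper case, and ACCENT optionally strips accent marks, both applied for the given LOCALE. With the defaults (CASE = 'none', ACCENT = true) the input is returned unchanged. Because the entire value becomes one token, it behaves like a normalized keyword: two strings match only if they are equal after normalization.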

Use it for exact-match or keyword columns — tags, codes, names, enum-like values — that should still compare case-insensitively or accent-insensitively, rather than for free-text search. For per-word tokenization with the same normalization, use text.
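The difference between whole-value and per-word matching can be sketched outside the database. The snippet below is only an illustration: `whole_value_token`, `per_word_tokens` and `matches` are hypothetical helpers, lowercasing stands in for the full normalization, and the whitespace split is an assumed stand-in for whatever per-word tokenizer text actually uses.

```python
def whole_value_token(value):
    # Stand-in for norm: the entire value becomes one token, spaces kept.
    return [value.lower()]


def per_word_tokens(value):
    # Stand-in for a per-word tokenizer; splitting on whitespace is an assumption.
    return value.lower().split()


def matches(query, stored, tokenize):
    # A query matches when it shares at least one token with the stored value.
    stored_tokens = set(tokenize(stored))
    return any(token in stored_tokens for token in tokenize(query))


# Whole-value tokens only match when the full strings agree after normalization.
print(matches("new york", "New York", whole_value_token))  # True
print(matches("york", "New York", whole_value_token))      # False

# Per-word tokens let a single word find the stored value.
print(matches("york", "New York", per_word_tokens))        # True
```

This is why a single-token dictionary suits tags and codes, where a partial hit would be a false positive, while free text usually wants the per-word behavior.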

Options

Option	Type	Default	Description
`LOCALE`	string	—	ICU locale
`CASE`	string	`'none'`	Case conversion: `'none'`, `'lower'`, `'upper'`
`ACCENT`	boolean	`true`	Whether to preserve accent marks (`true` keeps them, `false` strips them)

Tokenization

norm always emits exactly one token: the input with case and accents normalized per the options. Spaces and punctuation are kept verbatim — the value is never split. The table below shows how the same input transforms under different option combinations.

Input	CASE	ACCENT	Output token
`CAFÉ`	`'lower'`	`false`	`cafe`
`CAFÉ`	`'none'`	`true` (default)	`CAFÉ` (unchanged)
`café`	`'upper'`	`true`	`CAFÉ`

Because two values collide only when their normalized forms are identical, a norm dictionary with CASE = 'lower' and ACCENT = false makes CAFÉ, Café and cafe all match.

Query

CREATE TEXT SEARCH DICTIONARY norm_dict (
    template = 'norm',
    locale = 'en_US.UTF-8',
    case = 'lower',
    accent = false
);
SELECT ts_lexize('norm_dict', 'CAFÉ');

Result

 ts_lexize
-----------
 {cafe}
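The matching rule can also be modelled in a few lines of Python. This is a rough sketch of the behavior described in the tables above, not the template's implementation: `norm_token` is a hypothetical helper, accent stripping is approximated with Unicode decomposition, and the LOCALE option is ignored because Python's `lower()` and `upper()` are locale-independent.

```python
import unicodedata


def norm_token(value, case="none", accent=True):
    """Return one normalized token for the whole value (illustrative model only)."""
    if not accent:
        # Decompose characters, drop combining marks, then recompose.
        decomposed = unicodedata.normalize("NFD", value)
        stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
        value = unicodedata.normalize("NFC", stripped)
    if case == "lower":
        value = value.lower()
    elif case == "upper":
        value = value.upper()
    elif case != "none":
        raise ValueError("case must be 'none', 'lower' or 'upper'")
    return value


# All three spellings collapse to the same token, so they match each other.
variants = ["CAF\u00c9", "Caf\u00e9", "cafe"]
tokens = {norm_token(v, case="lower", accent=False) for v in variants}
print(tokens)  # {'cafe'}

# Spaces and punctuation stay inside the single token.
print(norm_token("Cr\u00e8me Br\u00fbl\u00e9e, 2x", case="lower", accent=False))  # creme brulee, 2x

# Defaults leave the value untouched.
print(norm_token("CAF\u00c9") == "CAF\u00c9")  # True
```

Locale-specific rules, such as the Turkish dotted and dotless i, are outside what this sketch covers.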

Uppercase normalization, accents preserved

Folding to upper case while keeping accent marks turns café into CAFÉ:

Query

CREATE TEXT SEARCH DICTIONARY norm_upper (
    template = 'norm',
    locale = 'en_US.UTF-8',
    case = 'upper',
    accent = true
);
SELECT ts_lexize('norm_upper', 'café');

Result

 ts_lexize
-----------
 {CAFÉ}
