
delimiter

The delimiter template splits the input on one delimiter character and emits the pieces as tokens, with no further analysis. It is the simplest tokenizer and suits structured values whose parts are separated by a known character — comma-separated tags, slash-separated paths, dotted identifiers.

For example, with DELIMITER = ',' the value red,green,blue produces the tokens red, green and blue. To split on more than one separator, use multi_delimiter; to further process each piece — lower-case it, stem it, drop stop words — chain this template into a pipeline.

Options

Option	Type	Default	Description
`DELIMITER`	string	required	The single character on which the input is split

Tokenization

The template cuts the input at every occurrence of DELIMITER and emits the pieces between the cuts verbatim — no case folding, stemming or trimming. Adjacent or leading delimiters therefore yield empty tokens, since the piece between two cuts is itself empty.

Input	Delimiter	Tokens
`red,green,blue`	`,`	`{red,green,blue}`
`com.example.app`	`.`	`{com,example,app}`
`/usr/local/bin`	`/`	`{"",usr,local,bin}`

The third row shows the leading / producing an empty first token. Preview the split with ts_lexize:

Query

CREATE TEXT SEARCH DICTIONARY tok_delim_comma (
    template = 'delimiter',
    delimiter = ','
);
SELECT ts_lexize('tok_delim_comma', 'red,green,blue');

Result

    ts_lexize
------------------
 {red,green,blue}

Any single character works as the delimiter — here a dot splits a reverse-DNS identifier into its components:

Query

CREATE TEXT SEARCH DICTIONARY tok_delim_dot (
    template = 'delimiter',
    delimiter = '.'
);
SELECT ts_lexize('tok_delim_dot', 'com.example.app');

Result

     ts_lexize
-------------------
 {com,example,app}
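
The empty-token case from the third table row can be previewed the same way. This is a minimal sketch that assumes the same delimiter template is installed; the dictionary name tok_delim_slash is made up for illustration.

```sql
-- Hypothetical dictionary name; the template and option are the ones documented on this page.
CREATE TEXT SEARCH DICTIONARY tok_delim_slash (
    template = 'delimiter',
    delimiter = '/'
);

-- The leading '/' should yield an empty first token, per the table above.
SELECT ts_lexize('tok_delim_slash', '/usr/local/bin');
```

Going by the table in this section, the result should be `{"",usr,local,bin}`.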


Examples

Query

CREATE TEXT SEARCH DICTIONARY pipe_delim (
    template = 'delimiter',
    delimiter = '|'
);

Query

CREATE TEXT SEARCH DICTIONARY comma_delim (
    template = 'delimiter',
    delimiter = ','
);
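
Neither dictionary is exercised in this section. As a hedged sketch, assuming both have been created as shown, ts_lexize can preview them in the same way as in Tokenization. The input values are invented for illustration.

```sql
-- Input values are illustrative only.
SELECT ts_lexize('pipe_delim', 'alpha|beta|gamma');

-- Two adjacent commas: the Tokenization rules imply an empty token between them.
SELECT ts_lexize('comma_delim', 'red,,blue');
```

By the rules in Tokenization, the first call should return the three pieces alpha, beta and gamma, and the second should include an empty token between red and blue.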
