Skip to main content

Phrase and Proximity Search

Search for tokens appearing in a specific order. This allows matching partial or full sentences within indexed text. Requires POSITION = true in the dictionary.

See Setup for the shared dataset used in all examples.

Use the @@ operator with ts_phrase to find documents containing tokens in sequence:

Query
SELECT id, titleFROM movies_idxWHERE description @@ ts_phrase('biggest blockbuster')ORDER BY id;
Result
 id | title----+---------------  4 | Jurassic Park  6 | Scary Movie

Both documents contain the phrase "biggest blockbuster" in their descriptions.

Multi-word phrases

Search for longer sequences:

Query
SELECT id, titleFROM movies_idxWHERE description @@ ts_phrase('the war against its controllers');
Result
 id | title----+------------  1 | The Matrix

Combining phrase conditions with AND

Find documents matching multiple phrases:

Query
SELECT id, titleFROM movies_idxWHERE description @@ ts_phrase('alien') AND description @@ ts_phrase('galaxy')ORDER BY id;
Result
 id | title----+-------------------------------  7 | Star Trek: The Motion Picture

Combining phrase conditions with OR

Find documents matching any of several phrases:

Query
SELECT id, titleFROM movies_idxWHERE description @@ ts_phrase('computer hacker') OR description @@ ts_phrase('serial killer')ORDER BY id;
Result
 id | title----+-------------  1 | The Matrix  6 | Scary Movie

Phrase search across columns

Search different columns in the same query:

Query
SELECT id, titleFROM movies_idxWHERE title @@ ts_phrase('the matrix') AND description @@ ts_phrase('machine')ORDER BY id;
Result
 id | title----+------------------------  2 | The Matrix Reloaded  3 | The Matrix Revolutions

Combine with exact matching

Use phrase search together with term operations:

Query
SELECT id, title, genreFROM movies_idxWHERE genre @@ 'sci-fi' AND description @@ ts_phrase('galaxy')ORDER BY id;
Result
 id | title                         | genre----+-------------------------------+--------  7 | Star Trek: The Motion Picture | sci-fi  8 | Alien                         | sci-fi

Combine with analytics

The power of SereneDB: search and aggregate in a single query:

Query
SELECT genre, COUNT(*) AS matches, AVG(runtime) AS avg_runtimeFROM movies_idxWHERE description @@ ts_phrase('film')GROUP BY genreORDER BY matches DESC, genre;
Result
 genre | matches | avg_runtime-------+---------+------------- drama |       1 |          96
Query
SELECT genre,       COUNT(*) AS count,       MIN(year) AS earliest,       MAX(year) AS latestFROM movies_idxWHERE description @@ ts_phrase('biggest blockbuster')GROUP BY genreORDER BY genre;
Result
 genre     | count | earliest | latest-----------+-------+----------+-------- adventure |     1 |     1993 |   1993 comedy    |     1 |     2000 |   2000

How phrase search works

  1. The query text goes through the same dictionary as the indexed data
  2. The resulting tokens must appear in the same order and at consecutive positions in the document
  3. Because both sides use the same normalization (case, stemming, accents), matching is consistent

For example, with basic_dict (CASE = 'lower', ACCENT = false):

Query
-- The query "Biggest Blockbuster" becomes tokens: {biggest, blockbuster}SELECT ts_lexize('basic_dict', 'Biggest Blockbuster');
Result
 ts_lexize----------------------- {biggest,blockbuster}

See also