Phrase and Proximity Search
Search for tokens appearing in a specific order. This allows matching partial or full sentences within indexed text. Requires POSITION = true in the dictionary.
See Setup for the shared dataset used in all examples.
Basic phrase search
Use PHRASE to find documents containing tokens in sequence:
SELECT id, title
FROM movies_idx
WHERE PHRASE(description, 'biggest blockbuster')
ORDER BY id;
id | title
----+---------------
4 | Jurassic Park
6 | Scary Movie
Both documents contain the phrase "biggest blockbuster" in their descriptions.
Multi-word phrases
Search for longer sequences:
SELECT id, title
FROM movies_idx
WHERE PHRASE(description, 'the war against its controllers');
id | title
----+------------
1 | The Matrix
Combining phrase conditions with AND
Find documents matching multiple phrases:
SELECT id, title
FROM movies_idx
WHERE PHRASE(description, 'alien') AND PHRASE(description, 'galaxy')
ORDER BY id;
id | title
----+---------------------------------
7 | Star Trek: The Motion Picture
8 | Alien
Combining phrase conditions with OR
Find documents matching any of several phrases:
SELECT id, title
FROM movies_idx
WHERE PHRASE(description, 'computer hacker') OR PHRASE(description, 'serial killer')
ORDER BY id;
id | title
----+--------------
1 | The Matrix
6 | Scary Movie
Phrase search across columns
Search different columns in the same query:
SELECT id, title
FROM movies_idx
WHERE PHRASE(title, 'the matrix') AND PHRASE(description, 'machine')
ORDER BY id;
id | title
----+----------------------------
2 | The Matrix Reloaded
3 | The Matrix Revolutions
Combine with exact matching
Use phrase search together with term operations:
SELECT id, title, genre
FROM movies_idx
WHERE TERM_EQ(genre, 'sci-fi') AND PHRASE(description, 'galaxy')
ORDER BY id;
id | title | genre
----+---------------------------------+--------
7 | Star Trek: The Motion Picture | sci-fi
8 | Alien | sci-fi
Combine with analytics
The power of SereneDB: search and aggregate in a single query:
SELECT genre, COUNT(*) AS matches, AVG(runtime) AS avg_runtime
FROM movies_idx
WHERE PHRASE(description, 'film')
GROUP BY genre
ORDER BY matches DESC;
SELECT genre,
COUNT(*) AS count,
MIN(year) AS earliest,
MAX(year) AS latest
FROM movies_idx
WHERE PHRASE(description, 'biggest blockbuster')
GROUP BY genre;
genre | count | earliest | latest
-----------+-------+----------+--------
adventure | 1 | 1993 | 1993
comedy | 1 | 2000 | 2000
How phrase search works
- The query text goes through the same dictionary as the indexed data
- The resulting tokens must appear in the same order and at consecutive positions in the document
- Because both sides use the same normalization (case, stemming, accents), matching is consistent
For example, with basic_dict (CASE = 'lower', ACCENT = false):
-- The query "Biggest Blockbuster" becomes tokens: {biggest, blockbuster}
SELECT ts_lexize('basic_dict', 'Biggest Blockbuster');
-- {biggest,blockbuster}
-- These tokens are searched at consecutive positions in the index
See also
- Case-Sensitivity and Diacritics — how normalization affects phrase matching
- Exact Value Matching — single-token matching
- BM25/TFIDF Ranking — ordering results by relevance