BM25 Search
Postgres Extension
Deploy ParadeDB on a VPS
Elasticsearch-quality search inside Postgres. BM25 scoring, Tantivy-backed indexes, and hybrid search with pgvector — no separate cluster.
At a Glance
| Project | ParadeDB pg_search 0.23.x |
| License | AGPL-3.0 (Community) |
| Recommended Plan | RamNode Cloud VPS 2 vCPU / 4 GB / 20+ GB |
| OS | Ubuntu 24.04 LTS |
| Backed by | Tantivy (Rust Lucene) |
1
PostgreSQL 17 from PGDG
Install Postgres 17
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl ca-certificates gnupg lsb-release ufw
sudo install -d /usr/share/postgresql-common/pgdg
sudo curl -o /usr/share/postgresql-common/pgdg/apt.postgresql.org.asc \
--fail https://www.postgresql.org/media/keys/ACCC4CF8.asc
echo "deb [signed-by=/usr/share/postgresql-common/pgdg/apt.postgresql.org.asc] https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" \
| sudo tee /etc/apt/sources.list.d/pgdg.list > /dev/null
sudo apt update
sudo apt install -y postgresql-17 postgresql-contrib-17
sudo systemctl enable --now postgresql
sudo -u postgres psql -c "SELECT version();"2
Install pg_search .deb
Pull the prebuilt package
PG_SEARCH_VERSION="0.23.4"
PG_MAJOR="17"
ARCH="amd64" # or arm64
curl -L \
"https://github.com/paradedb/paradedb/releases/download/v${PG_SEARCH_VERSION}/postgresql-${PG_MAJOR}-pg-search_${PG_SEARCH_VERSION}-1PARADEDB-noble_${ARCH}.deb" \
-o /tmp/pg_search.deb
sudo apt-get install -y /tmp/pg_search.deb
rm /tmp/pg_search.deb
ls /usr/share/postgresql/17/extension/ | grep pg_search
ls /usr/lib/postgresql/17/lib/ | grep pg_search3
Configure + Tune
/etc/postgresql/17/main/postgresql.conf
shared_preload_libraries = 'pg_search'
# Search-friendly tuning (adjust to host RAM)
shared_buffers = 1GB
effective_cache_size = 3GB
work_mem = 64MB
maintenance_work_mem = 512MB
max_parallel_workers = 8
max_parallel_workers_per_gather = 4
max_parallel_maintenance_workers = 4Restart + verify
sudo systemctl restart postgresql
sudo tail -n 50 /var/log/postgresql/postgresql-17-main.log4
Create the Extension
In psql
sudo -u postgres psql
CREATE DATABASE search_demo;
\c search_demo
CREATE EXTENSION pg_search;
\dx5
BM25 Index Example
Sample table + index
CALL paradedb.create_bm25_test_table(
schema_name => 'public',
table_name => 'mock_items'
);
CREATE INDEX search_idx ON mock_items
USING bm25 (id, description, category, rating)
WITH (key_field = 'id');Search with the ||| operator + pdb.score()
SELECT
id, description, category, rating,
pdb.score(id) AS score
FROM mock_items
WHERE description ||| 'keyboard'
OR category ||| 'electronics'
ORDER BY pdb.score(id) DESC
LIMIT 5;6
Hybrid Search with pgvector
Install + enable
sudo apt install -y postgresql-17-pgvector
# Add 'vector' to shared_preload_libraries, then:
sudo systemctl restart postgresqlRRF combining BM25 + cosine
WITH
bm25_results AS (
SELECT id, ROW_NUMBER() OVER (ORDER BY pdb.score(id) DESC) AS rank
FROM mock_items
WHERE description ||| 'wireless headphones'
LIMIT 50
),
vector_results AS (
SELECT id, ROW_NUMBER() OVER (
ORDER BY embedding <=> '[...query_embedding...]'::vector ASC
) AS rank
FROM mock_items
LIMIT 50
)
SELECT
m.id, m.description,
COALESCE(1.0 / (60 + b.rank), 0.0)
+ COALESCE(1.0 / (60 + v.rank), 0.0) AS rrf_score
FROM mock_items m
LEFT JOIN bm25_results b ON b.id = m.id
LEFT JOIN vector_results v ON v.id = m.id
WHERE b.id IS NOT NULL OR v.id IS NOT NULL
ORDER BY rrf_score DESC
LIMIT 10;7
Lock Down Connections
/etc/postgresql/17/main/pg_hba.conf
local all postgres peer
local all all scram-sha-256
host all all 127.0.0.1/32 scram-sha-256
host search_demo app_user 10.0.0.0/24 scram-sha-256
host all all 0.0.0.0/0 rejectApp user + firewall
sudo ufw allow OpenSSH
sudo ufw allow from 10.0.0.0/24 to any port 5432 comment 'Postgres from app subnet'
sudo ufw enable8
Backups
Daily logical backup
sudo -u postgres pg_dump -Fc -d search_demo \
-f /var/backups/postgresql/search_demo_$(date +%F).dumpBM25 indexes are NOT included in pg_dump; they're rebuilt on restore. For large datasets use pgBackRest for physical backups (index files included, faster restore).
