pgvector at 100M Scale — Belgavi.AI Lab

pgvector turned Postgres into a serious vector database. At 100K vectors any approach works; at 100M it gets harder. The tuning matters (HNSW parameters, partitioning, hybrid filtering), and many teams hit walls because they didn't plan past prototype scale.

HNSW vs IVFFlat

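HNSW (available since pgvector 0.5.0; pgvector has no default index type, so you choose one explicitly): better recall at a given query speed, at the cost of more memory and a slower build. IVFFlat: faster build and lower memory, but lower recall at high scale. Use HNSW unless build time or memory is a hard constraint.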
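
As a rough illustration of the two choices, the sketch below builds the CREATE INDEX statement for each index type. The table name items, the column name embedding, the cosine operator class, and the lists value are assumptions for the example, not values from this article; the helper itself is hypothetical and only formats strings.

```python
def create_index_sql(table, column, method, opclass="vector_cosine_ops", **options):
    """Return a CREATE INDEX statement for a pgvector index.

    Identifiers are interpolated directly, so pass trusted constants only.
    """
    if method not in ("hnsw", "ivfflat"):
        raise ValueError(f"unsupported index method: {method}")
    with_clause = ""
    if options:
        # Keyword arguments keep their call order, e.g. m before ef_construction.
        pairs = ", ".join(f"{name} = {value}" for name, value in options.items())
        with_clause = f" WITH ({pairs})"
    return f"CREATE INDEX ON {table} USING {method} ({column} {opclass}){with_clause};"


# HNSW with the default build parameters spelled out.
hnsw_sql = create_index_sql("items", "embedding", "hnsw", m=16, ef_construction=64)

# IVFFlat; lists = 10000 assumes roughly sqrt(rows) for a 100M-row table.
ivfflat_sql = create_index_sql("items", "embedding", "ivfflat", lists=10000)

print(hnsw_sql)
print(ivfflat_sql)
```

Printing the statements rather than executing them keeps the sketch independent of any particular driver or connection setup.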

HNSW parameters

m (max connections per node, default 16): higher = better recall, more memory. ef_construction (candidate list size during the build, default 64): higher = better graph quality and recall, slower build. hnsw.ef_search (candidate list size during a query, default 40, set per session or transaction rather than on the index): higher = better recall per query, slower queries. Tune against your own eval data, not generic defaults.
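
One way to tune against eval data is to measure recall@k for several ef_search settings and keep the lowest setting that meets a target. The sketch below assumes you have already collected, for each query, the exact top-k ids (from a sequential scan) and the approximate ids at each setting; the helper names and the toy ids are made up for illustration.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k that the approximate top-k recovered."""
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    return len(truth & set(approx_ids[:k])) / len(truth)


def smallest_ef_search(results_by_ef, exact, k, target):
    """Return the lowest ef_search whose mean recall@k meets the target.

    results_by_ef maps an ef_search value to a list of approximate id lists,
    one per query, in the same order as the exact id lists in `exact`.
    Returns None if no setting reaches the target.
    """
    for ef in sorted(results_by_ef):
        scores = [recall_at_k(a, e, k) for a, e in zip(results_by_ef[ef], exact)]
        if sum(scores) / len(scores) >= target:
            return ef
    return None


# Toy eval set: two queries, made-up ids.
exact = [[1, 2, 3, 4], [5, 6, 7, 8]]
results_by_ef = {
    40: [[1, 2, 9, 4], [5, 6, 7, 10]],
    100: [[1, 2, 3, 4], [5, 6, 7, 10]],
    200: [[1, 2, 3, 4], [5, 6, 7, 8]],
}

chosen = smallest_ef_search(results_by_ef, exact, k=4, target=0.85)
print(chosen)
```

The same loop works for m and ef_construction, except that each candidate value there requires an index rebuild, so the sweep is much more expensive.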

Partition for parallelism

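Postgres prunes partitions that a query's WHERE clause rules out, and in some plans it can scan the remaining partitions in parallel. Partition by tenant, date, or category. Each partition gets its own, smaller HNSW index. Searches that land on one partition are fast; cross-partition searches have to probe every partition's index and merge the results, so they are slower, but the cost is bounded by the partition count.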
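
A minimal sketch of what the partitioned layout could look like, generated as DDL strings. It assumes hash partitioning on a tenant_id column, a hypothetical items table, and cosine distance; the helper is illustrative, and the index is declared on the parent table so that Postgres creates a matching index on each partition.

```python
def partition_ddl(table, dims, partitions, m=16, ef_construction=64):
    """Return DDL statements for a hash-partitioned table of embeddings."""
    statements = [
        f"CREATE TABLE {table} ("
        f"id bigint, tenant_id bigint NOT NULL, embedding vector({dims})"
        f") PARTITION BY HASH (tenant_id);"
    ]
    # One child table per hash bucket.
    for i in range(partitions):
        statements.append(
            f"CREATE TABLE {table}_p{i} PARTITION OF {table} "
            f"FOR VALUES WITH (MODULUS {partitions}, REMAINDER {i});"
        )
    # An index declared on the parent cascades to every partition.
    statements.append(
        f"CREATE INDEX ON {table} USING hnsw (embedding vector_cosine_ops) "
        f"WITH (m = {m}, ef_construction = {ef_construction});"
    )
    return statements


ddl = partition_ddl("items", dims=768, partitions=4)
for statement in ddl:
    print(statement)
```

A query that includes WHERE tenant_id = ... can then be pruned to a single partition and its index. List or range partitioning (by category or date) follows the same pattern with a different PARTITION BY clause.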

Hybrid filtering

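Filter by metadata first (WHERE tenant_id = X), then vector-search the matching subset. Partitions or partial indexes keyed on the filter column make this literal: the search only ever touches matching rows. With a single approximate index, whether HNSW or IVFFlat, pgvector applies the filter after the index scan, so a highly selective filter can return fewer rows than the LIMIT asks for. Iterative index scans (added in pgvector 0.8.0) keep scanning until enough rows pass the filter, at some cost in latency.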
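
The difference between filtering before and after the vector search can be shown with a toy brute-force model. This is not how pgvector is implemented; it only mimics the effect, with a candidates argument standing in for the bounded number of rows an approximate index scan returns. All names and data are made up.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def post_filter(rows, query, tenant, k, candidates):
    """Take the nearest `candidates` rows overall, then drop other tenants."""
    nearest = sorted(rows, key=lambda r: l2(r["vec"], query))[:candidates]
    return [r["id"] for r in nearest if r["tenant"] == tenant][:k]


def pre_filter(rows, query, tenant, k):
    """Keep only the tenant's rows, then rank that subset by distance."""
    subset = [r for r in rows if r["tenant"] == tenant]
    return [r["id"] for r in sorted(subset, key=lambda r: l2(r["vec"], query))[:k]]


# Ten points on a line; tenant "a" owns ids 3, 8 and 9, tenant "b" the rest.
rows = [
    {"id": i, "vec": (float(i), 0.0), "tenant": "a" if i in (3, 8, 9) else "b"}
    for i in range(10)
]
query = (0.0, 0.0)

post = post_filter(rows, query, tenant="a", k=2, candidates=5)
pre = pre_filter(rows, query, tenant="a", k=2)
print(post, pre)
```

With these toy rows the post-filter path finds only one of the two requested neighbours, because most of the five nearest candidates belong to the other tenant, while the pre-filter path returns both.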

Limits

Past ~100M vectors per partition, HNSW build time becomes painful. Past ~500M total, dedicated vector DBs (Milvus, Pinecone) win on operational simplicity. Cassandra 5 also ships vector search, which is another option for teams that would rather stay on a general-purpose database.
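
A back-of-envelope size estimate helps explain why builds get painful at this scale. The sketch below uses pgvector's documented storage of 4 bytes per dimension plus 8 bytes per vector. The link estimate (2 * m neighbours on the base layer at 6 bytes each) is a rough assumption that ignores upper layers and page overhead, and the 1536-dimension figure is an example, not a number from this article.

```python
def raw_vector_bytes(n_vectors, dims):
    """Storage for the vectors alone: 4 bytes per dimension plus 8 per vector."""
    return n_vectors * (4 * dims + 8)


def hnsw_link_bytes(n_vectors, m=16, bytes_per_link=6):
    """Rough lower bound for graph links, assuming 2 * m base-layer neighbours."""
    return n_vectors * 2 * m * bytes_per_link


n = 100_000_000
dims = 1536  # assumed embedding size for illustration

vectors = raw_vector_bytes(n, dims)
links = hnsw_link_bytes(n)
print(f"vectors: {vectors / 1e9:.1f} GB, links: {links / 1e9:.1f} GB")
```

At these assumed sizes the vectors alone come to hundreds of gigabytes, far more than most servers can hold in memory during a build, which is one reason smaller per-partition indexes are easier to build and rebuild.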

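In short: use HNSW, tune m, ef_construction, and ef_search against your own eval data, partition by tenant, and filter on metadata before the vector search. pgvector works to ~100M vectors per partition; past ~500M total, a dedicated vector DB is the easier path.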