pgvector turned Postgres into a serious vector database. At 100K vectors almost any configuration works; at 100M, index build time, memory, and recall all become real constraints. Tuning matters (HNSW parameters, partitioning, hybrid filtering), and many teams hit walls because they never planned past prototype scale.
HNSW vs IVFFlat
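HNSW (available since pgvector 0.5.0; pgvector has no default index type, you choose one when you create the index): better recall for a given query speed, at the cost of more memory and a slower build. IVFFlat: faster build and lower memory, but lower recall for the same query speed, and the gap widens at scale. Use HNSW unless build time or memory is a hard constraint.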
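As a rough illustration of what choosing between the two looks like, the sketch below renders the CREATE INDEX statement for each index type. The table and column names (items, embedding) are hypothetical, cosine distance is an assumed choice, and the lists heuristic (rows / 1000 up to 1M rows, sqrt(rows) above that) is the starting point suggested in the pgvector README, not a tuned value. Identifiers are interpolated without quoting, so only pass trusted names.

```python
import math


def hnsw_index_sql(table: str, column: str, m: int = 16, ef_construction: int = 64) -> str:
    """Render CREATE INDEX for an HNSW index (cosine distance is an assumption)."""
    return (
        f"CREATE INDEX ON {table} USING hnsw ({column} vector_cosine_ops) "
        f"WITH (m = {m}, ef_construction = {ef_construction});"
    )


def ivfflat_lists(row_count: int) -> int:
    """Starting value for `lists`: rows / 1000 up to 1M rows, sqrt(rows) above that."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return max(1, math.isqrt(row_count))


def ivfflat_index_sql(table: str, column: str, row_count: int) -> str:
    """Render CREATE INDEX for an IVFFlat index sized for the current row count."""
    return (
        f"CREATE INDEX ON {table} USING ivfflat ({column} vector_cosine_ops) "
        f"WITH (lists = {ivfflat_lists(row_count)});"
    )


# Hypothetical table "items" with a vector column "embedding".
print(hnsw_index_sql("items", "embedding"))
print(ivfflat_index_sql("items", "embedding", row_count=4_000_000))
```

Because IVFFlat derives its lists from the rows that already exist, build that index after the table is loaded; HNSW has no such requirement.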
HNSW parameters
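m (max connections per node, default 16): higher = better recall, more memory, slower build. ef_construction (candidate list size while building the graph, default 64): higher = better graph quality and therefore recall, slower build. ef_search (candidate list size while querying, default 40, set per session or transaction via hnsw.ef_search): higher = better recall per query, slower queries. m and ef_construction are fixed when the index is built; ef_search can be changed at any time. Tune against your own eval data rather than trusting generic defaults.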
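One way to tune against eval data is to measure recall@k at several ef_search values and keep the smallest one that meets your target. The sketch below assumes you have already collected two things for a fixed set of eval queries: the exact top-k ids (for example, from the same query run without the index) and the approximate top-k ids at each ef_search setting. The function names and the sample data are illustrative only.

```python
from typing import Dict, List, Optional, Sequence


def recall_at_k(approx_ids: Sequence[int], exact_ids: Sequence[int], k: int) -> float:
    """Fraction of the true top-k neighbours that the approximate search returned."""
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    return len(truth & set(approx_ids[:k])) / len(truth)


def mean_recall(approx: List[Sequence[int]], exact: List[Sequence[int]], k: int) -> float:
    """Average recall@k over all eval queries (lists are aligned by query)."""
    return sum(recall_at_k(a, e, k) for a, e in zip(approx, exact)) / len(exact)


def smallest_ef_search(
    results_by_ef: Dict[int, List[Sequence[int]]],
    exact: List[Sequence[int]],
    k: int,
    target: float,
) -> Optional[int]:
    """Lowest ef_search whose mean recall@k meets the target, or None if none does."""
    for ef in sorted(results_by_ef):
        if mean_recall(results_by_ef[ef], exact, k) >= target:
            return ef
    return None


# Made-up eval data: two queries, k = 4.
exact = [[1, 2, 3, 4], [5, 6, 7, 8]]
results_by_ef = {
    40: [[1, 2, 3, 9], [5, 6, 0, 8]],   # one miss per query
    100: [[1, 2, 3, 4], [5, 6, 7, 8]],  # matches the exact results
}
print(smallest_ef_search(results_by_ef, exact, k=4, target=0.95))
```

The same loop works for comparing indexes built with different m or ef_construction values; only the way the approximate results are collected changes.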
Partition for parallelism
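Partition by tenant, date, or category, and give each partition its own HNSW index. When the WHERE clause pins the partition key, Postgres prunes the other partitions and the search touches one small index, which is fast. A query that spans partitions has to search every remaining partition's index and merge the results; Postgres can run those scans in parallel, so the cost is higher but bounded by the partition count.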
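To see why a cross-partition search costs more, it helps to picture the merge step: each partition contributes its own nearest candidates, and the global top-k is taken from the combined, distance-ordered stream. Postgres handles this for you when you query the parent table; the sketch below only illustrates the idea, with made-up distances and row ids.

```python
import heapq
import itertools
from typing import Iterable, List, Tuple

Hit = Tuple[float, int]  # (distance, row id); smaller distance means closer


def merge_top_k(per_partition: Iterable[List[Hit]], k: int) -> List[Hit]:
    """Merge per-partition hit lists, each already sorted by distance, into a global top-k."""
    merged = heapq.merge(*per_partition)  # lazy k-way merge of sorted inputs
    return list(itertools.islice(merged, k))


# Made-up results from three partitions, each sorted by distance.
partition_a = [(0.10, 1), (0.40, 2)]
partition_b = [(0.05, 7), (0.30, 8)]
partition_c: List[Hit] = []  # a partition with no candidates

print(merge_top_k([partition_a, partition_b, partition_c], k=3))
```

Every partition has to be asked for up to k candidates before the merge can be trusted, which is why total work grows with the number of partitions that survive pruning.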
Hybrid filtering
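The goal is to filter by metadata (WHERE tenant_id = X) and vector-search only the matching subset. pgvector's approximate indexes do not work that way by default: the index scan runs first and the WHERE clause is applied to its candidates afterward, so a selective filter can leave fewer rows than the LIMIT asked for. Iterative index scans (pgvector 0.8.0 and later, for both HNSW and IVFFlat) keep scanning until enough rows pass the filter, at the cost of latency. For very selective filters, a partial index or a partition per filter value is usually the better fix, and when only a small number of rows match, a B-tree index on the filter column plus an exact distance sort over that subset avoids the problem entirely.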
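The effect of filter selectivity on an approximate index is easy to estimate. The pgvector documentation describes the filter as being applied to the candidates the index returns, so if the index yields ef_search candidates and a fraction of rows match the filter, only about ef_search times that fraction survive. The sketch below does that arithmetic; it assumes matching rows are spread evenly among the nearest neighbours, which real data often violates, and the headroom parameter is a hypothetical safety factor.

```python
import math


def expected_matches(ef_search: int, selectivity: float) -> float:
    """Average number of index candidates that survive a filter applied after the scan."""
    return ef_search * selectivity


def ef_search_for(k: int, selectivity: float, headroom: float = 1.0) -> int:
    """Candidate list size needed so that about k rows survive the filter on average."""
    if not 0 < selectivity <= 1:
        raise ValueError("selectivity must be in (0, 1]")
    return math.ceil(k / selectivity * headroom)


# Default ef_search of 40 with a filter that matches a quarter of the rows.
print(expected_matches(40, 0.25))
# Candidate list size needed to get about 10 rows back through the same filter.
print(ef_search_for(10, 0.25))
```

When the required value climbs into the hundreds, that is a sign to reach for a partial index or a partition rather than a bigger candidate list.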
Limits
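Past ~100M vectors per partition, HNSW build time becomes painful. Past ~500M total, dedicated vector DBs (Milvus, Pinecone) win on operational simplicity. Cassandra 5 vector search is another option for teams that would rather add vectors to a general-purpose database, though it is not Postgres-compatible and means leaving the Postgres ecosystem.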
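A quick size estimate shows why these thresholds bite. A pgvector vector value takes 4 bytes per dimension plus 8 bytes of overhead, so raw vector storage can be computed directly. The sketch below does only that; the 1536-dimension figure is an assumed embedding size, and the estimate excludes row overhead, other columns, and the HNSW graph itself, all of which add to the total.

```python
def vector_bytes(dimensions: int) -> int:
    """Storage for one pgvector `vector` value: 4 bytes per dimension plus 8 bytes."""
    return 4 * dimensions + 8


def raw_vector_gib(row_count: int, dimensions: int) -> float:
    """Raw vector storage in GiB, ignoring row overhead and index size."""
    return row_count * vector_bytes(dimensions) / 2**30


# Assumed workload: 100M vectors of 1536 dimensions each.
print(f"{raw_vector_gib(100_000_000, 1536):.0f} GiB of raw vectors")
```

At that size the vectors alone run to several hundred GiB, which is why build time and memory dominate the conversation well before query latency does.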