Cassandra 5 ships native vector search via SAI (Storage-Attached Indexes) with HNSW. For teams already on Cassandra, this removes the need to run a separate Pinecone/Weaviate cluster for many workloads. Search quality (recall) is competitive with dedicated vector DBs at small-to-medium scale.
How it's exposed
Embeddings live in a new VECTOR<FLOAT, N> column type, where N is the fixed dimension. Index the column with CREATE INDEX ... USING 'sai' WITH OPTIONS = {'similarity_function': 'cosine'}. Query with ORDER BY embedding ANN OF [...] LIMIT 10. The whole surface is familiar CQL.
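A minimal sketch of the statements above, written as Python strings so they can be handed to a driver or pasted into cqlsh. The keyspace, table, column, and index names (demo.articles, embedding, articles_embedding_idx) and the 3-dimensional vector are illustrative assumptions, not taken from a real schema.

```python
# Hypothetical schema: all names and the tiny dimension are for illustration only.
DIM = 3

CREATE_TABLE = (
    "CREATE TABLE IF NOT EXISTS demo.articles ("
    "id uuid PRIMARY KEY, title text, "
    f"embedding vector<float, {DIM}>)"
)

CREATE_INDEX = (
    "CREATE INDEX IF NOT EXISTS articles_embedding_idx "
    "ON demo.articles (embedding) USING 'sai' "
    "WITH OPTIONS = {'similarity_function': 'cosine'}"
)


def vector_literal(values):
    """Render a sequence of numbers as a CQL vector literal, e.g. [0.1, 0.2]."""
    return "[" + ", ".join(repr(float(v)) for v in values) + "]"


def ann_query(query_vector, limit=10):
    """Build an ANN query returning the `limit` rows closest to query_vector."""
    if len(query_vector) != DIM:
        raise ValueError(f"expected {DIM} dimensions, got {len(query_vector)}")
    return (
        "SELECT id, title FROM demo.articles "
        f"ORDER BY embedding ANN OF {vector_literal(query_vector)} "
        f"LIMIT {int(limit)}"
    )


print(CREATE_TABLE)
print(CREATE_INDEX)
print(ann_query([0.1, 0.25, 0.9], limit=5))
```

The dimension check mirrors the fact that a vector column has a fixed length, so a query vector of the wrong size is rejected before it reaches the cluster.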
HNSW under the hood
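HNSW is a Hierarchical Navigable Small World graph: a layered proximity graph searched from the top layer down. Query time grows roughly logarithmically with the number of vectors, O(log N). Recall and latency are tunable via M (maximum neighbors per node) and ef (size of the candidate list kept during build and search). Defaults are reasonable for 768-dim embeddings (typical sentence-transformer output).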
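To make ef concrete, here is a minimal sketch of the best-first search that HNSW runs within a single graph layer. It is a textbook-style illustration, not Cassandra's implementation: the 5x5 grid graph, the Euclidean distance, and all function names are assumptions chosen to keep the example small and deterministic. A real index also builds the graph itself (where M caps each neighbor list) and stacks several layers.

```python
import heapq
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search_layer(vectors, neighbors, query, entry, ef, k):
    """Best-first search over one layer, keeping the ef closest nodes seen."""
    d0 = euclidean(vectors[entry], query)
    visited = {entry}
    candidates = [(d0, entry)]   # min-heap: closest unexplored node first
    results = [(-d0, entry)]     # max-heap via negation: furthest kept node first
    while candidates:
        dist, node = heapq.heappop(candidates)
        if dist > -results[0][0]:
            break  # the closest candidate cannot improve the result set
        for nb in neighbors[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d = euclidean(vectors[nb], query)
            if len(results) < ef or d < -results[0][0]:
                heapq.heappush(candidates, (d, nb))
                heapq.heappush(results, (-d, nb))
                if len(results) > ef:
                    heapq.heappop(results)  # drop the furthest kept node
    ranked = sorted((-neg_d, n) for neg_d, n in results)
    return [n for _, n in ranked][:k]


# Toy graph (assumption): points on a 5x5 grid, each linked to its grid neighbors.
SIDE = 5
vectors = [(float(x), float(y)) for x in range(SIDE) for y in range(SIDE)]


def grid_neighbors(i):
    x, y = divmod(i, SIDE)
    steps = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [a * SIDE + b for a, b in steps if 0 <= a < SIDE and 0 <= b < SIDE]


neighbors = [grid_neighbors(i) for i in range(len(vectors))]

query = (3.2, 1.1)
print(search_layer(vectors, neighbors, query, entry=0, ef=1, k=1))
print(search_layer(vectors, neighbors, query, entry=0, ef=8, k=3))
```

With ef=1 the search is a pure greedy walk toward the query. Raising ef widens the beam: more nodes are visited per query, which costs latency but makes it less likely that a true neighbor is missed.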
Where it fits
Storing embeddings alongside their source rows (chat history, products, articles) means no cross-system join. It also enables hybrid queries: filter by metadata first, then run ANN over the matching rows. Up to ~100M vectors per cluster works well; beyond that, a dedicated vector DB still wins.
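A hybrid query of this kind can be sketched as a small statement builder. This is an illustration under assumptions: the table and column names are hypothetical, the `?` placeholders follow prepared-statement style, and the filter columns are assumed to be restrictable in a WHERE clause in your schema (for example because they carry their own SAI indexes).

```python
import re

# Unquoted CQL identifiers, optionally qualified as keyspace.table.
_IDENT = re.compile(r"^[A-Za-z][A-Za-z0-9_]*(\.[A-Za-z][A-Za-z0-9_]*)?$")


def hybrid_ann_query(table, filter_columns, vector_column="embedding", limit=10):
    """Build a parameterized query: equality filters first, ANN ordering second."""
    for name in (table, vector_column, *filter_columns):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    where = " AND ".join(f"{col} = ?" for col in filter_columns)
    where_clause = f"WHERE {where} " if where else ""
    return (
        f"SELECT * FROM {table} {where_clause}"
        f"ORDER BY {vector_column} ANN OF ? LIMIT {int(limit)}"
    )


# Hypothetical usage: restrict to one category, then rank by similarity.
print(hybrid_ann_query("demo.products", ["category", "in_stock"], limit=5))
```

The bound values are the filter values in order, followed by the query vector. Identifiers are validated because they are interpolated into the statement text rather than bound.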
Limits to know
Index build is RAM-heavy; size nodes accordingly. No quantization yet, so vectors are stored at full precision, which is disk-heavy. No GPU acceleration. For pure-vector workloads beyond 100M vectors, throughput is below Milvus/Pinecone.
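The disk cost of full-precision storage is easy to estimate. The sketch below is back-of-the-envelope arithmetic only: it counts raw 4-byte floats and ignores the graph index, compression, and per-row overhead, and the replication factor of 3 is an assumption, not a figure from this document.

```python
def raw_vector_bytes(num_vectors, dims, replication_factor=1, bytes_per_float=4):
    """Raw bytes needed for unquantized float vectors, before any overhead."""
    return num_vectors * dims * bytes_per_float * replication_factor


per_vector = raw_vector_bytes(1, 768)
single_copy = raw_vector_bytes(100_000_000, 768)
replicated = raw_vector_bytes(100_000_000, 768, replication_factor=3)

print(f"one 768-dim vector: {per_vector} bytes")
print(f"100M vectors, one copy: {single_copy / 1e9:.1f} GB")
print(f"100M vectors, RF=3 (assumed): {replicated / 1e9:.1f} GB")
```

At 768 dimensions each vector takes 3,072 bytes, so 100M vectors come to about 307 GB per copy and roughly 922 GB across three replicas, before the index itself is counted.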
Operational pattern
Add the vector column to an existing table; backfill embeddings with a separate process; build the SAI index last, so it is built once over the backfilled data. Don't rebuild the index repeatedly; it's expensive. Monitor index_size_bytes alongside table size.
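The ordering above can be sketched as a single rollout function. Everything here is an assumption for illustration: the table and index names, the `embed` callable, and the `session` object, which only needs an `execute(statement, parameters)` method. The `%s` placeholders follow the style of unprepared statements in the Python driver; whether a Python list binds cleanly to a vector column depends on your driver version, so treat that as unverified.

```python
def rollout_vector_search(session, rows, embed, dims=768):
    """Run the three steps in order: add column, backfill, then build the index."""
    # Step 1: add the vector column to the existing table.
    session.execute(
        f"ALTER TABLE demo.articles ADD embedding vector<float, {int(dims)}>"
    )
    # Step 2: backfill embeddings row by row (a real job would batch and throttle).
    update = "UPDATE demo.articles SET embedding = %s WHERE id = %s"
    for row_id, text in rows:
        session.execute(update, (embed(text), row_id))
    # Step 3: build the SAI index last, once the data is in place.
    session.execute(
        "CREATE INDEX IF NOT EXISTS articles_embedding_idx "
        "ON demo.articles (embedding) USING 'sai' "
        "WITH OPTIONS = {'similarity_function': 'cosine'}"
    )


class RecordingSession:
    """Stand-in for a driver session; it only records what would be executed."""

    def __init__(self):
        self.calls = []

    def execute(self, statement, parameters=None):
        self.calls.append((statement, parameters))


def fake_embed(text):
    """Placeholder embedding: a real job would call an embedding model here."""
    return [float(len(text)), 0.0, 1.0]


session = RecordingSession()
rollout_vector_search(session, [(1, "hello"), (2, "vector")], fake_embed, dims=3)
for statement, parameters in session.calls:
    print(statement, parameters)
```

Swapping RecordingSession for a real driver session is the only change needed to run this against a cluster, subject to the driver caveat above.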