Cassandra 5 ships native vector search through SAI (Storage-Attached Indexes), backed by an HNSW graph index. For teams already running Cassandra, this removes the need to operate a separate Pinecone or Weaviate cluster for many workloads. Search quality is competitive with dedicated vector DBs at small-to-medium scale.

How it's exposed

Vectors get a new column type, VECTOR<FLOAT, N>, where N is the dimension. Index the column with CREATE INDEX ... USING 'sai' WITH OPTIONS = {'similarity_function': 'cosine'}, then query with ORDER BY embedding ANN OF [...] LIMIT 10. The surface is familiar CQL.
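
To make the three pieces concrete, here is a minimal Python sketch that only builds the CQL strings; it does not connect to a cluster. The keyspace, table, column names, and helper functions (demo, articles, id, body, create_table_cql, and so on) are hypothetical and chosen for illustration.

```python
# Sketch only: assembles CQL text for a vector table, its SAI index, and an ANN query.
# All identifiers (keyspace, table, columns, function names) are illustrative assumptions.

def create_table_cql(keyspace: str, table: str, dim: int) -> str:
    """Table with a fixed-dimension float vector column next to the source text."""
    return (
        f"CREATE TABLE IF NOT EXISTS {keyspace}.{table} ("
        f"id uuid PRIMARY KEY, body text, embedding VECTOR<FLOAT, {dim}>)"
    )


def create_index_cql(keyspace: str, table: str,
                     column: str = "embedding", similarity: str = "cosine") -> str:
    """SAI index on the vector column with the chosen similarity function."""
    return (
        f"CREATE INDEX IF NOT EXISTS {table}_{column}_idx "
        f"ON {keyspace}.{table} ({column}) "
        f"USING 'sai' WITH OPTIONS = {{'similarity_function': '{similarity}'}}"
    )


def ann_query_cql(keyspace: str, table: str, query_vector,
                  limit: int = 10, column: str = "embedding") -> str:
    """Top-`limit` approximate nearest neighbours of `query_vector`."""
    literal = "[" + ", ".join(repr(float(x)) for x in query_vector) + "]"
    return (
        f"SELECT id, body FROM {keyspace}.{table} "
        f"ORDER BY {column} ANN OF {literal} LIMIT {limit}"
    )


if __name__ == "__main__":
    print(create_table_cql("demo", "articles", 768))
    print(create_index_cql("demo", "articles"))
    print(ann_query_cql("demo", "articles", [0.1, 0.2, 0.3], limit=5))
```

The resulting strings can be pasted into cqlsh or passed to whichever driver you already use.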

HNSW under the hood

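The index is a Hierarchical Navigable Small World (HNSW) graph, which gives roughly O(log N) query time. Recall and latency are tunable via the M (neighbors per node) and ef (candidate-list size) parameters: higher values raise recall at the cost of latency and memory. The defaults are reasonable for 768-dim embeddings (typical sentence-transformer output).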
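
Recall here means the fraction of the true nearest neighbors that the approximate search actually returns. A rough way to measure it is to compare ANN results against an exact brute-force ranking on a small sample. The sketch below is plain Python and is not how Cassandra computes anything internally; the function names are hypothetical.

```python
import math

# Sketch only: exact cosine top-k as a ground truth, plus recall@k against it.


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_top_k(query, vectors, k):
    """vectors maps id -> vector; returns the k most similar ids, best first."""
    ranked = sorted(vectors, key=lambda vid: cosine(query, vectors[vid]), reverse=True)
    return ranked[:k]


def recall_at_k(approx_ids, exact_ids):
    """Share of the exact top-k ids that also appear in the approximate result."""
    exact = set(exact_ids)
    return len(exact & set(approx_ids)) / len(exact)


if __name__ == "__main__":
    sample = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
    truth = exact_top_k([1.0, 0.0], sample, k=2)
    # Pretend the ANN query returned "a" and "c" for the same query vector.
    print(truth, recall_at_k(["a", "c"], truth))
```

Brute force is linear in the number of vectors, so run it on a sample of queries and rows rather than the full table.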

Where it fits

Storing embeddings alongside their source rows (chat history, products, articles) means no cross-system join. Hybrid queries filter by metadata first and run ANN second. Up to ~100M vectors per cluster works well; beyond that, a dedicated vector DB still wins.
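
A hybrid query might look like the statement produced by this small Python sketch. It assumes the filter columns have their own SAI indexes and that the statement is used as a prepared statement with bind markers; the table and column names (shop.products, category) and the helper name are hypothetical.

```python
# Sketch only: builds a prepared-statement string combining a metadata filter with ANN.
# Assumes each filter column is covered by its own SAI index.


def hybrid_query_cql(keyspace, table, filter_columns, vector_column="embedding"):
    """Equality filters on metadata columns, then ANN ordering on the vector column."""
    if not filter_columns:
        raise ValueError("at least one filter column is required for a hybrid query")
    where = " AND ".join(f"{col} = ?" for col in filter_columns)
    return (
        f"SELECT * FROM {keyspace}.{table} "
        f"WHERE {where} "
        f"ORDER BY {vector_column} ANN OF ? LIMIT ?"
    )


if __name__ == "__main__":
    # Bind values at execution time: the category, the query vector, and the limit.
    print(hybrid_query_cql("shop", "products", ["category"]))
```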

Limits to know

Index builds are RAM-heavy, so size nodes accordingly. There is no quantization yet, so vectors are stored at full precision, which is disk-heavy. There is no GPU acceleration. For pure-vector workloads beyond 100M vectors, throughput is below Milvus/Pinecone.
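
To get a feel for the disk cost of full-precision storage, a back-of-the-envelope calculation helps. The sketch below counts only raw vector payload at 4 bytes per FLOAT component; it ignores the graph index, other columns, and compression, and the replication factor of 3 is an assumed example value.

```python
# Sketch only: raw storage for unquantized float32 vectors.
# Excludes index structures, other columns, and compression effects.

BYTES_PER_FLOAT = 4  # a CQL FLOAT is a 32-bit value


def raw_vector_bytes(num_vectors: int, dim: int) -> int:
    """Bytes needed for the vector components alone, for one replica."""
    return num_vectors * dim * BYTES_PER_FLOAT


def cluster_vector_bytes(num_vectors: int, dim: int, replication_factor: int = 3) -> int:
    """Same payload multiplied across replicas (replication factor is an assumption)."""
    return raw_vector_bytes(num_vectors, dim) * replication_factor


if __name__ == "__main__":
    one_copy = raw_vector_bytes(100_000_000, 768)
    print(f"{one_copy / 1e9:.1f} GB per replica")                                    # 307.2 GB
    print(f"{cluster_vector_bytes(100_000_000, 768) / 1e9:.1f} GB across 3 replicas")  # 921.6 GB
```

At 100M vectors of 768 dimensions, the payload alone is about 307 GB per replica before any index overhead.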

Operational pattern

Add the vector column to an existing table, backfill embeddings with a separate process, and build the SAI index last. Avoid rebuilding the index repeatedly; it is expensive. Monitor index_size_bytes alongside table size.
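
The ordering above can be sketched as follows. This is Python that only prepares the statements and batches the work; it does not talk to a cluster. The table name, the id primary key, the helper names, and the batch size are hypothetical.

```python
from itertools import islice

# Sketch only: the three rollout steps in order, plus a helper to chunk the backfill.
# Identifiers such as demo.articles and the `id` key column are illustrative assumptions.


def rollout_statements(keyspace, table, dim, column="embedding"):
    """Return (add_column, backfill_update, create_index) in the order to run them."""
    add_column = f"ALTER TABLE {keyspace}.{table} ADD {column} VECTOR<FLOAT, {dim}>"
    # Executed once per row by the separate backfill process, with bound values.
    backfill_update = f"UPDATE {keyspace}.{table} SET {column} = ? WHERE id = ?"
    create_index = (
        f"CREATE INDEX IF NOT EXISTS {table}_{column}_idx "
        f"ON {keyspace}.{table} ({column}) "
        f"USING 'sai' WITH OPTIONS = {{'similarity_function': 'cosine'}}"
    )
    return add_column, backfill_update, create_index


def batched(rows, size):
    """Yield lists of up to `size` rows so the backfill can be throttled."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    add_column, backfill_update, create_index = rollout_statements("demo", "articles", 768)
    print(add_column)
    for chunk in batched(range(5), 2):
        print(f"would run {len(chunk)} x: {backfill_update}")
    print(create_index)  # last, once the backfill has finished
```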

Cassandra 5 vector search is good enough to delete your Pinecone instance for many workloads, especially when you already store the source data in Cassandra.