Why Indexing Strategy Matters

The indexing strategy you choose for an AI conversation archive determines not just search quality but also cost, latency, and maintenance overhead. A personal archive with a few hundred conversations can use any approach. An enterprise archive with millions of conversations and millisecond latency requirements demands careful architecture — and the wrong choice at the design stage creates expensive migration work later.

BM25: Fast Keyword Retrieval

BM25 (Best Match 25) is the classic full-text search algorithm used by Elasticsearch, Solr, and SQLite FTS5 (PostgreSQL's built-in tsvector ranking uses its own ts_rank functions, which are similar in spirit but not BM25). It scores documents based on term frequency, inverse document frequency, and document length normalization. BM25 is fast (sub-millisecond on well-indexed corpora of millions of documents), predictable, and excellent for queries where users know the exact words they're looking for. It fails when users search by concept rather than keyword.
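The scoring described above fits in a few lines of plain Python. This is a minimal sketch over a toy tokenized corpus, not a production index — the corpus, query, and the `bm25_scores` helper are illustrative, and real engines precompute postings lists rather than rescanning documents:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with BM25.

    k1 controls term-frequency saturation; b controls length normalization.
    """
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # Document frequency: how many documents contain each query term.
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            denom = tf[t] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[t] * (k1 + 1) / denom
        scores.append(score)
    return scores

docs = [
    "reset the api key in the dashboard".split(),
    "the pricing model changed in q3".split(),
    "api rate limits and retry logic".split(),
]
print(bm25_scores(["api", "key"], docs))
```

The first document matches both query terms and scores highest; the second matches neither and scores zero — exactly the "know the exact words" behavior described above.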

Dense Vector Search

Dense vector search converts both queries and documents into high-dimensional embedding vectors and retrieves documents by cosine similarity or dot product. It handles concept-based queries, synonyms, and cross-lingual retrieval that BM25 cannot. Latency depends on index size and the approximate nearest neighbor (ANN) algorithm used: FAISS HNSW, Qdrant, Weaviate, and Pinecone all offer sub-10ms retrieval at million-scale. The cost is compute (embedding generation) and storage (vectors are 6KB-25KB per chunk depending on dimensionality).
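At its core, dense retrieval is just nearest-neighbor search over embedding vectors. The sketch below uses exact brute-force cosine similarity on tiny hand-made 4-dimensional vectors so it runs anywhere; real systems embed text with a model (typically 384-3072 dimensions) and use an ANN index like HNSW instead of scanning every vector. The `index` contents and `nearest` helper are illustrative:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(query_vec, index, top_k=2):
    """Exact (non-ANN) search: rank every stored vector by similarity."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_k]]

# Toy "embeddings" standing in for model output.
index = [
    ("conv-1", [0.9, 0.1, 0.0, 0.1]),
    ("conv-2", [0.1, 0.8, 0.2, 0.0]),
    ("conv-3", [0.85, 0.15, 0.05, 0.1]),
]
print(nearest([1.0, 0.1, 0.0, 0.1], index, top_k=2))  # → ['conv-1', 'conv-3']
```

Swapping the `sorted` scan for an HNSW or IVF index is what turns this O(n) loop into the sub-10ms retrieval the production systems above advertise.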

Hybrid Retrieval

Hybrid retrieval combines BM25 and dense vector search, typically using Reciprocal Rank Fusion (RRF) to merge result sets. In practice, hybrid retrieval outperforms either approach alone on most retrieval benchmarks, capturing the complementary strengths of exact keyword matching and semantic similarity. The BEIR benchmark shows hybrid retrieval improving NDCG@10 by 5-15% over dense-only retrieval across diverse query types — a significant real-world improvement in search quality.
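RRF itself is simple: each document's fused score is the sum of 1/(k + rank) across every result list it appears in, so documents ranked well by both retrievers float to the top. A minimal sketch, with hypothetical result lists standing in for BM25 and vector output (k=60 is the value commonly used in the RRF literature):

```python
def rrf_merge(rankings, k=60):
    """Merge ranked result lists with Reciprocal Rank Fusion.

    rankings: list of doc-id lists, each ordered best-first.
    Returns a single fused ranking, best-first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["conv-7", "conv-2", "conv-9"]      # keyword results
vector_hits = ["conv-2", "conv-4", "conv-7"]    # semantic results
print(rrf_merge([bm25_hits, vector_hits]))
```

Here "conv-2" wins because it ranks highly in both lists, even though neither retriever put it first — the complementary-strengths effect the benchmarks measure.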

Knowledge Graph Approaches

Knowledge graph indexing extracts entities and relationships from conversations (people, companies, decisions, dates, metrics) and stores them as nodes and edges in a graph database (Neo4j, Amazon Neptune, or a property graph). Graph retrieval excels at relationship queries: "What did we decide about the pricing model in Q3?" or "Which conversations mention both the Acme project and the Jenkins team?" Vector and keyword search answer these queries unreliably at best, because they retrieve similar passages rather than traverse explicit relationships.
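The "mentions both X and Y" query above reduces to a set intersection over an entity-to-conversations index. This is a deliberately minimal in-memory sketch — the `ConversationGraph` class, entity names, and conversation IDs are all hypothetical, and a real system would use a graph database and an extraction pipeline to populate it:

```python
from collections import defaultdict

class ConversationGraph:
    """Toy property graph: (subject, relation, object) triples per conversation."""

    def __init__(self):
        self.edges = []                    # (subj, rel, obj, conv_id) triples
        self.mentions = defaultdict(set)   # entity -> set of conversation ids

    def add_mention(self, conv_id, entity):
        self.mentions[entity].add(conv_id)

    def add_edge(self, subj, rel, obj, conv_id):
        self.edges.append((subj, rel, obj, conv_id))
        self.add_mention(conv_id, subj)
        self.add_mention(conv_id, obj)

    def conversations_mentioning_all(self, *entities):
        """Intersect the mention sets of every named entity."""
        sets = [self.mentions[e] for e in entities]
        return set.intersection(*sets) if sets else set()

g = ConversationGraph()
g.add_edge("Acme project", "staffed_by", "Jenkins team", "conv-12")
g.add_edge("Acme project", "decided", "tiered pricing", "conv-31")
g.add_mention("conv-40", "Jenkins team")
print(g.conversations_mentioning_all("Acme project", "Jenkins team"))  # → {'conv-12'}
```

In Neo4j the same query would be a short Cypher MATCH over mention edges; the point is that the answer comes from graph structure, not text similarity.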

Choosing a Strategy

For AI conversation retrieval specifically, hybrid search (BM25 + dense vector, RRF merged) is the recommended baseline for production systems. It provides the best recall across query types, handles both exact and semantic queries, and has mature open-source tooling. Add knowledge graph indexing when entity relationship queries are a significant portion of search volume. For pure speed at the cost of some recall, BM25-only on a well-maintained index is viable for corpora under 100k conversations.
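One way to read these recommendations is as a small decision function. The thresholds below are assumptions for illustration (the 100k-conversation figure comes from the text; the 10% entity-query share is a hypothetical cutoff), and `choose_index_strategy` is not a real API:

```python
def choose_index_strategy(num_conversations, entity_query_share):
    """Hypothetical decision helper mirroring the recommendations above.

    entity_query_share: estimated fraction of queries that are
    entity-relationship queries (hypothetical 0.1 cutoff).
    """
    if num_conversations < 100_000 and entity_query_share < 0.1:
        return "bm25-only"
    strategy = "hybrid (BM25 + dense, RRF)"
    if entity_query_share >= 0.1:
        strategy += " + knowledge graph"
    return strategy

print(choose_index_strategy(50_000, 0.02))      # small archive, keyword queries
print(choose_index_strategy(2_000_000, 0.25))   # enterprise, many entity queries
```

Real deployments would weigh latency budgets and embedding cost as well; the sketch only encodes the corpus-size and query-mix axes discussed above.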