Chunking
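Split the input into overlapping windows, process each, then combine the results. The overlap keeps an idea that straddles a boundary whole in at least one window. Split on sentence or paragraph boundaries rather than at fixed character offsets.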
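A minimal sketch of sentence-aware overlapping windows, assuming a naive regex sentence splitter and a window measured in sentences. The function name `chunk_sentences` and the parameters `max_sentences` and `overlap` are illustrative, not taken from any library; a real pipeline would more likely size windows in tokens.

```python
import re


def chunk_sentences(text: str, max_sentences: int = 5, overlap: int = 1) -> list[str]:
    """Split text into windows of whole sentences that share `overlap` sentences."""
    if max_sentences <= 0 or not 0 <= overlap < max_sentences:
        raise ValueError("need max_sentences > 0 and 0 <= overlap < max_sentences")

    # Naive splitter (assumption): a sentence ends at ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    step = max_sentences - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(sentences), step):
        chunks.append(" ".join(sentences[start:start + max_sentences]))
        if start + max_sentences >= len(sentences):
            break  # this window already reached the end of the text
    return chunks
```

With `max_sentences=3` and `overlap=1`, the last sentence of each window is repeated as the first sentence of the next, so a thought that spans the boundary appears intact in one of them.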
Map-reduce
Map: process each chunk. Reduce: combine intermediate outputs. Classic for summarization at scale.
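The pattern can be sketched as below, assuming the map step is independent per chunk and can therefore run concurrently. `map_reduce`, `first_sentence`, and `join_summaries` are hypothetical names; `first_sentence` is only a stand-in for whatever model call would summarize a chunk.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def map_reduce(
    chunks: list[str],
    map_fn: Callable[[str], str],
    reduce_fn: Callable[[list[str]], str],
    max_workers: int = 4,
) -> str:
    """Apply map_fn to every chunk, then combine the partial results with reduce_fn."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in input order, so chunk order is preserved.
        partials = list(pool.map(map_fn, chunks))
    return reduce_fn(partials)


def first_sentence(chunk: str) -> str:
    """Stand-in for an LLM summarization call: keep only the first sentence."""
    return chunk.split(". ")[0].rstrip(".") + "."


def join_summaries(partials: list[str]) -> str:
    """Stand-in reduce step: concatenate the partial summaries."""
    return " ".join(partials)
```

When the joined partial outputs are still too long for one call, the reduce step can itself be applied in rounds, summarizing groups of partials until a single result remains.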
Retrieval (RAG)
Embed the corpus chunks ahead of time. At query time, embed the query and retrieve the top-K most similar chunks. Feed only those chunks to the LLM. Standard for knowledge-base Q&A.
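A toy sketch of the retrieval step, assuming a bag-of-words count vector in place of a learned embedding model and cosine similarity as the ranking score. All names here (`embed`, `cosine`, `top_k`, `build_prompt`) are illustrative; a real system would use a trained embedding model and a vector index, and would embed the corpus once rather than on every query.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase word counts, not a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    # Embedded on the fly for brevity; normally precomputed and stored in an index.
    index = [(chunk, embed(chunk)) for chunk in chunks]
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query: str, chunks: list[str], k: int = 2) -> str:
    """Assemble a prompt that contains only the retrieved chunks plus the question."""
    context = "\n".join(top_k(query, chunks, k))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

The string returned by `build_prompt` is what would be sent to the LLM, so the model sees K chunks instead of the whole corpus.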