Hallucinations (confident but incorrect statements) are the #1 reason LLM apps lose user trust. They can't be eliminated entirely, but the right architecture can reduce them from ~10% to <1% of answers. Four techniques stack: RAG grounding, citation requirements, self-verification, and answer abstention. A small benchmark then tells you whether they are working.
RAG grounding
Retrieve relevant documents, then instruct the model to answer ONLY from the retrieved context. Add: 'If the answer is not in the context, say "I do not know."' This limits freeform generation, which is where hallucinations originate.
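A minimal sketch of the prompt assembly described above, assuming retrieval has already happened and returned a list of passages. The function name build_grounded_prompt and the "[Document N]" labels are illustrative choices, not part of any particular library.

```python
def build_grounded_prompt(question: str, documents: list[str]) -> str:
    """Wrap retrieved passages and the question in a context-only instruction."""
    # Label each passage so the model can tell the sources apart.
    context = "\n\n".join(
        f"[Document {i}]\n{doc.strip()}"
        for i, doc in enumerate(documents, start=1)
    )
    return (
        "Answer ONLY from the context below.\n"
        "If the answer is not in the context, say I do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


if __name__ == "__main__":
    print(build_grounded_prompt(
        "What is the refund window?",
        ["Refunds are accepted within 30 days of purchase."],
    ))
```

The returned string can be sent to whichever model client you use. Keeping the instruction ahead of the context makes it harder for a long passage to bury it.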
Citation requirements
Require the model to cite the source document and line number for every factual claim. Reject responses without citations. The act of citing forces the model to ground itself, for a roughly 3x reduction in hallucinations.
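One way to enforce the rejection rule is to check the response mechanically before showing it to the user. The sketch below assumes a citation format of [doc_id:L<line>] (for example [policy:L3]) and a naive sentence splitter. The format, the function names and the line_counts mapping are all assumptions made for illustration.

```python
import re

# Assumed citation format: [doc_id:L<line>], e.g. [policy:L3].
CITATION = re.compile(r"\[([A-Za-z0-9_.-]+):L(\d+)\]")


def split_sentences(text: str) -> list[str]:
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def validate_citations(answer: str, line_counts: dict[str, int]) -> list[str]:
    """Return a list of problems; an empty list means the answer is accepted.

    line_counts maps each retrieved document id to its number of lines.
    """
    problems = []
    for sentence in split_sentences(answer):
        cites = CITATION.findall(sentence)
        if not cites:
            problems.append(f"no citation: {sentence!r}")
            continue
        for doc_id, line in cites:
            if doc_id not in line_counts:
                problems.append(f"unknown document: {doc_id}")
            elif not 1 <= int(line) <= line_counts[doc_id]:
                problems.append(f"line out of range: {doc_id}:L{line}")
    return problems
```

This only verifies that each citation exists and points at a real line. It does not verify that the cited line supports the claim, which is the job of the self-verification step.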
Self-verification chain
After generating an answer, prompt the model: 'Review the previous answer. Identify any claims not supported by the context. Rewrite to remove unsupported claims.' Costs one extra call but catches the model's own confident wrongness.
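The two-call chain can be sketched as follows. The generate parameter is a stand-in for whatever function sends a prompt to your model and returns its text. It is a hypothetical callable, not a real client API.

```python
from typing import Callable

VERIFY_INSTRUCTION = (
    "Review the previous answer. Identify any claims not supported by the "
    "context. Rewrite to remove unsupported claims."
)


def answer_with_verification(
    question: str,
    context: str,
    generate: Callable[[str], str],  # hypothetical: prompt in, model text out
) -> str:
    # Call 1: draft an answer from the retrieved context.
    draft = generate(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")

    # Call 2: show the model its own draft next to the context and ask it
    # to strip anything the context does not support.
    review_prompt = (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n\n"
        f"Previous answer:\n{draft}\n\n"
        f"{VERIFY_INSTRUCTION}"
    )
    return generate(review_prompt)
```

Passing the context again in the second call matters: without it the model has nothing to check the draft against.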
Answer abstention
Train (or prompt) the model to say 'I don't know' when context is insufficient. Counter-intuitive: this INCREASES user trust. Users tolerate 'I don't know' but never forget a confident wrong answer.
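Besides prompting, abstention can also be enforced in code before the model is called at all. The sketch below assumes the retriever returns (text, score) pairs where a higher score means more relevant, and that 0.5 is a sensible cutoff. Both are assumptions to tune for your own retriever, and generate is again a hypothetical prompt-to-text callable.

```python
from typing import Callable

ABSTAIN_MESSAGE = "I don't know."


def answer_or_abstain(
    question: str,
    hits: list[tuple[str, float]],   # assumed shape: (passage text, relevance score)
    generate: Callable[[str], str],  # hypothetical: prompt in, model text out
    min_score: float = 0.5,          # assumed threshold; tune per retriever
) -> str:
    # Keep only passages the retriever considers relevant enough.
    relevant = [text for text, score in hits if score >= min_score]
    if not relevant:
        # Nothing usable was retrieved: abstain without calling the model.
        return ABSTAIN_MESSAGE

    context = "\n\n".join(relevant)
    prompt = (
        "Answer ONLY from the context below. If the context is insufficient, "
        f"say {ABSTAIN_MESSAGE!r}\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return generate(prompt)
```

The gate handles the case where retrieval found nothing useful. The prompt instruction still covers the case where passages were retrieved but do not contain the answer.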
Measure hallucination rate
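Maintain a benchmark of 100 questions where you know the right answer. After each prompt change, count three outcomes: correct, incorrect, and abstained. Track the counts over time. A reduction from 12% to 3% wrong is a win even if abstention goes from 5% to 15%: correct answers barely move (83% to 82%), while confident wrong answers fall by three quarters.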
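The three-way count can be sketched as below. Grading here is a crude case-insensitive substring match against the expected answer, and abstention is detected by looking for two fixed phrases. Both are simplifying assumptions; a real benchmark may need human or model-based grading.

```python
from collections import Counter

# Assumed phrases that mark an abstention; extend to match your prompt.
ABSTAIN_MARKERS = ("i don't know", "i do not know")
LABELS = ("correct", "incorrect", "abstained")


def grade(answer: str, expected: str) -> str:
    """Label one model answer as correct, incorrect, or abstained."""
    text = answer.strip().lower()
    if any(marker in text for marker in ABSTAIN_MARKERS):
        return "abstained"
    # Crude check: the expected answer appears somewhere in the response.
    return "correct" if expected.strip().lower() in text else "incorrect"


def hallucination_report(pairs: list[tuple[str, str]]) -> dict[str, float]:
    """pairs holds (model_answer, expected_answer); returns the share per label."""
    if not pairs:
        raise ValueError("benchmark is empty")
    counts = Counter(grade(answer, expected) for answer, expected in pairs)
    return {label: counts[label] / len(pairs) for label in LABELS}
```

Run it on the same question set after every prompt change and store the three numbers together with the prompt version, so a drop in wrong answers can be weighed against any rise in abstentions.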