Where to put critical info
Put it at the start or the end. Attention to the middle of a long context drops sharply. Repeat critical instructions at both ends for insurance.
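A minimal sketch of the "both ends" placement, assuming a plain-text prompt. The function name `build_prompt` and the `INSTRUCTION:` / `REMINDER:` labels are hypothetical, chosen only for illustration.

```python
def build_prompt(critical: str, context: str, question: str) -> str:
    """Put the critical instruction at the start and repeat it at the end."""
    parts = [
        f"INSTRUCTION: {critical}",  # start of the prompt
        context,                     # long material sits in the middle
        question,
        f"REMINDER: {critical}",     # repeated at the end for insurance
    ]
    return "\n\n".join(parts)


prompt = build_prompt(
    critical="Answer only in JSON.",
    context="<long document text>",
    question="What is the total?",
)
```

The long context stays in the middle, where a lapse in attention costs least, and the instruction appears exactly twice.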
Cost
1M input tokens cost roughly $3-10, depending on the model. Cache aggressively. Batch similar queries.
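A rough back-of-envelope sketch of what caching can save. The 90% discount on cached tokens, the token counts, and the function names are all assumptions for illustration; real cache pricing (including any surcharge for writing to the cache) differs by provider.

```python
def input_cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost of sending `tokens` input tokens at a given price per 1M tokens."""
    return tokens / 1_000_000 * price_per_million


def cached_total_usd(prefix_tokens: int, fresh_tokens: int, calls: int,
                     price_per_million: float, cache_discount: float = 0.9) -> float:
    """Total input cost when a shared prefix is cached after the first call.

    Assumption: cached tokens are billed at (1 - cache_discount) of the
    normal price, and the first call pays full price for everything.
    """
    first = input_cost_usd(prefix_tokens + fresh_tokens, price_per_million)
    later_tokens = prefix_tokens * (1 - cache_discount) + fresh_tokens
    later = (calls - 1) * later_tokens / 1_000_000 * price_per_million
    return first + later


# Hypothetical workload: 100k-token shared prefix, 1k fresh tokens, 50 calls, $3 per 1M.
uncached = 50 * input_cost_usd(100_000 + 1_000, 3.0)
cached = cached_total_usd(100_000, 1_000, 50, 3.0)
print(f"uncached: ${uncached:.2f}  cached: ${cached:.2f}")
```

Under these assumptions the shared prefix dominates the bill, so caching it cuts the total by most of an order of magnitude.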
Needle in haystack
The test: retrieve a fact planted at a random position in a long context. Modern models pass simple needle tests but fail on multi-fact reasoning across long context.
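A minimal sketch of how a single-needle test case might be built. The helper names `plant_needle` and `passed`, the filler text, and the substring-match scoring are assumptions for illustration; the call to the model itself is left out.

```python
import random


def plant_needle(filler: list[str], needle: str, rng: random.Random) -> tuple[str, int]:
    """Insert `needle` at a random sentence position; return the text and the position."""
    position = rng.randint(0, len(filler))  # inclusive on both ends
    sentences = filler[:position] + [needle] + filler[position:]
    return " ".join(sentences), position


def passed(model_answer: str, expected: str) -> bool:
    """Crude scoring: the expected fact must appear in the model's answer."""
    return expected.lower() in model_answer.lower()


rng = random.Random(0)  # fixed seed so the test case is reproducible
filler = [f"Filler sentence number {i}." for i in range(1000)]
haystack, position = plant_needle(filler, "The secret code is 7421.", rng)
question = "What is the secret code?"
```

Sending `haystack` plus `question` to a model and scoring the reply with `passed(reply, "7421")` gives one data point. A multi-fact variant would plant several needles and ask a question that needs all of them.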