Where to put critical info

Start or end. Middle attention drops sharply. Repeat critical instructions at both ends for insurance.

Advertisement

Cost

1M input tokens ~ $3-10 depending on model. Cache aggressively. Batch similar queries.

Advertisement

Needle in haystack

Retrieve fact planted at random position. Modern models pass simple needle tests. Fail on multi-fact reasoning across long context.