logits[recent_tokens] /= penalty: the logits of tokens that were already used are divided by the penalty.
What you're seeing
A heuristic for keeping the model from looping on the same tokens. It is used in many production samplers, but it can degrade output quality when set too aggressively.
★ KEY TAKEAWAY
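Repetition penalty divides the logits of recent tokens by a factor (typically 1.1-1.3). Plain division only lowers a positive logit; common implementations multiply negative logits by the factor instead, so a penalized token always becomes less likely. It is a heuristic for breaking loops and can degrade quality if too aggressive.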
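The rule above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code behind this demo: the function names `apply_repetition_penalty` and `softmax`, the example logits, and the choice to multiply negative logits (rather than divide them) are all assumptions made for the sketch.

```python
import math

def apply_repetition_penalty(logits, recent_tokens, penalty=1.2):
    """Return a copy of `logits` with recently used token ids penalized.

    Hypothetical helper for illustration. Positive logits are divided by
    `penalty`; negative logits are multiplied by it (an assumption beyond
    the plain `/=` rule), so a penalized token never becomes more likely.
    """
    if penalty <= 0:
        raise ValueError("penalty must be positive")
    out = list(logits)
    for tok in set(recent_tokens):  # penalize each recent token once
        if out[tok] > 0:
            out[tok] /= penalty
        else:
            out[tok] *= penalty
    return out

def softmax(logits):
    """Convert logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits for a 4-token vocabulary; tokens 0 and 2 were used recently.
logits = [2.0, 1.0, -1.0, 0.5]
recent_tokens = [0, 2]

penalized = apply_repetition_penalty(logits, recent_tokens, penalty=1.25)
probs_before = softmax(logits)
probs_after = softmax(penalized)
```

With `penalty=1.0` the logits come back unchanged, which matches the "no effect" end of the slider. Comparing `probs_before` with `probs_after` shows probability mass moving away from the recent tokens and toward the rest of the vocabulary.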
▶ WHAT TO TRY
- Slide Penalty from 1.0 (no effect) to 2.0 (very aggressive).
- Watch the red bars: they mark the recent tokens that get penalized.