▶ Interactive Lab

Positional Encoding — Sinusoidal vs RoPE

Two ways to inject position information.

Sinusoidal (original) adds position to embeddings. RoPE (modern) rotates Q/K vectors.

What you're seeing

Sinusoidal: from the original Transformer paper ("Attention Is All You Need"). Add fixed sin/cos values of different frequencies to the token embedding before the first attention layer.
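
A minimal sketch of the sinusoidal scheme, assuming the usual formulation with base 10000 and interleaved sin/cos pairs. The function names `sinusoidal_encoding` and `add_position` are illustrative and not taken from the lab itself.

```python
import math

def sinusoidal_encoding(position, d_model, base=10000.0):
    """Return the d_model-dimensional sinusoidal encoding for one position.

    Assumes the common convention: even indices hold sin, odd indices hold cos.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    encoding = []
    for i in range(d_model // 2):
        # Each sin/cos pair shares one frequency; it drops as i grows.
        angle = position / (base ** (2 * i / d_model))
        encoding.append(math.sin(angle))  # index 2i
        encoding.append(math.cos(angle))  # index 2i + 1
    return encoding

def add_position(token_embedding, position):
    """Add the fixed encoding to a token embedding, element-wise."""
    pe = sinusoidal_encoding(position, len(token_embedding))
    return [x + p for x, p in zip(token_embedding, pe)]
```

Because the pattern is fixed, the same position always receives the same offset regardless of the token; nothing here is learned.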

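RoPE: rotate Q and K vectors by a position-dependent angle. The position dependence of the Q·K dot product then reduces to the relative offset between the two tokens. Generally extrapolates to longer sequences better than additive sinusoidal encoding; used by Llama, Mistral, and many other modern LLMs.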
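
A rough sketch of the rotation, assuming base 10000 and the interleaved pairing (dimensions 2i and 2i+1 rotated together). Some implementations pair the first half of the vector with the second half instead. The names `rope_rotate` and `dot` are illustrative.

```python
import math

def rope_rotate(vec, position, base=10000.0):
    """Rotate each pair (vec[2i], vec[2i+1]) by the angle position * theta_i."""
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(d // 2):
        theta = base ** (-2 * i / d)   # per-pair rotation frequency
        angle = position * theta
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[2 * i], vec[2 * i + 1]
        out.append(x * c - y * s)      # standard 2D rotation
        out.append(x * s + y * c)
    return out

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Same relative offset (2) at different absolute positions.
q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.25]
score_near = dot(rope_rotate(q, 3), rope_rotate(k, 1))
score_far = dot(rope_rotate(q, 12), rope_rotate(k, 10))
```

Under these assumptions `score_near` and `score_far` agree up to floating-point error, since only the offset between the two positions enters the score.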

★ KEY TAKEAWAY
Sinusoidal adds a fixed pattern to the embeddings. RoPE rotates Q/K. Both inject position info; RoPE generally extrapolates better to longer sequences.
▶ WHAT TO TRY
  • Switch between sinusoidal and RoPE.
  • Look across dimension indices in the heatmap: each index has its own frequency.
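
To put numbers on the per-dimension frequencies, here is a small sketch that lists the wavelength (in positions) of each sin/cos pair, assuming the standard base of 10000. The helper name `wavelengths` is hypothetical, and the lab's heatmap may use different settings.

```python
import math

def wavelengths(d_model, base=10000.0):
    """Wavelength, in positions, of each sin/cos pair i = 0 .. d_model/2 - 1."""
    return [2 * math.pi * base ** (2 * i / d_model) for i in range(d_model // 2)]

# Pair 0 repeats about every 6.28 positions; later pairs repeat far more slowly.
for i, w in enumerate(wavelengths(8)):
    print(f"pair {i}: wavelength ~ {w:.1f} positions")
```

Wavelengths grow geometrically with the pair index, which is why low-index dimensions would look like fine stripes and high-index dimensions like slow gradients.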