Windowing

Classify last-N tokens or last sentence. Wait for sentence boundary → classify → decide continue/cut. Adds latency but precise.

Advertisement

Chunked classification

Fixed chunk size (100 tokens). Faster but may misclassify partial thoughts.

Advertisement

Cutoff message

On cut, replace stream with 'Response terminated for policy reasons.' Don't reveal internal classifier.