Windowing
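Classify the last N tokens or the last complete sentence. With sentence windowing, buffer output until a sentence boundary, classify the buffered sentence, then decide whether to continue or cut the stream. This adds latency, since nothing is released until the sentence completes, but it is more precise because the classifier sees a whole thought.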
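The sentence-boundary variant can be sketched as a generator that holds tokens back until a sentence ends. This is a minimal illustration, not a definitive implementation: `stream_by_sentence`, `is_allowed`, and the `SENTENCE_END` regex are hypothetical names, and the boundary rule (text ending in `.`, `!`, or `?`) is a simplifying assumption that will split on abbreviations and decimals.

```python
import re
from typing import Callable, Iterable, Iterator

# Assumption: a sentence ends when the buffered text ends in ., ! or ?
SENTENCE_END = re.compile(r"[.!?]\s*$")


def stream_by_sentence(
    tokens: Iterable[str],
    is_allowed: Callable[[str], bool],
) -> Iterator[str]:
    """Release text one sentence at a time; stop at the first blocked sentence.

    `is_allowed` stands in for the real classifier (hypothetical interface):
    it takes the sentence text and returns True if the text may be shown.
    """
    buffer = []
    for token in tokens:
        buffer.append(token)
        text = "".join(buffer)
        if SENTENCE_END.search(text):
            if not is_allowed(text):
                return  # cut: nothing from this sentence is released
            yield text
            buffer.clear()
    # End of stream: classify whatever partial sentence is left.
    if buffer:
        text = "".join(buffer)
        if is_allowed(text):
            yield text
```

The latency cost is visible in the structure: no token reaches the caller until its whole sentence has been buffered and classified.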
Chunked classification
Classify fixed-size chunks (e.g. 100 tokens). Faster than waiting for sentence boundaries, but a chunk boundary can split a thought, so partial thoughts may be misclassified.
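A rough sketch of fixed-size chunking, under the same assumptions as any toy example: `stream_by_chunk` and `is_allowed` are hypothetical names, and the classifier is modeled as a plain function over a list of tokens.

```python
from typing import Callable, Iterable, Iterator, List


def stream_by_chunk(
    tokens: Iterable[str],
    is_allowed: Callable[[List[str]], bool],
    chunk_size: int = 100,
) -> Iterator[List[str]]:
    """Release tokens in fixed-size chunks; stop at the first blocked chunk.

    `is_allowed` stands in for the real classifier (hypothetical interface).
    """
    chunk: List[str] = []
    for token in tokens:
        chunk.append(token)
        if len(chunk) == chunk_size:
            if not is_allowed(chunk):
                return  # cut: the blocked chunk is never released
            yield chunk
            chunk = []
    # End of stream: classify the final, shorter chunk.
    if chunk and is_allowed(chunk):
        yield chunk
```

The misclassification risk shows up when a phrase straddles a boundary. With `chunk_size=2` and a classifier that blocks the phrase "bad word", the tokens `["a ", "bad ", "word ", "b"]` are split into "a bad " and "word b", and neither chunk contains the phrase on its own.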
Cutoff message
On a cut, replace the stream with 'Response terminated for policy reasons.' Don't reveal anything about the internal classifier.
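One way to picture the cutoff, assuming the client can be told to discard what it has already displayed: the server emits small events, and a cut is a single "replace" event carrying the fixed message. The event shape (`type`, `text`) and the names `guarded_stream` and `render` are hypothetical; the point is only that the event carries no label, score, or classifier name.

```python
from typing import Callable, Dict, Iterable, Iterator

CUTOFF_MESSAGE = "Response terminated for policy reasons."


def guarded_stream(
    segments: Iterable[str],
    is_allowed: Callable[[str], bool],
) -> Iterator[Dict[str, str]]:
    """Yield 'append' events; on a cut, yield one 'replace' event and stop.

    `is_allowed` stands in for the real classifier (hypothetical interface).
    """
    for segment in segments:
        if not is_allowed(segment):
            # Only the generic message is sent: no reason, score, or label.
            yield {"type": "replace", "text": CUTOFF_MESSAGE}
            return
        yield {"type": "append", "text": segment}


def render(events: Iterable[Dict[str, str]]) -> str:
    """Client-side view: 'replace' discards everything shown so far."""
    shown = ""
    for event in events:
        if event["type"] == "replace":
            shown = event["text"]
        else:
            shown += event["text"]
    return shown
```

Whatever was displayed before the cut is overwritten, so the user ends up seeing only the cutoff message.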