Shows how the Softmax function converts raw logits into a probability distribution, making the scores sum to 1.0 so the model can sample the next token.