Backprop Chain Rule — Belgavi.AI Lab

Forward: compute outputs. Backward: compute gradients via chain rule.

What you're seeing

A 3-layer network: x → h1 → h2 → ŷ. The forward pass computes activations layer by layer. The backward pass computes ∂L/∂params with the chain rule: it starts at the loss and works back to the weights, multiplying by each layer's local Jacobian along the way.
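The chain rule for this network can be written out as a sketch. The page does not specify the layers, so the weights W1, W2, W3, the pre-activations z1, z2, the elementwise activation f, and the error signals δ below are assumed notation, not taken from the demo.

```latex
% Assumed forward pass (biases omitted for brevity)
\begin{aligned}
z_1 &= W_1 x,   & h_1 &= f(z_1), \\
z_2 &= W_2 h_1, & h_2 &= f(z_2), \\
\hat{y} &= W_3 h_2, & L &= \ell(\hat{y}, y).
\end{aligned}

% Backward pass: each step multiplies the upstream gradient
% by the local Jacobian of one layer.
\begin{aligned}
\delta_3 &= \frac{\partial L}{\partial \hat{y}},
  & \frac{\partial L}{\partial W_3} &= \delta_3 \, h_2^{\top}, \\
\delta_2 &= \left(W_3^{\top} \delta_3\right) \odot f'(z_2),
  & \frac{\partial L}{\partial W_2} &= \delta_2 \, h_1^{\top}, \\
\delta_1 &= \left(W_2^{\top} \delta_2\right) \odot f'(z_1),
  & \frac{\partial L}{\partial W_1} &= \delta_1 \, x^{\top}.
\end{aligned}
```

Each δ is reused by the layer before it, which is why the backward pass costs roughly one extra sweep over the network rather than one sweep per parameter.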

PyTorch's autograd builds a computation graph during the forward pass and traverses it in reverse automatically. You write only the forward pass; the backward pass is derived for you.
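A minimal sketch of that workflow in PyTorch. The layer sizes, the tanh activation, the mean-squared-error loss, and the random data are all assumptions chosen for illustration; the demo page does not state them.

```python
import torch

torch.manual_seed(0)  # reproducible weights and data

# Hypothetical sizes: 4 inputs, two hidden layers of 5 units, 1 output.
layer1 = torch.nn.Linear(4, 5)
layer2 = torch.nn.Linear(5, 5)
layer3 = torch.nn.Linear(5, 1)

x = torch.randn(8, 4)  # batch of 8 made-up examples
y = torch.randn(8, 1)  # made-up regression targets

# Forward pass: autograd records each operation as it runs.
h1 = torch.tanh(layer1(x))
h2 = torch.tanh(layer2(h1))
y_hat = layer3(h2)
loss = torch.nn.functional.mse_loss(y_hat, y)

# Backward pass: one call walks the recorded graph from the loss
# back to every parameter and fills in its .grad field.
loss.backward()

for name, layer in [("layer1", layer1), ("layer2", layer2), ("layer3", layer3)]:
    print(name, tuple(layer.weight.grad.shape))
```

No gradient formula appears anywhere in the code: each `.grad` tensor has the same shape as the weight it belongs to and is produced by `loss.backward()` alone.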

★ KEY TAKEAWAY

Backward = the forward graph, traversed in reverse. Each layer's gradient is the upstream gradient multiplied by that layer's local Jacobian, which is exactly the chain rule.
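To make the takeaway concrete, here is a hand-written backward pass for a scalar version of the network, checked against finite differences. It is a sketch under assumptions: one weight per layer (`w1`, `w2`, `w3`), tanh activations, and a squared-error loss are illustrative choices, not details from the demo.

```python
import math

def forward(x, y, w1, w2, w3):
    """Scalar 3-layer net: x -> h1 -> h2 -> y_hat, with squared-error loss."""
    h1 = math.tanh(w1 * x)
    h2 = math.tanh(w2 * h1)
    y_hat = w3 * h2
    loss = 0.5 * (y_hat - y) ** 2
    return h1, h2, y_hat, loss

def backward(x, y, w1, w2, w3):
    """Walk the forward steps in reverse, multiplying by each local derivative."""
    h1, h2, y_hat, _ = forward(x, y, w1, w2, w3)
    d_y_hat = y_hat - y              # dL/dy_hat
    g_w3 = d_y_hat * h2              # dL/dw3
    d_h2 = d_y_hat * w3              # dL/dh2
    d_z2 = d_h2 * (1.0 - h2 ** 2)    # tanh'(z) = 1 - tanh(z)^2
    g_w2 = d_z2 * h1                 # dL/dw2
    d_h1 = d_z2 * w2                 # dL/dh1
    d_z1 = d_h1 * (1.0 - h1 ** 2)
    g_w1 = d_z1 * x                  # dL/dw1
    return g_w1, g_w2, g_w3

def numeric_grads(x, y, w, eps=1e-6):
    """Central finite differences, used only to check the analytic gradients."""
    grads = []
    for i in range(len(w)):
        up = list(w)
        up[i] += eps
        down = list(w)
        down[i] -= eps
        diff = forward(x, y, *up)[3] - forward(x, y, *down)[3]
        grads.append(diff / (2 * eps))
    return grads

x, y = 0.7, -0.2          # made-up input and target
w = [0.5, -1.3, 0.8]      # made-up weights
analytic = backward(x, y, *w)
numeric = numeric_grads(x, y, w)
max_err = max(abs(a - n) for a, n in zip(analytic, numeric))
print("max abs difference:", max_err)
```

The lines in `backward` mirror the lines in `forward` in reverse order, and each one reuses the gradient computed just before it.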

▶ WHAT TO TRY

Click Forward pass to see activations flow left-to-right.
Click Backward pass to see gradients flow back right-to-left.
These two passes are what PyTorch's autograd performs for you automatically, for any graph built from differentiable operations.