Matrix Multiplication Step-Through

Advertisement

Y[i,j] = sum_k X[i,k] * W[k,j]. Click Step to compute one entry at a time.

What you're seeing

For shapes [N,d] · [d,m] = [N,m]. Each output entry is a dot product of a row of X with a column of W. Cost: N·m dot products of length d. Total: O(N·m·d) multiply-adds.

BLAS libraries (OpenBLAS, MKL) implement this with cache-blocking, SIMD vectorization, and threading. ~10-100× faster than naive Python loops.

★ KEY TAKEAWAY

Matrix multiplication = a grid of dot products. Each output cell is one row × one column.

▶ WHAT TO TRY

Click Step one entry to watch one Y[i,j] get computed at a time.
Click Compute all to see the final result instantly.
Notice that BLAS libraries do this same math, but ~100× faster via cache blocking + SIMD.