Inner dimensions must match. (n×D) · (D×k) = (n×k). This is the engine of Deep Learning.
Inner Dimensions match (4 == 4) ✅Row 0 · Col 0 = Output[0, 0]
(N×D) · (D×K) = (N×K). The inner dimensions must match and collapse.
Adding a vector to a matrix? The vector repeats implicitly to match dimensions.
Loops are slow. We operate on the entire batch (N) at once.