By: Ole Kröger
Re-posted from: https://opensourc.es/blog/2021-10-11-matrix-multiplication-performance/index.html
A deep dive into the performance we can gain by thinking about cache lines and parallel code: a step-by-step guide to optimizing dense matrix multiplication.