Related: https://en.wikipedia.org/wiki/Roofline_model
A measure of whether a kernel or computation is compute-bound or memory-bound on a particular piece of hardware, which tells you how to scale and optimize it.
Definition
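As a minimal sketch, assuming the measure here is arithmetic intensity in the roofline sense: FLOPs performed divided by bytes moved to or from memory. A kernel is compute-bound when its intensity exceeds the hardware's ridge point (peak FLOP/s divided by peak memory bandwidth) and memory-bound otherwise. The hardware numbers and helper names below are hypothetical, chosen only to make the arithmetic easy to follow, and the byte count assumes each matrix is read or written exactly once.

```python
def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """FLOPs performed per byte of memory traffic."""
    return flops / bytes_moved


def ridge_point(peak_flops_per_s: float, peak_bytes_per_s: float) -> float:
    """Intensity at which the compute roof and the memory roof meet."""
    return peak_flops_per_s / peak_bytes_per_s


def classify(intensity: float, ridge: float) -> str:
    """Label a kernel relative to the hardware's ridge point."""
    return "compute-bound" if intensity >= ridge else "memory-bound"


def matmul_intensity(m: int, n: int, k: int, bytes_per_element: int = 2) -> float:
    """Idealized intensity of C[m,n] = A[m,k] @ B[k,n].

    Assumes A and B are each read once and C is written once
    (no cache effects), with 2-byte elements by default.
    """
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)
    return arithmetic_intensity(flops, bytes_moved)


# Hypothetical accelerator, not a real device: 100 TFLOP/s and 1 TB/s.
HYPOTHETICAL_PEAK_FLOPS = 100e12
HYPOTHETICAL_BANDWIDTH = 1e12
ridge = ridge_point(HYPOTHETICAL_PEAK_FLOPS, HYPOTHETICAL_BANDWIDTH)

large = matmul_intensity(4096, 4096, 4096)  # square matrix-matrix product
small = matmul_intensity(1, 4096, 4096)     # matrix-vector-like product

print(f"ridge point: {ridge:.1f} FLOPs/byte")
print(f"large matmul: {large:.1f} FLOPs/byte -> {classify(large, ridge)}")
print(f"small matmul: {small:.2f} FLOPs/byte -> {classify(small, ridge)}")
```

Under these assumptions the square matmul sits well above the ridge point while the single-row case sits far below it, which is the usual reason batching more rows together moves a kernel from memory-bound toward compute-bound.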
Note
I started digging into this topic after reading a blog post on transformer inference arithmetic a while ago. That spurred me to practice deriving this measure, and building intuition for it, on various common machine learning kernels (see examples below).