The key idea here is to avoid dependencies between work, as much as possible.

By telling the GPU which sets of work are independent of each other, the GPU can fill in the gaps in the SM’s budget with blocks from other programs to fully utilize the SM. This is known as oversubscription—an excess of work available for the GPU to do.