Building the CUDA kernels takes quite a bit of memory, and when building all of
them the combined usage can be too much on some systems (especially VMs).
Therefore, this patch adds an option to force the build system to build them
sequentially by making each build step depend on the previous kernel.
Reviewers: brecht, sergey
Differential Revision: https://developer.blender.org/D3623