Use multiple threads to build the MIS table when the resolution is
higher than 512.
Also replace the division by cdf_total with a multiplication by its
inverse, cdf_total_inv. This gives a further speedup.
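
As a rough illustration of the two changes above, here is a minimal,
self-contained sketch. It is not the code from this patch: the function
names (build_rows, build_cdf_table), the flat row-major layout, and the
use of std::thread are assumptions made for the example. Only the 512
threshold and the cdf_total / cdf_total_inv names come from this message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: build a normalized CDF for each row in [row_begin, row_end).
static void build_rows(const std::vector<float> &pdf, std::vector<float> &cdf,
                       std::size_t res, std::size_t row_begin, std::size_t row_end)
{
  for (std::size_t y = row_begin; y < row_end; ++y) {
    const float *in = &pdf[y * res];
    float *out = &cdf[y * res];
    float cdf_total = 0.0f;
    for (std::size_t x = 0; x < res; ++x) {
      cdf_total += in[x];
      out[x] = cdf_total;
    }
    if (cdf_total > 0.0f) {
      // One division per row, then a multiplication per element.
      const float cdf_total_inv = 1.0f / cdf_total;
      for (std::size_t x = 0; x < res; ++x) {
        out[x] *= cdf_total_inv;
      }
    }
  }
}

// Hypothetical entry point: pdf holds res * res values in row-major order.
std::vector<float> build_cdf_table(const std::vector<float> &pdf, std::size_t res,
                                   std::size_t threshold = 512)
{
  assert(pdf.size() == res * res);
  std::vector<float> cdf(pdf.size());

  // Only use threads above the threshold; hardware_concurrency() may return 0.
  const std::size_t hw = std::thread::hardware_concurrency();
  std::size_t num_threads = (res > threshold && hw > 1) ? hw : 1;
  num_threads = std::min(num_threads, res);
  if (num_threads <= 1) {
    build_rows(pdf, cdf, res, 0, res);
    return cdf;
  }

  // Each worker gets a contiguous block of rows, so no locking is needed.
  const std::size_t rows_per_thread = (res + num_threads - 1) / num_threads;
  std::vector<std::thread> workers;
  for (std::size_t begin = 0; begin < res; begin += rows_per_thread) {
    const std::size_t end = std::min(res, begin + rows_per_thread);
    workers.emplace_back([&pdf, &cdf, res, begin, end] {
      build_rows(pdf, cdf, res, begin, end);
    });
  }
  for (std::thread &worker : workers) {
    worker.join();
  }
  return cdf;
}
```

Rows are independent in this sketch, so splitting them into contiguous
blocks keeps the workers from touching each other's output. Below the
threshold the work stays on the calling thread, presumably because
thread start-up cost would outweigh the gain for small tables.
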
On my MacBook (8 CPU threads) this reduces the time to build the table:
Resolution 4096: From 0.16s to 0.03s
Resolution 8096: From 0.61s to 0.11s
This especially helps reduce the scene update time when tweaking the
world shader while viewport rendering is running.
Patch by Sergey and myself.
Differential Revision: https://developer.blender.org/D1159