Both spot and area light have large areas where they're not visible.
Therefore, this patch stops the light sampling code when one of these cases (outside of the spotlight cone or behind the area light) occurs, before the lamp shader is evaluated.
In the case of the area light, the solid angle sampling can also be skipped.
In a test scene with Sample All Lights and 18 Area lamps and 9 Spot lamps that all point away from the area that the camera sees, render time drops from 12sec to 5sec.
Reviewers: brecht, sergey, dingto, juicyfruit
Differential Revision: https://developer.blender.org/D2216
All the changes are mainly giving explicit tips on inlining functions,
so they match how inlining worked with previous toolkit.
This make kernel compiled by CUDA 8 render in average with same speed
as previous kernels. Some scenes are somewhat faster, some of them are
somewhat slower. But slowdown is within 1% so far.
On a positive side it allows us to enable newer generation cards on
buildbots (so GTX 10x0 will be officially supported soon).
This commit changes the way how we pass bounce information to the Light
Path node. Instead of manualy copying the bounces into ShaderData, we now
directly pass PathState. This reduces the arguments that we need to pass
around and also makes it easier to extend the feature.
This commit also exposes the Transmission Bounce Depth to the Light Path
node. It works similar to the Transparent Depth Output: Replace a
Transmission lightpath after X bounces with another shader, e.g a Diffuse
one. This can be used to avoid black surfaces, due to low amount of max
bounces.
Reviewed by Sergey and Brecht, thanks for some hlp with this.
I tested compilation and usage on CPU (SVM and OSL), CUDA, OpenCL Split
and Mega kernel. Hopefully this covers all devices. :)
This is so-called GPU limitation boundary hit, told compiler to NOT include
volume bound function, otherwise some real weird things used to happen.
We actually might want to do the same for CPU, inlining everything is not
the way to get fastest code.
* __VOLUME__ is basic volume support with Emission and Absorption.
* __VOLUME_SCATTER__ enables volume Scattering support.
* __VOLUME_DECOUPLED__ enables Decoupled Ray Marching.
* CUDA can be compiled with Volume support again, change line 78 kernel_types.h for that.
Volumes are still fragile on GPU though, got some Memory/Address CUDA errors in tests.. needs to be investigated more deeply.
* Volume multiple importace sampling support to combine equiangular and distance
sampling, for both homogeneous and heterogeneous volumes.
* Branched path "Sample All Direct Lights" and "Sample All Indirect Lights" now
apply to volumes as well as surfaces.
Implementation note:
For simplicity this is all done with decoupled ray marching, the only case we do
not use decoupled is for distance only sampling with one light sample. The
homogeneous case should still compile on the GPU because it only requires fixed
size storage, but the heterogeneous case will be trickier to get working.