Commit Graph

1872 Commits

Author SHA1 Message Date
Brecht Van Lommel
23098cda99 Code refactor: make texture code more consistent between devices.
* Use common TextureInfo struct for all devices, except CUDA fermi.
* Move image sampling code to kernels/*/kernel_*_image.h files.
* Use arrays for data textures on Fermi too, so device_vector<Struct> works.
2017-10-07 14:53:14 +02:00
Sergey Sharybin
837383ac78 Cycles: Cleanup, indendation 2017-10-06 19:33:59 +05:00
Sergey Sharybin
a950af8e24 Fix T53012: Shadow catcher creates artifacts on contact area
The issue was caused by light sample being evaluated to nan at some point.
This is root of the cause which is to be fixed, but is very hard to trace down
especially via ssh (the issue only happens on AVX2 release build). Will give it
a closer look when back to my AVX2 machine.

For until then this is a good check to have anyway, it corresponds to what's
happening in regular radiance sum.
2017-10-06 17:27:34 +05:00
Sergey Sharybin
0d3c8d0701 Cycles: Cleanup, indentation and wrapping 2017-10-06 16:54:37 +05:00
Brecht Van Lommel
4537e85584 Fix T53001: more workarounds for crash in AMD compiler with recent drivers. 2017-10-05 17:57:58 +02:00
Brecht Van Lommel
fb99ea79f8 Code refactor: split displace/background into separate kernels, remove luma. 2017-10-05 17:57:58 +02:00
Brecht Van Lommel
6da6f8d33f Cycles: CUDA faster rendering of small tiles, using multiple samples like OpenCL.
The work size is still very conservative, and this doesn't help for progressive
refine. For that we will need to render multiple tiles at the same time. But this
should already help for denoising renders that require too much memory with big
tiles, and just generally soften the performance dropoff with small tiles.

Differential Revision: https://developer.blender.org/D2856
2017-10-04 21:58:47 +02:00
Brecht Van Lommel
77f300e2a9 Fix use of uninitialized memory in Cycles normal baking. 2017-10-04 21:11:14 +02:00
Brecht Van Lommel
5bb677e592 Code refactor: zero render buffers outside of kernel.
This was originally done with the first sample in the kernel for better
performance, but it doesn't work anymore with atomics. Any benefit was
very minor anyway, too small to measure it seems.
2017-10-04 21:11:14 +02:00
Brecht Van Lommel
12f4538205 Code refactor: use split variance calculation for mega kernels too.
There is no significant difference in denoised benchmark scenes and
denoising ctests, so might as well make it all consistent.
2017-10-04 21:11:14 +02:00
Brecht Van Lommel
e3e16cecc4 Code refactor: remove rng_state buffer and compute hash on the fly.
A little faster on some benchmark scenes, a little slower on others, seems
about performance neutral on average and saves a little memory.
2017-10-04 21:11:14 +02:00
Brecht Van Lommel
5b7d6ea54b Code refactor: add WorkTile struct for passing work to kernel.
This makes sharing some code between mega/split in following commits a bit
easier, and also paves the way for rendering multiple tiles later.
2017-10-04 21:11:14 +02:00
Brecht Van Lommel
660e8e59e7 Fix T52645, T52645: AMD OpenCL compiler crash with recent drivers.
Work around the bug by reshuffling code.
2017-10-04 21:00:46 +02:00
Brecht Van Lommel
f55735e533 CMake: support CUDA 9 toolkit, and automatically disable sm_2x binaries.
Fermi cards (GTX 4xx and 5xx) are no longer supported with this version, so
we can keep supporting both CUDA 8 and 9 for a while.
2017-10-01 14:14:53 +02:00
Brecht Van Lommel
d2bbd41b4e Fix Cycles OpenCL compiler error after recent changes. 2017-09-29 14:54:10 +02:00
Brecht Van Lommel
400e6f37b8 Cycles: reduce subsurface stack memory usage.
This is done by storing only a subset of PathRadiance, and by storing
direct light immediately in the main PathRadiance. Saves about 10% of
CUDA stack memory, and simplifies subsurface indirect ray code.
2017-09-28 15:18:43 +02:00
Sergey Sharybin
cb6f07f59e Cycles: Cleanup, indentation 2017-09-25 11:15:54 +05:00
Sergey Sharybin
c0480bc972 Cycles: Fix compilation error of OpenCL megakernel on Apple 2017-09-23 17:07:19 +05:00
Sergey Sharybin
b460b8fb4a Cycles: Fix compilation error of megakernel on NVidia device
It is more readable to explicitly compare to NULL anyway.
2017-09-23 17:03:02 +05:00
Brecht Van Lommel
07ec0effb6 Code cleanup: simplify kernel side work stealing code. 2017-09-21 22:29:18 +02:00
Brecht Van Lommel
90d4b823d7 Cycles: use defensive sampling for picking BSDFs and BSSRDFs.
For the first bounce we now give each BSDF or BSSRDF a minimum sample weight,
which helps reduce noise for a typical case where you have a glossy BSDF with
a small weight due to Fresnel, but not necessarily small contribution relative
to a diffuse or transmission BSDF below.

We can probably find a better heuristic that also enables this on further
bounces, for example when looking through a perfect mirror, but I wasn't able
to find a robust one so far.
2017-09-20 19:38:08 +02:00
Brecht Van Lommel
095a01a73a Cycles: slightly improve BSDF sample stratification for path tracing.
Similar to what we did for area lights previously, this should help
preserve stratification when using multiple BSDFs in theory. Improvements
are not easily noticeable in practice though, because the number of BSDFs
is usually low. Still nice to eliminate one sampling dimension.
2017-09-20 19:38:08 +02:00
Brecht Van Lommel
b3afc8917c Code cleanup: refactor BSSRDF closure sampling, for next commit. 2017-09-20 19:38:08 +02:00
Brecht Van Lommel
d029399e6b Code cleanup: remove SOBOL_SKIP hack, seems no longer needed. 2017-09-20 19:38:08 +02:00
Brecht Van Lommel
d750d182e5 Code cleanup: remove hack to avoid seeing transparent objects in noise.
Previously the Sobol pattern suffered from some correlation issues that
made the outline of objects like a smoke domain visible. This helps
simplify the code and also makes some other optimizations possible.
2017-09-20 19:38:08 +02:00
Carlo Andreacchio
ab9079f459 Fix Cycles adaptive compile without volumes broken after recent changes.
Differential Revision: https://developer.blender.org/D2847
2017-09-18 12:52:32 +02:00
Hristo Gueorguiev
6798a061b7 Cycles: Fix compilation error with OpenCL split kernel 2017-09-16 12:33:03 +02:00
Brecht Van Lommel
32449e1b21 Code cleanup: store branch factor in PathState. 2017-09-13 15:24:14 +02:00
Brecht Van Lommel
9e258fc641 Code cleanup: avoid used of uninitialized value in case of precision issue. 2017-09-13 15:24:14 +02:00
Brecht Van Lommel
37d9e65ddf Code cleanup: abstract shadow catcher logic more into accumulation code. 2017-09-13 15:24:14 +02:00
Brecht Van Lommel
f77cdd1d59 Code cleanup: deduplicate some branched and split kernel code.
Benchmarks peformance on GTX 1080 and RX 480 on Linux is the same for
bmw27, classroom, pabellon, and about 2% faster on fishy_cat and koro.
2017-09-13 15:24:14 +02:00
Brecht Van Lommel
c4c450045d Code cleanup: tweak inlining for 2% better CUDA performance with hair. 2017-09-13 15:24:14 +02:00
Mathieu Menuet
659ba012b0 Cycles: change AO bounces approximation to do more glossy and transmission.
Rather than treating all ray types equally, we now always render 1 glossy
bounce and unlimited transmission bounces. This makes it possible to get
good looking results with low AO bounces settings, making it useful to
speed up interior renders for example.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2818
2017-09-12 15:37:35 +02:00
Brecht Van Lommel
de6ecc82ed Fix rare firefly in volume equiangular sampling when sampling short distance. 2017-09-12 12:50:44 +02:00
Brecht Van Lommel
cd6c9e9e5f Cycles: improve sample stratification on area lights for path tracing.
Previously we used a 1D sequence to select a light, and another 2D sequence
to sample a point on the light. For multiple lights this meant each light
would get a random subset of a 2D stratified sequence, which is not
guaranteed to be stratified anymore.

Now we use only a 2D sequence, split into segments along the X axis, one for
each light. The samples that fall within a segment then each are a stratified
sequence, at least in the limit. So for example for two lights, we split up
the unit square into two segments [0,0.5[ x [0,1[ and [0.5,1[ x [0,1[.

This doesn't make much difference in most scenes, mainly helps if you have a
few large area lights or some types of HDR backgrounds.
2017-09-12 12:45:29 +02:00
Brecht Van Lommel
d454a44e96 Fix Cycles bug in RR termination, probability should never be > 1.0.
This causes render differences in some scenes, for example fishy_cat
and pabellon scenes render brighter in a few spots. This is an old
bug, not due to recent RR changes.
2017-09-12 12:43:26 +02:00
Sergey Sharybin
467d92b8f1 Cycles: Tweaks to avoid compilation error of megakernel
Also moved code out of deep-inside ifdef block, otherwise it was quite confusing.
2017-09-12 13:33:46 +05:00
Sergey Sharybin
ae41a08288 Cycles: Attempt to work around compilation of sm_20 and sm_21
Disabled forceinline for those architectures, which seems to be compiling
successfully more often.

There might be ~3% slowdown based on quick tests, but better be rendering
something rather than failing to compile kernels again and again.

Those architectures will be doomed for abandon once we'll switch to toolkit 9.
2017-09-08 18:37:54 +02:00
Brecht Van Lommel
ce1f2e271d Cycles: disable fast math flags, only use a subset.
Empty BVH nodes are set to NaN which must be preserved all the way to the
tnear <= tfar test which can then give false for empty nodes. This needs
strict semantices and careful argument ordering for min() and max(), so
the second argument is used if either of the arguments is NaN.

Fixes T52635: crash in BVH traversal with SSE4.1.

Differential Revision: https://developer.blender.org/D2828
2017-09-08 15:12:37 +02:00
Brecht Van Lommel
c10ea88420 Fix T52660: CUDA volume texture rendering not working on Fermi GPUs. 2017-09-06 18:12:45 +02:00
Brecht Van Lommel
2d407fc288 Fix T52661: mesh light shader using backfacing not working, after new sampling. 2017-09-06 13:51:48 +02:00
Brecht Van Lommel
dd8016f708 Fix T52652: Cycles image box mapping has flipped textures.
This breaks backwards compatibility some in that 3 sides will be mapped
differently now, but difficult to avoid and can be considered a bugfix.
2017-09-06 13:51:45 +02:00
Sergey Sharybin
750e38a526 Cycles: Fix compilation error with CUDA after recent changes 2017-09-05 16:52:45 +02:00
Sergey Sharybin
f01e43fac3 Fix T52433: Volume Absorption color tint
Need to exit the volume stack when shadow ray laves the medium.

Thanks Brecht for review and help in troubleshooting!
2017-09-05 15:48:34 +02:00
Sergey Sharybin
b0bbb5f34f Cycles: Cleanup, style 2017-09-05 12:43:02 +02:00
Sergey Sharybin
018137f762 Cycles: Cleanup, indentation and trailing whitespace 2017-08-31 14:47:49 +02:00
Stefan Werner
68dfa0f1b7 Fixing T52477 - switching from custom ray/triangle intersection code to the one from util_intersection.h. This fixes the bug and makes the code more readable and maintainable. 2017-08-30 11:48:49 +02:00
Lukas Stockner
f9a3d01452 Cycles: Mark pixels with negative values as outliers
If a pixel has negative components, something already went wrong, so the best option is to just ignore it.

Should be good for 2.79.
2017-08-25 17:46:15 +02:00
Brecht Van Lommel
76b74a93a8 Fix Cycles CUDA transparent shadow error after recent fix in c22b52c.
Fishy cat benchmark was rendering with wrong shadows. Cause is unclear,
adding printf or rearranging code seems to avoid this issue, possibly a
compiler bug. This reverts the fix and solves the OSL bug elsewhere.
2017-08-24 03:43:02 +02:00
Brecht Van Lommel
b85d36d811 Code cleanup: remove shader context.
This was needed when we accessed OSL closure memory after shader evaluation,
which could get overwritten by another shader evaluation. But all closures
are immediatley converted to ShaderClosure now, so no longer needed.
2017-08-24 03:43:02 +02:00