Commit Graph

1819 Commits

Author SHA1 Message Date
Brecht Van Lommel
c22b52cd36 Fix T52452: OSL trace broken after shadow catcher recent changes.
We should only early out with any hit in BVH traversal if the only visibility
bits used are opaque shadow. Not when opaque shadow is one of multiple bits.
2017-08-19 18:14:16 +02:00
Brecht Van Lommel
cfa8b762e2 Code cleanup: move rng into path state.
Also pass by value and don't write back now that it is just a hash for seeding
and no longer an LCG state. Together this makes CUDA a tiny bit faster in my
tests, but mainly simplifies code.
2017-08-19 18:14:16 +02:00
Stefan Werner
7a4696197d Cycles: Fix for a division by zero that could happen with solid angle triangle light sampling 2017-08-17 15:07:59 +02:00
Stefan Werner
8141eac2f8 Improved triangle sampling for mesh lights
This implements Arvo's "Stratified sampling of spherical triangles". Similar to how we sample rectangular area lights, this is sampling triangles over their solid angle. It does significantly improve sampling close to the triangle, but doesn't do much for more distant triangles. So I added a simple heuristic to switch between the two methods. Unfortunately, I expect this to add render time in any case, even when it does not make any difference whatsoever. It'll take some benchmarking with various scenes and hardware to estimate how severe the impact is and if it is worth the change.

Reviewers: #cycles, brecht

Reviewed By: #cycles, brecht

Subscribers: Vega-core, brecht, SteffenD

Tags: #cycles

Differential Revision: https://developer.blender.org/D2730
2017-08-17 12:44:32 +02:00
Brecht Van Lommel
dc7fcebb33 Code cleanup: make L_transparent part of PathRadiance. 2017-08-13 01:19:07 +02:00
Brecht Van Lommel
7542282c06 Code cleanup: make DebugData part of PathRadiance. 2017-08-13 01:19:07 +02:00
Brecht Van Lommel
fce405059f Code cleanup: make it easier to test only Sobol, CMJ or Pseudorandom. 2017-08-13 01:19:07 +02:00
Brecht Van Lommel
8f97108353 Cycles: optimize CPU split kernel data init. 2017-08-12 20:43:34 +02:00
Brecht Van Lommel
601f94a3c2 Code cleanup: remove unused Cycles random number code. 2017-08-12 20:40:38 +02:00
Brecht Van Lommel
85ad248c36 Code cleanup: fix warning and improve terminology. 2017-08-12 13:18:05 +02:00
Sergey Sharybin
2e25754ecd Cycles: Clarify new argument in PathRadiance 2017-08-11 13:49:50 +02:00
Sergey Sharybin
bd069a89aa Fix T52229: Shadow Catcher artifacts when under transparency
Added some extra tirckery to avoid background being tinted dark with transparent
surface. Maybe a bit hacky, but seems to work fine.
2017-08-11 13:49:50 +02:00
Sergey Sharybin
176ad9ecdd Cycles: Remove ulong usage
This is a bit confusing, especially when one mixes OpenCL code where ulong equals
to uint64_t with CPU side code where ulong is expected to be something else from
the naming.

This commit makes it so we use explicit name, common on all platforms.
2017-08-09 14:08:58 +02:00
Sergey Sharybin
c961737d0f Cycles: Fix compilation error of filter kernels on 32 bit Windows
We don't enable global SSE optimizations in regular kernel, and we
keep those disabled on Linux 32bit.

One possible workaround would be to pass arguments by ccl_ref, but
that is quite a few of code which better be done accurately.
2017-08-08 22:01:17 +02:00
Sergey Sharybin
19d19add1e Cycles: Cleanup, de-duplicate function parameter list
Was only needed to sue const reference on CPU. Now it is done using ccl_ref.
2017-08-08 15:27:25 +02:00
Sergey Sharybin
fd397a7d28 Cycles: Add utility macro ccl_ref
It is defined to & for CPU side compilation, and defined to an empty for any GPU
platform. The idea here is to use this macro instead of #ifdef block with bunch
of duplicated lines just to make it so CPU code is efficient.

Eventually we might switch to references on CUDA as well, but that would require
some intensive testing.
2017-08-08 15:27:25 +02:00
Mai Lavelle
ec8ae4d5e9 Cycles: Pack kernel textures into buffers for OpenCL
Image textures were being packed into a single buffer for OpenCL, which
limited the amount of memory available for images to the size of one
buffer (usually 4gb on AMD hardware). By packing textures into multiple
buffers that limit is removed, while simultaneously reducing the number
of buffers that need to be passed to each kernel.

Benchmarks were within 2%.

Fixes T51554.

Differential Revision: https://developer.blender.org/D2745
2017-08-08 07:12:04 -04:00
Sergey Sharybin
451ccf7396 Cycles: Cleanup, move curve intersection functions to own file
This way curve file becomes much shorter and it's also easier to write a
benchmark application to check performance before/after future changes.
2017-08-07 20:53:30 +02:00
Sergey Sharybin
77a7a7f455 Cycles: Cleanup, trailign whitespace 2017-08-07 20:53:30 +02:00
Sergey Sharybin
95fe9b2617 Cycles: Cleanup, remove bvh prefix from curve functions
Those are nothing to do with BVH, and can be used separately.
2017-08-07 20:53:30 +02:00
Sergey Sharybin
a4bbce8949 Cycles: Fix compilation error on NVidia OpenCL after recent refactor
Still need to verify this is proper thing to do for AMD OpenCL. At least now
i can compile OpenCL kernel on my laptop with sm21 card.
2017-08-07 20:52:24 +02:00
Brecht Van Lommel
fc38276d74 Fix Cycles shadow catcher objects influencing each other.
Since all the shadow catchers are already assumed to be in the footage,
the shadows they cast on each other are already in the footage too. So
don't just let shadow catchers skip self, but all shadow catchers.

Another justification is that it should not matter if the shadow catcher
is modeled as one object or multiple separate objects, the resulting
render should be the same.

Differential Revision: https://developer.blender.org/D2763
2017-08-07 17:54:26 +02:00
Sergey Sharybin
580741b317 Cycles: Cleanup, space after keyword 2017-08-07 14:47:51 +02:00
Brecht Van Lommel
ee77c1e917 Code refactor: use float4 instead of intrinsics for CPU denoise filtering.
Differential Revision: https://developer.blender.org/D2764
2017-08-07 14:01:24 +02:00
Brecht Van Lommel
a24fbf3323 Code refactor: add, remove, optimize various SSE functions.
* Remove some unnecessary SSE emulation defines.
* Use full precision float division so we can enable it.
* Add sqrt(), sqr(), fabs(), shuffle variations, mask().
* Optimize reduce_add(), select().

Differential Revision: https://developer.blender.org/D2764
2017-08-07 14:01:24 +02:00
Brecht Van Lommel
a8cc0d707e Code refactor: split defines into separate header, changes to SSE type headers.
I need to use some macros defined in util_simd.h for float3/float4, to emulate
SSE4 instructions on SSE2. But due to issues with order of header includes this
was not possible, this does some refactoring to make it work.

Differential Revision: https://developer.blender.org/D2764
2017-08-07 14:01:24 +02:00
Brecht Van Lommel
2a74f36dac Fix Cycles CUDA adaptive megakernel build error. 2017-08-07 00:27:08 +02:00
Brecht Van Lommel
45dcd20ca9 Cycles: CUDA split performance tweaks, still far from megakernel.
On Pabellon, 25.8s mega, 35.4s split before, 32.7s split after.
2017-08-05 14:32:59 +02:00
Brecht Van Lommel
cd023b6cec Cycles: remove min bounces, modify RR to terminate less.
Differential Revision: https://developer.blender.org/D2766
2017-08-04 23:11:03 +02:00
Sergey Sharybin
a280697e77 Cycles: Support "precompiled" headers in include expansion algorithm
The idea here is that it is possible to mark certain include statements
as "precompiled" which means all subsequent includes of that file will
be replaced with an empty string.

This is a way to deal with tricky include pattern happening in single
program OpenCL split kernel which was including bunch of headers about
10 times.

This brings preprocessing time from ~1sec to ~0.1sec on my laptop.
2017-08-02 20:59:19 +02:00
Brecht Van Lommel
9e929c911e Fix Cycles multi scatter GGX different render results with Clang and GCC.
The order of evaluation of function arguments is undefined, and the order
was reversed between these compilers. This was causing regressions tests
to give different results between Linux and macOS.
2017-07-23 23:25:12 +02:00
Brecht Van Lommel
e982ebd6d4 Fix T52152: allow zero roughness for Cycles principled BSDF, don't clamp. 2017-07-22 23:58:51 +02:00
Brecht Van Lommel
ec831ee7d1 Fix Cycles denoising NaNs with a 1 sample renders.
This was causing different render results with different compilers. We
can't do much useful with 1 sample, but better for debugging.
2017-07-22 23:58:51 +02:00
Brecht Van Lommel
3b12a71972 Fix T52125: principled BSDF missing with macOS OpenCL. 2017-07-20 15:15:43 +02:00
Stefan Werner
c1ca3c8038 Cycles: fixed the SM_2x CUDA kernel build that I broke in my previous commit 2017-07-20 13:28:34 +02:00
Stefan Werner
4bc6faf9c8 Fix T52107: Color management difference when using multiple and different GPUs together
This commit unifies the flattened texture slot names for bindless and regular CUDA textures. Texture indices are now identical across all CUDA architectures, where before Fermi used different indices, which lead to problems when rendering on multi-GPU setups mixing Fermi with newer hardware.
2017-07-20 10:03:27 +02:00
Sergey Sharybin
5f35682f3a Fix T52021: Shadow catcher renders wrong when catcher object is behind transparent object
Tweaked the path radiance summing and alpha to accommodate for possible contribution of
light by transparent surface bounces happening prior to shadow catcher intersection.

This commit will change the way how shadow catcher results looks when was behind semi
transparent object, but the old result seemed to be fully wrong: there were big artifacts
when alpha-overing the result on some actual footage.
2017-07-18 09:46:21 +02:00
Sergey Sharybin
d8906f30d3 Cycles: Remove meaningless camera ray check
In branched path tracing main loop is always a camera ray, with varying
number of transparent bounces.
2017-07-18 09:27:36 +02:00
Mai Lavelle
1f933c94a7 Cycles: Fix comparison in principled BSDF
Could have lead to black pixels.
2017-07-11 23:41:22 -04:00
Brecht Van Lommel
29ec0b1162 Fix T52027: OSL getattribute() crash, when optimizer calls it before rendering. 2017-07-11 22:39:51 +02:00
Lukas Stockner
15fd758bd6 Fix T51950: Abnormally long Cycles OpenCL GPU render times with certain panoramic camera settings
The problem here was that when a "invalid" path is generated by the panoramic camera, it was tagged
as RAY_TO_REGENERATE with the intention of generating a new path in kernel_buffer_update.

However, since that state was not handled in kernel_queue_enqueue, kernel_buffer_update did not
process the path which resulted in an infinite loop.
2017-07-03 18:26:19 +02:00
Brecht Van Lommel
29c8c50442 Fix T51956: color noise with principled sss, radius 0 and branched path. 2017-07-02 19:21:08 +02:00
Brecht Van Lommel
52b9516e03 Fix principled BSDF incorrectly missing subsurface component with base color black. 2017-07-02 18:22:24 +02:00
Lukas Stockner
1f3fd8e60a Fix T51909: Cycles: Uninitialized closure normals for the Hair BSDF
As the title says, the normal wasn't set for the Hair BSDF because it wasn't
needed before. However, the denoiser uses it to store the feature passes, so
it needs to be set now.
2017-06-28 21:32:02 +02:00
Lukas Stockner
1979176088 Cycles: Fix excessive sampling weight of glossy Principled BSDF components
If there was any specularity in the Principled BSDF, it would get a sampling
weight of one regardless of its actual impact.

This commit makes Cycles estimate the contribution of the component and adjust
the weighting accordingly, which greatly improves the noise characteristics of
the Principled BSDF in many cases.

Note that this commit might slightly change the brightness of areas when using
MultiGGX and high roughnesses, but the new brightness is more accurate and
closer to the result of Branched Path Tracing. See T51836 for details.

Differential Revision: https://developer.blender.org/D2677
2017-06-22 00:09:56 +02:00
Lukas Stockner
8cb741a598 Fix T51836: Cycles: Fix incorrect PDF approximations of the MultiGGX closures
The PDF of the MultiGGX sampling is approximated by the singlescattering GGX
term as well as a scaled diffuse term that makes up for the energy in the
multiscattering component that's missed by GGX.

However, there were two problems with the glossy terms: The diffuse term missed
a normalization factor, and the singlescattering term was not properly scaled
down based on the albedo estimate.

The glass term was completely wrong and has been rewritten. It uses the fresnel
factor to weight reflection vs. refraction and uses the glossy MultiGGX model
for reflection.
For refraction, the correct singlescattering term is now used, and a new
albedo approximation is used that was derived by evaluating GGX albedo for
roughnesses from 0 to 1 and IORs from 1 to 3 and fitting numerical
approximations to it. The resulting model has a mean relative error of 9e-5,
but could probably be simplified without losing noticable accuracy in the
final render.

The improved PDFs help with glossy highlights (due to better light sampling vs.
closure sampling MIS) and fix the situation described in T51836 where mixing
MultiGGX with other closures (as it happens in e.g. the Principled
BSDF) causes incorrect darkening.
2017-06-22 00:09:56 +02:00
Brecht Van Lommel
14ea0c5fcc Fix T51849: change Cycles clearcoat gloss to roughness.
This is compatible with UE4 and more consistent with specular and transmission
roughness, even if it deviates from the original Disney BRDF.
2017-06-21 19:55:20 +02:00
Sergey Sharybin
64aa0cff89 Cycles: Fix typo in comment 2017-06-14 09:54:07 +02:00
Sergey Sharybin
40c04dd649 Cycles: Cleanup, indentation 2017-06-13 10:28:38 +02:00
Sergey Sharybin
0aa5431998 Cycles: Fix compilation error of OpenCL mega kernel
Was some mismatch in address space. Seems to be caused by recent additions.

Additionally, moved decoupled ray marching functions under ifdef, so they
don't try to use malloc() functions.

Thanks Mai for testing the patch!
2017-06-13 10:26:45 +02:00