Commit Graph

2742 Commits

Author SHA1 Message Date
Sergey Sharybin
70424195a8 Cycles: Fix possible access to non-initialized light sample in volume
Happened in the barbershop file when the number of bounces to the light was
reached.

Differential Revision: https://developer.blender.org/D13336
2021-11-23 16:38:15 +01:00
Brecht Van Lommel
48c2b4012f Merge branch 'blender-v3.0-release' 2021-11-22 21:06:10 +01:00
Brecht Van Lommel
29681f186e Fix T93283: Cycles render error with CUDA CPU + GPU after recent optimization
BVH2 triangle intersection was broken on the GPU since packed floats can't
be loaded directly into SSE. The better long term solution for performance
would be to build a BVH2 for GPU and Embree for CPU, similar to what we do
for OptiX.
2021-11-22 21:02:46 +01:00
Brecht Van Lommel
e2b736aa40 Fix part of T93278: transparent glass option not working with environment pass 2021-11-22 20:58:09 +01:00
Brecht Van Lommel
06a2e2b28c Merge branch 'blender-v3.0-release' 2021-11-19 18:05:17 +01:00
Brecht Van Lommel
1b686c60b5 Fix T93046: Cycles world volume rendering very slow in OptiX with some scenes
With very long ray distance, OptiX ends up traversing many BVH nodes due to
a feature that improves precision. However this causes very slow rendering.

We now avoid generating such long rays by rejecting the few samples that have
long ray distances and very low probability of being generated. This should not
meaningfully affect render results.

Thanks to Sergey and Patrick for the investigation.
2021-11-19 17:42:22 +01:00
Brecht Van Lommel
1b94c53aa6 Cleanup: fix typos in comments and docs
Contributed by luzpaz.

Differential Revision: https://developer.blender.org/D10447
2021-11-19 13:02:16 +01:00
Brecht Van Lommel
167ee8f2c7 Merge branch 'blender-v3.0-release' 2021-11-18 19:37:48 +01:00
Brecht Van Lommel
fd2a155d06 Fix T91797: Cycles volume rendering artifact with overlapping volumes
With the new volume rendering code this was no longer accurate, we always
need to use a new dimension for the next volume segment.
2021-11-18 19:27:37 +01:00
Sybren A. Stüvel
ada6742601 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-18 17:58:26 +01:00
Brecht Van Lommel
f0be276514 Fix T93082: Cycles baking not handling transparency correctly
For baking, replace the transparent BSDF with a holdout. This ensures no
objects behind are baked, and that the baked image has alpha.
2021-11-18 17:13:16 +01:00
Michael Jones
d1f944c186 Cycles: declare constants at program scope on Metal
MSL requires that constant address space literals be declared at program
scope. This patch moves the `blackbody_table_r/g/b` and `cie_colour_match`
constants into separate files so they can be declared at the appropriate scope.

Ref T92212

Differential Revision: https://developer.blender.org/D13241
2021-11-18 14:38:05 +01:00
Michael Jones
d19e35873f Cycles: several small fixes and additions for MSL
This patch contains many small leftover fixes and additions that are
required for Metal-enablement:

- Address space fixes and a few other small compile fixes
- Addition of missing functionality to the Metal adapter headers
- Addition of various scattered `__KERNEL_METAL__` blocks (e.g. for
  atomic support & maths functions)

Ref T92212

Differential Revision: https://developer.blender.org/D13263
2021-11-18 14:38:02 +01:00
Brecht Van Lommel
c0d52db783 Merge branch 'blender-v3.0-release' 2021-11-18 14:33:43 +01:00
Brecht Van Lommel
bd2e3bb7bd Fix T93045: Cycles HIP not rendering OpenVDB volumes
Build HIP kernels with NanoVDB, and patch NanoVDB to work with HIP.

This is a header only library so no rebuild is needed. The changes are being
submitted upstream to openvdb, so this patch should be temporary.

Thanks Thomas for help testing this.
2021-11-18 13:24:56 +01:00
Brecht Van Lommel
fa7a6d67a8 Fix Cycles CUDA/HIP compiler error after recent changes 2021-11-17 19:56:18 +01:00
Sebastian Herholz
d9bc8f189c Cycles: add build option to enable a debugging feature for MIS
This patch adds a CMake option "WITH_CYCLES_DEBUG" which builds cycles with
a feature that allows debugging/selecting the direct-light sampling strategy.
The same option may later be used to add other debugging features that could
affect performance in release builds.

The three options are:
* Forward path tracing (e.g., via BSDF or phase function)
* Next-event estimation
* Multiple importance sampling combination of the previous two methods

Such a feature is useful for debugging different light sampling, evaluation,
and pdf methods (e.g., for light sources and BSDFs).
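
As a rough illustration of how the three strategies listed above differ, the sketch below assigns a weight to each of the two ways a light contribution can be found. The enum, the function names, and the use of the balance heuristic are assumptions made for this example and are not taken from the patch; the renderer may well combine the two techniques with a different heuristic.

```cpp
#include <cassert>

// Hypothetical names for the three strategies (not the Cycles identifiers).
enum class DirectLightStrategy { Forward, NextEvent, MIS };

// Balance heuristic: weight for a sample drawn with pdf_a when a second
// technique with pdf_b could have produced the same sample.
inline float balance_heuristic(const float pdf_a, const float pdf_b)
{
  assert(pdf_a >= 0.0f && pdf_b >= 0.0f);
  const float sum = pdf_a + pdf_b;
  return (sum > 0.0f) ? pdf_a / sum : 0.0f;
}

// Weight for light found by forward path tracing (BSDF or phase sampling).
inline float forward_weight(const DirectLightStrategy strategy,
                            const float pdf_forward,
                            const float pdf_nee)
{
  switch (strategy) {
    case DirectLightStrategy::Forward:
      return 1.0f;
    case DirectLightStrategy::NextEvent:
      return 0.0f;
    case DirectLightStrategy::MIS:
      return balance_heuristic(pdf_forward, pdf_nee);
  }
  return 0.0f;
}

// Weight for light found by next-event estimation (explicit light sampling).
inline float nee_weight(const DirectLightStrategy strategy,
                        const float pdf_forward,
                        const float pdf_nee)
{
  switch (strategy) {
    case DirectLightStrategy::Forward:
      return 0.0f;
    case DirectLightStrategy::NextEvent:
      return 1.0f;
    case DirectLightStrategy::MIS:
      return balance_heuristic(pdf_nee, pdf_forward);
  }
  return 0.0f;
}
```

With either single strategy selected, one technique carries the full weight and the other is switched off, which makes it possible to compare each estimator against the combined result.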

Differential Revision: https://developer.blender.org/D13152
2021-11-17 18:03:56 +01:00
Brecht Van Lommel
063ad8635e Cycles: reduce triangle memory usage with packed_float3
Depends on D13243

Differential Revision: https://developer.blender.org/D13244
2021-11-17 17:29:41 +01:00
Brecht Van Lommel
9937d5379c Cycles: add packed_float3 type for storage
Introduce a packed_float3 type for smaller storage that is exactly 3
floats, instead of 4. For computation float3 is still used since it can
use SIMD instructions.
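
The size difference can be sketched as follows. The two structs are simplified stand-ins written for this example, not the actual Cycles definitions; the 16-byte alignment on the compute type is an assumption that mimics a SIMD-friendly layout.

```cpp
#include <cassert>
#include <cstddef>

// Compute type (stand-in): padded to 16 bytes so it can map onto a SIMD register.
struct alignas(16) float3 {
  float x, y, z;
};

// Storage type (stand-in): exactly three floats, no padding.
struct packed_float3 {
  float x, y, z;
};

static_assert(sizeof(float3) == 16, "compute type occupies four floats");
static_assert(sizeof(packed_float3) == 12, "storage type occupies three floats");

// Narrow a computed value for storage.
inline packed_float3 pack(const float3 &v)
{
  return packed_float3{v.x, v.y, v.z};
}

// Widen a stored value for computation.
inline float3 unpack(const packed_float3 &p)
{
  return float3{p.x, p.y, p.z};
}

// Bytes saved when `count` values are stored packed instead of padded.
inline std::size_t bytes_saved(const std::size_t count)
{
  return count * (sizeof(float3) - sizeof(packed_float3));
}
```

Under these assumptions, storing one million vectors in the packed form saves four megabytes, at the cost of a conversion whenever a value is loaded for computation.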

Ref T92212

Differential Revision: https://developer.blender.org/D13243
2021-11-17 17:29:41 +01:00
Hans Goudey
c9fb08e075 Merge branch 'blender-v3.0-release' 2021-11-16 14:55:13 -06:00
Brecht Van Lommel
7293c1b357 Fix T93106: Cycles SSS not working with normals pointing inside 2021-11-16 19:44:45 +01:00
Michael Jones
64003fa4b0 Cycles: Adapt volumetric lambda functions to work on MSL
This patch adapts the existing volumetric read/write lambda functions for Metal. Lambda expressions are not supported on MSL, so two new macros `VOLUME_READ_LAMBDA` and `VOLUME_WRITE_LAMBDA` have been defined with a default implementation which, on Metal, is overridden to use inline function objects.
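
A minimal sketch of the general technique, assuming nothing about the real macro signatures: one macro expands to an ordinary lambda, the other to a local function object that holds the captured value as a member. The `SKETCH_...` names and the helper functions are hypothetical and exist only for this example.

```cpp
#include <cassert>

// Default form: an ordinary lambda capturing one value by copy.
#define SKETCH_READ_LAMBDA(name, captured_type, captured, ...) \
  auto name = [captured](const int i) -> float { __VA_ARGS__ };

// Lambda-free form: a local function object holding the same captured value.
#define SKETCH_READ_FUNCTOR(name, captured_type, captured, ...) \
  struct name##_functor { \
    captured_type captured; \
    float operator()(const int i) const { __VA_ARGS__ } \
  } name = {captured};

// Sums `count` entries through the lambda form.
inline float sum_with_lambda(const float *values, const int count)
{
  SKETCH_READ_LAMBDA(read_value, const float *, values, return values[i];)
  float total = 0.0f;
  for (int i = 0; i < count; i++) {
    total += read_value(i);
  }
  return total;
}

// Sums `count` entries through the function-object form; the call site is identical.
inline float sum_with_functor(const float *values, const int count)
{
  SKETCH_READ_FUNCTOR(read_value, const float *, values, return values[i];)
  float total = 0.0f;
  for (int i = 0; i < count; i++) {
    total += read_value(i);
  }
  return total;
}
```

Because both expansions produce an object that is called as `read_value(i)`, the code using the macro does not need to know which form was selected for the target platform.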

This patch also removes the last remaining mention of the now-unused `ccl_addr_space`.

Ref T92212

Reviewed By: leesonw

Maniphest Tasks: T92212

Differential Revision: https://developer.blender.org/D13234
2021-11-16 13:42:23 +00:00
Campbell Barton
1143bf281a Cleanup: spelling in comments, comment block formatting 2021-11-13 13:07:13 +11:00
Campbell Barton
acc800d24d Cleanup: clang-format 2021-11-13 12:47:18 +11:00
Brecht Van Lommel
1b55b911f2 Merge branch 'blender-v3.0-release' 2021-11-12 20:04:05 +01:00
Brecht Van Lommel
b4d9b8b7f8 Fix T91893, T92455: wrong transmission pass with hair and multiscatter glass
We need to increase GPU memory usage a bit. Unfortunately we can't get away
with writing either reflection or transmission passes because these BSDFs may
scatter in either direction but still must be in a fixed reflection or
transmission category to match up with the color passes.
2021-11-12 20:03:46 +01:00
Brecht Van Lommel
ef0b8d6306 Fix T92002: no Cycles combined baking support for filter settings 2021-11-12 20:03:46 +01:00
Sergey Sharybin
ce395c84a3 Merge branch 'blender-v3.0-release' 2021-11-11 15:29:35 +01:00
Sergey Sharybin
d26d3cfe19 Fix T92868: Cycles catcher with transparency crashes
The issue was caused by splitting happening twice.

Fixed by checking for the split flag, which is assigned to both states
during the split.

The tricky part was to write catcher data at the moment of split: the
transparency and shadow catcher sample count is to be accumulated at
that point. Now it is happening in the `intersect_closest` kernel.
The downside is that render buffer is to be passed to the kernel, but
the benefit is that extra split bounce check is not needed now.

Had to move the passes write to shadow catcher header, since include
of `film/passes.h` causes all the fun of requirement to have BSDF
data structures available.

Differential Revision: https://developer.blender.org/D13177
2021-11-11 15:21:35 +01:00
Andrii
c63e735f6b Cycles: Add sample offset option
This patch exposes the sampling offset option to Blender. It is located in the "Sampling > Advanced" panel.
For example, this can be useful to parallelize rendering and distribute different chunks of samples for each computer to render.
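
One way to plan such a distribution is sketched below. The helper and its struct are hypothetical and not part of the patch; the sketch assumes that a machine given offset `o` and count `n` renders samples `o` to `o + n - 1`, and it leaves merging the partial results out of scope.

```cpp
#include <cassert>
#include <vector>

// One machine's share of the work (hypothetical helper type).
struct SampleChunk {
  int offset;  // value to enter as the sample offset
  int count;   // number of samples this machine renders
};

// Split `total_samples` into `num_machines` contiguous, non-overlapping chunks.
// Any remainder is spread over the first machines, one extra sample each.
inline std::vector<SampleChunk> split_samples(const int total_samples, const int num_machines)
{
  assert(total_samples >= 0 && num_machines > 0);
  std::vector<SampleChunk> chunks;
  const int base = total_samples / num_machines;
  const int remainder = total_samples % num_machines;
  int offset = 0;
  for (int i = 0; i < num_machines; i++) {
    const int count = base + (i < remainder ? 1 : 0);
    chunks.push_back(SampleChunk{offset, count});
    offset += count;
  }
  return chunks;
}
```

For 1000 samples on three machines this yields offsets 0, 334 and 667 with counts 334, 333 and 333.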

---

I also had to add this option to `RenderWork` and `RenderScheduler` classes so that the sample count in the status string can be calculated correctly.

Reviewed By: leesonw

Differential Revision: https://developer.blender.org/D13086
2021-11-11 09:39:25 +01:00
Brecht Van Lommel
3fa86f4b28 Merge branch 'blender-v3.0-release' 2021-11-10 20:19:09 +01:00
Brecht Van Lommel
6b0008129e Fix T92972: Cycles HIP wrong render display after a recent refactor
It's unclear why this fails. Maybe the size of half4 is not the expected
8 bytes and adjacent pixels are overwritten. Or there is some bug in the
HIP compiler writing a struct into global memory, which we probably don't
do elsewhere in the kernel.

Thanks to Thomas, William and Jeroen for helping investigate this.
2021-11-10 20:03:07 +01:00
Patrick Mours
f565620435 Fix T92985: CUDA errors with Cycles film convert kernels
rB3a4c8f406a3a3bf0627477c6183a594fa707a6e2 changed the macros that create the film
convert kernel entry points, but in the process accidentally changed the parameter definition
to one of those (which caused CUDA launch and misaligned address errors) and changed the
implementation as well. This restores the correct implementation from before.

In addition, the `ccl_gpu_kernel_threads` macro did not work as intended and caused the
generated launch bounds to end up with an incorrect input for the second parameter (it was
set to "thread_num_registers", rather than the result of the block number calculation). I'm
not entirely sure why, as the macro definition looked sound to me. Decided to simply go with
two separate macros instead, to simplify and solve this.

Also changed how state is captured with the `ccl_gpu_kernel_lambda` macro slightly, to avoid
a compiler warning (expression has no effect) that otherwise occurred.

Maniphest Tasks: T92985

Differential Revision: https://developer.blender.org/D13175
2021-11-10 15:49:50 +01:00
Michael Jones
3a4c8f406a Cycles: Adapt shared kernel/device/gpu layer for MSL
This patch adapts the shared kernel entrypoints so that they can be compiled as MSL (Metal Shading Language). Where possible, the adaptations avoid changes in common code.

In MSL, kernel function inputs are explicitly bound to resources. In the case of argument buffers, we declare a struct containing the kernel arguments, accessible via device pointer. This differs from CUDA and HIP where kernel function arguments are declared as traditional C-style function parameters. This patch adapts the entrypoints declared in kernel.h so that they can be translated via a new `ccl_gpu_kernel_signature` macro into the required parameter struct + kernel entrypoint pairing for MSL.

MSL buffer attribution must be applied to function parameters or non-static class data members. To allow universal access to the integrator state, kernel data, and texture fetch adapters, we wrap all of the shared kernel code in a `MetalKernelContext` class. This is achieved by bracketing the appropriate kernel headers with "context_begin.h" and "context_end.h" on Metal. When calling deeper into the kernel code, we must reference the context class (e.g. `context.integrator_init_from_camera`). This extra prefixing is performed by a set of defines in "context_end.h". These will require explicit maintenance if entrypoints change. We invite discussion on more maintainable ways to enforce correctness.

Lambda expressions are not supported on MSL, so a new `ccl_gpu_kernel_lambda` macro generates an inline function object, optionally capturing any required state. This yields the same behaviour. This approach is applied to all parallel_... implementations which are templated by operation. The lambda expressions in the film_convert... kernels don't adapt cleanly to use function objects. However, these entrypoints can be macro-generated more concisely to avoid lambda expressions entirely, instead relying on constant folding to handle the pixel/channel conversions.

A separate implementation of `gpu_parallel_active_index_array` is provided for Metal to work around some subtle differences in SIMD width, and also to encapsulate some required thread parameters which must be declared as explicit entrypoint function parameters.

Ref T92212

Reviewed By: brecht

Maniphest Tasks: T92212

Differential Revision: https://developer.blender.org/D13109
2021-11-09 21:43:10 +00:00
Brecht Van Lommel
5f44298280 Fix T92645: Cycles OSL crash due to use of uninitialized pointer
Thanks to Ilja Razinkov for identifying the problem and solution.
2021-11-09 15:29:41 +01:00
Patrick Mours
440a3475b8 Cycles: Improve OptiX denoising with dark images and fix crash when denoiser is destroyed
Adds a pass before denoising that calculates the intensity of the image, which can be
passed into the OptiX denoiser for more optimal results for very dark or very bright images.

In addition this also fixes a crash that sometimes occurred on exit. The OptiX denoiser object
has to be destroyed before the OptiX device context object (since it references that). But in
C++ the destructor function of a class is called before its fields are destructed, so
"~OptiXDevice" was always called before "OptiXDevice::~Denoiser" and therefore
"optixDeviceContextDestroy" was called before "optixDenoiserDestroy", hence the crash.
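
The ordering rule behind this crash can be reproduced with a small stand-alone sketch. All types here are simplified stand-ins that only log their teardown; the "fixed" layout shows one possible way to get the required order and is an assumption, not necessarily what the patch does.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which teardown steps run.
static std::vector<std::string> teardown_log;

// Stand-in for a denoiser that references a device context.
struct Denoiser {
  ~Denoiser() { teardown_log.push_back("denoiser destroyed"); }
};

// Problematic layout: the owner's destructor body releases the context,
// and only afterwards is the denoiser member destroyed.
struct DeviceBroken {
  Denoiser denoiser;
  ~DeviceBroken() { teardown_log.push_back("context destroyed"); }
};

// Stand-in for the device context, wrapped so it can be held as a member.
struct Context {
  ~Context() { teardown_log.push_back("context destroyed"); }
};

// Possible fix: members are destroyed in reverse declaration order,
// so the denoiser goes first and the context last.
struct DeviceFixed {
  Context context;
  Denoiser denoiser;
};

inline std::vector<std::string> teardown_order_broken()
{
  teardown_log.clear();
  {
    DeviceBroken device;
    (void)device;
  }
  return teardown_log;
}

inline std::vector<std::string> teardown_order_fixed()
{
  teardown_log.clear();
  {
    DeviceFixed device;
    (void)device;
  }
  return teardown_log;
}
```

In the broken layout the context is logged as destroyed before the denoiser; in the fixed layout the order is reversed.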

Differential Revision: https://developer.blender.org/D13160
2021-11-09 14:49:00 +01:00
Brecht Van Lommel
c56cf50bd0 Fix T92876: Cycles incorrect volume emission + absorption handling 2021-11-09 13:04:58 +01:00
Brecht Van Lommel
97ff37bf54 Cycles: perform CPU film reading in the kernel, to use AVX2 half conversion
Adds a bunch of CPU kernel functions to process one row of pixels, and uses those
instead of calling unoptimized implementations.

Fixes T92598
2021-11-05 22:04:36 +01:00
Brecht Van Lommel
d1a9425a2f Fix T91733, T92486: Cycles wrong shadow catcher with volumes
Changes:
* After hitting a shadow catcher, re-initialize the volume stack taking
  into account shadow catcher ray visibility. This ensures that volume objects
  are included in the stack only if they are shadow catchers.
* If there is a volume to be shaded in front of the shadow catcher, the split
  is now performed in the shade_volume kernel after volume shading is done.
* Previously the background pass behind a shadow catcher was done as part of
  the regular path, now it is done as part of the shadow catcher path.

For a shadow catcher path with volumes and visible background, operations are
done in this order now:

* intersect_closest
* shade_volume
* shadow catcher split
* intersect_volume_stack
* shade_background
* shade_surface

The world volume is currently assumed to be CG, that is, it does not exist in
the footage. We may consider adding an option to control this, or change the
default. With a volume object this control is already possible.

This includes refactoring to centralize the logic for next kernel scheduling
in intersect_closest.h.

Differential Revision: https://developer.blender.org/D13093
2021-11-05 20:50:19 +01:00
Brecht Van Lommel
4b56eed0f7 Fix T92566: Cycles distant lights too dim in reflections 2021-11-05 20:24:13 +01:00
Brecht Van Lommel
f24ad274cb Fix T92503: Cycles OSL crash with object attributes
Can't cast to float4 because it might not have correct alignment.
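
A common way to avoid this kind of alignment problem is to copy the bytes instead of casting the pointer. The sketch below uses a stand-in `float4` with an assumed 16-byte alignment and a hypothetical helper name; it is not the code from the fix.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for a SIMD-friendly vector type that requires 16-byte alignment.
struct alignas(16) float4 {
  float x, y, z, w;
};

// Read four floats from memory that carries no alignment guarantee.
// Dereferencing reinterpret_cast<const float4 *>(data) would be undefined
// behaviour when `data` is not 16-byte aligned; memcpy has no such requirement.
inline float4 load_float4_unaligned(const void *data)
{
  float4 result;
  std::memcpy(&result, data, sizeof(float) * 4);
  return result;
}
```

The copy lands in a properly aligned local variable, so later code can treat the result as a regular `float4`.
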
2021-11-05 20:07:03 +01:00
Brecht Van Lommel
5c34e34195 Fix part of T91797: Cycles CPU and GPU render differences with camera inside volume 2021-11-04 19:03:49 +01:00
Brecht Van Lommel
ffe115d1a8 Fix T92450: Cycles wrong render with overlapping glass, transparency and volumes
We need to store the continuation probability used to make the termination
decision in intersect_closest, instead of recomputing it in shade_surface,
because otherwise a shade_volume in between can change the throughput and
with it the probability.
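
The effect can be shown with a heavily reduced sketch. The state struct, the probability rule, and the three step functions are hypothetical stand-ins for the kernels named above, and the scalar throughput is a simplification; where exactly the real kernels apply the compensation is not stated in this message.

```cpp
#include <cassert>

// Reduced path state (hypothetical).
struct PathState {
  float throughput;                // scalar stand-in for the path throughput
  float continuation_probability;  // stored when the termination decision is made
};

// Assumed rule for this sketch: brighter paths are more likely to continue.
inline float probability_from_throughput(const float throughput)
{
  return throughput < 1.0f ? throughput : 1.0f;
}

// Step 1: decide termination and remember the probability that was used.
// `random` is a uniform number in [0, 1). Returns false if the path ends.
inline bool intersect_closest_step(PathState &state, const float random)
{
  state.continuation_probability = probability_from_throughput(state.throughput);
  return random < state.continuation_probability;
}

// Step 2 (optional): a volume in between attenuates the throughput.
inline void shade_volume_step(PathState &state, const float transmittance)
{
  assert(transmittance >= 0.0f && transmittance <= 1.0f);
  state.throughput *= transmittance;
}

// Step 3: compensate for the termination test with the stored probability.
// Recomputing it from the current throughput would use a different value
// whenever step 2 ran in between.
inline void shade_surface_step(PathState &state)
{
  assert(state.continuation_probability > 0.0f);
  state.throughput /= state.continuation_probability;
}
```

Starting from a throughput of 0.5 and a volume transmittance of 0.5, the stored probability gives a final throughput of 0.5, whereas recomputing the probability after the volume step would give 1.0.
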
2021-11-04 16:39:49 +01:00
Brecht Van Lommel
48e2a15160 Fix T77681, T92634: noise texture artifacts with high detail
We run into float precision issues here, clamp the number of octaves to
one less, which has little to no visual difference. This was empirically
determined to work up to 16 before, but with additional inputs like
roughness only 15 appears to work.

Also adds missing clamp for the geometry nodes implementation.
2021-11-02 18:56:25 +01:00
William Leeson
0b060905d9 Fix T92575: Cycles black pixels when rendering with > 65k samples
Differential Revision: https://developer.blender.org/D13039
2021-11-01 08:36:50 +01:00
Brecht Van Lommel
35f4d254fd Fix T92513: Cycles stereo pole merge not rotating along with camera 2021-10-28 22:38:07 +02:00
Brecht Van Lommel
f2cc38a62b Fix T92255: Cycles Christensen-Burley render errors with scaled objects 2021-10-28 21:53:30 +02:00
Brecht Van Lommel
673984b222 Fix T92158: Cycles crash with Fast GI and area light MIS 2021-10-28 21:33:52 +02:00
William Leeson
82cf25dfbf Cycles: Scrambling distance for the PMJ sampler
Adds scrambling distance to the PMJ sampler. This is based
on the work by Mathieu Menuet in D12318 who created the original
implementation for the Sobol sampler.

Reviewed By: brecht

Maniphest Tasks: T92181

Differential Revision: https://developer.blender.org/D12854
2021-10-27 14:21:15 +02:00
William Leeson
7b1c5712f8 Cycles: Replace saturate with saturatef
saturate is deprecated in favour of __saturatef. This replaces saturate
with __saturatef on CUDA by creating a saturatef function, which replaces
all instances of saturate and is hooked up to the correct function on all
platforms.
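
A portable fallback for such a function might look like the sketch below. This is an assumed host-side implementation written for illustration; on CUDA the function would instead forward to the `__saturatef` intrinsic, as the message describes.

```cpp
#include <cassert>

// Clamp a value to the range [0, 1] (portable sketch, not the Cycles source).
inline float saturatef(const float a)
{
  return a < 0.0f ? 0.0f : (a > 1.0f ? 1.0f : a);
}
```

Values inside the range pass through unchanged, so `saturatef(0.25f)` stays `0.25f` while `saturatef(2.0f)` becomes `1.0f`.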

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D13010
2021-10-27 14:05:46 +02:00