Commit Graph

2613 Commits

Author SHA1 Message Date
Brecht Van Lommel
1df3b51988 Cycles: replace integrator state argument macros
* Rename struct KernelGlobals to struct KernelGlobalsCPU
* Add KernelGlobals, IntegratorState and ConstIntegratorState typedefs
  that every device can define in its own way.
* Remove INTEGRATOR_STATE_ARGS and INTEGRATOR_STATE_PASS macros and
  replace with these new typedefs.
* Add explicit state argument to INTEGRATOR_STATE and similar macros

In preparation for decoupling main and shadow paths.
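
The typedef approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual Cycles code: the struct members and the helper functions `path_can_continue` and `path_next_bounce` are hypothetical, and the macros are simplified to a single member argument.

```cpp
// Simplified, hypothetical stand-ins; not the actual Cycles definitions.
struct KernelGlobalsCPU {
  int max_bounces;
};

struct IntegratorStateCPU {
  int bounce;
};

// Each device can define these typedefs in its own way. Here the CPU
// variant simply uses pointers to the structs above.
typedef const KernelGlobalsCPU *KernelGlobals;
typedef IntegratorStateCPU *IntegratorState;
typedef const IntegratorStateCPU *ConstIntegratorState;

// The state is an explicit macro argument instead of an implicit one.
#define INTEGRATOR_STATE(state, member) ((state)->member)
#define INTEGRATOR_STATE_WRITE(state, member) ((state)->member)

// Read-only access takes the const state typedef.
inline bool path_can_continue(KernelGlobals kg, ConstIntegratorState state)
{
  return INTEGRATOR_STATE(state, bounce) < kg->max_bounces;
}

// Mutating access takes the non-const state typedef.
inline void path_next_bounce(KernelGlobals kg, IntegratorState state)
{
  if (path_can_continue(kg, state)) {
    INTEGRATOR_STATE_WRITE(state, bounce) += 1;
  }
}
```

Because the state is named explicitly at every call site, a second state type (for example for shadow paths) could presumably be passed through the same macros.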

Differential Revision: https://developer.blender.org/D12888
2021-10-18 19:02:10 +02:00
Charlie Jolly
78b5050ff4 Cycles: Voronoi noise, fix uninitialised variable
Caused a debug crash in Windows MSVS.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D12873
2021-10-15 15:01:10 +01:00
Brecht Van Lommel
2f36762def Cleanup: refactor BVH2 shadow intersection for upcoming changes 2021-10-15 15:42:44 +02:00
Brecht Van Lommel
5d565062ed Cleanup: refactor OptiX shadow intersection for upcoming changes 2021-10-15 15:42:44 +02:00
Brecht Van Lommel
eb71157e2a Cleanup: add utility functions for packing integers 2021-10-15 15:42:44 +02:00
Brecht Van Lommel
2ba7c3aa65 Cleanup: refactor to make number of channels for shader evaluation variable 2021-10-15 15:42:44 +02:00
Brecht Van Lommel
53f25df5bc Fix T92128: Cycles CUDA wrong hair attributes, after recent changes 2021-10-15 15:42:44 +02:00
Michael Jones
a0f269f682 Cycles: Kernel address space changes for MSL
This is the first of a sequence of changes to support compiling Cycles kernels as MSL (Metal Shading Language) in preparation for a Metal GPU device implementation.

MSL requires that all pointer types be declared with explicit address space attributes (device, thread, etc...). There is already precedent for this with Cycles' address space macros (ccl_global, ccl_private, etc...), therefore the first step of MSL-enablement is to apply these consistently. Line-for-line this represents the largest change required to enable MSL. Applying this change first will simplify future patches as well as offering the emergent benefit of enhanced descriptiveness.

The vast majority of deltas in this patch fall into one of two cases:

- Ensuring ccl_private is specified for thread-local pointer types
- Ensuring ccl_global is specified for device-wide pointer types

Additionally, the ccl_addr_space qualifier can be removed. Prior to Cycles X, ccl_addr_space was used as a context-dependent address space qualifier, but now it is either redundant (e.g. in struct typedefs), or can be replaced by ccl_global in the case of pointer types. Associated function variants (e.g. lcg_step_float_addrspace) are also redundant.

In cases where address space qualifiers are chained with "const", this patch places the address space qualifier first. The rationale for this is that the choice of address space is likely to have the greater impact on runtime performance and overall architecture.
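
As a rough sketch of the qualifier conventions described above, assume a CPU build where the address space macros expand to nothing (on Metal they would presumably map to the `device` and `thread` address spaces). The function `sum_and_max` is hypothetical and only illustrates where the qualifiers go.

```cpp
// Assumption: on the CPU the address space macros are empty.
#define ccl_global
#define ccl_private

// Device-wide input data is marked ccl_global, the thread-local output
// pointer is marked ccl_private. Where a qualifier is chained with
// "const", the address space qualifier comes first.
inline float sum_and_max(ccl_global const float *values,
                         const int count,
                         ccl_private float *r_max)
{
  float sum = 0.0f;
  *r_max = (count > 0) ? values[0] : 0.0f;
  for (int i = 0; i < count; i++) {
    sum += values[i];
    if (values[i] > *r_max) {
      *r_max = values[i];
    }
  }
  return sum;
}
```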

The final part of this patch is the addition of a metal/compat.h header. This is partially complete and will be extended in future patches, paving the way for the full Metal implementation.

Ref T92212

Reviewed By: brecht

Maniphest Tasks: T92212

Differential Revision: https://developer.blender.org/D12864
2021-10-14 16:14:43 +01:00
Sergey Sharybin
aa46459543 Fix shadow catcher behind transparent object on GPU
The assumption about absent shadow path was wrong.

The rest of the changes are to ensure shadow paths are finished prior
to the split, so that they write to the proper passes.

The issue was caught by running regression tests on OptiX.

Differential Revision: https://developer.blender.org/D12857
2021-10-14 09:39:38 +02:00
Campbell Barton
c1c6c11ca6 Cleanup: spelling in comments 2021-10-12 17:55:02 +11:00
Brecht Van Lommel
a94343a8af Cycles: improve SSS Fresnel and retro-reflection in Principled BSDF
For details see the "Extending the Disney BRDF to a BSDF with Integrated
Subsurface Scattering" paper.

We split the diffuse BSDF into a lambertian and retro-reflection component.
The retro-reflection component is always handled as a BSDF, while the
lambertian component can be replaced by a BSSRDF.

For the BSSRDF case, we compute Fresnel separately at the entry and exit
points, which may have different normals. As the scattering radius decreases
this converges to the BSDF case.

A downside is that this increases noise for subsurface scattering in the
Principled BSDF, due to some samples going to the retro-reflection component.
However, the previous logic (also in 2.93) was simply wrong, using a
nonsensical view direction vector at the exit point. We use an importance
sampling weight estimate for the retro-reflection to try to better balance
samples between the BSDF and BSSRDF.

Differential Revision: https://developer.blender.org/D12801
2021-10-11 18:22:54 +02:00
Brecht Van Lommel
73a05ff9e8 Cycles: restore Christensen-Burley SSS
There is not enough time before the release to improve Random Walk to handle
all cases this was used for, so restore it for now.

Since there is no more path splitting in cycles-x, this can increase noise in
non-flat areas for the same number of samples, though fewer rays will be traced
also. This is fundamentally a trade-off we made in the new design and why Random
Walk is a better fit. However the importance resampling we do now does help to
reduce noise.

Differential Revision: https://developer.blender.org/D12800
2021-10-11 18:22:54 +02:00
Brecht Van Lommel
736be7cf58 Fix T91997: Cycles glass + SSS not rendering correctly 2021-10-08 16:11:02 +02:00
Sergey Sharybin
f01c4f27f9 Fix Cycles speed regression after dynamic volume stack change
Only copy required part of volume stack instead of entire stack.

Solves the time regression introduced by D12759 and avoids the need to
implement a volume stack calculation that exactly matches what the
path tracing will do (and potentially makes scenes with
a lot of volumes and a tiny bit of deeply nested ones render
faster).
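
One way to copy only the used part of a stack is sketched below. It assumes the stack ends with a terminator entry, which is an assumption for illustration and not confirmed by the commit message. All names (`VolumeStackEntry`, `SHADER_NONE`, `MAX_VOLUME_STACK_SIZE`, `copy_volume_stack`) are hypothetical.

```cpp
// Hypothetical constants and types for illustration only.
const int MAX_VOLUME_STACK_SIZE = 32;
const int SHADER_NONE = -1;

struct VolumeStackEntry {
  int object;
  int shader;
};

// Copy entries up to and including the terminator instead of the whole
// fixed-size array. Returns the number of entries written.
inline int copy_volume_stack(const VolumeStackEntry *src, VolumeStackEntry *dst)
{
  int i = 0;
  for (; i < MAX_VOLUME_STACK_SIZE - 1 && src[i].shader != SHADER_NONE; i++) {
    dst[i] = src[i];
  }
  /* Always terminate the destination stack. */
  dst[i].object = -1;
  dst[i].shader = SHADER_NONE;
  return i + 1;
}
```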

Still need to look into memory aspect of the regression, but
that is for separate patch.

Ref T92014

Maniphest Tasks: T92014

Differential Revision: https://developer.blender.org/D12790
2021-10-08 15:44:03 +02:00
Campbell Barton
de07bf2b13 Cleanup: spelling 2021-10-08 13:23:19 +11:00
Brecht Van Lommel
4ee97f129a Cleanup: remove unnecessary data from LocalIntersection 2021-10-07 21:35:24 +02:00
Brecht Van Lommel
04857cc8ef Cycles: fully decouple triangle and curve primitive storage from BVH2
Previously the storage here was optimized to avoid indirections in BVH2
traversal. This helps improve performance a bit, but makes performance
and memory usage of Embree and OptiX BVHs a bit worse also. It also adds
code complexity in other parts of the code.

Now decouple triangle and curve primitive storage from BVH2.
* Reduced peak memory usage on all devices
* Bit better performance for OptiX and Embree
* Bit worse performance for CUDA
* Simplified code:
** Intersection.prim/object now matches ShaderData.prim/object
** No more offset manipulation for mesh displacement before a BVH is built
** Remove primitive packing code and flags for Embree and OptiX
** Curve segments are now stored in a KernelCurve struct
* Also happens to fix a bug in baking with incorrect prim/object

Fixes T91968, T91770, T91902

Differential Revision: https://developer.blender.org/D12766
2021-10-06 17:52:04 +02:00
Sergey Sharybin
0194e54fd3 Fix compilation error with MSVC
MSVC does not support variable size array definition.
Use maximum possible stack, similar to the GPU case.
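
A minimal sketch of the workaround, with a hypothetical constant `MAX_STACK_SIZE` and function `sum_indices` that are not taken from the Cycles source:

```cpp
// Hypothetical compile-time upper bound on the stack size.
const int MAX_STACK_SIZE = 32;

inline int sum_indices(int size)
{
  /* `int stack[size];` would be a variable-length array, which is not
   * standard C++ and is rejected by MSVC. Use the maximum size instead. */
  int stack[MAX_STACK_SIZE];

  if (size > MAX_STACK_SIZE) {
    size = MAX_STACK_SIZE;
  }

  int sum = 0;
  for (int i = 0; i < size; i++) {
    stack[i] = i;
    sum += stack[i];
  }
  return sum;
}
```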

Not expected to have user-measurable difference.
2021-10-06 16:51:07 +02:00
Sergey Sharybin
c6275da852 Fix T91922: Cycles artifacts with high volume nested level
Make volume stack allocated conditionally, potentially based on the
actual nested level of objects in the scene.

Currently the nested level is estimated by number of volume objects.
This is an inexpensive check which is probably enough in practice
to get almost perfect memory usage and performance.

The conditional allocation is a bit tricky.

For the CPU we declare and define maximum possible volume stack,
because there are only that many integrator states on the CPU.

On the GPU we declare the outer SoA to have all volume stack elements,
but only allocate the ones actually needed. The volume stack size
actually used is passed as a preprocessor definition, which seems to be
the easiest and fastest option for the GPU state copy.

There seems to be no speed regression in the demo files on RTX6000.

Note that scenes with a high nesting level of volumes will now be slower
but correct.

Differential Revision: https://developer.blender.org/D12759
2021-10-06 15:46:32 +02:00
Brecht Van Lommel
03f8c1abd0 Build: add ccache support for CUDA kernels on Linux 2021-10-06 14:21:26 +02:00
Mikhail Matrosov
ca0450feef Fix T91064: Cycles low poly meshes having black edges when shade smoothed
Fixes:{T91064}

Caused by {rBcd118c5581f482afc8554ff88b5b6f3b552b1682}

- Applies `ensure_valid_reflection()` to the normal input on all BSDFs for CPU and GPU.
- This doesn't affect hair.
- Removes `ensure_valid_reflection()` from the output of Bump Map and Normal Map nodes for CPU/GPU as it is not needed.
- The fix doesn't touch OSL.

Reviewed By: brecht, leesonw

Maniphest Tasks: T91064

Differential Revision: https://developer.blender.org/D12403
2021-10-06 10:25:09 +02:00
Campbell Barton
df8f507f41 Cleanup: spelling in comments 2021-10-06 14:54:05 +11:00
Jesse Yurkovich
76de3ac4ce Cleanup: Remove data duplication from various lookup tables in Cycles
This effectively undoes some of the following commit:
rB4537e8558468c71a03bf53f59c60f888b3412de2

The tables in question were duplicated 5-6 times into the blender
executable due to the headers being used in multiple translation units.
This contributes ~6.3kb worth of duplicate data into the binary.
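
The general pattern for avoiding such duplication can be sketched as below. The file split is shown with comments and the table name and contents are hypothetical. This is not necessarily how the commit restructures the Cycles tables.

```cpp
/* ---- example_table.h (hypothetical header) ----
 * A `static const float example_table[4] = {...};` defined here could
 * end up as one copy per translation unit that includes the header.
 * Declaring the table instead leaves a single definition elsewhere. */
extern const float example_table[4];

/* ---- example_table.cpp (hypothetical source file) ----
 * The one and only definition, stored once in the binary. */
const float example_table[4] = {1.0f, 2.0f, 4.0f, 8.0f};

/* ---- any user of the header ---- */
inline float example_lookup(const int index)
{
  /* Wrap the index to stay inside the 4-entry table. */
  return example_table[index & 3];
}
```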

Some further details are in the below revision.

Differential Revision: https://developer.blender.org/D12724
2021-10-05 19:09:01 -07:00
Sergey Sharybin
6e268a749f Fix adaptive sampling artifacts on tile boundaries
Implement overscan support for tiles, so that adaptive sampling can
rely on the neighbourhood of each pixel.
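
As an illustration of the idea, a tile can be grown by a border of extra pixels and clamped to the image. The struct `TileRect`, the function `tile_with_overscan` and the overscan size are hypothetical and not taken from the patch.

```cpp
#include <algorithm>

// Hypothetical tile description: top-left corner plus size, in pixels.
struct TileRect {
  int x, y, width, height;
};

// Grow the tile by `overscan` pixels on each side, clamped to the image,
// so that filters near the tile boundary can read neighbouring pixels.
inline TileRect tile_with_overscan(const TileRect &tile,
                                   const int overscan,
                                   const int image_width,
                                   const int image_height)
{
  const int x0 = std::max(tile.x - overscan, 0);
  const int y0 = std::max(tile.y - overscan, 0);
  const int x1 = std::min(tile.x + tile.width + overscan, image_width);
  const int y1 = std::min(tile.y + tile.height + overscan, image_height);

  TileRect result;
  result.x = x0;
  result.y = y0;
  result.width = x1 - x0;
  result.height = y1 - y0;
  return result;
}
```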

Differential Revision: https://developer.blender.org/D12599
2021-10-05 16:19:14 +02:00
Brecht Van Lommel
55b8fc718a Cycles: improve detection of HIP compiler for buildbot
And fix various broken things in the HIP kernel compilation.
2021-10-05 13:47:50 +02:00
Sergey Sharybin
f806bd8261 Fix T91861: Black environment behind shadow catcher
Always sample background pass behind shadow catcher (if the pass
exists, of course), regardless of whether shadow catcher will be
used as approximate or accurate.

Allows combining accurate shadows into an environment map.

Differential Revision: https://developer.blender.org/D12747
2021-10-04 15:07:32 +02:00
Brecht Van Lommel
fc4886a314 Fix T91894: Cycles baking normal maps of transformed objects not working 2021-10-04 13:58:37 +02:00
Brecht Van Lommel
a80a2f07b7 Fix T90815: wrong Cycles OSL normal map render after recent optimization 2021-10-04 13:58:37 +02:00
Brecht Van Lommel
76238af213 Fix Cycles render time pass being available in UI, but it was removed
This previously only worked for CPU rendering, and isn't that practical to get
working in the new architecture.
2021-10-04 13:58:37 +02:00
Campbell Barton
74f45ed9c5 Cleanup: spelling in comments 2021-10-03 12:13:29 +11:00
Charlie Jolly
be70827e6f Nodes: Add Float Curve for GN and Shader nodes.
Replacement for float curve in legacy Attribute Curve Map node.

Float Curve defaults to [0.0-1.0] range.

Reviewed By: JacquesLucke, brecht

Differential Revision: https://developer.blender.org/D12683
2021-09-30 19:24:40 +01:00
Sergey Sharybin
6f23e4484d Fix non-finite curve normal causing Cycles to crash
Similar to the previous change in the area: need to avoid ray
point and direction becoming a non-finite value.

Use the view direction when the geometrical normal can not be
calculated.
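
The fallback can be sketched as below. The `Vec3` type and the function names are hypothetical stand-ins, not the Cycles vector types or the code from the patch.

```cpp
#include <cmath>

// Hypothetical minimal vector type for illustration.
struct Vec3 {
  float x, y, z;
};

inline bool is_finite3(const Vec3 &v)
{
  return std::isfinite(v.x) && std::isfinite(v.y) && std::isfinite(v.z);
}

// Use the geometric normal when it is valid, otherwise fall back to the
// view direction so that later rays never get a non-finite direction.
inline Vec3 normal_or_view_direction(const Vec3 &Ng, const Vec3 &view_direction)
{
  return is_finite3(Ng) ? Ng : view_direction;
}
```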

Collaboration and sanity inspiration with Brecht!

Differential Revision: https://developer.blender.org/D12703
2021-09-29 19:49:59 +02:00
Brecht Van Lommel
4d4113adc2 Cycles: record large number of transparent shadow intersections on CPU
So we can do fewer intersection calls, only on the GPU do we need to save
memory and do this in small steps.

Ref T87836
2021-09-29 16:37:32 +02:00
Sergey Sharybin
fe070fe33b Fix Cycles crash in certain hair configurations
The issue was caused by hair shader setup setting normal to a non
finite value, which then gets used to create a ray with non-finite
direction, making BVH traversal run out of stack memory.

Happens with 150_0040_A.lighting.blend frame 112 of the Sprites
project.

Differential Revision: https://developer.blender.org/D12692
2021-09-29 16:14:23 +02:00
Sergey Sharybin
ffb9577ac9 Cycles: Ensure finite displacement and background evaluation
Avoids possible numerical issues in the path tracing kernel, which
is most important for displacement as non-finite values in BVH can
lead to infinite node recursion during traversal.

Differential Revision: https://developer.blender.org/D12690
2021-09-29 14:06:10 +02:00
Campbell Barton
79290f5160 Cleanup: spelling in comments 2021-09-29 07:29:15 +10:00
Brecht Van Lommel
86ec9d79ec Fix build without Cycles HIP device 2021-09-28 20:00:55 +02:00
Brian Savery
044a77352f Cycles: add HIP device support for AMD GPUs
NOTE: this feature is not ready for user testing, and not yet enabled in daily
builds. It is being merged now for easier collaboration on development.

HIP is a heterogeneous compute interface allowing C++ code to be executed on
GPUs similar to CUDA. It is intended to bring back AMD GPU rendering support
on Windows and Linux.

https://github.com/ROCm-Developer-Tools/HIP.

As of the time of writing, it should compile and run on Linux with existing
HIP compilers and driver runtimes. Publicly available compilers and drivers
for Windows will come later.

See task T91571 for more details on the current status and work remaining
to be done.

Credits:

Sayak Biswas (AMD)
Arya Rafii (AMD)
Brian Savery (AMD)

Differential Revision: https://developer.blender.org/D12578
2021-09-28 19:18:55 +02:00
Brecht Van Lommel
5bea5e25d5 Fix T91728: Cycles render artifacts with motion blur and object attributes 2021-09-27 17:40:03 +02:00
Patrick Mours
2189dfd6e2 Cycles: Rework OptiX visibility flags handling
Previously, the visibility test against the visibility flags was performed in an any-hit program in OptiX
(called `__anyhit__kernel_optix_visibility_test`), which was using the `__prim_visibility` array.
This is not entirely correct however, since `__prim_visibility` is filled with the merged visibility
flags of all objects that reference that primitive, so if one object uses different visibility flags
than another object, but they both are instances of the same geometry, they would appear the same
way. The reason that the any-hit program was used rather than the OptiX instance visibility mask is
that the latter is currently limited to 8 bits only, which is not sufficient to contain all Cycles
visibility flags (12 bits).

To mostly fix the problem with multiple instances and different visibility flags, I changed things to
use the OptiX instance visibility mask for a subset of the Cycles visibility flags (`PATH_RAY_CAMERA`
to `PATH_RAY_VOLUME_SCATTER`, which fit into 8 bits) and only fall back to the visibility test any-hit
program if that isn't enough (e.g. the ray visibility mask exceeds 8 bits or when using the built-in
curves from OptiX, since the any-hit program is then also used to skip the curve endcaps).
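
The split between the 8-bit instance mask and the any-hit fallback can be sketched roughly as follows. This is only an illustration of the decision. The struct, the function name and the choice to use a full mask in the fallback case are assumptions, not the actual OptiX kernel code.

```cpp
#include <cstdint>

// The instance visibility mask is limited to 8 bits.
const uint32_t INSTANCE_MASK = 0xFFu;

// Hypothetical result type for illustration.
struct VisibilitySetup {
  uint32_t instance_mask; /* Mask handed to the trace call. */
  bool needs_anyhit_test; /* Whether the any-hit visibility test must run. */
};

inline VisibilitySetup setup_ray_visibility(const uint32_t ray_visibility)
{
  VisibilitySetup setup;
  /* Flags beyond the low 8 bits cannot be expressed in the instance mask. */
  setup.needs_anyhit_test = (ray_visibility & ~INSTANCE_MASK) != 0;
  /* Assumption: in the fallback case let every instance through and rely
   * on the any-hit program for the full test. */
  setup.instance_mask = setup.needs_anyhit_test ? INSTANCE_MASK :
                                                  (ray_visibility & INSTANCE_MASK);
  return setup;
}
```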

This may also improve performance in some cases, since by default OptiX can now perform the normal
scene intersection trace calls entirely on RT cores without having to jump back to the SM on every
hit to execute the any-hit program.

Fixes T89801

Differential Revision: https://developer.blender.org/D12604
2021-09-27 17:12:43 +02:00
Brecht Van Lommel
4a562f5077 Fix T91714: Cycles direct/indirect clamp distinction not working correctly 2021-09-27 14:25:22 +02:00
Dalai Felinto
c618075541 Cleanup: make format 2021-09-27 12:43:54 +02:00
William Leeson
f3ace5aa80 Fixes T91632 by stopping the sample correlation between dimensions which was causing rendering artifacts on simple scenes.
Fix T91632: Stops the sample correlation between dimensions which was causing rendering artefacts on simple scenes.

This is done by increasing the amount of jitter the Cranley-Patterson rotation is allowed to add. Also, it uses the y dimension of the sample table for 1D sampling, which causes further decorrelation between dimensions. As an additional measure, the x and y dimensions are swapped randomly to provide further decorrelation.
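
For reference, a Cranley-Patterson rotation shifts a sample by a random offset and wraps it back into [0, 1). The sketch below shows the rotation together with an optional swap of the two dimensions. The `Sample2D` type and function names are hypothetical, and how the shifts and the swap decision are generated in Cycles is not shown here.

```cpp
#include <cmath>
#include <utility>

// Hypothetical 2D sample type for illustration.
struct Sample2D {
  float x, y;
};

// Shift the sample and wrap it back into [0, 1).
inline float cranley_patterson_rotate(const float sample, const float shift)
{
  const float rotated = sample + shift;
  return rotated - std::floor(rotated);
}

// Rotate both dimensions with independent shifts, then optionally swap
// them. The shifts and the swap flag would come from a per-pixel hash.
inline Sample2D decorrelate_sample(Sample2D sample,
                                   const float shift_x,
                                   const float shift_y,
                                   const bool swap_xy)
{
  sample.x = cranley_patterson_rotate(sample.x, shift_x);
  sample.y = cranley_patterson_rotate(sample.y, shift_y);
  if (swap_xy) {
    std::swap(sample.x, sample.y);
  }
  return sample;
}
```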

Maniphest Tasks: T91632

Differential Revision: https://developer.blender.org/D12610
2021-09-27 09:54:37 +02:00
Jeroen Bakker
6a88f83d67 Hair Info Length Attribute
Goal is to add the length attribute to the Hair Info node, for better control over color gradients or similar along the hair.

Reviewed By: #eevee_viewport, brecht

Differential Revision: https://developer.blender.org/D10481
2021-09-24 07:44:22 +02:00
Brecht Van Lommel
ed541de29d Fix T91626: Cycles sss behind fully transparent object renders differently 2021-09-23 18:32:30 +02:00
Brecht Van Lommel
6279efbb78 Fix Cycles compiler warning on GCC 11
For shadow rays there are no closures, leave out the closure merging code
there to avoid warnings about accessing closure memory that does not exist.
2021-09-23 17:48:16 +02:00
Campbell Barton
b659d1a560 Cleanup: spelling in comments 2021-09-23 22:08:02 +10:00
Brecht Van Lommel
204b01a254 Fix T91590: Cycles specular baking not using smooth normal 2021-09-22 16:08:45 +02:00
Brecht Van Lommel
53e7c64be7 Fix T91597: Cycles volume scatter visibility on lights not working 2021-09-22 16:08:45 +02:00
Campbell Barton
4d66cbd140 Cleanup: spelling in comments 2021-09-22 14:54:01 +10:00