* Rename struct KernelGlobals to struct KernelGlobalsCPU
* Add KernelGlobals, IntegratorState and ConstIntegratorState typedefs
that every device can define in its own way.
* Remove INTEGRATOR_STATE_ARGS and INTEGRATOR_STATE_PASS macros and
replace with these new typedefs.
* Add explicit state argument to INTEGRATOR_STATE and similar macros
In preparation for decoupling main and shadow paths.
Differential Revision: https://developer.blender.org/D12888
This is the first of a sequence of changes to support compiling Cycles kernels as MSL (Metal Shading Language) in preparation for a Metal GPU device implementation.
MSL requires that all pointer types be declared with explicit address space attributes (device, thread, etc...). There is already precedent for this with Cycles' address space macros (ccl_global, ccl_private, etc...), therefore the first step of MSL-enablement is to apply these consistently. Line-for-line this represents the largest change required to enable MSL. Applying this change first will simplify future patches as well as offering the emergent benefit of enhanced descriptiveness.
The vast majority of deltas in this patch fall into one of two cases:
- Ensuring ccl_private is specified for thread-local pointer types
- Ensuring ccl_global is specified for device-wide pointer types
Additionally, the ccl_addr_space qualifier can be removed. Prior to Cycles X, ccl_addr_space was used as a context-dependent address space qualifier, but now it is either redundant (e.g. in struct typedefs), or can be replaced by ccl_global in the case of pointer types. Associated function variants (e.g. lcg_step_float_addrspace) are also redundant.
In cases where address space qualifiers are chained with "const", this patch places the address space qualifier first. The rationale for this is that the choice of address space is likely to have the greater impact on runtime performance and overall architecture.
The final part of this patch is the addition of a metal/compat.h header. This is partially complete and will be extended in future patches, paving the way for the full Metal implementation.
Ref T92212
Reviewed By: brecht
Maniphest Tasks: T92212
Differential Revision: https://developer.blender.org/D12864
For details see the "Extending the Disney BRDF to a BSDF with Integrated
Subsurface Scattering" paper.
We split the diffuse BSDF into a lambertian and retro-reflection component.
The retro-reflection component is always handled as a BSDF, while the
lambertian component can be replaced by a BSSRDF.
For the BSSRDF case, we compute Fresnel separately at the entry and exit
points, which may have different normals. As the scattering radius decreases
this converges to the BSDF case.
A downside is that this increases noise for subsurface scattering in the
Principled BSDF, due to some samples going to the retro-reflection component.
However the previous logic (also in 2.93) was simple wrong, using a
non-sensical view direction vector at the exit point. We use an importance
sampling weight estimate for the retro-reflection to try to better balance
samples between the BSDF and BSSRDF.
Differential Revision: https://developer.blender.org/D12801
This includes much improved GPU rendering performance, viewport interactivity,
new shadow catcher, revamped sampling settings, subsurface scattering anisotropy,
new GPU volume sampling, improved PMJ sampling pattern, and more.
Some features have also been removed or changed, breaking backwards compatibility.
Including the removal of the OpenCL backend, for which alternatives are under
development.
Release notes and code docs:
https://wiki.blender.org/wiki/Reference/Release_Notes/3.0/Cycleshttps://wiki.blender.org/wiki/Source/Render/Cycles
Credits:
* Sergey Sharybin
* Brecht Van Lommel
* Patrick Mours (OptiX backend)
* Christophe Hery (subsurface scattering anisotropy)
* William Leeson (PMJ sampling pattern)
* Alaska (various fixes and tweaks)
* Thomas Dinges (various fixes)
For the full commit history, see the cycles-x branch. This squashes together
all the changes since intermediate changes would often fail building or tests.
Ref T87839, T87837, T87836
Fixes T90734, T89353, T80267, T80267, T77185, T69800
Custom properties defined on objects are not accessible from the
attribute node when rendering a volume in Cycles. This is because
this case is not handled.
To handle it, added a primitive type for volumes in the kernel,
which is then used in the initialization of ShaderData and to
check whether an attribute lookup is for a volume.
`volume_attribute_float4` is also now checking the attribute
element type to dispatch to the right lookup function.
Reviewed By: #cycles, brecht
Maniphest Tasks: T87194
Differential Revision: https://developer.blender.org/D11728
This keeps render results compatible for combined CPU + GPU rendering.
Peformance and quality primitives is quite different than before. There
are now two options:
* Rounded Ribbon: render hair as flat ribbon with (fake) rounded normals, for
fast rendering. Hair curves are subdivided with a fixed number of user
specified subdivisions.
This gives relatively good results, especially when used with the Principled
Hair BSDF and hair viewed from a typical distance. There are artifacts when
viewed closed up, though this was also the case with all previous primitives
(but different ones).
* 3D Curve: render hair as 3D curve, for accurate results when viewing hair
close up. This automatically subdivides the curve until it is smooth.
This gives higher quality than any of the previous primitives, but does come
at a performance cost and is somewhat slower than our previous Thick curves.
The main problem here is performance. For CPU and OpenCL rendering performance
seems usually quite close or better for similar quality results.
However for CUDA and Optix, performance of 3D curve intersection is problematic,
with e.g. 1.45x longer render time in Koro (though there is no equivalent quality
and rounded ribbons seem fine for that scene). Any help or ideas to optimize this
are welcome.
Ref T73778
Depends on D8012
Maniphest Tasks: T73778
Differential Revision: https://developer.blender.org/D8013
The kernel did not work correctly when these were disabled anyway. The
optimized BVH traversal for the no instances case was also only used on
the CPU, so no longer makes sense to keep.
Ref T73778
Depends on D8010
Maniphest Tasks: T73778
Differential Revision: https://developer.blender.org/D8011
This simplifies compositors setups and will be consistent with Eevee render
passes from D6331. There's a continuum between these passes and it's not clear
there is much advantage to having them available separately.
Differential Revision: https://developer.blender.org/D6848
Custom render passes are added in the Shader AOVs panel in the view layer
settings, with a name and data type. In shader nodes, an AOV Output node
is then used to output either a value or color to the pass.
Arbitrary names can be used for these passes, as long as they don't conflict
with built-in passes that are enabled. The AOV Output node can be used in both
material and world shader nodes.
Implemented by Lukas, with tweaks by Brecht.
Differential Revision: https://developer.blender.org/D4837
The Lamp data was not always set. When using CUDA or CPU it was, but when using OpenCL
without `OBJECT_MOTION` `sd->lamp` not updated to the actual lamp. This made the
TextureCoordinate output the wrong normal when used in a light shader.
As the normal was incorrect it made the IES node render incorrectly.
(what is the default for the IES node).
By setting the lamp data when no `__OBJECT_MOTION__` compile directive is present makes
sure that the normal is correctly calculated.
Fix D4450
Reviewed By: Brecht van Lommel
BF-admins agree to remove header information that isn't useful,
to reduce noise.
- BEGIN/END license blocks
Developers should add non license comments as separate comment blocks.
No need for separator text.
- Contributors
This is often invalid, outdated or misleading
especially when splitting files.
It's more useful to git-blame to find out who has developed the code.
See P901 for script to perform these edits.
This commit adds a sample-based profiler that runs during CPU rendering and collects statistics on time spent in different parts of the kernel (ray intersection, shader evaluation etc.) as well as time spent per material and object.
The results are currently not exposed in the user interface or per Python yet, to see the stats on the console pass the "--cycles-print-stats" argument to Cycles (e.g. "./blender -- --cycles-print-stats").
Unfortunately, there is no clear way to extend this functionality to CUDA or OpenCL, so it is CPU-only for now.
Reviewers: brecht, sergey, swerner
Reviewed By: brecht, swerner
Differential Revision: https://developer.blender.org/D3892
This allows for extra output passes that encode automatic object and material masks
for the entire scene. It is an implementation of the Cryptomatte standard as
introduced by Psyop. A good future extension would be to add a manifest to the
export and to do plenty of testing to ensure that it is fully compatible with other
renderers and compositing programs that use Cryptomatte.
Internally, it adds the ability for Cycles to have several passes of the same type
that are distinguished by their name.
Differential Revision: https://developer.blender.org/D3538
This means the shader can now be used for procedural texturing. New
settings on the node are Samples, Inside, Local Only and Distance.
Original patch by Lukas with further changes by Brecht.
Differential Revision: https://developer.blender.org/D3479
We now continue transparent paths after diffuse/glossy/transmission/volume
bounces are exceeded. This avoids unexpected boundaries in volumes with
transparent boundaries. It is also required for MIS to work correctly with
transparent surfaces, as we also continue through these in shadow rays.
The main visible changes is that volumes will now be lit by the background
even at volume bounces 0, same as surfaces.
Fixes T53914 and T54103.
With a Titan Xp, reduces path trace local memory from 1092MB to 840MB.
Benchmark performance was within 1% with both RX 480 and Titan Xp.
Original patch was implemented by Sergey.
Differential Revision: https://developer.blender.org/D2249
For the first bounce we now give each BSDF or BSSRDF a minimum sample weight,
which helps reduce noise for a typical case where you have a glossy BSDF with
a small weight due to Fresnel, but not necessarily small contribution relative
to a diffuse or transmission BSDF below.
We can probably find a better heuristic that also enables this on further
bounces, for example when looking through a perfect mirror, but I wasn't able
to find a robust one so far.
Similar to what we did for area lights previously, this should help
preserve stratification when using multiple BSDFs in theory. Improvements
are not easily noticeable in practice though, because the number of BSDFs
is usually low. Still nice to eliminate one sampling dimension.