blender

Author	SHA1	Message	Date
Brecht Van Lommel	1df3b51988	Cycles: replace integrator state argument macros * Rename struct KernelGlobals to struct KernelGlobalsCPU * Add KernelGlobals, IntegratorState and ConstIntegratorState typedefs that every device can define in its own way. * Remove INTEGRATOR_STATE_ARGS and INTEGRATOR_STATE_PASS macros and replace with these new typedefs. * Add explicit state argument to INTEGRATOR_STATE and similar macros In preparation for decoupling main and shadow paths. Differential Revision: https://developer.blender.org/D12888	2021-10-18 19:02:10 +02:00
Michael Jones	a0f269f682	Cycles: Kernel address space changes for MSL This is the first of a sequence of changes to support compiling Cycles kernels as MSL (Metal Shading Language) in preparation for a Metal GPU device implementation. MSL requires that all pointer types be declared with explicit address space attributes (device, thread, etc...). There is already precedent for this with Cycles' address space macros (ccl_global, ccl_private, etc...), therefore the first step of MSL-enablement is to apply these consistently. Line-for-line this represents the largest change required to enable MSL. Applying this change first will simplify future patches as well as offering the emergent benefit of enhanced descriptiveness. The vast majority of deltas in this patch fall into one of two cases: - Ensuring ccl_private is specified for thread-local pointer types - Ensuring ccl_global is specified for device-wide pointer types Additionally, the ccl_addr_space qualifier can be removed. Prior to Cycles X, ccl_addr_space was used as a context-dependent address space qualifier, but now it is either redundant (e.g. in struct typedefs), or can be replaced by ccl_global in the case of pointer types. Associated function variants (e.g. lcg_step_float_addrspace) are also redundant. In cases where address space qualifiers are chained with "const", this patch places the address space qualifier first. The rationale for this is that the choice of address space is likely to have the greater impact on runtime performance and overall architecture. The final part of this patch is the addition of a metal/compat.h header. This is partially complete and will be extended in future patches, paving the way for the full Metal implementation. Ref T92212 Reviewed By: brecht Maniphest Tasks: T92212 Differential Revision: https://developer.blender.org/D12864	2021-10-14 16:14:43 +01:00
Brecht Van Lommel	0803119725	Cycles: merge of cycles-x branch, a major update to the renderer This includes much improved GPU rendering performance, viewport interactivity, new shadow catcher, revamped sampling settings, subsurface scattering anisotropy, new GPU volume sampling, improved PMJ sampling pattern, and more. Some features have also been removed or changed, breaking backwards compatibility. Including the removal of the OpenCL backend, for which alternatives are under development. Release notes and code docs: https://wiki.blender.org/wiki/Reference/Release_Notes/3.0/Cycles https://wiki.blender.org/wiki/Source/Render/Cycles Credits: * Sergey Sharybin * Brecht Van Lommel * Patrick Mours (OptiX backend) * Christophe Hery (subsurface scattering anisotropy) * William Leeson (PMJ sampling pattern) * Alaska (various fixes and tweaks) * Thomas Dinges (various fixes) For the full commit history, see the cycles-x branch. This squashes together all the changes since intermediate changes would often fail building or tests. Ref T87839, T87837, T87836 Fixes T90734, T89353, T80267, T80267, T77185, T69800	2021-09-21 14:55:54 +02:00
Brecht Van Lommel	073bf8bf52	Cycles: remove WITH_CYCLES_DEBUG, add WITH_CYCLES_DEBUG_NAN WITH_CYCLES_DEBUG was used for rendering BVH debugging passes. But since we mainly use Embree and OptiX now, this information is no longer important. WITH_CYCLES_DEBUG_NAN will enable additional checks for NaNs and invalid values in the kernel, for Cycles developers. Previously these asserts were enabled in all debug builds, but this is too likely to crash Blender in scenes that render fine regardless of the NaNs. So this is behind a CMake option now. Fixes T90240	2021-07-28 19:27:57 +02:00
Brecht Van Lommel	68dd7617d7	Cycles: add utility functions for zero float2/float3/float4/transform Ref D8237, T78710	2021-02-17 16:26:24 +01:00
Brecht Van Lommel	dab50ad718	Cleanup: use float3 instead of float4 for shadow, since w is never used Contributed by pembem22. Differential Revision: https://developer.blender.org/D8947	2020-09-22 16:36:43 +02:00
Brecht Van Lommel	18cda8be87	Cycles: change perspective depth pass to be more standard Now it matches Eevee, OpenGL and other renderers. Panoramic camera depth passes are unchanged, and are still distance from the camera center.	2020-06-02 04:54:44 +02:00
Brecht Van Lommel	53981c7fb6	Cleanup: refactor adaptive sampling to more easily change some parameters No functional changes yet, this is work towards making CPU and GPU results match more closely.	2020-04-07 20:29:48 +02:00
Brecht Van Lommel	db65a6e0fb	Fix T74345: missing albedo for Cycles principled hair BSDF	2020-03-20 15:23:39 +01:00
Stefan Werner	51e898324d	Adaptive Sampling for Cycles. This feature takes some inspiration from "RenderMan: An Advanced Path Tracing Architecture for Movie Rendering" and "A Hierarchical Automatic Stopping Condition for Monte Carlo Global Illumination" The basic principle is as follows: While samples are being added to a pixel, the adaptive sampler writes half of the samples to a separate buffer. This gives it two separate estimates of the same pixel, and by comparing their difference it estimates convergence. Once convergence drops below a given threshold, the pixel is considered done. When a pixel has not converged yet and needs more samples than the minimum, its immediate neighbors are also set to take more samples. This is done in order to more reliably detect sharp features such as caustics. A 3x3 box filter that is run periodically over the tile buffer is used for that purpose. After a tile has finished rendering, the values of all passes are scaled as if they were rendered with the full number of samples. This way, any code operating on these buffers, for example the denoiser, does not need to be changed for per-pixel sample counts. Reviewed By: brecht, #cycles Differential Revision: https://developer.blender.org/D4686	2020-03-05 12:21:38 +01:00
Brecht Van Lommel	e0085bfd24	Cycles: move sss and diffuse transmission into diffuse pass This simplifies compositors setups and will be consistent with Eevee render passes from D6331. There's a continuum between these passes and it's not clear there is much advantage to having them available separately. Differential Revision: https://developer.blender.org/D6848	2020-02-25 11:44:47 +01:00
Lukas Stockner	dc1db0791e	Cycles: Track specular throughput to account for reflection color in denoising albedo pass To determine the albedo pass, Cycles currently follows the path until a predominantly diffuse-ish material is hit and then takes the albedo there. This works fine for normal mirrors, but as it completely ignores the color of the bounces before that diffuse-ish material, it also means that any textures that are applied to the specular-ish BSDFs won't affect the albedo pass at all. Therefore, this patch changes that behaviour so that Cycles also keeps track of the throughput of all specular-ish closures along the path so far and includes that in the albedo pass. This fixes part of the issue described in T73043. However, since it has an effect on the albedo pass in most scenes, it could cause regressions, which is why I'm uploading it as a patch instead of just committing as a fix. Differential Revision: https://developer.blender.org/D6640	2020-02-06 03:37:48 +01:00
Lukas Stockner	902209eda5	Partial Fix T73043: Denoising Albedo not working well for Sheen Similar to the Microfacet Closures, the Principled BSDF Sheen closure is added at a high weight but typically results in fairly low values. Therefore, the default weight is a bad indicator of importance. The fix here is the same as it was back then for Microfacets: Compute an average weight using the normal as the half-vector and use it to scale down the sample weight and the albedo channel. In addition to drastically improving denoising of materials with sheen when using the new Denoising node, this also can reduce noise on such materials considerably.	2020-01-20 23:06:08 +01:00
Stefan Werner	2f1d3ba6da	Cycles: Fixed OpenCL kernel build. transform_direction() can't handle parameters in constant address space. Creating a local copy of the parameter satisfies the OpenCL compiler. CUDA and CPU compilers should be able to optimize this away I hope.	2020-01-09 14:40:24 +01:00
Patrick Mours	d5ca72191c	Cycles: Add OptiX AI denoiser support This patch adds support for the OptiX denoiser as an alternative to the existing NLM denoiser in Cycles. It's re-using the same denoising architecture based on tiles and therefore implicitly also works with multiple GPUs. Reviewed By: sergey Differential Revision: https://developer.blender.org/D6395	2020-01-08 16:53:11 +01:00
Lukas Stockner	e760972221	Cycles: support for custom shader AOVs Custom render passes are added in the Shader AOVs panel in the view layer settings, with a name and data type. In shader nodes, an AOV Output node is then used to output either a value or color to the pass. Arbitrary names can be used for these passes, as long as they don't conflict with built-in passes that are enabled. The AOV Output node can be used in both material and world shader nodes. Implemented by Lukas, with tweaks by Brecht. Differential Revision: https://developer.blender.org/D4837	2019-12-10 20:44:46 +01:00
Lukas Stockner	4659fa5471	Cycles: Scale denoising albedo contribution of Principled BSDFs according to average fresnel The Principled BSDF uses Microfacet closures that include a fresnel term, which are a special case since their weight tends to be near white even if their average contribution is fairly low. The sample weight is scaled by the average fresnel weight to account for this, but the denoising albedo still used the unscaled weight. This was fine for the original denoiser, but apparently OIDN can't handle the resulting albedo pass well. Therefore, this commit adds the described scaling to the albedo pass contribution as well. This problem was described in T69770. Reviewed By: brecht Differential Revision: https://developer.blender.org/D6289	2019-11-27 21:26:47 +01:00
Jeroen Bakker	271c6794d6	Cycles: Viewport Rendering Memory Improvement Small memory reduction change by only storing the pixels of the combined pass when it is being shown in the viewport. Previously the combined pass was always calculated and present in the output buffer. The combined pass will still be calculated. It is a limitation in Blender that Cycles always had a combined pass. This patch will remove the limitation from the code base of Cycles. Blender still has the limitation, but will always request the combined renderpass when doing final rendering. Reviewed By: brecht Differential Revision: https://developer.blender.org/D5784	2019-09-17 11:24:55 +02:00
Patrick Mours	b05e7ea719	Cycles: fixes for building kernel without certain features Ref D5363	2019-08-26 10:10:35 +02:00
Campbell Barton	6529d20d79	Cleanup: spelling in comments	2019-06-12 09:43:49 +10:00
Campbell Barton	e12c08e8d1	ClangFormat: apply to source, most of intern Apply clang format as proposed in T53211. For details on usage and instructions for migrating branches without conflicts, see: https://wiki.blender.org/wiki/Tools/ClangFormat	2019-04-17 06:21:24 +02:00
Lukas Stockner	7fa6f72084	Cycles: Add sample-based runtime profiler that measures time spent in various parts of the CPU kernel This commit adds a sample-based profiler that runs during CPU rendering and collects statistics on time spent in different parts of the kernel (ray intersection, shader evaluation etc.) as well as time spent per material and object. The results are currently not exposed in the user interface or per Python yet, to see the stats on the console pass the "--cycles-print-stats" argument to Cycles (e.g. "./blender -- --cycles-print-stats"). Unfortunately, there is no clear way to extend this functionality to CUDA or OpenCL, so it is CPU-only for now. Reviewers: brecht, sergey, swerner Reviewed By: brecht, swerner Differential Revision: https://developer.blender.org/D3892	2018-11-29 02:45:24 +01:00
Campbell Barton	e742e0934d	Cleanup: trailing space	2018-11-25 08:01:14 +11:00
Sergey Sharybin	cb4b5e12ab	Cycles: Cleanup, spacing after preprocessor It is supposed to be two spaces before comment stating which if else/endif statements corresponds to. Was mainly violated in the header guards.	2018-11-09 11:34:54 +01:00
Stefan Werner	e58c6cf0c6	Cycles: Added Cryptomatte output. This allows for extra output passes that encode automatic object and material masks for the entire scene. It is an implementation of the Cryptomatte standard as introduced by Psyop. A good future extension would be to add a manifest to the export and to do plenty of testing to ensure that it is fully compatible with other renderers and compositing programs that use Cryptomatte. Internally, it adds the ability for Cycles to have several passes of the same type that are distinguished by their name. Differential Revision: https://developer.blender.org/D3538	2018-10-28 05:37:41 -04:00
Campbell Barton	1daa20ad9f	Cleanup: strip trailing space for cycles	2018-07-06 10:17:58 +02:00
Matt Heimlich	e3f1d98098	Cycles: take into account diffuse roughness for roughness baking. Roughness baking previously defaulted to 1.0 for all diffuse materials, now we also bake roughness values of Oren-Nayar and Principled Diffuse. Differential Revision: https://developer.blender.org/D3115	2018-03-28 23:45:15 +02:00
Lukas Stockner	322f0223d0	Cycles: option to make background visible through glass transparent. This can be enabled in the Film panel, with an option to control the transmission roughness below which glass becomes transparent. Differential Revision: https://developer.blender.org/D2904	2018-01-12 01:34:28 +01:00
Lukas Stockner	a0c02e4d1b	Cycles: Add Volume Direct and Volume Indirect passes for volume-scattered light No color pass because it's hard to define what to use as color in a volume. Reviewers: sergey, brecht Differential Revision: https://developer.blender.org/D2903	2017-11-17 16:39:45 +01:00
Lukas Stockner	f78e963858	Cycles: Refactor PassType from bitflag to index in order to allow for more passes	2017-11-17 16:34:19 +01:00
Lukas Stockner	d8066fb0f1	Cycles: Refactor closure roughness detection to fix a potential bug with Denoising of specular shaders	2017-11-14 04:17:54 +01:00
Sergey Sharybin	0d3c8d0701	Cycles: Cleanup, indentation and wrapping	2017-10-06 16:54:37 +05:00
Brecht Van Lommel	6da6f8d33f	Cycles: CUDA faster rendering of small tiles, using multiple samples like OpenCL. The work size is still very conservative, and this doesn't help for progressive refine. For that we will need to render multiple tiles at the same time. But this should already help for denoising renders that require too much memory with big tiles, and just generally soften the performance dropoff with small tiles. Differential Revision: https://developer.blender.org/D2856	2017-10-04 21:58:47 +02:00
Brecht Van Lommel	5bb677e592	Code refactor: zero render buffers outside of kernel. This was originally done with the first sample in the kernel for better performance, but it doesn't work anymore with atomics. Any benefit was very minor anyway, too small to measure it seems.	2017-10-04 21:11:14 +02:00
Brecht Van Lommel	12f4538205	Code refactor: use split variance calculation for mega kernels too. There is no significant difference in denoised benchmark scenes and denoising ctests, so might as well make it all consistent.	2017-10-04 21:11:14 +02:00
Brecht Van Lommel	37d9e65ddf	Code cleanup: abstract shadow catcher logic more into accumulation code.	2017-09-13 15:24:14 +02:00
Brecht Van Lommel	f77cdd1d59	Code cleanup: deduplicate some branched and split kernel code. Benchmark performance on GTX 1080 and RX 480 on Linux is the same for bmw27, classroom, pabellon, and about 2% faster on fishy_cat and koro.	2017-09-13 15:24:14 +02:00
Brecht Van Lommel	b5f8063fb9	Cycles: support baking normals plugged into BSDFs, averaged with closure weight.	2017-08-20 16:51:53 +02:00
Brecht Van Lommel	dc7fcebb33	Code cleanup: make L_transparent part of PathRadiance.	2017-08-13 01:19:07 +02:00
Brecht Van Lommel	7542282c06	Code cleanup: make DebugData part of PathRadiance.	2017-08-13 01:19:07 +02:00
Sergey Sharybin	0aa5431998	Cycles: Fix compilation error of OpenCL mega kernel Was some mismatch in address space. Seems to be caused by recent additions. Additionally, moved decoupled ray marching functions under ifdef, so they don't try to use malloc() functions. Thanks Mai for testing the patch!	2017-06-13 10:26:45 +02:00
Lukas Stockner	3bf69b26ef	Cycles Denoising: Skip feature pass writing for volume-only shaders Volume shaders without anything connected to the surface output are treated as if they had a transparent BSDF as the surface shader in Cycles, so the denoiser should skip feature pass writing for them just as it does with an actual transparent BSDF.	2017-05-21 05:40:13 +02:00
Lukas Stockner	cf1127f380	Fix T51506: Wrong shadow catcher color when using selective denoising	2017-05-19 04:04:54 +02:00
Lukas Stockner	58a0c27546	Cycles: Fix occasional black pixels from denoising with excessive radii Numerical inaccuracies would cause the XtWX matrix to be no longer positive-semidefinite, which in turn caused the LSQ solver to fail.	2017-05-11 03:21:54 +02:00
Lukas Stockner	43b374e8c5	Cycles: Implement denoising option for reducing noise in the rendered image This commit contains the first part of the new Cycles denoising option, which filters the resulting image using information gathered during rendering to get rid of noise while preserving visual features as well as possible. To use the option, enable it in the render layer options. The default settings fit a wide range of scenes, but the user can tweak individual settings to control the tradeoff between a noise-free image, image details, and calculation time. Note that the denoiser may still change in the future and that some features are not implemented yet. The most important missing feature is animation denoising, which uses information from multiple frames at once to produce a flicker-free and smoother result. These features will be added in the future. Finally, thanks to all the people who supported this project: - Google (through the GSoC) and Theory Studios for sponsoring the development - The authors of the papers I used for implementing the denoiser (more details on them will be included in the technical docs) - The other Cycles devs for feedback on the code, especially Sergey for mentoring the GSoC project and Brecht for the code review! - And of course the users who helped with testing, reported bugs and things that could and/or should work better!	2017-05-07 14:40:58 +02:00
Mai Lavelle	352ee7c3ef	Cycles: Remove ccl_fetch and SOA	2017-03-08 00:52:41 -05:00
Mai Lavelle	230c00d872	Cycles: OpenCL split kernel refactor This does a few things at once: - Refactors host side split kernel logic into a new device agnostic class `DeviceSplitKernel`. - Removes tile splitting, a new work pool implementation takes its place and allows as many threads as will fit in memory regardless of tile size, which can give performance gains. - Refactors split state buffers into one buffer, as well as reduces the number of arguments passed to kernels. Means there's less code to deal with overall. - Moves kernel logic out of OpenCL kernel files so they can later be used by other device types. - Replaced OpenCL specific APIs with new generic versions - Tiles can now be seen updating during rendering	2017-03-08 00:52:41 -05:00
Sergey Sharybin	4ee08e9533	Atomics: Make naming more obvious about which value is being returned	2016-11-15 12:16:26 +01:00
George Kyriazis	7f4479da42	Cycles: OpenCL kernel split This commit contains all the work related to the AMD megakernel split work which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely someone else which we're forgetting to mention. Currently only AMD cards are enabled for the new split kernel, but it is possible to force split opencl kernel to be used by setting the following environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1. Not all the features are supported yet, and that being said no motion blur, camera blur, SSS and volumetrics for now. Also transparent shadows are disabled on AMD device because of some compiler bug. This kernel also only implements regular path tracing and supporting branched one will take a bit. Branched path tracing is exposed to the interface still, which is a bit misleading and will be hidden there soon. More features will be enabled once they're ported to the split kernel and tested. Neither regular CPU nor CUDA has any difference, they're generating the same exact code, which means no regressions/improvements there. Based on the research paper: https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf Here's the documentation: https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit Design discussion of the patch: https://developer.blender.org/T44197 Differential Revision: https://developer.blender.org/D1200	2015-05-09 19:52:40 +05:00
Sergey Sharybin	ae7d84dbc1	Cycles: Use native saturate function for CUDA This is more of a workaround for the CUDA optimizer, which can't optimize clamp(x, 0, 1) into a single instruction and uses 4 instructions instead. Original patch by @lockal with own modification: Don't make changes outside of the kernel. They don't make any difference anyway and term saturate() has a bit different meaning outside of kernel. This gives around 2% of speedup in Barcelona file, but in more complex shader setups with lots of math nodes with clamping speedup could be much nicer. Subscribers: dingto Projects: #cycles Differential Revision: https://developer.blender.org/D1224	2015-04-28 00:38:32 +05:00

73 Commits