blender

Author	SHA1	Message	Date
Brecht Van Lommel	55b8fc718a	Cycles: improve detection of HIP compiler for buildbot And fix various broken things in the HIP kernel compilation.	2021-10-05 13:47:50 +02:00
Brecht Van Lommel	86ec9d79ec	Fix build without Cycles HIP device	2021-09-28 20:00:55 +02:00
Brian Savery	044a77352f	Cycles: add HIP device support for AMD GPUs NOTE: this feature is not ready for user testing, and not yet enabled in daily builds. It is being merged now for easier collaboration on development. HIP is a heterogenous compute interface allowing C++ code to be executed on GPUs similar to CUDA. It is intended to bring back AMD GPU rendering support on Windows and Linux. https://github.com/ROCm-Developer-Tools/HIP. As of the time of writing, it should compile and run on Linux with existing HIP compilers and driver runtimes. Publicly available compilers and drivers for Windows will come later. See task T91571 for more details on the current status and work remaining to be done. Credits: Sayak Biswas (AMD) Arya Rafii (AMD) Brian Savery (AMD) Differential Revision: https://developer.blender.org/D12578	2021-09-28 19:18:55 +02:00
Brecht Van Lommel	0803119725	Cycles: merge of cycles-x branch, a major update to the renderer This includes much improved GPU rendering performance, viewport interactivity, new shadow catcher, revamped sampling settings, subsurface scattering anisotropy, new GPU volume sampling, improved PMJ sampling pattern, and more. Some features have also been removed or changed, breaking backwards compatibility. Including the removal of the OpenCL backend, for which alternatives are under development. Release notes and code docs: https://wiki.blender.org/wiki/Reference/Release_Notes/3.0/Cycles https://wiki.blender.org/wiki/Source/Render/Cycles Credits: * Sergey Sharybin * Brecht Van Lommel * Patrick Mours (OptiX backend) * Christophe Hery (subsurface scattering anisotropy) * William Leeson (PMJ sampling pattern) * Alaska (various fixes and tweaks) * Thomas Dinges (various fixes) For the full commit history, see the cycles-x branch. This squashes together all the changes since intermediate changes would often fail building or tests. Ref T87839, T87837, T87836 Fixes T90734, T89353, T80267, T80267, T77185, T69800	2021-09-21 14:55:54 +02:00
Brecht Van Lommel	073bf8bf52	Cycles: remove WITH_CYCLES_DEBUG, add WITH_CYCLES_DEBUG_NAN WITH_CYCLES_DEBUG was used for rendering BVH debugging passes. But since we mainly use Embree and OptiX now, this information is no longer important. WITH_CYCLES_DEBUG_NAN will enable additional checks for NaNs and invalid values in the kernel, for Cycles developers. Previously these asserts were enabled in all debug builds, but this is too likely to crash Blender in scenes that render fine regardless of the NaNs. So this is behind a CMake option now. Fixes T90240	2021-07-28 19:27:57 +02:00
Brecht Van Lommel	cf74cd9367	Cycles: upgrade CUDA to 11.4 This fixes a performance regression on Ampere cards, on specific scenes like classroom. For cycles-x there is little difference, but this is still helpful for LTS releases, and we need to upgrade at some point anyway.	2021-07-26 19:46:51 +02:00
Brecht Van Lommel	b42454be8b	Cleanup: move BVH utility functions into own file	2021-04-19 21:07:34 +02:00
Patrick Mours	c10546f5e9	Cycles: Add support for shader raytracing in OptiX Support for the AO and bevel shader nodes requires calling "optixTrace" from within the shading VM, which is only allowed from inlined functions to the raygen program or callables. This patch therefore converts the shading VM to use direct callables to make it work. To prevent performance regressions a separate kernel module is compiled and used for this purpose. Reviewed By: brecht Differential Revision: https://developer.blender.org/D9733	2020-12-04 13:04:11 +01:00
Campbell Barton	2bd8f7e059	Cleanup: use string APPEND/PREPEND Replace 'set' with 'string(APPEND/PREPEND ...)'. This avoids duplicating the variable name.	2020-11-06 12:32:54 +11:00
Patrick Mours	3bb3b26c8f	Cycles: Add CUDA 11 build support With this patch the build system checks whether the "CUDA10_NVCC_EXECUTABLE" CMake variable is set and if so will use that to build sm_30 kernels. Similarly for sm_8x kernels it checks "CUDA11_NVCC_EXECUTABLE". All other kernels are built using the default CUDA toolkit. This makes it possible to use either the CUDA 10 or CUDA 11 toolkit by default and only selectively use the other for the kernels where it's a hard requirement. Reviewed By: brecht Differential Revision: https://developer.blender.org/D9179	2020-10-13 15:15:44 +02:00
Patrick Mours	3df90de6c2	Cycles: Add NanoVDB support for rendering volumes NanoVDB is a platform-independent sparse volume data structure that makes it possible to use OpenVDB volumes on the GPU. This patch uses it for volume rendering in Cycles, replacing the previous usage of dense 3D textures. Since it has a big impact on memory usage and performance and changes the OpenVDB branch used for the rest of Blender as well, this is not enabled by default yet, which will happen only after 2.82 was branched off. To enable it, build both dependencies and Blender itself with the "WITH_NANOVDB" CMake option. Reviewed By: brecht Differential Revision: https://developer.blender.org/D8794	2020-10-05 15:03:30 +02:00
Brecht Van Lommel	f04260d8c6	CMake: refresh building and external library handling of Cycles standalone * Support precompiled libraries on Linux * Add license headers * Refactoring to deduplicate code Includes work by Ray Molenkamp and Grische for precompiled libraries. Ref D8769	2020-09-04 17:10:50 +02:00
Patrick Mours	d64e171c4b	Cycles: Enable OptiX on first generation Maxwell GPUs again	2020-07-27 16:11:00 +02:00
Patrick Mours	a9644c812f	Cycles: Use pre-compiled PTX kernel for older generation when no matching one is found This patch changes the discovery of pre-compiled kernels, to look for any PTX, even if it does not match the current architecture version exactly. It works because the driver can JIT-compile PTX generated for architectures less than or equal to the current one. This e.g. makes it possible to render on a new GPU architecture even if no pre-compiled binary kernel was distributed for it as part of the Blender installation. Reviewed By: brecht Differential Revision: https://developer.blender.org/D8332	2020-07-20 19:25:27 +02:00
Brecht Van Lommel	d1ef5146d7	Cycles: remove SIMD BVH optimizations, to be replaced by Embree Ref T73778 Depends on D8011 Maniphest Tasks: T73778 Differential Revision: https://developer.blender.org/D8012	2020-06-22 13:28:01 +02:00
Lukas Stockner	eacdcb2dd8	Cycles: Add new Sky Texture method including direct sunlight This commit adds a new model to the Sky Texture node, which is based on a method by Nishita et al. and works by basically simulating volumetric scattering in the atmosphere. By making some approximations (such as only considering single scattering), we get a fairly simple and fast simulation code that takes into account Rayleigh and Mie scattering as well as Ozone absorption. This code is used to precompute a 512x128 texture which is then looked up during render time, and is fast enough to allow real-time tweaking in the viewport. Due to the nature of the simulation, it exposes several parameters that allow for lots of flexibility in choosing the look and matching real-world conditions (such as Air/Dust/Ozone density and altitude). Additionally, the same volumetric approach can be used to compute absorption of the direct sunlight, so the model also supports adding direct sunlight. This makes it significantly easier to set up Sun+Sky illumination where the direction, intensity and color of the sun actually matches the sky. In order to support properly sampling the direct sun component, the commit also adds logic for sampling a specific area to the kernel light sampling code. This is combined with portal and background map sampling using MIS. This sampling logic works for the common case of having one Sky texture going into the Background shader, but if a custom input to the Vector node is used or if there are multiple Sky textures, it falls back to using only background map sampling (while automatically setting the resolution to 4096x2048 if auto resolution is used). More infos and preview can be found here: https://docs.google.com/document/d/1gQta0ygFWXTrl5Pmvl_nZRgUw0mWg0FJeRuNKS36m08/view Underlying model, implementation and documentation by Marco (@nacioss). Improvements, cleanup and sun sampling by @lukasstockner. Differential Revision: https://developer.blender.org/D7896	2020-06-17 21:06:41 +02:00
Brecht Van Lommel	d97c83712c	Cycles: mark CUDA 10.2 as officially supported It appears to work fine after a recent bugfix and testing for the past few weeks.	2020-05-05 15:06:49 +02:00
Ray Molenkamp	aeb42cf8ab	Cycles/Optix: Support building the optix kernels on demand. CMake: `WITH_CYCLES_DEVICE_OPTIX` did not respect `WITH_CYCLES_CUDA_BINARIES` causing the optix kernel to be always built at build time. Code: `device_optix.cpp` did not count on the optix kernel not existing in the default location. For this to work, one should have before starting blender 1) working nvcc environment 2) Optix SDK installed and the OPTIX_ROOT_DIR environment variable pointing to it which is not set by default Differential Revision: https://developer.blender.org/D7400 Reviewed By: Brecht	2020-04-11 12:59:21 -06:00
Ray Molenkamp	86c61ce64f	Cycles: Restore cycles_cubin_cc to working order Reviewed by: brecht pmoursnv Differential Revision: https://developer.blender.org/D7136	2020-03-26 11:41:44 -06:00
Stefan Werner	51e898324d	Adaptive Sampling for Cycles. This feature takes some inspiration from "RenderMan: An Advanced Path Tracing Architecture for Movie Rendering" and "A Hierarchical Automatic Stopping Condition for Monte Carlo Global Illumination" The basic principle is as follows: While samples are being added to a pixel, the adaptive sampler writes half of the samples to a separate buffer. This gives it two separate estimates of the same pixel, and by comparing their difference it estimates convergence. Once convergence drops below a given threshold, the pixel is considered done. When a pixel has not converged yet and needs more samples than the minimum, its immediate neighbors are also set to take more samples. This is done in order to more reliably detect sharp features such as caustics. A 3x3 box filter that is run periodically over the tile buffer is used for that purpose. After a tile has finished rendering, the values of all passes are scaled as if they were rendered with the full number of samples. This way, any code operating on these buffers, for example the denoiser, does not need to be changed for per-pixel sample counts. Reviewed By: brecht, #cycles Differential Revision: https://developer.blender.org/D4686	2020-03-05 12:21:38 +01:00
Charlie Jolly	20a4cdfd70	Cycles: Vector Rotate Node using Axis and Angle method This node provides the ability to rotate a vector around a `center` point using either `Axis Angle` , `Single Axis` or `Euler` methods. Reviewed By: #cycles, brecht Differential Revision: https://developer.blender.org/D3789	2020-02-17 15:43:18 +00:00
Lukas Stockner	e760972221	Cycles: support for custom shader AOVs Custom render passes are added in the Shader AOVs panel in the view layer settings, with a name and data type. In shader nodes, an AOV Output node is then used to output either a value or color to the pass. Arbitrary names can be used for these passes, as long as they don't conflict with built-in passes that are enabled. The AOV Output node can be used in both material and world shader nodes. Implemented by Lukas, with tweaks by Brecht. Differential Revision: https://developer.blender.org/D4837	2019-12-10 20:44:46 +01:00
Campbell Barton	d310cbfa0f	Merge branch 'blender-v2.81-release'	2019-10-29 01:38:34 +11:00
Campbell Barton	312075e688	CMake: add missing headers, use space before comments	2019-10-29 01:33:44 +11:00
Stefan Werner	35a545b752	Cycles: Allow PTX targets for CUDA kernel build. This is intended for developers on Windows primarily: Now, CUDA architectures of type compute_xx are supported. This allows for quicker builds, at the expense of the CUDA driver running ptxas the first time a kernel is loaded. Differential Revision: https://developer.blender.org/D5953	2019-10-16 10:29:04 +02:00
Patrick Mours	a2b52dc571	Cycles: add Optix device backend This uses hardware-accelerated raytracing on NVIDIA RTX graphics cards. It is still currently experimental. Most features are supported, but a few are still missing like baking, branched path tracing and using CPU memory. https://wiki.blender.org/wiki/Reference/Release_Notes/2.81/Cycles#NVIDIA_RTX For building with Optix support, the Optix SDK must be installed. See here for build instructions: https://wiki.blender.org/wiki/Building_Blender/CUDA Differential Revision: https://developer.blender.org/D5363	2019-09-13 11:50:11 +02:00
OmarSquircleArt	2ea82e86ca	Shading: Add Vertex Color node. This patch adds a new Vertex Color node. The node also returns the alpha of the vertex color layer as an output. Reviewers: brecht Differential Revision: https://developer.blender.org/D5767	2019-09-12 17:42:13 +02:00
OmarSquircleArt	baaa89a0bc	Shading: Rewrite Mapping node with dynamic inputs. This patch rewrites the Mapping node to support dynamic inputs. The Max and Min options have been removed. They can be added as Min and Max Vector Math nodes manually. Texture nodes still use the old matrix-based mapping. A new SVM node `NODE_TEXTURE_MAPPING` has been added to preserve this functionality. Similarly, in GLSL, a `mapping_mat4` function has been added. Reviewers: brecht, JacquesLucke	2019-09-04 23:17:13 +02:00
OmarSquircleArt	23564583a4	Shading: Extend Noise node to other dimensions. This patch extends perlin noise to operate in 1D, 2D, 3D, and 4D space. The noise code has also been refactored to be more readable. The Color output and distortion patterns changed, so this patch breaks backward compatibility. This is due to the fact that we now use random offsets as noise seeds, as opposed to swizzling and constant offsets. Reviewers: brecht, JacquesLucke Differential Revision: https://developer.blender.org/D5560	2019-09-04 17:54:32 +02:00
OmarSquircleArt	133dfdd704	Shading: Add White Noise node. The White Noise node hashes the input and returns a random number in the range [0, 1]. The input can be a 1D, 2D, 3D, or a 4D vector. Reviewers: brecht, JacquesLucke Differential Revision: https://developer.blender.org/D5550	2019-08-21 20:04:09 +02:00
OmarSquircleArt	313b789289	Shading: Add Clamp node to Cycles and EEVEE. This patch adds a new node that clamps a value between a minimum and a maximum value. Reviewers: brecht Differential Revision: https://developer.blender.org/D5476	2019-08-13 22:22:15 +02:00
OmarSquircleArt	71641ab56d	Shading: Add Map Range node to Cycles and EEVEE. This patch adds a new Map Range node that linearly remaps an input value from a range to another. This node is similar to the compositor's Map Range node. Reviewers: brecht, JacquesLucke Differential Revision: https://developer.blender.org/D5471	2019-08-13 16:38:56 +02:00
Brecht Van Lommel	b84db342a5	Fix build errors with older GCC versions like 4.9 We can add more fine grained checks for when these flags are supported so that adding asan flags manually still has all the workarounds, but for now compiling successfully is more important.	2019-08-13 06:04:17 +02:00
Brecht Van Lommel	47bf754de4	Build: disable address sanitizer for Cycles optimized kernels with GCC It's extremely slow to compile and run, so just disable it unless WITH_CYCLES_KERNEL_ASAN is manually enabled. For Clang it's always enabled since that appears to work ok. This also limits the -fno-sanitize=vptr flag to the Cycles kernel, as it was added specifically to work around an issue there. Differential Revision: https://developer.blender.org/D5404	2019-08-05 15:23:57 +02:00
Campbell Barton	e12c08e8d1	ClangFormat: apply to source, most of intern Apply clang format as proposed in T53211. For details on usage and instructions for migrating branches without conflicts, see: https://wiki.blender.org/wiki/Tools/ClangFormat	2019-04-17 06:21:24 +02:00
Campbell Barton	5498e7f193	CMake: add library deps to CMakeLists.txt Tested to work on Linux and macOS. This will be enabled once all platforms are verified. See D4684	2019-04-16 06:20:52 +02:00
Campbell Barton	813e470eac	CMake: cleanup, arg rename, add definitions last	2019-04-16 06:15:18 +02:00
Brecht Van Lommel	65d95879f7	Cycles: upgrade to CUDA 10.1 as the one officially supported version. This version fixes various bugs, and there is no need anymore to use both 9.1 and 10.0 for different cards. There is a bug related to WITH_CYCLES_CUBIN_COMPILER and bump mapping in the regression tests, so that remains disabled same as it was for CUDA 10.0. Fix T59286: CUDA bake failing on some cards. Fix T56858: CUDA 9.2 and 10 issues.	2019-03-15 16:52:28 +01:00
Jeroen Bakker	02a7e875d7	Cycles OpenCL: Remove single program Part of the cleanup of the OpenCL codebase. Single program is not effective when using OpenCL, it is slower to compile and slower during rendering (when used in for example `barbershop` or `victor`). Reviewers: brecht, #cycles Maniphest Tasks: T62267 Differential Revision: https://developer.blender.org/D4481	2019-03-08 16:31:35 +01:00
Jeroen Bakker	949ab753bb	Cycles OpenCL: Remove OpenCL MegaKernel Using OpenCL MegaKernel has been slow and therefore not useful. This patch will remove the mega kernel from the OpenCL codebase and the OpenCLDeviceBase class. T61736: removal of mega kernel T61703: baking does not work with mega kernel Tags: #cycles Differential Revision: https://developer.blender.org/D4383	2019-02-20 15:17:22 +01:00
Jeroen Bakker	667033e89e	T61463: Separate Baking kernels Cycles OpenCL: Split baking kernels in own program Fix T61463. Before this patch baking was part of the base kernels. There are 3 baking kernels, and all 3 use shader evaluation. Only for one of these kernels the functionality was wrapped in the __NO_BAKING__ compile directive. When you start baking this leads to long compile times. Separating them into individual programs reduces the compile times. Also wrapped all baking kernels with __NO_BAKING__ to reduce the compilation times. Impact on compilation time job \| scene_name \| previous \| new \| percentage --------+-----------------+----------+-------+------------ T61463 \| empty \| 10.63 \| 7.27 \| 32% T61463 \| bmw \| 17.91 \| 14.24 \| 20% T61463 \| fishycat \| 19.57 \| 15.08 \| 23% T61463 \| barbershop \| 54.10 \| 48.18 \| 11% T61463 \| classroom \| 17.55 \| 14.42 \| 18% T61463 \| koro \| 18.92 \| 17.15 \| 9% T61463 \| pavillion \| 17.43 \| 14.23 \| 18% T61463 \| splash279 \| 16.48 \| 15.33 \| 7% T61463 \| volume_emission \| 36.22 \| 34.19 \| 6% Impact on render time job \| scene_name \| previous \| new \| percentage --------+-----------------+----------+---------+------------ T61463 \| empty \| 21.06 \| 20.54 \| 2% T61463 \| bmw \| 198.44 \| 189.59 \| 4% T61463 \| fishycat \| 394.20 \| 388.50 \| 1% T61463 \| barbershop \| 1188.16 \| 1185.49 \| 0% T61463 \| classroom \| 341.08 \| 339.27 \| 1% T61463 \| koro \| 472.43 \| 360.70 \| 24% T61463 \| pavillion \| 905.77 \| 902.14 \| 0% T61463 \| splash279 \| 55.26 \| 54.92 \| 1% T61463 \| volume_emission \| 62.59 \| 39.09 \| 38% I don't have a grounded explanation why koro and volume_emission are this much faster; I have done several tests though... Maniphest Tasks: T61463 Differential Revision: https://developer.blender.org/D4376	2019-02-19 16:34:55 +01:00
Brecht Van Lommel	9800837b98	Cycles: Support multithreaded compilation of kernels This patch implements a workaround to get the multithreaded compilation from D2231 working. So far, it only works for Blender, not for Cycles Standalone. Also, I have only tested the Linux codepath in the helper function. Depends on D2231. Patch by lukasstockner97, jbakker, brecht job \| scene_name \| compilation_time ----------+-----------------+------------------ Baseline \| empty \| 22.73 D2264 \| empty \| 13.94 Baseline \| bmw \| 56.44 D2264 \| bmw \| 41.32 Baseline \| fishycat \| 59.50 D2264 \| fishycat \| 45.19 Baseline \| barbershop \| 212.28 D2264 \| barbershop \| 169.81 Baseline \| victor \| 67.51 D2264 \| victor \| 53.60 Baseline \| classroom \| 51.46 D2264 \| classroom \| 39.02 Baseline \| koro \| 62.48 D2264 \| koro \| 49.03 Baseline \| pavillion \| 54.37 D2264 \| pavillion \| 38.82 Baseline \| splash279 \| 47.43 D2264 \| splash279 \| 37.94 Baseline \| volume_emission \| 145.22 D2264 \| volume_emission \| 121.10 This patch reduced compilation time as the split kernels and base kernels are compiled in parallel. In cycles debug mode (256) you can unmark the opencl single program file, which reduces the compilation time even further (bmw 17 seconds, barbershop 53 seconds). Reviewers: brecht, dingto, sergey, juicyfruit, lukasstockner97 Reviewed By: brecht Subscribers: Loner, jbakker, candreacchio, 3dLuver, LazyDodo, bliblubli Differential Revision: https://developer.blender.org/D2264	2019-02-15 08:56:20 +01:00
Brecht Van Lommel	765795aed7	Fix macOS buildbot build, wrong CUDA version check.	2018-12-11 14:16:48 +01:00
Brecht Van Lommel	f5b46daf52	Fix build with old CMake versions.	2018-12-05 12:53:19 +01:00
Brecht Van Lommel	f63da3dcf5	Buildbot: enable support for NVIDIA Turing cards in Cycles (like GTX 20xx). We currently only build the sm_7x kernels with CUDA 10.0, older cards still use 9.1 until rendering errors are solved for them.	2018-12-04 16:03:18 +01:00
Brecht Van Lommel	b14ec18601	Cycles: add initial CUDA 10.0 support, but only recommend use for Turing cards. There may still be rendering errors when used for older graphics cards.	2018-12-04 16:03:18 +01:00
Lukas Stockner	7fa6f72084	Cycles: Add sample-based runtime profiler that measures time spent in various parts of the CPU kernel This commit adds a sample-based profiler that runs during CPU rendering and collects statistics on time spent in different parts of the kernel (ray intersection, shader evaluation etc.) as well as time spent per material and object. The results are currently not exposed in the user interface or per Python yet, to see the stats on the console pass the "--cycles-print-stats" argument to Cycles (e.g. "./blender -- --cycles-print-stats"). Unfortunately, there is no clear way to extend this functionality to CUDA or OpenCL, so it is CPU-only for now. Reviewers: brecht, sergey, swerner Reviewed By: brecht, swerner Differential Revision: https://developer.blender.org/D3892	2018-11-29 02:45:24 +01:00
Stefan Werner	2c5531c0a5	Cycles: Added Embree as BVH option for CPU renders. Note that this is turned off by default and must be enabled at build time with the CMake WITH_CYCLES_EMBREE flag. Embree must be built as a static library with ray masking turned on, the `make deps` scripts have been updated accordingly. There, Embree is off by default too and must be enabled with the WITH_EMBREE flag. Using Embree allows for much faster rendering of deformation motion blur while reducing the memory footprint. TODO: GPU implementation, deduplication of data, leveraging more of Embree's features (e.g. tessellation cache). Differential Revision: https://developer.blender.org/D3682	2018-11-07 12:58:12 +01:00
Stefan Werner	e58c6cf0c6	Cycles: Added Cryptomatte output. This allows for extra output passes that encode automatic object and material masks for the entire scene. It is an implementation of the Cryptomatte standard as introduced by Psyop. A good future extension would be to add a manifest to the export and to do plenty of testing to ensure that it is fully compatible with other renderers and compositing programs that use Cryptomatte. Internally, it adds the ability for Cycles to have several passes of the same type that are distinguished by their name. Differential Revision: https://developer.blender.org/D3538	2018-10-28 05:37:41 -04:00
Brecht Van Lommel	a0402074ed	Fix wrong CUDA version warning in cmake. Fix suggested by Dalai.	2018-09-19 16:24:45 +02:00
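
Note on 51e898324d (Adaptive Sampling for Cycles): the commit message describes keeping a second estimate built from half of the samples and treating a pixel as done once the two estimates agree to within a threshold. The following is only a rough single-pixel sketch of that idea in Python, not the Cycles implementation. The function name, the parameters, and the plain absolute-difference error are all assumptions made for illustration; the 3x3 neighbor filter and the final rescaling of passes mentioned in the commit are left out.

```python
def render_pixel_adaptive(sample_fn, threshold, min_samples=16, max_samples=1024):
    """Hypothetical single-pixel adaptive sampler (illustration only).

    sample_fn: callable returning one radiance sample as a float.
    Returns (pixel_estimate, samples_taken).
    """
    total = 0.0  # sum of all samples
    half = 0.0   # sum of every second sample, standing in for the separate buffer
    n = 0
    while n < max_samples:
        s = sample_fn()
        total += s
        if n % 2 == 1:
            half += s
        n += 1
        # Only test convergence once both estimates cover a whole number of pairs.
        if n >= min_samples and n % 2 == 0:
            full_estimate = total / n
            half_estimate = half / (n // 2)
            # Assumed error metric: plain absolute difference of the two estimates.
            if abs(full_estimate - half_estimate) < threshold:
                break
    return total / n, n
```

With a noise-free `sample_fn` the loop stops as soon as `min_samples` is reached; with a threshold of zero it always runs to `max_samples`.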


239 Commits
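
Note on 71641ab56d and 313b789289 (Map Range and Clamp nodes): the commit messages describe linearly remapping a value from one range to another and clamping a value between a minimum and a maximum. A minimal Python sketch of that arithmetic follows. The function names and the choice to return 0.0 for a zero-width source range are assumptions for illustration and are not taken from the Cycles or EEVEE sources.

```python
def clamp(value, minimum, maximum):
    """Limit value to the closed interval [minimum, maximum]."""
    return max(minimum, min(value, maximum))


def map_range(value, from_min, from_max, to_min, to_max):
    """Linearly remap value from [from_min, from_max] to [to_min, to_max]."""
    span = from_max - from_min
    if span == 0.0:
        # Assumed fallback for a degenerate source range.
        return 0.0
    factor = (value - from_min) / span
    return to_min + factor * (to_max - to_min)
```

For example, `map_range(5.0, 0.0, 10.0, 0.0, 100.0)` gives `50.0`, and `clamp(1.5, 0.0, 1.0)` gives `1.0`.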