Commit Graph

79 Commits

Author SHA1 Message Date
Campbell Barton
123e29c274 Cleanup: missing CMake headers from source lists 2020-07-16 13:17:31 +10:00
Szymon Ulatowski
9de09220fc EEVEE: Implement the missing Sky texture
I'm not sure if the Sky was deliberately left out or was just waiting for a
better moment, but so many I was disappointed that Sky in EEVEE is
completely white.

There are already 2 implementations (osl and gpu) so this is the third one.
Looking at other cases it seems that we are not supposed to share sources
between cycles and the rest? So the new util_sky_model files are just
copies of what is already in cycles, except that the data file uses the RGB
variant of the Hosek/Wilkie model, because we output RGB anyway (but can be
easily changed to XYZ if desired - the results are nearly identical).
I am not sure if it is okay to pass 3*9 float values as 3 mat4 uniforms (I
wanted to use mat3 but it does not work).
Also, should I cache the sky model data between renders if the parameters
do not change?

Reviewed By: fclem, brecht

Differential Revision: https://developer.blender.org/D7108
2020-07-09 17:31:36 +02:00
Brecht Van Lommel
669befdfbe Cycles: add Intel OpenImageDenoise support for viewport denoising
Compared to Optix denoise, this is usually slower since there is no GPU
acceleration. Some optimizations may still be possible, in avoid copies
to the GPU and/or denoising less often.

The main thing is that this adds viewport denoising support for computers
without an NVIDIA GPU (as long as the CPU supports SSE 4.1, which is nearly
all of them).

Ref T76259
2020-06-24 15:17:36 +02:00
Brecht Van Lommel
e389c05410 Cleanup: move TBB includes into own header 2020-06-24 14:32:06 +02:00
Brecht Van Lommel
d8c2092b15 Cycles: make TBB a required library dependency, and use in a few places
Now that the rest of Blender also relies on TBB, no point in maintaining custom
code for paraller_for and thread local storage.
2020-06-22 13:06:47 +02:00
Lukas Stockner
eacdcb2dd8 Cycles: Add new Sky Texture method including direct sunlight
This commit adds a new model to the Sky Texture node, which is based on a
method by Nishita et al. and works by basically simulating volumetric
scattering in the atmosphere.

By making some approximations (such as only considering single scattering),
we get a fairly simple and fast simulation code that takes into account
Rayleigh and Mie scattering as well as Ozone absorption.

This code is used to precompute a 512x128 texture which is then looked up
during render time, and is fast enough to allow real-time tweaking in the
viewport.

Due to the nature of the simulation, it exposes several parameters that
allow for lots of flexibility in choosing the look and matching real-world
conditions (such as Air/Dust/Ozone density and altitude).

Additionally, the same volumetric approach can be used to compute absorption
of the direct sunlight, so the model also supports adding direct sunlight.
This makes it significantly easier to set up Sun+Sky illumination where
the direction, intensity and color of the sun actually matches the sky.

In order to support properly sampling the direct sun component, the commit
also adds logic for sampling a specific area to the kernel light sampling
code. This is combined with portal and background map sampling using MIS.

This sampling logic works for the common case of having one Sky texture
going into the Background shader, but if a custom input to the Vector
node is used or if there are multiple Sky textures, it falls back to using
only background map sampling (while automatically setting the resolution to
4096x2048 if auto resolution is used).

More infos and preview can be found here:
https://docs.google.com/document/d/1gQta0ygFWXTrl5Pmvl_nZRgUw0mWg0FJeRuNKS36m08/view

Underlying model, implementation and documentation by Marco (@nacioss).
Improvements, cleanup and sun sampling by @lukasstockner.

Differential Revision: https://developer.blender.org/D7896
2020-06-17 21:06:41 +02:00
Brecht Van Lommel
f48d15a861 Cycles: limit number of processes compiling OpenCL kernel based on memory
The numbers here can probably be tweaked to be better, but it's hard to
predict and this should at least avoid excessive memory swapping.

Fixes T57064.
2020-03-25 16:39:37 +01:00
OmarSquircleArt
1c2f7b022a Cycles: Add Random Per Island attribute.
The Random Per Island attribute is a random float associated with each
connected component (island) of the mesh. It is particularly useful
when artists want to add variations to meshes composed of separate
units. Like tree leaves created using particle systems, wood planks
created using array modifiers, or abstract splines created using AN.

Reviewed By: Sergey Sharybin, Jacques Lucke

Differential Revision: https://developer.blender.org/D6154
2019-11-27 12:07:20 +02:00
Campbell Barton
312075e688 CMake: add missing headers, use space before comments 2019-10-29 01:33:44 +11:00
Mai Lavelle
697fd86506 Cycles: Stitching of subdivided and displaced meshes
This patch stitches the vertices along patch edges so that cracks can
no longer form when applying subdivision or displacement a mesh.

Subpatches are now formed in a way that ensures vertex indices along
subpatch edges are equal for adjacent subpatches. A mapping of vertices
along patch edges is built to preform stitching. Overall performance is
roughly the same, some gains were made in splitting, but some was lost
in stitching.

This fixes:
- T49049 (cracks between patches from material and uv seams)
- T49048 (discontinuous normals with true displacement)

Reviewers: sergey, brecht

Differential Revision: https://developer.blender.org/D3692
2019-08-27 14:27:53 -04:00
Campbell Barton
dfe2ca26f7 Cleanup: style, indentation 2019-06-19 07:32:21 +10:00
Brecht Van Lommel
b10921f0cc Fix Cycles CUDA suboptimal performance on Windows 10 with recent graphics cards
When compute preemption is available we schedule more work which is more
efficient. However the CUDA driver appears to be incorrectly reporting this as
unavailable, even though it should be supported starting with Windows 10 1803
and Pascal and Turing (10x0 and 20x0) graphics cards.

This reduces render time by about a 25% difference on our benchmark scenes. On
Linux compute preemption appears to be reported correctly.
2019-06-18 20:05:36 +02:00
Campbell Barton
e12c08e8d1 ClangFormat: apply to source, most of intern
Apply clang format as proposed in T53211.

For details on usage and instructions for migrating branches
without conflicts, see:

https://wiki.blender.org/wiki/Tools/ClangFormat
2019-04-17 06:21:24 +02:00
Campbell Barton
813e470eac CMake: cleanup, arg rename, add definitions last 2019-04-16 06:15:18 +02:00
Brecht Van Lommel
d377cd5db1 Fix Cycles standalone build as part of Blender. 2019-01-27 20:05:25 +01:00
Sergey Sharybin
826d7adde7 Fix T59874: Cycles CPU 25% load only during rendering
The issue was introduced by a Threadripper2 commit back in
ce927e15e0. This boils down to threads inheriting affinity
from the parent thread. It is a question how this slipped
through the review (we definitely run benchmark round).

Quick fix could have been to always set CPU group affinity
in Cycles, and it would work for Windows. On other platforms
we did not have CPU groups API finished.

Ended up making Cycles aware of NUMA topology, so now we
bound threads to a specific NUMA node. This required adding
an external dependency to Cycles, but made some code there
shorter.
2018-12-27 19:12:59 +01:00
Lukas Stockner
7fa6f72084 Cycles: Add sample-based runtime profiler that measures time spent in various parts of the CPU kernel
This commit adds a sample-based profiler that runs during CPU rendering and collects statistics on time spent in different parts of the kernel (ray intersection, shader evaluation etc.) as well as time spent per material and object.

The results are currently not exposed in the user interface or per Python yet, to see the stats on the console pass the "--cycles-print-stats" argument to Cycles (e.g. "./blender -- --cycles-print-stats").

Unfortunately, there is no clear way to extend this functionality to CUDA or OpenCL, so it is CPU-only for now.

Reviewers: brecht, sergey, swerner

Reviewed By: brecht, swerner

Differential Revision: https://developer.blender.org/D3892
2018-11-29 02:45:24 +01:00
Sergey Sharybin
c86d4b1d80 Cycles: Cleanup, split array from vector
Those are similar but different types, no reason to keep
their definitions in a single file.
2018-11-09 11:54:24 +01:00
Stefan Werner
e58c6cf0c6 Cycles: Added Cryptomatte output.
This allows for extra output passes that encode automatic object and material masks
for the entire scene. It is an implementation of the Cryptomatte standard as
introduced by Psyop. A good future extension would be to add a manifest to the
export and to do plenty of testing to ensure that it is fully compatible with other
renderers and compositing programs that use Cryptomatte.

Internally, it adds the ability for Cycles to have several passes of the same type
that are distinguished by their name.

Differential Revision: https://developer.blender.org/D3538
2018-10-28 05:37:41 -04:00
Sergey Sharybin
73f2056052 Cycles: Add BVH8 and packeted triangle intersection
This is an initial implementation of BVH8 optimization structure
and packated triangle intersection. The aim is to get faster ray
to scene intersection checks.

    Scene                BVH4      BVH8
barbershop_interior    10:24.94   10:10.74
bmw27                  02:41.25   02:38.83
classroom              08:16.49   07:56.15
fishy_cat              04:24.56   04:17.29
koro                   06:03.06   06:01.45
pavillon_barcelona     09:21.26   09:02.98
victor                 23:39.65   22:53.71

As memory goes, peak usage raises by about 4.7% in a complex
scenes.

Note that BVH8 is disabled when using OSL, this is because OSL
kernel does not get per-microarchitecture optimizations and
hence always considers BVH3 is used.

Original BVH8 patch from Anton Gavrikov.
Batched triangles intersection from Victoria Zhislina.
Extra work and tests and fixes from Maxym Dmytrychenko.
2018-08-29 15:03:09 +02:00
Stefan Werner
4d00e95ee3 Cycles: Adding native support for UINT16 textures.
Textures in 16 bit integer format are sometimes used for displacement, bump and normal maps and can be exported by tools like Substance Painter. Without this patch, Cycles would promote those textures to single precision floating point, causing them to take up twice as much memory as needed.

Reviewers: #cycles, brecht, sergey

Reviewed By: #cycles, brecht, sergey

Subscribers: sergey, dingto, #cycles

Tags: #cycles

Differential Revision: https://developer.blender.org/D3523
2018-07-05 13:53:34 +02:00
Lukas Stockner
48155c210a Cycles: Add Support for IES files as textures for light strength
This patch adds support for IES files, a file format that is commonly used to store the directional intensity distribution of light sources.
The new IES node is supposed to be plugged into the Strength input of the Emission node of the lamp.

Since people generating IES files do not really seem to care about the standard, the parser is flexible enough to accept all test files I have tried.
Some common weirdnesses are distributing values over multiple lines that should go into one line, using commas instead of spaces as delimiters and adding various useless stuff at the end of the file.

The user interface of the node is similar to the script node, the user can either select an internal Text or load a file.
Internally, IES files are handled similar to Image textures: They are stored in slots by the LightManager and each unique IES is assigned to one slot.

The local coordinate system of the lamp is used, so that the direction of the light can be changed. For UI reasons, it's usually best to add an area light,
rotate it and then change its type, since especially the point light does not immediately show its local coordinate system in the viewport.

Reviewers: #cycles, dingto, sergey, brecht

Reviewed By: #cycles, dingto, brecht

Subscribers: OgDEV, crazyrobinhood, secundar, cardboard, pisuke, intrah, swerner, micah_denn, harvester, gottfried, disnel, campbellbarton, duarteframos, Lapineige, brecht, juicyfruit, dingto, marek, rickyblender, bliblubli, lockal, sergey

Differential Revision: https://developer.blender.org/D1543
2018-05-27 01:24:57 +02:00
Brecht Van Lommel
516e82a900 Code refactor: add ProjectionTransform separate from regular Transform.
This is in preparation of making Transform affine only.
2018-03-10 04:54:04 +01:00
Ray Molenkamp
36c1122b96 msvc: Use source folder structure for project file.
This patch changes the huge list of projects in visual studio into a nice tree matching the source folder structure. see D2823 for details.

Differential Revision: http://developer.blender.org/D2823
2018-02-03 16:38:27 -07:00
Lukas Stockner
fa3d50af95 Cycles: Improve denoising speed on GPUs with small tile sizes
Previously, the NLM kernels would be launched once per offset with one thread per pixel.
However, with the smaller tile sizes that are now feasible, there wasn't enough work to fully occupy GPUs which results in a significant slowdown.

Therefore, the kernels are now launched in a single call that handles all offsets at once.
This has two downsides: Memory accesses to accumulating buffers are now atomic, and more importantly, the temporary memory now has to be allocated for every shift at once, increasing the required memory.
On the other hand, of course, the smaller tiles significantly reduce the size of the memory.

The main bottleneck right now is the construction of the transformation - there is nothing to be parallelized there, one thread per pixel is the maximum.
I tried to parallelize the SVD implementation by storing the matrix in shared memory and launching one block per pixel, but that wasn't really going anywhere.

To make the new code somewhat readable, the handling of rectangular regions was cleaned up a bit and commented, it should be easier to understand what's going on now.
Also, some variables have been renamed to make the difference between buffer width and stride more apparent, in addition to some general style cleanup.
2017-11-30 07:37:08 +01:00
Brecht Van Lommel
a8cc0d707e Code refactor: split defines into separate header, changes to SSE type headers.
I need to use some macros defined in util_simd.h for float3/float4, to emulate
SSE4 instructions on SSE2. But due to issues with order of header includes this
was not possible, this does some refactoring to make it work.

Differential Revision: https://developer.blender.org/D2764
2017-08-07 14:01:24 +02:00
Lukas Stockner
43b374e8c5 Cycles: Implement denoising option for reducing noise in the rendered image
This commit contains the first part of the new Cycles denoising option,
which filters the resulting image using information gathered during rendering
to get rid of noise while preserving visual features as well as possible.

To use the option, enable it in the render layer options. The default settings
fit a wide range of scenes, but the user can tweak individual settings to
control the tradeoff between a noise-free image, image details, and calculation
time.

Note that the denoiser may still change in the future and that some features
are not implemented yet. The most important missing feature is animation
denoising, which uses information from multiple frames at once to produce a
flicker-free and smoother result. These features will be added in the future.

Finally, thanks to all the people who supported this project:

- Google (through the GSoC) and Theory Studios for sponsoring the development
- The authors of the papers I used for implementing the denoiser (more details
  on them will be included in the technical docs)
- The other Cycles devs for feedback on the code, especially Sergey for
  mentoring the GSoC project and Brecht for the code review!
- And of course the users who helped with testing, reported bugs and things
  that could and/or should work better!
2017-05-07 14:40:58 +02:00
Sergey Sharybin
0a07cdbe80 Cycles: Split vectorized math utilities to a dedicated files
This file was even a bigger mess than vectorized types header,
cleaning it up to make it easier to maintain this files and
extend further.
2017-04-25 10:33:26 +02:00
Sergey Sharybin
51ec9441b7 Cycles: Split vectorized types into separate files
The final goal to reach is to make vectorized types much easier to maintain
and the previous design had following issues:

- Having all types and methods implementation made the source file rather
  bloated and unfun to navigate in.

- It was not possible to quickly glance available API for the type you are
  interested in.

- Adding more vectorization types will bloat the file even more, making
  things even more tricky to follow.
2017-04-25 10:33:26 +02:00
Sergey Sharybin
0579eaae1f Cycles: Make all #include statements relative to cycles source directory
The idea is to make include statements more explicit and obvious where the
file is coming from, additionally reducing chance of wrong header being
picked up.

For example, it was not obvious whether bvh.h was refferring to builder
or traversal, whenter node.h is a generic graph node or a shader node
and cases like that.

Surely this might look obvious for the active developers, but after some
time of not touching the code it becomes less obvious where file is coming
from.

This was briefly mentioned in T50824 and seems @brecht is fine with such
explicitness, but need to agree with all active developers before committing
this.

Please note that this patch is lacking changes related on GPU/OpenCL
support. This will be solved if/when we all agree this is a good idea to move
forward.

Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner

Reviewed By: lukasstockner97, maiself, nirved, dingto

Subscribers: brecht

Differential Revision: https://developer.blender.org/D2586
2017-03-29 13:41:11 +02:00
Sergey Sharybin
1c5cceb7af Cycles: Move intersection math to own header file
There are following benefits:

- Modifying intersection algorithm will not cause so much re-compilation.
- It works around header dependency hell and allows us to use vectorization
  types much easier in there.
2017-03-23 17:45:19 +01:00
Sergey Sharybin
272412f9c0 Cycles: Implement texture size limit simplify option
Main intention is to give some quick way to control scene's memory
usage by clamping textures which are too big. This is really handy
on the early production stages when you first create really nice
looking hi-res textures and only when it all works and approved
start investing time on optimizing your scene.

This is a new option in Scene Simplify panel and it acts as
following: when texture size is bigger than the given value it'll
be scaled down by half for until it fits into given limit.

There are various possible improvements, such as:

- Use threaded scaling using our own task manager.

  This is actually one of the main reasons why image resize is
  manually-implemented instead of using OIIO's resize. Other
  reason here is that API seems limited to construct 3D texture
  description easily.

- Vectorization of uchar4/float4/half4 textures.

- Use something smarter than box filter.

  Was playing with some other filters, but not sure they are
  really better: they kind of causes more fuzzy edges.

Even with such a TODOs in the code the option is already quite
useful.

Reviewers: brecht

Reviewed By: brecht

Subscribers: jtheninja, Blendify, gregzaal, venomgfx

Differential Revision: https://developer.blender.org/D2362
2016-11-22 12:00:09 +01:00
Sergey Sharybin
6a4ec3ca43 Cycles: Add new avxf vectorized data type
Based on existing ssef data type and to my knowledge it's also what happens in
Embree nowadays.

Inspired by Maxym Dmytrychenko and required for the upcoming triangle
intersection commit.

Hopefully the copyright message is correct.
2016-10-12 13:54:13 +02:00
Mai Lavelle
013a5c27a5 Cycles: Remove odd definition from CMake file
This was causing Cycles standalone to fail to build from Blender repo.
Hopefully nothing breaks from removing this.
2016-08-11 14:35:43 -04:00
Sergey Sharybin
fdc43f993d Cycles: Use static assert to control structures alignment 2016-08-11 10:12:06 +02:00
Sergey Sharybin
b62faa54de Cycles: Add support of processor groups
Currently for windows only, this is an initial commit towards native
support of NUMA.

Current commit makes it so Cycles will use all logical processors on
Windows running on system with more than 64 threads.

Reviewers: juicyfruit, dingto, lukasstockner97, maiself, brecht

Subscribers: LazyDodo

Differential Revision: https://developer.blender.org/D2049
2016-06-06 09:14:37 +02:00
Thomas Dinges
9c916b0172 Cleanup: Move texture definitions to util, to avoid bad level include. 2016-04-15 23:02:44 +02:00
Thomas Dinges
ed050753ce Add a version number to Cycles standalone
Now Cycles has its own versioning, that is mainly interesting for external projects, which integrate the engine.

We start with version 1.7.0. Reasons for that:

* The engine is too mature for a 1.0 release.
* We assume that Cycles inside of Blender 2.61 was version 0.1. We count upwards in 0.1 steps, therefore Cycles inside of Blender 2.77 would be 1.7.

We use a common versioning scheme here, with 3 decimals for the major, minor and patch level.

At the moment cycles --version can be used to display the version, easy to parse for external projects. The info will be added to the UI later aswell.
2016-04-13 09:45:23 +02:00
Sergey Sharybin
ffe59c54cb Cycles: Add STL allocator which uses stack memory
At this point we might want to rename allocator files to
util_allocator_foo.c so the stay nicely grouped in the folder.
2016-03-31 10:06:21 +02:00
Thomas Dinges
6a593aba44 Cleanup: Move Cycles sky model data to util. 2016-02-13 13:41:40 +01:00
Sergey Sharybin
c8d2bc7890 Cycles: Always use guarded allocator of vectors
We don't have vectors re-allocation happening multiple times from inside
a loop anymore, so we can safely switch to a memory guarded allocator for
vectors and keep track on the memory usage at various stages of rendering.

Additionally, when building from inside Blender repository, Cycles will
use Blender's guarded allocator, so actual memory usage will be displayed
in the Space Info header.

There are couple of tricky aspects of the patch:

- TaskScheduler::exit() now explicitly frees memory used by `threads`.
  This is needed because `threads` is a static member which destructor
  isn't getting called on Blender's exit which caused memory leak print
  to happen.

  This shouldn't give any measurable speed issues, reallocation of that
  vector is only one of fewzillion other allocations happening during
  synchronization.

- Use regular guarded malloc (not aligned one). No idea why it was
  made to be aligned in the first place. Perhaps some corner case tests
  or so. Vector was never expected to be aligned anyway. Let's see if
  we'll have actual bugs with this.

Reviewers: dingto, lukasstockner97, juicyfruit, brecht

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D1774
2016-02-12 15:43:26 +01:00
Sergey Sharybin
f25f7c8030 Cycles: Re-implement some utilities to avoid use of boost
The title says it all actually, the idea is to make Cycles
only requiring Boost via 3rd party dependencies like OIIO
and OSL.

So now there are only few places which still uses Boost:

- Foreach, function bindings and threading primitives.

  Those we can easily get rid with C++11 bump (which seems
  inevitable sooner or later if we'll want ot use newer
  LLVM for OSL),

- Networking devices

  There's no quick solution for those currently, but there
  are some patches around which improves serialization.

Reviewers: juicyfruit, mont29, campbellbarton, brecht, dingto

Reviewed By: brecht, dingto

Differential Revision: https://developer.blender.org/D1764
2016-02-06 19:19:20 +01:00
Brecht Van Lommel
d5e929a9d3 Code cleanup: remove unused Cycles code from BVH cache. 2016-02-06 11:55:35 +01:00
Sergey Sharybin
ac7aefd7c2 Cycles: Use special debug panel to fine-tune debug flags
This panel is only visible when debug_value is set to 256 and has no
affect in other cases. However, if debug value is not set to this
value, environment variables will be used to control which features
are enabled, so there's no visible changes to anyone in fact.

There are some changes needed to prevent devices re-enumeration on
every Cycles session create.

Reviewers: juicyfruit, lukasstockner97, dingto, brecht

Reviewed By: lukasstockner97, dingto

Differential Revision: https://developer.blender.org/D1720
2016-01-12 16:21:30 +05:00
Sergey Sharybin
6552d5bebd Cycles: Avoid recursion when doing constant fold
This reduces stress on the the stack memory which could be really handy
on certain operation systems which applies strict limits on the stack.

Reviewers: brecht, juicyfruit, dingto

Reviewed By: brecht, juicyfruit, dingto

Differential Revision: https://developer.blender.org/D1656
2015-12-02 16:19:39 +05:00
Sergey Sharybin
548ef2d88b Cycles: Add utility functions to evaluate CDF of a given functor 2015-10-28 02:43:06 +05:00
Campbell Barton
483fa4c387 CMake: picky style edit
'cmake_consistency_check.py' relies on this formattng.
2015-02-19 07:15:00 +11:00
Sergey Sharybin
a445e49186 Cycles: Implement guarded allocator for STL classes
The commit implements a guarded allocator which can be used by STL classes
such as vectors, maps and so on. This allocator will keep track of current
and peak memory usage which then can be queried.

New code for allocator is only active when building Cycles with debug flag
(WITH_CYCLES_DEBUG) and doesn't distort regular builds too much.

Additionally now we're using own subclass of std::vector which allows us
to implement shrink_to_fit() method which would ensure capacity of the
vector is as big as it should be (without this making vector smaller will
still use all previous memory allocated).
2015-02-15 02:01:48 +05:00
Sergey Sharybin
01067fe51c Cycles: Replace own aligned allocator with system one
This replaces our own implementation of aligned malloc with system calls,
which depends on which operation system you're on.

This is probably really minor noticeable change, but in the same time it
might reduce amount of wasted memory.
2015-02-15 02:01:48 +05:00
Sergey Sharybin
dc1043dda0 Cycles: Add fast math function module
It is based on fmath.h from OIIO and could be used to give some speedup
in areas where absolute accuracy is not so critical.
2015-01-31 01:49:41 +05:00