Commit Graph

247 Commits

Author SHA1 Message Date
Lukas Stockner
7fa6f72084 Cycles: Add sample-based runtime profiler that measures time spent in various parts of the CPU kernel
This commit adds a sample-based profiler that runs during CPU rendering and collects statistics on time spent in different parts of the kernel (ray intersection, shader evaluation etc.) as well as time spent per material and object.

The results are currently not exposed in the user interface or per Python yet, to see the stats on the console pass the "--cycles-print-stats" argument to Cycles (e.g. "./blender -- --cycles-print-stats").

Unfortunately, there is no clear way to extend this functionality to CUDA or OpenCL, so it is CPU-only for now.

Reviewers: brecht, sergey, swerner

Reviewed By: brecht, swerner

Differential Revision: https://developer.blender.org/D3892
2018-11-29 02:45:24 +01:00
Sergey Sharybin
cb4b5e12ab Cycles: Cleanup, spacing after preprocessor
It is supposed to be two spaces before comment stating which if
else/endif statements corresponds to. Was mainly violated in the
header guards.
2018-11-09 11:34:54 +01:00
Campbell Barton
1daa20ad9f Cleanup: strip trailing space for cycles 2018-07-06 10:17:58 +02:00
Lukas Stockner
799779d432 Cycles: change Ambient Occlusion shader to output colors.
This means the shader can now be used for procedural texturing. New
settings on the node are Samples, Inside, Local Only and Distance.

Original patch by Lukas with further changes by Brecht.

Differential Revision: https://developer.blender.org/D3479
2018-06-15 22:16:06 +02:00
Brecht Van Lommel
7f86afec9d Cycles: don't count volume boundaries as transparent bounces.
This is more important now that we will have tigther volume bounds that
we hit multiple times. It also avoids some noise due to RR previously
affecting these surfaces, which shouldn't have been the case and should
eventually be fixed for transparent BSDFs as well.

For non-volume scenes I found no performance impact on NVIDIA or AMD.
For volume scenes the noise decrease and fixed artifacts are worth the
little extra render time, when there is any.
2018-03-01 01:21:29 +01:00
Brecht Van Lommel
2d81758aa6 Cycles: better path termination for transparency.
We now continue transparent paths after diffuse/glossy/transmission/volume
bounces are exceeded. This avoids unexpected boundaries in volumes with
transparent boundaries. It is also required for MIS to work correctly with
transparent surfaces, as we also continue through these in shadow rays.

The main visible changes is that volumes will now be lit by the background
even at volume bounces 0, same as surfaces.

Fixes T53914 and T54103.
2018-02-22 00:55:32 +01:00
Brecht Van Lommel
606bc5f301 Fix T54105: random walk SSS missing in branched indirect paths.
Unify the path and branched path indirect SSS code. No performance impact
found on CUDA, for AMD split kernel the extra code was already there.
2018-02-21 17:56:26 +01:00
Brecht Van Lommel
0df9b2c715 Cycles: random walk subsurface scattering.
It is basically brute force volume scattering within the mesh, but part
of the SSS code for faster performance. The main difference with actual
volume scattering is that we assume the boundaries are diffuse and that
all lighting is coming through this boundary from outside the volume.

This gives much more accurate results for thin features and low density.
Some challenges remain however:

* Significantly more noisy than BSSRDF. Adding Dwivedi sampling may help
  here, but it's unclear still how much it helps in real world cases.
* Due to this being a volumetric method, geometry like eyes or mouth can
  darken the skin on the outside. We may be able to reduce this effect,
  or users can compensate for it by reducing the scattering radius in
  such areas.
* Sharp corners are quite bright. This matches actual volume rendering
  and results in some other renderers, but maybe not so much real world
  objects.

Differential Revision: https://developer.blender.org/D3054
2018-02-09 19:58:33 +01:00
Brecht Van Lommel
aabafece03 Code refactor: tweaks in SSS code to prepare for coming changes.
This also fixes a subtle bug in the split kernel branched path SSS, the
volume stack update can't be shared between multiple hit points.
2018-02-08 16:56:11 +01:00
Lukas Stockner
322f0223d0 Cycles: option to make background visible through glass transparent.
This can be enabled in the Film panel, with an option to control the
transmisison roughness below which glass becomes transparent.

Differential Revision: https://developer.blender.org/D2904
2018-01-12 01:34:28 +01:00
Mathieu Menuet
83e80db56e Fix T53349: AO bounces not working correct with OpenCL. 2017-11-26 15:53:00 +01:00
Lukas Stockner
f78e963858 Cycles: Refactor PassType from bitflag to index in order to allow for more passes 2017-11-17 16:34:19 +01:00
Mai Lavelle
087331c495 Cycles: Replace __MAX_CLOSURE__ build option with runtime integrator variable
Goal is to reduce OpenCL kernel recompilations.

Currently viewport renders are still set to use 64 closures as this seems to
be faster and we don't want to cause a performance regression there. Needs
to be investigated.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2775
2017-11-09 01:04:06 -05:00
Brecht Van Lommel
8a72be7697 Cycles: reduce closure memory usage for emission/shadow shader data.
With a Titan Xp, reduces path trace local memory from 1092MB to 840MB.
Benchmark performance was within 1% with both RX 480 and Titan Xp.

Original patch was implemented by Sergey.

Differential Revision: https://developer.blender.org/D2249
2017-11-05 20:48:33 +01:00
Brecht Van Lommel
2c02a04c46 Code refactor: remove emission and background closures, sum directly. 2017-11-05 18:13:44 +01:00
Brecht Van Lommel
171c4e982f Cycles: use AO factor to let user adjust intensity of AO bounces.
We are already using the AO distance, so might as well offer this extra
control over the intensity. Useful when an interior scene is supposed to
be significantly darker than the background shader.
2017-10-25 21:46:23 +02:00
Brecht Van Lommel
cdb0b3b1dc Code refactor: use DeviceInfo to enable QBVH and decoupled volume shading. 2017-10-08 13:17:33 +02:00
Brecht Van Lommel
5bb677e592 Code refactor: zero render buffers outside of kernel.
This was originally done with the first sample in the kernel for better
performance, but it doesn't work anymore with atomics. Any benefit was
very minor anyway, too small to measure it seems.
2017-10-04 21:11:14 +02:00
Brecht Van Lommel
e3e16cecc4 Code refactor: remove rng_state buffer and compute hash on the fly.
A little faster on some benchmark scenes, a little slower on others, seems
about performance neutral on average and saves a little memory.
2017-10-04 21:11:14 +02:00
Brecht Van Lommel
400e6f37b8 Cycles: reduce subsurface stack memory usage.
This is done by storing only a subset of PathRadiance, and by storing
direct light immediately in the main PathRadiance. Saves about 10% of
CUDA stack memory, and simplifies subsurface indirect ray code.
2017-09-28 15:18:43 +02:00
Brecht Van Lommel
90d4b823d7 Cycles: use defensive sampling for picking BSDFs and BSSRDFs.
For the first bounce we now give each BSDF or BSSRDF a minimum sample weight,
which helps reduce noise for a typical case where you have a glossy BSDF with
a small weight due to Fresnel, but not necessarily small contribution relative
to a diffuse or transmission BSDF below.

We can probably find a better heuristic that also enables this on further
bounces, for example when looking through a perfect mirror, but I wasn't able
to find a robust one so far.
2017-09-20 19:38:08 +02:00
Brecht Van Lommel
095a01a73a Cycles: slightly improve BSDF sample stratification for path tracing.
Similar to what we did for area lights previously, this should help
preserve stratification when using multiple BSDFs in theory. Improvements
are not easily noticeable in practice though, because the number of BSDFs
is usually low. Still nice to eliminate one sampling dimension.
2017-09-20 19:38:08 +02:00
Brecht Van Lommel
b3afc8917c Code cleanup: refactor BSSRDF closure sampling, for next commit. 2017-09-20 19:38:08 +02:00
Brecht Van Lommel
d750d182e5 Code cleanup: remove hack to avoid seeing transparent objects in noise.
Previously the Sobol pattern suffered from some correlation issues that
made the outline of objects like a smoke domain visible. This helps
simplify the code and also makes some other optimizations possible.
2017-09-20 19:38:08 +02:00
Carlo Andreacchio
ab9079f459 Fix Cycles adaptive compile without volumes broken after recent changes.
Differential Revision: https://developer.blender.org/D2847
2017-09-18 12:52:32 +02:00
Brecht Van Lommel
32449e1b21 Code cleanup: store branch factor in PathState. 2017-09-13 15:24:14 +02:00
Brecht Van Lommel
37d9e65ddf Code cleanup: abstract shadow catcher logic more into accumulation code. 2017-09-13 15:24:14 +02:00
Brecht Van Lommel
f77cdd1d59 Code cleanup: deduplicate some branched and split kernel code.
Benchmarks peformance on GTX 1080 and RX 480 on Linux is the same for
bmw27, classroom, pabellon, and about 2% faster on fishy_cat and koro.
2017-09-13 15:24:14 +02:00
Brecht Van Lommel
c4c450045d Code cleanup: tweak inlining for 2% better CUDA performance with hair. 2017-09-13 15:24:14 +02:00
Mathieu Menuet
659ba012b0 Cycles: change AO bounces approximation to do more glossy and transmission.
Rather than treating all ray types equally, we now always render 1 glossy
bounce and unlimited transmission bounces. This makes it possible to get
good looking results with low AO bounces settings, making it useful to
speed up interior renders for example.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2818
2017-09-12 15:37:35 +02:00
Sergey Sharybin
f01e43fac3 Fix T52433: Volume Absorption color tint
Need to exit the volume stack when shadow ray laves the medium.

Thanks Brecht for review and help in troubleshooting!
2017-09-05 15:48:34 +02:00
Brecht Van Lommel
b85d36d811 Code cleanup: remove shader context.
This was needed when we accessed OSL closure memory after shader evaluation,
which could get overwritten by another shader evaluation. But all closures
are immediatley converted to ShaderClosure now, so no longer needed.
2017-08-24 03:43:02 +02:00
Brecht Van Lommel
cfa8b762e2 Code cleanup: move rng into path state.
Also pass by value and don't write back now that it is just a hash for seeding
and no longer an LCG state. Together this makes CUDA a tiny bit faster in my
tests, but mainly simplifies code.
2017-08-19 18:14:16 +02:00
Brecht Van Lommel
dc7fcebb33 Code cleanup: make L_transparent part of PathRadiance. 2017-08-13 01:19:07 +02:00
Brecht Van Lommel
7542282c06 Code cleanup: make DebugData part of PathRadiance. 2017-08-13 01:19:07 +02:00
Brecht Van Lommel
601f94a3c2 Code cleanup: remove unused Cycles random number code. 2017-08-12 20:40:38 +02:00
Brecht Van Lommel
85ad248c36 Code cleanup: fix warning and improve terminology. 2017-08-12 13:18:05 +02:00
Sergey Sharybin
bd069a89aa Fix T52229: Shadow Catcher artifacts when under transparency
Added some extra tirckery to avoid background being tinted dark with transparent
surface. Maybe a bit hacky, but seems to work fine.
2017-08-11 13:49:50 +02:00
Brecht Van Lommel
fc38276d74 Fix Cycles shadow catcher objects influencing each other.
Since all the shadow catchers are already assumed to be in the footage,
the shadows they cast on each other are already in the footage too. So
don't just let shadow catchers skip self, but all shadow catchers.

Another justification is that it should not matter if the shadow catcher
is modeled as one object or multiple separate objects, the resulting
render should be the same.

Differential Revision: https://developer.blender.org/D2763
2017-08-07 17:54:26 +02:00
Brecht Van Lommel
2a74f36dac Fix Cycles CUDA adaptive megakernel build error. 2017-08-07 00:27:08 +02:00
Sergey Sharybin
5f35682f3a Fix T52021: Shadow catcher renders wrong when catcher object is behind transparent object
Tweaked the path radiance summing and alpha to accommodate for possible contribution of
light by transparent surface bounces happening prior to shadow catcher intersection.

This commit will change the way how shadow catcher results looks when was behind semi
transparent object, but the old result seemed to be fully wrong: there were big artifacts
when alpha-overing the result on some actual footage.
2017-07-18 09:46:21 +02:00
Hristo Gueorguiev
40e6f65ea1 Fix T50937: baking with OpenCL and CPU have slightly different brightness
OpenCL baking with SSS and Volume are not supported.
2017-05-17 12:24:16 +02:00
Lukas Stockner
43b374e8c5 Cycles: Implement denoising option for reducing noise in the rendered image
This commit contains the first part of the new Cycles denoising option,
which filters the resulting image using information gathered during rendering
to get rid of noise while preserving visual features as well as possible.

To use the option, enable it in the render layer options. The default settings
fit a wide range of scenes, but the user can tweak individual settings to
control the tradeoff between a noise-free image, image details, and calculation
time.

Note that the denoiser may still change in the future and that some features
are not implemented yet. The most important missing feature is animation
denoising, which uses information from multiple frames at once to produce a
flicker-free and smoother result. These features will be added in the future.

Finally, thanks to all the people who supported this project:

- Google (through the GSoC) and Theory Studios for sponsoring the development
- The authors of the papers I used for implementing the denoiser (more details
  on them will be included in the technical docs)
- The other Cycles devs for feedback on the code, especially Sergey for
  mentoring the GSoC project and Brecht for the code review!
- And of course the users who helped with testing, reported bugs and things
  that could and/or should work better!
2017-05-07 14:40:58 +02:00
Mai Lavelle
915766f42d Cycles: Branched path tracing for the split kernel
This implements branched path tracing for the split kernel.

General approach is to store the ray state at a branch point, trace the
branched ray as normal, then restore the state as necessary before iterating
to the next part of the path. A state machine is used to advance the indirect
loop state, which avoids the need to add any new kernels. Each iteration the
state machine recreates as much state as possible from the stored ray to keep
overall storage down.

Its kind of hard to keep all the different integration loops in sync, so this
needs lots of testing to make sure everything is working correctly. We should
probably start trying to deduplicate the integration loops more now.

Nonbranched BMW is ~2% slower, while classroom is ~2% faster, other scenes
could use more testing still.

Reviewers: sergey, nirved

Reviewed By: nirved

Subscribers: Blendify, bliblubli

Differential Revision: https://developer.blender.org/D2611
2017-05-02 14:26:46 -04:00
Sergey Sharybin
0579eaae1f Cycles: Make all #include statements relative to cycles source directory
The idea is to make include statements more explicit and obvious where the
file is coming from, additionally reducing chance of wrong header being
picked up.

For example, it was not obvious whether bvh.h was refferring to builder
or traversal, whenter node.h is a generic graph node or a shader node
and cases like that.

Surely this might look obvious for the active developers, but after some
time of not touching the code it becomes less obvious where file is coming
from.

This was briefly mentioned in T50824 and seems @brecht is fine with such
explicitness, but need to agree with all active developers before committing
this.

Please note that this patch is lacking changes related on GPU/OpenCL
support. This will be solved if/when we all agree this is a good idea to move
forward.

Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner

Reviewed By: lukasstockner97, maiself, nirved, dingto

Subscribers: brecht

Differential Revision: https://developer.blender.org/D2586
2017-03-29 13:41:11 +02:00
Hristo Gueorguiev
8ada7f7397 Cycles: Remove ccl_addr_space from RNG passed to functions
Simplifies code quite a bit, making it shorter and easier to extend.
Currently no functional changes for users, but is required for the
upcoming work of shadow catcher support with OpenCL.
2017-03-27 10:46:28 +02:00
Sergey Sharybin
d14e39622a Cycles: First implementation of shadow catcher
It uses an idea of accumulating all possible light reachable across the
light path (without taking shadow blocked into account) and accumulating
total shaded light across the path. Dividing second figure by first one
seems to be giving good estimate of the shadow.

In fact, to my knowledge, it's something really similar to what is
happening in the denoising branch, so we are aligned here which is good.

The workflow is following:

- Create an object which matches real-life object on which shadow is
  to be catched.

- Create approximate similar material on that object.

  This is needed to make indirect light properly affecting CG objects
  in the scene.

- Mark object as Shadow Catcher in the Object properties.

Ideally, after doing that it will be possible to render the image and
simply alpha-over it on top of real footage.
2017-03-27 10:46:03 +02:00
Hristo Gueorguiev
57e26627c4 Cycles: SSS and Volume rendering in split kernel
Decoupled ray marching is not supported yet.

Transparent shadows are always enabled for volume rendering.

Changes in kernel/bvh and kernel/geom are from Sergey.
This simiplifies code significantly, and prepares it for
record-all transparent shadow function in split kernel.
2017-03-09 17:09:37 +01:00
Mai Lavelle
352ee7c3ef Cycles: Remove ccl_fetch and SOA 2017-03-08 00:52:41 -05:00
Sergey Sharybin
0330741548 Cycles: Add option to replace GI with AO approximation after certain amount of bounces
This is a speed up option which is mainly useful for viewport. Gives nice speedup in
the barbershop scene of 2x when replacing GI with AO after 2nd bounce without loosing
too much details.

Reviewers: brecht

Subscribers: eyecandy, venomgfx

Differential Revision: https://developer.blender.org/D2383
2017-01-27 14:21:49 +01:00