We started to run out of bits in the ShaderData flags, so the flags which
come from __object_flags are now kept separate from the ones which are
either set at runtime or come from __shader_flags.
The rule now is: SD_OBJECT_* flags are to be tested against the new
object_flags field of ShaderData, and all other flags are to be tested
against the flags field of ShaderData.
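A minimal sketch of that rule (simplified declarations; flag names and values here are illustrative, the full enums live in the kernel headers):

```cpp
/* Simplified sketch, not the actual Cycles declarations. */
enum ShaderDataFlag { SD_BACKFACING = (1 << 0) /* , ... */ };
enum ShaderDataObjectFlag { SD_OBJECT_NEGATIVE_SCALE_APPLIED = (1 << 0) /* , ... */ };

struct ShaderData {
  int flags;        /* runtime flags and flags from __shader_flags */
  int object_flags; /* flags from __object_flags */
};

/* Non-object flags are tested against sd->flags... */
static bool is_backfacing(const ShaderData *sd)
{
  return (sd->flags & SD_BACKFACING) != 0;
}

/* ...while SD_OBJECT_* flags are tested against sd->object_flags. */
static bool has_negative_scale(const ShaderData *sd)
{
  return (sd->object_flags & SD_OBJECT_NEGATIVE_SCALE_APPLIED) != 0;
}
```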
There should be no user-visible changes, and the time difference should be
minimal. In fact, tests here show only a barely measurable difference, and
sometimes the new code is somewhat faster (all within the noise floor, so
it is hard to tell for sure).
Reviewers: brecht, dingto, juicyfruit, lukasstockner97, maiself
Differential Revision: https://developer.blender.org/D2428
Basically, the problem here was that the transform used to bring texture coordinates
to world space is either fetched while setting up the shader (when Object Motion is enabled) or
fetched when needed (otherwise). That helps to save ShaderData memory on OpenCL when Object Motion isn't needed.
Now, if OM is enabled, the Lamp transform can just be stored inside the ShaderData as well, and the original commit simply assumed it was.
However, when it's not (on OpenCL by default, for example), there is no easy way to fetch it when needed, since the ShaderData doesn't
store the Lamp index.
So, for now, lamps just don't support local texture coordinates anymore when Object Motion is disabled.
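For illustration, the two fetch strategies in question look roughly like this (hypothetical, simplified declarations, not the actual kernel code):

```cpp
/* Hypothetical sketch: with Object Motion the transform is pre-fetched
 * into ShaderData at shader setup time; without it, an on-demand fetch
 * needs an index that ShaderData simply doesn't store for lamps. */
struct Transform { float m[12]; };

struct ShaderData {
#ifdef WITH_OBJECT_MOTION
  Transform ob_itfm; /* inverse object transform, filled in at setup */
#endif
  int object; /* object index; holds a "no object" marker when a lamp
               * was hit, so it can't be used to look the transform up */
};
```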
To fix and support this properly, one of the following could be done:
- Always pre-fetch the transform. Downside: memory usage increases when not using OM on OpenCL.
- Add a variable to ShaderData that stores the Lamp ID, to allow fetching it when needed.
- Store the Lamp ID inside prim or object. Problem: Cycles currently checks these for whether an object was hit, so those checks would need to be changed.
- Enable OM whenever a Texture Coordinate's Normal output is used. Downside: it might not actually be needed.
In scenes with many lights, some of them might have a very small contribution to some pixels, but the shadow rays are traced anyway.
To avoid that, this patch adds probabilistic termination of light samples: if the contribution before checking for shadowing is below a user-defined threshold, the sample is discarded with probability 1 - (contribution / threshold) and otherwise kept, but weighted up so the result remains unbiased.
This is the same approach that's also used for path termination based on length.
Note that rendering remains unbiased with this option, it just adds a bit of noise - but if the setting is used moderately, the speedup gained easily outweighs the additional noise.
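In code, the rule looks roughly like this (standard Russian roulette; a sketch with illustrative names, not the actual kernel code):

```cpp
/* Illustrative sketch: decide whether to keep a light sample before the
 * shadow ray is traced. 'rand' is a uniform random number in [0, 1). */
static bool terminate_light_sample(float contribution, float threshold,
                                   float rand, float *weight)
{
  if (contribution >= threshold) {
    *weight = 1.0f; /* bright enough: always trace the shadow ray */
    return false;
  }
  /* Keep with probability contribution/threshold, i.e. discard with
   * probability 1 - (contribution / threshold). */
  const float keep_probability = contribution / threshold;
  if (rand >= keep_probability)
    return true; /* discarded: the shadow ray is never traced */
  /* Weight the kept samples up by 1/keep_probability so the estimator
   * stays unbiased, just like path termination based on length. */
  *weight = 1.0f / keep_probability;
  return false;
}
```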
Reviewers: #cycles
Subscribers: sergey, brecht
Differential Revision: https://developer.blender.org/D2217
When using the Normal output of the Texture Coordinate node on Point and Spot lamps, the coordinates now depend on the rotation of the lamp.
On Area lamps, the Parametric output of the Geometry node now returns the UV coordinates on the lamp.
Credit for the Area lamp part goes to Stefan Werner (from D1995).
Using ones complement for detecting whether a transform has been applied was
confusing and led to several bugs. With this change, proper checks are made.
Also added a few transforms where they were missing, mostly affecting baking
and displacement when `P` is used in the shader (previously `P` was in the
wrong space for these shaders).
Also removed `TIME_INVALID` as this may have resulted in incorrect
transforms in some cases.
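A hypothetical before/after sketch of the kind of change (the actual code differs):

```cpp
/* Before (sketch): "transform applied" was encoded in the value itself.
 *   value = ~value;             // mark as "transform applied"
 *   bool applied = (value < 0); // fragile: every reader must know the trick
 *
 * After (sketch): a dedicated flag next to the data, checked explicitly. */
struct ObjectData {
  int shader;
  bool transform_applied;
};
```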
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2192
All the changes mainly give explicit inlining hints on functions, so that
they match how inlining worked with the previous toolkit.
This makes the kernel compiled by CUDA 8 render on average at the same speed
as the previous kernels. Some scenes are somewhat faster, some of them are
somewhat slower, but the slowdown is within 1% so far.
On the positive side, it allows us to enable newer generation cards on the
buildbots (so GTX 10x0 cards will be officially supported soon).
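For illustration, the hints are of this kind (macro name and expansion simplified from the real kernel headers):

```cpp
/* Simplified from the kernel headers: an explicit force-inline hint for
 * the CUDA compiler, falling back to a plain inline elsewhere. */
#ifdef __CUDACC__
#  define ccl_device_inline __device__ __forceinline__
#else
#  define ccl_device_inline static inline
#endif

ccl_device_inline float interp(float a, float b, float t)
{
  return a + t * (b - a);
}
```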
Make sure we don't perform any implicit address space conversion.
A bit annoying, but less intrusive approaches (like using a temporary
private variable in the .cl kernel) do not work correctly here.
Using the generic address space would help on the code side, but would be
somewhat slower due to the extra work happening under the hood, as far as I know.
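For illustration, a simplified example of the kind of explicit qualification involved (not taken from the actual kernel):

```cpp
/* Simplified illustration: in OpenCL, pointers carry an address space,
 * so the parameter is qualified explicitly instead of relying on an
 * implicit __global -> private conversion at the call site. */
#ifdef __KERNEL_OPENCL__
#  define ccl_global __global
#else
#  define ccl_global
#endif

static float first_component(ccl_global const float *v)
{
  return v[0];
}
```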
This commit adds a new distribution to the Glossy, Anisotropic and Glass BSDFs that implements the
multiple-scattering microfacet model described in the paper "Multiple-Scattering Microfacet BSDFs with the Smith Model".
Essentially, the improvement is that unlike classical GGX, which only models single scattering and assumes
the contribution of multiple bounces to be zero, this new model performs a random walk on the microsurface until
the ray leaves it again, which ensures perfect energy conservation.
In practice, this means that the "darkening problem" - GGX materials becoming darker with increasing
roughness - is solved in a physically correct and efficient way.
The downside of this model is that it has no (known) analytic expression for evaluation. However, it can be
evaluated stochastically, and although the correct PDF isn't known either, the properties of MIS and the
balance heuristic guarantee an unbiased result at the cost of slightly higher noise.
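For illustration only, a toy version of the random-walk idea; the real model uses the Smith height and direction distributions from the paper, which are replaced by crude placeholders here:

```cpp
#include <cmath>
#include <random>

struct Vec3 { float x, y, z; };

/* Placeholder scattering step: perturb and reflect the direction. The
 * real model samples the GGX distribution of visible normals instead. */
static Vec3 toy_scatter(const Vec3 &w, float roughness, std::mt19937 &rng)
{
  std::normal_distribution<float> n(0.0f, roughness);
  Vec3 r = {w.x + n(rng), w.y + n(rng), -w.z + n(rng)};
  const float len = std::sqrt(r.x * r.x + r.y * r.y + r.z * r.z);
  return {r.x / len, r.y / len, r.z / len};
}

/* The key idea: keep bouncing between microfacets until the ray escapes
 * upward, so multi-bounce energy is kept instead of being discarded as
 * in single-scattering GGX. */
static Vec3 multiscatter_walk(Vec3 w, float roughness, std::mt19937 &rng)
{
  std::uniform_real_distribution<float> u(0.0f, 1.0f);
  for (int bounce = 0; bounce < 256; bounce++) {
    w = toy_scatter(w, roughness, rng);
    /* Toy stand-in for the Smith masking test: upward-going directions
     * escape with probability proportional to their elevation. */
    if (w.z > 0.0f && u(rng) < w.z)
      break; /* left the microsurface: w is the outgoing direction */
  }
  return w;
}
```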
Reviewers: dingto, #cycles, brecht
Reviewed By: dingto, #cycles, brecht
Subscribers: bliblubli, ace_dragon, gregzaal, brecht, harvester, dingto, marcog, swerner, jtheninja, Blendify, nutel
Differential Revision: https://developer.blender.org/D2002
The goal is to make Experimental kernel closer in performance to the
official kernel, avoiding spills and such.
There should not be a big impact on the official kernel; my own tests showed
a few percent performance drop on a laptop GPU. The CPU was always the
same speed on the AVX, AVX2 and SSE4.1 CPUs I've been testing here.
This seems to be the last essential step before we can get rid of the
Experimental kernel and enable SSS officially on the GPU without causing
major performance issues.
Surely some more tweaks are possibly required, but those we can keep doing
until the cows come home anyway.
This commit changes the way we pass bounce information to the Light
Path node. Instead of manually copying the bounces into ShaderData, we now
pass PathState directly. This reduces the number of arguments that we need
to pass around and also makes it easier to extend the feature.
This commit also exposes the Transmission Bounce Depth to the Light Path
node. It works similarly to the Transparent Depth output: replace a
transmission light path after X bounces with another shader, e.g. a Diffuse
one. This can be used to avoid black surfaces due to a low maximum bounce
count.
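A simplified sketch of the interface change (signatures are illustrative, not the real ones):

```cpp
/* Illustrative sketch: instead of copying individual bounce counters
 * into ShaderData, the whole PathState is passed through, so exposing a
 * new counter (like the transmission bounce) doesn't touch signatures. */
struct ShaderData; /* opaque here */

struct PathState {
  int bounce;
  int transparent_bounce;
  int transmission_bounce; /* newly exposed to the Light Path node */
};

/* Before: void svm_eval_nodes(ShaderData *sd, int bounce, int transparent_bounce); */
void svm_eval_nodes(ShaderData *sd, const PathState *state);
```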
Reviewed by Sergey and Brecht, thanks for some help with this.
I tested compilation and usage on CPU (SVM and OSL), CUDA, OpenCL Split
and Mega kernel. Hopefully this covers all devices. :)
* data2 was not checked; this partially fixes T45583.
* Initialize data2 in some closures to avoid potential problems.
Differential Revision: https://developer.blender.org/D1436
This commit contains all the work related to the AMD megakernel split,
which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus
some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely
someone else whom we're forgetting to mention.
Currently only AMD cards are enabled for the new split kernel, but it is
possible to force split opencl kernel to be used by setting the following
environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1.
Not all features are supported yet; for now there is no motion blur,
camera blur, SSS or volumetrics. Transparent shadows are also disabled
on AMD devices because of a compiler bug.
This kernel also only implements regular path tracing; supporting the
branched one will take a while. Branched path tracing is still exposed
in the interface, which is a bit misleading and will be hidden soon.
More features will be enabled once they're ported to the split kernel and
tested.
Neither the regular CPU nor the CUDA kernel shows any difference; they
generate the exact same code, which means no regressions/improvements there.
Based on the research paper:
https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf
Here's the documentation:
https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit
Design discussion of the patch:
https://developer.blender.org/T44197
Differential Revision: https://developer.blender.org/D1200
The issue was introduced in 01ee21f, where I didn't notice that the *_setup()
functions only do partial initialization, with some of the parameters
expected to be initialized by the caller.
This was hitting only some setups, so tests with the benchmark scenes
didn't reveal the issue. Now it should all be fine.
This is to go into the 2.74 branch, and we actually might re-AHOY.
This was caused by an internal optimization which evaluated SSS with a
size of zero as a BSDF but used a different ID, so the evaluation result
didn't appear in the regular diffuse pass.
This led to a situation in which the SSS data wasn't stored anywhere if
the size was zero.
Now SSS with a zero size and with close-to-zero sizes is handled the same
way from the passes' point of view.
tri_shader no longer needs to be a float.
Reviewers: dingto, sergey
Reviewed By: dingto, sergey
Subscribers: dingto
Projects: #cycles
Differential Revision: https://developer.blender.org/D789
The root of the issue goes back to the on-the-fly normals commit, and the
latest fix for it wasn't actually correct; I had mixed two fixes up in there.
So the idea here goes back to storing a negative-scaled object flag
and flipping the runtime-calculated normal if this flag is set, which is
pretty much the same as my original fix for the issue.
The issue with motion blur wasn't caused by the runtime normals patch;
it had issues before, because it already did runtime normals calculation.
Now motion triangles take the negative scale flag into account as well.
This actually makes the code cleaner imo and avoids the rather confusing
flipping code in mesh.cpp.
Fix T41079: Solid black render of object with negative scale and smooth shading
In both cases the issue was caused by negative-scaled objects with single mesh
users, for which the scale gets applied when using static BVH.
Since the on-the-fly normals calculation landed, normals for such cases weren't
flipped, leading them to point in the wrong direction.
Added a special object flag for this, which is a bit of a bummer because now
we've got fewer bits for really useful things, but this is the only way to get
proper normals without adding more complexity to the on-the-fly calculations.
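A minimal sketch of how the flag ends up being used (names and values are illustrative):

```cpp
/* Illustrative sketch: flip the normal computed on the fly when the
 * object had a negative scale applied. */
struct float3 { float x, y, z; };
enum { SD_NEGATIVE_SCALE_APPLIED = (1 << 0) }; /* illustrative value */

static float3 flip_if_negative_scale(float3 ng, int object_flags)
{
  if (object_flags & SD_NEGATIVE_SCALE_APPLIED)
    return {-ng.x, -ng.y, -ng.z}; /* negative scale flips the winding */
  return ng;
}
```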
* Volume multiple importance sampling support to combine equiangular and distance
sampling, for both homogeneous and heterogeneous volumes.
* Branched path "Sample All Direct Lights" and "Sample All Indirect Lights" now
apply to volumes as well as surfaces.
Implementation note:
For simplicity this is all done with decoupled ray marching; the only case where
we do not use decoupled is distance-only sampling with one light sample. The
homogeneous case should still compile on the GPU because it only requires
fixed-size storage, but the heterogeneous case will be trickier to get working.
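For illustration, combining the two sampling strategies with MIS looks roughly like this (balance heuristic shown; illustrative code, not the decoupled ray marching implementation itself):

```cpp
/* Illustrative sketch: a sample drawn with the distance pdf is weighted
 * against what the equiangular pdf would have been for the same point
 * (and vice versa), so the combined estimator stays unbiased while each
 * strategy covers the other's weak cases. */
static float balance_heuristic(float pdf_a, float pdf_b)
{
  return pdf_a / (pdf_a + pdf_b);
}

static float mis_weight_distance(float distance_pdf, float equiangular_pdf)
{
  return balance_heuristic(distance_pdf, equiangular_pdf);
}
```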
Instead of pre-calculating and storing the face normal, we now calculate it during render.
This gives a small slowdown (~1%) but decreases memory usage, which is especially
important on GPUs, where VRAM is limited.
Part of my GSoC 2014.
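A minimal sketch of the on-the-fly computation (simplified from the kernel code):

```cpp
#include <cmath>

struct float3 { float x, y, z; };

static float3 cross(const float3 &a, const float3 &b)
{
  return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

static float3 normalize(const float3 &a)
{
  const float len = std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
  return {a.x / len, a.y / len, a.z / len};
}

/* The face normal is recomputed from the vertices on every lookup
 * instead of being stored per triangle: a little more time, less memory. */
static float3 triangle_face_normal(const float3 &v0, const float3 &v1, const float3 &v2)
{
  const float3 e0 = {v1.x - v0.x, v1.y - v0.y, v1.z - v0.z};
  const float3 e1 = {v2.x - v0.x, v2.y - v0.y, v2.z - v0.z};
  return normalize(cross(e0, e1));
}
```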
This was the original code to get things working on old GPUs, but it is no
longer in use, and various features now depend on the new code path to work
correctly, to the point that enabling this code is too buggy to be useful.
This can for example be useful if you want to manually terminate the path at
some point and use a color other than black.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D454
This is done by adding a Volume Scatter node. In many cases you will want to
add together a Volume Absorption and Volume Scatter node with the same color
and density to get the expected results.
This should work with branched path tracing, mixing closures, overlapping
volumes, etc. However, there are still various optimizations needed for sampling.
The main missing thing from the volume branch is the equiangular sampling for
homogeneous volumes.
The heterogeneous scattering code was arranged such that we can use a single
stratified random number for distance sampling, which gives less noise than
pseudo-random numbers for each step. For volumes where the color is textured
there still seems to be something off; this needs to be investigated.
This is done using the existing Emission node and closure (we may add a volume
emission node, not clear yet if it will be needed).
Volume emission only supports indirect light sampling, which means it's not very
efficient for small or far away bright light sources. Using direct light
sampling and MIS would be tricky and probably won't be added anytime soon. Other
renderers don't support this either as far as I know; lamps and ray visibility
tricks may be used instead.