Slowdown was caused by watertight intersection commit and follow-up workaorund
for compiler crash which uninlined utility function which rotates the ray.
Now it's only uninlined for sm_50 and sm_52 experimental kernels which are the
only ones which failed to compile.
Rendering still might be a bit slower but at least shouldn't be that dramatic.
The issue was caused bu the optimization in surface attributes for cases when
there's only a volume shader used. Some attributes doesn't make sense in that
case and were skipped from calculation.
However, it is possible that kernel would still try to access them (because of
the shader setup etc). Prevented an infinite loop in the kernel now, which
should not have much affect on regular renders.
It is possible that ray distance will be zero which would make intersection
refinement return NaN as the refined position which would later lead to all
sort of mathematical issues.
Don't think there are ways to improve intersection accuracy for such rays
so just return original intersection coordinate.
This should fix T43475.
TODO: Need to look into possible issues in Ashikhmin BSDF which might return
zero-length reflected/transmitted ray?
Precision of the fast functions seems to be enough in there and
since the code was heavily using inverse trigonometric functions
this change gives few percent speedup on Victor's hair.
From the tests files from ctests storage doesn't have any meaningful
difference, hair on Victor is all below 4% absolute error and only
few pixels are exceeding 1% absolute difference.
In any case, let it be as it is currently so it allows us to have
fast math file in sources for it's further evaluation and possible
usage in other areas as well.
BSDF sampler function shouldn't give labels it's not intended to do.
That said reflection shouldn't give transmission ray and transmission
give reflection ray.
Added an assert in the transmission sampling but reflection still
needs some investigation because even after recent fixes the check
for projection onto the reflected ray could give both positive and
negative values.
It shouldn't have any affect on renders just makes internal logic
consistent and unleashes an issue to be investigate further.
Hair BSDF did not have proper behavior because of non-normalized
tangent direction (which it expected to be normalized).This lead
to wrong labels being returned by the hair BSDF samplers.
Reported and nailed down by Michale (MeshLogic).
The code that fixes this was commented out, but Brecht gave the go ahead to use it even if it is not the real solution
This is the same as blender internal's texture mapping from another object,
so this way it's possible to control texture space of one object by another.
Quite straightforward change apart from the workaround for the stupidness of
the dependency graph. Now shader has flag telling that it depends on object
transform. This is the simplest way to know which shaders needs to be tagged
for update when object changes. This might give some false-positive tags now
but reducing them should not be priority for Cycles and rather be a priority
to bring new dependency graph.
Also GLSL preview does not support using other object for mapping.
This is actually correct for BI shading as well and to be addressed as
a part of general GLSL viewport improvements since it's not really clear
how to support this in GLSL.
Reviewers: brecht, juicyfruit
Subscribers: eyecandy, venomgfx
Differential Revision: https://developer.blender.org/D1021
This way Cycles finally becomes feature-full on image projections
compared to Blender Internal and Gooseberry Project Team could
finally finish the movie.
Issue seems to be caused by not totally proper pdf and eval values for this
closure. Changed it so they reflect to ggx/beckmann reflection with roughness
set to 0, which is effectively the same as the sharp reflection.
This patch adds the option to set minimum/maximum latitude/longitude values for
the equirectangular panorama camera in Cycles, as discussed in T34400.
The separate functions in kernel_projection.h are needed because the regular
ones are also used as helper functions for environment map sampling.
Reviewers: #cycles, sergey
Reviewed By: #cycles, sergey
Subscribers: dingto, sergey, brecht
Differential Revision: https://developer.blender.org/D960
There seems to be inconsistency in flags checks in Cycles kernel. In the interface
glossy means "Glossy Reflection" and it is properly taken into account when doing
visibility check in BVH traversal.
The check in indirect background/light emission was treating this flags as "any of
glossy reflection or transmission" which is kind of weird.
Made it so emission code follows ray visibility assumptions in other parts of the
kernel now.
Really stupid issue caused by typo in bitfield bit lead to bit conflict,
Not sure how it was done, could be some bad merge conflict resolve in the
original commit or just pure man stupidnes.
This is a nice example when having set of small test render scenes hooked
to the ctest would really help.
It's probably not that stopper issue (even tho still quite bad) since it
was made 2 months ago. But if we ever do 'a' this time it's a nice change
to include.
This commit enables BVH leaf nodes split by the primitive type and makes it
so BVH traversal code is now aware and benefits from this.
As was mentioned in original commit, this change is crucial to be able to do
single ray to multiple triangle intersection. But it also appears to give
barely visible speedup in some scene.
In any case there should be no noticeable slowdown, and this change is what
we need to have anyway.
The idea of this change is make it possible to split leaf nodes by primitive
type, making leaf containing primitives of the same type.
This would become handy when working on a single ray to multiple triangles
intersection code, plus with careful implementation it might give some extra
benefits on BVH traversal code by avoiding primitive type fetch and check for
each primitive in the node. But that's a bit tricky to have benefits on this
change only because depth of BVH increases.
This option is not exposed to the interface at all and not used even secretly,
the commit is only needed to help working further in this direction without
messing around with local patches and worrying of them running out of date.
OpenCL apparently does not support templates, so the idea of generic
function for swapping is a bit of a failure. Now it is either inlined
into the code (in triangle intersection) or has specific implementation
for QBVH.
This is probably even better, because we can't create QBVH-specific
function in util_math anyway.
This commit contains all the tweaks which were missing in initial patch
re-integration from the standalone Cycles repository.
This commit also contains an utility cmake macro to help linking targets
with different libraries for release/debug builds, the name currently is
target_link_libraries_decoupled
it gets a target and list of libraries and makes sure debug builds are
using libraries with "_d" suffix.
After all this changes it'll hopefully be easier to interchange patches
between blender and standalone repositories, because they're now quite
identical.
This way it is now possible to use gflags >= 2.1, where all the
functions were moved from google to gflags namespace.
This isn't currently used in blender, but for standalone repository
this change is essential.
This reverts commit 1549fea999.
After some further discussion with other developers in the team it becomes
clear there's no correct solution here. It is just more matter of what's
more convenient in particular case.
We're just going back to old code to avoid possible frustration with the
older files in newer blenders. This also means all HSV/HSL is considered
to be "linear" in the shading nodes.
Would be ported to 2.73 final.
This way we'll be sure (in debug builds) that regular BVH traversal is not used
for QBVH tree (could happen because of mismatch of logic in kernel and render).
This commit implements heuristic which allows to skip nodes pushed to the stack
from intersection if distance to them is larger than the distance to the current
intersection.
This should solve speed regression which i didn't notice in the original QBVH
commit (which could have because i had WIP version of this patch applied in my
local branch).
From quick tests speed seems to be much closer to what is was with regular BVH.
There's still some possible code cleanup, but they'll need a bit of assembly
code check and now i want to make it so artists can happily use Cycles over the
holidays.
basically shadow rays were totally broken and most of the time did not record
any intersections, leading to really ad rendering artifacts.
This commit makes it so regardless of enabled optimization level render result
would be the same.
This issue doesn't happen with 6.5.12 and there's slight piece of hope it'll be
fixed in next toolkit releases..
For now we're forcing CUDA to not inline ray precalculation. This could lead to
some speed regression, but wouldn't expect it to be huge -- this code does not
run that often comparing to actual triangle intersection.
This commit implements traversal for QBVH tree, which is based on the old loop
code for traversal itself and Embree for node intersection.
This commit also does some changes to the loop inspired by Embree:
- Visibility flags are only checked for primitives.
Doing visibility check for every node cost quite reasonable amount of time
and in most cases those checks are true-positive.
Other idea here would be to do visibility checks for leaf nodes only, but
this would need to be investigated further.
- For minimum hair width we extend all the nodes' bounding boxes.
Again doing curve visibility check is quite costly for each of the nodes and
those checks returns truth for most of the hierarchy anyway.
There are number of possible optimization still, but current state is good
enough in terms it makes rendering faster a little bit after recent watertight
commit.
Currently QBVH is only implemented for CPU with SSE2 support at least. All
other devices would need to be supported later (if that'd make sense from
performance point of view).
The code is enabled for compilation in kernel. but blender wouldn't use it
still.
Most of them are not currently used but are essential for the further work.
- CPU kernels with SSE2 support will now have sse3b, sse3f and sse3i
- Added templatedversions of min4, max4 which are handy to use with register
variables.
- Added util_swap function which gets arguments by pointers.
So hopefully it'll be a portable version of std::swap.