This is not really supported by OpenCL but might happen in certain
configurations. There might be some remained cases when this happens
but so far can not find any,
Previous idea behind having vector during building and array for actual storage
was needed in order to minimize amount of re-allocations happening during the
build, but it lead to double memory overhead used by those arrays at the vector
to array conversion stage.
Issue with such approach was that for BVH without spatial split size of arrays
is known in advance and it never changes, which made vector to array conversion
totally redundant.
Also after testing with several rather complex from spatial split scenes (such
as trees) it seems even conservative approach of reallocation (when we perform
re-allocation when leaf does not fit into the memory) doesn't give measurable
difference in time.
This makes it so we can switch to array, which will avoid unneeded memory
re-allocations when spatial split is disabled without harming other cases.
it's a bit difficult to measure exact benefit of this change on our production
files here, but depending on the scene it might give quite reasonable memory
save.
This way we solve possible issues caused by regular allocator not being aware of
some classes preferring 16 bytes alignment needed for SSE to work properly. This
caused random crashes during rendering.
Now we always use aligned allocation in GuardedAllocator which shouldn't be any
measurable performance impact and the code is only used by developers after
defining special symbol, so there is no impact on release builds at all.
it is the same issue as described in the previous commit, original changes
in this area were wrong and only worked on a bugger optimus driver which
simply appeared to work by co-incident and in fact used wrong device..
It was annoying copy-paste happened across OpenCL device constructor, device
enumeration and split kernel checks. Now those areas are using an utility
function which returns pairs of platform and device IDs for devices which are
supported by Cycles and enumeration is happening inside that list.
This makes it so filtering is happening in a single place, so there's no need
to keep 3 different functions in sync.
This commit also fixes a bug with wrong enumeration of devices caused by recent
fixes. Those fixes were in fact wrong and only happened to appear to be working
on laptop with optimus card on Linux. Root of those issues is in fact in bad
Linux driver for optimus cards.
This was a mistake in the original code from D1079.
With the current way how direction ot mirror ball works camera should look
into negative Y direction. Corrected it in the camera matrix synchronization,
so no extra calculations are needed at the render time.
That's a bit annoying, but we'd better port it to the release branch, or
otherwise we'll end up with files created with wrong camera mapping after
2.75 release.
The idea is to make it possible to control linked duplicated objects motion
blur from the scene file without need to do overrides on the linked object
settings. Currently only supported for dupligroup duplication and all now
if duplicator object has motion blur disabled then it'll be inherited into
all the duplicated objects.
There should be no regressions/changes in look of existing files because
objects do have motion blur enabled by default.
This gives quite the same problems as experimental CUDA kernels
and for until it's found a root cause of the problem we'd just
explicitly uninline the function.
This is mainly useful for the render farms output when logging might stop at
the "Path Tracing Tile N/N" string, which makes it a bit difficult to follow
what exactly is happening (node going crazy, hardware issues or just last tile
is too much heavy).
This is more like an experiment, might be changed in the future.
They'll be checked for the version later and that check will fail anyway,
so better to not allow user to see unsupported device in the list.
Also corrected one more issue with the device enumeration.
Skipped devices did not reflect in the device number, which might result in bad
array indices.
This might also resolve T45037, and need to be ported to a release branch.
The main loop only handles transparent intersections from the camera ray.
Therefore we can simplify some things.
* Avoid PATH_RAY_CAMERA check, this is always true.
* Avoid path_state_next() call, we can just set transparent flag and increase transparent bounces. This way we avoid the function call and some branching.
Also remove debug num_ray_bounces++, this is incorrect here as no indirect bounce happens here.
Should be no functional changes.
Code there started becoming a bit too big, by splitting it up it'll make it
easier to do improvements or extending the features in there.
The layout is not totally final yet, would need to try de-duplicating parts
of code from split kernel with non-split integrators,
Older OSX has major issues with sincos() function, it's likely a big in OSL
or LLVM. For until we've updated to new versions of this libraries we'll use
a workaround to prevent possible crashes on all the platforms.
Shouldn't be that bad because it's mainly used for anisotropic shader where
angle is usually constant.
This fix is safe for inclusion into final Blender 2.75 release.
Glass BSDF was doing some magic with copying weigths from initial closure
onto refraction one and the code was not checking properly for the number
of closures.
TODO: We might want to refactor debug passes into PASS_DEBUG and some
debug_type (similar to Blender's side passes) to avoid issue of running
out of bits.
They're working just fine on AMD Tonga GPU and probably other architectures,
lets enable it under the experimental feature set and see what exact system
configuration gives issues.
This simplification is safe, as the call to volume_phase_eval() is guarded behind a CLOSURE_IS_PHASE check, which is equal to
CLOSURE_VOLUME_HENYEY_GREENSTEIN_ID. I don't think we will add more phase functions anytime soon, if at all.