With this patch the build system checks whether the "CUDA10_NVCC_EXECUTABLE" CMake
variable is set and if so will use that to build sm_30 kernels. Similarily for sm_8x kernels it
checks "CUDA11_NVCC_EXECUTABLE". All other kernels are built using the default CUDA
toolkit. This makes it possible to use either the CUDA 10 or CUDA 11 toolkit by default and
only selectively use the other for the kernels where its a hard requirement.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D9179
NanoVDB is a platform-independent sparse volume data structure that makes it possible to
use OpenVDB volumes on the GPU. This patch uses it for volume rendering in Cycles,
replacing the previous usage of dense 3D textures.
Since it has a big impact on memory usage and performance and changes the OpenVDB
branch used for the rest of Blender as well, this is not enabled by default yet, which will
happen only after 2.82 was branched off. To enable it, build both dependencies and Blender
itself with the "WITH_NANOVDB" CMake option.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D8794
Ref {D8855}
Unix and Apple platform files use find_package(OpenSubdiv) which sets
`OPENSUBDIV_INCLUDE_DIR` as an advanced variable, as well as
`OPENSUBDIV_INCLUDE_DIRS` which should be used usually.
Windows sets `OPENSUBDIV_INCLUDE_DIR` which is used by the rest
of the code.
This patch renames it to `_DIRS` everywhere, for it to be like other
similar variables.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D8917
* Support precompiled libraries on Linux
* Add license headers
* Refactoring to deduplicate code
Includes work by Ray Molenkamp and Grische for precompiled libraries.
Ref D8769
This change enables the developer option `WITH_CYCLES_NATIVE_ONLY`
for MSVC. This allows a developer to just build the cycles
CPU kernel for their specific system rather than all kernels,
speeding up development.
Other platforms have had this option for years, but MSVC lacks
the compiler switch to target the host architecture hence it
always build all kernels.
This change uses a small helper program to detect the required
flags.
Only AVX/AVX2 are tested, for the following reasons
- SSE2 is enabled by default and requires no flags
- SSE3/4 have no specific build flags for msvc
- AVX512 is not yet supported by cycles
Differential Revision: https://developer.blender.org/D8775
Reviewed by: brecht, sergey
Voxels are loaded directly from the OpenVDB grid. Rendering still only supports
dense grid, so memory usage is not great for sparse volumes, this is to be
addressed in the future.
Ref T73201
This modifies the common CUDA implementation for adaptive kernel compilation slightly to support both CUBIN and PTX output (the latter which is then used in the OptiX device). It also fixes adaptive kernel compilation on Windows.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D6851
The partial disabling was causing issues with Clang and ASAN, and it seems we
don't need to restrict it to the kernel anymore now that we are no longer using
boost directly.
This version fixes various bugs, and there is no need anymore to use both
9.1 and 10.0 for different cards.
There is a bug related to WITH_CYCLES_CUBIN_COMPILER and bump mapping in the
regression tests, so that remains disabled same as it was for CUDA 10.0.
Fix T59286: CUDA bake failing on some cards.
Fix T56858: CUDA 9.2 and 10 issues.
Note that this is turned off by default and must be enabled at build time with the CMake WITH_CYCLES_EMBREE flag.
Embree must be built as a static library with ray masking turned on, the `make deps` scripts have been updated accordingly.
There, Embree is off by default too and must be enabled with the WITH_EMBREE flag.
Using Embree allows for much faster rendering of deformation motion blur while reducing the memory footprint.
TODO: GPU implementation, deduplication of data, leveraging more of Embrees features (e.g. tessellation cache).
Differential Revision: https://developer.blender.org/D3682
This is in preparation of upgrading our library dependencies, some of which
need C++11. We already use C++11 in blender2.8 and for Windows and macOS, so
this just affects Linux.
On many distributions this will not require any changes, on some
install_deps.sh will need to be run again to rebuild libraries.
Differential Revision: https://developer.blender.org/D3568
I've limited it to just the RGB<->XYZ stuff for now, correct image handling is the next step.
Reviewers: brecht, sergey
Differential Revision: https://developer.blender.org/D3478
This commit contains the minimum to make clang build/work with blender, asan and ninja build support is forthcoming
Things to note:
1) Builds and runs, and is able to pass all tests (except for the freestyle_stroke_material.blend test which was broken at that time for all platforms by the looks of it)
2) It's slightly faster than msvc when using cycles. (time in seconds, on an i7-3370)
victor_cpu
msvc:3099.51
clang:2796.43
pavillon_barcelona_cpu
msvc:1872.05
clang:1827.72
koro_cpu
msvc:1097.58
clang:1006.51
fishy_cat_cpu
msvc:815.37
clang:722.2
classroom_cpu
msvc:1705.39
clang:1575.43
bmw27_cpu
msvc:552.38
clang:561.53
barbershop_interior_cpu
msvc:2134.93
clang:1922.33
3) clang on windows uses a drop in replacement for the Microsoft cl.exe (takes some of the Microsoft parameters, but not all, and takes some of the clang parameters but not all) and uses ms headers + libraries + linker, so you still need visual studio installed and will use our existing vc14 svn libs.
4) X64 only currently, X86 builds but crashes on startup.
5) Tested with llvm/clang 6.0.0
6) Requires visual studio integration, available at https://github.com/LazyDodo/llvm-vs2017-integration
7) The Microsoft compiler spawns a few copies of cl in parallel to get faster build times, clang doesn't, so the build time is 3-4x slower than with msvc.
8) No openmp support yet. Have not looked at this much, the binary distribution of clang doesn't seem to include it on windows.
9) No ASAN support yet, some of the sanitizers can be made to work, but it was decided to leave support out of this commit.
Reviewers: campbellbarton
Differential Revision: https://developer.blender.org/D3304
This patch changes the huge list of projects in visual studio into a nice tree matching the source folder structure. see D2823 for details.
Differential Revision: http://developer.blender.org/D2823
nvcc is very picky regarding compiler versions, severely limiting the compiler we can use, this commit adds a nvrtc based compiler that'll allow us to build the cubins even if the host compiler is unsupported. for details see D2913.
Differential Revision: http://developer.blender.org/D2913
Empty BVH nodes are set to NaN which must be preserved all the way to the
tnear <= tfar test which can then give false for empty nodes. This needs
strict semantices and careful argument ordering for min() and max(), so
the second argument is used if either of the arguments is NaN.
Fixes T52635: crash in BVH traversal with SSE4.1.
Differential Revision: https://developer.blender.org/D2828
The issue was apparently caused by -fno-finite-math-only added to kernel.cpp
CFLAGS. For now just removed this flag from the kernel (we don't really want
it there at this point, and we don't have it for SSE/AVX optimized kernels).
But surely more investigation is needed here.
This allows us to use faster math and still have reliable
isnan/isfinite tests.
Only do it for host side, kernels stays unchanged.
Thanks Lukas Stockner for the tip!
This is important for the reliable behavior or isnan/isfinite/min/max
functions to work with nan and non-finite values. Some of the issues
with fast math are possible to work around, but didn't find a way to
have reliable min/max implementation yet.
Enables Catmull-Clark subdivision meshes with support for creases and attribute
subdivision. Still waiting on OpenSubdiv to fully support face varying
interpolation for subdividing uv coordinates tho. Also there may be some
inconsistencies with Blender's subdivision which will be resolved at a
later time.
Code for reading patch tables and creating patch maps is borrowed
from OpenSubdiv.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2111
This seems quite useful for the development, so you don't need to wait
all the kernels to be re-compiled when working on a new feature, which
speeds up re-iteration.
Marked as an advanced option, so if it doesn't work so well in practice
it's safe to revert anyway.
It's not really handy to silence something unused hoping for it'll be
used in the future. We can end up with quite some silencing then.
Also made this flag which i find rather useless to NOT cause -Werror
in Cycles code.
We don't have vectors re-allocation happening multiple times from inside
a loop anymore, so we can safely switch to a memory guarded allocator for
vectors and keep track on the memory usage at various stages of rendering.
Additionally, when building from inside Blender repository, Cycles will
use Blender's guarded allocator, so actual memory usage will be displayed
in the Space Info header.
There are couple of tricky aspects of the patch:
- TaskScheduler::exit() now explicitly frees memory used by `threads`.
This is needed because `threads` is a static member which destructor
isn't getting called on Blender's exit which caused memory leak print
to happen.
This shouldn't give any measurable speed issues, reallocation of that
vector is only one of fewzillion other allocations happening during
synchronization.
- Use regular guarded malloc (not aligned one). No idea why it was
made to be aligned in the first place. Perhaps some corner case tests
or so. Vector was never expected to be aligned anyway. Let's see if
we'll have actual bugs with this.
Reviewers: dingto, lukasstockner97, juicyfruit, brecht
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D1774
This is an initial move to have unittests to at least cover
utility functions, which then could be extended further to
test such areas as shader optimization and such.
Currently only based on initial "infrastructure" layout and
writing tests needed to test the no-boost patch.
Note: This patch starts to use "<dir>/<header>.h" notation
for the include statements which i just got used to do in
other projects. Something what would be cool to use globally
in the code eventually.
Reviewers: dingto, juicyfruit, lukasstockner97, brecht
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D1770
Seems i was the only one who was really up to using it and
i do have gcc-5 finally backported and installed here so
such a fine-tune flags are no longer needed.