blender

Author	SHA1	Message	Date
Lukas Stockner	ef816f9cff	Cycles: Fix the AO replacement option in the split kernel Currently the code for it was inside the hair-specific part, so it wouldn't be enabled in hairless renders.	2017-04-11 01:07:49 +02:00
Sergey Sharybin	3b4cc5dfed	Cycles: Workaround cubic volume filtering crashing on Linux The issue was caused by a recent change in inline policy. There is some sort of memory corruption happening here; ASAN suggests it's a stack overflow issue. Not quite sure why it is happening though, and was not able to solve anything here yet in the past hours. Committing a fix which works, with a big TODO note. The issue is visible on an AVX2 machine when rendering cycles_reports_test.	2017-04-10 14:44:07 +02:00
Sergey Sharybin	90d85c7975	Cycles: Fix compilation error of AVX2 kernels with SSE optimization disabled	2017-04-10 14:44:04 +02:00
Sergey Sharybin	c3d393c1df	Cycles: Cleanup, indentation and trailing whitespace	2017-04-10 14:44:04 +02:00
Mai Lavelle	b60d4800c6	Cycles: Fix building of CUDA kernels with compilers where C++11 is disabled	2017-04-08 07:12:04 -04:00
Sergey Sharybin	7d77b3e813	Cycles: Fix compilation error with certain CUDA and host compiler configuration This seems to happen on Windows only, happened to Thomas and Nathan already. Similar to the patch Thomas was showing, but I do not see it committed. So committing now in order to get more developers and users happy.	2017-04-07 18:28:38 +02:00
lazydodo	b332fc8f23	[Cycles/msvc] Get cycles_kernel compile time under control. Ever since we merged the extra texture types (half etc) and split kernel, the compile time for cycles_kernel has been going out of control. It's currently sitting at a cool 1295.762 seconds with our standard compiler (2013/x64/release). I'm not entirely sure why msvc gets upset with it, but the inlining of matrix near the bottom of the tri-cubic 3d interpolator is the source of the issue; this patch excludes it from being inlined. This patch brings it back down to a manageable 186 seconds (7x faster!!). With the attached bzzt.blend that @sergey kindly provided I got the following results with builds with identical hashes: buildbot 58:51.73, patched 58:04.23. It's really close; the slight speedup could be explained by the switch instead of having multiple if's (switches do generate more optimal code than a chain of if/else/if/else statements), but in all honesty it might just have been pure luck (dev box, very polluted, bad for benchmarks). Regardless, this patch doesn't seem to slow down anything with my limited testing. {F532336} {F532337} Reviewers: brecht, lukasstockner97, juicyfruit, dingto, sergey Reviewed By: brecht, dingto, sergey Subscribers: InsigMathK, sergey Tags: #cycles Differential Revision: https://developer.blender.org/D2595	2017-04-07 10:26:55 -06:00
Mai Lavelle	91b9db0724	Cycles: Change work pool and global size of split CPU for easier debugging	2017-04-07 06:06:08 -04:00
Mai Lavelle	8f85ee2fc9	Cycles: Fix indentation	2017-04-07 06:06:08 -04:00
Sergey Sharybin	ced8fff5de	Fix T51051: Incorrect render on 32bit Linux The issue was apparently caused by -fno-finite-math-only added to kernel.cpp CFLAGS. For now just removed this flag from the kernel (we don't really want it there at this point, and we don't have it for SSE/AVX optimized kernels). But surely more investigation is needed here.	2017-03-30 11:37:31 +02:00
Sergey Sharybin	48fa2c83eb	Cycles: Attempt to work around compilation errors of CUDA on sm_2x	2017-03-29 16:22:51 +02:00
Sergey Sharybin	be17445714	Cycles: Cleanup, indentation	2017-03-29 15:41:56 +02:00
Sergey Sharybin	cc7386ec6b	Cycles: Remove toolkit-specific workaround from kernel	2017-03-29 15:07:53 +02:00
Sergey Sharybin	30bed91b78	Cycles: Fix compilation error with visibility flag disabled	2017-03-29 14:28:45 +02:00
Sergey Sharybin	0579eaae1f	Cycles: Make all #include statements relative to cycles source directory The idea is to make include statements more explicit and obvious where the file is coming from, additionally reducing the chance of the wrong header being picked up. For example, it was not obvious whether bvh.h was referring to builder or traversal, whether node.h is a generic graph node or a shader node, and cases like that. Surely this might look obvious for the active developers, but after some time of not touching the code it becomes less obvious where a file is coming from. This was briefly mentioned in T50824 and it seems @brecht is fine with such explicitness, but we need to agree with all active developers before committing this. Please note that this patch is lacking changes related to GPU/OpenCL support. This will be solved if/when we all agree this is a good idea to move forward. Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner Reviewed By: lukasstockner97, maiself, nirved, dingto Subscribers: brecht Differential Revision: https://developer.blender.org/D2586	2017-03-29 13:41:11 +02:00
Sergey Sharybin	6ea54fe9ff	Cycles: Switch to reformulated Pluecker ray/triangle intersection The intention of this commit is to address issues mentioned in the reports T43865, T50164 and T50452. The code is based on Embree code with some extra vectorization to speed up single ray to single triangle intersection. Unfortunately, such a fix is not coming for free. There is some slowdown for AVX2 processors, mainly due to different vectorization code, which caused a different number of instructions to be executed and different instructions-per-cycle counters. But on the other hand this commit makes pre-AVX2 platforms such as AVX and SSE4.1 a bit faster. The performance is as follows (columns: 2.78c AVX2 / 2.78c AVX / Patch AVX2 / Patch AVX): BMW 05:21.09 / 06:05.34 / 05:32.97 (+3.5%) / 05:34.97 (-8.5%); Classroom 16:55.36 / 18:24.51 / 17:10.41 (+1.4%) / 17:15.87 (-6.3%); Fishy Cat 08:08.49 / 08:36.26 / 08:09.19 (+0.2%) / 08:12.25 (-4.7%); Koro 11:22.54 / 11:45.24 / 11:13.25 (-1.5%) / 11:43.81 (-0.3%); Barcelone 14:18.32 / 16:09.46 / 14:15.20 (-0.4%) / 14:25.15 (-10.8%). On GPU the performance is about 1.5-2% slower in my tests on GTX1080, but afraid we can't do much as a part of this change here and consider it a price to pay for a more proper intersection check. Made in collaboration with Maxym Dmytrychenko, big thanks to him! Reviewers: brecht, juicyfruit, lukasstockner97, dingto Differential Revision: https://developer.blender.org/D1574	2017-03-28 17:26:47 +02:00
Thomas Dinges	7a65f9b171	Cleanup: Resolve todo in CUDA voxel image code.	2017-03-27 22:36:26 +02:00
Sergey Sharybin	8d48ea0233	Cycles: Make shadow catcher an optional feature for OpenCL Solves majority of speed regression on AMD OpenCL.	2017-03-27 10:47:14 +02:00
Hristo Gueorguiev	e07ffcbd1c	Cycles: Add OpenCL support for shadow catcher feature The title says it all actually.	2017-03-27 10:46:59 +02:00
Hristo Gueorguiev	8ada7f7397	Cycles: Remove ccl_addr_space from RNG passed to functions Simplifies code quite a bit, making it shorter and easier to extend. Currently no functional changes for users, but is required for the upcoming work of shadow catcher support with OpenCL.	2017-03-27 10:46:28 +02:00
Sergey Sharybin	d14e39622a	Cycles: First implementation of shadow catcher It uses the idea of accumulating all possible light reachable across the light path (without taking shadow blocking into account) and accumulating total shaded light across the path. Dividing the second figure by the first one seems to give a good estimate of the shadow. In fact, to my knowledge, it's something really similar to what is happening in the denoising branch, so we are aligned here, which is good. The workflow is as follows: - Create an object which matches the real-life object on which the shadow is to be caught. - Create an approximately similar material on that object. This is needed to make indirect light properly affect CG objects in the scene. - Mark the object as Shadow Catcher in the Object properties. Ideally, after doing that it will be possible to render the image and simply alpha-over it on top of real footage.	2017-03-27 10:46:03 +02:00
Sergey Sharybin	5b45715f8a	Cycles: Correct isfinite check used in integrator Use fast-math friendly version of this function. We should probably avoid unsafe fast math, but this is to be done with real care with all the benchmarks properly done. For now committing a much safer fix.	2017-03-24 15:39:33 +01:00
Sergey Sharybin	85a5fbf2ce	Cycles: Workaround incorrect SSS with CUDA toolkit 8.0.61	2017-03-24 10:08:18 +01:00
Sergey Sharybin	27248c8636	Cycles: Remove unused macro	2017-03-23 17:59:02 +01:00
Sergey Sharybin	ba8c7d2ba1	Cycles: Use SSE-optimized version of triangle intersection for motion triangles The title says it all actually. Gives up to 10% speedup on test scenes here on i7-6800K. Render times on GPU are unreliable here, but there might be some slowdown caused by watertight nature of intersections.	2017-03-23 17:58:03 +01:00
Sergey Sharybin	a1348dde2e	Cycles: Fix speed regression on GPU Avoid construction of temporary array and make utility function force-inlined. Additionally avoid calling float4_to_float3 twice. This brings render times to the same values as before current patch series.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	2a5d7b5b1e	Cycles: Use utility function for SSS triangle intersection This effectively de-duplicates triangle intersection logic implemented for both regular triangle and SSS triangle.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	a5b6742ed2	Cycles: Move watertight triangle intersection to an utility file This way the code can be reused more easily.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	f8a999c965	Cycles: Move triangle intersection precalc to an util file This is preparation work for the followup commit which will move the remaining parts of Woop intersection logic to an utility file. Doing it as a separate commit to keep changes more atomic and easier to bisect when/if needed.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	b797a5ff78	Cycles: Cleanup, move utility function to utility file Was an old TODO, this function is handy for some math utilities as well.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	1c5cceb7af	Cycles: Move intersection math to own header file There are following benefits: - Modifying intersection algorithm will not cause so much re-compilation. - It works around header dependency hell and allows us to use vectorization types much easier in there.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	e8ff06186e	Cycles: Cleanup, inline AVX register construction from kernel global data Currently should be no functional changes, preparing for some upcoming refactor.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	2b44db4cfc	Fix/workaround T50533: Transparency shader doesn't cast shadows with curve segments There seems to be a compiler bug in MSVC2013. The issue does not happen on Linux and does not happen on Windows when building with MSVC2015. Since it's really a pain to debug release builds with MSVC2013, the AVX2 optimization is disabled for curve segments for this compiler.	2017-03-22 11:37:23 +01:00
Mai Lavelle	8fff6cc2f5	Cycles: Fix building of OpenCL kernels There's no overloading of functions in OpenCL so we can't make use of `safe_normalize` with `float2`.	2017-03-20 22:55:52 -04:00
Sergey Sharybin	a201b99c5a	Fix T50975: Cycles: Light sampling threshold inadvertently clamps negative lamps	2017-03-20 14:48:55 +01:00
Sergey Sharybin	18bf900b31	Fix T50990: Random black pixels in Cycles when rendering material with Multiscatter GGX	2017-03-20 12:07:41 +01:00
Sergey Sharybin	d6b4fb6429	Cycles: Fix mistake in previous split kernel commits Own stupid mistake. Reported by nirved in IRC, thanks!	2017-03-17 11:55:59 +01:00
Sergey Sharybin	a58350b07f	Cycles: Cleanup, indentation	2017-03-17 10:25:37 +01:00
Sergey Sharybin	e361adbca2	Cycles: Fix compilation error of LCG RNG	2017-03-17 09:58:08 +01:00
Mai Lavelle	60a344b43d	Cycles: Fix handling of barriers	2017-03-17 01:54:04 -04:00
Sergey Sharybin	1cad64900e	Cycles: Define ccl_local variables in kernel functions Declaring ccl_local in a device function is not supported by certain compilers.	2017-03-16 11:27:17 +01:00
Sergey Sharybin	1ff753baa4	Cycles: Workaround for compilation error caused by passing KernelGlobals Pass globals as a bare pointer, same as it used to be prior to the split kernel rework. AMD CPU platform and Intel OpenCL were complaining about this. Perhaps we shouldn't pass globals as a pointer at all; this isn't something that is really portable and can perhaps cause issues on 32 bit.	2017-03-16 11:27:17 +01:00
Sergey Sharybin	26620f3f87	Cycles: Avoid some ccl_local in various kernels	2017-03-16 11:27:17 +01:00
Mai Lavelle	8dd0355c21	Cycles: Try to avoid infinite loops by catching invalid ray states	2017-03-14 06:22:57 -04:00
Sergey Sharybin	76acaefdd7	Cycles: Cleanup, wipe obviously outdated parts of split kernel comments	2017-03-13 17:16:16 +01:00
lazydodo	0c72008592	fix msvc warnings about unknown opencl pragmas	2017-03-13 10:08:14 -06:00
Sergey Sharybin	aa36c73c33	Cycles: Add missing header in the file	2017-03-13 16:59:09 +01:00
Hristo Gueorguiev	f169ff8b88	Fix T50925: Add AO approximation to split kernel	2017-03-13 11:15:58 +01:00
Sergey Sharybin	8794a43b68	Cycles: Make MESA compiler more happy While this compiler is not officially supported yet, getting it to work is a nice thing because more and more AMD cards will fall under MESA driver. It's also nice to use explicit comparison with NULL, which makes it more clear whether variable is a boolean or pointer. Even Rust enforces this! Patch by Ian Bruce with own modifications.	2017-03-13 09:57:25 +01:00
Mai Lavelle	96868a3941	Fix T50888: Numeric overflow in split kernel state buffer size calculation Overflow led to the state buffer being too small and the split kernel to get stuck doing nothing forever.	2017-03-11 05:39:28 -05:00

1698 Commits