blender

Author	SHA1	Message	Date
Lukas Stockner	43b374e8c5	Cycles: Implement denoising option for reducing noise in the rendered image This commit contains the first part of the new Cycles denoising option, which filters the resulting image using information gathered during rendering to get rid of noise while preserving visual features as well as possible. To use the option, enable it in the render layer options. The default settings fit a wide range of scenes, but the user can tweak individual settings to control the tradeoff between a noise-free image, image details, and calculation time. Note that the denoiser may still change in the future and that some features are not implemented yet. The most important missing feature is animation denoising, which uses information from multiple frames at once to produce a flicker-free and smoother result. These features will be added in the future. Finally, thanks to all the people who supported this project: - Google (through the GSoC) and Theory Studios for sponsoring the development - The authors of the papers I used for implementing the denoiser (more details on them will be included in the technical docs) - The other Cycles devs for feedback on the code, especially Sergey for mentoring the GSoC project and Brecht for the code review! - And of course the users who helped with testing, reported bugs and things that could and/or should work better!	2017-05-07 14:40:58 +02:00
Mai Lavelle	915766f42d	Cycles: Branched path tracing for the split kernel This implements branched path tracing for the split kernel. General approach is to store the ray state at a branch point, trace the branched ray as normal, then restore the state as necessary before iterating to the next part of the path. A state machine is used to advance the indirect loop state, which avoids the need to add any new kernels. Each iteration the state machine recreates as much state as possible from the stored ray to keep overall storage down. It's kind of hard to keep all the different integration loops in sync, so this needs lots of testing to make sure everything is working correctly. We should probably start trying to deduplicate the integration loops more now. Nonbranched BMW is ~2% slower, while classroom is ~2% faster, other scenes could use more testing still. Reviewers: sergey, nirved Reviewed By: nirved Subscribers: Blendify, bliblubli Differential Revision: https://developer.blender.org/D2611	2017-05-02 14:26:46 -04:00
Sergey Sharybin	0579eaae1f	Cycles: Make all #include statements relative to cycles source directory The idea is to make include statements more explicit and obvious where the file is coming from, additionally reducing chance of wrong header being picked up. For example, it was not obvious whether bvh.h was referring to builder or traversal, whether node.h is a generic graph node or a shader node and cases like that. Surely this might look obvious for the active developers, but after some time of not touching the code it becomes less obvious where file is coming from. This was briefly mentioned in T50824 and seems @brecht is fine with such explicitness, but need to agree with all active developers before committing this. Please note that this patch is lacking changes related to GPU/OpenCL support. This will be solved if/when we all agree this is a good idea to move forward. Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner Reviewed By: lukasstockner97, maiself, nirved, dingto Subscribers: brecht Differential Revision: https://developer.blender.org/D2586	2017-03-29 13:41:11 +02:00
Hristo Gueorguiev	8ada7f7397	Cycles: Remove ccl_addr_space from RNG passed to functions Simplifies code quite a bit, making it shorter and easier to extend. Currently no functional changes for users, but is required for the upcoming work of shadow catcher support with OpenCL.	2017-03-27 10:46:28 +02:00
Sergey Sharybin	d14e39622a	Cycles: First implementation of shadow catcher It uses an idea of accumulating all possible light reachable across the light path (without taking shadow blocked into account) and accumulating total shaded light across the path. Dividing second figure by first one seems to be giving good estimate of the shadow. In fact, to my knowledge, it's something really similar to what is happening in the denoising branch, so we are aligned here which is good. The workflow is as follows: - Create an object which matches real-life object on which shadow is to be caught. - Create approximately similar material on that object. This is needed to make indirect light properly affecting CG objects in the scene. - Mark object as Shadow Catcher in the Object properties. Ideally, after doing that it will be possible to render the image and simply alpha-over it on top of real footage.	2017-03-27 10:46:03 +02:00
Hristo Gueorguiev	57e26627c4	Cycles: SSS and Volume rendering in split kernel Decoupled ray marching is not supported yet. Transparent shadows are always enabled for volume rendering. Changes in kernel/bvh and kernel/geom are from Sergey. This simplifies code significantly, and prepares it for record-all transparent shadow function in split kernel.	2017-03-09 17:09:37 +01:00
Mai Lavelle	352ee7c3ef	Cycles: Remove ccl_fetch and SOA	2017-03-08 00:52:41 -05:00
Sergey Sharybin	0330741548	Cycles: Add option to replace GI with AO approximation after certain amount of bounces This is a speed up option which is mainly useful for viewport. Gives nice speedup in the barbershop scene of 2x when replacing GI with AO after 2nd bounce without losing too much detail. Reviewers: brecht Subscribers: eyecandy, venomgfx Differential Revision: https://developer.blender.org/D2383	2017-01-27 14:21:49 +01:00
Sergey Sharybin	bc096e1eb8	Cycles: Split ShaderData object and shader flags We started to run out of bits there, so now we separate flags which came from __object_flags and which are either runtime or coming from __shader_flags. Rule now is: SD_OBJECT_* flags are to be tested against new object_flags field of ShaderData, all the rest flags are to be tested against flags field of ShaderData. There should be no user-visible changes, and time difference should be minimal. In fact, from tests here can only see hardly measurable difference and sometimes the new code is somewhat faster (all within a noise floor, so hard to tell for sure). Reviewers: brecht, dingto, juicyfruit, lukasstockner97, maiself Differential Revision: https://developer.blender.org/D2428	2017-01-23 12:56:55 +01:00
Sergey Sharybin	b9311b5e5a	Cycles: Make object flag names more obvious that they are object and not shader	2017-01-23 12:14:17 +01:00
Sergey Sharybin	53fa389802	Cycles: Use dedicated debug passes for traversed nodes and intersection tests This way it's more clear whether some issue is caused by lots of geometry in the node or by lots of "transparent" BVH nodes.	2017-01-12 13:44:35 +01:00
Sergey Sharybin	dd58390d71	Fix emissive volumes generates unexpected fireflies around intersections Discard the whole volume stack on the last bounce (but keep world volume if present). Volumes are expected to be closed manifold meshes, meaning if ray entered the volume there should be an intersection event of ray exiting the volume. Case when ray hit nothing and there are still non-world volumes in the stack can happen in any of the following cases. 1. Mesh is not closed manifold. Such configurations are not really supported anyway and should not be used. Previous code would have considered the infinite length of the ray to sample across, so render result wasn't really correct anyway. 2. Exit intersection is farther away than the camera far clip distance. This case also will behave differently now, but previously it wasn't really correct either, so it's not like we're breaking something which was working as expected. 3. We missed exit event due to intersection precision issues. This is exactly the case which this patch fixes and avoids fireflies. 4. Volume has Camera only visibility (all the rest visibility is set to off) This is what could be considered a regression but could be solved quite easily by checking volume stack's objects flags and keep entries which don't have Volume Scatter visibility (or even better: ensure Volume Scatter visibility for objects with volume closure). Fixes T46108: Cycles - Overlapping emissive volumes generates unexpected bright hotspots around the intersection Also fixes fireflies appearing on the edges of cube with emissive volume. Reviewers: juicyfruit, brecht Reviewed By: brecht Maniphest Tasks: T46108 Differential Revision: https://developer.blender.org/D2212	2016-12-08 17:35:43 +01:00
Mai Lavelle	a1aa3a8b75	Cycles: Add comments to endif directives `kernel_path.h` and `kernel_path_branched.h` have a lot of conditional code and it was kind of hard to tell what code belonged to which directive. Should be easier to read now.	2016-11-10 19:50:23 -05:00
Lukas Stockner	04aa454075	Cycles: Deduplicate AO calculation No functional changes.	2016-10-31 00:40:59 +01:00
Lukas Stockner	26bf230920	Cycles: Add optional probabilistic termination of light samples based on their expected contribution In scenes with many lights, some of them might have a very small contribution to some pixels, but the shadow rays are traced anyways. To avoid that, this patch adds probabilistic termination to light samples - if the contribution before checking for shadowing is below a user-defined threshold, the sample will be discarded with probability (1 - (contribution / threshold)) and otherwise kept, but weighted more to remain unbiased. This is the same approach that's also used in path termination based on length. Note that the rendering remains unbiased with this option, it just adds a bit of noise - but if the setting is used moderately, the speedup gained easily outweighs the additional noise. Reviewers: #cycles Subscribers: sergey, brecht Differential Revision: https://developer.blender.org/D2217	2016-10-30 11:31:28 +01:00
Brecht Van Lommel	a3abb020e3	Fix Cycles CUDA performance on CUDA 8.0. Mostly this is making inlining match CUDA 7.5 in a few performance critical places. The end result is that performance is now better than before, possibly due to less register spilling or other CUDA 8.0 compiler improvements. On benchmarks scenes, there are 3% to 35% render time reductions. Stack memory usage is reduced a little too. Reviewed By: sergey Differential Revision: https://developer.blender.org/D2269	2016-10-03 22:15:25 +02:00
Sergey Sharybin	99b1c1018a	Cycles: Recent SSS inline changes broke CPU tests Very weird, but let's just fall back a bit for now.	2016-08-03 15:27:48 +02:00
Sergey Sharybin	6353ecb996	Cycles: Tweaks to support CUDA 8 toolkit All the changes are mainly giving explicit tips on inlining functions, so they match how inlining worked with previous toolkit. This make kernel compiled by CUDA 8 render in average with same speed as previous kernels. Some scenes are somewhat faster, some of them are somewhat slower. But slowdown is within 1% so far. On a positive side it allows us to enable newer generation cards on buildbots (so GTX 10x0 will be officially supported soon).	2016-08-01 15:54:29 +02:00
Sergey Sharybin	4355603790	Cycles: Move BVH kernel files to own directory BVH traversal is not really that much a geometry and we've got quite some traversals now. Makes sense to keep them separate in the name of source structure clarity.	2016-07-11 13:58:47 +02:00
Lukas Stockner	23c276832b	Cycles: Add multi-scattering, energy-conserving GGX as an option to the Glossy, Anisotropic and Glass BSDFs This commit adds a new distribution to the Glossy, Anisotropic and Glass BSDFs that implements the multiple-scattering microfacet model described in the paper "Multiple-Scattering Microfacet BSDFs with the Smith Model". Essentially, the improvement is that unlike classical GGX, which only models single scattering and assumes the contribution of multiple bounces to be zero, this new model performs a random walk on the microsurface until the ray leaves it again, which ensures perfect energy conservation. In practise, this means that the "darkening problem" - GGX materials becoming darker with increasing roughness - is solved in a physically correct and efficient way. The downside of this model is that it has no (known) analytic expression for evaluation. However, it can be evaluated stochastically, and although the correct PDF isn't known either, the properties of MIS and the balance heuristic guarantee an unbiased result at the cost of slightly higher noise. Reviewers: dingto, #cycles, brecht Reviewed By: dingto, #cycles, brecht Subscribers: bliblubli, ace_dragon, gregzaal, brecht, harvester, dingto, marcog, swerner, jtheninja, Blendify, nutel Differential Revision: https://developer.blender.org/D2002	2016-06-23 22:57:26 +02:00
Brecht Van Lommel	b49185df99	Cycles CUDA: reduce branched path stack memory by sharing indirect ShaderData. Saves about 15% for the branched path kernel.	2016-05-25 21:13:24 +02:00
Brecht Van Lommel	999d5a6785	Cycles CUDA: reduce stack memory by reusing ShaderData. 57% less for path and 48% less for branched path.	2016-05-23 22:29:24 +02:00
Sergey Sharybin	700722f686	Cycles: Cleanup, indent nested preprocessor directives Quite straightforward, main trick is happening in path_source_replace_includes(). Reviewers: brecht, dingto, lukasstockner97, juicyfruit Differential Revision: https://developer.blender.org/D1794	2016-03-25 13:55:42 +01:00
Brecht Van Lommel	3c4f971392	Workaround for T47213: branched path sampling issues with CUDA 7.5.	2016-02-19 00:49:24 +01:00
Sergey Sharybin	1f273cec00	Cycles: Tweak inline policy for some functions The goal is to make Experimental kernel closer in performance to the official kernel, avoiding spills and such. There should not be big impact on official kernel, own tests showed few percent performance drop on laptop's GPU. CPU was always the same speed on AVX, AVX2 and SSE4.1 CPUs i've been testing here. This seems to be the last essential step before we can get rid of Experimental kernel and enable SSS officially on GPU without causing some major performance issues. Surely some more tweaks are possibly required, but that we can do for until cows go home anyway.	2016-01-14 14:53:05 +05:00
Thomas Dinges	83e73a2100	Cycles: Refactor how we pass bounce info to light path node. This commit changes the way how we pass bounce information to the Light Path node. Instead of manually copying the bounces into ShaderData, we now directly pass PathState. This reduces the arguments that we need to pass around and also makes it easier to extend the feature. This commit also exposes the Transmission Bounce Depth to the Light Path node. It works similar to the Transparent Depth Output: Replace a Transmission lightpath after X bounces with another shader, e.g a Diffuse one. This can be used to avoid black surfaces, due to low amount of max bounces. Reviewed by Sergey and Brecht, thanks for some help with this. I tested compilation and usage on CPU (SVM and OSL), CUDA, OpenCL Split and Mega kernel. Hopefully this covers all devices. :)	2016-01-06 23:43:29 +01:00
Sergey Sharybin	d0a9ec5efc	Cycles: Fix SSS object not properly reflected in glossy object with indirect clamping This fixes remained issues reported in T46908.	2015-12-02 16:00:01 +05:00
Sergey Sharybin	6147c4037d	Cycles: Fix wrong volume stack after SSS bounce Was introduced by a recent fixes, now it should be all correct and additionally it solves the TODO mentioned in the code.	2015-11-28 20:07:34 +05:00
Sergey Sharybin	f5d1551b6e	Cycles: Fix wrong original ray used for SSS baking Also de-duplicated some code by moving to an utility function.	2015-11-28 20:07:34 +05:00
Sergey Sharybin	1e43f0d742	Cycles: Set of fixes for delayed SSS ray tracing There were multiple issues which are solved now: - It was possible that ray wouldn't be bounced off the BSSRDF, for example when PDF or shader eval is zero. In this case PathState might have been left in pre-bounced state which would have given incorrect shading results. This is solved by having separate PathState for each of the hits. - Path radiance summing wasn't happening correctly as well, indirect rays were using wrong path radiance in the case when there were more than one hit recorded. This is now using a bit trickier state machine which calculates path radiance for just SSS (both direct and indirect) and then sums it back to the final radiance. - Previous commit wasn't totally correct either and was an induced bug due to wrong path state left from the "un-happened" ray bounce. There should be no special case happening here, BSSRDFs will be replaced with diffuse ones due to PATH_RAY_DIFFUSE_ANCESTOR flag. - Merged back codebases for "delayed" and "immediate" indirect SSS ray tracing, hopefully making it easier to maintain the codebase. Sure, this change brings memory usage back by about 4-5%, but overall it's still about 2x memory reduction for the experimental kernel here. Thanks Brecht for the review!	2015-11-28 20:07:34 +05:00
Sergey Sharybin	8919ed3a62	Cycles: Fallback to diffuse BSDF for the indirect SSS rays when BSSRDF is hit This is actually how it was intended to work, just didn't notice it wasn't really happening in the main ray loop. Solves some memory issues reported in T46880.	2015-11-28 20:07:34 +05:00
Sergey Sharybin	20fc9c00fd	Cycles: Fully roll-back to non-delayed SSS indirect rays for CPU There are some issues to be solved with the recent optimization we did for the indirect rays for the SSS. Those issues will take a bit of a time to be fully solved still and we need to unlock Caminandes team now, so let's revert some changes back. CUDA will still use delayed indirect rays since it's an experimental feature. For the details about what's to be done still please refer to T46880.	2015-11-27 17:15:02 +05:00
Sergey Sharybin	175f00c89a	Revert "Cycles: Fix wrong SSS with regular path tracing and clamping enabled" This wasn't really a complete fix and only worked if there was a single scatter event recorded only. Proper fix requires some more thoughts to make it correct without memory use increase. This reverts commit `bf9e88bfbe`.	2015-11-27 17:15:02 +05:00
Sergey Sharybin	bf9e88bfbe	Cycles: Fix wrong SSS with regular path tracing and clamping enabled Radiance sum and reset was happening in different order after `26f1c51`. This is a quick fix to unlock Caminandes team, perhaps we can avoid having separate variable to detect when radiance is to be sum.	2015-11-26 16:11:41 +05:00
Sergey Sharybin	26f1c51ca6	Cycles: Trace indirect subsurface rays by restarting the integrator loop This gives much lower stack usage on GPU and reduces kernel memory size to around 448MB on GTX560Ti (comparing to 652MB with previous commit and 946MB with official release). There's also a barely measurable speedup of around 5%, but this is to be confirmed still. At this stage we're using only ~3% for the experimental kernel and SSS rendering seems to be faster by 40% and after some further testing we might consider making SSS and CMJ official features and remove experimental precompiled kernels.	2015-11-25 13:01:22 +05:00
Sergey Sharybin	2a5c1fc9cc	Cycles: Delay shooting SSS indirect rays The idea is to delay shooting indirect rays for the SSS sampling and trace them after the main integration loop was finished. This reduces GPU stack usage even further and brings it down to around 652MB (comparing to 722MB before the change and 946MB with previous stable release). This also solves the speed regression happened in the previous commit and now simple SSS scene (SSS suzanne on the floor) renders in 0:50 (comparing to 1:16 with previous commit and 1:03 with official release).	2015-11-25 13:01:22 +05:00
Sergey Sharybin	8bca34fe32	Cycles: Avoid having ShaderData on the stack This commit introduces a SSS-oriented intersection structure which is replacing old logic of having separate arrays for just intersections and shader data and encapsulates all the data needed for SSS evaluation. This gives a huge stack memory saving on GPU. In own experiments it gave 25% memory usage reduction on GTX560Ti (722MB vs. 946MB). Unfortunately, this gave some performance loss of 20% which only happens on GPU. This is perhaps due to different memory access pattern. Will be solved in the future, hopefully. Famous saying: won in memory - lost in time (which is also valid in other way around).	2015-11-25 13:01:22 +05:00
Sergey Sharybin	099aaea447	Cycles: Move branched path tracing into own file Code there started becoming a bit too big, by splitting it up it'll make it easier to do improvements or extending the features in there. The layout is not totally final yet, would need to try de-duplicating parts of code from split kernel with non-split integrators.	2015-06-15 23:02:42 +02:00
Sergey Sharybin	596eadf0e1	Cycles: Add debug pass which shows number of instance pushes during camera ray intersection TODO: We might want to refactor debug passes into PASS_DEBUG and some debug_type (similar to Blender's side passes) to avoid issue of running out of bits.	2015-06-12 00:12:03 +02:00
Sergey Sharybin	2bd6de5bbb	Cycles: Add debug pass showing average number of ray bounces per pixel Quite straightforward implementation, but still needs some work for the split kernel. Includes both regular and split kernel implementation for that. The pass is not exposed to the interface yet because it's currently not really easy to have same pass listed in the menu multiple times.	2015-06-11 14:53:15 +02:00
George Kyriazis	7f4479da42	Cycles: OpenCL kernel split This commit contains all the work related to the AMD megakernel split work which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely someone else which we're forgetting to mention. Currently only AMD cards are enabled for the new split kernel, but it is possible to force split opencl kernel to be used by setting the following environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1. Not all the features are supported yet, and that being said no motion blur, camera blur, SSS and volumetrics for now. Also transparent shadows are disabled on AMD device because of some compiler bug. This kernel also only implements regular path tracing and supporting branched one will take a bit. Branched path tracing is exposed to the interface still, which is a bit misleading and will be hidden there soon. More features will be enabled once they're ported to the split kernel and tested. Neither regular CPU nor CUDA has any difference, they're generating the same exact code, which means no regressions/improvements there. Based on the research paper: https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf Here's the documentation: https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit Design discussion of the patch: https://developer.blender.org/T44197 Differential Revision: https://developer.blender.org/D1200	2015-05-09 19:52:40 +05:00
Thomas Dinges	900fc43bb4	Cleanup: Remove unused ray type flags. They were added for completeness, but it seems we don't need them.	2015-05-08 12:10:26 +02:00
Thomas Dinges	5e423775da	Cleanup: Move Cycles volume stack update for subsurface into kernel_volume.h.	2015-04-28 11:20:27 +02:00
Thomas Dinges	3db0e1ef6a	Cycles: Simplify volume light connect code.	2015-03-13 00:09:13 +01:00
Thomas Dinges	60679a171d	Revert "Cleanup: Simplify camera sample motion blur code." This reverts commit `8197f0bb64`.	2015-02-26 13:27:02 +01:00
Thomas Dinges	8197f0bb64	Cleanup: Simplify camera sample motion blur code.	2015-02-26 10:30:01 +01:00
Thomas Dinges	d979f39cf1	Cycles: Small improvement for volume render (decoupled) Simplify branching here a bit, helps ~3% in volume_light_sampling.blend (Branched MIS scene).	2015-02-14 20:44:30 +01:00
Sergey Sharybin	25f33e058a	Fix T43562: Cycles gets stuck with camera in volume in certain setup The issue was caused by the way how we shoot the ray to see which rays we're inside which might start bouncing back-n-forth between two close to parallel intersecting faces. Real solution would be to record all the intersections when shooting the ray, but it's kinda tricky on GPU because of needed sorting and uncertainty of how huge intersection array should be. For now we'll just limit number of steps in the check so in worst case we'll have some samples not being correct which will be compensated with further sampling. Shouldn't be an issue since probability of such a lock is quite small actually.	2015-02-05 16:10:50 +05:00
Thomas Dinges	ee36e75b85	Cleanup: Fix Cycles Apache header. This was already mixed a bit, but the dot belongs there.	2014-12-25 02:50:24 +01:00
Campbell Barton	4c60aae66c	Cleanup: warnings	2014-10-06 23:19:07 +02:00
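
Note on commit 26bf230920 (probabilistic termination of light samples): the rule described in that message can be sketched in a few lines. This is only an illustration of the stated rule, not the Cycles kernel code. The function name `terminate_light_sample`, the scalar `contribution`, and the injectable `rng` parameter are all assumptions made for this example. A sample whose unshadowed contribution is below the threshold is dropped with probability 1 - (contribution / threshold); a surviving sample is divided by its survival probability so that the expected value is unchanged.

```python
import random


def terminate_light_sample(contribution, threshold, rng=random.random):
    """Hypothetical helper: Russian-roulette test for one light sample.

    contribution: non-negative scalar estimate of the sample's contribution
                  before the shadow ray is traced (an assumption; a renderer
                  would reduce an RGB value to a scalar first).
    threshold:    user-defined cutoff; 0 disables termination.
    Returns the weight-corrected contribution, or 0.0 if the sample is dropped.
    """
    if threshold <= 0.0 or contribution >= threshold:
        # Termination disabled or sample important enough: always keep it.
        return contribution

    keep_probability = contribution / threshold
    if rng() >= keep_probability:
        # Dropped with probability 1 - (contribution / threshold);
        # no shadow ray would be traced for this sample.
        return 0.0

    # Kept: divide by the survival probability to stay unbiased.
    return contribution / keep_probability


def mean_after_termination(contribution, threshold, n, seed=0):
    """Average many trials to show the expected value is preserved."""
    generator = random.Random(seed)
    total = sum(
        terminate_light_sample(contribution, threshold, generator.random)
        for _ in range(n)
    )
    return total / n
```

With contribution 0.02 and threshold 0.1, one sample in five survives on average and each survivor is scaled up to 0.1, so the long-run mean stays at 0.02 while only a fifth of the shadow rays are traced. That matches the trade-off the commit message describes: unbiased, slightly noisier, and cheaper.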


205 Commits