blender

Author	SHA1	Message	Date
Sergey Sharybin	525673b37b	Cycles: Fix uninitialized variable issue after recent changes	2016-12-14 17:31:11 +01:00
Sergey Sharybin	c4d6fd3ec0	Cycles: Consider GGX/Beckmann/Ashikhmin of 0 roughness a singular ray This matches behavior of Multiscatter GGX and could become handy later on when/if we decide it would be beneficial to replace on closure with another. Reviewers: lukasstockner97, brecht Reviewed By: brecht Differential Revision: https://developer.blender.org/D2413	2016-12-14 11:04:02 +01:00
Sergey Sharybin	87cd56b012	Fix T50075: Assert during debug render of hair_geom_transmission.blend	2016-12-01 12:11:11 +01:00
Brecht Van Lommel	a3abb020e3	Fix Cycles CUDA performance on CUDA 8.0. Mostly this is making inlining match CUDA 7.5 in a few performance critical places. The end result is that performance is now better than before, possibly due to less register spilling or other CUDA 8.0 compiler improvements. On benchmarks scenes, there are 3% to 35% render time reductions. Stack memory usage is reduced a little too. Reviewed By: sergey Differential Revision: https://developer.blender.org/D2269	2016-10-03 22:15:25 +02:00
Sergey Sharybin	0ec87f1227	Cycles: Cleanup, indentation	2016-09-28 17:05:33 +02:00
Sergey Sharybin	e1bfb89da2	Cycles: Fix compilation error with minimal feature set	2016-09-28 17:03:59 +02:00
Lukas Stockner	0b89b31a18	Cycles: Fix T49411: Multiscatter GGX with zero roughness when Filter Glossy is enabled	2016-09-25 22:09:38 +02:00
Sergey Sharybin	c5eb400b7c	Cycles: Fix embarrassing typo Spotted by Mai Lavelle, thanks!	2016-08-05 14:45:54 +02:00
Sergey Sharybin	500e0e9a3d	Cycles: Some more inline policy tweaks for CUDA 8 Makes it so toolkit does exactly the same decision about what to inline, but unfortunately it has really barely visible difference on GTX-980.	2016-08-02 15:13:34 +02:00
Sergey Sharybin	6353ecb996	Cycles: Tweaks to support CUDA 8 toolkit All the changes are mainly giving explicit tips on inlining functions, so they match how inlining worked with previous toolkit. This make kernel compiled by CUDA 8 render in average with same speed as previous kernels. Some scenes are somewhat faster, some of them are somewhat slower. But slowdown is within 1% so far. On a positive side it allows us to enable newer generation cards on buildbots (so GTX 10x0 will be officially supported soon).	2016-08-01 15:54:29 +02:00
Brecht Van Lommel	9b6ed3a42b	Cycles: refactor kernel closure storage to use structs per closure type. Reviewed By: dingto, sergey Differential Revision: https://developer.blender.org/D2127	2016-07-31 02:34:43 +02:00
Sergey Sharybin	d834759423	Cycles: Fix difference in Ashikhmin Shirley shader between CPU and GPU The issue was caused by some NaN appearing in calculations. Visible with scifi_armor_concept.blend from the cloud.	2016-07-28 18:46:29 +02:00
Lukas Stockner	d9cc3ea2c6	Cycles: Fix rays parallel to the surface in the triangle refine and MultiGGX code In the triangle intersection refinement code, rays that are parallel to the triangle caused a divide by zero. These rays might initially hit the triangle due to the watertight intersection test, but are very rare - therefore, just skipping the refinement for them works fine. Also, a few remaining issues in the MultiGGX code are fixed that were caused by rays parallel to the surface (which happened more often there due to smooth shading).	2016-07-25 16:14:25 +02:00
Lukas Stockner	83ae0a0e06	Cycles: Calculate differentials in the Multiscattering GGX closures The Multiscattering GGX closures didn't set the omega_i differentials, which could cause undefined behaviour.	2016-07-25 16:14:25 +02:00
Lukas Stockner	a2c82f5e5d	Cycles: Fix OpenCL compilation after the recent numerical fixes	2016-07-17 19:24:53 +02:00
Lukas Stockner	d9281a6332	Cycles: Fix three numerical issues in the fresnel, normal map and Beckmann code - In fresnel_dielectric, the differentials calculation sometimes divided by zero. - When the normal map was (0.5, 0.5, 0.5), the code would try to normalize a zero vector. Now, it just uses the regular normal as a fallback. - The approximate error function used in Beckmann sampling sometimes overflowed to inf while calculating r^16. The final value is 1 - 1/r^16, however, so now it just returns 1 if the computation would overflow otherwise.	2016-07-16 20:54:14 +02:00
Lukas Stockner	5ba78d76d4	Cycles: Deduplicate geometric factor calculation in the Beckmann distribution Also, this fixes a numerical issue where A would be inf. Since later G is set to 1 if A is larger than 1.6, the code now checks the reciprocal of A for being smaller than 1/1.6 - same effect, but no inf involved.	2016-07-16 20:54:14 +02:00
Sergey Sharybin	23cc453975	Fix T48732: New GGX breaks OpenCL kernel Make sure we don't perform any implicit address space conversion. A bit annoying, but less intrusive approaches (like using temp private variable in .cl kernel) do not work correct here. Using generic address space will help from code side here, but will be somewhat slower due to extra things happening as far as i know.	2016-06-28 17:15:35 +05:00
Lukas Stockner	2a69b09b62	Fix T48732 v2: New GGX breaks OpenCL kernel As far as I can see, the second issue there was that the functions receive a pointer to a member variable of the ShaderData, which is stored in global memory. However, this means that the pointer points to global memory as well, therefore OpenCL requires the ccl_addr_space "keyword" in front of the pointer. With this commit, the OpenCL kernels build on Linux with the Intel CPU OpenCL runtime - however, they already did without the change and I don't have an AMD card, so I can't really test whether the AMD runtime is happy as well now.	2016-06-26 00:51:16 +02:00
Thomas Dinges	99088f8b55	Fix T48732, OpenCL compile failure after Multiscatter GGX commit. Use OpenCL "all" builtin type for conversion, according to OpenCL 1.1 spec 6.3e.	2016-06-25 11:14:06 +02:00
Lukas Stockner	23c276832b	Cycles: Add multi-scattering, energy-conserving GGX as an option to the Glossy, Anisotropic and Glass BSDFs This commit adds a new distribution to the Glossy, Anisotropic and Glass BSDFs that implements the multiple-scattering microfacet model described in the paper "Multiple-Scattering Microfacet BSDFs with the Smith Model". Essentially, the improvement is that unlike classical GGX, which only models single scattering and assumes the contribution of multiple bounces to be zero, this new model performs a random walk on the microsurface until the ray leaves it again, which ensures perfect energy conservation. In practise, this means that the "darkening problem" - GGX materials becoming darker with increasing roughness - is solved in a physically correct and efficient way. The downside of this model is that it has no (known) analytic expression for evalation. However, it can be evaluated stochastically, and although the correct PDF isn't known either, the properties of MIS and the balance heuristic guarantee an unbiased result at the cost of slightly higher noise. Reviewers: dingto, #cycles, brecht Reviewed By: dingto, #cycles, brecht Subscribers: bliblubli, ace_dragon, gregzaal, brecht, harvester, dingto, marcog, swerner, jtheninja, Blendify, nutel Differential Revision: https://developer.blender.org/D2002	2016-06-23 22:57:26 +02:00
Sergey Sharybin	7ac126e728	Fix T46492: GGX distribution produces black pixels The issue was caused by some numerical instability.	2016-06-17 16:30:29 +02:00
Sergey Sharybin	84c68dcb3f	Cycles: Minor cleanup, whitespace around keyword and preprocessor indent	2016-04-13 08:58:52 +02:00
Sergey Sharybin	700722f686	Cycles: Cleanup, indent nested preprocessor directives Quite straightforward, main trick is happening in path_source_replace_includes(). Reviewers: brecht, dingto, lukasstockner97, juicyfruit Differential Revision: https://developer.blender.org/D1794	2016-03-25 13:55:42 +01:00
Sergey Sharybin	69dc0c3192	Cycles: Fixes for Burley BSSRDF There are several fixes in here, which hopefully will make the shader working correct without too much magic in there. First of all, this commit brings BURLEY_TRUNCATE down from 30 to 16 which reduces noise a lot. It's still higher than original truncate from Brecht, but this reduces PDF value at a cutoff distance by an order of magnitude (now it's 0.008387, previously it was 0.063521 for the albedo of 0.8 and radius 1.0). This should converge to a proper result faster and don't have artifacts. This kind of reverts fix for T47356, but after additional thinking came to conclusion Burley is not being totally smooth, it is about giving less waxy results which it's kind of doing in the file. Second of all, this commit fixes burley_eval() to use normalized diffusion reflectance. This matches the way we calculate CDF and solves numeric instability close to 0, making PDF profile looking closer to other SSS profiles: https://developer.blender.org/F282355 https://developer.blender.org/F282356 https://developer.blender.org/F282357 Reviewers: brecht Reviewed By: brecht Differential Revision: https://developer.blender.org/D1792	2016-02-13 13:29:13 +01:00
Sergey Sharybin	2ac88328b0	Cycles: Fix Burley's CDF truncation after recent radius fix This is all not really ideal, but good enough for tonight. More thoughts and investigation tomorrow!	2016-02-08 21:50:38 +01:00
Sergey Sharybin	dae8326d1e	Fix T47356: Too sharp falloww with Burley BSSRDF After the clamping commit we need to bump BURLEY_TRUNCATE constant a bit, otherwise mean free path does not really match the disk radius needed for importance sampling.	2016-02-08 14:54:11 +01:00
Brecht Van Lommel	4443bb8922	Fix Burley BSSRDF NaNs and fireflies. Explicitly truncate to Rm same way as the Gaussian BSSRDF, and use safe_sqrtf() to be sure in case of float precision issues.	2016-02-06 11:52:43 +01:00
Sergey Sharybin	e688a62712	Cycles: Fix for initial guess of the radius for Burley BSSRDF The value was too high, causing bad Newton iteration step. Now the value is not so good, but it's still within 9 iterations and those high number of iterations are only happening in approx 1% of input values.	2016-02-05 10:06:08 +01:00
Sergey Sharybin	3e7389eaf2	Cycles: Speedup of Christensen-Burley SSS falloff function The idea is simply to pre-compute fitting and parameterization in the bssrdf_setup() function and re-use the values in both sample() and eval(). The only trick is where to store the pre-calculated values and the answer is inside of ShaderClosure->custom{1,2,3}. There's no memory bump here because we now simply re-use padding fields for the pre-calculated values. Similar trick we can do for other BSDFs. Seems to give nice speedup up to 7% here on my desktop with Core i7 CPU, SSE4.1 kernel.	2016-02-04 15:29:58 +01:00
Sergey Sharybin	ad26407b52	Cycles: Implement approximate reflectance profiles Using this paper: http://graphics.pixar.com/library/ApproxBSSRDF/paper.pdf This model gives less blurry results than the Cubic and Gaussian we had implemented: - Cubic: https://developer.blender.org/F279670 - Burley: https://developer.blender.org/F279671 The model is called "Christensen-Burley" in the interface, which actually should be read as "Physically based" or "Realistic". Reviewers: juicyfruit, dingto, lukasstockner97, brecht Reviewed By: brecht, dingto Subscribers: robocyte Differential Revision: https://developer.blender.org/D1759	2016-02-04 13:27:23 +05:00
Sergey Sharybin	e42852a339	Cycles: Cleanup and reference actual paper used for BSSRDF sampling	2016-02-02 18:06:29 +01:00
Sergey Sharybin	12b7850d4f	Cycles: Fix for transmissive microfacet sampling This is an alternate fix for T40964 which resolves bad handling of caustics reported in T45609. There were too much transmission rays being discarded by the original fix, which caused by caustic light being totally disabled. There is still some room for investigation why exactly original paper didn't work that well, could be caused by the way how the pdf is calculated. In any case current results seems rather correct now.	2015-07-31 13:46:58 +02:00
Thomas Dinges	6a0a205cb4	Cycles: Simplify volume_phase_eval(). This simplification is safe, as the call to volume_phase_eval() is guarded behind a CLOSURE_IS_PHASE check, which is equal to CLOSURE_VOLUME_HENYEY_GREENSTEIN_ID. I don't think we will add more phase functions anytime soon, if at all.	2015-06-11 15:18:33 +02:00
George Kyriazis	7f4479da42	Cycles: OpenCL kernel split This commit contains all the work related on the AMD megakernel split work which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely someone else which we're forgetting to mention. Currently only AMD cards are enabled for the new split kernel, but it is possible to force split opencl kernel to be used by setting the following environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1. Not all the features are supported yet, and that being said no motion blur, camera blur, SSS and volumetrics for now. Also transparent shadows are disabled on AMD device because of some compiler bug. This kernel is also only implements regular path tracing and supporting branched one will take a bit. Branched path tracing is exposed to the interface still, which is a bit misleading and will be hidden there soon. More feature will be enabled once they're ported to the split kernel and tested. Neither regular CPU nor CUDA has any difference, they're generating the same exact code, which means no regressions/improvements there. Based on the research paper: https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf Here's the documentation: https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit Design discussion of the patch: https://developer.blender.org/T44197 Differential Revision: https://developer.blender.org/D1200	2015-05-09 19:52:40 +05:00
Sergey Sharybin	ae7d84dbc1	Cycles: Use native saturate function for CUDA This more a workaround for CUDA optimizer which can't optimize clamp(x, 0, 1) into a single instruction and uses 4 instructions instead. Original patch by @lockal with own modification: Don't make changes outside of the kernel. They don't make any difference anyway and term saturate() has a bit different meaning outside of kernel. This gives around 2% of speedup in Barcelona file, but in more complex shader setups with lots of math nodes with clamping speedup could be much nicer. Subscribers: dingto Projects: #cycles Differential Revision: https://developer.blender.org/D1224	2015-04-28 00:38:32 +05:00
Thomas Dinges	bc160d8a85	Cleanup: Code style.	2015-04-26 00:42:26 +02:00
Sergey Sharybin	e2354e64d2	Cycles: Cleanup, spaces around assignment operator Did some bad spacing in recent commits, better to get rid of those so they does not confuse those who're working on sources.	2015-04-07 00:25:54 +05:00
Sergey Sharybin	a9bb8d8a73	Cycles: de-duplicate fast/approximate erf function calculation Our own implementation is in fact the same performance as in fast_math from OpenShadingLanguage, but implementation from fast_math is using explicit madd function, which increases chance of compiler deciding to use intrinsics.	2015-04-06 12:49:44 +05:00
Sergey Sharybin	b06962fcfe	Cycles: Avoid using lookup table for Beckmann slopes on GPU This patch is based on some work done in D788 and re-formulation from Beckmann implementation in OpenShadingLanguage. Skipping texture lookup helps a lot on GPUs where it's more expensive to access texture memory than to do some extra calculation in threads. CPU code still uses lookup-table based approach since this seems to be still faster (at least on computers i've got access to). This change gives about 2% speedup on BMW scene with GTX560TI.	2015-04-05 19:07:45 +05:00
Sergey Sharybin	252b36ce77	Cycles: Remove unused Beckmann slope sampling code It did not preserve stratification too well and lookup-table approach was working much better. There are now also some more interesting forumlation from Wenzel and OpenShadingLanguage which should work better than old code.	2015-04-05 19:07:44 +05:00
Sergey Sharybin	af399884e1	Fix T44113: Ashikhmin-Shirley distribution of glossy shader at 0 roughness causes artifacts when background uses MIS Was a division by zero error, solved in the same way as beckmann/ggx deals with small roughness values.	2015-04-01 14:21:21 +05:00
Sergey Sharybin	5ff132182d	Cycles: Code cleanup, spaces around keywords This inconsistency drove me totally crazy, it's really confusing when it's inconsistent especially when you work on both Cycles and Blender sides. Shouldn;t cause merge PITA, it's whitespace changes only, Git should be able to merge it nicely.	2015-03-28 00:15:15 +05:00
Sergey Sharybin	87cff57207	Fix T44123: Cycles SSS renders black in recent builds Issue was introduced in 01ee21f where i didn't notice *_setup() function only doing partial initialization, and some of parameters are expected to be initialized by callee function. This was hitting only some setups, so tests with benchmark scenes didn't unleash issues. Now it should all be fine. This is to go to the 2.74 branch and we actually might re-AHOY.	2015-03-25 02:33:49 +05:00
Sergey Sharybin	ed7e593a4b	Fix T43926: Volume scatter: intersecting objects GPU rendering artifacts Fix T44007: Cycles Volumetrics: block artifacts with overlapping volumes The issue was caused by uninitialized parameters of some closures, which lead to unpredictable behavior of shader_merge_closures().	2015-03-23 12:48:33 +05:00
Thomas Dinges	6f3500db05	Cleanup: Remove unused SD_PHASE_HAS_EVAL flag. We only have a non-singular volume closure and therefore no need to distinguish it.	2015-02-18 16:33:31 +01:00
Thomas Dinges	a2366a3a2e	Cleanup for Cycles hair shader ifdefs. sc->T and sc->data2 were behind __HAIR__ ifdef, now they are not anymore, so we can always assign the correct value.	2015-02-18 15:57:39 +01:00
Sergey Sharybin	f3e831f02d	Cycles: Fix for hair transmission BSDF not returning proper label	2015-02-17 22:40:00 +05:00
Thomas Dinges	a0d7db503d	Cycles: Small tweaks for Henyey Greenstein closure code. * Avoid duplicative fabs(g) check in sample code. * Avoid dot product in eval code. Helps like ~1% when Scatter Anisotropy is 0.	2015-02-17 17:48:18 +01:00
Thomas Dinges	bf878d3c3d	Cycles: Remove empty closure blur code and the corresponding entries in the switch. Most compilers will probably optimize that out, but I still don't see a reason to keep it.	2015-02-17 13:44:25 +01:00

1 2 3

125 Commits