path: root/src/shader_recompiler
Commit message  Author  Age  Files  Lines
* Merge pull request #9694 from ameerj/txq-mipsliamwhite2023-01-2911-29/+37
|\ | | | | shader_recompiler: TXQ: Skip QueryLevels when possible
| * shader_recompiler: TXQ: Skip QueryLevels when possibleameerj2023-01-2811-29/+37
| |
* | Merge pull request #9687 from ameerj/ogl-shader-msbunnei2023-01-294-33/+46
|\ \ | | | | | | glasm, glsl: Implement multisampled Image Fetch
| * | emit_glsl_image: Fix ImageFetch for MSAA texturesameerj2023-01-281-6/+11
| | |
| * | glasm: Add MS sampler typesameerj2023-01-272-5/+8
| | |
| * | glsl: Add MS sampler typesameerj2023-01-271-22/+27
| |/
* | Merge pull request #9682 from ameerj/shader-s32bunnei2023-01-2813-46/+19
|\ \ | |/ |/| shader_recompiler: Remove S32 IR type
| * shader_recompiler: Remove S32 IR typeameerj2023-01-2613-46/+19
| | The frontend IR opcodes do not distinguish between signed and unsigned integer types.
| | Fixes broken shaders when IR validation/graphics debugging is enabled for shaders that used BitCastS32F32
* | spirv: fix multisampled image fetchLiam2023-01-234-2/+16
|/
* Avoid OOB array access reading passthrough attr maskBilly Laws2023-01-071-1/+1
| | | YFC 1.5 extended the size of the varying mask used to hold passthrough attrs without considering this
* Run clang-formatBilly Laws2023-01-055-23/+33
|
* shader_recompiler: Fix shuffle partitioning for >64 invoc-per-subgroup GPUsBilly Laws2023-01-051-30/+28
| | | | The existing implementation only supports 64 invoc-per-subgroup GPUs, and misbehaves on adreno when invocations need to be split into 4 emulated subgroups.
* shader_recompiler: Add support for lowering geometry passthroughBilly Laws2023-01-052-40/+67
| | | | Reuses most of the existing code for generating the gl_Layer passthrough. Fixes geometry in Nier: Automata on GPUs without HW passthrough support.
* shader_recompiler: Align SSBO offsets to meet host requirementsBilly Laws2023-01-054-6/+11
| | | | We can take advantage of SSBO addresses being passed in a constant buffer to account for the extra alignment requirements in the shader itself.
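A minimal sketch of the alignment idea described in the commit above, not yuzu's actual code. It assumes the host requires a power-of-two offset alignment (16 bytes is used here purely as an example); `split_aligned` is a hypothetical helper name. The address is rounded down to an aligned base, and the leftover bytes are what a shader would have to add back when indexing.

```python
from typing import Tuple


def split_aligned(address: int, alignment: int) -> Tuple[int, int]:
    """Split an address into an aligned base and the leftover byte offset.

    Hypothetical helper: `alignment` must be a positive power of two.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    base = address & ~(alignment - 1)  # round down to the alignment boundary
    return base, address - base        # leftover is applied at access time


base, extra = split_aligned(0x1234, 16)
print(hex(base), extra)
```

The binding would use `base`, while `extra` is the part that has to be compensated for inside the shader.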
* shader_recompiler: SPIRV: Only enable int64 feature when supportedBilly Laws2023-01-051-1/+1
|
* shader_recompiler: Add comparison operators to descriptor typesBilly Laws2023-01-051-0/+12
|
* Vulkan: Add a workaround for input_position on Adreno driversBilly Laws2023-01-054-11/+41
| | | | Adreno drivers will crash compiling geometry shaders if the input position is not wrapped in a gl_in struct.
* Video_core: Address feedbackFernando Sahmkow2023-01-049-0/+39
|
* ShaderCompiler: Inline driver specific constants.Fernando Sahmkow2023-01-032-1/+34
|
* MacroHLE: Final cleanup and fixes.Fernando Sahmkow2023-01-011-2/+2
|
* MacroHLE: Add OpenGL SupportFernando Sahmkow2023-01-012-1/+13
|
* MacroHLE: Add HLE replacement for base vertex and base instance.Fernando Sahmkow2023-01-0112-6/+91
|
* Merge pull request #7450 from FernandoS27/ndc-vulkanliamwhite2022-12-173-3/+5
|\ | | | | Vulkan: Add support for VK_EXT_depth_clip_control.
| * Vulkan: Add support for VK_EXT_depth_clip_control.FernandoS272022-12-143-3/+5
| |
* | spirv_emit_context: declare GroupNonUniform capability for SubgroupLocalInvocationIdLiam2022-12-141-0/+2
|/
* Merge pull request #9300 from ameerj/pchliamwhite2022-12-033-1/+12
|\ | | | | CMake: Use precompiled headers to improve compile times
| * CMake: Consolidate common PCH headersameerj2022-12-011-7/+1
| |
| * CMake: Use precompiled headersameerj2022-11-302-0/+18
| |
| * value.h: remove recursive includeameerj2022-11-301-1/+0
| |
* | Merge pull request #9289 from liamwhite/fruit-companyliamwhite2022-12-036-3/+9
|\ \ | | | | | | general: fix compile for Apple Clang
| * | general: fix compile for Apple ClangLiam2022-11-236-3/+9
| |/
* | Merge pull request #9303 from liamwhite/new-vulkan-initMatías Locatti2022-12-023-19/+31
|\ \ | | | | | | Vulkan: update initialization
| * | Vulkan: update initializationLiam2022-11-273-19/+31
| |/ | | | | | | Co-authored-by: bylaws <bylaws@users.noreply.github.com>
* / shader_recompiler: add gl_Layer translation GS for older hardwareLiam2022-12-017-1/+165
|/
* spirv_emit_context: add missing flat decorationLiam2022-11-191-0/+1
|
* Merge pull request #9253 from vonchenplus/attr_layerliamwhite2022-11-195-0/+13
|\ | | | | shader: Implement miss attribute layer
| * shader: Implement miss attribute layerFengChen2022-11-175-0/+13
| |
* | Merge pull request #9167 from vonchenplus/tessliamwhite2022-11-1116-5/+60
|\ \ | | | | | | video_core: Fix few issues in Tess stage
| * | video_core: Fix few issues in Tess stageFengChen2022-11-0716-5/+60
| |/
* / ir/texture_pass: Use host_info instead of querying Settings::values (#9176)Morph2022-11-114-8/+13
|/
* video_core: Fix SNORM texture buffer emulating error (#9001)Feng Chen2022-11-0415-16/+115
|
* Merge pull request #8858 from vonchenplus/mipmapbunnei2022-11-0420-1/+163
|\ | | | | video_core: Generate mipmap texture by drawing
| * Merge branch 'master' into mipmapFeng Chen2022-09-201-5/+5
| |\
| * | video_core: Generate mipmap texture by drawingFengChen2022-09-2020-1/+163
| | |
* | | Revert "shader_recompiler/dead_code_elimination: Add DeadBranchElimination pass"Feng Chen2022-10-253-98/+9
| | |
* | | Merge pull request #8873 from vonchenplus/fix_legacy_location_errorbunnei2022-10-243-19/+33
|\ \ \ | | | | | | | | video_core: Fix legacy to generic location unpaired
| * | | Address feedbackFengChen2022-10-171-6/+6
| | | |
| * | | video_core: Fix legacy to generic location unpairedFengChen2022-09-203-15/+29
| | |/ | |/|
* | | CMakeLists: Disable C4100 and C4324Morph2022-10-222-8/+0
| | | | | | | | | | | | Disabling C4100 is similar to -Wno-unused-parameter
* | | CMakeLists: Remove redundant warningsMorph2022-10-221-2/+0
| | | | | | | | | | | | These warnings are already included in /W3.
* | | CMakeLists: Treat MSVC warnings as errorsMorph2022-10-221-1/+0
| | |
* | | general: Enforce C4800 everywhere except in video_coreMorph2022-10-221-1/+0
| | |
* | | CMakeLists: Remove all redundant warningsMorph2022-10-221-8/+2
| | | | | | | | | | | | These are already explicitly or implicitly set in src/CMakeLists.txt
* | | General: Fix compilation for GCCLiam White2022-10-061-1/+1
| | |
* | | Shader Decompiler: implement better tracking for Vulkan samplers.Fernando Sahmkow2022-10-061-9/+59
| | |
* | | Shader Decompiler: Check for shift when deriving composite samplers.Fernando Sahmkow2022-10-062-3/+35
| | |
* | | Shader Decompiler: Fix dangerous behavior of invalid iterator insertion.Fernando Sahmkow2022-10-061-3/+3
| | |
* | | shader_recompiler: add extended LDC to GLASM backendLiam2022-10-021-4/+21
| | |
* | | chore: fix some typosAndrea Pappacoda2022-09-232-2/+2
|/ / | | | | | | Fix some typos reported by Lintian
* / style: General style changes to match with the rest of the codebaseMorph2022-08-311-5/+5
|/
* video_code: support rectangle textureFengChen2022-08-2510-2/+44
|
* Add missed shader defines. Fixes Xenoblade Chronicles 3 booting with Vulkan.Kelebek12022-07-291-2/+3
|
* chore: make yuzu REUSE compliantAndrea Pappacoda2022-07-272-4/+5
| [REUSE] is a specification that aims at making file copyright information consistent, so that it can be both human and machine readable. It basically requires that all files have a header containing copyright and licensing information. When this isn't possible, like when dealing with binary assets, generated files or embedded third-party dependencies, it is permitted to insert copyright information in the `.reuse/dep5` file.
|
| Oh, and it also requires that all the licenses used in the project are present in the `LICENSES` folder, that's why the diff is so huge. This can be done automatically with `reuse download --all`.
|
| The `reuse` tool also contains a handy subcommand that analyzes the project and tells whether or not the project is (still) compliant, `reuse lint`.
|
| Following REUSE has a few advantages over the current approach:
|
| - Copyright information is easy to access for users / downstream
| - Files like `dist/license.md` do not need to exist anymore, as `.reuse/dep5` is used instead
| - `reuse lint` makes it easy to ensure that copyright information of files like binary assets / images is always accurate and up to date
|
| To add copyright information of files that didn't have it I looked up who committed what and when, for each file. As yuzu contributors do not have to sign a CLA or similar I couldn't assume that copyright ownership was of the "yuzu Emulator Project", so I used the name and/or email of the commit author instead.
|
| [REUSE]: https://reuse.software
|
| Follow-up to 01cf05bc75b1e47beb08937439f3ed9339e7b254
* Merge pull request #8383 from Morph1984/shadow-of-the-pastMai2022-06-151-3/+0
|\ | | | | yuzu: Make variable shadowing a compile-time error
| * CMakeLists: Make variable shadowing a compile-time errorMorph2022-06-141-3/+0
| | | | | | | | Now that the entire project is free of variable shadowing, we can enforce this as a compile time error to prevent any further introduction of this logic bug.
* | general: fix compilation on GCC 12Liam2022-06-141-1/+1
| |
* | structured_control_flow: Remove constexpr Flow::Blocklat9nq2022-06-141-6/+0
|/ | | | | | This seems to be unsupported in newer libstdc++ versions due to Flow::Block's base class being a non-literal type. It's not clear to me why this was permitted in earlier versions.
* general: Avoid ambiguous format_to compilation errorsLioncash2022-05-142-2/+2
| | | | | | | Ensures that we're using the fmt version of format_to. These are also the only three outliers. All of the other formatters we have are properly qualified.
* GCC 12 fixesLiam2022-04-281-1/+1
|
* general: Convert source file copyright comments over to SPDXMorph2022-04-23233-699/+466
| | | | | This formats all copyright comments according to SPDX formatting guidelines. Additionally, this resolves the remaining GPLv2 only licensed files by relicensing them to GPLv2.0-or-later.
* Merge pull request #8133 from liamwhite/gl-spv-cbufFernando S2022-04-076-25/+51
|\ | | | | shader_recompiler: support const buffer indirect addressing on OpenGL
| * shader_recompiler: Decrease indirect cbuf limit to match hardwareLiam2022-04-041-1/+1
| |
| * shader_compiler: support const buffer indirect addressing in GLSLLiam2022-04-014-9/+38
| |
| * shader_recompiler: support const buffer indirect addressing on OpenGL SPIR-VLiam2022-04-013-17/+14
| |
* | fix: typosAndrea Pappacoda2022-04-022-2/+2
|/
* Merge pull request #8095 from bylaws/masterMai M2022-03-273-0/+4
|\ | | | | shader_recompiler: Include <bit> header when std::count{r,l}_zero is used
| * Include <bit> header when std::count{r,l}_zero is usedBilly Laws2022-03-223-0/+4
| | | | | | | | Needed for compilation with older libc++ releases
* | dead_code_elimination_pass: Remove unreachable Phi argumentsameerj2022-03-233-0/+36
| |
* | shader_recompiler/dead_code_elimination: Add DeadBranchElimination passameerj2022-03-221-9/+62
|/ | | | This adds a pass to eliminate if(false) branches within the shader code
* Merge pull request #8038 from liamwhite/exit-register-detectionAmeer J2022-03-222-0/+9
|\ | | | | shader_recompiler/EXIT: increment output register on failed enable test
| * Address review commentsLiam2022-03-181-1/+1
| |
| * shader_recompiler/EXIT: skip render targets with no outputsLiam2022-03-182-0/+8
| |
| * shader_recompiler/EXIT: increment output register on failed enable testLiam2022-03-181-0/+1
| |
* | general: Fix clang/gcc build errorsameerj2022-03-201-0/+1
| |
* | shader_recompiler: Reduce unused includesameerj2022-03-2069-106/+7
|/
* Address review commentsLiam2022-03-174-52/+36
|
* shader_recompiler: Use functions for indirect const buffer accessesLiam2022-03-175-39/+94
|
* Address review commentsLiam2022-03-171-16/+15
|
* shader_recompiler: Implement LDC.IS address modeLiam2022-03-161-2/+12
|
* shader: add support for const buffer indirect addressingLiam2022-03-152-18/+68
|
* Merge pull request #8008 from ameerj/rescale-offsets-arrayFernando S2022-03-151-2/+27
|\ | | | | rescaling_pass: Fix rescaling Color2DArray ImageFetch offsets
| * rescaling_pass: Fix rescaling Color2DArray ImageFetch offsetsameerj2022-03-121-2/+27
| | | | | | | | | | | | ImageFetch offsets for 2D array coordinates have a different composite size than the coordinates. The rescaling pass was not taking this into account. Fixes broken shaders when scaling is enabled in Astral Chain, and likely other titles.
* | Shader decompiler: do constant propgation before texture pass.Fernando Sahmkow2022-03-131-2/+2
| |
* | Shader decompiler: Fix storage tracking in deko3d.Fernando Sahmkow2022-03-131-1/+2
| |
* | emit_spirv, vk_compute_pass: Resolve VS2022 compiler errorsameerj2022-03-121-1/+1
|/
* shader_recompiler/LOP3: Use brute force python results within switch/case.Markus Wick2022-03-082-52/+620
| Thanks to @asLody for optimizing this function. This raised the focus that this function should be optimized more.
|
| The current table assumes that the host GPU is able to invert for free, so only AND, OR, XOR are accumulated in the performance metric.
|
| Performance results:
|
| Instructions
| 0: 8
| 1: 30
| 2: 114
| 3: 80
| 4: 24
|
| Latency
| 0: 8
| 1: 30
| 2: 194
| 3: 24
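For readers unfamiliar with LOP3, here is a small reference evaluator in Python. It is a sketch, not the script used for the commit above. It assumes the usual convention for three-input logic ops in which the 8-bit table is indexed by the bits of `a`, `b` and `c` (in that order, `a` being the most significant), and that operands are 32 bits wide. The name `lop3` and the loop structure are illustrative only.

```python
MASK32 = 0xFFFFFFFF


def lop3(a: int, b: int, c: int, lut: int) -> int:
    """Evaluate a three-input bitwise op described by an 8-bit truth table."""
    result = 0
    for index in range(8):
        if not (lut >> index) & 1:
            continue  # this input combination produces a 0 bit
        # Bit 2 of the index selects a, bit 1 selects b, bit 0 selects c.
        term = MASK32
        term &= a if index & 4 else ~a
        term &= b if index & 2 else ~b
        term &= c if index & 1 else ~c
        result |= term
    return result & MASK32


a, b, c = 0xF0F0F0F0, 0xCCCCCCCC, 0xAAAAAAAA
print(hex(lop3(a, b, c, 0x96)))  # table 0x96 selects the odd-parity rows
```

A brute-force search like the one the commit mentions could compare candidate AND/OR/XOR expressions against such a reference for each of the 256 tables; how the original search was structured is not stated in the log.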
* emit_glsl_atomic: Implement 32x2 fallback atomic opsameerj2022-01-301-9/+55
|
* lower_int64_to_int32: Add 64-bit atomic fallbacksameerj2022-01-303-11/+76
|
* shaders: Add U64->U32x2 Atomic fallback functionsameerj2022-01-309-1/+469
|
* spirv_atomic: Define U32x2 storage buffers for 64-bit storage atomicsameerj2022-01-292-3/+3
| Some drivers do not support 64-bit atomics, and fall back to atomically modifying U32x2 vectors. This change ensures that U32x2 storage vectors are defined in the SPIR-V shader when 64-bit atomics are used.
|
| Fixes a hang on some devices, notably Intel GPUs, when booting Pokemon Legends Arceus
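The U32x2 representation mentioned above can be illustrated with plain arithmetic. The sketch below only shows how a 64-bit add maps onto a (low, high) pair of 32-bit words with a carry; it does not model atomicity at all, and the function names are hypothetical rather than taken from the recompiler.

```python
from typing import Tuple

MASK32 = 0xFFFFFFFF
U32x2 = Tuple[int, int]  # (low word, high word)


def to_u32x2(value: int) -> U32x2:
    """Split a 64-bit value into low and high 32-bit words."""
    return value & MASK32, (value >> 32) & MASK32


def from_u32x2(pair: U32x2) -> int:
    """Rebuild the 64-bit value from its two words."""
    return pair[0] | (pair[1] << 32)


def add_u32x2(lhs: U32x2, rhs: U32x2) -> U32x2:
    """Add two 64-bit values stored as word pairs, propagating the carry."""
    low = (lhs[0] + rhs[0]) & MASK32
    carry = 1 if low < lhs[0] else 0  # wrap-around means a carry occurred
    high = (lhs[1] + rhs[1] + carry) & MASK32
    return low, high


print(add_u32x2(to_u32x2(0xFFFFFFFF), to_u32x2(1)))
```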
* Merge pull request #7786 from ameerj/vmnmx-selMorph2022-01-291-12/+6
|\ | | | | video_minimum_maximum: Implement src operand selectors
| * video_minimum_maximum: Implement src operand selectorsameerj2022-01-271-12/+6
| | | | | | | | Used by Pokemon Legends: Arceus
* | emit_spirv: Add Xfb execution mode when transform feedback is usedameerj2022-01-281-3/+9
|/ | | | Fixes Transform Feedback on Vulkan AMD drivers.
* shader_recompiler: Remove unnecessary [[nodiscard]]Lioncash2022-01-251-2/+1
| | | | | Since ConvertLegacyToGeneric has a void return value, there's nothing that is actually returned by the function.
* shader_recompiler: fix potential OOB accessv19932022-01-172-6/+8
| | | | Found by static analysis with PVS-Studio. Original check wasn't actually checking for OOB and would segfault in case of it.
* logging/log.h: move enum class formatter to a separate file ...liushuyu2022-01-103-7/+7
| | | | ... to common/logging/formatter.h
* logging: adapt to changes in fmt 8.1liushuyu2022-01-082-6/+6
|
* glsl: Remove unreachable returnNarr the Reg2022-01-051-1/+0
|
* ShaderDecompiler: Add a debug option to dump the game's shaders.Fernando Sahmkow2022-01-041-0/+2
|
* Merge pull request #7629 from ameerj/nv-driver-fixesFernando S2022-01-0315-23/+125
|\ | | | | shaders: Add fixes for NVIDIA drivers 495+
| * glsl: Add boolean reference workaroundameerj2021-12-303-2/+8
| |
| * glsl_context_get_set: Add alternative cbuf type for broken driversameerj2021-12-303-17/+27
| | | | | | | | some drivers have a bug bitwise converting floating point cbuf values to uint variables. This adds a workaround for these drivers to make all cbufs uint and convert to floating point as needed.
| * emit_glsl_integer: Use negation work aroundameerj2021-12-301-2/+2
| |
| * shader: Add integer attribute get optimization passameerj2021-12-309-0/+86
| | | | | | | | Works around an nvidia driver bug, where casting the integer attributes to float and back to an integer always returned 0.
| * emit_glsl_floating_point: Fix FPNeg on newer Nvidia driversameerj2021-12-251-2/+2
| |
* | Merge pull request #7618 from goldenx86/patch-4bunnei2021-12-291-0/+9
|\ \ | | | | | | Increase boost requirement to 1.78.0
| * | Empty spacesMatías Locatti2021-12-281-1/+1
| | |
| * | Changes to avoid warnings in SSE4.2 optimized SPIR-VMatías Locatti2021-12-281-0/+9
| |/
* / emit_glasm_context_get_set: Fix GetAttribute return value type.ameerj2021-12-251-4/+4
|/ | | | GetAttribute expects an F32 result type at the IR level, this fixes the return value of attributes which were not returning an F32
* Address format clangvonchenplus2021-12-181-36/+36
|
* Remove spirv handle legacy related codevonchenplus2021-12-184-190/+1
|
* Remove glsl handle legacy related codevonchenplus2021-12-183-103/+1
|
* Merge branch 'yuzu-emu:master' into convert_legacyFeng Chen2021-12-1866-214/+286
|\
| * Merge pull request #7522 from ameerj/shader-recompiler-filenamesMai M2021-12-0865-214/+282
| |\ | | | | | | shader_recompiler/backend: Minor organization and refactoring to reduce compile time overhead
| | * emit_spirv: Reduce emit_spirv.h include overheadameerj2021-12-0620-3/+20
| | | | | | | | | | | | emit_spirv.h is included in video_core, which was propagating further includes that video_core did not depend on.
| | * glasm: Move implemented instructions from not_implemented.cppameerj2021-12-067-169/+220
| | |
| | * shader_recompiler: Adjust emit_context includesameerj2021-12-0637-37/+37
| | |
| | * shader_recompiler: Rename backend emit_context filesameerj2021-12-057-6/+6
| | |
| * | general: Add missing copyright noticesameerj2021-12-051-0/+4
| |/
* / Implement convert legacy to genericFeng Chen2021-11-194-1/+103
|/
* ShaderCache: Better fix for Shuffling gl_FragCoordFernando Sahmkow2021-11-161-2/+13
|
* Texture Cahe/Shader decompiler: Resize PointSize on rescaling, refactor and make reaper more agressive on 4Gb GPUs.FernandoS272021-11-161-0/+21
|
* vulkan: Fix rescaling push constant usageameerj2021-11-164-34/+36
|
* rescaling_pass: Fix IR errors when unscalable texture types are encounteredameerj2021-11-161-0/+28
|
* rescaling_pass: Logic simplification and minor style cleanupameerj2021-11-162-33/+17
|
* rescaling_pass: Scale ImageFetch offset if it existsameerj2021-11-161-59/+37
| | | | Plus some code deduplication
* rescaling_pass: Enable PatchImageQueryDimensions on fragment stagesameerj2021-11-161-5/+4
|
* gl_texture_cache/rescaling_pass: minor cleanupameerj2021-11-161-12/+8
|
* rescaling_pass: Fix and simplify shuffle/fragcoord passameerj2021-11-161-26/+20
|
* Shader: Don't rescale FragCoord if used by ShuffleFernando Sahmkow2021-11-162-2/+55
|
* shader, video_core: Fix GCC build errorsameerj2021-11-161-4/+0
|
* emit_spirv: Fix RescalingLayout alignmentameerj2021-11-161-0/+1
|
* RescalingPass: Agregate pixels on texelFetch while on Fragment ShaderFernando Sahmkow2021-11-161-3/+97
|
* shader: Fix TextureSize check on rescaling.Fernando Sahmkow2021-11-161-27/+21
|
* emit_spirv: Fix RescalingLayout alignmentameerj2021-11-161-2/+2
|
* shader: Properly scale image reads and add GL SPIR-V supportReinUsesLisp2021-11-1620-51/+171
| | | | Thanks for everything!
* shader: Properly blacklist and scale image loadsReinUsesLisp2021-11-161-3/+19
|
* glsl/glasm: Pass and use scaling parameters in shadersReinUsesLisp2021-11-166-7/+11
|
* gl_graphics_pipeline: Add downscale factor to shader uniformsameerj2021-11-163-4/+5
|
* spirv: Implement rescaling patchingReinUsesLisp2021-11-168-5/+86
|
* shader/rescaling_pass: Patch more instructionsReinUsesLisp2021-11-161-4/+101
|
* shader: Add IsTextureScaled opcodeReinUsesLisp2021-11-1610-0/+34
|
* shader: Add copy constructor to instructionsReinUsesLisp2021-11-164-1/+20
|
* shader: Add integer division opcodesReinUsesLisp2021-11-169-0/+37
|
* shader: Fix rescaling passReinUsesLisp2021-11-161-1/+1
|
* shader: Fix resolution scaling passReinUsesLisp2021-11-165-35/+32
|
* shader: Add resolution down factor opcodeReinUsesLisp2021-11-169-0/+25
|
* ShaderDecompiler: Add initial support for rescaling.Fernando Sahmkow2021-11-162-0/+73
|
* Merge pull request #7260 from vonchenplus/spirv_support_legacy_attribute_v2bunnei2021-11-143-71/+153
|\ | | | | shader: Spirv support legacy attribute v2
| * Simply legacy attribute implementFeng Chen2021-11-043-152/+125
| |
| * Support gl_FogFragCoord attributevonchenplus2021-10-313-48/+58
| |
| * Support gl_BackSecondaryColor attributevonchenplus2021-10-263-0/+33
| |
| * Support gl_FrontSecondaryColor attributevonchenplus2021-10-263-0/+33
| |
| * Support gl_BackColor attributevonchenplus2021-10-263-0/+33
| |
* | Merge pull request #7262 from FernandoS27/Buffalo-buffalo-Buffalo-buffalo-buffalobunnei2021-11-037-3/+68
|\ \ | | | | | | ShaderCache: Order Phi Arguments from farthest away to nearest.
| * | Shader Cahe: Fix Phi Nodes on GLASM.Fernando Sahmkow2021-11-021-1/+1
| | |
| * | ShaderCache: Fix Phi Nodes Type on OGL.Fernando Sahmkow2021-11-013-2/+30
| | |
| * | ShaderCache: Order Phi Arguments from farthest away to nearest.Fernando Sahmkow2021-10-315-0/+37
| |/
* | Merge pull request #7201 from ameerj/spirv-depth-samplingFernando S2021-10-301-5/+16
|\ \ | |/ |/| emit_spirv_image: Fix depth image implicit lod sample in non-fragment stages
| * emit_spirv_image: Fix depth image implicit lod sample in computeameerj2021-10-171-5/+16
| | | | | | | | Ensures all drivers behave the same way in this case.
* | TexturePass: Fix clamping of images as this allowed negative indices.Fernando Sahmkow2021-10-241-1/+1
|/
* Merge pull request #7077 from FernandoS27/face-downAmeer J2021-10-171-1/+2
|\ | | | | A series of fixes to queries and indexed samplers.
| * Shader Compiler: avoid overflowed indices on indixed samplers.Fernando Sahmkow2021-10-171-1/+2
| |
* | style: Remove extra space preceding the :: operatorMorph2021-09-291-2/+2
| |
* | general: Update style to clang-format-12ameerj2021-09-241-2/+4
|/
* Spir-V: Rescale the frag depth to 0,1 mode when -1,1 mode is used in Vulkan.Fernando Sahmkow2021-09-151-1/+7
|
* Merge pull request #6948 from ameerj/amd-warp-fixMorph2021-09-122-54/+109
|\ | | | | shaders: Fix warp instructions on 64-thread warp devices
| * emit_glsl_warp: Fix shuffle ops for 64-thread warp sizesameerj2021-08-311-24/+36
| |
| * emit_glsl_warp: Fix ballot related ops for 64-thread warp sizesameerj2021-08-311-24/+38
| |
| * emit_spirv_warp: Fix shuffle ops for 64-thread warp sizesameerj2021-08-311-1/+29
| |
| * emit_spirv_warp: Fix ballot related ops for 64-thread warp sizesameerj2021-08-311-10/+11
| |
* | Merge pull request #6962 from vonchenplus/spirv_support_legacy_attributebunnei2021-09-083-0/+107
|\ \ | | | | | | renderer_vulkan: Spirv support glsl legacy attribute
| * | Detail adjustmentFeng Chen2021-09-081-13/+14
| | |
| * | Detail adjustmentFeng Chen2021-09-082-28/+35
| | |
| * | Re-implement get unused locationFeng Chen2021-09-071-30/+30
| | |
| * | Move attribute related definitions to spirv anonymous namespaceFeng Chen2021-09-074-30/+26
| | |
| * | Dynamic get unused locationFeng Chen2021-09-061-27/+49
| | |
| * | Implement intput and output fixed fnc texturesFeng Chen2021-09-064-19/+25
| | |
| * | Rename parametersFeng Chen2021-09-035-14/+24
| | |
| * | Fix create GraphicsPipelines crashFeng Chen2021-09-031-5/+5
| | |
| * | Add input/output locationFeng Chen2021-09-021-5/+13
| | |
| * | Add colorfront and txtcoord supportFeng Chen2021-08-315-0/+57
| | |
* | | Merge pull request #6900 from ameerj/attr-reorderbunnei2021-09-024-10/+133
|\ \ \ | |_|/ |/| | structured_control_flow: Add DemoteCombinationPass
| * | structured_control_flow: Skip reordering nested demote branches.ameerj2021-08-301-0/+11
| | | | | | | | | | | | Nested demote branches add complexity with combining the condition if it has not been initialized yet. Skip them for the time being.
| * | structured_control_flow: Conditionally invoke demote reorder passameerj2021-08-304-10/+16
| | | | | | | | | | | | This is only needed on select drivers when a fragment shader discards/demotes.
| * | structured_control_flow: Add DemoteCombinationPassameerj2021-08-281-1/+107
| |/ | | | | | | | | Some drivers misread data when demotes are interleaved in the program. This moves demote branches to be checked at the end of the program. Fixes "wireframe" issue in Pokemon SwSh on some drivers
* / emit_spirv_context_get_set: Fix Get FrontFace return valueameerj2021-08-271-2/+3
|/ | | | The IR expects GetAttribute to return an F32 value. This case was returning a U32 instead.
* SPIR-V: Merge two ifs in EmitGetAttributeValeri2021-08-191-6/+2
|
* Merge pull request #6767 from ReinUsesLisp/fold-float-packMorph2021-07-301-0/+4
|\ | | | | shader: Fold UnpackFloat2x16 and PackFloat2x16
| * shader: Fold UnpackFloat2x16 and PackFloat2x16ReinUsesLisp2021-07-301-0/+4
| | | | | | | | | | Simplifies the code a bit when possible. These instructions should be no-ops codegen wise.
* | Merge pull request #6722 from ReinUsesLisp/xmad-optsbunnei2021-07-302-14/+195
|\ \ | |/ |/| shader: Fold integer FMA from Nvidia's pattern
| * shader: Fold integer FMA from Nvidia's patternReinUsesLisp2021-07-261-0/+175
| | Fold shaders doing "a * b + c" on integers from the pattern generated by Nvidia's GL compiler.
| |
| | On a somewhat complex compute shader it reduces the code size by 16 instructions from 2 matches on Turing GPUs.
| |
| | On Intel as extracted from KHR_pipeline_executable_properties:
| |
| | Before the optimization:
| | ```
| | Instruction Count: 2057
| | Basic Block Count: 45
| | Scratch Memory Size: 14752
| | Spill Count: 232
| | Fill Count: 261
| | SEND Count: 610
| | Cycle Count: 11325
| | ```
| |
| | After the optimization:
| | ```
| | Instruction Count: 2046
| | Basic Block Count: 44
| | Scratch Memory Size: 13728
| | Spill Count: 219
| | Fill Count: 268
| | SEND Count: 604
| | Cycle Count: 11367
| | ```
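The log does not spell out the pattern being folded. As a hedged illustration, the sketch below assumes it is a 32-bit multiply assembled from 16-bit partial products (the neighbouring commits mention XMAD multiply folding), and checks that such a sequence followed by an add equals a single wrapping "a * b + c". All names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def mul32_from_16bit_parts(a: int, b: int) -> int:
    """32-bit wrapping multiply built from 16x16-bit partial products."""
    a_lo, a_hi = a & 0xFFFF, (a >> 16) & 0xFFFF
    b_lo, b_hi = b & 0xFFFF, (b >> 16) & 0xFFFF
    low = a_lo * b_lo
    # Only the low 16 bits of the cross terms survive the shift by 16;
    # the a_hi * b_hi term is shifted out of a 32-bit result entirely.
    cross = (a_lo * b_hi + a_hi * b_lo) & 0xFFFF
    return (low + (cross << 16)) & MASK32


def fma32(a: int, b: int, c: int) -> int:
    """The folded form: one wrapping integer multiply-add."""
    return (a * b + c) & MASK32


a, b, c = 0x12345678, 0x9ABCDEF0, 0x0000BEEF
print((mul32_from_16bit_parts(a, b) + c) & MASK32 == fma32(a, b, c))
```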
| * shader: Use TryInstRecursive on XMAD multiply foldingReinUsesLisp2021-07-261-14/+12
| | | | | | | | Simplify a bit the logic.
| * shader: Add TryInstRecursive utility to valuesReinUsesLisp2021-07-261-0/+8
| |
* | shader: Mark ConvertF16F32 and ConvertF32F16 as fp16 instructionsReinUsesLisp2021-07-281-0/+2
| | | | | | | | | | | | Fixes instances where fp16 types are not declared on SPIR-V but they are used. This shouldn't happen on master, as it's been uncovered by an additional optimization pass.
* | exception: Make constructors explicitLioncash2021-07-271-4/+4
| | | | | | | | Ensures that exception construction is always explicit.
* | exception: Make what() member function nodiscardLioncash2021-07-271-1/+1
| |
* | exception: Narrow down specific headerLioncash2021-07-271-1/+1
| | | | | | | | | | We can use the <exception> header instead of pulling in all of the exception-style classes.
* | Merge pull request #6724 from lioncash/nodisc-shaderRodrigo Locatti2021-07-262-4/+4
|\ \ | | | | | | shader_recompiler: Remove unnecessary [[nodiscard]] instances
| * | shader_recompiler: Remove unnecessary [[nodiscard]] instancesLioncash2021-07-262-4/+4
| |/ | | | | | | | | [[nodiscard]] doesn't do anything on functions with a void return type and causes superfluous warnings.
* | Merge pull request #6726 from lioncash/hguardRodrigo Locatti2021-07-261-0/+2
|\ \ | | | | | | emit_spirv_instructions: Add missing header guard
| * | emit_spirv_instructions: Add missing header guardLioncash2021-07-261-0/+2
| |/
* | Merge pull request #6727 from lioncash/topologyRodrigo Locatti2021-07-261-1/+1
|\ \ | | | | | | emit_glasm: Fix LINESS_ADJACENCY typo in InputPrimitive()
| * | emit_glasm: Fix LINESS_ADJACENCY typo in InputPrimitive()Lioncash2021-07-261-1/+1
| |/ | | | | | | This should be LINES_ADJACENCY
* | Merge pull request #6723 from lioncash/shaderRodrigo Locatti2021-07-261-0/+1
|\ \ | | | | | | object_pool: Add missing return in Chunk move assignment operator
| * | object_pool: Add missing return in Chunk move assignment operatorLioncash2021-07-261-0/+1
| |/ | | | | | | Prevents undefined behavior from occurring.
* / control_flow: Fix duplicate switch case in OpcodeTokenLioncash2021-07-261-1/+1
|/ | | | This previously duplicated the case of the PBK case above it.
* shader: Support out of bound local memory reads and immediate writesReinUsesLisp2021-07-231-4/+21
| Support ignoring immediate out of bound writes. Writing dynamically out of bounds is not yet supported (e.g. R0+0x4).
|
| Reading out of bounds yields zero. This is supported by checking for the size from the IR; if the input is immediate, the optimization passes will drop it.
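The behaviour described in this commit can be modelled on a plain Python list. This is a sketch of the semantics only: the real backends emit shader code, and the commit states that dynamically out-of-bounds writes are not yet handled, whereas this toy model ignores every out-of-bounds write for simplicity. `LocalMemory` is a hypothetical name.

```python
class LocalMemory:
    """Toy model of bounds-checked local memory, addressed in 32-bit words."""

    def __init__(self, size_words: int) -> None:
        self.words = [0] * size_words

    def read(self, index: int) -> int:
        # Reading out of bounds yields zero instead of faulting.
        if 0 <= index < len(self.words):
            return self.words[index]
        return 0

    def write(self, index: int, value: int) -> None:
        # Out-of-bounds writes are silently dropped.
        if 0 <= index < len(self.words):
            self.words[index] = value & 0xFFFFFFFF


memory = LocalMemory(4)
memory.write(1, 0x1234)
memory.write(9, 0xFFFF)  # ignored
print(memory.read(1), memory.read(9))
```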
* shader: Fix disabled attribute default valuesameerj2021-07-231-1/+1
|
* glsl: Simplify FCMP emissionameerj2021-07-231-6/+4
|
* glsl: Update TessellationControl gl_inameerj2021-07-231-0/+28
| | | | Adheres to GL_ARB_separate_shader_objects requirements
* shader: Implement ISETP.Xameerj2021-07-234-44/+57
|
* shader: Avoid usage of C++20 ranges to build in clangReinUsesLisp2021-07-2311-39/+47
|
* glsl: Clamp shared mem size to GL_MAX_COMPUTE_SHARED_MEMORY_SIZEameerj2021-07-232-2/+11
|
* shader_recompiler, video_core: Resolve clang errorslat9nq2021-07-2313-41/+39
| | | | | | | | | | Silences the following warnings-turned-errors: -Wsign-conversion -Wunused-private-field -Wbraced-scalar-init -Wunused-variable And some other errors
* shader: Manually convert from array<u32> to bitset instead of using bit_castReinUsesLisp2021-07-231-2/+3
|
* glsl: Fix tracking of info.uses_shadow_lodameerj2021-07-231-4/+4
|
* shader: Ignore global memory ops on devices lacking int64 supportameerj2021-07-236-30/+77
|
* dual_vertex_pass: Clang formatameerj2021-07-231-14/+14
|
* emit_spirv: Workaround VK_KHR_shader_float_controls on fp16 NvidiaReinUsesLisp2021-07-232-5/+10
| | | | Fix regression on Fire Emblem: Three Houses when using native fp16.
* shader: GCC fmt 8.0.0 fixeslat9nq2021-07-237-16/+19
|
* shader: Account for 33-bit IADD3 scenarioameerj2021-07-231-2/+10
|
* shader: Only apply shift on register mode for IADD3ReinUsesLisp2021-07-231-10/+14
|
* shader: Fix disabled and unwritten attributes and varyingsReinUsesLisp2021-07-232-3/+11
|
* glsl: Fix shared and local memory declarationsameerj2021-07-231-3/+3
| | | | account for the fact that program.*memory_size is in units of bytes.
* opengl: Implement LOP.CCameerj2021-07-232-6/+38
| | | | Used by MH:Rise
* spirv: Fix code emission when descriptor aliasing is unsupportedReinUsesLisp2021-07-231-1/+2
| | | | Fixes OpenGL.
* glsl: Declare local memory in mainameerj2021-07-231-3/+3
|
* glsl: Add passthrough geometry shader supportameerj2021-07-233-7/+27
|
* shader: Use std::bit_cast instead of Common::BitCast for passthroughReinUsesLisp2021-07-231-2/+3
|
* glasm: Add passthrough geometry shader supportReinUsesLisp2021-07-232-7/+26
|
* shader: Rework varyings and implement passthrough geometry shadersReinUsesLisp2021-07-2322-316/+302
| | | | | | Put all varyings into a single std::bitset with helpers to access it. Implement passthrough geometry shaders using host's.
* shader: Only verify shader when graphics debugging is enabledReinUsesLisp2021-07-231-2/+7
|
* shader: Unify shader stage typesReinUsesLisp2021-07-231-2/+9
|
* lower_int64_to_int32: Add missing includelat9nq2021-07-231-0/+1
|
* shader: Emulate 64-bit integers when not supportedReinUsesLisp2021-07-231-0/+3
| | | | Useful for mobile and Intel Xe devices.
* shader: Add int64 to int32 lowering passReinUsesLisp2021-07-233-0/+218
|
* shader: Teach global memory base tracker to follow vectorsReinUsesLisp2021-07-231-15/+14
|
* shader: Add constant propagation to integer vectorsReinUsesLisp2021-07-231-0/+9
|
* glsl: Better IAdd Overflow CC fixameerj2021-07-232-11/+13
| | | | This ensures the original operand values are not overwritten when being used in the overflow detection.
* shader: Remove IAbs64ReinUsesLisp2021-07-239-26/+3
|
* glsl: Fix IADD CCameerj2021-07-232-5/+7
|
* shader_recompiler: Fix IADD3 input partitioningameerj2021-07-231-14/+13
|
* shader: Move loop safety tests to code emissionReinUsesLisp2021-07-2316-108/+54
|
* glsl: Remove frag color initializationameerj2021-07-231-9/+0
|
* glasm: Implement SetAttribute ViewportMaskameerj2021-07-232-1/+10
|
* emit_glsl_special: Skip initialization of frag_color0ameerj2021-07-231-1/+1
| | | | Fixes rendering in Devil May Cry without regressing Ori and the Blind Forest.
* shader: Calibrate loop safety thresholdReinUsesLisp2021-07-231-1/+1
|
* glsl: Add missing ; in EmitSetSampleMaskMorph2021-07-231-1/+1
| | | | Fixes shader compilation in Okami HD
* glsl: Fix output varying initialization when transform feedback is usedameerj2021-07-231-3/+37
|
* texture_pass: Fix is_read image qualificationameerj2021-07-231-1/+1
| | | | Atomic operations are considered to have both read and write access. This was not being accounted for.
* shader: Align constant buffer sizes to 16 bytesReinUsesLisp2021-07-231-1/+2
| | | | WAR for AMD reading zeroes on uniform buffers of size 2.
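The rounding this workaround describes can be illustrated with a minimal sketch. `AlignUp` is a hypothetical helper written for this example, not the function the commit touches, and it assumes the alignment is a non-zero power of two.

```cpp
#include <cstdint>

// Hypothetical helper: round `size` up to the next multiple of `alignment`.
// Assumes `alignment` is a non-zero power of two.
constexpr std::uint32_t AlignUp(std::uint32_t size, std::uint32_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}

// A 2-byte uniform buffer is padded to a full 16 bytes.
static_assert(AlignUp(2, 16) == 16, "2 rounds up to 16");
// Sizes that are already aligned are left unchanged.
static_assert(AlignUp(16, 16) == 16, "16 stays 16");
static_assert(AlignUp(17, 16) == 32, "17 rounds up to 32");
```

Under this assumption, a uniform buffer of size 2 would be reported as 16 bytes, which is the case the commit message says AMD mishandles.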
* spirv: Properly handle devices without int8 and int16ReinUsesLisp2021-07-232-39/+67
|
* spirv: Handle small storage buffer loads on devices with no supportReinUsesLisp2021-07-232-6/+6
|
* glsl: Fix cbuf component indexing bug fallbackameerj2021-07-231-7/+6
|
* shader: Simplify MergeDualVertexProgramsReinUsesLisp2021-07-231-6/+4
|
* shader: Properly manage attributes not written from previous stagesReinUsesLisp2021-07-2310-36/+40
|
* glsl: Only declare fragment outputs on fragment shadersReinUsesLisp2021-07-231-4/+6
|
* shader: Split profile and runtime info headersReinUsesLisp2021-07-2311-76/+91
|
* shader: Add support for native 16-bit floatsReinUsesLisp2021-07-234-4/+26
|
* shader: Rename maxwell/program.h to translate_program.hReinUsesLisp2021-07-233-9/+4
|
* glsl: Obey need_declared_frag_colors to declare and initialize all frag_colorameerj2021-07-232-1/+10
| | | | Fixes the Ori and the Blind Forest title screen
* glsl: Address rest of feedbackameerj2021-07-237-21/+64
|
* glsl: Move gl_Position/generic attribute initialization to EmitPrologueameerj2021-07-232-14/+12
|
* glsl: Conditionally use fine/coarse derivatives based on device supportameerj2021-07-233-4/+28
|
* glsl: Cleanup/Address feedbackameerj2021-07-239-28/+22
|
* glsl: Add Shader_GLSL loggingameerj2021-07-233-28/+32
|
* glsl: Add LoopSafety instructionsameerj2021-07-232-0/+10
|
* glsl: Conditionally add EXT_texture_shadow_lodameerj2021-07-233-4/+15
|
* glsl: Add stubs for sparse queries and variable aoffi when not supportedameerj2021-07-234-13/+39
|
* glsl: Implement legacy varyingsameerj2021-07-236-8/+81
|
* glsl: Minor cleanupameerj2021-07-232-19/+15
|
* glsl: Fix Cbuf getters for F32 typeameerj2021-07-231-12/+15
|
* glsl: Add immediate index oob checking for Cbuf gettersameerj2021-07-231-0/+16
|
* glsl: Refactor GetCbuf functions to reduce code duplicationameerj2021-07-231-104/+66
|
* glsl: Address more feedback. Implement indexed texture readsameerj2021-07-235-111/+109
|
* glsl: Remove Signed Integer variablesameerj2021-07-238-43/+13
|
* glsl: Address Rodrigo's feedbackameerj2021-07-2313-75/+87
|
* glsl: Reorganize backend code, remove unneeded [[maybe_unused]]ameerj2021-07-2312-315/+251
|
* glsl: Implement SampleId and SetSampleMaskameerj2021-07-233-30/+35
| | | | plus some minor refactoring of implementations
* glsl: Add gl_PerVertex in for GSameerj2021-07-231-1/+2
|
* glsl: Use existing tracking for enabling EXT_shader_image_load_formattedameerj2021-07-231-15/+1
|
* glsl: Enable early fragment testsameerj2021-07-232-4/+7
|
* glsl: Implement more attribute getters and settersameerj2021-07-232-12/+60
|
* glsl: Implement fswzaddameerj2021-07-234-5/+44
| | | | and wip nv thread shuffle impl
* glsl: Implement indexed attribute loadsameerj2021-07-235-29/+64
|
* glsl: Conditionally add GL_ARB_sparse_texture2ameerj2021-07-231-2/+3
|
* glsl: Conditionally use GL_EXT_shader_image_load_formattedameerj2021-07-231-2/+18
| | | | Fix for SULD.D
* glsl: Remove output generic indexing for geometry stageameerj2021-07-231-5/+3
|
* glsl: Allow dynamic tracking of variable allocationameerj2021-07-233-21/+35
|
* glsl: Implement barriersameerj2021-07-233-13/+21
|
* glsl: Implement image atomics and set layerameerj2021-07-235-153/+202
| | | | along with some more cleanup/oversight fixes
* glsl: Fix image gather logicameerj2021-07-231-0/+4
|
* glsl: Add cbuf access workaround for devices with component indexing bugameerj2021-07-232-51/+112
|
* glsl: Use textureGrad fallback when EXT_texture_shadow_lod is unsupportedameerj2021-07-233-8/+41
|
* emit_glsl_image: Use immediate offsets when possibleameerj2021-07-231-12/+33
|
* glsl: Fix <32-bit SSBO writesameerj2021-07-234-50/+43
| | | | and more cleanup
* glsl: Cleanup and address feedbackameerj2021-07-2310-86/+69
|
* glsl: Refactor Global memory functionsameerj2021-07-232-71/+73
|
* glsl: Increase NUM_VARS that can be allocatedameerj2021-07-231-1/+1
| | | | needed for HW:AoC.
* glsl: Implement Load/WriteGlobalameerj2021-07-239-98/+185
| | | | along with some other misc changes and fixes
* glsl: Implement Imagesameerj2021-07-232-9/+74
|
* glsl: skip gl_ViewportIndex write if device does not support itameerj2021-07-234-8/+17
|
* glsl: Implement transform feedbackameerj2021-07-233-13/+63
|
* glsl: Yet another gl_ViewportIndex fix attemptameerj2021-07-231-3/+19
|
* glsl: Add gl_ViewportIndex out attributeameerj2021-07-231-1/+3
|
* emit_glsl_context_get_set: Remove unused functionlat9nq2021-07-231-4/+0
|
* glsl: Fix precise variable declarationameerj2021-07-233-24/+25
| | | | and add some more separation in the shader for better debuggability when dumped
* glsl: Implement tessellation shadersameerj2021-07-235-27/+146
|
* glsl: Implement ImageGradient and other texture function variantsameerj2021-07-232-32/+73
|
* glsl: Fix atomic SSBO offsetsameerj2021-07-234-67/+74
| | | | and implement misc getters
* glsl: Implement geometry shadersameerj2021-07-234-9/+62
|
* glsl: Use NotImplemented macro with function name outputameerj2021-07-2310-104/+103
|
* glsl: Implement gl_ViewportIndexameerj2021-07-233-5/+14
| | | | SSBU now working
* glsl: SHFL fix and prefer shift operations over divide in glsl shaderameerj2021-07-235-63/+64
|
* glsl: Implement precise fp variable allocationameerj2021-07-234-8/+67
|
* HACK glsl: Write defaults to unused generic attributesameerj2021-07-232-2/+11
|
* glsl: Fix ssbo indexing and name shadowing between shader stagesameerj2021-07-233-77/+101
|
* glsl: implement set clip distanceameerj2021-07-232-0/+15
| | | | and missed a diff in emit_glsl relating to var alloc ref counting
* glsl: Rework var alloc to not assign unused resultsameerj2021-07-239-49/+91
|
* glsl: Rework variable allocator to allow for variable reuseameerj2021-07-2314-353/+482
|
* glsl: Fix ATOM and implement ATOMSameerj2021-07-235-114/+136
|
* glsl: Use gl_SubGroupInvocationARBameerj2021-07-232-8/+7
|
* glsl: Implement VOTE for subgroup size potentially largerameerj2021-07-232-19/+36
|
* glsl: Implement VOTEameerj2021-07-234-50/+64
|
* glsl: Implement ST{LS}ameerj2021-07-236-69/+106
|
* glsl: Implement more instructions used by SMOameerj2021-07-231-3/+3
|
* glsl: Implement more instructions used by SMOameerj2021-07-235-10/+16
|
* glsl: Fix GetAttribute return valuesameerj2021-07-232-7/+9
| | | | fixes font rendering issues as these were used to index into the ssbos
* glsl: minor cleanupameerj2021-07-234-20/+19
|
* glsl: Fix and implement rest of cbuf accessameerj2021-07-231-7/+43
|
* glsl: Implement TXQ and other misc changesameerj2021-07-235-6/+36
|
* glsl: TLD4 implementationameerj2021-07-231-2/+89
|
* glsl: Implement TLD instructionameerj2021-07-231-1/+55
|
* glsl: Implement TEXSameerj2021-07-231-1/+29
|
* glsl: Cleanup texture functionsameerj2021-07-231-13/+11
|
* shader_recompiler: GCC fixeslat9nq2021-07-2314-3/+13
|
* glsl: Implement TEX depth functionsameerj2021-07-232-4/+46
|
* glsl: Implement TEX ImageSample functionsameerj2021-07-233-11/+71
|
* glsl: Rework Shuffle emit instructions to align with SPIR-Vameerj2021-07-231-19/+40
|
* glsl: Better Storage access and wip warpsameerj2021-07-238-62/+133
|
* glsl: Fix integer conversions, implement clamp CCameerj2021-07-232-27/+36
|
* glsl: Implement IADD CCameerj2021-07-232-2/+17
|
* glsl: SSBO access fixes and wip SampleExplicitLod implementation.ameerj2021-07-232-4/+19
|
* glsl: WIP var forward declarationameerj2021-07-236-49/+60
| | | | to fix Loop control flow.
* glsl: Fix bindings, add some CC opsameerj2021-07-238-57/+91
|
* glsl: remove unused headersameerj2021-07-2314-34/+10
|
* glsl: Implement derivatives and YDirectionameerj2021-07-238-81/+87
| | | | plus some other misc additions/changes
* glsl: Fix non-immediate buffer accessameerj2021-07-2312-72/+133
| | | | and many other misc implementations
* glsl: textures wipameerj2021-07-239-75/+139
|
* glsl: Implement some attribute getters and settersameerj2021-07-239-191/+337
|
* glsl: Track S32 atomicsameerj2021-07-233-6/+16
|
* glsl: Update phi node managementameerj2021-07-234-21/+53
|
* glsl: Fix floating point compare opsameerj2021-07-231-28/+28
| | | | Logic for ordered/unordered ops was wrong.
* glsl: Query GL Device for FP16 extension supportameerj2021-07-232-2/+9
|
* glsl: Simplify FP storage atomicsameerj2021-07-232-48/+28
|
* glsl: F16x2 storage atomicsameerj2021-07-237-58/+64
|
* glsl: Revert ssbo aliasing. Storage Atomics implameerj2021-07-235-75/+134
|
* glsl: implement phi nodesameerj2021-07-234-20/+54
|
* glsl: Wip storage atomic opsameerj2021-07-2310-327/+414
|
* glsl: Implement FCMPameerj2021-07-233-242/+185
|
* glsl: Add a more robust fp formatterameerj2021-07-234-9/+14
|
* glsl: More FP fixesameerj2021-07-232-9/+16
|
* glsl: FP function fixesameerj2021-07-237-17/+25
|
* glsl: More FP instructions/fixesameerj2021-07-235-28/+41
|
* glsl: Add many FP32/64 instructionsameerj2021-07-2312-765/+1011
|
* glsl: Implement more Integer opsameerj2021-07-233-119/+72
|
* glsl: Implement BF*ameerj2021-07-233-9/+10
|
* glsl: Implement a few Integer instructionsameerj2021-07-2310-260/+398
|
* glsl: Use std::string_view for Emit function args.ameerj2021-07-236-760/+838
|
* glsl: Pass IR::Inst& to Emit functionsameerj2021-07-236-171/+169
|
* glsl: INeg and IAdd negate testsameerj2021-07-233-94/+106
|
* glsl: Reusable typed variables. IADD32ameerj2021-07-236-203/+311
|
* glsl: Fix program linking and cbufameerj2021-07-232-3/+5
|
* glsl: Fix "reg" allocingameerj2021-07-2310-898/+938
| | | | based on glasm with some tweaks
* glsl: Initial backendameerj2021-07-2327-0/+3292
|
* spirv: Reduce log severity of mismatching denorm rulesReinUsesLisp2021-07-231-2/+2
|
* shader: Fix loop safety to SSA passReinUsesLisp2021-07-232-2/+4
|
* shader: Add loggingReinUsesLisp2021-07-2313-28/+30
|
* shader: Add shader loop safety check settingslat9nq2021-07-239-33/+130
| | | | Also add a setting to enable Nsight Aftermath.
* shader: Comment why the array component is not read in TMMLReinUsesLisp2021-07-231-0/+2
|
* tmml: Remove index component from coords vecameerj2021-07-231-4/+3
| | | | The LOD query functions exposed by the rendering APIs do not make use of texture array layer indexing.
* spirv/convert: Catch more signed operations oversightsameerj2021-07-231-5/+5
| | | | The sign bit on integers of size < 32 was not properly preserved in casts
* spirv/convert: Catch more broken signed operations on Nvidia OpenGLReinUsesLisp2021-07-231-0/+6
| | | | | BitCast U32 to S32 before converting to float on drivers with broken signed operations.
* shader_environment: Add shader_local_memory_crs_size to local memory sizeameerj2021-07-231-2/+2
| | | | Fixes DOOM 2016 missing local memory
* shader: Fix VertexA Shaders.FernandoS272021-07-233-14/+30
|
* shader: Add 2D and 3D variants to SUATOM and SUREDReinUsesLisp2021-07-231-0/+4
| | | | Used by Claybook.
* shader: Avoid CPU side undefined behavior on I2FReinUsesLisp2021-07-231-0/+2
|
* glasm: Use ARB_derivative_control conditionallyReinUsesLisp2021-07-233-7/+30
|
* buffer_cache: Reduce uniform buffer size from shader usageReinUsesLisp2021-07-232-3/+17
| | | | Increases performance significantly on certain titles.
* emit_glasm_context_get_set: Remove unused variablelat9nq2021-07-231-1/+0
|
* shader,glasm: Implement legacy texcoord loadsReinUsesLisp2021-07-233-54/+29
|
* glasm: Implement legacy varyingsReinUsesLisp2021-07-231-17/+56
|
* shader: Track legacy varyingsReinUsesLisp2021-07-232-17/+105
|
* shader: Add support for "negative" and unaligned offsetsReinUsesLisp2021-07-233-8/+13
| | | | | | | | | "Negative" offsets don't exist. They are shown as such due to a bug in nvdisasm. Unaligned offsets have been shown to read the aligned offset. For example, when reading a U32, if the offset is 6, the offset read will be 4.
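The aligned-read behaviour described above can be sketched as follows. `AlignedReadOffset` is a hypothetical name chosen for this example, and the sketch assumes the access size is a power of two; it is not the code the commit adds.

```cpp
#include <cstdint>

// Hypothetical helper: round `offset` down to a multiple of `access_size`.
// Assumes `access_size` is a non-zero power of two (4 for a U32 read).
constexpr std::uint32_t AlignedReadOffset(std::uint32_t offset, std::uint32_t access_size) {
    return offset & ~(access_size - 1);
}

// The example from the commit message: a U32 read at offset 6 reads offset 4.
static_assert(AlignedReadOffset(6, 4) == 4, "offset 6 reads from 4");
// Offsets that are already aligned are unchanged.
static_assert(AlignedReadOffset(8, 4) == 8, "offset 8 reads from 8");
```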
* shader: Implement ISCADD32IReinUsesLisp2021-07-231-17/+31
|
* spirv: Fix output generics with componentsReinUsesLisp2021-07-231-1/+1
|
* opengl: Declare fragment outputs even if they are not usedReinUsesLisp2021-07-234-10/+9
| | | | | | Fixes Ori and the Blind Forest's menu on GLASM. For some reason (probably high level optimizations) it is not sanitized on SPIR-V for OpenGL. Vulkan is unaffected by this change.
* shader: Always initialize up reference in structure control flowReinUsesLisp2021-07-231-31/+36
| | | | Fixes ubsan issue.
* shader: Fix ImageWrite indexingReinUsesLisp2021-07-231-1/+1
|
* spirv: Fix image and image buffer descriptor index usageReinUsesLisp2021-07-231-5/+7
|
* glasm: Fix immediate texture coordinateReinUsesLisp2021-07-231-0/+1
|
* shader: Clang-format secondary texturesReinUsesLisp2021-07-231-2/+2
|
* shader: Fix secondary texturesReinUsesLisp2021-07-231-2/+2
|
* shader: Fix TMML queriesReinUsesLisp2021-07-231-5/+9
|
* shader: Fix FSwizzleAdd folding when going through phi nodesReinUsesLisp2021-07-231-2/+2
|
* shader/exception: Fix compilation errors on gccReinUsesLisp2021-07-231-6/+6
|
* glasm: Reduce reg allocation leaks from an exception to a logReinUsesLisp2021-07-231-1/+1
|
* shader: Handle host exceptionsReinUsesLisp2021-07-234-13/+43
|
* glasm: Use integer lod for TXQReinUsesLisp2021-07-232-2/+2
|
* glasm: Fix global memory fallbacksReinUsesLisp2021-07-231-9/+10
|
* Revert "glasm: Skip phi moves on undefined instructions"ReinUsesLisp2021-07-232-16/+1
| | | | Causes regressions on Bowser's Fury.
* glasm: Remove unintentional '\n' on Undef32ReinUsesLisp2021-07-231-1/+1
|
* glasm: Use storage buffers instead of global memory when possibleReinUsesLisp2021-07-236-370/+383
|
* glasm: Implement Y directionReinUsesLisp2021-07-234-3/+9
|
* glasm: Skip phi moves on undefined instructionsReinUsesLisp2021-07-232-1/+16
|
* glasm: Implement undef instructionsReinUsesLisp2021-07-232-15/+15
|
* glasm: Fix global memory callbacksReinUsesLisp2021-07-231-5/+6
|
* video_core,shader: Clang-format fixesReinUsesLisp2021-07-232-2/+2
|
* glasm: Release phi node registers after they are no longer neededReinUsesLisp2021-07-232-38/+54
|
* glasm: Remove unintentionally committed fmt::printsReinUsesLisp2021-07-231-2/+0
|
* glasm: Fix INeg32 on negative immediatesReinUsesLisp2021-07-231-1/+5
|
* glasm: Remove unnecessary value typesReinUsesLisp2021-07-233-47/+6
|
* glasm: Throw when there are register leaksReinUsesLisp2021-07-232-0/+7
|
* glasm: Catch more register leaksReinUsesLisp2021-07-238-41/+114
| | | | | | | | | | | | | Add support for null registers. These are used when an instruction has no usages. This comes in handy when an instruction is only used for its CC value, with the caveat of having to invalidate all pseudo-instructions before defining the instruction itself in the register allocator. This commit changes this. Work around a bug on Nvidia's condition codes conditional execution using branches.
* glasm: Fix usage counting on phi nodesReinUsesLisp2021-07-233-8/+22
|
* glasm: Implement global memory fallbacksReinUsesLisp2021-07-232-50/+89
|
* glasm: Implement int64 add and subtractReinUsesLisp2021-07-232-8/+6
|
* emit_glasm_context_get_set: Remove unused variablelat9nq2021-07-231-1/+0
|
* glasm: Implement indirect attribute loadsReinUsesLisp2021-07-234-6/+65
|
* glasm: Implement image atomicsReinUsesLisp2021-07-233-166/+153
|
* glasm: Reorder unreachable image atomic instsReinUsesLisp2021-07-231-66/+66
| | | | Reorder them to the bottom of the file for readability.
* glasm: Implement gl_Layer storesReinUsesLisp2021-07-231-0/+7
|
* glasm: Implement SampleIdReinUsesLisp2021-07-232-3/+3
|
* glasm: Implement IsHelperInvocationReinUsesLisp2021-07-232-3/+3
|
* glasm: Fix EmitVertex's optimizationReinUsesLisp2021-07-231-1/+1
|
* gl_shader_cache,glasm: Conditionally use typeless image reads extensionReinUsesLisp2021-07-231-2/+4
|
* glasm: Implement forced early ZReinUsesLisp2021-07-231-2/+6
|
* glasm: Simplify patch readsReinUsesLisp2021-07-231-5/+2
|
* glasm: Fix output patch readsReinUsesLisp2021-07-232-13/+22
| | | | With this, Luigi's Mansion's sand renders properly.
* shader: Split profile and runtime information in separate structsReinUsesLisp2021-07-2311-71/+88
|
* emit_glasm_context_get_and_set.cpp: Add missing semicolonsameerj2021-07-231-2/+2
|
* glasm: Fix patch attribute declarationsReinUsesLisp2021-07-231-1/+1
|
* glasm: Implement FSWZADDameerj2021-07-233-4/+28
|
* glasm: Implement PrimitiveId attribute readReinUsesLisp2021-07-231-0/+3
|
* glasm: Implement clip distance storesReinUsesLisp2021-07-232-0/+15
|
* glasm: Fix tessellation input attributesReinUsesLisp2021-07-231-2/+5
|
* glasm: Add missing semicolon on tesscoord readingReinUsesLisp2021-07-231-1/+1
|
* glasm: Fix tessellation headersReinUsesLisp2021-07-231-2/+2
|
* glasm: Add tessellation shader declarationsReinUsesLisp2021-07-231-0/+35
|
* glasm: Implement TessellationEvaluationPointReinUsesLisp2021-07-231-0/+4
|
* glasm: Implement patch memoryReinUsesLisp2021-07-233-6/+51
|
* glasm: Fix InvocationId declarationReinUsesLisp2021-07-231-1/+1
|
* glasm: Implement InvocationIdReinUsesLisp2021-07-232-2/+5
|
* glasm: Optimize EmitVertex into EMITReinUsesLisp2021-07-231-1/+5
|
* glasm: Implement geometry shader attribute readsReinUsesLisp2021-07-232-4/+18
|
* glasm: Properly declare attributes on geometry programsReinUsesLisp2021-07-233-6/+14
|
* glasm: Declare geometry program headersReinUsesLisp2021-07-231-0/+35
|
* glasm: Fix potential aliasing bug on cube array samplesReinUsesLisp2021-07-232-35/+44
|
* glasm: Implement ImageWriteReinUsesLisp2021-07-231-4/+7
|
* glasm: Implement ImageReadReinUsesLisp2021-07-234-4/+56
|
* glasm: Implement EmitVertex and EndPrimitiveReinUsesLisp2021-07-232-4/+8
|
* glasm: Implement ImageGradientReinUsesLisp2021-07-232-7/+65
|
* glasm: Implement 64-bit shiftsReinUsesLisp2021-07-232-12/+14
|
* glasm: Implement barriersReinUsesLisp2021-07-231-3/+3
|
* glasm: Fix compute stage nameReinUsesLisp2021-07-231-1/+1
|
* glasm: Fix phi instruction typesReinUsesLisp2021-07-231-1/+1
|
* glasm: Implement PREC on relevant instructionsReinUsesLisp2021-07-231-6/+12
|
* glasm: Implement stores to gl_ViewportIndexReinUsesLisp2021-07-234-7/+29
|
* glasm: Implement gl_PointSize storesReinUsesLisp2021-07-231-0/+3
|
* glasm: Implement gl_PointCoordReinUsesLisp2021-07-231-0/+4
|
* glasm: Implement ImageQueryLodReinUsesLisp2021-07-231-3/+5
|
* glasm: Implement ImageFetchReinUsesLisp2021-07-234-13/+38
|
* glasm: Implement IADD.CCameerj2021-07-231-1/+26
|
* glasm: Implement BFE.CCReinUsesLisp2021-07-231-0/+8
|
* glasm: Implement SelectU1ReinUsesLisp2021-07-232-4/+5
|
* glasm: Implement gl_WorkGroupIDReinUsesLisp2021-07-232-3/+3
|
* glasm: Implement TXQ and improve texture info readsReinUsesLisp2021-07-232-50/+51
|
* glasm: Implement gl_FrontFacing attributeReinUsesLisp2021-07-231-0/+3
|
* glasm: Support textures used in more than one stageReinUsesLisp2021-07-233-4/+24
|
* glasm: Implement textureGather instructionsReinUsesLisp2021-07-232-15/+97
|
* glasm: Implement gl_FragDepth and gl_SampleMask storesReinUsesLisp2021-07-232-5/+5
|
* glasm: Do not alias ConditionRef for nowReinUsesLisp2021-07-232-3/+2
| | | | | Immediate condition refs were not handled correctly. Just move the value for now.
* shader: Read branch conditions from an instructionReinUsesLisp2021-07-2312-16/+36
| | | | Fixes the identity removal pass.
* glasm: Implement InstanceId and VertexIdReinUsesLisp2021-07-231-0/+6
|
* glasm: Add missing return value on move assignmentReinUsesLisp2021-07-231-0/+1
|
* glasm: Fix aliased bitcasts ref countingReinUsesLisp2021-07-233-13/+42
|
* glasm: Remove unintentional comma on vector insertReinUsesLisp2021-07-231-1/+1
|
* glasm: Implement TEX and TEXS instructionsReinUsesLisp2021-07-2310-69/+275
| | | | | Remove lod clamp from texture instructions with lod, as this is not needed (nor supported).
* glasm: Add support for non-2D texture samplesReinUsesLisp2021-07-231-4/+26
|
* glasm: Reorder unreachable image instructions to the bottomReinUsesLisp2021-07-231-97/+97
|
* glasm: Add support for texture offsetsReinUsesLisp2021-07-231-11/+15
|
* glasm: Improve texture sampling instructionsReinUsesLisp2021-07-232-50/+70
|
* emit_glasm: Enable ARB_draw_buffers when neededReinUsesLisp2021-07-232-1/+5
|
* emit_glasm: Add support for reading position attributesReinUsesLisp2021-07-231-3/+13
|
* shader_recompiler: GCC fixeslat9nq2021-07-237-58/+55
| | | | | Fixes members of unnamed union not being accessible, and one function without a declaration.
* glasm: Implement rest of shared memameerj2021-07-232-35/+29
|
* shader: Use a non-trivial dummy to construct ASL node unionReinUsesLisp2021-07-231-1/+6
|
* emit_spirv: Jump to loop body with local variableReinUsesLisp2021-07-231-1/+1
| | | | Silence unused variable warning
* glasm: Implement derivative instructions on GLASMReinUsesLisp2021-07-232-12/+12
|
* glasm: Initial (broken) implementation of TEX on GLASMReinUsesLisp2021-07-233-299/+386
|
* glasm: Implement some graphics instructions on GLASMReinUsesLisp2021-07-232-6/+5
|
* glasm: Add Void type to GLASM valuesReinUsesLisp2021-07-233-0/+15
|
* glasm: Add graphics specific shader declarations to GLASMReinUsesLisp2021-07-232-6/+63
|
* glasm: Implement local memory for glasmameerj2021-07-234-9/+12
|
* emit_spirv: Add missing block in caseReinUsesLisp2021-07-231-1/+2
|
* glasm: Initial implementation of phi nodes on GLASMReinUsesLisp2021-07-2312-25/+117
|
* glasm: Write result to scalar on integer comparison instructionsReinUsesLisp2021-07-231-10/+10
|
* glasm: Declare NV_shader_thread_group when neededReinUsesLisp2021-07-231-3/+4
|
* glasm: Rework control flow introducing a syntax listReinUsesLisp2021-07-2333-505/+437
| | | | | This commit regresses VertexA shaders, their transformation pass has to be adapted to the new control flow.
* glasm: Implement Storage atomicsameerj2021-07-235-109/+156
| | | | | StorageAtomicExchangeU64 is failing its test, seemingly due to a failure storing the 64-bit result into the register
* glasm: Ensure reg alloc order across compilers on GLASMReinUsesLisp2021-07-231-11/+14
| | | | | | | | | | | | | | | | | | Use a struct constructor to serialize register allocation arguments to ensure registers are allocated in the same order regardless of the compiler used. The A and B functions can be called in any order when passed as arguments to "foo": foo(A(), B()) But the order is guaranteed for curly-braced constructor calls in classes: Foo{A(), B()} Use this to get consistent behavior.
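The ordering guarantee this commit relies on can be shown with a small sketch. `RegAlloc`, `Operands`, and `AllocateInOrder` are hypothetical names invented for illustration; they are not the types used by the GLASM backend.

```cpp
// Hypothetical register allocator: hands out increasing register indices.
class RegAlloc {
public:
    constexpr int Alloc() { return next_++; }

private:
    int next_ = 0;
};

// Hypothetical aggregate holding two allocated registers.
struct Operands {
    int a;
    int b;
};

constexpr Operands AllocateInOrder() {
    RegAlloc regs;
    // Initializer clauses in a braced list are evaluated left to right,
    // so `a` always receives the first register and `b` the second.
    return Operands{regs.Alloc(), regs.Alloc()};
}

static_assert(AllocateInOrder().a == 0, "first clause is evaluated first");
static_assert(AllocateInOrder().b == 1, "second clause is evaluated second");
```

By contrast, in a plain call such as `foo(regs.Alloc(), regs.Alloc())` the two arguments may be evaluated in either order, so different compilers could assign the registers differently.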
* glasm: Enable unintentionally disabled register aliasing on GLASMReinUsesLisp2021-07-231-16/+11
|
* glasm: Review all GLASM insts to be aware of register aliasingReinUsesLisp2021-07-234-20/+51
|
* glasm: Implement shuffle and vote instructions on GLASMReinUsesLisp2021-07-2310-100/+166
|
* glasm: Add MUFU instructions to GLASMReinUsesLisp2021-07-232-21/+22
|
* glasm: Implement IAbs64 and INeg64 on GLASMReinUsesLisp2021-07-232-6/+6
|
* shader: Add floating-point rounding to I2FReinUsesLisp2021-07-233-35/+42
|
* glasm: Properly clamp Fp64 on GLASMReinUsesLisp2021-07-231-6/+6
|
* glasm: Fix register allocation when moving immediate on GLASMReinUsesLisp2021-07-233-42/+89
|
* glasm: Implement SelectU64 on GLASMReinUsesLisp2021-07-232-4/+20
|
* glasm: Fix clamps so the min value has priority on NAN on GLASMReinUsesLisp2021-07-231-12/+15
|
* glasm: Fix moving U64 immediates to registers in GLASMReinUsesLisp2021-07-232-3/+4
|
* glasm: Implement storage atomic opsameerj2021-07-234-305/+358
|
* glasm: Add conversion instructions to GLASMReinUsesLisp2021-07-239-282/+351
|
* glasm: Add fp min/max insts and fix store for fp64 on GLASMReinUsesLisp2021-07-232-10/+8
|
* glasm: Add logical instructions on GLASMReinUsesLisp2021-07-232-12/+12
|
* glasm: Remove duplicated Fp64 pack instructions on GLASMReinUsesLisp2021-07-231-8/+0
|
* glasm: Remove unnecessary new white space on Clamp GLASMReinUsesLisp2021-07-231-4/+4
|
* glasm: Add floating-point comparisons on GLASMReinUsesLisp2021-07-233-120/+116
|
* emit_glasm: Implement more integer alu opsameerj2021-07-232-47/+41
|
* glasm: Reimplement bitwise ops and BFI/BFEameerj2021-07-234-88/+108
|
* glasm: Initial GLASM fp64 supportReinUsesLisp2021-07-239-55/+152
|
* glasm: Implement GLASM fp16 packing and move bitwise insnsReinUsesLisp2021-07-234-66/+77
|
* glasm: Remove unused functions left from rebaseReinUsesLisp2021-07-231-12/+0
|
* glasm: Specify namespace when using FormatToReinUsesLisp2021-07-231-6/+6
|
* glasm: Implement more GLASM composite instructionsReinUsesLisp2021-07-232-54/+63
|
* glasm: Make GLASM aware of typesReinUsesLisp2021-07-2312-1244/+1380
|
* glasm: Use CMP.S for Select32ameerj2021-07-233-12/+8
| | | | also fixes ADD and SUB to use U modifier
* glasm: Implement more logical opsameerj2021-07-232-5/+5
|
* glasm: Implement BFI, BFEameerj2021-07-234-138/+164
| | | | Along with implementations of common instructions along the way
* glasm: Use BitField instead of C bitfieldsReinUsesLisp2021-07-232-8/+12
|
* glasm: Remove unused argument in identity instructions on GLASMReinUsesLisp2021-07-231-7/+7
|
* glasm: Implement basic GLASM instructionsReinUsesLisp2021-07-2310-840/+1173
|
* glasm: Changes to GLASM register allocator and emit contextReinUsesLisp2021-07-234-26/+64
|
* glasm: Add GLASM backend infrastructureReinUsesLisp2021-07-2328-4/+3115
|
* shader: ISET.X implementationameerj2021-07-231-8/+58
|
* shader: Fixup SPIR-V emit header namespacesReinUsesLisp2021-07-231-2/+2
|
* Move SPIR-V emission functions to their own headerReinUsesLisp2021-07-2324-572/+631
|
* shader: Optimize NVN FallthroughFernandoS272021-07-234-9/+83
|
* shader: Stub SR_AFFINITYFernandoS272021-07-231-0/+3
|
* shader: Implement Int32 SUATOM/SUREDameerj2021-07-2317-6/+733
|
* shader: Initial OpenGL implementationReinUsesLisp2021-07-233-0/+12
|
* spirv: Be aware of NAN unaware driversReinUsesLisp2021-07-231-18/+40
|
* spirv: Add SSBO read fallbacks when no aliasing is availableReinUsesLisp2021-07-231-37/+99
|
* spirv: Add OpKill fallback to demoteReinUsesLisp2021-07-231-2/+6
|
* spirv: Do not enable ShaderLayerReinUsesLisp2021-07-231-3/+0
| | | | This is enabled by an extension instead of the capability.
* spirv: Enable DemoteToHelperInvocationEXT only when supportedReinUsesLisp2021-07-231-1/+1
|
* spirv: Use OriginLowerLeft when requestedReinUsesLisp2021-07-231-1/+5
|
* spirv: Only add image operands mask when neededReinUsesLisp2021-07-231-5/+9
|
* spirv: Workaround image unsigned offset bugReinUsesLisp2021-07-232-9/+26
| | | | | Work around a bug in Nvidia's OpenGL SPIR-V compiler when using unsigned texture offsets.
* spirv: Add int8 and int16 capabilities only when supportedReinUsesLisp2021-07-231-2/+2
|
* spirv: Add integer clamping workaroundsReinUsesLisp2021-07-231-4/+34
| | | | Work around more bugs in Nvidia's OpenGL SPIR-V compiler.
* spirv: Implement int8 and int16 conversion fallbacksReinUsesLisp2021-07-231-19/+80
|
* spirv: Support OpenGL uniform buffers and change bindingsReinUsesLisp2021-07-235-56/+163
|
* spirv: Disambiguate descriptor namesReinUsesLisp2021-07-231-9/+37
| | | | | Works around a bug in Nvidia's OpenGL SPIR-V compiler where names are used for name matching.
* shader: Add OpenGL shader profile optionsReinUsesLisp2021-07-231-0/+11
|
* shader: Remove shader utilReinUsesLisp2021-07-234-176/+0
|
* shader: Address feedbackFernandoS272021-07-234-35/+33
|
* shader: Implement VertexA stageFernandoS272021-07-2311-0/+166
|
* shader: Implement delegation of Exit to dispatcher on CFGFernandoS272021-07-232-3/+47
|
* shader: Fix IADD3.CCameerj2021-07-231-12/+5
|
* shader: Fix BFE s32 undefined checkameerj2021-07-231-1/+1
| | | | Our unit tests were hitting this exception.
* shader: Fix error checking in bitfieldExtract and implement bitfieldInsert foldingReinUsesLisp2021-07-231-5/+14
|
* shader: Fix storage type when reading patches on tess controlReinUsesLisp2021-07-231-1/+2
|
* shader: Fix VMNMX selector BReinUsesLisp2021-07-231-1/+2
|
* shader: Increase the maximum number of storage buffersReinUsesLisp2021-07-231-1/+1
| | | | | Compute shaders spill uniform buffers on storage buffers, increasing the expected number.
* shader: Remove identity removal pass for better build timesReinUsesLisp2021-07-231-1/+0
|
* shader: Add more strict validation to the passReinUsesLisp2021-07-231-0/+42
|
* shader: Fix forward referencing identity instructions when inserting phiReinUsesLisp2021-07-231-11/+13
|
* shader: Remove invalidated blocks in dead code elimination passReinUsesLisp2021-07-231-3/+6
|
* shader: Add missing UndoUse case for GetSparseFromOpReinUsesLisp2021-07-231-0/+4
|
* shader: Simplify code in opcodes.h to fix IntellisenseReinUsesLisp2021-07-231-8/+6
| | | | | | | | Avoid using std::array to fix Intellisense not properly compiling this code and disabling itself on all files that include it. While we are at it, change the code to use u8 instead of size_t for the number of instructions in an opcode.
* shader: Implement indexed texturesReinUsesLisp2021-07-237-93/+189
|
* shader: Refactor atomic_operations_global_memoryameerj2021-07-231-44/+36
|
* shader: add missing include guard in half_floating_point_helper.hameerj2021-07-231-0/+2
|
* shader: Fix gcc warningsReinUsesLisp2021-07-232-2/+2
|
* shader: Inline common Value gettersReinUsesLisp2021-07-232-109/+102
|
* shader: Intrusively store in a block if it's sealed or notReinUsesLisp2021-07-232-3/+11
|
* cmake: Link to common in shader_recompilerReinUsesLisp2021-07-231-1/+1
|
* shader: Improve goto removal algorithm complexityReinUsesLisp2021-07-231-49/+28
| | | | | Find sibling node containing a nephew searching from the nephew itself instead of the uncle.
* shader: Use memset to reset instruction argumentsReinUsesLisp2021-07-232-4/+7
|
* shader: Inline common Value functions into the headerReinUsesLisp2021-07-232-19/+23
|
* shader: Move microinstruction header to the value headerReinUsesLisp2021-07-2319-180/+161
|
* shader: Move siblings check to a separate function and comment them outReinUsesLisp2021-07-231-16/+21
|
* shader: Intrusively store register values in block for SSA passReinUsesLisp2021-07-232-21/+53
|
* shader: Inline common Opcode and Inst functionsReinUsesLisp2021-07-234-112/+83
|
* shader: Inline common IR::Block methodsReinUsesLisp2021-07-232-17/+12
|
* shader: Use a small_vector for phi blocksReinUsesLisp2021-07-231-1/+2
|
* shader: Calculate number of arguments in an opcode at compile timeReinUsesLisp2021-07-231-3/+12
|
* shader: Implement D3D samplersReinUsesLisp2021-07-233-12/+76
|
* shader: Add constant propagation for arithmetic right shiftsReinUsesLisp2021-07-231-0/+3
|
* shader: Simplify code for local memoryReinUsesLisp2021-07-231-6/+11
|
* shader: Add NVN storage buffer fallbacksReinUsesLisp2021-07-239-62/+214
| | | | | | | When we can't track the SSBO origin of a global memory instruction, leave it as a global memory operation and assume these pointers are in the NVN storage buffer slots, then apply a linear search in the shader's runtime.
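The linear search mentioned above can be pictured with a CPU-side sketch. In the commit the search runs inside the generated shader; the layout below (`SsboSlot`, `kSlots`, `FindSlot`, and the addresses) is entirely hypothetical and only illustrates the idea of matching a raw pointer against a fixed table of buffer descriptors.

```cpp
#include <cstdint>

// Hypothetical descriptor for one storage buffer slot.
struct SsboSlot {
    std::uint64_t base;  // start address of the buffer
    std::uint32_t size;  // size of the buffer in bytes
};

// Hypothetical table of slots; real addresses and sizes would come from the guest.
constexpr SsboSlot kSlots[] = {
    {0x1000, 0x100},
    {0x2000, 0x40},
    {0x8000, 0x200},
};

// Returns the index of the first slot containing `addr`, or -1 if none does.
constexpr int FindSlot(const SsboSlot* slots, int count, std::uint64_t addr) {
    for (int i = 0; i < count; ++i) {
        if (addr >= slots[i].base && addr - slots[i].base < slots[i].size) {
            return i;
        }
    }
    return -1;
}

static_assert(FindSlot(kSlots, 3, 0x1010) == 0, "address inside slot 0");
static_assert(FindSlot(kSlots, 3, 0x2040) == -1, "one past the end of slot 1");
static_assert(FindSlot(kSlots, 3, 0x81ff) == 2, "last byte of slot 2");
```

Once a slot is found, the access can be expressed as an offset (`addr - base`) into that storage buffer rather than as a raw global memory operation.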
* spirv: Fix ViewportMaskReinUsesLisp2021-07-231-1/+2
|
* spirv: Replace Constant/ConstantComposite with Const helperameerj2021-07-2312-112/+101
|
* shader: Address feedbackFernandoS272021-07-232-7/+10
|
* shader: Implement F2F (Imm)FernandoS272021-07-231-2/+28
|
* shader: Implement IADD3.CC/.XFernandoS272021-07-231-7/+22
|
* shader: Address feedbackFernandoS272021-07-234-7/+4
|
* shader: Add coarse derivativesFernandoS272021-07-237-8/+28
|
* shader: Implement fine derivatives constant propagationFernandoS272021-07-239-0/+101
|
* shader: Implement SR_Y_DIRECTIONFernandoS272021-07-237-0/+18
|
* shader: Fix Phi node typesReinUsesLisp2021-07-232-4/+4
|
* shader: Fix memory barriersReinUsesLisp2021-07-238-62/+30
|
* spirv: Fix implicit lod typeReinUsesLisp2021-07-232-1/+5
|
* spirv: Use explicit lods outside of fragment shadersReinUsesLisp2021-07-231-5/+16
|
* spirv: Use ConstOffset instead of Offset when possibleReinUsesLisp2021-07-233-21/+67
|
* shader: Implement BFE and BFI CCameerj2021-07-233-14/+17
| Fix two bugs in BFI.
* shader: Implement SampleMaskReinUsesLisp2021-07-2311-2/+22
|
* shader: Implement PIXLD.MY_INDEXReinUsesLisp2021-07-2313-4/+69
|
* spirv: Bitcast non-F32 output attributes to their type before storeReinUsesLisp2021-07-231-13/+28
|
* spirv: Implement ViewportMask with NV_viewport_array2ReinUsesLisp2021-07-237-0/+20
|
* spirv: Bitcast non-F32 attributes to F32ReinUsesLisp2021-07-231-7/+9
|
* shader: Implement PrimitiveIdReinUsesLisp2021-07-235-0/+10
|
* shader: Implement tessellation shaders, polygon mode and invocation idReinUsesLisp2021-07-2322-88/+555
|
* shader: Mark atomic instructions as writesReinUsesLisp2021-07-231-0/+27
|
* spirv: Implement image buffersReinUsesLisp2021-07-235-23/+86
|
* spirv: Implement Layer storesReinUsesLisp2021-07-236-9/+30
|
* spirv: Fix alpha testFernandoS272021-07-231-0/+5
|
* spirv: Fix non-atomic 64-bit storeameerj2021-07-231-1/+1
|
* spirv: Implement alpha testameerj2021-07-232-1/+59
|
* shader: Implement transform feedbacks and define file formatReinUsesLisp2021-07-238-16/+116
|
* shader: Implement early Z testsReinUsesLisp2021-07-232-0/+4
|
* shader: Document and relax cache control on surface instructionsReinUsesLisp2021-07-231-10/+11
|
* spirv: Rework storage buffers and shader memoryReinUsesLisp2021-07-238-499/+553
|
* shader: Fix fixed pipeline point size on geometry shadersReinUsesLisp2021-07-231-10/+18
|
* shader: Add constant propagation for *&^| binary operationsReinUsesLisp2021-07-231-0/+12
|
* shader: Implement geometry shadersReinUsesLisp2021-07-2312-84/+221
|
* shader: Implement OUTReinUsesLisp2021-07-2310-17/+73
|
* internal_stage_buffer_entry_read: Remove pragma optimize offlat9nq2021-07-231-2/+0
|
* shader: Stub SR_INVOCATION_INFOReinUsesLisp2021-07-231-2/+5
|
* shader: Stub ISBERDReinUsesLisp2021-07-233-4/+56
|
* shader: Fix CC in I2IReinUsesLisp2021-07-231-0/+2
|
* spirv: Define StorageImageWriteWithoutFormat capability when usedReinUsesLisp2021-07-233-0/+9
|
* shader: Simplify FLO and throw on CCReinUsesLisp2021-07-231-12/+13
|
* shader: Mark blocks with no end branch as unreachableReinUsesLisp2021-07-231-2/+7
|
* shader: Implement LOP CCReinUsesLisp2021-07-233-12/+29
|
* shader: Implement SR_THREAD_KILLReinUsesLisp2021-07-2310-0/+22
|
* shader: Apply sign bit in FCMP (imm)ReinUsesLisp2021-07-231-1/+1
|
* shader: Implement ATOM/S and REDameerj2021-07-2318-19/+1724
|
* spirv: Move phi node patching to a separate functionReinUsesLisp2021-07-231-13/+16
|
* spirv: Guard against typeless image reads on unsupported devicesReinUsesLisp2021-07-235-1/+16
|
* shader: Move LaneId to the warp emission file and fix AMDReinUsesLisp2021-07-235-7/+11
|
* spirv: Fix forward declarations on phi nodesReinUsesLisp2021-07-231-47/+25
|
* shader: Mark ImageWrite with side effectsReinUsesLisp2021-07-231-0/+3
|
* shader: Implement CC for ISET, FSET, PSET, CSET, and DSETFernandoS272021-07-2318-13/+136
| Throw when other instructions are missing CC.
* shader: Remove outdated comment in F2IReinUsesLisp2021-07-231-4/+0
|
* shader: Implement SULD and SUSTReinUsesLisp2021-07-2323-137/+597
|
* shader: Fix Windows build issuesReinUsesLisp2021-07-231-1/+1
|
* shader: Address feedback + clang formatlat9nq2021-07-2311-22/+20
|
* shader_recompiler,video_core: Cleanup some GCC and Clang errorslat9nq2021-07-2359-297/+289
| Mostly fixing unused *, implicit conversion, braced scalar init, fpermissive, and some others.
| Some Clang errors likely remain in video_core, and std::ranges is still a pertinent issue in shader_recompiler
| shader_recompiler: cmake: Force bracket depth to 1024 on Clang
| Increases the maximum fold expression depth
| thread_worker: Include condition_variable
| Don't use list initializers in control flow
| Co-authored-by: ReinUsesLisp <reinuseslisp@airmail.cc>
* shader: Fix FCMP immediate variantReinUsesLisp2021-07-231-1/+9
|
* shader: Fix dangling labelsReinUsesLisp2021-07-231-0/+5
|
* shader: Interact texture buffers with buffer cacheReinUsesLisp2021-07-233-29/+29
|
* shader: Fix F2IReinUsesLisp2021-07-231-1/+1
|
* shader: Fix TextureGradReinUsesLisp2021-07-231-1/+1
|
* shader: Implement texture buffersReinUsesLisp2021-07-236-23/+125
|
* shader: Address feedbackFernandoS272021-07-235-53/+54
|
* shader: Implement indexed Position and ClipDistancesFernandoS272021-07-233-11/+100
|
* shader: Implement indexed attributesFernandoS272021-07-2312-35/+279
|
* shader: Implement AL2PFernandoS272021-07-233-4/+36
|
* shader: Fix BRX trackingFernandoS272021-07-232-3/+4
|
* shader: Move recursive SSA rewrite to the heapReinUsesLisp2021-07-231-29/+89
|
* shader: Fix ShadowCube declaration type, set number of pipeline threads based on hardwareFernandoS272021-07-231-1/+1
|
* shader: Fix splits on blocks using indirect branchesReinUsesLisp2021-07-233-17/+38
|
* shader: Eliminate orphan blocks more efficientlyReinUsesLisp2021-07-231-7/+8
|
* shader: Add subgroup masksReinUsesLisp2021-07-2310-45/+169
|
* shader: Implement BAR and fix memory barriersReinUsesLisp2021-07-237-5/+79
|
* shader: Abstract breadth searches and use the abstractionReinUsesLisp2021-07-234-104/+106
|
* shader: Reimplement GetCbufU64 as GetCbufU32x2ReinUsesLisp2021-07-239-22/+21
| It may generate better code on some compilers and it's easier to handle.
* shader: Remove unused header in VOTEReinUsesLisp2021-07-231-2/+0
|
* shader: Rework global memory tracking to use breadth-first searchReinUsesLisp2021-07-231-69/+80
|
* shader: Fix fp16 merge when using native fp16ReinUsesLisp2021-07-231-3/+3
|
* shader: Fix FADD32IReinUsesLisp2021-07-231-6/+4
|
* shader: Fix undetected bug from reviewFernandoS272021-07-231-0/+3
|
* shader: Address feedbackFernandoS272021-07-233-13/+16
|
* shader: "Implement" NOPFernandoS272021-07-231-1/+1
|
* shader: Address FeedbackFernandoS272021-07-2316-211/+60
|
* shader: Implement SR_LaneIdFernandoS272021-07-237-0/+15
|
* shader: Fix shared memory on cool driversFernandoS272021-07-231-0/+1
|
* shader: Implement MEMBARFernandoS272021-07-239-11/+121
|
* shader: Improve VOTE.VTG stubFernandoS272021-07-237-4/+147
|
* shader: Mark SSBOs as written when they areFernandoS272021-07-232-2/+30
|
* shader: Implement ViewportIndexFernandoS272021-07-237-2/+32
|
* shader: Stub TLD4's PTP when it isn't constantFernandoS272021-07-231-1/+2
|
* shader: Stub VOTE.VTGFernandoS272021-07-234-4/+15
|
* shader: Fold composite extractFernandoS272021-07-231-0/+62
|
* shader: Fold comparisons and Pack/Unpack16FernandoS272021-07-231-1/+41
|
* shader: Fix branches to visited virtual blocksReinUsesLisp2021-07-232-0/+12
|
* shader: Fix dependency on identity removal passReinUsesLisp2021-07-232-3/+8
|
* shader: Fix constant propagation to use reverse post orderReinUsesLisp2021-07-231-1/+2
|
* shader: Implement LDG .U.128 as .128ReinUsesLisp2021-07-231-3/+2
|
* shader: Unroll "using enum" for opcode declarationsReinUsesLisp2021-07-231-1/+27
|
* spirv: Remove unnecessary variable for clip distancesReinUsesLisp2021-07-232-6/+2
|
* shader: Implement ClipDistanceFernandoS272021-07-235-0/+36
|
* shader: Fix TXDFernandoS272021-07-232-2/+2
|
* shader: Address feedbackFernandoS272021-07-234-52/+48
|
* shader: Always pass a lod for TexelFetchReinUsesLisp2021-07-233-25/+17
|
* shader: Implement TXDFernandoS272021-07-234-10/+183
|
* shader: Implement ImageGradientFernandoS272021-07-238-2/+84
|
* shader: Implement TMML partiallyFernandoS272021-07-236-13/+137
|
* shader,spirv: Implement ImageQueryLod.FernandoS272021-07-239-1/+38
|
* shader: Implement TLDSFernandoS272021-07-233-4/+253
|
* shader: Implement TLDFernandoS272021-07-237-14/+173
|
* spirv: Add fixed pipeline point sizeReinUsesLisp2021-07-233-1/+8
|
* shader: Add PointCoord attributeFernandoS272021-07-235-0/+16
|
* shader: Add PointSize attributeameerj2021-07-235-0/+13
|
* shader: Store type of phi nodes in flagsReinUsesLisp2021-07-233-2/+11
| This is needed because pseudo-instructions were invalidated.
* shader: Fix indirect branches to scheduler instructionsReinUsesLisp2021-07-233-7/+17
|
* spirv: Fix default output attribute initializationReinUsesLisp2021-07-231-3/+3
|
* shader: Add missing new linesReinUsesLisp2021-07-231-0/+2
|
* shader: Implement FSWZADDameerj2021-07-2314-4/+87
|
* shader: Implement BRXFernandoS272021-07-2320-47/+388
|
* shader: Fix alignment checks on RZReinUsesLisp2021-07-231-1/+1
|
* shader: Implement I2I CCameerj2021-07-233-24/+45
|
* shader: Implement I2I SATameerj2021-07-236-10/+52
|
* shader: Fix ISCADD logic for PO/CCameerj2021-07-231-7/+8
|
* shader: Implement LDS, STS, LDL, and STL and use SPIR-V 1.4 when availableReinUsesLisp2021-07-2317-17/+626
|
* shader: Implement ISCADD CCameerj2021-07-231-1/+4
|
* shader: Implement VMAD, VMNMX, VSETPameerj2021-07-239-23/+319
|
* shader: Add missing I2I exception when CC is usedReinUsesLisp2021-07-231-0/+4
|
* shader: Better interpolation and disabled attributes supportReinUsesLisp2021-07-237-23/+96
|
* spirv: Remove dependencies on Environment when generating SPIR-VReinUsesLisp2021-07-234-9/+12
|
* shader: Implement front faceReinUsesLisp2021-07-235-0/+12
|
* shader: Fix structured control flow on KIL instructionsReinUsesLisp2021-07-232-3/+7
| This could potentially leave unvisited blocks, leading to illegal phi nodes.
* shader: Fix TXQFernandoS272021-07-231-1/+1
|
* shader: Implement TXQ and fix FragDepthReinUsesLisp2021-07-2314-21/+172
|
* shader: Refactor PTP and other minor changesReinUsesLisp2021-07-2314-123/+67
|
* shader: Add IR opcode for ImageFetchFernandoS272021-07-237-5/+55
|
* shader: Implement TLD4.PTPFernandoS272021-07-2315-28/+111
|
* shader: Fix Array Indices in TEX/TLD4FernandoS272021-07-232-6/+6
|
* shader: Implement FragDepthFernandoS272021-07-232-1/+7
|
* shader: Implement TLD4S.FernandoS272021-07-233-4/+134
|
* shader: Implement TLD4 and TLD4_BFernandoS272021-07-2313-11/+315
|
* shader: Implement SHFLameerj2021-07-2316-69/+284
|
* shader: Track first bindless argument instead of the instruction itselfReinUsesLisp2021-07-231-1/+1
|
* shader: Properly insert Prologue instructionReinUsesLisp2021-07-231-1/+2
|
* shader: Minor style nitsReinUsesLisp2021-07-231-2/+4
|
* shader: Fix F2IFernandoS272021-07-2310-9/+147
|
* shader: Implement NDC [-1, 1], attribute types and default varying initializationReinUsesLisp2021-07-2312-40/+149
|
* shader: Fix use-after-free bug in object_poolReinUsesLisp2021-07-231-3/+3
|
* shader: Implement VOTEameerj2021-07-2314-5/+167
|
* shader: Fix TEX maskReinUsesLisp2021-07-231-1/+3
|
* vk_pipeline_cache: Add pipeline cacheReinUsesLisp2021-07-234-8/+15
|
* shader: Fold interpolation multiplicationsReinUsesLisp2021-07-231-0/+34
|
* shader: Better but still partial interpolation supportReinUsesLisp2021-07-231-5/+7
|
* shader: Implement DMNMX, DSET, DSETPameerj2021-07-2315-59/+208
|
* shader: Implement FADD32IFernandoS272021-07-231-2/+15
|
* shader: Implement F2FFernandoS272021-07-236-20/+192
|
* shader: Add missing fp64 usage flagsReinUsesLisp2021-07-231-0/+34
|
* shader: Implement DMUL and DFMAameerj2021-07-238-30/+111
| Also add a missing const on DADD
* shader: Add FP64 register load/store helpersameerj2021-07-233-21/+24
|
* shader: Add support for fp16 comparisons and misc fixesReinUsesLisp2021-07-2311-14/+56
|
* shader: Fix floating point comparison for FP16FernandoS272021-07-235-32/+56
|
* shader: Implement HSETP2FernandoS272021-07-233-12/+117
|
* shader: Implement HSET2FernandoS272021-07-235-14/+119
|
* shader: Implement HMUL2FernandoS272021-07-233-16/+144
|
* shader: Implement HFMA2FernandoS272021-07-235-20/+192
|
* spirv: Implement VertexId and InstanceId, refactor codeReinUsesLisp2021-07-239-144/+243
|
* shader: Refactor half floating instructionsFernandoS272021-07-234-58/+84
|
* shader: Implement I2FReinUsesLisp2021-07-2316-69/+427
|
* shader: Implement ISCADD (imm)ReinUsesLisp2021-07-231-2/+2
|
* shader: Implement LOP32IReinUsesLisp2021-07-232-18/+45
|
* shader: Add partial rasterizer integrationReinUsesLisp2021-07-2334-156/+629
|
* shader: Implement DADDameerj2021-07-238-14/+132
|
* shader: Implement CSET and CSETPameerj2021-07-236-15/+114
|
* shader: Reorder phi nodes when redefined as undefined opcodesReinUsesLisp2021-07-231-1/+9
|
* shader: Fix instruction transitions in and out of PhiReinUsesLisp2021-07-231-9/+11
|
* shader: Implement FSET and FSETPameerj2021-07-239-94/+204
| Also fix oversight with adding SignedZeroInfNanPreserve execution mode.
* shader: Implement TEXSReinUsesLisp2021-07-238-7/+287
|
* shader: Implement CAL inlining function callsReinUsesLisp2021-07-2324-330/+286
|
* spirv: Add SignedZeroInfNanPreserve logicameerj2021-07-232-0/+8
|
* shader: Implement FMNMXameerj2021-07-238-25/+101
| And add a const in FCMP
* shader: Fix rebase issueReinUsesLisp2021-07-231-1/+0
|
* shader: Implement FCMPameerj2021-07-239-50/+203
| Still need to configure some settings for NV denorm flush and Intel NaN
* shader: Partial implementation of LDCReinUsesLisp2021-07-2316-50/+405
|
* shader: Initial support for textures and TEXReinUsesLisp2021-07-2329-341/+1378
|
* shader: Implement R2Pameerj2021-07-238-15/+88
|
* shader: Implement SHFameerj2021-07-238-31/+119
|
* shader: Implement LEAameerj2021-07-239-29/+136
|
* shader: Deduplicate HADD2 codeReinUsesLisp2021-07-231-19/+16
|
* shader: Implement I2Iameerj2021-07-233-12/+100
|
* shader: Implement HADD2ReinUsesLisp2021-07-2312-42/+400
|
* shader: Implement LOP and LOP3ameerj2021-07-238-31/+227
|
* shader: Implement IADD3ameerj2021-07-233-12/+104
|
* shader: Implement PSETPameerj2021-07-234-5/+40
|
* Implement PSET, refactor common comparison funcsameerj2021-07-239-101/+88
|
* shader: Implement FLOameerj2021-07-238-18/+75
|
* shader: Implement ISET, add common_funcsameerj2021-07-238-50/+150
|
* shader: Make IMNMX, SHR, SEL stylistically more consistentameerj2021-07-233-5/+5
|
* shader: Implement ICMPameerj2021-07-233-16/+84
|
* shader: Implement IMNMXameerj2021-07-238-12/+105
|
* shader: Implement BFIameerj2021-07-233-16/+57
|
* shader: Implement BFEameerj2021-07-233-12/+67
|
* shader: Implement POPCameerj2021-07-238-12/+59
|
* shader: Implement SHRameerj2021-07-238-18/+80
|
* shader: Implement SELameerj2021-07-234-16/+53
|
* spirv: Move phi arguments emit to a separate functionReinUsesLisp2021-07-231-27/+27
|
* shader: Avoid infinite recursion when tracking global memoryReinUsesLisp2021-07-231-5/+26
|
* shader: Fix conditional execution of exit instructionsReinUsesLisp2021-07-232-5/+6
|
* spirv: Add support for self-referencing phi nodesReinUsesLisp2021-07-231-3/+10
|
* shader: Fix control flowReinUsesLisp2021-07-238-20/+39
|
* shader: Implement more of XMAD and FFMA32I and fix XMAD.CBCCReinUsesLisp2021-07-235-28/+76
|
* shader: FMUL, select, RRO, and MUFU fixesReinUsesLisp2021-07-2318-119/+507
|
* shader: Fix MOV(reg), add SHL variants and emit neg and abs instructionsReinUsesLisp2021-07-234-11/+11
|
* spirv: Fixes and Intel specific workaroundsReinUsesLisp2021-07-2310-32/+43
|
* shader: Rename, implement FADD.SAT and P2R (imm)ReinUsesLisp2021-07-2317-125/+211
|
* shader: Add denorm flush supportReinUsesLisp2021-07-2315-60/+210
|
* spirv: Add lower fp16 to fp32 passReinUsesLisp2021-07-2328-276/+465
|
* shader: Primitive Vulkan integrationReinUsesLisp2021-07-2328-498/+573
|
* shader: Add XMAD multiplication folding optimizationReinUsesLisp2021-07-231-5/+77
|
* shader: Simplify ISCADDReinUsesLisp2021-07-231-6/+1
|
* shader: Add utility to resolve identities on a valueReinUsesLisp2021-07-232-0/+8
|
* spirv: Implement EmitIdentityReinUsesLisp2021-07-232-3/+3
|
* spirv: Initial bindings supportReinUsesLisp2021-07-2322-292/+671
|
* shader: Improve object poolReinUsesLisp2021-07-233-50/+66
|
* shader: Fix trackingReinUsesLisp2021-07-231-50/+72
|
* shader: Add support for forward declarationsReinUsesLisp2021-07-2310-68/+79
|
* shader: Support SSA loops on IRReinUsesLisp2021-07-2312-46/+150
|
* shader: Misc fixesReinUsesLisp2021-07-2310-89/+104
|
* shader: Initial implementation of an ASTReinUsesLisp2021-07-2332-589/+1345
|
* spirv: Initial SPIR-V supportReinUsesLisp2021-07-2318-34/+1400
|
* shader: Better constant foldingReinUsesLisp2021-07-232-13/+48
|
* shader: Properly store phi on InstReinUsesLisp2021-07-236-75/+132
|
* shader: Add pools and rename filesReinUsesLisp2021-07-2330-108/+255
|
* shader: Make typed IRReinUsesLisp2021-07-2319-269/+495
|
* shader: Remove illegal character in SSA passReinUsesLisp2021-07-231-1/+1
|
* shader: Constant propagation and global memory to storage bufferReinUsesLisp2021-07-2317-63/+652
|
* shader: Initial instruction supportReinUsesLisp2021-07-2328-334/+1450
|
* shader: SSA and dominanceReinUsesLisp2021-07-2324-77/+570
|
* shader: Initial recompiler workReinUsesLisp2021-07-2356-0/+7060