path: root/src/video_core/shader/decode/other.cpp
Commit message (Author, Age, Files, Lines)
* video_core: Reimplement the buffer cache (ReinUsesLisp, 2021-02-13, 1 file, -0/+1)
|   Reimplement the buffer cache using cached bindings and page level granularity for modification tracking. This also drops the usage of shared pointers and virtual functions from the cache.
|   - Bindings are cached, allowing work to be skipped when the game changes few bits between draws.
|   - OpenGL Assembly shaders no longer copy when a region has been modified from the GPU to emulate constant buffers, instead GL_EXT_memory_object is used to alias sub-buffers within the same allocation.
|   - OpenGL Assembly shaders stream constant buffer data using glProgramBufferParametersIuivNV, from NV_parameter_buffer_object. In theory this should save one hash table resolve inside the driver compared to glBufferSubData.
|   - A new OpenGL stream buffer is implemented based on fences for drivers other than Nvidia's proprietary one, due to their low performance on partial glBufferSubData calls synchronized with 3D rendering (that some games use a lot).
|   - Most optimizations are shared between APIs now, allowing Vulkan to cache more bindings than before, skipping unnecessary work.
|   This commit adds the necessary infrastructure to use Vulkan objects from OpenGL.
|   Overall, it improves performance and fixes some bugs present in the old cache. There are still some edge cases hit by some games that harm performance on some vendors; these are planned to be fixed in later commits.
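The "page level granularity for modification tracking" mentioned above can be illustrated with a small sketch. This is not the buffer cache's actual code: the class `PageTracker`, its method names, and the 4 KiB page size are assumptions made only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical tracker: one dirty flag per page of a buffer.
class PageTracker {
public:
    static constexpr std::uint64_t kPageBits = 12; // assumption: 4 KiB pages
    static constexpr std::uint64_t kPageSize = std::uint64_t{1} << kPageBits;

    explicit PageTracker(std::uint64_t size_bytes)
        : dirty_((size_bytes + kPageSize - 1) >> kPageBits, false) {}

    // Marks every page touched by [offset, offset + size) as modified.
    void MarkModified(std::uint64_t offset, std::uint64_t size) {
        assert(size > 0);
        const std::uint64_t last = (offset + size - 1) >> kPageBits;
        for (std::uint64_t page = offset >> kPageBits; page <= last && page < dirty_.size(); ++page) {
            dirty_[page] = true;
        }
    }

    // Returns true if any page touched by [offset, offset + size) is modified.
    bool IsModified(std::uint64_t offset, std::uint64_t size) const {
        assert(size > 0);
        const std::uint64_t last = (offset + size - 1) >> kPageBits;
        for (std::uint64_t page = offset >> kPageBits; page <= last && page < dirty_.size(); ++page) {
            if (dirty_[page]) {
                return true;
            }
        }
        return false;
    }

private:
    std::vector<bool> dirty_;
};
```

A two-byte write that straddles a page boundary marks both pages, while untouched pages stay clean and can be skipped on the next upload or download.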
* video_core: Remove unnecessary enum class casting in logging messages (Lioncash, 2020-12-07, 1 file, -20/+14)
|   fmt now automatically prints the numeric value of an enum class member by default, so we don't need to use casts any more. Reduces the line noise a bit.
* video_core: Resolve more variable shadowing scenarios pt.2 (Lioncash, 2020-12-05, 1 file, -5/+5)
|   Migrates the video core code closer to enabling variable shadowing warnings as errors. This primarily sorts out shadowing occurrences within the Vulkan code.
* decode/other: Implement S2R.LaneId (ReinUsesLisp, 2020-07-16, 1 file, -2/+1)
|   This maps to the host's thread id.
|   - Fixes graphical issues on Paper Mario.
* Merge pull request #4016 from ReinUsesLisp/invocation-info (LC, 2020-06-02, 1 file, -1/+1)
|\    shader/other: Fix hardcoded value in S2R INVOCATION_INFO
| * shader/other: Fix hardcoded value in S2R INVOCATION_INFO (ReinUsesLisp, 2020-05-30, 1 file, -1/+1)
| |   Geometry shaders built from Nvidia's compiler check for bits[16:23] to be less than or equal to 0 with VSETP to default to a "safe" value of 0x8000'0000 (safe from hardware's perspective). To avoid hitting this path in the shader, return 0x00ff'0000 from S2R INVOCATION_INFO. This seems to be the maximum number of vertices a geometry shader can emit in a primitive.
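The bit layout described above can be sketched as follows. This is only an illustration of the arithmetic, not the decoder's code: the names `kInvocationInfo`, `VertexCountField` and `TakesSafePath` are hypothetical, and treating the field as unsigned is an assumption (for an unsigned field, "less than or equal to 0" reduces to "equal to 0").

```cpp
#include <cstdint>

// Value the commit message says is returned from S2R INVOCATION_INFO.
constexpr std::uint32_t kInvocationInfo = 0x00ff'0000;

// Extracts bits [16:23] of the special register value.
constexpr std::uint32_t VertexCountField(std::uint32_t info) {
    return (info >> 16) & 0xffu;
}

// Models the shader-side check: a zero field selects the "safe" fallback.
// Assumption: the field is compared as an unsigned quantity.
constexpr bool TakesSafePath(std::uint32_t info) {
    return VertexCountField(info) == 0;
}

static_assert(VertexCountField(kInvocationInfo) == 0xff, "field holds its maximum value");
static_assert(!TakesSafePath(kInvocationInfo), "fallback path is avoided");
```

With a value whose bits [16:23] are all zero, `TakesSafePath` would return true, which is the path the commit wants shaders to avoid.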
* | shader/other: Implement MEMBAR.CTS (ReinUsesLisp, 2020-05-27, 1 file, -2/+12)
|/    This silences an assertion we were hitting and uses workgroup memory barriers when the game requests it.
* Merge pull request #3981 from ReinUsesLisp/bar (bunnei, 2020-05-26, 1 file, -0/+5)
|\    shader/other: Implement BAR.SYNC 0x0
| * shader/other: Implement BAR.SYNC 0x0 (ReinUsesLisp, 2020-05-22, 1 file, -0/+5)
| |   Trivially implement this particular case of BAR. Unless games use OpenCL or CUDA barriers, we shouldn't hit any other case here.
* | shader/other: Implement thread comparisons (NV_shader_thread_group) (ReinUsesLisp, 2020-05-22, 1 file, -0/+21)
|/    Hardware S2R special registers match gl_Thread*MaskNV. We can trivially implement these using Nvidia's extension on OpenGL or naively stub them with the ARB instructions to match. This might cause issues if the host device warp size doesn't match Nvidia's. That said, this is unlikely on proper shaders.
|     Refer to the attached url for more documentation about these flags.
|     https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_group.txt
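As a rough illustration of what the gl_Thread*MaskNV values contain, the five masks can be derived from a lane index on the CPU. This is a sketch assuming a 32-thread warp and a lane id in [0, 31]; the `ThreadMasks` struct and `MakeThreadMasks` function are hypothetical names, not part of the decoder or of the extension.

```cpp
#include <cstdint>

// Hypothetical container for the five comparison masks of one thread.
struct ThreadMasks {
    std::uint32_t eq; // only the bit of this lane
    std::uint32_t ge; // bits of lanes with id >= this lane
    std::uint32_t gt; // bits of lanes with id >  this lane
    std::uint32_t le; // bits of lanes with id <= this lane
    std::uint32_t lt; // bits of lanes with id <  this lane
};

// Assumption: 32-wide warp, lane_id in [0, 31].
constexpr ThreadMasks MakeThreadMasks(std::uint32_t lane_id) {
    const std::uint32_t eq = std::uint32_t{1} << lane_id;
    const std::uint32_t lt = eq - 1u;
    const std::uint32_t le = lt | eq;
    const std::uint32_t gt = ~le;
    const std::uint32_t ge = ~lt;
    return ThreadMasks{eq, ge, gt, le, lt};
}
```

On a host with a different subgroup width the masks would need more (or fewer) bits, which is the mismatch the commit message warns about.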
* Merge pull request #3601 from ReinUsesLisp/some-shader-encodings (bunnei, 2020-04-09, 1 file, -3/+9)
|\    video_core/shader: Add some instruction and S2R encodings
| * shader/other: Add error message for some S2R registers (ReinUsesLisp, 2020-04-04, 1 file, -0/+6)
| |
| * shader_bytecode: Rename MOV_SYS to S2R (ReinUsesLisp, 2020-04-04, 1 file, -3/+3)
| |
* | shader_decompiler: Remove FragCoord.w hack and change IPA implementation (ReinUsesLisp, 2020-04-02, 1 file, -15/+21)
|/    Credits go to gdkchan and Ryujinx. The pull request used for this can be found here: https://github.com/Ryujinx/Ryujinx/pull/1082
|     yuzu was already using the header for interpolation, but it was missing the FragCoord.w multiplication described in the linked pull request. This commit finally removes the FragCoord.w == 1.0f hack from the shader decompiler.
|     While we are at it, this commit renames some enumerations to match Nvidia's documentation (linked below) and fixes component declaration order in the shader program header (z and w were swapped).
|     https://github.com/NVIDIA/open-gpu-doc/blob/master/Shader-Program-Header/Shader-Program-Header.html
* shader/other: Fix skips for SYNC and BRK (ReinUsesLisp, 2020-01-29, 1 file, -2/+2)
|
* shader/other: Stub S2R LaneId (ReinUsesLisp, 2020-01-29, 1 file, -1/+4)
|
* shader: Implement MEMBAR.GL (ReinUsesLisp, 2019-12-10, 1 file, -0/+6)
|   Implement using memoryBarrier in GLSL and OpMemoryBarrier on SPIR-V.
* shader_ir/other: Implement S2R InvocationId (ReinUsesLisp, 2019-12-10, 1 file, -0/+2)
|
* shader/other: Reduce DEPBAR log severity (ReinUsesLisp, 2019-11-20, 1 file, -1/+1)
|   While DEPBAR is stubbed it doesn't change anything from our end. Shading languages handle what this instruction does implicitly. We are not getting anything out of this log except noise.
* video_core/shader: Resolve instances of variable shadowing (Lioncash, 2019-10-24, 1 file, -1/+1)
|   Silences a few -Wshadow warnings.
* Merge pull request #2758 from ReinUsesLisp/packed-tid (bunnei, 2019-08-29, 1 file, -0/+7)
|\    shader/decode: Implement S2R Tic
| * shader/decode: Implement S2R Tic (ReinUsesLisp, 2019-07-22, 1 file, -0/+7)
| |
* | shader_ir: Implement NOP (ReinUsesLisp, 2019-08-04, 1 file, -0/+6)
|/
* shader/decode/other: Correct branch indirect argument within BRA handling (Lioncash, 2019-07-16, 1 file, -1/+1)
|   This appears to have been a copy/paste error introduced within 8a6fc529a968e007f01464abadd32f9b5eb0a26c
* shader_ir: Unify blocks in decompiled shaders. (Fernando Sahmkow, 2019-07-09, 1 file, -7/+23)
|
* shader_ir: Implement BRX & BRA.CC (Fernando Sahmkow, 2019-07-09, 1 file, -4/+38)
|
* shader: Split SSY and PBK stack (ReinUsesLisp, 2019-06-07, 1 file, -10/+8)
|   Hardware testing revealed that SSY and PBK push to different stacks, allowing code like this:
|       SSY label1;
|       PBK label2;
|       SYNC;
|       label1: PBK;
|       label2: EXIT;
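The effect of keeping two stacks can be modelled with a short sketch. This is not the ShaderIR implementation: the `FlowStacks` class and its method names are hypothetical, and labels are represented as plain integers for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <stack>

// Hypothetical model: SSY/SYNC and PBK/BRK each use their own stack.
class FlowStacks {
public:
    void Ssy(std::uint32_t target) { ssy_.push(target); }
    void Pbk(std::uint32_t target) { pbk_.push(target); }

    // SYNC jumps to the most recent SSY target.
    std::uint32_t Sync() {
        assert(!ssy_.empty());
        const std::uint32_t target = ssy_.top();
        ssy_.pop();
        return target;
    }

    // BRK jumps to the most recent PBK target.
    std::uint32_t Brk() {
        assert(!pbk_.empty());
        const std::uint32_t target = pbk_.top();
        pbk_.pop();
        return target;
    }

private:
    std::stack<std::uint32_t> ssy_;
    std::stack<std::uint32_t> pbk_;
};
```

With a single shared stack, a SYNC issued after `SSY label1; PBK label2;` would pop label2. With split stacks it pops label1, and the PBK target stays available for a later BRK.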
* shader: Use shared_ptr to store nodes and move initialization to file (ReinUsesLisp, 2019-06-06, 1 file, -0/+1)
|   Instead of storing unique_ptr nodes in a vector and returning raw pointers to them, use shared_ptr. While changing initialization code, move it to a separate file when possible.
|   This is a first step to allow code analysis and node generation beyond the ShaderIR class.
* Merge pull request #2446 from ReinUsesLisp/tid (bunnei, 2019-05-29, 1 file, -14/+28)
|\    shader: Implement S2R Tid{XYZ} and CtaId{XYZ}
| * shader: Implement S2R Tid{XYZ} and CtaId{XYZ} (ReinUsesLisp, 2019-05-20, 1 file, -14/+28)
| |
* | shader/decode/*: Eliminate indirect inclusions (Lioncash, 2019-05-23, 1 file, -0/+1)
|/    Amends cases where we were using things that were indirectly being satisfied through other headers. This way, if those headers change and eliminate dependencies on other headers in the future, we don't have cascading compilation errors.
* shader_ir/other: Implement IPA.IDX (ReinUsesLisp, 2019-05-03, 1 file, -5/+8)
|
* shader: Remove unused AbufNode Ipa mode (ReinUsesLisp, 2019-05-03, 1 file, -1/+1)
|
* shader_decompiler: Improve Accuracy of Attribute Interpolation. (Fernando Sahmkow, 2019-02-14, 1 file, -2/+13)
|
* shader_ir: Rename BasicBlock to NodeBlock (ReinUsesLisp, 2019-02-03, 1 file, -1/+1)
|   It's not always used as a basic block. Rename it for consistency.
* shader_ir: Pass decoded nodes as a whole instead of per basic blocks (ReinUsesLisp, 2019-02-03, 1 file, -1/+1)
|   Some games call LDG at the top of a basic block, causing the tracking heuristic to fail. This commit lets the heuristic operate on the decoded nodes as a whole instead of per basic block. This may lead to some false positives but allows the heuristic to track cases it previously couldn't.
* shader_ir: Pass to decoder functions basic block's code (ReinUsesLisp, 2019-01-15, 1 file, -1/+1)
|
* shader_decode: Use proper primitive names (ReinUsesLisp, 2019-01-15, 1 file, -6/+6)
|
* shader_ir: Remove Ipa primitive (ReinUsesLisp, 2019-01-15, 1 file, -3/+2)
|
* video_core: Implement IR based geometry shaders (ReinUsesLisp, 2019-01-15, 1 file, -0/+25)
|
* shader_ir: Fixup file inclusions and clang-format (ReinUsesLisp, 2019-01-15, 1 file, -1/+1)
|
* shader_decode: Implement MOV_SYS (ReinUsesLisp, 2019-01-15, 1 file, -0/+27)
|
* shader_decode: Implement BRA internal flag (ReinUsesLisp, 2019-01-15, 1 file, -4/+8)
|
* shader_decode: Implement PBK and BRK (ReinUsesLisp, 2019-01-15, 1 file, -1/+22)
|
* shader_decode: Stub DEPBAR (ReinUsesLisp, 2019-01-15, 1 file, -0/+4)
|
* shader_decode: Implement SSY and SYNC (ReinUsesLisp, 2019-01-15, 1 file, -0/+19)
|
* shader_decode: Partially implement BRA (ReinUsesLisp, 2019-01-15, 1 file, -0/+12)
|
* shader_decode: Implement IPA (ReinUsesLisp, 2019-01-15, 1 file, -0/+12)
|
* shader_decode: Implement EXIT (ReinUsesLisp, 2019-01-15, 1 file, -1/+32)
|
* shader_ir: Initial implementation (ReinUsesLisp, 2019-01-15, 1 file, -0/+24)