path: root/src/video_core/renderer_opengl/gl_shader_cache.cpp
Commit message | Author | Age | Files | Lines
* video_core: Enable ImageGather with subpixel offset on IntelWollnashorn2023-04-081-1/+1
|
* shader_recompiler: Add subpixel offset for correct rounding at `ImageGather`Wollnashorn2023-04-081-0/+1
| On AMD, a subpixel offset of 1/512 of the texel size is applied to the texture coordinates at an ImageGather call to ensure the rounding at the texel centers is done the same way as in Maxwell or other Nvidia architectures. See https://www.reedbeta.com/blog/texture-gathers-and-coordinate-precision/ for more details on why this might be necessary. This should fix shadow artifacts at object edges in Zelda: Breath of the Wild (#9957, #6956).
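As a rough host-side illustration of the offset described in this commit, the sketch below shifts a normalized texture coordinate by 1/512 of one texel. The function name, the use of normalized coordinates, and the positive direction of the shift are assumptions made for illustration; the real change lives in the shader recompiler and is not reproduced here.

```cpp
// Hypothetical helper, not taken from the yuzu source.
// Shifts a normalized coordinate by 1/512 of one texel.
// Assumption: coordinates are normalized, so one texel spans 1 / texture_size.
float ApplyGatherSubpixelOffset(float coord, int texture_size) {
    const float texel_size = 1.0f / static_cast<float>(texture_size);
    return coord + texel_size / 512.0f;
}
```

For a 256-texel-wide texture this moves a coordinate of 0.5 by 1/131072, far less than a texel, which is only meant to tip the rounding at exact texel boundaries.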
* Merge pull request #9588 from liamwhite/bylaws-revertsliamwhite2023-02-191-1/+0
|\ | | | | Revert "shader_recompiler: Align SSBO offsets to meet host requirements"
| * Revert "Vulkan, OpenGL: Hook up storage buffer alignment code"Liam2023-01-071-1/+0
| | | | | | | | This reverts commit 9e2997c4b6456031622602002924617690e32a13.
* | gl_compute_pipeline: Force context flush when loading shader cacheameerj2023-01-301-4/+4
| |
* | gl_graphics_pipeline: Force context flush when loading shader cacheameerj2023-01-301-4/+5
|/
* Vulkan, OpenGL: Hook up geometry shader passthrough emulationBilly Laws2023-01-051-0/+1
|
* Vulkan, OpenGL: Hook up storage buffer alignment codeBilly Laws2023-01-051-0/+1
|
* ShaderCompiler: Inline driver specific constants.Fernando Sahmkow2023-01-031-1/+1
|
* MacroHLE: Add OpenGL SupportFernando Sahmkow2023-01-011-1/+2
|
* Merge pull request #7450 from FernandoS27/ndc-vulkanliamwhite2022-12-171-0/+1
|\ | | | | Vulkan: Add support for VK_EXT_depth_clip_control.
| * Vulkan: Add support for VK_EXT_depth_clip_control.FernandoS272022-12-141-0/+1
| |
* | renderer_opengl: refactor context acquireLiam2022-12-131-31/+45
|/
* video_core: Implement maxwell3d draw manager and split draw logicFeng Chen2022-12-081-2/+4
|
* shader_recompiler: add gl_Layer translation GS for older hardwareLiam2022-12-011-4/+33
|
* Merge pull request #9167 from vonchenplus/tessliamwhite2022-11-111-1/+2
|\ | | | | video_core: Fix few issues in Tess stage
| * video_core: Fix few issues in Tess stageFengChen2022-11-071-1/+2
| |
* | ir/texture_pass: Use host_info instead of querying Settings::values (#9176)Morph2022-11-111-0/+1
|/
* Merge pull request #8858 from vonchenplus/mipmapbunnei2022-11-041-1/+1
|\ | | | | video_core: Generate mipmap texture by drawing
| * video_core: Generate mipmap texture by drawingFengChen2022-09-201-1/+1
| |
* | Merge pull request #8873 from vonchenplus/fix_legacy_location_errorbunnei2022-10-241-0/+1
|\ \ | | | | | | video_core: Fix legacy to generic location unpaired
| * | video_core: Fix legacy to generic location unpairedFengChen2022-09-201-0/+1
| | |
* | | renderer_(opengl/vulkan): Fix tessellation clockwise parameterMorph2022-10-131-2/+2
| | | This should be assigned CW only on Triangles_CW, rather than on anything that is not Triangles_CCW, making CCW the default winding order rather than CW.
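The difference between the two conditions can be sketched as below. The enum and both function names are hypothetical stand-ins that only borrow the Triangles_CW and Triangles_CCW names from the commit message; the other enumerators are assumptions.

```cpp
// Hypothetical enum; only Triangles_CW and Triangles_CCW come from the commit message.
enum class TessellationPrimitive { Points, Lines, Triangles_CW, Triangles_CCW };

// Behavior described as wrong: everything that is not CCW is treated as clockwise.
constexpr bool IsClockwiseOld(TessellationPrimitive primitive) {
    return primitive != TessellationPrimitive::Triangles_CCW;
}

// Behavior described as correct: only Triangles_CW is clockwise, CCW is the default.
constexpr bool IsClockwise(TessellationPrimitive primitive) {
    return primitive == TessellationPrimitive::Triangles_CW;
}
```

The two functions agree for both triangle modes and differ only for the remaining values, which is where the default winding order changes.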
* | | Update 3D regsKelebek12022-10-071-20/+23
| | |
* | | VideoCore: Fix channels with disk pipeline/shader cache.Fernando Sahmkow2022-10-061-6/+5
| | |
* | | VideoCore: implement channels on gpu caches.Fernando Sahmkow2022-10-061-21/+18
| | |
* | | common: remove "yuzu:" prefix from thread namesLiam2022-10-041-1/+1
|/ /
* / (shader/pipeline)_cache: Raise shader/pipeline cache versionMorph2022-08-311-1/+1
|/ | | | Since the following commit: https://github.com/yuzu-emu/yuzu/commit/a83a5d2e4c8932df864dd4cea2b04d87a12c8760 , many games will refuse to boot unless the shader/pipeline cache has been invalidated.
* video_core: stop waiting for shader compilation on user cancelLiam2022-07-301-1/+1
|
* common: Change semantics of UNREACHABLE to unconditionally crashLiam2022-06-141-2/+2
|
* general: Convert source file copyright comments over to SPDXMorph2022-04-231-3/+2
| | | | | This formats all copyright comments according to SPDX formatting guidelines. Additionally, this resolves the remaining GPLv2 only licensed files by relicensing them to GPLv2.0-or-later.
* video_core: Replace lock_guard with scoped_lockMerry2022-04-071-2/+2
|
* video_core: Reduce unused includesameerj2022-03-191-3/+0
|
* ShaderDecompiler: Add a debug option to dump the game's shaders.Fernando Sahmkow2022-01-041-1/+10
|
* glsl: Add boolean reference workaroundameerj2021-12-301-0/+1
|
* glsl_context_get_set: Add alternative cbuf type for broken driversameerj2021-12-301-0/+1
| Some drivers have a bug when bitwise converting floating point cbuf values to uint variables. This adds a workaround for these drivers to make all cbufs uint and convert to floating point as needed.
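The conversion direction used by this workaround (store as uint, reinterpret as float when needed) corresponds to GLSL's uintBitsToFloat built-in. A host-side C++ analogue, given here only as an illustration with an assumed function name, looks like this:

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");

// Host-side analogue of GLSL uintBitsToFloat: reinterprets the bit pattern
// without any numeric conversion. The name is illustrative only.
float UintBitsToFloat(std::uint32_t bits) {
    float value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```

For example, the bit pattern 0x3F800000 reinterprets to 1.0f on platforms using IEEE 754 single precision.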
* Address format clangvonchenplus2021-12-181-1/+1
|
* Implement convert legacy to genericFeng Chen2021-11-191-0/+3
|
* opengl: Use Shader::NumDescriptors when possibleReinUsesLisp2021-11-161-10/+5
|
* renderers: Log total pipeline countMorph2021-09-141-0/+2
|
* structured_control_flow: Conditionally invoke demote reorder passameerj2021-08-301-0/+1
| | | | This is only needed on select drivers when a fragment shader discards/demotes.
* gl_shader_cache: Remove unused variableLioncash2021-07-271-1/+0
|
* shader_environment: Receive cache version from outsideReinUsesLisp2021-07-231-3/+7
| This allows us to invalidate OpenGL and Vulkan separately in the future.
* glsl: Clamp shared mem size to GL_MAX_COMPUTE_SHARED_MEMORY_SIZEameerj2021-07-231-0/+1
|
* gl_shader_cache: Properly implement asynchronous shadersReinUsesLisp2021-07-231-1/+1
|
* shader: Ignore global memory ops on devices lacking int64 supportameerj2021-07-231-0/+1
|
* gl_shader_cache: Fixes for async shadersameerj2021-07-231-2/+23
|
* emit_spirv: Workaround VK_KHR_shader_float_controls on fp16 NvidiaReinUsesLisp2021-07-231-0/+1
| | | | Fix regression on Fire Emblem: Three Houses when using native fp16.
* video_core: Enable GL SPIR-V shaderslat9nq2021-07-231-9/+31
|
* glasm: Add passthrough geometry shader supportReinUsesLisp2021-07-231-1/+1
|
* shader: Rework varyings and implement passthrough geometry shadersReinUsesLisp2021-07-231-3/+4
| | | | | | Put all varyings into a single std::bitset with helpers to access it. Implement passthrough geometry shaders using host's.
* shader: Unify shader stage typesReinUsesLisp2021-07-231-1/+0
|
* shader: Emulate 64-bit integers when not supportedReinUsesLisp2021-07-231-1/+1
| | | | Useful for mobile and Intel Xe devices.
* gl_shader_cache: Check previous pipeline before checking hash mapReinUsesLisp2021-07-231-6/+14
| | | | Port optimization from Vulkan.
* shaders: Allow shader notify when async shaders is disabledameerj2021-07-231-4/+4
|
* shader: Properly manage attributes not written from previous stagesReinUsesLisp2021-07-231-1/+10
|
* shader: Add support for native 16-bit floatsReinUsesLisp2021-07-231-4/+8
|
* shader: Rename maxwell/program.h to translate_program.hReinUsesLisp2021-07-231-1/+1
|
* glsl: Address rest of feedbackameerj2021-07-231-0/+1
|
* glsl: Conditionally use fine/coarse derivatives based on device supportameerj2021-07-231-0/+1
|
* glsl: Cleanup/Address feedbackameerj2021-07-231-0/+2
|
* gl_shader_cache: Implement async shadersameerj2021-07-231-29/+25
|
* glsl: Add stubs for sparse queries and variable aoffi when not supportedameerj2021-07-231-0/+2
|
* gl_shader_cache: Move OGL shader compilation to the respective Pipeline constructorameerj2021-07-231-61/+8
|
* glsl: Implement fswzaddameerj2021-07-231-0/+1
| | | | and wip nv thread shuffle impl
* glsl: Rebase fixesameerj2021-07-231-1/+0
|
* glsl: Use textureGrad fallback when EXT_texture_shadow_lod is unsupportedameerj2021-07-231-0/+1
|
* glsl: skip gl_ViewportIndex write if device does not support itameerj2021-07-231-0/+1
|
* glsl: Implement transform feedbackameerj2021-07-231-5/+13
|
* glsl: Implement VOTE for subgroup size potentially largerameerj2021-07-231-1/+1
|
* glsl: Implement some attribute getters and settersameerj2021-07-231-1/+0
|
* glsl: Query GL Device for FP16 extension supportameerj2021-07-231-0/+2
|
* glsl: Fixup build issuesReinUsesLisp2021-07-231-1/+1
|
* glsl: Initial backendameerj2021-07-231-2/+5
|
* shader: Reorder shader cache directoriesReinUsesLisp2021-07-231-8/+5
|
* gl_shader_util: Move shader utility code to a separate fileReinUsesLisp2021-07-231-76/+5
|
* gl_shader_cache: Store workers in shader cache objectReinUsesLisp2021-07-231-58/+71
|
* shader: Fix VertexA Shaders.FernandoS272021-07-231-5/+21
|
* glasm: Use ARB_derivative_control conditionallyReinUsesLisp2021-07-231-0/+1
|
* opengl: Declare fragment outputs even if they are not usedReinUsesLisp2021-07-231-0/+2
| | | | | | Fixes Ori and the Blind Forest's menu on GLASM. For some reason (probably high level optimizations) it is not sanitized on SPIR-V for OpenGL. Vulkan is unaffected by this change.
* shader: Handle host exceptionsReinUsesLisp2021-07-231-17/+26
|
* glasm: Use storage buffers instead of global memory when possibleReinUsesLisp2021-07-231-6/+24
|
* gl_shader_cache: Add disk shader cacheReinUsesLisp2021-07-231-6/+107
|
* gl_shader_cache: Rename Program abstractions into PipelineReinUsesLisp2021-07-231-21/+21
|
* gl_shader_cache: Do not flip tessellation on OpenGLReinUsesLisp2021-07-231-2/+1
|
* gl_shader_cache: Conditionally use viewport maskReinUsesLisp2021-07-231-1/+1
|
* gl_shader_cache,glasm: Conditionally use typeless image reads extensionReinUsesLisp2021-07-231-37/+37
|
* gl_shader_cache: Improve GLASM error print logicReinUsesLisp2021-07-231-7/+10
|
* glasm: Implement forced early ZReinUsesLisp2021-07-231-2/+2
|
* glasm: Set transform feedback stateReinUsesLisp2021-07-231-2/+17
|
* gl_shader_cache: Pass shader runtime informationReinUsesLisp2021-07-231-2/+74
|
* shader: Split profile and runtime information in separate structsReinUsesLisp2021-07-231-22/+4
|
* glasm: Support textures used in more than one stageReinUsesLisp2021-07-231-1/+1
|
* opengl: Initial (broken) support to GLASM shadersReinUsesLisp2021-07-231-11/+35
|
* glasm: Initial GLASM compute implementation for testingReinUsesLisp2021-07-231-6/+31
|
* gl_shader_cache: Remove code unintentionally committedReinUsesLisp2021-07-231-3/+0
|
* Move SPIR-V emission functions to their own headerReinUsesLisp2021-07-231-3/+2
|
* shader: Initial OpenGL implementationReinUsesLisp2021-07-231-3/+272
|
* shader: Move pipeline cache logic to separate filesReinUsesLisp2021-07-231-13/+8
| | | | | | | | | Move code to separate files to be able to reuse it from OpenGL. This greatly simplifies the pipeline cache logic on Vulkan. Transform feedback state is not yet abstracted and it's still intrusively stored inside vk_pipeline_cache. It will be moved when needed on OpenGL.
* shader: Remove old shader managementReinUsesLisp2021-07-231-563/+1
|
* bootmanager: Use std::stop_source for stopping emulationReinUsesLisp2021-06-221-3/+3
| | | | | | | Use its std::stop_token to abort shader cache loading. Using std::stop_token instead of std::atomic_bool allows the usage of other utilities like std::stop_callback.
* gl_disk_shader_cache: Log total shader entries count on game loadMorph2021-02-201-0/+4
|
* renderer_opengl: Avoid precompiled cache and force NV GL cache directoryReinUsesLisp2021-01-211-5/+8
| By setting __GL_SHADER_DISK_CACHE_PATH we can force the cache directory to be in yuzu's user directory to stop commonly distributed malware from deleting our driver shader cache. And by setting __GL_SHADER_DISK_CACHE_SKIP_CLEANUP we can have an unbounded shader cache size.
| This has only been implemented on Windows, mostly because previous tests didn't seem to work on Linux.
| Disable the precompiled cache on Nvidia's driver. There's no need to hide information the driver already has in its own cache.
* video_core: Rewrite the texture cacheReinUsesLisp2020-12-301-1/+0
| The current texture cache has several points that hurt maintainability and performance. It's easy to break unrelated parts of the cache when doing minor changes. The cache can easily forget valuable information about the cached textures by CPU writes or simply by its normal usage.
| This commit aims to address those issues.
* video_core: Make use of ordered container contains() where applicableLioncash2020-12-071-1/+1
| | | | | | With C++20, we can use the more concise contains() member function instead of comparing the result of the find() call with the end iterator.
* video_core: Resolve more variable shadowing scenarios pt.3Lioncash2020-12-051-4/+3
| | | | | Cleans out the rest of the occurrences of variable shadowing and makes any further occurrences of shadowing compiler errors.
* video_core: Resolve more variable shadowing scenariosLioncash2020-12-041-3/+3
| | | | | | Resolves variable shadowing scenarios up to the end of the OpenGL code to make it nicer to review. The rest will be resolved in a following commit.
* video_core: Remove all Core::System references in rendererReinUsesLisp2020-09-061-32/+31
| | | | | | | | | Now that the GPU is initialized when video backends are initialized, it's no longer needed to query components once the game is running: it can be done when yuzu is booting. This allows us to pass components between constructors and in the process remove all Core::System references in the video backend.
* gl_shader_util: Use std::string_view instead of star pointerReinUsesLisp2020-08-241-0/+1
| This allows us to pass any type of string and hint the length of the string to the OpenGL driver.
* gl_shader_cache: Use std::max() for determining num_workersMorph2020-08-121-1/+1
| | | | Does not allocate more threads than available in the host system for boot-time shader compilation and always allocates at least 1 thread if hardware_concurrency() returns 0.
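A minimal sketch of the clamping this commit describes, assuming the worker count is derived directly from std::thread::hardware_concurrency(). The function names are hypothetical, and the clamp is split out only so it can be exercised with arbitrary inputs.

```cpp
#include <algorithm>
#include <thread>

// Never return fewer than one worker, even if the reported value is 0.
unsigned ClampWorkerCount(unsigned reported_concurrency) {
    return std::max(1U, reported_concurrency);
}

// hardware_concurrency() may return 0 when the value is not computable.
unsigned NumWorkers() {
    return ClampWorkerCount(std::thread::hardware_concurrency());
}
```

Because the reported value is used as-is whenever it is nonzero, the result never exceeds what the host reports.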
* Merge pull request #4391 from lioncash/nrvobunnei2020-07-241-1/+1
|\ | | | | video_core: Allow copy elision to take place where applicable
| * video_core: Allow copy elision to take place where applicableLioncash2020-07-211-1/+1
| | | | | | | | | | Removes const from some variables that are returned from functions, as this allows the move assignment/constructors to execute for them.
* | video_core: Remove unused variablesLioncash2020-07-211-3/+0
|/ | | | Silences several compiler warnings about unused variables.
* async shadersDavid Marcec2020-07-171-49/+132
|
* gl_shader_cache: Avoid use after move for program sizeReinUsesLisp2020-06-241-5/+7
| | | | | | All programs had a size of zero due to this bug, skipping invalidations. While we are at it, remove some unused forward declarations.
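The class of bug named in this commit can be illustrated with a small sketch. The struct and function names are hypothetical; the point is only that the size must be read before the vector is moved from.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical container for a program and its size in bytes.
struct CachedProgram {
    std::vector<std::uint64_t> code;
    std::size_t size_in_bytes;
};

CachedProgram MakeProgram(std::vector<std::uint64_t> code) {
    // Read the size first: after std::move the source vector is left in a
    // valid but unspecified state (typically empty), so reading size()
    // afterwards would usually yield zero.
    const std::size_t size_in_bytes = code.size() * sizeof(std::uint64_t);
    return CachedProgram{std::move(code), size_in_bytes};
}
```

Computing the size into a local before the move keeps the result independent of how the moved-from vector behaves.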
* Merge pull request #4041 from ReinUsesLisp/arb-decompbunnei2020-06-161-1/+3
|\ | | | | gl_arb_decompiler: Implement an assembly shader decompiler
| * gl_arb_decompiler: Implement an assembly shader decompilerReinUsesLisp2020-06-121-1/+3
| | | | | | | | | | | | Emit code compatible with NV_gpu_program5. This should emit code compatible with Fermi, but it wasn't tested on that architecture. Pascal has some issues not present on Turing GPUs.
* | vk_pipeline_cache: Use generic shader cacheReinUsesLisp2020-06-071-3/+3
| | Trivially port the generic shader cache to Vulkan.
* | gl_shader_cache: Use generic shader cacheReinUsesLisp2020-06-071-45/+42
|/ | | | Trivially port the generic shader cache to OpenGL.
* glsl: Squash constant buffers into a single SSBO when we hit the limitReinUsesLisp2020-06-011-5/+7
| | | | | Avoids compilation errors at the cost of shader build times and runtime performance when a game hits the limit of uniform buffers we can use.
* renderer_opengl: Add assembly program code pathsReinUsesLisp2020-05-191-21/+80
| Add code required to use OpenGL assembly programs based on NV_gpu_program5. Decompilation for ARB programs is intended to be added in a follow up commit. This does **not** include ARB decompilation and it's not in a usable state.
| The intention behind assembly programs is to reduce shader stutter significantly on drivers supporting NV_gpu_program5 (and other required extensions). Currently only Nvidia's proprietary driver supports these extensions.
| Add a UI option hidden for now to avoid people enabling this option accidentally.
| This code path has some limitations that OpenGL compatibility doesn't have:
| - NV_shader_storage_buffer_object is limited to 16 entries for a single OpenGL context state (I don't know if this is an intended limitation, a specification issue or I am missing something). Currently causes issues on The Legend of Zelda: Link's Awakening.
| - NV_parameter_buffer_object can't bind buffers using an offset different to zero. The used workaround is to copy to a temporary buffer (this doesn't happen often so it's not an issue).
| On the other hand, it has the following advantages:
| - Shaders build a lot faster.
| - We have control over how floating point rounding is done over individual instructions (SPIR-V on Vulkan can't do this).
| - Operations on shared memory can be unsigned and signed.
| - Transform feedbacks are dynamic state (not yet implemented).
| - Parameter buffers (uniform buffers) are per stage, matching NVN and hardware's behavior.
| - The API to bind and create assembly programs makes sense, unlike ARB_separate_shader_objects.
* shader/memory_util: Deduplicate codeReinUsesLisp2020-04-261-72/+10
| Deduplicate code shared between vk_pipeline_cache and gl_shader_cache as well as shader decoder code. While we are at it, fix a bug in gl_shader_cache where compute shaders had a start offset of a stage shader.
* ShaderCache/PipelineCache: Cache null shaders.Fernando Sahmkow2020-04-221-4/+13
|
* Revert "gl_shader_cache: Use CompileDepth::FullDecompile on GLSL"Rodrigo Locatti2020-04-171-3/+1
|
* gl_shader_cache: Use CompileDepth::FullDecompile on GLSLReinUsesLisp2020-04-141-1/+3
| From my testing on a Splatoon 2 shader that takes 3800ms on average to compile, changing to FullDecompile reduces it to 900ms on average. The shader decoder will automatically fall back to a more naive method if it can't use full decompile.
* Shader/Pipeline Cache: Use VAddr instead of physical memory for addressing.Fernando Sahmkow2020-04-061-22/+24
|
* Address review and fix broken yuzu-tester buildJames Rowe2020-03-261-1/+1
|
* Frontend/GPU: Refactor context managementJames Rowe2020-03-251-3/+2
| Changes the GraphicsContext to be managed by the GPU core. This eliminates the need for the frontends to fool around with tricky MakeCurrent/DoneCurrent calls that are dependent on the settings (such as async gpu option).
| This also refactors out the need to use QWidget::fromWindowContainer as that caused issues with focus and input handling. Now we use a regular QWidget and just access the native windowHandle() directly.
| Another change is removing the debug tool setting in FrameMailbox. Instead of trying to block the frontend until a new frame is ready, the core will now take over presentation and draw directly to the window if the renderer detects that it's hooked by NSight or RenderDoc.
| Lastly, since it was in the way, I removed ScopeAcquireWindowContext and replaced it with a simple subclass in GraphicsContext that achieves the same result.
* gl_shader_decompiler: Add identifier to decompiled codeReinUsesLisp2020-03-091-2/+4
|
* gl_shader_cache: Reduce registry consistency to debug assertReinUsesLisp2020-03-091-3/+1
| | | | | Registry consistency is something that practically can't happen and it has a measurable runtime cost. Reduce it to a DEBUG_ASSERT.
* shader/registry: Store graphics and compute metadataReinUsesLisp2020-03-091-12/+16
| Store information that GLSL forces us to provide but that is dynamic state in hardware (workgroup sizes, primitive topology, shared memory size).
* video_core: Rename "const buffer locker" to "registry"ReinUsesLisp2020-03-091-32/+33
|
* gl_shader_cache: Rework shader cache and remove post-specializationsReinUsesLisp2020-03-091-345/+158
| Instead of pre-specializing shaders and then post-specializing them, drop the latter and only "specialize" the shader while decoding it.
* gl_state_tracker: Implement dirty flags for clip distances and shadersReinUsesLisp2020-02-281-0/+5
|
* gl_rasterizer: Remove dirty flagsReinUsesLisp2020-02-281-4/+0
|
* Shader_IR: Store Bound buffer on Shader UsageFernando Sahmkow2020-01-241-1/+3
|
* gl_shader_cache: Disable fastmath on NvidiaReinUsesLisp2020-01-211-0/+4
|
* gl_shader_cache: Remove unused STAGE_RESERVED_UBOS constantLioncash2020-01-141-3/+0
| | | | Given this isn't used, this can be removed entirely.
* gl_shader_cache: std::move entries in CachedShader constructorLioncash2020-01-141-3/+4
| | | | Avoids several reallocations of std::vector instances where applicable.
* gl_shader_cache: Remove unused entries variable in BuildShader()Lioncash2020-01-141-1/+0
| | | | Eliminates a few unnecessary constructions of std::vectors.
* gl_shader_cache: Update commentary for shared memoryReinUsesLisp2019-12-211-9/+6
| Remove false commentary. Not dividing the size of shared memory by 4 is not a hack; it describes the number of integers, not bytes. While we are at it, sort the generated code to put preprocessor lines on the top.
* gl_shader_cache: Remove unused entry in GetPrimitiveDescriptionReinUsesLisp2019-12-211-11/+9
|
* gl_shader_cache: Add missing new-line on emitted GLSLReinUsesLisp2019-12-111-2/+2
| Add missing new-line. This caused shaders using local memory and shared memory to inject a preprocessor GLSL line after an expression (resulting in invalid code).
| It looked like this:
|     shared uint smem[8];#define LOCAL_MEMORY_SIZE 16
| It should look like this (addressed by this commit):
|     shared uint smem[8];
|     #define LOCAL_MEMORY_SIZE 16
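The fix amounts to terminating every emitted line before appending the next one. A minimal sketch of such a string builder follows; the function name and parameters are hypothetical, and only the two emitted GLSL lines are taken from the commit message.

```cpp
#include <string>

// Hypothetical emitter: each generated line ends with '\n' so that a
// preprocessor directive never lands on the same line as a declaration.
std::string BuildMemoryPreamble(unsigned shared_memory_words, unsigned local_memory_size) {
    std::string source;
    source += "shared uint smem[" + std::to_string(shared_memory_words) + "];\n";
    source += "#define LOCAL_MEMORY_SIZE " + std::to_string(local_memory_size) + "\n";
    return source;
}
```

Calling it with 8 and 16 produces the two separate lines shown in the corrected output of the commit message.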
* gl_shader_cache: Hack shared memory sizeReinUsesLisp2019-11-231-2/+3
| | | | | | | | The current shared memory size seems to be smaller than what the game actually uses. This makes Nvidia's driver consistently blow up; in the case of FE3H it made it explode on Qt's SwapBuffers while SDL2 worked just fine. For now keep this hack since it's still progress over the previous hardcoded shared memory size.
* gl_shader_cache: Remove dynamic BaseBinding specializationReinUsesLisp2019-11-231-36/+8
|
* video_core: Unify ProgramType and ShaderStage into ShaderTypeReinUsesLisp2019-11-231-127/+100
|
* gl_rasterizer: Bind graphics images to draw commandsReinUsesLisp2019-11-231-0/+1
| | | | | Images were not being bound to draw invocations because these would require a cache invalidation.
* gl_shader_cache: Specialize local memory size for compute shadersReinUsesLisp2019-11-231-0/+5
| Local memory size in compute shaders was stubbed with an arbitrary size. This commit specializes local memory size from guest GPU parameters.
* gl_shader_cache: Specialize shared memory sizeReinUsesLisp2019-11-231-0/+7
| Shared memory was being declared with an undefined size. Specialize the compute shader's shared memory size from guest GPU parameters.
* gl_shader_cache: Specialize shader workgroupReinUsesLisp2019-11-231-35/+28
| | | | | | Drop the usage of ARB_compute_variable_group_size and specialize compute shaders instead. This permits compute to run on AMD and Intel proprietary drivers.
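Specializing the workgroup means baking a fixed size into the generated GLSL instead of relying on a variable group size extension. The sketch below shows one way such a layout declaration could be emitted; the function name is hypothetical and the actual generator may format the line differently.

```cpp
#include <string>

// Hypothetical emitter for a fixed-size compute workgroup declaration.
std::string EmitWorkgroupLayout(unsigned x, unsigned y, unsigned z) {
    return "layout (local_size_x = " + std::to_string(x) +
           ", local_size_y = " + std::to_string(y) +
           ", local_size_z = " + std::to_string(z) + ") in;\n";
}
```

Since the sizes become part of the shader source, each distinct workgroup size yields a distinct specialized shader.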
* shader/texture: Deduce texture buffers from lockerReinUsesLisp2019-11-231-12/+0
| | | | | Instead of specializing shaders to separate texture buffers from 1D textures, use the locker to deduce them while they are being decoded.
* Merge pull request #3081 from ReinUsesLisp/fswzadd-shufflesFernando Sahmkow2019-11-141-5/+14
|\ | | | | shader: Implement FSWZADD and reimplement SHFL
| * gl_shader_cache: Enable extensions only when availableReinUsesLisp2019-11-081-6/+14
| | | | | | | | Silence GLSL compilation warnings.
| * gl_shader_decompiler: Reimplement shuffles with platform agnostic intrinsicsReinUsesLisp2019-11-081-0/+1
| |
* | gl_shader_cache: Fix locker constructorsReinUsesLisp2019-11-081-2/+4
|/ | | | Properly pass engine when a shader is being constructed from memory.
* gl_shader_cache: Implement locker variants invalidationReinUsesLisp2019-10-251-27/+75
|
* gl_shader_disk_cache: Store and load fast BRXReinUsesLisp2019-10-251-2/+17
|
* gl_shader_decompiler: Move entries to a separate functionReinUsesLisp2019-10-251-238/+251
|
* Shader_Cache: setup connection of ConstBufferLockerFernando Sahmkow2019-10-251-16/+29
|
* shader/image: Implement SULD and remove irrelevant codeReinUsesLisp2019-09-211-8/+8
| * Implement SULD as float.
| * Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
* shader_ir/warp: Implement SHFLReinUsesLisp2019-09-171-1/+2
|
* gl_shader_cache: Remove special casing for geometry shadersReinUsesLisp2019-09-041-59/+9
| | | | | Now that ProgramVariants holds the primitive topology we no longer need to keep track of individual geometry shaders topologies.
* video_core: Silent miscellaneous warnings (#2820)Rodrigo Locatti2019-08-301-1/+1
| * texture_cache/surface_params: Remove unused local variable
| * rasterizer_interface: Add missing documentation commentary
| * maxwell_dma: Remove unused rasterizer reference
| * video_core/gpu: Sort member declaration order to silent -Wreorder warning
| * fermi_2d: Remove unused MemoryManager reference
| * video_core: Silent unused variable warnings
| * buffer_cache: Silent -Wreorder warnings
| * kepler_memory: Remove unused MemoryManager reference
| * gl_texture_cache: Add missing override
| * buffer_cache: Add missing include
| * shader/decode: Remove unused variables
* Merge pull request #2742 from ReinUsesLisp/fix-texture-buffersbunnei2019-08-291-2/+6
|\ | | | | gl_texture_cache: Miscellaneous texture buffer fixes
| * gl_shader_cache: Fix newline on buffer preprocessor definitionsReinUsesLisp2019-07-181-2/+6
| |
* | shader_ir: Implement VOTEReinUsesLisp2019-08-211-1/+3
| | Implement VOTE using Nvidia's intrinsics. Documentation about these can be found here: https://developer.nvidia.com/reading-between-threads-shader-intrinsics
| | Instead of using portable ARB instructions I opted to use Nvidia intrinsics because these are the closest we have to how Tegra X1 hardware renders.
| | To stub VOTE on non-Nvidia drivers (including nouveau) this commit simulates a GPU with a warp size of one, returning what is meaningful for the instruction being emulated:
| | * anyThreadNV(value) -> value
| | * allThreadsNV(value) -> value
| | * allThreadsEqualNV(value) -> true
| | ballotARB, also known as "uint64_t(activeThreadsNV())", emits
| |     VOTE.ANY Rd, PT, PT;
| | on nouveau's compiler. This doesn't match exactly to Nvidia's code
| |     VOTE.ALL Rd, PT, PT;
| | which is emulated with activeThreadsNV() by this commit. In theory this shouldn't really matter since .ANY, .ALL and .EQ affect the predicates (set to PT on those cases) and not the registers.
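The warp-size-one stub described above can be mirrored on the host as three trivial functions. These are illustrative stand-ins with assumed names, not the GLSL that the decompiler emits.

```cpp
// Stand-ins for the vote intrinsics on a simulated warp of exactly one thread.
// With a single thread, "any" and "all" both reduce to the thread's own value,
// and every thread trivially holds the same value as itself.
constexpr bool AnyThread(bool value) { return value; }
constexpr bool AllThreads(bool value) { return value; }
constexpr bool AllThreadsEqual(bool /*value*/) { return true; }
```

This matches the three mappings listed in the commit message: the first two pass the value through and the third is always true.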
* | Merge pull request #2734 from ReinUsesLisp/compute-shadersbunnei2019-07-221-33/+117
|\ \ | | | | | | gl_rasterizer: Implement compute shaders
| * | gl_shader_cache: Fix clang-format issuesReinUsesLisp2019-07-161-2/+1
| | |
| * | gl_shader_cache: Address review commentariesReinUsesLisp2019-07-151-7/+4
| | |
| * | gl_shader_cache: Address CI issuesReinUsesLisp2019-07-151-1/+2
| | |
| * | gl_rasterizer: Implement compute shadersReinUsesLisp2019-07-151-35/+122
| |/
* / Maxwell3D: Rework the dirty system to be more consistant and scaleableFernando Sahmkow2019-07-171-1/+1
|/
* Merge pull request #2695 from ReinUsesLisp/layer-viewportFernando Sahmkow2019-07-151-2/+5
|\ | | | | gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
| * gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shadersReinUsesLisp2019-07-081-2/+5
| | This commit implements gl_ViewportIndex and gl_Layer in vertex and geometry shaders.
| | In the case it's used in a vertex shader, it requires ARB_shader_viewport_layer_array. This extension is available on AMD and Nvidia devices (mesa and proprietary drivers), but not available on Intel on any platform. At the moment of writing this description I don't know if this is a hardware limitation or a driver limitation.
| | In the case that ARB_shader_viewport_layer_array is not available, writes to these registers on a vertex shader are ignored, with the appropriate logging.
* | shader_ir: propagate shader size to the IRFernando Sahmkow2019-07-091-7/+15
|/
* Merge pull request #2601 from FernandoS27/texture_cacheZach Hilman2019-07-051-35/+56
|\ | | | | Implement a new Texture Cache
| * texture_cache: Address FeedbackFernando Sahmkow2019-07-051-2/+4
| |
| * texture_cache: Style and CorrectionsFernando Sahmkow2019-06-211-1/+1
| |
| * shader_cache: Correct versioning and size calculation.Fernando Sahmkow2019-06-211-1/+6
| |
| * gl_shader_decompiler: Implement image binding settingsReinUsesLisp2019-06-211-0/+4
| |
| * gl_rasterizer: Track texture buffer usageReinUsesLisp2019-06-211-34/+44
| |
* | gl_shader_cache: Make CachedShader constructor privateZach Hilman2019-07-041-2/+2
| | | | | | | | Fixes missing review comments introduced.
* | gl_shader_cache: Use static constructors for CachedShader initializationReinUsesLisp2019-06-081-42/+34
|/
* gl_shader_cache: Store a system class and drop global accessorsReinUsesLisp2019-05-301-7/+8
|
* gl_shader_cache: Add commentaries explaining the intention in shaders creationReinUsesLisp2019-05-301-0/+2
|
* gl_shader_cache: Flip if condition in GetStageProgram to reduce indentationReinUsesLisp2019-05-301-25/+26
|
* gl_shader_gen: Always declare extensions after the version declarationReinUsesLisp2019-05-271-1/+2
| | | | | This addresses a bug on geometry shaders where code was being written before all #extension declarations were done. Ref to #2523
* gl_shader_cache: Fix clang strict standard build issuesReinUsesLisp2019-05-211-3/+4
|
* gl_shader_cache: Use shared contexts to build shaders in parallelReinUsesLisp2019-05-211-34/+82
|
* video_core/renderer_opengl/gl_shader_cache: Correct member initialization orderLioncash2019-05-101-1/+1
| | | | Silences a -Wreorder warning.
* Re added new lines at the end of filesFreddyFunk2019-04-231-1/+1
|
* gl_shader_disk_cache: Use VectorVfsFile for the virtual precompiled shader cache fileunknown2019-04-231-1/+11
|
* Merge pull request #2383 from ReinUsesLisp/aoffi-testbunnei2019-04-231-27/+27
|\ | | | | gl_shader_decompiler: Disable variable AOFFI on unsupported devices
| * gl_shader_decompiler: Use variable AOFFI on supported hardwareReinUsesLisp2019-04-141-27/+27
| |
* | Document unsafe versions and add BlockCopyUnsafeFernando Sahmkow2019-04-161-6/+7
| |
* | Use ReadBlockUnsafe for Shader CacheFernando Sahmkow2019-04-161-5/+7
|/
* Merge pull request #2354 from lioncash/headerbunnei2019-04-101-0/+1
|\ | | | | video_core/texures/texture: Remove unnecessary includes
| * video_core/texures/texture: Remove unnecessary includesLioncash2019-04-061-0/+1
| | | | | | | | | | | | Nothing in this header relies on common_funcs or the memory manager. This gets rid of reliance on indirect inclusions in the OpenGL caches.
* | Merge pull request #2300 from FernandoS27/null-shaderbunnei2019-04-071-0/+4
|\ \ | |/ |/| shader_cache: Permit a Null Shader in case of a bad host_ptr.
| * Permit a Null Shader in case of a bad host_ptr.Fernando Sahmkow2019-04-071-0/+4
| |
* | Merge pull request #2299 from lioncash/maxwellbunnei2019-04-041-2/+0
|\ \ | | | | | | gl_shader_manager: Remove reliance on a global accessor within MaxwellUniformData::SetFromRegs()
| * | gl_shader_manager: Remove unnecessary gl_shader_manager inclusionLioncash2019-03-281-2/+0
| |/ | | | | | | | | | | | | | | | | This isn't used at all in the OpenGL shader cache, so we can remove its include here, meaning one less file needs to be recompiled if any changes ever occur within that header. core/memory.h is also not used within this file at all, so we can remove it as well.
* / video_core: Amend constructor initializer list order where applicableLioncash2019-03-271-6/+6
|/ | | | | | | Specifies the members in the same order that initialization would take place in. This also silences -Wreorder warnings.
* gpu: Move GPUVAddr definition to common_types.bunnei2019-03-211-2/+2
|
* video_core: Refactor to use MemoryManager interface for all memory access.bunnei2019-03-161-20/+17
| | | | | | | | | | | # Conflicts: # src/video_core/engines/kepler_memory.cpp # src/video_core/engines/maxwell_3d.cpp # src/video_core/morton.cpp # src/video_core/morton.h # src/video_core/renderer_opengl/gl_global_cache.cpp # src/video_core/renderer_opengl/gl_global_cache.h # src/video_core/renderer_opengl/gl_rasterizer_cache.cpp
* gpu: Use host address for caching instead of guest address.bunnei2019-03-151-19/+24
|
* gl_shader_disk_cache: Use unordered containersReinUsesLisp2019-02-071-3/+3
|
* gl_shader_cache: Fixup GLSL unique identifiersReinUsesLisp2019-02-071-2/+2
|
* gl_shader_cache: Link loading screen with disk shader cache loadReinUsesLisp2019-02-071-3/+26
|
* gl_shader_cache: Set GL_PROGRAM_SEPARABLE to dumped shadersReinUsesLisp2019-02-071-0/+1
| | | | | | i965 (and probably all Mesa drivers) require GL_PROGRAM_SEPARABLE when using glProgramBinary. This is probably required by the standard but it's ignored by permissive proprietary drivers.
* gl_shader_disk_cache: Pass core system as argument and guard against games without title idsReinUsesLisp2019-02-071-1/+2
|
* gl_shader_disk_cache: Address miscellaneous feedbackReinUsesLisp2019-02-071-3/+3
|
* gl_shader_disk_cache: Pass return values returning instead of by parametersReinUsesLisp2019-02-071-7/+5
|
* gl_shader_disk_cache: Save GLSL and entries into the precompiled fileReinUsesLisp2019-02-071-32/+39
|
* gl_shader_cache: Refactor to support disk shader cacheReinUsesLisp2019-02-071-105/+345
|
* rasterizer_interface: Add disk cache entry for the rasterizerReinUsesLisp2019-02-071-0/+2
|
* video_core: Assert on invalid GPU to CPU address queriesReinUsesLisp2019-02-031-2/+4
|
* gl_shader_cache: Use explicit bindingsReinUsesLisp2019-01-301-63/+83
|
* gl_rasterizer: Implement global memory managementReinUsesLisp2019-01-301-3/+15
|
* video_core: Rename glsl_decompiler to gl_shader_decompilerReinUsesLisp2019-01-151-1/+1
|
* video_core: Replace gl_shader_decompilerReinUsesLisp2019-01-151-2/+6
|
* gl_shader_cache: Use dirty flags for shadersReinUsesLisp2019-01-071-1/+5
|
* gl_shader_cache: Dehardcode constant in CalculateProgramSize()Lioncash2018-12-111-2/+2
| | | | This constant is related to the size of the instruction.
* gl_shader_cache: Resolve truncation compiler warningLioncash2018-12-111-1/+1
| | | | | The previous code would cause a warning, as it was truncating size_t (64-bit) to a u32 (32-bit) implicitly.
* Implemented a shader unique identifier.Fernando Sahmkow2018-12-091-0/+45
|
* shader_cache: Only lock covered instructions.Markus Wick2018-11-201-0/+1
|
* Merge pull request #1669 from ReinUsesLisp/fixup-gsbunnei2018-11-111-2/+6
|\ | | | | gl_shader_decompiler: Guard out of bound geometry shader input reads
| * gl_shader_decompiler: Guard out of bound geometry shader input readsReinUsesLisp2018-11-101-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | Geometry shaders follow a pattern that results in out of bound reads. This pattern is: - VSETP to predicate - Use that predicate to conditionally set a register to a big number - Use the register to access geometry shaders At the time of writing this commit I don't know what the intent of this number is. Some drivers complain about these out of bound reads. To avoid this issue, input reads are guarded, limiting reads to the highest possible vertex input of the current topology (e.g. points to 1 and triangles to 3).
* | rasterizer_cache: Remove reliance on the System singletonLioncash2018-11-081-1/+3
|/ | | | | | Rather than have a transparent dependency, we can make it explicit in the interface. This also gets rid of the need to put the core include in a header.
* video_core: Move OpenGL specific utils to its rendererReinUsesLisp2018-10-291-2/+3
|
* gl_shader_decompiler: Implement geometry shadersReinUsesLisp2018-10-071-5/+29
|
* Added glObjectLabels for renderdoc for textures and shader programs (#1384)David2018-09-231-0/+2
| | | | | | | | * Added glObjectLabels for renderdoc for textures and shader programs * Changed hardcoded "Texture" name to reflect the texture type instead * Removed string initialize
* Port #4182 from Citra: "Prefix all size_t with std::"fearlessTobi2018-09-151-3/+3
|
* video_core: fixed arithmetic overflow warnings & improved code stylePatrick Elsässer2018-09-091-4/+4
| | | | | | | | - Fixed all warnings, for renderer_opengl items, which were indicating a possible incorrect behavior from integral promotion rules and types larger than those in which arithmetic is typically performed. - Added const for variables where possible and meaningful. - Added constexpr where possible.
* gl_shader_cache: Use an u32 for the binding point cache.Markus Wick2018-09-041-8/+8
| | | | | | | The std::string generation with its malloc and free requirement was a noticeable overhead. Also switch to an ordered_map to avoid the std::hash call. As those maps usually have a size of two elements, the lookup time shall not matter.
* gl_renderer: Cache textures, framebuffers, and shaders based on CPU address.bunnei2018-08-311-11/+7
|
* gl_shader_cache: Remove unused program_code vector in GetShaderAddress()Lioncash2018-08-281-2/+1
| | | | | | | | Given std::vector is a type with a non-trivial destructor, this variable cannot be optimized away by the compiler, even if unused. Because of that, something that was intended to be fairly lightweight was actually allocating 32KB and deallocating it at the end of the function.
* renderer_opengl: Implement a new shader cache.bunnei2018-08-281-0/+131