anonymous/yuzu - yuzu is the world's most popular, open-source, Nintendo Switch emulator — started by the creators of Citra. It is written in C++ with portability in mind,

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge pull request #4711 from lioncash/move5	bunnei	2020-09-25	1	-16/+19
\|\ \| \| \| \|	arithmetic_integer_immediate: Make use of std::move where applicable
\| *	arithmetic_integer_immediate: Make use of std::move where applicable	Lioncash	2020-09-24	1	-16/+19
\| \| \| \| \| \| \| \| \| \|	Same behavior, minus any redundant atomic reference count increments and decrements.
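The reference-count effect mentioned above can be sketched in isolation. This is a minimal illustration, assuming the moved objects are `std::shared_ptr` handles; `NodeData`, `Node` and `Holder` are hypothetical names and not yuzu's actual types.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for a reference-counted IR node (not yuzu's type).
struct NodeData {
    int value = 0;
};
using Node = std::shared_ptr<NodeData>;

// Takes the node by value and moves it into place.
struct Holder {
    explicit Holder(Node node_) : node{std::move(node_)} {}
    Node node;
};

// Copying: the caller keeps its reference, so the count is incremented.
inline long UseCountAfterCopy() {
    Node node = std::make_shared<NodeData>();
    Holder holder{node};
    return node.use_count(); // `node` and `holder.node` both own the object
}

// Moving: ownership is transferred, so no increment/decrement pair is needed.
inline long UseCountAfterMove() {
    Node node = std::make_shared<NodeData>();
    Holder holder{std::move(node)};
    return holder.node.use_count();
}
```

Both functions end up with a usable `Holder`; the moved variant simply never touches the atomic counter a second time.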
* \|	Merge pull request #4674 from ReinUsesLisp/timeline-semaphores	bunnei	2020-09-24	1	-0/+11
\|\ \ \| \| \| \| \| \|	renderer_vulkan: Make unconditional use of VK_KHR_timeline_semaphore
\| * \|	renderer_vulkan: Make unconditional use of VK_KHR_timeline_semaphore	ReinUsesLisp	2020-09-19	1	-0/+11
\| \| \|	This reworks how host<->device synchronization works on the Vulkan backend. Instead of "protecting" resources with a fence and signalling these as free when the fence is known to be signalled by the host GPU, use timeline semaphores. Vulkan timeline semaphores allow us to work on a subset of D3D12 fences. As far as we are concerned, timeline semaphores are a value set by the host or the device that can be waited on by either of them. Taking advantage of this, we can have a monotonically increasing atomic value for each submission to the graphics queue. Instead of protecting resources with a fence, we simply store the current logical tick (the atomic value stored in CPU memory). When we want to know if a resource is free, it can be compared to the current GPU tick. This greatly simplifies resource management code and the free status of resources should have fewer false negatives. To work around bugs in validation layers, when these are attached there's a thread waiting for timeline semaphores.
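The tick comparison described in this commit message can be sketched without any Vulkan calls. The class below is a hypothetical illustration: `TickTracker`, `Submit` and `OnGpuSignal` are assumed names, and `OnGpuSignal` merely stands in for reading the timeline semaphore's counter.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sketch of tick-based resource tracking (names are assumptions).
class TickTracker {
public:
    // Tick that resources used right now should remember.
    std::uint64_t CurrentTick() const noexcept {
        return logical_tick.load(std::memory_order_relaxed);
    }

    // Called once per queue submission. Returns the value the submission will
    // signal when it completes and advances the logical tick.
    std::uint64_t Submit() noexcept {
        return logical_tick.fetch_add(1, std::memory_order_relaxed);
    }

    // Stand-in for observing the semaphore value signalled by the device.
    void OnGpuSignal(std::uint64_t value) noexcept {
        gpu_tick.store(value, std::memory_order_release);
    }

    // A resource is free once the GPU has reached the tick it was used in.
    bool IsFree(std::uint64_t resource_tick) const noexcept {
        return gpu_tick.load(std::memory_order_acquire) >= resource_tick;
    }

private:
    std::atomic<std::uint64_t> logical_tick{1};
    std::atomic<std::uint64_t> gpu_tick{0};
};
```

A resource only needs to store one integer, and checking whether it is free is a single comparison rather than a fence query.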
* \| \|	control_flow: emplace elements in place within TryQuery()	Lioncash	2020-09-23	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Constructs data structures directly where they would eventually be moved to, avoiding the need to move them in the first place.
* \| \|	control_flow: Make use of std::move in InsertBranch()	Lioncash	2020-09-23	1	-7/+8
\| \|/ \|/\| \| \| \| \|	Avoids unnecessary atomic increments and decrements.
* \|	General: Make use of std::nullopt where applicable	Lioncash	2020-09-22	2	-18/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allows some implementations to avoid completely zeroing out the internal buffer of the optional, and instead only set the validity byte within the structure. This also makes it consistent how we return empty optionals.
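A small sketch of the convention this commit describes, returning `std::nullopt` for the empty case. `ParseDigit` is a hypothetical function used only for illustration.

```cpp
#include <optional>

// Hypothetical helper: returns the digit value, or an empty optional.
inline std::optional<int> ParseDigit(char c) {
    if (c < '0' || c > '9') {
        return std::nullopt; // explicit, consistent way to signal "no value"
    }
    return c - '0';
}
```

Callers test the result with `has_value()` or a plain boolean check before dereferencing it.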
* \|	Merge pull request #4672 from lioncash/narrowing	Rodrigo Locatti	2020-09-17	1	-1/+1
\|\ \ \| \| \| \| \| \|	decoder/texture: Eliminate narrowing conversion in GetTldCode()
\| * \|	decoder/texture: Eliminate narrowing conversion in GetTldCode()	Lioncash	2020-09-17	1	-1/+1
\| \|/ \| \| \| \| \| \|	The assignment was previously truncating a u64 value to a bool.
* /	decode/image: Eliminate switch fallthrough in DecodeImage()	Lioncash	2020-09-17	1	-0/+1
\|/ \| \| \| \|	Fortunately this didn't result in any issues, given the block that code was falling through to would immediately break.
*	video_core: Enforce -Werror=switch	ReinUsesLisp	2020-09-16	2	-4/+13
\| \| \| \|	This forces us to fix all -Wswitch warnings in video_core.
*	video_core: Remove all Core::System references in renderer	ReinUsesLisp	2020-09-06	2	-9/+4
\| \| \| \| \| \| \| \| \|	Now that the GPU is initialized when video backends are initialized, it's no longer needed to query components once the game is running: it can be done when yuzu is booting. This allows us to pass components between constructors and in the process remove all Core::System references in the video backend.
*	Merge pull request #4575 from lioncash/async	bunnei	2020-09-03	2	-17/+15
\|\ \| \| \| \|	async_shaders: Mark getters as const member functions
\| *	async_shaders: Mark getters as const member functions	Lioncash	2020-08-24	2	-17/+15
\| \| \| \| \| \| \| \|	While we're at it, we can also mark them as nodiscard.
* \|	Merge pull request #4524 from lioncash/memory-log	bunnei	2020-08-27	1	-1/+2
\|\ \ \| \|/ \|/\|	shader/memory: Amend UNIMPLEMENTED_IF_MSG without a message
\| *	shader/memory: Amend UNIMPLEMENTED_IF_MSG without a message	Lioncash	2020-08-14	1	-1/+2
\| \| \| \| \| \| \| \| \| \|	We need to provide a message for this variant of the macro, so we can simply log out the type being used.
* \|	Merge pull request #4443 from ameerj/vk-async-shaders	David	2020-08-17	2	-30/+99
\|\ \ \| \| \| \| \| \|	vulkan_renderer: Async shader/graphics pipeline compilation
\| * \|	Remove unneeded newlines, optional Registry in shader params	ameerj	2020-08-16	2	-8/+4
\| \| \| \| \| \| \| \| \| \| \| \|	Addressing feedback from Rodrigo
\| * \|	Morph: Update worker allocation comment	Ameer J	2020-08-16	1	-1/+1
\| \| \| \| \| \| \| \| \|	Co-authored-by: Morph <39850852+Morph1984@users.noreply.github.com>
\| * \|	move thread 1/4 count computation into allocate workers method	ameerj	2020-08-16	2	-3/+12
\| \| \|
\| * \|	Address feedback, add shader compile notifier, update setting text	ameerj	2020-08-16	2	-68/+65
\| \| \|
\| * \|	Vk Async Worker directly emplace in cache	ameerj	2020-08-16	1	-53/+25
\| \| \|
\| * \|	Address feedback. Bruteforce delete duplicates	ameerj	2020-08-16	2	-61/+78
\| \| \|
\| * \|	Vk Async pipeline compilation	ameerj	2020-08-16	2	-6/+84
\| \|/
* /	async_shaders: Resolve -Wpessimizing-move warning	Lioncash	2020-08-14	1	-2/+2
\|/ \| \| \| \|	Prevents pessimization of the move constructor (which thankfully didn't actually happen in practice here, given std::thread isn't copyable).
*	General: Tidy up clang-format warnings part 2	Lioncash	2020-08-13	2	-17/+19
\|
*	Merge pull request #4391 from lioncash/nrvo	bunnei	2020-07-24	4	-22/+22
\|\ \| \| \| \|	video_core: Allow copy elision to take place where applicable
\| *	video_core: Allow copy elision to take place where applicable	Lioncash	2020-07-21	4	-22/+22
\| \| \| \| \| \| \| \| \| \|	Removes const from some variables that are returned from functions, as this allows the move assignment/constructors to execute for them.
* \|	Merge pull request #4361 from ReinUsesLisp/lane-id	Rodrigo Locatti	2020-07-21	1	-2/+1
\|\ \ \| \| \| \| \| \|	decode/other: Implement S2R.LaneId
\| * \|	decode/other: Implement S2R.LaneId	ReinUsesLisp	2020-07-16	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This maps to host's thread id. - Fixes graphical issues on Paper Mario.
* \| \|	Merge pull request #4324 from ReinUsesLisp/formats	bunnei	2020-07-21	1	-27/+27
\|\ \ \ \| \|_\|/ \|/\| \|	video_core: Fix, add and rename pixel formats
\| * \|	video_core: Rearrange pixel format names	ReinUsesLisp	2020-07-13	1	-27/+27
\| \|/ \| \| \| \| \| \| \| \| \| \|	Normalizes pixel format names to match Vulkan names. Previous to this commit pixel formats had no convention, leading to confusion and potential bugs.
* \|	Fix style issues	David Marcec	2020-07-18	1	-4/+10
\| \|
* \|	Remove duplicate config	David Marcec	2020-07-17	1	-0/+1
\| \|
* \|	Use conditional var	David Marcec	2020-07-17	2	-9/+15
\| \|
* \|	async shaders	David Marcec	2020-07-17	2	-0/+277
\|/
*	Merge pull request #4147 from ReinUsesLisp/hset2-imm	bunnei	2020-06-27	1	-21/+67
\|\ \| \| \| \|	shader/half_set: Implement HSET2_IMM
\| *	shader/half_set: Implement HSET2_IMM	ReinUsesLisp	2020-06-23	1	-21/+67
\| \| \| \| \| \| \| \| \| \| \| \|	Add HSET2_IMM. Due to the complexity of the encoding avoid using BitField unions and read the relevant bits from the code itself. This is less error prone.
* \|	Merge pull request #4083 from Morph1984/B10G11R11F	bunnei	2020-06-24	1	-9/+17
\|\ \ \| \|/ \|/\|	decode/image: Implement B10G11R11F
\| *	decode/image: Implement B10G11R11F	Morph	2020-06-20	1	-9/+17
\| \| \| \| \| \| \| \|	- Used by Kirby Star Allies
* \|	memory_util: boost hashes are size_t	MerryMage	2020-06-18	1	-2/+2
\|/ \| \| \| \|	* boost::hash_value returns a size_t * boost::hash_combine takes a size_t& argument
*	shader/texture: Join separate image and sampler pairs offline	ReinUsesLisp	2020-06-05	7	-69/+146
\|	Games using D3D idioms can join images and samplers when a shader executes, instead of baking them into a combined sampler image. This is also possible on Vulkan. One approach would be to use separate samplers on Vulkan and leave this unimplemented on OpenGL, but we can't do this because there's no consistent way of determining which constant buffer holds a sampler and which one an image. We could in theory find the first bit and if it's in the TIC area, it's an image; but this falls apart when an image or sampler handle uses an index of zero. The approach used is to track a LOP.OR operation (this is done at an IR level, not at an ISA level), track again the constant buffers used as source and store this pair. Then, outside of shader execution, join the sampler and image pair with a bitwise or operation. This approach won't work on games that truly use separate samplers in a meaningful way. For example, pooling textures in a 2D array and determining at runtime what sampler to use. This invalidates OpenGL's disk shader cache :) - Used mostly by D3D ports to Switch
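The offline join described above amounts to a bitwise OR of two handles read from constant buffers. The sketch below assumes a 32-bit handle with the image (TIC) index in the low 20 bits and the sampler (TSC) index in the high 12 bits; that layout and all names here are assumptions made for illustration.

```cpp
#include <cstdint>

// Assumed layout: low 20 bits = image index, high 12 bits = sampler index.
constexpr std::uint32_t kImageBits = 20;
constexpr std::uint32_t kImageMask = (1u << kImageBits) - 1;

// Joins a separate image handle and sampler handle into a combined handle,
// as the shader itself would have done with LOP.OR.
constexpr std::uint32_t JoinHandles(std::uint32_t image_handle,
                                    std::uint32_t sampler_handle) {
    return image_handle | sampler_handle;
}

constexpr std::uint32_t ImageIndex(std::uint32_t handle) {
    return handle & kImageMask;
}

constexpr std::uint32_t SamplerIndex(std::uint32_t handle) {
    return handle >> kImageBits;
}
```

A raw handle of zero decodes to index zero in both fields, which is why inspecting the bits alone cannot tell an image handle from a sampler handle.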
*	shader/track: Move bindless tracking to a separate function	ReinUsesLisp	2020-06-05	2	-25/+39
\|
*	Merge pull request #4016 from ReinUsesLisp/invocation-info	LC	2020-06-02	1	-1/+1
\|\ \| \| \| \|	shader/other: Fix hardcoded value in S2R INVOCATION_INFO
\| *	shader/other: Fix hardcoded value in S2R INVOCATION_INFO	ReinUsesLisp	2020-05-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Geometry shaders built from Nvidia's compiler check for bits[16:23] to be less than or equal to 0 with VSETP to default to a "safe" value of 0x8000'0000 (safe from hardware's perspective). To avoid hitting this path in the shader, return 0x00ff'0000 from S2R INVOCATION_INFO. This seems to be the maximum number of vertices a geometry shader can emit in a primitive.
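The check described above can be mirrored on the CPU side as a short sketch. The helper names are hypothetical; only the bit range and the returned constant come from the commit message.

```cpp
#include <cstdint>

// Value returned from S2R INVOCATION_INFO according to the commit message.
constexpr std::uint32_t kInvocationInfo = 0x00ff'0000;

// Extracts bits [16:23], the field the generated geometry shaders inspect.
constexpr std::uint32_t InvocationField(std::uint32_t invocation_info) {
    return (invocation_info >> 16) & 0xFFu;
}

// Mirrors the shader-side test: a field that is not greater than zero makes
// the shader fall back to its "safe" default value.
constexpr bool TakesFallbackPath(std::uint32_t invocation_info) {
    return InvocationField(invocation_info) == 0;
}
```

With the returned constant the field reads as 0xFF, so the fallback path is not taken.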
* \|	shader/other: Implement MEMBAR.CTS	ReinUsesLisp	2020-05-27	2	-4/+15
\|/ \| \| \| \|	This silences an assertion we were hitting and uses workgroup memory barriers when the game requests it.
*	Merge pull request #3981 from ReinUsesLisp/bar	bunnei	2020-05-26	2	-0/+6
\|\ \| \| \| \|	shader/other: Implement BAR.SYNC 0x0
\| *	shader/other: Implement BAR.SYNC 0x0	ReinUsesLisp	2020-05-22	2	-0/+6
\| \| \| \| \| \| \| \| \| \|	Trivially implement this particular case of BAR. Unless games use OpenCL or CUDA barriers, we shouldn't hit any other case here.
* \|	Merge pull request #3980 from ReinUsesLisp/red-op	bunnei	2020-05-26	1	-2/+1
\|\ \ \| \| \| \| \| \|	shader/memory: Implement non-addition operations in RED
\| * \|	shader/memory: Implement non-addition operations in RED	ReinUsesLisp	2020-05-22	1	-2/+1
\| \|/ \| \| \| \| \| \|	Trivially implement these instructions. They are used in Astral Chain.
* /	shader/other: Implement thread comparisons (NV_shader_thread_group)	ReinUsesLisp	2020-05-22	2	-0/+26
\|/ \| \| \| \| \| \| \| \| \| \|	Hardware S2R special registers match gl_Thread*MaskNV. We can trivially implement these using Nvidia's extension on OpenGL or naively stubbing them with the ARB instructions to match. This might cause issues if the host device warp size doesn't match Nvidia's. That said, this is unlikely on proper shaders. Refer to the attached url for more documentation about these flags. https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_group.txt
*	shader_ir: Separate float-point comparisons in ordered and unordered	ReinUsesLisp	2020-05-09	4	-78/+66
\| \| \| \| \|	This allows us to use native SPIR-V instructions without having to manually check for NAN.
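The difference between the two comparison families can be sketched in plain C++. These helpers are illustrative only; they spell out the NaN handling that an ordered or unordered comparison instruction performs natively.

```cpp
#include <cmath>

// Ordered: true only if neither operand is NaN and the relation holds.
inline bool OrderedLessThan(float a, float b) {
    return !std::isnan(a) && !std::isnan(b) && a < b;
}

// Unordered: true if either operand is NaN or the relation holds.
inline bool UnorderedLessThan(float a, float b) {
    return std::isnan(a) || std::isnan(b) || a < b;
}
```

Keeping the two variants distinct in the IR means a backend with matching native instructions does not need to emit the explicit `isnan` checks.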
*	Merge pull request #3693 from ReinUsesLisp/clean-samplers	bunnei	2020-05-02	5	-223/+168
\|\ \| \| \| \|	shader/texture: Support multiple unknown sampler properties
\| *	shader/texture: Support multiple unknown sampler properties	ReinUsesLisp	2020-04-23	2	-62/+87
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This allows deducing some properties from the texture instruction before asking the runtime. By doing this we can handle type mismatches in some instructions from the renderer instead of the shader decoder. Fixes texelFetch issues with games using 2D texture instructions on a 1D sampler.
\| *	shader_ir: Turn classes into data structures	ReinUsesLisp	2020-04-23	5	-182/+102
\| \|
* \|	Merge pull request #3799 from ReinUsesLisp/iadd-cc	bunnei	2020-04-30	3	-27/+58
\|\ \ \| \| \| \| \| \|	shader: Implement P2R CC, IADD Rd.CC and IADD.X
\| * \|	shader/arithmetic_integer: Fix tracking issue in temporary	ReinUsesLisp	2020-04-28	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This temporary is not needed as we mark Rd.CC + IADD.X as unimplemented. It caused issues when tracking global buffers.
\| * \|	shader/arithmetic_integer: Fix edge case and mark IADD.X Rd.CC as unimplemented	ReinUsesLisp	2020-04-26	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	IADD.X Rd.CC requires some extra logic that is not currently implemented. Abort when this is hit.
\| * \|	shader/arithmetic_integer: Change IAdd to UAdd to avoid signed overflow	ReinUsesLisp	2020-04-26	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed integer addition overflow might be undefined behavior. It's free to change operations to UAdd and use unsigned integers to avoid potential bugs.
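The same idea can be illustrated in C++, where signed overflow is undefined but unsigned arithmetic wraps. `WrappingAdd` is a hypothetical helper, not code from the commit.

```cpp
#include <cstdint>
#include <cstring>

// Adds two signed values with well-defined wraparound by doing the
// arithmetic on unsigned integers and reinterpreting the bits afterwards.
inline std::int32_t WrappingAdd(std::int32_t a, std::int32_t b) {
    const std::uint32_t sum =
        static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b);
    std::int32_t result;
    std::memcpy(&result, &sum, sizeof(result)); // bit-exact reinterpretation
    return result;
}
```

The bit pattern of the result is identical to what a two's complement signed addition would produce, which is why the change is described as free.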
\| * \|	shader/arithmetic_integer: Implement IADD.X	ReinUsesLisp	2020-04-26	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	IADD.X takes the carry flag and adds it to the result. This is generally used to emulate 64-bit operations with 32-bit registers.
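How a carry flag lets two 32-bit additions emulate one 64-bit addition can be sketched as follows. The function names are hypothetical; the first addition plays the role of IADD writing the carry (CC), the second the role of IADD.X.

```cpp
#include <cstdint>

struct AddResult {
    std::uint32_t value;
    bool carry;
};

// 32-bit addition that consumes a carry-in and reports the carry-out.
constexpr AddResult AddWithCarry(std::uint32_t a, std::uint32_t b, bool carry_in) {
    const std::uint64_t wide = std::uint64_t{a} + b + (carry_in ? 1u : 0u);
    return {static_cast<std::uint32_t>(wide), (wide >> 32) != 0};
}

// 64-bit addition built from two 32-bit halves.
constexpr std::uint64_t Add64(std::uint32_t lo_a, std::uint32_t hi_a,
                              std::uint32_t lo_b, std::uint32_t hi_b) {
    const AddResult lo = AddWithCarry(lo_a, lo_b, false);    // sets the carry
    const AddResult hi = AddWithCarry(hi_a, hi_b, lo.carry); // consumes it
    return (std::uint64_t{hi.value} << 32) | lo.value;
}
```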
\| * \|	shader/arithmetic_integer: Implement CC for IADD	ReinUsesLisp	2020-04-26	2	-3/+21
\| \| \|
\| * \|	decode/register_set_predicate: Implement CC	ReinUsesLisp	2020-04-26	1	-9/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	P2R CC takes the state of condition codes and puts them into a register. We already have this implemented for PR (predicates). This commit implements CC over that.
\| * \|	decode/register_set_predicate: Use move for shared pointers	ReinUsesLisp	2020-04-26	1	-16/+17
\| \| \| \| \| \| \| \| \| \| \| \|	Avoid atomic counters used by shared pointers.
* \| \|	Merge pull request #3788 from FernandoS27/revert	bunnei	2020-04-30	1	-14/+6
\|\ \ \ \| \| \| \| \| \| \| \|	Revert: shader_decode: Fix LD, LDG when track constant buffer.
\| * \| \|	Revert: shader_decode: Fix LD, LDG when track constant buffer.	Fernando Sahmkow	2020-04-24	1	-14/+6
\| \| \|/ \| \|/\|
* \| \|	Merge pull request #3784 from ReinUsesLisp/shader-memory-util	bunnei	2020-04-28	5	-24/+127
\|\ \ \ \| \|_\|/ \|/\| \|	shader/memory_util: Deduplicate code
\| * \|	shader/memory_util: Deduplicate code	ReinUsesLisp	2020-04-26	5	-24/+127
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Deduplicate code shared between vk_pipeline_cache and gl_shader_cache as well as shader decoder code. While we are at it, fix a bug in gl_shader_cache where compute shaders had a start offset of a stage shader.
* \|	Merge pull request #3734 from ReinUsesLisp/half-float-mods	bunnei	2020-04-25	1	-14/+37
\|\ \ \| \| \| \| \| \|	decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits
\| * \|	decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits	ReinUsesLisp	2020-04-23	1	-14/+37
\| \| \|	The encoding for negation and absolute value was wrong. Extracting is now done manually. Similar instructions having different encodings is the rule, not the exception. To keep sanity and readability I preferred to extract the desired bit manually. This is implemented against nxas: https://github.com/ReinUsesLisp/nxas/blob/8dbc38995711cc12206aa370145a3a02665fd989/table.h#L68 That is itself tested against nvdisasm (Nvidia's official disassembler).
* \| \|	Merge pull request #3749 from ReinUsesLisp/lea-imm	bunnei	2020-04-24	1	-2/+2
\|\ \ \ \| \|_\|/ \|/\| \|	shader/arithmetic_integer: Fix LEA_IMM encoding
\| * \|	shader/arithmetic_integer: Fix LEA_IMM encoding	ReinUsesLisp	2020-04-21	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The operand order in LEA_IMM was flipped compared to nvdisasm. Fix that using nxas as reference: https://github.com/ReinUsesLisp/nxas/blob/8dbc38995711cc12206aa370145a3a02665fd989/table.h#L122
* \| \|	Merge pull request #3697 from lioncash/declarations	bunnei	2020-04-23	1	-2/+2
\|\ \ \ \| \| \| \| \| \| \| \|	CMakeLists: Enable -Wmissing-declarations on Linux builds
\| * \| \|	General: Resolve warnings related to missing declarations	Lioncash	2020-04-17	1	-2/+2
\| \| \|/ \| \|/\|
* \| \|	Merge pull request #3698 from lioncash/warning	bunnei	2020-04-21	2	-12/+13
\|\ \ \ \| \|_\|/ \|/\| \|	General: Resolve minor assorted warnings
\| * \|	decode/memory: Resolve unused variable warning	Lioncash	2020-04-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Only the first element of the returned pair is ever used.
\| * \|	decode/texture: Resolve unused variable warnings.	Lioncash	2020-04-17	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some variables aren't used, so we can remove these. Unfortunately, diagnostics are still reported on structured bindings even when annotated with [[maybe_unused]], so we need to unpack the elements that we want to use manually.
\| * \|	decode/texture: Collapse loop down into std::generate	Lioncash	2020-04-17	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Same behavior, less code.
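A generic illustration of replacing a fill loop with `std::generate`. The function below is hypothetical and unrelated to the actual decoder code.

```cpp
#include <algorithm>
#include <array>

// Fills the array with 0, 1, 2, 3 using std::generate instead of a
// hand-written indexed loop.
inline std::array<int, 4> MakeSequence() {
    std::array<int, 4> values{};
    int next = 0;
    std::generate(values.begin(), values.end(), [&next] { return next++; });
    return values;
}
```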
\| * \|	decode/texture: Eliminate trivial missing field initializer warnings	Lioncash	2020-04-17	1	-3/+4
\| \|/ \| \| \| \| \| \|	We can just specify the initializers.
* \|	Merge pull request #3679 from lioncash/track	bunnei	2020-04-19	1	-5/+6
\|\ \ \| \|/ \|/\|	track: Eliminate redundant copies
\| *	track: Eliminate redundant copies	Lioncash	2020-04-16	1	-5/+6
\| \| \| \| \| \| \| \| \| \|	Two variables can be references, while two others can be std::moved. Makes for four fewer atomic reference count increments and decrements.
* \|	Merge pull request #3673 from lioncash/extra	bunnei	2020-04-17	3	-10/+15
\|\ \ \| \| \| \| \| \|	CMakeLists: Specify -Wextra on linux builds
\| * \|	CMakeLists: Specify -Wextra on linux builds	Lioncash	2020-04-16	3	-10/+15
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allows reporting more cases where logic errors may exist, such as implicit fallthrough cases, etc. We currently ignore unused parameters, since we currently have many cases where this is intentional (virtual interfaces). While we're at it, we can also tidy up any existing code that causes warnings. This also uncovered a few bugs.
* \|	Merge pull request #3689 from lioncash/unused-var	Rodrigo Locatti	2020-04-16	1	-1/+0
\|\ \ \| \| \| \| \| \|	decode/shift: Remove unused variable within Shift()
\| * \|	decode/shift: Remove unused variable within Shift()	Lioncash	2020-04-16	1	-1/+0
\| \|/ \| \| \| \| \| \| \| \|	Removes a redundant variable that is already satisfied by the IsFull() utility function.
* /	control_flow: Make use of std::move in TryInspectAddress()	Lioncash	2020-04-16	1	-3/+3
\|/ \| \| \|	Eliminates redundant atomic reference count increments and decrements.
*	Merge pull request #3612 from ReinUsesLisp/red	Fernando Sahmkow	2020-04-15	2	-43/+71
\|\ \| \| \| \|	shader/memory: Implement RED.E.ADD and minor changes to ATOM
\| *	shader/memory: Implement RED.E.ADD	ReinUsesLisp	2020-04-06	2	-1/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implements a reduction operation. It's an atomic operation that doesn't return a value. This commit introduces another primitive because some shading languages might have a primitive for reduction operations.
\| *	shader/memory: Add "using std::move"	ReinUsesLisp	2020-04-06	1	-11/+13
\| \|
\| *	shader/memory: Minor fixes in ATOM	ReinUsesLisp	2020-04-06	1	-32/+30
\| \|
* \|	shader/arithmetic: Add FCMP_CR variant	ReinUsesLisp	2020-04-15	1	-1/+2
\| \| \| \| \| \| \| \|	Adds another variant of FCMP.
* \|	Merge pull request #3619 from ReinUsesLisp/i2i	Mat M	2020-04-13	1	-13/+100
\|\ \ \| \| \| \| \| \|	shader/conversion: Implement I2I sign extension, saturation and selection
\| * \|	shader/conversion: Implement I2I sign extension, saturation and selection	ReinUsesLisp	2020-04-07	1	-13/+100
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reimplements I2I adding sign extension, saturation (clamp source value to the destination), selection and destination sizes that are not 32 bits wide. It doesn't implement CC yet.
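Sign extension and saturation, two of the behaviors named above, can be sketched on the CPU. These helpers are hypothetical and assume `bits` is between 1 and 32; they are not the shader IR the commit generates.

```cpp
#include <algorithm>
#include <cstdint>

// Sign-extends the low `bits` bits of `value` to a full 32-bit integer.
constexpr std::int32_t SignExtend(std::uint32_t value, unsigned bits) {
    const std::uint32_t sign_bit = 1u << (bits - 1);
    const std::uint32_t mask = (sign_bit << 1) - 1; // wraps to all ones for 32
    const std::uint32_t low = value & mask;
    return static_cast<std::int32_t>(static_cast<std::int64_t>(low ^ sign_bit) -
                                     static_cast<std::int64_t>(sign_bit));
}

// Clamps `value` to the range representable by a signed `bits`-wide integer.
constexpr std::int32_t SaturateSigned(std::int32_t value, unsigned bits) {
    const std::int64_t max = (std::int64_t{1} << (bits - 1)) - 1;
    const std::int64_t min = -max - 1;
    return static_cast<std::int32_t>(std::clamp<std::int64_t>(value, min, max));
}
```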
* \| \|	Merge pull request #3633 from ReinUsesLisp/clean-texdec	Mat M	2020-04-13	1	-14/+0
\|\ \ \ \| \| \| \| \| \| \| \|	shader/texture: Remove type mismatches management from shader decoder
\| * \| \|	shader/texture: Remove type mismatches management from shader decoder	ReinUsesLisp	2020-04-10	1	-14/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since commit e22816a5bb we handle type mismatches from the CPU. We don't need to hack our shader decoder due to game bugs anymore. Removed in this commit.
* \| \| \|	Merge pull request #3578 from ReinUsesLisp/vmnmx	Fernando Sahmkow	2020-04-12	2	-0/+61
\|\ \ \ \ \| \|/ / / \|/\| \| \|	shader/video: Partially implement VMNMX
\| * \| \|	shader/video: Partially implement VMNMX	ReinUsesLisp	2020-04-12	2	-0/+61
\| \| \| \|	Implements the common usages for VMNMX. Inputs with a different size than 32 bits are not supported and sign mismatches aren't supported either. VMNMX works as follows: It grabs Ra and Rb and applies a maximum/minimum on them (this is defined by .MX), taking the input sign into account. This result can then be saturated. After the intermediate result is calculated, it applies another operation on it using Rc. These operations are merges, accumulations or another min/max pass. This instruction allows a more flexible implementation of, for instance, GCN's min3 and max3 instructions.
* \| \| \|	Merge pull request #3601 from ReinUsesLisp/some-shader-encodings	bunnei	2020-04-09	2	-3/+12
\|\ \ \ \ \| \| \| \| \| \| \| \| \| \|	video_core/shader: Add some instruction and S2R encodings
\| * \| \| \|	shader/other: Add error message for some S2R registers	ReinUsesLisp	2020-04-04	1	-0/+6
\| \| \| \| \|
\| * \| \| \|	shader_bytecode: Rename MOV_SYS to S2R	ReinUsesLisp	2020-04-04	1	-3/+3
\| \| \| \| \|
\| * \| \| \|	shader_ir: Add error message for EXIT.FCSM_TR	ReinUsesLisp	2020-04-04	1	-0/+3
\| \| \|_\|/ \| \|/\| \|
* \| \| \|	Merge pull request #3489 from namkazt/patch-2	Rodrigo Locatti	2020-04-07	2	-11/+353
\|\ \ \ \ \| \|_\|_\|/ \|/\| \| \|	shader: implement SULD.D bits32/64
\| * \| \|	address nit.	Nguyen Dac Nam	2020-04-07	1	-1/+1
\| \| \| \|
\| * \| \|	Apply suggestions from code review	Nguyen Dac Nam	2020-04-07	1	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \|	Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
\| * \| \|	shader_decode: SULD.D using std::pair instead of out parameter	namkazy	2020-04-06	2	-19/+15
\| \| \| \|
\| * \| \|	shader_decode: SULD.D avoid duplicate code block.	namkazy	2020-04-06	1	-39/+2
\| \| \| \|
\| * \| \|	shader_decode: SULD.D fix conversion error.	namkazy	2020-04-06	1	-3/+3
\| \| \| \|
\| * \| \|	shader_decode: SULD.D implement bits64 and reverse shader ir init method to removed shader stage.	namkazy	2020-04-06	3	-42/+101
\| \| \| \|
\| * \| \|	silent warning (conversion error)	namkazy	2020-04-05	1	-3/+2
\| \| \| \|
\| * \| \|	shader_decode: SULD.D -> SINT actually same as UNORM.	namkazy	2020-04-05	1	-5/+4
\| \| \| \|
\| * \| \|	shader_decode: SULD.D fix decode SNORM component	namkazy	2020-04-05	1	-10/+9
\| \| \| \|
\| * \| \|	clang-format	namkazy	2020-04-05	1	-2/+2
\| \| \| \|
\| * \| \|	shader_decode: get sampler descriptor from registry.	namkazy	2020-04-05	1	-77/+93
\| \| \| \|
\| * \| \|	tweaking.	namkazy	2020-04-05	1	-3/+3
\| \| \| \|
\| * \| \|	cleanup unuse params	namkazy	2020-04-05	1	-8/+6
\| \| \| \|
\| * \| \|	cleanup debug code.	namkazy	2020-04-05	1	-14/+3
\| \| \| \|
\| * \| \|	reimplement get component type, uncomment mistaken code	namkazy	2020-04-05	1	-18/+93
\| \| \| \|
\| * \| \|	remove disable optimize	namkazy	2020-04-05	1	-2/+0
\| \| \| \|
\| * \| \|	[wip] reimplement SULD.D	namkazy	2020-04-05	1	-22/+229
\| \| \| \|
\| * \| \|	add shader stage when init shader ir	namkazy	2020-04-05	2	-5/+7
\| \| \| \|
\| * \| \|	clang-fix	Nguyen Dac Nam	2020-04-05	1	-1/+1
\| \| \| \|
\| * \| \|	shader: image - import PredCondition	Nguyen Dac Nam	2020-04-05	1	-0/+1
\| \| \| \|
\| * \| \|	shader: SULD.D bits32 implement more complexer method.	Nguyen Dac Nam	2020-04-05	1	-4/+28
\| \| \| \|
\| * \| \|	shader: SULD.D import StoreType	Nguyen Dac Nam	2020-04-05	1	-0/+1
\| \| \| \|
\| * \| \|	shader: implement SULD.D bits32	Nguyen Dac Nam	2020-04-05	1	-11/+27
\| \|/ /
* \| \|	Merge pull request #3592 from ReinUsesLisp/ipa	Fernando Sahmkow	2020-04-06	1	-15/+21
\|\ \ \ \| \|/ / \|/\| \|	shader_decompiler: Remove FragCoord.w hack and change IPA implementation
\| * \|	shader_decompiler: Remove FragCoord.w hack and change IPA implementation	ReinUsesLisp	2020-04-02	1	-15/+21
\| \| \|	Credits go to gdkchan and Ryujinx. The pull request used for this can be found here: https://github.com/Ryujinx/Ryujinx/pull/1082 yuzu was already using the header for interpolation, but it was missing the FragCoord.w multiplication described in the linked pull request. This commit finally removes the FragCoord.w == 1.0f hack from the shader decompiler. While we are at it, this commit renames some enumerations to match Nvidia's documentation (linked below) and fixes component declaration order in the shader program header (z and w were swapped). https://github.com/NVIDIA/open-gpu-doc/blob/master/Shader-Program-Header/Shader-Program-Header.html
* \| \|	shader/memory: Silence no return value warning	ReinUsesLisp	2020-04-02	1	-0/+3
\|/ / \| \| \| \| \| \|	Silences a warning about control paths not all returning a value.
* \|	Merge pull request #3561 from ReinUsesLisp/f2f-conversion	Fernando Sahmkow	2020-03-31	1	-5/+10
\|\ \ \| \| \| \| \| \|	shader/conversion: Fix F2F rounding operations with different sizes
\| * \|	shader/conversion: Fix F2F rounding operations with different sizes	ReinUsesLisp	2020-03-26	1	-5/+10
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rounding operations only matter when the conversion size of source and destination is the same, i.e. .F16.F16, .F32.F32 and .F64.F64. When there is a mismatch (.F16.F32), these bits are used for IEEE rounding. We don't emulate this because GLSL and SPIR-V don't support configuring it per operation.
* \|	Merge pull request #3577 from ReinUsesLisp/lea	Fernando Sahmkow	2020-03-31	1	-11/+4
\|\ \ \| \| \| \| \| \|	shader/lea: Fix LEA implementation
\| * \|	shader/lea: Simplify generated LEA code	ReinUsesLisp	2020-03-28	1	-3/+2
\| \| \|
\| * \|	shader/lea: Fix op_a and op_b usages	ReinUsesLisp	2020-03-27	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	They were swapped.
\| * \|	shader/lea: Remove const and use move when possible	ReinUsesLisp	2020-03-27	1	-11/+5
\| \|/
* \|	clang-format	Nguyen Dac Nam	2020-03-31	1	-2/+1
\| \|
* \|	shader_decode: fix by suggestion	Nguyen Dac Nam	2020-03-31	1	-27/+22
\| \|
* \|	clang-format	namkazy	2020-03-30	1	-3/+3
\| \|
* \|	shader_decode: ATOM/ATOMS: add function to avoid code repetition	namkazy	2020-03-30	2	-70/+53
\| \|
* \|	shader_decode: implement ATOM operation for S32 and U32	Nguyen Dac Nam	2020-03-30	1	-6/+39
\| \|
* \|	clang-format	namkazy	2020-03-30	1	-3/+3
\| \|
* \|	shader_decode: implement ATOMS instr partial.	Nguyen Dac Nam	2020-03-30	1	-10/+42
\| \|
* \|	shader: node - update correct comment	Nguyen Dac Nam	2020-03-30	1	-15/+15
\| \|
* \|	shader_decode: add Atomic op for common usage	Nguyen Dac Nam	2020-03-30	1	-1/+15
\|/
*	Merge pull request #3544 from makigumo/myfork/patch-2	bunnei	2020-03-26	1	-4/+5
\|\ \| \| \| \|	xmad: fix clang build error
\| *	xmad: fix clang build error	makigumo	2020-03-23	1	-4/+5
\| \|
* \|	Merge pull request #3520 from ReinUsesLisp/legacy-varyings	bunnei	2020-03-26	2	-35/+58
\|\ \ \| \|/ \|/\|	gl_shader_decompiler: Implement legacy varyings
\| *	shader/shader_ir: Track usage in input attribute and of legacy varyings	ReinUsesLisp	2020-03-16	2	-34/+58
\| \|
\| *	shader/shader_ir: Fix clip distance usage stores	ReinUsesLisp	2020-03-16	1	-2/+1
\| \|
\| *	shader/shader_ir: Change declare output attribute to a switch	ReinUsesLisp	2020-03-16	1	-9/+9
\| \|
* \|	Merge pull request #3505 from namkazt/patch-8	bunnei	2020-03-19	1	-15/+48
\|\ \ \| \| \| \| \| \|	shader_decode: implement XMAD mode CSfu
\| * \|	nit & remove some optional param	Nguyen Dac Nam	2020-03-13	1	-10/+11
\| \| \|
\| * \|	shader_decode: implement XMAD mode CSfu	Nguyen Dac Nam	2020-03-13	1	-9/+41
\| \| \|
* \| \|	Merge pull request #3502 from namkazt/patch-3	Rodrigo Locatti	2020-03-16	2	-21/+50
\|\ \ \ \| \|_\|/ \|/\| \|	shader_decode: Reimplement BFE instructions
\| * \|	clang-format	Nguyen Dac Nam	2020-03-14	1	-2/+1
\| \| \|
\| * \|	nit	Nguyen Dac Nam	2020-03-14	1	-1/+1
\| \| \|
\| * \|	clang-format	Nguyen Dac Nam	2020-03-13	1	-4/+8
\| \| \|
\| * \|	Apply suggestions from code review	Nguyen Dac Nam	2020-03-13	1	-5/+5
\| \| \| \| \| \| \| \| \|	Co-Authored-By: Mat M. <mathew1800@gmail.com>
\| * \|	shader_decode: BFE add ref of reverse parallel method.	Nguyen Dac Nam	2020-03-13	1	-0/+3
\| \| \|
\| * \|	shader_decode: implement BREV on BFE	Nguyen Dac Nam	2020-03-13	1	-6/+25
\| \| \| \| \| \| \| \| \|	Implements the parallel bit reversal described at: https://graphics.stanford.edu/~seander/bithacks.html#ReverseParallel
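The technique from the linked page swaps progressively larger groups of bits. A C++ rendering for a 32-bit value is sketched below; `ReverseBits` is a hypothetical name, and the commit itself emits equivalent shader IR rather than this function.

```cpp
#include <cstdint>

// Reverses the bit order of a 32-bit value in five swap steps.
constexpr std::uint32_t ReverseBits(std::uint32_t v) {
    v = ((v >> 1) & 0x55555555u) | ((v & 0x55555555u) << 1); // odd/even bits
    v = ((v >> 2) & 0x33333333u) | ((v & 0x33333333u) << 2); // bit pairs
    v = ((v >> 4) & 0x0F0F0F0Fu) | ((v & 0x0F0F0F0Fu) << 4); // nibbles
    v = ((v >> 8) & 0x00FF00FFu) | ((v & 0x00FF00FFu) << 8); // bytes
    v = (v >> 16) | (v << 16);                               // half-words
    return v;
}
```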
\| * \|	node_helper: add IBitfieldExtract case	Nguyen Dac Nam	2020-03-13	1	-0/+2
\| \| \|
\| * \|	shader_decode: Reimplement BFE instructions	Nguyen Dac Nam	2020-03-13	1	-25/+27
\| \|/
* \|	shader/transform_feedback: Expose buffer stride	ReinUsesLisp	2020-03-13	2	-0/+2
\| \|
* \|	shader/transform_feedback: Add host API friendly TFB builder	ReinUsesLisp	2020-03-13	2	-0/+136
\| \|
* \|	engines/maxwell_3d: Add TFB registers and store them in shader registry	ReinUsesLisp	2020-03-09	2	-3/+12
\| \|
* \|	shader/registry: Address feedback	ReinUsesLisp	2020-03-09	2	-12/+17
\| \|
* \|	shader/registry: Cache tessellation state	ReinUsesLisp	2020-03-09	2	-2/+9
\| \|
* \|	shader/registry: Store graphics and compute metadata	ReinUsesLisp	2020-03-09	3	-36/+81
\| \| \| \| \| \| \| \| \| \|	Store information GLSL forces us to provide but it's dynamic state in hardware (workgroup sizes, primitive topology, shared memory size).
* \|	video_core: Rename "const buffer locker" to "registry"	ReinUsesLisp	2020-03-09	9	-50/+54
\| \|
* \|	gl_shader_cache: Rework shader cache and remove post-specializations	ReinUsesLisp	2020-03-09	4	-28/+17
\|/ \| \| \| \|	Instead of pre-specializing shaders and then post-specializing them, drop the latter and only "specialize" the shader while decoding it.
*	Merge pull request #3451 from ReinUsesLisp/indexed-textures	bunnei	2020-03-05	1	-1/+1
\|\ \| \| \| \|	vk_shader_decompiler: Implement indexed textures
\| *	shader: Simplify indexed sampler usages	ReinUsesLisp	2020-02-24	1	-1/+1
\| \|
* \|	nit: move comment to right place.	Nguyen Dac Nam	2020-02-29	1	-2/+2
\| \|
* \|	shader_decode: Fix LD, LDG when track constant buffer	Nguyen Dac Nam	2020-02-28	1	-4/+12
\| \|
* \|	shader: FMUL switch to using LUT (#3441)	Nguyen Dac Nam	2020-02-27	1	-19/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* shader: add FmulPostFactor LUT table * shader: FMUL apply LUT * Update src/video_core/engines/shader_bytecode.h Co-Authored-By: Mat M. <mathew1800@gmail.com> * nit: mistype * clang-format & add missing import * shader: remove post factor LUT. * shader: move post factor LUT to function and fix incorrect order. * clang-format * shader: FMUL: add static to post factor LUT * nit: typo Co-authored-by: Mat M. <mathew1800@gmail.com>
* \|	Merge pull request #3440 from namkazt/patch-6	bunnei	2020-02-26	1	-36/+58
\|\ \ \| \|/ \|/\|	shader: implement LOP3 fast replace for old function
\| *	nit: add const to where it need.	Nguyen Dac Nam	2020-02-21	1	-14/+14
\| \|
\| *	shader: implement LOP3 fast replace for old function	Nguyen Dac Nam	2020-02-21	1	-36/+58
\| \| \| \| \| \|	ref: https://devtalk.nvidia.com/default/topic/1070081/cuda-programming-and-performance/reverse-lut-for-lop3-lut/
* \|	shader/texture: Fix illegal 3D texture assert	ReinUsesLisp	2020-02-21	1	-1/+1
\|/ \| \| \| \|	Fix typo in the illegal 3D texture assert logic. We care about catching arrayed 3D textures or 3D shadow textures, not regular 3D textures.
*	Merge pull request #3415 from ReinUsesLisp/texture-code	bunnei	2020-02-20	1	-43/+28
\|\ \| \| \| \|	shader/texture: Allow 2D shadow arrays and simplify code
\| *	shader/texture: Allow 2D shadow arrays and simplify code	ReinUsesLisp	2020-02-15	1	-43/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Shadow sampler 2D arrays are supported on OpenGL, so there's no reason to forbid these. Enable textureLod usage on these. Minor style changes.
* \|	shader_conversion: I2F : add Assert for case src_size is Short	Nguyen Dac Nam	2020-02-19	1	-0/+3
\| \|
* \|	fix warning	Nguyen Dac Nam	2020-02-19	1	-1/+1
\| \|
* \|	clang-format fix	Nguyen Dac Nam	2020-02-19	1	-1/+1
\| \|
* \|	shader_conversion: add conversion I2F for Short	Nguyen Dac Nam	2020-02-19	1	-9/+6
\|/
*	Merge pull request #3379 from ReinUsesLisp/cbuf-offset	bunnei	2020-02-14	2	-3/+3
\|\ \| \| \| \|	shader/decode: Fix constant buffer offsets
\| *	shader/decode: Fix constant buffer offsets	ReinUsesLisp	2020-02-05	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Some instances were using cbuf34.offset instead of cbuf34.GetOffset(). This returned an invalid offset. Address those instances and rename offset to "shifted_offset" to avoid future bugs.
* \|	Merge pull request #3369 from ReinUsesLisp/shf	bunnei	2020-02-08	1	-11/+102
\|\ \ \| \|/ \|/\|	shader/shift: Implement SHF
\| *	shader/shift: Implement SHIFT_RIGHT_{IMM,R}	ReinUsesLisp	2020-02-02	1	-26/+58
\| \| \| \| \| \| \| \|	Shifts a pair of registers to the right and returns the low register.
\| *	shader/shift: Implement SHF_LEFT_{IMM,R}	ReinUsesLisp	2020-02-02	1	-10/+69
\| \| \| \| \| \| \| \|	Shifts a pair of registers to the left and returns the high register.
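The two SHF descriptions above amount to a funnel shift over a 64-bit register pair. A minimal sketch, assuming the shift amount is simply masked to 0..63 (the wrap and clamp modes of the real instruction are not modeled, and both function names are hypothetical):

```cpp
#include <cstdint>

// Shifts the pair {high, low} to the left and returns the high register.
inline std::uint32_t FunnelShiftLeft(std::uint32_t low, std::uint32_t high, std::uint32_t shift) {
    const std::uint64_t pair = (static_cast<std::uint64_t>(high) << 32) | low;
    // Masking to 63 is an assumption that also keeps the C++ shift well defined.
    return static_cast<std::uint32_t>((pair << (shift & 63u)) >> 32);
}

// Shifts the pair {high, low} to the right and returns the low register.
inline std::uint32_t FunnelShiftRight(std::uint32_t low, std::uint32_t high, std::uint32_t shift) {
    const std::uint64_t pair = (static_cast<std::uint64_t>(high) << 32) | low;
    return static_cast<std::uint32_t>(pair >> (shift & 63u));
}
```

With a shift of zero the left variant returns `high` unchanged and the right variant returns `low` unchanged, which is a quick sanity check on which half each one yields.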
* \|	Merge pull request #3357 from ReinUsesLisp/bfi-rc	bunnei	2020-02-04	1	-2/+5
\|\ \ \| \| \| \| \| \|	shader/bfi: Implement register-constant buffer variant
\| * \|	shader/bfi: Implement register-constant buffer variant	ReinUsesLisp	2020-01-27	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's the same as the variant that was implemented, but it takes the operands from another source.
* \| \|	Merge pull request #3356 from ReinUsesLisp/fcmp	bunnei	2020-02-04	1	-1/+10
\|\ \ \ \| \| \| \| \| \| \| \|	shader/arithmetic: Implement FCMP
\| * \| \|	shader/arithmetic: Implement FCMP	ReinUsesLisp	2020-01-27	1	-1/+10
\| \|/ / \| \| \| \| \| \| \| \| \| \| \| \|	Compares the third operand with zero, then selects between the first and second.
* \| \|	Merge pull request #3337 from ReinUsesLisp/vulkan-staged	bunnei	2020-02-03	1	-3/+6
\|\ \ \ \| \| \| \| \| \| \| \|	yuzu: Implement Vulkan frontend
\| * \| \|	shader/other: Fix skips for SYNC and BRK	ReinUsesLisp	2020-01-29	1	-2/+2
\| \| \| \|
\| * \| \|	shader/other: Stub S2R LaneId	ReinUsesLisp	2020-01-29	1	-1/+4
\| \|/ /
* \| \|	shader: Remove curly braces initializers on shared pointers	ReinUsesLisp	2020-02-02	5	-12/+12
\| \| \|
* \| \|	Merge pull request #3282 from FernandoS27/indexed-samplers	bunnei	2020-02-02	9	-43/+397
\|\ \ \ \| \| \| \| \| \| \| \|	Partially implement Indexed samplers in general and specific code in GLSL
\| * \| \|	Shader_IR: Address feedback.	Fernando Sahmkow	2020-01-25	7	-31/+33
\| \| \| \|
\| * \| \|	Shader_IR: Change name of TrackSampler function so it does not confuse with the type.	Fernando Sahmkow	2020-01-24	3	-7/+10
\| \| \| \|
\| * \| \|	Shader_IR: Corrections, styling and extras.	Fernando Sahmkow	2020-01-24	1	-2/+4
\| \| \| \|
\| * \| \|	Shader_IR: Propagate bindless index into the GL compiler.	Fernando Sahmkow	2020-01-24	4	-23/+53
\| \| \| \|
\| * \| \|	Shader_IR: Implement Injectable Custom Variables to the IR.	Fernando Sahmkow	2020-01-24	3	-1/+34
\| \| \| \|
\| * \| \|	Shader_IR: deduce size of indexed samplers	Fernando Sahmkow	2020-01-24	4	-8/+60
\| \| \| \|
\| * \| \|	Shader_IR: Setup Indexed Samplers on the IR	Fernando Sahmkow	2020-01-24	1	-20/+46
\| \| \| \|
\| * \| \|	Shader_IR: Implement initial code for tracking indexed samplers.	Fernando Sahmkow	2020-01-24	4	-0/+139
\| \| \| \|
\| * \| \|	Shader_IR: Address Feedback	Fernando Sahmkow	2020-01-24	2	-25/+25
\| \| \| \|
\| * \| \|	Shader_IR: Allow constant access of guest driver.	Fernando Sahmkow	2020-01-24	1	-1/+1
\| \| \| \|
\| * \| \|	Shader_IR: Address Feedback	Fernando Sahmkow	2020-01-24	2	-17/+24
\| \| \| \|
\| * \| \|	Shader_IR: Store Bound buffer on Shader Usage	Fernando Sahmkow	2020-01-24	2	-0/+29
\| \| \| \|
\| * \| \|	GPU: Implement guest driver profile and deduce texture handler sizes.	Fernando Sahmkow	2020-01-24	4	-0/+31
\| \|/ /
* \| \|	Merge pull request #3347 from ReinUsesLisp/local-mem	bunnei	2020-01-30	1	-30/+55
\|\ \ \ \| \|_\|/ \|/\| \|	shader/memory: Implement LDL.S16, LDS.S16, STL.S16 and STS.S16
\| * \|	shader/memory: Implement STL.S16 and STS.S16	ReinUsesLisp	2020-01-25	1	-3/+10
\| \| \|
\| * \|	shader/memory: Implement unaligned LDL.S16 and LDS.S16	ReinUsesLisp	2020-01-25	1	-5/+3
\| \| \|
\| * \|	shader/memory: Move unaligned load/store to functions	ReinUsesLisp	2020-01-25	1	-18/+27
\| \| \|
\| * \|	shader/memory: Implement LDL.S16 and LDS.S16	ReinUsesLisp	2020-01-25	1	-12/+23
\| \|/
* /	shader/memory: Implement ATOM.ADD	ReinUsesLisp	2020-01-26	2	-2/+22
\|/ \| \| \| \| \| \| \| \| \| \| \| \|	ATOM operates atomically on global memory. For now only add ATOM.ADD since that's what was found in commercial games. This asserts for ATOM.ADD.S32 (handling the others as unimplemented), although ATOM.ADD.U32 shouldn't be any different. This change forces us to change the default type on SPIR-V storage buffers from float to uint. We could also alias the buffers, but it's simpler for now to just use uint. While we are at it, abstract the code to avoid repetition.
*	Merge pull request #3273 from FernandoS27/txd-array	bunnei	2020-01-24	1	-5/+12
\|\ \| \| \| \|	Shader_IR: Implement TXD Array.
\| *	Shader_IR: Implement TXD Array.	Fernando Sahmkow	2020-01-04	1	-5/+12
\| \| \| \| \| \| \| \| \| \|	This commit extends the compilation of TXD to support array samplers on TXD.
* \|	shader/memory: Implement ATOMS.ADD.U32	ReinUsesLisp	2020-01-16	2	-0/+21
\| \|
* \|	control_flow: Silence -Wreorder warning for CFGRebuildState	Lioncash	2020-01-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Organizes the initializer list in the same order that the variables would actually be initialized in.
* \|	Merge pull request #3287 from ReinUsesLisp/ldg-stg-16	bunnei	2020-01-14	2	-34/+52
\|\ \ \| \| \| \| \| \|	shader_ir/memory: Implement u16 and u8 for STG and LDG
\| * \|	shader_ir/memory: Implement u16 and u8 for STG and LDG	ReinUsesLisp	2020-01-09	2	-34/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Using the same technique we used for u8 on LDG, implement u16. In the case of STG, load memory and insert the value we want to set into it with bitfieldInsert. Then set that value.
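The STG approach described above (load the containing word, insert the narrow value, write the word back) can be sketched on the CPU as follows. `BitfieldInsert` mimics GLSL's bitfieldInsert for unsigned values; `StoreU16InWord` is a hypothetical helper, and little-endian byte numbering plus 2-byte alignment are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// CPU stand-in for GLSL bitfieldInsert on unsigned integers.
inline std::uint32_t BitfieldInsert(std::uint32_t base, std::uint32_t insert,
                                    std::uint32_t offset, std::uint32_t bits) {
    assert(offset + bits <= 32);
    const std::uint32_t field = bits >= 32 ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    const std::uint32_t mask = field << offset;
    return (base & ~mask) | ((insert << offset) & mask);
}

// Returns the 32-bit word to write back after storing a 16-bit value at a
// byte address. Hypothetical helper; assumes the address is 2-byte aligned.
inline std::uint32_t StoreU16InWord(std::uint32_t loaded_word, std::uint32_t address,
                                    std::uint16_t value) {
    assert(address % 2 == 0);
    return BitfieldInsert(loaded_word, value, (address % 4) * 8, 16);
}
```

The read-modify-write is needed because the backing storage is addressed in 32-bit words, so a narrower store has to preserve the neighbouring bits.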
* \| \|	shader_ir/texture: Simplify AOFFI code	ReinUsesLisp	2020-01-09	1	-10/+6
\|/ /
* \|	Merge pull request #3258 from FernandoS27/shader-amend	bunnei	2020-01-04	3	-2/+38
\|\ \ \| \|/ \|/\|	Shader_IR: add the ability to amend code in the shader ir.
\| *	Shader_IR: Address Feedback	Fernando Sahmkow	2020-01-04	3	-11/+11
\| \|
\| *	Shader_IR: add the ability to amend code in the shader ir.	Fernando Sahmkow	2019-12-30	3	-3/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit introduces a mechanism by which shader IR code can be amended and extended. This is useful for tracking algorithms where certain information can be derived before the track, such as indexes to array samplers.
* \|	Merge pull request #3239 from ReinUsesLisp/p2r	bunnei	2020-01-01	1	-16/+44
\|\ \ \| \|/ \|/\|	shader/p2r: Implement P2R Pr
\| *	shader/p2r: Implement P2R Pr	ReinUsesLisp	2019-12-20	1	-1/+15
\| \| \| \| \| \| \| \| \| \|	P2R dumps predicate or condition codes state to a register. This is useful for unit testing.
\| *	shader/r2p: Refactor P2R to support P2R	ReinUsesLisp	2019-12-20	1	-16/+30
\| \|
* \|	Merge pull request #3228 from ReinUsesLisp/ptp	bunnei	2019-12-27	3	-34/+80
\|\ \ \| \| \| \| \| \|	shader/texture: Implement AOFFI and PTP for TLD4 and TLD4S
\| * \|	shader/texture: Implement TLD4.PTP	ReinUsesLisp	2019-12-16	3	-19/+61
\| \| \|
\| * \|	shader/texture: Enable arrayed TLD4	ReinUsesLisp	2019-12-16	1	-1/+0
\| \| \|
\| * \|	shader/texture: Implement AOFFI for TLD4S	ReinUsesLisp	2019-12-16	1	-13/+18
\| \| \|
\| * \|	shader/texture: Remove unnecesary parenthesis	ReinUsesLisp	2019-12-16	1	-2/+2
\| \| \|
* \| \|	Merge pull request #3235 from ReinUsesLisp/ldg-u8	bunnei	2019-12-22	1	-6/+32
\|\ \ \ \| \|_\|/ \|/\| \|	shader/memory: Implement LDG.U8 and unaligned U8 loads
\| * \|	shader/memory: Implement LDG.U8 and unaligned U8 loads	ReinUsesLisp	2019-12-18	1	-6/+32
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	LDG can load single bytes instead of full integers or packs of integers. These have the advantage of loading bytes that are not aligned to 4 bytes. To emulate these, this commit gets the byte being referenced (by doing "address & 3") and then uses that to extract the byte from the loaded integer: result = bitfieldExtract(loaded_integer, (address % 4) * 8, 8)
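The formula in the message above can be checked with a small CPU sketch. `BitfieldExtract` mimics the unsigned form of GLSL's bitfieldExtract; `ExtractByte` is a hypothetical name, and byte 0 being the least significant byte is an assumption of this example.

```cpp
#include <cassert>
#include <cstdint>

// CPU stand-in for the unsigned form of GLSL bitfieldExtract.
inline std::uint32_t BitfieldExtract(std::uint32_t value, std::uint32_t offset,
                                     std::uint32_t bits) {
    assert(offset + bits <= 32);
    const std::uint32_t field = bits >= 32 ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    return (value >> offset) & field;
}

// Picks the byte selected by the low two address bits out of the aligned
// 32-bit word that contains it. Hypothetical helper for illustration.
inline std::uint32_t ExtractByte(std::uint32_t loaded_integer, std::uint32_t address) {
    return BitfieldExtract(loaded_integer, (address % 4) * 8, 8);
}
```

For unsigned addresses `address % 4` and `address & 3` give the same value, so either spelling from the message selects the same byte.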
* \|	Merge pull request #3234 from ReinUsesLisp/i2f-u8-selector	bunnei	2019-12-20	1	-2/+13
\|\ \ \| \| \| \| \| \|	shader/conversion: Implement byte selector in I2F
\| * \|	shader/conversion: Implement byte selector in I2F	ReinUsesLisp	2019-12-18	1	-2/+13
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \|	I2F's byte selector is used to choose what bytes to convert to float. e.g. if the input is 0xaabbccdd and the selector is ".B3" it will convert 0xaa. The default (when it's not shown in nvdisasm) is ".B0", in that example the default would convert 0xdd to float.
* /	shader/texture: Properly shrink unused entries in size mismatches	ReinUsesLisp	2019-12-18	1	-4/+9
\|/ \| \| \| \| \| \|	When an image format mismatches we were inserting zeroes to the texture itself. This was not handling cases where the mismatch uses fewer coordinates than the guest shader code. Address that by resizing the vector.
*	Shader_IR: Correct TLD4S Depth Compare.	Fernando Sahmkow	2019-12-12	1	-5/+12
\|
*	Shader_Ir: Correct TLD4S encoding and implement f16 flag.	Fernando Sahmkow	2019-12-12	2	-10/+13
\|
*	Shader_Ir: default failed tracks on bindless samplers to null values.	Fernando Sahmkow	2019-12-12	2	-24/+77
\|
*	shader: Implement MEMBAR.GL	ReinUsesLisp	2019-12-10	2	-0/+8
\| \| \| \|	Implement using memoryBarrier in GLSL and OpMemoryBarrier on SPIR-V.
*	shader_ir/other: Implement S2R InvocationId	ReinUsesLisp	2019-12-10	2	-0/+3
\|
*	shader: Keep track of shaders using warp instructions	ReinUsesLisp	2019-12-10	2	-0/+8
\|
*	shader_ir/memory: Implement patch stores	ReinUsesLisp	2019-12-10	3	-19/+36
\|
*	Merge pull request #3109 from FernandoS27/new-instr	bunnei	2019-12-07	3	-7/+69
\|\ \| \| \| \|	Implement FLO & TXD Instructions on GPU Shaders
\| *	Shader_IR: Address Feedback	Fernando Sahmkow	2019-11-18	2	-10/+8
\| \|
\| *	Shader_IR: Implement TXD instruction.	Fernando Sahmkow	2019-11-14	2	-7/+51
\| \|
\| *	Shader_IR: Implement FLO instruction.	Fernando Sahmkow	2019-11-14	2	-0/+20
\| \|
* \|	video_core/const_buffer_locker: Make use of std::tie in HasEqualKeys()	Lioncash	2019-11-27	1	-2/+3
\| \| \| \| \| \| \| \|	Tidies it up a little bit visually.
* \|	video_core/const_buffer_locker: Remove unused includes	Lioncash	2019-11-27	2	-2/+2
\| \|
* \|	video_core/const_buffer_locker: Remove #pragma once from cpp file	Lioncash	2019-11-27	1	-2/+0
\| \| \| \| \| \| \| \|	Silences a compiler warning.
* \|	video_core: Unify ProgramType and ShaderStage into ShaderType	ReinUsesLisp	2019-11-23	2	-1/+3
\| \|
* \|	shader/texture: Handle TLDS texture type mismatches	ReinUsesLisp	2019-11-23	1	-1/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some games like "Fire Emblem: Three Houses" bind 2D textures to offsets used by instructions of 1D textures. To handle the discrepancy this commit uses the texture type from the binding and modifies the emitted code IR to build a valid backend expression. E.g.: Bound texture is 2D and instruction is 1D, the emitted IR samples a 2D texture in the coordinate ivec2(X, 0).
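The coordinate adjustment in the example above can be sketched as a resize to the bound texture's dimensionality. This is only an illustration on plain integers; the real change operates on IR nodes, and `FitCoordinates` is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// Makes the coordinate list match the number of dimensions of the bound
// texture: missing components are filled with zero, extras are dropped.
// Hypothetical helper; takes the vector by value so the caller's copy is untouched.
inline std::vector<int> FitCoordinates(std::vector<int> coords, std::size_t bound_dimensions) {
    coords.resize(bound_dimensions, 0);
    return coords;
}
```

A 1D coordinate `{X}` fitted to a 2D binding becomes `{X, 0}`, matching the `ivec2(X, 0)` case in the message.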
* \|	shader/texture: Deduce texture buffers from locker	ReinUsesLisp	2019-11-23	3	-69/+60
\| \| \| \| \| \| \| \| \| \|	Instead of specializing shaders to separate texture buffers from 1D textures, use the locker to deduce them while they are being decoded.
* \|	shader/other: Reduce DEPBAR log severity	ReinUsesLisp	2019-11-20	1	-1/+1
\|/ \| \| \| \| \|	While DEPBAR is stubbed it doesn't change anything from our end. Shading languages handle what this instruction does implicitly. We are not getting anything out of this log except noise.
*	Merge pull request #3081 from ReinUsesLisp/fswzadd-shuffles	Fernando Sahmkow	2019-11-14	2	-43/+48
\|\ \| \| \| \|	shader: Implement FSWZADD and reimplement SHFL
\| *	shader_ir/warp: Implement FSWZADD	ReinUsesLisp	2019-11-08	2	-0/+10
\| \|
\| *	gl_shader_decompiler: Reimplement shuffles with platform agnostic intrinsics	ReinUsesLisp	2019-11-08	2	-42/+37
\| \|
* \|	Merge pull request #3084 from ReinUsesLisp/cast-warnings	Rodrigo Locatti	2019-11-13	2	-5/+5
\|\ \ \| \|/ \|/\|	video_core: Treat implicit conversions as errors
\| *	video_core: Silence implicit conversion warnings	ReinUsesLisp	2019-11-08	2	-5/+5
\| \|
* \|	Merge pull request #3032 from ReinUsesLisp/simplify-control-flow-brx	bunnei	2019-11-07	1	-103/+111
\|\ \ \| \| \| \| \| \|	shader/control_flow: Abstract repeated code chunks in BRX tracking
\| * \|	shader/control_flow: Specify constness on caller lambdas	Rodrigo Locatti	2019-11-07	1	-11/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update src/video_core/shader/control_flow.cpp Co-Authored-By: Mat M. <mathew1800@gmail.com> Update src/video_core/shader/control_flow.cpp Co-Authored-By: Mat M. <mathew1800@gmail.com> Update src/video_core/shader/control_flow.cpp Co-Authored-By: Mat M. <mathew1800@gmail.com> Update src/video_core/shader/control_flow.cpp Co-Authored-By: Mat M. <mathew1800@gmail.com> Update src/video_core/shader/control_flow.cpp Co-Authored-By: Mat M. <mathew1800@gmail.com> Update src/video_core/shader/control_flow.cpp Co-Authored-By: Mat M. <mathew1800@gmail.com>
\| * \|	shader/control_flow: Use callable template instead of std::function	ReinUsesLisp	2019-11-07	1	-6/+5
\| \| \|
\| * \|	shader/control_flow: Abstract repeated code chunks in BRX tracking	ReinUsesLisp	2019-11-07	1	-93/+101
\| \| \| \| \| \| \| \| \| \| \| \|	Move copied-and-pasted for loops into a common templated function.
\| * \|	shader/control_flow: Silence Intellisense cast warnings	ReinUsesLisp	2019-11-07	1	-1/+1
\| \| \|
\| * \|	shader/control_flow: Remove brace initializer in std containers	ReinUsesLisp	2019-11-07	1	-9/+9
\| \|/ \| \| \| \| \| \|	These containers have a default constructor.
* \|	shader/decode: Reduce severity of arithmetic rounding warnings	ReinUsesLisp	2019-11-07	6	-15/+17
\| \|
* \|	shader/arithmetic: Reduce RRO stub severity	ReinUsesLisp	2019-11-07	1	-1/+2
\| \|
* \|	shader/texture: Remove NODEP warnings	ReinUsesLisp	2019-11-07	1	-35/+0
\|/ \| \| \| \|	These warnings don't offer meaningful information while decoding shaders. Remove them.
*	Merge pull request #3039 from ReinUsesLisp/cleanup-samplers	Rodrigo Locatti	2019-11-06	4	-122/+100
\|\ \| \| \| \|	shader/node: Unpack bindless texture encoding
\| *	shader/node: Unpack bindless texture encoding	ReinUsesLisp	2019-10-30	4	-122/+100
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bindless textures were using u64 to pack the buffer and offset from where they come from. Drop this in favor of separated entries in the struct. Remove the usage of std::set in favor of std::list (it's not std::vector to avoid reference invalidations) for samplers and images.
* \|	Shader_IR: Fix regression on TLD4	Fernando Sahmkow	2019-10-31	2	-5/+4
\| \| \| \| \| \| \| \| \| \| \| \|	Originally on the last commit I thought TLD4 acted the same as TLD4S and didn't have a mask. It actually does have a component mask. This commit corrects that.
* \|	Shader_IR: Fix TLD4 and add Bindless Variant.	Fernando Sahmkow	2019-10-30	2	-10/+26
\|/ \| \| \| \| \|	This commit fixes an issue where not all 4 results of tld4 were being written, the color component was defaulted to red, among other things. It also implements the bindless variant.
*	Merge pull request #2976 from FernandoS27/cache-fast-brx-rebased	Rodrigo Locatti	2019-10-26	10	-171/+638
\|\ \| \| \| \|	Implement Fast BRX, fix TXQ and adapt the Shader Cache for it
\| *	Shader_IR: Address Feedback.	Fernando Sahmkow	2019-10-26	7	-52/+59
\| \|
\| *	gl_shader_cache: Implement locker variants invalidation	ReinUsesLisp	2019-10-25	2	-12/+19
\| \|
\| *	gl_shader_disk_cache: Store and load fast BRX	ReinUsesLisp	2019-10-25	1	-2/+2
\| \|
\| *	const_buffer_locker: Minor style changes	ReinUsesLisp	2019-10-25	2	-152/+76
\| \|
\| *	gl_shader_decompiler: Move entries to a separate function	ReinUsesLisp	2019-10-25	7	-32/+29
\| \|
\| *	Shader_IR: Implement Fast BRX and allow multi-branches in the CFG.	Fernando Sahmkow	2019-10-25	1	-1/+1
\| \|
\| *	Shader_IR: Correct typo in Consistent method.	Fernando Sahmkow	2019-10-25	2	-2/+2
\| \|
\| *	Shader_IR: allow lookup of texture samplers within the shader_ir for instructions that don't provide it	Fernando Sahmkow	2019-10-25	4	-42/+212
\| \|
\| *	Shader_IR: Implement Fast BRX and allow multi-branches in the CFG.	Fernando Sahmkow	2019-10-25	5	-130/+246
\| \|
\| *	Shader_Cache: setup connection of ConstBufferLocker	Fernando Sahmkow	2019-10-25	5	-12/+22
\| \|
\| *	VideoCore: Unify const buffer accessing along engines and provide ConstBufferLocker class to shaders.	Fernando Sahmkow	2019-10-25	3	-0/+123
\| \|
\| *	Shader_IR: Implement BRX tracking.	Fernando Sahmkow	2019-10-25	1	-0/+113
\| \|
* \|	Merge pull request #3027 from lioncash/lookup	Rodrigo Locatti	2019-10-26	1	-53/+67
\|\ \ \| \| \| \| \| \|	shader_ir: Use std::array with std::pair instead of std::unordered_map
\| * \|	shader_ir: Use std::array with pair instead of unordered_map	Lioncash	2019-10-24	1	-53/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Given the overall size of the maps is very small, we can use arrays of pairs here instead of always heap allocating a new map every time the functions are called. Given the small size of the maps, the difference in container lookups is negligible, especially given the entries are already sorted.
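A minimal sketch of the pattern described above, assuming C++17. The table contents and the names `kNames` and `FindName` are invented for illustration and are not taken from the commit.

```cpp
#include <algorithm>
#include <array>
#include <optional>
#include <string_view>
#include <utility>

// Small, sorted lookup table stored in static memory, so no heap allocation
// happens when it is consulted. Contents are placeholder values.
constexpr std::array<std::pair<int, std::string_view>, 3> kNames{{
    {0, "zero"},
    {1, "one"},
    {4, "four"},
}};

// Linear search over the array; for a handful of entries this is comparable
// to a hash lookup and avoids building a map on every call.
inline std::optional<std::string_view> FindName(int key) {
    const auto it = std::find_if(kNames.begin(), kNames.end(),
                                 [key](const auto& entry) { return entry.first == key; });
    if (it == kNames.end()) {
        return std::nullopt;
    }
    return it->second;
}
```

Because the entries are sorted by key, `std::lower_bound` would also work, though for tables this small the difference is unlikely to matter.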
* \| \|	Merge pull request #3013 from FernandoS27/tld4s-fix	Rodrigo Locatti	2019-10-26	2	-5/+5
\|\ \ \ \| \|_\|/ \|/\| \|	Shader_Ir: Fix TLD4S from using a component mask.
\| * \|	Shader_Ir: Fix TLD4S from using a component mask.	Fernando Sahmkow	2019-10-22	2	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TLD4S always outputs 4 values; the previous code checked a component mask and omitted those values that weren't part of it. This commit corrects that and makes sure all 4 values are set.
* \| \|	video_core/shader: Resolve instances of variable shadowing	Lioncash	2019-10-24	6	-11/+12
\| \|/ \|/\| \| \| \| \|	Silences a few -Wshadow warnings.
* \|	shader_ir/memory: Ignore global memory when tracking fails	ReinUsesLisp	2019-10-22	2	-18/+26
\|/ \| \| \| \| \| \| \| \| \| \|	Instead of invoking undefined behaviour when constant buffer tracking fails and we are blasting through asserts, ignore the global memory operation. In the case of LDG this means filling the destination registers with zeroes; for STG this means ignoring the instruction as a whole. The default behaviour is still to abort execution on failure.
*	video_core/shader/ast: Make ShowCurrentState() and SanityCheck() const member functions	Lioncash	2019-10-18	2	-5/+5
\| \| \| \| \|	These can also trivially be made const member functions, with the addition of a few consts.
*	video_core/shader/ast: Make ASTManager::Print a const member function	Lioncash	2019-10-18	2	-3/+3
\| \| \| \| \|	Given all visiting functions never modify the nodes, we can trivially make this a const member function.
*	video_core/shader/ast: Make ExprPrinter members private	Lioncash	2019-10-18	1	-1/+2
\| \| \| \| \|	This member already has an accessor, so there's no need for it to be public.
*	video_core/shader/ast: Make Indent() return a string_view	Lioncash	2019-10-18	1	-14/+24
\| \| \| \| \| \| \| \|	The returned string is simply a substring of our constexpr tabs string_view, so we can just use a string_view here as well, since the original string_view is guaranteed to always exist. Now the function is fully non-allocating.
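The non-allocating idea described above can be sketched as follows, assuming C++17. The width of `kTabs`, the use of tab characters, and the clamping of deep levels are assumptions of this example, not details taken from the commit.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Backing string with static storage duration; every returned view points into it.
constexpr std::string_view kTabs{"\t\t\t\t\t\t\t\t"};  // 8 tabs, an arbitrary choice

// Returns one tab per nesting level as a view into kTabs, so no std::string
// is constructed. Levels deeper than kTabs.size() are clamped in this sketch.
inline std::string_view Indent(std::size_t level) {
    return kTabs.substr(0, std::min(level, kTabs.size()));
}
```

Returning a view is safe here only because `kTabs` outlives every caller; the same function over a local `std::string` would dangle.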
*	video_core/shader/ast: Make Indent() private	Lioncash	2019-10-18	1	-9/+9
\| \| \| \|	It's never used outside of this class, so we can narrow its scope down.
*	video_core/shader/ast: Rename Ident() to Indent()	Lioncash	2019-10-18	1	-13/+13
\| \| \| \| \|	This can be confusing, given "ident" is generally used as a shorthand for "identifier".
*	video_core/shader/ast: Make use of fmt where applicable	Lioncash	2019-10-18	1	-14/+14
\| \| \| \| \|	Makes a few strings nicer to read and also eliminates a bit of string churn with operator+.
*	Merge pull request #2980 from lioncash/warn	bunnei	2019-10-17	2	-4/+4
\|\ \| \| \| \|	maxwell_3d: Silence truncation warnings
\| *	control_flow: Silence truncation warnings	Lioncash	2019-10-16	2	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \|	This can be trivially fixed by making the input size a size_t. CFGRebuildState's constructor parameter is already a std::size_t, so this just makes the size type fully conform with it.
* \|	shader/node: std::move Meta instance within OperationNode constructor	Lioncash	2019-10-16	1	-1/+1
\|/ \| \| \|	Allows usages of the constructor to avoid an unnecessary copy.
*	shader/half_set_predicate: Fix HSETP2 for constant buffers	ReinUsesLisp	2019-10-07	1	-0/+2
\| \| \| \| \|	HSETP2 when used with a constant buffer parses the second operand type as F32. This is not configurable.
*	shader/half_set_predicate: Reduce DEBUG_ASSERT to LOG_DEBUG	ReinUsesLisp	2019-10-07	1	-1/+2
\|
*	video_core/control_flow: Eliminate variable shadowing warnings	Lioncash	2019-10-05	1	-6/+6
\|
*	video_core/control_flow: Eliminate pessimizing moves	Lioncash	2019-10-05	1	-5/+8
\| \| \| \|	These can inhibit the ability of a compiler to perform RVO.
*	video_core/ast: Unindent most of IsFullyDecompiled() by one level	Lioncash	2019-10-05	1	-12/+12
\|
*	video_core/ast: Make ShowCurrentState() take a string_view instead of std::string	Lioncash	2019-10-05	2	-2/+2
\| \| \| \|	Allows the function to be non-allocating in terms of the output string.
*	video_core/ast: Eliminate variable shadowing warnings	Lioncash	2019-10-05	1	-3/+3
\|
*	video_core/ast: Replace std::string with a constexpr std::string_view	Lioncash	2019-10-05	1	-3/+1
\| \| \| \|	Same behavior, but without the need to heap allocate.
*	video_core/ast: Default the move constructor and assignment operator	Lioncash	2019-10-05	2	-26/+2
\| \| \| \| \|	This is behaviorally equivalent and also fixes a bug where some members weren't being moved over.
*	video_core/{ast, expr}: Organize forward declaration	Lioncash	2019-10-05	2	-10/+10
\| \| \| \|	Keeps them alphabetically sorted for readability.
*	video_core/expr: Supply operator!= along with operator==	Lioncash	2019-10-05	2	-1/+32
\| \| \| \|	Provides logical symmetry to the interface.
*	video_core/{ast, expr}: Use std::move where applicable	Lioncash	2019-10-05	4	-45/+47
\| \| \| \|	Avoids unnecessary atomic reference count increments and decrements.
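As an illustration of the reference-count point above, the sketch below hands a `std::shared_ptr` to a constructor either by moving or by copying. `Node`, `Holder` and the two helper functions are hypothetical and exist only for this example.

```cpp
#include <memory>
#include <utility>

struct Node {
    int value;
};

class Holder {
public:
    // Takes the pointer by value and moves it into the member, so storing it
    // does not add a second atomic increment/decrement pair.
    explicit Holder(std::shared_ptr<Node> node_) : node(std::move(node_)) {}

    const std::shared_ptr<Node>& Get() const {
        return node;
    }

private:
    std::shared_ptr<Node> node;
};

// Moving the caller's pointer in: ownership is transferred, one owner remains.
inline long OwnersAfterMove() {
    auto node = std::make_shared<Node>(Node{42});
    Holder holder(std::move(node));
    return holder.Get().use_count();
}

// Copying the caller's pointer in: the count is bumped, two owners exist.
inline long OwnersAfterCopy() {
    auto node = std::make_shared<Node>(Node{42});
    Holder holder(node);
    return node.use_count();
}
```

The copy is not wrong, it just pays for an atomic increment now and a decrement later that the move version skips.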
*	video_core/ast: Supply const accessors for data where applicable	Lioncash	2019-10-05	2	-37/+41
\| \| \| \| \|	Provides const equivalents of data accessors for use within const contexts.
*	Shader_ir: Address feedback	Fernando Sahmkow	2019-10-05	4	-50/+14
\|
*	Shader_Ir: Address Feedback and clang format.	Fernando Sahmkow	2019-10-05	3	-43/+50
\|
*	Shader_IR: clean up AST handling and add documentation.	Fernando Sahmkow	2019-10-05	1	-2/+6
\|
*	Shader_IR: Correct OutwardMoves for Ifs	Fernando Sahmkow	2019-10-05	1	-22/+11
\|
*	Shader_IR: corrections and clang-format	Fernando Sahmkow	2019-10-05	2	-70/+64
\|
*	Shader_IR: allow else derivation to be optional.	Fernando Sahmkow	2019-10-05	6	-8/+14
\|
*	vk_shader_compiler: Implement the decompiler in SPIR-V	Fernando Sahmkow	2019-10-05	2	-1/+25
\|
*	Shader_IR: mark labels as unused for partial decompile.	Fernando Sahmkow	2019-10-05	2	-3/+9
\|
*	Shader_Ir: Refactor Decompilation process and allow multiple decompilation modes.	Fernando Sahmkow	2019-10-05	10	-74/+307
\|
*	gl_shader_decompiler: Implement AST decompiling	Fernando Sahmkow	2019-10-05	10	-34/+116
\|
*	shader_ir: Declare Manager and pass it to appropiate programs.	Fernando Sahmkow	2019-10-05	7	-104/+214
\|
*	shader_ir: Corrections to outward movements and misc stuffs	Fernando Sahmkow	2019-10-05	5	-58/+305
\|
*	shader_ir: Add basic goto elimination	Fernando Sahmkow	2019-10-05	2	-38/+484
\|
*	shader_ir: Initial Decompile Setup	Fernando Sahmkow	2019-10-05	5	-5/+507
\|
*	Merge pull request #2869 from ReinUsesLisp/suld	bunnei	2019-09-24	3	-91/+101
\|\ \| \| \| \|	shader/image: Implement SULD and fix SUATOM
\| *	gl_shader_decompiler: Use uint for images and fix SUATOM	ReinUsesLisp	2019-09-21	3	-69/+52
\| \| \| \| \| \| \| \| \| \| \| \|	In the process remove implementation of SUATOM.MIN and SUATOM.MAX as these require a distinction between U32 and S32. These have to be implemented with an imageCompSwap loop.
\| *	shader/image: Implement SULD and remove irrelevant code	ReinUsesLisp	2019-09-21	2	-25/+52
\| \| \| \| \| \| \| \| \| \|	* Implement SULD as float. * Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
* \|	Merge pull request #2870 from FernandoS27/multi-draw	David	2019-09-22	2	-0/+22
\|\ \ \| \| \| \| \| \|	Implement a MME Draw commands Inliner and correct host instance drawing
\| * \|	VideoCore: Corrections to the MME Inliner and removal of hacky instance management.	Fernando Sahmkow	2019-09-19	2	-0/+22
\| \| \|
* \| \|	Merge pull request #2878 from FernandoS27/icmp	Rodrigo Locatti	2019-09-21	1	-0/+29
\|\ \ \ \| \|_\|/ \|/\| \|	shader_ir: Implement ICMP
\| * \|	Shader_IR: ICMP corrections and fixes	Fernando Sahmkow	2019-09-21	1	-6/+9
\| \| \|
\| * \|	Shader_IR: Implement ICMP.	Fernando Sahmkow	2019-09-20	1	-0/+26
\| \|/
* \|	Merge pull request #2855 from ReinUsesLisp/shfl	bunnei	2019-09-20	2	-0/+57
\|\ \ \| \|/ \|/\|	shader_ir/warp: Implement SHFL for Nvidia devices
\| *	shader_ir/warp: Implement SHFL	ReinUsesLisp	2019-09-17	2	-0/+57
\| \|
* \|	Merge pull request #2784 from ReinUsesLisp/smem	bunnei	2019-09-18	4	-21/+58
\|\ \ \| \|/ \|/\|	shader_ir: Implement shared memory
\| *	shader_ir: Implement LD_S	ReinUsesLisp	2019-09-05	1	-10/+13
\| \| \| \| \| \| \| \|	Loads from shared memory.
\| *	shader_ir: Implement ST_S	ReinUsesLisp	2019-09-05	4	-11/+45
\| \| \| \| \| \| \| \| \| \|	This instruction writes to a memory buffer shared with threads within the same work group. It is known as "shared" memory in GLSL.
* \|	shader/image: Implement SUATOM and fix SUST	ReinUsesLisp	2019-09-11	3	-37/+122
\| \|
* \|	Merge pull request #2823 from ReinUsesLisp/shr-clamp	bunnei	2019-09-10	1	-6/+13
\|\ \ \| \| \| \| \| \|	shader/shift: Implement SHR wrapped and clamped variants
\| * \|	shader/shift: Implement SHR wrapped and clamped variants	ReinUsesLisp	2019-09-04	1	-6/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Nvidia defaults to wrapped shifts, but this is undefined behaviour on OpenGL's spec. Explicitly mask/clamp according to what the guest shader requires.
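A rough sketch of the two shift modes mentioned above, on unsigned 32-bit values. Masking to 5 bits for the wrapped mode follows directly from the description; capping the amount at 31 for the clamped mode is an assumption of this example and may not match the guest semantics exactly. Both function names are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Wrapped: only the low 5 bits of the shift amount are used.
inline std::uint32_t ShiftRightWrapped(std::uint32_t value, std::uint32_t shift) {
    return value >> (shift & 31u);
}

// Clamped: amounts above 31 are capped at 31 (assumed behaviour for this sketch).
inline std::uint32_t ShiftRightClamped(std::uint32_t value, std::uint32_t shift) {
    return value >> std::min(shift, 31u);
}
```

Either way the amount handed to the host shift stays in 0..31, which is the range where the result is defined.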
* \| \|	gl_shader_decompiler: Keep track of written images and mark them as modified	ReinUsesLisp	2019-09-06	3	-42/+54
\| \| \|
* \| \|	kepler_compute: Implement texture queries	ReinUsesLisp	2019-09-06	1	-0/+4
\| \|/ \|/\|
* \|	half_set_predicate: Fix predicate assignments	ReinUsesLisp	2019-09-04	1	-10/+9
\|/
*	Merge pull request #2812 from ReinUsesLisp/f2i-selector	bunnei	2019-09-04	1	-6/+16
\|\ \| \| \| \|	shader_ir/conversion: Implement F2I and F2F F16 selector
\| *	shader_ir/conversion: Split int and float selector and implement F2F H1	ReinUsesLisp	2019-08-28	1	-18/+16
\| \|
\| *	shader_ir/conversion: Implement F2I F16 Ra.H1	ReinUsesLisp	2019-08-28	1	-4/+16
\| \|
* \|	Merge pull request #2811 from ReinUsesLisp/fsetp-fix	bunnei	2019-09-04	1	-4/+5
\|\ \ \| \| \| \| \| \|	float_set_predicate: Add missing negation bit for the second operand
\| * \|	float_set_predicate: Add missing negation bit for the second operand	ReinUsesLisp	2019-08-28	1	-4/+5
\| \|/
* \|	video_core: Silent miscellaneous warnings (#2820)	Rodrigo Locatti	2019-08-30	5	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* texture_cache/surface_params: Remove unused local variable * rasterizer_interface: Add missing documentation commentary * maxwell_dma: Remove unused rasterizer reference * video_core/gpu: Sort member declaration order to silent -Wreorder warning * fermi_2d: Remove unused MemoryManager reference * video_core: Silent unused variable warnings * buffer_cache: Silent -Wreorder warnings * kepler_memory: Remove unused MemoryManager reference * gl_texture_cache: Add missing override * buffer_cache: Add missing include * shader/decode: Remove unused variables
* \|	Merge pull request #2758 from ReinUsesLisp/packed-tid	bunnei	2019-08-29	3	-0/+15
\|\ \ \| \| \| \| \| \|	shader/decode: Implement S2R Tic
\| * \|	shader/decode: Implement S2R Tic	ReinUsesLisp	2019-07-22	3	-0/+15
\| \| \|
* \| \|	shader_ir: Implement VOTE	ReinUsesLisp	2019-08-21	4	-0/+62
\| \|/ \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement VOTE using Nvidia's intrinsics. Documentation about these can be found here https://developer.nvidia.com/reading-between-threads-shader-intrinsics Instead of using portable ARB instructions I opted to use Nvidia intrinsics because these are the closest we have to how Tegra X1 hardware renders. To stub VOTE on non-Nvidia drivers (including nouveau) this commit simulates a GPU with a warp size of one, returning what is meaningful for the instruction being emulated: * anyThreadNV(value) -> value * allThreadsNV(value) -> value * allThreadsEqualNV(value) -> true ballotARB, also known as "uint64_t(activeThreadsNV())", emits VOTE.ANY Rd, PT, PT; on nouveau's compiler. This doesn't match exactly to Nvidia's code VOTE.ALL Rd, PT, PT; Which is emulated with activeThreadsNV() by this commit. In theory this shouldn't really matter since .ANY, .ALL and .EQ affect the predicates (set to PT on those cases) and not the registers.
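The fallback described above, which treats the GPU as having a warp size of one, reduces each vote to a trivial function of the calling thread's own value. The C++ names below are hypothetical mirrors of the intrinsics named in the message, not code from the commit.

```cpp
// With a single thread per warp, "any thread" and "all threads" both
// collapse to the calling thread's own value.
inline bool AnyThread(bool value) {
    return value;
}

inline bool AllThreads(bool value) {
    return value;
}

// One thread always agrees with itself, so the equality vote is always true.
inline bool AllThreadsEqual(bool /*value*/) {
    return true;
}
```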
* \|	Merge pull request #2777 from ReinUsesLisp/hsetp2-fe3h-fix	bunnei	2019-08-21	1	-1/+1
\|\ \ \| \| \| \| \| \|	half_set_predicate: Fix HSETP2_C constant buffer offset
\| * \|	half_set_predicate: Fix HSETP2_C constant buffer offset	ReinUsesLisp	2019-08-04	1	-1/+1
\| \| \|
* \| \|	Merge pull request #2753 from FernandoS27/float-convert	bunnei	2019-08-21	2	-16/+39
\|\ \ \ \| \| \| \| \| \| \| \|	Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
\| * \| \|	Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.	Fernando Sahmkow	2019-07-20	2	-16/+39
\| \| \|/ \| \|/\| \| \| \| \| \| \| \| \| \|	This commit takes care of implementing the F16 Variants of the conversion instructions and makes sure conversions are done.
* \| \|	Merge pull request #2778 from ReinUsesLisp/nop	bunnei	2019-08-18	1	-0/+6
\|\ \ \ \| \| \| \| \| \| \| \|	shader_ir: Implement NOP
\| * \| \|	shader_ir: Implement NOP	ReinUsesLisp	2019-08-04	1	-0/+6
\| \| \|/ \| \|/\|
* / \|	decode/half_set_predicate: Fix predicates	ReinUsesLisp	2019-07-26	1	-3/+3
\|/ /
* \|	Merge pull request #2739 from lioncash/cflow	bunnei	2019-07-25	3	-30/+51
\|\ \ \| \| \| \| \| \|	video_core/control_flow: Minor changes/warning cleanup
\| * \|	video_core/control_flow: Provide operator!= for types with operator==	Lioncash	2019-07-19	1	-4/+21
\| \| \| \| \| \| \| \| \| \| \| \|	Provides operational symmetry for the respective structures.
\| * \|	video_core/control_flow: Prevent sign conversion in TryGetBlock()	Lioncash	2019-07-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The return value is a u32, not an s32, so this would result in an implicit signedness conversion.
\| * \|	video_core/control_flow: Remove unnecessary BlockStack copy constructor	Lioncash	2019-07-19	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the default behavior of the copy constructor, so it doesn't need to be specified. While we're at it we can make the other non-default constructor explicit.
\| * \|	video_core/control_flow: Use std::move where applicable	Lioncash	2019-07-19	1	-10/+15
\| \| \| \| \| \| \| \| \| \| \| \|	Results in less work being done where avoidable.
\| * \|	video_core/control_flow: Use the prefix variant of operator++ for iterators	Lioncash	2019-07-19	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Same thing, but potentially allows a standard library implementation to pick a more efficient codepath.
\| * \|	video_core/control_flow: Use empty() member function for checking emptiness	Lioncash	2019-07-19	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	It's what it's there for.
\| * \|	video_core: Resolve -Wreorder warnings	Lioncash	2019-07-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ensures that the constructor members are always initialized in the order that they're declared in.
\| * \|	video_core/control_flow: Make program_size for ScanFlow() a std::size_t	Lioncash	2019-07-19	2	-5/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Prevents a truncation warning from occurring with MSVC. Also the internal data structures already treat it as a size_t, so this is just a discrepancy in the interface.
\| * \|	video_core/control_flow: Place all internally linked types/functions within an anonymous namespace	Lioncash	2019-07-19	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, quite a few functions were being linked with external linkage.
\| * \|	video_core/shader/decode: Prevent sign-conversion warnings	Lioncash	2019-07-19	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Makes it explicit that the conversions here are intentional.
* \| \|	Merge pull request #2737 from FernandoS27/track-fix	bunnei	2019-07-25	1	-2/+2
\|\ \ \ \| \| \| \| \| \| \| \|	Shader_Ir: Correct tracking to track from right to left
\| * \| \|	Shader_Ir: Correct tracking to track from right to left	Fernando Sahmkow	2019-07-16	1	-2/+2
\| \| \| \|
* \| \| \|	Merge pull request #2743 from FernandoS27/surpress-assert	bunnei	2019-07-25	5	-13/+20
\|\ \ \ \ \| \|_\|_\|/ \|/\| \| \|	Downgrade and suppress a series of GPU asserts and debug messages.
\| * \| \|	Shader_Ir: Change Debug Asserts for Log Warnings	Fernando Sahmkow	2019-07-20	3	-10/+17
\| \| \| \|
\| * \| \|	Shader_Ir: correct clang format	Fernando Sahmkow	2019-07-18	1	-2/+2
\| \| \| \|
\| * \| \|	Shader_Ir: Downgrade precision and rounding asserts to debug asserts.	Fernando Sahmkow	2019-07-18	5	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit reduces the severity of asserts for FP precision and rounding, as these are well known and have little to no consequence for the GPU's accuracy.
* \| \| \|	shader/half_set_predicate: Fix HSETP2 implementation	ReinUsesLisp	2019-07-20	2	-19/+15
\| \| \| \|
* \| \| \|	shader/half_set_predicate: Implement missing HSETP2 variants	ReinUsesLisp	2019-07-20	1	-13/+29
\| \|_\|/ \|/\| \|
* \| \|	Merge pull request #2738 from lioncash/shader-ir	bunnei	2019-07-18	8	-99/+103
\|\ \ \ \| \|/ / \|/\| \|	shader-ir: Minor cleanup-related changes
\| * \|	shader_ir: std::move Node instance where applicable	Lioncash	2019-07-17	4	-60/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These are std::shared_ptr instances underneath the hood, which means copying them isn't as cheap as a regular pointer. Particularly so on weakly-ordered systems. This avoids atomic reference count increments and decrements where they aren't necessary for the core set of operations.
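A minimal sketch of the reference-counting difference the message above describes. `NodeData`, `Node` and `Block` here are hypothetical stand-ins, not the actual shader IR types; only `std::shared_ptr` behaviour is being shown.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical node payload; the real IR nodes carry far more state.
struct NodeData {
    int value = 0;
};

using Node = std::shared_ptr<NodeData>;

struct Block {
    std::vector<Node> nodes;

    // Copies the shared_ptr: the caller keeps its reference, so the
    // (atomic) reference count is incremented.
    void PushCopy(const Node& node) {
        nodes.push_back(node);
    }

    // Takes ownership: the pointer is moved into the vector and the
    // reference count is left untouched.
    void PushMove(Node node) {
        nodes.push_back(std::move(node));
    }
};
```

After `PushCopy(n)` both `n` and the stored element own the node, so `use_count()` is 2. After `PushMove(std::move(m))` the source `m` is empty and the stored element is the sole owner.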
\| * \|	shader_ir: Rename Get/SetTemporal to Get/SetTemporary	Lioncash	2019-07-17	5	-36/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is more accurate in terms of describing what the functions are actually doing. Temporal relates to time, not the setting of a temporary itself.
\| * \|	shader_ir: Remove unused includes	Lioncash	2019-07-17	1	-3/+0
\| \|/ \| \| \| \| \| \|	Removes unnecessary header dependencies.
* \|	Merge pull request #2740 from lioncash/bra	Fernando Sahmkow	2019-07-17	1	-1/+1
\|\ \ \| \|/ \|/\|	shader/decode/other: Correct branch indirect argument within BRA handling
\| *	shader/decode/other: Correct branch indirect argument within BRA handling	Lioncash	2019-07-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	This appears to have been a copy/paste error introduced within 8a6fc529a968e007f01464abadd32f9b5eb0a26c
* \|	Merge pull request #2565 from ReinUsesLisp/track-indirect	Fernando Sahmkow	2019-07-16	6	-35/+36
\|\ \ \| \|/ \|/\|	shader/track: Track indirect buffers
\| *	shader: Allow tracking of indirect buffers without variable offset	ReinUsesLisp	2019-07-15	6	-35/+36
\| \| \| \| \| \| \| \| \| \| \| \|	While changing this code, simplify tracking code to allow returning the base address node, this way callers don't have to manually rebuild it on each invocation.
* \|	Merge pull request #2695 from ReinUsesLisp/layer-viewport	Fernando Sahmkow	2019-07-15	2	-0/+31
\|\ \ \| \|/ \|/\|	gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
\| *	gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders	ReinUsesLisp	2019-07-08	2	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit implements gl_ViewportIndex and gl_Layer in vertex and geometry shaders. In the case it's used in a vertex shader, it requires ARB_shader_viewport_layer_array. This extension is available on AMD and Nvidia devices (mesa and proprietary drivers), but not available on Intel on any platform. At the moment of writing this description I don't know if this is a hardware limitation or a driver limitation. In the case that ARB_shader_viewport_layer_array is not available, writes to these registers on a vertex shader are ignored, with the appropriate logging.
* \|	Merge pull request #2692 from ReinUsesLisp/tlds-f16	Fernando Sahmkow	2019-07-14	1	-1/+7
\|\ \ \| \| \| \| \| \|	shader/texture: Add F16 support for TLDS
\| * \|	shader/texture: Add F16 support for TLDS	ReinUsesLisp	2019-07-07	1	-1/+7
\| \| \|
* \| \|	shader_ir: Add comments on missing instruction.	Fernando Sahmkow	2019-07-09	2	-2/+9
\| \| \| \| \| \| \| \| \| \| \| \|	Also shows Nvidia's address space on comments.
* \| \|	shader_ir: limit exploration to best known program size.	Fernando Sahmkow	2019-07-09	1	-1/+1
\| \| \|
* \| \|	control_flow: Correct block breaking algorithm.	Fernando Sahmkow	2019-07-09	1	-17/+17
\| \| \|
* \| \|	control_flow: Assert shaders bigger than limit.	Fernando Sahmkow	2019-07-09	1	-0/+2
\| \| \|
* \| \|	control_flow: Address feedback.	Fernando Sahmkow	2019-07-09	1	-89/+37
\| \| \|
* \| \|	shader_ir: Correct parsing of scheduling instructions and correct sizing	Fernando Sahmkow	2019-07-09	2	-13/+30
\| \| \|
* \| \|	shader_ir: Correct max sizing	Fernando Sahmkow	2019-07-09	2	-2/+2
\| \| \|
* \| \|	shader_ir: Remove unnecessary constructors and use optional for ScanFlow result	Fernando Sahmkow	2019-07-09	3	-28/+17
\| \| \|
* \| \|	shader_ir: Corrections, documenting and asserting control_flow	Fernando Sahmkow	2019-07-09	3	-52/+54
\| \| \|
* \| \|	shader_ir: Unify blocks in decompiled shaders.	Fernando Sahmkow	2019-07-09	6	-54/+79
\| \| \|
* \| \|	shader_ir: Decompile Flow Stack	Fernando Sahmkow	2019-07-09	4	-11/+206
\| \| \|
* \| \|	shader_ir: propagate shader size to the IR	Fernando Sahmkow	2019-07-09	3	-6/+7
\| \| \|
* \| \|	shader_ir: Implement BRX & BRA.CC	Fernando Sahmkow	2019-07-09	3	-4/+42
\| \| \|
* \| \|	shader_ir: Remove the old scanner.	Fernando Sahmkow	2019-07-09	2	-77/+0
\| \| \|
* \| \|	shader_ir: Implement a new shader scanner	Fernando Sahmkow	2019-07-09	3	-16/+471
\| \|/ \|/\|
* \|	Delete decode_integer_set.cpp	Tobias	2019-07-07	1	-0/+0
\|/
*	decode/texture: Address feedback	ReinUsesLisp	2019-06-24	1	-0/+1
\|
*	texture_cache: Style and Corrections	Fernando Sahmkow	2019-06-21	1	-1/+2
\|
*	shader_ir: Fix image copy rebase issues	Fernando Sahmkow	2019-06-21	1	-2/+7
\|
*	shader: Implement bindless images	ReinUsesLisp	2019-06-21	3	-2/+40
\|
*	shader: Decode SUST and implement backing image functionality	ReinUsesLisp	2019-06-21	4	-1/+140
\|
*	shader: Implement texture buffers	ReinUsesLisp	2019-06-21	2	-0/+46
\|
*	shader: Split SSY and PBK stack	ReinUsesLisp	2019-06-07	2	-11/+14
\| \| \| \| \| \| \| \| \| \| \|	Hardware testing revealed that SSY and PBK push to a different stack, allowing code like this: SSY label1; PBK label2; SYNC; label1: PBK; label2: EXIT;
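The effect of separate stacks can be sketched with a toy interpreter. Everything here is hypothetical (the `Op`, `Insn` and `Run` names, and the modelling of SYNC and BRK as "pop a target and jump"), and it assumes the instruction at label1 in the message is meant to be BRK, the instruction that consumes a PBK target.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Op { SSY, PBK, SYNC, BRK, EXIT };

struct Insn {
    Op op;
    std::size_t target; // Only meaningful for SSY and PBK.
};

// Executes the program and returns the sequence of visited instruction
// indices. With split_stacks == false, SSY and PBK share a single stack.
std::vector<std::size_t> Run(const std::vector<Insn>& program, bool split_stacks) {
    std::vector<std::size_t> ssy_stack;
    std::vector<std::size_t> pbk_storage;
    std::vector<std::size_t>& pbk_stack = split_stacks ? pbk_storage : ssy_stack;

    std::vector<std::size_t> trace;
    std::size_t pc = 0;
    while (pc < program.size()) {
        trace.push_back(pc);
        const Insn& insn = program[pc];
        switch (insn.op) {
        case Op::SSY: // Push a sync target.
            ssy_stack.push_back(insn.target);
            ++pc;
            break;
        case Op::PBK: // Push a break target.
            pbk_stack.push_back(insn.target);
            ++pc;
            break;
        case Op::SYNC: // Jump to the most recent sync target.
            assert(!ssy_stack.empty());
            pc = ssy_stack.back();
            ssy_stack.pop_back();
            break;
        case Op::BRK: // Jump to the most recent break target.
            assert(!pbk_stack.empty());
            pc = pbk_stack.back();
            pbk_stack.pop_back();
            break;
        case Op::EXIT:
            return trace;
        }
    }
    return trace;
}
```

For the sequence in the message, encoded as `{SSY 3, PBK 4, SYNC, BRK, EXIT}`, split stacks let SYNC reach index 3 and BRK reach index 4. With one shared stack, SYNC would pop the PBK target instead and skip index 3 entirely.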
*	shader/node: Minor changes	ReinUsesLisp	2019-06-07	1	-50/+54
\| \| \| \| \| \| \|	Reflect std::shared_ptr nature of Node on initializers and remove constant members in nodes. Add some commentaries.
*	shader: Move Node declarations out of the shader IR header	ReinUsesLisp	2019-06-07	3	-493/+517
\| \| \| \| \| \|	Analysis passes do not have a good reason to depend on shader_ir.h to work on top of nodes. This splits node-related declarations to their own file and leaves the IR in shader_ir.h
*	shader: Use shared_ptr to store nodes and move initialization to file	ReinUsesLisp	2019-06-06	32	-192/+238
\| \| \| \| \| \| \| \| \|	Instead of storing unique_ptr instances in a vector and returning raw pointers to them, use shared_ptr. While changing initialization code, move it to a separate file when possible. This is a first step to allow code analysis and node generation beyond the ShaderIR class.
*	Merge pull request #2446 from ReinUsesLisp/tid	bunnei	2019-05-29	2	-15/+35
\|\ \| \| \| \|	shader: Implement S2R Tid{XYZ} and CtaId{XYZ}
\| *	shader: Implement S2R Tid{XYZ} and CtaId{XYZ}	ReinUsesLisp	2019-05-20	2	-15/+35
\| \|
* \|	Merge pull request #2485 from ReinUsesLisp/generic-memory	bunnei	2019-05-25	2	-31/+57
\|\ \ \| \| \| \| \| \|	shader/memory: Implement generic memory stores and loads (ST and LD)
\| * \|	shader/memory: Implement ST (generic memory)	ReinUsesLisp	2019-05-21	1	-21/+35
\| \| \|
\| * \|	shader/memory: Implement LD (generic memory)	ReinUsesLisp	2019-05-21	2	-11/+23
\| \|/
* \|	shader/shader_ir: Make Comment() take a std::string by value	Lioncash	2019-05-23	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This allows for forming comment nodes without making unnecessary copies of the std::string instance. For example, previously: Comment(fmt::format("Base address is c[0x{:x}][0x{:x}]", cbuf->GetIndex(), cbuf_offset)); would result in a copy of the string being created, as CommentNode() takes a std::string by value (a const ref passed to a value parameter results in a copy). Now, only one instance of the string is ever moved around: fmt::format returns a std::string by value, which is a prvalue (and can be treated like an rvalue), so it's moved into Comment's string parameter. We then move it into the CommentNode constructor, which in turn moves the string into its member variable.
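A minimal sketch of the by-value-then-move pattern from the message above. `TrackedString`, `g_copies`, the simplified `CommentNode` and `CommentByConstRef` are all hypothetical; the counter exists only to make copies observable and is not part of the real code.

```cpp
#include <string>
#include <utility>

int g_copies = 0; // Hypothetical counter: number of copy constructions.

// Hypothetical string wrapper that records every copy.
struct TrackedString {
    std::string value;

    explicit TrackedString(std::string v) : value{std::move(v)} {}
    TrackedString(const TrackedString& other) : value{other.value} { ++g_copies; }
    TrackedString(TrackedString&& other) noexcept : value{std::move(other.value)} {}
};

// Simplified stand-in for the comment node: takes its text by value and
// moves it into the member.
class CommentNode {
public:
    explicit CommentNode(TrackedString text_) : text{std::move(text_)} {}

    const std::string& Text() const {
        return text.value;
    }

private:
    TrackedString text;
};

// Old shape: a const reference forwarded to a by-value parameter is copied.
CommentNode CommentByConstRef(const TrackedString& text) {
    return CommentNode(text);
}

// New shape: a temporary lands in the parameter and is then moved onwards.
CommentNode Comment(TrackedString text) {
    return CommentNode(std::move(text));
}
```

Calling `Comment(TrackedString{"..."})` performs no copies, while `CommentByConstRef(TrackedString{"..."})` performs exactly one, at the point where the const reference is passed to the by-value constructor parameter.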
* \|	shader/decode/*: Add missing newline to files lacking them	Lioncash	2019-05-23	18	-18/+18
\| \| \| \| \| \| \| \|	Keeps the shader code file endings consistent.
* \|	shader/decode/*: Eliminate indirect inclusions	Lioncash	2019-05-23	6	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Amends cases where we were using things that were indirectly being satisfied through other headers. This way, if those headers change and eliminate dependencies on other headers in the future, we don't have cascading compilation errors.
* \|	shader/decode/memory: Remove left in debug pragma	Lioncash	2019-05-22	1	-2/+0
\|/
*	Merge pull request #2441 from ReinUsesLisp/al2p	bunnei	2019-05-19	4	-34/+67
\|\ \| \| \| \|	shader: Implement AL2P and ALD.PHYS
\| *	shader_ir/other: Implement IPA.IDX	ReinUsesLisp	2019-05-03	1	-5/+8
\| \|
\| *	shader_ir/memory: Assert on non-32 bits ALD.PHYS	ReinUsesLisp	2019-05-03	1	-0/+3
\| \|
\| *	shader: Add physical attributes commentaries	ReinUsesLisp	2019-05-03	3	-4/+6
\| \|
\| *	gl_shader_decompiler: Implement GLSL physical attributes	ReinUsesLisp	2019-05-03	1	-1/+1
\| \|
\| *	shader_ir/memory: Implement physical input attributes	ReinUsesLisp	2019-05-03	3	-6/+28
\| \|
\| *	shader: Remove unused AbufNode Ipa mode	ReinUsesLisp	2019-05-03	4	-29/+10
\| \|
\| *	shader_ir/memory: Emit AL2P IR	ReinUsesLisp	2019-05-03	2	-0/+22
\| \|
* \|	shader/shader_ir: Remove unnecessary inline specifiers	Lioncash	2019-05-19	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	constexpr internally links by default, so the inline specifier is unnecessary.
* \|	shader/shader_ir: Simplify constructors for OperationNode	Lioncash	2019-05-19	1	-15/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Many of these constructors don't even need to be templated. The only ones that need to be templated are the ones that actually make use of the parameter pack. Even then, since std::vector accepts an initializer list, we can supply the parameter pack directly to it instead of creating our own copy of the list, then copying it again into the std::vector.
* \|	shader/shader_ir: Remove unnecessary template parameter packs from Operation() overloads where applicable	Lioncash	2019-05-19	1	-2/+0
\| \| \| \| \| \| \| \| \| \|	These overloads don't actually make use of the parameter pack, so they can be turned into regular non-template function overloads.
* \|	shader/shader_ir: Mark tracking functions as const member functions	Lioncash	2019-05-19	2	-8/+11
\| \| \| \| \| \| \| \| \| \|	These don't actually modify instance state, so they can be marked as const member functions
* \|	shader/shader_ir: Place implementations of constructor and destructor in cpp file	Lioncash	2019-05-19	2	-5/+9
\| \| \| \| \| \| \| \| \| \| \| \|	Given the class contains quite a lot of non-trivial types, place the constructor and destructor within the cpp file to avoid inlining construction and destruction code everywhere the class is used.
* \|	video_core/shader/decode/texture: Remove unused variable from GetTld4Code()	Lioncash	2019-05-10	1	-1/+0
\| \|
* \|	shader/decode/texture: Remove unused variable	Lioncash	2019-05-04	1	-1/+0
\|/ \| \| \|	This isn't used anywhere, so we can get rid of it.
*	Merge pull request #2435 from ReinUsesLisp/misc-vc	bunnei	2019-04-29	2	-3/+4
\|\ \| \| \| \|	shader_ir: Miscellaneous fixes
\| *	shader_ir: Move Sampler index entry in operand< to sort declarations	ReinUsesLisp	2019-04-26	1	-2/+2
\| \|
\| *	shader_ir: Add missing entry to Sampler operand< comparison	ReinUsesLisp	2019-04-26	1	-2/+3
\| \|
\| *	shader_ir/texture: Fix sampler const buffer key shift	ReinUsesLisp	2019-04-26	1	-1/+1
\| \|
* \|	Merge pull request #2322 from ReinUsesLisp/wswitch	bunnei	2019-04-29	4	-9/+16
\|\ \ \| \| \| \| \| \|	video_core: Silent -Wswitch warnings
\| * \|	video_core: Silent -Wswitch warnings	ReinUsesLisp	2019-04-18	4	-9/+16
\| \| \|
* \| \|	Merge pull request #2423 from FernandoS27/half-correct	bunnei	2019-04-29	2	-15/+16
\|\ \ \ \| \|_\|/ \|/\| \|	Corrections on Half Float operations: HADD2 HMUL2 and HFMA2
\| * \|	Corrections Half Float operations on const buffers and implement saturation.	Fernando Sahmkow	2019-04-21	2	-15/+16
\| \| \|
* \| \|	Merge pull request #2407 from FernandoS27/f2f	bunnei	2019-04-20	1	-16/+53
\|\ \ \ \| \|/ / \|/\| \|	Do some corrections in conversion shader instructions.
\| * \|	Do some corrections in conversion shader instructions.	Fernando Sahmkow	2019-04-16	1	-16/+53
\| \|/ \| \| \| \| \| \| \| \| \| \|	Corrects encodings for I2F, F2F, I2I and F2I. Implements immediate variants of all four conversion types. Adds assertions for unimplemented features.
* \|	Merge pull request #2409 from ReinUsesLisp/half-floats	bunnei	2019-04-20	7	-81/+85
\|\ \ \| \| \| \| \| \|	shader_ir/decode: Miscellaneous fixes to half-float decompilation
\| * \|	shader_ir/decode: Fix half float pre-operations and remove MetaHalfArithmetic	ReinUsesLisp	2019-04-16	7	-52/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Operations done before the main half float operation (like HAdd) were managing a packed value instead of the unpacked one. Adding an unpacked operation allows us to drop the per-operand MetaHalfArithmetic entry, simplifying the code overall.
\| * \|	shader_ir/decode: Implement half float saturation	ReinUsesLisp	2019-04-16	3	-4/+14
\| \| \|
\| * \|	shader_ir/decode: Reduce severity of unimplemented half-float FTZ	ReinUsesLisp	2019-04-16	3	-3/+9
\| \| \|
\| * \|	renderer_opengl: Implement half float NaN comparisons	ReinUsesLisp	2019-04-16	2	-18/+17
\| \| \|
\| * \|	shader_ir: Avoid using static on heap-allocated objects	ReinUsesLisp	2019-04-16	1	-5/+4
\| \|/ \| \| \| \| \| \| \| \|	Using static here might be faster at runtime, but it adds a heap allocation called before main.
* \|	Merge pull request #2348 from FernandoS27/guest-bindless	bunnei	2019-04-18	2	-24/+129
\|\ \ \| \| \| \| \| \|	Implement Bindless Textures on Shader Decompiler and GL backend
\| * \|	Adapt Bindless to work with AOFFI	Fernando Sahmkow	2019-04-08	1	-7/+18
\| \| \|
\| * \|	Move ConstBufferAccessor to Maxwell3d, correct mistakes and clang format.	Fernando Sahmkow	2019-04-08	2	-3/+4
\| \| \|
\| * \|	Fix TMML	Fernando Sahmkow	2019-04-08	1	-5/+7
\| \| \|
\| * \|	Refactor GetTextureCode and GetTexCode to use an optional instead of optional parameters	Fernando Sahmkow	2019-04-08	2	-34/+33
\| \| \|
\| * \|	Implement TXQ_B	Fernando Sahmkow	2019-04-08	1	-2/+8
\| \| \|
\| * \|	Implement TMML_B	Fernando Sahmkow	2019-04-08	1	-5/+10
\| \| \|
\| * \|	Corrections to TEX_B	Fernando Sahmkow	2019-04-08	1	-4/+5
\| \| \|
\| * \|	Implement Bindless Handling on SetupTexture	Fernando Sahmkow	2019-04-08	1	-4/+3
\| \| \|
\| * \|	Unify both sampler types.	Fernando Sahmkow	2019-04-08	2	-18/+40
\| \| \|
\| * \|	Implement Bindless Samplers and TEX_B in the IR.	Fernando Sahmkow	2019-04-08	2	-15/+74
\| \| \|
* \| \|	Merge pull request #2315 from ReinUsesLisp/severity-decompiler	bunnei	2019-04-17	1	-4/+5
\|\ \ \ \| \| \| \| \| \| \| \|	shader_ir/decode: Reduce the severity of common assertions
\| * \| \|	shader_ir/memory: Reduce severity of LD_L cache management and log it	ReinUsesLisp	2019-04-03	1	-2/+2
\| \| \| \|
\| * \| \|	shader_ir/memory: Reduce severity of ST_L cache management and log it	ReinUsesLisp	2019-04-03	1	-2/+3
\| \| \| \|
* \| \| \|	shader_ir: Implement STG, keep track of global memory usage and flush	ReinUsesLisp	2019-04-14	2	-38/+87
\| \|_\|/ \|/\| \|
* \| \|	Correct XMAD mode, psl and high_b on different encodings.	Fernando Sahmkow	2019-04-08	1	-9/+30
\| \|/ \|/\|
* \|	shader_ir/decode: Silent implicit sign conversion warning	Mat M	2019-03-31	1	-2/+2
\| \| \| \| \| \|	Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
* \|	shader_ir/decode: Implement AOFFI for TEX and TLD4	ReinUsesLisp	2019-03-30	2	-27/+94
\| \|
* \|	shader_ir: Implement immediate register tracking	ReinUsesLisp	2019-03-30	2	-1/+19
\|/
*	shader/decode: Remove extras from MetaTexture	ReinUsesLisp	2019-02-26	2	-15/+26
\|
*	shader/decode: Split memory and texture instructions decoding	ReinUsesLisp	2019-02-26	4	-493/+527
\|
*	shader/track: Resolve variable shadowing warnings	Lioncash	2019-02-25	1	-5/+5
\|
*	Merge pull request #2118 from FernandoS27/ipa-improve	bunnei	2019-02-25	2	-3/+14
\|\ \| \| \| \|	shader_decompiler: Improve Accuracy of Attribute Interpolation.
\| *	shader_decompiler: Improve Accuracy of Attribute Interpolation.	Fernando Sahmkow	2019-02-14	2	-3/+14
\| \|
* \|	gl_shader_decompiler: Re-implement TLDS lod	ReinUsesLisp	2019-02-12	1	-1/+1
\|/
*	Merge pull request #2108 from FernandoS27/fix-cc	bunnei	2019-02-12	1	-2/+2
\|\ \| \| \| \|	Fix incorrect value for CC bit in IADD
\| *	Fix incorrect value for CC bit in IADD	Fernando Sahmkow	2019-02-11	1	-2/+2
\| \|
* \|	Merge pull request #2109 from FernandoS27/fix-f2i	bunnei	2019-02-12	1	-3/+3
\|\ \ \| \| \| \| \| \|	Corrected F2I None mode to RoundEven.
\| * \|	Corrected F2I None mode to RoundEven.	Fernando Sahmkow	2019-02-11	1	-3/+3
\| \|/
* \|	shader_ir: Remove F4 prefix to texture operations	ReinUsesLisp	2019-02-07	2	-14/+13
\| \| \| \| \| \| \| \| \| \| \| \|	This was originally included because texture operations returned a vec4. These operations now return a single float and the F4 prefix doesn't mean anything.
* \|	shader_ir: Clean texture management code	ReinUsesLisp	2019-02-07	2	-101/+63
\|/ \| \| \| \| \| \| \| \|	Previous code relied on GLSL parameter order (something that's always ill-formed on an IR design). This approach passes spatial coordinates through operation nodes and array and depth compare values in the texture metadata. It still contains an "extra" vector containing generic nodes for bias and component index (for example) which is still a bit ill-formed but it should be better than the previous approach.
*	Merge pull request #2083 from ReinUsesLisp/shader-ir-cbuf-tracking	bunnei	2019-02-07	29	-124/+138
\|\ \| \| \| \|	shader/track: Add a more permissive global memory tracking
\| *	shader/track: Search inside of conditional nodes	ReinUsesLisp	2019-02-03	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \|	Some games conditionally use global memory instructions. This allows the heuristic to search inside conditional nodes for the source constant buffer.
\| *	shader_ir: Rename BasicBlock to NodeBlock	ReinUsesLisp	2019-02-03	29	-119/+117
\| \| \| \| \| \| \| \|	It's not always used as a basic block. Rename it for consistency.
\| *	shader_ir: Pass decoded nodes as a whole instead of per basic blocks	ReinUsesLisp	2019-02-03	27	-57/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some games call LDG at the top of a basic block, causing the tracking heuristic to fail. This commit lets the heuristic see the decoded nodes as a whole instead of per basic block. This may lead to some false positives but allows the heuristic to track cases it previously couldn't.
* \|	gl_shader_disk_cache: Save GLSL and entries into the precompiled file	ReinUsesLisp	2019-02-07	1	-0/+9
\| \|
* \|	Merge pull request #2081 from ReinUsesLisp/lmem-64	bunnei	2019-02-05	1	-12/+43
\|\ \ \| \| \| \| \| \|	shader_ir/memory: Add LD_L 64 bits loads
\| * \|	shader_ir/memory: Add ST_L 64 and 128 bits stores	ReinUsesLisp	2019-02-03	1	-3/+11
\| \| \|
\| * \|	shader_ir/memory: Add LD_L 128 bits loads	ReinUsesLisp	2019-02-03	1	-7/+19
\| \| \|
\| * \|	shader_bytecode: Rename BytesN enums to BitsN	ReinUsesLisp	2019-02-03	1	-4/+4
\| \| \|
\| * \|	shader_ir/memory: Add LD_L 64 bits loads	ReinUsesLisp	2019-02-03	1	-6/+17
\| \|/
* \|	Merge pull request #2082 from FernandoS27/txq-stl	bunnei	2019-02-05	1	-6/+9
\|\ \ \| \|/ \|/\|	Fix TXQ not using the component mask.
\| *	Fix TXQ not using the component mask.	Fernando Sahmkow	2019-02-03	1	-6/+9
\| \|
* \|	shader_ir: Unify constant buffer offset values	ReinUsesLisp	2019-01-30	14	-22/+24
\|/ \| \| \| \| \| \|	Constant buffer values on the shader IR were using different offsets depending on whether the access was direct or indirect. cbuf34 has a non-multiplied offset while cbuf36 does. On shader decoding this commit multiplies it by four on cbuf34 queries.
*	shader_decode: Implement LDG and basic cbuf tracking	ReinUsesLisp	2019-01-30	3	-4/+159
\|
*	shader/shader_ir: Amend three comment typos	Lioncash	2019-01-28	1	-3/+3
\| \| \| \| \|	Given we're in the area, these are three trivial typos that can be corrected.
*	shader/shader_ir: Amend constructor initializer ordering for AbufNode	Lioncash	2019-01-28	1	-2/+2
\| \| \| \| \|	Orders the class members in the same order that they would actually be initialized in. Gets rid of two compiler warnings.
*	shader/decode: Avoid a pessimizing std::move within DecodeRange()	Lioncash	2019-01-28	1	-1/+1
\| \| \| \| \| \|	Calling std::move on a local variable in a return statement has the potential to prevent copy elision from occurring, so this can just be converted into a regular return.
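A minimal sketch of why the explicit move is a pessimization. `Program`, `g_moves`, `DecodePlain` and `DecodeWithMove` are hypothetical names used only to count move constructions; they are not the real DecodeRange() code.

```cpp
#include <utility>
#include <vector>

int g_moves = 0; // Hypothetical counter: number of move constructions.

// Hypothetical result type that records every move construction.
struct Program {
    std::vector<int> code;

    Program() = default;
    Program(const Program&) = default;
    Program(Program&& other) noexcept : code{std::move(other.code)} { ++g_moves; }
};

// Plain return: the compiler may construct `program` directly in the
// caller's storage (named return value optimization), so no move is needed.
Program DecodePlain() {
    Program program;
    program.code.push_back(1);
    return program;
}

// Explicit std::move: the operand is no longer the name of a local object,
// so elision is not permitted and a move construction always happens.
Program DecodeWithMove() {
    Program program;
    program.code.push_back(1);
    return std::move(program);
}
```

`DecodeWithMove()` always pays for at least one move construction, while `DecodePlain()` never needs more than that and typically needs none when the compiler applies the optimization. Some compilers also flag the second form with a pessimizing-move warning.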
*	shader_ir: Fixup clang build	ReinUsesLisp	2019-01-16	1	-4/+6
\|
*	shader_decode: Fixup XMAD	ReinUsesLisp	2019-01-15	1	-1/+1
\|
*	shader_ir: Pass to decoder functions basic block's code	ReinUsesLisp	2019-01-15	27	-82/+83
\|
*	shader_decode: Improve zero flag implementation	ReinUsesLisp	2019-01-15	15	-75/+79
\|
*	shader_ir: Remove composite primitives and use temporals instead	ReinUsesLisp	2019-01-15	3	-175/+187
\|
*	shader_decode: Use proper primitive names	ReinUsesLisp	2019-01-15	3	-15/+13
\|
*	shader_decode: Use BitfieldExtract instead of shift + and	ReinUsesLisp	2019-01-15	7	-48/+30
\|
*	shader_ir: Remove Ipa primitive	ReinUsesLisp	2019-01-15	2	-5/+2
\|
*	video_core: Rename glsl_decompiler to gl_shader_decompiler	ReinUsesLisp	2019-01-15	2	-1631/+0
\|
*	shader_ir: Remove RZ and use Register::ZeroIndex instead	ReinUsesLisp	2019-01-15	3	-12/+16
\|
*	shader_decode: Implement TEXS.F16	ReinUsesLisp	2019-01-15	3	-15/+57
\|
*	shader_decode: Fixup R2P	ReinUsesLisp	2019-01-15	1	-2/+3
\|
*	glsl_decompiler: Fixup TLDS	ReinUsesLisp	2019-01-15	1	-1/+0
\|
*	glsl_decompiler: Fixup geometry shaders	ReinUsesLisp	2019-01-15	1	-10/+16
\|
*	shader_decode: Fixup WriteLogicOperation zero comparison	ReinUsesLisp	2019-01-15	1	-1/+1
\|
*	glsl_decompiler: Fixup permissive member function declarations	ReinUsesLisp	2019-01-15	1	-133/+133
\|
*	shader_decode: Fixup PSET	ReinUsesLisp	2019-01-15	1	-2/+3
\|
*	shader_decode: Fixup clang-format	ReinUsesLisp	2019-01-15	2	-2/+4
\|
*	video_core: Implement IR based geometry shaders	ReinUsesLisp	2019-01-15	3	-2/+96
\|
*	shader_decode: Implement VMAD and VSETP	ReinUsesLisp	2019-01-15	3	-0/+125
\|
*	shader_decode: Implement HSET2	ReinUsesLisp	2019-01-15	3	-1/+50
\|
*	shader_decode: Rework HSETP2	ReinUsesLisp	2019-01-15	4	-47/+57
\|
*	shader_decode: Implement R2P	ReinUsesLisp	2019-01-15	1	-1/+28
\|
*	shader_decode: Implement CSETP	ReinUsesLisp	2019-01-15	1	-14/+37
\|
*	shader_decode: Implement PSET	ReinUsesLisp	2019-01-15	1	-1/+16
\|
*	shader_decode: Implement HFMA2	ReinUsesLisp	2019-01-15	3	-5/+59
\|
*	glsl_decompiler: Remove HNegate inlining	ReinUsesLisp	2019-01-15	1	-10/+0
\|
*	shader_decode: Implement POPC	ReinUsesLisp	2019-01-15	4	-1/+22
\|
*	shader_decode: Implement TLDS (untested)	ReinUsesLisp	2019-01-15	3	-10/+92
\|
*	shader_decode: Update TLD4 reflecting #1862 changes	ReinUsesLisp	2019-01-15	2	-52/+52
\|
*	shader_ir: Fixup TEX and TEXS and partially fix TLD4 decompiling	ReinUsesLisp	2019-01-15	3	-60/+72
\|
*	shader_decode: Fixup FSET	ReinUsesLisp	2019-01-15	1	-2/+2
\|
*	shader_decode: Implement IADD32I	ReinUsesLisp	2019-01-15	1	-0/+11
\|
*	video_core: Return safe values after an assert hits	ReinUsesLisp	2019-01-15	8	-8/+19
\|
*	shader_decode: Implement FFMA	ReinUsesLisp	2019-01-15	1	-1/+36
\|
*	video_core: Address feedback	ReinUsesLisp	2019-01-15	4	-13/+16
\|
*	shader_ir: Fixup file inclusions and clang-format	ReinUsesLisp	2019-01-15	3	-2/+2
\|
*	shader_ir: Move comment node string	Mat M	2019-01-15	1	-2/+2
\| \| \|	Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
*	shader_ir: Address feedback to avoid UB in bit casting	ReinUsesLisp	2019-01-15	1	-2/+4
\|
*	shader_decode: Fixup clang-format	ReinUsesLisp	2019-01-15	2	-3/+2
\|
*	shader_decode: Implement LEA	ReinUsesLisp	2019-01-15	1	-0/+55
\|
*	shader_decode: Implement IADD3	ReinUsesLisp	2019-01-15	1	-0/+61
\|
*	shader_decode: Implement LOP3	ReinUsesLisp	2019-01-15	2	-0/+62
\|
*	shader_decode: Implement ST_L	ReinUsesLisp	2019-01-15	1	-0/+17
\|
*	shader_decode: Implement LD_L	ReinUsesLisp	2019-01-15	1	-0/+18
\|
*	shader_decode: Implement HSETP2	ReinUsesLisp	2019-01-15	1	-1/+37
\|
*	shader_decode: Implement HADD2 and HMUL2	ReinUsesLisp	2019-01-15	1	-1/+48
\|
*	shader_decode: Implement HADD2_IMM and HMUL2_IMM	ReinUsesLisp	2019-01-15	1	-1/+28
\|
*	shader_decode: Implement MOV_SYS	ReinUsesLisp	2019-01-15	1	-0/+27
\|
*	shader_decode: Implement IMNMX	ReinUsesLisp	2019-01-15	1	-0/+16
\|
*	shader_decode: Implement F2F_C	ReinUsesLisp	2019-01-15	1	-2/+10
\|
*	shader_decode: Implement I2I	ReinUsesLisp	2019-01-15	1	-0/+26
\|
*	shader_decode: Implement BRA internal flag	ReinUsesLisp	2019-01-15	1	-4/+8
\|
*	shader_decode: Implement ISCADD	ReinUsesLisp	2019-01-15	1	-0/+15
\|
*	shader_decode: Implement XMAD	ReinUsesLisp	2019-01-15	1	-1/+85
\|
*	shader_decode: Implement PBK and BRK	ReinUsesLisp	2019-01-15	1	-1/+22
\|
*	shader_decode: Implement LOP	ReinUsesLisp	2019-01-15	1	-0/+15
\|
*	shader_decode: Implement SEL	ReinUsesLisp	2019-01-15	1	-0/+8
\|
*	shader_decode: Implement IADD	ReinUsesLisp	2019-01-15	1	-1/+28
\|
*	shader_decode: Implement ISETP	ReinUsesLisp	2019-01-15	1	-1/+30
\|
*	shader_decode: Implement BFI	ReinUsesLisp	2019-01-15	1	-1/+22
\|
*	shader_decode: Implement ISET	ReinUsesLisp	2019-01-15	1	-1/+27
\|
*	shader_decode: Implement LD_C	ReinUsesLisp	2019-01-15	1	-0/+31
\|
*	shader_decode: Implement SHL	ReinUsesLisp	2019-01-15	1	-0/+8
\|
*	shader_decode: Implement SHR	ReinUsesLisp	2019-01-15	1	-1/+26
\|
*	shader_decode: Implement LOP32I	ReinUsesLisp	2019-01-15	2	-1/+72
\|
*	shader_decode: Implement BFE	ReinUsesLisp	2019-01-15	1	-1/+25
\|
*	shader_decode: Implement FSET	ReinUsesLisp	2019-01-15	1	-1/+36
\|
*	shader_decode: Implement F2I	ReinUsesLisp	2019-01-15	1	-0/+37
\|
*	shader_decode: Implement I2F	ReinUsesLisp	2019-01-15	1	-0/+23
\|
*	shader_decode: Implement F2F	ReinUsesLisp	2019-01-15	1	-1/+37
\|
*	shader_decode: Stub DEPBAR	ReinUsesLisp	2019-01-15	1	-0/+4
\|
*	shader_decode: Implement SSY and SYNC	ReinUsesLisp	2019-01-15	1	-0/+19
\|
*	shader_decode: Implement PSETP	ReinUsesLisp	2019-01-15	1	-1/+21
\|
*	shader_decode: Implement TMML	ReinUsesLisp	2019-01-15	1	-3/+45
\|
*	shader_decode: Implement TEX and TXQ	ReinUsesLisp	2019-01-15	2	-0/+223
\|
*	shader_decode: Implement TEXS (F32)	ReinUsesLisp	2019-01-15	2	-0/+217
\|
*	shader_decode: Implement FSETP	ReinUsesLisp	2019-01-15	1	-1/+33
\|
*	shader_decode: Partially implement BRA	ReinUsesLisp	2019-01-15	1	-0/+12
\|
*	shader_decode: Implement IPA	ReinUsesLisp	2019-01-15	1	-0/+12
\|
*	shader_decode: Implement EXIT	ReinUsesLisp	2019-01-15	1	-1/+32
\|
*	shader_decode: Implement ST_A	ReinUsesLisp	2019-01-15	1	-0/+30
\|
*	shader_decode: Implement LD_A	ReinUsesLisp	2019-01-15	1	-1/+39
\|
*	shader_decode: Implement FADD32I	ReinUsesLisp	2019-01-15	1	-0/+12
\|
*	shader_decode: Implement FMUL32_IMM	ReinUsesLisp	2019-01-15	1	-0/+10
\|
*	shader_decode: Implement MOV32_IMM	ReinUsesLisp	2019-01-15	1	-1/+9
\|
*	shader_decode: Stub RRO_C, RRO_R and RRO_IMM	ReinUsesLisp	2019-01-15	1	-0/+9
\|
*	shader_decode: Implement FMNMX_C, FMNMX_R and FMNMX_IMM	ReinUsesLisp	2019-01-15	1	-0/+18
\|
*	shader_decode: Implement MUFU	ReinUsesLisp	2019-01-15	1	-0/+29
\|
*	shader_decode: Implement FADD_C, FADD_R and FADD_IMM	ReinUsesLisp	2019-01-15	1	-0/+15
\|
*	shader_decode: Implement FMUL_C, FMUL_R and FMUL_IMM	ReinUsesLisp	2019-01-15	1	-0/+42
\|
*	shader_decode: Implement MOV_C and MOV_R	ReinUsesLisp	2019-01-15	1	-1/+23
\|
*	glsl_decompiler: Implementation	ReinUsesLisp	2019-01-15	2	-0/+1481
\|
*	shader_ir: Add condition code helper	ReinUsesLisp	2019-01-15	2	-0/+13
\|
*	shader_ir: Add predicate combiner helper	ReinUsesLisp	2019-01-15	2	-0/+15
\|
*	shader_ir: Add comparison helpers	ReinUsesLisp	2019-01-15	2	-0/+106
\|
*	shader_ir: Add half float helpers	ReinUsesLisp	2019-01-15	2	-0/+44
\|
*	shader_ir: Add integer helpers	ReinUsesLisp	2019-01-15	2	-0/+40
\|
*	shader_ir: Add float helpers	ReinUsesLisp	2019-01-15	2	-0/+24
\|
*	shader_ir: Add setters	ReinUsesLisp	2019-01-15	2	-0/+24
\|
*	shader_ir: Add local memory getters	ReinUsesLisp	2019-01-15	2	-0/+7
\|
*	shader_ir: Add internal flag getters	ReinUsesLisp	2019-01-15	2	-0/+10
\|
*	shader_ir: Add attribute getters	ReinUsesLisp	2019-01-15	2	-0/+26
\|
*	shader_ir: Add constant buffer getters	ReinUsesLisp	2019-01-15	2	-0/+25
\|
*	shader_ir: Add register getter	ReinUsesLisp	2019-01-15	2	-0/+9
\|
*	shader_ir: Add immediate node constructors	ReinUsesLisp	2019-01-15	2	-1/+34
\|
*	shader_ir: Initial implementation	ReinUsesLisp	2019-01-15	28	-0/+1542
\|
*	Remove references to PICA and rasterizers in video_core	James Rowe	2018-01-13	9	-2453/+0
\|
*	Improved performance of FromAttributeBuffer	Huw Pascoe	2017-09-17	1	-1/+2
\| \| \| \| \| \| \|	Ternary operator is optimized by the compiler whereas std::min() is meant to return a value. I've noticed a 5%-10% emulation speed increase.
*	pica/shader/jit: implement SETEMIT and EMIT	wwylele	2017-08-19	2	-2/+49
\|
*	correct constness	wwylele	2017-08-19	2	-2/+4
\|
*	pica/shader/interpreter: implement SETEMIT and EMIT	wwylele	2017-08-19	1	-0/+16
\|
*	pica/shader: extend UnitState for GS	wwylele	2017-08-19	2	-0/+84
\| \| \| \| \|	Among four shader units in pica, a special unit can be configured to run both VS and GS programs. GSUnitState represents this unit, which extends UnitState (which represents the other three normal units) with extra state for primitive emitting. It uses lots of raw pointers to represent internal structure in order to keep it a standard layout type for the JIT to access. This unit doesn't handle triangle winding (inverting) itself; instead, it calls a WindingSetter handler. This will be explained in the following commits.
*	pica/shader_interpreter: fix off-by-one in LOOP	wwylele	2017-07-27	1	-1/+1
\|
*	Stop using reserved operator names (and/or/xor) with Xbyak	Yuri Kunde Schlesner	2017-06-17	1	-13/+13
\| \| \| \|	Also has the Dynarmic upgrade with the same change
*	Pica: Set program code / swizzle data limit to 4096	Jannik Vogel	2017-05-11	5	-13/+16
\| \| \| \| \| \| \| \| \| \| \| \| \|	One of the later commits will enable writing to GS regs. It turns out that on startup, most games will write 4096 GS program words. The current limit of 1024 would hence result in 3072 (4096 - 1024) error messages: ``` HW.GPU <Error> video_core/shader/shader.cpp:WriteProgramCode:229: Invalid GS program offset 1024 ``` New constants have been introduced to represent these limits. The swizzle data size has also been raised. This matches the given field sizes of [GPUREG_SH_OPDESCS_INDEX](https://3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_OPDESCS_INDEX) and [GPUREG_SH_CODETRANSFER_INDEX](https://www.3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_CODETRANSFER_INDEX) (12 bit = [0; 4095]).
*	Doxygen: Amend minor issues (#2593)	Mat M	2017-02-27	2	-2/+4
\| \| \| \| \| \| \| \| \|	Corrects a few issues with regard to Doxygen documentation, for example: - Incorrect parameter referencing. - Missing @param tags. - Typos in @param tags. It also fixes a few other minor issues.
*	video_core/shader: Document sanitized MUL operation	Yuri Kunde Schlesner	2017-02-12	1	-0/+8
\|
*	Merge pull request #2550 from yuriks/pica-refactor2	Yuri Kunde Schlesner	2017-02-12	2	-2/+4
\|\ \| \| \| \|	Small VideoCore cleanups
\| *	VideoCore: Split regs.h inclusions	Yuri Kunde Schlesner	2017-02-09	2	-2/+4
\| \|
* \|	video_core: Fix benign out-of-bounds indexing of array (#2553)	Yuri Kunde Schlesner	2017-02-11	1	-2/+1
\|/ \| \| \| \| \|	The resulting pointer wasn't written to unless the index was verified as valid, but that's still UB and triggered debug checks in MSVC. Reported by garrettboast on IRC
*	VideoCore: Move Regs to its own file	Yuri Kunde Schlesner	2017-02-04	2	-2/+2
\|
*	VideoCore: Split shader regs from Regs struct	Yuri Kunde Schlesner	2017-02-04	4	-6/+6
\|
*	VideoCore: Split rasterizer regs from Regs struct	Yuri Kunde Schlesner	2017-02-04	2	-13/+13
\|
*	Merge pull request #2476 from yuriks/shader-refactor3	Yuri Kunde Schlesner	2017-02-04	4	-78/+58
\|\ \| \| \| \|	Oh No! More shader changes!
\| *	VideoCore: Extract swrast-specific data from OutputVertex	Yuri Kunde Schlesner	2017-01-30	2	-37/+14
\| \|
\| *	VideoCore/Shader: Clean up OutputVertex::FromAttributeBuffer	Yuri Kunde Schlesner	2017-01-30	1	-9/+14
\| \| \| \| \| \| \| \| \| \| \| \|	This also fixes a long-standing but nevertheless harmless memory corruption bug, where the padding of the OutputVertex struct would get corrupted by unused attributes.
\| *	VideoCore: Split shader output writing from semantic loading	Yuri Kunde Schlesner	2017-01-30	2	-18/+16
\| \|
\| *	VideoCore: Consistently use shader configuration to load attributes	Yuri Kunde Schlesner	2017-01-30	4	-12/+12
\| \|
\| *	VideoCore: Rename some types to more accurate names	Yuri Kunde Schlesner	2017-01-30	4	-6/+6
\| \|
* \|	ShaderJIT: add 16 dummy bytes at the bottom of the stack	wwylele	2017-02-03	1	-2/+5
\| \|
* \|	Common/x64: remove legacy emitter and abi (#2504)	Weiyi Wang	2017-01-31	1	-1/+0
\| \| \| \| \| \|	These are not used any more since we moved shader JIT to xbyak.
* \|	shader_jit_x64_compiler: esi and edi should be persistent (#2500)	Merry	2017-01-31	1	-0/+2
\|/
*	VideoCore/Shader: Move entry_point to SetupBatch	Yuri Kunde Schlesner	2017-01-26	5	-22/+23
\|
*	VideoCore/Shader: Move per-batch ShaderEngine state into ShaderSetup	Yuri Kunde Schlesner	2017-01-26	5	-40/+36
\|
*	Shader: Remove OutputRegisters struct	Yuri Kunde Schlesner	2017-01-26	3	-19/+13
\|
*	Shader: Initialize conditional_code in interpreter	Yuri Kunde Schlesner	2017-01-26	2	-3/+3
\| \| \| \| \| \| \|	This doesn't belong in LoadInputVertex because it also happens for non-VS invocations. Since it's not used by the JIT, it seems adequate to initialize it in the interpreter, which is the only thing that cares about it.
*	Shader: Don't read ShaderSetup from global state	Yuri Kunde Schlesner	2017-01-26	1	-3/+3
\|
*	shader_jit_x64: Don't read program from global state	Yuri Kunde Schlesner	2017-01-26	3	-22/+22
\|
*	VideoCore/Shader: Move ProduceDebugInfo to InterpreterEngine	Yuri Kunde Schlesner	2017-01-26	4	-19/+10
\|
*	VideoCore/Shader: Split interpreter and JIT into separate ShaderEngines	Yuri Kunde Schlesner	2017-01-26	6	-96/+150
\|
*	VideoCore/Shader: Rename shader_jit_x64{ => _compiler}.{cpp,h}	Yuri Kunde Schlesner	2017-01-26	3	-2/+2
\|
*	VideoCore/Shader: Split shader uniform state and shader engine	Yuri Kunde Schlesner	2017-01-26	3	-16/+46
\| \| \| \| \|	Currently there's only a single dummy implementation, which will be split in a following commit.
*	VideoCore/Shader: Add constness to methods	Yuri Kunde Schlesner	2017-01-26	2	-4/+4
\|
*	VideoCore/Shader: Use only entry_point as ShaderSetup param	Yuri Kunde Schlesner	2017-01-26	2	-9/+11
\| \| \| \| \|	This removes all implicit dependency of ShaderState on global PICA state.
*	VideoCore/Shader: Use self instead of g_state.vs in ShaderSetup	Yuri Kunde Schlesner	2017-01-26	2	-11/+8
\|
*	VideoCore/Shader: Extract input vertex loading code into function	Yuri Kunde Schlesner	2017-01-26	2	-20/+22
\|
*	video_core: fix shader.cpp signed / unsigned warning	Kloen	2017-01-23	1	-2/+2
\|
*	Fix some warnings (#2399)	Jonathan Hao	2017-01-04	1	-2/+0
\|
*	VideoCore/Shader: Extract DebugData out from UnitState	Yuri Kunde Schlesner	2016-12-16	7	-101/+97
\|
*	Remove unnecessary cast	Yuri Kunde Schlesner	2016-12-16	1	-3/+1
\|
*	VideoCore/Shader: Extract evaluate_condition lambda to function scope	Yuri Kunde Schlesner	2016-12-16	1	-26/+24
\|
*	VideoCore/Shader: Extract call lambda up a scope and remove unused param	Yuri Kunde Schlesner	2016-12-16	1	-21/+17
\|
*	VideoCore/Shader: Remove dynamic control flow in (Get)UniformOffset	Yuri Kunde Schlesner	2016-12-16	2	-18/+11
\|
*	VideoCore/Shader: Move DebugData to a separate file	Yuri Kunde Schlesner	2016-12-16	3	-172/+188
\|
*	shader_jit_x64: Use LOOPCOUNT_REG as a 64-bit reg when indexing	Yuri Kunde Schlesner	2016-12-15	1	-1/+1
\|
*	VideoCore: Eliminate an unnecessary copy in the drawcall loop	Yuri Kunde Schlesner	2016-12-15	2	-2/+2
\|
*	shader_jit_x64: Use Reg32 for LOOP* registers, eliminating casts	Yuri Kunde Schlesner	2016-12-15	1	-16/+16
\|
*	VideoCore: Convert x64 shader JIT to use Xbyak for assembly	Yuri Kunde Schlesner	2016-12-15	2	-223/+225
\|
*	shader_jit: Fix non-SSE4.1 path where FLR would not truncate	Jannik Vogel	2016-12-04	1	-1/+1
\|
*	shader_jit: Load LOOPCOUNT_REG and LOOPINC 4 bit left-shifted	Jannik Vogel	2016-12-02	1	-6/+9
\|
*	VideoCore: Shader interpreter cleanups	Yuri Kunde Schlesner	2016-09-30	1	-32/+42
\|
*	VideoCore: Fix out-of-bounds read in ShaderSetup::ProduceDebugInfo	Yuri Kunde Schlesner	2016-09-30	1	-3/+1
\| \| \| \| \| \|	As far as I can tell, memset was replaced by a fill without correcting the parameter type, causing an out-of-bounds array read in the Vec4 constructor.
*	Remove special rules for Windows.h and library includes	Yuri Kunde Schlesner	2016-09-21	1	-1/+1
\|
*	Use negative priorities to avoid special-casing the self-include	Yuri Kunde Schlesner	2016-09-21	3	-3/+3
\|
*	Remove empty newlines in #include blocks.	Emmanuel Gil Peyrot	2016-09-21	5	-22/+3
\| \| \| \| \| \| \|	This makes clang-format useful on those. Also add a bunch of forgotten transitive includes, which otherwise prevented compilation.
*	Manually tweak source formatting and then re-run clang-format	Yuri Kunde Schlesner	2016-09-19	4	-9/+6
\|
*	Sources: Run clang-format on everything.	Emmanuel Gil Peyrot	2016-09-18	6	-311/+335
\|
*	VideoCore: Fix dangling lambda context in shader interpreter	Yuri Kunde Schlesner	2016-09-16	1	-1/+1
\| \| \| \| \| \|	The static meant that after the first execution, these lambda contexts would be pointing to a random location on the stack. Fixes a random crash when using the interpreter.
*	Retrieve shader result from new OutputRegisters-type	Jannik Vogel	2016-05-16	3	-56/+68
\|
*	Use new shader-jit signature for interpreter	Jannik Vogel	2016-05-13	3	-8/+8
\|
*	Refactor access to state in shader-jit	Jannik Vogel	2016-05-13	4	-24/+42
\|
*	Move program_counter and call_stack from UnitState to interpreter	Jannik Vogel	2016-05-12	3	-45/+42
\|
*	Move default_attributes into Pica state	Jannik Vogel	2016-05-12	1	-2/+0
\|
*	Merge pull request #1690 from JayFoxRox/tex-type-3	bunnei	2016-05-12	1	-1/+2
\|\ \| \| \| \|	Pica: Implement texture type 3 (Projection2D)
\| *	Pica: Add tc0.w to OutputVertex	Jannik Vogel	2016-05-11	1	-1/+2
\| \|
* \|	Turn ShaderSetup into struct	Jannik Vogel	2016-05-11	2	-52/+53
\|/
*	Pica: Replace logic in shader.cpp with loop	Jannik Vogel	2016-05-03	1	-34/+4
\|
*	VideoCore: Run include-what-you-use and fix most includes.	Emmanuel Gil Peyrot	2016-04-30	6	-14/+43
\|
*	Merge pull request #1730 from hrydgard/vertex-loader	bunnei	2016-04-29	1	-1/+1
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Remove late accesses to attribute_config * Refactor: Extract VertexLoader from command_processor.cpp. Preparation for a similar concept to Dolphin or PPSSPP. These can be JIT-ed and cached. * Move "&" to their proper place, add missing includes and make some properly relative. * Don't keep base_address in the loader, it doesn't belong there (with it, the loader can't be cached). * Optimize the vertex loader, nearly doubling its speed. * Debugger fix * Move and rename the MemoryAccesses class to MemoryAccessTracker.
\| *	Refactor: Extract VertexLoader from command_processor.cpp.	Henrik Rydgard	2016-04-28	1	-1/+1
\| \| \| \| \| \| \| \|	Preparation for a similar concept to Dolphin or PPSSPP. These can be JIT-ed and cached.
* \|	Common: Remove section measurement from profiler (#1731)	Yuri Kunde Schlesner	2016-04-29	1	-3/+0
\| \| \| \| \| \| \| \|	This has been entirely superseded by MicroProfile. The rest of the code can go when a simpler frametime/FPS meter is added to the GUI.
* \|	shader: Shader size is long uint, not uint.	Sam Spilsbury	2016-04-24	1	-1/+1
\| \|
* \|	shader: Handle non-CALL opcodes with a break	Sam Spilsbury	2016-04-24	1	-0/+2
\| \|
* \|	shader: Format string must be provided inline and not as a variable	Sam Spilsbury	2016-04-24	1	-1/+1
\|/
*	shader_jit_x64: Rename RuntimeAssert to Compile_Assert.	bunnei	2016-04-14	2	-5/+5
\|
*	shader_jit_x64.cpp: Rename JitCompiler to JitShader.	bunnei	2016-04-14	3	-92/+92
\|
*	shader_jit_x64: Free memory that's no longer needed after compilation.	bunnei	2016-04-14	1	-0/+6
\|
*	shader_jit_x64: Use a sorted vector instead of a set for keeping track of return addresses.	bunnei	2016-04-14	2	-5/+8
\|
*	shader_jit_x64: Use CALL/RET instead of JMP for subroutines.	bunnei	2016-04-14	1	-17/+7
\|
*	shader_jit_x64: Separate initialization and code generation for readability.	bunnei	2016-04-14	1	-9/+8
\|
*	shader_jit_x64: Get rid of unnecessary last_program_counter variable.	bunnei	2016-04-14	2	-6/+2
\|
*	shader_jit_x64: Execute certain asserts at runtime.	bunnei	2016-04-14	2	-5/+19
\| \| \| \|	- This is because we compile the full shader code space, and therefore it's common to compile malformed instructions.
*	shader: Remove unused 'state' argument from 'Setup' function.	bunnei	2016-04-14	2	-3/+2
\|
*	shader_jit_x64: Specify shader main offset at runtime.	bunnei	2016-04-14	3	-10/+6
\|
*	shader_jit_x64: Allocate each program independently and persist for emu session.	bunnei	2016-04-14	3	-38/+28
\|
*	shader_jit_x64: Rewrite flow control to support arbitrary CALL and JMP instructions.	bunnei	2016-04-14	2	-35/+119
\|
*	shader_jit_x64: Fix strict memory aliasing issues.	bunnei	2016-04-14	1	-1/+3
\|
*	Merge pull request #1643 from MerryMage/make_unique	Mathew Maidment	2016-04-06	1	-1/+0
\|\ \| \| \| \|	Common: Remove Common::make_unique, use std::make_unique
\| *	Common: Remove Common::make_unique, use std::make_unique	MerryMage	2016-04-05	1	-1/+0
\| \|
* \|	Merge pull request #1508 from JayFoxRox/vs-output-map	bunnei	2016-03-22	1	-4/+14
\|\ \ \| \|/ \|/\|	Respect vs output map
\| *	Respect vs output map	Jannik Vogel	2016-03-14	1	-4/+14
\| \|
* \|	Merge pull request #1538 from lioncash/dot	bunnei	2016-03-20	1	-5/+3
\|\ \ \| \| \| \| \| \|	shader_interpreter: use std::inner_product for the dot product
\| * \|	shader_interpreter: use std::inner_product for the dot product	Lioncash	2016-03-17	1	-5/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Same thing, less code.
* \| \|	video_core: Don't cast away const	Lioncash	2016-03-17	1	-1/+1
\|/ /
* \|	Merge pull request #1503 from bunnei/clear-jit-cache	bunnei	2016-03-16	3	-7/+27
\|\ \ \| \| \| \| \| \|	Clear JIT cache
\| * \|	shader_jit_x64: Clear cache after code space fills up.	bunnei	2016-03-12	3	-2/+19
\| \| \|
\| * \|	shader_jit_x64: Make assert outputs more useful & cleanup formatting.	bunnei	2016-03-12	1	-4/+7
\| \| \|
\| * \|	shader: Update log message to use proper log class.	bunnei	2016-03-12	1	-1/+1
\| \|/
* /	PICA: Fix MAD/MADI encoding	Jannik Vogel	2016-03-15	2	-29/+33
\|/
*	Common: Get rid of alignment macros	Lioncash	2016-03-09	1	-4/+4
\| \| \| \| \|	The gl rasterizer already uses alignas, so we may as well move everything over.
*	Add immediate mode vertex submission	Dwayne Slater	2016-03-03	4	-2/+22
\|
*	pica: Implement decoding of basic fragment lighting components.	bunnei	2016-02-05	2	-5/+9
\| \| \| \| \| \| \|	- Diffuse - Distance attenuation - float16/float20 types - Vertex Shader 'view' output
*	Merge pull request #1367 from yuriks/jit-jmp	bunnei	2016-01-27	2	-6/+6
\|\ \| \| \| \|	Shader JIT: Fix off-by-one error when compiling JMPs
\| *	Shader JIT: Fix off-by-one error when compiling JMPs	Yuri Kunde Schlesner	2016-01-24	2	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	There was a mistake in the JMP code which meant that one instruction at the destination would be skipped when the jump was taken. This commit also changes the meaning of the culprit parameter to make it less confusing and avoid similar mistakes in the future.
* \|	Shader: Implement "invert condition" feature of IFU instruction	Yuri Kunde Schlesner	2016-01-25	2	-2/+5
\|/ \| \| \| \| \|	If the bit 0 of the JMPU instruction is set, then the jump condition will be inverted. That is, a jump will happen when the boolean is false instead of when it is true.
*	video_core: Reorganize headers	Lioncash	2015-09-11	3	-6/+4
\|
*	video_core: Remove unnecessary includes from headers	Lioncash	2015-09-11	1	-2/+0
\|
*	video_core: Remove unused variables	Lioncash	2015-09-10	2	-2/+0
\|
*	Shader JIT: Use SCALE constant from emitter	aroulin	2015-09-07	1	-4/+4
\|
*	Shader: Fix size_t to int casts of register offsets	aroulin	2015-09-07	2	-15/+21
\|
*	Merge pull request #1088 from aroulin/x64-emitter-abi-call	bunnei	2015-09-02	2	-28/+18
\|\ \| \| \| \|	x64: Proper stack alignment in shader JIT function calls
\| *	x64: Proper stack alignment in shader JIT function calls	aroulin	2015-09-01	2	-28/+18
\| \| \| \| \| \| \| \| \| \|	Import Dolphin stack handling and register saving routines. Also removes the x86 parts from abi files.
* \|	video_core: Fix format specifiers warnings	aroulin	2015-09-02	1	-1/+2
\|/
*	Shader JIT: Fix SGE/SGEI NaN behavior	aroulin	2015-08-31	1	-3/+3
\| \| \| \| \|	SGE was incorrectly emulated w.r.t. NaN behavior as the CMPSS SSE instruction was used with NLT
*	Merge pull request #1065 from yuriks/shader-fp	Yuri Kunde Schlesner	2015-08-28	3	-56/+87
\|\ \| \| \| \|	Shader FP compliance fixes
\| *	Shader JIT: Tiny micro-optimization in DPH	Yuri Kunde Schlesner	2015-08-24	1	-4/+4
\| \|
\| *	Shaders: Fix multiplications between 0.0 and inf	Yuri Kunde Schlesner	2015-08-24	2	-39/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The PICA200 semantics for multiplication are such that when multiplying inf by exactly 0.0, the result is 0.0, instead of NaN, as defined by IEEE. This is relied upon by games. Fixes #1024 (missing OoT interface items)
\| *	Shaders: Explicitly conform to PICA semantics in MAX/MIN	Yuri Kunde Schlesner	2015-08-24	2	-2/+10
\| \|
\| *	Shader JIT: Add name to second scratch register (XMM4)	Yuri Kunde Schlesner	2015-08-24	1	-3/+5
\| \|
\| *	Shader JIT: Fix CMP NaN behavior to match hardware	Yuri Kunde Schlesner	2015-08-24	1	-8/+23
\| \|
* \|	Shader JIT: Fix float to integer rounding in MOVA	aroulin	2015-08-27	1	-2/+2
\| \| \| \| \| \| \| \|	MOVA converts new address register values from floats to integers using truncation
* \|	Shader JIT: ifdef out reference to ifdef'd out shader_map	archshift	2015-08-27	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	shader_map was only defined on x86 architectures, but was cleared on shutdown with no ifdef protection. Ifdef this out so non-x86 architectures can be built.
* \|	Integrate the MicroProfile profiling library	Yuri Kunde Schlesner	2015-08-25	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	This brings goodies such as a configurable user interface and multi-threaded timeline view.
* \|	shader_jit: Replace two MDisp usages with MatR	Lioncash	2015-08-24	1	-2/+2
\|/
*	Merge pull request #1062 from aroulin/shader-rcp-rsq	bunnei	2015-08-23	2	-10/+10
\|\ \| \| \| \|	Shader: RCP and RSQ computes only the 1st component
\| *	Shader: Use std::sqrt for float instead of sqrt	aroulin	2015-08-23	1	-1/+1
\| \|
\| *	Shader: RCP and RSQ computes only the 1st component	aroulin	2015-08-23	2	-10/+10
\| \|
* \|	Shader: implement DPH/DPHI in JIT	aroulin	2015-08-22	2	-2/+36
\| \|
* \|	Shader: implement DPH/DPHI in interpreter	aroulin	2015-08-22	1	-1/+8
\|/ \| \| \| \|	Tests revealed that the component with w=1 is SRC1 and not SRC2; it is now fixed on 3dbrew.
*	Shader: implement SGE, SGEI and SLT in JIT	aroulin	2015-08-19	2	-15/+36
\|
*	Shader: implement SGE, SGEI in interpreter	aroulin	2015-08-19	1	-0/+14
\|
*	Shader: Save caller-saved registers in JIT before a CALL	aroulin	2015-08-19	2	-0/+33
\|
*	Shader: implement EX2 and LG2 in JIT	aroulin	2015-08-17	2	-2/+22
\|
*	Shader: implement EX2 and LG2 in interpreter	aroulin	2015-08-16	1	-0/+36
\|
*	Build fix for Debug configurations.	Tony Wasserka	2015-08-16	1	-1/+1
\|
*	Introduce a shader tracer to allow inspection of input/output values for each processed instruction.	Tony Wasserka	2015-08-16	5	-37/+322
\|
*	citra-qt: Improve shader debugger.	Tony Wasserka	2015-08-16	1	-6/+0
\| \| \| \|	Now supports dumping the current shader and recognizes a larger number of output semantics.
*	Shader: Use a POD struct for registers.	bunnei	2015-08-16	5	-40/+43
\|
*	Rename ARCHITECTURE_X64 definition to ARCHITECTURE_x86_64.	bunnei	2015-08-16	1	-6/+5
\|
*	Common: Cleanup CPU capability detection code.	bunnei	2015-08-16	1	-5/+5
\|
*	Common: Move cpu_detect to x64 directory.	bunnei	2015-08-16	1	-2/+1
\|
*	x64: Refactor to remove fake interfaces and general cleanups.	bunnei	2015-08-16	5	-144/+22
\|
*	JIT: Support negative address offsets.	bunnei	2015-08-16	1	-26/+25
\|
*	Shader: Initial implementation of x86_x64 JIT compiler for Pica vertex shaders.	bunnei	2015-08-16	6	-2/+924
\| \| \| \| \|	- Config: Add an option for selecting to use shader JIT or interpreter. - Qt: Add a menu option for enabling/disabling the shader JIT.
*	Common: Added MurmurHash3 hash function for general-purpose use.	bunnei	2015-08-15	1	-1/+1
\|
*	Shader: Define a common interface for running vertex shader programs.	bunnei	2015-08-15	4	-184/+278
\|
*	Shader: Move shader code to its own subdirectory, "shader".	bunnei	2015-08-15	2	-0/+701