Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Merge pull request #3039 from ReinUsesLisp/cleanup-samplers | Rodrigo Locatti | 2019-11-06 | 4 | -122/+100 |
|\ | | | | | shader/node: Unpack bindless texture encoding | ||||
| * | shader/node: Unpack bindless texture encoding | ReinUsesLisp | 2019-10-30 | 4 | -122/+100 |
| | | | | | | | | | | | | | | | | | | Bindless textures were using a u64 to pack the buffer and offset they come from. Drop this in favor of separate entries in the struct. Remove the usage of std::set in favor of std::list for samplers and images (std::vector is avoided because it would invalidate references). | ||||
* | | Shader_IR: Fix regression on TLD4 | Fernando Sahmkow | 2019-10-31 | 2 | -5/+4 |
| | | | | | | | | | | | | Originally on the last commit I thought TLD4 acted the same as TLD4S and didn't have a mask. It actually does have a component mask. This commit corrects that. | ||||
* | | Shader_IR: Fix TLD4 and add Bindless Variant. | Fernando Sahmkow | 2019-10-30 | 2 | -10/+26 |
|/ | | | | | | This commit fixes an issue where not all 4 results of tld4 were being written, the color component was defaulted to red, among other things. It also implements the bindless variant. | ||||
* | Merge pull request #2976 from FernandoS27/cache-fast-brx-rebased | Rodrigo Locatti | 2019-10-26 | 10 | -171/+638 |
|\ | | | | | Implement Fast BRX, fix TXQ and adapt the Shader Cache for it | ||||
| * | Shader_IR: Address Feedback. | Fernando Sahmkow | 2019-10-26 | 7 | -52/+59 |
| | | |||||
| * | gl_shader_cache: Implement locker variants invalidation | ReinUsesLisp | 2019-10-25 | 2 | -12/+19 |
| | | |||||
| * | gl_shader_disk_cache: Store and load fast BRX | ReinUsesLisp | 2019-10-25 | 1 | -2/+2 |
| | | |||||
| * | const_buffer_locker: Minor style changes | ReinUsesLisp | 2019-10-25 | 2 | -152/+76 |
| | | |||||
| * | gl_shader_decompiler: Move entries to a separate function | ReinUsesLisp | 2019-10-25 | 7 | -32/+29 |
| | | |||||
| * | Shader_IR: Implement Fast BRX and allow multi-branches in the CFG. | Fernando Sahmkow | 2019-10-25 | 1 | -1/+1 |
| | | |||||
| * | Shader_IR: Correct typo in Consistent method. | Fernando Sahmkow | 2019-10-25 | 2 | -2/+2 |
| | | |||||
| * | Shader_IR: allow lookup of texture samplers within the shader_ir for instructions that don't provide it | Fernando Sahmkow | 2019-10-25 | 4 | -42/+212 |
| | | |||||
| * | Shader_IR: Implement Fast BRX and allow multi-branches in the CFG. | Fernando Sahmkow | 2019-10-25 | 5 | -130/+246 |
| | | |||||
| * | Shader_Cache: setup connection of ConstBufferLocker | Fernando Sahmkow | 2019-10-25 | 5 | -12/+22 |
| | | |||||
| * | VideoCore: Unify const buffer accessing along engines and provide ConstBufferLocker class to shaders. | Fernando Sahmkow | 2019-10-25 | 3 | -0/+123 |
| | | |||||
| * | Shader_IR: Implement BRX tracking. | Fernando Sahmkow | 2019-10-25 | 1 | -0/+113 |
| | | |||||
* | | Merge pull request #3027 from lioncash/lookup | Rodrigo Locatti | 2019-10-26 | 1 | -53/+67 |
|\ \ | | | | | | | shader_ir: Use std::array with std::pair instead of std::unordered_map | ||||
| * | | shader_ir: Use std::array with pair instead of unordered_map | Lioncash | 2019-10-24 | 1 | -53/+67 |
| | | | | | | | | | | | | | | | | | | | | | | | | Given the overall size of the maps is very small, we can use arrays of pairs here instead of always heap allocating a new map every time the functions are called. Given the small size of the maps, the difference in container lookups is negligible, especially given the entries are already sorted. | ||||
* | | | Merge pull request #3013 from FernandoS27/tld4s-fix | Rodrigo Locatti | 2019-10-26 | 2 | -5/+5 |
|\ \ \ | |_|/ |/| | | Shader_Ir: Fix TLD4S from using a component mask. | ||||
| * | | Shader_Ir: Fix TLD4S from using a component mask. | Fernando Sahmkow | 2019-10-22 | 2 | -5/+5 |
| | | | | | | | | | | | | | | | | | | TLD4S always outputs 4 values; the previous code checked a component mask and omitted those values that weren't part of it. This commit corrects that and makes sure all 4 values are set. | ||||
* | | | video_core/shader: Resolve instances of variable shadowing | Lioncash | 2019-10-24 | 6 | -11/+12 |
| |/ |/| | | | | | Silences a few -Wshadow warnings. | ||||
* | | shader_ir/memory: Ignore global memory when tracking fails | ReinUsesLisp | 2019-10-22 | 2 | -18/+26 |
|/ | | | | | | | | | | | When constant buffer tracking fails and we are blasting through asserts, ignore global memory operations instead of invoking undefined behaviour. In the case of LDG this means filling the destination registers with zeroes; for STG this means ignoring the instruction as a whole. The default behaviour is still to abort execution on failure. | ||||
* | video_core/shader/ast: Make ShowCurrentState() and SanityCheck() const member functions | Lioncash | 2019-10-18 | 2 | -5/+5 |
| | | | | | These can also trivially be made const member functions, with the addition of a few consts. | ||||
* | video_core/shader/ast: Make ASTManager::Print a const member function | Lioncash | 2019-10-18 | 2 | -3/+3 |
| | | | | | Given all visiting functions never modify the nodes, we can trivially make this a const member function. | ||||
* | video_core/shader/ast: Make ExprPrinter members private | Lioncash | 2019-10-18 | 1 | -1/+2 |
| | | | | | This member already has an accessor, so there's no need for it to be public. | ||||
* | video_core/shader/ast: Make Indent() return a string_view | Lioncash | 2019-10-18 | 1 | -14/+24 |
| | | | | | | | | The returned string is simply a substring of our constexpr tabs string_view, so we can just use a string_view here as well, since the original string_view is guaranteed to always exist. Now the function is fully non-allocating. | ||||
* | video_core/shader/ast: Make Indent() private | Lioncash | 2019-10-18 | 1 | -9/+9 |
| | | | | It's never used outside of this class, so we can narrow its scope down. | ||||
* | video_core/shader/ast: Rename Ident() to Indent() | Lioncash | 2019-10-18 | 1 | -13/+13 |
| | | | | | This can be confusing, given "ident" is generally used as a shorthand for "identifier". | ||||
* | video_core/shader/ast: Make use of fmt where applicable | Lioncash | 2019-10-18 | 1 | -14/+14 |
| | | | | | Makes a few strings nicer to read and also eliminates a bit of string churn with operator+. | ||||
* | Merge pull request #2980 from lioncash/warn | bunnei | 2019-10-17 | 2 | -4/+4 |
|\ | | | | | maxwell_3d: Silence truncation warnings | ||||
| * | control_flow: Silence truncation warnings | Lioncash | 2019-10-16 | 2 | -4/+4 |
| | | | | | | | | | | | | This can be trivially fixed by making the input size a size_t. CFGRebuildState's constructor parameter is already a std::size_t, so this just makes the size type fully conform with it. | ||||
* | | shader/node: std::move Meta instance within OperationNode constructor | Lioncash | 2019-10-16 | 1 | -1/+1 |
|/ | | | | Allows usages of the constructor to avoid an unnecessary copy. | ||||
* | shader/half_set_predicate: Fix HSETP2 for constant buffers | ReinUsesLisp | 2019-10-07 | 1 | -0/+2 |
| | | | | | HSETP2 when used with a constant buffer parses the second operand type as F32. This is not configurable. | ||||
* | shader/half_set_predicate: Reduce DEBUG_ASSERT to LOG_DEBUG | ReinUsesLisp | 2019-10-07 | 1 | -1/+2 |
| | |||||
* | video_core/control_flow: Eliminate variable shadowing warnings | Lioncash | 2019-10-05 | 1 | -6/+6 |
| | |||||
* | video_core/control_flow: Eliminate pessimizing moves | Lioncash | 2019-10-05 | 1 | -5/+8 |
| | | | | These can inhibit the ability of a compiler to perform RVO. | ||||
* | video_core/ast: Unindent most of IsFullyDecompiled() by one level | Lioncash | 2019-10-05 | 1 | -12/+12 |
| | |||||
* | video_core/ast: Make ShowCurrentState() take a string_view instead of std::string | Lioncash | 2019-10-05 | 2 | -2/+2 |
| | | | | Allows the function to be non-allocating in terms of the output string. | ||||
* | video_core/ast: Eliminate variable shadowing warnings | Lioncash | 2019-10-05 | 1 | -3/+3 |
| | |||||
* | video_core/ast: Replace std::string with a constexpr std::string_view | Lioncash | 2019-10-05 | 1 | -3/+1 |
| | | | | Same behavior, but without the need to heap allocate | ||||
* | video_core/ast: Default the move constructor and assignment operator | Lioncash | 2019-10-05 | 2 | -26/+2 |
| | | | | | This is behaviorally equivalent and also fixes a bug where some members weren't being moved over. | ||||
* | video_core/{ast, expr}: Organize forward declaration | Lioncash | 2019-10-05 | 2 | -10/+10 |
| | | | | Keeps them alphabetically sorted for readability. | ||||
* | video_core/expr: Supply operator!= along with operator== | Lioncash | 2019-10-05 | 2 | -1/+32 |
| | | | | Provides logical symmetry to the interface. | ||||
* | video_core/{ast, expr}: Use std::move where applicable | Lioncash | 2019-10-05 | 4 | -45/+47 |
| | | | | Avoids unnecessary atomic reference count increments and decrements. | ||||
* | video_core/ast: Supply const accessors for data where applicable | Lioncash | 2019-10-05 | 2 | -37/+41 |
| | | | | | Provides const equivalents of data accessors for use within const contexts. | ||||
* | Shader_ir: Address feedback | Fernando Sahmkow | 2019-10-05 | 4 | -50/+14 |
| | |||||
* | Shader_Ir: Address Feedback and clang format. | Fernando Sahmkow | 2019-10-05 | 3 | -43/+50 |
| | |||||
* | Shader_IR: clean up AST handling and add documentation. | Fernando Sahmkow | 2019-10-05 | 1 | -2/+6 |
| | |||||
* | Shader_IR: Correct OutwardMoves for Ifs | Fernando Sahmkow | 2019-10-05 | 1 | -22/+11 |
| | |||||
* | Shader_IR: corrections and clang-format | Fernando Sahmkow | 2019-10-05 | 2 | -70/+64 |
| | |||||
* | Shader_IR: allow else derivation to be optional. | Fernando Sahmkow | 2019-10-05 | 6 | -8/+14 |
| | |||||
* | vk_shader_compiler: Implement the decompiler in SPIR-V | Fernando Sahmkow | 2019-10-05 | 2 | -1/+25 |
| | |||||
* | Shader_IR: mark labels as unused for partial decompile. | Fernando Sahmkow | 2019-10-05 | 2 | -3/+9 |
| | |||||
* | Shader_Ir: Refactor Decompilation process and allow multiple decompilation modes. | Fernando Sahmkow | 2019-10-05 | 10 | -74/+307 |
| | |||||
* | gl_shader_decompiler: Implement AST decompiling | Fernando Sahmkow | 2019-10-05 | 10 | -34/+116 |
| | |||||
* | shader_ir: Declare Manager and pass it to appropriate programs. | Fernando Sahmkow | 2019-10-05 | 7 | -104/+214 |
| | |||||
* | shader_ir: Corrections to outward movements and misc stuffs | Fernando Sahmkow | 2019-10-05 | 5 | -58/+305 |
| | |||||
* | shader_ir: Add basic goto elimination | Fernando Sahmkow | 2019-10-05 | 2 | -38/+484 |
| | |||||
* | shader_ir: Initial Decompile Setup | Fernando Sahmkow | 2019-10-05 | 5 | -5/+507 |
| | |||||
* | Merge pull request #2869 from ReinUsesLisp/suld | bunnei | 2019-09-24 | 3 | -91/+101 |
|\ | | | | | shader/image: Implement SULD and fix SUATOM | ||||
| * | gl_shader_decompiler: Use uint for images and fix SUATOM | ReinUsesLisp | 2019-09-21 | 3 | -69/+52 |
| | | | | | | | | | | | In the process, remove the implementation of SUATOM.MIN and SUATOM.MAX as these require a distinction between U32 and S32. These have to be implemented with an imageCompSwap loop. | ||||
| * | shader/image: Implement SULD and remove irrelevant code | ReinUsesLisp | 2019-09-21 | 2 | -25/+52 |
| | | | | | | | | | | * Implement SULD as float. * Remove conditional declaration of GL_ARB_shader_viewport_layer_array. | ||||
* | | Merge pull request #2870 from FernandoS27/multi-draw | David | 2019-09-22 | 2 | -0/+22 |
|\ \ | | | | | | | Implement a MME Draw commands Inliner and correct host instance drawing | ||||
| * | | VideoCore: Corrections to the MME Inliner and removal of hacky instance management. | Fernando Sahmkow | 2019-09-19 | 2 | -0/+22 |
| | | | |||||
* | | | Merge pull request #2878 from FernandoS27/icmp | Rodrigo Locatti | 2019-09-21 | 1 | -0/+29 |
|\ \ \ | |_|/ |/| | | shader_ir: Implement ICMP | ||||
| * | | Shader_IR: ICMP corrections and fixes | Fernando Sahmkow | 2019-09-21 | 1 | -6/+9 |
| | | | |||||
| * | | Shader_IR: Implement ICMP. | Fernando Sahmkow | 2019-09-20 | 1 | -0/+26 |
| |/ | |||||
* | | Merge pull request #2855 from ReinUsesLisp/shfl | bunnei | 2019-09-20 | 2 | -0/+57 |
|\ \ | |/ |/| | shader_ir/warp: Implement SHFL for Nvidia devices | ||||
| * | shader_ir/warp: Implement SHFL | ReinUsesLisp | 2019-09-17 | 2 | -0/+57 |
| | | |||||
* | | Merge pull request #2784 from ReinUsesLisp/smem | bunnei | 2019-09-18 | 4 | -21/+58 |
|\ \ | |/ |/| | shader_ir: Implement shared memory | ||||
| * | shader_ir: Implement LD_S | ReinUsesLisp | 2019-09-05 | 1 | -10/+13 |
| | | | | | | | | Loads from shared memory. | ||||
| * | shader_ir: Implement ST_S | ReinUsesLisp | 2019-09-05 | 4 | -11/+45 |
| | | | | | | | | | | This instruction writes to a memory buffer shared with threads within the same work group. It is known as "shared" memory in GLSL. | ||||
* | | shader/image: Implement SUATOM and fix SUST | ReinUsesLisp | 2019-09-11 | 3 | -37/+122 |
| | | |||||
* | | Merge pull request #2823 from ReinUsesLisp/shr-clamp | bunnei | 2019-09-10 | 1 | -6/+13 |
|\ \ | | | | | | | shader/shift: Implement SHR wrapped and clamped variants | ||||
| * | | shader/shift: Implement SHR wrapped and clamped variants | ReinUsesLisp | 2019-09-04 | 1 | -6/+13 |
| | | | | | | | | | | | | | | | | | | Nvidia defaults to wrapped shifts, but this is undefined behaviour on OpenGL's spec. Explicitly mask/clamp according to what the guest shader requires. | ||||
* | | | gl_shader_decompiler: Keep track of written images and mark them as modified | ReinUsesLisp | 2019-09-06 | 3 | -42/+54 |
| | | | |||||
* | | | kepler_compute: Implement texture queries | ReinUsesLisp | 2019-09-06 | 1 | -0/+4 |
| |/ |/| | |||||
* | | half_set_predicate: Fix predicate assignments | ReinUsesLisp | 2019-09-04 | 1 | -10/+9 |
|/ | |||||
* | Merge pull request #2812 from ReinUsesLisp/f2i-selector | bunnei | 2019-09-04 | 1 | -6/+16 |
|\ | | | | | shader_ir/conversion: Implement F2I and F2F F16 selector | ||||
| * | shader_ir/conversion: Split int and float selector and implement F2F H1 | ReinUsesLisp | 2019-08-28 | 1 | -18/+16 |
| | | |||||
| * | shader_ir/conversion: Implement F2I F16 Ra.H1 | ReinUsesLisp | 2019-08-28 | 1 | -4/+16 |
| | | |||||
* | | Merge pull request #2811 from ReinUsesLisp/fsetp-fix | bunnei | 2019-09-04 | 1 | -4/+5 |
|\ \ | | | | | | | float_set_predicate: Add missing negation bit for the second operand | ||||
| * | | float_set_predicate: Add missing negation bit for the second operand | ReinUsesLisp | 2019-08-28 | 1 | -4/+5 |
| |/ | |||||
* | | video_core: Silent miscellaneous warnings (#2820) | Rodrigo Locatti | 2019-08-30 | 5 | -5/+0 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * texture_cache/surface_params: Remove unused local variable * rasterizer_interface: Add missing documentation commentary * maxwell_dma: Remove unused rasterizer reference * video_core/gpu: Sort member declaration order to silence -Wreorder warning * fermi_2d: Remove unused MemoryManager reference * video_core: Silence unused variable warnings * buffer_cache: Silence -Wreorder warnings * kepler_memory: Remove unused MemoryManager reference * gl_texture_cache: Add missing override * buffer_cache: Add missing include * shader/decode: Remove unused variables | ||||
* | | Merge pull request #2758 from ReinUsesLisp/packed-tid | bunnei | 2019-08-29 | 3 | -0/+15 |
|\ \ | | | | | | | shader/decode: Implement S2R Tic | ||||
| * | | shader/decode: Implement S2R Tic | ReinUsesLisp | 2019-07-22 | 3 | -0/+15 |
| | | | |||||
* | | | shader_ir: Implement VOTE | ReinUsesLisp | 2019-08-21 | 4 | -0/+62 |
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement VOTE using Nvidia's intrinsics. Documentation about these can be found here: https://developer.nvidia.com/reading-between-threads-shader-intrinsics Instead of using portable ARB instructions I opted to use Nvidia intrinsics because these are the closest we have to how Tegra X1 hardware renders. To stub VOTE on non-Nvidia drivers (including nouveau) this commit simulates a GPU with a warp size of one, returning what is meaningful for the instruction being emulated: * anyThreadNV(value) -> value * allThreadsNV(value) -> value * allThreadsEqualNV(value) -> true ballotARB, also known as "uint64_t(activeThreadsNV())", emits "VOTE.ANY Rd, PT, PT;" on nouveau's compiler. This doesn't match exactly to Nvidia's code, "VOTE.ALL Rd, PT, PT;", which is emulated with activeThreadsNV() by this commit. In theory this shouldn't really matter since .ANY, .ALL and .EQ affect the predicates (set to PT on those cases) and not the registers. | ||||
* | | Merge pull request #2777 from ReinUsesLisp/hsetp2-fe3h-fix | bunnei | 2019-08-21 | 1 | -1/+1 |
|\ \ | | | | | | | half_set_predicate: Fix HSETP2_C constant buffer offset | ||||
| * | | half_set_predicate: Fix HSETP2_C constant buffer offset | ReinUsesLisp | 2019-08-04 | 1 | -1/+1 |
| | | | |||||
* | | | Merge pull request #2753 from FernandoS27/float-convert | bunnei | 2019-08-21 | 2 | -16/+39 |
|\ \ \ | | | | | | | | | Shader_Ir: Implement F16 Variants of F2F, F2I, I2F. | ||||
| * | | | Shader_Ir: Implement F16 Variants of F2F, F2I, I2F. | Fernando Sahmkow | 2019-07-20 | 2 | -16/+39 |
| | |/ | |/| | | | | | | | | | | This commit takes care of implementing the F16 Variants of the conversion instructions and makes sure conversions are done. | ||||
* | | | Merge pull request #2778 from ReinUsesLisp/nop | bunnei | 2019-08-18 | 1 | -0/+6 |
|\ \ \ | | | | | | | | | shader_ir: Implement NOP | ||||
| * | | | shader_ir: Implement NOP | ReinUsesLisp | 2019-08-04 | 1 | -0/+6 |
| | |/ | |/| | |||||
* / | | decode/half_set_predicate: Fix predicates | ReinUsesLisp | 2019-07-26 | 1 | -3/+3 |
|/ / | |||||
* | | Merge pull request #2739 from lioncash/cflow | bunnei | 2019-07-25 | 3 | -30/+51 |
|\ \ | | | | | | | video_core/control_flow: Minor changes/warning cleanup | ||||
| * | | video_core/control_flow: Provide operator!= for types with operator== | Lioncash | 2019-07-19 | 1 | -4/+21 |
| | | | | | | | | | | | | Provides operational symmetry for the respective structures. | ||||
| * | | video_core/control_flow: Prevent sign conversion in TryGetBlock() | Lioncash | 2019-07-19 | 1 | -1/+1 |
| | | | | | | | | | | | | | | | The return value is a u32, not an s32, so this would result in an implicit signedness conversion. | ||||
| * | | video_core/control_flow: Remove unnecessary BlockStack copy constructor | Lioncash | 2019-07-19 | 1 | -2/+1 |
| | | | | | | | | | | | | | | | | | | | | | | | | This is the default behavior of the copy constructor, so it doesn't need to be specified. While we're at it we can make the other non-default constructor explicit. | ||||
| * | | video_core/control_flow: Use std::move where applicable | Lioncash | 2019-07-19 | 1 | -10/+15 |
| | | | | | | | | | | | | Results in less work being done where avoidable. | ||||
| * | | video_core/control_flow: Use the prefix variant of operator++ for iterators | Lioncash | 2019-07-19 | 1 | -2/+2 |
| | | | | | | | | | | | | | | | Same thing, but potentially allows a standard library implementation to pick a more efficient codepath. | ||||
| * | | video_core/control_flow: Use empty() member function for checking emptiness | Lioncash | 2019-07-19 | 1 | -2/+2 |
| | | | | | | | | | | | | It's what it's there for. | ||||
| * | | video_core: Resolve -Wreorder warnings | Lioncash | 2019-07-19 | 1 | -1/+1 |
| | | | | | | | | | | | | | | | Ensures that the constructor members are always initialized in the order that they're declared in. | ||||
| * | | video_core/control_flow: Make program_size for ScanFlow() a std::size_t | Lioncash | 2019-07-19 | 2 | -5/+4 |
| | | | | | | | | | | | | | | | | | | Prevents a truncation warning from occurring with MSVC. Also the internal data structures already treat it as a size_t, so this is just a discrepancy in the interface. | ||||
| * | | video_core/control_flow: Place all internally linked types/functions within an anonymous namespace | Lioncash | 2019-07-19 | 1 | -1/+2 |
| | | | | | | | | | | | | | | | Previously, quite a few functions were being linked with external linkage. | ||||
| * | | video_core/shader/decode: Prevent sign-conversion warnings | Lioncash | 2019-07-19 | 1 | -2/+2 |
| | | | | | | | | | | | | Makes it explicit that the conversions here are intentional. | ||||
* | | | Merge pull request #2737 from FernandoS27/track-fix | bunnei | 2019-07-25 | 1 | -2/+2 |
|\ \ \ | | | | | | | | | Shader_Ir: Correct tracking to track from right to left | ||||
| * | | | Shader_Ir: Correct tracking to track from right to left | Fernando Sahmkow | 2019-07-16 | 1 | -2/+2 |
| | | | | |||||
* | | | | Merge pull request #2743 from FernandoS27/surpress-assert | bunnei | 2019-07-25 | 5 | -13/+20 |
|\ \ \ \ | |_|_|/ |/| | | | Downgrade and suppress a series of GPU asserts and debug messages. | ||||
| * | | | Shader_Ir: Change Debug Asserts for Log Warnings | Fernando Sahmkow | 2019-07-20 | 3 | -10/+17 |
| | | | | |||||
| * | | | Shader_Ir: correct clang format | Fernando Sahmkow | 2019-07-18 | 1 | -2/+2 |
| | | | | |||||
| * | | | Shader_Ir: Downgrade precision and rounding asserts to debug asserts. | Fernando Sahmkow | 2019-07-18 | 5 | -10/+10 |
| | | | | | | | | | | | | | | | | | | | | | | | | This commit reduces the severity of asserts for FP precision and rounding, as these are well known and have little to no consequence for the GPU's accuracy. | ||||
* | | | | shader/half_set_predicate: Fix HSETP2 implementation | ReinUsesLisp | 2019-07-20 | 2 | -19/+15 |
| | | | | |||||
* | | | | shader/half_set_predicate: Implement missing HSETP2 variants | ReinUsesLisp | 2019-07-20 | 1 | -13/+29 |
| |_|/ |/| | | |||||
* | | | Merge pull request #2738 from lioncash/shader-ir | bunnei | 2019-07-18 | 8 | -99/+103 |
|\ \ \ | |/ / |/| | | shader-ir: Minor cleanup-related changes | ||||
| * | | shader_ir: std::move Node instance where applicable | Lioncash | 2019-07-17 | 4 | -60/+67 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | These are std::shared_ptr instances underneath the hood, which means copying them isn't as cheap as a regular pointer. Particularly so on weakly-ordered systems. This avoids atomic reference count increments and decrements where they aren't necessary for the core set of operations. | ||||
| * | | shader_ir: Rename Get/SetTemporal to Get/SetTemporary | Lioncash | 2019-07-17 | 5 | -36/+36 |
| | | | | | | | | | | | | | | | | | | This is more accurate in terms of describing what the functions are actually doing. Temporal relates to time, not the setting of a temporary itself. | ||||
| * | | shader_ir: Remove unused includes | Lioncash | 2019-07-17 | 1 | -3/+0 |
| |/ | | | | | | | Removes unnecessary header dependencies. | ||||
* | | Merge pull request #2740 from lioncash/bra | Fernando Sahmkow | 2019-07-17 | 1 | -1/+1 |
|\ \ | |/ |/| | shader/decode/other: Correct branch indirect argument within BRA handling | ||||
| * | shader/decode/other: Correct branch indirect argument within BRA handling | Lioncash | 2019-07-16 | 1 | -1/+1 |
| | | | | | | | | | | This appears to have been a copy/paste error introduced within 8a6fc529a968e007f01464abadd32f9b5eb0a26c | ||||
* | | Merge pull request #2565 from ReinUsesLisp/track-indirect | Fernando Sahmkow | 2019-07-16 | 6 | -35/+36 |
|\ \ | |/ |/| | shader/track: Track indirect buffers | ||||
| * | shader: Allow tracking of indirect buffers without variable offset | ReinUsesLisp | 2019-07-15 | 6 | -35/+36 |
| | | | | | | | | | | | | While changing this code, simplify tracking code to allow returning the base address node, this way callers don't have to manually rebuild it on each invocation. | ||||
* | | Merge pull request #2695 from ReinUsesLisp/layer-viewport | Fernando Sahmkow | 2019-07-15 | 2 | -0/+31 |
|\ \ | |/ |/| | gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders | ||||
| * | gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders | ReinUsesLisp | 2019-07-08 | 2 | -0/+31 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | This commit implements gl_ViewportIndex and gl_Layer in vertex and geometry shaders. In the case it's used in a vertex shader, it requires ARB_shader_viewport_layer_array. This extension is available on AMD and Nvidia devices (mesa and proprietary drivers), but not available on Intel on any platform. At the moment of writing this description I don't know if this is a hardware limitation or a driver limitation. In the case that ARB_shader_viewport_layer_array is not available, writes to these registers on a vertex shader are ignored, with the appropriate logging. | ||||
* | | Merge pull request #2692 from ReinUsesLisp/tlds-f16 | Fernando Sahmkow | 2019-07-14 | 1 | -1/+7 |
|\ \ | | | | | | | shader/texture: Add F16 support for TLDS | ||||
| * | | shader/texture: Add F16 support for TLDS | ReinUsesLisp | 2019-07-07 | 1 | -1/+7 |
| | | | |||||
* | | | shader_ir: Add comments on missing instruction. | Fernando Sahmkow | 2019-07-09 | 2 | -2/+9 |
| | | | | | | | | | | | | Also shows Nvidia's address space on comments. | ||||
* | | | shader_ir: limit exploration to best known program size. | Fernando Sahmkow | 2019-07-09 | 1 | -1/+1 |
| | | | |||||
* | | | control_flow: Correct block breaking algorithm. | Fernando Sahmkow | 2019-07-09 | 1 | -17/+17 |
| | | | |||||
* | | | control_flow: Assert shaders bigger than limit. | Fernando Sahmkow | 2019-07-09 | 1 | -0/+2 |
| | | | |||||
* | | | control_flow: Address feedback. | Fernando Sahmkow | 2019-07-09 | 1 | -89/+37 |
| | | | |||||
* | | | shader_ir: Correct parsing of scheduling instructions and correct sizing | Fernando Sahmkow | 2019-07-09 | 2 | -13/+30 |
| | | | |||||
* | | | shader_ir: Correct max sizing | Fernando Sahmkow | 2019-07-09 | 2 | -2/+2 |
| | | | |||||
* | | | shader_ir: Remove unnecessary constructors and use optional for ScanFlow result | Fernando Sahmkow | 2019-07-09 | 3 | -28/+17 |
| | | | |||||
* | | | shader_ir: Corrections, documenting and asserting control_flow | Fernando Sahmkow | 2019-07-09 | 3 | -52/+54 |
| | | | |||||
* | | | shader_ir: Unify blocks in decompiled shaders. | Fernando Sahmkow | 2019-07-09 | 6 | -54/+79 |
| | | | |||||
* | | | shader_ir: Decompile Flow Stack | Fernando Sahmkow | 2019-07-09 | 4 | -11/+206 |
| | | | |||||
* | | | shader_ir: propagate shader size to the IR | Fernando Sahmkow | 2019-07-09 | 3 | -6/+7 |
| | | | |||||
* | | | shader_ir: Implement BRX & BRA.CC | Fernando Sahmkow | 2019-07-09 | 3 | -4/+42 |
| | | | |||||
* | | | shader_ir: Remove the old scanner. | Fernando Sahmkow | 2019-07-09 | 2 | -77/+0 |
| | | | |||||
* | | | shader_ir: Implement a new shader scanner | Fernando Sahmkow | 2019-07-09 | 3 | -16/+471 |
| |/ |/| | |||||
* | | Delete decode_integer_set.cpp | Tobias | 2019-07-07 | 1 | -0/+0 |
|/ | |||||
* | decode/texture: Address feedback | ReinUsesLisp | 2019-06-24 | 1 | -0/+1 |
| | |||||
* | texture_cache: Style and Corrections | Fernando Sahmkow | 2019-06-21 | 1 | -1/+2 |
| | |||||
* | shader_ir: Fix image copy rebase issues | Fernando Sahmkow | 2019-06-21 | 1 | -2/+7 |
| | |||||
* | shader: Implement bindless images | ReinUsesLisp | 2019-06-21 | 3 | -2/+40 |
| | |||||
* | shader: Decode SUST and implement backing image functionality | ReinUsesLisp | 2019-06-21 | 4 | -1/+140 |
| | |||||
* | shader: Implement texture buffers | ReinUsesLisp | 2019-06-21 | 2 | -0/+46 |
| | |||||
* | shader: Split SSY and PBK stack | ReinUsesLisp | 2019-06-07 | 2 | -11/+14 |
| | | | | | | | | | | | Hardware testing revealed that SSY and PBK push to a different stack, allowing code like this: SSY label1; PBK label2; SYNC; label1: PBK; label2: EXIT; | ||||
* | shader/node: Minor changes | ReinUsesLisp | 2019-06-07 | 1 | -50/+54 |
| | | | | | | | Reflect std::shared_ptr nature of Node on initializers and remove constant members in nodes. Add some commentaries. | ||||
* | shader: Move Node declarations out of the shader IR header | ReinUsesLisp | 2019-06-07 | 3 | -493/+517 |
| | | | | | | Analysis passes do not have a good reason to depend on shader_ir.h to work on top of nodes. This splits node-related declarations to their own file and leaves the IR in shader_ir.h | ||||
* | shader: Use shared_ptr to store nodes and move initialization to file | ReinUsesLisp | 2019-06-06 | 32 | -192/+238 |
| | | | | | | | | Instead of storing nodes as unique_ptr in a vector and returning raw pointers to them, use shared_ptr. While changing initialization code, move it to a separate file when possible. This is a first step to allow code analysis and node generation beyond the ShaderIR class. | ||||
* | Merge pull request #2446 from ReinUsesLisp/tid | bunnei | 2019-05-29 | 2 | -15/+35 |
|\ | | | | | shader: Implement S2R Tid{XYZ} and CtaId{XYZ} | ||||
| * | shader: Implement S2R Tid{XYZ} and CtaId{XYZ} | ReinUsesLisp | 2019-05-20 | 2 | -15/+35 |
| | | |||||
* | | Merge pull request #2485 from ReinUsesLisp/generic-memory | bunnei | 2019-05-25 | 2 | -31/+57 |
|\ \ | | | | | | | shader/memory: Implement generic memory stores and loads (ST and LD) | ||||
| * | | shader/memory: Implement ST (generic memory) | ReinUsesLisp | 2019-05-21 | 1 | -21/+35 |
| | | | |||||
| * | | shader/memory: Implement LD (generic memory) | ReinUsesLisp | 2019-05-21 | 2 | -11/+23 |
| |/ | |||||
* | | shader/shader_ir: Make Comment() take a std::string by value | Lioncash | 2019-05-23 | 2 | -3/+3 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows for forming comment nodes without making unnecessary copies of the std::string instance. E.g. previously: Comment(fmt::format("Base address is c[0x{:x}][0x{:x}]", cbuf->GetIndex(), cbuf_offset)); would result in a copy of the string being created, as CommentNode() takes a std::string by value (a const ref passed to a value parameter results in a copy). Now, only one instance of the string is ever moved around: fmt::format returns a std::string by value, which is a prvalue (and can be treated like an rvalue), so it's moved into Comment's string parameter. We then move it into the CommentNode constructor, which in turn moves the string into its member variable. | ||||
* | | shader/decode/*: Add missing newline to files lacking them | Lioncash | 2019-05-23 | 18 | -18/+18 |
| | | | | | | | | Keeps the shader code file endings consistent. | ||||
* | | shader/decode/*: Eliminate indirect inclusions | Lioncash | 2019-05-23 | 6 | -1/+5 |
| | | | | | | | | | | | | | | Amends cases where we were using things that were indirectly being satisfied through other headers. This way, if those headers change and eliminate dependencies on other headers in the future, we don't have cascading compilation errors. | ||||
* | | shader/decode/memory: Remove left in debug pragma | Lioncash | 2019-05-22 | 1 | -2/+0 |
|/ | |||||
* | Merge pull request #2441 from ReinUsesLisp/al2p | bunnei | 2019-05-19 | 4 | -34/+67 |
|\ | | | | | shader: Implement AL2P and ALD.PHYS | ||||
| * | shader_ir/other: Implement IPA.IDX | ReinUsesLisp | 2019-05-03 | 1 | -5/+8 |
| | | |||||
| * | shader_ir/memory: Assert on non-32 bits ALD.PHYS | ReinUsesLisp | 2019-05-03 | 1 | -0/+3 |
| | | |||||
| * | shader: Add physical attributes commentaries | ReinUsesLisp | 2019-05-03 | 3 | -4/+6 |
| | | |||||
| * | gl_shader_decompiler: Implement GLSL physical attributes | ReinUsesLisp | 2019-05-03 | 1 | -1/+1 |
| | | |||||
| * | shader_ir/memory: Implement physical input attributes | ReinUsesLisp | 2019-05-03 | 3 | -6/+28 |
| | | |||||
| * | shader: Remove unused AbufNode Ipa mode | ReinUsesLisp | 2019-05-03 | 4 | -29/+10 |
| | | |||||
| * | shader_ir/memory: Emit AL2P IR | ReinUsesLisp | 2019-05-03 | 2 | -0/+22 |
| | | |||||
* | | shader/shader_ir: Remove unnecessary inline specifiers | Lioncash | 2019-05-19 | 1 | -2/+2 |
| | | | | | | | | | | constexpr internally links by default, so the inline specifier is unnecessary. | ||||
* | | shader/shader_ir: Simplify constructors for OperationNode | Lioncash | 2019-05-19 | 1 | -15/+6 |
| | | | | | | | | | | | | | | | | | | | | Many of these constructors don't even need to be templated. The only ones that need to be templated are the ones that actually make use of the parameter pack. Even then, since std::vector accepts an initializer list, we can supply the parameter pack directly to it instead of creating our own copy of the list, then copying it again into the std::vector. | ||||
* | | shader/shader_ir: Remove unnecessary template parameter packs from Operation() overloads where applicable | Lioncash | 2019-05-19 | 1 | -2/+0 |
| | | | | | | | | | | These overloads don't actually make use of the parameter pack, so they can be turned into regular non-template function overloads. | ||||
* | | shader/shader_ir: Mark tracking functions as const member functions | Lioncash | 2019-05-19 | 2 | -8/+11 |
| | | | | | | | | | | These don't actually modify instance state, so they can be marked as const member functions. | ||||
* | | shader/shader_ir: Place implementations of constructor and destructor in cpp file | Lioncash | 2019-05-19 | 2 | -5/+9 |
| | | | | | | | | | | | | Given the class contains quite a lot of non-trivial types, place the constructor and destructor within the cpp file to avoid inlining construction and destruction code everywhere the class is used. | ||||
* | | video_core/shader/decode/texture: Remove unused variable from GetTld4Code() | Lioncash | 2019-05-10 | 1 | -1/+0 |
| | | |||||
* | | shader/decode/texture: Remove unused variable | Lioncash | 2019-05-04 | 1 | -1/+0 |
|/ | | | | This isn't used anywhere, so we can get rid of it. | ||||
* | Merge pull request #2435 from ReinUsesLisp/misc-vc | bunnei | 2019-04-29 | 2 | -3/+4 |
|\ | | | | | shader_ir: Miscellaneous fixes | ||||
| * | shader_ir: Move Sampler index entry in operand< to sort declarations | ReinUsesLisp | 2019-04-26 | 1 | -2/+2 |
| | | |||||
| * | shader_ir: Add missing entry to Sampler operand< comparison | ReinUsesLisp | 2019-04-26 | 1 | -2/+3 |
| | | |||||
| * | shader_ir/texture: Fix sampler const buffer key shift | ReinUsesLisp | 2019-04-26 | 1 | -1/+1 |
| | | |||||
* | | Merge pull request #2322 from ReinUsesLisp/wswitch | bunnei | 2019-04-29 | 4 | -9/+16 |
|\ \ | | | | | | | video_core: Silent -Wswitch warnings | ||||
| * | | video_core: Silent -Wswitch warnings | ReinUsesLisp | 2019-04-18 | 4 | -9/+16 |
| | | | |||||
* | | | Merge pull request #2423 from FernandoS27/half-correct | bunnei | 2019-04-29 | 2 | -15/+16 |
|\ \ \ | |_|/ |/| | | Corrections on Half Float operations: HADD2 HMUL2 and HFMA2 | ||||
| * | | Corrections Half Float operations on const buffers and implement saturation. | Fernando Sahmkow | 2019-04-21 | 2 | -15/+16 |
| | | | |||||
* | | | Merge pull request #2407 from FernandoS27/f2f | bunnei | 2019-04-20 | 1 | -16/+53 |
|\ \ \ | |/ / |/| | | Do some corrections in conversion shader instructions. | ||||
| * | | Do some corrections in conversion shader instructions. | Fernando Sahmkow | 2019-04-16 | 1 | -16/+53 |
| |/ | | | | | | | | | | Corrects encodings for I2F, F2F, I2I and F2I. Implements immediate variants of all four conversion types. Adds assertions for unimplemented features. | ||||
* | | Merge pull request #2409 from ReinUsesLisp/half-floats | bunnei | 2019-04-20 | 7 | -81/+85 |
|\ \ | | | | | | | shader_ir/decode: Miscellaneous fixes to half-float decompilation | ||||
| * | | shader_ir/decode: Fix half float pre-operations and remove MetaHalfArithmetic | ReinUsesLisp | 2019-04-16 | 7 | -52/+42 |
| | | | | | | | | | | | | | | | | | | | | | Operations done before the main half float operation (like HAdd) were managing a packed value instead of the unpacked one. Adding an unpacked operation allows us to drop the per-operand MetaHalfArithmetic entry, simplifying the code overall. | ||||
| * | | shader_ir/decode: Implement half float saturation | ReinUsesLisp | 2019-04-16 | 3 | -4/+14 |
| | | | |||||
| * | | shader_ir/decode: Reduce severity of unimplemented half-float FTZ | ReinUsesLisp | 2019-04-16 | 3 | -3/+9 |
| | | | |||||
| * | | renderer_opengl: Implement half float NaN comparisons | ReinUsesLisp | 2019-04-16 | 2 | -18/+17 |
| | | | |||||
| * | | shader_ir: Avoid using static on heap-allocated objects | ReinUsesLisp | 2019-04-16 | 1 | -5/+4 |
| |/ | | | | | | | | | Using static here might be faster at runtime, but it adds a heap allocation called before main. | ||||
* | | Merge pull request #2348 from FernandoS27/guest-bindless | bunnei | 2019-04-18 | 2 | -24/+129 |
|\ \ | | | | | | | Implement Bindless Textures on Shader Decompiler and GL backend | ||||
| * | | Adapt Bindless to work with AOFFI | Fernando Sahmkow | 2019-04-08 | 1 | -7/+18 |
| | | | |||||
| * | | Move ConstBufferAccessor to Maxwell3d, correct mistakes and clang format. | Fernando Sahmkow | 2019-04-08 | 2 | -3/+4 |
| | | | |||||
| * | | Fix TMML | Fernando Sahmkow | 2019-04-08 | 1 | -5/+7 |
| | | | |||||
| * | | Refactor GetTextureCode and GetTexCode to use an optional instead of optional parameters | Fernando Sahmkow | 2019-04-08 | 2 | -34/+33 |
| | | | |||||
| * | | Implement TXQ_B | Fernando Sahmkow | 2019-04-08 | 1 | -2/+8 |
| | | | |||||
| * | | Implement TMML_B | Fernando Sahmkow | 2019-04-08 | 1 | -5/+10 |
| | | | |||||
| * | | Corrections to TEX_B | Fernando Sahmkow | 2019-04-08 | 1 | -4/+5 |
| | | | |||||
| * | | Implement Bindless Handling on SetupTexture | Fernando Sahmkow | 2019-04-08 | 1 | -4/+3 |
| | | | |||||
| * | | Unify both sampler types. | Fernando Sahmkow | 2019-04-08 | 2 | -18/+40 |
| | | | |||||
| * | | Implement Bindless Samplers and TEX_B in the IR. | Fernando Sahmkow | 2019-04-08 | 2 | -15/+74 |
| | | | |||||
* | | | Merge pull request #2315 from ReinUsesLisp/severity-decompiler | bunnei | 2019-04-17 | 1 | -4/+5 |
|\ \ \ | | | | | | | | | shader_ir/decode: Reduce the severity of common assertions | ||||
| * | | | shader_ir/memory: Reduce severity of LD_L cache management and log it | ReinUsesLisp | 2019-04-03 | 1 | -2/+2 |
| | | | | |||||
| * | | | shader_ir/memory: Reduce severity of ST_L cache management and log it | ReinUsesLisp | 2019-04-03 | 1 | -2/+3 |
| | | | | |||||
* | | | | shader_ir: Implement STG, keep track of global memory usage and flush | ReinUsesLisp | 2019-04-14 | 2 | -38/+87 |
| |_|/ |/| | | |||||
* | | | Correct XMAD mode, psl and high_b on different encodings. | Fernando Sahmkow | 2019-04-08 | 1 | -9/+30 |
| |/ |/| | |||||
* | | shader_ir/decode: Silent implicit sign conversion warning | Mat M | 2019-03-31 | 1 | -2/+2 |
| | | | | | | Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc> | ||||
* | | shader_ir/decode: Implement AOFFI for TEX and TLD4 | ReinUsesLisp | 2019-03-30 | 2 | -27/+94 |
| | | |||||
* | | shader_ir: Implement immediate register tracking | ReinUsesLisp | 2019-03-30 | 2 | -1/+19 |
|/ | |||||
* | shader/decode: Remove extras from MetaTexture | ReinUsesLisp | 2019-02-26 | 2 | -15/+26 |
| | |||||
* | shader/decode: Split memory and texture instructions decoding | ReinUsesLisp | 2019-02-26 | 4 | -493/+527 |
| | |||||
* | shader/track: Resolve variable shadowing warnings | Lioncash | 2019-02-25 | 1 | -5/+5 |
| | |||||
* | Merge pull request #2118 from FernandoS27/ipa-improve | bunnei | 2019-02-25 | 2 | -3/+14 |
|\ | | | | | shader_decompiler: Improve Accuracy of Attribute Interpolation. | ||||
| * | shader_decompiler: Improve Accuracy of Attribute Interpolation. | Fernando Sahmkow | 2019-02-14 | 2 | -3/+14 |
| | | |||||
* | | gl_shader_decompiler: Re-implement TLDS lod | ReinUsesLisp | 2019-02-12 | 1 | -1/+1 |
|/ | |||||
* | Merge pull request #2108 from FernandoS27/fix-cc | bunnei | 2019-02-12 | 1 | -2/+2 |
|\ | | | | | Fix incorrect value for CC bit in IADD | ||||
| * | Fix incorrect value for CC bit in IADD | Fernando Sahmkow | 2019-02-11 | 1 | -2/+2 |
| | | |||||
* | | Merge pull request #2109 from FernandoS27/fix-f2i | bunnei | 2019-02-12 | 1 | -3/+3 |
|\ \ | | | | | | | Corrected F2I None mode to RoundEven. | ||||
| * | | Corrected F2I None mode to RoundEven. | Fernando Sahmkow | 2019-02-11 | 1 | -3/+3 |
| |/ | |||||
* | | shader_ir: Remove F4 prefix to texture operations | ReinUsesLisp | 2019-02-07 | 2 | -14/+13 |
| | | | | | | | | | | | | This was originally included because texture operations returned a vec4. These operations now return a single float and the F4 prefix doesn't mean anything. | ||||
* | | shader_ir: Clean texture management code | ReinUsesLisp | 2019-02-07 | 2 | -101/+63 |
|/ | | | | | | | | Previous code relied on GLSL parameter order (something that's always ill-formed on an IR design). This approach passes spatial coordinates through operation nodes and array and depth compare values in the texture metadata. It still contains an "extra" vector containing generic nodes for bias and component index (for example) which is still a bit ill-formed but it should be better than the previous approach. | ||||
* | Merge pull request #2083 from ReinUsesLisp/shader-ir-cbuf-tracking | bunnei | 2019-02-07 | 29 | -124/+138 |
|\ | | | | | shader/track: Add a more permissive global memory tracking | ||||
| * | shader/track: Search inside of conditional nodes | ReinUsesLisp | 2019-02-03 | 1 | -0/+11 |
| | | | | | | | | | | | Some games conditionally use global memory instructions. This allows the heuristic to search inside conditional nodes for the source constant buffer. | ||||
| * | shader_ir: Rename BasicBlock to NodeBlock | ReinUsesLisp | 2019-02-03 | 29 | -119/+117 |
| | | | | | | | | It's not always used as a basic block. Rename it for consistency. | ||||
| * | shader_ir: Pass decoded nodes as a whole instead of per basic blocks | ReinUsesLisp | 2019-02-03 | 27 | -57/+62 |
| | | | | | | | | | | | | | | | | | Some games call LDG at the top of a basic block, causing the tracking heuristic to fail. This commit lets the heuristic see the decoded nodes as a whole instead of per basic block. This may lead to some false positives but allows the heuristic to track cases it previously couldn't. | ||||
* | | gl_shader_disk_cache: Save GLSL and entries into the precompiled file | ReinUsesLisp | 2019-02-07 | 1 | -0/+9 |
| | | |||||
* | | Merge pull request #2081 from ReinUsesLisp/lmem-64 | bunnei | 2019-02-05 | 1 | -12/+43 |
|\ \ | | | | | | | shader_ir/memory: Add LD_L 64 bits loads | ||||
| * | | shader_ir/memory: Add ST_L 64 and 128 bits stores | ReinUsesLisp | 2019-02-03 | 1 | -3/+11 |
| | | | |||||
| * | | shader_ir/memory: Add LD_L 128 bits loads | ReinUsesLisp | 2019-02-03 | 1 | -7/+19 |
| | | | |||||
| * | | shader_bytecode: Rename BytesN enums to BitsN | ReinUsesLisp | 2019-02-03 | 1 | -4/+4 |
| | | | |||||
| * | | shader_ir/memory: Add LD_L 64 bits loads | ReinUsesLisp | 2019-02-03 | 1 | -6/+17 |
| |/ | |||||
* | | Merge pull request #2082 from FernandoS27/txq-stl | bunnei | 2019-02-05 | 1 | -6/+9 |
|\ \ | |/ |/| | Fix TXQ not using the component mask. | ||||
| * | Fix TXQ not using the component mask. | Fernando Sahmkow | 2019-02-03 | 1 | -6/+9 |
| | | |||||
* | | shader_ir: Unify constant buffer offset values | ReinUsesLisp | 2019-01-30 | 14 | -22/+24 |
|/ | | | | | | Constant buffer values on the shader IR were using different offsets depending on whether the access was direct or indirect. cbuf34 has a non-multiplied offset while cbuf36 does. On shader decoding this commit multiplies it by four on cbuf34 queries. | ||||
* | shader_decode: Implement LDG and basic cbuf tracking | ReinUsesLisp | 2019-01-30 | 3 | -4/+159 |
| | |||||
* | shader/shader_ir: Amend three comment typos | Lioncash | 2019-01-28 | 1 | -3/+3 |
| | | | | | Given we're in the area, these are three trivial typos that can be corrected. | ||||
* | shader/shader_ir: Amend constructor initializer ordering for AbufNode | Lioncash | 2019-01-28 | 1 | -2/+2 |
| | | | | | Orders the class members in the same order that they would actually be initialized in. Gets rid of two compiler warnings. | ||||
* | shader/decode: Avoid a pessimizing std::move within DecodeRange() | Lioncash | 2019-01-28 | 1 | -1/+1 |
| | | | | | | Applying std::move to a local variable in a return statement has the potential to prevent copy elision from occurring, so this can just be converted into a regular return. | ||||
* | shader_ir: Fixup clang build | ReinUsesLisp | 2019-01-16 | 1 | -4/+6 |
| | |||||
* | shader_decode: Fixup XMAD | ReinUsesLisp | 2019-01-15 | 1 | -1/+1 |
| | |||||
* | shader_ir: Pass to decoder functions basic block's code | ReinUsesLisp | 2019-01-15 | 27 | -82/+83 |
| | |||||
* | shader_decode: Improve zero flag implementation | ReinUsesLisp | 2019-01-15 | 15 | -75/+79 |
| | |||||
* | shader_ir: Remove composite primitives and use temporals instead | ReinUsesLisp | 2019-01-15 | 3 | -175/+187 |
| | |||||
* | shader_decode: Use proper primitive names | ReinUsesLisp | 2019-01-15 | 3 | -15/+13 |
| | |||||
* | shader_decode: Use BitfieldExtract instead of shift + and | ReinUsesLisp | 2019-01-15 | 7 | -48/+30 |
| | |||||
* | shader_ir: Remove Ipa primitive | ReinUsesLisp | 2019-01-15 | 2 | -5/+2 |
| | |||||
* | video_core: Rename glsl_decompiler to gl_shader_decompiler | ReinUsesLisp | 2019-01-15 | 2 | -1631/+0 |
| | |||||
* | shader_ir: Remove RZ and use Register::ZeroIndex instead | ReinUsesLisp | 2019-01-15 | 3 | -12/+16 |
| | |||||
* | shader_decode: Implement TEXS.F16 | ReinUsesLisp | 2019-01-15 | 3 | -15/+57 |
| | |||||
* | shader_decode: Fixup R2P | ReinUsesLisp | 2019-01-15 | 1 | -2/+3 |
| | |||||
* | glsl_decompiler: Fixup TLDS | ReinUsesLisp | 2019-01-15 | 1 | -1/+0 |
| | |||||
* | glsl_decompiler: Fixup geometry shaders | ReinUsesLisp | 2019-01-15 | 1 | -10/+16 |
| | |||||
* | shader_decode: Fixup WriteLogicOperation zero comparison | ReinUsesLisp | 2019-01-15 | 1 | -1/+1 |
| | |||||
* | glsl_decompiler: Fixup permissive member function declarations | ReinUsesLisp | 2019-01-15 | 1 | -133/+133 |
| | |||||
* | shader_decode: Fixup PSET | ReinUsesLisp | 2019-01-15 | 1 | -2/+3 |
| | |||||
* | shader_decode: Fixup clang-format | ReinUsesLisp | 2019-01-15 | 2 | -2/+4 |
| | |||||
* | video_core: Implement IR based geometry shaders | ReinUsesLisp | 2019-01-15 | 3 | -2/+96 |
| | |||||
* | shader_decode: Implement VMAD and VSETP | ReinUsesLisp | 2019-01-15 | 3 | -0/+125 |
| | |||||
* | shader_decode: Implement HSET2 | ReinUsesLisp | 2019-01-15 | 3 | -1/+50 |
| | |||||
* | shader_decode: Rework HSETP2 | ReinUsesLisp | 2019-01-15 | 4 | -47/+57 |
| | |||||
* | shader_decode: Implement R2P | ReinUsesLisp | 2019-01-15 | 1 | -1/+28 |
| | |||||
* | shader_decode: Implement CSETP | ReinUsesLisp | 2019-01-15 | 1 | -14/+37 |
| | |||||
* | shader_decode: Implement PSET | ReinUsesLisp | 2019-01-15 | 1 | -1/+16 |
| | |||||
* | shader_decode: Implement HFMA2 | ReinUsesLisp | 2019-01-15 | 3 | -5/+59 |
| | |||||
* | glsl_decompiler: Remove HNegate inlining | ReinUsesLisp | 2019-01-15 | 1 | -10/+0 |
| | |||||
* | shader_decode: Implement POPC | ReinUsesLisp | 2019-01-15 | 4 | -1/+22 |
| | |||||
* | shader_decode: Implement TLDS (untested) | ReinUsesLisp | 2019-01-15 | 3 | -10/+92 |
| | |||||
* | shader_decode: Update TLD4 reflecting #1862 changes | ReinUsesLisp | 2019-01-15 | 2 | -52/+52 |
| | |||||
* | shader_ir: Fixup TEX and TEXS and partially fix TLD4 decompiling | ReinUsesLisp | 2019-01-15 | 3 | -60/+72 |
| | |||||
* | shader_decode: Fixup FSET | ReinUsesLisp | 2019-01-15 | 1 | -2/+2 |
| | |||||
* | shader_decode: Implement IADD32I | ReinUsesLisp | 2019-01-15 | 1 | -0/+11 |
| | |||||
* | video_core: Return safe values after an assert hits | ReinUsesLisp | 2019-01-15 | 8 | -8/+19 |
| | |||||
* | shader_decode: Implement FFMA | ReinUsesLisp | 2019-01-15 | 1 | -1/+36 |
| | |||||
* | video_core: Address feedback | ReinUsesLisp | 2019-01-15 | 4 | -13/+16 |
| | |||||
* | shader_ir: Fixup file inclusions and clang-format | ReinUsesLisp | 2019-01-15 | 3 | -2/+2 |
| | |||||
* | shader_ir: Move comment node string | Mat M | 2019-01-15 | 1 | -2/+2 |
| | | | Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc> | ||||
* | shader_ir: Address feedback to avoid UB in bit casting | ReinUsesLisp | 2019-01-15 | 1 | -2/+4 |
| | |||||
* | shader_decode: Fixup clang-format | ReinUsesLisp | 2019-01-15 | 2 | -3/+2 |
| | |||||
* | shader_decode: Implement LEA | ReinUsesLisp | 2019-01-15 | 1 | -0/+55 |
| | |||||
* | shader_decode: Implement IADD3 | ReinUsesLisp | 2019-01-15 | 1 | -0/+61 |
| | |||||
* | shader_decode: Implement LOP3 | ReinUsesLisp | 2019-01-15 | 2 | -0/+62 |
| | |||||
* | shader_decode: Implement ST_L | ReinUsesLisp | 2019-01-15 | 1 | -0/+17 |
| | |||||
* | shader_decode: Implement LD_L | ReinUsesLisp | 2019-01-15 | 1 | -0/+18 |
| | |||||
* | shader_decode: Implement HSETP2 | ReinUsesLisp | 2019-01-15 | 1 | -1/+37 |
| | |||||
* | shader_decode: Implement HADD2 and HMUL2 | ReinUsesLisp | 2019-01-15 | 1 | -1/+48 |
| | |||||
* | shader_decode: Implement HADD2_IMM and HMUL2_IMM | ReinUsesLisp | 2019-01-15 | 1 | -1/+28 |
| | |||||
* | shader_decode: Implement MOV_SYS | ReinUsesLisp | 2019-01-15 | 1 | -0/+27 |
| | |||||
* | shader_decode: Implement IMNMX | ReinUsesLisp | 2019-01-15 | 1 | -0/+16 |
| | |||||
* | shader_decode: Implement F2F_C | ReinUsesLisp | 2019-01-15 | 1 | -2/+10 |
| | |||||
* | shader_decode: Implement I2I | ReinUsesLisp | 2019-01-15 | 1 | -0/+26 |
| | |||||
* | shader_decode: Implement BRA internal flag | ReinUsesLisp | 2019-01-15 | 1 | -4/+8 |
| | |||||
* | shader_decode: Implement ISCADD | ReinUsesLisp | 2019-01-15 | 1 | -0/+15 |
| | |||||
* | shader_decode: Implement XMAD | ReinUsesLisp | 2019-01-15 | 1 | -1/+85 |
| | |||||
* | shader_decode: Implement PBK and BRK | ReinUsesLisp | 2019-01-15 | 1 | -1/+22 |
| | |||||
* | shader_decode: Implement LOP | ReinUsesLisp | 2019-01-15 | 1 | -0/+15 |
| | |||||
* | shader_decode: Implement SEL | ReinUsesLisp | 2019-01-15 | 1 | -0/+8 |
| | |||||
* | shader_decode: Implement IADD | ReinUsesLisp | 2019-01-15 | 1 | -1/+28 |
| | |||||
* | shader_decode: Implement ISETP | ReinUsesLisp | 2019-01-15 | 1 | -1/+30 |
| | |||||
* | shader_decode: Implement BFI | ReinUsesLisp | 2019-01-15 | 1 | -1/+22 |
| | |||||
* | shader_decode: Implement ISET | ReinUsesLisp | 2019-01-15 | 1 | -1/+27 |
| | |||||
* | shader_decode: Implement LD_C | ReinUsesLisp | 2019-01-15 | 1 | -0/+31 |
| | |||||
* | shader_decode: Implement SHL | ReinUsesLisp | 2019-01-15 | 1 | -0/+8 |
| | |||||
* | shader_decode: Implement SHR | ReinUsesLisp | 2019-01-15 | 1 | -1/+26 |
| | |||||
* | shader_decode: Implement LOP32I | ReinUsesLisp | 2019-01-15 | 2 | -1/+72 |
| | |||||
* | shader_decode: Implement BFE | ReinUsesLisp | 2019-01-15 | 1 | -1/+25 |
| | |||||
* | shader_decode: Implement FSET | ReinUsesLisp | 2019-01-15 | 1 | -1/+36 |
| | |||||
* | shader_decode: Implement F2I | ReinUsesLisp | 2019-01-15 | 1 | -0/+37 |
| | |||||
* | shader_decode: Implement I2F | ReinUsesLisp | 2019-01-15 | 1 | -0/+23 |
| | |||||
* | shader_decode: Implement F2F | ReinUsesLisp | 2019-01-15 | 1 | -1/+37 |
| | |||||
* | shader_decode: Stub DEPBAR | ReinUsesLisp | 2019-01-15 | 1 | -0/+4 |
| | |||||
* | shader_decode: Implement SSY and SYNC | ReinUsesLisp | 2019-01-15 | 1 | -0/+19 |
| | |||||
* | shader_decode: Implement PSETP | ReinUsesLisp | 2019-01-15 | 1 | -1/+21 |
| | |||||
* | shader_decode: Implement TMML | ReinUsesLisp | 2019-01-15 | 1 | -3/+45 |
| | |||||
* | shader_decode: Implement TEX and TXQ | ReinUsesLisp | 2019-01-15 | 2 | -0/+223 |
| | |||||
* | shader_decode: Implement TEXS (F32) | ReinUsesLisp | 2019-01-15 | 2 | -0/+217 |
| | |||||
* | shader_decode: Implement FSETP | ReinUsesLisp | 2019-01-15 | 1 | -1/+33 |
| | |||||
* | shader_decode: Partially implement BRA | ReinUsesLisp | 2019-01-15 | 1 | -0/+12 |
| | |||||
* | shader_decode: Implement IPA | ReinUsesLisp | 2019-01-15 | 1 | -0/+12 |
| | |||||
* | shader_decode: Implement EXIT | ReinUsesLisp | 2019-01-15 | 1 | -1/+32 |
| | |||||
* | shader_decode: Implement ST_A | ReinUsesLisp | 2019-01-15 | 1 | -0/+30 |
| | |||||
* | shader_decode: Implement LD_A | ReinUsesLisp | 2019-01-15 | 1 | -1/+39 |
| | |||||
* | shader_decode: Implement FADD32I | ReinUsesLisp | 2019-01-15 | 1 | -0/+12 |
| | |||||
* | shader_decode: Implement FMUL32_IMM | ReinUsesLisp | 2019-01-15 | 1 | -0/+10 |
| | |||||
* | shader_decode: Implement MOV32_IMM | ReinUsesLisp | 2019-01-15 | 1 | -1/+9 |
| | |||||
* | shader_decode: Stub RRO_C, RRO_R and RRO_IMM | ReinUsesLisp | 2019-01-15 | 1 | -0/+9 |
| | |||||
* | shader_decode: Implement FMNMX_C, FMNMX_R and FMNMX_IMM | ReinUsesLisp | 2019-01-15 | 1 | -0/+18 |
| | |||||
* | shader_decode: Implement MUFU | ReinUsesLisp | 2019-01-15 | 1 | -0/+29 |
| | |||||
* | shader_decode: Implement FADD_C, FADD_R and FADD_IMM | ReinUsesLisp | 2019-01-15 | 1 | -0/+15 |
| | |||||
* | shader_decode: Implement FMUL_C, FMUL_R and FMUL_IMM | ReinUsesLisp | 2019-01-15 | 1 | -0/+42 |
| | |||||
* | shader_decode: Implement MOV_C and MOV_R | ReinUsesLisp | 2019-01-15 | 1 | -1/+23 |
| | |||||
* | glsl_decompiler: Implementation | ReinUsesLisp | 2019-01-15 | 2 | -0/+1481 |
| | |||||
* | shader_ir: Add condition code helper | ReinUsesLisp | 2019-01-15 | 2 | -0/+13 |
| | |||||
* | shader_ir: Add predicate combiner helper | ReinUsesLisp | 2019-01-15 | 2 | -0/+15 |
| | |||||
* | shader_ir: Add comparison helpers | ReinUsesLisp | 2019-01-15 | 2 | -0/+106 |
| | |||||
* | shader_ir: Add half float helpers | ReinUsesLisp | 2019-01-15 | 2 | -0/+44 |
| | |||||
* | shader_ir: Add integer helpers | ReinUsesLisp | 2019-01-15 | 2 | -0/+40 |
| | |||||
* | shader_ir: Add float helpers | ReinUsesLisp | 2019-01-15 | 2 | -0/+24 |
| | |||||
* | shader_ir: Add setters | ReinUsesLisp | 2019-01-15 | 2 | -0/+24 |
| | |||||
* | shader_ir: Add local memory getters | ReinUsesLisp | 2019-01-15 | 2 | -0/+7 |
| | |||||
* | shader_ir: Add internal flag getters | ReinUsesLisp | 2019-01-15 | 2 | -0/+10 |
| | |||||
* | shader_ir: Add attribute getters | ReinUsesLisp | 2019-01-15 | 2 | -0/+26 |
| | |||||
* | shader_ir: Add constant buffer getters | ReinUsesLisp | 2019-01-15 | 2 | -0/+25 |
| | |||||
* | shader_ir: Add register getter | ReinUsesLisp | 2019-01-15 | 2 | -0/+9 |
| | |||||
* | shader_ir: Add immediate node constructors | ReinUsesLisp | 2019-01-15 | 2 | -1/+34 |
| | |||||
* | shader_ir: Initial implementation | ReinUsesLisp | 2019-01-15 | 28 | -0/+1542 |
| | |||||
* | Remove references to PICA and rasterizers in video_core | James Rowe | 2018-01-13 | 9 | -2453/+0 |
| | |||||
* | Improved performance of FromAttributeBuffer | Huw Pascoe | 2017-09-17 | 1 | -1/+2 |
| | | | | | | | Ternary operator is optimized by the compiler whereas std::min() is meant to return a value. I've noticed a 5%-10% emulation speed increase. | ||||
* | pica/shader/jit: implement SETEMIT and EMIT | wwylele | 2017-08-19 | 2 | -2/+49 |
| | |||||
* | correct constness | wwylele | 2017-08-19 | 2 | -2/+4 |
| | |||||
* | pica/shader/interpreter: implement SETEMIT and EMIT | wwylele | 2017-08-19 | 1 | -0/+16 |
| | |||||
* | pica/shader: extend UnitState for GS | wwylele | 2017-08-19 | 2 | -0/+84 |
| | | | | | Among the four shader units in pica, a special unit can be configured to run both VS and GS programs. GSUnitState represents this unit, which extends UnitState (which represents the other three normal units) with extra state for primitive emitting. It uses lots of raw pointers to represent internal structure in order to keep it a standard-layout type for the JIT to access. This unit doesn't handle triangle winding (inverting) itself; instead, it calls a WindingSetter handler. This will be explained in the following commits. | ||||
* | pica/shader_interpreter: fix off-by-one in LOOP | wwylele | 2017-07-27 | 1 | -1/+1 |
| | |||||
* | Stop using reserved operator names (and/or/xor) with Xbyak | Yuri Kunde Schlesner | 2017-06-17 | 1 | -13/+13 |
| | | | | Also has the Dynarmic upgrade with the same change | ||||
* | Pica: Set program code / swizzle data limit to 4096 | Jannik Vogel | 2017-05-11 | 5 | -13/+16 |
| | | | | | | | | | | | | | One of the later commits will enable writing to GS regs. It turns out that on startup, most games will write 4096 GS program words. The current limit of 1024 would hence result in 3072 (4096 - 1024) error messages: ``` HW.GPU <Error> video_core/shader/shader.cpp:WriteProgramCode:229: Invalid GS program offset 1024 ``` New constants have been introduced to represent these limits. The swizzle data size has also been raised. This matches the given field sizes of [GPUREG_SH_OPDESCS_INDEX](https://3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_OPDESCS_INDEX) and [GPUREG_SH_CODETRANSFER_INDEX](https://www.3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_CODETRANSFER_INDEX) (12 bit = [0; 4095]). | ||||
* | Doxygen: Amend minor issues (#2593) | Mat M | 2017-02-27 | 2 | -2/+4 |
| | | | | | | | | | Corrects a few issues with regards to Doxygen documentation, for example: - Incorrect parameter referencing. - Missing @param tags. - Typos in @param tags. and a few minor other issues. | ||||
* | video_core/shader: Document sanitized MUL operation | Yuri Kunde Schlesner | 2017-02-12 | 1 | -0/+8 |
| | |||||
* | Merge pull request #2550 from yuriks/pica-refactor2 | Yuri Kunde Schlesner | 2017-02-12 | 2 | -2/+4 |
|\ | | | | | Small VideoCore cleanups | ||||
| * | VideoCore: Split regs.h inclusions | Yuri Kunde Schlesner | 2017-02-09 | 2 | -2/+4 |
| | | |||||
* | | video_core: Fix benign out-of-bounds indexing of array (#2553) | Yuri Kunde Schlesner | 2017-02-11 | 1 | -2/+1 |
|/ | | | | | | The resulting pointer wasn't written to unless the index was verified as valid, but that's still UB and triggered debug checks in MSVC. Reported by garrettboast on IRC | ||||
* | VideoCore: Move Regs to its own file | Yuri Kunde Schlesner | 2017-02-04 | 2 | -2/+2 |
| | |||||
* | VideoCore: Split shader regs from Regs struct | Yuri Kunde Schlesner | 2017-02-04 | 4 | -6/+6 |
| | |||||
* | VideoCore: Split rasterizer regs from Regs struct | Yuri Kunde Schlesner | 2017-02-04 | 2 | -13/+13 |
| | |||||
* | Merge pull request #2476 from yuriks/shader-refactor3 | Yuri Kunde Schlesner | 2017-02-04 | 4 | -78/+58 |
|\ | | | | | Oh No! More shader changes! | ||||
| * | VideoCore: Extract swrast-specific data from OutputVertex | Yuri Kunde Schlesner | 2017-01-30 | 2 | -37/+14 |
| | | |||||
| * | VideoCore/Shader: Clean up OutputVertex::FromAttributeBuffer | Yuri Kunde Schlesner | 2017-01-30 | 1 | -9/+14 |
| | | | | | | | | | | This also fixes a long-standing but nevertheless harmless memory corruption bug, where the padding of the OutputVertex struct would get corrupted by unused attributes. | ||||
| * | VideoCore: Split shader output writing from semantic loading | Yuri Kunde Schlesner | 2017-01-30 | 2 | -18/+16 |
| | | |||||
| * | VideoCore: Consistently use shader configuration to load attributes | Yuri Kunde Schlesner | 2017-01-30 | 4 | -12/+12 |
| | | |||||
| * | VideoCore: Rename some types to more accurate names | Yuri Kunde Schlesner | 2017-01-30 | 4 | -6/+6 |
| | | |||||
* | | ShaderJIT: add 16 dummy bytes at the bottom of the stack | wwylele | 2017-02-03 | 1 | -2/+5 |
| | | |||||
* | | Common/x64: remove legacy emitter and abi (#2504) | Weiyi Wang | 2017-01-31 | 1 | -1/+0 |
| | | | | | | These are not used any more since we moved shader JIT to xbyak. | ||||
* | | shader_jit_x64_compiler: esi and edi should be persistent (#2500) | Merry | 2017-01-31 | 1 | -0/+2 |
|/ | |||||
* | VideoCore/Shader: Move entry_point to SetupBatch | Yuri Kunde Schlesner | 2017-01-26 | 5 | -22/+23 |
| | |||||
* | VideoCore/Shader: Move per-batch ShaderEngine state into ShaderSetup | Yuri Kunde Schlesner | 2017-01-26 | 5 | -40/+36 |
| | |||||
* | Shader: Remove OutputRegisters struct | Yuri Kunde Schlesner | 2017-01-26 | 3 | -19/+13 |
| | |||||
* | Shader: Initialize conditional_code in interpreter | Yuri Kunde Schlesner | 2017-01-26 | 2 | -3/+3 |
| | | | | | | | This doesn't belong in LoadInputVertex because it also happens for non-VS invocations. Since it's not used by the JIT it seems adequate to initialize it in the interpreter, which is the only thing that cares about it. | ||||
* | Shader: Don't read ShaderSetup from global state | Yuri Kunde Schlesner | 2017-01-26 | 1 | -3/+3 |
| | |||||
* | shader_jit_x64: Don't read program from global state | Yuri Kunde Schlesner | 2017-01-26 | 3 | -22/+22 |
| | |||||
* | VideoCore/Shader: Move ProduceDebugInfo to InterpreterEngine | Yuri Kunde Schlesner | 2017-01-26 | 4 | -19/+10 |
| | |||||
* | VideoCore/Shader: Split interpreter and JIT into separate ShaderEngines | Yuri Kunde Schlesner | 2017-01-26 | 6 | -96/+150 |
| | |||||
* | VideoCore/Shader: Rename shader_jit_x64{ => _compiler}.{cpp,h} | Yuri Kunde Schlesner | 2017-01-26 | 3 | -2/+2 |
| | |||||
* | VideoCore/Shader: Split shader uniform state and shader engine | Yuri Kunde Schlesner | 2017-01-26 | 3 | -16/+46 |
| | | | | | Currently there's only a single dummy implementation, which will be split in a following commit. | ||||
* | VideoCore/Shader: Add constness to methods | Yuri Kunde Schlesner | 2017-01-26 | 2 | -4/+4 |
| | |||||
* | VideoCore/Shader: Use only entry_point as ShaderSetup param | Yuri Kunde Schlesner | 2017-01-26 | 2 | -9/+11 |
| | | | | | This removes all implicit dependency of ShaderState on global PICA state. | ||||
* | VideoCore/Shader: Use self instead of g_state.vs in ShaderSetup | Yuri Kunde Schlesner | 2017-01-26 | 2 | -11/+8 |
| | |||||
* | VideoCore/Shader: Extract input vertex loading code into function | Yuri Kunde Schlesner | 2017-01-26 | 2 | -20/+22 |
| | |||||
* | video_core: fix shader.cpp signed / unsigned warning | Kloen | 2017-01-23 | 1 | -2/+2 |
| | |||||
* | Fix some warnings (#2399) | Jonathan Hao | 2017-01-04 | 1 | -2/+0 |
| | |||||
* | VideoCore/Shader: Extract DebugData out from UnitState | Yuri Kunde Schlesner | 2016-12-16 | 7 | -101/+97 |
| | |||||
* | Remove unnecessary cast | Yuri Kunde Schlesner | 2016-12-16 | 1 | -3/+1 |
| | |||||
* | VideoCore/Shader: Extract evaluate_condition lambda to function scope | Yuri Kunde Schlesner | 2016-12-16 | 1 | -26/+24 |
| | |||||
* | VideoCore/Shader: Extract call lambda up a scope and remove unused param | Yuri Kunde Schlesner | 2016-12-16 | 1 | -21/+17 |
| | |||||
* | VideoCore/Shader: Remove dynamic control flow in (Get)UniformOffset | Yuri Kunde Schlesner | 2016-12-16 | 2 | -18/+11 |
| | |||||
* | VideoCore/Shader: Move DebugData to a separate file | Yuri Kunde Schlesner | 2016-12-16 | 3 | -172/+188 |
| | |||||
* | shader_jit_x64: Use LOOPCOUNT_REG as a 64-bit reg when indexing | Yuri Kunde Schlesner | 2016-12-15 | 1 | -1/+1 |
| | |||||
* | VideoCore: Eliminate an unnecessary copy in the drawcall loop | Yuri Kunde Schlesner | 2016-12-15 | 2 | -2/+2 |
| | |||||
* | shader_jit_x64: Use Reg32 for LOOP* registers, eliminating casts | Yuri Kunde Schlesner | 2016-12-15 | 1 | -16/+16 |
| | |||||
* | VideoCore: Convert x64 shader JIT to use Xbyak for assembly | Yuri Kunde Schlesner | 2016-12-15 | 2 | -223/+225 |
| | |||||
* | shader_jit: Fix non-SSE4.1 path where FLR would not truncate | Jannik Vogel | 2016-12-04 | 1 | -1/+1 |
| | |||||
* | shader_jit: Load LOOPCOUNT_REG and LOOPINC 4 bit left-shifted | Jannik Vogel | 2016-12-02 | 1 | -6/+9 |
| | |||||
* | VideoCore: Shader interpreter cleanups | Yuri Kunde Schlesner | 2016-09-30 | 1 | -32/+42 |
| | |||||
* | VideoCore: Fix out-of-bounds read in ShaderSetup::ProduceDebugInfo | Yuri Kunde Schlesner | 2016-09-30 | 1 | -3/+1 |
| | | | | | | As far as I can tell, memset was replaced by a fill without correcting the parameter type, causing an out-of-bounds array read in the Vec4 constructor. | ||||
* | Remove special rules for Windows.h and library includes | Yuri Kunde Schlesner | 2016-09-21 | 1 | -1/+1 |
| | |||||
* | Use negative priorities to avoid special-casing the self-include | Yuri Kunde Schlesner | 2016-09-21 | 3 | -3/+3 |
| | |||||
* | Remove empty newlines in #include blocks. | Emmanuel Gil Peyrot | 2016-09-21 | 5 | -22/+3 |
| | | | | | | | This makes clang-format useful on those. Also add a bunch of forgotten transitive includes, which otherwise prevented compilation. | ||||
* | Manually tweak source formatting and then re-run clang-format | Yuri Kunde Schlesner | 2016-09-19 | 4 | -9/+6 |
| | |||||
* | Sources: Run clang-format on everything. | Emmanuel Gil Peyrot | 2016-09-18 | 6 | -311/+335 |
| | |||||
* | VideoCore: Fix dangling lambda context in shader interpreter | Yuri Kunde Schlesner | 2016-09-16 | 1 | -1/+1 |
| | | | | | | The static meant that after the first execution, these lambda contexts would be pointing to a random location on the stack. Fixes a random crash when using the interpreter. | ||||
* | Retrieve shader result from new OutputRegisters-type | Jannik Vogel | 2016-05-16 | 3 | -56/+68 |
| | |||||
* | Use new shader-jit signature for interpreter | Jannik Vogel | 2016-05-13 | 3 | -8/+8 |
| | |||||
* | Refactor access to state in shader-jit | Jannik Vogel | 2016-05-13 | 4 | -24/+42 |
| | |||||
* | Move program_counter and call_stack from UnitState to interpreter | Jannik Vogel | 2016-05-12 | 3 | -45/+42 |
| | |||||
* | Move default_attributes into Pica state | Jannik Vogel | 2016-05-12 | 1 | -2/+0 |
| | |||||
* | Merge pull request #1690 from JayFoxRox/tex-type-3 | bunnei | 2016-05-12 | 1 | -1/+2 |
|\ | | | | | Pica: Implement texture type 3 (Projection2D) | ||||
| * | Pica: Add tc0.w to OutputVertex | Jannik Vogel | 2016-05-11 | 1 | -1/+2 |
| | | |||||
* | | Turn ShaderSetup into struct | Jannik Vogel | 2016-05-11 | 2 | -52/+53 |
|/ | |||||
* | Pica: Replace logic in shader.cpp with loop | Jannik Vogel | 2016-05-03 | 1 | -34/+4 |
| | |||||
* | VideoCore: Run include-what-you-use and fix most includes. | Emmanuel Gil Peyrot | 2016-04-30 | 6 | -14/+43 |
| | |||||
* | Merge pull request #1730 from hrydgard/vertex-loader | bunnei | 2016-04-29 | 1 | -1/+1 |
|\ | | | | | * Remove late accesses to attribute_config * Refactor: Extract VertexLoader from command_processor.cpp. Preparation for a similar concept to Dolphin or PPSSPP. These can be JIT-ed and cached. * Move "&" to their proper place, add missing includes and make some properly relative. * Don't keep base_address in the loader, it doesn't belong there (with it, the loader can't be cached). * Optimize the vertex loader, nearly doubling its speed. * Debugger fix * Move and rename the MemoryAccesses class to MemoryAccessTracker. | ||||
| * | Refactor: Extract VertexLoader from command_processor.cpp. | Henrik Rydgard | 2016-04-28 | 1 | -1/+1 |
| | | | | | | | | Preparation for a similar concept to Dolphin or PPSSPP. These can be JIT-ed and cached. | ||||
* | | Common: Remove section measurement from profiler (#1731) | Yuri Kunde Schlesner | 2016-04-29 | 1 | -3/+0 |
| | | | | | | | | This has been entirely superseded by MicroProfile. The rest of the code can go when a simpler frametime/FPS meter is added to the GUI. | ||||
* | | shader: Shader size is long uint, not uint. | Sam Spilsbury | 2016-04-24 | 1 | -1/+1 |
| | | |||||
* | | shader: Handle non-CALL opcodes with a break | Sam Spilsbury | 2016-04-24 | 1 | -0/+2 |
| | | |||||
* | | shader: Format string must be provided inline and not as a variable | Sam Spilsbury | 2016-04-24 | 1 | -1/+1 |
|/ | |||||
* | shader_jit_x64: Rename RuntimeAssert to Compile_Assert. | bunnei | 2016-04-14 | 2 | -5/+5 |
| | |||||
* | shader_jit_x64.cpp: Rename JitCompiler to JitShader. | bunnei | 2016-04-14 | 3 | -92/+92 |
| | |||||
* | shader_jit_x64: Free memory that's no longer needed after compilation. | bunnei | 2016-04-14 | 1 | -0/+6 |
| | |||||
* | shader_jit_x64: Use a sorted vector instead of a set for keeping track of return addresses. | bunnei | 2016-04-14 | 2 | -5/+8 |
| | |||||
* | shader_jit_x64: Use CALL/RET instead of JMP for subroutines. | bunnei | 2016-04-14 | 1 | -17/+7 |
| | |||||
* | shader_jit_x64: Separate initialization and code generation for readability. | bunnei | 2016-04-14 | 1 | -9/+8 |
| | |||||
* | shader_jit_x64: Get rid of unnecessary last_program_counter variable. | bunnei | 2016-04-14 | 2 | -6/+2 |
| | |||||
* | shader_jit_x64: Execute certain asserts at runtime. | bunnei | 2016-04-14 | 2 | -5/+19 |
| | | | | - This is because we compile the full shader code space, and therefore it's common to compile malformed instructions. | ||||
* | shader: Remove unused 'state' argument from 'Setup' function. | bunnei | 2016-04-14 | 2 | -3/+2 |
| | |||||
* | shader_jit_x64: Specify shader main offset at runtime. | bunnei | 2016-04-14 | 3 | -10/+6 |
| | |||||
* | shader_jit_x64: Allocate each program independently and persist for emu session. | bunnei | 2016-04-14 | 3 | -38/+28 |
| | |||||
* | shader_jit_x64: Rewrite flow control to support arbitrary CALL and JMP instructions. | bunnei | 2016-04-14 | 2 | -35/+119 |
| | |||||
* | shader_jit_x64: Fix strict memory aliasing issues. | bunnei | 2016-04-14 | 1 | -1/+3 |
| | |||||
* | Merge pull request #1643 from MerryMage/make_unique | Mathew Maidment | 2016-04-06 | 1 | -1/+0 |
|\ | | | | | Common: Remove Common::make_unique, use std::make_unique | ||||
| * | Common: Remove Common::make_unique, use std::make_unique | MerryMage | 2016-04-05 | 1 | -1/+0 |
| | | |||||
* | | Merge pull request #1508 from JayFoxRox/vs-output-map | bunnei | 2016-03-22 | 1 | -4/+14 |
|\ \ | |/ |/| | Respect vs output map | ||||
| * | Respect vs output map | Jannik Vogel | 2016-03-14 | 1 | -4/+14 |
| | | |||||
* | | Merge pull request #1538 from lioncash/dot | bunnei | 2016-03-20 | 1 | -5/+3 |
|\ \ | | | | | | | shader_interpreter: use std::inner_product for the dot product | ||||
| * | | shader_interpreter: use std::inner_product for the dot product | Lioncash | 2016-03-17 | 1 | -5/+3 |
| | | | | | | | | | | | | Same thing, less code. | ||||
* | | | video_core: Don't cast away const | Lioncash | 2016-03-17 | 1 | -1/+1 |
|/ / | |||||
* | | Merge pull request #1503 from bunnei/clear-jit-cache | bunnei | 2016-03-16 | 3 | -7/+27 |
|\ \ | | | | | | | Clear JIT cache | ||||
| * | | shader_jit_x64: Clear cache after code space fills up. | bunnei | 2016-03-12 | 3 | -2/+19 |
| | | | |||||
| * | | shader_jit_x64: Make assert outputs more useful & cleanup formatting. | bunnei | 2016-03-12 | 1 | -4/+7 |
| | | | |||||
| * | | shader: Update log message to use proper log class. | bunnei | 2016-03-12 | 1 | -1/+1 |
| |/ | |||||
* / | PICA: Fix MAD/MADI encoding | Jannik Vogel | 2016-03-15 | 2 | -29/+33 |
|/ | |||||
* | Common: Get rid of alignment macros | Lioncash | 2016-03-09 | 1 | -4/+4 |
| | | | | | The gl rasterizer already uses alignas, so we may as well move everything over. | ||||
* | Add immediate mode vertex submission | Dwayne Slater | 2016-03-03 | 4 | -2/+22 |
| | |||||
* | pica: Implement decoding of basic fragment lighting components. | bunnei | 2016-02-05 | 2 | -5/+9 |
| | | | | | | | - Diffuse - Distance attenuation - float16/float20 types - Vertex Shader 'view' output | ||||
* | Merge pull request #1367 from yuriks/jit-jmp | bunnei | 2016-01-27 | 2 | -6/+6 |
|\ | | | | | Shader JIT: Fix off-by-one error when compiling JMPs | ||||
| * | Shader JIT: Fix off-by-one error when compiling JMPs | Yuri Kunde Schlesner | 2016-01-24 | 2 | -6/+6 |
| | | | | | | | | | | | | | | There was a mistake in the JMP code which meant that one instruction at the destination would be skipped when the jump was taken. This commit also changes the meaning of the culprit parameter to make it less confusing and avoid similar mistakes in the future. | ||||
* | | Shader: Implement "invert condition" feature of IFU instruction | Yuri Kunde Schlesner | 2016-01-25 | 2 | -2/+5 |
|/ | | | | | If bit 0 of the JMPU instruction is set, then the jump condition will be inverted. That is, a jump will happen when the boolean is false instead of when it is true. | ||||
* | video_core: Reorganize headers | Lioncash | 2015-09-11 | 3 | -6/+4 |
| | |||||
* | video_core: Remove unnecessary includes from headers | Lioncash | 2015-09-11 | 1 | -2/+0 |
| | |||||
* | video_core: Remove unused variables | Lioncash | 2015-09-10 | 2 | -2/+0 |
| | |||||
* | Shader JIT: Use SCALE constant from emitter | aroulin | 2015-09-07 | 1 | -4/+4 |
| | |||||
* | Shader: Fix size_t to int casts of register offsets | aroulin | 2015-09-07 | 2 | -15/+21 |
| | |||||
* | Merge pull request #1088 from aroulin/x64-emitter-abi-call | bunnei | 2015-09-02 | 2 | -28/+18 |
|\ | | | | | x64: Proper stack alignment in shader JIT function calls | ||||
| * | x64: Proper stack alignment in shader JIT function calls | aroulin | 2015-09-01 | 2 | -28/+18 |
| | | | | | | | | | Import Dolphin stack handling and register saving routines. Also removes the x86 parts from abi files. | ||||
* | | video_core: Fix format specifiers warnings | aroulin | 2015-09-02 | 1 | -1/+2 |
|/ | |||||
* | Shader JIT: Fix SGE/SGEI NaN behavior | aroulin | 2015-08-31 | 1 | -3/+3 |
| | | | | | SGE was incorrectly emulated w.r.t. NaN behavior as the CMPSS SSE instruction was used with NLT | ||||
* | Merge pull request #1065 from yuriks/shader-fp | Yuri Kunde Schlesner | 2015-08-28 | 3 | -56/+87 |
|\ | | | | | Shader FP compliance fixes | ||||
| * | Shader JIT: Tiny micro-optimization in DPH | Yuri Kunde Schlesner | 2015-08-24 | 1 | -4/+4 |
| | | |||||
| * | Shaders: Fix multiplications between 0.0 and inf | Yuri Kunde Schlesner | 2015-08-24 | 2 | -39/+45 |
| | | | | | | | | | | | | | | | The PICA200 semantics for multiplication are such that when multiplying inf by exactly 0.0, the result is 0.0, instead of NaN, as defined by IEEE. This is relied upon by games. Fixes #1024 (missing OoT interface items) | ||||
| * | Shaders: Explicitly conform to PICA semantics in MAX/MIN | Yuri Kunde Schlesner | 2015-08-24 | 2 | -2/+10 |
| | | |||||
| * | Shader JIT: Add name to second scratch register (XMM4) | Yuri Kunde Schlesner | 2015-08-24 | 1 | -3/+5 |
| | | |||||
| * | Shader JIT: Fix CMP NaN behavior to match hardware | Yuri Kunde Schlesner | 2015-08-24 | 1 | -8/+23 |
| | | |||||
* | | Shader JIT: Fix float to integer rounding in MOVA | aroulin | 2015-08-27 | 1 | -2/+2 |
| | | | | | | | | MOVA converts new address register values from floats to integers using truncation | ||||
* | | Shader JIT: ifdef out reference to ifdef'd out shader_map | archshift | 2015-08-27 | 1 | -0/+2 |
| | | | | | | | | | | shader_map was only defined on x86 architectures, but was cleared on shutdown with no ifdef protection. Ifdef this out so non-x86 architectures can be built. | ||||
* | | Integrate the MicroProfile profiling library | Yuri Kunde Schlesner | 2015-08-25 | 1 | -0/+3 |
| | | | | | | | | | | This brings goodies such as a configurable user interface and multi-threaded timeline view. | ||||
* | | shader_jit: Replace two MDisp usages with MatR | Lioncash | 2015-08-24 | 1 | -2/+2 |
|/ | |||||
* | Merge pull request #1062 from aroulin/shader-rcp-rsq | bunnei | 2015-08-23 | 2 | -10/+10 |
|\ | | | | | Shader: RCP and RSQ computes only the 1st component | ||||
| * | Shader: Use std::sqrt for float instead of sqrt | aroulin | 2015-08-23 | 1 | -1/+1 |
| | | |||||
| * | Shader: RCP and RSQ computes only the 1st component | aroulin | 2015-08-23 | 2 | -10/+10 |
| | | |||||
* | | Shader: implement DPH/DPHI in JIT | aroulin | 2015-08-22 | 2 | -2/+36 |
| | | |||||
* | | Shader: implement DPH/DPHI in interpreter | aroulin | 2015-08-22 | 1 | -1/+8 |
|/ | | | | Tests revealed that the component with w=1 is SRC1 and not SRC2; this is now fixed on 3dbrew. | ||||
* | Shader: implement SGE, SGEI and SLT in JIT | aroulin | 2015-08-19 | 2 | -15/+36 |
| | |||||
* | Shader: implement SGE, SGEI in interpreter | aroulin | 2015-08-19 | 1 | -0/+14 |
| | |||||
* | Shader: Save caller-saved registers in JIT before a CALL | aroulin | 2015-08-19 | 2 | -0/+33 |
| | |||||
* | Shader: implement EX2 and LG2 in JIT | aroulin | 2015-08-17 | 2 | -2/+22 |
| | |||||
* | Shader: implement EX2 and LG2 in interpreter | aroulin | 2015-08-16 | 1 | -0/+36 |
| | |||||
* | Build fix for Debug configurations. | Tony Wasserka | 2015-08-16 | 1 | -1/+1 |
| | |||||
* | Introduce a shader tracer to allow inspection of input/output values for each processed instruction. | Tony Wasserka | 2015-08-16 | 5 | -37/+322 |
| | |||||
* | citra-qt: Improve shader debugger. | Tony Wasserka | 2015-08-16 | 1 | -6/+0 |
| | | | | Now supports dumping the current shader and recognizes a larger number of output semantics. | ||||
* | Shader: Use a POD struct for registers. | bunnei | 2015-08-16 | 5 | -40/+43 |
| | |||||
* | Rename ARCHITECTURE_X64 definition to ARCHITECTURE_x86_64. | bunnei | 2015-08-16 | 1 | -6/+5 |
| | |||||
* | Common: Cleanup CPU capability detection code. | bunnei | 2015-08-16 | 1 | -5/+5 |
| | |||||
* | Common: Move cpu_detect to x64 directory. | bunnei | 2015-08-16 | 1 | -2/+1 |
| | |||||
* | x64: Refactor to remove fake interfaces and general cleanups. | bunnei | 2015-08-16 | 5 | -144/+22 |
| | |||||
* | JIT: Support negative address offsets. | bunnei | 2015-08-16 | 1 | -26/+25 |
| | |||||
* | Shader: Initial implementation of x86_x64 JIT compiler for Pica vertex shaders. | bunnei | 2015-08-16 | 6 | -2/+924 |
| | | | | | - Config: Add an option for selecting to use shader JIT or interpreter. - Qt: Add a menu option for enabling/disabling the shader JIT. | ||||
* | Common: Added MurmurHash3 hash function for general-purpose use. | bunnei | 2015-08-15 | 1 | -1/+1 |
| | |||||
* | Shader: Define a common interface for running vertex shader programs. | bunnei | 2015-08-15 | 4 | -184/+278 |
| | |||||
* | Shader: Move shader code to its own subdirectory, "shader". | bunnei | 2015-08-15 | 2 | -0/+701 |