| Commit message | Author | Age | Files | Lines |
| |
|
| |
|
|
|
|
| |
This seems to be used only to protect a later GPU function call, so the lock can be moved into that call instead.
|
| |
|
| |
|
|
|
|
|
| |
- Fixes a hang on shutdown when NVFlinger thread is waiting on a syncpoint that will never occur.
- Commonly observed when stopping emulation in Super Mario Odyssey.
|
|
|
|
|
|
|
|
|
| |
The FPS counter was based on metrics in the nvdisp swapbuffers call. This metric would be accurate if the gpu thread/renderer were synchronous with the nvdisp service, but that's no longer the case.
This commit moves the frame counting responsibility onto the concrete renderers, which now count after their frame draw calls, resulting in more meaningful metrics.
The displayed FPS is now made up of the average framerate between the previous and most recent update, in order to avoid distracting FPS counter updates when framerate is oscillating between close values.
The status bar update frequency was also changed from 2 seconds to 500ms.
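The averaging described above could look roughly like the following. This is a minimal sketch, not yuzu's code: `FpsSmoother` and its members are hypothetical names, and the handling of the very first update is an assumption.

```cpp
#include <cassert>

// Hypothetical helper (not taken from yuzu): smooths the displayed FPS by
// averaging the previous measurement with the most recent one.
class FpsSmoother {
public:
    // Takes the number of frames drawn during `elapsed_seconds` and returns
    // the value to display.
    double Update(unsigned frames, double elapsed_seconds) {
        assert(elapsed_seconds > 0.0);
        const double current = static_cast<double>(frames) / elapsed_seconds;
        // Assumption: on the first update there is nothing to average with,
        // so the raw measurement is shown.
        const double displayed = has_previous ? (previous + current) / 2.0 : current;
        previous = current;
        has_previous = true;
        return displayed;
    }

private:
    double previous = 0.0;
    bool has_previous = false;
};
```

With a 500ms update interval, a measurement of 30 frames followed by one of 29 frames would display 60 and then 59, rather than jumping between 60 and 58.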
|
|
|
|
|
|
| |
Implements the OnClose method of the nvhost_vic device, and removes the remnants of an older implementation.
Also cleans up some of the surrounding code.
|
|\
| |
| | |
nvdrv: Cleanup CDMA Processor on device closure
|
| |
| |
| |
| | |
Brings us a step closer to unifying all channels to share a common interface.
|
|/
|
|
|
|
| |
This was implicitly done by `is_powered_on = false`, however the explicit method allows us to block until the GPU is actually gone.
This should fix a race condition in which the other subsystems are removed while the GPU is still active.
|
|
|
|
|
|
| |
Instead of using a two-step initialization to report errors, initialize
the GPU renderer and rasterizer in the constructor and report errors
through std::runtime_error.
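A minimal sketch of the pattern, assuming a hypothetical `Renderer` class and `TryCreateRenderer` helper (neither is taken from the codebase): the constructor either yields a fully initialized object or throws, so no separate Init() step or error flag is needed.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical renderer; the name and the constructor parameter are
// illustrative only.
class Renderer {
public:
    explicit Renderer(bool backend_available) {
        if (!backend_available) {
            // Report the failure from the constructor instead of leaving a
            // half-initialized object behind.
            throw std::runtime_error("Failed to initialize renderer backend");
        }
    }
};

// Returns an empty string on success, or the error message on failure.
std::string TryCreateRenderer(bool backend_available) {
    try {
        Renderer renderer{backend_available};
        (void)renderer; // Constructed successfully; nothing else to do here.
        return {};
    } catch (const std::runtime_error& e) {
        return e.what();
    }
}
```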
|
|
|
|
| |
INSERT_PADDING_BYTES_NOINIT is more descriptive of the underlying behavior.
|
| |
|
|
|
|
| |
- We must always use a GPU thread now, even with synchronous GPU.
|
|
|
|
|
|
| |
Resolves variable shadowing scenarios up to the end of the OpenGL code
to make it nicer to review. The rest will be resolved in a following
commit.
|
| |
|
|
|
|
| |
Allows building on clang to work again
|
| |
|
| |
This commit aims to implement the NVDEC (Nvidia Decoder) functionality, with video frame decoding being handled by the FFmpeg library.
The process begins with Ioctl commands being sent to the NVDEC and VIC (Video Image Composer) emulated devices. These allocate the necessary GPU buffers for the frame data, along with providing information on the incoming video data. A Submit command then signals the GPU to process and decode the frame data.
To decode the frame, the respective codec's header must be manually composed from the information provided by NVDEC, then sent with the raw frame data to the FFmpeg library.
Currently, H264 and VP9 are supported, with VP9 having some minor artifacting issues related mainly to the reference frame composition in its uncompressed header.
Async GPU is not properly implemented at the moment.
Co-Authored-By: David <25727384+ogniK5377@users.noreply.github.com>
|
|
|
|
|
|
|
|
|
| |
Now that the GPU is initialized when video backends are initialized,
it's no longer needed to query components once the game is running: it
can be done when yuzu is booting.
This allows us to pass components between constructors and in the
process remove all Core::System references in the video backend.
|
|
|
|
|
| |
Add an extra step in GPU initialization to be able to initialize render
backends with a valid GPU instance.
|
|
|
| |
The puller register array is made up of u32s; however, the `NUM_REGS` value is the size in bytes, so switch it to avoid making the struct unnecessarily large. Also fix a small typo in a comment.
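To illustrate the size mix-up, here is a sketch with made-up numbers. `REGS_SIZE_BYTES`, `OversizedRegs` and the value 0x100 are hypothetical; only `NUM_REGS` and the u32 element type come from the message above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical size of the register block in bytes (not the real value).
constexpr std::size_t REGS_SIZE_BYTES = 0x100;
// Number of 32-bit registers that fit in that block.
constexpr std::size_t NUM_REGS = REGS_SIZE_BYTES / sizeof(std::uint32_t);

// Mistake: the byte size is used as the element count, so the array is
// four times larger than the register block.
struct OversizedRegs {
    std::array<std::uint32_t, REGS_SIZE_BYTES> reg_array;
};

// Intended layout: one array element per register.
struct Regs {
    std::array<std::uint32_t, NUM_REGS> reg_array;
};

static_assert(sizeof(Regs) == REGS_SIZE_BYTES, "one element per register");
static_assert(sizeof(OversizedRegs) == 4 * sizeof(Regs), "byte count used as element count");
```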
|
|\
| |
| | |
video_core: Fix, add and rename pixel formats
|
| |
| |
| |
| |
| |
| | |
Normalizes pixel format names to match Vulkan names. Prior to this
commit, pixel formats followed no naming convention, leading to confusion and
potential bugs.
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
|/ |
|
| |
|
|
|
|
| |
- Used by The Walking Dead: The Final Season
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
| |
Changes the GraphicsContext to be managed by the GPU core. This
eliminates the need for the frontends to fool around with tricky
MakeCurrent/DoneCurrent calls that are dependent on the settings (such
as async gpu option).
This also refactors out the need to use QWidget::fromWindowContainer as
that caused issues with focus and input handling. Now we use a regular
QWidget and just access the native windowHandle() directly.
Another change is removing the debug tool setting in FrameMailbox.
Instead of trying to block the frontend until a new frame is ready, the
core will now take over presentation and draw directly to the window if
the renderer detects that it's hooked by NSight or RenderDoc.
Lastly, since it was in the way, I removed ScopeAcquireWindowContext and
replaced it with a simple subclass in GraphicsContext that achieves the
same result.
|
|
|
|
| |
Implement RGBA16_SNORM with the current API. Nothing special here.
|
|\
| |
| | |
video_core/surface: Add R32_SINT render target format
|
| | |
|
|/ |
|
| |
|
|
|
|
|
|
|
| |
This function is called rarely and blocks quite often for a long time.
So don't waste power and let the CPU sleep.
This might also increase the performance as the other cores might be allowed to clock higher.
|
|
|
|
| |
- Zero initialization here is useful for determinism.
|
| |
|
|
|
|
|
|
|
| |
This commit uses guest fences on the vSync event instead of the artificial fake
fence we had.
It also keeps signaling display events while the game is loading,
as the OS is supposed to send buffers to vSync during that time.
|
| |
|
|\
| |
| | |
renderer_opengl: Implement RGB565 framebuffer format
|
| | |
|
| | |
|
| | |
|
| |
| | |
* texture_cache/surface_params: Remove unused local variable
* rasterizer_interface: Add missing documentation commentary
* maxwell_dma: Remove unused rasterizer reference
* video_core/gpu: Sort member declaration order to silence -Wreorder warning
* fermi_2d: Remove unused MemoryManager reference
* video_core: Silence unused variable warnings
* buffer_cache: Silence -Wreorder warnings
* kepler_memory: Remove unused MemoryManager reference
* gl_texture_cache: Add missing override
* buffer_cache: Add missing include
* shader/decode: Remove unused variables
|
| | |
|
|/
|
|
|
|
| |
This commit ensures that the host gpu is constantly fed with commands to
work with, while the guest gpu keeps producing the rest of the commands.
This reduces syncing time between host and guest gpu.
|
|\
| |
| | |
Implement GPU Synchronization Mechanisms & Correct NVFlinger
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
|\ \
| | |
| | | |
Downgrade and suppress a series of GPU asserts and debug messages.
|
| |/
| |
| |
| |
| | |
This adds some missing puller methods. We don't assert them as these are
nop operations for us.
|
|/ |
|
|
|
|
|
|
|
|
|
| |
Like with CPU emulation, we generally don't want to fire off the threads
immediately after the relevant classes are initialized; we want to do
this only after all necessary data has finished loading.
This splits the thread creation into its own interface member function
to allow controlling when these threads in particular get created.
|
|
|
|
| |
smaphore -> semaphore
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Because of the recent separation of GPU functionality into sync/async
variants, we need to mark the destructor virtual to provide proper
destruction behavior, given we use the base class within the System
class.
Prior to this, it was undefined behavior whether or not the destructor
in the derived classes would ever execute.
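A minimal sketch of why the virtual destructor matters, using hypothetical stand-ins (`GpuBase`, `GpuAsync`) rather than the real classes: deleting a derived object through a base pointer is only well defined when the base destructor is virtual.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the base GPU class.
class GpuBase {
public:
    // Virtual so that deleting through a GpuBase pointer runs the derived
    // destructor. Without `virtual`, that deletion is undefined behavior.
    virtual ~GpuBase() = default;
};

// Hypothetical stand-in for one of the derived variants.
class GpuAsync final : public GpuBase {
public:
    explicit GpuAsync(bool& destroyed_flag) : destroyed{destroyed_flag} {}
    ~GpuAsync() override {
        destroyed = true; // Stands in for real cleanup work.
    }

private:
    bool& destroyed;
};

// Owns the derived object through the base class and reports whether the
// derived destructor ran when the owner went out of scope.
bool DerivedDestructorRuns() {
    bool destroyed = false;
    {
        std::unique_ptr<GpuBase> gpu = std::make_unique<GpuAsync>(destroyed);
    }
    return destroyed;
}
```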
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
| |
These types are within the common library, so they should be within the
Common namespace.
|
|
|
|
|
| |
Avoids the use of the global accessor in favor of explicitly making the
system a dependency within the interface.
|
|\
| |
| | |
Implement BGRA8 framebuffer format
|
| | |
|
|/
|
|
|
|
|
|
|
|
| |
When I originally added the compute assert I used the wrong
documentation. This addresses that.
The dispatch register was tested with homebrew against hardware and is
triggered by some games (e.g. Super Mario Odyssey). What exactly is
missing to get a valid program bound by this engine requires more
investigation.
|
|
| |
* Implemented the puller semaphore operations.
* Nit: Fix 2 style issues
* Nit: Add Break to default case.
* Fix style.
* Update for comments. Added ReferenceCount method
* Forgot to remove GpuSmaphoreAddress union.
* Fix the clang-format issues.
* More clang formatting.
* two more white spaces for the Clang formatting.
* Move puller members into the regs union
* Updated to use Memory::WriteBlock instead of Memory::Write*
* Fix clang style issues
* White space clang error
* Removing unused functions and other PR comment
* Removing unused functions and other PR comment
* More union magic for setting regs value.
* union magic refcnt as well
* Remove local var
* Set up the regs and regs_assert_positions up properly
* Fix clang error
|
|
|
|
| |
- More accurate impl., fixes Undertale (among other games).
|
|\
| |
| | |
Implement RenderTargetFormat::BGR5A1_UNORM
|
| | |
|
|/
|
|
| |
This engine writes data from a FIFO register into the configured address.
|
|\
| |
| | |
gl_rasterizer_cache: Implement RenderTargetFormat::BGRA8_SRGB.
|
| |
| |
| |
| | |
- Used by Octopath Traveler (with multiple render targets).
|
| |
| |
| |
| | |
Inline the WriteReg helper as it is called ~20k times per frame.
|
|/
|
|
| |
This moves the hot loop into video_core. This refactoring shall reduce the CPU overhead of calling ProcessCommandList.
|
|
|
|
|
| |
The subchannel is a 3-bit field, so there can never be more than 8 bound engines.
Using a hashmap for up to 8 values is overkill.
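A fixed-size array indexed by subchannel is enough here. The sketch below is hypothetical: `EngineBindings` and the `EngineID` enumerators are illustrative names and values, not the ones in the codebase.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical engine identifiers; the real values are not given here.
enum class EngineID : std::uint32_t { None = 0, Engine2D, Engine3D };

// Hypothetical table of bound engines, one slot per 3-bit subchannel value.
class EngineBindings {
public:
    void Bind(std::uint32_t subchannel, EngineID engine) {
        assert(subchannel < bound_engines.size());
        bound_engines[subchannel] = engine;
    }

    EngineID Get(std::uint32_t subchannel) const {
        assert(subchannel < bound_engines.size());
        return bound_engines[subchannel];
    }

private:
    // 3 bits give 2^3 = 8 possible subchannels; slots start out as None.
    std::array<EngineID, 8> bound_engines{};
};
```

Lookups become a plain array index with no hashing or allocation.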
|
|
|
|
|
|
|
|
|
|
| |
Makes the class interface consistent and provides accessors for
obtaining a reference to the memory manager instance.
Given we also return references, this makes our more flimsy uses of
const apparent, given const doesn't propagate through pointers in the
way one would typically expect. This makes our mutable state more
apparent in some places.
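The point about const and pointers can be sketched as follows. The class layout is hypothetical (`mapped_pages` and `TouchThroughPointer` are made up for illustration); the accessors mirror the reference-returning style described above.

```cpp
#include <cassert>
#include <memory>

// Hypothetical memory manager with a single piece of mutable state.
struct MemoryManager {
    int mapped_pages = 0;
};

class Gpu {
public:
    Gpu() : memory_manager{std::make_unique<MemoryManager>()} {}

    // Reference accessors: a const Gpu only hands out a const reference.
    MemoryManager& GetMemoryManager() { return *memory_manager; }
    const MemoryManager& GetMemoryManager() const { return *memory_manager; }

    // Compiles even though the function is const: constness applies to the
    // pointer member itself, not to the object it points at.
    void TouchThroughPointer() const { memory_manager->mapped_pages++; }

private:
    std::unique_ptr<MemoryManager> memory_manager;
};
```

Given a `const Gpu& view`, writing `view.GetMemoryManager().mapped_pages = 1;` fails to compile, whereas the pointer-based member function above silently mutates state.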
|
|
|
|
| |
Needed by Kirby
|
|
|
|
| |
- Used by Breath of the Wild.
|
|
|
|
| |
Needed for Xenoblade
|
|
|
|
| |
- Used by Breath of the Wild.
|
|
|
|
| |
- Used by Breath of the Wild.
|
|
|
|
| |
- Used by Go Vacation
|
|
|
|
| |
- Used by Super Mario Odyssey.
|
|
|
|
| |
- Used by Super Mario Odyssey.
|
|\
| |
| | |
video_core: Get rid of global variable g_toggle_framelimit_enabled
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Instead, we make a struct for renderer settings and allow the renderer
to update all of these settings, getting rid of the need for
global-scoped variables.
This also uncovered a few indirect inclusions for certain headers, which
this commit also fixes.
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Implement R16S & R16UI & R16I RenderTargetFormats & PixelFormats
Do a separate function in order to get Bytes Per Pixel of DepthFormat
Apply the new function in gpu.h
delete unneeded white space
* correct merging error
|
|
|
|
| |
- Used by Super Mario Odyssey.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We move the initialization of the renderer to the core class, while
keeping the creation of it and any other specifics in video_core. This
way we can ensure that the renderer is initialized and doesn't give
unfettered access to the renderer. This also makes dependencies on types
more explicit.
For example, the GPU class doesn't need to depend on the
existence of a renderer, it only needs to care about whether or not it
has a rasterizer, but since it was accessing the global variable, it was
also making the renderer a part of its dependency chain. By adjusting
the interface, we can get rid of this dependency.
|
| |
|
| |
|
|
| |
correct trailing white spaces
Delete tabs
correct placement
Add RG16F & RG16UI & RG16I & RG16S PixelFormats
Return correct data according to changes done previously
correct PixelFormat declaration
correct coding style error
correct coding style error part 2
correct RG16S Declaration error
correct alignment
|
|\
| |
| | |
GPU: Implemented the Z32_S8_X24 depth buffer format.
|
| | |
|
|/ |
|
| |
|
| |
|
|
|
|
| |
This makes it match its const qualified equivalent.
|
| |
|
| |
|
| |
|
|
|
|
| |
Only tiled->linear and linear->tiled copies without an offset are supported for now. Queries and swizzled copies are not supported.
|
|\
| |
| | |
GPU: Allow the usage of RGBA32_FLOAT and RGBA16_FLOAT in the texture copy engine.
|
| | |
|
|/ |
|
| |
|
|
|
|
| |
It doesn't belong in the PFIFO handler.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
This should reduce recompile times when editing the Maxwell3D register structure.
|
| |
|
| |
|
|
|
|
|
|
| |
Accumulate all arguments before calling the desired method.
Note: Maybe we should do the same for the NonIncreasing mode?
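One way to picture the accumulation is a small buffer that only invokes the method once every argument has arrived. This is a sketch under assumptions: `ArgumentAccumulator`, its handler type and the fixed expected count are hypothetical and not taken from the command processor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper: buffers arguments and calls the handler exactly once
// per complete set.
class ArgumentAccumulator {
public:
    using Handler = std::function<void(const std::vector<std::uint32_t>&)>;

    ArgumentAccumulator(std::size_t expected_count, Handler handler)
        : expected{expected_count}, on_complete{std::move(handler)} {
        assert(expected > 0);
    }

    // Stores one argument; once the expected number has been collected, the
    // handler runs with all of them and the buffer is reset.
    void Push(std::uint32_t argument) {
        arguments.push_back(argument);
        if (arguments.size() == expected) {
            on_complete(arguments);
            arguments.clear();
        }
    }

private:
    std::size_t expected;
    Handler on_complete;
    std::vector<std::uint32_t> arguments;
};
```

The handler never sees a partially filled argument list, which is the property the change above is after.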
|
|
|
|
| |
Only QueryMode::Write is supported at the moment.
|
|
Also moved the GPU MemoryManager class to video_core since it makes more sense for it to be there.
|