The nsleep() call lets d3d12 sleep for a more precise amount of
time while maintaining the current millisecond-scale sleep
interface in the configuration file.
This will cache up to 10 handles, in practice I have never seen DXGI
return anything but the same resource each time but we allow for more
anyway should MS change something in the future.
Should the cache get over filled it is disabled entirely and we revert
to the original behaviour.
The buffer input sizes to the `IDXGIOutputDuplication` methods are measured
in bytes. This dramatically increases the number of dirty/move rects that
can be handled.
This new feature while helps on some systems, others using freesync or
higher refresh rates where the capture can't keep up will limit to
fractions of the refresh rate. Better to disable and allow users to
opt-in.
This change reduces the host GPU and CPU load by a large margin
improving guest system performance along with removing latency spikes
when moving the mouse. This is default enabled but can be disabled with
the new option `dxgi:dwmFlush=no` as it limits the capture rate to the
refresh rate of the guests output which may not be desireable.
This commit adds damage tracking to the DXGI textures, and only copies the
damaged areas to the textures with ID3D11DeviceContext::CopySubresourceRegion.
The sleep logic in waitFrame makes it difficult for this to reduce the
latency, but removing it shows significant improvements (6-7 ms to ~3 ms)
when a tiny portion of the screen is damaged, while showing no difference on
full screen damage.
This implementation uses a line sweep algorithm to copy the precisely the
intersection of all accumulated damage rectangles, ensuring that every
pixel is copied exactly once, and no pixel is ever copied multiple times.
Furthermore, once a row has been swept, we update the framebuffer write
pointer immediately.
Before we try and perhaps fail to init DXGI, we should print out what
the device is so that when there is an error report we can immediately
see if the user has the QXL device attached still.
While it's correct for DXGI to use a asyncronous waitFrame model, other
capture interfaces such as NvFBC it is not correct. This change allows
the capture interface to specify which is more correct for it and moves
the waitFrame/post into the main thread if async is not desired.
Testing shows that `D3DKMTSetProcessSchedulingPriorityClass` has a
positive performance impact for NvFBC as well as DXGI, as such always
try to boost the priority for the windows host.
This change adds an average function to time how long it takes the GPU
to copy and map the texture, and then uses this average to sleep for 80%
of this average lowering CPU usage and potentially decreasing lock
contention.