This change reduces the host GPU and CPU load by a large margin
improving guest system performance along with removing latency spikes
when moving the mouse. This is default enabled but can be disabled with
the new option `dxgi:dwmFlush=no` as it limits the capture rate to the
refresh rate of the guests output which may not be desireable.
This change allows the host to provide information to the client about
how the VM is configured, information such as the UUID, CPU
configuration and capture method both for informational display in the
client as well as debugging in the client's logs.
The format of the records allows this to be extended later with new
record types without needing to bump the KVMFR version.
If the wait times out, we used to simply restart the loop, which causes
it to check this->stop and exit if set to true. However, nvfbc_stop
already calls lgSignalEvent, which would wake up the pointer thread to
perform the check, so there is no need to set a timeout on the wait.
The host is a 64-bit application, so requiring 64-bit Windows isn't an
issue. However, not using 32-bit installers has an advantage: we don't
need to require WoW64.
This commit adds damage tracking to the DXGI textures, and only copies the
damaged areas to the textures with ID3D11DeviceContext::CopySubresourceRegion.
The sleep logic in waitFrame makes it difficult for this to reduce the
latency, but removing it shows significant improvements (6-7 ms to ~3 ms)
when a tiny portion of the screen is damaged, while showing no difference on
full screen damage.
This implementation uses a line sweep algorithm to copy the precisely the
intersection of all accumulated damage rectangles, ensuring that every
pixel is copied exactly once, and no pixel is ever copied multiple times.
Furthermore, once a row has been swept, we update the framebuffer write
pointer immediately.
Certain drivers do not support pitches that are not multiples of 128 bytes,
and instead just does some kind of rounding internally. On DXGI, this is not
a problem because the API rounds pixel pitch, but NvFBC does not. This causes
certain resolutions to simply not work with dmabuf, most notably 3440x1440,
which is 1440p ultrawide.
Since we are copying pixels with the CPU anyways, we might as well round the
pitch up to 128 bytes (32 pixels).
This commit adds a new host configuration option, nvfbc:diffRes, which
specifies the dimensions of every block in the diff map. This defaults to
128, meaning the default 128x128 block size.
Since block sizes other than 128x128 is not guaranteed to be supported by
NvFBC, the function NvFBCGetDiffMapBlockSize was introduced to query the
support and output the actual block size used.
When our window is destroyed, our timers are also destroyed. This causes our
attempt at destruction to fail. Instead, set MessageHWND to NULL in the
WM_DESTROY handler and don't try destroying the timers if the window is gone.
DestroyWindow can only be invoked on the thread that created the window.
All other threads must use WM_CLOSE or another message to signal tell the
window to destroy itself.
MinGW seems to decide at random whether it wants to use memcpy from
mscvrt.dll or ntdll.dll. Currently, on Debian buster, ntdll.dll is chosen,
while on sid, mscvrt.dll is chosen.
This commit declares a new .def file for ntdll containing only the
functions we want to link from ntdll.dll, and generates ntdll.a from it
with dlltool. This way, MinGW will never be tempted to link functions
like memcpy from ntdll.dll.