The host is a 64-bit application, so requiring 64-bit Windows isn't an
issue. However, not using 32-bit installers has an advantage: we don't
need to require WoW64.
This commit adds damage tracking to the DXGI textures, and only copies the
damaged areas to the textures with ID3D11DeviceContext::CopySubresourceRegion.
The sleep logic in waitFrame makes it difficult for this to reduce the
latency, but removing it shows significant improvements (6-7 ms to ~3 ms)
when a tiny portion of the screen is damaged, while showing no difference on
full screen damage.
This implementation uses a line sweep algorithm to copy the precisely the
intersection of all accumulated damage rectangles, ensuring that every
pixel is copied exactly once, and no pixel is ever copied multiple times.
Furthermore, once a row has been swept, we update the framebuffer write
pointer immediately.
Certain drivers do not support pitches that are not multiples of 128 bytes,
and instead just does some kind of rounding internally. On DXGI, this is not
a problem because the API rounds pixel pitch, but NvFBC does not. This causes
certain resolutions to simply not work with dmabuf, most notably 3440x1440,
which is 1440p ultrawide.
Since we are copying pixels with the CPU anyways, we might as well round the
pitch up to 128 bytes (32 pixels).
This commit adds a new host configuration option, nvfbc:diffRes, which
specifies the dimensions of every block in the diff map. This defaults to
128, meaning the default 128x128 block size.
Since block sizes other than 128x128 is not guaranteed to be supported by
NvFBC, the function NvFBCGetDiffMapBlockSize was introduced to query the
support and output the actual block size used.
When our window is destroyed, our timers are also destroyed. This causes our
attempt at destruction to fail. Instead, set MessageHWND to NULL in the
WM_DESTROY handler and don't try destroying the timers if the window is gone.
DestroyWindow can only be invoked on the thread that created the window.
All other threads must use WM_CLOSE or another message to signal tell the
window to destroy itself.
MinGW seems to decide at random whether it wants to use memcpy from
mscvrt.dll or ntdll.dll. Currently, on Debian buster, ntdll.dll is chosen,
while on sid, mscvrt.dll is chosen.
This commit declares a new .def file for ntdll containing only the
functions we want to link from ntdll.dll, and generates ntdll.a from it
with dlltool. This way, MinGW will never be tempted to link functions
like memcpy from ntdll.dll.
This function is sometimes flaky and may fail for no apparent reason,
see https://stackoverflow.com/q/3945003. This has also been experienced
during the development of #610.
This commit adds logging so we may see if it ever fails for no reason
and work out some way to fix it.
We were using an auto-reset event to signal the mousehook exit. This was
fine when there was only one thread, but with the addition of the update
thread, only one thread is signaled, causing the wait to last forever.
The fix is switching to a manual reset event and call ResetEvent after
the threads have exited.
The type of the QuadPart member of the LARGE_INTEGER union is actually
LONGLONG, so we should cast to LONGLONG instead of int.
This avoids truncation should (ms * 10000.0f) exceed 2^31-1.
This function is available since Windows Vista and can therefore be used
directly without going through GetProcAddress. Unfortunately, MinGW does
not have d3dkmthk.h, but we can declare the prototype ourselves and link
against gdi32.dll.
There is no need to LoadLibrary and GetProcAddress to get pointers to
NtDelayExecution or NtSetTimerResolution. These functions don't have
prototypes in any SDK header, but they are exported in ntdll.dll and
we can simply declare the prototype and link ntdll.
There is also no chance that the functions do not exist: I checked an
old install of Windows NT 4.0 and both of these functions exist.
Also used NtSetTimerResolution instead of ZeSetTimerResolution for
consistency (they are the same).
Also changed system timer resolution log message units to μs with
one decimal digit for readability. This is the actual amount of
precision available to us.
According to MSDN documentation for CreateEnvironmentBlock, "[i]f the
environment block is passed to CreateProcessAsUser, you must also
specify the CREATE_UNICODE_ENVIRONMENT flag."
Also pass DETACHED_PROCESS because the host is a GUI application and
doesn't use the console.