To avoid client showing "Using : NVFBC (NVidia Frame Buffer Capt".
This happens because the string is truncated to 31 characters to fit
in the char capture[32]; member of KVMFRRecord_VMInfo.
This new feature while helps on some systems, others using freesync or
higher refresh rates where the capture can't keep up will limit to
fractions of the refresh rate. Better to disable and allow users to
opt-in.
This change allows the host to provide information to the client about
how the VM is configured, information such as the UUID, CPU
configuration and capture method both for informational display in the
client as well as debugging in the client's logs.
The format of the records allows this to be extended later with new
record types without needing to bump the KVMFR version.
If the wait times out, we used to simply restart the loop, which causes
it to check this->stop and exit if set to true. However, nvfbc_stop
already calls lgSignalEvent, which would wake up the pointer thread to
perform the check, so there is no need to set a timeout on the wait.
Certain drivers do not support pitches that are not multiples of 128 bytes,
and instead just does some kind of rounding internally. On DXGI, this is not
a problem because the API rounds pixel pitch, but NvFBC does not. This causes
certain resolutions to simply not work with dmabuf, most notably 3440x1440,
which is 1440p ultrawide.
Since we are copying pixels with the CPU anyways, we might as well round the
pitch up to 128 bytes (32 pixels).
This commit adds a new host configuration option, nvfbc:diffRes, which
specifies the dimensions of every block in the diff map. This defaults to
128, meaning the default 128x128 block size.
Since block sizes other than 128x128 is not guaranteed to be supported by
NvFBC, the function NvFBCGetDiffMapBlockSize was introduced to query the
support and output the actual block size used.
For adjacent changed regions, we actually use the bounding box for the
entire polygon. This may result in more area being damaged than strictly
necessary, but is nevertheless desirable since it reduces the number of
rectangles.
While it's correct for DXGI to use a asyncronous waitFrame model, other
capture interfaces such as NvFBC it is not correct. This change allows
the capture interface to specify which is more correct for it and moves
the waitFrame/post into the main thread if async is not desired.
Before, we only break out of the current row when a change is detected,
and all subsequent rows are still scanned. Now we break out of the entire
loop. This should make change detection ever so slightly faster.
This so called "enhanced" event logic is completely flawed and can never
work correctly, better to strip it out and put our faith in windows to
handle the events for us.
And yes, I am fully aware I wrote the utter trash in the first place :)
It has been detemined that a failure to init NvFBC causes a 20-30%
performance penalty on non NvFBC supported hardware (GeForce) when using
DXGI, as such reverse the order and default to using DXGI as our first
option.
If NvFBC is still desired, pr #500 added the option `app:capture` which
can be used to force NvFBC.
Since we now let the mouse hook linger until the process is killed, the
cursor event that the hook signals may now be null, as the capture could
have stopped. If the hook fires during this time, a crash occurs.
This commit introduces a new option, app:capture, which can be set to
either DXGI or NvFBC to force the host application to use that backend.
This is very useful for testing DXGI on Quadro cards, which would default
to running with NvFBC.
The mouse hook code is very fragile, and we would like to avoid unhooking
and re-hooking as much as possible.
After this commit, this is done only once, and the hook and 1x1 window is
only destroyed upon exit. This, of course, comes with the downside of
the slight performance penalty if the guest machine is used directly while
the host is running and the client is not running.
Moving NvFBCToSysSetup to nvfbc_init means that when the pointer thread
fails to be created, NvFBCToSysRelease needs to be called.
To resolve such cleanup issues in the future, we instead call nvfbc_deinit,
which should cleanup everything that needs to be cleaned up. fails.
When NvFBCToSysCapture reports recreation is required, we return
CAPTURE_RESULT_REINIT, which eventually calls nvfbc_deinit and then
nvfbc_init.
However, the NvFBC object is actually created in nvfbc_create, which
means the NvFBC object is never actually recreated. The result is an
endless cycle of NvFBC asking for recreation. This commonly manifests
as the client waiting endlessly for the host when the guest machine
reboots.
In this commit, the NvFBC object creation is moved into nvfbc_init,
and when recreation is required, it will actually be recreated.
mouseHook_install and dwmForceComposition both create threads, but these
are only freed in nvfbc_deinit which is not called if nvfbc_init fails.
These should be freed if the pointer thread fails to be created, as
nothing else could be cleaning it up.
This is because we keep track of the top-left corner of the cursor, not
the location of the hotspot. When the cursor shape changes, the hotspot
location may also change. When it does, the position of the top-left
corner changes and requires an update.
In the case that we do not have the current cursor position, which
happens on startup, we do not generate this update.
NvFBC is unable to capture certain applications that bypasses the DWM
compositor, for example, Firefox playing video in full screen. This
has been a known issue for a long time with Nvidia's ShadowPlay, see:
* https://www.nvidia.com/en-us/geforce/forums/geforce-experience/14/233709/
* https://crbug.com/609857
Nvidia won't fix this, but there are workarounds. For example, we
create a transparent 1x1 layered window, which forces desktop composition
to be enabled.
Note that SetLayeredWindowAttributes also supports alpha-based transparency,
but setting transparency to 0 will cause DWM to skip composition. We could
use a transparency of 1, but this ruins the image by the slightest bit,
which is unacceptable. Therefore, we must use chroma key-based
transparency, which tricks DWM into compositing despite being fully
transparent.