Commit Graph

150 Commits

Author SHA1 Message Date
Quantum
042a7d0925 [host] dxgi: add configurable sleep before D3D12 copy 2022-01-10 14:45:51 +11:00
Quantum
c69b19e68f [host] dxgi: add option to disable damage-aware copies 2022-01-10 14:45:51 +11:00
Quantum
cf7d501bc4 [host] dxgi: allow copy backend selection 2022-01-10 14:45:51 +11:00
Quantum
68e5b812a9 [host] dxgi: add preRelease callback
This is meant to avoid freeing the texture before the copy has finished.
2022-01-10 14:45:51 +11:00
Quantum
5a93f1e00c [host] dxgi: implement Direct3D 12 texture copy backend 2022-01-10 14:45:51 +11:00
Quantum
891f00a011 [host] dxgi: add d3d12.h from latest MinGW
This header was added in late 2020 and hasn't made its way into the GitHub
Actions image yet.
2022-01-10 14:45:51 +11:00
Quantum
137171a8a2 [host] dxgi: refactor to support additional copy backends 2022-01-10 14:45:51 +11:00
Quantum
a391e271c3 [host] dxgi: damage all textures when skipping frame 2022-01-09 16:36:26 +11:00
Johnathon Weaver
0f998582b9 [host] nvfbc: Fix dwmapi linking error
Fixed linking for DwmFlush and also rearranged as per how DXGI is.
2022-01-07 01:46:35 +11:00
Geoffrey McRae
2f8b139131 [host] windows: set DwmFlush default to off
This new feature while helps on some systems, others using freesync or
higher refresh rates where the capture can't keep up will limit to
fractions of the refresh rate. Better to disable and allow users to
opt-in.
2022-01-06 19:20:08 +11:00
Geoffrey McRae
b058cbe9fe [host] nvfbc: add DwmFlush here too as it makes a large difference 2022-01-06 19:01:29 +11:00
Geoffrey McRae
92f27cc0f0 [host] dxgi: use DwmFlush to sync to presentation interval
This change reduces the host GPU and CPU load by a large margin
improving guest system performance along with removing latency spikes
when moving the mouse. This is default enabled but can be disabled with
the new option `dxgi:dwmFlush=no` as it limits the capture rate to the
refresh rate of the guests output which may not be desireable.
2022-01-06 18:39:08 +11:00
Geoffrey McRae
952ebea2c5 [all] refresh copyright dates 2022-01-05 19:42:46 +11:00
Geoffrey McRae
ebf20dd108 [host] nvfbc: fix failure to startup 2022-01-05 19:31:47 +11:00
Geoffrey McRae
ba9f2b85b6 [host/client] kvmfr: update to include extra user data about the VM
This change allows the host to provide information to the client about
how the VM is configured, information such as the UUID, CPU
configuration and capture method both for informational display in the
client as well as debugging in the client's logs.

The format of the records allows this to be extended later with new
record types without needing to bump the KVMFR version.
2022-01-05 19:18:43 +11:00
WYF
b21d842f0e [host] nvfbc: add an option to specify adapter 2021-12-26 11:14:17 +11:00
Tudor Brindus
9872d2e407 [host] dxgi: fix typo in debug log message 2021-12-26 09:49:03 +11:00
Quantum
12461196c3 [host] nvfbc: fix comments in updateDamageRects 2021-12-24 15:30:04 +11:00
Geoffrey McRae
b515fa80d5 [host] nvfbc: be more intellegent when unionizing disjointed sets 2021-10-27 00:00:38 +11:00
Quantum
1f24ab0742 [host] nvfbc: avoid waking up pointer thread for no reason
If the wait times out, we used to simply restart the loop, which causes
it to check this->stop and exit if set to true. However, nvfbc_stop
already calls lgSignalEvent, which would wake up the pointer thread to
perform the check, so there is no need to set a timeout on the wait.
2021-09-29 17:47:02 +10:00
Quantum
4b3aaa7e0c [host] cmake: report capture backends enabled 2021-08-20 21:03:50 +10:00
Quantum
4ecf749f7e [host] remove all casts around malloc 2021-08-16 16:26:58 +10:00
Quantum
cdda89cef7 [host] use correct argument order for calloc 2021-08-16 16:25:59 +10:00
Tudor Brindus
982b4e6625 [host] use variable-relative sizeof where possible 2021-08-16 16:22:55 +10:00
Quantum
4f7ce91e7f [host] capture: switch all asserts to DEBUG_ASSERT 2021-08-14 12:19:07 +10:00
Quantum
10ee6cd031 [host] nvfbc: read nvfbc:diffRes option with the correct type 2021-08-14 09:21:34 +10:00
Quantum
e5d252290d [common] array: add ALIGN_PAD macro for common logic
ALIGN_PAD(x, a) returns x rounded up to the nearest multiple of a.
2021-08-14 08:05:29 +10:00
Quantum
acd5ce51db [host] dxgi: use FAILED macro instead of comparing against S_OK 2021-08-13 20:21:50 +10:00
Quantum
d3ea9662bf [host] nvfbc: remove rectangles that are entirely contained in others
This makes nvfbc report less useless damage and makes the client run faster.
2021-08-13 20:21:27 +10:00
Quantum
566c89e9d8 [host] dxgi: correctly count moved rectangles 2021-08-13 20:21:08 +10:00
Quantum
b3173bdddc [host] dxgi: include correct DXGI headers
The declarations in dxgi_extra.h are not missing, they are in dxgi1_2.h and
dxgi1_5.h, which exist in MinGW-w64 since 2017.
2021-08-12 12:35:45 +10:00
Quantum
61a4b0744d [host] dxgi: use standard MinGW libd3d11.a
MinGW has been shipping this file since 2014, and that version contains the
only function from the dll that we call: D3D11CreateDevice.
2021-08-12 11:58:34 +10:00
Quantum
9bded74543 [host] dxgi: use CopySubresourceRegion when possible
This commit adds damage tracking to the DXGI textures, and only copies the
damaged areas to the textures with ID3D11DeviceContext::CopySubresourceRegion.

The sleep logic in waitFrame makes it difficult for this to reduce the
latency, but removing it shows significant improvements (6-7 ms to ~3 ms)
when a tiny portion of the screen is damaged, while showing no difference on
full screen damage.
2021-08-11 19:01:52 +10:00
Quantum
cab95c5eed [common] rects: refactor rect buffer copy code to common module
This also fixes a bug of accidentally multiplying the stride by 4 when
the stride is already in bytes and not pixels.
2021-08-08 08:30:11 +10:00
Quantum
3d29967a8d [host] dxgi: copy only damaged regions to IVSHMEM
This implementation uses a line sweep algorithm to copy the precisely the
intersection of all accumulated damage rectangles, ensuring that every
pixel is copied exactly once, and no pixel is ever copied multiple times.
Furthermore, once a row has been swept, we update the framebuffer write
pointer immediately.
2021-08-06 22:55:15 +10:00
Quantum
3651852430 [host] nvfbc: round pitch to multiple of 128 for dmabuf import
Certain drivers do not support pitches that are not multiples of 128 bytes,
and instead just does some kind of rounding internally. On DXGI, this is not
a problem because the API rounds pixel pitch, but NvFBC does not. This causes
certain resolutions to simply not work with dmabuf, most notably 3440x1440,
which is 1440p ultrawide.

Since we are copying pixels with the CPU anyways, we might as well round the
pitch up to 128 bytes (32 pixels).
2021-08-06 22:48:33 +10:00
Quantum
4da0c64583 [host] nvfbc: make diff map size configurable
This commit adds a new host configuration option, nvfbc:diffRes, which
specifies the dimensions of every block in the diff map. This defaults to
128, meaning the default 128x128 block size.

Since block sizes other than 128x128 is not guaranteed to be supported by
NvFBC, the function NvFBCGetDiffMapBlockSize was introduced to query the
support and output the actual block size used.
2021-08-05 07:27:28 +10:00
Quantum
51b9cd4e5a [all] copyright: use unicode copyright sign ©
This is done for consistency with the license strings in appstrings.c.
2021-08-04 21:16:35 +10:00
arcnmx
c0aec7d8f4 [host] nvfbc: disable pointerThread when unused
fixes log spam that would occur when decoupleCursor=false
2021-08-03 21:05:39 +10:00
Quantum
7801575d99 [host] nvfbc: log error codes for various errors
This makes troubleshooting easier.

We also switch to use %d instead of 0x%x because all error codes in nvfbc.h
are negative decimal numbers.
2021-07-31 15:02:39 +10:00
Quantum
8528969efd [host] nvfbc: clamp damage rectangles to screen size 2021-07-25 14:34:10 +10:00
Quantum
d82333519c [host] dxgi: use SDK versionhelpers.h to test for Windows version
Also, changed logic so that Windows versions before 8 is not treated as 10.
2021-07-20 11:34:57 +10:00
Quantum
9ab85fd0b8 [host] capture: stop sending DPI information
The client doesn't need DPI information anymore, so there is no point
fetching it.
2021-07-18 10:50:57 +10:00
Tudor Brindus
92706caddc [common]: move array length into a common helper
Since it is more generally useful, and less cryptic this way.
2021-07-18 10:41:50 +10:00
Quantum
893b2500c2 [host] nvfbc: copy damaged areas only
This commit tracks the damage made to the framebuffer and only updates those
areas. Damage is tracked directly with NvFBC provided diffmaps.
2021-07-18 10:41:50 +10:00
Quantum
9ce4990793 [host] capture: pass frameIndex to capture backends
This allows capture backends to track damage made to each frame.
2021-07-18 10:41:50 +10:00
Tudor Brindus
f274bec8fc [host] dxgi: compute damage rectangles from moved rectangles
This is untested in that I don't have a Windows 8 VM where move rects
are supplied, but seems sound.
2021-07-18 10:41:50 +10:00
Quantum
e42747f4e3 [host] nvfbc: better algorithm for merging adjacent regions
Use a proper disjoint-set to give a more accurate result.
2021-07-18 10:41:50 +10:00
Quantum
5ed3301cf5 [host] nvfbc: merge adjacent changed regions
For adjacent changed regions, we actually use the bounding box for the
entire polygon. This may result in more area being damaged than strictly
necessary, but is nevertheless desirable since it reduces the number of
rectangles.
2021-07-18 10:41:50 +10:00
Quantum
6b16bb3ea1 [host] nvfbc: populate damage rectangles 2021-07-18 10:41:50 +10:00
Tudor Brindus
d7f9afb3ba [host] dxgi: populate damage rectangles
Co-Authored-By: Quantum <quantum2048@gmail.com>
2021-07-18 10:41:50 +10:00
Geoffrey McRae
74468cf799 [host] windows: remove accidental addition of some junk 2021-07-17 15:02:36 +10:00
Geoffrey McRae
411a6b1e49 [host] windows: add delayExecution function for more accurate sleeps
This change not only exposes and allows use of NtDelayExecution, but
also moves the code to set the system timer resolution.
2021-07-17 14:55:22 +10:00
Geoffrey McRae
789ee70674 [host] dxgi: print out the adapter details earlier
Before we try and perhaps fail to init DXGI, we should print out what
the device is so that when there is an error report we can immediately
see if the user has the QXL device attached still.
2021-07-12 19:28:13 +10:00
Geoffrey McRae
3c0616bab7 [host] dxgi: print out the output device name to aid with support 2021-07-12 19:03:02 +10:00
Geoffrey McRae
7d0b9711bd [host] nvfbc: remove the frameEvent event and associated code
Now that the host application can run the capture interface in
synchronous mode, and NVFBC uses this mode there is no longer need for
the frameEvent.
2021-07-12 17:01:23 +10:00
Geoffrey McRae
e477663a7e [host] app: allow the capture interface to select async or sync mode
While it's correct for DXGI to use a asyncronous waitFrame model, other
capture interfaces such as NvFBC it is not correct. This change allows
the capture interface to specify which is more correct for it and moves
the waitFrame/post into the main thread if async is not desired.
2021-07-12 16:57:22 +10:00
Quantum
eb01efe0cb [host] nvfbc: do not crash when protected content is playing
We return a timeout, so that when protected content finishes playing, we
can immediately resume capture.
2021-07-11 17:54:23 +10:00
Quantum
501b270890 [host] nvfbc: optimize change detection loop
Before, we only break out of the current row when a change is detected,
and all subsequent rows are still scanned. Now we break out of the entire
loop. This should make change detection ever so slightly faster.
2021-07-11 10:15:12 +10:00
Quantum
fd8f8b2b28 [host] dxgi: correctly mention AcquireNextFrame in help text
Also fix some formatting issues.

Co-Authored-By: Tudor Brindus <me@tbrindus.ca>
2021-07-11 10:15:12 +10:00
Geoffrey McRae
78b8e2a73c [host] windows: make D3DKMTSetProcessSchedulingPriorityClass global
Testing shows that `D3DKMTSetProcessSchedulingPriorityClass` has a
positive performance impact for NvFBC as well as DXGI, as such always
try to boost the priority for the windows host.
2021-07-10 12:27:30 +10:00
Geoffrey McRae
041b95507d [host] windows/nvfbc/common: strip out broken "enhanced" event logic
This so called "enhanced" event logic is completely flawed and can never
work correctly, better to strip it out and put our faith in windows to
handle the events for us.

And yes, I am fully aware I wrote the utter trash in the first place :)
2021-07-09 10:22:03 +10:00
Geoffrey McRae
d36c4f0e83 [host] kvmfr: allow the frame size to exceed the available memory
This change allows the host to still transmit a frame that is truncated
if the IVSHMEM size is too small to allow for a full frame.
2021-06-12 18:44:28 +10:00
Geoffrey McRae
f02d61d665 [host] dxgi: sleep until it's close to time to map
This change adds an average function to time how long it takes the GPU
to copy and map the texture, and then uses this average to sleep for 80%
of this average lowering CPU usage and potentially decreasing lock
contention.
2021-06-06 12:26:36 +10:00
Quantum
24d0aa0c18 [all] normalize copyright on all source files 2021-06-06 11:53:05 +10:00
Geoffrey McRae
fcf6abc7c6 [host] NvFBC/DXGI: make DXGI the default instead of the fallback
It has been detemined that a failure to init NvFBC causes a 20-30%
performance penalty on non NvFBC supported hardware (GeForce) when using
DXGI, as such reverse the order and default to using DXGI as our first
option.

If NvFBC is still desired, pr #500 added the option `app:capture` which
can be used to force NvFBC.
2021-06-06 06:14:24 +10:00
Geoffrey McRae
0d9b0bd367 [host] dxgi: increase maxTextures default to 4
Testing shows that at high frame rates the default of 3 is hampering
performance, increasing this to 4 yields a substantial performance
improvement.
2021-06-06 01:35:00 +10:00
Quantum
ec81a353c2 [host] nvfbc: fix null dereference in mouse hook handler
Since we now let the mouse hook linger until the process is killed, the
cursor event that the hook signals may now be null, as the capture could
have stopped. If the hook fires during this time, a crash occurs.
2021-04-29 11:53:19 +10:00
Quantum
f2c0b8c0b4 [host] allow user to select capture backend
This commit introduces a new option, app:capture, which can be set to
either DXGI or NvFBC to force the host application to use that backend.

This is very useful for testing DXGI on Quadro cards, which would default
to running with NvFBC.
2021-02-27 17:41:44 +11:00
Quantum
ff0a859ceb [host] nvfbc: avoid recreating mouse hook and 1x1 window
The mouse hook code is very fragile, and we would like to avoid unhooking
and re-hooking as much as possible.

After this commit, this is done only once, and the hook and 1x1 window is
only destroyed upon exit. This, of course, comes with the downside of
the slight performance penalty if the guest machine is used directly while
the host is running and the client is not running.
2021-01-31 12:17:14 +11:00
Quantum
1b48ac842a [host] nvfbc: fix resource leak when pointer thread creation fails
Moving NvFBCToSysSetup to nvfbc_init means that when the pointer thread
fails to be created, NvFBCToSysRelease needs to be called.

To resolve such cleanup issues in the future, we instead call nvfbc_deinit,
which should cleanup everything that needs to be cleaned up. fails.
2021-01-31 11:21:24 +11:00
Quantum
a702c912ae [host] nvfbc: move NvFBCToSysCreate into nvfbc_init
When NvFBCToSysCapture reports recreation is required, we return
CAPTURE_RESULT_REINIT, which eventually calls nvfbc_deinit and then
nvfbc_init.

However, the NvFBC object is actually created in nvfbc_create, which
means the NvFBC object is never actually recreated. The result is an
endless cycle of NvFBC asking for recreation. This commonly manifests
as the client waiting endlessly for the host when the guest machine
reboots.

In this commit, the NvFBC object creation is moved into nvfbc_init,
and when recreation is required, it will actually be recreated.
2021-01-31 10:57:51 +11:00
Quantum
acc3298344 [host] nvfbc: cleanup threads created by nvfbc_init on failure
mouseHook_install and dwmForceComposition both create threads, but these
are only freed in nvfbc_deinit which is not called if nvfbc_init fails.
These should be freed if the pointer thread fails to be created, as
nothing else could be cleaning it up.
2021-01-31 09:57:07 +11:00
Quantum
fb916cbac1 [host] nvfbc: always update cursor shape on startup
This ensures that the top-left position of the cursor sprite is correctly
computed.
2021-01-28 11:18:02 +11:00
Geoffrey McRae
0d7be70b56 [host] dxgi: fix maybe uninitialized warning 2021-01-27 01:21:06 +11:00
Quantum
d610aaf2cf [host] nvfbc: update cursor position on shape change
This is because we keep track of the top-left corner of the cursor, not
the location of the hotspot. When the cursor shape changes, the hotspot
location may also change. When it does, the position of the top-left
corner changes and requires an update.

In the case that we do not have the current cursor position, which
happens on startup, we do not generate this update.
2021-01-23 20:37:09 +11:00
Geoffrey McRae
04774d9cd6 [host] fix faults caused by improper startup/shudown/restart ordering 2021-01-21 17:05:30 +11:00
Geoffrey McRae
6b8161972d [host] nvfbc: prevent possible double free 2021-01-21 16:28:20 +11:00
Geoffrey McRae
98ea8b0bb8 [host] nvfbc: remove invalid close of the HMONITOR handle 2021-01-21 16:17:24 +11:00
Quantum
ffa72c7992 [host] nvfbc: force composition to capture some full screen apps
NvFBC is unable to capture certain applications that bypasses the DWM
compositor, for example, Firefox playing video in full screen. This
has been a known issue for a long time with Nvidia's ShadowPlay, see:
* https://www.nvidia.com/en-us/geforce/forums/geforce-experience/14/233709/
* https://crbug.com/609857

Nvidia won't fix this, but there are workarounds. For example, we
create a transparent 1x1 layered window, which forces desktop composition
to be enabled.

Note that SetLayeredWindowAttributes also supports alpha-based transparency,
but setting transparency to 0 will cause DWM to skip composition. We could
use a transparency of 1, but this ruins the image by the slightest bit,
which is unacceptable. Therefore, we must use chroma key-based
transparency, which tricks DWM into compositing despite being fully
transparent.
2021-01-21 12:14:03 +11:00
Geoffrey McRae
cac454d9cf [host] dxgi: reverse the rotation angle.
This is undocumented however testing yields that DXGI DD reports the
inverse rotation. Research shows that this is because of a difference in
coordiate spaces.

Ref: https://docs.microsoft.com/en-us/windows/uwp/gaming/supporting-screen-rotation-directx-and-cpp
2021-01-18 15:15:36 +11:00
Geoffrey McRae
f5587b6b6b [host] all: pass back the desktop rotation to the client 2021-01-18 13:53:29 +11:00
Tudor Brindus
a46a3a2668 [all] use explicit void parameter lists
This makes it a compile-time error to call a function that semantically
takes no parameters with a nonzero number of arguments.

Previously, such code would still compile, but risk blowing up the stack
if a compiler chose to use something other than caller-cleanup calling
conventions.
2021-01-14 17:29:37 +11:00
Geoffrey McRae
c2ad9666bb [host] use the HotSpot information as provided by DXGI
I must have originally overlooked this member when I wrote this code. :S
2021-01-05 20:55:39 +11:00
Quantum
7e4d323427 get display DPI info to scale mouse movement 2021-01-05 09:03:29 +11:00
Geoffrey McRae
604b6bec9a [host] don't fail if windows is dumb and doesnt give us the cursor info 2020-11-01 04:45:57 +11:00
Geoffrey McRae
38b05cda50 [host] dxgi: fix incorrect bpp value 2020-10-12 20:08:51 +11:00
Geoffrey McRae
7a49f75d95 [host] dxgi: ensure formatVer is incremented on re-init 2020-10-12 19:39:57 +11:00
Geoffrey McRae
b2961c7939 [all] added new format version field to frame header 2020-10-12 18:52:37 +11:00
Geoffrey McRae
8a9f004ff6 [host/client] fix invalid initialization of RGBA16F 2020-10-11 19:39:47 +11:00
Geoffrey McRae
9c6bd888fd [host/client] added experimental RGBA16 float support (EGL only) 2020-10-11 19:22:31 +11:00
Geoffrey McRae
e20c8a5cc7 [host] dxgi: don't try to get the hotspot of a null cursor 2020-10-06 23:24:01 +11:00
Geoffrey McRae
4f4d2dbf42 [host] dxgi: fix memory leak if an error occurs 2020-10-06 22:32:10 +11:00
Geoffrey McRae
7e362050f7 [all] update KVMFR to provide cursor hotspot information
This commit bumps the KVMFR protocol version as it adds additional
hotspot x & y fields to the KVMFRCursor struct. This corrects the issue
of invalid alignment of the local mouse when the shape has an offset
such as the 'I' beam.
2020-08-20 13:51:01 +10:00
Geoffrey McRae
1c7961daeb [host] dxgi: rework locking and retry logic for lower latency 2020-08-15 20:49:49 +10:00
Geoffrey McRae
cdc3384883 [host] dxgi: improve frame signaling mechanics 2020-08-15 18:16:11 +10:00
Geoffrey McRae
969effedde [host] update information about PsExec now LG can run as a service 2020-08-13 11:41:16 +10:00
Geoffrey McRae
1d6d640b6e [host] dxgi: default to using the acquire lock 2020-08-07 20:31:46 +10:00
Geoffrey McRae
977d7b277d [host] dxgi: boost GPU thread priority if possible 2020-08-07 19:44:00 +10:00
Geoffrey McRae
bc7871f630 [c-host] renamed finall to just plain host 2020-05-25 13:42:43 +10:00