The actual time between opening the device and the device starting to pull
data can range anywhere between nearly instant and hundreds of
milliseconds. To minimise startup latency, open the device as soon as the
first playback data is received from Spice. If the device starts earlier
than required, insert a period of silence at the beginning of playback to
avoid underrunning. If it starts later, just accept the higher latency and
let the adaptive resampling deal with it.
We can set the startup latency for the next playback far more precisely if
we have the device open already.
Only keep the device open with no playback for 30 seconds to avoid keeping
the device open unnecessarily forever.
Underruns can still happen quite easily at the beginning of playback,
particularly at very low latency settings. Further increase the startup
latency to avoid this.
Many X11 window managers will present an application on their
taskbar as a combination of the application name and an icon
imagery pulled from the X-Property _NET_WM_ICON. Applications
built under frameworks such as Qt or GTK have this property
populated by the framework. This commit adds the Atom _NET_WM_ICON
and populates it with a 64x64 icon of Looking Glass.
The desktop doesn't need its own sampler, there is already an identically
configured one in the `desktop->texture`.
For some reason, using the texture sampler fixes a black screen issue
with my GTX 660 using the 470.86 driver. Maybe hitting some limit
for how many samplers can be allocated?
PipeWire startup latency varies wildly depending on what else is, or was
last using the audio device. In the worst case, PipeWire can request two
full buffers within a very short period of time immediately at the start of
playback, so make sure we've got enough data in the buffer to support this.
The target latency is now based upon the device maximum period size
(which may be configured by setting the `PIPEWIRE_LATENCY` environment
variable if using PipeWire), with some allowance for timing jitter from
Spice and the audio device.
PipeWire can change the period size dynamically at any time which must be
taken into account when selecting the target latency to avoid underruns
when the period size is increased. This is explained in detail within the
commit body.
Previously this was hardcoded to 100ms which is far too high in most
instances, instead we get the initial period size and use whichever is
greater out of 50ms or the period size.
The idea is to reduce the amount of time it takes for the latency to
come down after initial stream start.
This removes the need for locking while also giving a better result in
the graph output. Also when the graph is disabled via the overlay
options it will no longer cause redraws.
This change allows the audiodevs to return the minimum period frames
needed to start playback instead of having to rely on a pull to obtain
these details.
Additionally we are using this information to select an initial start
latency as well as to train the desired latency in order to keep it as
low as possible.
This change is based on the techniques described in [1] and [2].
The input audio stream from Spice is not synchronised to the audio playback
device. While the input and output may be both nominally running at 48 kHz,
when compared against each other, they will differ by a tiny fraction of a
percent. Given enough time (typically on the order of a few hours), this
will result in the ring buffer becoming completely full or completely
empty. It will stay in this state permanently, periodically resulting in
glitches as the buffer repeatedly underruns or overruns.
To address this, adjust the speed of the received data to match the rate at
which it is being consumed by the audio device. This will result in a
slight pitch shift, but the changes should be small and smooth enough that
this is unnoticeable to the user.
The process works roughly as follows:
1. Every time audio data is received from Spice, or consumed by the audio
device, sample the current time. These are fed into a pair of delay
locked loops to produce smoothed approximations of the two clocks.
2. Compute the difference between the two clocks and compare this against
the target latency to produce an error value. This error value will be
quite stable during normal operation, but can change quite rapidly due
to external factors, particularly at the start of playback. To smooth
out any sudden changes in playback speed, which would be noticeable to
the user, this value is also filtered through another delay locked loop.
3. Feed this error value into a PI controller to produce a ratio value.
This is the target playback speed in order to bring the error value
towards zero.
4. Resample the input audio using the computed ratio to apply the speed
change. The output of the resampler is what is ultimately inserted into
the ring buffer for consumption by the audio device.
Since this process targets a specific latency value, rather than simply
trying to rate match the input and output, it also has the effect of
'correcting' latency issues. If a high latency application (such as a media
player) is already running, the time between requesting the start of
playback and the audio device actually starting to consume samples can be
very high, easily in the hundreds of milliseconds. The changes here will
automatically adjust the playback speed over the course of a few minutes to
bring the latency back down to the target value.
[1] https://kokkinizita.linuxaudio.org/papers/adapt-resamp.pdf
[2] https://kokkinizita.linuxaudio.org/papers/usingdll.pdf
In unbounded mode, the read and write pointers are free to move
independently of one another. This is useful where the input and output
streams are progressing at the same rate on average, and we want to keep
the latency stable in the event than an underrun or overrun occurs.
If an underrun occurs (i.e., there is not enough data in the buffer to
satisfy a read request), the missing values with be filled with zeros. When
the writer catches up, the same number of values will be skipped from the
input.
If an overrun occurs (i.e., there is not enough free space in the buffer to
satisfy a write request), excess values will be discarded. When the reader
catches up, the same number of values will be zeroed in the output.
Unbounded mode is currently unused since our audio input and output
streams are not synchronised. This will be implemented in a later commit.
Also reimplemented as a lock-free queue which is safer for use in audio
device callbacks.
If a new playback is started while the previous playback is still flushing,
we simply allow the stream to continue playing and effectively cancel the
flush. In general this is not safe because there may not be enough data in
the buffer to avoid underrunning. We could handle this better later by
trying to insert the right number of silent samples into the buffer, but
for now just completely stop the previous stream before starting the new
one.
Automatically restarting playback once draining has completed could result
in playback starting too early (i.e., before there is enough data in the
ring buffer to avoid underrunning). `audio_playbackData` will keep invoking
`start` until it returns true anyway, so we can just allow draining to
complete normally and wait for `start` to be called again.
These are implemented as ScrollLock+Up/Down for volume up and down, and
ScrollLock+M to toggle audio mute. These should prove useful especially
when Looking Glass now supports streaming audio, and the volume is
defined in the guest and set on the output stream.
This prevents LGMP_ERR_QUEUE_FULL from happening with high polling rate
mice, which is caused by receiving many more mouse events while the
guest cursor warps, triggering more warps.
On my machine (Intel UHD Graphics 770), texture processing occasionally
(about 5% of the time) takes more than 20ms (the highest I have seen is
around 32ms) when the host resolution is 2560x1440. This results in the
frame being discarded and the client displays a stale image. Increase the
timeout to 40ms.
This prevents severe buffer underruns if the PipeWire quantum is bigger
than the ring buffer size. This could happen if a media player is running
at the same time as Looking Glass if it requests a very large quantum size,
for example.
This stops the end of the playback from being truncated. It also prevents
an audible glitch when playback next starts due to the truncated data being
left behind in the ring buffer.
When using jitRender, or on the first frame of an alert the window
doesn't get resized immediately causing it to cut off the end of the
text.
ImGui needs two passes to calulate the bounding box for automatically
sized windows, this is per it's design and not a bug, see:
https://github.com/ocornut/imgui/issues/2158#issuecomment-434223618
If the guest supports sending us it's UUID and PureSpice has also
reported the guest's UUID, check them to see if the user has
accidentially connected to the wrong spice socket.
This change allows the host to provide information to the client about
how the VM is configured, information such as the UUID, CPU
configuration and capture method both for informational display in the
client as well as debugging in the client's logs.
The format of the records allows this to be extended later with new
record types without needing to bump the KVMFR version.
g_state.posInfoValid could become valid after the guest reports the
cursor position, in which case we did not show the cursor until another
update occurs.
This commit eliminates the race by performing the update when
g_state.posInfoValid becomes true.
For an unknwon reason when LG is on another desktop (hidden) and the
user switches to that desktop, the first attempt to grab the pointer
results in a GrabFrozen result. This adds some simple retry logic to
attempt again after a short (100ms) delay which seems to resolve the
issue.
This allows the client to work when the OpenGL implementation fails if
EGL_RENDER_BUFFER is passed, printing a warning. This should fix issues
with Nvidia proprietary drivers on Wayland.
NVIDIA still do not implement a complete/working DMABUF implementation
yet advertise support. Best to tell the public to complain to NVIDIA
instead of assuming it's a LG bug or an issue with their system.
filters are now not run if `egl_filterSetup` returns false, as such we
do not need additional `enable` checks in `prepare` or `run` and we can
bypass the filters even earlier if they are not enabled reducing load.
Since this->prepared will never be set to true unless the filter is
enabled, this results in the framebuffer setup being done every frame
for no reason, causing a lot of texture reallocations.
The cursorThread prevents the host from going to sleep when the
video feed is disabled as it's subscribed to the cursor queue. Stopping
the cursorThread will unsubscribe from the queue and allow the host
application to disable capture.
This commit adds check for the extensions that we need and then calls
the functions indirectly through gl_dynprocs.
This should improve compatibility with older versions of OpenGL, as we
now fallback to the ARB extensions if possible, and in the case of
glGenerateMipmap, we can handle the function not existing at all.
The Linux OpenGL ABI does not guarantee that glXSwapIntervalEXT will be
exported statically from any library, and indeed on some systems this
function does not link at load time, e.g. with amdgpu-pro. All other
GLX functions that we use are from GLX 1.0, which is guaranteed to be
exported statically.
This commit solves this issue by using glXGetProcAddressARB to load the
function. Note that only the ARB version of glXGetProcAddress is
guaranteed to exist by the Linux OpenGL ABI, which is why we must use
it.
This prevents attempts to grab the pointer after the guest side warp
finishes if the pointer has left the window in the meantime. On Wayland,
this would result in the pointer moving to the middle of the window when
the confine is created.
Previously, all progress made during sleep is reset, so if the thread keeps
getting interrupted before the sleep finishes, the sleep will never complete.
This saves a lot of GPU power for partial updates. Running testufo with
lanczos downscaling and FSR upscaling consumed over 90 W, but with this
commit, consumed only 75 W.
The translucent white modal background sort of cancels out the dark
background we apply to the overlay, which is undesirable. It should
instead further darken the background.
For consistency, we now use igGetColorU32Col(ImGuiCol_ModalWindowDimBg)
to draw the overlay background, to avoid hardcoding the same colour in
multiple places.
Currently, this is visible through how fast the cursor blinks, with it
blinking faster at higher refresh rates. This commit makes the timing
consistent.
This new function dumps all options marked as preset instead of dumping
individual sections. This should allow filter options to not be all grouped
into the [eglFilter] section.