X11 needs to calibrate to get the best possible latency; as such, it needs the
scene to render so that the render time of the scene can be accounted for in
the delay calculation.
This replaces the scaled `destRect` with a version that uses doubles,
correcting the rounding error that caused the black bar areas to not be
properly cleared.
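As a rough sketch of the idea (hypothetical names, not the actual code),
keeping the letterbox math in doubles and rounding only once avoids the
accumulated error:

#include <math.h>

struct DoubleRect { double x, y, w, h; };

// Fit the source into the window while preserving aspect ratio; every
// intermediate value stays in double precision so the destination rect
// lines up exactly with the black bar areas that get cleared.
static struct DoubleRect calcDestRect(double srcW, double srcH,
                                      double winW, double winH)
{
  const double scale = fmin(winW / srcW, winH / srcH);
  struct DoubleRect r;
  r.w = srcW * scale;
  r.h = srcH * scale;
  r.x = (winW - r.w) / 2.0;
  r.y = (winH - r.h) / 2.0;
  return r;
}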
This mesh will later be used to render only damaged portions of the desktop.
We also moved the coordinate transformation for the damage overlay into a
matrix so it is computed by the shader.
XPresent doesn't give us the time before presentation, but the time just
after. This code calculates and calibrates a delay to sleep for before
signaling the render wait event when using jitRender.
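A sketch of the calibration idea with hypothetical names; the real code
differs, but the principle is to track how long rendering takes and sleep for
the remainder of the frame period:

#include <stdint.h>

static uint64_t avgRenderTimeUs = 0; // running average of the render time

static void calibrateRenderTime(uint64_t renderTimeUs)
{
  // exponential moving average so the delay adapts to the scene
  avgRenderTimeUs = (avgRenderTimeUs * 9 + renderTimeUs) / 10;
}

static uint64_t jitDelayUs(uint64_t framePeriodUs)
{
  // sleep for whatever is left of the frame after the expected render time
  return avgRenderTimeUs >= framePeriodUs ? 0
       : framePeriodUs - avgRenderTimeUs;
}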
After the damage queue PR, EGL damage count 0 means no change, and -1 means
invalidate the entire window. However, several other places have different
semantics, and we are not handling them correctly:
1. KVMFR uses 0 to signal invalidating the entire frame, so if we receive 0
rectangles in egl_on_frame, we should set damage count to -1.
2. The damage overlay treated 0 as full damage, which is now incorrect. This
is fixed, and now it treats 0 as no update, and -1 as full damage.
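A small sketch of the intended mapping (illustrative names only):

#include <stdbool.h>

// KVMFR: 0 rectangles means the whole frame changed, which in the EGL
// damage convention is -1 (invalidate everything).
static int kvmfrToEglDamageCount(int rectCount)
{
  return rectCount == 0 ? -1 : rectCount;
}

// Damage overlay: 0 now means no update, -1 means full damage.
static bool overlayHasUpdate (int damageCount) { return damageCount != 0; }
static bool overlayFullDamage(int damageCount) { return damageCount <  0; }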
The way things were handled in EGLTexture was not only very hard to follow,
but broken. This change set breaks EGLTexture up into a modular design, making
it easier to implement the various versions.
Note that DMABUF is currently broken and needs to be re-implemented.
There used to be a possible race where a batch of rectangles is appended but
the total count is read before it has been updated. Using a lock eliminates
such races.
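For illustration, a minimal sketch (hypothetical names) where both the append
and the count read take the same mutex:

#include <pthread.h>

#define MAX_RECTS 256

struct RectQueue
{
  pthread_mutex_t lock;
  int             count;
  struct { int x, y, w, h; } rects[MAX_RECTS];
};

static void rectQueueAppend(struct RectQueue * q, int x, int y, int w, int h)
{
  pthread_mutex_lock(&q->lock);
  if (q->count < MAX_RECTS)
  {
    q->rects[q->count].x = x;
    q->rects[q->count].y = y;
    q->rects[q->count].w = w;
    q->rects[q->count].h = h;
    ++q->count; // the new count becomes visible only once the rect is written
  }
  pthread_mutex_unlock(&q->lock);
}

static int rectQueueCount(struct RectQueue * q)
{
  pthread_mutex_lock(&q->lock);
  const int count = q->count;
  pthread_mutex_unlock(&q->lock);
  return count;
}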
Without configuring Wayland compositors to send frame callbacks as late as
possible, JIT rendering can increase latency by more than one frame.
For example, by default, sway asks applications to render right after a
vblank and does its own composition right after a vblank, resulting in
~2 frames' worth of latency. If max_render_time is set on the output,
it composes that many milliseconds before the vblank, shaving off ~1 frame's
worth of latency. If max_render_time is also set on the window, the frame
callback is sent that many milliseconds before composition, and we achieve
perfectly low latency.
Therefore, out of the box, JIT rendering should not be enabled, as manual
compositor configuration is required for optimal results.
For reference, the following sway settings result in the best latency:
output <insert output name> max_render_time 1
for_window [app_id="looking-glass-client"] max_render_time 1
This reverts commit 3baed05728.
When invalidating the window, we used to not update this->cursorLast, causing
us to lose track of the cursor. Now we update this->cursorLast
unconditionally, which fixes the issue.
Version 3 does not send xdg_output.done events, instead guaranteeing that
all xdg_output.* events are sent before wl_output.done. This saves us from
doing the work twice.
The method used is not guaranteed to work on all Wayland compositors,
so offer a way out. We need to support this anyway in case the xdg_output
or wp_viewporter protocols are not available.
Currently, we scale the desktop up to the next integer scale factor and rely
on the Wayland compositor to scale it back down to the correct size.
This is obviously undesirable.
In this commit, we attempt to detect the actual fractional scaling by finding
the current active mode in wl_output, and dividing it by the logical screen
size reported by xdg_output, taking into consideration screen rotation.
We then use wp_viewporter to set the exact buffer and viewport sizes if
fractional scaling is needed.
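The detection itself is a simple division; a sketch with hypothetical names,
assuming the output transform has already been read from wl_output:

#include <stdbool.h>

// wl_output reports the active mode in physical pixels, xdg_output the
// logical size; their ratio is the real (possibly fractional) scale.
static double detectScale(int modeW, int logicalW, int logicalH,
                          bool rotated90)
{
  // a 90/270 degree transform swaps the axes of the logical size
  const int logical = rotated90 ? logicalH : logicalW;
  return (double)modeW / (double)logical;
}

If the result is fractional, the buffer can be created at the mode size and
presented at the logical size via wp_viewport_set_destination.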
When requested, JIT render mode will be used if the display server supports it.
Otherwise, a warning is generated instead.
This essentially applies the signalNextFrame logic used for imgui to
everything. We automatically enable this mode when the overlay is on.
Currently, this exposes some damage tracking bugs in the EGL renderer.
This prevents damage from being overwritten when frames are received
faster than they can be rendered.
This implementation cycles between two queues, removing all need for
memory allocation.
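Conceptually it looks something like this sketch (illustrative names,
synchronization between the two threads omitted for brevity): two fixed
buffers, one being filled while the other is consumed, then swapped.

#define MAX_DAMAGE_RECTS 64

struct DamageQueue
{
  struct { int x, y, w, h; } rects[MAX_DAMAGE_RECTS];
  int count;
};

static struct DamageQueue queues[2];
static int active = 0; // queue currently accepting new damage

// frame thread: accumulate damage into the active queue
static void damageAdd(int x, int y, int w, int h)
{
  struct DamageQueue * q = &queues[active];
  if (q->count == MAX_DAMAGE_RECTS)
    return;
  q->rects[q->count].x = x;
  q->rects[q->count].y = y;
  q->rects[q->count].w = w;
  q->rects[q->count].h = h;
  ++q->count;
}

// render thread: take the filled queue and hand the empty one back,
// so no allocation ever happens
static struct DamageQueue * damageSwap(void)
{
  struct DamageQueue * filled = &queues[active];
  active ^= 1;
  queues[active].count = 0;
  return filled;
}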
This method takes an LGEvent and signals it when the next frame should be
rendered in time for the next vblank.
We will be using this to render imgui at screen refresh rate, but this could
potentially be used later to implement a better form of vsync for supported
display servers.
This must be invoked before swapping buffers.
The imgui overlay requires input even if the display is not captured and is
operating in raw mode. XInput2 correctly only sends
XI_ButtonPress/XI_ButtonRelease events if the device has not been captured,
so it's safe to handle both raw and non-raw button events at the same time.
We look for the client config in $XDG_CONFIG_HOME/looking-glass/client.ini.
This is done because it's more conventional, and also allows us to add
additional configuration files, e.g. for the host.
We fall back to $HOME/.config as is standard, and then as a last resort use
getpwuid(getuid())->pw_dir. This is also recommended by the getpwuid manpage:
> An application that wants to determine its user's home directory should
> inspect the value of HOME (rather than the value getpwuid(getuid())->pw_dir)
> since this allows the user to modify their notion of "the home directory"
> during a login session.
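The lookup order is roughly the following (a sketch, not the actual code):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pwd.h>

// Resolve the directory that holds looking-glass/client.ini, preferring
// $XDG_CONFIG_HOME, then $HOME/.config, then the passwd entry.
static int configDir(char * buf, size_t len)
{
  const char * xdg = getenv("XDG_CONFIG_HOME");
  if (xdg && *xdg)
    return snprintf(buf, len, "%s/looking-glass", xdg);

  const char * home = getenv("HOME");
  if (!home || !*home)
  {
    const struct passwd * pw = getpwuid(getuid());
    if (pw)
      home = pw->pw_dir;
  }

  if (!home)
    return -1;

  return snprintf(buf, len, "%s/.config/looking-glass", home);
}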
Currently, we load /etc/looking-glass-client.ini and/or
~/.config/looking-glass-client.ini as long as they exist, even if they are
not files. We should only load them if they are files.
We don't want to encourage the craziness of making the client setuid to
bypass permission issues on the shm file.
Note: I see no evidence of this happening in the wild, but let's be
proactive.
The refresh-copyright script now automatically updates the copyright string
embedded in config.c. In order to achieve this, refresh-copyright gained the
ability to reflow text as needed.
We now give ImGui the true logical size of the window and tell it to scale
the framebuffer. To fix the blurry fonts, we continue to load fonts at the
scale necessary for the DPI and use FontGlobalScale to shrink the fonts back
to the logical size. The font rectangle is then expanded by the framebuffer
scaling, resulting in good text rendering.
This method has the advantage of not messing up the sizes of resizable
overlays when moving across monitors.
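A sketch of the approach against the cimgui bindings (values and call sites
are illustrative, not the actual code):

#define CIMGUI_DEFINE_ENUMS_AND_STRUCTS
#include "cimgui.h"

static void setupImGuiScaling(float logicalW, float logicalH, float scale)
{
  ImGuiIO * io = igGetIO();

  // report the logical size and let ImGui scale the framebuffer
  io->DisplaySize             = (ImVec2){ logicalW, logicalH };
  io->DisplayFramebufferScale = (ImVec2){ scale, scale };

  // fonts are still loaded at the DPI-scaled size so the glyphs are crisp...
  ImFontAtlas_AddFontFromFileTTF(io->Fonts, "font.ttf", 16.0f * scale,
      NULL, NULL);

  // ...then shrunk back to logical units for layout
  io->FontGlobalScale = 1.0f / scale;
}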
The default of [0, 50] makes sense for FPS/UPS graphs, but not for things
like the import graph, which should never take more than 5 ms.
This commit allows the min/max y-axis value to be specified when registering
the graph.
Now that we are drawing with damage rects, the window may not get fully
redrawn when it is hidden and then exposed. This provides
`app_invalidateWindow` for the display server backend to call when the
screen needs a full redraw.
When a new client connects to our session the host will repeat the last
valid frame for the new client. This change will detect this and skip
the duplicated frame.
This is necessary in case overlays change size. When this happens, we must
damage the larger of the overlays' rectangles this frame and last frame.
This erases the overlay from where it no longer appears.
In order to do this, we must keep track of the rectangles for every overlay
with no exception. We cannot short-circuit the generation of rectangles if
we run out of buffer space, and we must allocate space for MAX_OVERLAY_RECTS
rectangles for every frame. Otherwise, we will not know where to erase the
overlay if it disappears.
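In sketch form (illustrative names), the damage submitted for an overlay
covers both what it occupied last frame and what it occupies now:

#define MAX_OVERLAY_RECTS 10 // illustrative value

struct Rect { int x, y, w, h; };

struct OverlayState
{
  struct Rect thisRects[MAX_OVERLAY_RECTS];
  struct Rect lastRects[MAX_OVERLAY_RECTS];
  int         thisCount;
  int         lastCount;
};

// Emit damage for this frame's rectangles and last frame's rectangles so
// stale pixels are erased when the overlay moves, shrinks or disappears.
static int overlayDamage(struct OverlayState * s, struct Rect * out, int max)
{
  int n = 0;
  for (int i = 0; i < s->thisCount && n < max; ++i)
    out[n++] = s->thisRects[i];
  for (int i = 0; i < s->lastCount && n < max; ++i)
    out[n++] = s->lastRects[i];

  // remember this frame's rectangles for the next frame
  for (int i = 0; i < s->thisCount; ++i)
    s->lastRects[i] = s->thisRects[i];
  s->lastCount = s->thisCount;
  return n;
}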
Currently, we dispatch the events on the Wayland display ourselves.
This is fine when using the cairo backend of libdecor, as it does the same
thing we do, but other backends may require other things to be dispatched.
This commit lets libdecor dispatch events instead, through libdecor_get_fd
and libdecor_dispatch, which should hopefully make things less sketchy.
While the renderer can track this internally, it is better to provide this
information to the renderer directly so it can make better decisions on how
to update the screen.
If the guest is not sending frames at a constant rate, the minimum FPS
timeout may expire, drawing an additional frame. This change calculates
the average UPS frame time over the past 100ms and adds it to the
timeout value, allowing the timeout to be dynamic.
Accumulating time is not the best way to do this, as the timer callback may
not fire exactly every 1000ms; using the monotonic clock gives more accurate
results.
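For example (a sketch):

#include <time.h>
#include <stdint.h>

// Measure elapsed time from the monotonic clock instead of summing the
// nominal 1000ms timer period, which drifts if the callback is late.
static uint64_t monotonicMs(void)
{
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (uint64_t)ts.tv_sec * 1000ULL + ts.tv_nsec / 1000000ULL;
}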
1. Use atomics and return exact cursor positions from egl_cursor_render
to avoid race conditions between cursor render and update.
2. Instead of messing with lastCursorValid in various overlays, simply use
the hasOverlay/hadOverlay logic for cursor damage. This simplifies the
logic greatly.
As a result, I believe all cursor-related artifacts are fixed.
Note to reviewer: as atomic_init and atomic_store are implemented as macros,
it is currently not possible to pass structs as compound literals due to the
comma being interpreted as an argument separator by the preprocessor.
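For illustration, the limitation looks roughly like this:

#include <stdatomic.h>

struct Point { int x, y; };
static _Atomic(struct Point) cursorPos;

void example(void)
{
  // Fails to compile: atomic_store is implemented as a macro, so the comma
  // inside the compound literal is seen as a macro argument separator.
  //atomic_store(&cursorPos, (struct Point){ 1, 2 });

  // Workaround: build the value in a named temporary first.
  struct Point p = { 1, 2 };
  atomic_store(&cursorPos, p);
}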
This commit creates a new utility library, eglutil.h, which contains code
to detect and use EGL_KHR_swap_buffers_with_damage or its EXT equivalent.
This logic used to be duplicated between the X11 and Wayland display servers,
which is not ideal.
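The shared helper boils down to something like this (a sketch approximating
the idea, not the actual header):

#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <string.h>

// Prefer the KHR extension, fall back to the EXT variant; both share the
// same signature so the KHR function pointer type works for either.
static PFNEGLSWAPBUFFERSWITHDAMAGEKHRPROC
getSwapWithDamage(EGLDisplay display)
{
  const char * exts = eglQueryString(display, EGL_EXTENSIONS);
  if (!exts)
    return NULL;

  if (strstr(exts, "EGL_KHR_swap_buffers_with_damage"))
    return (PFNEGLSWAPBUFFERSWITHDAMAGEKHRPROC)
      eglGetProcAddress("eglSwapBuffersWithDamageKHR");

  if (strstr(exts, "EGL_EXT_swap_buffers_with_damage"))
    return (PFNEGLSWAPBUFFERSWITHDAMAGEKHRPROC)
      eglGetProcAddress("eglSwapBuffersWithDamageEXT");

  return NULL;
}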
Instead of using the desktop <GL/gl.h>, we properly use the OpenGL ES 3.x
headers. Also, we now use GL_EXT_buffer_storage for MAP_PERSISTENT_BIT_EXT
and MAP_COHERENT_BIT_EXT as the core versions are only available in desktop
OpenGL 4.4. Similarly, we need GL_EXT_texture_format_BGRA8888 for GL_BGRA_EXT
as GL_BGRA is desktop-only.
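In header and token terms, this amounts to something like the following
sketch (the EXT entry point would normally be resolved at runtime via
eglGetProcAddress, so it is passed in here):

#include <GLES3/gl3.h>
#include <GLES2/gl2ext.h> // extension tokens are shared with GLES2

// GL_EXT_buffer_storage stands in for the desktop GL 4.4 core feature
static void createPersistentBuffer(GLuint buffer, GLsizeiptr size,
    PFNGLBUFFERSTORAGEEXTPROC bufferStorageEXT)
{
  glBindBuffer(GL_PIXEL_UNPACK_BUFFER, buffer);
  bufferStorageEXT(GL_PIXEL_UNPACK_BUFFER, size, NULL,
      GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT_EXT | GL_MAP_COHERENT_BIT_EXT);
}

// GL_EXT_texture_format_BGRA8888 provides GL_BGRA_EXT in place of GL_BGRA
static void uploadBGRA(GLsizei w, GLsizei h, const void * data)
{
  glTexImage2D(GL_TEXTURE_2D, 0, GL_BGRA_EXT, w, h, 0,
      GL_BGRA_EXT, GL_UNSIGNED_BYTE, data);
}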
The renderer may take time to process the cursor update due to various
internal factors, so it's best we copy the data and mark the message as done
ASAP. This prevents the host from filling up the queue as easily when a high
DPI mouse is in use.
People often miss the warnings about invalid arguments in their command
line. This last-minute patch attempts to address this by making warnings,
errors, fixmes and fatal errors stand out if stdout is a TTY.
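A minimal sketch of the mechanism (illustrative, not the project's actual
logging code):

#include <stdio.h>
#include <unistd.h>

// Only emit ANSI colour sequences when stdout is actually a terminal.
static void logWarn(const char * msg)
{
  if (isatty(STDOUT_FILENO))
    printf("\033[1;33m[W] %s\033[0m\n", msg); // bold yellow
  else
    printf("[W] %s\n", msg);
}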
Conversion from the float values srcW/srcH to the int values for the client window dimensions would sometimes round down, causing the client to scale instead of matching the host's resolution.
A resolution switch could cause the renderer state to become invalid, as the
texture format may change while it is being rendered. This is fixed by adding
a lock around the format change and render calls to the renderer.
Drawing to the front buffer directly requires special handling to
prevent seeing the draw progress (avoiding glClear, etc) and as a result
the output is quite bad unless a compositor is running. Also, vsync, if
enabled, will not function without double buffering.
As OpenGL is the legacy fallback, there are no plans to implement clean
front buffer draw support, so just enable double buffering.