This patch performs several changes.
* Alters the fourcc codes to types that ignore the alpha channel where
possible to allow the gpu to internally use 24-bit formats.
* Attempts to use DRM_FORMAT_RGB888 first as some GPUs may support this
* If DMABUF is not in use the data is now imported directly as RGB24
without the post-processing shader
According to Erik @ NVidia the open source NVidia driver will not
create a EGLImage from a DMABUF if the target is not
GL_TEXTURE_EXTERNAL_OES. This change set converts the dmabuf texture
from GL_TEXTURE_2D to GL_TEXTURE_EXTERNAL_OES and at runtime performs a
global search & replace on fragment shaders as needed to remain
compatible, replacing `sampler2D` with `samplerExternalOES`.
Ref: https://github.com/NVIDIA/open-gpu-kernel-modules/discussions/243#discussioncomment-3283415
This uses the same line sweep algorithm originally created to copy DXGI
textures to IVSHMEM to implement the copy from IVSHMEM to memory-mapped
pixel buffer objects.
The way things were handled in EGLTexture is not only very hard to
follow, but broken. This change set breaks up EGLTexture into a modular
design making it easier to implement the various versions.
Note that DMABUF is currently broken and needs to be re-implemented.
Instead of using the desktop <GL/gl.h>, we properly use the OpenGL ES 3.x
headers. Also, we now use GL_EXT_buffer_storage for MAP_PERSISTENT_BIT_EXT
and MAP_COHERENT_BIT_EXT as the core versions are only available in desktop
OpenGL 4.4. Similarly, we need GL_EXT_texture_format_BGRA8888 for GL_BGRA_EXT
as GL_BGRA is desktop-only.
Note: This only works with the KVMFR kernel module in a VM->VM
configuration. If this causes issues it can be disabled with the new
option `app:allowDMA`
This changes the method of the memory copy from the host application to
the guest. Instead of performing a full copy from the capture device
into shared memory, and then flagging the new frame, we instead set a
write pointer, flag the client that there is a new frame and then copy
in chunks of 1024 bytes until the entire frame is copied. The client
upon seeing the new frame flag begins to poll at high frequency the
write pointer and upon each update copies as much as it can into the
texture.
This should improve latency but also slightly increase CPU usage on the
client due to the high frequency polling.