According to Erik @ NVidia the open source NVidia driver will not
create a EGLImage from a DMABUF if the target is not
GL_TEXTURE_EXTERNAL_OES. This change set converts the dmabuf texture
from GL_TEXTURE_2D to GL_TEXTURE_EXTERNAL_OES and at runtime performs a
global search & replace on fragment shaders as needed to remain
compatible, replacing `sampler2D` with `samplerExternalOES`.
Ref: https://github.com/NVIDIA/open-gpu-kernel-modules/discussions/243#discussioncomment-3283415
This uses the same line sweep algorithm originally created to copy DXGI
textures to IVSHMEM to implement the copy from IVSHMEM to memory-mapped
pixel buffer objects.
The way things were handled in EGLTexture is not only very hard to
follow, but broken. This change set breaks up EGLTexture into a modular
design making it easier to implement the various versions.
Note that DMABUF is currently broken and needs to be re-implemented.
Instead of using the desktop <GL/gl.h>, we properly use the OpenGL ES 3.x
headers. Also, we now use GL_EXT_buffer_storage for MAP_PERSISTENT_BIT_EXT
and MAP_COHERENT_BIT_EXT as the core versions are only available in desktop
OpenGL 4.4. Similarly, we need GL_EXT_texture_format_BGRA8888 for GL_BGRA_EXT
as GL_BGRA is desktop-only.
The existing code would overwrite the texture's data even if the texture
is currently being used to render to screen. This changeset generates a
texture for each buffer preventing this invalid usage.
Reusing cached EGLImages while the frame format has changed will result
in visual artifacts. We should instead destroy the EGLImages and let
them be recreated.
This commit forces the DMA'd memory to be copied into the texture in
the EGL on_frame handler. This avoids tearing when the LG host inevitably
updates the underlying memory. We need an additional copy inside the GPU,
but this is cheap compared to copying from system memory.
We could have used logic to lock the memory buffer, but that would require
performing DMA on every frame, which wastes memory bandwidth. This
manifests as reduced frame rate when moving the mouse compared to the
non-DMA implementation.
We also keep multiple EGLImages, one for each DMA fd, to avoid issues
with the OpenGL driver.
This makes it a compile-time error to call a function that semantically
takes no parameters with a nonzero number of arguments.
Previously, such code would still compile, but risk blowing up the stack
if a compiler chose to use something other than caller-cleanup calling
conventions.