This fixes a rounding issue on certain hardware (NVidia) which actually
implement mediump as half precision (16-bit) float. It's safe to assume
`highp` is available as if the GPU does not support it, then the shader
compiler will try to find a lower precision that is supported by the GPU
This saves a lot of GPU power for partial updates. Running testufo with
lanczos downscaling and FSR upscaling consumed over 90 W, but with this
commit, consumed only 75 W.
gl_Position is expected to be using homogeneous coordinates, which requires
w to be a coordinate scale factor, usually 1.0. z should also be set in order
for depth to be well-defined. Therefore, we should set gl_Position.zw to
vec2(0.0, 1.0).