It appears that the PCI BAR memory is slow to access with remap_pfn_range
and that it should instead be faulted in one page at a time.
The commit 5774e21965 implemented the former
behaviour and caused a performance regression in the VM->VM case.
This commit retores the old behaviour, but extends it to support mmaping
the kvmfr device directly, without going through a dmabuf.
This allows PCI kvmfr devices to be directly mmap'd just like in-memory
ones. Also, the more efficient mmap implementation is used for mapping
the dmabuf, avoiding the faulting code entirely.
Added an array option static_size_mb to the kvmfr module to create a
list of in-memory kvmfr devices. These devices support dmabuf just like
normal kvmfr devices. Additionally, they can be mmap'd, which allows
them to be passed to qemu as ivshmem devices.