It appears that the PCI BAR memory is slow to access with remap_pfn_range
and that it should instead be faulted in one page at a time.
Commit 5774e219651da3b9bacf9eafd87ae39d75a5eea7 switched to
remap_pfn_range and caused a performance regression in the VM->VM case.
This commit restores the old behaviour, but extends it to support
mmapping the kvmfr device directly, without going through a dmabuf.
This allows PCI kvmfr devices to be directly mmap'd just like in-memory
ones. Also, the more efficient mmap implementation is used for mapping
the dmabuf, avoiding the faulting code entirely.
Added an array option static_size_mb to the kvmfr module to create a
list of in-memory kvmfr devices. These devices support dmabuf just like
normal kvmfr devices. Additionally, they can be mmap'd, which allows
them to be passed to qemu as ivshmem devices.
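
For illustration only, a minimal userspace sketch of mapping a kvmfr
device node directly, without a dmabuf. The helper names
map_shared_region and unmap_shared_region are hypothetical and not part
of the module; the device path and size used by a caller are assumptions
that depend on the local configuration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map `size` bytes of the file or device node at
 * `path` as shared, read/write memory. Returns NULL on failure. */
void *map_shared_region(const char *path, size_t size)
{
    assert(path != NULL && size > 0);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    /* MAP_SHARED so that writes reach the underlying memory instead of
     * a private copy-on-write copy. */
    void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* The mapping remains valid after the descriptor is closed. */
    close(fd);

    return mem == MAP_FAILED ? NULL : mem;
}

/* Hypothetical helper: release a mapping created above. */
int unmap_shared_region(void *mem, size_t size)
{
    return munmap(mem, size);
}
```

A caller might use map_shared_region("/dev/kvmfr0", 32u << 20), assuming
the device node is named /dev/kvmfr0 and was created with a size of
32 MiB; neither value is taken from this commit.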