These old functions can be removed now too. Let net_param_nic() print
the full set of network devices directly, and also make it note that a
list more specific to this platform/config will be available by using
'-nic model=help' instead.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Sat, 21 Oct 2023 20:34:41 +0000 (21:34 +0100)]
hw/mips: use qemu_create_nic_device()
The Jazz and MIPS SIM platforms both instantiate their NIC only if a
corresponding configuration exists for it. Convert them to use the
qemu_create_nic_device() function for that.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Fri, 20 Oct 2023 10:46:56 +0000 (11:46 +0100)]
hw/arm/stellaris: use qemu_find_nic_info()
Rather than just using qemu_configure_nic_device(), populate the MAC
address in the system-registers device by peeking at the NICInfo before
it's assigned to the device.
Generate the MAC address early, if there is no matching -nic option.
Otherwise the MAC address wouldn't be generated until net_client_init1()
runs.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Fri, 20 Oct 2023 00:13:08 +0000 (01:13 +0100)]
hw/net/lan9118: use qemu_configure_nic_device()
Some callers instantiate the device unconditionally, others will do so only
if there is a NICInfo to go with it. This appears to be fairly random, but
preseve the existing behaviour for now.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Fri, 20 Oct 2023 00:06:52 +0000 (01:06 +0100)]
hw/net/smc91c111: use qemu_configure_nic_device()
Some callers instantiate the device unconditionally, others will do so only
if there is a NICInfo to go with it. This appears to be fairly random, but
preserve the existing behaviour for now.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Sat, 21 Oct 2023 20:45:10 +0000 (21:45 +0100)]
hw/sh4/r2d: use pci_init_nic_devices()
Previously, the first PCI NIC would be assigned to slot 2 even if the
user override the model and made it something other than an rtl8139
which is the default. Everything else would be dynamically assigned.
Now, the first rtl8139 gets slot 2 and everything else is dynamic.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Sat, 21 Oct 2023 20:41:37 +0000 (21:41 +0100)]
hw/ppc/spapr: use qemu_get_nic_info() and pci_init_nic_devices()
Avoid directly referencing nd_table[] by first instantiating any
spapr-vlan devices using a qemu_get_nic_info() loop, then calling
pci_init_nic_devices() to do the rest.
No functional change intended.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Sat, 21 Oct 2023 20:38:05 +0000 (21:38 +0100)]
hw/ppc/prep: use pci_init_nic_devices()
Previously, the first PCI NIC would be placed in PCI slot 3 and the rest
would be dynamically assigned. Even if the user overrode the default NIC
type and made it something other than PCNet.
Now, the first PCNet NIC (that is, anything not explicitly specified
to be anything different) will go to slot 3 even if it isn't the first
NIC specified on the commnd line. And anything else will be dynamically
assigned.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Sat, 21 Oct 2023 20:30:51 +0000 (21:30 +0100)]
hw/mips/malta: use pci_init_nic_devices()
The Malta board setup code would previously place the first NIC into PCI
slot 11 if was a PCNet card, and the rest (including the first if it was
anything other than a PCNet card) would be dynamically assigned.
Now it will place any PCNet NIC into slot 11, and then anything else will
be dynamically assigned.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Sat, 21 Oct 2023 20:26:19 +0000 (21:26 +0100)]
hw/mips/fuloong2e: use pci_init_nic_devices()
The previous behaviour was: *if* the first NIC specified on the command
line was an RTL8139 (or unspecified model) then it gets assigned to PCI
slot 7, which is where the Fuloong board had an RTL8139. All other
devices (including the first, if it was specified a anything other then
an rtl8319) get dynamically assigned on the bus.
The new behaviour is subtly different: If the first NIC was given a
specific model *other* than rtl8139, and a subsequent NIC was not,
then the rtl8139 (or unspecified) NIC will go to slot 7 and the rest
will be dynamically assigned.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Tue, 17 Oct 2023 16:53:58 +0000 (17:53 +0100)]
hw/xen: use qemu_create_nic_bus_devices() to instantiate Xen NICs
When instantiating XenBus itself, for each NIC which is configured with
either the model unspecified, or set to to "xen" or "xen-net-device",
create a corresponding xen-net-device for it.
Now we can launch emulated Xen guests with '-nic user', and this fixes
the setup for Xen PV guests, which was previously broken in various
ways and never actually managed to peer with the netdev.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Thu, 19 Oct 2023 23:07:45 +0000 (00:07 +0100)]
hw/i386/pc: use qemu_get_nic_info() and pci_init_nic_devices()
Eliminate direct access to nd_table[] and nb_nics by processing the the
ISA NICs first and then calling pci_init_nic_devices() for the test.
It's important to do this *before* the subsequent patch which registers
the Xen PV network devices, because the code being remove here didn't
check whether nd->instantiated was already set before using each entry.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Sun, 22 Oct 2023 08:13:41 +0000 (09:13 +0100)]
net: add qemu_create_nic_bus_devices()
This will instantiate any NICs which live on a given bus type. Each bus
is allowed *one* substitution (for PCI it's virtio → virtio-net-pci, for
Xen it's xen → xen-net-device; no point in overengineering it unless we
actually want more).
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Sat, 21 Oct 2023 22:09:38 +0000 (23:09 +0100)]
net: report list of available models according to platform
By noting the models for which a configuration was requested, we can give
the user an accurate list of which NIC models were actually available on
the platform/configuration that was otherwise chosen.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Most code which directly accesses nd_table[] and nb_nics uses them for
one of two things. Either "I have created a NIC device and I'd like a
configuration for it", or "I will create a NIC device *if* there is a
configuration for it". With some variants on the theme around whether
they actually *check* if the model specified in the configuration is
the right one.
Provide functions which perform both of those, allowing platforms to
be a little more consistent and as a step towards making nd_table[]
and nb_nics private to the net code.
Also export the qemu_find_nic_info() helper, as some platforms have
special cases they need to handle.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Thu, 19 Oct 2023 14:30:23 +0000 (15:30 +0100)]
docs: update Xen-on-KVM documentation
Add notes about console and network support, and how to launch PV guests.
Clean up the disk configuration examples now that that's simpler, and
remove the comment about IDE unplug on q35/AHCI now that it's fixed.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Thu, 19 Oct 2023 11:56:42 +0000 (12:56 +0100)]
xen-platform: unplug AHCI disks
To support Xen guests using the Q35 chipset, the unplug protocol needs
to also remove AHCI disks.
Make pci_xen_ide_unplug() more generic, iterating over the children
of the PCI device and destroying the "ide-hd" devices. That works the
same for both AHCI and IDE, as does the detection of the primary disk
as unit 0 on the bus named "ide.0".
Then pci_xen_ide_unplug() can be used for both AHCI and IDE devices.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Tue, 17 Oct 2023 20:59:03 +0000 (21:59 +0100)]
net: do not delete nics in net_cleanup()
In net_cleanup() we only need to delete the netdevs, as those may have
state which outlives Qemu when it exits, and thus may actually need to
be cleaned up on exit.
The nics, on the other hand, are owned by the device which created them.
Most devices don't bother to clean up on exit because they don't have
any state which will outlive Qemu... but XenBus devices do need to clean
up their nodes in XenStore, and do have an exit handler to delete them.
When the XenBus exit handler destroys the xen-net-device, it attempts
to delete its nic after net_cleanup() had already done so. And crashes.
Fix this by only deleting netdevs as we walk the list. As the comment
notes, we can't use QTAILQ_FOREACH_SAFE() as each deletion may remove
*multiple* entries, including the "safely" saved 'next' pointer. But
we can store the *previous* entry, since nics are safe.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Wed, 18 Oct 2023 23:26:42 +0000 (00:26 +0100)]
hw/xen: handle soft reset for primary console
On soft reset, the prinary console event channel needs to be rebound to
the backend port (in the xen-console driver). We could put that into the
xen-console driver itself, but it's slightly less ugly to keep it within
the KVM/Xen code, by stashing the backend port# on event channel reset
and then rebinding in the primary console reset when it has to recreate
the guest port anyway.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Mon, 16 Oct 2023 15:00:23 +0000 (16:00 +0100)]
hw/xen: add support for Xen primary console in emulated mode
The primary console is special because the toolstack maps a page at a
fixed GFN and also allocates the guest-side event channel. Add support
for that in emulated mode, so that we can have a primary console.
Add a *very* rudimentary stub of foriegnmem ops for emulated mode, which
supports literally nothing except a single-page mapping of the console
page. This might as well have been a hack in the xen_console driver, but
this way at least the special-casing is kept within the Xen emulation
code, and it gives us a hook for a more complete implementation if/when
we ever do need one.
Now at last we can boot the Xen PV shim and run PV kernels in QEMU.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Mon, 16 Oct 2023 09:28:17 +0000 (10:28 +0100)]
hw/xen: do not repeatedly try to create a failing backend device
If xen_backend_device_create() fails to instantiate a device, the XenBus
code will just keep trying over and over again each time the bus is
re-enumerated, as long as the backend appears online and in
XenbusStateInitialising.
The only thing which prevents the XenBus code from recreating duplicates
of devices which already exist, is the fact that xen_device_realize()
sets the backend state to XenbusStateInitWait. If the attempt to create
the device doesn't get *that* far, that's when it will keep getting
retried.
My first thought was to handle errors by setting the backend state to
XenbusStateClosed, but that doesn't work for XenConsole which wants to
*ignore* any device of type != "ioemu" completely.
So, make xen_backend_device_create() *keep* the XenBackendInstance for a
failed device, and provide a new xen_backend_exists() function to allow
xen_bus_type_enumerate() to check whether one already exists before
creating a new one.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Sat, 14 Oct 2023 15:53:23 +0000 (16:53 +0100)]
hw/xen: add get_frontend_path() method to XenDeviceClass
The primary Xen console is special. The guest's side is set up for it by
the toolstack automatically and not by the standard PV init sequence.
Accordingly, its *frontend* doesn't appear in …/device/console/0 either;
instead it appears under …/console in the guest's XenStore node.
To allow the Xen console driver to override the frontend path for the
primary console, add a method to the XenDeviceClass which can be used
instead of the standard xen_device_get_frontend_path()
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Mon, 16 Oct 2023 12:01:39 +0000 (13:01 +0100)]
hw/xen: automatically assign device index to block devices
There's no need to force the user to assign a vdev. We can automatically
assign one, starting at xvda and searching until we find the first disk
name that's unused.
This means we can now allow '-drive if=xen,file=xxx' to work without an
explicit separate -driver argument, just like if=virtio.
Rip out the legacy handling from the xenpv machine, which was scribbling
over any disks configured by the toolstack, and didn't work with anything
but raw images.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Kevin Wolf <kwolf@redhat.com>
David Woodhouse [Thu, 12 Oct 2023 09:59:45 +0000 (10:59 +0100)]
hw/xen: populate store frontend nodes with XenStore PFN/port
This is kind of redundant since without being able to get these through
some other method (HVMOP_get_param) the guest wouldn't be able to access
XenStore in order to find them. But Xen populates them, and it does
allow guests to *rebind* to the event channel port after a reset.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Wed, 2 Aug 2023 16:04:49 +0000 (17:04 +0100)]
hw/xen: Clean up event channel 'type_val' handling to use union
A previous implementation of this stuff used a 64-bit field for all of
the port information (vcpu/type/type_val) and did atomic exchanges on
them. When I implemented that in Qemu I regretted my life choices and
just kept it simple with locking instead.
So there's no need for the XenEvtchnPort to be so simplistic. We can
use a union for the pirq/virq/interdomain information, which lets us
keep a separate bit for the 'remote domain' in interdomain ports. A
single bit is enough since the only possible targets are loopback or
qemu itself.
So now we can ditch PORT_INFO_TYPEVAL_REMOTE_QEMU and the horrid
manual masking, although the in-memory representation is identical
so there's no change in the saved state ABI.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
David Woodhouse [Tue, 17 Oct 2023 12:34:18 +0000 (13:34 +0100)]
hw/xen: fix XenStore watch delivery to guest
When fire_watch_cb() found the response buffer empty, it would call
deliver_watch() to generate the XS_WATCH_EVENT message in the response
buffer and send an event channel notification to the guest… without
actually *copying* the response buffer into the ring. So there was
nothing for the guest to see. The pending response didn't actually get
processed into the ring until the guest next triggered some activity
from its side.
Add the missing call to put_rsp().
It might have been slightly nicer to call xen_xenstore_event() here,
which would *almost* have worked. Except for the fact that it calls
xen_be_evtchn_pending() to check that it really does have an event
pending (and clear the eventfd for next time). And under Xen it's
defined that setting that fd to O_NONBLOCK isn't guaranteed to work,
so the emu implementation follows suit.
This fixes Xen device hot-unplug.
Fixes: 0254c4d19df ("hw/xen: Add xenstore wire implementation and implementation stubs") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Wed, 18 Oct 2023 12:31:20 +0000 (13:31 +0100)]
hw/xen: don't clear map_track[] in xen_gnttab_reset()
The refcounts actually correspond to 'active_ref' structures stored in a
GHashTable per "user" on the backend side (mostly, per XenDevice).
If we zero map_track[] on reset, then when the backend drivers get torn
down and release their mapping we hit the assert(s->map_track[ref] != 0)
in gnt_unref().
So leave them in place. Each backend driver will disconnect and reconnect
as the guest comes back up again and reconnects, and it all works out OK
in the end as the old refs get dropped.
Fixes: de26b2619789 ("hw/xen: Implement soft reset for emulated gnttab") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Wed, 11 Oct 2023 23:06:26 +0000 (00:06 +0100)]
hw/xen: select kernel mode for per-vCPU event channel upcall vector
A guest which has configured the per-vCPU upcall vector may set the
HVM_PARAM_CALLBACK_IRQ param to fairly much anything other than zero.
For example, Linux v6.0+ after commit b1c3497e604 ("x86/xen: Add support
for HVMOP_set_evtchn_upcall_vector") will just do this after setting the
vector:
/* Trick toolstack to think we are enlightened. */
if (!cpu)
rc = xen_set_callback_via(1);
That's explicitly setting the delivery to GSI#1, but it's supposed to be
overridden by the per-vCPU vector setting. This mostly works in Qemu
*except* for the logic to enable the in-kernel handling of event channels,
which falsely determines that the kernel cannot accelerate GSI delivery
in this case.
Add a kvm_xen_has_vcpu_callback_vector() to report whether vCPU#0 has
the vector set, and use that in xen_evtchn_set_callback_param() to
enable the kernel acceleration features even when the param *appears*
to be set to target a GSI.
Preserve the Xen behaviour that when HVM_PARAM_CALLBACK_IRQ is set to
*zero* the event channel delivery is disabled completely. (Which is
what that bizarre guest behaviour is working round in the first place.)
Fixes: 91cce756179 ("hw/xen: Add xen_evtchn device for event channel emulation") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Wed, 11 Oct 2023 22:30:08 +0000 (23:30 +0100)]
i386/xen: fix per-vCPU upcall vector for Xen emulation
The per-vCPU upcall vector support had two problems. Firstly it was
using the wrong hypercall argument and would always return -EFAULT.
And secondly it was using the wrong ioctl() to pass the vector to
the kernel and thus the *kernel* would always return -EINVAL.
Linux doesn't (yet) use this mode so it went without decent testing
for a while.
Fixes: 105b47fdf2d0 ("i386/xen: implement HVMOP_set_evtchn_upcall_vector") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Wed, 30 Aug 2023 20:13:57 +0000 (21:13 +0100)]
hw/timer/hpet: fix IRQ routing in legacy support mode
The interrupt from timer 0 in legacy mode is supposed to go to IRQ 0 on
the i8259 and IRQ 2 on the I/O APIC. The generic x86 GSI handling can't
cope with IRQ numbers differing between the two chips (despite it also
being the case for PCI INTx routing), so add a special case for the HPET.
IRQ 2 isn't valid on the i8259; it's the cascade IRQ and would be
interpreted as spurious interrupt on the secondary PIC. So we can fix
up all attempts to deliver IRQ2, to actually deliver to IRQ0 on the PIC.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Fri, 10 Mar 2023 17:37:03 +0000 (17:37 +0000)]
intel-iommu: Report interrupt remapping faults, fix return value
A generic X86IOMMUClass->int_remap function should not return VT-d
specific values; fix it to return 0 if the interrupt was successfully
translated or -EINVAL if not.
The VTD_FR_IR_xxx values are supposed to be used to actually raise
faults through the fault reporting mechanism, so do that instead for
the case where the IRQ is actually being injected.
There is more work to be done here, as pretranslations for the KVM IRQ
routing table can't fault; an untranslatable IRQ should be handled in
userspace and the fault raised only when the IRQ actually happens (if
indeed the IRTE is still not valid at that time). But we can work on
that later; we can at least raise faults for the direct case.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Peter Xu <peterx@redhat.com>
Stefan Hajnoczi [Wed, 18 Oct 2023 10:21:15 +0000 (06:21 -0400)]
Merge tag 'pull-vfio-20231018' of https://github.com/legoater/qemu into staging
vfio queue:
* Support for VFIODisplay migration with ramfb
* Preliminary work for IOMMUFD support
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmUvlEYACgkQUaNDx8/7
# 7KFlaw//X2053de2eTdo38/UMSzi5ACWWn2j1iGQZf/3+J2LcdlixZarZr/2DN56
# 4axmwF6+GKozt5+EnvWtgodDn6U9iyMNaAB3CGBHFHsH8uqKeZd/Ii754q4Rcmy9
# ZufBOPWm9Ff7s2MMFiAZvso75jP2wuwVEe1YPRjeJnsNSNIJ6WZfemh3Sl96yRBb
# r38uqzqetKwl7HziMMWP3yb8v+dU8A9bqI1hf1FZGttfFz3XA+pmjXKA6XxdfiZF
# AAotu5x9w86a08sAlr/qVsZFLR37oQykkXM0D840DafJDyr5fbJiq8cwfOjMw9+D
# w6+udRm5KoBWPsvb/T3dR88GRMO22PChjH9Vjl51TstMNhdTxuKJTKhhSoUFZbXV
# 8CMjwfALk5ggIOyCk1LRd04ed+9qkqgcbw1Guy5pYnyPnY/X6XurxxaxS6Gemgtn
# UvgRYhSjio+LgHLO77IVkWJMooTEPzUTty2Zxa7ldbbE+utPUtsmac9+1m2pnpqk
# 5VQmB074QnsJuvf+7HPU6vYCzQWoXHsH1UY/A0fF7MPedNUAbVYzKrdGPyqEMqHy
# xbilAIaS3oO0pMT6kUpRv5c5vjbwkx94Nf/ii8fQVjWzPfCcaF3yEfaam62jMUku
# stySaRpavKIx2oYLlucBqeKaBGaUofk13gGTQlsFs8pKCOAV7r4=
# =s0fN
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 18 Oct 2023 04:16:06 EDT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [unknown]
# gpg: aka "Cédric Le Goater <clg@kaod.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-vfio-20231018' of https://github.com/legoater/qemu: (22 commits)
hw/vfio: add ramfb migration support
ramfb-standalone: add migration support
ramfb: add migration support
vfio/pci: Remove vfio_detach_device from vfio_realize error path
vfio/ccw: Remove redundant definition of TYPE_VFIO_CCW
vfio/ap: Remove pointless apdev variable
vfio/pci: Fix a potential memory leak in vfio_listener_region_add
vfio/common: Move legacy VFIO backend code into separate container.c
vfio/common: Introduce a global VFIODevice list
vfio/common: Store the parent container in VFIODevice
vfio/common: Introduce a per container device list
vfio/common: Move VFIO reset handler registration to a group agnostic function
vfio/ccw: Use vfio_[attach/detach]_device
vfio/ap: Use vfio_[attach/detach]_device
vfio/platform: Use vfio_[attach/detach]_device
vfio/pci: Introduce vfio_[attach/detach]_device
vfio/common: Extract out vfio_kvm_device_[add/del]_fd
vfio/common: Introduce vfio_container_add|del_section_window()
vfio/common: Propagate KVM_SET_DEVICE_ATTR error if any
vfio/common: Move IOMMU agnostic helpers to a separate file
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Wed, 18 Oct 2023 10:20:41 +0000 (06:20 -0400)]
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* build system and Python cleanups
* fix netbsd VM build
* allow non-relocatable installs
* allow using command line options to configure qemu-ga
* target/i386: check intercept for XSETBV
* target/i386: fix CPUID_HT exposure
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (32 commits)
configure: define "pkg-config" in addition to "pkgconfig"
meson: add a note on why we use config_host for program paths
meson-buildoptions: document the data at the top
configure, meson: use command line options to configure qemu-ga
configure: unify handling of several Debian cross containers
configure: move environment-specific defaults to config-meson.cross
configure: move target-specific defaults to an external machine file
configure: remove some dead cruft
configure: clean up PIE option handling
configure: clean up plugin option handling
configure, tests/tcg: simplify GDB conditionals
tests/tcg/arm: move non-SVE tests out of conditional
hw/remote: move stub vfu_object_set_bus_irq out of stubs/
hw/xen: cleanup sourcesets
configure: clean up handling of CFI option
meson, cutils: allow non-relocatable installs
meson: do not use set10
meson: do not build shaders by default
tracetool: avoid invalid escape in Python string
tests/vm: avoid invalid escape in Python string
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Marc-André Lureau [Mon, 9 Oct 2023 06:32:45 +0000 (10:32 +0400)]
ramfb: add migration support
Implementing RAMFB migration is quite straightforward. One caveat is to
treat the whole RAMFBCfg as a blob, since that's what is exposed to the
guest directly. This avoid having to fiddle with endianness issues if we
were to migrate fields individually as integers.
The devices using RAMFB will have to include ramfb_vmstate in their
migration description.
Eric Auger [Wed, 11 Oct 2023 20:09:34 +0000 (22:09 +0200)]
vfio/pci: Remove vfio_detach_device from vfio_realize error path
In vfio_realize, on the error path, we currently call
vfio_detach_device() after a successful vfio_attach_device.
While this looks natural, vfio_instance_finalize also induces
a vfio_detach_device(), and it seems to be the right place
instead as other resources are released there which happen
to be a prerequisite to a successful UNSET_CONTAINER.
So let's rely on the finalize vfio_detach_device call to free
all the relevant resources.
Zhenzhong Duan [Mon, 9 Oct 2023 02:20:46 +0000 (10:20 +0800)]
vfio/pci: Fix a potential memory leak in vfio_listener_region_add
When there is an failure in vfio_listener_region_add() and the section
belongs to a ram device, there is an inaccurate error report which should
never be related to vfio_dma_map failure. The memory holding err is also
incrementally leaked in each failure.
Fix it by reporting the real error and free it.
Fixes: 567b5b309ab ("vfio/pci: Relax DMA map errors for MMIO regions") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Yi Liu [Mon, 9 Oct 2023 09:09:17 +0000 (11:09 +0200)]
vfio/common: Move legacy VFIO backend code into separate container.c
Move all the code really dependent on the legacy VFIO container/group
into a separate file: container.c. What does remain in common.c is
the code related to VFIOAddressSpace, MemoryListeners, migration and
all other general operations.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Zhenzhong Duan [Mon, 9 Oct 2023 09:09:16 +0000 (11:09 +0200)]
vfio/common: Introduce a global VFIODevice list
Some functions iterate over all the VFIODevices. This is currently
achieved by iterating over all groups/devices. Let's
introduce a global list of VFIODevices simplifying that scan.
This will also be useful while migrating to IOMMUFD by hiding the
group specificity.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Zhenzhong Duan [Mon, 9 Oct 2023 09:09:15 +0000 (11:09 +0200)]
vfio/common: Store the parent container in VFIODevice
let's store the parent contaienr within the VFIODevice.
This simplifies the logic in vfio_viommu_preset() and
brings the benefice to hide the group specificity which
is useful for IOMMUFD migration.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Zhenzhong Duan [Mon, 9 Oct 2023 09:09:14 +0000 (11:09 +0200)]
vfio/common: Introduce a per container device list
Several functions need to iterate over the VFIO devices attached to
a given container. This is currently achieved by iterating over the
groups attached to the container and then over the devices in the group.
Let's introduce a per container device list that simplifies this
search.
Per container list is used in below functions:
vfio_devices_all_dirty_tracking
vfio_devices_all_device_dirty_tracking
vfio_devices_all_running_and_mig_active
vfio_devices_dma_logging_stop
vfio_devices_dma_logging_start
vfio_devices_query_dirty_bitmap
This will also ease the migration of IOMMUFD by hiding the group
specificity.
Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Zhenzhong Duan [Mon, 9 Oct 2023 09:09:13 +0000 (11:09 +0200)]
vfio/common: Move VFIO reset handler registration to a group agnostic function
Move the reset handler registration/unregistration to a place that is not
group specific. vfio_[get/put]_address_space are the best places for that
purpose.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Eric Auger [Mon, 9 Oct 2023 09:09:12 +0000 (11:09 +0200)]
vfio/ccw: Use vfio_[attach/detach]_device
Let the vfio-ccw device use vfio_attach_device() and
vfio_detach_device(), hence hiding the details of the used
IOMMU backend.
Note that the migration reduces the following trace
"vfio: subchannel %s has already been attached" (featuring
cssid.ssid.devid) into "device is already attached"
Also now all the devices have been migrated to use the new
vfio_attach_device/vfio_detach_device API, let's turn the
legacy functions into static functions, local to container.c.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Eric Auger [Mon, 9 Oct 2023 09:09:11 +0000 (11:09 +0200)]
vfio/ap: Use vfio_[attach/detach]_device
Let the vfio-ap device use vfio_attach_device() and
vfio_detach_device(), hence hiding the details of the used
IOMMU backend.
We take the opportunity to use g_path_get_basename() which
is prefered, as suggested by 3e015d815b ("use g_path_get_basename instead of basename")
Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Eric Auger [Mon, 9 Oct 2023 09:09:10 +0000 (11:09 +0200)]
vfio/platform: Use vfio_[attach/detach]_device
Let the vfio-platform device use vfio_attach_device() and
vfio_detach_device(), hence hiding the details of the used
IOMMU backend.
Drop the trace event for vfio-platform as we have similar
one in vfio_attach_device.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Eric Auger [Mon, 9 Oct 2023 09:09:09 +0000 (11:09 +0200)]
vfio/pci: Introduce vfio_[attach/detach]_device
We want the VFIO devices to be able to use two different
IOMMU backends, the legacy VFIO one and the new iommufd one.
Introduce vfio_[attach/detach]_device which aim at hiding the
underlying IOMMU backend (IOCTLs, datatypes, ...).
Once vfio_attach_device completes, the device is attached
to a security context and its fd can be used. Conversely
When vfio_detach_device completes, the device has been
detached from the security context.
At the moment only the implementation based on the legacy
container/group exists. Let's use it from the vfio-pci device.
Subsequent patches will handle other devices.
We also take benefit of this patch to properly free
vbasedev->name on failure.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Zhenzhong Duan [Mon, 9 Oct 2023 09:09:08 +0000 (11:09 +0200)]
vfio/common: Extract out vfio_kvm_device_[add/del]_fd
Introduce two new helpers, vfio_kvm_device_[add/del]_fd
which take as input a file descriptor which can be either a group fd or
a cdev fd. This uses the new KVM_DEV_VFIO_FILE VFIO KVM device group,
which aliases to the legacy KVM_DEV_VFIO_GROUP.
vfio_kvm_device_[add/del]_group then call those new helpers.
Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Introduce helper functions that isolate the code used for
VFIO_SPAPR_TCE_v2_IOMMU.
Those helpers hide implementation details beneath the container object
and make the vfio_listener_region_add/del() implementations more
readable. No code change intended.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Eric Auger [Mon, 9 Oct 2023 09:09:06 +0000 (11:09 +0200)]
vfio/common: Propagate KVM_SET_DEVICE_ATTR error if any
In the VFIO_SPAPR_TCE_v2_IOMMU container case, when
KVM_SET_DEVICE_ATTR fails, we currently don't propagate the
error as we do on the vfio_spapr_create_window() failure
case. Let's align the code. Take the opportunity to
reword the error message and make it more explicit.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Yi Liu [Mon, 9 Oct 2023 09:09:05 +0000 (11:09 +0200)]
vfio/common: Move IOMMU agnostic helpers to a separate file
Move low-level iommu agnostic helpers to a separate helpers.c
file. They relate to regions, interrupts, device/region
capabilities and etc.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Eric Auger [Mon, 9 Oct 2023 09:09:03 +0000 (11:09 +0200)]
scripts/update-linux-headers: Add iommufd.h
Update the script to import iommufd.h
Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Paolo Bonzini [Tue, 17 Oct 2023 15:32:50 +0000 (17:32 +0200)]
configure: define "pkg-config" in addition to "pkgconfig"
Meson used to allow both "pkgconfig" and "pkg-config" entries in machine
files; the former was used for dependency lookup and the latter
was used as return value for "find_program('pkg-config')", which is a less
common use-case and one that QEMU does not need.
This inconsistency is going to be fixed by Meson 1.3, which will deprecate
"pkgconfig" in favor of "pkg-config" (the less common one, but it makes
sense because it matches the name of the binary). For backward
compatibility it is still allowed to define both, so do that in the
configure-generated machine file.
Related: https://github.com/mesonbuild/meson/pull/12385 Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Mon, 9 Oct 2023 12:03:56 +0000 (14:03 +0200)]
configure: unify handling of several Debian cross containers
The Debian and GNU architecture names match very often, even though
there are common cases (32-bit Arm or 64-bit x86) where they do not
and other cases in which the GNU triplet is actually a quadruplet.
But it is still possible to group the common case into a single
case inside probe_target_compiler.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Mon, 16 Oct 2023 06:20:13 +0000 (08:20 +0200)]
configure: move environment-specific defaults to config-meson.cross
Store the -Werror and SMBD defaults in the machine file, which still allows
them to be overridden on the command line and enables automatic parsing
of the related options.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>