www.infradead.org Git - users/dwmw2/qemu.git/log
18 months ago [xenfv-nic-1]
David Woodhouse [Sun, 22 Oct 2023 15:31:25 +0000 (16:31 +0100)]
net: make nb_nics and nd_table[] static in net/net.c

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sun, 22 Oct 2023 13:47:06 +0000 (14:47 +0100)]
net: remove qemu_show_nic_models(), qemu_find_nic_model()

These old functions can be removed now too. Let net_param_nic() print
the full set of network devices directly, and also make it note that a
list more specific to this platform/config will be available by using
'-nic model=help' instead.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sun, 22 Oct 2023 13:05:17 +0000 (14:05 +0100)]
hw/pci: remove pci_nic_init_nofail()

This function is no longer used.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 22:12:19 +0000 (23:12 +0100)]
net: remove qemu_check_nic_model()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:53:56 +0000 (21:53 +0100)]
hw/xtensa/xtfpga: use qemu_create_nic_device()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:46:36 +0000 (21:46 +0100)]
hw/sparc/sun4m: use qemu_configure_nic_device()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:45:01 +0000 (21:45 +0100)]
hw/s390x/s390-virtio-ccw: use qemu_create_nic_device()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:44:02 +0000 (21:44 +0100)]
hw/riscv: use qemu_configure_nic_device()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:37:19 +0000 (21:37 +0100)]
hw/openrisc/openrisc_sim: use qemu_create_nic_device()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:37:02 +0000 (21:37 +0100)]
hw/net/lasi_i82596: use qemu_configure_nic_device()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:34:41 +0000 (21:34 +0100)]
hw/mips: use qemu_create_nic_device()

The Jazz and MIPS SIM platforms both instantiate their NIC only if a
corresponding configuration exists for it. Convert them to use the
qemu_create_nic_device() function for that.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:26:12 +0000 (21:26 +0100)]
hw/microblaze: use qemu_configure_nic_device()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:24:06 +0000 (21:24 +0100)]
hw/m68k/q800: use qemu_configure_nic_device()

Then fetch the MAC that was assigned, if any. And assign one if not,
ensuring that it uses the Apple OUI.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:23:13 +0000 (21:23 +0100)]
hw/m68k/mcf5208: use qemu_create_nic_device()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:22:29 +0000 (21:22 +0100)]
hw/net/etraxfs-eth: use qemu_configure_nic_device()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:19:44 +0000 (21:19 +0100)]
hw/arm: use qemu_configure_nic_device()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Fri, 20 Oct 2023 10:46:56 +0000 (11:46 +0100)]
hw/arm/stellaris: use qemu_find_nic_info()

Rather than just using qemu_configure_nic_device(), populate the MAC
address in the system-registers device by peeking at the NICInfo before
it's assigned to the device.

Generate the MAC address early, if there is no matching -nic option.
Otherwise the MAC address wouldn't be generated until net_client_init1()
runs.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Fri, 20 Oct 2023 10:40:54 +0000 (11:40 +0100)]
hw/arm/npcm7xx: use qemu_configure_nic_device, allow emc0/emc1 as aliases

Also update the test to specify which device to attach the test socket
to, and remove the comment lamenting the fact that we can't do so.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Fri, 20 Oct 2023 00:18:35 +0000 (01:18 +0100)]
hw/arm/highbank: use qemu_create_nic_device()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Fri, 20 Oct 2023 00:13:08 +0000 (01:13 +0100)]
hw/net/lan9118: use qemu_configure_nic_device()

Some callers instantiate the device unconditionally, others will do so only
if there is a NICInfo to go with it. This appears to be fairly random, but
preserve the existing behaviour for now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Fri, 20 Oct 2023 00:06:52 +0000 (01:06 +0100)]
hw/net/smc91c111: use qemu_configure_nic_device()

Some callers instantiate the device unconditionally, others will do so only
if there is a NICInfo to go with it. This appears to be fairly random, but
preserve the existing behaviour for now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Thu, 19 Oct 2023 23:39:59 +0000 (00:39 +0100)]
hw/arm/fsl: use qemu_configure_nic_device()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Thu, 19 Oct 2023 23:39:20 +0000 (00:39 +0100)]
hw/arm/exynos4: use qemu_create_nic_device()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Thu, 19 Oct 2023 23:38:31 +0000 (00:38 +0100)]
hw/arm/aspeed: use qemu_configure_nic_device()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Thu, 19 Oct 2023 23:37:57 +0000 (00:37 +0100)]
hw/arm/allwinner: use qemu_configure_nic_device()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:53:36 +0000 (21:53 +0100)]
hw/xtensa/virt: use pci_init_nic_devices()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:49:18 +0000 (21:49 +0100)]
hw/sparc64/sun4u: use pci_init_nic_devices()

The first sunhme NIC gets placed at function 1 on slot 1 of PCI bus A,
and the rest are dynamically assigned on PCI bus B.

Previously, any PCI NIC would get the special treatment purely by
virtue of being first in the list.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:45:10 +0000 (21:45 +0100)]
hw/sh4/r2d: use pci_init_nic_devices()

Previously, the first PCI NIC would be assigned to slot 2 even if the
user overrode the model and made it something other than an rtl8139,
which is the default. Everything else would be dynamically assigned.

Now, the first rtl8139 gets slot 2 and everything else is dynamic.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:43:33 +0000 (21:43 +0100)]
hw/ppc: use pci_init_nic_devices()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:41:37 +0000 (21:41 +0100)]
hw/ppc/spapr: use qemu_get_nic_info() and pci_init_nic_devices()

Avoid directly referencing nd_table[] by first instantiating any
spapr-vlan devices using a qemu_get_nic_info() loop, then calling
pci_init_nic_devices() to do the rest.

No functional change intended.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:38:05 +0000 (21:38 +0100)]
hw/ppc/prep: use pci_init_nic_devices()

Previously, the first PCI NIC would be placed in PCI slot 3 and the rest
would be dynamically assigned, even if the user overrode the default NIC
type and made it something other than PCNet.

Now, the first PCNet NIC (that is, anything not explicitly specified
to be anything different) will go to slot 3 even if it isn't the first
NIC specified on the command line. And anything else will be dynamically
assigned.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:34:11 +0000 (21:34 +0100)]
hw/mips/loongson3_virt: use pci_init_nic_devices()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:30:51 +0000 (21:30 +0100)]
hw/mips/malta: use pci_init_nic_devices()

The Malta board setup code would previously place the first NIC into PCI
slot 11 if it was a PCNet card, and the rest (including the first if it
was anything other than a PCNet card) would be dynamically assigned.

Now it will place any PCNet NIC into slot 11, and then anything else will
be dynamically assigned.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:26:19 +0000 (21:26 +0100)]
hw/mips/fuloong2e: use pci_init_nic_devices()

The previous behaviour was: *if* the first NIC specified on the command
line was an RTL8139 (or unspecified model) then it gets assigned to PCI
slot 7, which is where the Fuloong board had an RTL8139. All other
devices (including the first, if it was specified as anything other than
an rtl8139) get dynamically assigned on the bus.

The new behaviour is subtly different: If the first NIC was given a
specific model *other* than rtl8139, and a subsequent NIC was not,
then the rtl8139 (or unspecified) NIC will go to slot 7 and the rest
will be dynamically assigned.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
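
Note: the slot selection described above can be modelled in a few lines of C. This is only a minimal sketch under stated assumptions: ToyNic, toy_assign_slots() and TOY_SLOT_DYNAMIC are hypothetical names made up for illustration and are not the QEMU pci_init_nic_in_slot() or pci_init_nic_devices() code. It shows the first NIC whose model matches the board's fixed model (or is unspecified) taking the fixed slot, wherever it appears in the list, with every other NIC left for dynamic assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_SLOT_DYNAMIC (-1)

/* Hypothetical NIC request: model is NULL when the user gave none. */
typedef struct ToyNic {
    const char *model;
    int slot;            /* filled in by toy_assign_slots() */
} ToyNic;

/*
 * Give the first NIC whose model is `fixed_model` (or unspecified) the
 * board's fixed slot; mark every other NIC for dynamic assignment.
 */
static void toy_assign_slots(ToyNic *nics, size_t n,
                             const char *fixed_model, int fixed_slot)
{
    int fixed_taken = 0;

    assert(fixed_model != NULL);
    for (size_t i = 0; i < n; i++) {
        const char *m = nics[i].model;

        if (!fixed_taken && (m == NULL || strcmp(m, fixed_model) == 0)) {
            nics[i].slot = fixed_slot;
            fixed_taken = 1;
        } else {
            nics[i].slot = TOY_SLOT_DYNAMIC;
        }
    }
}
```

With the list { e1000, unspecified, rtl8139 } and a fixed model of "rtl8139" in slot 7, the second entry takes slot 7 and the other two are marked dynamic.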
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:21:03 +0000 (21:21 +0100)]
hw/loongarch: use pci_init_nic_devices()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:20:34 +0000 (21:20 +0100)]
hw/hppa: use pci_init_nic_devices()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sat, 21 Oct 2023 20:19:27 +0000 (21:19 +0100)]
hw/arm/virt: use pci_init_nic_devices()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Fri, 20 Oct 2023 10:45:58 +0000 (11:45 +0100)]
hw/arm/sbsa-ref: use pci_init_nic_devices()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Fri, 20 Oct 2023 00:15:45 +0000 (01:15 +0100)]
hw/alpha/dp264: use pci_init_nic_devices()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Tue, 17 Oct 2023 16:53:58 +0000 (17:53 +0100)]
hw/xen: use qemu_create_nic_bus_devices() to instantiate Xen NICs

When instantiating XenBus itself, for each NIC which is configured with
either the model unspecified, or set to "xen" or "xen-net-device",
create a corresponding xen-net-device for it.

Now we can launch emulated Xen guests with '-nic user', and this fixes
the setup for Xen PV guests, which was previously broken in various
ways and never actually managed to peer with the netdev.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Thu, 19 Oct 2023 23:07:45 +0000 (00:07 +0100)]
hw/i386/pc: use qemu_get_nic_info() and pci_init_nic_devices()

Eliminate direct access to nd_table[] and nb_nics by processing the
ISA NICs first and then calling pci_init_nic_devices() for the rest.

It's important to do this *before* the subsequent patch which registers
the Xen PV network devices, because the code being removed here didn't
check whether nd->instantiated was already set before using each entry.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Fri, 20 Oct 2023 00:05:31 +0000 (01:05 +0100)]
hw/pci: add pci_init_nic_devices(), pci_init_nic_in_slot()

The loop over nd_table[] to add PCI NICs is repeated in quite a few
places. Add a helper function to do it.

Some platforms also try to instantiate a specific model in a specific
slot, to match the real hardware. Add pci_init_nic_in_slot() for that
purpose.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Sun, 22 Oct 2023 08:13:41 +0000 (09:13 +0100)]
net: add qemu_create_nic_bus_devices()

This will instantiate any NICs which live on a given bus type. Each bus
is allowed *one* substitution (for PCI it's virtio → virtio-net-pci, for
Xen it's xen → xen-net-device; no point in overengineering it unless we
actually want more).

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
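
Note: the "one substitution per bus" rule can be sketched as a single string comparison. toy_resolve_model() is a hypothetical helper invented for this illustration; only the alias pairs (virtio to virtio-net-pci, xen to xen-net-device) come from the commit message, and the real qemu_create_nic_bus_devices() may be structured quite differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Resolve a user-supplied model name for one bus type. `alias` is the single
 * permitted short name and `alias_target` the device name it stands for.
 * Any other model name is passed through unchanged.
 */
static const char *toy_resolve_model(const char *model,
                                     const char *alias,
                                     const char *alias_target)
{
    assert(model != NULL);
    if (alias != NULL && strcmp(model, alias) == 0) {
        return alias_target;
    }
    return model;
}
```

Calling toy_resolve_model("virtio", "virtio", "virtio-net-pci") returns "virtio-net-pci", while a model such as "e1000" comes back untouched.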
18 months ago
David Woodhouse [Sat, 21 Oct 2023 22:09:38 +0000 (23:09 +0100)]
net: report list of available models according to platform

By noting the models for which a configuration was requested, we can give
the user an accurate list of which NIC models were actually available on
the platform/configuration that was otherwise chosen.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Thu, 19 Oct 2023 20:28:29 +0000 (21:28 +0100)]
net: add qemu_{configure,create}_nic_device(), qemu_find_nic_info()

Most code which directly accesses nd_table[] and nb_nics uses them for
one of two things. Either "I have created a NIC device and I'd like a
configuration for it", or "I will create a NIC device *if* there is a
configuration for it".  With some variants on the theme around whether
they actually *check* if the model specified in the configuration is
the right one.

Provide functions which perform both of those, allowing platforms to
be a little more consistent and as a step towards making nd_table[]
and nb_nics private to the net code.

Also export the qemu_find_nic_info() helper, as some platforms have
special cases they need to handle.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
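
Note: the two usage patterns named in this commit message can be illustrated with a toy configuration table. Everything here is an assumption for the sake of the example: ToyNicConfig and the toy_*() functions are hypothetical and do not reproduce the signatures or matching rules of qemu_find_nic_info(), qemu_configure_nic_device() or qemu_create_nic_device().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a NIC configuration entry. */
typedef struct ToyNicConfig {
    const char *model;   /* NULL means "model unspecified" */
    bool used;           /* set once a device has claimed this entry */
} ToyNicConfig;

/* Find the first unclaimed entry whose model matches or is unspecified. */
static ToyNicConfig *toy_find_config(ToyNicConfig *table, size_t n,
                                     const char *model)
{
    assert(model != NULL);
    for (size_t i = 0; i < n; i++) {
        if (table[i].used) {
            continue;
        }
        if (table[i].model == NULL || strcmp(table[i].model, model) == 0) {
            return &table[i];
        }
    }
    return NULL;
}

/* Pattern 1: the device already exists; claim a configuration if any. */
static bool toy_configure_device(ToyNicConfig *table, size_t n,
                                 const char *model)
{
    ToyNicConfig *cfg = toy_find_config(table, n, model);

    if (cfg == NULL) {
        return false;    /* the device stays, just without a configuration */
    }
    cfg->used = true;
    return true;
}

/* Pattern 2: create the device only if a configuration is waiting for it. */
static bool toy_create_device(ToyNicConfig *table, size_t n,
                              const char *model, int *devices_created)
{
    if (!toy_configure_device(table, n, model)) {
        return false;    /* no configuration, so no device is created */
    }
    (*devices_created)++;
    return true;
}
```

The only difference between the two is who decides whether the device exists: in the first pattern the board code has already created it, in the second the presence of a matching configuration is the trigger.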
18 months ago
David Woodhouse [Fri, 20 Oct 2023 17:00:18 +0000 (18:00 +0100)]
hw/xen: use correct default protocol for xen-block on x86

Even on x86_64 the default protocol is the x86-32 one if the guest doesn't
specifically ask for x86-64.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Thu, 19 Oct 2023 14:30:23 +0000 (15:30 +0100)]
docs: update Xen-on-KVM documentation

Add notes about console and network support, and how to launch PV guests.
Clean up the disk configuration examples now that that's simpler, and
remove the comment about IDE unplug on q35/AHCI now that it's fixed.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Thu, 19 Oct 2023 11:56:42 +0000 (12:56 +0100)]
xen-platform: unplug AHCI disks

To support Xen guests using the Q35 chipset, the unplug protocol needs
to also remove AHCI disks.

Make pci_xen_ide_unplug() more generic, iterating over the children
of the PCI device and destroying the "ide-hd" devices. That works the
same for both AHCI and IDE, as does the detection of the primary disk
as unit 0 on the bus named "ide.0".

Then pci_xen_ide_unplug() can be used for both AHCI and IDE devices.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Thu, 19 Oct 2023 08:30:08 +0000 (09:30 +0100)]
tests/avocado: switch to using xen-net-device for Xen guest tests

Fix the filename in the MAINTAINERS file too.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Tue, 17 Oct 2023 20:59:03 +0000 (21:59 +0100)]
net: do not delete nics in net_cleanup()

In net_cleanup() we only need to delete the netdevs, as those may have
state which outlives Qemu when it exits, and thus may actually need to
be cleaned up on exit.

The nics, on the other hand, are owned by the device which created them.
Most devices don't bother to clean up on exit because they don't have
any state which will outlive Qemu... but XenBus devices do need to clean
up their nodes in XenStore, and do have an exit handler to delete them.

When the XenBus exit handler destroys the xen-net-device, it attempts
to delete its nic after net_cleanup() had already done so. And crashes.

Fix this by only deleting netdevs as we walk the list. As the comment
notes, we can't use QTAILQ_FOREACH_SAFE() as each deletion may remove
*multiple* entries, including the "safely" saved 'next' pointer. But
we can store the *previous* entry, since nics are safe.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
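
Note: the list walk described above, where one deletion may unlink several entries and the walk resumes from the last surviving nic, might look roughly like the following. This is a sketch on a hypothetical singly linked list (ToyClient, toy_delete_group(), toy_cleanup()); the real code walks a QTAILQ of NetClientState and differs in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list node; `group` ties together netdevs removed as one. */
typedef struct ToyClient {
    struct ToyClient *next;
    bool is_nic;
    int group;
    bool deleted;
} ToyClient;

/* Unlink every netdev sharing `group`: one deletion, several entries. */
static void toy_delete_group(ToyClient **head, int group)
{
    ToyClient **pp = head;

    while (*pp != NULL) {
        ToyClient *c = *pp;

        if (!c->is_nic && c->group == group) {
            *pp = c->next;       /* unlink from the list */
            c->deleted = true;
            c->next = NULL;
        } else {
            pp = &c->next;
        }
    }
}

/* Delete all netdevs, leaving nics for their owning devices to remove. */
static void toy_cleanup(ToyClient **head)
{
    ToyClient *prev = NULL;      /* last nic stepped over; never deleted here */
    ToyClient *c;

    assert(head != NULL);
    c = *head;
    while (c != NULL) {
        if (c->is_nic) {
            prev = c;
            c = c->next;
            continue;
        }
        toy_delete_group(head, c->group);
        /* c->next is stale now; resume from the surviving predecessor. */
        c = prev ? prev->next : *head;
    }
}
```

Saving the next pointer would not be safe here because the next entry may belong to the same group and be unlinked by the same deletion. The previous nic is safe precisely because this walk never deletes nics.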
18 months ago
David Woodhouse [Tue, 17 Oct 2023 12:58:03 +0000 (13:58 +0100)]
hw/xen: update Xen PV NIC to XenDevice model

This allows us to use Xen PV networking with emulated Xen guests, and to
add them on the command line or hotplug.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Tue, 17 Oct 2023 12:32:50 +0000 (13:32 +0100)]
hw/xen: only remove peers of PCI NICs on unplug

When the Xen guest asks to unplug *emulated* NICs, it's kind of unhelpful
also to unplug the peer of the *Xen* PV NIC.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Wed, 18 Oct 2023 23:26:42 +0000 (00:26 +0100)]
hw/xen: handle soft reset for primary console

On soft reset, the primary console event channel needs to be rebound to
the backend port (in the xen-console driver). We could put that into the
xen-console driver itself, but it's slightly less ugly to keep it within
the KVM/Xen code, by stashing the backend port# on event channel reset
and then rebinding in the primary console reset when it has to recreate
the guest port anyway.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Mon, 16 Oct 2023 15:00:23 +0000 (16:00 +0100)]
hw/xen: add support for Xen primary console in emulated mode

The primary console is special because the toolstack maps a page at a
fixed GFN and also allocates the guest-side event channel. Add support
for that in emulated mode, so that we can have a primary console.

Add a *very* rudimentary stub of foreignmem ops for emulated mode, which
supports literally nothing except a single-page mapping of the console
page. This might as well have been a hack in the xen_console driver, but
this way at least the special-casing is kept within the Xen emulation
code, and it gives us a hook for a more complete implementation if/when
we ever do need one.

Now at last we can boot the Xen PV shim and run PV kernels in QEMU.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Tue, 17 Oct 2023 21:20:28 +0000 (22:20 +0100)]
hw/xen: update Xen console to XenDevice model

This allows (non-primary) console devices to be created on the command
line and hotplugged.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Mon, 16 Oct 2023 09:28:17 +0000 (10:28 +0100)]
hw/xen: do not repeatedly try to create a failing backend device

If xen_backend_device_create() fails to instantiate a device, the XenBus
code will just keep trying over and over again each time the bus is
re-enumerated, as long as the backend appears online and in
XenbusStateInitialising.

The only thing which prevents the XenBus code from recreating duplicates
of devices which already exist, is the fact that xen_device_realize()
sets the backend state to XenbusStateInitWait. If the attempt to create
the device doesn't get *that* far, that's when it will keep getting
retried.

My first thought was to handle errors by setting the backend state to
XenbusStateClosed, but that doesn't work for XenConsole which wants to
*ignore* any device of type != "ioemu" completely.

So, make xen_backend_device_create() *keep* the XenBackendInstance for a
failed device, and provide a new xen_backend_exists() function to allow
xen_bus_type_enumerate() to check whether one already exists before
creating a new one.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
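
Note: the fix amounts to remembering failed instances so that re-enumeration skips them. A minimal model of that idea follows, assuming a fixed-size registry; ToyRegistry and the toy_*() functions are hypothetical and are not the XenBackendInstance bookkeeping or the xen_backend_exists() signature used in QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_BACKENDS 8

typedef struct ToyBackend {
    const char *name;
    bool realized;       /* false if device creation failed */
} ToyBackend;

typedef struct ToyRegistry {
    ToyBackend entries[TOY_MAX_BACKENDS];
    size_t count;
    int create_attempts;
} ToyRegistry;

static bool toy_backend_exists(const ToyRegistry *r, const char *name)
{
    for (size_t i = 0; i < r->count; i++) {
        if (strcmp(r->entries[i].name, name) == 0) {
            return true;
        }
    }
    return false;
}

/* Record the instance whether or not creation succeeded. */
static bool toy_backend_create(ToyRegistry *r, const char *name, bool ok)
{
    assert(r->count < TOY_MAX_BACKENDS);
    r->create_attempts++;
    r->entries[r->count].name = name;
    r->entries[r->count].realized = ok;
    r->count++;
    return ok;
}

/* One enumeration pass: only try names that have no instance yet. */
static void toy_enumerate(ToyRegistry *r, const char *name, bool ok)
{
    if (!toy_backend_exists(r, name)) {
        toy_backend_create(r, name, ok);
    }
}
```

Enumerating the same failing name three times results in a single creation attempt, which is the behaviour the commit is after.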
18 months ago
David Woodhouse [Sat, 14 Oct 2023 15:53:23 +0000 (16:53 +0100)]
hw/xen: add get_frontend_path() method to XenDeviceClass

The primary Xen console is special. The guest's side is set up for it by
the toolstack automatically and not by the standard PV init sequence.

Accordingly, its *frontend* doesn't appear in …/device/console/0 either;
instead it appears under …/console in the guest's XenStore node.

To allow the Xen console driver to override the frontend path for the
primary console, add a method to the XenDeviceClass which can be used
instead of the standard xen_device_get_frontend_path().

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Mon, 16 Oct 2023 12:01:39 +0000 (13:01 +0100)]
hw/xen: automatically assign device index to block devices

There's no need to force the user to assign a vdev. We can automatically
assign one, starting at xvda and searching until we find the first disk
name that's unused.

This means we can now allow '-drive if=xen,file=xxx' to work without an
explicit separate -device argument, just like if=virtio.

Rip out the legacy handling from the xenpv machine, which was scribbling
over any disks configured by the toolstack, and didn't work with anything
but raw images.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Kevin Wolf <kwolf@redhat.com>
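
Note: "starting at xvda and searching until we find the first disk name that's unused" is a first-fit search. The sketch below assumes a simple in-use array and stops at xvdz to stay short; toy_assign_vdev() and TOY_MAX_DISKS are hypothetical, and the real xen-block code tracks used names differently and is not limited to 26 disks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_MAX_DISKS 26   /* xvda..xvdz only, to keep the sketch short */

/*
 * Claim the first unused disk index and write its name into `buf`.
 * Returns the index, or -1 if every name is already taken.
 */
static int toy_assign_vdev(bool used[TOY_MAX_DISKS], char *buf, size_t len)
{
    assert(buf != NULL && len >= sizeof("xvda"));
    for (int i = 0; i < TOY_MAX_DISKS; i++) {
        if (!used[i]) {
            used[i] = true;
            snprintf(buf, len, "xvd%c", 'a' + i);
            return i;
        }
    }
    return -1;
}
```

If xvda and xvdb are already taken, the first call yields index 2 with the name "xvdc", and the next call yields "xvdd".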
18 months ago
David Woodhouse [Thu, 12 Oct 2023 09:59:45 +0000 (10:59 +0100)]
hw/xen: populate store frontend nodes with XenStore PFN/port

This is kind of redundant since without being able to get these through
some other method (HVMOP_get_param) the guest wouldn't be able to access
XenStore in order to find them. But Xen populates them, and it does
allow guests to *rebind* to the event channel port after a reset.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Wed, 11 Oct 2023 22:50:02 +0000 (23:50 +0100)]
i386/xen: advertise XEN_HVM_CPUID_UPCALL_VECTOR in CPUID

This will allow Linux guests (since v6.0) to use the per-vCPU upcall
vector delivered as MSI through the local APIC.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Wed, 11 Oct 2023 22:47:30 +0000 (23:47 +0100)]
include: update Xen public headers to Xen 4.17.2 release

... in order to advertise the XEN_HVM_CPUID_UPCALL_VECTOR feature,
which will come in a subsequent commit.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Wed, 2 Aug 2023 16:04:49 +0000 (17:04 +0100)]
hw/xen: Clean up event channel 'type_val' handling to use union

A previous implementation of this stuff used a 64-bit field for all of
the port information (vcpu/type/type_val) and did atomic exchanges on
them. When I implemented that in Qemu I regretted my life choices and
just kept it simple with locking instead.

So there's no need for the XenEvtchnPort to be so simplistic. We can
use a union for the pirq/virq/interdomain information, which lets us
keep a separate bit for the 'remote domain' in interdomain ports. A
single bit is enough since the only possible targets are loopback or
qemu itself.

So now we can ditch PORT_INFO_TYPEVAL_REMOTE_QEMU and the horrid
manual masking, although the in-memory representation is identical
so there's no change in the saved state ABI.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
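
Note: the union layout described here can be sketched as below. The type and field names (ToyEvtchnPort, toy_bind_interdomain(), the 15/1 bit split) are assumptions for illustration rather than a copy of the XenEvtchnPort definition, and bit-field layout is implementation-defined in C, so the sketch checks only the field values and the overall size.

```c
#include <assert.h>
#include <stdint.h>

enum ToyPortType {
    TOY_PORT_FREE,
    TOY_PORT_VIRQ,
    TOY_PORT_PIRQ,
    TOY_PORT_INTERDOMAIN,
};

typedef struct ToyEvtchnPort {
    uint32_t vcpu;
    uint16_t type;
    union {
        uint16_t val;            /* raw 16-bit view of the same storage */
        uint16_t pirq;
        uint16_t virq;
        struct {
            uint16_t port:15;    /* remote port number */
            uint16_t to_qemu:1;  /* single bit: qemu itself or loopback */
        } interdomain;
    } u;
} ToyEvtchnPort;

/* Bind as an interdomain port without any manual masking. */
static void toy_bind_interdomain(ToyEvtchnPort *p, uint16_t remote_port,
                                 int to_qemu)
{
    assert(remote_port < (1u << 15));
    p->type = TOY_PORT_INTERDOMAIN;
    p->u.interdomain.port = remote_port;
    p->u.interdomain.to_qemu = to_qemu ? 1 : 0;
}
```

The union still occupies a single 16-bit slot, which is why a change of this kind can leave the stored size unchanged.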
18 months ago
David Woodhouse [Wed, 23 Aug 2023 11:40:45 +0000 (12:40 +0100)]
i386/xen: Ignore VCPU_SSHOTTMR_future flag in set_singleshot_timer()

Upstream Xen now ignores this flag¹, since the only guest kernel ever to
use it was buggy.

¹ https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=19c6cbd909

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
18 months ago
David Woodhouse [Tue, 17 Oct 2023 12:34:18 +0000 (13:34 +0100)]
hw/xen: fix XenStore watch delivery to guest

When fire_watch_cb() found the response buffer empty, it would call
deliver_watch() to generate the XS_WATCH_EVENT message in the response
buffer and send an event channel notification to the guest… without
actually *copying* the response buffer into the ring. So there was
nothing for the guest to see. The pending response didn't actually get
processed into the ring until the guest next triggered some activity
from its side.

Add the missing call to put_rsp().

It might have been slightly nicer to call xen_xenstore_event() here,
which would *almost* have worked. Except for the fact that it calls
xen_be_evtchn_pending() to check that it really does have an event
pending (and clear the eventfd for next time). And under Xen it's
defined that setting that fd to O_NONBLOCK isn't guaranteed to work,
so the emu implementation follows suit.

This fixes Xen device hot-unplug.

Fixes: 0254c4d19df ("hw/xen: Add xenstore wire implementation and implementation stubs")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Wed, 18 Oct 2023 12:31:20 +0000 (13:31 +0100)]
hw/xen: don't clear map_track[] in xen_gnttab_reset()

The refcounts actually correspond to 'active_ref' structures stored in a
GHashTable per "user" on the backend side (mostly, per XenDevice).

If we zero map_track[] on reset, then when the backend drivers get torn
down and release their mapping we hit the assert(s->map_track[ref] != 0)
in gnt_unref().

So leave them in place. Each backend driver will disconnect and reconnect
as the guest comes back up again and reconnects, and it all works out OK
in the end as the old refs get dropped.

Fixes: de26b2619789 ("hw/xen: Implement soft reset for emulated gnttab")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Wed, 11 Oct 2023 23:06:26 +0000 (00:06 +0100)]
hw/xen: select kernel mode for per-vCPU event channel upcall vector

A guest which has configured the per-vCPU upcall vector may set the
HVM_PARAM_CALLBACK_IRQ param to fairly much anything other than zero.

For example, Linux v6.0+ after commit b1c3497e604 ("x86/xen: Add support
for HVMOP_set_evtchn_upcall_vector") will just do this after setting the
vector:

       /* Trick toolstack to think we are enlightened. */
       if (!cpu)
               rc = xen_set_callback_via(1);

That's explicitly setting the delivery to GSI#1, but it's supposed to be
overridden by the per-vCPU vector setting. This mostly works in Qemu
*except* for the logic to enable the in-kernel handling of event channels,
which falsely determines that the kernel cannot accelerate GSI delivery
in this case.

Add a kvm_xen_has_vcpu_callback_vector() to report whether vCPU#0 has
the vector set, and use that in xen_evtchn_set_callback_param() to
enable the kernel acceleration features even when the param *appears*
to be set to target a GSI.

Preserve the Xen behaviour that when HVM_PARAM_CALLBACK_IRQ is set to
*zero* the event channel delivery is disabled completely. (Which is
what that bizarre guest behaviour is working round in the first place.)

Fixes: 91cce756179 ("hw/xen: Add xen_evtchn device for event channel emulation")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
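
Note: the decision described in this commit message can be summarised as a three-way choice. The sketch below is a simplified model, not the body of xen_evtchn_set_callback_param(): ToyGuestState, ToyDelivery and toy_select_delivery() are hypothetical, and it assumes the callback type lives in the top byte of the param with the value 2 meaning "vector".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum ToyDelivery {
    TOY_DELIVERY_DISABLED,   /* param is zero: no event channel delivery */
    TOY_DELIVERY_KERNEL,     /* in-kernel acceleration can be enabled */
    TOY_DELIVERY_USERSPACE,  /* GSI or INTx style delivery, no acceleration */
};

/* Assumed encoding: the top byte of the param selects the callback type. */
#define TOY_CALLBACK_TYPE_VECTOR 2u

typedef struct ToyGuestState {
    uint64_t callback_param;      /* value written to HVM_PARAM_CALLBACK_IRQ */
    uint8_t vcpu0_upcall_vector;  /* 0 if the per-vCPU vector is unset */
} ToyGuestState;

static enum ToyDelivery toy_select_delivery(const ToyGuestState *s)
{
    assert(s != NULL);
    if (s->callback_param == 0) {
        /* Zero always disables delivery, even with a per-vCPU vector. */
        return TOY_DELIVERY_DISABLED;
    }
    if (s->vcpu0_upcall_vector != 0) {
        /* The per-vCPU vector overrides whatever the param appears to say. */
        return TOY_DELIVERY_KERNEL;
    }
    if ((unsigned)(s->callback_param >> 56) == TOY_CALLBACK_TYPE_VECTOR) {
        return TOY_DELIVERY_KERNEL;
    }
    return TOY_DELIVERY_USERSPACE;
}
```

The Linux case quoted above corresponds to callback_param == 1 with a vector set on vCPU#0, which this model maps to the kernel-accelerated path.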
18 months ago
David Woodhouse [Wed, 11 Oct 2023 22:30:08 +0000 (23:30 +0100)]
i386/xen: fix per-vCPU upcall vector for Xen emulation

The per-vCPU upcall vector support had two problems. Firstly it was
using the wrong hypercall argument and would always return -EFAULT.
And secondly it was using the wrong ioctl() to pass the vector to
the kernel and thus the *kernel* would always return -EINVAL.

Linux doesn't (yet) use this mode so it went without decent testing
for a while.

Fixes: 105b47fdf2d0 ("i386/xen: implement HVMOP_set_evtchn_upcall_vector")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
18 months ago
David Woodhouse [Tue, 8 Aug 2023 16:58:46 +0000 (17:58 +0100)]
i386/xen: Don't advertise XENFEAT_supervisor_mode_kernel

This confuses lscpu into thinking it's running in PVH mode.

Fixes: bedcc139248 ("i386/xen: implement HYPERVISOR_xen_version")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
18 months ago
David Woodhouse [Wed, 30 Aug 2023 20:13:57 +0000 (21:13 +0100)]
hw/timer/hpet: fix IRQ routing in legacy support mode

The interrupt from timer 0 in legacy mode is supposed to go to IRQ 0 on
the i8259 and IRQ 2 on the I/O APIC. The generic x86 GSI handling can't
cope with IRQ numbers differing between the two chips (despite it also
being the case for PCI INTx routing), so add a special case for the HPET.

IRQ 2 isn't valid on the i8259; it's the cascade IRQ and would be
interpreted as a spurious interrupt on the secondary PIC. So we can fix
up all attempts to deliver IRQ2, to actually deliver to IRQ0 on the PIC.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
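
Note: the PIC-side fixup described in the last paragraph reduces to a one-line mapping. toy_gsi_to_pic_irq() is a hypothetical helper written for this illustration; the actual change lives in the x86 GSI handling and is not reproduced here.

```c
#include <assert.h>

/* Translate a GSI in the ISA range to the i8259 input it should drive. */
static int toy_gsi_to_pic_irq(int gsi)
{
    assert(gsi >= 0 && gsi < 16);
    /* IRQ 2 is the cascade input on the i8259, so redirect it to IRQ 0. */
    return gsi == 2 ? 0 : gsi;
}
```

Every other ISA-range GSI maps to the same-numbered PIC input; only GSI 2 is redirected.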
18 months ago
David Woodhouse [Fri, 10 Mar 2023 17:37:03 +0000 (17:37 +0000)]
intel-iommu: Report interrupt remapping faults, fix return value

A generic X86IOMMUClass->int_remap function should not return VT-d
specific values; fix it to return 0 if the interrupt was successfully
translated or -EINVAL if not.

The VTD_FR_IR_xxx values are supposed to be used to actually raise
faults through the fault reporting mechanism, so do that instead for
the case where the IRQ is actually being injected.
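
A minimal sketch of the split described above, under stated assumptions:
the generic hook returns only 0 or -EINVAL, and the backend-specific fault
code is reported through a separate path, and only when the IRQ is really
being injected. All demo_* names and the fault value are hypothetical and
do not come from the intel-iommu code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical backend-specific fault codes (stand-ins for VTD_FR_IR_xxx). */
enum demo_fault {
    DEMO_FAULT_NONE = 0,
    DEMO_FAULT_ENTRY_INVALID = 1
};

/* Records the most recently reported fault so the sketch can be observed. */
static enum demo_fault demo_last_fault = DEMO_FAULT_NONE;

/* Stand-in for the fault reporting mechanism. */
static void demo_report_fault(enum demo_fault fault)
{
    assert(fault != DEMO_FAULT_NONE);
    demo_last_fault = fault;
}

/*
 * Generic remap hook: 0 if the interrupt was translated, -EINVAL if not.
 * The backend-specific code never leaks out through the return value.
 */
static int demo_int_remap(bool entry_valid, bool injecting)
{
    if (entry_valid) {
        return 0;
    }
    if (injecting) {
        /* Direct case: raise the fault now. */
        demo_report_fault(DEMO_FAULT_ENTRY_INVALID);
    }
    /* Pretranslation case: fail quietly, no fault is raised. */
    return -EINVAL;
}
```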

There is more work to be done here, as pretranslations for the KVM IRQ
routing table can't fault; an untranslatable IRQ should be handled in
userspace and the fault raised only when the IRQ actually happens (if
indeed the IRTE is still not valid at that time). But we can work on
that later; we can at least raise faults for the direct case.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Peter Xu <peterx@redhat.com>
18 months agoMerge tag 'pull-vfio-20231018' of https://github.com/legoater/qemu into staging
Stefan Hajnoczi [Wed, 18 Oct 2023 10:21:15 +0000 (06:21 -0400)]
Merge tag 'pull-vfio-20231018' of https://github.com/legoater/qemu into staging

vfio queue:

* Support for VFIODisplay migration with ramfb
* Preliminary work for IOMMUFD support

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmUvlEYACgkQUaNDx8/7
# 7KFlaw//X2053de2eTdo38/UMSzi5ACWWn2j1iGQZf/3+J2LcdlixZarZr/2DN56
# 4axmwF6+GKozt5+EnvWtgodDn6U9iyMNaAB3CGBHFHsH8uqKeZd/Ii754q4Rcmy9
# ZufBOPWm9Ff7s2MMFiAZvso75jP2wuwVEe1YPRjeJnsNSNIJ6WZfemh3Sl96yRBb
# r38uqzqetKwl7HziMMWP3yb8v+dU8A9bqI1hf1FZGttfFz3XA+pmjXKA6XxdfiZF
# AAotu5x9w86a08sAlr/qVsZFLR37oQykkXM0D840DafJDyr5fbJiq8cwfOjMw9+D
# w6+udRm5KoBWPsvb/T3dR88GRMO22PChjH9Vjl51TstMNhdTxuKJTKhhSoUFZbXV
# 8CMjwfALk5ggIOyCk1LRd04ed+9qkqgcbw1Guy5pYnyPnY/X6XurxxaxS6Gemgtn
# UvgRYhSjio+LgHLO77IVkWJMooTEPzUTty2Zxa7ldbbE+utPUtsmac9+1m2pnpqk
# 5VQmB074QnsJuvf+7HPU6vYCzQWoXHsH1UY/A0fF7MPedNUAbVYzKrdGPyqEMqHy
# xbilAIaS3oO0pMT6kUpRv5c5vjbwkx94Nf/ii8fQVjWzPfCcaF3yEfaam62jMUku
# stySaRpavKIx2oYLlucBqeKaBGaUofk13gGTQlsFs8pKCOAV7r4=
# =s0fN
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 18 Oct 2023 04:16:06 EDT
# gpg:                using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [unknown]
# gpg:                 aka "Cédric Le Goater <clg@kaod.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B  0B60 51A3 43C7 CFFB ECA1

* tag 'pull-vfio-20231018' of https://github.com/legoater/qemu: (22 commits)
  hw/vfio: add ramfb migration support
  ramfb-standalone: add migration support
  ramfb: add migration support
  vfio/pci: Remove vfio_detach_device from vfio_realize error path
  vfio/ccw: Remove redundant definition of TYPE_VFIO_CCW
  vfio/ap: Remove pointless apdev variable
  vfio/pci: Fix a potential memory leak in vfio_listener_region_add
  vfio/common: Move legacy VFIO backend code into separate container.c
  vfio/common: Introduce a global VFIODevice list
  vfio/common: Store the parent container in VFIODevice
  vfio/common: Introduce a per container device list
  vfio/common: Move VFIO reset handler registration to a group agnostic function
  vfio/ccw: Use vfio_[attach/detach]_device
  vfio/ap: Use vfio_[attach/detach]_device
  vfio/platform: Use vfio_[attach/detach]_device
  vfio/pci: Introduce vfio_[attach/detach]_device
  vfio/common: Extract out vfio_kvm_device_[add/del]_fd
  vfio/common: Introduce vfio_container_add|del_section_window()
  vfio/common: Propagate KVM_SET_DEVICE_ATTR error if any
  vfio/common: Move IOMMU agnostic helpers to a separate file
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
18 months agoMerge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
Stefan Hajnoczi [Wed, 18 Oct 2023 10:20:41 +0000 (06:20 -0400)]
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* build system and Python cleanups
* fix netbsd VM build
* allow non-relocatable installs
* allow using command line options to configure qemu-ga
* target/i386: check intercept for XSETBV
* target/i386: fix CPUID_HT exposure

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmUvkQQUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroM3pQgArXCsmnsjlng1chjCvKnIuVmaTYZ5
# aC9pcx7TlyM0+XWtTN0NQhFt71Te+3ioReXIQRvy5O68RNbEkiu8LXfOJhWAHbWk
# vZVtzHQuOZVizeZtUruKlDaw0nZ8bg+NI4aGLs6rs3WphEAM+tiLnZJ0BouiedKS
# e/COB/Hqjok+Ntksbfv5q7XpWjwQB0y2073vM1Mcf0ToOWFLFdL7x0SZ3hxyYlYl
# eoefp/8kbWeUWA7HuoOKmpiLIxmKnY7eXp+UCvdnEhnSce9sCxpn2nzqqLuPItTK
# V3GrJ2//+lrekPHyQvb8IjUMUrPOmzf8GadIE0tkfdHjEP72IsHk0VX81A==
# =rPte
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 18 Oct 2023 04:02:12 EDT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (32 commits)
  configure: define "pkg-config" in addition to "pkgconfig"
  meson: add a note on why we use config_host for program paths
  meson-buildoptions: document the data at the top
  configure, meson: use command line options to configure qemu-ga
  configure: unify handling of several Debian cross containers
  configure: move environment-specific defaults to config-meson.cross
  configure: move target-specific defaults to an external machine file
  configure: remove some dead cruft
  configure: clean up PIE option handling
  configure: clean up plugin option handling
  configure, tests/tcg: simplify GDB conditionals
  tests/tcg/arm: move non-SVE tests out of conditional
  hw/remote: move stub vfu_object_set_bus_irq out of stubs/
  hw/xen: cleanup sourcesets
  configure: clean up handling of CFI option
  meson, cutils: allow non-relocatable installs
  meson: do not use set10
  meson: do not build shaders by default
  tracetool: avoid invalid escape in Python string
  tests/vm: avoid invalid escape in Python string
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
18 months agohw/vfio: add ramfb migration support
Marc-André Lureau [Mon, 9 Oct 2023 06:32:47 +0000 (10:32 +0400)]
hw/vfio: add ramfb migration support

Add a "VFIODisplay" subsection whenever "x-ramfb-migrate" is turned on.

Turn it off by default on machines <= 8.1 for compatibility reasons.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
[ clg:  - checkpatch fixes
   - improved warn_report() in vfio_realize() ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agoramfb-standalone: add migration support
Marc-André Lureau [Mon, 9 Oct 2023 06:32:46 +0000 (10:32 +0400)]
ramfb-standalone: add migration support

Add a "ramfb-dev" section whenever "x-migrate" is turned on. Turn it off
by default on machines <= 8.1 for compatibility reasons.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agoramfb: add migration support
Marc-André Lureau [Mon, 9 Oct 2023 06:32:45 +0000 (10:32 +0400)]
ramfb: add migration support

Implementing RAMFB migration is quite straightforward. One caveat is to
treat the whole RAMFBCfg as a blob, since that's what is exposed to the
guest directly. This avoids having to fiddle with endianness issues if we
were to migrate fields individually as integers.
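
The blob approach can be illustrated with a small sketch. The struct below
is a hypothetical stand-in, not the real RAMFBCfg layout, and the
demo_cfg_save()/demo_cfg_load() helpers are invented for the example; they
are not the QEMU migration API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical guest-visible config; the fields are illustrative only. */
struct demo_cfg {
    uint64_t addr;
    uint32_t width;
    uint32_t height;
};

/* Save: copy the raw bytes exactly as the guest wrote them. */
static size_t demo_cfg_save(const struct demo_cfg *cfg, uint8_t *buf, size_t len)
{
    assert(len >= sizeof(*cfg));
    memcpy(buf, cfg, sizeof(*cfg));
    return sizeof(*cfg);
}

/* Load: restore the same bytes; no per-field byte swapping is involved. */
static void demo_cfg_load(struct demo_cfg *cfg, const uint8_t *buf, size_t len)
{
    assert(len >= sizeof(*cfg));
    memcpy(cfg, buf, sizeof(*cfg));
}
```

Because the bytes are never interpreted as host integers on the way through,
the guest sees the same memory image after the round trip, whatever byte
order it stored the fields in.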

The devices using RAMFB will have to include ramfb_vmstate in their
migration description.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agovfio/pci: Remove vfio_detach_device from vfio_realize error path
Eric Auger [Wed, 11 Oct 2023 20:09:34 +0000 (22:09 +0200)]
vfio/pci: Remove vfio_detach_device from vfio_realize error path

In vfio_realize, on the error path, we currently call
vfio_detach_device() after a successful vfio_attach_device.
While this looks natural, vfio_instance_finalize also induces
a vfio_detach_device(), and it seems to be the right place
instead as other resources are released there which happen
to be a prerequisite to a successful UNSET_CONTAINER.

So let's rely on the finalize vfio_detach_device call to free
all the relevant resources.
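
As a rough model of the ordering argument above (a sketch with hypothetical
demo_* names, not the vfio-pci code): the realize error path leaves the
device attached, and finalize first drops the resources that a detach
depends on, then detaches exactly once.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state for this sketch. */
struct demo_dev {
    bool attached;
    bool resources_held;  /* must be released before a detach can succeed */
    int detach_calls;
};

static void demo_detach(struct demo_dev *d)
{
    /* Detaching while dependent resources are still held would fail. */
    assert(!d->resources_held);
    if (d->attached) {
        d->attached = false;
        d->detach_calls++;
    }
}

static int demo_realize(struct demo_dev *d, bool fail_late)
{
    d->attached = true;
    d->resources_held = true;
    if (fail_late) {
        /* Error path: no detach here; finalize owns the teardown. */
        return -1;
    }
    return 0;
}

static void demo_finalize(struct demo_dev *d)
{
    d->resources_held = false;  /* release the prerequisites first */
    demo_detach(d);
}
```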

Fixes: a28e06621170 ("vfio/pci: Introduce vfio_[attach/detach]_device")
Reported-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agovfio/ccw: Remove redundant definition of TYPE_VFIO_CCW
Zhenzhong Duan [Mon, 9 Oct 2023 02:20:48 +0000 (10:20 +0800)]
vfio/ccw: Remove redundant definition of TYPE_VFIO_CCW

No functional changes.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agovfio/ap: Remove pointless apdev variable
Zhenzhong Duan [Mon, 9 Oct 2023 02:20:47 +0000 (10:20 +0800)]
vfio/ap: Remove pointless apdev variable

No need to double-cast, call VFIO_AP_DEVICE() on DeviceState.

No functional changes.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agovfio/pci: Fix a potential memory leak in vfio_listener_region_add
Zhenzhong Duan [Mon, 9 Oct 2023 02:20:46 +0000 (10:20 +0800)]
vfio/pci: Fix a potential memory leak in vfio_listener_region_add

When there is a failure in vfio_listener_region_add() and the section
belongs to a ram device, there is an inaccurate error report which should
never be related to vfio_dma_map failure. The memory holding err is also
leaked on each failure.

Fix it by reporting the real error and freeing it.
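
The report-then-free pattern can be sketched like this. The error type and
helpers are hypothetical stand-ins written for the example (QEMU has its own
Error API, which is not reproduced here); a live-object counter makes the
absence of a leak observable.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical heap-allocated error object for this sketch. */
struct demo_error {
    char *msg;
};

/* Number of error objects allocated and not yet freed. */
static int demo_live_errors;

static struct demo_error *demo_error_new(const char *msg)
{
    size_t len = strlen(msg) + 1;
    struct demo_error *err = malloc(sizeof(*err));

    if (!err) {
        abort();
    }
    err->msg = malloc(len);
    if (!err->msg) {
        abort();
    }
    memcpy(err->msg, msg, len);
    demo_live_errors++;
    return err;
}

/* Report the message the failing step actually produced, then free it. */
static void demo_warn_and_free(struct demo_error *err)
{
    assert(err != NULL);
    fprintf(stderr, "warning: %s\n", err->msg);
    free(err->msg);
    free(err);
    demo_live_errors--;
}
```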

Fixes: 567b5b309ab ("vfio/pci: Relax DMA map errors for MMIO regions")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agovfio/common: Move legacy VFIO backend code into separate container.c
Yi Liu [Mon, 9 Oct 2023 09:09:17 +0000 (11:09 +0200)]
vfio/common: Move legacy VFIO backend code into separate container.c

Move all the code really dependent on the legacy VFIO container/group
into a separate file: container.c. What does remain in common.c is
the code related to VFIOAddressSpace, MemoryListeners, migration and
all other general operations.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agovfio/common: Introduce a global VFIODevice list
Zhenzhong Duan [Mon, 9 Oct 2023 09:09:16 +0000 (11:09 +0200)]
vfio/common: Introduce a global VFIODevice list

Some functions iterate over all the VFIODevices. This is currently
achieved by iterating over all groups/devices. Let's
introduce a global list of VFIODevices simplifying that scan.

This will also be useful while migrating to IOMMUFD by hiding the
group specificity.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agovfio/common: Store the parent container in VFIODevice
Zhenzhong Duan [Mon, 9 Oct 2023 09:09:15 +0000 (11:09 +0200)]
vfio/common: Store the parent container in VFIODevice

Let's store the parent container within the VFIODevice.
This simplifies the logic in vfio_viommu_preset() and
has the benefit of hiding the group specificity, which
is useful for the IOMMUFD migration.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agovfio/common: Introduce a per container device list
Zhenzhong Duan [Mon, 9 Oct 2023 09:09:14 +0000 (11:09 +0200)]
vfio/common: Introduce a per container device list

Several functions need to iterate over the VFIO devices attached to
a given container.  This is currently achieved by iterating over the
groups attached to the container and then over the devices in the group.
Let's introduce a per container device list that simplifies this
search.
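
The difference between the two traversals can be sketched as follows. The
structures are simplified, hypothetical stand-ins (plain singly linked
lists, demo_* names), not the actual VFIO types or QEMU's list macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device: linked both into its group and into its container. */
struct demo_device {
    bool dirty_tracking;
    struct demo_device *next_in_group;
    struct demo_device *next_in_container;
};

struct demo_group {
    struct demo_device *devices;
    struct demo_group *next;
};

struct demo_container {
    struct demo_group *groups;    /* old route: container -> groups -> devices */
    struct demo_device *devices;  /* new route: flat per-container list */
};

/* Before: walk every group, then every device inside that group. */
static bool demo_all_tracking_via_groups(const struct demo_container *c)
{
    assert(c != NULL);
    for (const struct demo_group *g = c->groups; g; g = g->next) {
        for (const struct demo_device *d = g->devices; d; d = d->next_in_group) {
            if (!d->dirty_tracking) {
                return false;
            }
        }
    }
    return true;
}

/* After: a single loop that never mentions groups. */
static bool demo_all_tracking_via_list(const struct demo_container *c)
{
    assert(c != NULL);
    for (const struct demo_device *d = c->devices; d; d = d->next_in_container) {
        if (!d->dirty_tracking) {
            return false;
        }
    }
    return true;
}
```

Both functions give the same answer as long as the flat list holds exactly
the devices reachable through the groups; only the second one keeps working
if groups stop being part of the picture.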

Per container list is used in below functions:
vfio_devices_all_dirty_tracking
vfio_devices_all_device_dirty_tracking
vfio_devices_all_running_and_mig_active
vfio_devices_dma_logging_stop
vfio_devices_dma_logging_start
vfio_devices_query_dirty_bitmap

This will also ease the migration to IOMMUFD by hiding the group
specificity.

Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agovfio/common: Move VFIO reset handler registration to a group agnostic function
Zhenzhong Duan [Mon, 9 Oct 2023 09:09:13 +0000 (11:09 +0200)]
vfio/common: Move VFIO reset handler registration to a group agnostic function

Move the reset handler registration/unregistration to a place that is not
group specific. vfio_[get/put]_address_space are the best places for that
purpose.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agovfio/ccw: Use vfio_[attach/detach]_device
Eric Auger [Mon, 9 Oct 2023 09:09:12 +0000 (11:09 +0200)]
vfio/ccw: Use vfio_[attach/detach]_device

Let the vfio-ccw device use vfio_attach_device() and
vfio_detach_device(), hence hiding the details of the used
IOMMU backend.

Note that the migration reduces the following trace
"vfio: subchannel %s has already been attached" (featuring
cssid.ssid.devid) into "device is already attached"

Also, now that all the devices have been migrated to use the new
vfio_attach_device/vfio_detach_device API, let's turn the
legacy functions into static functions, local to container.c.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agovfio/ap: Use vfio_[attach/detach]_device
Eric Auger [Mon, 9 Oct 2023 09:09:11 +0000 (11:09 +0200)]
vfio/ap: Use vfio_[attach/detach]_device

Let the vfio-ap device use vfio_attach_device() and
vfio_detach_device(), hence hiding the details of the used
IOMMU backend.

We take the opportunity to use g_path_get_basename() which
is preferred, as suggested by
3e015d815b ("use g_path_get_basename instead of basename")

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agovfio/platform: Use vfio_[attach/detach]_device
Eric Auger [Mon, 9 Oct 2023 09:09:10 +0000 (11:09 +0200)]
vfio/platform: Use vfio_[attach/detach]_device

Let the vfio-platform device use vfio_attach_device() and
vfio_detach_device(), hence hiding the details of the used
IOMMU backend.

Drop the trace event for vfio-platform as we have a similar
one in vfio_attach_device.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agovfio/pci: Introduce vfio_[attach/detach]_device
Eric Auger [Mon, 9 Oct 2023 09:09:09 +0000 (11:09 +0200)]
vfio/pci: Introduce vfio_[attach/detach]_device

We want the VFIO devices to be able to use two different
IOMMU backends, the legacy VFIO one and the new iommufd one.

Introduce vfio_[attach/detach]_device which aim at hiding the
underlying IOMMU backend (IOCTLs, datatypes, ...).

Once vfio_attach_device completes, the device is attached
to a security context and its fd can be used. Conversely,
when vfio_detach_device completes, the device has been
detached from the security context.
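
One possible shape for such a backend-hiding pair is sketched below. The
commit message does not say how the dispatch is done, so the ops table here
is an assumption made for illustration, and every demo_* name (including the
placeholder fd value) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_vdev;

/* Hypothetical per-backend operations (legacy container/group, iommufd, ...). */
struct demo_backend_ops {
    int (*attach)(struct demo_vdev *dev);
    void (*detach)(struct demo_vdev *dev);
};

struct demo_vdev {
    const struct demo_backend_ops *ops;
    bool attached;
    int fd;  /* only meaningful while attached */
};

/* Stand-in for the legacy backend; 3 is just a placeholder fd value. */
static int demo_legacy_attach(struct demo_vdev *dev)
{
    dev->fd = 3;
    return 0;
}

static void demo_legacy_detach(struct demo_vdev *dev)
{
    dev->fd = -1;
}

static const struct demo_backend_ops demo_legacy_ops = {
    demo_legacy_attach,
    demo_legacy_detach
};

/* Callers see only attach/detach; the backend details stay behind ops. */
static int demo_attach_device(struct demo_vdev *dev)
{
    int ret;

    assert(dev && dev->ops);
    if (dev->attached) {
        return -1;  /* device is already attached */
    }
    ret = dev->ops->attach(dev);
    if (ret == 0) {
        dev->attached = true;
    }
    return ret;
}

static void demo_detach_device(struct demo_vdev *dev)
{
    assert(dev && dev->ops);
    if (!dev->attached) {
        return;
    }
    dev->ops->detach(dev);
    dev->attached = false;
}
```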

At the moment only the implementation based on the legacy
container/group exists. Let's use it from the vfio-pci device.
Subsequent patches will handle other devices.

We also take advantage of this patch to properly free
vbasedev->name on failure.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agovfio/common: Extract out vfio_kvm_device_[add/del]_fd
Zhenzhong Duan [Mon, 9 Oct 2023 09:09:08 +0000 (11:09 +0200)]
vfio/common: Extract out vfio_kvm_device_[add/del]_fd

Introduce two new helpers, vfio_kvm_device_[add/del]_fd
which take as input a file descriptor which can be either a group fd or
a cdev fd. This uses the new KVM_DEV_VFIO_FILE VFIO KVM device group,
which aliases to the legacy KVM_DEV_VFIO_GROUP.

vfio_kvm_device_[add/del]_group then call those new helpers.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agovfio/common: Introduce vfio_container_add|del_section_window()
Eric Auger [Mon, 9 Oct 2023 09:09:07 +0000 (11:09 +0200)]
vfio/common: Introduce vfio_container_add|del_section_window()

Introduce helper functions that isolate the code used for
VFIO_SPAPR_TCE_v2_IOMMU.

Those helpers hide implementation details beneath the container object
and make the vfio_listener_region_add/del() implementations more
readable. No code change intended.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agovfio/common: Propagate KVM_SET_DEVICE_ATTR error if any
Eric Auger [Mon, 9 Oct 2023 09:09:06 +0000 (11:09 +0200)]
vfio/common: Propagate KVM_SET_DEVICE_ATTR error if any

In the VFIO_SPAPR_TCE_v2_IOMMU container case, when
KVM_SET_DEVICE_ATTR fails, we currently don't propagate the
error as we do on the vfio_spapr_create_window() failure
case. Let's align the code. Take the opportunity to
reword the error message and make it more explicit.
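
The alignment described above amounts to treating both steps the same way.
A minimal sketch, with the two steps passed in as hypothetical callbacks so
that either one can be made to fail (none of these names come from the VFIO
code):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical step signature: 0 on success, negative errno on failure. */
typedef int (*demo_step_fn)(void);

static int demo_step_ok(void)
{
    return 0;
}

static int demo_step_fail(void)
{
    return -EINVAL;
}

/* Both failures are propagated to the caller in the same way. */
static int demo_add_section_window(demo_step_fn create_window,
                                   demo_step_fn set_device_attr)
{
    int ret;

    assert(create_window != NULL && set_device_attr != NULL);
    ret = create_window();
    if (ret < 0) {
        return ret;
    }
    ret = set_device_attr();
    if (ret < 0) {
        return ret;  /* the case that was previously not propagated */
    }
    return 0;
}
```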

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agovfio/common: Move IOMMU agnostic helpers to a separate file
Yi Liu [Mon, 9 Oct 2023 09:09:05 +0000 (11:09 +0200)]
vfio/common: Move IOMMU agnostic helpers to a separate file

Move low-level IOMMU agnostic helpers to a separate helpers.c
file. They relate to regions, interrupts, device/region
capabilities, etc.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agolinux-headers: Add iommufd.h
Zhenzhong Duan [Mon, 9 Oct 2023 09:09:04 +0000 (11:09 +0200)]
linux-headers: Add iommufd.h

Since commit da3c22c74a3c ("linux-headers: Update to Linux v6.6-rc1"),
linux-headers has been updated to v6.6-rc1.

As the previous patch added iommufd.h to update-linux-headers.sh,
run the script again against tag v6.6-rc1 to have iommufd.h included.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agoscripts/update-linux-headers: Add iommufd.h
Eric Auger [Mon, 9 Oct 2023 09:09:03 +0000 (11:09 +0200)]
scripts/update-linux-headers: Add iommufd.h

Update the script to import iommufd.h

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
18 months agoconfigure: define "pkg-config" in addition to "pkgconfig"
Paolo Bonzini [Tue, 17 Oct 2023 15:32:50 +0000 (17:32 +0200)]
configure: define "pkg-config" in addition to "pkgconfig"

Meson used to allow both "pkgconfig" and "pkg-config" entries in machine
files; the former was used for dependency lookup and the latter
was used as return value for "find_program('pkg-config')", which is a less
common use-case and one that QEMU does not need.

This inconsistency is going to be fixed by Meson 1.3, which will deprecate
"pkgconfig" in favor of "pkg-config" (the less common one, but it makes
sense because it matches the name of the binary). For backward
compatibility it is still allowed to define both, so do that in the
configure-generated machine file.

Related: https://github.com/mesonbuild/meson/pull/12385
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
18 months agomeson: add a note on why we use config_host for program paths
Paolo Bonzini [Thu, 28 Sep 2023 10:00:48 +0000 (12:00 +0200)]
meson: add a note on why we use config_host for program paths

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
18 months agomeson-buildoptions: document the data at the top
Paolo Bonzini [Thu, 28 Sep 2023 09:20:01 +0000 (11:20 +0200)]
meson-buildoptions: document the data at the top

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
18 months agoconfigure, meson: use command line options to configure qemu-ga
Paolo Bonzini [Mon, 9 Oct 2023 12:13:59 +0000 (14:13 +0200)]
configure, meson: use command line options to configure qemu-ga

Preserve the functionality of the environment variables, but
allow using the command line instead.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
18 months agoconfigure: unify handling of several Debian cross containers
Paolo Bonzini [Mon, 9 Oct 2023 12:03:56 +0000 (14:03 +0200)]
configure: unify handling of several Debian cross containers

The Debian and GNU architecture names match very often, even though
there are common cases (32-bit Arm or 64-bit x86) where they do not
and other cases in which the GNU triplet is actually a quadruplet.
But it is still possible to group the common case into a single
case inside probe_target_compiler.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
18 months agoconfigure: move environment-specific defaults to config-meson.cross
Paolo Bonzini [Mon, 16 Oct 2023 06:20:13 +0000 (08:20 +0200)]
configure: move environment-specific defaults to config-meson.cross

Store the -Werror and SMBD defaults in the machine file, which still allows
them to be overridden on the command line and enables automatic parsing
of the related options.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>