David Woodhouse [Thu, 19 Oct 2023 14:30:23 +0000 (15:30 +0100)]
docs: update Xen-on-KVM documentation
Add notes about console and network support, and how to launch PV guests.
Clean up the disk configuration examples now that that's simpler, and
remove the comment about IDE unplug on q35/AHCI now that it's fixed.
Also update stale avocado test filename in MAINTAINERS.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Tue, 17 Oct 2023 16:53:58 +0000 (17:53 +0100)]
hw/xen: use qemu_create_nic_bus_devices() to instantiate Xen NICs
When instantiating XenBus itself, for each NIC which is configured with
either the model unspecified, or set to to "xen" or "xen-net-device",
create a corresponding xen-net-device for it.
Now we can launch emulated Xen guests with '-nic user', and this fixes
the setup for Xen PV guests, which was previously broken in various
ways and never actually managed to peer with the netdev.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Thu, 19 Oct 2023 23:07:45 +0000 (00:07 +0100)]
hw/i386/pc: use qemu_get_nic_info() and pci_init_nic_devices()
Eliminate direct access to nd_table[] and nb_nics by processing the the
ISA NICs first and then calling pci_init_nic_devices() for the test.
It's important to do this *before* the subsequent patch which registers
the Xen PV network devices, because the code being remove here didn't
check whether nd->instantiated was already set before using each entry.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Sun, 22 Oct 2023 08:13:41 +0000 (09:13 +0100)]
net: add qemu_create_nic_bus_devices()
This will instantiate any NICs which live on a given bus type. Each bus
is allowed *one* substitution (for PCI it's virtio → virtio-net-pci, for
Xen it's xen → xen-net-device; no point in overengineering it unless we
actually want more).
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Sat, 21 Oct 2023 22:09:38 +0000 (23:09 +0100)]
net: report list of available models according to platform
By noting the models for which a configuration was requested, we can give
the user an accurate list of which NIC models were actually available on
the platform/configuration that was otherwise chosen.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Most code which directly accesses nd_table[] and nb_nics uses them for
one of two things. Either "I have created a NIC device and I'd like a
configuration for it", or "I will create a NIC device *if* there is a
configuration for it". With some variants on the theme around whether
they actually *check* if the model specified in the configuration is
the right one.
Provide functions which perform both of those, allowing platforms to
be a little more consistent and as a step towards making nd_table[]
and nb_nics private to the net code.
Also export the qemu_find_nic_info() helper, as some platforms have
special cases they need to handle.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Thu, 19 Oct 2023 11:56:42 +0000 (12:56 +0100)]
xen-platform: unplug AHCI disks
To support Xen guests using the Q35 chipset, the unplug protocol needs
to also remove AHCI disks.
Make pci_xen_ide_unplug() more generic, iterating over the children
of the PCI device and destroying the "ide-hd" devices. That works the
same for both AHCI and IDE, as does the detection of the primary disk
as unit 0 on the bus named "ide.0".
Then pci_xen_ide_unplug() can be used for both AHCI and IDE devices.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Tue, 17 Oct 2023 20:59:03 +0000 (21:59 +0100)]
net: do not delete nics in net_cleanup()
In net_cleanup() we only need to delete the netdevs, as those may have
state which outlives Qemu when it exits, and thus may actually need to
be cleaned up on exit.
The nics, on the other hand, are owned by the device which created them.
Most devices don't bother to clean up on exit because they don't have
any state which will outlive Qemu... but XenBus devices do need to clean
up their nodes in XenStore, and do have an exit handler to delete them.
When the XenBus exit handler destroys the xen-net-device, it attempts
to delete its nic after net_cleanup() had already done so. And crashes.
Fix this by only deleting netdevs as we walk the list. As the comment
notes, we can't use QTAILQ_FOREACH_SAFE() as each deletion may remove
*multiple* entries, including the "safely" saved 'next' pointer. But
we can store the *previous* entry, since nics are safe.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Mon, 16 Oct 2023 15:00:23 +0000 (16:00 +0100)]
hw/xen: add support for Xen primary console in emulated mode
The primary console is special because the toolstack maps a page into
the guest for its ring, and also allocates the guest-side event channel.
The guest's grant table is even primed to export that page using a known
grant ref#. Add support for all that in emulated mode, so that we can
have a primary console.
For reasons unclear, the backends running under real Xen don't just use
a mapping of the well-known GNTTAB_RESERVED_CONSOLE grant ref (which
would also be in the ring-ref node in XenStore). Instead, the toolstack
sets the ring-ref node of the primary console to the GFN of the guest
page. The backend is expected to handle that special case and map it
with foreignmem operations instead.
We don't have an implementation of foreignmem ops for emulated Xen mode,
so just make it map GNTTAB_RESERVED_CONSOLE instead. This would probably
work for real Xen too, but we can't work out how to make real Xen create
a primary console of type "ioemu" to make QEMU drive it, so we can't
test that; might as well leave it as it is for now under Xen.
Now at last we can boot the Xen PV shim and run PV kernels in QEMU.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Mon, 16 Oct 2023 09:28:17 +0000 (10:28 +0100)]
hw/xen: do not repeatedly try to create a failing backend device
If xen_backend_device_create() fails to instantiate a device, the XenBus
code will just keep trying over and over again each time the bus is
re-enumerated, as long as the backend appears online and in
XenbusStateInitialising.
The only thing which prevents the XenBus code from recreating duplicates
of devices which already exist, is the fact that xen_device_realize()
sets the backend state to XenbusStateInitWait. If the attempt to create
the device doesn't get *that* far, that's when it will keep getting
retried.
My first thought was to handle errors by setting the backend state to
XenbusStateClosed, but that doesn't work for XenConsole which wants to
*ignore* any device of type != "ioemu" completely.
So, make xen_backend_device_create() *keep* the XenBackendInstance for a
failed device, and provide a new xen_backend_exists() function to allow
xen_bus_type_enumerate() to check whether one already exists before
creating a new one.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
David Woodhouse [Sat, 14 Oct 2023 15:53:23 +0000 (16:53 +0100)]
hw/xen: add get_frontend_path() method to XenDeviceClass
The primary Xen console is special. The guest's side is set up for it by
the toolstack automatically and not by the standard PV init sequence.
Accordingly, its *frontend* doesn't appear in …/device/console/0 either;
instead it appears under …/console in the guest's XenStore node.
To allow the Xen console driver to override the frontend path for the
primary console, add a method to the XenDeviceClass which can be used
instead of the standard xen_device_get_frontend_path()
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Mon, 16 Oct 2023 12:01:39 +0000 (13:01 +0100)]
hw/xen: automatically assign device index to block devices
There's no need to force the user to assign a vdev. We can automatically
assign one, starting at xvda and searching until we find the first disk
name that's unused.
This means we can now allow '-drive if=xen,file=xxx' to work without an
explicit separate -driver argument, just like if=virtio.
Rip out the legacy handling from the xenpv machine, which was scribbling
over any disks configured by the toolstack, and didn't work with anything
but raw images.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Kevin Wolf <kwolf@redhat.com>
David Woodhouse [Thu, 12 Oct 2023 09:59:45 +0000 (10:59 +0100)]
hw/xen: populate store frontend nodes with XenStore PFN/port
This is kind of redundant since without being able to get these through
some other method (HVMOP_get_param) the guest wouldn't be able to access
XenStore in order to find them.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
David Woodhouse [Wed, 2 Aug 2023 16:04:49 +0000 (17:04 +0100)]
hw/xen: Clean up event channel 'type_val' handling to use union
A previous implementation of this stuff used a 64-bit field for all of
the port information (vcpu/type/type_val) and did atomic exchanges on
them. When I implemented that in Qemu I regretted my life choices and
just kept it simple with locking instead.
So there's no need for the XenEvtchnPort to be so simplistic. We can
use a union for the pirq/virq/interdomain information, which lets us
keep a separate bit for the 'remote domain' in interdomain ports. A
single bit is enough since the only possible targets are loopback or
qemu itself.
So now we can ditch PORT_INFO_TYPEVAL_REMOTE_QEMU and the horrid
manual masking, although the in-memory representation is identical
so there's no change in the saved state ABI.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
David Woodhouse [Tue, 24 Oct 2023 21:22:47 +0000 (22:22 +0100)]
hw/xen: take iothread mutex in xen_evtchn_reset_op()
The xen_evtchn_soft_reset() function requires the iothread mutex, but is
also called for the EVTCHNOP_reset hypercall. Ensure the mutex is taken
in that case.
Fixes: a15b10978fe6 ("hw/xen: Implement EVTCHNOP_reset") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Tue, 17 Oct 2023 12:34:18 +0000 (13:34 +0100)]
hw/xen: fix XenStore watch delivery to guest
When fire_watch_cb() found the response buffer empty, it would call
deliver_watch() to generate the XS_WATCH_EVENT message in the response
buffer and send an event channel notification to the guest… without
actually *copying* the response buffer into the ring. So there was
nothing for the guest to see. The pending response didn't actually get
processed into the ring until the guest next triggered some activity
from its side.
Add the missing call to put_rsp().
It might have been slightly nicer to call xen_xenstore_event() here,
which would *almost* have worked. Except for the fact that it calls
xen_be_evtchn_pending() to check that it really does have an event
pending (and clear the eventfd for next time). And under Xen it's
defined that setting that fd to O_NONBLOCK isn't guaranteed to work,
so the emu implementation follows suit.
This fixes Xen device hot-unplug.
Fixes: 0254c4d19df ("hw/xen: Add xenstore wire implementation and implementation stubs") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Wed, 18 Oct 2023 12:31:20 +0000 (13:31 +0100)]
hw/xen: don't clear map_track[] in xen_gnttab_reset()
The refcounts actually correspond to 'active_ref' structures stored in a
GHashTable per "user" on the backend side (mostly, per XenDevice).
If we zero map_track[] on reset, then when the backend drivers get torn
down and release their mapping we hit the assert(s->map_track[ref] != 0)
in gnt_unref().
So leave them in place. Each backend driver will disconnect and reconnect
as the guest comes back up again and reconnects, and it all works out OK
in the end as the old refs get dropped.
Fixes: de26b2619789 ("hw/xen: Implement soft reset for emulated gnttab") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
David Woodhouse [Wed, 11 Oct 2023 23:06:26 +0000 (00:06 +0100)]
hw/xen: select kernel mode for per-vCPU event channel upcall vector
A guest which has configured the per-vCPU upcall vector may set the
HVM_PARAM_CALLBACK_IRQ param to fairly much anything other than zero.
For example, Linux v6.0+ after commit b1c3497e604 ("x86/xen: Add support
for HVMOP_set_evtchn_upcall_vector") will just do this after setting the
vector:
/* Trick toolstack to think we are enlightened. */
if (!cpu)
rc = xen_set_callback_via(1);
That's explicitly setting the delivery to GSI#1, but it's supposed to be
overridden by the per-vCPU vector setting. This mostly works in Qemu
*except* for the logic to enable the in-kernel handling of event channels,
which falsely determines that the kernel cannot accelerate GSI delivery
in this case.
Add a kvm_xen_has_vcpu_callback_vector() to report whether vCPU#0 has
the vector set, and use that in xen_evtchn_set_callback_param() to
enable the kernel acceleration features even when the param *appears*
to be set to target a GSI.
Preserve the Xen behaviour that when HVM_PARAM_CALLBACK_IRQ is set to
*zero* the event channel delivery is disabled completely. (Which is
what that bizarre guest behaviour is working round in the first place.)
Fixes: 91cce756179 ("hw/xen: Add xen_evtchn device for event channel emulation") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
David Woodhouse [Wed, 11 Oct 2023 22:30:08 +0000 (23:30 +0100)]
i386/xen: fix per-vCPU upcall vector for Xen emulation
The per-vCPU upcall vector support had three problems. Firstly it was
using the wrong hypercall argument and would always return -EFAULT when
the guest tried to set it up. Secondly it was using the wrong ioctl() to
pass the vector to the kernel and thus the *kernel* would always return
-EINVAL. Finally, even when delivering the event directly from userspace
with an MSI, it put the destination CPU ID into the wrong bits of the
MSI address.
Linux doesn't (yet) use this mode so it went without decent testing
for a while.
Fixes: 105b47fdf2d0 ("i386/xen: implement HVMOP_set_evtchn_upcall_vector") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
David Woodhouse [Wed, 30 Aug 2023 20:13:57 +0000 (21:13 +0100)]
hw/timer/hpet: fix IRQ routing in legacy support mode
The interrupt from timer 0 in legacy mode is supposed to go to IRQ 0 on
the i8259 and IRQ 2 on the I/O APIC. The generic x86 GSI handling can't
cope with IRQ numbers differing between the two chips (despite it also
being the case for PCI INTx routing), so add a special case for the HPET.
IRQ 2 isn't valid on the i8259; it's the cascade IRQ and would be
interpreted as spurious interrupt on the secondary PIC. So we can fix
up all attempts to deliver IRQ2, to actually deliver to IRQ0 on the PIC.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Stefan Hajnoczi [Mon, 23 Oct 2023 21:45:46 +0000 (14:45 -0700)]
Merge tag 'pull-tcg-20231023' of https://gitlab.com/rth7680/qemu into staging
tcg: Drop unused tcg_temp_free define
tcg: Introduce tcg_use_softmmu
tcg: Optimize past conditional branches
tcg: Use constant zero when expanding with divu2
tcg: Add negsetcondi
tcg: Define MO_TL
tcg: Export tcg_gen_ext_{i32,i64,tl}
target/*: Use tcg_gen_ext_*
tcg/ppc: Enable direct branching tcg_out_goto_tb with TCG_REG_TB
tcg/ppc: Use ADDPCIS for power9
tcg/ppc: Use prefixed instructions for power10
tcg/ppc: Disable TCG_REG_TB for Power9/Power10
tcg/ppc: Enable direct branching tcg_out_goto_tb with TCG_REG_TB
tcg/ppc: Use ADDPCIS for power9
tcg/ppc: Use prefixed instructions for power10
tcg/ppc: Disable TCG_REG_TB for Power9/Power10
* tag 'pull-tcg-20231023' of https://gitlab.com/rth7680/qemu: (38 commits)
target/xtensa: Use tcg_gen_sextract_i32
target/tricore: Use tcg_gen_*extract_tl
target/rx: Use tcg_gen_ext_i32
target/m68k: Use tcg_gen_ext_i32
target/i386: Use tcg_gen_ext_tl
target/arm: Use tcg_gen_ext_i64
tcg: Define MO_TL
tcg: Export tcg_gen_ext_{i32,i64,tl}
tcg: add negsetcondi
target/i386: Use i128 for 128 and 256-bit loads and stores
tcg: Add tcg_gen_{ld,st}_i128
tcg: Optimize past conditional branches
tcg: Use constant zero when expanding with divu2
tcg: drop unused tcg_temp_free define
tcg/s390x: Use tcg_use_softmmu
tcg/riscv: Use tcg_use_softmmu
tcg/riscv: Do not reserve TCG_GUEST_BASE_REG for guest_base zero
tcg/ppc: Use tcg_use_softmmu
tcg/mips: Use tcg_use_softmmu
tcg/loongarch64: Use tcg_use_softmmu
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Mon, 23 Oct 2023 21:45:29 +0000 (14:45 -0700)]
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: features, cleanups
infrastructure for vhost-vdpa shadow work
piix south bridge rework
reconnect for vhost-user-scsi
dummy ACPI QTG DSM for cxl
tests, cleanups, fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmU06PMPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpNIsH/0DlKti86VZLJ6PbNqsnKxoK2gg05TbEhPZU
# pQ+RPDaCHpFBsLC5qsoMJwvaEQFe0e49ZFemw7bXRzBxgmbbNnZ9ArCIPqT+rvQd
# 7UBmyC+kacVyybZatq69aK2BHKFtiIRlT78d9Izgtjmp8V7oyKoz14Esh8wkE+FT
# ypHUa70Addi6alNm6BVkm7bxZxi0Wrmf3THqF8ViYvufzHKl7JR5e17fKWEG0BqV
# 9W7AeHMnzJ7jkTvBGUw7g5EbzFn7hPLTbO4G/VW97k0puS4WRX5aIMkVhUazsRIa
# zDOuXCCskUWuRapiCwY0E4g7cCaT8/JR6JjjBaTgkjJgvo5Y8Eg=
# =ILek
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 22 Oct 2023 02:18:43 PDT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (62 commits)
intel-iommu: Report interrupt remapping faults, fix return value
MAINTAINERS: Add include/hw/intc/i8259.h to the PC chip section
vhost-user: Fix protocol feature bit conflict
tests/acpi: Update DSDT.cxl with QTG DSM
hw/cxl: Add QTG _DSM support for ACPI0017 device
tests/acpi: Allow update of DSDT.cxl
hw/i386/cxl: ensure maxram is greater than ram size for calculating cxl range
vhost-user: fix lost reconnect
vhost-user-scsi: start vhost when guest kicks
vhost-user-scsi: support reconnect to backend
vhost: move and rename the conn retry times
vhost-user-common: send get_inflight_fd once
hw/i386/pc_piix: Make PIIX4 south bridge usable in PC machine
hw/isa/piix: Implement multi-process QEMU support also for PIIX4
hw/isa/piix: Resolve duplicate code regarding PCI interrupt wiring
hw/isa/piix: Reuse PIIX3's PCI interrupt triggering in PIIX4
hw/isa/piix: Rename functions to be shared for PCI interrupt triggering
hw/isa/piix: Reuse PIIX3 base class' realize method in PIIX4
hw/isa/piix: Share PIIX3's base class with PIIX4
hw/isa/piix: Harmonize names of reset control memory regions
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu:
MAINTAINERS: Add the ompic.c file to the or1k-sim section
MAINTAINERS: Fix typo in openpic_kvm.c entry
MAINTAINERS: Add unvalued folders in tests/tcg/ to the right sections
MAINTAINERS: Add PPC common files to PowerPC TCG CPUs
MAINTAINERS: Add fw_cfg.c to PPC mac99 machine
MAINTAINERS: Adjust file list for PPC pseries machine
MAINTAINERS: Adjust file list for PPC e500 machines
MAINTAINERS: Adjust file list for PPC 4xx CPUs
MAINTAINERS: Adjust file list for PPC ref405ep machine
ppc/{bamboo, virtex_ml507}: Remove useless dependency on ppc405.h header
MAINTAINERS: Fix a couple s390 paths
MAINTAINERS: Add docs/devel/ebpf_rss.rst to the EBPF section
MAINTAINERS: Add include/hw/intc/i8259.h to the PC chip section
MAINTAINERS: Add the nios2 interrupt controller to the nios2 section
MAINTAINERS: Cover hw/ppc/ppc440_uc.c with Sam460ex board
hw/ppc/ppc440_uc: Remove dead l2sram_update_mappings()
hw/rdma/vmw/pvrdma_cmd: Use correct struct in query_port()
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* tag 'hw-misc-20231020' of https://github.com/philmd/qemu: (41 commits)
ui/input: Constify QemuInputHandler structure
hw/net: Declare link using static DEFINE_PROP_LINK() macro
hw/dma: Declare link using static DEFINE_PROP_LINK() macro
hw/scsi/virtio-scsi: Use VIRTIO_SCSI_COMMON() macro
hw/display/virtio-gpu: Use VIRTIO_DEVICE() macro
hw/block/vhost-user-blk: Use DEVICE() / VIRTIO_DEVICE() macros
hw/virtio/virtio-pmem: Replace impossible check by assertion
hw/s390x/css-bridge: Realize sysbus device before accessing it
hw/isa: Realize ISA bridge device before accessing it
hw/arm/virt: Realize ARM_GICV2M sysbus device before accessing it
hw/acpi: Realize ACPI_GED sysbus device before accessing it
hw/pci-host/bonito: Do not use SysBus API to map local MMIO region
hw/misc/allwinner-dramc: Do not use SysBus API to map local MMIO region
hw/misc/allwinner-dramc: Move sysbus_mmio_map call from init -> realize
hw/i386/intel_iommu: Do not use SysBus API to map local MMIO region
hw/i386/amd_iommu: Do not use SysBus API to map local MMIO region
hw/intc/spapr_xive: Do not use SysBus API to map local MMIO region
hw/intc/spapr_xive: Move sysbus_init_mmio() calls around
hw/ppc/pnv: Do not use SysBus API to map local MMIO region
hw/ppc/pnv_xscom: Do not use SysBus API to map local MMIO region
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Richard Henderson [Thu, 19 Oct 2023 18:21:40 +0000 (11:21 -0700)]
target/rx: Use tcg_gen_ext_i32
Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Paolo Bonzini [Sun, 22 Oct 2023 23:34:21 +0000 (16:34 -0700)]
tcg: Define MO_TL
This will also come in handy later for "less than" comparisons.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <03ba02fd-fade-4409-be16-2f81a5690b4c@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Paolo Bonzini [Thu, 19 Oct 2023 10:46:43 +0000 (12:46 +0200)]
tcg: add negsetcondi
This can be useful to write a shift bit extraction that does not
depend on TARGET_LONG_BITS.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20231019104648.389942-15-pbonzini@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Thu, 24 Aug 2023 17:25:21 +0000 (10:25 -0700)]
tcg: Add tcg_gen_{ld,st}_i128
Do not require the translators to jump through concat and
extract of i64 in order to move values to and from env.
Tested-by: Song Gao <gaosong@loongson.cn> Reviewed-by: Song Gao <gaosong@loongson.cn> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Mike Frysinger [Sun, 15 Oct 2023 01:00:46 +0000 (06:45 +0545)]
tcg: drop unused tcg_temp_free define
Use of the API was removed a while back, but the define wasn't.
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20231015010046.16020-1-vapier@gentoo.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Sun, 1 Oct 2023 14:53:03 +0000 (07:53 -0700)]
tcg: Introduce tcg_use_softmmu
Begin disconnecting CONFIG_SOFTMMU from !CONFIG_USER_ONLY.
Introduce a variable which can be set at startup to select
one method or another for user-only.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Jordan Niethe [Tue, 15 Aug 2023 16:47:11 +0000 (16:47 +0000)]
tcg/ppc: Enable direct branching tcg_out_goto_tb with TCG_REG_TB
Direct branch patching was disabled when using TCG_REG_TB in commit 736a1588c1 ("tcg/ppc: Fix race in goto_tb implementation").
The issue with direct branch patching with TCG_REG_TB is the lack of
synchronization between the new TCG_REG_TB being established and the
direct branch being patched in.
If each translation block is responsible for establishing its own
TCG_REG_TB then there can be no synchronization issue.
Make each translation block begin by setting up its own TCG_REG_TB.
Use the preferred 'bcl 20,31,$+4' sequence.
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
[rth: Split out tcg_out_tb_start, power9 addpcis] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
David Woodhouse [Wed, 23 Aug 2023 12:23:25 +0000 (13:23 +0100)]
intel-iommu: Report interrupt remapping faults, fix return value
A generic X86IOMMUClass->int_remap function should not return VT-d
specific values; fix it to return 0 if the interrupt was successfully
translated or -EINVAL if not.
The VTD_FR_IR_xxx values are supposed to be used to actually raise
faults through the fault reporting mechanism, so do that instead for
the case where the IRQ is actually being injected.
There is more work to be done here, as pretranslations for the KVM IRQ
routing table can't fault; an untranslatable IRQ should be handled in
userspace and the fault raised only when the IRQ actually happens (if
indeed the IRTE is still not valid at that time). But we can work on
that later; we can at least raise faults for the direct case.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <31bbfc9041690449d3ac891f4431ec82174ee1b4.camel@infradead.org> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Thomas Huth [Tue, 17 Oct 2023 15:26:25 +0000 (17:26 +0200)]
MAINTAINERS: Add include/hw/intc/i8259.h to the PC chip section
i8259.c is already listed here, so the corresponding header should
be mentioned in this section, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231017152625.229022-1-thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Hanna Czenczek [Mon, 16 Oct 2023 08:32:01 +0000 (10:32 +0200)]
vhost-user: Fix protocol feature bit conflict
The VHOST_USER_PROTOCOL_F_XEN_MMAP feature bit was defined in f21e95ee97d, which has been part of qemu's 8.1.0 release. However, it
seems it was never added to qemu's code, but it is well possible that it
is already used by different front-ends outside of qemu (i.e., Xen).
VHOST_USER_PROTOCOL_F_SHARED_OBJECT in contrast was added to qemu's code
in 16094766627, but never defined in the vhost-user specification. As a
consequence, both bits were defined to be 17, which cannot work.
Regardless of whether actual code or the specification should take
precedence, F_XEN_MMAP is already part of a qemu release, while
F_SHARED_OBJECT is not. Therefore, bump the latter to take number 18
instead of 17, and add this to the specification.
Take the opportunity to add at least a little note on the
VhostUserShared structure to the specification. This structure is
referenced by the new commands introduced in 16094766627, but was not
defined.
Fixes: 160947666276c5b7f6bca4d746bcac2966635d79
("vhost-user: add shared_object msg") Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20231016083201.23736-1-hreitz@redhat.com> Reviewed-by: Emmanouil Pitsidianakis <manos.pitsidianakis@linaro.org> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Thu, 12 Oct 2023 12:56:23 +0000 (13:56 +0100)]
tests/acpi: Update DSDT.cxl with QTG DSM
Description of change in previous patch.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Message-Id: <20231012125623.21101-4-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Dave Jiang [Thu, 12 Oct 2023 12:56:22 +0000 (13:56 +0100)]
hw/cxl: Add QTG _DSM support for ACPI0017 device
Add a simple _DSM call support for the ACPI0017 device to return fake QTG
ID values of 0 and 1 in all cases. This for _DSM plumbing testing from the OS.
Following edited for readability
Device (CXLM)
{
Name (_HID, "ACPI0017") // _HID: Hardware ID
...
Method (_DSM, 4, Serialized) // _DSM: Device-Specific Method
{
If ((Arg0 == ToUUID ("f365f9a6-a7de-4071-a66a-b40c0b4f8e52")))
{
If ((Arg2 == Zero))
{
Return (Buffer (One) { 0x01 })
}
If ((Arg2 == One))
{
Return (Package (0x02)
{
One,
Package (0x02)
{
Zero,
One
}
})
}
}
}
Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20231012125623.21101-3-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Thu, 12 Oct 2023 12:56:21 +0000 (13:56 +0100)]
tests/acpi: Allow update of DSDT.cxl
Addition of QTG in following patch requires an update to the test
data.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Message-Id: <20231012125623.21101-2-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ani Sinha [Wed, 11 Oct 2023 10:53:35 +0000 (16:23 +0530)]
hw/i386/cxl: ensure maxram is greater than ram size for calculating cxl range
pc_get_device_memory_range() finds the device memory size by calculating the
difference between maxram and ram sizes. This calculation makes sense only when
maxram is greater than the ram size. Make sure we check for that before calling
pc_get_device_memory_range().
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20231011105335.42296-1-anisinha@redhat.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Li Feng [Mon, 9 Oct 2023 04:47:01 +0000 (12:47 +0800)]
vhost-user: fix lost reconnect
When the vhost-user is reconnecting to the backend, and if the vhost-user fails
at the get_features in vhost_dev_init(), then the reconnect will fail
and it will not be retriggered forever.
The reason is:
When the vhost-user fails at get_features, the vhost_dev_cleanup will be called
immediately.
Li Feng [Mon, 9 Oct 2023 04:47:00 +0000 (12:47 +0800)]
vhost-user-scsi: start vhost when guest kicks
Let's keep the same behavior as vhost-user-blk.
Some old guests kick virtqueue before setting VIRTIO_CONFIG_S_DRIVER_OK.
Signed-off-by: Li Feng <fengli@smartx.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20231009044735.941655-5-fengli@smartx.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Li Feng [Mon, 9 Oct 2023 04:46:59 +0000 (12:46 +0800)]
vhost-user-scsi: support reconnect to backend
If the backend crashes and restarts, the device is broken.
This patch adds reconnect for vhost-user-scsi.
This patch also improves the error messages, and reports some silent errors.
Tested with spdk backend.
Signed-off-by: Li Feng <fengli@smartx.com>
Message-Id: <20231009044735.941655-4-fengli@smartx.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Li Feng [Mon, 9 Oct 2023 04:46:58 +0000 (12:46 +0800)]
vhost: move and rename the conn retry times
Multiple devices need this macro, move it to a common header.
Signed-off-by: Li Feng <fengli@smartx.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20231009044735.941655-3-fengli@smartx.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Li Feng [Mon, 9 Oct 2023 04:46:57 +0000 (12:46 +0800)]
vhost-user-common: send get_inflight_fd once
Currently the get_inflight_fd will be sent every time the device is started, and
the backend will allocate shared memory to save the inflight state. If the
backend finds that it receives the second get_inflight_fd, it will release the
previous shared memory, which breaks inflight working logic.
This patch is a preparation for the following patches.
Signed-off-by: Li Feng <fengli@smartx.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20231009044735.941655-2-fengli@smartx.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bernhard Beschow [Sat, 7 Oct 2023 12:38:37 +0000 (14:38 +0200)]
hw/i386/pc_piix: Make PIIX4 south bridge usable in PC machine
QEMU's PIIX3 implementation actually models the real PIIX4, but with different
PCI IDs. Usually, guests deal just fine with it. Still, in order to provide a
more consistent illusion to guests, allow QEMU's PIIX4 implementation to be used
in the PC machine.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20231007123843.127151-30-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bernhard Beschow [Sat, 7 Oct 2023 12:38:36 +0000 (14:38 +0200)]
hw/isa/piix: Implement multi-process QEMU support also for PIIX4
So far multi-process QEMU was only implemented for PIIX3. Move the support into
the base class to achieve feature parity between both device models.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20231007123843.127151-29-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now that both PIIX3 and PIIX4 use piix_set_irq() to trigger PCI IRQs the wiring
in the respective realize methods can be shared, too.
Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20231007123843.127151-28-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bernhard Beschow [Sat, 7 Oct 2023 12:38:34 +0000 (14:38 +0200)]
hw/isa/piix: Reuse PIIX3's PCI interrupt triggering in PIIX4
Speeds up PIIX4 which resolves an old TODO. Also makes PIIX4 compatible with Xen
which relies on pci_bus_fire_intx_routing_notifier() to be fired.
Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20231007123843.127151-27-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bernhard Beschow [Sat, 7 Oct 2023 12:38:33 +0000 (14:38 +0200)]
hw/isa/piix: Rename functions to be shared for PCI interrupt triggering
PIIX4 will get the same optimizations which are already implemented for
PIIX3.
Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20231007123843.127151-26-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bernhard Beschow [Sat, 7 Oct 2023 12:38:32 +0000 (14:38 +0200)]
hw/isa/piix: Reuse PIIX3 base class' realize method in PIIX4
Resolves duplicate code. Also makes PIIX4 respect the PIIX3 properties which get
added, too. This allows for using PIIX4 in the PC machine.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20231007123843.127151-25-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bernhard Beschow [Sat, 7 Oct 2023 12:38:31 +0000 (14:38 +0200)]
hw/isa/piix: Share PIIX3's base class with PIIX4
Having a common base class will allow for futher code sharing between PIIX3 and
PIIX4. Moreover, it makes PIIX4 implement the acpi-dev-aml-interface.
Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20231007123843.127151-24-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bernhard Beschow [Sat, 7 Oct 2023 12:38:30 +0000 (14:38 +0200)]
hw/isa/piix: Harmonize names of reset control memory regions
There is no need for having different names here. Having the same name
further allows code to be shared between PIIX3 and PIIX4.
Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20231007123843.127151-23-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bernhard Beschow [Sat, 7 Oct 2023 12:38:29 +0000 (14:38 +0200)]
hw/isa/piix: Allow for optional PIT creation in PIIX3
In the PC machine, the PIT is created in board code to allow it to be
virtualized with various virtualization techniques. So explicitly disable its
creation in the PC machine via a property which defaults to enabled. Once the
PIIX implementations are consolidated this default will keep Malta working
without further ado.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20231007123843.127151-22-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bernhard Beschow [Sat, 7 Oct 2023 12:38:28 +0000 (14:38 +0200)]
hw/isa/piix: Allow for optional PIC creation in PIIX3
In the PC machine, the PIC is created in board code to allow it to be
virtualized with various virtualization techniques. So explicitly disable its
creation in the PC machine via a property which defaults to enabled. Once the
PIIX implementations are consolidated this default will keep Malta working
without further ado.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20231007123843.127151-21-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bernhard Beschow [Sat, 7 Oct 2023 12:38:27 +0000 (14:38 +0200)]
hw/isa/piix3: Merge hw/isa/piix4.c
Now that the PIIX3 and PIIX4 device models are sufficiently prepared, their
implementations can be merged into one file for further consolidation.
Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20231007123843.127151-20-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bernhard Beschow [Sat, 7 Oct 2023 12:38:26 +0000 (14:38 +0200)]
hw/isa/piix4: Reuse struct PIIXState from PIIX3
PIIX4 has its own, private PIIX4State structure. PIIX3 has almost the
same structure, provided in a public header. So reuse it and add a
cpu_intr attribute to it which is only used by PIIX4.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20231007123843.127151-19-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bernhard Beschow [Sat, 7 Oct 2023 12:38:25 +0000 (14:38 +0200)]
hw/isa/piix4: Rename reset control operations to match PIIX3
Both implementations are the same and will be shared upon merging.
Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20231007123843.127151-18-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bernhard Beschow [Sat, 7 Oct 2023 12:38:24 +0000 (14:38 +0200)]
hw/isa/piix4: Rename "isa" attribute to "isa_irqs_in"
Rename the "isa" attribute to align it with PIIX3 for consolidation.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20231007123843.127151-17-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bernhard Beschow [Sat, 7 Oct 2023 12:38:23 +0000 (14:38 +0200)]
hw/isa/piix4: Remove unused inbound ISA interrupt lines
The Malta board, which is the only user of PIIX4, doesn't connect to the
exported interrupt lines. PIIX3 doesn't expose such interrupt lines
either, so remove them for PIIX4 for simplicity and consistency.
Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20231007123843.127151-16-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bernhard Beschow [Sat, 7 Oct 2023 12:38:22 +0000 (14:38 +0200)]
hw/isa/piix3: Drop the "3" from PIIX base class name
TYPE_PIIX3_PCI_DEVICE was the former base class of the Xen and non-Xen variants
of the PIIX3 ISA device models. It will become the base class for the PIIX3 and
PIIX4 device models, so drop the "3" from the type names.
Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20231007123843.127151-15-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bernhard Beschow [Sat, 7 Oct 2023 12:38:21 +0000 (14:38 +0200)]
hw/isa/piix3: Create power management controller in host device
The power management controller is an integral part of PIIX3 (function 3). So
create it as part of the south bridge.
Note that the ACPI function is optional in QEMU. This is why it gets
object_initialize_child()'ed in realize rather than in instance_init.
Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20231007123843.127151-14-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>