David Woodhouse [Wed, 11 Jan 2023 14:52:17 +0000 (14:52 +0000)]
Notify IRQ sources of level interrupt ack/EOI
This allows drivers to register a callback on a qemu_irq, which is
invoked when a level-triggered IRQ is acked on the irqchip.
This allows us to simulate level-triggered interrupts more efficiently,
by resampling the state of the interrupt only when it actually matters.
This can be used in two ways.
The example in the patch below shows the event source literally being
resampled from the callback — in this case the line level is tied to
the Xen evtchn_upcall_pending flag in vCPU0's vcpu_info, and the
callback from the irqchip allows us to avoid having to constantly poll
for that being clearer).
As we hook it up to INTx interrupts on VFIO PCI devices, it would
unconditionally return 'true' to clear the level in the irqchip, and
also send an event on the 'resample' eventfd to the kernel, so that the
kernel will reraise the interrupt if it's still actually physically set
on the device.
There's theoretically a race condition there, if the kernel reraises
the interrupt before the callback even returns and the irqchip clears
its internal s->irr. But I think we get away with it by being single-
threaded for the I/O processing so we won't actually consume the event
until later?
It was the Xen part firt that offended me, having to poll on vmexit:
https://git.infradead.org/users/dwmw2/qemu.git/commitdiff/7bada5e4f#patch5
But then I looked at how VFIO handles this, and it offends me even
more; sending the resample eventfd down to the kernel on *ever* MMIO
read/write.... having unmapped the device's BARs from the guest in
order to *trap* those MMIO accesses... with a timer to map it back
again...
It'll take a little more work to hook up the reverse path for the ack
back through PCI INTx handling, a bit like I've had to do it with
gsi_ack_handler to convey the ack events back from the {i8259,ioapic}
qemu_irq to the GSI qemu_irqs. And I'll need to do it for more than
just i8259 and ioapic. But I suspect it'll be worth it.
Opinions?
Tested by booting a (KVM) Xen guest with xen_no_vector_callback on its
command line, and sometimes also 'apic=off'. In PIC mode we still seem
to get two interrupts per event, but I think that's actually genuine
because printfs in the evtchn code confirm that ->evtchn_upcall_pending
for vCPU0 really *is* still set the first time the interrupt is acked
in the i8259 and genuinely doesn't get cleared.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Tue, 10 Jan 2023 08:01:47 +0000 (08:01 +0000)]
i386/xen: Initialize XenBus and legacy backends from pc_init1()
Now that we're close to being able to use the PV backends without actual
Xen, move the bus instantiation out from xen_hvm_init_pc() to pc_init1().
However, still only do it for (xen_mode == XEN_ATTACH) (i.e. when running
on true Xen) because we don't have XenStore ops for emulation yet, and
the XenBus instantiation failure is fatal. Once we have a functional
XenStore for emulated mode, this will become (xen_mode != XEN_DISABLED).
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Tue, 10 Jan 2023 01:35:33 +0000 (01:35 +0000)]
hw/xen: Remove old version of Xen headers
For Xen emulation support, a full set of Xen headers was imported to
include/standard-headers/xen. This renders the original partial set in
include/hw/xen/interface redundant.
Now that the header ordering and handling of __XEN_INTERFACE_VERSION__
vs. __XEN_TOOLS__ is all untangled, remove the old set of headers and
use only the new ones.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Tue, 10 Jan 2023 01:09:04 +0000 (01:09 +0000)]
hw/xen: Implement soft reset for emulated gnttab
This is only part of it; we will also need to get the PV back end drivers
to tear down their own mappings (or do it for them, but they kind of need
to stop using the pointers too).
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Fri, 6 Jan 2023 09:59:28 +0000 (09:59 +0000)]
hw/xen: Add backend implementation of grant table operations
This is limited to mapping a single grant at a time, because under Xen the
pages are mapped *contiguously* into qemu's address space, and that's very
hard to do when those pages actually come from anonymous mappings in qemu
in the first place.
Eventually perhaps we can look at using shared mappings of actual objects
for system RAM, and then we can make new mappings of the same backing
store (be it deleted files, shmem, whatever). But for now let's stuck to
a page at a time.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Mon, 2 Jan 2023 00:39:13 +0000 (00:39 +0000)]
hw/xen: Rename xen_common.h to xen_native.h
This header is now only for native Xen code, not PV backends that may be
used in Xen emulation. Since the toolstack libraries may depend on the
specific version of Xen headers that they pull in (and will set the
__XEN_TOOLS__ macro to enable internal definitions that they depend on),
the rule is that xen_native.h (and thus the toolstack library headers)
must be included *before* any of the headers in standard-headers/xen.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Sat, 7 Jan 2023 16:47:43 +0000 (16:47 +0000)]
hw/xen: Use XEN_PAGE_SIZE in PV backend drivers
XC_PAGE_SIZE comes from the actual Xen libraries, while XEN_PAGE_SIZE is
provided by QEMU itself in xen_backend_ops.h. For backends which may be
built for emulation mode, use the latter.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Sat, 7 Jan 2023 16:17:51 +0000 (16:17 +0000)]
hw/xen: Move xenstore_store_pv_console_info to xen_console.c
There's no need for this to be in the Xen accel code, and as we want to
use the Xen console support with KVM-emulated Xen we'll want to have a
platform-agnostic version of it. Make it use GString to build up the
path while we're at it.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Sun, 1 Jan 2023 21:31:37 +0000 (21:31 +0000)]
hw/xen: Add gnttab operations to allow redirection to internal emulation
In emulation, mapping more than one grant ref to be virtually contiguous
would be fairly difficult. The best way to do it might be to make the
ram_block mappings actually backed by a file (shmem or a deleted file,
perhaps) so that we can have multiple *shared* mappings of it. But that
would be fairly intrusive.
Making the backend drivers cope with page *lists* instead of expecting
the mapping to be contiguous is also non-trivial, since some structures
would actually *cross* page boundaries (e.g. the 32-bit blkif responses
which are 12 bytes).
So for now, we'll support only single-page mappings in emulation.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
David Woodhouse [Wed, 28 Dec 2022 10:06:49 +0000 (10:06 +0000)]
hw/xen: Add basic ring handling to xenstore
Extract requests, return ENOSYS to all of them. This is enough to allow
older Linux guests to boot, as they need *something* back but it doesn't
matter much what.
In the first instance we're likely to wire this up over a UNIX socket to
an actual xenstored implementation, but in the fullness of time it would
be nice to have a fully single-tenant "virtual" xenstore within qemu.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Tue, 27 Dec 2022 21:20:07 +0000 (21:20 +0000)]
hw/xen: Add backend implementation of interdomain event channel support
The provides the QEMU side of interdomain event channels, allowing events
to be sent to/from the guest.
The API mirrors libxenevtchn, and in time both this and the real Xen one
will be available through ops structures so that the PV backend drivers
can use the correct one as appropriate.
For now, this implementation can be used directly by our XenStore which
will be for emulated mode only.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Which is used to fetch xenstore PFN and port to be used
by the guest. This is preallocated by the toolstack when
guest will just read those and use it straight away.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Introduce support for one shot and periodic mode of Xen PV timers,
whereby timer interrupts come through a special virq event channel
with deadlines being set through:
David Woodhouse [Fri, 16 Dec 2022 00:03:21 +0000 (00:03 +0000)]
hw/xen: Support HVM_PARAM_CALLBACK_TYPE_PCI_INTX callback
The guest is permitted to specify an arbitrary domain/bus/device/function
and INTX pin from which the callback IRQ shall appear to have come.
In QEMU we can only easily do this for devices that actually exist, and
even that requires us "knowing" that it's a PCMachine in order to find
the PCI root bus — although that's OK really because it's always true.
We also don't get to get notified of INTX routing changes, because we
can't do that as a passive observer; if we try to register a notifier
it will overwrite any existing notifier callback on the device.
But in practice, guests using PCI_INTX will only ever use pin A on the
Xen platform device, and won't swizzle the INTX routing after they set
it up. So this is just fine.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Thu, 15 Dec 2022 20:35:24 +0000 (20:35 +0000)]
hw/xen: Support HVM_PARAM_CALLBACK_TYPE_GSI callback
The GSI callback (and later PCI_INTX) is a level triggered interrupt. It
is asserted when an event channel is delivered to vCPU0, and is supposed
to be cleared when the vcpu_info->evtchn_upcall_pending field for vCPU0
is cleared again.
Thankfully, Xen does *not* assert the GSI if the guest sets its own
evtchn_upcall_pending field; we only need to assert the GSI when we
have delivered an event for ourselves. So that's the easy part.
However, we *do* need to poll for the evtchn_upcall_pending flag being
cleared. In an ideal world we would poll that when the EOI happens on
the PIC/IOAPIC. That's how it works in the kernel with the VFIO eventfd
pairs — one is used to trigger the interrupt, and the other works in the
other direction to 'resample' on EOI, and trigger the first eventfd
again if the line is still active.
However, QEMU doesn't seem to do that. Even VFIO level interrupts seem
to be supported by temporarily unmapping the device's BARs from the
guest when an interrupt happens, then trapping *all* MMIO to the device
and sending the 'resample' event on *every* MMIO access until the IRQ
is cleared! Maybe in future we'll plumb the 'resample' concept through
QEMU's irq framework but for now we'll do what Xen itself does: just
check the flag on every vmexit if the upcall GSI is known to be
asserted.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Tue, 13 Dec 2022 22:40:56 +0000 (22:40 +0000)]
hw/xen: Implement EVTCHNOP_bind_virq
Add the array of virq ports to each vCPU so that we can deliver timers,
debug ports, etc. Global virqs are allocated against vCPU 0 initially,
but can be migrated to other vCPUs (when we implement that).
The kernel needs to know about VIRQ_TIMER in order to accelerate timers,
so tell it via KVM_XEN_VCPU_ATTR_TYPE_TIMER. Also save/restore the value
of the singleshot timer across migration, as the kernel will handle the
hypercalls automatically now.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Tue, 13 Dec 2022 17:20:46 +0000 (17:20 +0000)]
hw/xen: Implement EVTCHNOP_unmask
This finally comes with a mechanism for actually injecting events into
the guest vCPU, with all the atomic-test-and-set that's involved in
setting the bit in the shinfo, then the index in the vcpu_info, and
injecting either the lapic vector as MSI, or letting KVM inject the
bare vector.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Tue, 13 Dec 2022 13:57:44 +0000 (13:57 +0000)]
hw/xen: Implement EVTCHNOP_close
It calls an internal close_port() helper which will also be used from
EVTCHNOP_reset and will actually do the work to disconnect/unbind a port
once any of that is actually implemented in the first place.
That in turn calls a free_port() internal function which will be in
error paths after allocation.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Fri, 16 Dec 2022 14:32:25 +0000 (14:32 +0000)]
i386/xen: Add support for Xen event channel delivery to vCPU
The kvm_xen_inject_vcpu_callback_vector() function will either deliver
the per-vCPU local APIC vector (as an MSI), or just kick the vCPU out
of the kernel to trigger KVM's automatic delivery of the global vector.
Support for asserting the GSI/PCI_INTX callbacks will come later.
Also add kvm_xen_get_vcpu_info_hva() which returns the vcpu_info of
a given vCPU.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Fri, 16 Dec 2022 14:02:29 +0000 (14:02 +0000)]
hw/xen: Add xen_evtchn device for event channel emulation
Include basic support for setting HVM_PARAM_CALLBACK_IRQ to the global
vector method HVM_PARAM_CALLBACK_TYPE_VECTOR, which is handled in-kernel
by raising the vector whenever the vCPU's vcpu_info->evtchn_upcall_pending
flag is set.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Ankur Arora [Tue, 6 Dec 2022 11:14:07 +0000 (11:14 +0000)]
i386/xen: implement HVMOP_set_param
This is the hook for adding the HVM_PARAM_CALLBACK_IRQ parameter in a
subsequent commit.
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Split out from another commit] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
The HVMOP_set_evtchn_upcall_vector hypercall sets the per-vCPU upcall
vector, to be delivered to the local APIC just like an MSI (with an EOI).
This takes precedence over the system-wide delivery method set by the
HVMOP_set_param hypercall with HVM_PARAM_CALLBACK_IRQ. It's used by
Windows and Xen (PV shim) guests but normally not by Linux.
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Rework for upstream kernel changes and split from HVMOP_set_param] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
Joao Martins [Thu, 28 Jun 2018 16:36:19 +0000 (12:36 -0400)]
i386/xen: implement HYPERVISOR_event_channel_op
Additionally set XEN_INTERFACE_VERSION to most recent in order to
exercise the "new" event_channel_op.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Ditch event_channel_op_compat which was never available to HVM guests] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Joao Martins [Mon, 18 Jun 2018 16:26:44 +0000 (12:26 -0400)]
i386/xen: implement HYPERVISOR_vcpu_op
This is simply when guest tries to register a vcpu_info
and since vcpu_info placement is optional in the minimum ABI
therefore we can just fail with -ENOSYS
Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Mon, 12 Dec 2022 14:03:41 +0000 (14:03 +0000)]
i386/xen: manage and save/restore Xen guest long_mode setting
Xen will "latch" the guest's 32-bit or 64-bit ("long mode") setting when
the guest writes the MSR to fill in the hypercall page, or when the guest
sets the event channel callback in HVM_PARAM_CALLBACK_IRQ.
KVM handles the former and sets the kernel's long_mode flag accordingly.
The latter will be handled in userspace. Keep them in sync by noticing
when a hypercall is made in a mode that doesn't match qemu's idea of
the guest mode, and resyncing from the kernel. Do that same sync right
before serialization too, in case the guest has set the hypercall page
but hasn't yet made a system call.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Mon, 12 Dec 2022 23:40:45 +0000 (23:40 +0000)]
i386/xen: add pc_machine_kvm_type to initialize XEN_EMULATE mode
The xen_overlay device (and later similar devices for event channels and
grant tables) need to be instantiated. Do this from a kvm_type method on
the PC machine derivatives, since KVM is only way to support Xen emulation
for now.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Wed, 7 Dec 2022 09:19:31 +0000 (09:19 +0000)]
hw/xen: Add xen_overlay device for emulating shared xenheap pages
For the shared info page and for grant tables, Xen shares its own pages
from the "Xen heap" to the guest. The guest requests that a given page
from a certain address space (XENMAPSPACE_shared_info, etc.) be mapped
to a given GPA using the XENMEM_add_to_physmap hypercall.
To support that in qemu when *emulating* Xen, create a memory region
(migratable) and allow it to be mapped as an overlay when requested.
Xen theoretically allows the same page to be mapped multiple times
into the guest, but that's hard to track and reinstate over migration,
so we automatically *unmap* any previous mapping when creating a new
one. This approach has been used in production with.... a non-trivial
number of guests expecting true Xen, without any problems yet being
noticed.
This adds just the shared info page for now. The grant tables will be
a larger region, and will need to be overlaid one page at a time. I
think that means I need to create separate aliases for each page of
the overall grant_frames region, so that they can be mapped individually.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Wed, 14 Dec 2022 21:50:41 +0000 (21:50 +0000)]
i386/xen: Implement SCHEDOP_poll and SCHEDOP_yield
They both do the same thing and just call sched_yield. This is enough to
stop the Linux guest panicking when running on a host kernel which doesn't
intercept SCHEDOP_poll and lets it reach userspace.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
It allows to shutdown itself via hypercall with any of the 3 reasons:
1) self-reboot
2) shutdown
3) crash
Implementing SCHEDOP_shutdown sub op let us handle crashes gracefully rather
than leading to triple faults if it remains unimplemented.
In addition, the SHUTDOWN_soft_reset reason is used for kexec, to reset
Xen shared pages and other enlightenments and leave a clean slate for the
new kernel without the hypervisor helpfully writing information at
unexpected addresses.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Ditch sched_op_compat which was never available for HVM guests,
Add SCHEDOP_soft_reset] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Joao Martins [Thu, 14 Jun 2018 12:29:45 +0000 (08:29 -0400)]
i386/xen: implement HYPERVISOR_xen_version
This is just meant to serve as an example on how we can implement
hypercalls. xen_version specifically since Qemu does all kind of
feature controllability. So handling that here seems appropriate.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Implement kvm_gva_rw() safely] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
Joao Martins [Wed, 13 Jun 2018 14:14:31 +0000 (10:14 -0400)]
i386/xen: handle guest hypercalls
This means handling the new exit reason for Xen but still
crashing on purpose. As we implement each of the hypercalls
we will then return the right return code.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Add CPL to hypercall tracing, disallow hypercalls from CPL > 0] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
David Woodhouse [Fri, 16 Dec 2022 11:05:29 +0000 (11:05 +0000)]
i386/hvm: Set Xen vCPU ID in KVM
There are (at least) three different vCPU ID number spaces. One is the
internal KVM vCPU index, based purely on which vCPU was chronologically
created in the kernel first. If userspace threads are all spawned and
create their KVM vCPUs in essentially random order, then the KVM indices
are basically random too.
The second number space is the APIC ID space, which is consistent and
useful for referencing vCPUs. MSIs will specify the target vCPU using
the APIC ID, for example, and the KVM Xen APIs also take an APIC ID
from userspace whenever a vCPU needs to be specified (as opposed to
just using the appropriate vCPU fd).
The third number space is not normally relevant to the kernel, and is
the ACPI/MADT/Xen CPU number which corresponds to cs->cpu_index. But
Xen timer hypercalls use it, and Xen timer hypercalls *really* want
to be accelerated in the kernel rather than handled in userspace, so
the kernel needs to be told.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
Joao Martins [Tue, 6 Dec 2022 10:48:53 +0000 (10:48 +0000)]
i386/kvm: handle Xen HVM cpuid leaves
Introduce support for emulating CPUID for Xen HVM guests. It doesn't make
sense to advertise the KVM leaves to a Xen guest, so do Xen unconditionally
when the xen-version machine property is set.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Obtain xen_version from KVM property, make it automatic] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
David Woodhouse [Sat, 3 Dec 2022 17:51:13 +0000 (09:51 -0800)]
i386/kvm: Add xen-version KVM accelerator property and init KVM Xen support
This just initializes the basic Xen support in KVM for now. Only permitted
on TYPE_PC_MACHINE because that's where the sysbus devices for Xen heap
overlay, event channel, grant tables and other stuff will exist. There's
no point having the basic hypercall support if nothing else works.
Provide sysemu/kvm_xen.h and a kvm_xen_get_caps() which will be used
later by support devices.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
David Woodhouse [Tue, 6 Dec 2022 09:03:48 +0000 (09:03 +0000)]
xen: add CONFIG_XENFV_MACHINE and CONFIG_XEN_EMU options for Xen emulation
The XEN_EMU option will cover core Xen support in target/, which exists
only for x86 with KVM today but could theoretically also be implemented
on Arm/Aarch64 and with TCG or other accelerators. It will also cover
the support for architecture-independent grant table and event channel
support which will be added in hw/i386/kvm/ (on the basis that the
non-KVM support is very theoretical and making it not use KVM directly
seems like gratuitous overengineering at this point).
The XENFV_MACHINE option is for the xenfv platform support, which will
now be used both by XEN_EMU and by real Xen.
The XEN option remains dependent on the Xen runtime libraries, and covers
support for real Xen. Some code which currently resides under CONFIG_XEN
will be moving to CONFIG_XENFV_MACHINE over time.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
Joao Martins [Wed, 13 Feb 2019 17:29:47 +0000 (12:29 -0500)]
include: import Xen public headers to include/standard-headers/
There are already some partial headers in include/hw/xen/interface/
which will be removed once we migrate users to the new location.
To start with, define __XEN_TOOLS__ in hw/xen/xen.h to ensure that any
internal definitions needed by Xen toolstack libraries are present
regardless of the order in which the headers are included. A reckoning
will come later, once we make the PV backends work in emulation and
untangle the headers for Xen-native vs. generic parts.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Update to Xen public headers from 4.16.2 release, add some in io/,
define __XEN_TOOLS__ in hw/xen/xen.h] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
Peter Maydell [Mon, 9 Jan 2023 15:54:31 +0000 (15:54 +0000)]
Merge tag 'pull-request-2023-01-09' of https://gitlab.com/thuth/qemu into staging
* s390x header clean-ups from Philippe
* Rework and improvements of the EINTR handling by Nikita
* Deprecate the -no-hpet command line option
* Disable the qtests in the 32-bit Windows CI job again
* Some other misc fixes here and there
* tag 'pull-request-2023-01-09' of https://gitlab.com/thuth/qemu:
.gitlab-ci.d/windows: Do not run the qtests in the msys2-32bit job
error handling: Use RETRY_ON_EINTR() macro where applicable
Refactoring: refactor TFR() macro to RETRY_ON_EINTR()
docs/interop: Change the vnc-ledstate-Pseudo-encoding doc into .rst
i386: Deprecate the -no-hpet QEMU command line option
tests/qtest/bios-tables-test: Replace -no-hpet with hpet=off machine parameter
tests/readconfig: spice doesn't support unix socket on windows yet
target/s390x: Restrict sysemu/reset.h to system emulation
target/s390x/tcg/excp_helper: Restrict system headers to sysemu
target/s390x/tcg/misc_helper: Remove unused "memory.h" include
hw/s390x/pv: Restrict Protected Virtualization to sysemu
exec/memory: Expose memory_region_access_valid()
MAINTAINERS: Add MIPS-related docs and configs to the MIPS architecture section
tests/vm: Update get_default_jobs() to work on non-x86_64 non-KVM hosts
qemu-iotests/stream-under-throttle: do not shutdown QEMU
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Thomas Huth [Thu, 5 Jan 2023 19:30:58 +0000 (20:30 +0100)]
.gitlab-ci.d/windows: Do not run the qtests in the msys2-32bit job
The qtests are not stable in the msys2-32bit job yet - especially
the test-hmp and the qom-test are failing randomly. Until this is
fixed, let's better disable the qtests here again to avoid failing
CI tests.
Message-Id: <20230105204819.26992-1-thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
Thomas Huth [Thu, 29 Dec 2022 11:49:13 +0000 (12:49 +0100)]
i386: Deprecate the -no-hpet QEMU command line option
The HPET setting has been turned into a machine property a while ago
already, so we should finally do the next step and deprecate the
legacy CLI option, too.
Message-Id: <20221229114913.260400-1-thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
Thomas Huth [Mon, 9 Jan 2023 08:08:23 +0000 (09:08 +0100)]
tests/qtest/bios-tables-test: Replace -no-hpet with hpet=off machine parameter
We are going to deprecate (and finally remove later) the -no-hpet command
line option. Prepare the bios-tables-test by using the replacement hpet=off
machine parameter instead.
Message-Id: <20230109081205.116369-1-thuth@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
Philippe Mathieu-Daudé [Tue, 20 Dec 2022 14:56:24 +0000 (15:56 +0100)]
target/s390x: Restrict sysemu/reset.h to system emulation
In user emulation, threads -- implemented as CPU -- are
created/destroyed, but never reset. There is no point in
allowing the user emulation access the sysemu/reset API.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221220145625.26392-5-philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
Philippe Mathieu-Daudé [Sat, 17 Dec 2022 15:24:54 +0000 (16:24 +0100)]
target/s390x/tcg/excp_helper: Restrict system headers to sysemu
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221217152454.96388-6-philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Philippe Mathieu-Daudé [Sat, 17 Dec 2022 15:24:53 +0000 (16:24 +0100)]
target/s390x/tcg/misc_helper: Remove unused "memory.h" include
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221217152454.96388-5-philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Philippe Mathieu-Daudé [Sat, 17 Dec 2022 15:24:52 +0000 (16:24 +0100)]
hw/s390x/pv: Restrict Protected Virtualization to sysemu
Protected Virtualization is irrelevant in user emulation.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221217152454.96388-4-philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Philippe Mathieu-Daudé [Sat, 17 Dec 2022 15:24:50 +0000 (16:24 +0100)]
exec/memory: Expose memory_region_access_valid()
Instead of having hardware device poking into memory
internal API, expose memory_region_access_valid().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221217152454.96388-2-philmd@linaro.org> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Thomas Huth [Mon, 12 Dec 2022 17:12:52 +0000 (18:12 +0100)]
MAINTAINERS: Add MIPS-related docs and configs to the MIPS architecture section
docs/system/target-mips.rst and configs/targets/mips* are not covered
in our MAINTAINERS file yet, so let's add them now.
Message-Id: <20221212171252.194864-1-thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Philippe Mathieu-Daudé [Fri, 9 Dec 2022 16:47:43 +0000 (17:47 +0100)]
tests/vm: Update get_default_jobs() to work on non-x86_64 non-KVM hosts
On non-x86_64 host, if KVM is not available we get:
Traceback (most recent call last):
File "tests/vm/basevm.py", line 634, in main
vm = vmcls(args, config=config)
File "tests/vm/basevm.py", line 104, in __init__
mem = max(4, args.jobs)
TypeError: '>' not supported between instances of 'NoneType' and 'int'
Fix by always returning a -- not ideal but safe -- '1' value.
Fixes: b09539444a ("tests/vm: allow us to take advantage of MTTCG") Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221209164743.70836-1-philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
Christian Borntraeger [Wed, 7 Dec 2022 13:14:52 +0000 (14:14 +0100)]
qemu-iotests/stream-under-throttle: do not shutdown QEMU
Without a kernel or boot disk a QEMU on s390 will exit (usually with a
disabled wait state). This breaks the stream-under-throttle test case.
Do not exit qemu if on s390.
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Message-Id: <20221207131452.8455-1-borntraeger@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
Peter Maydell [Mon, 9 Jan 2023 10:07:11 +0000 (10:07 +0000)]
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: features, cleanups, fixes
mostly vhost-vdpa:
guest announce feature emulation when using shadow virtqueue
support for configure interrupt
startup speed ups
an acpi change to only generate cluster node in PPTT when specified for arm
misc fixes, cleanups
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Sun 08 Jan 2023 08:01:39 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (50 commits)
vhost-scsi: fix memleak of vsc->inflight
acpi: cpuhp: fix guest-visible maximum access size to the legacy reg block
tests: acpi: aarch64: Add *.topology tables
tests: acpi: aarch64: Add topology test for aarch64
tests: acpi: Add and whitelist *.topology blobs
tests: virt: Update expected ACPI tables for virt test
hw/acpi/aml-build: Only generate cluster node in PPTT when specified
tests: virt: Allow changes to PPTT test table
virtio-pci: fix proxy->vector_irqfd leak in virtio_pci_set_guest_notifiers
vdpa: commit all host notifier MRs in a single MR transaction
vhost: configure all host notifiers in a single MR transaction
vhost: simplify vhost_dev_enable_notifiers
vdpa: harden the error path if get_iova_range failed
vdpa-dev: get iova range explicitly
docs/devel: Rules on #include in headers
include: Include headers where needed
include/hw/virtio: Break inclusion loop
include/hw/cxl: Break inclusion loop cxl_pci.h and cxl_cdat_h
include/hw/pci: Include hw/pci/pci.h where needed
include/hw/pci: Split pci_device.h off pci.h
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Sun, 8 Jan 2023 14:27:40 +0000 (14:27 +0000)]
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* Atomic memslot updates for KVM (Emanuele, David)
* Always send errors to logfile when daemonized (Greg)
* Add support for IDE CompactFlash card (Lubomir)
* First round of build system cleanups (myself)
* First round of feature removals (myself)
* Reduce "qemu/accel.h" inclusion (Philippe)
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (24 commits)
i386: SGX: remove deprecated member of SGXInfo
target/i386: Add SGX aex-notify and EDECCSSA support
util: remove support -chardev tty and -chardev parport
util: remove support for hex numbers with a scaling suffix
KVM: remove support for kernel-irqchip=off
docs: do not talk about past removal as happening in the future
meson: accept relative symlinks in "meson introspect --installed" data
meson: cleanup compiler detection
meson: support meson 0.64 -Doptimization=plain
configure: test all warnings
tests/qapi-schema: remove Meson workaround
meson: cleanup dummy-cpus.c rules
meson: tweak hardening options for Windows
configure: remove backwards-compatibility and obsolete options
configure: preserve qemu-ga variables
configure: cleanup $cpu tests
configure: remove dead function
configure: remove useless write_c_skeleton
ide: Add "ide-cf" driver, a CompactFlash card
ide: Add 8-bit data mode
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Sun, 8 Jan 2023 11:23:17 +0000 (11:23 +0000)]
Merge tag 'pull-tcg-20230106' of https://gitlab.com/rth7680/qemu into staging
tcg/s390x improvements:
- drop support for pre-z196 cpus (eol before 2017)
- add support for misc-instruction-extensions-3
- misc cleanups
# gpg: Signature made Sat 07 Jan 2023 07:47:59 GMT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* tag 'pull-tcg-20230106' of https://gitlab.com/rth7680/qemu: (27 commits)
tcg/s390x: Avoid the constant pool in tcg_out_movi
tcg/s390x: Cleanup tcg_out_movi
tcg/s390x: Tighten constraints for 64-bit compare
tcg/s390x: Implement ctpop operation
tcg/s390x: Use tgen_movcond_int in tgen_clz
tcg/s390x: Support SELGR instruction in movcond
tcg/s390x: Generalize movcond implementation
tcg/s390x: Create tgen_cmp2 to simplify movcond
tcg/s390x: Support MIE3 logical operations
tcg/s390x: Tighten constraints for and_i64
tcg/s390x: Tighten constraints for or_i64 and xor_i64
tcg/s390x: Issue XILF directly for xor_i32
tcg/s390x: Support MIE2 MGRK instruction
tcg/s390x: Support MIE2 multiply single instructions
tcg/s390x: Distinguish RIE formats
tcg/s390x: Distinguish RRF-a and RRF-c formats
tcg/s390x: Use LARL+AGHI for odd addresses
tcg/s390x: Remove DISTINCT_OPERANDS facility check
tcg/s390x: Remove FAST_BCR_SER facility check
tcg/s390x: Check for load-on-condition facility at startup
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x7f00aec57917 in __interceptor_calloc (/lib64/libasan.so.6+0xb4917)
#1 0x7f00ada0d7b5 in g_malloc0 (/lib64/libglib-2.0.so.0+0x517b5)
#2 0x5648ffd38bac in vhost_scsi_start ../hw/scsi/vhost-scsi.c:92
#3 0x5648ffd38d52 in vhost_scsi_set_status ../hw/scsi/vhost-scsi.c:131
#4 0x5648ffda340e in virtio_set_status ../hw/virtio/virtio.c:2036
#5 0x5648ff8de281 in virtio_ioport_write ../hw/virtio/virtio-pci.c:431
#6 0x5648ff8deb29 in virtio_pci_config_write ../hw/virtio/virtio-pci.c:576
#7 0x5648ffe5c0c2 in memory_region_write_accessor ../softmmu/memory.c:493
#8 0x5648ffe5c424 in access_with_adjusted_size ../softmmu/memory.c:555
#9 0x5648ffe6428f in memory_region_dispatch_write ../softmmu/memory.c:1515
#10 0x5648ffe8613d in flatview_write_continue ../softmmu/physmem.c:2825
#11 0x5648ffe86490 in flatview_write ../softmmu/physmem.c:2867
#12 0x5648ffe86d9f in address_space_write ../softmmu/physmem.c:2963
#13 0x5648ffe86e57 in address_space_rw ../softmmu/physmem.c:2973
#14 0x5648fffbfb3d in kvm_handle_io ../accel/kvm/kvm-all.c:2639
#15 0x5648fffc0e0d in kvm_cpu_exec ../accel/kvm/kvm-all.c:2890
#16 0x5648fffc90a7 in kvm_vcpu_thread_fn ../accel/kvm/kvm-accel-ops.c:51
#17 0x56490042400a in qemu_thread_start ../util/qemu-thread-posix.c:505
#18 0x7f00ac3b6ea4 in start_thread (/lib64/libpthread.so.0+0x7ea4)
Free the vsc->inflight at the 'stop' path.
Fixes: b82526c7ee ("vhost-scsi: support inflight io track") Cc: Joe Jin <joe.jin@oracle.com> Cc: Li Feng <fengli@smartx.com> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Message-Id: <20230104160433.21353-1-dongli.zhang@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Laszlo Ersek [Thu, 5 Jan 2023 16:18:04 +0000 (17:18 +0100)]
acpi: cpuhp: fix guest-visible maximum access size to the legacy reg block
The modern ACPI CPU hotplug interface was introduced in the following
series (aa1dd39ca307..679dd1a957df), released in v2.7.0:
1 abd49bc2ed2f docs: update ACPI CPU hotplug spec with new protocol
2 16bcab97eb9f pc: piix4/ich9: add 'cpu-hotplug-legacy' property
3 5e1b5d93887b acpi: cpuhp: add CPU devices AML with _STA method
4 ac35f13ba8f8 pc: acpi: introduce AcpiDeviceIfClass.madt_cpu hook
5 d2238cb6781d acpi: cpuhp: implement hot-add parts of CPU hotplug
interface
6 8872c25a26cc acpi: cpuhp: implement hot-remove parts of CPU hotplug
interface
7 76623d00ae57 acpi: cpuhp: add cpu._OST handling
8 679dd1a957df pc: use new CPU hotplug interface since 2.7 machine type
Before patch#1, "docs/specs/acpi_cpu_hotplug.txt" only specified 1-byte
accesses for the hotplug register block. Patch#1 preserved the same
restriction for the legacy register block, but:
- it specified DWORD accesses for some of the modern registers,
- in particular, the switch from the legacy block to the modern block
would require a DWORD write to the *legacy* block.
The latter functionality was then implemented in cpu_status_write()
[hw/acpi/cpu_hotplug.c], in patch#8.
Unfortunately, all DWORD accesses depended on a dormant bug: the one
introduced in earlier commit a014ed07bd5a ("memory: accept mismatching
sizes in memory_region_access_valid", 2013-05-29); first released in
v1.6.0. Due to commit a014ed07bd5a, the DWORD accesses to the *legacy*
CPU hotplug register block would work in spite of the above series *not*
relaxing "valid.max_access_size = 1" in "hw/acpi/cpu_hotplug.c":
Later, in commits e6d0c3ce6895 ("acpi: cpuhp: introduce 'Command data 2'
field", 2020-01-22) and ae340aa3d256 ("acpi: cpuhp: spec: add typical
usecases", 2020-01-22), first released in v5.0.0, the modern CPU hotplug
interface (including the documentation) was extended with another DWORD
*read* access, namely to the "Command data 2" register, which would be
important for the guest to confirm whether it managed to switch the
register block from legacy to modern.
This functionality too silently depended on the bug from commit a014ed07bd5a.
In commit 5d971f9e6725 ('memory: Revert "memory: accept mismatching sizes
in memory_region_access_valid"', 2020-06-26), first released in v5.1.0,
the bug from commit a014ed07bd5a was fixed (the commit was reverted).
That swiftly exposed the bug in "AcpiCpuHotplug_ops", still present from
the v2.7.0 series quoted at the top -- namely the fact that
"valid.max_access_size = 1" didn't match what the guest was supposed to
do, according to the spec ("docs/specs/acpi_cpu_hotplug.txt").
The symptom is that the "modern interface negotiation protocol"
described in commit ae340aa3d256:
> + Use following steps to detect and enable modern CPU hotplug interface:
> + 1. Store 0x0 to the 'CPU selector' register,
> + attempting to switch to modern mode
> + 2. Store 0x0 to the 'CPU selector' register,
> + to ensure valid selector value
> + 3. Store 0x0 to the 'Command field' register,
> + 4. Read the 'Command data 2' register.
> + If read value is 0x0, the modern interface is enabled.
> + Otherwise legacy or no CPU hotplug interface available
falls apart for the guest: steps 1 and 2 are lost, because they are DWORD
writes; so no switching happens. Step 3 (a single-byte write) is not
lost, but it has no effect; see the condition in cpu_status_write() in
patch#8. And step 4 *misleads* the guest into thinking that the switch
worked: the DWORD read is lost again -- it returns zero to the guest
without ever reaching the device model, so the guest never learns the
switch didn't work.
This means that guest behavior centered on the "Command data 2" register
worked *only* in the v5.0.0 release; it got effectively regressed in
v5.1.0.
To make things *even more* complicated, the breakage was (and remains, as
of today) visible with TCG acceleration only. Commit 5d971f9e6725 makes
no difference with KVM acceleration -- the DWORD accesses still work,
despite "valid.max_access_size = 1".
As commit 5d971f9e6725 suggests, fix the problem by raising
"valid.max_access_size" to 4 -- the spec now clearly instructs the guest
to perform DWORD accesses to the legacy register block too, for enabling
(and verifying!) the modern block. In order to keep compatibility for the
device model implementation though, set "impl.max_access_size = 1", so
that wide accesses be split before they reach the legacy read/write
handlers, like they always have been on KVM, and like they were on TCG
before 5d971f9e6725 (v5.1.0).
Tested with:
- OVMF IA32 + qemu-system-i386, CPU hotplug/hot-unplug with SMM,
intermixed with ACPI S3 suspend/resume, using KVM accel
(regression-test);
- OVMF IA32X64 + qemu-system-x86_64, CPU hotplug/hot-unplug with SMM,
intermixed with ACPI S3 suspend/resume, using KVM accel
(regression-test);
- OVMF IA32 + qemu-system-i386, SMM enabled, using TCG accel; verified the
register block switch and the present/possible CPU counting through the
modern hotplug interface, during OVMF boot (bugfix test);
- I do not have any testcase (guest payload) for regression-testing CPU
hotplug through the *legacy* CPU hotplug register block.
Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Ani Sinha <ani@anisinha.ca> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Philippe Mathieu-Daudé <philmd@linaro.org> Cc: qemu-stable@nongnu.org
Ref: "IO port write width clamping differs between TCG and KVM" Link: http://mid.mail-archive.com/aaedee84-d3ed-a4f9-21e7-d221a28d1683@redhat.com Link: https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg00199.html Reported-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Tested-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230105161804.82486-1-lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Message-Id: <20221229065513.55652-7-yangyicong@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Yicong Yang [Thu, 29 Dec 2022 06:55:12 +0000 (14:55 +0800)]
tests: acpi: aarch64: Add topology test for aarch64
Add test for aarch64's ACPI topology building for all the supported
levels.
Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Yanan Wang <wangyanan55@huawei.com> Tested-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Message-Id: <20221229065513.55652-6-yangyicong@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Yicong Yang [Thu, 29 Dec 2022 06:55:11 +0000 (14:55 +0800)]
tests: acpi: Add and whitelist *.topology blobs
Add and whitelist *.topology blobs, prepares for the aarch64's ACPI
topology building test.
Reviewed-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Message-Id: <20221229065513.55652-5-yangyicong@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Message-Id: <20221229065513.55652-4-yangyicong@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Yicong Yang [Thu, 29 Dec 2022 06:55:09 +0000 (14:55 +0800)]
hw/acpi/aml-build: Only generate cluster node in PPTT when specified
Currently we'll always generate a cluster node no matter user has
specified '-smp clusters=X' or not. Cluster is an optional level
and will participant the building of Linux scheduling domains and
only appears on a few platforms. It's unncessary to always build
it when it cannot reflect the real topology on platforms having no
cluster implementation and to avoid affecting the linux scheduling
domains in the VM. So only generate the cluster topology in ACPI
PPTT when the user has specified it explicitly in -smp.
Tested qemu-system-aarch64 with `-smp 8` and linux 6.1-rc1, without
this patch:
estuary:/sys/devices/system/cpu/cpu0/topology$ cat cluster_*
ff # cluster_cpus
0-7 # cluster_cpus_list
56 # cluster_id
with this patch:
estuary:/sys/devices/system/cpu/cpu0/topology$ cat cluster_*
ff # cluster_cpus
0-7 # cluster_cpus_list
36 # cluster_id, with no cluster node kernel will make it to
physical package id
Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Yanan Wang <wangyanan55@huawei.com> Tested-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Message-Id: <20221229065513.55652-3-yangyicong@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Yicong Yang [Thu, 29 Dec 2022 06:55:08 +0000 (14:55 +0800)]
tests: virt: Allow changes to PPTT test table
Allow changes to test/data/acpi/virt/PPTT*, prepare to change the
building policy of the cluster topology.
Reviewed-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Message-Id: <20221229065513.55652-2-yangyicong@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
leixiang [Tue, 27 Dec 2022 08:16:04 +0000 (16:16 +0800)]
virtio-pci: fix proxy->vector_irqfd leak in virtio_pci_set_guest_notifiers
proxy->vector_irqfd did not free when kvm_virtio_pci_vector_use or
msix_set_vector_notifiers failed in virtio_pci_set_guest_notifiers.
Fixes: 7d37d351 Signed-off-by: Lei Xiang <leixiang@kylinos.cn> Tested-by: Zeng Chi <zengchi@kylinos.cn> Suggested-by: Xie Ming <xieming@kylinos.cn>
Message-Id: <20221227081604.806415-1-leixiang@kylinos.cn> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Longpeng [Tue, 27 Dec 2022 07:20:15 +0000 (15:20 +0800)]
vdpa: commit all host notifier MRs in a single MR transaction
This allows the vhost-vdpa device to batch the setup of all its MRs of
host notifiers.
This significantly reduces the device starting time, e.g. the time spend
on setup the host notifier MRs reduce from 423ms to 32ms for a VM with
64 vCPUs and 3 vhost-vDPA generic devices (vdpa_sim_blk, 64vq per device).
Signed-off-by: Longpeng <longpeng2@huawei.com>
Message-Id: <20221227072015.3134-4-longpeng2@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Longpeng [Tue, 27 Dec 2022 07:20:14 +0000 (15:20 +0800)]
vhost: configure all host notifiers in a single MR transaction
This allows the vhost device to batch the setup of all its host notifiers.
This significantly reduces the device starting time, e.g. the time spend
on enabling notifiers reduce from 376ms to 9.1ms for a VM with 64 vCPUs
and 3 vhost-vDPA generic devices (vdpa_sim_blk, 64vq per device)
Signed-off-by: Longpeng <longpeng2@huawei.com>
Message-Id: <20221227072015.3134-3-longpeng2@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Longpeng [Tue, 27 Dec 2022 07:20:13 +0000 (15:20 +0800)]
vhost: simplify vhost_dev_enable_notifiers
Simplify the error path in vhost_dev_enable_notifiers by using
vhost_dev_disable_notifiers directly.
Signed-off-by: Longpeng <longpeng2@huawei.com>
Message-Id: <20221227072015.3134-2-longpeng2@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Longpeng [Sat, 24 Dec 2022 11:48:48 +0000 (19:48 +0800)]
vdpa: harden the error path if get_iova_range failed
We should stop if the GET_IOVA_RANGE ioctl failed.
Signed-off-by: Longpeng <longpeng2@huawei.com>
Message-Id: <20221224114848.3062-3-longpeng2@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>