www.infradead.org Git - users/dwmw2/qemu.git/log
2 years ago i386/xen: Implement HYPERVISOR_grant_table_op and GNTTABOP_[gs]et_version
David Woodhouse [Fri, 16 Dec 2022 23:40:57 +0000 (23:40 +0000)]
i386/xen: Implement HYPERVISOR_grant_table_op and GNTTABOP_[gs]et_version

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago hw/xen: Support mapping grant frames
David Woodhouse [Fri, 16 Dec 2022 18:33:49 +0000 (18:33 +0000)]
hw/xen: Support mapping grant frames

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago hw/xen: Add xen_gnttab device for grant table emulation
David Woodhouse [Fri, 16 Dec 2022 15:50:26 +0000 (15:50 +0000)]
hw/xen: Add xen_gnttab device for grant table emulation

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago kvm/i386: Add xen-gnttab-max-frames property
David Woodhouse [Fri, 16 Dec 2022 16:27:00 +0000 (16:27 +0000)]
kvm/i386: Add xen-gnttab-max-frames property

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago hw/xen: Support HVM_PARAM_CALLBACK_TYPE_PCI_INTX callback
David Woodhouse [Fri, 16 Dec 2022 00:03:21 +0000 (00:03 +0000)]
hw/xen: Support HVM_PARAM_CALLBACK_TYPE_PCI_INTX callback

The guest is permitted to specify an arbitrary domain/bus/device/function
and INTX pin from which the callback IRQ shall appear to have come.

In QEMU we can only easily do this for devices that actually exist, and
even that requires us "knowing" that it's a PCMachine in order to find
the PCI root bus — although that's OK really because it's always true.

We also don't get notified of INTX routing changes, because we
can't do that as a passive observer; if we try to register a notifier
it will overwrite any existing notifier callback on the device.

But in practice, guests using PCI_INTX will only ever use pin A on the
Xen platform device, and won't swizzle the INTX routing after they set
it up. So this is just fine.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago hw/xen: Support HVM_PARAM_CALLBACK_TYPE_GSI callback
David Woodhouse [Thu, 15 Dec 2022 20:35:24 +0000 (20:35 +0000)]
hw/xen: Support HVM_PARAM_CALLBACK_TYPE_GSI callback

The GSI callback (and later PCI_INTX) is a level triggered interrupt. It
is asserted when an event channel is delivered to vCPU0, and is supposed
to be cleared when the vcpu_info->evtchn_upcall_pending field for vCPU0
is cleared again.

Thankfully, Xen does *not* assert the GSI if the guest sets its own
evtchn_upcall_pending field; we only need to assert the GSI when we
have delivered an event for ourselves. So that's the easy part.

However, we *do* need to poll for the evtchn_upcall_pending flag being
cleared. In an ideal world we would poll that when the EOI happens on
the PIC/IOAPIC. That's how it works in the kernel with the VFIO eventfd
pairs — one is used to trigger the interrupt, and the other works in the
other direction to 'resample' on EOI, and trigger the first eventfd
again if the line is still active.

However, QEMU doesn't seem to do that. Even VFIO level interrupts seem
to be supported by temporarily unmapping the device's BARs from the
guest when an interrupt happens, then trapping *all* MMIO to the device
and sending the 'resample' event on *every* MMIO access until the IRQ
is cleared! Maybe in future we'll plumb the 'resample' concept through
QEMU's irq framework but for now we'll do what Xen itself does: just
check the flag on every vmexit if the upcall GSI is known to be
asserted.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
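
The polling scheme described in the GSI callback commit above can be sketched
roughly as follows. This is a minimal illustration, not the code from the
commit: the struct and function names are hypothetical, and only the
evtchn_upcall_pending field name is taken from the commit message. The line is
raised only when the emulator itself delivers an event, and it is lowered by a
check that runs on every vmexit while the line is known to be asserted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the guest-visible vcpu_info of vCPU0. */
struct vcpu_info_sketch {
    uint8_t evtchn_upcall_pending;
};

/* Hypothetical emulator-side state for the level-triggered callback GSI. */
struct callback_gsi_state {
    bool asserted;    /* current level of the emulated GSI line */
    int raise_count;  /* number of low-to-high transitions, for illustration */
};

/* Called when the emulator itself delivers an event to vCPU0. */
static void gsi_event_delivered(struct callback_gsi_state *s,
                                struct vcpu_info_sketch *vi)
{
    assert(s && vi);
    vi->evtchn_upcall_pending = 1;
    if (!s->asserted) {
        s->asserted = true;
        s->raise_count++;
    }
}

/*
 * Called on every vmexit. It does nothing unless the line is asserted,
 * so the common case stays cheap. Note that a guest setting the flag by
 * itself never raises the line; only gsi_event_delivered() does.
 */
static void gsi_poll_on_vmexit(struct callback_gsi_state *s,
                               const struct vcpu_info_sketch *vi)
{
    assert(s && vi);
    if (s->asserted && !vi->evtchn_upcall_pending) {
        s->asserted = false;  /* guest cleared the flag: deassert the GSI */
    }
}
```

The cost of this design is one extra flag check per vmexit while the line is
high, in exchange for not needing an EOI-driven 'resample' path.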
2 years ago i386/xen: add monitor commands to test event injection
Joao Martins [Tue, 21 Aug 2018 16:16:19 +0000 (12:16 -0400)]
i386/xen: add monitor commands to test event injection

Specifically, add listing and injection of event channels.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago hw/xen: Implement EVTCHNOP_reset
David Woodhouse [Wed, 14 Dec 2022 19:36:15 +0000 (19:36 +0000)]
hw/xen: Implement EVTCHNOP_reset

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago hw/xen: Implement EVTCHNOP_bind_vcpu
David Woodhouse [Wed, 14 Dec 2022 19:27:38 +0000 (19:27 +0000)]
hw/xen: Implement EVTCHNOP_bind_vcpu

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago hw/xen: Implement EVTCHNOP_bind_interdomain
David Woodhouse [Wed, 14 Dec 2022 17:26:32 +0000 (17:26 +0000)]
hw/xen: Implement EVTCHNOP_bind_interdomain

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago hw/xen: Implement EVTCHNOP_alloc_unbound
David Woodhouse [Wed, 14 Dec 2022 16:39:48 +0000 (16:39 +0000)]
hw/xen: Implement EVTCHNOP_alloc_unbound

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago hw/xen: Implement EVTCHNOP_send
David Woodhouse [Wed, 14 Dec 2022 00:11:07 +0000 (00:11 +0000)]
hw/xen: Implement EVTCHNOP_send

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago hw/xen: Implement EVTCHNOP_bind_ipi
David Woodhouse [Tue, 13 Dec 2022 23:12:59 +0000 (23:12 +0000)]
hw/xen: Implement EVTCHNOP_bind_ipi

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago hw/xen: Implement EVTCHNOP_bind_virq
David Woodhouse [Tue, 13 Dec 2022 22:40:56 +0000 (22:40 +0000)]
hw/xen: Implement EVTCHNOP_bind_virq

Add the array of virq ports to each vCPU so that we can deliver timers,
debug ports, etc. Global virqs are allocated against vCPU 0 initially,
but can be migrated to other vCPUs (when we implement that).

The kernel needs to know about VIRQ_TIMER in order to accelerate timers,
so tell it via KVM_XEN_VCPU_ATTR_TYPE_TIMER. Also save/restore the value
of the singleshot timer across migration, as the kernel will handle the
hypercalls automatically now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago hw/xen: Implement EVTCHNOP_unmask
David Woodhouse [Tue, 13 Dec 2022 17:20:46 +0000 (17:20 +0000)]
hw/xen: Implement EVTCHNOP_unmask

This finally comes with a mechanism for actually injecting events into
the guest vCPU, with all the atomic-test-and-set that's involved in
setting the bit in the shinfo, then the index in the vcpu_info, and
injecting either the lapic vector as MSI, or letting KVM inject the
bare vector.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
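
The test-and-set sequence that the EVTCHNOP_unmask commit above refers to
follows the Xen 2-level event channel ABI. The sketch below shows that
ordering with C11 atomics for a 64-bit guest. It is an illustration under
stated assumptions rather than the code from the commit: the struct layouts
are simplified stand-ins, and set_port_pending() and test_and_set_bit64() are
hypothetical names. The function reports whether an upcall should be injected
and leaves the injection itself (MSI to the local APIC, or KVM's own vector
delivery) to the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define WORD_BITS 64
#define NR_WORDS  64  /* 64 words of 64 bits: 4096 ports for a 64-bit guest */

/* Simplified stand-ins for the relevant shared_info and vcpu_info fields. */
struct shinfo_sketch {
    _Atomic uint64_t evtchn_pending[NR_WORDS];
    _Atomic uint64_t evtchn_mask[NR_WORDS];
};

struct vcpu_info_sketch {
    _Atomic uint8_t  evtchn_upcall_pending;
    _Atomic uint64_t evtchn_pending_sel;
};

/* Atomically set a bit and report whether it was already set. */
static bool test_and_set_bit64(_Atomic uint64_t *word, unsigned int bit)
{
    uint64_t mask = UINT64_C(1) << bit;
    return (atomic_fetch_or(word, mask) & mask) != 0;
}

/* Returns true when the caller should inject the upcall into the vCPU. */
static bool set_port_pending(struct shinfo_sketch *sh,
                             struct vcpu_info_sketch *vi,
                             unsigned int port)
{
    unsigned int idx = port / WORD_BITS;
    unsigned int bit = port % WORD_BITS;

    assert(sh && vi && port < NR_WORDS * WORD_BITS);

    /* 1. Set the per-port pending bit; stop if it was already pending. */
    if (test_and_set_bit64(&sh->evtchn_pending[idx], bit)) {
        return false;
    }
    /* 2. A masked port stays pending but is not delivered. */
    if (atomic_load(&sh->evtchn_mask[idx]) & (UINT64_C(1) << bit)) {
        return false;
    }
    /* 3. Set the word index in the vCPU's selector; stop if already set. */
    if (test_and_set_bit64(&vi->evtchn_pending_sel, idx)) {
        return false;
    }
    /* 4. Flag the upcall; the caller then injects the vector. */
    atomic_store(&vi->evtchn_upcall_pending, 1);
    return true;
}
```

Unmasking a port that is already pending would re-run steps 3 and 4, which is
presumably why delivery arrives together with EVTCHNOP_unmask.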
2 years ago hw/xen: Implement EVTCHNOP_close
David Woodhouse [Tue, 13 Dec 2022 13:57:44 +0000 (13:57 +0000)]
hw/xen: Implement EVTCHNOP_close

It calls an internal close_port() helper which will also be used from
EVTCHNOP_reset and will actually do the work to disconnect/unbind a port
once any of that is actually implemented in the first place.

That in turn calls a free_port() internal function which will be used in
error paths after allocation.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago hw/xen: Implement EVTCHNOP_status
David Woodhouse [Tue, 13 Dec 2022 13:29:46 +0000 (13:29 +0000)]
hw/xen: Implement EVTCHNOP_status

This adds the basic structure for maintaining the port table and reporting
the status of ports therein.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
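
A port table of the kind the EVTCHNOP_status commit above introduces could
look something like this. The EVTCHNSTAT_* values match Xen's public
event_channel.h; everything else (the struct layout, the table size, the
function name, treating port 0 as invalid, and returning -EINVAL) is an
assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Status values as defined in Xen's public event_channel.h. */
#define EVTCHNSTAT_closed      0
#define EVTCHNSTAT_unbound     1
#define EVTCHNSTAT_interdomain 2
#define EVTCHNSTAT_pirq        3
#define EVTCHNSTAT_virq        4
#define EVTCHNSTAT_ipi         5

#define NR_PORTS_SKETCH 4096  /* assumed table size */

/* Hypothetical per-port record. */
struct port_info_sketch {
    uint8_t  type;      /* one of EVTCHNSTAT_* */
    uint16_t type_val;  /* e.g. the VIRQ number for EVTCHNSTAT_virq */
    uint32_t vcpu;      /* vCPU the port is bound to */
};

struct port_table_sketch {
    struct port_info_sketch port[NR_PORTS_SKETCH];
};

/* Copy out the status of one port, rejecting out-of-range port numbers. */
static int port_status(const struct port_table_sketch *t, uint32_t port,
                       struct port_info_sketch *out)
{
    assert(t && out);
    if (port == 0 || port >= NR_PORTS_SKETCH) {
        return -EINVAL;  /* assumed error code for an invalid port */
    }
    *out = t->port[port];
    return 0;
}
```

Because EVTCHNSTAT_closed is zero, a zero-initialized table reports every
port as closed, which keeps allocation and reset simple.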
2 years ago i386/xen: Add support for Xen event channel delivery to vCPU
David Woodhouse [Fri, 16 Dec 2022 14:32:25 +0000 (14:32 +0000)]
i386/xen: Add support for Xen event channel delivery to vCPU

The kvm_xen_inject_vcpu_callback_vector() function will either deliver
the per-vCPU local APIC vector (as an MSI), or just kick the vCPU out
of the kernel to trigger KVM's automatic delivery of the global vector.
Support for asserting the GSI/PCI_INTX callbacks will come later.

Also add kvm_xen_get_vcpu_info_hva() which returns the vcpu_info of
a given vCPU.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago hw/xen: Add xen_evtchn device for event channel emulation
David Woodhouse [Fri, 16 Dec 2022 14:02:29 +0000 (14:02 +0000)]
hw/xen: Add xen_evtchn device for event channel emulation

Include basic support for setting HVM_PARAM_CALLBACK_IRQ to the global
vector method HVM_PARAM_CALLBACK_TYPE_VECTOR, which is handled in-kernel
by raising the vector whenever the vCPU's vcpu_info->evtchn_upcall_pending
flag is set.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago i386/xen: implement HVMOP_set_param
Ankur Arora [Tue, 6 Dec 2022 11:14:07 +0000 (11:14 +0000)]
i386/xen: implement HVMOP_set_param

This is the hook for adding the HVM_PARAM_CALLBACK_IRQ parameter in a
subsequent commit.

Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Split out from another commit]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago i386/xen: implement HVMOP_set_evtchn_upcall_vector
Ankur Arora [Tue, 6 Dec 2022 11:14:07 +0000 (11:14 +0000)]
i386/xen: implement HVMOP_set_evtchn_upcall_vector

The HVMOP_set_evtchn_upcall_vector hypercall sets the per-vCPU upcall
vector, to be delivered to the local APIC just like an MSI (with an EOI).

This takes precedence over the system-wide delivery method set by the
HVMOP_set_param hypercall with HVM_PARAM_CALLBACK_IRQ. It's used by
Windows and Xen (PV shim) guests but normally not by Linux.

Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Rework for upstream kernel changes and split from HVMOP_set_param]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years ago i386/xen: implement HYPERVISOR_event_channel_op
Joao Martins [Thu, 28 Jun 2018 16:36:19 +0000 (12:36 -0400)]
i386/xen: implement HYPERVISOR_event_channel_op

Additionally set XEN_INTERFACE_VERSION to most recent in order to
exercise the "new" event_channel_op.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Ditch event_channel_op_compat which was never available to HVM guests]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago i386/xen: handle VCPUOP_register_runstate_memory_area
Joao Martins [Tue, 24 Jul 2018 16:46:00 +0000 (12:46 -0400)]
i386/xen: handle VCPUOP_register_runstate_memory_area

Allow the guest to set up the vCPU runstate area, which is used as the
steal clock.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago i386/xen: handle VCPUOP_register_vcpu_time_info
Joao Martins [Mon, 23 Jul 2018 15:24:36 +0000 (11:24 -0400)]
i386/xen: handle VCPUOP_register_vcpu_time_info

This is needed to support the Linux vDSO on Xen.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago i386/xen: handle VCPUOP_register_vcpu_info
Joao Martins [Fri, 29 Jun 2018 14:54:50 +0000 (10:54 -0400)]
i386/xen: handle VCPUOP_register_vcpu_info

Handle the hypercall to set a per vcpu info, and also wire up the default
vcpu_info in the shared_info page for the first 32 vCPUs.

To avoid deadlock within KVM a vCPU thread must set its *own* vcpu_info
rather than it being set from the context in which the hypercall is
invoked.

Add the vcpu_info (and default) GPA to the vmstate_x86_cpu for migration,
and restore it in kvm_arch_put_registers() appropriately.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago i386/xen: implement HYPERVISOR_vcpu_op
Joao Martins [Mon, 18 Jun 2018 16:26:44 +0000 (12:26 -0400)]
i386/xen: implement HYPERVISOR_vcpu_op

This is simply for when the guest tries to register a vcpu_info.
Since vcpu_info placement is optional in the minimum ABI, we can
just fail with -ENOSYS.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago i386/xen: implement HYPERVISOR_hvm_op
Joao Martins [Mon, 18 Jun 2018 16:21:57 +0000 (12:21 -0400)]
i386/xen: implement HYPERVISOR_hvm_op

This handles the case where the guest queries for HVMOP_pagetable_dying support.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago i386/xen: implement XENMEM_add_to_physmap_batch
David Woodhouse [Thu, 15 Dec 2022 10:39:30 +0000 (10:39 +0000)]
i386/xen: implement XENMEM_add_to_physmap_batch

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago i386/xen: implement HYPERVISOR_memory_op
Joao Martins [Mon, 18 Jun 2018 16:17:42 +0000 (12:17 -0400)]
i386/xen: implement HYPERVISOR_memory_op

Specifically XENMEM_add_to_physmap with space XENMAPSPACE_shared_info to
allow the guest to set its shared_info page.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Use the xen_overlay device, add compat support]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago i386/xen: manage and save/restore Xen guest long_mode setting
David Woodhouse [Mon, 12 Dec 2022 14:03:41 +0000 (14:03 +0000)]
i386/xen: manage and save/restore Xen guest long_mode setting

Xen will "latch" the guest's 32-bit or 64-bit ("long mode") setting when
the guest writes the MSR to fill in the hypercall page, or when the guest
sets the event channel callback in HVM_PARAM_CALLBACK_IRQ.

KVM handles the former and sets the kernel's long_mode flag accordingly.
The latter will be handled in userspace. Keep them in sync by noticing
when a hypercall is made in a mode that doesn't match QEMU's idea of
the guest mode, and resyncing from the kernel. Do that same sync right
before serialization too, in case the guest has set the hypercall page
but hasn't yet made a hypercall.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago i386/xen: add pc_machine_kvm_type to initialize XEN_EMULATE mode
David Woodhouse [Mon, 12 Dec 2022 23:40:45 +0000 (23:40 +0000)]
i386/xen: add pc_machine_kvm_type to initialize XEN_EMULATE mode

The xen_overlay device (and later similar devices for event channels and
grant tables) needs to be instantiated. Do this from a kvm_type method on
the PC machine derivatives, since KVM is the only way to support Xen
emulation for now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago hw/xen: Add xen_overlay device for emulating shared xenheap pages
David Woodhouse [Wed, 7 Dec 2022 09:19:31 +0000 (09:19 +0000)]
hw/xen: Add xen_overlay device for emulating shared xenheap pages

For the shared info page and for grant tables, Xen shares its own pages
from the "Xen heap" to the guest. The guest requests that a given page
from a certain address space (XENMAPSPACE_shared_info, etc.) be mapped
to a given GPA using the XENMEM_add_to_physmap hypercall.

To support that in qemu when *emulating* Xen, create a memory region
(migratable) and allow it to be mapped as an overlay when requested.

Xen theoretically allows the same page to be mapped multiple times
into the guest, but that's hard to track and reinstate over migration,
so we automatically *unmap* any previous mapping when creating a new
one. This approach has been used in production with.... a non-trivial
number of guests expecting true Xen, without any problems yet being
noticed.

This adds just the shared info page for now. The grant tables will be
a larger region, and will need to be overlaid one page at a time. I
think that means I need to create separate aliases for each page of
the overall grant_frames region, so that they can be mapped individually.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
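
The "unmap any previous mapping" rule from the xen_overlay commit above
reduces to keeping a single current GPA per overlay page. The sketch below
illustrates only that bookkeeping. The names, the 4 KiB page size, and the
-EINVAL return are assumptions, and no real memory regions are involved.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define OVERLAY_PAGE_SIZE UINT64_C(4096)  /* assumed page size */
#define INVALID_GPA       UINT64_MAX      /* marks "not mapped anywhere" */

/* Hypothetical bookkeeping for one overlay page, e.g. the shared info page. */
struct overlay_page_sketch {
    uint64_t gpa;          /* where the page is mapped, or INVALID_GPA */
    unsigned unmap_count;  /* implicit unmaps performed, for illustration */
};

/*
 * Map the overlay page at 'gpa', or unmap it when gpa is INVALID_GPA.
 * Since only one location is recorded, mapping at a new address
 * implicitly drops the old mapping, and that single GPA is all that
 * has to be saved and restored over migration.
 */
static int overlay_map_page(struct overlay_page_sketch *page, uint64_t gpa)
{
    assert(page);
    if (gpa != INVALID_GPA && (gpa & (OVERLAY_PAGE_SIZE - 1))) {
        return -EINVAL;  /* assumed: reject unaligned addresses */
    }
    if (page->gpa != INVALID_GPA && page->gpa != gpa) {
        page->unmap_count++;  /* previous mapping is torn down first */
    }
    page->gpa = gpa;
    return 0;
}
```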
2 years ago i386/xen: Implement SCHEDOP_poll and SCHEDOP_yield
David Woodhouse [Wed, 14 Dec 2022 21:50:41 +0000 (21:50 +0000)]
i386/xen: Implement SCHEDOP_poll and SCHEDOP_yield

They both do the same thing and just call sched_yield. This is enough to
stop the Linux guest panicking when running on a host kernel which doesn't
intercept SCHEDOP_poll and lets it reach userspace.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago i386/xen: implement HYPERVISOR_sched_op, SCHEDOP_shutdown
Joao Martins [Fri, 20 Jul 2018 19:19:05 +0000 (15:19 -0400)]
i386/xen: implement HYPERVISOR_sched_op, SCHEDOP_shutdown

It allows the guest to shut itself down via hypercall for any of three reasons:
  1) self-reboot
  2) shutdown
  3) crash

Implementing the SCHEDOP_shutdown sub-op lets us handle crashes gracefully
rather than leading to triple faults if it remains unimplemented.

In addition, the SHUTDOWN_soft_reset reason is used for kexec, to reset
Xen shared pages and other enlightenments and leave a clean slate for the
new kernel without the hypervisor helpfully writing information at
unexpected addresses.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Ditch sched_op_compat which was never available for HVM guests,
        Add SHUTDOWN_soft_reset]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
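
The reason handling described in the SCHEDOP_shutdown commit above amounts to
a small switch over the reason code. In this sketch the SHUTDOWN_* values
match Xen's public sched.h, while the action enum, the function name, and the
-EINVAL return for unhandled reasons are hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Reason codes as defined in Xen's public sched.h. */
#define SHUTDOWN_poweroff   0
#define SHUTDOWN_reboot     1
#define SHUTDOWN_suspend    2
#define SHUTDOWN_crash      3
#define SHUTDOWN_watchdog   4
#define SHUTDOWN_soft_reset 5

/* Hypothetical actions an emulator might take in response. */
enum vmm_action_sketch {
    VMM_NONE,
    VMM_POWEROFF,
    VMM_RESET,
    VMM_PANIC,
    VMM_SOFT_RESET
};

/* Translate a guest-supplied shutdown reason into an emulator action. */
static int schedop_shutdown_sketch(unsigned int reason,
                                   enum vmm_action_sketch *action)
{
    assert(action);
    *action = VMM_NONE;

    switch (reason) {
    case SHUTDOWN_poweroff:
        *action = VMM_POWEROFF;
        return 0;
    case SHUTDOWN_reboot:
        *action = VMM_RESET;
        return 0;
    case SHUTDOWN_crash:
        *action = VMM_PANIC;  /* report the crash instead of triple-faulting */
        return 0;
    case SHUTDOWN_soft_reset:
        *action = VMM_SOFT_RESET;  /* kexec: clear shared pages and callbacks */
        return 0;
    default:
        return -EINVAL;  /* assumed: suspend and watchdog are not handled */
    }
}
```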
2 years ago i386/xen: implement HYPERVISOR_xen_version
Joao Martins [Thu, 14 Jun 2018 12:29:45 +0000 (08:29 -0400)]
i386/xen: implement HYPERVISOR_xen_version

This is just meant to serve as an example of how we can implement
hypercalls. xen_version was chosen specifically because QEMU already
handles all kinds of feature control, so handling it here seems appropriate.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Implement kvm_gva_rw() safely]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years ago i386/xen: handle guest hypercalls
Joao Martins [Wed, 13 Jun 2018 14:14:31 +0000 (10:14 -0400)]
i386/xen: handle guest hypercalls

This means handling the new exit reason for Xen but still
crashing on purpose. As we implement each of the hypercalls
we will then return the right return code.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Add CPL to hypercall tracing, disallow hypercalls from CPL > 0]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
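
A hypercall dispatcher of the shape that the "handle guest hypercalls" commit
above builds towards might look like this. It is a sketch under assumptions:
the hypercall number 17 and XENVER_version command 0 match Xen's public
headers, but the struct, the function name, and the choice of -EPERM for
hypercalls from CPL > 0 are illustrative guesses, not taken from the commit.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Values from Xen's public headers (xen.h and version.h). */
#define HYPERVISOR_xen_version_nr 17
#define XENVER_version_cmd        0

/* Hypothetical view of the hypercall exit handed over by the kernel. */
struct xen_hcall_sketch {
    uint64_t nr;    /* hypercall number */
    uint64_t arg0;  /* first argument, the sub-command for xen_version */
    int cpl;        /* guest privilege level at the time of the call */
};

/*
 * Return the hypercall result. Anything not yet implemented yields
 * -ENOSYS, so support can be added one hypercall at a time.
 */
static int64_t handle_hypercall_sketch(const struct xen_hcall_sketch *h,
                                       uint32_t xen_version)
{
    assert(h);

    if (h->cpl > 0) {
        return -EPERM;  /* assumed error code for userspace hypercalls */
    }

    switch (h->nr) {
    case HYPERVISOR_xen_version_nr:
        if (h->arg0 == XENVER_version_cmd) {
            /* Encoded as (major << 16) | minor, e.g. 0x4000a for 4.10. */
            return xen_version;
        }
        return -ENOSYS;
    default:
        return -ENOSYS;
    }
}
```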
2 years ago xen-platform: allow its creation with XEN_EMULATE mode
Joao Martins [Tue, 19 Jun 2018 10:44:46 +0000 (06:44 -0400)]
xen-platform: allow its creation with XEN_EMULATE mode

The only thing we need to handle on the KVM side is changing the
pfn from R/W to R/O.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years ago xen-platform: exclude vfio-pci from the PCI platform unplug
Joao Martins [Fri, 18 Jan 2019 19:29:52 +0000 (14:29 -0500)]
xen-platform: exclude vfio-pci from the PCI platform unplug

This ensures that PCI passthrough devices work for Xen-emulated guests.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years ago i386/hvm: Set Xen vCPU ID in KVM
David Woodhouse [Fri, 16 Dec 2022 11:05:29 +0000 (11:05 +0000)]
i386/hvm: Set Xen vCPU ID in KVM

There are (at least) three different vCPU ID number spaces. One is the
internal KVM vCPU index, based purely on which vCPU was chronologically
created in the kernel first. If userspace threads are all spawned and
create their KVM vCPUs in essentially random order, then the KVM indices
are basically random too.

The second number space is the APIC ID space, which is consistent and
useful for referencing vCPUs. MSIs will specify the target vCPU using
the APIC ID, for example, and the KVM Xen APIs also take an APIC ID
from userspace whenever a vCPU needs to be specified (as opposed to
just using the appropriate vCPU fd).

The third number space is not normally relevant to the kernel, and is
the ACPI/MADT/Xen CPU number which corresponds to cs->cpu_index. But
Xen timer hypercalls use it, and Xen timer hypercalls *really* want
to be accelerated in the kernel rather than handled in userspace, so
the kernel needs to be told.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years ago i386/kvm: handle Xen HVM cpuid leaves
Joao Martins [Tue, 6 Dec 2022 10:48:53 +0000 (10:48 +0000)]
i386/kvm: handle Xen HVM cpuid leaves

Introduce support for emulating CPUID for Xen HVM guests. It doesn't make
sense to advertise the KVM leaves to a Xen guest, so advertise the Xen
leaves unconditionally when the xen-version machine property is set.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Obtain xen_version from KVM property, make it automatic]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years ago i386/kvm: Add xen-version KVM accelerator property and init KVM Xen support
David Woodhouse [Sat, 3 Dec 2022 17:51:13 +0000 (09:51 -0800)]
i386/kvm: Add xen-version KVM accelerator property and init KVM Xen support

This just initializes the basic Xen support in KVM for now. Only permitted
on TYPE_PC_MACHINE because that's where the sysbus devices for Xen heap
overlay, event channel, grant tables and other stuff will exist. There's
no point having the basic hypercall support if nothing else works.

Provide sysemu/kvm_xen.h and a kvm_xen_get_caps() which will be used
later by support devices.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years ago xen: Add XEN_DISABLED mode and make it default
David Woodhouse [Mon, 12 Dec 2022 22:32:54 +0000 (22:32 +0000)]
xen: Add XEN_DISABLED mode and make it default

Also set XEN_ATTACH mode in xen_init() to reflect the truth; not that
anyone ever cared before. It was *only* ever checked in xen_init_pv()
before.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years ago xen: add CONFIG_XENFV_MACHINE and CONFIG_XEN_EMU options for Xen emulation
David Woodhouse [Tue, 6 Dec 2022 09:03:48 +0000 (09:03 +0000)]
xen: add CONFIG_XENFV_MACHINE and CONFIG_XEN_EMU options for Xen emulation

The XEN_EMU option will cover core Xen support in target/, which exists
only for x86 with KVM today but could theoretically also be implemented
on Arm/Aarch64 and with TCG or other accelerators. It will also cover
the architecture-independent grant table and event channel support
which will be added in hw/i386/kvm/ (on the basis that the
non-KVM support is very theoretical and making it not use KVM directly
seems like gratuitous overengineering at this point).

The XENFV_MACHINE option is for the xenfv platform support, which will
now be used both by XEN_EMU and by real Xen.

The XEN option remains dependent on the Xen runtime libraries, and covers
support for real Xen. Some code which currently resides under CONFIG_XEN
will be moving to CONFIG_XENFV_MACHINE over time.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years ago include: import Xen public headers to include/standard-headers/
Joao Martins [Wed, 13 Feb 2019 17:29:47 +0000 (12:29 -0500)]
include: import Xen public headers to include/standard-headers/

There are already some partial headers in include/hw/xen/interface/
which will be removed once we migrate users to the new location.

To start with, define __XEN_TOOLS__ in hw/xen/xen.h to ensure that any
internal definitions needed by Xen toolstack libraries are present
regardless of the order in which the headers are included. A reckoning
will come later, once we make the PV backends work in emulation and
untangle the headers for Xen-native vs. generic parts.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Update to Xen public headers from 4.16.2 release, add some in io/,
        define __XEN_TOOLS__ in hw/xen/xen.h]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years ago Merge tag 'pull-request-2023-01-09' of https://gitlab.com/thuth/qemu into staging
Peter Maydell [Mon, 9 Jan 2023 15:54:31 +0000 (15:54 +0000)]
Merge tag 'pull-request-2023-01-09' of https://gitlab.com/thuth/qemu into staging

* s390x header clean-ups from Philippe
* Rework and improvements of the EINTR handling by Nikita
* Deprecate the -no-hpet command line option
* Disable the qtests in the 32-bit Windows CI job again
* Some other misc fixes here and there

# gpg: Signature made Mon 09 Jan 2023 14:21:19 GMT
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* tag 'pull-request-2023-01-09' of https://gitlab.com/thuth/qemu:
  .gitlab-ci.d/windows: Do not run the qtests in the msys2-32bit job
  error handling: Use RETRY_ON_EINTR() macro where applicable
  Refactoring: refactor TFR() macro to RETRY_ON_EINTR()
  docs/interop: Change the vnc-ledstate-Pseudo-encoding doc into .rst
  i386: Deprecate the -no-hpet QEMU command line option
  tests/qtest/bios-tables-test: Replace -no-hpet with hpet=off machine parameter
  tests/readconfig: spice doesn't support unix socket on windows yet
  target/s390x: Restrict sysemu/reset.h to system emulation
  target/s390x/tcg/excp_helper: Restrict system headers to sysemu
  target/s390x/tcg/misc_helper: Remove unused "memory.h" include
  hw/s390x/pv: Restrict Protected Virtualization to sysemu
  exec/memory: Expose memory_region_access_valid()
  MAINTAINERS: Add MIPS-related docs and configs to the MIPS architecture section
  tests/vm: Update get_default_jobs() to work on non-x86_64 non-KVM hosts
  qemu-iotests/stream-under-throttle: do not shutdown QEMU

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2 years ago .gitlab-ci.d/windows: Do not run the qtests in the msys2-32bit job
Thomas Huth [Thu, 5 Jan 2023 19:30:58 +0000 (20:30 +0100)]
.gitlab-ci.d/windows: Do not run the qtests in the msys2-32bit job

The qtests are not stable in the msys2-32bit job yet; in particular,
test-hmp and qom-test fail randomly. Until this is fixed, disable
the qtests here again to avoid failing CI tests.

Message-Id: <20230105204819.26992-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 years ago error handling: Use RETRY_ON_EINTR() macro where applicable
Nikita Ivanov [Sun, 23 Oct 2022 09:04:22 +0000 (12:04 +0300)]
error handling: Use RETRY_ON_EINTR() macro where applicable

qemu/osdep.h defines a RETRY_ON_EINTR() macro which implements
this same retry loop, so use it instead of open-coding the loop.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/415
Signed-off-by: Nikita Ivanov <nivanov@cloudlinux.com>
Message-Id: <20221023090422.242617-3-nivanov@cloudlinux.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[thuth: Dropped the hunk that changed socket_accept() in libqtest.c]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 years ago Refactoring: refactor TFR() macro to RETRY_ON_EINTR()
Nikita Ivanov [Sun, 23 Oct 2022 09:04:21 +0000 (12:04 +0300)]
Refactoring: refactor TFR() macro to RETRY_ON_EINTR()

Rename the macro to something more transparent and refactor
it into an expression.

Signed-off-by: Nikita Ivanov <nivanov@cloudlinux.com>
Message-Id: <20221023090422.242617-2-nivanov@cloudlinux.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
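
To show what "refactor it into an expression" buys in the commit above, here
is a sketch of an expression-style retry macro. It is an illustration rather
than the definition in qemu/osdep.h: the macro name carries a _SKETCH suffix
to make that clear, it relies on the GNU statement-expression and __typeof__
extensions (GCC and Clang), and flaky_read() is a hypothetical stand-in for
an interruptible system call.

```c
#include <assert.h>
#include <errno.h>

/*
 * Evaluate 'expr' repeatedly while it fails with EINTR, and yield its
 * final value. Being an expression, it can appear on the right-hand
 * side of an assignment or directly inside a condition.
 */
#define RETRY_ON_EINTR_SKETCH(expr)                     \
    (__extension__ ({                                   \
        __typeof__(expr) result_;                       \
        do {                                            \
            result_ = (expr);                           \
        } while (result_ == -1 && errno == EINTR);      \
        result_;                                        \
    }))

/* Number of times flaky_read() has been called so far. */
static int flaky_calls;

/* Fails with EINTR for the first 'interruptions' calls, then returns 42. */
static int flaky_read(int interruptions)
{
    assert(interruptions >= 0);
    if (flaky_calls++ < interruptions) {
        errno = EINTR;
        return -1;
    }
    return 42;
}
```

With this shape a caller can write
int n = RETRY_ON_EINTR_SKETCH(flaky_read(2)); whereas a statement-style macro
would need the assignment to be placed inside the macro argument.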
2 years ago docs/interop: Change the vnc-ledstate-Pseudo-encoding doc into .rst
Thomas Huth [Tue, 13 Dec 2022 10:18:06 +0000 (11:18 +0100)]
docs/interop: Change the vnc-ledstate-Pseudo-encoding doc into .rst

The file seems to contain perfectly valid rst syntax already, so
rename it to .rst and wire it up in the index.

Message-Id: <20221213101806.46640-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 years ago i386: Deprecate the -no-hpet QEMU command line option
Thomas Huth [Thu, 29 Dec 2022 11:49:13 +0000 (12:49 +0100)]
i386: Deprecate the -no-hpet QEMU command line option

The HPET setting was turned into a machine property a while ago,
so we should finally take the next step and deprecate the
legacy CLI option, too.

Message-Id: <20221229114913.260400-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 years ago tests/qtest/bios-tables-test: Replace -no-hpet with hpet=off machine parameter
Thomas Huth [Mon, 9 Jan 2023 08:08:23 +0000 (09:08 +0100)]
tests/qtest/bios-tables-test: Replace -no-hpet with hpet=off machine parameter

We are going to deprecate (and finally remove later) the -no-hpet command
line option. Prepare the bios-tables-test by using the replacement hpet=off
machine parameter instead.

Message-Id: <20230109081205.116369-1-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 years ago tests/readconfig: spice doesn't support unix socket on windows yet
Marc-André Lureau [Tue, 3 Jan 2023 11:08:09 +0000 (15:08 +0400)]
tests/readconfig: spice doesn't support unix socket on windows yet

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230103110814.3726795-6-marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 years ago target/s390x: Restrict sysemu/reset.h to system emulation
Philippe Mathieu-Daudé [Tue, 20 Dec 2022 14:56:24 +0000 (15:56 +0100)]
target/s390x: Restrict sysemu/reset.h to system emulation

In user emulation, threads (implemented as CPUs) are
created and destroyed, but never reset. There is no point in
allowing user emulation to access the sysemu/reset API.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221220145625.26392-5-philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 years ago target/s390x/tcg/excp_helper: Restrict system headers to sysemu
Philippe Mathieu-Daudé [Sat, 17 Dec 2022 15:24:54 +0000 (16:24 +0100)]
target/s390x/tcg/excp_helper: Restrict system headers to sysemu

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221217152454.96388-6-philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 years ago target/s390x/tcg/misc_helper: Remove unused "memory.h" include
Philippe Mathieu-Daudé [Sat, 17 Dec 2022 15:24:53 +0000 (16:24 +0100)]
target/s390x/tcg/misc_helper: Remove unused "memory.h" include

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221217152454.96388-5-philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 years ago hw/s390x/pv: Restrict Protected Virtualization to sysemu
Philippe Mathieu-Daudé [Sat, 17 Dec 2022 15:24:52 +0000 (16:24 +0100)]
hw/s390x/pv: Restrict Protected Virtualization to sysemu

Protected Virtualization is irrelevant in user emulation.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221217152454.96388-4-philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 years ago exec/memory: Expose memory_region_access_valid()
Philippe Mathieu-Daudé [Sat, 17 Dec 2022 15:24:50 +0000 (16:24 +0100)]
exec/memory: Expose memory_region_access_valid()

Instead of having hardware devices poke into the memory
subsystem's internal API, expose memory_region_access_valid().

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221217152454.96388-2-philmd@linaro.org>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 years ago MAINTAINERS: Add MIPS-related docs and configs to the MIPS architecture section
Thomas Huth [Mon, 12 Dec 2022 17:12:52 +0000 (18:12 +0100)]
MAINTAINERS: Add MIPS-related docs and configs to the MIPS architecture section

docs/system/target-mips.rst and configs/targets/mips* are not covered
in our MAINTAINERS file yet, so let's add them now.

Message-Id: <20221212171252.194864-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 years ago tests/vm: Update get_default_jobs() to work on non-x86_64 non-KVM hosts
Philippe Mathieu-Daudé [Fri, 9 Dec 2022 16:47:43 +0000 (17:47 +0100)]
tests/vm: Update get_default_jobs() to work on non-x86_64 non-KVM hosts

On a non-x86_64 host, if KVM is not available, we get:

  Traceback (most recent call last):
    File "tests/vm/basevm.py", line 634, in main
      vm = vmcls(args, config=config)
    File "tests/vm/basevm.py", line 104, in __init__
      mem = max(4, args.jobs)
  TypeError: '>' not supported between instances of 'NoneType' and 'int'

Fix by always returning a -- not ideal but safe -- '1' value.
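
The failure and the fix can be reproduced in isolation. The sketch below is
a simplified, hypothetical stand-in for get_default_jobs(): the parameters
and the branch layout are assumptions made for illustration, not the code in
tests/vm/basevm.py. It only shows how a branch without a return yields None,
and how max(4, None) then raises the TypeError quoted above.

```python
def get_default_jobs_before(cpu_count, kvm_available, machine):
    """Hypothetical pre-fix shape: one branch falls through and returns None."""
    if cpu_count > 1:
        if kvm_available:
            return cpu_count // 2
        if machine == "x86_64":
            return cpu_count // 2
        # Non-x86_64 host without KVM: no return statement is reached here.
    else:
        return 1


def get_default_jobs_after(cpu_count, kvm_available, machine):
    """Same sketch with the fix: always fall back to a safe value of 1."""
    if cpu_count > 1:
        if kvm_available:
            return cpu_count // 2
        if machine == "x86_64":
            return cpu_count // 2
    return 1


def guest_memory(jobs):
    # Mirrors the expression from the traceback: mem = max(4, args.jobs)
    return max(4, jobs)
```

With the fallback in place, guest_memory() always receives an integer, at
the cost of running a single job on such hosts.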

Fixes: b09539444a ("tests/vm: allow us to take advantage of MTTCG")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221209164743.70836-1-philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 years ago qemu-iotests/stream-under-throttle: do not shutdown QEMU
Christian Borntraeger [Wed, 7 Dec 2022 13:14:52 +0000 (14:14 +0100)]
qemu-iotests/stream-under-throttle: do not shutdown QEMU

Without a kernel or boot disk, QEMU on s390 will exit (usually with a
disabled wait state). This breaks the stream-under-throttle test case.
Do not exit QEMU when running on s390.

Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Message-Id: <20221207131452.8455-1-borntraeger@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 years ago Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into...
Peter Maydell [Mon, 9 Jan 2023 10:07:11 +0000 (10:07 +0000)]
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pc,pci: features, cleanups, fixes

mostly vhost-vdpa:
    guest announce feature emulation when using shadow virtqueue
    support for configure interrupt
    startup speed ups

an acpi change to only generate cluster node in PPTT when specified for arm

misc fixes, cleanups

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Sun 08 Jan 2023 08:01:39 GMT
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (50 commits)
  vhost-scsi: fix memleak of vsc->inflight
  acpi: cpuhp: fix guest-visible maximum access size to the legacy reg block
  tests: acpi: aarch64: Add *.topology tables
  tests: acpi: aarch64: Add topology test for aarch64
  tests: acpi: Add and whitelist *.topology blobs
  tests: virt: Update expected ACPI tables for virt test
  hw/acpi/aml-build: Only generate cluster node in PPTT when specified
  tests: virt: Allow changes to PPTT test table
  virtio-pci: fix proxy->vector_irqfd leak in virtio_pci_set_guest_notifiers
  vdpa: commit all host notifier MRs in a single MR transaction
  vhost: configure all host notifiers in a single MR transaction
  vhost: simplify vhost_dev_enable_notifiers
  vdpa: harden the error path if get_iova_range failed
  vdpa-dev: get iova range explicitly
  docs/devel: Rules on #include in headers
  include: Include headers where needed
  include/hw/virtio: Break inclusion loop
  include/hw/cxl: Break inclusion loop cxl_pci.h and cxl_cdat_h
  include/hw/pci: Include hw/pci/pci.h where needed
  include/hw/pci: Split pci_device.h off pci.h
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2 years ago Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
Peter Maydell [Sun, 8 Jan 2023 14:27:40 +0000 (14:27 +0000)]
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* Atomic memslot updates for KVM (Emanuele, David)
* Always send errors to logfile when daemonized (Greg)
* Add support for IDE CompactFlash card (Lubomir)
* First round of build system cleanups (myself)
* First round of feature removals (myself)
* Reduce "qemu/accel.h" inclusion (Philippe)

# gpg: Signature made Thu 05 Jan 2023 23:51:09 GMT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (24 commits)
  i386: SGX: remove deprecated member of SGXInfo
  target/i386: Add SGX aex-notify and EDECCSSA support
  util: remove support -chardev tty and -chardev parport
  util: remove support for hex numbers with a scaling suffix
  KVM: remove support for kernel-irqchip=off
  docs: do not talk about past removal as happening in the future
  meson: accept relative symlinks in "meson introspect --installed" data
  meson: cleanup compiler detection
  meson: support meson 0.64 -Doptimization=plain
  configure: test all warnings
  tests/qapi-schema: remove Meson workaround
  meson: cleanup dummy-cpus.c rules
  meson: tweak hardening options for Windows
  configure: remove backwards-compatibility and obsolete options
  configure: preserve qemu-ga variables
  configure: cleanup $cpu tests
  configure: remove dead function
  configure: remove useless write_c_skeleton
  ide: Add "ide-cf" driver, a CompactFlash card
  ide: Add 8-bit data mode
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2 years ago Merge tag 'pull-tcg-20230106' of https://gitlab.com/rth7680/qemu into staging
Peter Maydell [Sun, 8 Jan 2023 11:23:17 +0000 (11:23 +0000)]
Merge tag 'pull-tcg-20230106' of https://gitlab.com/rth7680/qemu into staging

tcg/s390x improvements:
 - drop support for pre-z196 cpus (eol before 2017)
 - add support for misc-instruction-extensions-3
 - misc cleanups

# gpg: Signature made Sat 07 Jan 2023 07:47:59 GMT
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-tcg-20230106' of https://gitlab.com/rth7680/qemu: (27 commits)
  tcg/s390x: Avoid the constant pool in tcg_out_movi
  tcg/s390x: Cleanup tcg_out_movi
  tcg/s390x: Tighten constraints for 64-bit compare
  tcg/s390x: Implement ctpop operation
  tcg/s390x: Use tgen_movcond_int in tgen_clz
  tcg/s390x: Support SELGR instruction in movcond
  tcg/s390x: Generalize movcond implementation
  tcg/s390x: Create tgen_cmp2 to simplify movcond
  tcg/s390x: Support MIE3 logical operations
  tcg/s390x: Tighten constraints for and_i64
  tcg/s390x: Tighten constraints for or_i64 and xor_i64
  tcg/s390x: Issue XILF directly for xor_i32
  tcg/s390x: Support MIE2 MGRK instruction
  tcg/s390x: Support MIE2 multiply single instructions
  tcg/s390x: Distinguish RIE formats
  tcg/s390x: Distinguish RRF-a and RRF-c formats
  tcg/s390x: Use LARL+AGHI for odd addresses
  tcg/s390x: Remove DISTINCT_OPERANDS facility check
  tcg/s390x: Remove FAST_BCR_SER facility check
  tcg/s390x: Check for load-on-condition facility at startup
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2 years ago vhost-scsi: fix memleak of vsc->inflight
Dongli Zhang [Wed, 4 Jan 2023 16:04:33 +0000 (08:04 -0800)]
vhost-scsi: fix memleak of vsc->inflight

The memory leak below is detected when quitting qemu-system-x86_64 (with
vhost-scsi-pci).

(qemu) quit

=================================================================
==15568==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7f00aec57917 in __interceptor_calloc (/lib64/libasan.so.6+0xb4917)
    #1 0x7f00ada0d7b5 in g_malloc0 (/lib64/libglib-2.0.so.0+0x517b5)
    #2 0x5648ffd38bac in vhost_scsi_start ../hw/scsi/vhost-scsi.c:92
    #3 0x5648ffd38d52 in vhost_scsi_set_status ../hw/scsi/vhost-scsi.c:131
    #4 0x5648ffda340e in virtio_set_status ../hw/virtio/virtio.c:2036
    #5 0x5648ff8de281 in virtio_ioport_write ../hw/virtio/virtio-pci.c:431
    #6 0x5648ff8deb29 in virtio_pci_config_write ../hw/virtio/virtio-pci.c:576
    #7 0x5648ffe5c0c2 in memory_region_write_accessor ../softmmu/memory.c:493
    #8 0x5648ffe5c424 in access_with_adjusted_size ../softmmu/memory.c:555
    #9 0x5648ffe6428f in memory_region_dispatch_write ../softmmu/memory.c:1515
    #10 0x5648ffe8613d in flatview_write_continue ../softmmu/physmem.c:2825
    #11 0x5648ffe86490 in flatview_write ../softmmu/physmem.c:2867
    #12 0x5648ffe86d9f in address_space_write ../softmmu/physmem.c:2963
    #13 0x5648ffe86e57 in address_space_rw ../softmmu/physmem.c:2973
    #14 0x5648fffbfb3d in kvm_handle_io ../accel/kvm/kvm-all.c:2639
    #15 0x5648fffc0e0d in kvm_cpu_exec ../accel/kvm/kvm-all.c:2890
    #16 0x5648fffc90a7 in kvm_vcpu_thread_fn ../accel/kvm/kvm-accel-ops.c:51
    #17 0x56490042400a in qemu_thread_start ../util/qemu-thread-posix.c:505
    #18 0x7f00ac3b6ea4 in start_thread (/lib64/libpthread.so.0+0x7ea4)

Free vsc->inflight in the 'stop' path.

Fixes: b82526c7ee ("vhost-scsi: support inflight io track")
Cc: Joe Jin <joe.jin@oracle.com>
Cc: Li Feng <fengli@smartx.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Message-Id: <20230104160433.21353-1-dongli.zhang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years ago acpi: cpuhp: fix guest-visible maximum access size to the legacy reg block
Laszlo Ersek [Thu, 5 Jan 2023 16:18:04 +0000 (17:18 +0100)]
acpi: cpuhp: fix guest-visible maximum access size to the legacy reg block

The modern ACPI CPU hotplug interface was introduced in the following
series (aa1dd39ca307..679dd1a957df), released in v2.7.0:

  1  abd49bc2ed2f docs: update ACPI CPU hotplug spec with new protocol
  2  16bcab97eb9f pc: piix4/ich9: add 'cpu-hotplug-legacy' property
  3  5e1b5d93887b acpi: cpuhp: add CPU devices AML with _STA method
  4  ac35f13ba8f8 pc: acpi: introduce AcpiDeviceIfClass.madt_cpu hook
  5  d2238cb6781d acpi: cpuhp: implement hot-add parts of CPU hotplug
                  interface
  6  8872c25a26cc acpi: cpuhp: implement hot-remove parts of CPU hotplug
                  interface
  7  76623d00ae57 acpi: cpuhp: add cpu._OST handling
  8  679dd1a957df pc: use new CPU hotplug interface since 2.7 machine type

Before patch#1, "docs/specs/acpi_cpu_hotplug.txt" only specified 1-byte
accesses for the hotplug register block.  Patch#1 preserved the same
restriction for the legacy register block, but:

- it specified DWORD accesses for some of the modern registers,

- in particular, the switch from the legacy block to the modern block
  would require a DWORD write to the *legacy* block.

The latter functionality was then implemented in cpu_status_write()
[hw/acpi/cpu_hotplug.c], in patch#8.

Unfortunately, all DWORD accesses depended on a dormant bug: the one
introduced in earlier commit a014ed07bd5a ("memory: accept mismatching
sizes in memory_region_access_valid", 2013-05-29); first released in
v1.6.0.  Due to commit a014ed07bd5a, the DWORD accesses to the *legacy*
CPU hotplug register block would work in spite of the above series *not*
relaxing "valid.max_access_size = 1" in "hw/acpi/cpu_hotplug.c":

> static const MemoryRegionOps AcpiCpuHotplug_ops = {
>     .read = cpu_status_read,
>     .write = cpu_status_write,
>     .endianness = DEVICE_LITTLE_ENDIAN,
>     .valid = {
>         .min_access_size = 1,
>         .max_access_size = 1,
>     },
> };

Later, in commits e6d0c3ce6895 ("acpi: cpuhp: introduce 'Command data 2'
field", 2020-01-22) and ae340aa3d256 ("acpi: cpuhp: spec: add typical
usecases", 2020-01-22), first released in v5.0.0, the modern CPU hotplug
interface (including the documentation) was extended with another DWORD
*read* access, namely to the "Command data 2" register, which would be
important for the guest to confirm whether it managed to switch the
register block from legacy to modern.

This functionality too silently depended on the bug from commit
a014ed07bd5a.

In commit 5d971f9e6725 ('memory: Revert "memory: accept mismatching sizes
in memory_region_access_valid"', 2020-06-26), first released in v5.1.0,
the bug from commit a014ed07bd5a was fixed (the commit was reverted).
That swiftly exposed the bug in "AcpiCpuHotplug_ops", still present from
the v2.7.0 series quoted at the top -- namely the fact that
"valid.max_access_size = 1" didn't match what the guest was supposed to
do, according to the spec ("docs/specs/acpi_cpu_hotplug.txt").

The symptom is that the "modern interface negotiation protocol"
described in commit ae340aa3d256:

> +      Use following steps to detect and enable modern CPU hotplug interface:
> +        1. Store 0x0 to the 'CPU selector' register,
> +           attempting to switch to modern mode
> +        2. Store 0x0 to the 'CPU selector' register,
> +           to ensure valid selector value
> +        3. Store 0x0 to the 'Command field' register,
> +        4. Read the 'Command data 2' register.
> +           If read value is 0x0, the modern interface is enabled.
> +           Otherwise legacy or no CPU hotplug interface available

falls apart for the guest: steps 1 and 2 are lost, because they are DWORD
writes; so no switching happens.  Step 3 (a single-byte write) is not
lost, but it has no effect; see the condition in cpu_status_write() in
patch#8.  And step 4 *misleads* the guest into thinking that the switch
worked: the DWORD read is lost again -- it returns zero to the guest
without ever reaching the device model, so the guest never learns the
switch didn't work.
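
The false positive can be sketched with a toy model. Everything below is an
illustration under stated assumptions, not QEMU code: the register offsets,
the "zero written at offset 0" switch condition, and the value returned by
a legacy byte read are assumed for the example. Only the two behaviors
described above are modeled: oversized writes are dropped, and oversized
reads return zero without reaching the handler.

```python
# Register offsets within the hotplug block (assumed for illustration).
CPU_SELECTOR = 0x0    # written with a DWORD access
COMMAND_FIELD = 0x5   # written with a single-byte access
COMMAND_DATA2 = 0x0   # read with a DWORD access


class LegacyHotplugBlock:
    """Toy legacy register block that only accepts 1-byte accesses."""
    max_access_size = 1

    def __init__(self):
        self.modern = False
        self.handler_calls = []

    def write(self, addr, data):
        self.handler_calls.append(("write", addr, data))
        # Assumed switch condition: a zero written at the start of the block.
        if addr == 0 and data == 0:
            self.modern = True

    def read(self, addr):
        self.handler_calls.append(("read", addr))
        # Assumed legacy content: a non-zero "CPU present" byte at offset 0.
        return 0x01 if addr == 0 else 0x00


def io_write(dev, addr, data, size):
    if size > dev.max_access_size:
        return  # invalid access: dropped before reaching the handler
    dev.write(addr, data)


def io_read(dev, addr, size):
    if size > dev.max_access_size:
        return 0  # invalid access: reads as zero, handler never runs
    return dev.read(addr)


def guest_detects_modern(dev):
    io_write(dev, CPU_SELECTOR, 0, 4)            # step 1: lost
    io_write(dev, CPU_SELECTOR, 0, 4)            # step 2: lost
    io_write(dev, COMMAND_FIELD, 0, 1)           # step 3: delivered, no effect
    return io_read(dev, COMMAND_DATA2, 4) == 0   # step 4: always zero
```

In this model guest_detects_modern() returns True while dev.modern is still
False, and the only handler call recorded is the byte write from step 3.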

This means that guest behavior centered on the "Command data 2" register
worked *only* in the v5.0.0 release; it got effectively regressed in
v5.1.0.

To make things *even more* complicated, the breakage was (and remains, as
of today) visible with TCG acceleration only.  Commit 5d971f9e6725 makes
no difference with KVM acceleration -- the DWORD accesses still work,
despite "valid.max_access_size = 1".

As commit 5d971f9e6725 suggests, fix the problem by raising
"valid.max_access_size" to 4 -- the spec now clearly instructs the guest
to perform DWORD accesses to the legacy register block too, for enabling
(and verifying!) the modern block.  In order to keep compatibility for the
device model implementation though, set "impl.max_access_size = 1", so
that wide accesses be split before they reach the legacy read/write
handlers, like they always have been on KVM, and like they were on TCG
before 5d971f9e6725 (v5.1.0).
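
The interplay of the two limits can be sketched as follows. This is a
simplified model written for illustration, not the logic in
softmmu/memory.c: it checks only the access size (the real validity check
considers more than that), it assumes a little-endian device, and the names
dispatch_write() and record_writes() are hypothetical.

```python
def dispatch_write(handler, addr, value, size, valid_max, impl_max):
    """Return True if the access was accepted, False if it was rejected."""
    if size > valid_max:
        return False  # wider than "valid.max_access_size": never delivered
    step = min(size, impl_max)
    mask = (1 << (8 * step)) - 1
    for offset in range(0, size, step):
        # Little-endian split: the lowest address gets the lowest bytes.
        handler(addr + offset, (value >> (8 * offset)) & mask, step)
    return True


def record_writes(log):
    """Build a handler that appends (addr, data, size) tuples to log."""
    def handler(addr, data, size):
        log.append((addr, data, size))
    return handler
```

With valid_max=1 a DWORD write is rejected outright. With valid_max=4 and
impl_max=1 the same write reaches the handler as four single-byte writes, so
a handler written for byte accesses keeps working unchanged.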

Tested with:

- OVMF IA32 + qemu-system-i386, CPU hotplug/hot-unplug with SMM,
  intermixed with ACPI S3 suspend/resume, using KVM accel
  (regression-test);

- OVMF IA32X64 + qemu-system-x86_64, CPU hotplug/hot-unplug with SMM,
  intermixed with ACPI S3 suspend/resume, using KVM accel
  (regression-test);

- OVMF IA32 + qemu-system-i386, SMM enabled, using TCG accel; verified the
  register block switch and the present/possible CPU counting through the
  modern hotplug interface, during OVMF boot (bugfix test);

- I do not have any testcase (guest payload) for regression-testing CPU
  hotplug through the *legacy* CPU hotplug register block.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Ani Sinha <ani@anisinha.ca>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: qemu-stable@nongnu.org
Ref: "IO port write width clamping differs between TCG and KVM"
Link: http://mid.mail-archive.com/aaedee84-d3ed-a4f9-21e7-d221a28d1683@redhat.com
Link: https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg00199.html
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230105161804.82486-1-lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years ago tests: acpi: aarch64: Add *.topology tables
Yicong Yang [Thu, 29 Dec 2022 06:55:13 +0000 (14:55 +0800)]
tests: acpi: aarch64: Add *.topology tables

Add *.topology tables for the aarch64 topology test and empty
bios-tables-test-allowed-diff.h.

The disassembled differences between actual and expected
PPTT (the table which we actually care about):

 +/*
 + * Intel ACPI Component Architecture
 + * AML/ASL+ Disassembler version 20180105 (64-bit version)
 + * Copyright (c) 2000 - 2018 Intel Corporation
 + *
 + * Disassembly of /tmp/aml-WUN4U1, Tue Nov  1 09:51:52 2022
 + *
 + * ACPI Data Table [PPTT]
 + *
 + * Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue
 + */
 +
 +[000h 0000   4]                    Signature : "PPTT"    [Processor Properties Topology Table]
 +[004h 0004   4]                 Table Length : 00000150
 +[008h 0008   1]                     Revision : 02
 +[009h 0009   1]                     Checksum : 7C
 +[00Ah 0010   6]                       Oem ID : "BOCHS "
 +[010h 0016   8]                 Oem Table ID : "BXPC    "
 +[018h 0024   4]                 Oem Revision : 00000001
 +[01Ch 0028   4]              Asl Compiler ID : "BXPC"
 +[020h 0032   4]        Asl Compiler Revision : 00000001
 +
 +
 +[024h 0036   1]                Subtable Type : 00 [Processor Hierarchy Node]
 +[025h 0037   1]                       Length : 14
 +[026h 0038   2]                     Reserved : 0000
 +[028h 0040   4]        Flags (decoded below) : 00000001
 +                            Physical package : 1
 +                     ACPI Processor ID valid : 0
 +[02Ch 0044   4]                       Parent : 00000000
 +[030h 0048   4]            ACPI Processor ID : 00000000
 +[034h 0052   4]      Private Resource Number : 00000000
 +
 +[038h 0056   1]                Subtable Type : 00 [Processor Hierarchy Node]
 +[039h 0057   1]                       Length : 14
 +[03Ah 0058   2]                     Reserved : 0000
 +[03Ch 0060   4]        Flags (decoded below) : 00000000
 +                            Physical package : 0
 +                     ACPI Processor ID valid : 0
 +[040h 0064   4]                       Parent : 00000024
 +[044h 0068   4]            ACPI Processor ID : 00000000
 +[048h 0072   4]      Private Resource Number : 00000000
 +
 +[04Ch 0076   1]                Subtable Type : 00 [Processor Hierarchy Node]
 +[04Dh 0077   1]                       Length : 14
 +[04Eh 0078   2]                     Reserved : 0000
 +[050h 0080   4]        Flags (decoded below) : 00000000
 +                            Physical package : 0
 +                     ACPI Processor ID valid : 0
 +[054h 0084   4]                       Parent : 00000038
 +[058h 0088   4]            ACPI Processor ID : 00000000
 +[05Ch 0092   4]      Private Resource Number : 00000000
 +
 +[060h 0096   1]                Subtable Type : 00 [Processor Hierarchy Node]
 +[061h 0097   1]                       Length : 14
 +[062h 0098   2]                     Reserved : 0000
 +[064h 0100   4]        Flags (decoded below) : 0000000E
 +                            Physical package : 0
 +                     ACPI Processor ID valid : 1
 +[068h 0104   4]                       Parent : 0000004C
 +[06Ch 0108   4]            ACPI Processor ID : 00000000
 +[070h 0112   4]      Private Resource Number : 00000000
 +
 +[074h 0116   1]                Subtable Type : 00 [Processor Hierarchy Node]
 +[075h 0117   1]                       Length : 14
 +[076h 0118   2]                     Reserved : 0000
 +[078h 0120   4]        Flags (decoded below) : 0000000E
 +                            Physical package : 0
 +                     ACPI Processor ID valid : 1
 +[07Ch 0124   4]                       Parent : 0000004C
 +[080h 0128   4]            ACPI Processor ID : 00000001
 +[084h 0132   4]      Private Resource Number : 00000000
 +
 +[088h 0136   1]                Subtable Type : 00 [Processor Hierarchy Node]
 +[089h 0137   1]                       Length : 14
 +[08Ah 0138   2]                     Reserved : 0000
 +[08Ch 0140   4]        Flags (decoded below) : 00000000
 +                            Physical package : 0
 +                     ACPI Processor ID valid : 0
 +[090h 0144   4]                       Parent : 00000038
 +[094h 0148   4]            ACPI Processor ID : 00000001
 +[098h 0152   4]      Private Resource Number : 00000000
 +
 +[09Ch 0156   1]                Subtable Type : 00 [Processor Hierarchy Node]
 +[09Dh 0157   1]                       Length : 14
 +[09Eh 0158   2]                     Reserved : 0000
 +[0A0h 0160   4]        Flags (decoded below) : 0000000E
 +                            Physical package : 0
 +                     ACPI Processor ID valid : 1
 +[0A4h 0164   4]                       Parent : 00000088
 +[0A8h 0168   4]            ACPI Processor ID : 00000002
 +[0ACh 0172   4]      Private Resource Number : 00000000
 +
 +[0B0h 0176   1]                Subtable Type : 00 [Processor Hierarchy Node]
 +[0B1h 0177   1]                       Length : 14
 +[0B2h 0178   2]                     Reserved : 0000
 +[0B4h 0180   4]        Flags (decoded below) : 0000000E
 +                            Physical package : 0
 +                     ACPI Processor ID valid : 1
 +[0B8h 0184   4]                       Parent : 00000088
 +[0BCh 0188   4]            ACPI Processor ID : 00000003
 +[0C0h 0192   4]      Private Resource Number : 00000000
 +
 +[0C4h 0196   1]                Subtable Type : 00 [Processor Hierarchy Node]
 +[0C5h 0197   1]                       Length : 14
 +[0C6h 0198   2]                     Reserved : 0000
 +[0C8h 0200   4]        Flags (decoded below) : 00000000
 +                            Physical package : 0
 +                     ACPI Processor ID valid : 0
 +[0CCh 0204   4]                       Parent : 00000024
 +[0D0h 0208   4]            ACPI Processor ID : 00000001
 +[0D4h 0212   4]      Private Resource Number : 00000000
 +
 +[0D8h 0216   1]                Subtable Type : 00 [Processor Hierarchy Node]
 +[0D9h 0217   1]                       Length : 14
 +[0DAh 0218   2]                     Reserved : 0000
 +[0DCh 0220   4]        Flags (decoded below) : 00000000
 +                            Physical package : 0
 +                     ACPI Processor ID valid : 0
 +[0E0h 0224   4]                       Parent : 000000C4
 +[0E4h 0228   4]            ACPI Processor ID : 00000000
 +[0E8h 0232   4]      Private Resource Number : 00000000
 +
 +[0ECh 0236   1]                Subtable Type : 00 [Processor Hierarchy Node]
 +[0EDh 0237   1]                       Length : 14
 +[0EEh 0238   2]                     Reserved : 0000
 +[0F0h 0240   4]        Flags (decoded below) : 0000000E
 +                            Physical package : 0
 +                     ACPI Processor ID valid : 1
 +[0F4h 0244   4]                       Parent : 000000D8
 +[0F8h 0248   4]            ACPI Processor ID : 00000004
 +[0FCh 0252   4]      Private Resource Number : 00000000
 +
 +[100h 0256   1]                Subtable Type : 00 [Processor Hierarchy Node]
 +[101h 0257   1]                       Length : 14
 +[102h 0258   2]                     Reserved : 0000
 +[104h 0260   4]        Flags (decoded below) : 0000000E
 +                            Physical package : 0
 +                     ACPI Processor ID valid : 1
 +[108h 0264   4]                       Parent : 000000D8
 +[10Ch 0268   4]            ACPI Processor ID : 00000005
 +[110h 0272   4]      Private Resource Number : 00000000
 +
 +[114h 0276   1]                Subtable Type : 00 [Processor Hierarchy Node]
 +[115h 0277   1]                       Length : 14
 +[116h 0278   2]                     Reserved : 0000
 +[118h 0280   4]        Flags (decoded below) : 00000000
 +                            Physical package : 0
 +                     ACPI Processor ID valid : 0
 +[11Ch 0284   4]                       Parent : 000000C4
 +[120h 0288   4]            ACPI Processor ID : 00000001
 +[124h 0292   4]      Private Resource Number : 00000000
 +
 +[128h 0296   1]                Subtable Type : 00 [Processor Hierarchy Node]
 +[129h 0297   1]                       Length : 14
 +[12Ah 0298   2]                     Reserved : 0000
 +[12Ch 0300   4]        Flags (decoded below) : 0000000E
 +                            Physical package : 0
 +                     ACPI Processor ID valid : 1
 +[130h 0304   4]                       Parent : 00000114
 +[134h 0308   4]            ACPI Processor ID : 00000006
 +[138h 0312   4]      Private Resource Number : 00000000
 +
 +[13Ch 0316   1]                Subtable Type : 00 [Processor Hierarchy Node]
 +[13Dh 0317   1]                       Length : 14
 +[13Eh 0318   2]                     Reserved : 0000
 +[140h 0320   4]        Flags (decoded below) : 0000000E
 +                            Physical package : 0
 +                     ACPI Processor ID valid : 1
 +[144h 0324   4]                       Parent : 00000114
 +[148h 0328   4]            ACPI Processor ID : 00000007
 +[14Ch 0332   4]      Private Resource Number : 00000000
 +
 +Raw Table Data: Length 336 (0x150)
 +
 +  0000: 50 50 54 54 50 01 00 00 02 7C 42 4F 43 48 53 20  // PPTTP....|BOCHS
 +  0010: 42 58 50 43 20 20 20 20 01 00 00 00 42 58 50 43  // BXPC    ....BXPC
 +  0020: 01 00 00 00 00 14 00 00 01 00 00 00 00 00 00 00  // ................
 +  0030: 00 00 00 00 00 00 00 00 00 14 00 00 00 00 00 00  // ................
 +  0040: 24 00 00 00 00 00 00 00 00 00 00 00 00 14 00 00  // $...............
 +  0050: 00 00 00 00 38 00 00 00 00 00 00 00 00 00 00 00  // ....8...........
 +  0060: 00 14 00 00 0E 00 00 00 4C 00 00 00 00 00 00 00  // ........L.......
 +  0070: 00 00 00 00 00 14 00 00 0E 00 00 00 4C 00 00 00  // ............L...
 +  0080: 01 00 00 00 00 00 00 00 00 14 00 00 00 00 00 00  // ................
 +  0090: 38 00 00 00 01 00 00 00 00 00 00 00 00 14 00 00  // 8...............
 +  00A0: 0E 00 00 00 88 00 00 00 02 00 00 00 00 00 00 00  // ................
 +  00B0: 00 14 00 00 0E 00 00 00 88 00 00 00 03 00 00 00  // ................
 +  00C0: 00 00 00 00 00 14 00 00 00 00 00 00 24 00 00 00  // ............$...
 +  00D0: 01 00 00 00 00 00 00 00 00 14 00 00 00 00 00 00  // ................
 +  00E0: C4 00 00 00 00 00 00 00 00 00 00 00 00 14 00 00  // ................
 +  00F0: 0E 00 00 00 D8 00 00 00 04 00 00 00 00 00 00 00  // ................
 +  0100: 00 14 00 00 0E 00 00 00 D8 00 00 00 05 00 00 00  // ................
 +  0110: 00 00 00 00 00 14 00 00 00 00 00 00 C4 00 00 00  // ................
 +  0120: 01 00 00 00 00 00 00 00 00 14 00 00 0E 00 00 00  // ................
 +  0130: 14 01 00 00 06 00 00 00 00 00 00 00 00 14 00 00  // ................
 +  0140: 0E 00 00 00 14 01 00 00 07 00 00 00 00 00 00 00  // ................
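
The hierarchy encoded by the Parent fields can be recovered with a short
script. The sketch below copies the (offset, parent, ACPI Processor ID)
triples from the disassembly above and walks each leaf back to the root; the
function name leaf_paths() is hypothetical, and reading the four levels as
socket, cluster, core and thread is an interpretation, not something the
table states.

```python
# (node offset, parent offset, ACPI Processor ID) from the disassembly above.
NODES = [
    (0x024, 0x000, 0), (0x038, 0x024, 0), (0x04C, 0x038, 0),
    (0x060, 0x04C, 0), (0x074, 0x04C, 1), (0x088, 0x038, 1),
    (0x09C, 0x088, 2), (0x0B0, 0x088, 3), (0x0C4, 0x024, 1),
    (0x0D8, 0x0C4, 0), (0x0EC, 0x0D8, 4), (0x100, 0x0D8, 5),
    (0x114, 0x0C4, 1), (0x128, 0x114, 6), (0x13C, 0x114, 7),
]


def leaf_paths(nodes):
    """Map each leaf's ACPI Processor ID to its root-to-leaf offset path."""
    parent_of = {offset: parent for offset, parent, _ in nodes}
    interior = set(parent_of.values())
    paths = {}
    for offset, _, cpu_id in nodes:
        if offset in interior:
            continue  # some other node names this one as its parent
        path = [offset]
        while parent_of[path[-1]] != 0:  # a parent of 0 marks the root
            path.append(parent_of[path[-1]])
        paths[cpu_id] = list(reversed(path))
    return paths
```

For this table the walk finds eight leaves with ACPI Processor IDs 0 to 7,
each four levels deep, spread evenly over the two second-level nodes at
offsets 0x38 and 0xC4.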

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Message-Id: <20221229065513.55652-7-yangyicong@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years ago tests: acpi: aarch64: Add topology test for aarch64
Yicong Yang [Thu, 29 Dec 2022 06:55:12 +0000 (14:55 +0800)]
tests: acpi: aarch64: Add topology test for aarch64

Add a test for aarch64's ACPI topology building for all the supported
levels.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Tested-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Message-Id: <20221229065513.55652-6-yangyicong@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years ago tests: acpi: Add and whitelist *.topology blobs
Yicong Yang [Thu, 29 Dec 2022 06:55:11 +0000 (14:55 +0800)]
tests: acpi: Add and whitelist *.topology blobs

Add and whitelist *.topology blobs in preparation for the aarch64 ACPI
topology building test.

Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Message-Id: <20221229065513.55652-5-yangyicong@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years ago tests: virt: Update expected ACPI tables for virt test
Yicong Yang [Thu, 29 Dec 2022 06:55:10 +0000 (14:55 +0800)]
tests: virt: Update expected ACPI tables for virt test

Update the expected ACPI tables to match the ACPI aml-build change, and
empty bios-tables-test-allowed-diff.h.

The disassembled differences between actual and expected PPTT:

  /*
   * Intel ACPI Component Architecture
   * AML/ASL+ Disassembler version 20180105 (64-bit version)
   * Copyright (c) 2000 - 2018 Intel Corporation
   *
 - * Disassembly of tests/data/acpi/virt/PPTT, Tue Nov  1 09:29:12 2022
 + * Disassembly of /tmp/aml-DIIGV1, Tue Nov  1 09:29:12 2022
   *
   * ACPI Data Table [PPTT]
   *
   * Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue
   */

  [000h 0000   4]                    Signature : "PPTT"    [Processor Properties Topology Table]
 -[004h 0004   4]                 Table Length : 00000060
 +[004h 0004   4]                 Table Length : 0000004C
  [008h 0008   1]                     Revision : 02
 -[009h 0009   1]                     Checksum : 48
 +[009h 0009   1]                     Checksum : A8
  [00Ah 0010   6]                       Oem ID : "BOCHS "
  [010h 0016   8]                 Oem Table ID : "BXPC    "
  [018h 0024   4]                 Oem Revision : 00000001
  [01Ch 0028   4]              Asl Compiler ID : "BXPC"
  [020h 0032   4]        Asl Compiler Revision : 00000001

  [024h 0036   1]                Subtable Type : 00 [Processor Hierarchy Node]
  [025h 0037   1]                       Length : 14
  [026h 0038   2]                     Reserved : 0000
  [028h 0040   4]        Flags (decoded below) : 00000001
                              Physical package : 1
                       ACPI Processor ID valid : 0
  [02Ch 0044   4]                       Parent : 00000000
  [030h 0048   4]            ACPI Processor ID : 00000000
  [034h 0052   4]      Private Resource Number : 00000000

  [038h 0056   1]                Subtable Type : 00 [Processor Hierarchy Node]
  [039h 0057   1]                       Length : 14
  [03Ah 0058   2]                     Reserved : 0000
 -[03Ch 0060   4]        Flags (decoded below) : 00000000
 +[03Ch 0060   4]        Flags (decoded below) : 0000000A
                              Physical package : 0
 -                     ACPI Processor ID valid : 0
 +                     ACPI Processor ID valid : 1
  [040h 0064   4]                       Parent : 00000024
  [044h 0068   4]            ACPI Processor ID : 00000000
  [048h 0072   4]      Private Resource Number : 00000000

 -[04Ch 0076   1]                Subtable Type : 00 [Processor Hierarchy Node]
 -[04Dh 0077   1]                       Length : 14
 -[04Eh 0078   2]                     Reserved : 0000
 -[050h 0080   4]        Flags (decoded below) : 0000000A
 -                            Physical package : 0
 -                     ACPI Processor ID valid : 1
 -[054h 0084   4]                       Parent : 00000038
 -[058h 0088   4]            ACPI Processor ID : 00000000
 -[05Ch 0092   4]      Private Resource Number : 00000000
 -
 -Raw Table Data: Length 96 (0x60)
 +Raw Table Data: Length 76 (0x4C)

 -  0000: 50 50 54 54 60 00 00 00 02 48 42 4F 43 48 53 20  // PPTT`....HBOCHS
 +  0000: 50 50 54 54 4C 00 00 00 02 A8 42 4F 43 48 53 20  // PPTTL.....BOCHS
    0010: 42 58 50 43 20 20 20 20 01 00 00 00 42 58 50 43  // BXPC    ....BXPC
    0020: 01 00 00 00 00 14 00 00 01 00 00 00 00 00 00 00  // ................
 -  0030: 00 00 00 00 00 00 00 00 00 14 00 00 00 00 00 00  // ................
 -  0040: 24 00 00 00 00 00 00 00 00 00 00 00 00 14 00 00  // $...............
 -  0050: 0A 00 00 00 38 00 00 00 00 00 00 00 00 00 00 00  // ....8...........
 +  0030: 00 00 00 00 00 00 00 00 00 14 00 00 0A 00 00 00  // ................
 +  0040: 24 00 00 00 00 00 00 00 00 00 00 00              // $...........
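
The Checksum field changes along with the contents because ACPI requires
all bytes of a table, the checksum byte included, to sum to zero modulo
256. A minimal sketch that checks this for the new 76-byte table dumped
above; the names pptt_new and acpi_table_checksum_ok are illustrative and
are not QEMU identifiers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* New PPTT raw data (76 bytes), copied from the dump above. */
static const uint8_t pptt_new[76] = {
    0x50, 0x50, 0x54, 0x54, 0x4C, 0x00, 0x00, 0x00,
    0x02, 0xA8, 0x42, 0x4F, 0x43, 0x48, 0x53, 0x20,
    0x42, 0x58, 0x50, 0x43, 0x20, 0x20, 0x20, 0x20,
    0x01, 0x00, 0x00, 0x00, 0x42, 0x58, 0x50, 0x43,
    0x01, 0x00, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00,
    0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x14, 0x00, 0x00, 0x0A, 0x00, 0x00, 0x00,
    0x24, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00,
};

/* Illustrative helper: a table is valid when its bytes sum to 0 mod 256. */
static int acpi_table_checksum_ok(const uint8_t *table, size_t len)
{
    uint8_t sum = 0;

    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + table[i]);
    }
    return sum == 0;
}
```

Calling acpi_table_checksum_ok(pptt_new, sizeof(pptt_new)) exercises the
check on the table as dumped; byte 4 of the array is the Table Length
field and byte 9 is the Checksum field.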

PPTT.acpihmatvirt is also updated:
  /*
   * Intel ACPI Component Architecture
   * AML/ASL+ Disassembler version 20180105 (64-bit version)
   * Copyright (c) 2000 - 2018 Intel Corporation
   *
 - * Disassembly of tests/data/acpi/virt/PPTT.acpihmatvirt, Wed Dec 28 15:36:06 2022
 + * Disassembly of /tmp/aml-IPKJX1, Wed Dec 28 15:36:06 2022
   *
   * ACPI Data Table [PPTT]
   *
   * Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue
   */

  [000h 0000   4]                    Signature : "PPTT"    [Processor Properties Topology Table]
 -[004h 0004   4]                 Table Length : 000000C4
 +[004h 0004   4]                 Table Length : 0000009C
  [008h 0008   1]                     Revision : 02
 -[009h 0009   1]                     Checksum : 9E
 +[009h 0009   1]                     Checksum : FE
  [00Ah 0010   6]                       Oem ID : "BOCHS "
  [010h 0016   8]                 Oem Table ID : "BXPC    "
  [018h 0024   4]                 Oem Revision : 00000001
  [01Ch 0028   4]              Asl Compiler ID : "BXPC"
  [020h 0032   4]        Asl Compiler Revision : 00000001

  [024h 0036   1]                Subtable Type : 00 [Processor Hierarchy Node]
  [025h 0037   1]                       Length : 14
  [026h 0038   2]                     Reserved : 0000
  [028h 0040   4]        Flags (decoded below) : 00000001
                              Physical package : 1
                       ACPI Processor ID valid : 0
  [02Ch 0044   4]                       Parent : 00000000
  [030h 0048   4]            ACPI Processor ID : 00000000
  [034h 0052   4]      Private Resource Number : 00000000

  [038h 0056   1]                Subtable Type : 00 [Processor Hierarchy Node]
  [039h 0057   1]                       Length : 14
  [03Ah 0058   2]                     Reserved : 0000
 -[03Ch 0060   4]        Flags (decoded below) : 00000000
 +[03Ch 0060   4]        Flags (decoded below) : 0000000A
                              Physical package : 0
 -                     ACPI Processor ID valid : 0
 +                     ACPI Processor ID valid : 1
  [040h 0064   4]                       Parent : 00000024
  [044h 0068   4]            ACPI Processor ID : 00000000
  [048h 0072   4]      Private Resource Number : 00000000

  [04Ch 0076   1]                Subtable Type : 00 [Processor Hierarchy Node]
  [04Dh 0077   1]                       Length : 14
  [04Eh 0078   2]                     Reserved : 0000
  [050h 0080   4]        Flags (decoded below) : 0000000A
                              Physical package : 0
                       ACPI Processor ID valid : 1
 -[054h 0084   4]                       Parent : 00000038
 -[058h 0088   4]            ACPI Processor ID : 00000000
 +[054h 0084   4]                       Parent : 00000024
 +[058h 0088   4]            ACPI Processor ID : 00000001
  [05Ch 0092   4]      Private Resource Number : 00000000

  [060h 0096   1]                Subtable Type : 00 [Processor Hierarchy Node]
  [061h 0097   1]                       Length : 14
  [062h 0098   2]                     Reserved : 0000
 -[064h 0100   4]        Flags (decoded below) : 0000000A
 -                            Physical package : 0
 -                     ACPI Processor ID valid : 1
 -[068h 0104   4]                       Parent : 00000038
 +[064h 0100   4]        Flags (decoded below) : 00000001
 +                            Physical package : 1
 +                     ACPI Processor ID valid : 0
 +[068h 0104   4]                       Parent : 00000000
  [06Ch 0108   4]            ACPI Processor ID : 00000001
  [070h 0112   4]      Private Resource Number : 00000000

  [074h 0116   1]                Subtable Type : 00 [Processor Hierarchy Node]
  [075h 0117   1]                       Length : 14
  [076h 0118   2]                     Reserved : 0000
 -[078h 0120   4]        Flags (decoded below) : 00000001
 -                            Physical package : 1
 -                     ACPI Processor ID valid : 0
 -[07Ch 0124   4]                       Parent : 00000000
 -[080h 0128   4]            ACPI Processor ID : 00000001
 +[078h 0120   4]        Flags (decoded below) : 0000000A
 +                            Physical package : 0
 +                     ACPI Processor ID valid : 1
 +[07Ch 0124   4]                       Parent : 00000060
 +[080h 0128   4]            ACPI Processor ID : 00000002
  [084h 0132   4]      Private Resource Number : 00000000

  [088h 0136   1]                Subtable Type : 00 [Processor Hierarchy Node]
  [089h 0137   1]                       Length : 14
  [08Ah 0138   2]                     Reserved : 0000
 -[08Ch 0140   4]        Flags (decoded below) : 00000000
 -                            Physical package : 0
 -                     ACPI Processor ID valid : 0
 -[090h 0144   4]                       Parent : 00000074
 -[094h 0148   4]            ACPI Processor ID : 00000000
 -[098h 0152   4]      Private Resource Number : 00000000
 -
 -[09Ch 0156   1]                Subtable Type : 00 [Processor Hierarchy Node]
 -[09Dh 0157   1]                       Length : 14
 -[09Eh 0158   2]                     Reserved : 0000
 -[0A0h 0160   4]        Flags (decoded below) : 0000000A
 -                            Physical package : 0
 -                     ACPI Processor ID valid : 1
 -[0A4h 0164   4]                       Parent : 00000088
 -[0A8h 0168   4]            ACPI Processor ID : 00000002
 -[0ACh 0172   4]      Private Resource Number : 00000000
 -
 -[0B0h 0176   1]                Subtable Type : 00 [Processor Hierarchy Node]
 -[0B1h 0177   1]                       Length : 14
 -[0B2h 0178   2]                     Reserved : 0000
 -[0B4h 0180   4]        Flags (decoded below) : 0000000A
 +[08Ch 0140   4]        Flags (decoded below) : 0000000A
                              Physical package : 0
                       ACPI Processor ID valid : 1
 -[0B8h 0184   4]                       Parent : 00000088
 -[0BCh 0188   4]            ACPI Processor ID : 00000003
 -[0C0h 0192   4]      Private Resource Number : 00000000
 +[090h 0144   4]                       Parent : 00000060
 +[094h 0148   4]            ACPI Processor ID : 00000003
 +[098h 0152   4]      Private Resource Number : 00000000

 -Raw Table Data: Length 196 (0xC4)
 +Raw Table Data: Length 156 (0x9C)

 -  0000: 50 50 54 54 C4 00 00 00 02 9E 42 4F 43 48 53 20  // PPTT......BOCHS
 +  0000: 50 50 54 54 9C 00 00 00 02 FE 42 4F 43 48 53 20  // PPTT......BOCHS
    0010: 42 58 50 43 20 20 20 20 01 00 00 00 42 58 50 43  // BXPC    ....BXPC
    0020: 01 00 00 00 00 14 00 00 01 00 00 00 00 00 00 00  // ................
 -  0030: 00 00 00 00 00 00 00 00 00 14 00 00 00 00 00 00  // ................
 +  0030: 00 00 00 00 00 00 00 00 00 14 00 00 0A 00 00 00  // ................
    0040: 24 00 00 00 00 00 00 00 00 00 00 00 00 14 00 00  // $...............
 -  0050: 0A 00 00 00 38 00 00 00 00 00 00 00 00 00 00 00  // ....8...........
 -  0060: 00 14 00 00 0A 00 00 00 38 00 00 00 01 00 00 00  // ........8.......
 -  0070: 00 00 00 00 00 14 00 00 01 00 00 00 00 00 00 00  // ................
 -  0080: 01 00 00 00 00 00 00 00 00 14 00 00 00 00 00 00  // ................
 -  0090: 74 00 00 00 00 00 00 00 00 00 00 00 00 14 00 00  // t...............
 -  00A0: 0A 00 00 00 88 00 00 00 02 00 00 00 00 00 00 00  // ................
 -  00B0: 00 14 00 00 0A 00 00 00 88 00 00 00 03 00 00 00  // ................
 -  00C0: 00 00 00 00                                      // ....
 +  0050: 0A 00 00 00 24 00 00 00 01 00 00 00 00 00 00 00  // ....$...........
 +  0060: 00 14 00 00 01 00 00 00 00 00 00 00 01 00 00 00  // ................
 +  0070: 00 00 00 00 00 14 00 00 0A 00 00 00 60 00 00 00  // ............`...
 +  0080: 02 00 00 00 00 00 00 00 00 14 00 00 0A 00 00 00  // ................
 +  0090: 60 00 00 00 03 00 00 00 00 00 00 00              // `...........
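
The hierarchy in the new PPTT.acpihmatvirt can be read back from the raw
bytes. The sketch below walks the 20-byte Processor Hierarchy Node
subtables and counts packages and leaves. The flag bit positions follow
the ACPI PPTT definition (bit 0 physical package, bit 1 ACPI Processor ID
valid, bit 3 node is a leaf); the disassembler version used above decodes
only the first two. The names pptt_hmat, pptt_summary and pptt_summarize
are illustrative and are not QEMU identifiers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PPTT_HEADER_LEN    36         /* standard ACPI table header */
#define PPTT_FLAG_PACKAGE  (1u << 0)  /* Physical package */
#define PPTT_FLAG_LEAF     (1u << 3)  /* Node is a leaf */

/* New PPTT.acpihmatvirt raw data (156 bytes), copied from the dump above. */
static const uint8_t pptt_hmat[156] = {
    0x50, 0x50, 0x54, 0x54, 0x9C, 0x00, 0x00, 0x00,
    0x02, 0xFE, 0x42, 0x4F, 0x43, 0x48, 0x53, 0x20,
    0x42, 0x58, 0x50, 0x43, 0x20, 0x20, 0x20, 0x20,
    0x01, 0x00, 0x00, 0x00, 0x42, 0x58, 0x50, 0x43,
    0x01, 0x00, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00,
    0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x14, 0x00, 0x00, 0x0A, 0x00, 0x00, 0x00,
    0x24, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00,
    0x0A, 0x00, 0x00, 0x00, 0x24, 0x00, 0x00, 0x00,
    0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x14, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00,
    0x0A, 0x00, 0x00, 0x00, 0x60, 0x00, 0x00, 0x00,
    0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x14, 0x00, 0x00, 0x0A, 0x00, 0x00, 0x00,
    0x60, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00,
};

/* ACPI fields are little-endian. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

struct pptt_summary {
    unsigned nodes;     /* type 0 subtables seen */
    unsigned packages;  /* nodes with the physical package flag */
    unsigned leaves;    /* nodes with the leaf flag */
};

/* Walk the subtables after the header; each starts with type and length. */
static struct pptt_summary pptt_summarize(const uint8_t *t, size_t len)
{
    struct pptt_summary s = { 0, 0, 0 };
    size_t off = PPTT_HEADER_LEN;

    while (off + 2 <= len) {
        uint8_t type = t[off];
        uint8_t sub_len = t[off + 1];

        if (sub_len == 0 || off + sub_len > len) {
            break;  /* malformed subtable: stop rather than overrun */
        }
        if (type == 0 && sub_len >= 20) {
            uint32_t flags = read_le32(t + off + 4);

            s.nodes++;
            s.packages += (flags & PPTT_FLAG_PACKAGE) ? 1 : 0;
            s.leaves += (flags & PPTT_FLAG_LEAF) ? 1 : 0;
        }
        off += sub_len;
    }
    return s;
}
```

The Parent field sits at offset 8 within each node, so
read_le32(pptt_hmat + 0x74 + 8) yields the parent offset of the node at
074h.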

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Message-Id: <20221229065513.55652-4-yangyicong@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agohw/acpi/aml-build: Only generate cluster node in PPTT when specified
Yicong Yang [Thu, 29 Dec 2022 06:55:09 +0000 (14:55 +0800)]
hw/acpi/aml-build: Only generate cluster node in PPTT when specified

Currently we always generate a cluster node, whether or not the user
has specified '-smp clusters=X'. Cluster is an optional level that
participates in the building of Linux scheduling domains and only
appears on a few platforms. It is unnecessary to always build it: on
platforms with no cluster implementation it cannot reflect the real
topology, and it affects the Linux scheduling domains in the VM. So
only generate the cluster topology in the ACPI PPTT when the user has
specified it explicitly in -smp.
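
The effect on the table size can be sketched as follows. This is not the
code in hw/acpi/aml-build.c: pptt_expected_len is a hypothetical helper,
and the node layout it assumes (one 20-byte node per socket, per cluster
and per core, plus one leaf per thread when threads > 1) is an assumption
inferred from the PPTT dumps elsewhere in this log.

```c
#include <assert.h>
#include <stdbool.h>

#define ACPI_HEADER_LEN  36  /* bytes before the first subtable */
#define PPTT_NODE_LEN    20  /* hierarchy node with no private resources */

/*
 * Hypothetical helper: expected PPTT length for a topology of
 * `sockets` sockets, `clusters` clusters per socket, `cores` cores per
 * cluster and `threads` threads per core. Cluster nodes are counted
 * only when has_clusters is true, i.e. the user gave clusters= in -smp.
 */
static unsigned pptt_expected_len(unsigned sockets, unsigned clusters,
                                  unsigned cores, unsigned threads,
                                  bool has_clusters)
{
    /* A core is a leaf unless SMT adds one leaf per thread below it. */
    unsigned per_core = threads > 1 ? 1 + threads : 1;
    unsigned per_socket = 1 + clusters * cores * per_core;

    if (has_clusters) {
        per_socket += clusters;
    }
    return ACPI_HEADER_LEN + PPTT_NODE_LEN * sockets * per_socket;
}
```

Under these assumptions, a single-CPU guest gives 96 bytes with the
cluster node and 76 without, and a guest with 2 sockets of 2 cores gives
196 and 156, which matches the lengths in the expected-table dumps.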

Tested qemu-system-aarch64 with `-smp 8` and Linux 6.1-rc1, without
this patch:
estuary:/sys/devices/system/cpu/cpu0/topology$ cat cluster_*
ff # cluster_cpus
0-7 # cluster_cpus_list
56 # cluster_id

with this patch:
estuary:/sys/devices/system/cpu/cpu0/topology$ cat cluster_*
ff # cluster_cpus
0-7 # cluster_cpus_list
36 # cluster_id; with no cluster node the kernel sets it to the
  physical package id

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Tested-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Message-Id: <20221229065513.55652-3-yangyicong@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agotests: virt: Allow changes to PPTT test table
Yicong Yang [Thu, 29 Dec 2022 06:55:08 +0000 (14:55 +0800)]
tests: virt: Allow changes to PPTT test table

Allow changes to tests/data/acpi/virt/PPTT*, in preparation for changing
the building policy of the cluster topology.

Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Message-Id: <20221229065513.55652-2-yangyicong@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agovirtio-pci: fix proxy->vector_irqfd leak in virtio_pci_set_guest_notifiers
leixiang [Tue, 27 Dec 2022 08:16:04 +0000 (16:16 +0800)]
virtio-pci: fix proxy->vector_irqfd leak in virtio_pci_set_guest_notifiers

proxy->vector_irqfd was not freed when kvm_virtio_pci_vector_use or
msix_set_vector_notifiers failed in virtio_pci_set_guest_notifiers.
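
The shape of such a fix, as a minimal standalone sketch. The struct and
function names below are hypothetical stand-ins, and calloc/free replace
the GLib allocators QEMU uses; the point is only that the error path
releases the array and clears the pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the proxy object. */
struct toy_proxy {
    int *vector_irqfd;
};

/* Stand-in for a setup step that may fail after the allocation. */
static int toy_setup_vectors(struct toy_proxy *p, int should_fail)
{
    (void)p;
    return should_fail ? -1 : 0;
}

static int toy_set_guest_notifiers(struct toy_proxy *p, size_t nvectors,
                                   int should_fail)
{
    p->vector_irqfd = calloc(nvectors, sizeof(*p->vector_irqfd));
    if (!p->vector_irqfd) {
        return -1;
    }
    if (toy_setup_vectors(p, should_fail) < 0) {
        goto err_free;
    }
    return 0;

err_free:
    /* Without this, the array would leak on every failed setup. */
    free(p->vector_irqfd);
    p->vector_irqfd = NULL;
    return -1;
}
```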

Fixes: 7d37d351
Signed-off-by: Lei Xiang <leixiang@kylinos.cn>
Tested-by: Zeng Chi <zengchi@kylinos.cn>
Suggested-by: Xie Ming <xieming@kylinos.cn>
Message-Id: <20221227081604.806415-1-leixiang@kylinos.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agovdpa: commit all host notifier MRs in a single MR transaction
Longpeng [Tue, 27 Dec 2022 07:20:15 +0000 (15:20 +0800)]
vdpa: commit all host notifier MRs in a single MR transaction

This allows the vhost-vdpa device to batch the setup of all its MRs of
host notifiers.

This significantly reduces the device starting time, e.g. the time spent
on setting up the host notifier MRs drops from 423ms to 32ms for a VM with
64 vCPUs and 3 vhost-vDPA generic devices (vdpa_sim_blk, 64vq per device).
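
Why batching helps, as a toy model. QEMU's real entry points are
memory_region_transaction_begin() and memory_region_transaction_commit();
the toy_* functions below only imitate their nesting behaviour (an
expensive rebuild happens when the outermost transaction commits) and are
not QEMU code.

```c
#include <assert.h>

static unsigned txn_depth;  /* nesting level of open transactions */
static unsigned pending;    /* non-zero when a change awaits commit */
static unsigned rebuilds;   /* count of expensive topology rebuilds */

static void toy_transaction_begin(void)
{
    txn_depth++;
}

static void toy_transaction_commit(void)
{
    assert(txn_depth > 0);
    if (--txn_depth == 0 && pending) {
        pending = 0;
        rebuilds++;  /* only the outermost commit pays the cost */
    }
}

/* Each notifier update is wrapped in its own transaction. */
static void toy_set_host_notifier(void)
{
    toy_transaction_begin();
    pending = 1;
    toy_transaction_commit();
}

/* Returns how many rebuilds were needed to set up nvqs notifiers. */
static unsigned toy_enable_notifiers(unsigned nvqs, int batch)
{
    rebuilds = 0;
    if (batch) {
        toy_transaction_begin();
    }
    for (unsigned i = 0; i < nvqs; i++) {
        toy_set_host_notifier();
    }
    if (batch) {
        toy_transaction_commit();
    }
    return rebuilds;
}
```

In this model a 64-queue device costs 64 rebuilds when each notifier
commits on its own and one rebuild when the loop is wrapped in a single
outer transaction.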

Signed-off-by: Longpeng <longpeng2@huawei.com>
Message-Id: <20221227072015.3134-4-longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2 years agovhost: configure all host notifiers in a single MR transaction
Longpeng [Tue, 27 Dec 2022 07:20:14 +0000 (15:20 +0800)]
vhost: configure all host notifiers in a single MR transaction

This allows the vhost device to batch the setup of all its host notifiers.
This significantly reduces the device starting time, e.g. the time spent
on enabling notifiers drops from 376ms to 9.1ms for a VM with 64 vCPUs
and 3 vhost-vDPA generic devices (vdpa_sim_blk, 64vq per device).

Signed-off-by: Longpeng <longpeng2@huawei.com>
Message-Id: <20221227072015.3134-3-longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2 years agovhost: simplify vhost_dev_enable_notifiers
Longpeng [Tue, 27 Dec 2022 07:20:13 +0000 (15:20 +0800)]
vhost: simplify vhost_dev_enable_notifiers

Simplify the error path in vhost_dev_enable_notifiers by using
vhost_dev_disable_notifiers directly.

Signed-off-by: Longpeng <longpeng2@huawei.com>
Message-Id: <20221227072015.3134-2-longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agovdpa: harden the error path if get_iova_range failed
Longpeng [Sat, 24 Dec 2022 11:48:48 +0000 (19:48 +0800)]
vdpa: harden the error path if get_iova_range failed

We should stop if the GET_IOVA_RANGE ioctl failed.

Signed-off-by: Longpeng <longpeng2@huawei.com>
Message-Id: <20221224114848.3062-3-longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2 years agovdpa-dev: get iova range explicitly
Longpeng [Sat, 24 Dec 2022 11:48:47 +0000 (19:48 +0800)]
vdpa-dev: get iova range explicitly

In commit a585fad26b ("vdpa: request iova_range only once") we removed
GET_IOVA_RANGE from vhost_vdpa_init, so the generic vdpa device starts
without iova_range populated and the device won't work. Let's call
the GET_IOVA_RANGE ioctl explicitly.

Fixes: a585fad26b2e6ccc ("vdpa: request iova_range only once")
Signed-off-by: Longpeng <longpeng2@huawei.com>
Message-Id: <20221224114848.3062-2-longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2 years agodocs/devel: Rules on #include in headers
Markus Armbruster [Thu, 22 Dec 2022 12:08:13 +0000 (13:08 +0100)]
docs/devel: Rules on #include in headers

Rules for headers were proposed a long time ago, and generally liked:

    Message-ID: <87h9g8j57d.fsf@blackfin.pond.sub.org>
    https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg03345.html

Work them into docs/devel/style.rst.

Suggested-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221222120813.727830-5-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
2 years agoinclude: Include headers where needed
Markus Armbruster [Thu, 22 Dec 2022 12:08:11 +0000 (13:08 +0100)]
include: Include headers where needed

A number of headers neglect to include everything they need.  They
compile only if the headers they need are already included from
elsewhere.  Fix that.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20221222120813.727830-3-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agoinclude/hw/virtio: Break inclusion loop
Markus Armbruster [Thu, 22 Dec 2022 12:08:10 +0000 (13:08 +0100)]
include/hw/virtio: Break inclusion loop

hw/virtio/virtio.h and hw/virtio/vhost.h include each other.  The
former doesn't actually need the latter, so drop that inclusion to
break the loop.
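
When a header does need to name a type from the other header but only
through a pointer, a forward declaration avoids the include altogether;
QEMU collects such typedefs in include/qemu/typedefs.h. The single-file
sketch below imitates two headers with hypothetical Toy* types. It
illustrates the general technique and is not what this commit changes,
which only drops an unneeded include.

```c
#include <assert.h>

/* --- toy_virtio.h: only needs a pointer to the vhost type --- */
typedef struct ToyVhostDev ToyVhostDev;  /* forward declaration, no #include */

typedef struct ToyVirtIODevice {
    int nvqs;
    ToyVhostDev *backend;  /* a pointer to an incomplete type is fine */
} ToyVirtIODevice;

/* --- toy_vhost.h: needs the full ToyVirtIODevice definition --- */
struct ToyVhostDev {
    ToyVirtIODevice *vdev;
    int started;
};

/* --- toy_vhost.c: sees both complete types --- */
static int toy_vhost_start(ToyVhostDev *hdev, ToyVirtIODevice *vdev)
{
    hdev->vdev = vdev;
    vdev->backend = hdev;
    hdev->started = 1;
    return vdev->nvqs;
}
```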

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20221222120813.727830-2-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar@zeroasic.com>
2 years agoinclude/hw/cxl: Break inclusion loop cxl_pci.h and cxl_cdat.h
Markus Armbruster [Thu, 22 Dec 2022 10:03:30 +0000 (11:03 +0100)]
include/hw/cxl: Break inclusion loop cxl_pci.h and cxl_cdat.h

hw/cxl/cxl_pci.h and hw/cxl/cxl_cdat.h include each other.  The former
doesn't actually need the latter, so drop that inclusion to break the
loop.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221222100330.380143-8-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agoinclude/hw/pci: Include hw/pci/pci.h where needed
Markus Armbruster [Thu, 22 Dec 2022 10:03:29 +0000 (11:03 +0100)]
include/hw/pci: Include hw/pci/pci.h where needed

hw/pci/pcie_sriov.h needs PCI_NUM_REGIONS.  Without the previous
commit, this would close an inclusion loop: hw/pci/pci.h used to
include hw/pci/pcie.h for PCIExpressDevice, which includes
pcie_sriov.h for PCIESriovPF, which now includes hw/pci/pci.h for
PCI_NUM_REGIONS.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221222100330.380143-7-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agoinclude/hw/pci: Split pci_device.h off pci.h
Markus Armbruster [Thu, 22 Dec 2022 10:03:28 +0000 (11:03 +0100)]
include/hw/pci: Split pci_device.h off pci.h

PCIDeviceClass and PCIDevice are defined in pci.h.  Many users of the
header don't actually need them.  Similar structs live in their own
headers: PCIBusClass and PCIBus in pci_bus.h, PCIBridge in
pci_bridge.h, PCIHostBridgeClass and PCIHostState in pci_host.h,
PCIExpressHost in pcie_host.h, and PCIERootPortClass, PCIEPort, and
PCIESlot in pcie_port.h.

Move PCIDeviceClass and PCIDevice to new pci_device.h, along with
the code that needs them.  Adjust include directives.

This also enables the next commit.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221222100330.380143-6-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agoinclude/hw/pci: Clean up a few things checkpatch.pl would flag
Markus Armbruster [Thu, 22 Dec 2022 10:03:27 +0000 (11:03 +0100)]
include/hw/pci: Clean up a few things checkpatch.pl would flag

Fix a few style violations so that checkpatch.pl won't complain when I
move this code.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221222100330.380143-5-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agoinclude/hw/cxl: Include hw/cxl/*.h where needed
Markus Armbruster [Thu, 22 Dec 2022 10:03:26 +0000 (11:03 +0100)]
include/hw/cxl: Include hw/cxl/*.h where needed

hw/cxl/cxl_component.h needs CDATObject from hw/cxl/cxl_cdat.h.

hw/cxl/cxl_device.h needs CXLComponentState from
hw/cxl/cxl_component.h.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20221222100330.380143-4-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agoinclude/hw/cxl: Move typedef PXBDev to cxl.h, and put it to use
Markus Armbruster [Thu, 22 Dec 2022 10:03:25 +0000 (11:03 +0100)]
include/hw/cxl: Move typedef PXBDev to cxl.h, and put it to use

hw/cxl/cxl.h uses the PXBDev structure tag instead of the typedef
name.  The typedef name is defined in hw/pci/pci_bridge.h.  Its
inclusion was dropped in the previous commit to break an inclusion
loop.

Move the typedef to hw/cxl/cxl.h, and use it there.  Delete an extra
typedef in hw/pci-bridge/pci_expander_bridge.c.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221222100330.380143-3-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agoinclude/hw/pci: Break inclusion loop pci_bridge.h and cxl.h
Markus Armbruster [Thu, 22 Dec 2022 10:03:24 +0000 (11:03 +0100)]
include/hw/pci: Break inclusion loop pci_bridge.h and cxl.h

hw/pci/pci_bridge.h and hw/cxl/cxl.h include each other.

Fortunately, breaking the loop is merely a matter of deleting
unnecessary includes from headers, and adding them back in places
where they are now missing.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221222100330.380143-2-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agohw/virtio: Extract QMP QOM-specific functions to virtio-qmp.c
Philippe Mathieu-Daudé [Thu, 22 Dec 2022 08:00:05 +0000 (09:00 +0100)]
hw/virtio: Extract QMP QOM-specific functions to virtio-qmp.c

virtio.c is big enough; extract more QMP-related code to virtio-qmp.c.
To do so, expose qmp_find_virtio_device() and declare virtio_list in
the internal virtio-qmp.h header.

Note we have to leave qmp_x_query_virtio_queue_status() and
qmp_x_query_virtio_queue_element(), because they access VirtQueue
internal fields, and VirtQueue is only declared within virtio.c.

Suggested-by: Jonah Palmer <jonah.palmer@oracle.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221222080005.27616-3-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agohw/virtio: Rename virtio_device_find() -> qmp_find_virtio_device()
Philippe Mathieu-Daudé [Thu, 22 Dec 2022 08:00:04 +0000 (09:00 +0100)]
hw/virtio: Rename virtio_device_find() -> qmp_find_virtio_device()

To emphasize this function is QMP related, rename it.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221222080005.27616-2-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agovirtio-pci: add support for configure interrupt
Cindy Lu [Thu, 22 Dec 2022 07:04:51 +0000 (15:04 +0800)]
virtio-pci: add support for configure interrupt

Add a process to handle the configure interrupt. The function's
logic is the same as for the vq interrupt. Add an extra process to
check the configure interrupt.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20221222070451.936503-11-lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agovirtio-mmio: add support for configure interrupt
Cindy Lu [Thu, 22 Dec 2022 07:04:50 +0000 (15:04 +0800)]
virtio-mmio: add support for configure interrupt

Add configure interrupt support to the virtio-mmio bus.
Add a function to set the configure guest notifier.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20221222070451.936503-10-lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agovirtio-net: add support for configure interrupt
Cindy Lu [Thu, 22 Dec 2022 07:04:49 +0000 (15:04 +0800)]
virtio-net: add support for configure interrupt

Add functions to support the configure interrupt in virtio_net.
Add the functions to support vhost_net_config_pending
and vhost_net_config_mask.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20221222070451.936503-9-lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agovhost: add support for configure interrupt
Cindy Lu [Thu, 22 Dec 2022 07:04:48 +0000 (15:04 +0800)]
vhost: add support for configure interrupt

Add functions to support configure interrupt.
The configure interrupt process will start in vhost_dev_start
and stop in vhost_dev_stop.

Also add the functions to support vhost_config_pending and
vhost_config_mask.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20221222070451.936503-8-lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agovirtio: add support for configure interrupt
Cindy Lu [Thu, 22 Dec 2022 07:04:47 +0000 (15:04 +0800)]
virtio: add support for configure interrupt

Add the functions to support the configure interrupt in virtio.
The function virtio_config_guest_notifier_read will notify the
guest if there is a configure interrupt.
The function virtio_config_set_guest_notifier_fd_handler
sets the fd handler for the notifier.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20221222070451.936503-7-lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agovhost-vdpa: add support for config interrupt
Cindy Lu [Thu, 22 Dec 2022 07:04:46 +0000 (15:04 +0800)]
vhost-vdpa: add support for config interrupt

Add a new callback function in vhost-vdpa. The function
vhost_set_config_call passes the event fd to the kernel.
This function will be called in vhost_dev_start
and vhost_dev_stop.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20221222070451.936503-6-lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agovhost: introduce new VhostOps vhost_set_config_call
Cindy Lu [Thu, 22 Dec 2022 07:04:45 +0000 (15:04 +0800)]
vhost: introduce new VhostOps vhost_set_config_call

This patch introduces a new VhostOps, vhost_set_config_call.
This function allows QEMU to pass the config
event fd to the kernel driver.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20221222070451.936503-5-lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agovirtio-pci: decouple the single vector from the interrupt process
Cindy Lu [Thu, 22 Dec 2022 07:04:44 +0000 (15:04 +0800)]
virtio-pci: decouple the single vector from the interrupt process

To reuse the interrupt process for the configure interrupt, we
need to decouple the single vector from the interrupt process.
We add new functions kvm_virtio_pci_vector_use_one and _release_one.
These functions handle a single vector; the whole process is
completed by calling them in a loop over the vq number.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20221222070451.936503-4-lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agovirtio-pci: decouple notifier from interrupt process
Cindy Lu [Thu, 22 Dec 2022 07:04:43 +0000 (15:04 +0800)]
virtio-pci: decouple notifier from interrupt process

To reuse the notifier process, we add virtio_pci_get_notifier
to get the notifier and vector. The input of this function is the
queue index (IDX); the output is the notifier and the vector.
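
The calling convention described here (index in, notifier and vector out)
can be sketched as below. The toy_* types and the fixed-size arrays are
hypothetical; the real function operates on QEMU's proxy and notifier
types, which this sketch does not try to reproduce.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NUM_QUEUES 2

struct toy_notifier {
    int fd;
};

struct toy_proxy {
    struct toy_notifier vq_notifier[TOY_NUM_QUEUES];
    unsigned vq_vector[TOY_NUM_QUEUES];
};

/*
 * Input: idx. Outputs: *n and *vector.
 * Returns 0 on success and -1 when idx does not name a queue.
 */
static int toy_get_notifier(struct toy_proxy *p, int idx,
                            struct toy_notifier **n, unsigned *vector)
{
    if (idx < 0 || idx >= TOY_NUM_QUEUES) {
        return -1;
    }
    *n = &p->vq_notifier[idx];
    *vector = p->vq_vector[idx];
    return 0;
}
```

Returning both values through one lookup lets callers share the same
code path regardless of which index they were given.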

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20221222070451.936503-3-lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agovirtio: introduce macro VIRTIO_CONFIG_IRQ_IDX
Cindy Lu [Thu, 22 Dec 2022 07:04:42 +0000 (15:04 +0800)]
virtio: introduce macro VIRTIO_CONFIG_IRQ_IDX

To support the configure interrupt for vhost-vdpa, introduce
VIRTIO_CONFIG_IRQ_IDX (-1) as the configure interrupt's queue index.
Then we can reuse the functions guest_notifier_mask and
guest_notifier_pending. Add a check of the queue index in these drivers:
if the driver does not support the configure interrupt, the function
just returns.
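
A minimal sketch of that index check for a device without configure
interrupt support. Only the macro value -1 comes from this message;
struct toy_dev and toy_guest_notifier_pending are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define VIRTIO_CONFIG_IRQ_IDX -1  /* value given in the commit message */

#define TOY_NUM_QUEUES 2

struct toy_dev {
    bool vq_pending[TOY_NUM_QUEUES];
};

/* Hypothetical pending callback of a device with no config interrupt. */
static bool toy_guest_notifier_pending(struct toy_dev *d, int idx)
{
    if (idx == VIRTIO_CONFIG_IRQ_IDX) {
        return false;  /* not supported: just return */
    }
    assert(idx >= 0 && idx < TOY_NUM_QUEUES);
    return d->vq_pending[idx];
}
```

Using a reserved index keeps the callback signature unchanged, so
existing callers that pass real queue numbers are unaffected.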

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20221222070451.936503-2-lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agovhost-user: Fix the virtio features negotiation flaw
Hyman Huang(黄勇) [Wed, 21 Dec 2022 13:06:40 +0000 (21:06 +0800)]
vhost-user: Fix the virtio features negotiation flaw

This patch fixes unexpected negotiated features for the vhost-user
netdev interface.

When openvswitch reconnects to QEMU after an unexpected disconnection
and QEMU therefore starts the vhost_dev, the acked_features field in
vhost_dev is initialized with the value fetched from the
acked_features field in NetVhostUserState. That value should be up to
date at that moment, but QEMU cannot actually guarantee this during
the time window of virtio features negotiation.

So we save the acked_features right after they are configured by the
guest virtio driver, so that they can be used to restore the
acked_features field in vhost_dev correctly.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
Signed-off-by: Liuxiangdong <liuxiangdong5@huawei.com>
Message-Id: <b9f8cf5561a79ea65ea38960e5a5e6d3707eef0a.1671627406.git.huangy81@chinatelecom.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>