]> www.infradead.org Git - users/dwmw2/qemu.git/log
users/dwmw2/qemu.git
2 years agoi386/xen: Initialize Xen backends from pc_basic_device_init() for emulation xenfv-1
David Woodhouse [Wed, 15 Feb 2023 15:10:00 +0000 (16:10 +0100)]
i386/xen: Initialize Xen backends from pc_basic_device_init() for emulation

Now that all the work is done to enable the PV backends to work without
actual Xen, instantiate the bus from pc_basic_device_init() for emulated
mode.

This allows us finally to launch an emulated Xen guest with PV disk.

   qemu-system-x86_64 -serial mon:stdio -M q35 -cpu host -display none \
     -m 1G -smp 2 -accel kvm,xen-version=0x4000a,kernel-irqchip=split \
     -kernel bzImage -append "console=ttyS0 root=/dev/xvda1" \
     -drive file=/var/lib/libvirt/images/fedora28.qcow2,if=none,id=disk \
     -device xen-disk,drive=disk,vdev=xvda

If we use -M pc instead of q35, we can even add an IDE disk and boot a
guest image normally through grub. But q35 gives us AHCI and that isn't
unplugged by the Xen magic, so the guests ends up seeing "both" disks.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Implement soft reset for emulated gnttab
David Woodhouse [Tue, 10 Jan 2023 01:09:04 +0000 (01:09 +0000)]
hw/xen: Implement soft reset for emulated gnttab

This is only part of it; we will also need to get the PV back end drivers
to tear down their own mappings (or do it for them, but they kind of need
to stop using the pointers too).

Some more work on the actual PV back ends and xen-bus code is going to be
needed to really make soft reset and migration fully functional, and this
part is the basis for that.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Map guest XENSTORE_PFN grant in emulated Xenstore
David Woodhouse [Sat, 7 Jan 2023 13:54:07 +0000 (13:54 +0000)]
hw/xen: Map guest XENSTORE_PFN grant in emulated Xenstore

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Add emulated implementation of XenStore operations
David Woodhouse [Thu, 19 Jan 2023 00:04:31 +0000 (00:04 +0000)]
hw/xen: Add emulated implementation of XenStore operations

Now that we have an internal implementation of XenStore, we can populate
the xenstore_backend_ops to allow PV backends to talk to it.

Watches can't be processed with immediate callbacks because that would
call back into XenBus code recursively. Defer them to a QEMUBH to be run
as appropriate from the main loop. We use a QEMUBH per XS handle, and it
walks all the watches (there shouldn't be many per handle) to fire any
which have pending events. We *could* have done it differently but this
allows us to use the same struct watch_event as we have for the guest
side, and keeps things relatively simple.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Add emulated implementation of grant table operations
David Woodhouse [Fri, 6 Jan 2023 09:59:28 +0000 (09:59 +0000)]
hw/xen: Add emulated implementation of grant table operations

This is limited to mapping a single grant at a time, because under Xen the
pages are mapped *contiguously* into qemu's address space, and that's very
hard to do when those pages actually come from anonymous mappings in qemu
in the first place.

Eventually perhaps we can look at using shared mappings of actual objects
for system RAM, and then we can make new mappings of the same backing
store (be it deleted files, shmem, whatever). But for now let's stick to
a page at a time.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Hook up emulated implementation for event channel operations
David Woodhouse [Sun, 1 Jan 2023 23:49:25 +0000 (23:49 +0000)]
hw/xen: Hook up emulated implementation for event channel operations

We provided the backend-facing evtchn functions very early on as part of
the core Xen platform support, since things like timers and xenstore need
to use them.

By what may or may not be an astonishing coincidence, those functions
just *happen* all to have exactly the right function prototypes to slot
into the evtchn_backend_ops table and be called by the PV backends.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Only advertise ring-page-order for xen-block if gnttab supports it
David Woodhouse [Mon, 30 Jan 2023 16:27:07 +0000 (17:27 +0100)]
hw/xen: Only advertise ring-page-order for xen-block if gnttab supports it

Whem emulating Xen, multi-page grants are distinctly non-trivial and we
have elected not to support them for the time being. Don't advertise
them to the guest.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Avoid crash when backend watch fires too early
Paul Durrant [Mon, 30 Jan 2023 14:35:28 +0000 (14:35 +0000)]
hw/xen: Avoid crash when backend watch fires too early

The xen-block code ends up calling aio_poll() through blkconf_geometry(),
which means we see watch events during the indirect call to
xendev_class->realize() in xen_device_realize(). Unfortunately this call
is made before populating the initial frontend and backend device nodes
in xenstore and hence xen_block_frontend_changed() (which is called from
a watch event) fails to read the frontend's 'state' node, and hence
believes the device is being torn down. This in-turn sets the backend
state to XenbusStateClosed and causes the device to be deleted before it
is fully set up, leading to the crash.
By simply moving the call to xendev_class->realize() after the initial
xenstore nodes are populated, this sorry state of affairs is avoided.

Reported-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Build PV backend drivers for CONFIG_XEN_BUS
David Woodhouse [Mon, 2 Jan 2023 01:26:04 +0000 (01:26 +0000)]
hw/xen: Build PV backend drivers for CONFIG_XEN_BUS

Now that we have the redirectable Xen backend operations we can build the
PV backends even without the Xen libraries.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Rename xen_common.h to xen_native.h
David Woodhouse [Mon, 2 Jan 2023 00:39:13 +0000 (00:39 +0000)]
hw/xen: Rename xen_common.h to xen_native.h

This header is now only for native Xen code, not PV backends that may be
used in Xen emulation. Since the toolstack libraries may depend on the
specific version of Xen headers that they pull in (and will set the
__XEN_TOOLS__ macro to enable internal definitions that they depend on),
the rule is that xen_native.h (and thus the toolstack library headers)
must be included *before* any of the headers in include/hw/xen/interface.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Use XEN_PAGE_SIZE in PV backend drivers
David Woodhouse [Sat, 7 Jan 2023 16:47:43 +0000 (16:47 +0000)]
hw/xen: Use XEN_PAGE_SIZE in PV backend drivers

XC_PAGE_SIZE comes from the actual Xen libraries, while XEN_PAGE_SIZE is
provided by QEMU itself in xen_backend_ops.h. For backends which may be
built for emulation mode, use the latter.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Move xenstore_store_pv_console_info to xen_console.c
David Woodhouse [Sat, 7 Jan 2023 16:17:51 +0000 (16:17 +0000)]
hw/xen: Move xenstore_store_pv_console_info to xen_console.c

There's no need for this to be in the Xen accel code, and as we want to
use the Xen console support with KVM-emulated Xen we'll want to have a
platform-agnostic version of it. Make it use GString to build up the
path while we're at it.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Add xenstore operations to allow redirection to internal emulation
Paul Durrant [Mon, 2 Jan 2023 11:05:16 +0000 (11:05 +0000)]
hw/xen: Add xenstore operations to allow redirection to internal emulation

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Add foreignmem operations to allow redirection to internal emulation
David Woodhouse [Mon, 2 Jan 2023 01:13:46 +0000 (01:13 +0000)]
hw/xen: Add foreignmem operations to allow redirection to internal emulation

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
2 years agohw/xen: Pass grant ref to gnttab unmap operation
David Woodhouse [Tue, 10 Jan 2023 00:03:49 +0000 (00:03 +0000)]
hw/xen: Pass grant ref to gnttab unmap operation

The previous commit introduced redirectable gnttab operations fairly
much like-for-like, with the exception of the extra arguments to the
->open() call which were always NULL/0 anyway.

This *changes* the arguments to the ->unmap() operation to include the
original ref# that was mapped. Under real Xen it isn't necessary; all we
need to do from QEMU is munmap(), then the kernel will release the grant,
and Xen does the tracking/refcounting for the guest.

When we have emulated grant tables though, we need to do all that for
ourselves. So let's have the back ends keep track of what they mapped
and pass it in to the ->unmap() method for us.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Add gnttab operations to allow redirection to internal emulation
David Woodhouse [Sun, 1 Jan 2023 21:31:37 +0000 (21:31 +0000)]
hw/xen: Add gnttab operations to allow redirection to internal emulation

Move the existing code using libxengnttab to xen-operations.c and allow
the operations to be redirected so that we can add emulation of grant
table mapping for backend drivers.

In emulation, mapping more than one grant ref to be virtually contiguous
would be fairly difficult. The best way to do it might be to make the
ram_block mappings actually backed by a file (shmem or a deleted file,
perhaps) so that we can have multiple *shared* mappings of it. But that
would be fairly intrusive.

Making the backend drivers cope with page *lists* instead of expecting
the mapping to be contiguous is also non-trivial, since some structures
would actually *cross* page boundaries (e.g. the 32-bit blkif responses
which are 12 bytes).

So for now, we'll support only single-page mappings in emulation. Add a
XEN_GNTTAB_OP_FEATURE_MAP_MULTIPLE flag to indicate that the native Xen
implementation *does* support multi-page maps, and a helper function to
query it.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
2 years agohw/xen: Add evtchn operations to allow redirection to internal emulation
David Woodhouse [Sun, 1 Jan 2023 17:54:41 +0000 (17:54 +0000)]
hw/xen: Add evtchn operations to allow redirection to internal emulation

The existing implementation calling into the real libxenevtchn moves to
a new file hw/xen/xen-operations.c, and is called via a function table
which in a subsequent commit will also be able to invoke the emulated
event channel support.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
2 years agohw/xen: Create initial XenStore nodes
Paul Durrant [Tue, 24 Jan 2023 09:34:06 +0000 (09:34 +0000)]
hw/xen: Create initial XenStore nodes

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Implement core serialize/deserialize methods for xenstore_impl
David Woodhouse [Tue, 31 Jan 2023 15:00:54 +0000 (15:00 +0000)]
hw/xen: Implement core serialize/deserialize methods for xenstore_impl

In fact I think we want to only serialize the contents of the domain's
path in /local/domain/${domid} and leave the rest to be recreated? Will
defer to Paul for that.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Implement XenStore permissions
Paul Durrant [Mon, 23 Jan 2023 16:21:16 +0000 (16:21 +0000)]
hw/xen: Implement XenStore permissions

Store perms as a GList of strings, check permissions.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Watches on XenStore transactions
David Woodhouse [Sun, 22 Jan 2023 22:59:49 +0000 (22:59 +0000)]
hw/xen: Watches on XenStore transactions

Firing watches on the nodes that still exist is relatively easy; just
walk the tree and look at the nodes with refcount of one.

Firing watches on *deleted* nodes is more fun. We add 'modified_in_tx'
and 'deleted_in_tx' flags to each node. Nodes with those flags cannot
be shared, as they will always be unique to the transaction in which
they were created.

When xs_node_walk would need to *create* a node as scaffolding and it
encounters a deleted_in_tx node, it can resurrect it simply by clearing
its deleted_in_tx flag. If that node originally had any *data*, they're
gone, and the modified_in_tx flag will have been set when it was first
deleted.

We then attempt to send appropriate watches when the transaction is
committed, properly delete the deleted_in_tx nodes, and remove the
modified_in_tx flag from the others.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Implement XenStore transactions
David Woodhouse [Sun, 22 Jan 2023 22:05:37 +0000 (22:05 +0000)]
hw/xen: Implement XenStore transactions

Given that the whole thing supported copy on write from the beginning,
transactions end up being fairly simple. On starting a transaction, just
take a ref of the existing root; swap it back in on a successful commit.

The main tree has a transaction ID too, and we keep a record of the last
transaction ID given out. if the main tree is ever modified when it isn't
the latest, it gets a new transaction ID.

A commit can only succeed if the main tree hasn't moved on since it was
forked. Strictly speaking, the XenStore protocol allows a transaction to
succeed as long as nothing *it* read or wrote has changed in the interim,
but no implementations do that; *any* change is sufficient to abort a
transaction.

This does not yet fire watches on the changed nodes on a commit. That bit
is more fun and will come in a follow-on commit.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Implement XenStore watches
David Woodhouse [Sun, 22 Jan 2023 18:38:23 +0000 (18:38 +0000)]
hw/xen: Implement XenStore watches

Starts out fairly simple: a hash table of watches based on the path.

Except there can be multiple watches on the same path, so the watch ends
up being a simple linked list, and the head of that list is in the hash
table. Which makes removal a bit of a PITA but it's not so bad; we just
special-case "I had to remove the head of the list and now I have to
replace it in / remove it from the hash table". And if we don't remove
the head, it's a simple linked-list operation.

We do need to fire watches on *deleted* nodes, so instead of just a simple
xs_node_unref() on the topmost victim, we need to recurse down and fire
watches on them all.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Add basic XenStore tree walk and write/read/directory support
David Woodhouse [Fri, 20 Jan 2023 01:36:38 +0000 (01:36 +0000)]
hw/xen: Add basic XenStore tree walk and write/read/directory support

This is a fairly simple implementation of a copy-on-write tree.

The node walk function starts off at the root, with 'inplace == true'.
If it ever encounters a node with a refcount greater than one (including
the root node), then that node is shared with other trees, and cannot
be modified in place, so the inplace flag is cleared and we copy on
write from there on down.

Xenstore write has 'mkdir -p' semantics and will create the intermediate
nodes if they don't already exist, so in that case we flip the inplace
flag back to true as as populated the newly-created nodes.

We put a copy of the absolute path into the buffer in the struct walk_op,
with *two* NUL terminators at the end. As xs_node_walk() goes down the
tree, it replaces the next '/' separator with a NUL so that it can use
the 'child name' in place. The next recursion down then puts the '/'
back and repeats the exercise for the next path element... if it doesn't
hit that *second* NUL termination which indicates the true end of the
path.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Add xenstore wire implementation and implementation stubs
David Woodhouse [Wed, 18 Jan 2023 18:55:47 +0000 (18:55 +0000)]
hw/xen: Add xenstore wire implementation and implementation stubs

This implements the basic wire protocol for the XenStore commands, punting
all the actual implementation to xs_impl_* functions which all just return
errors for now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Subsume xen_be_register_common() into xen_be_init() xenfv-kvm-15
David Woodhouse [Wed, 15 Feb 2023 15:05:26 +0000 (16:05 +0100)]
hw/xen: Subsume xen_be_register_common() into xen_be_init()

Every caller of xen_be_init() checks and exits on error, then calls
xen_be_register_common(). Just make xen_be_init() abort for itself and
return void, and register the common devices too.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: Document Xen HVM emulation
David Woodhouse [Tue, 17 Jan 2023 14:13:32 +0000 (14:13 +0000)]
i386/xen: Document Xen HVM emulation

Signed-off-by: David Woodhouse <dwmw2@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agokvm/i386: Add xen-evtchn-max-pirq property
David Woodhouse [Wed, 18 Jan 2023 14:36:23 +0000 (14:36 +0000)]
kvm/i386: Add xen-evtchn-max-pirq property

The default number of PIRQs is set to 256 to avoid issues with 32-bit MSI
devices. Allow it to be increased if the user desires.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Support MSI mapping to PIRQ
David Woodhouse [Fri, 13 Jan 2023 23:35:46 +0000 (23:35 +0000)]
hw/xen: Support MSI mapping to PIRQ

The way that Xen handles MSI PIRQs is kind of awful.

There is a special MSI message which targets a PIRQ. The vector in the
low bits of data must be zero. The low 8 bits of the PIRQ# are in the
destination ID field, the extended destination ID field is unused, and
instead the high bits of the PIRQ# are in the high 32 bits of the address.

Using the high bits of the address means that we can't intercept and
translate these messages in kvm_send_msi(), because they won't be caught
by the APIC — addresses like 0x1000fee46000 aren't in the APIC's range.

So we catch them in pci_msi_trigger() instead, and deliver the event
channel directly.

That isn't even the worst part. The worst part is that Xen snoops on
writes to devices' MSI vectors while they are *masked*. When a MSI
message is written which looks like it targets a PIRQ, it remembers
the device and vector for later.

When the guest makes a hypercall to bind that PIRQ# (snooped from a
marked MSI vector) to an event channel port, Xen *unmasks* that MSI
vector on the device. Xen guests using PIRQ delivery of MSI don't
ever actually unmask the MSI for themselves.

Now that this is working we can finally enable XENFEAT_hvm_pirqs and
let the guest use it all.

Tested with passthrough igb and emulated e1000e + AHCI.

           CPU0       CPU1
  0:         65          0   IO-APIC   2-edge      timer
  1:          0         14  xen-pirq   1-ioapic-edge  i8042
  4:          0        846  xen-pirq   4-ioapic-edge  ttyS0
  8:          1          0  xen-pirq   8-ioapic-edge  rtc0
  9:          0          0  xen-pirq   9-ioapic-level  acpi
 12:        257          0  xen-pirq  12-ioapic-edge  i8042
 24:       9600          0  xen-percpu    -virq      timer0
 25:       2758          0  xen-percpu    -ipi       resched0
 26:          0          0  xen-percpu    -ipi       callfunc0
 27:          0          0  xen-percpu    -virq      debug0
 28:       1526          0  xen-percpu    -ipi       callfuncsingle0
 29:          0          0  xen-percpu    -ipi       spinlock0
 30:          0       8608  xen-percpu    -virq      timer1
 31:          0        874  xen-percpu    -ipi       resched1
 32:          0          0  xen-percpu    -ipi       callfunc1
 33:          0          0  xen-percpu    -virq      debug1
 34:          0       1617  xen-percpu    -ipi       callfuncsingle1
 35:          0          0  xen-percpu    -ipi       spinlock1
 36:          8          0   xen-dyn    -event     xenbus
 37:          0       6046  xen-pirq    -msi       ahci[0000:00:03.0]
 38:          1          0  xen-pirq    -msi-x     ens4
 39:          0         73  xen-pirq    -msi-x     ens4-rx-0
 40:         14          0  xen-pirq    -msi-x     ens4-rx-1
 41:          0         32  xen-pirq    -msi-x     ens4-tx-0
 42:         47          0  xen-pirq    -msi-x     ens4-tx-1

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Support GSI mapping to PIRQ
David Woodhouse [Fri, 13 Jan 2023 20:41:19 +0000 (20:41 +0000)]
hw/xen: Support GSI mapping to PIRQ

If I advertise XENFEAT_hvm_pirqs then a guest now boots successfully as
long as I tell it 'pci=nomsi'.

[root@localhost ~]# cat /proc/interrupts
           CPU0
  0:         52   IO-APIC   2-edge      timer
  1:         16  xen-pirq   1-ioapic-edge  i8042
  4:       1534  xen-pirq   4-ioapic-edge  ttyS0
  8:          1  xen-pirq   8-ioapic-edge  rtc0
  9:          0  xen-pirq   9-ioapic-level  acpi
 11:       5648  xen-pirq  11-ioapic-level  ahci[0000:00:04.0]
 12:        257  xen-pirq  12-ioapic-edge  i8042
...

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Implement emulated PIRQ hypercall support
David Woodhouse [Fri, 13 Jan 2023 19:51:32 +0000 (19:51 +0000)]
hw/xen: Implement emulated PIRQ hypercall support

This wires up the basic infrastructure but the actual interrupts aren't
there yet, so don't advertise it to the guest.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: Implement HYPERVISOR_physdev_op
David Woodhouse [Fri, 13 Jan 2023 00:24:40 +0000 (00:24 +0000)]
i386/xen: Implement HYPERVISOR_physdev_op

Just hook up the basic hypercalls to stubs in xen_evtchn.c for now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Automatically add xen-platform PCI device for emulated Xen guests
David Woodhouse [Tue, 17 Jan 2023 13:51:22 +0000 (13:51 +0000)]
hw/xen: Automatically add xen-platform PCI device for emulated Xen guests

It isn't strictly mandatory but Linux guests at least will only map
their grant tables over the dummy BAR that it provides, and don't have
sufficient wit to map them in any other unused part of their guest
address space. So include it by default for minimal surprise factor.

As I come to document "how to run a Xen guest in QEMU", this means one
fewer thing to tell the user about, according to the mantra of "if it
needs documenting, fix it first, then document what remains".

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Add basic ring handling to xenstore
David Woodhouse [Wed, 28 Dec 2022 10:06:49 +0000 (10:06 +0000)]
hw/xen: Add basic ring handling to xenstore

Extract requests, return ENOSYS to all of them. This is enough to allow
older Linux guests to boot, as they need *something* back but it doesn't
matter much what.

A full implementation of a single-tentant internal XenStore copy-on-write
tree with transactions and watches is waiting in the wings to be sent in
a subsequent round of patches along with hooking up the actual PV disk
back end in qemu, but this is enough to get guests booting for now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Add xen_xenstore device for xenstore emulation
David Woodhouse [Fri, 23 Dec 2022 17:39:23 +0000 (17:39 +0000)]
hw/xen: Add xen_xenstore device for xenstore emulation

Just the basic shell, with the event channel hookup. It only dumps the
buffer for now; a real ring implmentation will come in a subsequent patch.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Add backend implementation of interdomain event channel support
David Woodhouse [Tue, 27 Dec 2022 21:20:07 +0000 (21:20 +0000)]
hw/xen: Add backend implementation of interdomain event channel support

The provides the QEMU side of interdomain event channels, allowing events
to be sent to/from the guest.

The API mirrors libxenevtchn, and in time both this and the real Xen one
will be available through ops structures so that the PV backend drivers
can use the correct one as appropriate.

For now, this implementation can be used directly by our XenStore which
will be for emulated mode only.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: handle HVMOP_get_param
Joao Martins [Fri, 28 Sep 2018 17:17:47 +0000 (13:17 -0400)]
i386/xen: handle HVMOP_get_param

Which is used to fetch xenstore PFN and port to be used
by the guest. This is preallocated by the toolstack when
guest will just read those and use it straight away.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: Reserve Xen special pages for console, xenstore rings
David Woodhouse [Tue, 27 Dec 2022 19:02:23 +0000 (19:02 +0000)]
i386/xen: Reserve Xen special pages for console, xenstore rings

Xen has eight frames at 0xfeff8000 for this; we only really need two for
now and KVM puts the identity map at 0xfeffc000, so limit ourselves to
four.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: handle PV timer hypercalls
Joao Martins [Mon, 17 Sep 2018 11:04:54 +0000 (07:04 -0400)]
i386/xen: handle PV timer hypercalls

Introduce support for one shot and periodic mode of Xen PV timers,
whereby timer interrupts come through a special virq event channel
with deadlines being set through:

1) set_timer_op hypercall (only oneshot)
2) vcpu_op hypercall for {set,stop}_{singleshot,periodic}_timer
hypercalls

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Implement GNTTABOP_query_size
David Woodhouse [Fri, 16 Dec 2022 23:49:48 +0000 (23:49 +0000)]
hw/xen: Implement GNTTABOP_query_size

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: Implement HYPERVISOR_grant_table_op and GNTTABOP_[gs]et_verson
David Woodhouse [Fri, 16 Dec 2022 23:40:57 +0000 (23:40 +0000)]
i386/xen: Implement HYPERVISOR_grant_table_op and GNTTABOP_[gs]et_verson

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Support mapping grant frames
David Woodhouse [Fri, 16 Dec 2022 18:33:49 +0000 (18:33 +0000)]
hw/xen: Support mapping grant frames

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Add xen_gnttab device for grant table emulation
David Woodhouse [Fri, 16 Dec 2022 15:50:26 +0000 (15:50 +0000)]
hw/xen: Add xen_gnttab device for grant table emulation

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agokvm/i386: Add xen-gnttab-max-frames property
David Woodhouse [Fri, 16 Dec 2022 16:27:00 +0000 (16:27 +0000)]
kvm/i386: Add xen-gnttab-max-frames property

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Support HVM_PARAM_CALLBACK_TYPE_PCI_INTX callback
David Woodhouse [Fri, 16 Dec 2022 00:03:21 +0000 (00:03 +0000)]
hw/xen: Support HVM_PARAM_CALLBACK_TYPE_PCI_INTX callback

The guest is permitted to specify an arbitrary domain/bus/device/function
and INTX pin from which the callback IRQ shall appear to have come.

In QEMU we can only easily do this for devices that actually exist, and
even that requires us "knowing" that it's a PCMachine in order to find
the PCI root bus — although that's OK really because it's always true.

We also don't get to get notified of INTX routing changes, because we
can't do that as a passive observer; if we try to register a notifier
it will overwrite any existing notifier callback on the device.

But in practice, guests using PCI_INTX will only ever use pin A on the
Xen platform device, and won't swizzle the INTX routing after they set
it up. So this is just fine.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Support HVM_PARAM_CALLBACK_TYPE_GSI callback
David Woodhouse [Thu, 15 Dec 2022 20:35:24 +0000 (20:35 +0000)]
hw/xen: Support HVM_PARAM_CALLBACK_TYPE_GSI callback

The GSI callback (and later PCI_INTX) is a level triggered interrupt. It
is asserted when an event channel is delivered to vCPU0, and is supposed
to be cleared when the vcpu_info->evtchn_upcall_pending field for vCPU0
is cleared again.

Thankfully, Xen does *not* assert the GSI if the guest sets its own
evtchn_upcall_pending field; we only need to assert the GSI when we
have delivered an event for ourselves. So that's the easy part, kind of.

There's a slight complexity in that we need to hold the BQL before we
can call qemu_set_irq(), and we definitely can't do that while holding
our own port_lock (because we'll need to take that from the qemu-side
functions that the PV backend drivers will call). So if we end up
wanting to set the IRQ in a context where we *don't* already hold the
BQL, defer to a BH.

However, we *do* need to poll for the evtchn_upcall_pending flag being
cleared. In an ideal world we would poll that when the EOI happens on
the PIC/IOAPIC. That's how it works in the kernel with the VFIO eventfd
pairs — one is used to trigger the interrupt, and the other works in the
other direction to 'resample' on EOI, and trigger the first eventfd
again if the line is still active.

However, QEMU doesn't seem to do that. Even VFIO level interrupts seem
to be supported by temporarily unmapping the device's BARs from the
guest when an interrupt happens, then trapping *all* MMIO to the device
and sending the 'resample' event on *every* MMIO access until the IRQ
is cleared! Maybe in future we'll plumb the 'resample' concept through
QEMU's irq framework but for now we'll do what Xen itself does: just
check the flag on every vmexit if the upcall GSI is known to be
asserted.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: add monitor commands to test event injection
Joao Martins [Tue, 21 Aug 2018 16:16:19 +0000 (12:16 -0400)]
i386/xen: add monitor commands to test event injection

Specifically add listing, injection of event channels.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Implement EVTCHNOP_reset
David Woodhouse [Wed, 14 Dec 2022 19:36:15 +0000 (19:36 +0000)]
hw/xen: Implement EVTCHNOP_reset

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Implement EVTCHNOP_bind_vcpu
David Woodhouse [Wed, 14 Dec 2022 19:27:38 +0000 (19:27 +0000)]
hw/xen: Implement EVTCHNOP_bind_vcpu

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Implement EVTCHNOP_bind_interdomain
David Woodhouse [Wed, 14 Dec 2022 17:26:32 +0000 (17:26 +0000)]
hw/xen: Implement EVTCHNOP_bind_interdomain

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Implement EVTCHNOP_alloc_unbound
David Woodhouse [Wed, 14 Dec 2022 16:39:48 +0000 (16:39 +0000)]
hw/xen: Implement EVTCHNOP_alloc_unbound

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Implement EVTCHNOP_send
David Woodhouse [Wed, 14 Dec 2022 00:11:07 +0000 (00:11 +0000)]
hw/xen: Implement EVTCHNOP_send

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Implement EVTCHNOP_bind_ipi
David Woodhouse [Tue, 13 Dec 2022 23:12:59 +0000 (23:12 +0000)]
hw/xen: Implement EVTCHNOP_bind_ipi

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Implement EVTCHNOP_bind_virq
David Woodhouse [Tue, 13 Dec 2022 22:40:56 +0000 (22:40 +0000)]
hw/xen: Implement EVTCHNOP_bind_virq

Add the array of virq ports to each vCPU so that we can deliver timers,
debug ports, etc. Global virqs are allocated against vCPU 0 initially,
but can be migrated to other vCPUs (when we implement that).

The kernel needs to know about VIRQ_TIMER in order to accelerate timers,
so tell it via KVM_XEN_VCPU_ATTR_TYPE_TIMER. Also save/restore the value
of the singleshot timer across migration, as the kernel will handle the
hypercalls automatically now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Implement EVTCHNOP_unmask
David Woodhouse [Tue, 13 Dec 2022 17:20:46 +0000 (17:20 +0000)]
hw/xen: Implement EVTCHNOP_unmask

This finally comes with a mechanism for actually injecting events into
the guest vCPU, with all the atomic-test-and-set that's involved in
setting the bit in the shinfo, then the index in the vcpu_info, and
injecting either the lapic vector as MSI, or letting KVM inject the
bare vector.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Implement EVTCHNOP_close
David Woodhouse [Tue, 13 Dec 2022 13:57:44 +0000 (13:57 +0000)]
hw/xen: Implement EVTCHNOP_close

It calls an internal close_port() helper which will also be used from
EVTCHNOP_reset and will actually do the work to disconnect/unbind a port
once any of that is actually implemented in the first place.

That in turn calls a free_port() internal function which will be in
error paths after allocation.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Implement EVTCHNOP_status
David Woodhouse [Tue, 13 Dec 2022 13:29:46 +0000 (13:29 +0000)]
hw/xen: Implement EVTCHNOP_status

This adds the basic structure for maintaining the port table and reporting
the status of ports therein.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: Add support for Xen event channel delivery to vCPU
David Woodhouse [Fri, 16 Dec 2022 14:32:25 +0000 (14:32 +0000)]
i386/xen: Add support for Xen event channel delivery to vCPU

The kvm_xen_inject_vcpu_callback_vector() function will either deliver
the per-vCPU local APIC vector (as an MSI), or just kick the vCPU out
of the kernel to trigger KVM's automatic delivery of the global vector.
Support for asserting the GSI/PCI_INTX callbacks will come later.

Also add kvm_xen_get_vcpu_info_hva() which returns the vcpu_info of
a given vCPU.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agohw/xen: Add xen_evtchn device for event channel emulation
David Woodhouse [Fri, 16 Dec 2022 14:02:29 +0000 (14:02 +0000)]
hw/xen: Add xen_evtchn device for event channel emulation

Include basic support for setting HVM_PARAM_CALLBACK_IRQ to the global
vector method HVM_PARAM_CALLBACK_TYPE_VECTOR, which is handled in-kernel
by raising the vector whenever the vCPU's vcpu_info->evtchn_upcall_pending
flag is set.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: implement HVMOP_set_param
Ankur Arora [Tue, 6 Dec 2022 11:14:07 +0000 (11:14 +0000)]
i386/xen: implement HVMOP_set_param

This is the hook for adding the HVM_PARAM_CALLBACK_IRQ parameter in a
subsequent commit.

Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Split out from another commit]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: implement HVMOP_set_evtchn_upcall_vector
Ankur Arora [Tue, 6 Dec 2022 11:14:07 +0000 (11:14 +0000)]
i386/xen: implement HVMOP_set_evtchn_upcall_vector

The HVMOP_set_evtchn_upcall_vector hypercall sets the per-vCPU upcall
vector, to be delivered to the local APIC just like an MSI (with an EOI).

This takes precedence over the system-wide delivery method set by the
HVMOP_set_param hypercall with HVM_PARAM_CALLBACK_IRQ. It's used by
Windows and Xen (PV shim) guests but normally not by Linux.

Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Rework for upstream kernel changes and split from HVMOP_set_param]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: implement HYPERVISOR_event_channel_op
Joao Martins [Thu, 28 Jun 2018 16:36:19 +0000 (12:36 -0400)]
i386/xen: implement HYPERVISOR_event_channel_op

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Ditch event_channel_op_compat which was never available to HVM guests]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: handle VCPUOP_register_runstate_memory_area
Joao Martins [Tue, 24 Jul 2018 16:46:00 +0000 (12:46 -0400)]
i386/xen: handle VCPUOP_register_runstate_memory_area

Allow guest to setup the vcpu runstates which is used as
steal clock.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: handle VCPUOP_register_vcpu_time_info
Joao Martins [Mon, 23 Jul 2018 15:24:36 +0000 (11:24 -0400)]
i386/xen: handle VCPUOP_register_vcpu_time_info

In order to support Linux vdso in Xen.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: handle VCPUOP_register_vcpu_info
Joao Martins [Fri, 29 Jun 2018 14:54:50 +0000 (10:54 -0400)]
i386/xen: handle VCPUOP_register_vcpu_info

Handle the hypercall to set a per vcpu info, and also wire up the default
vcpu_info in the shared_info page for the first 32 vCPUs.

To avoid deadlock within KVM a vCPU thread must set its *own* vcpu_info
rather than it being set from the context in which the hypercall is
invoked.

Add the vcpu_info (and default) GPA to the vmstate_x86_cpu for migration,
and restore it in kvm_arch_put_registers() appropriately.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: implement HYPERVISOR_vcpu_op
Joao Martins [Mon, 18 Jun 2018 16:26:44 +0000 (12:26 -0400)]
i386/xen: implement HYPERVISOR_vcpu_op

This is simply when guest tries to register a vcpu_info
and since vcpu_info placement is optional in the minimum ABI
therefore we can just fail with -ENOSYS

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: implement HYPERVISOR_hvm_op
Joao Martins [Mon, 18 Jun 2018 16:21:57 +0000 (12:21 -0400)]
i386/xen: implement HYPERVISOR_hvm_op

This is when guest queries for support for HVMOP_pagetable_dying.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: implement XENMEM_add_to_physmap_batch
David Woodhouse [Thu, 15 Dec 2022 10:39:30 +0000 (10:39 +0000)]
i386/xen: implement XENMEM_add_to_physmap_batch

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: implement HYPERVISOR_memory_op
Joao Martins [Mon, 18 Jun 2018 16:17:42 +0000 (12:17 -0400)]
i386/xen: implement HYPERVISOR_memory_op

Specifically XENMEM_add_to_physmap with space XENMAPSPACE_shared_info to
allow the guest to set its shared_info page.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Use the xen_overlay device, add compat support]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: manage and save/restore Xen guest long_mode setting
David Woodhouse [Mon, 12 Dec 2022 14:03:41 +0000 (14:03 +0000)]
i386/xen: manage and save/restore Xen guest long_mode setting

Xen will "latch" the guest's 32-bit or 64-bit ("long mode") setting when
the guest writes the MSR to fill in the hypercall page, or when the guest
sets the event channel callback in HVM_PARAM_CALLBACK_IRQ.

KVM handles the former and sets the kernel's long_mode flag accordingly.
The latter will be handled in userspace. Keep them in sync by noticing
when a hypercall is made in a mode that doesn't match qemu's idea of
the guest mode, and resyncing from the kernel. Do that same sync right
before serialization too, in case the guest has set the hypercall page
but hasn't yet made a system call.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: add pc_machine_kvm_type to initialize XEN_EMULATE mode
David Woodhouse [Mon, 12 Dec 2022 23:40:45 +0000 (23:40 +0000)]
i386/xen: add pc_machine_kvm_type to initialize XEN_EMULATE mode

The xen_overlay device (and later similar devices for event channels and
grant tables) need to be instantiated. Do this from a kvm_type method on
the PC machine derivatives, since KVM is only way to support Xen emulation
for now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoxen: Permit --xen-domid argument when accel is KVM
Paul Durrant [Tue, 24 Jan 2023 10:59:49 +0000 (10:59 +0000)]
xen: Permit --xen-domid argument when accel is KVM

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: David Wooodhouse <dwmw@amazon.co.uk>
2 years agohw/xen: Add xen_overlay device for emulating shared xenheap pages
David Woodhouse [Wed, 7 Dec 2022 09:19:31 +0000 (09:19 +0000)]
hw/xen: Add xen_overlay device for emulating shared xenheap pages

For the shared info page and for grant tables, Xen shares its own pages
from the "Xen heap" to the guest. The guest requests that a given page
from a certain address space (XENMAPSPACE_shared_info, etc.) be mapped
to a given GPA using the XENMEM_add_to_physmap hypercall.

To support that in qemu when *emulating* Xen, create a memory region
(migratable) and allow it to be mapped as an overlay when requested.

Xen theoretically allows the same page to be mapped multiple times
into the guest, but that's hard to track and reinstate over migration,
so we automatically *unmap* any previous mapping when creating a new
one. This approach has been used in production with.... a non-trivial
number of guests expecting true Xen, without any problems yet being
noticed.

This adds just the shared info page for now. The grant tables will be
a larger region, and will need to be overlaid one page at a time. I
think that means I need to create separate aliases for each page of
the overall grant_frames region, so that they can be mapped individually.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: Implement SCHEDOP_poll and SCHEDOP_yield
David Woodhouse [Wed, 14 Dec 2022 21:50:41 +0000 (21:50 +0000)]
i386/xen: Implement SCHEDOP_poll and SCHEDOP_yield

They both do the same thing and just call sched_yield. This is enough to
stop the Linux guest panicking when running on a host kernel which doesn't
intercept SCHEDOP_poll and lets it reach userspace.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: implement HYPERVISOR_sched_op, SCHEDOP_shutdown
Joao Martins [Fri, 20 Jul 2018 19:19:05 +0000 (15:19 -0400)]
i386/xen: implement HYPERVISOR_sched_op, SCHEDOP_shutdown

It allows to shutdown itself via hypercall with any of the 3 reasons:
  1) self-reboot
  2) shutdown
  3) crash

Implementing SCHEDOP_shutdown sub op let us handle crashes gracefully rather
than leading to triple faults if it remains unimplemented.

In addition, the SHUTDOWN_soft_reset reason is used for kexec, to reset
Xen shared pages and other enlightenments and leave a clean slate for the
new kernel without the hypervisor helpfully writing information at
unexpected addresses.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Ditch sched_op_compat which was never available for HVM guests,
        Add SCHEDOP_soft_reset]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: implement HYPERVISOR_xen_version
Joao Martins [Thu, 14 Jun 2018 12:29:45 +0000 (08:29 -0400)]
i386/xen: implement HYPERVISOR_xen_version

This is just meant to serve as an example on how we can implement
hypercalls. xen_version specifically since Qemu does all kind of
feature controllability. So handling that here seems appropriate.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Implement kvm_gva_rw() safely]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/xen: handle guest hypercalls
Joao Martins [Wed, 13 Jun 2018 14:14:31 +0000 (10:14 -0400)]
i386/xen: handle guest hypercalls

This means handling the new exit reason for Xen but still
crashing on purpose. As we implement each of the hypercalls
we will then return the right return code.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Add CPL to hypercall tracing, disallow hypercalls from CPL > 0]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoxen-platform: allow its creation with XEN_EMULATE mode
Joao Martins [Tue, 19 Jun 2018 10:44:46 +0000 (06:44 -0400)]
xen-platform: allow its creation with XEN_EMULATE mode

The only thing we need to fix to make this build is the PIO hack which
sets the BIOS memory areas to R/W v.s. R/O. Theoretically we could hook
that up to the PAM registers on the emulated PIIX, but in practice
nobody cares, so just leave it doing nothing.

Now it builds without actual Xen, move it to CONFIG_XEN_BUS to include it
in the KVM-only builds.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoxen-platform: exclude vfio-pci from the PCI platform unplug
Joao Martins [Fri, 18 Jan 2019 19:29:52 +0000 (14:29 -0500)]
xen-platform: exclude vfio-pci from the PCI platform unplug

Such that PCI passthrough devices work for Xen emulated guests.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/kvm: Set Xen vCPU ID in KVM
David Woodhouse [Fri, 16 Dec 2022 11:05:29 +0000 (11:05 +0000)]
i386/kvm: Set Xen vCPU ID in KVM

There are (at least) three different vCPU ID number spaces. One is the
internal KVM vCPU index, based purely on which vCPU was chronologically
created in the kernel first. If userspace threads are all spawned and
create their KVM vCPUs in essentially random order, then the KVM indices
are basically random too.

The second number space is the APIC ID space, which is consistent and
useful for referencing vCPUs. MSIs will specify the target vCPU using
the APIC ID, for example, and the KVM Xen APIs also take an APIC ID
from userspace whenever a vCPU needs to be specified (as opposed to
just using the appropriate vCPU fd).

The third number space is not normally relevant to the kernel, and is
the ACPI/MADT/Xen CPU number which corresponds to cs->cpu_index. But
Xen timer hypercalls use it, and Xen timer hypercalls *really* want
to be accelerated in the kernel rather than handled in userspace, so
the kernel needs to be told.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/kvm: handle Xen HVM cpuid leaves
Joao Martins [Tue, 6 Dec 2022 10:48:53 +0000 (10:48 +0000)]
i386/kvm: handle Xen HVM cpuid leaves

Introduce support for emulating CPUID for Xen HVM guests. It doesn't make
sense to advertise the KVM leaves to a Xen guest, so do Xen unconditionally
when the xen-version machine property is set.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Obtain xen_version from KVM property, make it automatic]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoi386/kvm: Add xen-version KVM accelerator property and init KVM Xen support
David Woodhouse [Sat, 3 Dec 2022 17:51:13 +0000 (09:51 -0800)]
i386/kvm: Add xen-version KVM accelerator property and init KVM Xen support

This just initializes the basic Xen support in KVM for now. Only permitted
on TYPE_PC_MACHINE because that's where the sysbus devices for Xen heap
overlay, event channel, grant tables and other stuff will exist. There's
no point having the basic hypercall support if nothing else works.

Provide sysemu/kvm_xen.h and a kvm_xen_get_caps() which will be used
later by support devices.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoxen: Add XEN_DISABLED mode and make it default
David Woodhouse [Mon, 12 Dec 2022 22:32:54 +0000 (22:32 +0000)]
xen: Add XEN_DISABLED mode and make it default

Also set XEN_ATTACH mode in xen_init() to reflect the truth; not that
anyone ever cared before. It was *only* ever checked in xen_init_pv()
before.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoxen: add CONFIG_XEN_BUS and CONFIG_XEN_EMU options for Xen emulation
David Woodhouse [Tue, 6 Dec 2022 09:03:48 +0000 (09:03 +0000)]
xen: add CONFIG_XEN_BUS and CONFIG_XEN_EMU options for Xen emulation

The XEN_EMU option will cover core Xen support in target/, which exists
only for x86 with KVM today but could theoretically also be implemented
on Arm/Aarch64 and with TCG or other accelerators (if anyone wants to
run the gauntlet of struct layout compatibility, errno mapping, and the
rest of that fui).

It will also cover the support for architecture-independent grant table
and event channel support which will be added in hw/i386/kvm/ (on the
basis that the non-KVM support is very theoretical and making it not use
KVM directly seems like gratuitous overengineering at this point).

The XEN_BUS option is for the xenfv platform support, which will now be
used both by XEN_EMU and by real Xen.

The XEN option remains dependent on the Xen runtime libraries, and covers
support for real Xen. Some code which currently resides under CONFIG_XEN
will be moving to CONFIG_XEN_BUS over time as the direct dependencies on
Xen runtime libraries are eliminated. The Xen PCI platform device will
also reside under CONFIG_XEN_BUS.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoinclude: import Xen public headers to hw/xen/interface
Joao Martins [Wed, 13 Feb 2019 17:29:47 +0000 (12:29 -0500)]
include: import Xen public headers to hw/xen/interface

There's already a partial set here; update them and pull in a more
complete set.

To start with, define __XEN_TOOLS__ in hw/xen/xen.h to ensure that any
internal definitions needed by Xen toolstack libraries are present
regardless of the order in which the headers are included. A reckoning
will come later, once we make the PV backends work in emulation and
untangle the headers for Xen-native vs. generic parts.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Update to Xen public headers from 4.16.2 release, add some in io/,
        define __XEN_TOOLS__ in hw/xen/xen.h, move to hw/xen/interface/]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2 years agoMerge tag 'buildsys-qom-qdev-ui-20230227' of https://github.com/philmd/qemu into...
Peter Maydell [Tue, 28 Feb 2023 15:09:18 +0000 (15:09 +0000)]
Merge tag 'buildsys-qom-qdev-ui-20230227' of https://github.com/philmd/qemu into staging

- buildsys
  - Various header cleaned up (removing pointless headers)
  - Mark various files/code user/system specific
  - Make various objects target-independent
  - Remove tswapN() calls from dump.o
  - Suggest g_assert_not_reached() instead of assert(0)

- qdev / qom
  - Replace various container_of() by QOM cast macros
  - Declare some QOM macros using OBJECT_DECLARE_TYPE()
  - Embed OHCI QOM child in SM501 chipset

- hw (ISA & IDE)
  - add some documentation, improve function names
  - un-inline, open-code few functions
  - have ISA API accessing IRQ/DMA prefer ISABus over ISADevice
  - Demote IDE subsystem maintenance to "Odd Fixes"

- ui: Improve Ctrl+Alt hint on Darwin Cocoa

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmP9IeAACgkQ4+MsLN6t
# wN7bdQ//SxJYJuQvqTT6s+O0LmP6NbqvhxCXX7YAwK2jCTM+zTgcqqRZCcisLQol
# 3ENu2UhnZmiLKHSOxatOVozbws08/u8Vl+WkW4UTMUb1yo5KPaPtq808Y95RdAJB
# 7D7B5juDGnFRAHXZz38zVk9uIuEkm+Po/pD0JQa+upBtAAgOJTqGavDNSR5+T0Yl
# VjGdwK0b10skPqiF6OABYoy/4IFHVJJFIbARZh+a7hrF0llsbzUts5JiYsOxEEHQ
# t3woUItdMnS1m0+Ty4AQ8m0Yv9y4HZOIzixvsZ+vChj5ariwUhL9/7wC/s/UCYEg
# gKVA5X8R6n/ME6DScK99a+CyR/MXkz70b/rOUZxoutXhV3xdh4X1stL4WN9W/m3z
# D4i4ZrUsDUcKCGWlj49of/dKbOPwk1+e/mT0oDZD6JzG0ODjfdVxvJ/JEV2iHgS3
# WqHuSKzX/20H9j7/MgfbQ0HjBFOQ8tl781vQzhD+y+cF/IiTsHhrE6esIWho4bob
# kfSdVydUWWRnBsnyGoRZXoEMX9tn+pu0nKxEDm2Bo2+jajsa0aZZPokgjxaz4MnD
# Hx+/p1E+8IuOn05JgzQSgTJmKFdSbya203tXIsTo1kL2aJTJ6QfMvgEPP/fkn+lS
# oQyVBFZmb1JDdTM1MxOncnlWLg74rp/CWEc+u5pSdbxMO/M/uac=
# =AV/+
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 27 Feb 2023 21:34:24 GMT
# gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD  6BB2 E3E3 2C2C DEAD C0DE

* tag 'buildsys-qom-qdev-ui-20230227' of https://github.com/philmd/qemu: (125 commits)
  ui/cocoa: user friendly characters for release mouse
  dump: Add create_win_dump() stub for non-x86 targets
  dump: Simplify compiling win_dump.o by introducing win_dump_available()
  dump: Clean included headers
  dump: Replace TARGET_PAGE_SIZE -> qemu_target_page_size()
  dump: Replace tswapN() -> cpu_to_dumpN()
  hw/ide/pci: Add PCIIDEState::isa_irq[]
  hw/ide/via: Replace magic 2 value by ARRAY_SIZE / MAX_IDE_DEVS
  hw/ide/piix: Refactor pci_piix_init_ports as pci_piix_init_bus per bus
  hw/ide/piix: Pass Error* to pci_piix_init_ports() for better error msg
  hw/ide/piix: Remove unused includes
  hw/ide/pci: Unexport bmdma_active_if()
  hw/ide/ioport: Remove unnecessary includes
  hw/ide: Declare ide_get_[geometry/bios_chs_trans] in 'hw/ide/internal.h'
  hw/ide: Rename idebus_active_if() -> ide_bus_active_if()
  hw/ide: Rename ide_init2() -> ide_bus_init_output_irq()
  hw/ide: Rename ide_exec_cmd() -> ide_bus_exec_cmd()
  hw/ide: Rename ide_register_restart_cb -> ide_bus_register_restart_cb
  hw/ide: Rename ide_create_drive() -> ide_bus_create_drive()
  hw/ide: Rename ide_set_irq() -> ide_bus_set_irq()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2 years agoui/cocoa: user friendly characters for release mouse
Christian Schoenebeck [Tue, 27 Dec 2022 16:15:31 +0000 (17:15 +0100)]
ui/cocoa: user friendly characters for release mouse

While mouse is grabbed, window title contains a hint for the user what
keyboard keys to press to release the mouse. Make that hint text a bit
more user friendly for a Mac user:

 - Replace "Ctrl" and "Alt" by appropriate symbols for those keyboard
   keys typically displayed for them on a Mac (encode those symbols by
   using UTF-8 characters).

 - Drop " + " in between the keys, as that's not common on macOS for
   documenting keyboard shortcuts.

 - Convert lower case "g" to upper case "G", as that's common on macOS.

 - Add one additional space at start and end of key stroke set, to
   visually separate the key strokes from the rest of the text.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <E1pAClj-0003Jo-OB@lizzy.crudebyte.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2 years agodump: Add create_win_dump() stub for non-x86 targets
Philippe Mathieu-Daudé [Thu, 23 Feb 2023 22:59:19 +0000 (23:59 +0100)]
dump: Add create_win_dump() stub for non-x86 targets

Implement the non-x86 create_win_dump(). We can remove
the last TARGET_X86_64 #ifdef'ry in dump.c, which thus
becomes target-independent. Update meson accordingly.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230225094903.53167-6-philmd@linaro.org>

2 years agodump: Simplify compiling win_dump.o by introducing win_dump_available()
Philippe Mathieu-Daudé [Thu, 23 Feb 2023 22:58:16 +0000 (23:58 +0100)]
dump: Simplify compiling win_dump.o by introducing win_dump_available()

To make dump.c less target dependent, move the TARGET_X86_64 #ifdef'ry
from dump.c to win_dump.c (introducing a win_dump_available() method
there). By doing so we can build win_dump.c on any target, and
simplify the meson rule.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230225094903.53167-5-philmd@linaro.org>

2 years agodump: Clean included headers
Philippe Mathieu-Daudé [Thu, 23 Feb 2023 22:56:46 +0000 (23:56 +0100)]
dump: Clean included headers

"qemu/win_dump_defs.h" is only required by win_dump.c,
but win_dump.h requires "sysemu/dump.h" which declares
the DumpState type. Remove various unused headers.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230225094903.53167-4-philmd@linaro.org>

2 years agodump: Replace TARGET_PAGE_SIZE -> qemu_target_page_size()
Philippe Mathieu-Daudé [Thu, 23 Feb 2023 22:38:59 +0000 (23:38 +0100)]
dump: Replace TARGET_PAGE_SIZE -> qemu_target_page_size()

TARGET_PAGE_SIZE is target specific. In preparation of
making dump.c target-agnostic, replace the compile-time
TARGET_PAGE_SIZE definition by runtime qemu_target_page_size().

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230225094903.53167-3-philmd@linaro.org>

2 years agodump: Replace tswapN() -> cpu_to_dumpN()
Philippe Mathieu-Daudé [Fri, 16 Dec 2022 07:28:32 +0000 (08:28 +0100)]
dump: Replace tswapN() -> cpu_to_dumpN()

All uses of tswap in that file are wrong, and should be using
cpu_to_dumpN, which correctly tests the endianness of the output.

Reported-by: Richard Henderson <richard.henderson@linaro.org>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230225094903.53167-2-philmd@linaro.org>

2 years agohw/ide/pci: Add PCIIDEState::isa_irq[]
Bernhard Beschow [Thu, 26 Jan 2023 21:17:36 +0000 (22:17 +0100)]
hw/ide/pci: Add PCIIDEState::isa_irq[]

These legacy ISA IRQs allow the PIIX IDE functions to be wired up in
their south bridges and the VIA IDE functions to disuse
PCI_INTERRUPT_LINE as outlined in https://lists.nongnu.org/archive/html/qemu-devel/2020-03/msg01707.html

Suggested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230126211740.66874-7-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2 years agohw/ide/via: Replace magic 2 value by ARRAY_SIZE / MAX_IDE_DEVS
Philippe Mathieu-Daudé [Wed, 24 Mar 2021 17:47:59 +0000 (18:47 +0100)]
hw/ide/via: Replace magic 2 value by ARRAY_SIZE / MAX_IDE_DEVS

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: John Snow <jsnow@redhat.com>
Message-Id: <20210511041848.2743312-5-f4bug@amsat.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2 years agohw/ide/piix: Refactor pci_piix_init_ports as pci_piix_init_bus per bus
Philippe Mathieu-Daudé [Tue, 14 Feb 2023 15:28:19 +0000 (16:28 +0100)]
hw/ide/piix: Refactor pci_piix_init_ports as pci_piix_init_bus per bus

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230215112712.23110-21-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2 years agohw/ide/piix: Pass Error* to pci_piix_init_ports() for better error msg
Philippe Mathieu-Daudé [Tue, 14 Feb 2023 15:47:39 +0000 (16:47 +0100)]
hw/ide/piix: Pass Error* to pci_piix_init_ports() for better error msg

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230215112712.23110-20-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2 years agohw/ide/piix: Remove unused includes
Philippe Mathieu-Daudé [Thu, 9 Feb 2023 10:57:15 +0000 (11:57 +0100)]
hw/ide/piix: Remove unused includes

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230215112712.23110-19-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2 years agohw/ide/pci: Unexport bmdma_active_if()
Bernhard Beschow [Mon, 22 Aug 2022 17:02:12 +0000 (19:02 +0200)]
hw/ide/pci: Unexport bmdma_active_if()

The function is only used inside ide/pci.c, so doesn't need to be exported.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230215112712.23110-18-philmd@linaro.org>

2 years agohw/ide/ioport: Remove unnecessary includes
Philippe Mathieu-Daudé [Fri, 10 Feb 2023 21:58:05 +0000 (22:58 +0100)]
hw/ide/ioport: Remove unnecessary includes

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230215112712.23110-17-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2 years agohw/ide: Declare ide_get_[geometry/bios_chs_trans] in 'hw/ide/internal.h'
Philippe Mathieu-Daudé [Thu, 9 Feb 2023 22:33:35 +0000 (23:33 +0100)]
hw/ide: Declare ide_get_[geometry/bios_chs_trans] in 'hw/ide/internal.h'

ide_get_geometry() and ide_get_bios_chs_trans() are only
used by the TYPE_PC_MACHINE.
"hw/ide.h" is a mixed bag of lost IDE declarations. In order
to remove this (almost) pointless header soon, move these
declarations to "hw/ide/internal.h".

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230220091358.17038-18-philmd@linaro.org>