www.infradead.org Git - qemu-nvme.git/log

Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into staging

Block layer patches

- qcow2 spec: Rename "zlib" compression to "deflate"
- Honour graph read lock even in the main thread + prerequisite fixes
- aio-posix: do not nest poll handlers (fixes infinite recursion)
- Refactor QMP blockdev transactions
- graph-lock: Disable locking for now
- iotests/245: Check if 'compress' driver is available

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmRnrxURHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9aHyw/9H0xpceVb0kcC5CStOWCcq4PJHzkl/8/m
# c6ABFe0fgEuN2FCiKiCKOt6+V7qaIAw0+YLgPr/LGIsbIBzdxF3Xgd2UyIH6o4dK
# bSaIAaes6ZLTcYGIYEVJtHuwNgvzhjyBlW5qqwTpN0YArKS411eHyQ3wlUkCEVwK
# ZNmDY/MC8jq8r1xfwpPi7CaH6k1I6HhDmyl1PdURW9hmoAKZQZMhEdA5reJrUwZ9
# EhfgbLIaK0kkLLsufJ9YIkd+b/P3mUbH30kekNMOiA0XlnhWm1Djol5pxlnNiflg
# CGh6CAyhJKdXzwV567cSF11NYCsFmiY+c/l0xRIGscujwvO4iD7wFT5xk2geUAKV
# yaox8JA7Le36g7lO2CRadlS24/Ekqnle6q09g2i8s2tZwB4fS286vaZz6QDPmf7W
# VSQp9vuDj6ZcVjMsuo2+LzF3yA2Vqvgd9s032iBAjRDSGLAoOdQZjBJrreypJ0Oi
# pVFwgK+9QNCZBsqVhwVOgElSoK/3Vbl1kqpi30Ikgc0epAn0suM1g2QQPJ2Zt/MJ
# xqMlTv+48OW3vq3ebr8GXqkhvG/u0ku6I1G6ZyCrjOce89osK8QUaovERyi1eOmo
# ouoZ8UJJa6VfEkkmdhq2vF6u/MP4PeZ8MW3pYQy6qEnSOPDKpLnR30Z/s/HZCZcm
# H4QIbfQnzic=
# =edNP
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 19 May 2023 10:17:09 AM PDT
# gpg:                using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg:                issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]

* tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (21 commits)
  iotests: Test commit with iothreads and ongoing I/O
  nbd/server: Fix drained_poll to wake coroutine in right AioContext
  graph-lock: Disable locking for now
  tested: add test for nested aio_poll() in poll handlers
  aio-posix: do not nest poll handlers
  iotests/245: Check if 'compress' driver is available
  graph-lock: Honour read locks even in the main thread
  blockjob: Adhere to rate limit even when reentered early
  test-bdrv-drain: Call bdrv_co_unref() in coroutine context
  test-bdrv-drain: Take graph lock more selectively
  qemu-img: Take graph lock more selectively
  qcow2: Unlock the graph in qcow2_do_open() where necessary
  block/export: Fix null pointer dereference in error path
  block: Call .bdrv_co_create(_opts) unlocked
  docs/interop/qcow2.txt: fix description about "zlib" clusters
  blockdev: qmp_transaction: drop extra generic layer
  blockdev: use state.bitmap in block-dirty-bitmap-add action
  blockdev: transaction: refactor handling transaction properties
  blockdev: qmp_transaction: refactor loop to classic for
  blockdev: transactions: rename some things
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Merge tag 'for-upstream-urgent' of https://gitlab.com/bonzini/qemu into staging

Fixes for Python venv changes

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmRn7D4UHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOHbQgAiQW824iL2Iw+wjYckp0rwLxe53+z
# P4kCdQePrfKW3sPglbeDArPr4gzuo7bdj75dscZmco+nBU40qGqEpRHBqjQol5pE
# kcQsmqx+0Udbsc6kJe47fgSsBLD2KbT1QQCVBgScNuDviogQ0/PCLNWjk9V4OhgL
# 0ZlK8QFnuv0qNthS+oNjkNi6SYGYNOw+4LQ/WcLWnowwhNRGUvYoq9QdOCocfyxD
# t+1xQvF4Pxqnhbkni51JRoXv/Np8U/yDHMgonvw8BLxTMNAes4nV7ifzyW2pltnf
# YEHGUKYPtrPR9dKLr/Au9ktr7n3O5ikOEpPIPSi4BwFqzv6hdE4DDAMXDA==
# =Auyq
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 19 May 2023 02:38:06 PM PDT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream-urgent' of https://gitlab.com/bonzini/qemu:
  scripts: make sure scripts are invoked via $(PYTHON)
  gitlab: custom-runners: preserve more artifacts for debugging
  mkvenv: pass first missing package to diagnose()
  configure: fix backwards-compatibility for meson sphinx_build option
  build: rebuild build.ninja using "meson setup --reconfigure"
  mkvenv: replace distlib.database with importlib.metadata/pkg_resources
  remove remaining traces of meson submodule

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pc,pci: fixes, features, cleanups

CXL volatile memory support
More memslots for vhost-user on x86 and ARM.
vIOMMU support for vhost-vdpa
pcie-to-pci bridge can now be compiled out
MADT revision bumped to 3
Fixes, cleanups all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmRniWoPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpN4MH/RqdvHmujrjvjzXbbN/gq87Njp+kQLKEooIE
# ZkqdNaVUE6vjCH8iU+chjsxt4VSquSjOL9CWWrYefEIeqCFLWsuXSAY0VDAbY67x
# +aes51tTYILVsx7fbb+T5mJKRgVuWW4C5KaGeQ1djSexy42nvplZUJdIJUhZr0t9
# dzzOsD+mezHS7Xu2QOzSfl5QQRuOVVJnjJXkqJG/yRvHrZM5aTolatr/X7jNGedm
# 4oyMsVMaAcQ+dnEQigRJodf/MpFfs9DfNZAH55VwwQWsNT0t0ueD0xigR203jjaE
# mJJJipAqetFax2JjC7QMXWf+LR36BnL/0/xH+x/BWb0FI42wr0I=
# =ajmR
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 19 May 2023 07:36:26 AM PDT
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (40 commits)
  hw/i386/pc: No need for rtc_state to be an out-parameter
  hw/i386/pc: Create RTC controllers in south bridges
  hw/cxl: Introduce cxl_device_get_timestamp() utility function
  hw/cxl: rename mailbox return code type from ret_code to CXLRetCode
  hw/pci-bridge: make building pcie-to-pci bridge configurable
  virtio-pci: add handling of PCI ATS and Device-TLB enable/disable
  hw/pci-host/pam: Make init_pam() usage more readable
  hw/i386/pc: Initialize ram_memory variable directly
  hw/i386/pc_{q35,piix}: Minimize usage of get_system_memory()
  hw/i386/pc_{q35,piix}: Reuse MachineClass::desc as SMB product name
  hw/i386/pc_q35: Reuse machine parameter
  hw/pci-host/q35: Inline sysbus_add_io()
  hw/pci-host/i440fx: Inline sysbus_add_io()
  vhost-vdpa: Add support for vIOMMU.
  vhost-vdpa: Add check for full 64-bit in region delete
  vhost_vdpa: fix the input in trace_vhost_vdpa_listener_region_del()
  vhost: expose function vhost_dev_has_iommu()
  virtio-crypto: fix NULL pointer dereference in virtio_crypto_free_request
  virtio-net: not enable vq reset feature unconditionally
  vhost-user: Remove acpi-specific memslot limit
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Revert last two patches

Unintentionally pushed.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Raise crash-test-debian timeout to 90 minutes

When running on the Kubernetes runner, this CI job is timing out.
Raise the limit to give the job enough time to run.

Signed-off-by: Camilla Conte <cconte@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230407145252.32955-2-cconte@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Add CI configuration for Kubernetes

Configure Gitlab CI to run on Kubernetes
according to the official documentation.
https://docs.gitlab.com/ee/ci/docker/using_docker_build.html#docker-in-docker-with-tls-enabled-in-kubernetes

These changes are needed because of the CI jobs
using Docker-in-Docker (dind).
As soon as Docker-in-Docker is replaced with Kaniko,
these changes can be reverted.

I documented what I did to set up the Kubernetes runner on the wiki:
https://wiki.qemu.org/Testing/CI/KubernetesRunners

Signed-off-by: Camilla Conte <cconte@redhat.com>
Message-Id: <20230407145252.32955-1-cconte@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

scripts: make sure scripts are invoked via $(PYTHON)

Some scripts are invoked via the first "python3" binary in the PATH,
because they are executable and their shebang line is "#! /usr/bin/env
python3". To enforce usage of $(PYTHON), make them nonexecutable.
Scripts invoked via meson need nothing else, and meson-buildoptions.py
is already using $(PYTHON). For probe-gdb-support.py however the
invocation in the configure script has to be adjusted.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

gitlab: custom-runners: preserve more artifacts for debugging

Since custom runners are not generally available, make it possible to
debug the differences between a successful and a failing build by
comparing the logs and the build.ninja rules.

Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

mkvenv: pass first missing package to diagnose()

If sphinx is present but the theme is not, mkvenv will print an
inaccurate diagnostic:

ERROR: Could not find a version that satisfies the requirement sphinx-rtd-theme>=0.5.0 (from versions: none)
ERROR: No matching distribution found for sphinx-rtd-theme>=0.5.0

'sphinx>=1.6.0' not found:
• Python package 'sphinx' version '5.3.0' was found, but isn't suitable.
• mkvenv was configured to operate offline and did not check PyPI.

Instead, ignore the packages that were found to be present, and report
an error based on the first absent package.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

configure: fix backwards-compatibility for meson sphinx_build option

Reintroduce the cmd_line.txt mangling to remove the sphinx_build option
when rerunning meson. The mechanism was removed in commit 75cc28648574
("configure: remove backwards-compatibility code", 2023-01-11) because
the fixups were obsolete at the time; however, the Meson deprecation
mechanism doesn't quite work when options are finally removed, so we
need to bring it back.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

build: rebuild build.ninja using "meson setup --reconfigure"

Do not use the rule in build.ninja, because the path to meson is hardcoded
in build.ninja and this breaks if meson moves (for example if the distro
meson suddenly becomes too old after an update).

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

mkvenv: replace distlib.database with importlib.metadata/pkg_resources

importlib.metadata is just as good as distlib.database and a bit more
battle-proven for "egg" based distributions, and in fact that is exactly
why mkvenv.py is not using distlib.database to find entry points: it
simply does not work for eggs.

The only disadvantage of importlib.metadata is that it is not available
by default before Python 3.8, so we need a fallback to pkg_resources
(again, just like for the case of finding entry points). Do so to
fix issues where incorrect egg metadata results in a JSONDecodeError.

While at it, reuse the new _get_version function to diagnose an incorrect
version of the package even if importlib.metadata is not available.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

remove remaining traces of meson submodule

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

iotests: Test commit with iothreads and ongoing I/O

This tests exercises graph locking, draining, and graph modifications
with AioContext switches a lot. Amongst others, it serves as a
regression test for bdrv_graph_wrlock() deadlocking because it is called
with a locked AioContext and for AioContext handling in the NBD server.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230517152834.277483-4-kwolf@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

nbd/server: Fix drained_poll to wake coroutine in right AioContext

nbd_drained_poll() generally runs in the main thread, not whatever
iothread the NBD server coroutine is meant to run in, so it can't
directly reenter the coroutines to wake them up.

The code seems to have the right intention, it specifies the correct
AioContext when it calls qemu_aio_coroutine_enter(). However, this
functions doesn't schedule the coroutine to run in that AioContext, but
it assumes it is already called in the home thread of the AioContext.

To fix this, add a new thread-safe qio_channel_wake_read() that can be
called in the main thread to wake up the coroutine in its AioContext,
and use this in nbd_drained_poll().

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230517152834.277483-3-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

graph-lock: Disable locking for now

In QEMU 8.0, we've been seeing deadlocks in bdrv_graph_wrlock(). They
come from callers that hold an AioContext lock, which is not allowed
during polling. In theory, we could temporarily release the lock, but
callers are inconsistent about whether they hold a lock, and if they do,
some are also confused about which one they hold. While all of this is
fixable, it's not trivial, and the best course of action for 8.0.1 is
probably just disabling the graph locking code temporarily.

We don't currently rely on graph locking yet. It is supposed to replace
the AioContext lock eventually to enable multiqueue support, but as long
as we still have the AioContext lock, it is sufficient without the graph
lock. Once the AioContext lock goes away, the deadlock doesn't exist any
more either and this commit can be reverted. (Of course, it can also be
reverted while the AioContext lock still exists if the callers have been
fixed.)

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230517152834.277483-2-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

tested: add test for nested aio_poll() in poll handlers

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230502184134.534703-3-stefanha@redhat.com>
[kwolf: Restrict to CONFIG_POSIX, Windows doesn't support polling]
Tested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

aio-posix: do not nest poll handlers

QEMU's event loop supports nesting, which means that event handler
functions may themselves call aio_poll(). The condition that triggered a
handler must be reset before the nested aio_poll() call, otherwise the
same handler will be called and immediately re-enter aio_poll. This
leads to an infinite loop and stack exhaustion.

Poll handlers are especially prone to this issue, because they typically
reset their condition by finishing the processing of pending work.
Unfortunately it is during the processing of pending work that nested
aio_poll() calls typically occur and the condition has not yet been
reset.

Disable a poll handler during ->io_poll_ready() so that a nested
aio_poll() call cannot invoke ->io_poll_ready() again. As a result, the
disabled poll handler and its associated fd handler do not run during
the nested aio_poll(). Calling aio_set_fd_handler() from inside nested
aio_poll() could cause it to run again. If the fd handler is pending
inside nested aio_poll(), then it will also run again.

In theory fd handlers can be affected by the same issue, but they are
more likely to reset the condition before calling nested aio_poll().

This is a special case and it's somewhat complex, but I don't see a way
around it as long as nested aio_poll() is supported.

Cc: qemu-stable@nongnu.org
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2186181
Fixes: c38270692593 ("block: Mark bdrv_co_io_(un)plug() and callers GRAPH_RDLOCK")
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230502184134.534703-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests/245: Check if 'compress' driver is available

Skip TestBlockdevReopen.test_insert_compress_filter() if the 'compress'
driver isn't available.

In order to make the test succeed when the case is skipped, we also need
to remove any output from it (which would be missing in the case where
we skip it). This is done by replacing qemu_io_log() with qemu_io(). In
case of failure, qemu_io() raises an exception with the output of the
qemu-io binary in its message, so we don't actually lose anything.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230511143801.255021-1-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

graph-lock: Honour read locks even in the main thread

There are some conditions under which we don't actually need to do
anything for taking a reader lock: Writing the graph is only possible
from the main context while holding the BQL. So if a reader is running
in the main context under the BQL and knows that it won't be interrupted
until the next writer runs, we don't actually need to do anything.

This is the case if the reader code neither has a nested event loop
(this is forbidden anyway while you hold the lock) nor is a coroutine
(because a writer could run when the coroutine has yielded).

These conditions are exactly what bdrv_graph_rdlock_main_loop() asserts.
They are not fulfilled in bdrv_graph_co_rdlock(), which always runs in a
coroutine.

This deletes the shortcuts in bdrv_graph_co_rdlock() that skip taking
the reader lock in the main thread.

Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-9-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

blockjob: Adhere to rate limit even when reentered early

When jobs are sleeping, for example to enforce a given rate limit, they
can be reentered early, in particular in order to get paused, to update
the rate limit or to get cancelled.

Before this patch, they behave in this case as if they had fully
completed their rate limiting delay. This means that requests are sped
up beyond their limit, violating the constraints that the user gave us.

Change the block jobs to sleep in a loop until the necessary delay is
completed, while still allowing cancelling them immediately as well
pausing (handled by the pause point in job_sleep_ns()) and updating the
rate limit.

This change is also motivated by iotests cases being prone to fail
because drain operations pause and unpause them so often that block jobs
complete earlier than they are supposed to. In particular, the next
commit would fail iotests 030 without this change.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-8-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

test-bdrv-drain: Call bdrv_co_unref() in coroutine context

bdrv_unref() is a no_coroutine_fn, so calling it from coroutine context
is invalid. Use bdrv_co_unref() instead.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-7-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

test-bdrv-drain: Take graph lock more selectively

If we take a reader lock, we can't call any functions that take a writer
lock internally without causing deadlocks once the reader lock is
actually enforced in the main thread, too. Take the reader lock only
where it is actually needed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-6-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-img: Take graph lock more selectively

If we take a reader lock, we can't call any functions that take a writer
lock internally without causing deadlocks once the reader lock is
actually enforced in the main thread, too. Take the reader lock only
where it is actually needed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-5-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qcow2: Unlock the graph in qcow2_do_open() where necessary

qcow2_do_open() calls a few no_co_wrappers that wrap functions taking
the graph lock internally as a writer. Therefore, it can't hold the
reader lock across these calls, it causes deadlocks. Drop the lock
temporarily around the calls.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-4-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/export: Fix null pointer dereference in error path

There are some error paths in blk_exp_add() that jump to 'fail:' before
'exp' is even created. So we can't just unconditionally access exp->blk.

Add a NULL check, and switch from exp->blk to blk, which is available
earlier, just to be extra sure that we really cover all cases where
BlockDevOps could have been set for it (in practice, this only happens
in drv->create() today, so this part of the change isn't strictly
necessary).

Fixes: Coverity CID 1509238
Fixes: de79b52604e43fdeba6cee4f5af600b62169f2d2
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-3-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Call .bdrv_co_create(_opts) unlocked

These are functions that modify the graph, so they must be able to take
a writer lock. This is impossible if they already hold the reader lock.
If they need a reader lock for some of their operations, they should
take it internally.

Many of them go through blk_*(), which will always take the lock itself.
Direct calls of bdrv_*() need to take the reader lock. Note that while
locking for bdrv_co_*() calls is checked by TSA, this is not the case
for the mixed_coroutine_fns bdrv_*(). Holding the lock is still required
when they are called from coroutine context like here!

This effectively reverts 4ec8df0183, but adds some internal locking
instead.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-2-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

docs/interop/qcow2.txt: fix description about "zlib" clusters

"zlib" clusters are actually raw deflate (RFC1951) clusters without
zlib headers.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Message-Id: <168424874322.11954.1340942046351859521-0@git.sr.ht>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

blockdev: qmp_transaction: drop extra generic layer

Let's simplify things:

First, actions generally don't need access to common BlkActionState
structure. The only exclusion are backup actions that need
block_job_txn.

Next, for transaction actions of Transaction API is more native to
allocated state structure in the action itself.

So, do the following transformation:

1. Let all actions be represented by a function with corresponding
structure as arguments.

2. Instead of array-map marshaller, let's make a function, that calls
corresponding action directly.

3. BlkActionOps and BlkActionState structures become unused

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20230510150624.310640-7-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

blockdev: use state.bitmap in block-dirty-bitmap-add action

Other bitmap related actions use the .bitmap pointer in .abort action,
let's do same here:

1. It helps further refactoring, as bitmap-add is the only bitmap
   action that uses state.action in .abort

2. It must be safe: transaction actions rely on the fact that on
   .abort() the state is the same as at the end of .prepare(), so that
   in .abort() we could precisely rollback the changes done by
   .prepare().
   The only way to remove the bitmap during transaction should be
   block-dirty-bitmap-remove action, but it postpones actual removal to
   .commit(), so we are OK on any rollback path. (Note also that
   bitmap-remove is the only bitmap action that has .commit() phase,
   except for simple g_free the state on .clean())

3. Again, other bitmap actions behave this way: keep the bitmap pointer
   during the transaction.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20230510150624.310640-6-vsementsov@yandex-team.ru>
[kwolf: Also remove the now unused BlockDirtyBitmapState.prepared]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

blockdev: transaction: refactor handling transaction properties

Only backup supports GROUPED mode. Make this logic more clear. And
avoid passing extra thing to each action.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20230510150624.310640-5-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

blockdev: qmp_transaction: refactor loop to classic for

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510150624.310640-4-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

blockdev: transactions: rename some things

Look at qmp_transaction(): dev_list is not obvious name for list of
actions. Let's look at qapi spec, this argument is "actions". Let's
follow the common practice of using same argument names in qapi scheme
and code.

To be honest, rename props to properties for same reason.

Next, we have to rename global map of actions, to not conflict with new
name for function argument.

Rename also dev_entry loop variable accordingly to new name of the
list.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510150624.310640-3-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

blockdev: refactor transaction to use Transaction API

We are going to add more block-graph modifying transaction actions,
and block-graph modifying functions are already based on Transaction
API.

Next, we'll need to separately update permissions after several
graph-modifying actions, and this would be simple with help of
Transaction API.

So, now let's just transform what we have into new-style transaction
actions.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20230510150624.310640-2-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Revert "arm/kvm: add support for MTE"

This reverts commit b320e21c48ce64853904bea6631c0158cc2ef227,
which accidentally broke TCG, because it made the TCG -cpu max
report the presence of MTE to the guest even if the board hadn't
enabled MTE by wiring up the tag RAM. This meant that if the guest
then tried to use MTE QEMU would segfault accessing the
non-existent tag RAM:

    ==346473==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address (pc 0x55f328952a4a bp 0x00000213a400 sp 0x7f7871859b80 T346476)
    ==346473==The signal is caused by a READ memory access.
    ==346473==Hint: this fault was caused by a dereference of a high value address (see register values below).  Disassemble the provided pc to learn which register was used.
        #0 0x55f328952a4a in address_space_to_flatview /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/include/exec/memory.h:1108:12
        #1 0x55f328952a4a in address_space_translate /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/include/exec/memory.h:2797:31
        #2 0x55f328952a4a in allocation_tag_mem /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/arm-clang/../../target/arm/tcg/mte_helper.c:176:10
        #3 0x55f32895366c in helper_stgm /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/arm-clang/../../target/arm/tcg/mte_helper.c:461:15
        #4 0x7f782431a293  (<unknown module>)

It's also not clear that the KVM logic is correct either:
MTE defaults to on there, rather than being only on if the
board wants it on.

Revert the whole commit for now so we can sort out the issues.

(We didn't catch this in CI because we have no test cases in
avocado that use guests with MTE support.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230519145808.348701-1-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

hw/i386/pc: No need for rtc_state to be an out-parameter

Now that the RTC is created as part of the southbridges it doesn't need
to be an out-parameter any longer.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230519084734.220480-3-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/i386/pc: Create RTC controllers in south bridges

Just like in the real hardware (and in PIIX4), create the RTC
controllers in the south bridges.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230519084734.220480-2-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/cxl: Introduce cxl_device_get_timestamp() utility function

There are new users of this functionality coming shortly so factor
it out from the GET_TIMESTAMP mailbox command handling.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230423162013.4535-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/cxl: rename mailbox return code type from ret_code to CXLRetCode

Given the increasing usage of this mailbox return code type, now
is a good time to switch to QEMU style naming.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230423162013.4535-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/pci-bridge: make building pcie-to-pci bridge configurable

Introduce a CONFIG option to build the pcie-to-pci bridge. No
functional change since it's enabled per default for PCIE_PORT=y.

Signed-off-by: Sebastian Ott <sebott@redhat.com>
Message-Id: <72b6599d-6b27-00b5-aac5-2ebc16a2e023@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

virtio-pci: add handling of PCI ATS and Device-TLB enable/disable

According to PCIe Address Translation Services specification 5.1.3.,
ATS Control Register has Enable bit to enable/disable ATS. Guest may
enable/disable PCI ATS and, accordingly, Device-TLB for the VirtIO PCI
device. So, raise/lower a flag and call a trigger function to pass this
event to a device implementation.

Signed-off-by: Viktor Prutyanov <viktor@daynix.com>
Message-Id: <20230512135122.70403-2-viktor@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/pci-host/pam: Make init_pam() usage more readable

Unlike pam_update() which takes the subject -- PAMMemoryRegion -- as
first argument, init_pam() takes it as fifth (!) argument. This makes it
quite hard to figure out what an init_pam() invocation actually
initializes. By moving the subject to the front this should become
clearer.

While at it, lower the DeviceState parameter to Object, also
communicating more clearly that this parameter is just the owner rather
than some (heavy?) dependency.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230213162004.2797-8-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/i386/pc: Initialize ram_memory variable directly

Going through pc_memory_init() seems quite complicated for a simple
assignment.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230213162004.2797-7-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/i386/pc_{q35,piix}: Minimize usage of get_system_memory()

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230213162004.2797-6-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/i386/pc_{q35,piix}: Reuse MachineClass::desc as SMB product name

No need to repeat the descriptions.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230213162004.2797-5-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/i386/pc_q35: Reuse machine parameter

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230213162004.2797-4-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

hw/pci-host/q35: Inline sysbus_add_io()

sysbus_add_io() just wraps memory_region_add_subregion() while also
obscuring where the memory is attached. So use
memory_region_add_subregion() directly and attach it to the existing
memory region s->mch.address_space_io which is set as an alias to
get_system_io() by the q35 machine.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230213162004.2797-3-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/pci-host/i440fx: Inline sysbus_add_io()

sysbus_add_io() just wraps memory_region_add_subregion() while also
obscuring where the memory is attached. So use
memory_region_add_subregion() directly and attach it to the existing
memory region s->bus->address_space_io which is set as an alias to
get_system_io() by the pc machine.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230213162004.2797-2-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

vhost-vdpa: Add support for vIOMMU.

1. The vIOMMU support will make vDPA can work in IOMMU mode. This
will fix security issues while using the no-IOMMU mode.
To support this feature we need to add new functions for IOMMU MR adds and
deletes.

Also since the SVQ does not support vIOMMU yet, add the check for IOMMU
in vhost_vdpa_dev_start, if the SVQ and IOMMU enable at the same time
the function will return fail.

2. Skip the iova_max check vhost_vdpa_listener_skipped_section(). While
MR is IOMMU, move this check to vhost_vdpa_iommu_map_notify()

Verified in vp_vdpa and vdpa_sim_net driver

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20230510054631.2951812-5-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost-vdpa: Add check for full 64-bit in region delete

The unmap ioctl doesn't accept a full 64-bit span. So need to
add check for the section's size in vhost_vdpa_listener_region_del().

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20230510054631.2951812-4-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost_vdpa: fix the input in trace_vhost_vdpa_listener_region_del()

In trace_vhost_vdpa_listener_region_del, the value for llend
should change to int128_get64(int128_sub(llend, int128_one()))

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20230510054631.2951812-3-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost: expose function vhost_dev_has_iommu()

To support vIOMMU in vdpa, need to exposed the function
vhost_dev_has_iommu, vdpa will use this function to check
if vIOMMU enable.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20230510054631.2951812-2-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

virtio-crypto: fix NULL pointer dereference in virtio_crypto_free_request

Ensure op_info is not NULL in case of QCRYPTODEV_BACKEND_ALG_SYM algtype.

Fixes: 0e660a6f90a ("crypto: Introduce RSA algorithm")
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reported-by: Yiming Tao <taoym@zju.edu.cn>
Message-Id: <20230509075317.1132301-1-mcascell@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: zhenwei pi<pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

virtio-net: not enable vq reset feature unconditionally

The commit 93a97dc5200a ("virtio-net: enable vq reset feature") enables
unconditionally vq reset feature as long as the device is emulated.
This makes impossible to actually disable the feature, and it causes
migration problems from qemu version previous than 7.2.

The entire final commit is unneeded as device system already enable or
disable the feature properly.

This reverts commit 93a97dc5200a95e63b99cb625f20b7ae802ba413.
Fixes: 93a97dc5200a ("virtio-net: enable vq reset feature")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230504101447.389398-1-eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost-user: Remove acpi-specific memslot limit

Let's just support 512 memslots on x86-64 and aarch64 as well. The maximum
number of ACPI slots (256) is no longer completely expressive ever since
we supported virtio-based memory devices. Further, we're completely
ignoring other memslots used outside of memory device context, such as
memslots used for boot memory.

Note that the vhost memslot limit in the kernel is usually configured to
be 509. With this change, we prepare vhost-user on the QEMU side to be
closer to that limit, to eventually support ~512 memslots in most vhost
implementations and have less "surprises" when cold/hotplugging vhost
devices while also consuming more memslots than we're currently used to
by memory devices (e.g., once virtio-mem starts using multiple memslots).

Note that most vhost-user implementations only support a small number of
memslots so far, which we can hopefully improve in the near future.

We'll leave the PPC special-case as is for now.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230503184144.808478-1-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

virtio-mem: Default to "unplugged-inaccessible=on" with 8.1 on x86-64

Allowing guests to read unplugged memory simplified the bring-up of
virtio-mem in Linux guests -- which was limited to x86-64 only. On arm64
(which was added later), we never had legacy guests and don't even allow
to configure it, essentially always having "unplugged-inaccessible=on".

At this point, all guests we care about
should be supporting VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, so let's
change the default for the 8.1 machine.

This change implies that also memory that supports the shared zeropage
(private anonymous memory) will now require
VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE in the driver in order to be usable by
the guest -- as default, one can still manually set the
unplugged-inaccessible property.

Disallowing the guest to read unplugged memory will be important for
some future features, such as memslot optimizations or protection of
unplugged memory, whereby we'll actually no longer allow the guest to
even read from unplugged memory.

At some point, we might want to deprecate and remove that property.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <eduardo@habkost.net>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230503182352.792458-1-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/pci: Disable PCI_ERR_UNCOR_MASK register for machine type < 8.0

Since it's implementation on v8.0.0-rc0, having the PCI_ERR_UNCOR_MASK
set for machine types < 8.0 will cause migration to fail if the target
QEMU version is < 8.0.0 :

qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10a read: 40 device: 0 cmask: ff wmask: 0 w1cmask:0
qemu-system-x86_64: Failed to load PCIDevice:config
qemu-system-x86_64: Failed to load e1000e:parent_obj
qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:02.0/e1000e'
qemu-system-x86_64: load of migration failed: Invalid argument

The above test migrated a 7.2 machine type from QEMU master to QEMU 7.2.0,
with this cmdline:

./qemu-system-x86_64 -M pc-q35-7.2 [-incoming XXX]

In order to fix this, property x-pcie-err-unc-mask was introduced to
control when PCI_ERR_UNCOR_MASK is enabled. This property is enabled by
default, but is disabled if machine type <= 7.2.

Fixes: 010746ae1d ("hw/pci/aer: Implement PCI_ERR_UNCOR_MASK register")
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Message-Id: <20230503002701.854329-1-leobras@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1576
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost-user: send SET_STATUS 0 after GET_VRING_BASE

Setting the VIRTIO Device Status Field to 0 resets the device. The
device's state is lost, including the vring configuration.

vhost-user.c currently sends SET_STATUS 0 before GET_VRING_BASE. This
risks confusion about the lifetime of the vhost-user state (e.g. vring
last_avail_idx) across VIRTIO device reset.

Eugenio Pérez <eperezma@redhat.com> adjusted the order for vhost-vdpa.c
in commit c3716f260bff ("vdpa: move vhost reset after get vring base")
and in that commit description suggested doing the same for vhost-user
in the future.

Go ahead and adjust vhost-user.c now. I ran various online code searches
to identify vhost-user backends implementing SET_STATUS. It seems only
DPDK implements SET_STATUS and Yajun Wu <yajunw@nvidia.com> has
confirmed that it is safe to make this change.

Fixes: commit 923b8921d210763359e96246a58658ac0db6c645 ("vhost-user: Support vhost_dev_start")
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Cindy Lu <lulu@redhat.com>
Cc: Yajun Wu <yajunw@nvidia.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230501230409.274178-1-stefanha@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

pci: pci_add_option_rom(): refactor: use g_autofree for path variable

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230515125229.44836-3-vsementsov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>

pci: pci_add_option_rom(): improve style

Fix over-80 lines and missing curly brackets for if-operators, which
are required by QEMU coding style.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230515125229.44836-2-vsementsov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>

ACPI: bios-tables-test.c step 5 (update expected table binaries)

Following the guidelines in tests/qtest/bios-tables-test.c, this
is step 5 and 6.

An examination of all the files impacted (as listed in
bios-tables-test-allowe-diff.h) shows only the MADT/APIC tables
bumping revision from 1 to 3, and a corresponding change to
the checksum. The below diff is typical:

--- /tmp/asl-1F9641.dsl 2023-05-16 15:18:31.292579156 -0400
+++ /tmp/asl-GVD741.dsl 2023-05-16 15:18:31.291579149 -0400
@@ -1,32 +1,32 @@
  /*
   * Intel ACPI Component Architecture
   * AML/ASL+ Disassembler version 20230331 (64-bit version)
   * Copyright (c) 2000 - 2023 Intel Corporation
   *
- * Disassembly of tests/data/acpi/pc/APIC, Tue May 16 15:18:31 2023
+ * Disassembly of /tmp/aml-R4D741, Tue May 16 15:18:31 2023
   *
   * ACPI Data Table [APIC]
   *
   * Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue (in hex)
   */

  [000h 0000 004h]                   Signature : "APIC"    [Multiple APIC Description Table (MADT)]
  [004h 0004 004h]                Table Length : 00000078
-[008h 0008 001h]                    Revision : 01
-[009h 0009 001h]                    Checksum : 8A
+[008h 0008 001h]                    Revision : 03
+[009h 0009 001h]                    Checksum : 88
  [00Ah 0010 006h]                      Oem ID : "BOCHS "
  [010h 0016 008h]                Oem Table ID : "BXPC    "
  [018h 0024 004h]                Oem Revision : 00000001
  [01Ch 0028 004h]             Asl Compiler ID : "BXPC"
  [020h 0032 004h]       Asl Compiler Revision : 00000001

  [024h 0036 004h]          Local Apic Address : FEE00000
  [028h 0040 004h]       Flags (decoded below) : 00000001
                           PC-AT Compatibility : 1

  [02Ch 0044 001h]               Subtable Type : 00 [Processor Local APIC]
  [02Dh 0045 001h]                      Length : 08
  [02Eh 0046 001h]                Processor ID : 00
  [02Fh 0047 001h]               Local Apic ID : 00
  [030h 0048 004h]       Flags (decoded below) : 00000001
                             Processor Enabled : 1
@@ -81,24 +81,24 @@
  [06Bh 0107 001h]                      Source : 0B
  [06Ch 0108 004h]                   Interrupt : 0000000B
  [070h 0112 002h]       Flags (decoded below) : 000D
                                      Polarity : 1
                                  Trigger Mode : 3

  [072h 0114 001h]               Subtable Type : 04 [Local APIC NMI]
  [073h 0115 001h]                      Length : 06
  [074h 0116 001h]                Processor ID : FF
  [075h 0117 002h]       Flags (decoded below) : 0000
                                      Polarity : 0
                                  Trigger Mode : 0
  [077h 0119 001h]        Interrupt Input LINT : 01

  Raw Table Data: Length 120 (0x78)

-    0000: 41 50 49 43 78 00 00 00 01 8A 42 4F 43 48 53 20  // APICx.....BOCHS
+    0000: 41 50 49 43 78 00 00 00 03 88 42 4F 43 48 53 20  // APICx.....BOCHS
      0010: 42 58 50 43 20 20 20 20 01 00 00 00 42 58 50 43  // BXPC    ....BXPC
      0020: 01 00 00 00 00 00 E0 FE 01 00 00 00 00 08 00 00  // ................
      0030: 01 00 00 00 01 0C 00 00 00 00 C0 FE 00 00 00 00  // ................
      0040: 02 0A 00 00 02 00 00 00 00 00 02 0A 00 05 05 00  // ................
      0050: 00 00 0D 00 02 0A 00 09 09 00 00 00 0D 00 02 0A  // ................
      0060: 00 0A 0A 00 00 00 0D 00 02 0A 00 0B 0B 00 00 00  // ................
      0070: 0D 00 04 06 FF 00 00 01                          // ........

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Message-Id: <20230517162545.2191-4-eric.devolder@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Ani Sinha <anisinha@redhat.com>

ACPI: i386: bump to MADT to revision 3

Currently i386 QEMU generates MADT revision 3, and reports
MADT revision 1. Set .revision to 3 to match reality.

Link: https://lore.kernel.org/linux-acpi/20230327191026.3454-1-eric.devolder@ora
cle.com/T/#t
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230517162545.2191-3-eric.devolder@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>

ACPI: bios-tables-test.c step 2 (allowed-diff entries)

Following the guidelines in tests/qtest/bios-tables-test.c,
set up bios-tables-test-allowed-diff.h to ignore the
imminent changes to the APIC tables, per step 2.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Message-Id: <20230517162545.2191-2-eric.devolder@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Ani Sinha <ani@anisinha.ca>

hw/cxl: Multi-Region CXL Type-3 Devices (Volatile and Persistent)

This commit enables each CXL Type-3 device to contain one volatile
memory region and one persistent region.

Two new properties have been added to cxl-type3 device initialization:
[volatile-memdev] and [persistent-memdev]

The existing [memdev] property has been deprecated and will default the
memory region to a persistent memory region (although a user may assign
the region to a ram or file backed region). It cannot be used in
combination with the new [persistent-memdev] property.

Partitioning volatile memory from persistent memory is not yet supported.

Volatile memory is mapped at DPA(0x0), while Persistent memory is mapped
at DPA(vmem->size), per CXL Spec 8.2.9.8.2.0 - Get Partition Info.

Signed-off-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230421160827.2227-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/mem: Use memory_region_size() in cxl_type3

Accessors prefered over direct use of int128_get64() as they
clamp out of range values. None are expected here but
cleaner to always use the accessor than mix and match.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230421160827.2227-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>

tests/qtest/cxl-test: whitespace, line ending cleanup

Defines are starting to exceed line length limits, align them for
cleanliness before making modifications.

Signed-off-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230421160827.2227-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/cxl: Fix incorrect reset of commit and associated clearing of committed.

The hardware clearing the commit bit is not spec compliant.
Clearing of committed bit when commit is cleared is not specifically
stated in the CXL spec, but is the expected (and simplest) permitted
behaviour so use that for QEMU emulation.

Reviewed-by: Fan Ni <fan.ni@samsung.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
--
v2: Picked up tags.
Message-Id: <20230421135906.3515-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/cxl: Fix endian handling for decoder commit.

Not a real problem yet as all supported architectures are
little endian, but continue to tidy these up when touching
code for other reasons.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230421135906.3515-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/cxl: drop pointless memory_region_transaction_guards

Not clear what intent was here, but probably based on a misunderstanding
of what these guards are for.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230421135906.3515-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

docs/cxl: Replace unsupported AARCH64 with x86_64

Currently Qemu CXL emulation support is not availabe on AARCH64 but its
available with qemu x86_64 architecture, updating the document to reflect
the supported platform.

Signed-off-by: Raghu H <raghuhack78@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230421134507.26842-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

docs/cxl: Remove incorrect CXL type 3 size parameter

cxl-type3 memory size is read directly from the provided memory backed end
device. Remove non existent size option

Signed-off-by: Raghu H <raghuhack78@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230421134507.26842-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

docs/cxl: fix some typos

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230421134507.26842-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/cxl: cdat: Fix failure to free buffer in erorr paths

The failure paths in CDAT file loading did not clear up properly.
Change to using g_auto_free and a local pointer for the buffer to
ensure this function has no side effects on error.
Also drop some unnecessary checks that can not fail.

Cleanup properly after a failure to load a CDAT file.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230421132020.7408-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/cxl: cdat: Fix open file not closed in ct3_load_cdat()

Open file descriptor not closed in error paths. Fix by replace
open coded handling of read of whole file into a buffer with
g_file_get_contents()

Fixes: aba578bdac ("hw/cxl: CDAT Data Object Exchange implementation")
Signed-off-by: Zeng Hao <zenghao@kylinos.cn>
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Suggested-by: Jonathan Cameron via <qemu-devel@nongnu.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
--
Changes since v5:
- Drop if guard on g_free() as per checkpatch warning.
Message-Id: <20230421132020.7408-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost: fix possible wrap in SVQ descriptor ring

QEMU invokes vhost_svq_add() when adding a guest's element
into SVQ. In vhost_svq_add(), it uses vhost_svq_available_slots()
to check whether QEMU can add the element into SVQ. If there is
enough space, then QEMU combines some out descriptors and some
in descriptors into one descriptor chain, and adds it into
`svq->vring.desc` by vhost_svq_vring_write_descs().

Yet the problem is that, `svq->shadow_avail_idx - svq->shadow_used_idx`
in vhost_svq_available_slots() returns the number of occupied elements,
or the number of descriptor chains, instead of the number of occupied
descriptors, which may cause wrapping in SVQ descriptor ring.

Here is an example. In vhost_handle_guest_kick(), QEMU forwards
as many available buffers to device by virtqueue_pop() and
vhost_svq_add_element(). virtqueue_pop() returns a guest's element,
and then this element is added into SVQ by vhost_svq_add_element(),
a wrapper to vhost_svq_add(). If QEMU invokes virtqueue_pop() and
vhost_svq_add_element() `svq->vring.num` times,
vhost_svq_available_slots() thinks QEMU just ran out of slots and
everything should work fine. But in fact, virtqueue_pop() returns
`svq->vring.num` elements or descriptor chains, more than
`svq->vring.num` descriptors due to guest memory fragmentation,
and this causes wrapping in SVQ descriptor ring.

This bug is valid even before marking the descriptors used.
If the guest memory is fragmented, SVQ must add chains
so it can try to add more descriptors than possible.

This patch solves it by adding `num_free` field in
VhostShadowVirtqueue structure and updating this field
in vhost_svq_add() and vhost_svq_get_buf(), to record
the number of free descriptors.

Fixes: 100890f7ca ("vhost: Shadow virtqueue buffers forwarding")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230509084817.3973-1-yin31149@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>

Merge tag 'pull-hex-20230518-1' of https://github.com/quic/qemu into staging

Hexagon update

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEENjXHiM5iuR/UxZq0ewJE+xLeRCIFAmRmgQgACgkQewJE+xLe
# RCJLtAf8C/0kQRa4mjnbsztXuFyca53UxAv3BSBEDla4ZcMfFBoVJsGB3OP7IPXd
# KBQpkLyJAVye9idex5xqdp9nIfoGKDTsc6YtCfGujZ17cDpzLRDpHdUTex8PcZYK
# wpfM3hoVJsYRBMsojZ4OaxatjFQ+FWzrIH6FcgH086Q8TH4w9dZLNEJzHC4lOj0s
# 7qOuw2tgm+vOVlzsk/fv6/YD/BTeZTON3jgTPvAnvdRLb/482UpM9JkJ8E4rbte3
# Ss5PUK8QTQHU0yamspGy/PfsYxiptM+jIWGd836fAGzwF12Ug27mSc1enndRtQVW
# pQTdnOnWuuRzOwEpd7x3xh9upACm4g==
# =1CyJ
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 18 May 2023 12:48:24 PM PDT
# gpg:                using RSA key 3635C788CE62B91FD4C59AB47B0244FB12DE4422
# gpg: Good signature from "Taylor Simpson (Rock on) <tsimpson@quicinc.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 3635 C788 CE62 B91F D4C5  9AB4 7B02 44FB 12DE 4422

* tag 'pull-hex-20230518-1' of https://github.com/quic/qemu: (44 commits)
  Hexagon (linux-user/hexagon): handle breakpoints
  Hexagon (gdbstub): add HVX support
  Hexagon (gdbstub): fix p3:0 read and write via stub
  Hexagon: add core gdbstub xml data for LLDB
  gdbstub: add test for untimely stop-reply packets
  gdbstub: only send stop-reply packets when allowed to
  Remove test_vshuff from hvx_misc tests
  Hexagon (decode): look for pkts with multiple insns at the same slot
  Hexagon (iclass): update J4_hintjumpr slot constraints
  Hexagon: append eflags to unknown cpu model string
  Hexagon: list available CPUs with `-cpu help`
  Hexagon (target/hexagon/*.py): raise exception on reg parsing error
  target/hexagon: fix = vs. == mishap
  Hexagon (target/hexagon) Additional instructions handled by idef-parser
  Hexagon (target/hexagon) Move items to DisasContext
  Hexagon (target/hexagon) Move pkt_has_store_s1 to DisasContext
  Hexagon (target/hexagon) Move pred_written to DisasContext
  Hexagon (target/hexagon) Move new_pred_value to DisasContext
  Hexagon (target/hexagon) Move new_value to DisasContext
  Hexagon (target/hexagon) Make special new_value for USR
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Hexagon (linux-user/hexagon): handle breakpoints

This enables LLDB to work with hexagon linux-user mode through the GDB
remote protocol.

Helped-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <c287a129dcbe7d974d8b7608e8672d34a3c91c04.1683214375.git.quic_mathbern@quicinc.com>

Hexagon (gdbstub): add HVX support

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Co-authored-by: Brian Cain <bcain@quicinc.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>
Co-authored-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <17cb32f34d469f705c3cc066a3583935352ee048.1683214375.git.quic_mathbern@quicinc.com>

Hexagon (gdbstub): fix p3:0 read and write via stub

Signed-off-by: Brian Cain <bcain@quicinc.com>
Co-authored-by: Sid Manning <sidneym@quicinc.com>
Signed-off-by: Sid Manning <sidneym@quicinc.com>
Co-authored-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <32e7de567cdae184a6781644454bbb19916c955b.1683214375.git.quic_mathbern@quicinc.com>

Hexagon: add core gdbstub xml data for LLDB

Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <d25a3a79334d81f0e1ecfb438b6ee82585d02dc4.1683214375.git.quic_mathbern@quicinc.com>

gdbstub: add test for untimely stop-reply packets

In the previous commit, we modified gdbstub.c to only send stop-reply
packets as a response to GDB commands that accept it. Now, let's add a
test for this intended behavior. Running this test before the fix from
the previous commit fails as QEMU sends a stop-reply packet
asynchronously, when GDB was in fact waiting an ACK.

Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <a30d93b9a8d66e9d9294354cfa2fc3af35f00202.1683214375.git.quic_mathbern@quicinc.com>

gdbstub: only send stop-reply packets when allowed to

GDB's remote serial protocol allows stop-reply messages to be sent by
the stub either as a notification packet or as a reply to a GDB command
(provided that the cmd accepts such a response). QEMU currently does not
implement notification packets, so it should only send stop-replies
synchronously and when requested. Nevertheless, it still issues
unsolicited stop messages through gdb_vm_state_change().

Although this behavior doesn't seem to cause problems with GDB itself
(the messages are just ignored), it can impact other debuggers that
implement the GDB remote serial protocol, like hexagon-lldb. Let's
change the gdbstub to send stop messages only as a response to a
previous GDB command that accepts such a reply.

Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <a49c0897fc22a6a7827c8dfc32aef2e1d933ec6b.1683214375.git.quic_mathbern@quicinc.com>

Remove test_vshuff from hvx_misc tests

test_vshuff checks that the vshuff instruction works correctly when
both vector registers are the same. Using vshuff in this way is
undefined and will be rejected by the compiler in a future version of
the toolchain.

Signed-off-by: Marco Liebel <quic_mliebel@quicinc.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Tested-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <20230509184231.2467626-1-quic_mliebel@quicinc.com>

Hexagon (decode): look for pkts with multiple insns at the same slot

Each slot in a packet can be assigned to at most one instruction.
Although the assembler generally ought to enforce this rule, we better
be safe than sorry and also do some check to properly throw an "invalid
packet" exception on wrong slot assignments.

This should also make it easier to debug possible future errors caused
by missing updates to `find_iclass_slots()` rules in
target/hexagon/iclass.c.

Co-authored-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Tested-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <f8b829443523568823d062adf8bf6659bc6d4a3f.1683552984.git.quic_mathbern@quicinc.com>

Hexagon (iclass): update J4_hintjumpr slot constraints

The Hexagon PRM says that "The assembler automatically encodes
instructions in the packet in the proper order. In the binary encoding
of a packet, the instructions must be ordered from Slot 3 down to
Slot 0."

Prior to the architecture version v73, the slot constraints from
instruction "hintjr" only allowed it to be executed at slot 2.
With that in mind, consider the packet:

    {
        hintjr(r0)
        nop
        nop
        if (!p0) memd(r1+#0) = r1:0
    }

To satisfy the ordering rule quoted from the PRM, the assembler would,
thus, move one of the nops to the first position, so that it can be
assigned to slot 3 and the subsequent hintjr to slot 2.

However, since v73, hintjr can be executed at either slot 2 or 3. So
there is no need to reorder that packet and the assembler will encode it
as is. When QEMU tries to execute it, however, we end up hitting a
"misaliged store" exception because both the store and the hintjr will
be assigned to store 0, and some functions like `slot_is_predicated()`
expect the decode machinery to assign only one instruction per slot. In
particular, the mentioned function will traverse the packet until it
finds the first instruction at the desired slot which, for slot 0, will
be hintjr. Since hintjr is not predicated, the result is that we try to
execute the store regardless of the predicate. And because the predicate
is false, we had not previously loaded hex_store_addr[0] or
hex_store_width[0]. As a result, the store will decide de width based on
trash memory, causing it to be misaligned.

Update the slot constraints for hintjr so that QEMU can properly handle
such encodings.

Note: to avoid similar-but-not-identical issues in the future, we should
look for multiple instructions at the same slot during decoding time and
throw an invalid packet exception. That will be done in the subsequent
commit.

Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <0fcd8293642c6324119fbbab44741164bcbd04fb.1673616964.git.quic_mathbern@quicinc.com>

Hexagon: append eflags to unknown cpu model string

Running qemu-hexagon with a binary that was compiled for an arch version
unknown by qemu can produce a somewhat confusing message:

qemu-hexagon: unable to find CPU model 'unknown'

Let's give a bit more info by appending the eflags so that the message
becomes:

qemu-hexagon: unable to find CPU model 'unknown (0x69)'

Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Tested-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <8a8d013cc619b94fd4fb577ae6a8df26cedb972b.1683225804.git.quic_mathbern@quicinc.com>

Hexagon: list available CPUs with `-cpu help`

Currently, qemu-hexagon only models the v67 cpu. Nonetheless if we try
to get this information with `-cpu help`, qemu just exists with an error
code and no output. Let's correct that.

The code is basically a copy from target/alpha/cpu.h, but we strip the
"-hexagon-cpu" suffix before printing. This is to avoid confusing
situations like the following:

    $ qemu-hexagon -cpu help

    Available CPUs:
      v67-hexagon-cpu

    $ qemu-hexagon -cpu v67-hexagon-cpu ./prog

    qemu-hexagon: unable to find CPU model 'v67-hexagon-cpu'

Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Tested-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <b946e17c7e17eed9095700b54c5ead36e5d55dfa.1683225804.git.quic_mathbern@quicinc.com>

Hexagon (target/hexagon/*.py): raise exception on reg parsing error

Currently, the python scripts used for the hexagon building will not
abort the compilation when there is an error parsing a register. Let's
make the compilation properly fail in such cases by rasing an exception
instead of just printing a warning message, which might get lost in the
output.

This patch was generated with:

git grep -l "Bad register" *hexagon* | \
xargs sed -i "" -e 's/print("Bad register parse: "[, ]*$[^)]*$)/hex_common.bad_register(\1)/g'

Plus the bad_register() helper added to hex_common.py.

Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Tested-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <1f5dbd92f68fdd89e2647e4ba527a2c32cf0f070.1683217043.git.quic_mathbern@quicinc.com>

target/hexagon: fix = vs. == mishap

**** Changes in v2 ****
Fix yyassert's for sign and zero extends

Coverity reports a parameter that is "set but never used". This is caused
by an assignment operator being used instead of equality.

Co-authored-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Tested-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230428204411.1400931-1-tsimpson@quicinc.com>

Hexagon (target/hexagon) Additional instructions handled by idef-parser

**** Changes in v3 ****
Fix bugs exposed by dpmpyss_rnd_s0 instruction
    Set correct size/signedness for constants
    Test cases added to tests/tcg/hexagon/misc.c

**** Changes in v2 ****
Fix bug in imm_print identified in clang build

Currently, idef-parser skips all floating point instructions.  However,
there are some floating point instructions that can be handled.

The following instructions are now parsed
    F2_sfimm_p
    F2_sfimm_n
    F2_dfimm_p
    F2_dfimm_n
    F2_dfmpyll
    F2_dfmpylh

To make these instructions work, we fix some bugs in parser-helpers.c
    gen_rvalue_extend
    gen_cast_op
    imm_print
    lexer properly sets size/signedness of constants

Test cases added to tests/tcg/hexagon/fpstuff.c

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Tested-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230501203125.4025991-1-tsimpson@quicinc.com>

Hexagon (target/hexagon) Move items to DisasContext

The following items in the CPUHexagonState are only used for bookkeeping
within the translation of a packet.  With recent changes that eliminate
the need to free TCGv variables, these make more sense to be transient
and kept in DisasContext.

The following items are moved
    dczero_addr
    branch_taken
    this_PC

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-22-tsimpson@quicinc.com>

Hexagon (target/hexagon) Move pkt_has_store_s1 to DisasContext

The pkt_has_store_s1 field is only used for bookkeeping helpers with
a load. With recent changes that eliminate the need to free TCGv
variables, it makes more sense to make this transient.

These helpers already take the instruction slot as an argument. We
combine the slot and pkt_has_store_s1 into a single argument called
slotval.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-21-tsimpson@quicinc.com>

Hexagon (target/hexagon) Move pred_written to DisasContext

The pred_written variable in the CPUHexagonState is only used for
bookkeeping within the translation of a packet. With recent changes
that eliminate the need to free TCGv variables, these make more sense
to be transient and kept in DisasContext.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-20-tsimpson@quicinc.com>

Hexagon (target/hexagon) Move new_pred_value to DisasContext

The new_pred_value array in the CPUHexagonState is only used for
bookkeeping within the translation of a packet. With recent changes
that eliminate the need to free TCGv variables, these make more sense
to be transient and kept in DisasContext.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-19-tsimpson@quicinc.com>

Hexagon (target/hexagon) Move new_value to DisasContext

The new_value array in the CPUHexagonState is only used for bookkeeping
within the translation of a packet. With recent changes that eliminate
the need to free TCGv variables, these make more sense to be transient
and kept in DisasContext.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-18-tsimpson@quicinc.com>

Hexagon (target/hexagon) Make special new_value for USR

Precursor to moving new_value from the global state to DisasContext

USR will need to stay in the global state because some helpers will
set it's value

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-17-tsimpson@quicinc.com>

Hexagon (target/hexagon) Add overrides for disabled idef-parser insns

The following have overrides
    S2_insert
    S2_insert_rp
    S2_asr_r_svw_trun
    A2_swiz

These instructions have semantics that write to the destination
before all the operand reads have been completed.  Therefore,
the idef-parser versions were disabled with the short-circuit patch.

Test cases added to tests/tcg/hexagon/read_write_overlap.c

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-16-tsimpson@quicinc.com>

Hexagon (target/hexagon) Short-circuit more HVX single instruction packets

The generated helpers for HVX use pass-by-reference, so they can't
short-circuit when the reads/writes overlap. The instructions with
overrides are OK because they use tcg_gen_gvec_*.

We add a flag has_hvx_helper to DisasContext and extend gen_analyze_funcs
to set the flag when the instruction is an HVX instruction with a
generated helper.

We add an override for V6_vcombine so that it can be short-circuited
along with a test case in tests/tcg/hexagon/hvx_misc.c

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-15-tsimpson@quicinc.com>

Hexagon (target/hexagon) Short-circuit packet HVX writes

In certain cases, we can avoid the overhead of writing to future_VRegs
and write directly to VRegs. We consider HVX reads/writes when computing
ctx->need_commit. Then, we can early-exit from gen_commit_hvx.

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-14-tsimpson@quicinc.com>

Hexagon (target/hexagon) Short-circuit packet predicate writes

In certain cases, we can avoid the overhead of writing to hex_new_pred_value
and write directly to hex_pred.  We consider predicate reads/writes when
computing ctx->need_commit.  The get_result_pred() function uses this
field to decide between hex_new_pred_value and hex_pred.  Then, we can
early-exit from gen_pred_writes.

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-13-tsimpson@quicinc.com>