Fix a memory leak in ide_register_restart_cb() in hw/ide/core.c and add
idebus_unrealize() in hw/ide/qdev.c to have calls to
qemu_del_vm_change_state_handler() to deal with the dangling change
state handler during hot-unplugging ide devices which might lead to a
crash.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com> Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1474995212-10580-1-git-send-email-ashijeetacharya@gmail.com
[Minor whitespace fix --js] Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit ca44141d5fb801dd5903102acefd0f2d8e8bb6a1) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The isa_register_portio_list() function allocates ioports
data/state. Let's keep the reference to this data on some owner. This
isn't enough to fix leaks, but at least, ASAN stops complaining of
direct leaks. Further cleanup would require calling
portio_list_del/destroy().
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e305a16510afa74eec20390479e349402e55ef4c) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
John Snow [Mon, 14 Nov 2016 16:15:54 +0000 (11:15 -0500)]
block-backend: Always notify on blk_eject
blk_eject is only used by scsi-disk and atapi, and in both cases we
only attempt to invoke blk_eject if we have a bona-fide change in
tray state.
The "issue" here is that the tray state does not generate a QMP event
unless there is a medium/BDS attached to the device, so if libvirt et al
are waiting for a tray event to occur from an empty-but-closed drive,
software opening that drive will not emit an event and libvirt will
wait forever.
Change this by modifying blk_eject to always emit an event, instead of
conditionally on a "real" backend eject.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1373264 Reported-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1478553214-497-2-git-send-email-jsnow@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit c47ee043dc2cc85da710e87524144a720598c096)
Mark Cave-Ayland [Thu, 27 Oct 2016 20:29:13 +0000 (16:29 -0400)]
dma-helpers: explicitly pass alignment into DMA helpers
The hard-coded default alignment is BDRV_SECTOR_SIZE, however this is not
necessarily the case for all platforms. Use this as the default alignment for
all current callers.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: John Snow <jsnow@redhat.com>
Message-id: 1476445266-27503-2-git-send-email-mark.cave-ayland@ilande.co.uk Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 99868af3d0a75cf6a515a9aa81bf0d7bcb39eadb) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
John Snow [Mon, 14 Nov 2016 16:15:54 +0000 (11:15 -0500)]
atapi: classify read_cd as conditionally returning data
For the purposes of byte_count_limit verification, add a new flag that
identifies read_cd as sometimes returning data, then check the BCL in
its command handler after we know that it will indeed return data.
Reported-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1477970211-25754-2-git-send-email-jsnow@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit e7bd708ec85e40fd51569bb90c52d6613ffd8f45) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Stefan Hajnoczi [Wed, 14 Dec 2016 14:25:18 +0000 (14:25 +0000)]
ui/gtk: fix "Copy" menu item segfault
The "Copy" menu item copies VTE terminal text to the clipboard. This
only works with VTE terminals, not with graphics consoles.
Disable the menu item when the current notebook page isn't a VTE
terminal.
This patch fixes a segfault. Reproducer: Start QEMU and click the Copy
menu item when the guest display is visible.
Reported-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Tested-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20161214142518.10504-1-stefanha@redhat.com Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a08156321ab9a7d2fed9ee77dbfeea2a61ffd153) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Introductory comment for rtl8168 VFIO MSI-X quirk states:
At BAR2 offset 0x70 there is a dword data register,
offset 0x74 is a dword address register.
vfio: vfio_bar_read(0000:05:00.0:BAR2+0x70, 4) = 0xfee00398 // read data
Thus, correct offset for data read is 0x70,
but function vfio_rtl8168_quirk_data_read() wrongfully uses offset 0x74.
Signed-off-by: Thorsten Kohfeldt <thorsten.kohfeldt@gmx.de> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
(cherry picked from commit 31e6a7b17b35711eb44f0e686b5ba68d15bfe4c1) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Lin Ma [Thu, 15 Sep 2016 14:31:58 +0000 (22:31 +0800)]
msmouse: Fix segfault caused by free the chr before chardev cleanup.
Segfault happens when leaving qemu with msmouse backend:
#0 0x00007fa8526ac975 in raise () at /lib64/libc.so.6
#1 0x00007fa8526add8a in abort () at /lib64/libc.so.6
#2 0x0000558be78846ab in error_exit (err=16, msg=0x558be799da10 ...
#3 0x0000558be7884717 in qemu_mutex_destroy (mutex=0x558be93be750) at ...
#4 0x0000558be7549951 in qemu_chr_free_common (chr=0x558be93be750) at ...
#5 0x0000558be754999c in qemu_chr_free (chr=0x558be93be750) at ...
#6 0x0000558be7549a20 in qemu_chr_delete (chr=0x558be93be750) at ...
#7 0x0000558be754a8ef in qemu_chr_cleanup () at qemu-char.c:4643
#8 0x0000558be755843e in main (argc=5, argv=0x7ffe925d7118, ...
The chr was freed by msmouse close callback before chardev cleanup,
Then qemu_mutex_destroy triggered raise().
Because freeing chr is handled by qemu_chr_free_common, Remove the free from
msmouse_chr_close to avoid double free.
Fixes: c1111a24a3358ecd2f17be7c8b117cfe8bc5e5f8 Cc: qemu-stable@nongnu.org Signed-off-by: Lin Ma <lma@suse.com>
Message-Id: <20160915143158.4796-1-lma@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9e14037f05e99ca3b8a33d8be9a2a636bbf09326) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Initialization of memory backends may take a while when
prealloc=yes is used, depending on their size. Initializing
memory backends before chardevs may delay the creation of monitor
sockets, and trigger timeouts on management software that waits
until the monitor socket is created by QEMU. See, for example,
the bug report at:
https://bugzilla.redhat.com/show_bug.cgi?id=1371211
In addition to that, allocating memory before calling
configure_accelerator() breaks the tcg_enabled() checks at
memory_region_init_*().
This patch fixes those problems by adding "memory-backend-*"
classes to the delayed-initialization list.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 6546d0dba6c211c1a3eac1252a4f50a0c151a08a) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
vhost-user-test: Use libqos instead of pxe-virtio.rom
vhost-user-test relies on iPXE just to initialize the virtio-net
device, and doesn't do any actual packet tx/rx testing.
In addition to that, the test relies on TCG, which is
imcompatible with vhost. The test only worked by accident: a bug
the memory backend initialization made memory regions not have
the DIRTY_MEMORY_CODE bit set in dirty_log_mask.
This changes vhost-user-test to initialize the virtio-net device
using libqos, and not use TCG nor pxe-virtio.rom.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit cdafe929615ec5eca71bcd5a3d12bab5678e5886) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Peter Xu [Tue, 29 Nov 2016 05:43:40 +0000 (13:43 +0800)]
intel_iommu: fix incorrect device invalidate
"mask" needs to be inverted before use.
Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 6cb99acc2808cc41e2d772a23e9cc564515535cc) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Peter Xu [Fri, 25 Nov 2016 02:55:22 +0000 (10:55 +0800)]
pci-assign: sync MSI/MSI-X cap and table with PCIDevice
Since commit e1d4fb2d ("kvm-irqchip: x86: add msi route notify fn"),
kvm_irqchip_add_msi_route() starts to use pci_get_msi_message() to fetch
MSI info. This requires that we setup MSI related fields in PCIDevice.
For most devices, that won't be a problem, as long as we are using
general interfaces like msi_init()/msix_init().
However, for pci-assign devices, MSI/MSI-X is treated differently - PCI
assign devices are maintaining its own MSI table and cap information in
AssignedDevice struct. however that's not synced up with PCIDevice's
fields. That will leads to pci_get_msi_message() failed to find correct
MSI capability, even with an NULL msix_table.
A quick fix is to sync up the two places: both the capability bits and
table address for MSI/MSI-X.
Reported-by: Changlimin <changlimin@h3c.com> Tested-by: Changlimin <changlimin@h3c.com> Cc: qemu-stable@nongnu.org Fixes: e1d4fb2d ("kvm-irqchip: x86: add msi route notify fn") Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1480042522-16551-1-git-send-email-peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 64e184e2608d3c93dda1bba8ae6dc2185b5228fb) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Zhuang Yanying [Thu, 17 Nov 2016 12:31:03 +0000 (20:31 +0800)]
ivshmem: Fix 64 bit memory bar configuration
Device ivshmem property use64=0 is designed to make the device
expose a 32 bit shared memory BAR instead of 64 bit one. The
default is a 64 bit BAR, except pc-1.2 and older retain a 32 bit
BAR. A 32 bit BAR can support only up to 1 GiB of shared memory.
This worked as designed until commit 5400c02 accidentally flipped
its sense: since then, we misinterpret use64=0 as use64=1 and vice
versa. Worse, the default got flipped as well. Devices
ivshmem-plain and ivshmem-doorbell are not affected.
Fix by restoring the test of IVShmemState member not_legacy_32bit
that got messed up in commit 5400c02. Also update its
initialization for devices ivhsmem-plain and ivshmem-doorbell.
Without that, they'd regress to 32 bit BARs.
Cc: qemu-stable@nongnu.org Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit be4e0d737527d8670dc271712faae0de6a181b4e) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Greg Kurz [Fri, 4 Nov 2016 08:39:22 +0000 (09:39 +0100)]
vhost: drop legacy vring layout bits
The legacy vring layout is not used anymore as we use the separate
mappings even for legacy devices.
This patch simply removes it.
This also fixes a bug with virtio 1 devices when the vring descriptor table
is mapped at a higher address than the used vring because the following
function may return an insanely great value:
Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 1cdce7c54d26e64f5eddb10a6f4f7dd938dfc2c4) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Max Reitz [Tue, 25 Oct 2016 02:54:31 +0000 (04:54 +0200)]
block/curl: Do not wait for data beyond EOF
libcurl will only give us as much data as there is, not more. The block
layer will deny requests beyond the end of file for us; but since this
block driver is still using a sector-based interface, we can still get
in trouble if the file size is not a multiple of 512.
While we have already made sure not to attempt transfers beyond the end
of the file, we are currently still trying to receive data from there if
the original request exceeds the file size. This patch fixes this issue
and invokes qemu_iovec_memset() on the iovec's tail.
Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20161025025431.24714-5-mreitz@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 4e504535c16dfa66290281e704384abfaca08673) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Max Reitz [Tue, 25 Oct 2016 02:54:30 +0000 (04:54 +0200)]
block/curl: Remember all sockets
For some connection types (like FTP, generally), more than one socket
may be used (in FTP's case: control vs. data stream). As of commit 838ef602498b8d1985a231a06f5e328e2946a81d ("curl: Eliminate unnecessary
use of curl_multi_socket_all"), we have to remember all of the sockets
used by libcurl, but in fact we only did that for a single one. Since
one libcurl connection may use multiple sockets, however, we have to
remember them all.
Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20161025025431.24714-4-mreitz@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit ff5ca1664af85b24a4180d595ea6873fd3deac57) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Max Reitz [Tue, 25 Oct 2016 02:54:29 +0000 (04:54 +0200)]
block/curl: Fix return value from curl_read_cb
While commit 38bbc0a580f9f10570b1d1b5d3e92f0e6feb2970 is correct in that
the callback is supposed to return the number of bytes handled; what it
does not mention is that libcurl will throw an error if the callback did
not "handle" all of the data passed to it.
Therefore, if the callback receives some data that it cannot handle
(either because the receive buffer has not been set up yet or because it
would not fit into the receive buffer) and we have to ignore it, we
still have to report that the data has been handled.
Obviously, this should not happen normally. But it does happen at least
for FTP connections where some data (that we do not expect) may be
generated when the connection is established.
Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20161025025431.24714-3-mreitz@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 4e7676571bccb42dd49b5efbb91ac49077ea5197) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Eric Blake [Thu, 17 Nov 2016 20:13:58 +0000 (14:13 -0600)]
block: Pass unaligned discard requests to drivers
Discard is advisory, so rounding the requests to alignment
boundaries is never semantically wrong from the data that
the guest sees. But at least the Dell Equallogic iSCSI SANs
has an interesting property that its advertised discard
alignment is 15M, yet documents that discarding a sequence
of 1M slices will eventually result in the 15M page being
marked as discarded, and it is possible to observe which
pages have been discarded.
Between commits 9f1963b and b8d0a980, we converted the block
layer to a byte-based interface that ultimately ignores any
unaligned head or tail based on the driver's advertised
discard granularity, which means that qemu 2.7 refuses to
pass any discard request smaller than 15M down to the Dell
Equallogic hardware. This is a slight regression in behavior
compared to earlier qemu, where a guest executing discards
in power-of-2 chunks used to be able to get every page
discarded, but is now left with various pages still allocated
because the guest requests did not align with the hardware's
15M pages.
Since the SCSI specification says nothing about a minimum
discard granularity, and only documents the preferred
alignment, it is best if the block layer gives the driver
every bit of information about discard requests, rather than
rounding it to alignment boundaries early.
Rework the block layer discard algorithm to mirror the write
zero algorithm: always peel off any unaligned head or tail
and manage that in isolation, then do the bulk of the request
on an aligned boundary. The fallback when the driver returns
-ENOTSUP for an unaligned request is to silently ignore that
portion of the discard request; but for devices that can pass
the partial request all the way down to hardware, this can
result in the hardware coalescing requests and discarding
aligned pages after all.
Reported by: Peter Lieven <pl@kamp.de> CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 3482b9bc411a9a12b2efde1018e1ddc906cd817e) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Eric Blake [Thu, 17 Nov 2016 20:13:57 +0000 (14:13 -0600)]
block: Return -ENOTSUP rather than assert on unaligned discards
Right now, the block layer rounds discard requests, so that
individual drivers are able to assert that discard requests
will never be unaligned. But there are some ISCSI devices
that track and coalesce multiple unaligned requests, turning it
into an actual discard if the requests eventually cover an
entire page, which implies that it is better to always pass
discard requests as low down the stack as possible.
In isolation, this patch has no semantic effect, since the
block layer currently never passes an unaligned request through.
But the block layer already has code that silently ignores
drivers that return -ENOTSUP for a discard request that cannot
be honored (as well as drivers that return 0 even when nothing
was done). But the next patch will update the block layer to
fragment discard requests, so that clients are guaranteed that
they are either dealing with an unaligned head or tail, or an
aligned core, making it similar to the block layer semantics of
write zero fragmentation.
CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 49228d1e95e1be879c57f5dbccb44405670e343d) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Eric Blake [Thu, 17 Nov 2016 20:13:56 +0000 (14:13 -0600)]
block: Let write zeroes fallback work even with small max_transfer
Commit 443668ca rewrote the write_zeroes logic to guarantee that
an unaligned request never crosses a cluster boundary. But
in the rewrite, the new code assumed that at most one iteration
would be needed to get to an alignment boundary.
However, it is easy to trigger an assertion failure: the Linux
kernel limits loopback devices to advertise a max_transfer of
only 64k. Any operation that requires falling back to writes
rather than more efficient zeroing must obey max_transfer during
that fallback, which means an unaligned head may require multiple
iterations of the write fallbacks before reaching the aligned
boundaries, when layering a format with clusters larger than 64k
atop the protocol of file access to a loopback device.
In fairness to Denis (as the original listed author of the culprit
commit), the faulty logic for at most one iteration is probably all
my fault in reworking his idea. But the solution is to restore what
was in place prior to that commit: when dealing with an unaligned
head or tail, iterate as many times as necessary while fragmenting
the operation at max_transfer boundaries.
Reported-by: Ed Swierk <eswierk@skyportsystems.com> CC: qemu-stable@nongnu.org CC: Denis V. Lunev <den@openvz.org> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b2f95feec5e4d546b932848dd421ec3361e8ef77) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Eric Blake [Thu, 17 Nov 2016 20:13:55 +0000 (14:13 -0600)]
qcow2: Inform block layer about discard boundaries
At the qcow2 layer, discard is only possible on a per-cluster
basis; at the moment, qcow2 silently rounds any unaligned
requests to this granularity. However, an upcoming patch will
fix a regression in the block layer ignoring too much of an
unaligned discard request, by changing the block layer to
break up a discard request at alignment boundaries; for that
to work, the block layer must know about our limits.
However, we can't go one step further by changing
qcow2_discard_clusters() to assert that requests are always
aligned, since that helper function is reached on paths
outside of the block layer.
CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit ecdbead659f037dc572bba9eb1cd31a5a1a9ad9a) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Samuel Thibault [Sun, 13 Nov 2016 22:54:27 +0000 (23:54 +0100)]
slirp: Fix access to freed memory
if_start() goes through the slirp->if_fastq and slirp->if_batchq
list of pending messages, and accesses ifm->ifq_so->so_nqueued of its
elements if ifm->ifq_so != NULL. When freeing a socket, we thus need
to make sure that any pending message for this socket does not refer
to the socket any more.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Tested-by: Brian Candler <b.candler@pobox.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ea64d5f08817b5e79e17135dce516c7583107f91) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Greg Kurz [Fri, 4 Nov 2016 08:39:15 +0000 (09:39 +0100)]
vhost: adapt vhost_verify_ring_mappings() to virtio 1 ring layout
With virtio 1, the vring layout is split in 3 separate regions of
contiguous memory for the descriptor table, the available ring and the
used ring, as opposed with legacy virtio which uses a single region.
In case of memory re-mapping, the code ensures it doesn't affect the
vring mapping. This is done in vhost_verify_ring_mappings() which assumes
the device is legacy.
This patch changes vhost_verify_ring_mappings() to check the mappings of
each part of the vring separately.
This works for legacy mappings as well.
Cc: qemu-stable@nongnu.org Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit f1f9e6c5961ffb36fd4a81cd7edcded7bfad2ab2) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Kevin Wolf [Fri, 4 Nov 2016 23:03:15 +0000 (00:03 +0100)]
block: Don't mark node clean after failed flush
Commit 3ff2f67a changed bdrv_co_flush() so that no flush is issues if
the image hasn't been dirtied since the last flush. This is not quite
correct: The condition should be that the image hasn't been dirtied
since the last _successful_ flush. This patch changes the logic
accordingly.
Without this fix, subsequent bdrv_co_flush() calls would return success
without actually doing anything even though the image is still dirty.
The difference is visible in some blkdebug test cases where error
messages incorrectly disappeared after commit 3ff2f67a.
Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1478300595-10090-1-git-send-email-kwolf@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e6af1e085416378918cca357bf2abd8b90224667)
Michael S. Tsirkin [Fri, 4 Nov 2016 10:27:52 +0000 (12:27 +0200)]
virtio-net: mark VIRTIO_NET_F_GSO as legacy
virtio 1.0 spec says this is a legacy feature bit,
hide it from guests in modern mode.
Note: for cross-version migration compatibility,
we keep the bit set in host_features.
The result will be that a guest migrating cross-version
will see host features change under it.
As guests only seem to read it once, this should
not be an issue. Meanwhile, will work to fix guests to
ignore this bit in virtio1 mode, too.
Cc: qemu-stable@nongnu.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 2a083ffd2e37ef08769749a5c7cfc6ca65c9f8ea) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Michael S. Tsirkin [Fri, 4 Nov 2016 10:04:23 +0000 (12:04 +0200)]
virtio: allow per-device-class legacy features
Legacy features are those that transitional devices only
expose on the legacy interface.
Allow different ones per device class.
Cc: qemu-stable@nongnu.org # dependency for the next patch Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 9b706dbbbb81f5cb7c67e491d38cd6077205e056)
Conflicts:
hw/virtio/virtio.c
* drop context dep on ff4c07df
* resolv func dep on ff4c07df creating vdc variable in
virtio_device_class_init()
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
David Gibson [Mon, 21 Nov 2016 05:28:12 +0000 (16:28 +1100)]
target-ppc: Fix CPU migration from qemu-2.6 <-> later versions
When migration for target-ppc was converted to vmstate, several
VMSTATE_EQUAL() checks were foolishly included of things that really
should be internal state. Specifically we verified equality of the
insns_flags and insns_flags2 fields, which are used within TCG to
determine which groups of instructions are available on this cpu
model. Between qemu-2.6 and qemu-2.7 we made some changes to these
classes which broke migration.
This path fixes migration both forwards and backwards. On migration
from 2.6 to later versions we import the fields into teporary
variables, which we then ignore. In migration backwards, we populate
the temporary fields from the runtime fields, but mask out the bits
which were added after qemu-2.6, allowing the VMSTATE_EQUAL in
qemu-2.6 to accept the stream.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 16a2497bd44cac1856e259654fd304079bd1dcdc) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
This function is from net/socket.c, move it to net.c and net.h.
Add SocketReadState to make others reuse net_fill_rstate().
suggestion from jason.
This refactored the state out of NetSocketState into a
separate SocketReadState. This refactoring requires
that a callback is provided to be triggered upon
completion of a packet receive from the guest.
The patch only registered this callback in the codepaths
hit by -net socket,connect, not -net socket,listen. So
as a result packets sent by the guest in the latter case
get dropped on the floor.
This bug is hidden because net_fill_rstate() silently
does nothing if the callback is not set.
This patch adds in the middle callback registration
and also adds an assert so that QEMU aborts if there
are any other codepaths hit which are missing the
callback.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit e79cd4068063ea2859199002a049010a11202939) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Corey Minyard [Mon, 24 Oct 2016 20:10:21 +0000 (15:10 -0500)]
acpi/ipmi: Initialize the fwinfo before fetching it
The initialization was missed before, resulting in some
bad data in the smbus case.
Signed-off-by: Corey Minyard <cminyard@mvista.com> Cc: qemu-stable@nongnu.org Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 698ae42b9124dce23e03d0fea2e635b70540ef13) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Alex Williamson [Mon, 31 Oct 2016 15:53:03 +0000 (09:53 -0600)]
memory: Don't use memcpy for ram_device regions
With a vfio assigned device we lay down a base MemoryRegion registered
as an IO region, giving us read & write accessors. If the region
supports mmap, we lay down a higher priority sub-region MemoryRegion
on top of the base layer initialized as a RAM device pointer to the
mmap. Finally, if we have any quirks for the device (ie. address
ranges that need additional virtualization support), we put another IO
sub-region on top of the mmap MemoryRegion. When this is flattened,
we now potentially have sub-page mmap MemoryRegions exposed which
cannot be directly mapped through KVM.
This is as expected, but a subtle detail of this is that we end up
with two different access mechanisms through QEMU. If we disable the
mmap MemoryRegion, we make use of the IO MemoryRegion and service
accesses using pread and pwrite to the vfio device file descriptor.
If the mmap MemoryRegion is enabled and results in one of these
sub-page gaps, QEMU handles the access as RAM, using memcpy to the
mmap. Using either pread/pwrite or the mmap directly should be
correct, but using memcpy causes us problems. I expect that not only
does memcpy not necessarily honor the original width and alignment in
performing a copy, but it potentially also uses processor instructions
not intended for MMIO spaces. It turns out that this has been a
problem for Realtek NIC assignment, which has such a quirk that
creates a sub-page mmap MemoryRegion access.
To resolve this, we disable memory_access_is_direct() for ram_device
regions since QEMU assumes that it can use memcpy for those regions.
Instead we access through MemoryRegionOps, which replaces the memcpy
with simple de-references of standard sizes to the host memory.
With this patch we attempt to provide unrestricted access to the RAM
device, allowing byte through qword access as well as unaligned
access. The assumption here is that accesses initiated by the VM are
driven by a device specific driver, which knows the device
capabilities. If unaligned accesses are not supported by the device,
we don't want them to work in a VM by performing multiple aligned
accesses to compose the unaligned access. A down-side of this
philosophy is that the xp command from the monitor attempts to use
the largest available access weidth, unaware of the underlying
device. Using memcpy had this same restriction, but at least now an
operator can dump individual registers, even if blocks of device
memory may result in access widths beyond the capabilities of a
given device (RTL NICs only support up to dword).
Reported-by: Thorsten Kohfeldt <thorsten.kohfeldt@gmx.de> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 4a2e242bbb306ef5c16ce9e7bb2da3bd8a4eb098) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Alex Williamson [Mon, 31 Oct 2016 15:53:03 +0000 (09:53 -0600)]
memory: Replace skip_dump flag with "ram_device"
Setting skip_dump on a MemoryRegion allows us to modify one specific
code path, but the restriction we're trying to address encompasses
more than that. If we have a RAM MemoryRegion backed by a physical
device, it not only restricts our ability to dump that region, but
also affects how we should manipulate it. Here we recognize that
MemoryRegions do not change to sometimes allow dumps and other times
not, so we replace setting the skip_dump flag with a new initializer
so that we know exactly the type of region to which we're applying
this behavior.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 21e00fa55f3fdfcbb20da7c6876c91ef3609b387) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
net: rtl8139: limit processing of ring descriptors
RTL8139 ethernet controller in C+ mode supports multiple
descriptor rings, each with maximum of 64 descriptors. While
processing transmit descriptor ring in 'rtl8139_cplus_transmit',
it does not limit the descriptor count and runs forever. Add
check to avoid it.
Reported-by: Andrew Henderson <hendersa@icculus.org> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit c7c35916692fe010fef25ac338443d3fe40be225) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Alberto Garcia [Mon, 17 Oct 2016 15:46:03 +0000 (18:46 +0300)]
qemu-iotests: Test I/O in a single drive from a throttling group
iotest 093 contains a test that creates a throttling group with
several drives and performs I/O in all of them. This patch adds a new
test that creates a similar setup but only performs I/O in one of the
drives at the same time.
This is useful to test that the round robin algorithm is behaving
properly in these scenarios, and is specifically written using the
regression introduced in 27ccdd52598290f0f8b58be56e as an example.
Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit a26ddb43963e77aeebc2a4f011d27b2d9c017f21) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Alberto Garcia [Mon, 17 Oct 2016 15:46:02 +0000 (18:46 +0300)]
throttle: Correct access to wrong BlockBackendPublic structures
In 27ccdd52598290f0f8b58be56e235aff7aebfaf3 the throttling fields were
moved from BlockDriverState to BlockBackend. However in a few cases
the code started using throttling fields from the active BlockBackend
instead of the round-robin token, making the algorithm behave
incorrectly.
This can cause starvation if there's a throttling group with several
drives but only one of them has I/O.
Cc: qemu-stable@nongnu.org Reported-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 6bf77e1c2dc24da1bade16e8a9a637f3b127314d) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Thomas Huth [Wed, 21 Sep 2016 09:42:15 +0000 (11:42 +0200)]
ppc/kvm: Mark 64kB page size support as disabled if not available
QEMU currently refuses to start with KVM-PR and only prints out
qemu: fatal: Unknown MMU model 851972
when being started there. This is because commit 4322e8ced5aaac719
("ppc: Fix 64K pages support in full emulation") introduced a new
POWERPC_MMU_64K bit to indicate support for this page size, but
it never gets cleared on KVM-PR if the host kernel does not support
this. Thus we've got to turn off this bit in the mmu_model for KVM-PR.
Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 0d594f5565837fe2886a8aa307ef8abb65eab8f7) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Markus Armbruster [Tue, 4 Oct 2016 15:23:50 +0000 (17:23 +0200)]
tests/test-qmp-input-strict: Cover missing struct members
These tests would have caught the bug fixed by the previous commit.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1475594630-24758-1-git-send-email-armbru@redhat.com>
(cherry picked from commit bce3035a44c40bd3ec29d3162025fd350f2d8dbf) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
qapi: Fix crash when 'any' or 'null' parameter is missing
Unlike the other visit methods, visit_type_any() and visit_type_null()
neglect to check whether qmp_input_get_object() succeeded. They crash
when it fails. Reproducer:
Since commit ad739706bbadee49, user_creatable_add_type() expects to be
given a qdict. However, if object-add is called without props, you reach
the assert: "qemu/qom/object_interfaces.c:115: user_creatable_add_type:
Assertion `qdict' failed.", because the qdict isn't created in this
case (it's optional).
Furthermore, qmp_input_visitor_new() is not meant to be called without a
dict, and a further commit will assert in this situation.
If none given, create an empty qdict in qmp to avoid the
user_creatable_add_type() assert(qdict).
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20160922203927.28241-2-marcandre.lureau@redhat.com> Tested-by: Xiao Long Jiang <zxiaol@linux.vnet.ibm.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit e64c75a9752c5d0fd64eb2e684c656a5ea7d03c6) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Daniel P. Berrange [Fri, 30 Sep 2016 15:02:01 +0000 (16:02 +0100)]
char: fix missing return in error path for chardev TLS init
If the qio_channel_tls_new_(server|client) methods fail,
we disconnect the client. Unfortunately a missing return
means we then go on to try and run the TLS handshake on
a NULL I/O channel. This gives predictably segfaulty
results.
The main way to trigger this is to request a bogus TLS
priority string for the TLS credentials. e.g.
Most other ways appear impossible to trigger except
perhaps if OOM conditions cause gnutls initialization
to fail.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 660a2d83e026496db6b3eaec2256a2cdd6c74de8) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Emilio G. Cota [Wed, 5 Oct 2016 22:34:39 +0000 (18:34 -0400)]
qht: fix unlock-after-free segfault upon resizing
The old map's bucket locks are being unlocked *after*
that same old map has been passed to RCU for destruction.
This is a bug that can cause a segfault, since there's
no guarantee that the deletion will be deferred (e.g.
there may be no concurrent readers).
The segfault is easily triggered in RHEL6/CentOS6 with qht-test,
particularly on a single-core system or by pinning qht-test
to a single core.
Fix it by unlocking the map's bucket locks right after having
published the new map, and (crucially) before marking the map
for deletion via call_rcu().
While at it, expand qht_do_resize() to atomically do (1) a reset,
(2) a resize, or (3) a reset+resize. This simplifies the calling
code, since the new function (qht_do_resize_reset()) acquires
and releases the buckets' locks.
Note that no qht_do_reset inline is provided, since it would have
no users--qht_reset() already performs a reset without taking
ht->lock.
Reported-by: Peter Maydell <peter.maydell@linaro.org> Reported-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1475706880-10667-3-git-send-email-cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 76b553b308dc8671eb672b889b38889b1231cf1e) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Emilio G. Cota [Wed, 5 Oct 2016 22:34:38 +0000 (18:34 -0400)]
qht: simplify qht_reset_size
Sometimes gcc doesn't pick up the fact that 'new' is properly
set if 'resize == true', which may generate an unnecessary
build warning.
Fix it by removing 'resize' and directly checking that 'new'
is non-NULL.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1475706880-10667-2-git-send-email-cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit f555a9d0b3c785b698f32e6879e97d0a4b387314) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Eric Blake [Fri, 9 Sep 2016 03:14:14 +0000 (22:14 -0500)]
migrate: Fix cpu-throttle-increment regression in HMP
Commit 69ef1f3 accidentally broke migrate_set_parameter's ability
to set the cpu-throttle-increment to anything other than the
default, because it forgot to parse the user's string into an
integer.
CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit bb2b777cf9a2862fe31a40256659ff49ae3d2006) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
John Snow [Fri, 23 Sep 2016 01:45:52 +0000 (21:45 -0400)]
block-backend: remove blk_flush_all
We can teach Xen to drain and flush each device as it needs to, instead
of trying to flush ALL devices. This removes the last user of
blk_flush_all.
The function is therefore removed under the premise that any new uses
of blk_flush_all would be the wrong paradigm: either flush the single
device that requires flushing, or use an appropriate flush_all mechanism
from outside of the BlkBackend layer.
Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 49137bf6845eaecad51a047fc06dd11c56118460) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
John Snow [Fri, 23 Sep 2016 01:45:51 +0000 (21:45 -0400)]
qemu: use bdrv_flush_all for vm_stop et al
Reimplement bdrv_flush_all for vm_stop. In contrast to blk_flush_all,
bdrv_flush_all does not have device model restrictions. This allows
us to flush and halt unconditionally without error.
This allows us to do things like migrate when we have a device with
an open tray, but has a node that may need to be flushed, or nodes
that aren't currently attached to any device and need to be flushed.
Specifically, this allows us to migrate when we have a CDROM with
an open tray.
Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 22af08eacf6b5aa0e6c0581e547380b3eb4f95e9)
Conflicts:
cpus.c
John Snow [Fri, 23 Sep 2016 01:45:50 +0000 (21:45 -0400)]
block: reintroduce bdrv_flush_all
Commit fe1a9cbc moved the flush_all routine from the bdrv layer to the
block-backend layer. In doing so, however, the semantics of the routine
changed slightly such that flush_all now used blk_flush instead of
bdrv_flush.
blk_flush can fail if the attached device model reports that it is not
"available," (i.e. the tray is open.) This changed the semantics of
flush_all such that it can now fail for e.g. open CDROM drives.
Reintroduce bdrv_flush_all to regain the old semantics without having to
alter the behavior of blk_flush or blk_flush_all, which are already
'doing the right thing.'
Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 4085f5c7a239567a292876f46cb59d9b19bcf6ac) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Eric Blake [Wed, 7 Sep 2016 21:27:20 +0000 (16:27 -0500)]
iscsi: Fix divide-by-zero regression on raw SG devices
When qemu uses iscsi devices in sg mode, iscsilun->block_size
is left at 0. Prior to commits cf081fca and similar, when
block limits were tracked in sectors, this did not matter:
various block limits were just left at 0. But when we started
scaling by block size, this caused SIGFPE.
Then, in a later patch, commit a5b8dd2c added an assertion to
bdrv_open_common() that request_alignment is always non-zero;
which was not true for SG mode. Rather than relax that assertion,
we can just provide a sane value (we don't know of any SG device
with a block size smaller than qemu's default sizing of 512 bytes).
One possible solution for SG mode is to just blindly skip ALL
of iscsi_refresh_limits(), since we already short circuit so
many other things in sg mode. But this patch takes a slightly
more conservative approach, and merely guarantees that scaling
will succeed, while still using multiples of the original size
where possible. Resulting limits may still be zero in SG mode
(that is, we mostly only fix block_size used as a denominator
or which affect assertions, not all uses).
Reported-by: Holger Schranz <holger@fam-schranz.de> Signed-off-by: Eric Blake <eblake@redhat.com> CC: qemu-stable@nongnu.org
Message-Id: <1473283640-15756-1-git-send-email-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 95eaa78537c734fa3cb3373d47ba8c0099a36ff0) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The copy_sectors() code was originally using the 'sector'
parameter for encryption, which was passed in by the caller
from the QCowL2Meta.offset field (aka the guest logical
offset).
After the change, the code is using 'cluster_offset' which
was passed in from QCow2L2Meta.alloc_offset field (aka the
host physical offset).
This would cause the data to be encrypted using an incorrect
initialization vector which will in turn cause later reads
to return garbage.
Although current qcow2 built-in encryption is blocked from
usage in the emulator, one could still hit this if writing
to the file via qemu-{img,io,nbd} commands.
Cc: qemu-stable@nongnu.org Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit bb9f8dd0e15a9744b8d09d06ecb6a18ca3dcc173)
Conflicts:
tests/qemu-iotests/group
* drop context dependancy on non-2.7 iotest groups
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
David Gibson [Thu, 15 Sep 2016 06:11:48 +0000 (16:11 +1000)]
vfio/pci: Fix regression in MSI routing configuration
d1f6af6 "kvm-irqchip: simplify kvm_irqchip_add_msi_route" was a cleanup
of kvmchip routing configuration, that was mostly intended for x86.
However, it also contains a subtle change in behaviour which breaks EEH[1]
error recovery on certain VFIO passthrough devices on spapr guests. So far
it's only been seen on a BCM5719 NIC on a POWER8 server, but there may be
other hardware with the same problem. It's also possible there could be
circumstances where it causes a bug on x86 as well, though I don't know of
any obvious candidates.
Prior to d1f6af6, both vfio_msix_vector_do_use() and
vfio_add_kvm_msi_virq() used msg == NULL as a special flag to mark this
as the "dummy" vector used to make the host hardware state sync with the
guest expected hardware state in terms of MSI configuration.
Specifically that flag caused vfio_add_kvm_msi_virq() to become a no-op,
meaning the dummy irq would always be delivered via qemu. d1f6af6 changed
vfio_add_kvm_msi_virq() so it takes a vector number instead of the msg
parameter, and determines the correct message itself. The test for !msg
was removed, and not replaced with anything there or in the caller.
With an spapr guest which has a VFIO device, if an EEH error occurs on the
host hardware, then the device will be isolated then reset. This is a
combination of host and guest action, mediated by some EEH related
hypercalls. I haven't fully traced the mechanics, but somehow installing
the kvm irqchip route for the dummy irq on the BCM5719 means that after EEH
reset and recovery, at least some irqs are no longer delivered to the
guest.
In particular, the guest never gets the link up event, and so the NIC is
effectively dead.
[1] EEH (Enhanced Error Handling) is an IBM POWER server specific PCI-*
error reporting and recovery mechanism. The concept is somewhat
similar to PCI-E AER, but the details are different.
Cornelia Huck [Mon, 15 Aug 2016 09:10:28 +0000 (11:10 +0200)]
s390x/css: handle cssid 255 correctly
The cssid 255 is reserved but still valid from an architectural
point of view. However, feeding a bogus schid of 0xffffffff into
the virtio hypercall will lead to a crash:
John Snow [Mon, 26 Sep 2016 18:33:37 +0000 (14:33 -0400)]
ahci: clear aiocb in ncq_cb
Similar to existing fixes for IDE (87ac25fd) and ATAPI (7f951b2d), the
AIOCB must be cleared in the callback. Otherwise, we may accidentally
try to reset a dangling pointer in bdrv_aio_cancel() from a port reset.
Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1474575040-32079-2-git-send-email-jsnow@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit df403bc58859c893ebd0accda07678e84d15dc5d) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Even if tray is not open, it can be empty (blk_is_inserted() == false).
Handle both cases correctly by replacing the s->tray_open checks with
blk_is_available(), which is an AND of the two.
Also simplify successive checks of them into blk_is_available(), in a
couple cases.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1473848224-24809-2-git-send-email-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit cd723b85601baa7a0eeffbac83421357a70d81ee) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Right after main_loop ends, we release various things but keep iothread
alive. The latter is not prepared to the sudden change of resources.
Specifically, after bdrv_close_all(), virtio-scsi dataplane get a
surprise at the empty BlockBackend:
(gdb) bt
at /usr/src/debug/qemu-2.6.0/hw/scsi/virtio-scsi.c:543
at /usr/src/debug/qemu-2.6.0/hw/scsi/virtio-scsi.c:577
It is because the d->conf.blk->root is set to NULL, then
blk_get_aio_context() returns qemu_aio_context, whereas s->ctx is still
pointing to the iothread:
hw/scsi/virtio-scsi.c:543:
if (s->dataplane_started) {
assert(blk_get_aio_context(d->conf.blk) == s->ctx);
}
To fix this, let's stop iothreads before doing bdrv_close_all().
Cc: qemu-stable@nongnu.org Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1473326931-9699-1-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit dce8921b2baaf95974af8176406881872067adfa) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Daniel P. Berrange [Wed, 24 Aug 2016 15:28:15 +0000 (16:28 +0100)]
crypto: ensure XTS is only used with ciphers with 16 byte blocks
The XTS cipher mode needs to be used with a cipher which has
a block size of 16 bytes. If a mis-matching block size is used,
the code will either corrupt memory beyond the IV array, or
not fully encrypt/decrypt the IV.
This fixes a memory corruption crash when attempting to use
cast5-128 with xts, since the former has an 8 byte block size.
A test case is added to ensure the cipher creation fails with
such an invalid combination.
Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit a5d2f44d0d3e7523670e103a8c37faed29ff2b76) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Paolo Bonzini [Mon, 29 Aug 2016 09:35:37 +0000 (11:35 +0200)]
scsi: mptconfig: fix misuse of MPTSAS_CONFIG_PACK
These issues cause respectively a QEMU crash and a leak of 2 bytes of
stack. They were discovered by VictorV of 360 Marvel Team.
Reported-by: Tom Victor <i-tangtianwen@360.cm> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 65a8e1f6413a0f6f79894da710b5d6d43361d27d) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When LSI SAS1068 Host Bus emulator builds configuration page
headers, mptsas_config_pack() should assert that the size
fits in a byte. However, the size is expressed in 32-bit
units, so up to 1020 bytes fit. The assertion was only
allowing replies up to 252 bytes, so fix it.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1472645167-30765-2-git-send-email-ppandit@redhat.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit cf2bce203a45d7437029d108357fb23fea0967b6) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
vmw_pvscsi: check page count while initialising descriptor rings
Vmware Paravirtual SCSI emulation uses command descriptors to
process SCSI commands. These descriptors come with their ring
buffers. A guest could set the page count for these rings to
an arbitrary value, leading to infinite loop or OOB access.
Add check to avoid it.
Reported-by: Tom Victor <vv474172261@gmail.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1472626169-12989-1-git-send-email-ppandit@redhat.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7f61f4690dd153be98900a2a508b88989e692753) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Rony Weng [Mon, 29 Aug 2016 07:52:18 +0000 (15:52 +0800)]
scsi-disk: change disk serial length from 20 to 36
Openstack Cinder assigns volume a 36 characters uuid as serial.
QEMU will shrinks the uuid to 20 characters, which does not match
the original uuid.
Note that there is no limit to the length of the serial number in
the SCSI spec. 20 was copy-pasted from virtio-blk which in turn was
copy-pasted from ATA; 36 is even more arbitrary. However, bumping it
up too much might cause issues (e.g. 252 seems to make sense because
then the maximum amount of returned data is 256; but who knows there's
no off-by-one somewhere for such a nicely rounded number).
Signed-off-by: Rony Weng <ronyweng@synology.com>
Message-Id: <1472457138-23386-1-git-send-email-ronyweng@synology.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 48b6206305b8d56524ac2ee347b68e6e0a528559) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Lin Ma [Wed, 14 Sep 2016 06:22:50 +0000 (14:22 +0800)]
qemu-char: avoid segfault if user lacks of permisson of a given logfile
Function qemu_chr_alloc returns NULL if it failed to open logfile by any reason,
says no write permission. For backends tty, stdio and msmouse, They need to
check this return value to avoid segfault in this case.
Signed-off-by: Lin Ma <lma@suse.com> Cc: qemu-stable <qemu-stable@nongnu.org>
Message-Id: <20160914062250.22226-1-lma@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 71200fb9664c2967a1cdd22b68b0da3a8b2b3eb7) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Vmware Paravirtual SCSI emulator while processing IO requests
could run into an infinite loop if 'pvscsi_ring_pop_req_descr'
always returned positive value. Limit IO loop to the ring size.
Cc: qemu-stable@nongnu.org Reported-by: Li Qiang <liqiang6-s@360.cn> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1473845952-30785-1-git-send-email-ppandit@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d251157ac1928191af851d199a9ff255d330bec9) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Li Qiang [Mon, 12 Sep 2016 12:44:11 +0000 (18:14 +0530)]
scsi: mptsas: use g_new0 to allocate MPTSASRequest object
When processing IO request in mptsas, it uses g_new to allocate
a 'req' object. If an error occurs before 'req->sreq' is
allocated, It could lead to an OOB write in mptsas_free_request
function. Use g_new0 to avoid it.
Reported-by: Li Qiang <liqiang6-s@360.cn> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1473684251-17476-1-git-send-email-ppandit@redhat.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 670e56d3ed2918b3861d9216f2c0540d9e9ae0d5) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Greg Kurz [Fri, 16 Sep 2016 09:44:49 +0000 (11:44 +0200)]
9pfs: fix potential segfault during walk
If the call to fid_to_qid() returns an error, we will call v9fs_path_free()
on uninitialized paths.
It is a regression introduced by the following commit:
56f101ecce0e 9pfs: handle walk of ".." in the root directory
Let's fix this by initializing dpath and path before calling fid_to_qid().
Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>
[groug: updated the changelog to indicate this is regression and to provide
the offending commit SHA1] Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 13fd08e631ec0c3ff5ad1bdcb6a4474c7d9a024f) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
virtio-balloon: discard virtqueue element on reset
The one pending element is being freed but not discarded on device
reset, which causes svq->inuse to creep up, eventually hitting the
"Virtqueue size exceeded" error.
Properly discarding the element on device reset makes sure that its
buffers are unmapped and the inuse counter stays balanced.
Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Roman Kagan <rkagan@virtuozzo.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Ladi Prosek <lprosek@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 104e70cae78bd4afd95d948c6aff188f10508a9c) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Stefan Hajnoczi [Wed, 7 Sep 2016 15:51:25 +0000 (11:51 -0400)]
virtio: zero vq->inuse in virtio_reset()
vq->inuse must be zeroed upon device reset like most other virtqueue
fields.
In theory, virtio_reset() just needs assert(vq->inuse == 0) since
devices must clean up in-flight requests during reset (requests cannot
not be leaked!).
In practice, it is difficult to achieve vq->inuse == 0 across reset
because balloon, blk, 9p, etc implement various different strategies for
cleaning up requests. Most devices call g_free(elem) directly without
telling virtio.c that the VirtQueueElement is cleaned up. Therefore
vq->inuse is not decremented during reset.
This patch zeroes vq->inuse and trusts that devices are not leaking
VirtQueueElements across reset.
I will send a follow-up series that refactors request life-cycle across
all devices and converts vq->inuse = 0 into assert(vq->inuse == 0) but
this more invasive approach is not appropriate for stable trees.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Cc: qemu-stable <qemu-stable@nongnu.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Ladi Prosek <lprosek@redhat.com>
(cherry picked from commit 4b7f91ed0270a371e1933efa21ba600b6da23ab9) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Michael Roth [Wed, 2 Nov 2016 21:40:01 +0000 (16:40 -0500)]
Merge tag 'ppc-for-2.7-20161013' into stable-2.7-staging
qemu-2.7 (stable): ppc patch queue 2016-10-13
TCG for ppc does not properly implement hardware transactional memory.
It has a stub implementation in which transactions always fail.
Unfortunately in v2.7.0, HTM is advertised as being available to
guests, which means guests may incorrectly attempt to use it and hang.
This has been the case for a while, but has become more urgent with
recent (guest) Linux kernel versions which attempt to lazily enable
TM. Under TCG that now triggers the problem regularly, instead of
just when running a TM aware userspace program.
The problem is already fixed in the 2.8/master branch, by correctly
advertising HTM as not being available with TCG. This series
backports the relevant patches to the qemu-2.7 stable branch to fix
the problem there.
* tag 'ppc-for-2.7-20161013':
ppc: Check the availability of transactional memory
hw/ppc/spapr: Fix the selection of the processor features
hw/ppc/spapr: Move code related to "ibm,pa-features" to a separate function
linux-headers: update
Thomas Huth [Wed, 28 Sep 2016 11:16:30 +0000 (13:16 +0200)]
ppc: Check the availability of transactional memory
KVM-PR currently does not support transactional memory, and the
implementation in TCG is just a fake. We should not announce TM
support in the ibm,pa-features property when running on such a
system, so disable it by default and only enable it if the KVM
implementation supports it (i.e. recent versions of KVM-HV).
These changes are based on some earlier work from Anton Blanchard
(thanks!).
Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit bac3bf287ab60e264b636f5f00c116a19b655762)
Thomas Huth [Wed, 28 Sep 2016 11:16:29 +0000 (13:16 +0200)]
hw/ppc/spapr: Fix the selection of the processor features
The current code uses pa_features_206 for POWERPC_MMU_2_06, and
for everything else, it uses pa_features_207. This is bad in some
cases because there is also a "degraded" MMU version of ISA 2.06,
called POWERPC_MMU_2_06a, which should of course use the flags for
2.06 instead. And there is also the possibility that the user runs
the pseries machine with a POWER5+ or even 970 processor. In that
case we certainly do not want to set the flags for 2.07, and rather
simply skip the setting of the pa-features property instead.
Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 4cbec30d769a73853b60dc7f275e6e7da9ab5162)
Thomas Huth [Wed, 28 Sep 2016 11:16:28 +0000 (13:16 +0200)]
hw/ppc/spapr: Move code related to "ibm,pa-features" to a separate function
The function spapr_populate_cpu_dt() has become quite big
already, and since we likely have to extend the pa-features
property for every new processor generation, it is nicer
if we put the related code into a separate function.
Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 230bf719d3a3b144a4ffa441e5d6170ef0ad8999)
Greg Kurz [Tue, 30 Aug 2016 15:02:27 +0000 (17:02 +0200)]
9pfs: handle walk of ".." in the root directory
The 9P spec at http://man.cat-v.org/plan_9/5/intro says:
All directories must support walks to the directory .. (dot-dot) meaning
parent directory, although by convention directories contain no explicit
entry for .. or . (dot). The parent of the root directory of a server's
tree is itself.
This means that a client cannot walk further than the root directory
exported by the server. In other words, if the client wants to walk
"/.." or "/foo/../..", the server should answer like the request was
to walk "/".
This patch just does that:
- we cache the QID of the root directory at attach time
- during the walk we compare the QID of each path component with the root
QID to detect if we're in a "/.." situation
- if so, we skip the current component and go to the next one
Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The unlinkat request also gets patched for consistency (even if
rmdir("foo/..") is expected to fail according to POSIX.1-2001).
The various error values come from the linux manual pages.
Suggested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Greg Kurz [Tue, 30 Aug 2016 17:11:05 +0000 (19:11 +0200)]
9pfs: forbid illegal path names
Empty path components don't make sense for most commands and may cause
undefined behavior, depending on the backend.
Also, the walk request described in the 9P spec [1] clearly shows that
the client is supposed to send individual path components: the official
linux client never sends portions of path containing the / character for
example.
Moreover, the 9P spec [2] also states that a system can decide to restrict
the set of supported characters used in path components, with an explicit
mention "to remove slashes from name components".
This patch introduces a new name_is_illegal() helper that checks the
names sent by the client are not empty and don't contain unwanted chars.
Since 9pfs is only supported on linux hosts, only the / character is
checked at the moment. When support for other hosts (AKA. win32) is added,
other chars may need to be blacklisted as well.
If a client sends an illegal path component, the request will fail and
ENOENT is returned to the client.
Suggested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Paolo Bonzini [Tue, 30 Aug 2016 12:04:12 +0000 (14:04 +0200)]
Revert "Change net/socket.c to use socket_*() functions"
Since commit 7e8449594c929, the socket connect code is blocking, because
calling socket_connect() without callback is blocking. This reverts the
commit.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Christian Borntraeger [Thu, 25 Aug 2016 18:11:26 +0000 (20:11 +0200)]
translate: early exit in tb_flush if there is no tcg
tb_flush does all kind of things, which are very tcg specific. As it
is called from some places even for KVM (e.g. gdb server) it is better
to detect these cases and do an early exit.
This also fixes a crash in the gdb server that was triggered by
commit 909eaac9bbc2 ("tb hash: track translated blocks with qht").
Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Reported-by: Richard Henderson <rth@twiddle.net> Reported-by: Brent Baccala <cosine@freesoft.org> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1472148686-39841-1-git-send-email-borntraeger@de.ibm.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
vnc: only alloc server surface with clients connected
the VNC server was changed so that the 'vd->server' pixman
image was only allocated when a client is connected.
Since then if a client disconnects and then reconnects to
the VNC server all they will see is a black screen until
they do something that triggers a refresh. On a graphical
desktop this is not often noticed since there's many things
going on which cause a refresh. On a plain text console it
is really obvious since nothing refreshes frequently.
The problem is that the VNC server didn't update the guest
dirty bitmap, so still believes its server image is in sync
with the guest contents.
To fix this we must explicitly mark the entire guest desktop
as dirty after re-creating the server surface. Move this
logic into vnc_update_server_surface() so it is guaranteed
to be call in all code paths that re-create the surface
instead of only in vnc_dpy_switch()
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de> Tested-by: Peter Lieven <pl@kamp.de>
Message-id: 1471365032-18096-1-git-send-email-berrange@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Must include "qemu-version.h" for the QEMU_PKGVERSION definition.
Signed-off-by: Ed Maste <emaste@freebsd.org>
Message-id: 1471877833-52343-1-git-send-email-emaste@freebsd.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Stefan Hajnoczi [Mon, 15 Aug 2016 12:54:16 +0000 (13:54 +0100)]
virtio: decrement vq->inuse in virtqueue_discard()
virtqueue_discard() moves vq->last_avail_idx back so the element can be
popped again. It's necessary to decrement vq->inuse to avoid "leaking"
the element count.
Cc: qemu-stable@nongnu.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Stefan Hajnoczi [Mon, 15 Aug 2016 12:54:15 +0000 (13:54 +0100)]
virtio: recalculate vq->inuse after migration
The vq->inuse field is not migrated. Many devices don't hold
VirtQueueElements across migration so it doesn't matter that vq->inuse
starts at 0 on the destination QEMU.
At least virtio-serial, virtio-blk, and virtio-balloon migrate while
holding VirtQueueElements. For these devices we need to recalculate
vq->inuse upon load so the value is correct.
Cc: qemu-stable@nongnu.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Peter Maydell [Mon, 22 Aug 2016 09:02:28 +0000 (10:02 +0100)]
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Mon 22 Aug 2016 09:06:32 BST
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
e1000e: remove internal interrupt flag
slirp: fix segv when init failed
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cao jin [Thu, 18 Aug 2016 14:15:54 +0000 (22:15 +0800)]
e1000e: remove internal interrupt flag
Commit 66bf7d58 removed internal msi state flag E1000E_USE_MSI, E1000E_USE_MSIX
is not necessary too, remove it now. And interrupt flag field intr_state also
can be removed now.
CC: Dmitry Fleytman <dmitry@daynix.com> CC: Jason Wang <jasowang@redhat.com> CC: Markus Armbruster <armbru@redhat.com> CC: Marcel Apfelbaum <marcel@redhat.com> CC: Michael S. Tsirkin <mst@redhat.com> CC: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Acked-by: Dmitry Fleytman <dmitry@daynix.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
Marc-André Lureau [Thu, 18 Aug 2016 13:44:05 +0000 (17:44 +0400)]
slirp: fix segv when init failed
Since commit f6c2e66ae8c8a, slirp uses an exit notifier to call
slirp_smb_cleanup. However, if init() failed, the notifier isn't added,
and removing it will fail:
==18447== Invalid write of size 8
==18447== at 0x7EF2B5: notifier_remove (notify.c:32)
==18447== by 0x48E80C: qemu_remove_exit_notifier (vl.c:2661)
==18447== by 0x6A2187: net_slirp_cleanup (slirp.c:134)
==18447== by 0x69419D: qemu_cleanup_net_client (net.c:338)
==18447== by 0x69445B: qemu_del_net_client (net.c:401)
==18447== by 0x6A2B81: net_slirp_init (slirp.c:366)
==18447== by 0x6A4241: net_init_slirp (slirp.c:865)
==18447== by 0x695C6D: net_client_init1 (net.c:1051)
==18447== by 0x695F6E: net_client_init (net.c:1108)
==18447== by 0x696DBA: net_init_netdev (net.c:1498)
==18447== by 0x7F1F99: qemu_opts_foreach (qemu-option.c:1116)
==18447== by 0x696E60: net_init_clients (net.c:1516)
==18447== Address 0x0 is not stack'd, malloc'd or (recently) free'd
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
Sascha Silbe [Thu, 18 Aug 2016 18:46:03 +0000 (20:46 +0200)]
test-logging: don't hard-code paths in /tmp
Since f6880b7f [qemu-log: support simple pid substitution for logs],
test-logging creates files with hard-coded names in /tmp. In the best
case, this prevents multiple developers from running "make check" on
the same machine. In the worst case, it allows for symlink attacks,
enabling an attacker to overwrite files that are writable to the
developer running "make check".
Instead of hard-coding the paths, create a temporary directory using
g_dir_make_tmp() and clean it up afterwards.
Fixes: f6880b7f ("qemu-log: support simple pid substitution for logs") Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Message-id: 1471545963-11720-3-git-send-email-silbe@linux.vnet.ibm.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Sascha Silbe [Thu, 18 Aug 2016 18:46:02 +0000 (20:46 +0200)]
glib: add compatibility implementation for g_dir_make_tmp()
We're going to make use of g_dir_make_tmp() in test-logging. Provide a
compatibility implementation of it for glib < 2.30.
May behave differently in some edge cases (e.g. pattern only at the
end of the template, the file name is not part of the error message),
but good enough in practice.
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Message-id: 1471545963-11720-2-git-send-email-silbe@linux.vnet.ibm.com
[PMM: removed variable "template" which caused compilation failures
when C++ files include glib-compat.h] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Michal Privoznik [Fri, 19 Aug 2016 08:06:40 +0000 (10:06 +0200)]
syscall.c: Redefine IFLA_* enums
In 9c37146782 I've tried to fix a broken build with older
linux-headers. However, I didn't do it properly. The solution
implemented here is to grab the enums that caused the problem
initially, and rename their values so that they are "QEMU_"
prefixed. In order to guarantee matching values with actual
enums from linux-headers, the enums are seeded with starting
values from the original enums.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-id: 75c14d6e8a97c4ff3931d69c13eab7376968d8b4.1471593869.git.mprivozn@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Denis V. Lunev [Wed, 17 Aug 2016 18:06:54 +0000 (21:06 +0300)]
block: fix possible reorder of flush operations
This patch reduce CPU usage of flush operations a bit. When we have one
flush completed we should kick only next operation. We should not start
all pending operations in the hope that they will go back to wait on
wait_queue.
Also there is a technical possibility that requests will get reordered
with the previous approach. After wakeup all requests are removed from
the wait queue. They become active and they are processed one-by-one
adding to the wait queue in the same order. Though new flush can arrive
while all requests are not put into the queue.
Signed-off-by: Denis V. Lunev <den@openvz.org> Tested-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com> Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
Message-id: 1471457214-3994-3-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Fam Zheng <famz@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Evgeny Yakovlev [Wed, 17 Aug 2016 18:06:53 +0000 (21:06 +0300)]
block: fix deadlock in bdrv_co_flush
The following commit
commit 3ff2f67a7c24183fcbcfe1332e5223ac6f96438c
Author: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
Date: Mon Jul 18 22:39:52 2016 +0300
block: ignore flush requests when storage is clean
has introduced a regression.
There is a problem that it is still possible for 2 requests to execute
in non sequential fashion and sometimes this results in a deadlock
when bdrv_drain_one/all are called for BDS with such stalled requests.
1. Current flushed_gen and flush_started_gen is 1.
2. Request 1 enters bdrv_co_flush to with write_gen 1 (i.e. the same
as flushed_gen). It gets past flushed_gen != flush_started_gen and
sets flush_started_gen to 1 (again, the same it was before).
3. Request 1 yields somewhere before exiting bdrv_co_flush
4. Request 2 enters bdrv_co_flush with write_gen 2. It gets past
flushed_gen != flush_started_gen and sets flush_started_gen to 2.
5. Request 2 runs to completion and sets flushed_gen to 2
6. Request 1 is resumed, runs to completion and sets flushed_gen to 1.
However flush_started_gen is now 2.
From here on out flushed_gen is always != to flush_started_gen and all
further requests will wait on flush_queue. This change replaces
flush_started_gen with an explicitly tracked active flush request.
Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org>
Message-id: 1471457214-3994-2-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Fam Zheng <famz@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>