Kevin Wolf [Fri, 26 Aug 2011 13:27:13 +0000 (15:27 +0200)]
qemu-img: Require larger zero areas for sparse handling
By default, require 4k of consecutive zero bytes for qemu-img to make the
output file sparse by not issuing a write request for the zeroed parts. Add an
-S option to allow users to tune this setting.
This helps to avoid situations where a lot of zero sectors and data sectors are
mixed and qemu-img tended to issue many tiny 512 byte writes.
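As a rough illustration of the threshold idea (not the actual qemu-img code), the check can be sketched as below. The helper names and the byte-granular scan are assumptions made for this example; only the 4k default comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the zero bytes at the start of buf. */
size_t demo_leading_zeros(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    assert(buf != NULL);
    while (i < len && buf[i] == 0) {
        i++;
    }
    return i;
}

/*
 * Return 1 if the zero run at the start of buf is long enough to be
 * left unwritten, 0 if it should be written out like ordinary data.
 * min_sparse is the threshold in bytes (4096 models the new default).
 */
int demo_can_skip_write(const unsigned char *buf, size_t len, size_t min_sparse)
{
    return demo_leading_zeros(buf, len) >= min_sparse;
}
```

With a 4096 byte threshold, a lone 512 byte zero sector between data sectors is written like data instead of splitting the output into tiny requests.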
Avi Kivity [Mon, 29 Aug 2011 06:12:49 +0000 (09:12 +0300)]
memory: fix rom_device I/O mode
When adding a rom_device in I/O mode, we incorrectly masked off the low
bits, resulting in a pure RAM map. Fix by masking off the high bits and
IO_MEM_ROMD, yielding a pure I/O map.
clear interrupt request if the interrupt priority < CPU pil
clear hardware interrupt request if interrupts are disabled
Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com>
[blauwirbel@gmail.com: added a comment about magic 2] Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Blue Swirl [Sat, 30 Jul 2011 19:18:32 +0000 (19:18 +0000)]
TCG: improve optimizer debugging
Use enum TCGOpcode instead of plain old int so that the name of
current op can be seen in GDB. Add a default case to switch
so that GCC does not complain about unhandled enum cases.
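A minimal sketch of the pattern, assuming a made-up DemoOpcode enum in place of TCGOpcode: the parameter is typed as the enum so a debugger can print the symbolic name, and the default label keeps GCC's -Wswitch quiet about the cases that are not listed.

```c
#include <assert.h>

/* Hypothetical stand-in for TCGOpcode; the names are illustrative only. */
typedef enum DemoOpcode {
    DEMO_OP_MOV,
    DEMO_OP_ADD,
    DEMO_OP_SUB,
    DEMO_OP_NOP,
    DEMO_OP_COUNT   /* number of opcodes, not an opcode itself */
} DemoOpcode;

/* Return 1 for the arithmetic opcodes this pass handles, 0 otherwise. */
int demo_is_arith(DemoOpcode op)
{
    assert((int)op >= 0 && (int)op < DEMO_OP_COUNT);

    switch (op) {
    case DEMO_OP_ADD:
    case DEMO_OP_SUB:
        return 1;
    default:
        /* Opcodes not handled here; also silences -Wswitch. */
        return 0;
    }
}
```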
Gerd Hoffmann [Fri, 26 Aug 2011 09:16:10 +0000 (11:16 +0200)]
Fix linker scripts
Remove PROVIDE_HIDDEN and ONLY_IF_{RO,RW} from linker scripts to make
them work with older binutils versions. Fixes *-bsd-user build on
OpenBSD 4.9 which ships binutils 2.15.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Brad [Mon, 22 Aug 2011 20:39:59 +0000 (16:39 -0400)]
Fix build on OpenBSD with BSD userland emu and smartcard NSS enabled
The first issue is the hard coded POSIX Real Time extensions library in the
libcacard/Makefile. From looking at the code it doesn't seem this is necessary
anyway. Robert Relyea seems to think it most likely isn't necessary.
The second issue was the missing exclusion of the BSD userland binary
builds from the addition of this Makefile target for the smartcard NSS
code which breaks the builds if smartcard NSS support is enabled.
Pastebin clip of the build failure:
http://pastebin.com/raw.php?i=BLCKd3s6
Signed-off-by: Brad Smith <brad@comstyle.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Christoph Hellwig [Thu, 25 Aug 2011 06:26:10 +0000 (08:26 +0200)]
block: latency accounting
Account the total latency for read/write/flush requests. This allows
management tools to average it based on a snapshot of the nr ops
counters, which allows checking for SLAs or providing statistics.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
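The averaging a management tool might do can be sketched as follows. The DemoStats structure, its field names and the nanosecond unit are assumptions for this example, not the statistics interface itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of the counters for one request type. */
typedef struct DemoStats {
    uint64_t nr_ops;         /* completed operations so far */
    uint64_t total_time_ns;  /* accumulated latency so far (unit assumed) */
} DemoStats;

/*
 * Average latency of the operations completed between two snapshots.
 * Returns 0 if no operation completed in the interval.
 */
uint64_t demo_avg_latency_ns(const DemoStats *prev, const DemoStats *cur)
{
    uint64_t ops;

    assert(cur->nr_ops >= prev->nr_ops);
    ops = cur->nr_ops - prev->nr_ops;
    if (ops == 0) {
        return 0;
    }
    return (cur->total_time_ns - prev->total_time_ns) / ops;
}
```

Taking the difference between two snapshots yields the average over that interval only, which is what an SLA check would normally want.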
Michael S. Tsirkin [Thu, 11 Aug 2011 07:21:18 +0000 (10:21 +0300)]
vhost-net: cleanup host notifiers at last step
When the vhost notifier is disabled, the userspace handler runs
immediately: virtio_pci_set_host_notifier_internal might
call virtio_queue_notify_vq.
Since the VQ state and the tap backend state aren't
recovered yet, this causes
"Guest moved used index from XXX to YYY" assertions.
The solution is to split out host notifier handling
from vhost VQ setup and disable notifiers as our last step
when we stop vhost-net. For symmetry enable them first thing
on start.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jan Kiszka [Thu, 25 Aug 2011 09:10:13 +0000 (11:10 +0200)]
vga: Silence bogus gcc warning about uninitialized variables
Some gcc versions do not properly detect that all possible cases are
covered and base and size are always initialized. Please gcc by defining
a pseudo default case.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Christoph Hellwig [Thu, 25 Aug 2011 06:26:01 +0000 (08:26 +0200)]
block: explicit I/O accounting
Decouple the I/O accounting from bdrv_aio_readv/writev/flush and
make the hardware models call directly into the accounting helpers.
This means:
- we do not count internal requests from image formats in addition
to guest originating I/O
- we do not double count I/O ops if the device model handles it
chunk wise
- we only account I/O once it actually is done
- we can extend I/O accounting to synchronous or coroutine I/O easily
- we can implement I/O latency tracking easily (see the next patch)
I've converted the existing device model callers to the new model,
device models that are using synchronous I/O and weren't accounted
before haven't been updated yet. Also scsi hasn't been converted
to the end-to-end accounting as I want to defer that after the pending
scsi layer overhaul.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
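The shape of such explicit accounting, as a device model might use it, could look roughly like this. All names below are hypothetical and the sketch is self-contained; it only illustrates that a request is remembered at submission and counted at completion.

```c
#include <assert.h>
#include <stdint.h>

typedef enum DemoAcctType {
    DEMO_ACCT_READ,
    DEMO_ACCT_WRITE,
    DEMO_ACCT_FLUSH,
    DEMO_ACCT_MAX
} DemoAcctType;

/* Per-device counters, indexed by request type. */
typedef struct DemoAcctStats {
    uint64_t nr_ops[DEMO_ACCT_MAX];
    uint64_t nr_bytes[DEMO_ACCT_MAX];
} DemoAcctStats;

/* Carried by the device model from submission to completion. */
typedef struct DemoAcctCookie {
    uint64_t bytes;
    DemoAcctType type;
} DemoAcctCookie;

/* Called once per guest request when it is submitted; counts nothing yet. */
void demo_acct_start(DemoAcctCookie *cookie, uint64_t bytes, DemoAcctType type)
{
    assert(type < DEMO_ACCT_MAX);
    cookie->bytes = bytes;
    cookie->type = type;
}

/* Called when the request has completed; only now are the counters updated. */
void demo_acct_done(DemoAcctStats *stats, const DemoAcctCookie *cookie)
{
    stats->nr_ops[cookie->type]++;
    stats->nr_bytes[cookie->type] += cookie->bytes;
}
```

Because the device model makes one start/done pair per guest request, a request that is processed in several chunks, or that triggers internal image format I/O, is still counted once.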
Jamie Iles [Wed, 10 Aug 2011 14:18:42 +0000 (15:18 +0100)]
monitor: fix build breakage for !CONFIG_VNC
Commit c62f6d1 (monitor: fix build breakage with --disable-vnc)
conditionalised some VNC setup code but left an unused variable. Move
the variable into the conditional code to fix the build breakage.
Avi Kivity [Mon, 1 Aug 2011 08:04:39 +0000 (11:04 +0300)]
piix_pci: wrap memory update in a transaction
The code will remap all PAMs, even if just one is updated, resulting
in reduced performance. Wrap in a transaction to detect that those
other PAMs have not changed.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 24 Aug 2011 18:37:05 +0000 (21:37 +0300)]
ppc_oldworld, ppc_newworld: fix escc BAR related crash
ppc maps the escc mmio region both at a fixed offset (as a sysbus area) and as part of a PCI BAR.
This crashes, since a MemoryRegion may have only one parent. Use an alias so we have a separate
MemoryRegion for the BAR.
MORITA Kazutaka [Fri, 12 Aug 2011 12:33:15 +0000 (21:33 +0900)]
sheepdog: use coroutines
This makes the sheepdog block driver support bdrv_co_readv/writev
instead of bdrv_aio_readv/writev.
With this patch, Sheepdog network I/O becomes fully asynchronous. The
block driver yields back when send/recv returns EAGAIN, and is resumed
when the sheepdog network connection is ready for the operation.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Jan Kiszka [Wed, 24 Aug 2011 12:29:30 +0000 (14:29 +0200)]
pci: Error on PCI capability collisions
Nothing good can happen when we overlap capabilities. This may happen
when plugging in assigned devices or when device models contain bugs.
Detect the overlap and report it.
Based on qemu-kvm commit by Alex Williamson.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Don Dutile <ddutile@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
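One way to detect such an overlap is to track which config space bytes are already claimed. The sketch below is an assumption about how this could be done, with made-up names and a 256 byte config space; it is not the code from this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CONFIG_SIZE 256

/* Hypothetical per-device map: nonzero means the byte is already claimed. */
typedef struct DemoConfigMap {
    uint8_t used[DEMO_CONFIG_SIZE];
} DemoConfigMap;

/*
 * Claim [offset, offset + size) for a capability.
 * Returns 0 on success, -1 if the range is empty, out of bounds or
 * overlaps a previously added capability. Nothing is claimed on error.
 */
int demo_add_capability(DemoConfigMap *map, unsigned offset, unsigned size)
{
    unsigned i;

    assert(map != NULL);
    if (size == 0 || offset >= DEMO_CONFIG_SIZE ||
        size > DEMO_CONFIG_SIZE - offset) {
        return -1;
    }
    for (i = offset; i < offset + size; i++) {
        if (map->used[i]) {
            return -1;  /* collision with an existing capability */
        }
    }
    memset(&map->used[offset], 1, size);
    return 0;
}
```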
Isaku Yamahata [Fri, 5 Aug 2011 09:22:03 +0000 (18:22 +0900)]
pcie/slot: fix hotplug event
When slot status register is cleared, PCIDevice::exp.hpev_notify
needs to be cleared.
Otherwise, PCIDevice::exp.hpev_notify is never set to false, resulting
in no more hot plug events once one has been raised.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Alexander Graf [Tue, 23 Aug 2011 04:55:44 +0000 (06:55 +0200)]
PPC: E500: Set ESR values
When an exception occurs on BookE, we need to set ESR bits to expose
to the guest information on what exactly happened. Add the obvious ones.
Reported-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Alexander Graf [Tue, 23 Aug 2011 04:55:43 +0000 (06:55 +0200)]
PPC: E500: Inject SPE exception on invalid SPE access
When accessing an SPE instruction despite it being not available,
throw an SPE exception instead of an APU exception. That way the
guest knows what's going on and actually uses SPE.
Reported-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Alexander Graf [Tue, 23 Aug 2011 04:55:42 +0000 (06:55 +0200)]
PPC: E500: Add ESR bit definitions
The BookE spec specifies a number of ESR bits. Add defines for them
so we can use them later on.
Reported-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Peter Maydell [Tue, 23 Aug 2011 18:24:32 +0000 (19:24 +0100)]
hw/omap_gpmc: Don't try to map CS0 twice on reset
Remove a spurious second map of the OMAP GPMC CS0 region on reset.
This fixes an assertion failure when we try to add the region to
its container when it was already added. (The old code did not
complain about mismatched map/unmap calls, but the new MemoryRegion
implementation does.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Richard Henderson [Tue, 23 Aug 2011 17:43:32 +0000 (10:43 -0700)]
tcg: Update --enable-debug for TCG_OPF_NOT_PRESENT.
Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Avi Kivity [Sun, 14 Aug 2011 04:04:49 +0000 (07:04 +0300)]
posix-aio-compat: fix latency issues
In certain circumstances, posix-aio-compat can incur a lot of latency:
- threads are created by vcpu threads, so if vcpu affinity is set,
aio threads inherit vcpu affinity. This can cause many aio threads
to compete for one cpu.
- we can create up to max_threads (64) aio threads in one go; since a
pthread_create can take around 30μs, we have up to 2ms of cpu time
under a global lock.
Fix by:
- moving thread creation to the main thread, so we inherit the main
thread's affinity instead of the vcpu thread's affinity.
- if a thread is currently being created, and we need to create yet
another thread, let the thread being born create the new thread, reducing
the amount of time we spend under the main thread.
- drop the local lock while creating a thread (we may still hold the
global mutex, though)
Note this doesn't eliminate latency completely; scheduler artifacts or
lack of host cpu resources can still cause it. We may want pre-allocated
threads when this cannot be tolerated.
Thanks to Uli Obergfell of Red Hat for his excellent analysis and suggestions.
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Nicholas Thomas [Mon, 15 Aug 2011 09:00:34 +0000 (10:00 +0100)]
block/curl: Handle failed reads gracefully.
Current behaviour if a read fails is for the acb to not get finished.
This causes an infinite loop in bdrv_read_em (block.c). The read failure
never gets reported to the guest and if the error condition clears, the
process never recovers.
With this patch, when curl reports a failure we finish the acb as a
failure. This results in the guest receiving an I/O error (rather than
the read hanging indefinitely) and if the error condition subsequently
clears, retries work as expected.
The simplest test is to put an ISO on a web server you have control over
and open it with qemu-io. Then move the ISO out of the way and attempt
to read some data - you should see behaviour matching the above.
Signed-off-by: Nick Thomas <nick@bytemark.co.uk> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
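The behaviour change can be pictured with the simplified model below. The DemoACB type, the callback and the use of -EIO are assumptions for illustration; the real driver's structures differ.

```c
#include <assert.h>
#include <errno.h>

typedef void DemoCompletionFunc(void *opaque, int ret);

/* Hypothetical, much simplified asynchronous request. */
typedef struct DemoACB {
    DemoCompletionFunc *cb;
    void *opaque;
    int finished;
} DemoACB;

/* Example completion callback: store the result where opaque points. */
void demo_store_result(void *opaque, int ret)
{
    *(int *)opaque = ret;
}

/* Complete a request exactly once and report ret to its owner. */
void demo_acb_finish(DemoACB *acb, int ret)
{
    assert(!acb->finished);
    acb->finished = 1;
    acb->cb(acb->opaque, ret);
}

/*
 * Called when a transfer ends. transfer_ok == 0 models a failed read:
 * the request is still finished, but with an error, so a caller waiting
 * for completion is not left waiting forever.
 */
void demo_transfer_done(DemoACB *acb, int transfer_ok)
{
    demo_acb_finish(acb, transfer_ok ? 0 : -EIO);
}
```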
Kevin Wolf [Mon, 8 Aug 2011 12:09:12 +0000 (14:09 +0200)]
qemu-img: Use qemu_blockalign
Now that you can use cache=none for the output file in qemu-img, we should
properly align our buffers so that raw-posix doesn't have to use its (smaller)
bounce buffer.
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Philipp Hahn [Thu, 4 Aug 2011 17:22:10 +0000 (19:22 +0200)]
qcow2: Fix DEBUG_* compilation
Introducing BlockDriverState broke compiling qcow2 with DEBUG_ALLOC and
DEBUG_EXT defined.
Define a BdrvCheckResult structure locally which is now needed as the second
argument.
Also fix qcow2_read_extensions() needing BDRVQcowState.
Signed-off-by: Philipp Hahn <hahn@univention.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Stefan Hajnoczi [Thu, 4 Aug 2011 11:26:52 +0000 (12:26 +0100)]
block: add cache=directsync parameter to -drive
This patch adds -drive cache=directsync for O_DIRECT | O_SYNC host file
I/O with no disk write cache presented to the guest.
This mode is useful when guests may not be sending flushes when
appropriate and therefore leave data at risk in case of power failure.
When cache=directsync is used, write operations are only completed to
the guest when data is safely on disk.
This new mode is like cache=writethrough but it bypasses the host page
cache.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
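The relation between the two modes named above can be summarised in a small lookup. The structure and function below are hypothetical and cover only the modes this message describes; the option string would be passed as -drive cache=directsync.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical description of what a cache= mode implies on the host. */
typedef struct DemoCacheMode {
    int bypass_host_page_cache;  /* O_DIRECT-style I/O */
    int sync_writes;             /* writes complete only once on disk */
} DemoCacheMode;

/*
 * Fill *out for the given mode name.
 * Returns 0 on success, -1 for a name this sketch does not know.
 */
int demo_parse_cache(const char *name, DemoCacheMode *out)
{
    assert(name != NULL && out != NULL);

    if (strcmp(name, "writethrough") == 0) {
        out->bypass_host_page_cache = 0;
        out->sync_writes = 1;
        return 0;
    }
    if (strcmp(name, "directsync") == 0) {
        /* Like writethrough, but without the host page cache. */
        out->bypass_host_page_cache = 1;
        out->sync_writes = 1;
        return 0;
    }
    return -1;
}
```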
Peter A. G. Crosthwaite [Mon, 22 Aug 2011 08:15:25 +0000 (18:15 +1000)]
xilinx: removed microblaze_pic_init from xilinx.h
This is a microblaze target specific function that belongs outside
of xilinx.h (which is a collection of target independent device model
instantiator functions)
Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Jan Kiszka [Mon, 22 Aug 2011 16:35:25 +0000 (18:35 +0200)]
Replace qemu_system_cond with VCPU stop mechanism
We can express the VCPU thread wakeup with the stop mechanism, saving
both qemu_system_ready and the qemu_system_cond. For KVM threads, we can
just enter the main loop as long as the thread is stopped. The central
TCG thread is better held back before the loop as there can be side
effects of the services called even when all CPUs are stopped.
Creating VCPUs in stopped state will also be required for proper CPU
hotplugging support.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Jan Kiszka [Mon, 22 Aug 2011 17:12:12 +0000 (19:12 +0200)]
vga: Use linear mapping + dirty logging in chain 4 memory access mode
Most VGA memory access modes require MMIO handling as they demand weird
logic to get a byte from or into the video RAM. However, there is one
exception: chain 4 mode with all memory planes enabled for writing. This
mode actually allows linear mapping, which can then be combined with
dirty logging to accelerate KVM.
This patch accelerates specifically VBE accesses like they are used by
grub in graphical mode. Not only the standard VGA adapter benefits from
this, also vmware and spice in VGA mode.
CC: Gerd Hoffmann <kraxel@redhat.com> CC: Avi Kivity <avi@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Jan Kiszka [Mon, 22 Aug 2011 17:12:11 +0000 (19:12 +0200)]
vmware-vga: Eliminate vga_dirty_log_restart
After the conversion to the new Memory API, vga_dirty_log_restart became
seriously pointless. Remove it from vmware-vga and then finally drop
the service.
CC: Andrzej Zaborowski <balrogg@gmail.com> CC: Avi Kivity <avi@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>