Stuart Yoder [Mon, 28 Mar 2011 20:01:56 +0000 (15:01 -0500)]
KVM: PPC: use ticks, not usecs, for exit timing
Convert to microseconds when displaying
(with fix from Bharat Bhushan <Bharat.Bhushan@freescale.com>).
This reduces rounding error with large quantities of short exits.
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
Scott Wood [Mon, 28 Mar 2011 20:01:24 +0000 (15:01 -0500)]
KVM: PPC: fix exit accounting for SPRs, tlbwe, tlbsx
The exit type setting for mfspr/mtspr is moved from 44x to toplevel SPR
emulation. This enables it on e500, and makes sure that all SPRs
are covered.
Exit accounting for tlbwe and tlbsx is added to e500.
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
Scott Wood [Tue, 29 Mar 2011 21:49:10 +0000 (16:49 -0500)]
KVM: PPC: e500: emulate SVR
Return the actual host SVR for now, as we already do for PVR. Eventually
we may support Qemu overriding PVR/SVR if the situation is appropriate,
once we implement KVM_SET_SREGS on e500.
Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
Avi Kivity [Wed, 27 Apr 2011 16:42:18 +0000 (19:42 +0300)]
KVM: VMX: Cache vmcs segment fields
Since the emulator now checks segment limits and access rights, it
generates a lot more accesses to the vmcs segment fields. Undo some
of the performance hit by cacheing those fields in a read-only cache
(the entire cache is invalidated on any write, or on guest exit).
Jeff Mahoney [Wed, 27 Apr 2011 18:06:07 +0000 (14:06 -0400)]
KVM: ia64: fix sparse warnings
This patch fixes some sparse warning about "dubious one-bit signed bitfield."
Signed-off-by: Jeff Mahoney <jeffm@suse.com> Originally-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Jan Blunck <jblunck@suse.de> Acked-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
The CPUIDs for Centaur are added, and then the features of
PadLock hardware engine on VIA CPU, such as "ace", "ace_en"
and so on, can be passed into the kvm guest.
Signed-off-by: Brilly Wu <brillywu@viatech.com.cn> Signed-off-by: Kary Jin <karyjin@viatech.com.cn> Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Sun, 24 Apr 2011 09:25:50 +0000 (12:25 +0300)]
KVM: MMU: Add unlikely() annotations to walk_addr_generic()
walk_addr_generic() is a hot path and is also hard for the cpu to predict -
some of the parameters (fetch_fault in particular) vary wildly from
invocation to invocation.
Add unlikely() annotations where appropriate; all walk failures are
considered unlikely, as are cases where we have to mark the accessed or
dirty bit, as they are slow paths both in kvm and on real processors.
This patch makes the cmpxchg_gpte() function aware of the
difference between l1-gfns and l2-gfns when nested
virtualization is in use. This fixes a potential
data-corruption problem in the l1-guest and makes the code
work correct (at least as correct as the hardware which is
emulated in this code) again.
Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
The last_guest_tsc is used in vcpu_load to adjust the
tsc_offset since tsc-scaling is merged. So the
last_guest_tsc needs to be updated in vcpu_put instead of
the the last_host_tsc. This is fixed with this patch.
Reported-by: Jan Kiszka <jan.kiszka@web.de> Tested-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
KVM: SVM: Fix nested sel_cr0 intercept path with decode-assists
This patch fixes a bug in the nested-svm path when
decode-assists is available on the machine. After a
selective-cr0 intercept is detected the rip is advanced
unconditionally. This causes the l1-guest to continue
running with an l2-rip.
This bug was with the sel_cr0 unit-test on decode-assists
capable hardware.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Nelson Elhage [Wed, 13 Apr 2011 15:44:13 +0000 (11:44 -0400)]
KVM: x86 emulator: Handle wraparound in (cs_base + offset) when fetching insns
Currently, setting a large (i.e. negative) base address for %cs does not work on
a 64-bit host. The "JOS" teaching operating system, used by MIT and other
universities, relies on such segments while bootstrapping its way to full
virtual memory management.
Signed-off-by: Nelson Elhage <nelhage@ksplice.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Jeff Mahoney [Wed, 13 Apr 2011 01:30:17 +0000 (21:30 -0400)]
KVM: Fix off by one in kvm_for_each_vcpu iteration
This patch avoids gcc issuing the following warning when KVM_MAX_VCPUS=1:
warning: array subscript is above array bounds
kvm_for_each_vcpu currently checks to see if the index for the vcpu is
valid /after/ loading it. We don't run into problems because the address
is still inside the enclosing struct kvm and we never deference or write
to it, so this isn't a security issue.
The warning occurs when KVM_MAX_VCPUS=1 because the increment portion of
the loop will *always* cause the loop to load an invalid location since
++idx will always be > 0.
This patch moves the load so that the check occurs before the load and
we don't run into the compiler warning.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Avi Kivity <avi@redhat.com>
When doing a soft int, we need to bump eip before pushing it to
the stack. Otherwise we'll do the int a second time.
[apw@canonical.com: merged eip update as per Jan's recommendation.] Signed-off-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Avi Kivity <avi@redhat.com>
KVM: x86 emulator: Make emulate_push() store the value directly
PUSH emulation stores the value by calling writeback() after setting
the dst operand appropriately in emulate_push().
This writeback() using dst is not needed at all because we know the
target is the stack. So this patch makes emulate_push() call, newly
introduced, segmented_write() directly.
By this, many inlined writeback()'s are removed.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Tue, 12 Apr 2011 23:27:55 +0000 (01:27 +0200)]
KVM: VMX: Ensure that vmx_create_vcpu always returns proper error
In case certain allocations fail, vmx_create_vcpu may return 0 as error
instead of a negative value encoded via ERR_PTR. This causes a NULL
pointer dereferencing later on in kvm_vm_ioctl_vcpu_create.
Gleb Natapov [Thu, 31 Mar 2011 10:06:41 +0000 (12:06 +0200)]
KVM: emulator: do not needlesly sync registers from emulator ctxt to vcpu
Currently we sync registers back and forth before/after exiting
to userspace for IO, but during IO device model shouldn't need to
read/write the registers, so we can as well skip those sync points. The
only exaception is broken vmware backdor interface. The new code sync
registers content during IO only if registers are read from/written to
by userspace in the middle of the IO operation and this almost never
happens in practise.
Avi Kivity [Sun, 10 Apr 2011 08:40:42 +0000 (11:40 +0300)]
Merge remote branch 'upstream'
* upstream: (103 commits)
signal.c: fix erroneous syscall kernel-doc
watchdog: mpc8xxx_wdt: fix build
NFS: Change initial mount authflavor only when server returns NFS4ERR_WRONGSEC
fix build fail for hv_mouse indefine udelay
mm: avoid wrapping vm_pgoff in mremap()
x86/mrst/vrtc: Fix boot crash in mrst_rtc_init()
rtc, x86/mrst/vrtc: Fix boot crash in rtc_read_alarm()
x86-32, fpu: Fix FPU exception handling on non-SSE systems
staging: usbip: bugfix for isochronous packets and optimization
staging: usbip: bugfix add number of packets for isochronous frames
staging: usbip: bugfixes related to kthread conversion
staging: usbip: fix shutdown problems.
staging: hv: Fix GARP not sent after Quick Migration
NFS: Fix a signed vs. unsigned secinfo bug
x86, hibernate: Initialize mmu_cr4_features during boot
sh: select ARCH_NO_SYSDEV_OPS.
ARM: arch-shmobile: only run FSI init on respective boards
ARM: arch-shmobile: only run HDMI init on respective boards
Revert "net/sunrpc: Use static const char arrays"
ARM: mach-shmobile: Correctly check for CONFIG_MACH_MACKEREL
...
Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: Change initial mount authflavor only when server returns NFS4ERR_WRONGSEC
NFS: Fix a signed vs. unsigned secinfo bug
Revert "net/sunrpc: Use static const char arrays"
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
quota: Don't write quota info in dquot_commit()
ext3: Fix writepage credits computation for ordered mode
Peter Korsgaard [Wed, 30 Mar 2011 13:48:22 +0000 (15:48 +0200)]
watchdog: mpc8xxx_wdt: fix build
Since 1c48a5c93da6313 (dt: Eliminate of_platform_{,un}register_driver)
mpc8xxx_wdt no longer builds as it tries to refer to a 'match' variable
rather than ofdev->dev.of_match that it checks just before.
Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk> Acked-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
NFS: Change initial mount authflavor only when server returns NFS4ERR_WRONGSEC
When attempting an initial mount, we should only attempt other
authflavors if AUTH_UNIX receives a NFS4ERR_WRONGSEC error.
This allows other errors to be passed back to userspace programs.
Merge branch 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6
* 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6:
efifb: Add override for 11" Macbook Air 3,1
efifb: Support overriding fields FW tells us with the DMI data.
fb: Reduce priority of resource conflict message
savagefb: Remove obsolete else clause in savage_setup_i2c_bus
savagefb: Set up I2C based on chip family instead of card id
savagefb: Replace magic register address with define
drivers/video/bfin-lq035q1-fb.c: introduce missing kfree
video: s3c-fb: fix checkpatch errors and warning
efifb: support AMD Radeon HD 6490
s3fb: fix Virge/GX2
fbcon: Remove unused 'display *p' variable from fb_flashcursor()
fbdev: sh_mobile_lcdcfb: fix module lock acquisition
fbdev: sh_mobile_lcdcfb: add blanking support
viafb: initialize margins correct
viafb: refresh rate bug collection
sh: mach-ap325rxa: move backlight control code
sh: mach-ecovec24: support for main lcd backlight
Merge branch 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
ARM: arch-shmobile: only run FSI init on respective boards
ARM: arch-shmobile: only run HDMI init on respective boards
ARM: mach-shmobile: Correctly check for CONFIG_MACH_MACKEREL
Merge branches 'x86-fixes-for-linus', 'sched-fixes-for-linus', 'timers-fixes-for-linus', 'irq-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86-32, fpu: Fix FPU exception handling on non-SSE systems
x86, hibernate: Initialize mmu_cr4_features during boot
x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change
x86: visws: Fixup irq overhaul fallout
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Clean up rebalance_domains() load-balance interval calculation
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86/mrst/vrtc: Fix boot crash in mrst_rtc_init()
rtc, x86/mrst/vrtc: Fix boot crash in rtc_read_alarm()
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Fix cpumask leak in __setup_irq()
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf probe: Fix listing incorrect line number with inline function
perf probe: Fix to find recursively inlined function
perf probe: Fix multiple --vars options behavior
perf probe: Fix to remove redundant close
perf probe: Fix to ensure function declared file
Merge branch 'for-linus' of git://git.infradead.org/ubifs-2.6
* 'for-linus' of git://git.infradead.org/ubifs-2.6:
UBI: do not select KALLSYMS_ALL
UBI: do not compare array with NULL
UBI: check if we are in RO mode in the erase routine
UBIFS: fix debugging failure in dbg_check_space_info
UBIFS: fix error path in dbg_debugfs_init_fs
UBIFS: unify error path dbg_debugfs_init_fs
UBIFS: do not select KALLSYMS_ALL
UBIFS: fix assertion warnings
UBIFS: fix oops on error path in read_pnode
UBIFS: do not read flash unnecessarily
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: Add support for CH Pro Throttle
HID: hid-magicmouse: Increase evdev buffer size
HID: add FF support for Logitech G25/G27
HID: roccat: Add support for wireless variant of Pyra
HID: Fix typo Keyoutch -> Keytouch
HID: add support for Skycable 0x3f07 wireless presenter
The normal mmap paths all avoid creating a mapping where the pgoff
inside the mapping could wrap around due to overflow. However, an
expanding mremap() can take such a non-wrapping mapping and make it
bigger and cause a wrapping condition.
Noticed by Robert Swiecki when running a system call fuzzer, where it
caused a BUG_ON() due to terminally confusing the vma_prio_tree code. A
vma dumping patch by Hugh then pinpointed the crazy wrapped case.
Andre Przywara [Thu, 31 Mar 2011 14:58:49 +0000 (16:58 +0200)]
KVM: move and fix substitue search for missing CPUID entries
If KVM cannot find an exact match for a requested CPUID leaf, the
code will try to find the closest match instead of simply confessing
it's failure.
The implementation was meant to satisfy the CPUID specification, but
did not properly check for extended and standard leaves and also
didn't account for the index subleaf.
Beside that this rule only applies to CPUID intercepts, which is not
the only user of the kvm_find_cpuid_entry() function.
So fix this algorithm and call it from kvm_emulate_cpuid().
This fixes a crash of newer Linux kernels as KVM guests on
AMD Bulldozer CPUs, where bogus values were returned in response to
a CPUID intercept.
Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Andre Przywara [Wed, 30 Mar 2011 13:01:45 +0000 (15:01 +0200)]
KVM: fix XSAVE bit scanning
When KVM scans the 0xD CPUID leaf for propagating the XSAVE save area
leaves, it assumes that the leaves are contigious and stops at the
first zero one. On AMD hardware there is a gap, though, as LWP uses
leaf 62 to announce it's state save area.
So lets iterate through all 64 possible leaves and simply skip zero
ones to also cover later features.
Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Hans Rosenfeld [Wed, 6 Apr 2011 16:06:43 +0000 (18:06 +0200)]
x86-32, fpu: Fix FPU exception handling on non-SSE systems
On 32bit systems without SSE (that is, they use FSAVE/FRSTOR for FPU
context switches), FPU exceptions in user mode cause Oopses, BUGs,
recursive faults and other nasty things:
staging: usbip: bugfix for isochronous packets and optimization
For isochronous packets the actual_length is the sum of the actual
length of each of the packets, however between the packets might be
padding, so it is not sufficient to just send the first actual_length
bytes of the buffer. To fix this and simultanesouly optimize the
bandwidth the content of the isochronous packets are send without the
padding, the padding is restored on the receiving end.
staging: usbip: bugfix add number of packets for isochronous frames
The number_of_packets was not transmitted for RET_SUBMIT packets. The
linux client used the stored number_of_packet from the submitted
request. The windows userland client does not do this however and needs
to know the number_of_packets to determine the size of the transmission.
staging: usbip: bugfixes related to kthread conversion
When doing a usb port reset do a queued reset instead to prevent a
deadlock: the reset will cause the driver to unbind, causing the
usb_driver_lock_for_reset to stall.
staging: hv: Fix GARP not sent after Quick Migration
After Quick Migration, the network is not immediately operational in the
current context when receiving RNDIS_STATUS_MEDIA_CONNECT event. So, I added
another netif_notify_peers() into a scheduled work, otherwise GARP packet will
not be sent after quick migration, and cause network disconnection.
Thanks to Mike Surcouf <mike@surcouf.co.uk> for reporting the bug and
testing the patch.
Reported-by: Mike Surcouf <mike@surcouf.co.uk> Tested-by: Mike Surcouf <mike@surcouf.co.uk> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Hank Janssen <hjanssen@microsoft.com> Signed-off-by: Abhishek Kane <v-abkane@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
rpc_authflavor_t is cast from an unsigned int, but the
initial code tried to use it as a signed int. I fix
this by passing an rpc_authflavor_t pointer around, and
returning signed integers from functions.
thereby breaking resume from hibernate. This restores previous
functionality in approximately the same place, and corrects the
reading of %cr4 on pre-CPUID hardware (%cr4 exists if and only if
CPUID is supported.)
However, part of the problem is that the hibernate suspend/resume
sequence should manage the save/restore of %cr4 explicitly.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <201104020154.57136.rjw@sisk.pl>
ARM: arch-shmobile: only run FSI init on respective boards
If several boards are enabled in the kernel configuration,
fsi_init_pm_clock() functions from board-ap4evb.c
will run on any of them. Prevent this by calling these functions from the
.init_machine() callback instead of using device_initcall().
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Magnus Damm <damm@opensource.se> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
ARM: arch-shmobile: only run HDMI init on respective boards
If several boards are enabled in the kernel configuration,
hdmi_init_pm_clock() functions from board-ap4evb.c and board-mackerel.c
will run on any of them. Prevent this by calling these functions from the
.init_machine() callback instead of using device_initcall().
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Cc: Magnus Damm <damm@opensource.se> Tested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>