Konrad Rzeszutek Wilk [Mon, 27 Feb 2012 01:08:48 +0000 (20:08 -0500)]
Merge branch 'stable/cpufreq-xen.v6.rebased' into uek2-merge
* stable/cpufreq-xen.v6.rebased:
[CPUFREQ] xen: governor for Xen hypervisor frequency scaling.
xen/enlighten: Expose MWAIT and MWAIT_LEAF if hypervisor OKs it.
Konrad Rzeszutek Wilk [Fri, 3 Feb 2012 21:03:20 +0000 (16:03 -0500)]
[CPUFREQ] xen: governor for Xen hypervisor frequency scaling.
This CPU freq governor leaves the frequency decision to the Xen hypervisor.
To do that the driver parses the Power Management data and uploads said
information to the Xen hypervisor. Then the Xen hypervisor can select the
proper Cx and Pxx states for the initial domain and all other domains.
To upload the information, this CPU frequency driver reads the Power Management (PM)
data (_Pxx and _Cx), which is populated in the 'struct acpi_processor' structure.
It simply reads the contents of that structure and passes them up to the Xen hypervisor.
For that to work we depend on the appropriate CPU frequency scaling driver
to do the heavy-lifting - so that the contents are correct.
Once it has been loaded, the CPU frequency governor also sets up a timer
to check if the ACPI ID count is different from the APIC ID count - which
can happen if the user chooses to use the dom0_max_vcpus argument. In such a case
a backup of the PM structure is used and uploaded to the hypervisor.
[v1-v2: Initial RFC implementations that were posted]
[v3: Changed the name to passthru suggested by Pasi Kärkkäinen <pasik@iki.fi>]
[v4: Added vCPU != pCPU support - aka dom0_max_vcpus support]
[v5: Cleaned up the driver, fix bug under Athlon XP]
[v6: Changed the driver to a CPU frequency governor]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
Konrad Rzeszutek Wilk [Tue, 14 Feb 2012 03:26:32 +0000 (22:26 -0500)]
xen/enlighten: Expose MWAIT and MWAIT_LEAF if hypervisor OKs it.
For the hypervisor to take advantage of the MWAIT support it needs
to extract from the ACPI _CST the register address. But the
hypervisor does not have the support to parse DSDT so it relies on
the initial domain (dom0) to parse the ACPI Power Management information
and push it up to the hypervisor. The pushing of the data is done
by the processor_harveset_xen module which parses the information that
the ACPI parser has graciously exposed in 'struct acpi_processor'.
For the ACPI parser to also expose the Cx states for MWAIT, we need
to expose the MWAIT capability (leaf 1). Furthermore we also need to
expose the MWAIT_LEAF capability (leaf 5) for cstate.c to properly
function.
The hypervisor could expose these flags when it traps the XEN_EMULATE_PREFIX
operations, but it can't do it since it needs to be backwards compatible.
Instead we choose to use the native CPUID to figure out if the MWAIT
capability exists and use the XEN_SET_PDC query hypercall to figure out
if the hypervisor wants us to expose the MWAIT_LEAF capability or not.
Note: The XEN_SET_PDC query was implemented in c/s 23783:
"ACPI: add _PDC input override mechanism".
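The capability check described above can be sketched roughly as follows. This is a userspace illustration, not the patch itself: the `sketch_` names are hypothetical, the ECX value is passed in rather than read with the CPUID instruction, and `hv_allows` stands in for whatever the XEN_SET_PDC query returns. The only hardware fact relied on is that CPUID leaf 1 reports MONITOR/MWAIT support in ECX bit 3.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1, ECX bit 3: MONITOR/MWAIT instructions are supported. */
#define SKETCH_CPUID1_ECX_MWAIT (1u << 3)

/* True if the native CPUID leaf 1 ECX value advertises MWAIT. */
bool sketch_cpu_has_mwait(uint32_t leaf1_ecx)
{
	return (leaf1_ecx & SKETCH_CPUID1_ECX_MWAIT) != 0;
}

/*
 * Expose the MWAIT leaf (leaf 5) only if the CPU has MWAIT and the
 * hypervisor agreed. hv_allows is a hypothetical stand-in for the
 * result of the XEN_SET_PDC query.
 */
bool sketch_expose_mwait_leaf(uint32_t leaf1_ecx, bool hv_allows)
{
	return sketch_cpu_has_mwait(leaf1_ecx) && hv_allows;
}
```

Keeping the two conditions separate mirrors the text: the native CPUID answers "does the hardware have it", the hypervisor answers "should the guest see it".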
With this in place, instead of
C3 ACPI IOPORT 415
we now get
C3:ACPI FFH INTEL MWAIT 0x20
Note: The cpu_idle which would be calling the mwait variants for idling
never gets set b/c we set the default pm_idle to be the hypercall variant.
Acked-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Fri, 24 Feb 2012 05:35:13 +0000 (00:35 -0500)]
Merge branch 'stable/not-upstreamed' into uek2-merge
* stable/not-upstreamed:
Xen: Export host physical CPU information to dom0
xen/mce: Change the machine check point
Add mcelog support from xen platform
When an MCE/CMCI error happens (or is found by polling), the related error
information is sent to the privileged pv-ops domain by Xen. This
patch fetches the Xen-logged information via hypercall
and then converts the Xen-format log into the Linux MCELOG format. This makes
it possible to use the mcelog tools currently available for native Linux.
With this patch, after the MCE/CMCI error log information is sent to the
pv-ops guest, running the mcelog tools in the guest gives the same
detailed decoded MCE information as on native Linux.
Nathanael Rensen [Tue, 7 Feb 2012 05:50:24 +0000 (13:50 +0800)]
usb: xen pvusb driver
Port the original Xen PV USB drivers developed by Noboru Iwamatsu
<n_iwamatsu@jp.fujitsu.com> to the Linux pvops kernel. The backend driver
resides in dom0 with access to the physical USB device. The frontend driver
resides in a domU to provide paravirtualised access to physical USB devices.
For usage, see http://wiki.xensource.com/xenwiki/XenUSBPassthrough.
Signed-off-by: Nathanael Rensen <nathanael@polymorpheus.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
Note: The approach in upstream is to get rid of the microcode
driver altogether and do the update at early bootup, even as early as
syslinux/pxeboot or the initial kernel image. But those patches
are not yet ready.
Konrad Rzeszutek Wilk [Fri, 24 Feb 2012 04:54:01 +0000 (23:54 -0500)]
Merge branch 'devel/acpi-s3.v4.rebased' into uek2-merge
* devel/acpi-s3.v4.rebased:
xen/pci:use hypercall PHYSDEVOP_restore_msi_ext to restore MSI/MSI-X vectors
xen/acpi/sleep: Register to the acpi_suspend_lowlevel a callback.
xen/acpi/sleep: Enable ACPI sleep via the __acpi_override_sleep
xen/acpi: Domain0 acpi parser related platform hypercall
xen: Utilize the restore_msi_irqs hook.
x86/acpi/sleep: Provide registration for acpi_suspend_lowlevel.
x86, acpi, tboot: Have a ACPI sleep override instead of calling tboot_sleep.
x86: Expand the x86_msi_ops to have a restore MSIs.
Konrad Rzeszutek Wilk [Fri, 24 Feb 2012 04:49:59 +0000 (23:49 -0500)]
Merge branch 'stable/processor-passthru.v5.rebased' into uek2-merge
* stable/processor-passthru.v5.rebased:
xen/processor-passthru: Provide an driver that passes struct acpi_processor data to the hypervisor.
xen/enlighten: Expose MWAIT and MWAIT_LEAF if hypervisor OKs it.
xen/setup/pm/acpi: Remove the call to boot_option_idle_override.
xen/acpi: Domain0 acpi parser related platform hypercall
xen/pm_idle: Make pm_idle be default_idle under Xen.
cpuidle: stop depending on pm_idle
cpuidle: replace xen access to x86 pm_idle and default_idle
cpuidle: create bootparam "cpuidle.off=1"
Konrad Rzeszutek Wilk [Fri, 24 Feb 2012 04:47:20 +0000 (23:47 -0500)]
xen/processor-passthru: Provide an driver that passes struct acpi_processor data to the hypervisor.
The ACPI processor driver processes the _Pxx and _Cx state information,
which is populated in the 'struct acpi_processor' per-cpu structure.
We read the contents of that structure and pass them up to the Xen hypervisor.
The ACPI processor driver along with the CPU freq driver does all the heavy-lifting
for us (filtering, calling ACPI functions, etc) so that the contents are correct.
After we are done parsing the information, we wait in case hotplug CPUs
get loaded and then pass that information to the hypervisor.
[v1-v2: Initial RFC implementations that were posted]
[v3: Changed the name to passthru suggested by Pasi Kärkkäinen <pasik@iki.fi>]
[v4: Added vCPU != pCPU support - aka dom0_max_vcpus support]
[v5: Cleaned up the driver, fix bug under Athlon XP]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
Konrad Rzeszutek Wilk [Tue, 14 Feb 2012 03:26:32 +0000 (22:26 -0500)]
xen/enlighten: Expose MWAIT and MWAIT_LEAF if hypervisor OKs it.
For the hypervisor to take advantage of the MWAIT support it needs
to extract from the ACPI _CST the register address. But the
hypervisor does not have the support to parse DSDT so it relies on
the initial domain (dom0) to parse the ACPI Power Management information
and push it up to the hypervisor. The pushing of the data is done
by the processor_harveset_xen module which parses the information that
the ACPI parser has graciously exposed in 'struct acpi_processor'.
For the ACPI parser to also expose the Cx states for MWAIT, we need
to expose the MWAIT capability (leaf 1). Furthermore we also need to
expose the MWAIT_LEAF capability (leaf 5) for cstate.c to properly
function.
The hypervisor could expose these flags when it traps the XEN_EMULATE_PREFIX
operations, but it can't do it since it needs to be backwards compatible.
Instead we choose to use the native CPUID to figure out if the MWAIT
capability exists and use the XEN_SET_PDC query hypercall to figure out
if the hypervisor wants us to expose the MWAIT_LEAF capability or not.
Note: The XEN_SET_PDC query was implemented in c/s 23783:
"ACPI: add _PDC input override mechanism".
With this in place, instead of
C3 ACPI IOPORT 415
we now get
C3:ACPI FFH INTEL MWAIT 0x20
Note: The cpu_idle which would be calling the mwait variants for idling
never gets set b/c we set the default pm_idle to be the hypercall variant.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Yu Ke [Wed, 24 Mar 2010 18:01:13 +0000 (11:01 -0700)]
xen/acpi: Domain0 acpi parser related platform hypercall
This patch implements the xen_platform_op hypercall, to pass the parsed
ACPI info to the hypervisor.
Signed-off-by: Yu Ke <ke.yu@intel.com>
Signed-off-by: Tian Kevin <kevin.tian@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
[v1: Added DEFINE_GUEST.. in appropriate headers]
[v2: Ripped out typedefs]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Mon, 21 Nov 2011 23:02:02 +0000 (18:02 -0500)]
xen/pm_idle: Make pm_idle be default_idle under Xen.
The idea behind commit d91ee5863b71 ("cpuidle: replace xen access to x86
pm_idle and default_idle") was to have one call - disable_cpuidle()
which would make pm_idle not be molested by other code. It disallows
cpuidle_idle_call to be set to pm_idle (which is excellent).
But in the select_idle_routine() and idle_setup(), the pm_idle can still
be set to either: amd_e400_idle, mwait_idle or default_idle. This
depends on some CPU flags (MWAIT) and in AMD case on the type of CPU.
In case of mwait_idle we can hit some instances where the hypervisor
(Amazon EC2 specifically) sets the MWAIT and we get:
In the case of amd_e400_idle we don't get such spectacular crashes, but we
do end up making an MSR access which is trapped in the hypervisor, and then
follow it up with a yield hypercall. Meaning we end up going to the
hypervisor twice instead of just once.
The previous behavior before v3.0 was that pm_idle was set to
default_idle regardless of select_idle_routine/idle_setup.
We want to do that, but only for one specific case: Xen. This patch
does that.
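The selection logic discussed in this message can be modelled in a few lines. The sketch below is a simplified userspace model, not the kernel code: the three idle functions are empty stand-ins that only share their names with the kernel routines, the `sketch_select_idle` helper and its parameters are hypothetical, and the order of the non-Xen checks is an assumption.

```c
#include <stdbool.h>

typedef void (*sketch_idle_fn)(void);

/* Empty stand-ins for the kernel idle routines named in the text. */
void sketch_default_idle(void) {}
void sketch_mwait_idle(void) {}
void sketch_amd_e400_idle(void) {}

/*
 * Pick an idle routine. Under Xen the choice is forced to the default
 * routine no matter what the CPU flags say; otherwise the CPU flags
 * decide (the order of these checks is an assumption of the sketch).
 */
sketch_idle_fn sketch_select_idle(bool on_xen, bool has_mwait, bool is_amd_e400)
{
	if (on_xen)
		return sketch_default_idle;
	if (has_mwait)
		return sketch_mwait_idle;
	if (is_amd_e400)
		return sketch_amd_e400_idle;
	return sketch_default_idle;
}
```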
Fixes RH BZ #739499 and Ubuntu #881076
Reported-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
cc: Kevin Hilman <khilman@deeprootsystems.com>
cc: Paul Mundt <lethal@linux-sh.org>
cc: x86@kernel.org
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Fri, 1 Apr 2011 22:28:35 +0000 (18:28 -0400)]
cpuidle: replace xen access to x86 pm_idle and default_idle
When a Xen Dom0 kernel boots on a hypervisor, it gets access
to the raw-hardware ACPI tables. While it parses the idle tables
for the hypervisor's benefit, it uses HLT for its own idle.
Rather than have xen scribble on pm_idle and access default_idle,
have it simply disable_cpuidle() so acpi_idle will not load and
architecture default HLT will be used.
cc: xen-devel@lists.xensource.com
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Jeremy Fitzhardinge [Sat, 28 Mar 2009 00:39:15 +0000 (17:39 -0700)]
xen: add CPU microcode update driver
Xen does all the hard work for us, including choosing the right update
method for this cpu type and actually doing it for all cpus. We just
need to supply it with the firmware blob.
Because Xen updates all CPUs (and the kernel's virtual cpu numbers have
no fixed relationship with the underlying physical cpus), we only bother
doing anything for cpu "0".
[ Impact: allow CPU microcode update in Xen dom0 ]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Yu Ke [Wed, 24 Mar 2010 18:01:13 +0000 (11:01 -0700)]
xen/acpi: Domain0 acpi parser related platform hypercall
This patch implements the xen_platform_op hypercall, to pass the parsed
ACPI info to the hypervisor.
Signed-off-by: Yu Ke <ke.yu@intel.com>
Signed-off-by: Tian Kevin <kevin.tian@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
[v1: Added DEFINE_GUEST.. in appropriate headers]
[v2: Ripped out typedefs]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Wed, 15 Feb 2012 15:48:39 +0000 (10:48 -0500)]
Merge branch 'stable/bug.fixes-3.3.rebased' into uek2-merge
* stable/bug.fixes-3.3.rebased:
xen pvhvm: do not remap pirqs onto evtchns if !xen_have_vector_callback
xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling while atomic.
xen/bootup: During bootup suppress XENBUS: Unable to read cpu state
The reason for this should be obvious from this call-chain:
cpu_bringup_and_idle:
\- cpu_bringup
| \-[preempt_disable]
|
|- cpu_idle
\- play_dead [assuming the user offlined the VCPU]
| \
| +- (xen_play_dead)
| \- HYPERVISOR_VCPU_off [so VCPU is dead, once user
| | onlines it starts from here]
| \- cpu_bringup [preempt_disable]
|
+- preempt_enable_no_reschedule()
+- schedule()
\- preempt_enable()
So we have two preempt_disable() calls and one preempt_enable() call. Calling
preempt_enable() after the cpu_bringup() in xen_play_dead
fixes the imbalance.
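The imbalance can be made concrete with a toy counter. The following is a userspace model only: the `model_` functions are hypothetical stand-ins that mimic the nesting in the call chain, and nothing here is actual kernel code. The idle loop expects exactly one outstanding disable when it is entered.

```c
/* Toy model of the per-CPU preempt counter; not kernel code. */
int model_preempt_count;

void model_preempt_disable(void) { model_preempt_count++; }
void model_preempt_enable(void)  { model_preempt_count--; }

/* cpu_bringup() disables preemption once every time it runs. */
void model_cpu_bringup(void) { model_preempt_disable(); }

/*
 * Boot a VCPU, offline it, then online it again.
 * Returns the counter value the idle loop sees afterwards.
 */
int model_offline_online_cycle(int apply_fix)
{
	model_preempt_count = 0;
	model_cpu_bringup();            /* initial bringup: count = 1 */
	/* play_dead: the VCPU goes down, later the user onlines it */
	model_cpu_bringup();            /* second disable: count = 2 */
	if (apply_fix)
		model_preempt_enable();     /* the added call: back to 1 */
	return model_preempt_count;
}
```

Without the fix the counter is 2, so the single preempt_enable_no_resched() before schedule() still leaves preemption disabled, which is what triggers "scheduling while atomic".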
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Wed, 1 Feb 2012 21:07:41 +0000 (16:07 -0500)]
xen/bootup: During bootup suppress XENBUS: Unable to read cpu state
When the initial domain starts, it prints (depending on the
number of CPUs) a slew of
XENBUS: Unable to read cpu state
XENBUS: Unable to read cpu state
XENBUS: Unable to read cpu state
XENBUS: Unable to read cpu state
which provide no useful information - as the error is a valid
issue - but not on the initial domain. The reason is that the
XenStore is not accessible at that time (it is after all the
first guest) so the CPU hotplug watch cannot parse "availability/cpu"
attribute.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Thu, 2 Feb 2012 20:03:05 +0000 (15:03 -0500)]
Merge branch 'stable/for-linus-3.3.rebased' into uek2-merge
* stable/for-linus-3.3.rebased: (39 commits)
Merge conflict resolved. Somehow the letter 's' slipped in the Makefile. This fixes the compile issues.
xen/events: BUG() when we can't allocate our event->irq array.
xen/granttable: Disable grant v2 for HVM domains.
xen-blkfront: Use kcalloc instead of kzalloc to allocate array
xen/pciback: Expand the warning message to include domain id.
xen/pciback: Fix "device has been assigned to X domain!" warning
xen/xenbus: don't reimplement kvasprintf via a fixed size buffer
xenbus: maximum buffer size is XENSTORE_PAYLOAD_MAX
xen/xenbus: Reject replies with payload > XENSTORE_PAYLOAD_MAX.
Xen: consolidate and simplify struct xenbus_driver instantiation
xen-gntalloc: introduce missing kfree
xen/xenbus: Fix compile error - missing header for xen_initial_domain()
xen/netback: Enable netback on HVM guests
xen/grant-table: Support mappings required by blkback
xenbus: Use grant-table wrapper functions
xenbus: Support HVM backends
xen/xenbus-frontend: Fix compile error with randconfig
xen/xenbus-frontend: Make error message more clear
xen/privcmd: Remove unused support for arch specific privcmp mmap
xen: Add xenbus_backend device
...
Thomas Meyer [Tue, 29 Nov 2011 21:08:00 +0000 (22:08 +0100)]
xen-blkfront: Use kcalloc instead of kzalloc to allocate array
The advantage of kcalloc is that it prevents integer overflows which could
result from multiplying the number of elements by the element size, and it is
also a bit nicer to read.
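To illustrate the overflow concern in userspace terms, here is a minimal sketch of an array allocator that refuses a request whose byte count would wrap. `sketch_array_alloc` is a hypothetical helper written for this example; it is not the kernel's kcalloc, only an analogue of the same idea built on the C library.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate zeroed storage for n elements of 'size' bytes each.
 * Returns NULL instead of a too-small buffer when n * size would
 * overflow size_t.
 */
void *sketch_array_alloc(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;        /* n * size would wrap around */
	return calloc(n, size);
}
```

With an open-coded `kzalloc(n * size, ...)` the wrapped product would silently yield a short buffer; taking the two factors separately lets the allocator reject the request.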
The semantic patch that makes this change is available
in https://lkml.org/lkml/2011/11/25/107
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
[v1: Separated the drivers/block/cciss_scsi.c out of this patch]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Wed, 4 Jan 2012 19:16:45 +0000 (14:16 -0500)]
xen/pciback: Expand the warning message to include domain id.
When a PCI device is transferred to another domain and it is still
in use (from the internal perspective), mention which other
domain is using it to aid in debugging.
[v2: Truncate the verbose message per Jan Beulich's suggestion]
[v3: Suggestions from Ian Campbell on the wording]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Konrad Rzeszutek Wilk [Wed, 4 Jan 2012 20:11:02 +0000 (15:11 -0500)]
xen/pciback: Fix "device has been assigned to X domain!" warning
The full warning is:
"pciback 0000:05:00.0: device has been assigned to 2 domain! Over-writting the ownership, but beware."
which is correct - the previous domain that was using the device
forgot to unregister the ownership. This patch fixes this by
calling the unregister ownership function when the PCI device is
relinquished from the guest domain.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ian Campbell [Wed, 4 Jan 2012 11:39:51 +0000 (11:39 +0000)]
xenbus: maximum buffer size is XENSTORE_PAYLOAD_MAX
Use this now that it is defined even though it happens to be == PAGE_SIZE.
The code which takes requests from userspace already validates against the size
of this buffer so no further checks are required to ensure that userspace
requests comply with the protocol in this respect.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Haogang Chen <haogangchen@gmail.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ian Campbell [Wed, 4 Jan 2012 09:34:49 +0000 (09:34 +0000)]
xen/xenbus: Reject replies with payload > XENSTORE_PAYLOAD_MAX.
Haogang Chen found out that:
There is a potential integer overflow in process_msg() that could result
in cross-domain attack.
body = kmalloc(msg->hdr.len + 1, GFP_NOIO | __GFP_HIGH);
When a malicious guest passes 0xffffffff in msg->hdr.len, the subsequent
call to xb_read() would write to a zero-length buffer.
The other end of this connection is always the xenstore backend daemon
so there is no guest (malicious or otherwise) which can do this. The
xenstore daemon is a trusted component in the system.
However this seems like a reasonable robustness improvement so we should
have it.
And Ian, when reading the API docs, found that:
The payload length (len field of the header) is limited to 4096
(XENSTORE_PAYLOAD_MAX) in both directions. If a client exceeds the
limit, its xenstored connection will be immediately killed by
xenstored, which is usually catastrophic from the client's point of
view. Clients (particularly domains, which cannot just reconnect)
should avoid this.
so this patch checks against that instead.
This also avoids a potential integer overflow pointed out by Haogang Chen.
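A small sketch shows both halves of the argument: why `len + 1` is dangerous for a 32-bit length, and how a check against the protocol limit rules the case out. The helper names are hypothetical; the 4096 limit is the XENSTORE_PAYLOAD_MAX value quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Protocol limit on the payload length, as quoted from the API docs. */
#define SKETCH_XENSTORE_PAYLOAD_MAX 4096u

/* True if computing len + 1 in 32 bits wraps (e.g. len == 0xffffffff). */
bool sketch_alloc_size_wraps(uint32_t len)
{
	uint32_t alloc = len + 1u;      /* 0xffffffff + 1 becomes 0 */

	return alloc < len;
}

/* True if a reply with this payload length may be accepted. */
bool sketch_payload_len_ok(uint32_t len)
{
	return len <= SKETCH_XENSTORE_PAYLOAD_MAX;
}
```

Since every accepted length is at most 4096, `len + 1` can no longer wrap, so the single bound check covers both the protocol rule and the overflow.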
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Haogang Chen <haogangchen@gmail.com>
CC: stable@kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jan Beulich [Thu, 22 Dec 2011 09:08:13 +0000 (09:08 +0000)]
Xen: consolidate and simplify struct xenbus_driver instantiation
The 'name', 'owner', and 'mod_name' members are redundant with the
identically named fields in the 'driver' sub-structure. Rather than
switching each instance to specify these fields explicitly, introduce
a macro to simplify this.
Eliminate further redundancy by allowing the drvname argument to
DEFINE_XENBUS_DRIVER() to be blank (in which case the first entry from
the ID table will be used for .driver.name).
Also eliminate the questionable xenbus_register_{back,front}end()
wrappers - their sole remaining purpose was the checking of the
'owner' field, proper setting of which shouldn't be an issue anymore
when the macro gets used.
v2: Restore DRV_NAME for the driver name in xen-pciback.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Thu, 2 Feb 2012 19:21:59 +0000 (14:21 -0500)]
Merge branch 'stable/xen-pciback-0.6.3.bugfixes' into stable/for-linus-3.3.rebased
* stable/xen-pciback-0.6.3.bugfixes: (22 commits)
xen/pciback: Check if the device is found instead of blindly assuming so.
xen/pciback: Do not dereference psdev during printk when it is NULL.
xen/pciback: double lock typo
xen/pciback: use mutex rather than spinlock in vpci backend
xen/pciback: Use mutexes when working with Xenbus state transitions.
xen/pciback: miscellaneous adjustments
xen/pciback: use mutex rather than spinlock in passthrough backend
xen/pciback: use resource_size()
xen/pciback: remove duplicated #include
xen/pciback: Have 'passthrough' option instead of XEN_PCIDEV_BACKEND_PASS and XEN_PCIDEV_BACKEND_VPCI
xen/pciback: Remove the DEBUG option.
xen/pciback: Drop two backends, squash and cleanup some code.
xen/pciback: Print out the MSI/MSI-X (PIRQ) values
xen/pciback: Don't setup an fake IRQ handler for SR-IOV devices.
xen: rename pciback module to xen-pciback.
xen/pciback: Fine-grain the spinlocks and fix BUG: scheduling while atomic cases.
xen/pciback: Allocate IRQ handler for device that is shared with guest.
xen/pciback: Disable MSI/MSI-X when reseting a device
xen/pciback: guest SR-IOV support for PV guest
xen/pciback: Register the owner (domain) of the PCI device.
...
Konrad Rzeszutek Wilk [Thu, 2 Feb 2012 19:20:07 +0000 (14:20 -0500)]
Merge branch 'stable/xen-block.rebase' into stable/for-linus-3.3.rebased
* stable/xen-block.rebase: (21 commits)
xen-blkback: Don't disconnect backend until state switched to XenbusStateClosed.
block: xen-blkback: use API provided by xenbus module to map rings
xen-blkback: convert hole punching to discard request on loop devices
xen/blkback: Move processing of BLKIF_OP_DISCARD from dispatch_rw_block_io
xen/blk[front|back]: Enhance discard support with secure erasing support.
xen/blk[front|back]: Squash blkif_request_rw and blkif_request_discard together
xen/blkback: Fix two races in the handling of barrier requests.
xen/blkback: Check for proper operation.
xen/blkback: Fix the inhibition to map pages when discarding sector ranges.
xen/blkback: Report VBD_WSECT (wr_sect) properly.
xen/blkback: Support 'feature-barrier' aka old-style BARRIER requests.
xen-blkfront: plug device number leak in xlblk_init() error path
xen-blkfront: If no barrier or flush is supported, use invalid operation.
xen-blkback: use kzalloc() in favor of kmalloc()+memset()
xen-blkback: fixed indentation and comments
xen-blkfront: fix a deadlock while handling discard response
xen-blkfront: Handle discard requests.
xen-blkback: Implement discard requests ('feature-discard')
xen-blkfront: add BLKIF_OP_DISCARD and discard request struct
xen/blkback: Add module alias for autoloading
...
Julia Lawall [Fri, 23 Dec 2011 17:39:29 +0000 (18:39 +0100)]
xen-gntalloc: introduce missing kfree
Error handling code following a kmalloc should free the allocated data.
Out_unlock is used on both success and failure, so free vm_priv before
jumping to that label.
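The pattern being fixed looks roughly like the sketch below. It is a userspace illustration with hypothetical names (`sketch_priv`, `sketch_mmap`, the `fail_lookup` flag) and with malloc/free standing in for the kernel allocator; the locking is only indicated in comments.

```c
#include <errno.h>
#include <stdlib.h>

struct sketch_priv {
	int count;
};

/*
 * On success, ownership of the allocation passes to *owner.
 * On failure, the allocation is freed before taking the shared exit.
 */
int sketch_mmap(int fail_lookup, struct sketch_priv **owner)
{
	int rv = 0;
	struct sketch_priv *vm_priv = malloc(sizeof(*vm_priv));

	*owner = NULL;
	if (!vm_priv)
		return -ENOMEM;

	/* the lock would be taken here */
	if (fail_lookup) {
		rv = -ENOENT;
		free(vm_priv);          /* the fix: free before the jump */
		goto out_unlock;
	}
	vm_priv->count = 1;
	*owner = vm_priv;

out_unlock:
	/* the lock would be released here, on success and on failure */
	return rv;
}
```

Because the label is shared by both outcomes, the free cannot live after it; it has to sit on the failing branch.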
A simplified version of the semantic match that finds the problem is as
follows: (http://coccinelle.lip6.fr)
// <smpl>
@r exists@
local idexpression x;
statement S;
identifier f1;
position p1,p2;
expression *ptr != NULL;
@@
x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...);
...
if (x == NULL) S
<... when != x
when != if (...) { <+...x...+> }
x->f1
...>
(
return \(0\|<+...x...+>\|ptr\);
|
return@p2 ...;
)
Daniel De Graaf [Wed, 14 Dec 2011 20:12:13 +0000 (15:12 -0500)]
xen/netback: Enable netback on HVM guests
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Daniel De Graaf [Wed, 14 Dec 2011 20:12:11 +0000 (15:12 -0500)]
xen/grant-table: Support mappings required by blkback
Add support for mappings without GNTMAP_contains_pte. This was not
supported because the unmap operation assumed that this flag was being
used; adding a parameter to the unmap operation to allow the PTE
clearing to be disabled is sufficient to make unmap capable of
supporting either mapping type.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
[v1: Fix cleanpatch warnings]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Daniel De Graaf [Wed, 14 Dec 2011 20:12:10 +0000 (15:12 -0500)]
xenbus: Use grant-table wrapper functions
For xenbus_{map,unmap}_ring to work on HVM, the grant table operations
must be set up using the gnttab_set_{map,unmap}_op functions instead of
directly populating the fields of gnttab_map_grant_ref. These functions
simply populate the structure on paravirtualized Xen; however, on HVM
they must call __pa() on vaddr when populating op->host_addr because the
hypervisor cannot directly interpret guest-virtual addresses.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
[v1: Fixed cleanpatch error]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Daniel De Graaf [Mon, 19 Dec 2011 19:55:14 +0000 (14:55 -0500)]
xenbus: Support HVM backends
Add HVM implementations of xenbus_(map,unmap)_ring_v(alloc,free) so
that ring mappings can be done without using GNTMAP_contains_pte which
is not supported on HVM. This also removes the need to use vmlist_lock
on PV by tracking the allocated xenbus rings.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
[v1: Fix compile error when XENBUS_FRONTEND is defined as module]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Thu, 2 Feb 2012 19:16:32 +0000 (14:16 -0500)]
Merge branch 'stable/drivers-3.2.rebased' into stable/for-linus-3.3.rebased
* stable/drivers-3.2.rebased:
xen: use static initializers in xen-balloon.c
xenbus: don't rely on xen_initial_domain to detect local xenstore
xenbus: Fix loopback event channel assuming domain 0
xen/pv-on-hvm:kexec: Fix implicit declaration of function 'xen_hvm_domain'
xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel
xen/pv-on-hvm kexec: update xs_wire.h:xsd_sockmsg_type from xen-unstable
xen/pv-on-hvm kexec+kdump: reset PV devices in kexec or crash kernel
xen/pv-on-hvm kexec: rebind virqs to existing eventchannel ports
xen/pv-on-hvm kexec: prevent crash in xenwatch_thread() when stale watch events arrive
Konrad Rzeszutek Wilk [Thu, 2 Feb 2012 19:12:34 +0000 (14:12 -0500)]
Merge branch 'stable/vmalloc-3.2.rebased' into stable/for-linus-3.3.rebased
* stable/vmalloc-3.2.rebased:
xen: map foreign pages for shared rings by updating the PTEs directly
net: xen-netback: use API provided by xenbus module to map rings
block: xen-blkback: use API provided by xenbus module to map rings
xen: use generic functions instead of xen_{alloc, free}_vm_area()
Konrad Rzeszutek Wilk [Thu, 2 Feb 2012 19:11:52 +0000 (14:11 -0500)]
Merge branch 'stable/bug.fixes-3.2.rebased' into stable/for-linus-3.3.rebased
* stable/bug.fixes-3.2.rebased:
xen: Remove hanging references to CONFIG_XEN_PLATFORM_PCI
xen/irq: If we fail during msi_capability_init return proper error code.
xen: remove XEN_PLATFORM_PCI config option
xen: XEN_PVHVM depends on PCI
xen/p2m/debugfs: Make type_name more obvious.
xen/p2m/debugfs: Fix potential pointer exception.
xen/enlighten: Fix compile warnings and set cx to known value.
xen/xenbus: Remove the unnecessary check.
xen/events: Don't check the info for NULL as it is already done.
xen/pci: Use 'acpi_gsi_to_irq' value unconditionally.
xen/pci: Remove 'xen_allocate_pirq_gsi'.
xen/pci: Retire unnecessary #ifdef CONFIG_ACPI
xen/pci: Move the allocation of IRQs when there are no IOAPIC's to the end
xen/pci: Squash pci_xen_initial_domain and xen_setup_pirqs together.
xen/pci: Use the xen_register_pirq for HVM and initial domain users
xen/pci: In xen_register_pirq bind the GSI to the IRQ after the hypercall.
xen/pci: Provide #ifdef CONFIG_ACPI to easy code squashing.
xen/pci: Update comments and fix empty spaces.
xen/pci: Shuffle code around.
Konrad Rzeszutek Wilk [Mon, 19 Dec 2011 20:08:15 +0000 (15:08 -0500)]
xen/xenbus-frontend: Fix compile error with randconfig
drivers/xen/xenbus/xenbus_dev_frontend.c: In function 'xenbus_init':
drivers/xen/xenbus/xenbus_dev_frontend.c:609:2: error: implicit declaration of function 'xen_domain'
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Bastian Blank [Sat, 10 Dec 2011 18:29:48 +0000 (19:29 +0100)]
xen: Add xenbus_backend device
Access for xenstored to the event channel and pre-allocated ring is
managed via xenfs. This adds its own character device featuring mmap
for the ring and an ioctl for the event channel.
Signed-off-by: Bastian Blank <waldi@debian.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Bastian Blank [Fri, 16 Dec 2011 16:34:33 +0000 (11:34 -0500)]
xen: Add privcmd device driver
Access to arbitrary hypercalls is currently provided via xenfs. This
adds a standard character device to handle this. The support in xenfs
remains for backward compatibility and uses the device driver code.
Signed-off-by: Bastian Blank <waldi@debian.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
Daniel De Graaf [Mon, 28 Nov 2011 16:49:11 +0000 (11:49 -0500)]
xen/gntalloc: fix reference counts on multi-page mappings
When a multi-page mapping of gntalloc is created, the reference counts
of all pages in the vma are incremented. However, the vma open/close
operations only adjusted the reference count of the first page in the
mapping, leaking the other pages. Store a struct in the vm_private_data
to track the original page count to properly free the pages when the
last reference to the vma is closed.
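The bookkeeping described above can be sketched as a small userspace model. All names are hypothetical (`sketch_vma_priv`, `sketch_mmap_pages`, and so on), and plain integers stand in for the per-page reference counts; the point is only that the private struct remembers how many pages the mapping covers, so the last close can release every one of them.

```c
#define SKETCH_MAX_PAGES 8

/* Stored in the vma's private data in this model. */
struct sketch_vma_priv {
	int users;                          /* vmas sharing this mapping */
	int count;                          /* pages in the mapping */
	int page_refs[SKETCH_MAX_PAGES];    /* per-page reference counts */
};

/* Creating the mapping takes one reference on every page. */
void sketch_mmap_pages(struct sketch_vma_priv *priv, int count)
{
	int i;

	if (count > SKETCH_MAX_PAGES)
		count = SKETCH_MAX_PAGES;
	priv->users = 1;
	priv->count = count;
	for (i = 0; i < count; i++)
		priv->page_refs[i] = 1;
}

/* A vma open only bumps the user count. */
void sketch_vma_open(struct sketch_vma_priv *priv)
{
	priv->users++;
}

/* The last close drops the reference on all pages; returns how many. */
int sketch_vma_close(struct sketch_vma_priv *priv)
{
	int i, dropped = 0;

	if (--priv->users > 0)
		return 0;
	for (i = 0; i < priv->count; i++) {
		priv->page_refs[i]--;
		dropped++;
	}
	return dropped;
}
```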
Reported-by: Anil Madhavapeddy <anil@recoil.org>
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Daniel De Graaf [Mon, 28 Nov 2011 16:49:10 +0000 (11:49 -0500)]
xen/gntalloc: release grant references on page free
gnttab_end_foreign_access_ref does not return the grant reference it is
passed to the free list; gnttab_free_grant_reference needs to be
explicitly called. While gnttab_end_foreign_access provides a wrapper
for this, it is unsuitable because it does not return errors.
Reported-by: Anil Madhavapeddy <anil@recoil.org>
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Daniel De Graaf [Mon, 28 Nov 2011 16:49:09 +0000 (11:49 -0500)]
xen/events: prevent calling evtchn_get on invalid channels
The event channel number provided to evtchn_get can be provided by
userspace, so needs to be checked against the maximum number of event
channels prior to using it to index into evtchn_to_irq.
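The check amounts to validating an untrusted index before the table lookup. A minimal userspace sketch, assuming a hypothetical table size of 1024 entries and `sketch_` names that are not the kernel's:

```c
#include <errno.h>

/* Hypothetical table size; the kernel uses its own maximum. */
#define SKETCH_NR_EVENT_CHANNELS 1024u

/* Maps an event channel number to an IRQ number in this model. */
int sketch_evtchn_to_irq[SKETCH_NR_EVENT_CHANNELS];

/*
 * Look up the IRQ for a channel number that may come from userspace.
 * Out-of-range values are rejected before they can index the table.
 */
int sketch_evtchn_get(unsigned int evtchn)
{
	if (evtchn >= SKETCH_NR_EVENT_CHANNELS)
		return -EINVAL;
	return sketch_evtchn_to_irq[evtchn];
}
```

Taking the channel as an unsigned value means a single comparison also covers what would otherwise be negative inputs.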
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Annie Li [Mon, 12 Dec 2011 10:15:07 +0000 (18:15 +0800)]
xen/granttable: Support transitive grants
These allow a domain A which has been granted access on a page of domain B's
memory to issue domain C with a copy-grant on the same page. This is useful
e.g. for forwarding packets between domains.
Signed-off-by: Annie Li <annie.li@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Annie Li [Mon, 12 Dec 2011 10:14:42 +0000 (18:14 +0800)]
xen/granttable: Support sub-page grants
- They can't be used to map the page (so can only be used in a GNTTABOP_copy
hypercall).
- It's possible to grant access with a finer granularity than whole pages.
- Xen guarantees that they can be revoked quickly (a normal map grant can
only be revoked with the cooperation of the domain which has been granted
access).
Signed-off-by: Annie Li <annie.li@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Luck, Tony [Wed, 30 Nov 2011 18:22:37 +0000 (10:22 -0800)]
xen/ia64: fix build breakage because of conflicting u64 guest handles
include/xen/interface/xen.h:526: error: conflicting types for ‘__guest_handle_u64’
arch/ia64/include/asm/xen/interface.h:74: error: previous declaration of ‘__guest_handle_u64’ was here
Problem introduced by "xen/granttable: Introducing grant table V2 stucture"
which added a new definition to include/xen/interface/xen.h for "u64".
Fix: delete the ia64 arch specific definition.
Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Annie Li [Tue, 22 Nov 2011 01:59:56 +0000 (09:59 +0800)]
xen/granttable: Keep code format clean
Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Annie Li <annie.li@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Annie Li [Tue, 22 Nov 2011 01:59:21 +0000 (09:59 +0800)]
xen/granttable: Grant tables V2 implementation
Receiver-side copying of packets is based on this implementation; it gives
better performance and better CPU accounting. Grant table V2 supports three
types of grant: full-page, sub-page and transitive.
However, this patch does not cover sub-page and transitive grants. It mainly
focuses on the full-page part and implements the grant table V2 interfaces
corresponding to what already exists in grant table V1, such as grant table V2
initialization, mapping, releasing and exported interfaces.
Each guest can use only one grant table version; every entry in the grant
table must be of the same version. The version (V1 or V2) must be set before
initializing the grant table.
The exported grant table interfaces of V2 are the same as those of V1; Xen is
responsible for determining which grant table version a guest is using in every
grant operation.
V2 fulfills the same role as V1 and is fully backwards compatible with V1.
If dom0 supports grant table V2, the guests running on it can use either V1
or V2.
Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Annie Li <annie.li@oracle.com>
[v1: Modified alloc_vm_area call (new parameters), indentation, and cleanpatch
warnings] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Annie Li [Tue, 22 Nov 2011 01:58:47 +0000 (09:58 +0800)]
xen/granttable: Refactor some code
Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Annie Li <annie.li@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Annie Li [Tue, 22 Nov 2011 01:58:06 +0000 (09:58 +0800)]
xen/granttable: Introducing grant table V2 stucture
This patch introduces the new structures of grant table V2, which is an
extension of V1. The grant table is shared between the guest and Xen, and Xen
is responsible for the corresponding work in grant operations, such as
determining the guest's grant table version and performing different actions
based on that version. Although the full-page structure of V2 is different
from that of V1, it plays the same role as in V1.
Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Annie Li <annie.li@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jeremy Fitzhardinge [Fri, 18 Nov 2011 23:56:06 +0000 (15:56 -0800)]
Xen: update MAINTAINER info
No longer at Citrix, still interested in Xen.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Daniel De Graaf [Thu, 27 Oct 2011 21:58:47 +0000 (17:58 -0400)]
xen/event: Add reference counting to event channels
Event channels exposed to userspace by the evtchn module may be used by
other modules in an asynchronous manner, which requires that reference
counting be used to prevent the event channel from being closed before
the signals are delivered.
The reference count on new event channels defaults to -1 which indicates
the event channel is not referenced outside the kernel; evtchn_get fails
if called on such an event channel. The event channels made visible to
userspace by evtchn have a normal reference count.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
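The reference counting rules in the xen/event entry above can be modelled in a
few lines of standalone C. This is a simplified sketch, not the kernel
implementation: struct demo_evtchn and the demo_evtchn_* functions are
hypothetical names, and locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy event channel: refcnt of -1 means "not referenced outside the kernel". */
struct demo_evtchn {
    int refcnt;
    bool open;
};

/* New channels start at -1, so they cannot be taken with get(). */
static void demo_evtchn_init(struct demo_evtchn *ch)
{
    ch->refcnt = -1;
    ch->open = true;
}

/* Called when the channel is made visible to userspace. */
static void demo_evtchn_make_refcounted(struct demo_evtchn *ch)
{
    ch->refcnt = 1;
}

/* Take a reference; fails (-1) on channels that are not reference counted. */
static int demo_evtchn_get(struct demo_evtchn *ch)
{
    if (ch->refcnt <= 0)
        return -1;
    ch->refcnt++;
    return 0;
}

/* Drop a reference; the channel is closed when the last one goes away. */
static void demo_evtchn_put(struct demo_evtchn *ch)
{
    if (ch->refcnt > 0 && --ch->refcnt == 0)
        ch->open = false;
}
```

In this model a channel with a pending notification stays open until every
holder has called the put function, which is the property the commit message
asks for.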
Konrad Rzeszutek Wilk [Thu, 2 Feb 2012 18:58:45 +0000 (13:58 -0500)]
Merge branch 'in-3.1/bug.fixes' into stable/for-linus-3.3.rebased
* in-3.1/bug.fixes:
x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode.
xen/i386: follow-up to "replace order-based range checking of M2P table by linear one"
xen/irq: Alter the locking to use a mutex instead of a spinlock.
xen/e820: if there is no dom0_mem=, don't tweak extra_pages.
Revert "xen/e820: if there is no dom0_mem=, don't tweak extra_pages."
xen/e820: if there is no dom0_mem=, don't tweak extra_pages.
xen: disable PV spinlocks on HVM
xen/smp: Warn user why they keel over - nosmp or noapic and what to use instead.
xen: x86_32: do not enable iterrupts when returning from exception in interrupt context
xen: use maximum reservation to limit amount of usable RAM
xen: Do not enable PV IPIs when vector callback not present
xen/x86: replace order-based range checking of M2P table by linear one
xen: Fix misleading WARN message at xen_release_chunk
xen: Fix printk() format in xen/setup.c
xen/grant: Fix compile warning.
xen:pvhvm: Modpost section mismatch fix
Daniel De Graaf [Thu, 27 Oct 2011 21:58:49 +0000 (17:58 -0400)]
xen/gnt{dev,alloc}: reserve event channels for notify
When using the unmap notify ioctl, the event channel used for
notification needs to be reserved to avoid it being deallocated prior to
sending the notification.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Daniel De Graaf [Thu, 27 Oct 2011 21:58:48 +0000 (17:58 -0400)]
xen/gntalloc: Change gref_lock to a mutex
The event channel release function cannot be called under a spinlock
because it can attempt to acquire a mutex due to the event channel
reference acquired when setting up unmap notifications.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
David Vrabel [Wed, 26 Oct 2011 10:57:43 +0000 (11:57 +0100)]
xen: document balloon driver sysfs files
Add ABI documentation for the balloon driver's sysfs files.
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Daniel Kiper <dkiper@net-space.pl>
[v2: Added comments from Daniel] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Dan Carpenter [Thu, 26 Jan 2012 13:55:16 +0000 (16:55 +0300)]
xfs: fix acl count validation in xfs_acl_from_disk()
We applied the fix for CVE-2012-0038 (fa8b18edd7, "xfs: validate acl count"),
but there was a follow-on patch which is not in our kernel. If count
was negative, it could get past the new check.
From 093019cf1b18dd31b2c3b77acce4e000e2cbc9ce Mon Sep 17 00:00:00 2001
From: Xi Wang <xi.wang@gmail.com>
Date: Mon, 12 Dec 2011 21:55:52 +0000
Subject: [PATCH] xfs: fix acl count validation in xfs_acl_from_disk()
Commit fa8b18ed didn't prevent the integer overflow and possible
memory corruption. "count" can go negative and bypass the check.
Signed-off-by: Xi Wang <xi.wang@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
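Why a negative count slips past an upper-bound check can be shown with a small
standalone C sketch. The limit DEMO_MAX_ENTRIES and both functions are
hypothetical and only illustrate the signed versus unsigned comparison; they
are not the XFS code, and treating the count as unsigned is shown here as one
way to close the gap.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical upper bound on the number of ACL entries. */
#define DEMO_MAX_ENTRIES 25

/* Flawed check: a negative count passes because -1 > 25 is false. */
static bool demo_count_ok_signed(int count)
{
    return !(count > DEMO_MAX_ENTRIES);
}

/* Stricter check: as unsigned, a negative count becomes a huge value. */
static bool demo_count_ok_unsigned(int count)
{
    return !((unsigned int)count > DEMO_MAX_ENTRIES);
}
```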
Srinivas Eeda [Tue, 31 Jan 2012 22:37:19 +0000 (14:37 -0800)]
ocfs2: use spinlock irqsave for downconvert lock.patch
When the ocfs2dc thread holds the dc_task_lock spinlock and receives a soft
IRQ, it deadlocks trying to take the same spinlock in
ocfs2_wake_downconvert_thread.
Below is the stack snippet.
The patch disables interrupts when acquiring dc_task_lock spinlock.
Chris Mason [Wed, 25 Jan 2012 18:47:40 +0000 (13:47 -0500)]
Btrfs: fix reservations in btrfs_page_mkwrite
Josef fixed btrfs_page_mkwrite to properly release reserved
extents if there was an error. But if we fail to get a reservation
and we fail to dirty the inode (for ENOSPC reasons), we'll end up
trying to release a reservation we never had.
This makes sure we only release if we were able to reserve.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
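The "only release if we were able to reserve" rule from the Btrfs entry above
can be sketched as a generic error-path pattern in standalone C. Everything
here is hypothetical (struct demo_space, the demo_* functions, the
update_fails switch that simulates a later failure); it does not reproduce the
Btrfs code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy space accounting: free units and units currently reserved. */
struct demo_space {
    long reserved;
    long free;
};

/* Reserve n units, or fail with -ENOSPC and leave the counters untouched. */
static int demo_reserve(struct demo_space *s, long n)
{
    if (s->free < n)
        return -ENOSPC;
    s->free -= n;
    s->reserved += n;
    return 0;
}

/* Give back n units that were previously reserved. */
static void demo_release(struct demo_space *s, long n)
{
    s->reserved -= n;
    s->free += n;
}

/* Reserve, do a follow-up step, and undo the reservation only if we hold it. */
static int demo_mkwrite(struct demo_space *s, long n, bool update_fails)
{
    bool reserved = false;
    int ret = demo_reserve(s, n);

    if (ret == 0) {
        reserved = true;
        if (update_fails)       /* simulated failure after reserving */
            ret = -EIO;
    }
    if (ret != 0) {
        if (reserved)           /* never release what was not reserved */
            demo_release(s, n);
        return ret;
    }
    return 0;
}
```

Without the reserved flag, the failed-reservation path would subtract units
that were never added and the counters would drift.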
Chris Mason [Mon, 16 Jan 2012 13:13:11 +0000 (08:13 -0500)]
Btrfs: use larger system chunks
System chunks by default are very small. This makes them slightly
larger and also fixes the conditional checks to make sure we don't
allocate a billion of them at once.