Seth Forshee [Mon, 14 Nov 2016 11:12:56 +0000 (11:12 +0000)]
xenfs: Use proc_create_mount_point() to create /proc/xen
Mounting proc in user namespace containers fails if the xenbus
filesystem is mounted on /proc/xen because this directory fails
the "permanently empty" test. proc_create_mount_point() exists
specifically to create such mountpoints in proc but is currently
proc-internal. Export this interface to modules, then use it in
xenbus when creating /proc/xen.
Arnd Bergmann [Thu, 10 Nov 2016 08:55:42 +0000 (09:55 +0100)]
xen-netback: fix error handling output
The connect function prints an uninitialized error code after an
earlier initialization was removed:
drivers/net/xen-netback/xenbus.c: In function 'connect':
drivers/net/xen-netback/xenbus.c:938:3: error: 'err' may be used uninitialized in this function [-Werror=maybe-uninitialized]
This prints it as -EINVAL instead, which seems to be the most
appropriate error code. Before the patch that caused the warning,
this would print a positive number returned by vsscanf() instead,
which is also wrong. We probably don't need a backport though,
as fixing the warning here should be sufficient.
Fixes: f95842e7a9f2 ("xen: make use of xenbus_read_unsigned() in xen-netback") Fixes: 8d3d53b3e433 ("xen-netback: Add support for multiple queues") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392
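The point of the xen-netback fix above is that a scanf-style return value is a match count, not an error code. A minimal user-space sketch of that distinction follows; it is not the driver code. parse_queue_count() is a hypothetical helper, and standard sscanf() stands in for the kernel's parsing path.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse an unsigned value from a string.
 * sscanf() returns the number of converted items (or EOF), which is not
 * an errno value, so a failure is translated into -EINVAL explicitly.
 */
int parse_queue_count(const char *s, unsigned int *out)
{
	int matched = sscanf(s, "%u", out);

	if (matched != 1)
		return -EINVAL;	/* report a real error code, not the count */
	return 0;
}
```

Printing the raw return value of the parser in an error message would show a count such as 0 or 1, which tells the reader nothing about what went wrong.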
Juergen Gross [Mon, 31 Oct 2016 13:58:42 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xenbus
Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of the reads from int to unsigned,
but these cases have been wrong before: negative values are not allowed
for the modified cases.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392
Juergen Gross [Mon, 31 Oct 2016 13:58:41 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-pciback
Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of the read from int to unsigned,
but this case has been wrong before: negative values are not allowed
for the modified case.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392
Juergen Gross [Mon, 31 Oct 2016 13:58:41 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-fbfront
Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of the reads from int to unsigned,
but these cases have been wrong before: negative values are not allowed
for the modified cases.
Juergen Gross [Mon, 31 Oct 2016 13:58:41 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-pcifront
Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of the read from int to unsigned,
but this case has been wrong before: negative values are not allowed
for the modified case.
Juergen Gross [Mon, 31 Oct 2016 13:58:41 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-netfront
Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of some reads from int to unsigned,
but these cases have been wrong before: negative values are not allowed
for the modified cases.
Juergen Gross [Mon, 31 Oct 2016 13:58:41 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-netback
Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of some reads from int to unsigned,
but these cases have been wrong before: negative values are not allowed
for the modified cases.
Cc: wei.liu2@citrix.com Cc: paul.durrant@citrix.com Cc: netdev@vger.kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392
(cherry picked from commit f95842e7a9f235ef3b7d6d4b70fee2244149f1e7) Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
drivers/net/xen-netback/xenbus.c
Juergen Gross [Mon, 31 Oct 2016 13:58:40 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-kbdfront
Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of the reads from int to unsigned,
but these cases have been wrong before: negative values are not allowed
for the modified cases.
Juergen Gross [Mon, 31 Oct 2016 13:58:40 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-tpmfront
Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of one read from int to unsigned,
but this case has been wrong before: negative values are not allowed
for the modified case.
Juergen Gross [Mon, 31 Oct 2016 13:58:40 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-blkfront
Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of some reads from int to unsigned,
but these cases have been wrong before: negative values are not allowed
for the modified cases.
Juergen Gross [Mon, 31 Oct 2016 13:58:40 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-blkback
Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of one read from int to unsigned,
but this case has been wrong before: negative values are not allowed
for the modified case.
Juergen Gross [Mon, 31 Oct 2016 13:58:40 +0000 (14:58 +0100)]
xen: introduce xenbus_read_unsigned()
There are multiple instances of code reading an optional unsigned
parameter from Xenstore via xenbus_scanf(). Instead of repeating the
same code over and over, add a service function doing the job.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392
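The "optional unsigned parameter with a default" pattern that the commit above factors out can be sketched in user space as follows. This is only an illustration under stated assumptions: the in-memory table, store_lookup() and read_unsigned() are hypothetical stand-ins, the node names are illustrative, and the real helper reads from Xenstore rather than from an array.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for Xenstore: a flat table of node/value strings. */
struct store_entry {
	const char *node;
	const char *value;
};

static const struct store_entry demo_store[] = {
	{ "feature-rx-copy", "1" },
	{ "multi-queue-max-queues", "8" },
	{ "broken-node", "not-a-number" },
};

/* Return the stored string for a node, or NULL if the node is absent. */
static const char *store_lookup(const char *node)
{
	size_t i;

	for (i = 0; i < sizeof(demo_store) / sizeof(demo_store[0]); i++)
		if (strcmp(demo_store[i].node, node) == 0)
			return demo_store[i].value;
	return NULL;
}

/*
 * Read an optional unsigned parameter; fall back to default_val when the
 * node is missing or does not parse as an unsigned number.
 */
unsigned int read_unsigned(const char *node, unsigned int default_val)
{
	const char *raw = store_lookup(node);
	unsigned int val;

	if (!raw || sscanf(raw, "%u", &val) != 1)
		return default_val;
	return val;
}
```

With a helper of this shape, each caller collapses to a single expression such as read_unsigned("multi-queue-max-queues", 1) instead of repeating the parse-and-check sequence.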
Dongli Zhang [Wed, 2 Nov 2016 01:04:33 +0000 (09:04 +0800)]
xen-netfront: cast grant table reference first to type int
IS_ERR_VALUE() in commit 87557efc27f6a50140fb20df06a917f368ce3c66
("xen-netfront: do not cast grant table reference to signed short") would
not return true for an error code unless we cast ref first to type int.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
OraBug: 25497392
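Why the cast matters for the xen-netfront fix above can be shown with a small model. The sketch assumes a 32-bit unsigned grant_ref_t and a 64-bit unsigned long (modelled as uint64_t); is_err_value64() is a hypothetical stand-in for an IS_ERR_VALUE()-style range check, not the kernel macro itself.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

typedef uint32_t grant_ref_t;	/* 32-bit unsigned, as assumed for this sketch */

/* Model of an IS_ERR_VALUE()-style test on a 64-bit unsigned long. */
static bool is_err_value64(uint64_t x)
{
	return x >= (uint64_t)-MAX_ERRNO;
}

/* Zero-extension: 0xffffffe4 becomes 0x00000000ffffffe4, not an error. */
bool ref_is_err_plain(grant_ref_t ref)
{
	return is_err_value64((uint64_t)ref);
}

/* Casting to int first sign-extends, so negative codes are recognised. */
bool ref_is_err_via_int(grant_ref_t ref)
{
	return is_err_value64((uint64_t)(int64_t)(int32_t)ref);
}
```

For a reference holding the bit pattern of -28, ref_is_err_plain() reports false while ref_is_err_via_int() reports true, which is the behaviour the commit message describes.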
Juergen Gross [Tue, 11 Oct 2016 11:34:16 +0000 (13:34 +0200)]
xenbus: advertise control feature flags
The Xen docs specify several flags which a guest can set to advertise
which values of the xenstore control/shutdown key it will recognize.
This patch adds code to write all the relevant feature-flag keys.
Based-on-patch-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392
Support the driver_override scheme introduced with commit 782a985d7af2
("PCI: Introduce new device binding path using pci_dev.driver_override")
As pcistub_probe() is called for all devices (it has to check for a
match based on the slot address rather than device type) it has to
check for driver_override set to "pciback" itself.
Up to now for assigning a pci device to pciback you need something like:
The Xen pciback driver has a list of all pci devices it is ready to
seize. There is no check whether a to be added entry already exists.
While this might be no problem in the common case it might confuse
those which consume the list via sysfs.
Modify the handling of this list by not adding an entry which already
exists. As this will be needed later split out the list handling into
a separate function.
Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392
The Xen pciback driver maintains a list of all its seized devices.
There are two functions searching the list for a specific device with
basically the same semantics just returning different structures in
case of a match.
Split out the search function.
Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392
Colin Ian King [Mon, 12 Sep 2016 10:20:46 +0000 (11:20 +0100)]
x86/xen: add missing \n at end of printk warning message
The message is missing a \n, add it.
Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392
xen-netfront: avoid packet loss when ethernet header crosses page boundary
Small packet loss is reported on complex multi host network configurations
including tunnels, NAT, ... My investigation led me to the following check
in netback which drops packets:
But this check itself is legitimate. SKBs consist of a linear part (which
has to have the ethernet header) and (optionally) a number of frags.
Netfront transmits the head of the linear part up to the page boundary
as the first request and all the rest becomes frags so when we're
reconstructing the SKB in netback we can't distinguish between original
frags and the 'tail' of the linear part. The first SKB needs to be at
least ETH_HLEN size. So in case we have an SKB with its linear part
starting too close to the page boundary the packet is lost.
I see two ways to fix the issue:
- Change the 'wire' protocol between netfront and netback to start keeping
the original SKB structure. We'll have to add a flag indicating the fact
that the particular request is a part of the original linear part and not
a frag. We'll need to know the length of the linear part to pre-allocate
memory.
- Avoid transmitting SKBs with linear parts starting too close to the page
boundary. That seems preferable short-term and shouldn't bring
significant performance degradation as such packets are rare. That's what
this patch is trying to achieve with skb_copy().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
OraBug: 25497392
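The condition that the xen-netfront commit above guards against can be expressed as a short predicate. This is a sketch, not the patch: linear_part_needs_copy() is a hypothetical name, the 4096-byte page size is an assumption, and the real code would copy the SKB when the predicate holds.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */
#define SKETCH_ETH_HLEN  14u	/* length of an Ethernet header */

/*
 * True when fewer than ETH_HLEN bytes remain between the start of the
 * linear data and the end of its page, i.e. the first request would be
 * too short to hold the Ethernet header.
 */
bool linear_part_needs_copy(uintptr_t data_addr)
{
	uintptr_t offset = data_addr % SKETCH_PAGE_SIZE;

	return SKETCH_PAGE_SIZE - offset < SKETCH_ETH_HLEN;
}
```

Only data starting within the last 13 bytes of a page triggers the copy, which matches the claim that such packets are rare.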
Markus Elfring [Thu, 25 Aug 2016 11:23:06 +0000 (13:23 +0200)]
xen/grant-table: Use kmalloc_array() in arch_gnttab_valloc()
* A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus reuse the corresponding function "kmalloc_array".
This issue was detected by using the Coccinelle software.
* Replace the specification of a data type by a pointer dereference
to make the corresponding size determination a bit safer according to
the Linux coding style convention.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392
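Both points of the grant-table commit above, the overflow-checked array allocation and sizing by pointer dereference, have a straightforward user-space analogue. In this sketch alloc_array(), struct pte_slot and alloc_slots() are hypothetical names, and malloc() stands in for the kernel allocator.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical user-space analogue of an overflow-checked array
 * allocation: refuse the request if n * size would wrap around.
 */
void *alloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;
	return malloc(n * size);
}

struct pte_slot {
	unsigned long value;
};

/* sizeof(*slots) follows the pointer's type if the declaration changes. */
struct pte_slot *alloc_slots(size_t nr)
{
	struct pte_slot *slots = alloc_array(nr, sizeof(*slots));

	return slots;
}
```

A plain malloc(n * size) would silently allocate a too-small buffer when the product wraps; the checked variant fails the allocation instead.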
Juergen Gross [Tue, 2 Aug 2016 07:22:12 +0000 (09:22 +0200)]
xen: Make VPMU init message look less scary
The default for the Xen hypervisor is to not enable VPMU in order to
avoid security issues. In this case the Linux kernel will issue the
message "Could not initialize VPMU for cpu 0, error -95" which looks
more like an error than a normal state.
Change the message to something less scary in case the hypervisor
returns EOPNOTSUPP or ENOSYS when trying to activate VPMU.
Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392
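The message selection described in the VPMU commit above amounts to a branch on the error code. The following is a user-space sketch only: vpmu_init_message() is a hypothetical helper, and the text of the "less scary" message is illustrative rather than quoted from the patch.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Format the VPMU init message for a given error. "Not supported" style
 * errors are treated as a normal state; the message texts are illustrative.
 */
void vpmu_init_message(char *buf, size_t len, int cpu, int err)
{
	if (err == -EOPNOTSUPP || err == -ENOSYS)
		snprintf(buf, len, "VPMU disabled by hypervisor.");
	else
		snprintf(buf, len,
			 "Could not initialize VPMU for cpu %d, error %d",
			 cpu, err);
}
```

Any other error code still produces the original, more alarming message, so genuine failures remain visible.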
Juergen Gross [Tue, 2 Aug 2016 06:53:36 +0000 (08:53 +0200)]
xen: rename xen_pmu_init() in sys-hypervisor.c
There are two functions with the name xen_pmu_init() in the kernel. Rename
the one in drivers/xen/sys-hypervisor.c to avoid shadowing the one in
arch/x86/xen/pmu.c.
To avoid the same problem in the future, rename some more functions.
Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392
Petr Tesarik [Tue, 2 Aug 2016 21:06:19 +0000 (14:06 -0700)]
kexec: allow kdump with crash_kexec_post_notifiers
If a crash kernel is loaded, do not crash the running domain. This is
needed if the kernel is loaded with crash_kexec_post_notifiers, because
panic notifiers are run before __crash_kexec() in that case, and this
Xen hook prevents its being called later.
[akpm@linux-foundation.org: build fix: unconditionally include kexec.h] Link: http://lkml.kernel.org/r/20160713122000.14969.99963.stgit@hananiah.suse.cz Signed-off-by: Petr Tesarik <ptesarik@suse.com> Cc: Juergen Gross <jgross@suse.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Eric Biederman <ebiederm@xmission.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dave Young <dyoung@redhat.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OraBug: 25497392
(cherry picked from commit c0253115968c35f3e1ee497282efb75ccf29fb98) Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
arch/x86/xen/enlighten.c
Jan Beulich [Fri, 8 Jul 2016 12:15:07 +0000 (06:15 -0600)]
xen/acpi: allow xen-acpi-processor driver to load on Xen 4.7
As of Xen 4.7 PV CPUID doesn't expose either of CPUID[1].ECX[7] and
CPUID[0x80000007].EDX[7] anymore, causing the driver to fail to load on
both Intel and AMD systems. Doing any kind of hardware capability
checks in the driver as a prerequisite was wrong anyway: With the
hypervisor being in charge, all such checking should be done by it. If
ACPI data gets uploaded despite some missing capability, the hypervisor
is free to ignore part or all of that data.
Ditch the entire check_prereq() function, and do the only valid check
(xen_initial_domain()) in the caller in its place.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392
Boris Ostrovsky [Wed, 2 Dec 2015 17:10:48 +0000 (12:10 -0500)]
xen: Resume PMU from non-atomic context
Resuming PMU currently triggers a warning from ___might_sleep() (assuming
CONFIG_DEBUG_ATOMIC_SLEEP is set) when xen_pmu_init() allocates a GFP_KERNEL
page because we are in a state resembling atomic context.
Move resuming PMU to xen_arch_resume() which is called in regular context.
For symmetry move suspending PMU to xen_arch_suspend() as well.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <stable@vger.kernel.org> # 4.3 Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392
David Vrabel [Fri, 9 Dec 2016 14:41:13 +0000 (14:41 +0000)]
xenbus: fix deadlock on writes to /proc/xen/xenbus
/proc/xen/xenbus does not work correctly. A read blocked waiting for
a xenstore message holds the mutex needed for atomic file position
updates. This blocks any writes on the same file handle, which can
deadlock if the write is needed to unblock the read.
Clear FMODE_ATOMIC_POS when opening this device to always get
character-device-like semantics.
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
Orabug: 25425387
(cherry picked from commit 581d21a2d02a798ee34e56dbfa13f891b3a90c30)
Jira: OCC-36718 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com>
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:36 +0000 (17:56 +0200)]
x86/acpi: store ACPI ids from MADT for future usage
Currently we don't save ACPI ids (unlike LAPIC ids which go to
x86_cpu_to_apicid) from MADT and we may need this information later.
In particular, ACPI ids are the only existing way for a PVHVM Xen guest
to figure out Xen's idea of its vCPUs ids before these CPUs boot and
in some cases these ids diverge from Linux's cpu ids.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 3e9e57fad3d8530aa30787f861c710f598ddc4e7) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Filipe Manco [Thu, 15 Sep 2016 15:10:46 +0000 (17:10 +0200)]
xen-netback: fix error handling on netback_probe()
In case of error during netback_probe() (e.g. an entry missing on the
xenstore) netback_remove() is called on the new device, which will set
the device backend state to XenbusStateClosed by calling
set_backend_state(). However, the backend state wasn't initialized by
netback_probe() at this point, which will cause an invalid transition
and make set_backend_state() BUG().
Initialize the backend state at the beginning of netback_probe() to
XenbusStateInitialising, and create two new valid state transitions on
set_backend_state(), from XenbusStateInitialising to XenbusStateClosed,
and from XenbusStateInitialising to XenbusStateInitWait.
Signed-off-by: Filipe Manco <filipe.manco@neclab.eu> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit cce94483e47e8e3d74cf4475dea33f9fd4b6ad9f) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
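The two ingredients of the netback_probe() fix above, a defined initial state and explicit transitions out of it, can be modelled in a few lines. This sketch is not the driver's state machine: struct backend, backend_probe_init() and backend_set_state() are hypothetical, the enum numbering is an assumption of the sketch, and only the two transitions named in the commit message are modelled.

```c
#include <stdbool.h>

/* Values mirror the XenbusState numbering as assumed for this sketch. */
enum xenbus_state {
	XenbusStateUnknown = 0,
	XenbusStateInitialising = 1,
	XenbusStateInitWait = 2,
	XenbusStateInitialised = 3,
	XenbusStateConnected = 4,
	XenbusStateClosing = 5,
	XenbusStateClosed = 6,
};

struct backend {
	enum xenbus_state state;
};

/* Probe starts from a defined state instead of leaving it unset. */
void backend_probe_init(struct backend *be)
{
	be->state = XenbusStateInitialising;
}

/*
 * Only the transitions out of Initialising described above are modelled;
 * anything else is rejected here rather than triggering a BUG().
 */
bool backend_set_state(struct backend *be, enum xenbus_state target)
{
	if (be->state == XenbusStateInitialising &&
	    (target == XenbusStateInitWait || target == XenbusStateClosed)) {
		be->state = target;
		return true;
	}
	return false;
}
```

With the state initialised up front, an error path that closes the device right after a failed probe takes the Initialising to Closed transition instead of starting from an undefined state.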
We pass xen_vcpu_id mapping information to hypercalls which require
uint32_t type so it would be cleaner to have it as uint32_t. The
initializer to -1 can be dropped as we always do the mapping before using
it and we never check the 'not set' value anyway.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 55467dea2967259f21f4f854fc99d39cc5fea60e) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Jan Beulich [Mon, 15 Aug 2016 15:02:38 +0000 (09:02 -0600)]
xenbus: don't look up transaction IDs for ordinary writes
This should really only be done for XS_TRANSACTION_END messages, or
else at least some of the xenstore-* tools don't work anymore.
Fixes: 0beef634b8 ("xenbus: don't BUG() on user mode induced condition") Reported-by: Richard Schütz <rschuetz@uni-koblenz.de> Cc: <stable@vger.kernel.org> Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Richard Schütz <rschuetz@uni-koblenz.de> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 9a035a40f7f3f6708b79224b86c5777a3334f7ea) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Bob Liu [Wed, 27 Jul 2016 09:42:04 +0000 (17:42 +0800)]
xen-blkfront: free resources if xlvbd_alloc_gendisk fails
The current code forgets to free resources in the failure path of
xlvbd_alloc_gendisk(); this patch fixes it.
Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 4e876c2bd37fbb5c37a4554a79cf979d486f0e82) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
xen: add static initialization of steal_clock op to xen_time_ops
pv_time_ops might be overwritten with xen_time_ops after the
steal_clock operation has been initialized already. To prevent calling
a now uninitialized function pointer add the steal_clock static
initialization to xen_time_ops.
Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit d34c30cc1fa80f509500ff192ea6bc7d30671061) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
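The hazard addressed by the steal_clock commit above is that assigning a whole ops structure replaces every member, including ones set earlier at run time. A minimal sketch, with hypothetical names throughout (struct time_ops, install_ops() and the demo functions are not kernel symbols):

```c
#include <stddef.h>

struct time_ops {
	unsigned long long (*sched_clock)(void);
	unsigned long long (*steal_clock)(int cpu);
};

static unsigned long long demo_sched_clock(void)
{
	return 0;
}

static unsigned long long demo_steal_clock(int cpu)
{
	return (unsigned long long)cpu;
}

/* Table that omits steal_clock: the member is implicitly NULL. */
static const struct time_ops ops_without_steal = {
	.sched_clock = demo_sched_clock,
};

/* Table with steal_clock initialised statically. */
static const struct time_ops ops_with_steal = {
	.sched_clock = demo_sched_clock,
	.steal_clock = demo_steal_clock,
};

/* Overwrite the active table wholesale, as a struct assignment does. */
void install_ops(struct time_ops *active, const struct time_ops *src)
{
	*active = *src;
}
```

Installing ops_without_steal over a table whose steal_clock was already set leaves a NULL pointer behind; installing ops_with_steal does not.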
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:43 +0000 (17:56 +0200)]
xen/pvhvm: run xen_vcpu_setup() for the boot CPU
Historically we didn't call VCPUOP_register_vcpu_info for CPU0 for
PVHVM guests (while we had it for PV and ARM guests). This is usually
fine as we can use vcpu info in the shared_info page but when we try
booting on a vCPU with Xen's vCPU id > 31 (e.g. when we try to kdump
after crashing on this CPU) we're not able to boot.
Switch to always doing VCPUOP_register_vcpu_info for the boot CPU.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit ee42d665d3f5db975caf87baf101a57235ddb566) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:42 +0000 (17:56 +0200)]
xen/evtchn: use xen_vcpu_id mapping
Use the newly introduced xen_vcpu_id mapping to get Xen's idea of vCPU
id for CPU0.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit cbbb4682394c45986a34d8c77a02e7a066e30235) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:41 +0000 (17:56 +0200)]
xen/events: fifo: use xen_vcpu_id mapping
EVTCHNOP_init_control has vCPU id as a parameter and Xen's idea of
vCPU id should be used. Use the newly introduced xen_vcpu_id mapping
to convert it from Linux's id.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit be78da1cf43db4c1a9e13af8b6754199a89d5d75) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:40 +0000 (17:56 +0200)]
xen/events: use xen_vcpu_id mapping in events_base
EVTCHNOP_bind_ipi and EVTCHNOP_bind_virq pass vCPU id as a parameter
and Xen's idea of vCPU id should be used. Use the newly introduced
xen_vcpu_id mapping to convert it from Linux's id.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 8058c0b897e7d1ba5c900cb17eb82aa0d88fca53) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:39 +0000 (17:56 +0200)]
x86/xen: use xen_vcpu_id mapping when pointing vcpu_info to shared_info
The shared_info page has space for 32 vcpu info slots for the first 32 vCPUs
but these are the first 32 vCPUs from Xen's perspective and we should
map them accordingly with the newly introduced xen_vcpu_id mapping.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit e15a8621935cac527b4e0ed4078d24c3e5ef73a6) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:38 +0000 (17:56 +0200)]
x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_op
HYPERVISOR_vcpu_op() passes Linux's idea of vCPU id as a parameter
while Xen's idea is expected. In some cases these ideas diverge so we
need to do remapping.
Convert all callers of HYPERVISOR_vcpu_op() to use xen_vcpu_nr().
Leave xen_fill_possible_map() and xen_filter_cpu_maps() intact as
they're only being called by PV guests before percpu areas are
initialized. While the issue could be solved by switching to
early_percpu for xen_vcpu_id I think it's not worth it: PV guests will
probably never get to the point where their idea of vCPU id diverges
from Xen's.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit ad5475f9faf5186b7f59de2c6481ee3e211f1ed7) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:37 +0000 (17:56 +0200)]
xen: introduce xen_vcpu_id mapping
It may happen that Xen's and Linux's ideas of vCPU id diverge. In
particular, when we crash on a secondary vCPU we may want to do kdump
and unlike plain kexec where we do migrate_to_reboot_cpu() we try
booting on the vCPU which crashed. This doesn't work very well for
PVHVM guests as we have a number of hypercalls where we pass vCPU id
as a parameter. These hypercalls either fail or do something
unexpected.
To solve the issue introduce percpu xen_vcpu_id mapping. ARM and PV
guests get direct mapping for now. Boot CPU for PVHVM guest gets its
id from CPUID. With secondary CPUs it is a bit
trickier. Currently, we initialize IPI vectors before these CPUs boot
so we can't use CPUID. Use ACPI ids from MADT instead.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 88e957d6e47f1232ad15b21e54a44f1147ea8c1b) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
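The mapping introduced by the commit above is, conceptually, a per-CPU lookup from Linux's CPU number to Xen's vCPU id. The sketch below models it with a plain array; vcpu_map_direct(), vcpu_map_set() and vcpu_nr() are hypothetical names, and where the ids come from (CPUID or ACPI ids from MADT) is outside the sketch.

```c
#include <stdint.h>

#define NR_CPUS_SKETCH 64

/* Mapping from Linux's CPU number to Xen's vCPU id. */
static uint32_t xen_vcpu_id_map[NR_CPUS_SKETCH];

/* Direct mapping, as the commit describes for ARM and PV guests. */
void vcpu_map_direct(void)
{
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		xen_vcpu_id_map[cpu] = cpu;
}

/* Record an id obtained elsewhere (CPUID or ACPI ids in the commit). */
void vcpu_map_set(unsigned int cpu, uint32_t xen_id)
{
	if (cpu < NR_CPUS_SKETCH)
		xen_vcpu_id_map[cpu] = xen_id;
}

/* Look up the id to pass to hypercalls; UINT32_MAX if out of range. */
uint32_t vcpu_nr(unsigned int cpu)
{
	return cpu < NR_CPUS_SKETCH ? xen_vcpu_id_map[cpu] : UINT32_MAX;
}
```

In the kdump scenario from the commit message, a kernel booted on the crashed vCPU would record something like vcpu_map_set(0, 33), so that hypercalls issued for Linux CPU 0 carry Xen's id 33.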
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:35 +0000 (17:56 +0200)]
x86/xen: update cpuid.h from Xen-4.7
Update cpuid.h header from xen hypervisor tree to get
XEN_HVM_CPUID_VCPU_ID_PRESENT definition.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit de2f5537b397249e91cafcbed4de64a24818542e) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
David Vrabel [Mon, 11 Jul 2016 14:45:51 +0000 (15:45 +0100)]
xen/evtchn: add IOCTL_EVTCHN_RESTRICT
IOCTL_EVTCHN_RESTRICT limits the file descriptor to being able to bind
to interdomain event channels from a specific domain. Event channels
that are already bound continue to work for sending and receiving
notifications.
This is useful as part of deprivileging a user space PV backend or
device model (QEMU). E.g., once the device model has bound to the
ioreq server event channels it can restrict the file handle so an
exploited DM cannot use it to create or bind to arbitrary event
channels.
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
(cherry picked from commit fbc872c38c8fed31948c85683b5326ee5ab9fccc) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Jan Beulich [Thu, 7 Jul 2016 08:05:46 +0000 (02:05 -0600)]
xen-blkfront: prefer xenbus_scanf() over xenbus_gather()
... for single items being collected: It is more typesafe (as the
compiler can check format string and to-be-written-to variable match)
and requires one less parameter to be passed.
Jan Beulich [Thu, 7 Jul 2016 08:05:21 +0000 (02:05 -0600)]
xen-blkback: prefer xenbus_scanf() over xenbus_gather()
... for single items being collected: It is more typesafe (as the
compiler can check format string and to-be-written-to variable match)
and requires one less parameter to be passed.
Paul Gortmaker [Thu, 14 Jul 2016 00:18:59 +0000 (20:18 -0400)]
x86/xen: Audit and remove any unnecessary uses of module.h
Historically a lot of these existed because we did not have
a distinction between what was modular code and what was providing
support to modules via EXPORT_SYMBOL and friends. That changed
when we forked out support for the latter into the export.h file.
This means we should be able to reduce the usage of module.h
in code that is obj-y Makefile or bool Kconfig. The advantage
in doing so is that module.h itself sources about 15 other headers;
adding significantly to what we feed cpp, and it can obscure what
headers we are effectively using.
Since module.h was the source for init.h (for __init) and for
export.h (for EXPORT_SYMBOL) we consider each obj-y/bool instance
for the presence of either and replace as needed.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20160714001901.31603-7-paul.gortmaker@windriver.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 7a2463dcacee3f2f36c78418c201756372eeea6b) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Jan Beulich [Sat, 9 Jul 2016 00:35:30 +0000 (17:35 -0700)]
Input: xen-kbdfront - prefer xenbus_write() over xenbus_printf() where possible
... as being the simpler variant.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
(cherry picked from commit cd6763be8f553c7db421d38ddcb36466fb8512cd) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Up to now reading the stolen time of a remote cpu was not possible in a
performant way under Xen. This made support of runqueue steal time via
paravirt_steal_rq_enabled impossible.
With the addition of an appropriate hypervisor interface this is now
possible, so add the support.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 6ba286ad845799b135e5af73d1fbc838fa79f709) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Jan Beulich [Wed, 6 Jul 2016 07:00:14 +0000 (01:00 -0600)]
xen-pciback: drop superfluous variables
req_start is simply an alias of the "offset" function parameter, and
req_end is being used just once in each function. (And both variables
were loop invariant anyway, so should at least have got initialized
outside the loop.)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 1ad6344acfbf19288573b4a5fa0b07cbb5af27d7) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Jan Beulich [Wed, 6 Jul 2016 06:59:35 +0000 (00:59 -0600)]
xen-pciback: short-circuit read path used for merging write values
There's no point calling xen_pcibk_config_read() here - all it'll do is
return whatever conf_space_read() returns for the field which was found
here (and which would be found there again). Also there's no point
clearing tmp_val before the call.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit ee87d6d0d36d98c550f99274a81841033226e3bf) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Jan Beulich [Wed, 6 Jul 2016 06:58:58 +0000 (00:58 -0600)]
xen-pciback: use const and unsigned in bar_init()
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 585203609c894db11dea724b743c04d0c9927f39) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Jan Beulich [Wed, 6 Jul 2016 06:58:19 +0000 (00:58 -0600)]
xen-pciback: simplify determination of 64-bit memory resource
Other than for raw BAR values, flags are properly separated in the
internal representation.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit c8670c22e04e4e42e752cc5b53922106b3eedbda) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Jan Beulich [Wed, 6 Jul 2016 06:57:43 +0000 (00:57 -0600)]
xen-pciback: fold read_dev_bar() into its now single caller
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 6ad2655d87d2d35c1de4500402fae10fe7b30b4a) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Jan Beulich [Wed, 6 Jul 2016 06:57:07 +0000 (00:57 -0600)]
xen-pciback: drop rom_init()
It is now identical to bar_init().
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 664093bb6b797c8ba0a525ee0a36ad8cbf89413e) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Jan Beulich [Wed, 6 Jul 2016 06:56:27 +0000 (00:56 -0600)]
xen-pciback: drop unused function parameter of read_dev_bar()
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 6c6e4caa2006ab82587a3648967314ec92569a98) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
The kernel.h macro DIV_ROUND_UP performs the computation
(((n) + (d) - 1) /(d)) but is perhaps more readable.
The Coccinelle script used to make this change is as follows:
@haskernel@
@@
#include <linux/kernel.h>
@depends on haskernel@
expression n,d;
@@
(
- (n + d - 1) / d
+ DIV_ROUND_UP(n,d)
|
- (n + (d - 1)) / d
+ DIV_ROUND_UP(n,d)
)
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 585423c8c4d2f39a2c299bc6dd16433e6141fba5) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Bhaktipriya Shridhar [Tue, 31 May 2016 16:56:30 +0000 (22:26 +0530)]
xen: xenbus: Remove create_workqueue
System workqueues have been able to handle high level of concurrency
for a long time now and there's no reason to use dedicated workqueues
just to gain concurrency. Replace dedicated xenbus_frontend_wq with the
use of system_wq.
Unlike a dedicated per-cpu workqueue created with create_workqueue(),
system_wq allows multiple work items to overlap executions even on
the same CPU; however, a per-cpu workqueue doesn't have any CPU
locality or global ordering guarantees unless the target CPU is
explicitly specified and the increase of local concurrency shouldn't
make any difference.
In this case, there is only a single work item, so increasing the concurrency
level by switching to system_wq should not make any difference.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 5ee405d9d234ee5641741c07a654e4c6ba3e2a9d) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Bhaktipriya Shridhar [Wed, 1 Jun 2016 14:15:08 +0000 (19:45 +0530)]
xen: xen-pciback: Remove create_workqueue
System workqueues have been able to handle high level of concurrency
for a long time now and there's no reason to use dedicated workqueues
just to gain concurrency. Replace dedicated xen_pcibk_wq with the
use of system_wq.
Unlike a dedicated per-cpu workqueue created with create_workqueue(),
system_wq allows multiple work items to overlap executions even on
the same CPU; however, a per-cpu workqueue doesn't have any CPU
locality or global ordering guarantees unless the target CPU is
explicitly specified and thus the increase of local concurrency shouldn't
make any difference.
Since the work items could be pending, flush_work() has been used in
xen_pcibk_disconnect(). xen_pcibk_xenbus_remove() calls free_pdev()
which in turn calls xen_pcibk_disconnect() for every pdev to ensure that
there is no pending task while disconnecting the driver.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 429eafe60943bdfa33b15540ab2db5642a1f8c3c) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Boris Ostrovsky [Tue, 21 Jun 2016 14:17:33 +0000 (10:17 -0400)]
xen/PMU: Log VPMU initialization error at lower level
This will match how PMU errors are reported at check_hw_exists()'s
msr_fail label, which is reached when VPMU initialization fails.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 6ab9507ed96a6c0b24174d3430064a90b3dddd0a) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
The pv_time_ops structure contains a function pointer for the
"steal_clock" functionality used only by KVM and Xen on ARM. Xen on x86
uses its own mechanism to account for the "stolen" time a thread wasn't
able to run due to hypervisor scheduling.
Add support in Xen arch independent time handling for this feature by
moving it out of the arm arch into drivers/xen and remove the x86 Xen
hack.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit ecb23dc6f2eff0ce64dd60351a81f376f13b12cc) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Muhammad Falak R Wani [Sun, 24 Apr 2016 12:03:32 +0000 (20:03 +0800)]
xen: use vma_pages().
Replace explicit computation of vma page count by a call to
vma_pages().
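As a user-space sketch of the computation being replaced: struct vma_range is a hypothetical stand-in for the two fields of struct vm_area_struct that matter here, and a 4 KiB page size is assumed.

```c
#include <assert.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the vm_start/vm_end fields of a VMA. */
struct vma_range {
	unsigned long vm_start;	/* first byte of the mapping */
	unsigned long vm_end;	/* first byte after the mapping */
};

/* Mirrors what vma_pages() computes: the mapping length in pages. */
static unsigned long vma_pages_sketch(const struct vma_range *vma)
{
	assert(vma->vm_end >= vma->vm_start);
	return (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
}
```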
Signed-off-by: Muhammad Falak R Wani <falakreyaz@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit c7ebf9d9c6b4e9402b978da0b0785db4129c1f79) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
XEN: EFI: Move x86 specific codes to architecture directory
Move x86 specific code to the architecture directory and export those EFI
runtime service functions. This will be useful for initializing runtime
services on ARM later.
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Tested-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit a62ed500307bfaf4c1a818b69f7c1e7df1039a16) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
xen/hvm/params: Add a new delivery type for event-channel in HVM_PARAM_CALLBACK_IRQ
This new delivery type which is for ARM shares the same value with
HVM_PARAM_CALLBACK_TYPE_VECTOR which is for x86.
val[15:8] is the flag field and val[7:0] is a PPI.
In the flag field, bit 8 indicates whether the interrupt mode is edge (1) or
level (0), and bit 9 indicates whether the interrupt polarity is active low (1)
or high (0).
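The bit layout above can be sketched as a pair of helpers. The struct and function names are hypothetical, and the sketch only handles the low 16 bits described here; it ignores the delivery type field itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of the low 16 bits described above. */
struct callback_ppi {
	uint8_t ppi;		/* val[7:0] */
	bool edge_triggered;	/* bit 8: edge (1) or level (0) */
	bool active_low;	/* bit 9: active low (1) or high (0) */
};

static uint64_t encode_callback_ppi(unsigned int ppi, bool edge, bool active_low)
{
	assert(ppi <= 0xff);	/* must fit in val[7:0] */
	return (uint64_t)ppi | ((uint64_t)edge << 8) | ((uint64_t)active_low << 9);
}

static struct callback_ppi decode_callback_ppi(uint64_t val)
{
	struct callback_ppi out;

	out.ppi = (uint8_t)(val & 0xff);
	out.edge_triggered = ((val >> 8) & 1) != 0;
	out.active_low = ((val >> 9) & 1) != 0;
	return out;
}
```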
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Julien Grall <julien.grall@arm.com> Tested-by: Julien Grall <julien.grall@arm.com>
(cherry picked from commit 383ff518a79fe3dcece579b9d30be77b219d10f8) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Xen: xlate: Use page_to_xen_pfn instead of page_to_pfn
Make xen_xlate_map_ballooned_pages work with 64K pages. In that case
Kernel pages are 64K in size but Xen pages remain 4K in size. Xen pfns
refer to 4K pages.
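The arithmetic behind the distinction can be sketched as follows. The 64K kernel page size is the case described above; kernel_pfn_to_xen_pfn() is a hypothetical name for this sketch and is not the kernel helper itself.

```c
#include <assert.h>

#define XEN_PAGE_SHIFT 12	/* Xen pages are 4K */
#define KERNEL_PAGE_SHIFT 16	/* assumption: 64K kernel pages */
#define XEN_PFN_PER_PAGE (1UL << (KERNEL_PAGE_SHIFT - XEN_PAGE_SHIFT))

/* Hypothetical helper: first Xen pfn covered by a given kernel pfn. */
static unsigned long kernel_pfn_to_xen_pfn(unsigned long pfn)
{
	assert(pfn <= ~0UL / XEN_PFN_PER_PAGE);	/* guard against overflow */
	return pfn * XEN_PFN_PER_PAGE;
}
```

With these sizes each kernel page spans 16 Xen pages, so passing a kernel pfn where a Xen pfn is expected would address the wrong memory.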
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Julien Grall <julien.grall@arm.com> Tested-by: Julien Grall <julien.grall@arm.com>
(cherry picked from commit 975fac3c4f38e0b47514abdb689548a8e9971081) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Andy Lutomirski [Thu, 30 Jul 2015 21:31:31 +0000 (14:31 -0700)]
x86/xen: Probe target addresses in set_aliased_prot() before the hypercall
The update_va_mapping hypercall can fail if the VA isn't present
in the guest's page tables. Under certain loads, this can
result in an OOPS when the target address is in unpopulated vmap
space.
While we're at it, add comments to help explain what's going on.
This isn't a great long-term fix. This code should probably be
changed to use something like set_memory_ro.
Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Vrabel <dvrabel@cantab.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: security@kernel.org <security@kernel.org> Cc: <stable@vger.kernel.org> Cc: xen-devel <xen-devel@lists.xen.org> Link: http://lkml.kernel.org/r/0b0e55b995cda11e7829f140b833ef932fcabe3a.1438291540.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit aa1acff356bbedfd03b544051f5b371746735d89) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Konrad Rzeszutek Wilk [Mon, 3 Oct 2016 13:34:21 +0000 (09:34 -0400)]
Merge branch 'uek4/4.7-xen-backport' into topic/uek-4.1/xen
* uek4/4.7-xen-backport:
xenbus: simplify xenbus_dev_request_and_reply()
xenbus: don't bail early from xenbus_dev_request_and_reply()
xenbus: don't BUG() on user mode induced condition
xen-pciback: return proper values during BAR sizing
x86/xen: avoid m2p lookup when setting early page table entries
xen/pciback: Fix conf_space read/write overlap check.
x86/xen: fix upper bound of pmd loop in xen_cleanhighmap()
xen/balloon: Fix declared-but-not-defined warning
xen-blkfront: fix resume issues after a migration
xen-blkfront: don't call talk_to_blkback when already connected to blkback
xen: use same main loop for counting and remapping pages
Xen: don't warn about 2-byte wchar_t in efi
xen/gntdev: reduce copy batch size to 16
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 23585393
Konrad Rzeszutek Wilk [Fri, 19 Aug 2016 15:06:44 +0000 (11:06 -0400)]
x86/xen: Add x86_platform.is_untracked_pat_range quirk to ignore ISA regions.
On x86 whenever VMAs are setup, the 'is_ISA_range quirk' (which this
patch re-implements) is used to figure whether to ignore the
requested PAT type and always use WB (see 'reserve_memtype').
Specifically it forces the WB type for any region in the ISA space.
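As a rough illustration of the range test such a quirk performs, the sketch below checks whether a half-open physical range lies inside the legacy ISA window. The window bounds (0xa0000 to 0x100000) and the helper name in_isa_window() are assumptions of this sketch, not code taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the conventional legacy ISA window on x86. */
#define ISA_START_ADDRESS 0xa0000ULL
#define ISA_END_ADDRESS   0x100000ULL

/* True when the half-open range [start, end) lies entirely in the window. */
static bool in_isa_window(uint64_t start, uint64_t end)
{
	assert(start < end);
	return start >= ISA_START_ADDRESS && end <= ISA_END_ADDRESS;
}
```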
From the Intel SDM, the combination of MTRR (UC, which is setup by
the BIOS) and PAT (UC or WB) for the ISA region ends up with the same
value - UC.
However on Xen, due to XSA 154 we enforce that _ANY_ pagetable entries
mapping MMIO ranges MUST have the same cachability mapping - and in this
case we enforce UC.
Which means that with XSA 154 (and without this patch) any application
that maps /dev/mem to get SMBIOS information (like mcelog), and pokes
in the ISA region will not have a PTE set. That is due to
reserve_pfn_range returning -EINVAL which results in the PTE not being set.
[These are debug entries added in 'reserve_pfn_range']
mcelog:2471 0xf0000->0xf1000, req_type=write-back new_type=write-back
mcelog:2471 0xeb000->0xed000, req_type=write-back new_type=write-back
.. above are successful ones, but:
mcelog:2471 0xeb000->0xed000, req_type=uncached new_type=uncached
[again, a debug one:]
mcelog:2471 want=uncached got=write-back strict 0x000eb000-0x000ecfff
mcelog:2471 map pfn expected mapping type uncached for [mem 0x000eb000-0x000ecfff], got write-back
------------[ cut here ]------------
The effective result of the function below is for 'reserve_memtype'
to ignore the result from 'x86_platform.is_untracked_pat_range' quirk.
Which means that the splat above does not happen.
Orabug: 24491985 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Elena Ufimtseva [Thu, 21 Jul 2016 21:25:27 +0000 (17:25 -0400)]
xen-pciback: mark device to be hidden on AER error trigger
Some platforms are configured to reboot the machine upon
AER unrecoverable error and some virtualized systems are subject
to security risks described in XSA-124.
This patch allows simple containment of AER unrecoverable errors,
together with killing the guest upon receiving a fatal AER error.
The patch stores in xenstore the SBDF of the passed-through device that
triggered the AER unrecoverable error. This will allow xend to make the
device unassignable until the next reboot or a special hypervisor hypercall.
Juergen Gross [Wed, 18 May 2016 14:44:54 +0000 (16:44 +0200)]
xen: use same main loop for counting and remapping pages
Instead of having two functions for cycling through the E820 map in
order to count to be remapped pages and remap them later, just use one
function with a caller supplied sub-function called for each region to
be processed. This eliminates the possibility of a mismatch between
both loops which showed up in certain configurations.
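The pattern described above, one walker with a caller-supplied sub-function, might look like the sketch below. The region descriptor, the needs_remap flag and all function names are hypothetical simplifications and do not reproduce the kernel's E820 handling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified region descriptor (not the kernel's E820 entry). */
struct region {
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
	int needs_remap;
};

typedef unsigned long (*region_fn)(unsigned long start_pfn,
				   unsigned long end_pfn,
				   unsigned long acc);

/* One walker; the caller decides what happens for each selected region. */
static unsigned long foreach_remap_region(const struct region *map, size_t n,
					  region_fn fn, unsigned long acc)
{
	assert(fn != NULL);
	for (size_t i = 0; i < n; i++) {
		if (map[i].needs_remap)
			acc = fn(map[i].start_pfn, map[i].end_pfn, acc);
	}
	return acc;
}

/* Sub-function used for the counting pass. */
static unsigned long count_pages(unsigned long start_pfn, unsigned long end_pfn,
				 unsigned long acc)
{
	return acc + (end_pfn - start_pfn);
}
```

Because the counting pass and the remapping pass would share foreach_remap_region(), they cannot disagree about which regions are selected.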
Suggested-by: Ed Swierk <eswierk@skyportsystems.com> Signed-off-by: Juergen Gross <jgross@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit dd14be92fbf5bc1ef7343f34968440e44e21b46a) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 24012238
Jan Beulich [Thu, 7 Jul 2016 07:32:35 +0000 (01:32 -0600)]
xenbus: simplify xenbus_dev_request_and_reply()
No need to retain a local copy of the full request message, only the
type is really needed.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit e5a79475a7ae171fef82608c6e11f51bb85a6745) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 23585393
Jan Beulich [Thu, 7 Jul 2016 07:32:04 +0000 (01:32 -0600)]
xenbus: don't bail early from xenbus_dev_request_and_reply()
xenbus_dev_request_and_reply() needs to track whether a transaction is
open. For XS_TRANSACTION_START messages it calls transaction_start()
and for XS_TRANSACTION_END messages it calls transaction_end().
If sending an XS_TRANSACTION_START message fails or responds with
an error, the transaction is not open and transaction_end() must be
called.
If sending an XS_TRANSACTION_END message fails, the transaction is
still open, but if an error response is returned the transaction is
closed.
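The bookkeeping rules above can be summarised as a small state update. This is only a sketch: the enum values, struct txn_state and track_result() are hypothetical names and do not mirror the driver's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message kinds and outcomes for this sketch only. */
enum msg_kind { MSG_TRANSACTION_START, MSG_TRANSACTION_END, MSG_OTHER };
enum outcome { SEND_FAILED, REPLY_ERROR, REPLY_OK };

struct txn_state {
	bool open;
};

/* Update the "transaction open" flag after a request has been attempted. */
static void track_result(struct txn_state *st, enum msg_kind kind,
			 enum outcome result)
{
	assert(st != NULL);
	switch (kind) {
	case MSG_TRANSACTION_START:
		/* Open only if the start request was actually accepted. */
		st->open = (result == REPLY_OK);
		break;
	case MSG_TRANSACTION_END:
		/* A failed send leaves it open; any reply closes it. */
		if (result != SEND_FAILED)
			st->open = false;
		break;
	case MSG_OTHER:
		break;
	}
}
```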
Commit 027bd7e89906 ("xen/xenbus: Avoid synchronous wait on XenBus
stalling shutdown/restart") introduced a regression where failed
XS_TRANSACTION_START messages were leaving the transaction open. This
can cause problems with suspend (and migration) as all transactions
must be closed before suspending.
It appears that the problematic change was added accidentally, so just
remove it.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 7469be95a487319514adce2304ad2af3553d2fc9) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 23585393
Jan Beulich [Thu, 7 Jul 2016 07:23:57 +0000 (01:23 -0600)]
xenbus: don't BUG() on user mode induced condition
Inability to locate a user mode specified transaction ID should not
lead to a kernel crash. For other than XS_TRANSACTION_START also
don't issue anything to xenbus if the specified ID doesn't match that
of any active transaction.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 0beef634b86a1350c31da5fcc2992f0d7c8a622b) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 23585393
Bob Liu [Fri, 1 Jul 2016 01:11:15 +0000 (21:11 -0400)]
xen-blkfront: dynamic configuration of per-vbd resources
The current VBD layer reserves buffer space for each attached device based on
three statically configured settings which are read at boot time.
* max_indirect_segs: Maximum number of segments.
* max_ring_page_order: Maximum order of pages to be used for the shared ring.
* max_queues: Maximum number of queues (rings) to be used.
But the storage backend, workload, and guest memory result in very different
tuning requirements. It's impossible to centrally predict application
characteristics, so it's best to allow the settings to be dynamically
adjusted based on workload inside the guest.
Usage:
Show current values:
cat /sys/devices/vbd-xxx/max_indirect_segs
cat /sys/devices/vbd-xxx/max_ring_page_order
cat /sys/devices/vbd-xxx/max_queues
Bob Liu [Fri, 1 Jul 2016 21:43:39 +0000 (17:43 -0400)]
xen-blkfront: introduce blkif_set_queue_limits()
blk_mq_update_nr_hw_queues() resets all queue limits to their defaults, which
is not what xen-blkfront expects. Introduce blkif_set_queue_limits() to reset
the limits to the correct initial values.
Orabug: 23720696 Signed-off-by: Bob Liu <bob.liu@oracle.com>
where on certain very specific Intel platforms (which can do memory
hotplug) we would crash during bootup.
The reason for the failure is that we use the baremetal mechanism
to do memory hotplug - which is not appropriate. We should use
the xen-acpi-memory.c one, but ..
config XEN_ACPI_HOTPLUG_MEMORY
tristate "Xen ACPI memory hotplug"
depends on XEN_DOM0 && XEN_STUB && ACPI
and
config XEN_STUB
In the meantime we can:
1) Revert f5775e0b6116b7e2425ccf535243b21768566d87
I would prefer not as we would diverge from upstream when it comes
to backporting features.
2) Simulate acpi_no_memhotplug being passed on the Linux command line.
That is much easier, and we can carry this patch until upstream
gets a proper fix.
The patch "solves" the regression by implementing 2).
Reported-and-Tested-by: Deepak Patel <deepak.patel@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug:23735125
Jan Beulich [Fri, 24 Jun 2016 09:13:34 +0000 (03:13 -0600)]
xen-pciback: return proper values during BAR sizing
Reads following writes with all address bits set to 1 should return all
changeable address bits as one, not the BAR size (nor, as was the case
for the upper half of 64-bit BARs, the high half of the region's end
address). Presumably this didn't cause any problems so far because
consumers use the value to calculate the size (usually via val & -val),
and do nothing else with it.
But also consider the exception here: Unimplemented BARs should always
return all zeroes.
And finally, the check for whether to return the sizing address on read
for the ROM BAR should ignore all non-address bits, not just the ROM
Enable one.
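To make the sizing behaviour concrete, here is a sketch for a 32-bit memory BAR. It assumes the low four bits are flag bits and that the region size is a power of two of at least 16 bytes; both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_BAR_FLAG_MASK 0xfu	/* low four bits are not address bits */

/* Value a sizing read should return: all changeable address bits set. */
static uint32_t sizing_read_value(uint32_t size, uint32_t flags)
{
	assert(size >= 16 && (size & (size - 1)) == 0);	/* power of two */
	return ~(size - 1) | (flags & MEM_BAR_FLAG_MASK);
}

/* What a consumer typically does with it: isolate the lowest address bit. */
static uint32_t size_from_sizing_read(uint32_t val)
{
	uint32_t addr = val & ~MEM_BAR_FLAG_MASK;

	return addr & (~addr + 1);	/* same as addr & -addr */
}
```

An unimplemented BAR reads back as zero, for which this computation yields a size of zero.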
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit d2bd05d88d245c13b64c3bf9c8927a1c56453d8c) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 23585393
David Vrabel [Tue, 17 May 2016 14:54:50 +0000 (15:54 +0100)]
x86/xen: avoid m2p lookup when setting early page table entries
When page table entries are set using xen_set_pte_init() during early
boot there is no page fault handler that could handle a fault when
performing an M2P lookup.
In 64 bit guests (usually dom0) early_ioremap() would fault in
xen_set_pte_init() because an M2P lookup faults because the MFN is in
MMIO space and not mapped in the M2P. This lookup is done to see if
the PFN in in the range used for the initial page table pages, so that
the PTE may be set as read-only.
The M2P lookup can be avoided by moving the check (and clear of RW)
earlier when the PFN is still available.
Reported-by: Kevin Moraga <kmoragas@riseup.net> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com>
(cherry picked from commit d6b186c1e2d852a92c43f090d0d8fad4704d51ef) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 23585393
The current overlap check evaluates to false when a filter field is
fully contained in (a proper subset of) a r/w request. This change
applies the classical overlap check instead to cover all the
scenarios.
More specifically, for (Hilscher GmbH CIFX 50E-DP(M/S)) device driver
the logic is such that the entire confspace is read and written in 4
byte chunks. In this case as an example, CACHE_LINE_SIZE,
LATENCY_TIMER and PCI_BIST are arriving together in one call to
xen_pcibk_config_write() with offset == 0xc and size == 4. With the
existing overlap check the LATENCY_TIMER field (offset == 0xd, length
== 1) is fully contained in the write request and hence is excluded
from write, which is incorrect.
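A classical overlap test on two byte ranges can be sketched as below. The function name and signature are hypothetical; the offsets in the usage note are the ones given above.

```c
#include <assert.h>
#include <stdbool.h>

/* True when [a, a + alen) and [b, b + blen) share at least one byte. */
static bool ranges_overlap(unsigned int a, unsigned int alen,
			   unsigned int b, unsigned int blen)
{
	assert(alen > 0 && blen > 0);
	return a < b + blen && b < a + alen;
}
```

For the write at offset 0xc with size 4, ranges_overlap(0xc, 4, 0xd, 1) is true, so a one-byte field at 0xd is included rather than skipped.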
Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Jan Beulich <JBeulich@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 02ef871ecac290919ea0c783d05da7eedeffc10e) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 23585393
Juergen Gross [Thu, 23 Jun 2016 05:12:27 +0000 (07:12 +0200)]
x86/xen: fix upper bound of pmd loop in xen_cleanhighmap()
xen_cleanhighmap() is operating on level2_kernel_pgt only. The upper
bound of the loop setting non-kernel-image entries to zero should not
exceed the size of level2_kernel_pgt.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 1cf38741308c64d08553602b3374fb39224eeb5a) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 23585393
Ross Lagerwall [Tue, 10 May 2016 09:27:54 +0000 (10:27 +0100)]
xen/balloon: Fix declared-but-not-defined warning
Fix a declared-but-not-defined warning when building with
XEN_BALLOON_MEMORY_HOTPLUG=n. This fixes a regression introduced by
commit dfd74a1edfab ("xen/balloon: Fix crash when ballooning on x86 32
bit PAE").
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Acked-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 842775f1509054ea969f1787f38d6a0ec2ccfaba) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 23585393
Bob Liu [Tue, 31 May 2016 08:59:17 +0000 (16:59 +0800)]
xen-blkfront: fix resume issues after a migration
After a migrate to another host (which may not have multiqueue
support), the number of rings (block hardware queues)
may be changed and the ring info structure will also be reallocated.
This patch fixes two related bugs:
* call blk_mq_update_nr_hw_queues() to make blk-core aware that the number
of hardware queues has changed.
* Don't store rinfo pointer to hctx->driver_data, because rinfo may be
reallocated so use hctx->queue_num to get the rinfo structure instead.
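The second point, resolving the ring from a stable index at use time instead of caching a pointer, can be sketched as follows. struct ring_info, struct frontend and ring_for_queue() are hypothetical stand-ins, not the driver's types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for per-ring state. */
struct ring_info {
	unsigned int id;
};

struct frontend {
	struct ring_info *rinfo;	/* may be reallocated after migration */
	unsigned int nr_rings;
};

/* Look the ring up by queue number each time it is needed. */
static struct ring_info *ring_for_queue(struct frontend *fe,
					unsigned int queue_num)
{
	assert(queue_num < fe->nr_rings);
	return &fe->rinfo[queue_num];
}
```

A pointer saved before the array is replaced would dangle, whereas the index stays valid as long as it is below the new ring count.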
Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 2a6f71ad99cabe436e70c3f5fcf58072cb3bc07f) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 23585393
Bob Liu [Tue, 7 Jun 2016 14:43:15 +0000 (10:43 -0400)]
xen-blkfront: don't call talk_to_blkback when already connected to blkback
Sometimes blkfront may twice receive blkback_changed() notification
(XenbusStateConnected) after migration, which will cause
talk_to_blkback() to be called twice too and confuse xen-blkback.
The flow is as follows:
blkfront blkback
blkfront_resume()
> talk_to_blkback()
> Set blkfront to XenbusStateInitialised
front changed()
> Connect()
> Set blkback to XenbusStateConnected
blkback_changed()
> Skip talk_to_blkback()
because frontstate == XenbusStateInitialised
> blkfront_connect()
> Set blkfront to XenbusStateConnected
-----
And here we get another XenbusStateConnected notification leading
to:
-----
blkback_changed()
> because now frontstate != XenbusStateInitialised
talk_to_blkback() is also called again
> blkfront state changed from
XenbusStateConnected to XenbusStateInitialised
(Which is not correct!)
front_changed():
> Do nothing because blkback
already in XenbusStateConnected
Now blkback is in XenbusStateConnected but blkfront is still
in XenbusStateInitialised - leading to no disks.
Poking of the XenbusStateConnected state is allowed (to deal with
block disk change) and has to be dealt with. The most likely
cause of this bug is custom udev scripts hooking up the disks
and then validating the size.
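One way to express the guard that the subject line describes is sketched below. The state names, struct front_dev and on_backend_connected() are hypothetical and greatly simplified compared with the xenbus state handling in the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced set of frontend states. */
enum front_state { FRONT_INITIALISING, FRONT_INITIALISED, FRONT_CONNECTED };

struct front_dev {
	enum front_state state;
	unsigned int negotiations;	/* how often the talk step ran */
};

/* Handle a "backend is Connected" notification. */
static void on_backend_connected(struct front_dev *dev)
{
	assert(dev != NULL);
	/* Renegotiate only if not already initialised or connected. */
	if (dev->state != FRONT_INITIALISED && dev->state != FRONT_CONNECTED) {
		dev->negotiations++;
		dev->state = FRONT_INITIALISED;
	}
	dev->state = FRONT_CONNECTED;
}
```

With this guard a repeated notification leaves the frontend connected instead of dropping it back to the initialised state.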
Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit efd1535270c1deb0487527bf0c3c827301a69c93) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 23585393
Juergen Gross [Wed, 18 May 2016 14:44:54 +0000 (16:44 +0200)]
xen: use same main loop for counting and remapping pages
Instead of having two functions for cycling through the E820 map in
order to count to be remapped pages and remap them later, just use one
function with a caller supplied sub-function called for each region to
be processed. This eliminates the possibility of a mismatch between
both loops which showed up in certain configurations.
Suggested-by: Ed Swierk <eswierk@skyportsystems.com> Signed-off-by: Juergen Gross <jgross@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit dd14be92fbf5bc1ef7343f34968440e44e21b46a) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 23585393