Instead of using physical addresses for accounting of extra memory
areas available for ballooning, switch to pfns, as this is much less
error-prone regarding partial pages.
Reported-by: Roger Pau Monné <roger.pau@citrix.com> Tested-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 626d7508664c4bc8e67f496da4387ecd0c410b8c) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
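A minimal sketch of why counting page frames is less fragile than counting bytes, assuming 4 KiB pages. The helpers pfn_up() and pfn_down() mirror the idea of the kernel's PFN_UP/PFN_DOWN macros but are defined locally here, and usable_pages() is a hypothetical name, not a function from the patch.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                       /* assumption: 4 KiB pages */
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/* Round a physical address up to the next page frame number. */
uint64_t pfn_up(uint64_t addr)
{
    return (addr + PAGE_SIZE - 1) >> PAGE_SHIFT;
}

/* Round a physical address down to its page frame number. */
uint64_t pfn_down(uint64_t addr)
{
    return addr >> PAGE_SHIFT;
}

/*
 * Number of whole pages inside [start, end). Partial pages at either
 * edge are dropped, so the result never claims a frame that is only
 * partly usable.
 */
uint64_t usable_pages(uint64_t start, uint64_t end)
{
    uint64_t first = pfn_up(start);
    uint64_t last = pfn_down(end);

    return last > first ? last - first : 0;
}
```

For example, the byte range 0x1800-0x4800 is 0x3000 bytes long, which looks like three pages, but only the two frames at 0x2000 and 0x3000 are fully contained in it.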
When a pv-domain (including dom0) is started it tries to size its
p2m list according to the maximum amount of memory it can ever
reach. Limit the initial maximum memory size to the architectural
limit of the hardware in order to avoid overflows during remapping
of memory.
This problem will occur when dom0 is started with an initial memory
size being a multiple of 1GB, but without specifying its maximum
memory size. The kernel must be configured without
CONFIG_XEN_BALLOON_MEMORY_HOTPLUG for the problem to happen.
Reported-by: Roger Pau Monné <roger.pau@citrix.com> Tested-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit cb9e444b5aaa900bb4310da411315b6947c53e37) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Juergen Gross [Wed, 19 Aug 2015 16:53:11 +0000 (18:53 +0200)]
xen: avoid another early crash of memory limited dom0
Commit b1c9f169047b ("xen: split counting of extra memory pages...")
introduced an error when dom0 was started with limited memory occurring
only on some hardware.
The problem arises in case dom0 is started with initial memory and
maximum memory being the same. The kernel must be configured without
CONFIG_XEN_BALLOON_MEMORY_HOTPLUG for the problem to happen. If all
of this is true and the E820 map of the machine is sparse (some areas
are not covered) then the machine might crash early in the boot
process.
An example E820 map triggering the problem looks like this:
In this case the area a0000-dffff isn't present in the map. This will
confuse the memory setup of the domain when remapping the memory from
such holes to populated areas.
To avoid the problem the accounting of to be remapped memory has to
count such holes in the E820 map as well.
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit ab24507cfae8d916814bb6c16f66e453184a29a5) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
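A minimal sketch of the accounting idea from the commit above, assuming a simplified, pfn-based map whose entries do not overlap. struct region and count_non_ram_pfns() are hypothetical names for illustration; the real code works on the E820 map structures of the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified map entry covering the pfns [start_pfn, end_pfn). */
struct region {
    uint64_t start_pfn;
    uint64_t end_pfn;
    int      is_ram;
};

/*
 * Count the page frames below max_pfn that are not RAM. Instead of
 * summing up the non-RAM entries, subtract the RAM frames from the
 * total, so frames not covered by any entry (holes) are counted too.
 */
uint64_t count_non_ram_pfns(const struct region *map, size_t n,
                            uint64_t max_pfn)
{
    uint64_t ram = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t start = map[i].start_pfn;
        uint64_t end = map[i].end_pfn;

        if (!map[i].is_ram || start >= max_pfn)
            continue;
        if (end > max_pfn)
            end = max_pfn;
        ram += end - start;
    }
    return max_pfn - ram;
}
```

With a map that has RAM up to pfn 0xa0, nothing at 0xa0-0xdf, a reserved area at 0xe0-0xff and RAM above that, the result includes the 0x40 frames of the hole as well as the 0x20 reserved frames.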
Juergen Gross [Wed, 19 Aug 2015 16:52:34 +0000 (18:52 +0200)]
xen: avoid early crash of memory limited dom0
Commit b1c9f169047b ("xen: split counting of extra memory pages...")
introduced an error when dom0 was started with limited memory.
The problem arises in case dom0 is started with initial memory and
maximum memory being the same and exactly a multiple of 1 GB. The
kernel must be configured without CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
for the problem to happen. In this case it will crash very early
during boot due to the virtual mapped p2m list not being large
enough to be able to remap any memory:
This can be avoided by allocating enough space for the p2m to cover
the maximum memory of dom0 plus the identity mapped holes required
for PCI space, BIOS etc.
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit eafd72e016c69df511b14a98b61e439c58ad9c51) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Boris Ostrovsky [Mon, 10 Aug 2015 20:34:38 +0000 (16:34 -0400)]
xen/x86: Don't try to set PCE bit in CR4
Since the VPMU code emulates the RDPMC instruction with RDMSR, and because
the hypervisor does not emulate it, there is no reason to try setting CR4's
PCE bit (and the hypervisor will warn on seeing it set).
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 3375d8284dfb7866f261ec008d15d30999ff273b) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
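A small sketch of what not setting the PCE bit amounts to: the bit is masked out of the value before it is handed on. PCE is bit 8 of CR4; the function name filter_cr4() is hypothetical and the sketch ignores any other bits the real code may filter.

```c
#define X86_CR4_PCE (1UL << 8)   /* CR4 bit 8: performance-monitoring counter enable */

/* Hypothetical filter applied to a CR4 value before it is written. */
unsigned long filter_cr4(unsigned long cr4)
{
    return cr4 & ~X86_CR4_PCE;
}
```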
Boris Ostrovsky [Mon, 10 Aug 2015 20:34:37 +0000 (16:34 -0400)]
xen/PMU: PMU emulation code
Add PMU emulation code that runs when we are processing a PMU interrupt.
This code will allow us not to trap to hypervisor on each MSR/LVTPC access
(of which there may be quite a few in the handler).
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit bf6dfb154d935725c9a2005033ca33017b9df439) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Boris Ostrovsky [Mon, 10 Aug 2015 20:34:36 +0000 (16:34 -0400)]
xen/PMU: Intercept PMU-related MSR and APIC accesses
Provide interfaces for recognizing accesses to PMU-related MSRs and
LVTPC APIC and process these accesses in Xen PMU code.
(The interrupt handler performs XENPMU_flush right away in the beginning
since no PMU emulation is available. It will be added with a later patch).
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 6b08cd6328c58a2ae190c5ee03a2ffcab5ef828e) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Boris Ostrovsky [Mon, 10 Aug 2015 20:34:35 +0000 (16:34 -0400)]
xen/PMU: Describe vendor-specific PMU registers
AMD and Intel PMU register initialization and helpers that determine
whether a register belongs to PMU.
This and some of the subsequent PMU emulation code is somewhat similar to
Xen's PMU implementation.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit e27b72df01109c689062caeba1defa013b759e0e) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Boris Ostrovsky [Mon, 10 Aug 2015 20:34:34 +0000 (16:34 -0400)]
xen/PMU: Initialization code for Xen PMU
Map shared data structure that will hold CPU registers, VPMU context,
V/PCPU IDs of the CPU interrupted by PMU interrupt. Hypervisor fills
this information in its handler and passes it to the guest for further
processing.
Set up PMU VIRQ.
Now that perf infrastructure will assume that PMU is available on a PV
guest we need to be careful and make sure that accesses via RDPMC
instruction don't cause fatal traps by the hypervisor. Provide a nop
RDPMC handler.
For the same reason avoid issuing a warning on a write to APIC's LVTPC.
Both of these will be made functional in later patches.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 65d0cf0be79feebeb19e7626fd3ed41ae73f642d) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Boris Ostrovsky [Mon, 10 Aug 2015 20:34:33 +0000 (16:34 -0400)]
xen/PMU: Sysfs interface for setting Xen PMU mode
Set Xen's PMU mode via /sys/hypervisor/pmu/pmu_mode. Add XENPMU hypercall.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 5f141548824cebbff2e838ff401c34e667797467) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
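A minimal user-space sketch of reading the current mode from the sysfs file named above. read_pmu_mode() is a hypothetical helper; the set of accepted mode strings is not listed in the commit message, so the sketch only returns whatever the file contains.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of `path` into `buf` and strip the trailing
 * newline. Returns 0 on success and -1 if the file cannot be read.
 */
int read_pmu_mode(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

On a system running this kernel under Xen the helper would be called with "/sys/hypervisor/pmu/pmu_mode" as the path.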
Boris Ostrovsky [Mon, 10 Aug 2015 20:34:32 +0000 (16:34 -0400)]
xen: xensyms support
Export Xen symbols to dom0 via /proc/xen/xensyms (similar to
/proc/kallsyms).
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit a11f4f0a4e18b4bdc7d5e36438711e038b7a1f74) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
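A sketch of parsing one line of such a symbol listing, assuming the lines follow the /proc/kallsyms layout of a hexadecimal address, a one-character type and a name. The exact xensyms line format is an assumption here, and struct sym and parse_sym_line() are hypothetical names.

```c
#include <stdio.h>

/* One parsed symbol line (assumed layout: "<hex address> <type> <name>"). */
struct sym {
    unsigned long long addr;
    char               type;
    char               name[128];
};

/* Returns 0 when all three fields were parsed, -1 otherwise. */
int parse_sym_line(const char *line, struct sym *out)
{
    int fields = sscanf(line, "%llx %c %127s",
                        &out->addr, &out->type, out->name);

    return fields == 3 ? 0 : -1;
}
```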
xen: allow more than 512 GB of RAM for 64 bit pv-domains
64 bit pv-domains under Xen are limited to 512 GB of RAM today. The
main reason has been the 3 level p2m tree, which was replaced by the
virtual mapped linear p2m list. Parallel to the p2m list which is
being used by the kernel itself there is a 3 level mfn tree for usage
by the Xen tools and possibly for crash dump analysis. For this tree
the linear p2m list can serve as a replacement, too. As the kernel
can't know whether the tools are capable of dealing with the p2m list
instead of the mfn tree, the limit of 512 GB can't be dropped in all
cases.
This patch replaces the hard limit by a kernel parameter which tells
the kernel whether to obey the 512 GB limit. The default is selected by
a configuration parameter which specifies whether the 512 GB limit
should be active by default for domUs (domain save/restore/migration
and crash dump analysis are affected).
Memory above the domain limit is returned to the hypervisor instead of
being identity mapped, which was wrong anyway.
The kernel configuration parameter to specify the maximum size of a
domain can be deleted, as it is not relevant any more.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit c70727a5bc18a5a233fddc6056d1de9144d7a293) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Check whether the hypervisor supplied p2m list is placed at a location
which is conflicting with the target E820 map. If this is the case
relocate it to a new area unused up to now and compliant to the E820
map.
As the p2m list might be huge (up to several GB) and is required to be
mapped virtually, set up a temporary mapping for the copied list.
For pvh domains just delete the p2m related information from start
info instead of reserving the p2m memory, as we don't need it at all.
For 32 bit kernels adjust the memblock_reserve() parameters in order
to cover the page tables only. This requires a separate
memblock_reserve() call for the start_info page.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 70e61199559a09c62714694cd5ac3c3640c41552) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
xen: add explicit memblock_reserve() calls for special pages
Some special pages containing interfaces to Xen are currently reserved
only implicitly. The memblock_reserve() call which reserves them is
meant to reserve the p2m list supplied by Xen, but it covers not only
the p2m list itself but also some more pages up to the start of
the page tables built by Xen.
To be able to move the p2m list to another pfn range, which is needed
for support of huge RAM, this memblock_reserve() must be split up to
cover all affected reserved pages explicitly.
The affected pages are:
- start_info page
- xenstore ring (might be missing, mfn is 0 in this case)
- console ring (not for initial domain)
Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 6c2681c863b24360098d1ba60f2af060a13a0561) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
mm: provide early_memremap_ro to establish read-only mapping
During early boot as Xen pv domain the kernel needs to map some page
tables supplied by the hypervisor read only. This is needed to be
able to relocate some data structures conflicting with the physical
memory map especially on systems with huge RAM (above 512GB).
Provide the function early_memremap_ro() to provide this read only
mapping.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 2592dbbbf4c67501c2bd2dcf89c2b8924d592a9f) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Check whether the initrd is placed at a location which is conflicting
with the target E820 map. If this is the case relocate it to a new
area unused up to now and compliant to the E820 map.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 4b9c15377f96e241be347fd3bbeeff74fbad0b44) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
xen: check pre-allocated page tables for conflict with memory map
Check whether the page tables built by the domain builder are at
memory addresses which are in conflict with the target memory map.
If this is the case just panic instead of running into problems
later.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 04414baab5ba862b10bde837c4773ffdbb78f0e0) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
xen: check for kernel memory conflicting with memory layout
Check whether the pre-allocated memory of the loaded kernel is in
conflict with the target memory map. If this is the case, just panic
instead of running into problems later, as there is nothing we can do
to repair this situation.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 808fdb71936c41d46245f0e3aa6ec889cba70d97) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
For being able to relocate pre-allocated data areas like initrd or
p2m list it is mandatory to find a contiguous memory area which is
not yet in use and doesn't conflict with the memory map we want to
be in effect.
If such an area is found, reserve it at once, as this has to be
done in any case.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 9ddac5b724a9465e27f25a0aa943e92c8341a85b) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Provide a service routine to check a physical memory area against the
E820 map. The routine will return false if the complete area is RAM
according to the E820 map and true otherwise.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit e612b4a7db4ae1dd8c2bbe171e10c21723de95b2) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
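A minimal sketch of such a check with the return convention described above: false when the whole area is RAM, true otherwise. It assumes a simplified map whose entries are sorted by address and do not overlap; struct map_entry and area_conflicts() are hypothetical names, not the ones used in the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified E820-style entry. */
struct map_entry {
    uint64_t addr;
    uint64_t size;
    bool     is_ram;
};

/*
 * Returns false when [start, start + size) is completely covered by
 * RAM entries and true otherwise. Entries must be sorted by address
 * and must not overlap.
 */
bool area_conflicts(const struct map_entry *map, size_t n,
                    uint64_t start, uint64_t size)
{
    uint64_t end = start + size;

    for (size_t i = 0; i < n && start < end; i++) {
        uint64_t entry_end = map[i].addr + map[i].size;

        if (!map[i].is_ram || entry_end <= start)
            continue;        /* not RAM, or entirely below the area */
        if (map[i].addr > start)
            break;           /* uncovered gap before this RAM entry */
        start = entry_end;   /* area is RAM up to the end of this entry */
    }
    return start < end;      /* anything left uncovered is a conflict */
}
```

Walking the sorted map once and advancing the start of the still unchecked part keeps the check linear in the number of entries and handles an area that spans several adjacent RAM entries.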
xen: split counting of extra memory pages from remapping
Memory pages in the initial memory setup done by the Xen hypervisor
conflicting with the target E820 map are remapped. In order to do this
those pages are counted and remapped in xen_set_identity_and_remap().
Split the counting from the remapping operation to be able to set up
the needed memory sizes in time while doing the remap operation at a
later time. This enables us to simplify the interface to
xen_set_identity_and_remap() as the number of remapped and released
pages is no longer needed here.
Finally move the remapping further down to prepare relocating
conflicting memory contents before the memory might be clobbered by
xen_set_identity_and_remap(). This requires not destroying the Xen
E820 map when the one for the system is being constructed.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 5097cdf6cef15439f971df54f9abcf143d7ca698) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Instead of using a function local static e820 map in xen_memory_setup()
and calling various functions in the same source with the map as a
parameter use a map directly accessible by all functions in the source.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 69632ecfcd03b12202ed62dfa0aabac83904f8ac) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
xen: eliminate scalability issues from initial mapping setup
Direct Xen to place the initial P->M table outside of the initial
mapping, as otherwise the 1G (implementation) / 2G (theoretical)
restriction on the size of the initial mapping limits the amount
of memory a domain can be handed initially.
As the initial P->M table is copied rather early during boot to
domain private memory and its initial virtual mapping is dropped,
the easiest way to avoid virtual address conflicts with other
addresses in the kernel is to use a user address area for the
virtual address of the initial P->M table. This allows us to just
throw away the page tables of the initial mapping after the copy
without having to care about address invalidation.
It should be noted that this patch won't enable a pv-domain to USE
more than 512 GB of RAM. It just enables it to be started with a
P->M table covering more memory. This is especially important for
being able to boot a Dom0 on a system with more than 512 GB memory.
Signed-off-by: Juergen Gross <jgross@suse.com> Based-on-patch-by: Jan Beulich <jbeulich@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 8f5b0c63987207fd5c3c1f89c9eb6cb95b30386e) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
xen: save linear p2m list address in shared info structure
The virtual address of the linear p2m list should be stored in the
shared info structure read by the Xen tools to be able to support
64 bit pv-domains larger than 512 GB. Additionally the linear p2m
list interface includes a generation count which is changed prior
to and after each mapping change of the p2m list. By reading the
generation count the Xen tools can detect changes of the mappings
and re-read the p2m list if necessary.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 4b9c9a11803eaa73b3223da9fcaea39b2f919d80) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
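A sketch of how a consumer could use such a generation count to take a consistent copy of the list. struct p2m_view and snapshot_p2m() are hypothetical stand-ins, not the shared info layout. Treating an odd count as "update in progress" is an assumption of this sketch, and a real consumer reading guest memory would also need proper memory barriers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of the fields a consumer would look at. */
struct p2m_view {
    volatile uint64_t generation;   /* bumped before and after each change */
    const uint64_t   *list;
    size_t            entries;
};

/*
 * Copy the list into dst and retry while the generation count changes
 * underneath us. Returns 0 on a consistent copy, -1 after max_tries.
 */
int snapshot_p2m(const struct p2m_view *v, uint64_t *dst, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        uint64_t before = v->generation;

        if (before & 1)
            continue;   /* assumption: odd means an update is in progress */
        memcpy(dst, v->list, v->entries * sizeof(*dst));
        if (v->generation == before)
            return 0;   /* nothing changed while copying */
    }
    return -1;
}
```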
Currently, the event channel rebind code is gated on the presence of
the vector callback.
The virtual interrupt controller on ARM has the concept of per-CPU
interrupts (PPIs), which allow us to support per-VCPU event channels.
Therefore there is no need for a vector callback on ARM.
Xen is already using a free PPI to notify the guest VCPU of an event.
Furthermore, the Xen initialization code in Linux (see
arch/arm/xen/enlighten.c) correctly requests a per-CPU IRQ.
Introduce a new helper, xen_support_evtchn_rebind, to allow each
architecture to decide whether rebinding an event is supported. It will
always return true on ARM and keep the same behavior on x86.
This also allows us to drop the usage of xen_have_vector_callback
entirely in the ARM code.
Signed-off-by: Julien Grall <julien.grall@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 4a5b69464e51f4a8dd432e8c2a1468630df1a53c) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Bob Liu [Mon, 13 Jul 2015 09:55:24 +0000 (17:55 +0800)]
xen-blkfront: convert to blk-mq APIs
Note: This patch is based on original work of Arianna's internship for
GNOME's Outreach Program for Women.
Only one hardware queue is used now, so there is no significant
performance change.
The legacy non-mq code is deleted completely which is the same as other
drivers like virtio, mtip, and nvme.
Also dropped one unnecessary holding of info->io_lock when calling
blk_mq_stop_hw_queues().
Signed-off-by: Arianna Avanzini <avanzini.arianna@gmail.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Jens Axboe <axboe@fb.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 907c3eb18e0bd86ca12a9de80befe8e3647bac3e) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konstantin Khlebnikov [Wed, 15 Jul 2015 09:52:01 +0000 (12:52 +0300)]
xen/preempt: use need_resched() instead of should_resched()
This code is used only when CONFIG_PREEMPT=n and only in non-atomic
context: xen_in_preemptible_hcall is set only in
privcmd_ioctl_hypercall(). Thus preempt_count is zero and
should_resched() is equal to need_resched().
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit a7da51ae10032a507ddeae6a490916eadbd1e10a) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Colin Ian King [Thu, 16 Jul 2015 19:34:42 +0000 (20:34 +0100)]
x86/xen: fix non-ANSI declaration of xen_has_pv_devices()
xen_has_pv_devices() has no parameters, so use the normal void
parameter convention to make it match the prototype in the header file
include/xen/platform_pci.h.
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 772f95e3b9460c64fb99b134022855cbce75b9a0) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
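A small illustration of the declaration style the commit above switches to. has_pv_devices_sketch() is a hypothetical stand-in for the real function, and its body is a placeholder.

```c
#include <stdbool.h>

/* Prototype as it would appear in a header: explicitly no parameters. */
bool has_pv_devices_sketch(void);

/*
 * The definition repeats the (void) parameter list. Writing it with
 * empty parentheses instead would, before C23, declare a function
 * with an unspecified parameter list and no longer match the
 * prototype exactly.
 */
bool has_pv_devices_sketch(void)
{
    return true;   /* placeholder result for the sketch */
}
```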
Joe Jin [Mon, 19 Oct 2015 05:37:17 +0000 (13:37 +0800)]
xen-netfront: update num_queues to real created
Sometimes xennet_create_queues() may fail to create all requested
queues. In that case num_queues must be updated to the number actually
created to avoid a NULL pointer dereference.
Signed-off-by: Joe Jin <joe.jin@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David S. Miller <davem@davemloft.net>
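A minimal sketch of the pattern described above: the creation routine reports how many queues it really set up, and the caller keeps that number instead of the requested one. struct queue, create_queues() and the fail_at parameter (which only simulates a failure) are hypothetical.

```c
#include <stdlib.h>

struct queue {
    unsigned int id;
};

/*
 * Try to create `requested` queues and stop at the first failure.
 * Returns the number of queues actually created; slots beyond that
 * number are left untouched (NULL if the caller zeroed the array).
 */
unsigned int create_queues(struct queue **queues, unsigned int requested,
                           unsigned int fail_at)
{
    unsigned int i;

    for (i = 0; i < requested; i++) {
        if (i == fail_at)
            break;                       /* simulated setup failure */
        queues[i] = malloc(sizeof(*queues[i]));
        if (!queues[i])
            break;
        queues[i]->id = i;
    }
    return i;
}
```

A caller that later loops over the queues must use the returned count as its bound. Looping up to the requested number would touch the NULL slots.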
xen-blkfront will crash if the call to talk_to_blkback()
in blkback_changed() (XenbusStateInitWait) returns an error.
The driver data is freed and info is set to NULL. Later, during
the close process triggered by talk_to_blkback()'s call to xenbus_dev_fatal(),
the NULL pointer is passed to and dereferenced in blkfront_closing().
Signed-off-by: Cathy Avery <cathy.avery@oracle.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Wei Liu [Thu, 10 Sep 2015 10:18:58 +0000 (11:18 +0100)]
xen-netfront: respect user provided max_queues
Originally that parameter was always reset to num_online_cpus during
module initialisation, which renders it useless.
The fix is to only set max_queues to num_online_cpus when user has not
provided a value.
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Tested-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 32a844056fd43dda647e1c3c6b9983bdfa04d17d) Signed-off-by: Annie Li <annie.li@oracle.com>
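The fix described above boils down to a default that is applied only when no value was given. A sketch, assuming that 0 means "not set by the user"; effective_max_queues() is a hypothetical name, and the CPU count is passed in as a parameter to keep the example self-contained.

```c
/*
 * Keep a user-provided max_queues value; fall back to the number of
 * online CPUs only when the parameter was left at 0 (assumed to mean
 * "not set").
 */
unsigned int effective_max_queues(unsigned int user_value,
                                  unsigned int online_cpus)
{
    return user_value ? user_value : online_cpus;
}
```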
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle> Signed-off-by: Chas Williams <3chas3@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 274b045509175db0405c784be85e8cce116e6f7d) Signed-off-by: Annie Li <annie.li@oracle.com>
Chas Williams [Wed, 19 Aug 2015 23:14:20 +0000 (19:14 -0400)]
net/xen-netfront: only clean up queues if present
If you simply load and unload the module without starting the interfaces,
the queues are never created and you get a bad pointer dereference.
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle> Signed-off-by: Chas Williams <3chas3@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 9a873c71e91cabf4c10fd9bbd8358c22deaf6c9e) Signed-off-by: Annie Li <annie.li@oracle.com>
Wei Liu [Thu, 10 Sep 2015 10:18:57 +0000 (11:18 +0100)]
xen-netback: respect user provided max_queues
Originally that parameter was always reset to num_online_cpus during
module initialisation, which renders it useless.
The fix is to only set max_queues to num_online_cpus when user has not
provided a value.
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle> Reported-by: Johnny Strom <johnny.strom@linuxsolutions.fi> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 4c82ac3c37363e8c4ded6a5fe1ec5fa756b34df3) Signed-off-by: Annie Li <annie.li@oracle.com>
The PV frontend in IPXE only places 4 requests on the guest Rx ring.
Since netback required at least (MAX_SKB_FRAGS + 1) slots, IPXE could
not receive any packets.
a) If GSO is not enabled on the VIF, fewer guest Rx slots are required
for the largest possible packet. Calculate the required slots
based on the maximum GSO size or the MTU.
This calculation of the number of required slots relies on 1650d5455bd2 (xen-netback: always fully coalesce guest Rx packets)
which is present in 4.0-rc1 and later.
b) Reduce the Rx stall detection to checking for at least one
available Rx request. This is fine since we're predominantly
concerned with detecting interfaces which are down and thus have
zero available Rx requests.
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 1d5d48523900a4b0f25d6b52f1a93c84bd671186) Signed-off-by: Annie Li <annie.li@oracle.com>
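A sketch of the slot calculation from point a), assuming 4 KiB pages and fully coalesced packets, so that a packet of a given length needs one slot per page. rx_slots_needed() is a hypothetical name, and the extra slot added for GSO metadata is an assumption of this sketch.

```c
#define PAGE_SIZE 4096u                       /* assumption: 4 KiB pages */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Slots needed for the largest possible packet: one per page of
 * payload, sized by the maximum GSO size when GSO is enabled and by
 * the MTU otherwise.
 */
unsigned int rx_slots_needed(int gso_enabled, unsigned int gso_max_size,
                             unsigned int mtu)
{
    unsigned int len = gso_enabled ? gso_max_size : mtu;
    unsigned int needed = DIV_ROUND_UP(len, PAGE_SIZE);

    if (gso_enabled)
        needed++;   /* assumption: one extra slot for GSO metadata */
    return needed;
}
```

With GSO disabled and a 1500 byte MTU this yields a single slot, which fits comfortably into the 4 requests the iPXE frontend places on the ring.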
Paul Durrant [Wed, 2 Sep 2015 16:58:36 +0000 (17:58 +0100)]
xen-netback: add support for multicast control
Xen's PV network protocol includes messages to add/remove ethernet
multicast addresses to/from a filter list in the backend. This allows
the frontend to request the backend only forward multicast packets
which are of interest thus preventing unnecessary noise on the shared
ring.
The canonical netif header in git://xenbits.xen.org/xen.git specifies
the message format (two more XEN_NETIF_EXTRA_TYPEs) so the minimal
necessary changes have been pulled into include/xen/interface/io/netif.h.
To prevent the frontend from extending the multicast filter list
arbitrarily a limit (XEN_NETBK_MCAST_MAX) has been set to 64 entries.
This limit is not specified by the protocol and so may change in future.
If the limit is reached then the next XEN_NETIF_EXTRA_TYPE_MCAST_ADD
sent by the frontend will be rejected with NETIF_RSP_ERROR.
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle> Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 210c34dcd8d912dcc740f1f17625a7293af5cb56) Signed-off-by: Annie Li <annie.li@oracle.com>
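A minimal sketch of a bounded multicast filter list as described above. The limit of 64 comes from the commit message; struct mcast_filter and mcast_add() are hypothetical, and the -1 return stands in for the error response sent to the frontend.

```c
#include <string.h>

#define MCAST_MAX 64   /* limit named in the commit message */
#define ETH_ALEN  6    /* length of an ethernet address in bytes */

struct mcast_filter {
    unsigned char addr[MCAST_MAX][ETH_ALEN];
    unsigned int  count;
};

/* Add an address; returns 0 on success, -1 once the list is full. */
int mcast_add(struct mcast_filter *f, const unsigned char *addr)
{
    if (f->count >= MCAST_MAX)
        return -1;
    memcpy(f->addr[f->count], addr, ETH_ALEN);
    f->count++;
    return 0;
}
```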
Ross Lagerwall [Tue, 4 Aug 2015 14:40:59 +0000 (15:40 +0100)]
xen/netback: Wake dealloc thread after completing zerocopy work
Waking the dealloc thread before decrementing inflight_packets is racy
because it means the thread may go to sleep before inflight_packets is
decremented. If kthread_stop() has already been called, the dealloc
thread may wait forever with nothing to wake it. Instead, wake the
thread only after decrementing inflight_packets.
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 57b229063ae6dc65036209018dc7f4290cc026bb) Signed-off-by: Annie Li <annie.li@oracle.com>
Ross Lagerwall [Mon, 3 Aug 2015 14:38:03 +0000 (15:38 +0100)]
xen-netback: Allocate fraglist early to avoid complex rollback
Determine if a fraglist is needed in the tx path, and allocate it if
necessary before setting up the copy and map operations.
Otherwise, undoing the copy and map operations is tricky.
This fixes a use-after-free: if allocating the fraglist failed, the copy
and map operations that had been set up were still executed, writing
over the data area of a freed skb.
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 2475b22526d70234ecfe4a1ff88aed69badefba9) Signed-off-by: Annie Li <annie.li@oracle.com>
Dan Carpenter [Sat, 11 Jul 2015 22:20:55 +0000 (01:20 +0300)]
net/xen-netback: off by one in BUG_ON() condition
The > should be >=. I also added spaces around the '-' operations so
the code is a little more consistent and matches the condition better.
Fixes: f53c3fe8dad7 ('xen-netback: Introduce TX grant mapping') Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 50c2e4dd6749725338621fff456b26d3a592259f) Signed-off-by: Annie Li <annie.li@oracle.com>
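A sketch of the off-by-one class fixed above, using a pointer difference against an array capacity. would_overflow() is a hypothetical helper, not the condition from the driver.

```c
#include <stddef.h>

/*
 * `next` points at the slot about to be written in an array starting
 * at `base` with `capacity` elements. Valid indices are
 * 0..capacity - 1, so the check must use >=. With > the index equal
 * to capacity would slip through.
 */
int would_overflow(const int *base, const int *next, size_t capacity)
{
    return (size_t)(next - base) >= capacity;
}
```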
xen-netback: remove duplicated function definition
There are two duplicated xenvif_zerocopy_callback() definitions.
Remove one of them.
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle> Signed-off-by: Liang Li <liang.z.li@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6ab13b27699e5a71cca20d301c3c424653bd0841) Signed-off-by: Annie Li <annie.li@oracle.com>
Julien Grall [Tue, 16 Jun 2015 19:10:48 +0000 (20:10 +0100)]
net/xen-netback: Don't mix hexa and decimal with 0x in the printf format
Prefix every %x with 0x to avoid confusion when reading a log that
also contains decimal values.
Also change some of the hexadecimal output to decimal to make the
format consistent with netfront.
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle> Signed-off-by: Julien Grall <julien.grall@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: netdev@vger.kernel.org Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 68946159da1b0b6791c5990242940950b9383cfc) Signed-off-by: Annie Li <annie.li@oracle.com>
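A short illustration of the formatting convention from the commit above. The message text and format_fields() are made up for the example.

```c
#include <stdio.h>

/*
 * Format one hexadecimal and one decimal value. Without the 0x prefix
 * the output "offset: 10, size: 16" would not show that the first
 * number is hexadecimal.
 */
int format_fields(char *buf, size_t len, unsigned int offset,
                  unsigned int size)
{
    return snprintf(buf, len, "offset: 0x%x, size: %u", offset, size);
}
```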
Julien Grall [Tue, 16 Jun 2015 19:10:47 +0000 (20:10 +0100)]
net/xen-netback: Remove unused code in xenvif_rx_action
The variables old_req_cons and ring_slots_used are assigned but never
used since commit 1650d5455bd2dc6b5ee134bd6fc1a3236c266b5b "xen-netback:
always fully coalesce guest Rx packets".
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle> Signed-off-by: Julien Grall <julien.grall@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 44f0764cfec9c607d43cad6a51e8592c7b2b9b84) Signed-off-by: Annie Li <annie.li@oracle.com>
Santosh Shilimkar [Thu, 13 Aug 2015 20:14:34 +0000 (13:14 -0700)]
Merge branch 'uek-4.1/xen' of git://ca-git.us.oracle.com/linux-konrad-public into topic/uek-4.1/xen
* 'uek-4.1/xen' of git://ca-git.us.oracle.com/linux-konrad-public: (31 commits)
xen-netfront: Remove the meaningless code
net/xen-netfront: Correct printf format in xennet_get_responses
xen-netfront: Use setup_timer
xen/xenbus: Don't leak memory when unmapping the ring on HVM backend
Revert "xen/events/fifo: Handle linked events when closing a port"
x86/xen: build "Xen PV" APIC driver for domU as well
xen/events/fifo: Handle linked events when closing a port
xen: release lock occasionally during ballooning
xen/gntdevt: Fix race condition in gntdev_release()
block/xen-blkback: s/nr_pages/nr_segs/
block/xen-blkfront: Remove invalid comment
arm/xen: Drop duplicate define mfn_to_virt
xen/grant-table: Remove unused macro SPP
xen/xenbus: client: Fix call of virt_to_mfn in xenbus_grant_ring
xen: Include xen/page.h rather than asm/xen/page.h
kconfig: add xenconfig defconfig helper
kconfig: clarify kvmconfig is for kvm
xen/pcifront: Remove usage of struct timeval
xen/tmem: use BUILD_BUG_ON() in favor of BUG_ON()
hvc_xen: avoid uninitialized variable warning
...
Konrad Rzeszutek Wilk [Wed, 12 Aug 2015 19:20:26 +0000 (15:20 -0400)]
Merge branch 'backports-for-konrad' of git://ca-git.us.oracle.com/linux-eufimtse-public into uek-4.1/xen
* 'backports-for-konrad' of git://ca-git.us.oracle.com/linux-eufimtse-public: (31 commits)
xen-netfront: Remove the meaningless code
net/xen-netfront: Correct printf format in xennet_get_responses
xen-netfront: Use setup_timer
xen/xenbus: Don't leak memory when unmapping the ring on HVM backend
Revert "xen/events/fifo: Handle linked events when closing a port"
x86/xen: build "Xen PV" APIC driver for domU as well
xen/events/fifo: Handle linked events when closing a port
xen: release lock occasionally during ballooning
xen/gntdevt: Fix race condition in gntdev_release()
block/xen-blkback: s/nr_pages/nr_segs/
block/xen-blkfront: Remove invalid comment
arm/xen: Drop duplicate define mfn_to_virt
xen/grant-table: Remove unused macro SPP
xen/xenbus: client: Fix call of virt_to_mfn in xenbus_grant_ring
xen: Include xen/page.h rather than asm/xen/page.h
kconfig: add xenconfig defconfig helper
kconfig: clarify kvmconfig is for kvm
xen/pcifront: Remove usage of struct timeval
xen/tmem: use BUILD_BUG_ON() in favor of BUG_ON()
hvc_xen: avoid uninitialized variable warning
...
Backports from Linux 4.1
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Li, Liang Z [Fri, 26 Jun 2015 23:17:26 +0000 (07:17 +0800)]
xen-netfront: Remove the meaningless code
The function netif_set_real_num_tx_queues() will return -EINVAL if
the second parameter is < 1, so calling this function with the second
parameter set to 0 is meaningless.
Signed-off-by: Liang Li <liang.z.li@intel.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 905726c1c5a3ca620ba7d73c78eddfb91de5ce28) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Julien Grall [Tue, 16 Jun 2015 19:10:46 +0000 (20:10 +0100)]
net/xen-netfront: Correct printf format in xennet_get_responses
rx->status is an int16_t, print it using %d rather than %u in order to
have a meaningful value when the field is negative.
Also use %u rather than %x for rx->offset.
Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6c10127d91bdc80e02938085d03c232ae3118ad5) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
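The effect of the format change can be sketched in userspace as follows. The function name format_status is hypothetical; its parameter merely stands in for an int16_t field such as rx->status.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: "status" stands in for an int16_t field.
 * The argument is promoted to int when passed to a variadic function,
 * so %d prints the signed value, including negative error codes.
 */
static int format_status(char *buf, size_t len, int16_t status)
{
    return snprintf(buf, len, "status %d", status);
}
```

With an unsigned conversion a negative status would show up as a large positive number, which hides the fact that the field carries an error code.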
Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 493be55ac3d81f9c32832237288eb397a9993d5d) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Julien Grall [Mon, 10 Aug 2015 18:10:38 +0000 (19:10 +0100)]
xen/xenbus: Don't leak memory when unmapping the ring on HVM backend
The commit ccc9d90a9a8b5c4ad7e9708ec41f75ff9e98d61d "xenbus_client:
Extend interface to support multi-page ring" removes the call to
free_xenballooned_pages() in xenbus_unmap_ring_vfree_hvm(), leaking a
page for every shared ring.
Only backends running in HVM domains were affected.
Signed-off-by: Julien Grall <julien.grall@citrix.com> Cc: <stable@vger.kernel.org> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit c22fe519e7e2b94ad173e0ea3b89c1a7d8be8d00) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
This was causing a WARNING whenever a PIRQ was closed since
shutdown_pirq() is called with irqs disabled.
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: <stable@vger.kernel.org>
(cherry picked from commit ad6cd7bafcd2c812ba4200d5938e07304f1e2fcd) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Jason A. Donenfeld [Mon, 10 Aug 2015 13:40:27 +0000 (15:40 +0200)]
x86/xen: build "Xen PV" APIC driver for domU as well
It turns out that a PV domU also requires the "Xen PV" APIC
driver. Otherwise, the flat driver is used and we get stuck in busy
loops that never exit, such as in this stack trace:
(gdb) target remote localhost:9999
Remote debugging using localhost:9999
__xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56
56 while (native_apic_mem_read(APIC_ICR) & APIC_ICR_BUSY)
(gdb) bt
#0 __xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56
#1 __default_send_IPI_shortcut (shortcut=<optimized out>,
dest=<optimized out>, vector=<optimized out>) at
./arch/x86/include/asm/ipi.h:75
#2 apic_send_IPI_self (vector=246) at arch/x86/kernel/apic/probe_64.c:54
#3 0xffffffff81011336 in arch_irq_work_raise () at
arch/x86/kernel/irq_work.c:47
#4 0xffffffff8114990c in irq_work_queue (work=0xffff88000fc0e400) at
kernel/irq_work.c:100
#5 0xffffffff8110c29d in wake_up_klogd () at kernel/printk/printk.c:2633
#6 0xffffffff8110ca60 in vprintk_emit (facility=0, level=<optimized
out>, dict=0x0 <irq_stack_union>, dictlen=<optimized out>,
fmt=<optimized out>, args=<optimized out>)
at kernel/printk/printk.c:1778
#7 0xffffffff816010c8 in printk (fmt=<optimized out>) at
kernel/printk/printk.c:1868
#8 0xffffffffc00013ea in ?? ()
#9 0x0000000000000000 in ?? ()
Mailing-list-thread: https://lkml.org/lkml/2015/8/4/755 Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit fc5fee86bdd3d720e2d1d324e4fae0c35845fa63) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Ross Lagerwall [Fri, 31 Jul 2015 13:30:42 +0000 (14:30 +0100)]
xen/events/fifo: Handle linked events when closing a port
An event channel bound to a CPU that was offlined may still be linked
on that CPU's queue. If this event channel is closed and reused,
subsequent events will be lost because the event channel is never
unlinked and thus cannot be linked onto the correct queue.
When a channel is closed and the event is still linked into a queue,
ensure that it is unlinked before completing.
If the CPU to which the event channel is bound is online, spin until the
event is handled by that CPU. If that CPU is offline, it can't handle
the event, so clear the event queue during the close, dropping the
events.
This fixes the missing interrupts (and subsequent disk stalls etc.)
when offlining a CPU.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit fcdf31a7c162de0c93a2bee51df4688ab0a348f8) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
When dom0 is being ballooned balloon_process() will hold the balloon
mutex until it is finished. This will block e.g. creation of new
domains as the device backends for the new domain need some
autoballooned pages for the ring buffers.
Avoid this by releasing the balloon mutex from time to time during
ballooning. Adjust the comment above balloon_process() regarding
multiple instances of balloon_process().
Instead of open coding it, just use cond_resched().
Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 929423fa83e5b75e94101b280738b9a5a376a0e1) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Marek Marczykowski-Górecki [Fri, 26 Jun 2015 01:28:24 +0000 (03:28 +0200)]
xen/gntdevt: Fix race condition in gntdev_release()
While gntdev_release() is called the MMU notifier is still registered
and can traverse priv->maps list even if no pages are mapped (which is
the case -- gntdev_release() is called after all). But
gntdev_release() will clear that list, so make sure that only one of
those things happens at the same time.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 30b03d05e07467b8c6ec683ea96b5bffcbcd3931) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Julien Grall [Wed, 17 Jun 2015 14:28:07 +0000 (15:28 +0100)]
block/xen-blkfront: Remove invalid comment
Since commit b764915 "xen-blkfront: use a different scatterlist for each
request", biovec has been replaced by scatterlist when copying back the
data during a completion request.
Signed-off-by: Julien Grall <julien.grall@citrix.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit ee4b7179fc9bf19fd22f9d17757a86582c40229e) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Julien Grall [Wed, 17 Jun 2015 14:28:04 +0000 (15:28 +0100)]
xen/grant-table: Remove unused macro SPP
SPP was used by the grant table v2 code which has been removed in
commit 438b33c7145ca8a5131a30c36d8f59bce119a19a "xen/grant-table:
remove support for V2 tables".
Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 548f7c94759ac58d4744ef2663e2a66a106e21c5) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Julien Grall [Wed, 17 Jun 2015 14:28:03 +0000 (15:28 +0100)]
xen/xenbus: client: Fix call of virt_to_mfn in xenbus_grant_ring
virt_to_mfn should take a void * rather than an unsigned long. While it
doesn't really matter now, it would throw a compiler warning later when
virt_to_mfn enforces the type.
At the same time, avoid computing a new virtual address on every
iteration of the loop and directly increment the parameter, as we don't
use it later.
Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit c9fd55eb6625ead6a1207e7da38026ff47c5198b) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
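The loop change described above can be sketched roughly like this. Everything here is a userspace stand-in: sketch_virt_to_frame is a hypothetical replacement for virt_to_mfn that simply divides the address by an assumed 4096-byte page size, and collect_frames is not the real xenbus_grant_ring.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for the sketch */

/* Hypothetical stand-in for virt_to_mfn(): takes a pointer, not a long. */
static unsigned long sketch_virt_to_frame(const void *vaddr)
{
    return (unsigned long)((uintptr_t)vaddr / SKETCH_PAGE_SIZE);
}

/* Records one frame number per page, advancing the parameter itself. */
static void collect_frames(void *vaddr, unsigned int nr_pages,
                           unsigned long *frames)
{
    unsigned int i;

    for (i = 0; i < nr_pages; i++) {
        frames[i] = sketch_virt_to_frame(vaddr);
        /* No "base + i * PAGE_SIZE" recomputation on each iteration. */
        vaddr = (char *)vaddr + SKETCH_PAGE_SIZE;
    }
}
```

Passing a pointer all the way through means a typed virt_to_mfn can reject accidental integer arguments at compile time.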
Luis R. Rodriguez [Wed, 20 May 2015 18:53:39 +0000 (11:53 -0700)]
kconfig: add xenconfig defconfig helper
This lets you build a kernel which can support xen dom0
or xen guests on i386, x86-64 and arm64 by just using:
make xenconfig
You can start from an allnoconfig and then switch to xenconfig.
This also splits out the options which are available currently
to be built with x86 and 'make ARCH=arm64' under a shared config.
Technically xen supports a dom0 kernel and also a guest
kernel configuration, but upon review with the xen team,
since we don't have many dom0 options, it's best to just
combine these two into one.
A few generic notes: we enable both of these:
CONFIG_INET=y
CONFIG_BINFMT_ELF=y
although technically not required given you likely will
end up with a pretty useless system otherwise.
A few architectural differences worth noting:
$ make allnoconfig; make xenconfig > /dev/null ; \
grep XEN .config > 64-bit-config
$ make ARCH=i386 allnoconfig; make ARCH=i386 xenconfig > /dev/null; \
grep XEN .config > 32-bit-config
$ make ARCH=arm64 allnoconfig; make ARCH=arm64 xenconfig > /dev/null; \
grep XEN .config > arm64-config
Since the options are already split up into a generic config and
architecture-specific configs, anything in the x86 configs is
known to work only on x86 right now. For instance, arm64 doesn't
support MEMORY_HOTPLUG yet, so although we try to enable it
generically, we leave the xen-specific kconfig option
XEN_BALLOON_MEMORY_HOTPLUG in x86's config file to set
expectations correctly.
Then on x86 we have differences between i386 and x86-64. The difference
between 64-bit-config and 32-bit-config is that on i386 you don't get
XEN_MCE_LOG, as this is only supported on 64-bit. You also do not get
XEN_BALLOON_MEMORY_HOTPLUG on i386; there does not seem to be any
technical reason not to allow this, but I gave up after a few attempts.
Cc: Josh Triplett <josh@joshtriplett.org> Cc: Borislav Petkov <bp@suse.de> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Michal Marek <mmarek@suse.cz> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: penberg@kernel.org Cc: levinsasha928@gmail.com Cc: mtosatti@redhat.com Cc: fengguang.wu@intel.com Cc: David Vrabel <david.vrabel@citrix.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: xen-devel@lists.xenproject.org Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org> Acked-by: Michal Marek <mmarek@suse.cz> Acked-by: David Rientjes <rientjes@google.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 6c6685055a285de53f18fbf6611687291b57ccd6) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Tina Ruchandani [Tue, 19 May 2015 06:08:09 +0000 (11:38 +0530)]
xen/pcifront: Remove usage of struct timeval
struct timeval uses a 32-bit field for representing seconds, which
will overflow in the year 2038 and beyond. Replace struct timeval with
64-bit ktime_t which is 2038 safe. This is part of a larger effort to
remove instances of 32-bit timekeeping variables (timeval, time_t and
timespec) from the kernel.
Signed-off-by: Tina Ruchandani <ruchandani.tina@gmail.com> Suggested-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit e1d5bbcdc7ca08d8731f5d780f0de342a768d96a) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
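To make the 2038 limit concrete, here is a small sketch of the range check a signed 32-bit seconds counter implies. The helper name is hypothetical; the only fact relied on is that the largest signed 32-bit value, 2147483647 seconds after the epoch, corresponds to 03:14:07 UTC on 19 January 2038.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 when a seconds-since-epoch value can be
 * stored in a signed 32-bit field, 0 when it would overflow. A 64-bit
 * representation has no such limit in any practical time span.
 */
static int fits_in_32bit_seconds(int64_t seconds)
{
    return seconds >= INT32_MIN && seconds <= INT32_MAX;
}
```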
Jan Beulich [Thu, 28 May 2015 12:04:33 +0000 (13:04 +0100)]
xen/tmem: use BUILD_BUG_ON() in favor of BUG_ON()
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 01b720f3295b6c1b2dcfc1affd0fedc6f5d28c1e) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
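The commit above carries no description, so the following is only a generic sketch of the difference between the two macros, not the tmem change itself. SKETCH_BUILD_BUG_ON is a simplified stand-in for the kernel's BUILD_BUG_ON, and both structures are hypothetical. A condition that depends only on compile-time constants (such as a sizeof comparison) can be rejected by the compiler instead of being tested at run time.

```c
#include <stdint.h>

/*
 * Simplified stand-in for BUILD_BUG_ON(): if cond is true the array size
 * becomes negative and compilation fails; if false, no code is emitted.
 */
#define SKETCH_BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Two hypothetical key layouts that must stay the same size. */
struct sketch_key_a { uint64_t v[3]; };
struct sketch_key_b { uint64_t k[3]; };

static unsigned int checked_key_size(void)
{
    /* Verified by the compiler; a run-time BUG_ON() would only fire later. */
    SKETCH_BUILD_BUG_ON(sizeof(struct sketch_key_a) !=
                        sizeof(struct sketch_key_b));
    return (unsigned int)sizeof(struct sketch_key_a);
}
```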
Jan Beulich [Thu, 28 May 2015 08:28:22 +0000 (09:28 +0100)]
hvc_xen: avoid uninitialized variable warning
Older compilers don't recognize that "v" can't be used uninitialized;
other code using hvm_get_parameter() zeros the value too, so follow
suit here.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit c4ace5daf4ff726402b13f1ababf2ad0e0ceec65) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
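The pattern the message refers to can be sketched as below. sketch_get_param is a hypothetical getter that stands in for hvm_get_parameter(); its behaviour (write the value only on success) is an assumption made for the example.

```c
#include <stdint.h>

/*
 * Hypothetical getter: writes *value and returns 0 on success; returns a
 * negative value and leaves *value untouched on failure.
 */
static int sketch_get_param(int idx, uint64_t *value)
{
    if (idx != 0)
        return -1;
    *value = 42;
    return 0;
}

static uint64_t read_param_or_zero(int idx)
{
    uint64_t v = 0;  /* zeroed up front, so no path can see garbage */

    if (sketch_get_param(idx, &v) < 0)
        return 0;
    return v;
}
```

The initialisation costs nothing measurable and removes the need for the compiler to prove that every path assigns the variable.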
Jan Beulich [Thu, 28 May 2015 08:26:37 +0000 (09:26 +0100)]
xenbus: avoid uninitialized variable warning
Older compilers don't recognize that "v" can't be used uninitialized;
other code using hvm_get_parameter() zeros the value too, so follow
suit here.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 76ea3cb428c5dc4fd5a415a7651026caf198fb3d) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Ard Biesheuvel [Wed, 6 May 2015 14:14:22 +0000 (14:14 +0000)]
xen/arm: allow console=hvc0 to be omitted for guests
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
This patch registers hvc0 as the preferred console if no console
has been specified explicitly on the kernel command line.
The purpose is to allow platform agnostic kernels and boot images
(such as distro installers) to boot in a Xen/ARM domU without the
need to modify the command line by hand.
Stefano Stabellini [Wed, 6 May 2015 14:13:31 +0000 (14:13 +0000)]
arm,arm64/xen: move Xen initialization earlier
Currently, Xen is initialized/discovered in an initcall. This doesn't
allow us to support earlyprintk or choosing the preferred console when
running on Xen.
The current function xen_guest_init is now split in 2 parts:
- xen_early_init: Check if there is a Xen node in the device tree
and setup domain type
- xen_guest_init: Retrieve the information from the device node and
initialize Xen (grant table, shared page...)
The former is called in setup_arch, while the latter is an initcall.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 5882bfef6327093bff63569be19795170ff71e5f) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Bob Liu [Wed, 22 Jul 2015 06:40:10 +0000 (14:40 +0800)]
xen-blkback: replace work_pending with work_busy in purge_persistent_gnt()
The BUG_ON() in purge_persistent_gnt() will be triggered when the previous
purge work hasn't finished.
There is a work_pending() check before this BUG_ON, but it doesn't account
for work that is still running.
CC: stable@vger.kernel.org Acked-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 53bc7dc004fecf39e0ba70f2f8d120a1444315d3) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
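The distinction can be modelled in a few lines. This is a userspace sketch with hypothetical names; it only mirrors the idea that a "pending" test covers queued work, while a "busy" test also covers work that is currently executing.

```c
#include <stdbool.h>

/* Hypothetical model of a work item's state. */
struct sketch_work {
    bool pending;  /* queued, not yet started */
    bool running;  /* handler currently executing */
};

static bool sketch_work_pending(const struct sketch_work *w)
{
    return w->pending;
}

static bool sketch_work_busy(const struct sketch_work *w)
{
    return w->pending || w->running;
}

/* A new purge may only start when the previous one is neither queued
 * nor still running. */
static bool can_start_purge(const struct sketch_work *w)
{
    return !sketch_work_busy(w);
}
```

A work item whose handler is still running is no longer pending, which is exactly the window the pending-only check misses.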
Bob Liu [Wed, 22 Jul 2015 06:40:09 +0000 (14:40 +0800)]
xen-blkfront: don't add indirect pages to list when !feature_persistent
We should consider info->feature_persistent when adding an indirect page to
the list info->indirect_pages, else the BUG_ON() in blkif_free() would be
triggered.
When we are using persistent grants the indirect_pages list
should always be empty because blkfront has pre-allocated enough
persistent pages to fill all requests on the ring.
CC: stable@vger.kernel.org Acked-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 7b0767502b5db11cb1f0daef2d01f6d71b1192dc) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
There is a bug when migrating from a !feature-persistent host to a
feature-persistent host, because domU still thinks the new host/backend
doesn't support persistent grants.
The dmesg output looks like:
backed has not unmapped grant: 839
backed has not unmapped grant: 773
backed has not unmapped grant: 773
backed has not unmapped grant: 773
backed has not unmapped grant: 839
The fix is to recheck feature-persistent of the new backend in blkif_recover().
See: https://lkml.org/lkml/2015/5/25/469
As Roger suggested, we can split the part of blkfront_connect that checks for
optional features, like persistent grants, indirect descriptors and
flush/barrier features, into a separate function and call it from both
blkfront_connect and blkif_recover.
Acked-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit d50babbe300eedf33ea5b00a12c5df3a05bd96c7) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Bob Liu [Fri, 19 Jun 2015 04:23:00 +0000 (00:23 -0400)]
drivers: xen-blkfront: only talk_to_blkback() when in XenbusStateInitialising
Patch 69b91ede5cab843dcf345c28bd1f4b5a99dacd9b
"drivers: xen-blkback: delay pending_req allocation to connect_ring"
exposed a problem in Xen blkfront. There is a race between XenStored
and the drivers such that we can see two:
vbd vbd-268440320: blkfront:blkback_changed to state 2.
vbd vbd-268440320: blkfront:blkback_changed to state 2.
vbd vbd-268440320: blkfront:blkback_changed to state 4.
state changes to XenbusStateInitWait ('2'). The end result is that
blkback_changed() receives two notifications and calls setup_blkring()
twice, while the backend driver may only see the first setup_blkring(),
which is wrong, and reads outdated values (or reads them as they are
being updated with new ring-ref values).
As a result, the ring ends up being set up incorrectly.
The other drivers in the tree already have such checks.
Reported-and-Tested-by: Robert Butera <robert.butera@oracle.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit a9b54bb95176cd27f952cd9647849022c4c998d6) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
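The guard named in the title can be sketched as a tiny state machine. The types and function below are hypothetical, not blkfront code; the numeric values follow the '2' and '4' visible in the log above, and the ring_setups counter exists only to make the effect observable.

```c
/* Hypothetical subset of Xenbus-like states (values as seen in the log). */
enum sketch_state {
    SKETCH_INITIALISING = 1,
    SKETCH_INIT_WAIT    = 2,
    SKETCH_INITIALISED  = 3,
    SKETCH_CONNECTED    = 4
};

struct sketch_front {
    enum sketch_state state;  /* the frontend's own state */
    int ring_setups;          /* how often the ring was set up */
};

/* Called for every backend state notification. */
static void sketch_backend_changed(struct sketch_front *f,
                                   enum sketch_state backend)
{
    if (backend != SKETCH_INIT_WAIT)
        return;
    /* A duplicate InitWait notification must not set up the ring again. */
    if (f->state != SKETCH_INITIALISING)
        return;
    f->ring_setups++;
    f->state = SKETCH_INITIALISED;
}
```

Keying the decision on the frontend's own state makes the handler idempotent, so a repeated notification is harmless.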
Bob Liu [Wed, 3 Jun 2015 05:40:03 +0000 (13:40 +0800)]
xen/block: add multi-page ring support
Extend xen/block to support a multi-page ring, so that more requests can be
issued by using more than one page as the request ring between blkfront
and the backend.
As a result, performance can improve significantly.
We got some impressive improvements on our high-end iSCSI storage cluster
backend. When using 64 pages as the ring, the IOPS increased about 15 times
in the throughput testing and more than doubled in the latency testing.
The reason is that the limit on outstanding requests is 32 when using only a
one-page ring, but in our case the iSCSI LUN was spread across about 100
physical drives, so 32 was really not enough to keep them busy.
Changes in v2:
- Rebased to 4.0-rc6.
- Document how the multi-page ring feature works in linux io/blkif.h.
Changes in v3:
- Remove changes to linux io/blkif.h and follow the protocol defined
in io/blkif.h of XEN tree.
- Rebased to 4.1-rc3
Changes in v4:
- Turn to use 'ring-page-order' and 'max-ring-page-order'.
- A few comments from Roger.
Changes in v5:
- Clarify with 4k granularity to comment
- Address more comments from Roger
Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 86839c56dee28c315a4c19b7bfee450ccd84cd25) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
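The one-page limit of 32 outstanding requests mentioned above can be reproduced with a back-of-the-envelope calculation. The sketch below assumes a 4096-byte page, a 64-byte shared ring header, a 112-byte ring slot, and a ring size rounded down to a power of two; these constants are assumptions for illustration and are not stated in the commit message. The ring_page_order parameter mirrors the 'ring-page-order' key named in the v4 changelog.

```c
#define SKETCH_PAGE_SIZE   4096u  /* assumed ring page size */
#define SKETCH_RING_HEADER 64u    /* assumed size of the shared ring header */
#define SKETCH_ENTRY_SIZE  112u   /* assumed size of one ring slot */

/* Largest power of two that is <= n (0 for n == 0). */
static unsigned int round_down_pow2(unsigned int n)
{
    unsigned int p = 1;

    if (n == 0)
        return 0;
    while (p <= n / 2)
        p *= 2;
    return p;
}

/* Number of ring slots for a ring made of (1 << ring_page_order) pages. */
static unsigned int ring_entries(unsigned int ring_page_order)
{
    unsigned int bytes = SKETCH_PAGE_SIZE << ring_page_order;

    return round_down_pow2((bytes - SKETCH_RING_HEADER) / SKETCH_ENTRY_SIZE);
}
```

Under these assumptions an order of 0 (one page) yields 32 slots, matching the limit quoted in the message, and an order of 6 (64 pages) yields 2048 slots.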
Bob Liu [Wed, 3 Jun 2015 05:40:02 +0000 (13:40 +0800)]
driver: xen-blkfront: move talk_to_blkback to a more suitable place
The major responsibility of talk_to_blkback() is to allocate and initialize
the request ring and write the ring info to xenstore.
But this work should be done after the backend has entered
'XenbusStateInitWait', as defined in the protocol file.
See xen/include/public/io/blkif.h in XEN git tree:
Front                                Back
=================================    =====================================
XenbusStateInitialising              XenbusStateInitialising
 o Query virtual device               o Query backend device identification
   properties.                          data.
 o Setup OS device instance.          o Open and validate backend device.
                                      o Publish backend features and
                                        transport parameters.
                                                     |
                                                     |
                                                     V
                                     XenbusStateInitWait

 o Query backend features and
   transport parameters.
 o Allocate and initialize the
   request ring.
There is no problem with this yet, but it is a violation of the design and
furthermore it would not allow frontend/backend to negotiate 'multi-page'
and 'multi-queue' features.
Changes in v2:
- Re-write the commit message to be more clear.
Signed-off-by: Bob Liu <bob.liu@oracle.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 8ab0144a466320cc37c52e7866b5103c5bbd4e90) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Bob Liu [Wed, 3 Jun 2015 05:40:01 +0000 (13:40 +0800)]
drivers: xen-blkback: delay pending_req allocation to connect_ring
This is a pre-patch for multi-page ring feature.
In connect_ring, we can know exactly how many pages are used for the shared
ring, delay pending_req allocation here so that we won't waste too much memory.
Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 69b91ede5cab843dcf345c28bd1f4b5a99dacd9b) Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Currently, guests that are not the initial domain use Intel- or AMD-specific
ops. This also slows down startup on heavily overcommitted guests (say 256
VCPUs on 20 PCPUs), as there are many reads and writes to x86 MSR registers
which trap to xen during the microcode update. In the end the update fails
and reports errors.
A dummy ops could fix that and also make udevd silent (bug18379824)
by augmenting the commit c18a317f6892536851e5852b6aaa4ef42cbc11a2
"xen/microcode: Only load under initial domain.", which fell short of its
intended fix.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Guangyu Sun <guangyu.sun@oracle.com>
(cherry picked from commit 62b84234f23c1020c690d162b7d8250042425e1e) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
arch/x86/kernel/microcode_core.c
Konrad Rzeszutek Wilk [Fri, 6 Jun 2014 18:09:14 +0000 (14:09 -0400)]
xen/microcode: Fix compile warning.
We get a bunch of them. Might as well fix it.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 3b99e95e07cee4bbe3ba2b511fc2ac38ff7769b9) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ian Campbell [Mon, 26 Nov 2012 09:41:02 +0000 (09:41 +0000)]
microcode_xen: Add support for AMD family >= 15h
Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 8b080aa43b95719d8981ba06f357abb6f0ba9d52) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ben Guthro [Thu, 3 Nov 2011 15:06:56 +0000 (11:06 -0400)]
x86/microcode: check proper return code.
After pulling in this change from your tree, I found the following bug,
when checking an enum value, which should be considered before inclusion:
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 33a4651e09e0d6fb6e9c1293810d8a66b734840a) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jeremy Fitzhardinge [Sat, 28 Mar 2009 00:39:15 +0000 (17:39 -0700)]
xen: add CPU microcode update driver
Xen does all the hard work for us, including choosing the right update
method for this cpu type and actually doing it for all cpus. We just
need to supply it with the firmware blob.
Because Xen updates all CPUs (and the kernel's virtual cpu numbers have
no fixed relationship with the underlying physical cpus), we only bother
doing anything for cpu "0".
[ Impact: allow CPU microcode update in Xen dom0 ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Conflicts:
arch/x86/xen/Kconfig
(cherry picked from commit da3d1c83399886c443cbf9e57455bcc2e5caf28c) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
arch/x86/include/asm/microcode.h
arch/x86/kernel/Makefile
arch/x86/kernel/microcode_core.c
arch/x86/xen/Kconfig
Boris Ostrovsky [Tue, 4 Mar 2014 02:40:21 +0000 (21:40 -0500)]
x86/xen: Disable APIC PM for Xen PV guests
Xen PV guests support only a few APIC registers, and writes to
unsupported registers result in WARN_ONs. Most APIC accesses in these
guests have been eliminated; however, lapic_suspend/resume are still
called (on 32-bit kernels).
We can disable APIC power management in xen_smp_prepare_boot_cpu()
(which is called after APIC has been initialized).
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
xen/pvhvm: Support more than 32 VCPUs when migrating (v3).
When Xen migrates an HVM guest, by default its shared_info can
only hold up to 32 CPUs. As such the hypercall
VCPUOP_register_vcpu_info was introduced, which allowed us to
set up per-page areas for VCPUs. This means we can boot a PVHVM
guest with more than 32 VCPUs. During migration the per-cpu
structure is allocated freshly by the hypervisor (vcpu_info_mfn
is set to INVALID_MFN) so that the newly migrated guest
can make a VCPUOP_register_vcpu_info hypercall.
Unfortunately we end up triggering this condition in Xen:
/* Run this command on yourself or on other offline VCPUS. */
if ( (v != current) && !test_bit(_VPF_down, &v->pause_flags) )
which means we are unable to set up the per-cpu VCPU structures
for running vCPUs. The Linux PV code paths make this work by
iterating over every vCPU with:
1) is the target CPU up (VCPUOP_is_up hypercall)?
2) if yes, then VCPUOP_down to pause it.
3) VCPUOP_register_vcpu_info
4) if it was brought down in step 2, then VCPUOP_up to bring it back up
But since VCPUOP_down, VCPUOP_is_up, and VCPUOP_up are
not allowed on HVM guests we can't do this. However with the
git commit XYZ ("hvm: Support more than 32 VCPUS when migrating.")
we can do this. As such, first check if VCPUOP_is_up is actually
possible before trying this dance.
As most of this dance code is done already in 'xen_setup_vcpu',
let's make it callable on both PV and HVM. This means moving one
of the checks out to 'xen_setup_runstate_info'.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
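The four-step sequence listed in the message above can be sketched in userspace as follows. All names are hypothetical stand-ins: sketch_register_vcpu_info models only the Xen check quoted in the message (registration is refused for a vCPU that is neither the caller nor down), and plain assignments to the up flag stand in for the VCPUOP_down and VCPUOP_up hypercalls.

```c
#include <stdbool.h>

/* Hypothetical model of one vCPU. */
struct sketch_vcpu {
    bool up;
    bool registered;
};

/* Stand-in for the register hypercall: refuses a running remote vCPU. */
static int sketch_register_vcpu_info(struct sketch_vcpu *v, bool is_current)
{
    if (!is_current && v->up)
        return -1;
    v->registered = true;
    return 0;
}

static int sketch_setup_vcpu(struct sketch_vcpu *v, bool is_current)
{
    bool was_up = v->up;                 /* 1) is the target up? */
    int rc;

    if (!is_current && was_up)
        v->up = false;                   /* 2) take it down */
    rc = sketch_register_vcpu_info(v, is_current);  /* 3) register */
    if (!is_current && was_up)
        v->up = true;                    /* 4) bring it back up */
    return rc;
}
```

Calling the register stand-in directly on a running remote vCPU fails, while the down/register/up sequence succeeds and leaves the vCPU running again.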
Linus Torvalds [Sat, 20 Jun 2015 20:54:22 +0000 (13:54 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"A smattering of fixes,
mgag200:
don't accept modes that aren't aligned properly as hw can't do it
i915:
two regression fixes
radeon:
one query to allow userspace fixes
one oops fixer for older hw with new options enabled"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon: don't probe MST on hw we don't support it on
drm/radeon: Add RADEON_INFO_VA_UNMAP_WORKING query
drm/mgag200: Reject non-character-cell-aligned mode widths
Revert "drm/i915: Don't skip request retirement if the active list is empty"
drm/i915: Always reset vma->ggtt_view.pages cache on unbinding
Linus Torvalds [Fri, 19 Jun 2015 17:34:14 +0000 (07:34 -1000)]
Merge tag 'sound-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Nothing looks scary, just a few usual HD-audio regression fixes and
fixup, in addition to a minor Kconfig dependency fix for the old MIPS
drivers"
* tag 'sound-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix unused label skip_i915
ALSA: hda - Fix noisy outputs on Dell XPS13 (2015 model)
ALSA: mips: let SND_SGI_O2 select SND_PCM
ALSA: hda - Fix audio crackles on Dell Latitude E7x40
ALSA: hda - adding a DAC/pin preference map for a HP Envy TS machine
Boris Brezillon [Fri, 27 Mar 2015 22:53:15 +0000 (23:53 +0100)]
clk: at91: pll: fix input range validity check
The PLL imposes a certain input range to work correctly, but it appears that
this input range does not apply to the input clock (or parent clock) but
to the input clock after it has passed the PLL divisor.
Fix the implementation accordingly.
Cc: <stable@vger.kernel.org> # v3.14+ Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reported-by: Jonas Andersson <jonas@microbit.se>
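A minimal sketch of the corrected check, under the assumption that the divisor simply divides the parent rate before the range is applied. The structure, function name and the example rates are hypothetical and are not taken from the at91 driver.

```c
/* Hypothetical valid input range of the PLL, in Hz. */
struct sketch_range {
    unsigned long min;
    unsigned long max;
};

/*
 * Returns 1 when the parent rate, after passing the divisor, falls inside
 * the PLL input range; 0 otherwise (including for a zero divisor).
 */
static int sketch_pll_input_valid(unsigned long parent_rate, unsigned int div,
                                  struct sketch_range in)
{
    unsigned long divided;

    if (div == 0)
        return 0;
    divided = parent_rate / div;
    return divided >= in.min && divided <= in.max;
}
```

With a made-up range of 2 MHz to 32 MHz, a 48 MHz parent is rejected with a divisor of 1 but accepted with a divisor of 2, which a check on the undivided parent rate would get wrong.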
Linus Torvalds [Fri, 19 Jun 2015 03:02:27 +0000 (17:02 -1000)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c documentation fix from Wolfram Sang:
"Here is a small documentation fix for I2C.
We already had a user who unsuccessfully tried to get the new slave
framework running with the currently broken example. So, before this
happens again, I'd like to have this how-to-use section fixed for 4.1
already. So that no more hacking time is wasted"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: slave: fix the example how to instantiate from userspace
Dave Airlie [Fri, 19 Jun 2015 01:58:39 +0000 (11:58 +1000)]
Merge tag 'drm-intel-fixes-2015-06-18' of git://anongit.freedesktop.org/drm-intel into drm-fixes
one fix, one revert
* tag 'drm-intel-fixes-2015-06-18' of git://anongit.freedesktop.org/drm-intel:
Revert "drm/i915: Don't skip request retirement if the active list is empty"
drm/i915: Always reset vma->ggtt_view.pages cache on unbinding