users/jedix/linux-maple.git
8 years ago xen-netfront: generalize recycling for grants
Joao Martins [Fri, 12 May 2017 08:46:46 +0000 (09:46 +0100)]
xen-netfront: generalize recycling for grants

Take the existing mechanism for recycling pages and leverage it for
grant references too. The difference is that pages permanently granted
to the backend cannot be revoked (because they are mapped by the other
side), so they need to go to a separate quarantine pool until they can
be consumed again. The strategy is:

1. Get a page by fetching the oldest entry in rx_pool.
2. If it is not granted, the page is freed at the head.
3. If it is reusable, return the page; otherwise add it to the
   quarantine pool.
4. Fetch the oldest entry in the quarantine pool.
5. If all else fails, resort to allocating a new page.

In the worst case, two atomic read ops are added on the packet path
when allocating a new page for Rx requests.
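
The strategy above can be read roughly as the following userspace
sketch. It is one interpretation, not the driver code: the names
page_sketch, fifo, POOL_SIZE and get_rx_page are hypothetical, a plain
integer stands in for page._refcount, and the quarantine pool is
assumed to be large enough to hold every granted page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_SIZE 8 /* hypothetical; the commit ties pool size to ring size */

struct page_sketch {
    int refcount;  /* 1 means only the pool still holds a reference */
    bool granted;  /* permanently granted to the backend */
};

struct fifo {
    struct page_sketch *slot[POOL_SIZE];
    size_t head, count;
};

static bool fifo_push(struct fifo *f, struct page_sketch *p)
{
    if (f->count == POOL_SIZE)
        return false;
    f->slot[(f->head + f->count) % POOL_SIZE] = p;
    f->count++;
    return true;
}

static struct page_sketch *fifo_pop(struct fifo *f)
{
    struct page_sketch *p;

    if (f->count == 0)
        return NULL;
    p = f->slot[f->head];
    f->head = (f->head + 1) % POOL_SIZE;
    f->count--;
    return p;
}

static struct page_sketch *get_rx_page(struct fifo *rx_pool,
                                       struct fifo *quarantine)
{
    struct page_sketch *p = fifo_pop(rx_pool);  /* step 1 */
    bool queued;

    if (p) {
        if (p->refcount == 1)
            return p;                /* step 3: reusable */
        if (!p->granted) {
            p->refcount--;           /* step 2: drop the pool's reference */
        } else {
            /* step 3: granted but busy, so it must be kept around */
            queued = fifo_push(quarantine, p);
            assert(queued);
        }
    }

    p = fifo_pop(quarantine);        /* step 4 */
    if (p) {
        if (p->refcount == 1)
            return p;
        fifo_push(quarantine, p);    /* still in use: back to the tail */
    }

    p = calloc(1, sizeof(*p));       /* step 5: allocate a new page */
    if (p)
        p->refcount = 1;
    return p;
}
```

A granted page that is still referenced by the stack is never freed in
this sketch; it only moves between the two pools until its refcount
drops back to 1.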

This page reuse strategy allows us to remove a copy for each page handed
over by the backend, raising guest RX performance to ~42-47 Gbit/s when
testing backend -> frontend. The measured recycling percentage is about
30% on TCP streams if pool size == ring size; with pool size == 2 *
ring size this rises to 80-100%. This shows that bigger ring sizes
should allow for better recycling, which remains to be explored.

The only downside of this approach is that it is not 100% guaranteed that
the Rx requests provided to the backend will be already mapped; in other
words, the backend may need to do a grant copy on 1% of the packets.
This is not the case when we are in full copy mode, whereby we always
reuse the same grants while copying into new pages for the upper layers.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 26107942

8 years ago xen-netfront: add rx page statistics
Joao Martins [Fri, 12 May 2017 08:46:45 +0000 (09:46 +0100)]
xen-netfront: add rx page statistics

Add three new counters, namely rx_alloc_pages, rx_alloc_failed_pages
and rx_packet_pages, so that we can observe how many packets hit
the recycling path (or otherwise).

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 26107942

8 years ago xen-netfront: introduce rx page recyling
Joao Martins [Fri, 12 May 2017 08:46:44 +0000 (09:46 +0100)]
xen-netfront: introduce rx page recyling

Recycling pages lets us avoid the page allocator when possible, an
approach similar to that followed by the ixgbe and mlx{4,5} drivers.
Introduce a small buffer pool tracking outstanding pages. We increase
the page refcount by 1 to prevent the stack from freeing the page in
upper layers. Recycling pages is then possible while skbs are in
flight: by the time the stack has processed N requests, and we are
allocating new Rx requests, we attempt to reuse the oldest page in the
pool if and only if page._refcount is 1. Otherwise we just decrement
the refcount (on free_page).

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 26107942

8 years ago xen-netfront: move rx_gso_checksum_fixup into netfront_stats
Joao Martins [Fri, 12 May 2017 08:46:43 +0000 (09:46 +0100)]
xen-netfront: move rx_gso_checksum_fixup into netfront_stats

This lets us remove one atomic op (in a very rare case) and
makes it easier to add new statistics.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 26107942

8 years ago xen-netfront: introduce staging gref pools
Joao Martins [Fri, 12 May 2017 08:46:42 +0000 (09:46 +0100)]
xen-netfront: introduce staging gref pools

Grant buffers and allow the backend to permanently map these grants
through the newly added control messages. This only happens if
the backend advertises "feature-staging-grants".

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 26107942

8 years ago xen-netback: use gref mappings for Tx requests
Joao Martins [Fri, 12 May 2017 08:46:41 +0000 (09:46 +0100)]
xen-netback: use gref mappings for Tx requests

Introduces grants already mapped (by a control ring request from the
guest) for the TX path, which follows a similar code path to grant
mapping.

It starts by checking whether there is a grant available for the header
and frag grefs, and if so sets it in tx_grants. If no gref mapping
is found in the tree for the header, it resorts to grant copy. For the
frags it performs a gref lookup on the mapping table, and if no entry
is found it falls back to grant map/unmap using mmap_pages. When the
skb destructor callback gets called we release the slot and the grant
within the callback to avoid waking up the dealloc thread. As long as
there are no unmaps to be done the dealloc thread will remain inactive.

Results show an improvement of 46% (3.6 vs 1.24 Mpps, 64 pkt size)
measured with pktgen and up to over 48% (28 vs 14.5 Gbit/s) measured
with iperf3 on a 2 queue vif, DomU to Dom0. Measured with sendfile()
too, it goes further up to 35.3 Gbit/s given the lack of a second copy.
Tests were run locally on an Intel Xeon CPU E5-2699 v3 with HT disabled,
Dom0 <-> DomU.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 26107942

8 years ago xen-netback: use gref mappings for Rx requests
Joao Martins [Fri, 12 May 2017 08:46:40 +0000 (09:46 +0100)]
xen-netback: use gref mappings for Rx requests

First look up the frontend gref mapping table to see whether
the requested gref is already mapped and has the right permissions.
If so, use that mapping instead.
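
A minimal sketch of that lookup, assuming a flat array in place of the
real per-queue mapping table. The type gref_mapping_sketch and the
function lookup_mapped_gref are hypothetical names; a NULL return
stands for "fall back to grant copy".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct gref_mapping_sketch {
    uint32_t gref;   /* grant reference announced by the frontend */
    bool readonly;   /* mapping permission */
    void *vaddr;     /* where the granted page is mapped */
};

/*
 * Return the mapped address for gref, or NULL when the gref is not
 * mapped or lacks the needed permission. The caller then takes the
 * grant copy path.
 */
static void *lookup_mapped_gref(const struct gref_mapping_sketch *tbl,
                                size_t n, uint32_t gref, bool need_write)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (tbl[i].gref != gref)
            continue;
        if (need_write && tbl[i].readonly)
            return NULL;
        return tbl[i].vaddr;
    }
    return NULL;
}
```

On the guest Rx path the backend writes into the guest page, so a
caller would pass need_write = true.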

Results are 2.04 Mpps measured with pktgen (pkt_size 64, burst 1)
with already mapped grants versus half of that with grant copy.
Fundamentally it works in the same way as grants; it just avoids
asking Xen to copy the page, and hence opens room for other
improvements.

For example, with the mapped grefs there is more contention on
queue->wq, as the kthread_guest_rx goes to sleep more often. We could
alternatively copy the skb in xenvif_start_xmit() instead of going
through the RX kthread. That would only be beneficial if the guest
*only* used the mapped grants (either by copying or recycling
mechanisms); otherwise it would add the significant cost of a grant
copy hypercall per packet.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 26107942

8 years ago xen-netback: shorten tx grant copy
Joao Martins [Fri, 12 May 2017 08:46:39 +0000 (09:46 +0100)]
xen-netback: shorten tx grant copy

Refactor the grant copy setup on the transmit side and fit it into a
helper. Further commits will allow this routine to memcpy from a
premapped page.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 26107942

8 years ago xen-netback: introduce staging grant mappings ops
Joao Martins [Fri, 12 May 2017 08:46:38 +0000 (09:46 +0100)]
xen-netback: introduce staging grant mappings ops

Introduce support for staging grants which means having a
set of preallocated buffers that get reused over time. This is
negotiated through a couple of xenstore entries in the form of:

 * /local/domain/1/device/vif/0/queue-0 = ""
 * /local/domain/1/device/vif/0/queue-0/tx-pool-ref  = "<ring-ref-tx0>"
 * /local/domain/1/device/vif/0/queue-0/tx-pool-size = "<nr-entries-tx0>"
 * /local/domain/1/device/vif/0/queue-0/rx-pool-ref  = "<ring-ref-rx0>"
 * /local/domain/1/device/vif/0/queue-0/rx-pool-size = "<nr-entries-rx0>"

These entries hand over a list of `struct xen_ext_gref_alloc` which
the frontend provides (one XEN_PAGE_SIZE page, which fits 512 entries).
Each entry contains the gref and flags to map into a Domain-0
ballooned page, which gets added to a per-queue hash table of
gref <-> backing page. The frontend can use this to pregrant certain
pages and reuse them for Rx/Tx requests.
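
The "512 entries per page" figure implies 8-byte entries on a 4096-byte
Xen page. The sketch below only illustrates that arithmetic: the real
layout of `struct xen_ext_gref_alloc` is not shown in this log, so the
struct gref_entry_sketch and its status field are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define XEN_PAGE_SIZE_SKETCH 4096u /* Xen page granularity */

/* Hypothetical 8-byte entry: a gref plus flags, padded with a status. */
struct gref_entry_sketch {
    uint32_t ref;     /* grant reference */
    uint16_t flags;   /* e.g. read-only vs read-write mapping */
    uint16_t status;  /* assumed field, keeps the entry at 8 bytes */
};

_Static_assert(sizeof(struct gref_entry_sketch) == 8,
               "entry must stay 8 bytes to fit 512 per page");

/* Number of entries that fit in one page of the handed-over list. */
static size_t entries_per_page(void)
{
    return XEN_PAGE_SIZE_SKETCH / sizeof(struct gref_entry_sketch);
}
```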

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 26107942

8 years ago include/xen: import vendor extension to netif.h
Joao Martins [Fri, 12 May 2017 08:46:37 +0000 (09:46 +0100)]
include/xen: import vendor extension to netif.h

Describe in the protocol headers the extension we're making
with respect to staging grants. The extensions described here
are a middle ground with what is being discussed upstream,
keeping structures similar to (yet named differently from) those
to be proposed upstream. The difference from the upstream proposal
is that there the staging grants are set up through a control ring;
here we do it at xenbus feature negotiation, which is more
maintainable while we keep this code out of tree.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 26107942

8 years ago xen-netback: fix type mismatch warning
Arnd Bergmann [Fri, 12 May 2017 08:46:36 +0000 (09:46 +0100)]
xen-netback: fix type mismatch warning

With the latest rework of the xen-netback driver, we get a warning
on ARM about the types passed into min():

drivers/net/xen-netback/rx.c: In function 'xenvif_rx_next_chunk':
include/linux/kernel.h:739:16: error: comparison of distinct pointer types lacks a cast [-Werror]

The reason is that XEN_PAGE_SIZE is not size_t here. There
is no actual bug, and we can easily avoid the warning using the
min_t() macro instead of min().
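
As an illustration of the idea, the simplified macro below casts both
operands to one explicit type before comparing, which is what makes the
mixed-type comparison well defined. It is a sketch, not the kernel
macro: min_t_sketch evaluates its arguments more than once, and
chunk_len is a hypothetical helper.

```c
#include <stddef.h>

/* Simplified stand-in: compare both operands as "type". */
#define min_t_sketch(type, a, b) \
    ((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/*
 * Bytes that can be copied without crossing the end of the page:
 * "remaining" is a size_t while the page arithmetic is unsigned long.
 */
static size_t chunk_len(size_t remaining, unsigned long page_size,
                        unsigned long offset)
{
    return min_t_sketch(size_t, remaining, page_size - offset);
}
```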

Fixes: eb1723a29b9a ("xen-netback: refactor guest rx")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f112be65fd3964ec2d56ddd0d5e6061b0fd502da)
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

8 years ago xen-netback: fix guest Rx stall detection (after guest Rx refactor)
David Vrabel [Fri, 12 May 2017 08:46:35 +0000 (09:46 +0100)]
xen-netback: fix guest Rx stall detection (after guest Rx refactor)

If a VIF has been ready for rx_stall_timeout (60s by default) and an
Rx ring is drained of all requests an Rx stall will be incorrectly
detected.  When this occurs and the guest Rx queue is empty, the Rx
ring's event index will not be set and the frontend will not raise an
event when new requests are placed on the ring, permanently stalling
the VIF.

This is a regression introduced by eb1723a29b9a7 (xen-netback:
refactor guest rx).

Fix this by reinstating the setting of queue->last_rx_time when
placing a packet onto the guest Rx ring.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit d1ef006dc116bf6487426b0b50c1bf2bf51e6423)
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

8 years ago xen/netback: add fraglist support for to-guest rx
Ross Lagerwall [Fri, 12 May 2017 08:46:34 +0000 (09:46 +0100)]
xen/netback: add fraglist support for to-guest rx

This allows full 64K skbuffs (with 1500 mtu ethernet, composed of 45
fragments) to be handled by netback for to-guest rx.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 2167ca029c2449018314fdf8637c1eb3f123036e)
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

8 years ago xen-netback: batch copies for multiple to-guest rx packets
David Vrabel [Fri, 12 May 2017 08:46:33 +0000 (09:46 +0100)]
xen-netback: batch copies for multiple to-guest rx packets

Instead of flushing the copy ops when a packet is complete, complete
packets when their copy ops are done.  This improves performance by
reducing the number of grant copy hypercalls.
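
The effect on the number of hypercalls can be sketched as follows. The
batch size of 64 is the per-hypercall limit mentioned in the guest rx
refactor; the struct copy_batch_sketch and the two helpers are
hypothetical, and a counter stands in for the grant copy hypercall.

```c
#define COPY_BATCH_SIZE 64 /* max copy ops per hypercall */

struct copy_batch_sketch {
    unsigned int num;        /* copy ops queued so far */
    unsigned int hypercalls; /* how many flushes were issued */
};

/* Issue one (simulated) hypercall for everything queued. */
static void flush_copies(struct copy_batch_sketch *b)
{
    if (b->num == 0)
        return;
    b->hypercalls++;
    b->num = 0;
}

/* Queue one copy op; flush only when the batch is already full. */
static void add_copy_op(struct copy_batch_sketch *b)
{
    if (b->num == COPY_BATCH_SIZE)
        flush_copies(b);
    b->num++;
}
```

Packets whose ops are covered by a flush can then be completed
together, rather than forcing a flush at every packet boundary.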

Latency is still limited by the relatively small size of the copy
batch.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a37f12298c251a48bc74d4012e07bf0d78175f46)
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

8 years ago xen-netback: process guest rx packets in batches
David Vrabel [Fri, 12 May 2017 08:46:32 +0000 (09:46 +0100)]
xen-netback: process guest rx packets in batches

Instead of only placing one skb on the guest rx ring at a time, process
a batch of up to 64.  This improves performance by ~10% in some tests.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 98f6d57ced73b723551568262019f1d6c8771f20)
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

8 years ago xen-netback: immediately wake tx queue when guest rx queue has space
David Vrabel [Fri, 12 May 2017 08:46:31 +0000 (09:46 +0100)]
xen-netback: immediately wake tx queue when guest rx queue has space

When an skb is removed from the guest rx queue, immediately wake the
tx queue, instead of after processing them.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 7c0b1a23e6f983fe392c8ffa71d05189ae52ebb5)
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

8 years ago xen-netback: refactor guest rx
David Vrabel [Fri, 12 May 2017 08:46:30 +0000 (09:46 +0100)]
xen-netback: refactor guest rx

Refactor the to-guest (rx) path to:

1. Push responses for completed skbs earlier, reducing latency.

2. Reduce the per-queue memory overhead by greatly reducing the
   maximum number of grant copy ops in each hypercall (from 4352 to
   64).  Each struct xenvif_queue is now only 44 kB instead of 220 kB.

3. Make the code more maintainable.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit eb1723a29b9a75dd787510a39096a68dba6cc200)

 Conflicts:
drivers/net/xen-netback/common.h
drivers/net/xen-netback/rx.c

Exclude the hash handling.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

8 years ago xen-netback: retire guest rx side prefix GSO feature
Paul Durrant [Fri, 12 May 2017 08:46:29 +0000 (09:46 +0100)]
xen-netback: retire guest rx side prefix GSO feature

As far as I am aware only very old Windows network frontends make use of
this style of passing GSO packets from backend to frontend. These
frontends can easily be replaced by the freely available Xen Project
Windows PV network frontend, which uses the 'default' mechanism for
passing GSO packets, which is also used by all Linux frontends.

NOTE: Removal of this feature will not cause breakage in old Windows
      frontends. They simply will no longer receive GSO packets - the
      packets instead being fragmented in the backend.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit fedbc8c132bcf836358103195d8b6df6c03d9daf)
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

8 years ago xen-netback: separate guest side rx code into separate module
Paul Durrant [Fri, 12 May 2017 08:46:28 +0000 (09:46 +0100)]
xen-netback: separate guest side rx code into separate module

The netback source module has become very large and somewhat confusing.
This patch simply moves all code related to the backend to frontend (i.e.
guest side rx) data-path into a separate rx source module.

This patch contains no functional change, it is code movement and
minimal changes to avoid patch style-check issues.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3254f83694fe519ac18b8334a2f481d80c3a8a3a)

Do not account for the hash support.

 Conflicts:
drivers/net/xen-netback/Makefile
drivers/net/xen-netback/netback.c
drivers/net/xen-netback/rx.c

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

8 years ago x86/xen/time: setup secondary time info for vdso
Joao Martins [Mon, 15 May 2017 16:51:10 +0000 (17:51 +0100)]
x86/xen/time: setup secondary time info for vdso

In order to support the pvclock vdso on Xen we need to set up the
time info page for each vcpu and register those pages with Xen
using the VCPUOP_register_vcpu_time_memory_area hypercall. This
hypercall will also forcefully update the pvti, which will set
some of the flags necessary for vdso. Afterwards we check whether it
supports the PVCLOCK_TSC_STABLE_BIT flag, which is mandatory for
having vdso/vsyscall support. If so, it sets the cpu
pvti's that will later be used when mapping the vdso image.

Note that before setting up vdso we check for PVCLOCK_TSC_STABLE_BIT
in the primary vcpu_info; if it is supported, the flag is added
to the supported pvclock flags. This allows the Xen clocksource
to be faster irrespective of how the pvclock vdso pages are set up,
which speeds up pvclock_clocksource_read() users.
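
The flag check itself is small; a sketch of it, assuming a reduced
time-info struct with only a flags byte, is shown here. The names
pvti_sketch and vdso_clock_usable are hypothetical, and the bit value
mirrors the kernel's definition of PVCLOCK_TSC_STABLE_BIT as bit 0.

```c
#include <stdbool.h>
#include <stdint.h>

#define PVCLOCK_TSC_STABLE_BIT_SKETCH (1u << 0)

/* Reduced stand-in for the per-vcpu time info page. */
struct pvti_sketch {
    uint8_t flags;
};

/* vdso/vsyscall clock reads are only allowed with a stable TSC. */
static bool vdso_clock_usable(const struct pvti_sketch *pvti)
{
    return (pvti->flags & PVCLOCK_TSC_STABLE_BIT_SKETCH) != 0;
}
```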

The xen headers are also updated to include the new hypercall for
registering the secondary vcpu_time_info copy.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 26107942

8 years ago Merge branch 'topic/uek-4.1/4.10-xen' of git://ca-git.us.oracle.com/linux-bostrovs...
Konrad Rzeszutek Wilk [Wed, 8 Feb 2017 19:24:18 +0000 (14:24 -0500)]
Merge branch 'topic/uek-4.1/4.10-xen' of git://ca-git.us.oracle.com/linux-bostrovs-public into topic/uek-4.1/xen

* 'topic/uek-4.1/4.10-xen' of git://ca-git.us.oracle.com/linux-bostrovs-public: (49 commits)
  xen: events: Replace BUG() with BUG_ON()
  xen: remove stale xs_input_avail() from header
  xen: return xenstore command failures via response instead of rc
  xen: xenbus driver must not accept invalid transaction ids
  xen/evtchn: use rb_entry()
  xen/setup: Don't relocate p2m over existing one
  xen/balloon: Only mark a page as managed when it is released
  xen/scsifront: don't request a slot on the ring until request is ready
  xen/x86: Increase xen_e820_map to E820_X_MAX possible entries
  x86: Make E820_X_MAX unconditionally larger than E820MAX
  xen/pci: Bubble up error and fix description.
  xen: xenbus: set error code on failure
  xen: set error code on failures
  xen/events: use xen_vcpu_id mapping for EVTCHNOP_status
  xen/gntdev: Use VM_MIXEDMAP instead of VM_IO to avoid NUMA balancing
  tpm xen: Remove bogus tpm_chip_unregister
  xen-scsifront: Add a missing call to kfree
  xenfs: Use proc_create_mount_point() to create /proc/xen
  xen-netback: fix error handling output
  xen: make use of xenbus_read_unsigned() in xenbus
  ...

OraBug: 25497392
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

8 years ago xen-netback: fix extra_info handling in xenvif_tx_err()
Paul Durrant [Thu, 12 May 2016 13:43:03 +0000 (14:43 +0100)]
xen-netback: fix extra_info handling in xenvif_tx_err()

Patch 562abd39 "xen-netback: support multiple extra info fragments
passed from frontend" contained a mistake which can result in an
incorrect number of responses being generated when handling errors
encountered when processing packets containing extra info fragments.
This patch fixes the problem.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reported-by: Jan Beulich <JBeulich@suse.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 72eec92accabe3ec34f27a9d3cd459bf5a877c33)
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 25445336
Tested-by: Majid Valiollahzadeh <majid.valiollahzadeh@oracle.com>

8 years ago xen: events: Replace BUG() with BUG_ON()
Shyam Saini [Sat, 24 Dec 2016 08:52:46 +0000 (14:22 +0530)]
xen: events: Replace BUG() with BUG_ON()

Replace BUG() with BUG_ON() using coccinelle

Signed-off-by: Shyam Saini <mayhs11saini@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit f9751a60f17eb09e1d1bd036daaddc3ea3a8bed6)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

8 years ago xen: remove stale xs_input_avail() from header
Juergen Gross [Thu, 22 Dec 2016 07:19:48 +0000 (08:19 +0100)]
xen: remove stale xs_input_avail() from header

In drivers/xen/xenbus/xenbus_comms.h there is a stale declaration of
xs_input_avail(). Remove it.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit 61033e089cde41464f820c8c381ce170d89470f0)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

8 years ago xen: return xenstore command failures via response instead of rc
Juergen Gross [Thu, 22 Dec 2016 07:19:47 +0000 (08:19 +0100)]
xen: return xenstore command failures via response instead of rc

When the xenbus driver does some special handling for a Xenstore
command, any error condition related to the command should be returned
via an error response instead of letting the related write operation
fail. Otherwise the user land handler might make wrong decisions,
assuming the connection to Xenstore is broken.

While at it try to return the same error values xenstored would
return for those cases.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit 9a6161fe73bdd3ae4a1e18421b0b20cb7141f680)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

8 years ago xen: xenbus driver must not accept invalid transaction ids
Juergen Gross [Thu, 22 Dec 2016 07:19:46 +0000 (08:19 +0100)]
xen: xenbus driver must not accept invalid transaction ids

When accessing Xenstore in a transaction, the user specifies a
transaction id which is normally obtained from Xenstore when starting
the transaction. Xenstore validates a transaction id against all
known transaction ids of the connection the request came in on. As all
requests of a domain other than the one where Xenstore lives share
one connection, validation of transaction ids of different users of
Xenstore in that domain should be done by the kernel of that domain,
which is the multiplexer between the Xenstore users in that domain and
Xenstore.

In order to prohibit one Xenstore user "hijacking" a transaction from
another user the xenbus driver has to verify a given transaction id
against all known transaction ids of the user before forwarding it to
Xenstore.
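
The per-user check can be sketched like this, assuming each user (for
example each open file handle) keeps a small list of the transaction
ids it started. All names here are hypothetical, and the array stands
in for whatever list the driver really keeps. Id 0 is treated as "no
transaction" and is always accepted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TX_SKETCH 16 /* hypothetical per-user limit */

struct xs_user_sketch {
    uint32_t tx_ids[MAX_TX_SKETCH]; /* transactions this user started */
    size_t nr;
};

/* Record a transaction id handed out to this user. */
static bool user_start_tx(struct xs_user_sketch *u, uint32_t tx_id)
{
    if (u->nr == MAX_TX_SKETCH)
        return false;
    u->tx_ids[u->nr++] = tx_id;
    return true;
}

/* Only forward a request if the id belongs to the requesting user. */
static bool user_owns_tx(const struct xs_user_sketch *u, uint32_t tx_id)
{
    size_t i;

    if (tx_id == 0) /* not part of any transaction */
        return true;
    for (i = 0; i < u->nr; i++)
        if (u->tx_ids[i] == tx_id)
            return true;
    return false;
}
```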

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit 639b08810d6ad74ded2c5f6e233c4fcb9d147168)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

8 years ago xen/evtchn: use rb_entry()
Geliang Tang [Tue, 20 Dec 2016 14:02:20 +0000 (22:02 +0800)]
xen/evtchn: use rb_entry()

To make the code clearer, use rb_entry() instead of container_of() to
deal with rbtree.
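
In the kernel, rb_entry() is defined in terms of container_of(), so the
change is about readability rather than behaviour. The userspace sketch
below shows the equivalence; the _sketch names and the two structs are
stand-ins, not the evtchn driver's types.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of_sketch(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* rb_entry() is container_of() under an rbtree-specific name. */
#define rb_entry_sketch(ptr, type, member) \
    container_of_sketch(ptr, type, member)

struct rb_node_sketch {
    struct rb_node_sketch *left, *right;
};

struct user_evtchn_sketch {
    unsigned int port;
    struct rb_node_sketch node; /* embedded tree linkage */
};

/* Recover the enclosing object from its embedded tree node. */
static struct user_evtchn_sketch *evtchn_from_node(struct rb_node_sketch *n)
{
    return rb_entry_sketch(n, struct user_evtchn_sketch, node);
}
```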

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit 2f60b28831c7e63759b59113898e6fe4dc90dd43)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

8 years ago xen/setup: Don't relocate p2m over existing one
Ross Lagerwall [Mon, 12 Dec 2016 14:35:13 +0000 (14:35 +0000)]
xen/setup: Don't relocate p2m over existing one

When relocating the p2m, take special care not to relocate it so
that it overlaps with the current location of the p2m/initrd. This is
needed since the full extent of the current location is not marked as a
reserved region in the e820.
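
The extra check amounts to a range-overlap test between the candidate
destination and the current location. A minimal sketch, with
hypothetical helper names and half-open [start, start + size) ranges:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if [a_start, a_start + a_size) intersects [b_start, b_start + b_size). */
static bool ranges_overlap(uint64_t a_start, uint64_t a_size,
                           uint64_t b_start, uint64_t b_size)
{
    return a_start < b_start + b_size && b_start < a_start + a_size;
}

/* A destination is acceptable only if it avoids the current location. */
static bool relocation_ok(uint64_t new_start, uint64_t size,
                          uint64_t cur_start, uint64_t cur_size)
{
    return !ranges_overlap(new_start, size, cur_start, cur_size);
}
```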

This was seen to happen to a dom0 with a large initial p2m and a small
reserved region in the middle of the initial p2m.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit 7ecec8503af37de6be4f96b53828d640a968705f)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

8 years ago xen/balloon: Only mark a page as managed when it is released
Ross Lagerwall [Fri, 9 Dec 2016 17:10:22 +0000 (17:10 +0000)]
xen/balloon: Only mark a page as managed when it is released

Only mark a page as managed when it is released back to the allocator.
This ensures that the managed page count does not get falsely increased
when a VM is running. Correspondingly change it so that pages are
marked as unmanaged after getting them from the allocator.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit 709613ad2b3c9eaeb2a3e24284b7c8feffc17326)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

8 years ago xen/scsifront: don't request a slot on the ring until request is ready
Juergen Gross [Fri, 2 Dec 2016 06:15:45 +0000 (07:15 +0100)]
xen/scsifront: don't request a slot on the ring until request is ready

Instead of requesting a new slot on the ring to the backend early, do
so only after everything has been set up for the request to be sent.
This makes error handling easier as we don't need to undo the request
id allocation and ring slot allocation.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit 3da96be58f2c8aaa86cfe78b16f837e610dfcfe2)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

8 years ago xen/x86: Increase xen_e820_map to E820_X_MAX possible entries
Alex Thorlton [Mon, 5 Dec 2016 17:49:14 +0000 (11:49 -0600)]
xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

On systems with sufficiently large e820 tables, and several IOAPICs, it
is possible for the XENMEM_machine_memory_map callback (and its
counterpart, XENMEM_memory_map) to attempt to return an e820 table with
more than 128 entries.  This callback adds entries to the BIOS-provided
e820 table to account for IOAPIC registers, which, on sufficiently large
systems, can result in an e820 table that is too large to copy back into
xen_e820_map.

This change simply increases the size of xen_e820_map to E820_X_MAX to
ensure that there is enough room to store the entire e820 map returned
from this callback.

Signed-off-by: Alex Thorlton <athorlton@sgi.com>
Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit 738662c35c491fc360bb6adcb8a0db88d87b5d88)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

8 years ago x86: Make E820_X_MAX unconditionally larger than E820MAX
Alex Thorlton [Mon, 5 Dec 2016 17:49:13 +0000 (11:49 -0600)]
x86: Make E820_X_MAX unconditionally larger than E820MAX

It's really not necessary to limit E820_X_MAX to 128 in the non-EFI
case.  This commit drops E820_X_MAX's dependency on CONFIG_EFI, so that
E820_X_MAX is always at least slightly larger than E820MAX.

The real motivation behind this is actually to prevent some issues in
the Xen kernel, where the XENMEM_machine_memory_map hypercall can
produce an e820 map larger than 128 entries, even on systems where the
original e820 table was quite a bit smaller than that, depending on how
many IOAPICs are installed on the system.

Signed-off-by: Alex Thorlton <athorlton@sgi.com>
Suggested-by: Ingo Molnar <mingo@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit 9d2f86c6cad5a8a3f0b38a80136ba68364ca7278)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

8 years ago xen/pci: Bubble up error and fix description.
Konrad Rzeszutek Wilk [Tue, 6 Dec 2016 14:28:21 +0000 (09:28 -0500)]
xen/pci: Bubble up error and fix description.

The function is never called under PV guests, and only shows up
when MSI (or MSI-X) cannot be allocated. Convert the message
to include the error value.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit 577f79e411b7a81a8ae7ae4daf5d4056ebbfbc58)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

8 years ago xen: xenbus: set error code on failure
Pan Bian [Mon, 5 Dec 2016 08:22:22 +0000 (16:22 +0800)]
xen: xenbus: set error code on failure

Variable err is initialized with 0. As a result, the return value may
be 0 even if get_zeroed_page() fails to allocate memory. This patch fixes
the bug, initializing err with "-ENOMEM".
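
The pattern being fixed looks roughly like the sketch below, where
calloc() stands in for get_zeroed_page() and setup_sketch is a
hypothetical function with a shared exit label. With err starting at
0, the allocation failure path would have returned success.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * fail_alloc simulates the allocator returning NULL.
 * On success *out receives the buffer and 0 is returned.
 */
static int setup_sketch(int fail_alloc, void **out)
{
    int err = -ENOMEM; /* was 0 before the fix */
    void *page = fail_alloc ? NULL : calloc(1, 4096);

    if (!page)
        goto out;      /* now reports -ENOMEM instead of 0 */

    *out = page;
    err = 0;
out:
    return err;
}
```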

Signed-off-by: Pan Bian <bianpan2016@163.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit 2466d4b9d0c21e6c28cd63516dea65806bf5a307)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

8 years ago xen: set error code on failures
Pan Bian [Mon, 5 Dec 2016 08:23:05 +0000 (16:23 +0800)]
xen: set error code on failures

Variable rc is reset in the loop, and its value will be non-negative
during the second and subsequent iterations of the loop. If a memory
allocation fails then, the function may return a non-negative integer,
which indicates no error. This patch fixes the bug, assigning "-ENOMEM"
to rc when kzalloc() or alloc_page() returns NULL, and removing the
initialization of rc outside of the loop.
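
A sketch of the fixed loop shape, with malloc() standing in for
kzalloc()/alloc_page() and map_all_sketch as a hypothetical function.
The point is that rc is assigned explicitly at the failure site, since
a stale non-negative value from an earlier iteration would otherwise
be returned.

```c
#include <errno.h>
#include <stdlib.h>

/* fail_at is the iteration whose allocation fails (-1 for none). */
static int map_all_sketch(int n, int fail_at)
{
    int i, rc; /* no initialization outside the loop */

    for (i = 0; i < n; i++) {
        void *buf = (i == fail_at) ? NULL : malloc(16);

        if (!buf) {
            rc = -ENOMEM; /* set the error where it happens */
            return rc;
        }
        rc = i + 1;       /* stands in for a call returning >= 0 */
        (void)rc;
        free(buf);
    }
    return 0;
}
```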

Signed-off-by: Pan Bian <bianpan2016@163.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit 0fdb47440203ce06e09923c4d578cf3c20aef69a)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen/events: use xen_vcpu_id mapping for EVTCHNOP_status
Vitaly Kuznetsov [Wed, 23 Nov 2016 12:38:45 +0000 (13:38 +0100)]
xen/events: use xen_vcpu_id mapping for EVTCHNOP_status

EVTCHNOP_status hypercall returns Xen's idea of vcpu id so we need to
compare it against xen_vcpu_id mapping, not the Linux cpu id.
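
A minimal sketch of the comparison, assuming a hypothetical fixed table
in place of the kernel's per-CPU xen_vcpu_id variable. The table values
are invented to show a case where the two numberings diverge.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4

/* Hypothetical mapping: Linux CPU number -> Xen vCPU id. */
static const uint32_t xen_vcpu_id_sketch[NR_CPUS_SKETCH] = { 0, 2, 4, 6 };

/*
 * status_vcpu is the vCPU id as reported by the hypervisor. It must be
 * compared with the mapped id, not with the Linux CPU number.
 */
bool evtchn_bound_to_cpu(uint32_t status_vcpu, unsigned int cpu)
{
    if (cpu >= NR_CPUS_SKETCH)
        return false;
    return status_vcpu == xen_vcpu_id_sketch[cpu];
}
```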

Suggested-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit b36585a0a3c169612f3105139464a2da1d3ecc03)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen/gntdev: Use VM_MIXEDMAP instead of VM_IO to avoid NUMA balancing
Boris Ostrovsky [Mon, 21 Nov 2016 14:56:06 +0000 (09:56 -0500)]
xen/gntdev: Use VM_MIXEDMAP instead of VM_IO to avoid NUMA balancing

Commit 9c17d96500f7 ("xen/gntdev: Grant maps should not be subject to
NUMA balancing") set VM_IO flag to prevent grant maps from being
subjected to NUMA balancing.

It was discovered recently that this flag causes get_user_pages() to
always fail with -EFAULT.

check_vma_flags
__get_user_pages
__get_user_pages_locked
__get_user_pages_unlocked
get_user_pages_fast
iov_iter_get_pages
dio_refill_pages
do_direct_IO
do_blockdev_direct_IO
do_blockdev_direct_IO
ext4_direct_IO_read
generic_file_read_iter
aio_run_iocb

(which can happen if guest's vdisk has direct-io-safe option).

To avoid this let's use VM_MIXEDMAP flag instead --- it prevents
NUMA balancing just as VM_IO does and has no effect on
check_vma_flags().
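
The difference can be modelled with a simplified sketch of the gate in
check_vma_flags(). The SK_* flag values are placeholders chosen for this
example, not the kernel's definitions, and the real function performs
further checks that are omitted here.

```c
#include <errno.h>

/* Placeholder flag bits for illustration only. */
#define SK_VM_PFNMAP   0x1UL
#define SK_VM_IO       0x2UL
#define SK_VM_MIXEDMAP 0x4UL

/* I/O and raw-PFN mappings are refused; other flags pass this gate. */
int check_vma_flags_sketch(unsigned long vm_flags)
{
    if (vm_flags & (SK_VM_IO | SK_VM_PFNMAP))
        return -EFAULT;
    return 0;
}
```

In this model a mapping marked SK_VM_IO fails with -EFAULT, while one
marked SK_VM_MIXEDMAP passes.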

Cc: stable@vger.kernel.org
Reported-by: Olaf Hering <olaf@aepfle.de>
Suggested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Hugh Dickins <hughd@google.com>
Tested-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit 30faaafdfa0c754c91bac60f216c9f34a2bfdf7e)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agotpm xen: Remove bogus tpm_chip_unregister
Jason Gunthorpe [Wed, 26 Oct 2016 22:28:45 +0000 (16:28 -0600)]
tpm xen: Remove bogus tpm_chip_unregister

tpm_chip_unregister can only be called after tpm_chip_register.
devm manages the allocation so no unwind is needed here.

Cc: stable@vger.kernel.org
Fixes: afb5abc262e96 ("tpm: two-phase chip management functions")
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
OraBug: 25497392

(cherry picked from commit 1f0f30e404b3d8f4597a2d9b77fba55452f8fd0e)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen-scsifront: Add a missing call to kfree
Quentin Lambert [Sat, 19 Nov 2016 18:22:56 +0000 (19:22 +0100)]
xen-scsifront: Add a missing call to kfree

Most error branches following the call to kmalloc contain
a call to kfree. This patch adds these calls where they are
missing.

This issue was found with Hector.

Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit 1eb08545c0a3a2249ad53e393383cc06163d0d16)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxenfs: Use proc_create_mount_point() to create /proc/xen
Seth Forshee [Mon, 14 Nov 2016 11:12:56 +0000 (11:12 +0000)]
xenfs: Use proc_create_mount_point() to create /proc/xen

Mounting proc in user namespace containers fails if the xenbus
filesystem is mounted on /proc/xen because this directory fails
the "permanently empty" test. proc_create_mount_point() exists
specifically to create such mountpoints in proc but is currently
proc-internal. Export this interface to modules, then use it in
xenbus when creating /proc/xen.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit f97df70b1c879f764f88b25b0e67b03a5213968a)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen-netback: fix error handling output
Arnd Bergmann [Thu, 10 Nov 2016 08:55:42 +0000 (09:55 +0100)]
xen-netback: fix error handling output

The connect function prints an uninitialized error code after an
earlier initialization was removed:

drivers/net/xen-netback/xenbus.c: In function 'connect':
drivers/net/xen-netback/xenbus.c:938:3: error: 'err' may be used uninitialized in this function [-Werror=maybe-uninitialized]

This prints it as -EINVAL instead, which seems to be the most
appropriate error code. Before the patch that caused the warning,
this would print a positive number returned by vsscanf() instead,
which is also wrong. We probably don't need a backport though,
as fixing the warning here should be sufficient.

Fixes: f95842e7a9f2 ("xen: make use of xenbus_read_unsigned() in xen-netback")
Fixes: 8d3d53b3e433 ("xen-netback: Add support for multiple queues")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 25497392

(cherry picked from commit 0f06ac3b6616b9793b3fb5c398d94044a0423492)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen: make use of xenbus_read_unsigned() in xenbus
Juergen Gross [Mon, 31 Oct 2016 13:58:42 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xenbus

Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of the reads from int to unsigned,
but these cases have been wrong before: negative values are not allowed
for the modified cases.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit 999c9af9e3a2535d9ad41182e93eb128e587eb84)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen: make use of xenbus_read_unsigned() in xen-pciback
Juergen Gross [Mon, 31 Oct 2016 13:58:41 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-pciback

Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of the read from int to unsigned,
but this case has been wrong before: negative values are not allowed
for the modified case.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit 4e81f1caa7ff77f7fd31bd31f84b1a0dcfc8184e)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen: make use of xenbus_read_unsigned() in xen-fbfront
Juergen Gross [Mon, 31 Oct 2016 13:58:41 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-fbfront

Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of the reads from int to unsigned,
but these cases have been wrong before: negative values are not allowed
for the modified cases.

Cc: tomi.valkeinen@ti.com
Cc: linux-fbdev@vger.kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit eaf46e181ec3cc3b6eafdbe8e30fb5a03ebbde68)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen: make use of xenbus_read_unsigned() in xen-scsifront
Juergen Gross [Mon, 31 Oct 2016 13:58:41 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-scsifront

Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit 1080b38db49f7e3075aa9cd5a87f1587282cc0b0)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen: make use of xenbus_read_unsigned() in xen-pcifront
Juergen Gross [Mon, 31 Oct 2016 13:58:41 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-pcifront

Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of the read from int to unsigned,
but this case has been wrong before: negative values are not allowed
for the modified case.

Cc: bhelgaas@google.com
Cc: linux-pci@vger.kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit 58faf07b76817782ea20c392639569ea613cd439)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen: make use of xenbus_read_unsigned() in xen-netfront
Juergen Gross [Mon, 31 Oct 2016 13:58:41 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-netfront

Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of some reads from int to unsigned,
but these cases have been wrong before: negative values are not allowed
for the modified cases.

Cc: netdev@vger.kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit 2890ea5c13321d26732c4520649681965480ee1c)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen: make use of xenbus_read_unsigned() in xen-netback
Juergen Gross [Mon, 31 Oct 2016 13:58:41 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-netback

Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of some reads from int to unsigned,
but these cases have been wrong before: negative values are not allowed
for the modified cases.

Cc: wei.liu2@citrix.com
Cc: paul.durrant@citrix.com
Cc: netdev@vger.kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit f95842e7a9f235ef3b7d6d4b70fee2244149f1e7)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
drivers/net/xen-netback/xenbus.c

8 years agoxen: make use of xenbus_read_unsigned() in xen-kbdfront
Juergen Gross [Mon, 31 Oct 2016 13:58:40 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-kbdfront

Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of the reads from int to unsigned,
but these cases have been wrong before: negative values are not allowed
for the modified cases.

Cc: dmitry.torokhov@gmail.com
Cc: linux-input@vger.kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit 81362c6f159dcb59fadd60927aa00497d715ca80)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen: make use of xenbus_read_unsigned() in xen-tpmfront
Juergen Gross [Mon, 31 Oct 2016 13:58:40 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-tpmfront

Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of one read from int to unsigned,
but this case has been wrong before: negative values are not allowed
for the modified case.

Cc: peterhuewe@gmx.de
Cc: tpmdd@selhorst.net
Cc: jarkko.sakkinen@linux.intel.com
Cc: jgunthorpe@obsidianresearch.com
Cc: tpmdd-devel@lists.sourceforge.net
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit 0240933469ea4cc1aa1c32867349c4aa718fe264)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen: make use of xenbus_read_unsigned() in xen-blkfront
Juergen Gross [Mon, 31 Oct 2016 13:58:40 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-blkfront

Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of some reads from int to unsigned,
but these cases have been wrong before: negative values are not allowed
for the modified cases.

Cc: konrad.wilk@oracle.com
Cc: roger.pau@citrix.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit f27dc1ac56865c2cc43d0ec3110a2b4a95b04e1d)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
drivers/block/xen-blkfront.c

8 years agoxen: make use of xenbus_read_unsigned() in xen-blkback
Juergen Gross [Mon, 31 Oct 2016 13:58:40 +0000 (14:58 +0100)]
xen: make use of xenbus_read_unsigned() in xen-blkback

Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of one read from int to unsigned,
but this case has been wrong before: negative values are not allowed
for the modified case.

Cc: konrad.wilk@oracle.com
Cc: roger.pau@citrix.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit 8235777b2068e3280b6fa1413f1940ade31f0adf)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen: introduce xenbus_read_unsigned()
Juergen Gross [Mon, 31 Oct 2016 13:58:40 +0000 (14:58 +0100)]
xen: introduce xenbus_read_unsigned()

There are multiple instances of code reading an optional unsigned
parameter from Xenstore via xenbus_scanf(). Instead of repeating the
same code over and over add a service function doing the job.
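
The shape of such a helper can be sketched in user space. This is an
illustration, not the kernel implementation: store_lookup() is a
hypothetical in-memory stand-in for a Xenstore read, and the directory
and node names are invented for the example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for Xenstore: value string, or NULL if absent. */
static const char *store_lookup(const char *dir, const char *node)
{
    if (strcmp(dir, "backend/vif/1/0") != 0)
        return NULL;
    if (strcmp(node, "feature-sg") == 0)
        return "1";
    if (strcmp(node, "feature-bad") == 0)
        return "abc"; /* present but not a number */
    return NULL;
}

/* Read an optional unsigned value; fall back to default_val on any failure. */
unsigned int read_unsigned_sketch(const char *dir, const char *node,
                                  unsigned int default_val)
{
    unsigned int val;
    const char *s = store_lookup(dir, node);

    if (!s || sscanf(s, "%u", &val) != 1)
        val = default_val;
    return val;
}
```

A caller then reads an optional key in one line and no longer needs its
own error branch for the "key missing" case.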

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit 9c53a1792a5e6c708897d0cb17f2a4509e499a52)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen-netfront: cast grant table reference first to type int
Dongli Zhang [Wed, 2 Nov 2016 01:04:33 +0000 (09:04 +0800)]
xen-netfront: cast grant table reference first to type int

IS_ERR_VALUE() in commit 87557efc27f6a50140fb20df06a917f368ce3c66
("xen-netfront: do not cast grant table reference to signed short") would
not return true for error code unless we cast ref first to type int.
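
The effect of the extra cast can be reproduced with a sketch that models
a 64-bit unsigned long as uint64_t. is_err_value64() and SK_MAX_ERRNO are
stand-ins written for this example, not the kernel macros, and the
conversion to int32_t assumes the usual two's-complement wraparound.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_MAX_ERRNO 4095

/* Model of an IS_ERR_VALUE()-style check on a 64-bit unsigned long. */
static bool is_err_value64(uint64_t x)
{
    return x >= (uint64_t)-SK_MAX_ERRNO;
}

/* Direct widening zero-extends: 0xffffffe4 becomes 0x00000000ffffffe4. */
bool ref_is_error_direct(uint32_t ref)
{
    return is_err_value64((uint64_t)ref);
}

/* Casting to int first sign-extends, so the value lands in the error range. */
bool ref_is_error_via_int(uint32_t ref)
{
    return is_err_value64((uint64_t)(int64_t)(int32_t)ref);
}
```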

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
OraBug: 25497392

(cherry picked from commit 269ebce4531b8edc4224259a02143181a1c1d77c)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen-netfront: do not cast grant table reference to signed short
Dongli Zhang [Mon, 31 Oct 2016 05:38:29 +0000 (13:38 +0800)]
xen-netfront: do not cast grant table reference to signed short

While grant reference is of type uint32_t, xen-netfront erroneously casts
it to signed short in BUG_ON().

This would lead to the xen domU panic during boot-up or migration when it
is attached with lots of paravirtual devices.
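
A small sketch of why the old check misfires once many grant references
are in use. old_check_fires() is a hypothetical name, and the narrowing
conversion assumes the usual two's-complement wraparound.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the old BUG_ON() condition: the 32-bit reference is
 * truncated to 16 bits and then interpreted as signed.
 */
bool old_check_fires(uint32_t ref)
{
    return (int16_t)ref < 0;
}
```

In this model a valid reference such as 40000 trips the check, while 100
does not.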

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
OraBug: 25497392

(cherry picked from commit 87557efc27f6a50140fb20df06a917f368ce3c66)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxenbus: check return value of xenbus_scanf()
Jan Beulich [Mon, 24 Oct 2016 15:05:18 +0000 (09:05 -0600)]
xenbus: check return value of xenbus_scanf()

Don't ignore errors here: Set backend state to unknown when
unsuccessful.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit c251f15c7dbf2cb72e7b2b282020b41f4e4d3665)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxenbus: prefer list_for_each()
Jan Beulich [Mon, 24 Oct 2016 15:03:49 +0000 (09:03 -0600)]
xenbus: prefer list_for_each()

This is more efficient than list_for_each_safe() when list modification
is accompanied by breaking out of the loop.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit e1e5b3ff41983f506c3cbcf123fe7d682f61a8f1)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxenbus: advertise control feature flags
Juergen Gross [Tue, 11 Oct 2016 11:34:16 +0000 (13:34 +0200)]
xenbus: advertise control feature flags

The Xen docs specify several flags which a guest can set to advertise
which values of the xenstore control/shutdown key it will recognize.
This patch adds code to write all the relevant feature-flag keys.

Based-on-patch-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit 44b3c7af02ca2701b6b90ee30c9d1d9c3ae07653)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen/pciback: support driver_override
Juergen Gross [Thu, 22 Sep 2016 08:45:41 +0000 (10:45 +0200)]
xen/pciback: support driver_override

Support the driver_override scheme introduced with commit 782a985d7af2
("PCI: Introduce new device binding path using pci_dev.driver_override")

As pcistub_probe() is called for all devices (it has to check for a
match based on the slot address rather than device type) it has to
check for driver_override set to "pciback" itself.
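
The decision described above can be sketched as a pure function. The
names pcistub_wants_device() and SK_DRIVER_NAME are hypothetical, and
slot_matches stands for the result of the existing slot-address lookup.

```c
#include <stdbool.h>
#include <string.h>

#define SK_DRIVER_NAME "pciback"

/*
 * Claim the device if its slot is on the driver's list, or if user
 * space set driver_override to this driver's name.
 */
bool pcistub_wants_device(bool slot_matches, const char *driver_override)
{
    if (slot_matches)
        return true;
    return driver_override != NULL &&
           strcmp(driver_override, SK_DRIVER_NAME) == 0;
}
```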

Up to now for assigning a pci device to pciback you need something like:

echo 0000:07:10.0 > /sys/bus/pci/devices/0000\:07\:10.0/driver/unbind
echo 0000:07:10.0 > /sys/bus/pci/drivers/pciback/new_slot
echo 0000:07:10.0 > /sys/bus/pci/drivers_probe

while with the patch you can use the same mechanism as for similar
drivers like pci-stub and vfio-pci:

echo pciback > /sys/bus/pci/devices/0000\:07\:10.0/driver_override
echo 0000:07:10.0 > /sys/bus/pci/devices/0000\:07\:10.0/driver/unbind
echo 0000:07:10.0 > /sys/bus/pci/drivers_probe

So e.g. libvirt doesn't need special handling for pciback.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit b057878b2aadc7e06280e7e702a36e7adb1bcdf7)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen/pciback: avoid multiple entries in slot list
Juergen Gross [Thu, 22 Sep 2016 08:45:40 +0000 (10:45 +0200)]
xen/pciback: avoid multiple entries in slot list

The Xen pciback driver has a list of all pci devices it is ready to
seize. There is no check whether an entry to be added already exists.
While this might be no problem in the common case, it might confuse
those who consume the list via sysfs.

Modify the handling of this list by not adding an entry which already
exists. As this will be needed later split out the list handling into
a separate function.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit 9f8bee9c981f5fe7382a0615d117cc128dd22458)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen/pciback: simplify pcistub device handling
Juergen Gross [Thu, 22 Sep 2016 08:45:39 +0000 (10:45 +0200)]
xen/pciback: simplify pcistub device handling

The Xen pciback driver maintains a list of all its seized devices.
There are two functions searching the list for a specific device with
basically the same semantics just returning different structures in
case of a match.

Split out the search function.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit 1af916b701db1a9905e559e742f45818eb233d12)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agox86/xen: add missing \n at end of printk warning message
Colin Ian King [Mon, 12 Sep 2016 10:20:46 +0000 (11:20 +0100)]
x86/xen: add missing \n at end of printk warning message

The message is missing a \n, add it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit 8129554c643b0e1a8336d842cce2f3d595aeeed7)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen-netfront: avoid packet loss when ethernet header crosses page boundary
Vitaly Kuznetsov [Mon, 19 Sep 2016 10:53:40 +0000 (12:53 +0200)]
xen-netfront: avoid packet loss when ethernet header crosses page boundary

Small packet loss is reported on complex multi host network configurations
including tunnels, NAT, ... My investigation led me to the following check
in netback which drops packets:

        if (unlikely(txreq.size < ETH_HLEN)) {
                netdev_err(queue->vif->dev,
                           "Bad packet size: %d\n", txreq.size);
                xenvif_tx_err(queue, &txreq, extra_count, idx);
                break;
        }

But this check itself is legitimate. SKBs consist of a linear part (which
has to have the ethernet header) and (optionally) a number of frags.
Netfront transmits the head of the linear part up to the page boundary
as the first request and all the rest becomes frags so when we're
reconstructing the SKB in netback we can't distinguish between original
frags and the 'tail' of the linear part. The first SKB needs to be at
least ETH_HLEN size. So in case we have an SKB with its linear part
starting too close to the page boundary the packet is lost.

I see two ways to fix the issue:
- Change the 'wire' protocol between netfront and netback to start keeping
  the original SKB structure. We'll have to add a flag indicating the fact
  that the particular request is a part of the original linear part and not
  a frag. We'll need to know the length of the linear part to pre-allocate
  memory.
- Avoid transmitting SKBs with linear parts starting too close to the page
  boundary. That seems preferable short-term and shouldn't bring
  significant performance degradation as such packets are rare. That's what
  this patch is trying to achieve with skb_copy().
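
The condition behind the second option can be sketched as follows. This
is a user-space illustration: linear_head_too_short() is a hypothetical
name, SK_PAGE_SIZE assumes 4 KiB pages, and SK_ETH_HLEN is the 14-byte
Ethernet header length.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_PAGE_SIZE 4096u
#define SK_ETH_HLEN  14u

/*
 * True if fewer than SK_ETH_HLEN bytes remain between the start of the
 * linear data and the end of its page, i.e. the first request would be
 * too short and the SKB should be copied before transmission.
 */
bool linear_head_too_short(uintptr_t data_addr)
{
    unsigned int offset = (unsigned int)(data_addr & (SK_PAGE_SIZE - 1));

    return SK_PAGE_SIZE - offset < SK_ETH_HLEN;
}
```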

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
OraBug: 25497392

(cherry picked from commit fd07160bb7180cdd0afeb089d8cdfd66002f17e6)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen: Sync xen header
Juergen Gross [Mon, 29 Aug 2016 06:48:42 +0000 (08:48 +0200)]
xen: Sync xen header

Import the current version of include/xen/interface/sched.h from Xen.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: David Vrabel <david.vrabel@citrix.com>
Cc: Douglas_Warzecha@dell.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akataria@vmware.com
Cc: boris.ostrovsky@oracle.com
Cc: chrisw@sous-sol.org
Cc: hpa@zytor.com
Cc: jdelvare@suse.com
Cc: jeremy@goop.org
Cc: linux@roeck-us.net
Cc: pali.rohar@gmail.com
Cc: rusty@rustcorp.com.au
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1472453327-19050-2-git-send-email-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
OraBug: 25497392

(cherry picked from commit 3260ab5616b4cd049c79c342617525456a2391b2)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen/grant-table: Use kmalloc_array() in arch_gnttab_valloc()
Markus Elfring [Thu, 25 Aug 2016 11:23:06 +0000 (13:23 +0200)]
xen/grant-table: Use kmalloc_array() in arch_gnttab_valloc()

* A multiplication for the size determination of a memory allocation
  indicated that an array data structure should be processed.
  Thus reuse the corresponding function "kmalloc_array".

  This issue was detected by using the Coccinelle software.

* Replace the specification of a data type by a pointer dereference
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.
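
Both points can be illustrated with a user-space analogue. This is only
a sketch: alloc_array_sketch() is a hypothetical helper that mimics the
overflow check an array allocator performs, with malloc() standing in
for the kernel allocator.

```c
#include <stdint.h>
#include <stdlib.h>

/* Refuse the request if n * size would overflow size_t. */
void *alloc_array_sketch(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;
    return malloc(n * size);
}

/* Size taken from the pointer dereference rather than a repeated type name. */
unsigned long *alloc_frames_sketch(size_t nr)
{
    unsigned long *frames = alloc_array_sketch(nr, sizeof(*frames));

    return frames;
}
```

If the type of frames changes later, the sizeof(*frames) form follows
automatically, which is the safety argument made in the second point.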

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit 4f0fbdf22e739c94ad4c18c790be014dddaedd28)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen: Make VPMU init message look less scary
Juergen Gross [Tue, 2 Aug 2016 07:22:12 +0000 (09:22 +0200)]
xen: Make VPMU init message look less scary

The default for the Xen hypervisor is to not enable VPMU in order to
avoid security issues. In this case the Linux kernel will issue the
message "Could not initialize VPMU for cpu 0, error -95" which looks
more like an error than a normal state.

Change the message to something less scary in case the hypervisor
returns EOPNOTSUPP or ENOSYS when trying to activate VPMU.
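
A sketch of the message selection, with assumptions stated up front:
vpmu_init_message() is a hypothetical helper, and the wording of the
informational message is illustrative rather than quoted from the patch.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Format a message into buf. Returns 0 when the condition is the normal
 * "VPMU not enabled" state and 1 when it is a genuine error.
 */
int vpmu_init_message(int err, int cpu, char *buf, size_t len)
{
    if (err == -EOPNOTSUPP || err == -ENOSYS) {
        snprintf(buf, len, "VPMU disabled by hypervisor.");
        return 0;
    }
    snprintf(buf, len, "Could not initialize VPMU for cpu %d, error %d",
             cpu, err);
    return 1;
}
```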

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit 0252937a87e1d46a8261da85cbd99dffe612a2d3)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen: rename xen_pmu_init() in sys-hypervisor.c
Juergen Gross [Tue, 2 Aug 2016 06:53:36 +0000 (08:53 +0200)]
xen: rename xen_pmu_init() in sys-hypervisor.c

There are two functions with name xen_pmu_init() in the kernel. Rename
the one in drivers/xen/sys-hypervisor.c to avoid shadowing the one in
arch/x86/xen/pmu.c

To avoid the same problem in future rename some more functions.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit 5b00b504b13b2f0d1aa73d59cf8984726f19100f)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agokexec: allow kdump with crash_kexec_post_notifiers
Petr Tesarik [Tue, 2 Aug 2016 21:06:19 +0000 (14:06 -0700)]
kexec: allow kdump with crash_kexec_post_notifiers

If a crash kernel is loaded, do not crash the running domain.  This is
needed if the kernel is loaded with crash_kexec_post_notifiers, because
panic notifiers are run before __crash_kexec() in that case, and this
Xen hook prevents its being called later.

[akpm@linux-foundation.org: build fix: unconditionally include kexec.h]
Link: http://lkml.kernel.org/r/20160713122000.14969.99963.stgit@hananiah.suse.cz
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OraBug: 25497392

(cherry picked from commit c0253115968c35f3e1ee497282efb75ccf29fb98)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
arch/x86/xen/enlighten.c

8 years agoxen/acpi: allow xen-acpi-processor driver to load on Xen 4.7
Jan Beulich [Fri, 8 Jul 2016 12:15:07 +0000 (06:15 -0600)]
xen/acpi: allow xen-acpi-processor driver to load on Xen 4.7

As of Xen 4.7 PV CPUID doesn't expose either of CPUID[1].ECX[7] and
CPUID[0x80000007].EDX[7] anymore, causing the driver to fail to load on
both Intel and AMD systems. Doing any kind of hardware capability
checks in the driver as a prerequisite was wrong anyway: With the
hypervisor being in charge, all such checking should be done by it. If
ACPI data gets uploaded despite some missing capability, the hypervisor
is free to ignore part or all of that data.

Ditch the entire check_prereq() function, and do the only valid check
(xen_initial_domain()) in the caller in its place.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit 6f2d9d99213514360034c6d52d2c3919290b3504)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoproc: Allow creating permanently empty directories that serve as mount points
Eric W. Biederman [Mon, 11 May 2015 21:44:25 +0000 (16:44 -0500)]
proc: Allow creating permanently empty directories that serve as mount points

Add a new function, proc_create_mount_point(), that creates a
directory that cannot be added to.

Add a new function, is_empty_pde(), to test if a proc directory entry
is such a mount point.

Update the code to use make_empty_dir_inode when reporting
a permanently empty directory to the vfs.

Update the code to not allow adding to permanently empty directories.

Update /proc/openprom and /proc/fs/nfsd to be permanently empty directories.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
OraBug: 25497392

(cherry picked from commit eb6d38d5427b3ad42f5268da0f1dd31bb0af1264)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxen: Resume PMU from non-atomic context
Boris Ostrovsky [Wed, 2 Dec 2015 17:10:48 +0000 (12:10 -0500)]
xen: Resume PMU from non-atomic context

Resuming PMU currently triggers a warning from ___might_sleep() (assuming
CONFIG_DEBUG_ATOMIC_SLEEP is set) when xen_pmu_init() allocates a GFP_KERNEL
page because we are in a state resembling atomic context.

Move resuming PMU to xen_arch_resume() which is called in regular context.
For symmetry move suspending PMU to xen_arch_suspend() as well.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: <stable@vger.kernel.org> # 4.3
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
OraBug: 25497392

(cherry picked from commit de0afc9bdeeadaa998797d2333c754bf9f4d5dcf)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agoxenbus: fix deadlock on writes to /proc/xen/xenbus
David Vrabel [Fri, 9 Dec 2016 14:41:13 +0000 (14:41 +0000)]
xenbus: fix deadlock on writes to /proc/xen/xenbus

/proc/xen/xenbus does not work correctly.  A read blocked waiting for
a xenstore message holds the mutex needed for atomic file position
updates.  This blocks any writes on the same file handle, which can
deadlock if the write is needed to unblock the read.

Clear FMODE_ATOMIC_POS when opening this device to always get
character-device-like semantics.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Orabug: 25425387
(cherry picked from commit 581d21a2d02a798ee34e56dbfa13f891b3a90c30)
Jira: OCC-36718
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com>
8 years agox86/acpi: store ACPI ids from MADT for future usage
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:36 +0000 (17:56 +0200)]
x86/acpi: store ACPI ids from MADT for future usage

Currently we don't save ACPI ids (unlike LAPIC ids which go to
x86_cpu_to_apicid) from MADT and we may need this information later.
Particularly, ACPI ids are the only existing way for a PVHVM Xen guest
to figure out Xen's idea of its vCPUs ids before these CPUs boot and
in some cases these ids diverge from Linux's cpu ids.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 3e9e57fad3d8530aa30787f861c710f598ddc4e7)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

8 years agoxen-netback: fix error handling on netback_probe()
Filipe Manco [Thu, 15 Sep 2016 15:10:46 +0000 (17:10 +0200)]
xen-netback: fix error handling on netback_probe()

In case of error during netback_probe() (e.g. an entry missing on the
xenstore) netback_remove() is called on the new device, which will set
the device backend state to XenbusStateClosed by calling
set_backend_state(). However, the backend state wasn't initialized by
netback_probe() at this point, which will cause an invalid transition
and set_backend_state() to BUG().

Initialize the backend state at the beginning of netback_probe() to
XenbusStateInitialising, and create two new valid state transitions on
set_backend_state(), from XenbusStateInitialising to XenbusStateClosed,
and from XenbusStateInitialising to XenbusStateInitWait.

Signed-off-by: Filipe Manco <filipe.manco@neclab.eu>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit cce94483e47e8e3d74cf4475dea33f9fd4b6ad9f)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
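
The state handling described in the netback_probe() fix above can be sketched in user space as follows. This is only a minimal model under stated assumptions: the enum, struct and function names are hypothetical, only the two transitions named in the commit message are modelled, and a false return stands in for the BUG() the kernel would hit.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of states; the kernel uses enum xenbus_state. */
enum backend_state {
    ST_UNKNOWN = 0,      /* never initialised */
    ST_INITIALISING,
    ST_INIT_WAIT,
    ST_CLOSED
};

struct backend {
    enum backend_state state;
};

/* What probe now does first: give the state a defined starting value. */
static void backend_init(struct backend *be)
{
    be->state = ST_INITIALISING;
}

/*
 * Returns true if the transition was performed, false where the kernel
 * code would treat the request as invalid.
 */
static bool backend_set_state(struct backend *be, enum backend_state target)
{
    switch (be->state) {
    case ST_INITIALISING:
        /* The two transitions the commit message adds. */
        if (target == ST_CLOSED || target == ST_INIT_WAIT) {
            be->state = target;
            return true;
        }
        return false;
    default:
        /* All other transitions are outside the scope of this sketch. */
        return false;
    }
}
```

With this model, closing a backend whose state was never initialised is rejected, while closing one that went through backend_init() succeeds.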

8 years ago xen: change the type of xen_vcpu_id to uint32_t
Vitaly Kuznetsov [Fri, 29 Jul 2016 09:06:48 +0000 (11:06 +0200)]
xen: change the type of xen_vcpu_id to uint32_t

We pass xen_vcpu_id mapping information to hypercalls which require
uint32_t type so it would be cleaner to have it as uint32_t. The
initializer to -1 can be dropped as we always do the mapping before using
it and we never check the 'not set' value anyway.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 55467dea2967259f21f4f854fc99d39cc5fea60e)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

8 years ago xenbus: don't look up transaction IDs for ordinary writes
Jan Beulich [Mon, 15 Aug 2016 15:02:38 +0000 (09:02 -0600)]
xenbus: don't look up transaction IDs for ordinary writes

This should really only be done for XS_TRANSACTION_END messages, or
else at least some of the xenstore-* tools don't work anymore.

Fixes: 0beef634b8 ("xenbus: don't BUG() on user mode induced condition")
Reported-by: Richard Schütz <rschuetz@uni-koblenz.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Richard Schütz <rschuetz@uni-koblenz.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 9a035a40f7f3f6708b79224b86c5777a3334f7ea)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
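
The rule stated in the xenbus commit above (look up the transaction ID only when a transaction is being ended) might be modelled roughly like this. The message-type enum, the helper names and the -ENOENT error value are assumptions made for the illustration, not the kernel's actual code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message types; the real ones are in enum xsd_sockmsg_type. */
enum msg_type { MSG_WRITE, MSG_TRANSACTION_END };

/* Return true if tx_id is among the transactions opened on this handle. */
static bool tx_is_open(const uint32_t *open_ids, size_t n, uint32_t tx_id)
{
    for (size_t i = 0; i < n; i++)
        if (open_ids[i] == tx_id)
            return true;
    return false;
}

/* Returns 0 if the message may be forwarded, -ENOENT otherwise. */
static int check_message(enum msg_type type, uint32_t tx_id,
                         const uint32_t *open_ids, size_t n)
{
    /* Only ending a transaction requires that it was opened here. */
    if (type == MSG_TRANSACTION_END && !tx_is_open(open_ids, n, tx_id))
        return -ENOENT;

    /* Ordinary writes pass through without any lookup. */
    return 0;
}
```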

8 years ago xen-blkfront: free resources if xlvbd_alloc_gendisk fails
Bob Liu [Wed, 27 Jul 2016 09:42:04 +0000 (17:42 +0800)]
xen-blkfront: free resources if xlvbd_alloc_gendisk fails

Current code forgets to free resources in the failure path of
xlvbd_alloc_gendisk(); this patch fixes it.

Signed-off-by: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 4e876c2bd37fbb5c37a4554a79cf979d486f0e82)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

 Conflicts:
drivers/block/xen-blkfront.c
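
The kind of failure-path cleanup the xen-blkfront commit above describes is commonly written with a goto label. The sketch below is a generic user-space illustration, not the driver's code: alloc_disk() is a hypothetical stand-in for xlvbd_alloc_gendisk(), and the allocation counter exists only so the leak can be observed.

```c
#include <stdlib.h>

static int live_allocs;  /* outstanding allocations, for illustration only */

static void *demo_alloc(size_t n)
{
    void *p = malloc(n);

    if (p)
        live_allocs++;
    return p;
}

static void demo_free(void *p)
{
    if (p) {
        free(p);
        live_allocs--;
    }
}

/* Hypothetical stand-in for xlvbd_alloc_gendisk(); fails on request. */
static int alloc_disk(int should_fail, void **disk)
{
    if (should_fail)
        return -1;
    *disk = demo_alloc(64);
    return *disk ? 0 : -1;
}

static int connect_device(int disk_should_fail)
{
    void *ring, *disk = NULL;
    int err;

    ring = demo_alloc(4096);
    if (!ring)
        return -1;

    err = alloc_disk(disk_should_fail, &disk);
    if (err)
        goto free_ring;  /* the step that is easy to forget */

    /* Success: a real driver would keep both; the sketch releases them. */
    demo_free(disk);
    demo_free(ring);
    return 0;

free_ring:
    demo_free(ring);
    return err;
}
```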

8 years ago xen: add static initialization of steal_clock op to xen_time_ops
Juergen Gross [Tue, 26 Jul 2016 12:15:11 +0000 (14:15 +0200)]
xen: add static initialization of steal_clock op to xen_time_ops

pv_time_ops might be overwritten with xen_time_ops after the
steal_clock operation has been initialized already. To prevent calling
a now uninitialized function pointer add the steal_clock static
initialization to xen_time_ops.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit d34c30cc1fa80f509500ff192ea6bc7d30671061)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
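
The problem fixed by the steal_clock commit above can be reproduced in miniature: assigning a whole ops structure replaces every member, so a pointer set earlier is lost unless the replacement's static initializer names it too. The struct layout and demo functions below are assumptions made for the illustration.

```c
#include <stddef.h>

/* Reduced, hypothetical version of a time-ops table. */
struct time_ops {
    unsigned long long (*sched_clock)(void);
    unsigned long long (*steal_clock)(int cpu);
};

static unsigned long long demo_sched_clock(void)
{
    return 0;
}

static unsigned long long demo_steal_clock(int cpu)
{
    (void)cpu;
    return 42;
}

/* The table currently in use (plays the role of pv_time_ops). */
static struct time_ops active_ops;

/* Before the fix: steal_clock is left out and therefore NULL. */
static const struct time_ops ops_without_steal = {
    .sched_clock = demo_sched_clock,
};

/* After the fix: steal_clock is part of the static initializer. */
static const struct time_ops ops_with_steal = {
    .sched_clock = demo_sched_clock,
    .steal_clock = demo_steal_clock,
};
```

Copying ops_without_steal over active_ops after steal_clock has been set leaves a NULL pointer behind; copying ops_with_steal does not.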

8 years ago xen/pvhvm: run xen_vcpu_setup() for the boot CPU
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:43 +0000 (17:56 +0200)]
xen/pvhvm: run xen_vcpu_setup() for the boot CPU

Historically we didn't call VCPUOP_register_vcpu_info for CPU0 for
PVHVM guests (while we had it for PV and ARM guests). This is usually
fine as we can use vcpu info in the shared_info page but when we try
booting on a vCPU with Xen's vCPU id > 31 (e.g. when we try to kdump
after crashing on this CPU) we're not able to boot.

Switch to always doing VCPUOP_register_vcpu_info for the boot CPU.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit ee42d665d3f5db975caf87baf101a57235ddb566)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

8 years ago xen/evtchn: use xen_vcpu_id mapping
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:42 +0000 (17:56 +0200)]
xen/evtchn: use xen_vcpu_id mapping

Use the newly introduced xen_vcpu_id mapping to get Xen's idea of vCPU
id for CPU0.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit cbbb4682394c45986a34d8c77a02e7a066e30235)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

8 years ago xen/events: fifo: use xen_vcpu_id mapping
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:41 +0000 (17:56 +0200)]
xen/events: fifo: use xen_vcpu_id mapping

EVTCHNOP_init_control has vCPU id as a parameter and Xen's idea of
vCPU id should be used. Use the newly introduced xen_vcpu_id mapping
to convert it from Linux's id.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit be78da1cf43db4c1a9e13af8b6754199a89d5d75)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

8 years ago xen/events: use xen_vcpu_id mapping in events_base
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:40 +0000 (17:56 +0200)]
xen/events: use xen_vcpu_id mapping in events_base

EVTCHNOP_bind_ipi and EVTCHNOP_bind_virq pass vCPU id as a parameter
and Xen's idea of vCPU id should be used. Use the newly introduced
xen_vcpu_id mapping to convert it from Linux's id.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 8058c0b897e7d1ba5c900cb17eb82aa0d88fca53)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

8 years ago x86/xen: use xen_vcpu_id mapping when pointing vcpu_info to shared_info
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:39 +0000 (17:56 +0200)]
x86/xen: use xen_vcpu_id mapping when pointing vcpu_info to shared_info

shared_info page has space for 32 vcpu info slots for first 32 vCPUs
but these are the first 32 vCPUs from Xen's perspective and we should
map them accordingly with the newly introduced xen_vcpu_id mapping.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit e15a8621935cac527b4e0ed4078d24c3e5ef73a6)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

 Conflicts:
arch/x86/xen/enlighten.c
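
The slot selection described in the shared_info commit above can be sketched as follows. The structures are placeholders and the function name is hypothetical; the only facts taken from the commit message are that there are 32 slots and that they are indexed by Xen's vCPU id rather than Linux's.

```c
#include <stddef.h>
#include <stdint.h>

#define LEGACY_SLOTS 32  /* vcpu info slots available in shared_info */

/* Placeholder layouts; the real structures carry much more state. */
struct vcpu_info {
    int placeholder;
};

struct shared_info {
    struct vcpu_info vcpu_info[LEGACY_SLOTS];
};

/*
 * Pick the shared_info slot for a CPU using Xen's vCPU id.
 * Returns NULL when the id has no slot, in which case the guest
 * has to register a vcpu_info area of its own.
 */
static struct vcpu_info *shared_slot_for(struct shared_info *shi,
                                         uint32_t xen_id)
{
    if (xen_id >= LEGACY_SLOTS)
        return NULL;
    return &shi->vcpu_info[xen_id];
}
```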

8 years ago x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_op
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:38 +0000 (17:56 +0200)]
x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_op

HYPERVISOR_vcpu_op() passes Linux's idea of vCPU id as a parameter
while Xen's idea is expected. In some cases these ideas diverge so we
need to do remapping.

Convert all callers of HYPERVISOR_vcpu_op() to use xen_vcpu_nr().

Leave xen_fill_possible_map() and xen_filter_cpu_maps() intact as
they're only being called by PV guests before percpu areas are
initialized. While the issue could be solved by switching to
early_percpu for xen_vcpu_id I think it's not worth it: PV guests will
probably never get to the point where their idea of vCPU id diverges
from Xen's.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit ad5475f9faf5186b7f59de2c6481ee3e211f1ed7)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

 Conflicts:
arch/x86/xen/enlighten.c
arch/x86/xen/time.c

8 years ago xen: introduce xen_vcpu_id mapping
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:37 +0000 (17:56 +0200)]
xen: introduce xen_vcpu_id mapping

It may happen that Xen's and Linux's ideas of vCPU id diverge. In
particular, when we crash on a secondary vCPU we may want to do kdump
and unlike plain kexec where we do migrate_to_reboot_cpu() we try
booting on the vCPU which crashed. This doesn't work very well for
PVHVM guests as we have a number of hypercalls where we pass vCPU id
as a parameter. These hypercalls either fail or do something
unexpected.

To solve the issue introduce percpu xen_vcpu_id mapping. ARM and PV
guests get direct mapping for now. Boot CPU for PVHVM guest gets its
id from CPUID. With secondary CPUs it is a bit trickier.
Currently, we initialize IPI vectors before these CPUs boot
so we can't use CPUID. Use ACPI ids from MADT instead.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 88e957d6e47f1232ad15b21e54a44f1147ea8c1b)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

 Conflicts:
arch/arm/xen/enlighten.c
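
A user-space model of the mapping introduced by the commit above might look like this. A plain array stands in for the kernel's percpu variable, NR_CPUS is an arbitrary value, and the two setup helpers are hypothetical; only the idea (direct mapping for ARM and PV, CPUID for the PVHVM boot CPU, ACPI ids from MADT for secondary CPUs) comes from the commit message.

```c
#include <stdint.h>

#define NR_CPUS 8  /* arbitrary size for the sketch */

/* Linux CPU number -> Xen vCPU id (a percpu variable in the kernel). */
static uint32_t xen_vcpu_id[NR_CPUS];

/* Lookup used wherever a hypercall needs Xen's idea of the vCPU id. */
static uint32_t xen_vcpu_nr(unsigned int cpu)
{
    return xen_vcpu_id[cpu];
}

/* ARM and PV guests: both sides agree, so map each CPU to itself. */
static void map_direct(void)
{
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        xen_vcpu_id[cpu] = cpu;
}

/*
 * PVHVM guests: the boot CPU's id is read from CPUID, the secondary
 * CPUs take the ACPI ids recorded from MADT (acpi_ids[cpu]).
 */
static void map_pvhvm(uint32_t boot_id_from_cpuid,
                      const uint32_t *acpi_ids, unsigned int ncpus)
{
    xen_vcpu_id[0] = boot_id_from_cpuid;
    for (unsigned int cpu = 1; cpu < ncpus && cpu < NR_CPUS; cpu++)
        xen_vcpu_id[cpu] = acpi_ids[cpu];
}
```

In a kdump kernel that boots on what Xen calls vCPU 5, Linux CPU 0 then maps to Xen id 5, and hypercalls that take a vCPU id receive 5 instead of 0.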

8 years ago x86/xen: update cpuid.h from Xen-4.7
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:35 +0000 (17:56 +0200)]
x86/xen: update cpuid.h from Xen-4.7

Update cpuid.h header from xen hypervisor tree to get
XEN_HVM_CPUID_VCPU_ID_PRESENT definition.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit de2f5537b397249e91cafcbed4de64a24818542e)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

8 years ago xen/evtchn: add IOCTL_EVTCHN_RESTRICT
David Vrabel [Mon, 11 Jul 2016 14:45:51 +0000 (15:45 +0100)]
xen/evtchn: add IOCTL_EVTCHN_RESTRICT

IOCTL_EVTCHN_RESTRICT limits the file descriptor to being able to bind
to interdomain event channels from a specific domain.  Event channels
that are already bound continue to work for sending and receiving
notifications.

This is useful as part of deprivileging a user space PV backend or
device model (QEMU).  For example, once the device model has bound to the
ioreq server event channels it can restrict the file handle so an
exploited DM cannot use it to create or bind to arbitrary event
channels.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
(cherry picked from commit fbc872c38c8fed31948c85683b5326ee5ab9fccc)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
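
The effect of restricting a handle, as described in the IOCTL_EVTCHN_RESTRICT commit above, can be modelled without a Xen host. Everything below is a hypothetical model: the struct, the sentinel value and the error codes are assumptions, and the rule that a handle can be restricted only once is an assumption of this sketch rather than something the commit message states.

```c
#include <errno.h>
#include <stdint.h>

#define UNRESTRICTED 0xFFFFu  /* hypothetical sentinel: no restriction set */

/* Per-open-file state of the model. */
struct evtchn_handle {
    uint16_t restrict_domid;
};

static void handle_open(struct evtchn_handle *h)
{
    h->restrict_domid = UNRESTRICTED;
}

/* Model of the restrict operation (assumed to be one-shot). */
static int handle_restrict(struct evtchn_handle *h, uint16_t domid)
{
    if (h->restrict_domid != UNRESTRICTED)
        return -EACCES;
    if (domid == UNRESTRICTED)
        return -EINVAL;
    h->restrict_domid = domid;
    return 0;
}

/* Model of an interdomain bind request against a remote domain. */
static int handle_bind_interdomain(const struct evtchn_handle *h,
                                   uint16_t remote_domid)
{
    if (h->restrict_domid != UNRESTRICTED &&
        h->restrict_domid != remote_domid)
        return -EACCES;
    return 0;
}
```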

8 years ago xen-blkback: really don't leak mode property
Jan Beulich [Thu, 7 Jul 2016 07:38:13 +0000 (01:38 -0600)]
xen-blkback: really don't leak mode property

Commit 9d092603cc ("xen-blkback: do not leak mode property") left one
path unfixed; correct this.

Acked-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit aea305e11f7a7af12aa2beb7c7e053a338659c49)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

8 years ago xen-blkback: constify instance of "struct attribute_group"
Jan Beulich [Thu, 7 Jul 2016 07:38:58 +0000 (01:38 -0600)]
xen-blkback: constify instance of "struct attribute_group"

The functions these get passed to have been taking pointers to const
since at least 2.6.16.

Acked-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 530439484d2d9f2a7f1038b1afd3d3543ecc63f6)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

8 years ago xen-blkfront: prefer xenbus_scanf() over xenbus_gather()
Jan Beulich [Thu, 7 Jul 2016 08:05:46 +0000 (02:05 -0600)]
xen-blkfront: prefer xenbus_scanf() over xenbus_gather()

... for single items being collected: It is more typesafe (as the
compiler can check format string and to-be-written-to variable match)
and requires one less parameter to be passed.

Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit ff595325ed556fb4b83af5b9ffd5c427c18405d7)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
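
The type-safety argument in the commit above rests on the compiler being able to compare a format string with the variables it writes to. The sketch below shows that mechanism with a hypothetical scan_value() helper built on vsscanf(); it is not xenbus_scanf() itself and never touches xenstore. The format attribute is a GCC/Clang extension, so it is guarded by a macro.

```c
#include <stdarg.h>
#include <stdio.h>

/* Ask GCC/Clang to check the format string against the arguments. */
#if defined(__GNUC__)
#define SCAN_CHECK(f, a) __attribute__((format(scanf, f, a)))
#else
#define SCAN_CHECK(f, a)
#endif

static int scan_value(const char *value, const char *fmt, ...)
    SCAN_CHECK(2, 3);

/*
 * Parse one textual value according to fmt.
 * Returns the number of items converted, like sscanf().
 */
static int scan_value(const char *value, const char *fmt, ...)
{
    va_list ap;
    int ret;

    va_start(ap, fmt);
    ret = vsscanf(value, fmt, ap);
    va_end(ap);
    return ret;
}
```

With the attribute in place, passing an `int *` for a `%lu` conversion produces a compile-time warning under -Wformat, which is the kind of check a gather-style interface with a NULL-terminated list of format/pointer pairs cannot offer.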

8 years ago xen-blkback: prefer xenbus_scanf() over xenbus_gather()
Jan Beulich [Thu, 7 Jul 2016 08:05:21 +0000 (02:05 -0600)]
xen-blkback: prefer xenbus_scanf() over xenbus_gather()

... for single items being collected: It is more typesafe (as the
compiler can check format string and to-be-written-to variable match)
and requires one less parameter to be passed.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 6694389af9be4d1eb8d3313788a902f0590fb8c2)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

8 years ago x86/xen: Audit and remove any unnecessary uses of module.h
Paul Gortmaker [Thu, 14 Jul 2016 00:18:59 +0000 (20:18 -0400)]
x86/xen: Audit and remove any unnecessary uses of module.h

Historically a lot of these existed because we did not have
a distinction between what was modular code and what was providing
support to modules via EXPORT_SYMBOL and friends.  That changed
when we forked out support for the latter into the export.h file.

This means we should be able to reduce the usage of module.h
in code that is obj-y Makefile or bool Kconfig.  The advantage
in doing so is that module.h itself sources about 15 other headers;
adding significantly to what we feed cpp, and it can obscure what
headers we are effectively using.

Since module.h was the source for init.h (for __init) and for
export.h (for EXPORT_SYMBOL) we consider each obj-y/bool instance
for the presence of either and replace as needed.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20160714001901.31603-7-paul.gortmaker@windriver.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 7a2463dcacee3f2f36c78418c201756372eeea6b)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

8 years ago Input: xen-kbdfront - prefer xenbus_write() over xenbus_printf() where possible
Jan Beulich [Sat, 9 Jul 2016 00:35:30 +0000 (17:35 -0700)]
Input: xen-kbdfront - prefer xenbus_write() over xenbus_printf() where possible

... as being the simpler variant.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
(cherry picked from commit cd6763be8f553c7db421d38ddcb36466fb8512cd)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

 Conflicts:
drivers/input/misc/xen-kbdfront.c

8 years ago xen: support runqueue steal time on xen
Juergen Gross [Wed, 6 Jul 2016 05:00:30 +0000 (07:00 +0200)]
xen: support runqueue steal time on xen

Up to now reading the stolen time of a remote cpu was not possible in a
performant way under Xen. This made support of runqueue steal time via
paravirt_steal_rq_enabled impossible.

With the addition of an appropriate hypervisor interface this is now
possible, so add the support.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 6ba286ad845799b135e5af73d1fbc838fa79f709)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

8 years ago xen: update xen headers
Juergen Gross [Wed, 6 Jul 2016 05:00:28 +0000 (07:00 +0200)]
xen: update xen headers

Update some Xen headers to be able to use new functionality.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 7ba8dba95cb227eb6c270b1aa77f942e45f5e47c)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

8 years ago xen-pciback: drop superfluous variables
Jan Beulich [Wed, 6 Jul 2016 07:00:14 +0000 (01:00 -0600)]
xen-pciback: drop superfluous variables

req_start is simply an alias of the "offset" function parameter, and
req_end is being used just once in each function. (And both variables
were loop invariant anyway, so should at least have got initialized
outside the loop.)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 1ad6344acfbf19288573b4a5fa0b07cbb5af27d7)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
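
The simplification in the xen-pciback commit above amounts to testing range overlap directly from the function parameters instead of through aliases. A minimal sketch, assuming a request covering [offset, offset + size) and a config field covering [field_start, field_start + field_size); the function name and parameter names other than "offset" are hypothetical:

```c
#include <stdbool.h>

/*
 * Does the request [offset, offset + size) touch the field
 * [field_start, field_start + field_size)?  No req_start/req_end
 * temporaries are needed: offset is the start, offset + size the end.
 */
static bool field_overlaps(unsigned int offset, unsigned int size,
                           unsigned int field_start,
                           unsigned int field_size)
{
    unsigned int field_end = field_start + field_size;

    return offset + size > field_start && field_end > offset;
}
```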

8 years ago xen-pciback: short-circuit read path used for merging write values
Jan Beulich [Wed, 6 Jul 2016 06:59:35 +0000 (00:59 -0600)]
xen-pciback: short-circuit read path used for merging write values

There's no point calling xen_pcibk_config_read() here - all it'll do is
return whatever conf_space_read() returns for the field which was found
here (and which would be found there again). Also there's no point
clearing tmp_val before the call.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit ee87d6d0d36d98c550f99274a81841033226e3bf)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

8 years ago xen-pciback: use const and unsigned in bar_init()
Jan Beulich [Wed, 6 Jul 2016 06:58:58 +0000 (00:58 -0600)]
xen-pciback: use const and unsigned in bar_init()

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 585203609c894db11dea724b743c04d0c9927f39)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

8 years ago xen-pciback: simplify determination of 64-bit memory resource
Jan Beulich [Wed, 6 Jul 2016 06:58:19 +0000 (00:58 -0600)]
xen-pciback: simplify determination of 64-bit memory resource

Other than for raw BAR values, flags are properly separated in the
internal representation.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit c8670c22e04e4e42e752cc5b53922106b3eedbda)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937

8 years ago xen-pciback: fold read_dev_bar() into its now single caller
Jan Beulich [Wed, 6 Jul 2016 06:57:43 +0000 (00:57 -0600)]
xen-pciback: fold read_dev_bar() into its now single caller

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 6ad2655d87d2d35c1de4500402fae10fe7b30b4a)
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937