]> www.infradead.org Git - users/jedix/linux-maple.git/log
users/jedix/linux-maple.git
7 years agocrypto: rng - Convert low-level crypto_rng to new style
Herbert Xu [Tue, 21 Apr 2015 02:46:38 +0000 (10:46 +0800)]
crypto: rng - Convert low-level crypto_rng to new style

Orabug: 26330509

This patch converts the low-level crypto_rng interface to the
"new" style.

This allows existing implementations to be converted over one-
by-one.  Once that is complete we can then remove the old rng
interface.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit acec27ff35af9caf34d76d16ee17ff3b292e7d83)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agocrypto: rng - Mark crypto_rng_reset seed as const
Herbert Xu [Tue, 21 Apr 2015 02:46:37 +0000 (10:46 +0800)]
crypto: rng - Mark crypto_rng_reset seed as const

Orabug: 26330509

There is no reason why crypto_rng_reset should modify the seed
so this patch marks it as const.  Since our algorithms don't
export a const seed function yet we have to go through some
contortions for now.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 3c5d8fa9f56ad0928e7a1f06003e5034f5eedb52)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agocrypto: rng - Introduce crypto_rng_generate
Herbert Xu [Mon, 20 Apr 2015 05:39:04 +0000 (13:39 +0800)]
crypto: rng - Introduce crypto_rng_generate

Orabug: 26330509

This patch adds the new top-level function crypto_rng_generate
which generates random numbers with additional input.  It also
extends the mid-level rng_gen_random function to take additional
data as input.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit ff030b099a21a4753af575b4304249e88400e506)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agocrypto: rng - Convert crypto_rng to new style crypto_type
Herbert Xu [Mon, 20 Apr 2015 05:39:03 +0000 (13:39 +0800)]
crypto: rng - Convert crypto_rng to new style crypto_type

Orabug: 26330509

This patch converts the top-level crypto_rng to the "new" style.
It was the last algorithm type added before we switched over
to the new way of doing things exemplified by shash.

All users will automatically switch over to the new interface.

Note that this patch does not touch the low-level interface to
rng implementations.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit d0e83059a6c9b04f00264a74b8f6439948de4613)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agomm: Tighten x86 /dev/mem with zeroing reads
Kees Cook [Wed, 5 Apr 2017 16:39:08 +0000 (09:39 -0700)]
mm: Tighten x86 /dev/mem with zeroing reads

Under CONFIG_STRICT_DEVMEM, reading System RAM through /dev/mem is
disallowed. However, on x86, the first 1MB was always allowed for BIOS
and similar things, regardless of it actually being System RAM. It was
possible for heap to end up getting allocated in low 1MB RAM, and then
read by things like x86info or dd, which would trip hardened usercopy:

usercopy: kernel memory exposure attempt detected from ffff880000090000 (dma-kmalloc-256) (4096 bytes)

This changes the x86 exception for the low 1MB by reading back zeros for
System RAM areas instead of blindly allowing them. More work is needed to
extend this to mmap, but currently mmap doesn't go through usercopy, so
hardened usercopy won't Oops the kernel.

Reported-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Tested-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
(cherry picked from commit a4866aa812518ed1a37d8ea0c881dc946409de94)

Orabug: 25917914
CVE: CVE-2017-7889

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago[media] saa7164: fix double fetch PCIe access condition
Steven Toth [Tue, 6 Jun 2017 12:30:27 +0000 (09:30 -0300)]
[media] saa7164: fix double fetch PCIe access condition

Avoid a double fetch by reusing the values from the prior transfer.

Originally reported via https://bugzilla.kernel.org/show_bug.cgi?id=195559

Thanks to Pengfei Wang <wpengfeinudt@gmail.com> for reporting.

Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reported-by: Pengfei Wang <wpengfeinudt@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
(cherry picked from commit 6fb05e0dd32e566facb96ea61a48c7488daa5ac3)

Orabug: 26093949
CVE: CVE-2017-8831

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agoRevert "net/rds: Revert "RDS: add reconnect retry scheme for stalled
Wei Lin Guay [Mon, 11 Sep 2017 13:17:28 +0000 (15:17 +0200)]
Revert "net/rds: Revert "RDS: add reconnect retry scheme for stalled
connections""

This commit restores commit 5acb959ad599 ("RDS: add reconnect retry scheme
for stalled connections").  Even though this retry scheme "workaround"
causes a long brownout time in the OVM configuration, it is needed to avoid
RDS loopback connections stalls after switch reboot in the bare-metal
system. As for now, the plan agreed with Exadata is to put back this commit
first and have a similar code path among QU6, QU5 and QU4.

Orabug: 26497333

Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
7 years agoRevert "net/rds: prioritize the base connection establishment"
Wei Lin Guay [Thu, 10 Aug 2017 09:59:35 +0000 (11:59 +0200)]
Revert "net/rds: prioritize the base connection establishment"

This reverts commit 1bc87d23681a ("net/rds: prioritize the base connection
establishment"). This patch is reverted because the RDS path record caching
is replaced by the underlying ibacm path record caching. By doing so, all
the TOS connections can be established without depending on its base
connection to perform SA query. Thus, it is not required to prioritize
the base connection establishment.

Orabug: 26497333

Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Avinash Repaka <avinash.repaka@oracle.com>
Reviewed-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
7 years agoRevert "net/rds: determine active/passive connection with IP addresses"
Wei Lin Guay [Thu, 10 Aug 2017 09:57:34 +0000 (11:57 +0200)]
Revert "net/rds: determine active/passive connection with IP addresses"

This reverts commit 1f2ea7a020a1 ("net/rds: determine active/passive
connection with IP addresses"). The plan is to use the original, well
tested one-sided reconnection that was reverted in "812c02791: RDS: restore
the exponential back-off scheme".

Orabug: 26497333

Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Avinash Repaka <avinash.repaka@oracle.com>
Reviewed-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
7 years agoRevert "net/rds: use different workqueue for base_conn"
Wei Lin Guay [Thu, 10 Aug 2017 09:36:03 +0000 (11:36 +0200)]
Revert "net/rds: use different workqueue for base_conn"

This reverts commit ad7312bb8b8e ("net/rds: use different workqueue for
base_conn"). The RDS path record caching will be replaced by ibacm. Thus,
all the TOS connections do not rely on its base connection to perform SA
query.  As a result, RDS does not need a separate "priority" workqueue for
the base connection.

Orabug: 26497333

Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Avinash Repaka <avinash.repaka@oracle.com>
Reviewed-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
7 years agoblk-mq: add missing blk_mq_put_ctx
Ankur Arora [Wed, 6 Sep 2017 19:47:52 +0000 (12:47 -0700)]
blk-mq: add missing blk_mq_put_ctx

In case of failure to queue to the lower level driver, we
enqueue it via blk_mq_insert_request() and return while
holding a cpu reference.

Give up the currently held reference before calling
blk_mq_insert_request().

Orabug: 26339553

Suggested-by: Bhavesh Davda <bhavesh.davda@oracle.com>
Reviewed-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
7 years agoblk-mq: avoid re-initialize request which is failed in direct dispatch
Shaohua Li [Fri, 8 May 2015 17:51:31 +0000 (10:51 -0700)]
blk-mq: avoid re-initialize request which is failed in direct dispatch

If we directly issue a request and it fails, we use
blk_mq_merge_queue_io(). But we already assigned bio to a request in
blk_mq_bio_to_request. blk_mq_merge_queue_io shouldn't run
blk_mq_bio_to_request again.

Orabug: 26339553
Suggested-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 239ad215f0d8388cbe6c09a0fab8ad8ff5dba420)
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Bhavesh Davda <bhavesh.davda@oracle.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
7 years agobnxt_en: Add bnxt_get_num_stats() to centrally get the number of ethtool stats.
Michael Chan [Mon, 24 Jul 2017 16:34:23 +0000 (12:34 -0400)]
bnxt_en: Add bnxt_get_num_stats() to centrally get the number of ethtool stats.

Orabug: 26726982

Instead of duplicating the logic multiple times.  Also, it is unnecessary
to zero the buffer in .get_ethtool_stats() because it is already zeroed
by the caller.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5c8227d0d3b1eb1ad8f98d0b6dc619d70f2cfa04)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
7 years agobnxt_en: Implement ndo_bridge_{get|set}link methods.
Michael Chan [Mon, 24 Jul 2017 16:34:22 +0000 (12:34 -0400)]
bnxt_en: Implement ndo_bridge_{get|set}link methods.

Orabug: 26726982

To allow users to set the hardware bridging mode to VEB or VEPA.  Only
single function PF can change the bridging mode.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 39d8ba2e71fbdde686d7e31ad141a01994dc0793)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt.c

7 years agobnxt_en: Retrieve the hardware bridge mode from the firmware.
Michael Chan [Mon, 24 Jul 2017 16:34:21 +0000 (12:34 -0400)]
bnxt_en: Retrieve the hardware bridge mode from the firmware.

Orabug: 26726982

Retrieve and store the hardware bridge mode, so that we can implement
ndo_bridge_{get|set)link methods in the next patch.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 32e8239c9138a050bc1feeea7cf41f27d79e6664)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
7 years agobnxt_en: Update firmware interface spec to 1.8.0.
Michael Chan [Mon, 24 Jul 2017 16:34:20 +0000 (12:34 -0400)]
bnxt_en: Update firmware interface spec to 1.8.0.

Orabug: 26726982

VF representors and PTP are added features in the new firmware spec.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit acb2005463612930b07723e852b2483d669ff856)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
7 years agox86/mm: Fix flush_tlb_page() on Xen
Andy Lutomirski [Sat, 22 Apr 2017 07:01:22 +0000 (00:01 -0700)]
x86/mm: Fix flush_tlb_page() on Xen

flush_tlb_page() passes a bogus range to flush_tlb_others() and
expects the latter to fix it up.  native_flush_tlb_others() has the
fixup but Xen's version doesn't.  Move the fixup to
flush_tlb_others().

AFAICS the only real effect is that, without this fix, Xen would
flush everything instead of just the one page on remote vCPUs in
when flush_tlb_page() was called.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: e7b52ffd45a6 ("x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range")
Link: http://lkml.kernel.org/r/10ed0e4dfea64daef10b87fb85df1746999b4dba.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
OraBug: 26662731

(cherry picked from commit dbd68d8e84c606673ebbcf15862f8c155fa92326)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Conflicts:
arch/x86/mm/tlb.c
(uek's native_flush_tlb_others() did not have end's non-zero check)

7 years agoxen-netback: correctly schedule rate-limited queues
Wei Liu [Wed, 21 Jun 2017 09:21:22 +0000 (10:21 +0100)]
xen-netback: correctly schedule rate-limited queues

Add a flag to indicate if a queue is rate-limited. Test the flag in
NAPI poll handler and avoid rescheduling the queue if true, otherwise
we risk locking up the host. The rescheduling will be done in the
timer callback function.

Reported-by: Jean-Louis Dupond <jean-louis@dupond.be>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Tested-by: Jean-Louis Dupond <jean-louis@dupond.be>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
OraBug: 26662731

(cherry picked from commit dfa523ae9f2542bee4cddaea37b3be3e157f6e6b)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Conflicts:
drivers/net/xen-netback/interface.c
(Upstream's code used napi_complete_done())

7 years agoxen/blkback: don't use xen_blkif_get() in xen-blkback kthread
Juergen Gross [Thu, 18 May 2017 15:28:49 +0000 (17:28 +0200)]
xen/blkback: don't use xen_blkif_get() in xen-blkback kthread

There is no need to use xen_blkif_get()/xen_blkif_put() in the kthread
of xen-blkback. Thread stopping is synchronous and using the blkif
reference counting in the kthread will avoid to ever let the reference
count drop to zero at the end of an I/O running concurrent to
disconnecting and multiple rings.

Setting ring->xenblkd to NULL after stopping the kthread isn't needed
as the kthread does this already.

Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Steven Haigh <netwiz@crc.id.au>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 26662731

(cherry picked from commit a24fa22ce22ae302b3bf8f7008896d52d5d57b8d)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen/blkback: don't free be structure too early
Juergen Gross [Thu, 18 May 2017 15:28:48 +0000 (17:28 +0200)]
xen/blkback: don't free be structure too early

The be structure must not be freed when freeing the blkif structure
isn't done. Otherwise a use-after-free of be when unmapping the ring
used for communicating with the frontend will occur in case of a
late call of xenblk_disconnect() (e.g. due to an I/O still active
when trying to disconnect).

Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Steven Haigh <netwiz@crc.id.au>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 26662731

(cherry picked from commit 71df1d7ccad1c36f7321d6b3b48f2ea42681c363)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen: make xen_flush_tlb_all() static
Juergen Gross [Thu, 18 May 2017 15:46:48 +0000 (17:46 +0200)]
xen: make xen_flush_tlb_all() static

xen_flush_tlb_all() is used in arch/x86/xen/mmu.c only. Make it static.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 26662731

(cherry picked from commit c71e6d804c88168ecf02aaf14e1fd5773d683b5f)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Conflicts:
arch/x86/xen/mmu.c
(Different location of xen_flush_tlb_all())

7 years agoblock: xen-blkback: add null check to avoid null pointer dereference
Gustavo A. R. Silva [Thu, 11 May 2017 15:27:35 +0000 (10:27 -0500)]
block: xen-blkback: add null check to avoid null pointer dereference

Add null check before calling xen_blkif_put() to avoid potential
null pointer dereference.

Addresses-Coverity-ID: 1350942
Cc: Juergen Gross <jgross@suse.com>
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 26662731

(cherry picked from commit 2d4456c73a487abe53863e10641c2f73537edf5c)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen: adjust early dom0 p2m handling to xen hypervisor behavior
Juergen Gross [Wed, 10 May 2017 04:08:44 +0000 (06:08 +0200)]
xen: adjust early dom0 p2m handling to xen hypervisor behavior

When booted as pv-guest the p2m list presented by the Xen is already
mapped to virtual addresses. In dom0 case the hypervisor might make use
of 2M- or 1G-pages for this mapping. Unfortunately while being properly
aligned in virtual and machine address space, those pages might not be
aligned properly in guest physical address space.

So when trying to obtain the guest physical address of such a page
pud_pfn() and pmd_pfn() must be avoided as those will mask away guest
physical address bits not being zero in this special case.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 26662731

(cherry picked from commit 69861e0a52f8733355ce246f0db15e1b240ad667)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Conflicts:
arch/x86/xen/mmu_pv.c
(No mmu_pv.c in our tree)

7 years agoxen/x86: Do not call xen_init_time_ops() until shared_info is initialized
Boris Ostrovsky [Wed, 3 May 2017 20:20:51 +0000 (16:20 -0400)]
xen/x86: Do not call xen_init_time_ops() until shared_info is initialized

Routines that are set by xen_init_time_ops() use shared_info's
pvclock_vcpu_time_info area. This area is not properly available until
shared_info is mapped in xen_setup_shared_info().

This became especially problematic due to commit dd759d93f4dd ("x86/timers:
Add simple udelay calibration") where we end up reading tsc_to_system_mul
from xen_dummy_shared_info (i.e. getting zero value) and then trying
to divide by it in pvclock_tsc_khz().

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 26662731

(cherry picked from commit d162809f85b4f54ef075517ffa2f3d02e55d5a53)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Conflicts:
arch/x86/xen/enlighten_pv.c
(No enlighten_pv.c in our tree. Note also that
         we don't have dd759d93f4dd. Still, this fixes a
         latent bug).

7 years agoxen: Implement EFI reset_system callback
Julien Grall [Mon, 24 Apr 2017 17:58:39 +0000 (18:58 +0100)]
xen: Implement EFI reset_system callback

When rebooting DOM0 with ACPI on ARM64, the kernel is crashing with the stack
trace [1].

This is happening because when EFI runtimes are enabled, the reset code
(see machine_restart) will first try to use EFI restart method.

However, the EFI restart code is expecting the reset_system callback to
be always set. This is not the case for Xen and will lead to crash.

The EFI restart helper is used in multiple places and some of them don't
not have fallback (see machine_power_off). So implement reset_system
callback as a call to xen_reboot when using EFI Xen.

[   36.999270] reboot: Restarting system
[   37.002921] Internal error: Attempting to execute userspace memory: 86000004 [#1] PREEMPT SMP
[   37.011460] Modules linked in:
[   37.014598] CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 4.11.0-rc1-00003-g1e248b60a39b-dirty #506
[   37.023903] Hardware name: (null) (DT)
[   37.027734] task: ffff800902068000 task.stack: ffff800902064000
[   37.033739] PC is at 0x0
[   37.036359] LR is at efi_reboot+0x94/0xd0
[   37.040438] pc : [<0000000000000000>] lr : [<ffff00000880f2c4>] pstate: 404001c5
[   37.047920] sp : ffff800902067cf0
[   37.051314] x29: ffff800902067cf0 x28: ffff800902068000
[   37.056709] x27: ffff000008992000 x26: 000000000000008e
[   37.062104] x25: 0000000000000123 x24: 0000000000000015
[   37.067499] x23: 0000000000000000 x22: ffff000008e6e250
[   37.072894] x21: ffff000008e6e000 x20: 0000000000000000
[   37.078289] x19: ffff000008e5d4c8 x18: 0000000000000010
[   37.083684] x17: 0000ffffa7c27470 x16: 00000000deadbeef
[   37.089079] x15: 0000000000000006 x14: ffff000088f42bef
[   37.094474] x13: ffff000008f42bfd x12: ffff000008e706c0
[   37.099870] x11: ffff000008e70000 x10: 0000000005f5e0ff
[   37.105265] x9 : ffff800902067a50 x8 : 6974726174736552
[   37.110660] x7 : ffff000008cc6fb8 x6 : ffff000008cc6fb0
[   37.116055] x5 : ffff000008c97dd8 x4 : 0000000000000000
[   37.121453] x3 : 0000000000000000 x2 : 0000000000000000
[   37.126845] x1 : 0000000000000000 x0 : 0000000000000000
[   37.132239]
[   37.133808] Process systemd-shutdow (pid: 1, stack limit = 0xffff800902064000)
[   37.141118] Stack: (0xffff800902067cf0 to 0xffff800902068000)
[   37.146949] 7ce0:                                   ffff800902067d40 ffff000008085334
[   37.154869] 7d00: 0000000000000000 ffff000008f3b000 ffff800902067d40 ffff0000080852e0
[   37.162787] 7d20: ffff000008cc6fb0 ffff000008cc6fb8 ffff000008c7f580 ffff000008c97dd8
[   37.170706] 7d40: ffff800902067d60 ffff0000080e2c2c 0000000000000000 0000000001234567
[   37.178624] 7d60: ffff800902067d80 ffff0000080e2ee8 0000000000000000 ffff0000080e2df4
[   37.186544] 7d80: 0000000000000000 ffff0000080830f0 0000000000000000 00008008ff1c1000
[   37.194462] 7da0: ffffffffffffffff 0000ffffa7c4b1cc 0000000000000000 0000000000000024
[   37.202380] 7dc0: ffff800902067dd0 0000000000000005 0000fffff24743c8 0000000000000004
[   37.210299] 7de0: 0000fffff2475f03 0000000000000010 0000fffff2474418 0000000000000005
[   37.218218] 7e00: 0000fffff2474578 000000000000000a 0000aaaad6b722c0 0000000000000001
[   37.226136] 7e20: 0000000000000123 0000000000000038 ffff800902067e50 ffff0000081e7294
[   37.234055] 7e40: ffff800902067e60 ffff0000081e935c ffff800902067e60 ffff0000081e9388
[   37.241973] 7e60: ffff800902067eb0 ffff0000081ea388 0000000000000000 00008008ff1c1000
[   37.249892] 7e80: ffffffffffffffff 0000ffffa7c4a79c 0000000000000000 ffff000000020000
[   37.257810] 7ea0: 0000010000000004 0000000000000000 0000000000000000 ffff0000080830f0
[   37.265729] 7ec0: fffffffffee1dead 0000000028121969 0000000001234567 0000000000000000
[   37.273651] 7ee0: ffffffffffffffff 8080000000800000 0000800000008080 feffa9a9d4ff2d66
[   37.281567] 7f00: 000000000000008e feffa9a9d5b60e0f 7f7fffffffff7f7f 0101010101010101
[   37.289485] 7f20: 0000000000000010 0000000000000008 000000000000003a 0000ffffa7ccf588
[   37.297404] 7f40: 0000aaaad6b87d00 0000ffffa7c4b1b0 0000fffff2474be0 0000aaaad6b88000
[   37.305326] 7f60: 0000fffff2474fb0 0000000001234567 0000000000000000 0000000000000000
[   37.313240] 7f80: 0000000000000000 0000000000000001 0000aaaad6b70d4d 0000000000000000
[   37.321159] 7fa0: 0000000000000001 0000fffff2474ea0 0000aaaad6b5e2e0 0000fffff2474e80
[   37.329078] 7fc0: 0000ffffa7c4b1cc 0000000000000000 fffffffffee1dead 000000000000008e
[   37.336997] 7fe0: 0000000000000000 0000000000000000 9ce839cffee77eab fafdbf9f7ed57f2f
[   37.344911] Call trace:
[   37.347437] Exception stack(0xffff800902067b20 to 0xffff800902067c50)
[   37.353970] 7b20: ffff000008e5d4c8 0001000000000000 0000000080f82000 0000000000000000
[   37.361883] 7b40: ffff800902067b60 ffff000008e17000 ffff000008f44c68 00000001081081b4
[   37.369802] 7b60: ffff800902067bf0 ffff000008108478 0000000000000000 ffff000008c235b0
[   37.377721] 7b80: ffff800902067ce0 0000000000000000 0000000000000000 0000000000000015
[   37.385643] 7ba0: 0000000000000123 000000000000008e ffff000008992000 ffff800902068000
[   37.393557] 7bc0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   37.401477] 7be0: 0000000000000000 ffff000008c97dd8 ffff000008cc6fb0 ffff000008cc6fb8
[   37.409396] 7c00: 6974726174736552 ffff800902067a50 0000000005f5e0ff ffff000008e70000
[   37.417318] 7c20: ffff000008e706c0 ffff000008f42bfd ffff000088f42bef 0000000000000006
[   37.425234] 7c40: 00000000deadbeef 0000ffffa7c27470
[   37.430190] [<          (null)>]           (null)
[   37.434982] [<ffff000008085334>] machine_restart+0x6c/0x70
[   37.440550] [<ffff0000080e2c2c>] kernel_restart+0x6c/0x78
[   37.446030] [<ffff0000080e2ee8>] SyS_reboot+0x130/0x228
[   37.451337] [<ffff0000080830f0>] el0_svc_naked+0x24/0x28
[   37.456737] Code: bad PC value
[   37.459891] ---[ end trace 76e2fc17e050aecd ]---

Signed-off-by: Julien Grall <julien.grall@arm.com>
--

Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
The x86 code has theoritically a similar issue, altought EFI does not
seem to be the preferred method. I have only built test it on x86.

This should also probably be fixed in stable tree.

    Changes in v2:
        - Implement xen_efi_reset_system using xen_reboot
        - Move xen_efi_reset_system in drivers/xen/efi.c
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 26662731

(cherry picked from commit e371fd7607999fabbd955b4d22c8e912594a7997)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen: Export xen_reboot
Julien Grall [Mon, 24 Apr 2017 17:58:37 +0000 (18:58 +0100)]
xen: Export xen_reboot

The helper xen_reboot will be called by the EFI code in a later patch.

Note that the ARM version does not yet exist and will be added in a
later patch too.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 26662731

(cherry picked from commit 5d9404e1185de8d508cd042761306495f727d7eb)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Conflicts:
arch/x86/xen/xen-ops.h
arch/x86/xen/enlighten.c
(No changes to xen-ops.h, make xen_reboot() non-static)

7 years agoxen/pvh: Do not fill kernel's e820 map in init_pvh_bootparams()
Boris Ostrovsky [Fri, 21 Apr 2017 15:13:14 +0000 (11:13 -0400)]
xen/pvh: Do not fill kernel's e820 map in init_pvh_bootparams()

e820 map is updated with information from the zeropage (i.e. pvh_bootparams)
by default_machine_specific_memory_setup(). With the way things are done
now,  we end up with a duplicated e820 map.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 26662731

(cherry picked from commit 5f6a1614fab801834e32b420b60acdc27acfcdec)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Conflicts:
arch/x86/xen/enlighten_pvh.c
(No enlighten_pvh.c in our tree)

7 years agoxen/scsifront: use offset_in_page() macro
Geliang Tang [Sat, 22 Apr 2017 01:21:13 +0000 (09:21 +0800)]
xen/scsifront: use offset_in_page() macro

Use offset_in_page() macro instead of open-coding.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 26662731

(cherry picked from commit 6483e3135a693548874429db901c0544d3a9b4cd)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen,kdump: handle pv domain in paddr_vmcoreinfo_note()
Juergen Gross [Tue, 11 Apr 2017 16:14:26 +0000 (18:14 +0200)]
xen,kdump: handle pv domain in paddr_vmcoreinfo_note()

For kdump to work correctly it needs the physical address of
vmcoreinfo_note. When running as dom0 this means the virtual address
has to be translated to the related machine address.

paddr_vmcoreinfo_note() is meant to do the translation via
__pa_symbol() only, but being attributed "weak" it can be replaced
easily in Xen case.

Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Petr Tesarik <ptesarik@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
OraBug: 26662731

(cherry picked from commit 29985b09613ba106a1ed0496988636d288600515)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Conflicts:
arch/x86/xen/mmu_pv.c
( no mmu_pv.c in our tree)

7 years agoInput: xen-kbdfront - add module parameter for setting resolution
Juergen Gross [Wed, 19 Apr 2017 15:47:06 +0000 (08:47 -0700)]
Input: xen-kbdfront - add module parameter for setting resolution

Add a parameter for setting the resolution of xen-kbdfront in order to
be able to cope with a (virtual) frame buffer of arbitrary resolution.

While at it remove the pointless second reading of parameters from
Xenstore in the device connection phase: all parameters are available
during device probing already and that is where they should be read.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
OraBug: 26662731

(cherry picked from commit 8b3afdfa48c70144479a2a5ca51a66e96ec60934)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Conflicts:
drivers/input/misc/xen-kbdfront.c
(xenbus_read_unsigned vs. xenbus_scanf)

7 years agoblkfront: add uevent for size change
Marc Olson [Tue, 11 Apr 2017 19:24:09 +0000 (12:24 -0700)]
blkfront: add uevent for size change

When a blkfront device is resized from dom0, emit a KOBJ_CHANGE uevent to
notify the guest about the change. This allows for custom udev rules, such
as automatically resizing a filesystem, when an event occurs.

With this patch you get these udev

KERNEL[577.206230] change   /devices/vbd-51728/block/xvdb (block)
UDEV  [577.226218] change   /devices/vbd-51728/block/xvdb (block)

Signed-off-by: Marc Olson <marcolso@amazon.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 26662731

(cherry picked from commit 89515d0255c918e08aa4085956c79bf17615fda5)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agox86/xen/time: Set ->min_delta_ticks and ->max_delta_ticks
Nicolai Stange [Thu, 30 Mar 2017 20:06:42 +0000 (22:06 +0200)]
x86/xen/time: Set ->min_delta_ticks and ->max_delta_ticks

In preparation for making the clockevents core NTP correction aware,
all clockevent device drivers must set ->min_delta_ticks and
->max_delta_ticks rather than ->min_delta_ns and ->max_delta_ns: a
clockevent device's rate is going to change dynamically and thus, the
ratio of ns to ticks ceases to stay invariant.

Make the x86 arch's xen clockevent driver initialize these fields properly.

This patch alone doesn't introduce any change in functionality as the
clockevents core still looks exclusively at the (untouched) ->min_delta_ns
and ->max_delta_ns. As soon as this has changed, a followup patch will
purge the initialization of ->min_delta_ns and ->max_delta_ns from this
driver.

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: xen-devel@lists.xenproject.org
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
OraBug: 26662731

(cherry picked from commit 3d18d661aaad5a22f4d37a0592acc9d784f2a11b)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Conflicts:
arch/x86/xen/time.c
(whitespaces)

7 years agoxen, fbfront: add support for specifying size via xenstore
Juergen Gross [Fri, 7 Apr 2017 15:03:24 +0000 (17:03 +0200)]
xen, fbfront: add support for specifying size via xenstore

Today xen-fbfront supports specifying the display size via module
parameters only. Add support for specifying the size via Xenstore in
order to enable doing this easily via the domain's Xen configuration.

Add an error message in case the configured display size conflicts
with video memory size.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
OraBug: 26662731

(cherry picked from commit 5a93db427ab170c9793d76abf3e4be1ebd09375f)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen: Create KABI-compatible version of struct xenbus_watch
Boris Ostrovsky [Wed, 31 May 2017 17:49:37 +0000 (13:49 -0400)]
xen: Create KABI-compatible version of struct xenbus_watch

Commit 5584ea250ae4 ("xen: modify xenstore watch event interface")
modified definition of struct xenbus_watch, breaking UEK KABI.

The patch introduces compat version of the struct, with callback_
pointing to KABI-compatible definition.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen, fbfront: fix connecting to backend
Juergen Gross [Fri, 7 Apr 2017 15:28:23 +0000 (17:28 +0200)]
xen, fbfront: fix connecting to backend

Connecting to the backend isn't working reliably in xen-fbfront: in
case XenbusStateInitWait of the backend has been missed the backend
transition to XenbusStateConnected will trigger the connected state
only without doing the actions required when the backend has
connected.

Cc: stable@vger.kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
OraBug: 26662731

(cherry picked from commit 9121b15b5628b38b4695282dc18c553440e0f79b)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxenbus: remove transaction holder from list before freeing
Jan Beulich [Tue, 4 Apr 2017 12:27:22 +0000 (06:27 -0600)]
xenbus: remove transaction holder from list before freeing

After allocation the item is being placed on the list right away.
Consequently it needs to be taken off the list before freeing in the
case xenbus_dev_request_and_reply() failed, as in that case the
callback (xenbus_dev_queue_reply()) is not being called (and if it
was called, it should do both).

Fixes: 5584ea250ae44f929feb4c7bd3877d1c5edbf813
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
OraBug: 26662731

(cherry picked from commit ac4cde398a96c1d28b1c28a0f69b6efd892a1c8a)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen/acpi: upload PM state from init-domain to Xen
Ankur Arora [Tue, 21 Mar 2017 22:43:38 +0000 (15:43 -0700)]
xen/acpi: upload PM state from init-domain to Xen

This was broken in commit cd979883b9ed ("xen/acpi-processor:
fix enabling interrupts on syscore_resume"). do_suspend (from
xen/manage.c) and thus xen_resume_notifier never get called on
the initial-domain at resume (it is if running as guest.)

The rationale for the breaking change was that upload_pm_data()
potentially does blocking work in syscore_resume(). This patch
addresses the original issue by scheduling upload_pm_data() to
execute in workqueue context.

Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: stable@vger.kernel.org
Based-on-patch-by: Konrad Wilk <konrad.wilk@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
OraBug: 26662731

(cherry picked from commit 1914f0cd203c941bba72f9452c8290324f1ef3dc)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen/acpi: Replace hard coded "ACPI0007"
Ankur Arora [Tue, 21 Mar 2017 22:43:37 +0000 (15:43 -0700)]
xen/acpi: Replace hard coded "ACPI0007"

Replace hard coded "ACPI0007" with ACPI_PROCESSOR_DEVICE_HID

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
OraBug: 26662731

(cherry picked from commit 1c2593cc8fd5960f8861de1be67135851f884836)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen/privcmd: add IOCTL_PRIVCMD_RESTRICT
Paul Durrant [Mon, 13 Feb 2017 17:03:24 +0000 (17:03 +0000)]
xen/privcmd: add IOCTL_PRIVCMD_RESTRICT

The purpose if this ioctl is to allow a user of privcmd to restrict its
operation such that it will no longer service arbitrary hypercalls via
IOCTL_PRIVCMD_HYPERCALL, and will check for a matching domid when
servicing IOCTL_PRIVCMD_DM_OP or IOCTL_PRIVCMD_MMAP*. The aim of this
is to limit the attack surface for a compromised device model.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
OraBug: 26662731

(cherry picked from commit 4610d240d691768203fdd210a5da0a2e02eddb76)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen/privcmd: Add IOCTL_PRIVCMD_DM_OP
Paul Durrant [Mon, 13 Feb 2017 17:03:23 +0000 (17:03 +0000)]
xen/privcmd: Add IOCTL_PRIVCMD_DM_OP

Recently a new dm_op[1] hypercall was added to Xen to provide a mechanism
for restricting device emulators (such as QEMU) to a limited set of
hypervisor operations, and being able to audit those operations in the
kernel of the domain in which they run.

This patch adds IOCTL_PRIVCMD_DM_OP as gateway for __HYPERVISOR_dm_op.

NOTE: There is no requirement for user-space code to bounce data through
      locked memory buffers (as with IOCTL_PRIVCMD_HYPERCALL) since
      privcmd has enough information to lock the original buffers
      directly.

[1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=524a98c2

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
OraBug: 26662731

(cherry picked from commit ab520be8cd5d56867fc95cfbc34b90880faf1f9d)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Conflicts:
arch/arm/xen/enlighten.c
arch/arm/xen/hypercall.S
arch/arm64/xen/hypercall.S
        (uek doesn't define HYPERVISOR_vm_assist() for ARM)
        arch/arm/include/asm/xen/hypercall.h
        (Upstream defines HYPERVISOR_dm_op() in
         include/xen/arm/hypercall.h, which uek4 does not have)

7 years agotpm xen: drop unneeded chip variable
Julia Lawall [Sun, 5 Feb 2017 21:39:37 +0000 (22:39 +0100)]
tpm xen: drop unneeded chip variable

The call that used chip was dropped in 1f0f30e404b3.  Drop the
leftover declaration and initialization.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
OraBug: 26662731

(cherry picked from commit 5cec5bacd37784a1596d2f8fea377212948a0bf4)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoswiotlb-xen: implement xen_swiotlb_get_sgtable callback
Andrii Anisov [Tue, 7 Feb 2017 17:58:03 +0000 (19:58 +0200)]
swiotlb-xen: implement xen_swiotlb_get_sgtable callback

Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad@kernel.org>
OraBug: 26662731

(cherry picked from commit 69369f52d28a34c84acb6f2a8a585e743441566a)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoswiotlb-xen: implement xen_swiotlb_dma_mmap callback
Stefano Stabellini [Tue, 7 Feb 2017 17:58:02 +0000 (19:58 +0200)]
swiotlb-xen: implement xen_swiotlb_dma_mmap callback

This function creates userspace mapping for the DMA-coherent memory.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad@kernel.org>
OraBug: 26662731

(cherry picked from commit 7e91c7df29b5e196de3dc6f086c8937973bd0b88)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen/privcmd: return -ENOTTY for unimplemented IOCTLs
Paul Durrant [Mon, 13 Feb 2017 17:03:22 +0000 (17:03 +0000)]
xen/privcmd: return -ENOTTY for unimplemented IOCTLs

The code sets the default return code to -ENOSYS but then overrides this
to -EINVAL in the switch() statement's default case, which is clearly
silly.

This patch removes the override and sets the default return code to
-ENOTTY, which is the conventional return for an unimplemented ioctl.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
OraBug: 26662731

(cherry picked from commit dc9eab6fd94dd26340749321bba2c58634761516)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen: optimize xenbus driver for multiple concurrent xenstore accesses
Juergen Gross [Thu, 9 Feb 2017 13:39:58 +0000 (14:39 +0100)]
xen: optimize xenbus driver for multiple concurrent xenstore accesses

Handling of multiple concurrent Xenstore accesses through xenbus driver
either from the kernel or user land is rather lame today: xenbus is
capable to have one access active only at one point of time.

Rewrite xenbus to handle multiple requests concurrently by making use
of the request id of the Xenstore protocol. This requires to:

- Instead of blocking inside xb_read() when trying to read data from
  the xenstore ring buffer do so only in the main loop of
  xenbus_thread().

- Instead of doing writes to the xenstore ring buffer in the context of
  the caller just queue the request and do the write in the dedicated
  xenbus thread.

- Instead of just forwarding the request id specified by the caller of
  xenbus to xenstore use a xenbus internal unique request id. This will
  allow multiple outstanding requests.

- Modify the locking scheme in order to allow multiple requests being
  active in parallel.

- Instead of waiting for the reply of a user's xenstore request after
  writing the request to the xenstore ring buffer return directly to
  the caller and do the waiting in the read path.

Additionally signal handling was optimized by avoiding waking up the
xenbus thread or sending an event to Xenstore in case the addressed
entity is known to be running already.

As a result communication with Xenstore is sped up by a factor of up
to 5: depending on the request type (read or write) and the amount of
data transferred the gain was at least 20% (small reads) and went up to
a factor of 5 for large writes.

In the end some more rough edges of xenbus have been smoothed:

- Handling of memory shortage when reading from xenstore ring buffer in
  the xenbus driver was not optimal: it was busy looping and issuing a
  warning in each loop.

- In case of xenstore not running in dom0 but in a stubdom we end up
  with two xenbus threads running as the initialization of xenbus in
  dom0 expecting a local xenstored will be redone later when connecting
  to the xenstore domain. Up to now this was no problem as locking
  would prevent the two xenbus threads interfering with each other, but
  this was just a waste of kernel resources.

- An out of memory situation while writing to or reading from the
  xenstore ring buffer no longer will lead to a possible loss of
  synchronization with xenstore.

- The user read and write part are now interruptible by signals.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
OraBug: 26662731

(cherry picked from commit fd8aa9095a95c02dcc35540a263267c29b8fda9d)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Conflicts:
drivers/xen/xenbus/xenbus_comms.c
        (Use mb() instead of virt_mb() for consistency in our code)

7 years agoxen: modify xenstore watch event interface
Juergen Gross [Thu, 9 Feb 2017 13:39:57 +0000 (14:39 +0100)]
xen: modify xenstore watch event interface

Today a Xenstore watch event is delivered via a callback function
declared as:

void (*callback)(struct xenbus_watch *,
                 const char **vec, unsigned int len);

As all watch events only ever come with two parameters (path and token)
changing the prototype to:

void (*callback)(struct xenbus_watch *,
                 const char *path, const char *token);

is the natural thing to do.

Apply this change and adapt all users.

Cc: konrad.wilk@oracle.com
Cc: roger.pau@citrix.com
Cc: wei.liu2@citrix.com
Cc: paul.durrant@citrix.com
Cc: netdev@vger.kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
OraBug: 26662731

(cherry picked from commit 5584ea250ae44f929feb4c7bd3877d1c5edbf813)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Conflicts:
drivers/xen/ovmapi.c
(Updated for new interface)

7 years agoxen: clean up xenbus internal headers
Juergen Gross [Thu, 9 Feb 2017 13:39:56 +0000 (14:39 +0100)]
xen: clean up xenbus internal headers

The xenbus driver has an awful mixture of internally and globally
visible headers: some of the internally used only stuff is defined in
the global header include/xen/xenbus.h while some stuff defined in
internal headers is used by other drivers, too.

Clean this up by moving the externally used symbols to
include/xen/xenbus.h and the symbols used internally only to a new
header drivers/xen/xenbus/xenbus.h replacing xenbus_comms.h and
xenbus_probe.h

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
OraBug: 26662731

(cherry picked from commit 332f791dc98d98116f4473b726f67c9321b0f31e)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Conflicts:
drivers/xen/xenbus/xenbus_dev_frontend.c
        (MODULE_LICENSE() is present in our code)

7 years agoxenbus: Neaten xenbus_va_dev_error
Joe Perches [Wed, 8 Feb 2017 11:33:36 +0000 (03:33 -0800)]
xenbus: Neaten xenbus_va_dev_error

This function error patch can be simplified, so do so.

Remove fail: label and somewhat obfuscating, used once "error_path"
function.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
OraBug: 26662731

(cherry picked from commit c0d197d55e8e8aeeea55f79bdf67e1c957bfa25d)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen/pvh: Use Xen's emergency_restart op for PVH guests
Boris Ostrovsky [Mon, 6 Feb 2017 15:58:06 +0000 (10:58 -0500)]
xen/pvh: Use Xen's emergency_restart op for PVH guests

Using native_machine_emergency_restart (called during reboot) will
lead PVH guests to machine_real_restart()  where we try to use
real_mode_header which is not initialized.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
OraBug: 26662731

(cherry picked from commit 7a1c44ebc5ac2e2c28d95b0da6060728c334e7e4)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen/pvh: Enable CPU hotplug
Boris Ostrovsky [Mon, 6 Feb 2017 15:58:05 +0000 (10:58 -0500)]
xen/pvh: Enable CPU hotplug

PVH guests don't (yet) receive ACPI hotplug interrupts and therefore
need to monitor xenstore for CPU hotplug event.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
OraBug: 26662731

(cherry picked from commit 2a7197f02dddf1f9cee300bd12512375ed56524a)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen/pvh: PVH guests always have PV devices
Boris Ostrovsky [Mon, 6 Feb 2017 15:57:15 +0000 (10:57 -0500)]
xen/pvh: PVH guests always have PV devices

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 26662731

(cherry picked from commit bcc57df281d93dfa502c824e9f73e0191c3f7c34)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen/pvh: Initialize grant table for PVH guests
Boris Ostrovsky [Mon, 6 Feb 2017 15:56:15 +0000 (10:56 -0500)]
xen/pvh: Initialize grant table for PVH guests

Like PV guests, PVH does not have PCI devices and therefore cannot
use MMIO space to store grants. Instead it balloons out memory and
keeps grants there.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
OraBug: 26662731

(cherry picked from commit 8613d78ab09900010a5b32e78ff229f551d661d6)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen/pvh: Make sure we don't use ACPI_IRQ_MODEL_PIC for SCI
Boris Ostrovsky [Mon, 6 Feb 2017 00:50:58 +0000 (19:50 -0500)]
xen/pvh: Make sure we don't use ACPI_IRQ_MODEL_PIC for SCI

Since we are not using PIC and (at least currently) don't have IOAPIC
we want to make sure that acpi_irq_model doesn't stay set to
ACPI_IRQ_MODEL_PIC (which is the default value). If we allowed it to
stay then acpi_os_install_interrupt_handler() would try (and fail) to
request_irq() for PIC.

Instead we set the model to ACPI_IRQ_MODEL_PLATFORM which will prevent
this from happening.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
OraBug: 26662731

(cherry picked from commit 5adad168e586cb381633f45d181bb729b04393a5)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen/pvh: Bootstrap PVH guest
Boris Ostrovsky [Mon, 6 Feb 2017 00:50:52 +0000 (19:50 -0500)]
xen/pvh: Bootstrap PVH guest

Start PVH guest at XEN_ELFNOTE_PHYS32_ENTRY address. Setup hypercall
page, initialize boot_params, enable early page tables.

Since this stub is executed before kernel entry point we cannot use
variables in .bss which is cleared by kernel. We explicitly place
variables that are initialized here into .data.

While adjusting xen_hvm_init_shared_info() make it use cpuid_e?x()
instead of cpuid() (wherever possible).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
OraBug: 26662731

(cherry picked from commit 7243b93345f7f8de260e8f5b4670803e64fcbb00)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen/pvh: Import PVH-related Xen public interfaces
Boris Ostrovsky [Fri, 3 Feb 2017 21:57:22 +0000 (16:57 -0500)]
xen/pvh: Import PVH-related Xen public interfaces

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 26662731

(cherry picked from commit cee2cfb7d18db1d7bcd0952b5e0e3f19c004023c)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen/x86: Remove PVH support
Boris Ostrovsky [Fri, 3 Feb 2017 21:57:22 +0000 (16:57 -0500)]
xen/x86: Remove PVH support

We are replacing existing PVH guests with new implementation.

We are keeping xen_pvh_domain() macro (for now set to zero) because
when we introduce new PVH implementation later in this series we will
reuse current PVH-specific code (xen_pvh_gnttab_setup()), and that
code is conditioned by 'if (xen_pvh_domain())'. (We will also need
a noop xen_pvh_domain() for !CONFIG_XEN_PVH).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 26662731

(cherry picked from commit 063334f30543597430f172bd7690d21e3590e148)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Conflicts:
       arch/x86/xen/enlighten.c
       (We don't have xen_dom0_set_legacy_features() and xen_cpuhp_setup())
       arch/x86/xen/smp.h
       (xen_smp_intr_*() are not in our header files)
       arch/x86/xen/xen-head.S
       (difference in a comment)
       include/xen/xen.h
       (xen_have_vector_callback is not in upstream's definition of xen_pvh_domain())

7 years agoxen/manage: correct return value check on xenbus_scanf()
Jan Beulich [Fri, 3 Feb 2017 08:54:05 +0000 (01:54 -0700)]
xen/manage: correct return value check on xenbus_scanf()

A negative return value indicates an error; in fact the function at
present won't ever return zero.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
OraBug: 26662731

(cherry picked from commit 4fed1b125eb6252bde478665fc05d4819f774fa8)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen/netback: set default upper limit of tx/rx queues to 8
Juergen Gross [Tue, 10 Jan 2017 13:32:52 +0000 (14:32 +0100)]
xen/netback: set default upper limit of tx/rx queues to 8

The default for the maximum number of tx/rx queues of one interface is
the number of cpus of the system today. As each queue pair reserves 512
grant pages this default consumes a ridiculous number of grants for
large guests.

Limit the queue number to 8 as default. This value can be modified
via a module parameter if required.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
OraBug: 26662731

(cherry picked from commit 56dd5af9bc23d0d5d23bb207c477715b4c2216c5)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agoxen/netfront: set default upper limit of tx/rx queues to 8
Juergen Gross [Tue, 10 Jan 2017 13:32:51 +0000 (14:32 +0100)]
xen/netfront: set default upper limit of tx/rx queues to 8

The default for the number of tx/rx queues of one interface is the
number of vcpus of the system today. As each queue pair reserves 512
grant pages this default consumes a ridiculous number of grants for
large guests.

Limit the queue number to 8 as default. This value can be modified
via a module parameter if required.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
OraBug: 26662731

(cherry picked from commit 034702a64a6692a8d5d0d9630064a014fc633728)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
7 years agords: reduce memory footprint for RDS when transport is RDMA
Ka-Cheong Poon [Thu, 31 Aug 2017 03:16:26 +0000 (20:16 -0700)]
rds: reduce memory footprint for RDS when transport is RDMA

RDS over IB does not use multipath RDS, so the array
of additional rds_conn_path structures is always superfluous
in this case. Reduce the memory footprint of the rds module
by making this a dynamic allocation predicated on whether
the transport is mp_capable.

Orabug: 26412003

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Tested-by: Efrain Galaviz <efrain.galaviz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Avinash Repaka <avinash.repaka@oracle.com>
(cherry picked from commit 840df162b3eb3ec02e2613411fad1285a0017c13)

7 years agoRDS: IB: Destroy rdma_cm_id when unloading module
Avinash Repaka [Fri, 21 Oct 2016 22:34:19 +0000 (15:34 -0700)]
RDS: IB: Destroy rdma_cm_id when unloading module

rdma_cm_id is not being destroyed as part of module_exit(). This would
lead to memory corruption and eventual node crash if rds_rdma module is
unloaded.

To avoid this, stop listening(destroy rdma_cm_id) as the first step of
unloading the module and curb any more connection attempts.

This undoes the change made by commit 4d4692145bf7 ("RDS: Drop stale
iWARP support")

Orabug: 26089296

Signed-off-by: Avinash Repaka <avinash.repaka@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
7 years agoRDS: IB: Destroy aux_wq if rds_ib_init() fails
Avinash Repaka [Thu, 31 Aug 2017 21:37:42 +0000 (14:37 -0700)]
RDS: IB: Destroy aux_wq if rds_ib_init() fails

This patch destroys rds_aux_wq in rds_ib_init() error path to avoid
memory leak. Since destroy_workqueue() waits on pending work to be done,
we can remove flush_workqueue() prior to destroying.

Orabug: 26732887

Signed-off-by: Avinash Repaka <avinash.repaka@oracle.com>
Reviewed-by: Gerd Rausch <gerd.rausch@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
7 years agofuse: Dont call set_page_dirty_lock() for ITER_BVEC pages for async_dio
Ashish Samant [Fri, 11 Aug 2017 21:59:36 +0000 (14:59 -0700)]
fuse: Dont call set_page_dirty_lock() for ITER_BVEC pages for async_dio

Orabug: 26752442

Commit 8fba54aebbdf ("fuse: direct-io: don't dirty ITER_BVEC pages") fixes
the ITER_BVEC page deadlock for direct io in fuse by checking in
fuse_direct_io(), whether the page is a bvec page or not, before locking
it.  However, this check is missed when the "async_dio" mount option is
enabled.  In this case, set_page_dirty_lock() is called from the req->end
callback in request_end(), when the fuse thread is returning from userspace
to respond to the read request.  This will cause the same deadlock because
the bvec condition is not checked in this path.

Here is the stack of the deadlocked thread, while returning from userspace:

[13706.656686] INFO: task glusterfs:3006 blocked for more than 120 seconds.
[13706.657808] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[13706.658788] glusterfs       D ffffffff816c80f0     0  3006      1
0x00000080
[13706.658797]  ffff8800d6713a58 0000000000000086 ffff8800d9ad7000
ffff8800d9ad5400
[13706.658799]  ffff88011ffd5cc0 ffff8800d6710008 ffff88011fd176c0
7fffffffffffffff
[13706.658801]  0000000000000002 ffffffff816c80f0 ffff8800d6713a78
ffffffff816c790e
[13706.658803] Call Trace:
[13706.658809]  [<ffffffff816c80f0>] ? bit_wait_io_timeout+0x80/0x80
[13706.658811]  [<ffffffff816c790e>] schedule+0x3e/0x90
[13706.658813]  [<ffffffff816ca7e5>] schedule_timeout+0x1b5/0x210
[13706.658816]  [<ffffffff81073ffb>] ? gup_pud_range+0x1db/0x1f0
[13706.658817]  [<ffffffff810668fe>] ? kvm_clock_read+0x1e/0x20
[13706.658819]  [<ffffffff81066909>] ? kvm_clock_get_cycles+0x9/0x10
[13706.658822]  [<ffffffff810f5792>] ? ktime_get+0x52/0xc0
[13706.658824]  [<ffffffff816c6f04>] io_schedule_timeout+0xa4/0x110
[13706.658826]  [<ffffffff816c8126>] bit_wait_io+0x36/0x50
[13706.658828]  [<ffffffff816c7d06>] __wait_on_bit_lock+0x76/0xb0
[13706.658831]  [<ffffffffa0545636>] ? lock_request+0x46/0x70 [fuse]
[13706.658834]  [<ffffffff8118800a>] __lock_page+0xaa/0xb0
[13706.658836]  [<ffffffff810c8500>] ? wake_atomic_t_function+0x40/0x40
[13706.658838]  [<ffffffff81194d08>] set_page_dirty_lock+0x58/0x60
[13706.658841]  [<ffffffffa054d968>] fuse_release_user_pages+0x58/0x70 [fuse]
[13706.658844]  [<ffffffffa0551430>] ? fuse_aio_complete+0x190/0x190 [fuse]
[13706.658847]  [<ffffffffa0551459>] fuse_aio_complete_req+0x29/0x90 [fuse]
[13706.658849]  [<ffffffffa05471e9>] request_end+0xd9/0x190 [fuse]
[13706.658852]  [<ffffffffa0549126>] fuse_dev_do_write+0x336/0x490 [fuse]
[13706.658854]  [<ffffffffa054963e>] fuse_dev_write+0x6e/0xa0 [fuse]
[13706.658857]  [<ffffffff812a9ef3>] ? security_file_permission+0x23/0x90
[13706.658859]  [<ffffffff81205300>] do_iter_readv_writev+0x60/0x90
[13706.658862]  [<ffffffffa05495d0>] ? fuse_dev_splice_write+0x350/0x350
[fuse]
[13706.658863]  [<ffffffff812062a1>] do_readv_writev+0x171/0x1f0
[13706.658866]  [<ffffffff810b3d00>] ? try_to_wake_up+0x210/0x210
[13706.658868]  [<ffffffff81206361>] vfs_writev+0x41/0x50
[13706.658870]  [<ffffffff81206496>] SyS_writev+0x56/0xf0
[13706.658872]  [<ffffffff810257a1>] ? syscall_trace_leave+0xf1/0x160
[13706.658874]  [<ffffffff816cbb2e>] system_call_fastpath+0x12/0x71

Fix this by making should_dirty a fuse_io_priv parameter that can be
checked in fuse_aio_complete_req().

(cherrypicked from commit 61c12b49e1c9c77d7a1bcc161de540d0fd21cf0c)

Reported-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Joe Jin <joe.jin@oracle.com>
7 years agords: Reintroduce statistics counting
Håkon Bugge [Tue, 8 Aug 2017 09:13:32 +0000 (11:13 +0200)]
rds: Reintroduce statistics counting

In commit 7e3f2952eeb1 ("rds: don't let RDS shutdown a connection
while senders are present"), refilling the receive queue was removed
from rds_ib_recv(), along with the increment of
s_ib_rx_refill_from_thread.

Commit 73ce4317bf98 ("RDS: make sure we post recv buffers")
re-introduces filling the receive queue from rds_ib_recv(), but does
not add the statistics counter. rds_ib_recv() was later renamed to
rds_ib_recv_path().

This commit reintroduces the statistics counting of
s_ib_rx_refill_from_thread and s_ib_rx_refill_from_cq.

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry-picked from upstream 05bfd7dbb53a10be4a3e7aebaeec04b558198d49)

Conflicts:
net/rds/ib_recv.c

Orabug: 26717115

Reviewed-by: Avinash Repaka <avinash.repaka@oracle.com>
7 years agorcu: sysctl: Panic on RCU Stall
Daniel Bristot de Oliveira [Thu, 2 Jun 2016 16:51:41 +0000 (13:51 -0300)]
rcu: sysctl: Panic on RCU Stall

Orabug: 26338027

It is not always easy to determine the cause of an RCU stall just by
analysing the RCU stall messages, mainly when the problem is caused
by the indirect starvation of rcu threads. For example, when preempt_rcu
is not awakened due to the starvation of a timer softirq.

We have been hard coding panic() in the RCU stall functions for
some time while testing the kernel-rt. But this is not possible in
some scenarios, like when supporting customers.

This patch implements the sysctl kernel.panic_on_rcu_stall. If
set to 1, the system will panic() when an RCU stall takes place,
enabling the capture of a vmcore. The vmcore provides a way to analyze
all kernel/tasks states, helping out to point to the culprit and the
solution for the stall.

The kernel.panic_on_rcu_stall sysctl is disabled by default.

Changes from v1:
- Fixed a typo in the git log
- The if(sysctl_panic_on_rcu_stall) panic() is in a static function
- Fixed the CONFIG_TINY_RCU compilation issue
- The var sysctl_panic_on_rcu_stall is now __read_mostly

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Tested-by: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
(cherry picked from commit 088e9d253d3a4ab7e058dd84bb532c32dadf1882)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
Documentation/sysctl/kernel.txt

7 years agosched: check user input value of sysctl_sched_time_avg v4.1.12-111.0.20170907_2225
Ethan Zhao [Fri, 8 Sep 2017 00:11:55 +0000 (08:11 +0800)]
sched: check user input value of sysctl_sched_time_avg

Orabug: 26371482

System will hang if user set sysctl_sched_time_avg to 0 by

[root@XXX ~]# sysctl kernel.sched_time_avg_ms=0

Stack traceback for pid 0
0xffff883f6406c600 0 0 1 3 R 0xffff883f6406cf50 *swapper/3
ffff883f7ccc3ae8 0000000000000018 ffffffff810c4dd0 0000000000000000
0000000000017800 ffff883f7ccc3d78 0000000000000003 ffff883f7ccc3bf8
ffffffff810c4fc9 ffff883f7ccc3c08 00000000810c5043 ffff883f7ccc3c08
Call Trace:
<IRQ> [<ffffffff810c4dd0>] ? update_group_capacity+0x110/0x200
[<ffffffff810c4fc9>] ? update_sd_lb_stats+0x109/0x600
[<ffffffff810c5507>] ? find_busiest_group+0x47/0x530
[<ffffffff810c5b84>] ? load_balance+0x194/0x900
[<ffffffff810ad5ca>] ? update_rq_clock.part.83+0x1a/0xe0
[<ffffffff810c6d42>] ? rebalance_domains+0x152/0x290
[<ffffffff810c6f5c>] ? run_rebalance_domains+0xdc/0x1d0
[<ffffffff8108a75b>] ? __do_softirq+0xfb/0x320
[<ffffffff8108ac85>] ? irq_exit+0x125/0x130
[<ffffffff810b3a17>] ? scheduler_ipi+0x97/0x160
[<ffffffff81052709>] ? smp_reschedule_interrupt+0x29/0x30
[<ffffffff8173a1be>] ? reschedule_interrupt+0x6e/0x80
 <EOI> [<ffffffff815bc83c>] ? cpuidle_enter_state+0xcc/0x230
[<ffffffff815bc80c>] ? cpuidle_enter_state+0x9c/0x230
[<ffffffff815bc9d7>] ? cpuidle_enter+0x17/0x20
[<ffffffff810cd6dc>] ? cpu_startup_entry+0x38c/0x420
[<ffffffff81053373>] ? start_secondary+0x173/0x1e0

Because divide-by-zero error happens in function

update_group_capacity()
  update_cpu_capacity()
    scale_rt_capacity()
     {
          ...
          total = sched_avg_period() + delta;
          used = div_u64(avg, total);
          ...
     }

Seems this issue could be reproduced on all I tried stable 4.1 - last
kernel.

To fix this issue, check user input value of sysctl_sched_time_avg, keep
it unchanged when hit invalid input, set the min limit of
sysctl_sched_time_avg to 1 ms.

Reported-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed by: Rohit Jain <rohit.k.jain@oracle.com>

7 years agogenirq: Revert sparse irq locking around __cpu_up() and move it to x86 for now
Thomas Gleixner [Tue, 14 Jul 2015 20:03:57 +0000 (22:03 +0200)]
genirq: Revert sparse irq locking around __cpu_up() and move it to x86 for now

Boris reported that the sparse_irq protection around __cpu_up() in the
generic code causes a regression on Xen. Xen allocates interrupts and
some more in the xen_cpu_up() function, so it deadlocks on the
sparse_irq_lock.

There is no simple fix for this and we really should have the
protection for all architectures, but for now the only solution is to
move it to x86 where actual wreckage due to the lack of protection has
been observed.

Reported-and-tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Fixes: a89941816726 'hotplug: Prevent alloc/free of irq descriptors during cpu up/down'
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: xiao jin <jin.xiao@intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>
(cherry picked from commit ce0d3c0a6fb1422101498ef378c0851dabbbf67f)

Orabug: 25671838

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed-by: Brian Maly <brian.maly@oracle.com>
7 years agodtrace: Update UEK RPM specs
Tomas Jedlicka [Tue, 29 Aug 2017 13:50:34 +0000 (15:50 +0200)]
dtrace: Update UEK RPM specs

RPM package adjustments can't be done on topic branch as there is no
uek-rpm directory with the specfiles.

Orabug: 26585689

Signed-off-by: Tomas Jedlicka <tomas.jedlicka@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
7 years agoMerge remote-tracking branch 'linux-dtrace/gpl/topic/uek-4.1/dtrace' into uek-next
Tomas Jedlicka [Fri, 1 Sep 2017 10:15:55 +0000 (12:15 +0200)]
Merge remote-tracking branch 'linux-dtrace/gpl/topic/uek-4.1/dtrace' into uek-next

7 years agouek-rpm: Enable generic driver for Hyper-V VMBus
Somasundaram Krishnasamy [Mon, 21 Aug 2017 19:09:28 +0000 (12:09 -0700)]
uek-rpm: Enable generic driver for Hyper-V VMBus

Orabug: 26668113

Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agouio-hv-generic: store physical addresses instead of virtual
Arnd Bergmann [Fri, 9 Dec 2016 11:44:40 +0000 (12:44 +0100)]
uio-hv-generic: store physical addresses instead of virtual

Orabug: 26668113

gcc warns about the newly added driver when phys_addr_t is wider than
a pointer:

drivers/uio/uio_hv_generic.c: In function 'hv_uio_mmap':
drivers/uio/uio_hv_generic.c:71:17: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
    virt_to_phys((void *)info->mem[mi].addr) >> PAGE_SHIFT,
drivers/uio/uio_hv_generic.c: In function 'hv_uio_probe':
drivers/uio/uio_hv_generic.c:140:5: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
   = (phys_addr_t)dev->channel->ringbuffer_pages;
drivers/uio/uio_hv_generic.c:147:3: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
   (phys_addr_t)vmbus_connection.int_page;
drivers/uio/uio_hv_generic.c:153:3: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
   (phys_addr_t)vmbus_connection.monitor_pages[1];

I can't see why we store a virtual address in a phys_addr_t here,
as the only user of that variable converts it into a physical
address anyway, so this moves the conversion to where it logically
fits according to the types.

Fixes: 95096f2fbd10 ("uio-hv-generic: new userspace i/o driver for VMBus")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 190cc65e912de7e8f7ebddcecfbf55a610281a8c)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agouio-hv-generic: new userspace i/o driver for VMBus
Stephen Hemminger [Sat, 3 Dec 2016 20:34:40 +0000 (12:34 -0800)]
uio-hv-generic: new userspace i/o driver for VMBus

Orabug: 26668113

This is a new driver to enable userspace networking on VMBus.
It is based largely on the similar driver that already exists
for PCI, and earlier work done by Brocade to support DPDK.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 95096f2fbd10186d3e78a328b327afc71428f65f)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agoasm-generic: implement virt_xxx memory barriers
Michael S. Tsirkin [Sun, 27 Dec 2015 16:23:01 +0000 (18:23 +0200)]
asm-generic: implement virt_xxx memory barriers

Orabug: 26668113

Guests running within virtual machines might be affected by SMP effects even if
the guest itself is compiled without SMP support.  This is an artifact of
interfacing with an SMP host while running an UP kernel.  Using mandatory
barriers for this use-case would be possible but is often suboptimal.

In particular, virtio uses a bunch of confusing ifdefs to work around
this, while xen just uses the mandatory barriers.

To better handle this case, low-level virt_mb() etc macros are made available.
These are implemented trivially using the low-level __smp_xxx macros,
the purpose of these wrappers is to annotate those specific cases.

These have the same effect as smp_mb() etc when SMP is enabled, but generate
identical code for SMP and non-SMP systems. For example, virtual machine guests
should use virt_mb() rather than smp_mb() when synchronizing against a
(possibly SMP) host.

Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
(cherry picked from commit 6a65d26385bf487926a0616650927303058551e3)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agox86: define __smp_xxx
Michael S. Tsirkin [Sun, 27 Dec 2015 13:04:42 +0000 (15:04 +0200)]
x86: define __smp_xxx

Orabug: 26668113

This defines __smp_xxx barriers for x86,
for use by virtualization.

smp_xxx barriers are removed as they are
defined correctly by asm-generic/barriers.h

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
(cherry picked from commit 1638fb72070f8faf2ac0787fafbb839d0c859d5b)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
arch/x86/include/asm/barrier.h

7 years agoasm-generic: add __smp_xxx wrappers
Michael S. Tsirkin [Sun, 27 Dec 2015 11:50:07 +0000 (13:50 +0200)]
asm-generic: add __smp_xxx wrappers

Orabug: 26668113

On !SMP, most architectures define their
barriers as compiler barriers.
On SMP, most need an actual barrier.

Make it possible to remove the code duplication for
!SMP by defining low-level __smp_xxx barriers
which do not depend on the value of SMP, then
use them from asm-generic conditionally.

Besides reducing code duplication, these low level APIs will also be
useful for virtualization, where a barrier is sometimes needed even if
!SMP since we might be talking to another kernel on the same SMP system.

Both virtio and Xen drivers will benefit.

The smp_xxx variants should use __smp_XXX ones or barrier() depending on
SMP, identically for all architectures.

We keep ifndef guards around them for now - once/if all
architectures are converted to use the generic
code, we'll be able to remove these.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
(cherry picked from commit a9e4252a9b147043142282ebb65da94dcb951e2a)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
include/asm-generic/barrier.h

7 years agox86: reuse asm-generic/barrier.h
Michael S. Tsirkin [Mon, 21 Dec 2015 07:22:18 +0000 (09:22 +0200)]
x86: reuse asm-generic/barrier.h

Orabug: 26668113

As on most architectures, on x86 read_barrier_depends and
smp_read_barrier_depends are empty.  Drop the local definitions and pull
the generic ones from asm-generic/barrier.h instead: they are identical.

This is in preparation to refactoring this code area.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
(cherry picked from commit 300b06d4555305dc227748674f75970f2f84c224)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
arch/x86/include/asm/barrier.h

7 years agouek-rpm: Fix package dependencies for kernel-ueknano
Somasundaram Krishnasamy [Tue, 15 Aug 2017 19:55:52 +0000 (12:55 -0700)]
uek-rpm: Fix package dependencies for kernel-ueknano

Orabug: 26639379

Since kernel-ueknano has no explicit dependencies specified in the spec
file, it fails to check or install linux-nano-firmware rpm. This commit
adds requires, provides, conflicts, obsoletes directives to kernel-ueknano
same as in kernel-uek package.

Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Acked-by: Ethan Zhao <ethan.zhao@oracle.com>
7 years agoxsigo: PCA 2.3.1 Compute Node panics in xve_create_arp+430
Pradeep Gopanapalli [Tue, 29 Aug 2017 18:59:54 +0000 (11:59 -0700)]
xsigo: PCA 2.3.1 Compute Node panics in xve_create_arp+430

Orabug: 26474000

After some idle time on a RC connection xve driver cleans up the idle entry
and by that time arp entry for that address would be flushed out.
Further, traffic re-establishment for that address is taken care by arps
generated by the system. But its not rare that arp entry for that
address is still present with driver cleaned up the path and in that
case xve driver generates arp/ndp packets to resolve the path.

While generating an vlan based arp, xve driver uses "vlan_eth_hdr" API
to fill vlan specific data. This API works as expected only if skb
mac_header is initialized to proper value before using API.
In xve_create_arp and xve_create_ndp newly allocated skb's will not have
correct mac_header offset and as a result the pointer returned by
vlan_eth_hdr function is not the correct address to use, some times this
will result in a kernel panic.

Call Trace:
BUG: unable to handle kernel paging request at ffff88005a47060b
IP: [<ffffffffa12e3614>] xve_create_arp+0x1ae/0x200 [xve]

The fix here is to directly typecast skb->data instead of using
"vlan_eth_hdr" API.

This change in xve of using "vlan_eth_hdr" was mostly introduced during
EDR/uVNIC development and is applicable only for UEK4.

Signed-off-by: Pradeep Gopanapalli <pradeep.gopanapalli@oracle.com>
Reviewed-by: Chien Yen <chien.yen@oracle.com>
7 years agodtrace: work around libdtrace-ctf bug
Nick Alcock [Fri, 4 Aug 2017 16:54:24 +0000 (17:54 +0100)]
dtrace: work around libdtrace-ctf bug

The bug involves synthesising pointers to types (ipaddr_t *, in
particular) when such pointer types do not appear in the CTF files but
are needed by the CTF itself.  This is working in standalone modules,
but not in modules with parent type containers.

As a workaround, pro tem before fixing this properly in libdtrace-ctf,
hack around it for the one type it is necessary for (a type that is used
in the DTrace system translators, so if this type does not resolve
correctly DTrace will not start).  A suitable workaround is simply to
introduce a use of this pointer type in the C code, and it so happens
that we have a place where this would fit perfectly well.

Orabug: 26583958
Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Reviewed-by: Tomas Jedlicka <tomas.jedlicka@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
7 years agodtrace: dtrace.ko won't build when DT_DISABLE_CTF is set
Tomas Jedlicka [Tue, 1 Aug 2017 14:38:44 +0000 (10:38 -0400)]
dtrace: dtrace.ko won't build when DT_DISABLE_CTF is set

Fix for build failure that can occur when CTF generation is disabled.

Orabug: 26587631

Signed-off-by: Tomas Jedlicka <tomas.jedlicka@oracle.com>
Reviewed-by: Nick Alcock <nick.alcock@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
7 years agodtrace: Integrate DTrace Modules into kernel proper
Tomas Jedlicka [Tue, 1 Aug 2017 13:15:44 +0000 (09:15 -0400)]
dtrace: Integrate DTrace Modules into kernel proper

This changeset integrates DTrace module sources into the main kernel
source tree under the GPLv2 license.  Sources have been moved to
appropriate locations in the kernel tree.

In addition a new RPM package is introduced: kernel-headers-dtrace.
This package is responsible for installation of DTrace related header
files for its userspace component.

Orabug: 26585689

Signed-off-by: Tomas Jedlicka <tomas.jedlicka@oracle.com>
Signed-off-by: David Mc Lean <david.mclean@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
7 years agoMerge modules branch to the kernel tree
Tomas Jedlicka [Tue, 29 Aug 2017 13:29:39 +0000 (15:29 +0200)]
Merge modules branch to the kernel tree

7 years agopacket: fix tp_reserve race in packet_set_ring
Willem de Bruijn [Thu, 10 Aug 2017 16:41:58 +0000 (12:41 -0400)]
packet: fix tp_reserve race in packet_set_ring

Orabug: 26681154
CVE: CVE-2017-1000111

Updates to tp_reserve can race with reads of the field in
packet_set_ring. Avoid this by holding the socket lock during
updates in setsockopt PACKET_RESERVE.

This bug was discovered by syzkaller.

Fixes: 8913336a7e8d ("packet: add PACKET_RESERVE sockopt")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c27927e372f0785f3303e8fad94b85945e2c97b7)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
7 years agoscsi: fnic: changing queue command to return result DID_IMM_RETRY when rport is init
Satish Kharat [Tue, 27 Jun 2017 00:49:06 +0000 (17:49 -0700)]
scsi: fnic: changing queue command to return result DID_IMM_RETRY when rport is init

Currently the queue command returns DID_NO_CONNECT anytime the rport is
not in RPORT_ST_READY state. Changing it to return DID_NO_CONNECT only
when the rport is in RPORT_ST_DELETE state. When the rport is in one of
the init states retruning DID_IMM_RETRY.

Orabug: 26670341

Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 4a1108d6caf7cd0f15d439bee6163ba00c7c1b33)
Signed-off-by: Joe Jin <joe.jin@oracle.com>
7 years agousbip_recv(): switch to sock_recvmsg()
Al Viro [Sun, 21 Dec 2014 08:53:10 +0000 (03:53 -0500)]
usbip_recv(): switch to sock_recvmsg()

Orabug: 26668125

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 3995d1610713c1a62af687872e460c3dca82d17c)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago[net] drop 'size' argument of sock_recvmsg()
Al Viro [Sun, 15 Mar 2015 01:13:46 +0000 (21:13 -0400)]
[net] drop 'size' argument of sock_recvmsg()

Orabug: 26668125

all callers have it equal to msg_data_left(msg).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 2da62906b1e298695e1bb725927041cd59942c98)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agousbip: don't call stub_device_reset() during stub_disconnect()
Alexander Popov [Fri, 20 May 2016 09:37:28 +0000 (12:37 +0300)]
usbip: don't call stub_device_reset() during stub_disconnect()

Orabug: 26668125

stub_disconnect() calls stub_device_reset() during usb_unbind_device() when
usb device is locked. So usb_lock_device_for_reset() in stub_device_reset()
in that case polls for one second and returns -EBUSY anyway.

Remove useless flag USBIP_EH_RESET from SDEV_EVENT_REMOVED.

Signed-off-by: Alexander Popov <alpopov@ptsecurity.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 134a92659f9382f00f94b880183472b0769ad53e)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agousbip: vudc: Make usbip_common vudc-aware
Igor Kotrasinski [Tue, 8 Mar 2016 20:48:57 +0000 (21:48 +0100)]
usbip: vudc: Make usbip_common vudc-aware

Orabug: 26668125

Add constants for VUDC events in usbip_common.h
and make use of them in usbip_common.c.

This commit is a result of cooperation between Samsung R&D Institute
Poland and Open Operating Systems Student Society at University
of Warsaw (O2S3@UW) consisting of:

    Igor Kotrasinski <ikotrasinsk@gmail.com>
    Karol Kosik <karo9@interia.eu>
    Ewelina Kosmider <3w3lfin@gmail.com>
    Dawid Lazarczyk <lazarczyk.dawid@gmail.com>
    Piotr Szulc <ps347277@students.mimuw.edu.pl>

Tutor and project owner:
    Krzysztof Opasiak <k.opasiak@samsung.com>

Signed-off-by: Igor Kotrasinski <i.kotrasinsk@samsung.com>
Signed-off-by: Karol Kosik <karo9@interia.eu>
[Small fixes and commit message update]
Signed-off-by: Krzysztof Opasiak <k.opasiak@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c7af4c221878ad684f0412eba2a1f18fa126c6d5)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agousbip: event handler as one thread
Nobuo Iwata [Thu, 24 Mar 2016 01:50:59 +0000 (10:50 +0900)]
usbip: event handler as one thread

Orabug: 26668125

Dear all,

1. Overview

In current USB/IP implementation, event kernel threads are created for
each port. The functions of the threads are closing connection and
error handling so they don't have not so many events to handle. There's
no need to have thread for each port.

BEFORE) vhci side - VHCI_NPORTS(8) threads are created.
$ ps aux | grep usbip
root     10059  0.0  0.0      0     0 ?        S    17:06   0:00 [usbip_eh]
root     10060  0.0  0.0      0     0 ?        S    17:06   0:00 [usbip_eh]
root     10061  0.0  0.0      0     0 ?        S    17:06   0:00 [usbip_eh]
root     10062  0.0  0.0      0     0 ?        S    17:06   0:00 [usbip_eh]
root     10063  0.0  0.0      0     0 ?        S    17:06   0:00 [usbip_eh]
root     10064  0.0  0.0      0     0 ?        S    17:06   0:00 [usbip_eh]
root     10065  0.0  0.0      0     0 ?        S    17:06   0:00 [usbip_eh]
root     10066  0.0  0.0      0     0 ?        S    17:06   0:00 [usbip_eh]

BEFORE) stub side - threads will be created every bind operation.
$ ps aux | grep usbip
root      8368  0.0  0.0      0     0 ?        S    17:56   0:00 [usbip_eh]
root      8399  0.0  0.0      0     0 ?        S    17:56   0:00 [usbip_eh]

This patch put event threads of stub and vhci driver as one workqueue.

AFTER) only one event threads in each vhci and stub side.
$ ps aux | grep usbip
root     10457  0.0  0.0      0     0 ?        S<   17:47   0:00 [usbip_event]

2. Modification to usbip_event.c

BEFORE) kernel threads are created in usbip_start_eh().

AFTER) one workqueue is created in new usbip_init_eh().

Event handler which was main loop of kernel thread is modified to
workqueue handler.

Events themselves are stored in struct usbip_device - same as before.
usbip_devices which have event are listed in event_list.

The handler picks an element from the list and wakeup usbip_device. The
wakeup method is same as before.

usbip_in_eh() substitutes statement which checks whether functions are
called from eh_ops or not. In this function, the worker context is used
for the checking. The context will be set in a variable in the
beginning of first event handling. usbip_in_eh() is used in event
handler so it works well.

3. Modifications to programs using usbip_event.c

Initialization and termination of workqueue are added to init and exit
routine of usbip_core respectively.

A. version info

v2)
build fail at 1/2 casued unmodified user programs in 1/2.

Signed-off-by: Nobuo Iwata <nobuo.iwata@fujixerox.co.jp>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit bb7871ad99ea814565c3d6b551e039c71f24cbb3)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agousb: usbip: Fix possible deadlocks reported by lockdep
Andrew Goodbody [Tue, 2 Feb 2016 17:36:39 +0000 (17:36 +0000)]
usb: usbip: Fix possible deadlocks reported by lockdep

Orabug: 26668125

Change spin_lock calls to spin_lock_irqsave to prevent
attmpted recursive lock taking in interrupt context.

This patch fixes Bug 109351
  https://bugzilla.kernel.org/show_bug.cgi?id=109351

Signed-off-by: Andrew Goodbody <andrew.goodbody@cambrionix.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 21619792d1eca7e772ca190ba68588e57f29595b)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
drivers/usb/usbip/vhci_hcd.c

7 years agouek-rpm: Enable USB/IP support (Add usbip-core driver).
Somasundaram Krishnasamy [Wed, 23 Aug 2017 18:16:15 +0000 (11:16 -0700)]
uek-rpm: Enable USB/IP support (Add usbip-core driver).

Orabug: 26668125

Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agox86/nmi: Save regs in crash dump on external NMI
Hidehiro Kawai [Mon, 14 Dec 2015 10:19:13 +0000 (11:19 +0100)]
x86/nmi: Save regs in crash dump on external NMI

Now, multiple CPUs can receive an external NMI simultaneously by
specifying the "apic_extnmi=all" command line parameter. When we take
a crash dump by using external NMI with this option, we fail to save
registers into the crash dump. This happens as follows:

  CPU 0                              CPU 1
  ================================   =============================
  receive an external NMI
  default_do_nmi()                   receive an external NMI
    spin_lock(&nmi_reason_lock)      default_do_nmi()
    io_check_error()                   spin_lock(&nmi_reason_lock)
      panic()                            busy loop
      ...
        kdump_nmi_shootdown_cpus()
          issue NMI IPI -----------> blocked until IRET
                                         busy loop...

Here, since CPU 1 is in NMI context, an additional NMI from CPU 0
remains unhandled until CPU 1 IRETs. However, CPU 1 will never execute
IRET so the NMI is not handled and the callback function to save
registers is never called.

To solve this issue, we check if the IPI for crash dumping was issued
while waiting for nmi_reason_lock to be released, and if so, call its
callback function directly. If the IPI is not issued (e.g. kdump is
disabled), the actual behavior doesn't change.

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: kexec@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20151210065245.4587.39316.stgit@softrs
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
(cherry picked from commit b279d67df88a49c6ca32b3eebd195660254be394)

Orabug: 25679037

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agoaio: mark AIO pseudo-fs noexec
Jann Horn [Thu, 15 Sep 2016 22:31:22 +0000 (00:31 +0200)]
aio: mark AIO pseudo-fs noexec

This ensures that do_mmap() won't implicitly make AIO memory mappings
executable if the READ_IMPLIES_EXEC personality flag is set.  Such
behavior is problematic because the security_mmap_file LSM hook doesn't
catch this case, potentially permitting an attacker to bypass a W^X
policy enforced by SELinux.

I have tested the patch on my machine.

To test the behavior, compile and run this:

    #define _GNU_SOURCE
    #include <unistd.h>
    #include <sys/personality.h>
    #include <linux/aio_abi.h>
    #include <err.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <sys/syscall.h>

    int main(void) {
        personality(READ_IMPLIES_EXEC);
        aio_context_t ctx = 0;
        if (syscall(__NR_io_setup, 1, &ctx))
            err(1, "io_setup");

        char cmd[1000];
        sprintf(cmd, "cat /proc/%d/maps | grep -F '/[aio]'",
            (int)getpid());
        system(cmd);
        return 0;
    }

In the output, "rw-s" is good, "rwxs" is bad.

Signed-off-by: Jann Horn <jann@thejh.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 22f6b4d34fcf039c63a94e7670e0da24f8575a5a)

Orabug: 26540416
CVE: CVE-2016-10044

Signed-off-by: Tim Tianyang Chen <tianyang.chen@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agovfs: Commit to never having exectuables on proc and sysfs.
Eric W. Biederman [Mon, 29 Jun 2015 19:42:03 +0000 (14:42 -0500)]
vfs: Commit to never having exectuables on proc and sysfs.

Today proc and sysfs do not contain any executable files.  Several
applications today mount proc or sysfs without noexec and nosuid and
then depend on there being no exectuables files on proc or sysfs.
Having any executable files show on proc or sysfs would cause
a user space visible regression, and most likely security problems.

Therefore commit to never allowing executables on proc and sysfs by
adding a new flag to mark them as filesystems without executables and
enforce that flag.

Test the flag where MNT_NOEXEC is tested today, so that the only user
visible effect will be that exectuables will be treated as if the
execute bit is cleared.

The filesystems proc and sysfs do not currently incoporate any
executable files so this does not result in any user visible effects.

This makes it unnecessary to vet changes to proc and sysfs tightly for
adding exectuable files or changes to chattr that would modify
existing files, as no matter what the individual file say they will
not be treated as exectuable files by the vfs.

Not having to vet changes to closely is important as without this we
are only one proc_create call (or another goof up in the
implementation of notify_change) from having problematic executables
on proc.  Those mistakes are all too easy to make and would create
a situation where there are security issues or the assumptions of
some program having to be broken (and cause userspace regressions).

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
(cherry picked from commit 90f8572b0f021fdd1baa68e00a8c30482ee9e5f4)

Orabug: 26540416
CVE: CVE-2016-10044

Signed-off-by: Tim Tianyang Chen <tianyang.chen@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
 Conflicts:
include/linux/fs.h

7 years agovfs, writeback: replace FS_CGROUP_WRITEBACK with SB_I_CGROUPWB
Tejun Heo [Tue, 16 Jun 2015 22:48:31 +0000 (18:48 -0400)]
vfs, writeback: replace FS_CGROUP_WRITEBACK with SB_I_CGROUPWB

FS_CGROUP_WRITEBACK indicates whether a file_system_type supports
cgroup writeback; however, different super_blocks of the same
file_system_type may or may not support cgroup writeback depending on
filesystem options.  This patch replaces FS_CGROUP_WRITEBACK with a
per-super_block flag.

super_block->s_flags carries some internal flags in the high bits but
it's exposd to userland through uapi header and running out of space
anyway.  This patch adds a new field super_block->s_iflags to carry
kernel-internal flags.  It is currently only used by the new
SB_I_CGROUPWB flag whose concatenated and abbreviated name is for
consistency with other super_block flags.

ext2_fill_super() is updated to set SB_I_CGROUPWB.

v2: Added super_block->s_iflags instead of stealing another high bit
    from sb->s_flags as suggested by Christoph and Jan.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: linux-ext4@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 46b15caa7cb19b0f6e3bc8ebaee5bc1bb2e35110)

Orabug: 26540416
CVE: CVE-2016-10044

Signed-off-by: Tim Tianyang Chen <tianyang.chen@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
 Conflicts:
include/linux/backing-dev.h
include/linux/fs.h
fs/ext2/super.c

7 years agocrypto: ccp - Release locks before returning
Gary R Hook [Mon, 19 Jun 2017 17:31:17 +0000 (12:31 -0500)]
crypto: ccp - Release locks before returning

Orabug: 26644685

krobot warning: make sure that all error return paths release locks.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 30b4c54ccdebfc5a0210d9a4d77cc3671c9d1576)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agocrypto: ccp - return NULL instead of 0
pjambhlekar [Wed, 3 May 2017 04:02:09 +0000 (09:32 +0530)]
crypto: ccp - return NULL instead of 0

Orabug: 26644685

This change is to handle sparse warning. Return type of function is a pointer to the structure and
it returns 0. Instead it should return NULL.

Signed-off-by: Pushkar Jambhlekar <pushkar.iit@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 9d1fb19668207b66bc54f8e10fa91a2fcff061e4)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agocrypto: ccp - Add debugfs entries for CCP information
Gary R Hook [Tue, 2 May 2017 22:33:40 +0000 (17:33 -0500)]
crypto: ccp - Add debugfs entries for CCP information

Orabug: 26644685

Expose some data about the configuration and operation of the CCP
through debugfs entries: device name, capabilities, configuration,
statistics.

Allow the user to reset the counters to zero by writing (any value)
to the 'stats' file. This can be done per queue or per device.

Changes from V1:
 - Correct polarity of test when destroying devices at module unload

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 3cdbe346ed3f380eae1cb3e9febfe703e7d8a7b0)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agocrypto: ccp - Use IPAD/OPAD constant
Corentin LABBE [Fri, 19 May 2017 06:53:31 +0000 (08:53 +0200)]
crypto: ccp - Use IPAD/OPAD constant

Orabug: 26644685

This patch simply replace all occurrence of HMAC IPAD/OPAD value by their
define.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 6507c57bb013a22ae0f61bdbb2a63380598d34a2)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years agocrypto: hmac - add hmac IPAD/OPAD constant
Corentin LABBE [Fri, 19 May 2017 06:53:23 +0000 (08:53 +0200)]
crypto: hmac - add hmac IPAD/OPAD constant

Orabug: 26644685

Many HMAC users directly use directly 0x36/0x5c values.
It's better with crypto to use a name instead of directly some crypto
constant.

This patch simply add HMAC_IPAD_VALUE/HMAC_OPAD_VALUE defines in a new
include file "crypto/hmac.h" and use them in crypto/hmac.c

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 03d7db5654aefb78bf2ded62e2187833fda2a6cc)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>