www.infradead.org Git - users/jedix/linux-maple.git/log

mm: page_alloc: pass PFN to __free_pages_bootmem

Orabug: 23331157

[ Upstream commit d70ddd7a5d9aa335f9b4b0c3d879e1e70ee1e4e3 ]

__free_pages_bootmem prepares a page for release to the buddy allocator
and assumes that the struct page is initialised. Parallel initialisation
of struct pages defers initialisation and __free_pages_bootmem can be
called for struct pages that cannot yet map struct page to PFN. This
patch passes PFN to __free_pages_bootmem with no other functional change.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Tested-by: Nate Zimmer <nzimmer@sgi.com>
Tested-by: Waiman Long <waiman.long@hp.com>
Tested-by: Daniel J Blueman <daniel@numascale.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Nate Zimmer <nzimmer@sgi.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Scott Norton <scott.norton@hp.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit bbd0b13f917860a686ccaf40cda003332a640a70)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

ocfs2/dlm: fix BUG in dlm_move_lockres_to_recovery_list

Orabug: 23331156

[ Upstream commit be12b299a83fc807bbaccd2bcb8ec50cbb0cb55c ]

When master handles convert request, it queues ast first and then
returns status.  This may happen that the ast is sent before the request
status because the above two messages are sent by two threads.  And
right after the ast is sent, if master down, it may trigger BUG in
dlm_move_lockres_to_recovery_list in the requested node because ast
handler moves it to grant list without clear lock->convert_pending.  So
remove BUG_ON statement and check if the ast is processed in
dlmconvert_remote.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reported-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Tariq Saeed <tariq.x.saeed@oracle.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 474b8c6d329cfc40680ef308878d22f5a1a3b02b)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

ocfs2/dlm: fix race between convert and recovery

Orabug: 23331155

[ Upstream commit ac7cf246dfdbec3d8fed296c7bf30e16f5099dac ]

There is a race window between dlmconvert_remote and
dlm_move_lockres_to_recovery_list, which will cause a lock with
OCFS2_LOCK_BUSY in grant list, thus system hangs.

dlmconvert_remote
{
        spin_lock(&res->spinlock);
        list_move_tail(&lock->list, &res->converting);
        lock->convert_pending = 1;
        spin_unlock(&res->spinlock);

        status = dlm_send_remote_convert_request();
        >>>>>> race window, master has queued ast and return DLM_NORMAL,
               and then down before sending ast.
               this node detects master down and calls
               dlm_move_lockres_to_recovery_list, which will revert the
               lock to grant list.
               Then OCFS2_LOCK_BUSY won't be cleared as new master won't
               send ast any more because it thinks already be authorized.

        spin_lock(&res->spinlock);
        lock->convert_pending = 0;
        if (status != DLM_NORMAL)
                dlm_revert_pending_convert(res, lock);
        spin_unlock(&res->spinlock);
}

In this case, check if res->state has DLM_LOCK_RES_RECOVERING bit set
(res is still in recovering) or res master changed (new master has
finished recovery), reset the status to DLM_RECOVERING, then it will
retry convert.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reported-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Tariq Saeed <tariq.x.saeed@oracle.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit d5865dc7deb118b1db82dcaf4d249668bc81a311)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

Input: ati_remote2 - fix crashes on detecting device with invalid descriptor

Orabug: stable_rc4

[ Upstream commit 950336ba3e4a1ffd2ca60d29f6ef386dd2c7351d ]

The ati_remote2 driver expects at least two interfaces with one
endpoint each. If given malicious descriptor that specify one
interface or no endpoints, it will crash in the probe function.
Ensure there is at least two interfaces and one endpoint for each
interface before using it.

The full disclosure: http://seclists.org/bugtraq/2016/Mar/90

Reported-by: Ralf Spenneberg <ralf@spenneberg.net>
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 4b586dc3d736a43659acb575c90d33370ba2fb0d)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

rapidio/rionet: fix deadlock on SMP

Orabug: 23331153

[ Upstream commit 36915976eca58f2eefa040ba8f9939672564df61 ]

Fix deadlocking during concurrent receive and transmit operations on SMP
platforms caused by the use of incorrect lock: on transmit 'tx_lock'
spinlock should be used instead of 'lock' which is used for receive
operation.

This fix is applicable to kernel versions starting from v2.15.

Signed-off-by: Aurelien Jacquiot <a-jacquiot@ti.com>
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit e025091aa3a343703375c66e0bd02675b251411a)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

fs/coredump: prevent fsuid=0 dumps into user-controlled directories

Orabug: 23331152

[ Upstream commit 378c6520e7d29280f400ef2ceaf155c86f05a71a ]

This commit fixes the following security hole affecting systems where
all of the following conditions are fulfilled:

- The fs.suid_dumpable sysctl is set to 2.
- The kernel.core_pattern sysctl's value starts with "/". (Systems
   where kernel.core_pattern starts with "|/" are not affected.)
- Unprivileged user namespace creation is permitted. (This is
   true on Linux >=3.8, but some distributions disallow it by
   default using a distro patch.)

Under these conditions, if a program executes under secure exec rules,
causing it to run with the SUID_DUMP_ROOT flag, then unshares its user
namespace, changes its root directory and crashes, the coredump will be
written using fsuid=0 and a path derived from kernel.core_pattern - but
this path is interpreted relative to the root directory of the process,
allowing the attacker to control where a coredump will be written with
root privileges.

To fix the security issue, always interpret core_pattern for dumps that
are written under SUID_DUMP_ROOT relative to the root directory of init.

Signed-off-by: Jann Horn <jann@thejh.net>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Dan Duval <dan.duval@oracle.com>
(cherry picked from commit 2e840836fdf9ba0767134dcc5103dd25e442023b)

Conflict:

fs/coredump.c

KVM: fix spin_lock_init order on x86

Orabug: 23331151

[ Upstream commit e9ad4ec8379ad1ba6f68b8ca1c26b50b5ae0a327 ]

Moving the initialization earlier is needed in 4.6 because
kvm_arch_init_vm is now using mmu_lock, causing lockdep to
complain:

[  284.440294] INFO: trying to register non-static key.
[  284.445259] the code is fine but needs lockdep annotation.
[  284.450736] turning off the locking correctness validator.
...
[  284.528318]  [<ffffffff810aecc3>] lock_acquire+0xd3/0x240
[  284.533733]  [<ffffffffa0305aa0>] ? kvm_page_track_register_notifier+0x20/0x60 [kvm]
[  284.541467]  [<ffffffff81715581>] _raw_spin_lock+0x41/0x80
[  284.546960]  [<ffffffffa0305aa0>] ? kvm_page_track_register_notifier+0x20/0x60 [kvm]
[  284.554707]  [<ffffffffa0305aa0>] kvm_page_track_register_notifier+0x20/0x60 [kvm]
[  284.562281]  [<ffffffffa02ece70>] kvm_mmu_init_vm+0x20/0x30 [kvm]
[  284.568381]  [<ffffffffa02dbf7a>] kvm_arch_init_vm+0x1ea/0x200 [kvm]
[  284.574740]  [<ffffffffa02bff3f>] kvm_dev_ioctl+0xbf/0x4d0 [kvm]

However, it also helps fixing a preexisting problem, which is why this
patch is also good for stable kernels: kvm_create_vm was incrementing
current->mm->mm_count but not decrementing it at the out_err label (in
case kvm_init_mmu_notifier failed).  The new initialization order makes
it possible to add the required mmdrop without adding a new error label.

Cc: stable@vger.kernel.org
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 01954dfd48d936199204ae0860897b8aed18c1e2)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

KVM: VMX: avoid guest hang on invalid invept instruction

Orabug: 23331149

[ Upstream commit 2849eb4f99d54925c543db12917127f88b3c38ff ]

A guest executing an invalid invept instruction would hang
because the instruction pointer was not updated.

Cc: stable@vger.kernel.org
Fixes: bfd0a56b90005f8c8a004baf407ad90045c2b11e
Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 7a33539146bdcbbce25dbe93e853f39058c640a9)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

target: Fix target_release_cmd_kref shutdown comp leak

Orabug: 23331148

[ Upstream commit 5e47f1985d7107331c3f64fb3ec83d66fd73577e ]

This patch fixes an active I/O shutdown bug for fabric
drivers using target_wait_for_sess_cmds(), where se_cmd
descriptor shutdown would result in hung tasks waiting
indefinitely for se_cmd->cmd_wait_comp to complete().

To address this bug, drop the incorrect list_del_init()
usage in target_wait_for_sess_cmds() and always complete()
during se_cmd target_release_cmd_kref() put, in order to
let caller invoke the final fabric release callback
into se_cmd->se_tfo->release_cmd() code.

Reported-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Tested-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Dan Duval <dan.duval@oracle.com>
(cherry picked from commit b90311472f0dc0aa7a7223c1220db47ce4db5447)

Conflict:

drivers/target/target_core_transport.c

bitops: Do not default to __clear_bit() for __clear_bit_unlock()

Orabug: 23331133

[ Upstream commit f75d48644c56a31731d17fa693c8175328957e1d ]

__clear_bit_unlock() is a special little snowflake. While it carries the
non-atomic '__' prefix, it is specifically documented to pair with
test_and_set_bit() and therefore should be 'somewhat' atomic.

Therefore the generic implementation of __clear_bit_unlock() cannot use
the fully non-atomic __clear_bit() as a default.

If an arch is able to do better; is must provide an implementation of
__clear_bit_unlock() itself.

Specifically, this came up as a result of hackbench livelock'ing in
slab_lock() on ARC with SMP + SLUB + !LLSC.

The issue was incorrect pairing of atomic ops.

slab_lock() -> bit_spin_lock() -> test_and_set_bit()
slab_unlock() -> __bit_spin_unlock() -> __clear_bit()

The non serializing __clear_bit() was getting "lost"

80543b8e: ld_s       r2,[r13,0] <--- (A) Finds PG_locked is set
80543b90: or         r3,r2,1    <--- (B) other core unlocks right here
80543b94: st_s       r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock)

Fixes ARC STAR 9000817404 (and probably more).

Reported-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Tested-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20160309114054.GJ6356@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 20b25a3a2ce6ab20ceab54a2650809cc191c3287)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

splice: handle zero nr_pages in splice_to_pipe()

Orabug: 23331132

[ Upstream commit d6785d9152147596f60234157da2b02540c3e60f ]

Running the following command:

busybox cat /sys/kernel/debug/tracing/trace_pipe > /dev/null

with any tracing enabled pretty very quickly leads to various NULL
pointer dereferences and VM BUG_ON()s, such as these:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffff8119df6c>] generic_pipe_buf_release+0xc/0x40
Call Trace:
  [<ffffffff811c48a3>] splice_direct_to_actor+0x143/0x1e0
  [<ffffffff811c42e0>] ? generic_pipe_buf_nosteal+0x10/0x10
  [<ffffffff811c49cf>] do_splice_direct+0x8f/0xb0
  [<ffffffff81196869>] do_sendfile+0x199/0x380
  [<ffffffff81197600>] SyS_sendfile64+0x90/0xa0
  [<ffffffff8192cbee>] entry_SYSCALL_64_fastpath+0x12/0x6d

page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0)
kernel BUG at include/linux/mm.h:367!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
RIP: [<ffffffff8119df9c>] generic_pipe_buf_release+0x3c/0x40
Call Trace:
  [<ffffffff811c48a3>] splice_direct_to_actor+0x143/0x1e0
  [<ffffffff811c42e0>] ? generic_pipe_buf_nosteal+0x10/0x10
  [<ffffffff811c49cf>] do_splice_direct+0x8f/0xb0
  [<ffffffff81196869>] do_sendfile+0x199/0x380
  [<ffffffff81197600>] SyS_sendfile64+0x90/0xa0
  [<ffffffff8192cd1e>] tracesys_phase2+0x84/0x89

(busybox's cat uses sendfile(2), unlike the coreutils version)

This is because tracing_splice_read_pipe() can call splice_to_pipe()
with spd->nr_pages == 0.  spd_pages underflows in splice_to_pipe() and
we fill the page pointers and the other fields of the pipe_buffers with
garbage.

All other callers of splice_to_pipe() avoid calling it when nr_pages ==
0, and we could make tracing_splice_read_pipe() do that too, but it
seems reasonable to have splice_to_page() handle this condition
gracefully.

Cc: stable@vger.kernel.org
Signed-off-by: Rabin Vincent <rabin@rab.in>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit ff6c4184945bb5ec843cde5c3936b93823cf5641)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

tracing: Fix crash from reading trace_pipe with sendfile

Orabug: 23331131

[ Upstream commit a29054d9478d0435ab01b7544da4f674ab13f533 ]

If tracing contains data and the trace_pipe file is read with sendfile(),
then it can trigger a NULL pointer dereference and various BUG_ON within the
VM code.

There's a patch to fix this in the splice_to_pipe() code, but it's also a
good idea to not let that happen from trace_pipe either.

Link: http://lkml.kernel.org/r/1457641146-9068-1-git-send-email-rabin@rab.in
Cc: stable@vger.kernel.org # 2.6.30+
Reported-by: Rabin Vincent <rabin.vincent@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit ab31b690cdf88cf5cb0493718b89dbc37dd784a3)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

USB: uas: Reduce can_queue to MAX_CMNDS

Orabug: 23331130

[ Upstream commit 55ff8cfbc4e12a7d2187df523938cc671fbebdd1 ]

The uas driver can never queue more then MAX_CMNDS (- 1) tags and tags
are shared between luns, so there is no need to claim that we can_queue
some random large number.

Not claiming that we can_queue 65536 commands, fixes the uas driver
failing to initialize while allocating the tag map with a "Page allocation
failure (order 7)" error on systems which have been running for a while
and thus have fragmented memory.

Cc: stable@vger.kernel.org
Reported-and-tested-by: Yves-Alexis Perez <corsac@corsac.net>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 0b914b30c534502d1d6a68a5859ee65c3d7d8aa5)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

USB: cdc-acm: more sanity checking

Orabug: 23331129

[ Upstream commit 8835ba4a39cf53f705417b3b3a94eb067673f2c9 ]

An attack has become available which pretends to be a quirky
device circumventing normal sanity checks and crashes the kernel
by an insufficient number of interfaces. This patch adds a check
to the code path for quirky devices.

Signed-off-by: Oliver Neukum <ONeukum@suse.com>
CC: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit a635bc779e7b7748c9b0b773eaf08a7f2184ec50)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

USB: usb_driver_claim_interface: add sanity checking

Orabug: 23331128

[ Upstream commit 0b818e3956fc1ad976bee791eadcbb3b5fec5bfd ]

Attacks that trick drivers into passing a NULL pointer
to usb_driver_claim_interface() using forged descriptors are
known. This thwarts them by sanity checking.

Signed-off-by: Oliver Neukum <ONeukum@suse.com>
CC: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit febfaffe4aa9b0600028f46f50156e3354e34208)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

USB: iowarrior: fix oops with malicious USB descriptors

Orabug: 23331127

[ Upstream commit 4ec0ef3a82125efc36173062a50624550a900ae0 ]

The iowarrior driver expects at least one valid endpoint. If given
malicious descriptors that specify 0 for the number of endpoints,
it will crash in the probe function. Ensure there is at least
one endpoint on the interface before using it.

The full report of this issue can be found here:
http://seclists.org/bugtraq/2016/Mar/87

Reported-by: Ralf Spenneberg <ralf@spenneberg.net>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 14a710bbf6b4095876ff682b3066ac485480049c)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt()

Orabug: stable_rc4

[ Upstream commit 7834c10313fb823e538f2772be78edcdeed2e6e3 ]

Since 4.4, I've been able to trigger this occasionally:

===============================
[ INFO: suspicious RCU usage. ]
4.5.0-rc7-think+ #3 Not tainted
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/20160315012054.GA17765@codemonkey.org.uk
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-------------------------------
./arch/x86/include/asm/msr-trace.h:47 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 1
RCU used illegally from extended quiescent state!
no locks held by swapper/3/0.

stack backtrace:
CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.5.0-rc7-think+ #3
ffffffff92f821e0 1f3e5c340597d7fc ffff880468e07f10 ffffffff92560c2a
ffff880462145280 0000000000000001 ffff880468e07f40 ffffffff921376a6
ffffffff93665ea0 0000cc7c876d28da 0000000000000005 ffffffff9383dd60
Call Trace:
<IRQ> [<ffffffff92560c2a>] dump_stack+0x67/0x9d
[<ffffffff921376a6>] lockdep_rcu_suspicious+0xe6/0x100
[<ffffffff925ae7a7>] do_trace_write_msr+0x127/0x1a0
[<ffffffff92061c83>] native_apic_msr_eoi_write+0x23/0x30
[<ffffffff92054408>] smp_trace_call_function_interrupt+0x38/0x360
[<ffffffff92d1ca60>] trace_call_function_interrupt+0x90/0xa0
<EOI> [<ffffffff92ac5124>] ? cpuidle_enter_state+0x1b4/0x520

Move the entering_irq() call before ack_APIC_irq(), because entering_irq()
tells the RCU susbstems to end the extended quiescent state, so that the
following trace call in ack_APIC_irq() works correctly.

Suggested-by: Andi Kleen <ak@linux.intel.com>
Fixes: 4787c368a9bc "x86/tracing: Add irq_enter/exit() in smp_trace_reschedule_interrupt()"
Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit e69b0c2d1686a3344265e26a2ea6197e856f0459)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

Input: synaptics - handle spurious release of trackstick buttons, again

Orabug: stable_rc4

[ Upstream commit 82be788c96ed5978d3cb4a00079e26b981a3df3f ]

Looks like the fimware 8.2 still has the extra buttons spurious release
bug.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=114321
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit dfeccb29d75a2a7dc75d8b2c68b53bd7601d8686)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

mm: memcontrol: reclaim when shrinking memory.high below usage

Orabug: 23331124

[ Upstream commit 588083bb37a3cea8533c392370a554417c8f29cb ]

When setting memory.high below usage, nothing happens until the next
charge comes along, and then it will only reclaim its own charge and not
the now potentially huge excess of the new memory.high. This can cause
groups to stay in excess of their memory.high indefinitely.

To fix that, when shrinking memory.high, kick off a reclaim cycle that
goes after the delta.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 62b3bd2a4f98852cefd0df17333362638246aa8f)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

Input: ims-pcu - sanity check against missing interfaces

Orabug: 23331123

[ Upstream commit a0ad220c96692eda76b2e3fd7279f3dcd1d8a8ff ]

A malicious device missing interface can make the driver oops.
Add sanity checking.

Signed-off-by: Oliver Neukum <ONeukum@suse.com>
CC: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 3ec245e8591a183e276df89cd7f9e7a15645b9da)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

mmc: atmel-mci: Check pdata for NULL before dereferencing it at DMA config

Orabug: stable_rc4

[ Upstream commit 93c77d2999b09f2084b033ea6489915e0104ad9c ]

Using an at91sam9g20ek development board with DTS configuration may trigger
a kernel panic because of a NULL pointer dereference exception, while
configuring DMA. Let's fix this by adding a check for pdata before
dereferencing it.

Signed-off-by: Brent Taylor <motobud@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 0db0888f18f37d9ee10e99a7f3e90861b4d525a2)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

mmc: atmel-mci: restore dma on AVR32

Orabug: 23331120

[ Upstream commit 74843787158e9dff249f0528e7d4806102cc2c26 ]

Commit ecb89f2f5f3e7 ("mmc: atmel-mci: remove compat for non DT board
when requesting dma chan") broke dma on AVR32 and any other boards not
using DT. This restores a fallback mechanism for such cases.

Signed-off-by: Mans Rullgard <mans@mansr.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 7fa28eeed844c4388957095e5c485eb6f87a14de)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

nfsd: fix deadlock secinfo+readdir compound

Orabug: 23331119

[ Upstream commit 2f6fc056e899bd0144a08da5cacaecbe8997cd74 ]

nfsd_lookup_dentry exits with the parent filehandle locked. fh_put also
unlocks if necessary (nfsd filehandle locking is probably too lenient),
so it gets unlocked eventually, but if the following op in the compound
needs to lock it again, we can deadlock.

A fuzzer ran into this; normal clients don't send a secinfo followed by
a readdir in the same compound.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 03d44e3d9dd7744fe97ca472fce1725b7179fa2f)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

ALSA: usb-audio: Add sanity checks for endpoint accesses

Orabug: 23331117

[ Upstream commit 447d6275f0c21f6cc97a88b3a0c601436a4cdf2a ]

Add some sanity check codes before actually accessing the endpoint via
get_endpoint() in order to avoid the invalid access through a
malformed USB descriptor. Mostly just checking bNumEndpoints, but in
one place (snd_microii_spdif_default_get()), the validity of iface and
altsetting index is checked as well.

Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=971125
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 4fa1657957f668fcc9606268df01bc0f3e4f1379)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

ALSA: usb-audio: Fix NULL dereference in create_fixed_stream_quirk()

Orabug: 23331116

[ Upstream commit 0f886ca12765d20124bd06291c82951fd49a33be ]

create_fixed_stream_quirk() may cause a NULL-pointer dereference by
accessing the non-existing endpoint when a USB device with a malformed
USB descriptor is used.

This patch avoids it simply by adding a sanity check of bNumEndpoints
before the accesses.

Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=971125
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 6ed72ce6ab8b38803b12df8c62a3a52becf19017)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

HID: i2c-hid: fix OOB write in i2c_hid_set_or_send_report()

Orabug: 23331115

[ Upstream commit 3b654288b196ceaa156029d9457ccbded0489b98 ]

Even though hid_hw_* checks that passed in data_len is less than
HID_MAX_BUFFER_SIZE it is not enough, as i2c-hid does not necessarily
allocate buffers of HID_MAX_BUFFER_SIZE but rather checks all device
reports and select largest size. In-kernel users normally just send as much
data as report needs, so there is no problem, but hidraw users can do
whatever they please:

BUG: KASAN: slab-out-of-bounds in memcpy+0x34/0x54 at addr ffffffc07135ea80
Write of size 4101 by task syz-executor/8747
CPU: 2 PID: 8747 Comm: syz-executor Tainted: G    BU         3.18.0 #37
Hardware name: Google Tegra210 Smaug Rev 1,3+ (DT)
Call trace:
[<ffffffc00020ebcc>] dump_backtrace+0x0/0x258 arch/arm64/kernel/traps.c:83
[<ffffffc00020ee40>] show_stack+0x1c/0x2c arch/arm64/kernel/traps.c:172
[<     inline     >] __dump_stack lib/dump_stack.c:15
[<ffffffc001958114>] dump_stack+0x90/0x140 lib/dump_stack.c:50
[<     inline     >] print_error_description mm/kasan/report.c:97
[<     inline     >] kasan_report_error mm/kasan/report.c:278
[<ffffffc0004597dc>] kasan_report+0x268/0x530 mm/kasan/report.c:305
[<ffffffc0004592e8>] __asan_storeN+0x20/0x150 mm/kasan/kasan.c:718
[<ffffffc0004594e0>] memcpy+0x30/0x54 mm/kasan/kasan.c:299
[<ffffffc001306354>] __i2c_hid_command+0x2b0/0x7b4 drivers/hid/i2c-hid/i2c-hid.c:178
[<     inline     >] i2c_hid_set_or_send_report drivers/hid/i2c-hid/i2c-hid.c:321
[<ffffffc0013079a0>] i2c_hid_output_raw_report.isra.2+0x3d4/0x4b8 drivers/hid/i2c-hid/i2c-hid.c:589
[<ffffffc001307ad8>] i2c_hid_output_report+0x54/0x68 drivers/hid/i2c-hid/i2c-hid.c:602
[<     inline     >] hid_hw_output_report include/linux/hid.h:1039
[<ffffffc0012cc7a0>] hidraw_send_report+0x400/0x414 drivers/hid/hidraw.c:154
[<ffffffc0012cc7f4>] hidraw_write+0x40/0x64 drivers/hid/hidraw.c:177
[<ffffffc0004681dc>] vfs_write+0x1d4/0x3cc fs/read_write.c:534
[<     inline     >] SYSC_pwrite64 fs/read_write.c:627
[<ffffffc000468984>] SyS_pwrite64+0xec/0x144 fs/read_write.c:614
Object at ffffffc07135ea80, in cache kmalloc-512
Object allocated with size 268 bytes.

Let's check data length against the buffer size before attempting to copy
data over.

Cc: stable@vger.kernel.org
Reported-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Dmitry Torokhov <dtor@chromium.org>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 33184f68f4b527a6582e8fc5e94a7a7b6ba9c588)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

md: multipath: don't hardcopy bio in .make_request path

Orabug: 23331114

[ Upstream commit fafcde3ac1a418688a734365203a12483b83907a ]

Inside multipath_make_request(), multipath maps the incoming
bio into low level device's bio, but it is totally wrong to
copy the bio into mapped bio via '*mapped_bio = *bio'. For
example, .__bi_remaining is kept in the copy, especially if
the incoming bio is chained to via bio splitting, so .bi_end_io
can't be called for the mapped bio at all in the completing path
in this kind of situation.

This patch fixes the issue by using clone style.

Cc: stable@vger.kernel.org (v3.14+)
Reported-and-tested-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 79deb9b280a508c77f6cc233e083544712bc2458)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

Input: powermate - fix oops with malicious USB descriptors

Orabug: 23331113

[ Upstream commit 9c6ba456711687b794dcf285856fc14e2c76074f ]

The powermate driver expects at least one valid USB endpoint in its
probe function. If given malicious descriptors that specify 0 for
the number of endpoints, it will crash. Validate the number of
endpoints on the interface before using them.

The full report for this issue can be found here:
http://seclists.org/bugtraq/2016/Mar/85

Reported-by: Ralf Spenneberg <ralf@spenneberg.net>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 76b69dfeb5f1bf19a6bd65991506bbb00647716b)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

fuse: Add reference counting for fuse_io_priv

Orabug: 23331112

[ Upstream commit 744742d692e37ad5c20630e57d526c8f2e2fe3c9 ]

The 'reqs' member of fuse_io_priv serves two purposes. First is to track
the number of oustanding async requests to the server and to signal that
the io request is completed. The second is to be a reference count on the
structure to know when it can be freed.

For sync io requests these purposes can be at odds. fuse_direct_IO() wants
to block until the request is done, and since the signal is sent when
'reqs' reaches 0 it cannot keep a reference to the object. Yet it needs to
use the object after the userspace server has completed processing
requests. This leads to some handshaking and special casing that it
needlessly complicated and responsible for at least one race condition.

It's much cleaner and safer to maintain a separate reference count for the
object lifecycle and to let 'reqs' just be a count of outstanding requests
to the userspace server. Then we can know for sure when it is safe to free
the object without any handshaking or special cases.

The catch here is that most of the time these objects are stack allocated
and should not be freed. Initializing these objects with a single reference
that is never released prevents accidental attempts to free the objects.

Fixes: 9d5722b7777e ("fuse: handle synchronous iocbs internally")
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit bbd5f23b1eaba29f46f97846b361fab4c5becc78)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

fuse: do not use iocb after it may have been freed

Orabug: 23331110

[ Upstream commit 7cabc61e01a0a8b663bd2b4c982aa53048218734 ]

There's a race in fuse_direct_IO(), whereby is_sync_kiocb() is called on an
iocb that could have been freed if async io has already completed. The fix
in this case is simple and obvious: cache the result before starting io.

It was discovered by KASan:

kernel: ==================================================================
kernel: BUG: KASan: use after free in fuse_direct_IO+0xb1a/0xcc0 at addr ffff88036c414390

Signed-off-by: Robert Doebbelin <robert@quobyte.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: bcba24ccdc82 ("fuse: enable asynchronous processing direct IO")
Cc: <stable@vger.kernel.org> # 3.10+
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 19167d65fabb60ff11fc5f9c4a5248c17a12f615)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

iser-target: Separate flows for np listeners and connections cma events

Orabug: stable_rc4

[ Upstream commit f81bf458208ef6d12b2fc08091204e3859dcdba4 ]

No need to restrict this check to specific events.

Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # v3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 843513677254865b92c551339103e8eaa2b07669)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

iser-target: Add new state ISER_CONN_BOUND to isert_conn

Orabug: 23331108

[ Upstream commit aea92980601f7ddfcb3c54caa53a43726314fe46 ]

We need an indication that isert_conn->iscsi_conn binding has
happened so we'll know not to invoke a connection reinstatement
on an unbound connection which will lead to a bogus isert_conn->conn
dereferece.

Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # v3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 82ba90c00ba7e4ffd31a1ad8a5ed224c8e6f3d37)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

iser-target: Fix identification of login rx descriptor type

Orabug: 23331107

[ Upstream commit b89a7c25462b164db280abc3b05d4d9d888d40e9 ]

Once connection request is accepted, one rx descriptor
is posted to receive login request. This descriptor has rx type,
but is outside the main pool of rx descriptors, and thus
was mistreated as tx type.

Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # v3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 3c6961458165c0ec909d68191bff4de7bdf50549)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

dm thin metadata: don't issue prefetches if a transaction abort has failed

Orabug: stable_rc4

[ Upstream commit 2eae9e4489b4cf83213fa3bd508b5afca3f01780 ]

If a transaction abort has failed then we can no longer use the metadata
device. Typically this happens if the superblock is unreadable.

This fix addresses a crash seen during metadata device failure testing.

Fixes: 8a01a6af75 ("dm thin: prefetch missing metadata pages")
Cc: stable@vger.kernel.org # 3.19+
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 78243d0a1221f06af96a8952cd5ac71cd1fe5376)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

jbd2: fix FS corruption possibility in jbd2_journal_destroy() on umount path

Orabug: stable_rc4

[ Upstream commit c0a2ad9b50dd80eeccd73d9ff962234590d5ec93 ]

On umount path, jbd2_journal_destroy() writes latest transaction ID
(->j_tail_sequence) to be used at next mount.

The bug is that ->j_tail_sequence is not holding latest transaction ID
in some cases. So, at next mount, there is chance to conflict with
remaining (not overwritten yet) transactions.

mount (id=10)
write transaction (id=11)
write transaction (id=12)
umount (id=10) <= the bug doesn't write latest ID

mount (id=10)
write transaction (id=11)
crash

mount
[recovery process]
transaction (id=11)
transaction (id=12) <= valid transaction ID, but old commit
must not replay

Like above, this bug become the cause of recovery failure, or FS
corruption.

So why ->j_tail_sequence doesn't point latest ID?

Because if checkpoint transactions was reclaimed by memory pressure
(i.e. bdev_try_to_free_page()), then ->j_tail_sequence is not updated.
(And another case is, __jbd2_journal_clean_checkpoint_list() is called
with empty transaction.)

So in above cases, ->j_tail_sequence is not pointing latest
transaction ID at umount path. Plus, REQ_FLUSH for checkpoint is not
done too.

So, to fix this problem with minimum changes, this patch updates
->j_tail_sequence, and issue REQ_FLUSH. (With more complex changes,
some optimizations would be possible to avoid unnecessary REQ_FLUSH
for example though.)

BTW,

journal->j_tail_sequence =
++journal->j_transaction_sequence;

Increment of ->j_transaction_sequence seems to be unnecessary, but
ext3 does this.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 3b6a70271a4e1e72585b3a7236cbd17d646b38d1)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

sg: fix dxferp in from_to case

Orabug: 23331104

[ Upstream commit 5ecee0a3ee8d74b6950cb41e8989b0c2174568d4 ]

One of the strange things that the original sg driver did was let the
user provide both a data-out buffer (it followed the sg_header+cdb)
_and_ specify a reply length greater than zero. What happened was that
the user data-out buffer was copied into some kernel buffers and then
the mid level was told a read type operation would take place with the
data from the device overwriting the same kernel buffers. The user would
then read those kernel buffers back into the user space.

From what I can tell, the above action was broken by commit fad7f01e61bf
("sg: set dxferp to NULL for READ with the older SG interface") in 2008
and syzkaller found that out recently.

Make sure that a user space pointer is passed through when data follows
the sg_header structure and command. Fix the abnormal case when a
non-zero reply_len is also given.

Fixes: fad7f01e61bf737fe8a3740d803f000db57ecac6
Cc: <stable@vger.kernel.org> #v2.6.28+
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 685a50b681ddd07ff2b7714797b5793adcc691e7)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

md/raid5: preserve STRIPE_PREREAD_ACTIVE in break_stripe_batch_list

Orabug: 23331103

[ Upstream commit 550da24f8d62fe81f3c13e3ec27602d6e44d43dc ]

break_stripe_batch_list breaks up a batch and copies some flags from
the batch head to the members, preserving others.

It doesn't preserve or copy STRIPE_PREREAD_ACTIVE.  This is not
normally a problem as STRIPE_PREREAD_ACTIVE is cleared when a
stripe_head is added to a batch, and is not set on stripe_heads
already in a batch.

However there is no locking to ensure one thread doesn't set the flag
after it has just been cleared in another.  This does occasionally happen.

md/raid5 maintains a count of the number of stripe_heads with
STRIPE_PREREAD_ACTIVE set: conf->preread_active_stripes.  When
break_stripe_batch_list clears STRIPE_PREREAD_ACTIVE inadvertently
this could becomes incorrect and will never again return to zero.

md/raid5 delays the handling of some stripe_heads until
preread_active_stripes becomes zero.  So when the above mention race
happens, those stripe_heads become blocked and never progress,
resulting is write to the array handing.

So: change break_stripe_batch_list to preserve STRIPE_PREREAD_ACTIVE
in the members of a batch.

URL: https://bugzilla.kernel.org/show_bug.cgi?id=108741
URL: https://bugzilla.redhat.com/show_bug.cgi?id=1258153
URL: http://thread.gmane.org/5649C0E9.2030204@zoner.cz
Reported-by: Martin Svec <martin.svec@zoner.cz> (and others)
Tested-by: Tom Weber <linux@junkyard.4t2.com>
Fixes: 1b956f7a8f9a ("md/raid5: be more selective about distributing flags across batch.")
Cc: stable@vger.kernel.org (v4.1 and later)
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 88e2df1b99cfc0087ccd2faa8ca61d7263fc007a)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

be2iscsi: set the boot_kset pointer to NULL in case of failure

Orabug: 23331102

[ Upstream commit 84bd64993f916bcf86270c67686ecf4cea7b8933 ]

In beiscsi_setup_boot_info(), the boot_kset pointer should be set to
NULL in case of failure otherwise an invalid pointer dereference may
occur later.

Cc: <stable@vger.kernel.org>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit d5226186331401679a52df554a9607022e470983)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

x86/PCI: Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs

Orabug: stable_rc4

[ Upstream commit b894157145e4ac7598d7062bc93320898a5e059e ]

The Home Agent and PCU PCI devices in Broadwell-EP have a non-BAR register
where a BAR should be. We don't know what the side effects of sizing the
"BAR" would be, and we don't know what address space the "BAR" might appear
to describe.

Mark these devices as having non-compliant BARs so the PCI core doesn't
touch them.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Andi Kleen <ak@linux.intel.com>
CC: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit fcc5834794fdcb62637a1deb25bff5787bf73757)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

bcache: fix cache_set_flush() NULL pointer dereference on OOM

Orabug: 23331099

[ Upstream commit f8b11260a445169989d01df75d35af0f56178f95 ]

When bch_cache_set_alloc() fails to kzalloc the cache_set, the
asyncronous closure handling tries to dereference a cache_set that
hadn't yet been allocated inside of cache_set_flush() which is called
by __cache_set_unregister() during cleanup. This appears to happen only
during an OOM condition on bcache_register.

Signed-off-by: Eric Wheeler <bcache@linux.ewheeler.net>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit d09a05998d79dcfaa25de84624dce9f806fe4e7c)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

bcache: cleaned up error handling around register_cache()

Orabug: 23331096

[ Upstream commit 9b299728ed777428b3908ac72ace5f8f84b97789 ]

Fix null pointer dereference by changing register_cache() to return an int
instead of being void.  This allows it to return -ENOMEM or -ENODEV and
enables upper layers to handle the OOM case without NULL pointer issues.

See this thread:
  http://thread.gmane.org/gmane.linux.kernel.bcache.devel/3521

Fixes this error:
  gargamel:/sys/block/md5/bcache# echo /dev/sdh2 > /sys/fs/bcache/register

  bcache: register_cache() error opening sdh2: cannot allocate memory
  BUG: unable to handle kernel NULL pointer dereference at 00000000000009b8
  IP: [<ffffffffc05a7e8d>] cache_set_flush+0x102/0x15c [bcache]
  PGD 120dff067 PUD 1119a3067 PMD 0
  Oops: 0000 [#1] SMP
  Modules linked in: veth ip6table_filter ip6_tables
  (...)
  CPU: 4 PID: 3371 Comm: kworker/4:3 Not tainted 4.4.2-amd64-i915-volpreempt-20160213bc1 #3
  Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
  Workqueue: events cache_set_flush [bcache]
  task: ffff88020d5dc280 ti: ffff88020b6f8000 task.ti: ffff88020b6f8000
  RIP: 0010:[<ffffffffc05a7e8d>]  [<ffffffffc05a7e8d>] cache_set_flush+0x102/0x15c [bcache]

Signed-off-by: Eric Wheeler <bcache@linux.ewheeler.net>
Tested-by: Marc MERLIN <marc@merlins.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 0e6555443a206655885bc4126d9a3a0e2d9d17a3)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

bcache: fix race of writeback thread starting before complete initialization

Orabug: stable_rc4

[ Upstream commit 07cc6ef8edc47f8b4fc1e276d31127a0a5863d4d ]

The bch_writeback_thread might BUG_ON in read_dirty() if
dc->sb==BDEV_STATE_DIRTY and bch_sectors_dirty_init has not yet completed
its related initialization. This patch downs the dc->writeback_lock until
after initialization is complete, thus preventing bch_writeback_thread
from proceeding prematurely.

See this thread:
http://thread.gmane.org/gmane.linux.kernel.bcache.devel/3453

Signed-off-by: Eric Wheeler <bcache@linux.ewheeler.net>
Tested-by: Marc MERLIN <marc@merlins.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 7269554a57352f66aefb3e85cb7e11c4b63bba59)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

sched/cputime: Fix steal_account_process_tick() to always return jiffies

Orabug: stable_rc4

[ Upstream commit f9c904b7613b8b4c85b10cd6b33ad41b2843fa9d ]

The callers of steal_account_process_tick() expect it to return
whether a jiffy should be considered stolen or not.

Currently the return value of steal_account_process_tick() is in
units of cputime, which vary between either jiffies or nsecs
depending on CONFIG_VIRT_CPU_ACCOUNTING_GEN.

If cputime has nsecs granularity and there is a tiny amount of
stolen time (a few nsecs, say) then we will consider the entire
tick stolen and will not account the tick on user/system/idle,
causing /proc/stats to show invalid data.

The fix is to change steal_account_process_tick() to accumulate
the stolen time and only account it once it's worth a jiffy.

(Thanks to Frederic Weisbecker for suggestions to fix a bug in my
first version of the patch.)

Signed-off-by: Chris Friesen <chris.friesen@windriver.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/56DBBDB8.40305@mail.usask.ca
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 23745ba7ffac8cffbe648812fe7dc485d6df9404)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

perf/x86/intel: Add definition for PT PMI bit

Orabug: 23331092

[ Upstream commit 5690ae28e472d25e330ad0c637a5cea3fc39fb32 ]

This patch adds a definition for GLOBAL_OVFL_STATUS bit 55
which is used with the Processor Trace (PT) feature.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: adrian.hunter@intel.com
Cc: kan.liang@intel.com
Cc: namhyung@kernel.org
Link: http://lkml.kernel.org/r/1457034642-21837-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 559920294e5db893cf5abedb00f56c2d72bca8c8)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

x86: Add new MSRs and MSR bits used for Intel Skylake PMU support

Orabug: 23331091

[ Upstream commit b83ff1c8617aac03a1cf807aafa848fe0f0908f2 ]

Add new MSRs (LBR_INFO) and some new MSR bits used by the Intel Skylake
PMU driver.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1431285767-27027-4-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit c7af1256a07538167fe1b14a6714e7b92cf82179)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

usb: hub: fix a typo in hub_port_init() leading to wrong logic

Orabug: 23331090

[ Upstream commit 0d5ce778c43bf888328231bcdce05d5c860655aa ]

A typo of j for i led to a logic bug. To rule out future
confusion, the variable names are made meaningful.

Signed-off-by: Oliver Neukum <ONeukum@suse.com>
CC: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Dan Duval <dan.duval@oracle.com>
(cherry picked from commit 8e1682fddbd122a565965a83a5f8235e8bcadc10)

Conflict:

drivers/usb/core/hub.c

of: alloc anywhere from memblock if range not specified

Orabug: 23331089

[ Upstream commit e53b50c0cbe392c946807abf7d07615a3c588642 ]

early_init_dt_alloc_reserved_memory_arch passes end as 0 to
__memblock_alloc_base, when limits are not specified. But
__memblock_alloc_base takes end value of 0 as MEMBLOCK_ALLOC_ACCESSIBLE
and limits the end to memblock.current_limit. This results in regions
never being placed in HIGHMEM area, for e.g. CMA.
Let __memblock_alloc_base allocate from anywhere in memory if limits are
not specified.

Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Cc: stable@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 444cf5487d5f51a3ecce2a0dfe237156290dfc7f)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

mtip32xx: Handle FTL rebuild failure state during device initialization

Orabug: stable_rc4

[ Upstream commit aae4a033868c496adae86fc6f9c3e0c405bbf360 ]

Allow device initialization to finish gracefully when it is in
FTL rebuild failure state. Also, recover device out of this state
after successfully secure erasing it.

Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Vignesh Gunasekaran <vgunasekaran@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 65963ead8aefa685ec2e22d403461101c243683e)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

mtip32xx: fix incorrectly setting MTIP_DDF_SEC_LOCK_BIT

Orabug: 23331087

[ Upstream commit ee04bed690cb49a49512a641405bac42d13c2b2a ]

Fix incorrectly setting MTIP_DDF_SEC_LOCK_BIT

Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 0e536ed27652e8d5d74e13378a8f48b52cb21c95)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

mtip32xx: Handle safe removal during IO

Orabug: 23331085

[ Upstream commit 51c6570eb922146470c2fe660c34585414679bd6 ]

Flush inflight IOs using fsync_bdev() when the device is safely
removed. Also, block further IOs in device open function.

Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Rajesh Kumar Sambandam <rsambandam@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit afc16b3aab195d12ad40546d3cffcab5e3511ead)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

mtip32xx: fix crash on surprise removal of the drive

Orabug: 23331084

[ Upstream commit 2132a544727eb17f76bfef8b550a016a41c38821 ]

pci and block layers have changed a lot compared to when SRSI support was added.
Given the current state of pci and block layers, this driver do not have to do
any specific handling.

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 13af0df20f8e78dc1e7239a29b2862addde3953e)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

mtip32xx: fix rmmod issue

Orabug: 23331083

[ Upstream commit 02b48265e7437bfe153af16337b14ee74f00905f ]

put_disk() need to be called after del_gendisk() to free the disk object structure.

Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 6b9d9c35930bd75ddcfafb8eb7db909ceb63af10)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

mtip32xx: Avoid issuing standby immediate cmd during FTL rebuild

Orabug: 23331082

[ Upstream commit d8a18d2d8f5de55666c6011ed175939d22c8e3d8 ]

Prevent standby immediate command from being issued in remove,
suspend and shutdown paths, while drive is in FTL rebuild process.

Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Vignesh Gunasekaran <vgunasekaran@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 15d38f73263562c2ebe3cdb7c6380e0b9a98c76c)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

mtip32xx: Print exact time when an internal command is interrupted

Orabug: 23331081

[ Upstream commit 5b7e0a8ac85e2dfd83830dc9e0b3554d153a37e3 ]

Print exact time when an internal command is interrupted.

Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Rajesh Kumar Sambandam <rsambandam@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit c9d3e69a692eed01045bfa718505c3e887e87d85)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

quota: Fix possible GPF due to uninitialised pointers

Orabug: 23331080

[ Upstream commit ab73ef46398e2c0159f3a71de834586422d2a44a ]

When dqget() in __dquot_initialize() fails e.g. due to IO error,
__dquot_initialize() will pass an array of uninitialized pointers to
dqput_all() and thus can lead to deference of random data. Fix the
problem by properly initializing the array.

CC: stable@vger.kernel.org
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit ab1cc52b3f62f2445c60cbe390d26c50ebc0f3bd)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

xfs: fix two memory leaks in xfs_attr_list.c error paths

Orabug: 23331078

[ Upstream commit 2e83b79b2d6c78bf1b4aa227938a214dcbddc83f ]

This plugs 2 trivial leaks in xfs_attr_shortform_list and
xfs_attr3_leaf_list_int.

Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 594103da3005639712b3123a612791c8f4d3f4e9)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

nfsd4: fix bad bounds checking

Orabug: 23331077

[ Upstream commit 4aed9c46afb80164401143aa0fdcfe3798baa9d5 ]

A number of spots in the xdr decoding follow a pattern like

n = be32_to_cpup(p++);
READ_BUF(n + 4);

where n is a u32. The only bounds checking is done in READ_BUF itself,
but since it's checking (n + 4), it won't catch cases where n is very
large, (u32)(-4) or higher. I'm not sure exactly what the consequences
are, but we've seen crashes soon after.

Instead, just break these up into two READ_BUF()s.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit d876f71611ad9b720cc890075b3c4bec25bd54b5)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

IB/srpt: Simplify srpt_handle_tsk_mgmt()

Orabug: 23331076

[ Upstream commit 51093254bf879bc9ce96590400a87897c7498463 ]

Let the target core check task existence instead of the SRP target
driver. Additionally, let the target core check the validity of the
task management request instead of the ib_srpt driver.

This patch fixes the following kernel crash:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
IP: [<ffffffffa0565f37>] srpt_handle_new_iu+0x6d7/0x790 [ib_srpt]
Oops: 0002 [#1] SMP
Call Trace:
[<ffffffffa05660ce>] srpt_process_completion+0xde/0x570 [ib_srpt]
[<ffffffffa056669f>] srpt_compl_thread+0x13f/0x160 [ib_srpt]
[<ffffffff8109726f>] kthread+0xcf/0xe0
[<ffffffff81613cfc>] ret_from_fork+0x7c/0xb0

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Fixes: 3e4f574857ee ("ib_srpt: Convert TMR path to target_submit_tmr")
Tested-by: Alex Estrin <alex.estrin@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 179e72b561d3d331c850e1a5779688d7a7de5246)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

X.509: Fix leap year handling again

Orabug: 23331075

[ Upstream commit ac4cbedfdf55455b4c447f17f0fa027dbf02b2a6 ]

There are still a couple of minor issues in the X.509 leap year handling:

(1) To avoid doing a modulus-by-400 in addition to a modulus-by-100 when
     determining whether the year is a leap year or not, I divided the year
     by 100 after doing the modulus-by-100, thereby letting the compiler do
     one instruction for both, and then did a modulus-by-4.

     Unfortunately, I then passed the now-modified year value to mktime64()
     to construct a time value.

     Since this isn't a fast path and since mktime64() does a bunch of
     divisions, just condense down to "% 400".  It's also easier to read.

(2) The default month length for any February where the year doesn't
     divide by four exactly is obtained from the month_length[] array where
     the value is 29, not 28.

     This is fixed by altering the table.

Reported-by: Rudolf Polzer <rpolzer@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit e62c5259a62f3da2a911f8fe6275dbf43d3b624f)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

PKCS#7: Improve and export the X.509 ASN.1 time object decoder

Orabug: 23331074

[ Upstream commit fd19a3d195be23e8d9d0d66576b96ea25eea8323 ]

Make the X.509 ASN.1 time object decoder fill in a time64_t rather than a
struct tm to make comparison easier (unfortunately, this makes readable
display less easy) and export it so that it can be used by the PKCS#7 code
too.

Further, tighten up its parsing to reject invalid dates (eg. weird
characters, non-existent hour numbers) and unsupported dates (eg. timezones
other than 'Z' or dates earlier than 1970).

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit f85d91f88486b34679c532a5687466eaf335258f)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

X.509: Extract both parts of the AuthorityKeyIdentifier

Orabug: 23331073

[ Upstream commit b92e6570a992c7d793a209db282f68159368201c ]

Extract both parts of the AuthorityKeyIdentifier, not just the keyIdentifier,
as the second part can be used to match X.509 certificates by issuer and
serialNumber.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 3ccbbbf7b5ceb75bef47692c39f443a3efb38437)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

crypto: ccp - memset request context to zero during import

Orabug: 23331072

[ Upstream commit ce0ae266feaf35930394bd770c69778e4ef03ba9 ]

Since a crypto_ahash_import() can be called against a request context
that has not had a crypto_ahash_init() performed, the request context
needs to be cleared to insure there is no random data present. If not,
the random data can result in a kernel oops during crypto_ahash_update().

Cc: <stable@vger.kernel.org> # 3.14.x-
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit dad41d54081e1bd2ef601c702ff4ea0f7428a965)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

RAID5: check_reshape() shouldn't call mddev_suspend

Orabug: 23331070

[ Upstream commit 27a353c026a879a1001e5eac4bda75b16262c44a ]

check_reshape() is called from raid5d thread. raid5d thread shouldn't
call mddev_suspend(), because mddev_suspend() waits for all IO finish
but IO is handled in raid5d thread, we could easily deadlock here.

This issue is introduced by
738a273 ("md/raid5: fix allocation of 'scribble' array.")

Cc: stable@vger.kernel.org (v4.1+)
Reported-and-tested-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 503f8305ab1b82d8788a2f161e7c52c0c0f6aeac)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

md/raid5: Compare apples to apples (or sectors to sectors)

Orabug: 23331068

[ Upstream commit e7597e69dec59b65c5525db1626b9d34afdfa678 ]

'max_discard_sectors' is in sectors, while 'stripe' is in bytes.

This fixes the problem where DISCARD would get disabled on some larger
RAID5 configurations (6 or more drives in my testing), while it worked
as expected with smaller configurations.

Fixes: 620125f2bf8 ("MD: raid5 trim support")
Cc: stable@vger.kernel.org v3.7+
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 5255a738ee6ecc0e479728efe5668efd64901197)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

fix kABI breakage from pci_dev changes

Orabug: 23331203

Orabug: 23331203

Commit bb44fa317be6c1dd0650feaae3326ac11f2d37a4 ("PCI: Add
dev->has_secondary_link to track downstream PCIe links") and commit
0af1534fb7b3dba6e11d6c2670912725a2087217 ("PCI: Disable IO/MEM decoding
for devices with non-compliant BARs") added one-bit flags to the
pci_dev structure. Technically, this broke kABI (according to the
checker, anyway). In reality, these bits were added after a bunch
of other one-bit fields, the result being that their addition didn't
extend the size of the structure, nor did it change the offsets of
any existing fields of the structure.

This commit simply wraps these two fields in "#ifndef __GENKSYMS__"
to hide them from the checker.

Signed-off-by: Dan Duval <dan.duval@oracle.com>
(cherry picked from commit 32594c00fba3c410b5a339f2ddd0e4c583170186)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

PCI: Disable IO/MEM decoding for devices with non-compliant BARs

Orabug: 23331067

[ Upstream commit b84106b4e2290c081cdab521fa832596cdfea246 ]

The PCI config header (first 64 bytes of each device's config space) is
defined by the PCI spec so generic software can identify the device and
manage its usage of I/O, memory, and IRQ resources.

Some non-spec-compliant devices put registers other than BARs where the
BARs should be. When the PCI core sizes these "BARs", the reads and writes
it does may have unwanted side effects, and the "BAR" may appear to
describe non-sensical address space.

Add a flag bit to mark non-compliant devices so we don't touch their BARs.
Turn off IO/MEM decoding to prevent the devices from consuming address
space, since we can't read the BARs to find out what that address space
would be.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Andi Kleen <ak@linux.intel.com>
CC: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit aa57ba13f44426a076ac567e965654453d4be1f1)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

PCI: Add dev->has_secondary_link to track downstream PCIe links

Orabug: 23331066

[ Upstream commit d0751b98dfa391f862e02dc36a233a54615e3f1d ]

A PCIe Port is an interface to a Link.  A Root Port is a PCI-PCI bridge in
a Root Complex and has a Link on its secondary (downstream) side.  For
other Ports, the Link may be on either the upstream (closer to the Root
Complex) or downstream side of the Port.

The usual topology has a Root Port connected to an Upstream Port.  We
previously assumed this was the only possible topology, and that a
Downstream Port's Link was always on its downstream side, like this:

                  +---------------------+
  +------+        |          Downstream |
  | Root |        | Upstream       Port +--Link--
  | Port +--Link--+ Port                |
  +------+        |          Downstream |
                  |                Port +--Link--
                  +---------------------+

But systems do exist (see URL below) where the Root Port is connected to a
Downstream Port.  In this case, a Downstream Port's Link may be on either
the upstream or downstream side:

                  +---------------------+
  +------+        |            Upstream |
  | Root |        | Downstream     Port +--Link--
  | Port +--Link--+ Port                |
  +------+        |          Downstream |
                  |                Port +--Link--
                  +---------------------+

We can't use the Port type to determine which side the Link is on, so add a
bit in struct pci_dev to keep track.

A Root Port's Link is always on the Port's secondary side.  A component
(Endpoint or Port) on the other end of the Link obviously has the Link on
its upstream side.  If that component is a Port, it is part of a Switch or
a Bridge.  A Bridge has a PCI or PCI-X bus on its secondary side, not a
Link.  The internal bus of a Switch connects the Port to another Port whose
Link is on the downstream side.

[bhelgaas: changelog, comment, cache "type", use if/else]
Link: http://lkml.kernel.org/r/54EB81B2.4050904@pobox.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=94361
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 517a021fdba44206722c85bd9267dabd67475fa6)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

mtd: onenand: fix deadlock in onenand_block_markbad

Orabug: 23331065

[ Upstream commit 5e64c29e98bfbba1b527b0a164f9493f3db9e8cb ]

Commit 5942ddbc500d ("mtd: introduce mtd_block_markbad interface")
incorrectly changed onenand_block_markbad() to call mtd_block_markbad
instead of onenand_chip's block_markbad function. As a result the function
will now recurse and deadlock. Fix by reverting the change.

Fixes: 5942ddbc500d ("mtd: introduce mtd_block_markbad interface")
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 084b44e9cbc6eec0b0b13138918fb5935208f3b4)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

aic7xxx: Fix queue depth handling

Orabug: 23331064

[ Upstream commit 5a51a7abca133860a6f4429655a9eda3c4afde32 ]

We were setting the queue depth correctly, then setting it back to
two. If you hit this as a bisection point then please send me an email
as it would imply we've been hiding other bugs with this one.

Cc: <stable@vger.kernel.org>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit cf438ddac48b27e7de0514f96d912e132c908df4)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

aacraid: Fix memory leak in aac_fib_map_free

Orabug: 23331062

[ Upstream commit f88fa79a61726ce9434df9b4aede36961f709f17 ]

aac_fib_map_free() calls pci_free_consistent() without checking that
dev->hw_fib_va is not NULL and dev->max_fib_size is not zero.If they are
indeed NULL/0, this will result in a hang as pci_free_consistent() will
attempt to invalidate cache for the entire 64-bit address space
(which would take a very long time).

Fixed by adding a check to make sure that dev->hw_fib_va and
dev->max_fib_size are not NULL and 0 respectively.

Fixes: 9ad5204d6 - "[SCSI]aacraid: incorrect dma mapping mask during blinked recover or user initiated reset"
Cc: stable@vger.kernel.org
Signed-off-by: Raghava Aditya Renukunta <raghavaaditya.renukunta@pmcs.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 6d2cd58f288e320c5a023219ac8726b30657ddbb)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

aacraid: Fix RRQ overload

Orabug: 23331060

[ Upstream commit 3f4ce057d51a9c0ed9b01ba693df685d230ffcae ]

The driver utilizes an array of atomic variables to keep track of IO
submissions to each vector. To submit an IO multiple threads iterate
through the array to find a vector which has empty slots to send an
IO. The reading and updating of the variable is not atomic, causing race
conditions when a thread uses a full vector to submit an IO.

Fixed by mapping each FIB to a vector, the submission path then uses
said vector to submit IO thereby removing the possibly of a race
condition.The vector assignment is started from 1 since vector 0 is
reserved for the use of AIF management FIBS.If the number of MSIx
vectors is 1 (MSI or INTx mode) then all the fibs are allocated to
vector 0.

Fixes: 495c0217 "aacraid: MSI-x support"
Cc: stable@vger.kernel.org # v4.1
Signed-off-by: Raghava Aditya Renukunta <raghavaaditya.renukunta@pmcs.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 34bb7098221c2a37f5d9ef3591971a25719ae21a)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

dm: fix excessive dm-mq context switching

Orabug: 23331058

[ Upstream commit 6acfe68bac7e6f16dc312157b1fa6e2368985013 ]

Request-based DM's blk-mq support (dm-mq) was reported to be 50% slower
than if an underlying null_blk device were used directly.  One of the
reasons for this drop in performance is that blk_insert_clone_request()
was calling blk_mq_insert_request() with @async=true.  This forced the
use of kblockd_schedule_delayed_work_on() to run the blk-mq hw queues
which ushered in ping-ponging between process context (fio in this case)
and kblockd's kworker to submit the cloned request.  The ftrace
function_graph tracer showed:

  kworker-2013  =>   fio-12190
  fio-12190    =>  kworker-2013
  ...
  kworker-2013  =>   fio-12190
  fio-12190    =>  kworker-2013
  ...

Fixing blk_insert_clone_request()'s blk_mq_insert_request() call to
_not_ use kblockd to submit the cloned requests isn't enough to
eliminate the observed context switches.

In addition to this dm-mq specific blk-core fix, there are 2 DM core
fixes to dm-mq that (when paired with the blk-core fix) completely
eliminate the observed context switching:

1)  don't blk_mq_run_hw_queues in blk-mq request completion

    Motivated by desire to reduce overhead of dm-mq, punting to kblockd
    just increases context switches.

    In my testing against a really fast null_blk device there was no benefit
    to running blk_mq_run_hw_queues() on completion (and no other blk-mq
    driver does this).  So hopefully this change doesn't induce the need for
    yet another revert like commit 621739b00e16ca2d !

2)  use blk_mq_complete_request() in dm_complete_request()

    blk_complete_request() doesn't offer the traditional q->mq_ops vs
    .request_fn branching pattern that other historic block interfaces
    do (e.g. blk_get_request).  Using blk_mq_complete_request() for
    blk-mq requests is important for performance.  It should be noted
    that, like blk_complete_request(), blk_mq_complete_request() doesn't
    natively handle partial completions -- but the request-based
    DM-multipath target does provide the required partial completion
    support by dm.c:end_clone_bio() triggering requeueing of the request
    via dm-mpath.c:multipath_end_io()'s return of DM_ENDIO_REQUEUE.

dm-mq fix #2 is _much_ more important than #1 for eliminating the
context switches.
Before: cpu          : usr=15.10%, sys=59.39%, ctx=7905181, majf=0, minf=475
After:  cpu          : usr=20.60%, sys=79.35%, ctx=2008, majf=0, minf=472

With these changes multithreaded async read IOPs improved from ~950K
to ~1350K for this dm-mq stacked on null_blk test-case.  The raw read
IOPs of the underlying null_blk device for the same workload is ~1950K.

Fixes: 7fb4898e0 ("block: add blk-mq support to blk_insert_cloned_request()")
Fixes: bfebd1cdb ("dm: add full blk-mq support to request-based DM")
Cc: stable@vger.kernel.org # 4.1+
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit fb1840e257d26b6aad69b864cb7ed4a67ab986b6)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

ext4: iterate over buffer heads correctly in move_extent_per_page()

Orabug: 23331057

[ Upstream commit 87f9a031af48defee9f34c6aaf06d6f1988c244d ]

In commit bcff24887d00 ("ext4: don't read blocks from disk after extents
being swapped") bh is not updated correctly in the for loop and wrong
data has been written to disk. generic/324 catches this on sub-page
block size ext4.

Fixes: bcff24887d00 ("ext4: don't read blocks from disk after extentsbeing swapped")
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 31a37d7c7ef7b4f90b600fcddd1c385a39f9d34c)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

tools/hv: Use include/uapi with __EXPORTED_HEADERS__

Orabug: 23331055

[ Upstream commit 50fe6dd10069e7c062e27f29606f6e91ea979399 ]

Use the local uapi headers to keep in sync with "recently" added #define's
(e.g. VSS_OP_REGISTER1).

Fixes: 3eb2094c59e8 ("Adding makefile for tools/hv")
Cc: <stable@vger.kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 2168661199ff5eeab52a8aa2408a6b21fbf4f743)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

net: irda: Fix use-after-free in irtty_open()

Orabug: 23331054

[ Upstream commit 401879c57f01cbf2da204ad2e8db910525c6dbea ]

The N_IRDA line discipline may access the previous line discipline's closed
and already-fre private data on open [1].

The tty->disc_data field _never_ refers to valid data on entry to the
line discipline's open() method. Rather, the ldisc is expected to
initialize that field for its own use for the lifetime of the instance
(ie. from open() to close() only).

[1]
    ==================================================================
    BUG: KASAN: use-after-free in irtty_open+0x422/0x550 at addr ffff8800331dd068
    Read of size 4 by task a.out/13960
    =============================================================================
    BUG kmalloc-512 (Tainted: G    B          ): kasan: bad access detected
    -----------------------------------------------------------------------------
    ...
    Call Trace:
     [<ffffffff815fa2ae>] __asan_report_load4_noabort+0x3e/0x40 mm/kasan/report.c:279
     [<ffffffff836938a2>] irtty_open+0x422/0x550 drivers/net/irda/irtty-sir.c:436
     [<ffffffff829f1b80>] tty_ldisc_open.isra.2+0x60/0xa0 drivers/tty/tty_ldisc.c:447
     [<ffffffff829f21c0>] tty_set_ldisc+0x1a0/0x940 drivers/tty/tty_ldisc.c:567
     [<     inline     >] tiocsetd drivers/tty/tty_io.c:2650
     [<ffffffff829da49e>] tty_ioctl+0xace/0x1fd0 drivers/tty/tty_io.c:2883
     [<     inline     >] vfs_ioctl fs/ioctl.c:43
     [<ffffffff816708ac>] do_vfs_ioctl+0x57c/0xe60 fs/ioctl.c:607
     [<     inline     >] SYSC_ioctl fs/ioctl.c:622
     [<ffffffff81671204>] SyS_ioctl+0x74/0x80 fs/ioctl.c:613
     [<ffffffff852a7876>] entry_SYSCALL_64_fastpath+0x16/0x7a

Reported-and-tested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 0f412b8aa88883f7e3059c5a2c1e56ce0dd8bf86)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

crypto: ccp - Don't assume export/import areas are aligned

Orabug: 23331052

[ Upstream commit b31dde2a5cb1bf764282abf934266b7193c2bc7c ]

Use a local variable for the exported and imported state so that
alignment is not an issue. On export, set a local variable from the
request context and then memcpy the contents of the local variable to
the export memory area. On import, memcpy the import memory area into
a local variable and then use the local variable to set the request
context.

Cc: <stable@vger.kernel.org> # 3.14.x-
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit b053d66b66e702f74ddc986863b30eaac41e7f4c)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

crypto: ccp - Limit the amount of information exported

Orabug: 23331051

[ Upstream commit d1662165ae612ec8b5f94a6b07e65ea58b6dce34 ]

Since the exported information can be exposed to user-space, instead of
exporting the entire request context only export the minimum information
needed.

Cc: <stable@vger.kernel.org> # 3.14.x-
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 5badf7e00f0968d820bf7bba9081339bfca3489c)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

tty: Fix GPF in flush_to_ldisc(), part 2

Orabug: 23331050

[ Upstream commit f33798deecbd59a2955f40ac0ae2bc7dff54c069 ]

commit 9ce119f318ba ("tty: Fix GPF in flush_to_ldisc()") fixed a
GPF caused by a line discipline which does not define a receive_buf()
method.

However, the vt driver (and speakup driver also) pushes selection
data directly to the line discipline receive_buf() method via
tty_ldisc_receive_buf(). Fix the same problem in tty_ldisc_receive_buf().

Cc: <stable@vger.kernel.org>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 3beac2b0bffd8e977ef1d464dd424f22af965a96)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

crypto: ccp - Add hash state import and export support

Orabug: 23331048

[ Upstream commit 952bce9792e6bf36fda09c2e5718abb5d9327369 ]

Commit 8996eafdcbad ("crypto: ahash - ensure statesize is non-zero")
added a check to prevent ahash algorithms from successfully registering
if the import and export functions were not implemented. This prevents
an oops in the hash_accept function of algif_hash. This commit causes
the ccp-crypto module SHA support and AES CMAC support from successfully
registering and causing the ccp-crypto module load to fail because the
ahash import and export functions are not implemented.

Update the CCP Crypto API support to provide import and export support
for ahash algorithms.

Cc: <stable@vger.kernel.org> # 3.14.x-
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 1ad241d40ef8d9a50ced5c35bf55e9a21a997516)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

[media] usbvision: fix crash on detecting device with invalid configuration

Orabug: stable_rc4

[ Upstream commit fa52bd506f274b7619955917abfde355e3d19ffe ]

The usbvision driver crashes when a specially crafted usb device with invalid
number of interfaces or endpoints is detected. This fix adds checks that the
device has proper configuration expected by the driver.

Reported-by: Ralf Spenneberg <ralf@spenneberg.net>
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 37dee22181885e7847e8c95843b6e94138edbd43)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

include/linux/poison.h: fix LIST_POISON{1,2} offset

Orabug: 23331045

[ Upstream commit 8a5e5e02fc83aaf67053ab53b359af08c6c49aaf ]

Poison pointer values should be small enough to find a room in
non-mmap'able/hardly-mmap'able space.  E.g.  on x86 "poison pointer space"
is located starting from 0x0.  Given unprivileged users cannot mmap
anything below mmap_min_addr, it should be safe to use poison pointers
lower than mmap_min_addr.

The current poison pointer values of LIST_POISON{1,2} might be too big for
mmap_min_addr values equal or less than 1 MB (common case, e.g.  Ubuntu
uses only 0x10000).  There is little point to use such a big value given
the "poison pointer space" below 1 MB is not yet exhausted.  Changing it
to a smaller value solves the problem for small mmap_min_addr setups.

The values are suggested by Solar Designer:
http://www.openwall.com/lists/oss-security/2015/05/02/6

Signed-off-by: Vasily Kulikov <segoon@openwall.com>
Cc: Solar Designer <solar@openwall.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 46460a03f44f1915ded434057fa46332438b3a6e)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

KEYS: Fix handling of stored error in a negatively instantiated user key

Orabug: stable_rc4

[ Upstream commit 096fe9eaea40a17e125569f9e657e34cdb6d73bd ]

If a user key gets negatively instantiated, an error code is cached in the
payload area.  A negatively instantiated key may be then be positively
instantiated by updating it with valid data.  However, the ->update key
type method must be aware that the error code may be there.

The following may be used to trigger the bug in the user key type:

    keyctl request2 user user "" @u
    keyctl add user user "a" @u

which manifests itself as:

BUG: unable to handle kernel paging request at 00000000ffffff8a
IP: [<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280 kernel/rcu/tree.c:3046
PGD 7cc30067 PUD 0
Oops: 0002 [#1] SMP
Modules linked in:
CPU: 3 PID: 2644 Comm: a.out Not tainted 4.3.0+ #49
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff88003ddea700 ti: ffff88003dd88000 task.ti: ffff88003dd88000
RIP: 0010:[<ffffffff810a376f>]  [<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280
[<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280 kernel/rcu/tree.c:3046
RSP: 0018:ffff88003dd8bdb0  EFLAGS: 00010246
RAX: 00000000ffffff82 RBX: 0000000000000000 RCX: 0000000000000001
RDX: ffffffff81e3fe40 RSI: 0000000000000000 RDI: 00000000ffffff82
RBP: ffff88003dd8bde0 R08: ffff88007d2d2da0 R09: 0000000000000000
R10: 0000000000000000 R11: ffff88003e8073c0 R12: 00000000ffffff82
R13: ffff88003dd8be68 R14: ffff88007d027600 R15: ffff88003ddea700
FS:  0000000000b92880(0063) GS:ffff88007fd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000ffffff8a CR3: 000000007cc5f000 CR4: 00000000000006e0
Stack:
ffff88003dd8bdf0 ffffffff81160a8a 0000000000000000 00000000ffffff82
ffff88003dd8be68 ffff88007d027600 ffff88003dd8bdf0 ffffffff810a39e5
ffff88003dd8be20 ffffffff812a31ab ffff88007d027600 ffff88007d027620
Call Trace:
[<ffffffff810a39e5>] kfree_call_rcu+0x15/0x20 kernel/rcu/tree.c:3136
[<ffffffff812a31ab>] user_update+0x8b/0xb0 security/keys/user_defined.c:129
[<     inline     >] __key_update security/keys/key.c:730
[<ffffffff8129e5c1>] key_create_or_update+0x291/0x440 security/keys/key.c:908
[<     inline     >] SYSC_add_key security/keys/keyctl.c:125
[<ffffffff8129fc21>] SyS_add_key+0x101/0x1e0 security/keys/keyctl.c:60
[<ffffffff8185f617>] entry_SYSCALL_64_fastpath+0x12/0x6a arch/x86/entry/entry_64.S:185

Note the error code (-ENOKEY) in EDX.

A similar bug can be tripped by:

    keyctl request2 trusted user "" @u
    keyctl add trusted user "a" @u

This should also affect encrypted keys - but that has to be correctly
parameterised or it will fail with EINVAL before getting to the bit that
will crashes.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit d979e967f848caf908a1401b7ad67cf13f06ef9f)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

KVM: x86: Reload pit counters for all channels when restoring state

Orabug: 23331042

[ Upstream commit 0185604c2d82c560dab2f2933a18f797e74ab5a8 ]

Currently if userspace restores the pit counters with a count of 0
on channels 1 or 2 and the guest attempts to read the count on those
channels, then KVM will perform a mod of 0 and crash. This will ensure
that 0 values are converted to 65536 as per the spec.

This is CVE-2015-7513.

Signed-off-by: Andy Honig <ahonig@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 90352f3f473a29db1289ec31facc1ac18cc66e9e)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

ovl: fix permission checking for setattr

Orabug: 23331041

[ Upstream commit acff81ec2c79492b180fade3c2894425cd35a545 ]

[Al Viro] The bug is in being too enthusiastic about optimizing ->setattr()
away - instead of "copy verbatim with metadata" + "chmod/chown/utimes"
(with the former being always safe and the latter failing in case of
insufficient permissions) it tries to combine these two. Note that copyup
itself will have to do ->setattr() anyway; _that_ is where the elevated
capabilities are right. Having these two ->setattr() (one to set verbatim
copy of metadata, another to do what overlayfs ->setattr() had been asked
to do in the first place) combined is where it breaks.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 2cadb57dff500076a87b934cac64bb5a2293b644)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

btrfs: async-thread: Fix a use-after-free error for trace

Orabug: 23331040

[ Upstream commit 0a95b851370b84a4b9d92ee6d1fa0926901d0454 ]

Parameter of trace_btrfs_work_queued() can be freed in its workqueue.
So no one use use that pointer after queue_work().

Fix the user-after-free bug by move the trace line before queue_work().

Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit b9a54ed91c7bbd5c18a4170be078d9f7e28560ed)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

btrfs: Fix no_space in write and rm loop

Orabug: 23331039

[ Upstream commit 08acfd9dd845dc052c5eae33e6c3976338070069 ]

commit e1746e8381cd2af421f75557b5cae3604fc18b35 upstream.

I see no_space in v4.4-rc1 again in xfstests generic/102.
It happened randomly in some node only.
(one of 4 phy-node, and a kvm with non-virtio block driver)

By bisect, we can found the first-bad is:
commit bdced438acd8 ("block: setup bi_phys_segments after splitting")'
But above patch only triggered the bug by making bio operation
faster(or slower).

Main reason is in our space_allocating code, we need to commit
page writeback before wait it complish, this patch fixed above
bug.

BTW, there is another reason for generic/102 fail, caused by
disable default mixed-blockgroup, I'll fix it in xfstests.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit d5b55a7aae08c0e0785430126bcc4a9ae7f5c737)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

btrfs: wait for delayed iputs on no space

Orabug: 23331038

[ Upstream commit 9a4e7276d39071576d369e607d7accb84b41d0b4 ]

btrfs will report no_space when we run following write and delete
file loop:
# FILE_SIZE_M=[ 75% of fs space ]
# DEV=[ some dev ]
# MNT=[ some dir ]
#
# mkfs.btrfs -f "$DEV"
# mount -o nodatacow "$DEV" "$MNT"
# for ((i = 0; i < 100; i++)); do dd if=/dev/zero of="$MNT"/file0 bs=1M count="$FILE_SIZE_M"; rm -f "$MNT"/file0; done
#

Reason:
iput() and evict() is run after write pages to block device, if
write pages work is not finished before next write, the "rm"ed space
is not freed, and caused above bug.

Fix:
We can add "-o flushoncommit" mount option to avoid above bug, but
it have performance problem. Actually, we can to wait for on-the-fly
writes only when no-space happened, it is which this patch do.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 42bd8f4fda813558c3045c60ad6436b1c7430ec7)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

security: let security modules use PTRACE_MODE_* with bitmasks

Orabug: 23331036

[ Upstream commit 3dfb7d8cdbc7ea0c2970450e60818bb3eefbad69 ]

It looks like smack and yama weren't aware that the ptrace mode
can have flags ORed into it - PTRACE_MODE_NOAUDIT until now, but
only for /proc/$pid/stat, and with the PTRACE_MODE_*CREDS patch,
all modes have flags ORed into them.

Signed-off-by: Jann Horn <jann@thejh.net>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: "Serge E. Hallyn" <serge.hallyn@ubuntu.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit ee6ad435c9872610c5a52cad02e331951cf2fb25)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

x86/entry/compat: Add missing CLAC to entry_INT80_32

Orabug: 23331033

[ Upstream commit 3d44d51bd339766f0178f0cf2e8d048b4a4872aa ]

This doesn't seem to fix a regression -- I don't think the CLAC was
ever there.

I double-checked in a debugger: entries through the int80 gate do
not automatically clear AC.

Stable maintainers: I can provide a backport to 4.3 and earlier if
needed. This needs to be backported all the way to 3.10.

Reported-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org> # v3.10 and later
Fixes: 63bcff2a307b ("x86, smap: Add STAC and CLAC instructions to control user space access")
Link: http://lkml.kernel.org/r/b02b7e71ae54074be01fc171cbd4b72517055c0e.1456345086.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 1f9780e372264c0cd71571c94da08cc49ae327e3)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

kernel/resource.c: fix muxed resource handling in __request_region()

Orabug: 23331032

[ Upstream commit 59ceeaaf355fa0fb16558ef7c24413c804932ada ]

In __request_region, if a conflict with a BUSY and MUXED resource is
detected, then the caller goes to sleep and waits for the resource to be
released.  A pointer on the conflicting resource is kept.  At wake-up
this pointer is used as a parent to retry to request the region.

A first problem is that this pointer might well be invalid (if for
example the conflicting resource have already been freed).  Another
problem is that the next call to __request_region() fails to detect a
remaining conflict.  The previously conflicting resource is passed as a
parameter and __request_region() will look for a conflict among the
children of this resource and not at the resource itself.  It is likely
to succeed anyway, even if there is still a conflict.

Instead, the parent of the conflicting resource should be passed to
__request_region().

As a fix, this patch doesn't update the parent resource pointer in the
case we have to wait for a muxed region right after.

Reported-and-tested-by: Vincent Pelletier <plr.vincent@gmail.com>
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Tested-by: Vincent Donnefort <vdonnefort@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit aa1311b426d5cc249887c8cbfa21a6dda2c5a201)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

btrfs: initialize the seq counter in struct btrfs_device

Orabug: 23331031

[ Upstream commit 546bed631203344611f42b2af1d224d2eedb4e6b ]

I managed to trigger this:
| INFO: trying to register non-static key.
| the code is fine but needs lockdep annotation.
| turning off the locking correctness validator.
| CPU: 1 PID: 781 Comm: systemd-gpt-aut Not tainted 4.4.0-rt2+ #14
| Hardware name: ARM-Versatile Express
| [<80307cec>] (dump_stack)
| [<80070e98>] (__lock_acquire)
| [<8007184c>] (lock_acquire)
| [<80287800>] (btrfs_ioctl)
| [<8012a8d4>] (do_vfs_ioctl)
| [<8012ac14>] (SyS_ioctl)

so I think that btrfs_device_data_ordered_init() is not invoked behind
a macro somewhere.

Fixes: 7cc8e58d53cd ("Btrfs: fix unprotected device's variants on 32bits machine")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 2068256b08824ad53b28fb08952b62dd35e66593)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots

Orabug: stable_rc4

[ Upstream commit f32e48e925964c4f8ab917850788a87e1cef3bad ]

The following call trace is seen when btrfs/031 test is executed in a loop,

[  158.661848] ------------[ cut here ]------------
[  158.662634] WARNING: CPU: 2 PID: 890 at /home/chandan/repos/linux/fs/btrfs/ioctl.c:558 create_subvol+0x3d1/0x6ea()
[  158.664102] BTRFS: Transaction aborted (error -2)
[  158.664774] Modules linked in:
[  158.665266] CPU: 2 PID: 890 Comm: btrfs Not tainted 4.4.0-rc6-g511711a #2
[  158.666251] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[  158.667392]  ffffffff81c0a6b0 ffff8806c7c4f8e8 ffffffff81431fc8 ffff8806c7c4f930
[  158.668515]  ffff8806c7c4f920 ffffffff81051aa1 ffff880c85aff000 ffff8800bb44d000
[  158.669647]  ffff8808863b5c98 0000000000000000 00000000fffffffe ffff8806c7c4f980
[  158.670769] Call Trace:
[  158.671153]  [<ffffffff81431fc8>] dump_stack+0x44/0x5c
[  158.671884]  [<ffffffff81051aa1>] warn_slowpath_common+0x81/0xc0
[  158.672769]  [<ffffffff81051b27>] warn_slowpath_fmt+0x47/0x50
[  158.673620]  [<ffffffff813bc98d>] create_subvol+0x3d1/0x6ea
[  158.674440]  [<ffffffff813777c9>] btrfs_mksubvol.isra.30+0x369/0x520
[  158.675376]  [<ffffffff8108a4aa>] ? percpu_down_read+0x1a/0x50
[  158.676235]  [<ffffffff81377a81>] btrfs_ioctl_snap_create_transid+0x101/0x180
[  158.677268]  [<ffffffff81377b52>] btrfs_ioctl_snap_create+0x52/0x70
[  158.678183]  [<ffffffff8137afb4>] btrfs_ioctl+0x474/0x2f90
[  158.678975]  [<ffffffff81144b8e>] ? vma_merge+0xee/0x300
[  158.679751]  [<ffffffff8115be31>] ? alloc_pages_vma+0x91/0x170
[  158.680599]  [<ffffffff81123f62>] ? lru_cache_add_active_or_unevictable+0x22/0x70
[  158.681686]  [<ffffffff813d99cf>] ? selinux_file_ioctl+0xff/0x1d0
[  158.682581]  [<ffffffff8117b791>] do_vfs_ioctl+0x2c1/0x490
[  158.683399]  [<ffffffff813d3cde>] ? security_file_ioctl+0x3e/0x60
[  158.684297]  [<ffffffff8117b9d4>] SyS_ioctl+0x74/0x80
[  158.685051]  [<ffffffff819b2bd7>] entry_SYSCALL_64_fastpath+0x12/0x6a
[  158.685958] ---[ end trace 4b63312de5a2cb76 ]---
[  158.686647] BTRFS: error (device loop0) in create_subvol:558: errno=-2 No such entry
[  158.709508] BTRFS info (device loop0): forced readonly
[  158.737113] BTRFS info (device loop0): disk space caching is enabled
[  158.738096] BTRFS error (device loop0): Remounting read-write after error is not allowed
[  158.851303] BTRFS error (device loop0): cleaner transaction attach returned -30

This occurs because,

Mount filesystem
Create subvol with ID 257
Unmount filesystem
Mount filesystem
Delete subvol with ID 257
  btrfs_drop_snapshot()
    Add root corresponding to subvol 257 into
    btrfs_transaction->dropped_roots list
Create new subvol (i.e. create_subvol())
  257 is returned as the next free objectid
  btrfs_read_fs_root_no_name()
    Finds the btrfs_root instance corresponding to the old subvol with ID 257
    in btrfs_fs_info->fs_roots_radix.
    Returns error since btrfs_root_item->refs has the value of 0.

To fix the issue the commit initializes tree root's and subvolume root's
highest_objectid when loading the roots from disk.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit c19cd7e350c5ad9f3f4b3e486dbb4dab737a22a4)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

Btrfs: fix transaction handle leak on failure to create hard link

Orabug: 23331029

[ Upstream commit 271dba4521aed0c37c063548f876b49f5cd64b2e ]

If we failed to create a hard link we were not always releasing the
the transaction handle we got before, resulting in a memory leak and
preventing any other tasks from being able to commit the current
transaction.
Fix this by always releasing our transaction handle.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 9bf972e8aa6110d750cf1ddab68511f478a6a751)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

Btrfs: fix number of transaction units required to create symlink

Orabug: 23331027

[ Upstream commit 9269d12b2d57d9e3d13036bb750762d1110d425c ]

We weren't accounting for the insertion of an inline extent item for the
symlink inode nor that we need to update the parent inode item (through
the call to btrfs_add_nondir()). So fix this by including two more
transaction units.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit a1f535acffbd95ae6ae81656e8bba39af094c3f0)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

Btrfs: send, don't BUG_ON() when an empty symlink is found

Orabug: 23331026

[ Upstream commit a879719b8c90e15c9e7fa7266d5e3c0ca962f9df ]

When a symlink is successfully created it always has an inline extent
containing the source path. However if an error happens when creating
the symlink, we can leave in the subvolume's tree a symlink inode without
any such inline extent item - this happens if after btrfs_symlink() calls
btrfs_end_transaction() and before it calls the inode eviction handler
(through the final iput() call), the transaction gets committed and a
crash happens before the eviction handler gets called, or if a snapshot
of the subvolume is made before the eviction handler gets called. Sadly
we can't just avoid this by making btrfs_symlink() call
btrfs_end_transaction() after it calls the eviction handler, because the
later can commit the current transaction before it removes any items from
the subvolume tree (if it encounters ENOSPC errors while reserving space
for removing all the items).

So make send fail more gracefully, with an -EIO error, and print a
message to dmesg/syslog informing that there's an empty symlink inode,
so that the user can delete the empty symlink or do something else
about it.

Reported-by: Stephen R. van den Berg <srb@cuci.nl>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit e92c51b734d57e04f0c9b43106b5294834339be6)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

btrfs: statfs: report zero available if metadata are exhausted

Orabug: 23331025

[ Upstream commit ca8a51b3a979d57b082b14eda38602b7f52d81d1 ]

There is one ENOSPC case that's very confusing. There's Available
greater than zero but no file operation succeds (besides removing
files). This happens when the metadata are exhausted and there's no
possibility to allocate another chunk.

In this scenario it's normal that there's still some space in the data
chunk and the calculation in df reflects that in the Avail value.

To at least give some clue about the ENOSPC situation, let statfs report
zero value in Avail, even if there's still data space available.

Current:
/dev/sdb1 4.0G 3.3G 719M 83% /mnt/test

New:
/dev/sdb1 4.0G 3.3G 0 100% /mnt/test

We calculate the remaining metadata space minus global reserve. If this
is (supposedly) smaller than zero, there's no space. But this does not
hold in practice, the exhausted state happens where's still some
positive delta. So we apply some guesswork and compare the delta to a 4M
threshold. (Practically observed delta was 2M.)

We probably cannot calculate the exact threshold value because this
depends on the internal reservations requested by various operations, so
some operations that consume a few metadata will succeed even if the
Avail is zero. But this is better than the other way around.

Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 4e3fa12f124507ad17f999c28ef35803596ff2c6)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

Btrfs: igrab inode in writepage

Orabug: 23331024

[ Upstream commit be7bd730841e69fe8f70120098596f648cd1f3ff ]

We hit this panic on a few of our boxes this week where we have an
ordered_extent with an NULL inode.  We do an igrab() of the inode in writepages,
but weren't doing it in writepage which can be called directly from the VM on
dirty pages.  If the inode has been unlinked then we could have I_FREEING set
which means igrab() would return NULL and we get this panic.  Fix this by trying
to igrab in btrfs_writepage, and if it returns NULL then just redirty the page
and return AOP_WRITEPAGE_ACTIVATE; so the VM knows it wasn't successful.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit bb055e837f904fdc80d4d82819b0a9aaf35dce4a)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

Btrfs: add missing brelse when superblock checksum fails

Orabug: 23331023

[ Upstream commit b2acdddfad13c38a1e8b927d83c3cf321f63601a ]

Looks like oversight, call brelse() when checksum fails. Further down the
code, in the non error path, we do call brelse() and so we don't see
brelse() in the goto error paths.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit c0109d289de5a48e54a2d070981a629fc241f112)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

ext4: fix bh->b_state corruption

Orabug: 23331022

[ Upstream commit ed8ad83808f009ade97ebbf6519bc3a97fefbc0c ]

ext4 can update bh->b_state non-atomically in _ext4_get_block() and
ext4_da_get_block_prep(). Usually this is fine since bh is just a
temporary storage for mapping information on stack but in some cases it
can be fully living bh attached to a page. In such case non-atomic
update of bh->b_state can race with an atomic update which then gets
lost. Usually when we are mapping bh and thus updating bh->b_state
non-atomically, nobody else touches the bh and so things work out fine
but there is one case to especially worry about: ext4_finish_bio() uses
BH_Uptodate_Lock on the first bh in the page to synchronize handling of
PageWriteback state. So when blocksize < pagesize, we can be atomically
modifying bh->b_state of a buffer that actually isn't under IO and thus
can race e.g. with delalloc trying to map that buffer. The result is
that we can mistakenly set / clear BH_Uptodate_Lock bit resulting in the
corruption of PageWriteback state or missed unlock of BH_Uptodate_Lock.

Fix the problem by always updating bh->b_state bits atomically.

CC: stable@vger.kernel.org
Reported-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit a7635d6a0849007d2192bc02c038cc1b9d91b274)

Signed-off-by: Dan Duval <dan.duval@oracle.com>

dax: don't abuse get_block mapping for endio callbacks

Orabug: 23331020

[ Upstream commit e842f2903908934187af7232fb5b21da527d1757 ]

dax_fault() currently relies on the get_block callback to attach an
io completion callback to the mapping buffer head so that it can
run unwritten extent conversion after zeroing allocated blocks.

Instead of this hack, pass the conversion callback directly into
dax_fault() similar to the get_block callback. When the filesystem
allocates unwritten extents, it will set the buffer_unwritten()
flag, and hence the dax_fault code can call the completion function
in the contexts where it is necessary without overloading the
mapping buffer head.

Note: The changes to ext4 to use this interface are suspect at best.
In fact, the way ext4 did this end_io assignment in the first place
looks suspect because it only set a completion callback when there
wasn't already some other write() call taking place on the same
inode. The ext4 end_io code looks rather intricate and fragile with
all it's reference counting and passing to different contexts for
modification via inode private pointers that aren't protected by
locks...

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
(cherry picked from commit 0e3029cfab4a7884b32e1b2d3c19c12eded9804a)

Signed-off-by: Dan Duval <dan.duval@oracle.com>