]> www.infradead.org Git - users/jedix/linux-maple.git/log
users/jedix/linux-maple.git
8 years agoMerge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Brian Maly [Fri, 5 Aug 2016 16:45:49 +0000 (12:45 -0400)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

8 years agoKEYS: potential uninitialized variable
Dan Carpenter [Thu, 16 Jun 2016 14:48:57 +0000 (15:48 +0100)]
KEYS: potential uninitialized variable

Orabug: 24393849
CVE: CVE-2016-4470

If __key_link_begin() failed then "edit" would be uninitialized.  I've
added a check to fix that.

This allows a random user to crash the kernel, though it's quite
difficult to achieve.  There are three ways it can be done as the user
would have to cause an error to occur in __key_link():

 (1) Cause the kernel to run out of memory.  In practice, this is difficult
     to achieve without ENOMEM cropping up elsewhere and aborting the
     attempt.

 (2) Revoke the destination keyring between the keyring ID being looked up
     and it being tested for revocation.  In practice, this is difficult to
     time correctly because the KEYCTL_REJECT function can only be used
     from the request-key upcall process.  Further, users can only make use
     of what's in /sbin/request-key.conf, though this does including a
     rejection debugging test - which means that the destination keyring
     has to be the caller's session keyring in practice.

 (3) Have just enough key quota available to create a key, a new session
     keyring for the upcall and a link in the session keyring, but not then
     sufficient quota to create a link in the nominated destination keyring
     so that it fails with EDQUOT.

The bug can be triggered using option (3) above using something like the
following:

echo 80 >/proc/sys/kernel/keys/root_maxbytes
keyctl request2 user debug:fred negate @t

The above sets the quota to something much lower (80) to make the bug
easier to trigger, but this is dependent on the system.  Note also that
the name of the keyring created contains a random number that may be
between 1 and 10 characters in size, so may throw the test off by
changing the amount of quota used.

Assuming the failure occurs, something like the following will be seen:

kfree_debugcheck: out of range ptr 6b6b6b6b6b6b6b68h
------------[ cut here ]------------
kernel BUG at ../mm/slab.c:2821!
...
RIP: 0010:[<ffffffff811600f9>] kfree_debugcheck+0x20/0x25
RSP: 0018:ffff8804014a7de8  EFLAGS: 00010092
RAX: 0000000000000034 RBX: 6b6b6b6b6b6b6b68 RCX: 0000000000000000
RDX: 0000000000040001 RSI: 00000000000000f6 RDI: 0000000000000300
RBP: ffff8804014a7df0 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8804014a7e68 R11: 0000000000000054 R12: 0000000000000202
R13: ffffffff81318a66 R14: 0000000000000000 R15: 0000000000000001
...
Call Trace:
  kfree+0xde/0x1bc
  assoc_array_cancel_edit+0x1f/0x36
  __key_link_end+0x55/0x63
  key_reject_and_link+0x124/0x155
  keyctl_reject_key+0xb6/0xe0
  keyctl_negate_key+0x10/0x12
  SyS_keyctl+0x9f/0xe7
  do_syscall_64+0x63/0x13a
  entry_SYSCALL64_slow_path+0x25/0x25

Fixes: f70e2e06196a ('KEYS: Do preallocation for __key_link()')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 38327424b40bcebe2de92d07312c89360ac9229a)
Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agoMerge branch topic/uek-4.1/rpm-build of git://ca-git.us.oracle.com/linux-uek into...
Chuck Anderson [Tue, 2 Aug 2016 23:18:48 +0000 (16:18 -0700)]
Merge branch topic/uek-4.1/rpm-build of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

8 years agool6-spec: update linux-firmware dependency to 20160616-44.git43e96a1e.0.10
Chuck Anderson [Tue, 2 Aug 2016 23:15:13 +0000 (16:15 -0700)]
ol6-spec: update linux-firmware dependency to 20160616-44.git43e96a1e.0.10

Orabug: 24378297

Update the linux-firmware dependency to 20160616-44.git43e96a1e.0.10

Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Reviewed-by: Guru Anbalagane <guru.anbalagane@oracle.com>
8 years agool7-spec: update dracut version dependency to 033-360.0.3
Chuck Anderson [Tue, 2 Aug 2016 23:01:29 +0000 (16:01 -0700)]
ol7-spec: update dracut version dependency to 033-360.0.3

Orabug: 24378292

Update the dracut version dependency to 033-360.0.3

Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Reviewed-by: Guru Anbalagane <guru.anbalagane@oracle.com>
8 years agoMerge branch 'topic/uek-4.1/ofed' of git://ca-git.us.oracle.com/linux-uek into uek...
Chuck Anderson [Tue, 2 Aug 2016 07:00:28 +0000 (00:00 -0700)]
Merge branch 'topic/uek-4.1/ofed' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

8 years agoMerge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Tue, 2 Aug 2016 06:59:32 +0000 (23:59 -0700)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

8 years ago[2d8747c2] fixup! blk-mq: prevent double-unlock of mutex
Dan Duval [Fri, 29 Jul 2016 17:20:06 +0000 (13:20 -0400)]
[2d8747c2] fixup! blk-mq: prevent double-unlock of mutex

Orabug: 24376549

Commit 2d8747c28478f85d9f04292780b1432edd2a384e ("blk-mq: avoid inserting
requests before establishing new mapping") introduced an extraneous call
to mutex_unlock().  The result is that the mutex gets unlocked twice.

Remove the extra call.

Signed-off-by: Dan Duval <dan.duval@oracle.com>
9 years agoMerge branch topic/uek-4.1/xen of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
Chuck Anderson [Sun, 31 Jul 2016 18:18:42 +0000 (11:18 -0700)]
Merge branch topic/uek-4.1/xen of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoMerge branch 'topic/uek-4.1/ofed' of git://ca-git.us.oracle.com/linux-uek into uek...
Chuck Anderson [Sun, 31 Jul 2016 18:17:59 +0000 (11:17 -0700)]
Merge branch 'topic/uek-4.1/ofed' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoMerge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Sun, 31 Jul 2016 18:15:28 +0000 (11:15 -0700)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoMerge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Sun, 31 Jul 2016 18:13:42 +0000 (11:13 -0700)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoMerge branch topic/uek-4.1/stable-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Sun, 31 Jul 2016 05:55:22 +0000 (22:55 -0700)]
Merge branch topic/uek-4.1/stable-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoxen-pciback: mark device to be hidden on AER error trigger
Elena Ufimtseva [Thu, 21 Jul 2016 21:25:27 +0000 (17:25 -0400)]
xen-pciback: mark device to be hidden on AER error trigger

Some platforms are configured to reboot the machine upon
AER unrecoverable error and some virtualized systems are subject
to security risks described in XSA-124.
This patch allows for simple AER unrecoverable errors containment
together with killing the guest upon receiving of fatal AER error.
Patch stores in xenstore sbdf of passed through device that triggered
AER unrecoverable error. This will allow xend to make device
unassignable until next reboot or special hypervisor hypercall.

OraBug: 24377669

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Adnan Misherfi <adnan.misherfi@oracle.com>
9 years agotcp: make challenge acks less predictable
Eric Dumazet [Sun, 10 Jul 2016 08:04:02 +0000 (10:04 +0200)]
tcp: make challenge acks less predictable

Yue Cao claims that current host rate limiting of challenge ACKS
(RFC 5961) could leak enough information to allow a patient attacker
to hijack TCP sessions. He will soon provide details in an academic
paper.

This patch increases the default limit from 100 to 1000, and adds
some randomization so that the attacker can no longer hijack
sessions without spending a considerable amount of probes.

Based on initial analysis and patch from Linus.

Note that we also have per socket rate limiting, so it is tempting
to remove the host limit in the future.

v2: randomize the count of challenge acks per second, not the period.

Fixes: 282f23c6ee34 ("tcp: implement RFC 5961 3.2")
Reported-by: Yue Cao <ycao009@ucr.edu>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Orabug: 2401010
Conflicts:
net/ipv4/tcp_input.c
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
9 years agoext4: update c/mtime on truncate up
Eryu Guan [Tue, 28 Jul 2015 19:08:41 +0000 (15:08 -0400)]
ext4: update c/mtime on truncate up

Commit 3da40c7b0898 ("ext4: only call ext4_truncate when size <= isize")
introduced a bug that c/mtime is not updated on truncate up.

Fix the issue by setting c/mtime explicitly in the truncate up case.

Note that ftruncate(2) is not affected, so you won't see this bug using
truncate(1) and xfs_io(1).

Orabug: 24377419

Signed-off-by: Zirong Lang <zorro.lang@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-By: Dan Duval <dan.duval@oracle.com>
9 years agovfs: rename: check backing inode being equal
Miklos Szeredi [Tue, 10 May 2016 23:16:37 +0000 (01:16 +0200)]
vfs: rename: check backing inode being equal

If a file is renamed to a hardlink of itself POSIX specifies that rename(2)
should do nothing and return success.

This condition is checked in vfs_rename().  However it won't detect hard
links on overlayfs where these are given separate inodes on the overlayfs
layer.

Overlayfs itself detects this condition and returns success without doing
anything, but then vfs_rename() will proceed as if this was a successful
rename (detach_mounts(), d_move()).

The correct thing to do is to detect this condition before even calling
into overlayfs.  This patch does this by calling vfs_select_inode() to get
the underlying inodes.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # v4.2+
Orabug: 24363418
CVE:CVE-2016-6198,CVE-2016-6197
Same as mainline v4.6 commit 9409e22acdfc9153f88d9b1ed2bd2a5b34d2d3ca
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
9 years agovfs: add vfs_select_inode() helper
Miklos Szeredi [Thu, 21 Jul 2016 20:30:58 +0000 (13:30 -0700)]
vfs: add vfs_select_inode() helper

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # v4.2+
Orabug: 24363418
CVE:CVE-2016-6198,CVE-2016-6197
Based on mainline v4.6 commit 54d5ca871e72f2bb172ec9323497f01cd5091ec7
Conflicts:
  include/linux/dcache.h - code base
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
9 years agoovl: verify upper dentry before unlink and rename
Miklos Szeredi [Thu, 21 Jul 2016 20:24:59 +0000 (13:24 -0700)]
ovl: verify upper dentry before unlink and rename

Unlink and rename in overlayfs checked the upper dentry for staleness by
verifying upper->d_parent against upperdir.  However the dentry can go
stale also by being unhashed, for example.

Expand the verification to actually look up the name again (under parent
lock) and check if it matches the upper dentry.  This matches what the VFS
does before passing the dentry to filesytem's unlink/rename methods, which
excludes any inconsistency caused by overlayfs.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Orabug: 24363418
CVE:CVE-2016-6198,CVE-2016-6197
Based on mainline v4.6 commit 11f3710417d026ea2f4fcf362d866342c5274185
Conflicts:
  fs/overlayfs/dir.c - code base
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
9 years agoovl: fix getcwd() failure after unsuccessful rmdir
Rui Wang [Thu, 21 Jul 2016 20:21:39 +0000 (13:21 -0700)]
ovl: fix getcwd() failure after unsuccessful rmdir

ovl_remove_upper() should do d_drop() only after it successfully
removes the dir, otherwise a subsequent getcwd() system call will
fail, breaking userspace programs.

This is to fix: https://bugzilla.kernel.org/show_bug.cgi?id=110491

Signed-off-by: Rui Wang <rui.y.wang@intel.com>
Reviewed-by: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>
Orabug: 24363418
CVE:CVE-2016-6198,CVE-2016-6197
Based on mainline v4.5 commit ce9113bbcbf45a57c082d6603b9a9f342be3ef74
Pre-req for mainline v4.6 commit 11f3710417d026ea2f4fcf362d866342c5274185
Conflicts:
  fs/overlayfs/dir.c - code base
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
9 years agoEPSC_API_VERSION(2,8) - New EPSC_QUERY_ON_CHIP_TEMP
Lars Paul Huse [Fri, 22 Jul 2016 21:21:30 +0000 (23:21 +0200)]
EPSC_API_VERSION(2,8) - New EPSC_QUERY_ON_CHIP_TEMP

Also added a new EPSA_GET_EXPORTED_SYMBOL_MAP
which returns a list of exported EPSA runtime symbols.

Orabugs: 2431774623168922

Change-Id: I97c600950fefe6649ec0a8b6539f7225d78aa9c4
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: pqp: Be less aggressive in invoking cond_resched()
Knut Omang [Fri, 22 Jul 2016 12:30:24 +0000 (14:30 +0200)]
sif: pqp: Be less aggressive in invoking cond_resched()

This commit attempts to avoid unnecessary or potentially dangerous
calls to cond_resched() in the privileged QP completion polling code.

Privileged QP requests typically takes a few microseconds to
complete. Since we usually need the result of the operation
to be able to continue, user code usually calls poll_cq_waitfor()
to busy wait for the completion of the request. This commit
adds two measures to make this logic better:

1) Avoid rescheduling while interrupts have been turned off:
The driver was using the in_interrupt() test to avoid calling
cond_resched() from interrupt context, and leaving the rest of
the decision making of whether or not to reschedule to cond_resched().
Testing indicates that this could lead to deadlock prone calls to
schedule(), as cond_resched() would actually allow rescheduling if interrupts
have been disabled. Switch this logic to use irqs_disabled() instead, which will
cover both the interrupt case and the cases where interrupts have been disabled
by the caller.

2) Busywait for the completion for a few cycles before even trying to reschedule
or cpu_relax. Measurements indicate that 10 tries are enough to cover
a large fraction of cases on a lightly loaded system.

Orabug: 23733539

Change-Id: Ief35e1828d4dde9b692640f259c3df80ccdb553b
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Francisco Trivino-Garcia <francisco.trivino@oracle.com>
9 years agosif: xrc: Add handling for xrc_domain_violation & invalid_xrceth events
Vinay Shaw [Sat, 16 Jul 2016 06:32:54 +0000 (08:32 +0200)]
sif: xrc: Add handling for xrc_domain_violation & invalid_xrceth events

XRC spec defines xrc_domain_violation and invalid_xrceth events as
affiliated asynchronous error on XRC TGTQP, the QP is marked in ERROR.
Map these to ofed's IB_EVENT_QP_FATAL type and qp event handler.

Orabug: 24318556

Signed-off-by: Vinay Shaw <vinay.shaw@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: dfs: Minor change to print CQ tied to XRCSRQ (rq_hw).
Vinay Shaw [Tue, 19 Jul 2016 13:37:08 +0000 (15:37 +0200)]
sif: dfs: Minor change to print CQ tied to XRCSRQ (rq_hw).

Orabug: 24318845

Signed-off-by: Vinay Shaw <vinay.shaw@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: During driver load, hold back events instead of ignoring them
Knut Omang [Wed, 20 Jul 2016 12:03:13 +0000 (14:03 +0200)]
sif: During driver load, hold back events instead of ignoring them

The current semantics when a queued event is received before the driver
is done loading is to ignore it with a log warning. This was not sufficient
to implement the flush_retry_qp setup, which relies on lid change events.

Unfortunately the solution to make an exception for lid events for the
flush_retry_qp is not valid because it defeats the purpose of the
check in the first place by allowing such an event to be handled before
the data structure needed to handle it is initialized.

This commit introduces a new kernel completion that the driver completes
when the whole driver load is finished. The first EPSC event queued on the
sif work queue will now block on this completion.

This covers all the remaining cases not handled by commit
"eq: Avoid enabling interrupts on TSU EQs until the initialization is complete"
and solves the original problem that introduced the need for a fix.

Orabug: 24296729

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
9 years agosif: Let sif_remove implement the shutdown entry point
Knut Omang [Wed, 20 Jul 2016 06:15:24 +0000 (08:15 +0200)]
sif: Let sif_remove implement the shutdown entry point

Due to bugs in the FLR handling on PSIF 2.1, we need to make sure
that the driver gets to do a full unload whenever possible.
Simply let the shutdown entry point be sif_remove similar
to the remove entry point.

Orabug: 24322970

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
9 years agosif: pqp: Fix potential null pointer exception under high load
Knut Omang [Tue, 19 Jul 2016 07:23:05 +0000 (09:23 +0200)]
sif: pqp: Fix potential null pointer exception under high load

If a high number of invalidate requests are posted
without requesting completions, the PQP may run full
enough not to be able to allow a posted req anymore.

To handle this scenario, an additional attempt to send a
synchronous invalidate request was added. Unfortunately
that request ended up being posted with synchronous semantics
but without a handle to handle the completion.

This commit fixes this case by dynamically allocating/freeing
a handle in such situations.

Orabug: 24316139

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
9 years agosif: fmr: call sif_post_flush_tlb with ptw flush and in SR/IOV cases
Knut Omang [Tue, 19 Jul 2016 05:54:45 +0000 (07:54 +0200)]
sif: fmr: call sif_post_flush_tlb with ptw flush and in SR/IOV cases

Flush the ptw cache as part of the bulk tlb flush.

With the introduction of more dynamic page tables,
the code to distinguish between the two types of cases,
simple page table entries and entries with interior nodes
was simplified because we no longer always know if PTW entries
have been consumed. Unfortunately the code was unified to the
wrong case which does not flush the ptw cache possibly causing
cache entries to remain in the cache and theoretically have
sif look up pages that no longer exists or that has been reused
for other purposes.

Also now that the EPSC handles the tlb flushing, it is just fine
to call that code even in virtualized setups.
Remove tests for whether VFs exists or we are running in a VF.

Orabug: 24315529

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
9 years agosif: eq: Avoid enabling interrupts on TSU EQs until the initialization is complete
Knut Omang [Mon, 18 Jul 2016 11:30:01 +0000 (13:30 +0200)]
sif: eq: Avoid enabling interrupts on TSU EQs until the initialization is complete

During driver load we might have some rare conditions where external events
are occuring before the driver is ready to accept them. The hardware workarounds
to handle issues with QP flushing are particularly sensitive to this.

Delay enabling of the IRQs that can generate interrupts for all
event queues except the EPS event queue(s) until everything is set up and ready.

Note that this commit will also implicitly cause interrupts for EQs 1-3 for each EPSA
not to be enabled. This is no big deal as they are currently not used anyway.

Orabug: 24296729

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
9 years agosif: base: change default queue size according to ED scale_profile=1
Hakon Bugge [Thu, 14 Jul 2016 11:34:18 +0000 (13:34 +0200)]
sif: base: change default queue size according to ED scale_profile=1

Make queue sizes of PSIF equal to those of cx3 when using ED's
scale_profile=1.

Signed-off-by: Hakon Bugge <Haakon.Bugge@oracle.com>
Orabug: 23141108
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: sif_eq: fix missing qp->refcnt decrement for COMM_EST events
Francisco Triviño [Wed, 13 Jul 2016 15:23:40 +0000 (17:23 +0200)]
sif: sif_eq: fix missing qp->refcnt decrement for COMM_EST events

qp->refcnt is increased by 1 when event_status_communication_established
is dispatched but later it is not decremented when handling the event
work for IB_EVENT_COMM_EST for UD & RAW QP types.

This commit decrements qp->refcnt for those cases too and fixes another
potential bug by moving the sif_log line up before the qp->refcnt
decrement.

Orabug: 24288467

Signed-off-by: Francisco Triviño <francisco.trivino@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agoEPSC_API_VERSION(2,6) - Adding retrieval of SMP and vlink connect modes
Harald Høeg [Fri, 15 Jul 2016 08:03:26 +0000 (10:03 +0200)]
EPSC_API_VERSION(2,6) - Adding retrieval of SMP and vlink connect modes

Orabug: 23634562

Change-Id: Ic3eea7a7297c9ff97e72cb25dda4ba44fdfa2937
Signed-off-by: Harald Høeg <harald.hoeg@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: eq: increase cq_eq_max to 46
Hakon Bugge [Fri, 8 Jul 2016 10:39:19 +0000 (12:39 +0200)]
sif: eq: increase cq_eq_max to 46

PSIF supports 48 msi-x interrupts. We associate one msi-x per event
queue (EQ). Further, PSIF need one eq for epsc and one for async
events from the hardware. That leaves 46 for completion notification
events or completion vectors.

This commit also reduces the number of completion notification event
queues to the lesser of the number of cpus present and the default.

Note, this requires fw 1.0.0.1 or newer...

Orabug: 23705843

Change-Id: Iea9101bf09203dff86403453a7e0690cb31b3756
Signed-off-by: Hakon Bugge <Haakon.Bugge@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: sif_r3: implemented WA#4074 stats counters
Triviño [Thu, 30 Jun 2016 14:30:07 +0000 (16:30 +0200)]
sif: sif_r3: implemented WA#4074 stats counters

This commit added both wa4074 and wa4059 statistics
to help to identify potential issues when the work-
around are applied.

The wa4074 stats implementation is based on:

a) pre_wa4074_cnt == post_wa4074_cnt. This means the
w/a is triggered from the modify_qp_hw.
b) pre_wa4074_cnt != post_wa4074_cnt. post_wa4074 is
triggered from other scenarios too.
c) post_wa4074_err_cnt != 0. It means that post_wa4074
fails.
d) wrs_csum_corr_wa4074_cnt indicates the number of
WRs that were csum corrupted.
e) rcv_snd_gen_wa4074_cnt shows the number of recv
and send cqe's were generated.

The wa4059 stats indicate the number of keep-alive
events that have been sent.

This commit also improves wa3714 stats implementation
by using atomic64 counters and enumeration values,
and other minor changes such as clean up and fix
typos on comment messages.

Orabug: 23760170

Signed-off-by: Triviño <francisco.trivino@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
9 years agosif: Remove software emulation of > 16 SGEs
Hans Westgaard Ry [Wed, 6 Jul 2016 10:44:35 +0000 (12:44 +0200)]
sif: Remove software emulation of > 16 SGEs

Orabug: 24310514

Change-Id: I1886d138b0ff103b074c45da475b3052dd1fd9b1
Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
9 years agosif: rq: Do not clear the rq_sw until the completion of flush_rq
Wei Lin Guay [Tue, 5 Jul 2016 11:51:37 +0000 (13:51 +0200)]
sif: rq: Do not clear the rq_sw until the completion of flush_rq

Orabug: 23754857

The rq can be invalidated from reset_qp or flush_rq. Nevertheless,
the rq_sw data structure has been reset after rq is invalidated in
reset_qp regardless of the completion of the flush_rq. Thus, move
the rq synchronization to reset_qp, and place the synchronization
in between of reset_qp and flush_rq. After invalidating and
reseting the rq, no flush rq is required as both head and tail
have been reset to 0.

This commit creates another atomic_t variable for the synchronization
between reset rq and flush_rq.

Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agoIBCM: dereference timewait_info only when needed
Santosh Shilimkar [Tue, 19 Jul 2016 02:35:26 +0000 (19:35 -0700)]
IBCM: dereference timewait_info only when needed

timewait_info is available in valid CM states and may
not be even allocated in invalid states.

Lets move the dereferencing only when we need in
those valid state.

Orabug: 24326732

Reviewed-by: Hakon Bugge <Haakon.Bugge@oracle.com>
Tested-by: Efrain Galaviz <efrain.galaviz@oracle.com>
Tested-by: Hong Liu <hong.x.liu@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoMerge branch topic/uek-4.1/uek-carry of git://ca-git.us.oracle.com/linux-uek into...
Chuck Anderson [Sat, 16 Jul 2016 07:06:04 +0000 (00:06 -0700)]
Merge branch topic/uek-4.1/uek-carry of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoMerge branch topic/uek-4.1/xen of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
Chuck Anderson [Sat, 16 Jul 2016 07:04:49 +0000 (00:04 -0700)]
Merge branch topic/uek-4.1/xen of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoMerge branch 'topic/uek-4.1/drivers' of git://ca-git.us.oracle.com/linux-uek into...
Chuck Anderson [Sat, 16 Jul 2016 06:55:14 +0000 (23:55 -0700)]
Merge branch 'topic/uek-4.1/drivers' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoMerge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Sat, 16 Jul 2016 06:53:50 +0000 (23:53 -0700)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoblock: Initialize max_dev_sectors to 0
Keith Busch [Wed, 10 Feb 2016 23:52:47 +0000 (16:52 -0700)]
block: Initialize max_dev_sectors to 0

Orabug: 23615929

(commit 5f009d3f8e6685fe8c6215082c1696a08b411220 of upstream)
The new queue limit is not used by the majority of block drivers, and
should be initialized to 0 for the driver's requested settings to be used.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>
9 years agosd: Fix rw_max for devices that report an optimal xfer size
Martin K. Petersen [Fri, 13 May 2016 02:17:34 +0000 (22:17 -0400)]
sd: Fix rw_max for devices that report an optimal xfer size

Orabug: 23615929

(commit 6b7e9cde49691e04314342b7dce90c67ad567fcc of upstream)
For historic reasons, io_opt is in bytes and max_sectors in block layer
sectors. This interface inconsistency is error prone and should be
fixed. But for 4.4--4.7 let's make the unit difference explicit via a
wrapper function.

Fixes: d0eb20a863ba ("sd: Optimal I/O size is in bytes, not sectors")
Cc: stable@vger.kernel.org # 4.4+
Reported-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Andrew Patterson <andrew.patterson@hpe.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>
9 years agosd: Fix excessive capacity printing on devices with blocks bigger than 512 bytes
Martin K. Petersen [Tue, 29 Mar 2016 01:18:56 +0000 (21:18 -0400)]
sd: Fix excessive capacity printing on devices with blocks bigger than 512 bytes

Orabug: 23615929

(commit f08bb1e0dbdd0297258d0b8cd4dbfcc057e57b2a of upstream)
During revalidate we check whether device capacity has changed before we
decide whether to output disk information or not.

The check for old capacity failed to take into account that we scaled
sdkp->capacity based on the reported logical block size. And therefore
the capacity test would always fail for devices with sectors bigger than
512 bytes and we would print several copies of the same discovery
information.

Avoid scaling sdkp->capacity and instead adjust the value on the fly
when setting the block device capacity and generating fake C/H/S
geometry.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: <stable@vger.kernel.org>
Reported-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>
9 years agosd: Optimal I/O size is in bytes, not sectors
Martin K. Petersen [Wed, 20 Jan 2016 16:01:23 +0000 (11:01 -0500)]
sd: Optimal I/O size is in bytes, not sectors

Orabug: 23615929

(commit d0eb20a863ba7dc1d3f4b841639671f134560be2 of upstream)
Commit ca369d51b3e1 ("block/sd: Fix device-imposed transfer length
limits") accidentally switched optimal I/O size reporting from bytes to
block layer sectors.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: ca369d51b3e1649be4a72addd6d6a168cfb3f537
Cc: stable@vger.kernel.org # 4.4+
Reviewed-by: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>
9 years agosd: Reject optimal transfer length smaller than page size
Martin K. Petersen [Wed, 16 Dec 2015 22:53:52 +0000 (17:53 -0500)]
sd: Reject optimal transfer length smaller than page size

Orabug: 23615929

(commit 9c1d9c207bb800498347a2716da298043ee280c5 of upstream)
Eryu Guan reported that loading scsi_debug would fail. This turned out
to be caused by scsi_debug reporting an optimal I/O size of 32KB which
is smaller than the 64KB page size on the PowerPC system in question.

Add a check to ensure that we only use the device-reported OPTIMAL
TRANSFER LENGTH if it is bigger than or equal to the page cache size.

Reported-by: Eryu Guan <guaneryu@gmail.com>
Reported-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>
9 years agoblock/sd: Fix device-imposed transfer length limits
Joe Jin [Mon, 6 Jun 2016 05:22:30 +0000 (13:22 +0800)]
block/sd: Fix device-imposed transfer length limits

Orabug: 23615929

(commit ca369d51b3e1649be4a72addd6d6a168cfb3f537 of upstream)
Commit 4f258a46346c ("sd: Fix maximum I/O size for BLOCK_PC requests")
had the unfortunate side-effect of removing an implicit clamp to
BLK_DEF_MAX_SECTORS for REQ_TYPE_FS requests in the block layer
code. This caused problems for some SMR drives.

Debugging this issue revealed a few problems with the existing
infrastructure since the block layer didn't know how to deal with
device-imposed limits, only limits set by the I/O controller.

 - Introduce a new queue limit, max_dev_sectors, which is used by the
   ULD to signal the maximum sectors for a REQ_TYPE_FS request.

 - Ensure that max_dev_sectors is correctly stacked and taken into
   account when overriding max_sectors through sysfs.

 - Rework sd_read_block_limits() so it saves the max_xfer and opt_xfer
   values for later processing.

 - In sd_revalidate() set the queue's max_dev_sectors based on the
   MAXIMUM TRANSFER LENGTH value in the Block Limits VPD. If this value
   is not reported, fall back to a cap based on the CDB TRANSFER LENGTH
   field size.

 - In sd_revalidate(), use OPTIMAL TRANSFER LENGTH from the Block Limits
   VPD--if reported and sane--to signal the preferred device transfer
   size for FS requests. Otherwise use BLK_DEF_MAX_SECTORS.

 - blk_limits_max_hw_sectors() is no longer used and can be removed.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=93581
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: sweeneygj@gmx.com
Tested-by: Arzeets <anatol.pomozov@gmail.com>
Tested-by: David Eisner <david.eisner@oriel.oxon.org>
Tested-by: Mario Kicherer <dev@kicherer.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>
Conflicts:
  drivers/scsi/sd.h
  include/linux/blkdev.h
    Adding opt_xfer_blocks and max_dev_sectors breaks kABI.
    They will be added using by UEK_KABI_USE2() in a fix-up commit.
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
9 years agoFix kabi issue for upstream commit ca369d51
Joe Jin [Tue, 14 Jun 2016 07:13:33 +0000 (15:13 +0800)]
Fix kabi issue for upstream commit ca369d51

Orabug: 23615929

When backport upstream commit ca369d51 'block/sd: Fix device-imposed
transfer length limits' it broken kabi, this patch fix kabi issue.

Signed-off-by: Joe Jin <joe.jin@oracle.com>
9 years agoRevert "ocfs2: bump up o2cb network protocol version"
Junxiao Bi [Thu, 30 Jun 2016 08:50:39 +0000 (16:50 +0800)]
Revert "ocfs2: bump up o2cb network protocol version"

This reverts commit d5eebd62353e9c1434951436c4d1d8d7a636c17d.

This commit made rolling upgrade fail. When one node is upgraded
to new version with this commit, the remaining nodes will fail to
establish connections to it, then vms on the remaining nodes can't
be live migrated to that node. This will cause an outage.

Orabug: 24292852
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
9 years agoBtrfs: fix leaking of ordered extents after direct IO write error
Filipe Manana [Tue, 8 Dec 2015 19:23:20 +0000 (19:23 +0000)]
Btrfs: fix leaking of ordered extents after direct IO write error

Orabug: 23717870

When doing a direct IO write, __blockdev_direct_IO() can call the
btrfs_get_blocks_direct() callback one or more times before it calls the
btrfs_submit_direct() callback. However it can fail after calling the
first callback and before calling the second callback, which is a problem
because the first one creates ordered extents and the second one is the
one that submits bios that cover the ordered extents created by the first
one. That means the ordered extents will never complete nor have any of
the flags BTRFS_ORDERED_IO_DONE / BTRFS_ORDERED_IOERR set, resulting in
subsequent operations (such as other direct IO writes, buffered writes or
hole punching) that lock the same IO range and lookup for ordered extents
in the range to hang forever waiting for those ordered extents because
they can not complete ever, since no bio was submitted.

Fix this by tracking a range of created ordered extents that don't have
yet corresponding bios submitted and completing the ordered extents in
the range if __blockdev_direct_IO() fails with an error.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
(cherry picked from commit f28a492878170f39002660a26c329201cf678d74)
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
9 years agoBtrfs: fix error path when failing to submit bio for direct IO write
Filipe Manana [Tue, 24 Nov 2015 16:23:54 +0000 (16:23 +0000)]
Btrfs: fix error path when failing to submit bio for direct IO write

Orabug: 23717870

Commit 61de718fceb6 ("Btrfs: fix memory corruption on failure to submit
bio for direct IO") fixed problems with the error handling code after we
fail to submit a bio for direct IO. However there were 2 problems that it
did not address when the failure is due to memory allocation failures for
direct IO writes:

1) We considered that there could be only one ordered extent for the whole
   IO range, which is not always true, as we can have multiple;

2) It did not set the bit BTRFS_ORDERED_IO_DONE in the ordered extent,
   which can make other tasks running btrfs_wait_logged_extents() hang
   forever, since they wait for that bit to be set. The general assumption
   is that regardless of an error, the BTRFS_ORDERED_IO_DONE is always set
   and it precedes setting the bit BTRFS_ORDERED_COMPLETE.

Fix these issues by moving part of the btrfs_endio_direct_write() handler
into a new helper function and having that new helper function called when
we fail to allocate memory to submit the bio (and its private object) for
a direct IO write.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
(cherry picked from commit 14543774bd67a64f616431e5c9d1472f58979841)
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
9 years agoBtrfs: fix memory corruption on failure to submit bio for direct IO
Filipe Manana [Wed, 1 Jul 2015 11:13:10 +0000 (12:13 +0100)]
Btrfs: fix memory corruption on failure to submit bio for direct IO

Orabug: 23717870

If we fail to submit a bio for a direct IO request, we were grabbing the
corresponding ordered extent and decrementing its reference count twice,
once for our lookup reference and once for the ordered tree reference.
This was a problem because it caused the ordered extent to be freed
without removing it from the ordered tree and any lists it might be
attached to, leaving dangling pointers to the ordered extent around.
Example trace with CONFIG_DEBUG_PAGEALLOC=y:

[161779.858707] BUG: unable to handle kernel paging request at 0000000087654330
[161779.859983] IP: [<ffffffff8124ca68>] rb_prev+0x22/0x3b
[161779.860636] PGD 34d818067 PUD 0
[161779.860636] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
(...)
[161779.860636] Call Trace:
[161779.860636]  [<ffffffffa06b36a6>] __tree_search+0xd9/0xf9 [btrfs]
[161779.860636]  [<ffffffffa06b3708>] tree_search+0x42/0x63 [btrfs]
[161779.860636]  [<ffffffffa06b4868>] ? btrfs_lookup_ordered_range+0x2d/0xa5 [btrfs]
[161779.860636]  [<ffffffffa06b4873>] btrfs_lookup_ordered_range+0x38/0xa5 [btrfs]
[161779.860636]  [<ffffffffa06aab8e>] btrfs_get_blocks_direct+0x11b/0x615 [btrfs]
[161779.860636]  [<ffffffff8119727f>] do_blockdev_direct_IO+0x5ff/0xb43
[161779.860636]  [<ffffffffa06aaa73>] ? btrfs_page_exists_in_range+0x1ad/0x1ad [btrfs]
[161779.860636]  [<ffffffffa06a2c9a>] ? btrfs_get_extent_fiemap+0x1bc/0x1bc [btrfs]
[161779.860636]  [<ffffffff811977f5>] __blockdev_direct_IO+0x32/0x34
[161779.860636]  [<ffffffffa06a2c9a>] ? btrfs_get_extent_fiemap+0x1bc/0x1bc [btrfs]
[161779.860636]  [<ffffffffa06a10ae>] btrfs_direct_IO+0x198/0x21f [btrfs]
[161779.860636]  [<ffffffffa06a2c9a>] ? btrfs_get_extent_fiemap+0x1bc/0x1bc [btrfs]
[161779.860636]  [<ffffffff81112ca1>] generic_file_direct_write+0xb3/0x128
[161779.860636]  [<ffffffffa06affaa>] ? btrfs_file_write_iter+0x15f/0x3e0 [btrfs]
[161779.860636]  [<ffffffffa06b004c>] btrfs_file_write_iter+0x201/0x3e0 [btrfs]
(...)

We were also not freeing the btrfs_dio_private we allocated previously,
which kmemleak reported with the following trace in its sysfs file:

unreferenced object 0xffff8803f553bf80 (size 96):
  comm "xfs_io", pid 4501, jiffies 4295039588 (age 173.936s)
  hex dump (first 32 bytes):
    88 6c 9b f5 02 88 ff ff 00 00 00 00 00 00 00 00  .l..............
    00 00 00 00 00 00 00 00 00 00 c4 00 00 00 00 00  ................
  backtrace:
    [<ffffffff81161ffe>] create_object+0x172/0x29a
    [<ffffffff8145870f>] kmemleak_alloc+0x25/0x41
    [<ffffffff81154e64>] kmemleak_alloc_recursive.constprop.40+0x16/0x18
    [<ffffffff811579ed>] kmem_cache_alloc_trace+0xfb/0x148
    [<ffffffffa03d8cff>] btrfs_submit_direct+0x65/0x16a [btrfs]
    [<ffffffff811968dc>] dio_bio_submit+0x62/0x8f
    [<ffffffff811975fe>] do_blockdev_direct_IO+0x97e/0xb43
    [<ffffffff811977f5>] __blockdev_direct_IO+0x32/0x34
    [<ffffffffa03d70ae>] btrfs_direct_IO+0x198/0x21f [btrfs]
    [<ffffffff81112ca1>] generic_file_direct_write+0xb3/0x128
    [<ffffffffa03e604d>] btrfs_file_write_iter+0x201/0x3e0 [btrfs]
    [<ffffffff8116586a>] __vfs_write+0x7c/0xa5
    [<ffffffff81165da9>] vfs_write+0xa0/0xe4
    [<ffffffff81166675>] SyS_pwrite64+0x64/0x82
    [<ffffffff81464fd7>] system_call_fastpath+0x12/0x6f
    [<ffffffffffffffff>] 0xffffffffffffffff

For read requests we weren't doing any cleanup either (none of the work
done by btrfs_endio_direct_read()), so a failure submitting a bio for a
read request would leave a range in the inode's io_tree locked forever,
blocking any future operations (both reads and writes) against that range.

So fix this by making sure we do the same cleanup that we do for the case
where the bio submission succeeds.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit 61de718fceb6bc028dafe4d06a1f87a9e0998303)
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
9 years agoBtrfs: fix extent accounting for partial direct IO writes
Filipe Manana [Wed, 4 Nov 2015 09:52:04 +0000 (09:52 +0000)]
Btrfs: fix extent accounting for partial direct IO writes

Orabug: 23717870

When doing a write using direct IO we can end up not doing the whole write
operation using the direct IO path, in that case we fallback to a buffered
write to do the remaining IO. This happens for example if the range we are
writing to contains a compressed extent.
When we do a partial write and fallback to buffered IO, due to the
existence of a compressed extent for example, we end up not adjusting the
outstanding extents counter of our inode which ends up getting decremented
twice, once by the DIO ordered extent for the partial write and once again
by btrfs_direct_IO(), resulting in an arithmetic underflow at
extent-tree.c:drop_outstanding_extent(). For example if we have:

  extents        [ prealloc extent ] [ compressed extent ]
  offsets        A        B          C       D           E

and at the moment our inode's outstanding extents counter is 0, if we do a
direct IO write against the range [B, D[ (which has a length smaller than
128Mb), we end up bumping our inode's outstanding extents counter to 1, we
create a DIO ordered extent for the range [B, C[ and then fallback to a
buffered write for the range [C, D[. The direct IO handler
(inode.c:btrfs_direct_IO()) decrements the outstanding extents counter by
1, leaving it with a value of 0, through a call to
btrfs_delalloc_release_space() and then shortly after the DIO ordered
extent finishes and calls btrfs_delalloc_release_metadata() which ends
up to attempt to decrement the inode's outstanding extents counter by 1,
resulting in an assertion failure at drop_outstanding_extent() because
the operation would result in an arithmetic underflow (0 - 1). This
produces the following trace:

  [125471.336838] BTRFS: assertion failed: BTRFS_I(inode)->outstanding_extents >= num_extents, file: fs/btrfs/extent-tree.c, line: 5526
  [125471.338844] ------------[ cut here ]------------
  [125471.340745] kernel BUG at fs/btrfs/ctree.h:4173!
  [125471.340745] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  [125471.340745] Modules linked in: btrfs f2fs xfs libcrc32c dm_flakey dm_mod crc32c_generic xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop fuse parport_pc acpi_cpufreq psmouse i2c_piix4 parport pcspkr serio_raw microcode processor evdev i2c_core button ext4 crc16 jbd2 mbcache sd_mod sg sr_mod cdrom ata_generic virtio_scsi ata_piix virtio_pci virtio_ring floppy libata virtio e1000 scsi_mod [last unloaded: btrfs]
  [125471.340745] CPU: 10 PID: 23649 Comm: kworker/u32:1 Tainted: G        W       4.3.0-rc5-btrfs-next-17+ #1
  [125471.340745] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014
  [125471.340745] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
  [125471.340745] task: ffff8804244fcf80 ti: ffff88040a118000 task.ti: ffff88040a118000
  [125471.340745] RIP: 0010:[<ffffffffa0550da1>]  [<ffffffffa0550da1>] assfail.constprop.46+0x1e/0x20 [btrfs]
  [125471.340745] RSP: 0018:ffff88040a11bc78  EFLAGS: 00010296
  [125471.340745] RAX: 0000000000000075 RBX: 0000000000005000 RCX: 0000000000000000
  [125471.340745] RDX: ffffffff81098f93 RSI: ffffffff8147c619 RDI: 00000000ffffffff
  [125471.340745] RBP: ffff88040a11bc78 R08: 0000000000000001 R09: 0000000000000000
  [125471.340745] R10: ffff88040a11bc08 R11: ffffffff81651000 R12: ffff8803efb4a000
  [125471.340745] R13: ffff8803efb4a000 R14: 0000000000000000 R15: ffff8802f8e33c88
  [125471.340745] FS:  0000000000000000(0000) GS:ffff88043dd40000(0000) knlGS:0000000000000000
  [125471.340745] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  [125471.340745] CR2: 00007fae7ca86095 CR3: 0000000001a0b000 CR4: 00000000000006e0
  [125471.340745] Stack:
  [125471.340745]  ffff88040a11bc88 ffffffffa04ca0cd ffff88040a11bcc8 ffffffffa04ceeb1
  [125471.340745]  ffff8802f8e33940 ffff8802c93eadb0 ffff8802f8e0bf50 ffff8803efb4a000
  [125471.340745]  0000000000000000 ffff8802f8e33c88 ffff88040a11bd38 ffffffffa04eccfa
  [125471.340745] Call Trace:
  [125471.340745]  [<ffffffffa04ca0cd>] drop_outstanding_extent+0x3d/0x6d [btrfs]
  [125471.340745]  [<ffffffffa04ceeb1>] btrfs_delalloc_release_metadata+0x51/0xdd [btrfs]
  [125471.340745]  [<ffffffffa04eccfa>] btrfs_finish_ordered_io+0x420/0x4eb [btrfs]
  [125471.340745]  [<ffffffffa04ecdda>] finish_ordered_fn+0x15/0x17 [btrfs]
  [125471.340745]  [<ffffffffa050e6e8>] normal_work_helper+0x14c/0x32a [btrfs]
  [125471.340745]  [<ffffffffa050e9c8>] btrfs_endio_write_helper+0x12/0x14 [btrfs]
  [125471.340745]  [<ffffffff81063b23>] process_one_work+0x24a/0x4ac
  [125471.340745]  [<ffffffff81064285>] worker_thread+0x206/0x2c2
  [125471.340745]  [<ffffffff8106407f>] ? rescuer_thread+0x2cb/0x2cb
  [125471.340745]  [<ffffffff8106407f>] ? rescuer_thread+0x2cb/0x2cb
  [125471.340745]  [<ffffffff8106904d>] kthread+0xef/0xf7
  [125471.340745]  [<ffffffff81068f5e>] ? kthread_parkme+0x24/0x24
  [125471.340745]  [<ffffffff8147d10f>] ret_from_fork+0x3f/0x70
  [125471.340745]  [<ffffffff81068f5e>] ? kthread_parkme+0x24/0x24
  [125471.340745] Code: a5 55 a0 48 89 e5 e8 42 50 bc e0 0f 0b 55 89 f1 48 c7 c2 f0 a8 55 a0 48 89 fe 31 c0 48 c7 c7 14 aa 55 a0 48 89 e5 e8 22 50 bc e0 <0f> 0b 0f 1f 44 00 00 55 31 c9 ba 18 00 00 00 48 89 e5 41 56 41
  [125471.340745] RIP  [<ffffffffa0550da1>] assfail.constprop.46+0x1e/0x20 [btrfs]
  [125471.340745]  RSP <ffff88040a11bc78>
  [125471.539620] ---[ end trace 144259f7838b4aa4 ]---

So fix this by ensuring we adjust the outstanding extents counter when we
do the fallback just like we do for the case where the whole write can be
done through the direct IO path.

We were also adjusting the outstanding extents counter by a constant value
of 1, which is incorrect because we were ignorning that we account extents
in BTRFS_MAX_EXTENT_SIZE units, o fix that as well.

The following test case for fstests reproduces this issue:

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"
  tmp=/tmp/$$
  status=1 # failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
      rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter

  # real QA test starts here
  _need_to_be_root
  _supported_fs btrfs
  _supported_os Linux
  _require_scratch
  _require_xfs_io_command "falloc"

  rm -f $seqres.full

  _scratch_mkfs >>$seqres.full 2>&1
  _scratch_mount "-o compress"

  # Create a compressed extent covering the range [700K, 800K[.
  $XFS_IO_PROG -f -s -c "pwrite -S 0xaa -b 100K 700K 100K" \
      $SCRATCH_MNT/foo | _filter_xfs_io

  # Create prealloc extent covering the range [600K, 700K[.
  $XFS_IO_PROG -c "falloc 600K 100K" $SCRATCH_MNT/foo

  # Write 80K of data to the range [640K, 720K[ using direct IO. This
  # range covers both the prealloc extent and the compressed extent.
  # Because there's a compressed extent in the range we are writing to,
  # the DIO write code path ends up only writing the first 60k of data,
  # which goes to the prealloc extent, and then falls back to buffered IO
  # for writing the remaining 20K of data - because that remaining data
  # maps to a file range containing a compressed extent.
  # When falling back to buffered IO, we used to trigger an assertion when
  # releasing reserved space due to bad accounting of the inode's
  # outstanding extents counter, which was set to 1 but we ended up
  # decrementing it by 1 twice, once through the ordered extent for the
  # 60K of data we wrote using direct IO, and once through the main direct
  # IO handler (inode.cbtrfs_direct_IO()) because the direct IO write
  # wrote less than 80K of data (60K).
  $XFS_IO_PROG -d -c "pwrite -S 0xbb -b 80K 640K 80K" \
      $SCRATCH_MNT/foo | _filter_xfs_io

  # Now similar test as above but for very large write operations. This
  # triggers special cases for an inode's outstanding extents accounting,
  # as internally btrfs logically splits extents into 128Mb units.
  $XFS_IO_PROG -f -s \
      -c "pwrite -S 0xaa -b 128M 258M 128M" \
      -c "falloc 0 258M" \
      $SCRATCH_MNT/bar | _filter_xfs_io
  $XFS_IO_PROG -d -c "pwrite -S 0xbb -b 256M 3M 256M" $SCRATCH_MNT/bar \
      | _filter_xfs_io

  # Now verify the file contents are correct and that they are the same
  # even after unmounting and mounting the fs again (or evicting the page
  # cache).
  #
  # For file foo, all bytes in the range [0, 640K[ must have a value of
  # 0x00, all bytes in the range [640K, 720K[ must have a value of 0xbb
  # and all bytes in the range [720K, 800K[ must have a value of 0xaa.
  #
  # For file bar, all bytes in the range [0, 3M[ must havea value of 0x00,
  # all bytes in the range [3M, 259M[ must have a value of 0xbb and all
  # bytes in the range [259M, 386M[ must have a value of 0xaa.
  #
  echo "File digests before remounting the file system:"
  md5sum $SCRATCH_MNT/foo | _filter_scratch
  md5sum $SCRATCH_MNT/bar | _filter_scratch
  _scratch_remount
  echo "File digests after remounting the file system:"
  md5sum $SCRATCH_MNT/foo | _filter_scratch
  md5sum $SCRATCH_MNT/bar | _filter_scratch

  status=0
  exit

Fixes: e1cbbfa5f5aa ("Btrfs: fix outstanding_extents accounting in DIO")
Fixes: 3e05bde8c3c2 ("Btrfs: only adjust outstanding_extents when we do a short write")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
(cherry picked from commit 9c9464cc92668984ebed79e22b5063877a8d97db)
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
9 years agoBtrfs: Direct I/O: Fix space accounting
chandan [Fri, 28 Aug 2015 15:40:13 +0000 (21:10 +0530)]
Btrfs: Direct I/O: Fix space accounting

Orabug: 23717870

The following call trace is seen when generic/095 test is executed,

WARNING: CPU: 3 PID: 2769 at /home/chandan/code/repos/linux/fs/btrfs/inode.c:8967 btrfs_destroy_inode+0x284/0x2a0()
Modules linked in:
CPU: 3 PID: 2769 Comm: umount Not tainted 4.2.0-rc5+ #31
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20150306_163512-brownie 04/01/2014
 ffffffff81c08150 ffff8802ec9cbce8 ffffffff81984058 ffff8802ffd8feb0
 0000000000000000 ffff8802ec9cbd28 ffffffff81050385 ffff8802ec9cbd38
 ffff8802d12f8588 ffff8802d12f8588 ffff8802f15ab000 ffff8800bb96c0b0
Call Trace:
 [<ffffffff81984058>] dump_stack+0x45/0x57
 [<ffffffff81050385>] warn_slowpath_common+0x85/0xc0
 [<ffffffff81050465>] warn_slowpath_null+0x15/0x20
 [<ffffffff81340294>] btrfs_destroy_inode+0x284/0x2a0
 [<ffffffff8117ce07>] destroy_inode+0x37/0x60
 [<ffffffff8117cf39>] evict+0x109/0x170
 [<ffffffff8117cfd5>] dispose_list+0x35/0x50
 [<ffffffff8117dd3a>] evict_inodes+0xaa/0x100
 [<ffffffff81165667>] generic_shutdown_super+0x47/0xf0
 [<ffffffff81165951>] kill_anon_super+0x11/0x20
 [<ffffffff81302093>] btrfs_kill_super+0x13/0x110
 [<ffffffff81165c99>] deactivate_locked_super+0x39/0x70
 [<ffffffff811660cf>] deactivate_super+0x5f/0x70
 [<ffffffff81180e1e>] cleanup_mnt+0x3e/0x90
 [<ffffffff81180ebd>] __cleanup_mnt+0xd/0x10
 [<ffffffff81069c06>] task_work_run+0x96/0xb0
 [<ffffffff81003a3d>] do_notify_resume+0x3d/0x50
 [<ffffffff8198cbc2>] int_signal+0x12/0x17

This means that the inode had non-zero "outstanding extents" during
eviction. This occurs because, during direct I/O a task which successfully
used up its reserved data space would set BTRFS_INODE_DIO_READY bit and does
not clear the bit after finishing the DIO write. A future DIO write could
actually fail and the unused reserve space won't be freed because of the
previously set BTRFS_INODE_DIO_READY bit.

Clearing the BTRFS_INODE_DIO_READY bit in btrfs_direct_IO() caused the
following issue,
|-----------------------------------+-------------------------------------|
| Task A                            | Task B                              |
|-----------------------------------+-------------------------------------|
| Start direct i/o write on inode X.|                                     |
| reserve space                     |                                     |
| Allocate ordered extent           |                                     |
| release reserved space            |                                     |
| Set BTRFS_INODE_DIO_READY bit.    |                                     |
|                                   | splice()                            |
|                                   | Transfer data from pipe buffer to   |
|                                   | destination file.                   |
|                                   | - kmap(pipe buffer page)            |
|                                   | - Start direct i/o write on         |
|                                   |   inode X.                          |
|                                   |   - reserve space                   |
|                                   |   - dio_refill_pages()              |
|                                   |     - sdio->blocks_available == 0   |
|                                   |     - Since a kernel address is     |
|                                   |       being passed instead of a     |
|                                   |       user space address,           |
|                                   |       iov_iter_get_pages() returns  |
|                                   |       -EFAULT.                      |
|                                   |   - Since BTRFS_INODE_DIO_READY is  |
|                                   |     set, we don't release reserved  |
|                                   |     space.                          |
|                                   |   - Clear BTRFS_INODE_DIO_READY bit.|
| -EIOCBQUEUED is returned.         |                                     |
|-----------------------------------+-------------------------------------|

Hence this commit introduces "struct btrfs_dio_data" to track the usage of
reserved data space. The remaining unused "reserve space" can now be freed
reliably.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit 50745b0a7f46f68574cd2b9ae24566bf026e7ebd)
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
9 years agoBtrfs: fix warning of bytes_may_use
Liu Bo [Wed, 17 Jun 2015 08:59:58 +0000 (16:59 +0800)]
Btrfs: fix warning of bytes_may_use

Orabug: 23717870

While running generic/019, dmesg got several warnings from
btrfs_free_reserved_data_space().

Test generic/019 produces some disk failures so sumbit dio will get errors,
in which case, btrfs_direct_IO() goes to the error handling and free
bytes_may_use, but the problem is that bytes_may_use has been free'd
during get_block().

This adds a runtime flag to show if we've gone through get_block(), if so,
don't do the cleanup work.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Tested-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit ddba1bfc2369cd0566bcfdab47599834a32d1c19)
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
9 years agoxen: use same main loop for counting and remapping pages
Juergen Gross [Wed, 18 May 2016 14:44:54 +0000 (16:44 +0200)]
xen: use same main loop for counting and remapping pages

Instead of having two functions for cycling through the E820 map in
order to count to be remapped pages and remap them later, just use one
function with a caller supplied sub-function called for each region to
be processed. This eliminates the possibility of a mismatch between
both loops which showed up in certain configurations.

Suggested-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit dd14be92fbf5bc1ef7343f34968440e44e21b46a)
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 24012238

9 years agoMerge branch topic/uek-4.1/xen of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
Chuck Anderson [Wed, 13 Jul 2016 08:23:20 +0000 (01:23 -0700)]
Merge branch topic/uek-4.1/xen of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoMerge branch 'topic/uek-4.1/ofed' of git://ca-git.us.oracle.com/linux-uek into uek...
Chuck Anderson [Wed, 13 Jul 2016 08:18:31 +0000 (01:18 -0700)]
Merge branch 'topic/uek-4.1/ofed' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoMerge branch 'topic/uek-4.1/drivers' of git://ca-git.us.oracle.com/linux-uek into...
Chuck Anderson [Wed, 13 Jul 2016 08:14:47 +0000 (01:14 -0700)]
Merge branch 'topic/uek-4.1/drivers' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoMerge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Wed, 13 Jul 2016 08:10:12 +0000 (01:10 -0700)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoMerge branch topic/uek-4.1/stable-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Wed, 13 Jul 2016 08:08:56 +0000 (01:08 -0700)]
Merge branch topic/uek-4.1/stable-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoxen-blkfront: dynamic configuration of per-vbd resources
Bob Liu [Fri, 1 Jul 2016 01:11:15 +0000 (21:11 -0400)]
xen-blkfront: dynamic configuration of per-vbd resources

The current VBD layer reserves buffer space for each attached device based on
three statically configured settings which are read at boot time.
 * max_indirect_segs: Maximum amount of segments.
 * max_ring_page_order: Maximum order of pages to be used for the shared ring.
 * max_queues: Maximum of queues(rings) to be used.

But the storage backend, workload, and guest memory result in very different
tuning requirements. It's impossible to centrally predict application
characteristics so it's best to leave allow the settings can be dynamiclly
adjusted based on workload inside the Guest.

Usage:
Show current values:
cat /sys/devices/vbd-xxx/max_indirect_segs
cat /sys/devices/vbd-xxx/max_ring_page_order
cat /sys/devices/vbd-xxx/max_queues

Write new values:
echo <new value> > /sys/devices/vbd-xxx/max_indirect_segs
echo <new value> > /sys/devices/vbd-xxx/max_ring_page_order
echo <new value> > /sys/devices/vbd-xxx/max_queues

Orabug: 23720696
Signed-off-by: Bob Liu <bob.liu@oracle.com>
--
v2: Add device lock and other comments from Konrad.

9 years agoxen-blkfront: introduce blkif_set_queue_limits()
Bob Liu [Fri, 1 Jul 2016 21:43:39 +0000 (17:43 -0400)]
xen-blkfront: introduce blkif_set_queue_limits()

blk_mq_update_nr_hw_queues() reset all queue limits to default which it's not
as xen-blkfront expected, introducing blkif_set_queue_limits() to reset limits
with initial correct values.

Orabug: 23720696
Signed-off-by: Bob Liu <bob.liu@oracle.com>
9 years agoxen-blkfront: fix places not updated after introducing 64KB page granularity
Bob Liu [Fri, 1 Jul 2016 19:45:57 +0000 (15:45 -0400)]
xen-blkfront: fix places not updated after introducing 64KB page granularity

Two places didn't get updated when 64KB page granularity was introduced, this
patch fix them.

Orabug: 23720696
Signed-off-by: Bob Liu <bob.liu@oracle.com>
9 years agoIB: Add RNR timer workaround for PSIF
Santosh Shilimkar [Sat, 18 Jun 2016 20:06:29 +0000 (13:06 -0700)]
IB: Add RNR timer workaround for PSIF

The RNR NAK Retry timer on Titan and Sonoma 1&2 IB subsystems runs 500
times faster than desired. This means that retries are started a lot
sooner than they should.

The software workaround is bit involved and intrusive because it needs
to work in mixed HCA environments. It uses CM protocol to detect the
involvement of the offending IB requestor and then enables the
workaround in the peer responder. To keep the workaround flag
persistent, ib_qp verbs need to carry the flag which impacts
IB core kABI which is wrapped under __GENKSYMS__.

The workaround matches the desired RNR NAK Retry timer value when the
encodings 1 to 14 (decimal) are supplied. For encodings larger than 14
and for zero, the work-around will set the largest possible RNR NAK
Timer value for the offending requestor, which is 1,31 ms.

Thanks to Trivino, Haakon for updates and wide range of testing for
kernel as well as userland with mixed HCA configurations.

Orabug: 23633926

Reviewed-by Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: David Brean <david.brean@oracle.com>
Tested-by: Francisco Triviño García <francisco.trivino@oracle.com>
Signed-off-by: Francisco Triviño García <francisco.trivino@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoIB/core: Add encode/decode FDR/EDR rates
Hans Westgaard Ry [Mon, 11 Jul 2016 10:15:16 +0000 (12:15 +0200)]
IB/core: Add encode/decode FDR/EDR rates

The cases for FDR/EDR signalling speed, was missing in
ib_rate_to_mult and mult_to_ib_rate giving wrong return values
when drivers are converting static rate to/from inter-packet-delay.

Orabug: 23084916

Change-Id: Ib1d6e84eeea1addb830c415faf92f9f430c4ba32
Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
9 years agobfa: Fix for crash when bfa_itnim is NULL
Sudarsana Reddy Kalluru [Wed, 6 Jul 2016 10:51:29 +0000 (06:51 -0400)]
bfa: Fix for crash when bfa_itnim is NULL

Orabug: 23950878

Fix a very corner case when the port gets disconnected and the BFA and
FCS layers clean up references to the IT nexus.  During this window if a
task management command is issued by the SCSI-ML and ends up referencing
a NULL itnim, it could lead to a crash.

Signed-off-by: Anil Gurumurthy <anil.gurumurthy@qlogic.com>
Tested-by: Sudarasana Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agobfa:Update driver version to 3.2.25.0
Anil Gurumurthy [Thu, 26 Nov 2015 07:17:05 +0000 (12:47 +0530)]
bfa:Update driver version to 3.2.25.0

Orabug: 23950878

Signed-off-by: Sudarsana Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Anil Gurumurthy <anil.gurumurthy@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agobfa:File header and user visible string changes
Anil Gurumurthy [Thu, 26 Nov 2015 07:14:35 +0000 (12:44 +0530)]
bfa:File header and user visible string changes

Orabug: 23950878

Signed-off-by: Sudarsana Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Anil Gurumurthy <anil.gurumurthy@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agobfa:Updating copyright messages
Anil Gurumurthy [Thu, 26 Nov 2015 07:09:00 +0000 (12:39 +0530)]
bfa:Updating copyright messages

Orabug: 23950878

Signed-off-by: Sudarsana Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Anil Gurumurthy <anil.gurumurthy@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agobfa: Fix incorrect de-reference of pointer
Anil Gurumurthy [Thu, 13 Aug 2015 10:14:24 +0000 (03:14 -0700)]
bfa: Fix incorrect de-reference of pointer

Orabug: 23950878

Signed-off-by: Anil Gurumurthy <anil.gurumurthy@qlogic.com>
Tested-by: Sudarsana Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agobfa: Fix indentation
Anil Gurumurthy [Thu, 13 Aug 2015 10:12:47 +0000 (03:12 -0700)]
bfa: Fix indentation

Orabug: 23950878

Signed-off-by: Anil Gurumurthy <anil.gurumurthy@qlogic.com>
Tested-by : Sudarasana Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agolpfc updates to 11.1.0.4 for uek4-r2
rkennedy [Mon, 20 Jun 2016 18:05:12 +0000 (11:05 -0700)]
lpfc updates to 11.1.0.4 for uek4-r2

Orabug: 23762058

Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Update modified file copyrights
James Smart [Thu, 31 Mar 2016 21:12:34 +0000 (14:12 -0700)]
lpfc: Update modified file copyrights

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit 506115777af017bfc0968ee1c6aed024cdb6e43b)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Fix interaction between fdmi_on and enable_SmartSAN
James Smart [Thu, 31 Mar 2016 21:12:33 +0000 (14:12 -0700)]
lpfc: Fix interaction between fdmi_on and enable_SmartSAN

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit 8663cbbe3ba0d8142faec48bbab0dc3482e3007d)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Add support for SmartSAN 2.0
James Smart [Thu, 31 Mar 2016 21:12:32 +0000 (14:12 -0700)]
lpfc: Add support for SmartSAN 2.0

Revise versions to reflect SmartSAN 2.0 support

RDP updated to support additional descriptors:
  Credit descriptor
  Optical Element Data descriptors for Temperature, Voltage,
        Bias current, TX power and TX power.
  Optical Product Data descriptor.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit 56204984761d80b973a0a534c42566ad78303766)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Fix Device discovery failures during switch reboot test.
James Smart [Thu, 31 Mar 2016 21:12:31 +0000 (14:12 -0700)]
lpfc: Fix Device discovery failures during switch reboot test.

When the switch is rebooted, the lpfc driver fails to log
into the fabric, and Unexpected timeout message is seen.

Fix: Do not issue RegVFI if the FLOGI was internally aborted.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit 342b59caa66240b670285d519fdfe2c44289b516)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Utilize embedded CDB logic to minimize IO latency
James Smart [Thu, 31 Mar 2016 21:12:30 +0000 (14:12 -0700)]
lpfc: Utilize embedded CDB logic to minimize IO latency

Pass cmd iu payloads inline to adapter job structure rather than as
separate dma buffers.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Fix crash when unregistering default rpi.
James Smart [Thu, 31 Mar 2016 21:12:29 +0000 (14:12 -0700)]
lpfc: Fix crash when unregistering default rpi.

The default rpi completion handler does back to back puts to force the
removal of the ndlp. This ends up calling lpfc_unreg_rpi after the
reference count is at 0.

Fix:  Check the reference count of the ndlp before getting the ref to
make sure we are not getting a reference on a removed object.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit a6517db9006eb618dfde54f4bf6a9a8bc21e16e7)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Fix DMA faults observed upon plugging loopback connector
James Smart [Thu, 31 Mar 2016 21:12:28 +0000 (14:12 -0700)]
lpfc: Fix DMA faults observed upon plugging loopback connector

Driver didn't program the REG_VFI mailbox correctly, giving the adapter
bad addresses.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit ae09c765109293b600ba9169aa3d632e1ac1a843)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Correct LOGO handling during login
James Smart [Thu, 31 Mar 2016 21:12:27 +0000 (14:12 -0700)]
lpfc: Correct LOGO handling during login

After a link bounce, when a remote port issues a LOGO while a REGLOGIN
is pending on that port, the driver does not clean up the ndlp
structure. May result in stack traces in the console log.

Fix: Clear the NLP_REG_LOGIN_SEND flag on the ndlp in the routine

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit de96e9c5b82801ea17558c271730fdc2aa5e7e77)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: fix misleading indentation
Arnd Bergmann [Mon, 14 Mar 2016 14:29:44 +0000 (15:29 +0100)]
lpfc: fix misleading indentation

gcc-6 complains about the indentation of the lpfc_destroy_vport_work_array()
call in lpfc_online(), which clearly doesn't look right:

drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_online':
drivers/scsi/lpfc/lpfc_init.c:2880:3: warning: statement is indented as if it were guarded by... [-Wmisleading-indentation]
   lpfc_destroy_vport_work_array(phba, vports);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/lpfc/lpfc_init.c:2863:2: note: ...this 'if' clause, but it is not
  if (vports != NULL)
  ^~

Looking at the patch that introduced this code, it's clear that the
behavior is correct and the indentation is wrong.

This fixes the indentation and adds curly braces around the previous
if() block for clarity, as that is most likely what caused the code
to be misindented in the first place.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 549e55cd2a1b ("[SCSI] lpfc 8.2.2 : Fix locking around HBA's port_list")
Reviewed-by: Sebastian Herbszt <herbszt@gmx.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit aeb6641f8ebdd61939f462a8255b316f9bfab707)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: fix missing zero termination in debugfs
Alan [Mon, 15 Feb 2016 19:11:56 +0000 (19:11 +0000)]
lpfc: fix missing zero termination in debugfs

If you feed 32 bytes in then the kstrtoull() doesn't receive a terminated
string so will run off the end.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit 0872774d8a319676dea7416e0cf85bec63eea0d0)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Remove redundant code block in lpfc_scsi_cmd_iocb_cmpl
Johannes Thumshirn [Wed, 20 Jan 2016 15:08:40 +0000 (16:08 +0100)]
lpfc: Remove redundant code block in lpfc_scsi_cmd_iocb_cmpl

This removes a redundant code block that will either be executed if the
ENABLE_FCP_RING_POLLING flag is set in phba->cfg_poll or not. The code
is just duplicated in both cases, hence we unify it again.

This probably is a left over from some sort of refactoring.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Sebastian Herbszt <herbszt@gmx.de>
Acked-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit 19db2307365231e798bb99324ed553bcada57913)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agoqla2xxx: Update driver version to 8.07.00.38.40.0-k.
Sawan Chandak [Thu, 7 Jul 2016 10:16:44 +0000 (15:46 +0530)]
qla2xxx: Update driver version to 8.07.00.38.40.0-k.

Orabug: 23755773

Signed-off-by: Sawan Chandak <sawan.chandak@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Fix BBCR offset
Sawan Chandak [Thu, 30 Jun 2016 04:34:02 +0000 (21:34 -0700)]
qla2xxx: Fix BBCR offset

Orabug: 23755773

Fixes: 969a619 ("qla2xxx: Add support for buffer to buffer credit value for ISP27XX.")
Signed-off-by: Sawan Chandak <sawan.chandak@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Disable the adapter and skip error recovery in case of register disconnect.
Sawan Chandak [Fri, 3 Jun 2016 05:57:54 +0000 (11:27 +0530)]
qla2xxx: Disable the adapter and skip error recovery in case of register disconnect.

Orabug: 23755773

If there is error recovery going on due to command timeout and
there is register disconnect, then disable the adapter.

Signed-off-by: Sawan Chandak <sawan.chandak@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Separate ISP type bits out from device type.
Joe Carnuccio [Fri, 3 Jun 2016 05:55:57 +0000 (11:25 +0530)]
qla2xxx: Separate ISP type bits out from device type.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Correction to function qla26xx_dport_diagnostics().
Joe Carnuccio [Thu, 7 Jul 2016 10:08:03 +0000 (15:38 +0530)]
qla2xxx: Correction to function qla26xx_dport_diagnostics().

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Add support to handle Loop Init error Asynchronus event.
Joe Carnuccio [Thu, 7 Jul 2016 10:09:34 +0000 (15:39 +0530)]
qla2xxx: Add support to handle Loop Init error Asynchronus event.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Let DPORT be enabled purely by nvram.
Joe Carnuccio [Thu, 7 Jul 2016 08:20:20 +0000 (13:50 +0530)]
qla2xxx: Let DPORT be enabled purely by nvram.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Add bsg interface to support statistics counter reset.
Sawan Chandak [Thu, 7 Jul 2016 08:50:37 +0000 (14:20 +0530)]
qla2xxx: Add bsg interface to support statistics counter reset.

Orabug: 23755773

Signed-off-by: Sawan Chandak <sawan.chandak@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Add bsg interface to support D_Port Diagnostics.
Joe Carnuccio [Thu, 7 Jul 2016 07:57:30 +0000 (13:27 +0530)]
qla2xxx: Add bsg interface to support D_Port Diagnostics.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Check for device state before unloading the driver.
Sawan Chandak [Thu, 7 Jul 2016 07:01:32 +0000 (12:31 +0530)]
qla2xxx: Check for device state before unloading the driver.

Orabug: 23755773

During hot swap of PCI device, there can be PCI error on device,
during normal driver unload. The race between normal driver unload and
driver unload due to PCI error, can lead to system crash.Fix is to check
if there is unload going on and allow that function to unload the driver.

Signed-off-by: Sawan Chandak <sawan.chandak@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Properly reset firmware statistics.
Joe Carnuccio [Fri, 1 Apr 2016 19:35:55 +0000 (12:35 -0700)]
qla2xxx: Properly reset firmware statistics.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Properly initialize IO statistics.
Joe Carnuccio [Thu, 10 Mar 2016 00:44:03 +0000 (16:44 -0800)]
qla2xxx: Properly initialize IO statistics.

Orabug: 23755773

Properly initialize IO statistics to avoid initial 0xFFFFFFF (-1) values.

Cleanup/simplify usage of pointer to statistics structure.

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Make debug buffer log easier to view.
Joe Carnuccio [Thu, 7 Jul 2016 06:31:34 +0000 (12:01 +0530)]
qla2xxx: Make debug buffer log easier to view.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Add module parameter alternate/short names.
Joe Carnuccio [Thu, 28 Jan 2016 22:52:03 +0000 (14:52 -0800)]
qla2xxx: Add module parameter alternate/short names.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Set FLOGI retry in additional firmware options for P2P (N2N) mode.
Giridhar Malavali [Thu, 7 Jul 2016 06:15:53 +0000 (11:45 +0530)]
qla2xxx: Set FLOGI retry in additional firmware options for P2P (N2N) mode.

Orabug: 23755773

When VP decoupling enabled, there could be a window where, FLOGI from
initiators can be dropped before VP0 is enabled, causing link level recovery.
Retry FLOGI to avoid link level recovery.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Shutdown board on thermal shutdown aen.
Joe Carnuccio [Thu, 7 Jul 2016 06:03:08 +0000 (11:33 +0530)]
qla2xxx: Shutdown board on thermal shutdown aen.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>