users/jedix/linux-maple.git
8 years ago Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Tue, 2 Aug 2016 06:59:32 +0000 (23:59 -0700)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

8 years ago [2d8747c2] fixup! blk-mq: prevent double-unlock of mutex
Dan Duval [Fri, 29 Jul 2016 17:20:06 +0000 (13:20 -0400)]
[2d8747c2] fixup! blk-mq: prevent double-unlock of mutex

Orabug: 24376549

Commit 2d8747c28478f85d9f04292780b1432edd2a384e ("blk-mq: avoid inserting
requests before establishing new mapping") introduced an extraneous call
to mutex_unlock().  The result is that the mutex gets unlocked twice.

Remove the extra call.
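
As a rough userspace illustration only (this is not the blk-mq code), the
hazard of unlocking twice can be reproduced with Python's threading.Lock,
which raises RuntimeError when an unlocked lock is released. The function
names below are hypothetical.

```python
import threading

def update_with_extra_unlock(lock):
    """Hypothetical buggy path: one acquire, two releases."""
    lock.acquire()
    lock.release()   # extraneous release, analogous to the extra mutex_unlock()
    lock.release()   # the normal release now hits a lock that is not held

def update_fixed(lock):
    """Fixed path: exactly one release per acquire."""
    lock.acquire()
    lock.release()

def double_unlock_detected():
    """Return True if the buggy path trips the double-release check."""
    try:
        update_with_extra_unlock(threading.Lock())
    except RuntimeError:
        return True
    return False
```

In the kernel a double mutex_unlock() is not guaranteed to be reported this
cleanly, which is why the extra call has to go.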

Signed-off-by: Dan Duval <dan.duval@oracle.com>
8 years ago Merge branch topic/uek-4.1/xen of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
Chuck Anderson [Sun, 31 Jul 2016 18:18:42 +0000 (11:18 -0700)]
Merge branch topic/uek-4.1/xen of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

8 years ago Merge branch 'topic/uek-4.1/ofed' of git://ca-git.us.oracle.com/linux-uek into uek...
Chuck Anderson [Sun, 31 Jul 2016 18:17:59 +0000 (11:17 -0700)]
Merge branch 'topic/uek-4.1/ofed' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

8 years ago Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Sun, 31 Jul 2016 18:15:28 +0000 (11:15 -0700)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

8 years ago Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Sun, 31 Jul 2016 18:13:42 +0000 (11:13 -0700)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

8 years ago Merge branch topic/uek-4.1/stable-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Sun, 31 Jul 2016 05:55:22 +0000 (22:55 -0700)]
Merge branch topic/uek-4.1/stable-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

8 years ago xen-pciback: mark device to be hidden on AER error trigger
Elena Ufimtseva [Thu, 21 Jul 2016 21:25:27 +0000 (17:25 -0400)]
xen-pciback: mark device to be hidden on AER error trigger

Some platforms are configured to reboot the machine upon an
unrecoverable AER error, and some virtualized systems are subject
to the security risks described in XSA-124.
This patch allows simple containment of unrecoverable AER errors,
together with killing the guest upon receipt of a fatal AER error.
The patch stores in xenstore the SBDF of the passed-through device that
triggered the unrecoverable AER error. This allows xend to make the device
unassignable until the next reboot or a special hypervisor hypercall.

OraBug: 24377669

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Adnan Misherfi <adnan.misherfi@oracle.com>
8 years ago tcp: make challenge acks less predictable
Eric Dumazet [Sun, 10 Jul 2016 08:04:02 +0000 (10:04 +0200)]
tcp: make challenge acks less predictable

Yue Cao claims that current host rate limiting of challenge ACKS
(RFC 5961) could leak enough information to allow a patient attacker
to hijack TCP sessions. He will soon provide details in an academic
paper.

This patch increases the default limit from 100 to 1000, and adds
some randomization so that the attacker can no longer hijack
sessions without spending a considerable amount of probes.

Based on initial analysis and patch from Linus.

Note that we also have per socket rate limiting, so it is tempting
to remove the host limit in the future.

v2: randomize the count of challenge acks per second, not the period.
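
The idea of randomizing the per-second count rather than the period can be
sketched as follows. This is a simplified model, not the kernel code: the
class name, the injectable random source, and the choice of drawing the
budget uniformly from [limit/2, limit/2 + limit) are assumptions made for
illustration.

```python
import random

class ChallengeAckLimiter:
    """Toy host-wide limiter with a randomized per-second ACK budget."""

    def __init__(self, limit=1000, rng=None):
        self.limit = limit                  # default limit named in the message
        self.rng = rng or random.Random()   # injectable for reproducible tests
        self.second = None                  # second the current budget belongs to
        self.budget = 0                     # challenge ACKs left in this second

    def allow(self, now_seconds):
        """Return True if a challenge ACK may be sent at time now_seconds."""
        second = int(now_seconds)
        if second != self.second:
            # New second: draw a fresh, unpredictable budget (assumed range).
            half = (self.limit + 1) // 2
            self.second = second
            self.budget = half + self.rng.randrange(self.limit)
        if self.budget > 0:
            self.budget -= 1
            return True
        return False
```

Because the budget changes every second, an observer counting challenge ACKs
can no longer infer exactly how many were consumed by another connection.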

Fixes: 282f23c6ee34 ("tcp: implement RFC 5961 3.2")
Reported-by: Yue Cao <ycao009@ucr.edu>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Orabug: 2401010
Conflicts:
net/ipv4/tcp_input.c
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
8 years ago ext4: update c/mtime on truncate up
Eryu Guan [Tue, 28 Jul 2015 19:08:41 +0000 (15:08 -0400)]
ext4: update c/mtime on truncate up

Commit 3da40c7b0898 ("ext4: only call ext4_truncate when size <= isize")
introduced a bug where c/mtime is not updated on truncate up.

Fix the issue by setting c/mtime explicitly in the truncate up case.

Note that ftruncate(2) is not affected, so you won't see this bug using
truncate(1) and xfs_io(1).

Orabug: 24377419

Signed-off-by: Zirong Lang <zorro.lang@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-By: Dan Duval <dan.duval@oracle.com>
8 years ago vfs: rename: check backing inode being equal
Miklos Szeredi [Tue, 10 May 2016 23:16:37 +0000 (01:16 +0200)]
vfs: rename: check backing inode being equal

If a file is renamed to a hardlink of itself POSIX specifies that rename(2)
should do nothing and return success.

This condition is checked in vfs_rename().  However it won't detect hard
links on overlayfs where these are given separate inodes on the overlayfs
layer.

Overlayfs itself detects this condition and returns success without doing
anything, but then vfs_rename() will proceed as if this was a successful
rename (detach_mounts(), d_move()).

The correct thing to do is to detect this condition before even calling
into overlayfs.  This patch does this by calling vfs_select_inode() to get
the underlying inodes.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # v4.2+
Orabug: 24363418
CVE:CVE-2016-6198,CVE-2016-6197
Same as mainline v4.6 commit 9409e22acdfc9153f88d9b1ed2bd2a5b34d2d3ca
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
8 years ago vfs: add vfs_select_inode() helper
Miklos Szeredi [Thu, 21 Jul 2016 20:30:58 +0000 (13:30 -0700)]
vfs: add vfs_select_inode() helper

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # v4.2+
Orabug: 24363418
CVE:CVE-2016-6198,CVE-2016-6197
Based on mainline v4.6 commit 54d5ca871e72f2bb172ec9323497f01cd5091ec7
Conflicts:
  include/linux/dcache.h - code base
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
8 years ago ovl: verify upper dentry before unlink and rename
Miklos Szeredi [Thu, 21 Jul 2016 20:24:59 +0000 (13:24 -0700)]
ovl: verify upper dentry before unlink and rename

Unlink and rename in overlayfs checked the upper dentry for staleness by
verifying upper->d_parent against upperdir.  However the dentry can go
stale also by being unhashed, for example.

Expand the verification to actually look up the name again (under parent
lock) and check if it matches the upper dentry.  This matches what the VFS
does before passing the dentry to filesystem's unlink/rename methods, which
excludes any inconsistency caused by overlayfs.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Orabug: 24363418
CVE:CVE-2016-6198,CVE-2016-6197
Based on mainline v4.6 commit 11f3710417d026ea2f4fcf362d866342c5274185
Conflicts:
  fs/overlayfs/dir.c - code base
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
8 years ago ovl: fix getcwd() failure after unsuccessful rmdir
Rui Wang [Thu, 21 Jul 2016 20:21:39 +0000 (13:21 -0700)]
ovl: fix getcwd() failure after unsuccessful rmdir

ovl_remove_upper() should do d_drop() only after it successfully
removes the dir, otherwise a subsequent getcwd() system call will
fail, breaking userspace programs.

This is to fix: https://bugzilla.kernel.org/show_bug.cgi?id=110491

Signed-off-by: Rui Wang <rui.y.wang@intel.com>
Reviewed-by: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>
Orabug: 24363418
CVE:CVE-2016-6198,CVE-2016-6197
Based on mainline v4.5 commit ce9113bbcbf45a57c082d6603b9a9f342be3ef74
Pre-req for mainline v4.6 commit 11f3710417d026ea2f4fcf362d866342c5274185
Conflicts:
  fs/overlayfs/dir.c - code base
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
8 years ago IBCM: dereference timewait_info only when needed
Santosh Shilimkar [Tue, 19 Jul 2016 02:35:26 +0000 (19:35 -0700)]
IBCM: dereference timewait_info only when needed

timewait_info is available in valid CM states and may
not even be allocated in invalid states.

Let's dereference it only when needed, in those
valid states.

Orabug: 24326732

Reviewed-by: Hakon Bugge <Haakon.Bugge@oracle.com>
Tested-by: Efrain Galaviz <efrain.galaviz@oracle.com>
Tested-by: Hong Liu <hong.x.liu@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years ago Merge branch topic/uek-4.1/uek-carry of git://ca-git.us.oracle.com/linux-uek into...
Chuck Anderson [Sat, 16 Jul 2016 07:06:04 +0000 (00:06 -0700)]
Merge branch topic/uek-4.1/uek-carry of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years ago Merge branch topic/uek-4.1/xen of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
Chuck Anderson [Sat, 16 Jul 2016 07:04:49 +0000 (00:04 -0700)]
Merge branch topic/uek-4.1/xen of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years ago Merge branch 'topic/uek-4.1/drivers' of git://ca-git.us.oracle.com/linux-uek into...
Chuck Anderson [Sat, 16 Jul 2016 06:55:14 +0000 (23:55 -0700)]
Merge branch 'topic/uek-4.1/drivers' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years ago Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Sat, 16 Jul 2016 06:53:50 +0000 (23:53 -0700)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years ago block: Initialize max_dev_sectors to 0
Keith Busch [Wed, 10 Feb 2016 23:52:47 +0000 (16:52 -0700)]
block: Initialize max_dev_sectors to 0

Orabug: 23615929

(commit 5f009d3f8e6685fe8c6215082c1696a08b411220 of upstream)
The new queue limit is not used by the majority of block drivers, and
should be initialized to 0 for the driver's requested settings to be used.
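
One way to read this, assuming the new limit is combined with the other
queue limits by a "smallest non-zero value" rule (an assumption here; the
message does not spell out the mechanism), is that 0 acts as "no device
limit" and lets the driver's own setting win. A minimal sketch of such a
rule:

```python
def min_not_zero(a, b):
    """Return the smaller of two limits, treating 0 as 'not set'."""
    if a == 0:
        return b
    if b == 0:
        return a
    return min(a, b)
```

With max_dev_sectors left at 0, min_not_zero(driver_limit, 0) simply yields
driver_limit, whereas a non-zero default would silently cap it.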

Signed-off-by: Keith Busch <keith.busch@intel.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>
9 years ago sd: Fix rw_max for devices that report an optimal xfer size
Martin K. Petersen [Fri, 13 May 2016 02:17:34 +0000 (22:17 -0400)]
sd: Fix rw_max for devices that report an optimal xfer size

Orabug: 23615929

(commit 6b7e9cde49691e04314342b7dce90c67ad567fcc of upstream)
For historic reasons, io_opt is in bytes and max_sectors in block layer
sectors. This interface inconsistency is error prone and should be
fixed. But for 4.4--4.7 let's make the unit difference explicit via a
wrapper function.
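
The unit difference can be made explicit with small conversion helpers, as
in the sketch below. The helper names and the 512-byte block layer sector
size constant are illustrative choices for this example, not necessarily
what the patch itself adds.

```python
SECTOR_SIZE = 512  # block layer sector size in bytes

def logical_to_bytes(logical_block_size, blocks):
    """Convert a count of device logical blocks to bytes (the io_opt unit)."""
    return blocks * logical_block_size

def logical_to_sectors(logical_block_size, blocks):
    """Convert a count of device logical blocks to 512-byte sectors
    (the max_sectors unit)."""
    return blocks * (logical_block_size // SECTOR_SIZE)
```

For a device with 4096-byte logical blocks reporting an optimal transfer of
16 blocks, the same quantity is 65536 bytes but 128 sectors, which is
exactly the kind of mismatch a named wrapper makes hard to overlook.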

Fixes: d0eb20a863ba ("sd: Optimal I/O size is in bytes, not sectors")
Cc: stable@vger.kernel.org # 4.4+
Reported-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Andrew Patterson <andrew.patterson@hpe.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>
9 years ago sd: Fix excessive capacity printing on devices with blocks bigger than 512 bytes
Martin K. Petersen [Tue, 29 Mar 2016 01:18:56 +0000 (21:18 -0400)]
sd: Fix excessive capacity printing on devices with blocks bigger than 512 bytes

Orabug: 23615929

(commit f08bb1e0dbdd0297258d0b8cd4dbfcc057e57b2a of upstream)
During revalidate we check whether device capacity has changed before we
decide whether to output disk information or not.

The check for old capacity failed to take into account that we scaled
sdkp->capacity based on the reported logical block size. And therefore
the capacity test would always fail for devices with sectors bigger than
512 bytes and we would print several copies of the same discovery
information.

Avoid scaling sdkp->capacity and instead adjust the value on the fly
when setting the block device capacity and generating fake C/H/S
geometry.
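
A toy model of the comparison problem, under the assumption that the old
code stored the capacity in 512-byte sectors but compared it against the
device-reported count of logical blocks. The class and method names are
hypothetical.

```python
SECTOR_SIZE = 512

class BuggyDisk:
    """Stores capacity pre-scaled to sectors, compares against raw blocks."""
    def __init__(self, block_size):
        self.block_size = block_size
        self.capacity = 0

    def revalidate(self, reported_blocks):
        # Units differ whenever block_size > 512, so this looks "changed".
        changed = self.capacity != reported_blocks
        self.capacity = reported_blocks * (self.block_size // SECTOR_SIZE)
        return changed

class FixedDisk:
    """Stores capacity in logical blocks and scales only when needed."""
    def __init__(self, block_size):
        self.block_size = block_size
        self.capacity = 0

    def revalidate(self, reported_blocks):
        changed = self.capacity != reported_blocks
        self.capacity = reported_blocks
        return changed

    def capacity_in_sectors(self):
        # Scale on the fly, e.g. when setting the block device capacity.
        return self.capacity * (self.block_size // SECTOR_SIZE)
```

In this model a 4096-byte-block device is reported as changed on every
revalidate in the buggy variant, which corresponds to the repeated
discovery output described above.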

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: <stable@vger.kernel.org>
Reported-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>
9 years ago sd: Optimal I/O size is in bytes, not sectors
Martin K. Petersen [Wed, 20 Jan 2016 16:01:23 +0000 (11:01 -0500)]
sd: Optimal I/O size is in bytes, not sectors

Orabug: 23615929

(commit d0eb20a863ba7dc1d3f4b841639671f134560be2 of upstream)
Commit ca369d51b3e1 ("block/sd: Fix device-imposed transfer length
limits") accidentally switched optimal I/O size reporting from bytes to
block layer sectors.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: ca369d51b3e1649be4a72addd6d6a168cfb3f537
Cc: stable@vger.kernel.org # 4.4+
Reviewed-by: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>
9 years ago sd: Reject optimal transfer length smaller than page size
Martin K. Petersen [Wed, 16 Dec 2015 22:53:52 +0000 (17:53 -0500)]
sd: Reject optimal transfer length smaller than page size

Orabug: 23615929

(commit 9c1d9c207bb800498347a2716da298043ee280c5 of upstream)
Eryu Guan reported that loading scsi_debug would fail. This turned out
to be caused by scsi_debug reporting an optimal I/O size of 32KB which
is smaller than the 64KB page size on the PowerPC system in question.

Add a check to ensure that we only use the device-reported OPTIMAL
TRANSFER LENGTH if it is bigger than or equal to the page cache size.
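
The check amounts to the following sketch. The function name, the byte
units, and the explicit fallback parameter are assumptions made for
illustration.

```python
def usable_optimal_transfer(opt_xfer_bytes, page_size, fallback_bytes):
    """Use the device-reported optimal transfer length only if it is
    reported (non-zero) and at least one page; otherwise fall back."""
    if opt_xfer_bytes and opt_xfer_bytes >= page_size:
        return opt_xfer_bytes
    return fallback_bytes
```

With the values from the report, a 32KB optimal size on a system with 64KB
pages is rejected and the fallback is used instead.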

Reported-by: Eryu Guan <guaneryu@gmail.com>
Reported-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>
9 years ago block/sd: Fix device-imposed transfer length limits
Joe Jin [Mon, 6 Jun 2016 05:22:30 +0000 (13:22 +0800)]
block/sd: Fix device-imposed transfer length limits

Orabug: 23615929

(commit ca369d51b3e1649be4a72addd6d6a168cfb3f537 of upstream)
Commit 4f258a46346c ("sd: Fix maximum I/O size for BLOCK_PC requests")
had the unfortunate side-effect of removing an implicit clamp to
BLK_DEF_MAX_SECTORS for REQ_TYPE_FS requests in the block layer
code. This caused problems for some SMR drives.

Debugging this issue revealed a few problems with the existing
infrastructure since the block layer didn't know how to deal with
device-imposed limits, only limits set by the I/O controller.

 - Introduce a new queue limit, max_dev_sectors, which is used by the
   ULD to signal the maximum sectors for a REQ_TYPE_FS request.

 - Ensure that max_dev_sectors is correctly stacked and taken into
   account when overriding max_sectors through sysfs.

 - Rework sd_read_block_limits() so it saves the max_xfer and opt_xfer
   values for later processing.

 - In sd_revalidate() set the queue's max_dev_sectors based on the
   MAXIMUM TRANSFER LENGTH value in the Block Limits VPD. If this value
   is not reported, fall back to a cap based on the CDB TRANSFER LENGTH
   field size.

 - In sd_revalidate(), use OPTIMAL TRANSFER LENGTH from the Block Limits
   VPD--if reported and sane--to signal the preferred device transfer
   size for FS requests. Otherwise use BLK_DEF_MAX_SECTORS.

 - blk_limits_max_hw_sectors() is no longer used and can be removed.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=93581
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: sweeneygj@gmx.com
Tested-by: Arzeets <anatol.pomozov@gmail.com>
Tested-by: David Eisner <david.eisner@oriel.oxon.org>
Tested-by: Mario Kicherer <dev@kicherer.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>
Conflicts:
  drivers/scsi/sd.h
  include/linux/blkdev.h
    Adding opt_xfer_blocks and max_dev_sectors breaks kABI.
    They will be added using UEK_KABI_USE2() in a fix-up commit.
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
9 years ago Fix kabi issue for upstream commit ca369d51
Joe Jin [Tue, 14 Jun 2016 07:13:33 +0000 (15:13 +0800)]
Fix kabi issue for upstream commit ca369d51

Orabug: 23615929

Backporting upstream commit ca369d51 'block/sd: Fix device-imposed
transfer length limits' broke kABI; this patch fixes the kABI issue.

Signed-off-by: Joe Jin <joe.jin@oracle.com>
9 years ago Revert "ocfs2: bump up o2cb network protocol version"
Junxiao Bi [Thu, 30 Jun 2016 08:50:39 +0000 (16:50 +0800)]
Revert "ocfs2: bump up o2cb network protocol version"

This reverts commit d5eebd62353e9c1434951436c4d1d8d7a636c17d.

This commit made rolling upgrade fail. When one node is upgraded
to new version with this commit, the remaining nodes will fail to
establish connections to it, then vms on the remaining nodes can't
be live migrated to that node. This will cause an outage.

Orabug: 24292852
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
9 years ago Btrfs: fix leaking of ordered extents after direct IO write error
Filipe Manana [Tue, 8 Dec 2015 19:23:20 +0000 (19:23 +0000)]
Btrfs: fix leaking of ordered extents after direct IO write error

Orabug: 23717870

When doing a direct IO write, __blockdev_direct_IO() can call the
btrfs_get_blocks_direct() callback one or more times before it calls the
btrfs_submit_direct() callback. However it can fail after calling the
first callback and before calling the second callback, which is a problem
because the first one creates ordered extents and the second one is the
one that submits bios that cover the ordered extents created by the first
one. That means the ordered extents never complete and never have either of
the flags BTRFS_ORDERED_IO_DONE / BTRFS_ORDERED_IOERR set. As a result,
subsequent operations (such as other direct IO writes, buffered writes or
hole punching) that lock the same IO range and look up ordered extents
in the range hang forever waiting for those ordered extents, which can
never complete because no bio was submitted.

Fix this by tracking a range of created ordered extents that don't yet
have corresponding bios submitted and completing the ordered extents in
the range if __blockdev_direct_IO() fails with an error.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
(cherry picked from commit f28a492878170f39002660a26c329201cf678d74)
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
9 years ago Btrfs: fix error path when failing to submit bio for direct IO write
Filipe Manana [Tue, 24 Nov 2015 16:23:54 +0000 (16:23 +0000)]
Btrfs: fix error path when failing to submit bio for direct IO write

Orabug: 23717870

Commit 61de718fceb6 ("Btrfs: fix memory corruption on failure to submit
bio for direct IO") fixed problems with the error handling code after we
fail to submit a bio for direct IO. However there were 2 problems that it
did not address when the failure is due to memory allocation failures for
direct IO writes:

1) We considered that there could be only one ordered extent for the whole
   IO range, which is not always true, as we can have multiple;

2) It did not set the bit BTRFS_ORDERED_IO_DONE in the ordered extent,
   which can make other tasks running btrfs_wait_logged_extents() hang
   forever, since they wait for that bit to be set. The general assumption
   is that regardless of an error, the BTRFS_ORDERED_IO_DONE is always set
   and it precedes setting the bit BTRFS_ORDERED_COMPLETE.

Fix these issues by moving part of the btrfs_endio_direct_write() handler
into a new helper function and having that new helper function called when
we fail to allocate memory to submit the bio (and its private object) for
a direct IO write.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
(cherry picked from commit 14543774bd67a64f616431e5c9d1472f58979841)
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
9 years ago Btrfs: fix memory corruption on failure to submit bio for direct IO
Filipe Manana [Wed, 1 Jul 2015 11:13:10 +0000 (12:13 +0100)]
Btrfs: fix memory corruption on failure to submit bio for direct IO

Orabug: 23717870

If we fail to submit a bio for a direct IO request, we were grabbing the
corresponding ordered extent and decrementing its reference count twice,
once for our lookup reference and once for the ordered tree reference.
This was a problem because it caused the ordered extent to be freed
without removing it from the ordered tree and any lists it might be
attached to, leaving dangling pointers to the ordered extent around.
Example trace with CONFIG_DEBUG_PAGEALLOC=y:

[161779.858707] BUG: unable to handle kernel paging request at 0000000087654330
[161779.859983] IP: [<ffffffff8124ca68>] rb_prev+0x22/0x3b
[161779.860636] PGD 34d818067 PUD 0
[161779.860636] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
(...)
[161779.860636] Call Trace:
[161779.860636]  [<ffffffffa06b36a6>] __tree_search+0xd9/0xf9 [btrfs]
[161779.860636]  [<ffffffffa06b3708>] tree_search+0x42/0x63 [btrfs]
[161779.860636]  [<ffffffffa06b4868>] ? btrfs_lookup_ordered_range+0x2d/0xa5 [btrfs]
[161779.860636]  [<ffffffffa06b4873>] btrfs_lookup_ordered_range+0x38/0xa5 [btrfs]
[161779.860636]  [<ffffffffa06aab8e>] btrfs_get_blocks_direct+0x11b/0x615 [btrfs]
[161779.860636]  [<ffffffff8119727f>] do_blockdev_direct_IO+0x5ff/0xb43
[161779.860636]  [<ffffffffa06aaa73>] ? btrfs_page_exists_in_range+0x1ad/0x1ad [btrfs]
[161779.860636]  [<ffffffffa06a2c9a>] ? btrfs_get_extent_fiemap+0x1bc/0x1bc [btrfs]
[161779.860636]  [<ffffffff811977f5>] __blockdev_direct_IO+0x32/0x34
[161779.860636]  [<ffffffffa06a2c9a>] ? btrfs_get_extent_fiemap+0x1bc/0x1bc [btrfs]
[161779.860636]  [<ffffffffa06a10ae>] btrfs_direct_IO+0x198/0x21f [btrfs]
[161779.860636]  [<ffffffffa06a2c9a>] ? btrfs_get_extent_fiemap+0x1bc/0x1bc [btrfs]
[161779.860636]  [<ffffffff81112ca1>] generic_file_direct_write+0xb3/0x128
[161779.860636]  [<ffffffffa06affaa>] ? btrfs_file_write_iter+0x15f/0x3e0 [btrfs]
[161779.860636]  [<ffffffffa06b004c>] btrfs_file_write_iter+0x201/0x3e0 [btrfs]
(...)

We were also not freeing the btrfs_dio_private we allocated previously,
which kmemleak reported with the following trace in its sysfs file:

unreferenced object 0xffff8803f553bf80 (size 96):
  comm "xfs_io", pid 4501, jiffies 4295039588 (age 173.936s)
  hex dump (first 32 bytes):
    88 6c 9b f5 02 88 ff ff 00 00 00 00 00 00 00 00  .l..............
    00 00 00 00 00 00 00 00 00 00 c4 00 00 00 00 00  ................
  backtrace:
    [<ffffffff81161ffe>] create_object+0x172/0x29a
    [<ffffffff8145870f>] kmemleak_alloc+0x25/0x41
    [<ffffffff81154e64>] kmemleak_alloc_recursive.constprop.40+0x16/0x18
    [<ffffffff811579ed>] kmem_cache_alloc_trace+0xfb/0x148
    [<ffffffffa03d8cff>] btrfs_submit_direct+0x65/0x16a [btrfs]
    [<ffffffff811968dc>] dio_bio_submit+0x62/0x8f
    [<ffffffff811975fe>] do_blockdev_direct_IO+0x97e/0xb43
    [<ffffffff811977f5>] __blockdev_direct_IO+0x32/0x34
    [<ffffffffa03d70ae>] btrfs_direct_IO+0x198/0x21f [btrfs]
    [<ffffffff81112ca1>] generic_file_direct_write+0xb3/0x128
    [<ffffffffa03e604d>] btrfs_file_write_iter+0x201/0x3e0 [btrfs]
    [<ffffffff8116586a>] __vfs_write+0x7c/0xa5
    [<ffffffff81165da9>] vfs_write+0xa0/0xe4
    [<ffffffff81166675>] SyS_pwrite64+0x64/0x82
    [<ffffffff81464fd7>] system_call_fastpath+0x12/0x6f
    [<ffffffffffffffff>] 0xffffffffffffffff

For read requests we weren't doing any cleanup either (none of the work
done by btrfs_endio_direct_read()), so a failure submitting a bio for a
read request would leave a range in the inode's io_tree locked forever,
blocking any future operations (both reads and writes) against that range.

So fix this by making sure we do the same cleanup that we do for the case
where the bio submission succeeds.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit 61de718fceb6bc028dafe4d06a1f87a9e0998303)
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
9 years ago Btrfs: fix extent accounting for partial direct IO writes
Filipe Manana [Wed, 4 Nov 2015 09:52:04 +0000 (09:52 +0000)]
Btrfs: fix extent accounting for partial direct IO writes

Orabug: 23717870

When doing a write using direct IO we can end up not doing the whole write
operation using the direct IO path; in that case we fall back to a buffered
write to do the remaining IO. This happens for example if the range we are
writing to contains a compressed extent.
When we do a partial write and fall back to buffered IO, due to the
existence of a compressed extent for example, we end up not adjusting the
outstanding extents counter of our inode, which ends up getting decremented
twice, once by the DIO ordered extent for the partial write and once again
by btrfs_direct_IO(), resulting in an arithmetic underflow at
extent-tree.c:drop_outstanding_extent(). For example if we have:

  extents        [ prealloc extent ] [ compressed extent ]
  offsets        A        B          C       D           E

and at the moment our inode's outstanding extents counter is 0, if we do a
direct IO write against the range [B, D[ (which has a length smaller than
128MB), we end up bumping our inode's outstanding extents counter to 1, we
create a DIO ordered extent for the range [B, C[ and then fall back to a
buffered write for the range [C, D[. The direct IO handler
(inode.c:btrfs_direct_IO()) decrements the outstanding extents counter by
1, leaving it with a value of 0, through a call to
btrfs_delalloc_release_space(), and then shortly after the DIO ordered
extent finishes and calls btrfs_delalloc_release_metadata(), which ends
up attempting to decrement the inode's outstanding extents counter by 1,
resulting in an assertion failure at drop_outstanding_extent() because
the operation would result in an arithmetic underflow (0 - 1). This
produces the following trace:

  [125471.336838] BTRFS: assertion failed: BTRFS_I(inode)->outstanding_extents >= num_extents, file: fs/btrfs/extent-tree.c, line: 5526
  [125471.338844] ------------[ cut here ]------------
  [125471.340745] kernel BUG at fs/btrfs/ctree.h:4173!
  [125471.340745] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  [125471.340745] Modules linked in: btrfs f2fs xfs libcrc32c dm_flakey dm_mod crc32c_generic xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop fuse parport_pc acpi_cpufreq psmouse i2c_piix4 parport pcspkr serio_raw microcode processor evdev i2c_core button ext4 crc16 jbd2 mbcache sd_mod sg sr_mod cdrom ata_generic virtio_scsi ata_piix virtio_pci virtio_ring floppy libata virtio e1000 scsi_mod [last unloaded: btrfs]
  [125471.340745] CPU: 10 PID: 23649 Comm: kworker/u32:1 Tainted: G        W       4.3.0-rc5-btrfs-next-17+ #1
  [125471.340745] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014
  [125471.340745] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
  [125471.340745] task: ffff8804244fcf80 ti: ffff88040a118000 task.ti: ffff88040a118000
  [125471.340745] RIP: 0010:[<ffffffffa0550da1>]  [<ffffffffa0550da1>] assfail.constprop.46+0x1e/0x20 [btrfs]
  [125471.340745] RSP: 0018:ffff88040a11bc78  EFLAGS: 00010296
  [125471.340745] RAX: 0000000000000075 RBX: 0000000000005000 RCX: 0000000000000000
  [125471.340745] RDX: ffffffff81098f93 RSI: ffffffff8147c619 RDI: 00000000ffffffff
  [125471.340745] RBP: ffff88040a11bc78 R08: 0000000000000001 R09: 0000000000000000
  [125471.340745] R10: ffff88040a11bc08 R11: ffffffff81651000 R12: ffff8803efb4a000
  [125471.340745] R13: ffff8803efb4a000 R14: 0000000000000000 R15: ffff8802f8e33c88
  [125471.340745] FS:  0000000000000000(0000) GS:ffff88043dd40000(0000) knlGS:0000000000000000
  [125471.340745] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  [125471.340745] CR2: 00007fae7ca86095 CR3: 0000000001a0b000 CR4: 00000000000006e0
  [125471.340745] Stack:
  [125471.340745]  ffff88040a11bc88 ffffffffa04ca0cd ffff88040a11bcc8 ffffffffa04ceeb1
  [125471.340745]  ffff8802f8e33940 ffff8802c93eadb0 ffff8802f8e0bf50 ffff8803efb4a000
  [125471.340745]  0000000000000000 ffff8802f8e33c88 ffff88040a11bd38 ffffffffa04eccfa
  [125471.340745] Call Trace:
  [125471.340745]  [<ffffffffa04ca0cd>] drop_outstanding_extent+0x3d/0x6d [btrfs]
  [125471.340745]  [<ffffffffa04ceeb1>] btrfs_delalloc_release_metadata+0x51/0xdd [btrfs]
  [125471.340745]  [<ffffffffa04eccfa>] btrfs_finish_ordered_io+0x420/0x4eb [btrfs]
  [125471.340745]  [<ffffffffa04ecdda>] finish_ordered_fn+0x15/0x17 [btrfs]
  [125471.340745]  [<ffffffffa050e6e8>] normal_work_helper+0x14c/0x32a [btrfs]
  [125471.340745]  [<ffffffffa050e9c8>] btrfs_endio_write_helper+0x12/0x14 [btrfs]
  [125471.340745]  [<ffffffff81063b23>] process_one_work+0x24a/0x4ac
  [125471.340745]  [<ffffffff81064285>] worker_thread+0x206/0x2c2
  [125471.340745]  [<ffffffff8106407f>] ? rescuer_thread+0x2cb/0x2cb
  [125471.340745]  [<ffffffff8106407f>] ? rescuer_thread+0x2cb/0x2cb
  [125471.340745]  [<ffffffff8106904d>] kthread+0xef/0xf7
  [125471.340745]  [<ffffffff81068f5e>] ? kthread_parkme+0x24/0x24
  [125471.340745]  [<ffffffff8147d10f>] ret_from_fork+0x3f/0x70
  [125471.340745]  [<ffffffff81068f5e>] ? kthread_parkme+0x24/0x24
  [125471.340745] Code: a5 55 a0 48 89 e5 e8 42 50 bc e0 0f 0b 55 89 f1 48 c7 c2 f0 a8 55 a0 48 89 fe 31 c0 48 c7 c7 14 aa 55 a0 48 89 e5 e8 22 50 bc e0 <0f> 0b 0f 1f 44 00 00 55 31 c9 ba 18 00 00 00 48 89 e5 41 56 41
  [125471.340745] RIP  [<ffffffffa0550da1>] assfail.constprop.46+0x1e/0x20 [btrfs]
  [125471.340745]  RSP <ffff88040a11bc78>
  [125471.539620] ---[ end trace 144259f7838b4aa4 ]---

So fix this by ensuring we adjust the outstanding extents counter when we
do the fallback just like we do for the case where the whole write can be
done through the direct IO path.

We were also adjusting the outstanding extents counter by a constant value
of 1, which is incorrect because we were ignoring that we account extents
in BTRFS_MAX_EXTENT_SIZE units, so fix that as well.
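
Accounting in BTRFS_MAX_EXTENT_SIZE units means a write is counted as the
number of 128MB pieces it spans, rounded up. A minimal sketch of that
arithmetic, taking the 128MB unit size from the description above (the
function name is hypothetical):

```python
BTRFS_MAX_EXTENT_SIZE = 128 * 1024 * 1024  # 128MB accounting unit

def count_outstanding_extents(length_bytes):
    """Number of outstanding extents accounted for a write of this length
    (ceiling division by the unit size)."""
    return (length_bytes + BTRFS_MAX_EXTENT_SIZE - 1) // BTRFS_MAX_EXTENT_SIZE
```

An 80K write counts as 1 extent, while a 256M write counts as 2, so
adjusting the counter by a constant 1 is wrong for large writes.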

The following test case for fstests reproduces this issue:

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"
  tmp=/tmp/$$
  status=1 # failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
      rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter

  # real QA test starts here
  _need_to_be_root
  _supported_fs btrfs
  _supported_os Linux
  _require_scratch
  _require_xfs_io_command "falloc"

  rm -f $seqres.full

  _scratch_mkfs >>$seqres.full 2>&1
  _scratch_mount "-o compress"

  # Create a compressed extent covering the range [700K, 800K[.
  $XFS_IO_PROG -f -s -c "pwrite -S 0xaa -b 100K 700K 100K" \
      $SCRATCH_MNT/foo | _filter_xfs_io

  # Create prealloc extent covering the range [600K, 700K[.
  $XFS_IO_PROG -c "falloc 600K 100K" $SCRATCH_MNT/foo

  # Write 80K of data to the range [640K, 720K[ using direct IO. This
  # range covers both the prealloc extent and the compressed extent.
  # Because there's a compressed extent in the range we are writing to,
  # the DIO write code path ends up only writing the first 60k of data,
  # which goes to the prealloc extent, and then falls back to buffered IO
  # for writing the remaining 20K of data - because that remaining data
  # maps to a file range containing a compressed extent.
  # When falling back to buffered IO, we used to trigger an assertion when
  # releasing reserved space due to bad accounting of the inode's
  # outstanding extents counter, which was set to 1 but we ended up
  # decrementing it by 1 twice, once through the ordered extent for the
  # 60K of data we wrote using direct IO, and once through the main direct
  # IO handler (inode.c:btrfs_direct_IO()) because the direct IO write
  # wrote less than 80K of data (60K).
  $XFS_IO_PROG -d -c "pwrite -S 0xbb -b 80K 640K 80K" \
      $SCRATCH_MNT/foo | _filter_xfs_io

  # Now similar test as above but for very large write operations. This
  # triggers special cases for an inode's outstanding extents accounting,
  # as internally btrfs logically splits extents into 128MB units.
  $XFS_IO_PROG -f -s \
      -c "pwrite -S 0xaa -b 128M 258M 128M" \
      -c "falloc 0 258M" \
      $SCRATCH_MNT/bar | _filter_xfs_io
  $XFS_IO_PROG -d -c "pwrite -S 0xbb -b 256M 3M 256M" $SCRATCH_MNT/bar \
      | _filter_xfs_io

  # Now verify the file contents are correct and that they are the same
  # even after unmounting and mounting the fs again (or evicting the page
  # cache).
  #
  # For file foo, all bytes in the range [0, 640K[ must have a value of
  # 0x00, all bytes in the range [640K, 720K[ must have a value of 0xbb
  # and all bytes in the range [720K, 800K[ must have a value of 0xaa.
  #
  # For file bar, all bytes in the range [0, 3M[ must have a value of 0x00,
  # all bytes in the range [3M, 259M[ must have a value of 0xbb and all
  # bytes in the range [259M, 386M[ must have a value of 0xaa.
  #
  echo "File digests before remounting the file system:"
  md5sum $SCRATCH_MNT/foo | _filter_scratch
  md5sum $SCRATCH_MNT/bar | _filter_scratch
  _scratch_remount
  echo "File digests after remounting the file system:"
  md5sum $SCRATCH_MNT/foo | _filter_scratch
  md5sum $SCRATCH_MNT/bar | _filter_scratch

  status=0
  exit

Fixes: e1cbbfa5f5aa ("Btrfs: fix outstanding_extents accounting in DIO")
Fixes: 3e05bde8c3c2 ("Btrfs: only adjust outstanding_extents when we do a short write")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
(cherry picked from commit 9c9464cc92668984ebed79e22b5063877a8d97db)
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
9 years agoBtrfs: Direct I/O: Fix space accounting
chandan [Fri, 28 Aug 2015 15:40:13 +0000 (21:10 +0530)]
Btrfs: Direct I/O: Fix space accounting

Orabug: 23717870

The following call trace is seen when generic/095 test is executed,

WARNING: CPU: 3 PID: 2769 at /home/chandan/code/repos/linux/fs/btrfs/inode.c:8967 btrfs_destroy_inode+0x284/0x2a0()
Modules linked in:
CPU: 3 PID: 2769 Comm: umount Not tainted 4.2.0-rc5+ #31
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20150306_163512-brownie 04/01/2014
 ffffffff81c08150 ffff8802ec9cbce8 ffffffff81984058 ffff8802ffd8feb0
 0000000000000000 ffff8802ec9cbd28 ffffffff81050385 ffff8802ec9cbd38
 ffff8802d12f8588 ffff8802d12f8588 ffff8802f15ab000 ffff8800bb96c0b0
Call Trace:
 [<ffffffff81984058>] dump_stack+0x45/0x57
 [<ffffffff81050385>] warn_slowpath_common+0x85/0xc0
 [<ffffffff81050465>] warn_slowpath_null+0x15/0x20
 [<ffffffff81340294>] btrfs_destroy_inode+0x284/0x2a0
 [<ffffffff8117ce07>] destroy_inode+0x37/0x60
 [<ffffffff8117cf39>] evict+0x109/0x170
 [<ffffffff8117cfd5>] dispose_list+0x35/0x50
 [<ffffffff8117dd3a>] evict_inodes+0xaa/0x100
 [<ffffffff81165667>] generic_shutdown_super+0x47/0xf0
 [<ffffffff81165951>] kill_anon_super+0x11/0x20
 [<ffffffff81302093>] btrfs_kill_super+0x13/0x110
 [<ffffffff81165c99>] deactivate_locked_super+0x39/0x70
 [<ffffffff811660cf>] deactivate_super+0x5f/0x70
 [<ffffffff81180e1e>] cleanup_mnt+0x3e/0x90
 [<ffffffff81180ebd>] __cleanup_mnt+0xd/0x10
 [<ffffffff81069c06>] task_work_run+0x96/0xb0
 [<ffffffff81003a3d>] do_notify_resume+0x3d/0x50
 [<ffffffff8198cbc2>] int_signal+0x12/0x17

This means that the inode had non-zero "outstanding extents" during
eviction. This occurs because, during direct I/O, a task which successfully
used up its reserved data space sets the BTRFS_INODE_DIO_READY bit and does
not clear it after finishing the DIO write. A future DIO write could then
fail, and its unused reserved space would not be freed because of the
previously set BTRFS_INODE_DIO_READY bit.

Clearing the BTRFS_INODE_DIO_READY bit in btrfs_direct_IO() caused the
following issue,
|-----------------------------------+-------------------------------------|
| Task A                            | Task B                              |
|-----------------------------------+-------------------------------------|
| Start direct i/o write on inode X.|                                     |
| reserve space                     |                                     |
| Allocate ordered extent           |                                     |
| release reserved space            |                                     |
| Set BTRFS_INODE_DIO_READY bit.    |                                     |
|                                   | splice()                            |
|                                   | Transfer data from pipe buffer to   |
|                                   | destination file.                   |
|                                   | - kmap(pipe buffer page)            |
|                                   | - Start direct i/o write on         |
|                                   |   inode X.                          |
|                                   |   - reserve space                   |
|                                   |   - dio_refill_pages()              |
|                                   |     - sdio->blocks_available == 0   |
|                                   |     - Since a kernel address is     |
|                                   |       being passed instead of a     |
|                                   |       user space address,           |
|                                   |       iov_iter_get_pages() returns  |
|                                   |       -EFAULT.                      |
|                                   |   - Since BTRFS_INODE_DIO_READY is  |
|                                   |     set, we don't release reserved  |
|                                   |     space.                          |
|                                   |   - Clear BTRFS_INODE_DIO_READY bit.|
| -EIOCBQUEUED is returned.         |                                     |
|-----------------------------------+-------------------------------------|

Hence this commit introduces "struct btrfs_dio_data" to track the usage of
reserved data space. The remaining unused "reserve space" can now be freed
reliably.
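
The idea of tracking the reservation per write, instead of through a shared
inode flag, can be sketched in userspace as follows. This is only a model:
the struct and function names are hypothetical and are not the fields or
helpers of the real "struct btrfs_dio_data".

```c
#include <stdint.h>

/* Hypothetical per-write tracker; it lives with the write, not the inode. */
struct dio_reservation {
	uint64_t reserve;	/* reserved bytes not yet used by an extent */
};

/* Record the space reserved before the write starts. */
static void dio_reserve(struct dio_reservation *r, uint64_t bytes)
{
	r->reserve = bytes;
}

/* Called each time part of the write is mapped to an ordered extent. */
static void dio_consume(struct dio_reservation *r, uint64_t bytes)
{
	r->reserve -= (bytes < r->reserve) ? bytes : r->reserve;
}

/*
 * Called once when the write ends, whether it succeeded, failed, or was
 * short. Returns the number of bytes that still have to be given back.
 */
static uint64_t dio_release_unused(struct dio_reservation *r)
{
	uint64_t unused = r->reserve;

	r->reserve = 0;
	return unused;
}
```

Because each write owns its own tracker, a concurrent write on the same
inode cannot hide or clear another write's unused reservation.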

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit 50745b0a7f46f68574cd2b9ae24566bf026e7ebd)
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
9 years agoBtrfs: fix warning of bytes_may_use
Liu Bo [Wed, 17 Jun 2015 08:59:58 +0000 (16:59 +0800)]
Btrfs: fix warning of bytes_may_use

Orabug: 23717870

While running generic/019, dmesg got several warnings from
btrfs_free_reserved_data_space().

Test generic/019 produces some disk failures, so submitting the dio gets
errors, in which case btrfs_direct_IO() goes to its error handling and frees
bytes_may_use. The problem is that bytes_may_use has already been freed
during get_block().

This adds a runtime flag to show whether we've gone through get_block(); if
so, don't do the cleanup work.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Tested-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit ddba1bfc2369cd0566bcfdab47599834a32d1c19)
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
9 years agoxen: use same main loop for counting and remapping pages
Juergen Gross [Wed, 18 May 2016 14:44:54 +0000 (16:44 +0200)]
xen: use same main loop for counting and remapping pages

Instead of having two functions for cycling through the E820 map, one
to count the pages to be remapped and one to remap them later, just use one
function with a caller-supplied sub-function called for each region to
be processed. This eliminates the possibility of a mismatch between
the two loops, which showed up in certain configurations.
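
The single-loop pattern described above might look roughly like the sketch
below. All names here (struct region, for_each_remap_region and the two
callbacks) are hypothetical stand-ins, not the Xen setup code itself.

```c
#include <stddef.h>

/* Hypothetical stand-in for one memory map region. */
struct region {
	unsigned long start_pfn;
	unsigned long n_pages;
	int needs_remap;
};

/* Caller-supplied sub-function: takes a region and a running value. */
typedef unsigned long (*region_fn)(const struct region *r, unsigned long acc);

/* The one and only loop: both counting and remapping go through it. */
static unsigned long for_each_remap_region(const struct region *map, size_t n,
					   region_fn fn, unsigned long acc)
{
	for (size_t i = 0; i < n; i++)
		if (map[i].needs_remap)
			acc = fn(&map[i], acc);
	return acc;
}

/* Pass 1: count the pages that will have to be remapped. */
static unsigned long count_pages(const struct region *r, unsigned long acc)
{
	return acc + r->n_pages;
}

/* Pass 2 stand-in: a real version would remap; this counts regions seen. */
static unsigned long mark_remapped(const struct region *r, unsigned long acc)
{
	(void)r;
	return acc + 1;
}
```

Since the region selection lives in one place, the counting pass and the
remapping pass cannot disagree about which regions are processed.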

Suggested-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit dd14be92fbf5bc1ef7343f34968440e44e21b46a)
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
OraBug: 24012238

9 years agoMerge branch topic/uek-4.1/xen of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
Chuck Anderson [Wed, 13 Jul 2016 08:23:20 +0000 (01:23 -0700)]
Merge branch topic/uek-4.1/xen of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoMerge branch 'topic/uek-4.1/ofed' of git://ca-git.us.oracle.com/linux-uek into uek...
Chuck Anderson [Wed, 13 Jul 2016 08:18:31 +0000 (01:18 -0700)]
Merge branch 'topic/uek-4.1/ofed' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoMerge branch 'topic/uek-4.1/drivers' of git://ca-git.us.oracle.com/linux-uek into...
Chuck Anderson [Wed, 13 Jul 2016 08:14:47 +0000 (01:14 -0700)]
Merge branch 'topic/uek-4.1/drivers' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoMerge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Wed, 13 Jul 2016 08:10:12 +0000 (01:10 -0700)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoMerge branch topic/uek-4.1/stable-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Wed, 13 Jul 2016 08:08:56 +0000 (01:08 -0700)]
Merge branch topic/uek-4.1/stable-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

9 years agoxen-blkfront: dynamic configuration of per-vbd resources
Bob Liu [Fri, 1 Jul 2016 01:11:15 +0000 (21:11 -0400)]
xen-blkfront: dynamic configuration of per-vbd resources

The current VBD layer reserves buffer space for each attached device based on
three statically configured settings which are read at boot time.
 * max_indirect_segs: Maximum amount of segments.
 * max_ring_page_order: Maximum order of pages to be used for the shared ring.
 * max_queues: Maximum of queues(rings) to be used.

But the storage backend, workload, and guest memory result in very different
tuning requirements. It's impossible to centrally predict application
characteristics, so it's best to allow the settings to be dynamically
adjusted based on the workload inside the guest.

Usage:
Show current values:
cat /sys/devices/vbd-xxx/max_indirect_segs
cat /sys/devices/vbd-xxx/max_ring_page_order
cat /sys/devices/vbd-xxx/max_queues

Write new values:
echo <new value> > /sys/devices/vbd-xxx/max_indirect_segs
echo <new value> > /sys/devices/vbd-xxx/max_ring_page_order
echo <new value> > /sys/devices/vbd-xxx/max_queues

Orabug: 23720696
Signed-off-by: Bob Liu <bob.liu@oracle.com>
--
v2: Add device lock and other comments from Konrad.

9 years agoxen-blkfront: introduce blkif_set_queue_limits()
Bob Liu [Fri, 1 Jul 2016 21:43:39 +0000 (17:43 -0400)]
xen-blkfront: introduce blkif_set_queue_limits()

blk_mq_update_nr_hw_queues() resets all queue limits to their defaults, which
is not what xen-blkfront expects. Introduce blkif_set_queue_limits() to reset
the limits to the correct initial values.

Orabug: 23720696
Signed-off-by: Bob Liu <bob.liu@oracle.com>
9 years agoxen-blkfront: fix places not updated after introducing 64KB page granularity
Bob Liu [Fri, 1 Jul 2016 19:45:57 +0000 (15:45 -0400)]
xen-blkfront: fix places not updated after introducing 64KB page granularity

Two places didn't get updated when 64KB page granularity was introduced; this
patch fixes them.

Orabug: 23720696
Signed-off-by: Bob Liu <bob.liu@oracle.com>
9 years agoIB: Add RNR timer workaround for PSIF
Santosh Shilimkar [Sat, 18 Jun 2016 20:06:29 +0000 (13:06 -0700)]
IB: Add RNR timer workaround for PSIF

The RNR NAK Retry timer on Titan and Sonoma 1&2 IB subsystems runs 500
times faster than desired. This means that retries are started a lot
sooner than they should.

The software workaround is a bit involved and intrusive because it needs
to work in mixed HCA environments. It uses CM protocol to detect the
involvement of the offending IB requestor and then enables the
workaround in the peer responder. To keep the workaround flag
persistent, ib_qp verbs need to carry the flag which impacts
IB core kABI which is wrapped under __GENKSYMS__.

The workaround matches the desired RNR NAK Retry timer value when the
encodings 1 to 14 (decimal) are supplied. For encodings larger than 14
and for zero, the work-around will set the largest possible RNR NAK
Timer value for the offending requestor, which is 1.31 ms.
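
The numbers in the paragraph above can be checked with some back-of-envelope
arithmetic. The sketch assumes, from memory of the InfiniBand RNR NAK timer
encoding table, that the largest encodable value is 655.36 ms, that encoding
14 means 1.28 ms and that encoding 15 means 1.92 ms; the function names are
hypothetical and this is not the driver's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

#define RNR_SPEEDUP        500u		/* timer runs 500 times too fast */
#define RNR_MAX_NOMINAL_US 655360u	/* assumed largest encodable value */

/* Delay actually observed when the fast timer is programmed with a value. */
static uint32_t rnr_effective_us(uint32_t programmed_us)
{
	return programmed_us / RNR_SPEEDUP;
}

/* Can a desired delay be reached by programming a 500x larger value? */
static bool rnr_can_compensate(uint32_t desired_us)
{
	return (uint64_t)desired_us * RNR_SPEEDUP <= RNR_MAX_NOMINAL_US;
}
```

Under these assumptions the largest value yields about 1.31 ms, 1.28 ms can
still be compensated, and 1.92 ms cannot, which agrees with the cut-off at
encoding 14 described above.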

Thanks to Trivino, Haakon for updates and wide range of testing for
kernel as well as userland with mixed HCA configurations.

Orabug: 23633926

Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: David Brean <david.brean@oracle.com>
Tested-by: Francisco Triviño García <francisco.trivino@oracle.com>
Signed-off-by: Francisco Triviño García <francisco.trivino@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoIB/core: Add encode/decode FDR/EDR rates
Hans Westgaard Ry [Mon, 11 Jul 2016 10:15:16 +0000 (12:15 +0200)]
IB/core: Add encode/decode FDR/EDR rates

The cases for FDR/EDR signalling speeds were missing in
ib_rate_to_mult and mult_to_ib_rate, giving wrong return values
when drivers convert a static rate to/from an inter-packet delay.
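
To illustrate how a missing switch case produces a wrong value, here is a
small sketch of a rate/multiplier pair. The enum and function names are
hypothetical, and the multipliers assume the convention of expressing a rate
as a multiple of 2.5 Gb/s rounded to the nearest integer.

```c
/* Hypothetical rate enum; FDR and EDR single-lane rates included. */
enum link_rate {
	RATE_2_5_GBPS,
	RATE_10_GBPS,
	RATE_14_GBPS,	/* FDR, one lane */
	RATE_25_GBPS,	/* EDR, one lane */
};

/* Rate to multiple of 2.5 Gb/s; -1 for a rate with no case. */
static int rate_to_mult(enum link_rate rate)
{
	switch (rate) {
	case RATE_2_5_GBPS: return 1;
	case RATE_10_GBPS:  return 4;
	case RATE_14_GBPS:  return 6;	/* 14 / 2.5 = 5.6, rounded */
	case RATE_25_GBPS:  return 10;
	default:            return -1;
	}
}

/* Inverse mapping; returns 0 on success, -1 for an unknown multiplier. */
static int mult_to_rate(int mult, enum link_rate *out)
{
	switch (mult) {
	case 1:  *out = RATE_2_5_GBPS; return 0;
	case 4:  *out = RATE_10_GBPS;  return 0;
	case 6:  *out = RATE_14_GBPS;  return 0;
	case 10: *out = RATE_25_GBPS;  return 0;
	default: return -1;
	}
}
```

If the FDR or EDR cases were left out, both functions would fall through to
the default branch, and a caller converting to or from an inter-packet delay
would work with the wrong value.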

Orabug: 23084916

Change-Id: Ib1d6e84eeea1addb830c415faf92f9f430c4ba32
Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
9 years agobfa: Fix for crash when bfa_itnim is NULL
Sudarsana Reddy Kalluru [Wed, 6 Jul 2016 10:51:29 +0000 (06:51 -0400)]
bfa: Fix for crash when bfa_itnim is NULL

Orabug: 23950878

Fix a rare corner case: when the port gets disconnected, the BFA and
FCS layers clean up references to the IT nexus.  If a task management
command is issued by the SCSI-ML during this window and ends up referencing
a NULL itnim, it could lead to a crash.

Signed-off-by: Anil Gurumurthy <anil.gurumurthy@qlogic.com>
Tested-by: Sudarsana Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agobfa:Update driver version to 3.2.25.0
Anil Gurumurthy [Thu, 26 Nov 2015 07:17:05 +0000 (12:47 +0530)]
bfa:Update driver version to 3.2.25.0

Orabug: 23950878

Signed-off-by: Sudarsana Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Anil Gurumurthy <anil.gurumurthy@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agobfa:File header and user visible string changes
Anil Gurumurthy [Thu, 26 Nov 2015 07:14:35 +0000 (12:44 +0530)]
bfa:File header and user visible string changes

Orabug: 23950878

Signed-off-by: Sudarsana Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Anil Gurumurthy <anil.gurumurthy@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agobfa:Updating copyright messages
Anil Gurumurthy [Thu, 26 Nov 2015 07:09:00 +0000 (12:39 +0530)]
bfa:Updating copyright messages

Orabug: 23950878

Signed-off-by: Sudarsana Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Anil Gurumurthy <anil.gurumurthy@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agobfa: Fix incorrect de-reference of pointer
Anil Gurumurthy [Thu, 13 Aug 2015 10:14:24 +0000 (03:14 -0700)]
bfa: Fix incorrect de-reference of pointer

Orabug: 23950878

Signed-off-by: Anil Gurumurthy <anil.gurumurthy@qlogic.com>
Tested-by: Sudarsana Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agobfa: Fix indentation
Anil Gurumurthy [Thu, 13 Aug 2015 10:12:47 +0000 (03:12 -0700)]
bfa: Fix indentation

Orabug: 23950878

Signed-off-by: Anil Gurumurthy <anil.gurumurthy@qlogic.com>
Tested-by: Sudarsana Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agolpfc updates to 11.1.0.4 for uek4-r2
rkennedy [Mon, 20 Jun 2016 18:05:12 +0000 (11:05 -0700)]
lpfc updates to 11.1.0.4 for uek4-r2

Orabug: 23762058

Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Update modified file copyrights
James Smart [Thu, 31 Mar 2016 21:12:34 +0000 (14:12 -0700)]
lpfc: Update modified file copyrights

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit 506115777af017bfc0968ee1c6aed024cdb6e43b)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Fix interaction between fdmi_on and enable_SmartSAN
James Smart [Thu, 31 Mar 2016 21:12:33 +0000 (14:12 -0700)]
lpfc: Fix interaction between fdmi_on and enable_SmartSAN

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit 8663cbbe3ba0d8142faec48bbab0dc3482e3007d)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Add support for SmartSAN 2.0
James Smart [Thu, 31 Mar 2016 21:12:32 +0000 (14:12 -0700)]
lpfc: Add support for SmartSAN 2.0

Revise versions to reflect SmartSAN 2.0 support

RDP updated to support additional descriptors:
  Credit descriptor
  Optical Element Data descriptors for Temperature, Voltage,
        Bias current, TX power and RX power.
  Optical Product Data descriptor.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit 56204984761d80b973a0a534c42566ad78303766)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Fix Device discovery failures during switch reboot test.
James Smart [Thu, 31 Mar 2016 21:12:31 +0000 (14:12 -0700)]
lpfc: Fix Device discovery failures during switch reboot test.

When the switch is rebooted, the lpfc driver fails to log
into the fabric, and an "Unexpected timeout" message is seen.

Fix: Do not issue RegVFI if the FLOGI was internally aborted.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit 342b59caa66240b670285d519fdfe2c44289b516)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Utilize embedded CDB logic to minimize IO latency
James Smart [Thu, 31 Mar 2016 21:12:30 +0000 (14:12 -0700)]
lpfc: Utilize embedded CDB logic to minimize IO latency

Pass cmd iu payloads inline to adapter job structure rather than as
separate dma buffers.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Fix crash when unregistering default rpi.
James Smart [Thu, 31 Mar 2016 21:12:29 +0000 (14:12 -0700)]
lpfc: Fix crash when unregistering default rpi.

The default rpi completion handler does back to back puts to force the
removal of the ndlp. This ends up calling lpfc_unreg_rpi after the
reference count is at 0.

Fix:  Check the reference count of the ndlp before getting the ref to
make sure we are not getting a reference on a removed object.
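
The "check before taking a reference" idea is commonly expressed as a
get-unless-zero operation. The sketch below models it in userspace with C11
atomics; the struct and function names are hypothetical and this is not the
lpfc code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted object. */
struct refnode {
	atomic_int refcount;
};

/*
 * Take a reference only if the object is still alive. Returns false when
 * the count has already dropped to 0, meaning the object is being removed
 * and must not be touched.
 */
static bool refnode_get_unless_zero(struct refnode *n)
{
	int old = atomic_load(&n->refcount);

	while (old > 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&n->refcount, &old, old + 1))
			return true;
	}
	return false;
}
```

A caller that gets false back skips the work entirely instead of operating
on an object whose last reference has already been dropped.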

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit a6517db9006eb618dfde54f4bf6a9a8bc21e16e7)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Fix DMA faults observed upon plugging loopback connector
James Smart [Thu, 31 Mar 2016 21:12:28 +0000 (14:12 -0700)]
lpfc: Fix DMA faults observed upon plugging loopback connector

Driver didn't program the REG_VFI mailbox correctly, giving the adapter
bad addresses.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit ae09c765109293b600ba9169aa3d632e1ac1a843)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Correct LOGO handling during login
James Smart [Thu, 31 Mar 2016 21:12:27 +0000 (14:12 -0700)]
lpfc: Correct LOGO handling during login

After a link bounce, when a remote port issues a LOGO while a REGLOGIN
is pending on that port, the driver does not clean up the ndlp
structure. This may result in stack traces in the console log.

Fix: Clear the NLP_REG_LOGIN_SEND flag on the ndlp in the routine

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit de96e9c5b82801ea17558c271730fdc2aa5e7e77)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: fix misleading indentation
Arnd Bergmann [Mon, 14 Mar 2016 14:29:44 +0000 (15:29 +0100)]
lpfc: fix misleading indentation

gcc-6 complains about the indentation of the lpfc_destroy_vport_work_array()
call in lpfc_online(), which clearly doesn't look right:

drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_online':
drivers/scsi/lpfc/lpfc_init.c:2880:3: warning: statement is indented as if it were guarded by... [-Wmisleading-indentation]
   lpfc_destroy_vport_work_array(phba, vports);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/lpfc/lpfc_init.c:2863:2: note: ...this 'if' clause, but it is not
  if (vports != NULL)
  ^~

Looking at the patch that introduced this code, it's clear that the
behavior is correct and the indentation is wrong.

This fixes the indentation and adds curly braces around the previous
if() block for clarity, as that is most likely what caused the code
to be misindented in the first place.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 549e55cd2a1b ("[SCSI] lpfc 8.2.2 : Fix locking around HBA's port_list")
Reviewed-by: Sebastian Herbszt <herbszt@gmx.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit aeb6641f8ebdd61939f462a8255b316f9bfab707)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: fix missing zero termination in debugfs
Alan [Mon, 15 Feb 2016 19:11:56 +0000 (19:11 +0000)]
lpfc: fix missing zero termination in debugfs

If you feed 32 bytes in then the kstrtoull() doesn't receive a terminated
string so will run off the end.
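
A userspace sketch of the general remedy: copy the input into a buffer one
byte larger than the maximum input and terminate it before parsing. It uses
strtoull() as a stand-in for kstrtoull(), and the function name and the
32-byte limit constant are illustrative assumptions.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MAX_INPUT 32	/* maximum number of input bytes accepted */

/*
 * Parse up to MAX_INPUT bytes of unterminated decimal text.
 * Returns 0 on success and -1 on bad length or malformed input.
 */
static int parse_u64_bounded(const char *src, size_t len,
			     unsigned long long *out)
{
	char buf[MAX_INPUT + 1];	/* one extra byte for the terminator */
	char *end;

	if (len == 0 || len > MAX_INPUT)
		return -1;

	memcpy(buf, src, len);
	buf[len] = '\0';		/* the parser now stops inside buf */

	errno = 0;
	*out = strtoull(buf, &end, 10);
	if (errno != 0 || end == buf || *end != '\0')
		return -1;
	return 0;
}
```

With the extra byte, a full 32-byte input is still followed by a terminator,
so the parser never reads past the copied data.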

Signed-off-by: Alan Cox <alan@linux.intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit 0872774d8a319676dea7416e0cf85bec63eea0d0)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agolpfc: Remove redundant code block in lpfc_scsi_cmd_iocb_cmpl
Johannes Thumshirn [Wed, 20 Jan 2016 15:08:40 +0000 (16:08 +0100)]
lpfc: Remove redundant code block in lpfc_scsi_cmd_iocb_cmpl

This removes a redundant code block that will either be executed if the
ENABLE_FCP_RING_POLLING flag is set in phba->cfg_poll or not. The code
is just duplicated in both cases, hence we unify it again.

This probably is a left over from some sort of refactoring.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Sebastian Herbszt <herbszt@gmx.de>
Acked-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 23762058
(cherry picked from commit 19db2307365231e798bb99324ed553bcada57913)
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
9 years agoqla2xxx: Update driver version to 8.07.00.38.40.0-k.
Sawan Chandak [Thu, 7 Jul 2016 10:16:44 +0000 (15:46 +0530)]
qla2xxx: Update driver version to 8.07.00.38.40.0-k.

Orabug: 23755773

Signed-off-by: Sawan Chandak <sawan.chandak@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Fix BBCR offset
Sawan Chandak [Thu, 30 Jun 2016 04:34:02 +0000 (21:34 -0700)]
qla2xxx: Fix BBCR offset

Orabug: 23755773

Fixes: 969a619 ("qla2xxx: Add support for buffer to buffer credit value for ISP27XX.")
Signed-off-by: Sawan Chandak <sawan.chandak@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Disable the adapter and skip error recovery in case of register disconnect.
Sawan Chandak [Fri, 3 Jun 2016 05:57:54 +0000 (11:27 +0530)]
qla2xxx: Disable the adapter and skip error recovery in case of register disconnect.

Orabug: 23755773

If there is error recovery going on due to command timeout and
there is register disconnect, then disable the adapter.

Signed-off-by: Sawan Chandak <sawan.chandak@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Separate ISP type bits out from device type.
Joe Carnuccio [Fri, 3 Jun 2016 05:55:57 +0000 (11:25 +0530)]
qla2xxx: Separate ISP type bits out from device type.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Correction to function qla26xx_dport_diagnostics().
Joe Carnuccio [Thu, 7 Jul 2016 10:08:03 +0000 (15:38 +0530)]
qla2xxx: Correction to function qla26xx_dport_diagnostics().

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Add support to handle Loop Init error Asynchronous event.
Joe Carnuccio [Thu, 7 Jul 2016 10:09:34 +0000 (15:39 +0530)]
qla2xxx: Add support to handle Loop Init error Asynchronous event.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Let DPORT be enabled purely by nvram.
Joe Carnuccio [Thu, 7 Jul 2016 08:20:20 +0000 (13:50 +0530)]
qla2xxx: Let DPORT be enabled purely by nvram.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Add bsg interface to support statistics counter reset.
Sawan Chandak [Thu, 7 Jul 2016 08:50:37 +0000 (14:20 +0530)]
qla2xxx: Add bsg interface to support statistics counter reset.

Orabug: 23755773

Signed-off-by: Sawan Chandak <sawan.chandak@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Add bsg interface to support D_Port Diagnostics.
Joe Carnuccio [Thu, 7 Jul 2016 07:57:30 +0000 (13:27 +0530)]
qla2xxx: Add bsg interface to support D_Port Diagnostics.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Check for device state before unloading the driver.
Sawan Chandak [Thu, 7 Jul 2016 07:01:32 +0000 (12:31 +0530)]
qla2xxx: Check for device state before unloading the driver.

Orabug: 23755773

During hot swap of a PCI device, a PCI error can occur on the device
while a normal driver unload is in progress. The race between the normal
driver unload and the unload triggered by the PCI error can lead to a system
crash. The fix is to check whether an unload is already in progress and, if
so, let that path unload the driver.
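
One common way to let only one of two racing paths perform a teardown is an
atomic test-and-set flag. The userspace sketch below shows that pattern with
C11 atomics; the names are hypothetical and it is not the qla2xxx
implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical adapter state holding an "unload in progress" flag. */
struct adapter {
	atomic_flag unloading;
};

/*
 * Returns true for the first caller only. Any later caller (for example
 * the PCI error path arriving during a normal unload) gets false and must
 * leave the teardown to the path that is already running.
 */
static bool adapter_begin_unload(struct adapter *a)
{
	return !atomic_flag_test_and_set(&a->unloading);
}
```

Each unload path would call adapter_begin_unload() first and return early
when it reports that an unload is already under way.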

Signed-off-by: Sawan Chandak <sawan.chandak@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Properly reset firmware statistics.
Joe Carnuccio [Fri, 1 Apr 2016 19:35:55 +0000 (12:35 -0700)]
qla2xxx: Properly reset firmware statistics.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Properly initialize IO statistics.
Joe Carnuccio [Thu, 10 Mar 2016 00:44:03 +0000 (16:44 -0800)]
qla2xxx: Properly initialize IO statistics.

Orabug: 23755773

Properly initialize IO statistics to avoid initial 0xFFFFFFFF (-1) values.

Cleanup/simplify usage of pointer to statistics structure.

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Make debug buffer log easier to view.
Joe Carnuccio [Thu, 7 Jul 2016 06:31:34 +0000 (12:01 +0530)]
qla2xxx: Make debug buffer log easier to view.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Add module parameter alternate/short names.
Joe Carnuccio [Thu, 28 Jan 2016 22:52:03 +0000 (14:52 -0800)]
qla2xxx: Add module parameter alternate/short names.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Set FLOGI retry in additional firmware options for P2P (N2N) mode.
Giridhar Malavali [Thu, 7 Jul 2016 06:15:53 +0000 (11:45 +0530)]
qla2xxx: Set FLOGI retry in additional firmware options for P2P (N2N) mode.

Orabug: 23755773

When VP decoupling is enabled, there can be a window in which FLOGI from
initiators is dropped before VP0 is enabled, causing link level recovery.
Retry FLOGI to avoid link level recovery.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Shutdown board on thermal shutdown aen.
Joe Carnuccio [Thu, 7 Jul 2016 06:03:08 +0000 (11:33 +0530)]
qla2xxx: Shutdown board on thermal shutdown aen.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Add ram area DDR for fwdump template entry T262.
Joe Carnuccio [Mon, 1 Feb 2016 18:42:04 +0000 (10:42 -0800)]
qla2xxx: Add ram area DDR for fwdump template entry T262.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agoqla2xxx: Remove sysfs node fw_dump_template.
Joe Carnuccio [Mon, 22 Feb 2016 08:52:17 +0000 (14:22 +0530)]
qla2xxx: Remove sysfs node fw_dump_template.

Orabug: 23755773

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
9 years agompt3sas: Used "synchronize_irq()"API to synchronize timed-out IO & TMs
Chaitra P B [Fri, 6 May 2016 08:59:31 +0000 (14:29 +0530)]
mpt3sas: Used "synchronize_irq()"API to synchronize timed-out IO & TMs

Replaced mpt3sas_base_flush_reply_queues() with
mpt3sas_base_sync_reply_irqs(), as mpt3sas_base_flush_reply_queues()
skips over reply queues that are currently busy (i.e. being handled by
interrupt processing on another core). If a reply queue is busy, the
call to synchronize_irq() in mpt3sas_base_sync_reply_irqs() makes sure the
other core has finished flushing the queue and completed any calls to
the mid-layer scsi_done() routine.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit 5f0dfb7a9bcc8139958f59ecb9bbd7e738ae702d)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Set maximum transfer length per IO to 4MB for VDs
Chaitra P B [Fri, 6 May 2016 08:59:30 +0000 (14:29 +0530)]
mpt3sas: Set maximum transfer length per IO to 4MB for VDs

Set maximum transfer length per IO on RAID volumes to 4MB by setting
VD's queue's max_sector to 8192.
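
The arithmetic behind the 8192 figure can be sketched in userspace C. This is
only an illustration, assuming the limit is counted in 512-byte sectors as is
conventional for the block layer; max_transfer_bytes() is a hypothetical
helper, not a driver function.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the queue limit is expressed in 512-byte sectors. */
#define SECTOR_SIZE_BYTES 512u

/* Hypothetical helper: convert a sector-count limit into bytes. */
uint64_t max_transfer_bytes(uint32_t max_sectors)
{
	assert(max_sectors > 0);
	return (uint64_t)max_sectors * SECTOR_SIZE_BYTES;
}
```

Under that assumption, max_transfer_bytes(8192) is 4194304 bytes, i.e. 4MB.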

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit 6c197093847e8cdec844df39a373bfe1f9a1ac8a)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Updating mpt3sas driver version to 13.100.00.00
Chaitra P B [Fri, 6 May 2016 08:59:29 +0000 (14:29 +0530)]
mpt3sas: Updating mpt3sas driver version to 13.100.00.00

Bump mpt3sas driver version from 12.100.00.00 to 13.100.00.00

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit b2500d76a0dbaa8993cd6b43941d23d31a312831)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Fix initial Reference tag field for 4K PI drives.
Chaitra P B [Fri, 6 May 2016 08:59:28 +0000 (14:29 +0530)]
mpt3sas: Fix initial Reference tag field for 4K PI drives.

Modified driver code to use scsi_prot_ref_tag() API instead of
scsi_get_lba(), while initializing reference tag field in the CDB.
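
Why the two values differ on 4K drives can be sketched in userspace C. The
sketch assumes that the request position is reported in 512-byte units and
that the reference tag counts protection intervals (here taken to be the
logical block size); ref_tag_for() is a hypothetical helper, not the kernel's
scsi_prot_ref_tag() implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 * pos_512:        request start position in 512-byte units
 * interval_bytes: protection interval, assumed a power of two >= 512
 */
uint32_t ref_tag_for(uint64_t pos_512, uint32_t interval_bytes)
{
	unsigned int shift = 0;

	assert(interval_bytes >= 512 &&
	       (interval_bytes & (interval_bytes - 1)) == 0);

	/* Find how many 512-byte units make up one protection interval. */
	while ((512u << shift) < interval_bytes)
		shift++;

	/* The reference tag carries only the low 32 bits of the block number. */
	return (uint32_t)(pos_512 >> shift);
}
```

With a 512-byte interval the tag equals the position, so both approaches
agree; with a 4096-byte interval the position must be divided by eight first.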

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit 648512ccd7d42ccf761f515b7c0cb456a48c477a)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Handle active cable exception event
Chaitra P B [Fri, 6 May 2016 08:59:27 +0000 (14:29 +0530)]
mpt3sas: Handle active cable exception event

To handle the 'MPI2_EVENT_ACTIVE_CABLE_EXCEPTION' event, the driver
needs to follow the steps below:
1. Unmask the 'MPI2_EVENT_ACTIVE_CABLE_EXCEPTION' event
so that the FW can notify the host driver of this event.
2. After receiving this event, add it to the AEN event queue
to notify applications.
3. Then print the message below in the kernel logs if the event data's
reason code is zero:
"Currently an active cable with ReceptacleID <ID_Value> cannot be powered
and devices connected to this active cable will not be seen. This active
cable requires <PowerValue_in_mW> of power"
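
Step 3 can be sketched in userspace C as follows. The structure layout and the
function name are hypothetical stand-ins chosen for illustration; the real
event data structure is defined in the MPI headers, and the driver logs
through the kernel rather than into a buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical event layout, not the MPI header definition. */
struct cable_event {
	uint8_t  reason_code;
	uint8_t  receptacle_id;
	uint32_t power_mw;
};

/*
 * Format the log message only when the reason code is zero.
 * Returns 1 if a message was written to buf, 0 otherwise.
 */
int format_cable_exception(const struct cable_event *ev, char *buf, size_t len)
{
	assert(ev != NULL && buf != NULL && len > 0);

	if (ev->reason_code != 0)
		return 0;

	snprintf(buf, len,
		 "Currently an active cable with ReceptacleID %u cannot be "
		 "powered and devices connected to this active cable will "
		 "not be seen. This active cable requires %u mW of power",
		 (unsigned int)ev->receptacle_id, (unsigned int)ev->power_mw);
	return 1;
}
```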

This event is only for Intruder/Cutlass HBAs.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit a470a51cd6481373cdf2b5934b1b9f7853688de9)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Update MPI header to 2.00.42
Chaitra P B [Fri, 6 May 2016 08:59:26 +0000 (14:29 +0530)]
mpt3sas: Update MPI header to 2.00.42

Updated MPI version and MPI header files.

ChangeList:
* Added SATADeviceWaitTime to SAS IO Unit Page 4
* Added EEDPObservedValue to SCSI IO Reply message
* Added MPI2_EVENT_ACTIVE_CABLE_EXCEPTION and
  MPI26_EVENT_DATA_ACTIVE_CABLE_EXCEPT

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit 4fe6bc97efebdc5083aa749850928fad1740a60d)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas - remove unused fw_event_work elements
Joe Lawrence [Mon, 18 Apr 2016 14:50:12 +0000 (10:50 -0400)]
mpt3sas - remove unused fw_event_work elements

Firmware events are queued up using the fw_event_work's struct work, not
its delayed_work member.  The initial driver for SAS2 controllers had
handled firmware reset using the rescan barrier and was later redesigned
through "mpt2sas: [Resend] Host Reset code cleanup".  The delayed_work
variables are now unused and may provoke CONFIG_DEBUG_OBJECTS_TIMERS
"assert_init not available" false warnings in
_scsih_fw_event_cleanup_queue.

Cleanup fw_event_work's unused entries, update its kerneldoc, and
update _scsih_fw_event_cleanup_queue accordingly.

Fixes: 146b16c8071f (mpt3sas: Refcount fw_events and fix unsafe list usage)
Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Acked-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit b8ac0cc78b56e798851f1435bc673761d3fb877e)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Remove usage of 'struct timeval'
Tina Ruchandani [Wed, 13 Apr 2016 07:01:40 +0000 (00:01 -0700)]
mpt3sas: Remove usage of 'struct timeval'

'struct timeval' will have its tv_sec value overflow on 32-bit systems
in the year 2038 and beyond. This patch replaces the use of struct timeval
for computing mpi_request.TimeStamp and instead uses ktime_t, which
provides a 64-bit seconds value. The computed timestamp remains
unaffected (milliseconds since Unix epoch).
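
The idea can be sketched in userspace C: carry the seconds in a 64-bit type
and derive milliseconds since the Unix epoch from it. The function names are
hypothetical, and timespec_get() stands in for the kernel's clock read; this
is not the driver code.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Convert 64-bit seconds plus nanoseconds into milliseconds since the epoch. */
uint64_t epoch_ms(int64_t sec, long nsec)
{
	assert(sec >= 0 && nsec >= 0 && nsec < 1000000000L);
	return (uint64_t)sec * 1000u + (uint64_t)(nsec / 1000000L);
}

/* Userspace stand-in for reading the wall clock. */
uint64_t current_timestamp_ms(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return epoch_ms((int64_t)ts.tv_sec, ts.tv_nsec);
}
```

Because the seconds value is 64 bits wide, a time just past the 32-bit limit
(2147483648 seconds) still converts without wrapping.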

Signed-off-by: Tina Ruchandani <ruchandani.tina@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit 23409bd4a8b051e28d0106c7a83f362617452098)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Don't overreach ioc->reply_post[] during initialization
Calvin Owens [Fri, 18 Mar 2016 19:45:42 +0000 (12:45 -0700)]
mpt3sas: Don't overreach ioc->reply_post[] during initialization

In _base_make_ioc_operational(), we walk ioc->reply_queue_list and pull
a pointer out of successive elements of ioc->reply_post[] for each entry
in that list if RDPQ is enabled.

Since the code pulls the pointer for the next iteration at the bottom of
the loop, it triggers a KASAN dump on the final iteration:

    BUG: KASAN: slab-out-of-bounds in _base_make_ioc_operational+0x47b7/0x47e0 [mpt3sas] at addr ffff880754816ab0
    Read of size 8 by task modprobe/305
    <snip>
    Call Trace:
     [<ffffffff81dfc591>] dump_stack+0x4d/0x6c
     [<ffffffff814c9689>] print_trailer+0xf9/0x150
     [<ffffffff814ceda4>] object_err+0x34/0x40
     [<ffffffff814d1231>] kasan_report_error+0x221/0x530
     [<ffffffff814d1673>] __asan_report_load8_noabort+0x43/0x50
     [<ffffffffa0043637>] _base_make_ioc_operational+0x47b7/0x47e0 [mpt3sas]
     [<ffffffffa0049a51>] mpt3sas_base_attach+0x1991/0x2120 [mpt3sas]
     [<ffffffffa0053c93>] _scsih_probe+0xeb3/0x16b0 [mpt3sas]
     [<ffffffff81ebd047>] local_pci_probe+0xc7/0x170
     [<ffffffff81ebf2cf>] pci_device_probe+0x20f/0x290
     [<ffffffff820d50cd>] really_probe+0x17d/0x600
     [<ffffffff820d56a3>] __driver_attach+0x153/0x190
     [<ffffffff820cffac>] bus_for_each_dev+0x11c/0x1a0
     [<ffffffff820d421d>] driver_attach+0x3d/0x50
     [<ffffffff820d378a>] bus_add_driver+0x44a/0x5f0
     [<ffffffff820d666c>] driver_register+0x18c/0x3b0
     [<ffffffff81ebcb76>] __pci_register_driver+0x156/0x200
     [<ffffffffa00c8135>] _mpt3sas_init+0x135/0x1000 [mpt3sas]
     [<ffffffff81000423>] do_one_initcall+0x113/0x2b0
     [<ffffffff813caa5a>] do_init_module+0x1d0/0x4d8
     [<ffffffff81273909>] load_module+0x6729/0x8dc0
     [<ffffffff81276123>] SYSC_init_module+0x183/0x1a0
     [<ffffffff8127625e>] SyS_init_module+0xe/0x10
     [<ffffffff828fe7d7>] entry_SYSCALL_64_fastpath+0x12/0x6a

Fix this by pulling the value at the beginning of the loop.
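
The shape of the fix can be sketched in userspace C. Here `posts` is a
hypothetical stand-in for ioc->reply_post[] and the counted loop stands in for
walking ioc->reply_queue_list; the function is illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the values referenced by `count` array slots. */
int sum_entries(const int *const *posts, size_t count)
{
	int total = 0;
	size_t i;

	assert(posts != NULL || count == 0);

	for (i = 0; i < count; i++) {
		/*
		 * Fetch the element for this iteration at the top of the
		 * loop. Fetching posts[i + 1] at the bottom, ready for the
		 * next pass, would read one slot past the end of the array
		 * on the final iteration.
		 */
		const int *cur = posts[i];

		total += *cur;
	}
	return total;
}
```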

Signed-off-by: Calvin Owens <calvinowens@fb.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Jens Axboe <axboe@fb.com>
Acked-by: Chaitra Basappa <chaitra.basappa@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit 5ec8a1753bc29efa7e4b1391d691c9c719b30257)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Remove unnecessary synchronize_irq() before free_irq()
Lars-Peter Clausen [Fri, 4 Mar 2016 10:15:07 +0000 (11:15 +0100)]
mpt3sas: Remove unnecessary synchronize_irq() before free_irq()

Calling synchronize_irq() right before free_irq() is quite useless. On
one hand the IRQ can easily fire again before free_irq() is entered, on
the other hand free_irq() itself calls synchronize_irq() internally (in
a race condition free way), before any state associated with the IRQ is
freed.

Patch was generated using the following semantic patch:
// <smpl>
@@
expression irq;
@@
-synchronize_irq(irq);
 free_irq(irq, ...);
// </smpl>

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit 7f8b8f3fba55b345f9b6e3f55906bef6e29e354b)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Free memory pools before retrying to allocate with different value.
Suganath prabu Subramani [Thu, 18 Feb 2016 08:39:45 +0000 (14:09 +0530)]
mpt3sas: Free memory pools before retrying to allocate with different value.

Deallocate resources before reallocating them in the retry_allocation
path of _base_allocate_memory_pools().
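
The pattern can be sketched in userspace C. The pool structure, the halving
retry policy and the function names are assumptions made for illustration;
they do not reproduce _base_allocate_memory_pools().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical pair of pools standing in for the driver's memory pools. */
struct pools {
	void *request;
	void *reply;
};

/* Free both pools and reset the pointers so a retry starts clean. */
void release_pools(struct pools *p)
{
	free(p->request);
	free(p->reply);
	p->request = NULL;
	p->reply = NULL;
}

/*
 * Allocate both pools for `depth` entries, retrying with half the depth
 * on failure. Returns the depth that succeeded, or 0 if none did.
 */
size_t allocate_pools(struct pools *p, size_t depth, size_t entry_sz)
{
	assert(p != NULL && entry_sz > 0);
	p->request = NULL;
	p->reply = NULL;

	while (depth > 0) {
		p->request = calloc(depth, entry_sz);
		p->reply = calloc(depth, entry_sz);
		if (p->request && p->reply)
			return depth;

		/* Release the partial allocation before retrying. */
		release_pools(p);
		depth /= 2;
	}
	return 0;
}
```

Without the release step, a pool obtained by a failed attempt would be
overwritten by the next attempt and leaked.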

Signed-off-by: Suganath prabu Subramani <suganath-prabu.subramani@avagotech.com>
Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit 8ff045c92708a595b7e39d68bdc0bd7edc08a073)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Remove cpumask_clear for zalloc_cpumask_var and don't free free_cpu_mask_var...
Suganath prabu Subramani [Thu, 11 Feb 2016 09:32:55 +0000 (15:02 +0530)]
mpt3sas: Remove cpumask_clear for zalloc_cpumask_var and don't free free_cpu_mask_var before reply_q

Removed cpumask_clear as it is not required for zalloc_cpumask_var and
free free_cpumask_var before freeing reply_q.

Signed-off-by: Suganath prabu Subramani <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit da3cec2515f0094796679876ba17ba359331dbf6)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Updating mpt3sas driver version to 12.100.00.00
Suganath prabu Subramani [Thu, 28 Jan 2016 06:37:07 +0000 (12:07 +0530)]
mpt3sas: Updating mpt3sas driver version to 12.100.00.00

Bump mpt3sas driver version from 09.102.00.00 to 12.100.00.00

Signed-off-by: Suganath prabu Subramani <suganath-prabu.subramani@avagotech.com>
Signed-off-by: Chaitra P B <chaitra.basappa@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit d867b655eadf01fd5231ad9f41010c4d3b002a16)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Fix for Asynchronous completion of timedout IO and task abort of timedout IO.
Suganath prabu Subramani [Thu, 28 Jan 2016 06:37:06 +0000 (12:07 +0530)]
mpt3sas: Fix for Asynchronous completion of timedout IO and task abort of timedout IO.

Track the msix of each IO and use the same msix for issuing the abort to a
timed-out IO. With this, the driver processes the IO's reply first, followed
by the TM.

Signed-off-by: Suganath prabu Subramani <suganath-prabu.subramani@avagotech.com>
Signed-off-by: Chaitra P B <chaitra.basappa@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit 03d1fb3a65783979f23bd58b5a0387e6992d9e26)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Updated MPI Header to 2.00.42
Suganath prabu Subramani [Thu, 28 Jan 2016 06:37:05 +0000 (12:07 +0530)]
mpt3sas: Updated MPI Header to 2.00.42

Updated MPI version and MPI header files.

Signed-off-by: Suganath prabu Subramani <suganath-prabu.subramani@avagotech.com>
Signed-off-by: Chaitra P B <chaitra.basappa@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit 5c739b6157bd090942e5847ddd12bfb99cd4240d)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Add support for configurable Chain Frame Size
Suganath prabu Subramani [Thu, 28 Jan 2016 06:37:04 +0000 (12:07 +0530)]
mpt3sas: Add support for configurable Chain Frame Size

Added support for configurable Chain Frame Size. Calculate the
Chain Message Frame size from the IOCMaxChainSegmentSize (iocfacts).
Applicable only for mpt3sas/SAS3.0 HBAs.

Signed-off-by: Suganath prabu Subramani <suganath-prabu.subramani@avagotech.com>
Signed-off-by: Chaitra P B <chaitra.basappa@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit ebb3024e2fd5578c800a5ae9165dd7f1a0844c11)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Added smp_affinity_enable module parameter.
Suganath Prabu Subramani [Mon, 8 Feb 2016 16:43:39 +0000 (22:13 +0530)]
mpt3sas: Added smp_affinity_enable module parameter.

Module parameter to enable/disable configuring affinity hint for msix
vector.  SMP affinity feature can be enabled/disabled by setting module
parameter "smp_affinity_enable" to 1/0.  By default this feature is
enabled. (smp_affinity_enable = 1 enabled).

Signed-off-by: Suganath prabu Subramani <suganath-prabu.subramani@avagotech.com>
Signed-off-by: Chaitra P B <chaitra.basappa@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit 64038301baed7d3d59a940ed8db311e27e8995d4)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Make use of additional HighPriority credit message frames for sending SCSI...
Suganath prabu Subramani [Thu, 28 Jan 2016 06:37:02 +0000 (12:07 +0530)]
mpt3sas: Make use of additional HighPriority credit message frames for sending SCSI IO's

The driver assumes the HighPriority credit is part of the Global credit,
but the firmware treats the HighPriority credit value and global credits as
two different values. Changed the host queue algorithm to treat global
credits and HighPriority credits as two different values.

Signed-off-by: Suganath prabu Subramani <suganath-prabu.subramani@avagotech.com>
Signed-off-by: Chaitra P B <chaitra.basappa@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit fd0331b32826dd440bdcad2ff4c1668e0224e625)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Never block the Enclosure device
Suganath prabu Subramani [Thu, 28 Jan 2016 06:37:01 +0000 (12:07 +0530)]
mpt3sas: Never block the Enclosure device

Never block the SEP device (i.e. never invoke the
scsi_internal_device_block() API for a SEP device), even for "delay not
responding" events. Blocking the SEP device will create a deadlock while
adding any device to the OS.

Signed-off-by: Suganath prabu Subramani <suganath-prabu.subramani@avagotech.com>
Signed-off-by: Chaitra P B <chaitra.basappa@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit 30158dc9bbc9d510780673a955cd4fdc36e1d366)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
9 years agompt3sas: Fix static analyzer(coverity) tool identified defects
Suganath prabu Subramani [Thu, 28 Jan 2016 06:37:00 +0000 (12:07 +0530)]
mpt3sas: Fix static analyzer(coverity) tool identified defects

1. Wrong size of argument is being passed
 The size passed to memset() should be that of the same structure type as
 the memory area the destination pointer refers to.
2. Dereference of a null return value
3. Array compared against '0'
 In the "if" statement, check whether the value at a particular index of
 the array is null, rather than the array itself.
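
The three defect classes can be sketched in userspace C. All type and function
names below are hypothetical and chosen only to show each pattern in its
corrected form; they are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct request_frame { unsigned char bytes[64]; };
struct reply_frame   { unsigned char bytes[16]; };

/* Defect 1: size the memset from the object being cleared. */
void clear_reply(struct reply_frame *reply)
{
	assert(reply != NULL);
	/* Wrong: memset(reply, 0, sizeof(struct request_frame)); */
	memset(reply, 0, sizeof(*reply));
}

/* Defect 2: check a possibly-null return value before dereferencing it. */
size_t name_len(const char *maybe_null)
{
	return maybe_null ? strlen(maybe_null) : 0;
}

/* Defect 3: test the element; an array itself never compares equal to 0. */
int slot_in_use(const char *const slots[], size_t idx)
{
	return slots[idx] != NULL;
}
```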

Signed-off-by: Suganath prabu Subramani <suganath-prabu.subramani@avagotech.com>
Signed-off-by: Chaitra P B <chaitra.basappa@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 22529571
(cherry picked from commit 869817f9e92e3b7911053e3c346560f20219e837)
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>