www.infradead.org Git - users/jedix/linux-maple.git/log

xfs: Fix off-by-in in loop termination in xfs_find_get_desired_pgoff()

There is an off-by-one error in loop termination conditions in
xfs_find_get_desired_pgoff() since 'end' may index a page beyond end of
desired range if 'endoff' is page aligned. It doesn't have any visible
effects but still it is good to fix it.

Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Orabug: 27093425

(backport upstream commit d7fd24257aa60316bf81093f7f909dc9475ae974)

Signed-off-by: Shan Hai <shan.hai@oracle.com>

xfs: Fix missed holes in SEEK_HOLE implementation

XFS SEEK_HOLE implementation could miss a hole in an unwritten extent as
can be seen by the following command:

xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "pwrite 128k 8k"
-c "seek -h 0" file
wrote 57344/57344 bytes at offset 0
56 KiB, 14 ops; 0.0000 sec (49.312 MiB/sec and 12623.9856 ops/sec)
wrote 8192/8192 bytes at offset 131072
8 KiB, 2 ops; 0.0000 sec (70.383 MiB/sec and 18018.0180 ops/sec)
Whence Result
HOLE 139264

Where we can see that hole at offset 56k was just ignored by SEEK_HOLE
implementation. The bug is in xfs_find_get_desired_pgoff() which does
not properly detect the case when pages are not contiguous.

Fix the problem by properly detecting when found page has larger offset
than expected.

CC: stable@vger.kernel.org
Fixes: d126d43f631f996daeee5006714fed914be32368
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Orabug: 27093425

(backport upstream commit 5375023ae1266553a7baa0845e82917d8803f48c)

Signed-off-by: Shan Hai <shan.hai@oracle.com>

ext4: fix off-by-in in loop termination in ext4_find_unwritten_pgoff()

There is an off-by-one error in loop termination conditions in
ext4_find_unwritten_pgoff() since 'end' may index a page beyond end of
desired range if 'endoff' is page aligned. It doesn't have any visible
effects but still it is good to fix it.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Orabug: 27093425

(backport upstream commit 3f1d5bad3fae983da07be01cff2fde13293bb7b9)

Signed-off-by: Shan Hai <shan.hai@oracle.com>

ext4: fix SEEK_HOLE

Currently, SEEK_HOLE implementation in ext4 may both return that there's
a hole at some offset although that offset already has data and skip
some holes during a search for the next hole. The first problem is
demostrated by:

xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "seek -h 0" file
wrote 57344/57344 bytes at offset 0
56 KiB, 14 ops; 0.0000 sec (2.054 GiB/sec and 538461.5385 ops/sec)
Whence Result
HOLE 0

Where we can see that SEEK_HOLE wrongly returned offset 0 as containing
a hole although we have written data there. The second problem can be
demonstrated by:

xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "pwrite 128k 8k"
-c "seek -h 0" file

wrote 57344/57344 bytes at offset 0
56 KiB, 14 ops; 0.0000 sec (1.978 GiB/sec and 518518.5185 ops/sec)
wrote 8192/8192 bytes at offset 131072
8 KiB, 2 ops; 0.0000 sec (2 GiB/sec and 500000.0000 ops/sec)
Whence Result
HOLE 139264

Where we can see that hole at offsets 56k..128k has been ignored by the
SEEK_HOLE call.

The underlying problem is in the ext4_find_unwritten_pgoff() which is
just buggy. In some cases it fails to update returned offset when it
finds a hole (when no pages are found or when the first found page has
higher index than expected), in some cases conditions for detecting hole
are just missing (we fail to detect a situation where indices of
returned pages are not contiguous).

Fix ext4_find_unwritten_pgoff() to properly detect non-contiguous page
indices and also handle all cases where we got less pages then expected
in one place and handle it properly there.

CC: stable@vger.kernel.org
Fixes: c8c0df241cc2719b1262e627f999638411934f60
CC: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit 7d95eddf313c88b24f99d4ca9c2411a4b82fef33)

Orabug: 27093425

Signed-off-by: Shan Hai <shan.hai@oracle.com>

uek-rpm: Add more missing modules to OL7 ueknano

Orabug: 27092501

List of configs enabled to build vmlinux for both OL6 QU5 nano and OL7
QU6 nano are compared. Few of the configs built in vmlinux is moved to
build as modules in OL7 and those modules are missing in ueknano. This
commit adds them to nano_modules.list

Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>

fix unbalanced page refcounting in bio_map_user_iov

bio_map_user_iov and bio_unmap_user do unbalanced pages refcounting if
IO vector has small consecutive buffers belonging to the same page.
bio_add_pc_page merges them into one, but the page reference is never
dropped.

Cc: stable@vger.kernel.org
Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 95d78c28b5a85bacbc29b8dba7c04babb9b0d467)

Orabug: 27062562
CVE: CVE-2017-12190

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>

more bio_map_user_iov() leak fixes

we need to take care of failure exit as well - pages already
in bio should be dropped by analogue of bio_unmap_pages(),
since their refcounts had been bumped only once per reference
in bio.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 2b04e8f6bbb196cab4b232af0f8d48ff2c7a8058)

Orabug: 27062562
CVE: CVE-2017-12190

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Acked-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed-by: Shan Hai <shan.hai@oracle.com>
Conflicts:
block/bio.c

virtio-pci: alloc only resources actually used.

Move resource allocation from common code to legacy and modern code.
Only request resources actually used, i.e. bar0 in legacy mode and
the bar(s) specified by capabilities in modern mode.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 59a5b0f7bf74f88da6670bcbf924d8cc1e75b1ee)

Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Jim Quigley <Jim.Quigley@oracle.com>
Orabug: 27054871

Conflicts:
drivers/virtio/virtio_pci_legacy.c

This was a minor contextual change introduced because upstream
applied the following patches in the order :-

vring: Use the DMA API on Xen (Andy Lutomirski)  [Orabug: 26388044]
virtio_pci: Use the DMA API if enabled (Andy Lutomirski)  [Orabug: 26388044]
virtio_mmio: Use the DMA API if enabled (Andy Lutomirski)  [Orabug: 26388044]
virtio: Add improved queue allocation API (Andy Lutomirski)  [Orabug: 26388044]
virtio_ring: Support DMA APIs (Andy Lutomirski)  [Orabug: 26388044]
vring: Introduce vring_use_dma_api() (Andy Lutomirski) Orabug: 26388044]
virtio-pci: alloc only resources actually used.

and we are now applying them as :-

virtio-pci: alloc only resources actually used.
vring: Use the DMA API on Xen (Andy Lutomirski)  [Orabug: 26388044]
virtio_pci: Use the DMA API if enabled (Andy Lutomirski)  [Orabug: 26388044]
....

usb: Quiet down false peer failure messages

My recent Intel box is spewing these messages:

xhci_hcd 0000:00:14.0: xHCI Host Controller
xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 2
usb usb2: New USB device found, idVendor=1d6b, idProduct=0003
usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb2: Product: xHCI Host Controller
usb usb2: Manufacturer: Linux 4.3.0+ xhci-hcd
usb usb2: SerialNumber: 0000:00:14.0
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 6 ports detected
usb: failed to peer usb2-port2 and usb1-port1 by location (usb2-port2:none) (usb1-port1:usb2-port1)
usb usb2-port2: failed to peer to usb1-port1 (-16)
usb: port power management may be unreliable
usb: failed to peer usb2-port3 and usb1-port1 by location (usb2-port3:none) (usb1-port1:usb2-port1)
usb usb2-port3: failed to peer to usb1-port1 (-16)
usb: failed to peer usb2-port5 and usb1-port1 by location (usb2-port5:none) (usb1-port1:usb2-port1)
usb usb2-port5: failed to peer to usb1-port1 (-16)
usb: failed to peer usb2-port6 and usb1-port1 by location (usb2-port6:none) (usb1-port1:usb2-port1)
usb usb2-port6: failed to peer to usb1-port1 (-16)

Diving into the acpi tables, I noticed the EHCI hub has 12 ports while the XHCI
hub has 8 ports. Most of those ports are of connect type USB_PORT_NOT_USED
(including port 1 of the EHCI hub).

Further the unused ports have location data initialized to 0x80000000.

Now each unused port on the xhci hub walks the port list and finds a matching
peer with port1 of the EHCI hub because the zero'd out group id bits falsely match.
After port1 of the XHCI hub, each following matching peer will generate the
above warning.

These warnings seem to be harmless for this scenario as I don't think it
matters that unused ports could not create a peer link.

The attached patch utilizes that assumption and just turns the pr_warn into
pr_debug to quiet things down.

Tested on my Intel box.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6406eeb3f5bb376c7d9674e61f8da34ce7f05e8d)

Orabug: 27080216

Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>

xscore: add dma address check

When "swiotlb buffer is full" error occurs, the DMA allocation
will fail. The data conn and control conn are disconnected. Then
xscore will make error handling. In this error handling, the
unallocated DMA address is unmapped. This will result in the crash.
To avoid crash, the dma address check is added.

Orabug: 27074085

Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

netlink: allow to listen "all" netns

More accurately, listen all netns that have a nsid assigned into the netns
where the netlink socket is opened.
For this purpose, a netlink socket option is added:
NETLINK_LISTEN_ALL_NSID. When this option is set on a netlink socket, this
socket will receive netlink notifications from all netns that have a nsid
assigned into the netns where the socket has been opened. The nsid is sent
to userland via an anscillary data.

With this patch, a daemon needs only one socket to listen many netns. This
is useful when the number of netns is high.

Because 0 is a valid value for a nsid, the field nsid_is_set indicates if
the field nsid is valid or not. skb->cb is initialized to 0 on skb
allocation, thus we are sure that we will never send a nsid 0 by error to
the userland.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Orabug: 23634951
(cherry picked from commit 59324cf35aba5336b611074028777838a963d03b)
Signed-off-by: Cathy Zhou <Cathy.Zhou@Oracle.COM>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

netlink: rename private flags and states

These flags and states have the same prefix (NETLINK_) that netlink socket
options. To avoid confusion and to be able to name a flag like a socket
option, let's use an other prefix: NETLINK_[S|F]_.

Note: a comment has been fixed, it was talking about
NETLINK_RECV_NO_ENOBUFS socket option instead of NETLINK_NO_ENOBUFS.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Orabug: 23634951
(cherry picked from commit cc3a572fe6cf586f478546215bc5d3694357d71e)
Signed-off-by: Cathy Zhou <Cathy.Zhou@Oracle.COM>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

netns: use a spin_lock to protect nsid management

Before this patch, nsid were protected by the rtnl lock. The goal of this
patch is to be able to find a nsid without needing to hold the rtnl lock.

The next patch will introduce a netlink socket option to listen to all
netns that have a nsid assigned into the netns where the socket is opened.
Thus, it's important to call rtnl_net_notifyid() outside the spinlock, to
avoid a recursive lock (nsid are notified via rtnl). This was the main
reason of the previous patch.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Orabug: 23634951
(cherry picked from commit 95f38411df055a0ecefe3a3d119d98241087d5ca)
Signed-off-by: Cathy Zhou <Cathy.Zhou@Oracle.COM>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

netns: notify new nsid outside __peernet2id()

There is no functional change with this patch. It will ease the refactoring
of the locking system that protects nsids and the support of the netlink
socket option NETLINK_LISTEN_ALL_NSID.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Orabug: 23634951
(cherry picked from commit 3138dbf881274cb20d9aa1b307861f689e820fbe)
Signed-off-by: Cathy Zhou <Cathy.Zhou@Oracle.COM>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

netns: rename peernet2id() to peernet2id_alloc()

In a following commit, a new function will be introduced to only lookup for
a nsid (no allocation if the nsid doesn't exist). To avoid confusion, the
existing function is renamed.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Orabug: 23634951
(cherry picked from commit 7a0877d4b438886b72be61632eaa774d13262f70)
Signed-off-by: Cathy Zhou <Cathy.Zhou@Oracle.COM>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

netns: always provide the id to rtnl_net_fill()

The goal of this commit is to prepare the rework of the locking of nsnid
protection.
After this patch, rtnl_net_notifyid() will not call anymore __peernet2id(),
ie no idr_* operation into this function.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Orabug: 23634951
(cherry picked from commit cab3c8ec8d57ef48ed754ee7acf2b9bdce80fa5f)
Signed-off-by: Cathy Zhou <Cathy.Zhou@Oracle.COM>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

netns: returns always an id in __peernet2id()

All callers of this function expect a nsid, not an error.
Thus, returns NETNSA_NSID_NOT_ASSIGNED in case of error so that callers
don't have to convert the error to NETNSA_NSID_NOT_ASSIGNED.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Orabug: 23634951
(cherry picked from commit 109582af18b9aade9385ea6609a792f80a7d70ca)
Signed-off-by: Cathy Zhou <Cathy.Zhou@Oracle.COM>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

uek-rpm: add update-el-x86; fix-up ol6/update-el

The build env will start looking for uek-rpm/ol?/update-el-<arch>.
Add uek-rpm/ol6/update-el-x86, specifying 6.7.
Add uek-rpm/ol7/update-el-x86, specifying 7.3.

For now, leave uek-rpm/ol?/update-el. Remove them later after we know
that there are no issues with using uek-rpm/ol?/update-el-x86.

Set uek-rpm/ol6/update-el to 6.7 which is the current UEK4 build env.
It was set previously to 6.6 but has been overridden by the build env.
The build env is about to be changed so that it will no longer override it.
uek-rpm/ol7/update-el has the correct value, 7.3, so no change is needed.

Delete the unused uek-rpm/ol6/update-el-ol6 and uek-rpm/ol7/update-el-ol7.

Orabug: 27004340
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>

uek-rpm: disable CONFIG_NUMA_BALANCING_DEFAULT_ENABLED

Orabug: 26798697

This commit disables automatic NUMA balancing by default.
Numerous customers have experienced high iowait due to
oracle db processes sitting in D state on wait_on_page_bit()
because memory pages containing data segment of the oracle
db processes are migrated aggressively across NUMA nodes.

The solution is to disable automatic NUMA balancing by
default until NUMA balancing algorithm is modified to
address this issue. Note that NUMA balancing may
still be enabled with this change by setting
kernel.numa_balancing=1 (e.g. via sysctl).

Signed-off-by: Fred Herard <fred.herard@oracle.com>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>

qla2xxx: Update driver version to 9.00.00.00.40.0-k

Orabug: 26844197, 26923029

Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Fix delayed response to command for loop mode/direct connect.

Orabug: 26844197, 26923029

Current driver wait for FW to be in the ready state before
processing in-coming commands. For Arbitrated Loop or
Point-to- Point (not switch), FW Ready state can take a while.
FW will transition to ready state after all Nports have been
logged in. In the mean time, certain initiators have completed
the login and starts IO. Driver needs to start processing all
queues if FW is already started.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Use IOCB interface to submit non-critical MBX.

Orabug: 26844197, 26923029

The Mailbox interface is currently over subscribed. We like
to reserve the Mailbox interface for the chip managment and
link initialization. Any non essential Mailbox command will
be routed through the IOCB interface. The IOCB interface is
able to absorb more commands.

Following commands are being routed through IOCB interface

- Get ID List (007Ch)
- Get Port DB (0064h)
- Get Link Priv Stats (006Dh)

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Add async new target notification

Orabug: 26844197, 26923029

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Allow relogin to proceed if remote login did not finish

Orabug: 26844197, 26923029

If the remote port have started the login process, then the
PLOGI and PRLI should be back to back. Driver will allow
the remote port to complete the process. For the case where
the remote port decide to back off from sending PRLI, this
local port sets an expiration timer for the PRLI. Once the
expiration time passes, the relogin retry logic is allowed
to go through and perform login with the remote port.

Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Fix sess_lock & hardware_lock lock order problem.

Orabug: 26844197, 26923029

The main lock that needs to be held for CMD or TMR submission
to upper layer is the sess_lock. The sess_lock is used to
serialize cmd submission and session deletion. The addition
of hardware_lock being held is not necessary. This patch removes
hardware_lock dependency from CMD/TMR submission.

Use hardware_lock only for error response in this case.

Path1
       CPU0                    CPU1
       ----                    ----
  lock(&(&ha->tgt.sess_lock)->rlock);
                               lock(&(&ha->hardware_lock)->rlock);
                               lock(&(&ha->tgt.sess_lock)->rlock);
  lock(&(&ha->hardware_lock)->rlock);

Path2/deadlock
*** DEADLOCK ***
Call Trace:
dump_stack+0x85/0xc2
print_circular_bug+0x1e3/0x250
__lock_acquire+0x1425/0x1620
lock_acquire+0xbf/0x210
_raw_spin_lock_irqsave+0x53/0x70
qlt_sess_work_fn+0x21d/0x480 [qla2xxx]
process_one_work+0x1f4/0x6e0

Cc: <stable@vger.kernel.org>
Cc: Bart Van Assche <Bart.VanAssche@sandisk.com>
Reported-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Fix inadequate lock protection for ABTS.

Orabug: 26844197, 26923029

Normally, ABTS is sent to Target Core as Task MGMT command.
In the case of error, qla2xxx needs to send response, hardware_lock
is required to prevent request queue corruption.

Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Fix request queue corruption.

Orabug: 26844197, 26923029

When FW notify driver or driver detects low FW resource,
driver tries to send out Busy SCSI Status to tell Initiator
side to back off. During the send process, the lock was not held.

Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Fix memory leak for abts processing

Orabug: 26844197, 26923029

Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

scsi: qla2xxx: Fix ql_dump_buffer

Orabug: 26844197, 26923029

Recent printk changes for KERN_CONT cause this logging to be defectively
emitted on multiple lines.  Fix it.

Also reduces object size a trivial amount.

$ size drivers/scsi/qla2xxx/qla_dbg.o*
   text    data     bss     dec     hex filename
  39125       0       0   39125    98d5 drivers/scsi/qla2xxx/qla_dbg.o.new
  39164       0       0   39164    98fc drivers/scsi/qla2xxx/qla_dbg.o.old

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

scsi: qla2xxx: Fix response queue count for Target mode.

Orabug: 26844197, 26923029

Target mode initialization was not calculating response queue values
correctly resulting into one less MSI-X vector.

[mkp: fixed Fixes: hash]

Cc: <stable@vger.kernel.org>
Fixes: 093df73771ba ("scsi: qla2xxx: Fix Target mode handling with Multiqueue changes.")
Signed-off-by: Michael Hernandez <michael.hernandez@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

scsi: qla2xxx: Cleaned up queue configuration code.

Orabug: 26844197, 26923029

This patch cleaned up queue configuration code, such that once
initialized, we should not touch msix_count value. This will prevent
incorrect numbers of MSI-X vectors requested while performing target
mode configuration.

[mkp: fixed Fixes: hash]

Cc: <stable@vger.kernel.org>
Fixes: d74595278f4a ("scsi: qla2xxx: Add multiple queue pair functionality.")
Signed-off-by: Michael Hernandez <michael.hernandez@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Fix a warning reported by the "smatch" static checker

Orabug: 26844197, 26923029

Fix the following warning reported by the "smatch" static checker:

drivers/scsi/qla2xxx/qla_init.c:3910 qla2x00_alloc_fcport()
warn: use 'flags' here instead of GFP_XXX?

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Simplify usage of SRB structure in driver

Orabug: 26844197, 26923029

This patch simplifies SRB structure usage in driver.

- Simplify sp->done() and sp->free() interfaces.
- Remove sp->fcport->vha to use vha pointer from sp.
- Use sp->vha context in qla2x00_rel_sp().

Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Improve RSCN handling in driver

Orabug: 26844197, 26923029

Current code blindly does State Change Registration when
the link is up. Move SCR behind fabric scan, so that arbitrated
loop scan would not get erroneous error message.

Some of the other improvements are as follows

- Add session deletion for TPRLO and send acknowledgment for TPRLO.
- Enable FW option to move ABTS, RIDA & PUREX from RSPQ to ATIOQ.
- Save NPort ID early in link init.
- Move ABTS & RIDA to ATIOQ helps in keeping command ordering and
link up sequence ordering.
- Save Nport ID and update VP map so that SCSI CMD/ATIO won't be dropped.
- fcport alloc does the initializes memory to zero. Remove memset to
zero since It might corrupt link list.
- Turn off Registration for State Change MB in loop mode.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Add framework for async fabric discovery

Orabug: 26844197, 26923029

Currently code performs a full scan of the fabric for
every RSCN. Its an expensive process in a noisy large SAN.

This patch optimizes expensive fabric discovery process by
scanning switch for the affected port when RSCN is received.

Currently Initiator Mode code makes login/logout decision without
knowledge of target mode. This causes driver and firmware to go
out-of-sync. This framework synchronizes both initiator mode
personality and target mode personality in making login/logout
decision.

This patch adds following capabilities in the driver

- Send Notification Acknowledgement asynchronously.
- Update session/fcport state asynchronously.
- Create a session or fcport struct asynchronously.
- Send GNL asynchronously. The command will ask FW to
  provide a list of FC Port entries FW knows about.
- Send GPDB asynchronously. The command will ask FW to
  provide detail data of an FC Port FW knows about or
  perform ADISC to verify the state of the session.
- Send GPNID asynchronously. The command will ask switch
  to provide WWPN for provided NPort ID.
- Send GPSC asynchronously. The command will ask switch
  to provide registered port speed for provided WWPN.
- Send GIDPN asynchronously. The command will ask the
  switch to provide Nport ID for provided WWPN.
- In driver unload path, schedule all session for deletion
  and wait for deletion to complete before allowing driver
  unload to proceed.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
[ bvanassche: fixed spelling in patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Track I-T nexus as single fc_port struct

Orabug: 26844197, 26923029

Current code merges qla_tgt_sess and fc_port structure
into single fc_port structure representing same I-T nexus.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
[ bvanassche: fixed spelling of patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: introduce a private sess_kref

Orabug: 26844197, 26923029

This stops abusing the common sess_kref to overload it for private
usage, which allows removing the shutdown_session method as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Use d_id instead of s_id for more clarity

Orabug: 26844197, 26923029

Updated code with d_id from s_id for better readability and
clarity.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: fixed spelling of patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Fix wrong argument in sp done callback

Orabug: 26844197, 26923029

Callback for sp->done expects scsi_qla_host is passed in as argument,
Instead qla_hw_data is passed in.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Remove SRR code

Orabug: 26844197, 26923029

During initial implementation, tape support was included but not
enabled by default on target. So far, we don't see any target
customer requesting this support. Since this code is not being
used actively, we want to remove it and we will add back if there
are any request in future for SRR support.

Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Cleanup TMF code translation from qla_target

Orabug: 26844197, 26923029

Move code code which converts Task Mgmt Command flags for
ATIO to TCM #defines, from qla2xxx driver to tcm_qla2xxx
driver.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Disable out-of-order processing by default in firmware

Orabug: 26844197, 26923029

Out of order(OOO) processing requires initiator, switch
and target to support OOO. In today's environment, none
of the switches support OOO. OOO requires extra buffer
space which affect performance. By turning ON this feature
in QLogic's FW, it delays error recovery because dropped
frame is treated as out of order frame. We're turning OFF
this option of speed up error recovery.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: Fixed spelling in patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Fix erroneous invalid handle message

Orabug: 26844197, 26923029

Termination of Immediate Notify IOCB was using wrong
IOCB handle. IOCB completion code was unable to find
appropriate code path due to wrong handle.

Following message is seen in the logs.

"Error entry - invalid handle/queue (ffff)."

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: Fixed word order in patch title ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Reduce exess wait during chip reset

Orabug: 26844197, 26923029

Soft reset and Risc reset should take 100uS to complete.
This change pad the timeout up to 400uS, which should be
plenty.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Terminate exchange if corrupted

Orabug: 26844197, 26923029

Corrupted ATIO is defined as length of fcp_header & fcp_cmd
payload is less than 0x38. It's the minimum size for a frame to
carry 8..16 bytes SCSI CDB. The exchange will be dropped or
terminated if corrupted.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: Fixed spelling in patch title ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Fix crash due to null pointer access

Orabug: 26844197, 26923029

During code inspection, while investigating following stack trace
seen on one of the test setup, we found out there was possibility
of memory leak becuase driver was not unwinding the stack properly.

This issue has not been reproduced in a test environment or on a
customer setup.

Here's stack trace that was seen.

[1469877.797315] Call Trace:
[1469877.799940]  [<ffffffffa03ab6e9>] qla2x00_mem_alloc+0xb09/0x10c0 [qla2xxx]
[1469877.806980]  [<ffffffffa03ac50a>] qla2x00_probe_one+0x86a/0x1b50 [qla2xxx]
[1469877.814013]  [<ffffffff813b6d01>] ? __pm_runtime_resume+0x51/0xa0
[1469877.820265]  [<ffffffff8157c1f5>] ? _raw_spin_lock_irqsave+0x25/0x90
[1469877.826776]  [<ffffffff8157cd2d>] ? _raw_spin_unlock_irqrestore+0x6d/0x80
[1469877.833720]  [<ffffffff810741d1>] ? preempt_count_sub+0xb1/0x100
[1469877.839885]  [<ffffffff8157cd0c>] ? _raw_spin_unlock_irqrestore+0x4c/0x80
[1469877.846830]  [<ffffffff81319b9c>] local_pci_probe+0x4c/0xb0
[1469877.852562]  [<ffffffff810741d1>] ? preempt_count_sub+0xb1/0x100
[1469877.858727]  [<ffffffff81319c89>] pci_call_probe+0x89/0xb0

Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: Fixed spelling in patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Collect additional information to debug fw dump

Orabug: 26844197, 26923029

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Reset reserved field in firmware options to 0

Orabug: 26844197, 26923029

During NVRAM initialization in target mode, reset reserved
fields in firmware options to Zero (BIT 15)

Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Include ATIO queue in firmware dump when in target mode

Orabug: 26844197, 26923029

Include ATIO queue for ISP27XX when firmware dump is collected
for target mode.

Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Fix wrong IOCB type assumption

Orabug: 26844197, 26923029

qlt_reset is called with Immedidate Notify IOCB only.
Current code wrongly cast it as ATIO IOCB.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Add DebugFS node for target sess list.

Orabug: 26844197, 26923029

#cat  /sys/kernel/debug/qla2xxx/qla2xxx_31/tgt_sess
qla2xxx_31
Port ID   Port Name                Handle
ff:fc:01  21:fd:00:05:33:c7:ec:16  0
01:0e:00  21:00:00:24:ff:7b:8a:e4  1
01:0f:00  21:00:00:24:ff:7b:8a:e5  2
....

(Drop ->check_initiator_node_acl() parameter usage - nab)

Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

tcm_qla2xxx: Convert to target_alloc_session usage

Orabug: 26844197, 26923029

This patch converts existing qla2xxx target mode assignment
of struct qla_tgt_sess related sid + loop_id values to use
a callback via the new target_alloc_session API caller.

Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Use ATIO type to send correct tmr response

Orabug: 26844197, 26923029

The function value inside se_cmd can change if the TMR is cancelled.
Use original ATIO Type to correctly determine CTIO response.

Signed-off-by: Swapnil Nagle <swapnil.nagle@purestroage.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Fix TMR ABORT interaction issue between qla2xxx and TCM

Orabug: 26844197, 26923029

During lun reset, TMR thread from TCM would issue abort
to qla driver. At abort time, each command is in different
state. Depending on the state, qla will use the TMR thread
to trigger a command free(cmd_kref--) if command is not
down at firmware.

Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Move atioq to a different lock to reduce lock contention

Orabug: 26844197, 26923029

99% of the time the ATIOQ has SCSI command.  The other 1% of time
is something else.  Most of the time this interrupt does not need
to hold the hardware_lock.  We're moving the ATIO interrupt thread
to a different lock to reduce lock contention.

Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Remove dependency on hardware_lock to reduce lock contention.

Orabug: 26844197, 26923029

Sessions management (add, deleted, modify) currently are serialized
through the hardware_lock. Hardware_lock is a high traffic lock.
This lock is accessed by both the transmit & receive sides.

Sessions management is now moved off to another lock call sess_lock.
This is done to reduce lock contention and increase traffic throughput.

Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Replace QLA_TGT_STATE_ABORTED with a bit.

Orabug: 26844197, 26923029

Replace QLA_TGT_STATE_ABORTED state with a bit because
the current state of the command is lost when an abort
is requested by upper layer.

Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Wait for all conflicts before ack'ing PLOGI

Orabug: 26844197, 26923029

Until now ack'ing of a new PLOGI has only been delayed if there
was an existing session for the same WWN. Ack was released when
the session deletion completed.

If there was another WWN session with the same fc_id/loop_id pair
(aka "conflicting session"), PLOGI was still ack'ed immediately.
This potentially caused a problem when old session deletion logged
fc_id/loop_id out of FW after new session has been established.

Two work-arounds were attempted before:
1. Dropping PLOGIs until conflicting session goes away.
2. Detecting initiator being logged out of FW and issuing LOGO
to force re-login.

This patch introduces proper solution to the problem where PLOGI
is held until either existing session with same WWN or any
conflicting session goes away. Mechanism supports one session holding
two PLOGI acks as well as one PLOGI ack being held by many sessions.

Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Acked-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Delete session if initiator is gone from FW

Orabug: 26844197, 26923029

1. Initiator A is logged in with fc_id(1)/loop_id(1)
2. Initiator A re-logs in with fc_id(2)/loop_id(2)
3. Part of old session deletion async logoout for 1/1 is queued
4. Initiator B logs in with fc_id(1)/loop_id(1), starts
   passing data and creates session.
   5. Async logo from 3 is processed by DPC and sent to FW

   Now initiator B has the session but is logged out from FW.

   This condition is detected first with CTIO error 29 at which
   point we should delete current session. During session
   deletion we will send LOGO to initiator to force re-login.

   Under rare circumstances initiator might be logged out of FW,
   not have driver session, but still think it's logged in.
   E.g. the above sequence plus session deletion due to re-config.
   Incoming commands will fail to create local session because
   initiator is not found in FW. In this case we also issue LOGO
   to initiator to force him re-login.

   Finally this patch fixes exchange leak when commands where
   received in logged out state. In this case loop_id must be
   set to FFFF when corresponding exchange is terminated. The
   patch modifies exchange termination to always use FFFF,
   since in certain scenarios it's impossible to tell whether
   command was received in logged in or logged out state.

Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Acked-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Added interface to send explicit LOGO.

Orabug: 26844197, 26923029

This patch adds interface to send explicit LOGO
explicit LOGO using using ELS commands from driver.

Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Add FW resource count in DebugFS.

Orabug: 26844197, 26923029

DebugFS now will show fw_resource_count node.

FW Resource count

Original TGT exchg count[0]
current TGT exchg count[0]
original Initiator Exchange count[2048]
Current Initiator Exchange count[2048]
Original IOCB count[2078]
Current IOCB count[2067]
MAX VP count[254]
MAX FCF count[0]

Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Enable Target counters in DebugFS.

Orabug: 26844197, 26923029

Following counters are added in target mode to help debugging efforts.

Target Counters

qla_core_sbt_cmd = 0
qla_core_ret_sta_ctio = 0
qla_core_ret_ctio = 0
core_qla_que_buf = 0
core_qla_snd_status = 0
core_qla_free_cmd = 0
num alloc iocb failed = 0
num term exchange sent = 0
num Q full sent = 0

Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: terminate exchange when command is aborted by LIO

Orabug: 26844197, 26923029

The newly introduced aborted_task TFO callback has to terminate
exchange with QLogic driver, since command is being deleted and
no status will be queued to the driver at a later point.

This patch also moves the burden of releasing one cmd refcount to
the aborted_task handler.

Changed iSCSI aborted_task logic to satisfy the above requirement.

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Acked-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: drop cmds/tmrs arrived while session is being deleted

Orabug: 26844197, 26923029

If a new initiator (different WWN) shows up on the same fcport, old
initiator's session is scheduled for deletion. But there is a small
window between it being marked with QLA_SESS_DELETION_IN_PROGRESS
and qlt_unret_sess getting called when new session's commands will
keep finding old session in the fcport map.

This patch drops cmds/tmrs if they find session in the progress of
being deleted.

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Acked-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: disable scsi_transport_fc registration in target mode

Orabug: 26844197, 26923029

There are multiple reasons for disabling this:

1. It provides no functional benefit. We pretty much only get a few more
sysfs entries for each port, but all that information is already
available from /sys/kernel/debug/target/qla-session-X

2. It already only works in private-loop mode. By disabling we'll be
getting more uniform behavior with fabric mode.

3. It creates complications for the new PLOGI handling mechanism:
scsi_transport_fc port deletion timer could race with new session
from initiator and cause logout after successful login.

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: added sess generations to detect RSCN update races

Orabug: 26844197, 26923029

RSCN processing in qla2xxx driver can run in parallel with ELS/IO
processing. As such the decision to remove disappeared fc port's
session could be stale, because a new login sequence has occurred
since and created a brand new session.

Previous mechanism of dealing with this by delaying deletion request
was prone to erroneous deletions if the event that was supposed to
cancel the deletion never arrived or has been delayed in processing.

New mechanism relies on a time-like generation counter to serialize
RSCN updates relative to ELS/IO updates.

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Abort stale cmds on qla_tgt_wq when plogi arrives

Orabug: 26844197, 26923029

cancel any commands from initiator's s_id that are still waiting
on qla_tgt_wq when PLOGI arrives.

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Acked-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: delay plogi/prli ack until existing sessions are deleted

Orabug: 26844197, 26923029

- keep qla_tgt_sess object on the session list until it's freed

- modify use of sess->deleted flag to differentiate delayed
  session deletion that can be cancelled from irreversible one:
  QLA_SESS_DELETION_PENDING vs QLA_SESS_DELETION_IN_PROGRESS

- during IN_PROGRESS deletion all newly arrived commands and TMRs will
  be rejected, existing commands and TMRs will be terminated when
  given by the core to the fabric or simply dropped if session logout
  has already happened (logout terminates all existing exchanges)

- new PLOGI will initiate deletion of the following sessions
  (unless deletion is already IN_PROGRESS):
  - with the same port_name (with logout)
  - different port_name, different loop_id but the same port_id
    (with logout)
  - different port_name, different port_id, but the same loop_id
    (without logout)

- additionally each new PLOGI will store imm notify iocb in the
  same port_name session being deleted. When deletion process
  completes this iocb will be acked. Only the most recent PLOGI
  iocb is stored. The older ones will be terminated when replaced.

- new PRLI will initiate deletion of the following sessions
  (unless deletion is already IN_PROGRESS):
  - different port_name, different port_id, but the same loop_id
   (without logout)

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Acked-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: cleanup cmd in qla workqueue before processing TMR

Orabug: 26844197, 26923029

Since cmds go into qla_tgt_wq and TMRs don't, it's possible that TMR
like TASK_ABORT can be queued over the cmd for which it was meant.
To avoid this race, use a per-port list to keep track of cmds that
are enqueued to qla_tgt_wq but not yet processed. When a TMR arrives,
iterate through this list and remove any cmds that match the TMR.
This patch supports TASK_ABORT and LUN_RESET.

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Swapnil Nagle <swapnil.nagle@purestorage.com>
Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Acked-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Add flush after updating ATIOQ consumer index.

Orabug: 26844197, 26923029

After updating the consumer index of ATIO Q, a read is
required to flush the write to the adapter register.

Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

qla2xxx: Enable target mode for ISP27XX

Orabug: 26844197, 26923029

Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

KVM: nVMX: Fix loss of L2's NMI blocking state

Run kvm-unit-tests/eventinj.flat in L1 w/ ept=0 on both L0 and L1:

Before NMI IRET test
Sending NMI to self
NMI isr running stack 0x461000
Sending nested NMI to self
After nested NMI to self
Nested NMI isr running rip=40038e
After iret
After NMI to self
FAIL: NMI

Commit 4c4a6f790ee862 ("KVM: nVMX: track NMI blocking state separately
for each VMCS") tracks NMI blocking state separately for vmcs01 and
vmcs02. However it is not enough:

- The L2 (kvm-unit-tests/eventinj.flat) generates NMI that will fault
   on IRET, so the L2 can generate #PF which can be intercepted by L0.
- L0 walks L1's guest page table and sees the mapping is invalid, it
   resumes the L1 guest and injects the #PF into L1.  At this point the
   vmcs02 has nmi_known_unmasked=true.
- L1 sets set bit 3 (blocking by NMI) in the interruptibility-state field
   of vmcs12 (and fixes the shadow page table) before resuming L2 guest.
- L1 executes VMRESUME to resume L2, causing a vmexit to L0
- during VMRESUME emulation, prepare_vmcs02 sets bit 3 in the
   interruptibility-state field of vmcs02, but nmi_known_unmasked is
   still true.
- L2 immediately exits to L0 with another page fault, because L0 still has
   not updated the NGVA->HPA page tables.  However, nmi_known_unmasked is
   true so vmx_recover_nmi_blocking does not do anything.

The fix is to update nmi_known_unmasked when preparing vmcs02 from vmcs12.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2d6144e366fb39609aecf7a658e2e10af37627e9)
OraBug: 27031246 nested virt: L2 windows guest reboot hangs with L1 KVM hypervisor
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Tested-by: Xuan Bai <xuan.bai@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

KVM: nVMX: track NMI blocking state separately for each VMCS

vmx_recover_nmi_blocking is using a cached value of the guest
interruptibility info, which is stored in vmx->nmi_known_unmasked.
vmx_recover_nmi_blocking is run for both normal and nested guests,
so the cached value must be per-VMCS.

This fixes eventinj.flat in a nested non-EPT environment. With EPT it
works, because the EPT violation handler doesn't have the
vmx->nmi_known_unmasked optimization (it is unnecessary because, unlike
vmx_recover_nmi_blocking, it can just look at the exit qualification).

Thanks to Wanpeng Li for debugging the testcase and providing an initial
patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
(cherry picked from commit 4c4a6f790ee862ee9f0dc8b35c71f55bcf792b71)
OraBug: 27031246 nested virt: L2 windows guest reboot hangs with L1 KVM hypervisor
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Tested-by: Xuan Bai <xuan.bai@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

KVM: VMX: require virtual NMI support

Virtual NMIs are only missing in Prescott and Yonah chips. Both are obsolete
for virtualization usage---Yonah is 32-bit only even---so drop vNMI emulation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2c82878b0cb38fd516fd612c67852a6bbf282003. It is a
partical cherry-pick in that PIN_BASED_VMX_PREEMPTION_TIMER in
setup_vmcs_config in file arch/x86/kvm/vmx.c has been omitted due to two things:
- it being not required in the fix for bug 27031246 (which is about NMIs)
- no need as we do not have commit 64672c95ea4c2f7096e519e826076867e8ef0938
(kvm: vmx: hook preemption timer support) backported in UEK4.)
OraBug: 27031246 nested virt: L2 windows guest reboot hangs with L1 KVM hypervisor
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Tested-by: Xuan Bai <xuan.bai@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

KVM: nVMX: Fix the NMI IDT-vectoring handling

Run kvm-unit-tests/eventinj.flat in L1:

Sending NMI to self
After NMI to self
FAIL: NMI

This test scenario is to test whether VMM can handle NMI IDT-vectoring info correctly.

At the beginning, L2 writes LAPIC to send a self NMI, the EPT page tables on both L1
and L0 are empty so:

- The L2 accesses memory can generate EPT violation which can be intercepted by L0.

  The EPT violation vmexit occurred during delivery of this NMI, and the NMI info is
  recorded in vmcs02's IDT-vectoring info.

- L0 walks L1's EPT12 and L0 sees the mapping is invalid, it injects the EPT violation into L1.

  The vmcs02's IDT-vectoring info is reflected to vmcs12's IDT-vectoring info since
  it is a nested vmexit.

- L1 receives the EPT violation, then fixes its EPT12.
- L1 executes VMRESUME to resume L2 which generates vmexit and causes L1 exits to L0.
- L0 emulates VMRESUME which is called from L1, then return to L2.

  L0 merges the requirement of vmcs12's IDT-vectoring info and injects it to L2 through
  vmcs02.

- The L2 re-executes the fault instruction and cause EPT violation again.
- Since the L1's EPT12 is valid, L0 can fix its EPT02
- L0 resume L2

  The EPT violation vmexit occurred during delivery of this NMI again, and the NMI info
  is recorded in vmcs02's IDT-vectoring info. L0 should inject the NMI through vmentry
  event injection since it is caused by EPT02's EPT violation.

However, vmx_inject_nmi() refuses to inject NMI from IDT-vectoring info if vCPU is in
guest mode, this patch fix it by permitting to inject NMI from IDT-vectoring if it is
the L0's responsibility to inject NMI from IDT-vectoring info to L2.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Bandan Das <bsd@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
(cherry picked from commit c5a6d5f7faad8549bb5ff7e3e5792e33933c5b9f)
OraBug: 27031246 nested virt: L2 windows guest reboot hangs with L1 KVM hypervisor
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Tested-by: Xuan Bai <xuan.bai@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

NFS: Add static NFS I/O tracepoints

Tools like tcpdump and rpcdebug can be very useful. But there are
plenty of environments where they are difficult or impossible to
use. For example, we've had customers report I/O failures during
workloads so heavy that collecting network traffic or enabling
RPC debugging are themselves onerous.

The kernel's static tracepoints are lightweight (less likely to
introduce timing changes) and efficient (the trace data is compact).
They also work in scenarios where capturing network traffic is not
possible due to lack of hardware support (some InfiniBand HCAs) or
where data or network privacy is a concern.

Introduce tracepoints that show when an NFS READ, WRITE, or COMMIT
is initiated, and when it completes. Record the arguments and
results of each operation, which are not shown by existing sunrpc
module's tracepoints.

For instance, the recorded offset and count can be used to match an
"initiate" event to a "done" event. If an NFS READ result returns
fewer bytes than requested or zero, seeing the EOF flag can be
probative. Seeing an NFS4ERR_BAD_STATEID result is also indication
of a particular class of problems. The timing information attached
to each event record can often be useful as well.

Usage example:

[root@manet tmp]# trace-cmd record -e nfs:*initiate* -e nfs:*done
/sys/kernel/debug/tracing/events/nfs/*initiate*/filter
/sys/kernel/debug/tracing/events/nfs/*done/filter
Hit Ctrl^C to stop recording
^CKernel buffer statistics:
  Note: "entries" are the entries left in the kernel ring buffer and are not
        recorded in the trace data. They should all be zero.

CPU: 0
entries: 0
overrun: 0
commit overrun: 0
bytes: 3680
oldest event ts:    78.367422
now ts:   100.124419
dropped events: 0
read events: 74

... and so on.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
(cherry picked from commit 8224b2734ab1da4996b851e1e5d3047e7a0df499)

orabug: 26923332
Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com>
Reviewed-by: John Sobecki <john.sobecki@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

firmware: dmi_scan: add SBMIOS entry and DMI tables

Orabug: 27023745

Some utils, like dmidecode and smbios, need to access SMBIOS entry
table area in order to get information like SMBIOS version, size, etc.
Currently it's done via /dev/mem. But for situation when /dev/mem
usage is disabled, the utils have to use dmi sysfs instead, which
doesn't represent SMBIOS entry and adds code/delay redundancy when direct
access for table is needed.

So this patch creates dmi/tables and adds SMBIOS entry point to allow
utils in question to work correctly without /dev/mem. Also patch adds
raw dmi table to simplify dmi table processing in user space, as
proposed by Jean Delvare.

Tested-by: Roy Franz <roy.franz@linaro.org>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@globallogic.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Dan Duval <dan.duval@oracle.com>
(cherry picked from commit d7f96f97c4031fa4ffdb7801f9aae23e96170a6f)
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflict:

MAINTAINERS

x86/platform/uv: Fix kdump for UV

With uv_nmi.action=kdump on the boot line, "power nmi" does not kdump.
Instead it produces backtraces (the default "dump" action).

One line from upstream commit 2965faa5e03d ("kexec: split kexec_load
syscall from kexec core code") was inadvertantly applied to
arch/x86/platform/uv/uv_nmi.c, but the rest of the commit is
missing. The UEK kernel does not have any of the other
CONFIG_KEXEC_CORE related changes in it, so this breaks uv_nmi

Orabug: 27031345

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Brian Maly <brian.maly@oracle.com>

KEYS: prevent KEYCTL_READ on negative key

Because keyctl_read_key() looks up the key with no permissions
requested, it may find a negatively instantiated key.  If the key is
also possessed, we went ahead and called ->read() on the key.  But the
key payload will actually contain the ->reject_error rather than the
normal payload.  Thus, the kernel oopses trying to read the
user_key_payload from memory address (int)-ENOKEY = 0x00000000ffffff82.

Fortunately the payload data is stored inline, so it shouldn't be
possible to abuse this as an arbitrary memory read primitive...

Reproducer:
    keyctl new_session
    keyctl request2 user desc '' @s
    keyctl read $(keyctl show | awk '/user: desc/ {print $1}')

It causes a crash like the following:
     BUG: unable to handle kernel paging request at 00000000ffffff92
     IP: user_read+0x33/0xa0
     PGD 36a54067 P4D 36a54067 PUD 0
     Oops: 0000 [#1] SMP
     CPU: 0 PID: 211 Comm: keyctl Not tainted 4.14.0-rc1 #337
     Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014
     task: ffff90aa3b74c3c0 task.stack: ffff9878c0478000
     RIP: 0010:user_read+0x33/0xa0
     RSP: 0018:ffff9878c047bee8 EFLAGS: 00010246
     RAX: 0000000000000001 RBX: ffff90aa3d7da340 RCX: 0000000000000017
     RDX: 0000000000000000 RSI: 00000000ffffff82 RDI: ffff90aa3d7da340
     RBP: ffff9878c047bf00 R08: 00000024f95da94f R09: 0000000000000000
     R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
     R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
     FS:  00007f58ece69740(0000) GS:ffff90aa3e200000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: 00000000ffffff92 CR3: 0000000036adc001 CR4: 00000000003606f0
     Call Trace:
      keyctl_read_key+0xac/0xe0
      SyS_keyctl+0x99/0x120
      entry_SYSCALL_64_fastpath+0x1f/0xbe
     RIP: 0033:0x7f58ec787bb9
     RSP: 002b:00007ffc8d401678 EFLAGS: 00000206 ORIG_RAX: 00000000000000fa
     RAX: ffffffffffffffda RBX: 00007ffc8d402800 RCX: 00007f58ec787bb9
     RDX: 0000000000000000 RSI: 00000000174a63ac RDI: 000000000000000b
     RBP: 0000000000000004 R08: 00007ffc8d402809 R09: 0000000000000020
     R10: 0000000000000000 R11: 0000000000000206 R12: 00007ffc8d402800
     R13: 00007ffc8d4016e0 R14: 0000000000000000 R15: 0000000000000000
     Code: e5 41 55 49 89 f5 41 54 49 89 d4 53 48 89 fb e8 a4 b4 ad ff 85 c0 74 09 80 3d b9 4c 96 00 00 74 43 48 8b b3 20 01 00 00 4d 85 ed <0f> b7 5e 10 74 29 4d 85 e4 74 24 4c 39 e3 4c 89 e2 4c 89 ef 48
     RIP: user_read+0x33/0xa0 RSP: ffff9878c047bee8
     CR2: 00000000ffffff92

Fixes: 61ea0c0ba904 ("KEYS: Skip key state checks when checking for possession")
Cc: <stable@vger.kernel.org> [v3.13+]
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
(cherry picked from commit 37863c43b2c6464f252862bf2e9768264e961678)

Orabug: 27049926
CVE: CVE-2017-12192
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

virtio:rng: Virtio RNG devices need to be re-registered after suspend/resume

The patch for

commit: 5c06273401f2eb7b290cadbae18ee00f8f65e893
Author: Amit Shah <amit.shah@redhat.com>
Date: Sun Jul 27 07:34:01 2014 +0930

virtio: rng: delay hwrng_register() till driver is ready

moved the call to hwrng_register() out of the probe routine into the scan
routine. We need to call hwrng_register() after a suspend/restore cycle
to re-register the device, but the scan function is not invoked for the
restore. Add the call to hwrng_register() to virtio_restore() instead.

Orabug: 26989940

Reviewed-by: Liam Merwick <Liam.Merwick@oracle.com>
Signed-off-by: Jim Quigley <Jim.Quigley@oracle.com>

Hang/soft lockup in d_invalidate with simultaneous calls

It's not hard to trigger a bunch of d_invalidate() on the same
dentry in parallel.  They end up fighting each other - any
dentry picked for removal by one will be skipped by the rest
and we'll go for the next iteration through the entire
subtree, even if everything is being skipped.  Morevoer, we
immediately go back to scanning the subtree.  The only thing
we really need is to dissolve all mounts in the subtree and
as soon as we've nothing left to do, we can just unhash the
dentry and bugger off.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 81be24d263dbeddaba35827036d6f6787a59c2c3)

Orabug: 26908674
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

dtrace: Add CTF archive to the UEK nano package

DTrace support is now built into the UEK nano kernels. However the UEK
nano kernel packages lack required CTF data archive (vmlinux.ctfa).
This patch adds it to UEK nano kernels too.

Orabug: 26983106

Signed-off-by: Tomas Jedlicka <tomas.jedlicka@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

uek-rpm: Update kernel-ueknano's provides list.

Orabug: 27037687

Install of libibverbs requires kernel-uek as prerequisite. The older
ueknano rpms had this in the provides list.

Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>

uek-rpm: Add more modules to ueknano for OL7

Orabug: 27037696

Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>

Revert "drivers/char/mem.c: deny access in open operation when securelevel is set"

This reverts commit 8ca81822fffb2536aa1076fb17d1ea1413df3e7f.

Orabug: 27037788
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: mpt3sas: Bump mpt3sas driver version to v16.100.00.00

Bump mpt3sas driver version to v16.100.00.00

Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 26894858
(cherry picked from commit 09a50f43a53b8ea2a48feea891196e6cc7355074)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: mpt3sas: Adding support for SAS3616 HBA device

Adding PNP ID of Mercator i.e. SAS3616 HBA device. Its device ID is
0xD1 and vendor ID is 0x1000.

Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 26894858
(cherry picked from commit 15fd7c74dadc8de5dd6b4a2a146127f035093aa3)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: mpt3sas: Fix possibility of using invalid Enclosure Handle for SAS device after host reset

Enclosure handles are not updated after host reset. As a result, driver
device structure is holding previously assigned enclosure handle which
is different from the enclosure handle populated in the corresponding
device page.

Modified the driver to update devices enclosure handles after host reset
to current value by referring the enclosure handles from corresponding
device pages

Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 26894858
(cherry picked from commit aba5a85c2fcf05a6922e28c7179526adad58f4b5)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: mpt3sas: Display chassis slot information of the drive

Display chassis slot information along with other drive location
parameters such as slot number and connector name in the logs if
chassis slot validity bit is set in 'SAS Enclosure Page 0'.

Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 26894858
(cherry picked from commit 7588895646b5a943d3310271885c5935123a455c)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: mpt3sas: Updated MPI headers to v2.00.48

Updated MPI headers to v2.00.48

Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 26894858
(cherry picked from commit 90e7a70199184ed5f3081981c7cffed771b84bb3)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: mpt3sas: Fix IO error occurs on pulling out a drive from RAID1 volume created on two SATA drive

Whenever an I/O for a RAID volume fails with IOCStatus
MPI2_IOCSTATUS_SCSI_IOC_TERMINATED and SCSIStatus equal to
(MPI2_SCSI_STATE_TERMINATED | MPI2_SCSI_STATE_NO_SCSI_STATUS) then
return the I/O to SCSI midlayer with "DID_RESET" (i.e. retry the IO
infinite times) set in the host byte.

Previously, the driver was completing the I/O with "DID_SOFT_ERROR"
which causes the I/O to be quickly retried. However, firmware needed
more time and hence I/Os were failing.

Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 26894858
(cherry picked from commit 2ce9a3645299ba1752873d333d73f67620f4550b)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: mpt3sas: Fix removal and addition of vSES device during host reset

For Dev Handles whose value is less than HBA's phys count number, driver
would return HBA's SAS address value. As a result, for a Virtual SES
device the driver was returning the HBA's SAS address. Updated the
driver to return Virtual SES' SAS address.

[mkp: clarified commit message]

Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 26894858
(cherry picked from commit 758f8139e9a779de76fed2b48dc492bfb6612684)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: mpt3sas: Reduce memory footprint in kdump kernel

To reduce the memory footprint of the driver in the kdump kernel, we
apply the following settings when reset_devices is set:

1. Use single MSI-x vector.
2. Disable RDPQ mode.
3. Set sg_table_size to 32 by default.
4) Set SCSI IO Queue depth to 200.

[mkp: fixed commit message]

Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 26894858
(cherry picked from commit 06f5f976a6ee0f8bbb0dd648415eeac0536fef97)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: mpt3sas: Fixed memory leaks in driver

While removing Expander devices, we are removing expander device entry
from the list before freeing its child devices. While freeing child
device we are finding its parent device node as NULL and therefore we
are not freeing the child device's allocated data structures. Updated
the driver to remove the expander device from the list only after
freeing all its child devices.

[mkp: clarified commit message]

Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 26894858
(cherry picked from commit bbe3def3a11dc1040d45469f5dd26032e9fd8c79)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: mpt3sas: Processing of Cable Exception events

Earlier Active Cable Exception event with reason code "Cable Degraded
(0x02))" was added only for Active Cable. Now this event is extended to
Passive cable too. Re-arranged display message accordingly.

Also added Cable Exception Event event for SAS3008 & SAS3108 HBAs
(i.e. MPI 2.5 spec supporting HBAs). Previously, this event was enabled
only for MPI 2.6 spec supporting HBA devices.

[mkp: typos]

Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Orabug: 26894858
(cherry picked from commit b99b199378afac7675876adc170d82d7a4442330)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Dhaval Giani <dhaval.giani@oracle.com>

selinux: fix off-by-one in setprocattr

SELinux tries to support setting/clearing of /proc/pid/attr attributes
from the shell by ignoring terminating newlines and treating an
attribute value that begins with a NUL or newline as an attempt to
clear the attribute.  However, the test for clearing attributes has
always been wrong; it has an off-by-one error, and this could further
lead to reading past the end of the allocated buffer since commit
bb646cdb12e75d82258c2f2e7746d5952d3e321a ("proc_pid_attr_write():
switch to memdup_user()").  Fix the off-by-one error.

Even with this fix, setting and clearing /proc/pid/attr attributes
from the shell is not straightforward since the interface does not
support multiple write() calls (so shells that write the value and
newline separately will set and then immediately clear the attribute,
requiring use of echo -n to set the attribute), whereas trying to use
echo -n "" to clear the attribute causes the shell to skip the
write() call altogether since POSIX says that a zero-length write
causes no side effects. Thus, one must use echo -n to set and echo
without -n to clear, as in the following example:
$ echo -n unconfined_u:object_r:user_home_t:s0 > /proc/$$/attr/fscreate
$ cat /proc/$$/attr/fscreate
unconfined_u:object_r:user_home_t:s0
$ echo "" > /proc/$$/attr/fscreate
$ cat /proc/$$/attr/fscreate

Note the use of /proc/$$ rather than /proc/self, as otherwise
the cat command will read its own attribute value, not that of the shell.

There are no users of this facility to my knowledge; possibly we
should just get rid of it.

UPDATE: Upon further investigation it appears that a local process
with the process:setfscreate permission can cause a kernel panic as a
result of this bug.  This patch fixes CVE-2017-2618.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: added the update about CVE-2017-2618 to the commit description]
Cc: stable@vger.kernel.org # 3.5: d6ea83ec6864e
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
(cherry picked from commit 0c461cb727d146c9ef2d3e86214f498b78b7d125)

Orabug: 25660054
CVE: CVE-2017-2618
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

sysctl: Drop reference added by grab_header in proc_sys_readdir

Fixes CVE-2016-9191, proc_sys_readdir doesn't drop reference
added by grab_header when return from !dir_emit_dots path.
It can cause any path called unregister_sysctl_table will
wait forever.

The calltrace of CVE-2016-9191:

[ 5535.960522] Call Trace:
[ 5535.963265]  [<ffffffff817cdaaf>] schedule+0x3f/0xa0
[ 5535.968817]  [<ffffffff817d33fb>] schedule_timeout+0x3db/0x6f0
[ 5535.975346]  [<ffffffff817cf055>] ? wait_for_completion+0x45/0x130
[ 5535.982256]  [<ffffffff817cf0d3>] wait_for_completion+0xc3/0x130
[ 5535.988972]  [<ffffffff810d1fd0>] ? wake_up_q+0x80/0x80
[ 5535.994804]  [<ffffffff8130de64>] drop_sysctl_table+0xc4/0xe0
[ 5536.001227]  [<ffffffff8130de17>] drop_sysctl_table+0x77/0xe0
[ 5536.007648]  [<ffffffff8130decd>] unregister_sysctl_table+0x4d/0xa0
[ 5536.014654]  [<ffffffff8130deff>] unregister_sysctl_table+0x7f/0xa0
[ 5536.021657]  [<ffffffff810f57f5>] unregister_sched_domain_sysctl+0x15/0x40
[ 5536.029344]  [<ffffffff810d7704>] partition_sched_domains+0x44/0x450
[ 5536.036447]  [<ffffffff817d0761>] ? __mutex_unlock_slowpath+0x111/0x1f0
[ 5536.043844]  [<ffffffff81167684>] rebuild_sched_domains_locked+0x64/0xb0
[ 5536.051336]  [<ffffffff8116789d>] update_flag+0x11d/0x210
[ 5536.057373]  [<ffffffff817cf61f>] ? mutex_lock_nested+0x2df/0x450
[ 5536.064186]  [<ffffffff81167acb>] ? cpuset_css_offline+0x1b/0x60
[ 5536.070899]  [<ffffffff810fce3d>] ? trace_hardirqs_on+0xd/0x10
[ 5536.077420]  [<ffffffff817cf61f>] ? mutex_lock_nested+0x2df/0x450
[ 5536.084234]  [<ffffffff8115a9f5>] ? css_killed_work_fn+0x25/0x220
[ 5536.091049]  [<ffffffff81167ae5>] cpuset_css_offline+0x35/0x60
[ 5536.097571]  [<ffffffff8115aa2c>] css_killed_work_fn+0x5c/0x220
[ 5536.104207]  [<ffffffff810bc83f>] process_one_work+0x1df/0x710
[ 5536.110736]  [<ffffffff810bc7c0>] ? process_one_work+0x160/0x710
[ 5536.117461]  [<ffffffff810bce9b>] worker_thread+0x12b/0x4a0
[ 5536.123697]  [<ffffffff810bcd70>] ? process_one_work+0x710/0x710
[ 5536.130426]  [<ffffffff810c3f7e>] kthread+0xfe/0x120
[ 5536.135991]  [<ffffffff817d4baf>] ret_from_fork+0x1f/0x40
[ 5536.142041]  [<ffffffff810c3e80>] ? kthread_create_on_node+0x230/0x230

One cgroup maintainer mentioned that "cgroup is trying to offline
a cpuset css, which takes place under cgroup_mutex.  The offlining
ends up trying to drain active usages of a sysctl table which apprently
is not happening."
The real reason is that proc_sys_readdir doesn't drop reference added
by grab_header when return from !dir_emit_dots path. So this cpuset
offline path will wait here forever.

See here for details: http://www.openwall.com/lists/oss-security/2016/11/04/13

Fixes: f0c3b5093add ("[readdir] convert procfs")
Cc: stable@vger.kernel.org
Reported-by: CAI Qian <caiqian@redhat.com>
Tested-by: Yang Shukui <yangshukui@huawei.com>
Signed-off-by: Zhou Chengming <zhouchengming1@huawei.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
(cherry picked from commit 93362fa47fe98b62e4a34ab408c4a418432e7939)

Orabug: 25062944
CVE: 25062944
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>
Reviewed-by: John Haxby <john.haxby@oracle.com>

storvsc: don't assume SG list is contiguous

Scatterlists are contiguous if they're limited to a page; but for large
I/Os, it's possible that the scatterlists span pages, in which case the
pages will not be physically contiguous - they will be chained together.
The MS patch (link below) fixes the wrong assumption in do_bounce_buffer()
that scatterlists are always contiguous, so it's a good fix to port, in
general.

Orabug: 26492697

MS patch:
https://github.com/LIS/lis-next/commit/a13bbc4ab81e459f635237a938f89737300ecfa1

Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
Reviewed-by: Joe Slember <joe.slember@oracle.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>

thp: run vma_adjust_trans_huge() outside i_mmap_rwsem

vma_addjust_trans_huge() splits pmd if it's crossing VMA boundary.
During split we munlock the huge page which requires rmap walk. rmap
wants to take the lock on its own.

Let's move vma_adjust_trans_huge() outside i_mmap_rwsem to fix this.

Link: http://lkml.kernel.org/r/1466021202-61880-19-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:
mm/mmap.c

Orabug: 27026170

(cherry picked from commit 37f9f5595c26d3cb644ca2fab83dc4c4db119f9f)
Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>