Keoseong Park [Mon, 23 Aug 2021 09:07:14 +0000 (18:07 +0900)]
scsi: ufs: ufshpb: Fix possible memory leak
When an HPB pinned region exists and the mctx allocation for this region
fails, a memory leak is possible because the subregion table of the current
region is not released.
Free the subregion table of the current region in that error path.
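The unwinding described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct region, init_regions(), track_alloc(), track_free() and fail_mctx_at are hypothetical names, not the ufshpb code. The point is that the region whose mctx allocation fails has already received its subregion table, so that table must be freed before the earlier regions are unwound.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the per-region bookkeeping. */
struct region {
	int *srgn_tbl;	/* subregion table */
	void *mctx;	/* map context, allocated for pinned regions */
};

/* Test hooks: force one mctx allocation to fail and count live allocations. */
static int fail_mctx_at = -1;
static int live_allocs;

static void *track_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_allocs++;
	return p;
}

static void track_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

static int init_regions(struct region *rgns, int nr, int srgns_per_rgn)
{
	int i;

	for (i = 0; i < nr; i++) {
		rgns[i].mctx = NULL;
		rgns[i].srgn_tbl = track_alloc(sizeof(int) * srgns_per_rgn);
		if (!rgns[i].srgn_tbl)
			goto release_earlier;

		rgns[i].mctx = (i == fail_mctx_at) ? NULL : track_alloc(64);
		if (!rgns[i].mctx) {
			/* The current region already owns a table: free it here. */
			track_free(rgns[i].srgn_tbl);
			rgns[i].srgn_tbl = NULL;
			goto release_earlier;
		}
	}
	return 0;

release_earlier:
	/* Unwind only the regions that were fully set up before region i. */
	while (--i >= 0) {
		track_free(rgns[i].mctx);
		track_free(rgns[i].srgn_tbl);
	}
	return -ENOMEM;
}
```

Without the track_free() call inside the failure branch, the unwind loop would skip region i and its table would stay allocated.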
Device reset and target reset will be using different calling sequences, so
open-code __qla2xxx_eh_generic_reset() in qla2xxx_eh_device_reset(), and
remove the now obsolete function __qla2xxx_eh_generic_reset(). No
functional changes.
Link: https://lore.kernel.org/r/20210819091913.94436-4-hare@suse.de
Cc: Nilesh Javali <njavali@marvell.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Device reset and target reset will be using different calling sequences, so
open-code __qla2xxx_eh_generic_reset() in qla2xxx_eh_target_reset(). No
functional changes.
Link: https://lore.kernel.org/r/20210819091913.94436-3-hare@suse.de
Cc: Nilesh Javali <njavali@marvell.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Thu, 19 Aug 2021 09:19:11 +0000 (11:19 +0200)]
scsi: qla2xxx: Do not call fc_block_scsi_eh() during bus reset
When calling bus reset the driver will be doing a full SAN resync, so there
is no need to wait for any pending RSCNs; they'll be re-issued during
resync anyway.
Link: https://lore.kernel.org/r/20210819091913.94436-2-hare@suse.de
Cc: Nilesh Javali <njavali@marvell.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Quinn Tran [Tue, 17 Aug 2021 05:13:13 +0000 (22:13 -0700)]
scsi: qla2xxx: Fix NVMe session down detection
When a target port transitions from one personality to another (NVMe <-->
FCP), the two can overlap: one layer is going down while the other layer is
coming up. This overlap can cause temporary I/O errors. Detect those
errors/transitions and recover from them. Trigger session teardown and
allow relogin to re-drive the connection under the
following conditions:
Quinn Tran [Tue, 17 Aug 2021 05:13:12 +0000 (22:13 -0700)]
scsi: qla2xxx: Fix NVMe retry
For a target port that registers itself as both FCP + NVMe, the initiator
driver will try to log in one mode at a time. If the last mode did not
succeed, then the driver will try the other mode.
When an error is encountered, the current code only flips to the other mode
one time (NVMe->FCP) and remains on the last mode. The driver wrongly
assumed that the target port does not support PRLI NVMe, when in fact it was
not ready to receive PRLI.
This patch will alternate back and forth on every PRLI failure until the
login retry count is depleted or the login succeeds.
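The retry policy described above can be sketched as a small loop. This is a userspace illustration, not the qla2xxx code: login_alternating(), prli_fn, struct fake_target and fake_prli() are hypothetical names, and the fake target simply models a port that is not ready for its first few PRLIs and then accepts only NVMe.

```c
#include <stdbool.h>

enum prli_mode { MODE_FCP, MODE_NVME };

/* Hypothetical callback: returns true when a PRLI in this mode succeeds. */
typedef bool (*prli_fn)(enum prli_mode mode, void *ctx);

/*
 * Flip between the two modes on every failure until a login succeeds or
 * the retry budget is used up. Returns the successful mode, or -1.
 */
static int login_alternating(enum prli_mode first, int retries,
			     prli_fn try_prli, void *ctx)
{
	enum prli_mode mode = first;

	while (retries-- > 0) {
		if (try_prli(mode, ctx))
			return mode;
		mode = (mode == MODE_NVME) ? MODE_FCP : MODE_NVME;
	}
	return -1;
}

/* Stand-in target: rejects the first not_ready_for PRLIs, then takes NVMe. */
struct fake_target {
	int not_ready_for;
	int attempts;
};

static bool fake_prli(enum prli_mode mode, void *ctx)
{
	struct fake_target *t = ctx;

	t->attempts++;
	if (t->attempts <= t->not_ready_for)
		return false;
	return mode == MODE_NVME;
}
```

A policy that flips only once would settle on FCP after the first failure and never retry NVMe, which is the situation the commit message describes.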
Arun Easi [Tue, 17 Aug 2021 05:13:11 +0000 (22:13 -0700)]
scsi: qla2xxx: Fix hang on NVMe command timeouts
The abort callback gets called only when the abort is posted to firmware.
The refcounting is done properly in the callback. On internal errors, the
callback is not invoked, leading to a hung I/O. Fix this by using a separate
error code when the command is returned from firmware.
Quinn Tran [Tue, 17 Aug 2021 05:13:10 +0000 (22:13 -0700)]
scsi: qla2xxx: Fix NVMe | FCP personality change
Currently the driver saves the personality type (FCP|NVMe) at the start of
the first discovery of the remote device. If the remote device personality
does change over time, then the qla driver needs to present that to the
user to decide.
Quinn Tran [Tue, 17 Aug 2021 05:13:08 +0000 (22:13 -0700)]
scsi: qla2xxx: edif: Add N2N support for EDIF
For EDIF + N2N to work, firmware 9.8 or later is required. The driver will
pause after PLOGI to allow the app to authenticate. Once authentication
completes, the app will tell the driver to do PRLI.
Abort threads were getting out without processing due to the "deleted"
flag check. The delete thread, meanwhile, could not proceed with a
logout (that would have cleared out pending requests) as the logout IOCB
work was not progressing. It appears that the hung qlt_free_session_done()
thread is holding up the ha->wq work items. qlt_free_session_done() was
hung waiting for nvme_fc_unregister_remoteport() + the localport_delete cb
to complete, which would only happen when all I/Os are released.
Fix this by allowing abort to progress until device delete is completely
done. This should make the qlt_free_session_done() proceed without hang and
thus clear up the deadlock.
Quinn Tran [Tue, 17 Aug 2021 05:13:06 +0000 (22:13 -0700)]
scsi: qla2xxx: edif: Fix EDIF enable flag
edif_enabled is prematurely turned on if hardware is capable of handling
the feature. However, firmware also needs to support EDIF before enabling
this bit.
Christian Loehle [Mon, 16 Aug 2021 09:37:51 +0000 (09:37 +0000)]
scsi: sd: Do not exit sd_spinup_disk() quietly
The sd_spinup_disk() function logs what is happening. Unfortunately this
output stops if the media was marked as removed in the meantime. Add a
print for this case too.
Hannes Reinecke [Tue, 17 Aug 2021 07:53:06 +0000 (09:53 +0200)]
scsi: ibmvfc: Do not wait for initial device scan
The initial device scan might take some time, and there really is no need
to wait for it during probe(). So return immediately from scsi_scan_host()
during probe() and avoid any udev stalls during booting.
Link: https://lore.kernel.org/r/20210817075306.11315-1-mwilck@suse.com
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The xcopy code always returns the same NOT READY sense key for all detected
errors. Change the sense key for invalid requests to ILLEGAL REQUEST, and
for aborted transfers to COPY ABORTED.
Link: https://lore.kernel.org/r/20210803145410.80147-3-s.samoylenko@yadro.com
Fixes: d877d7275be3 ("target: Fix a deadlock between the XCOPY code and iSCSI session shutdown")
Reviewed-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Konstantin Shelekhin <k.shelekhin@yadro.com>
Signed-off-by: Sergey Samoylenko <s.samoylenko@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Sergey Samoylenko [Tue, 3 Aug 2021 14:54:09 +0000 (17:54 +0300)]
scsi: target: Allows backend drivers to fail with specific sense codes
Currently, backend drivers can fail I/O with SAM_STAT_CHECK_CONDITION which
gets us TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE.
Add a new helper that allows backend drivers to fail with specific sense
codes.
This is based on a patch from Mike Christie <michael.christie@oracle.com>.
Cc: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20210803145410.80147-2-s.samoylenko@yadro.com
Reviewed-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Sergey Samoylenko <s.samoylenko@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Gustavo A. R. Silva [Tue, 10 Aug 2021 21:07:41 +0000 (16:07 -0500)]
scsi: smartpqi: Replace one-element array with flexible-array member
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code
should always use "flexible array members"[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].
Refactor the code a bit according to the use of a flexible-array member in
struct pqi_event_config instead of a one-element array, and use the
struct_size() helper.
This helps with the ongoing efforts to globally enable -Warray-bounds and
get us closer to being able to tighten the FORTIFY_SOURCE routines on
memcpy().
This issue was found with the help of Coccinelle and audited and fixed,
manually.
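As a rough userspace illustration of the pattern described above, the sketch below declares a trailing flexible-array member and sizes the allocation from the header offset plus the element count. The struct and function names are hypothetical and do not reproduce the layout of struct pqi_event_config. event_config_size() only approximates what the kernel's struct_size() helper computes; the kernel helper additionally saturates on arithmetic overflow, which this sketch does not attempt.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical element type for the trailing array. */
struct event_descriptor {
	uint8_t event_type;
	uint8_t reserved;
	uint16_t oq_id;
};

/* Hypothetical container; the last member is a flexible-array member. */
struct event_config {
	uint8_t num_descriptors;
	struct event_descriptor descriptors[];	/* not descriptors[1] */
};

/* Header size plus n trailing elements (no overflow check in this sketch). */
static size_t event_config_size(size_t n)
{
	return offsetof(struct event_config, descriptors) +
	       n * sizeof(struct event_descriptor);
}

static struct event_config *event_config_alloc(uint8_t n)
{
	struct event_config *cfg = calloc(1, event_config_size(n));

	if (cfg)
		cfg->num_descriptors = n;
	return cfg;
}
```

With a one-element array, the same allocation would need an error-prone "n - 1" adjustment, and bounds checkers would treat any index above 0 as out of range.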
Tuo Li [Tue, 10 Aug 2021 04:04:13 +0000 (21:04 -0700)]
scsi: target: pscsi: Fix possible null-pointer dereference in pscsi_complete_cmd()
The return value of transport_kmap_data_sg() is assigned to the variable
buf:
  buf = transport_kmap_data_sg(cmd);
And then it is checked:
  if (!buf) {
This indicates that buf can be NULL. However, it is dereferenced in the
following statements:
  if (!(buf[3] & 0x80))
    buf[3] |= 0x80;
  if (!(buf[2] & 0x80))
    buf[2] |= 0x80;
To fix these possible null-pointer dereferences, dereference buf and call
transport_kunmap_data_sg() only when buf is not NULL.
Link: https://lore.kernel.org/r/20210810040414.248167-1-islituo@gmail.com
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Reviewed-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Tuo Li <islituo@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
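The shape of this fix can be sketched outside the kernel as a helper that touches the buffer only after the NULL check. mark_write_protected() and its ten_byte_header parameter are hypothetical; the parameter merely selects between the buf[3] and buf[2] updates quoted in the message, and the real function's surrounding logic is not reproduced.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * buf stands in for the result of a mapping call that may return NULL.
 * Returns true when the buffer was updated, which is also the only case
 * in which the caller would need to unmap it again.
 */
static bool mark_write_protected(unsigned char *buf, bool ten_byte_header)
{
	if (!buf)
		return false;	/* nothing mapped: skip the update and the unmap */

	if (ten_byte_header) {
		if (!(buf[3] & 0x80))
			buf[3] |= 0x80;
	} else {
		if (!(buf[2] & 0x80))
			buf[2] |= 0x80;
	}
	return true;
}
```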
John Garry [Fri, 13 Aug 2021 13:49:13 +0000 (21:49 +0800)]
scsi: core: Remove scsi_cmnd.tag
It is never read, so get rid of it.
Link: https://lore.kernel.org/r/1628862553-179450-4-git-send-email-john.garry@huawei.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Fri, 13 Aug 2021 13:49:12 +0000 (21:49 +0800)]
scsi: fnic: Stop setting scsi_cmnd.tag
It is never read. Setting it and the request tag seems dodgy anyway.
Link: https://lore.kernel.org/r/1628862553-179450-3-git-send-email-john.garry@huawei.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Fri, 13 Aug 2021 13:49:11 +0000 (21:49 +0800)]
scsi: wd719: Stop using scsi_cmnd.tag
Use scsi_cmd_to_rq(cmd)->tag instead.
Link: https://lore.kernel.org/r/1628862553-179450-2-git-send-email-john.garry@huawei.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
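The reason a command does not need its own copy of the tag can be shown with a toy layout. The sketch assumes the command is stored directly behind its request in one allocation, so the request (and its tag) can be recovered from the command pointer by arithmetic. The structs here are tiny hypothetical stand-ins, and cmd_to_rq() and rq_to_cmd() are illustrative names rather than the kernel helpers.

```c
#include <stdlib.h>

/* Toy stand-ins; the real structures are far larger. */
struct request {
	int tag;
};

struct scsi_cmnd {
	unsigned char cmnd[16];	/* note: no tag and no request member */
};

/* The command payload sits immediately after the request. */
static struct scsi_cmnd *rq_to_cmd(struct request *rq)
{
	return (struct scsi_cmnd *)(rq + 1);
}

/* Step back over the request header to recover it from the command. */
static struct request *cmd_to_rq(struct scsi_cmnd *cmd)
{
	return (struct request *)((char *)cmd - sizeof(struct request));
}

/* One allocation holds the request followed by the command. */
static struct request *alloc_request(int tag)
{
	struct request *rq = calloc(1, sizeof(*rq) + sizeof(struct scsi_cmnd));

	if (rq)
		rq->tag = tag;
	return rq;
}
```

With this layout, cmd_to_rq(cmd)->tag yields the same value a duplicated per-command tag field would have held.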
Dan Carpenter [Tue, 10 Aug 2021 08:51:49 +0000 (11:51 +0300)]
scsi: qedf: Fix error codes in qedf_alloc_global_queues()
This driver has some leftover "return 1" on failure style code mixed with
"return negative error codes" style code. The caller doesn't care so we
should just convert everything to return negative error codes.
Then there was a problem that there were two variables used to store error
codes which just resulted in confusion. If qedf_alloc_bdq() returned a
negative error code, we accidentally returned success instead of
propagating the error code. So get rid of the "rc" variable and use
"status" everywhere.
Also remove the "status = 0" initialization so that these sorts of bugs
will be detected by the compiler in the future.
Link: https://lore.kernel.org/r/20210810085023.GA23998@kili
Fixes: 61d8658b4a43 ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.")
Acked-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
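The two-variable mistake is easy to reproduce in miniature. In the sketch below, alloc_step(), setup_two_vars() and setup_one_var() are hypothetical names: the first setup function stores the failure in "rc" but returns "status", so the caller sees success, while the second uses a single uninitialized "status" so that a forgotten assignment can be caught by the compiler.

```c
#include <errno.h>

/* Hypothetical allocation step that can be told to fail. */
static int alloc_step(int fail)
{
	return fail ? -ENOMEM : 0;
}

/* Buggy shape: the error lands in rc, but status is what gets returned. */
static int setup_two_vars(int fail)
{
	int status = 0;
	int rc;

	rc = alloc_step(fail);
	if (rc)
		goto out;	/* status is still 0 here */

	status = alloc_step(0);
out:
	return status;
}

/* Fixed shape: one variable, no "= 0" initializer to hide mistakes. */
static int setup_one_var(int fail)
{
	int status;

	status = alloc_step(fail);
	if (status)
		goto out;

	status = alloc_step(0);
out:
	return status;
}
```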
Dan Carpenter [Tue, 10 Aug 2021 08:47:53 +0000 (11:47 +0300)]
scsi: qedi: Fix error codes in qedi_alloc_global_queues()
This function had some leftover code that returned 1 on error instead of
negative error codes. Convert everything to use negative error codes. The
caller treats all non-zero returns the same so this does not affect run
time.
A couple places set "rc" instead of "status" so those error paths ended up
returning success by mistake. Get rid of the "rc" variable and use
"status" everywhere.
Remove the bogus "status = 0" initialization, as a future proofing measure
so the compiler will warn about uninitialized error codes.
Link: https://lore.kernel.org/r/20210810084753.GD23810@kili
Fixes: ace7f46ba5fd ("scsi: qedi: Add QLogic FastLinQ offload iSCSI driver framework.")
Acked-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Dan Carpenter [Tue, 10 Aug 2021 08:46:13 +0000 (11:46 +0300)]
scsi: smartpqi: Fix an error code in pqi_get_raid_map()
Return -EINVAL on failure instead of success.
Link: https://lore.kernel.org/r/20210810084613.GB23810@kili
Fixes: a91aaae0243b ("scsi: smartpqi: allow for larger raid maps")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In the alloc_queue callback, the driver checks the map to see whether the
queue is already allocated:
ha->queue_pair_map[qidx]
This works fine as long as max_qpairs is greater than nvme_max_hw_queues(8)
since the size of the queue_pair_map is equal to max_qpairs. In case nr_cpus
is less than 8, max_qpairs is less than 8. This causes a wrong value to be
returned as qpair.
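The passage above describes the problem but not the change that resolves it, so the following is only one possible guard, written as a userspace sketch. struct host, struct qpair and lookup_qpair() are hypothetical; the sketch shows that an index derived from the hardware queue count must be compared against the actual map size before the map is read.

```c
#include <stddef.h>

#define NVME_MAX_HW_QUEUES 8	/* value quoted in the text above */

struct qpair {
	int id;
};

/* Hypothetical host state: the map holds exactly max_qpairs entries. */
struct host {
	size_t max_qpairs;
	struct qpair **queue_pair_map;
};

/* Return the mapped qpair, or NULL when qidx lies outside the map. */
static struct qpair *lookup_qpair(const struct host *ha, size_t qidx)
{
	if (qidx >= ha->max_qpairs)
		return NULL;	/* reading the map here would be out of bounds */
	return ha->queue_pair_map[qidx];
}
```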
Saurav Kashyap [Tue, 10 Aug 2021 04:37:18 +0000 (21:37 -0700)]
scsi: qla2xxx: Changes to support kdump kernel for NVMe BFS
The MSI-X and MSI calls fail in the kdump kernel. Because of this,
qla2xxx_create_qpair() fails, leading to .create_queue callback failure.
The fix is to return the existing qpair instead of allocating a new one and
to allocate a single hw queue.
Quinn Tran [Tue, 10 Aug 2021 04:37:15 +0000 (21:37 -0700)]
scsi: qla2xxx: Fix NPIV create erroneous error
When a user creates multiple NPIVs, the switch capabilities field is checked
before a vport is allowed to be created. This field is being toggled if a
switch scan is in progress. This causes vport creation to be rejected
erroneously.
Link: https://lore.kernel.org/r/20210810043720.1137-10-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Quinn Tran [Tue, 10 Aug 2021 04:37:14 +0000 (21:37 -0700)]
scsi: qla2xxx: Fix unsafe removal from linked list
On NPIV delete, the VPort is taken off a linked list in an unsafe manner.
The check for the VPort refcount should be done under the lock before
taking the element off the list.
Quinn Tran [Tue, 10 Aug 2021 04:37:13 +0000 (21:37 -0700)]
scsi: qla2xxx: Fix port type info
Over time, fcport->port_type became a flag field. The flags within this
field were not defined properly. This caused external tools to read wrong
info.
Link: https://lore.kernel.org/r/20210810043720.1137-8-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Shai Malin [Wed, 4 Aug 2021 22:14:12 +0000 (01:14 +0300)]
scsi: qedi: Add support for fastpath doorbell recovery
Driver fastpath employs doorbells to indicate to the device that work is
available. Each doorbell translates to a message sent to the device over
PCI. These messages are queued by the doorbell queue HW block, and handled
by the HW.
If a sufficient number of CPU cores are sending messages at a sufficient
rate, the queue can overflow, and messages can be dropped. There are many
entities in the driver which can send doorbell messages. When overflow
happens, a fatal HW attention is indicated, and the Doorbell HW block stops
accepting new doorbell messages until the recovery procedure is done.
When overflow occurs, all doorbells are dropped. Since doorbells are
aggregative, if more doorbells are sent nothing has to be done. But if
the "last" doorbell is dropped, the doorbelling entity doesn’t know this
happened, and may wait forever for the device to perform the action. The
doorbell recovery mechanism addresses just that: it resends the last
doorbell of every entity.
[mkp: fix missing brackets reported by Guenter Roeck]
Link: https://lore.kernel.org/r/20210804221412.5048-1-smalin@marvell.com
Co-developed-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
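The recovery idea can be modelled in a few lines of userspace C. Everything in this sketch is hypothetical (db_table, ring_doorbell(), recover_doorbells(), and the use of plain variables in place of device registers); it does not reflect the driver's actual interfaces. Because a later doorbell supersedes earlier ones, remembering only the most recent value per entity is enough to replay after an overflow.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_DB_ENTITIES 4

/* Per-entity record of where and what was last written. */
struct db_entry {
	volatile uint32_t *db_addr;
	uint32_t last_value;
	int valid;
};

static struct db_entry db_table[MAX_DB_ENTITIES];

/* Ring a doorbell and remember it as the latest one for this entity. */
static void ring_doorbell(size_t entity, volatile uint32_t *addr,
			  uint32_t value)
{
	if (entity >= MAX_DB_ENTITIES)
		return;
	db_table[entity].db_addr = addr;
	db_table[entity].last_value = value;
	db_table[entity].valid = 1;
	*addr = value;
}

/* After an overflow, replay the last doorbell of every known entity. */
static size_t recover_doorbells(void)
{
	size_t i, replayed = 0;

	for (i = 0; i < MAX_DB_ENTITIES; i++) {
		if (!db_table[i].valid)
			continue;
		*db_table[i].db_addr = db_table[i].last_value;
		replayed++;
	}
	return replayed;
}
```

Replaying a doorbell that was not actually lost is harmless in this model, since it rewrites the value the register already holds.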
Bart Van Assche [Mon, 9 Aug 2021 23:03:55 +0000 (16:03 -0700)]
scsi: core: Remove the request member from struct scsi_cmnd
Since all scsi_cmnd.request users are gone, remove the request pointer
from struct scsi_cmnd.
Link: https://lore.kernel.org/r/20210809230355.8186-53-bvanassche@acm.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche [Mon, 9 Aug 2021 23:03:40 +0000 (16:03 -0700)]
scsi: qla1280: Use scsi_cmd_to_rq() instead of scsi_cmnd.request
Prepare for removal of the request pointer by using scsi_cmd_to_rq()
instead. Remove the unused CMD_REQUEST() macro. This patch does not change
any functionality.
Bart Van Assche [Mon, 9 Aug 2021 23:03:15 +0000 (16:03 -0700)]
scsi: NCR5380: Use sc_data_direction instead of rq_data_dir()
This patch prepares for the removal of the request pointer from struct
scsi_cmnd and does not change any functionality.
Link: https://lore.kernel.org/r/20210809230355.8186-13-bvanassche@acm.org
Cc: Michael Schmitz <schmitzmic@gmail.com>
Suggested-by: Finn Thain <fthain@linux-m68k.org>
Acked-by: Finn Thain <fthain@linux-m68k.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche [Mon, 9 Aug 2021 23:03:13 +0000 (16:03 -0700)]
scsi: zfcp: Use scsi_cmd_to_rq() instead of scsi_cmnd.request
Prepare for removal of the request pointer by using scsi_cmd_to_rq()
instead. This patch does not change any functionality.
Link: https://lore.kernel.org/r/20210809230355.8186-11-bvanassche@acm.org
Acked-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche [Mon, 9 Aug 2021 23:03:11 +0000 (16:03 -0700)]
scsi: RDMA/iser: Use scsi_cmd_to_rq() instead of scsi_cmnd.request
Prepare for removal of the request pointer by using scsi_cmd_to_rq()
instead. This patch does not change any functionality.
Link: https://lore.kernel.org/r/20210809230355.8186-9-bvanassche@acm.org
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>