]> www.infradead.org Git - users/griffoul/linux.git/log
users/griffoul/linux.git
3 years agoscsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4
James Smart [Fri, 25 Feb 2022 02:22:53 +0000 (18:22 -0800)]
scsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4

Convert the SLI4 fast and slow paths to use native SLI4 wqe constructs
instead of iocb SLI3-isms.

Includes the following:

 - Create simple get_xxx and set_xxx routines to wrapper access to common
   elements in both SLI3 and SLI4 commands - allowing calling routines to
   avoid sli-rev-specific structures to access the elements.

 - using the wqe in the job structure as the primary element

 - use defines from SLI-4, not SLI-3

 - Removal of iocb to wqe conversion from fast and slow path

 - Add below routines to handle fast path
lpfc_prep_embed_io - prepares the wqe for fast path
lpfc_wqe_bpl2sgl   - manages bpl to sgl conversion
lpfc_sli_wqe2iocb  - converts a WQE to IOCB for SLI-3 path

 - Add lpfc_sli3_iocb2wcqecmpl in completion path to convert an SLI-3
   iocb completion to wcqe completion

 - Refactor some of the code that works on both revs for clarity

Link: https://lore.kernel.org/r/20220225022308.16486-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: SLI path split: Refactor lpfc_iocbq
James Smart [Fri, 25 Feb 2022 02:22:52 +0000 (18:22 -0800)]
scsi: lpfc: SLI path split: Refactor lpfc_iocbq

Currently, SLI3 and SLI4 data paths use the same lpfc_iocbq structure.
This is a "common" structure but many of the components refer to sli-rev
specific entities which can lead the developer astray as to what they
actually mean, should be set to, or when they should be used.

This first patch prepares the lpfc_iocbq structure so that elements common
to both SLI3 and SLI4 data paths are more appropriately named, making it
clear they apply generically.

Fieldnames based on 'iocb' (sli3) or 'wqe' (sli4) which are actually
generic to the paths are renamed to 'cmd':

 - iocb_flag is renamed to cmd_flag

 - lpfc_vmid_iocb_tag is renamed to lpfc_vmid_tag

 - fabric_iocb_cmpl is renamed to fabric_cmd_cmpl

 - wait_iocb_cmpl is renamed to wait_cmd_cmpl

 - iocb_cmpl and wqe_cmpl are combined and renamed to cmd_cmpl

 - rsvd2 member is renamed to num_bdes due to pre-existing usage

The structure name itself will retain the iocb reference as changing to a
more relevant "job" or "cmd" title induces many hundreds of line changes
for only a name change.

lpfc_post_buffer is also renamed to lpfc_sli3_post_buffer to indicate use
in the SLI3 path only.

Link: https://lore.kernel.org/r/20220225022308.16486-2-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: Use kcalloc()
Julia Lawall [Sun, 13 Mar 2022 14:18:47 +0000 (15:18 +0100)]
scsi: lpfc: Use kcalloc()

Use kcalloc() instead of kmalloc() + memset().

Link: https://lore.kernel.org/r/20220313141847.109804-1-Julia.Lawall@inria.fr
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: aic7xxx: Fix typos in comments
Julia Lawall [Mon, 14 Mar 2022 11:53:49 +0000 (12:53 +0100)]
scsi: aic7xxx: Fix typos in comments

Various spelling mistakes in comments.  Detected with the help of
Coccinelle.

Link: https://lore.kernel.org/r/20220314115354.144023-26-Julia.Lawall@inria.fr
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qla2xxx: Fix typos in comments
Julia Lawall [Mon, 14 Mar 2022 11:53:48 +0000 (12:53 +0100)]
scsi: qla2xxx: Fix typos in comments

Various spelling mistakes in comments.  Detected with the help of
Coccinelle.

Link: https://lore.kernel.org/r/20220314115354.144023-25-Julia.Lawall@inria.fr
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: libefc_sli: Fix typos in comments
Julia Lawall [Mon, 14 Mar 2022 11:53:41 +0000 (12:53 +0100)]
scsi: elx: libefc_sli: Fix typos in comments

Various spelling mistakes in comments.  Detected with the help of
Coccinelle.

Link: https://lore.kernel.org/r/20220314115354.144023-18-Julia.Lawall@inria.fr
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: Fix typos in comments
Julia Lawall [Mon, 14 Mar 2022 11:53:26 +0000 (12:53 +0100)]
scsi: lpfc: Fix typos in comments

Various spelling mistakes in comments.  Detected with the help of
Coccinelle.

Link: https://lore.kernel.org/r/20220314115354.144023-3-Julia.Lawall@inria.fr
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qla2xxx: Update version to 10.02.07.400-k
Nilesh Javali [Thu, 10 Mar 2022 09:26:04 +0000 (01:26 -0800)]
scsi: qla2xxx: Update version to 10.02.07.400-k

Link: https://lore.kernel.org/r/20220310092604.22950-14-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qla2xxx: Increase max limit of ql2xnvme_queues
Shreyas Deodhar [Thu, 10 Mar 2022 09:26:03 +0000 (01:26 -0800)]
scsi: qla2xxx: Increase max limit of ql2xnvme_queues

Increase max limit of ql2xnvme_queues to (max_qpair - 1).

Link: https://lore.kernel.org/r/20220310092604.22950-13-njavali@marvell.com
Fixes: 65120de26a54 ("scsi: qla2xxx: Add ql2xnvme_queues module param to configure number of NVMe queues")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Shreyas Deodhar <sdeodhar@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qla2xxx: Use correct feature type field during RFF_ID processing
Manish Rangankar [Thu, 10 Mar 2022 09:26:02 +0000 (01:26 -0800)]
scsi: qla2xxx: Use correct feature type field during RFF_ID processing

During SNS Register FC-4 Features (RFF_ID) the initiator driver was sending
incorrect type field for NVMe supported device. Use correct feature type
field.

Link: https://lore.kernel.org/r/20220310092604.22950-12-njavali@marvell.com
Fixes: e374f9f59281 ("scsi: qla2xxx: Migrate switch registration commands away from mailbox interface")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qla2xxx: Fix stuck session of PRLI reject
Quinn Tran [Thu, 10 Mar 2022 09:26:01 +0000 (01:26 -0800)]
scsi: qla2xxx: Fix stuck session of PRLI reject

Remove stale recovery code that prevents normal path recovery.

Link: https://lore.kernel.org/r/20220310092604.22950-11-njavali@marvell.com
Fixes: 1cbc0efcd9be ("scsi: qla2xxx: Fix retry for PRLI RJT with reason of BUSY")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qla2xxx: Reduce false trigger to login
Quinn Tran [Thu, 10 Mar 2022 09:26:00 +0000 (01:26 -0800)]
scsi: qla2xxx: Reduce false trigger to login

While a session is in the middle of a relogin, a late RSCN can be delivered
from switch. RSCN trigger fabric scan where the scan logic can trigger
another session login while a login is in progress.  Reduce the extra
trigger to prevent multiple logins to the same session.

Link: https://lore.kernel.org/r/20220310092604.22950-10-njavali@marvell.com
Fixes: bee8b84686c4 ("scsi: qla2xxx: Reduce redundant ADISC command for RSCNs")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qla2xxx: Fix laggy FC remote port session recovery
Quinn Tran [Thu, 10 Mar 2022 09:25:59 +0000 (01:25 -0800)]
scsi: qla2xxx: Fix laggy FC remote port session recovery

For session recovery, driver relies on the dpc thread to initiate certain
operations. The dpc thread runs exclusively without the Mailbox interface
being occupied. A recent code change for heartbeat check via mailbox cmd 0
is preventing the dpc thread from carrying out its operation. This patch
allows the higher priority error recovery to run first before running the
lower priority heartbeat check.

Link: https://lore.kernel.org/r/20220310092604.22950-9-njavali@marvell.com
Fixes: d94d8158e184 ("scsi: qla2xxx: Add heartbeat check")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qla2xxx: Fix hang due to session stuck
Quinn Tran [Thu, 10 Mar 2022 09:25:58 +0000 (01:25 -0800)]
scsi: qla2xxx: Fix hang due to session stuck

User experienced device lost. The log shows Get port data base command was
queued up, failed, and requeued again. Every time it is requeued, it set
the FCF_ASYNC_ACTIVE. This prevents any recovery code from occurring
because driver thinks a recovery is in progress for this session. In
essence, this session is hung.  The reason it gets into this place is the
session deletion got in front of this call due to link perturbation.

Break the requeue cycle and exit.  The session deletion code will trigger a
session relogin.

Link: https://lore.kernel.org/r/20220310092604.22950-8-njavali@marvell.com
Fixes: 726b85487067 ("qla2xxx: Add framework for async fabric discovery")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qla2xxx: Fix N2N inconsistent PLOGI
Quinn Tran [Thu, 10 Mar 2022 09:25:57 +0000 (01:25 -0800)]
scsi: qla2xxx: Fix N2N inconsistent PLOGI

For N2N topology, ELS Passthrough is used to send PLOGI. On failure of ELS
pass through PLOGI, driver flipped over to using LLIOCB PLOGI for N2N. This
is not consistent. Delete the session to restart the connection where ELS
pass through PLOGI would be used consistently.

Link: https://lore.kernel.org/r/20220310092604.22950-7-njavali@marvell.com
Fixes: c76ae845ea83 ("scsi: qla2xxx: Add error handling for PLOGI ELS passthrough")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qla2xxx: Fix crash during module load unload test
Arun Easi [Thu, 10 Mar 2022 09:25:56 +0000 (01:25 -0800)]
scsi: qla2xxx: Fix crash during module load unload test

During purex packet handling the driver was incorrectly freeing a
pre-allocated structure. Fix this by skipping that entry.

System crashed with the following stack during a module unload test.

Call Trace:
sbitmap_init_node+0x7f/0x1e0
sbitmap_queue_init_node+0x24/0x150
blk_mq_init_bitmaps+0x3d/0xa0
blk_mq_init_tags+0x68/0x90
blk_mq_alloc_map_and_rqs+0x44/0x120
blk_mq_alloc_set_map_and_rqs+0x63/0x150
blk_mq_alloc_tag_set+0x11b/0x230
scsi_add_host_with_dma.cold+0x3f/0x245
qla2x00_probe_one+0xd5a/0x1b80 [qla2xxx]

Call Trace with slub_debug and debug kernel:
kasan_report_invalid_free+0x50/0x80
__kasan_slab_free+0x137/0x150
slab_free_freelist_hook+0xc6/0x190
kfree+0xe8/0x2e0
qla2x00_free_device+0x3bb/0x5d0 [qla2xxx]
qla2x00_remove_one+0x668/0xcf0 [qla2xxx]

Link: https://lore.kernel.org/r/20220310092604.22950-6-njavali@marvell.com
Fixes: 62e9dd177732 ("scsi: qla2xxx: Change in PUREX to handle FPIN ELS requests")
Cc: stable@vger.kernel.org
Reported-by: Marco Patalano <mpatalan@redhat.com>
Tested-by: Marco Patalano <mpatalan@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qla2xxx: Fix missed DMA unmap for NVMe ls requests
Arun Easi [Thu, 10 Mar 2022 09:25:55 +0000 (01:25 -0800)]
scsi: qla2xxx: Fix missed DMA unmap for NVMe ls requests

At NVMe ELS request time, request structure is DMA mapped and never
unmapped. Fix this by calling the unmap on ELS completion.

Link: https://lore.kernel.org/r/20220310092604.22950-5-njavali@marvell.com
Fixes: e84067d74301 ("scsi: qla2xxx: Add FC-NVMe F/W initialization and transport registration")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qla2xxx: Fix loss of NVMe namespaces after driver reload test
Arun Easi [Thu, 10 Mar 2022 09:25:54 +0000 (01:25 -0800)]
scsi: qla2xxx: Fix loss of NVMe namespaces after driver reload test

Driver registration of localport can race when it happens at the remote
port discovery time. Fix this by calling the registration under a mutex.

Link: https://lore.kernel.org/r/20220310092604.22950-4-njavali@marvell.com
Fixes: e84067d74301 ("scsi: qla2xxx: Add FC-NVMe F/W initialization and transport registration")
Cc: stable@vger.kernel.org
Reported-by: Marco Patalano <mpatalan@redhat.com>
Tested-by: Marco Patalano <mpatalan@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qla2xxx: Fix disk failure to rediscover
Quinn Tran [Thu, 10 Mar 2022 09:25:53 +0000 (01:25 -0800)]
scsi: qla2xxx: Fix disk failure to rediscover

User experienced some of the LUN failed to get rediscovered after long
cable pull test. The issue is triggered by a race condition between driver
setting session online state vs starting the LUN scan process at the same
time. Current code set the online state after notifying the session is
available. In this case, trigger to start the LUN scan process happened
before driver could set the session in online state.  LUN scan ends up with
failure due to the session online check was failing.

Set the online state before reporting of the availability of the session.

Link: https://lore.kernel.org/r/20220310092604.22950-3-njavali@marvell.com
Fixes: aecf043443d3 ("scsi: qla2xxx: Fix Remote port registration")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qla2xxx: Fix incorrect reporting of task management failure
Quinn Tran [Thu, 10 Mar 2022 09:25:52 +0000 (01:25 -0800)]
scsi: qla2xxx: Fix incorrect reporting of task management failure

User experienced no task management error while target device is responding
with error. The RSP_CODE field in the status IOCB is in little endian.
Driver assumes it's big endian and it picked up erroneous data.

Convert the data back to big endian as is on the wire.

Link: https://lore.kernel.org/r/20220310092604.22950-2-njavali@marvell.com
Fixes: faef62d13463 ("[SCSI] qla2xxx: Fix Task Management command asynchronous handling")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: libiscsi: Teardown iscsi_cls_conn gracefully
Wenchao Hao [Thu, 10 Mar 2022 01:57:59 +0000 (20:57 -0500)]
scsi: libiscsi: Teardown iscsi_cls_conn gracefully

Commit 1b8d0300a3e9 ("scsi: libiscsi: Fix UAF in
iscsi_conn_get_param()/iscsi_conn_teardown()") fixed an UAF in
iscsi_conn_get_param() and introduced 2 tmp_xxx varibles.

We can gracefully fix this UAF with the help of device_del(). Calling
iscsi_remove_conn() at the beginning of iscsi_conn_teardown would make
userspace unable to see iscsi_cls_conn. This way we we can free memory
safely.

Remove iscsi_destroy_conn() since it is no longer used.

Link: https://lore.kernel.org/r/20220310015759.3296841-4-haowenchao@huawei.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Wenchao Hao <haowenchao@huawei.com>
Signed-off-by: Wu Bo <wubo40@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: libiscsi: Add iscsi_cls_conn to sysfs after initialization
Wenchao Hao [Thu, 10 Mar 2022 01:57:58 +0000 (20:57 -0500)]
scsi: libiscsi: Add iscsi_cls_conn to sysfs after initialization

iscsi_create_conn() exposed iscsi_cls_conn to sysfs prior to initialization
of iscsi_conn's dd_data. When userspace tried to access an attribute such
as the connect address, a NULL pointer dereference was observed.

Do not add iscsi_cls_conn to sysfs until it has been initialized.  Remove
iscsi_create_conn() since it is no longer used.

Link: https://lore.kernel.org/r/20220310015759.3296841-3-haowenchao@huawei.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Wenchao Hao <haowenchao@huawei.com>
Signed-off-by: Wu Bo <wubo40@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Add helper functions to manage iscsi_cls_conn
Wenchao Hao [Thu, 10 Mar 2022 01:57:57 +0000 (20:57 -0500)]
scsi: iscsi: Add helper functions to manage iscsi_cls_conn

 - iscsi_alloc_conn(): Allocate and initialize iscsi_cls_conn

 - iscsi_add_conn(): Expose iscsi_cls_conn to userspace via sysfs

 - iscsi_remove_conn(): Remove iscsi_cls_conn from sysfs

Link: https://lore.kernel.org/r/20220310015759.3296841-2-haowenchao@huawei.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Wenchao Hao <haowenchao@huawei.com>
Signed-off-by: Wu Bo <wubo40@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: core: Remove unreachable code warning
Yang Li [Tue, 1 Mar 2022 08:04:48 +0000 (16:04 +0800)]
scsi: core: Remove unreachable code warning

The smatch tool reported the following warning:
drivers/scsi/scsi_error.c:1988 scsi_decide_disposition() warn: ignoring
unreachable code.

Remove the "default:return FAILED;" instead of "return FAILED;" reported by
smatch, because compilers can provide more useful diagnostics about
switch/case statements that do not have a default statement, especially if
the "switch" applies to a value with enumeration type.

Link: https://lore.kernel.org/r/20220301080448.112813-1-yang.lee@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: megasas: Clean up some inconsistent indenting
Yang Li [Fri, 25 Feb 2022 01:16:05 +0000 (09:16 +0800)]
scsi: megasas: Clean up some inconsistent indenting

Eliminate the following smatch warning:
drivers/scsi/megaraid/megaraid_sas_fusion.c:5104 megasas_reset_fusion()
warn: inconsistent indenting

Link: https://lore.kernel.org/r/20220225011605.130927-1-yang.lee@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: aacraid: Clean up some inconsistent indenting
Jiapeng Chong [Wed, 9 Mar 2022 00:50:31 +0000 (08:50 +0800)]
scsi: aacraid: Clean up some inconsistent indenting

Eliminate the following smatch warning:

drivers/scsi/aacraid/linit.c:867 aac_eh_tmf_hard_reset_fib() warn:
inconsistent indenting.

Link: https://lore.kernel.org/r/20220309005031.126504-1-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: target: Add iscsi/cpus_allowed_list in configfs
Mingzhe Zou [Tue, 1 Mar 2022 07:55:00 +0000 (15:55 +0800)]
scsi: target: Add iscsi/cpus_allowed_list in configfs

The RX/TX threads for iSCSI connection can be scheduled to any online CPUs,
and will not be rescheduled.

When binding other heavy load threads along with iSCSI connection RX/TX
thread to the same CPU, the iSCSI performance will be worse.

Add iscsi/cpus_allowed_list in configfs. The available CPU set of iSCSI
connection RX/TX threads is allowed_cpus & online_cpus. If it is modified,
all RX/TX threads will be rescheduled.

Link: https://lore.kernel.org/r/20220301075500.14266-1-mingzhe.zou@easystack.cn
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: hisi_sas: Use libsas internal abort support
John Garry [Fri, 11 Mar 2022 12:23:52 +0000 (20:23 +0800)]
scsi: hisi_sas: Use libsas internal abort support

Use the common libsas internal abort functionality.

In addition, this driver has special handling for internal abort timeouts -
specifically whether to reset the controller in that instance, so extend
the API for that.

Timeout is now increased to 20 * Hz from 6 * Hz.

We also retry for failure now, but this should not make a difference.

Link: https://lore.kernel.org/r/1647001432-239276-5-git-send-email-john.garry@huawei.com
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Use libsas internal abort support
John Garry [Fri, 11 Mar 2022 12:23:51 +0000 (20:23 +0800)]
scsi: pm8001: Use libsas internal abort support

New special handling is added for SAS_PROTOCOL_INTERNAL_ABORT proto so that
we may use the common queue command API.

Link: https://lore.kernel.org/r/1647001432-239276-4-git-send-email-john.garry@huawei.com
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: libsas: Add sas_execute_internal_abort_dev()
John Garry [Fri, 11 Mar 2022 12:23:50 +0000 (20:23 +0800)]
scsi: libsas: Add sas_execute_internal_abort_dev()

Add support for a "device" variant of internal abort, which will abort all
pending I/Os for a specific device.

Link: https://lore.kernel.org/r/1647001432-239276-3-git-send-email-john.garry@huawei.com
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: libsas: Add sas_execute_internal_abort_single()
John Garry [Fri, 11 Mar 2022 12:23:49 +0000 (20:23 +0800)]
scsi: libsas: Add sas_execute_internal_abort_single()

The internal abort feature is common to hisi_sas and pm8001 HBAs, and the
driver support is similar also, so add a common handler.

Two modes of operation will be supported:

 - single: Abort a single tagged command

 - device: Abort all commands associated with a specific domain device

A new protocol is added, SAS_PROTOCOL_INTERNAL_ABORT, so the common queue
command API may be re-used.

Only add "single" support as a first step.

Link: https://lore.kernel.org/r/1647001432-239276-2-git-send-email-john.garry@huawei.com
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: Remove failing soft_wwn support
James Smart [Thu, 10 Mar 2022 15:48:45 +0000 (07:48 -0800)]
scsi: lpfc: Remove failing soft_wwn support

The soft_wwpn/soft_wwn functionality, which allows the driver to modify
service parameters in an attempt to override the adapter-assigned WWN, was
originally attempted to be removed roughly 6 yrs ago as new fabric features
were being introduced that clashed with the implementation.  In the end,
the feature was left in with the user being responsible if things went
south.

We've reached a point where soft_wwn is no longer functional and is failing
in almost all production use cases. Use of Fabric features such as Fabric
Assigned WWPN and Automatic DPORT is now prevalent and the features require
coordination between the adapter and driver that can't be solved by the
simplistic update of the service parameters. As it is no longer functional,
the feature is to be removed.

There are still ways to override the adapter-assigned WWN but they require
the admin to invoke bios/efi level menus.

Link: https://lore.kernel.org/r/20220310154845.11125-1-jsmart2021@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: ufs: core: scsi_get_lba() error fix
Peter Wang [Mon, 7 Mar 2022 11:17:52 +0000 (19:17 +0800)]
scsi: ufs: core: scsi_get_lba() error fix

When ufs initializes without scmd->device->sector_size set, scsi_get_lba()
will get a wrong shift number and trigger an ubsan error.  The shift
exponent 4294967286 is too large for the 64-bit type 'sector_t' (aka
'unsigned long long').

Call scsi_get_lba() only when opcode is READ_10/WRITE_10/UNMAP.

Link: https://lore.kernel.org/r/20220307111752.10465-1-peter.wang@mediatek.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: ufs: docs: UFS documentation corrections
Randy Dunlap [Mon, 7 Mar 2022 01:32:24 +0000 (17:32 -0800)]
scsi: ufs: docs: UFS documentation corrections

Make a variety of corrections to ufs.rst:

 - add spaces around parenthetical phrases
 - correct singular/plural grammar and nouns
 - correct punctuation
 - add article adjectives
 - add hyphens to multi-word adjectives
 - spell Lun as LUN
 - spell upiu as UPIU (in text, not code examples)
 - don't capitalize generic "specification"

Link: https://lore.kernel.org/r/20220307013224.5130-1-rdunlap@infradead.org
Cc: Alim Akhtar <alim.akhtar@samsung.com>
Cc: Avri Altman <avri.altman@wdc.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: mpt3sas: Fix incorrect 4GB boundary check
Sreekanth Reddy [Thu, 3 Mar 2022 14:02:30 +0000 (19:32 +0530)]
scsi: mpt3sas: Fix incorrect 4GB boundary check

The driver must perform its 4GB boundary check using the pool's DMA address
instead of using the virtual address.

Link: https://lore.kernel.org/r/20220303140230.13098-1-sreekanth.reddy@broadcom.com
Fixes: d6adc251dd2f ("scsi: mpt3sas: Force PCIe scatterlist allocations to be within same 4 GB region")
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: mpt3sas: Remove scsi_dma_map() error messages
Sreekanth Reddy [Thu, 3 Mar 2022 14:02:03 +0000 (19:32 +0530)]
scsi: mpt3sas: Remove scsi_dma_map() error messages

When scsi_dma_map() fails by returning a sges_left value less than zero,
the amount of logging produced can be extremely high.  In a recent end-user
environment, 1200 messages per second were being sent to the log buffer.
This eventually overwhelmed the system and it stalled.

These error messages are not needed. Remove them.

Link: https://lore.kernel.org/r/20220303140203.12642-1-sreekanth.reddy@broadcom.com
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: libfc: Fix use after free in fc_exch_abts_resp()
Jianglei Nie [Thu, 3 Mar 2022 01:51:15 +0000 (09:51 +0800)]
scsi: libfc: Fix use after free in fc_exch_abts_resp()

fc_exch_release(ep) will decrease the ep's reference count. When the
reference count reaches zero, it is freed. But ep is still used in the
following code, which will lead to a use after free.

Return after the fc_exch_release() call to avoid use after free.

Link: https://lore.kernel.org/r/20220303015115.459778-1-niejianglei2021@163.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: scsi_debug: Fix qc_lock use in sdebug_blk_mq_poll()
Damien Le Moal [Tue, 1 Mar 2022 11:30:09 +0000 (20:30 +0900)]
scsi: scsi_debug: Fix qc_lock use in sdebug_blk_mq_poll()

The use of the 'locked' boolean variable to control locking and unlocking
of the qc_lock spinlock of struct sdebug_queue confuses sparse, leading to
a warning about an unexpected unlock. Simplify the qc_lock lock/unlock
handling code of this function to avoid this warning by removing the
'locked' boolean variable. This change also fixes unlocked access to the
in_use_bm bitmap with the find_first_bit() function.

Link: https://lore.kernel.org/r/20220301113009.595857-3-damien.lemoal@opensource.wdc.com
Fixes: b05d4e481eff ("scsi: scsi_debug: Refine sdebug_blk_mq_poll()")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: scsi_debug: Silence unexpected unlock warnings
Damien Le Moal [Tue, 1 Mar 2022 11:30:08 +0000 (20:30 +0900)]
scsi: scsi_debug: Silence unexpected unlock warnings

The return statement inside the sdeb_read_lock(), sdeb_read_unlock(),
sdeb_write_lock() and sdeb_write_unlock() confuse sparse, leading to many
warnings about unexpected unlocks in the resp_xxx() functions.

Modify the lock/unlock functions using the __acquire() and __release()
inline annotations for the sdebug_no_rwlock == true case to avoid these
warnings.

Link: https://lore.kernel.org/r/20220301113009.595857-2-damien.lemoal@opensource.wdc.com
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: ufs: Fix runtime PM messages never-ending cycle
Adrian Hunter [Mon, 28 Feb 2022 11:36:52 +0000 (13:36 +0200)]
scsi: ufs: Fix runtime PM messages never-ending cycle

Kernel messages produced during runtime PM can cause a never-ending cycle
because user space utilities (e.g. journald or rsyslog) write the messages
back to storage, causing runtime resume, more messages, and so on.

Messages that tell of things that are expected to happen, are arguably
unnecessary, so suppress them.

UFS driver messages are changes to from dev_err() to dev_dbg() which means
they will not display unless activated by dynamic debug of building with
-DDEBUG.

sdev->silence_suspend is set to skip messages from sd_suspend_common()
"Synchronizing SCSI cache", "Stopping disk" and scsi_report_sense()
"Power-on or device reset occurred" message (Note, that message appears
when the LUN is accessed after runtime PM, not during runtime PM)

 Example messages from Ubuntu 21.10:

 $ dmesg | tail
 [ 1620.380071] ufshcd 0000:00:12.5: ufshcd_print_pwr_info:[RX, TX]: gear=[1, 1], lane[1, 1], pwr[SLOWAUTO_MODE, SLOWAUTO_MODE], rate = 0
 [ 1620.408825] ufshcd 0000:00:12.5: ufshcd_print_pwr_info:[RX, TX]: gear=[4, 4], lane[2, 2], pwr[FAST MODE, FAST MODE], rate = 2
 [ 1620.409020] ufshcd 0000:00:12.5: ufshcd_find_max_sup_active_icc_level: Regulator capability was not set, actvIccLevel=0
 [ 1620.409524] sd 0:0:0:0: Power-on or device reset occurred
 [ 1622.938794] sd 0:0:0:0: [sda] Synchronizing SCSI cache
 [ 1622.939184] ufs_device_wlun 0:0:0:49488: Power-on or device reset occurred
 [ 1625.183175] ufshcd 0000:00:12.5: ufshcd_print_pwr_info:[RX, TX]: gear=[1, 1], lane[1, 1], pwr[SLOWAUTO_MODE, SLOWAUTO_MODE], rate = 0
 [ 1625.208041] ufshcd 0000:00:12.5: ufshcd_print_pwr_info:[RX, TX]: gear=[4, 4], lane[2, 2], pwr[FAST MODE, FAST MODE], rate = 2
 [ 1625.208311] ufshcd 0000:00:12.5: ufshcd_find_max_sup_active_icc_level: Regulator capability was not set, actvIccLevel=0
 [ 1625.209035] sd 0:0:0:0: Power-on or device reset occurred

Note for stable: depends on patch "scsi: core: sd: Add silence_suspend flag
to suppress some PM messages".

Link: https://lore.kernel.org/r/20220228113652.970857-3-adrian.hunter@intel.com
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: core: sd: Add silence_suspend flag to suppress some PM messages
Adrian Hunter [Mon, 28 Feb 2022 11:36:51 +0000 (13:36 +0200)]
scsi: core: sd: Add silence_suspend flag to suppress some PM messages

Kernel messages produced during runtime PM can cause a never-ending cycle
because user space utilities (e.g. journald or rsyslog) write the messages
back to storage, causing runtime resume, more messages, and so on.

Messages that tell of things that are expected to happen are arguably
unnecessary, so add a flag to suppress them. This flag is used by the UFS
driver.

Link: https://lore.kernel.org/r/20220228113652.970857-2-adrian.hunter@intel.com
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: Use rport as argument for lpfc_chk_tgt_mapped()
Hannes Reinecke [Tue, 1 Mar 2022 14:37:18 +0000 (15:37 +0100)]
scsi: lpfc: Use rport as argument for lpfc_chk_tgt_mapped()

We only need the rport structure for lpfc_chk_tgt_mapped().

Link: https://lore.kernel.org/r/20220301143718.40913-6-hare@suse.de
Cc: James Smart <james.smart@broadcom.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: Use rport as argument for lpfc_send_taskmgmt()
Hannes Reinecke [Tue, 1 Mar 2022 14:37:17 +0000 (15:37 +0100)]
scsi: lpfc: Use rport as argument for lpfc_send_taskmgmt()

Instead of passing in a scsi_cmnd we should be using the rport; we already
have the target and LUN ID as parameters, so there's no need to pass the
scsi_cmnd too.

Link: https://lore.kernel.org/r/20220301143718.40913-5-hare@suse.de
Cc: James Smart <james.smart@broadcom.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: Use fc_block_rport()
Hannes Reinecke [Tue, 1 Mar 2022 14:37:16 +0000 (15:37 +0100)]
scsi: lpfc: Use fc_block_rport()

Use fc_block_rport() instead of fc_block_scsi_eh() as the SCSI command will
be removed as argument for SCSI EH functions.

Link: https://lore.kernel.org/r/20220301143718.40913-4-hare@suse.de
Cc: James Smart <james.smart@broadcom.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: Drop lpfc_no_handler()
Hannes Reinecke [Tue, 1 Mar 2022 14:37:15 +0000 (15:37 +0100)]
scsi: lpfc: Drop lpfc_no_handler()

The default SCSI EH action for a non-existing EH callback is to return
FAILED, so having a callback just returning FAILED is pointless.

Link: https://lore.kernel.org/r/20220301143718.40913-3-hare@suse.de
Cc: James Smart <james.smart@broadcom.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: Kill lpfc_bus_reset_handler()
Hannes Reinecke [Tue, 1 Mar 2022 14:37:14 +0000 (15:37 +0100)]
scsi: lpfc: Kill lpfc_bus_reset_handler()

lpfc_bus_reset_handler() is really just a loop calling
lpfc_target_reset_handler() over all targets, which is what the error
handler will be doing anyway.

Link: https://lore.kernel.org/r/20220301143718.40913-2-hare@suse.de
Cc: James Smart <james.smart@broadcom.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: wd719x: Return proper error code when dma_set_mask() fails
Zheyu Ma [Mon, 28 Feb 2022 14:54:15 +0000 (14:54 +0000)]
scsi: wd719x: Return proper error code when dma_set_mask() fails

During the process of driver probing, the probe function should return < 0
for failure, otherwise the kernel will treat value >= 0 as success.

Set 'err' to the error value returned by dma_set_mask() in case of failure.

Link: https://lore.kernel.org/r/1646060055-11361-1-git-send-email-zheyuma97@gmail.com
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Drop temp workq_name
Mike Christie [Sat, 26 Feb 2022 23:04:35 +0000 (17:04 -0600)]
scsi: iscsi: Drop temp workq_name

When the workqueue code was created it didn't allow variable args so we
have been using a temp buffer. Drop that.

Link: https://lore.kernel.org/r/20220226230435.38733-7-michael.christie@oracle.com
Reviewed-by: Chris Leech <cleech@redhat.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Use the session workqueue for recovery
Mike Christie [Sat, 26 Feb 2022 23:04:34 +0000 (17:04 -0600)]
scsi: iscsi: Use the session workqueue for recovery

Use the session workqueue for recovery and unbinding. If there are delays
during device blocking/cleanup then it will no longer affect other
sessions.

Link: https://lore.kernel.org/r/20220226230435.38733-6-michael.christie@oracle.com
Reviewed-by: Chris Leech <cleech@redhat.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: ql4xxx: Use per-session workqueue for unbinding
Mike Christie [Sat, 26 Feb 2022 23:04:33 +0000 (17:04 -0600)]
scsi: iscsi: ql4xxx: Use per-session workqueue for unbinding

We currently allocate a workqueue per host and only use it for removing the
target. For the session per host case we could be using this workqueue to
be able to do recoveries (block, unblock, timeout handling) in parallel. To
also allow offload drivers to do their session recoveries in parallel, this
drops the per host workqueue and replaces it with a per session one.

Link: https://lore.kernel.org/r/20220226230435.38733-5-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Remove iscsi_scan_finished()
Mike Christie [Sat, 26 Feb 2022 23:04:32 +0000 (17:04 -0600)]
scsi: iscsi: Remove iscsi_scan_finished()

qla4xxx does not use iscsi_scan_finished() anymore so remove it.

Link: https://lore.kernel.org/r/20220226230435.38733-4-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Speed up session unblocking and removal
Mike Christie [Sat, 26 Feb 2022 23:04:31 +0000 (17:04 -0600)]
scsi: iscsi: Speed up session unblocking and removal

When the iSCSI class was added upstream, blocking a queue was fast because
it just set some flag bits and didn't handle I/O that was in the process of
being sent to the driver. That's no longer the case so blocking a queue is
expensive and we can end up with a backlog of blocks by the time we have
relogged in and are trying to start the queues.

For the session unblock case, this has try to cancel the block and recovery
work in case they are still queued so we can avoid unneeded queue
manipulations. For removal, we also now try to cancel all the recovery
related works since a couple lines down we will set the session and device
state so running those functions are not necessary.

Link: https://lore.kernel.org/r/20220226230435.38733-3-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Fix recovery and unblocking race
Mike Christie [Sat, 26 Feb 2022 23:04:30 +0000 (17:04 -0600)]
scsi: iscsi: Fix recovery and unblocking race

If the user sets the iscsi_eh_timer_workq/iscsi_eh workqueue's max_active
to greater than 1, the recovery_work could be running when
__iscsi_unblock_session() runs. The cancel_delayed_work() will then not
wait for the running work and we can race where we end up with the wrong
session state and scsi_device state set.

This replaces the cancel_delayed_work() with the sync version.

Link: https://lore.kernel.org/r/20220226230435.38733-2-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: scsi_transport_fc: Fix FPIN Link Integrity statistics counters
James Smart [Tue, 1 Mar 2022 17:55:36 +0000 (09:55 -0800)]
scsi: scsi_transport_fc: Fix FPIN Link Integrity statistics counters

In the original FPIN commit, stats were incremented by the event_count.
Event_count is the minimum # of events that must occur before an FPIN is
sent. Thus, its not the actual number of events, and could be significantly
off (too low) as it doesn't reflect anything not reported.  Rather than
attempt to count events, have the statistic count how many FPINS cross the
threshold and were reported.

Link: https://lore.kernel.org/r/20220301175536.60250-1-jsmart2021@gmail.com
Fixes: 3dcfe0de5a97 ("scsi: fc: Parse FPIN packets and update statistics")
Cc: <stable@vger.kernel.org> # v5.11+
Cc: Shyam Sundar <ssundar@marvell.com>
Cc: Nilesh Javali <njavali@marvell.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: libsas: Clean up sas_form_port()
Damien Le Moal [Mon, 28 Feb 2022 09:48:57 +0000 (18:48 +0900)]
scsi: libsas: Clean up sas_form_port()

Sparse throws a warning about context imbalance ("different lock contexts
for basic block") in sas_form_port() as it gets confused with the fact that
a port is locked within one of the two search loops and unlocked afterward
outside of the search loops once the phy is added to the port. Since this
code is not easy to follow, improve it by factoring out the code adding the
phy to the port once the port is locked into the helper function
sas_form_port_add_phy(). This helper can then be called directly within the
port search loops, avoiding confusion and clearing the sparse warning.

Link: https://lore.kernel.org/r/20220228094857.557329-1-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: core: Remove <scsi/scsi_request.h>
Christoph Hellwig [Thu, 24 Feb 2022 17:55:52 +0000 (18:55 +0100)]
scsi: core: Remove <scsi/scsi_request.h>

This header is empty now except for an include of <linux/blk-mq.h>, so
remove it.

Link: https://lore.kernel.org/r/20220224175552.988286-9-hch@lst.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: core: Remove struct scsi_request
Christoph Hellwig [Thu, 24 Feb 2022 17:55:51 +0000 (18:55 +0100)]
scsi: core: Remove struct scsi_request

Let submitters initialize the scmd->allowed field directly instead of
indirecting through struct scsi_request and remove the now superfluous
structure.

Link: https://lore.kernel.org/r/20220224175552.988286-8-hch@lst.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: core: Move the result field from struct scsi_request to struct scsi_cmnd
Christoph Hellwig [Thu, 24 Feb 2022 17:55:50 +0000 (18:55 +0100)]
scsi: core: Move the result field from struct scsi_request to struct scsi_cmnd

Prepare for removing the scsi_request structure by moving the result field
to struct scsi_cmnd.

Link: https://lore.kernel.org/r/20220224175552.988286-7-hch@lst.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: core: Move the resid_len field from struct scsi_request to struct scsi_cmnd
Christoph Hellwig [Thu, 24 Feb 2022 17:55:49 +0000 (18:55 +0100)]
scsi: core: Move the resid_len field from struct scsi_request to struct scsi_cmnd

Prepare for removing the scsi_request structure by moving the resid_len
field to struct scsi_cmnd.

Link: https://lore.kernel.org/r/20220224175552.988286-6-hch@lst.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: core: Remove the sense and sense_len fields from struct scsi_request
Christoph Hellwig [Thu, 24 Feb 2022 17:55:48 +0000 (18:55 +0100)]
scsi: core: Remove the sense and sense_len fields from struct scsi_request

Just use the sense_buffer field in struct scsi_cmnd for the sense data and
move the sense_len field over to struct scsi_cmnd.

Link: https://lore.kernel.org/r/20220224175552.988286-5-hch@lst.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: core: Remove the cmd field from struct scsi_request
Christoph Hellwig [Thu, 24 Feb 2022 17:55:47 +0000 (18:55 +0100)]
scsi: core: Remove the cmd field from struct scsi_request

Now that each scsi_request is backed by a scsi_cmnd, there is no need to
indirect the CDB storage.  Change all submitters of SCSI passthrough
requests to store the CDB information directly in the scsi_cmnd, and while
doing so allocate the full 32 bytes that cover all Linux supported SCSI
hosts instead of requiring dynamic allocation for > 16 byte CDBs.  On
64-bit systems this does not change the size of the scsi_cmnd at all, while
on 32-bit systems it slightly increases it for now, but that increase will
be made up by the removal of the remaining scsi_request fields.

Link: https://lore.kernel.org/r/20220224175552.988286-4-hch@lst.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: core: Don't memset() the entire scsi_cmnd in scsi_init_command()
Christoph Hellwig [Thu, 24 Feb 2022 17:55:46 +0000 (18:55 +0100)]
scsi: core: Don't memset() the entire scsi_cmnd in scsi_init_command()

Replace the big fat memset that requires saving and restoring various
fields with just initializing those fields that need initialization.

All the clearing to 0 is moved to scsi_prepare_cmd() as scsi_ioctl_reset()
alreadly uses kzalloc() to allocate a pre-zeroed command.

This is still conservative and can probably be optimized further.

Link: https://lore.kernel.org/r/20220224175552.988286-3-hch@lst.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: target: pscsi: Remove struct pscsi_plugin_task
Christoph Hellwig [Thu, 24 Feb 2022 17:55:45 +0000 (18:55 +0100)]
scsi: target: pscsi: Remove struct pscsi_plugin_task

Copy directly from the se_cmd CDB to the one in the scsi_request.  This
temporarily limits the pscsi backend to supporting only up to 16 byte CDBs,
but this restriction will be lifted later in this series.

Link: https://lore.kernel.org/r/20220224175552.988286-2-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: libsas: Use bool for queue_work() return code
John Garry [Fri, 25 Feb 2022 10:57:36 +0000 (18:57 +0800)]
scsi: libsas: Use bool for queue_work() return code

Function queue_work() returns a bool, so use a bool to hold this value for
the return code from callers, which should make the code a tiny bit more
clear.

Also take this opportunity to condense the code of the those callers, such
as sas_queue_work(), as suggested by Damien.

Link: https://lore.kernel.org/r/1645786656-221630-3-git-send-email-john.garry@huawei.com
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: libsas: Make sas_notify_{phy,port}_event() return void
John Garry [Fri, 25 Feb 2022 10:57:35 +0000 (18:57 +0800)]
scsi: libsas: Make sas_notify_{phy,port}_event() return void

Nobody checks the return codes, so make them return void. Indeed, if the
LLDD cannot send an event, nothing much can be done in the LLDD about it.

Also remove prototype for sas_notify_phy_event() in sas_internal.h, which
should not be there.

Link: https://lore.kernel.org/r/1645786656-221630-2-git-send-email-john.garry@huawei.com
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: hisi_sas: Modify v3 HW SSP underflow error processing
Xingui Yang [Thu, 24 Feb 2022 11:51:29 +0000 (19:51 +0800)]
scsi: hisi_sas: Modify v3 HW SSP underflow error processing

In case of SSP underflow allow the response frame IU to be examined for
setting the response stat value rather than always setting
SAS_DATA_UNDERRUN.

This will mean that we call sas_ssp_task_response() in those scenarios and
may send sense data to upper layer.

Such a condition would be for bad blocks were we just reporting an
underflow error to upper layer, but now the sense data will tell
immediately that the media is faulty.

Link: https://lore.kernel.org/r/1645703489-87194-7-git-send-email-john.garry@huawei.com
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: hisi_sas: Limit users changing debugfs BIST count value
Xiang Chen [Thu, 24 Feb 2022 11:51:28 +0000 (19:51 +0800)]
scsi: hisi_sas: Limit users changing debugfs BIST count value

Add a file operation for "cnt" file under bist directory, so users can only
read "cnt" or clear "cnt" to zero, but cannot randomly modify.

Link: https://lore.kernel.org/r/1645703489-87194-6-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: hisi_sas: Rename error labels in hisi_sas_v3_probe()
Qi Liu [Thu, 24 Feb 2022 11:51:27 +0000 (19:51 +0800)]
scsi: hisi_sas: Rename error labels in hisi_sas_v3_probe()

To avoid doubt, rename the error labels to indicate the action they will
take.

Link: https://lore.kernel.org/r/1645703489-87194-5-git-send-email-john.garry@huawei.com
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: hisi_sas: Free irq vectors in order for v3 HW
Qi Liu [Thu, 24 Feb 2022 11:51:26 +0000 (19:51 +0800)]
scsi: hisi_sas: Free irq vectors in order for v3 HW

If the driver probe fails to request the channel IRQ or fatal IRQ, the
driver will free the IRQ vectors before freeing the IRQs in free_irq(),
and this will cause a kernel BUG like this:

------------[ cut here ]------------
kernel BUG at drivers/pci/msi.c:369!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Call trace:
   free_msi_irqs+0x118/0x13c
   pci_disable_msi+0xfc/0x120
   pci_free_irq_vectors+0x24/0x3c
   hisi_sas_v3_probe+0x360/0x9d0 [hisi_sas_v3_hw]
   local_pci_probe+0x44/0xb0
   work_for_cpu_fn+0x20/0x34
   process_one_work+0x1d0/0x340
   worker_thread+0x2e0/0x460
   kthread+0x180/0x190
   ret_from_fork+0x10/0x20
---[ end trace b88990335b610c11 ]---

So we use devm_add_action() to control the order in which we free the
vectors.

Link: https://lore.kernel.org/r/1645703489-87194-4-git-send-email-john.garry@huawei.com
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: hisi_sas: Change hisi_sas_control_phy() phyup timeout
Xiang Chen [Thu, 24 Feb 2022 11:51:25 +0000 (19:51 +0800)]
scsi: hisi_sas: Change hisi_sas_control_phy() phyup timeout

The time of phyup not only depends on the controller but also the type of
disk connected. As an example, from experience, for some SATA disks the
amount of time from reset/power-on to receive the D2H FIS for phyup can
take upto and more than 10s sometimes. According to the specification of
some SATA disks such as ST14000NM0018, the max time from power-on to ready
is 30s.

Based on this the current timeout of phyup at 2s which is not enough. So
set the value as HISI_SAS_WAIT_PHYUP_TIMEOUT (30s) in
hisi_sas_control_phy().

For v3 hw there is a pre-existing workaround for a HW bug, being that we
issue a link reset when the OOB occurs but the phyup does not. The current
phyup timeout is HISI_SAS_WAIT_PHYUP_TIMEOUT. So if this does occur from
when issuing a phy enable or similar via hisi_sas_control_phy(), the
subsequent HW workaround linkreset processing calls hisi_sas_control_phy(),
but this will pend the original phy reset timing out, so it is safe.

Link: https://lore.kernel.org/r/1645703489-87194-3-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: hisi_sas: Change permission of parameter prot_mask
Xiang Chen [Thu, 24 Feb 2022 11:51:24 +0000 (19:51 +0800)]
scsi: hisi_sas: Change permission of parameter prot_mask

Currently the permission of parameter prot_mask is 0x0, which means that
the member does not appear in sysfs. Change it as other module parameters
to 0444 for world-readable.

[mkp: s/v3/v2/]

Link: https://lore.kernel.org/r/1645703489-87194-2-git-send-email-john.garry@huawei.com
Fixes: d6a9000b81be ("scsi: hisi_sas: Add support for DIF feature for v2 hw")
Reported-by: Yihang Li <liyihang6@hisilicon.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qla4xxx: Remove unneeded variable
Changcheng Deng [Thu, 24 Feb 2022 09:07:35 +0000 (09:07 +0000)]
scsi: qla4xxx: Remove unneeded variable

Remove unneeded variable used to store return value.

Link: https://lore.kernel.org/r/20220224090735.1967816-1-deng.changcheng@zte.com.cn
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Changcheng Deng <deng.changcheng@zte.com.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: aha152x: Fix aha152x_setup() __setup handler return value
Randy Dunlap [Wed, 23 Feb 2022 00:06:23 +0000 (16:06 -0800)]
scsi: aha152x: Fix aha152x_setup() __setup handler return value

__setup() handlers should return 1 if the command line option is handled
and 0 if not (or maybe never return 0; doing so just pollutes init's
environment with strings that are not init arguments/parameters).

Return 1 from aha152x_setup() to indicate that the boot option has been
handled.

Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Link: https://lore.kernel.org/r/20220223000623.5920-1-rdunlap@infradead.org
Cc: "Juergen E. Fischer" <fischer@norbit.de>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm80xx: Handle non-fatal errors
Ajish Koshy [Tue, 22 Feb 2022 09:26:18 +0000 (14:56 +0530)]
scsi: pm80xx: Handle non-fatal errors

Firmware expects host driver to clear scratchpad rsvd 0 register after
non-fatal error is found.

This is done when firmware raises fatal error interrupt and indicates
non-fatal error. At this point firmware updates scratchpad rsvd 0 register
with non-fatal error value. Here host has to clear the register after
reading it during non-fatal errors.

Rename:

 - MSGU_HOST_SCRATCH_PAD_6 to MSGU_SCRATCH_PAD_RSVD_0

 - MSGU_HOST_SCRATCH_PAD_7 to MSGU_SCRATCH_PAD_RSVD_1

Link: https://lore.kernel.org/r/20220222092618.108198-1-Ajish.Koshy@microchip.com
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Ajish Koshy <Ajish.Koshy@microchip.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: mac53c94: Stop using struct scsi_pointer
Finn Thain [Mon, 21 Feb 2022 23:09:42 +0000 (10:09 +1100)]
scsi: mac53c94: Stop using struct scsi_pointer

This driver doesn't use SCp.ptr to save a SCSI command data pointer which
means "scsi pointer" is a complete misnomer here. Only a few members of
struct scsi_pointer are used and the rest waste memory.  Avoid the "struct
foo { struct bar; };" silliness.

Link: https://lore.kernel.org/r/3529a59873a7de8455a27af2528341afe5069adc.1645484982.git.fthain@linux-m68k.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: mesh: Stop using struct scsi_pointer
Finn Thain [Mon, 21 Feb 2022 23:09:42 +0000 (10:09 +1100)]
scsi: mesh: Stop using struct scsi_pointer

This driver doesn't use SCp.ptr to save a SCSI command data pointer which
means "scsi pointer" is a complete misnomer here. Only a few members of
struct scsi_pointer are used and the rest waste memory.  Avoid the "struct
foo { struct bar; };" silliness.

Link: https://lore.kernel.org/r/fbf930e64af5b15ca028dfe25b00fe933951f19b.1645484982.git.fthain@linux-m68k.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: core: docs: Update notes about scsi_times_out
Khazhismel Kumykov [Sat, 19 Feb 2022 00:16:01 +0000 (16:16 -0800)]
scsi: core: docs: Update notes about scsi_times_out

Most importantly: eh_timed_out() is not limited by scmd->allowed, and can
reset timer forever.

Fixes: c829c394165f ("[SCSI] FC transport : Avoid device offline cases by stalling aborts until device unblocked")
Link: https://lore.kernel.org/r/20220219001601.3534043-1-khazhy@google.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: message: fusion: Use GFP_KERNEL instead of GFP_ATOMIC in non-atomic context
Christophe JAILLET [Tue, 15 Feb 2022 06:32:34 +0000 (07:32 +0100)]
scsi: message: fusion: Use GFP_KERNEL instead of GFP_ATOMIC in non-atomic context

Just a few lines below this kzalloc() we have a mutex_lock() which can
sleep.

Moreover, the only way to call this function is when a delayed work is
schedule. And delayed work can sleep:

  INIT_DELAYED_WORK(&fw_event->work, mptsas_firmware_event_work);
    --> mptsas_firmware_event_work()
      --> mptsas_send_link_status_event()
        --> mptsas_expander_add()

So there is really no good reason to use GFP_ATOMIC here. Change it to
GFP_KERNEL to give more opportunities to the kernel.

Link: https://lore.kernel.org/r/eccb2179ce800529851ed4fabc9d3f95fbbf7d7f.1644906731.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: libfc: Replace one-element arrays with flexible-array members
Gustavo A. R. Silva [Mon, 14 Feb 2022 22:39:03 +0000 (16:39 -0600)]
scsi: libfc: Replace one-element arrays with flexible-array members

Use flexible-array members in struct fc_fdmi_attr_entry and fs_fdmi_attrs
instead of one-element arrays, and refactor the code accordingly.

Also, this helps with the ongoing efforts to globally enable -Warray-bounds
and get us closer to being able to tighten the FORTIFY_SOURCE routines on
memcpy().

https://github.com/KSPP/linux/issues/79
https://github.com/ClangBuiltLinux/linux/issues/1590

Link: https://lore.kernel.org/r/20220214223903.GA859464@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Fix pm8001_info() message format
Damien Le Moal [Sun, 20 Feb 2022 03:18:10 +0000 (12:18 +0900)]
scsi: pm8001: Fix pm8001_info() message format

Make the driver messages more readable by adding a space after the message
prefix ":" and removing the extra space between function name and line
number.

Link: https://lore.kernel.org/r/20220220031810.738362-32-damien.lemoal@opensource.wdc.com
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Improve pm80XX_send_abort_all()
Damien Le Moal [Sun, 20 Feb 2022 03:18:09 +0000 (12:18 +0900)]
scsi: pm8001: Improve pm80XX_send_abort_all()

Both pm8001_send_abort_all() and pm80xx_send_abort_all() are called only
for a non null device with the NCQ_READ_LOG_FLAG set, so remove the device
check on entry of these functions. Furthermore, setting the
NCQ_ABORT_ALL_FLAG device id flag and clearing the NCQ_READ_LOG_FLAG is
always done before calling these functions. Move these operations inside
the functions.

Link: https://lore.kernel.org/r/20220220031810.738362-31-damien.lemoal@opensource.wdc.com
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Simplify pm8001_ccb_task_free()
Damien Le Moal [Sun, 20 Feb 2022 03:18:08 +0000 (12:18 +0900)]
scsi: pm8001: Simplify pm8001_ccb_task_free()

The task argument of the pm8001_ccb_task_free() function can be inferred
from the ccb argument ccb_task field. So there is no need to have this
argument. Likewise, the ccb_index argument is always equal to the ccb tag
field and is not needed either. Remove both arguments and update all call
sites. The pm8001_ccb_task_free_done() helper is also modified to match
this change.

Link: https://lore.kernel.org/r/20220220031810.738362-30-damien.lemoal@opensource.wdc.com
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Simplify pm8001_task_exec()
Damien Le Moal [Sun, 20 Feb 2022 03:18:07 +0000 (12:18 +0900)]
scsi: pm8001: Simplify pm8001_task_exec()

The main part of the pm8001_task_exec() function uses a do {} while(0) loop
that is useless and only makes the code harder to read. Remove this
loop. The unnecessary local variable t is also removed.

Additionally, avoid repeatedly declaring "struct task_status_struct *ts" to
handle error cases by declaring this variable for the entire function
scope. This allows simplifying the error cases, and together with the
addition of blank lines make the code more readable.

Finally, handling of the running_req counter is fixed to avoid decrementing
it without a corresponding incrementation in the case of an invalid task
protocol.

Link: https://lore.kernel.org/r/20220220031810.738362-29-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Simplify pm8001_mpi_build_cmd() interface
Damien Le Moal [Sun, 20 Feb 2022 03:18:06 +0000 (12:18 +0900)]
scsi: pm8001: Simplify pm8001_mpi_build_cmd() interface

There is no need to pass a pointer to a struct inbound_queue_table to
pm8001_mpi_build_cmd(). Passing the start index in the inbound queue table
of the adapter is enough. This change allows avoiding the declaration of a
struct inbound_queue_table pointer (circularQ variables) in many functions,
simplifying the code.

While at it, blank lines are added i(e.g. after local variable
declarations) to make the code more readable.

Link: https://lore.kernel.org/r/20220220031810.738362-28-damien.lemoal@opensource.wdc.com
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Introduce ccb alloc/free helpers
Damien Le Moal [Sun, 20 Feb 2022 03:18:05 +0000 (12:18 +0900)]
scsi: pm8001: Introduce ccb alloc/free helpers

Introduce the pm8001_ccb_alloc() and pm8001_ccb_free() helpers to replace
the typical code patterns:

res = pm8001_tag_alloc(pm8001_ha, &ccb_tag);
if (res)
...
ccb = &pm8001_ha->ccb_info[ccb_tag];
ccb->device = pm8001_ha_dev;
ccb->ccb_tag = ccb_tag;
ccb->task = task;
ccb->n_elem = 0;

and

ccb->task = NULL;
ccb->ccb_tag = PM8001_INVALID_TAG;
pm8001_tag_free(pm8001_ha, tag);

With the simpler function calls:

ccb = pm8001_ccb_alloc(pm8001_ha, pm8001_ha_dev, task);
if (!ccb)
...

and

pm8001_ccb_free(pm8001_ha, ccb);

The pm8001_ccb_alloc() helper ensures that all fields of the ccb info
structure for the newly allocated tag are all initialized, except the
buf_prd field. The pm8001_ccb_free() helper clears the initialized fields
and the ccb tag to ensure that iteration over the adapter ccb_info array
detects ccbs that are in use.

All call site of the pm8001_tag_alloc() function that use a ccb info
associated with an allocated tag are converted to use the new helpers.

Link: https://lore.kernel.org/r/20220220031810.738362-27-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Simplify pm8001_get_ncq_tag()
Damien Le Moal [Sun, 20 Feb 2022 03:18:04 +0000 (12:18 +0900)]
scsi: pm8001: Simplify pm8001_get_ncq_tag()

To detect if a command is NCQ, there is no need to test all possible NCQ
command codes. Instead, use ata_is_ncq() to test the command protocol.

Link: https://lore.kernel.org/r/20220220031810.738362-26-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Cleanup pm8001_exec_internal_task_abort()
Damien Le Moal [Sun, 20 Feb 2022 03:18:03 +0000 (12:18 +0900)]
scsi: pm8001: Cleanup pm8001_exec_internal_task_abort()

Replace the goto statement in the for loop with "break" and remove the
ex_err label. Also fix long lines, identation and blank lines to make the
code more readable.

Link: https://lore.kernel.org/r/20220220031810.738362-25-damien.lemoal@opensource.wdc.com
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: libsas: Simplify sas_ata_qc_issue() detection of NCQ commands
Damien Le Moal [Sun, 20 Feb 2022 03:18:02 +0000 (12:18 +0900)]
scsi: libsas: Simplify sas_ata_qc_issue() detection of NCQ commands

To detect if a command is NCQ, there is no need to test all possible NCQ
command codes. Instead, use ata_is_ncq() to test the command protocol.

Link: https://lore.kernel.org/r/20220220031810.738362-24-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Fix memory leak in pm8001_chip_fw_flash_update_req()
Damien Le Moal [Sun, 20 Feb 2022 03:18:01 +0000 (12:18 +0900)]
scsi: pm8001: Fix memory leak in pm8001_chip_fw_flash_update_req()

In pm8001_chip_fw_flash_update_build(), if
pm8001_chip_fw_flash_update_build() fails, the struct fw_control_ex
allocated must be freed.

Link: https://lore.kernel.org/r/20220220031810.738362-23-damien.lemoal@opensource.wdc.com
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Fix tag leaks on error
Damien Le Moal [Sun, 20 Feb 2022 03:18:00 +0000 (12:18 +0900)]
scsi: pm8001: Fix tag leaks on error

In pm8001_chip_set_dev_state_req(), pm8001_chip_fw_flash_update_req(),
pm80xx_chip_phy_ctl_req() and pm8001_chip_reg_dev_req() add missing calls
to pm8001_tag_free() to free the allocated tag when pm8001_mpi_build_cmd()
fails.

Similarly, in pm8001_exec_internal_task_abort(), if the chip ->task_abort
method fails, the tag allocated for the abort request task must be
freed. Add the missing call to pm8001_tag_free().

Link: https://lore.kernel.org/r/20220220031810.738362-22-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Fix task leak in pm8001_send_abort_all()
Damien Le Moal [Sun, 20 Feb 2022 03:17:59 +0000 (12:17 +0900)]
scsi: pm8001: Fix task leak in pm8001_send_abort_all()

In pm8001_send_abort_all(), make sure to free the allocated sas task
if pm8001_tag_alloc() or pm8001_mpi_build_cmd() fail.

Link: https://lore.kernel.org/r/20220220031810.738362-21-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Fix tag values handling
Damien Le Moal [Sun, 20 Feb 2022 03:17:58 +0000 (12:17 +0900)]
scsi: pm8001: Fix tag values handling

The function pm8001_tag_alloc() determines free tags using the function
find_first_zero_bit() which can return 0 when the first bit of the bitmap
being inspected is 0. As such, tag 0 is a valid tag value that should not
be dismissed as invalid. Fix the functions pm8001_work_fn(),
mpi_sata_completion(), pm8001_mpi_task_abort_resp() and
pm8001_open_reject_retry() to not dismiss 0 tags as invalid.

The value 0xffffffff is used for invalid tags for unused ccb information
structures. Add the macro definition PM8001_INVALID_TAG to define this
value.

Link: https://lore.kernel.org/r/20220220031810.738362-20-damien.lemoal@opensource.wdc.com
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Fix pm8001_mpi_task_abort_resp()
Damien Le Moal [Sun, 20 Feb 2022 03:17:57 +0000 (12:17 +0900)]
scsi: pm8001: Fix pm8001_mpi_task_abort_resp()

The call to pm8001_ccb_task_free() at the end of
pm8001_mpi_task_abort_resp() already frees the ccb tag. So when the device
NCQ_ABORT_ALL_FLAG is set, the tag should not be freed again.  Also change
the hardcoded 0xBFFFFFFF value to ~NCQ_ABORT_ALL_FLAG as it ought to be.

Link: https://lore.kernel.org/r/20220220031810.738362-19-damien.lemoal@opensource.wdc.com
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Fix pm8001_tag_alloc() failures handling
Damien Le Moal [Sun, 20 Feb 2022 03:17:56 +0000 (12:17 +0900)]
scsi: pm8001: Fix pm8001_tag_alloc() failures handling

In mpi_set_phy_profile_req() and in pm8001_set_phy_profile_single(), if
pm8001_tag_alloc() fails to allocate a new tag, a warning message is issued
but the uninitialized tag variable is still used to build a command. Avoid
this by returning early in case of tag allocation failure.

Also make sure to always return the error code returned by
pm8001_tag_alloc() when this function fails instead of an arbitrary value.

Link: https://lore.kernel.org/r/20220220031810.738362-18-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Fix abort all task initialization
Damien Le Moal [Sun, 20 Feb 2022 03:17:55 +0000 (12:17 +0900)]
scsi: pm8001: Fix abort all task initialization

In pm80xx_send_abort_all(), the n_elem field of the ccb used is not
initialized to 0. This missing initialization sometimes lead to the task
completion path seeing the ccb with a non-zero n_elem resulting in the
execution of invalid dma_unmap_sg() calls in pm8001_ccb_task_free(),
causing a crash such as:

[  197.676341] RIP: 0010:iommu_dma_unmap_sg+0x6d/0x280
[  197.700204] RSP: 0018:ffff889bbcf89c88 EFLAGS: 00010012
[  197.705485] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff83d0bda0
[  197.712687] RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff88810dffc0d0
[  197.719887] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff8881c790098b
[  197.727089] R10: ffffed1038f20131 R11: 0000000000000001 R12: 0000000000000000
[  197.734296] R13: ffff88810dffc0d0 R14: 0000000000000010 R15: 0000000000000000
[  197.741493] FS:  0000000000000000(0000) GS:ffff889bbcf80000(0000) knlGS:0000000000000000
[  197.749659] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  197.755459] CR2: 00007f16c1b42734 CR3: 0000000004814000 CR4: 0000000000350ee0
[  197.762656] Call Trace:
[  197.765127]  <IRQ>
[  197.767162]  pm8001_ccb_task_free+0x5f1/0x820 [pm80xx]
[  197.772364]  ? do_raw_spin_unlock+0x54/0x220
[  197.776680]  pm8001_mpi_task_abort_resp+0x2ce/0x4f0 [pm80xx]
[  197.782406]  process_oq+0xe85/0x7890 [pm80xx]
[  197.786817]  ? lock_acquire+0x194/0x490
[  197.790697]  ? handle_irq_event+0x10e/0x1b0
[  197.794920]  ? mpi_sata_completion+0x2d70/0x2d70 [pm80xx]
[  197.800378]  ? __wake_up_bit+0x100/0x100
[  197.804340]  ? lock_is_held_type+0x98/0x110
[  197.808565]  pm80xx_chip_isr+0x94/0x130 [pm80xx]
[  197.813243]  tasklet_action_common.constprop.0+0x24b/0x2f0
[  197.818785]  __do_softirq+0x1b5/0x82d
[  197.822485]  ? do_raw_spin_unlock+0x54/0x220
[  197.826799]  __irq_exit_rcu+0x17e/0x1e0
[  197.830678]  irq_exit_rcu+0xa/0x20
[  197.834114]  common_interrupt+0x78/0x90
[  197.840051]  </IRQ>
[  197.844236]  <TASK>
[  197.848397]  asm_common_interrupt+0x1e/0x40

Avoid this issue by always initializing the ccb n_elem field to 0 in
pm8001_send_abort_all(), pm8001_send_read_log() and
pm80xx_send_abort_all().

Link: https://lore.kernel.org/r/20220220031810.738362-17-damien.lemoal@opensource.wdc.com
Fixes: c6b9ef5779c3 ("[SCSI] pm80xx: NCQ error handling changes")
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Fix NCQ NON DATA command completion handling
Damien Le Moal [Sun, 20 Feb 2022 03:17:54 +0000 (12:17 +0900)]
scsi: pm8001: Fix NCQ NON DATA command completion handling

NCQ NON DATA is an NCQ command with the DMA_NONE DMA direction and so a
register-device-to-host-FIS response is expected for it.

However, for an IO_SUCCESS case, mpi_sata_completion() expects a
set-device-bits-FIS for any ata task with an use_ncq field true, which
includes NCQ NON DATA commands.

Fix this to correctly treat NCQ NON DATA commands as non-data by also
testing for the DMA_NONE DMA direction.

Link: https://lore.kernel.org/r/20220220031810.738362-16-damien.lemoal@opensource.wdc.com
Fixes: dbf9bfe61571 ("[SCSI] pm8001: add SAS/SATA HBA driver")
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Fix NCQ NON DATA command task initialization
Damien Le Moal [Sun, 20 Feb 2022 03:17:53 +0000 (12:17 +0900)]
scsi: pm8001: Fix NCQ NON DATA command task initialization

In the pm8001_chip_sata_req() and pm80xx_chip_sata_req() functions, all
tasks with a DMA direction of DMA_NONE (no data transfer) are initialized
using the ATAP value 0x04. However, NCQ NON DATA commands, while being
DMA_NONE commands are NCQ commands and need to be initialized using the
value 0x07 for ATAP, similarly to other NCQ commands.

Make sure that NCQ NON DATA command tasks are initialized similarly to
other NCQ commands by also testing the task "use_ncq" field in addition to
the DMA direction. While at it, reorganize the code into a chain of if -
else if - else to avoid useless affectations and debug messages.

Link: https://lore.kernel.org/r/20220220031810.738362-15-damien.lemoal@opensource.wdc.com
Fixes: dbf9bfe61571 ("[SCSI] pm8001: add SAS/SATA HBA driver")
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Remove local variable in pm8001_pci_resume()
Damien Le Moal [Sun, 20 Feb 2022 03:17:52 +0000 (12:17 +0900)]
scsi: pm8001: Remove local variable in pm8001_pci_resume()

In pm8001_pci_resume(), the use of the u32 type for the local variable
device_state causes a sparse warning:

warning: incorrect type in assignment (different base types)
    expected unsigned int [usertype] device_state
    got restricted pci_power_t [usertype] current_state

Since this variable is used only once in the function, remove it and use
pdev->current_state directly. While at it, also add a blank line after the
last local variable declaration.

Link: https://lore.kernel.org/r/20220220031810.738362-14-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Fix use of struct set_phy_profile_req fields
Damien Le Moal [Sun, 20 Feb 2022 03:17:51 +0000 (12:17 +0900)]
scsi: pm8001: Fix use of struct set_phy_profile_req fields

In mpi_set_phy_profile_req() and pm8001_set_phy_profile_single(), use
cpu_to_le32() to initialize the ppc_phyid field of struct
set_phy_profile_req. Furthermore, fix the definition of the reserved field
of this structure to be an array of __le32, to match the use of
cpu_to_le32() when assigning values. These changes remove several sparse
type warnings.

Link: https://lore.kernel.org/r/20220220031810.738362-13-damien.lemoal@opensource.wdc.com
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Fix le32 values handling in pm80xx_chip_sata_req()
Damien Le Moal [Sun, 20 Feb 2022 03:17:50 +0000 (12:17 +0900)]
scsi: pm8001: Fix le32 values handling in pm80xx_chip_sata_req()

Make sure that the __le32 fields of struct sata_cmd are manipulated after
applying the correct endian conversion. That is, use cpu_to_le32() for
assigning values and le32_to_cpu() for consulting a field value.  In
particular, make sure that the calculations for the 4G boundary check are
done using CPU endianness and *not* little endian values. With these fixes,
many sparse warnings are removed.

While at it, fix some code identation and add blank lines after variable
declarations and in some other places to make this code more readable.

Link: https://lore.kernel.org/r/20220220031810.738362-12-damien.lemoal@opensource.wdc.com
Fixes: 0ecdf00ba6e5 ("[SCSI] pm80xx: 4G boundary fix.")
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>