]> www.infradead.org Git - users/willy/xarray.git/log
users/willy/xarray.git
3 months agoMerge patch series "ufs: host: mediatek: Provide features and fixes in MediaTek platf...
Martin K. Petersen [Fri, 25 Jul 2025 02:25:40 +0000 (22:25 -0400)]
Merge patch series "ufs: host: mediatek: Provide features and fixes in MediaTek platforms"

peter.wang@mediatek.com says:

This series fixes some defects and provide features in MediaTek UFS drivers.

Link: https://lore.kernel.org/r/20250722030841.1998783-1-peter.wang@mediatek.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ufs: host: mediatek: Support FDE (AES) clock scaling
Peter Wang [Tue, 22 Jul 2025 03:07:24 +0000 (11:07 +0800)]
scsi: ufs: host: mediatek: Support FDE (AES) clock scaling

Add support for scaling the FDE (AES) clock to achieve higher
performance, particularly for HS-G5:

 1. Parse DTS settings for FDE min/max mux.

 2. Scale up the FDE clock when required for enhanced performance.

These changes ensure that the FDE clock can be dynamically adjusted based
on performance needs, leveraging DTS configurations.

Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-10-peter.wang@mediatek.com
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ufs: host: mediatek: Support clock scaling with Vcore binding
Peter Wang [Tue, 22 Jul 2025 03:07:23 +0000 (11:07 +0800)]
scsi: ufs: host: mediatek: Support clock scaling with Vcore binding

Add support for clock scaling with Vcore binding:

 1. Parse the DTS setting for Vcore voltage.

 2. Set the Vcore voltage to the DTS-specified value before scaling up.

 3. Reset the Vcore voltage to the default setting after scaling down.

These changes ensure that the Vcore voltage is appropriately managed
during clock scaling operations to maintain system stability and
performance.

Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-9-peter.wang@mediatek.com
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ufs: host: mediatek: Add clock scaling query function
Peter Wang [Tue, 22 Jul 2025 03:07:22 +0000 (11:07 +0800)]
scsi: ufs: host: mediatek: Add clock scaling query function

Introduce a clock scaling readiness query function to streamline the
process of checking clock scaling parameters.  This function simplifies
the code by encapsulating the logic for determining if clock scaling is
ready.

Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-8-peter.wang@mediatek.com
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ufs: host: mediatek: Add more UFSCHI hardware versions
Alice Chao [Tue, 22 Jul 2025 03:07:21 +0000 (11:07 +0800)]
scsi: ufs: host: mediatek: Add more UFSCHI hardware versions

Introduce a function for version control to distinguish between new and
old platforms. Update the handling of hardware IP versions, ensuring
correct version comparisons by adjusting the version format for specific
projects.

Signed-off-by: Alice Chao <alice.chao@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-7-peter.wang@mediatek.com
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ufs: host: mediatek: Set IRQ affinity policy for MCQ mode
Peter Wang [Tue, 22 Jul 2025 03:07:20 +0000 (11:07 +0800)]
scsi: ufs: host: mediatek: Set IRQ affinity policy for MCQ mode

Set the IRQ affinity for MCQ mode to improve performance. Specifically,
it migrates the IRQ from CPU0 to CPU3 to enhance IRQ handling efficiency.

Setting IRQ affinity directly from the kernel allows the configuration to
take effect earlier, and provides greater security and consistency,
especially important for systems with strict performanceor real-time
requirements.

Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-6-peter.wang@mediatek.com
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ufs: host: mediatek: Handle broken RTC based on DTS setting
Peter Wang [Tue, 22 Jul 2025 03:07:19 +0000 (11:07 +0800)]
scsi: ufs: host: mediatek: Handle broken RTC based on DTS setting

Introduce a mechanism to handle broken RTC by checking the DTS
setting. The configuration is specifically required for legacy platform.

Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-5-peter.wang@mediatek.com
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ufs: host: mediatek: Change ref-clk timeout policy
Peter Wang [Tue, 22 Jul 2025 03:07:18 +0000 (11:07 +0800)]
scsi: ufs: host: mediatek: Change ref-clk timeout policy

Update the timeout policy for ref-clk control.

 - If a clock-on operation times out, it is assumed that the clock is
   off. The system will notify TFA to perform clock-off settings.

 - If a clock-off operation times out, it is assumed that the clock will
   eventually turn off. The 'ref_clk_enabled' flag is set directly.

Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-4-peter.wang@mediatek.com
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ufs: host: mediatek: Add DDR_EN setting
Naomi Chu [Tue, 22 Jul 2025 03:07:17 +0000 (11:07 +0800)]
scsi: ufs: host: mediatek: Add DDR_EN setting

On MT6989 and later platforms, control of DDR_EN has been switched from
SPM to EMI. To prevent abnormal access to DRAM, it is necessary to wait
for 'ddren_ack' or assert 'ddren_urgent' after sending 'ddren_req'.

Introduce the DDR_EN configuration in the UFS initialization flow,
utilizing the assertion of 'ddren_urgent' to maintain performance.

Signed-off-by: Naomi Chu <naomi.chu@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-3-peter.wang@mediatek.com
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ufs: host: mediatek: Simplify boolean conversion
Peter Wang [Tue, 22 Jul 2025 03:07:16 +0000 (11:07 +0800)]
scsi: ufs: host: mediatek: Simplify boolean conversion

Simplify the conversion from unsigned int to boolean by removing explicit
conversions and parentheses, relying on implicit conversion instead. This
change ensures consistency with other usages in ufs-mediatek.c and
streamlines the code.

Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-2-peter.wang@mediatek.com
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ufs: core: Use str_true_false() helper in UFS_FLAG()
Liu Song [Mon, 21 Jul 2025 12:01:38 +0000 (20:01 +0800)]
scsi: ufs: core: Use str_true_false() helper in UFS_FLAG()

Remove hard-coded strings by using the str_true_false() helper function.

Signed-off-by: Liu Song <liu.song13@zte.com.cn>
Link: https://lore.kernel.org/r/20250721200138431dOU9KyajGyGi5339ma26p@zte.com.cn
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: Fix sas_user_scan() to handle wildcard and multi-channel scans
Ranjan Kumar [Tue, 24 Jun 2025 06:16:49 +0000 (11:46 +0530)]
scsi: Fix sas_user_scan() to handle wildcard and multi-channel scans

sas_user_scan() did not fully process wildcard channel scans
(SCAN_WILD_CARD) when a transport-specific user_scan() callback was
present. Only channel 0 would be scanned via user_scan(), while the
remaining channels were skipped, potentially missing devices.

user_scan() invokes updated sas_user_scan() for channel 0, and if
successful, iteratively scans remaining channels (1 to
shost->max_channel) via scsi_scan_host_selected().  This ensures complete
wildcard scanning without affecting transport-specific scanning behavior.

Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20250624061649.17990-1-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: target: core: Generate correct identifiers for PR OUT transport IDs
Maurizio Lombardi [Mon, 14 Jul 2025 13:37:38 +0000 (15:37 +0200)]
scsi: target: core: Generate correct identifiers for PR OUT transport IDs

Fix target_parse_pr_out_transport_id() to return a string representing
the transport ID in a human-readable format (e.g., naa.xxxxxxxx...)  for
various SCSI protocol types (SAS, FCP, SRP, SBP).

Previously, the function returned a pointer to the raw binary buffer,
which was incorrectly compared against human-readable strings, causing
comparisons to fail.  Now, the function writes a properly formatted
string into a buffer provided by the caller.  The output format depends
on the transport protocol:

* SAS: 64-bit identifier, "naa." prefix.
* FCP: 64-bit identifier, colon separated values.
* SBP: 64-bit identifier, no prefix.
* SRP: 128-bit identifier, "0x" prefix.
* iSCSI: IQN string.

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Link: https://lore.kernel.org/r/20250714133738.11054-1-mlombard@redhat.com
Reviewed-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: MAINTAINERS: Update hisi_sas entry
Yihang Li [Wed, 2 Jul 2025 01:24:23 +0000 (09:24 +0800)]
scsi: MAINTAINERS: Update hisi_sas entry

liyihang9@huawei.com no longer works. So update information for hisi_sas.

Signed-off-by: Yihang Li <liyihang9@h-partners.com>
Link: https://lore.kernel.org/r/20250702012423.1947238-1-liyihang9@h-partners.com
Acked-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: target: iblock: Allow iblock devices to be shared
Mike Christie [Mon, 21 Jul 2025 18:51:45 +0000 (13:51 -0500)]
scsi: target: iblock: Allow iblock devices to be shared

We might be running a local application that also interacts with the
backing device. In this setup we have some clustering type of software
that manages the ownwer of it, so we don't want the kernel to restrict
us. This patch allows the user to control if the driver gets exclusive
access.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20250721185145.20913-1-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ufs: core: Use link recovery when h8 exit fails during runtime resume
Seunghui Lee [Thu, 17 Jul 2025 08:12:13 +0000 (17:12 +0900)]
scsi: ufs: core: Use link recovery when h8 exit fails during runtime resume

If the h8 exit fails during runtime resume process, the runtime thread
enters runtime suspend immediately and the error handler operates at the
same time.  It becomes stuck and cannot be recovered through the error
handler.  To fix this, use link recovery instead of the error handler.

Fixes: 4db7a2360597 ("scsi: ufs: Fix concurrency of error handler and other error recovery paths")
Signed-off-by: Seunghui Lee <sh043.lee@samsung.com>
Link: https://lore.kernel.org/r/20250717081213.6811-1-sh043.lee@samsung.com
Reviewed-by: Bean Huo <beanhuo@micron.com>
Acked-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: Revert "scsi: iscsi: Fix HW conn removal use after free"
Li Lingfeng [Tue, 15 Jul 2025 07:39:26 +0000 (15:39 +0800)]
scsi: Revert "scsi: iscsi: Fix HW conn removal use after free"

This reverts commit c577ab7ba5f3bf9062db8a58b6e89d4fe370447e.

The invocation of iscsi_put_conn() in iscsi_iter_destory_conn_fn() is
used to free the initial reference counter of iscsi_cls_conn.  For
non-qla4xxx cases, the ->destroy_conn() callback (e.g.,
iscsi_conn_teardown) will call iscsi_remove_conn() and iscsi_put_conn()
to remove the connection from the children list of session and free the
connection at last.  However for qla4xxx, it is not the case. The
->destroy_conn() callback of qla4xxx will keep the connection in the
session conn_list and doesn't use iscsi_put_conn() to free the initial
reference counter. Therefore, it seems necessary to keep the
iscsi_put_conn() in the iscsi_iter_destroy_conn_fn(), otherwise, there
will be memory leak problem.

Link: https://lore.kernel.org/all/88334658-072b-4b90-a949-9c74ef93cfd1@huawei.com/
Fixes: c577ab7ba5f3 ("scsi: iscsi: Fix HW conn removal use after free")
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Link: https://lore.kernel.org/r/20250715073926.3529456-1-lilingfeng3@huawei.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: aacraid: Stop using PCI_IRQ_AFFINITY
John Garry [Tue, 15 Jul 2025 11:15:35 +0000 (11:15 +0000)]
scsi: aacraid: Stop using PCI_IRQ_AFFINITY

When PCI_IRQ_AFFINITY is set for calling pci_alloc_irq_vectors(), it
means interrupts are spread around the available CPUs. It also means that
the interrupts become managed, which means that an interrupt is shutdown
when all the CPUs in the interrupt affinity mask go offline.

Using managed interrupts in this way means that we should ensure that
completions should not occur on HW queues where the associated interrupt
is shutdown. This is typically achieved by ensuring only CPUs which are
online can generate IO completion traffic to the HW queue which they are
mapped to (so that they can also serve completion interrupts for that HW
queue).

The problem in the driver is that a CPU can generate completions to a HW
queue whose interrupt may be shutdown, as the CPUs in the HW queue
interrupt affinity mask may be offline. This can cause IOs to never
complete and hang the system. The driver maintains its own CPU <-> HW
queue mapping for submissions, see aac_fib_vector_assign(), but this does
not reflect the CPU <-> HW queue interrupt affinity mapping.

Commit 9dc704dcc09e ("scsi: aacraid: Reply queue mapping to CPUs based on
IRQ affinity") tried to remedy this issue may mapping CPUs properly to HW
queue interrupts. However this was later reverted in commit c5becf57dd56
("Revert "scsi: aacraid: Reply queue mapping to CPUs based on IRQ
affinity") - it seems that there were other reports of hangs. I guess
that this was due to some implementation issue in the original commit or
maybe a HW issue.

Fix the very original hang by just not using managed interrupts by not
setting PCI_IRQ_AFFINITY.  In this way, all CPUs will be in each HW queue
affinity mask, so should not create completion problems if any CPUs go
offline.

Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20250715111535.499853-1-john.g.garry@oracle.com
Closes: https://lore.kernel.org/linux-scsi/20250618192427.3845724-1-jmeneghi@redhat.com/
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Tested-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ufs: qcom: Drop dead compile guard
Konrad Dybcio [Thu, 24 Jul 2025 12:23:52 +0000 (14:23 +0200)]
scsi: ufs: qcom: Drop dead compile guard

SCSI_UFSHCD already selects DEVFREQ_GOV_SIMPLE_ONDEMAND, drop the check.

Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250724-topic-ufs_compile_check-v1-1-5ba9e99dbd52@oss.qualcomm.com
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: mpt3sas: Fix a fw_event memory leak
Tomas Henzl [Wed, 23 Jul 2025 15:30:18 +0000 (17:30 +0200)]
scsi: mpt3sas: Fix a fw_event memory leak

In _mpt3sas_fw_work() the fw_event reference is removed, it should also
be freed in all cases.

Fixes: 4318c7347847 ("scsi: mpt3sas: Handle NVMe PCIe device related events generated from firmware.")
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Link: https://lore.kernel.org/r/20250723153018.50518-1-thenzl@redhat.com
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: libiscsi: Initialize iscsi_conn->dd_data only if memory is allocated
Showrya M N [Fri, 27 Jun 2025 11:23:29 +0000 (16:53 +0530)]
scsi: libiscsi: Initialize iscsi_conn->dd_data only if memory is allocated

In case of an ib_fast_reg_mr allocation failure during iSER setup, the
machine hits a panic because iscsi_conn->dd_data is initialized
unconditionally, even when no memory is allocated (dd_size == 0).  This
leads invalid pointer dereference during connection teardown.

Fix by setting iscsi_conn->dd_data only if memory is actually allocated.

Panic trace:
------------
 iser: iser_create_fastreg_desc: Failed to allocate ib_fast_reg_mr err=-12
 iser: iser_alloc_rx_descriptors: failed allocating rx descriptors / data buffers
 BUG: unable to handle page fault for address: fffffffffffffff8
 RIP: 0010:swake_up_locked.part.5+0xa/0x40
 Call Trace:
  complete+0x31/0x40
  iscsi_iser_conn_stop+0x88/0xb0 [ib_iser]
  iscsi_stop_conn+0x66/0xc0 [scsi_transport_iscsi]
  iscsi_if_stop_conn+0x14a/0x150 [scsi_transport_iscsi]
  iscsi_if_rx+0x1135/0x1834 [scsi_transport_iscsi]
  ? netlink_lookup+0x12f/0x1b0
  ? netlink_deliver_tap+0x2c/0x200
  netlink_unicast+0x1ab/0x280
  netlink_sendmsg+0x257/0x4f0
  ? _copy_from_user+0x29/0x60
  sock_sendmsg+0x5f/0x70

Signed-off-by: Showrya M N <showrya@chelsio.com>
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Link: https://lore.kernel.org/r/20250627112329.19763-1-showrya@chelsio.com
Reviewed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: scsi_transport_fc: Add comments to describe added 'rport' parameter
Ewan D. Milne [Mon, 21 Jul 2025 16:46:52 +0000 (12:46 -0400)]
scsi: scsi_transport_fc: Add comments to describe added 'rport' parameter

Note that there is no executable code altered by this patch.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202507181446.aAoFiDm5-lkp@intel.com/
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Link: https://lore.kernel.org/r/20250721164652.335716-1-emilne@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: bfa: Double-free fix
jackysliu [Tue, 24 Jun 2025 11:58:24 +0000 (19:58 +0800)]
scsi: bfa: Double-free fix

When the bfad_im_probe() function fails during initialization, the memory
pointed to by bfad->im is freed without setting bfad->im to NULL.

Subsequently, during driver uninstallation, when the state machine enters
the bfad_sm_stopping state and calls the bfad_im_probe_undo() function,
it attempts to free the memory pointed to by bfad->im again, thereby
triggering a double-free vulnerability.

Set bfad->im to NULL if probing fails.

Signed-off-by: jackysliu <1972843537@qq.com>
Link: https://lore.kernel.org/r/tencent_3BB950D6D2D470976F55FC879206DE0B9A09@qq.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoMerge patch series "ufs: ufs-qcom: Align programming sequence as per HW spec"
Martin K. Petersen [Tue, 15 Jul 2025 01:00:41 +0000 (21:00 -0400)]
Merge patch series "ufs: ufs-qcom: Align programming sequence as per HW spec"

Nitin Rawat <quic_nitirawa@quicinc.com> says:

This patch series adds programming support for Qualcomm UFS
to align with Hardware Specification.

In this patch series below changes are taken care.

1. Enable QUnipro Internal Clock Gating
2. Update esi_vec_mask for HW major version >= 6

Link: https://lore.kernel.org/r/20250714075336.2133-1-quic_nitirawa@quicinc.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: isci: Fix dma_unmap_sg() nents value
Thomas Fourier [Fri, 27 Jun 2025 14:24:47 +0000 (16:24 +0200)]
scsi: isci: Fix dma_unmap_sg() nents value

The dma_unmap_sg() functions should be called with the same nents as the
dma_map_sg(), not the value the map function returned.

Fixes: ddcc7e347a89 ("isci: fix dma_unmap_sg usage")
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Link: https://lore.kernel.org/r/20250627142451.241713-2-fourier.thomas@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: mvsas: Fix dma_unmap_sg() nents value
Thomas Fourier [Fri, 27 Jun 2025 13:48:18 +0000 (15:48 +0200)]
scsi: mvsas: Fix dma_unmap_sg() nents value

The dma_unmap_sg() functions should be called with the same nents as the
dma_map_sg(), not the value the map function returned.

Fixes: b5762948263d ("[SCSI] mvsas: Add Marvell 6440 SAS/SATA driver")
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Link: https://lore.kernel.org/r/20250627134822.234813-2-fourier.thomas@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: elx: efct: Fix dma_unmap_sg() nents value
Thomas Fourier [Fri, 27 Jun 2025 11:41:13 +0000 (13:41 +0200)]
scsi: elx: efct: Fix dma_unmap_sg() nents value

The dma_unmap_sg() functions should be called with the same nents as the
dma_map_sg(), not the value the map function returned.

Fixes: 692e5d73a811 ("scsi: elx: efct: LIO backend interface routines")
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Link: https://lore.kernel.org/r/20250627114117.188480-2-fourier.thomas@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: scsi_transport_fc: Change to use per-rport devloss_work_q
Ewan D. Milne [Mon, 7 Jul 2025 20:22:25 +0000 (16:22 -0400)]
scsi: scsi_transport_fc: Change to use per-rport devloss_work_q

Configurations with large numbers of FC rports per host instance are
taking a very long time to complete all devloss work.  Increase potential
parallelism by using a per-rport devloss_work_q for dev_loss_work and
fast_io_fail_work.

Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Link: https://lore.kernel.org/r/20250707202225.1203189-1-emilne@redhat.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ufs: exynos: Fix programming of HCI_UTRL_NEXUS_TYPE
André Draszik [Mon, 7 Jul 2025 17:05:27 +0000 (18:05 +0100)]
scsi: ufs: exynos: Fix programming of HCI_UTRL_NEXUS_TYPE

On Google gs101, the number of UTP transfer request slots (nutrs) is 32,
and in this case the driver ends up programming the UTRL_NEXUS_TYPE
incorrectly as 0.

This is because the left hand side of the shift is 1, which is of type
int, i.e. 31 bits wide. Shifting by more than that width results in
undefined behaviour.

Fix this by switching to the BIT() macro, which applies correct type
casting as required. This ensures the correct value is written to
UTRL_NEXUS_TYPE (0xffffffff on gs101), and it also fixes a UBSAN shift
warning:

    UBSAN: shift-out-of-bounds in drivers/ufs/host/ufs-exynos.c:1113:21
    shift exponent 32 is too large for 32-bit type 'int'

For consistency, apply the same change to the nutmrs / UTMRL_NEXUS_TYPE
write.

Fixes: 55f4b1f73631 ("scsi: ufs: ufs-exynos: Add UFS host support for Exynos SoCs")
Cc: stable@vger.kernel.org
Signed-off-by: André Draszik <andre.draszik@linaro.org>
Link: https://lore.kernel.org/r/20250707-ufs-exynos-shift-v1-1-1418e161ae40@linaro.org
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Peter Griffin <peter.griffin@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: core: Fix kernel doc for scsi_track_queue_full()
Bagas Sanjaya [Wed, 2 Jul 2025 03:58:23 +0000 (10:58 +0700)]
scsi: core: Fix kernel doc for scsi_track_queue_full()

Sphinx reports indentation warning on scsi_track_queue_full() return
values:

Documentation/driver-api/scsi:101: ./drivers/scsi/scsi.c:247: ERROR: Unexpected indentation. [docutils]

Fix the warning by making the return values listing a bullet list.

Fixes: eb44820c28bc ("[SCSI] Add Documentation and integrate into docbook build")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20250702035822.18072-2-bagasdotme@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ibmvscsi_tgt: Fix dma_unmap_sg() nents value
Thomas Fourier [Mon, 30 Jun 2025 11:18:02 +0000 (13:18 +0200)]
scsi: ibmvscsi_tgt: Fix dma_unmap_sg() nents value

The dma_unmap_sg() functions should be called with the same nents as the
dma_map_sg(), not the value the map function returned.

Fixes: 88a678bbc34c ("ibmvscsis: Initial commit of IBM VSCSI Tgt Driver")
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Link: https://lore.kernel.org/r/20250630111803.94389-2-fourier.thomas@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ibmvscsi_tgt: Fix typo in comment
Ankit Dange [Sat, 28 Jun 2025 12:53:20 +0000 (18:23 +0530)]
scsi: ibmvscsi_tgt: Fix typo in comment

Correct the misspelling of "transitition" to "transition" in a comment
in ibmvscsi_tgt.c for clarity.

Signed-off-by: Ankit Dange <ankitdange37@gmail.com>
Link: https://lore.kernel.org/r/20250628125320.295824-1-ankitdange37@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoMerge patch series "mpi3mr: Few minor bug fixes"
Martin K. Petersen [Tue, 15 Jul 2025 00:56:16 +0000 (20:56 -0400)]
Merge patch series "mpi3mr: Few minor bug fixes"

Ranjan Kumar <ranjan.kumar@broadcom.com> says:

Few minor fixes of mpi3mr driver.

Link: https://lore.kernel.org/r/20250627194539.48851-1-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: mpi3mr: Update driver version to 8.14.0.5.50
Ranjan Kumar [Fri, 27 Jun 2025 19:45:39 +0000 (01:15 +0530)]
scsi: mpi3mr: Update driver version to 8.14.0.5.50

Updated driver version to 8.14.0.5.50

Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20250627194539.48851-5-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: mpi3mr: Serialize admin queue BAR writes on 32-bit systems
Ranjan Kumar [Fri, 27 Jun 2025 19:45:38 +0000 (01:15 +0530)]
scsi: mpi3mr: Serialize admin queue BAR writes on 32-bit systems

On 32-bit systems, 64-bit BAR writes to admin queue registers are
performed as two 32-bit writes. Without locking, this can cause partial
writes when accessed concurrently.

Updated per-queue spinlocks is used to serialize these writes and prevent
race conditions.

Fixes: 824a156633df ("scsi: mpi3mr: Base driver code")
Cc: stable@vger.kernel.org
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20250627194539.48851-4-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: mpi3mr: Drop unnecessary volatile from __iomem pointers
Ranjan Kumar [Fri, 27 Jun 2025 19:45:37 +0000 (01:15 +0530)]
scsi: mpi3mr: Drop unnecessary volatile from __iomem pointers

The volatile qualifier is redundant for __iomem pointers.

Cleaned up usage in mpi3mr_writeq() and sysif_regs pointer as per
Upstream compliance.

Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20250627194539.48851-3-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: mpi3mr: Fix race between config read submit and interrupt completion
Ranjan Kumar [Fri, 27 Jun 2025 19:45:36 +0000 (01:15 +0530)]
scsi: mpi3mr: Fix race between config read submit and interrupt completion

The "is_waiting" flag was updated after calling complete(), which could
lead to a race where the waiting thread wakes up before the flag is
cleared. This may cause a missed wakeup or stale state check.

Reorder the operations to update "is_waiting" before signaling completion
to ensure consistent state.

Fixes: 824a156633df ("scsi: mpi3mr: Base driver code")
Cc: stable@vger.kernel.org
Co-developed-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20250627194539.48851-2-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ufs: ufs-qcom: Enable QUnipro Internal Clock Gating
Nitin Rawat [Mon, 14 Jul 2025 07:53:36 +0000 (13:23 +0530)]
scsi: ufs: ufs-qcom: Enable QUnipro Internal Clock Gating

Enable internal clock gating for Qualcomm UFS host controller by setting
the following attributes to 1 during host controller initialization:

 - DL_VS_CLK_CFG
 - PA_VS_CLK_CFG_REG
 - DME_VS_CORE_CLK_CTRL.DME_HW_CGC_EN

This change is necessary to support the internal clock gating mechanism
in Qualcomm UFS host controller. This is power saving feature and hence
driver can continue to function correctly despite any error in enabling
these feature.

Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com>
Link: https://lore.kernel.org/r/20250714075336.2133-4-quic_nitirawa@quicinc.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ufs: core: Add ufshcd_dme_rmw() to modify DME attributes
Nitin Rawat [Mon, 14 Jul 2025 07:53:35 +0000 (13:23 +0530)]
scsi: ufs: core: Add ufshcd_dme_rmw() to modify DME attributes

Introduce ufshcd_dme_rmw() API to read, modify, and write DME attributes
in UFS host controllers using a mask and value.

Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com>
Link: https://lore.kernel.org/r/20250714075336.2133-3-quic_nitirawa@quicinc.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ufs: ufs-qcom: Update esi_vec_mask for HW major version >= 6
Bao D. Nguyen [Mon, 14 Jul 2025 07:53:34 +0000 (13:23 +0530)]
scsi: ufs: ufs-qcom: Update esi_vec_mask for HW major version >= 6

The MCQ feature and ESI are supported by all Qualcomm UFS controller
versions 6 and above.

Therefore, update the ESI vector mask in the UFS_MEM_CFG3 register for
platforms with major version number of 6 or higher.

Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bao D. Nguyen <quic_nguyenb@quicinc.com>
Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com>
Link: https://lore.kernel.org/r/20250714075336.2133-2-quic_nitirawa@quicinc.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: core: Use scsi_cmd_priv() instead of open-coding it
Bart Van Assche [Tue, 24 Jun 2025 21:05:40 +0000 (14:05 -0700)]
scsi: core: Use scsi_cmd_priv() instead of open-coding it

Improve code readability without modifying the behavior of the code.

Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.g.garry@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20250624210541.512910-4-bvanassche@acm.org
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: qla2xxx: Remove firmware URL
Xose Vazquez Perez [Tue, 24 Jun 2025 19:09:25 +0000 (21:09 +0200)]
scsi: qla2xxx: Remove firmware URL

The historic QLogic firmware URL redirects to a Marvell page that only
provides drivers.

Refer to linux-firmware instead.

Cc: Nilesh Javali <njavali@marvell.com>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: QLOGIC ML <GR-QLogic-Storage-Upstream@marvell.com>
Cc: LINUX SCSI ML <linux-scsi@vger.kernel.org>
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Link: https://lore.kernel.org/r/20250624190926.115009-1-xose.vazquez@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 months agoscsi: ufs: core: Improve return value documentation
Bart Van Assche [Mon, 23 Jun 2025 21:59:01 +0000 (14:59 -0700)]
scsi: ufs: core: Improve return value documentation

Some functions return a negative value to indicate an error while other
functions return a value != 0 to indicate an error. Document the return
value behavior where this documentation is missing and fix the return
value documentation where necessary. Add warnings to detect mismatches
between documentation and implementation. This matters because several
sysfs callback functions only work correctly if a negative value is
returned upon error.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20250623215909.4169007-1-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: scsi_devinfo: Remove redundant 'found'
mrigendrachaubey [Sun, 22 Jun 2025 05:57:09 +0000 (11:27 +0530)]
scsi: scsi_devinfo: Remove redundant 'found'

Remove the unnecessary 'found' flag in scsi_devinfo_lookup_by_key(). The
loop can return the matching entry directly when found, and fall through
to return ERR_PTR(-EINVAL) otherwise.

Signed-off-by: mrigendrachaubey <mrigendra.chaubey@gmail.com>
Link: https://lore.kernel.org/r/20250622055709.7893-1-mrigendra.chaubey@gmail.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: qla2xxx: Avoid stack frame size warning in qla_dfs
Arnd Bergmann [Fri, 20 Jun 2025 17:32:22 +0000 (19:32 +0200)]
scsi: qla2xxx: Avoid stack frame size warning in qla_dfs

The qla2x00_dfs_tgt_port_database_show() function constructs a fake
fc_port_t object on the stack, which--depending on the configuration--is
large enough to exceed the stack size warning limit:

drivers/scsi/qla2xxx/qla_dfs.c:176:1: error: stack frame size (1392) exceeds limit (1280) in 'qla2x00_dfs_tgt_port_database_show' [-Werror,-Wframe-larger-than]

Rework this function to no longer need the structure but instead call a
custom helper function that just prints the data directly from the
port_database_24xx structure.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250620173232.864179-1-arnd@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: mpi3mr: Fix kernel-doc issues in mpi3mr_app.c
Randy Dunlap [Fri, 20 Jun 2025 16:21:58 +0000 (09:21 -0700)]
scsi: mpi3mr: Fix kernel-doc issues in mpi3mr_app.c

Fix all kernel-doc problems in mpi3mr_app.c:

mpi3mr_app.c:809: warning: Excess function parameter 'data' description in 'mpi3mr_set_trigger_data_in_hdb'
mpi3mr_app.c:836: warning: Excess function parameter 'data' description in 'mpi3mr_set_trigger_data_in_all_hdb'
mpi3mr_app.c:3395: warning: No description found for return value of 'sas_ncq_prio_supported_show'
mpi3mr_app.c:3413: warning: No description found for return value of 'sas_ncq_prio_enable_show'

Fixes: fc4444941140 ("scsi: mpi3mr: HDB allocation and posting for hardware and firmware buffers")
Fixes: d8d08d1638ce ("scsi: mpi3mr: Trigger support")
Fixes: 90e6f08915ec ("scsi: mpi3mr: Fix ATA NCQ priority support")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20250620162158.776795-1-rdunlap@infradead.org
Cc: Sathya Prakash <sathya.prakash@broadcom.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Cc: Ranjan Kumar <ranjan.kumar@broadcom.com>
Cc: mpi3mr-linuxdrv.pdl@broadcom.com
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: pm80xx: Add controller SCSI host fatal error uevents
Salomon Dushimirimana [Mon, 16 Jun 2025 19:00:18 +0000 (19:00 +0000)]
scsi: pm80xx: Add controller SCSI host fatal error uevents

Add pm80xx_fatal_error_uevent_emit() which is called when the pm80xx
driver encouters a fatal error. The uevent has the following additional
custom key/value pair sets:

 - DRIVER: driver name, pm80xx in this case
 - HBA_NUM: the scsi host id of the device
 - EVENT_TYPE: to indicate a fatal error
 - REPORTED_BY: either driver or firmware

The uevent is anchored to the kernel object that represents the SCSI
controller, which includes other useful core variables, such as, ACTION,
DEVPATH, SUBSYSTEM, and more.

The fatal_error_uevent_emit() function is called when the controller
fatal error state changes. Since this doesn't happen often for a
specific SCSI host, there is no worries of a uevent storm.

Signed-off-by: Salomon Dushimirimana <salomondush@google.com>
Link: https://lore.kernel.org/r/20250616190018.2136260-1-salomondush@google.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoMerge patch series "Update lpfc to revision 14.4.0.10"
Martin K. Petersen [Mon, 23 Jun 2025 17:12:16 +0000 (13:12 -0400)]
Merge patch series "Update lpfc to revision 14.4.0.10"

Justin Tee <justintee8345@gmail.com> says:

Update lpfc to revision 14.4.0.10

This patch set contains bug fixes related to diagnostic log messaging,
driver initialization and removal, updates to mailbox command handling,
and string modifications for obsolete adapter model descriptions.

The patches were cut against Martin's 6.17/scsi-queue tree.

Link: https://lore.kernel.org/r/20250618192138.124116-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: lpfc: Copyright updates for 14.4.0.10 patches
Justin Tee [Wed, 18 Jun 2025 19:21:38 +0000 (12:21 -0700)]
scsi: lpfc: Copyright updates for 14.4.0.10 patches

Update copyrights to 2025 for files modified in the 14.4.0.10 patch set.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250618192138.124116-14-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: lpfc: Update lpfc version to 14.4.0.10
Justin Tee [Wed, 18 Jun 2025 19:21:37 +0000 (12:21 -0700)]
scsi: lpfc: Update lpfc version to 14.4.0.10

Update lpfc version to 14.4.0.10

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250618192138.124116-13-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: lpfc: Modify end-of-life adapters' model descriptions
Justin Tee [Wed, 18 Jun 2025 19:21:36 +0000 (12:21 -0700)]
scsi: lpfc: Modify end-of-life adapters' model descriptions

Obsolete adapters' model description strings are updated to indicate that
they are no longer supported.  End-of-life adapters will still remain
probed by the lpfc driver based on PCI id.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250618192138.124116-12-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: lpfc: Revise CQ_CREATE_SET mailbox bitfield definitions
Justin Tee [Wed, 18 Jun 2025 19:21:35 +0000 (12:21 -0700)]
scsi: lpfc: Revise CQ_CREATE_SET mailbox bitfield definitions

The CQ_CREATE_SET mailbox command's bitfields are updated.  Rename the
cqe_cnt and separate high/low bitfield names to help resolve confusion
between two similar bitfield definitions.  Corresponding usages of the
newly defined bitfields are updated as well.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250618192138.124116-11-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: lpfc: Move clearing of HBA_SETUP flag to before lpfc_sli4_queue_unset
Justin Tee [Wed, 18 Jun 2025 19:21:34 +0000 (12:21 -0700)]
scsi: lpfc: Move clearing of HBA_SETUP flag to before lpfc_sli4_queue_unset

Move clearing of HBA_SETUP flag out of lpfc_sli_brdrestart_s4 and before
lpfc_sli4_queue_unset.  lpfc_sli4_queue_unset kfrees phba queues, so
clear the HBA_SETUP atomic flag to signal that the phba struct is no
longer initialized.

Also, add a check for the HBA_SETUP flag in the lpfc_sli4_io_xri_aborted
routine before dereferencing the ELS WQ.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250618192138.124116-10-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: lpfc: Ensure HBA_SETUP flag is used only for SLI4 in dev_loss_tmo_callbk
Justin Tee [Wed, 18 Jun 2025 19:21:33 +0000 (12:21 -0700)]
scsi: lpfc: Ensure HBA_SETUP flag is used only for SLI4 in dev_loss_tmo_callbk

For SLI3, the HBA_SETUP flag is never set so the lpfc_dev_loss_tmo_callbk
always early returns.  Add a phba->sli_rev check for SLI4 mode so that
the SLI3 path can flow through the original dev_loss_tmo worker thread
design to lpfc_dev_loss_tmo_handler instead of early return.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250618192138.124116-9-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: lpfc: Relocate clearing initial phba flags from link up to link down hdlr
Justin Tee [Wed, 18 Jun 2025 19:21:32 +0000 (12:21 -0700)]
scsi: lpfc: Relocate clearing initial phba flags from link up to link down hdlr

Port wide initialization flags FLOGI_ISSUED and RHBA_CMPL make more sense
to be cleared upon a link down event rather than waiting for a link up
event.  By moving clearing of these initializatin flags to a link down
handler, future confusion on the state of initialization is avoided.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250618192138.124116-8-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: lpfc: Simplify error handling for failed lpfc_get_sli4_parameters cmd
Justin Tee [Wed, 18 Jun 2025 19:21:31 +0000 (12:21 -0700)]
scsi: lpfc: Simplify error handling for failed lpfc_get_sli4_parameters cmd

There are unnecessary checks on an HBA's interface type and family before
erroring out a failed lpfc_get_sli4_parameters mailbox command.  Simplify
the error handling by logging a message and proceeding to memory free
labels.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250618192138.124116-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: lpfc: Early return out of FDMI cmpl for locally rejected statuses
Justin Tee [Wed, 18 Jun 2025 19:21:30 +0000 (12:21 -0700)]
scsi: lpfc: Early return out of FDMI cmpl for locally rejected statuses

If an FDMI request completes with local reject status and the request is
not retryable, there's no need to parse an FDMI response payload.  Insert
an early return statement for such cases.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250618192138.124116-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: lpfc: Skip RSCN processing when FC_UNLOADING flag is set
Justin Tee [Wed, 18 Jun 2025 19:21:29 +0000 (12:21 -0700)]
scsi: lpfc: Skip RSCN processing when FC_UNLOADING flag is set

During rmmod, all ndlp objects are cleaned up and marked with the
NLP_DROPPED flag indicating that an ndlp object is currently being
released.  Thus, if an RSCN is received during driver unload, then
walking the fc_nodes list to process the RSCN is unnecessary because the
ndlp objects are very shortly going to be released.

In the lpfc_rscn_recovery_check routine, early return if the driver is in
the middle of unloading by checking for the FC_UNLOADING flag.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250618192138.124116-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: lpfc: Check for hdwq null ptr when cleaning up lpfc_vport structure
Justin Tee [Wed, 18 Jun 2025 19:21:28 +0000 (12:21 -0700)]
scsi: lpfc: Check for hdwq null ptr when cleaning up lpfc_vport structure

If a call to lpfc_sli4_read_rev() from lpfc_sli4_hba_setup() fails, the
resultant cleanup routine lpfc_sli4_vport_delete_fcp_xri_aborted() may
occur before sli4_hba.hdwqs are allocated.  This may result in a null
pointer dereference when attempting to take the abts_io_buf_list_lock for
the first hardware queue.  Fix by adding a null ptr check on
phba->sli4_hba.hdwq and early return because this situation means there
must have been an error during port initialization.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250618192138.124116-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: lpfc: Update debugfs trace ring initialization messages
Justin Tee [Wed, 18 Jun 2025 19:21:27 +0000 (12:21 -0700)]
scsi: lpfc: Update debugfs trace ring initialization messages

Initialization parameters for trace rings used in debugfs are sometimes
automatically adjusted.  This patch corrects and updates the
corresponding log messages.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250618192138.124116-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: lpfc: Revise logging format for failed CT MIB requests
Justin Tee [Wed, 18 Jun 2025 19:21:26 +0000 (12:21 -0700)]
scsi: lpfc: Revise logging format for failed CT MIB requests

Unsupported and rejected CT MIB request log messages are changed to
KERN_WARNING level.  Also, remove extra space in log message.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250618192138.124116-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: mpt3sas: Correctly handle ATA device errors
Damien Le Moal [Fri, 6 Jun 2025 05:27:47 +0000 (14:27 +0900)]
scsi: mpt3sas: Correctly handle ATA device errors

With the ATA error model, an NCQ command failure always triggers an abort
(termination) of all NCQ commands queued on the device. In such case, the
SAT or the host must handle the failed command according to the command
sense data and immediately retry all other NCQ commands that were aborted
due to the failed NCQ command.

For SAS HBAs controlled by the mpt3sas driver, NCQ command aborts are not
handled by the HBA SAT and sent back to the host, with an ioc log
information equal to 0x31080000 (IOC_LOGINFO_PREFIX_PL with the PL code
PL_LOGINFO_CODE_SATA_NCQ_FAIL_ALL_CMDS_AFTR_ERR). The function
_scsih_io_done() always forces a retry of commands terminated with the
status MPI2_IOCSTATUS_SCSI_IOC_TERMINATED using the SCSI result
DID_SOFT_ERROR, regardless of the log_info for the command.  This
correctly forces the retry of collateral NCQ abort commands, but with the
retry counter for the command being incremented. If a command to an ATA
device is subject to too many retries due to other NCQ commands failing
(e.g. read commands trying to access unreadable sectors), the collateral
NCQ abort commands may be terminated with an error as they run out of
retries. This violates the SAT specification and causes hard-to-debug
command errors.

Solve this issue by modifying the handling of the
MPI2_IOCSTATUS_SCSI_IOC_TERMINATED status to check if a command is for an
ATA device and if the command loginfo indicates an NCQ collateral
abort. If that is the case, force the command retry using the SCSI result
DID_IMM_RETRY to avoid incrementing the command retry count.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250606052747.742998-3-dlemoal@kernel.org
Tested-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: mpi3mr: Correctly handle ATA device errors
Damien Le Moal [Fri, 6 Jun 2025 05:27:46 +0000 (14:27 +0900)]
scsi: mpi3mr: Correctly handle ATA device errors

With the ATA error model, an NCQ command failure always triggers an abort
(termination) of all NCQ commands queued on the device. In such case, the
SAT or the host must handle the failed command according to the command
sense data and immediately retry all other NCQ commands that were aborted
due to the failed NCQ command.

For SAS HBAs controlled by the mpi3mr driver, NCQ command aborts are not
handled by the HBA SAT and sent back to the host, with an ioc log
information equal to 0x31080000 (IOC_LOGINFO_PREFIX_PL with the PL code
PL_LOGINFO_CODE_SATA_NCQ_FAIL_ALL_CMDS_AFTR_ERR). The function
mpi3mr_process_op_reply_desc() always forces a retry of commands
terminated with the status MPI3_IOCSTATUS_SCSI_IOC_TERMINATED using the
SCSI result DID_SOFT_ERROR, regardless of the ioc_loginfo for the
command. This correctly forces the retry of collateral NCQ abort
commands, but with the retry counter for the command being incremented.
If a command to an ATA device is subject to too many retries due to other
NCQ commands failing (e.g. read commands trying to access unreadable
sectors), the collateral NCQ abort commands may be terminated with an
error as they run out of retries. This violates the SAT specification and
causes hard-to-debug command errors.

Solve this issue by modifying the handling of the
MPI3_IOCSTATUS_SCSI_IOC_TERMINATED status to check if a command is for an
ATA device and if the command ioc_loginfo indicates an NCQ collateral
abort. If that is the case, force the command retry using the SCSI result
DID_IMM_RETRY to avoid incrementing the command retry count.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250606052747.742998-2-dlemoal@kernel.org
Tested-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: pm80xx: Free allocated tags after failure
Francisco Gutierrez [Tue, 17 Jun 2025 21:04:43 +0000 (21:04 +0000)]
scsi: pm80xx: Free allocated tags after failure

This change frees resources after an error is detected.

Signed-off-by: Francisco Gutierrez <frankramirez@google.com>
Link: https://lore.kernel.org/r/20250617210443.989058-1-frankramirez@google.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: ufs: Clear ucd_rsp_ptr for UPIU requests once
Avri Altman [Tue, 17 Jun 2025 09:56:10 +0000 (12:56 +0300)]
scsi: ufs: Clear ucd_rsp_ptr for UPIU requests once

Previously, the response buffer (ucd_rsp_ptr) was cleared in multiple
UPIU preparation functions. Do it once.

Signed-off-by: Avri Altman <avri.altman@sandisk.com>
Link: https://lore.kernel.org/r/20250617095611.89229-2-avri.altman@sandisk.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: Don't use %pK through printk()
Thomas WeiĂŸschuh [Wed, 11 Jun 2025 09:58:06 +0000 (11:58 +0200)]
scsi: Don't use %pK through printk()

In the past %pK was preferable to %p as it would not leak raw pointer
values into the kernel log.  Since commit ad67b74d2469 ("printk: hash
addresses printed with %p") the regular %p has been improved to avoid
this issue.  Furthermore, restricted pointers ("%pK") were never meant to
be used through printk(). They can still unintentionally leak raw
pointers or acquire sleeping locks in atomic contexts.

Switch to the regular pointer formatting which is safer and easier to
reason about.

Signed-off-by: Thomas WeiĂŸschuh <thomas.weissschuh@linutronix.de>
Link: https://lore.kernel.org/r/20250611-restricted-pointers-scsi-v1-1-fe31bfbc4910@linutronix.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: core: Remember if a device is an ATA device
Damien Le Moal [Wed, 11 Jun 2025 09:34:21 +0000 (18:34 +0900)]
scsi: core: Remember if a device is an ATA device

scsi_add_lun() tests the device vendor string of SCSI devices to detect
if a SCSI device is in fact an ATA device, in order to correctly handle
SATL power management. The function scsi_cdl_enable() also requires
knowing if a SCSI device is an ATA device to control the state of the
device CDL feature but this function does that by testing for the
presence of the VPD page 89h (ATA INFORMATION page).
sd_read_write_same() also has a similar test.

Simplify these different methods by adding the is_ata field to struct
scsi_device to remember that a SCSI device is in fact an ATA one based
on the device vendor name test. This field can also allow low level
SCSI host adapter drivers to take special actions for ATA devices
(e.g. to better handle ATA NCQ errors).

With this, simplify scsi_cdl_enable() and sd_read_write_same().

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250611093421.2901633-1-dlemoal@kernel.org
Reviewed-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: mpt3sas: Drop unused variable in mpt3sas_send_mctp_passthru_req()
André Draszik [Fri, 6 Jun 2025 15:29:43 +0000 (16:29 +0100)]
scsi: mpt3sas: Drop unused variable in mpt3sas_send_mctp_passthru_req()

With W=1, gcc complains correctly:

    mpt3sas_ctl.c: In function â€˜mpt3sas_send_mctp_passthru_req’:
    mpt3sas_ctl.c:2917:29: error: variable â€˜mpi_reply’ set but not used [-Werror=unused-but-set-variable]
     2917 |         MPI2DefaultReply_t *mpi_reply;
          |                             ^~~~~~~~~

Drop the unused assignment and variable.

Signed-off-by: André Draszik <andre.draszik@linaro.org>
Link: https://lore.kernel.org/r/20250606-mpt3sas-v1-1-906ffe49fb6b@linaro.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: trace: Show rtn in string for scsi_dispatch_cmd_error()
Kassey Li [Wed, 21 May 2025 01:17:11 +0000 (09:17 +0800)]
scsi: trace: Show rtn in string for scsi_dispatch_cmd_error()

By default the scsi_dispatch_cmd_error() return value is displayed in
decimal:

  kworker/3:1H-183 [003] ....  51.035474: scsi_dispatch_cmd_error: host_no=0 channel=0 id=0 lun=4 data_sgl=1  prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(READ_10 lba=3907214  txlen=1 protect=0 raw=28 00 00 3b 9e 8e 00 00 01 00) rtn=4181

However, these numbers are not particularly helpful wrt. debugging
errors. Especially since the kernel code consistently uses the following
defines in hexadecimal:

  SCSI_MLQUEUE_HOST_BUSY   0x1055
  SCSI_MLQUEUE_DEVICE_BUSY 0x1056
  SCSI_MLQUEUE_EH_RETRY    0x1057
  SCSI_MLQUEUE_TARGET_BUSY 0x1058

Switch to using the string form of these values in the trace output:

  dd-1059    [007] .....    31.689529: scsi_dispatch_cmd_error: host_no=0 channel=0 id=0 lun=4 data_sgl=65 prot_sgl=0 prot_op=SCSI_PROT_NORMAL driver_tag=23 scheduler_tag=117 cmnd=(READ_10 lba=0 txlen=128 protect=0 raw=28 00 00 00 00 00 00 00 80 00) rtn=SCSI_MLQUEUE_DEVICE_BUSY

Signed-off-by: Kassey Li <quic_yingangl@quicinc.com>
Link: https://lore.kernel.org/r/20250521011711.1983625-1-quic_yingangl@quicinc.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: ufs: core: Add HID support
Huan Tang [Fri, 23 May 2025 06:46:04 +0000 (14:46 +0800)]
scsi: ufs: core: Add HID support

Follow JESD220G, support HID(Host Initiated Defragmentation) through
sysfs, the relevant sysfs nodes are as follows:

1. analysis_trigger
2. defrag_trigger
3. fragmented_size
4. defrag_size
5. progress_ratio
6. state

The detailed definition of the six nodes can be found in the sysfs
documentation.

HID's execution policy is given to user-space.

Signed-off-by: Huan Tang <tanghuan@vivo.com>
Signed-off-by: Wenxing Cheng <wenxing.cheng@vivo.com>
Link: https://lore.kernel.org/r/20250523064604.800-1-tanghuan@vivo.com
Suggested-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Reviewed-by: Bean Huo <huobean@gmail.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: fc_transport: docs: Add documentation for FC Remote Ports
Alok Tiwari [Sat, 7 Jun 2025 16:22:56 +0000 (09:22 -0700)]
scsi: fc_transport: docs: Add documentation for FC Remote Ports

This patch updates the scsi_fc_transport.rst documentation by replacing
the outdated << To Be Supplied >> placeholder under the "FC Remote Ports
(rports)" section with a detailed explanation of remote port
functionality in the Fibre Channel (FC) transport class.

The new documentation covers:

 - What rports are and their role in FC-based SCSI communication

 - Their representation in sysfs (/sys/class/fc_remote_ports/)

 - Common sysfs attributes such as (port_id, port_name, node_name, and
   port_state).

 - Their typical lifecycle (creation and removal)

 - Guidance for driver developers on using fc_remote_port_add() and
   fc_remote_port_delete()

This change improves the completeness and usefulness of the FC transport
documentation for developers and users interacting with Fibre Channel
drivers in the Linux SCSI subsystem

Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Link: https://lore.kernel.org/r/20250607162304.1765430-1-alok.a.tiwari@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoscsi: fcoe: Remove fcoe_select_cpu()
Hannes Reinecke [Thu, 5 Jun 2025 06:20:14 +0000 (08:20 +0200)]
scsi: fcoe: Remove fcoe_select_cpu()

The function fcoe_select_cpu() is just used to distribute incoming skbs
which start a new FC command sequence. But the network stack already
received (and processed) that skb, and there is a _really_ good chance
that all subsequent skbs for this sequence will be handled with the same
CPU. So we should just use the CPU on which this skb was allocated on and
save ourselves some overhead due to pointless scheduling.

Signed-off-by: Hannes Reinecke <hare@kernel.org>
Link: https://lore.kernel.org/r/20250605062014.105302-1-hare@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 months agoLinux 6.16-rc1
Linus Torvalds [Sun, 8 Jun 2025 20:44:43 +0000 (13:44 -0700)]
Linux 6.16-rc1

4 months agoMerge tag 'turbostat-2025.06.08' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 8 Jun 2025 18:44:41 +0000 (11:44 -0700)]
Merge tag 'turbostat-2025.06.08' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux

Pull turbostat updates from Len Brown:

 - Add initial DMR support, which required smarter RAPL probe

 - Fix AMD MSR RAPL energy reporting

 - Add RAPL power limit configuration output

 - Minor fixes

* tag 'turbostat-2025.06.08' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: version 2025.06.08
  tools/power turbostat: Add initial support for BartlettLake
  tools/power turbostat: Add initial support for DMR
  tools/power turbostat: Dump RAPL sysfs info
  tools/power turbostat: Avoid probing the same perf counters
  tools/power turbostat: Allow probing RAPL with platform_features->rapl_msrs cleared
  tools/power turbostat: Clean up add perf/msr counter logic
  tools/power turbostat: Introduce add_msr_counter()
  tools/power turbostat: Remove add_msr_perf_counter_()
  tools/power turbostat: Remove add_cstate_perf_counter_()
  tools/power turbostat: Remove add_rapl_perf_counter_()
  tools/power turbostat: Quit early for unsupported RAPL counters
  tools/power turbostat: Always check rapl_joules flag
  tools/power turbostat: Fix AMD package-energy reporting
  tools/power turbostat: Fix RAPL_GFX_ALL typo
  tools/power turbostat: Add Android support for MSR device handling
  tools/power turbostat.8: pm_domain wording fix
  tools/power turbostat.8: fix typo: idle_pct should be pct_idle

4 months agoMerge tag 'timers-cleanups-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 8 Jun 2025 18:33:00 +0000 (11:33 -0700)]
Merge tag 'timers-cleanups-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer cleanup from Thomas Gleixner:
 "The delayed from_timer() API cleanup:

  The renaming to the timer_*() namespace was delayed due massive
  conflicts against Linux-next. Now that everything is upstream finish
  the conversion"

* tag 'timers-cleanups-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  treewide, timers: Rename from_timer() to timer_container_of()

4 months agoMerge tag 'x86-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 8 Jun 2025 18:27:20 +0000 (11:27 -0700)]
Merge tag 'x86-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A small set of x86 fixes:

   - Cure IO bitmap inconsistencies

     A failed fork cleans up all resources of the newly created thread
     via exit_thread(). exit_thread() invokes io_bitmap_exit() which
     does the IO bitmap cleanups, which unfortunately assume that the
     cleanup is related to the current task, which is obviously bogus.

     Make it work correctly

   - A lockdep fix in the resctrl code removed the clearing of the
     command buffer in two places, which keeps stale error messages
     around. Bring them back.

   - Remove unused trace events"

* tag 'x86-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  fs/resctrl: Restore the rdt_last_cmd_clear() calls after acquiring rdtgroup_mutex
  x86/iopl: Cure TIF_IO_BITMAP inconsistencies
  x86/fpu: Remove unused trace events

4 months agoMerge tag 'timers-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 8 Jun 2025 18:25:13 +0000 (11:25 -0700)]
Merge tag 'timers-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fix from Thomas Gleixner:
 "Add the missing seq_file forward declaration in the timer namespace
  header"

* tag 'timers-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timens: Add struct seq_file forward declaration

4 months agotools/power turbostat: version 2025.06.08
Len Brown [Sun, 8 Jun 2025 16:31:59 +0000 (12:31 -0400)]
tools/power turbostat: version 2025.06.08

Add initial DMR support, which required smarter RAPL probe
Fix AMD MSR RAPL energy reporting
Add RAPL power limit configuration output
Minor fixes

Signed-off-by: Len Brown <len.brown@intel.com>
4 months agotools/power turbostat: Add initial support for BartlettLake
Zhang Rui [Fri, 18 Apr 2025 06:04:26 +0000 (14:04 +0800)]
tools/power turbostat: Add initial support for BartlettLake

Add initial support for BartlettLake.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
4 months agotools/power turbostat: Add initial support for DMR
Zhang Rui [Mon, 4 Mar 2024 06:54:40 +0000 (14:54 +0800)]
tools/power turbostat: Add initial support for DMR

Add initial support for DMR.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
4 months agotools/power turbostat: Dump RAPL sysfs info
Zhang Rui [Fri, 30 May 2025 06:01:31 +0000 (14:01 +0800)]
tools/power turbostat: Dump RAPL sysfs info

for example:

intel-rapl:1: psys 28.0s:100W 976.0us:100W
intel-rapl:0: package-0 28.0s:57W,max:15W 2.4ms:57W
intel-rapl:0/intel-rapl:0:0: core disabled
intel-rapl:0/intel-rapl:0:1: uncore disabled
intel-rapl-mmio:0: package-0 28.0s:28W,max:15W 2.4ms:57W

[lenb: simplified format]

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
squish me

Signed-off-by: Len Brown <len.brown@intel.com>
4 months agotools/power turbostat: Avoid probing the same perf counters
Zhang Rui [Fri, 30 May 2025 00:09:28 +0000 (08:09 +0800)]
tools/power turbostat: Avoid probing the same perf counters

For the RAPL package energy status counter, Intel and AMD share the same
perf_subsys and perf_name, but with different MSR addresses.

Both rapl_counter_arch_infos[0] and rapl_counter_arch_infos[1] are
introduced to describe this counter for different Vendors.

As a result, the perf counter is probed twice, and causes a failure in
in get_rapl_counters() because expected_read_size and actual_read_size
don't match.

Fix the problem by skipping the already probed counter.

Note, this is not a perfect fix. For example, if different
vendors/platforms use the same MSR value for different purpose, the code
can be fooled when it probes a rapl_counter_arch_infos[] entry that does
not belong to the running Vendor/Platform.

In a long run, better to put rapl_counter_arch_infos[] into the
platform_features so that this becomes Vendor/Platform specific.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
4 months agotools/power turbostat: Allow probing RAPL with platform_features->rapl_msrs cleared
Zhang Rui [Sat, 17 May 2025 09:44:50 +0000 (17:44 +0800)]
tools/power turbostat: Allow probing RAPL with platform_features->rapl_msrs cleared

platform_features->rapl_msrs describes the RAPL MSRs supported. While
RAPL Perf counters can be exposed from different kernel backend drivers,
e.g. RAPL MSR I/F driver, or RAPL TPMI I/F driver.

Thus, turbostat should first blindly probe all the available RAPL Perf
counters, and falls back to the RAPL MSR counters if they are listed in
platform_features->rapl_msrs.

With this, platforms that don't have RAPL MSRs can clear the
platform_features->rapl_msrs bits and use RAPL Perf counters only.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
4 months agotools/power turbostat: Clean up add perf/msr counter logic
Zhang Rui [Sat, 17 May 2025 09:35:17 +0000 (17:35 +0800)]
tools/power turbostat: Clean up add perf/msr counter logic

Increase the code readability by moving the no_perf/no_msr flag and the
cai->perf_name/cai->msr sanity checks into the counter probe functions.

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
4 months agotools/power turbostat: Introduce add_msr_counter()
Zhang Rui [Sat, 17 May 2025 07:58:51 +0000 (15:58 +0800)]
tools/power turbostat: Introduce add_msr_counter()

probe_rapl_msr() is reused for probing RAPL MSR counters, cstate MSR
counters and MPERF/APERF/SMI MSR counters, thus its name is misleading.

Similar to add_perf_counter(), introduce add_msr_counter() to probe a
counter via MSR. Introduce wrapper function add_rapl_msr_counter() at
the same time to add extra check for Zero return value for specified
RAPL counters.

No functional change intended.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
4 months agotools/power turbostat: Remove add_msr_perf_counter_()
Zhang Rui [Sat, 17 May 2025 09:40:08 +0000 (17:40 +0800)]
tools/power turbostat: Remove add_msr_perf_counter_()

As the only caller of add_msr_perf_counter_(), add_msr_perf_counter()
just gives extra debug output on top. There is no need to keep both
functions.

Remove add_msr_perf_counter_() and move all the logic to
add_msr_perf_counter().

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
4 months agotools/power turbostat: Remove add_cstate_perf_counter_()
Zhang Rui [Sat, 17 May 2025 07:43:59 +0000 (15:43 +0800)]
tools/power turbostat: Remove add_cstate_perf_counter_()

As the only caller of add_cstate_perf_counter_(),
add_cstate_perf_counter() just gives extra debug output on top. There is
no need to keep both functions.

Remove add_cstate_perf_counter_() and move all the logic to
add_cstate_perf_counter().

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
4 months agotools/power turbostat: Remove add_rapl_perf_counter_()
Zhang Rui [Sat, 17 May 2025 04:06:22 +0000 (12:06 +0800)]
tools/power turbostat: Remove add_rapl_perf_counter_()

As the only caller of add_rapl_perf_counter_(), add_rapl_perf_counter()
just gives extra debug output on top. There is no need to keep both
functions.

Remove add_rapl_perf_counter_() and move all the logic to
add_rapl_perf_counter().

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
4 months agotools/power turbostat: Quit early for unsupported RAPL counters
Zhang Rui [Sat, 17 May 2025 02:26:14 +0000 (10:26 +0800)]
tools/power turbostat: Quit early for unsupported RAPL counters

Quit early for unsupported RAPL counters.

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
4 months agotools/power turbostat: Always check rapl_joules flag
Zhang Rui [Fri, 30 May 2025 06:00:33 +0000 (14:00 +0800)]
tools/power turbostat: Always check rapl_joules flag

rapl_joules bit should always be checked even if
platform_features->rapl_msrs is not set or no_msr flag is used.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
4 months agotools/power turbostat: Fix AMD package-energy reporting
Gautham R. Shenoy [Thu, 29 May 2025 11:48:25 +0000 (17:18 +0530)]
tools/power turbostat: Fix AMD package-energy reporting

commit 05a2f07db888 ("tools/power turbostat: read RAPL counters via
perf") that adds support to read RAPL counters via perf defines the
notion of a RAPL domain_id which is set to physical_core_id on
platforms which support per_core_rapl counters (Eg: AMD processors
Family 17h onwards) and is set to the physical_package_id on all the
other platforms.

However, the physical_core_id is only unique within a package and on
platforms with multiple packages more than one core can have the same
physical_core_id and thus the same domain_id. (For eg, the first cores
of each package have the physical_core_id = 0). This results in all
these cores with the same physical_core_id using the same entry in the
rapl_counter_info_perdomain[]. Since rapl_perf_init() skips the
perf-initialization for cores whose domain_ids have already been
visited, cores that have the same physical_core_id always read the
perf file corresponding to the physical_core_id of the first package
and thus the package-energy is incorrectly reported to be the same
value for different packages.

Note: This issue only arises when RAPL counters are read via perf and
not when they are read via MSRs since in the latter case the MSRs are
read separately on each core.

Fix this issue by associating each CPU with rapl_core_id which is
unique across all the packages in the system.

Fixes: 05a2f07db888 ("tools/power turbostat: read RAPL counters via perf")
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
4 months agotools/power turbostat: Fix RAPL_GFX_ALL typo
Kaushlendra Kumar [Fri, 23 May 2025 08:06:59 +0000 (13:36 +0530)]
tools/power turbostat: Fix RAPL_GFX_ALL typo

Fix typo in the currently unused RAPL_GFX_ALL macro definition.

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
4 months agotools/power turbostat: Add Android support for MSR device handling
Kaushlendra Kumar [Thu, 22 May 2025 08:49:46 +0000 (14:19 +0530)]
tools/power turbostat: Add Android support for MSR device handling

It uses /dev/msrN device paths on Android instead of /dev/cpu/N/msr,
updates error messages and permission checks to reflect the Android
device path, and wraps platform-specific code with #if defined(ANDROID)
to ensure correct behavior on both Android and non-Android systems.
These changes improve compatibility and usability of turbostat on
Android devices.

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
4 months agotools/power turbostat.8: pm_domain wording fix
Len Brown [Fri, 18 Apr 2025 21:54:39 +0000 (17:54 -0400)]
tools/power turbostat.8: pm_domain wording fix

turbostat.8: clarify that uncore "domains" are Power Management domains,
aka pm_domains.

Signed-off-by: Len Brown <len.brown@intel.com>
4 months agotools/power turbostat.8: fix typo: idle_pct should be pct_idle
Len Brown [Wed, 9 Apr 2025 04:06:24 +0000 (00:06 -0400)]
tools/power turbostat.8: fix typo: idle_pct should be pct_idle

idle_pct should be pct_idle

Signed-off-by: Len Brown <len.brown@intel.com>
4 months agoMerge tag 'perf-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 8 Jun 2025 18:07:33 +0000 (11:07 -0700)]
Merge tag 'perf-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 perf fix from Thomas Gleixner:
 "A single fix for the x86 performance counters on Intel CPUs:

  The MSR offset calculations for fixed performance counters are stored
  at the wrong index in the configuration array causing the general
  purpose counter MSR offset to be overwritten, so both the general
  purpose and the fixed counters offsets are incorrect.

  Correct the array index calculation to fix that"

* tag 'perf-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Fix incorrect MSR index calculations in intel_pmu_config_acr()

4 months agoMerge tag 'irq-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 8 Jun 2025 18:02:53 +0000 (11:02 -0700)]
Merge tag 'irq-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Thomas Gleixner:
 "A single fix for the PCI/MSI code:

  The conversion to per device MSI domains created a MSI domain with
  size 1 instead of sizing it to the maximum possible number of MSI
  interrupts for the device. This "worked" as the subsequent allocations
  resized the domain, but the recent change to move the prepare() call
  into the domain creation path broke this works by chance mechanism.

  Size the domain properly at creation time"

* tag 'irq-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  PCI/MSI: Size device MSI domain with the maximum number of vectors

4 months agoMerge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Sun, 8 Jun 2025 17:35:12 +0000 (10:35 -0700)]
Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull mount fixes from Al Viro:
 "Various mount-related bugfixes:

   - split the do_move_mount() checks in subtree-of-our-ns and
     entire-anon cases and adapt detached mount propagation selftest for
     mount_setattr

   - allow clone_private_mount() for a path on real rootfs

   - fix a race in call of has_locked_children()

   - fix move_mount propagation graph breakage by MOVE_MOUNT_SET_GROUP

   - make sure clone_private_mnt() caller has CAP_SYS_ADMIN in the right
     userns

   - avoid false negatives in path_overmount()

   - don't leak MNT_LOCKED from parent to child in finish_automount()

   - do_change_type(): refuse to operate on unmounted/not ours mounts"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  do_change_type(): refuse to operate on unmounted/not ours mounts
  clone_private_mnt(): make sure that caller has CAP_SYS_ADMIN in the right userns
  selftests/mount_setattr: adapt detached mount propagation test
  do_move_mount(): split the checks in subtree-of-our-ns and entire-anon cases
  fs: allow clone_private_mount() for a path on real rootfs
  fix propagation graph breakage by MOVE_MOUNT_SET_GROUP move_mount(2)
  finish_automount(): don't leak MNT_LOCKED from parent to child
  path_overmount(): avoid false negatives
  fs/fhandle.c: fix a race in call of has_locked_children()

4 months agoMerge tag '6.16-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sun, 8 Jun 2025 17:20:21 +0000 (10:20 -0700)]
Merge tag '6.16-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull more smb client updates from Steve French:

 - multichannel/reconnect fixes

 - move smbdirect (smb over RDMA) defines to fs/smb/common so they will
   be able to be used in the future more broadly, and a documentation
   update explaining setting up smbdirect mounts

 - update email address for Paulo

* tag '6.16-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update internal version number
  MAINTAINERS, mailmap: Update Paulo Alcantara's email address
  cifs: add documentation for smbdirect setup
  cifs: do not disable interface polling on failure
  cifs: serialize other channels when query server interfaces is pending
  cifs: deal with the channel loading lag while picking channels
  smb: client: make use of common smbdirect_socket_parameters
  smb: smbdirect: introduce smbdirect_socket_parameters
  smb: client: make use of common smbdirect_socket
  smb: smbdirect: add smbdirect_socket.h
  smb: client: make use of common smbdirect.h
  smb: smbdirect: add smbdirect.h with public structures
  smb: client: make use of common smbdirect_pdu.h
  smb: smbdirect: add smbdirect_pdu.h with protocol definitions

4 months agoMerge tag 'trace-v6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace...
Linus Torvalds [Sun, 8 Jun 2025 15:19:01 +0000 (08:19 -0700)]
Merge tag 'trace-v6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull more tracing fixes from Steven Rostedt:

 - Fix regression of waiting a long time on updating trace event filters

   When the faultable trace points were added, it needed task trace RCU
   synchronization.

   This was added to the tracepoint_synchronize_unregister() function.
   The filter logic always called this function whenever it updated the
   trace event filters before freeing the old filters. This increased
   the time of "trace-cmd record" from taking 13 seconds to running over
   2 minutes to complete.

   Move the freeing of the filters to call_rcu*() logic, which brings
   the time back down to 13 seconds.

 - Fix ring_buffer_subbuf_order_set() error path lock protection

   The error path of the ring_buffer_subbuf_order_set() released the
   mutex too early and allowed subsequent accesses to setting the
   subbuffer size to corrupt the data and cause a bug.

   By moving the mutex locking to the end of the error path, it prevents
   the reentrant access to the critical data and also allows the
   function to convert the taking of the mutex over to the guard()
   logic.

 - Remove unused power management clock events

   The clock events were added in 2010 for power management. In 2011 arm
   used them. In 2013 the code they were used in was removed. These
   events have been wasting memory since then.

 - Fix sparse warnings

   There was a few places that sparse warned about trace_events_filter.c
   where file->filter was referenced directly, but it is annotated with
   an __rcu tag. Use the helper functions and fix them up to use
   rcu_dereference() properly.

* tag 'trace-v6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Add rcu annotation around file->filter accesses
  tracing: PM: Remove unused clock events
  ring-buffer: Fix buffer locking in ring_buffer_subbuf_order_set()
  tracing: Fix regression of filter waiting a long time on RCU synchronization