Linus Torvalds [Fri, 20 Jun 2025 06:25:28 +0000 (23:25 -0700)]
Merge tag 'io_uring-6.16-20250619' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
- Two fixes for error injection failures. One fixes a task leak issue
introduced in this merge window, the other an older issue with
handling allocation of a mapped buffer.
- Fix for a syzbot issue that triggers a kmalloc warning on attempting
an allocation that's too large
- Fix for an error injection failure causing a double put of a task,
introduced in this merge window
* tag 'io_uring-6.16-20250619' of git://git.kernel.dk/linux:
io_uring: fix potential page leak in io_sqe_buffer_register()
io_uring/sqpoll: don't put task_struct on tctx setup failure
io_uring: remove duplicate io_uring_alloc_task_context() definition
io_uring: fix task leak issue in io_wq_create()
io_uring/rsrc: validate buffer count with offset for cloning
Linus Torvalds [Fri, 20 Jun 2025 06:18:59 +0000 (23:18 -0700)]
Merge tag 'drm-fixes-2025-06-20' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Bit of an uptick in fixes for rc3, msm and amdgpu leading the way,
with i915/xe/nouveau with a few each and then some scattered misc
bits, nothing looks too crazy:
msm:
- Display:
- Fixed DP output on SDM845
- Fixed 10nm DSI PLL init
- GPU:
- SUBMIT ioctl error path leak fixes
- drm half of stall-on-fault fixes
- a7xx: Missing CP_RESET_CONTEXT_STATE
- Skip GPU component bind if GPU is not in the device table
i915:
- Fix MIPI vtotal programming off by one on Broxton
- Fix PMU code for GCOV and AutoFDO enabled build
xe:
- A workaround update
- Fix memset on iomem
- Fix early wedge on GuC Load failure
Linus Torvalds [Fri, 20 Jun 2025 06:15:10 +0000 (23:15 -0700)]
Merge tag 'v6.16-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes a regression in ahash (broken fallback finup) and
reinstates a Kconfig option to control the extra self-tests"
* tag 'v6.16-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: ahash - Fix infinite recursion in ahash_def_finup
crypto: testmgr - reinstate kconfig control over full self-tests
Linus Torvalds [Fri, 20 Jun 2025 00:46:08 +0000 (17:46 -0700)]
Merge tag 'spi-fix-v6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fix from Mark Brown:
"One fix here from Thierry, fixing crashes caused by attempting to do
cache sync operations on uncached memory on Tegra platforms"
* tag 'spi-fix-v6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: tegra210-qspi: Remove cache operations
Linus Torvalds [Fri, 20 Jun 2025 00:40:42 +0000 (17:40 -0700)]
Merge tag 'regulator-fix-v6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
"One patch here from Heiko which fixes stability issues on some
Rockchip platforms by implementing soft start support and providing
startup time information for their regulators"
* tag 'regulator-fix-v6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: fan53555: add enable_time support and soft-start times
Linus Torvalds [Thu, 19 Jun 2025 17:21:32 +0000 (10:21 -0700)]
Merge tag 'net-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from wireless.
The ath12k fix to avoid FW crashes requires adding support for a
number of new FW commands so it's quite large in terms of LoC. The
rest is relatively small.
Current release - fix to a fix:
- ptp: fix breakage after ptp_vclock_in_use() rework
Current release - regressions:
- openvswitch: allocate struct ovs_pcpu_storage dynamically, static
allocation may exhaust module loader limit on smaller systems
Previous releases - regressions:
- tcp: fix tcp_packet_delayed() for peers with no selective ACK
support
Previous releases - always broken:
- wifi: ath12k: don't activate more links than firmware supports
- tcp: make sure sockets open via passive TFO have valid NAPI ID
- eth: bnxt_en: update MRU and RSS table of RSS contexts on queue
reset, prevent Rx queues from silently hanging after queue reset
- NFC: uart: set tty->disc_data only in success path"
* tag 'net-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (59 commits)
net: airoha: Differentiate hwfd buffer size for QDMA0 and QDMA1
net: airoha: Compute number of descriptors according to reserved memory size
tools: ynl: fix mixing ops and notifications on one socket
net: atm: fix /proc/net/atm/lec handling
net: atm: add lec_mutex
mlxbf_gige: return EPROBE_DEFER if PHY IRQ is not available
net: airoha: Always check return value from airoha_ppe_foe_get_entry()
NFC: nci: uart: Set tty->disc_data only in success path
calipso: Fix null-ptr-deref in calipso_req_{set,del}attr().
MAINTAINERS: Remove Shannon Nelson from MAINTAINERS file
net: lan743x: fix potential out-of-bounds write in lan743x_ptp_io_event_clock_get()
eth: fbnic: avoid double free when failing to DMA-map FW msg
tcp: fix passive TFO socket having invalid NAPI ID
selftests: net: add test for passive TFO socket NAPI ID
selftests: net: add passive TFO test binary
selftests: netdevsim: improve lib.sh include in peer.sh
tipc: fix null-ptr-deref when acquiring remote ip of ethernet bearer
Octeontx2-pf: Fix Backpresure configuration
net: ftgmac100: select FIXED_PHY
net: ethtool: remove duplicate defines for family info
...
Lorenzo Bianconi [Thu, 19 Jun 2025 07:07:25 +0000 (09:07 +0200)]
net: airoha: Differentiate hwfd buffer size for QDMA0 and QDMA1
EN7581 SoC allows configuring the size and the number of buffers in
hwfd payload queue for both QDMA0 and QDMA1.
In order to reduce the required DRAM used for hwfd buffers queues and
decrease the memory footprint, differentiate hwfd buffer size for QDMA0
and QDMA1 and reduce hwfd buffer size to 1KB for QDMA1 (WAN) while
maintaining 2KB for QDMA0 (LAN).
Lorenzo Bianconi [Thu, 19 Jun 2025 07:07:24 +0000 (09:07 +0200)]
net: airoha: Compute number of descriptors according to reserved memory size
In order to not exceed the reserved memory size for hwfd buffers,
compute the number of hwfd buffers/descriptors according to the
reserved memory size and the size of each hwfd buffer (2KB).
Fixes: 3a1ce9e3d01b ("net: airoha: Add the capability to allocate hwfd buffers via reserved-memory") Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20250619-airoha-hw-num-desc-v4-1-49600a9b319a@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 19 Jun 2025 15:38:40 +0000 (08:38 -0700)]
Merge tag 'wireless-2025-06-18' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
More fixes:
- ath12k
- avoid busy-waiting
- activate correct number of links
- iwlwifi
- iwldvm regression (lots of warnings)
- iwlmld merge damage regression (crash)
- fix build with some old gcc versions
- carl9170: don't talk to device w/o FW [syzbot]
- ath6kl: remove bad FW WARN [syzbot]
- ieee80211: use variable-length arrays [syzbot]
- mac80211
- remove WARN on delayed beacon update [syzbot]
- drop OCB frames with invalid source [syzbot]
* tag 'wireless-2025-06-18' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: iwlwifi: Fix incorrect logic on cmd_ver range checking
wifi: iwlwifi: dvm: restore n_no_reclaim_cmds setting
wifi: iwlwifi: cfg: Limit cb_size to valid range
wifi: iwlwifi: restore missing initialization of async_handlers_list (again)
wifi: ath6kl: remove WARN on bad firmware input
wifi: carl9170: do not ping device which has failed to load firmware
wifi: ath12k: don't wait when there is no vdev started
wifi: ath12k: don't use static variables in ath12k_wmi_fw_stats_process()
wifi: ath12k: avoid burning CPU while waiting for firmware stats
wifi: ath12k: fix documentation on firmware stats
wifi: ath12k: don't activate more links than firmware supports
wifi: ath12k: update link active in case two links fall on the same MAC
wifi: ath12k: support WMI_MLO_LINK_SET_ACTIVE_CMDID command
wifi: ath12k: update freq range for each hardware mode
wifi: ath12k: parse and save sbs_lower_band_end_freq from WMI_SERVICE_READY_EXT2_EVENTID event
wifi: ath12k: parse and save hardware mode info from WMI_SERVICE_READY_EXT_EVENTID event for later use
wifi: ath12k: Avoid CPU busy-wait by handling VDEV_STAT and BCN_STAT
wifi: mac80211: don't WARN for late channel/color switch
wifi: mac80211: drop invalid source address OCB frames
wifi: remove zero-length arrays
====================
Jakub Kicinski [Wed, 18 Jun 2025 17:17:46 +0000 (10:17 -0700)]
tools: ynl: fix mixing ops and notifications on one socket
The multi message support loosened the connection between the request
and response handling, as we can now submit multiple requests before
we start processing responses. Passing the attr set to NlMsgs decoding
no longer makes sense (if it ever did), attr set may differ message
by messsage. Isolate the part of decoding responsible for attr-set
specific interpretation and call it once we identified the correct op.
Without this fix performing SET operation on an ethtool socket, while
being subscribed to notifications causes:
# File "tools/net/ynl/pyynl/lib/ynl.py", line 1096, in _op
# Exception| return self._ops(ops)[0]
# Exception| ~~~~~~~~~^^^^^
# File "tools/net/ynl/pyynl/lib/ynl.py", line 1040, in _ops
# Exception| nms = NlMsgs(reply, attr_space=op.attr_set)
# Exception| ^^^^^^^^^^^
The value of op we use on line 1040 is stale, it comes form the previous
loop. If a notification comes before a response we will update op to None
and the next iteration thru the loop will break with the trace above.
Fixes: 6fda63c45fe8 ("tools/net/ynl: fix cli.py --subscribe feature") Fixes: ba8be00f68f5 ("tools/net/ynl: Add multi message support to ynl") Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250618171746.1201403-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 18 Jun 2025 14:08:43 +0000 (14:08 +0000)]
net: atm: add lec_mutex
syzbot found its way in net/atm/lec.c, and found an error path
in lecd_attach() could leave a dangling pointer in dev_lec[].
Add a mutex to protect dev_lecp[] uses from lecd_attach(),
lec_vcc_attach() and lec_mcast_attach().
Following patch will use this mutex for /proc/net/atm/lec.
BUG: KASAN: slab-use-after-free in lecd_attach net/atm/lec.c:751 [inline]
BUG: KASAN: slab-use-after-free in lane_ioctl+0x2224/0x23e0 net/atm/lec.c:1008
Read of size 8 at addr ffff88807c7b8e68 by task syz.1.17/6142
David Thompson [Wed, 18 Jun 2025 13:59:02 +0000 (13:59 +0000)]
mlxbf_gige: return EPROBE_DEFER if PHY IRQ is not available
The message "Error getting PHY irq. Use polling instead"
is emitted when the mlxbf_gige driver is loaded by the
kernel before the associated gpio-mlxbf driver, and thus
the call to get the PHY IRQ fails since it is not yet
available. The driver probe() must return -EPROBE_DEFER
if acpi_dev_gpio_irq_get_by() returns the same.
Fixes: 6c2a6ddca763 ("net: mellanox: mlxbf_gige: Replace non-standard interrupt handling") Signed-off-by: David Thompson <davthompson@nvidia.com> Reviewed-by: Asmaa Mnebhi <asmaa@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250618135902.346-1-davthompson@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Krzysztof Kozlowski [Wed, 18 Jun 2025 07:36:50 +0000 (09:36 +0200)]
NFC: nci: uart: Set tty->disc_data only in success path
Setting tty->disc_data before opening the NCI device means we need to
clean it up on error paths. This also opens some short window if device
starts sending data, even before NCIUARTSETDRIVER IOCTL succeeded
(broken hardware?). Close the window by exposing tty->disc_data only on
the success path, when opening of the NCI device and try_module_get()
succeeds.
The code differs in error path in one aspect: tty->disc_data won't be
ever assigned thus NULL-ified. This however should not be relevant
difference, because of "tty->disc_data=NULL" in nci_uart_tty_open().
Kuniyuki Iwashima [Tue, 17 Jun 2025 22:40:42 +0000 (15:40 -0700)]
calipso: Fix null-ptr-deref in calipso_req_{set,del}attr().
syzkaller reported a null-ptr-deref in sock_omalloc() while allocating
a CALIPSO option. [0]
The NULL is of struct sock, which was fetched by sk_to_full_sk() in
calipso_req_setattr().
Since commit a1a5344ddbe8 ("tcp: avoid two atomic ops for syncookies"),
reqsk->rsk_listener could be NULL when SYN Cookie is returned to its
client, as hinted by the leading SYN Cookie log.
Here are 3 options to fix the bug:
1) Return 0 in calipso_req_setattr()
2) Return an error in calipso_req_setattr()
3) Alaways set rsk_listener
1) is no go as it bypasses LSM, but 2) effectively disables SYN Cookie
for CALIPSO. 3) is also no go as there have been many efforts to reduce
atomic ops and make TCP robust against DDoS. See also commit 3b24d854cb35
("tcp/dccp: do not touch listener sk_refcnt under synflood").
As of the blamed commit, SYN Cookie already did not need refcounting,
and no one has stumbled on the bug for 9 years, so no CALIPSO user will
care about SYN Cookie.
Let's return an error in calipso_req_setattr() and calipso_req_delattr()
in the SYN Cookie case.
This can be reproduced by [1] on Fedora and now connect() of nc times out.
Fixes: e1adea927080 ("calipso: Allow request sockets to be relabelled by the lsm.") Reported-by: syzkaller <syzkaller@googlegroups.com> Reported-by: John Cheung <john.cs.hey@gmail.com> Closes: https://lore.kernel.org/netdev/CAP=Rh=MvfhrGADy+-WJiftV2_WzMH4VEhEFmeT28qY+4yxNu4w@mail.gmail.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://patch.msgid.link/20250617224125.17299-1-kuni1840@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Shannon Nelson [Mon, 16 Jun 2025 22:44:37 +0000 (15:44 -0700)]
MAINTAINERS: Remove Shannon Nelson from MAINTAINERS file
Brett Creeley is taking ownership of AMD/Pensando drivers while I wander
off into the sunset with my retirement this month. I'll still keep an
eye out on a few topics for awhile, and maybe do some free-lance work in
the future.
Meanwhile, thank you all for the fun and support and the many learning
opportunities :-).
Special thanks go to DaveM for merging my first patch long ago, the big
ionic patchset a few years ago, and my last patchset last week.
When the GuC fails to load we declare the device wedged. However, the
very first GuC load attempt on GT0 (from xe_gt_init_hwconfig) is done
before the GT1 GuC objects are initialized, so things go bad when the
wedge code attempts to cleanup GT1. To fix this, check the initialization
status in the functions called during wedge.
Fixes: 7dbe8af13c18 ("drm/xe: Wedge the entire device") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Zhanjun Dong <zhanjun.dong@intel.com> Cc: stable@vger.kernel.org # v6.12+: 1e1981b16bb1: drm/xe: Fix taking invalid lock on wedge Cc: stable@vger.kernel.org # v6.12+ Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250611214453.1159846-2-daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 0b93b7dcd9eb888a6ac7546560877705d4ad61bf) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Alexey Kodanev [Mon, 16 Jun 2025 11:37:43 +0000 (11:37 +0000)]
net: lan743x: fix potential out-of-bounds write in lan743x_ptp_io_event_clock_get()
Before calling lan743x_ptp_io_event_clock_get(), the 'channel' value
is checked against the maximum value of PCI11X1X_PTP_IO_MAX_CHANNELS(8).
This seems correct and aligns with the PTP interrupt status register
(PTP_INT_STS) specifications.
However, lan743x_ptp_io_event_clock_get() writes to ptp->extts[] with
only LAN743X_PTP_N_EXTTS(4) elements, using channel as an index:
====================
net: fix passive TFO socket having invalid NAPI ID
Found a bug where an accepted passive TFO socket returns an invalid NAPI
ID (i.e. 0) from SO_INCOMING_NAPI_ID. Add a selftest for this using
netdevsim and fix the bug.
Patch 1 is a drive-by fix for the lib.sh include in an existing
drivers/net/netdevsim/peer.sh selftest.
Patch 2 adds a test binary for a simple TFO server/client.
Patch 3 adds a selftest for checking that the NAPI ID of a passive TFO
socket is valid. This will currently fail.
Patch 4 adds a fix for the bug.
====================
David Wei [Tue, 17 Jun 2025 21:21:02 +0000 (14:21 -0700)]
tcp: fix passive TFO socket having invalid NAPI ID
There is a bug with passive TFO sockets returning an invalid NAPI ID 0
from SO_INCOMING_NAPI_ID. Normally this is not an issue, but zero copy
receive relies on a correct NAPI ID to process sockets on the right
queue.
Fix by adding a sk_mark_napi_id_set().
Fixes: e5907459ce7e ("tcp: Record Rx hash and NAPI ID in tcp_child_process") Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250617212102.175711-5-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Thu, 19 Jun 2025 00:47:27 +0000 (17:47 -0700)]
Merge tag '6.16-rc2-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- Fix alternate data streams bug
- Important fix for null pointer deref with Kerberos authentication
- Fix oops in smbdirect (RDMA) in free_transport
* tag '6.16-rc2-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: handle set/get info file for streamed file
ksmbd: fix null pointer dereference in destroy_previous_session
ksmbd: add free_transport ops in ksmbd connection
Linus Torvalds [Wed, 18 Jun 2025 21:31:16 +0000 (14:31 -0700)]
Merge tag 'driver-core-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core fixes from Danilo Krummrich:
- Fix a race condition in Devres::drop(). This depends on two other
patches:
- (Minimal) Rust abstractions for struct completion
- Let Revocable indicate whether its data is already being revoked
- Fix Devres to avoid exposing the internal Revocable
- Add .mailmap entry for Danilo Krummrich
- Add Madhavan Srinivasan to embargoed-hardware-issues.rst
* tag 'driver-core-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core:
Documentation: embargoed-hardware-issues.rst: Add myself for Power
mailmap: add entry for Danilo Krummrich
rust: devres: do not dereference to the internal Revocable
rust: devres: fix race in Devres::drop()
rust: revocable: indicate whether `data` has been revoked already
rust: completion: implement initial abstraction
Linus Torvalds [Wed, 18 Jun 2025 21:25:50 +0000 (14:25 -0700)]
Merge tag 'cgroup-for-6.16-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo:
- In cgroup1 freezer, a task migrating into a frozen cgroup might not
get frozen immediately due to the wrong operation order. Fix it.
* tag 'cgroup-for-6.16-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup,freezer: fix incomplete freezing when attaching tasks
Linus Torvalds [Wed, 18 Jun 2025 21:22:31 +0000 (14:22 -0700)]
Merge tag 'wq-for-6.16-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo:
- Fix missed early init of wq_isolated_cpumask
* tag 'wq-for-6.16-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Initialize wq_isolated_cpumask in workqueue_init_early()
Each link supports multiple channels for example RPM link supports
16 channels. In case of link oversubsribe, NIX will assert backpressure
on receive channels.
The previous patch considered a single channel per link, resulting in
backpressure not being enabled on the remaining channels
Fixes: a7ef63dbd588 ("octeontx2-af: Disable backpressure between CPT and NIX") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250617063403.3582210-1-hkelam@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Wed, 18 Jun 2025 21:17:15 +0000 (14:17 -0700)]
Merge tag 'sched_ext-for-6.16-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fixes from Tejun Heo:
- Fix a couple bugs in cgroup cpu.weight support
- Add the new sched-ext@lists.linux.dev to MAINTAINERS
* tag 'sched_ext-for-6.16-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
sched_ext, sched/core: Don't call scx_group_set_weight() prematurely from sched_create_group()
sched_ext: Make scx_group_set_weight() always update tg->scx.weight
sched_ext: Update mailing list entry in MAINTAINERS
Jakub Kicinski [Wed, 18 Jun 2025 21:15:15 +0000 (14:15 -0700)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-06-17 (ice, e1000e)
For ice:
Krishna Kumar modifies aRFS match criteria to correctly identify
matching filters.
Grzegorz fixes a memory leak in eswitch legacy mode.
For e1000e:
Vitaly sets clock frequency on some Nahum systems which may misreport
their value.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
e1000e: set fixed clock frequency indication for Nahum 11 and Nahum 13
ice: fix eswitch code memory leak in reset scenario
net: ice: Perform accurate aRFS flow match
====================
Heiner Kallweit [Tue, 17 Jun 2025 18:20:17 +0000 (20:20 +0200)]
net: ftgmac100: select FIXED_PHY
Depending on e.g. DT configuration this driver uses a fixed link.
So we shouldn't rely on the user to enable FIXED_PHY, select it in
Kconfig instead. We may end up with a non-functional driver otherwise.
Jakub Kicinski [Tue, 17 Jun 2025 20:22:40 +0000 (13:22 -0700)]
net: ethtool: remove duplicate defines for family info
Commit under fixes switched to uAPI generation from the YAML
spec. A number of custom defines were left behind, mostly
for commands very hard to express in YAML spec.
Among what was left behind was the name and version of
the generic netlink family. Problem is that the codegen
always outputs those values so we ended up with a duplicated,
differently named set of defines.
Provide naming info in YAML and remove the incorrect defines.
Linus Torvalds [Wed, 18 Jun 2025 21:09:22 +0000 (14:09 -0700)]
Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fixes from Eric Biggers:
- Fix a regression in the arm64 Poly1305 code
- Fix a couple compiler warnings
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto/poly1305: Fix arm64's poly1305_blocks_arch()
lib/crypto/curve25519-hacl64: Disable KASAN with clang-17 and older
lib/crypto: Annotate crypto strings with nonstring
When tasks are migrated to a frozen cgroup, the freezer fails to
immediately freeze the tasks, causing the cgroup to remain in the
"FREEZING".
The freeze_task() function is called before clearing the CGROUP_FROZEN
flag. This causes the freezing() check to incorrectly return false,
preventing __freeze_task() from being invoked for the migrated task.
To fix this issue, clear the CGROUP_FROZEN state before calling
freeze_task().
Linus Torvalds [Wed, 18 Jun 2025 17:32:01 +0000 (10:32 -0700)]
Merge tag 'selinux-pr-20250618' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fix from Paul Moore:
"A small SELinux patch to resolve a UBSAN warning in the
xfrm/labeled-IPsec code"
* tag 'selinux-pr-20250618' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: fix selinux_xfrm_alloc_user() to set correct ctx_len
Linus Torvalds [Wed, 18 Jun 2025 17:24:50 +0000 (10:24 -0700)]
Merge tag 'ftrace-v6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ftrace fix from Steven Rostedt:
- Do not blindly enable function_graph tracer when updating
funcgraph-args
When the option to trace function arguments in the function graph
trace is updated, it requires the function graph tracer to switch its
callback routine. It disables function graph tracing, updates the
callback and then re-enables function graph tracing.
The issue is that it doesn't check if function graph tracing is
currently enabled or not. If it is not enabled, it will try to
disable it and re-enable it (which will actually enable it even
though it is not the current tracer). This causes an issue in the
accounting and will trigger a WARN_ON() if the function tracer is
enabled after that.
* tag 'ftrace-v6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
fgraph: Do not enable function_graph tracer when setting funcgraph-args
Linus Torvalds [Wed, 18 Jun 2025 17:21:39 +0000 (10:21 -0700)]
Merge tag 'ata-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fixes from Niklas Cassel:
- Force PIO for ATAPI devices on VT6415/VT6330 as the controller locks
up on ATAPI DMA (Tasos)
- Fix ACPI PATA cable type detection such that the controller is not
forced down to a slow transfer mode (Tasos)
- Fix build error on 32-bit UML (Johannes)
- Fix a PCI region leak in the pata_macio driver so that the driver no
longer fails to load after rmmod (Philipp)
- Use correct DMI BIOS build date for ThinkPad W541 quirk (me)
- Disallow LPM for ASUSPRO-D840SA motherboard as this board
interestingly enough gets graphical corruptions on the iGPU when LPM
is enabled (me)
- Disallow LPM for Asus B550-F motherboard as this board will get
command timeouts on ports 5 and 6, yet LPM with the same drive works
fine on all other ports (Mikko)
* tag 'ata-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: ahci: Disallow LPM for Asus B550-F motherboard
ata: ahci: Disallow LPM for ASUSPRO-D840SA motherboard
ata: ahci: Use correct BIOS build date for ThinkPad W541 quirk
ata: pata_macio: Fix PCI region leak
ata: pata_cs5536: fix build on 32-bit UML
ata: libata-acpi: Do not assume 40 wire cable if no devices are enabled
ata: pata_via: Force PIO for ATAPI devices on VT6415/VT6330
Alex Deucher [Mon, 16 Jun 2025 19:47:13 +0000 (15:47 -0400)]
drm/amdgpu/sdma5.2: init engine reset mutex
Missing the mutex init.
Fixes: 47454f2dc0bf ("drm/amdgpu: Register the new sdma function pointers for sdma_v5_2") Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ea685ff30a51a25dd9be90786933ada49a088f65)
Jay Cornwall [Wed, 11 Jun 2025 14:52:14 +0000 (09:52 -0500)]
drm/amdkfd: Fix race in GWS queue scheduling
q->gws is not updated atomically with qpd->mapped_gws_queue. If a
runlist is created between pqm_set_gws and update_queue it will
contain a queue which uses GWS in a process with no GWS allocated.
This will result in a scheduler hang.
Use q->properties.is_gws which is changed while holding the DQM lock.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b98370220eb3110e82248e3354e16a489a492cfb) Cc: stable@vger.kernel.org
Alex Deucher [Mon, 16 Jun 2025 19:43:55 +0000 (15:43 -0400)]
drm/amdgpu/sdma5: init engine reset mutex
Missing the mutex init.
Fixes: e56d4bf57fab ("drm/amdgpu/: drm/amdgpu: Register the new sdma function pointers for sdma_v5_0") Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3f4caf092f02f0de169c6122639af481c7edc8f9)
Alex Deucher [Mon, 2 Jun 2025 15:31:52 +0000 (11:31 -0400)]
drm/amdgpu: switch job hw_fence to amdgpu_fence
Use the amdgpu fence container so we can store additional
data in the fence. This also fixes the start_time handling
for MCBP since we were casting the fence to an amdgpu_fence
and it wasn't.
Fixes: 3f4c175d62d8 ("drm/amdgpu: MCBP based on DRM scheduler (v9)") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bf1cd14f9e2e1fdf981eed273ddd595863f5288c) Cc: stable@vger.kernel.org
Jesse Zhang [Mon, 16 Jun 2025 11:21:41 +0000 (19:21 +0800)]
drm/amdgpu: Fix SDMA UTC_L1 handling during start/stop sequences
This commit makes two key fixes to SDMA v4.4.2 handling:
1. disable UTC_L1 in sdma_cntl register when stopping SDMA engines
by reading the current value before modifying UTC_L1_ENABLE bit.
2. Ensure UTC_L1_ENABLE is consistently managed by:
- Adding the missing register write when enabling UTC_L1 during start
- Keeping UTC_L1 enabled by default as per hardware requirements
v2: Correct SDMA_CNTL setting (Philip)
Suggested-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 375bf564654e85a7b1b0657b191645b3edca1bda) Cc: stable@vger.kernel.org
Jesse Zhang [Wed, 11 Jun 2025 07:07:11 +0000 (15:07 +0800)]
drm/amdgpu: Use logical instance ID for SDMA v4_4_2 queue operations
Simplify SDMA v4_4_2 queue reset and stop operations by:
1. Removing GET_INST(SDMA0) conversion for ring->me
2. Using the logical instance ID (ring->me) directly
3. Maintaining consistent behavior with other SDMA queue operations
This change aligns with the existing queue handling logic where
ring->me already represents the correct instance identifier.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3bab282dfe25dff7a55add432f56898505a6cc6c) Cc: stable@vger.kernel.org
Jesse Zhang [Wed, 11 Jun 2025 07:02:09 +0000 (15:02 +0800)]
drm/amdgpu: Fix SDMA engine reset with logical instance ID
This commit makes the following improvements to SDMA engine reset handling:
1. Clarifies in the function documentation that instance_id refers to a logical ID
2. Adds conversion from logical to physical instance ID before performing reset
using GET_INST(SDMA0, instance_id) macro
3. Improves error messaging to indicate when a logical instance reset fails
4. Adds better code organization with blank lines for readability
The change ensures proper SDMA engine reset by using the correct physical
instance ID while maintaining the logical ID interface for callers.
V2: Remove harvest_config check and convert directly to physical instance (Lijo)
Suggested-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5efa6217c239ed1ceec0f0414f9b6f6927387dfc) Cc: stable@vger.kernel.org
Signed-off-by: Frank Min <Frank.Min@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit fb5ec2174d70a8989bc207d257db90ffeca3b163) Cc: stable@vger.kernel.org
Frank Min [Wed, 4 Jun 2025 13:00:44 +0000 (21:00 +0800)]
drm/amdgpu: Add kicker device detection
1. add kicker device list
2. add kicker device checking helper function
Signed-off-by: Frank Min <Frank.Min@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 09aa2b408f4ab689c3541d22b0968de0392ee406) Cc: stable@vger.kernel.org
Mario Limonciello [Thu, 29 May 2025 14:46:32 +0000 (09:46 -0500)]
drm/amd/display: Export full brightness range to userspace
[WHY]
Userspace currently is offered a range from 0-0xFF but the PWM is
programmed from 0-0xFFFF. This can be limiting to some software
that wants to apply greater granularity.
[HOW]
Convert internally to firmware values only when mapping custom
brightness curves because these are in 0-0xFF range. Advertise full
PWM range to userspace.
Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Roman Li <roman.li@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8dbd72cb790058ce52279af38a43c2b302fdd3e5) Cc: stable@vger.kernel.org
Mario Limonciello [Thu, 29 May 2025 16:33:44 +0000 (11:33 -0500)]
drm/amd/display: Only read ACPI backlight caps once
[WHY]
Backlight caps are read already in amdgpu_dm_update_backlight_caps().
They may be updated by update_connector_ext_caps(). Reading again when
registering backlight device may cause wrong values to be used.
[HOW]
Use backlight caps already registered to the dm.
Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Roman Li <roman.li@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 148144f6d2f14b02eaaa39b86bbe023cbff350bd) Cc: stable@vger.kernel.org
Nicholas Kazlauskas [Wed, 21 May 2025 20:40:25 +0000 (16:40 -0400)]
drm/amd/display: Add more checks for DSC / HUBP ONO guarantees
[WHY]
For non-zero DSC instances it's possible that the HUBP domain required
to drive it for sequential ONO ASICs isn't met, potentially causing
the logic to the tile to enter an undefined state leading to a system
hang.
[HOW]
Add more checks to ensure that the HUBP domain matching the DSC instance
is appropriately powered.
Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Duncan Ma <duncan.ma@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit da63df07112e5a9857a8d2aaa04255c4206754ec) Cc: stable@vger.kernel.org
Michael Strauss [Wed, 26 Feb 2025 15:03:48 +0000 (10:03 -0500)]
drm/amd/display: Get LTTPR IEEE OUI/Device ID From Closest LTTPR To Host
[WHY]
These fields are read for the explicit purpose of detecting embedded LTTPRs
(i.e. between host ASIC and the user-facing port), and thus need to
calculate the correct DPCD address offset based on LTTPR count to target
the appropriate LTTPR's DPCD register space with these queries.
[HOW]
Cascaded LTTPRs in a link each snoop and increment LTTPR count when queried
via DPCD read, so an LTTPR embedded in a source device (e.g. USB4 port on a
laptop) will always be addressible using the max LTTPR count seen by the
host. Therefore we simply need to use a recently added helper function to
calculate the correct DPCD address to target potentially embedded LTTPRs
based on the received LTTPR count.
Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Michael Strauss <michael.strauss@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 791897f5c77a2a65d0e500be4743af2ddf6eb061) Cc: stable@vger.kernel.org
Jesse.Zhang [Thu, 29 May 2025 03:27:37 +0000 (11:27 +0800)]
drm/amdkfd: move SDMA queue reset capability check to node_show
Relocate the per-SDMA queue reset capability check from
kfd_topology_set_capabilities() to node_show() to ensure we read the
latest value of sdma.supported_reset after all IP blocks are initialized.
Fixes: ceb7114c961b ("drm/amdkfd: flag per-sdma queue reset supported to user space") Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e17df7b086cf908cedd919f448da9e00028419bb)
Steven Rostedt [Wed, 18 Jun 2025 11:38:01 +0000 (07:38 -0400)]
fgraph: Do not enable function_graph tracer when setting funcgraph-args
When setting the funcgraph-args option when function graph tracer is net
enabled, it incorrectly enables it. Worse, it unregisters itself when it
was never registered. Then when it gets enabled again, it will register
itself a second time causing a WARNing.
Have the setting of the funcgraph-args check if function_graph tracer is
the current tracer of the instance, and if not, do nothing, as there's
nothing to do (the option is checked when function_graph tracing starts).
Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/20250618073801.057ea636@gandalf.local.home Fixes: c7a60a733c373 ("ftrace: Have funcgraph-args take affect during tracing") Closes: https://lore.kernel.org/all/4ab1a7bdd0174ab09c7b0d68cdbff9a4@huawei.com/ Reported-by: Changbin Du <changbin.du@huawei.com> Tested-by: Changbin Du <changbin.du@huawei.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Penglei Jiang [Tue, 17 Jun 2025 16:56:44 +0000 (09:56 -0700)]
io_uring: fix potential page leak in io_sqe_buffer_register()
If allocation of the 'imu' fails, then the existing pages aren't
unpinned in the error path. This is mostly a theoretical issue,
requiring fault injection to hit.
Move unpin_user_pages() to unified error handling to fix the page leak
issue.
Neal Cardwell [Fri, 13 Jun 2025 19:30:56 +0000 (15:30 -0400)]
tcp: fix tcp_packet_delayed() for tcp_is_non_sack_preventing_reopen() behavior
After the following commit from 2024:
commit e37ab7373696 ("tcp: fix to allow timestamp undo if no retransmits were sent")
...there was buggy behavior where TCP connections without SACK support
could easily see erroneous undo events at the end of fast recovery or
RTO recovery episodes. The erroneous undo events could cause those
connections to suffer repeated loss recovery episodes and high
retransmit rates.
The problem was an interaction between the non-SACK behavior on these
connections and the undo logic. The problem is that, for non-SACK
connections at the end of a loss recovery episode, if snd_una ==
high_seq, then tcp_is_non_sack_preventing_reopen() holds steady in
CA_Recovery or CA_Loss, but clears tp->retrans_stamp to 0. Then upon
the next ACK the "tcp: fix to allow timestamp undo if no retransmits
were sent" logic saw the tp->retrans_stamp at 0 and erroneously
concluded that no data was retransmitted, and erroneously performed an
undo of the cwnd reduction, restoring cwnd immediately to the value it
had before loss recovery. This caused an immediate burst of traffic
and build-up of queues and likely another immediate loss recovery
episode.
This commit fixes tcp_packet_delayed() to ignore zero retrans_stamp
values for non-SACK connections when snd_una is at or above high_seq,
because tcp_is_non_sack_preventing_reopen() clears retrans_stamp in
this case, so it's not a valid signal that we can undo.
Note that the commit named in the Fixes footer restored long-present
behavior from roughly 2005-2019, so apparently this bug was present
for a while during that era, and this was simply not caught.
Fixes: e37ab7373696 ("tcp: fix to allow timestamp undo if no retransmits were sent") Reported-by: Eric Wheeler <netdev@lists.ewheeler.net> Closes: https://lore.kernel.org/netdev/64ea9333-e7f9-0df-b0f2-8d566143acab@ewheeler.net/ Signed-off-by: Neal Cardwell <ncardwell@google.com> Co-developed-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Thu, 22 May 2025 12:17:03 +0000 (13:17 +0100)]
wifi: iwlwifi: Fix incorrect logic on cmd_ver range checking
The current cmd_ver range checking on cmd_ver < 1 && cmd_ver > 3 can
never be true because the logical operator && is being used, cmd_ver
can never be less than 1 and also greater than 3. Fix this by using
the logical || operator.
Apparently I accidentally removed this setting in my transport
configuration rework, leading to an endless stream of warnings
from the PCIe code when relevant notifications are received by
the driver from firmware. Restore it.
Pei Xiao [Tue, 27 May 2025 08:03:55 +0000 (16:03 +0800)]
wifi: iwlwifi: cfg: Limit cb_size to valid range
on arm64 defconfig build failed with gcc-8:
drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info.c:208:3:
include/linux/bitfield.h:195:3: error: call to '__field_overflow'
declared with attribute error: value doesn't fit into mask
__field_overflow(); \
^~~~~~~~~~~~~~~~~~
include/linux/bitfield.h:215:2: note: in expansion of macro '____MAKE_OP'
____MAKE_OP(u##size,u##size,,)
^~~~~~~~~~~
include/linux/bitfield.h:218:1: note: in expansion of macro '__MAKE_OP'
__MAKE_OP(32)
Johannes Berg [Wed, 18 Jun 2025 07:05:26 +0000 (09:05 +0200)]
Merge tag 'ath-current-20250617' of git://git.kernel.org/pub/scm/linux/kernel/git/ath/ath
Jeff Johnson says:
==================
ath.git updates for v6.16-rc3
Fix the following 3 issues:
wifi: ath12k: avoid burning CPU while waiting for firmware stats
wifi: ath12k: don't activate more links than firmware supports
wifi: carl9170: do not ping device which has failed to load firmware
==================
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Tue, 17 Jun 2025 09:45:29 +0000 (11:45 +0200)]
wifi: ath6kl: remove WARN on bad firmware input
If the firmware gives bad input, that's nothing to do with
the driver's stack at this point etc., so the WARN_ON()
doesn't add any value. Additionally, this is one of the
top syzbot reports now. Just print a message, and as an
added bonus, print the sizes too.
Kuniyuki Iwashima [Mon, 16 Jun 2025 18:21:14 +0000 (11:21 -0700)]
atm: atmtcp: Free invalid length skb in atmtcp_c_send().
syzbot reported the splat below. [0]
vcc_sendmsg() copies data passed from userspace to skb and passes
it to vcc->dev->ops->send().
atmtcp_c_send() accesses skb->data as struct atmtcp_hdr after
checking if skb->len is 0, but it's not enough.
Also, when skb->len == 0, skb and sk (vcc) were leaked because
dev_kfree_skb() is not called and sk_wmem_alloc adjustment is missing
to revert atm_account_tx() in vcc_sendmsg(), which is expected
to be done in atm_pop_raw().
Let's properly free skb with an invalid length in atmtcp_c_send().
GPU:
- SUBMIT ioctl error path leak fixes
- drm half of stall-on-fault fixes. Note there is a soft dependency,
to get correct mmu fault devcoredumps, on arm-smmu changes which
are not in this branch, but have already been merged by Linus. So
by the time Linus merges this, everything should be peachy.
- a7xx: Missing CP_RESET_CONTEXT_STATE
- Skip GPU component bind if GPU is not in the device table.
Dmitry Antipov [Mon, 16 Jun 2025 18:12:05 +0000 (21:12 +0300)]
wifi: carl9170: do not ping device which has failed to load firmware
Syzkaller reports [1, 2] crashes caused by an attempts to ping
the device which has failed to load firmware. Since such a device
doesn't pass 'ieee80211_register_hw()', an internal workqueue
managed by 'ieee80211_queue_work()' is not yet created and an
attempt to queue work on it causes null-ptr-deref.
Fixes: e4a668c59080 ("carl9170: fix spurious restart due to high latency") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Acked-by: Christian Lamparter <chunkeey@gmail.com> Link: https://patch.msgid.link/20250616181205.38883-1-dmantipov@yandex.ru Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Baochen Qiang [Thu, 12 Jun 2025 01:31:52 +0000 (09:31 +0800)]
wifi: ath12k: don't wait when there is no vdev started
For WMI_REQUEST_VDEV_STAT request, firmware might split response into
multiple events dut to buffer limit, hence currently in
ath12k_wmi_fw_stats_process() host waits until all events received. In
case there is no vdev started, this results in that below condition
would never get satisfied
Baochen Qiang [Thu, 12 Jun 2025 01:31:51 +0000 (09:31 +0800)]
wifi: ath12k: don't use static variables in ath12k_wmi_fw_stats_process()
Currently ath12k_wmi_fw_stats_process() is using static variables to count
firmware stat events. Taking num_vdev as an example, if for whatever
reason (say ar->num_started_vdevs is 0 or firmware bug etc.) the following
condition
(++num_vdev) == total_vdevs_started
is not met, is_end is not set thus num_vdev won't be cleared. Next time
when firmware stats is requested again, even if everything is working
fine, failure is expected due to the condition above will never be
satisfied.
The same applies to num_bcn as well.
Change to use non-static counters and reset them each time before firmware
stats is requested.
Baochen Qiang [Thu, 12 Jun 2025 01:31:50 +0000 (09:31 +0800)]
wifi: ath12k: avoid burning CPU while waiting for firmware stats
ath12k_mac_get_fw_stats() is busy polling fw_stats_done flag while waiting
firmware finishing sending all events. This is not good as CPU is
monopolized and kept burning during the wait.
Baochen Qiang [Thu, 12 Jun 2025 01:31:49 +0000 (09:31 +0800)]
wifi: ath12k: fix documentation on firmware stats
Regarding the firmware stats events handling, the comment in
ath12k_mac_get_fw_stats() says host determines whether all events have been
received based on 'end' tag in TLV. This is wrong as there is no such tag
at all, actually host makes the decision totally by itself based on the
stats type and active pdev/vdev counts etc.
this results in firmware crash when the number of links activated on the
same device is more than supported.
Since firmware supports activating at most 2 links for a ML connection,
to avoid firmware crash, host needs to select 2 links out of the useful
links. As the assoc link has already been chosen, the question becomes
how to determine partner links. A straightforward principle applied
here is that the resulted combination should achieve the best throughput.
For that purpose, ideally various factors like bandwidth, RSSI etc should
be considered. But that would be too complicate. To make it easy, the
choice is to only take hardware modes into consideration.
The SBS (single band simultaneously) mode frequency range covers 5 GHz
and 6 GHz bands. In this mode, the two individual MACs are both active,
with one working on 5g-high band and the other on 5g-low band (from
hardware perspective 5 GHz and 6 GHz bands are referred to as a 'large'
single 5 GHz band). The DBS (dual band simultaneously) mode covers 2 GHz
band and the 'large' 5 GHz band, with one MAC working on 2 GHz band and
the other working on 5 GHz band or 6 GHz band. Since 5,6 GHz bands could
provide higher bandwidth than 2 GHz band, the preference is given to SBS
mode. Other hardware modes results in only one working MAC at any given
time, so it is chosen only when both SBS are DBS are not possible.
For each hardware mode, if there are more than one partner candidate,
just choose the first one.
For now only single device MLO case is handled as it is easy. Other cases
could be addressed in the future.
Baochen Qiang [Thu, 22 May 2025 08:54:14 +0000 (16:54 +0800)]
wifi: ath12k: update link active in case two links fall on the same MAC
In case of two links established on the same device in an ML connection,
depending on device's hardware mode capability, it is possible that both
links fall on the same MAC. Currently, no specific action is taken to
address this but just keep both links active. However this would result
in lower throughput compared to even one link, because switching between
these two links on the resulted MAC significantly impacts throughput.
Check if both links fall in the frequency range of a single MAC. If
so, send WMI_MLO_LINK_SET_ACTIVE_CMDID command to firmware such that
firmware can deactivate one of them. Note the decision of which link
getting deactivated is made by firmware, host only sends the vdev
lists.
Baochen Qiang [Thu, 22 May 2025 08:54:13 +0000 (16:54 +0800)]
wifi: ath12k: support WMI_MLO_LINK_SET_ACTIVE_CMDID command
Add WMI_MLO_LINK_SET_ACTIVE_CMDID command. This command allows host to
send required link information to firmware such that firmware can make
decision on activating/deactivating links in various scenarios.
Baochen Qiang [Thu, 22 May 2025 08:54:12 +0000 (16:54 +0800)]
wifi: ath12k: update freq range for each hardware mode
Previous patches parse and save hardware MAC frequency range information
in ath12k_svc_ext_info structure. Such range represents hardware
capability hence needs to be updated based on host information, e.g. guard
the range based on host's low/high boundary.
So update frequency range. The updated range is saved in
ath12k_hw_mode_info structure and would be used when doing vdev activation
and link selection in following patches.
Baochen Qiang [Thu, 22 May 2025 08:54:11 +0000 (16:54 +0800)]
wifi: ath12k: parse and save sbs_lower_band_end_freq from WMI_SERVICE_READY_EXT2_EVENTID event
Firmware sends the boundary between lower and higher bands in
ath12k_wmi_dbs_or_sbs_cap_params structure embedded in
WMI_SERVICE_READY_EXT2_EVENTID event. The boundary is needed when
updating frequency range in the following patch. So parse and save
it for later use. Note ath12k_wmi_dbs_or_sbs_cap_params is placed
after some other structures, so placeholders for them are added
as well.
Baochen Qiang [Thu, 22 May 2025 08:54:10 +0000 (16:54 +0800)]
wifi: ath12k: parse and save hardware mode info from WMI_SERVICE_READY_EXT_EVENTID event for later use
WLAN hardware might support various hardware modes such as DBS (dual
band simultaneously), SBS (single band simultaneously) and DBS_OR_SBS
etc, see enum wmi_host_hw_mode_config_type. Firmware advertises actual
supported modes in WMI_SERVICE_READY_EXT_EVENTID event. For each mode,
firmware advertises frequency range each hardware MAC can operate on.
In MLO case such information is necessary during vdev activation and
link selection (which is done in following patches), so add a new
structure ath12k_svc_ext_info to ath12k_wmi_base, then parse and save
those information to it for later use.
Bjorn Andersson [Tue, 10 Jun 2025 03:06:22 +0000 (22:06 -0500)]
wifi: ath12k: Avoid CPU busy-wait by handling VDEV_STAT and BCN_STAT
When the ath12k driver is built without CONFIG_ATH12K_DEBUG, the
recently refactored stats code can cause any user space application
(such at NetworkManager) to consume 100% CPU for 3 seconds, every time
stats are read.
Commit 'b8a0d83fe4c7 ("wifi: ath12k: move firmware stats out of
debugfs")' moved ath12k_debugfs_fw_stats_request() out of debugfs, by
merging the additional logic into ath12k_mac_get_fw_stats().
Among the added responsibility of ath12k_mac_get_fw_stats() was the
busy-wait for `fw_stats_done`.
Signalling of `fw_stats_done` happens when one of the
WMI_REQUEST_PDEV_STAT, WMI_REQUEST_VDEV_STAT, and WMI_REQUEST_BCN_STAT
messages are received, but the handling of the latter two commands remained
in the debugfs code. As `fw_stats_done` isn't signalled, the calling
processes will spin until the timeout (3 seconds) is reached.
Moving the handling of these two additional responses out of debugfs
resolves the issue.
Jakub Kicinski [Tue, 17 Jun 2025 23:13:12 +0000 (16:13 -0700)]
Merge branch 'ptp_vclock-fixes'
Vladimir Oltean says:
====================
ptp_vclock fixes
While I was intending to test something else related to PTP in net-next
I noticed any time I would run ptp4l on an interface, the kernel would
print "ptp: physical clock is free running" and ptp4l would exit with an
error code.
I then found Jeongjun Park's patch and subsequent explanation provided
to Jakub's question, specifically related to the code which introduced
the breakage I am seeing.
https://lore.kernel.org/netdev/CAO9qdTEjQ5414un7Yw604paECF=6etVKSDSnYmZzZ6Pg3LurXw@mail.gmail.com/
I had to look at the original issue that prompted Jeongjun Park's patch,
and provide an alternative fix for it. Patch 1/2 in this set contains a
logical revert plus the alternative fix, squashed into one.
Patch 2/2 fixes another issue which was confusing during debugging/
characterization, namely: "ok, the kernel clearly thinks that any
physical clock is free-running after this change (despite there being no
vclocks), but why would ptp4l fail to create the clock altogether? Why
not just fail to adjust it?"
By reverting (locally) Jeongjun Park's commit, I could reproduce
the reported lockdep splat using the commands from patch 1/2's commit
message, and this goes away with the reworked implementation.
====================