]> www.infradead.org Git - users/willy/xarray.git/log
users/willy/xarray.git
3 months agonet: hns3: disable interrupt when ptp init failed
Yonglong Liu [Tue, 22 Jul 2025 12:54:21 +0000 (20:54 +0800)]
net: hns3: disable interrupt when ptp init failed

When ptp init failed, we'd better disable the interrupt and clear the
flag, to avoid early report interrupt at next probe.

Fixes: 0bf5eb788512 ("net: hns3: add support for PTP")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250722125423.1270673-3-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: hns3: fix concurrent setting vlan filter issue
Jian Shen [Tue, 22 Jul 2025 12:54:20 +0000 (20:54 +0800)]
net: hns3: fix concurrent setting vlan filter issue

The vport->req_vlan_fltr_en may be changed concurrently by function
hclge_sync_vlan_fltr_state() called in periodic work task and
function hclge_enable_vport_vlan_filter() called by user configuration.
It may cause the user configuration inoperative. Fixes it by protect
the vport->req_vlan_fltr by vport_lock.

Fixes: 2ba306627f59 ("net: hns3: add support for modify VLAN filter state")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250722125423.1270673-2-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agos390/ism: fix concurrency management in ism_cmd()
Halil Pasic [Tue, 22 Jul 2025 16:18:17 +0000 (18:18 +0200)]
s390/ism: fix concurrency management in ism_cmd()

The s390x ISM device data sheet clearly states that only one
request-response sequence is allowable per ISM function at any point in
time.  Unfortunately as of today the s390/ism driver in Linux does not
honor that requirement. This patch aims to rectify that.

This problem was discovered based on Aliaksei's bug report which states
that for certain workloads the ISM functions end up entering error state
(with PEC 2 as seen from the logs) after a while and as a consequence
connections handled by the respective function break, and for future
connection requests the ISM device is not considered -- given it is in a
dysfunctional state. During further debugging PEC 3A was observed as
well.

A kernel message like
[ 1211.244319] zpci: 061a:00:00.0: Event 0x2 reports an error for PCI function 0x61a
is a reliable indicator of the stated function entering error state
with PEC 2. Let me also point out that a kernel message like
[ 1211.244325] zpci: 061a:00:00.0: The ism driver bound to the device does not support error recovery
is a reliable indicator that the ISM function won't be auto-recovered
because the ISM driver currently lacks support for it.

On a technical level, without this synchronization, commands (inputs to
the FW) may be partially or fully overwritten (corrupted) by another CPU
trying to issue commands on the same function. There is hard evidence that
this can lead to DMB token values being used as DMB IOVAs, leading to
PEC 2 PCI events indicating invalid DMA. But this is only one of the
failure modes imaginable. In theory even completely losing one command
and executing another one twice and then trying to interpret the outputs
as if the command we intended to execute was actually executed and not
the other one is also possible.  Frankly, I don't feel confident about
providing an exhaustive list of possible consequences.

Fixes: 684b89bc39ce ("s390/ism: add device driver for internal shared memory")
Reported-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com>
Tested-by: Mahanta Jambigi <mjambigi@linux.ibm.com>
Tested-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250722161817.1298473-1-wintera@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agoselftests: drv-net: wait for iperf client to stop sending
Nimrod Oren [Tue, 22 Jul 2025 12:26:55 +0000 (15:26 +0300)]
selftests: drv-net: wait for iperf client to stop sending

A few packets may still be sent out during the termination of iperf
processes. These late packets cause failures in rss_ctx.py when they
arrive on queues expected to be empty.

Example failure observed:

  Check failed 2 != 0 traffic on inactive queues (context 1):
    [0, 0, 1, 1, 386385, 397196, 0, 0, 0, 0, ...]

  Check failed 4 != 0 traffic on inactive queues (context 2):
    [0, 0, 0, 0, 2, 2, 247152, 253013, 0, 0, ...]

  Check failed 2 != 0 traffic on inactive queues (context 3):
    [0, 0, 0, 0, 0, 0, 1, 1, 282434, 283070, ...]

To avoid such failures, wait until all client sockets for the requested
port are either closed or in the TIME_WAIT state.

Fixes: 847aa551fa78 ("selftests: drv-net: rss_ctx: factor out send traffic and check")
Signed-off-by: Nimrod Oren <noren@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250722122655.3194442-1-noren@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMAINTAINERS: Add in6.h to MAINTAINERS
Kees Cook [Tue, 22 Jul 2025 16:56:49 +0000 (09:56 -0700)]
MAINTAINERS: Add in6.h to MAINTAINERS

My CC-adding automation returned nothing on a future patch to the
include/linux/in6.h file, and I went looking for why. Add the missed
in6.h to MAINTAINERS.

Signed-off-by: Kees Cook <kees@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250722165645.work.047-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge tag 'linux-can-fixes-for-6.16-20250722' of git://git.kernel.org/pub/scm/linux...
Jakub Kicinski [Wed, 23 Jul 2025 01:39:51 +0000 (18:39 -0700)]
Merge tag 'linux-can-fixes-for-6.16-20250722' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2025-07-22

The patch is by me and fixes a potential NULL pointer deref in the CAN
device driver infrastructure. It can be triggered from user space.

* tag 'linux-can-fixes-for-6.16-20250722' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode
====================

Link: https://patch.msgid.link/20250722110059.3664104-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoselftests: netfilter: tone-down conntrack clash test
Florian Westphal [Mon, 21 Jul 2025 22:36:49 +0000 (00:36 +0200)]
selftests: netfilter: tone-down conntrack clash test

The test is supposed to observe that the 'clash_resolve' stat counter
incremented (i.e., the code path was covered).
This check was incorrect, 'conntrack -S' needs to be called in the
revevant namespace, not the initial netns.

The clash resolution logic in conntrack is only exercised when multiple
packets with the same udp quadruple race. Depending on kernel config,
number of CPUs, scheduling policy etc.  this might not trigger even
after several retries.  Thus the script eventually returns SKIP if the
retry count is exceeded.

The udpclash tool with also exit with a failure if it did not observe
the expected number of replies.

In the script, make a note of this but do not fail anymore, just check if
the clash resolution logic triggered after all.

Remove the 'single-core' test: while unlikely, with preemptible kernel it
should be possible to also trigger clash resolution logic.

With this change the test will either SKIP or pass.

Hard error could be restored later once its clear whats going on, so
also dump 'conntrack -S' when some packets went missing to see if
conntrack dropped them on insert.

Fixes: 78a588363587 ("selftests: netfilter: add conntrack clash resolution test case")
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://patch.msgid.link/20250721223652.6956-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Jakub Kicinski [Wed, 23 Jul 2025 01:24:10 +0000 (18:24 -0700)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2025-07-21 (i40e, ice, e1000e)

For i40e:
Dennis Chen adjusts reporting of VF Tx dropped to a more appropriate
field.

Jamie Bainbridge fixes a check which can cause a PF set VF MAC address
to be lost.

For ice:
Haoxiang Li adds an error check in DDP load to prevent NULL pointer
dereference.

For e1000e:
Jacek Kowalski adds workarounds for issues surrounding Tiger Lake
platforms with uninitialized NVMs.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  e1000e: ignore uninitialized checksum word on tgp
  e1000e: disregard NVM checksum on tgp when valid checksum bit is not set
  ice: Fix a null pointer dereference in ice_copy_and_init_pkg()
  i40e: When removing VF MAC filters, only check PF-set MAC
  i40e: report VF tx_dropped with tx_errors instead of tx_discards
====================

Link: https://patch.msgid.link/20250721173733.2248057-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agocan: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode
Marc Kleine-Budde [Tue, 15 Jul 2025 20:35:46 +0000 (22:35 +0200)]
can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode

Andrei Lalaev reported a NULL pointer deref when a CAN device is
restarted from Bus Off and the driver does not implement the struct
can_priv::do_set_mode callback.

There are 2 code path that call struct can_priv::do_set_mode:
- directly by a manual restart from the user space, via
  can_changelink()
- delayed automatic restart after bus off (deactivated by default)

To prevent the NULL pointer deference, refuse a manual restart or
configure the automatic restart delay in can_changelink() and report
the error via extack to user space.

As an additional safety measure let can_restart() return an error if
can_priv::do_set_mode is not set instead of dereferencing it
unchecked.

Reported-by: Andrei Lalaev <andrey.lalaev@gmail.com>
Closes: https://lore.kernel.org/all/20250714175520.307467-1-andrey.lalaev@gmail.com
Fixes: 39549eef3587 ("can: CAN Network device driver and Netlink interface")
Link: https://patch.msgid.link/20250718-fix-nullptr-deref-do_set_mode-v1-1-0b520097bb96@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 months agonet/sched: sch_qfq: Avoid triggering might_sleep in atomic context in qfq_delete_class
Xiang Mei [Thu, 17 Jul 2025 23:01:28 +0000 (16:01 -0700)]
net/sched: sch_qfq: Avoid triggering might_sleep in atomic context in qfq_delete_class

might_sleep could be trigger in the atomic context in qfq_delete_class.

qfq_destroy_class was moved into atomic context locked
by sch_tree_lock to avoid a race condition bug on
qfq_aggregate. However, might_sleep could be triggered by
qfq_destroy_class, which introduced sleeping in atomic context (path:
qfq_destroy_class->qdisc_put->__qdisc_destroy->lockdep_unregister_key
->might_sleep).

Considering the race is on the qfq_aggregate objects, keeping
qfq_rm_from_agg in the lock but moving the left part out can solve
this issue.

Fixes: 5e28d5a3f774 ("net/sched: sch_qfq: Fix race condition on qfq_aggregate")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Link: https://patch.msgid.link/4a04e0cc-a64b-44e7-9213-2880ed641d77@sabinyo.mountain
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://patch.msgid.link/20250717230128.159766-1-xmei5@asu.edu
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agogve: Fix stuck TX queue for DQ queue format
Praveen Kaligineedi [Thu, 17 Jul 2025 19:20:24 +0000 (19:20 +0000)]
gve: Fix stuck TX queue for DQ queue format

gve_tx_timeout was calculating missed completions in a way that is only
relevant in the GQ queue format. Additionally, it was attempting to
disable device interrupts, which is not needed in either GQ or DQ queue
formats.

As a result, TX timeouts with the DQ queue format likely would have
triggered early resets without kicking the queue at all.

This patch drops the check for pending work altogether and always kicks
the queue after validating the queue has not seen a TX timeout too
recently.

Cc: stable@vger.kernel.org
Fixes: 87a7f321bb6a ("gve: Recover from queue stall due to missed IRQ")
Co-developed-by: Tim Hostetler <thostet@google.com>
Signed-off-by: Tim Hostetler <thostet@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20250717192024.1820931-1-hramamurthy@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: appletalk: Fix use-after-free in AARP proxy probe
Kito Xu (veritas501) [Thu, 17 Jul 2025 01:28:43 +0000 (01:28 +0000)]
net: appletalk: Fix use-after-free in AARP proxy probe

The AARP proxy‐probe routine (aarp_proxy_probe_network) sends a probe,
releases the aarp_lock, sleeps, then re-acquires the lock.  During that
window an expire timer thread (__aarp_expire_timer) can remove and
kfree() the same entry, leading to a use-after-free.

race condition:

         cpu 0                          |            cpu 1
    atalk_sendmsg()                     |   atif_proxy_probe_device()
    aarp_send_ddp()                     |   aarp_proxy_probe_network()
    mod_timer()                         |   lock(aarp_lock) // LOCK!!
    timeout around 200ms                |   alloc(aarp_entry)
    and then call                       |   proxies[hash] = aarp_entry
    aarp_expire_timeout()               |   aarp_send_probe()
                                        |   unlock(aarp_lock) // UNLOCK!!
    lock(aarp_lock) // LOCK!!           |   msleep(100);
    __aarp_expire_timer(&proxies[ct])   |
    free(aarp_entry)                    |
    unlock(aarp_lock) // UNLOCK!!       |
                                        |   lock(aarp_lock) // LOCK!!
                                        |   UAF aarp_entry !!

==================================================================
BUG: KASAN: slab-use-after-free in aarp_proxy_probe_network+0x560/0x630 net/appletalk/aarp.c:493
Read of size 4 at addr ffff8880123aa360 by task repro/13278

CPU: 3 UID: 0 PID: 13278 Comm: repro Not tainted 6.15.2 #3 PREEMPT(full)
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x116/0x1b0 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:408 [inline]
 print_report+0xc1/0x630 mm/kasan/report.c:521
 kasan_report+0xca/0x100 mm/kasan/report.c:634
 aarp_proxy_probe_network+0x560/0x630 net/appletalk/aarp.c:493
 atif_proxy_probe_device net/appletalk/ddp.c:332 [inline]
 atif_ioctl+0xb58/0x16c0 net/appletalk/ddp.c:857
 atalk_ioctl+0x198/0x2f0 net/appletalk/ddp.c:1818
 sock_do_ioctl+0xdc/0x260 net/socket.c:1190
 sock_ioctl+0x239/0x6a0 net/socket.c:1311
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:906 [inline]
 __se_sys_ioctl fs/ioctl.c:892 [inline]
 __x64_sys_ioctl+0x194/0x200 fs/ioctl.c:892
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcb/0x250 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
 </TASK>

Allocated:
 aarp_alloc net/appletalk/aarp.c:382 [inline]
 aarp_proxy_probe_network+0xd8/0x630 net/appletalk/aarp.c:468
 atif_proxy_probe_device net/appletalk/ddp.c:332 [inline]
 atif_ioctl+0xb58/0x16c0 net/appletalk/ddp.c:857
 atalk_ioctl+0x198/0x2f0 net/appletalk/ddp.c:1818

Freed:
 kfree+0x148/0x4d0 mm/slub.c:4841
 __aarp_expire net/appletalk/aarp.c:90 [inline]
 __aarp_expire_timer net/appletalk/aarp.c:261 [inline]
 aarp_expire_timeout+0x480/0x6e0 net/appletalk/aarp.c:317

The buggy address belongs to the object at ffff8880123aa300
 which belongs to the cache kmalloc-192 of size 192
The buggy address is located 96 bytes inside of
 freed 192-byte region [ffff8880123aa300ffff8880123aa3c0)

Memory state around the buggy address:
 ffff8880123aa200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff8880123aa280: 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8880123aa300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                       ^
 ffff8880123aa380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
 ffff8880123aa400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kito Xu (veritas501) <hxzene@gmail.com>
Link: https://patch.msgid.link/20250717012843.880423-1-hxzene@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: bcmasp: Restore programming of TX map vector register
Florian Fainelli [Fri, 18 Jul 2025 21:22:42 +0000 (14:22 -0700)]
net: bcmasp: Restore programming of TX map vector register

On ASP versions v2.x we need to program the TX map vector register to
properly exercise end-to-end flow control, otherwise the TX engine can
either lock-up, or cause the hardware calculated checksum to be
wrong/corrupted when multiple back to back packets are being submitted
for transmission. This register defaults to 0, which means no flow
control being applied.

Fixes: e9f31435ee7d ("net: bcmasp: Add support for asp-v3.0")
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250718212242.3447751-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge branch 'selftests-mptcp-connect-cover-alt-modes'
Jakub Kicinski [Mon, 21 Jul 2025 23:21:32 +0000 (16:21 -0700)]
Merge branch 'selftests-mptcp-connect-cover-alt-modes'

Matthieu Baerts says:

====================
selftests: mptcp: connect: cover alt modes

mptcp_connect.sh can be executed manually with "-m <MODE>" and "-C" to
make sure everything works as expected when using "mmap" and "sendfile"
modes instead of "poll", and with the MPTCP checksum support.

These modes should be validated, but they are not when the selftests are
executed via the kselftest helpers. It means that most CIs validating
these selftests, like NIPA for the net development trees and LKFT for
the stable ones, are not covering these modes.

To fix that, new test programs have been added, simply calling
mptcp_connect.sh with the right parameters.

The first patch can be backported up to v5.6, and the second one up to
v5.14.

v1: https://lore.kernel.org/20250714-net-mptcp-sft-connect-alt-v1-0-bf1c5abbe575@kernel.org
====================

Link: https://patch.msgid.link/20250715-net-mptcp-sft-connect-alt-v2-0-8230ddd82454@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoselftests: mptcp: connect: also cover checksum
Matthieu Baerts (NGI0) [Tue, 15 Jul 2025 18:43:29 +0000 (20:43 +0200)]
selftests: mptcp: connect: also cover checksum

The checksum mode has been added a while ago, but it is only validated
when manually launching mptcp_connect.sh with "-C".

The different CIs were then not validating these MPTCP Connect tests
with checksum enabled. To make sure they do, add a new test program
executing mptcp_connect.sh with the checksum mode.

Fixes: 94d66ba1d8e4 ("selftests: mptcp: enable checksum in mptcp_connect.sh")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250715-net-mptcp-sft-connect-alt-v2-2-8230ddd82454@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoselftests: mptcp: connect: also cover alt modes
Matthieu Baerts (NGI0) [Tue, 15 Jul 2025 18:43:28 +0000 (20:43 +0200)]
selftests: mptcp: connect: also cover alt modes

The "mmap" and "sendfile" alternate modes for mptcp_connect.sh/.c are
available from the beginning, but only tested when mptcp_connect.sh is
manually launched with "-m mmap" or "-m sendfile", not via the
kselftests helpers.

The MPTCP CI was manually running "mptcp_connect.sh -m mmap", but not
"-m sendfile". Plus other CIs, especially the ones validating the stable
releases, were not validating these alternate modes.

To make sure these modes are validated by these CIs, add two new test
programs executing mptcp_connect.sh with the alternate modes.

Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250715-net-mptcp-sft-connect-alt-v2-1-8230ddd82454@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoe1000e: ignore uninitialized checksum word on tgp
Jacek Kowalski [Mon, 30 Jun 2025 08:35:00 +0000 (10:35 +0200)]
e1000e: ignore uninitialized checksum word on tgp

As described by Vitaly Lifshits:

> Starting from Tiger Lake, LAN NVM is locked for writes by SW, so the
> driver cannot perform checksum validation and correction. This means
> that all NVM images must leave the factory with correct checksum and
> checksum valid bit set.

Unfortunately some systems have left the factory with an uninitialized
value of 0xFFFF at register address 0x3F (checksum word location).
So on Tiger Lake platform we ignore the computed checksum when such
condition is encountered.

Signed-off-by: Jacek Kowalski <jacek@jacekk.info>
Tested-by: Vlad URSU <vlad@ursu.me>
Fixes: 4051f68318ca9 ("e1000e: Do not take care about recovery NVM checksum")
Cc: stable@vger.kernel.org
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 months agoe1000e: disregard NVM checksum on tgp when valid checksum bit is not set
Jacek Kowalski [Mon, 30 Jun 2025 08:33:39 +0000 (10:33 +0200)]
e1000e: disregard NVM checksum on tgp when valid checksum bit is not set

As described by Vitaly Lifshits:

> Starting from Tiger Lake, LAN NVM is locked for writes by SW, so the
> driver cannot perform checksum validation and correction. This means
> that all NVM images must leave the factory with correct checksum and
> checksum valid bit set. Since Tiger Lake devices were the first to have
> this lock, some systems in the field did not meet this requirement.
> Therefore, for these transitional devices we skip checksum update and
> verification, if the valid bit is not set.

Signed-off-by: Jacek Kowalski <jacek@jacekk.info>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Fixes: 4051f68318ca9 ("e1000e: Do not take care about recovery NVM checksum")
Cc: stable@vger.kernel.org
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 months agoice: Fix a null pointer dereference in ice_copy_and_init_pkg()
Haoxiang Li [Thu, 3 Jul 2025 09:52:32 +0000 (17:52 +0800)]
ice: Fix a null pointer dereference in ice_copy_and_init_pkg()

Add check for the return value of devm_kmemdup()
to prevent potential null pointer dereference.

Fixes: c76488109616 ("ice: Implement Dynamic Device Personalization (DDP) download")
Cc: stable@vger.kernel.org
Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 months agoi40e: When removing VF MAC filters, only check PF-set MAC
Jamie Bainbridge [Tue, 24 Jun 2025 23:29:18 +0000 (09:29 +1000)]
i40e: When removing VF MAC filters, only check PF-set MAC

When the PF is processing an Admin Queue message to delete a VF's MACs
from the MAC filter, we currently check if the PF set the MAC and if
the VF is trusted.

This results in undesirable behaviour, where if a trusted VF with a
PF-set MAC sets itself down (which sends an AQ message to delete the
VF's MAC filters) then the VF MAC is erased from the interface.

This results in the VF losing its PF-set MAC which should not happen.

There is no need to check for trust at all, because an untrusted VF
cannot change its own MAC. The only check needed is whether the PF set
the MAC. If the PF set the MAC, then don't erase the MAC on link-down.

Resolve this by changing the deletion check only for PF-set MAC.

(the out-of-tree driver has also intentionally removed the check for VF
trust here with OOT driver version 2.26.8, this changes the Linux kernel
driver behaviour and comment to match the OOT driver behaviour)

Fixes: ea2a1cfc3b201 ("i40e: Fix VF MAC filter removal")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 months agoi40e: report VF tx_dropped with tx_errors instead of tx_discards
Dennis Chen [Wed, 18 Jun 2025 19:52:40 +0000 (15:52 -0400)]
i40e: report VF tx_dropped with tx_errors instead of tx_discards

Currently the tx_dropped field in VF stats is not updated correctly
when reading stats from the PF. This is because it reads from
i40e_eth_stats.tx_discards which seems to be unused for per VSI stats,
as it is not updated by i40e_update_eth_stats() and the corresponding
register, GLV_TDPC, is not implemented[1].

Use i40e_eth_stats.tx_errors instead, which is actually updated by
i40e_update_eth_stats() by reading from GLV_TEPC.

To test, create a VF and try to send bad packets through it:

$ echo 1 > /sys/class/net/enp2s0f0/device/sriov_numvfs
$ cat test.py
from scapy.all import *

vlan_pkt = Ether(dst="ff:ff:ff:ff:ff:ff") / Dot1Q(vlan=999) / IP(dst="192.168.0.1") / ICMP()
ttl_pkt = IP(dst="8.8.8.8", ttl=0) / ICMP()

print("Send packet with bad VLAN tag")
sendp(vlan_pkt, iface="enp2s0f0v0")
print("Send packet with TTL=0")
sendp(ttl_pkt, iface="enp2s0f0v0")
$ ip -s link show dev enp2s0f0
16: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:ec:ef:b7:e0:ac brd ff:ff:ff:ff:ff:ff
    RX:  bytes packets errors dropped  missed   mcast
             0       0      0       0       0       0
    TX:  bytes packets errors dropped carrier collsns
             0       0      0       0       0       0
    vf 0     link/ether e2:c6:fd:c1:1e:92 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    RX: bytes  packets  mcast   bcast   dropped
             0        0       0       0        0
    TX: bytes  packets   dropped
             0        0        0
$ python test.py
Send packet with bad VLAN tag
.
Sent 1 packets.
Send packet with TTL=0
.
Sent 1 packets.
$ ip -s link show dev enp2s0f0
16: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:ec:ef:b7:e0:ac brd ff:ff:ff:ff:ff:ff
    RX:  bytes packets errors dropped  missed   mcast
             0       0      0       0       0       0
    TX:  bytes packets errors dropped carrier collsns
             0       0      0       0       0       0
    vf 0     link/ether e2:c6:fd:c1:1e:92 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    RX: bytes  packets  mcast   bcast   dropped
             0        0       0       0        0
    TX: bytes  packets   dropped
             0        0        0

A packet with non-existent VLAN tag and a packet with TTL = 0 are sent,
but tx_dropped is not incremented.

After patch:

$ ip -s link show dev enp2s0f0
19: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:ec:ef:b7:e0:ac brd ff:ff:ff:ff:ff:ff
    RX:  bytes packets errors dropped  missed   mcast
             0       0      0       0       0       0
    TX:  bytes packets errors dropped carrier collsns
             0       0      0       0       0       0
    vf 0     link/ether 4a:b7:3d:37:f7:56 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    RX: bytes  packets  mcast   bcast   dropped
             0        0       0       0        0
    TX: bytes  packets   dropped
             0        0        2

Fixes: dc645daef9af5bcbd9c ("i40e: implement VF stats NDO")
Signed-off-by: Dennis Chen <dechen@redhat.com>
Link: https://www.intel.com/content/www/us/en/content-details/596333/intel-ethernet-controller-x710-tm4-at2-carlsville-datasheet.html
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 months agoMerge branch 'mlx5-misc-fixes-2025-07-17'
Jakub Kicinski [Sat, 19 Jul 2025 00:33:05 +0000 (17:33 -0700)]
Merge branch 'mlx5-misc-fixes-2025-07-17'

Tariq Toukan says:

====================
mlx5 misc fixes 2025-07-17

This small patchset provides misc bug fixes from the team to the mlx5
driver.
====================

Link: https://patch.msgid.link/1752753970-261832-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet/mlx5: E-Switch, Fix peer miss rules to use peer eswitch
Shahar Shitrit [Thu, 17 Jul 2025 12:06:10 +0000 (15:06 +0300)]
net/mlx5: E-Switch, Fix peer miss rules to use peer eswitch

In the original design, it is assumed local and peer eswitches have the
same number of vfs. However, in new firmware, local and peer eswitches
can have different number of vfs configured by mlxconfig.  In such
configuration, it is incorrect to derive the number of vfs from the
local device's eswitch.

Fix this by updating the peer miss rules add and delete functions to use
the peer device's eswitch and vf count instead of the local device's
information, ensuring correct behavior regardless of vf configuration
differences.

Fixes: ac004b832128 ("net/mlx5e: E-Switch, Add peer miss rules")
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1752753970-261832-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet/mlx5: Fix memory leak in cmd_exec()
Chiara Meiohas [Thu, 17 Jul 2025 12:06:09 +0000 (15:06 +0300)]
net/mlx5: Fix memory leak in cmd_exec()

If cmd_exec() is called with callback and mlx5_cmd_invoke() returns an
error, resources allocated in cmd_exec() will not be freed.

Fix the code to release the resources if mlx5_cmd_invoke() returns an
error.

Fixes: f086470122d5 ("net/mlx5: cmdif, Return value improvements")
Reported-by: Alex Tereshkin <atereshkin@nvidia.com>
Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1752753970-261832-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: ti: icssg-prueth: Fix buffer allocation for ICSSG
Himanshu Mittal [Thu, 17 Jul 2025 09:42:20 +0000 (15:12 +0530)]
net: ti: icssg-prueth: Fix buffer allocation for ICSSG

Fixes overlapping buffer allocation for ICSSG peripheral
used for storing packets to be received/transmitted.
There are 3 buffers:
1. Buffer for Locally Injected Packets
2. Buffer for Forwarding Packets
3. Buffer for Host Egress Packets

In existing allocation buffers for 2. and 3. are overlapping causing
packet corruption.

Packet corruption observations:
During tcp iperf testing, due to overlapping buffers the received ack
packet overwrites the packet to be transmitted. So, we see packets on
wire with the ack packet content inside the content of next TCP packet
from sender device.

Details for AM64x switch mode:
-> Allocation by existing driver:
+---------+-------------------------------------------------------------+
|         |          SLICE 0             |          SLICE 1             |
|         +------+--------------+--------+------+--------------+--------+
|         | Slot | Base Address | Size   | Slot | Base Address | Size   |
|---------+------+--------------+--------+------+--------------+--------+
|         | 0    | 70000000     | 0x2000 | 0    | 70010000     | 0x2000 |
|         | 1    | 70002000     | 0x2000 | 1    | 70012000     | 0x2000 |
|         | 2    | 70004000     | 0x2000 | 2    | 70014000     | 0x2000 |
| FWD     | 3    | 70006000     | 0x2000 | 3    | 70016000     | 0x2000 |
| Buffers | 4    | 70008000     | 0x2000 | 4    | 70018000     | 0x2000 |
|         | 5    | 7000A000     | 0x2000 | 5    | 7001A000     | 0x2000 |
|         | 6    | 7000C000     | 0x2000 | 6    | 7001C000     | 0x2000 |
|         | 7    | 7000E000     | 0x2000 | 7    | 7001E000     | 0x2000 |
+---------+------+--------------+--------+------+--------------+--------+
|         | 8    | 70020000     | 0x1000 | 8    | 70028000     | 0x1000 |
|         | 9    | 70021000     | 0x1000 | 9    | 70029000     | 0x1000 |
|         | 10   | 70022000     | 0x1000 | 10   | 7002A000     | 0x1000 |
| Our     | 11   | 70023000     | 0x1000 | 11   | 7002B000     | 0x1000 |
| LI      | 12   | 00000000     | 0x0    | 12   | 00000000     | 0x0    |
| Buffers | 13   | 00000000     | 0x0    | 13   | 00000000     | 0x0    |
|         | 14   | 00000000     | 0x0    | 14   | 00000000     | 0x0    |
|         | 15   | 00000000     | 0x0    | 15   | 00000000     | 0x0    |
+---------+------+--------------+--------+------+--------------+--------+
|         | 16   | 70024000     | 0x1000 | 16   | 7002C000     | 0x1000 |
|         | 17   | 70025000     | 0x1000 | 17   | 7002D000     | 0x1000 |
|         | 18   | 70026000     | 0x1000 | 18   | 7002E000     | 0x1000 |
| Their   | 19   | 70027000     | 0x1000 | 19   | 7002F000     | 0x1000 |
| LI      | 20   | 00000000     | 0x0    | 20   | 00000000     | 0x0    |
| Buffers | 21   | 00000000     | 0x0    | 21   | 00000000     | 0x0    |
|         | 22   | 00000000     | 0x0    | 22   | 00000000     | 0x0    |
|         | 23   | 00000000     | 0x0    | 23   | 00000000     | 0x0    |
+---------+------+--------------+--------+------+--------------+--------+
--> here 16, 17, 18, 19 overlapping with below express buffer

+-----+-----------------------------------------------+
|     |       SLICE 0       |        SLICE 1          |
|     +------------+----------+------------+----------+
|     | Start addr | End addr | Start addr | End addr |
+-----+------------+----------+------------+----------+
| EXP | 70024000   | 70028000 | 7002C000   | 70030000 | <-- Overlapping
| PRE | 70030000   | 70033800 | 70034000   | 70037800 |
+-----+------------+----------+------------+----------+

+---------------------+----------+----------+
|                     | SLICE 0  |  SLICE 1 |
+---------------------+----------+----------+
| Default Drop Offset | 00000000 | 00000000 |     <-- Field not configured
+---------------------+----------+----------+

-> Allocation this patch brings:
+---------+-------------------------------------------------------------+
|         |          SLICE 0             |          SLICE 1             |
|         +------+--------------+--------+------+--------------+--------+
|         | Slot | Base Address | Size   | Slot | Base Address | Size   |
|---------+------+--------------+--------+------+--------------+--------+
|         | 0    | 70000000     | 0x2000 | 0    | 70040000     | 0x2000 |
|         | 1    | 70002000     | 0x2000 | 1    | 70042000     | 0x2000 |
|         | 2    | 70004000     | 0x2000 | 2    | 70044000     | 0x2000 |
| FWD     | 3    | 70006000     | 0x2000 | 3    | 70046000     | 0x2000 |
| Buffers | 4    | 70008000     | 0x2000 | 4    | 70048000     | 0x2000 |
|         | 5    | 7000A000     | 0x2000 | 5    | 7004A000     | 0x2000 |
|         | 6    | 7000C000     | 0x2000 | 6    | 7004C000     | 0x2000 |
|         | 7    | 7000E000     | 0x2000 | 7    | 7004E000     | 0x2000 |
+---------+------+--------------+--------+------+--------------+--------+
|         | 8    | 70010000     | 0x1000 | 8    | 70050000     | 0x1000 |
|         | 9    | 70011000     | 0x1000 | 9    | 70051000     | 0x1000 |
|         | 10   | 70012000     | 0x1000 | 10   | 70052000     | 0x1000 |
| Our     | 11   | 70013000     | 0x1000 | 11   | 70053000     | 0x1000 |
| LI      | 12   | 00000000     | 0x0    | 12   | 00000000     | 0x0    |
| Buffers | 13   | 00000000     | 0x0    | 13   | 00000000     | 0x0    |
|         | 14   | 00000000     | 0x0    | 14   | 00000000     | 0x0    |
|         | 15   | 00000000     | 0x0    | 15   | 00000000     | 0x0    |
+---------+------+--------------+--------+------+--------------+--------+
|         | 16   | 70014000     | 0x1000 | 16   | 70054000     | 0x1000 |
|         | 17   | 70015000     | 0x1000 | 17   | 70055000     | 0x1000 |
|         | 18   | 70016000     | 0x1000 | 18   | 70056000     | 0x1000 |
| Their   | 19   | 70017000     | 0x1000 | 19   | 70057000     | 0x1000 |
| LI      | 20   | 00000000     | 0x0    | 20   | 00000000     | 0x0    |
| Buffers | 21   | 00000000     | 0x0    | 21   | 00000000     | 0x0    |
|         | 22   | 00000000     | 0x0    | 22   | 00000000     | 0x0    |
|         | 23   | 00000000     | 0x0    | 23   | 00000000     | 0x0    |
+---------+------+--------------+--------+------+--------------+--------+

+-----+-----------------------------------------------+
|     |       SLICE 0       |        SLICE 1          |
|     +------------+----------+------------+----------+
|     | Start addr | End addr | Start addr | End addr |
+-----+------------+----------+------------+----------+
| EXP | 70018000   | 7001C000 | 70058000   | 7005C000 |
| PRE | 7001C000   | 7001F800 | 7005C000   | 7005F800 |
+-----+------------+----------+------------+----------+

+---------------------+----------+----------+
|                     | SLICE 0  |  SLICE 1 |
+---------------------+----------+----------+
| Default Drop Offset | 7001F800 | 7005F800 |
+---------------------+----------+----------+

Rootcause: missing buffer configuration for Express frames in
function: prueth_fw_offload_buffer_setup()

Details:
Driver implements two distinct buffer configuration functions that are
invoked based on the driver state and ICSSG firmware:-
- prueth_fw_offload_buffer_setup()
- prueth_emac_buffer_setup()

During initialization, driver creates standard network interfaces
(netdevs) and configures buffers via prueth_emac_buffer_setup().
This function properly allocates and configures all required memory
regions including:
- LI buffers
- Express packet buffers
- Preemptible packet buffers

However, when the driver transitions to an offload mode (switch/HSR/PRP),
buffer reconfiguration is handled by prueth_fw_offload_buffer_setup().
This function does not reconfigure the buffer regions required for
Express packets, leading to incorrect buffer allocation.

Fixes: abd5576b9c57 ("net: ti: icssg-prueth: Add support for ICSSG switch firmware")
Signed-off-by: Himanshu Mittal <h-mittal1@ti.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250717094220.546388-1-h-mittal1@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agodpaa2-switch: Fix device reference count leak in MAC endpoint handling
Ma Ke [Thu, 17 Jul 2025 02:23:09 +0000 (10:23 +0800)]
dpaa2-switch: Fix device reference count leak in MAC endpoint handling

The fsl_mc_get_endpoint() function uses device_find_child() for
localization, which implicitly calls get_device() to increment the
device's reference count before returning the pointer. However, the
caller dpaa2_switch_port_connect_mac() fails to properly release this
reference in multiple scenarios. We should call put_device() to
decrement reference count properly.

As comment of device_find_child() says, 'NOTE: you will need to drop
the reference with put_device() after use'.

Found by code review.

Cc: stable@vger.kernel.org
Fixes: 84cba72956fd ("dpaa2-switch: integrate the MAC endpoint support")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250717022309.3339976-3-make24@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agodpaa2-eth: Fix device reference count leak in MAC endpoint handling
Ma Ke [Thu, 17 Jul 2025 02:23:08 +0000 (10:23 +0800)]
dpaa2-eth: Fix device reference count leak in MAC endpoint handling

The fsl_mc_get_endpoint() function uses device_find_child() for
localization, which implicitly calls get_device() to increment the
device's reference count before returning the pointer. However, the
caller dpaa2_eth_connect_mac() fails to properly release this
reference in multiple scenarios. We should call put_device() to
decrement reference count properly.

As comment of device_find_child() says, 'NOTE: you will need to drop
the reference with put_device() after use'.

Found by code review.

Cc: stable@vger.kernel.org
Fixes: 719479230893 ("dpaa2-eth: add MAC/PHY support through phylink")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250717022309.3339976-2-make24@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agobus: fsl-mc: Fix potential double device reference in fsl_mc_get_endpoint()
Ma Ke [Thu, 17 Jul 2025 02:23:07 +0000 (10:23 +0800)]
bus: fsl-mc: Fix potential double device reference in fsl_mc_get_endpoint()

The fsl_mc_get_endpoint() function may call fsl_mc_device_lookup()
twice, which would increment the device's reference count twice if
both lookups find a device. This could lead to a reference count leak.

Found by code review.

Cc: stable@vger.kernel.org
Fixes: 1ac210d128ef ("bus: fsl-mc: add the fsl_mc_get_endpoint function")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: 8567494cebe5 ("bus: fsl-mc: rescan devices if endpoint not found")
Link: https://patch.msgid.link/20250717022309.3339976-1-make24@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge tag 'net-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 17 Jul 2025 17:04:04 +0000 (10:04 -0700)]
Merge tag 'net-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from Bluetooth, CAN, WiFi and Netfilter.

  More code here than I would have liked. That said, better now than
  next week. Nothing particularly scary stands out. The improvement to
  the OpenVPN input validation is a bit large but better get them in
  before the code makes it to a final release. Some of the changes we
  got from sub-trees could have been split better between the fix and
  -next refactoring, IMHO, that has been communicated.

  We have one known regression in a TI AM65 board not getting link. The
  investigation is going a bit slow, a number of people are on vacation.
  We'll try to wrap it up, but don't think it should hold up the
  release.

  Current release - fix to a fix:

   - Bluetooth: L2CAP: fix attempting to adjust outgoing MTU, it broke
     some headphones and speakers

  Current release - regressions:

   - wifi: ath12k: fix packets received in WBM error ring with REO LUT
     enabled, fix Rx performance regression

   - wifi: iwlwifi:
       - fix crash due to a botched indexing conversion
       - mask reserved bits in chan_state_active_bitmap, avoid FW assert()

  Current release - new code bugs:

   - nf_conntrack: fix crash due to removal of uninitialised entry

   - eth: airoha: fix potential UaF in airoha_npu_get()

  Previous releases - regressions:

   - net: fix segmentation after TCP/UDP fraglist GRO

   - af_packet: fix the SO_SNDTIMEO constraint not taking effect and a
     potential soft lockup waiting for a completion

   - rpl: fix UaF in rpl_do_srh_inline() for sneaky skb geometry

   - virtio-net: fix recursive rtnl_lock() during probe()

   - eth: stmmac: populate entire system_counterval_t in get_time_fn()

   - eth: libwx: fix a number of crashes in the driver Rx path

   - hv_netvsc: prevent IPv6 addrconf after IFF_SLAVE lost that meaning

  Previous releases - always broken:

   - mptcp: fix races in handling connection fallback to pure TCP

   - rxrpc: assorted error handling and race fixes

   - sched: another batch of "security" fixes for qdiscs (QFQ, HTB)

   - tls: always refresh the queue when reading sock, avoid UaF

   - phy: don't register LEDs for genphy, avoid deadlock

   - Bluetooth: btintel: check if controller is ISO capable on
     btintel_classify_pkt_type(), work around FW returning incorrect
     capabilities

  Misc:

   - make OpenVPN Netlink input checking more strict before it makes it
     to a final release

   - wifi: cfg80211: remove scan request n_channels __counted_by, it's
     only yielding false positives"

* tag 'net-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits)
  rxrpc: Fix to use conn aborts for conn-wide failures
  rxrpc: Fix transmission of an abort in response to an abort
  rxrpc: Fix notification vs call-release vs recvmsg
  rxrpc: Fix recv-recv race of completed call
  rxrpc: Fix irq-disabled in local_bh_enable()
  selftests/tc-testing: Test htb_dequeue_tree with deactivation and row emptying
  net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree
  net: bridge: Do not offload IGMP/MLD messages
  selftests: Add test cases for vlan_filter modification during runtime
  net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime
  tls: always refresh the queue when reading sock
  virtio-net: fix recursived rtnl_lock() during probe()
  net/mlx5: Update the list of the PCI supported devices
  hv_netvsc: Set VF priv_flags to IFF_NO_ADDRCONF before open to prevent IPv6 addrconf
  phonet/pep: Move call to pn_skb_get_dst_sockaddr() earlier in pep_sock_accept()
  Bluetooth: L2CAP: Fix attempting to adjust outgoing MTU
  netfilter: nf_conntrack: fix crash due to removal of uninitialised entry
  net: fix segmentation after TCP/UDP fraglist GRO
  ipv6: mcast: Delay put pmc->idev in mld_del_delrec()
  net: airoha: fix potential use-after-free in airoha_npu_get()
  ...

3 months agoMerge tag 'pm-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Thu, 17 Jul 2025 16:46:37 +0000 (09:46 -0700)]
Merge tag 'pm-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These address three issues introduced during the current development
  cycle and related to system suspend and hibernation, one triggering
  when asynchronous suspend of devices fails, one possibly affecting
  memory management in the core suspend code error path, and one due to
  duplicate filesystems freezing during system suspend:

   - Fix a deadlock that may occur on asynchronous device suspend
     failures due to missing completion updates in error paths (Rafael
     Wysocki)

   - Drop a misplaced pm_restore_gfp_mask() call, which may cause swap
     to be accessed too early if system suspend fails, from
     suspend_devices_and_enter() (Rafael Wysocki)

   - Remove duplicate filesystems_freeze/thaw() calls, which sometimes
     cause systems to be unable to resume, from enter_state() (Zihuan
     Zhang)"

* tag 'pm-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: sleep: Update power.completion for all devices on errors
  PM: suspend: clean up redundant filesystems_freeze/thaw() handling
  PM: suspend: Drop a misplaced pm_restore_gfp_mask() call

3 months agoMerge tag 'for-net-2025-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/bluet...
Jakub Kicinski [Thu, 17 Jul 2025 14:54:48 +0000 (07:54 -0700)]
Merge tag 'for-net-2025-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

 - hci_sync: fix connectable extended advertising when using static random address
 - hci_core: fix typos in macros
 - hci_core: add missing braces when using macro parameters
 - hci_core: replace 'quirks' integer by 'quirk_flags' bitmap
 - SMP: If an unallowed command is received consider it a failure
 - SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout
 - L2CAP: Fix null-ptr-deref in l2cap_sock_resume_cb()
 - L2CAP: Fix attempting to adjust outgoing MTU
 - btintel: Check if controller is ISO capable on btintel_classify_pkt_type
 - btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant without board ID

* tag 'for-net-2025-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: L2CAP: Fix attempting to adjust outgoing MTU
  Bluetooth: btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant without board ID
  Bluetooth: hci_dev: replace 'quirks' integer by 'quirk_flags' bitmap
  Bluetooth: hci_core: add missing braces when using macro parameters
  Bluetooth: hci_core: fix typos in macros
  Bluetooth: SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout
  Bluetooth: SMP: If an unallowed command is received consider it a failure
  Bluetooth: btintel: Check if controller is ISO capable on btintel_classify_pkt_type
  Bluetooth: hci_sync: fix connectable extended advertising when using static random address
  Bluetooth: Fix null-ptr-deref in l2cap_sock_resume_cb()
====================

Link: https://patch.msgid.link/20250717142849.537425-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge branch 'rxrpc-miscellaneous-fixes'
Jakub Kicinski [Thu, 17 Jul 2025 14:50:51 +0000 (07:50 -0700)]
Merge branch 'rxrpc-miscellaneous-fixes'

David Howells says:

====================
rxrpc: Miscellaneous fixes

Here are some fixes for rxrpc:

 (1) Fix the calling of IP routing code with IRQs disabled.

 (2) Fix a recvmsg/recvmsg race when the first completes a call.

 (3) Fix a race between notification, recvmsg and sendmsg releasing a call.

 (4) Fix abort of abort.

 (5) Fix call-level aborts that should be connection-level aborts.
====================

Link: https://patch.msgid.link/20250717074350.3767366-1-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agorxrpc: Fix to use conn aborts for conn-wide failures
David Howells [Thu, 17 Jul 2025 07:43:45 +0000 (08:43 +0100)]
rxrpc: Fix to use conn aborts for conn-wide failures

Fix rxrpc to use connection-level aborts for things that affect the whole
connection, such as the service ID not matching a local service.

Fixes: 57af281e5389 ("rxrpc: Tidy up abort generation infrastructure")
Reported-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/20250717074350.3767366-6-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agorxrpc: Fix transmission of an abort in response to an abort
David Howells [Thu, 17 Jul 2025 07:43:44 +0000 (08:43 +0100)]
rxrpc: Fix transmission of an abort in response to an abort

Under some circumstances, such as when a server socket is closing, ABORT
packets will be generated in response to incoming packets.  Unfortunately,
this also may include generating aborts in response to incoming aborts -
which may cause a cycle.  It appears this may be made possible by giving
the client a multicast address.

Fix this such that rxrpc_reject_packet() will refuse to generate aborts in
response to aborts.

Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com>
cc: LePremierHomme <kwqcheii@proton.me>
cc: Linus Torvalds <torvalds@linux-foundation.org>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/20250717074350.3767366-5-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agorxrpc: Fix notification vs call-release vs recvmsg
David Howells [Thu, 17 Jul 2025 07:43:43 +0000 (08:43 +0100)]
rxrpc: Fix notification vs call-release vs recvmsg

When a call is released, rxrpc takes the spinlock and removes it from
->recvmsg_q in an effort to prevent racing recvmsg() invocations from
seeing the same call.  Now, rxrpc_recvmsg() only takes the spinlock when
actually removing a call from the queue; it doesn't, however, take it in
the lead up to that when it checks to see if the queue is empty.  It *does*
hold the socket lock, which prevents a recvmsg/recvmsg race - but this
doesn't prevent sendmsg from ending the call because sendmsg() drops the
socket lock and relies on the call->user_mutex.

Fix this by firstly removing the bit in rxrpc_release_call() that dequeues
the released call and, instead, rely on recvmsg() to simply discard
released calls (done in a preceding fix).

Secondly, rxrpc_notify_socket() is abandoned if the call is already marked
as released rather than trying to be clever by setting both pointers in
call->recvmsg_link to NULL to trick list_empty().  This isn't perfect and
can still race, resulting in a released call on the queue, but recvmsg()
will now clean that up.

Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com>
cc: LePremierHomme <kwqcheii@proton.me>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/20250717074350.3767366-4-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agorxrpc: Fix recv-recv race of completed call
David Howells [Thu, 17 Jul 2025 07:43:42 +0000 (08:43 +0100)]
rxrpc: Fix recv-recv race of completed call

If a call receives an event (such as incoming data), the call gets placed
on the socket's queue and a thread in recvmsg can be awakened to go and
process it.  Once the thread has picked up the call off of the queue,
further events will cause it to be requeued, and once the socket lock is
dropped (recvmsg uses call->user_mutex to allow the socket to be used in
parallel), a second thread can come in and its recvmsg can pop the call off
the socket queue again.

In such a case, the first thread will be receiving stuff from the call and
the second thread will be blocked on call->user_mutex.  The first thread
can, at this point, process both the event that it picked call for and the
event that the second thread picked the call for and may see the call
terminate - in which case the call will be "released", decoupling the call
from the user call ID assigned to it (RXRPC_USER_CALL_ID in the control
message).

The first thread will return okay, but then the second thread will wake up
holding the user_mutex and, if it sees that the call has been released by
the first thread, it will BUG thusly:

kernel BUG at net/rxrpc/recvmsg.c:474!

Fix this by just dequeuing the call and ignoring it if it is seen to be
already released.  We can't tell userspace about it anyway as the user call
ID has become stale.

Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
cc: LePremierHomme <kwqcheii@proton.me>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/20250717074350.3767366-3-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agorxrpc: Fix irq-disabled in local_bh_enable()
David Howells [Thu, 17 Jul 2025 07:43:41 +0000 (08:43 +0100)]
rxrpc: Fix irq-disabled in local_bh_enable()

The rxrpc_assess_MTU_size() function calls down into the IP layer to find
out the MTU size for a route.  When accepting an incoming call, this is
called from rxrpc_new_incoming_call() which holds interrupts disabled
across the code that calls down to it.  Unfortunately, the IP layer uses
local_bh_enable() which, config dependent, throws a warning if IRQs are
enabled:

WARNING: CPU: 1 PID: 5544 at kernel/softirq.c:387 __local_bh_enable_ip+0x43/0xd0
...
RIP: 0010:__local_bh_enable_ip+0x43/0xd0
...
Call Trace:
 <TASK>
 rt_cache_route+0x7e/0xa0
 rt_set_nexthop.isra.0+0x3b3/0x3f0
 __mkroute_output+0x43a/0x460
 ip_route_output_key_hash+0xf7/0x140
 ip_route_output_flow+0x1b/0x90
 rxrpc_assess_MTU_size.isra.0+0x2a0/0x590
 rxrpc_new_incoming_peer+0x46/0x120
 rxrpc_alloc_incoming_call+0x1b1/0x400
 rxrpc_new_incoming_call+0x1da/0x5e0
 rxrpc_input_packet+0x827/0x900
 rxrpc_io_thread+0x403/0xb60
 kthread+0x2f7/0x310
 ret_from_fork+0x2a/0x230
 ret_from_fork_asm+0x1a/0x30
...
hardirqs last  enabled at (23): _raw_spin_unlock_irq+0x24/0x50
hardirqs last disabled at (24): _raw_read_lock_irq+0x17/0x70
softirqs last  enabled at (0): copy_process+0xc61/0x2730
softirqs last disabled at (25): rt_add_uncached_list+0x3c/0x90

Fix this by moving the call to rxrpc_assess_MTU_size() out of
rxrpc_init_peer() and further up the stack where it can be done without
interrupts disabled.

It shouldn't be a problem for rxrpc_new_incoming_call() to do it after the
locks are dropped as pmtud is going to be performed by the I/O thread - and
we're in the I/O thread at this point.

Fixes: a2ea9a907260 ("rxrpc: Use irq-disabling spinlocks between app and I/O thread")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com>
cc: LePremierHomme <kwqcheii@proton.me>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/20250717074350.3767366-2-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoselftests/tc-testing: Test htb_dequeue_tree with deactivation and row emptying
William Liu [Thu, 17 Jul 2025 02:29:47 +0000 (02:29 +0000)]
selftests/tc-testing: Test htb_dequeue_tree with deactivation and row emptying

Ensure that any deactivation and row emptying that occurs
during htb_dequeue_tree does not cause a kernel panic.
This scenario originally triggered a kernel BUG_ON, and
we are checking for a graceful fail now.

Signed-off-by: William Liu <will@willsroot.io>
Signed-off-by: Savino Dicanosa <savy@syst3mfailure.io>
Link: https://patch.msgid.link/20250717022912.221426-1-will@willsroot.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree
William Liu [Thu, 17 Jul 2025 02:28:38 +0000 (02:28 +0000)]
net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree

htb_lookup_leaf has a BUG_ON that can trigger with the following:

tc qdisc del dev lo root
tc qdisc add dev lo root handle 1: htb default 1
tc class add dev lo parent 1: classid 1:1 htb rate 64bit
tc qdisc add dev lo parent 1:1 handle 2: netem
tc qdisc add dev lo parent 2:1 handle 3: blackhole
ping -I lo -c1 -W0.001 127.0.0.1

The root cause is the following:

1. htb_dequeue calls htb_dequeue_tree which calls the dequeue handler on
   the selected leaf qdisc
2. netem_dequeue calls enqueue on the child qdisc
3. blackhole_enqueue drops the packet and returns a value that is not
   just NET_XMIT_SUCCESS
4. Because of this, netem_dequeue calls qdisc_tree_reduce_backlog, and
   since qlen is now 0, it calls htb_qlen_notify -> htb_deactivate ->
   htb_deactiviate_prios -> htb_remove_class_from_row -> htb_safe_rb_erase
5. As this is the only class in the selected hprio rbtree,
   __rb_change_child in __rb_erase_augmented sets the rb_root pointer to
   NULL
6. Because blackhole_dequeue returns NULL, netem_dequeue returns NULL,
   which causes htb_dequeue_tree to call htb_lookup_leaf with the same
   hprio rbtree, and fail the BUG_ON

The function graph for this scenario is shown here:
 0)               |  htb_enqueue() {
 0) + 13.635 us   |    netem_enqueue();
 0)   4.719 us    |    htb_activate_prios();
 0) # 2249.199 us |  }
 0)               |  htb_dequeue() {
 0)   2.355 us    |    htb_lookup_leaf();
 0)               |    netem_dequeue() {
 0) + 11.061 us   |      blackhole_enqueue();
 0)               |      qdisc_tree_reduce_backlog() {
 0)               |        qdisc_lookup_rcu() {
 0)   1.873 us    |          qdisc_match_from_root();
 0)   6.292 us    |        }
 0)   1.894 us    |        htb_search();
 0)               |        htb_qlen_notify() {
 0)   2.655 us    |          htb_deactivate_prios();
 0)   6.933 us    |        }
 0) + 25.227 us   |      }
 0)   1.983 us    |      blackhole_dequeue();
 0) + 86.553 us   |    }
 0) # 2932.761 us |    qdisc_warn_nonwc();
 0)               |    htb_lookup_leaf() {
 0)               |      BUG_ON();
 ------------------------------------------

The full original bug report can be seen here [1].

We can fix this just by returning NULL instead of the BUG_ON,
as htb_dequeue_tree returns NULL when htb_lookup_leaf returns
NULL.

[1] https://lore.kernel.org/netdev/pF5XOOIim0IuEfhI-SOxTgRvNoDwuux7UHKnE_Y5-zVd4wmGvNk2ceHjKb8ORnzw0cGwfmVu42g9dL7XyJLf1NEzaztboTWcm0Ogxuojoeo=@willsroot.io/

Fixes: 512bb43eb542 ("pkt_sched: sch_htb: Optimize WARN_ONs in htb_dequeue_tree() etc.")
Signed-off-by: William Liu <will@willsroot.io>
Signed-off-by: Savino Dicanosa <savy@syst3mfailure.io>
Link: https://patch.msgid.link/20250717022816.221364-1-will@willsroot.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: bridge: Do not offload IGMP/MLD messages
Joseph Huang [Wed, 16 Jul 2025 15:35:50 +0000 (11:35 -0400)]
net: bridge: Do not offload IGMP/MLD messages

Do not offload IGMP/MLD messages as it could lead to IGMP/MLD Reports
being unintentionally flooded to Hosts. Instead, let the bridge decide
where to send these IGMP/MLD messages.

Consider the case where the local host is sending out reports in response
to a remote querier like the following:

       mcast-listener-process (IP_ADD_MEMBERSHIP)
          \
          br0
         /   \
      swp1   swp2
        |     |
  QUERIER     SOME-OTHER-HOST

In the above setup, br0 will want to br_forward() reports for
mcast-listener-process's group(s) via swp1 to QUERIER; but since the
source hwdom is 0, the report is eligible for tx offloading, and is
flooded by hardware to both swp1 and swp2, reaching SOME-OTHER-HOST as
well. (Example and illustration provided by Tobias.)

Fixes: 472111920f1c ("net: bridge: switchdev: allow the TX data plane forwarding to be offloaded")
Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250716153551.1830255-1-Joseph.Huang@garmin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge branch 'net-vlan-fix-vlan-0-refcount-imbalance-of-toggling-filtering-during...
Jakub Kicinski [Thu, 17 Jul 2025 14:44:30 +0000 (07:44 -0700)]
Merge branch 'net-vlan-fix-vlan-0-refcount-imbalance-of-toggling-filtering-during-runtime'

Dong Chenchen says:

====================
net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime

Fix VLAN 0 refcount imbalance of toggling filtering during runtime.
====================

Link: https://patch.msgid.link/20250716034504.2285203-1-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoselftests: Add test cases for vlan_filter modification during runtime
Dong Chenchen [Wed, 16 Jul 2025 03:45:04 +0000 (11:45 +0800)]
selftests: Add test cases for vlan_filter modification during runtime

Add test cases for vlan_filter modification during runtime, which
may triger null-ptr-ref or memory leak of vlan0.

Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Link: https://patch.msgid.link/20250716034504.2285203-3-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime
Dong Chenchen [Wed, 16 Jul 2025 03:45:03 +0000 (11:45 +0800)]
net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime

Assuming the "rx-vlan-filter" feature is enabled on a net device, the
8021q module will automatically add or remove VLAN 0 when the net device
is put administratively up or down, respectively. There are a couple of
problems with the above scheme.

The first problem is a memory leak that can happen if the "rx-vlan-filter"
feature is disabled while the device is running:

 # ip link add bond1 up type bond mode 0
 # ethtool -K bond1 rx-vlan-filter off
 # ip link del dev bond1

When the device is put administratively down the "rx-vlan-filter"
feature is disabled, so the 8021q module will not remove VLAN 0 and the
memory will be leaked [1].

Another problem that can happen is that the kernel can automatically
delete VLAN 0 when the device is put administratively down despite not
adding it when the device was put administratively up since during that
time the "rx-vlan-filter" feature was disabled. null-ptr-unref or
bug_on[2] will be triggered by unregister_vlan_dev() for refcount
imbalance if toggling filtering during runtime:

$ ip link add bond0 type bond mode 0
$ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q
$ ethtool -K bond0 rx-vlan-filter off
$ ifconfig bond0 up
$ ethtool -K bond0 rx-vlan-filter on
$ ifconfig bond0 down
$ ip link del vlan0

Root cause is as below:
step1: add vlan0 for real_dev, such as bond, team.
register_vlan_dev
    vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1
step2: disable vlan filter feature and enable real_dev
step3: change filter from 0 to 1
vlan_device_event
    vlan_filter_push_vids
        ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0
step4: real_dev down
vlan_device_event
    vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0
        vlan_info_rcu_free //free vlan0
step5: delete vlan0
unregister_vlan_dev
    BUG_ON(!vlan_info); //vlan_info is null

Fix both problems by noting in the VLAN info whether VLAN 0 was
automatically added upon NETDEV_UP and based on that decide whether it
should be deleted upon NETDEV_DOWN, regardless of the state of the
"rx-vlan-filter" feature.

[1]
unreferenced object 0xffff8880068e3100 (size 256):
  comm "ip", pid 384, jiffies 4296130254
  hex dump (first 32 bytes):
    00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00  . 0.............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace (crc 81ce31fa):
    __kmalloc_cache_noprof+0x2b5/0x340
    vlan_vid_add+0x434/0x940
    vlan_device_event.cold+0x75/0xa8
    notifier_call_chain+0xca/0x150
    __dev_notify_flags+0xe3/0x250
    rtnl_configure_link+0x193/0x260
    rtnl_newlink_create+0x383/0x8e0
    __rtnl_newlink+0x22c/0xa40
    rtnl_newlink+0x627/0xb00
    rtnetlink_rcv_msg+0x6fb/0xb70
    netlink_rcv_skb+0x11f/0x350
    netlink_unicast+0x426/0x710
    netlink_sendmsg+0x75a/0xc20
    __sock_sendmsg+0xc1/0x150
    ____sys_sendmsg+0x5aa/0x7b0
    ___sys_sendmsg+0xfc/0x180

[2]
kernel BUG at net/8021q/vlan.c:99!
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 #61 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1))
RSP: 0018:ffff88810badf310 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80
R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000
R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e
FS:  00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0
Call Trace:
 <TASK>
rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6945)
netlink_rcv_skb (net/netlink/af_netlink.c:2535)
netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339)
netlink_sendmsg (net/netlink/af_netlink.c:1883)
____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566)
___sys_sendmsg (net/socket.c:2622)
__sys_sendmsg (net/socket.c:2652)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)

Fixes: ad1afb003939 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250716034504.2285203-2-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge tag 'ovpn-net-20250716' of https://github.com/OpenVPN/ovpn-net-next
Jakub Kicinski [Thu, 17 Jul 2025 14:41:25 +0000 (07:41 -0700)]
Merge tag 'ovpn-net-20250716' of https://github.com/OpenVPN/ovpn-net-next

Antonio Quartulli says:

====================
This bugfix batch includes the following changes:
* properly propagate sk mark to skb->mark field
* reject unexpected incoming netlink attributes
* reset GSO state when moving skb from transport to tunnel layer

* tag 'ovpn-net-20250716' of https://github.com/OpenVPN/ovpn-net-next:
  ovpn: reset GSO metadata after decapsulation
  ovpn: reject unexpected netlink attributes
  ovpn: propagate socket mark to skb in UDP
====================

Link: https://patch.msgid.link/20250716115443.16763-1-antonio@openvpn.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agotls: always refresh the queue when reading sock
Jakub Kicinski [Wed, 16 Jul 2025 14:38:50 +0000 (07:38 -0700)]
tls: always refresh the queue when reading sock

After recent changes in net-next TCP compacts skbs much more
aggressively. This unearthed a bug in TLS where we may try
to operate on an old skb when checking if all skbs in the
queue have matching decrypt state and geometry.

    BUG: KASAN: slab-use-after-free in tls_strp_check_rcv+0x898/0x9a0 [tls]
    (net/tls/tls_strp.c:436 net/tls/tls_strp.c:530 net/tls/tls_strp.c:544)
    Read of size 4 at addr ffff888013085750 by task tls/13529

    CPU: 2 UID: 0 PID: 13529 Comm: tls Not tainted 6.16.0-rc5-virtme
    Call Trace:
     kasan_report+0xca/0x100
     tls_strp_check_rcv+0x898/0x9a0 [tls]
     tls_rx_rec_wait+0x2c9/0x8d0 [tls]
     tls_sw_recvmsg+0x40f/0x1aa0 [tls]
     inet_recvmsg+0x1c3/0x1f0

Always reload the queue, fast path is to have the record in the queue
when we wake, anyway (IOW the path going down "if !strp->stm.full_len").

Fixes: 0d87bbd39d7f ("tls: strp: make sure the TCP skbs do not have overlapping data")
Link: https://patch.msgid.link/20250716143850.1520292-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agovirtio-net: fix recursived rtnl_lock() during probe()
Zigit Zo [Wed, 16 Jul 2025 11:57:17 +0000 (19:57 +0800)]
virtio-net: fix recursived rtnl_lock() during probe()

The deadlock appears in a stack trace like:

  virtnet_probe()
    rtnl_lock()
    virtio_config_changed_work()
      netdev_notify_peers()
        rtnl_lock()

It happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the
virtio-net driver is still probing.

The config_work in probe() will get scheduled until virtnet_open() enables
the config change notification via virtio_config_driver_enable().

Fixes: df28de7b0050 ("virtio-net: synchronize operstate with admin state on up/down")
Signed-off-by: Zigit Zo <zuozhijie@bytedance.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://patch.msgid.link/20250716115717.1472430-1-zuozhijie@bytedance.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet/mlx5: Update the list of the PCI supported devices
Maor Gottlieb [Wed, 16 Jul 2025 07:29:29 +0000 (10:29 +0300)]
net/mlx5: Update the list of the PCI supported devices

Add the upcoming ConnectX-10 device ID to the table of supported
PCI device IDs.

Cc: stable@vger.kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1752650969-148501-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agohv_netvsc: Set VF priv_flags to IFF_NO_ADDRCONF before open to prevent IPv6 addrconf
Li Tian [Wed, 16 Jul 2025 00:26:05 +0000 (08:26 +0800)]
hv_netvsc: Set VF priv_flags to IFF_NO_ADDRCONF before open to prevent IPv6 addrconf

Set an additional flag IFF_NO_ADDRCONF to prevent ipv6 addrconf.

Commit under Fixes added a new flag change that was not made
to hv_netvsc resulting in the VF being assinged an IPv6.

Fixes: 8a321cf7becc ("net: add IFF_NO_ADDRCONF and use it in bonding to prevent ipv6 addrconf")
Suggested-by: Cathy Avery <cavery@redhat.com>
Signed-off-by: Li Tian <litian@redhat.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20250716002607.4927-1-litian@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agophonet/pep: Move call to pn_skb_get_dst_sockaddr() earlier in pep_sock_accept()
Nathan Chancellor [Tue, 15 Jul 2025 23:15:40 +0000 (16:15 -0700)]
phonet/pep: Move call to pn_skb_get_dst_sockaddr() earlier in pep_sock_accept()

A new warning in clang [1] points out a place in pep_sock_accept() where
dst is uninitialized then passed as a const pointer to pep_find_pipe():

  net/phonet/pep.c:829:37: error: variable 'dst' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer]
    829 |         newsk = pep_find_pipe(&pn->hlist, &dst, pipe_handle);
        |                                            ^~~:

Move the call to pn_skb_get_dst_sockaddr(), which initializes dst, to
before the call to pep_find_pipe(), so that dst is consistently used
initialized throughout the function.

Cc: stable@vger.kernel.org
Fixes: f7ae8d59f661 ("Phonet: allocate sock from accept syscall rather than soft IRQ")
Link: https://github.com/llvm/llvm-project/commit/00dacf8c22f065cb52efb14cd091d441f19b319e
Closes: https://github.com/ClangBuiltLinux/linux/issues/2101
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20250715-net-phonet-fix-uninit-const-pointer-v1-1-8efd1bd188b3@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoBluetooth: L2CAP: Fix attempting to adjust outgoing MTU
Luiz Augusto von Dentz [Wed, 16 Jul 2025 13:40:49 +0000 (09:40 -0400)]
Bluetooth: L2CAP: Fix attempting to adjust outgoing MTU

Configuration request only configure the incoming direction of the peer
initiating the request, so using the MTU is the other direction shall
not be used, that said the spec allows the peer responding to adjust:

Bluetooth Core 6.1, Vol 3, Part A, Section 4.5

 'Each configuration parameter value (if any is present) in an
 L2CAP_CONFIGURATION_RSP packet reflects an ‘adjustment’ to a
 configuration parameter value that has been sent (or, in case of
 default values, implied) in the corresponding
 L2CAP_CONFIGURATION_REQ packet.'

That said adjusting the MTU in the response shall be limited to ERTM
channels only as for older modes the remote stack may not be able to
detect the adjustment causing it to silently drop packets.

Link: https://github.com/bluez/bluez/issues/1422
Link: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/149
Link: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/4793
Fixes: 042bb9603c44 ("Bluetooth: L2CAP: Fix L2CAP MTU negotiation")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
3 months agoMerge tag 'wireless-2025-07-17' of https://git.kernel.org/pub/scm/linux/kernel/git...
Paolo Abeni [Thu, 17 Jul 2025 12:52:41 +0000 (14:52 +0200)]
Merge tag 'wireless-2025-07-17' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless

Johannes Berg says:

====================
Couple of fixes:
 - ath12k performance regression from -rc1
 - cfg80211 counted_by() removal for scan request
   as it doesn't match usage and keeps complaining
 - iwlwifi crash with certain older devices
 - iwlwifi missing an error path unlock
 - iwlwifi compatibility with certain BIOS updates

* tag 'wireless-2025-07-17' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: iwlwifi: Fix botched indexing conversion
  wifi: cfg80211: remove scan request n_channels counted_by
  wifi: ath12k: Fix packets received in WBM error ring with REO LUT enabled
  wifi: iwlwifi: mask reserved bits in chan_state_active_bitmap
  wifi: iwlwifi: pcie: fix locking on invalid TOP reset
====================

Link: https://patch.msgid.link/20250717091831.18787-5-johannes@sipsolutions.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agoMerge tag 'nf-25-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Paolo Abeni [Thu, 17 Jul 2025 12:44:54 +0000 (14:44 +0200)]
Merge tag 'nf-25-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following batch contains Netfilter fixes for net:

1) Three patches to enhance conntrack selftests for resize and clash
   resolution, from Florian Westphal.

2) Expand nft_concat_range.sh selftest to improve coverage from error
   path, from Florian Westphal.

3) Hide clash bit to userspace from netlink dumps until there is a
   good reason to expose, from Florian Westphal.

4) Revert notification for device registration/unregistration for
   nftables basechains and flowtables, we decided to go for a better
   way to handle this through the nfnetlink_hook infrastructure which
   will come via nf-next, patch from Phil Sutter.

5) Fix crash in conntrack due to race related to SLAB_TYPESAFE_BY_RCU
   that results in removing a recycled object that is not yet in the
   hashes. Move IPS_CONFIRM setting after the object is in the hashes.
   From Florian Westphal.

netfilter pull request 25-07-17

* tag 'nf-25-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_conntrack: fix crash due to removal of uninitialised entry
  Revert "netfilter: nf_tables: Add notifications for hook changes"
  netfilter: nf_tables: hide clash bit from userspace
  selftests: netfilter: nft_concat_range.sh: send packets to empty set
  selftests: netfilter: conntrack_resize.sh: also use udpclash tool
  selftests: netfilter: add conntrack clash resolution test case
  selftests: netfilter: conntrack_resize.sh: extend resize test
====================

Link: https://patch.msgid.link/20250717095808.41725-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonetfilter: nf_conntrack: fix crash due to removal of uninitialised entry
Florian Westphal [Wed, 16 Jul 2025 18:39:14 +0000 (20:39 +0200)]
netfilter: nf_conntrack: fix crash due to removal of uninitialised entry

A crash in conntrack was reported while trying to unlink the conntrack
entry from the hash bucket list:
    [exception RIP: __nf_ct_delete_from_lists+172]
    [..]
 #7 [ff539b5a2b043aa0] nf_ct_delete at ffffffffc124d421 [nf_conntrack]
 #8 [ff539b5a2b043ad0] nf_ct_gc_expired at ffffffffc124d999 [nf_conntrack]
 #9 [ff539b5a2b043ae0] __nf_conntrack_find_get at ffffffffc124efbc [nf_conntrack]
    [..]

The nf_conn struct is marked as allocated from slab but appears to be in
a partially initialised state:

 ct hlist pointer is garbage; looks like the ct hash value
 (hence crash).
 ct->status is equal to IPS_CONFIRMED|IPS_DYING, which is expected
 ct->timeout is 30000 (=30s), which is unexpected.

Everything else looks like normal udp conntrack entry.  If we ignore
ct->status and pretend its 0, the entry matches those that are newly
allocated but not yet inserted into the hash:
  - ct hlist pointers are overloaded and store/cache the raw tuple hash
  - ct->timeout matches the relative time expected for a new udp flow
    rather than the absolute 'jiffies' value.

If it were not for the presence of IPS_CONFIRMED,
__nf_conntrack_find_get() would have skipped the entry.

Theory is that we did hit following race:

cpu x  cpu y cpu z
 found entry E found entry E
 E is expired <preemption>
 nf_ct_delete()
 return E to rcu slab
init_conntrack
E is re-inited,
ct->status set to 0
reply tuplehash hnnode.pprev
stores hash value.

cpu y found E right before it was deleted on cpu x.
E is now re-inited on cpu z.  cpu y was preempted before
checking for expiry and/or confirm bit.

->refcnt set to 1
E now owned by skb
->timeout set to 30000

If cpu y were to resume now, it would observe E as
expired but would skip E due to missing CONFIRMED bit.

nf_conntrack_confirm gets called
sets: ct->status |= CONFIRMED
This is wrong: E is not yet added
to hashtable.

cpu y resumes, it observes E as expired but CONFIRMED:
<resumes>
nf_ct_expired()
 -> yes (ct->timeout is 30s)
confirmed bit set.

cpu y will try to delete E from the hashtable:
nf_ct_delete() -> set DYING bit
__nf_ct_delete_from_lists

Even this scenario doesn't guarantee a crash:
cpu z still holds the table bucket lock(s) so y blocks:

wait for spinlock held by z

CONFIRMED is set but there is no
guarantee ct will be added to hash:
"chaintoolong" or "clash resolution"
logic both skip the insert step.
reply hnnode.pprev still stores the
hash value.

unlocks spinlock
return NF_DROP
<unblocks, then
 crashes on hlist_nulls_del_rcu pprev>

In case CPU z does insert the entry into the hashtable, cpu y will unlink
E again right away but no crash occurs.

Without 'cpu y' race, 'garbage' hlist is of no consequence:
ct refcnt remains at 1, eventually skb will be free'd and E gets
destroyed via: nf_conntrack_put -> nf_conntrack_destroy -> nf_ct_destroy.

To resolve this, move the IPS_CONFIRMED assignment after the table
insertion but before the unlock.

Pablo points out that the confirm-bit-store could be reordered to happen
before hlist add resp. the timeout fixup, so switch to set_bit and
before_atomic memory barrier to prevent this.

It doesn't matter if other CPUs can observe a newly inserted entry right
before the CONFIRMED bit was set:

Such event cannot be distinguished from above "E is the old incarnation"
case: the entry will be skipped.

Also change nf_ct_should_gc() to first check the confirmed bit.

The gc sequence is:
 1. Check if entry has expired, if not skip to next entry
 2. Obtain a reference to the expired entry.
 3. Call nf_ct_should_gc() to double-check step 1.

nf_ct_should_gc() is thus called only for entries that already failed an
expiry check. After this patch, once the confirmed bit check passes
ct->timeout has been altered to reflect the absolute 'best before' date
instead of a relative time.  Step 3 will therefore not remove the entry.

Without this change to nf_ct_should_gc() we could still get this sequence:

 1. Check if entry has expired.
 2. Obtain a reference.
 3. Call nf_ct_should_gc() to double-check step 1:
    4 - entry is still observed as expired
    5 - meanwhile, ct->timeout is corrected to absolute value on other CPU
      and confirm bit gets set
    6 - confirm bit is seen
    7 - valid entry is removed again

First do check 6), then 4) so the gc expiry check always picks up either
confirmed bit unset (entry gets skipped) or expiry re-check failure for
re-inited conntrack objects.

This change cannot be backported to releases before 5.19. Without
commit 8a75a2c17410 ("netfilter: conntrack: remove unconfirmed list")
|= IPS_CONFIRMED line cannot be moved without further changes.

Cc: Razvan Cojocaru <rzvncj@gmail.com>
Link: https://lore.kernel.org/netfilter-devel/20250627142758.25664-1-fw@strlen.de/
Link: https://lore.kernel.org/netfilter-devel/4239da15-83ff-4ca4-939d-faef283471bb@gmail.com/
Fixes: 1397af5bfd7d ("netfilter: conntrack: remove the percpu dying list")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 months agonet: fix segmentation after TCP/UDP fraglist GRO
Felix Fietkau [Sat, 5 Jul 2025 15:06:21 +0000 (17:06 +0200)]
net: fix segmentation after TCP/UDP fraglist GRO

Since "net: gro: use cb instead of skb->network_header", the skb network
header is no longer set in the GRO path.
This breaks fraglist segmentation, which relies on ip_hdr()/tcp_hdr()
to check for address/port changes.
Fix this regression by selectively setting the network header for merged
segment skbs.

Fixes: 186b1ea73ad8 ("net: gro: use cb instead of skb->network_header")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250705150622.10699-1-nbd@nbd.name
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agoipv6: mcast: Delay put pmc->idev in mld_del_delrec()
Yue Haibing [Mon, 14 Jul 2025 14:19:57 +0000 (22:19 +0800)]
ipv6: mcast: Delay put pmc->idev in mld_del_delrec()

pmc->idev is still used in ip6_mc_clear_src(), so as mld_clear_delrec()
does, the reference should be put after ip6_mc_clear_src() return.

Fixes: 63ed8de4be81 ("mld: add mc_lock for protecting per-interface mld data")
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Link: https://patch.msgid.link/20250714141957.3301871-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net...
Jakub Kicinski [Wed, 16 Jul 2025 23:17:34 +0000 (16:17 -0700)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2025-07-15 (ixgbe, fm10k, i40e, ice)

Arnd Bergmann resolves compile issues with large NR_CPUS for ixgbe, fm10k,
and i40e.

For ice:
Dave adds a NULL check for LAG netdev.

Michal corrects a pointer check in debugfs initialization.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: check correct pointer in fwlog debugfs
  ice: add NULL check in eswitch lag check
  ethernet: intel: fix building with large NR_CPUS
====================

Link: https://patch.msgid.link/20250715202948.3841437-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: airoha: fix potential use-after-free in airoha_npu_get()
Alok Tiwari [Tue, 15 Jul 2025 14:30:58 +0000 (07:30 -0700)]
net: airoha: fix potential use-after-free in airoha_npu_get()

np->name was being used after calling of_node_put(np), which
releases the node and can lead to a use-after-free bug.
Previously, of_node_put(np) was called unconditionally after
of_find_device_by_node(np), which could result in a use-after-free if
pdev is NULL.

This patch moves of_node_put(np) after the error check to ensure
the node is only released after both the error and success cases
are handled appropriately, preventing potential resource issues.

Fixes: 23290c7bc190 ("net: airoha: Introduce Airoha NPU support")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250715143102.3458286-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet/mlx5: Correctly set gso_size when LRO is used
Christoph Paasch [Tue, 15 Jul 2025 20:20:53 +0000 (13:20 -0700)]
net/mlx5: Correctly set gso_size when LRO is used

gso_size is expected by the networking stack to be the size of the
payload (thus, not including ethernet/IP/TCP-headers). However, cqe_bcnt
is the full sized frame (including the headers). Dividing cqe_bcnt by
lro_num_seg will then give incorrect results.

For example, running a bpftrace higher up in the TCP-stack
(tcp_event_data_recv), we commonly have gso_size set to 1450 or 1451 even
though in reality the payload was only 1448 bytes.

This can have unintended consequences:
- In tcp_measure_rcv_mss() len will be for example 1450, but. rcv_mss
will be 1448 (because tp->advmss is 1448). Thus, we will always
recompute scaling_ratio each time an LRO-packet is received.
- In tcp_gro_receive(), it will interfere with the decision whether or
not to flush and thus potentially result in less gro'ed packets.

So, we need to discount the protocol headers from cqe_bcnt so we can
actually divide the payload by lro_num_seg to get the real gso_size.

v2:
 - Use "(unsigned char *)tcp + tcp->doff * 4 - skb->data)" to compute header-len
   (Tariq Toukan <tariqt@nvidia.com>)
 - Improve commit-message (Gal Pressman <gal@nvidia.com>)

Fixes: e586b3b0baee ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Christoph Paasch <cpaasch@openai.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20250715-cpaasch-pf-925-investigate-incorrect-gso_size-on-cx-7-nic-v2-1-e06c3475f3ac@openai.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge tag 'probes-fixes-v6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 16 Jul 2025 20:00:38 +0000 (13:00 -0700)]
Merge tag 'probes-fixes-v6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes fix from Masami Hiramatsu:

 - fprobe-event: The @params variable was being used in an error path
   without being initialized. The fix to return an error code.

* tag 'probes-fixes-v6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/probes: Avoid using params uninitialized in parse_btf_arg()

3 months agoBluetooth: btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant without board ID
Zijun Hu [Tue, 15 Jul 2025 12:40:13 +0000 (20:40 +0800)]
Bluetooth: btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant without board ID

For GF variant of WCN6855 without board ID programmed
btusb_generate_qca_nvm_name() will chose wrong NVM
'qca/nvm_usb_00130201.bin' to download.

Fix by choosing right NVM 'qca/nvm_usb_00130201_gf.bin'.
Also simplify NVM choice logic of btusb_generate_qca_nvm_name().

Fixes: d6cba4e6d0e2 ("Bluetooth: btusb: Add support using different nvm for variant WCN6855 controller")
Signed-off-by: Zijun Hu <zijun.hu@oss.qualcomm.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
3 months agoBluetooth: hci_dev: replace 'quirks' integer by 'quirk_flags' bitmap
Christian Eggers [Mon, 14 Jul 2025 20:27:45 +0000 (22:27 +0200)]
Bluetooth: hci_dev: replace 'quirks' integer by 'quirk_flags' bitmap

The 'quirks' member already ran out of bits on some platforms some time
ago. Replace the integer member by a bitmap in order to have enough bits
in future. Replace raw bit operations by accessor macros.

Fixes: ff26b2dd6568 ("Bluetooth: Add quirk for broken READ_VOICE_SETTING")
Fixes: 127881334eaa ("Bluetooth: Add quirk for broken READ_PAGE_SCAN_TYPE")
Suggested-by: Pauli Virtanen <pav@iki.fi>
Tested-by: Ivan Pravdin <ipravdin.official@gmail.com>
Signed-off-by: Kiran K <kiran.k@intel.com>
Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
3 months agoBluetooth: hci_core: add missing braces when using macro parameters
Christian Eggers [Mon, 14 Jul 2025 20:27:44 +0000 (22:27 +0200)]
Bluetooth: hci_core: add missing braces when using macro parameters

Macro parameters should always be put into braces when accessing it.

Fixes: 4fc9857ab8c6 ("Bluetooth: hci_sync: Add check simultaneous roles support")
Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
3 months agoBluetooth: hci_core: fix typos in macros
Christian Eggers [Mon, 14 Jul 2025 20:27:43 +0000 (22:27 +0200)]
Bluetooth: hci_core: fix typos in macros

The provided macro parameter is named 'dev' (rather than 'hdev', which
may be a variable on the stack where the macro is used).

Fixes: a9a830a676a9 ("Bluetooth: hci_event: Fix sending HCI_OP_READ_ENC_KEY_SIZE")
Fixes: 6126ffabba6b ("Bluetooth: Introduce HCI_CONN_FLAG_DEVICE_PRIVACY device flag")
Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
3 months agoBluetooth: SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout
Luiz Augusto von Dentz [Wed, 2 Jul 2025 15:53:40 +0000 (11:53 -0400)]
Bluetooth: SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout

This replaces the usage of HCI_ERROR_REMOTE_USER_TERM, which as the name
suggest is to indicate a regular disconnection initiated by an user,
with HCI_ERROR_AUTH_FAILURE to indicate the session has timeout thus any
pairing shall be considered as failed.

Fixes: 1e91c29eb60c ("Bluetooth: Use hci_disconnect for immediate disconnection from SMP")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
3 months agoBluetooth: SMP: If an unallowed command is received consider it a failure
Luiz Augusto von Dentz [Mon, 30 Jun 2025 18:42:23 +0000 (14:42 -0400)]
Bluetooth: SMP: If an unallowed command is received consider it a failure

If a command is received while a bonding is ongoing consider it a
pairing failure so the session is cleanup properly and the device is
disconnected immediately instead of continuing with other commands that
may result in the session to get stuck without ever completing such as
the case bellow:

> ACL Data RX: Handle 2048 flags 0x02 dlen 21
      SMP: Identity Information (0x08) len 16
        Identity resolving key[16]: d7e08edef97d3e62cd2331f82d8073b0
> ACL Data RX: Handle 2048 flags 0x02 dlen 21
      SMP: Signing Information (0x0a) len 16
        Signature key[16]: 1716c536f94e843a9aea8b13ffde477d
Bluetooth: hci0: unexpected SMP command 0x0a from XX:XX:XX:XX:XX:XX
> ACL Data RX: Handle 2048 flags 0x02 dlen 12
      SMP: Identity Address Information (0x09) len 7
        Address: XX:XX:XX:XX:XX:XX (Intel Corporate)

While accourding to core spec 6.1 the expected order is always BD_ADDR
first first then CSRK:

When using LE legacy pairing, the keys shall be distributed in the
following order:

    LTK by the Peripheral

    EDIV and Rand by the Peripheral

    IRK by the Peripheral

    BD_ADDR by the Peripheral

    CSRK by the Peripheral

    LTK by the Central

    EDIV and Rand by the Central

    IRK by the Central

    BD_ADDR by the Central

    CSRK by the Central

When using LE Secure Connections, the keys shall be distributed in the
following order:

    IRK by the Peripheral

    BD_ADDR by the Peripheral

    CSRK by the Peripheral

    IRK by the Central

    BD_ADDR by the Central

    CSRK by the Central

According to the Core 6.1 for commands used for key distribution "Key
Rejected" can be used:

  '3.6.1. Key distribution and generation

  A device may reject a distributed key by sending the Pairing Failed command
  with the reason set to "Key Rejected".

Fixes: b28b4943660f ("Bluetooth: Add strict checks for allowed SMP PDUs")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
3 months agoBluetooth: btintel: Check if controller is ISO capable on btintel_classify_pkt_type
Luiz Augusto von Dentz [Wed, 9 Jul 2025 19:02:56 +0000 (15:02 -0400)]
Bluetooth: btintel: Check if controller is ISO capable on btintel_classify_pkt_type

Due to what seem to be a bug with variant version returned by some
firmwares the code may set hdev->classify_pkt_type with
btintel_classify_pkt_type when in fact the controller doesn't even
support ISO channels feature but may use the handle range expected from
a controllers that does causing the packets to be reclassified as ISO
causing several bugs.

To fix the above btintel_classify_pkt_type will attempt to check if the
controller really supports ISO channels and in case it doesn't don't
reclassify even if the handle range is considered to be ISO, this is
considered safer than trying to fix the specific controller/firmware
version as that could change over time and causing similar problems in
the future.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=219553
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2100565
Link: https://github.com/StarLabsLtd/firmware/issues/180
Fixes: f25b7fd36cc3 ("Bluetooth: Add vendor-specific packet classification for ISO data")
Cc: stable@vger.kernel.org
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-by: Sean Rhodes <sean@starlabs.systems>
3 months agoBluetooth: hci_sync: fix connectable extended advertising when using static random...
Alessandro Gasbarroni [Wed, 9 Jul 2025 07:53:11 +0000 (09:53 +0200)]
Bluetooth: hci_sync: fix connectable extended advertising when using static random address

Currently, the connectable flag used by the setup of an extended
advertising instance drives whether we require privacy when trying to pass
a random address to the advertising parameters (Own Address).
If privacy is not required, then it automatically falls back to using the
controller's public address. This can cause problems when using controllers
that do not have a public address set, but instead use a static random
address.

e.g. Assume a BLE controller that does not have a public address set.
The controller upon powering is set with a random static address by default
by the kernel.

< HCI Command: LE Set Random Address (0x08|0x0005) plen 6
         Address: E4:AF:26:D8:3E:3A (Static)
> HCI Event: Command Complete (0x0e) plen 4
      LE Set Random Address (0x08|0x0005) ncmd 1
        Status: Success (0x00)

Setting non-connectable extended advertisement parameters in bluetoothctl
mgmt

add-ext-adv-params -r 0x801 -x 0x802 -P 2M -g 1

correctly sets Own address type as Random

< HCI Command: LE Set Extended Advertising Parameters (0x08|0x0036)
plen 25
...
    Own address type: Random (0x01)

Setting connectable extended advertisement parameters in bluetoothctl mgmt

add-ext-adv-params -r 0x801 -x 0x802 -P 2M -g -c 1

mistakenly sets Own address type to Public (which causes to use Public
Address 00:00:00:00:00:00)

< HCI Command: LE Set Extended Advertising Parameters (0x08|0x0036)
plen 25
...
    Own address type: Public (0x00)

This causes either the controller to emit an Invalid Parameters error or to
mishandle the advertising.

This patch makes sure that we use the already set static random address
when requesting a connectable extended advertising when we don't require
privacy and our public address is not set (00:00:00:00:00:00).

Fixes: 3fe318ee72c5 ("Bluetooth: move hci_get_random_address() to hci_sync")
Signed-off-by: Alessandro Gasbarroni <alex.gasbarroni@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
3 months agoBluetooth: Fix null-ptr-deref in l2cap_sock_resume_cb()
Kuniyuki Iwashima [Mon, 7 Jul 2025 19:28:29 +0000 (19:28 +0000)]
Bluetooth: Fix null-ptr-deref in l2cap_sock_resume_cb()

syzbot reported null-ptr-deref in l2cap_sock_resume_cb(). [0]

l2cap_sock_resume_cb() has a similar problem that was fixed by commit
1bff51ea59a9 ("Bluetooth: fix use-after-free error in lock_sock_nested()").

Since both l2cap_sock_kill() and l2cap_sock_resume_cb() are executed
under l2cap_sock_resume_cb(), we can avoid the issue simply by checking
if chan->data is NULL.

Let's not access to the killed socket in l2cap_sock_resume_cb().

[0]:
BUG: KASAN: null-ptr-deref in instrument_atomic_write include/linux/instrumented.h:82 [inline]
BUG: KASAN: null-ptr-deref in clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline]
BUG: KASAN: null-ptr-deref in l2cap_sock_resume_cb+0xb4/0x17c net/bluetooth/l2cap_sock.c:1711
Write of size 8 at addr 0000000000000570 by task kworker/u9:0/52

CPU: 1 UID: 0 PID: 52 Comm: kworker/u9:0 Not tainted 6.16.0-rc4-syzkaller-g7482bb149b9f #0 PREEMPT
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
Workqueue: hci0 hci_rx_work
Call trace:
 show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:501 (C)
 __dump_stack+0x30/0x40 lib/dump_stack.c:94
 dump_stack_lvl+0xd8/0x12c lib/dump_stack.c:120
 print_report+0x58/0x84 mm/kasan/report.c:524
 kasan_report+0xb0/0x110 mm/kasan/report.c:634
 check_region_inline mm/kasan/generic.c:-1 [inline]
 kasan_check_range+0x264/0x2a4 mm/kasan/generic.c:189
 __kasan_check_write+0x20/0x30 mm/kasan/shadow.c:37
 instrument_atomic_write include/linux/instrumented.h:82 [inline]
 clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline]
 l2cap_sock_resume_cb+0xb4/0x17c net/bluetooth/l2cap_sock.c:1711
 l2cap_security_cfm+0x524/0xea0 net/bluetooth/l2cap_core.c:7357
 hci_auth_cfm include/net/bluetooth/hci_core.h:2092 [inline]
 hci_auth_complete_evt+0x2e8/0xa4c net/bluetooth/hci_event.c:3514
 hci_event_func net/bluetooth/hci_event.c:7511 [inline]
 hci_event_packet+0x650/0xe9c net/bluetooth/hci_event.c:7565
 hci_rx_work+0x320/0xb18 net/bluetooth/hci_core.c:4070
 process_one_work+0x7e8/0x155c kernel/workqueue.c:3238
 process_scheduled_works kernel/workqueue.c:3321 [inline]
 worker_thread+0x958/0xed8 kernel/workqueue.c:3402
 kthread+0x5fc/0x75c kernel/kthread.c:464
 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:847

Fixes: d97c899bde33 ("Bluetooth: Introduce L2CAP channel callback for resuming")
Reported-by: syzbot+e4d73b165c3892852d22@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/686c12bd.a70a0220.29fe6c.0b13.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
3 months agoovpn: reset GSO metadata after decapsulation
Ralf Lici [Tue, 1 Jul 2025 12:47:44 +0000 (14:47 +0200)]
ovpn: reset GSO metadata after decapsulation

The ovpn_netdev_write() function is responsible for injecting
decapsulated and decrypted packets back into the local network stack.

Prior to this patch, the skb could retain GSO metadata from the outer,
encrypted tunnel packet. This original GSO metadata, relevant to the
sender's transport context, becomes invalid and misleading for the
tunnel/data path once the inner packet is exposed.

Leaving this stale metadata intact causes internal GSO validation checks
further down the kernel's network stack (validate_xmit_skb()) to fail,
leading to packet drops. The reasons for these failures vary by
protocol, for example:
- for ICMP, no offload handler is registered;
- for TCP and UDP, the respective offload handlers return errors when
  comparing skb->len to the outdated skb_shinfo(skb)->gso_size.

By calling skb_gso_reset(skb) we ensure the inner packet is presented to
gro_cells_receive() with a clean state, correctly indicating it is an
individual packet from the perspective of the local stack.

This change eliminates the "Driver has suspect GRO implementation, TCP
performance may be compromised" warning and improves overall TCP
performance by allowing GSO/GRO to function as intended on the
decapsulated traffic.

Fixes: 11851cbd60ea ("ovpn: implement TCP transport")
Reported-by: Gert Doering <gert@greenie.muc.de>
Closes: https://github.com/OpenVPN/ovpn-net-next/issues/4
Tested-by: Gert Doering <gert@greenie.muc.de>
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
3 months agoovpn: reject unexpected netlink attributes
Antonio Quartulli [Wed, 25 Jun 2025 14:08:11 +0000 (16:08 +0200)]
ovpn: reject unexpected netlink attributes

Netlink ops do not expect all attributes to be always set, however
this condition is not explicitly coded any where, leading the user
to believe that all sent attributes are somewhat processed.

Fix this behaviour by introducing explicit checks.

For CMD_OVPN_PEER_GET and CMD_OVPN_KEY_GET directly open-code the
needed condition in the related ops handlers.
While for all other ops use attribute subsets in the ovpn.yaml spec file.

Fixes: b7a63391aa98 ("ovpn: add basic netlink support")
Reported-by: Ralf Lici <ralf@mandelbit.com>
Closes: https://github.com/OpenVPN/ovpn-net-next/issues/19
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
3 months agoovpn: propagate socket mark to skb in UDP
Ralf Lici [Wed, 4 Jun 2025 13:11:58 +0000 (15:11 +0200)]
ovpn: propagate socket mark to skb in UDP

OpenVPN allows users to configure a FW mark on sockets used to
communicate with other peers. The mark is set by means of the
`SO_MARK` Linux socket option.

However, in the ovpn UDP code path, the socket's `sk_mark` value is
currently ignored and it is not propagated to outgoing `skbs`.

This commit ensures proper inheritance of the field by setting
`skb->mark` to `sk->sk_mark` before handing the `skb` to the network
stack for transmission.

Fixes: 08857b5ec5d9 ("ovpn: implement basic TX path (UDP)")
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Link: https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg31877.html
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
3 months agotracing/probes: Avoid using params uninitialized in parse_btf_arg()
Nathan Chancellor [Wed, 16 Jul 2025 03:19:44 +0000 (20:19 -0700)]
tracing/probes: Avoid using params uninitialized in parse_btf_arg()

After a recent change in clang to strengthen uninitialized warnings [1],
it points out that in one of the error paths in parse_btf_arg(), params
is used uninitialized:

  kernel/trace/trace_probe.c:660:19: warning: variable 'params' is uninitialized when used here [-Wuninitialized]
    660 |                         return PTR_ERR(params);
        |                                        ^~~~~~

Match many other NO_BTF_ENTRY error cases and return -ENOENT, clearing
up the warning.

Link: https://lore.kernel.org/all/20250715-trace_probe-fix-const-uninit-warning-v1-1-98960f91dd04@kernel.org/
Cc: stable@vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/2110
Fixes: d157d7694460 ("tracing/probes: Support BTF field access from $retval")
Link: https://github.com/llvm/llvm-project/commit/2464313eef01c5b1edf0eccf57a32cdee01472c7
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
3 months agoMerge branch 'mptcp-fix-fallback-related-races'
Jakub Kicinski [Wed, 16 Jul 2025 00:31:30 +0000 (17:31 -0700)]
Merge branch 'mptcp-fix-fallback-related-races'

Matthieu Baerts says:

====================
mptcp: fix fallback-related races

This series contains 3 fixes somewhat related to various races we have
while handling fallback.

The root cause of the issues addressed here is that the check for
"we can fallback to tcp now" and the related action are not atomic. That
also applies to fallback due to MP_FAIL -- where the window race is even
wider.

Address the issue introducing an additional spinlock to bundle together
all the relevant events, as per patch 1 and 2. These fixes can be
backported up to v5.19 and v5.15.

Note that mptcp_disconnect() unconditionally clears the fallback status
(zeroing msk->flags) but don't touch the `allows_infinite_fallback`
flag. Such issue is addressed in patch 3, and can be backported up to
v5.17.
====================

Link: https://patch.msgid.link/20250714-net-mptcp-fallback-races-v1-0-391aff963322@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agomptcp: reset fallback status gracefully at disconnect() time
Paolo Abeni [Mon, 14 Jul 2025 16:41:46 +0000 (18:41 +0200)]
mptcp: reset fallback status gracefully at disconnect() time

mptcp_disconnect() clears the fallback bit unconditionally, without
touching the associated flags.

The bit clear is safe, as no fallback operation can race with that --
all subflow are already in TCP_CLOSE status thanks to the previous
FASTCLOSE -- but we need to consistently reset all the fallback related
status.

Also acquire the relevant lock, to avoid fouling static analyzers.

Fixes: b29fcfb54cd7 ("mptcp: full disconnect implementation")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250714-net-mptcp-fallback-races-v1-3-391aff963322@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agomptcp: plug races between subflow fail and subflow creation
Paolo Abeni [Mon, 14 Jul 2025 16:41:45 +0000 (18:41 +0200)]
mptcp: plug races between subflow fail and subflow creation

We have races similar to the one addressed by the previous patch between
subflow failing and additional subflow creation. They are just harder to
trigger.

The solution is similar. Use a separate flag to track the condition
'socket state prevent any additional subflow creation' protected by the
fallback lock.

The socket fallback makes such flag true, and also receiving or sending
an MP_FAIL option.

The field 'allow_infinite_fallback' is now always touched under the
relevant lock, we can drop the ONCE annotation on write.

Fixes: 478d770008b0 ("mptcp: send out MP_FAIL when data checksum fails")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250714-net-mptcp-fallback-races-v1-2-391aff963322@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agomptcp: make fallback action and fallback decision atomic
Paolo Abeni [Mon, 14 Jul 2025 16:41:44 +0000 (18:41 +0200)]
mptcp: make fallback action and fallback decision atomic

Syzkaller reported the following splat:

  WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 __mptcp_do_fallback net/mptcp/protocol.h:1223 [inline]
  WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 mptcp_do_fallback net/mptcp/protocol.h:1244 [inline]
  WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 check_fully_established net/mptcp/options.c:982 [inline]
  WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 mptcp_incoming_options+0x21a8/0x2510 net/mptcp/options.c:1153
  Modules linked in:
  CPU: 1 UID: 0 PID: 7704 Comm: syz.3.1419 Not tainted 6.16.0-rc3-gbd5ce2324dba #20 PREEMPT(voluntary)
  Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
  RIP: 0010:__mptcp_do_fallback net/mptcp/protocol.h:1223 [inline]
  RIP: 0010:mptcp_do_fallback net/mptcp/protocol.h:1244 [inline]
  RIP: 0010:check_fully_established net/mptcp/options.c:982 [inline]
  RIP: 0010:mptcp_incoming_options+0x21a8/0x2510 net/mptcp/options.c:1153
  Code: 24 18 e8 bb 2a 00 fd e9 1b df ff ff e8 b1 21 0f 00 e8 ec 5f c4 fc 44 0f b7 ac 24 b0 00 00 00 e9 54 f1 ff ff e8 d9 5f c4 fc 90 <0f> 0b 90 e9 b8 f4 ff ff e8 8b 2a 00 fd e9 8d e6 ff ff e8 81 2a 00
  RSP: 0018:ffff8880a3f08448 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffff8880180a8000 RCX: ffffffff84afcf45
  RDX: ffff888090223700 RSI: ffffffff84afdaa7 RDI: 0000000000000001
  RBP: ffff888017955780 R08: 0000000000000001 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
  R13: ffff8880180a8910 R14: ffff8880a3e9d058 R15: 0000000000000000
  FS:  00005555791b8500(0000) GS:ffff88811c495000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 000000110c2800b7 CR3: 0000000058e44000 CR4: 0000000000350ef0
  Call Trace:
   <IRQ>
   tcp_reset+0x26f/0x2b0 net/ipv4/tcp_input.c:4432
   tcp_validate_incoming+0x1057/0x1b60 net/ipv4/tcp_input.c:5975
   tcp_rcv_established+0x5b5/0x21f0 net/ipv4/tcp_input.c:6166
   tcp_v4_do_rcv+0x5dc/0xa70 net/ipv4/tcp_ipv4.c:1925
   tcp_v4_rcv+0x3473/0x44a0 net/ipv4/tcp_ipv4.c:2363
   ip_protocol_deliver_rcu+0xba/0x480 net/ipv4/ip_input.c:205
   ip_local_deliver_finish+0x2f1/0x500 net/ipv4/ip_input.c:233
   NF_HOOK include/linux/netfilter.h:317 [inline]
   NF_HOOK include/linux/netfilter.h:311 [inline]
   ip_local_deliver+0x1be/0x560 net/ipv4/ip_input.c:254
   dst_input include/net/dst.h:469 [inline]
   ip_rcv_finish net/ipv4/ip_input.c:447 [inline]
   NF_HOOK include/linux/netfilter.h:317 [inline]
   NF_HOOK include/linux/netfilter.h:311 [inline]
   ip_rcv+0x514/0x810 net/ipv4/ip_input.c:567
   __netif_receive_skb_one_core+0x197/0x1e0 net/core/dev.c:5975
   __netif_receive_skb+0x1f/0x120 net/core/dev.c:6088
   process_backlog+0x301/0x1360 net/core/dev.c:6440
   __napi_poll.constprop.0+0xba/0x550 net/core/dev.c:7453
   napi_poll net/core/dev.c:7517 [inline]
   net_rx_action+0xb44/0x1010 net/core/dev.c:7644
   handle_softirqs+0x1d0/0x770 kernel/softirq.c:579
   do_softirq+0x3f/0x90 kernel/softirq.c:480
   </IRQ>
   <TASK>
   __local_bh_enable_ip+0xed/0x110 kernel/softirq.c:407
   local_bh_enable include/linux/bottom_half.h:33 [inline]
   inet_csk_listen_stop+0x2c5/0x1070 net/ipv4/inet_connection_sock.c:1524
   mptcp_check_listen_stop.part.0+0x1cc/0x220 net/mptcp/protocol.c:2985
   mptcp_check_listen_stop net/mptcp/mib.h:118 [inline]
   __mptcp_close+0x9b9/0xbd0 net/mptcp/protocol.c:3000
   mptcp_close+0x2f/0x140 net/mptcp/protocol.c:3066
   inet_release+0xed/0x200 net/ipv4/af_inet.c:435
   inet6_release+0x4f/0x70 net/ipv6/af_inet6.c:487
   __sock_release+0xb3/0x270 net/socket.c:649
   sock_close+0x1c/0x30 net/socket.c:1439
   __fput+0x402/0xb70 fs/file_table.c:465
   task_work_run+0x150/0x240 kernel/task_work.c:227
   resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
   exit_to_user_mode_loop+0xd4/0xe0 kernel/entry/common.c:114
   exit_to_user_mode_prepare include/linux/entry-common.h:330 [inline]
   syscall_exit_to_user_mode_work include/linux/entry-common.h:414 [inline]
   syscall_exit_to_user_mode include/linux/entry-common.h:449 [inline]
   do_syscall_64+0x245/0x360 arch/x86/entry/syscall_64.c:100
   entry_SYSCALL_64_after_hwframe+0x77/0x7f
  RIP: 0033:0x7fc92f8a36ad
  Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
  RSP: 002b:00007ffcf52802d8 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4
  RAX: 0000000000000000 RBX: 00007ffcf52803a8 RCX: 00007fc92f8a36ad
  RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003
  RBP: 00007fc92fae7ba0 R08: 0000000000000001 R09: 0000002800000000
  R10: 00007fc92f700000 R11: 0000000000000246 R12: 00007fc92fae5fac
  R13: 00007fc92fae5fa0 R14: 0000000000026d00 R15: 0000000000026c51
   </TASK>
  irq event stamp: 4068
  hardirqs last  enabled at (4076): [<ffffffff81544816>] __up_console_sem+0x76/0x80 kernel/printk/printk.c:344
  hardirqs last disabled at (4085): [<ffffffff815447fb>] __up_console_sem+0x5b/0x80 kernel/printk/printk.c:342
  softirqs last  enabled at (3096): [<ffffffff840e1be0>] local_bh_enable include/linux/bottom_half.h:33 [inline]
  softirqs last  enabled at (3096): [<ffffffff840e1be0>] inet_csk_listen_stop+0x2c0/0x1070 net/ipv4/inet_connection_sock.c:1524
  softirqs last disabled at (3097): [<ffffffff813b6b9f>] do_softirq+0x3f/0x90 kernel/softirq.c:480

Since we need to track the 'fallback is possible' condition and the
fallback status separately, there are a few possible races open between
the check and the actual fallback action.

Add a spinlock to protect the fallback related information and use it
close all the possible related races. While at it also remove the
too-early clearing of allow_infinite_fallback in __mptcp_subflow_connect():
the field will be correctly cleared by subflow_finish_connect() if/when
the connection will complete successfully.

If fallback is not possible, as per RFC, reset the current subflow.

Since the fallback operation can now fail and return value should be
checked, rename the helper accordingly.

Fixes: 0530020a7c8f ("mptcp: track and update contiguous data status")
Cc: stable@vger.kernel.org
Reported-by: Matthieu Baerts <matttbe@kernel.org>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/570
Reported-by: syzbot+5cf807c20386d699b524@syzkaller.appspotmail.com
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/555
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250714-net-mptcp-fallback-races-v1-1-391aff963322@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: libwx: fix multicast packets received count
Jiawen Wu [Mon, 14 Jul 2025 01:56:56 +0000 (09:56 +0800)]
net: libwx: fix multicast packets received count

Multicast good packets received by PF rings that pass ethternet MAC
address filtering are counted for rtnl_link_stats64.multicast. The
counter is not cleared on read. Fix the duplicate counting on updating
statistics.

Fixes: 46b92e10d631 ("net: libwx: support hardware statistics")
Cc: stable@vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/DA229A4F58B70E51+20250714015656.91772-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge branch 'fix-rx-fatal-errors'
Jakub Kicinski [Wed, 16 Jul 2025 00:28:28 +0000 (17:28 -0700)]
Merge branch 'fix-rx-fatal-errors'

Jiawen Wu says:

====================
Fix Rx fatal errors

There are some fatal errors on the Rx NAPI path, which can cause
the kernel to crash. Fix known issues and potential risks.

The part of the patches has been mentioned before[1].

[1]: https://lore.kernel.org/all/C8A23A11DB646E60+20250630094102.22265-1-jiawenwu@trustnetic.com/

v1: https://lore.kernel.org/20250709064025.19436-1-jiawenwu@trustnetic.com
====================

Link: https://patch.msgid.link/20250714024755.17512-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: libwx: properly reset Rx ring descriptor
Jiawen Wu [Mon, 14 Jul 2025 02:47:55 +0000 (10:47 +0800)]
net: libwx: properly reset Rx ring descriptor

When device reset is triggered by feature changes such as toggling Rx
VLAN offload, wx->do_reset() is called to reinitialize Rx rings. The
hardware descriptor ring may retain stale values from previous sessions.
And only set the length to 0 in rx_desc[0] would result in building
malformed SKBs. Fix it to ensure a clean slate after device reset.

[  549.186435] [     C16] ------------[ cut here ]------------
[  549.186457] [     C16] kernel BUG at net/core/skbuff.c:2814!
[  549.186468] [     C16] Oops: invalid opcode: 0000 [#1] SMP NOPTI
[  549.186472] [     C16] CPU: 16 UID: 0 PID: 0 Comm: swapper/16 Kdump: loaded Not tainted 6.16.0-rc4+ #23 PREEMPT(voluntary)
[  549.186476] [     C16] Hardware name: Micro-Star International Co., Ltd. MS-7E16/X670E GAMING PLUS WIFI (MS-7E16), BIOS 1.90 12/31/2024
[  549.186478] [     C16] RIP: 0010:__pskb_pull_tail+0x3ff/0x510
[  549.186484] [     C16] Code: 06 f0 ff 4f 34 74 7b 4d 8b 8c 24 c8 00 00 00 45 8b 84 24 c0 00 00 00 e9 c8 fd ff ff 48 c7 44 24 08 00 00 00 00 e9 5e fe ff ff <0f> 0b 31 c0 e9 23 90 5b ff 41 f7 c6 ff 0f 00 00 75 bf 49 8b 06 a8
[  549.186487] [     C16] RSP: 0018:ffffb391c0640d70 EFLAGS: 00010282
[  549.186490] [     C16] RAX: 00000000fffffff2 RBX: ffff8fe7e4d40200 RCX: 00000000fffffff2
[  549.186492] [     C16] RDX: ffff8fe7c3a4bf8e RSI: 0000000000000180 RDI: ffff8fe7c3a4bf40
[  549.186494] [     C16] RBP: ffffb391c0640da8 R08: ffff8fe7c3a4c0c0 R09: 000000000000000e
[  549.186496] [     C16] R10: ffffb391c0640d88 R11: 000000000000000e R12: ffff8fe7e4d40200
[  549.186497] [     C16] R13: 00000000fffffff2 R14: ffff8fe7fa01a000 R15: 00000000fffffff2
[  549.186499] [     C16] FS:  0000000000000000(0000) GS:ffff8fef5ae40000(0000) knlGS:0000000000000000
[  549.186502] [     C16] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  549.186503] [     C16] CR2: 00007f77d81d6000 CR3: 000000051a032000 CR4: 0000000000750ef0
[  549.186505] [     C16] PKRU: 55555554
[  549.186507] [     C16] Call Trace:
[  549.186510] [     C16]  <IRQ>
[  549.186513] [     C16]  ? srso_alias_return_thunk+0x5/0xfbef5
[  549.186517] [     C16]  __skb_pad+0xc7/0xf0
[  549.186523] [     C16]  wx_clean_rx_irq+0x355/0x3b0 [libwx]
[  549.186533] [     C16]  wx_poll+0x92/0x120 [libwx]
[  549.186540] [     C16]  __napi_poll+0x28/0x190
[  549.186544] [     C16]  net_rx_action+0x301/0x3f0
[  549.186548] [     C16]  ? srso_alias_return_thunk+0x5/0xfbef5
[  549.186551] [     C16]  ? __raw_spin_lock_irqsave+0x1e/0x50
[  549.186554] [     C16]  ? srso_alias_return_thunk+0x5/0xfbef5
[  549.186557] [     C16]  ? wake_up_nohz_cpu+0x35/0x160
[  549.186559] [     C16]  ? srso_alias_return_thunk+0x5/0xfbef5
[  549.186563] [     C16]  handle_softirqs+0xf9/0x2c0
[  549.186568] [     C16]  __irq_exit_rcu+0xc7/0x130
[  549.186572] [     C16]  common_interrupt+0xb8/0xd0
[  549.186576] [     C16]  </IRQ>
[  549.186577] [     C16]  <TASK>
[  549.186579] [     C16]  asm_common_interrupt+0x22/0x40
[  549.186582] [     C16] RIP: 0010:cpuidle_enter_state+0xc2/0x420
[  549.186585] [     C16] Code: 00 00 e8 11 0e 5e ff e8 ac f0 ff ff 49 89 c5 0f 1f 44 00 00 31 ff e8 0d ed 5c ff 45 84 ff 0f 85 40 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 84 01 00 00 49 63 d6 48 8d 04 52 48 8d 04 82 49 8d
[  549.186587] [     C16] RSP: 0018:ffffb391c0277e78 EFLAGS: 00000246
[  549.186590] [     C16] RAX: ffff8fef5ae40000 RBX: 0000000000000003 RCX: 0000000000000000
[  549.186591] [     C16] RDX: 0000007fde0faac5 RSI: ffffffff826e53f6 RDI: ffffffff826fa9b3
[  549.186593] [     C16] RBP: ffff8fe7c3a20800 R08: 0000000000000002 R09: 0000000000000000
[  549.186595] [     C16] R10: 0000000000000000 R11: 000000000000ffff R12: ffffffff82ed7a40
[  549.186596] [     C16] R13: 0000007fde0faac5 R14: 0000000000000003 R15: 0000000000000000
[  549.186601] [     C16]  ? cpuidle_enter_state+0xb3/0x420
[  549.186605] [     C16]  cpuidle_enter+0x29/0x40
[  549.186609] [     C16]  cpuidle_idle_call+0xfd/0x170
[  549.186613] [     C16]  do_idle+0x7a/0xc0
[  549.186616] [     C16]  cpu_startup_entry+0x25/0x30
[  549.186618] [     C16]  start_secondary+0x117/0x140
[  549.186623] [     C16]  common_startup_64+0x13e/0x148
[  549.186628] [     C16]  </TASK>

Fixes: 3c47e8ae113a ("net: libwx: Support to receive packets in NAPI")
Cc: stable@vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250714024755.17512-4-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: libwx: fix the using of Rx buffer DMA
Jiawen Wu [Mon, 14 Jul 2025 02:47:54 +0000 (10:47 +0800)]
net: libwx: fix the using of Rx buffer DMA

The wx_rx_buffer structure contained two DMA address fields: 'dma' and
'page_dma'. However, only 'page_dma' was actually initialized and used
to program the Rx descriptor. But 'dma' was uninitialized and used in
some paths.

This could lead to undefined behavior, including DMA errors or
use-after-free, if the uninitialized 'dma' was used. Althrough such
error has not yet occurred, it is worth fixing in the code.

Fixes: 3c47e8ae113a ("net: libwx: Support to receive packets in NAPI")
Cc: stable@vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250714024755.17512-3-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: libwx: remove duplicate page_pool_put_full_page()
Jiawen Wu [Mon, 14 Jul 2025 02:47:53 +0000 (10:47 +0800)]
net: libwx: remove duplicate page_pool_put_full_page()

page_pool_put_full_page() should only be invoked when freeing Rx buffers
or building a skb if the size is too short. At other times, the pages
need to be reused. So remove the redundant page put. In the original
code, double free pages cause kernel panic:

[  876.949834]  __irq_exit_rcu+0xc7/0x130
[  876.949836]  common_interrupt+0xb8/0xd0
[  876.949838]  </IRQ>
[  876.949838]  <TASK>
[  876.949840]  asm_common_interrupt+0x22/0x40
[  876.949841] RIP: 0010:cpuidle_enter_state+0xc2/0x420
[  876.949843] Code: 00 00 e8 d1 1d 5e ff e8 ac f0 ff ff 49 89 c5 0f 1f 44 00 00 31 ff e8 cd fc 5c ff 45 84 ff 0f 85 40 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 84 01 00 00 49 63 d6 48 8d 04 52 48 8d 04 82 49 8d
[  876.949844] RSP: 0018:ffffaa7340267e78 EFLAGS: 00000246
[  876.949845] RAX: ffff9e3f135be000 RBX: 0000000000000002 RCX: 0000000000000000
[  876.949846] RDX: 000000cc2dc4cb7c RSI: ffffffff89ee49ae RDI: ffffffff89ef9f9e
[  876.949847] RBP: ffff9e378f940800 R08: 0000000000000002 R09: 00000000000000ed
[  876.949848] R10: 000000000000afc8 R11: ffff9e3e9e5a9b6c R12: ffffffff8a6d8580
[  876.949849] R13: 000000cc2dc4cb7c R14: 0000000000000002 R15: 0000000000000000
[  876.949852]  ? cpuidle_enter_state+0xb3/0x420
[  876.949855]  cpuidle_enter+0x29/0x40
[  876.949857]  cpuidle_idle_call+0xfd/0x170
[  876.949859]  do_idle+0x7a/0xc0
[  876.949861]  cpu_startup_entry+0x25/0x30
[  876.949862]  start_secondary+0x117/0x140
[  876.949864]  common_startup_64+0x13e/0x148
[  876.949867]  </TASK>
[  876.949868] ---[ end trace 0000000000000000 ]---
[  876.949869] ------------[ cut here ]------------
[  876.949870] list_del corruption, ffffead40445a348->next is NULL
[  876.949873] WARNING: CPU: 14 PID: 0 at lib/list_debug.c:52 __list_del_entry_valid_or_report+0x67/0x120
[  876.949875] Modules linked in: snd_hrtimer(E) bnep(E) binfmt_misc(E) amdgpu(E) squashfs(E) vfat(E) loop(E) fat(E) amd_atl(E) snd_hda_codec_realtek(E) intel_rapl_msr(E) snd_hda_codec_generic(E) intel_rapl_common(E) snd_hda_scodec_component(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) edac_mce_amd(E) snd_intel_dspcfg(E) snd_hda_codec(E) snd_hda_core(E) amdxcp(E) kvm_amd(E) snd_hwdep(E) gpu_sched(E) drm_panel_backlight_quirks(E) cec(E) snd_pcm(E) drm_buddy(E) snd_seq_dummy(E) drm_ttm_helper(E) btusb(E) kvm(E) snd_seq_oss(E) btrtl(E) ttm(E) btintel(E) snd_seq_midi(E) btbcm(E) drm_exec(E) snd_seq_midi_event(E) i2c_algo_bit(E) snd_rawmidi(E) bluetooth(E) drm_suballoc_helper(E) irqbypass(E) snd_seq(E) ghash_clmulni_intel(E) sha512_ssse3(E) drm_display_helper(E) aesni_intel(E) snd_seq_device(E) rfkill(E) snd_timer(E) gf128mul(E) drm_client_lib(E) drm_kms_helper(E) snd(E) i2c_piix4(E) joydev(E) soundcore(E) wmi_bmof(E) ccp(E) k10temp(E) i2c_smbus(E) gpio_amdpt(E) i2c_designware_platform(E) gpio_generic(E) sg(E)
[  876.949914]  i2c_designware_core(E) sch_fq_codel(E) parport_pc(E) drm(E) ppdev(E) lp(E) parport(E) fuse(E) nfnetlink(E) ip_tables(E) ext4 crc16 mbcache jbd2 sd_mod sfp mdio_i2c i2c_core txgbe ahci ngbe pcs_xpcs libahci libwx r8169 phylink libata realtek ptp pps_core video wmi
[  876.949933] CPU: 14 UID: 0 PID: 0 Comm: swapper/14 Kdump: loaded Tainted: G        W   E       6.16.0-rc2+ #20 PREEMPT(voluntary)
[  876.949935] Tainted: [W]=WARN, [E]=UNSIGNED_MODULE
[  876.949936] Hardware name: Micro-Star International Co., Ltd. MS-7E16/X670E GAMING PLUS WIFI (MS-7E16), BIOS 1.90 12/31/2024
[  876.949936] RIP: 0010:__list_del_entry_valid_or_report+0x67/0x120
[  876.949938] Code: 00 00 00 48 39 7d 08 0f 85 a6 00 00 00 5b b8 01 00 00 00 5d 41 5c e9 73 0d 93 ff 48 89 fe 48 c7 c7 a0 31 e8 89 e8 59 7c b3 ff <0f> 0b 31 c0 5b 5d 41 5c e9 57 0d 93 ff 48 89 fe 48 c7 c7 c8 31 e8
[  876.949940] RSP: 0018:ffffaa73405d0c60 EFLAGS: 00010282
[  876.949941] RAX: 0000000000000000 RBX: ffffead40445a348 RCX: 0000000000000000
[  876.949942] RDX: 0000000000000105 RSI: 0000000000000001 RDI: 00000000ffffffff
[  876.949943] RBP: 0000000000000000 R08: 000000010006dfde R09: ffffffff8a47d150
[  876.949944] R10: ffffffff8a47d150 R11: 0000000000000003 R12: dead000000000122
[  876.949945] R13: ffff9e3e9e5af700 R14: ffffead40445a348 R15: ffff9e3e9e5af720
[  876.949946] FS:  0000000000000000(0000) GS:ffff9e3f135be000(0000) knlGS:0000000000000000
[  876.949947] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  876.949948] CR2: 00007fa58b480048 CR3: 0000000156724000 CR4: 0000000000750ef0
[  876.949949] PKRU: 55555554
[  876.949950] Call Trace:
[  876.949951]  <IRQ>
[  876.949952]  __rmqueue_pcplist+0x53/0x2c0
[  876.949955]  alloc_pages_bulk_noprof+0x2e0/0x660
[  876.949958]  __page_pool_alloc_pages_slow+0xa9/0x400
[  876.949961]  page_pool_alloc_pages+0xa/0x20
[  876.949963]  wx_alloc_rx_buffers+0xd7/0x110 [libwx]
[  876.949967]  wx_clean_rx_irq+0x262/0x430 [libwx]
[  876.949971]  wx_poll+0x92/0x130 [libwx]
[  876.949975]  __napi_poll+0x28/0x190
[  876.949977]  net_rx_action+0x301/0x3f0
[  876.949980]  ? srso_alias_return_thunk+0x5/0xfbef5
[  876.949981]  ? profile_tick+0x30/0x70
[  876.949983]  ? srso_alias_return_thunk+0x5/0xfbef5
[  876.949984]  ? srso_alias_return_thunk+0x5/0xfbef5
[  876.949986]  ? timerqueue_add+0xa3/0xc0
[  876.949988]  ? srso_alias_return_thunk+0x5/0xfbef5
[  876.949989]  ? __raise_softirq_irqoff+0x16/0x70
[  876.949991]  ? srso_alias_return_thunk+0x5/0xfbef5
[  876.949993]  ? srso_alias_return_thunk+0x5/0xfbef5
[  876.949994]  ? wx_msix_clean_rings+0x41/0x50 [libwx]
[  876.949998]  handle_softirqs+0xf9/0x2c0

Fixes: 3c47e8ae113a ("net: libwx: Support to receive packets in NAPI")
Cc: stable@vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250714024755.17512-2-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: stmmac: intel: populate entire system_counterval_t in get_time_fn() callback
Markus Blöchl [Sun, 13 Jul 2025 20:21:41 +0000 (22:21 +0200)]
net: stmmac: intel: populate entire system_counterval_t in get_time_fn() callback

get_time_fn() callback implementations are expected to fill out the
entire system_counterval_t struct as it may be initially uninitialized.

This broke with the removal of convert_art_to_tsc() helper functions
which left use_nsecs uninitialized.

Initially assign the entire struct with default values.

Fixes: f5e1d0db3f02 ("stmmac: intel: Remove convert_art_to_tsc()")
Cc: stable@vger.kernel.org
Signed-off-by: Markus Blöchl <markus@blochl.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250713-stmmac_crossts-v1-1-31bfe051b5cb@blochl.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge tag 'linux-can-fixes-for-6.16-20250715' of git://git.kernel.org/pub/scm/linux...
Jakub Kicinski [Tue, 15 Jul 2025 23:05:41 +0000 (16:05 -0700)]
Merge tag 'linux-can-fixes-for-6.16-20250715' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2025-07-15

Brett Werling's patch for the tcan4x5x glue code driver fixes the
detection of chips which are held in reset/sleep and must be woken up
by GPIO prior to communication.

* tag 'linux-can-fixes-for-6.16-20250715' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: tcan4x5x: fix reset gpio usage during probe
====================

Link: https://patch.msgid.link/20250715101625.3202690-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agousb: net: sierra: check for no status endpoint
Oliver Neukum [Mon, 14 Jul 2025 11:12:56 +0000 (13:12 +0200)]
usb: net: sierra: check for no status endpoint

The driver checks for having three endpoints and
having bulk in and out endpoints, but not that
the third endpoint is interrupt input.
Rectify the omission.

Reported-by: syzbot+3f89ec3d1d0842e95d50@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-usb/686d5a9f.050a0220.1ffab7.0017.GAE@google.com/
Tested-by: syzbot+3f89ec3d1d0842e95d50@syzkaller.appspotmail.com
Fixes: eb4fd8cd355c8 ("net/usb: add sierra_net.c driver")
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Link: https://patch.msgid.link/20250714111326.258378-1-oneukum@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoice: check correct pointer in fwlog debugfs
Michal Swiatkowski [Tue, 24 Jun 2025 09:26:36 +0000 (11:26 +0200)]
ice: check correct pointer in fwlog debugfs

pf->ice_debugfs_pf_fwlog should be checked for an error here.

Fixes: 96a9a9341cda ("ice: configure FW logging")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 months agoice: add NULL check in eswitch lag check
Dave Ertman [Thu, 22 May 2025 17:16:57 +0000 (13:16 -0400)]
ice: add NULL check in eswitch lag check

The function ice_lag_is_switchdev_running() is being called from outside of
the LAG event handler code.  This results in the lag->upper_netdev being
NULL sometimes.  To avoid a NULL-pointer dereference, there needs to be a
check before it is dereferenced.

Fixes: 776fe19953b0 ("ice: block default rule setting on LAG interface")
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 months agoethernet: intel: fix building with large NR_CPUS
Arnd Bergmann [Fri, 20 Jun 2025 17:31:24 +0000 (19:31 +0200)]
ethernet: intel: fix building with large NR_CPUS

With large values of CONFIG_NR_CPUS, three Intel ethernet drivers fail to
compile like:

In function ‘i40e_free_q_vector’,
    inlined from ‘i40e_vsi_alloc_q_vectors’ at drivers/net/ethernet/intel/i40e/i40e_main.c:12112:3:
  571 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
include/linux/rcupdate.h:1084:17: note: in expansion of macro ‘BUILD_BUG_ON’
 1084 |                 BUILD_BUG_ON(offsetof(typeof(*(ptr)), rhf) >= 4096);    \
drivers/net/ethernet/intel/i40e/i40e_main.c:5113:9: note: in expansion of macro ‘kfree_rcu’
 5113 |         kfree_rcu(q_vector, rcu);
      |         ^~~~~~~~~

The problem is that the 'rcu' member in 'q_vector' is too far from the start
of the structure. Move this member before the CPU mask instead, in all three
drivers.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 months agoMerge tag 'soc-fixes-6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Tue, 15 Jul 2025 16:26:33 +0000 (09:26 -0700)]
Merge tag 'soc-fixes-6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:
 "There are 18 devicetree fixes for three arm64 plaforms: Qualcomm
  Snapdragon, Rockchips and NXP i.MX. These get updated to more
  correctly describe the hardware, fixing issues with:

   - real-time clock on Snapdragon based laptops

   - SD card detection, PCI probing and HDMI/DDC communication on
     Rockchips

   - ethernet and SPI probing on certain i.MX based boards

   - a regression with the i.MX watchdog

  Aside from the devicetree fixes, there are two additional fixes for
  the merged ASPEED LPC snoop driver that saw some changes in 6.16, and
  one additional driver enabled in arm64 defconfig to fix CPU frequency
  scaling"

* tag 'soc-fixes-6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (21 commits)
  arm64: dts: freescale: imx8mm-verdin: Keep LDO5 always on
  soc: aspeed: lpc-snoop: Don't disable channels that aren't enabled
  soc: aspeed: lpc-snoop: Cleanup resources in stack-order
  arm64: dts: imx95: Correct the DMA interrupter number of pcie0_ep
  arm64: dts: rockchip: Add missing fan-supply to rk3566-quartz64-a
  arm64: dts: rockchip: use cs-gpios for spi1 on ringneck
  arm64: dts: add big-endian property back into watchdog node
  arm64: dts: imx95-15x15-evk: fix the overshoot issue of NETC
  arm64: dts: imx95-19x19-evk: fix the overshoot issue of NETC
  arm64: dts: rockchip: list all CPU supplies on ArmSoM Sige5
  arm64: dts: imx8mp-venice-gw74xx: fix TPM SPI frequency
  arm64: dts: imx8mp-venice-gw73xx: fix TPM SPI frequency
  arm64: dts: imx8mp-venice-gw72xx: fix TPM SPI frequency
  arm64: dts: imx8mp-venice-gw71xx: fix TPM SPI frequency
  arm64: dts: qcom: x1e80100: describe uefi rtc offset
  arm64: dts: qcom: sc8280xp-x13s: describe uefi rtc offset
  arm64: defconfig: Enable Qualcomm CPUCP mailbox driver
  arm64: dts: rockchip: Add cd-gpios for sdcard detect on Cool Pi 4B
  arm64: dts: rockchip: Add cd-gpios for sdcard detect on Cool Pi CM5
  arm64: dts: rockchip: Adjust the HDMI DDC IO driver strength for rk3588
  ...

3 months agoMerge tag 'hid-for-linus-2025071501' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 15 Jul 2025 16:20:44 +0000 (09:20 -0700)]
Merge tag 'hid-for-linus-2025071501' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Benjamin Tissoires:

 - one warning cleanup introduced in the last PR (Andy Shevchenko)

 - a nasty syzbot buffer underflow fix co-debugged with Alan Stern
   (Benjamin Tissoires)

* tag 'hid-for-linus-2025071501' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  selftests/hid: add a test case for the recent syzbot underflow
  HID: core: do not bypass hid_hw_raw_request
  HID: core: ensure __hid_request reserves the report ID as the first byte
  HID: core: ensure the allocated report buffer can contain the reserved report ID
  HID: debug: Remove duplicate entry (BTN_WHEEL)

3 months agoselftests: net: increase inter-packet timeout in udpgro.sh
Paolo Abeni [Thu, 10 Jul 2025 16:04:50 +0000 (18:04 +0200)]
selftests: net: increase inter-packet timeout in udpgro.sh

The mentioned test is not very stable when running on top of
debug kernel build. Increase the inter-packet timeout to allow
more slack in such environments.

Fixes: 3327a9c46352 ("selftests: add functionals test for UDP GRO")
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/b0370c06ddb3235debf642c17de0284b2cd3c652.1752163107.git.pabeni@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agoPM: sleep: Update power.completion for all devices on errors
Rafael J. Wysocki [Mon, 14 Jul 2025 17:45:31 +0000 (19:45 +0200)]
PM: sleep: Update power.completion for all devices on errors

After commit aa7a9275ab81 ("PM: sleep: Suspend async parents after
suspending children"), the following scenario is possible:

 1. Device A is async and it depends on device B that is sync.
 2. Async suspend is scheduled for A before the processing of B is
    started.
 3. A is waiting for B.
 4. In the meantime, an unrelated device fails to suspend and returns
    an error.
 5. The processing of B doesn't start at all and its power.completion is
    not updated.
 6. A is still waiting for B when async_synchronize_full() is called.
 7. Deadlock ensues.

To prevent this from happening, update power.completion for all devices
on errors in all suspend phases, but do not do it directly for devices
that are already being processed or are waiting for the processing to
start because in those cases it may be necessary to wait for the
processing to actually complete before updating power.completion for
the device.

Fixes: aa7a9275ab81 ("PM: sleep: Suspend async parents after suspending children")
Fixes: 443046d1ad66 ("PM: sleep: Make suspend of devices more asynchronous")
Closes: https://lore.kernel.org/linux-pm/e13740a0-88f3-4a6f-920f-15805071a7d6@linaro.org/
Reported-and-tested-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/6191258.lOV4Wx5bFT@rjwysocki.net
3 months agoPM: suspend: clean up redundant filesystems_freeze/thaw() handling
Zihuan Zhang [Sat, 12 Jul 2025 03:08:24 +0000 (11:08 +0800)]
PM: suspend: clean up redundant filesystems_freeze/thaw() handling

The recently introduced support for freezing filesystems during system
suspend included calls to filesystems_freeze() in both suspend_prepare()
and enter_state(), as well as calls to filesystems_thaw() in both
suspend_finish() and the Unlock path in enter_state(). These are
redundant.

Moreover, calling filesystems_freeze() twice, from both suspend_prepare()
and enter_state(), leads to a black screen and makes the system unable
to resume in some cases.

Address this as follows:

 - filesystems_freeze() is already called in suspend_prepare(), which
   is the proper and consistent place to handle pre-suspend operations.
   The second call in enter_state() is unnecessary and so remove it.

 - filesystems_thaw() is invoked in suspend_finish(), which covers
   successful suspend/resume paths. In the failure case, add a call
   to filesystems_thaw() only when needed, avoiding the duplicate call
   in the general Unlock path.

This change simplifies the suspend code and avoids repeated freeze/thaw
calls, while preserving correct ordering and behavior.

Fixes: eacfbf74196f ("power: freeze filesystems during suspend/resume")
Signed-off-by: Zihuan Zhang <zhangzihuan@kylinos.cn>
Link: https://patch.msgid.link/20250712030824.81474-1-zhangzihuan@kylinos.cn
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
3 months agoPM: suspend: Drop a misplaced pm_restore_gfp_mask() call
Rafael J. Wysocki [Thu, 10 Jul 2025 08:43:26 +0000 (10:43 +0200)]
PM: suspend: Drop a misplaced pm_restore_gfp_mask() call

The pm_restore_gfp_mask() call added by commit 12ffc3b1513e ("PM:
Restrict swap use to later in the suspend sequence") to
suspend_devices_and_enter() is done too early because it takes
place before calling dpm_resume() in dpm_resume_end() and some
swap-backing devices may not be ready at that point.  Moreover,
dpm_resume_end() called subsequently in the same code path invokes
pm_restore_gfp_mask() again and calling it twice in a row is
pointless.

Drop the misplaced pm_restore_gfp_mask() call from
suspend_devices_and_enter() to address this issue.

Fixes: 12ffc3b1513e ("PM: Restrict swap use to later in the suspend sequence")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/2810409.mvXUDI8C0e@rjwysocki.net
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
3 months agoMerge tag 'iwlwifi-fixes-2025-07-15' of https://git.kernel.org/pub/scm/linux/kernel...
Johannes Berg [Tue, 15 Jul 2025 11:07:24 +0000 (13:07 +0200)]
Merge tag 'iwlwifi-fixes-2025-07-15' of https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next

Miri Korenblit says:
====================
iwlwifi-fixes

- missing unlock in error path
- Avoid FW assert on bad command values
- fix kernel panic due to incorrect index calculation
====================

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
3 months agocan: tcan4x5x: fix reset gpio usage during probe
Brett Werling [Fri, 11 Jul 2025 14:17:28 +0000 (09:17 -0500)]
can: tcan4x5x: fix reset gpio usage during probe

Fixes reset GPIO usage during probe by ensuring we retrieve the GPIO and
take the device out of reset (if it defaults to being in reset) before
we attempt to communicate with the device. This is achieved by moving
the call to tcan4x5x_get_gpios() before tcan4x5x_find_version() and
avoiding any device communication while getting the GPIOs. Once we
determine the version, we can then take the knowledge of which GPIOs we
obtained and use it to decide whether we need to disable the wake or
state pin functions within the device.

This change is necessary in a situation where the reset GPIO is pulled
high externally before the CPU takes control of it, meaning we need to
explicitly bring the device out of reset before we can start
communicating with it at all.

This also has the effect of fixing an issue where a reset of the device
would occur after having called tcan4x5x_disable_wake(), making the
original behavior not actually disable the wake. This patch should now
disable wake or state pin functions well after the reset occurs.

Signed-off-by: Brett Werling <brett.werling@garmin.com>
Link: https://patch.msgid.link/20250711141728.1826073-1-brett.werling@garmin.com
Cc: Markus Schneider-Pargmann <msp@baylibre.com>
Fixes: 142c6dc6d9d7 ("can: tcan4x5x: Add support for tcan4552/4553")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 months agowifi: iwlwifi: Fix botched indexing conversion
Ville Syrjälä [Fri, 11 Jul 2025 20:57:44 +0000 (23:57 +0300)]
wifi: iwlwifi: Fix botched indexing conversion

The conversion from compiler assisted indexing to manual
indexing wasn't done correctly. The array is still made
up of __le16 elements so multiplying the outer index by
the element size is not what we want. Fix it up.

This causes the kernel to oops when trying to transfer any
significant amount of data over wifi:

BUG: unable to handle page fault for address: ffffc900009f5282
PGD 100000067 P4D 100000067 PUD 1000fb067 PMD 102e82067 PTE 0
Oops: Oops: 0002 [#1] SMP
CPU: 1 UID: 0 PID: 99 Comm: kworker/u8:3 Not tainted 6.15.0-rc2-cl-bisect3-00604-g6204d5130a64-dirty #78 PREEMPT
Hardware name: Dell Inc. Latitude E5400                  /0D695C, BIOS A19 06/13/2013
Workqueue: events_unbound cfg80211_wiphy_work [cfg80211]
RIP: 0010:iwl_trans_pcie_tx+0x4dd/0xe60 [iwlwifi]
Code: 00 00 66 81 fa ff 0f 0f 87 42 09 00 00 3d ff 00 00 00 0f 8f 37 09 00 00 41 c1 e0 0c 41 09 d0 48 8d 14 b6 48 c1 e2 07 48 01 ca <66> 44 89 04 57 48 8d 0c 12 83 f8 3f 0f 8e 84 01 00 00 41 8b 85 80
RSP: 0018:ffffc900001c3b50 EFLAGS: 00010206
RAX: 00000000000000c1 RBX: ffff88810b180028 RCX: 00000000000000c1
RDX: 0000000000002141 RSI: 000000000000000d RDI: ffffc900009f1000
RBP: 0000000000000002 R08: 0000000000000025 R09: ffffffffa050fa60
R10: 00000000fbdbf4bc R11: 0000000000000082 R12: ffff88810e5ade40
R13: ffff88810af81588 R14: 000000000000001a R15: ffff888100dfe0c8
FS:  0000000000000000(0000) GS:ffff8881998c3000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffc900009f5282 CR3: 0000000001e39000 CR4: 00000000000426f0
Call Trace:
 <TASK>
 ? rcu_is_watching+0xd/0x40
 ? __iwl_dbg+0xb1/0xe0 [iwlwifi]
 iwlagn_tx_skb+0x8e2/0xcb0 [iwldvm]
 iwlagn_mac_tx+0x18/0x30 [iwldvm]
 ieee80211_handle_wake_tx_queue+0x6c/0xc0 [mac80211]
 ieee80211_agg_start_txq+0x140/0x2e0 [mac80211]
 ieee80211_agg_tx_operational+0x126/0x210 [mac80211]
 ieee80211_process_addba_resp+0x27b/0x2a0 [mac80211]
 ieee80211_iface_work+0x4bd/0x4d0 [mac80211]
 ? _raw_spin_unlock_irq+0x1f/0x40
 cfg80211_wiphy_work+0x117/0x1f0 [cfg80211]
 process_one_work+0x1ee/0x570
 worker_thread+0x1c5/0x3b0
 ? bh_worker+0x240/0x240
 kthread+0x110/0x220
 ? kthread_queue_delayed_work+0x90/0x90
 ret_from_fork+0x28/0x40
 ? kthread_queue_delayed_work+0x90/0x90
 ret_from_fork_asm+0x11/0x20
 </TASK>
Modules linked in: ctr aes_generic ccm sch_fq_codel bnep xt_tcpudp xt_multiport xt_state iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv4 ip_tables x_tables btusb btrtl btintel btbcm bluetooth ecdh_generic ecc libaes hid_generic usbhid hid binfmt_misc joydev mousedev snd_hda_codec_hdmi iwldvm snd_hda_codec_idt snd_hda_codec_generic mac80211 coretemp iTCO_wdt watchdog kvm_intel i2c_dev snd_hda_intel libarc4 kvm snd_intel_dspcfg sdhci_pci sdhci_uhs2 snd_hda_codec iwlwifi sdhci irqbypass cqhci snd_hwdep snd_hda_core cfg80211 firewire_ohci mmc_core psmouse snd_pcm i2c_i801 firewire_core pcspkr led_class uhci_hcd i2c_smbus tg3 crc_itu_t iosf_mbi snd_timer rfkill libphy ehci_pci snd ehci_hcd lpc_ich mfd_core usbcore video intel_agp usb_common soundcore intel_gtt evdev agpgart parport_pc wmi parport backlight
CR2: ffffc900009f5282
---[ end trace 0000000000000000 ]---
RIP: 0010:iwl_trans_pcie_tx+0x4dd/0xe60 [iwlwifi]
Code: 00 00 66 81 fa ff 0f 0f 87 42 09 00 00 3d ff 00 00 00 0f 8f 37 09 00 00 41 c1 e0 0c 41 09 d0 48 8d 14 b6 48 c1 e2 07 48 01 ca <66> 44 89 04 57 48 8d 0c 12 83 f8 3f 0f 8e 84 01 00 00 41 8b 85 80
RSP: 0018:ffffc900001c3b50 EFLAGS: 00010206
RAX: 00000000000000c1 RBX: ffff88810b180028 RCX: 00000000000000c1
RDX: 0000000000002141 RSI: 000000000000000d RDI: ffffc900009f1000
RBP: 0000000000000002 R08: 0000000000000025 R09: ffffffffa050fa60
R10: 00000000fbdbf4bc R11: 0000000000000082 R12: ffff88810e5ade40
R13: ffff88810af81588 R14: 000000000000001a R15: ffff888100dfe0c8
FS:  0000000000000000(0000) GS:ffff8881998c3000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffc900009f5282 CR3: 0000000001e39000 CR4: 00000000000426f0
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: disabled
---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

Cc: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Fixes: 6204d5130a64 ("wifi: iwlwifi: use bc entries instead of bc table also for pre-ax210")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20250711205744.28723-1-ville.syrjala@linux.intel.com
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
3 months agowifi: cfg80211: remove scan request n_channels counted_by
Johannes Berg [Mon, 14 Jul 2025 12:21:30 +0000 (14:21 +0200)]
wifi: cfg80211: remove scan request n_channels counted_by

This reverts commit e3eac9f32ec0 ("wifi: cfg80211: Annotate struct
cfg80211_scan_request with __counted_by").

This really has been a completely failed experiment. There were
no actual bugs found, and yet at this point we already have four
"fixes" to it, with nothing to show for but code churn, and it
never even made the code any safer.

In all of the cases that ended up getting "fixed", the structure
is also internally inconsistent after the n_channels setting as
the channel list isn't actually filled yet. You cannot scan with
such a structure, that's just wrong. In mac80211, the struct is
also reused multiple times, so initializing it once is no good.

Some previous "fixes" (e.g. one in brcm80211) are also just setting
n_channels before accessing the array, under the assumption that the
code is correct and the array can be accessed, further showing that
the whole thing is just pointless when the allocation count and use
count are not separate.

If we really wanted to fix it, we'd need to separately track the
number of channels allocated and the number of channels currently
used, but given that no bugs were found despite the numerous syzbot
reports, that'd just be a waste of time.

Remove the __counted_by() annotation. We really should also remove
a number of the n_channels settings that are setting up a structure
that's inconsistent, but that can wait.

Reported-by: syzbot+e834e757bd9b3d3e1251@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e834e757bd9b3d3e1251
Fixes: e3eac9f32ec0 ("wifi: cfg80211: Annotate struct cfg80211_scan_request with __counted_by")
Link: https://patch.msgid.link/20250714142130.9b0bbb7e1f07.I09112ccde72d445e11348fc2bef68942cb2ffc94@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
3 months agoMerge tag 'ath-current-20250714' of git://git.kernel.org/pub/scm/linux/kernel/git...
Johannes Berg [Tue, 15 Jul 2025 08:26:03 +0000 (10:26 +0200)]
Merge tag 'ath-current-20250714' of git://git.kernel.org/pub/scm/linux/kernel/git/ath/ath

Jeff Johnson says:
==================
ath.git update for v6.16-rc7

Fix an ath12k performance regression introduce in v6.16-rc1
==================

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
3 months agoMerge tag 'for-6.16/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 15 Jul 2025 02:25:28 +0000 (19:25 -0700)]
Merge tag 'for-6.16/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fix from Mikulas Patocka:

 - dm-bufio: fix scheduling in atomic

* tag 'for-6.16/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm-bufio: fix sched in atomic context

3 months agonet: phy: Don't register LEDs for genphy
Sean Anderson [Mon, 7 Jul 2025 19:58:03 +0000 (15:58 -0400)]
net: phy: Don't register LEDs for genphy

If a PHY has no driver, the genphy driver is probed/removed directly in
phy_attach/detach. If the PHY's ofnode has an "leds" subnode, then the
LEDs will be (un)registered when probing/removing the genphy driver.
This could occur if the leds are for a non-generic driver that isn't
loaded for whatever reason. Synchronously removing the PHY device in
phy_detach leads to the following deadlock:

rtnl_lock()
ndo_close()
    ...
    phy_detach()
        phy_remove()
            phy_leds_unregister()
                led_classdev_unregister()
                    led_trigger_set()
                        netdev_trigger_deactivate()
                            unregister_netdevice_notifier()
                                rtnl_lock()

There is a corresponding deadlock on the open/register side of things
(and that one is reported by lockdep), but it requires a race while this
one is deterministic.

Generic PHYs do not support LEDs anyway, so don't bother registering
them.

Fixes: 01e5b728e9e4 ("net: phy: Add a binding for PHY LEDs")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Link: https://patch.msgid.link/20250707195803.666097-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>