net: dns_resolver: Use reST bullet list for features list
Features overview list uses an asterisk in parentheses (``(*)``)
as bullet list marker, which isn't supported by Sphinx as proper
bullet. Replace it with just asterisk.
Both the nxp,sja1105 and the nxp,sja1110 series feature an active-low
reset pin, rendering reset-gpios a valid property for all of the
nxp,sja1105 family.
Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Jonas Rebmann <jre@pengutronix.de>
Link: https://patch.msgid.link/20250924-imx8mp-prt8ml-v3-1-f498d7f71a94@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 26 Sep 2025 22:18:14 +0000 (15:18 -0700)]
Merge branch '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
idpf: add XSk support
Alexander Lobakin says:
Add support for XSk xmit and receive using libeth_xdp.
This includes adding interfaces to reconfigure/enable/disable only
a particular set of queues and support for checksum offload XSk Tx
metadata.
libeth_xdp's implementation mostly matches the one of ice: batched
allocations and sending, unrolled descriptor writes etc. But unlike
other Intel drivers, XSk wakeup is implemented using CSD/IPI instead
of HW "software interrupt". In lots of different tests, this yielded
way better perf than SW interrupts, but also, this gives better
control over which CPU will handle the NAPI loop (SW interrupts are
subject to irqbalance and the like, while CSDs are strictly pinned
1:1 to the core of the same index).
Note that the header split is always disabled for XSk queues, as
for now we see no reasons to have it there.
XSk xmit perf is up to 3x compared to ice. XSk XDP_PASS is also
considerably faster, as it uses system percpu page_pools, so that the
only overhead left is memcpy(). The rest is at least comparable.
* '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
idpf: enable XSk features and ndo_xsk_wakeup
idpf: implement Rx path for AF_XDP
idpf: implement XSk xmit
idpf: add XSk pool initialization
idpf: add virtchnl functions to manage selected queues
====================
Dan Carpenter [Wed, 24 Sep 2025 14:21:17 +0000 (17:21 +0300)]
dibs: Check correct variable in dibs_init()
There is a typo in this code. It should check "dibs_class" instead of
"&dibs_class". Remove the &.
Fixes: 804737349813 ("dibs: Create class dibs")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Link: https://patch.msgid.link/aNP-XcrjSUjZAu4a@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
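The class of bug fixed above can be illustrated in plain userspace C. This is a minimal sketch, not the dibs code: it assumes the check in question is an IS_ERR()-style test, and err_ptr_sketch(), is_err_sketch(), MAX_ERRNO_SKETCH and fake_class are hypothetical stand-ins for the kernel helpers and for dibs_class.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO_SKETCH 4095

static inline void *err_ptr_sketch(long err)
{
	return (void *)(intptr_t)err;
}

static inline bool is_err_sketch(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO_SKETCH addresses. */
	return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO_SKETCH);
}

/* Hypothetical global standing in for dibs_class. */
static void *fake_class;

/* Buggy form: tests the address of the variable, which is never an error. */
static bool init_failed_buggy(void)
{
	return is_err_sketch(&fake_class);
}

/* Fixed form: tests the value stored in the variable. */
static bool init_failed_fixed(void)
{
	return is_err_sketch(fake_class);
}

/* Simulate a creation call that failed and returned an error pointer. */
static void simulate_failed_create(void)
{
	fake_class = err_ptr_sketch(-ENOMEM);
	assert(is_err_sketch(fake_class));
}
```

After simulate_failed_create(), the buggy check still reports success because the address of a global is an ordinary valid pointer, while the fixed check reports the failure.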
This is a pull request of 48 patches for net-next/main, which
supersedes tags/linux-can-next-for-6.18-20250923.
The 1st patch is by Xichao Zhao and converts ns_to_ktime() to
us_to_ktime() in the m_can driver.
Vincent Mailhol contributes 2 patches: Updating the MAINTAINERS and
mailmap files to Vincent's new email address and sorting the includes
in the CAN helper library alphabetically.
Stéphane Grosjean's patch modifies all peak CAN drivers and the
mailmap to reflect Stéphane's new email address.
4 patches by Biju Das update the CAN-FD handling in the rcar_canfd
driver.
Followed by 11 patches by Geert Uytterhoeven updating and improving
the rcar_can driver.
Stefan Mätje contributes 2 patches for the esd_usb driver updating the
error messages.
The next 3 patch series are all by Vincent Mailhol: 3 patches to
optimize the size of struct raw_sock and struct uniqframe. 4 patches
which rework the CAN MTU logic as preparation for CAN-XL interfaces.
And finally 20 patches that prepare and refactor the CAN netlink code
for the upcoming CAN-XL support.
* tag 'linux-can-next-for-6.18-20250924' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: (48 commits)
can: netlink: add userland error messages
can: dev: add can_get_ctrlmode_str()
can: calc_bittiming: make can_calc_tdco() FD agnostic
can: netlink: make can_tdc_fill_info() FD agnostic
can: netlink: add can_bitrate_const_fill_info()
can: netlink: add can_bittiming_const_fill_info()
can: netlink: add can_bittiming_fill_info()
can: netlink: add can_data_bittiming_get_size()
can: netlink: make can_tdc_get_size() FD agnostic
can: netlink: add can_ctrlmode_changelink()
can: netlink: add can_dtb_changelink()
can: netlink: make can_tdc_changelink() FD agnostic
can: netlink: remove useless check in can_tdc_changelink()
can: netlink: refactor CAN_CTRLMODE_TDC_{AUTO,MANUAL} flag reset logic
can: netlink: add can_validate_databittiming()
can: netlink: add can_validate_tdc()
can: netlink: refactor can_validate_bittiming()
can: netlink: document which symbols are FD specific
can: dev: make can_get_relative_tdco() FD agnostic and move it to bittiming.h
can: dev: move struct data_bittiming_params to linux/can/bittiming.h
...
====================
1) Fix field-spanning memcpy warning in AH output.
From Charalampos Mitrodimas.
2) Replace the strcpy() calls for alg_name by strscpy().
From Miguel García.
* tag 'ipsec-next-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next:
xfrm: xfrm_user: use strscpy() for alg_name
net: ipv6: fix field-spanning memcpy warning in AH output
====================
Jakub Kicinski [Fri, 26 Sep 2025 21:27:28 +0000 (14:27 -0700)]
Merge tag 'wireless-next-2025-09-25' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:
====================
Quite a bit more things, including pull requests from drivers:
- mt76: MLO support, HW restart improvements
- rtw88/89: small features, prep for RTL8922DE support
- ath10k: GTK rekey fixes
- cfg80211/mac80211:
- additions for more NAN support
- S1G channel representation cleanup
* tag 'wireless-next-2025-09-25' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (167 commits)
wifi: libertas: add WQ_UNBOUND to alloc_workqueue users
Revert "wifi: libertas: WQ_PERCPU added to alloc_workqueue users"
wifi: libertas: WQ_PERCPU added to alloc_workqueue users
wifi: cfg80211: fix width unit in cfg80211_radio_chandef_valid()
wifi: ath11k: HAL SRNG: don't deinitialize and re-initialize again
wifi: ath12k: enforce CPU endian format for all QMI data
wifi: ath12k: Use 1KB Cache Flush Command for QoS TID Descriptors
wifi: ath12k: Fix flush cache failure during RX queue update
wifi: ath12k: Add Retry Mechanism for REO RX Queue Update Failures
wifi: ath12k: Refactor REO command to use ath12k_dp_rx_tid_rxq
wifi: ath12k: Refactor RX TID buffer cleanup into helper function
wifi: ath12k: Refactor RX TID deletion handling into helper function
wifi: ath12k: Increase DP_REO_CMD_RING_SIZE to 256
wifi: cfg80211: remove IEEE80211_CHAN_{1,2,4,8,16}MHZ flags
wifi: rtw89: avoid circular locking dependency in ser_state_run()
wifi: rtw89: fix leak in rtw89_core_send_nullfunc()
wifi: rtw89: avoid possible TX wait initialization race
wifi: rtw89: fix use-after-free in rtw89_core_tx_kick_off_and_wait()
wifi: ath12k: Fix peer lookup in ath12k_dp_mon_rx_deliver_msdu()
wifi: mac80211: fix Rx packet handling when pubsta information is not available
...
====================
Stanislav Fomichev [Wed, 24 Sep 2025 22:25:18 +0000 (15:25 -0700)]
selftests: drv-net: Enable BTF
Commit efec2e55bdef ("selftests: drv-net: Pull data before parsing headers")
added __ksym external symbol to xdp_native.bpf.c which now requires
a kernel with BTF. Enable BTF for driver selftests.
Before:
# TAP version 13
# 1..10
# # Exception| Traceback (most recent call last):
# # Exception| File "/home/sdf/src/linux/tools/testing/selftests/net/lib/py/ksft.py", line 244, in ksft_run
# # Exception| case(*args)
# # Exception| ~~~~^^^^^^^
# # Exception| File "/home/sdf/src/linux/tools/testing/selftests/drivers/net/./xdp.py", line 231, in test_xdp_native_pass_sb
# # Exception| _test_pass(cfg, bpf_info, 256)
# # Exception| ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
# # Exception| File "/home/sdf/src/linux/tools/testing/selftests/drivers/net/./xdp.py", line 209, in _test_pass
# # Exception| prog_info = _load_xdp_prog(cfg, bpf_info)
# # Exception| File "/home/sdf/src/linux/tools/testing/selftests/drivers/net/./xdp.py", line 114, in _load_xdp_prog
# # Exception| cmd(
# # Exception| ~~~^
# # Exception| f"ip link set dev {cfg.ifname} mtu {bpf_info.mtu} xdpdrv obj {abs_path} sec {bpf_info.xdp_sec}",
# # Exception| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# # Exception| shell=True
# # Exception| ^^^^^^^^^^
# # Exception| )
# # Exception| ^
# # Exception| File "/home/sdf/src/linux/tools/testing/selftests/net/lib/py/utils.py", line 75, in __init__
# # Exception| self.process(terminate=False, fail=fail, timeout=timeout)
# # Exception| ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# # Exception| File "/home/sdf/src/linux/tools/testing/selftests/net/lib/py/utils.py", line 95, in process
# # Exception| raise CmdExitFailure("Command failed: %s\nSTDOUT: %s\nSTDERR: %s" %
# # Exception| (self.proc.args, stdout, stderr), self)
# # Exception| net.lib.py.utils.CmdExitFailure: Command failed: ip link set dev eni30773np1 mtu 1500 xdpdrv obj /home/sdf/src/linux/tools/testing/selftests/net/lib/xdp_native.bpf.o sec xdp
# # Exception| STDOUT: b''
# # Exception| STDERR: b"libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?\nlibbpf: failed to find '.BTF' ELF section in /lib/modules/6.17.0-rc6-virtme/build/vmlinux\nlibbpf: failed to find valid kernel BTF\nlibbpf: Error loading vmlinux BTF: -3\nlibbpf: failed to load object '/home/sdf/src/linux/tools/testing/selftests/net/lib/xdp_native.bpf.o'\n"
# not ok 1 xdp.test_xdp_native_pass_sb
...
After:
# TAP version 13
# 1..10
# ok 1 xdp.test_xdp_native_pass_sb
# ok 2 xdp.test_xdp_native_pass_mb
# ok 3 xdp.test_xdp_native_drop_sb
# ok 4 xdp.test_xdp_native_drop_mb
# ok 5 xdp.test_xdp_native_tx_sb
# ok 6 xdp.test_xdp_native_tx_mb
# # Ignoring SIGTERM (cnt: 2), already exiting...
# # Ignoring SIGTERM (cnt: 3), already exiting...
# # Exception| Traceback (most recent call last):
# # Exception| File "/home/sdf/src/linux/tools/testing/selftests/net/lib/py/ksft.py", line 244, in ksft_run
# # Exception| case(*args)
# # Exception| ~~~~^^^^^^^
# # Exception| File "/home/sdf/src/linux/tools/testing/selftests/drivers/net/./xdp.py", line 506, in test_xdp_native_adjst_taa
# # Exception| res = _test_xdp_native_tail_adjst(
# # Exception| cfg,
# # Exception| pkt_sz_lst,
# # Exception| offset_lst,
# # Exception| )
# # Exception| File "/home/sdf/src/linux/tools/testing/selftests/drivers/net/./xdp.py", line 467, in _test_xdp_native_tail_adt
# # Exception| recvd_str = _exchg_udp(cfg, port, test_str)
# # Exception| File "/home/sdf/src/linux/tools/testing/selftests/drivers/net/./xdp.py", line 72, in _exchg_udp
# # Exception| with bkg(rx_udp_cmd, exit_wait=True) as nc:
# # Exception| ~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# # Exception| File "/home/sdf/src/linux/tools/testing/selftests/net/lib/py/utils.py", line 137, in __exit__
# # Exception| return self.process(terminate=terminate, fail=self.check_fail)
# # Exception| ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# # Exception| File "/home/sdf/src/linux/tools/testing/selftests/net/lib/py/utils.py", line 85, in process
# # Exception| stdout, stderr = self.proc.communicate(timeout)
# # Exception| ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
# # Exception| File "/usr/lib/python3.13/subprocess.py", line 1222, in communicate
# # Exception| stdout, stderr = self._communicate(input, endtime, timeout)
# # Exception| ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
# # Exception| File "/usr/lib/python3.13/subprocess.py", line 2128, in _communicate
# # Exception| ready = selector.select(timeout)
# # Exception| File "/usr/lib/python3.13/selectors.py", line 398, in select
# # Exception| fd_event_list = self._selector.poll(timeout)
# # Exception| File "/home/sdf/src/linux/tools/testing/selftests/net/lib/py/ksft.py", line 208, in _ksft_intr
# # Exception| raise KsftTerminate()
# # Exception| net.lib.py.ksft.KsftTerminate
# # Stopping tests due to KsftTerminate.
# not ok 7 xdp.test_xdp_native_adjst_tail_grow_data
# # Totals: pass:6 fail:1 xfail:0 xpass:0 skip:0 error:0
psp: Expand PSP acronym in INET_PSP help description
People not very intimate with PSP may not know the meaning of this
recursive acronym. Hence replace the half-explanatory "PSP protocol" in
the help description by the full expansion, as is done in the linked
PSP Architecture Specification document.
Robert Marko [Thu, 25 Sep 2025 13:19:49 +0000 (15:19 +0200)]
dt-bindings: net: sparx5: correct LAN969x register space windows
LAN969x needs only 2 register space windows as GCB is already covered by
the "devices" register space window, so expect only 2 "reg" and "reg-names"
properties.
Fixes: 41c6439fdc2b ("dt-bindings: net: add compatible strings for lan969x targets")
Signed-off-by: Robert Marko <robert.marko@sartura.hr>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20250925132109.583984-1-robert.marko@sartura.hr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
selftests: drv-net: Reload pkt pointer after calling filter_udphdr
Fix a verification failure. filter_udphdr() calls bpf_xdp_pull_data(),
which will invalidate all pkt pointers. Therefore, all ctx->data loaded
before filter_udphdr() cannot be used. Reload it to prevent verification
errors.
The error may not appear on some compiler versions if they decide to
load ctx->data after filter_udphdr() when it is first used.
Fixes: efec2e55bdef ("selftests: drv-net: Pull data before parsing headers")
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250925161452.1290694-1-ameryhung@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
xsk: refactors around generic xmit side
This small patchset refactors the code around xsk_build_skb(), as
it has become pretty heavy. Generic xmit is a bit hard to follow, so here are
three cleanups to start making this code friendlier.
Maciej Fijalkowski [Thu, 25 Sep 2025 16:00:09 +0000 (18:00 +0200)]
xsk: wrap generic metadata handling onto separate function
xsk_build_skb() has gone wild with its size and one of the things we can
do about it is to pull out a branch that takes care of metadata handling
and make it a separate function.
While at it, let us add metadata SW support for devices supporting
the IFF_TX_SKB_NO_LINEAR flag, which happen to have separate logic for
building the skb in xsk's generic xmit path.
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250925160009.2474816-4-maciej.fijalkowski@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Maciej Fijalkowski [Thu, 25 Sep 2025 16:00:08 +0000 (18:00 +0200)]
xsk: remove @first_frag from xsk_build_skb()
Instead of using an auxiliary boolean that tracks whether we are at the first
frag when gathering all elements of the skb, the same functionality can be
achieved by checking whether skb_shared_info::nr_frags is 0.
Remove @first_frag, but be careful around xsk_build_skb_zerocopy() and
NULL the skb pointer when it fails, so that the common error path does not
misinterpret it when deciding whether to call kfree_skb().
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250925160009.2474816-3-maciej.fijalkowski@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Maciej Fijalkowski [Thu, 25 Sep 2025 16:00:07 +0000 (18:00 +0200)]
xsk: avoid overwriting skb fields for multi-buffer traffic
We are unnecessarily setting a bunch of skb fields for each processed
descriptor, which is redundant for fragmented frames.
Let us set these members for the first fragment only. To address
both paths that we have within xsk_build_skb(), move assignments onto
xsk_set_destructor_arg() and rename it to xsk_skb_init_misc().
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250925160009.2474816-2-maciej.fijalkowski@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This small series by Gal adds a new coccinelle script that spots
potential transitions to symbolic error names in print functions, and
then uses it in the mlx5 driver.
====================
Gal Pressman [Thu, 18 Sep 2025 10:43:47 +0000 (13:43 +0300)]
net/mlx5: Use %pe format specifier for error pointers
Using the coccinelle test introduced in previous commit
(scripts/coccinelle/misc/ptr_err_to_pe.cocci), convert error logging
throughout the mlx5 driver to use the %pe format specifier instead of
PTR_ERR() with integer format specifiers.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Alexei Lazar <alazar@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1758192227-701925-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gal Pressman [Thu, 18 Sep 2025 10:43:46 +0000 (13:43 +0300)]
scripts/coccinelle: Find PTR_ERR() to %pe candidates
Add a new Coccinelle script to identify places where PTR_ERR() is used
in print functions and suggest using the %pe format specifier instead.
For printing error pointers (i.e., a pointer for which IS_ERR() is true)
%pe will print a symbolic error name (e.g., -EINVAL), as opposed to the raw
errno (e.g., -22) produced by PTR_ERR().
It also makes the code cleaner by saving a redundant call to PTR_ERR().
The script supports context, report, and org modes.
Example transformation:
printk("Error: %ld\n", PTR_ERR(ptr)); // Before
printk("Error: %pe\n", ptr); // After
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Alexei Lazar <alazar@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1758192227-701925-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cross-merge networking fixes after downstream PR (net-6.17-rc8).
Conflicts:
drivers/net/can/spi/hi311x.c
  6b6968084721 ("can: hi311x: fix null pointer dereference when resuming from sleep before interface was enabled")
  27ce71e1ce81 ("net: WQ_PERCPU added to alloc_workqueue users")
https://lore.kernel.org/72ce7599-1b5b-464a-a5de-228ff9724701@kernel.org
net/smc/smc_loopback.c
drivers/dibs/dibs_loopback.c
  a35c04de2565 ("net/smc: fix warning in smc_rx_splice() when calling get_page()")
  cc21191b584c ("dibs: Move data path to dibs layer")
https://lore.kernel.org/74368a5c-48ac-4f8e-a198-40ec1ed3cf5f@kernel.org
Adjacent changes:
drivers/net/dsa/lantiq/lantiq_gswip.c
  c0054b25e2f1 ("net: dsa: lantiq_gswip: move gswip_add_single_port_br() call to port_setup()")
  7a1eaef0a791 ("net: dsa: lantiq_gswip: support model-specific mac_select_pcs()")
Merge tag 'net-6.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from Bluetooth, IPsec and CAN.
No known regressions at this point.
Current release - regressions:
- xfrm: xfrm_alloc_spi shouldn't use 0 as SPI
Previous releases - regressions:
- xfrm: fix offloading of cross-family tunnels
- bluetooth: fix several races leading to UaFs
- dsa: lantiq_gswip: fix FDB entries creation for the CPU port
- eth:
- tun: update napi->skb after XDP process
- mlx: fix UAF in flow counter release
Previous releases - always broken:
- core: forbid FDB status change while nexthop is in a group
- smc: fix warning in smc_rx_splice() when calling get_page()
- can: provide missing ndo_change_mtu(), to prevent buffer overflow
- eth:
- i40e: fix VF config validation
- broadcom: fix support for PTP_EXTTS_REQUEST2 ioctl"
* tag 'net-6.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (40 commits)
octeontx2-pf: Fix potential use after free in otx2_tc_add_flow()
net: dsa: lantiq_gswip: suppress -EINVAL errors for bridge FDB entries added to the CPU port
net: dsa: lantiq_gswip: move gswip_add_single_port_br() call to port_setup()
libie: fix string names for AQ error codes
net/mlx5e: Fix missing FEC RS stats for RS_544_514_INTERLEAVED_QUAD
net/mlx5: HWS, ignore flow level for multi-dest table
net/mlx5: fs, fix UAF in flow counter release
selftests: fib_nexthops: Add test cases for FDB status change
selftests: fib_nexthops: Fix creation of non-FDB nexthops
nexthop: Forbid FDB status change while nexthop is in a group
net: allow alloc_skb_with_frags() to use MAX_SKB_FRAGS
bnxt_en: correct offset handling for IPv6 destination address
ptp: document behavior of PTP_STRICT_FLAGS
broadcom: fix support for PTP_EXTTS_REQUEST2 ioctl
broadcom: fix support for PTP_PEROUT_DUTY_CYCLE
Bluetooth: MGMT: Fix possible UAFs
Bluetooth: hci_event: Fix UAF in hci_acl_create_conn_sync
Bluetooth: hci_event: Fix UAF in hci_conn_tx_dequeue
Bluetooth: hci_sync: Fix hci_resume_advertising_sync
Bluetooth: Fix build after header cleanup
...
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
"virtio,vhost: last minute fixes
More small fixes. Most notably this fixes crashes and hangs in
vhost-net"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
MAINTAINERS, mailmap: Update address for Peter Hilber
virtio_config: clarify output parameters
uapi: vduse: fix typo in comment
vhost: Take a reference on the task in struct vhost_task.
vhost-net: flush batched before enabling notifications
Revert "vhost/net: Defer TX queue re-enable until after sendmsg"
vhost-net: unbreak busy polling
vhost-scsi: fix argument order in tport allocation error message
====================
net: gso: restore outer ip ids correctly
GRO currently ignores outer IPv4 header IDs for encapsulated packets
that have their don't-fragment flag set. GSO, however, always assumes
that outer IP IDs are incrementing. This results in GSO mangling the
outer IDs when they aren't incrementing. For example, GSO mangles the
outer IDs of IPv6 packets that were converted to IPv4, which must
have an ID of 0 according to RFC 6145, sect. 5.1.
GRO+GSO is supposed to be entirely transparent by default. GSO already
correctly restores inner IDs and IDs of non-encapsulated packets. The
tx-tcp-mangleid-segmentation feature can be enabled to allow the
mangling of such IDs so that TSO can be used.
This series fixes outer ID restoration for encapsulated packets when
tx-tcp-mangleid-segmentation is disabled. It also allows GRO to merge
packets with fixed IDs that don't have their don't-fragment flag set.
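The intended behavior can be modelled in a few lines of userspace C. This is a minimal sketch under stated assumptions, not the kernel's GSO code: enum id_mode_sketch, struct seg_ids_sketch, nth_id() and restore_ids() are hypothetical names, and the model only tracks the outer and inner IPv4 ID of each segment produced from one merged packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: were the IDs of the merged flow fixed or incrementing? */
enum id_mode_sketch { ID_INCREMENTING, ID_FIXED };

/* Hypothetical per-segment record of outer and inner IPv4 IDs. */
struct seg_ids_sketch {
	uint16_t outer_id;
	uint16_t inner_id;
};

/* ID of the i-th segment, given the ID carried by the first one. */
static uint16_t nth_id(uint16_t base, enum id_mode_sketch mode, size_t i)
{
	/* IPv4 IDs are 16 bits wide, so incrementing IDs wrap modulo 65536. */
	return mode == ID_FIXED ? base : (uint16_t)(base + i);
}

/* Restore IDs per segment, honoring the recorded mode of each header. */
static void restore_ids(struct seg_ids_sketch *segs, size_t n,
			uint16_t outer_base, enum id_mode_sketch outer_mode,
			uint16_t inner_base, enum id_mode_sketch inner_mode)
{
	assert(segs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		segs[i].outer_id = nth_id(outer_base, outer_mode, i);
		segs[i].inner_id = nth_id(inner_base, inner_mode, i);
	}
}
```

In this model, always passing ID_INCREMENTING for the outer header corresponds to the mangling described above: a flow whose outer ID must stay 0 would come out with outer IDs 0, 1, 2, and so on. Passing the recorded mode keeps every segment at 0.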
Richard Gobert [Tue, 23 Sep 2025 08:59:07 +0000 (10:59 +0200)]
net: gro: remove unnecessary df checks
Currently, packets with fixed IDs will be merged only if their
don't-fragment bit is set. This restriction is unnecessary since
packets without the don't-fragment bit will be forwarded as-is even
if they were merged together. The merged packets will be segmented
into their original forms before being forwarded, either by GSO or
by TSO. The IDs will also remain identical unless NETIF_F_TSO_MANGLEID
is set, in which case the IDs can become incrementing, which is also fine.
Clean up the code by removing the unnecessary don't-fragment checks.
Richard Gobert [Tue, 23 Sep 2025 08:59:06 +0000 (10:59 +0200)]
net: gso: restore ids of outer ip headers correctly
Currently, NETIF_F_TSO_MANGLEID indicates that the inner-most ID can
be mangled. Outer IDs can always be mangled.
Make GSO preserve outer IDs by default, with NETIF_F_TSO_MANGLEID allowing
both inner and outer IDs to be mangled.
This commit also modifies a few drivers that use SKB_GSO_FIXEDID directly.
Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> # for sfc
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250923085908.4687-4-richardbgobert@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Richard Gobert [Tue, 23 Sep 2025 08:59:04 +0000 (10:59 +0200)]
net: gro: remove is_ipv6 from napi_gro_cb
Remove is_ipv6 from napi_gro_cb and use sk->sk_family instead.
This frees up space for another ip_fixedid bit that will be added
in the next commit.
udp_sock_create always creates either an AF_INET or an AF_INET6 socket,
so using sk->sk_family is reliable. In IPv6-FOU, cfg->ipv6_v6only is
always enabled.
Add support to read module EEPROM for fbnic. Towards this, add required
support to issue a new command to the firmware and to receive the response
to the corresponding command.
Create a local copy of the data in the completion struct before writing to
ethtool_module_eeprom to avoid writing to data in case it is freed. Given
that EEPROM pages are small, the overhead of the additional copy is
negligible.
Do not block the API with explicit checks, since the API already has
appropriate checks in place for length, offset, and page.
Explicitly check bank, page, offset, and length in
fbnic_fw_parse_qsfp_read_resp() to match EEPROM read responses to the
correct request. This is important because if the driver times out waiting
for an EEPROM read response, a subsequent read request with different
values is susceptible to receiving an erroneous response (i.e., the
response to the previous request).
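The request/response matching described above can be sketched in userspace C. This is a minimal illustration under stated assumptions, not the fbnic implementation: the struct layouts, EEPROM_CHUNK_MAX_SKETCH, resp_matches_req() and handle_resp() are all hypothetical, and a return value of -1 simply means "response ignored".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EEPROM_CHUNK_MAX_SKETCH 128

/* Hypothetical key identifying one EEPROM read; not fbnic's actual layout. */
struct eeprom_read_key_sketch {
	uint8_t bank;
	uint8_t page;
	uint8_t offset;
	uint8_t length;
};

/* Hypothetical completion: remembers the request and holds a local copy. */
struct eeprom_completion_sketch {
	struct eeprom_read_key_sketch req;
	uint8_t data[EEPROM_CHUNK_MAX_SKETCH];
	bool done;
};

/* A response belongs to a request only if all four fields agree. */
static bool resp_matches_req(const struct eeprom_read_key_sketch *req,
			     const struct eeprom_read_key_sketch *resp)
{
	return req->bank == resp->bank && req->page == resp->page &&
	       req->offset == resp->offset && req->length == resp->length;
}

/* Copy the payload into the completion only for a matching response. */
static int handle_resp(struct eeprom_completion_sketch *cmpl,
		       const struct eeprom_read_key_sketch *resp,
		       const uint8_t *payload)
{
	assert(cmpl != NULL && resp != NULL && payload != NULL);

	if (!resp_matches_req(&cmpl->req, resp))
		return -1;	/* stale response to an earlier, timed-out request */
	if (resp->length > EEPROM_CHUNK_MAX_SKETCH)
		return -1;	/* would not fit in the local copy */

	memcpy(cmpl->data, payload, resp->length);
	cmpl->done = true;
	return 0;
}
```

A late response carrying a different page or offset is dropped without touching the completion, so a later request cannot be satisfied with data that belongs to the request that timed out.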
Dan Carpenter [Tue, 23 Sep 2025 11:19:11 +0000 (14:19 +0300)]
octeontx2-pf: Fix potential use after free in otx2_tc_add_flow()
This code calls kfree_rcu(new_node, rcu) and then dereferences "new_node"
on the next line. Two lines later, we take
a mutex so I don't think this is an RCU safe region. Re-order it to do
the dereferences before queuing up the free.
Fixes: 68fbff68dbea ("octeontx2-pf: Add police action for TC flower")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/aNKCL1jKwK8GRJHh@stanley.mountain
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
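The re-ordering in the octeontx2-pf fix follows a general pattern that can be shown in userspace C. This is a minimal sketch, not the driver code: struct flow_node_sketch and release_node() are hypothetical, and plain free() stands in for kfree_rcu(), which in the kernel defers the actual free until after a grace period. The sketch does not model RCU at all.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node; the field names are illustrative only. */
struct flow_node_sketch {
	int entry;
	unsigned long cookie;
};

/* Release the node and return the entry index the caller still needs. */
static int release_node(struct flow_node_sketch *node)
{
	int entry;

	assert(node != NULL);

	entry = node->entry;	/* read everything needed first */
	free(node);		/* from here on, node must not be touched */

	return entry;
}
```

Copying the needed fields into locals before handing the object over means no access depends on how long the memory happens to stay valid afterwards.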
Paolo Abeni [Thu, 25 Sep 2025 08:29:22 +0000 (10:29 +0200)]
Merge branch 'lantiq_gswip-fixes'
Vladimir Oltean says:
====================
lantiq_gswip fixes
This is a small set of fixes which I believe should be backported for
the lantiq_gswip driver. Daniel Golle asked me to submit them here:
https://lore.kernel.org/netdev/aLiDfrXUbw1O5Vdi@pidgin.makrotopia.org/
As mentioned there, a merge conflict with net-next is expected, due to
the movement of the driver to the 'drivers/net/dsa/lantiq' folder there.
Good luck :-/
Patch 2/2 fixes an old regression and is the minimal fix for that, as
discussed here:
https://lore.kernel.org/netdev/aJfNMLNoi1VOsPrN@pidgin.makrotopia.org/
Patch 1/2 was identified by me through static analysis, and I consider
it to be a serious deficiency. It needs a test tag.
====================
Vladimir Oltean [Thu, 18 Sep 2025 07:21:42 +0000 (10:21 +0300)]
net: dsa: lantiq_gswip: suppress -EINVAL errors for bridge FDB entries added to the CPU port
The blamed commit and others in that patch set started the trend
of reusing existing DSA driver API for a new purpose: calling
ds->ops->port_fdb_add() on the CPU port.
The lantiq_gswip driver was not prepared to handle that, as can be seen
from the many errors that Daniel presents in the logs:
[ 174.050000] gswip 1e108000.switch: port 2 failed to add fa:aa:72:f4:8b:1e vid 1 to fdb: -22
[ 174.060000] gswip 1e108000.switch lan2: entered promiscuous mode
[ 174.070000] gswip 1e108000.switch: port 2 failed to add 00:01:02:03:04:02 vid 0 to fdb: -22
[ 174.090000] gswip 1e108000.switch: port 2 failed to add 00:01:02:03:04:02 vid 1 to fdb: -22
[ 174.090000] gswip 1e108000.switch: port 2 failed to delete fa:aa:72:f4:8b:1e vid 1 from fdb: -2
The errors are because gswip_port_fdb() wants to get a handle to the
bridge that originated these FDB events, to associate it with a FID.
Absolutely honourable purpose, however this only works for user ports.
To get the bridge that generated an FDB entry for the CPU port, one
would need to look at the db.bridge.dev argument. But this was
introduced in commit c26933639b54 ("net: dsa: request drivers to perform
FDB isolation"), first appeared in v5.18, and when the blamed commit was
introduced in v5.14, no such API existed.
So the core DSA feature was introduced way too soon for lantiq_gswip.
Not acting on these host FDB entries and suppressing any errors has no
other negative effect, and practically returns us to not supporting the
host filtering feature at all - peacefully, this time.
Fixes: 10fae4ac89ce ("net: dsa: include bridge addresses which are local in the host fdb list")
Reported-by: Daniel Golle <daniel@makrotopia.org>
Closes: https://lore.kernel.org/netdev/aJfNMLNoi1VOsPrN@pidgin.makrotopia.org/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250918072142.894692-3-vladimir.oltean@nxp.com
Tested-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Vladimir Oltean [Thu, 18 Sep 2025 07:21:41 +0000 (10:21 +0300)]
net: dsa: lantiq_gswip: move gswip_add_single_port_br() call to port_setup()
A port added to a "single port bridge" operates as standalone, and this
is mutually exclusive to being part of a Linux bridge. In fact,
gswip_port_bridge_join() calls gswip_add_single_port_br() with
add=false, i.e. removes the port from the "single port bridge" to enable
autonomous forwarding.
The blamed commit seems to have incorrectly thought that ds->ops->port_enable()
is called one time per port, during the setup phase of the switch.
However, it is actually called during the ndo_open() implementation of
DSA user ports, which is to say that this sequence of events:
1. ip link set swp0 down
2. ip link add br0 type bridge
3. ip link set swp0 master br0
4. ip link set swp0 up
would cause swp0 to rejoin the "single port bridge" which step 3 had
just removed it from.
The correct DSA hook for one-time actions per port at switch init time
is ds->ops->port_setup(). This is what seems to match the coder's
intention; also see the comment at the beginning of the file:
* At the initialization the driver allocates one bridge table entry for
~~~~~~~~~~~~~~~~~~~~~
* each switch port which is used when the port is used without an
* explicit bridge.
Fixes: 8206e0ce96b3 ("net: dsa: lantiq: Add VLAN unaware bridge offloading")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250918072142.894692-2-vladimir.oltean@nxp.com
Tested-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
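The sequence of events in the port_setup() commit above can be replayed against a toy state model in userspace C. This is a minimal sketch, not the lantiq_gswip driver: struct port_sketch and every function below are hypothetical, and the model tracks only whether a port sits in its "single port bridge" or in a Linux bridge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one switch port; not the driver's real state. */
struct port_sketch {
	bool in_single_port_br;	/* standalone operation */
	bool in_linux_bridge;	/* member of an offloaded Linux bridge */
};

/* One-time, per-port action at switch init: start out standalone. */
static void port_setup_sketch(struct port_sketch *p)
{
	assert(p != NULL);
	p->in_single_port_br = true;
	p->in_linux_bridge = false;
}

/* Joining a bridge removes the port from its "single port bridge". */
static void bridge_join_sketch(struct port_sketch *p)
{
	p->in_single_port_br = false;
	p->in_linux_bridge = true;
}

/* Pre-fix behavior: every "ip link set ... up" re-added the port. */
static void port_enable_buggy(struct port_sketch *p)
{
	p->in_single_port_br = true;
}

/* Post-fix behavior: enabling a port leaves bridge membership alone. */
static void port_enable_fixed(struct port_sketch *p)
{
	(void)p;
}

/* The two states are meant to be mutually exclusive. */
static bool port_state_consistent(const struct port_sketch *p)
{
	return p->in_single_port_br != p->in_linux_bridge;
}
```

Running setup, bridge join, then the buggy enable leaves the port in both states at once, which is the inconsistency steps 1 to 4 trigger. With the fixed enable the port stays in the Linux bridge only.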
Merge tag 'probes-fixes-v6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:
- fprobe: Even if there is a memory allocation failure, try to remove
the addresses recorded until then from the filter. Previously we just
skipped it.
- tracing: dynevent: Add a missing lockdown check on dynevent. This
dynevent is the interface for all probe events. Thus if there is no
check, any probe event can be added after tracefs is locked down.
* tag 'probes-fixes-v6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: dynevent: Add a missing lockdown check on dynevent
tracing: fprobe: Fix to remove recorded module addresses from filter
====================
convert 3 drivers to ndo_hwtstamp API
Convert tg3, bnxt_en and mlx5 to use the ndo_hwtstamp API. These 3 drivers
were chosen because I have access to the HW and am able to test the
changes. Also there is a selftest provided to validate that the driver
correctly sets up timestamp configuration, according to what is exposed
as supported by the hardware. The selftest allows the driver to fall back to
some wider scope of RX timestamping, i.e. it allows the driver to set up
the ptpv2-event filter when ptpv2-l2-event is requested.
====================
Add simple tests to validate that the driver sets up timestamping
configuration according to what is reported in capabilities.
For RX timestamping we allow driver to fall back to a wider scope for
timestamping if filter is applied. That actually means that driver
can enable ptpv2-event when it reports ptpv2-l4-event is supported,
but not vice versa.
Three sections ("Socket Options", "Security", and "Example Client Usage")
use title headings, which increase number of entries in the networking
docs toctree by three, and also make the rest of sections headed under
"Example Client Usage".
By default the LEDs are ON when there is a link, but they do not
blink when there is traffic activity. Therefore change this so that
they blink when there is traffic.
Jakub Kicinski [Thu, 25 Sep 2025 00:45:14 +0000 (17:45 -0700)]
Merge tag 'nf-next-25-09-24' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Florian Westphal says:
====================
netfilter: fixes for net-next
These fixes target next because the bug is either not severe or has
existed for so long that there is no reason to cram them in at the last
minute.
1) Fix IPVS ftp unregistering during netns cleanup, broken since netns
support was introduced in 2011 in the 2.6.39 kernel.
From Slavin Liu.
2) nfnetlink must reset the 'nlh' pointer back to the original
address when a batch is replayed, else we emit bogus ACK messages
and conceal real errno from userspace.
From Fernando Fernandez Mancera. This was broken since 6.10.
3) Recent fix for nftables 'pipapo' set type was incomplete, it only
made things work for the AVX2 version of the algorithm.
4) Testing revealed another problem with avx2 version that results in
out-of-bounds read access, this bug always existed since feature was
added in 5.7 kernel. This also comes with a selftest update.
Last fix resolves a long-standing bug (since 4.9) in conntrack /proc
interface:
Decrease skip count when we reap an expired entry during dump.
As-is we erroneously elide one conntrack entry from dump for every expired
entry seen. From Eric Dumazet.
* tag 'nf-next-25-09-24' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: nf_conntrack: do not skip entries in /proc/net/nf_conntrack
selftests: netfilter: nft_concat_range.sh: add check for double-create bug
netfilter: nft_set_pipapo_avx2: fix skip of expired entries
netfilter: nft_set_pipapo: use 0 genmask for packetpath lookups
netfilter: nfnetlink: reset nlh pointer during batch replay
ipvs: Defer ip_vs_ftp unregister during netns cleanup
====================
Jakub Kicinski [Thu, 25 Sep 2025 00:40:29 +0000 (17:40 -0700)]
Merge branch 'net-stmmac-yet-more-cleanups'
Russell King says:
====================
net: stmmac: yet more cleanups
Building on the previous cleanup series, this cleans up yet more stmmac
code.
- Move stmmac_bus_clks_config() into stmmac_platform.c which is where
its only user is.
- Move the xpcs Clause 73 test into stmmac_init_phy(), resulting in
simpler code in __stmmac_open().
- Move "can't attach PHY" error message into stmmac_init_phy().
We then start moving stuff out of __stmmac_open() into stmmac_open()
(and correspondingly __stmmac_release() into stmmac_release()) which
is not necessary when re-initialising the interface on e.g. MTU change.
- Move initialisation of tx_lpi_timer
- Move PHY attachment/detachment
- Move PHY error message into stmmac_init_phy()
Finally, simplify the paths in stmmac_init_phy().
====================
Russell King (Oracle) [Tue, 23 Sep 2025 11:26:24 +0000 (12:26 +0100)]
net: stmmac: simplify stmmac_init_phy()
If we fail to attach a PHY, there is no point trying to configure WoL
settings. Exit the function after printing the "cannot attach to PHY"
error, and remove the now unnecessary code indentation for configuring
the LPI timer in phylink. Since we know that "ret" must be zero at this
point, change the final return to use a constant rather than "ret".
Russell King (Oracle) [Tue, 23 Sep 2025 11:26:19 +0000 (12:26 +0100)]
net: stmmac: move PHY handling out of __stmmac_open()/release()
Move the PHY attachment/detachment from the network driver out of
__stmmac_open() and __stmmac_release() into stmmac_open() and
stmmac_release() where these actions will only happen when the
interface is administratively brought up or down. It does not make
sense to detach and re-attach the PHY during a change of MTU.
Russell King (Oracle) [Tue, 23 Sep 2025 11:26:14 +0000 (12:26 +0100)]
net: stmmac: move initialisation of priv->tx_lpi_timer to stmmac_open()
The initialisation of priv->tx_lpi_timer only happens once during the
lifetime of the driver, which is during the initial administrative
open of the device. Move this initialisation out of __stmmac_open()
into stmmac_open().
Russell King (Oracle) [Tue, 23 Sep 2025 11:26:09 +0000 (12:26 +0100)]
net: stmmac: move PHY attachment error message into stmmac_init_phy()
Move the "cannot attach to PHY" error message into stmmac_init_phy()
so we don't end up with multiple error messages printed when things
go wrong. Drop the function name from the message, and use %pe to
print the error code description rather than just a number.
Russell King (Oracle) [Tue, 23 Sep 2025 11:26:04 +0000 (12:26 +0100)]
net: stmmac: move xpcs clause 73 test into stmmac_init_phy()
We avoid binding a PHY if the XPCS is using clause 73 negotiation.
Rather than having this complexity in __stmmac_open(), move it to
stmmac_init_phy() instead. There is no point checking the XPCS
state unless phylink wants a PHY, so place this appropriately.
Russell King (Oracle) [Tue, 23 Sep 2025 11:25:59 +0000 (12:25 +0100)]
net: stmmac: move stmmac_bus_clks_config() to stmmac_platform.c
stmmac_bus_clks_config() is only used by stmmac_platform.c, so rather
than having it in stmmac_main.c and needing to export the symbol,
move it to where it's used.
Jacob Keller [Tue, 23 Sep 2025 20:56:56 +0000 (13:56 -0700)]
libie: fix string names for AQ error codes
The LIBIE_AQ_STR() macro introduced by commit 5feaa7a07b85 ("libie: add
adminq helper for converting err to str") is used in order to generate
strings for printing human readable error codes. Its definition is missing
the separating underscore ('_') character which makes the resulting strings
difficult to read. Additionally, the string won't match the source code,
preventing search tools from working properly.
Add the missing underscore character, fixing the error string names.
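As a rough illustration of the problem, the sketch below builds an error
string by stringifying a macro argument, once without and once with the
separating underscore. The DEMO_* prefix, macro names and helper functions
are hypothetical and are not the libie definitions; only the effect of the
missing '_' is being shown.

```c
#include <string.h>

/* Hypothetical prefix and macro names, used only for this illustration. */
#define DEMO_STR_BROKEN(x)	"DEMO_AQ_RC" #x		/* missing '_' separator */
#define DEMO_STR_FIXED(x)	"DEMO_AQ_RC_" #x	/* separator restored */

/* String generated for the hypothetical code EBUSY without the separator. */
static const char *demo_broken_name(void)
{
	return DEMO_STR_BROKEN(EBUSY);	/* "DEMO_AQ_RCEBUSY" */
}

/* String generated for the same code with the separator. */
static const char *demo_fixed_name(void)
{
	return DEMO_STR_FIXED(EBUSY);	/* "DEMO_AQ_RC_EBUSY" */
}

/* Return 1 if the generated string equals the identifier as spelled in source. */
static int demo_matches_source(const char *generated)
{
	return strcmp(generated, "DEMO_AQ_RC_EBUSY") == 0;
}
```

Only the fixed variant yields a string that a search for the identifier in
the source tree would find.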
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Fixes: 5feaa7a07b85 ("libie: add adminq helper for converting err to str") Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250923205657.846759-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gustavo A. R. Silva [Tue, 23 Sep 2025 20:45:10 +0000 (22:45 +0200)]
tls: Avoid -Wflex-array-member-not-at-end warning
Remove unused flexible-array member in struct tls_rec and, with this,
fix the following warning:
net/tls/tls.h:131:29: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
Also, add a comment to prevent people from adding any members
after struct aead_request, which is a flexible structure, that is,
a structure that ends in a flexible-array member.
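A minimal sketch of why nothing may follow such a member, assuming
hypothetical struct demo_request and struct demo_record types (these are not
the tls or crypto structures). Embedding a flexible structure in another
structure relies on a GNU C extension, and it is only safe when the embedded
member comes last, because the trailing array storage is placed directly
after it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flexible structure: it ends in a flexible-array member. */
struct demo_request {
	size_t ctx_len;
	unsigned char ctx[];	/* storage is appended at allocation time */
};

/* Hypothetical container; embedding a flexible structure is a GNU extension. */
struct demo_record {
	int id;
	/* Must stay last: the trailing ctx[] bytes live directly after it. */
	struct demo_request req;
};

/* Allocate a record with room for ctx_len bytes of request context. */
static struct demo_record *demo_record_alloc(size_t ctx_len)
{
	struct demo_record *rec = calloc(1, sizeof(*rec) + ctx_len);

	if (rec)
		rec->req.ctx_len = ctx_len;
	return rec;
}
```

Any member added after req would overlap the ctx[] bytes, which is the
situation the warning is meant to catch.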
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/aNMG1lyXw4XEAVaE@kspp Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Biggers [Wed, 24 Sep 2025 20:18:22 +0000 (13:18 -0700)]
crypto: af_alg - Fix incorrect boolean values in af_alg_ctx
Commit 1b34cbbf4f01 ("crypto: af_alg - Disallow concurrent writes in
af_alg_sendmsg") changed some fields from bool to 1-bit bitfields of
type u32.
However, some assignments to these fields, specifically 'more' and
'merge', assign values greater than 1. These relied on C's implicit
conversion to bool, such that zero becomes false and nonzero becomes
true.
With 1-bit bitfields of type u32, the value is instead taken mod 2,
resulting in 0 being assigned in some cases when 1 was intended.
Fix this by restoring the bool type.
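The difference can be reproduced outside the kernel. The sketch below uses
two hypothetical structures (not the real struct af_alg_ctx) and stores the
same value once into a 1-bit unsigned bitfield and once into a bool.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two layouts; not the real af_alg_ctx. */
struct demo_flags_bitfield {
	unsigned int more : 1;	/* 1-bit unsigned bitfield, like u32 x:1 */
};

struct demo_flags_bool {
	bool more;
};

/* Store 'value' into a 1-bit bitfield: only the lowest bit survives. */
static unsigned int demo_store_in_bitfield(unsigned int value)
{
	struct demo_flags_bitfield f = { 0 };

	f.more = value;		/* reduced modulo 2 */
	return f.more;
}

/* Store 'value' into a bool: any nonzero value becomes true (1). */
static unsigned int demo_store_in_bool(unsigned int value)
{
	struct demo_flags_bool f = { 0 };

	f.more = value;		/* nonzero converts to true */
	return f.more;
}
```

An even nonzero value such as 2 is therefore stored as 0 in the bitfield but
as 1 in the bool.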
Fixes: 1b34cbbf4f01 ("crypto: af_alg - Disallow concurrent writes in af_alg_sendmsg") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'soc-fixes-6.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
"There are a few minor code fixes for tegra firmware, i.MX firmware
and the eyeq reset controller, and a MAINTAINERS update as Alyssa
Rosenzweig moves on to non-kernel projects.
The other changes are all for devicetree files:
- Multiple Marvell Armada SoCs need changes to fix PCIe, audio and
SATA
- A socfpga board fails to probe the ethernet phy
- The two temperature sensors on i.MX8MP are swapped
- Allwinner devicetree files cause build-time warnings
- Two Rockchip based boards need corrections for headphone detection
and SPI flash"
* tag 'soc-fixes-6.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
MAINTAINERS: remove Alyssa Rosenzweig
firmware: tegra: Do not warn on missing memory-region property
arm64: dts: marvell: cn9132-clearfog: fix multi-lane pci x2 and x4 ports
arm64: dts: marvell: cn9132-clearfog: disable eMMC high-speed modes
arm64: dts: marvell: cn913x-solidrun: fix sata ports status
ARM: dts: kirkwood: Fix sound DAI cells for OpenRD clients
arm64: dts: imx8mp: Correct thermal sensor index
ARM: imx: Kconfig: Adjust select after renamed config option
firmware: imx: Add stub functions for SCMI CPU API
firmware: imx: Add stub functions for SCMI LMM API
firmware: imx: Add stub functions for SCMI MISC API
riscv: dts: allwinner: rename devterm i2c-gpio node to comply with binding
arm64: dts: rockchip: Fix the headphone detection on the orangepi 5
arm64: dts: rockchip: Add vcc supply for SPI Flash on NanoPC-T6
ARM: dts: socfpga: sodia: Fix mdio bus probe and PHY address
reset: eyeq: fix OF node leak
ARM64: dts: mcbin: fix SATA ports on Macchiatobin
ARM: dts: armada-370-db: Fix stereo audio input routing on Armada 370
ARM: dts: allwinner: Minor whitespace cleanup
Merge tag 'for-6.17-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
"One more regression fix for a problem in zoned mode: mounting would
fail if the number of open and active zones reached a common limit
that didn't use to be checked"
* tag 'for-6.17-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: zoned: don't fail mount needlessly due to too many active zones
Merge tag '6.17-rc7-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- free_transport fix for disconnect races
- minor delayed work fix
* tag '6.17-rc7-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
smb: server: use disable_work_sync in transport_rdma.c
smb: server: don't use delayed_work for post_recv_credits_work
Alexander Lobakin [Thu, 11 Sep 2025 16:22:33 +0000 (18:22 +0200)]
idpf: enable XSk features and ndo_xsk_wakeup
Now that AF_XDP functionality is fully implemented, advertise XSk XDP
feature and add .ndo_xsk_wakeup() callback to be able to use it with
this driver.
Co-developed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Tested-by: Ramu R <ramu.r@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Alexander Lobakin [Thu, 11 Sep 2025 16:22:32 +0000 (18:22 +0200)]
idpf: implement Rx path for AF_XDP
Implement Rx packet processing specific to AF_XDP ZC using the libeth
XSk infra. Initialize queue registers before allocating buffers to
avoid redundant ifs when updating the queue tail.
Co-developed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Tested-by: Ramu R <ramu.r@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Alexander Lobakin [Thu, 11 Sep 2025 16:22:31 +0000 (18:22 +0200)]
idpf: implement XSk xmit
Implement the XSk transmit path using the libeth (libeth_xdp)
XSk infra.
When the NAPI poll is called, XSk Tx queues are polled first,
before regular Tx and Rx. They're generally faster to serve
and have higher priority compared to regular traffic.
Co-developed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Tested-by: Ramu R <ramu.r@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Michal Kubiak [Thu, 11 Sep 2025 16:22:30 +0000 (18:22 +0200)]
idpf: add XSk pool initialization
Add functionality to setup an XSk buffer pool, including ability to
stop, reconfig and start only selected queues, not the whole device.
Pool DMA mapping is managed by libeth_xdp.
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Tested-by: Ramu R <ramu.r@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Michal Kubiak [Thu, 11 Sep 2025 16:22:29 +0000 (18:22 +0200)]
idpf: add virtchnl functions to manage selected queues
Implement VC functions dedicated to enabling, disabling and configuring
not all but only selected queues.
Also, refactor the existing implementation to make the code more
modular. Introduce new generic functions for sending VC messages
consisting of chunks, in order to isolate the sending algorithm
and its implementation for specific VC messages.
Finally, rewrite the function for mapping queues to q_vectors using the
new modular approach to avoid copying the code that implements the VC
message sending algorithm.
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Co-developed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Tested-by: Ramu R <ramu.r@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
We've added 9 non-merge commits during the last 33 day(s) which contain
a total of 10 files changed, 480 insertions(+), 53 deletions(-).
The main changes are:
1) A new bpf_xdp_pull_data kfunc that supports pulling data from
a frag into the linear area of a xdp_buff, from Amery Hung.
This includes changes in the xdp_native.bpf.c selftest, which
Nimrod's future work depends on.
It is a merge from a stable branch 'xdp_pull_data' which has
also been merged to bpf-next.
There is a conflict with recent changes in 'include/net/xdp.h'
in the net-next tree that will need to be resolved.
2) A compiler warning fix when CONFIG_NET=n in the recent dynptr
skb_meta support, from Jakub Sitnicki.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
selftests: drv-net: Pull data before parsing headers
selftests/bpf: Test bpf_xdp_pull_data
bpf: Support specifying linear xdp packet data size for BPF_PROG_TEST_RUN
bpf: Make variables in bpf_prog_test_run_xdp less confusing
bpf: Clear packet pointers after changing packet data in kfuncs
bpf: Support pulling non-linear xdp data
bpf: Allow bpf_xdp_shrink_data to shrink a frag from head and tail
bpf: Clear pfmemalloc flag when freeing all fragments
bpf: Return an error pointer for skb metadata when CONFIG_NET=n
====================
tracing: dynevent: Add a missing lockdown check on dynevent
Since dynamic_events interface on tracefs is compatible with
kprobe_events and uprobe_events, it should also check the lockdown
status and reject if it is set.
Marc Kleine-Budde [Wed, 24 Sep 2025 15:10:01 +0000 (17:10 +0200)]
Merge patch series "can: netlink: preparation before introduction of CAN XL step 3/3"
Vincent Mailhol <mailhol@kernel.org> says:
In November last year, I sent an RFC to introduce CAN XL [1]. That
RFC, despite positive feedback, was put on hold due to some unanswered
questions concerning the PWM encoding [2].
While stuck, some small preparation work was done in parallel in [3]
by refactoring the struct can_priv and doing some trivial clean-up and
renaming. Initially, [3] received zero feedback but was eventually
merged after splitting it into smaller parts and resending it.
Finally, in July this year, we clarified the remaining mysteries about
PWM calculation, thus unlocking the series. Summer being a bit busy
because of some personal matters brings us to now.
After doing all the refactoring and adding all the CAN XL features,
the final result is more than 30 patches, definitely too many for a
single series. So I am splitting the remaining changes in three:
- can: rework the CAN MTU logic [4]
- can: netlink: preparation before introduction of CAN XL (this series)
- CAN XL (will come right after the two preparation series get merged)
And thus, this series continues and finishes the preparation work done
in [3] and [4]. It contains all the refactoring needed to smoothly
introduce CAN XL. The goal is to:
- split the functions in smaller pieces: CAN XL will introduce a
fair amount of code. And some functions which are already fairly
long (86 lines for can_validate(), 215 lines for can_changelink())
would grow to disproportionate sizes if the CAN XL logic were to
be inlined in those functions.
- repurpose the existing code to handle both CAN FD and CAN XL: a
huge part of CAN XL simply reuses the CAN FD logic. All the
existing CAN FD logic is made more generic to handle both CAN FD
and XL.
In more detail:
- Patch #1 moves struct data_bittiming_params from dev.h to
bittiming.h and patch #2 makes can_get_relative_tdco() FD agnostic
before also moving it to bittiming.h.
- Patch #3 adds some comments to netlink.h tagging which IFLA
symbols are FD specific.
- Patches #4 to #6 are refactoring can_validate() and
can_validate_bittiming().
- Patches #7 to #11 are refactoring can_changelink() and
can_tdc_changelink().
- Patches #12 and #13 are refactoring can_get_size() and
can_tdc_get_size().
- Patches #14 to #17 are refactoring can_fill_info() and
can_tdc_fill_info().
- Patch #18 makes can_calc_tdco() FD agnostic.
- Patch #19 adds can_get_ctrlmode_str() which converts control mode
flags into strings. This is done in preparation of patch #20.
- Patch #20 is the final patch and improves the user experience by
providing detailed error messages whenever invalid parameters are
provided. All those error messages came in handy when debugging
the upcoming CAN XL patches.
Aside from the last patch, the other changes do not impact any of the
existing functionalities.
The follow up series which introduces CAN XL is nearly completed but
will be sent only once this one is approved: one thing at a time, I do
not want to overwhelm people (including myself).
Vincent Mailhol [Tue, 23 Sep 2025 06:58:45 +0000 (15:58 +0900)]
can: netlink: add userland error messages
Use NL_SET_ERR_MSG() and NL_SET_ERR_MSG_FMT() to return meaningful
error messages to the userland whenever a -EOPNOTSUPP error is
returned due to a failed validation of the CAN netlink arguments.
Vincent Mailhol [Tue, 23 Sep 2025 06:58:44 +0000 (15:58 +0900)]
can: dev: add can_get_ctrlmode_str()
In an effort to give more human readable messages when errors occur
because of conflicting options, it can be useful to convert the CAN
control mode flags into text.
Add a function which converts the first set CAN control mode into a
human readable string. The reason to only convert the first one is to
simplify edge cases: imagine that there are several invalid control
modes, we would just return the first invalid one to the user, thus
not having to handle complex string concatenation. The user can then
solve the first problem, call the netlink interface again and see the
next issue.
People who wish to enumerate all the control modes can still do so by,
for example, using this new function in a for_each_set_bit() loop.
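A minimal sketch of the "first set bit only" idea, using hypothetical
DEMO_MODE_* flags and names rather than the real CAN_CTRLMODE_* symbols or
the actual can_get_ctrlmode_str() implementation:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical control mode flags, one bit each. */
#define DEMO_MODE_LOOPBACK	0x01u
#define DEMO_MODE_LISTENONLY	0x02u
#define DEMO_MODE_FD		0x04u

/* Name for each bit position, indexed by bit number. */
static const char *const demo_mode_names[] = {
	"loopback",
	"listen-only",
	"fd",
};

/* Return the name of the lowest set bit, or a placeholder if none is known. */
static const char *demo_first_mode_str(uint32_t modes)
{
	size_t i;

	for (i = 0; i < sizeof(demo_mode_names) / sizeof(demo_mode_names[0]); i++) {
		if (modes & (1u << i))
			return demo_mode_names[i];
	}
	return "<unknown>";
}
```

A caller that wants every name can mask out one bit at a time and call the
helper once per bit, which avoids any string concatenation in the helper
itself.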
Vincent Mailhol [Tue, 23 Sep 2025 06:58:43 +0000 (15:58 +0900)]
can: calc_bittiming: make can_calc_tdco() FD agnostic
can_calc_tdco() uses the CAN_CTRLMODE_FD_TDC_MASK and
CAN_CTRLMODE_TDC_AUTO macros making it specific to CAN FD. Add the tdc
mask to the function parameter list. The value of the tdc auto flag
can then be derived from that mask and stored in a local variable.
This way, the function becomes CAN FD agnostic and can be reused later
on for the CAN XL TDC.
Vincent Mailhol [Tue, 23 Sep 2025 06:58:42 +0000 (15:58 +0900)]
can: netlink: make can_tdc_fill_info() FD agnostic
can_tdc_fill_info() depends on some variables which are specific to CAN
FD. Move these to the function parameters list so that, later on, this
function can be reused for the CAN XL TDC.
Vincent Mailhol [Tue, 23 Sep 2025 06:58:41 +0000 (15:58 +0900)]
can: netlink: add can_bitrate_const_fill_info()
Add can_bitrate_const_fill_info() to factorise the logic when filling
the bitrate constant information for Classical CAN and CAN FD. This
function will be reused later on for CAN XL.
Vincent Mailhol [Tue, 23 Sep 2025 06:58:40 +0000 (15:58 +0900)]
can: netlink: add can_bittiming_const_fill_info()
Add function can_bittiming_const_fill_info() to factorise the logic
when filling the bittiming constant information for Classical CAN and
CAN FD. This function will be reused later on for CAN XL.
Vincent Mailhol [Tue, 23 Sep 2025 06:58:39 +0000 (15:58 +0900)]
can: netlink: add can_bittiming_fill_info()
Add can_bittiming_fill_info() to factorise the logic when filling the
bittiming information for Classical CAN and CAN FD. This function will
be reused later on for CAN XL.
Vincent Mailhol [Tue, 23 Sep 2025 06:58:37 +0000 (15:58 +0900)]
can: netlink: make can_tdc_get_size() FD agnostic
can_tdc_get_size() needs to access can_priv->fd making it specific to
CAN FD. Change the function parameter from struct can_priv to struct
data_bittiming_params.
can_tdc_get_size() also uses the CAN_CTRLMODE_TDC_MANUAL macro making
it specific to CAN FD. Add the tdc mask to the function parameter
list. The value of the tdc manual flag can then be derived from that
mask and stored in a local variable.
This way, the function becomes CAN FD agnostic and can be reused later
on for the CAN XL TDC.
Vincent Mailhol [Tue, 23 Sep 2025 06:58:36 +0000 (15:58 +0900)]
can: netlink: add can_ctrlmode_changelink()
Split the control mode change link logic into a new function:
can_ctrlmode_changelink(). The purpose is to increase code readability
by preventing can_changelink() from becoming too big.
Vincent Mailhol [Tue, 23 Sep 2025 06:58:35 +0000 (15:58 +0900)]
can: netlink: add can_dtb_changelink()
Factorise the databittiming parsing out of can_changelink() and move
it in the new can_dtb_changelink() function. This is a preparation
patch for the introduction of CAN XL because the databittiming
changelink logic will be reused later on.
Vincent Mailhol [Tue, 23 Sep 2025 06:58:34 +0000 (15:58 +0900)]
can: netlink: make can_tdc_changelink() FD agnostic
can_tdc_changelink() needs to access can_priv->fd making it
specific to CAN FD. Change the function parameter from struct can_priv
to struct data_bittiming_params. This way, the function becomes CAN FD
agnostic and can be reused later on for the CAN XL TDC.
Vincent Mailhol [Tue, 23 Sep 2025 06:58:33 +0000 (15:58 +0900)]
can: netlink: remove useless check in can_tdc_changelink()
can_tdc_changelink() returns -EOPNOTSUPP under this condition:
!tdc_const || !can_fd_tdc_is_enabled(priv)
But this function is only called if the data[IFLA_CAN_TDC] parameters
are provided. At this point, can_validate_tdc() already checked that
either of the tdc auto or tdc manual control modes were provided, that
is to say, can_fd_tdc_is_enabled(priv) must be true.
Because the right hand operand of this condition is therefore always
false, remove it.
Vincent Mailhol [Tue, 23 Sep 2025 06:58:32 +0000 (15:58 +0900)]
can: netlink: refactor CAN_CTRLMODE_TDC_{AUTO,MANUAL} flag reset logic
CAN_CTRLMODE_TDC_AUTO and CAN_CTRLMODE_TDC_MANUAL are mutually
exclusive. This means that whenever the user switches from auto to
manual mode (or vice versa), the other flag which was set previously
needs to be cleared.
Currently, this is handled with a masking operation. It can be done in
a simpler manner by clearing any of the previous TDC flags before
copying netlink attributes. The code becomes easier to understand and
will make it easier to add the new upcoming CAN XL flags which will
have a similar reset logic as the current TDC flags.
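The "clear first, then copy" approach can be sketched as follows. The
DEMO_TDC_* flags and demo_apply_ctrlmode() are hypothetical, and the
flags/mask pair only loosely mirrors how a control mode request is
expressed; this is not the code from can_changelink().

```c
#include <stdint.h>

/* Hypothetical, mutually exclusive TDC mode flags. */
#define DEMO_TDC_AUTO	0x1u
#define DEMO_TDC_MANUAL	0x2u
#define DEMO_TDC_MASK	(DEMO_TDC_AUTO | DEMO_TDC_MANUAL)

/*
 * Apply a request to the current control mode.
 * 'mask' selects the bits the user wants to change, 'flags' their new values.
 */
static uint32_t demo_apply_ctrlmode(uint32_t current, uint32_t flags,
				    uint32_t mask)
{
	/* If the request touches any TDC flag, drop both old TDC flags first. */
	if (mask & DEMO_TDC_MASK)
		current &= ~DEMO_TDC_MASK;

	/* Then copy the requested bits. */
	current &= ~mask;
	current |= flags & mask;
	return current;
}
```

Switching from auto to manual only requires the user to set the manual bit;
the stale auto bit is gone before the new value is copied.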
Vincent Mailhol [Tue, 23 Sep 2025 06:58:31 +0000 (15:58 +0900)]
can: netlink: add can_validate_databittiming()
Factorise the databittiming validation out of can_validate() and move
it into the new can_validate_databittiming() function. Also move
can_validate()'s comment because it is specific to CAN FD. This is a
preparation patch for the introduction of CAN XL as this databittiming
validation will be reused later on.
Vincent Mailhol [Tue, 23 Sep 2025 06:58:30 +0000 (15:58 +0900)]
can: netlink: add can_validate_tdc()
Factorise the TDC validation out of can_validate() and move it in the
new can_validate_tdc() function. This is a preparation patch for the
introduction of CAN XL because this TDC validation will be reused
later on.
Vincent Mailhol [Tue, 23 Sep 2025 06:58:29 +0000 (15:58 +0900)]
can: netlink: refactor can_validate_bittiming()
Whenever can_validate_bittiming() is called, it is always preceded by
some boilerplate code which was copy pasted all over the place. Move
that repeated code directly inside can_validate_bittiming().
Finally, the memcpy() is not needed: the nla attributes are four-byte
aligned which is just enough for struct can_bittiming. Add a
static_assert() to document that the alignment is correct and just use
the pointer returned by nla_data() as-is.
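The same pattern can be sketched in plain C11. struct demo_bittiming,
DEMO_ATTR_ALIGN and demo_attr_payload() are hypothetical stand-ins; the only
assumption carried over from the text is that attribute payloads are aligned
on four bytes.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stdalign.h>	/* alignof */
#include <stdint.h>

/* Hypothetical payload made only of 32-bit fields. */
struct demo_bittiming {
	uint32_t bitrate;
	uint32_t sample_point;
};

/* Assumed alignment of an attribute payload, in bytes. */
#define DEMO_ATTR_ALIGN 4

/* Document at build time that the payload needs no stricter alignment. */
static_assert(alignof(struct demo_bittiming) <= DEMO_ATTR_ALIGN,
	      "payload alignment exceeds attribute alignment");

/* Use the payload pointer as-is instead of copying it to a local struct. */
static const struct demo_bittiming *demo_attr_payload(const void *payload)
{
	return payload;
}
```

If a field with a larger alignment requirement were ever added to the
structure, the build would fail instead of silently producing a misaligned
access.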
Vincent Mailhol [Tue, 23 Sep 2025 06:58:28 +0000 (15:58 +0900)]
can: netlink: document which symbols are FD specific
The CAN XL netlink interface will also have data bitrate and TDC
parameters. The current FD parameters do not have a prefix in their
names to differentiate them.
Because the netlink interface is part of the UAPI, it is unfortunately
not feasible to rename the existing symbols to add an FD_ prefix. The
best alternative is to add a comment for each of the symbols to notify
the reader of which parts are CAN FD specific.
While at it, fix a typo: transiver -> transceiver.
Vincent Mailhol [Tue, 23 Sep 2025 06:58:27 +0000 (15:58 +0900)]
can: dev: make can_get_relative_tdco() FD agnostic and move it to bittiming.h
can_get_relative_tdco() needs to access can_priv->fd making it
specific to CAN FD. Change the function parameter from struct can_priv
to struct data_bittiming_params. This way, the function becomes CAN FD
agnostic and can be reused later on for the CAN XL TDC.
Now that we dropped the dependency on struct can_priv, also move
can_get_relative_tdco() back to bittiming.h where it was meant to
belong.
Vincent Mailhol [Tue, 23 Sep 2025 06:58:26 +0000 (15:58 +0900)]
can: dev: move struct data_bittiming_params to linux/can/bittiming.h
In commit b803c4a4f788 ("can: dev: add struct data_bittiming_params to
group FD parameters"), struct data_bittiming_params was put into
linux/can/dev.h.
This structure being a collection of bittiming parameters, on second
thought, bittiming.h is actually a better location. This way, users of
struct data_bittiming_params will not have to forcefully include
linux/can/dev.h thus removing some complexity and reducing the risk of
circular dependencies in headers.
Move struct data_bittiming_params from linux/can/dev.h to
linux/can/bittiming.h.
Marc Kleine-Budde [Wed, 24 Sep 2025 15:06:20 +0000 (17:06 +0200)]
Merge patch series "can: rework the CAN MTU logic (CAN XL preparation step 2/3)"
Vincent Mailhol <mailhol@kernel.org> says:
The CAN MTU logic is currently broken. can_change_mtu() will update
both the MTU and the CAN_CTRLMODE_FD flag.
Back then, commit bc05a8944a34 ("can: allow to change the device mtu
for CAN FD capable devices") stated that:
The configuration can be done either with the 'fd { on | off }'
option in the 'ip' tool from iproute2 or by setting the CAN
netdevice MTU to CAN_MTU (16) or to CANFD_MTU (72).
on a CAN FD interface, we are left with a device on which CAN FD is
enabled but which does not have the FD databittiming parameters
configured.
The same goes on when setting the mtu back to 16:
ip link set can0 type can bitrate 500000 fd on dbitrate 5000000
ip link set can0 mtu 16
The device is now in Classical CAN mode but iproute2 is still
reporting the databittiming values (although this time, the issue
seems less critical as it is only a reporting problem).
The only way to resolve the problem and bring the device back to a
coherent state is to call again the netlink interface using the
"fd on" or "fd off" options.
The idea of being able to infer the CAN_CTRLMODE_FD flag from the MTU
value is just incorrect for physical devices. Note that this logic
remains valid on virtual interfaces (vcan and vxcan) because those do
not have control mode flags and thus no conflict occurs.
This series reworks the CAN MTU logic. The goal is to always maintain
a coherent state between the MTU and the control mode flags as listed
in the table below:
                 fd off, xl off    fd on, xl off    fd any, xl on
-----------------------------------------------------------------
default mtu      CAN_MTU           CANFD_MTU        CANXL_MTU
min mtu          CAN_MTU           CANFD_MTU        CANXL_MIN_MTU
max mtu          CAN_MTU           CANFD_MTU        CANXL_MAX_MTU
In order to switch from one column to another, the user must use
the fd/xl on/off flags. Directly modifying the MTU from one column to
the other is not permitted any more.
The CAN XL is not yet supported at the moment, so the last column is
just given as a reference to better understand what is coming up. This
series will just implement the first two columns.
While doing the rewrite, the logic is adjusted to reuse as much as
possible the net core infrastructure. By populating:
net_device->min_mtu
and
net_device->max_mtu
the net core infrastructure will automatically:
1. validate that the user's inputs are in range.
2. report those min and max MTU values through the netlink
interface.
Point 1. will allow us to get rid of the can_change_mtu() in the near
future for all the physical devices and point 2. allows the end user
to see the valid MTU range by doing a:
$ ip --details link show can0
Finally, because the net core is used, it will be possible after the
removal of can_change_mtu() to modify the MTU while the device is up.
As stated previously, the only modifications allowed will be within
the MTU range of a given CAN protocol. So for Classical CAN and CAN
FD, the MTU is fixed to, respectively, CAN_MTU and CANFD_MTU. For the
upcoming CAN XL, the user will be able to change the MTU to anything
between CANXL_MIN_MTU and CANXL_MAX_MTU even if the device is up.
The first patch of this series annotates the read access on
net_device->mtu. This preparation is needed to prevent any race
condition from occurring when modifying the MTU while the device is up.
The second patch is another preparation change which moves
can_set_static_ctrlmode() from dev.h to dev.c.
The third patch populates the MTU minimum and maximum value.
The fourth patch is just a clean-up to remove the old
can_change_mtu().
The fourth and last patch comes as a bonus content and modifies the
default MTU of the vcan and vxcan so that CAN XL is on by default.
Note that after this series, the old can_change_mtu() becomes
useless. That function can not yet be removed because some pending
changes from other maintainers' trees still depend on it. It will be
removed in the next development window once all those changes reach
net-next.
Vincent Mailhol [Tue, 23 Sep 2025 06:37:11 +0000 (15:37 +0900)]
can: enable CAN XL for virtual CAN devices by default
In commit 97edec3a11cf ("can: enable CAN FD for virtual CAN devices by
default"), the default MTU of vcan and vxcan was set to CANFD_MTU.
The reason was that users were confused about how to activate CAN FD on
virtual interfaces.
Following the introduction of CAN XL, the same logic should be
applied. Set the MTU to CANXL_MTU by default.
The users who really wish to use a Classical CAN only or a CAN FD
virtual device can do respectively:
Vincent Mailhol [Tue, 23 Sep 2025 06:37:10 +0000 (15:37 +0900)]
can: populate the minimum and maximum MTU values
By populating:
net_device->min_mtu
and
net_device->max_mtu
the net core infrastructure will automatically:
1. validate that the user's inputs are in range.
2. report those min and max MTU values through the netlink
interface.
Add can_set_default_mtu() which sets the default mtu value as well as
the minimum and maximum values. The logic for the default mtu value
remains unchanged:
- CANFD_MTU if the device has a static CAN_CTRLMODE_FD.
- CAN_MTU otherwise.
Call can_set_default_mtu() each time the CAN_CTRLMODE_FD is modified.
This will guarantee that the MTU value is always consistent with the
control mode flags.
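The rule above can be sketched in user space as follows. This is only
an illustration under assumptions: sketch_can_dev is a hypothetical,
flattened stand-in for the net_device and CAN private data, the
sketch_* helpers are not the kernel functions, and the constant values
are assumed to match the CAN UAPI headers.

```c
#include <stdint.h>

/* Illustrative constants, assumed to match the CAN UAPI headers. */
#define SKETCH_CAN_MTU		16U
#define SKETCH_CANFD_MTU	72U
#define SKETCH_CTRLMODE_FD	0x20U

/* Hypothetical, flattened stand-in for net_device plus CAN private data. */
struct sketch_can_dev {
	uint32_t ctrlmode;
	unsigned int mtu;
	unsigned int min_mtu;
	unsigned int max_mtu;
};

/* Derive default, minimum and maximum MTU from the FD control mode flag. */
static void sketch_set_default_mtu(struct sketch_can_dev *dev)
{
	unsigned int mtu = (dev->ctrlmode & SKETCH_CTRLMODE_FD) ?
			   SKETCH_CANFD_MTU : SKETCH_CAN_MTU;

	dev->mtu = mtu;
	dev->min_mtu = mtu;
	dev->max_mtu = mtu;
}

/* Every control mode change re-derives the three MTU values. */
static void sketch_set_ctrlmode(struct sketch_can_dev *dev, uint32_t ctrlmode)
{
	dev->ctrlmode = ctrlmode;
	sketch_set_default_mtu(dev);
}
```

Because the minimum and maximum are set to the same value, the MTU
cannot drift away from what the control mode flags imply.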
With this, the checks done in can_change_mtu() become fully redundant
and will be removed in an upcoming change. It is also now possible to
confirm the minimum and maximum MTU values on a physical CAN interface
by doing:
$ ip --details link show can0
The virtual interfaces (vcan and vxcan) are not impacted by this
change.
tracing: fprobe: Fix to remove recorded module addresses from filter
Even if there is a memory allocation failure in fprobe_addr_list_add(),
there is a partial list of module addresses. So remove the recorded
addresses from the filter if any exist.
This also removes the redundant ret local variable.
Eric Dumazet [Wed, 24 Sep 2025 07:27:09 +0000 (07:27 +0000)]
netfilter: nf_conntrack: do not skip entries in /proc/net/nf_conntrack
ct_seq_show() has an opportunistic garbage collector:
	if (nf_ct_should_gc(ct)) {
		nf_ct_kill(ct);
		goto release;
	}
So if one nf_conn is killed there, next time ct_get_next() runs,
we skip the following item in the bucket, even if it should have
been displayed if gc did not take place.
We can decrement st->skip_elems to tell ct_get_next() one of the items
was removed from the chain.
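The off-by-one can be reproduced with a small user-space model. This is
a sketch under assumptions, not the conntrack code: a single bucket is
modelled as an array, and sketch_bucket, sketch_kill() and
sketch_dump() are hypothetical names. The reader looks up the entry at
position skip_elems, shows it or reaps it, then advances by one.

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_MAX 8

struct sketch_entry {
	int id;
	bool stale;
};

struct sketch_bucket {
	struct sketch_entry items[SKETCH_MAX];
	int len;
};

/* Remove the entry at idx, shifting the later entries down by one. */
static void sketch_kill(struct sketch_bucket *b, int idx)
{
	memmove(&b->items[idx], &b->items[idx + 1],
		(size_t)(b->len - idx - 1) * sizeof(b->items[0]));
	b->len--;
}

/*
 * Dump the bucket: stale entries are reaped instead of shown.
 * With fix == true, skip_elems is decremented after a removal so the
 * entry that moved into the freed position is not skipped.
 * Returns the number of ids written to out[].
 */
static int sketch_dump(struct sketch_bucket *b, bool fix, int *out)
{
	int skip_elems = 0, n = 0;

	while (skip_elems < b->len) {
		struct sketch_entry *e = &b->items[skip_elems];

		if (e->stale) {
			sketch_kill(b, skip_elems);
			if (fix)
				skip_elems--;	/* one item left the chain */
		} else {
			out[n++] = e->id;
		}
		skip_elems++;	/* advance to the next position */
	}
	return n;
}
```

For a bucket holding ids 1, 2 (stale), 3 and 4, the unfixed walk shows
1 and 4 only, while the fixed walk shows 1, 3 and 4.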
Fixes: 58e207e4983d ("netfilter: evict stale entries when user reads /proc/net/nf_conntrack") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de>
netfilter: nft_set_pipapo_avx2: fix skip of expired entries
KASAN reports the following splat:
BUG: KASAN: slab-out-of-bounds in pipapo_get_avx2+0x941/0x25d0
Read of size 1 at addr ffff88814c561be0 by task nft/3944
Call Trace:
pipapo_get_avx2+0x941/0x25d0
nft_pipapo_insert+0x440/0x11b0
nf_tables_newsetelem+0x220a/0x3a00
..
This bisects to commit 84c1da7b38d9 ("netfilter: nft_set_pipapo: use AVX2
algorithm for insertions too").
However, that change merely uncovers this bug.
When we find a match but that match has expired or timed out, the AVX2
implementation restarts the full match loop.
At that point, the pointer to the key data has already been changed and
points to the key's last field.
This will then result in an out-of-bounds read once it is incremented again
for the next field.
The restart logic in AVX2 is different compared to the plain C
implementation, but both should follow the same logic.
The C implementation just calls pipapo_refill() again to check the next
entry. Do the same in the AVX2 implementation.
Note that with this change, due to implementation differences of
pipapo_refill vs. nft_pipapo_avx2_refill, the refill call will return
the same element again. Then, on the next call, it will move to the next
entry as expected. This is because avx2_refill doesn't clear the bitmap
in the 'last' conditional. This is harmless. Expired/timed out elements
are also not expected to be frequent.
netfilter: nft_set_pipapo: use 0 genmask for packetpath lookups
In commit c4eaca2e1052 ("netfilter: nft_set_pipapo: don't check genbit from
packetpath lookups") I replaced genmask_cur() with NFT_GENMASK_ANY, but
this change has no effect in the pipapo set type.
New entries are unreachable from the active copy, so NFT_GENMASK_ANY has
the same result as genmask_cur():
current-gen elements are disabled and the new-generation
elements cannot be found.
Tests did not catch this incomplete fix because the change also dropped
the genmask test from the AVX2 version of the algorithm, so the test only
fails if the host CPU lacks AVX2 support.
Use genmask test only from the control plane (inserts, deletions, ..).
Packet path has to skip the check, use of 0 is enough for this because
ext->genmask has the relevant bit set when the element is INACTIVE
in that generation: using a 0 genmask thus makes nft_set_elem_active()
always return true.
Fix the comment and replace NFT_GENMASK_ANY with 0.
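The effect of a 0 genmask can be illustrated with a small user-space
sketch. The names sketch_ext and sketch_elem_active() are hypothetical
stand-ins; the helper is assumed to follow the rule described above,
where a set bit in ext->genmask marks the element as inactive in that
generation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-element extension data. */
struct sketch_ext {
	uint8_t genmask;	/* set bit: inactive in that generation */
};

/* Active unless one of the queried generation bits is set. */
static bool sketch_elem_active(const struct sketch_ext *ext, uint8_t genmask)
{
	return !(ext->genmask & genmask);
}
```

Since (x & 0) is always 0, a query with genmask 0 reports every element
as active, whatever bits the element carries.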
netfilter: nfnetlink: reset nlh pointer during batch replay
During a batch replay, the nlh pointer is not reset until the parsing of
the commands. Since commit bf2ac490d28c ("netfilter: nfnetlink: Handle
ACK flags for batch messages") that is problematic as the condition to
add an ACK for batch begin will evaluate to true even if NLM_F_ACK
wasn't used for batch begin message.
If there is an error during the command processing, netlink is sending
an ACK despite that. This misleads userspace tools which think that the
return code was 0. Reset the nlh pointer to the original one when a
replay is triggered.
Fixes: bf2ac490d28c ("netfilter: nfnetlink: Handle ACK flags for batch messages") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Florian Westphal <fw@strlen.de>
Slavin Liu [Thu, 11 Sep 2025 17:57:59 +0000 (01:57 +0800)]
ipvs: Defer ip_vs_ftp unregister during netns cleanup
On the netns cleanup path, __ip_vs_ftp_exit() may unregister ip_vs_ftp
before connections with valid cp->app pointers are flushed, leading to a
use-after-free.
Fix this by introducing a global `exiting_module` flag, set to true in
ip_vs_ftp_exit() before unregistering the pernet subsystem. In
__ip_vs_ftp_exit(), skip ip_vs_ftp unregister if called during netns
cleanup (when exiting_module is false) and defer it to
__ip_vs_cleanup_batch(), which unregisters all apps after all connections
are flushed. If called during module exit, unregister ip_vs_ftp
immediately.
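The ordering described above can be modelled in user space as follows.
This is a simplified sketch, not the ipvs code: sketch_netns and all
sketch_* functions are hypothetical, and the use_after_free field only
records that the app would have been unregistered while connections
still referenced it.

```c
#include <stdbool.h>

/* Set only on the module exit path, before the pernet exit hook runs. */
static bool sketch_exiting_module;

/* Hypothetical per-netns state. */
struct sketch_netns {
	int conns;		/* connections still pointing at the app */
	bool app_registered;
	bool use_after_free;	/* app went away while conns remained */
};

static void sketch_unregister_app(struct sketch_netns *ns)
{
	if (ns->conns > 0)
		ns->use_after_free = true;
	ns->app_registered = false;
}

/* Per-netns exit hook of the ftp helper. */
static void sketch_ftp_pernet_exit(struct sketch_netns *ns)
{
	if (!sketch_exiting_module)
		return;		/* netns cleanup: defer to the batch step */
	sketch_unregister_app(ns);
}

/* Batch cleanup: flush connections first, then unregister all apps. */
static void sketch_cleanup_batch(struct sketch_netns *ns)
{
	ns->conns = 0;
	if (ns->app_registered)
		sketch_unregister_app(ns);
}

/* netns cleanup path: exit hook runs before the batch cleanup. */
static void sketch_netns_cleanup(struct sketch_netns *ns)
{
	sketch_ftp_pernet_exit(ns);
	sketch_cleanup_batch(ns);
}

/* Module exit path: set the flag, then unregister immediately. */
static void sketch_module_exit(struct sketch_netns *ns)
{
	sketch_exiting_module = true;
	sketch_ftp_pernet_exit(ns);
}
```

In this model, a netns cleanup with live connections leaves the app
registered until the connections are flushed, so the use_after_free
marker is never set.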
Marco Crivellari [Tue, 23 Sep 2025 14:59:05 +0000 (16:59 +0200)]
wifi: libertas: add WQ_UNBOUND to alloc_workqueue users
Currently if a user enqueues a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
Explicitly add the WQ_UNBOUND flag to alloc_workqueue() users, marking
the workqueue unbound.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.