Ramya Gnanasekar [Fri, 6 Jun 2025 10:44:36 +0000 (16:14 +0530)]
wifi: mac80211: Fix 6 GHz Band capabilities element advertisement in lower bands
Currently, when adding the 6 GHz Band Capabilities element, the channel
list of the wiphy is checked to determine if 6 GHz is supported for a given
virtual interface. However, in a multi-radio wiphy (e.g., one that has
both lower bands and 6 GHz combined), the wiphy advertises support for
all bands. As a result, the 6 GHz Band Capabilities element is incorrectly
included in mesh beacon and station's association request frames of
interfaces operating in lower bands, without verifying whether the
interface is actually operating in a 6 GHz channel.
Fix this by verifying that the interface operates on a 6 GHz channel
before adding the element. Note that this check cannot be placed
directly in ieee80211_put_he_6ghz_cap() as the same function is used to
add probe request elements while initiating scan in which case the
interface may not be operating in any band's channel.
Wright Feng [Sun, 17 Aug 2025 19:04:35 +0000 (21:04 +0200)]
wifi: brcmfmac: support AP isolation to restrict reachability between stations
The hostapd and wpa_supplicant userspace daemons expose an AP-mode-specific
config file parameter "ap_isolate" to the user, which is used to control
low-level bridging of frames between the stations associated in the BSS.
In the driver, handle this user setting in the newly defined cfg80211_ops
function brcmf_cfg80211_change_bss() by enabling "ap_isolate" IOVAR in
the firmware.
In AP mode, the "ap_isolate" value from the cfg80211 layer means:
0 = allow low-level bridging of frames between associated stations
1 = restrict low-level bridging of frames to isolate associated stations
-1 = do not change existing setting
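A minimal sketch of how a driver might act on this tri-state value. The struct and function names here are hypothetical and are not taken from brcmfmac; the real driver sets a firmware IOVAR rather than a struct field.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the firmware's "ap_isolate" setting. */
struct fw_state {
	bool ap_isolate;
};

/*
 * Apply the tri-state value handed down by cfg80211:
 *   0 = allow bridging, 1 = isolate stations, -1 = leave unchanged.
 * Returns true if the (simulated) firmware setting was written.
 */
static bool apply_ap_isolate(struct fw_state *fw, int ap_isolate)
{
	if (ap_isolate < 0)
		return false;	/* -1: do not touch the existing setting */

	fw->ap_isolate = (ap_isolate != 0);
	return true;
}
```

The early return for negative values is what keeps an unrelated BSS update from silently resetting the isolation state.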
Signed-off-by: Wright Feng <wright.feng@cypress.com> Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com> Signed-off-by: Gokul Sivakumar <gokulkumar.sivakumar@infineon.com>
[arend: indicate ap_isolate support in struct wiphy::bss_param_support] Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://patch.msgid.link/20250817190435.1495094-5-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Arend van Spriel [Sun, 17 Aug 2025 19:04:34 +0000 (21:04 +0200)]
wifi: nl80211: strict checking attributes for NL80211_CMD_SET_BSS
Ensure user-space only modifies attributes for NL80211_CMD_SET_BSS
that are supported by the driver. This stricter checking is only done
when user-space commits to it by including NL80211_ATTR_BSS_PARAM.
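The opt-in check can be pictured as a bitmask comparison. This is only a sketch: the flag names, the function name and the -EOPNOTSUPP return value are assumptions made for illustration, not the actual nl80211 code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, one per BSS parameter a driver may handle. */
#define BSS_PARAM_CTS_PROT		(1u << 0)
#define BSS_PARAM_SHORT_PREAMBLE	(1u << 1)
#define BSS_PARAM_AP_ISOLATE		(1u << 2)

/*
 * supported: parameters the driver advertises.
 * changed:   parameters present in the request.
 * strict:    true when user-space opted in to strict checking.
 */
static int check_set_bss(uint32_t supported, uint32_t changed, bool strict)
{
	if (!strict)
		return 0;		/* legacy behaviour: no check */

	if (changed & ~supported)
		return -EOPNOTSUPP;	/* assumed error code */

	return 0;
}
```

Keeping the check behind the opt-in preserves the behaviour that existing user-space relies on.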
Arend van Spriel [Sun, 17 Aug 2025 19:04:33 +0000 (21:04 +0200)]
wifi: drivers: indicate support for attributes in NL80211_CMD_SET_BSS
The command NL80211_CMD_SET_BSS has a number of individual attributes
and the driver can advertise which of those it will handle when they are
changed by user-space. For drivers that provided an empty .change_bss(),
the callback has been removed.
Arend van Spriel [Sun, 17 Aug 2025 19:04:32 +0000 (21:04 +0200)]
wifi: nl80211: allow drivers to support subset of NL80211_CMD_SET_BSS
The so-called fullmac devices rely on firmware functionality and/or API to
change BSS parameters. Today only a few drivers support the nl80211
primitive, and they handle only a subset of the BSS parameters passed, if
any. The mac80211 driver does handle all parameters and stores their
configured values. Some of the BSS parameters were already conditional on
wiphy->features. For these, the wiphy->bss_param_support and wiphy->features
fields are silently aligned in wiphy_register(). It may be better to issue a
warning instead when they are misaligned.
Muna Sinada [Fri, 15 Aug 2025 21:30:11 +0000 (14:30 -0700)]
wifi: nl80211: Add EHT fixed Tx rate support
Add new attributes to support EHT MCS/NSS Tx rates and EHT GI/LTF.
Parse EHT fixed MCS/NSS Tx rates and EHT GI/LTF values passed by the
userspace, validate and add as part of cfg80211_bitrate_mask.
MCS mask is constructed by new function, eht_build_mcs_mask(). Max NSS
supported for MCS rates of 7, 9, 11 and 13 is utilized to set MCS
bitmask for each NSS. MCS rates 14, and 15 if supported, are set only
for NSS = 0.
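A rough sketch of the mask construction described above. It is not the kernel's eht_build_mcs_mask(); the function name, the 8-entry NSS array and the argument layout are assumptions, and "NSS = 0" is read here as array index 0.

```c
#include <stdint.h>

#define SKETCH_EHT_NSS_SLOTS 8	/* assumed number of NSS entries */

/*
 * max_nss[0..3]: assumed maximum NSS for MCS 0-7, 8-9, 10-11 and 12-13.
 * mask[i] receives one bit per usable MCS for NSS index i.
 * MCS 14 and 15 are only ever set in mask[0].
 */
static void sketch_build_mcs_mask(const uint8_t max_nss[4],
				  int has_mcs14, int has_mcs15,
				  uint16_t mask[SKETCH_EHT_NSS_SLOTS])
{
	static const uint16_t range_bits[4] = {
		0x00ff,	/* MCS 0-7   */
		0x0300,	/* MCS 8-9   */
		0x0c00,	/* MCS 10-11 */
		0x3000,	/* MCS 12-13 */
	};
	int i, r;

	for (i = 0; i < SKETCH_EHT_NSS_SLOTS; i++) {
		mask[i] = 0;
		for (r = 0; r < 4; r++)
			if (i < max_nss[r])
				mask[i] |= range_bits[r];
	}

	if (has_mcs14)
		mask[0] |= 1u << 14;
	if (has_mcs15)
		mask[0] |= 1u << 15;
}
```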
Co-developed-by: Aloka Dixit <aloka.dixit@oss.qualcomm.com> Signed-off-by: Aloka Dixit <aloka.dixit@oss.qualcomm.com> Signed-off-by: Muna Sinada <muna.sinada@oss.qualcomm.com> Link: https://patch.msgid.link/20250815213011.2704803-1-muna.sinada@oss.qualcomm.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
wifi: mac80211: consider links for validating SCAN_FLAG_AP in scan request during MLO
Commit 78a7a126dc5b ("wifi: mac80211: validate SCAN_FLAG_AP in scan request
during MLO") introduced a check that rejects scan requests if any link is
already beaconing. This works fine when all links share the same radio, but
breaks down in multi-radio setups.
Consider a scenario where a 2.4 GHz link is beaconing and a scan is
requested on a 5 GHz link, each backed by a different physical radio. The
current logic still blocks the scan, even though it should be allowed. As a
result, interface bring-up fails unnecessarily in valid configurations.
Fix this by checking whether the scan is being requested on the same
underlying radio as the beaconing link. Only reject the scan if it targets
a link that is already beaconing and the NL80211_FEATURE_AP_SCAN is not
set. This ensures correct behavior in multi-radio environments and avoids
false rejections.
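The per-radio rule can be sketched as follows. The struct, the function name and the error code are hypothetical; the real code works on mac80211 link and channel-context structures.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one MLO link. */
struct sketch_link {
	int radio_idx;		/* physical radio backing this link */
	bool beaconing;
};

/*
 * Reject the scan only if a beaconing link sits on the radio the scan
 * targets and the device does not advertise AP-scan support.
 */
static int sketch_validate_scan(const struct sketch_link *links,
				size_t n_links, int scan_radio_idx,
				bool feature_ap_scan)
{
	size_t i;

	if (feature_ap_scan)
		return 0;

	for (i = 0; i < n_links; i++)
		if (links[i].beaconing &&
		    links[i].radio_idx == scan_radio_idx)
			return -EOPNOTSUPP;	/* assumed error code */

	return 0;
}
```

With a 2.4 GHz link beaconing on radio 0, a scan on radio 1 passes while a scan on radio 0 is still rejected.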
wifi: mac80211: simplify return value handling of cfg80211_get_radio_idx_by_chan()
In several places where cfg80211_get_radio_idx_by_chan() is called,
redundant checks are performed, such as verifying whether
wiphy->n_radio < 2 or whether the returned index is negative. These checks
are unnecessary, as the return value can be compared directly. Moreover, the
function can be safely called even when radio-level properties are not
explicitly advertised, since in that case every call returns the same
error value.
Therefore, simplify the usage of this function across all such cases by
removing redundant conditions and relying on the return value directly.
wifi: cfg80211: fix return value in cfg80211_get_radio_idx_by_chan()
If a valid radio index is not found, the function returns -ENOENT. If the
channel argument itself is invalid, it returns -EINVAL. However, since the
caller only checks for < 0, the distinction between these error codes is
not utilized much. Also, handling these two distinct error codes throughout
the codebase adds complexity, as both cases must be addressed separately. A
subsequent change aims to simplify this by using a single error code for
all invalid cases, making error handling more consistent and streamlined.
To support this change, update the return value to -EINVAL when a valid
radio index is not found. This is still appropriate because, even if the
channel argument is structurally valid, the absence of a corresponding
radio index implies that the argument is effectively invalid—otherwise, a
valid index would have been found.
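A simplified illustration of the single-error-code convention. The types and the lookup by frequency are assumptions made for this sketch; the real function takes a wiphy and a channel pointer.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical: each radio covers one frequency range, in kHz. */
struct sketch_radio {
	unsigned int start_khz;
	unsigned int end_khz;
};

/*
 * Returns the radio index for freq_khz, or -EINVAL for every failure:
 * bad arguments and "no radio covers this frequency" alike.
 */
static int sketch_radio_idx(const struct sketch_radio *radios,
			    int n_radios, unsigned int freq_khz)
{
	int i;

	if (!radios || !freq_khz)
		return -EINVAL;

	for (i = 0; i < n_radios; i++)
		if (freq_khz >= radios[i].start_khz &&
		    freq_khz <= radios[i].end_khz)
			return i;

	return -EINVAL;	/* was a distinct "not found" code before */
}
```

Because no radios advertised simply yields -EINVAL, a caller can write `sketch_radio_idx(...) == wanted_idx` without first checking the radio count or the sign of the result.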
wifi: mac80211: kunit: add kunit tests for S1G PVB decoding
Add support for testing the 6 examples mentioned in IEEE80211-2024
Annex L. These tests cover the 3 mandatory decoding modes (block
bitmap, single AID and OLB) alongside their equivalent inverses.
An S1G TIM PVB has 3 mandatory encoding modes, namely
block bitmap, single AID and OLB, alongside the ability for
each encoding mode to be inverted. Introduce the ability to
parse the 3 encoding formats. The implementation specification
for the encoding formats can be found in IEEE80211-2024 9.4.2.5.
wifi: mac80211: support block bitmap S1G TIM encoding
An S1G TIM PVB is encoded differently compared to a non-S1G TIM PVB.
As the AP dictates which encoding mode it uses, here we only implement
block bitmap encoding. This is the default encoding mode used by
all current vendor implementations.
Additionally, S1G has a maximum AID count of 8192; however, we are
limiting the current implementation to 1600. This limit does not come
from the standard and is purely an implementation detail. The reason for
this is the TIM element's maximum length of 255. This allows for,
at most, 25 encoded blocks for a PVB encoded with block bitmap. Support
for the maximum of 8192 AIDs will require an implementation of page slicing
to be added to mac80211.
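The arithmetic behind the 1600 figure can be reproduced as below. The 255-byte limit, the 25 blocks and the 1600 AIDs come from the text above; the 3 bytes of fixed TIM fields and the 10-byte encoded block (one control byte, one block bitmap byte, eight subblock bytes) are assumptions about the layout used to make the numbers work out.

```c
#define SKETCH_TIM_MAX_LEN	255	/* from the commit message */
#define SKETCH_TIM_FIXED_LEN	3	/* assumed: DTIM count, period, bitmap control */
#define SKETCH_BLOCK_LEN	10	/* assumed: control + block bitmap + 8 subblocks */
#define SKETCH_AIDS_PER_BLOCK	64	/* 8 subblocks x 8 AIDs each */

/* Number of fully encoded blocks that fit in one TIM element. */
static unsigned int sketch_s1g_max_blocks(void)
{
	return (SKETCH_TIM_MAX_LEN - SKETCH_TIM_FIXED_LEN) / SKETCH_BLOCK_LEN;
}

/* Number of AIDs those blocks can describe. */
static unsigned int sketch_s1g_max_aids(void)
{
	return sketch_s1g_max_blocks() * SKETCH_AIDS_PER_BLOCK;
}
```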
As a result, we perform extra validation on both the STA and AP side
when receiving an AID as an S1G interface.
Add support for block bitmap encoding for an S1G AP and limit the
maximum AID count to 1600 for the current mac80211 implementations.
Cypress (Infineon) is not the vendor for this 43752 SDIO WLAN chip, and so
has not officially released any firmware binary for it. It is incorrect to
maintain this WLAN chip with firmware vendor ID as "CYW". So relabel the
chip's firmware Vendor ID as "WCC" as suggested by the maintainer.
Fixes: d2587c57ffd8 ("brcmfmac: add 43752 SDIO ids and initialization") Fixes: f74f1ec22dc2 ("wifi: brcmfmac: add support for Cypress firmware api") Signed-off-by: Gokul Sivakumar <gokulkumar.sivakumar@infineon.com> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://patch.msgid.link/20250724101136.6691-1-gokulkumar.sivakumar@infineon.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Stefan Kerkmann [Mon, 4 Aug 2025 14:16:59 +0000 (16:16 +0200)]
wifi: mwifiex: send world regulatory domain to driver
The world regulatory domain is a restrictive subset of channel
configurations which allows legal operation of the adapter all over the
world. Changing to this domain should not be prevented.
Stefan Kerkmann [Mon, 4 Aug 2025 13:58:27 +0000 (15:58 +0200)]
wifi: mwifiex: add rgpower table loading support
Marvell/NXP Wi-Fi adapters allow fine-grained adjustment of the transmit
power levels and various other internal parameters. This is done by
sending command streams to the adapter. One storage format of these
command streams is the rgpower tables, which consist of multiple
command blocks in the following format:
command_block_1 = {
XX XX LL LL XX XX ..
}
command_block_n = {
XX XX LL LL XX XX XX ..
}
XX = raw byte as hex chars
LL = total length of the "raw" command block
These command blocks are parsed into their binary representation and
then sent to the adapter. The parsing logic was adapted from NXP's
mwifiex driver[1].
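As an illustration of the hex-to-binary step only, here is a small standalone parser for the text between the braces of one command block. It is not the mwifiex code; the function names are hypothetical and the length field is deliberately left uninterpreted.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int sketch_hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Convert whitespace-separated hex pairs ("XX XX LL LL ...") to bytes.
 * Returns the number of bytes written, or -1 on malformed input or
 * if the output buffer is too small.
 */
static int sketch_parse_hex_block(const char *text, unsigned char *out,
				  size_t out_len)
{
	size_t n = 0;

	while (*text) {
		int hi, lo;

		if (isspace((unsigned char)*text)) {
			text++;
			continue;
		}

		hi = sketch_hex_val(text[0]);
		lo = (hi >= 0) ? sketch_hex_val(text[1]) : -1;
		if (hi < 0 || lo < 0 || n >= out_len)
			return -1;

		out[n++] = (unsigned char)((hi << 4) | lo);
		text += 2;
	}

	return (int)n;
}
```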
The rgpower tables matching the currently set regulatory domain are
automatically requested and applied. If none are found, the existing
device-tree-provided power tables are tried as well.
Gustavo A. R. Silva [Mon, 11 Aug 2025 05:10:39 +0000 (14:10 +0900)]
wifi: iwlegacy: Remove unused structs and avoid -Wflex-array-member-not-at-end warnings
Remove unused structures and avoid the following
-Wflex-array-member-not-at-end warnings:
drivers/net/wireless/intel/iwlegacy/iwl-spectrum.h:68:39: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/net/wireless/intel/iwlegacy/iwl-spectrum.h:60:39: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Acked-by: Stanislaw Gruszka <stf_xl@wp.pl> Link: https://patch.msgid.link/aJl7TxeWgLdEKWhg@kspp Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Lorenzo Bianconi [Tue, 26 Aug 2025 11:54:31 +0000 (13:54 +0200)]
wifi: mac80211: Make CONNECTION_MONITOR optional for MLO sta
Since commit '1bc892d76a6f ("wifi: mac80211: extend connection
monitoring for MLO")' mac80211 supports connection monitor for MLO
client interfaces. Remove the CONNECTION_MONITOR requirement in
ieee80211_register_hw routine.
Steven Rostedt [Fri, 29 Aug 2025 02:17:59 +0000 (22:17 -0400)]
wifi: cfg80211: Remove unused tracepoints
Tracepoints that are defined take up around 5K each, even if they are not
used. If they are defined and not used, then they waste memory for unused
code. Soon unused tracepoints will cause warnings.
Remove the unused tracepoints of the cfg80211 subsystem. They are:
Miri Korenblit [Thu, 28 Aug 2025 08:25:59 +0000 (11:25 +0300)]
wifi: iwlwifi: carefully select the PNVM source
For newer devices, and from API 100 (core 97), the PNVM should be taken
from the .ucode file, and not from an external .pnvm file.
In the current logic, if the PNVM doesn't exist in the .ucode file, we
fall back to fetching the external .pnvm file. This is wrong and hides bugs.
This fallback was needed for (a) old devices and (b) newer
devices with an old API.
Since we no longer support those old APIs, (b) is no longer relevant.
We can, according to the device, select the right PNVM source
and fail if we couldn't find the PNVM there.
Add clear logic to select the expected PNVM source, and print an error
if we couldn't get the PNVM from there.
Miri Korenblit [Thu, 28 Aug 2025 08:25:58 +0000 (11:25 +0300)]
wifi: iwlwifi: mld: make iwl_mld_rm_vif void
Unlike adding/allocating an object, destroying it should always
succeed. In addition, the return value of iwl_mld_rm_vif is not even
used.
Make it a void function.
Miri Korenblit [Thu, 28 Aug 2025 08:25:57 +0000 (11:25 +0300)]
wifi: iwlwifi: pcie: remember when interrupts are disabled
trans_pcie::fh_mask and hw_mask indicate which interrupts are
currently enabled (unmasked).
When we disable all interrupts, those should be set to 0, so that if, for
some reason, we get an interrupt even though it was disabled, we will
know to ignore it.
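A small model of the idea, with hypothetical struct and function names (only the fh_mask and hw_mask field names are taken from the text above):

```c
#include <stdint.h>

/* Software copy of which interrupt causes are currently unmasked. */
struct sketch_irq_state {
	uint32_t fh_mask;
	uint32_t hw_mask;
};

/* Disabling all interrupts also clears the remembered masks. */
static void sketch_disable_interrupts(struct sketch_irq_state *s)
{
	s->fh_mask = 0;
	s->hw_mask = 0;
}

/* Keep only the hardware causes that are supposed to be enabled. */
static uint32_t sketch_filter_hw_causes(const struct sketch_irq_state *s,
					uint32_t causes)
{
	return causes & s->hw_mask;
}
```

After sketch_disable_interrupts(), any stray cause is filtered down to zero, so the handler has nothing to act on.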
Miri Korenblit [Thu, 28 Aug 2025 08:25:56 +0000 (11:25 +0300)]
wifi: iwlwifi: mld: support TLC command version 5
A new version of the TLC command was added in order to support the new
MCSs introduced in UHR, and an indication of ELR support.
To support the new MCSs, the new version will have MCS bitmaps
(ht_rates) of 32 bits rather than 16 bits, as in the old version.
Change the code to populate the new version of the command,
and if the FW requires the old version, copy the content of the new version
structure to the old version structure.
Note that this doesn't actually set the new MCSs, this will come later.
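The "fill the new layout, then down-convert if needed" approach might look roughly like this. The struct layouts, the array size and the function name are invented for the sketch; only the ht_rates name and the 32-bit versus 16-bit widths come from the text above.

```c
#include <stdint.h>

#define SKETCH_N_RATE_SETS 2	/* hypothetical array size */

/* New command layout: 32-bit MCS bitmaps. */
struct sketch_tlc_cmd_new {
	uint32_t ht_rates[SKETCH_N_RATE_SETS];
	uint8_t flags;
};

/* Old command layout: 16-bit MCS bitmaps. */
struct sketch_tlc_cmd_old {
	uint16_t ht_rates[SKETCH_N_RATE_SETS];
	uint8_t flags;
};

/* Copy a populated new command into the old layout for older firmware. */
static void sketch_tlc_new_to_old(const struct sketch_tlc_cmd_new *in,
				  struct sketch_tlc_cmd_old *out)
{
	int i;

	out->flags = in->flags;
	for (i = 0; i < SKETCH_N_RATE_SETS; i++)
		out->ht_rates[i] = (uint16_t)(in->ht_rates[i] & 0xffff);
}
```

Bits above bit 15 are dropped in the conversion, which is harmless as long as the new MCSs are not set yet.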
Johannes Berg [Thu, 28 Aug 2025 08:25:47 +0000 (11:25 +0300)]
wifi: iwlwifi: uefi: remove runtime check of constant values
There's no need to check an ARRAY_SIZE() at runtime; it's
already determined at build time, so it could be a BUILD_BUG_ON.
However, that's not very useful here since the array is defined
using UEFI_MAX_DSM_FUNCS. Check DSM_FUNC_NUM_FUNCS instead to
ensure the array cannot be accessed out of bounds, i.e. ensure
the range check there is always good enough.
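In plain C11 the same idea can be shown with _Static_assert standing in for BUILD_BUG_ON. The enum values, array contents and SKETCH_ names are made up for this sketch.

```c
#include <stddef.h>

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#define SKETCH_MAX_DSM_FUNCS 8	/* plays the role of the array bound */

enum sketch_dsm_func {
	SKETCH_DSM_FUNC_A,
	SKETCH_DSM_FUNC_B,
	SKETCH_DSM_FUNC_C,
	SKETCH_DSM_FUNC_NUM_FUNCS,	/* plays the role of the range limit */
};

static const unsigned int sketch_dsm_values[SKETCH_MAX_DSM_FUNCS] = {
	10, 20, 30,
};

/* Build-time guarantee that the runtime range check below is sufficient. */
_Static_assert(SKETCH_DSM_FUNC_NUM_FUNCS <=
	       SKETCH_ARRAY_SIZE(sketch_dsm_values),
	       "range check must not allow out-of-bounds access");

static int sketch_get_dsm(unsigned int func, unsigned int *value)
{
	if (func >= SKETCH_DSM_FUNC_NUM_FUNCS)
		return -1;

	*value = sketch_dsm_values[func];
	return 0;
}
```

If someone later grows the enum past the array size, the build fails instead of the range check quietly becoming too permissive.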
Cross-merge networking fixes after downstream PR (net-6.17-rc4).
No conflicts.
Adjacent changes:
drivers/net/ethernet/intel/idpf/idpf_txrx.c 02614eee26fb ("idpf: do not linearize big TSO packets") 6c4e68480238 ("idpf: remove obsolete stashing code")
Linus Torvalds [Fri, 29 Aug 2025 00:35:51 +0000 (17:35 -0700)]
Merge tag 'net-6.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from Bluetooth.
Current release - regressions:
- ipv4: fix regression in local-broadcast routes
- vsock: fix error-handling regression introduced in v6.17-rc1
Previous releases - regressions:
- bluetooth:
- mark connection as closed during suspend disconnect
- fix set_local_name race condition
- eth:
- ice: fix NULL pointer dereference on reset
- mlx5: fix memory leak in hws_pool_buddy_init error path
- bnxt_en: fix stats context reservation logic
- hv: fix loss of receive events from host during channel open
Previous releases - always broken:
- page_pool: fix incorrect mp_ops error handling
- sctp: initialize more fields in sctp_v6_from_sk()
- eth:
- octeontx2-vf: fix max packet length errors
- idpf: fix Tx flow scheduling to avoid Tx timeouts
- bnxt_en: fix memory corruption during ifdown
- ice: fix incorrect counter for buffer allocation failures
- mlx5: fix lockdep assertion on sync reset unload event
- fbnic: fixup rtnl_lock and devl_lock handling
- xgmac: do not enable RX FIFO overflow interrupts
- phy: mscc: fix when PTP clock is register and unregister
Misc:
- add Telit Cinterion LE910C4-WWX new compositions"
* tag 'net-6.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (60 commits)
net: ipv4: fix regression in local-broadcast routes
net: macb: Disable clocks once
fbnic: Move phylink resume out of service_task and into open/close
fbnic: Fixup rtnl_lock and devl_lock handling related to mailbox code
net: rose: fix a typo in rose_clear_routes()
l2tp: do not use sock_hold() in pppol2tp_session_get_sock()
sctp: initialize more fields in sctp_v6_from_sk()
MAINTAINERS: rmnet: Update email addresses
net: rose: include node references in rose_neigh refcount
net: rose: convert 'use' field to refcount_t
net: rose: split remove and free operations in rose_remove_neigh()
net: hv_netvsc: fix loss of early receive events from host during channel open.
net: stmmac: Set CIC bit only for TX queues with COE
net: stmmac: xgmac: Correct supported speed modes
net: stmmac: xgmac: Do not enable RX FIFO Overflow interrupts
net/mlx5e: Set local Xoff after FW update
net/mlx5e: Update and set Xon/Xoff upon port speed set
net/mlx5e: Update and set Xon/Xoff upon MTU set
net/mlx5: Prevent flow steering mode changes in switchdev mode
net/mlx5: Nack sync reset when SFs are present
...
Jakub Kicinski [Thu, 28 Aug 2025 23:59:21 +0000 (16:59 -0700)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
ice: split ice_virtchnl.c git-blame friendly way
Przemek Kitszel says:
Split ice_virtchnl.c into two more files (+headers), in a way
that git-blame works better.
Then move virtchnl files into a new subdir.
No logic changes.
I have developed (or discovered ;)) how to split a file in a way that
both old and new are nice in terms of git-blame
There was not much discussion on [RFC], so I would like to propose
to go forward with this approach.
There are more commits needed to have it nice, so it forms a git-log vs
git-blame tradeoff, but (after the brief moment that this is on the top)
we spend orders of magnitude more time looking at the blame output (and
commit messages linked from that) - so I find it much better to see
actual logic changes instead of "move xx to yy" stuff (typical for
"squashed/single-commit splits").
Cherry-picks/rebases work the same with this method as with the simple
"squashed/single-commit" approach (literally all commits squashed into
one, to have better git-log, but shitty git-blame output).
Rationale for the split itself is, as usual, "file is big and we want to
extend it".
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: finish virtchnl.c split into rss.c
ice: extract virt/rss.c: cleanup - p2
ice: extract virt/rss.c: cleanup - p1
ice: split RSS stuff out of virtchnl.c - copy back
ice: split RSS stuff out of virtchnl.c - tmp rename
ice: finish virtchnl.c split into queues.c
ice: extract virt/queues.c: cleanup - p3
ice: extract virt/queues.c: cleanup - p2
ice: extract virt/queues.c: cleanup - p1
ice: split queue stuff out of virtchnl.c - copy back
ice: split queue stuff out of virtchnl.c - tmp rename
ice: add virt/ and move ice_virtchnl* files there
====================
Jakub Kicinski [Wed, 27 Aug 2025 23:43:19 +0000 (16:43 -0700)]
eth: mlx5: remove Kconfig co-dependency with VXLAN
mlx5 has a Kconfig co-dependency on VXLAN, even though it doesn't
call any VXLAN function (unlike mlxsw). Perhaps this dates back
to very old days when tunnel ports were fetched directly from
VXLAN.
Remove the dependency to allow MLX5=y + VXLAN=m kernel configs.
But still avoid compiling in the lib/vxlan code if VXLAN=n.
Russell King (Oracle) [Wed, 27 Aug 2025 13:27:47 +0000 (14:27 +0100)]
net: stmmac: mdio: clean up c22/c45 accessor split
The C45 accessors were setting the GR (register number) field twice,
once with the 16-bit register address truncated to five bits, and
then overwritten with the C45 devad. This is harmless since the field
was being cleared prior to being updated with the C45 devad, except
for the extra work.
Russell King (Oracle) [Wed, 27 Aug 2025 08:54:51 +0000 (09:54 +0100)]
net: stmmac: minor cleanups to stmmac_bus_clks_config()
stmmac_bus_clks_config() doesn't need to repeatedly dereference
priv->plat, as this remains the same throughout the function. Not only
does this detract from the function's readability, but it could cause
the value to be reloaded each time. Use a local variable.
Also, the final return can simply return zero, and we can dispense
with the initialiser for 'ret'.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/E1urBvf-000000002ii-37Ce@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Wed, 27 Aug 2025 08:41:48 +0000 (09:41 +0100)]
net: stmmac: mdio: use netdev_priv() directly
netdev_priv() is an inline function, taking a struct net_device
pointer. When passing in the MII bus->priv, which is a void pointer,
there is no need to go via a local ndev variable to type it first.
Sky Huang [Wed, 27 Aug 2025 04:47:55 +0000 (12:47 +0800)]
net: phy: mtk-2p5ge: Add LED support for MT7988
Add LED support for MT7988's built-in 2.5G PHY. The LED hardware has almost
the same design as MT7981's/MT7988's built-in GbE, so hook the same
helper function here.
Before mtk_phy_leds_state_init(), set correct default values of LED0
and LED1.
Linus Torvalds [Thu, 28 Aug 2025 23:34:32 +0000 (16:34 -0700)]
Merge tag 'pm-6.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"Add missing locking annotations to two recently introduced
list_for_each_entry_rcu() loops in the core device suspend/resume
code (Johannes Berg)"
* tag 'pm-6.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: sleep: annotate RCU list iterations
Jakub Kicinski [Wed, 27 Aug 2025 17:35:58 +0000 (10:35 -0700)]
selftests: drv-net: rss_ctx: fix the queue count check
Commit 0d6ccfe6b319 ("selftests: drv-net: rss_ctx: check for all-zero keys")
added a skip exception for NICs with fewer than 3 queues enabled,
but it only constructs the exception object; it never actually raises
it.
Before:
# Exception| net.lib.py.utils.CmdExitFailure: Command failed: ethtool -X enp1s0 equal 3 hkey d1:cc:77:47:9d:ea:15:f2:b9:6c:ef:68:62:c0:45:d5:b0:99:7d:cf:29:53:40:06:3d:8e:b9:bc:d4:70:89:b8:8d:59:04:ea:a9:c2:21:b3:55:b8:ab:6b:d9:48:b4:bd:4c:ff:a5:f0:a8:c2
not ok 1 rss_ctx.test_rss_key_indir
After:
ok 1 rss_ctx.test_rss_key_indir # SKIP Device has fewer than 3 queues (or doesn't support queue stats)
====================
devmem/io_uring: allow more flexibility for ZC DMA devices
For TCP zerocopy rx (io_uring, devmem), there is an assumption that the
parent device can do DMA. However that is not always the case:
- Scalable Function netdevs [1] have the DMA device in the grandparent.
- For Multi-PF netdevs [2] queues can be associated to different DMA
devices.
The series adds an API for getting the DMA device for a netdev queue.
Drivers that have special requirements can implement the newly added
queue management op. Otherwise the parent will still be used as before.
This series continues with switching to this API for io_uring zcrx and
devmem and adds a ndo_queue_dma_dev op for mlx5.
The last part of the series changes devmem rx bind to get the DMA device
per queue and blocks the case when multiple queues use different DMA
devices. The tx bind is left as is.
Dragos Tatulea [Wed, 27 Aug 2025 14:40:01 +0000 (17:40 +0300)]
net: devmem: allow binding on rx queues with same DMA devices
Multi-PF netdevs have queues belonging to different PFs which also means
different DMA devices. This means that the binding on the DMA buffer can
be done to the incorrect device.
This change allows devmem binding to multiple queues only when the
queues have the same DMA device. Otherwise an error is returned.
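The "same DMA device or error" rule can be sketched as a loop over the queues being bound. All names and the specific error codes are hypothetical; the real code works on netdev rx queues.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a DMA-capable device. */
struct sketch_device {
	int id;
};

/*
 * queue_dma_dev[i] is the DMA device reported for the i-th queue.
 * On success, *out is the single device shared by all queues.
 */
static int sketch_pick_dma_dev(struct sketch_device *const *queue_dma_dev,
			       size_t n_queues,
			       struct sketch_device **out)
{
	struct sketch_device *dev = NULL;
	size_t i;

	for (i = 0; i < n_queues; i++) {
		if (!queue_dma_dev[i])
			return -EOPNOTSUPP;	/* queue cannot do DMA */
		if (dev && dev != queue_dma_dev[i])
			return -EINVAL;		/* mixed DMA devices */
		dev = queue_dma_dev[i];
	}

	if (!dev)
		return -EINVAL;			/* no queues given */

	*out = dev;
	return 0;
}
```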
Dragos Tatulea [Wed, 27 Aug 2025 14:39:58 +0000 (17:39 +0300)]
net/mlx5e: add op for getting netdev DMA device
For zero-copy (devmem, io_uring), the netdev DMA device used
is the parent device of the net device. However that is not
always accurate for mlx5 devices:
- SFs: The parent device is an auxdev.
- Multi-PF netdevs: The DMA device should be determined by
the queue.
This change implements the DMA device queue API that returns the DMA
device appropriately for all cases.
Dragos Tatulea [Wed, 27 Aug 2025 14:39:57 +0000 (17:39 +0300)]
net: devmem: get netdev DMA device via new API
Switch to the new API for fetching DMA devices for a netdev. The API is
called with queue index 0 for now, which is equivalent to the previous
behavior.
This patch will allow devmem to work with devices where the DMA device
is not stored in the parent device. mlx5 SFs are an example of such a
device.
Multi-PF netdevs are still problematic (as they were before this
change). Upcoming patches will address this for the rx binding.
Dragos Tatulea [Wed, 27 Aug 2025 14:39:56 +0000 (17:39 +0300)]
io_uring/zcrx: add support for custom DMA devices
Use the new API for getting a DMA device for a specific netdev queue.
This patch will allow io_uring zero-copy rx to work with devices
where the DMA device is not stored in the parent device. mlx5 SFs
are an example of such a device.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://patch.msgid.link/20250827144017.1529208-4-dtatulea@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dragos Tatulea [Wed, 27 Aug 2025 14:39:55 +0000 (17:39 +0300)]
queue_api: add support for fetching per queue DMA dev
For zerocopy (io_uring, devmem), there is an assumption that the
parent device can do DMA. However that is not always the case:
- Scalable Function netdevs [1] have the DMA device in the grandparent.
- For Multi-PF netdevs [2] queues can be associated to different DMA
devices.
This patch introduces a queue-based interface that allows drivers
to expose a different DMA device for zerocopy.
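One way to picture such an interface is an optional per-queue callback with a fallback to the parent device. The types, the callback name and the example driver below are all hypothetical and much simpler than the real queue management ops.

```c
#include <stddef.h>

struct sketch_dev {
	const char *name;
};

struct sketch_netdev;

/* Optional driver hook: which device does DMA for this queue? */
struct sketch_queue_ops {
	struct sketch_dev *(*get_dma_dev)(struct sketch_netdev *nd, int idx);
};

struct sketch_netdev {
	struct sketch_dev *parent;
	const struct sketch_queue_ops *ops;
};

/* Use the driver hook when present, else fall back to the parent. */
static struct sketch_dev *sketch_queue_dma_dev(struct sketch_netdev *nd,
					       int idx)
{
	if (nd->ops && nd->ops->get_dma_dev)
		return nd->ops->get_dma_dev(nd, idx);

	return nd->parent;
}

/* Example driver hook: every queue uses one device that is not the parent. */
static struct sketch_dev sketch_example_dma = { "example-dma" };

static struct sketch_dev *sketch_example_get_dma_dev(struct sketch_netdev *nd,
						     int idx)
{
	(void)nd;
	(void)idx;
	return &sketch_example_dma;
}
```

Drivers without special requirements leave the hook unset and keep the previous parent-device behaviour.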
Linus Torvalds [Thu, 28 Aug 2025 23:04:14 +0000 (16:04 -0700)]
Merge tag 'dma-mapping-6.17-2025-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping fixes from Marek Szyprowski:
- another small fix for arm64 systems with memory encryption (Shanker
Donthineni)
- fix for arm32 systems with non-standard CMA configuration (Oreoluwa
Babatunde)
* tag 'dma-mapping-6.17-2025-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
dma/pool: Ensure DMA_DIRECT_REMAP allocations are decrypted
of: reserved_mem: Restructure call site for dma_contiguous_early_fixup()
Linus Torvalds [Thu, 28 Aug 2025 22:46:06 +0000 (15:46 -0700)]
Merge tag 'fixes-2025-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock fixes from Mike Rapoport:
- printk cleanups in memblock and numa_memblks
- update kernel-doc for MEMBLOCK_RSRV_NOINIT to be more accurate and
detailed
* tag 'fixes-2025-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
memblock: fix kernel-doc for MEMBLOCK_RSRV_NOINIT
mm: numa,memblock: Use SZ_1M macro to denote bytes to MB conversion
mm/numa_memblks: Use pr_debug instead of printk(KERN_DEBUG)
Linus Torvalds [Thu, 28 Aug 2025 22:39:06 +0000 (15:39 -0700)]
Merge tag 'powerpc-6.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Madhavan Srinivasan:
- Merge two CONFIG_POWERPC64_CPU entries in Kconfig.cputype
- Replace extra-y to always-y in Makefile
- Cleanup to use dev_fwnode helper
- Fix misleading comment in kvmppc_prepare_to_enter()
- misc cleanup and fixes
Thanks to Amit Machhiwal, Andrew Donnellan, Christophe Leroy, Gautam
Menghani, Jiri Slaby (SUSE), Masahiro Yamada, Shrikanth Hegde, Stephen
Rothwell, Venkat Rao Bagalkote, and Xichao Zhao
* tag 'powerpc-6.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/boot/install.sh: Fix shellcheck warnings
powerpc/prom_init: Fix shellcheck warnings
powerpc/kvm: Fix ifdef to remove build warning
powerpc: unify two CONFIG_POWERPC64_CPU entries in the same choice block
powerpc: use always-y instead of extra-y in Makefiles
powerpc/64: Drop unnecessary 'rc' variable
powerpc: Use dev_fwnode()
KVM: PPC: Fix misleading interrupts comment in kvmppc_prepare_to_enter()
====================
fbnic: Synchronize address handling with BMC
The fbnic driver needs to communicate with the BMC if it is operating on
the RMII-based transport (RBT) of the same port the host is on. To enable
this we need to add rules that will route BMC traffic to the RBT/BMC and
the BMC and firmware need to configure rules on the RBT side of the
interface to route traffic from the BMC to the host instead of the MAC.
To enable that this patch set addresses two issues. First it will cause the
TCAM to be reconfigured in the event that the BMC was not previously
present when the driver was loaded, but the FW sends a notification that
the FW capabilities have changed and a BMC w/ various MAC addresses is now
present. Second it adds support for sending a message to the firmware so
that if the host adds additional MAC addresses the FW can be made aware and
route traffic for those addresses from the RBT to the host instead of the
MAC.
====================
Alexander Duyck [Tue, 26 Aug 2025 19:45:07 +0000 (12:45 -0700)]
fbnic: Push local unicast MAC addresses to FW to populate TCAMs
The MACDA TCAM can only be accessed by one entity at a time and as such we
cannot have simultaneous reads from the firmware to probe for changes from
the host. Instead we have to send a message indicating the state of
the MACDA to the firmware whenever we update it so that the firmware can
sync up the TCAMs it owns to route BMC packets to the host.
To support that we are adding a new message that is invoked when we write
the MACDA that will notify the firmware of updates from the host and allow
it to sync up the TCAM configuration to match the one on the host side.
Alexander Duyck [Tue, 26 Aug 2025 19:45:01 +0000 (12:45 -0700)]
fbnic: Add logic to repopulate RPC TCAM if BMC enables channel
The BMC itself can decide to abandon a link and move onto another link in
the event of things such as a link flap. As a result the driver may load
with the BMC not present, and then needs to update things to support the
BMC being present while the link is up and the NIC is passing traffic.
To support this we add support to the watchdog to reinitialize the RPC to
support adding the BMC unicast, multicast, and multicast promiscuous
filters while the link is up and the NIC owns the link.
Alexander Duyck [Tue, 26 Aug 2025 19:44:54 +0000 (12:44 -0700)]
fbnic: Pass fbnic_dev instead of netdev to __fbnic_set/clear_rx_mode
To make the __fbnic_set_rx_mode and __fbnic_clear_rx_mode calls usable from
more points in the code, make them expect a fbnic_dev pointer
instead of a netdev pointer.
Alexander Duyck [Tue, 26 Aug 2025 19:44:47 +0000 (12:44 -0700)]
fbnic: Move promisc_sync out of netdev code and into RPC path
In order for us to support the BMC possibly connecting, disconnecting, and
then reconnecting we need to be able to support entities outside of just
the NIC setting up promiscuous mode as the BMC can use a multicast
promiscuous setup.
To support that we should move the promisc_sync code out of the netdev and
into the RPC section of the driver so that it is reachable from more paths.
Eric Dumazet [Tue, 26 Aug 2025 12:50:31 +0000 (12:50 +0000)]
inet: raw: add drop_counters to raw sockets
When a packet flood hits one or more RAW sockets, many cpus
have to update sk->sk_drops.
This slows down other cpus, because currently
sk_drops is in sock_write_rx group.
Add a socket_drop_counters structure to raw sockets.
Using dedicated cache lines to hold drop counters
makes sure that consumers no longer suffer from
false sharing if/when producers only change sk->sk_drops.
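The layout idea can be sketched in plain C11. This is only an illustration under stated assumptions: the struct and function names are hypothetical, a 64-byte cache line is assumed, and the choice of two counters is an assumption picked so that the size matches the 128 bytes quoted for UDP sockets elsewhere in this log. It is not the kernel's actual definition.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_CACHE_LINE 64	/* assumed cache line size */

/*
 * Hypothetical stand-in for a dedicated drop counter block. Each counter
 * sits on its own cache line, so producers bumping a counter do not dirty
 * the line holding fields that the consumer reads.
 */
struct drop_counters_sketch {
	alignas(SKETCH_CACHE_LINE) atomic_int drops0;
	alignas(SKETCH_CACHE_LINE) atomic_int drops1;
};

/* Two aligned counters occupy exactly two cache lines. */
static_assert(sizeof(struct drop_counters_sketch) == 2 * SKETCH_CACHE_LINE,
	      "each counter should own one cache line");

/* Producer side: pick a counter (here by an assumed node id) and bump it. */
static inline void sketch_count_drop(struct drop_counters_sketch *dc, int node)
{
	atomic_int *ctr = (node & 1) ? &dc->drops1 : &dc->drops0;

	atomic_fetch_add_explicit(ctr, 1, memory_order_relaxed);
}

/* Reader side: the total is the sum of both counters. */
static inline int sketch_read_drops(struct drop_counters_sketch *dc)
{
	return atomic_load_explicit(&dc->drops0, memory_order_relaxed) +
	       atomic_load_explicit(&dc->drops1, memory_order_relaxed);
}
```

Relaxed ordering is used here because a drop counter is a statistic and nothing else is ordered against it.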
Eric Dumazet [Tue, 26 Aug 2025 12:50:30 +0000 (12:50 +0000)]
udp: add drop_counters to udp socket
When a packet flood hits one or more UDP sockets, many cpus
have to update sk->sk_drops.
This slows down other cpus, because currently
sk_drops is in sock_write_rx group.
Add a socket_drop_counters structure to udp sockets.
Using dedicated cache lines to hold drop counters
makes sure that consumers no longer suffer from
false sharing if/when producers only change sk->sk_drops.
This adds 128 bytes per UDP socket.
Tested with the following stress test, sending about 11 Mpps
to a dual socket AMD EPYC 7B13 64-Core.
super_netperf 20 -t UDP_STREAM -H DUT -l10 -- -n -P,1000 -m 120
Note: due to socket lookup, only one UDP socket is receiving
packets on DUT.
Then measure receiver (DUT) behavior. We can see both
consumer and BH handlers can process more packets per second.
Jakub Kicinski [Mon, 25 Aug 2025 20:18:28 +0000 (13:18 -0700)]
uapi: wrap compiler_types.h in an ifdef instead of the implicit strip
The uAPI stddef header includes compiler_types.h, a kernel-only
header, to make sure that kernel definitions of annotations
like __counted_by() take precedence.
There is a hack in scripts/headers_install.sh which strips includes
of compiler.h and compiler_types.h when installing uAPI headers.
While explicit handling makes sense for compiler.h, which is included
all over the uAPI, compiler_types.h is only included by stddef.h
(within the uAPI, obviously it's included in kernel code a lot).
Remove the stripping from scripts/headers_install.sh and wrap
the include of compiler_types.h in #ifdef __KERNEL__ instead.
This should be functionally equivalent, but is easier to understand
for a casual reader of the code. It also makes it easier to work
with kernel headers directly from under tools/.
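A rough sketch of the resulting pattern follows. The guarded include is what the text describes; the no-op fallback for __counted_by() and the struct sketch_msg type are assumptions added only so the fragment compiles on its own outside the kernel.

```c
#include <stddef.h>

/* Kernel builds define __KERNEL__ and get the kernel-only annotations. */
#ifdef __KERNEL__
#include <linux/compiler_types.h>
#endif

/* Assumed userspace fallback: the annotation expands to nothing. */
#ifndef __counted_by
#define __counted_by(member)
#endif

/* Hypothetical uAPI-style struct using the annotation. */
struct sketch_msg {
	unsigned int len;
	unsigned char data[] __counted_by(len);
};
```

With the guard in the header itself, nothing needs to be stripped at install time: a userspace build simply never sees the include.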
Oscar Maes [Wed, 27 Aug 2025 06:23:21 +0000 (08:23 +0200)]
net: ipv4: fix regression in local-broadcast routes
Commit 9e30ecf23b1b ("net: ipv4: fix incorrect MTU in broadcast routes")
introduced a regression where local-broadcast packets would have their
gateway set in __mkroute_output, which was caused by fi = NULL being
removed.
Fix this by resetting the fib_info for local-broadcast packets. This
preserves the intended changes for directed-broadcast packets.
Cc: stable@vger.kernel.org
Fixes: 9e30ecf23b1b ("net: ipv4: fix incorrect MTU in broadcast routes")
Reported-by: Brett A C Sheffield <bacs@librecast.net>
Closes: https://lore.kernel.org/regressions/20250822165231.4353-4-bacs@librecast.net
Signed-off-by: Oscar Maes <oscmaes92@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250827062322.4807-1-oscmaes92@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Neil Mandir [Tue, 26 Aug 2025 14:30:22 +0000 (10:30 -0400)]
net: macb: Disable clocks once
When the driver is removed the clocks are disabled twice: once in
macb_remove and a second time by runtime pm. Disable wakeup in remove so
all the clocks are disabled and skip the second call to macb_clks_disable.
Always suspend the device as we always set it active in probe.
Fixes: d54f89af6cc4 ("net: macb: Add pm runtime support")
Signed-off-by: Neil Mandir <neil.mandir@seco.com>
Co-developed-by: Sean Anderson <sean.anderson@linux.dev>
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Link: https://patch.msgid.link/20250826143022.935521-1-sean.anderson@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Miri Korenblit [Tue, 26 Aug 2025 15:55:04 +0000 (18:55 +0300)]
wifi: iwlwifi: refactor iwl_pnvm_get_from_fs
Instead of having an error code or 0 as a return value and passing a
pointer to a pointer to be set by this function, change it to return a
pointer, and use NULL as an error indication.
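The shape of this kind of refactor can be sketched as below. Both functions are hypothetical stand-ins, not the driver's code: the names, the length out-parameter, and the string source are assumptions made to keep the example self-contained.

```c
#include <stdlib.h>
#include <string.h>

/* Before: status code as return value, buffer handed back via a pointer. */
static int sketch_blob_get_old(const char *src, unsigned char **data,
			       size_t *len)
{
	if (!src || !*src)
		return -1;

	*len = strlen(src);
	*data = malloc(*len);
	if (!*data)
		return -1;

	memcpy(*data, src, *len);
	return 0;
}

/* After: the buffer is the return value and NULL signals any failure. */
static unsigned char *sketch_blob_get_new(const char *src, size_t *len)
{
	unsigned char *data;

	if (!src || !*src)
		return NULL;

	*len = strlen(src);
	data = malloc(*len);
	if (!data)
		return NULL;

	memcpy(data, src, *len);
	return data;
}
```

The caller loses the ability to tell failure reasons apart, which is acceptable when every failure is handled the same way.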
Miri Korenblit [Tue, 26 Aug 2025 15:55:02 +0000 (18:55 +0300)]
wifi: iwlwifi: mld: don't modify trans state where not needed
In suspend and resume flows, if we had any error we set the transport state to
'FW_ERROR'. This was done to avoid sending commands when we shouldn't.
In the mentioned flows, we can have a few types of errors:
1. logic errors
2. FW is in error state (can't send commands)
3. FW is misbehaving
4. D3 handshake error
In the first, we can still talk to the firmware.
In the second - the transport already knows about the FW error, no need to tell it.
In the third - we need to treat it as any other FW misbehaviour. There is no reason
to have a special handling here.
So we only need it for the last type. Change the code to set the
transport state to FW error only in case of a D3 handshake error.
While at it, add a comment explaining why the opmode sets the FW error
bits.
Miri Korenblit [Tue, 26 Aug 2025 15:55:01 +0000 (18:55 +0300)]
wifi: iwlwifi: simplify iwl_trans_pcie_d3_resume
If iwl_trans_d3_resume succeeded but the hw requested a reset, this will
be indicated to the opmode via the iwl_d3_status parameter while the return
value will be 0.
But the opmode doesn't really care if the resume failed or if a restart
is required. It acts the same in both cases (besides different logs, but
this can be done in iwl_trans_pcie_d3_resume).
This complicates the code for no good reason.
Change iwl_trans_pcie_d3_resume to return an error value also in the
case that everything went successfully but a restart is required,
and add more logs so we can differentiate between the cases.
This makes iwl_d3_status redundant. Remove it as well.
Miri Korenblit [Tue, 26 Aug 2025 15:55:00 +0000 (18:55 +0300)]
wifi: iwlwifi: trans: remove STATUS_SUSPENDED
We needed this bit to prevent sending host commands when suspended in
pseudo mode (in real suspension we can't send commands anyway).
Now that pseudo mode is removed, we no longer need it.
Remove.
Miri Korenblit [Tue, 26 Aug 2025 15:54:58 +0000 (18:54 +0300)]
wifi: iwlwifi: remove dump file name extension support
The option to configure a dump file name extension was added for 2
cases:
1. if we dump because of a missed beacon, we added the mac id and type
to the filename.
2. to add the error id of the LMAC/UMAC/TCM/RCM error to the file
name.
For 1, there is a bug: if a missed beacon is not configured to trigger
a dump (for example in the default preset), a missed beacon occurs,
and a dump is later taken for a different reason,
the dump file name will contain the mac type and id even though the
dump has nothing to do with a missed beacon.
Anyway, both cases are no longer required. Remove the code.
Johannes Berg [Tue, 26 Aug 2025 15:54:56 +0000 (18:54 +0300)]
wifi: iwlwifi: add a new FW file numbering scheme
Firmware releases follow a "Core N" pattern, but due to some
historical accidents, the API number for a Core N has always
been N+3. That's confusing for everyone.
Future firmware releases will use new file names that, instead of
being named with the API number, will be named with the core number.
For example, for the next one for bz/fm it'd be
"iwlwifi-bz-b0-fm-c0-c99.ucode" instead of the now expected
"iwlwifi-bz-b0-fm-c0-102.ucode".
In the driver, represent that as an offset of 1000, and then
request the "c<core>" format instead of just "<api>". When
looking for older versions, skip from 1099 to 101 (which is
core 98).
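The numbering can be illustrated with a small sketch. The offset of 1000, the N+3 relation, and the 1099 to 101 step come from the description above; the macro and function names are hypothetical, and treating core 99 as the first core-named release is inferred from the file name example rather than taken from the driver.

```c
#include <stdio.h>

#define SKETCH_CORE_OFFSET	1000	/* core-named versions are 1000 + core */
#define SKETCH_FIRST_CORE	99	/* first "c<core>" release (inferred) */
#define SKETCH_CORE_TO_API	3	/* historically API = core + 3 */

/* Format the version part of the file name: "c<core>" or "<api>". */
static int sketch_fw_version_tag(int ver, char *buf, size_t len)
{
	if (ver >= SKETCH_CORE_OFFSET)
		return snprintf(buf, len, "c%d", ver - SKETCH_CORE_OFFSET);

	return snprintf(buf, len, "%d", ver);
}

/* Step to the next older version, jumping across the scheme change. */
static int sketch_fw_prev_version(int ver)
{
	if (ver == SKETCH_CORE_OFFSET + SKETCH_FIRST_CORE)
		return SKETCH_FIRST_CORE - 1 + SKETCH_CORE_TO_API;

	return ver - 1;
}
```

Under these assumptions, 1099 formats as "c99", and the next older version tried is 101, which formats as "101".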
Somashekhar Puttagangaiah [Tue, 26 Aug 2025 15:54:51 +0000 (18:54 +0300)]
wifi: iwlwifi: mld: trigger mlo scan only when not in EMLSR
When beacon loss happens or the RSSI drops, trigger MLO scan only
if not in EMLSR. The link switch was meant to be done when we are
not in EMLSR and we can try to switch to a better link.
If in EMLSR, we exit first and then trigger MLO scan.
Miri Korenblit [Tue, 26 Aug 2025 15:54:50 +0000 (18:54 +0300)]
wifi: iwlwifi: mld: don't check the cipher on resume
On resume, we are iterating all the keys in order to update the PN.
Currently we check the cipher of the key we are currently iterating on
to decide whether the key is PTK, GTK, IGTK or BIGTK.
But we can find the type of the key by the keyidx, and we have to
check the keyidx anyway, so just remove the cipher switch case and check only
the keyidx instead.
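A minimal sketch of classifying a key by index, assuming the usual 802.11 key ID ranges (0 to 3 for pairwise and group keys, 4 and 5 for IGTK, 6 and 7 for BIGTK). The enum, the function name, and the pairwise flag are hypothetical. The flag is an assumption of this sketch, added because index alone may not separate a PTK from a GTK; it is not taken from the driver.

```c
#include <stdbool.h>

enum sketch_key_type {
	SKETCH_KEY_PTK,
	SKETCH_KEY_GTK,
	SKETCH_KEY_IGTK,
	SKETCH_KEY_BIGTK,
	SKETCH_KEY_UNKNOWN,
};

/* Classify a key without looking at its cipher. */
static enum sketch_key_type sketch_key_type(int keyidx, bool pairwise)
{
	if (pairwise)
		return SKETCH_KEY_PTK;

	switch (keyidx) {
	case 0: case 1: case 2: case 3:
		return SKETCH_KEY_GTK;
	case 4: case 5:
		return SKETCH_KEY_IGTK;
	case 6: case 7:
		return SKETCH_KEY_BIGTK;
	default:
		return SKETCH_KEY_UNKNOWN;
	}
}
```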
Linus Torvalds [Thu, 28 Aug 2025 02:18:51 +0000 (19:18 -0700)]
Merge tag 'perf-tools-fixes-for-v6.17-2025-08-27' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf-tools fixes from Namhyung Kim:
"A number of kernel header sync changes and two build-id fixes"
* tag 'perf-tools-fixes-for-v6.17-2025-08-27' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
perf symbol: Add blocking argument to filename__read_build_id
perf symbol-minimal: Fix ehdr reading in filename__read_build_id
tools headers: Sync uapi/linux/vhost.h with the kernel source
tools headers: Sync uapi/linux/prctl.h with the kernel source
tools headers: Sync uapi/linux/fs.h with the kernel source
tools headers: Sync uapi/linux/fcntl.h with the kernel source
tools headers: Sync syscall tables with the kernel source
tools headers: Sync powerpc headers with the kernel source
tools headers: Sync arm64 headers with the kernel source
tools headers: Sync x86 headers with the kernel source
tools headers: Sync linux/cfi_types.h with the kernel source
tools headers: Sync linux/bits.h with the kernel source
tools headers: Sync KVM headers with the kernel source
perf test: Fix a build error in x86 topdown test
Jakub Kicinski [Thu, 28 Aug 2025 01:57:13 +0000 (18:57 -0700)]
Merge branch 'locking-fixes-for-fbnic-driver'
Alexander Duyck says:
====================
Locking fixes for fbnic driver
Address a few locking issues that were reported on the fbnic driver.
Specifically in one case we were seeing locking leaks due to us not
releasing the locks in certain exception paths. In another case we were
using phylink_resume outside of a section in which we held the RTNL mutex
and as a result we were throwing an assert.
====================