www.infradead.org Git - users/willy/linux.git/log
2 months ago Merge branch 'net-dsa-b53-mmap-add-bcm63268-gphy-power-control'
Jakub Kicinski [Fri, 15 Aug 2025 00:54:02 +0000 (17:54 -0700)]
Merge branch 'net-dsa-b53-mmap-add-bcm63268-gphy-power-control'

Kyle Hendry says:

====================
net: dsa: b53: mmap: Add bcm63268 GPHY power control

The gpio controller on the bcm63268 has a register for
controlling the gigabit phy power. These patches disable
low power mode when enabling the gphy port.

This is based on an earlier patch series here:
https://lore.kernel.org/20250306053105.41677-1-kylehendrydev@gmail.com

I have created a new series since many of the changes
were included in the ephy control patches:
https://lore.kernel.org/20250724035300.20497-1-kylehendrydev@gmail.com
====================

Link: https://patch.msgid.link/20250814002530.5866-1-kylehendrydev@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago net: dsa: b53: mmap: Implement bcm63268 gphy power control
Kyle Hendry [Thu, 14 Aug 2025 00:25:28 +0000 (17:25 -0700)]
net: dsa: b53: mmap: Implement bcm63268 gphy power control

Add check for gphy in enable/disable phy calls and set power bits
in gphy control register.

Signed-off-by: Kyle Hendry <kylehendrydev@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250814002530.5866-3-kylehendrydev@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago net: dsa: b53: mmap: Add gphy port to phy info for bcm63268
Kyle Hendry [Thu, 14 Aug 2025 00:25:27 +0000 (17:25 -0700)]
net: dsa: b53: mmap: Add gphy port to phy info for bcm63268

Add gphy mask to bcm63xx phy info struct and add data for bcm63268

Signed-off-by: Kyle Hendry <kylehendrydev@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250814002530.5866-2-kylehendrydev@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago sfc: replace min/max nesting with clamp()
Xichao Zhao [Tue, 12 Aug 2025 06:50:26 +0000 (14:50 +0800)]
sfc: replace min/max nesting with clamp()

The clamp() macro explicitly expresses the intent of constraining
a value within bounds. Therefore, replacing min(max(val, lo), hi) with
clamp(val, lo, hi) can improve code readability.
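
A rough user-space illustration of the equivalence described above. The
helpers below are simplified stand-ins written for this sketch, not the
kernel's type-checked min()/max()/clamp() macros, and they assume
lo <= hi:

```c
#include <assert.h>

/* Simplified stand-ins for this sketch; the kernel macros add type checks. */
static inline int min_int(int a, int b) { return a < b ? a : b; }
static inline int max_int(int a, int b) { return a > b ? a : b; }

/* Nested form: raise val to at least lo, then cap it at hi. */
static inline int clamp_nested(int val, int lo, int hi)
{
	return min_int(max_int(val, lo), hi);
}

/* Same result for lo <= hi, with the intent stated by the name. */
static inline int clamp_int(int val, int lo, int hi)
{
	assert(lo <= hi);
	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}
```

For any val, clamp_nested(val, lo, hi) and clamp_int(val, lo, hi) should
agree as long as lo <= hi; the second spelling only makes the bounds
easier to spot.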

Signed-off-by: Xichao Zhao <zhao.xichao@vivo.com>
Reviewed-by: Joe Damato <joe@dama.to>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20250812065026.620115-1-zhao.xichao@vivo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago Merge branch 'bridge-redirect-to-backup-port-when-port-is-administratively-down'
Jakub Kicinski [Fri, 15 Aug 2025 00:45:39 +0000 (17:45 -0700)]
Merge branch 'bridge-redirect-to-backup-port-when-port-is-administratively-down'

Ido Schimmel says:

====================
bridge: Redirect to backup port when port is administratively down

Patch #1 amends the bridge to redirect to the backup port when the
primary port is administratively down and not only when it does not have
a carrier. See the commit message for more details.

Patch #2 extends the bridge backup port selftest to cover this case.
====================

Link: https://patch.msgid.link/20250812080213.325298-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago selftests: net: Test bridge backup port when port is administratively down
Ido Schimmel [Tue, 12 Aug 2025 08:02:13 +0000 (11:02 +0300)]
selftests: net: Test bridge backup port when port is administratively down

Test that packets are redirected to the backup port when the primary
port is administratively down.

With the previous patch:

 # ./test_bridge_backup_port.sh
 [...]
 TEST: swp1 administratively down                                    [ OK ]
 TEST: No forwarding out of swp1                                     [ OK ]
 TEST: Forwarding out of vx0                                         [ OK ]
 TEST: swp1 administratively up                                      [ OK ]
 TEST: Forwarding out of swp1                                        [ OK ]
 TEST: No forwarding out of vx0                                      [ OK ]
 [...]
 Tests passed:  89
 Tests failed:   0

Without the previous patch:

 # ./test_bridge_backup_port.sh
 [...]
 TEST: swp1 administratively down                                    [ OK ]
 TEST: No forwarding out of swp1                                     [ OK ]
 TEST: Forwarding out of vx0                                         [FAIL]
 TEST: swp1 administratively up                                      [ OK ]
 TEST: Forwarding out of swp1                                        [ OK ]
 [...]
 Tests passed:  85
 Tests failed:   4

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20250812080213.325298-3-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago bridge: Redirect to backup port when port is administratively down
Ido Schimmel [Tue, 12 Aug 2025 08:02:12 +0000 (11:02 +0300)]
bridge: Redirect to backup port when port is administratively down

If a backup port is configured for a bridge port, the bridge will
redirect known unicast traffic towards the backup port when the primary
port is administratively up but without a carrier. This is useful, for
example, in MLAG configurations where a system is connected to two
switches and there is a peer link between both switches. The peer link
serves as the backup port in case one of the switches loses its
connection to the multi-homed system.

In order to avoid flooding when the primary port loses its carrier, the
bridge does not flush dynamic FDB entries pointing to the port upon STP
disablement, if the port has a backup port.

The above means that known unicast traffic destined to the primary port
will be blackholed when the port is put administratively down, until the
FDB entries pointing to it are aged-out.

Given that the current behavior is quite weird and unlikely to be
depended on by anyone, amend the bridge to redirect to the backup port
also when the primary port is administratively down and not only when it
does not have a carrier.

The change is motivated by a report from a user who expected traffic to
be redirected to the backup port when the primary port was put
administratively down while debugging a network issue.
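
The behavior change can be modeled roughly as follows. This is a
simplified user-space sketch, not the bridge code: struct port and both
helpers are hypothetical, and a NULL return stands for traffic being
blackholed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a bridge port for this sketch. */
struct port {
	const char *name;
	bool admin_up;		/* interface is administratively up */
	bool carrier;		/* link carrier is present */
	struct port *backup;	/* optional backup port, NULL if none */
};

/* Old behavior: redirect only when the port is up but has no carrier. */
static const struct port *egress_old(const struct port *p)
{
	assert(p != NULL);
	if (!p->admin_up)
		return NULL;	/* blackholed until the FDB entries age out */
	if (!p->carrier && p->backup)
		return p->backup;
	return p;
}

/* New behavior: also redirect when the port is administratively down. */
static const struct port *egress_new(const struct port *p)
{
	assert(p != NULL);
	if (p->backup && (!p->admin_up || !p->carrier))
		return p->backup;
	return p;
}
```

With swp1 administratively down and vx0 configured as its backup,
egress_old() drops the traffic while egress_new() hands it to vx0.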

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20250812080213.325298-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago selftests: drv-net: wait for carrier
Jakub Kicinski [Tue, 12 Aug 2025 14:20:54 +0000 (07:20 -0700)]
selftests: drv-net: wait for carrier

On fast machines the tests run in quick succession so even
when tests clean up after themselves the carrier may need
some time to come back.

Specifically in NIPA when ping.py runs right after netpoll_basic.py
the first ping command fails.

Since the context manager callbacks are now common, NetDrvEpEnv
gets an ip link up call as well.

Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20250812142054.750282-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago net: phy: mscc: report and configure in-band auto-negotiation for SGMII/QSGMII
Vladimir Oltean [Wed, 13 Aug 2025 07:44:54 +0000 (10:44 +0300)]
net: phy: mscc: report and configure in-band auto-negotiation for SGMII/QSGMII

The following Vitesse/Microsemi/Microchip PHYs, among those supported by
this driver, have the host interface configurable as SGMII or QSGMII:
- VSC8504
- VSC8514
- VSC8552
- VSC8562
- VSC8572
- VSC8574
- VSC8575
- VSC8582
- VSC8584

All these PHYs are documented to have bit 7 of "MAC SerDes PCS Control"
as "MAC SerDes ANEG enable".

Out of these, I could test the VSC8514 quad PHY in QSGMII. This works
both with the in-band autoneg on and off, on the NXP LS1028A-RDB and
T1040-RDB boards.

Notably, the bit is sticky (survives soft resets), so giving Linux the
tools to read and modify this setting makes it robust to changes made
to it by previous boot layers (U-Boot).
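
A minimal sketch of what reading and modifying that bit amounts to,
operating on a plain 16-bit register value. The macro and function names
are made up for this illustration and are not taken from the driver; the
only fact used from the text above is that the control sits in bit 7.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 7 of "MAC SerDes PCS Control"; the macro name is made up here. */
#define MAC_SERDES_ANEG_ENA (1u << 7)

/* Report whether in-band autoneg is enabled in the given register value. */
static bool inband_aneg_enabled(uint16_t pcs_ctrl)
{
	return (pcs_ctrl & MAC_SERDES_ANEG_ENA) != 0;
}

/* Return the register value with the autoneg bit set or cleared. */
static uint16_t set_inband_aneg(uint16_t pcs_ctrl, bool enable)
{
	uint16_t val;

	if (enable)
		val = pcs_ctrl | MAC_SERDES_ANEG_ENA;
	else
		val = pcs_ctrl & (uint16_t)~MAC_SERDES_ANEG_ENA;

	/* Only bit 7 may differ from the input. */
	assert(((val ^ pcs_ctrl) & ~MAC_SERDES_ANEG_ENA) == 0);
	return val;
}
```

Because the bit is sticky, a value left behind by an earlier boot stage
would be visible through inband_aneg_enabled() and can be overridden
with set_inband_aneg().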

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20250813074454.63224-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago net: dsa: realtek: remove unnecessary file, dentry, inode declarations
Vladimir Oltean [Wed, 13 Aug 2025 18:10:23 +0000 (21:10 +0300)]
net: dsa: realtek: remove unnecessary file, dentry, inode declarations

These have been present since commit d8652956cf37 ("net: dsa: realtek-smi:
Add Realtek SMI driver") and were never needed. Apparently the driver was
not cleaned up sufficiently for submission.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://patch.msgid.link/20250813181023.808528-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago net/sched: Use TC_RTAB_SIZE instead of magic number
Yue Haibing [Wed, 13 Aug 2025 12:55:26 +0000 (20:55 +0800)]
net/sched: Use TC_RTAB_SIZE instead of magic number

Replace magic number with TC_RTAB_SIZE to make it more informative.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Link: https://patch.msgid.link/20250813125526.853895-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago ptp: ptp_clockmatrix: Remove redundant semicolons
Liao Yuanhong [Wed, 13 Aug 2025 09:50:24 +0000 (17:50 +0800)]
ptp: ptp_clockmatrix: Remove redundant semicolons

Remove unnecessary semicolons.

Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com>
Link: https://patch.msgid.link/20250813095024.559085-1-liaoyuanhong@vivo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago Merge branch 'devlink-port-attr-cleanup'
Jakub Kicinski [Fri, 15 Aug 2025 00:35:23 +0000 (17:35 -0700)]
Merge branch 'devlink-port-attr-cleanup'

Parav Pandit says:

====================
devlink port attr cleanup

patch-1 removes the return 0 check at several places and simplifies
the callers
patch-2 constifies the attributes and moves the checks early
====================

Link: https://patch.msgid.link/20250813094417.7269-1-parav@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago devlink/port: Check attributes early and constify
Parav Pandit [Wed, 13 Aug 2025 09:44:17 +0000 (12:44 +0300)]
devlink/port: Check attributes early and constify

Constify the devlink port attributes to indicate they are read only
and do not depend on anything else. Therefore, validate them early,
before setting them in the devlink port.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Parav Pandit <parav@nvidia.com>
Link: https://patch.msgid.link/20250813094417.7269-3-parav@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago devlink/port: Simplify return checks
Parav Pandit [Wed, 13 Aug 2025 09:44:16 +0000 (12:44 +0300)]
devlink/port: Simplify return checks

Drop always returning 0 from the helper routine and simplify
its callers.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Parav Pandit <parav@nvidia.com>
Link: https://patch.msgid.link/20250813094417.7269-2-parav@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago nfc: pn533: Delete an unnecessary check
Dan Carpenter [Wed, 13 Aug 2025 05:51:22 +0000 (08:51 +0300)]
nfc: pn533: Delete an unnecessary check

The "rc" variable is set like this:

	if (IS_ERR(resp)) {
		rc = PTR_ERR(resp);

We know that "rc" is negative so there is no need to check.
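
Why "rc" must be negative can be seen from a small user-space model of
the error-pointer helpers. This is a sketch that mirrors the idea behind
ERR_PTR()/IS_ERR()/PTR_ERR(); it is not the kernel's implementation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer in the top 4095 addresses. */
static inline void *ERR_PTR(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Decode the errno value again. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for pointers in the reserved error range. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

Any pointer for which IS_ERR() is true decodes to a value between
-MAX_ERRNO and -1, so a later "rc < 0" test adds nothing.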

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/aJwn2ox5g9WsD2Vx@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago net: phy: realtek: convert RTL8226-CG to c45 only
Markus Stockhausen [Wed, 13 Aug 2025 05:44:07 +0000 (01:44 -0400)]
net: phy: realtek: convert RTL8226-CG to c45 only

Short: Convert the RTL8226-CG to c45 so it can be used in its
Realtek based ecosystems.

Long: The RTL8226-CG can be mainly found on devices of the
Realtek Otto switch platform. Devices like the Zyxel XGS1210-12
are based on it. These implement a hardware based phy polling
in the background to update SoC status registers.

The hardware provides 4 smi busses to which phys are attached.
For each bus one can decide if it is polled in c45 or c22 mode.
See https://svanheule.net/realtek/longan/register/smi_glb_ctrl
With this setting the register access will be limited by the
hardware. This is very complex (including caching and special
c45-over-c22 handling). But basically it boils down to "enable
protocol x and SoC will disable register access via protocol y".

Mainline already gained support for the rtl9300 mdio driver
in commit 24e31e474769 ("net: mdio: Add RTL9300 MDIO driver").

It covers the basic features, but a lot of effort is still needed
to understand the hardware properly. So it runs a simple setup by
selecting the proper bus mode during startup.

	/* Put the interfaces into C45 mode if required */
	glb_ctrl_mask = GENMASK(19, 16);
	for (i = 0; i < MAX_SMI_BUSSES; i++)
		if (priv->smi_bus_is_c45[i])
			glb_ctrl_val |= GLB_CTRL_INTF_SEL(i);
	...
	err = regmap_update_bits(regmap, SMI_GLB_CTRL,
				 glb_ctrl_mask, glb_ctrl_val);
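
As a rough illustration of the register update above, here is a
user-space sketch. BIT32() and GENMASK32() are 32-bit stand-ins for the
kernel helpers, update_bits() only imitates the read-modify-write style
of regmap_update_bits(), and the GLB_CTRL_INTF_SEL() layout (one select
bit per bus starting at bit 16) is an assumption inferred from the
GENMASK(19, 16) mask:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SMI_BUSSES 4

/* 32-bit stand-ins for the kernel's BIT() and GENMASK() helpers. */
#define BIT32(n)        (UINT32_C(1) << (n))
#define GENMASK32(h, l) ((UINT32_MAX >> (31 - (h))) & (UINT32_MAX << (l)))

/* Assumed layout: one select bit per bus, starting at bit 16. */
#define GLB_CTRL_INTF_SEL(i) BIT32(16 + (i))

/* Collect the select bits for every bus that should run in C45 mode. */
static uint32_t c45_select_bits(const bool is_c45[MAX_SMI_BUSSES])
{
	uint32_t val = 0;

	for (int i = 0; i < MAX_SMI_BUSSES; i++)
		if (is_c45[i])
			val |= GLB_CTRL_INTF_SEL(i);
	return val;
}

/* Read-modify-write: only the bits covered by mask change. */
static uint32_t update_bits(uint32_t old, uint32_t mask, uint32_t val)
{
	/* This sketch additionally insists that val stays inside mask. */
	assert((val & ~mask) == 0);
	return (old & ~mask) | (val & mask);
}
```

Under these assumptions, putting busses 0 and 2 into C45 mode yields a
value of 0x00050000 under the mask 0x000f0000, and all other register
bits are left alone.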

To avoid complex coding later on, it limits access by only
providing either c22 or c45:

	bus->name = "Realtek Switch MDIO Bus";
	if (priv->smi_bus_is_c45[mdio_bus]) {
		bus->read_c45 = rtl9300_mdio_read_c45;
		bus->write_c45 = rtl9300_mdio_write_c45;
	} else {
		bus->read = rtl9300_mdio_read_c22;
		bus->write = rtl9300_mdio_write_c22;
	}

Because of these limitations the existing RTL8226 phy driver
is not working at all on Realtek switches. Convert the driver
to c45-only.

Luckily the RTL8226 seems to support proper MDIO_PMA_EXTABLE
flags. So standard function genphy_c45_pma_read_abilities() can
call genphy_c45_pma_read_ext_abilities() and 10/100/1000 is
populated correctly. Thus the conversion is straightforward.

Outputs before (REMARK: For this a "hacked" bus was used that
toggles the mode for each c22/c45 access. But that is slow and
produces unstable data in the SoC status registers):

Settings for lan9:
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: Unknown!
        Duplex: Unknown! (255)
        Port: Twisted Pair
        PHYAD: 24
        Transceiver: external
        Auto-negotiation: on
        MDI-X: Unknown
        Supports Wake-on: d
        Wake-on: d
        Link detected: no

Outputs with this commit:

Settings for lan9:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: Unknown!
        Duplex: Unknown! (255)
        Port: Twisted Pair
        PHYAD: 24
        Transceiver: external
        Auto-negotiation: on
        MDI-X: Unknown
        Supports Wake-on: d
        Wake-on: d
        Link detected: no

Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
Link: https://patch.msgid.link/20250813054407.1108285-1-markus.stockhausen@gmx.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago net: phy: motorcomm: Add support for PHY LEDs on YT8521
Jijie Shao [Wed, 13 Aug 2025 12:45:42 +0000 (20:45 +0800)]
net: phy: motorcomm: Add support for PHY LEDs on YT8521

Add minimal LED controller driver supporting
the most common uses with the 'netdev' trigger.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250813124542.3450447-1-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago Merge tag 'docs/v6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Jakub Kicinski [Fri, 15 Aug 2025 00:26:37 +0000 (17:26 -0700)]
Merge tag 'docs/v6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-docs

Mauro Carvalho Chehab says:

====================
add a generic yaml parser integrated with Netlink specs generation

- A YAML parser Sphinx plugin, integrated with Netlink YAML doc
  parser.

The patch content is identical to my v10 submission:
https://lore.kernel.org/cover.1753718185.git.mchehab+huawei@kernel.org

* tag 'docs/v6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-docs:
  sphinx: parser_yaml.py: fix line numbers information
  docs: parser_yaml.py: fix backward compatibility with old docutils
  docs: parser_yaml.py: add support for line numbers from the parser
  tools: netlink_yml_parser.py: add line numbers to parsed data
  MAINTAINERS: add netlink_yml_parser.py to linux-doc
  docs: netlink: remove obsolete .gitignore from unused directory
  tools: ynl_gen_rst.py: drop support for generating index files
  docs: uapi: netlink: update netlink specs link
  docs: use parser_yaml extension to handle Netlink specs
  docs: sphinx: add a parser for yaml files for Netlink specs
  tools: ynl_gen_rst.py: cleanup coding style
  docs: netlink: index.rst: add a netlink index file
  tools: ynl_gen_rst.py: Split library from command line tool
  docs: netlink: netlink-raw.rst: use :ref: instead of :doc:
====================

Link: https://patch.msgid.link/20250812113329.356c93c2@foz.lan
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 17 Jul 2025 17:56:56 +0000 (10:56 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR (net-6.17-rc2).

No conflicts.

Adjacent changes:

drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
  d7a276a5768f ("net: stmmac: rk: convert to suspend()/resume() methods")
  de1e963ad064 ("net: stmmac: rk: put the PHY clock on remove")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago Merge tag 'net-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 14 Aug 2025 14:14:30 +0000 (07:14 -0700)]
Merge tag 'net-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from Netfilter and IPsec.

  Current release - regressions:

   - netfilter: nft_set_pipapo:
      - don't return bogus extension pointer
      - fix null deref for empty set

  Current release - new code bugs:

   - core: prevent deadlocks when enabling NAPIs with mixed kthread
     config

   - eth: netdevsim: Fix wild pointer access in nsim_queue_free().

  Previous releases - regressions:

   - page_pool: allow enabling recycling late, fix false positive
     warning

   - sched: ets: use old 'nbands' while purging unused classes

   - xfrm:
      - restore GSO for SW crypto
      - bring back device check in validate_xmit_xfrm

   - tls: handle data disappearing from under the TLS ULP

   - ptp: prevent possible ABBA deadlock in ptp_clock_freerun()

   - eth:
      - bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE
      - hv_netvsc: fix panic during namespace deletion with VF

  Previous releases - always broken:

   - netfilter: fix refcount leak on table dump

   - vsock: do not allow binding to VMADDR_PORT_ANY

   - sctp: linearize cloned gso packets in sctp_rcv

   - eth:
      - hibmcge: fix the division by zero issue
      - microchip: fix KSZ8863 reset problem"

* tag 'net-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits)
  net: usb: asix_devices: add phy_mask for ax88772 mdio bus
  net: kcm: Fix race condition in kcm_unattach()
  selftests: net/forwarding: test purge of active DWRR classes
  net/sched: ets: use old 'nbands' while purging unused classes
  bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE
  netdevsim: Fix wild pointer access in nsim_queue_free().
  net: mctp: Fix bad kfree_skb in bind lookup test
  netfilter: nf_tables: reject duplicate device on updates
  ipvs: Fix estimator kthreads preferred affinity
  netfilter: nft_set_pipapo: fix null deref for empty set
  selftests: tls: test TCP stealing data from under the TLS socket
  tls: handle data disappearing from under the TLS ULP
  ptp: prevent possible ABBA deadlock in ptp_clock_freerun()
  ixgbe: prevent from unwanted interface name changes
  devlink: let driver opt out of automatic phys_port_name generation
  net: prevent deadlocks when enabling NAPIs with mixed kthread config
  net: update NAPI threaded config even for disabled NAPIs
  selftests: drv-net: don't assume device has only 2 queues
  docs: Fix name for net.ipv4.udp_child_hash_entries
  riscv: dts: thead: Add APB clocks for TH1520 GMACs
  ...

2 months ago Merge branch 'net-ethtool-support-including-flow-label-in-the-flow-hash-for-rss'
Paolo Abeni [Thu, 14 Aug 2025 09:40:21 +0000 (11:40 +0200)]
Merge branch 'net-ethtool-support-including-flow-label-in-the-flow-hash-for-rss'

Jakub Kicinski says:

====================
net: ethtool: support including Flow Label in the flow hash for RSS

Add support for using IPv6 Flow Label in Rx hash computation
and therefore RSS queue selection.

v3: https://lore.kernel.org/20250724015101.186608-1-kuba@kernel.org
v2:  https://lore.kernel.org/20250722014915.3365370-1-kuba@kernel.org
RFC: https://lore.kernel.org/20250609173442.1745856-1-kuba@kernel.org
====================

Link: https://patch.msgid.link/20250811234212.580748-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months ago selftests: drv-net: add test for RSS on flow label
Jakub Kicinski [Mon, 11 Aug 2025 23:42:12 +0000 (16:42 -0700)]
selftests: drv-net: add test for RSS on flow label

Add a simple test for checking that RSS on flow label works,
and that it's rejected for IPv4 flows.

 # ./tools/testing/selftests/drivers/net/hw/rss_flow_label.py
 TAP version 13
 1..2
 ok 1 rss_flow_label.test_rss_flow_label
 ok 2 rss_flow_label.test_rss_flow_label_6only
 # Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0

Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20250811234212.580748-5-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months ago eth: bnxt: support RSS on IPv6 Flow Label
Jakub Kicinski [Mon, 11 Aug 2025 23:42:11 +0000 (16:42 -0700)]
eth: bnxt: support RSS on IPv6 Flow Label

It appears that the bnxt FW API has the relevant bit for Flow Label
hashing. Plumb in the support. Obey the capability bit.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250811234212.580748-4-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months ago eth: fbnic: support RSS on IPv6 Flow Label
Jakub Kicinski [Mon, 11 Aug 2025 23:42:10 +0000 (16:42 -0700)]
eth: fbnic: support RSS on IPv6 Flow Label

Support IPv6 Flow Label hashing. Use the Flow Labels of both the inner
and outer IPv6 headers if both headers are detected. Flow Label
is unlike normal header fields: by enabling it the user accepts
the unstable hash and possible reordering. Because of that
I think it's reasonable to hash over all Flow Labels we can
find, even though we don't hash over all L3 addresses.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250811234212.580748-3-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months ago net: ethtool: support including Flow Label in the flow hash for RSS
Jakub Kicinski [Mon, 11 Aug 2025 23:42:09 +0000 (16:42 -0700)]
net: ethtool: support including Flow Label in the flow hash for RSS

Some modern NICs support including the IPv6 Flow Label in
the flow hash for RSS queue selection. This is outside
the old "Microsoft spec", but was included in the OCP NIC spec:

  [ ] RSS include flow label in the hash (configurable)

https://www.opencompute.org/w/index.php?title=Core_Offloads#Receive_Side_Scaling

RSS Flow Label hashing allows TCP Protective Load Balancing (PLB)
to recover from receiver congestion / overload.
Rx CPU/queue hotspots are relatively common for data ingest
workloads, and so far we had to try to detect the condition
at the RPC layer and reopen the connection. PLB lets us change
the Flow Label and therefore Rx CPU on RTO, with minimal packet
reordering. PLB reaction times are much faster, and can happen
at any point in the connection, not just at RPC boundaries.

Due to the nature of host processing (relatively long queues,
other kernel subsystems masking IRQs for 100s of msecs)
the risk of reordering within the host is higher than in
the network. But for applications which need it - it is far
preferable to potentially persistent overload of a subset of
queues.

It is expected that the hash communicated to the host
may change if the Flow Label changes. This may be surprising
to some host software, but I don't expect the devices
can compute two Toeplitz hashes, one with the Flow Label
for queue selection and one without for the rx hash
communicated to the host. Besides, changing the hash
may potentially help to change the path thru host queues.
User can disable NETIF_F_RXHASH if they require a stable
flow hash.
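
To make the effect on the hash concrete, here is a rough user-space
sketch. It uses a textbook Toeplitz hash and a plain modulo for queue
selection; real devices differ in which fields they feed in and
typically go through an indirection table, so pick_queue() in particular
is a hypothetical simplification. The Flow Label is the low 20 bits of
the first 32-bit word of the IPv6 header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flow Label: low 20 bits of the first IPv6 header word (host order). */
static uint32_t ipv6_flow_label(uint32_t first_word)
{
	return first_word & 0x000FFFFFu;
}

/* Textbook Toeplitz hash; key must be at least len + 4 bytes long. */
static uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
			      const uint8_t *data, size_t len)
{
	uint32_t window, hash = 0;

	assert(key_len >= len + 4);
	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
		 ((uint32_t)key[2] << 8) | key[3];

	for (size_t i = 0; i < len; i++) {
		for (int bit = 7; bit >= 0; bit--) {
			/* XOR in the current key window for every set input bit. */
			if (data[i] & (1u << bit))
				hash ^= window;
			/* Slide the 32-bit key window left by one bit. */
			window <<= 1;
			if (key[i + 4] & (1u << bit))
				window |= 1;
		}
	}
	return hash;
}

/* Hypothetical queue selection for this sketch. */
static unsigned int pick_queue(uint32_t hash, unsigned int n_queues)
{
	assert(n_queues > 0);
	return hash % n_queues;
}
```

Since every input bit selects a key window to XOR in, once the Flow
Label bytes are part of the hashed data a changed label can change the
hash, and with it both the selected queue and the rx hash reported to
the host.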

The name RXH_IP6_FL was chosen based on what we call
Flow Label variables in IPv6 processing (fl). I prefer
fl_lbl but that appears to be an fbnic-only spelling.
We could spell out RXH_IP6_FLOW_LABEL but existing
RXH_ defines are a lot more terse.

Willem notes [1] that Flow Label is defined as identifying the flow
and therefore including both the flow label _and_ the L4 header
fields is not generally necessary. But it should not hurt so
it's not explicitly prevented if the driver supports hashing
on both at the same time.

Link: https://lore.kernel.org/68483433b45e2_3cd66f29440@willemb.c.googlers.com.notmuch [1]
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20250811234212.580748-2-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months ago net: usb: asix_devices: add phy_mask for ax88772 mdio bus
Xu Yang [Mon, 11 Aug 2025 09:29:31 +0000 (17:29 +0800)]
net: usb: asix_devices: add phy_mask for ax88772 mdio bus

Without setting phy_mask for the ax88772 mdio bus, the current driver may
create up to 32 mdio phy devices with phy addresses ranging from 0x00 to
0x1f. DLink DUB-E100 H/W Ver B1 is such a device. However, only one main
phy device will bind to the net phy driver. This creates an issue during
system suspend/resume, since phy_polling_mode() in phy_state_machine()
directly dereferences a member of phydev->drv for non-main phy devices,
and a NULL pointer dereference occurs. Since only the external phy or
the internal phy is necessary, add phy_mask for the ax88772 mdio bus to
work around the issue.
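
A small sketch of how such a mask can be built. In a phy_mask a set bit
means that the MDIO address is skipped when probing; the helper names
and the example addresses below are hypothetical and not taken from the
driver.

```c
#include <assert.h>
#include <stdint.h>

#define PHY_MAX_ADDR 32

/*
 * In a phy_mask, a set bit means "do not probe this MDIO address".
 * Build a mask that skips everything except the two given addresses.
 */
static uint32_t phy_mask_allow_only(unsigned int addr_a, unsigned int addr_b)
{
	assert(addr_a < PHY_MAX_ADDR && addr_b < PHY_MAX_ADDR);
	return ~((UINT32_C(1) << addr_a) | (UINT32_C(1) << addr_b));
}

/* Report whether the given address would still be probed. */
static int phy_addr_is_probed(uint32_t phy_mask, unsigned int addr)
{
	assert(addr < PHY_MAX_ADDR);
	return !(phy_mask & (UINT32_C(1) << addr));
}
```

With only the external and the internal phy address left unmasked, no
phy devices are created for the remaining addresses, so none are left
without a bound driver.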

Closes: https://lore.kernel.org/netdev/20250806082931.3289134-1-xu.yang_2@nxp.com
Fixes: e532a096be0e ("net: usb: asix: ax88772: add phylib support")
Cc: stable@vger.kernel.org
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Tested-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20250811092931.860333-1-xu.yang_2@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months ago net: cadence: macb: convert from round_rate() to determine_rate()
Brian Masney [Sun, 10 Aug 2025 22:24:14 +0000 (18:24 -0400)]
net: cadence: macb: convert from round_rate() to determine_rate()

The round_rate() clk op is deprecated, so migrate this driver from
round_rate() to determine_rate().
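
The shape of such a conversion, as a simplified user-space sketch.
struct rate_request is a trimmed-down, hypothetical stand-in for the
kernel's struct clk_rate_request, and the rate table and rounding policy
are chosen only for illustration, not copied from the driver:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct clk_rate_request. */
struct rate_request {
	unsigned long rate;	/* in: requested rate, out: achievable rate */
};

/* Rates used only for illustration in this sketch. */
static const unsigned long supported[] = { 2500000, 25000000, 125000000 };

/* round_rate() style: the rounded rate is the return value. */
static long example_round_rate(unsigned long rate)
{
	size_t n = sizeof(supported) / sizeof(supported[0]);
	unsigned long best = supported[0];

	/* Pick the highest supported rate not above the request. */
	for (size_t i = 0; i < n; i++)
		if (supported[i] <= rate)
			best = supported[i];
	return (long)best;
}

/* determine_rate() style: the result is written back into the request. */
static int example_determine_rate(struct rate_request *req)
{
	assert(req != NULL);
	req->rate = (unsigned long)example_round_rate(req->rate);
	return 0;
}
```

The rounding logic stays the same; what changes is that the result
travels in the request structure and the return value is left for error
reporting.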

Signed-off-by: Brian Masney <bmasney@redhat.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250810-net-round-rate-v1-1-dbb237c9fe5c@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months ago Merge tag 'probes-fixes-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 14 Aug 2025 03:23:32 +0000 (20:23 -0700)]
Merge tag 'probes-fixes-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes fix from Masami Hiramatsu:

 - MAINTAINERS: Remove bouncing kprobes maintainer

* tag 'probes-fixes-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  MAINTAINERS: Remove bouncing kprobes maintainer

2 months ago MAINTAINERS: Remove bouncing kprobes maintainer
Dave Hansen [Thu, 14 Aug 2025 02:38:58 +0000 (11:38 +0900)]
MAINTAINERS: Remove bouncing kprobes maintainer

The kprobes MAINTAINERS entry includes anil.s.keshavamurthy@intel.com.
That address is bouncing. Remove it.

This still leaves three other listed maintainers.

Link: https://lore.kernel.org/all/20250808180124.7DDE2ECD@davehans-spike.ostc.intel.com/
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Naveen N Rao <naveen@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: linux-trace-kernel@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2 months ago Merge branch 'net-don-t-use-pk-through-printk-or-tracepoints'
Jakub Kicinski [Thu, 14 Aug 2025 01:26:18 +0000 (18:26 -0700)]
Merge branch 'net-don-t-use-pk-through-printk-or-tracepoints'

Thomas Weißschuh says:

====================
net: Don't use %pK through printk or tracepoints

In the past %pK was preferable to %p as it would not leak raw pointer
values into the kernel log.
Since commit ad67b74d2469 ("printk: hash addresses printed with %p")
the regular %p has been improved to avoid this issue.
Furthermore, restricted pointers ("%pK") were never meant to be used
through printk(). They can still unintentionally leak raw pointers or
acquire sleeping locks in atomic contexts.

Switch to the regular pointer formatting which is safer and
easier to reason about.
There are still a few users of %pK left, but these use it through seq_file,
for which its usage is safe.
====================

Link: https://patch.msgid.link/20250811-restricted-pointers-net-v5-0-2e2fdc7d3f2c@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months ago net/mlx5: Don't use %pK through tracepoints
Thomas Weißschuh [Mon, 11 Aug 2025 09:43:19 +0000 (11:43 +0200)]
net/mlx5: Don't use %pK through tracepoints

In the past %pK was preferable to %p as it would not leak raw pointer
values into the kernel log.
Since commit ad67b74d2469 ("printk: hash addresses printed with %p")
the regular %p has been improved to avoid this issue.
Furthermore, restricted pointers ("%pK") were never meant to be used
through tracepoints. They can still unintentionally leak raw pointers or
acquire sleeping locks in atomic contexts.

Switch to the regular pointer formatting which is safer and
easier to reason about.
There are still a few users of %pK left, but these use it through seq_file,
for which its usage is safe.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250811-restricted-pointers-net-v5-2-2e2fdc7d3f2c@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoice: Don't use %pK through printk or tracepoints
Thomas Weißschuh [Mon, 11 Aug 2025 09:43:18 +0000 (11:43 +0200)]
ice: Don't use %pK through printk or tracepoints

In the past %pK was preferable to %p as it would not leak raw pointer
values into the kernel log.
Since commit ad67b74d2469 ("printk: hash addresses printed with %p")
the regular %p has been improved to avoid this issue.
Furthermore, restricted pointers ("%pK") were never meant to be used
through printk(). They can still unintentionally leak raw pointers or
acquire sleeping locks in atomic contexts.

Switch to the regular pointer formatting which is safer and
easier to reason about.
There are still a few users of %pK left, but these use it through seq_file,
for which its usage is safe.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250811-restricted-pointers-net-v5-1-2e2fdc7d3f2c@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: kcm: Fix race condition in kcm_unattach()
Sven Stegemann [Tue, 12 Aug 2025 19:18:03 +0000 (21:18 +0200)]
net: kcm: Fix race condition in kcm_unattach()

syzbot found a race condition when kcm_unattach(psock)
and kcm_release(kcm) are executed at the same time.

kcm_unattach() is missing a check of the flag
kcm->tx_stopped before calling queue_work().

If the kcm has a reserved psock, kcm_unattach() might get executed
between cancel_work_sync() and unreserve_psock() in kcm_release(),
requeuing kcm->tx_work right before kcm gets freed in kcm_done().

Remove kcm->tx_stopped and replace it with the less
error-prone disable_work_sync().
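
A minimal userspace sketch of why disabling the work item closes the race, assuming a toy model of the workqueue API. The kcm_toy_* names and the struct are hypothetical stand-ins, not the kernel's struct work_struct or its real implementation. The only behavior modeled is that a cancelled item can be queued again while a disabled one cannot.

```c
#include <stdbool.h>

/* Toy stand-in for a work item (hypothetical, not struct work_struct). */
struct kcm_toy_work {
	bool queued;        /* item is currently pending on the queue */
	int disable_depth;  /* > 0 means queueing requests are refused */
};

/* Models queue_work(): refused when disabled or already pending. */
static bool kcm_toy_queue_work(struct kcm_toy_work *w)
{
	if (w->disable_depth > 0 || w->queued)
		return false;
	w->queued = true;
	return true;
}

/* Models cancel_work_sync(): drops a pending item, later queueing still works. */
static void kcm_toy_cancel_work_sync(struct kcm_toy_work *w)
{
	w->queued = false;
}

/* Models disable_work_sync(): drops a pending item and blocks later queueing. */
static void kcm_toy_disable_work_sync(struct kcm_toy_work *w)
{
	w->queued = false;
	w->disable_depth++;
}
```

In this model a racing caller that queues after kcm_toy_cancel_work_sync() succeeds, which corresponds to the requeue right before the free described above. After kcm_toy_disable_work_sync() the same call is refused.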

Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Reported-by: syzbot+e62c9db591c30e174662@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e62c9db591c30e174662
Reported-by: syzbot+d199b52665b6c3069b94@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d199b52665b6c3069b94
Reported-by: syzbot+be6b1fdfeae512726b4e@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=be6b1fdfeae512726b4e
Signed-off-by: Sven Stegemann <sven@stegemann.de>
Link: https://patch.msgid.link/20250812191810.27777-1-sven@stegemann.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge branch 'ets-use-old-nbands-while-purging-unused-classes'
Jakub Kicinski [Thu, 14 Aug 2025 01:11:56 +0000 (18:11 -0700)]
Merge branch 'ets-use-old-nbands-while-purging-unused-classes'

Davide Caratti says:

====================
ets: use old 'nbands' while purging unused classes

- patch 1/2 fixes a NULL dereference in the control path of sch_ets qdisc
- patch 2/2 extends kselftests to verify effectiveness of the above fix
====================

Link: https://patch.msgid.link/cover.1755016081.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: net/forwarding: test purge of active DWRR classes
Davide Caratti [Tue, 12 Aug 2025 16:40:30 +0000 (18:40 +0200)]
selftests: net/forwarding: test purge of active DWRR classes

Extend sch_ets.sh to add a reproducer for problematic list deletions when
active DWRR classes are purged by ets_qdisc_change() [1] [2].

[1] https://lore.kernel.org/netdev/e08c7f4a6882f260011909a868311c6e9b54f3e4.1639153474.git.dcaratti@redhat.com/
[2] https://lore.kernel.org/netdev/f3b9bacc73145f265c19ab80785933da5b7cbdec.1754581577.git.dcaratti@redhat.com/

Suggested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/489497cb781af7389011ca1591fb702a7391f5e7.1755016081.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet/sched: ets: use old 'nbands' while purging unused classes
Davide Caratti [Tue, 12 Aug 2025 16:40:29 +0000 (18:40 +0200)]
net/sched: ets: use old 'nbands' while purging unused classes

Shuang reported sch_ets test-case [1] crashing in ets_class_qlen_notify()
after recent changes from Lion [2]. The problem is: in ets_qdisc_change()
we purge unused DWRR queues; the value of 'q->nbands' is the new one, and
the cleanup should be done with the old one. The problem has been present
since my first attempts to fix ets_qdisc_change(), but it surfaced again
after the recent qdisc len accounting fixes. Fix it by purging idle DWRR
queues before assigning a new value to 'q->nbands', so that all purge
operations find a consistent configuration:

 - old 'q->nbands' because it's needed by ets_class_find()
 - old 'q->nstrict' because it's needed by ets_class_is_strict()
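
The ordering problem can be sketched in userspace as follows. This is a toy model under stated assumptions: struct ets_toy and the ets_toy_* helpers are hypothetical, the lookup only mimics the bounds check that makes a band invisible once 'nbands' shrinks, and a failed lookup is counted instead of being dereferenced as in the crash reported here.

```c
#include <stddef.h>

#define ETS_TOY_MAX_BANDS 16

/* Hypothetical stand-in for the qdisc private data. */
struct ets_toy {
	unsigned int nbands;            /* number of configured bands */
	int active[ETS_TOY_MAX_BANDS];  /* 1 if the band is on the active list */
};

/* Mimics a class lookup: a band is only visible while it is below nbands. */
static int *ets_toy_class_find(struct ets_toy *q, unsigned int band)
{
	return band < q->nbands ? &q->active[band] : NULL;
}

/* Purge bands [from, to); returns how many lookups failed. */
static unsigned int ets_toy_purge(struct ets_toy *q, unsigned int from,
				  unsigned int to)
{
	unsigned int missed = 0;

	for (unsigned int i = from; i < to; i++) {
		int *cl = ets_toy_class_find(q, i);

		if (!cl) {
			missed++;  /* the real code would dereference NULL here */
			continue;
		}
		*cl = 0;
	}
	return missed;
}

/* Buggy order: assign the new nbands first, then purge. */
static unsigned int ets_toy_change_buggy(struct ets_toy *q, unsigned int new_nbands)
{
	unsigned int old = q->nbands;

	q->nbands = new_nbands;
	return ets_toy_purge(q, new_nbands, old);
}

/* Fixed order: purge while the old nbands is still in place. */
static unsigned int ets_toy_change_fixed(struct ets_toy *q, unsigned int new_nbands)
{
	unsigned int missed = ets_toy_purge(q, new_nbands, q->nbands);

	q->nbands = new_nbands;
	return missed;
}
```

Shrinking from 4 to 2 bands with the buggy order fails both lookups and leaves the bands marked active, while the fixed order finds and clears them.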

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: Oops: 0000 [#1] SMP NOPTI
 CPU: 62 UID: 0 PID: 39457 Comm: tc Kdump: loaded Not tainted 6.12.0-116.el10.x86_64 #1 PREEMPT(voluntary)
 Hardware name: Dell Inc. PowerEdge R640/06DKY5, BIOS 2.12.2 07/09/2021
 RIP: 0010:__list_del_entry_valid_or_report+0x4/0x80
 Code: ff 4c 39 c7 0f 84 39 19 8e ff b8 01 00 00 00 c3 cc cc cc cc 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa <48> 8b 17 48 8b 4f 08 48 85 d2 0f 84 56 19 8e ff 48 85 c9 0f 84 ab
 RSP: 0018:ffffba186009f400 EFLAGS: 00010202
 RAX: 00000000000000d6 RBX: 0000000000000000 RCX: 0000000000000004
 RDX: ffff9f0fa29b69c0 RSI: 0000000000000000 RDI: 0000000000000000
 RBP: ffffffffc12c2400 R08: 0000000000000008 R09: 0000000000000004
 R10: ffffffffffffffff R11: 0000000000000004 R12: 0000000000000000
 R13: ffff9f0f8cfe0000 R14: 0000000000100005 R15: 0000000000000000
 FS:  00007f2154f37480(0000) GS:ffff9f269c1c0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 00000001530be001 CR4: 00000000007726f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Call Trace:
  <TASK>
  ets_class_qlen_notify+0x65/0x90 [sch_ets]
  qdisc_tree_reduce_backlog+0x74/0x110
  ets_qdisc_change+0x630/0xa40 [sch_ets]
  __tc_modify_qdisc.constprop.0+0x216/0x7f0
  tc_modify_qdisc+0x7c/0x120
  rtnetlink_rcv_msg+0x145/0x3f0
  netlink_rcv_skb+0x53/0x100
  netlink_unicast+0x245/0x390
  netlink_sendmsg+0x21b/0x470
  ____sys_sendmsg+0x39d/0x3d0
  ___sys_sendmsg+0x9a/0xe0
  __sys_sendmsg+0x7a/0xd0
  do_syscall_64+0x7d/0x160
  entry_SYSCALL_64_after_hwframe+0x76/0x7e
 RIP: 0033:0x7f2155114084
 Code: 89 02 b8 ff ff ff ff eb bb 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 80 3d 25 f0 0c 00 00 74 13 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 89 54 24 1c 48 89
 RSP: 002b:00007fff1fd7a988 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
 RAX: ffffffffffffffda RBX: 0000560ec063e5e0 RCX: 00007f2155114084
 RDX: 0000000000000000 RSI: 00007fff1fd7a9f0 RDI: 0000000000000003
 RBP: 00007fff1fd7aa60 R08: 0000000000000010 R09: 000000000000003f
 R10: 0000560ee9b3a010 R11: 0000000000000202 R12: 00007fff1fd7aae0
 R13: 000000006891ccde R14: 0000560ec063e5e0 R15: 00007fff1fd7aad0
  </TASK>

 [1] https://lore.kernel.org/netdev/e08c7f4a6882f260011909a868311c6e9b54f3e4.1639153474.git.dcaratti@redhat.com/
 [2] https://lore.kernel.org/netdev/d912cbd7-193b-4269-9857-525bee8bbb6a@gmail.com/

Cc: stable@vger.kernel.org
Fixes: 103406b38c60 ("net/sched: Always pass notifications when child class becomes empty")
Fixes: c062f2a0b04d ("net/sched: sch_ets: don't remove idle classes from the round-robin list")
Fixes: dcc68b4d8084 ("net: sch_ets: Add a new Qdisc")
Reported-by: Li Shuang <shuali@redhat.com>
Closes: https://issues.redhat.com/browse/RHEL-108026
Reviewed-by: Petr Machata <petrm@nvidia.com>
Co-developed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://patch.msgid.link/7928ff6d17db47a2ae7cc205c44777b1f1950545.1755016081.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Jakub Kicinski [Thu, 14 Aug 2025 00:31:46 +0000 (17:31 -0700)]
Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
ixgbe: bypass devlink phys_port_name generation

Jedrzej adds an option to skip phys_port_name generation and opts
ixgbe into it, as some configurations rely on pre-devlink naming
which could end up broken as a result.

* '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ixgbe: prevent from unwanted interface name changes
  devlink: let driver opt out of automatic phys_port_name generation
====================

Link: https://patch.msgid.link/20250812205226.1984369-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: netconsole: Validate interface selection by MAC address
Andre Carvalho [Tue, 12 Aug 2025 19:38:23 +0000 (20:38 +0100)]
selftests: netconsole: Validate interface selection by MAC address

Extend the existing netconsole cmdline selftest to also validate that
interface selection can be performed via MAC address.

The test now validates that netconsole works with both interface name
and MAC address, improving test coverage.

Suggested-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Andre Carvalho <asantostc@gmail.com>
Link: https://patch.msgid.link/20250812-netcons-cmdline-selftest-v2-1-8099fb7afa9e@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agobnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE
David Wei [Tue, 12 Aug 2025 18:29:07 +0000 (11:29 -0700)]
bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE

The data page pool always fills the HW rx ring with pages. On arm64 with
64K pages, this will waste _at least_ 32K of memory per entry in the rx
ring.

Fix by fragmenting the pages if PAGE_SIZE > BNXT_RX_PAGE_SIZE. This
makes the data page pool the same as the header pool.
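
The arithmetic behind the waste can be sketched as below. This is an illustration only: the helper names are hypothetical, and the 32K buffer size used in the usage note is an assumption inferred from the "at least 32K" figure above, not a value quoted from the driver.

```c
#include <stddef.h>

/* Number of rx buffers of buf_size that one page can back. */
static size_t bnxt_toy_frags_per_page(size_t page_size, size_t buf_size)
{
	if (buf_size == 0 || page_size <= buf_size)
		return 1;
	return page_size / buf_size;
}

/* Bytes left unused when a whole page backs a single rx buffer. */
static size_t bnxt_toy_waste_per_entry(size_t page_size, size_t buf_size)
{
	return page_size > buf_size ? page_size - buf_size : 0;
}
```

With a 64K page and an assumed 32K buffer, one page can back two ring entries, and using a full page per entry leaves 32K unused. With 4K pages and 4K buffers nothing is wasted and no fragmenting is needed.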

Tested with iperf3 with a small (64 entries) rx ring to encourage buffer
circulation.

Fixes: cd1fafe7da1f ("eth: bnxt: add support rx side device memory TCP")
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20250812182907.1540755-1-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonetdevsim: Fix wild pointer access in nsim_queue_free().
Kuniyuki Iwashima [Tue, 12 Aug 2025 16:21:26 +0000 (16:21 +0000)]
netdevsim: Fix wild pointer access in nsim_queue_free().

syzbot reported the splat below. [0]

When nsim_queue_uninit() is called from nsim_init_netdevsim(),
register_netdevice() has not been called, thus dev->dstats has
not been allocated.

Let's not call dev_dstats_rx_dropped_add() in such a case.

[0]
BUG: unable to handle page fault for address: ffff88809782c020
 PF: supervisor write access in kernel mode
 PF: error_code(0x0002) - not-present page
PGD 1b401067 P4D 1b401067 PUD 0
Oops: Oops: 0002 [#1] SMP KASAN NOPTI
CPU: 3 UID: 0 PID: 8476 Comm: syz.1.251 Not tainted 6.16.0-syzkaller-06699-ge8d780dcd957 #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:local_add arch/x86/include/asm/local.h:33 [inline]
RIP: 0010:u64_stats_add include/linux/u64_stats_sync.h:89 [inline]
RIP: 0010:dev_dstats_rx_dropped_add include/linux/netdevice.h:3027 [inline]
RIP: 0010:nsim_queue_free+0xba/0x120 drivers/net/netdevsim/netdev.c:714
Code: 07 77 6c 4a 8d 3c ed 20 7e f1 8d 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 75 46 4a 03 1c ed 20 7e f1 8d <4c> 01 63 20 be 00 02 00 00 48 8d 3d 00 00 00 00 e8 61 2f 58 fa 48
RSP: 0018:ffffc900044af150 EFLAGS: 00010286
RAX: dffffc0000000000 RBX: ffff88809782c000 RCX: 00000000000079c3
RDX: 1ffffffff1be2fc7 RSI: ffffffff8c15f380 RDI: ffffffff8df17e38
RBP: ffff88805f59d000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000003 R14: ffff88806ceb3d00 R15: ffffed100dfd308e
FS:  0000000000000000(0000) GS:ffff88809782c000(0063) knlGS:00000000f505db40
CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: ffff88809782c020 CR3: 000000006fc6a000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 nsim_queue_uninit drivers/net/netdevsim/netdev.c:993 [inline]
 nsim_init_netdevsim drivers/net/netdevsim/netdev.c:1049 [inline]
 nsim_create+0xd0a/0x1260 drivers/net/netdevsim/netdev.c:1101
 __nsim_dev_port_add+0x435/0x7d0 drivers/net/netdevsim/dev.c:1438
 nsim_dev_port_add_all drivers/net/netdevsim/dev.c:1494 [inline]
 nsim_dev_reload_create drivers/net/netdevsim/dev.c:1546 [inline]
 nsim_dev_reload_up+0x5b8/0x860 drivers/net/netdevsim/dev.c:1003
 devlink_reload+0x322/0x7c0 net/devlink/dev.c:474
 devlink_nl_reload_doit+0xe31/0x1410 net/devlink/dev.c:584
 genl_family_rcv_msg_doit+0x206/0x2f0 net/netlink/genetlink.c:1115
 genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
 genl_rcv_msg+0x55c/0x800 net/netlink/genetlink.c:1210
 netlink_rcv_skb+0x155/0x420 net/netlink/af_netlink.c:2552
 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
 netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline]
 netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1346
 netlink_sendmsg+0x8d1/0xdd0 net/netlink/af_netlink.c:1896
 sock_sendmsg_nosec net/socket.c:714 [inline]
 __sock_sendmsg net/socket.c:729 [inline]
 ____sys_sendmsg+0xa95/0xc70 net/socket.c:2614
 ___sys_sendmsg+0x134/0x1d0 net/socket.c:2668
 __sys_sendmsg+0x16d/0x220 net/socket.c:2700
 do_syscall_32_irqs_on arch/x86/entry/syscall_32.c:83 [inline]
 __do_fast_syscall_32+0x7c/0x3a0 arch/x86/entry/syscall_32.c:306
 do_fast_syscall_32+0x32/0x80 arch/x86/entry/syscall_32.c:331
 entry_SYSENTER_compat_after_hwframe+0x84/0x8e
RIP: 0023:0xf708e579
Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
RSP: 002b:00000000f505d55c EFLAGS: 00000296 ORIG_RAX: 0000000000000172
RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 0000000080000080
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 </TASK>
Modules linked in:
CR2: ffff88809782c020

Fixes: 2a68a22304f9 ("netdevsim: account dropped packet length in stats on queue free")
Reported-by: syzbot+8aa80c6232008f7b957d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/688bb9ca.a00a0220.26d0e1.0050.GAE@google.com/
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250812162130.4129322-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: enetc: Remove error print for devm_add_action_or_reset()
Waqar Hameed [Tue, 12 Aug 2025 12:13:58 +0000 (14:13 +0200)]
net: enetc: Remove error print for devm_add_action_or_reset()

When `devm_add_action_or_reset()` fails, it is due to a failed memory
allocation and will thus return `-ENOMEM`. `dev_err_probe()` doesn't do
anything when error is `-ENOMEM`. Therefore, remove the useless call to
`dev_err_probe()` when `devm_add_action_or_reset()` fails, and just
return the value instead.

Signed-off-by: Waqar Hameed <waqar.hameed@axis.com>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/pnd1ppghh4p.a.out@axis.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: libwx: cleanup VF register macros
Jiawen Wu [Tue, 12 Aug 2025 09:37:25 +0000 (17:37 +0800)]
net: libwx: cleanup VF register macros

Adjust the order of VF register macros to make them tidier.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Link: https://patch.msgid.link/778899EE1D862EC2+20250812093725.58821-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agotun: replace strcpy with strscpy for ifr_name
Miguel García [Tue, 12 Aug 2025 08:22:44 +0000 (10:22 +0200)]
tun: replace strcpy with strscpy for ifr_name

Replace the strcpy() calls that copy the device name into ifr->ifr_name
with strscpy() to avoid potential overflows and guarantee NUL termination.

Destination is ifr->ifr_name (size IFNAMSIZ).

Tested in QEMU (BusyBox rootfs):
 - Created TUN devices via TUNSETIFF helper
 - Set addresses and brought links up
 - Verified long interface names are safely truncated (IFNAMSIZ-1)
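
For readers outside the kernel, the truncation behavior can be sketched with a userspace lookalike. tun_toy_strscpy() is a hypothetical re-implementation written for this note, not the kernel function. It assumes the documented strscpy() contract: the destination is always NUL-terminated when size is non-zero, and -E2BIG is returned on truncation.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

#define TUN_TOY_IFNAMSIZ 16  /* same value as IFNAMSIZ in the Linux headers */

/*
 * Copy src into dst (capacity size), always NUL-terminating.
 * Returns the number of characters copied, or -E2BIG if src did not fit.
 */
static ssize_t tun_toy_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -E2BIG;

	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}
```

A name longer than TUN_TOY_IFNAMSIZ - 1 characters is cut to 15 characters plus the terminator, which matches the truncation checked in the test list above.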

Signed-off-by: Miguel García <miguelgarciaroman8@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250812082244.60240-1-miguelgarciaroman8@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: forwarding: Add a test for FDB activity notification control
Ido Schimmel [Tue, 12 Aug 2025 07:18:10 +0000 (10:18 +0300)]
selftests: forwarding: Add a test for FDB activity notification control

Test various aspects of FDB activity notification control:

* Transitioning of an FDB entry from inactive to active state.

* Transitioning of an FDB entry from active to inactive state.

* Avoiding the resetting of an FDB entry's last activity time (i.e.,
  "updated" time) using the "norefresh" keyword.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20250812071810.312346-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: mctp: Fix bad kfree_skb in bind lookup test
Matt Johnston [Tue, 12 Aug 2025 05:08:58 +0000 (13:08 +0800)]
net: mctp: Fix bad kfree_skb in bind lookup test

The kunit test's skb_pkt is consumed by mctp_dst_input() so shouldn't be
freed separately.

Fixes: e6d8e7dbc5a3 ("net: mctp: Add bind lookup test")
Reported-by: Alexandre Ghiti <alex@ghiti.fr>
Closes: https://lore.kernel.org/all/734b02a3-1941-49df-a0da-ec14310d41e4@ghiti.fr/
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://patch.msgid.link/20250812-fix-mctp-bind-test-v1-1-5e2128664eb3@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: mediatek: wed: Introduce MT7992 WED support to MT7988 SoC
Lorenzo Bianconi [Tue, 12 Aug 2025 04:57:23 +0000 (06:57 +0200)]
net: mediatek: wed: Introduce MT7992 WED support to MT7988 SoC

Introduce the second WDMA RX ring in the WED driver for the MT7988 SoC since
the Mediatek MT7992 WiFi chipset supports two separate WDMA rings.
Add missing MT7988 configurations to properly support WED for MT7992 in
MT76 driver.

Co-developed-by: Rex Lu <rex.lu@mediatek.com>
Signed-off-by: Rex Lu <rex.lu@mediatek.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250812-mt7992-wed-support-v3-1-9ada78a819a4@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agovsock: use sizeof(struct sockaddr_storage) instead of magic value
Wang Liang [Tue, 12 Aug 2025 01:59:29 +0000 (09:59 +0800)]
vsock: use sizeof(struct sockaddr_storage) instead of magic value

Previous commit 230b183921ec ("net: Use standard structures for generic
socket address structures.") used 'struct sockaddr_storage address;'
to replace 'char address[MAX_SOCK_ADDR];'.

The macro MAX_SOCK_ADDR was removed by commit 01893c82b4e6 ("net: Remove
MAX_SOCK_ADDR constant").

The comment in vsock_getname() is outdated; use sizeof(struct
sockaddr_storage) instead of the magic value 128.
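
A small userspace check of the equivalence, assuming a Linux libc where struct sockaddr_storage is 128 bytes (the kernel's own definition is a separate type with the same size). The helper name is hypothetical.

```c
#include <stddef.h>
#include <sys/socket.h>

/* Capacity a generic socket-address buffer needs, without a magic number. */
static size_t vsock_toy_max_addr_len(void)
{
	return sizeof(struct sockaddr_storage);
}
```

Writing the bound as a sizeof keeps it tied to the type instead of to a constant that no longer has a named macro behind it.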

Signed-off-by: Wang Liang <wangliang74@huawei.com>
Link: https://patch.msgid.link/20250812015929.1419896-1-wangliang74@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge branch 'refine-stmmac-code'
Jakub Kicinski [Wed, 13 Aug 2025 23:27:44 +0000 (16:27 -0700)]
Merge branch 'refine-stmmac-code'

Tiezhu Yang says:

====================
Refine stmmac code

Here are three small patches to refine stmmac code when debugging and
testing the problem "Failed to reset the dma".
====================

Link: https://patch.msgid.link/20250811073506.27513-1-yangtiezhu@loongson.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: stmmac: Return early if invalid in loongson_dwmac_fix_reset()
Tiezhu Yang [Mon, 11 Aug 2025 07:35:06 +0000 (15:35 +0800)]
net: stmmac: Return early if invalid in loongson_dwmac_fix_reset()

If the MAC controller is not connected to any PHY interface, a clock is
missing and the DMA reset fails.

In this case the DMA_BUS_MODE_SFT_RESET bit is already 1 before the software
reset. Print an error message hinting that the PHY clock is missing, then
return -EINVAL immediately to avoid waiting for the timeout when the DMA
reset fails in loongson_dwmac_fix_reset().

With this patch, boot time on the specified mainboard is reduced by
2 seconds for the normal end user.
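
The early-return idea can be sketched in userspace like this. Everything here is a hypothetical model: the function name, the 'polled' flag, and the bit value of STMMAC_TOY_SFT_RESET are assumptions for illustration, and no real register is read.

```c
#include <errno.h>
#include <stdint.h>

#define STMMAC_TOY_SFT_RESET 0x1u  /* assumed position of the soft-reset bit */

/*
 * bus_mode is the register value sampled before requesting a reset.
 * *polled is set to 1 only when the (slow) reset-and-poll path would run.
 */
static int stmmac_toy_fix_reset(uint32_t bus_mode, int *polled)
{
	/* Bit already set before we asked for a reset: the clock is missing. */
	if (bus_mode & STMMAC_TOY_SFT_RESET)
		return -EINVAL;

	*polled = 1;  /* the real code would start the reset and poll for completion */
	return 0;
}
```

The point of the check is that the failing case returns before any polling starts, so no timeout is spent on a reset that cannot complete.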

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://patch.msgid.link/20250811073506.27513-4-yangtiezhu@loongson.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: stmmac: Change first parameter of fix_soc_reset()
Tiezhu Yang [Mon, 11 Aug 2025 07:35:05 +0000 (15:35 +0800)]
net: stmmac: Change first parameter of fix_soc_reset()

In order to use netdev_err() to print a message in the callback function of
fix_soc_reset(), change fix_soc_reset() to take "struct stmmac_priv *" as
its first parameter.

This is preparation for a later patch; no functional change.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://patch.msgid.link/20250811073506.27513-3-yangtiezhu@loongson.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: stmmac: Check stmmac_hw_setup() in stmmac_resume()
Tiezhu Yang [Mon, 11 Aug 2025 07:35:04 +0000 (15:35 +0800)]
net: stmmac: Check stmmac_hw_setup() in stmmac_resume()

stmmac_hw_setup() may return 0 on success or a negative errno value on
failure. Check the return value in stmmac_resume() and return early if
it failed.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
Link: https://patch.msgid.link/20250811073506.27513-2-yangtiezhu@loongson.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge tag 'nf-25-08-13' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Jakub Kicinski [Wed, 13 Aug 2025 21:51:51 +0000 (14:51 -0700)]
Merge tag 'nf-25-08-13' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Florian Westphal says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for *net*:

1) I managed to add a null dereference crash in nft_set_pipapo
   in the current development cycle. It was not caught by CI
   because the avx2 implementation is fine, but the selftest
   splats when run on a non-avx2 host.

2) Fix the ipvs estimator kthread affinity, which has been incorrect
   since 6.14. From Frederic Weisbecker.

3) nf_tables should not allow adding a device to a flowtable
   or netdev chain more than once -- reject this.
   From Pablo Neira Ayuso.  This has been broken for a long time;
   the blamed commit dates from v5.8.

* tag 'nf-25-08-13' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_tables: reject duplicate device on updates
  ipvs: Fix estimator kthreads preferred affinity
  netfilter: nft_set_pipapo: fix null deref for empty set
====================

Link: https://patch.msgid.link/20250813113800.20775-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge tag 'erofs-for-6.17-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 13 Aug 2025 18:29:27 +0000 (11:29 -0700)]
Merge tag 'erofs-for-6.17-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs fixes from Gao Xiang:

 - Align FSDAX enablement among multiple devices

 - Fix EROFS_FS_ZIP_ACCEL build dependency again to prevent forcing
   CRYPTO{,_DEFLATE}=y even if EROFS=m

 - Fix atomic context detection to properly launch kworkers on demand

 - Fix block count statistics for 48-bit addressing support

* tag 'erofs-for-6.17-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: fix block count report when 48-bit layout is on
  erofs: fix atomic context detection when !CONFIG_DEBUG_LOCK_ALLOC
  erofs: Do not select tristate symbols from bool symbols
  erofs: Fallback to normal access if DAX is not supported on extra device

2 months agoMerge tag 'rcu.fixes.6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux
Linus Torvalds [Wed, 13 Aug 2025 17:23:28 +0000 (10:23 -0700)]
Merge tag 'rcu.fixes.6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux

Pull RCU fix from Neeraj Upadhyay:
 "Fix a regression introduced by commit b41642c87716 ("rcu: Fix
  rcu_read_unlock() deadloop due to IRQ work") which results in boot
  hang as reported by kernel test bot at [1].

  This issue happens because RCU re-initializes the deferred QS IRQ work
  every time it is queued. With commit b41642c87716, the IRQ work
  re-initialization can happen while it is already queued. This results
  in IRQ work being requeued to itself. When IRQ work finally fires, as
  it is requeued to itself, it is repeatedly executed and results in
  hang.

  Fix this by initializing the IRQ work only once before the CPU
  boots"

Link: https://lore.kernel.org/rcu/202508071303.c1134cce-lkp@intel.com/
* tag 'rcu.fixes.6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux:
  rcu: Fix racy re-initialization of irq_work causing hangs
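
A toy userspace model of the "requeued to itself" failure described in the pull message. The irq_toy_* names, the singly linked pending list, and the 'pending' flag are hypothetical simplifications, not the kernel's irq_work implementation. The sketch only shows how re-initializing an entry that is still queued lets it be pushed a second time and end up pointing at itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical work entry on a singly linked pending list. */
struct irq_toy_work {
	struct irq_toy_work *next;
	bool pending;  /* guards against double queueing */
};

static struct irq_toy_work *irq_toy_head;

/* Re-initialization wipes the guard flag, even if the entry is still queued. */
static void irq_toy_init_work(struct irq_toy_work *w)
{
	w->next = NULL;
	w->pending = false;
}

/* Push onto the list head unless the entry is already marked pending. */
static bool irq_toy_queue_work(struct irq_toy_work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	w->next = irq_toy_head;
	irq_toy_head = w;
	return true;
}

/* True if walking the list from w would never terminate. */
static bool irq_toy_self_loop(const struct irq_toy_work *w)
{
	return w->next == w;
}
```

Queueing, re-initializing, and queueing again leaves the entry as its own successor, so a consumer walking the list would run it repeatedly. Initializing only once keeps the guard flag intact and the second queue attempt is refused.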

2 months agoMerge tag 'mm-hotfixes-stable-2025-08-12-20-50' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Wed, 13 Aug 2025 15:28:33 +0000 (08:28 -0700)]
Merge tag 'mm-hotfixes-stable-2025-08-12-20-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "12 hotfixes. 5 are cc:stable and the remainder address post-6.16
  issues or aren't considered necessary for -stable kernels.

  10 of these fixes are for MM"

* tag 'mm-hotfixes-stable-2025-08-12-20-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  proc: proc_maps_open allow proc_mem_open to return NULL
  mm/mremap: avoid expensive folio lookup on mremap folio pte batch
  userfaultfd: fix a crash in UFFDIO_MOVE when PMD is a migration entry
  mm: pass page directly instead of using folio_page
  selftests/proc: fix string literal warning in proc-maps-race.c
  fs/proc/task_mmu: hold PTL in pagemap_hugetlb_range and gather_hugetlb_stats
  mm/smaps: fix race between smaps_hugetlb_range and migration
  mm: fix the race between collapse and PT_RECLAIM under per-vma lock
  mm/kmemleak: avoid soft lockup in __kmemleak_do_cleanup()
  MAINTAINERS: add Masami as a reviewer of hung task detector
  mm/kmemleak: avoid deadlock by moving pr_warn() outside kmemleak_lock
  kasan/test: fix protection against compiler elision

2 months agonetfilter: nf_tables: reject duplicate device on updates
Pablo Neira Ayuso [Wed, 13 Aug 2025 00:38:50 +0000 (02:38 +0200)]
netfilter: nf_tables: reject duplicate device on updates

A chain/flowtable update with duplicated devices in the same batch is
possible. Unfortunately, the netdev event path only removes the first
device that is found, leaving unregistered the hook of the duplicated
device.

Check if a duplicated device exists in the transaction batch and bail out
with EEXIST in that case.
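
The rejection rule can be sketched in userspace as a plain duplicate scan. This is an illustration under assumptions: nft_toy_check_devices() is a hypothetical helper operating on an array of device names, whereas the real code works on the hook lists of the transaction batch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Return -EEXIST if any device name appears more than once, else 0. */
static int nft_toy_check_devices(const char *const *names, size_t n)
{
	for (size_t i = 0; i < n; i++)
		for (size_t j = i + 1; j < n; j++)
			if (strcmp(names[i], names[j]) == 0)
				return -EEXIST;
	return 0;
}
```

Rejecting the whole update up front means the removal path never has to deal with two hooks for the same device.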

WARNING is hit when unregistering the hook:

 [49042.221275] WARNING: CPU: 4 PID: 8425 at net/netfilter/core.c:340 nf_hook_entry_head+0xaa/0x150
 [49042.221375] CPU: 4 UID: 0 PID: 8425 Comm: nft Tainted: G S                  6.16.0+ #170 PREEMPT(full)
 [...]
 [49042.221382] RIP: 0010:nf_hook_entry_head+0xaa/0x150

Fixes: 78d9f48f7f44 ("netfilter: nf_tables: add devices to existing flowtable")
Fixes: b9703ed44ffb ("netfilter: nf_tables: support for adding new devices to an existing netdev chain")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
2 months agoipvs: Fix estimator kthreads preferred affinity
Frederic Weisbecker [Tue, 29 Jul 2025 12:26:11 +0000 (14:26 +0200)]
ipvs: Fix estimator kthreads preferred affinity

The estimator kthreads' affinity is defined by sysctl overwritten
preferences and applied through a plain call to the scheduler's affinity
API.

However, since the introduction of managed kthreads preferred affinity,
such a practice shortcuts the kthreads core code, which eventually
overwrites the target with the default unbound affinity.

Fix this by using the appropriate kthread API.

Fixes: d1a89197589c ("kthread: Default affine kthread to its preferred NUMA node")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Florian Westphal <fw@strlen.de>
2 months agonetfilter: nft_set_pipapo: fix null deref for empty set
Florian Westphal [Mon, 11 Aug 2025 10:26:10 +0000 (12:26 +0200)]
netfilter: nft_set_pipapo: fix null deref for empty set

Blamed commit broke the check for a null scratch map:
  -  if (unlikely(!m || !*raw_cpu_ptr(m->scratch)))
  +  if (unlikely(!raw_cpu_ptr(m->scratch)))

This should have been "if (!*raw_ ...)".
Use the pattern of the avx2 version which is more readable.

This can only be reproduced if avx2 support isn't available.
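
The difference between the two checks can be shown with an ordinary pointer-to-pointer in userspace. The pipapo_toy_* types and helpers are hypothetical. A plain 'slot' pointer stands in for the per-CPU slot address that raw_cpu_ptr() yields in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

struct pipapo_toy_scratch { int dummy; };

struct pipapo_toy_match {
	struct pipapo_toy_scratch **slot;  /* address of the slot holding the map */
};

/* Broken check: tests the slot address, which is valid even for an empty set. */
static bool pipapo_toy_missing_broken(const struct pipapo_toy_match *m)
{
	return !m->slot;
}

/* Fixed check: dereferences the slot to test the scratch map itself. */
static bool pipapo_toy_missing_fixed(const struct pipapo_toy_match *m)
{
	return !m || !*m->slot;
}
```

With an empty slot the broken check reports that a scratch map is present, so the caller would go on to use a NULL map. The fixed check reports it as missing.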

Fixes: d8d871a35ca9 ("netfilter: nft_set_pipapo: merge pipapo_get/lookup")
Signed-off-by: Florian Westphal <fw@strlen.de>
2 months agoselftests: tls: test TCP stealing data from under the TLS socket
Jakub Kicinski [Thu, 7 Aug 2025 23:29:07 +0000 (16:29 -0700)]
selftests: tls: test TCP stealing data from under the TLS socket

Check a race where data disappears from the TCP socket after
TLS signaled that it's ready to receive.

  ok 6 global.data_steal
  #  RUN           tls_basic.base_base ...
  #            OK  tls_basic.base_base

Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250807232907.600366-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agotls: handle data disappearing from under the TLS ULP
Jakub Kicinski [Thu, 7 Aug 2025 23:29:06 +0000 (16:29 -0700)]
tls: handle data disappearing from under the TLS ULP

TLS expects that it owns the receive queue of the TCP socket.
This cannot be guaranteed in case the reader of the TCP socket
entered before the TLS ULP was installed, or uses some non-standard
read API (e.g. zerocopy ones). Replace the WARN_ON() and a buggy
early exit (which leaves anchor pointing to a freed skb) with real
error handling. Wipe the parsing state and tell the reader to retry.

We already reload the anchor every time we (re)acquire the socket lock,
so the only condition we need to avoid is an out of bounds read
(not having enough bytes in the socket for previously parsed record len).

If some data was read from under TLS but there's enough in the queue,
we'll reload and decrypt what is most likely not a valid TLS record,
leading to some undefined behavior from the TLS perspective (corrupting
a stream? missing an alert? missing an attack?), but no kernel crash
should take place.
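
The condition described above (fewer bytes queued than the previously parsed record length) can be modelled in userspace roughly as follows. This is only a sketch: struct strp_model, strp_model_check() and the choice of -EAGAIN as the "retry" signal are assumptions made for illustration, not the names or error code used by the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of the receive-side parsing state. */
struct strp_model {
	size_t queued_bytes;	/* bytes currently in the TCP receive queue */
	size_t record_len;	/* length of the previously parsed record, 0 if none */
};

/* Returns 0 when the parsed record can still be loaded without reading
 * out of bounds. Otherwise wipes the parsing state and returns -EAGAIN
 * (assumed here) so that the reader retries. */
static int strp_model_check(struct strp_model *s)
{
	assert(s != NULL);

	if (s->queued_bytes < s->record_len) {
		s->record_len = 0;	/* forget the stale parse result */
		return -EAGAIN;
	}
	return 0;
}
```

Note that the model, like the description above, does not try to detect data that was stolen while enough bytes remain; it only prevents the out-of-bounds case.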

Reported-by: William Liu <will@willsroot.io>
Reported-by: Savino Dicanosa <savy@syst3mfailure.io>
Link: https://lore.kernel.org/tFjq_kf7sWIG3A7CrCg_egb8CVsT_gsmHAK0_wxDPJXfIzxFAMxqmLwp3MlU5EHiet0AwwJldaaFdgyHpeIUCS-3m3llsmRzp9xIOBR4lAI=@syst3mfailure.io
Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250807232907.600366-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge branch 'net-airoha-introduce-npu-callbacks-for-wlan-offloading'
Jakub Kicinski [Wed, 13 Aug 2025 01:58:33 +0000 (18:58 -0700)]
Merge branch 'net-airoha-introduce-npu-callbacks-for-wlan-offloading'

Lorenzo Bianconi says:

====================
net: airoha: Introduce NPU callbacks for wlan offloading

Similar to wired traffic, the EN7581 SoC allows offloading traffic
to/from the MT76 wireless NIC by configuring the NPU module via the
Netfilter flowtable. This series introduces the necessary NPU callbacks
used by the MT7996 driver in order to enable the offloading.
MT76 support has been posted as RFC in [0] in order to show how the
APIs are consumed.
====================

Link: https://patch.msgid.link/20250811-airoha-en7581-wlan-offlaod-v7-0-58823603bb4e@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: airoha: Add airoha_offload.h header
Lorenzo Bianconi [Mon, 11 Aug 2025 15:31:42 +0000 (17:31 +0200)]
net: airoha: Add airoha_offload.h header

Move NPU definitions to airoha_offload.h in include/linux/soc/airoha/ in
order to allow the MT76 driver to access the callback definitions.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250811-airoha-en7581-wlan-offlaod-v7-7-58823603bb4e@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: airoha: npu: Enable core 3 for WiFi offloading
Lorenzo Bianconi [Mon, 11 Aug 2025 15:31:41 +0000 (17:31 +0200)]
net: airoha: npu: Enable core 3 for WiFi offloading

NPU core 3 is responsible for WiFi offloading so enable it during NPU
probe.

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250811-airoha-en7581-wlan-offlaod-v7-6-58823603bb4e@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: airoha: npu: Read NPU wlan interrupt lines from the DTS
Lorenzo Bianconi [Mon, 11 Aug 2025 15:31:40 +0000 (17:31 +0200)]
net: airoha: npu: Read NPU wlan interrupt lines from the DTS

Read all NPU wlan IRQ lines from the NPU device-tree node.
NPU module fires wlan irq lines when the traffic to/from the WiFi NIC is
not hw accelerated (these interrupts will be consumed by the MT76 driver
in subsequent patches).
This is a preliminary patch to enable wlan flowtable offload for EN7581
SoC.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250811-airoha-en7581-wlan-offlaod-v7-5-58823603bb4e@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: airoha: npu: Add wlan irq management callbacks
Lorenzo Bianconi [Mon, 11 Aug 2025 15:31:39 +0000 (17:31 +0200)]
net: airoha: npu: Add wlan irq management callbacks

Introduce callbacks used by the MT76 driver to configure NPU SoC
interrupts. This is a preliminary patch to enable wlan flowtable
offload for EN7581 SoC with MT76 driver.

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250811-airoha-en7581-wlan-offlaod-v7-4-58823603bb4e@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: airoha: npu: Add wlan_{send,get}_msg NPU callbacks
Lorenzo Bianconi [Mon, 11 Aug 2025 15:31:38 +0000 (17:31 +0200)]
net: airoha: npu: Add wlan_{send,get}_msg NPU callbacks

Introduce wlan_send_msg() and wlan_get_msg() NPU wlan callbacks used
by the wlan driver (MT76) to initialize NPU module registers in order to
offload wireless-wired traffic.
This is a preliminary patch to enable wlan flowtable offload for EN7581
SoC with MT76 driver.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250811-airoha-en7581-wlan-offlaod-v7-3-58823603bb4e@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: airoha: npu: Add NPU wlan memory initialization commands
Lorenzo Bianconi [Mon, 11 Aug 2025 15:31:37 +0000 (17:31 +0200)]
net: airoha: npu: Add NPU wlan memory initialization commands

Introduce wlan_init_reserved_memory callback used by MT76 driver during
NPU wlan offloading setup.
This is a preliminary patch to enable wlan flowtable offload for EN7581
SoC with MT76 driver.

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250811-airoha-en7581-wlan-offlaod-v7-2-58823603bb4e@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agodt-bindings: net: airoha: npu: Add memory regions used for wlan offload
Lorenzo Bianconi [Mon, 11 Aug 2025 15:31:36 +0000 (17:31 +0200)]
dt-bindings: net: airoha: npu: Add memory regions used for wlan offload

Document memory regions used by Airoha EN7581 NPU for wlan traffic
offloading. The newly added memory regions do not introduce any
backward compatibility issues since they will be used only to offload
traffic to/from the MT76 wireless NIC. MT76 probing will not fail
if these memory regions are not provided; it will just disable
offloading via the NPU module.

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250811-airoha-en7581-wlan-offlaod-v7-1-58823603bb4e@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge branch 'selftests-drv-net-improve-zerocopy-tests'
Jakub Kicinski [Wed, 13 Aug 2025 01:27:44 +0000 (18:27 -0700)]
Merge branch 'selftests-drv-net-improve-zerocopy-tests'

Jakub Kicinski says:

====================
selftests: drv-net: improve zerocopy tests

A few tweaks to the devmem test to make it more "NIPA-compatible".
We still need a fix to make sure that the test sets hds threshold
to 0. Taehee is presumably already/still working on that:
https://lore.kernel.org/20250702104249.1665034-1-ap420073@gmail.com
so I'm not including my version.

  # ./tools/testing/selftests/drivers/net/hw/devmem.py
  TAP version 13
  1..3
  ok 1 devmem.check_rx
  ok 2 devmem.check_tx
  ok 3 devmem.check_tx_chunks
  # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0
====================

Link: https://patch.msgid.link/20250811231334.561137-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: drv-net: devmem: flip the direction of Tx tests
Jakub Kicinski [Mon, 11 Aug 2025 23:13:34 +0000 (16:13 -0700)]
selftests: drv-net: devmem: flip the direction of Tx tests

The Device Under Test should always be the local system.
While the Rx test gets this right the Tx test is sending
from remote to local. So Tx of DMABUF memory happens on remote.

These tests never run in NIPA since we don't have a compatible
device so we haven't caught this.

Reviewed-by: Joe Damato <joe@dama.to>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250811231334.561137-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: net: terminate bkg() commands on exception
Jakub Kicinski [Mon, 11 Aug 2025 23:13:33 +0000 (16:13 -0700)]
selftests: net: terminate bkg() commands on exception

There are a number of:

  with bkg("socat ..LISTEN..", exit_wait=True)

uses in the tests. If whatever is supposed to send the traffic
fails we will get stuck in the bkg(). Try to kill the process
in case of exception, to avoid the long wait.

A specific example where this happens is the devmem Tx tests.

Reviewed-by: Joe Damato <joe@dama.to>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250811231334.561137-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: drv-net: devmem: add / correct the IPv6 support
Jakub Kicinski [Mon, 11 Aug 2025 23:13:32 +0000 (16:13 -0700)]
selftests: drv-net: devmem: add / correct the IPv6 support

We need to use bracketed IPv6 addresses for socat.

Reviewed-by: Joe Damato <joe@dama.to>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250811231334.561137-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: drv-net: devmem: remove sudo from system() calls
Jakub Kicinski [Mon, 11 Aug 2025 23:13:31 +0000 (16:13 -0700)]
selftests: drv-net: devmem: remove sudo from system() calls

The general expectation for network HW selftests is that they
will be run as root. sudo doesn't seem to work on NIPA VMs.
While it's probably something solvable in the setup I think we should
remove the sudos. devmem is the only networking test using sudo.

Reviewed-by: Joe Damato <joe@dama.to>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250811231334.561137-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: drv-net: add configs for zerocopy Rx
Jakub Kicinski [Mon, 11 Aug 2025 23:13:30 +0000 (16:13 -0700)]
selftests: drv-net: add configs for zerocopy Rx

Looks like neither IO_URING nor UDMABUF is enabled even though
iou-zcrx.py and devmem.py (respectively) need those.
IO_URING gets enabled by default but UDMABUF is missing.

Reviewed-by: Joe Damato <joe@dama.to>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250811231334.561137-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge branch 'net-stmmac-improbe-suspend-resume-architecture'
Jakub Kicinski [Wed, 13 Aug 2025 01:04:56 +0000 (18:04 -0700)]
Merge branch 'net-stmmac-improbe-suspend-resume-architecture'

Russell King says:

====================
net: stmmac: improve suspend/resume architecture

This series improves the stmmac suspend/resume architecture by
providing a couple of method hooks in struct plat_stmmacenet_data which
are called by core code, and thus are available for any of the
platform glue drivers, whether using a platform or PCI device.

As these methods are called by core code, we can also provide a simple
PM ops structure also in the core code for converted glue drivers to
use.

The remainder of the patches convert the various drivers.
====================

Link: https://patch.msgid.link/aJo7kvoub5voHOUQ@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: stmmac: mediatek: convert to resume() method
Russell King (Oracle) [Mon, 11 Aug 2025 18:51:24 +0000 (19:51 +0100)]
net: stmmac: mediatek: convert to resume() method

Convert mediatek to use the resume() platform method rather than the
init() platform method as mediatek_dwmac_init() is only called from
the resume paths.

This will ensure that in a future commit, mediatek_dwmac_init() won't
be called when probing the main part of the stmmac driver.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1ulXcC-008grN-Hc@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: stmmac: stm32: convert to suspend()/resume() methods
Russell King (Oracle) [Mon, 11 Aug 2025 18:51:19 +0000 (19:51 +0100)]
net: stmmac: stm32: convert to suspend()/resume() methods

Convert stm32 to use the new suspend() and resume() methods rather
than implementing these in custom wrappers around the main driver's
suspend/resume methods. This allows this driver to use the stmmac
simple PM ops structure.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1ulXc7-008grH-Dh@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: stmmac: rk: convert to suspend()/resume() methods
Russell King (Oracle) [Mon, 11 Aug 2025 18:51:14 +0000 (19:51 +0100)]
net: stmmac: rk: convert to suspend()/resume() methods

Convert rk to use the new suspend() and resume() methods rather than
implementing these in custom wrappers around the main driver's
suspend/resume methods. This allows this driver to use the stmmac
simple PM ops structure.

We can further simplify the driver as there is no need to track whether
the device was suspended, we only need to check whether the device is
wakeup capable in the resume method. This is because the resume method
will only be called after the suspend method.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1ulXc2-008grB-9k@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: stmmac: pci: convert to suspend()/resume() methods
Russell King (Oracle) [Mon, 11 Aug 2025 18:51:09 +0000 (19:51 +0100)]
net: stmmac: pci: convert to suspend()/resume() methods

Convert pci to use the new suspend() and resume() methods rather
than implementing these in custom wrappers around the main driver's
suspend/resume methods.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1ulXbx-008gr4-5H@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: stmmac: loongson: convert to suspend()/resume() methods
Russell King (Oracle) [Mon, 11 Aug 2025 18:51:04 +0000 (19:51 +0100)]
net: stmmac: loongson: convert to suspend()/resume() methods

Convert loongson to use the new suspend() and resume() methods rather
than implementing these in custom wrappers around the main driver's
suspend/resume methods. This allows this driver to use the stmmac
simple PM ops structure.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1ulXbs-008gqy-16@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: stmmac: intel: convert to suspend()/resume() methods
Russell King (Oracle) [Mon, 11 Aug 2025 18:50:58 +0000 (19:50 +0100)]
net: stmmac: intel: convert to suspend()/resume() methods

Convert intel to use the new suspend() and resume() methods rather
than implementing these in custom wrappers around the main driver's
suspend/resume methods. This allows this driver to use the stmmac
simple PM ops structure.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1ulXbm-008gqs-P9@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: stmmac: platform: legacy hooks for suspend()/resume() methods
Russell King (Oracle) [Mon, 11 Aug 2025 18:50:53 +0000 (19:50 +0100)]
net: stmmac: platform: legacy hooks for suspend()/resume() methods

Add legacy hooks for the suspend() and resume() methods to forward
these calls to the init() and exit() methods when the platform code
hasn't populated the two former methods. This allows us to get rid
of stmmac_pltfr_suspend() and stmmac_pltfr_resume().
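
The fallback described above can be sketched as a small userspace model. All names here (struct plat_model, model_suspend(), model_resume(), the demo callbacks) are hypothetical, and the signatures are simplified; the sketch assumes that suspend falls back to exit() and resume falls back to init().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the platform data hooks. */
struct plat_model {
	int  (*init)(void *priv);
	void (*exit)(void *priv);
	int  (*suspend)(void *priv);	/* optional, overrides exit() */
	int  (*resume)(void *priv);	/* optional, overrides init() */
	void *priv;
};

/* Prefer suspend(); otherwise forward to the legacy exit() hook. */
static int model_suspend(const struct plat_model *p)
{
	assert(p != NULL);
	if (p->suspend)
		return p->suspend(p->priv);
	if (p->exit)
		p->exit(p->priv);
	return 0;
}

/* Prefer resume(); otherwise forward to the legacy init() hook. */
static int model_resume(const struct plat_model *p)
{
	assert(p != NULL);
	if (p->resume)
		return p->resume(p->priv);
	if (p->init)
		return p->init(p->priv);
	return 0;
}

/* Demo legacy callbacks: priv points at an int "powered" flag. */
static int demo_init(void *priv)
{
	*(int *)priv = 1;
	return 0;
}

static void demo_exit(void *priv)
{
	*(int *)priv = 0;
}
```

A glue driver that only fills in init() and exit() keeps working through the fallback, while a converted driver supplies suspend() and resume() directly.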

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1ulXbh-008gql-LO@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: stmmac: provide a set of simple PM ops
Russell King (Oracle) [Mon, 11 Aug 2025 18:50:48 +0000 (19:50 +0100)]
net: stmmac: provide a set of simple PM ops

Several drivers will want to make use of simple PM operations, so
provide these from the core driver.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1ulXbc-008gqf-GJ@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: stmmac: add suspend()/resume() platform ops
Russell King (Oracle) [Mon, 11 Aug 2025 18:50:43 +0000 (19:50 +0100)]
net: stmmac: add suspend()/resume() platform ops

Add suspend/resume platform operations, which, when populated, override
the init/exit platform operations when we suspend and resume. These
suspend()/resume() methods are called by core code, and thus are
designed to support any struct device, not just platform devices. This
allows them to be used by the PCI drivers we have.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1ulXbX-008gqZ-Bb@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge branch 'selftest-af_unix-enable-wall-and-wflex-array-member-not-at-end'
Jakub Kicinski [Wed, 13 Aug 2025 00:58:35 +0000 (17:58 -0700)]
Merge branch 'selftest-af_unix-enable-wall-and-wflex-array-member-not-at-end'

Kuniyuki Iwashima says:

====================
selftest: af_unix: Enable -Wall and -Wflex-array-member-not-at-end.

This series fixes 4 warnings caught by -Wall and
-Wflex-array-member-not-at-end.
====================

Link: https://patch.msgid.link/20250811215432.3379570-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftest: af_unix: Add -Wall and -Wflex-array-member-not-at-end to CFLAGS.
Kuniyuki Iwashima [Mon, 11 Aug 2025 21:53:04 +0000 (21:53 +0000)]
selftest: af_unix: Add -Wall and -Wflex-array-member-not-at-end to CFLAGS.

-Wall and -Wflex-array-member-not-at-end caught some warnings that
will be fixed in later patches.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250811215432.3379570-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftest: af_unix: Silence -Wall warning for scm_pid.c.
Kuniyuki Iwashima [Mon, 11 Aug 2025 21:53:07 +0000 (21:53 +0000)]
selftest: af_unix: Silence -Wall warning for scm_pid.c.

-Wall found 2 unused variables in scm_pidfd.c:

scm_pidfd.c: In function ‘parse_cmsg’:
scm_pidfd.c:140:13: warning: unused variable ‘data’ [-Wunused-variable]
  140 |         int data = 0;
      |             ^~~~
scm_pidfd.c: In function ‘cmsg_check_dead’:
scm_pidfd.c:246:15: warning: unused variable ‘client_pid’ [-Wunused-variable]
  246 |         pid_t client_pid;
      |               ^~~~~~~~~~

Let's remove these variables.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250811215432.3379570-5-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftest: af_unix: Silence -Wflex-array-member-not-at-end warning for scm_rights.c.
Kuniyuki Iwashima [Mon, 11 Aug 2025 21:53:06 +0000 (21:53 +0000)]
selftest: af_unix: Silence -Wflex-array-member-not-at-end warning for scm_rights.c.

scm_rights.c has no problem in functionality, but when compiled with
-Wflex-array-member-not-at-end, it shows this warning:

scm_rights.c: In function ‘__send_fd’:
scm_rights.c:275:32: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
  275 |                 struct cmsghdr cmsghdr;
      |                                ^~~~~~~

Let's silence it.
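
One common way to avoid embedding struct cmsghdr in front of other members is to build the control message in an aligned char buffer with the CMSG_*() macros. The sketch below shows that general pattern only; it is not necessarily how this patch silences the warning, and MODEL_NFDS and fill_rights() are hypothetical names.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

#define MODEL_NFDS 2	/* hypothetical: number of descriptors passed */

/* Fill an SCM_RIGHTS control message into a caller-provided buffer.
 * The buffer must be suitably aligned for struct cmsghdr and at least
 * CMSG_SPACE(sizeof(int) * MODEL_NFDS) bytes long. */
static void fill_rights(struct msghdr *msg, char *buf, size_t buflen,
			const int fds[MODEL_NFDS])
{
	struct cmsghdr *cmsg;

	assert(msg != NULL && buf != NULL && fds != NULL);
	assert(buflen >= CMSG_SPACE(sizeof(int) * MODEL_NFDS));

	memset(buf, 0, buflen);
	msg->msg_control = buf;
	msg->msg_controllen = CMSG_SPACE(sizeof(int) * MODEL_NFDS);

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int) * MODEL_NFDS);
	memcpy(CMSG_DATA(cmsg), fds, sizeof(int) * MODEL_NFDS);
}
```

A caller could declare the buffer as `_Alignas(struct cmsghdr) char buf[CMSG_SPACE(sizeof(int) * MODEL_NFDS)];` so that no structure with a flexible array member is followed by other members.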

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250811215432.3379570-4-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftest: af_unix: Silence -Wflex-array-member-not-at-end warning for scm_inq.c.
Kuniyuki Iwashima [Mon, 11 Aug 2025 21:53:05 +0000 (21:53 +0000)]
selftest: af_unix: Silence -Wflex-array-member-not-at-end warning for scm_inq.c.

scm_inq.c has no problem in functionality, but when compiled with
-Wflex-array-member-not-at-end, it shows this warning:

scm_inq.c:15:24: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
   15 |         struct cmsghdr cmsghdr;
      |                        ^~~~~~~

Let's silence it.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250811215432.3379570-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge branch 'netconsole-reuse-netpoll_parse_ip_addr-in-configfs-helpers'
Jakub Kicinski [Wed, 13 Aug 2025 00:32:44 +0000 (17:32 -0700)]
Merge branch 'netconsole-reuse-netpoll_parse_ip_addr-in-configfs-helpers'

Breno Leitao says:

====================
netconsole: reuse netpoll_parse_ip_addr in configfs helpers

This patchset refactors the IP address parsing logic in the netconsole
driver to eliminate code duplication and improve maintainability. The
changes centralize IPv4 and IPv6 address parsing into a single function
(netpoll_parse_ip_addr). For that, it needs to teach
netpoll_parse_ip_addr() to handle strings with newlines, which is the
type of string coming from configfs.

Background

The netconsole driver currently has duplicate IP address parsing logic
in both local_ip_store() and remote_ip_store() functions. This
duplication increases the risk of inconsistencies and makes the code
harder to maintain.

Benefits

 * Reduced code duplication: ~40 lines of duplicate parsing logic
   eliminated
 * Improved robustness: Centralized parsing reduces the chance
   of inconsistencies
 * Easier to maintain: Code follows the netdev way more closely

v3: https://lore.kernel.org/20250723-netconsole_ref-v3-0-8be9b24e4a99@debian.org
v2: https://lore.kernel.org/20250721-netconsole_ref-v2-0-b42f1833565a@debian.org
v1: https://lore.kernel.org/20250718-netconsole_ref-v1-0-86ef253b7a7a@debian.org
====================

Link: https://patch.msgid.link/20250811-netconsole_ref-v4-0-9c510d8713a2@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonetconsole: use netpoll_parse_ip_addr in local_ip_store
Breno Leitao [Mon, 11 Aug 2025 18:13:28 +0000 (11:13 -0700)]
netconsole: use netpoll_parse_ip_addr in local_ip_store

Replace manual IP address parsing with a call to netpoll_parse_ip_addr
in remote_ip_store(), simplifying the code and reducing the chance of
errors.

The error message is removed, since it is not good practice to
pr_err() if the user passes a wrong value in configfs.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonetconsole: use netpoll_parse_ip_addr in local_ip_store
Breno Leitao [Mon, 11 Aug 2025 18:13:27 +0000 (11:13 -0700)]
netconsole: use netpoll_parse_ip_addr in local_ip_store

Replace manual IP address parsing with a call to netpoll_parse_ip_addr
in local_ip_store(), simplifying the code and reducing the chance of
errors.

Also, remove the pr_err() if the user enters an invalid value in
configfs entries. pr_err() is not the best way to alert the user that
the configuration is invalid.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonetconsole: add support for strings with new line in netpoll_parse_ip_addr
Breno Leitao [Mon, 11 Aug 2025 18:13:26 +0000 (11:13 -0700)]
netconsole: add support for strings with new line in netpoll_parse_ip_addr

The current IP address parsing logic fails when the input string
contains a trailing newline character. This can occur when IP
addresses are provided through configfs, which contains newlines in
a const buffer.

Teach netpoll_parse_ip_addr() how to ignore newlines at the end of the
IPs. Also, simplify the code by:

 * No need to check for separators. Try to parse ipv4, if it fails try
   ipv6 similarly to ceph_pton()
 * If ipv6 is not supported, don't call in6_pton() at all.
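
The parsing order described above (ignore a trailing newline, try IPv4, then fall back to IPv6) can be approximated in userspace with inet_pton(). This is a sketch under assumptions: parse_ip_model() and union addr_model are hypothetical names, the return convention (0 for IPv4, 1 for IPv6, -1 on failure) is chosen for illustration, and the temporary copy exists only because inet_pton() needs a NUL-terminated string.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical holder for either address family. */
union addr_model {
	struct in_addr  v4;
	struct in6_addr v6;
};

/* Returns 0 for IPv4, 1 for IPv6, -1 if the string is not an address.
 * A single trailing newline, as written via "echo" into a configfs
 * file, is ignored. */
static int parse_ip_model(const char *str, union addr_model *addr)
{
	char tmp[INET6_ADDRSTRLEN];
	size_t len;

	assert(str != NULL && addr != NULL);

	len = strlen(str);
	if (len > 0 && str[len - 1] == '\n')
		len--;
	if (len == 0 || len >= sizeof(tmp))
		return -1;

	/* The input is treated as const, so strip the newline in a copy. */
	memcpy(tmp, str, len);
	tmp[len] = '\0';

	if (inet_pton(AF_INET, tmp, &addr->v4) == 1)
		return 0;
	if (inet_pton(AF_INET6, tmp, &addr->v6) == 1)
		return 1;
	return -1;
}
```

No separator scan is needed: a string that fails as IPv4 is simply retried as IPv6.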

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20250811-netconsole_ref-v4-2-9c510d8713a2@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonetconsole: move netpoll_parse_ip_addr() earlier for reuse
Breno Leitao [Mon, 11 Aug 2025 18:13:25 +0000 (11:13 -0700)]
netconsole: move netpoll_parse_ip_addr() earlier for reuse

Move netpoll_parse_ip_addr() earlier in the file to be reused in
other functions, such as local_ip_store(). This avoids duplicate
address parsing logic and centralizes validation for both IPv4
and IPv6 string input.

No functional changes intended.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250811-netconsole_ref-v4-1-9c510d8713a2@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: mdio: mdio-bcm-unimac: Refine incorrect clock message
Florian Fainelli [Mon, 11 Aug 2025 16:59:21 +0000 (09:59 -0700)]
net: mdio: mdio-bcm-unimac: Refine incorrect clock message

In light of a81649a4efd3 ("net: mdio: mdio-bcm-unimac: Correct rate
fallback logic"), it became clear that the warning should be specific to
the MDIO controller instance, and there should be further information
provided to indicate what is wrong, whether the requested clock
frequency or the rate calculation. Clarify the message accordingly.

Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250811165921.392030-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet/sched: Remove redundant memset(0) call in reset_policy()
Thorsten Blum [Mon, 11 Aug 2025 16:40:38 +0000 (18:40 +0200)]
net/sched: Remove redundant memset(0) call in reset_policy()

The call to nla_strscpy() already zero-pads the tail of the destination
buffer which makes the additional memset(0) call redundant. Remove it.
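
Why the extra memset() is redundant can be seen from a userspace model of a copy that zero-pads its destination. copy_pad_model() is a hypothetical stand-in, not nla_strscpy() itself; the -E2BIG return on truncation is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst (dstsize bytes), always NUL-terminating and
 * zero-filling every byte after the copied string. Returns the number
 * of characters copied, or -E2BIG if src had to be truncated. */
static long copy_pad_model(char *dst, const char *src, size_t dstsize)
{
	size_t n = 0;

	assert(dst != NULL && src != NULL && dstsize > 0);

	while (n < dstsize - 1 && src[n] != '\0') {
		dst[n] = src[n];
		n++;
	}
	/* Zero the whole tail, including the terminator. */
	memset(dst + n, 0, dstsize - n);

	return src[n] == '\0' ? (long)n : -E2BIG;
}
```

Since the tail is already cleared by the copy, clearing the buffer again before or after it changes nothing.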

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20250811164039.43250-1-thorsten.blum@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoptp: prevent possible ABBA deadlock in ptp_clock_freerun()
Jeongjun Park [Mon, 28 Jul 2025 06:26:49 +0000 (15:26 +0900)]
ptp: prevent possible ABBA deadlock in ptp_clock_freerun()

syzbot reported the following ABBA deadlock:

       CPU0                           CPU1
       ----                           ----
  n_vclocks_store()
    lock(&ptp->n_vclocks_mux) [1]
        (physical clock)
                                     pc_clock_adjtime()
                                       lock(&clk->rwsem) [2]
                                        (physical clock)
                                       ...
                                       ptp_clock_freerun()
                                         ptp_vclock_in_use()
                                           lock(&ptp->n_vclocks_mux) [3]
                                              (physical clock)
    ptp_clock_unregister()
      posix_clock_unregister()
        lock(&clk->rwsem) [4]
          (virtual clock)

Since ptp virtual clock is registered only under ptp physical clock, both
ptp_clock and posix_clock must be physical clocks for ptp_vclock_in_use()
to lock &ptp->n_vclocks_mux and check ptp->n_vclocks.

However, when unregistering vclocks in n_vclocks_store(), the
ptp->n_vclocks_mux being held is a physical clock lock, but the
clk->rwsem taken in ptp_clock_unregister(), called through
device_for_each_child_reverse(), is a virtual clock lock.

Therefore, clk->rwsem used in CPU0 and clk->rwsem used in CPU1 are
different locks, but lockdep reports a false positive because it
determines the possibility of deadlock by lock class.

To solve this, lock subclass annotation must be added to the posix_clock
rwsem of the vclock.

Reported-by: syzbot+7cfb66a237c4a5fb22ad@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7cfb66a237c4a5fb22ad
Fixes: 73f37068d540 ("ptp: support ptp physical/virtual clocks conversion")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250728062649.469882-1-aha310510@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agophonet: add __rcu annotations
Eric Dumazet [Mon, 11 Aug 2025 14:52:52 +0000 (14:52 +0000)]
phonet: add __rcu annotations

Remove the following sparse errors.

make C=2 net/phonet/socket.o net/phonet/af_phonet.o
  CHECK   net/phonet/socket.c
net/phonet/socket.c:619:14: error: incompatible types in comparison expression (different address spaces):
net/phonet/socket.c:619:14:    struct sock [noderef] __rcu *
net/phonet/socket.c:619:14:    struct sock *
net/phonet/socket.c:642:17: error: incompatible types in comparison expression (different address spaces):
net/phonet/socket.c:642:17:    struct sock [noderef] __rcu *
net/phonet/socket.c:642:17:    struct sock *
net/phonet/socket.c:658:17: error: incompatible types in comparison expression (different address spaces):
net/phonet/socket.c:658:17:    struct sock [noderef] __rcu *
net/phonet/socket.c:658:17:    struct sock *
net/phonet/socket.c:677:25: error: incompatible types in comparison expression (different address spaces):
net/phonet/socket.c:677:25:    struct sock [noderef] __rcu *
net/phonet/socket.c:677:25:    struct sock *
net/phonet/socket.c:726:21: warning: context imbalance in 'pn_res_seq_start' - wrong count at exit
net/phonet/socket.c:741:13: warning: context imbalance in 'pn_res_seq_stop' - wrong count at exit
  CHECK   net/phonet/af_phonet.c
net/phonet/af_phonet.c:35:14: error: incompatible types in comparison expression (different address spaces):
net/phonet/af_phonet.c:35:14:    struct phonet_protocol const [noderef] __rcu *
net/phonet/af_phonet.c:35:14:    struct phonet_protocol const *
net/phonet/af_phonet.c:474:17: error: incompatible types in comparison expression (different address spaces):
net/phonet/af_phonet.c:474:17:    struct phonet_protocol const [noderef] __rcu *
net/phonet/af_phonet.c:474:17:    struct phonet_protocol const *
net/phonet/af_phonet.c:486:9: error: incompatible types in comparison expression (different address spaces):
net/phonet/af_phonet.c:486:9:    struct phonet_protocol const [noderef] __rcu *
net/phonet/af_phonet.c:486:9:    struct phonet_protocol const *

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Rémi Denis-Courmont <courmisch@gmail.com>
Link: https://patch.msgid.link/20250811145252.1007242-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agodt-bindings: nfc: ti,trf7970a: Drop 'db' suffix duplicating dtschema
Krzysztof Kozlowski [Mon, 11 Aug 2025 14:22:36 +0000 (16:22 +0200)]
dt-bindings: nfc: ti,trf7970a: Drop 'db' suffix duplicating dtschema

A common property unit suffix '-db' was added to dtschema, thus
in-kernel bindings should not reference the type.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20250811142235.170407-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>