www.infradead.org Git - nvme.git/log
9 months ago  net: stmmac: restart LPI timer after cleaning transmit descriptors
Russell King (Oracle) [Mon, 13 Jan 2025 11:46:20 +0000 (11:46 +0000)]
net: stmmac: restart LPI timer after cleaning transmit descriptors

Fix a bug in the LPI handling, where it is possible to immediately
enter LPI mode after cleaning the transmit descriptors when all queues
are empty rather than waiting for the LPI timeout to expire.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tXItg-000MBg-TW@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: stmmac: combine stmmac_enable_eee_mode()
Russell King (Oracle) [Mon, 13 Jan 2025 11:46:15 +0000 (11:46 +0000)]
net: stmmac: combine stmmac_enable_eee_mode()

Combine stmmac_enable_eee_mode() with stmmac_try_to_start_sw_lpi()
which makes the code easier to read and the flow more logical. We
can now trivially see that if the transmit queues are busy, we
(re-)start the eee_ctrl_timer. Otherwise, if the transmit path is
not already in LPI mode, we ask the hardware to enter LPI mode.

Now that we can see better what is going on here, I believe this
shows that there is a bug with the software LPI timer implementation.

The LPI timer is supposed to define how long we wait after the last
transmission completes before we start signalling LPI. However,
this code structure shows that if all transmit queues are empty,
and stmmac_try_to_start_sw_lpi() is called immediately after cleaning
the transmit queue, we will instruct the hardware to start signalling
LPI immediately.
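
The flow described above can be modelled in plain C as a rough sketch.
The struct and function names (lpi_model, try_to_start_sw_lpi_model) are
hypothetical stand-ins and not the driver's code, and the kernel timer is
reduced to a simple deadline value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names come from the driver. */
struct lpi_model {
	bool tx_busy;            /* work pending on any transmit queue */
	bool tx_path_in_lpi;     /* hardware already signalling LPI */
	uint64_t timer_deadline; /* stand-in for the eee_ctrl_timer expiry */
	uint64_t lpi_timeout;    /* configured LPI timer interval */
};

/* Mirrors the combined flow from the commit message. */
static void try_to_start_sw_lpi_model(struct lpi_model *m, uint64_t now)
{
	assert(m != NULL);

	if (m->tx_busy) {
		/* Queues busy: (re-)start the timer and try again later. */
		m->timer_deadline = now + m->lpi_timeout;
		return;
	}

	/*
	 * Queues idle: enter LPI at once. When this runs right after TX
	 * cleanup, no timeout has elapsed, which is the bug described.
	 */
	if (!m->tx_path_in_lpi)
		m->tx_path_in_lpi = true;
}
```

In this model, calling the function with idle queues enters LPI without
consulting lpi_timeout at all, which makes the problem easy to see.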

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tXItb-000MBa-OU@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: stmmac: provide function for restarting sw LPI timer
Russell King (Oracle) [Mon, 13 Jan 2025 11:46:10 +0000 (11:46 +0000)]
net: stmmac: provide function for restarting sw LPI timer

Provide a function that encapsulates restarting the software LPI
timer when we have determined that the transmit path is busy, or
when the EEE parameters have changed.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tXItW-000MBU-KQ@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: stmmac: provide stmmac_eee_tx_busy()
Russell King (Oracle) [Mon, 13 Jan 2025 11:46:05 +0000 (11:46 +0000)]
net: stmmac: provide stmmac_eee_tx_busy()

Extract the code which checks whether there's still work to do on any
of the stmmac transmit queues. This will allow us to combine
stmmac_enable_eee_mode() with stmmac_try_to_start_sw_lpi() in the
next patch.
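
As an illustration only, a busy check of this kind might look like the
sketch below. It assumes that "work to do" means the cleanup index has not
caught up with the producer index; tx_queue_model and eee_tx_busy_model
are hypothetical names, not the driver's types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical ring state: a queue has work pending while the cleanup
 * index (dirty_tx) has not caught up with the producer index (cur_tx).
 */
struct tx_queue_model {
	unsigned int cur_tx;
	unsigned int dirty_tx;
};

/* Returns true if any transmit queue still has descriptors to clean. */
static bool eee_tx_busy_model(const struct tx_queue_model *q, size_t n)
{
	assert(q != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (q[i].dirty_tx != q[i].cur_tx)
			return true;
	}
	return false;
}
```

Keeping the loop in one helper lets every caller ask the same question
without repeating the per-queue walk.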

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tXItR-000MBO-GF@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: stmmac: add stmmac_try_to_start_sw_lpi()
Russell King (Oracle) [Mon, 13 Jan 2025 11:46:00 +0000 (11:46 +0000)]
net: stmmac: add stmmac_try_to_start_sw_lpi()

There are two places which call stmmac_enable_eee_mode() and follow it
immediately by modifying the expiry of priv->eee_ctrl_timer. Both code
paths are trying to enable LPI mode. Remove this duplication by
providing a function for this.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tXItM-000MBI-CX@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: stmmac: check priv->eee_sw_timer_en in suspend path
Russell King (Oracle) [Mon, 13 Jan 2025 11:45:55 +0000 (11:45 +0000)]
net: stmmac: check priv->eee_sw_timer_en in suspend path

The suspend path uses priv->eee_enabled when cleaning up the software
timed LPI mode. Use priv->eee_sw_timer_en instead so we're consistently
using a single control for software-based timer handling.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tXItH-000MBC-8i@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: stmmac: simplify TX cleanup decision for ending sw LPI mode
Russell King (Oracle) [Mon, 13 Jan 2025 11:45:50 +0000 (11:45 +0000)]
net: stmmac: simplify TX cleanup decision for ending sw LPI mode

As mentioned in "net: stmmac: correct priv->eee_sw_timer_en setting",
we can simplify some fast-path tests.

The transmit cleaning path checks whether EEE is enabled, the transmit
path is not in LPI mode, and that we're using software timed mode.
Since the above mentioned commit, checking whether EEE is enabled is
no longer necessary as priv->eee_sw_timer_en will be false when EEE is
disabled. Simplify this test.
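
A small sketch of why the shorter test is equivalent, assuming the
invariant stated above (eee_sw_timer_en is never true while EEE is
disabled). The struct and function names are hypothetical; only the three
flag names are taken from the commit messages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the three flags named in the commit message. */
struct eee_flags {
	bool eee_enabled;
	bool tx_path_in_lpi_mode;
	bool eee_sw_timer_en;
};

/* Test as described before the change: all three conditions. */
static bool tx_clean_test_old(const struct eee_flags *f)
{
	assert(f != NULL);
	return f->eee_enabled && !f->tx_path_in_lpi_mode && f->eee_sw_timer_en;
}

/* Simplified test: relies on eee_sw_timer_en being false when EEE is off. */
static bool tx_clean_test_new(const struct eee_flags *f)
{
	assert(f != NULL);
	return !f->tx_path_in_lpi_mode && f->eee_sw_timer_en;
}

/* Checks that both tests agree on every state that respects the invariant. */
static bool tests_agree_under_invariant(void)
{
	for (int bits = 0; bits < 8; bits++) {
		struct eee_flags f = {
			.eee_enabled = bits & 1,
			.tx_path_in_lpi_mode = bits & 2,
			.eee_sw_timer_en = bits & 4,
		};

		if (f.eee_sw_timer_en && !f.eee_enabled)
			continue; /* state ruled out by the earlier commit */
		if (tx_clean_test_old(&f) != tx_clean_test_new(&f))
			return false;
	}
	return true;
}
```

The two tests differ only in the state the earlier commit rules out,
which is why that commit had to come first.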

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tXItC-000MB6-54@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: stmmac: correct priv->eee_sw_timer_en setting
Russell King (Oracle) [Mon, 13 Jan 2025 11:45:45 +0000 (11:45 +0000)]
net: stmmac: correct priv->eee_sw_timer_en setting

If we are disabling EEE/LPI, then we should not be enabling software
mode. The only time when we should is if EEE is active, and we are
wanting to use software-timed EEE mode.

Therefore, in the disable path of stmmac_eee_init(), ensure that
priv->eee_sw_timer_en is set false as we are going to be calling
del_timer_sync() on the timer.

This will allow us to simplify some fast-path tests in later patches.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tXIt7-000MB0-0W@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: stmmac: rename stmmac_disable_sw_eee_mode()
Russell King (Oracle) [Mon, 13 Jan 2025 11:45:39 +0000 (11:45 +0000)]
net: stmmac: rename stmmac_disable_sw_eee_mode()

stmmac_disable_sw_eee_mode() was not a good name for this function's
purpose, which is to stop transmitting LPI because we want to send a
packet. Rename it to stmmac_stop_sw_lpi().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tXIt1-000MAu-TE@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: ethernet: sunplus: Switch to ndo_eth_ioctl
谢致邦 (XIE Zhibang) [Mon, 13 Jan 2025 09:41:56 +0000 (09:41 +0000)]
net: ethernet: sunplus: Switch to ndo_eth_ioctl

The device ioctl handler no longer calls ndo_do_ioctl, but calls
ndo_eth_ioctl to handle mii ioctls since commit a76053707dbf
("dev_ioctl: split out ndo_eth_ioctl"). However, sunplus still used
ndo_do_ioctl when it was introduced. So switch to ndo_eth_ioctl.
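
The effect can be shown with a userspace model of the dispatch, offered
as a sketch under assumptions rather than the kernel's code. The
ops_model struct, dev_ioctl_model() and phy_ioctl_model() are
hypothetical; the three command values match the MII ioctl numbers in
<linux/sockios.h>, and returning -EOPNOTSUPP for a missing hook is an
assumption of this model.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Values match SIOCGMIIPHY/SIOCGMIIREG/SIOCSMIIREG in <linux/sockios.h>. */
#define MODEL_SIOCGMIIPHY 0x8947
#define MODEL_SIOCGMIIREG 0x8948
#define MODEL_SIOCSMIIREG 0x8949

/* Hypothetical cut-down ops table with only the two hooks discussed. */
struct ops_model {
	int (*ndo_do_ioctl)(int cmd);
	int (*ndo_eth_ioctl)(int cmd);
};

/* Model of the dispatcher: MII ioctls go to ndo_eth_ioctl only. */
static int dev_ioctl_model(const struct ops_model *ops, int cmd)
{
	assert(ops != NULL);

	switch (cmd) {
	case MODEL_SIOCGMIIPHY:
	case MODEL_SIOCGMIIREG:
	case MODEL_SIOCSMIIREG:
		if (!ops->ndo_eth_ioctl)
			return -EOPNOTSUPP;
		return ops->ndo_eth_ioctl(cmd);
	default:
		return -EOPNOTSUPP; /* other commands are out of scope here */
	}
}

/* Made-up driver handler that accepts every MII request. */
static int phy_ioctl_model(int cmd)
{
	(void)cmd;
	return 0;
}
```

A driver that fills in only ndo_do_ioctl never sees the MII requests in
this model, while the same handler registered as ndo_eth_ioctl does.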

Bad commit fd3040b9394c ("net: ethernet: Add driver for Sunplus SP7021")
was the initial driver commit, meaning that PHY IOCTLs were never
available on this driver. Therefore don't consider this as a fix.

Found by code inspection.

Signed-off-by: 谢致邦 (XIE Zhibang) <Yeking@Red54.com>
Link: https://patch.msgid.link/tencent_8CF8A72C708E96B9C7DC1AF96FEE19AF3D05@qq.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  Merge branch 'net-ethernet-simplify-few-things'
Jakub Kicinski [Wed, 15 Jan 2025 02:04:28 +0000 (18:04 -0800)]
Merge branch 'net-ethernet-simplify-few-things'

Krzysztof Kozlowski says:

====================
net: ethernet: Simplify few things

A few code simplifications without functional impact.
Not tested on hardware.
====================

Link: https://patch.msgid.link/20250112-syscon-phandle-args-net-v1-0-3423889935f7@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: stmmac: stm32: Use syscon_regmap_lookup_by_phandle_args
Krzysztof Kozlowski [Sun, 12 Jan 2025 13:32:47 +0000 (14:32 +0100)]
net: stmmac: stm32: Use syscon_regmap_lookup_by_phandle_args

Use syscon_regmap_lookup_by_phandle_args() which is a wrapper over
syscon_regmap_lookup_by_phandle() combined with getting the syscon
argument.  Besides simplifying the code, this annotates within one line
that the given phandle has arguments, so grepping for code is easier.

There is also no real benefit in printing errors on missing syscon
argument, because this is done just too late: runtime check on
static/build-time data.  Dtschema and Devicetree bindings offer the
static/build-time check for this already.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20250112-syscon-phandle-args-net-v1-5-3423889935f7@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: stmmac: sti: Use syscon_regmap_lookup_by_phandle_args
Krzysztof Kozlowski [Sun, 12 Jan 2025 13:32:46 +0000 (14:32 +0100)]
net: stmmac: sti: Use syscon_regmap_lookup_by_phandle_args

Use syscon_regmap_lookup_by_phandle_args() which is a wrapper over
syscon_regmap_lookup_by_phandle() combined with getting the syscon
argument.  Besides simplifying the code, this annotates within one line
that the given phandle has arguments, so grepping for code is easier.

There is also no real benefit in printing errors on missing syscon
argument, because this is done just too late: runtime check on
static/build-time data.  Dtschema and Devicetree bindings offer the
static/build-time check for this already.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20250112-syscon-phandle-args-net-v1-4-3423889935f7@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: stmmac: imx: Use syscon_regmap_lookup_by_phandle_args
Krzysztof Kozlowski [Sun, 12 Jan 2025 13:32:45 +0000 (14:32 +0100)]
net: stmmac: imx: Use syscon_regmap_lookup_by_phandle_args

Use syscon_regmap_lookup_by_phandle_args() which is a wrapper over
syscon_regmap_lookup_by_phandle() combined with getting the syscon
argument.  Besides simplifying the code, this annotates within one line
that the given phandle has arguments, so grepping for code is easier.

There is also no real benefit in printing errors on missing syscon
argument, because this is done just too late: runtime check on
static/build-time data.  Dtschema and Devicetree bindings offer the
static/build-time check for this already.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20250112-syscon-phandle-args-net-v1-3-3423889935f7@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: ti: am65-cpsw-nuss: Use syscon_regmap_lookup_by_phandle_args
Krzysztof Kozlowski [Sun, 12 Jan 2025 13:32:44 +0000 (14:32 +0100)]
net: ti: am65-cpsw-nuss: Use syscon_regmap_lookup_by_phandle_args

Use syscon_regmap_lookup_by_phandle_args() which is a wrapper over
syscon_regmap_lookup_by_phandle() combined with getting the syscon
argument.  Besides simplifying the code, this annotates within one line
that the given phandle has arguments, so grepping for code is easier.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20250112-syscon-phandle-args-net-v1-2-3423889935f7@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: ti: icssg-prueth: Do not print physical memory addresses
Krzysztof Kozlowski [Sun, 12 Jan 2025 13:32:43 +0000 (14:32 +0100)]
net: ti: icssg-prueth: Do not print physical memory addresses

Debugging messages should not reveal anything about memory addresses.
This also solves arm compile test warnings:

  drivers/net/ethernet/ti/icssg/icssg_prueth_sr1.c:1034:49: error:
    format specifies type 'unsigned long long' but the argument has type 'phys_addr_t' (aka 'unsigned int') [-Werror,-Wformat]

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Link: https://patch.msgid.link/20250112-syscon-phandle-args-net-v1-1-3423889935f7@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  socket: Remove unused kernel_sendmsg_locked
Dr. David Alan Gilbert [Sun, 12 Jan 2025 13:13:18 +0000 (13:13 +0000)]
socket: Remove unused kernel_sendmsg_locked

The last use of kernel_sendmsg_locked() was removed in 2023 by
commit dc97391e6610 ("sock: Remove ->sendpage*() in favour of
sendmsg(MSG_SPLICE_PAGES)").

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250112131318.63753-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: phy: Constify struct mdio_device_id
Christophe JAILLET [Sun, 12 Jan 2025 14:14:50 +0000 (15:14 +0100)]
net: phy: Constify struct mdio_device_id

'struct mdio_device_id' is not modified in these drivers.

Constifying these structures moves some data to a read-only section,
which increases overall security.

On x86_64, with allmodconfig, as an example:
Before:
======
   text    data     bss     dec     hex filename
  27014   12792       0   39806    9b7e drivers/net/phy/broadcom.o

After:
=====
   text    data     bss     dec     hex filename
  27206   12600       0   39806    9b7e drivers/net/phy/broadcom.o
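
In the figures above, text grows by 192 bytes (27206 - 27014) and data
shrinks by the same amount (12792 - 12600), so the total is unchanged.
The sketch below shows the kind of declaration involved. The
mdio_id_model struct, the table contents and match_phy_id() are
hypothetical stand-ins, not code from any PHY driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct mdio_device_id. */
struct mdio_id_model {
	uint32_t phy_id;
	uint32_t phy_id_mask;
};

/*
 * Without const this table is writable data; with const the toolchain
 * can place it in a read-only section, which size(1) counts as text.
 */
static const struct mdio_id_model example_ids[] = {
	{ 0x12345670, 0xfffffff0 }, /* made-up ID and mask */
	{ 0, 0 },                   /* terminator */
};

/* The table is only ever read, so a pointer to const is sufficient. */
static bool match_phy_id(const struct mdio_id_model *ids, uint32_t phy_id)
{
	assert(ids != NULL);

	for (; ids->phy_id_mask != 0; ids++) {
		if ((phy_id & ids->phy_id_mask) == ids->phy_id)
			return true;
	}
	return false;
}
```

Because nothing writes to the table, adding const changes no behaviour
and only affects which section the data lands in.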

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/403c381b7d9156b67ad68ffc44b8eee70c5e86a9.1736691226.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  Merge branch 'net-phy-realtek-add-hwmon-support'
Jakub Kicinski [Tue, 14 Jan 2025 22:51:36 +0000 (14:51 -0800)]
Merge branch 'net-phy-realtek-add-hwmon-support'

Heiner Kallweit says:

====================
net: phy: realtek: add hwmon support

This adds hwmon support for the temperature sensor on RTL822x.
It's available on the standalone versions of the PHYs, and on the
internal PHYs of RTL8125B(P)/RTL8125D/RTL8126.
====================

Link: https://patch.msgid.link/7319d8f9-2d6f-4522-92e8-a8a4990042fb@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: phy: realtek: add hwmon support for temp sensor on RTL822x
Heiner Kallweit [Sat, 11 Jan 2025 20:51:24 +0000 (21:51 +0100)]
net: phy: realtek: add hwmon support for temp sensor on RTL822x

This adds hwmon support for the temperature sensor on RTL822x.
It's available on the standalone versions of the PHYs, and on
the integrated PHYs in RTL8125B/RTL8125D/RTL8126.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/ad6bfe9f-6375-4a00-84b4-bfb38a21bd71@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: phy: move realtek PHY driver to its own subdirectory
Heiner Kallweit [Sat, 11 Jan 2025 20:50:19 +0000 (21:50 +0100)]
net: phy: move realtek PHY driver to its own subdirectory

In preparation for adding a source file with hwmon support, move the
Realtek PHY driver to its own subdirectory and rename realtek.c to
realtek_main.c.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/c566551b-c915-4e34-9b33-129a6ddd6e4c@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: phy: realtek: add support for reading MDIO_MMD_VEND2 regs on RTL8125/RTL8126
Heiner Kallweit [Sat, 11 Jan 2025 20:49:31 +0000 (21:49 +0100)]
net: phy: realtek: add support for reading MDIO_MMD_VEND2 regs on RTL8125/RTL8126

RTL8125/RTL8126 don't support MMD access to the internal PHY, but
provide a mechanism to access at least all MDIO_MMD_VEND2 registers.
By exposing this mechanism, standard MMD access functions can be used
to access the MDIO_MMD_VEND2 registers.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/e821b302-5fe6-49ab-aabd-05da500581c0@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: airoha: Enforce ETS Qdisc priomap
Lorenzo Bianconi [Sun, 12 Jan 2025 18:32:45 +0000 (19:32 +0100)]
net: airoha: Enforce ETS Qdisc priomap

EN7581 SoC supports fixed QoS band priority where WRR queues have lower
priority than SP ones.
E.g.: WRR0, WRR1, .., WRRm, SP0, SP1, .., SPn

Enforce ETS Qdisc priomap according to the hw capabilities.

Suggested-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Link: https://patch.msgid.link/20250112-airoha_ets_priomap-v1-1-fb616de159ba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: ethernet: ti: am65-cpsw: VLAN-aware CPSW only if !DSA
Alexander Sverdlin [Fri, 10 Jan 2025 12:57:35 +0000 (13:57 +0100)]
net: ethernet: ti: am65-cpsw: VLAN-aware CPSW only if !DSA

Only configure VLAN-aware CPSW mode if no port is used as DSA CPU port.
VLAN-aware mode interferes with some DSA tagging schemes and makes stacking
DSA switches downstream of CPSW impossible. Previous attempts to address
the issue are linked below.

Link: https://lore.kernel.org/netdev/20240227082815.2073826-1-s-vadapalli@ti.com/
Link: https://lore.kernel.org/linux-arm-kernel/4699400.vD3TdgH1nR@localhost/
Co-developed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Link: https://patch.msgid.link/20250110125737.546184-1-alexander.sverdlin@siemens.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  tsnep: Link queues to NAPIs
Gerhard Engleder [Fri, 10 Jan 2025 22:39:39 +0000 (23:39 +0100)]
tsnep: Link queues to NAPIs

Use netif_queue_set_napi() to link queues to NAPI instances so that they
can be queried with netlink.

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                         --dump queue-get --json='{"ifindex": 11}'
[{'id': 0, 'ifindex': 11, 'napi-id': 9, 'type': 'rx'},
 {'id': 1, 'ifindex': 11, 'napi-id': 10, 'type': 'rx'},
 {'id': 0, 'ifindex': 11, 'napi-id': 9, 'type': 'tx'},
 {'id': 1, 'ifindex': 11, 'napi-id': 10, 'type': 'tx'}]

Additionally use netif_napi_set_irq() to also provide NAPI interrupt
number to userspace.

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                         --do napi-get --json='{"id": 9}'
{'defer-hard-irqs': 0,
 'gro-flush-timeout': 0,
 'id': 9,
 'ifindex': 11,
 'irq': 42,
 'irq-suspend-timeout': 0}

Providing information about queues to userspace makes sense as APIs like
XSK provide queue specific access. Also XSK busy polling relies on
queues linked to NAPIs.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20250110223939.37490-1-gerhard@engleder-embedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  MAINTAINERS: downgrade Ethernet NIC drivers without CI reporting
Jakub Kicinski [Sat, 11 Jan 2025 02:43:57 +0000 (18:43 -0800)]
MAINTAINERS: downgrade Ethernet NIC drivers without CI reporting

Per the previous change, downgrade all NIC drivers (discrete, embedded,
SoC components, virtual) which don't report test results to CI
from Supported to Maintained.

Also include all components or building blocks of NIC drivers
(separate entries for "shared" code, subsystem support like PTP
or entries for specific offloads etc.)

Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20250111024359.3678956-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  docs: netdev: document requirements for Supported status
Jakub Kicinski [Sat, 11 Jan 2025 02:43:56 +0000 (18:43 -0800)]
docs: netdev: document requirements for Supported status

As announced back in April, require running upstream tests
to maintain Supported status for NIC drivers:

  https://lore.kernel.org/20240425114200.3effe773@kernel.org

Multiple vendors have been "working on it" for months.
Let's make it official.

Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250111024359.3678956-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  Merge branch 'tcp-add-a-new-paws_ack-drop-reason'
Jakub Kicinski [Tue, 14 Jan 2025 21:28:15 +0000 (13:28 -0800)]
Merge branch 'tcp-add-a-new-paws_ack-drop-reason'

Eric Dumazet says:

====================
tcp: add a new PAWS_ACK drop reason

Current TCP_RFC7323_PAWS drop reason is too generic and can
cause confusion.

One common source of these drops is ACK packets coming too late.

A prior packet with payload already changed tp->rcv_nxt.

Add TCP_RFC7323_PAWS_ACK new drop reason, and do not
generate a DUPACK for such old ACK.
====================

Link: https://patch.msgid.link/20250113135558.3180360-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  tcp: add LINUX_MIB_PAWS_OLD_ACK SNMP counter
Eric Dumazet [Mon, 13 Jan 2025 13:55:58 +0000 (13:55 +0000)]
tcp: add LINUX_MIB_PAWS_OLD_ACK SNMP counter

Prior patch in the series added TCP_RFC7323_PAWS_ACK drop reason.

This patch adds the corresponding SNMP counter, for folks
using nstat instead of tracing for TCP diagnostics.

nstat -az | grep PAWSOldAck

Suggested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250113135558.3180360-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  tcp: add TCP_RFC7323_PAWS_ACK drop reason
Eric Dumazet [Mon, 13 Jan 2025 13:55:57 +0000 (13:55 +0000)]
tcp: add TCP_RFC7323_PAWS_ACK drop reason

XPS can cause reorders because of the relaxed OOO
conditions for pure ACK packets.

For hosts not using RFS, what can happen is that ACK
packets are sent on behalf of the cpu processing NIC
interrupts, selecting TX queue A for ACK packet P1.

Then a subsequent sendmsg() can run on another cpu.
TX queue selection uses the socket hash and can choose
another queue B for packets P2 (with payload).

If queue A is more congested than queue B,
the ACK packet P1 could be sent on the wire after
P2.

A Linux receiver, when processing P1 (after P2), currently increments
LINUX_MIB_PAWSESTABREJECTED (TcpExtPAWSEstab)
and uses the TCP_RFC7323_PAWS drop reason.
It might also send a DUPACK if not rate limited.

In order to better understand this pattern, this
patch adds a new drop_reason: TCP_RFC7323_PAWS_ACK.

For old ACKs like these, we no longer increment
LINUX_MIB_PAWSESTABREJECTED and no longer send a DUPACK,
keeping credit for other more interesting DUPACKs.
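
The distinction can be sketched in userspace C. This is a model under
assumptions, not the kernel's logic: it treats a segment as an old pure
ACK when it carries no sequence space and its sequence number is before
rcv_nxt. The enum and function names are hypothetical, and DUPACK rate
limiting is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two drop reasons discussed. */
enum drop_reason_model {
	DROP_PAWS,     /* stands for TCP_RFC7323_PAWS */
	DROP_PAWS_ACK, /* stands for TCP_RFC7323_PAWS_ACK */
};

/* Wrap-safe "a is before b" comparison on 32-bit sequence numbers. */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Classify a segment that has already failed the PAWS timestamp check. */
static enum drop_reason_model classify_paws_drop(uint32_t seq,
						 uint32_t end_seq,
						 uint32_t rcv_nxt)
{
	bool pure_ack = (seq == end_seq); /* occupies no sequence space */

	if (pure_ack && seq_before(seq, rcv_nxt))
		return DROP_PAWS_ACK;
	return DROP_PAWS;
}

/* Only the generic reason may trigger a DUPACK in this model. */
static bool may_send_dupack(enum drop_reason_model reason)
{
	assert(reason == DROP_PAWS || reason == DROP_PAWS_ACK);
	return reason == DROP_PAWS;
}
```

In the P1/P2 scenario above, P1 is a pure ACK whose sequence number
precedes the rcv_nxt already advanced by P2, so it lands in the new
category and produces no DUPACK.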

perf record -e skb:kfree_skb -a
perf script
...
         swapper       0 [148] 27475.438637: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
         swapper       0 [208] 27475.438706: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
         swapper       0 [208] 27475.438908: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
         swapper       0 [148] 27475.439010: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
         swapper       0 [148] 27475.439214: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
         swapper       0 [208] 27475.439286: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
...

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://patch.msgid.link/20250113135558.3180360-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  tcp: add drop_reason support to tcp_disordered_ack()
Eric Dumazet [Mon, 13 Jan 2025 13:55:56 +0000 (13:55 +0000)]
tcp: add drop_reason support to tcp_disordered_ack()

The following patch adds a new drop_reason to tcp_validate_incoming().

Change tcp_disordered_ack() to return a drop reason instead of a
boolean.

Rename it to tcp_disordered_ack_check().

Refactor tcp_validate_incoming() to ease the code
review of the following patch, and reduce indentation
level.

This patch is a refactor, with no functional change.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://patch.msgid.link/20250113135558.3180360-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  net: phy: dp83822: Fix typo "outout" -> "output"
Colin Ian King [Mon, 13 Jan 2025 09:15:55 +0000 (09:15 +0000)]
net: phy: dp83822: Fix typo "outout" -> "output"

There is a typo in a phydev_err message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250113091555.23594-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox...
Jakub Kicinski [Tue, 14 Jan 2025 19:13:34 +0000 (11:13 -0800)]
Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Tariq Toukan says:

====================
mlx5-next updates 2025-01-14

The following pull-request contains mlx5 IFC updates.

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: Add nic_cap_reg and vhca_icm_ctrl registers
  net/mlx5: SHAMPO: Introduce new SHAMPO specific HCA caps
  net/mlx5: Add support for MRTCQ register
  net/mlx5: Update mlx5_ifc to support FEC for 200G per lane link modes
====================

Link: https://patch.msgid.link/20250114055700.1928736-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago  Merge branch 'arrange-pse-core-and-update-tps23881-driver'
Paolo Abeni [Tue, 14 Jan 2025 12:56:36 +0000 (13:56 +0100)]
Merge branch 'arrange-pse-core-and-update-tps23881-driver'

Kory Maincent says:

====================
Arrange PSE core and update TPS23881 driver

This patch includes several improvements to the PSE core for better
implementation and maintainability:

- Move the conversion between current limit and power limit from the driver
  to the PSE core.
- Update power and current limit checks.
- Split the ethtool_get_status callback into multiple callbacks.
- Fix PSE PI of_node detection.
- Clean ethtool header of PSE structures.

Additionally, the TPS23881 driver has been updated to support power
limit and measurement features, aligning with the new PSE core
functionalities.

This patch series is the first part of the budget evaluation strategy
support patch series sent earlier:
https://lore.kernel.org/netdev/20250104161622.7b82dfdf@kmaincent-XPS-13-7390/T/#t

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
====================

Link: https://patch.msgid.link/20250110-b4-feature_poe_arrange-v3-0-142279aedb94@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months ago  net: pse-pd: Clean ethtool header of PSE structures
Kory Maincent [Fri, 10 Jan 2025 09:40:31 +0000 (10:40 +0100)]
net: pse-pd: Clean ethtool header of PSE structures

Remove PSE-specific structures from the ethtool header to improve code
modularity, maintain independent headers, and reduce incremental build
time.

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months ago  net: pse-pd: Fix missing PI of_node description
Kory Maincent [Fri, 10 Jan 2025 09:40:30 +0000 (10:40 +0100)]
net: pse-pd: Fix missing PI of_node description

The PI of_node was not assigned in the regulator_config structure, leading
to failures in resolving the correct supply when different power supplies
are assigned to multiple PIs of a PSE controller. This fix ensures that the
of_node is properly set in the regulator_config, allowing accurate supply
resolution for each PI.

Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months ago  net: pse-pd: tps23881: Add support for power limit and measurement features
Kory Maincent [Fri, 10 Jan 2025 09:40:29 +0000 (10:40 +0100)]
net: pse-pd: tps23881: Add support for power limit and measurement features

Expand PSE callbacks to support the newly introduced
pi_get/set_pw_limit() and pi_get_voltage() functions. These callbacks
allow for power limit configuration in the TPS23881 controller.

Additionally, the patch includes the pi_get_pw_class(),
pi_get_actual_pw(), and pi_get_pw_limit_ranges() callbacks, providing
more comprehensive PoE status reporting.

Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months ago  net: pse-pd: Remove is_enabled callback from drivers
Kory Maincent [Fri, 10 Jan 2025 09:40:28 +0000 (10:40 +0100)]
net: pse-pd: Remove is_enabled callback from drivers

The is_enabled callback is now redundant as the admin_state can be obtained
directly from the driver and provides the same information.

To simplify functionality, the core will handle this internally, making
the is_enabled callback unnecessary at the driver level. Remove the
callback from all drivers.

Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months ago  net: pse-pd: Split ethtool_get_status into multiple callbacks
Kory Maincent [Fri, 10 Jan 2025 09:40:27 +0000 (10:40 +0100)]
net: pse-pd: Split ethtool_get_status into multiple callbacks

The ethtool_get_status callback currently handles all status and PSE
information within a single function. This approach has two key
drawbacks:

1. If the core requires some information for purposes other than
   ethtool_get_status, redundant code will be needed to fetch the same
   data from the driver (like is_enabled).

2. Drivers currently have access to all information passed to ethtool.
   New variables will soon be added to ethtool status, such as PSE ID,
   power domain IDs, and budget evaluation strategies, which are meant
   to be managed solely by the core. Drivers should not have the ability
   to modify these variables.

To resolve these issues, ethtool_get_status has been split into multiple
callbacks, with each handling a specific piece of information required
by ethtool or the core.
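
One way to picture the split is the sketch below. It is a hypothetical
model: the struct layouts, callback names, state values and the demo
driver are all made up for illustration and do not reproduce the PSE
core's API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status record; pse_id stands for a core-managed field. */
struct pse_status_model {
	int admin_state;
	int pw_status;
	unsigned int pse_id;
};

/* Hypothetical per-item driver callbacks replacing one get_status call. */
struct pse_ops_model {
	int (*pi_get_admin_state)(int pi);
	int (*pi_get_pw_status)(int pi);
};

/* Core-side helper: one item can be fetched without a full status query. */
static int core_pi_is_enabled(const struct pse_ops_model *ops, int pi)
{
	assert(ops != NULL && ops->pi_get_admin_state != NULL);
	return ops->pi_get_admin_state(pi) == 1;
}

/* Core assembles the status; drivers never see or write pse_id. */
static void core_get_status(const struct pse_ops_model *ops, int pi,
			    unsigned int pse_id, struct pse_status_model *out)
{
	assert(ops != NULL && out != NULL);
	out->admin_state = ops->pi_get_admin_state(pi);
	out->pw_status = ops->pi_get_pw_status(pi);
	out->pse_id = pse_id;
}

/* Made-up driver: PI 0 is enabled and delivering, others are off. */
static int demo_admin_state(int pi) { return pi == 0 ? 1 : 0; }
static int demo_pw_status(int pi) { return pi == 0 ? 2 : 0; }
```

Each callback returns a single value, so the core can reuse one of them
on its own and keeps sole ownership of the fields it manages.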

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months ago  net: pse-pd: Use power limit at driver side instead of current limit
Kory Maincent [Fri, 10 Jan 2025 09:40:26 +0000 (10:40 +0100)]
net: pse-pd: Use power limit at driver side instead of current limit

The regulator framework uses current limits, but the PSE standard and
known PSE controllers rely on power limits. Instead of converting
current to power within each driver, perform the conversion in the PSE
core. This avoids redundancy in driver implementation and aligns better
with the standard, simplifying driver development.
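
The conversion itself is plain P = V * I arithmetic. The sketch below
assumes milliwatt, microvolt and microamp units and uses hypothetical
helper names; it is not the PSE core's code.

```c
#include <assert.h>
#include <stdint.h>

/* I[uA] = P[mW] * 10^9 / V[uV]; a zero voltage is a caller error. */
static uint64_t pw_to_current_ua(uint32_t pw_mw, uint32_t volt_uv)
{
	assert(volt_uv != 0);
	return (uint64_t)pw_mw * 1000000000ULL / volt_uv;
}

/* P[mW] = V[uV] * I[uA] / 10^9 */
static uint64_t current_to_pw_mw(uint32_t cur_ua, uint32_t volt_uv)
{
	return (uint64_t)cur_ua * volt_uv / 1000000000ULL;
}
```

For example, 15400 mW at 44 V corresponds to 350000 uA, and converting
that current back at the same voltage gives 15400 mW again. Doing the
arithmetic once in the core means drivers only deal in power values.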

At the same time, remove the _pse_ethtool_get_status() function, which is
no longer needed.

Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months ago  net: pse-pd: tps23881: Add missing configuration register after disable
Kory Maincent [Fri, 10 Jan 2025 09:40:25 +0000 (10:40 +0100)]
net: pse-pd: tps23881: Add missing configuration register after disable

When setting the PWOFF register, the controller resets multiple
configuration registers. This patch ensures these registers are
reconfigured as needed following a disable operation.

Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agonet: pse-pd: tps23881: Use helpers to calculate bit offset for a channel
Kory Maincent [Fri, 10 Jan 2025 09:40:24 +0000 (10:40 +0100)]
net: pse-pd: tps23881: Use helpers to calculate bit offset for a channel

This driver frequently follows a pattern where two registers are read or
written in a single operation, followed by calculating the bit offset for
a specific channel.

Introduce helpers to streamline this process and reduce code redundancy,
making the codebase cleaner and more maintainable.
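
A rough sketch of what such helpers could look like is shown below. The
function names and the register layout (a 16-bit value made of two 8-bit
registers, channels 0-3 in the low byte and channels 4-7 in the high byte)
are assumptions for illustration and are not taken from the driver.

```python
# Sketch only: names and register layout are assumptions, not driver code.
# A 16-bit value carries two 8-bit registers: channels 0-3 live in the low
# byte, channels 4-7 in the high byte.


def calc_val(reg_val, chan, field_offset, field_mask):
    """Extract the field for `chan` from a combined 16-bit register value."""
    if chan >= 4:
        reg_val >>= 8  # move the second register into the low byte
    return (reg_val >> field_offset) & field_mask


def set_val(reg_val, chan, field_offset, field_mask, field_val):
    """Return `reg_val` with the field for `chan` replaced by `field_val`."""
    shift = field_offset + (8 if chan >= 4 else 0)
    reg_val &= ~(field_mask << shift) & 0xFFFF  # clear the old field
    return reg_val | ((field_val & field_mask) << shift)
```

With helpers like these, the caller no longer repeats the shift and mask
logic at every register access.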

Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agonet: pse-pd: tps23881: Simplify function returns by removing redundant checks
Kory Maincent [Fri, 10 Jan 2025 09:40:23 +0000 (10:40 +0100)]
net: pse-pd: tps23881: Simplify function returns by removing redundant checks

Cleaned up several functions in tps23881 by removing redundant checks on
return values at the end of functions. These checks have been removed, and
the return statement now directly returns the function result, reducing
the code's complexity and making it more concise.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Kyle Swenson <kyle.swenson@est.tech>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agonet: pse-pd: Add power limit check
Kory Maincent [Fri, 10 Jan 2025 09:40:22 +0000 (10:40 +0100)]
net: pse-pd: Add power limit check

Checking only the current limit is not sufficient. According to the
standard, voltage can reach up to 57V and current up to 1.92A, which
exceeds the power limit described in the standard (99.9W). Add a power
limit check to prevent this.
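
The arithmetic behind this can be checked with a short sketch. The voltage,
current and power figures come from the text above; the constant name
MAX_PI_PW_MW and the function names are hypothetical and only serve the
example.

```python
# Sketch only: figures are from the commit message, names are illustrative.
MAX_PI_VOLTAGE_UV = 57_000_000  # 57 V
MAX_PI_CURRENT_UA = 1_920_000   # 1.92 A
MAX_PI_PW_MW = 99_900           # 99.9 W (hypothetical constant name)


def pi_power_mw(uv, ua):
    """Power in mW for a voltage in uV and a current in uA."""
    return (uv * ua) // 1_000_000_000


def limits_ok(uv, ua):
    """Check the current limit first, then the resulting power."""
    if ua > MAX_PI_CURRENT_UA:
        return False
    return pi_power_mw(uv, ua) <= MAX_PI_PW_MW


# Both values are individually allowed, yet the product is about 109.4 W.
worst_case_mw = pi_power_mw(MAX_PI_VOLTAGE_UV, MAX_PI_CURRENT_UA)
```

A request at 57V and 1.92A passes a current-only check but fails the power
check.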

Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agonet: pse-pd: Avoid setting max_uA in regulator constraints
Kory Maincent [Fri, 10 Jan 2025 09:40:21 +0000 (10:40 +0100)]
net: pse-pd: Avoid setting max_uA in regulator constraints

Setting the max_uA constraint in the regulator API imposes a current
limit during the regulator registration process. This behavior conflicts
with preserving the maximum PI power budget configuration across reboots.

Instead, compare the desired current limit to MAX_PI_CURRENT in the
pse_pi_set_current_limit() function to ensure proper handling of the
power budget.

Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agonet: pse-pd: Remove unused pse_ethtool_get_pw_limit function declaration
Kory Maincent [Fri, 10 Jan 2025 09:40:20 +0000 (10:40 +0100)]
net: pse-pd: Remove unused pse_ethtool_get_pw_limit function declaration

Removed the unused pse_ethtool_get_pw_limit() function declaration from
pse.h. This function was declared but never implemented or used,
making the declaration unnecessary.

Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Kyle Swenson <kyle.swenson@est.tech>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agoMerge branch 'add-multicast-filtering-support-for-vlan-interface'
Paolo Abeni [Tue, 14 Jan 2025 11:17:29 +0000 (12:17 +0100)]
Merge branch 'add-multicast-filtering-support-for-vlan-interface'

MD Danish Anwar says:

====================
Add Multicast Filtering support for VLAN interface

This series adds Multicast filtering support for VLAN interfaces in dual
EMAC and HSR offload mode for ICSSG driver.

Patch 1/4 - Adds support for VLAN in dual EMAC mode
Patch 2/4 - Adds MC filtering support for VLAN in dual EMAC mode
Patch 3/4 - Create and export hsr_get_port_ndev() in hsr_device.c
Patch 4/4 - Adds MC filtering support for VLAN in HSR mode

[1] https://lore.kernel.org/all/20241216100044.577489-2-danishanwar@ti.com/
[2] https://lore.kernel.org/all/202412210336.BmgcX3Td-lkp@intel.com/#t
[3] https://lore.kernel.org/all/31bb8a3e-5a1c-4c94-8c33-c0dfd6d643fb@kernel.org/
v1 https://lore.kernel.org/all/20241216100044.577489-1-danishanwar@ti.com/
v2 https://lore.kernel.org/all/20241223092557.2077526-1-danishanwar@ti.com/
v3 https://lore.kernel.org/all/20250103092033.1533374-1-danishanwar@ti.com/
====================

Link: https://patch.msgid.link/20250110082852.3899027-1-danishanwar@ti.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agonet: ti: icssg-prueth: Add Support for Multicast filtering with VLAN in HSR mode
MD Danish Anwar [Fri, 10 Jan 2025 08:28:52 +0000 (13:58 +0530)]
net: ti: icssg-prueth: Add Support for Multicast filtering with VLAN in HSR mode

Add multicast filtering support for VLAN interfaces in HSR offload mode
for ICSSG driver.

The driver calls vlan_for_each() API on the hsr device's ndev to get the
list of available vlans for the hsr device. The driver then syncs the mc addr of
the vlan interface with a locally maintained list emac->vlan_mcast_list[vid]
using __hw_addr_sync_multiple() API.

The driver then calls the sync / unsync callbacks.

In the sync / unsync call back, driver checks if the vdev's real dev is
hsr device or not. If the real dev is hsr device, driver gets the per
port device using hsr_get_port_ndev() and then driver passes appropriate
vid to FDB helper functions.

Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agonet: hsr: Create and export hsr_get_port_ndev()
MD Danish Anwar [Fri, 10 Jan 2025 08:28:51 +0000 (13:58 +0530)]
net: hsr: Create and export hsr_get_port_ndev()

Create an API to get the net_device to the slave port of HSR device. The
API will take hsr net_device and enum hsr_port_type for which we want the
net_device as arguments.

This API can be used by client drivers who support HSR and want to get
the net_device of slave ports from the hsr device. Export this API for
the same.

This API needs the enum hsr_port_type to be accessible by the drivers using
hsr. Move the enum hsr_port_type from net/hsr/hsr_main.h to
include/linux/if_hsr.h for the same.

Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agonet: ti: icssg-prueth: Add Multicast Filtering support for VLAN in MAC mode
MD Danish Anwar [Fri, 10 Jan 2025 08:28:50 +0000 (13:58 +0530)]
net: ti: icssg-prueth: Add Multicast Filtering support for VLAN in MAC mode

Add multicast filtering support for VLAN interfaces in dual EMAC mode
for ICSSG driver.

The driver uses vlan_for_each() API to get the list of available
vlans. The driver then syncs the mc addr of the vlan interface with a locally
maintained list emac->vlan_mcast_list[vid] using __hw_addr_sync_multiple()
API.

__hw_addr_sync_multiple() is used instead of __hw_addr_sync() to sync
vdev->mc with local list because the sync_cnt for addresses in vdev->mc
will already be set by the vlan_dev_set_rx_mode() [net/8021q/vlan_dev.c]
and __hw_addr_sync() only syncs when the sync_cnt == 0. Whereas
__hw_addr_sync_multiple() can sync addresses even if sync_cnt is not 0.
Export __hw_addr_sync_multiple() so that driver can use it.

Once the local list is synced, driver calls __hw_addr_sync_dev() with
the local list, vdev, sync and unsync callbacks.

__hw_addr_sync_dev() is used with the local maintained list as the list
to synchronize instead of using __dev_mc_sync() on vdev because
__dev_mc_sync() on vdev will call __hw_addr_sync_dev() on vdev->mc and
sync_cnt for addresses in vdev->mc will already be set by the
vlan_dev_set_rx_mode() [net/8021q/vlan_dev.c] and __hw_addr_sync_dev()
only syncs if the sync_cnt of addresses in the list (vdev->mc in this case)
is 0. Whereas __hw_addr_sync_dev() on local list will work fine as the
sync_cnt for addresses in the local list will still be 0.
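
The sync_cnt reasoning above can be illustrated with a toy model. It only
models the gating on sync_cnt as described in this message; the real kernel
helpers also track refcounts and handle unsync, and every name below is made
up for the example.

```python
# Toy model only: it mirrors the sync_cnt gating described above and
# ignores refcounts and unsync. Names are illustrative, not kernel APIs.
from dataclasses import dataclass


@dataclass
class Addr:
    mac: str
    sync_cnt: int = 0


def sync_gated(src, dst):
    """Copy only entries whose sync_cnt is 0 (the __hw_addr_sync() case)."""
    for ha in src:
        if ha.sync_cnt == 0:
            dst[ha.mac] = Addr(ha.mac)
            ha.sync_cnt += 1


def sync_multiple(src, dst):
    """Copy entries even if already synced elsewhere (sync_cnt != 0)."""
    for ha in src:
        if ha.mac not in dst:
            dst[ha.mac] = Addr(ha.mac)
            ha.sync_cnt += 1


# vdev->mc entry already synced once by the VLAN layer.
vdev_mc = [Addr("01:00:5e:00:00:01", sync_cnt=1)]

skipped = {}
sync_gated(vdev_mc, skipped)      # nothing is copied

local = {}
sync_multiple(vdev_mc, local)     # the address reaches the local list

hw = {}
sync_gated(local.values(), hw)    # local entries have sync_cnt == 0
```

Because entries in the local list start with sync_cnt 0, the gated pass over
the local list still reaches the hardware, which is the point made above.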

Based on change in addresses in the local list, sync / unsync callbacks
are invoked. In the sync / unsync API in driver, based on whether the ndev
is vlan or not, driver passes appropriate vid to FDB helper functions.

Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agonet: ti: icssg-prueth: Add VLAN support in EMAC mode
MD Danish Anwar [Fri, 10 Jan 2025 08:28:49 +0000 (13:58 +0530)]
net: ti: icssg-prueth: Add VLAN support in EMAC mode

Add support for VLAN filtering in dual EMAC mode.

Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agoMerge tag 'nf-next-25-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilt...
Paolo Abeni [Tue, 14 Jan 2025 11:08:24 +0000 (12:08 +0100)]
Merge tag 'nf-next-25-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains a small batch of Netfilter/IPVS updates
for net-next:

1) Remove unused genmask parameter in nf_tables_addchain()

2) Speed up reads from /proc/net/ip_vs_conn, from Florian Westphal.

3) Skip empty buckets in hashlimit to avoid atomic operations that result
   in false positive reports by syzbot with lockdep enabled, patch from
   Eric Dumazet.

4) Add conntrack event timestamps available via ctnetlink,
   from Florian Westphal.

netfilter pull request 25-01-11

* tag 'nf-next-25-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: conntrack: add conntrack event timestamp
  netfilter: xt_hashlimit: htable_selective_cleanup() optimization
  ipvs: speed up reads from ip_vs_conn proc file
  netfilter: nf_tables: remove the genmask parameter
====================

Link: https://patch.msgid.link/20250111230800.67349-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agoMerge branch 'introduce-unified-and-structured-phy'
Paolo Abeni [Tue, 14 Jan 2025 10:44:22 +0000 (11:44 +0100)]
Merge branch 'introduce-unified-and-structured-phy'

Oleksij Rempel says:

====================
Introduce unified and structured PHY

This patch set introduces a unified and well-structured interface for
reporting PHY statistics. Instead of relying on arbitrary strings in PHY
drivers, this interface provides a consistent and structured way to
expose PHY statistics to userspace via ethtool.

The initial groundwork for this effort was laid by Jakub Kicinski, who
contributed patches to plumb PHY statistics to drivers and added support
for structured statistics in ethtool. Building on Jakub's work, I tested
the implementation with several PHYs, addressed a few issues, and added
support for statistics in two specific PHY drivers.

Most of the changes are tracked in separate patches.
changes v6:
- drop ethtool_stat_add patch
changes v5:
- rebase against latest net-next
====================

Link: https://patch.msgid.link/20250110060517.711683-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agonet: phy: dp83tg720: add statistics support
Oleksij Rempel [Fri, 10 Jan 2025 06:05:17 +0000 (07:05 +0100)]
net: phy: dp83tg720: add statistics support

Add support for reporting PHY statistics in the DP83TG720 driver. This
includes cumulative tracking of link loss events, transmit/receive
packet counts, and error counts. Implemented functions to update and
provide statistics via ethtool, with optional polling support enabled
through `PHY_POLL_STATS`.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agonet: phy: dp83td510: add statistics support
Oleksij Rempel [Fri, 10 Jan 2025 06:05:16 +0000 (07:05 +0100)]
net: phy: dp83td510: add statistics support

Add support for reporting PHY statistics in the DP83TD510 driver. This
includes cumulative tracking of transmit/receive packet counts, and
error counts. Implemented functions to update and provide statistics via
ethtool, with optional polling support enabled through `PHY_POLL_STATS`.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agonet: phy: introduce optional polling interface for PHY statistics
Oleksij Rempel [Fri, 10 Jan 2025 06:05:15 +0000 (07:05 +0100)]
net: phy: introduce optional polling interface for PHY statistics

Add an optional polling interface for PHY statistics to simplify driver
implementation.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agoDocumentation: networking: update PHY error counter diagnostics in twisted pair guide
Oleksij Rempel [Fri, 10 Jan 2025 06:05:14 +0000 (07:05 +0100)]
Documentation: networking: update PHY error counter diagnostics in twisted pair guide

Replace generic instructions for monitoring error counters with a
procedure using the unified PHY statistics interface (`--all-groups`).

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agonet: ethtool: add support for structured PHY statistics
Jakub Kicinski [Fri, 10 Jan 2025 06:05:13 +0000 (07:05 +0100)]
net: ethtool: add support for structured PHY statistics

Introduce a new way to report PHY statistics in a structured and
standardized format using the netlink API. This new method does not
replace the old driver-specific stats, which can still be accessed with
`ethtool -S <eth name>`. The structured stats are available with
`ethtool -S <eth name> --all-groups`.

This new method makes it easier to diagnose problems by organizing stats
in a consistent and documented way.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agonet: ethtool: plumb PHY stats to PHY drivers
Jakub Kicinski [Fri, 10 Jan 2025 06:05:12 +0000 (07:05 +0100)]
net: ethtool: plumb PHY stats to PHY drivers

Introduce support for standardized PHY statistics reporting in ethtool
by extending the PHYLIB framework. Add the functions
phy_ethtool_get_phy_stats() and phy_ethtool_get_link_ext_stats() to
provide a consistent interface for retrieving PHY-level and
link-specific statistics. These functions are used within the ethtool
implementation to avoid direct access to the phy_device structure
outside of the PHYLIB framework.

A new structure, ethtool_phy_stats, is introduced to standardize PHY
statistics such as packet counts, byte counts, and error counters.
Drivers are updated to include callbacks for retrieving PHY and
link-specific statistics, ensuring values are explicitly set only for
supported fields, initialized with ETHTOOL_STAT_NOT_SET to avoid
ambiguity.
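
The "not set" convention can be sketched as below. This is an illustration
only: the field names, the sentinel being an all-ones 64-bit value, and the
driver callback are assumptions made for the example.

```python
# Sketch only: field names, sentinel value and callback are assumptions.
ETHTOOL_STAT_NOT_SET = 0xFFFF_FFFF_FFFF_FFFF  # assumed all-ones u64 sentinel

FIELDS = ("rx_packets", "rx_bytes", "rx_errors",
          "tx_packets", "tx_bytes", "tx_errors")


def new_phy_stats():
    """Start with every field marked as not set."""
    return {name: ETHTOOL_STAT_NOT_SET for name in FIELDS}


def example_driver_fill(stats):
    """Hypothetical driver that has no byte counters."""
    stats["rx_packets"] = 120
    stats["rx_errors"] = 0
    stats["tx_packets"] = 98
    stats["tx_errors"] = 3


def reported(stats):
    """Only fields the driver explicitly set are reported."""
    return {k: v for k, v in stats.items() if v != ETHTOOL_STAT_NOT_SET}


stats = new_phy_stats()
example_driver_fill(stats)
visible = reported(stats)
```

A counter that is genuinely zero (rx_errors here) is still reported, while
an unsupported counter is left out instead of showing a misleading 0.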

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agoethtool: linkstate: migrate linkstate functions to support multi-PHY setups
Oleksij Rempel [Fri, 10 Jan 2025 06:05:11 +0000 (07:05 +0100)]
ethtool: linkstate: migrate linkstate functions to support multi-PHY setups

Adapt linkstate_get_sqi() and linkstate_get_sqi_max() to take a
phy_device argument directly, enabling support for setups with
multiple PHYs. The previous assumption of a single PHY attached to
a net_device no longer holds.

Use ethnl_req_get_phydev() to identify the appropriate PHY device
for the operation. Update linkstate_prepare_data() and related
logic to accommodate this change, ensuring compatibility with
multi-PHY configurations.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agonet: phy: microchip_t1: depend on PTP_1588_CLOCK_OPTIONAL
Divya Koppera [Fri, 10 Jan 2025 05:44:24 +0000 (11:14 +0530)]
net: phy: microchip_t1: depend on PTP_1588_CLOCK_OPTIONAL

When microchip_t1_phy is built in and phyptp is built as a module, the
build fails with an undefined reference. Making microchip_t1_phy depend
on PTP_1588_CLOCK_OPTIONAL fixes this.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202501090604.YEoJXCXi-lkp@intel.com
Fixes: fa51199c5f34 ("net: phy: microchip_rds_ptp : Add rds ptp library for Microchip phys")
Signed-off-by: Divya Koppera <divya.koppera@microchip.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Link: https://patch.msgid.link/20250110054424.16807-1-divya.koppera@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agonet: sched: calls synchronize_net() only when needed
Eric Dumazet [Thu, 9 Jan 2025 17:18:50 +0000 (17:18 +0000)]
net: sched: calls synchronize_net() only when needed

dev_deactivate_many()'s role is to remove the qdiscs
of a network device.

When/if a qdisc is dismantled, an rcu grace period
is needed to make sure all outstanding qdisc enqueues
are done before we proceed with a qdisc reset.

Most virtual devices do not have a qdisc.

We can call the expensive synchronize_net() only
if needed.

Note that dev_deactivate_many() does not have to deal
with qdisc-less dev_queue_xmit, as an old comment
was claiming.
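
The idea can be sketched as follows. This is a simplified model, not the
kernel code: devices are plain dictionaries and the synchronize_net argument
stands in for the expensive grace-period wait.

```python
# Simplified model only: devices are dicts, and `synchronize_net` is a
# caller-supplied stand-in for the expensive grace-period wait.


def deactivate_many(devices, synchronize_net):
    """Detach qdiscs and wait for a grace period only if one was detached."""
    sync_needed = False
    for dev in devices:
        if dev.get("qdisc") is not None:
            dev["qdisc"] = None   # detach the qdisc from the device
            sync_needed = True
    if sync_needed:
        synchronize_net()         # skipped for qdisc-less devices
    return sync_needed


calls = []
virtual_only = deactivate_many([{"name": "lo", "qdisc": None}],
                               lambda: calls.append("sync"))
mixed = deactivate_many([{"name": "eth0", "qdisc": "fq_codel"},
                         {"name": "veth0", "qdisc": None}],
                        lambda: calls.append("sync"))
```

A batch made only of qdisc-less virtual devices never pays for the wait.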

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250109171850.2871194-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
9 months agoMerge branch 'mlx5-hw-managed-flow-steering-in-fs-core-level'
Jakub Kicinski [Tue, 14 Jan 2025 03:24:32 +0000 (19:24 -0800)]
Merge branch 'mlx5-hw-managed-flow-steering-in-fs-core-level'

Tariq Toukan says:

====================
mlx5 HW-Managed Flow Steering in FS core level

This patchset by Moshe follows Yevgeny's patchsets [1][2] on subject
"HW-Managed Flow Steering in mlx5 driver". As introduced there, in HW
Managed Flow Steering mode (HWS), the driver configures steering
rules directly to the HW using WQs with a special new type of WQE (Work
Queue Element). This way we can reach higher rule insertion/deletion
rate with much lower CPU utilization compared to SW Managed Flow
Steering (SWS).

This patchset adds API to manage namespace, flow tables, flow groups and
prepare FTE (Flow Table Entry) rules. It also adds caching and pool
mechanisms for HWS actions to allow sharing of steering actions among
different rules. The implementation of this API in FS layer, allows FS
core to use HW Managed Flow Steering in addition to the existing FW or
SW Managed Flow Steering.

Patch 13 of this series adds support for configuring HW Managed Flow
Steering mode through devlink param, similar to configuring SW Managed
Flow Steering mode:

 # devlink dev param set pci/0000:08:00.0 name flow_steering_mode \
      cmode runtime value hmfs

In addition, the series contains 2 HWS patches from Yevgeny that
implement flow update support.

[1] https://lore.kernel.org/netdev/20240903031948.78006-1-saeed@kernel.org/
[2] https://lore.kernel.org/all/20250102181415.1477316-1-tariqt@nvidia.com/
====================

Link: https://patch.msgid.link/20250109160546.1733647-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/mlx5: HWS, update flow - support through bigger action RTC
Yevgeny Kliteynik [Thu, 9 Jan 2025 16:05:46 +0000 (18:05 +0200)]
net/mlx5: HWS, update flow - support through bigger action RTC

This patch is the second part of update flow implementation.

Instead of using two action RTCs, we use the same RTC which is twice
the size of what was required before the update flow support.
This way we always allocate STEs from the same RTC (same pool),
which means that update is done similar to how create is done.
The bigger size allows us to allocate and write new STEs, and
later free the old (pre-update) STEs.

Similar to rule creation, STEs are written in reverse order:
 - write action STEs, while match STE is still pointing to
   the old action STEs
 - overwrite the match STE with the new one, which now
   is pointing to the new action STEs

Old action STEs can be freed only once we got completion on the
writing of the new match STE. To implement this we added new rule
states: UPDATING/UPDATED. Rule is moved to UPDATING state in the
beginning of the update flow. Once all completions are received,
rule is moved to UPDATED state. At this point old action STEs are
freed and rule goes back to CREATED state.
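
The state handling described above can be sketched as a small state machine.
This is a model for illustration only: the class, its methods and the way
STEs are represented are assumptions, and only the state names and the write
order come from the text.

```python
# Model only: class and method names are illustrative. The state names and
# the write order (action STEs first, match STE last) follow the text above.
from enum import Enum


class RuleState(Enum):
    CREATED = "created"
    UPDATING = "updating"
    UPDATED = "updated"


class Rule:
    def __init__(self, action_stes):
        self.state = RuleState.CREATED
        self.match_target = action_stes  # what the match STE points to
        self.old_action_stes = None
        self.freed = []

    def start_update(self, new_action_stes):
        if self.state is not RuleState.CREATED:
            raise RuntimeError("update already in flight")
        self.state = RuleState.UPDATING
        # 1) write the new action STEs; match STE still points to the old ones
        written = list(new_action_stes)
        # 2) overwrite the match STE so it points to the new action STEs
        self.old_action_stes = self.match_target
        self.match_target = written

    def on_all_completions(self):
        if self.state is not RuleState.UPDATING:
            raise RuntimeError("no update in flight")
        self.state = RuleState.UPDATED
        # old action STEs may be freed only now
        self.freed.append(self.old_action_stes)
        self.old_action_stes = None
        self.state = RuleState.CREATED


rule = Rule(["a0", "a1"])
rule.start_update(["b0", "b1"])
state_during_update = rule.state
freed_during_update = list(rule.freed)
rule.on_all_completions()
```

Nothing is freed while the rule is in UPDATING, so the hardware never sees a
match STE pointing at released action STEs.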

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109160546.1733647-16-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/mlx5: HWS, update flow - remove the use of dual RTCs
Yevgeny Kliteynik [Thu, 9 Jan 2025 16:05:45 +0000 (18:05 +0200)]
net/mlx5: HWS, update flow - remove the use of dual RTCs

This patch is the first part of update flow implementation.

Update flow should support rules with single STE (match STE only),
as well as rules with multiple STEs (match STE plus action STEs).

Supporting the rules with single STE is straightforward: we just
overwrite the STE, which is an atomic operation.
Supporting the rules with action STEs is a more complicated case.
The existing implementation uses two action RTCs per matcher and
alternates between the two for each update request.
This implementation was unnecessarily complex and led to some
unhandled edge cases, so the support for rule update with multiple
STEs wasn't really functional.

This patch removes this code, and the next patch adds implementation
of a different approach.

Note that after applying this patch and before applying the next
patch we still have support for update rule with single STE (only
match STE w/o action STEs), but update will fail for rules with
action STEs.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109160546.1733647-15-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/mlx5: fs, add HWS to steering mode options
Moshe Shemesh [Thu, 9 Jan 2025 16:05:44 +0000 (18:05 +0200)]
net/mlx5: fs, add HWS to steering mode options

Add HW Steering mode to mlx5 devlink param of steering mode options.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109160546.1733647-14-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/mlx5: fs, add HWS get capabilities
Moshe Shemesh [Thu, 9 Jan 2025 16:05:43 +0000 (18:05 +0200)]
net/mlx5: fs, add HWS get capabilities

Add the get capabilities API function to the HW Steering flow commands.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109160546.1733647-13-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/mlx5: fs, set create match definer to not supported by HWS
Moshe Shemesh [Thu, 9 Jan 2025 16:05:42 +0000 (18:05 +0200)]
net/mlx5: fs, set create match definer to not supported by HWS

Currently HW Steering does not support the API functions of create and
destroy match definer. Return not supported error in case requested.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109160546.1733647-12-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/mlx5: fs, add support for dest vport HWS action
Moshe Shemesh [Thu, 9 Jan 2025 16:05:41 +0000 (18:05 +0200)]
net/mlx5: fs, add support for dest vport HWS action

Add support for HW Steering action of vport destination. Add dest vport
actions cache. Hold action in cache per vport / vport and vhca_id. Add
action to cache on demand and remove on namespace closure to reduce
action creation and destruction.
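
A cache of this kind can be sketched as below. This is an illustration under
stated assumptions: the class, the key layout (vport, or vport plus vhca_id)
and the create/destroy callbacks are hypothetical and do not reflect the
driver's data structures.

```python
# Sketch only: class, key layout and callbacks are hypothetical.


class DestVportCache:
    def __init__(self, create_action, destroy_action):
        self._create = create_action
        self._destroy = destroy_action
        self._actions = {}

    def get(self, vport, vhca_id=None):
        """Return the cached action, creating it on first use."""
        key = (vport, vhca_id)
        if key not in self._actions:
            self._actions[key] = self._create(vport, vhca_id)
        return self._actions[key]

    def close(self):
        """Destroy all cached actions, as on namespace closure."""
        for action in self._actions.values():
            self._destroy(action)
        self._actions.clear()


created, destroyed = [], []


def make_action(vport, vhca_id):
    created.append((vport, vhca_id))
    return "action-%s-%s" % (vport, vhca_id)


cache = DestVportCache(make_action, destroyed.append)
first = cache.get(1)
again = cache.get(1)        # served from the cache
other = cache.get(1, 7)     # same vport, different vhca_id
cache.close()
```

Repeated rules towards the same vport reuse one action, and nothing is
destroyed until the namespace goes away.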

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109160546.1733647-11-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/mlx5: fs, add HWS fte API functions
Moshe Shemesh [Thu, 9 Jan 2025 16:05:40 +0000 (18:05 +0200)]
net/mlx5: fs, add HWS fte API functions

Add create, destroy and update fte API functions for adding, removing
and updating flow steering rules in HW Steering mode. Get HWS actions
according to required rule, use actions from pool whenever possible.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109160546.1733647-10-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/mlx5: fs, add dest table cache
Moshe Shemesh [Thu, 9 Jan 2025 16:05:39 +0000 (18:05 +0200)]
net/mlx5: fs, add dest table cache

Add cache of destination flow table HWS action per HWS table. For each
flow table created cache a destination action towards this table. The
cached action will be used on the downstream patch whenever a rule
requires such action.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109160546.1733647-9-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/mlx5: fs, manage flow counters HWS action sharing by refcount
Moshe Shemesh [Thu, 9 Jan 2025 16:05:38 +0000 (18:05 +0200)]
net/mlx5: fs, manage flow counters HWS action sharing by refcount

Multiple flow counters can utilize a single Hardware Steering (HWS)
action for Hardware Steering rules. Given that these counter bulks are
not exclusively created for Hardware Steering, but also serve purposes
such as statistics gathering and other steering modes, it's more
efficient to create the HWS action only when it's first needed by a
Hardware Steering rule. This approach allows for better resource
management through the use of a reference count, rather than
automatically creating an HWS action for every bulk of flow counters.
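
The reference counting described above can be sketched as follows. The
class and callback names are made up for the example; only the behaviour
(create on first use, destroy when the last user goes away) is taken from
the text.

```python
# Sketch only: names are illustrative. The action is created on first use
# and destroyed when the last user drops its reference.


class CounterBulk:
    def __init__(self, create_action, destroy_action):
        self._create = create_action
        self._destroy = destroy_action
        self.refcount = 0
        self.action = None

    def get_action(self):
        """Take a reference, creating the shared action if needed."""
        if self.refcount == 0:
            self.action = self._create()
        self.refcount += 1
        return self.action

    def put_action(self):
        """Drop a reference, destroying the action with the last one."""
        if self.refcount == 0:
            raise RuntimeError("put without matching get")
        self.refcount -= 1
        if self.refcount == 0:
            self._destroy(self.action)
            self.action = None


events = []
bulk = CounterBulk(lambda: events.append("create") or "hws-action",
                   lambda action: events.append("destroy"))
a1 = bulk.get_action()
a2 = bulk.get_action()
bulk.put_action()
events_after_first_put = list(events)
bulk.put_action()
```

A bulk that is only used for statistics never calls get_action(), so no HWS
action is created for it.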

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109160546.1733647-8-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/mlx5: fs, add HWS modify header API function
Moshe Shemesh [Thu, 9 Jan 2025 16:05:37 +0000 (18:05 +0200)]
net/mlx5: fs, add HWS modify header API function

Add modify header alloc and dealloc API functions to provide modify
header actions for steering rules. Use fs hws pools to get actions from
shared bulks of modify header actions.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109160546.1733647-7-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/mlx5: fs, add HWS packet reformat API function
Moshe Shemesh [Thu, 9 Jan 2025 16:05:36 +0000 (18:05 +0200)]
net/mlx5: fs, add HWS packet reformat API function

Add packet reformat alloc and dealloc API functions to provide packet
reformat actions for steering rules.

Add HWS action pools for each of the following packet reformat types:
- decapl3: decapsulate l3 tunnel to l2
- encapl2: encapsulate l2 to tunnel l2
- encapl3: encapsulate l2 to tunnel l3
- insert_hdr: insert header

In addition cache remove header action for remove vlan header as this is
currently the only use case of remove header action in the driver.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109160546.1733647-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/mlx5: fs, add HWS actions pool
Moshe Shemesh [Thu, 9 Jan 2025 16:05:35 +0000 (18:05 +0200)]
net/mlx5: fs, add HWS actions pool

The HW Steering actions pool will help utilize the option in HW Steering
to share steering actions among different rules.

Create pool on root namespace creation and add few HW Steering actions
that don't depend on the steering rule itself and thus can be shared
between rules, created on same namespace: tag, pop_vlan, push_vlan,
drop, decap l2.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109160546.1733647-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/mlx5: fs, add HWS flow group API functions
Moshe Shemesh [Thu, 9 Jan 2025 16:05:34 +0000 (18:05 +0200)]
net/mlx5: fs, add HWS flow group API functions

Add API functions to create and destroy HW Steering flow groups. Each
flow group consists of a Backward Compatible (BWC) HW Steering matcher
which holds the flow group match criteria.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109160546.1733647-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/mlx5: fs, add HWS flow table API functions
Moshe Shemesh [Thu, 9 Jan 2025 16:05:33 +0000 (18:05 +0200)]
net/mlx5: fs, add HWS flow table API functions

Add API functions to create, modify and destroy HW Steering flow tables.
Modify table enables change, connect or disconnect default miss table.
Add update root flow table API function.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109160546.1733647-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/mlx5: fs, add HWS root namespace functions
Moshe Shemesh [Thu, 9 Jan 2025 16:05:32 +0000 (18:05 +0200)]
net/mlx5: fs, add HWS root namespace functions

Add flow steering commands structure for HW steering. Implement create,
destroy and set peer HW steering root namespace functions.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109160546.1733647-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agoeth: iavf: extend the netdev_lock usage
Jakub Kicinski [Sat, 11 Jan 2025 07:13:39 +0000 (23:13 -0800)]
eth: iavf: extend the netdev_lock usage

iavf uses the netdev->lock already to protect shapers.
In an upcoming series we'll try to protect NAPI instances
with netdev->lock.

We need to modify the protection a bit. All NAPI related
calls in the driver need to be consistently under the lock.
This will allow us to easily switch to a "we already hold
the lock" NAPI API later.

register_netdevice(), OTOH, must not be called under
the netdev_lock() as we do not intend to have an
"already locked" version of this call.

Link: https://patch.msgid.link/20250111071339.3709071-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet: cleanup init_dummy_netdev_core()
Jakub Kicinski [Mon, 13 Jan 2025 00:34:56 +0000 (16:34 -0800)]
net: cleanup init_dummy_netdev_core()

init_dummy_netdev_core() used to cater to net_devices which
did not come from alloc_netdev_mqs(). Since that's no longer
supported remove the init logic which duplicates alloc_netdev_mqs().

While at it rename back to init_dummy_netdev().

Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250113003456.3904110-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet: remove init_dummy_netdev()
Jakub Kicinski [Mon, 13 Jan 2025 00:34:55 +0000 (16:34 -0800)]
net: remove init_dummy_netdev()

init_dummy_netdev() can initialize statically declared or embedded
net_devices. Such netdevs did not come from alloc_netdev_mqs().
After recent work by Breno, there are only two cases where
we have to do that.

Switch those cases to alloc_netdev_mqs() and delete init_dummy_netdev().
Dealing with static netdevs is not worth the maintenance burden.

Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250113003456.3904110-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agotools/net/ynl: ethtool: support spec load from install location
Donald Hunter [Sat, 11 Jan 2025 15:48:03 +0000 (15:48 +0000)]
tools/net/ynl: ethtool: support spec load from install location

Replace hard-coded paths for spec and schema with lookup functions so
that ethtool.py will work in-tree or when installed.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250111154803.7496-2-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agotools/net/ynl: add support for --family and --list-families
Donald Hunter [Sat, 11 Jan 2025 15:48:02 +0000 (15:48 +0000)]
tools/net/ynl: add support for --family and --list-families

Add a --family option to ynl to specify the spec by family name instead
of file path, with support for searching in-tree and system install
location and a --list-families option to show the available families.

./tools/net/ynl/pyynl/cli.py --family rt_addr --dump getaddr
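
A lookup by family name could be sketched roughly as below. This is only
an illustration of searching several candidate directories in order; the
function names, the idea of passing the directories in as a list, and
the "<family>.yaml" file naming are assumptions for the sketch, not the
code added by this commit.

```python
from pathlib import Path


def find_spec(family, search_dirs):
    """Return the first '<family>.yaml' found in search_dirs (hypothetical helper)."""
    for directory in search_dirs:
        candidate = Path(directory) / f"{family}.yaml"
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"no spec found for family '{family}'")


def list_families(search_dirs):
    """Return the sorted family names available in any of search_dirs."""
    names = set()
    for directory in search_dirs:
        path = Path(directory)
        if path.is_dir():
            names.update(spec.stem for spec in path.glob("*.yaml"))
    return sorted(names)
```

Putting the in-tree directory first in search_dirs would make an in-tree
spec win over an installed copy; which order the real tool uses is not
stated here.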

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250111154803.7496-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agocan: grcan: move napi_enable() from under spin lock
Jakub Kicinski [Sat, 11 Jan 2025 02:47:42 +0000 (18:47 -0800)]
can: grcan: move napi_enable() from under spin lock

I don't see any reason why napi_enable() needs to be under the lock.
The only reason I could think of is if the IRQ also took this lock,
but it doesn't. napi_enable() will soon need to sleep.

Acked-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Reviewed-by: Francois Romieu <romieu@fr.zoreil.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://patch.msgid.link/20250111024742.3680902-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet: stmmac: sti: Switch from CONFIG_PM_SLEEP guards to pm_sleep_ptr()
Raphael Gallais-Pou [Thu, 9 Jan 2025 15:58:42 +0000 (16:58 +0100)]
net: stmmac: sti: Switch from CONFIG_PM_SLEEP guards to pm_sleep_ptr()

Letting the compiler remove these functions when the kernel is built
without CONFIG_PM_SLEEP support is simpler and less error prone than the
use of #ifdef based kernel configuration guards.

Signed-off-by: Raphael Gallais-Pou <rgallaispou@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Link: https://patch.msgid.link/20250109155842.60798-1-rgallaispou@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/smc: fix data error when recvmsg with MSG_PEEK flag
Guangguan Wang [Sat, 4 Jan 2025 14:32:01 +0000 (22:32 +0800)]
net/smc: fix data error when recvmsg with MSG_PEEK flag

When recvmsg is called with the MSG_PEEK flag, data is copied to the
user's buffer without advancing the consumer cursor and without
reducing the length of the available rx data. If the requested peek
length is larger than bytes_to_rcv, the do-while loop in
smc_rx_recvmsg misbehaves: the first iteration copies bytes_to_rcv
bytes from the position local_tx_ctrl.cons, and the second iteration
copies min(bytes_to_rcv, read_remaining) bytes from the position
local_tx_ctrl.cons again, because nothing has advanced the consumer
cursor or reduced the length of the available data. Subsequent
iterations do the same. The data copied in the second and later
iterations is wrong: if no more data has arrived it should not be
copied at all, and if more data has arrived it should be copied from
the position bytes_to_rcv bytes past local_tx_ctrl.cons.

This issue can be reproduced with the following Python scripts:
server.py:
import socket
import time
server_ip = '0.0.0.0'
server_port = 12346
server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_socket.bind((server_ip, server_port))
server_socket.listen(1)
print('Server is running and listening for connections...')
conn, addr = server_socket.accept()
print('Connected by', addr)
while True:
    data = conn.recv(1024)
    if not data:
        break
    print('Received request:', data.decode())
    conn.sendall(b'Hello, client!\n')
    time.sleep(5)
    conn.sendall(b'Hello, again!\n')
conn.close()

client.py:
import socket
server_ip = '<server ip>'
server_port = 12346
resp=b'Hello, client!\nHello, again!\n'
client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client_socket.connect((server_ip, server_port))
request = 'Hello, server!'
client_socket.sendall(request.encode())
peek_data = client_socket.recv(len(resp),
    socket.MSG_PEEK | socket.MSG_WAITALL)
print('Peeked data:', peek_data.decode())
client_socket.close()
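
The loop behaviour described above can also be modelled without a
socket. The sketch below is a pure-Python simulation, not the kernel
code: rmb stands for the receive buffer, cons for the consumer cursor,
and the private "peeked" offset in peek_fixed() is an assumed way of
avoiding the re-read, since the message above does not spell out how
the fix is implemented.

```python
def peek_buggy(rmb, cons, want):
    """Model of the faulty loop: every pass re-reads from the same cursor."""
    out = b""
    while len(out) < want:
        avail = len(rmb) - cons          # never shrinks while peeking
        if avail == 0:
            break
        chunk = min(avail, want - len(out))
        out += rmb[cons:cons + chunk]    # copies from cons again
    return out


def peek_fixed(rmb, cons, want):
    """Model that tracks how much was already peeked (assumed approach)."""
    out = b""
    peeked = 0
    while len(out) < want and cons + peeked < len(rmb):
        avail = len(rmb) - cons - peeked
        chunk = min(avail, want - len(out))
        start = cons + peeked
        out += rmb[start:start + chunk]
        peeked += chunk
    return out


first = b'Hello, client!\n'              # only the first reply has arrived
want = len(b'Hello, client!\nHello, again!\n')
print(peek_buggy(first, 0, want))        # first reply followed by a repeat of it
print(peek_fixed(first, 0, want))        # only the data that really arrived
```

In this model peek_fixed() simply stops when the buffer is exhausted; a
real MSG_WAITALL receive would wait for more data at that point.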

Fixes: 952310ccf2d8 ("smc: receive data from RMBE")
Reported-by: D. Wythe <alibuda@linux.alibaba.com>
Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Link: https://patch.msgid.link/20250104143201.35529-1-guangguan.wang@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/mlx5: Add nic_cap_reg and vhca_icm_ctrl registers
Akiva Goldberger [Thu, 9 Jan 2025 20:42:31 +0000 (22:42 +0200)]
net/mlx5: Add nic_cap_reg and vhca_icm_ctrl registers

Add nic_cap_reg and vhca_icm_ctrl registers interfaces for exposing ICM
consumption.

Signed-off-by: Akiva Goldberger <agoldberger@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109204231.1809851-5-tariqt@nvidia.com
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
9 months agonet/mlx5: SHAMPO: Introduce new SHAMPO specific HCA caps
Saeed Mahameed [Thu, 9 Jan 2025 20:42:30 +0000 (22:42 +0200)]
net/mlx5: SHAMPO: Introduce new SHAMPO specific HCA caps

Read and cache SHAMPO specific caps for header data split capabilities.
These will be used in a downstream patch.

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109204231.1809851-4-tariqt@nvidia.com
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
9 months agonet/mlx5: Add support for MRTCQ register
Jianbo Liu [Thu, 9 Jan 2025 20:42:29 +0000 (22:42 +0200)]
net/mlx5: Add support for MRTCQ register

Management Real Time Clock Query (MRTCQ) register is used to query
hardware clock identity.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109204231.1809851-3-tariqt@nvidia.com
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
9 months agonet/mlx5: Update mlx5_ifc to support FEC for 200G per lane link modes
Jianbo Liu [Thu, 9 Jan 2025 20:42:28 +0000 (22:42 +0200)]
net/mlx5: Update mlx5_ifc to support FEC for 200G per lane link modes

Add FEC admin and override related fields in PPLM, and the bit in PCAM
to indicate those fields are supported.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109204231.1809851-2-tariqt@nvidia.com
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
9 months agonet: airoha: Fix channel configuration for ETS Qdisc
Lorenzo Bianconi [Tue, 7 Jan 2025 22:26:28 +0000 (23:26 +0100)]
net: airoha: Fix channel configuration for ETS Qdisc

Limit ETS QoS channel to AIROHA_NUM_QOS_CHANNELS in
airoha_tc_setup_qdisc_ets() in order to align the configured channel to
the value set in airoha_dev_select_queue().

Fixes: 20bf7d07c956 ("net: airoha: Add sched ETS offload support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250107-airoha-ets-fix-chan-v1-1-97f66ed3a068@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet/smc: delete pointless divide by one
Dan Carpenter [Wed, 8 Jan 2025 09:26:06 +0000 (12:26 +0300)]
net/smc: delete pointless divide by one

Here "buf" is a void pointer so sizeof(*buf) is one.  Doing a divide
by one makes the code less readable.  Delete it.

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/ee1a790b-f874-4512-b3ae-9c45f99dc640@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet: phy: dp83822: Add support for PHY LEDs on DP83822
Dimitri Fedrau [Tue, 7 Jan 2025 08:23:04 +0000 (09:23 +0100)]
net: phy: dp83822: Add support for PHY LEDs on DP83822

The DP83822 supports up to three configurable Light Emitting Diode (LED)
pins: LED_0, LED_1 (GPIO1), COL (GPIO2) and RX_D3 (GPIO3). Several
functions can be multiplexed onto the LEDs for different modes of
operation. LED_0 and COL (GPIO2) use the MLED function. MLED can be routed
to only one of these two pins at a time. Add minimal LED controller driver
supporting the most common uses with the 'netdev' trigger.

Signed-off-by: Dimitri Fedrau <dima.fedrau@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250107-dp83822-leds-v2-1-5b260aad874f@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agoMerge tag 'linux-can-next-for-6.14-20250110' of git://git.kernel.org/pub/scm/linux...
Jakub Kicinski [Sat, 11 Jan 2025 06:46:08 +0000 (22:46 -0800)]
Merge tag 'linux-can-next-for-6.14-20250110' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2025-01-10

Pierre-Henry Moussay adds PIC64GX compatibility to the DT bindings for
Microchip's mpfs-can IP core.

The next 3 patches are by Sean Nyekjaer and target the tcan4x5x
driver. First the DT bindings are converted to DT schema, then nWKRQ
voltage selection is added to the driver.

Dario Binacchi's patch for the sun4i_can makes the driver more
consistent by adding a likely() to the driver.

Another patch by Sean Nyekjaer for the tcan4x5x driver gets rid of a
false error message.

Charan Pedumuru converts the atmel-can DT bindings to DT schema.

The next 2 patches are by Oliver Hartkopp. The first one maps Oliver's
former mail addresses to a dedicated CAN mail address. The second one
assigns net/sched/em_canid.c additionally to the CAN maintainers.

Ariel Otilibili's patch removes dead code from the CAN dev helper.

The next 3 patches are by Sean Nyekjaer and add HW standby support to
the tcan4x5x driver.

A patch by Dario Binacchi fixes the DT bindings for the st,stm32-bxcan
driver.

The last 4 patches are by Jimmy Assarsson and target the kvaser_usb
and the kvaser_pciefd driver: error statistics are improved and
CAN_CTRLMODE_BERR_REPORTING is added.

* tag 'linux-can-next-for-6.14-20250110' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
  can: kvaser_pciefd: Add support for CAN_CTRLMODE_BERR_REPORTING
  can: kvaser_pciefd: Update stats and state even if alloc_can_err_skb() fails
  can: kvaser_usb: Add support for CAN_CTRLMODE_BERR_REPORTING
  can: kvaser_usb: Update stats and state even if alloc_can_err_skb() fails
  dt-bindings: can: st,stm32-bxcan: fix st,gcan property type
  can: m_can: call deinit/init callback when going into suspend/resume
  can: tcan4x5x: add deinit callback to set standby mode
  can: m_can: add deinit callback
  can: dev: can_get_state_str(): Remove dead code
  MAINTAINERS: assign em_canid.c additionally to CAN maintainers
  mailmap: add an entry for Oliver Hartkopp
  dt-bindings: net: can: atmel: Convert to json schema
  can: tcan4x5x: get rid of false clock errors
  can: sun4i_can: continue to use likely() to check skb
  can: tcan4x5x: add option for selecting nWKRQ voltage
  dt-bindings: can: tcan4x5x: Document the ti,nwkrq-voltage-vio option
  dt-bindings: can: convert tcan4x5x.txt to DT schema
  dt-bindings: can: mpfs: add PIC64GX CAN compatibility
====================

Link: https://patch.msgid.link/20250110112712.3214173-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet: hide the definition of dev_get_by_napi_id()
Jakub Kicinski [Fri, 10 Jan 2025 00:49:24 +0000 (16:49 -0800)]
net: hide the definition of dev_get_by_napi_id()

There are no module callers of dev_get_by_napi_id(),
and commit d1cacd747768 ("netdev: prevent accessing NAPI instances
from another namespace") proves that getting NAPI by id
needs to be done with care. So hide dev_get_by_napi_id().

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250110004924.3212260-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet: warn during dump if NAPI list is not sorted
Jakub Kicinski [Fri, 10 Jan 2025 00:45:04 +0000 (16:45 -0800)]
net: warn during dump if NAPI list is not sorted

Dump continuation depends on the NAPI list being sorted.
Broken netlink dump continuation may be rare and hard to debug
so add a warning if we notice the potential problem while walking
the list.
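
Why continuation depends on ordering can be shown with a small model.
The sketch below is an illustration in Python, not the kernel dump code:
the ascending sort direction, the budget per page and the function names
are assumptions chosen for the example.

```python
import warnings


def dump_page(ids, resume_after=None, budget=2):
    """Return up to `budget` ids that come after `resume_after`."""
    page = []
    prev = None
    for cur in ids:
        if prev is not None and cur <= prev:
            warnings.warn("list not sorted; continuation may skip or repeat entries")
        prev = cur
        if resume_after is not None and cur <= resume_after:
            continue                     # already reported in an earlier page
        if len(page) == budget:
            break                        # page is full, resume later
        page.append(cur)
    return page


def dump_all(ids, budget=2):
    """Walk the list page by page, resuming after the last id reported."""
    seen, resume = [], None
    while True:
        page = dump_page(ids, resume, budget)
        if not page:
            return seen
        seen.extend(page)
        resume = page[-1]
```

With a sorted list such as [1, 2, 3, 4, 5] every id is reported exactly
once. With [1, 4, 2, 3, 5] the first page ends at 4, so ids 2 and 3 are
treated as already reported and silently dropped, which is the kind of
hard-to-debug problem the warning is meant to surface.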

Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250110004505.3210140-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet: ethernet: ti: cpsw: fix the comment regarding VLAN-aware ALE
Alexander Sverdlin [Thu, 9 Jan 2025 21:42:13 +0000 (22:42 +0100)]
net: ethernet: ti: cpsw: fix the comment regarding VLAN-aware ALE

In all 3 cases (cpsw, cpsw-new, am65-cpsw) ALE is being configured in
VLAN-aware mode, while the comment states the opposite. Seems to be a typo
copy-pasted from one driver to another. Fix the comment which has been
puzzling some people (including me) for at least a decade.

Link: https://lore.kernel.org/linux-arm-kernel/4699400.vD3TdgH1nR@localhost/
Link: https://lore.kernel.org/netdev/0106ce78-c83f-4552-a234-1bf7a33f1ed1@kernel.org/
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250109214219.123767-1-alexander.sverdlin@siemens.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agotls: skip setting sk_write_space on rekey
Sabrina Dubroca [Thu, 9 Jan 2025 22:30:54 +0000 (23:30 +0100)]
tls: skip setting sk_write_space on rekey

syzbot reported a problem when calling setsockopt(SO_SNDBUF) after a
rekey. SO_SNDBUF calls sk_write_space, i.e. tls_write_space, which then
calls the original socket's sk_write_space, saved in
ctx->sk_write_space. Rekeys should skip re-assigning
ctx->sk_write_space, so we don't end up with tls_write_space calling
itself.
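
The self-call can be modelled in a few lines. The sketch below is a toy
Python model of the callback chaining, not the TLS code: the Sock and
TlsCtx classes, the install() method and its skip_on_rekey switch are
hypothetical names used only to show the effect of saving the callback
a second time.

```python
class Sock:
    """Toy socket with a replaceable write_space callback."""

    def __init__(self):
        self.calls = 0
        self.write_space = self.default_write_space

    def default_write_space(self):
        self.calls += 1


class TlsCtx:
    """Toy context that wraps the socket's original callback."""

    def __init__(self, sk):
        self.sk = sk
        self.saved_write_space = None

    def tls_write_space(self):
        self.saved_write_space()         # chain to the saved callback

    def install(self, rekey, skip_on_rekey):
        if rekey and skip_on_rekey:
            return                       # keep the callback saved at first setup
        self.saved_write_space = self.sk.write_space
        self.sk.write_space = self.tls_write_space


def run(skip_on_rekey):
    sk = Sock()
    ctx = TlsCtx(sk)
    ctx.install(rekey=False, skip_on_rekey=skip_on_rekey)
    ctx.install(rekey=True, skip_on_rekey=skip_on_rekey)
    sk.write_space()
    return sk.calls
```

Here run(True) reaches the original callback once, while run(False)
saves tls_write_space as its own "original" on the rekey and recurses
until Python raises RecursionError.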

Fixes: 47069594e67e ("tls: implement rekey for TLS1.3")
Reported-by: syzbot+6ac73b3abf1b598863fa@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/676d231b.050a0220.2f3838.0461.GAE@google.com/
Tested-by: syzbot+6ac73b3abf1b598863fa@syzkaller.appspotmail.com
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/ffdbe4de691d1c1eead556bbf42e33ae215304a7.1736436785.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agonet: ethtool: Use hwprov under rcu_read_lock
Li RongQing [Thu, 9 Jan 2025 11:10:57 +0000 (19:10 +0800)]
net: ethtool: Use hwprov under rcu_read_lock

hwprov should be protected by rcu_read_lock to prevent a possible UAF.

Fixes: 4c61d809cf60 ("net: ethtool: Fix suspicious rcu_dereference usage")
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Acked-by: Kory Maincent <kory.maincent@bootlin.com>
diff with v1: move and use the err variable instead of defining a new variable

 net/ethtool/common.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Link: https://patch.msgid.link/20250109111057.4746-1-lirongqing@baidu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months agoMerge branch 'ipvlan-support-bonding-events'
Jakub Kicinski [Sat, 11 Jan 2025 02:10:29 +0000 (18:10 -0800)]
Merge branch 'ipvlan-support-bonding-events'

Etienne Champetier says:

====================
ipvlan: Support bonding events
====================

Link: https://patch.msgid.link/20250109032819.326528-1-champetier.etienne@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>