]> www.infradead.org Git - users/hch/misc.git/log
users/hch/misc.git
2 months agonet: phy: mediatek: Add token ring set bit operation support
Sky Huang [Thu, 13 Feb 2025 08:05:51 +0000 (16:05 +0800)]
net: phy: mediatek: Add token ring set bit operation support

Previously in mtk-ge-soc.c, we set some register bits via token
ring, which were implemented in three __phy_write().
Now we can do the same thing via __mtk_tr_set_bits() helper.

Signed-off-by: Sky Huang <skylake.huang@mediatek.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250213080553.921434-4-SkyLake.Huang@mediatek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: phy: mediatek: Add token ring access helper functions in mtk-phy-lib
Sky Huang [Thu, 13 Feb 2025 08:05:50 +0000 (16:05 +0800)]
net: phy: mediatek: Add token ring access helper functions in mtk-phy-lib

This patch adds TR(token ring) manipulations and adds correct
macro names for those magic numbers. TR is a way to access
proprietary registers on page 52b5. Use these helper functions
so we can see which fields we're going to modify/set/clear.

TR functions with __* prefix mean that the operations inside
aren't wrapped by page select/restore functions.

This patch doesn't really change registers' settings but just
enhances readability and maintainability.

Signed-off-by: Sky Huang <skylake.huang@mediatek.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250213080553.921434-3-SkyLake.Huang@mediatek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: phy: mediatek: Change to more meaningful macros
Sky Huang [Thu, 13 Feb 2025 08:05:49 +0000 (16:05 +0800)]
net: phy: mediatek: Change to more meaningful macros

Replace magic number with more meaningful macros in mtk-ge.c.
Also, move some common macros into mtk-phy-lib.c.

Signed-off-by: Sky Huang <skylake.huang@mediatek.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250213080553.921434-2-SkyLake.Huang@mediatek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: xpcs: rearrange register definitions
Russell King (Oracle) [Sun, 16 Feb 2025 10:20:42 +0000 (10:20 +0000)]
net: xpcs: rearrange register definitions

Place register number definitions immediately above their field
definitions and order by register number.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1tjblS-00448F-8v@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agondisc: ndisc_send_redirect() cleanup
Eric Dumazet [Fri, 14 Feb 2025 14:07:05 +0000 (14:07 +0000)]
ndisc: ndisc_send_redirect() cleanup

ndisc_send_redirect() is always called under rcu_read_lock().

It can use dev_net_rcu() and avoid one redundant
rcu_read_lock()/rcu_read_unlock() pair.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250214140705.2105890-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge branch 'bnxt_en-add-npar-1-2-and-tph-support'
Jakub Kicinski [Sat, 15 Feb 2025 03:50:25 +0000 (19:50 -0800)]
Merge branch 'bnxt_en-add-npar-1-2-and-tph-support'

Michael Chan says:

====================
bnxt_en: Add NPAR 1.2 and TPH support

The first patch adds NPAR 1.2 support.  Patches 2 to 11 add TPH
(TLP Processing Hints) support.  These TPH driver patches are new
revisions originally posted as part of the TPH PCI patch series.
Additional driver refactoring has been done so that we can free
and allocate RX completion ring and the TX rings if the channel is
a combined channel.  We also add napi_disable() and napi_enable()
during queue_stop() and queue_start() respectively, and reset for
error handling in queue_start().

v4: https://lore.kernel.org/20250208202916.1391614-1-michael.chan@broadcom.com
v3: https://lore.kernel.org/20250204004609.1107078-1-michael.chan@broadcom.com
v2: https://lore.kernel.org/20250116192343.34535-1-michael.chan@broadcom.com
v1: https://lore.kernel.org/20250113063927.4017173-1-michael.chan@broadcom.com

Discussion about adding napi_disable()/napi_enable():

https://lore.kernel.org/netdev/5336d624-8d8b-40a6-b732-b020e4a119a2@davidwei.uk/#t

Previous driver series fixing rtnl_lock and empty release function:

https://lore.kernel.org/netdev/20241115200412.1340286-1-wei.huang2@amd.com/

v5 of the PCI series using netdev_rx_queue_restart():

https://lore.kernel.org/netdev/20240916205103.3882081-5-wei.huang2@amd.com/

v1 of the PCI series using open/close:

https://lore.kernel.org/netdev/20240509162741.1937586-9-wei.huang2@amd.com/
====================

Link: https://patch.msgid.link/20250213011240.1640031-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agobnxt_en: Add TPH support in BNXT driver
Manoj Panicker [Thu, 13 Feb 2025 01:12:39 +0000 (17:12 -0800)]
bnxt_en: Add TPH support in BNXT driver

Add TPH support to the Broadcom BNXT device driver. This allows the
driver to utilize TPH functions for retrieving and configuring Steering
Tags when changing interrupt affinity. With compatible NIC firmware,
network traffic will be tagged correctly with Steering Tags, resulting
in significant memory bandwidth savings and other advantages as
demonstrated by real network benchmarks on TPH-capable platforms.

Co-developed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Co-developed-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Manoj Panicker <manoj.panicker2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250213011240.1640031-12-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agobnxt_en: Extend queue stop/start for TX rings
Somnath Kotur [Thu, 13 Feb 2025 01:12:38 +0000 (17:12 -0800)]
bnxt_en: Extend queue stop/start for TX rings

In order to use queue_stop/queue_start to support the new Steering
Tags, we need to free the TX ring and TX completion ring if it is a
combined channel with TX/RX sharing the same NAPI.  Otherwise
TX completions will not have the updated Steering Tag.  If TPH is
not enabled, we just stop the TX ring without freeing the TX/TX cmpl
rings.  With that we can now add napi_disable() and napi_enable()
during queue_stop()/ queue_start().  This will guarantee that NAPI
will stop processing the completion entries in case there are
additional pending entries in the completion rings after queue_stop().

There could be some NQEs sitting unprocessed while NAPI is disabled
thereby leaving the NQ unarmed.  Explicitly re-arm the NQ after
napi_enable() in queue start so that NAPI will resume properly.

Error handling in bnxt_queue_start() requires a reset.  If a TX
ring cannot be allocated or initialized properly, it will cause
TX timeout.  The reset will also free any partially allocated
rings.  We don't expect to hit this error path because re-allocating
previously reserved and allocated rings with the same parameters
should never fail.

Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250213011240.1640031-11-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agobnxt_en: Refactor TX ring free logic
Michael Chan [Thu, 13 Feb 2025 01:12:37 +0000 (17:12 -0800)]
bnxt_en: Refactor TX ring free logic

Add a new bnxt_hwrm_tx_ring_free() function to handle freeing a HW
transmit ring.  The new function will also be used in the next patch
to free the TX ring in queue_stop.

Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250213011240.1640031-10-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agobnxt_en: Reallocate RX completion ring for TPH support
Somnath Kotur [Thu, 13 Feb 2025 01:12:36 +0000 (17:12 -0800)]
bnxt_en: Reallocate RX completion ring for TPH support

In order to program the correct Steering Tag during an IRQ affinity
change, we need to free/re-allocate the RX completion ring during
queue_restart.  If TPH is enabled, call FW to free the Rx completion
ring and clear the ring entries in queue_stop().  Re-allocate it in
queue_start() if TPH is enabled.  Note that TPH mode is not enabled
in this patch and will be enabled later in the patch series.

While modifying bnxt_queue_start(), remove the unnecessary zeroing of
rxr->rx_next_cons.  It gets overwritten by the clone in
bnxt_queue_start().  Remove the rx_reset counter increment since
restart is not reset.  Add comment to clarify that the ring
allocations in queue_start should never fail.

Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250213011240.1640031-9-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agobnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings
Michael Chan [Thu, 13 Feb 2025 01:12:35 +0000 (17:12 -0800)]
bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings

Newer firmware can use the NQ ring ID associated with each RX/RX AGG
ring to enable PCIe Steering Tags on P5_PLUS chips.  When allocating
RX/RX AGG rings, pass along NQ ring ID for the firmware to use.  This
information helps optimize DMA writes by directing them to the cache
closer to the CPU consuming the data, potentially improving the
processing speed.  This change is backward-compatible with older
firmware, which will simply disregard the information.

Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250213011240.1640031-8-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agobnxt_en: Refactor RX/RX AGG ring parameters setup for P5_PLUS
Michael Chan [Thu, 13 Feb 2025 01:12:34 +0000 (17:12 -0800)]
bnxt_en: Refactor RX/RX AGG ring parameters setup for P5_PLUS

There is some common code for setting up RX and RX AGG ring allocation
parameters for P5_PLUS chips.  Refactor the logic into a new function.

Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250213011240.1640031-7-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agobnxt_en: Refactor bnxt_free_tx_rings() to free per TX ring
Somnath Kotur [Thu, 13 Feb 2025 01:12:33 +0000 (17:12 -0800)]
bnxt_en: Refactor bnxt_free_tx_rings() to free per TX ring

Modify bnxt_free_tx_rings() to free the skbs per TX ring.
This will be useful later in the series.

Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250213011240.1640031-6-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agobnxt_en: Refactor completion ring free routine
Somnath Kotur [Thu, 13 Feb 2025 01:12:32 +0000 (17:12 -0800)]
bnxt_en: Refactor completion ring free routine

Add a wrapper routine to free L2 completion rings.  This will be
useful later in the series.

Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250213011240.1640031-5-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agobnxt_en: Refactor TX ring allocation logic
Michael Chan [Thu, 13 Feb 2025 01:12:31 +0000 (17:12 -0800)]
bnxt_en: Refactor TX ring allocation logic

Add a new bnxt_hwrm_tx_ring_alloc() function to handle allocating
a transmit ring.  This will be useful later in the series.

Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250213011240.1640031-4-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agobnxt_en: Refactor completion ring allocation logic for P5_PLUS chips
Michael Chan [Thu, 13 Feb 2025 01:12:30 +0000 (17:12 -0800)]
bnxt_en: Refactor completion ring allocation logic for P5_PLUS chips

Add a new bnxt_hwrm_cp_ring_alloc_p5() function to handle allocating
one completion ring on P5_PLUS chips.  This simplifies the existing code
and will be useful later in the series.

Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250213011240.1640031-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agobnxt_en: Set NPAR 1.2 support when registering with firmware
Michael Chan [Thu, 13 Feb 2025 01:12:29 +0000 (17:12 -0800)]
bnxt_en: Set NPAR 1.2 support when registering with firmware

NPAR (Network interface card partitioning)[1] 1.2 adds a transparent
VLAN tag for all packets between the NIC and the switch.  Because of
that, RX VLAN acceleration cannot be supported for any additional
host configured VLANs.  The driver has to acknowledge that it can
support no RX VLAN acceleration and set the NPAR 1.2 supported flag
when registering with the FW.  Otherwise, the FW call will fail and
the driver will abort on these NPAR 1.2 NICs with this error:

bnxt_en 0000:26:00.0 (unnamed net_device) (uninitialized): hwrm req_type 0x1d seq id 0xb error 0x2

[1] https://techdocs.broadcom.com/us/en/storage-and-ethernet-connectivity/ethernet-nic-controllers/bcm957xxx/adapters/introduction/features/network-partitioning-npar.html

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250213011240.1640031-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet/mlx4_core: Avoid impossible mlx4_db_alloc() order value
Kees Cook [Mon, 10 Feb 2025 17:45:05 +0000 (09:45 -0800)]
net/mlx4_core: Avoid impossible mlx4_db_alloc() order value

GCC can see that the value range for "order" is capped, but this leads
it to consider that it might be negative, leading to a false positive
warning (with GCC 15 with -Warray-bounds -fdiagnostics-details):

../drivers/net/ethernet/mellanox/mlx4/alloc.c:691:47: error: array subscript -1 is below array bounds of 'long unsigned int *[2]' [-Werror=array-bounds=]
  691 |                 i = find_first_bit(pgdir->bits[o], MLX4_DB_PER_PAGE >> o);
      |                                    ~~~~~~~~~~~^~~
  'mlx4_alloc_db_from_pgdir': events 1-2
  691 |                 i = find_first_bit(pgdir->bits[o], MLX4_DB_PER_PAGE >> o);                        |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                     |                         |                                                   |                     |                         (2) out of array bounds here
      |                     (1) when the condition is evaluated to true                             In file included from ../drivers/net/ethernet/mellanox/mlx4/mlx4.h:53,
                 from ../drivers/net/ethernet/mellanox/mlx4/alloc.c:42:
../include/linux/mlx4/device.h:664:33: note: while referencing 'bits'
  664 |         unsigned long          *bits[2];
      |                                 ^~~~

Switch the argument to unsigned int, which removes the compiler needing
to consider negative values.

Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20250210174504.work.075-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: phy: c45: improve handling of disabled EEE modes in generic ethtool functions
Heiner Kallweit [Wed, 12 Feb 2025 20:01:42 +0000 (21:01 +0100)]
net: phy: c45: improve handling of disabled EEE modes in generic ethtool functions

Currently disabled EEE modes are shown as supported in ethtool.
Change this by filtering them out when populating data->supported
in genphy_c45_ethtool_get_eee.
Disabled EEE modes are silently filtered out by genphy_c45_write_eee_adv.
This is planned to be removed, therefore ensure in
genphy_c45_ethtool_set_eee that disabled EEE modes are removed from the
user space provided EEE advertisement. For now keep the current behavior
to do this silently.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/5187c86d-9a5a-482c-974f-cc103ce9738c@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoice: Fix signedness bug in ice_init_interrupt_scheme()
Dan Carpenter [Thu, 13 Feb 2025 06:31:41 +0000 (09:31 +0300)]
ice: Fix signedness bug in ice_init_interrupt_scheme()

If pci_alloc_irq_vectors() can't allocate the minimum number of vectors
then it returns -ENOSPC so there is no need to check for that in the
caller.  In fact, because pf->msix.min is an unsigned int, it means that
any negative error codes are type promoted to high positive values and
treated as success.  So here, the "return -ENOMEM;" is unreachable code.
Check for negatives instead.

Now that we're only dealing with error codes, it's easier to propagate
the error code from pci_alloc_irq_vectors() instead of hardcoding
-ENOMEM.

Fixes: 79d97b8cf9a8 ("ice: remove splitting MSI-X between features")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/b16e4f01-4c85-46e2-b602-fce529293559@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: remove phylink_pcs .neg_mode boolean
Russell King (Oracle) [Thu, 13 Feb 2025 17:54:19 +0000 (17:54 +0000)]
net: remove phylink_pcs .neg_mode boolean

As all PCS are using the neg_mode parameter rather than the legacy
an_mode, remove the ability to use the legacy an_mode. We remove the
tests in the phylink code, unconditionally passing the PCS neg_mode
parameter to PCS methods, and remove setting the flag from drivers.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tidPn-0040hd-2R@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agodocumentation: networking: Add NAPI config
Joe Damato [Thu, 13 Feb 2025 19:15:34 +0000 (19:15 +0000)]
documentation: networking: Add NAPI config

Document the existence of persistent per-NAPI configuration space and
the API that drivers can opt into.

Update stale documentation which suggested that NAPI IDs cannot be
queried from userspace.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://patch.msgid.link/20250213191535.38792-1-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge branch 'net-phy-clean-up-phy-h'
Jakub Kicinski [Sat, 15 Feb 2025 01:07:48 +0000 (17:07 -0800)]
Merge branch 'net-phy-clean-up-phy-h'

Heiner Kallweit says:

====================
net: phy: clean up phy.h

This series is a starting point to clean up phy.h and remove
definitions which are phylib-internal.
====================

Link: https://patch.msgid.link/d14f8a69-dc21-4ff7-8401-574ffe2f4bc5@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: phy: remove helper phy_is_internal
Heiner Kallweit [Thu, 13 Feb 2025 21:51:53 +0000 (22:51 +0100)]
net: phy: remove helper phy_is_internal

Helper phy_is_internal() is just used in two places phylib-internally.
So let's remove it from the API.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/f3f35265-80a9-4ed7-ad78-ae22c21e288b@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: phy: stop exporting phy_queue_state_machine
Heiner Kallweit [Thu, 13 Feb 2025 21:50:02 +0000 (22:50 +0100)]
net: phy: stop exporting phy_queue_state_machine

phy_queue_state_machine() isn't used outside phy.c,
so stop exporting it.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/16986d3d-7baf-4b02-a641-e2916d491264@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: phy: stop exporting feature arrays which aren't used outside phylib
Heiner Kallweit [Thu, 13 Feb 2025 21:49:19 +0000 (22:49 +0100)]
net: phy: stop exporting feature arrays which aren't used outside phylib

Stop exporting feature arrays which aren't used outside phylib.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/01886672-4880-4ca8-b7b0-94d40f6e0ec5@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: phy: remove fixup-related definitions from phy.h which are not used outside...
Heiner Kallweit [Thu, 13 Feb 2025 21:48:11 +0000 (22:48 +0100)]
net: phy: remove fixup-related definitions from phy.h which are not used outside phylib

Certain fixup-related definitions aren't used outside phy_device.c.
So make them private and remove them from phy.h.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/ea6fde13-9183-4c7c-8434-6c0eb64fc72c@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge branch 'net-phy-realtek-improve-mmd-register-access-for-internal-phy-s'
Jakub Kicinski [Sat, 15 Feb 2025 00:59:31 +0000 (16:59 -0800)]
Merge branch 'net-phy-realtek-improve-mmd-register-access-for-internal-phy-s'

Heiner Kallweit says:

====================
net: phy: realtek: improve MMD register access for internal PHY's

The integrated PHYs on chip versions from RTL8168g allow to address
MDIO_MMD_VEND2 registers. All c22 standard registers are mapped to
MDIO_MMD_VEND2 registers. So far the paging mechanism is used to
address PHY registers. Add support for c45 ops to address MDIO_MMD_VEND2
registers directly, w/o the paging.
====================

Link: https://patch.msgid.link/c6a969ef-fd7f-48d6-8c48-4bc548831a8d@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: phy: realtek: switch from paged to MMD ops in rtl822x functions
Heiner Kallweit [Thu, 13 Feb 2025 19:19:14 +0000 (20:19 +0100)]
net: phy: realtek: switch from paged to MMD ops in rtl822x functions

The MDIO bus provided by r8169 for the internal PHY's now supports
c45 ops for the MDIO_MMD_VEND2 device. So we can switch to standard
MMD ops here.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/81416f95-0fac-4225-87b4-828e3738b8ed@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: phy: realtek: improve mmd register access for internal PHY's
Heiner Kallweit [Thu, 13 Feb 2025 19:18:17 +0000 (20:18 +0100)]
net: phy: realtek: improve mmd register access for internal PHY's

r8169 provides the MDIO bus for the internal PHY's. It has been extended
with c45 access functions for addressing MDIO_MMD_VEND2 registers.
So we can switch from paged access to directly addressing the
MDIO_MMD_VEND2 registers.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/a5f2333c-dda9-48ad-9801-77049766e632@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agor8169: add PHY c45 ops for MDIO_MMD_VENDOR2 registers
Heiner Kallweit [Thu, 13 Feb 2025 19:15:42 +0000 (20:15 +0100)]
r8169: add PHY c45 ops for MDIO_MMD_VENDOR2 registers

The integrated PHYs on chip versions from RTL8168g allow to address
MDIO_MMD_VEND2 registers. All c22 standard registers are mapped to
MDIO_MMD_VEND2 registers. So far the paging mechanism is used to
address PHY registers. Add support for c45 ops to address MDIO_MMD_VEND2
registers directly, w/o the paging.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/d6f97eaa-0f13-468f-89cb-75a41087bc4a@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge branch 'net-phylink-xpcs-stmmac-support-pcs-eee-configuration'
Jakub Kicinski [Fri, 14 Feb 2025 21:42:53 +0000 (13:42 -0800)]
Merge branch 'net-phylink-xpcs-stmmac-support-pcs-eee-configuration'

Russell King says:

====================
net: phylink,xpcs,stmmac: support PCS EEE configuration

This series adds support for phylink managed EEE at the PCS level,
allowing xpcs_config_eee() to be removed. Sadly, we still end up with
a XPCS specific function to configure the clock multiplier.
====================

Link: https://patch.msgid.link/Z6naiPpxfxGr1Ic6@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: xpcs: group EEE code together
Russell King (Oracle) [Mon, 10 Feb 2025 10:54:15 +0000 (10:54 +0000)]
net: xpcs: group EEE code together

Move xpcs_config_eee() with the other EEE-related functions.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thRQd-003w7a-MM@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: xpcs: clean up xpcs_config_eee()
Russell King (Oracle) [Mon, 10 Feb 2025 10:54:10 +0000 (10:54 +0000)]
net: xpcs: clean up xpcs_config_eee()

There is now no need to pass the mult_fact into xpcs_config_eee(), so
let's remove that argument and use xpcs->eee_mult_fact directly. While
changing the function signature, as we pass true/false for enable, use
"bool" instead of "int" for this.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thRQY-003w7U-IG@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: xpcs: remove xpcs_config_eee() from global scope
Russell King (Oracle) [Mon, 10 Feb 2025 10:54:05 +0000 (10:54 +0000)]
net: xpcs: remove xpcs_config_eee() from global scope

Make xpcs_config_eee() private to the XPCS driver, called only from
the phylink pcs_disable_eee() and pcs_enable_eee() methods.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thRQT-003w7O-Ec@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: stmmac: remove calls to xpcs_config_eee()
Russell King (Oracle) [Mon, 10 Feb 2025 10:54:00 +0000 (10:54 +0000)]
net: stmmac: remove calls to xpcs_config_eee()

Remove the explicit calls to xpcs_config_eee() from the stmmac driver,
preferring instead for phylink to manage the EEE configuration at the
PCS.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thRQO-003w7I-Ap@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: xpcs: convert to phylink managed EEE
Russell King (Oracle) [Mon, 10 Feb 2025 10:53:55 +0000 (10:53 +0000)]
net: xpcs: convert to phylink managed EEE

Convert XPCS to use the new pcs_disable_eee() and pcs_enable_eee()
methods. Since stmmac is the only user of xpcs_config_eee(), we can
make this a no-op along with this change.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thRQJ-003w7C-6v@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: stmmac: call xpcs_config_eee_mult_fact()
Russell King (Oracle) [Mon, 10 Feb 2025 10:53:50 +0000 (10:53 +0000)]
net: stmmac: call xpcs_config_eee_mult_fact()

Arrange for stmmac to call the new xpcs_config_eee_mult_fact() function
to configure the EEE clock multiplying factor. This will allow the
removal of the xpcs_config_eee() calls in the next patch.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thRQE-003w76-3C@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: xpcs: add function to configure EEE clock multiplying factor
Russell King (Oracle) [Mon, 10 Feb 2025 10:53:44 +0000 (10:53 +0000)]
net: xpcs: add function to configure EEE clock multiplying factor

Add a function to separate out the EEE clock multiplying factor. This
will be called by the stmmac driver to configure this value.

It would have been better had the driver used the CLK API to retrieve
this clock, get its rate and calculate the appropriate multiplier, but
that door has closed.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thRQ8-003w70-VT@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: phylink: add support for notifying PCS about EEE
Russell King (Oracle) [Mon, 10 Feb 2025 10:53:39 +0000 (10:53 +0000)]
net: phylink: add support for notifying PCS about EEE

There are hooks in the stmmac driver into XPCS to control the EEE
settings when LPI is configured at the MAC. This bypasses the layering.
To allow this to be removed from the stmmac driver, add two new
methods for PCS to inform them when the LPI/EEE enablement state
changes at the MAC.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thRQ3-003w6u-RH@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge branch 'inet-better-inet_sock_set_state-for-passive-flows'
Jakub Kicinski [Fri, 14 Feb 2025 21:40:35 +0000 (13:40 -0800)]
Merge branch 'inet-better-inet_sock_set_state-for-passive-flows'

Eric Dumazet says:

====================
inet: better inet_sock_set_state() for passive flows

Small series to make inet_sock_set_state() more interesting for
LISTEN -> TCP_SYN_RECV changes : The 4-tuple parts are now correct.

First patch is a cleanup.
====================

Link: https://patch.msgid.link/20250212131328.1514243-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoinet: consolidate inet_csk_clone_lock()
Eric Dumazet [Wed, 12 Feb 2025 13:13:28 +0000 (13:13 +0000)]
inet: consolidate inet_csk_clone_lock()

Current inet_sock_set_state trace from inet_csk_clone_lock() is missing
many details :

... sock:inet_sock_set_state: family=AF_INET6 protocol=IPPROTO_TCP \
    sport=4901 dport=0 \
    saddr=127.0.0.6 daddr=0.0.0.0 \
    saddrv6=:: daddrv6=:: \
    oldstate=TCP_LISTEN newstate=TCP_SYN_RECV

Only the sport gives the listener port, no other parts of the n-tuple are correct.

In this patch, I initialize relevant fields of the new socket before
calling inet_sk_set_state(newsk, TCP_SYN_RECV).

We now have a trace including all the source/destination bits.

... sock:inet_sock_set_state: family=AF_INET6 protocol=IPPROTO_TCP \
    sport=4901 dport=47648 \
    saddr=127.0.0.6 daddr=127.0.0.6 \
    saddrv6=2002:a05:6830:1f85:: daddrv6=2001:4860:f803:65::3 \
    oldstate=TCP_LISTEN newstate=TCP_SYN_RECV

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250212131328.1514243-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoinet: reduce inet_csk_clone_lock() indent level
Eric Dumazet [Wed, 12 Feb 2025 13:13:27 +0000 (13:13 +0000)]
inet: reduce inet_csk_clone_lock() indent level

Return early from inet_csk_clone_lock() if the socket
allocation failed, to reduce the indentation level.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250212131328.1514243-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: stmmac: refactor clock management in EQoS driver
Swathi K S [Thu, 13 Feb 2025 04:15:59 +0000 (09:45 +0530)]
net: stmmac: refactor clock management in EQoS driver

Refactor clock management in EQoS driver for code reuse and to avoid
redundancy. This way, only minimal changes are required when a new platform
is added.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Swathi K S <swathi.ks@samsung.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250213041559.106111-1-swathi.ks@samsung.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: airoha: Fix TSO support for header cloned skbs
Lorenzo Bianconi [Thu, 13 Feb 2025 15:34:20 +0000 (16:34 +0100)]
net: airoha: Fix TSO support for header cloned skbs

For GSO packets, skb_cow_head() will reallocate the skb for TSO header
cloned skbs in airoha_dev_xmit(). For this reason, sinfo pointer can be
no more valid. Fix the issue relying on skb_shinfo() macro directly in
airoha_dev_xmit().

The problem exists since
commit 23020f049327 ("net: airoha: Introduce ethernet support for EN7581 SoC")
but it is not a user visible, since we can't currently enable TSO
for DSA user ports since we are missing to initialize net_device
vlan_features field.

Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250213-airoha-en7581-flowtable-offload-v4-1-b69ca16d74db@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge branch 'net-add-export_ipv6_mod'
Jakub Kicinski [Fri, 14 Feb 2025 21:09:42 +0000 (13:09 -0800)]
Merge branch 'net-add-export_ipv6_mod'

Eric Dumazet says:

====================
net: add EXPORT_IPV6_MOD()

In this series I am adding EXPORT_IPV6_MOD and EXPORT_IPV6_MOD_GPL()
so that we can replace some EXPORT_SYMBOL() when IPV6 is
not modular.

This is making all the selected symbols internal to core
linux networking.

v1: https://lore.kernel.org/20250210082805.465241-2-edumazet@google.com
====================

Link: https://patch.msgid.link/20250212132418.1524422-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoudp: use EXPORT_IPV6_MOD[_GPL]()
Eric Dumazet [Wed, 12 Feb 2025 13:24:18 +0000 (13:24 +0000)]
udp: use EXPORT_IPV6_MOD[_GPL]()

Use EXPORT_IPV6_MOD[_GPL]() for symbols that don't need
to be exported unless CONFIG_IPV6=m

udp_table is no longer used from any modules, and does not
need to be exported anyway.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Link: https://patch.msgid.link/20250212132418.1524422-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agotcp: use EXPORT_IPV6_MOD[_GPL]()
Eric Dumazet [Wed, 12 Feb 2025 13:24:17 +0000 (13:24 +0000)]
tcp: use EXPORT_IPV6_MOD[_GPL]()

Use EXPORT_IPV6_MOD[_GPL]() for symbols that don't need
to be exported unless CONFIG_IPV6=m

tcp_hashinfo and tcp_openreq_init_rwin() are no longer
used from any module anyway.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Link: https://patch.msgid.link/20250212132418.1524422-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoinetpeer: use EXPORT_IPV6_MOD[_GPL]()
Eric Dumazet [Wed, 12 Feb 2025 13:24:16 +0000 (13:24 +0000)]
inetpeer: use EXPORT_IPV6_MOD[_GPL]()

Use EXPORT_IPV6_MOD[_GPL]() for symbols that do not need to
to be exported unless CONFIG_IPV6=m

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Link: https://patch.msgid.link/20250212132418.1524422-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: introduce EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL()
Eric Dumazet [Wed, 12 Feb 2025 13:24:15 +0000 (13:24 +0000)]
net: introduce EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL()

We have many EXPORT_SYMBOL(x) in networking tree because IPv6
can be built as a module.

CONFIG_IPV6=y is becoming the norm.

Define a EXPORT_IPV6_MOD(x) which only exports x
for modular IPv6.

Same principle applies to EXPORT_IPV6_MOD_GPL()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Link: https://patch.msgid.link/20250212132418.1524422-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 13 Feb 2025 20:43:01 +0000 (12:43 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR (net-6.14-rc3).

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge tag 'net-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 13 Feb 2025 20:17:04 +0000 (12:17 -0800)]
Merge tag 'net-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, wireless and bluetooth.

  Kalle Valo steps down after serving as the WiFi driver maintainer for
  over a decade.

  Current release - fix to a fix:

   - vsock: orphan socket after transport release, avoid null-deref

   - Bluetooth: L2CAP: fix corrupted list in hci_chan_del

  Current release - regressions:

   - eth:
      - stmmac: correct Rx buffer layout when SPH is enabled
      - iavf: fix a locking bug in an error path

   - rxrpc: fix alteration of headers whilst zerocopy pending

   - s390/qeth: move netif_napi_add_tx() and napi_enable() from under BH

   - Revert "netfilter: flowtable: teardown flow if cached mtu is stale"

  Current release - new code bugs:

   - rxrpc: fix ipv6 path MTU discovery, only ipv4 worked

   - pse-pd: fix deadlock in current limit functions

  Previous releases - regressions:

   - rtnetlink: fix netns refleak with rtnl_setlink()

   - wifi: brcmfmac: use random seed flag for BCM4355 and BCM4364
     firmware

  Previous releases - always broken:

   - add missing RCU protection of struct net throughout the stack

   - can: rockchip: bail out if skb cannot be allocated

   - eth: ti: am65-cpsw: base XDP support fixes

  Misc:

   - ethtool: tsconfig: update the format of hwtstamp flags, changes the
     uAPI but this uAPI was not in any release yet"

* tag 'net-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
  net: pse-pd: Fix deadlock in current limit functions
  rxrpc: Fix ipv6 path MTU discovery
  Reapply "net: skb: introduce and use a single page frag cache"
  s390/qeth: move netif_napi_add_tx() and napi_enable() from under BH
  mlxsw: Add return value check for mlxsw_sp_port_get_stats_raw()
  ipv6: mcast: add RCU protection to mld_newpack()
  team: better TEAM_OPTION_TYPE_STRING validation
  Bluetooth: L2CAP: Fix corrupted list in hci_chan_del
  Bluetooth: btintel_pcie: Fix a potential race condition
  Bluetooth: L2CAP: Fix slab-use-after-free Read in l2cap_send_cmd
  net: ethernet: ti: am65_cpsw: fix tx_cleanup for XDP case
  net: ethernet: ti: am65-cpsw: fix RX & TX statistics for XDP_TX case
  net: ethernet: ti: am65-cpsw: fix memleak in certain XDP cases
  vsock/test: Add test for SO_LINGER null ptr deref
  vsock: Orphan socket after transport release
  MAINTAINERS: Add sctp headers to the general netdev entry
  Revert "netfilter: flowtable: teardown flow if cached mtu is stale"
  iavf: Fix a locking bug in an error path
  rxrpc: Fix alteration of headers whilst zerocopy pending
  net: phylink: make configuring clock-stop dependent on MAC support
  ...

2 months agoMerge tag 'for-6.14-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Thu, 13 Feb 2025 20:06:29 +0000 (12:06 -0800)]
Merge tag 'for-6.14-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - fix stale page cache after race between readahead and direct IO write

 - fix hole expansion when writing at an offset beyond EOF, the range
   will not be zeroed

 - use proper way to calculate offsets in folio ranges

* tag 'for-6.14-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix hole expansion when writing at an offset beyond EOF
  btrfs: fix stale page cache after race between readahead and direct IO write
  btrfs: fix two misuses of folio_shift()

2 months agoMerge tag 'bcachefs-2025-02-12' of git://evilpiepirate.org/bcachefs
Linus Torvalds [Thu, 13 Feb 2025 19:58:11 +0000 (11:58 -0800)]
Merge tag 'bcachefs-2025-02-12' of git://evilpiepirate.org/bcachefs

Pull bcachefs fixes from Kent Overstreet:
 "Just small stuff.

  As a general announcement, on disk format is now frozen in my master
  branch - future on disk format changes will be optional, not required.

   - More fixes for going read-only: the previous fix was insufficient,
     but with more work on ordering journal reclaim flushing (and a
     btree node accounting fix so we don't split until we have to) the
     tiering_replication test now consistently goes read-only in less
     than a second.

   - fix for fsck when we have reflink pointers to missing indirect
     extents

   - some transaction restart handling fixes from Alan; the "Pass
     _orig_restart_count to trans_was_restarted" likely fixes some rare
     undefined behaviour heisenbugs"

* tag 'bcachefs-2025-02-12' of git://evilpiepirate.org/bcachefs:
  bcachefs: Reuse transaction
  bcachefs: Pass _orig_restart_count to trans_was_restarted
  bcachefs: CONFIG_BCACHEFS_INJECT_TRANSACTION_RESTARTS
  bcachefs: Fix want_new_bset() so we write until the end of the btree node
  bcachefs: Split out journal pins by btree level
  bcachefs: Fix use after free
  bcachefs: Fix marking reflink pointers to missing indirect extents

2 months agonet: pse-pd: Fix deadlock in current limit functions
Kory Maincent [Wed, 12 Feb 2025 15:17:51 +0000 (16:17 +0100)]
net: pse-pd: Fix deadlock in current limit functions

Fix a deadlock in pse_pi_get_current_limit and pse_pi_set_current_limit
caused by consecutive mutex_lock calls. One in the function itself and
another in pse_pi_get_voltage.

Resolve the issue by using the unlocked version of pse_pi_get_voltage
instead.

Fixes: e0a5e2bba38a ("net: pse-pd: Use power limit at driver side instead of current limit")
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20250212151751.1515008-1-kory.maincent@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agorxrpc: Fix ipv6 path MTU discovery
David Howells [Wed, 12 Feb 2025 11:21:24 +0000 (11:21 +0000)]
rxrpc: Fix ipv6 path MTU discovery

rxrpc path MTU discovery currently only makes use of ICMPv4, but not
ICMPv6, which means that pmtud for IPv6 doesn't work correctly.  Fix it to
check for ICMPv6 messages also.

Fixes: eeaedc5449d9 ("rxrpc: Implement path-MTU probing using padded PING ACKs (RFC8899)")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/3517283.1739359284@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge tag 'for-net-2025-02-13' of git://git.kernel.org/pub/scm/linux/kernel/git/bluet...
Jakub Kicinski [Thu, 13 Feb 2025 17:41:33 +0000 (09:41 -0800)]
Merge tag 'for-net-2025-02-13' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

 - btintel_pcie: Fix a potential race condition
 - L2CAP: Fix slab-use-after-free Read in l2cap_send_cmd
 - L2CAP: Fix corrupted list in hci_chan_del

* tag 'for-net-2025-02-13' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: L2CAP: Fix corrupted list in hci_chan_del
  Bluetooth: btintel_pcie: Fix a potential race condition
  Bluetooth: L2CAP: Fix slab-use-after-free Read in l2cap_send_cmd
====================

Link: https://patch.msgid.link/20250213162446.617632-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge tag 'nf-25-02-13' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Jakub Kicinski [Thu, 13 Feb 2025 17:38:50 +0000 (09:38 -0800)]
Merge tag 'nf-25-02-13' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following batch contains one revert for:

1) Revert flowtable entry teardown cycle when skbuff exceeds mtu to
   deal with DF flag unset scenarios. This is reverts a patch coming
   in the previous merge window (available in 6.14-rc releases).

* tag 'nf-25-02-13' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  Revert "netfilter: flowtable: teardown flow if cached mtu is stale"
====================

Link: https://patch.msgid.link/20250213100502.3983-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoReapply "net: skb: introduce and use a single page frag cache"
Jakub Kicinski [Thu, 13 Feb 2025 16:49:44 +0000 (08:49 -0800)]
Reapply "net: skb: introduce and use a single page frag cache"

This reverts commit 011b0335903832facca86cd8ed05d7d8d94c9c76.

Sabrina reports that the revert may trigger warnings due to intervening
changes, especially the ability to rise MAX_SKB_FRAGS. Let's drop it
and revisit once that part is also ironed out.

Fixes: 011b03359038 ("Revert "net: skb: introduce and use a single page frag cache"")
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/6bf54579233038bc0e76056c5ea459872ce362ab.1739375933.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge tag 'loongarch-fixes-6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 13 Feb 2025 16:43:46 +0000 (08:43 -0800)]
Merge tag 'loongarch-fixes-6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
 "Fix bugs about idle, kernel_page_present(), IP checksum and KVM, plus
  some trival cleanups"

* tag 'loongarch-fixes-6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: KVM: Set host with kernel mode when switch to VM mode
  LoongArch: KVM: Remove duplicated cache attribute setting
  LoongArch: KVM: Fix typo issue about GCFG feature detection
  LoongArch: csum: Fix OoB access in IP checksum code for negative lengths
  LoongArch: Remove the deprecated notifier hook mechanism
  LoongArch: Use str_yes_no() helper function for /proc/cpuinfo
  LoongArch: Fix kernel_page_present() for KPRANGE/XKPRANGE
  LoongArch: Fix idle VS timer enqueue

2 months agoMerge tag 'platform-drivers-x86-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 13 Feb 2025 16:41:48 +0000 (08:41 -0800)]
Merge tag 'platform-drivers-x86-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Ilpo Järvinen:

 - thinkpad_acpi:
     - Fix registration of tpacpi platform driver
     - Support fan speed in ticks per revolution (Thinkpad X120e)
     - Support V9 DYTC profiles (new Thinkpad AMD platforms)

 - int3472: Handle GPIO "enable" vs "reset" variation (ov7251)

* tag 'platform-drivers-x86-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: thinkpad_acpi: Fix registration of tpacpi platform driver
  platform/x86: int3472: Call "reset" GPIO "enable" for INT347E
  platform/x86: int3472: Use correct type for "polarity", call it gpio_flags
  platform/x86: thinkpad_acpi: Support for V9 DYTC platform profiles
  platform/x86: thinkpad_acpi: Fix invalid fan speed on ThinkPad X120e

2 months agos390/qeth: move netif_napi_add_tx() and napi_enable() from under BH
Alexandra Winter [Wed, 12 Feb 2025 16:36:59 +0000 (17:36 +0100)]
s390/qeth: move netif_napi_add_tx() and napi_enable() from under BH

Like other drivers qeth is calling local_bh_enable() after napi_schedule()
to kick-start softirqs [0].
Since netif_napi_add_tx() and napi_enable() now take the netdev_lock()
mutex [1], move them out from under the BH protection. Same solution as in
commit a60558644e20 ("wifi: mt76: move napi_enable() from under BH")

Fixes: 1b23cdbd2bbc ("net: protect netdev->napi_list with netdev_lock()")
Link: https://lore.kernel.org/netdev/20240612181900.4d9d18d0@kernel.org/
Link: https://lore.kernel.org/netdev/20250115035319.559603-1-kuba@kernel.org/
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Acked-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250212163659.2287292-1-wintera@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: usb: asix_devices: add FiberGecko DeviceID
Max Schulze [Wed, 12 Feb 2025 15:09:51 +0000 (16:09 +0100)]
net: usb: asix_devices: add FiberGecko DeviceID

The FiberGecko is a small USB module that connects a 100 Mbit/s SFP

Signed-off-by: Max Schulze <max.schulze@online.de>
Tested-by: Max Schulze <max.schulze@online.de>
Suggested-by: David Hollis <dhollis@davehollis.com>
Reported-by: Sven Kreiensen <s.kreiensen@lyconsys.com>
Link: https://patch.msgid.link/20250212150957.43900-2-max.schulze@online.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agomlxsw: Add return value check for mlxsw_sp_port_get_stats_raw()
Wentao Liang [Wed, 12 Feb 2025 15:23:11 +0000 (23:23 +0800)]
mlxsw: Add return value check for mlxsw_sp_port_get_stats_raw()

Add a check for the return value of mlxsw_sp_port_get_stats_raw()
in __mlxsw_sp_port_get_stats(). If mlxsw_sp_port_get_stats_raw()
returns an error, exit the function to prevent further processing
with potentially invalid data.

Fixes: 614d509aa1e7 ("mlxsw: Move ethtool_ops to spectrum_ethtool.c")
Cc: stable@vger.kernel.org # 5.9+
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20250212152311.1332-1-vulab@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoipv6: mcast: add RCU protection to mld_newpack()
Eric Dumazet [Wed, 12 Feb 2025 14:10:21 +0000 (14:10 +0000)]
ipv6: mcast: add RCU protection to mld_newpack()

mld_newpack() can be called without RTNL or RCU being held.

Note that we no longer can use sock_alloc_send_skb() because
ipv6.igmp_sk uses GFP_KERNEL allocations which can sleep.

Instead use alloc_skb() and charge the net->ipv6.igmp_sk
socket under RCU protection.

Fixes: b8ad0cbc58f7 ("[NETNS][IPV6] mcast - handle several network namespace")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250212141021.1663666-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoteam: better TEAM_OPTION_TYPE_STRING validation
Eric Dumazet [Wed, 12 Feb 2025 13:49:28 +0000 (13:49 +0000)]
team: better TEAM_OPTION_TYPE_STRING validation

syzbot reported following splat [1]

Make sure user-provided data contains one nul byte.

[1]
 BUG: KMSAN: uninit-value in string_nocheck lib/vsprintf.c:633 [inline]
 BUG: KMSAN: uninit-value in string+0x3ec/0x5f0 lib/vsprintf.c:714
  string_nocheck lib/vsprintf.c:633 [inline]
  string+0x3ec/0x5f0 lib/vsprintf.c:714
  vsnprintf+0xa5d/0x1960 lib/vsprintf.c:2843
  __request_module+0x252/0x9f0 kernel/module/kmod.c:149
  team_mode_get drivers/net/team/team_core.c:480 [inline]
  team_change_mode drivers/net/team/team_core.c:607 [inline]
  team_mode_option_set+0x437/0x970 drivers/net/team/team_core.c:1401
  team_option_set drivers/net/team/team_core.c:375 [inline]
  team_nl_options_set_doit+0x1339/0x1f90 drivers/net/team/team_core.c:2662
  genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
  genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
  genl_rcv_msg+0x1214/0x12c0 net/netlink/genetlink.c:1210
  netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2543
  genl_rcv+0x40/0x60 net/netlink/genetlink.c:1219
  netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline]
  netlink_unicast+0xf52/0x1260 net/netlink/af_netlink.c:1348
  netlink_sendmsg+0x10da/0x11e0 net/netlink/af_netlink.c:1892
  sock_sendmsg_nosec net/socket.c:718 [inline]
  __sock_sendmsg+0x30f/0x380 net/socket.c:733
  ____sys_sendmsg+0x877/0xb60 net/socket.c:2573
  ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2627
  __sys_sendmsg net/socket.c:2659 [inline]
  __do_sys_sendmsg net/socket.c:2664 [inline]
  __se_sys_sendmsg net/socket.c:2662 [inline]
  __x64_sys_sendmsg+0x212/0x3c0 net/socket.c:2662
  x64_sys_call+0x2ed6/0x3c30 arch/x86/include/generated/asm/syscalls_64.h:47
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device")
Reported-by: syzbot+1fcd957a82e3a1baa94d@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1fcd957a82e3a1baa94d
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20250212134928.1541609-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agor8169: add support for Intel Killer E5000
Heiner Kallweit [Wed, 12 Feb 2025 07:03:56 +0000 (08:03 +0100)]
r8169: add support for Intel Killer E5000

This adds support for the Intel Killer E5000 which seems to be a
rebranded RTL8126. Copied from r8126 vendor driver.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/9db73e9b-e2e8-45de-97a5-041c5f71d774@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoixgene-v2: prepare for phylib stop exporting phy_10_100_features_array
Heiner Kallweit [Wed, 12 Feb 2025 06:32:52 +0000 (07:32 +0100)]
ixgene-v2: prepare for phylib stop exporting phy_10_100_features_array

As part of phylib cleanup we plan to stop exporting the feature arrays.
So explicitly remove the modes not supported by the MAC. The media type
bits don't have any impact on kernel behavior, so don't touch them.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Link: https://patch.msgid.link/be356a21-5a1a-45b3-9407-3a97f3af4600@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agosctp: Remove commented out code
Thorsten Blum [Tue, 11 Feb 2025 10:20:56 +0000 (11:20 +0100)]
sctp: Remove commented out code

Remove commented out code.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://patch.msgid.link/20250211102057.587182-1-thorsten.blum@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoBluetooth: L2CAP: Fix corrupted list in hci_chan_del
Luiz Augusto von Dentz [Thu, 6 Feb 2025 20:54:45 +0000 (15:54 -0500)]
Bluetooth: L2CAP: Fix corrupted list in hci_chan_del

This fixes the following trace by reworking the locking of l2cap_conn
so instead of only locking when changing the chan_l list this promotes
chan_lock to a general lock of l2cap_conn so whenever it is being held
it would prevents the likes of l2cap_conn_del to run:

list_del corruption, ffff888021297e00->prev is LIST_POISON2 (dead000000000122)
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:61!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 1 UID: 0 PID: 5896 Comm: syz-executor213 Not tainted 6.14.0-rc1-next-20250204-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
RIP: 0010:__list_del_entry_valid_or_report+0x12c/0x190 lib/list_debug.c:59
Code: 8c 4c 89 fe 48 89 da e8 32 8c 37 fc 90 0f 0b 48 89 df e8 27 9f 14 fd 48 c7 c7 a0 c0 60 8c 4c 89 fe 48 89 da e8 15 8c 37 fc 90 <0f> 0b 4c 89 e7 e8 0a 9f 14 fd 42 80 3c 2b 00 74 08 4c 89 e7 e8 cb
RSP: 0018:ffffc90003f6f998 EFLAGS: 00010246
RAX: 000000000000004e RBX: dead000000000122 RCX: 01454d423f7fbf00
RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
RBP: dffffc0000000000 R08: ffffffff819f077c R09: 1ffff920007eded0
R10: dffffc0000000000 R11: fffff520007eded1 R12: dead000000000122
R13: dffffc0000000000 R14: ffff8880352248d8 R15: ffff888021297e00
FS:  00007f7ace6686c0(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7aceeeb1d0 CR3: 000000003527c000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 __list_del_entry_valid include/linux/list.h:124 [inline]
 __list_del_entry include/linux/list.h:215 [inline]
 list_del_rcu include/linux/rculist.h:168 [inline]
 hci_chan_del+0x70/0x1b0 net/bluetooth/hci_conn.c:2858
 l2cap_conn_free net/bluetooth/l2cap_core.c:1816 [inline]
 kref_put include/linux/kref.h:65 [inline]
 l2cap_conn_put+0x70/0xe0 net/bluetooth/l2cap_core.c:1830
 l2cap_sock_shutdown+0xa8a/0x1020 net/bluetooth/l2cap_sock.c:1377
 l2cap_sock_release+0x79/0x1d0 net/bluetooth/l2cap_sock.c:1416
 __sock_release net/socket.c:642 [inline]
 sock_close+0xbc/0x240 net/socket.c:1393
 __fput+0x3e9/0x9f0 fs/file_table.c:448
 task_work_run+0x24f/0x310 kernel/task_work.c:227
 ptrace_notify+0x2d2/0x380 kernel/signal.c:2522
 ptrace_report_syscall include/linux/ptrace.h:415 [inline]
 ptrace_report_syscall_exit include/linux/ptrace.h:477 [inline]
 syscall_exit_work+0xc7/0x1d0 kernel/entry/common.c:173
 syscall_exit_to_user_mode_prepare kernel/entry/common.c:200 [inline]
 __syscall_exit_to_user_mode_work kernel/entry/common.c:205 [inline]
 syscall_exit_to_user_mode+0x24a/0x340 kernel/entry/common.c:218
 do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f7aceeaf449
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 41 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f7ace668218 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: fffffffffffffffc RBX: 00007f7acef39328 RCX: 00007f7aceeaf449
RDX: 000000000000000e RSI: 0000000020000100 RDI: 0000000000000004
RBP: 00007f7acef39320 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
R13: 0000000000000004 R14: 00007f7ace668670 R15: 000000000000000b
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__list_del_entry_valid_or_report+0x12c/0x190 lib/list_debug.c:59
Code: 8c 4c 89 fe 48 89 da e8 32 8c 37 fc 90 0f 0b 48 89 df e8 27 9f 14 fd 48 c7 c7 a0 c0 60 8c 4c 89 fe 48 89 da e8 15 8c 37 fc 90 <0f> 0b 4c 89 e7 e8 0a 9f 14 fd 42 80 3c 2b 00 74 08 4c 89 e7 e8 cb
RSP: 0018:ffffc90003f6f998 EFLAGS: 00010246
RAX: 000000000000004e RBX: dead000000000122 RCX: 01454d423f7fbf00
RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
RBP: dffffc0000000000 R08: ffffffff819f077c R09: 1ffff920007eded0
R10: dffffc0000000000 R11: fffff520007eded1 R12: dead000000000122
R13: dffffc0000000000 R14: ffff8880352248d8 R15: ffff888021297e00
FS:  00007f7ace6686c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7acef05b08 CR3: 000000003527c000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Reported-by: syzbot+10bd8fe6741eedd2be2e@syzkaller.appspotmail.com
Tested-by: syzbot+10bd8fe6741eedd2be2e@syzkaller.appspotmail.com
Fixes: b4f82f9ed43a ("Bluetooth: L2CAP: Fix slab-use-after-free Read in l2cap_send_cmd")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
2 months agoBluetooth: btintel_pcie: Fix a potential race condition
Kiran K [Fri, 31 Jan 2025 13:00:19 +0000 (18:30 +0530)]
Bluetooth: btintel_pcie: Fix a potential race condition

On HCI_OP_RESET command, firmware raises alive interrupt. Driver needs
to wait for this before sending other command. This patch fixes the potential
miss of alive interrupt due to which HCI_OP_RESET can timeout.

Expected flow:
If tx command is HCI_OP_RESET,
  1. set data->gp0_received = false
  2. send HCI_OP_RESET
  3. wait for alive interrupt

Actual flow having potential race:
If tx command is HCI_OP_RESET,
 1. send HCI_OP_RESET
   1a. Firmware raises alive interrupt here and in ISR
       data->gp0_received  is set to true
 2. set data->gp0_received = false
 3. wait for alive interrupt

Signed-off-by: Kiran K <kiran.k@intel.com>
Fixes: 05c200c8f029 ("Bluetooth: btintel_pcie: Add handshake between driver and firmware")
Reported-by: Bjorn Helgaas <helgaas@kernel.org>
Closes: https://patchwork.kernel.org/project/bluetooth/patch/20241001104451.626964-1-kiran.k@intel.com/
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2 months agoBluetooth: L2CAP: Fix slab-use-after-free Read in l2cap_send_cmd
Luiz Augusto von Dentz [Thu, 16 Jan 2025 15:35:03 +0000 (10:35 -0500)]
Bluetooth: L2CAP: Fix slab-use-after-free Read in l2cap_send_cmd

After the hci sync command releases l2cap_conn, the hci receive data work
queue references the released l2cap_conn when sending to the upper layer.
Add hci dev lock to the hci receive data work queue to synchronize the two.

[1]
BUG: KASAN: slab-use-after-free in l2cap_send_cmd+0x187/0x8d0 net/bluetooth/l2cap_core.c:954
Read of size 8 at addr ffff8880271a4000 by task kworker/u9:2/5837

CPU: 0 UID: 0 PID: 5837 Comm: kworker/u9:2 Not tainted 6.13.0-rc5-syzkaller-00163-gab75170520d4 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Workqueue: hci1 hci_rx_work
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:378 [inline]
 print_report+0x169/0x550 mm/kasan/report.c:489
 kasan_report+0x143/0x180 mm/kasan/report.c:602
 l2cap_build_cmd net/bluetooth/l2cap_core.c:2964 [inline]
 l2cap_send_cmd+0x187/0x8d0 net/bluetooth/l2cap_core.c:954
 l2cap_sig_send_rej net/bluetooth/l2cap_core.c:5502 [inline]
 l2cap_sig_channel net/bluetooth/l2cap_core.c:5538 [inline]
 l2cap_recv_frame+0x221f/0x10db0 net/bluetooth/l2cap_core.c:6817
 hci_acldata_packet net/bluetooth/hci_core.c:3797 [inline]
 hci_rx_work+0x508/0xdb0 net/bluetooth/hci_core.c:4040
 process_one_work kernel/workqueue.c:3229 [inline]
 process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3310
 worker_thread+0x870/0xd30 kernel/workqueue.c:3391
 kthread+0x2f0/0x390 kernel/kthread.c:389
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>

Allocated by task 5837:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
 poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
 __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:394
 kasan_kmalloc include/linux/kasan.h:260 [inline]
 __kmalloc_cache_noprof+0x243/0x390 mm/slub.c:4329
 kmalloc_noprof include/linux/slab.h:901 [inline]
 kzalloc_noprof include/linux/slab.h:1037 [inline]
 l2cap_conn_add+0xa9/0x8e0 net/bluetooth/l2cap_core.c:6860
 l2cap_connect_cfm+0x115/0x1090 net/bluetooth/l2cap_core.c:7239
 hci_connect_cfm include/net/bluetooth/hci_core.h:2057 [inline]
 hci_remote_features_evt+0x68e/0xac0 net/bluetooth/hci_event.c:3726
 hci_event_func net/bluetooth/hci_event.c:7473 [inline]
 hci_event_packet+0xac2/0x1540 net/bluetooth/hci_event.c:7525
 hci_rx_work+0x3f3/0xdb0 net/bluetooth/hci_core.c:4035
 process_one_work kernel/workqueue.c:3229 [inline]
 process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3310
 worker_thread+0x870/0xd30 kernel/workqueue.c:3391
 kthread+0x2f0/0x390 kernel/kthread.c:389
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

Freed by task 54:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:582
 poison_slab_object mm/kasan/common.c:247 [inline]
 __kasan_slab_free+0x59/0x70 mm/kasan/common.c:264
 kasan_slab_free include/linux/kasan.h:233 [inline]
 slab_free_hook mm/slub.c:2353 [inline]
 slab_free mm/slub.c:4613 [inline]
 kfree+0x196/0x430 mm/slub.c:4761
 l2cap_connect_cfm+0xcc/0x1090 net/bluetooth/l2cap_core.c:7235
 hci_connect_cfm include/net/bluetooth/hci_core.h:2057 [inline]
 hci_conn_failed+0x287/0x400 net/bluetooth/hci_conn.c:1266
 hci_abort_conn_sync+0x56c/0x11f0 net/bluetooth/hci_sync.c:5603
 hci_cmd_sync_work+0x22b/0x400 net/bluetooth/hci_sync.c:332
 process_one_work kernel/workqueue.c:3229 [inline]
 process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3310
 worker_thread+0x870/0xd30 kernel/workqueue.c:3391
 kthread+0x2f0/0x390 kernel/kthread.c:389
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

Reported-by: syzbot+31c2f641b850a348a734@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=31c2f641b850a348a734
Tested-by: syzbot+31c2f641b850a348a734@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2 months agoDocumentation: dpaa2 ethernet switch driver: Fix spelling
Ritvik Gupta [Wed, 12 Feb 2025 02:13:11 +0000 (02:13 +0000)]
Documentation: dpaa2 ethernet switch driver: Fix spelling

Corrected spelling mistake

Signed-off-by: Ritvik Gupta <ritvikfoss@gmail.com>
Link: https://patch.msgid.link/20250212021311.13257-1-ritvikfoss@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoarp: Convert SIOCDARP and SIOCSARP to per-netns RTNL.
Kuniyuki Iwashima [Tue, 11 Feb 2025 04:50:57 +0000 (13:50 +0900)]
arp: Convert SIOCDARP and SIOCSARP to per-netns RTNL.

ioctl(SIOCDARP/SIOCSARP) operates on a single netns fetched from
an AF_INET socket in inet_ioctl().

Let's hold rtnl_net_lock() for SIOCDARP and SIOCSARP.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250211045057.10419-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agonet: phy: marvell-88q2xxx: Add support for PHY LEDs on 88q2xxx
Dimitri Fedrau [Mon, 10 Feb 2025 14:53:40 +0000 (15:53 +0100)]
net: phy: marvell-88q2xxx: Add support for PHY LEDs on 88q2xxx

Marvell 88Q2XXX devices support up to two configurable Light Emitting
Diode (LED). Add minimal LED controller driver supporting the most common
uses with the 'netdev' trigger.

Reviewed-by: Stefan Eichenberger <eichest@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Dimitri Fedrau <dima.fedrau@gmail.com>
Link: https://patch.msgid.link/20250210-marvell-88q2xxx-leds-v4-1-3a0900dc121f@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoMerge branch 'net-ethernet-ti-am65-cpsw-xdp-fixes'
Jakub Kicinski [Thu, 13 Feb 2025 04:08:47 +0000 (20:08 -0800)]
Merge branch 'net-ethernet-ti-am65-cpsw-xdp-fixes'

Roger Quadros says:

====================
net: ethernet: ti: am65-cpsw: XDP fixes

This series fixes memleak and statistics for XDP cases.
====================

Link: https://patch.msgid.link/20250210-am65-cpsw-xdp-fixes-v1-0-ec6b1f7f1aca@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: ethernet: ti: am65_cpsw: fix tx_cleanup for XDP case
Roger Quadros [Mon, 10 Feb 2025 14:52:17 +0000 (16:52 +0200)]
net: ethernet: ti: am65_cpsw: fix tx_cleanup for XDP case

For XDP transmit case, swdata doesn't contain SKB but the
XDP Frame. Infer the correct swdata based on buffer type
and return the XDP Frame for XDP transmit case.

Signed-off-by: Roger Quadros <rogerq@kernel.org>
Fixes: 8acacc40f733 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support")
Link: https://patch.msgid.link/20250210-am65-cpsw-xdp-fixes-v1-3-ec6b1f7f1aca@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: ethernet: ti: am65-cpsw: fix RX & TX statistics for XDP_TX case
Roger Quadros [Mon, 10 Feb 2025 14:52:16 +0000 (16:52 +0200)]
net: ethernet: ti: am65-cpsw: fix RX & TX statistics for XDP_TX case

For successful XDP_TX and XDP_REDIRECT cases, the packet was received
successfully so update RX statistics. Use original received
packet length for that.

TX packets statistics are incremented on TX completion so don't
update it while TX queueing.

If xdp_convert_buff_to_frame() fails, increment tx_dropped.

Signed-off-by: Roger Quadros <rogerq@kernel.org>
Fixes: 8acacc40f733 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support")
Link: https://patch.msgid.link/20250210-am65-cpsw-xdp-fixes-v1-2-ec6b1f7f1aca@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: ethernet: ti: am65-cpsw: fix memleak in certain XDP cases
Roger Quadros [Mon, 10 Feb 2025 14:52:15 +0000 (16:52 +0200)]
net: ethernet: ti: am65-cpsw: fix memleak in certain XDP cases

If the XDP program doesn't result in XDP_PASS then we leak the
memory allocated by am65_cpsw_build_skb().

It is pointless to allocate SKB memory before running the XDP
program as we would be wasting CPU cycles for cases other than XDP_PASS.
Move the SKB allocation after evaluating the XDP program result.

This fixes the memleak. A performance boost is seen for XDP_DROP test.

XDP_DROP test:
Before: 460256 rx/s                  0 err/s
After:  784130 rx/s                  0 err/s

Fixes: 8acacc40f733 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Link: https://patch.msgid.link/20250210-am65-cpsw-xdp-fixes-v1-1-ec6b1f7f1aca@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: stmmac: dwmac-loongson: Set correct {tx,rx}_fifo_size
Huacai Chen [Mon, 10 Feb 2025 13:43:28 +0000 (21:43 +0800)]
net: stmmac: dwmac-loongson: Set correct {tx,rx}_fifo_size

Now for dwmac-loongson {tx,rx}_fifo_size are uninitialised, which means
zero. This means dwmac-loongson doesn't support changing MTU because in
stmmac_change_mtu() it requires the fifo size be no less than MTU. Thus,
set the correct tx_fifo_size and rx_fifo_size for it (16KB multiplied by
queue counts).

Here {tx,rx}_fifo_size is initialised with the initial value (also the
maximum value) of {tx,rx}_queues_to_use. So it will keep as 16KB if we
don't change the queue count, and will be larger than 16KB if we change
(decrease) the queue count. However stmmac_change_mtu() still work well
with current logic (MTU cannot be larger than 16KB for stmmac).

Note: the Fixes tag picked here is the oldest commit and key commit of
the dwmac-loongson series "stmmac: Add Loongson platform support".

Acked-by: Yanteng Si <si.yanteng@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Chong Qiao <qiaochong@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Link: https://patch.msgid.link/20250210134328.2755328-1-chenhuacai@loongson.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoLoongArch: KVM: Set host with kernel mode when switch to VM mode
Bibo Mao [Thu, 13 Feb 2025 04:02:56 +0000 (12:02 +0800)]
LoongArch: KVM: Set host with kernel mode when switch to VM mode

PRMD register is only meaningful on the beginning stage of exception
entry, and it is overwritten with nested irq or exception.

When CPU runs in VM mode, interrupt need be enabled on host. And the
mode for host had better be kernel mode rather than random or user mode.

When VM is running, the running mode with top command comes from CRMD
register, and running mode should be kernel mode since kernel function
is executing with perf command. It needs be consistent with both top and
perf command.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 months agoLoongArch: KVM: Remove duplicated cache attribute setting
Bibo Mao [Thu, 13 Feb 2025 04:02:56 +0000 (12:02 +0800)]
LoongArch: KVM: Remove duplicated cache attribute setting

Cache attribute comes from GPA->HPA secondary mmu page table and is
configured when kvm is enabled. It is the same for all VMs, so remove
duplicated cache attribute setting on vCPU context switch.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 months agoLoongArch: KVM: Fix typo issue about GCFG feature detection
Bibo Mao [Thu, 13 Feb 2025 04:02:56 +0000 (12:02 +0800)]
LoongArch: KVM: Fix typo issue about GCFG feature detection

This is typo issue and misusage about GCFG feature macro. The code
is wrong, only that it does not cause obvious problem since GCFG is
set again on vCPU context switch.

Fixes: 0d0df3c99d4f ("LoongArch: KVM: Implement kvm hardware enable, disable interface")
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 months agoLoongArch: csum: Fix OoB access in IP checksum code for negative lengths
Yuli Wang [Thu, 13 Feb 2025 04:02:40 +0000 (12:02 +0800)]
LoongArch: csum: Fix OoB access in IP checksum code for negative lengths

Commit 69e3a6aa6be2 ("LoongArch: Add checksum optimization for 64-bit
system") would cause an undefined shift and an out-of-bounds read.

Commit 8bd795fedb84 ("arm64: csum: Fix OoB access in IP checksum code
for negative lengths") fixes the same issue on ARM64.

Fixes: 69e3a6aa6be2 ("LoongArch: Add checksum optimization for 64-bit system")
Co-developed-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: Yuli Wang <wangyuli@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 months agoLoongArch: Remove the deprecated notifier hook mechanism
Yuli Wang [Thu, 13 Feb 2025 04:02:40 +0000 (12:02 +0800)]
LoongArch: Remove the deprecated notifier hook mechanism

The notifier hook mechanism in proc and cpuinfo is actually unnecessary
for LoongArch because it's not used anywhere.

It was originally added to the MIPS code in commit d6d3c9afaab4 ("MIPS:
MT: proc: Add support for printing VPE and TC ids"), and LoongArch then
inherited it.

But as the kernel code stands now, this notifier hook mechanism doesn't
really make sense for either LoongArch or MIPS.

In addition, the seq_file forward declaration needs to be moved to its
proper place, as only the show_ipi_list() function in smp.c requires it.

Co-developed-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: Yuli Wang <wangyuli@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 months agoLoongArch: Use str_yes_no() helper function for /proc/cpuinfo
Yuli Wang [Thu, 13 Feb 2025 04:02:35 +0000 (12:02 +0800)]
LoongArch: Use str_yes_no() helper function for /proc/cpuinfo

Remove hard-coded strings by using the str_yes_no() helper function.

Similar to commit c4a0a4a45a45 ("MIPS: kernel: proc: Use str_yes_no()
helper function").

Co-developed-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: Yuli Wang <wangyuli@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 months agoLoongArch: Fix kernel_page_present() for KPRANGE/XKPRANGE
Huacai Chen [Thu, 13 Feb 2025 04:02:35 +0000 (12:02 +0800)]
LoongArch: Fix kernel_page_present() for KPRANGE/XKPRANGE

Now kernel_page_present() always return true for KPRANGE/XKPRANGE
addresses, this isn't correct because hibernation (ACPI S4) use it
to distinguish whether a page is saveable. If all KPRANGE/XKPRANGE
addresses are considered as saveable, then reserved memory such as
EFI_RUNTIME_SERVICES_CODE / EFI_RUNTIME_SERVICES_DATA will also be
saved and restored.

Fix this by returning true only if the KPRANGE/XKPRANGE address is in
memblock.memory.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 months agoLoongArch: Fix idle VS timer enqueue
Marco Crivellari [Thu, 13 Feb 2025 04:02:35 +0000 (12:02 +0800)]
LoongArch: Fix idle VS timer enqueue

LoongArch re-enables interrupts on its idle routine and performs a
TIF_NEED_RESCHED check afterwards before putting the CPU to sleep.

The IRQs firing between the check and the idle instruction may set the
TIF_NEED_RESCHED flag. In order to deal with such a race, IRQs
interrupting __arch_cpu_idle() rollback their return address to the
beginning of __arch_cpu_idle() so that TIF_NEED_RESCHED is checked
again before going back to sleep.

However idle IRQs can also queue timers that may require a tick
reprogramming through a new generic idle loop iteration but those timers
would go unnoticed here because __arch_cpu_idle() only checks
TIF_NEED_RESCHED. It doesn't check for pending timers.

Fix this with fast-forwarding idle IRQs return address to the end of the
idle routine instead of the beginning, so that the generic idle loop can
handle both TIF_NEED_RESCHED and pending timers.

Fixes: 0603839b18f4 ("LoongArch: Add exception/interrupt handling")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 months agoMerge branch 'vsock-null-ptr-deref-when-so_linger-enabled'
Jakub Kicinski [Thu, 13 Feb 2025 04:01:30 +0000 (20:01 -0800)]
Merge branch 'vsock-null-ptr-deref-when-so_linger-enabled'

Michal Luczaj says:

====================
vsock: null-ptr-deref when SO_LINGER enabled

syzbot pointed out that a recent patching of a use-after-free introduced a
null-ptr-deref. This series fixes the problem and adds a test.

v2: https://lore.kernel.org/20250206-vsock-linger-nullderef-v2-0-f8a1f19146f8@rbox.co
v1: https://lore.kernel.org/20250204-vsock-linger-nullderef-v1-0-6eb1760fa93e@rbox.co
====================

Link: https://patch.msgid.link/20250210-vsock-linger-nullderef-v3-0-ef6244d02b54@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agovsock/test: Add test for SO_LINGER null ptr deref
Michal Luczaj [Mon, 10 Feb 2025 12:15:01 +0000 (13:15 +0100)]
vsock/test: Add test for SO_LINGER null ptr deref

Explicitly close() a TCP_ESTABLISHED (connectible) socket with SO_LINGER
enabled.

As for now, test does not verify if close() actually lingers.
On an unpatched machine, may trigger a null pointer dereference.

Tested-by: Luigi Leonardi <leonardi@redhat.com>
Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250210-vsock-linger-nullderef-v3-2-ef6244d02b54@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agovsock: Orphan socket after transport release
Michal Luczaj [Mon, 10 Feb 2025 12:15:00 +0000 (13:15 +0100)]
vsock: Orphan socket after transport release

During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereferenced in virtio_transport_wait_close().

Orphan the socket only after transport release.

Partially reverts the 'Fixes:' commit.

KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
 lock_acquire+0x19e/0x500
 _raw_spin_lock_irqsave+0x47/0x70
 add_wait_queue+0x46/0x230
 virtio_transport_release+0x4e7/0x7f0
 __vsock_release+0xfd/0x490
 vsock_release+0x90/0x120
 __sock_release+0xa3/0x250
 sock_close+0x14/0x20
 __fput+0x35e/0xa90
 __x64_sys_close+0x78/0xd0
 do_syscall_64+0x93/0x1b0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Reported-by: syzbot+9d55b199192a4be7d02c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9d55b199192a4be7d02c
Fixes: fcdd2242c023 ("vsock: Keep the binding until socket destruction")
Tested-by: Luigi Leonardi <leonardi@redhat.com>
Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250210-vsock-linger-nullderef-v3-1-ef6244d02b54@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMAINTAINERS: Add sctp headers to the general netdev entry
Marcelo Ricardo Leitner [Mon, 10 Feb 2025 13:24:55 +0000 (10:24 -0300)]
MAINTAINERS: Add sctp headers to the general netdev entry

All SCTP patches are picked up by netdev maintainers. Two headers were
missing to be listed there.

Reported-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/b3c2dc3a102eb89bd155abca2503ebd015f50ee0.1739193671.git.marcelo.leitner@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge branch '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net...
Jakub Kicinski [Thu, 13 Feb 2025 03:53:03 +0000 (19:53 -0800)]
Merge branch '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2025-02-11 (idpf, ixgbe, igc)

For idpf:

Sridhar fixes a couple issues in handling of RSC packets.

Josh adds a call to set_real_num_queues() to keep queue count in sync.

For ixgbe:

Piotr removes missed IS_ERR() removal when ERR_PTR usage was removed.

For igc:

Zdenek Bouska fixes reporting of Rx timestamp with AF_XDP.

Siang sets buffer type on empty frame to ensure proper handling.

* '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  igc: Set buffer type for empty frames in igc_init_empty_frame
  igc: Fix HW RX timestamp when passed by ZC XDP
  ixgbe: Fix possible skb NULL pointer dereference
  idpf: call set_real_num_queues in idpf_open
  idpf: record rx queue in skb for RSC packets
  idpf: fix handling rsc packet with a single segment
====================

Link: https://patch.msgid.link/20250211214343.4092496-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonetlink: specs: add conntrack dump and stats dump support
Florian Westphal [Mon, 10 Feb 2025 15:21:52 +0000 (16:21 +0100)]
netlink: specs: add conntrack dump and stats dump support

This adds support to dump the connection tracking table
("conntrack -L") and the conntrack statistics, ("conntrack -S").

Example conntrack dump:
tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/conntrack.yaml --dump get
[{'id': 59489769,
  'mark': 0,
  'nfgen-family': 2,
  'protoinfo': {'protoinfo-tcp': {'tcp-flags-original': {'flags': {'maxack',
                                                                   'sack-perm',
                                                                   'window-scale'},
                                                         'mask': set()},
                                  'tcp-flags-reply': {'flags': {'maxack',
                                                                'sack-perm',
                                                                'window-scale'},
                                                      'mask': set()},
                                  'tcp-state': 'established',
                                  'tcp-wscale-original': 7,
                                  'tcp-wscale-reply': 8}},
  'res-id': 0,
  'secctx': {'secctx-name': 'system_u:object_r:unlabeled_t:s0'},
  'status': {'assured',
             'confirmed',
             'dst-nat-done',
             'seen-reply',
             'src-nat-done'},
  'timeout': 431949,
  'tuple-orig': {'tuple-ip': {'ip-v4-dst': '34.107.243.93',
                              'ip-v4-src': '192.168.0.114'},
                 'tuple-proto': {'proto-dst-port': 443,
                                 'proto-num': 6,
                                 'proto-src-port': 37104}},
  'tuple-reply': {'tuple-ip': {'ip-v4-dst': '192.168.0.114',
                               'ip-v4-src': '34.107.243.93'},
                  'tuple-proto': {'proto-dst-port': 37104,
                                  'proto-num': 6,
                                  'proto-src-port': 443}},
  'use': 1,
  'version': 0},
 {'id': 3402229480,

Example stats dump:
tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/conntrack.yaml --dump get-stats
[{'chain-toolong': 0,
  'clash-resolve': 3,
  'drop': 0,
 ....

Changes since last iteration:
 - Address comments from Donald Hunter, in particular, fixup "get" and
   "get-stats" descriptions, the former operation supports both dump
   and normal request (returns a single entry, if found), the latter
   only supports dumps.

Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://patch.msgid.link/20250210152159.41077-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: avoid unconditionally touching sk_tsflags on RX
Paolo Abeni [Tue, 11 Feb 2025 17:17:31 +0000 (18:17 +0100)]
net: avoid unconditionally touching sk_tsflags on RX

After commit 5d4cc87414c5 ("net: reorganize "struct sock" fields"),
the sk_tsflags field shares the same cacheline with sk_forward_alloc.

The UDP protocol does not acquire the sock lock in the RX path;
forward allocations are protected via the receive queue spinlock;
additionally udp_recvmsg() calls sock_recv_cmsgs() unconditionally
touching sk_tsflags on each packet reception.

Due to the above, under high packet rate traffic, when the BH and the
user-space process run on different CPUs, UDP packet reception
experiences a cache miss while accessing sk_tsflags.

The receive path doesn't strictly need to access the problematic field;
change sock_set_timestamping() to maintain the relevant information
in a newly allocated sk_flags bit, so that sock_recv_cmsgs() can
take decisions accessing the latter field only.

With this patch applied, on an AMD epic server with i40e NICs, I
measured a 10% performance improvement for small packets UDP flood
performance tests - possibly a larger delta could be observed with more
recent H/W.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/dbd18c8a1171549f8249ac5a8b30b1b5ec88a425.1739294057.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge branch 'netlink-specs-add-a-spec-for-nl80211-wiphy'
Jakub Kicinski [Thu, 13 Feb 2025 03:32:26 +0000 (19:32 -0800)]
Merge branch 'netlink-specs-add-a-spec-for-nl80211-wiphy'

Donald Hunter says:

====================
netlink: specs: add a spec for nl80211 wiphy

Add a rudimentary YNL spec for nl80211 that includes get-wiphy and
get-interface, along with some required enhancements to YNL and the
netlink schemas.

Patch 1 is a minor cleanup to prepare for patch 2
Patches 2-4 are new features for YNL
Patches 5-7 are updates to ynl_gen_c
Patches 8-9 are schema updates for feature parity
Patch 10 is the new nl80211 spec
====================

Link: https://patch.msgid.link/20250211120127.84858-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonetlink: specs: wireless: add a spec for nl80211
Donald Hunter [Tue, 11 Feb 2025 12:01:27 +0000 (12:01 +0000)]
netlink: specs: wireless: add a spec for nl80211

Add a rudimentary YNL spec for nl80211 that covers get-wiphy,
get-interface and get-protocol-features.

./tools/net/ynl/pyynl/cli.py --family nl80211 \
    --do get-protocol-features
{'protocol-features': {'split-wiphy-dump'}}

./tools/net/ynl/pyynl/cli.py --family nl80211 \
    --dump get-wiphy --json '{ "split-wiphy-dump": true }'

./tools/net/ynl/pyynl/cli.py --family nl80211 \
    --dump get-interface

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Link: https://patch.msgid.link/20250211120127.84858-11-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonetlink: specs: add s8, s16 to genetlink schemas
Donald Hunter [Tue, 11 Feb 2025 12:01:26 +0000 (12:01 +0000)]
netlink: specs: add s8, s16 to genetlink schemas

Add s8 and s16 types to the genetlink schemas to align scalar types
across all schemas.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-10-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonetlink: specs: support nested structs in genetlink legacy
Donald Hunter [Tue, 11 Feb 2025 12:01:25 +0000 (12:01 +0000)]
netlink: specs: support nested structs in genetlink legacy

Nested structs are already supported in netlink-raw. Add the same
capability to the genetlink legacy schema.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-9-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agotools/net/ynl: add indexed-array scalar support to ynl-gen-c
Donald Hunter [Tue, 11 Feb 2025 12:01:24 +0000 (12:01 +0000)]
tools/net/ynl: add indexed-array scalar support to ynl-gen-c

Extend ynl-gen-c.py with support for indexed-array that has a scalar
sub-type.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-8-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>