www.infradead.org Git - users/hch/misc.git/log
users/hch/misc.git
18 months ago net: sparx5: add support for tc flower mirred action.
Daniel Machon [Fri, 5 Apr 2024 07:44:49 +0000 (09:44 +0200)]
net: sparx5: add support for tc flower mirred action.

Add support for tc flower mirred action. Two VCAP actions are encoded in
the rule - one for the port mask, and one for the port mask mode. When
the rule is hit, the destination mask is OR'ed with the port mask.

Also add new VCAP function for supporting 72-bit wide actions, and a tc
helper for setting the port forwarding mask.

Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
18 months ago Merge branch 'support-icssg-based-ethernet-on-am65x-sr1-0-devices'
Paolo Abeni [Tue, 9 Apr 2024 07:47:31 +0000 (09:47 +0200)]
Merge branch 'support-icssg-based-ethernet-on-am65x-sr1-0-devices'

Diogo Ivo says:

====================
Support ICSSG-based Ethernet on AM65x SR1.0 devices

This series extends the current ICSSG-based Ethernet driver to support
AM65x Silicon Revision 1.0 devices.

Notable differences between the Silicon Revisions are that there is
no TX core in SR1.0 with this being handled by the firmware, requiring
extra DMA channels to manage communication with the firmware (with the
firmware being different as well) and in the packet classifier.

The motivation behind it is that a significant number of Siemens
devices containing SR1.0 silicon have been deployed in the field
and need to be supported and updated to newer kernel versions
without losing functionality.

This series is based on TI's 5.10 SDK [1].

The fifth version of this patch series can be found in [2].

Compared to the last version of the patch set there are only changes in
patch 05/10, where the fields of a struct are now explicitly declared as
__le32 so that we can properly interpret them.

Both of the problems mentioned in v4 have been addressed by disabling
those functionalities, meaning that this driver currently only supports
one TX queue and does not support a 100Mbit/s half-duplex connection.
The removal of these features has been commented in the appropriate
locations in the code.

[1]: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/?h=ti-linux-5.10.y
[2]: https://lore.kernel.org/netdev/20240326110709.26165-1-diogo.ivo@siemens.com/
====================

Link: https://lore.kernel.org/r/20240403104821.283832-1-diogo.ivo@siemens.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
18 months ago net: ti: icssg-prueth: Add ICSSG Ethernet driver for AM65x SR1.0 platforms
Diogo Ivo [Wed, 3 Apr 2024 10:48:20 +0000 (11:48 +0100)]
net: ti: icssg-prueth: Add ICSSG Ethernet driver for AM65x SR1.0 platforms

Add the PRUeth driver for the ICSSG subsystem found in AM65x SR1.0 devices.
The main differences that set SR1.0 and SR2.0 apart are the missing TXPRU
core in SR1.0, two extra DMA channels for management purposes and different
firmware that needs to be configured accordingly.

Based on the work of Roger Quadros, Vignesh Raghavendra and
Grygorii Strashko in TI's 5.10 SDK [1].

[1]: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/?h=ti-linux-5.10.y

Co-developed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
18 months ago net: ti: icssg-prueth: Modify common functions for SR1.0
Diogo Ivo [Wed, 3 Apr 2024 10:48:19 +0000 (11:48 +0100)]
net: ti: icssg-prueth: Modify common functions for SR1.0

Some parts of the logic differ only slightly between Silicon Revisions.
In these cases add the bits that differ to a common function that
executes those bits conditionally based on the Silicon Revision.

Based on the work of Roger Quadros, Vignesh Raghavendra and
Grygorii Strashko in TI's 5.10 SDK [1].

[1]: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/?h=ti-linux-5.10.y

Co-developed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
18 months ago net: ti: icssg-prueth: Add functions to configure SR1.0 packet classifier
Diogo Ivo [Wed, 3 Apr 2024 10:48:18 +0000 (11:48 +0100)]
net: ti: icssg-prueth: Add functions to configure SR1.0 packet classifier

Add the functions to configure the SR1.0 packet classifier.

Based on the work of Roger Quadros in TI's 5.10 SDK [1].

[1]: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/?h=ti-linux-5.10.y

Co-developed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
18 months ago net: ti: icssg-prueth: Adjust the number of TX channels for SR1.0
Diogo Ivo [Wed, 3 Apr 2024 10:48:17 +0000 (11:48 +0100)]
net: ti: icssg-prueth: Adjust the number of TX channels for SR1.0

As SR1.0 uses the current higher priority channel to send commands to
the firmware, take this into account when setting/getting the number
of channels to/from the user.

Based on the work of Roger Quadros in TI's 5.10 SDK [1].

[1]: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/?h=ti-linux-5.10.y

Co-developed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
18 months ago net: ti: icssg-prueth: Adjust IPG configuration for SR1.0
Diogo Ivo [Wed, 3 Apr 2024 10:48:16 +0000 (11:48 +0100)]
net: ti: icssg-prueth: Adjust IPG configuration for SR1.0

Correctly adjust the IPG based on the Silicon Revision.

Based on the work of Roger Quadros, Vignesh Raghavendra
and Grygorii Strashko in TI's 5.10 SDK [1].

[1]: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/?h=ti-linux-5.10.y

Co-developed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
18 months ago net: ti: icssg-prueth: Add SR1.0-specific description bits
Diogo Ivo [Wed, 3 Apr 2024 10:48:15 +0000 (11:48 +0100)]
net: ti: icssg-prueth: Add SR1.0-specific description bits

Add a field to distinguish between SR1.0 and SR2.0 in the driver
as well as the necessary structures to program SR1.0.

Based on the work of Roger Quadros in TI's 5.10 SDK [1].

[1]: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/?h=ti-linux-5.10.y

Co-developed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
18 months ago net: ti: icssg-prueth: Add SR1.0-specific configuration bits
Diogo Ivo [Wed, 3 Apr 2024 10:48:14 +0000 (11:48 +0100)]
net: ti: icssg-prueth: Add SR1.0-specific configuration bits

Define the firmware configuration structure and commands needed to
communicate with SR1.0 firmware, as well as SR1.0 buffer information
where it differs from SR2.0.

Based on the work of Roger Quadros, Murali Karicheri and
Grygorii Strashko in TI's 5.10 SDK [1].

[1]: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/?h=ti-linux-5.10.y

Co-developed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
18 months ago net: ti: icssg-prueth: Move common functions into a separate file
Diogo Ivo [Wed, 3 Apr 2024 10:48:13 +0000 (11:48 +0100)]
net: ti: icssg-prueth: Move common functions into a separate file

In order to allow code sharing between Silicon Revisions 1.0 and 2.0
move all functions that can be shared into a common file. This commit
introduces no functional changes.

Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
18 months ago eth: Move IPv4/IPv6 multicast address bases to their own symbols
Diogo Ivo [Wed, 3 Apr 2024 10:48:12 +0000 (11:48 +0100)]
eth: Move IPv4/IPv6 multicast address bases to their own symbols

As these addresses can be useful outside of checking if an address
is a multicast address (for example in device drivers) make them
accessible to users of etherdevice.h to avoid code duplication.

Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
18 months ago dt-bindings: net: Add support for AM65x SR1.0 in ICSSG
Diogo Ivo [Wed, 3 Apr 2024 10:48:11 +0000 (11:48 +0100)]
dt-bindings: net: Add support for AM65x SR1.0 in ICSSG

Silicon Revision 1.0 of the AM65x came with a slightly different ICSSG
support: Only 2 PRUs per slice are available and instead 2 additional
DMA channels are used for management purposes. We have no restrictions
on specified PRUs, but the DMA channels need to be adjusted.

Co-developed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
18 months ago net: phy: air_en8811h: fix some error codes
Dan Carpenter [Fri, 5 Apr 2024 10:08:59 +0000 (13:08 +0300)]
net: phy: air_en8811h: fix some error codes

These error paths accidentally return "ret" which is zero/success
instead of the correct error code.

Fixes: 71e79430117d ("net: phy: air_en8811h: Add the Airoha EN8811H PHY driver")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/7ef2e230-dfb7-4a77-8973-9e5be1a99fc2@moroto.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago arcnet: Convert from tasklet to BH workqueue
Allen Pais [Wed, 3 Apr 2024 16:23:06 +0000 (16:23 +0000)]
arcnet: Convert from tasklet to BH workqueue

The only generic interface to execute asynchronously in the BH context is
tasklet; however, it's marked deprecated and has some design flaws. To
replace tasklets, BH workqueue support was recently added. A BH workqueue
behaves similarly to regular workqueues except that the queued work items
are executed in the BH context.

This patch converts drivers/net/arcnet/* from tasklet to BH workqueue.

Based on the work done by Tejun Heo <tj@kernel.org>
Branch: https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git for-6.10

Signed-off-by: Allen Pais <allen.lkml@gmail.com>
Link: https://lore.kernel.org/r/20240403162306.20258-1-apais@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago r8169: add support for RTL8168M
Heiner Kallweit [Sun, 7 Apr 2024 21:19:25 +0000 (23:19 +0200)]
r8169: add support for RTL8168M

A user reported an unknown chip version. According to the r8168 vendor
driver it's called RTL8168M, but handling is identical to RTL8168H.
So let's simply treat it as RTL8168H.

Tested-by: Евгений <octobergun@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago Merge branch 'devlink-io-eqs'
David S. Miller [Mon, 8 Apr 2024 13:10:45 +0000 (14:10 +0100)]
Merge branch 'devlink-io-eqs'

Parav Pandit says:

====================
devlink: Add port function attribute for IO EQs

Currently, PCI SFs and VFs use IO event queues to deliver netdev per
channel events. The number of netdev channels is a function of IO
event queues. Similarly, for an RDMA device, the
completion vectors are also a function of IO event queues. Currently, an
administrator on the hypervisor has no means to provision the number
of IO event queues for the SF device or the VF device. Device/firmware
determines some arbitrary value for these IO event queues. Due to this,
the SF netdev channels are unpredictable, and consequently, the
performance is too.

This short series introduces a new port function attribute: max_io_eqs.
The goal is to provide administrators at the hypervisor level with the
ability to provision the maximum number of IO event queues for a
function. This gives the administrator control to provision the
right number of IO event queues and have predictable performance.

Example of an administrator provisioning (setting) the maximum number of
IO event queues when using switchdev mode:

  $ devlink port show pci/0000:06:00.0/1
      pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
          function:
          hw_addr 00:00:00:00:00:00 roce enable max_io_eqs 10

  $ devlink port function set pci/0000:06:00.0/1 max_io_eqs 20

  $ devlink port show pci/0000:06:00.0/1
      pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
          function:
          hw_addr 00:00:00:00:00:00 roce enable max_io_eqs 20

This sets the corresponding maximum IO event queues of the function
before it is enumerated. Thus, when the VF/SF driver reads the
capability from the device, it sees the value provisioned by the
hypervisor. The driver is then able to configure the number of channels
for the net device, as well as the number of completion vectors
for the RDMA device. The device/firmware also honors the provisioned
value, hence any attempt by a VF/SF driver to create IO EQs
beyond the provisioned value results in an error.
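
For scripting around this, the provisioned value can be read back from the
'devlink port show' output. A minimal sketch, assuming the output layout
shown in the examples above; get_max_io_eqs is a hypothetical helper name
and not part of devlink:

```shell
# Hypothetical helper: print the max_io_eqs value found on stdin.
# Assumes the "max_io_eqs <number>" layout shown in the examples above.
get_max_io_eqs() {
	sed -n 's/.*max_io_eqs \([0-9][0-9]*\).*/\1/p'
}

# Sample text copied from the example output; on a real system this would
# come from: devlink port show pci/0000:06:00.0/1
sample='pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
    function:
    hw_addr 00:00:00:00:00:00 roce enable max_io_eqs 20'

printf '%s\n' "$sample" | get_max_io_eqs   # prints 20
```

Matching on the attribute name rather than on a column position keeps the
helper working if other attributes are added to the same line.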

With the above settings now in place, the administrator achieved 2x
performance with the SF device with 20 channels. In the second example,
when the SF was provisioned for a container with 2 CPUs, the administrator
provisioned only 2 IO event queues, thereby saving device resources.

changelog:
v2->v3:
- limited to 80 chars per line in devlink
- fixed comments from Jakub in mlx5 driver to fix missing mutex unlock
  on error path
v1->v2:
- limited comment to 80 chars per line in header file
- fixed set function variables for reverse christmas tree
- fixed comments from Kalesh
- fixed missing kfree in get call
- returning error code for get cmd failure
- fixed error msg copy paste error in set on cmd failure
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago mlx5/core: Support max_io_eqs for a function
Parav Pandit [Sat, 6 Apr 2024 01:05:38 +0000 (04:05 +0300)]
mlx5/core: Support max_io_eqs for a function

Implement get and set for the maximum IO event queues for SF and VF.
This enables an administrator on the hypervisor to control the maximum
number of IO event queues, which are typically used to derive the maximum
and default number of net device channels or RDMA device completion vectors.

Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago devlink: Support setting max_io_eqs
Parav Pandit [Sat, 6 Apr 2024 01:05:37 +0000 (04:05 +0300)]
devlink: Support setting max_io_eqs

Many devices send event notifications for the IO queues,
such as tx and rx queues, through event queues.

Enable a privileged owner, such as a hypervisor PF, to set the number
of IO event queues for the VF and SF during the provisioning stage.

example:
Get maximum IO event queues of the VF device::

  $ devlink port show pci/0000:06:00.0/2
  pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
      function:
          hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 10

Set maximum IO event queues of the VF device::

  $ devlink port function set pci/0000:06:00.0/2 max_io_eqs 32

  $ devlink port show pci/0000:06:00.0/2
  pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
      function:
          hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 32

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago net: display more skb fields in skb_dump()
Eric Dumazet [Sun, 7 Apr 2024 08:06:06 +0000 (08:06 +0000)]
net: display more skb fields in skb_dump()

Print these additional fields in skb_dump() to ease debugging.

- mac_len
- csum_start (in v2, at Willem's suggestion)
- csum_offset (in v2, at Willem's suggestion)
- priority
- mark
- alloc_cpu
- vlan_all
- encapsulation
- inner_protocol
- inner_mac_header
- inner_network_header
- inner_transport_header

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago Merge branch 'phy-cleanup-EEE'
David S. Miller [Mon, 8 Apr 2024 13:04:16 +0000 (14:04 +0100)]
Merge branch 'phy-cleanup-EEE'

Andrew Lunn says:

====================
net: Clean up some EEE code

Previous patches have reworked the API between phylib and MAC drivers
with respect to EEE, pushing most of the work into phylib. These two
patches rework two drivers to make use of the new API, and fix their
EEE implementation, so that EEE is configured in the MAC based on what
is actually negotiated during autoneg.

Compile tested only.
====================

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago net: lan743x: Fixup EEE
Andrew Lunn [Sat, 6 Apr 2024 20:16:00 +0000 (15:16 -0500)]
net: lan743x: Fixup EEE

The enabling/disabling of EEE in the MAC should happen as a result of
auto negotiation. So move the enable/disable into
lan743x_phy_link_status_change() which gets called by phylib when
there is a change in link status.

lan743x_ethtool_set_eee() now just programs the hardware with the LPI
timer value, and passes everything else to phylib, so it can correctly
set up the PHY.

lan743x_ethtool_get_eee() relies on phylib doing most of the work; the
MAC driver just adds the LPI timer value.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago net: usb: lan78xx: Fixup EEE
Andrew Lunn [Sat, 6 Apr 2024 20:15:59 +0000 (15:15 -0500)]
net: usb: lan78xx: Fixup EEE

The enabling/disabling of EEE in the MAC should happen as a result of
auto negotiation. So move the enable/disable into
lan78xx_link_status_change() which gets called by phylib when
there is a change in link status.

lan78xx_set_eee() now just programs the hardware with the LPI
timer value, and passes everything else to phylib, so it can correctly
set up the PHY.

lan78xx_get_eee() relies on phylib doing most of the work; the
MAC driver just adds the LPI timer value.

Call phy_support_eee() to indicate the MAC does actually support EEE.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago mptcp: add reset reason options in some places
Jason Xing [Sat, 6 Apr 2024 01:48:48 +0000 (09:48 +0800)]
mptcp: add reset reason options in some places

The reason codes are handled in two ways nowadays (quoting Mat Martineau):
1. Sending in the MPTCP option on RST packets when there is no subflow
context available (these use subflow_add_reset_reason() and directly call
a TCP-level send_reset function)
2. The "normal" way via subflow->reset_reason. This will propagate to both
the outgoing reset packet and to a local path manager process via netlink
in mptcp_event_sub_closed()

RFC 8684 defines the reset reason behaviour, although including the
reason is optional rather than required:

    A host sends a TCP RST in order to close a subflow or reject
    an attempt to open a subflow (MP_JOIN). In order to let the
    receiving host know why a subflow is being closed or rejected,
    the TCP RST packet MAY include the MP_TCPRST option (Figure 15).
    The host MAY use this information to decide, for example, whether
    it tries to re-establish the subflow immediately, later, or never.

Since commit dc87efdb1a5cd ("mptcp: add mptcp reset option support")
introduced this feature about three years ago, we can make full use of it.
There remain some places where we could insert the reason into the skb, as
this patch does.

Many thanks to Mat and Paolo for help:)

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago ipv4: Set scope explicitly in ip_route_output().
Guillaume Nault [Fri, 5 Apr 2024 20:05:00 +0000 (22:05 +0200)]
ipv4: Set scope explicitly in ip_route_output().

Add a "scope" parameter to ip_route_output() so that callers don't have
to override the tos parameter with the RTO_ONLINK flag if they want a
local scope.

This will allow converting flowi4_tos to dscp_t in the future, thus
allowing static analysers to flag invalid interactions between
"tos" (the DSCP bits) and ECN.

Only three users ask for local scope (bonding, arp and atm). The others
continue to use RT_SCOPE_UNIVERSE. While there, add a comment to warn
users about the limitations of ip_route_output().

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com> # infiniband
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago ipvlan: handle NETDEV_DOWN event
Venkat Venkatsubra [Fri, 5 Apr 2024 18:16:12 +0000 (11:16 -0700)]
ipvlan: handle NETDEV_DOWN event

In case of stacked devices, to help propagate the down
link state from the parent/root device (to this leaf device),
handle NETDEV_DOWN event like it is done now for NETDEV_UP.

In the below example, ens5 is the host interface which is the
parent of the ipvlan interface eth0 in the container.

Host:

[root@gkn-podman-x64 ~]# ip link set ens5 down
[root@gkn-podman-x64 ~]# ip -d link show dev ens5
3: ens5: <BROADCAST,MULTICAST> mtu 9000 qdisc mq state DOWN
      ...
[root@gkn-podman-x64 ~]#

Container:

[root@testnode-ol8 /]# ip -d link show dev eth0
2: eth0@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 state UNKNOWN
        ...
    ipvlan mode l2 bridge
        ...
[root@testnode-ol8 /]#

eth0's state continues to show up as UP even though ens5 is now DOWN.

For macvlan the handling of NETDEV_DOWN event was added in
commit 80fd2d6ca546 ("macvlan: Change status when lower device goes down").

Reported-by: Gia-Khanh Nguyen <gia-khanh.nguyen@oracle.com>
Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago af_packet: avoid a false positive warning in packet_setsockopt()
Eric Dumazet [Fri, 5 Apr 2024 11:49:39 +0000 (11:49 +0000)]
af_packet: avoid a false positive warning in packet_setsockopt()

Although the code is correct, the following line

copy_from_sockptr(&req_u.req, optval, len);

triggers this warning:

memcpy: detected field-spanning write (size 28) of single field "dst" at include/linux/sockptr.h:49 (size 16)

Refactor the code to be more explicit.

Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago net: handle HAS_IOPORT dependencies
Niklas Schnelle [Fri, 5 Apr 2024 11:18:31 +0000 (13:18 +0200)]
net: handle HAS_IOPORT dependencies

In a future patch HAS_IOPORT=n will disable inb()/outb() and friends at
compile time. We thus need to add HAS_IOPORT as dependency for
those drivers requiring them. For the DEFXX driver the use of I/O
ports is optional and we only need to fence specific code paths. It also
turns out that with HAS_IOPORT handled explicitly HAMRADIO does not need
the !S390 dependency and successfully builds the bpqether driver.

Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Maciej W. Rozycki <macro@orcam.me.uk>
Co-developed-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago Merge branch 'mptcp-selftests'
David S. Miller [Mon, 8 Apr 2024 10:53:22 +0000 (11:53 +0100)]
Merge branch 'mptcp-selftests'

Matthieu Baerts says:

====================
selftests: mptcp: cleanups and 'ip mptcp' support

Here are some patches from Geliang, doing different cleanups, and
supporting 'ip mptcp' in more MPTCP selftests.

Patch 1 checks that TC is available in selftests requiring it.

Patch 2 adds 'ms' units in TC commands, to avoid confusion.

Patches 3-9 are some prerequisites for patch 10: some export code from
mptcp_join.sh to mptcp_lib.sh, to be re-used in pm_netlink.sh,
mptcp_sockopt.sh and simult_flows.sh ; and others add helpers to
pm_netlink.sh to easily support both 'ip mptcp' and 'pm_nl_ctl' tools to
interact with the in-kernel MPTCP path-manager.

Patch 10 adds a '-i' parameter in mptcp_sockopt.sh, pm_netlink.sh, and
simult_flows.sh to use 'ip mptcp' tool instead of 'pm_nl_ctl'.

Patch 11 fixes some ShellCheck warnings in pm_netlink.sh, in order to
drop a ShellCheck 'disable' instruction.
====================

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago selftests: mptcp: netlink: drop disable=SC2086
Geliang Tang [Fri, 5 Apr 2024 10:52:15 +0000 (12:52 +0200)]
selftests: mptcp: netlink: drop disable=SC2086

Only a few variables are still not double-quoted. Quote them, so that
"shellcheck disable=SC2086" can be dropped.
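
SC2086 is ShellCheck's warning about unquoted expansions, which are subject
to word splitting and globbing. A small sketch of the difference quoting
makes; count_args and the sample value are illustrative and not taken from
pm_netlink.sh:

```shell
# Print how many arguments the function received.
count_args() {
	echo "$#"
}

addr_and_flags="10.0.1.1 flags signal"

# Unquoted: the value is split on whitespace into three arguments.
# shellcheck disable=SC2086
count_args $addr_and_flags     # prints 3

# Quoted: the value is passed through as a single argument.
count_args "$addr_and_flags"   # prints 1
```

Once every expansion is quoted (or split deliberately), the file-wide
'disable' directive is no longer needed.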

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago selftests: mptcp: ip_mptcp option for more scripts
Geliang Tang [Fri, 5 Apr 2024 10:52:14 +0000 (12:52 +0200)]
selftests: mptcp: ip_mptcp option for more scripts

This patch adds '-i' option for mptcp_sockopt.sh, pm_netlink.sh, and
simult_flows.sh, to use 'ip mptcp' command in the tests instead of
'pm_nl_ctl'. Update usage() correspondingly.
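
A minimal sketch of how such an option might be parsed with getopts. The
variable and function names below are hypothetical, and the real scripts
may structure this differently:

```shell
use_ip_mptcp=0   # hypothetical flag: 1 selects 'ip mptcp', 0 selects 'pm_nl_ctl'

usage() {
	echo "Usage: $0 [-h] [-i]"
	echo "  -h  show this help"
	echo "  -i  use 'ip mptcp' instead of 'pm_nl_ctl'"
}

# Parse the options passed to this function.
parse_opts() {
	OPTIND=1
	while getopts "hi" opt; do
		case "$opt" in
		i) use_ip_mptcp=1 ;;
		h) usage; return 0 ;;
		*) usage >&2; return 1 ;;
		esac
	done
}

parse_opts -i
echo "use_ip_mptcp=$use_ip_mptcp"   # prints use_ip_mptcp=1
```

Keeping the default at 0 means existing invocations without '-i' behave as
before.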

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago selftests: mptcp: use pm_nl endpoint ops
Geliang Tang [Fri, 5 Apr 2024 10:52:13 +0000 (12:52 +0200)]
selftests: mptcp: use pm_nl endpoint ops

Use those newly added pm_nl endpoint ops helpers to replace all 'pm_nl_ctl'
commands with 'limits', 'add', 'del', 'flush', 'show' and 'set' arguments
in scripts mptcp_sockopt.sh and simult_flows.sh.

In pm_netlink.sh, add wrappers of these helpers to make the function names
shorter. Then use the wrappers to replace all 'pm_nl_ctl' commands.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago selftests: mptcp: export pm_nl endpoint ops
Geliang Tang [Fri, 5 Apr 2024 10:52:12 +0000 (12:52 +0200)]
selftests: mptcp: export pm_nl endpoint ops

This patch exports six endpoint operation helpers with the pm_nl_ prefix,
pm_nl_set_limits(), pm_nl_add_endpoint(), pm_nl_del_endpoint(),
pm_nl_flush_endpoint(), pm_nl_show_endpoints() and pm_nl_change_endpoint(),
into mptcp_lib.sh as public functions, and renames each of them with a
mptcp_lib_ prefix. The old pm_nl_-prefixed helpers in mptcp_join.sh then
become wrappers around the mptcp_lib_-prefixed ones.
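
The wrapper pattern can be sketched as follows. The function names follow
the prefixes described above, but the body is a placeholder that only
prints what would be run; the printed command line is an assumption for
illustration:

```shell
# Library side (would live in mptcp_lib.sh): takes a namespace and two limits.
# Placeholder body: prints the command instead of executing it.
mptcp_lib_pm_nl_set_limits() {
	ns=$1
	addrs=$2
	subflows=$3

	echo "ip netns exec $ns ./pm_nl_ctl limits $addrs $subflows"
}

# Script side (would live in mptcp_join.sh): the old name is kept as a
# thin wrapper, so existing callers do not need to change.
pm_nl_set_limits() {
	mptcp_lib_pm_nl_set_limits "$@"
}

pm_nl_set_limits ns1 0 2
```

Other scripts can then call the mptcp_lib_ function directly without
sourcing mptcp_join.sh.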

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago selftests: mptcp: join: update endpoint ops
Geliang Tang [Fri, 5 Apr 2024 10:52:11 +0000 (12:52 +0200)]
selftests: mptcp: join: update endpoint ops

This patch uses 'case' statements to simplify pm_nl_add_endpoint() and
pm_nl_check_endpoint(). It also simplifies pm_nl_check_endpoint() with the
check_output() helper, and updates pm_nl_del_endpoint() to avoid the
'double quote' shellcheck warning.
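
As a rough illustration of the 'case' approach, here is a sketch of an
argument parser in that style. parse_endpoint_args and its keywords are
hypothetical and do not reproduce the actual pm_nl_add_endpoint() code:

```shell
# Hypothetical parser: first argument is the address, the rest are
# "keyword value" pairs handled by a single 'case' statement.
parse_endpoint_args() {
	addr=$1
	shift
	id=""
	dev=""
	flags=""

	while [ $# -gt 0 ]; do
		key=$1
		val=${2-}
		case "$key" in
		id) id=$val ;;
		dev) dev=$val ;;
		# Comma-separated on input, space-separated on output.
		flags) flags=$(echo "$val" | tr ',' ' ') ;;
		*)
			echo "unknown argument: $key" >&2
			return 1
			;;
		esac
		# Drop the keyword, then its value if one was given.
		shift
		if [ $# -gt 0 ]; then
			shift
		fi
	done

	echo "addr=$addr id=$id dev=$dev flags=$flags"
}

parse_endpoint_args 10.0.1.1 id 5 flags signal,subflow
```

Each keyword gets one short branch, which is easier to extend than a chain
of if/elif tests.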

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago selftests: mptcp: netlink: add change_address helper
Geliang Tang [Fri, 5 Apr 2024 10:52:10 +0000 (12:52 +0200)]
selftests: mptcp: netlink: add change_address helper

The output formats of 'ip mptcp' commands differ considerably from those
of 'pm_nl_ctl' commands.

A new 'change_address' helper is added here, to change the flag of an
address. This is a bit similar to mptcp_join.sh's pm_nl_change_endpoint().

Usage:
Address ID - pm_nl_change_endpoint $ns id $id $flags
IP address - change_address $ns $addr $flags

Use this new helper in pm_netlink.sh to replace all 'pm_nl_ctl set'
commands.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago selftests: mptcp: add {get,format}_endpoint(s) helpers
Geliang Tang [Fri, 5 Apr 2024 10:52:09 +0000 (12:52 +0200)]
selftests: mptcp: add {get,format}_endpoint(s) helpers

The output formats of 'ip mptcp' commands differ considerably from those
of 'pm_nl_ctl' commands.

This patch adds a new helper format_endpoints() to format the outputs of
'ip mptcp' and 'pm_nl_ctl' with 'endpoints' arguments to hide these
differences.

A new helper named get_endpoint() has also been added to show a specific
endpoint identified by the given address ID. It is similar to mptcp_join.sh's
pm_nl_show_endpoints() helper, except that the latter shows all entries.

Use these two helpers in mptcp_join.sh and pm_netlink.sh to replace all
'pm_nl_ctl get' commands and outputs of 'pm_nl_ctl dump/get'.

Suggested-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months ago selftests: mptcp: netlink: add 'limits' helpers
Geliang Tang [Fri, 5 Apr 2024 10:52:08 +0000 (12:52 +0200)]
selftests: mptcp: netlink: add 'limits' helpers

The output format of the 'ip mptcp limits' command differs considerably
from that of the 'pm_nl_ctl limits' command.

This patch adds format_limits() helper to format the outputs of these
two commands to hide the difference. get_limits() has been added to show
the limits.

Use these two helpers in pm_netlink.sh to replace all 'pm_nl_ctl limits'
commands and outputs.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agoselftests: mptcp: export ip_mptcp to mptcp_lib
Geliang Tang [Fri, 5 Apr 2024 10:52:07 +0000 (12:52 +0200)]
selftests: mptcp: export ip_mptcp to mptcp_lib

This patch exports ip_mptcp into mptcp_lib.sh as a public variable,
named MPTCP_LIB_IP_MPTCP. Add a helper mptcp_lib_set_ip_mptcp() to set
it, and a helper mptcp_lib_is_ip_mptcp() to test whether it is set. Use
these two helpers in mptcp_join.sh.

This patch prepares for upcoming commits.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agoselftests: mptcp: add ms units for tc-netem delay
Geliang Tang [Fri, 5 Apr 2024 10:52:06 +0000 (12:52 +0200)]
selftests: mptcp: add ms units for tc-netem delay

'delay 1' in tc-netem is ambiguous: it is unclear whether it means a delay
of 1 second or 1 millisecond. This patch explicitly adds millisecond units
to make these commands clearer.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agoselftests: mptcp: add tc check for check_tools
Geliang Tang [Fri, 5 Apr 2024 10:52:05 +0000 (12:52 +0200)]
selftests: mptcp: add tc check for check_tools

tc is used in some test scripts: mptcp_connect.sh, mptcp_join.sh and
simult_flows.sh. It makes sense to check that tc is installed before running
these scripts, just like the other tools. So this patch adds a 'tc' check to
mptcp_lib_check_tools(), and checks for it in these test scripts.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agotcp: more struct tcp_sock adjustments
Eric Dumazet [Fri, 5 Apr 2024 10:29:26 +0000 (10:29 +0000)]
tcp: more struct tcp_sock adjustments

tp->recvmsg_inq is used from tcp recvmsg() thus should
be in tcp_sock_read_rx group.

tp->tcp_clock_cache and tp->tcp_mstamp are written
both in rx and tx paths, thus are better placed
in tcp_sock_write_txrx group.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agonet: usb: ax88179_178a: non necessary second random mac address
Jose Ignacio Tornos Martinez [Fri, 5 Apr 2024 08:24:31 +0000 (10:24 +0200)]
net: usb: ax88179_178a: non necessary second random mac address

If the mac address can not be read from the device registers or the
devicetree, a random address is generated, but this was already done from
usbnet_probe, so it is not necessary to call eth_hw_addr_random from here
again to generate another random address.

Indeed, when reset was also executed from bind, generating another random
MAC address invalidated the check in usbnet_probe that determines whether
the MAC address assigned to the interface is random, because that check
compares against the initially generated random address. Now that reset is
only executed from the open operation, this is just a harmless
simplification.

Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agopfcp: avoid copy warning by simplifing code
Michal Swiatkowski [Fri, 5 Apr 2024 06:36:05 +0000 (08:36 +0200)]
pfcp: avoid copy warning by simplifing code

From Arnd comments:
"The memcpy() in the ip_tunnel_info_opts_set() causes
a string.h fortification warning, with at least gcc-13:

    In function 'fortify_memcpy_chk',
        inlined from 'ip_tunnel_info_opts_set' at include/net/ip_tunnels.h:619:3,
        inlined from 'pfcp_encap_recv' at drivers/net/pfcp.c:84:2:
    include/linux/fortify-string.h:553:25: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning]
      553 |                         __write_overflow_field(p_size_field, size);"

It is a false positive caused by ambiguity of the union.

However, as Arnd noticed, copying here is unnecessary. The code can be
simplified to avoid calling ip_tunnel_info_opts_set(), which copies the
options and sets the flags and options_len.

Set correct flags and options_len directly on tun_info.

Fixes: 6dd514f48110 ("pfcp: always set pfcp metadata")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Closes: https://lore.kernel.org/netdev/701f8f93-f5fb-408b-822a-37a1d5c424ba@app.fastmail.com/
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agoMerge branch 'ynl-tests'
David S. Miller [Mon, 8 Apr 2024 10:40:41 +0000 (11:40 +0100)]
Merge branch 'ynl-tests'

Jakub Kicinski says:

====================
selftests: net: groundwork for YNL-based tests

Currently the options for writing networking tests are C, bash or
some mix of the two. YAML/Netlink gives us the ability to easily
interface with Netlink in higher level languages. In particular,
there is a Python library already available in tree, under tools/net.
Add the scaffolding which allows writing tests using this library.

The "scaffolding" is needed because the library lives under
tools/net and uses YAML files from under Documentation/.
So we need a small amount of glue code to find those things
and add them to TEST_FILES.

This series adds both a basic SW sanity test and driver
test which can be run against netdevsim or a real device.
When I develop core code I usually test with netdevsim,
then a real device, and then a backport to Meta's kernel.
Because of the lack of integration, until now I had
to throw away the (YNL-based) test script and netdevsim code.

Running tests in tree directly:

 $ ./tools/testing/selftests/net/nl_netdev.py
 KTAP version 1
 1..2
 ok 1 nl_netdev.empty_check
 ok 2 nl_netdev.lo_check
 # Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0

in tree via make:

 $ make -C tools/testing/selftests/ TARGETS=net \
TEST_PROGS=nl_netdev.py TEST_GEN_PROGS="" run_tests
  [ ... ]

and installed externally, all seem to work:

 $ make -C tools/testing/selftests/ TARGETS=net \
install INSTALL_PATH=/tmp/ksft-net
 $ /tmp/ksft-net/run_kselftest.sh -t net:nl_netdev.py
  [ ... ]

For driver tests I followed the lead of net/forwarding and
get the device name from env and/or a config file.

v3:
 - fix up netdevsim C
 - various small nits in other patches (see changelog in patches)
v2: https://lore.kernel.org/all/20240403023426.1762996-1-kuba@kernel.org/
 - don't add to TARGETS, create a separate variable with deps
 - support and use with
 - support and use passing arguments to tests
v1: https://lore.kernel.org/all/20240402010520.1209517-1-kuba@kernel.org/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agotesting: net-drv: add a driver test for stats reporting
Jakub Kicinski [Fri, 5 Apr 2024 02:45:26 +0000 (19:45 -0700)]
testing: net-drv: add a driver test for stats reporting

Add a very simple test to make sure drivers report expected
stats. Drivers which implement FEC or pause configuration
should report relevant stats. Qstats must be reported,
at least packet and byte counts, and they must match
total device stats.
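
The consistency rule above (per-queue counters must add up to the device
totals) can be sketched as follows. This is a minimal illustration with
made-up data, not the code of the test itself: the counter names and the
helper qstats_match_totals() are assumptions, and a real test would also
have to allow for counters changing between the two reads.

```python
def sum_queue_stats(queues):
    """Sum each counter across all per-queue stat dictionaries."""
    totals = {}
    for queue in queues:
        for name, value in queue.items():
            totals[name] = totals.get(name, 0) + value
    return totals


def qstats_match_totals(queues, device_totals,
                        required=("rx-packets", "rx-bytes",
                                  "tx-packets", "tx-bytes")):
    """Return a list of problems; an empty list means the stats agree."""
    summed = sum_queue_stats(queues)
    problems = []
    for name in required:
        if name not in summed:
            problems.append(f"{name}: not reported by any queue")
        elif summed[name] != device_totals.get(name):
            problems.append(
                f"{name}: qstats={summed[name]} device={device_totals.get(name)}")
    return problems


# Hypothetical sample data for a device with two queues.
queues = [
    {"rx-packets": 10, "rx-bytes": 1000, "tx-packets": 4, "tx-bytes": 400},
    {"rx-packets": 5, "rx-bytes": 500, "tx-packets": 6, "tx-bytes": 600},
]
device = {"rx-packets": 15, "rx-bytes": 1500, "tx-packets": 10, "tx-bytes": 1000}

print(qstats_match_totals(queues, device))
```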

Tested with netdevsim, bnxt, in-tree and installed.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agoselftests: drivers: add scaffolding for Netlink tests in Python
Jakub Kicinski [Fri, 5 Apr 2024 02:45:25 +0000 (19:45 -0700)]
selftests: drivers: add scaffolding for Netlink tests in Python

Add drivers/net as a target for mixed-use tests.
The setup is expected to work similarly to the forwarding tests.
Since we only need one interface (unlike forwarding tests)
read the target device name from NETIF. If not present we'll
try to run the test against netdevsim.
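
The device selection described above might look roughly like this. Only
the NETIF variable name comes from the text; the function name and the
returned tuple are assumptions made for illustration.

```python
import os


def pick_test_device(env=None):
    """Choose the interface to test.

    Returns ("real", ifname) when NETIF is set to a non-empty value,
    otherwise ("netdevsim", None) to signal that a simulated device
    should be created by the caller.
    """
    env = os.environ if env is None else env
    ifname = env.get("NETIF")
    if ifname:
        return ("real", ifname)
    return ("netdevsim", None)


print(pick_test_device({"NETIF": "eth0"}))
print(pick_test_device({}))
```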

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agonetdevsim: report stats by default, like a real device
Jakub Kicinski [Fri, 5 Apr 2024 02:45:24 +0000 (19:45 -0700)]
netdevsim: report stats by default, like a real device

Real devices should implement qstats. Devices which support
pause or FEC configuration should also report the relevant stats.

nsim was missing FEC stats completely, some of the qstats
and pause stats required toggling a debugfs knob.

Note that the tests which used pause always initialize the setting
so they shouldn't be affected by the different starting value.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agoselftests: nl_netdev: add a trivial Netlink netdev test
Jakub Kicinski [Fri, 5 Apr 2024 02:45:23 +0000 (19:45 -0700)]
selftests: nl_netdev: add a trivial Netlink netdev test

Add a trivial test using YNL.

  $ ./tools/testing/selftests/net/nl_netdev.py
  KTAP version 1
  1..2
  ok 1 nl_netdev.empty_check
  ok 2 nl_netdev.lo_check

Instantiate the family once, it takes longer than the test itself.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agoselftests: net: add scaffolding for Netlink tests in Python
Jakub Kicinski [Fri, 5 Apr 2024 02:45:22 +0000 (19:45 -0700)]
selftests: net: add scaffolding for Netlink tests in Python

Add glue code for accessing the YNL library which lives under
tools/net and YAML spec files from under Documentation/.
Automatically figure out if tests are run in tree or not.
Since we'll want to use this library both from net and
drivers/net test targets make the library a target as well,
and automatically include it when net or drivers/net are
included. Making net/lib a target ensures that we end up
with only one copy of it, and saves us some path guessing.

Add a tiny bit of formatting support to be able to output KTAP
from the start.
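
As a rough idea of what "a tiny bit of formatting support" for KTAP can
amount to, here is a self-contained sketch. It is not the library added by
this patch: run_ktap() and the demo tests are hypothetical, and the totals
line carries fewer fields than a full KTAP summary would.

```python
def run_ktap(tests):
    """Run (name, callable) pairs and return KTAP-style output lines."""
    lines = ["KTAP version 1", f"1..{len(tests)}"]
    passed = failed = 0
    for number, (name, func) in enumerate(tests, start=1):
        try:
            func()
        except Exception as exc:
            # Any exception counts as a failure; report it as a comment.
            failed += 1
            lines.append(f"not ok {number} {name}")
            lines.append(f"# {type(exc).__name__}: {exc}")
        else:
            passed += 1
            lines.append(f"ok {number} {name}")
    lines.append(f"# Totals: pass:{passed} fail:{failed}")
    return lines


def empty_check():
    assert [] == []


def failing_check():
    raise ValueError("hypothetical failure")


if __name__ == "__main__":
    print("\n".join(run_ktap([("demo.empty_check", empty_check),
                              ("demo.failing_check", failing_check)])))
```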

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agoMerge tag 'batadv-next-pullrequest-20240405' of git://git.open-mesh.org/linux-merge
David S. Miller [Mon, 8 Apr 2024 10:37:27 +0000 (11:37 +0100)]
Merge tag 'batadv-next-pullrequest-20240405' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This cleanup patchset includes the following patches:

 - bump version strings, by Simon Wunderlich

 - prefer kfree_rcu() over call_rcu() with free-only callbacks,
   by Dmitry Antipov

 - bypass empty buckets in batadv_purge_orig_ref(), by Eric Dumazet
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agonet: mdio-gpio: Use device_is_compatible()
Andy Shevchenko [Thu, 4 Apr 2024 17:55:57 +0000 (20:55 +0300)]
net: mdio-gpio: Use device_is_compatible()

Replace open coded variant of device_is_compatible().

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agonet: dqs: use sysfs_emit() in favor of sprintf()
Eric Dumazet [Thu, 4 Apr 2024 16:46:04 +0000 (16:46 +0000)]
net: dqs: use sysfs_emit() in favor of sprintf()

Commit 6025b9135f7a ("net: dqs: add NIC stall detector based on BQL")
added three sysfs files.

Use the recommended sysfs_emit() helper.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Breno Leitao <leitao@debian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agoip_tunnel: harden copying IP tunnel params to userspace
Alexander Lobakin [Thu, 4 Apr 2024 16:03:02 +0000 (18:03 +0200)]
ip_tunnel: harden copying IP tunnel params to userspace

Structures which are about to be copied to userspace shouldn't have
uninitialized fields or paddings.
memset() the whole &ip_tunnel_parm in ip_tunnel_parm_to_user() before
filling it with the kernel data. The compilers will hopefully combine
writes to it.
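
Why the memset() matters can be shown outside the kernel with a small
ctypes model. The structure below is a made-up two-field layout, not
struct ip_tunnel_parm, and the 0xAA fill stands in for stale memory; the
point is only that padding bytes keep their old contents unless the whole
structure is cleared before the fields are assigned.

```python
import ctypes


class DemoParm(ctypes.Structure):
    # Hypothetical layout: a 1-byte field followed by a 4-byte field,
    # which leaves padding bytes between them on common ABIs.
    _fields_ = [("flag", ctypes.c_uint8),
                ("value", ctypes.c_uint32)]


def fill(clear_first):
    """Fill a DemoParm placed on 'dirty' memory and return its raw bytes."""
    size = ctypes.sizeof(DemoParm)
    buf = bytearray(b"\xaa" * size)       # pretend this is stale memory
    parm = DemoParm.from_buffer(buf)
    if clear_first:
        ctypes.memset(ctypes.addressof(parm), 0, size)
    parm.flag = 1
    parm.value = 2
    return bytes(buf)


print("without memset:", fill(False).hex())
print("with memset:   ", fill(True).hex())
```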

Fixes: 117aef12a7b1 ("ip_tunnel: use a separate struct to store tunnel params in the kernel")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/netdev/5f63dd25-de94-4ca3-84e6-14095953db13@moroto.mountain
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agoipv6: remove RTNL protection from ip6addrlbl_dump()
Eric Dumazet [Thu, 4 Apr 2024 13:24:13 +0000 (13:24 +0000)]
ipv6: remove RTNL protection from ip6addrlbl_dump()

No longer hold RTNL while calling ip6addrlbl_dump()
("ip addrlabel show")

ip6addrlbl_dump() was already mostly relying on RCU anyway.

Add READ_ONCE()/WRITE_ONCE() annotations around
net->ipv6.ip6addrlbl_table.seq

Note that ifal_seq value is currently ignored in iproute2,
and a bit weak.

We might later use the cb->seq / nl_dump_check_consistent()
protocol if needed.

Also change return value for a completed dump,
so that NLMSG_DONE can be appended to current skb,
saving one recvmsg() system call.

v2: read net->ipv6.ip6addrlbl_table.seq once, (David Ahern)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link:https://lore.kernel.org/netdev/67f5cb70-14a4-4455-8372-f039da2f15c2@kernel.org/
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agoinet: frags: delay fqdir_free_fn()
Eric Dumazet [Thu, 4 Apr 2024 13:07:51 +0000 (13:07 +0000)]
inet: frags: delay fqdir_free_fn()

fqdir_free_fn() is using very expensive rcu_barrier()

When one netns is dismantled, we often call fqdir_exit()
multiple times, typically launching fqdir_free_fn() twice.

Delaying fqdir_free_fn() by one second helps to reduce
the number of rcu_barrier() calls, and lock contention
on rcu_state.barrier_mutex.
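
The effect of the delay can be modelled with a small simulation: requests
that arrive within the delay window are collected and then served by a
single expensive barrier. This is only a model of the batching idea with
hypothetical names; it does not reproduce the kernel's work queue code.

```python
class DelayedFreer:
    """Collect free requests and serve each batch with one barrier."""

    def __init__(self, delay=1.0):
        self.delay = delay
        self.pending = []
        self.deadline = None      # time at which the batch will run
        self.barrier_calls = 0    # stand-in for rcu_barrier() invocations
        self.freed = []

    def queue(self, item, now):
        """Add an item; arm the timer only if it is not already armed."""
        self.pending.append(item)
        if self.deadline is None:
            self.deadline = now + self.delay

    def tick(self, now):
        """Run the batch if the delay has expired."""
        if self.deadline is None or now < self.deadline:
            return
        batch, self.pending = self.pending, []
        self.deadline = None
        self.barrier_calls += 1   # one barrier for the whole batch
        self.freed.extend(batch)


freer = DelayedFreer(delay=1.0)
freer.queue("fqdir-a", now=0.0)
freer.queue("fqdir-b", now=0.1)   # same netns dismantle, shortly after
freer.tick(now=0.5)               # too early, nothing happens
freer.tick(now=1.0)               # both entries freed after one barrier
print(freer.barrier_calls, freer.freed)
```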

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agoip6_vti: Remove generic .ndo_get_stats64
Breno Leitao [Thu, 4 Apr 2024 12:52:52 +0000 (05:52 -0700)]
ip6_vti: Remove generic .ndo_get_stats64

Commit 3e2f544dd8a33 ("net: get stats64 if device if driver is
configured") moved the callback to dev_get_tstats64() to net core, so,
unless the driver is doing some custom stats collection, it does not
need to set .ndo_get_stats64.

Since this driver now relies on NETDEV_PCPU_STAT_TSTATS, it
doesn't need to set the dev_get_tstats64() generic .ndo_get_stats64
function pointer.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agoip6_vti: Do not use custom stat allocator
Breno Leitao [Thu, 4 Apr 2024 12:52:51 +0000 (05:52 -0700)]
ip6_vti: Do not use custom stat allocator

With commit 34d21de99cea9 ("net: Move {l,t,d}stats allocation to core and
convert veth & vrf"), stats allocation could be done on net core
instead of in this driver.

With this new approach, the driver doesn't have to bother with error
handling (allocation failure checking, making sure free happens in the
right spot, etc). This is core responsibility now.

Remove the allocation in the ip6_vti and leverage the network
core allocation instead.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agoMerge branch 'phy-listing-link_topology-tracking'
David S. Miller [Sat, 6 Apr 2024 17:25:15 +0000 (18:25 +0100)]
Merge branch 'phy-listing-link_topology-tracking'

Maxime Chevallier says:

====================
Introduce PHY listing and link_topology tracking

This is V11 for the link topology addition, allowing to track all PHYs
that are linked to netdevices.

This V11 addresses the various netlink-related issues that were raised
by Jakub, and fixes some typos in the documentation.

As a reminder, here's what the PHY listings would look like:
 - eth0 has a 88x3310 acting as media converter, and an SFP module with
   an embedded 88e1111 PHY
 - eth2 has a 88e1510 PHY

PHY for eth0:
PHY index: 1
Driver name: mv88x3310
PHY device name: f212a600.mdio-mii:00
Downstream SFP bus name: sfp-eth0
PHY id: 0
Upstream type: MAC

PHY for eth0:
PHY index: 2
Driver name: Marvell 88E1111
PHY device name: i2c:sfp-eth0:16
PHY id: 21040322
Upstream type: PHY
Upstream PHY index: 1
Upstream SFP name: sfp-eth0

PHY for eth2:
PHY index: 1
Driver name: Marvell 88E1510
PHY device name: f212a200.mdio-mii:00
PHY id: 21040593
Upstream type: MAC

Ethtool patches : https://github.com/minimaxwell/ethtool/tree/link-topo-v6
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agonet: ethtool: Allow passing a phy index for some commands
Maxime Chevallier [Thu, 4 Apr 2024 09:29:55 +0000 (11:29 +0200)]
net: ethtool: Allow passing a phy index for some commands

Some netlink commands are targeted towards ethernet PHYs, to control some
of their features. As there are several such commands, add the ability to
pass a PHY index in the ethnl request, which will populate the generic
ethnl_req_info with the relevant phydev when the command targets a PHY.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agonet: sfp: Add helper to return the SFP bus name
Maxime Chevallier [Thu, 4 Apr 2024 09:29:54 +0000 (11:29 +0200)]
net: sfp: Add helper to return the SFP bus name

Knowing the bus name is helpful when we want to expose the link topology
to userspace, add a helper to return the SFP bus name.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agonet: phy: add helpers to handle sfp phy connect/disconnect
Maxime Chevallier [Thu, 4 Apr 2024 09:29:53 +0000 (11:29 +0200)]
net: phy: add helpers to handle sfp phy connect/disconnect

There are a few PHY drivers that can handle SFP modules through their
sfp_upstream_ops. Introduce Phylib helpers to keep track of connected
SFP PHYs in a netdevice's namespace, by adding the SFP PHY to the
upstream PHY's netdev's namespace.

By doing so, these SFP PHYs can be enumerated and exposed to users,
which will be able to use their capabilities.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agonet: sfp: pass the phy_device when disconnecting an sfp module's PHY
Maxime Chevallier [Thu, 4 Apr 2024 09:29:52 +0000 (11:29 +0200)]
net: sfp: pass the phy_device when disconnecting an sfp module's PHY

Pass the phy_device as a parameter to the sfp upstream .disconnect_phy
operation. This is preparatory work to help track phy devices across
a net_device's link.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agonet: phy: Introduce ethernet link topology representation
Maxime Chevallier [Thu, 4 Apr 2024 09:29:51 +0000 (11:29 +0200)]
net: phy: Introduce ethernet link topology representation

Link topologies containing multiple network PHYs attached to the same
net_device can be found when using a PHY as a media converter for use
with an SFP connector, on which an SFP transceiver containing a PHY can
be used.

With the current model, the transceiver's PHY can't be used for
operations such as cable testing, timestamping, macsec offload, etc.

The reason is that most of the logic for these configurations, coming
from either ethtool netlink or ioctls, tends to use netdev->phydev, which
in multi-PHY systems will reference the PHY closest to the MAC.

Introduce a numbering scheme that allows enumerating the PHY devices that
belong to any netdev, which can in turn allow userspace to take more
precise decisions with regard to each PHY's configuration.

The numbering is maintained per-netdev, in a phy_device_list.
The numbering works similarly to a netdevice's ifindex, with
identifiers that are only recycled once INT_MAX has been reached.

This prevents races that could occur between PHY listing and SFP
transceiver removal/insertion.

The identifiers are assigned at phy_attach time, as the numbering
depends on the netdevice the phy is attached to. The PHY index can be
re-used for PHYs that are persistent.
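
The ifindex-like numbering described above (a freed index is not handed
out again until the counter has wrapped around) can be illustrated with a
small cyclic allocator. This is a sketch of the idea only; the class and
its methods are hypothetical and do not mirror the kernel data structure
behind phy_device_list.

```python
class CyclicIndexAllocator:
    """Hand out increasing indices; reuse freed ones only after wrapping."""

    def __init__(self, first=1, last=2**31 - 1):
        self.first = first
        self.last = last          # INT_MAX by default
        self.next = first
        self.in_use = {}

    def _advance(self, idx):
        return self.first if idx == self.last else idx + 1

    def alloc(self, obj):
        """Assign the next free index at or after the cursor to obj."""
        if len(self.in_use) >= self.last - self.first + 1:
            raise RuntimeError("no free index")
        idx = self.next
        while idx in self.in_use:
            idx = self._advance(idx)
        self.in_use[idx] = obj
        self.next = self._advance(idx)
        return idx

    def free(self, idx):
        del self.in_use[idx]


# A tiny range makes the wrap-around visible.
phys = CyclicIndexAllocator(first=1, last=3)
a = phys.alloc("phy-a")
b = phys.alloc("phy-b")
phys.free(a)                      # e.g. an SFP module is removed
c = phys.alloc("phy-c")           # gets a fresh index, not the freed one
print(a, b, c)
```

Because a removed PHY's index is not immediately reassigned, a listing
taken just before the removal cannot be confused with a newly inserted
module that happens to receive the same number.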

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agonet: phy: marvell: implement cable test for 88E1111
Pawel Dembicki [Thu, 4 Apr 2024 09:07:26 +0000 (11:07 +0200)]
net: phy: marvell: implement cable test for 88E1111

The same implementation is also valid for 88E1145. VCT in 88E1111 is
similar to the 88E609x family. The main difference lies in register
organization and required workarounds.

It utilizes the same fields in registers but requires a simpler
implementation.

Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agonetlink: add nlmsg_consume() and use it in devlink compat
Jakub Kicinski [Wed, 3 Apr 2024 20:22:59 +0000 (13:22 -0700)]
netlink: add nlmsg_consume() and use it in devlink compat

devlink_compat_running_version() sticks out when running
netdevsim tests and watching dropped skbs. Add nlmsg_consume()
for cases where we want to free a netlink skb but it is expected,
rather than a drop. af_netlink code uses consume_skb() directly,
which is fine, but some may prefer the symmetry of nlmsg_new() /
nlmsg_consume().

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agonet: skbuff: generalize the skb->decrypted bit
Jakub Kicinski [Wed, 3 Apr 2024 20:21:39 +0000 (13:21 -0700)]
net: skbuff: generalize the skb->decrypted bit

The ->decrypted bit can be reused for other crypto protocols.
Remove the direct dependency on TLS, add helpers to clean up
the ifdefs leaking out everywhere.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 months agoMerge branch 'ynl-rename-array-nest-to-indexed-array'
Jakub Kicinski [Sat, 6 Apr 2024 05:32:51 +0000 (22:32 -0700)]
Merge branch 'ynl-rename-array-nest-to-indexed-array'

Hangbin Liu says:

====================
ynl: rename array-nest to indexed-array

rename array-nest to indexed-array and add un-nest sub-type support
====================

Link: https://lore.kernel.org/r/20240404063114.1221532-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months agoynl: support binary and integer sub-type for indexed-array
Hangbin Liu [Thu, 4 Apr 2024 06:31:13 +0000 (14:31 +0800)]
ynl: support binary and integer sub-type for indexed-array

Add binary and integer sub-type support for indexed-array to display bond
arp and ns targets. Here is what the result looks like:

 # ip link add bond0 type bond mode 1 \
   arp_ip_target 192.168.1.1,192.168.1.2 ns_ip6_target 2001::1,2001::2
 # ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/rt_link.yaml \
   --do getlink --json '{"ifname": "bond0"}' --output-json | jq '.linkinfo'

    "arp-ip-target": [
      "192.168.1.1",
      "192.168.1.2"
    ],
    [...]
    "ns-ip6-target": [
      "2001::1",
      "2001::2"
    ],
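
To illustrate how binary entries of such an array end up as the address
strings shown above, here is a minimal sketch that turns raw 4-byte and
16-byte payloads into text. It assumes each entry's payload is a packed
IPv4 or IPv6 address; the helper name is hypothetical and this is not the
decoding code used by cli.py.

```python
import ipaddress


def decode_addr_array(entries):
    """Convert raw address payloads (4 or 16 bytes each) to strings."""
    return [str(ipaddress.ip_address(raw)) for raw in entries]


# Payloads matching the example addresses above.
arp_targets = [bytes.fromhex("c0a80101"), bytes.fromhex("c0a80102")]
ns_targets = [
    bytes.fromhex("20010000000000000000000000000001"),
    bytes.fromhex("20010000000000000000000000000002"),
]

print(decode_addr_array(arp_targets))
print(decode_addr_array(ns_targets))
```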

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240404063114.1221532-3-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months agoynl: rename array-nest to indexed-array
Hangbin Liu [Thu, 4 Apr 2024 06:31:12 +0000 (14:31 +0800)]
ynl: rename array-nest to indexed-array

Some implementations, like bonding, have a nested array whose entries
share the same attr type. To support all kinds of entries under one nested
array, as discussed [1], let's rename array-nest to indexed-array, and
express that the value is a nest by passing the type via sub-type.

[1] https://lore.kernel.org/netdev/20240312100105.16a59086@kernel.org/

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240404063114.1221532-2-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months agotcp: annotate data-races around tp->window_clamp
Eric Dumazet [Thu, 4 Apr 2024 11:42:31 +0000 (11:42 +0000)]
tcp: annotate data-races around tp->window_clamp

tp->window_clamp can be read locklessly, add READ_ONCE()
and WRITE_ONCE() annotations.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://lore.kernel.org/r/20240404114231.2195171-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months agoMerge branch 'ethtool-hw-timestamping-statistics'
Jakub Kicinski [Sat, 6 Apr 2024 05:22:49 +0000 (22:22 -0700)]
Merge branch 'ethtool-hw-timestamping-statistics'

Rahul Rameshbabu says:

====================
ethtool HW timestamping statistics

The goal of this patch series is to introduce a common set of ethtool
statistics for hardware timestamping that a driver implementer can hook into.
The statistics counters added are based on what I believe are common
patterns/behaviors found across various hardware timestamping implementations
seen in the kernel tree today. The mlx5 family of devices is used
as the PoC for this patch series. Other vendors are more than welcome
to chime in on this series.

Link: https://lore.kernel.org/netdev/20240402205223.137565-1-rrameshbabu@nvidia.com/
Link: https://lore.kernel.org/netdev/20240309084440.299358-1-rrameshbabu@nvidia.com/
Link: https://lore.kernel.org/netdev/20240223192658.45893-1-rrameshbabu@nvidia.com/
Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
====================

Link: https://lore.kernel.org/r/20240403212931.128541-1-rrameshbabu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months agotools: ynl: ethtool.py: Output timestamping statistics from tsinfo-get operation
Rahul Rameshbabu [Wed, 3 Apr 2024 21:28:45 +0000 (14:28 -0700)]
tools: ynl: ethtool.py: Output timestamping statistics from tsinfo-get operation

Print the nested stats attribute containing timestamping statistics when
the --show-time-stamping flag is used.

  [root@binary-eater-vm-01 linux-ethtool-ts]# ./tools/net/ynl/ethtool.py --show-time-stamping mlx5_1
  Time stamping parameters for mlx5_1:
  Capabilities:
    hardware-transmit
    hardware-receive
    hardware-raw-clock
  PTP Hardware Clock: 0
  Hardware Transmit Timestamp Modes:
    off
    on
  Hardware Receive Filter Modes:
    none
    all
  Statistics:
    tx-pkts: 8
    tx-lost: 0
    tx-err: 0

Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Link: https://lore.kernel.org/r/20240403212931.128541-8-rrameshbabu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months agonetlink: specs: ethtool: define header-flags as an enum
Jakub Kicinski [Sat, 6 Apr 2024 05:22:38 +0000 (22:22 -0700)]
netlink: specs: ethtool: define header-flags as an enum

Recent changes added header flags to the spec.
Use an enum instead of defines for more seamless codegen.

[Jakub: drop the already applied parts and rewrite message]

Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Link: https://lore.kernel.org/r/20240403212931.128541-6-rrameshbabu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months agonet/mlx5e: Implement ethtool hardware timestamping statistics
Rahul Rameshbabu [Wed, 3 Apr 2024 21:28:42 +0000 (14:28 -0700)]
net/mlx5e: Implement ethtool hardware timestamping statistics

Feed driver statistics counters related to hardware timestamping to
standardized ethtool hardware timestamping statistics group.

Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Link: https://lore.kernel.org/r/20240403212931.128541-5-rrameshbabu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months agonet/mlx5e: Introduce timestamps statistic counter for Tx DMA layer
Rahul Rameshbabu [Wed, 3 Apr 2024 21:28:41 +0000 (14:28 -0700)]
net/mlx5e: Introduce timestamps statistic counter for Tx DMA layer

Count number of transmitted packets that were hardware timestamped at the
device DMA layer.

Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Link: https://lore.kernel.org/r/20240403212931.128541-4-rrameshbabu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months agonet/mlx5e: Introduce lost_cqe statistic counter for PTP Tx port timestamping CQ
Rahul Rameshbabu [Wed, 3 Apr 2024 21:28:40 +0000 (14:28 -0700)]
net/mlx5e: Introduce lost_cqe statistic counter for PTP Tx port timestamping CQ

Track the number of times a CQE was expected to not be delivered on PTP Tx
port timestamping CQ. A CQE is expected to not be delivered if a certain
amount of time passes since the corresponding CQE containing the DMA
timestamp information has arrived. Increment the late_cqe counter when such
a CQE does manage to be delivered to the CQ.

Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20240403212931.128541-3-rrameshbabu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months agoethtool: add interface to read Tx hardware timestamping statistics
Rahul Rameshbabu [Wed, 3 Apr 2024 21:28:39 +0000 (14:28 -0700)]
ethtool: add interface to read Tx hardware timestamping statistics

Multiple network devices that support hardware timestamping appear to have
common behavior with regard to timestamp handling. Implement common Tx
hardware timestamping statistics in a tx_stats struct_group. Common Rx
hardware timestamping statistics can subsequently be implemented in a
rx_stats struct_group for ethtool_ts_stats.

Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Link: https://lore.kernel.org/r/20240403212931.128541-2-rrameshbabu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months agoMerge branch 'address-all-wunused-const-warnings'
Jakub Kicinski [Sat, 6 Apr 2024 05:11:28 +0000 (22:11 -0700)]
Merge branch 'address-all-wunused-const-warnings'

Arnd Bergmann says:

====================
address all -Wunused-const warnings (the net part)

Compilers traditionally warn for unused 'static' variables, but not
if they are constant. The reason is the custom among C++ programmers
of defining named constants as 'static const' variables in header files
instead of using macros or enums.

In W=1 builds, we get warnings only for static const variables in C
files, but not in headers, which is a good compromise, but this still
produces warning output in at least 30 files. These warnings are
almost all harmless, but also trivial to fix, and there is no
good reason to warn only about the non-const variables being unused.

I've gone through all the files that I found using randconfig and
allmodconfig builds and created patches to avoid these warnings,
with the goal of retaining a clean build once the option is enabled
by default.

Unfortunately, there is one fairly large patch ("drivers: remove
incorrect of_match_ptr/ACPI_PTR annotations") that touches
34 individual drivers that all need the same one-line change.
If necessary, I can split it up by driver or by subsystem,
but at least for reviewing I would keep it as one piece for
the moment.
====================

Link: https://lore.kernel.org/r/20240403080702.3509288-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago net: xgbe: remove extraneous #ifdef checks
Arnd Bergmann [Wed, 3 Apr 2024 08:06:46 +0000 (10:06 +0200)]
net: xgbe: remove extraneous #ifdef checks

When both ACPI and OF are disabled, xgbe_v1 is unused and
causes a W=1 warning:

drivers/net/ethernet/amd/xgbe/xgbe-platform.c:533:39: error: unused variable 'xgbe_v1' [-Werror,-Wunused-const-variable]
static const struct xgbe_version_data xgbe_v1 = {

There is no real point in trying to save a few bytes for the match
tables, so just make them always visible.
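The shape of the fix can be sketched in plain C as follows. The types, the "vendor,sketch-v1" string and the lookup helper are hypothetical stand-ins rather than the xgbe code; the point is only that the table is defined and referenced unconditionally, so the version data it points to is always used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_version_data {
	unsigned int fifo_kib;
};

struct sketch_match {
	const char *compatible;
	const struct sketch_version_data *data;
};

static const struct sketch_version_data sketch_v1 = { .fifo_kib = 80 };

/* No #ifdef around the table: it is always built and always referenced. */
static const struct sketch_match sketch_match_table[] = {
	{ .compatible = "vendor,sketch-v1", .data = &sketch_v1 },
	{ .compatible = NULL, .data = NULL },	/* sentinel */
};

/* Walk the table until the sentinel and return the matching data, if any. */
static const struct sketch_version_data *sketch_lookup(const char *compatible)
{
	const struct sketch_match *m;

	for (m = sketch_match_table; m->compatible; m++)
		if (strcmp(m->compatible, compatible) == 0)
			return m->data;
	return NULL;
}
```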

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20240403080702.3509288-29-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago isdn: kcapi: don't build unused procfs code
Arnd Bergmann [Wed, 3 Apr 2024 08:06:44 +0000 (10:06 +0200)]
isdn: kcapi: don't build unused procfs code

The procfs file is completely unused without CONFIG_PROC_FS but causes
a compile time warning:

drivers/isdn/capi/kcapi_proc.c:97:36: error: unused variable 'seq_controller_ops' [-Werror,-Wunused-const-variable]
static const struct seq_operations seq_controller_ops = {
drivers/isdn/capi/kcapi_proc.c:104:36: error: unused variable 'seq_contrstats_ops' [-Werror,-Wunused-const-variable]
drivers/isdn/capi/kcapi_proc.c:179:36: error: unused variable 'seq_applications_ops' [-Werror,-Wunused-const-variable]
drivers/isdn/capi/kcapi_proc.c:186:36: error: unused variable 'seq_applstats_ops' [-Werror,-Wunused-const-variable]

Remove the file from the build in that config and make the calls into
it conditional instead.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20240403080702.3509288-27-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago 3c515: remove unused 'mtu' variable
Arnd Bergmann [Wed, 3 Apr 2024 08:06:23 +0000 (10:06 +0200)]
3c515: remove unused 'mtu' variable

This has never been used since the start of the git history. When building
with W=1, the unused variable produces a gcc warning:

drivers/net/ethernet/3com/3c515.c:35:18: error: 'mtu' defined but not used [-Werror=unused-const-variable=]

Just remove it.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20240403080702.3509288-6-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago trace: events: cleanup deprecated strncpy uses
Justin Stitt [Mon, 1 Apr 2024 23:48:52 +0000 (23:48 +0000)]
trace: events: cleanup deprecated strncpy uses

strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

For 2 out of 3 of these changes we can simply swap in strscpy() as it
guarantees NUL-termination which is needed for the following trace
print.

trace_rpcgss_context() should use memcpy as its format specifier %.*s
allows for the length to be specified (__entry->len). Due to this,
acceptor does not technically need to be NUL-terminated. Moreover,
swapping in strscpy() and keeping everything else the same could result
in truncation of the source string by one byte. To remedy this, we could
use `len + 1` but I am unsure of the size of the destination buffer so a
simple memcpy should suffice.
| TP_printk("win_size=%u expiry=%lu now=%lu timeout=%u acceptor=%.*s",
| __entry->window_size, __entry->expiry, __entry->now,
| __entry->timeout, __entry->len, __get_str(acceptor))

I suspect acceptor not to naturally be a NUL-terminated string due to
the presence of some stringify methods.
| .crstringify_acceptor = gss_stringify_acceptor,
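The two approaches discussed above can be compared in a userspace sketch. sketch_strscpy() only approximates the kernel helper's documented behaviour (always NUL-terminate, return -E2BIG on truncation), and sketch_print_acceptor() with its 16-byte buffer is a hypothetical stand-in for the tracepoint, not the actual change.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Approximation of strscpy(): copy at most size - 1 bytes, always
 * NUL-terminate, return the copied length or -E2BIG on truncation.
 */
static long sketch_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -E2BIG;
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	return src[i] == '\0' ? (long)i : -E2BIG;
}

/*
 * Length-carrying alternative: store exactly len bytes with memcpy() and
 * print them with "%.*s", so the buffer needs no NUL terminator.
 */
static int sketch_print_acceptor(char *out, size_t out_size,
				 const char *data, size_t len)
{
	char acceptor[16];

	if (len > sizeof(acceptor))
		len = sizeof(acceptor);
	memcpy(acceptor, data, len);	/* acceptor is not NUL-terminated */
	return snprintf(out, out_size, "acceptor=%.*s", (int)len, acceptor);
}
```

Calling sketch_strscpy() with a size equal to the source length drops the last byte to make room for the terminator, which is the one-byte truncation the message warns about; the memcpy() variant keeps all len bytes.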

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/20240401-strncpy-include-trace-events-mdio-h-v1-1-9cb5a4cda116@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago Merge branch 'mlx5e-rc2-misc-patches'
Jakub Kicinski [Sat, 6 Apr 2024 04:54:46 +0000 (21:54 -0700)]
Merge branch 'mlx5e-rc2-misc-patches'

Tariq Toukan says:

====================
mlx5e rc2 misc patches (part)

This patchset includes small features and a cleanup for the mlx5e driver.

Patches 1-2 by Cosmin implement FEC settings for 100G/lane modes.

Patch 3 is a simple cleanup.
====================

Link: https://lore.kernel.org/r/20240404173357.123307-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago net/mlx5e: Un-expose functions in en.h
Tariq Toukan [Thu, 4 Apr 2024 17:33:57 +0000 (20:33 +0300)]
net/mlx5e: Un-expose functions in en.h

Un-expose functions that are not used outside of their c file.
Make them static.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://lore.kernel.org/r/20240404173357.123307-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago net/mlx5e: Support FEC settings for 100G/lane modes
Cosmin Ratiu [Thu, 4 Apr 2024 17:33:54 +0000 (20:33 +0300)]
net/mlx5e: Support FEC settings for 100G/lane modes

This consists of:
1. Expose the 100G/lane capability bit in the PCAM reg.
2. Expose the per link mode FEC capability masks in the PPLM reg.
3. Set the overrides according to ethtool parameters.
FEC for new modes is set if and only if the PCAM 100G/lane capability is
advertised and the capability mask for a given link mode reports that it
can accept the requested FEC mode.
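The "if and only if" condition above amounts to a two-part predicate, sketched here. The struct, the enum values and the helper name are hypothetical; the real register layouts and bit encodings are not given in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical FEC mode bits, for illustration only. */
enum sketch_fec {
	SKETCH_FEC_NONE = 1u << 0,
	SKETCH_FEC_RS   = 1u << 1,
};

struct sketch_port_caps {
	bool pcam_100g_per_lane;	/* capability bit read from PCAM */
	uint32_t fec_cap_100g_lane;	/* per link mode FEC mask read from PPLM */
};

/* True only if both the PCAM bit and the per link mode mask allow it. */
static bool sketch_can_set_fec(const struct sketch_port_caps *caps,
			       uint32_t requested)
{
	if (!caps->pcam_100g_per_lane)
		return false;
	return (caps->fec_cap_100g_lane & requested) == requested;
}
```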

Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240404173357.123307-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago net/mlx5e: Extract checking of FEC support for a link mode
Cosmin Ratiu [Thu, 4 Apr 2024 17:33:53 +0000 (20:33 +0300)]
net/mlx5e: Extract checking of FEC support for a link mode

The check of whether a given FEC mode is supported in a given link mode
is about to get more complicated, so extract it in a separate function
to avoid code duplication.

Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240404173357.123307-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago bnxt_en: Fix PTP firmware timeout parameter
Michael Chan [Thu, 4 Apr 2024 19:55:00 +0000 (12:55 -0700)]
bnxt_en: Fix PTP firmware timeout parameter

Use the correct tmo_us microsecond parameter for the PTP firmware
timeout parameter.

Fixes: 7de3c2218eed ("bnxt_en: Add a timeout parameter to bnxt_hwrm_port_ts_query()")
Reported-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20240404195500.171071-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago Merge branch 'net-dsa-microchip-ksz8-refactor-fdb-dump-path'
Jakub Kicinski [Fri, 5 Apr 2024 02:08:46 +0000 (19:08 -0700)]
Merge branch 'net-dsa-microchip-ksz8-refactor-fdb-dump-path'

Oleksij Rempel says:

====================
net: dsa: microchip: ksz8: refactor FDB dump path

Refactor FDB dump code path for Microchip KSZ8xxx series. This series
mostly makes some cosmetic reworks and allows errors detected by the
regmap to be forwarded.

Change logs are part of patch commit messages.
====================

Link: https://lore.kernel.org/r/20240403125039.3414824-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago net: dsa: microchip: ksz8_r_dyn_mac_table(): use entries variable to signal 0 entries
Oleksij Rempel [Wed, 3 Apr 2024 12:50:39 +0000 (14:50 +0200)]
net: dsa: microchip: ksz8_r_dyn_mac_table(): use entries variable to signal 0 entries

We already have a variable that provides the number of entries, so use
it instead of an error number.
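The convention can be sketched as follows: the return value carries only failures, and an empty table is reported through the out-parameter. struct sketch_table and sketch_read_dyn_entry() are hypothetical and stand in for the hardware access, which is not shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical table state standing in for the switch registers. */
struct sketch_table {
	uint16_t valid_entries;
	int io_error;		/* 0, or a negative errno from the bus */
};

/* Return 0 or a negative errno; the entry count travels in *entries. */
static int sketch_read_dyn_entry(const struct sketch_table *t,
				 uint16_t *entries)
{
	if (t->io_error)
		return t->io_error;

	*entries = t->valid_entries;	/* 0 means "empty", not a failure */
	return 0;
}
```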

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20240403125039.3414824-9-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago net: dsa: microchip: ksz8_r_dyn_mac_table(): return read/write error if we got any
Oleksij Rempel [Wed, 3 Apr 2024 12:50:38 +0000 (14:50 +0200)]
net: dsa: microchip: ksz8_r_dyn_mac_table(): return read/write error if we got any

The read/write path may fail, so return the error if one occurs.

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20240403125039.3414824-8-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago net: dsa: microchip: ksz8_r_dyn_mac_table(): ksz: do not return EAGAIN on timeout
Oleksij Rempel [Wed, 3 Apr 2024 12:50:37 +0000 (14:50 +0200)]
net: dsa: microchip: ksz8_r_dyn_mac_table(): ksz: do not return EAGAIN on timeout

EAGAIN was not used by the previous code and is not used by the current
code, so remove it and use a proper error value.

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20240403125039.3414824-7-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago net: dsa: microchip: ksz8: Unify variable naming in ksz8_r_dyn_mac_table()
Oleksij Rempel [Wed, 3 Apr 2024 12:50:36 +0000 (14:50 +0200)]
net: dsa: microchip: ksz8: Unify variable naming in ksz8_r_dyn_mac_table()

Use 'ret' instead of 'rc' in ksz8_r_dyn_mac_table() to maintain
consistency with the rest of the file.

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20240403125039.3414824-6-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago net: dsa: microchip: ksz8: Refactor ksz8_r_dyn_mac_table() for readability
Oleksij Rempel [Wed, 3 Apr 2024 12:50:35 +0000 (14:50 +0200)]
net: dsa: microchip: ksz8: Refactor ksz8_r_dyn_mac_table() for readability

Move the code out of a long if statement scope in ksz8_r_dyn_mac_table()
to improve code readability.

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20240403125039.3414824-5-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago net: dsa: microchip: ksz8: Refactor ksz8_fdb_dump()
Oleksij Rempel [Wed, 3 Apr 2024 12:50:34 +0000 (14:50 +0200)]
net: dsa: microchip: ksz8: Refactor ksz8_fdb_dump()

Refactor ksz8_fdb_dump() to address potential issues:
- Limit the number of iterations to avoid endless loops.
- Handle error codes returned by ksz8_r_dyn_mac_table(), with
  an exception for -ENXIO when no more dynamic entries are detected.
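A minimal sketch of the two points above, assuming a hypothetical reader callback and table size (neither is the driver's actual interface): the loop has a fixed upper bound, -ENXIO ends the dump without an error, and any other negative value is passed up to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TABLE_SIZE 8	/* hypothetical bound on dynamic entries */

/* Hypothetical reader: 0 on success, -ENXIO past the last entry,
 * any other negative errno on an I/O failure. */
typedef int (*sketch_read_fn)(void *ctx, uint16_t index);

static int sketch_fdb_dump(sketch_read_fn read_entry, void *ctx,
			   unsigned int *dumped)
{
	uint16_t i;

	*dumped = 0;
	for (i = 0; i < SKETCH_TABLE_SIZE; i++) {	/* bounded loop */
		int ret = read_entry(ctx, i);

		if (ret == -ENXIO)
			return 0;	/* table exhausted: not an error */
		if (ret)
			return ret;	/* propagate real failures */
		(*dumped)++;
	}
	return 0;
}

/* Example readers used to exercise the three outcomes. */
static int sketch_three_entries(void *ctx, uint16_t index)
{
	(void)ctx;
	return index < 3 ? 0 : -ENXIO;
}

static int sketch_io_error(void *ctx, uint16_t index)
{
	(void)ctx;
	return index < 1 ? 0 : -EIO;
}

static int sketch_never_ends(void *ctx, uint16_t index)
{
	(void)ctx;
	(void)index;
	return 0;	/* never reports the end of the table */
}
```

With sketch_never_ends() the dump still terminates after SKETCH_TABLE_SIZE reads, which is the purpose of the iteration limit.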

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Link: https://lore.kernel.org/r/20240403125039.3414824-4-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago net: dsa: microchip: Make ksz8_r_dyn_mac_table() static
Oleksij Rempel [Wed, 3 Apr 2024 12:50:33 +0000 (14:50 +0200)]
net: dsa: microchip: Make ksz8_r_dyn_mac_table() static

ksz8_r_dyn_mac_table() is not used outside the source file. Make it
static.

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20240403125039.3414824-3-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago net: dsa: microchip: Remove unused FDB timestamp support in ksz8_r_dyn_mac_table()
Oleksij Rempel [Wed, 3 Apr 2024 12:50:32 +0000 (14:50 +0200)]
net: dsa: microchip: Remove unused FDB timestamp support in ksz8_r_dyn_mac_table()

The FDB timestamps are not being utilized. This commit removes the
unused timestamp support from ksz8_r_dyn_mac_table() function.

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20240403125039.3414824-2-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago Merge branch 'add-starfive-jh8100-dwmac-support'
Jakub Kicinski [Fri, 5 Apr 2024 02:07:46 +0000 (19:07 -0700)]
Merge branch 'add-starfive-jh8100-dwmac-support'

Tan Chun Hau says:

====================
Add StarFive JH8100 dwmac support

Add StarFive JH8100 dwmac support.
The JH8100 dwmac shares the same driver code as the JH7110 dwmac
and has only one reset signal.

The reset signals used by each SoC are listed below:

  JH8100: reset-names = "stmmaceth";
  JH7110: reset-names = "stmmaceth", "ahb";
  JH7100: reset-names = "ahb";

Example usage of JH8100 in the device tree:

gmac0: ethernet@16030000 {
        compatible = "starfive,jh8100-dwmac",
                     "starfive,jh7110-dwmac",
                     "snps,dwmac-5.20";
        ...
};

Changes in v6:
- Removed unnecessary enum "starfive,jh8100-dwmac".
====================

Link: https://lore.kernel.org/r/20240403100549.78719-1-chunhau.tan@starfivetech.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago dt-bindings: net: starfive,jh7110-dwmac: Add StarFive JH8100 support
Tan Chun Hau [Wed, 3 Apr 2024 10:05:49 +0000 (03:05 -0700)]
dt-bindings: net: starfive,jh7110-dwmac: Add StarFive JH8100 support

Add StarFive JH8100 dwmac support.
The JH8100 dwmac shares the same driver code as the JH7110 dwmac
and has only one reset signal.

The reset signals used by each SoC are listed below:

  JH8100: reset-names = "stmmaceth";
  JH7110: reset-names = "stmmaceth", "ahb";
  JH7100: reset-names = "ahb";

Example usage of JH8100 in the device tree:

gmac0: ethernet@16030000 {
        compatible = "starfive,jh8100-dwmac",
                     "starfive,jh7110-dwmac",
                     "snps,dwmac-5.20";
        ...
};

Signed-off-by: Tan Chun Hau <chunhau.tan@starfivetech.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20240403100549.78719-2-chunhau.tan@starfivetech.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Fri, 5 Apr 2024 00:03:18 +0000 (17:03 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR.

Conflicts:

net/ipv4/ip_gre.c
  17af420545a7 ("erspan: make sure erspan_base_hdr is present in skb->head")
  5832c4a77d69 ("ip_tunnel: convert __be16 tunnel flags to bitmaps")
https://lore.kernel.org/all/20240402103253.3b54a1cf@canb.auug.org.au/

Adjacent changes:

net/ipv6/ip6_fib.c
  d21d40605bca ("ipv6: Fix infinite recursion in fib6_dump_done().")
  5fc68320c1fb ("ipv6: remove RTNL protection from inet6_dump_fib()")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
18 months ago Merge tag 'net-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 4 Apr 2024 21:49:10 +0000 (14:49 -0700)]
Merge tag 'net-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, bluetooth and bpf.

  Fairly usual collection of driver and core fixes. The large selftest
  accompanying one of the fixes is also becoming a common occurrence.

  Current release - regressions:

   - ipv6: fix infinite recursion in fib6_dump_done()

   - net/rds: fix possible null-deref in newly added error path

  Current release - new code bugs:

   - net: do not consume a full cacheline for system_page_pool

   - bpf: fix bpf_arena-related file descriptor leaks in the verifier

   - drv: ice: fix freeing uninitialized pointers, fixing misuse of the
     newfangled __free() auto-cleanup

  Previous releases - regressions:

   - x86/bpf: fixes the BPF JIT with retbleed=stuff

   - xen-netfront: add missing skb_mark_for_recycle, fix page pool
     accounting leaks, revealed by recently added explicit warning

   - tcp: fix bind() regression for v6-only wildcard and v4-mapped-v6
     non-wildcard addresses

   - Bluetooth:
      - replace "hci_qca: Set BDA quirk bit if fwnode exists in DT" with
        better workarounds to un-break some buggy Qualcomm devices
      - set conn encrypted before conn establishes, fix re-connecting to
        some headsets which use slightly unusual sequence of msgs

   - mptcp:
      - prevent BPF accessing lowat from a subflow socket
      - don't account accept() of non-MPC client as fallback to TCP

   - drv: mana: fix Rx DMA datasize and skb_over_panic

   - drv: i40e: fix VF MAC filter removal

  Previous releases - always broken:

   - gro: various fixes related to UDP tunnels - netns crossing
     problems, incorrect checksum conversions, and incorrect packet
     transformations which may lead to panics

   - bpf: support deferring bpf_link dealloc to after RCU grace period

   - nf_tables:
      - release batch on table validation from abort path
      - release mutex after nft_gc_seq_end from abort path
      - flush pending destroy work before exit_net release

   - drv: r8169: skip DASH fw status checks when DASH is disabled"
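The "freeing uninitialized pointers" item in the list above refers to a hazard of scope-based cleanup, which can be sketched in userspace with the GCC/Clang cleanup attribute that the kernel's __free() builds on. SKETCH_AUTO_FREE, the handler and the function below are hypothetical and are not the ice driver code.

```c
#include <assert.h>
#include <stdlib.h>

static int sketch_free_calls;	/* counts pointers actually released */

/* Cleanup handler: receives a pointer to the annotated variable. */
static void sketch_free_ptr(char **p)
{
	if (*p) {
		free(*p);
		sketch_free_calls++;
	}
}

/* GCC/Clang extension: run the handler when the variable leaves scope. */
#define SKETCH_AUTO_FREE __attribute__((cleanup(sketch_free_ptr)))

static int sketch_use_buffer(int fail_early)
{
	/*
	 * Initialise to NULL at the declaration. Without the initialiser,
	 * the early return below would hand an indeterminate pointer to
	 * the cleanup handler.
	 */
	SKETCH_AUTO_FREE char *buf = NULL;

	if (fail_early)
		return -1;	/* handler runs, sees NULL, frees nothing */

	buf = malloc(16);
	if (!buf)
		return -1;
	buf[0] = '\0';
	return 0;		/* handler frees buf on the way out */
}
```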

* tag 'net-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits)
  netfilter: validate user input for expected length
  net/sched: act_skbmod: prevent kernel-infoleak
  net: usb: ax88179_178a: avoid the interface always configured as random address
  net: dsa: sja1105: Fix parameters order in sja1110_pcs_mdio_write_c45()
  net: ravb: Always update error counters
  net: ravb: Always process TX descriptor ring
  netfilter: nf_tables: discard table flag update with pending basechain deletion
  netfilter: nf_tables: Fix potential data-race in __nft_flowtable_type_get()
  netfilter: nf_tables: reject new basechain after table flag update
  netfilter: nf_tables: flush pending destroy work before exit_net release
  netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path
  netfilter: nf_tables: release batch on table validation from abort path
  Revert "tg3: Remove residual error handling in tg3_suspend"
  tg3: Remove residual error handling in tg3_suspend
  net: mana: Fix Rx DMA datasize and skb_over_panic
  net/sched: fix lockdep splat in qdisc_tree_reduce_backlog()
  net: phy: micrel: lan8814: Fix when enabling/disabling 1-step timestamping
  net: stmmac: fix rx queue priority assignment
  net: txgbe: fix i2c dev name cannot match clkdev
  net: fec: Set mac_managed_pm during probe
  ...

18 months ago Merge tag 'bcachefs-2024-04-03' of https://evilpiepirate.org/git/bcachefs
Linus Torvalds [Thu, 4 Apr 2024 21:36:32 +0000 (14:36 -0700)]
Merge tag 'bcachefs-2024-04-03' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs repair code from Kent Overstreet:
 "A couple more small fixes, and new repair code.

  We can now automatically recover from arbitrary corrupted interior
  btree nodes by scanning, and we can reconstruct metadata as needed to
  bring a filesystem back into a working, consistent, read-write state
  and preserve access to whatever wasn't corrupted.

  Meaning - you can blow away all metadata except for extents and
  dirents leaf nodes, and repair will reconstruct everything else and
  give you your data, and under the correct paths. If inodes are missing
  i_size will be slightly off and permissions/ownership/timestamps will
  be gone, and we do still need the snapshots btree if snapshots were in
  use - in the future we'll be able to guess the snapshot tree structure
  in some situations.

  IOW - aside from shaking out remaining bugs (fuzz testing is still
  coming), repair code should be complete and if repair ever doesn't
  work that's the highest priority bug that I want to know about
  immediately.

  This patchset was kindly tested by a user from India who accidentally
  wiped one drive out of a three drive filesystem with no replication on
  the family computer - it took a couple weeks but we got everything
  important back"

* tag 'bcachefs-2024-04-03' of https://evilpiepirate.org/git/bcachefs:
  bcachefs: reconstruct_inode()
  bcachefs: Subvolume reconstruction
  bcachefs: Check for extents that point to same space
  bcachefs: Reconstruct missing snapshot nodes
  bcachefs: Flag btrees with missing data
  bcachefs: Topology repair now uses nodes found by scanning to fill holes
  bcachefs: Repair pass for scanning for btree nodes
  bcachefs: Don't skip fake btree roots in fsck
  bcachefs: bch2_btree_root_alloc() -> bch2_btree_root_alloc_fake()
  bcachefs: Etyzinger cleanups
  bcachefs: bch2_shoot_down_journal_keys()
  bcachefs: Clear recovery_passes_required as they complete without errors
  bcachefs: ratelimit informational fsck errors
  bcachefs: Check for bad needs_discard before doing discard
  bcachefs: Improve bch2_btree_update_to_text()
  mean_and_variance: Drop always failing tests
  bcachefs: fix nocow lock deadlock
  bcachefs: BCH_WATERMARK_interior_updates
  bcachefs: Fix btree node reserve