Roger Quadros [Tue, 1 Aug 2023 09:14:24 +0000 (14:44 +0530)]
net: ti: icssg-prueth: Add ICSSG ethernet driver
This is the Ethernet driver for TI AM654 Silicon rev. 2
with the ICSSG PRU Sub-system running dual-EMAC firmware.
The Programmable Real-time Unit and Industrial Communication Subsystem
Gigabit (PRU_ICSSG) is a low-latency microcontroller subsystem in the TI
SoCs. This subsystem is provided for the use cases like implementation of
custom peripheral interfaces, offloading of tasks from the other
processor cores of the SoC, etc.
Every ICSSG core has two Programmable Real-Time Unit(PRUs),
two auxiliary Real-Time Transfer Unit (RT_PRUs), and
two Transmit Real-Time Transfer Units (TX_PRUs). Each one of these runs
its own firmware. Every ICSSG core has two MII ports connect to these
PRUs and also a MDIO port.
The cores can run different firmwares to support different protocols and
features like switch-dev, timestamping, etc.
It uses System DMA to transfer and receive packets and
shared memory register emulation between the firmware and
driver for control and configuration.
This patch adds support for basic EMAC functionality with 1Gbps
and 100Mbps link speed. 10M and half duplex mode are not supported
currently as they require IEP, the support for which will be added later.
Support for switch-dev, timestamp, etc. will be added later
by subsequent patch series.
Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Add a YAML binding document for the ICSSG Programmable real time unit
based Ethernet hardware. The ICSSG driver uses the PRU and PRUSS consumer
APIs to interface the PRUs and load/run the firmware for supporting
ethernet functionality.
Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ti: icssg-prueth: Add Firmware config and classification APIs.
Add icssg_config.h / .c and icssg_classifier.c files. These are firmware
configuration and classification related files. These will be used by
ICSSG ethernet driver.
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ti: icssg-prueth: Add mii helper apis and macros
Add MII helper APIs and MACROs. These APIs and MACROs will be later used
by ICSSG Ethernet driver. Also introduce icssg_prueth.h which has
definition of prueth related structures.
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ante Knezic [Tue, 1 Aug 2023 06:48:15 +0000 (08:48 +0200)]
net: dsa: mv88e6xxx: Add erratum 3.14 for 88E6390X and 88E6190X
Fixes XAUI/RXAUI lane alignment errors.
Issue causes dropped packets when trying to communicate over
fiber via SERDES lanes of port 9 and 10.
Errata document applies only to 88E6190X and 88E6390X devices.
Requires poking in undocumented registers.
Signed-off-by: Ante Knezic <ante.knezic@helmholz.de> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 2 Aug 2023 09:09:32 +0000 (10:09 +0100)]
Merge branch 'tc-flower-SPI'
Ratheesh Kannoth says:
====================
Packet classify by matching against SPI
1. net: flow_dissector: Add IPSEC dissector.
Flow dissector patch reads IPSEC headers (ESP or AH) header
from packet and retrieves the SPI header.
2. tc: flower: support for SPI.
TC control path changes to pass SPI field from userspace to
kernel.
3. tc: flower: Enable offload support IPSEC SPI field.
Next patch enables the HW support for classify offload for ESP/AH.
This patch enables the HW offload control.
4. octeontx2-pf: TC flower offload support for SPI field.
HW offload support for classification in octeontx2 driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
With [1] removing MPCore SMP support, this makes the OX820 barely usable,
associated with a clear lack of maintainance, development and migration to
dt-schema it's clear that Linux support for OX810 and OX820 should be removed.
In addition, the OX810 hasn't been booted for years and isn't even present
in an ARM config file.
For the OX820, lack of USB and SATA support makes the platform not usable
in the current Linux support and relies on off-tree drivers hacked from the
vendor (defunct for years) sources.
The last users are in the OpenWRT distribution, and today's removal means
support will still be in stable 6.1 LTS kernel until end of 2026.
If someone wants to take over the development even with lack of SMP, I'll
be happy to hand off maintainance.
It has been a fun time adding support for this architecture, but it's time
to get over!
This patchset only removes net changes, and is derived from:
https://lore.kernel.org/r/20230630-topic-oxnas-upstream-remove-v2-0-fb6ab3dea87c@linaro.org
---
Changes in v3:
- Removed applied changes
- Added Andy's tags
- Reduced for net
- Link to v2: https://lore.kernel.org/r/20230630-topic-oxnas-upstream-remove-v2-0-fb6ab3dea87c@linaro.org
Changes in v2:
- s/maintainance/maintenance/
- added acked/review tags
- dropped already applied patches
- drop RFC
- Link to v1: https://lore.kernel.org/r/20230331-topic-oxnas-upstream-remove-v1-0-5bd58fd1dd1f@linaro.org
====================
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Due to lack of maintenance and stall of development for a few years now,
and since no new features will ever be added upstream, remove the
OX810 and OX820 dwmac glue.
Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Daniel Golle <daniel@makrotopia.org> Acked-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Due to lack of maintenance and stall of development for a few years now,
and since no new features will ever be added upstream, remove support
for OX810 and OX820 ethernet.
Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Daniel Golle <daniel@makrotopia.org> Acked-by: Andy Shevchenko <andy@kernel.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 2 Aug 2023 08:18:18 +0000 (09:18 +0100)]
Merge branch 'selftests-mlxsw'
Petr Machata says:
====================
selftests: New selftests for out-of-order-operations patches in mlxsw
In the past, the mlxsw driver made the assumption that the user applies
configuration in a bottom-up manner. Thus netdevices needed to be added to
the bridge before IP addresses were configured on that bridge or SVI added
on top of it, because whatever happened before a netdevice was mlxsw upper
was generally ignored by mlxsw. Recently, several patch series were pushed
to introduce the bookkeeping and replays necessary to offload the full
state, not just the immediate configuration step.
In this patchset, introduce new selftests that directly exercise the out of
order code paths in mlxsw.
- Patch #1 adds new tests into the existing selftest router_bridge.sh.
- Patches #2-#5 add new generic selftests.
- Patches #6-#8 add new mlxsw-specific selftests.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 31 Jul 2023 15:47:22 +0000 (17:47 +0200)]
selftests: mlxsw: rif_bridge: Add a new selftest
This test verifies driver behavior with regards to creation of RIFs for a
bridge as LAGs are added or removed to/from it, and ports added or removed
to/from the LAG.
Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 31 Jul 2023 15:47:21 +0000 (17:47 +0200)]
selftests: mlxsw: rif_lag_vlan: Add a new selftest
This test verifies driver behavior with regards to creation of RIFs for LAG
VLAN uppers as ports are added or removed to/from the LAG.
Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 31 Jul 2023 15:47:20 +0000 (17:47 +0200)]
selftests: mlxsw: rif_lag: Add a new selftest
This test verifies driver behavior with regards to creation of RIFs for a
LAG as ports are added or removed to/from it.
Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 31 Jul 2023 15:47:19 +0000 (17:47 +0200)]
selftests: router_bridge_1d_lag: Add a new selftest
Add a selftest to verify that routing through several bridges works when
LAG VLANs are used instead of physical ports, and that routing through LAG
VLANs themselves works as physical ports are de/enslaved.
Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 31 Jul 2023 15:47:18 +0000 (17:47 +0200)]
selftests: router_bridge_lag: Add a new selftest
Add a selftest to verify that routing through a bridge works when LAG is
used instead of physical ports.
Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 31 Jul 2023 15:47:17 +0000 (17:47 +0200)]
selftests: router_bridge_vlan_upper: Add a new selftest
Add a selftest that verifies routing through VLAN bridge uppers.
Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 31 Jul 2023 15:47:16 +0000 (17:47 +0200)]
selftests: router_bridge_1d: Add a new selftest
Add a selftest to verify that routing through a 1d bridge works when VLAN
upper of a physical port is used instead of a physical port. Also verify
that when a port is attached to an already-configured bridge, the
configuration is applied.
Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 31 Jul 2023 15:47:15 +0000 (17:47 +0200)]
selftests: router_bridge: Add remastering tests
Add two tests to deslave a port from and reenslave to a bridge. This should
retain the ability of the system to forward traffic, but on an offloading
driver that is sensitive to ordering of operations, it might not.
The first test does this configuration in a way that relies on
vlan_default_pvid to assign the PVID. The second test disables that
autoconfiguration and configures PVID by hand in a separate step.
Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rohan G Thomas [Mon, 31 Jul 2023 11:50:41 +0000 (19:50 +0800)]
net: stmmac: XGMAC support for mdio C22 addr > 3
For XGMAC versions < 2.2 number of supported mdio C22 addresses is
restricted to 3. From XGMAC version 2.2 there are no restrictions on
the C22 addresses, it supports all valid mdio addresses(0 to 31).
Signed-off-by: Rohan G Thomas <rohan.g.thomas@intel.com> Acked-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 2 Aug 2023 04:06:27 +0000 (21:06 -0700)]
Merge branch 'add-tja1120-support'
Radu Pirea says:
====================
Add TJA1120 support
This patch series got bigger than I expected. It cleans up the
next-c45-tja11xx driver and adds support for the TJA1120(1000BaseT1
automotive phy).
Master/slave custom implementation was replaced with the generic
implementation (genphy_c45_config_aneg/genphy_c45_read_status).
The TJA1120 and TJA1103 are a bit different when it comes to the PTP
interface. The timestamp read procedure was changed, some addresses were
changed and some bits were moved from one register to another. Adding
TJA1120 support was tricky, and I tried not to duplicate the code. If
something looks too hacky to you, I am open to suggestions.
====================
net: phy: nxp-c45-tja11xx: reset PCS if the link goes down
During PTP testing on early TJA1120 engineering samples I observed that
if the link is lost and recovered, the tx timestamps will be randomly
lost. To avoid this HW issue, the PCS should be reset.
Resetting the PCS will break the link and we should reset the PCS on
LINK UP -> LINK DOWN transition, otherwise we will trigger and infinite
loop of LINK UP -> LINK DOWN events.
net: phy: nxp-c45-tja11xx: read ext trig ts on TJA1120
On TJA1120, the external trigger timestamp now has a VALID bit. This
changes the logic and we can't use the TJA1103 procedure.
For TJA1103, we can always read a valid timestamp from the registers,
compare the new timestamp with the old timestamp and, if they are not the
same, an event occurred. This logic cannot be applied for TJA1120 because
the timestamp is 0 if the VALID bit is not set.
TJA1120 and TJA1103 have a set of functional safety hardware tests
executed after every reset, and when the tests are done, the IRQ line is
asserted. For the moment, the purpose of these handlers is to acknowledge
the IRQ and not to check the FUSA tests status.
net: phy: nxp-c45-tja11xx: read egress ts on TJA1120
The egress timestamp FIFO/circular buffer work different on TJA1120 than
TJA1103.
For TJA1103 the new timestamp should be manually moved from the FIFO to
the hardware buffer before checking if the timestamp is valid.
For TJA1120 the hardware will move automatically the new timestamp
from the FIFO to the buffer and the user should check the valid bit, read
the timestamp and unlock the buffer by writing any of the buffer
registers(which are read only).
Another change for the TJA1120 is the behaviour of the EGR TS IRQ bit.
This bit was a self-clear bit for TJA1103, but now should be cleared
before reading the timestamp.
PHY_BASIC_T1_FEATURES are not the right features supported by TJA1103
anymore.
For example ethtool reports:
[root@alarm ~]# ethtool end0
Settings for end0:
Supported ports: [ TP ]
Supported link modes: 100baseT1/Full
10baseT1L/Full
10baseT1L/Full is not supported by TJA1103 and supported ports list is
not completed. The PHY also have a MII port.
genphy_c45_pma_read_abilities implementation can detect the PHY features
and they look like this.
[root@alarm ~]# ethtool end0
Settings for end0:
Supported ports: [ TP MII ]
Supported link modes: 100baseT1/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: 100baseT1/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: 100Mb/s
Duplex: Full
Auto-negotiation: off
master-slave cfg: forced master
master-slave status: master
Port: Twisted Pair
PHYAD: 1
Transceiver: external
MDI-X: Unknown
Supports Wake-on: g
Wake-on: d
Link detected: yes
SQI: 7/7
net: phy: nxp-c45-tja11xx: prepare the ground for TJA1120
Between TJA1120 and TJA1103 the hardware was improved, but some register
addresses were changed and some bit fields were moved from one register
to another.
Introduce the nxp_c45_reg_field structure and its associated functions to
abstract the differences between the PHYs.
Remove the defined bits and register addresses that are not common
between TJA1103 and TJA1120 and replace them with reg_fields and
register addresses from phydev->drv->driver_data.
net: phy: nxp-c45-tja11xx: use phylib master/slave implementation
Remove the custom implementation of master/save setup and read status
and use genphy_c45_config_aneg and genphy_c45_read_status since phylib
has support for master/slave setup and master/slave status.
====================
virtio_net: add per queue interrupt coalescing support
Currently, coalescing parameters are grouped for all transmit and receive
virtqueues. This patch series add support to set or get the parameters for
a specified virtqueue.
When the traffic between virtqueues is unbalanced, for example, one virtqueue
is busy and another virtqueue is idle, then it will be very useful to
control coalescing parameters at the virtqueue granularity.
Example command:
$ ethtool -Q eth5 queue_mask 0x1 --coalesce tx-packets 10
Would set max_packets=10 to VQ 1.
$ ethtool -Q eth5 queue_mask 0x1 --coalesce rx-packets 10
Would set max_packets=10 to VQ 0.
$ ethtool -Q eth5 queue_mask 0x1 --show-coalesce
Queue: 0
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0
Gavin Li [Mon, 31 Jul 2023 07:06:55 +0000 (10:06 +0300)]
virtio_net: support per queue interrupt coalesce command
Add interrupt_coalesce config in send_queue and receive_queue to cache user
config.
Send per virtqueue interrupt moderation config to underlying device in
order to have more efficient interrupt moderation and cpu utilization of
guest VM.
Additionally, address all the VQs when updating the global configuration,
as now the individual VQs configuration can diverge from the global
configuration.
Signed-off-by: Gavin Li <gavinl@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20230731070656.96411-3-gavinl@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net/macmace: Replace zero-length array with DECLARE_FLEX_ARRAY() helper
Since zero-length arrays are deprecated, we are replacing
them with C99 flexible-array members. As a result, instead
of declaring a zero-length array, use the new
DECLARE_FLEX_ARRAY() helper macro.
This fixes warnings such as:
./drivers/net/ethernet/apple/macmace.c:80:4-8: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
Christian Marangi [Sun, 30 Jul 2023 07:41:11 +0000 (09:41 +0200)]
net: dsa: qca8k: limit user ports access to the first CPU port on setup
In preparation for multi-CPU support, set CPU port LOOKUP MEMBER outside
the port loop and setup the LOOKUP MEMBER mask for user ports only to
the first CPU port.
This is to handle flooding condition where every CPU port is set as
target and prevent packet duplication for unknown frames from user ports.
Secondary CPU port LOOKUP MEMBER mask will be setup later when
port_change_master will be implemented.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20230730074113.21889-3-ansuelsmth@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Christian Marangi [Sun, 30 Jul 2023 07:41:10 +0000 (09:41 +0200)]
net: dsa: qca8k: make learning configurable and keep off if standalone
Address learning should initially be turned off by the driver for port
operation in standalone mode, then the DSA core handles changes to it
via ds->ops->port_bridge_flags().
Currently this is not the case for qca8k where learning is enabled
unconditionally in qca8k_setup for every user port.
Handle ports configured in standalone mode by making the learning
configurable and not enabling it by default.
Implement .port_pre_bridge_flags and .port_bridge_flags dsa ops to
enable learning for bridge that request it and tweak
.port_stp_state_set to correctly disable learning when port is
configured in standalone mode.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20230730074113.21889-2-ansuelsmth@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
====================
net/sched: improve class lifetime handling
Valis says[0]:
============
Three classifiers (cls_fw, cls_u32 and cls_route) always copy
tcf_result struct into the new instance of the filter on update.
This causes a problem when updating a filter bound to a class,
as tcf_unbind_filter() is always called on the old instance in the
success path, decreasing filter_cnt of the still referenced class
and allowing it to be deleted, leading to a use-after-free.
============
Turns out these could have been spotted easily with proper warnings.
Improve the current class lifetime with wrappers that check for
overflow/underflow.
While at it add an extack for when a class in use is deleted.
Pedro Tammela [Fri, 28 Jul 2023 15:35:37 +0000 (12:35 -0300)]
net/sched: sch_qfq: warn about class in use while deleting
Add extack to warn that delete was rejected because
the class is still in use
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pedro Tammela [Fri, 28 Jul 2023 15:35:36 +0000 (12:35 -0300)]
net/sched: sch_htb: warn about class in use while deleting
Add extack to warn that delete was rejected because
the class is still in use
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pedro Tammela [Fri, 28 Jul 2023 15:35:35 +0000 (12:35 -0300)]
net/sched: sch_hfsc: warn about class in use while deleting
Add extack to warn that delete was rejected because
the class is still in use
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pedro Tammela [Fri, 28 Jul 2023 15:35:34 +0000 (12:35 -0300)]
net/sched: sch_drr: warn about class in use while deleting
Add extack to warn that delete was rejected because
the class is still in use
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pedro Tammela [Fri, 28 Jul 2023 15:35:33 +0000 (12:35 -0300)]
net/sched: wrap open coded Qdics class filter counter
The 'filter_cnt' counter is used to control a Qdisc class lifetime.
Each filter referecing this class by its id will eventually
increment/decrement this counter in their respective
'add/update/delete' routines.
As these operations are always serialized under rtnl lock, we don't
need an atomic type like 'refcount_t'.
It also means that we lose the overflow/underflow checks already
present in refcount_t, which are valuable to hunt down bugs
where the unsigned counter wraps around as it aids automated tools
like syzkaller to scream in such situations.
Wrap the open coded increment/decrement into helper functions and
add overflow checks to the operations.
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
====================
mptcp: cleanup and improvements in the selftests
This small series of 4 patches adds some improvements in MPTCP
selftests:
- Patch 1 reworks the detailed report of mptcp_join.sh selftest to
better display what went well or wrong per test.
- Patch 2 adds colours (if supported, forced and/or not disabled) in
mptcp_join.sh selftest output to help spotting issues.
- Patch 3 modifies an MPTCP selftest tool to interact with the
path-manager via Netlink to always look for errors if any. This makes
sure odd behaviours can be seen in the logs and errors can be caught
later if needed.
- Patch 4 removes stdout and stderr redirections to /dev/null when using
pm_nl_ctl if no errors are expected in order to log odd behaviours.
====================
All pm_nl_ctl commands were muted. If there was an unexpected error with
one of them, this was simply not visible in the logs, making the
analysis very hard. It could also hide misuse of commands by mistake.
Now the output is only muted when we do expect to have an error, e.g.
when giving invalid arguments on purpose.
selftests: mptcp: pm_nl_ctl: always look for errors
If a Netlink command for the MPTCP path-managers is not valid, it is
important to check if there are errors. If yes, they need to be reported
instead of being ignored and exiting without errors.
Now if no replies are expected, an ACK from the kernelspace is asked by
the userspace in order to always expect a reply. We can use the same
buffer that is currently always >1024 bytes. Then we can check if there
is an error (err->error), print it if any and report the error.
After this modification, it is required to mute expected errors in
mptcp_join.sh and pm_netlink.sh selftests:
- when trying to add a bad endpoint, e.g. duplicated
- when trying to set the two limits above the hard limit
This patch modifies how the detailed results are printed, mainly to
improve what is displayed in case of issue:
- Now the test name (title) is printed earlier, when starting the test
if it is not intentionally skipped: by doing that, errors linked to
a test will be printed after having written the test name and then
avoid confusions.
- Due to the previous item, it is required to add a new line after
having printed the test name because in case of error with a command,
it is better not to have the output in the middle of the screen.
- Each check is printed on a dedicated line with aligned status (ok,
skip, fail): it is easier to spot which one has failed, simpler to
manage in the code not having to deal with alignment case by case and
helpers can be used to uniform what is done. These helpers can also be
useful later to do more actions depending on the results or change in
one place what is printed.
- Info messages have been reduced and aligned as well. And info messages
about the creation of the default test files of 1 KB are no longer
printed.
Example:
001 no JOIN
syn [ ok ]
synack [ ok ]
ack [ ok ]
Or with a skip and a failure:
001 no JOIN
syn [ ok ]
synack [fail] got 42 JOIN[s] synack expected 0
Server ns stats
(...)
Client ns stats
(...)
ack [skip]
Or with info:
104 Infinite map
Test file (size 128 KB) for client
Test file (size 128 KB) for server
file received by server has inverted byte at 169
5 corrupted pkts
syn [ ok ]
synack [ ok ]
While at it, verify_listener_events() now also print more info in case
of failure and in pm_nl_check_endpoint(), the test is marked as failed
instead of skipped if no ID has been given (internal selftest issue).
The syzbot reproducer triggered a case that the qdisc refcnt is not
zero during dev_shutdown().
tcx_uninstall() will then WARN_ON_ONCE(tcx_entry(entry)->miniq_active)
because the miniq is still active and the entry should not be freed.
The latter assumed that qdisc destruction happens before tcx teardown.
This fix is to avoid tcx_uninstall() doing tcx_entry_free() when the
miniq is still alive and let the clsact_destroy() do the free later, so
that we do not assume any specific ordering for either of them.
If still active, tcx_uninstall() does clear the entry when flushing out
the prog/link. clsact_destroy() will then notice the "!tcx_entry_is_active()"
and then does the tcx_entry_free() eventually.
Fixes: e420bed02507 ("bpf: Add fd-based tcx multi-prog infra with link support") Reported-by: syzbot+376a289e86a0fd02b9ba@syzkaller.appspotmail.com Reported-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Co-developed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: syzbot+376a289e86a0fd02b9ba@syzkaller.appspotmail.com Tested-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/222255fe07cb58f15ee662e7ee78328af5b438e4.1690549248.git.daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
commit f9aab6f2ce57 ("net/smc: immediate freeing in smc_lgr_cleanup_early()")
left behind smc_lgr_schedule_free_work_fast() declaration.
And since commit 349d43127dac ("net/smc: fix kernel panic caused by race of smc_sock")
smc_ib_modify_qp_reset() is not used anymore.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Link: https://lore.kernel.org/r/20230729121929.17180-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jan Sokolowski [Fri, 28 Jul 2023 17:13:36 +0000 (10:13 -0700)]
i40e: remove i40e_status
Replace uses of i40e_status to as equivalent as possible error codes.
Remove enum i40e_status as it is no longer needed
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230728171336.2446156-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
commit 8a59f9d1e3d4 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()")
left behind tcp_bpf_get_proto() declaration. And tcp_v4_tw_remember_stamp()
function is remove in ccb7c410ddc0 ("timewait_sock: Create and use getpeer op.").
Since commit 686989700cab ("tcp: simplify tcp_mark_skb_lost")
tcp_skb_mark_lost_uncond_verify() declaration is not used anymore.
net: Use sockaddr_storage for getsockopt(SO_PEERNAME).
Commit df8fc4e934c1 ("kbuild: Enable -fstrict-flex-arrays=3") started
applying strict rules to standard string functions.
It does not work well with conventional socket code around each protocol-
specific sockaddr_XXX struct, which is cast from sockaddr_storage and has
a bigger size than fortified functions expect. See these commits:
commit 06d4c8a80836 ("af_unix: Fix fortify_panic() in unix_bind_bsd().")
commit ecb4534b6a1c ("af_unix: Terminate sun_path when bind()ing pathname socket.")
commit a0ade8404c3b ("af_packet: Fix warning of fortified memcpy() in packet_getname().")
We must cast the protocol-specific address back to sockaddr_storage
to call such functions.
However, in the case of getsockaddr(SO_PEERNAME), the rationale is a bit
unclear as the buffer is defined by char[128] which is the same size as
sockaddr_storage.
Let's use sockaddr_storage explicitly.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
As 32bits of dissector->used_keys are exhausted,
increase the size to 64bits.
This is base change for ESP/AH flow dissector patch.
Please find patch and discussions at
https://lore.kernel.org/netdev/ZMDNjD46BvZ5zp5I@corigine.com/T/#t
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Reviewed-by: Petr Machata <petrm@nvidia.com> # for mlxsw Tested-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Li [Thu, 27 Jul 2023 00:57:41 +0000 (08:57 +0800)]
team: Remove NULL check before dev_{put, hold}
The call netdev_{put, hold} of dev_{put, hold} will check NULL,
so there is no need to check before using dev_{put, hold},
remove it to silence the warning:
./drivers/net/team/team.c:2325:3-10: WARNING: NULL check before dev_{put, hold} functions is not needed.
Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5991 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Referring to platform_get_irq()'s definition, the return value has
already been checked, error message also been printed via
dev_err_probe() if ret < 0. Calling dev_err_probe() one more time
outside platform_get_irq() is obviously redundant.
Removing dev_err_probe() outside platform_get_irq() to clean up
above problem.
Hayes Wang [Wed, 26 Jul 2023 03:08:07 +0000 (11:08 +0800)]
r8152: adjust generic_ocp_write function
Reduce the control transfer if all bytes of first or the last DWORD are
written.
The original method is to split the control transfer into three parts
(the first DWORD, middle continuous data, and the last DWORD). However,
they could be combined if whole bytes of the first DWORD or last DWORD
are written. That is, the first DWORD or the last DWORD could be combined
with the middle continuous data, if the byte_en is 0xff.
====================
In-kernel support for the TLS Alert protocol
IMO the kernel doesn't need user space (ie, tlshd) to handle the TLS
Alert protocol. Instead, a set of small helper functions can be used
to handle sending and receiving TLS Alerts for in-kernel TLS
consumers.
====================
Merged on top of a tag in case it's needed in the NFS tree.
Chuck Lever [Thu, 27 Jul 2023 17:37:37 +0000 (13:37 -0400)]
SUNRPC: Use new helpers to handle TLS Alerts
Use the helpers to parse the level and description fields in
incoming alerts. "Warning" alerts are discarded, and "fatal"
alerts mean the session is no longer valid.
Chuck Lever [Thu, 27 Jul 2023 17:35:50 +0000 (13:35 -0400)]
net/tls: Add TLS Alert definitions
I'm about to add support for kernel handshake API consumers to send
TLS Alerts, so introduce the needed protocol definitions in the new
header tls_prot.h.
This presages support for Closure alerts. Also, support for alerts
is a pre-requite for handling session re-keying, where one peer will
signal the need for a re-key by sending a TLS Alert.
Chuck Lever [Thu, 27 Jul 2023 17:35:23 +0000 (13:35 -0400)]
net/tls: Move TLS protocol elements to a separate header
Kernel TLS consumers will need definitions of various parts of the
TLS protocol, but often do not need the function declarations and
other infrastructure provided in <net/tls.h>.
Break out existing standardized protocol elements into a separate
header, and make room for a few more elements in subsequent patches.
Jakub Kicinski [Thu, 27 Jul 2023 19:07:26 +0000 (12:07 -0700)]
eth: bnxt: fix warning for define in struct_group
Fix C=1 warning with sparse 0.6.4:
drivers/net/ethernet/broadcom/bnxt/bnxt.c: note: in included file:
drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h:30:1: warning: directive in macro's argument list
Jakub Kicinski [Thu, 27 Jul 2023 19:07:25 +0000 (12:07 -0700)]
eth: bnxt: fix one of the W=1 warnings about fortified memcpy()
Fix a W=1 warning with gcc 13.1:
In function ‘fortify_memcpy_chk’,
inlined from ‘bnxt_hwrm_queue_cos2bw_cfg’ at drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c:133:3:
include/linux/fortify-string.h:592:25: warning: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning]
592 | __read_overflow2_field(q_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The field group is already defined and starts at queue_id:
Jakub Kicinski [Fri, 28 Jul 2023 20:41:59 +0000 (13:41 -0700)]
Merge tag 'mlx5-updates-2023-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2023-07-24
1) Generalize devcom implementation to be independent of number of ports
or device's GUID.
2) Save memory on command interface statistics.
3) General code cleanups
* tag 'mlx5-updates-2023-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: Give esw_offloads_load/unload_rep() "mlx5_" prefix
net/mlx5: Make mlx5_eswitch_load/unload_vport() static
net/mlx5: Make mlx5_esw_offloads_rep_load/unload() static
net/mlx5: Remove pointless devlink_rate checks
net/mlx5: Don't check vport->enabled in port ops
net/mlx5e: Make flow classification filters static
net/mlx5e: Remove duplicate code for user flow
net/mlx5: Allocate command stats with xarray
net/mlx5: split mlx5_cmd_init() to probe and reload routines
net/mlx5: Remove redundant cmdif revision check
net/mlx5: Re-organize mlx5_cmd struct
net/mlx5e: E-Switch, Allow devcom initialization on more vports
net/mlx5e: E-Switch, Register devcom device with switch id key
net/mlx5: Devcom, Infrastructure changes
net/mlx5: Use shared code for checking lag is supported
====================
====================
mlxsw: Avoid non-tracker helpers when holding and putting netdevices
Using the tracking helpers, netdev_hold() and netdev_put(), makes it easier
to debug netdevice refcount imbalances when CONFIG_NET_DEV_REFCNT_TRACKER
is enabled. For example, the following traceback shows the callpath to the
point of an outstanding hold that was never put:
unregister_netdevice: waiting for swp3 to become free. Usage count = 6
ref_tracker: eth%d@ffff888123c9a580 has 1/5 users at
mlxsw_sp_switchdev_event+0x6bd/0xcc0 [mlxsw_spectrum]
notifier_call_chain+0xbf/0x3b0
atomic_notifier_call_chain+0x78/0x200
br_switchdev_fdb_notify+0x25f/0x2c0 [bridge]
fdb_notify+0x16a/0x1a0 [bridge]
[...]
In this patchset, get rid of all non-ref-tracking helpers in mlxsw.
- Patch #1 drops two functions that are not used anymore, but contain
dev_hold() / dev_put() calls.
- Patch #2 avoids taking a reference in one function which is called
under RTNL.
- The remaining patches convert individual hold/put sites one by one
from trackerless to tracker-enabled.