]> www.infradead.org Git - users/hch/block.git/log
users/hch/block.git
2 years agonet/sched: Retire dsmark qdisc
Jamal Hadi Salim [Tue, 14 Feb 2023 13:49:13 +0000 (08:49 -0500)]
net/sched: Retire dsmark qdisc

The dsmark qdisc has served us well over the years for diffserv but has not
been getting much attention due to other more popular approaches to do diffserv
services. Most recently it has become a shooting target for syzkaller. For this
reason, we are retiring it.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet/sched: Retire ATM qdisc
Jamal Hadi Salim [Tue, 14 Feb 2023 13:49:12 +0000 (08:49 -0500)]
net/sched: Retire ATM qdisc

The ATM qdisc has served us well over the years but has not been getting much
TLC due to lack of known users. Most recently it has become a shooting target
for syzkaller. For this reason, we are retiring it.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet/sched: Retire CBQ qdisc
Jamal Hadi Salim [Tue, 14 Feb 2023 13:49:11 +0000 (08:49 -0500)]
net/sched: Retire CBQ qdisc

While this amazing qdisc has served us well over the years it has not been
getting any tender love and care and has bitrotted over time.
It has become mostly a shooting target for syzkaller lately.
For this reason, we are retiring it. Goodbye CBQ - we loved you.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoMerge branch 'adding-sparx5-es0-vcap-support'
Paolo Abeni [Thu, 16 Feb 2023 07:59:51 +0000 (08:59 +0100)]
Merge branch 'adding-sparx5-es0-vcap-support'

Steen Hegelund says:

====================
Adding Sparx5 ES0 VCAP support

This provides the Egress Stage 0 (ES0) VCAP (Versatile Content-Aware
Processor) support for the Sparx5 platform.

The ES0 VCAP is an Egress Access Control VCAP that uses frame keyfields and
previously classified keyfields to add, rewrite or remove VLAN tags on the
egress frames, and is therefore often referred to as the rewriter.

The ES0 VCAP also supports trapping frames to the host.

The ES0 VCAP has 1 lookup accessible with this chain id:

- chain 10000000: ES0 Lookup 0

The ES0 VCAP does not do traffic classification to select a keyset, but it
does have two keysets that can be used on all traffic.  For now only the
ISDX keyset is used.

The ES0 VCAP can match on an ISDX key (Ingress Service Index) as one of the
frame metadata keyfields, similar to the ES2 VCAP.

The ES0 VCAP uses external counters in the XQS (statistics) group.
====================

Link: https://lore.kernel.org/r/20230214104049.1553059-1-steen.hegelund@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: microchip: sparx5: Add TC vlan action support for the ES0 VCAP
Steen Hegelund [Tue, 14 Feb 2023 10:40:49 +0000 (11:40 +0100)]
net: microchip: sparx5: Add TC vlan action support for the ES0 VCAP

This provides these 3 actions for rule in the ES0 VCAP:

- action vlan pop
- action vlan modify id X priority Y
- action vlan push id X priority Y protocol Z

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: microchip: sparx5: Add TC support for the ES0 VCAP
Steen Hegelund [Tue, 14 Feb 2023 10:40:48 +0000 (11:40 +0100)]
net: microchip: sparx5: Add TC support for the ES0 VCAP

This enables the TC command to use the Sparx5 ES0 VCAP, and handling of
rule links between IS0 and ES0.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: microchip: sparx5: Add ES0 VCAP keyset configuration for Sparx5
Steen Hegelund [Tue, 14 Feb 2023 10:40:47 +0000 (11:40 +0100)]
net: microchip: sparx5: Add ES0 VCAP keyset configuration for Sparx5

This adds the ES0 VCAP port keyset configuration for Sparx5.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: microchip: sparx5: Updated register interface with VCAP ES0 access
Steen Hegelund [Tue, 14 Feb 2023 10:40:46 +0000 (11:40 +0100)]
net: microchip: sparx5: Updated register interface with VCAP ES0 access

This provides access to the ES0 VCAP register targets

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: microchip: sparx5: Add ES0 VCAP model and updated KUNIT VCAP model
Steen Hegelund [Tue, 14 Feb 2023 10:40:45 +0000 (11:40 +0100)]
net: microchip: sparx5: Add ES0 VCAP model and updated KUNIT VCAP model

This provides the VCAP model for the Sparx5 ES0 (Egress Stage 0) VCAP.

This VCAP provides rewriting functionality in the egress path.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: microchip: sparx5: Improve the error handling for linked rules
Steen Hegelund [Tue, 14 Feb 2023 10:40:44 +0000 (11:40 +0100)]
net: microchip: sparx5: Improve the error handling for linked rules

Ensure that an error is returned if the VCAP instance was not found.
The chain offset (diff) is allowed to be zero as this just means that the
user did not request rules to be linked.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: microchip: sparx5: Use chain ids without offsets when enabling rules
Steen Hegelund [Tue, 14 Feb 2023 10:40:43 +0000 (11:40 +0100)]
net: microchip: sparx5: Use chain ids without offsets when enabling rules

This improves the check performed on linked rules when enabling or
disabling them.  The chain id used must be the chain id without the offset
used for linking the rules.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: microchip: sparx5: Egress VLAN TPID configuration follows IFH
Steen Hegelund [Tue, 14 Feb 2023 10:40:42 +0000 (11:40 +0100)]
net: microchip: sparx5: Egress VLAN TPID configuration follows IFH

This changes the TPID of the egress frames to use the TPID stored in the
IFH (internal frame header), which ensures that this is the TPID classified
for the frame at ingress.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: microchip: sparx5: Clear rule counter even if lookup is disabled
Steen Hegelund [Tue, 14 Feb 2023 10:40:41 +0000 (11:40 +0100)]
net: microchip: sparx5: Clear rule counter even if lookup is disabled

The rule counter must be cleared when creating a new rule, even if the VCAP
lookup is currently disabled.

This ensures that rules located in VCAPs that use external counters (such
as Sparx5 IS2 and ES0) will have their counter reset even if the VCAP
lookup is not enabled at the moment.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Fixes: 95fa74148daa ("net: microchip: sparx5: Reset VCAP counter for new rules")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: microchip: sparx5: Discard frames with SMAC multicast addresses
Steen Hegelund [Tue, 14 Feb 2023 10:40:40 +0000 (11:40 +0100)]
net: microchip: sparx5: Discard frames with SMAC multicast addresses

A valid frame should never use a multicast address as its source MAC
address, so discard these invalid frames.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next...
Jakub Kicinski [Thu, 16 Feb 2023 05:52:33 +0000 (21:52 -0800)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-02-14 (ice)

This series contains updates to ice driver only.

Karol extends support for GPIO pins to E823 devices.

Daniel Vacek stops processing of PTP packets when link is down.

Pawel adds support for BIG TCP for IPv6.

Tony changes return type of ice_vsi_realloc_stat_arrays() as it always
returns success.

Zhu Yanjun updates kdoc stating supported TLVs.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: Mention CEE DCBX in code comment
  ice: Change ice_vsi_realloc_stat_arrays() to void
  ice: add support BIG TCP on IPv6
  ice/ptp: fix the PTP worker retrying indefinitely if the link went down
  ice: Add GPIO pin support for E823 products
====================

Link: https://lore.kernel.org/r/20230214213003.2117125-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoDocumentation: core-api: packing: correct spelling
Randy Dunlap [Wed, 15 Feb 2023 05:37:38 +0000 (21:37 -0800)]
Documentation: core-api: packing: correct spelling

Correct spelling problems for Documentation/core-api/packing.rst as
reported by codespell.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vladimir Oltean <olteanv@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Link: https://lore.kernel.org/r/20230215053738.11562-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: snps,dwmac: Fix snps,reset-delays-us dependency
Andrew Halaney [Tue, 14 Feb 2023 17:15:04 +0000 (11:15 -0600)]
dt-bindings: net: snps,dwmac: Fix snps,reset-delays-us dependency

The schema had snps,reset-delay-us as dependent on snps,reset-gpio. The
actual property is called snps,reset-delays-us, so fix this to catch any
devicetree defining snsps,reset-delays-us without snps,reset-gpio.

Fixes: 7db3545aef5f ("dt-bindings: net: stmmac: Convert the binding to a schemas")
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
Link: https://lore.kernel.org/r/20230214171505.224602-1-ahalaney@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests: forwarding: tc_actions: cleanup temporary files when test is aborted
Davide Caratti [Tue, 14 Feb 2023 09:52:37 +0000 (10:52 +0100)]
selftests: forwarding: tc_actions: cleanup temporary files when test is aborted

remove temporary files created by 'mirred_egress_to_ingress_tcp' test
in the cleanup() handler. Also, change variable names to avoid clashing
with globals from lib.sh.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/091649045a017fc00095ecbb75884e5681f7025f.1676368027.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: wangxun: Add the basic ethtool interfaces
Mengyuan Lou [Tue, 14 Feb 2023 09:15:27 +0000 (17:15 +0800)]
net: wangxun: Add the basic ethtool interfaces

Add the basic ethtool ops get_drvinfo and get_link for ngbe and txgbe.
Ngbe implements get_link_ksettings, nway_reset and set_link_ksettings
for free using phylib code.
The code related to the physical interface is not yet fully implemented
in txgbe using phylink code. So do not implement get_link_ksettings,
nway_reset and set_link_ksettings in txgbe.

Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230214091527.69943-1-mengyuanlou@net-swift.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: msg_zerocopy: elide page accounting if RLIM_INFINITY
Willem de Bruijn [Tue, 14 Feb 2023 15:57:40 +0000 (10:57 -0500)]
net: msg_zerocopy: elide page accounting if RLIM_INFINITY

MSG_ZEROCOPY ensures that pinned user pages do not exceed the limit.
If no limit is set, skip this accounting as otherwise expensive
atomic_long operations are called for no reason.

This accounting is already skipped for privileged (CAP_IPC_LOCK)
users. Rely on the same mechanism: if no mmp->user is set,
mm_unaccount_pinned_pages does not decrement either.

Tested by running tools/testing/selftests/net/msg_zerocopy.sh with
an unprivileged user for the TXMODE binary:

    ip netns exec "${NS1}" sudo -u "{$USER}" "${BIN}" "-${IP}" ...

Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230214155740.3448763-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: phy: c45: genphy_c45_an_config_aneg(): fix uninitialized symbol error
Oleksij Rempel [Wed, 15 Feb 2023 05:04:53 +0000 (06:04 +0100)]
net: phy: c45: genphy_c45_an_config_aneg(): fix uninitialized symbol error

Fix warning:
drivers/net/phy/phy-c45.c:712 genphy_c45_write_eee_adv() error: uninitialized symbol 'changed'

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Link: https://lore.kernel.org/r/202302150232.q6idsV8s-lkp@intel.com/
Fixes: 022c3f87f88e ("net: phy: add genphy_c45_ethtool_get/set_eee() support")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230215050453.2251360-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: phy: motorcomm: uninitialized variables in yt8531_link_change_notify()
Dan Carpenter [Wed, 15 Feb 2023 04:21:47 +0000 (07:21 +0300)]
net: phy: motorcomm: uninitialized variables in yt8531_link_change_notify()

These booleans are never set to false, but are just used without being
initialized.

Fixes: 4ac94f728a58 ("net: phy: Add driver for Motorcomm yt8531 gigabit ethernet phy")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Frank Sae <Frank.Sae@motor-comm.com>
Link: https://lore.kernel.org/r/Y+xd2yJet2ImHLoQ@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'mlx5-updates-2023-02-10' of git://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Thu, 16 Feb 2023 03:24:52 +0000 (19:24 -0800)]
Merge tag 'mlx5-updates-2023-02-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-02-10

1) From Roi and Mark: MultiPort eswitch support

MultiPort E-Switch builds on newer hardware's capabilities and introduces
a mode where a single E-Switch is used and all the vports and physical
ports on the NIC are connected to it.

The new mode will allow in the future a decrease in the memory used by the
driver and advanced features that aren't possible today.

This represents a big change in the current E-Switch implantation in mlx5.
Currently, by default, each E-Switch manager manages its E-Switch.
Steering rules in each E-Switch can only forward traffic to the native
physical port associated with that E-Switch. While there are ways to target
non-native physical ports, for example using a bond or via special TC
rules. None of the ways allows a user to configure the driver
to operate by default in such a mode nor can the driver decide
to move to this mode by default as it's user configuration-driven right now.

While MultiPort E-Switch single FDB mode is the preferred mode, older
generations of ConnectX hardware couldn't support this mode so it was never
implemented. Now that there is capable hardware present, start the
transition to having this mode by default.

Introduce a devlink parameter to control MultiPort Eswitch single FDB mode.
This will allow users to select this mode on their system right now
and in the future will allow the driver to move to this mode by default.

2) From Jiri: Improvements and fixes for mlx5 netdev's devlink logic
 2.1) Cleanups related to mlx5's devlink port logic
 2.2) Move devlink port registration to be done before netdev alloc
 2.3) Create auxdev devlink instance in the same ns as parent devlink
 2.4) Suspend auxiliary devices only in case of PCI device suspend

* tag 'mlx5-updates-2023-02-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Suspend auxiliary devices only in case of PCI device suspend
  net/mlx5: Remove "recovery" arg from mlx5_load_one() function
  net/mlx5e: Create auxdev devlink instance in the same ns as parent devlink
  net/mlx5e: Move devlink port registration to be done before netdev alloc
  net/mlx5e: Move dl_port to struct mlx5e_dev
  net/mlx5e: Replace usage of mlx5e_devlink_get_dl_port() by netdev->devlink_port
  net/mlx5e: Pass mdev to mlx5e_devlink_port_register()
  net/mlx5: Remove outdated comment
  net/mlx5e: TC, Remove redundant parse_attr argument
  net/mlx5e: Use a simpler comparison for uplink rep
  net/mlx5: Lag, Add single RDMA device in multiport mode
  net/mlx5: Lag, set different uplink vport metadata in multiport eswitch mode
  net/mlx5: E-Switch, rename bond update function to be reused
  net/mlx5e: TC, Add peer flow in mpesw mode
  net/mlx5: Lag, Control MultiPort E-Switch single FDB mode
====================

Link: https://lore.kernel.org/r/20230214221239.159033-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'devlink-cleanups-and-move-devlink-health-functionality-to-separate...
Jakub Kicinski [Thu, 16 Feb 2023 03:15:45 +0000 (19:15 -0800)]
Merge branch 'devlink-cleanups-and-move-devlink-health-functionality-to-separate-file'

Moshe Shemesh says:

====================
devlink: cleanups and move devlink health functionality to separate file

This patchset moves devlink health callbacks, helpers and related code
from leftover.c to new file health.c. About 1.3K LoC are moved by this
patchset, covering all devlink health functionality.

In addition this patchset includes a couple of small cleanups in devlink
health code and documentation update.
====================

Link: https://lore.kernel.org/r/1676392686-405892-1-git-send-email-moshe@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodevlink: Fix TP_STRUCT_entry in trace of devlink health report
Moshe Shemesh [Tue, 14 Feb 2023 16:38:06 +0000 (18:38 +0200)]
devlink: Fix TP_STRUCT_entry in trace of devlink health report

Fix a bug in trace point definition for devlink health report, as
TP_STRUCT_entry of reporter_name should get reporter_name and not msg.

Note no fixes tag as this is a harmless bug as both reporter_name and
msg are strings and TP_fast_assign for this entry is correct.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodevlink: Update devlink health documentation
Moshe Shemesh [Tue, 14 Feb 2023 16:38:05 +0000 (18:38 +0200)]
devlink: Update devlink health documentation

Update devlink-health.rst file:
- Add devlink formatted message (fmsg) API documentation.
- Add auto-dump as a condition to do dump once error reported.
- Expand OOB to clarify this acronym.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodevlink: Move health common function to health file
Moshe Shemesh [Tue, 14 Feb 2023 16:38:04 +0000 (18:38 +0200)]
devlink: Move health common function to health file

Now that all devlink health callbacks and related code are in file
health.c move common health functions and devlink_health_reporter struct
to be local in health.c file.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodevlink: Move devlink health test to health file
Moshe Shemesh [Tue, 14 Feb 2023 16:38:03 +0000 (18:38 +0200)]
devlink: Move devlink health test to health file

Move devlink health report test callback from leftover.c to health.c. No
functional change in this patch.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodevlink: Move devlink health dump to health file
Moshe Shemesh [Tue, 14 Feb 2023 16:38:02 +0000 (18:38 +0200)]
devlink: Move devlink health dump to health file

Move devlink health report dump callbacks and related code from
leftover.c to health.c. No functional change in this patch.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodevlink: Move devlink fmsg and health diagnose to health file
Moshe Shemesh [Tue, 14 Feb 2023 16:38:01 +0000 (18:38 +0200)]
devlink: Move devlink fmsg and health diagnose to health file

Devlink fmsg (formatted message) is used by devlink health diagnose,
dump and drivers which support these devlink health callbacks.
Therefore, move devlink fmsg helpers and related code to file health.c.
Move devlink health diagnose to file health.c. No functional change in
this patch.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodevlink: Move devlink health report and recover to health file
Moshe Shemesh [Tue, 14 Feb 2023 16:38:00 +0000 (18:38 +0200)]
devlink: Move devlink health report and recover to health file

Move devlink health report helper and recover callback and related code
from leftover.c to health.c. No functional change in this patch.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodevlink: Move devlink health get and set code to health file
Moshe Shemesh [Tue, 14 Feb 2023 16:37:59 +0000 (18:37 +0200)]
devlink: Move devlink health get and set code to health file

Move devlink health get and set callbacks and related code from
leftover.c to health.c. No functional change in this patch.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodevlink: health: Fix nla_nest_end in error flow
Moshe Shemesh [Tue, 14 Feb 2023 16:37:58 +0000 (18:37 +0200)]
devlink: health: Fix nla_nest_end in error flow

devlink_nl_health_reporter_fill() error flow calls nla_nest_end(). Fix
it to call nla_nest_cancel() instead.

Note the bug is harmless as genlmsg_cancel() cancel the entire message,
so no fixes tag added.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodevlink: Split out health reporter create code
Moshe Shemesh [Tue, 14 Feb 2023 16:37:57 +0000 (18:37 +0200)]
devlink: Split out health reporter create code

Move devlink health reporter create/destroy and related dev code to new
file health.c. This file shall include all callbacks and functionality
that are related to devlink health.

In addition, fix kdoc indentation and make reporter create/destroy kdoc
more clear. No functional change in this patch.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: update xdp_features with xdp multi-buff
Lorenzo Bianconi [Tue, 14 Feb 2023 14:39:27 +0000 (15:39 +0100)]
ice: update xdp_features with xdp multi-buff

Now ice driver supports xdp multi-buffer so add it to xdp_features.
Check vsi type before setting xdp_features flag.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/8a4781511ab6e3cd280e944eef69158954f1a15f.1676385351.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoi40e: check vsi type before setting xdp_features flag
Lorenzo Bianconi [Tue, 14 Feb 2023 20:07:33 +0000 (21:07 +0100)]
i40e: check vsi type before setting xdp_features flag

Set xdp_features flag just for I40E_VSI_MAIN vsi type since XDP is
supported just in this configuration.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/f2b537f86b34fc176fbc6b3d249b46a20a87a2f3.1676405131.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: phylink: support validated pause and autoneg in fixed-link
Ivan Bornyakov [Fri, 10 Feb 2023 15:46:27 +0000 (18:46 +0300)]
net: phylink: support validated pause and autoneg in fixed-link

In fixed-link setup phylink_parse_fixedlink() unconditionally sets
Pause, Asym_Pause and Autoneg bits to "supported" bitmap, while MAC may
not support these.

This leads to ethtool reporting:

 > Supported pause frame use: Symmetric Receive-only
 > Supports auto-negotiation: Yes

regardless of what is actually supported.

Instead of unconditionally set Pause, Asym_Pause and Autoneg it is
sensible to set them according to validated "supported" bitmap, i.e. the
result of phylink_validate().

Signed-off-by: Ivan Bornyakov <i.bornyakov@metrotek.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: no longer support SOCK_REFCNT_DEBUG feature
Jason Xing [Tue, 14 Feb 2023 04:14:10 +0000 (12:14 +0800)]
net: no longer support SOCK_REFCNT_DEBUG feature

Commit e48c414ee61f ("[INET]: Generalise the TCP sock ID lookup routines")
commented out the definition of SOCK_REFCNT_DEBUG in 2005 and later another
commit 463c84b97f24 ("[NET]: Introduce inet_connection_sock") removed it.
Since we could track all of them through bpf and kprobe related tools
and the feature could print loads of information which might not be
that helpful even under a little bit pressure, the whole feature which
has been inactive for many years is no longer supported.

Link: https://lore.kernel.org/lkml/20230211065153.54116-1-kerneljasonxing@gmail.com/
Suggested-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonetlink-specs: add rx-push to ethtool family
Jakub Kicinski [Tue, 14 Feb 2023 04:32:46 +0000 (20:32 -0800)]
netlink-specs: add rx-push to ethtool family

Commit 5b4e9a7a71ab ("net: ethtool: extend ringparam set/get APIs for rx_push")
added a new attr for configuring rx-push, right after tx-push.
Add it to the spec, the ring param operation is covered by
the otherwise sparse ethtool spec.

Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20230214043246.230518-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'net-make-kobj_type-structures-constant'
Jakub Kicinski [Wed, 15 Feb 2023 04:48:10 +0000 (20:48 -0800)]
Merge branch 'net-make-kobj_type-structures-constant'

Thomas Weißschuh says:

====================
net: make kobj_type structures constant

Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.

Take advantage of this to constify the structure definitions to prevent
modification at runtime.
====================

Link: https://lore.kernel.org/r/20230211-kobj_type-net-v2-0-013b59e59bf3@weissschuh.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet-sysfs: make kobj_type structures constant
Thomas Weißschuh [Tue, 14 Feb 2023 04:23:12 +0000 (04:23 +0000)]
net-sysfs: make kobj_type structures constant

Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.

Take advantage of this to constify the structure definitions to prevent
modification at runtime.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: bridge: make kobj_type structure constant
Thomas Weißschuh [Tue, 14 Feb 2023 04:23:11 +0000 (04:23 +0000)]
net: bridge: make kobj_type structure constant

Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.

Take advantage of this to constify the structure definition to prevent
modification at runtime.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'net-ipa-define-gsi-register-fields-differently'
Jakub Kicinski [Wed, 15 Feb 2023 04:39:40 +0000 (20:39 -0800)]
Merge branch 'net-ipa-define-gsi-register-fields-differently'

Alex Elder says:

====================
net: ipa: define GSI register fields differently

Now that we have "reg" definitions in place to define GSI register
offsets, add the definitions for the fields of GSI registers that
have them.

There aren't many differences between versions, but a few fields are
present only in some versions of IPA, so additional "gsi_reg-vX.Y.c"
files are created to capture such differences.  As in the previous
series, these files are created as near-copies of existing files
just before they're needed to represent these differences.  The
first patch adds files for IPA v4.0, v4.5, and v4.9; the fifth patch
adds a file for IPA v4.11.

Note that the first and fifth patch cause some checkpatch warnings
because they align some continued lines with an open parenthesis
that at the fourth column.
====================

Link: https://lore.kernel.org/r/20230213162229.604438-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ipa: define fields for remaining GSI registers
Alex Elder [Mon, 13 Feb 2023 16:22:29 +0000 (10:22 -0600)]
net: ipa: define fields for remaining GSI registers

Define field IDs for the remaining GSI registers, and populate the
register definition files accordingly.  Use the reg_*() functions to
access field values for those regiters, and get rid of the previous
field definition constants.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ipa: add "gsi_v4.11.c"
Alex Elder [Mon, 13 Feb 2023 16:22:28 +0000 (10:22 -0600)]
net: ipa: add "gsi_v4.11.c"

The next patch adds a GSI register field that is only valid starting
at IPA v4.11.  Create "gsi_v4.11.c" from "gsi_v4.9.c", changing only
the name of the public regs structure it defines.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ipa: define fields for event-ring related registers
Alex Elder [Mon, 13 Feb 2023 16:22:27 +0000 (10:22 -0600)]
net: ipa: define fields for event-ring related registers

Define field IDs for the EV_CH_E_CNTXT_0 and EV_CH_E_CNTXT_8 GSI
registers, and populate the register definition files accordingly.
Use the reg_*() functions to access field values for those regiters,
and get rid of the previous field definition constants.

The remaining EV_CH_E_CNTXT_* registers are written with full 32-bit
values (and have no fields).

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ipa: define more fields for GSI registers
Alex Elder [Mon, 13 Feb 2023 16:22:26 +0000 (10:22 -0600)]
net: ipa: define more fields for GSI registers

Beyond the CH_C_QOS register, two other registers whose offset is
related to channel number have fields within them.

Define the fields within the CH_C_CNTXT_0 GSI register, using an
enumerated type to identify the register's fields, and define an
array of field masks to use for that register's reg structure.

For the CH_C_CNTXT_1 GSI register, ch_c_cntxt_1_length_encode()
previously hid the difference in bit width in the channel ring
length field.  Instead, define a new field CH_R_LENGTH and encode
the ring size with reg_encode().

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ipa: define GSI CH_C_QOS register fields
Alex Elder [Mon, 13 Feb 2023 16:22:25 +0000 (10:22 -0600)]
net: ipa: define GSI CH_C_QOS register fields

Define the fields within the CH_C_QOS GSI register using an array of
field masks in that register's reg structure.  Use the reg functions
for encoding values in those fields.

One field in the register is present for IPA v4.0-4.2 only, two
others are present starting at IPA v4.5, and one more is there
starting at IPA v4.9.

Drop the "GSI_" prefix in symbols defined in the gsi_prefetch_mode
enumerated type, and define their values using decimal rather than
hexidecimal values.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ipa: populate more GSI register files
Alex Elder [Mon, 13 Feb 2023 16:22:24 +0000 (10:22 -0600)]
net: ipa: populate more GSI register files

Create "gsi_v4.0.c", "gsi_v4.5.c", and "gsi_v4.9.c" as essentially
identical copies of "gsi_v3.5.1.c".  The only difference is the name
of the exported "gsi_regs_vX_Y" structure.  The next patch will
start differentiating them.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet/mlx5: Suspend auxiliary devices only in case of PCI device suspend
Jiri Pirko [Thu, 26 Jan 2023 12:39:09 +0000 (13:39 +0100)]
net/mlx5: Suspend auxiliary devices only in case of PCI device suspend

The original behavior introduced by commit c6acd629eec7 ("net/mlx5e: Add
support for devlink-port in non-representors mode") correctly
re-instantiated uplink devlink port and related netdevice during devlink
reload. However with migration to auxiliary devices, this behaviour
changed.

Restore the original behaviour and tear down auxiliary devices
completely during devlink reload.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Remove "recovery" arg from mlx5_load_one() function
Jiri Pirko [Thu, 26 Jan 2023 12:23:17 +0000 (13:23 +0100)]
net/mlx5: Remove "recovery" arg from mlx5_load_one() function

mlx5_load_one() is always called with recovery==false, so remove the
unneeded function arg.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5e: Create auxdev devlink instance in the same ns as parent devlink
Jiri Pirko [Thu, 26 Jan 2023 09:46:49 +0000 (10:46 +0100)]
net/mlx5e: Create auxdev devlink instance in the same ns as parent devlink

Commit cited in "fixes" tag moved the devlink port under separate
devlink entity created for auxiliary device. Respect the network
namespace of parent devlink entity and allocate the devlink there.

Fixes: ee75f1fc44dd ("net/mlx5e: Create separate devlink instance for ethernet auxiliary device")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5e: Move devlink port registration to be done before netdev alloc
Jiri Pirko [Wed, 18 Jan 2023 14:52:33 +0000 (15:52 +0100)]
net/mlx5e: Move devlink port registration to be done before netdev alloc

Move the devlink port registration to be done right after devlink
instance registration.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5e: Move dl_port to struct mlx5e_dev
Jiri Pirko [Tue, 17 Jan 2023 13:44:54 +0000 (14:44 +0100)]
net/mlx5e: Move dl_port to struct mlx5e_dev

No need to have dl_port which is tightly coupled with mlx5e code
in mlx5 core code. Move it to struct mlx5e_dev and loose
mlx5e_devlink_get_dl_port() helper.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5e: Replace usage of mlx5e_devlink_get_dl_port() by netdev->devlink_port
Jiri Pirko [Wed, 18 Jan 2023 14:44:48 +0000 (15:44 +0100)]
net/mlx5e: Replace usage of mlx5e_devlink_get_dl_port() by netdev->devlink_port

On places where netdev pointer is available, access related devlink_port
pointer by netdev->devlink_port instead of using
mlx5e_devlink_get_dl_port() which is going to be removed.

Move SET_NETDEV_DEVLINK_PORT() call right after devlink port
registration to make sure netdev->devlink_port is valid.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5e: Pass mdev to mlx5e_devlink_port_register()
Jiri Pirko [Wed, 18 Jan 2023 14:39:24 +0000 (15:39 +0100)]
net/mlx5e: Pass mdev to mlx5e_devlink_port_register()

Instead of accessing priv->mdev, pass mdev pointer to
mlx5e_devlink_port_register() and access it directly.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Remove outdated comment
Jiri Pirko [Fri, 13 Jan 2023 14:44:42 +0000 (15:44 +0100)]
net/mlx5: Remove outdated comment

The comment is no longer applicable, as the devlink reload and instance
cleanup are both protected with devlink instance lock, therefore no race
can happen.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5e: TC, Remove redundant parse_attr argument
Roi Dayan [Sun, 29 Jan 2023 09:53:42 +0000 (11:53 +0200)]
net/mlx5e: TC, Remove redundant parse_attr argument

The parse_attr argument is not being used in
actions_match_supported_fdb(). remove it.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5e: Use a simpler comparison for uplink rep
Roi Dayan [Mon, 6 Feb 2023 12:40:24 +0000 (14:40 +0200)]
net/mlx5e: Use a simpler comparison for uplink rep

get_route_and_out_devs()  is uses the following condition
mlx5e_eswitch_rep() && mlx5e_is_uplink_rep() to check if a given netdev is the
uplink rep.

Alternatively we can just use the straight forward version
mlx5e_eswitch_uplink_rep() that only checks if a given netdev is uplink rep.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Lag, Add single RDMA device in multiport mode
Mark Bloch [Mon, 5 Dec 2022 13:32:52 +0000 (15:32 +0200)]
net/mlx5: Lag, Add single RDMA device in multiport mode

In MultiPort E-Switch mode a single RDMA is created. This device has multiple
RDMA ports that represent the uplink ports that are connected to the E-Switch.
Account for this when creating the RDMA device so it has an additional port for
the non native uplink.

As a side effect of this patch, use shared fdb in multiport eswitch mode.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Lag, set different uplink vport metadata in multiport eswitch mode
Roi Dayan [Wed, 30 Nov 2022 13:12:50 +0000 (15:12 +0200)]
net/mlx5: Lag, set different uplink vport metadata in multiport eswitch mode

In a follow-up commit multiport eswitch mode will use a shared fdb.
In shared fdb there is a single eswitch fdb and traffic could come from any
port. to distinguish between the ports set a different metadata per uplink port.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: E-Switch, rename bond update function to be reused
Roi Dayan [Tue, 29 Nov 2022 14:02:55 +0000 (16:02 +0200)]
net/mlx5: E-Switch, rename bond update function to be reused

The vport bond update function is really updating the vport metadata
and there is no direct relation to bond. Rename the function
to vport metadata update to be used a followup commit.
This commit doesn't change any functionality.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5e: TC, Add peer flow in mpesw mode
Roi Dayan [Thu, 1 Dec 2022 09:10:20 +0000 (11:10 +0200)]
net/mlx5e: TC, Add peer flow in mpesw mode

While at it rename mlx5_lag_mpesw_is_activated() to mlx5_lag_is_mpesw() to
be consistent with checking if other lag modes are activated.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Lag, Control MultiPort E-Switch single FDB mode
Roi Dayan [Mon, 28 Nov 2022 11:33:03 +0000 (13:33 +0200)]
net/mlx5: Lag, Control MultiPort E-Switch single FDB mode

MultiPort E-Switch builds on newer hardware's capabilities and introduces
a mode where a single E-Switch is used and all the vports and physical
ports on the NIC are connected to it.

The new mode will allow in the future a decrease in the memory used by the
driver and advanced features that aren't possible today.

This represents a big change in the current E-Switch implantation in mlx5.
Currently, by default, each E-Switch manager manages its E-Switch.
Steering rules in each E-Switch can only forward traffic to the native
physical port associated with that E-Switch. While there are ways to target
non-native physical ports, for example using a bond or via special TC
rules. None of the ways allows a user to configure the driver
to operate by default in such a mode nor can the driver decide
to move to this mode by default as it's user configuration-driven right now.

While MultiPort E-Switch single FDB mode is the preferred mode, older
generations of ConnectX hardware couldn't support this mode so it was never
implemented. Now that there is capable hardware present, start the
transition to having this mode by default.

Introduce a devlink parameter to control MultiPort E-Switch single FDB mode.
This will allow users to select this mode on their system right now
and in the future will allow the driver to move to this mode by default.

Example:
    $ devlink dev param set pci/0000:00:0b.0 name esw_multiport value 1 \
                  cmode runtime

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agoice: Mention CEE DCBX in code comment
Zhu Yanjun [Mon, 16 Jan 2023 18:51:31 +0000 (13:51 -0500)]
ice: Mention CEE DCBX in code comment

From the function ice_parse_org_tlv, CEE DCBX TLV is also supported.
So update the comment. Or else, it is confusing.

Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 years agoice: Change ice_vsi_realloc_stat_arrays() to void
Tony Nguyen [Thu, 5 Jan 2023 19:33:11 +0000 (11:33 -0800)]
ice: Change ice_vsi_realloc_stat_arrays() to void

smatch reports:

smatch warnings:
drivers/net/ethernet/intel/ice/ice_lib.c:3612 ice_vsi_rebuild() warn: missing error code 'ret'

If an error is encountered for ice_vsi_realloc_stat_arrays(), ret is not
assigned an error value so the goto error path would return success. The
function, however, only returns 0 so an error will never be reported; due
to this, change the function to return void.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
2 years agoice: add support BIG TCP on IPv6
Pawel Chmielewski [Tue, 7 Feb 2023 16:23:03 +0000 (17:23 +0100)]
ice: add support BIG TCP on IPv6

Enable sending BIG TCP packets on IPv6 in the ice driver using generic
ipv6_hopopt_jumbo_remove helper for stripping HBH header.

Tested:
netperf -t TCP_RR -H 2001:db8:0:f101::1  -- -r80000,80000 -O MIN_LATENCY,P90_LATENCY,P99_LATENCY,TRANSACTION_RATE

Tested on two different setups. In both cases, the following settings were
applied after loading the changed driver:

ip link set dev enp175s0f1np1 gso_max_size 130000
ip link set dev enp175s0f1np1 gro_max_size 130000
ip link set dev enp175s0f1np1 mtu 9000

First setup:
Before:
Minimum      90th         99th         Transaction
Latency      Percentile   Percentile   Rate
Microseconds Latency      Latency      Tran/s
             Microseconds Microseconds
134          279          410          3961.584

After:
Minimum      90th         99th         Transaction
Latency      Percentile   Percentile   Rate
Microseconds Latency      Latency      Tran/s
             Microseconds Microseconds
135          178          216          6093.404

The other setup:
Before:
Minimum      90th         99th         Transaction
Latency      Percentile   Percentile   Rate
Microseconds Latency      Latency      Tran/s
             Microseconds Microseconds
218          414          478          2944.765

After:
Minimum      90th         99th         Transaction
Latency      Percentile   Percentile   Rate
Microseconds Latency      Latency      Tran/s
             Microseconds Microseconds
146          238          266          4700.596

Signed-off-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 years agoice/ptp: fix the PTP worker retrying indefinitely if the link went down
Daniel Vacek [Thu, 19 Jan 2023 20:23:16 +0000 (21:23 +0100)]
ice/ptp: fix the PTP worker retrying indefinitely if the link went down

When the link goes down the ice_ptp_tx_tstamp() may loop re-trying to
process the packets till the 2 seconds timeout finally drops them.
In such a case it makes sense to just drop them right away.

Signed-off-by: Daniel Vacek <neelx@redhat.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 years agoice: Add GPIO pin support for E823 products
Karol Kolacinski [Wed, 14 Sep 2022 10:04:29 +0000 (12:04 +0200)]
ice: Add GPIO pin support for E823 products

Add GPIO pin setup for E823, which is only 1PPS input and output.

Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 years agodevlink: don't allow to change net namespace for FW_ACTIVATE reload action
Jiri Pirko [Mon, 13 Feb 2023 11:58:36 +0000 (12:58 +0100)]
devlink: don't allow to change net namespace for FW_ACTIVATE reload action

The change on network namespace only makes sense during re-init reload
action. For FW activation it is not applicable. So check if user passed
an ATTR indicating network namespace change request and forbid it.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230213115836.3404039-1-jiri@resnulli.us
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agohv_netvsc: Check status in SEND_RNDIS_PKT completion message
Michael Kelley [Mon, 13 Feb 2023 05:08:01 +0000 (21:08 -0800)]
hv_netvsc: Check status in SEND_RNDIS_PKT completion message

Completion responses to SEND_RNDIS_PKT messages are currently processed
regardless of the status in the response, so that resources associated
with the request are freed.  While this is appropriate, code bugs that
cause sending a malformed message, or errors on the Hyper-V host, go
undetected. Fix this by checking the status and outputting a rate-limited
message if there is an error.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://lore.kernel.org/r/1676264881-48928-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoMerge branch 'add-support-for-per-action-hw-stats'
Paolo Abeni [Tue, 14 Feb 2023 10:00:03 +0000 (11:00 +0100)]
Merge branch 'add-support-for-per-action-hw-stats'

Oz Shlomo says:

====================
add support for per action hw stats

There are currently two mechanisms for populating hardware stats:
1. Using flow_offload api to query the flow's statistics.
   The api assumes that the same stats values apply to all
   the flow's actions.
   This assumption breaks when action drops or jumps over following
   actions.
2. Using hw_action api to query specific action stats via a driver
   callback method. This api assures the correct action stats for
   the offloaded action, however, it does not apply to the rest of the
   actions in the flow's actions array, as elaborated below.

The current hw_action api does not apply to the following use cases:
1. Actions that are implicitly created by filters (aka bind actions).
   In the following example only one counter will apply to the rule:
   tc filter add dev $DEV prio 2 protocol ip parent ffff: \
        flower ip_proto tcp dst_ip $IP2 \
        action police rate 1mbit burst 100k conform-exceed drop/pipe \
        action mirred egress redirect dev $DEV2

2. Action preceding a hw action.
   In the following example the same flow stats will apply to the sample and
   mirred actions:
    tc action add police rate 1mbit burst 100k conform-exceed drop / pipe
    tc filter add dev $DEV prio 2 protocol ip parent ffff: \
        flower ip_proto tcp dst_ip $IP2 \
        action sample rate 1 group 10 trunc 60 pipe \
        action police index 1 \
        action mirred egress redirect dev $DEV2

3. Meter action using jump control.
   In the following example the same flow stats will apply to both
   mirred actions:
    tc action add police rate 1mbit burst 100k conform-exceed jump 2 / pipe
    tc filter add dev $DEV prio 2 protocol ip parent ffff: \
        flower ip_proto tcp dst_ip $IP2 \
        action police index 1 \
        action mirred egress redirect dev $DEV2
        action mirred egress redirect dev $DEV3

This series provides the platform to query per action stats for in_hw flows.

The first four patches are preparation patches with no functionality change.
The fifth patch re-uses the existing flow action stats api to query action
stats for both classifier and action dumps.
The rest of the patches add per action stats support to the Mellanox driver.
====================

Link: https://lore.kernel.org/r/20230212132520.12571-1-ozsh@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet/mlx5e: TC, support per action stats
Oz Shlomo [Sun, 12 Feb 2023 13:25:20 +0000 (15:25 +0200)]
net/mlx5e: TC, support per action stats

Extend the action stats callback implementation to update stats for actions
that are associated with hw counters.
Note that the callback may be called from tc action utility or from tc
flower. Both apis expect the driver to return the stats difference from
the last update. As such, query the raw counter value and maintain
the diff from the last api call in the tc layer, instead of the fs_core
layer.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet/mlx5e: TC, map tc action cookie to a hw counter
Oz Shlomo [Sun, 12 Feb 2023 13:25:19 +0000 (15:25 +0200)]
net/mlx5e: TC, map tc action cookie to a hw counter

Currently a hardware counter is associated with a flow cookie.
This does not apply to flows using branching action which are required to
return per action stats.

A single counter may apply to multiple actions.
Scan the flow actions in reverse (from the last to the first action) while
caching the last counter.
Associate all the flow attribute tc action cookies with the current
cached counter.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet/mlx5e: TC, store tc action cookies per attr
Oz Shlomo [Sun, 12 Feb 2023 13:25:18 +0000 (15:25 +0200)]
net/mlx5e: TC, store tc action cookies per attr

The tc parse action phase translates the tc actions to mlx5 flow
attributes data structure that is used during the flow offload phase.
Currently, the flow offload stage instantiates hw counters while
associating them to flow cookie. However, flows with branching
actions are required to associate a hardware counter with its action
cookies.

Store the parsed tc action cookies on the flow attribute.
Use the list of cookies in the next patch to associate a tc action cookie
with its allocated hw counter.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet/mlx5e: TC, add hw counter to branching actions
Oz Shlomo [Sun, 12 Feb 2023 13:25:17 +0000 (15:25 +0200)]
net/mlx5e: TC, add hw counter to branching actions

Currently a hw count action is appended to the last action of the action
list. However, a branching action may terminate the action list before
reaching the last action.

Append a count action to a branching action.
In the next patches, filters with branching actions will read this counter
when reporting stats per action.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet/sched: support per action hw stats
Oz Shlomo [Sun, 12 Feb 2023 13:25:16 +0000 (15:25 +0200)]
net/sched: support per action hw stats

There are currently two mechanisms for populating hardware stats:
1. Using flow_offload api to query the flow's statistics.
   The api assumes that the same stats values apply to all
   the flow's actions.
   This assumption breaks when action drops or jumps over following
   actions.
2. Using hw_action api to query specific action stats via a driver
   callback method. This api assures the correct action stats for
   the offloaded action, however, it does not apply to the rest of the
   actions in the flow's actions array.

Extend the flow_offload stats callback to indicate that a per action
stats update is required.
Use the existing flow_offload_action api to query the action's hw stats.
In addition, currently the tc action stats utility only updates hw actions.
Reuse the existing action stats cb infrastructure to query any action
stats.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet/sched: introduce flow_offload action cookie
Oz Shlomo [Sun, 12 Feb 2023 13:25:15 +0000 (15:25 +0200)]
net/sched: introduce flow_offload action cookie

Currently a hardware action is uniquely identified by the <id, hw_index>
tuple. However, the id is set by the flow_act_setup callback and tc core
cannot enforce this, and it is possible that a future change could break
this. In addition, <id, hw_index> are not unique across network namespaces.

Uniquely identify the action by setting an action cookie by the tc core.
Use the unique action cookie to query the action's hardware stats.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet/sched: pass flow_stats instead of multiple stats args
Oz Shlomo [Sun, 12 Feb 2023 13:25:14 +0000 (15:25 +0200)]
net/sched: pass flow_stats instead of multiple stats args

Instead of passing 6 stats related args, pass the flow_stats.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet/sched: act_pedit, setup offload action for action stats query
Oz Shlomo [Sun, 12 Feb 2023 13:25:13 +0000 (15:25 +0200)]
net/sched: act_pedit, setup offload action for action stats query

A single tc pedit action may be translated to multiple flow_offload
actions.
Offload only actions that translate to a single pedit command value.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet/sched: optimize action stats api calls
Oz Shlomo [Sun, 12 Feb 2023 13:25:12 +0000 (15:25 +0200)]
net/sched: optimize action stats api calls

Currently the hw action stats update is called from tcf_exts_hw_stats_update,
when a tc filter is dumped, and from tcf_action_copy_stats, when a hw
action is dumped.
However, the tcf_action_copy_stats is also called from tcf_action_dump.
As such, the hw action stats update cb is called 3 times for every
tc flower filter dump.

Move the tc action hw stats update from tcf_action_copy_stats to
tcf_dump_walker to update the hw action stats when tc action is dumped.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agodt-bindings: net: dsa: mediatek,mt7530: improve binding description
Arınç ÜNAL [Sun, 12 Feb 2023 13:12:58 +0000 (16:12 +0300)]
dt-bindings: net: dsa: mediatek,mt7530: improve binding description

Fix inaccurate information about PHY muxing, and merge standalone and
multi-chip module MT7530 configuration methods.

Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230212131258.47551-1-arinc.unal@arinc9.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoMerge branch 'ipv6-more-drop-reason'
Jakub Kicinski [Tue, 14 Feb 2023 03:55:34 +0000 (19:55 -0800)]
Merge branch 'ipv6-more-drop-reason'

Eric Dumazet says:

====================
ipv6: more drop reason

Add more drop reasons to IPv6:

 - IPV6_BAD_EXTHDR
 - IPV6_NDISC_FRAG
 - IPV6_NDISC_HOP_LIMIT
 - IPV6_NDISC_BAD_CODE
====================

Link: https://lore.kernel.org/r/20230210184708.2172562-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv6: icmp6: add drop reason support to ndisc_rcv()
Eric Dumazet [Fri, 10 Feb 2023 18:47:08 +0000 (18:47 +0000)]
ipv6: icmp6: add drop reason support to ndisc_rcv()

Creates three new drop reasons:

SKB_DROP_REASON_IPV6_NDISC_FRAG: invalid frag (suppress_frag_ndisc).

SKB_DROP_REASON_IPV6_NDISC_HOP_LIMIT: invalid hop limit.

SKB_DROP_REASON_IPV6_NDISC_BAD_CODE: invalid NDISC icmp6 code.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv6: icmp6: add drop reason support to icmpv6_notify()
Eric Dumazet [Fri, 10 Feb 2023 18:47:07 +0000 (18:47 +0000)]
ipv6: icmp6: add drop reason support to icmpv6_notify()

Accurately reports what happened in icmpv6_notify() when handling
a packet.

This makes use of the new IPV6_BAD_EXTHDR drop reason.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: add pskb_may_pull_reason() helper
Eric Dumazet [Fri, 10 Feb 2023 18:47:06 +0000 (18:47 +0000)]
net: add pskb_may_pull_reason() helper

pskb_may_pull() can fail for two different reasons.

Provide pskb_may_pull_reason() helper to distinguish
between these reasons.

It returns:

SKB_NOT_DROPPED_YET           : Success
SKB_DROP_REASON_PKT_TOO_SMALL : packet too small
SKB_DROP_REASON_NOMEM         : skb->head could not be resized

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dropreason: add SKB_DROP_REASON_IPV6_BAD_EXTHDR
Eric Dumazet [Fri, 10 Feb 2023 18:47:05 +0000 (18:47 +0000)]
net: dropreason: add SKB_DROP_REASON_IPV6_BAD_EXTHDR

This drop reason can be used whenever an IPv6 packet
has a malformed extension header.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: stmmac: dwc-qos: Make struct dwc_eth_dwmac_data::remove return void
Uwe Kleine-König [Sat, 11 Feb 2023 11:24:31 +0000 (12:24 +0100)]
net: stmmac: dwc-qos: Make struct dwc_eth_dwmac_data::remove return void

All implementations of the remove callback return 0 unconditionally. So
in dwc_eth_dwmac_remove() there is no error handling necessary. Simplify
accordingly.

This is a preparation for making struct platform_driver::remove return
void, too.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20230211112431.214252-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: stmmac: Make stmmac_dvr_remove() return void
Uwe Kleine-König [Sat, 11 Feb 2023 11:24:30 +0000 (12:24 +0100)]
net: stmmac: Make stmmac_dvr_remove() return void

The function returns zero unconditionally. Change it to return void instead
which simplifies some callers as error handing becomes unnecessary.

This also makes it more obvious that most platform remove callbacks always
return zero.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20230211112431.214252-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: mvneta: do not set xdp_features for hw buffer devices
Lorenzo Bianconi [Sun, 12 Feb 2023 10:08:26 +0000 (11:08 +0100)]
net: mvneta: do not set xdp_features for hw buffer devices

Devices with hardware buffer management do not support XDP, so do not
set xdp_features for them.

Fixes: 66c0e13ad236 ("drivers: net: turn on XDP features")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/19b5838bb3e4515750af822edb2fa5e974d0a86b.1676196230.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agohv_netvsc: add missing NETDEV_XDP_ACT_NDO_XMIT xdp-features flag
Lorenzo Bianconi [Sun, 12 Feb 2023 09:57:58 +0000 (10:57 +0100)]
hv_netvsc: add missing NETDEV_XDP_ACT_NDO_XMIT xdp-features flag

Add missing ndo_xdp_xmit bit to xdp_features capability flag.

Fixes: 66c0e13ad236 ("drivers: net: turn on XDP features")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/8e3747018f0fd0b5d6e6b9aefe8d9448ca3a3288.1676195726.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: stmmac: add missing NETDEV_XDP_ACT_XSK_ZEROCOPY bit to xdp_features
Lorenzo Bianconi [Sun, 12 Feb 2023 09:54:43 +0000 (10:54 +0100)]
net: stmmac: add missing NETDEV_XDP_ACT_XSK_ZEROCOPY bit to xdp_features

Add missing xsk zero-copy bit to xdp_features capability flag.

Fixes: 66c0e13ad236 ("drivers: net: turn on XDP features")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/c8949baafdf617188dcedb9033ce5a9ca6e9e5ff.1676195440.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ethernet: mtk_wed: No need to clear memory after a dma_alloc_coherent() call
Christophe JAILLET [Sun, 12 Feb 2023 06:51:51 +0000 (07:51 +0100)]
net: ethernet: mtk_wed: No need to clear memory after a dma_alloc_coherent() call

dma_alloc_coherent() already clears the allocated memory, there is no need
to explicitly call memset().

Moreover, it is likely that the size in the memset() is incorrect and
should be "size * sizeof(*ring->desc)".

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/d5acce7dd108887832c9719f62c7201b4c83b3fb.1676184599.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: lan966x: set xdp_features flag
Lorenzo Bianconi [Fri, 10 Feb 2023 19:06:04 +0000 (20:06 +0100)]
net: lan966x: set xdp_features flag

Set xdp_features netdevice flag if lan966x nic supports xdp mode.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://lore.kernel.org/r/01f4412f28899d97b0054c9c1a63694201301b42.1676055718.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'ksz9477-eee-support'
David S. Miller [Mon, 13 Feb 2023 11:12:31 +0000 (11:12 +0000)]
Merge branch 'ksz9477-eee-support'

Oleksij Rempel says:

====================
net: add EEE support for KSZ9477 switch family

changes v8:
- fix comment for linkmode_to_mii_eee_cap1_t() function
- add Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
- add Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>

changes v7:
- update documentation for genphy_c45_eee_is_active()
- address review comments on "net: dsa: microchip: enable EEE support"
  patch

changes v6:
- split patch set and send only first 9 patches
- Add Reviewed-by: Andrew Lunn <andrew@lunn.ch>
- use 0xffff instead of GENMASK
- Document @supported_eee
- use "()" with function name in comments

changes v5:
- spell fixes
- move part of genphy_c45_read_eee_abilities() to
  genphy_c45_read_eee_cap1()
- validate MDIO_PCS_EEE_ABLE register against 0xffff val.
- rename *eee_100_10000* to *eee_cap1*
- use linkmode_intersects(phydev->supported, PHY_EEE_CAP1_FEATURES)
  instead of !linkmode_empty()
- add documentation to linkmode/register helpers

changes v4:
- remove following helpers:
  mmd_eee_cap_to_ethtool_sup_t
  mmd_eee_adv_to_ethtool_adv_t
  ethtool_adv_to_mmd_eee_adv_t
  and port drivers from this helpers to linkmode helpers.
- rebase against latest net-next
- port phy_init_eee() to genphy_c45_eee_is_active()

changes v3:
- rework some parts of EEE infrastructure and move it to c45 code.
- add supported_eee storage and start using it in EEE code and by the
  micrel driver.
- add EEE support for ar8035 PHY
- add SmartEEE support to FEC i.MX series.

changes v2:
- use phydev->supported instead of reading MII_BMSR regiaster
- fix @get_eee > @set_eee

With this patch series we provide EEE control for KSZ9477 family of
switches and
AR8035 with i.MX6 configuration.
According to my tests, on a system with KSZ8563 switch and 100Mbit idle
link,
we consume 0,192W less power per port if EEE is enabled.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: start using genphy_c45_ethtool_get/set_eee()
Oleksij Rempel [Sat, 11 Feb 2023 07:41:13 +0000 (08:41 +0100)]
net: phy: start using genphy_c45_ethtool_get/set_eee()

All preparations are done. Now we can start using new functions and remove
the old code.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: migrate phy_init_eee() to genphy_c45_eee_is_active()
Oleksij Rempel [Sat, 11 Feb 2023 07:41:12 +0000 (08:41 +0100)]
net: phy: migrate phy_init_eee() to genphy_c45_eee_is_active()

Reduce code duplicated by migrating phy_init_eee() to
genphy_c45_eee_is_active().

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: c45: migrate to genphy_c45_write_eee_adv()
Oleksij Rempel [Sat, 11 Feb 2023 07:41:11 +0000 (08:41 +0100)]
net: phy: c45: migrate to genphy_c45_write_eee_adv()

Migrate from genphy_config_eee_advert() to genphy_c45_write_eee_adv().

It should work as before except write operation to the EEE adv registers
will be done only if some EEE abilities was detected.

If some driver will have a regression, related driver should provide own
.get_features callback. See micrel.c:ksz9477_get_features() as example.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: c22: migrate to genphy_c45_write_eee_adv()
Oleksij Rempel [Sat, 11 Feb 2023 07:41:10 +0000 (08:41 +0100)]
net: phy: c22: migrate to genphy_c45_write_eee_adv()

Migrate from genphy_config_eee_advert() to genphy_c45_write_eee_adv().

It should work as before except write operation to the EEE adv registers
will be done only if some EEE abilities was detected.

If some driver will have a regression, related driver should provide own
.get_features callback. See micrel.c:ksz9477_get_features() as example.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: add genphy_c45_ethtool_get/set_eee() support
Oleksij Rempel [Sat, 11 Feb 2023 07:41:09 +0000 (08:41 +0100)]
net: phy: add genphy_c45_ethtool_get/set_eee() support

Add replacement for phy_ethtool_get/set_eee() functions.

Current phy_ethtool_get/set_eee() implementation is great and it is
possible to make it even better:
- this functionality is for devices implementing parts of IEEE 802.3
  specification beyond Clause 22. The better place for this code is
  phy-c45.c
- currently it is able to do read/write operations on PHYs with
  different abilities to not existing registers. It is better to
  use stored supported_eee abilities to avoid false read/write
  operations.
- the eee_active detection will provide wrong results on not supported
  link modes. It is better to validate speed/duplex properties against
  supported EEE link modes.
- it is able to support only limited amount of link modes. We have more
  EEE link modes...

By refactoring this code I address most of this point except of the last
one. Adding additional EEE link modes will need more work.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>