]> www.infradead.org Git - users/hch/uuid.git/log
users/hch/uuid.git
3 years agodt-bindings: net: xgmac_mdio: Add "clock-frequency" and "suppress-preamble"
Tobias Waldekranz [Wed, 26 Jan 2022 16:05:43 +0000 (17:05 +0100)]
dt-bindings: net: xgmac_mdio: Add "clock-frequency" and "suppress-preamble"

The driver now supports the standard "clock-frequency" and
"suppress-preamble" properties, do document them in the binding
description.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/fsl: xgmac_mdio: Support setting the MDC frequency
Tobias Waldekranz [Wed, 26 Jan 2022 16:05:42 +0000 (17:05 +0100)]
net/fsl: xgmac_mdio: Support setting the MDC frequency

Support the standard "clock-frequency" attribute to set the generated
MDC frequency. If not specified, the driver will leave the divisor
bits untouched.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/fsl: xgmac_mdio: Support preamble suppression
Tobias Waldekranz [Wed, 26 Jan 2022 16:05:41 +0000 (17:05 +0100)]
net/fsl: xgmac_mdio: Support preamble suppression

Support the standard "suppress-preamble" attribute to disable preamble
generation.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/fsl: xgmac_mdio: Use managed device resources
Tobias Waldekranz [Wed, 26 Jan 2022 16:05:40 +0000 (17:05 +0100)]
net/fsl: xgmac_mdio: Use managed device resources

All of the resources used by this driver has managed interfaces, so
use them. Heed the warning in the comment before platform_get_resource
and use a bare devm_ioremap to allow for non-exclusive access to the
IO memory.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodt-bindings: net: xgmac_mdio: Remove unsupported "bus-frequency"
Tobias Waldekranz [Wed, 26 Jan 2022 16:05:39 +0000 (17:05 +0100)]
dt-bindings: net: xgmac_mdio: Remove unsupported "bus-frequency"

This property has never been supported by the driver. The kernel has
settled on "clock-frequency" as the standard name for this binding, so
once that is supported we will document that instead.

Fixes: 7f93c9d90f4d ("power/fsl: add MDIO dt binding for FMan")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoipv4: Namespaceify min_adv_mss sysctl knob
xu xin [Wed, 26 Jan 2022 07:10:58 +0000 (07:10 +0000)]
ipv4: Namespaceify min_adv_mss sysctl knob

Different netns has different requirement on the setting of min_adv_mss
sysctl which the advertised MSS will be never lower than.

Enable min_adv_mss to be configured per network namespace.

Signed-off-by: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'mvneta-mac_select_pcs'
David S. Miller [Thu, 27 Jan 2022 13:31:03 +0000 (13:31 +0000)]
Merge branch 'mvneta-mac_select_pcs'

Russell King says:

====================
net: mvneta: use .mac_select_pcs()

This series converts mvneta to use the .mac_select_pcs() like eventually
everything else will be. mvneta is slightly more involved because we
need to rearrange the initialisation first to ensure everything required
is initialised prior to phylink_create() being called.

Tested locally on SolidRun Clearfog.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: mvneta: use .mac_select_pcs() interface
Russell King (Oracle) [Tue, 25 Jan 2022 16:59:34 +0000 (16:59 +0000)]
net: mvneta: use .mac_select_pcs() interface

Convert mvneta to use the mac_select_interface rather than using
phylink_set_pcs(). The intention here is to unify the approach for
PCS and eventually remove phylink_set_pcs().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: mvneta: reorder initialisation
Russell King (Oracle) [Tue, 25 Jan 2022 16:59:29 +0000 (16:59 +0000)]
net: mvneta: reorder initialisation

Re-order the mvneta initialisation to move devm based resources and
easy setup earlier in the probe function. The primary reason for this
is to allow us to switch the driver to use phylink's mac_select_pcs()
callback.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'at803x-sfp-fiber'
David S. Miller [Thu, 27 Jan 2022 13:28:20 +0000 (13:28 +0000)]
Merge branch 'at803x-sfp-fiber'

Robert Hancock says:

====================
at803x fiber/SFP support

Add support for 1000Base-X fiber modes to the at803x PHY driver, as
well as support for connecting a downstream SFP cage.

Changes since v3:
-Renamed some constants with OHM suffix for clarity

Changes since v2:
-fixed tabs/spaces issue in one patch

Changes since v1:
-moved page selection to config_init so it is handled properly
after suspend/resume
-added explicit check for empty sfp_support bitmask
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: at803x: Support downstream SFP cage
Robert Hancock [Tue, 25 Jan 2022 16:54:10 +0000 (10:54 -0600)]
net: phy: at803x: Support downstream SFP cage

Add support for downstream SFP cages for AR8031 and AR8033. This is
primarily intended for fiber modules or direct-attach cables, however
copper modules which work in 1000Base-X mode may also function. Such
modules are allowed with a warning.

Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: at803x: add fiber support
Robert Hancock [Tue, 25 Jan 2022 16:54:09 +0000 (10:54 -0600)]
net: phy: at803x: add fiber support

Previously this driver always forced the copper page to be selected,
however for AR8031 in 100Base-FX or 1000Base-X modes, the fiber page
needs to be selected. Set the appropriate mode based on the hardware
mode_cfg strap selection.

Enable the appropriate interrupt bits to detect fiber-side link up
or down events.

Update config_aneg and read_status methods to use the appropriate
Clause 37 calls when fiber mode is in use.

Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: at803x: move page selection fix to config_init
Robert Hancock [Tue, 25 Jan 2022 16:54:08 +0000 (10:54 -0600)]
net: phy: at803x: move page selection fix to config_init

The fix to select the copper page on AR8031 was being done in the probe
function rather than config_init, so it would not be redone after resume
from suspend. Move this to config_init so it is always redone when
needed.

Fixes: c329e5afb42f ("net: phy: at803x: select correct page on config init")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agotcp: allocate tcp_death_row outside of struct netns_ipv4
Eric Dumazet [Wed, 26 Jan 2022 18:07:14 +0000 (10:07 -0800)]
tcp: allocate tcp_death_row outside of struct netns_ipv4

I forgot tcp had per netns tracking of timewait sockets,
and their sysctl to change the limit.

After 0dad4087a86a ("tcp/dccp: get rid of inet_twsk_purge()"),
whole struct net can be freed before last tw socket is freed.

We need to allocate a separate struct inet_timewait_death_row
object per netns.

tw_count becomes a refcount and gains associated debugging infrastructure.

BUG: KASAN: use-after-free in inet_twsk_kill+0x358/0x3c0 net/ipv4/inet_timewait_sock.c:46
Read of size 8 at addr ffff88807d5f9f40 by task kworker/1:7/3690

CPU: 1 PID: 3690 Comm: kworker/1:7 Not tainted 5.16.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events pwq_unbound_release_workfn
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 print_address_description.constprop.0.cold+0x8d/0x336 mm/kasan/report.c:255
 __kasan_report mm/kasan/report.c:442 [inline]
 kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
 inet_twsk_kill+0x358/0x3c0 net/ipv4/inet_timewait_sock.c:46
 call_timer_fn+0x1a5/0x6b0 kernel/time/timer.c:1421
 expire_timers kernel/time/timer.c:1466 [inline]
 __run_timers.part.0+0x67c/0xa30 kernel/time/timer.c:1734
 __run_timers kernel/time/timer.c:1715 [inline]
 run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1747
 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
 invoke_softirq kernel/softirq.c:432 [inline]
 __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
 sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1097
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
RIP: 0010:lockdep_unregister_key+0x1c9/0x250 kernel/locking/lockdep.c:6328
Code: 00 00 00 48 89 ee e8 46 fd ff ff 4c 89 f7 e8 5e c9 ff ff e8 09 cc ff ff 9c 58 f6 c4 02 75 26 41 f7 c4 00 02 00 00 74 01 fb 5b <5d> 41 5c 41 5d 41 5e 41 5f e9 19 4a 08 00 0f 0b 5b 5d 41 5c 41 5d
RSP: 0018:ffffc90004077cb8 EFLAGS: 00000206
RAX: 0000000000000046 RBX: ffff88807b61b498 RCX: 0000000000000001
RDX: dffffc0000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff888077027128 R08: 0000000000000001 R09: ffffffff8f1ea4fc
R10: fffffbfff1ff93ee R11: 000000000000af1e R12: 0000000000000246
R13: 0000000000000000 R14: ffffffff8ffc89b8 R15: ffffffff90157fb0
 wq_unregister_lockdep kernel/workqueue.c:3508 [inline]
 pwq_unbound_release_workfn+0x254/0x340 kernel/workqueue.c:3746
 process_one_work+0x9ac/0x1650 kernel/workqueue.c:2307
 worker_thread+0x657/0x1110 kernel/workqueue.c:2454
 kthread+0x2e9/0x3a0 kernel/kthread.c:377
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
 </TASK>

Allocated by task 3635:
 kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
 kasan_set_track mm/kasan/common.c:46 [inline]
 set_alloc_info mm/kasan/common.c:437 [inline]
 __kasan_slab_alloc+0x90/0xc0 mm/kasan/common.c:470
 kasan_slab_alloc include/linux/kasan.h:260 [inline]
 slab_post_alloc_hook mm/slab.h:732 [inline]
 slab_alloc_node mm/slub.c:3230 [inline]
 slab_alloc mm/slub.c:3238 [inline]
 kmem_cache_alloc+0x202/0x3a0 mm/slub.c:3243
 kmem_cache_zalloc include/linux/slab.h:705 [inline]
 net_alloc net/core/net_namespace.c:407 [inline]
 copy_net_ns+0x125/0x760 net/core/net_namespace.c:462
 create_new_namespaces+0x3f6/0xb20 kernel/nsproxy.c:110
 unshare_nsproxy_namespaces+0xc1/0x1f0 kernel/nsproxy.c:226
 ksys_unshare+0x445/0x920 kernel/fork.c:3048
 __do_sys_unshare kernel/fork.c:3119 [inline]
 __se_sys_unshare kernel/fork.c:3117 [inline]
 __x64_sys_unshare+0x2d/0x40 kernel/fork.c:3117
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

The buggy address belongs to the object at ffff88807d5f9a80
 which belongs to the cache net_namespace of size 6528
The buggy address is located 1216 bytes inside of
 6528-byte region [ffff88807d5f9a80ffff88807d5fb400)
The buggy address belongs to the page:
page:ffffea0001f57e00 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88807d5f9a80 pfn:0x7d5f8
head:ffffea0001f57e00 order:3 compound_mapcount:0 compound_pincount:0
memcg:ffff888070023001
flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000010200 ffff888010dd4f48 ffffea0001404e08 ffff8880118fd000
raw: ffff88807d5f9a80 0000000000040002 00000001ffffffff ffff888070023001
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 3634, ts 119694798460, free_ts 119693556950
 prep_new_page mm/page_alloc.c:2434 [inline]
 get_page_from_freelist+0xa72/0x2f50 mm/page_alloc.c:4165
 __alloc_pages+0x1b2/0x500 mm/page_alloc.c:5389
 alloc_pages+0x1aa/0x310 mm/mempolicy.c:2271
 alloc_slab_page mm/slub.c:1799 [inline]
 allocate_slab mm/slub.c:1944 [inline]
 new_slab+0x28a/0x3b0 mm/slub.c:2004
 ___slab_alloc+0x87c/0xe90 mm/slub.c:3018
 __slab_alloc.constprop.0+0x4d/0xa0 mm/slub.c:3105
 slab_alloc_node mm/slub.c:3196 [inline]
 slab_alloc mm/slub.c:3238 [inline]
 kmem_cache_alloc+0x35c/0x3a0 mm/slub.c:3243
 kmem_cache_zalloc include/linux/slab.h:705 [inline]
 net_alloc net/core/net_namespace.c:407 [inline]
 copy_net_ns+0x125/0x760 net/core/net_namespace.c:462
 create_new_namespaces+0x3f6/0xb20 kernel/nsproxy.c:110
 unshare_nsproxy_namespaces+0xc1/0x1f0 kernel/nsproxy.c:226
 ksys_unshare+0x445/0x920 kernel/fork.c:3048
 __do_sys_unshare kernel/fork.c:3119 [inline]
 __se_sys_unshare kernel/fork.c:3117 [inline]
 __x64_sys_unshare+0x2d/0x40 kernel/fork.c:3117
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
page last free stack trace:
 reset_page_owner include/linux/page_owner.h:24 [inline]
 free_pages_prepare mm/page_alloc.c:1352 [inline]
 free_pcp_prepare+0x374/0x870 mm/page_alloc.c:1404
 free_unref_page_prepare mm/page_alloc.c:3325 [inline]
 free_unref_page+0x19/0x690 mm/page_alloc.c:3404
 skb_free_head net/core/skbuff.c:655 [inline]
 skb_release_data+0x65d/0x790 net/core/skbuff.c:677
 skb_release_all net/core/skbuff.c:742 [inline]
 __kfree_skb net/core/skbuff.c:756 [inline]
 consume_skb net/core/skbuff.c:914 [inline]
 consume_skb+0xc2/0x160 net/core/skbuff.c:908
 skb_free_datagram+0x1b/0x1f0 net/core/datagram.c:325
 netlink_recvmsg+0x636/0xea0 net/netlink/af_netlink.c:1998
 sock_recvmsg_nosec net/socket.c:948 [inline]
 sock_recvmsg net/socket.c:966 [inline]
 sock_recvmsg net/socket.c:962 [inline]
 ____sys_recvmsg+0x2c4/0x600 net/socket.c:2632
 ___sys_recvmsg+0x127/0x200 net/socket.c:2674
 __sys_recvmsg+0xe2/0x1a0 net/socket.c:2704
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Memory state around the buggy address:
 ffff88807d5f9e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88807d5f9e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88807d5f9f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                           ^
 ffff88807d5f9f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88807d5fa000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: 0dad4087a86a ("tcp/dccp: get rid of inet_twsk_purge()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20220126180714.845362-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonfp: only use kdoc style comments for kdoc
Simon Horman [Wed, 26 Jan 2022 09:08:03 +0000 (10:08 +0100)]
nfp: only use kdoc style comments for kdoc

Update comments to only use kdoc style comments, starting with '/**',
for kdoc.

Flagged by ./scripts/kernel-doc

Signed-off-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20220126090803.5582-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ethernet: cortina: permit to set mac address in DT
Corentin Labbe [Tue, 25 Jan 2022 21:08:11 +0000 (21:08 +0000)]
net: ethernet: cortina: permit to set mac address in DT

Add ability of setting mac address in DT for cortina ethernet driver.

Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220125210811.54350-1-clabbe@baylibre.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonfp: flower: Use struct_size() helper in kmalloc()
Gustavo A. R. Silva [Tue, 25 Jan 2022 19:33:19 +0000 (13:33 -0600)]
nfp: flower: Use struct_size() helper in kmalloc()

Make use of the struct_size() helper instead of an open-coded version,
in order to avoid any potential type mistakes or integer overflows that,
in the worst scenario, could lead to heap overflows.

Also, address the following sparse warnings:
drivers/net/ethernet/netronome/nfp/flower/tunnel_conf.c:359:25: warning: using sizeof on a flexible structure

Link: https://github.com/KSPP/linux/issues/174
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Acked-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: spectrum_kvdl: Use struct_size() helper in kzalloc()
Gustavo A. R. Silva [Tue, 25 Jan 2022 17:01:28 +0000 (11:01 -0600)]
mlxsw: spectrum_kvdl: Use struct_size() helper in kzalloc()

Make use of the struct_size() helper instead of an open-coded version,
in order to avoid any potential type mistakes or integer overflows that,
in the worst scenario, could lead to heap overflows.

Also, address the following sparse warnings:
drivers/net/ethernet/mellanox/mlxsw/spectrum1_kvdl.c:229:24: warning: using sizeof on a flexible structure

Link: https://github.com/KSPP/linux/issues/174
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: enetc: use .mac_select_pcs() interface
Russell King (Oracle) [Tue, 25 Jan 2022 16:31:10 +0000 (16:31 +0000)]
net: enetc: use .mac_select_pcs() interface

Convert the PCS selection to use mac_select_pcs, which allows the PCS
to perform any validation it needs.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dpaa2-mac: use .mac_select_pcs() interface
Russell King (Oracle) [Tue, 25 Jan 2022 16:27:30 +0000 (16:27 +0000)]
net: dpaa2-mac: use .mac_select_pcs() interface

Convert dpaa2-mac to use the mac_select_pcs() interface rather than
using phylink_set_pcs(). The intention here is to unify the approach
for PCS and eventually to remove phylink_set_pcs().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'axienet-pcs-modernize'
David S. Miller [Wed, 26 Jan 2022 15:48:19 +0000 (15:48 +0000)]
Merge branch 'axienet-pcs-modernize'

Russell King says:

====================
net: axienet: modernise pcs implementation

These two patches modernise the Xilinx axienet PCS implementation to
use the phylink split PCS support.

The first patch adds split PCS support and makes use of the newly
introduced mac_select_pcs() function, which is the preferred way to
conditionally attach a PCS.

The second patch cleans up the use of mdiobus_write() since we now have
bus accessors for mdio devices.

There should be no functional change to the driver.

This series was previously sent CFT on the 16th December (message ID
Ybs1cdM3KUTsq4Vx@shell.armlinux.org.uk), and feedback addressed. CFT v2
sent 4th January (message ID YdQlI8gcVwg2sR+5@shell.armlinux.org.uk).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: axienet: replace mdiobus_write() with mdiodev_write()
Russell King (Oracle) [Tue, 25 Jan 2022 16:10:16 +0000 (16:10 +0000)]
net: axienet: replace mdiobus_write() with mdiodev_write()

Commit 0ebecb2644c8 ("net: mdio: Add helper functions for accessing
MDIO devices") added support for mdiodev accessor operations that
neatly wrap the mdiobus accessor operations. Since we are dealing with
a mdio device here, update the driver to use mdiodev_write().

Tested-by: Harini Katakam <harini.katakam@xilinx.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: axienet: convert to phylink_pcs
Russell King (Oracle) [Tue, 25 Jan 2022 16:10:11 +0000 (16:10 +0000)]
net: axienet: convert to phylink_pcs

Convert axienet to use the phylink_pcs layer, resulting in it no longer
being a legacy driver.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'bnxt_en-RTC'
David S. Miller [Wed, 26 Jan 2022 15:35:20 +0000 (15:35 +0000)]
Merge branch 'bnxt_en-RTC'

Michael Chan says:

====================
bnxt_en: Add RTC mode for PTP

This series adds Real Time Clock (RTC) mode for PTP timestamping.  In
RTC mode, the 64-bit time value is programmed into the NIC's PTP
hardware clock (PHC).  Prior to this, the PHC is running as a free
counter.  For example, in multi-function environment, we need to run
PTP in RTC mode.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt_en: Handle async event when the PHC is updated in RTC mode
Pavan Chebbi [Wed, 26 Jan 2022 04:40:13 +0000 (23:40 -0500)]
bnxt_en: Handle async event when the PHC is updated in RTC mode

In Multi-host environment, when the PHC is updated by one host,
an async message from firmware will be sent to other hosts.
Re-initialize the timecounter when the driver receives this
async message.

Cc: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt_en: Implement .adjtime() for PTP RTC mode
Pavan Chebbi [Wed, 26 Jan 2022 04:40:12 +0000 (23:40 -0500)]
bnxt_en: Implement .adjtime() for PTP RTC mode

The adjusted time is set in the PHC in RTC mode.  We also need to
update the snapshots ptp->current_time and ptp->old_time when the
time is adjusted.

Cc: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt_en: Add driver support to use Real Time Counter for PTP
Pavan Chebbi [Wed, 26 Jan 2022 04:40:11 +0000 (23:40 -0500)]
bnxt_en: Add driver support to use Real Time Counter for PTP

Add support for RTC mode if it is supported by firmware.  In RTC
mode, the PHC is set to the 64-bit clock.  Because the legacy interface
is 48-bit, the driver still has to keep track of the upper 16 bits and
handle the rollover.

Cc: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt_en: PTP: Refactor PTP initialization functions
Pavan Chebbi [Wed, 26 Jan 2022 04:40:10 +0000 (23:40 -0500)]
bnxt_en: PTP: Refactor PTP initialization functions

Making the ptp free and timecounter initialization code into separate
functions so that later patches can use them.

Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt_en: Update firmware interface to 1.10.2.73
Michael Chan [Wed, 26 Jan 2022 04:40:09 +0000 (23:40 -0500)]
bnxt_en: Update firmware interface to 1.10.2.73

The main changes are PTP support for RTC, additional NVM error codes,
backing store v2 firmware APIs.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'stmmac-PCS-modernize'
David S. Miller [Wed, 26 Jan 2022 11:20:37 +0000 (11:20 +0000)]
Merge branch 'stmmac-PCS-modernize'

Russell King says:

====================
net: stmmac/xpcs: modernise PCS support

This series updates xpcs and stmmac for the recent changes to phylink
to better support split PCS and to get rid of private MAC validation
functions.

This series is slightly more involved than other conversions as stmmac
has already had optional proper split PCS support.

The first six patches of this series were originally posted on 16th
December for CFT, and Wong Vee Khee reported his Intel Elkhart Lake
setup was fine the first six these. However, no tested-by was given.

The patches:

1) Provide a function to query the xpcs for the interface modes that
   are supported.

2) Populates the MAC capabilities and switches stmmac_validate() to use
   phylink_get_linkmodes(). We do not use phylink_generic_validate() yet
   as (a) we do not always have the supported interfaces populated, and
   (b) the existing code does not restrict based on interface. There
   should be no functional effect from this patch.

3) Populates phylink's supported interfaces from the xpcs when the xpcs
   is configured by firmware and also the firmware configured interface
   mode. Note: this will restrict stmmac to only supporting these
   interfaces modes - stmmac maintainers need to verify that this
   behaviour is acceptable.

4) stmmac_validate() tail-calls xpcs_validate(), but we don't need it to
   now that PCS have their own validation method. Convert stmmac and
   xpcs to use this method instead.

5) xpcs sets the poll field of phylink_pcs to true, meaning xpcs
   requires its status to be polled. There is no need to also set the
   phylink_config.pcs_poll. Remove this.

6) Switch to phylink_generic_validate(). This is probably the most
   contravertial change in this patch set as this will cause the MAC to
   restrict link modes based on the interface mode. From an inspection
   of the xpcs driver, this should be safe, as XPCS only further
   restricts the link modes to a subset of these (whether that is
   correct or not is not an issue I am addressing here.) For
   implementations that do not use xpcs, this is a more open question
   and needs feedback from stmmac maintainers.

7) Convert to use mac_select_pcs() rather than phylink_set_pcs() to set
   the PCS - the intention is to eventually remove phylink_set_pcs()
   once there are no more users of this.

v2: fix signoff and temporary warning in patch 4
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: use .mac_select_pcs() interface
Russell King (Oracle) [Wed, 26 Jan 2022 10:26:25 +0000 (10:26 +0000)]
net: stmmac: use .mac_select_pcs() interface

Convert stmmac to use the mac_select_pcs() interface rather than using
phylink_set_pcs(). The intention here is to unify the approach for PCS
and eventually to remove phylink_set_pcs().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: convert to phylink_generic_validate()
Russell King (Oracle) [Wed, 26 Jan 2022 10:26:20 +0000 (10:26 +0000)]
net: stmmac: convert to phylink_generic_validate()

Convert stmmac to use phylink_generic_validate() now that we have the
MAC capabilities and supported interfaces filled in, and we have the
PCS validation handled via the PCS operations.

Tested-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> # Intel EHL Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: remove phylink_config.pcs_poll usage
Russell King (Oracle) [Wed, 26 Jan 2022 10:26:15 +0000 (10:26 +0000)]
net: stmmac: remove phylink_config.pcs_poll usage

Phylink will use PCS polling whenever the PCS's poll member is set, so
setting phylink_config.pcs_poll as well is redundant.

Tested-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> # Intel EHL Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac/xpcs: convert to pcs_validate()
Russell King (Oracle) [Wed, 26 Jan 2022 10:26:10 +0000 (10:26 +0000)]
net: stmmac/xpcs: convert to pcs_validate()

stmmac explicitly calls the xpcs driver to validate the ethtool
linkmodes. This is no longer necessary as phylink now supports
validation through a PCS method. Convert both drivers to use this
new mechanism.

Tested-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> # Intel EHL
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: fill in supported_interfaces
Russell King (Oracle) [Wed, 26 Jan 2022 10:26:05 +0000 (10:26 +0000)]
net: stmmac: fill in supported_interfaces

Fill in phylink's supported_interfaces bitmap with the PHY interface
modes which can be used to talk to the PHY.

We indicate that the PHY interface mode passed in platform data is
always supported, as this is the initial mode passed into phylink.
When there is no PCS specified, we assume that this is the only mode
that is supported - indeed, the driver appears not to support dynamic
switching of interface types at present.

When a xpcs is present, it defines the PHY interface modes that the
stmmac driver can support. Request the supported interfaces from the
xpcs driver, and pass them to phylink.

Tested-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> # Intel EHL Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: convert to phylink_get_linkmodes()
Russell King (Oracle) [Wed, 26 Jan 2022 10:26:00 +0000 (10:26 +0000)]
net: stmmac: convert to phylink_get_linkmodes()

Add the MAC speed, duplex and pause capabilities to the phylink_config
structure, and switch stmmac_validate() to use phylink_get_linkmodes()
to generate the mask of supported ethtool link modes.

Tested-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> # Intel EHL Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: xpcs: add support for retrieving supported interface modes
Russell King (Oracle) [Wed, 26 Jan 2022 10:25:55 +0000 (10:25 +0000)]
net: xpcs: add support for retrieving supported interface modes

Add a function to the xpcs driver to retrieve the supported PHY
interface modes, which can be used by drivers to fill in phylink's
supported_interfaces mask.

We validate the interface bit index to ensure that it fits within the
bitmap as xpcs lists PHY_INTERFACE_MODE_MAX in an entry.

Tested-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> # Intel EHL Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'mlxsw-RJ45'
David S. Miller [Wed, 26 Jan 2022 11:15:42 +0000 (11:15 +0000)]
Merge branch 'mlxsw-RJ45'

Ido Schimmel says:

====================
mlxsw: Add RJ45 ports support

We are in the process of qualifying a new system that has RJ45 ports as
opposed to the transceiver modules (e.g., SFP, QSFP) present on all
existing systems.

This patchset adds support for these ports in mlxsw by adding a couple of
missing BaseT link modes and rejecting ethtool operations that are
specific to transceiver modules.

Patchset overview:

Patches #1-#3 are cleanups and preparations.

Patch #4 adds support for two new link modes.

Patches #5-#6 query and cache the port module's type (e.g., QSFP, RJ45)
during initialization.

Patches #7-#9 forbid ethtool operations that are invalid on RJ45 ports.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: core_env: Forbid module reset on RJ45 ports
Danielle Ratson [Wed, 26 Jan 2022 10:30:37 +0000 (12:30 +0200)]
mlxsw: core_env: Forbid module reset on RJ45 ports

Transceiver module reset through 'rst' field in PMAOS register is not
supported on RJ45 ports, so module reset should be rejected.

Therefore, before trying to access this field, validate the port module
type that was queried during initialization and return an error to user
space in case the port module type is RJ45 (twisted pair).

Output example:

 # ethtool --reset swp11 phy
 ETHTOOL_RESET 0x40
 Cannot issue ETHTOOL_RESET: Invalid argument
 $ dmesg
 mlxsw_spectrum 0000:03:00.0 swp11: Reset module is not supported on port module type

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: core_env: Forbid power mode set and get on RJ45 ports
Danielle Ratson [Wed, 26 Jan 2022 10:30:36 +0000 (12:30 +0200)]
mlxsw: core_env: Forbid power mode set and get on RJ45 ports

PMMP (Port Module Memory Map Properties) and MCION (Management Cable IO
and Notifications) registers are not supported on RJ45 ports, so setting
and getting power mode should be rejected.

Therefore, before trying to access those registers, validate the port
module type that was queried during initialization and return an error
to user space in case the port module type is RJ45 (twisted pair).

Set output example:

 # ethtool --set-module swp1 power-mode-policy auto
 netlink error: mlxsw_core: Power mode is not supported on port module type
 netlink error: Invalid argument

Get output example:

 $ ethtool --show-module swp11
 netlink error: mlxsw_core: Power mode is not supported on port module type
 netlink error: Invalid argument

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: core_env: Forbid getting module EEPROM on RJ45 ports
Danielle Ratson [Wed, 26 Jan 2022 10:30:35 +0000 (12:30 +0200)]
mlxsw: core_env: Forbid getting module EEPROM on RJ45 ports

MCIA (Management Cable Info Access) register is not supported on RJ45
ports, so getting module EEPROM should be rejected.

Therefore, before trying to access this register, validate the port
module type that was queried during initialization and return an error
to user space in case the port module type is RJ45 (twisted pair).

Examples for output when trying to get EEPROM module:

Using netlink:

 # ethtool -m swp1
 netlink error: mlxsw_core: EEPROM is not equipped on port module type
 netlink error: Invalid argument

Using IOCTL:

 # ethtool -m swp1
 Cannot get module EEPROM information: Invalid argument
 $ dmesg
 mlxsw_spectrum 0000:03:00.0 swp1: EEPROM is not equipped on port module type

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: core_env: Query and store port module's type during initialization
Danielle Ratson [Wed, 26 Jan 2022 10:30:34 +0000 (12:30 +0200)]
mlxsw: core_env: Query and store port module's type during initialization

Query and store port module's type during initialization so that it
could be later used to determine if certain configurations are allowed
based on the type.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: reg: Add Port Module Type Mapping register
Danielle Ratson [Wed, 26 Jan 2022 10:30:33 +0000 (12:30 +0200)]
mlxsw: reg: Add Port Module Type Mapping register

Add the Port Module Type Mapping (PMTP) register. It will be used by
subsequent patches to query port module types and forbid certain
configurations based on the port module's type.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: spectrum_ethtool: Add support for two new link modes
Danielle Ratson [Wed, 26 Jan 2022 10:30:32 +0000 (12:30 +0200)]
mlxsw: spectrum_ethtool: Add support for two new link modes

As part of a process for supporting a new system with RJ45 connectors,
100BaseT and 1000BaseT link modes need to be supported.

Add support for these two link modes by adding the two corresponding
bits in PTYS (Port Type and Speed) register.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: Add netdev argument to mlxsw_env_get_module_info()
Danielle Ratson [Wed, 26 Jan 2022 10:30:31 +0000 (12:30 +0200)]
mlxsw: Add netdev argument to mlxsw_env_get_module_info()

The next patches will forbid querying the port module's EEPROM info when
its type is RJ45 as in this case no transceiver module can ever be
connected to the port.

Add netdev argument to mlxsw_env_get_module_info() so it could be used
to print an error to the kernel log via netdev_err().

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: core_env: Do not pass number of modules as argument
Ido Schimmel [Wed, 26 Jan 2022 10:30:30 +0000 (12:30 +0200)]
mlxsw: core_env: Do not pass number of modules as argument

The number of modules can be resolved from the first argument, so do not
pass it.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: spectrum_ethtool: Remove redundant variable
Ido Schimmel [Wed, 26 Jan 2022 10:30:29 +0000 (12:30 +0200)]
mlxsw: spectrum_ethtool: Remove redundant variable

Remove the 'err' variable and simply return.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: Adjust sk_gso_max_size once when set
David Ahern [Tue, 25 Jan 2022 02:45:11 +0000 (19:45 -0700)]
net: Adjust sk_gso_max_size once when set

sk_gso_max_size is set based on the dst dev. Both users of it
adjust the value by the same offset - (MAX_TCP_HEADER + 1). Rather
than compute the same adjusted value on each call do the adjustment
once when set.

Signed-off-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220125024511.27480-1-dsahern@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: tulip: remove redundant assignment to variable new_csr6
Colin Ian King [Sun, 23 Jan 2022 18:34:40 +0000 (18:34 +0000)]
net: tulip: remove redundant assignment to variable new_csr6

Variable new_csr6 is being initialized with a value that is never
read, it is being re-assigned later on. The assignment is redundant
and can be removed.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20220123183440.112495-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoipv6: gro: flush instead of assuming different flows on hop_limit mismatch
Jakub Kicinski [Tue, 25 Jan 2022 04:44:44 +0000 (20:44 -0800)]
ipv6: gro: flush instead of assuming different flows on hop_limit mismatch

IPv6 GRO considers packets to belong to different flows when their
hop_limit is different. This seems counter-intuitive, the flow is
the same. hop_limit may vary because of various bugs or hacks but
that doesn't mean it's okay for GRO to reorder packets.

Practical impact of this problem on overall TCP performance
is unclear, but TCP itself detects this reordering and bumps
TCPSACKReorder resulting in user complaints.

Eric warns that there may be performance regressions in setups
which do packet spraying across links with similar RTT but different
hop count. To be safe let's target -next and not treat this
as a fix. If the packet spraying is using flow label there should
be no difference in behavior as flow label is checked first.

Note that the code plays an easy to miss trick by upcasting next_hdr
to a u16 pointer and compares next_hdr and hop_limit in one go.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: mana: Use struct_size() helper in mana_gd_create_dma_region()
Gustavo A. R. Silva [Mon, 24 Jan 2022 21:43:47 +0000 (15:43 -0600)]
net: mana: Use struct_size() helper in mana_gd_create_dma_region()

Make use of the struct_size() helper instead of an open-coded version,
in order to avoid any potential type mistakes or integer overflows that,
in the worst scenario, could lead to heap overflows.

Also, address the following sparse warnings:
drivers/net/ethernet/microsoft/mana/gdma_main.c:677:24: warning: using sizeof on a flexible structure

Link: https://github.com/KSPP/linux/issues/174
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agor8169: use new PM macros
Heiner Kallweit [Mon, 24 Jan 2022 21:17:17 +0000 (22:17 +0100)]
r8169: use new PM macros

This is based on series [0] that extended the PM core. Now the compiler
can see the PM callbacks also on systems not defining CONFIG_PM.
The optimizer will remove the functions then in this case.

[0] https://lore.kernel.org/netdev/20211207002102.26414-1-paul@crapouillou.net/

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'dsa-avoid-cross-chip-vlan-sync'
David S. Miller [Tue, 25 Jan 2022 11:45:39 +0000 (11:45 +0000)]
Merge branch 'dsa-avoid-cross-chip-vlan-sync'

Tobias Waldekranz says:

====================
net: dsa: Avoid cross-chip syncing of VLAN filtering

This bug has been latent in the source for quite some time, I suspect
due to the homogeneity of both typical configurations and hardware.

On singlechip systems, this would never be triggered. The only reason
I saw it on my multichip system was because not all chips had the same
number of ports, which means that the misdemeanor alien call turned
into a felony array-out-of-bounds access.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: Avoid cross-chip syncing of VLAN filtering
Tobias Waldekranz [Mon, 24 Jan 2022 21:09:44 +0000 (22:09 +0100)]
net: dsa: Avoid cross-chip syncing of VLAN filtering

Changes to VLAN filtering are not applicable to cross-chip
notifications.

On a system like this:

.-----.   .-----.   .-----.
| sw1 +---+ sw2 +---+ sw3 |
'-1-2-'   '-1-2-'   '-1-2-'

Before this change, upon sw1p1 leaving a bridge, a call to
dsa_port_vlan_filtering would also be made to sw2p1 and sw3p1.

In this scenario:

.---------.   .-----.   .-----.
|   sw1   +---+ sw2 +---+ sw3 |
'-1-2-3-4-'   '-1-2-'   '-1-2-'

When sw1p4 would leave a bridge, dsa_port_vlan_filtering would be
called for sw2 and sw3 with a non-existing port - leading to array
out-of-bounds accesses and crashes on mv88e6xxx.

Fixes: d371b7c92d19 ("net: dsa: Unset vlan_filtering when ports leave the bridge")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: Move VLAN filtering syncing out of dsa_switch_bridge_leave
Tobias Waldekranz [Mon, 24 Jan 2022 21:09:43 +0000 (22:09 +0100)]
net: dsa: Move VLAN filtering syncing out of dsa_switch_bridge_leave

Most of dsa_switch_bridge_leave was, in fact, dealing with the syncing
of VLAN filtering for switches on which that is a global
setting. Separate the two phases to prepare for the cross-chip related
bugfix in the following commit.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'netns-speedup-dismantle'
David S. Miller [Tue, 25 Jan 2022 11:25:22 +0000 (11:25 +0000)]
Merge branch 'netns-speedup-dismantle'

Eric Dumazet says:

====================
netns: speedup netns dismantles

netns are dismantled by a single thread, from cleanup_net()

On hosts with many TCP sockets, and/or many cpus, this thread
is spending too many cpu cycles, and can not keep up with some
workloads.

- Removing 3*num_possible_cpus() sockets per netns, for icmp and tcp protocols.
- Iterating over all TCP sockets to remove stale timewait sockets.

This patch series removes ~50% of cleanup_net() cpu costs on
hosts with 256 cpus. It also reduces per netns memory footprint.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoipv4/tcp: do not use per netns ctl sockets
Eric Dumazet [Mon, 24 Jan 2022 20:24:57 +0000 (12:24 -0800)]
ipv4/tcp: do not use per netns ctl sockets

TCP ipv4 uses per-cpu/per-netns ctl sockets in order to send
RST and some ACK packets (on behalf of TIMEWAIT sockets).

This adds memory and cpu costs, which do not seem needed.
Now typical servers have 256 or more cores, this adds considerable
tax to netns users.

tcp sockets are used from BH context, are not receiving packets,
and do not store any persistent state but the 'struct net' pointer
in order to be able to use IPv4 output functions.

Note that I attempted a related change in the past, that had
to be hot-fixed in commit bdbbb8527b6f ("ipv4: tcp: get rid of ugly unicast_sock")

This patch could very well surface old bugs, on layers not
taking care of sk->sk_kern_sock properly.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoipv6: do not use per netns icmp sockets
Eric Dumazet [Mon, 24 Jan 2022 20:24:56 +0000 (12:24 -0800)]
ipv6: do not use per netns icmp sockets

Back in linux-2.6.25 (commit 98c6d1b261e7 "[NETNS]: Make icmpv6_sk per namespace.",
we added private per-cpu/per-netns ipv6 icmp sockets.

This adds memory and cpu costs, which do not seem needed.
Now typical servers have 256 or more cores, this adds considerable
tax to netns users.

icmp sockets are used from BH context, are not receiving packets,
and do not store any persistent state but the 'struct net' pointer.

icmpv6_xmit_lock() already makes sure to lock the chosen per-cpu
socket.

This patch has a considerable impact on the number of netns
that the worker thread in cleanup_net() can dismantle per second,
because ip6mr_sk_done() is no longer called, meaning we no longer
acquire the rtnl mutex, competing with other threads adding new netns.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoipv4: do not use per netns icmp sockets
Eric Dumazet [Mon, 24 Jan 2022 20:24:55 +0000 (12:24 -0800)]
ipv4: do not use per netns icmp sockets

Back in linux-2.6.25 (commit 4a6ad7a141cb "[NETNS]: Make icmp_sk per namespace."),
we added private per-cpu/per-netns ipv4 icmp sockets.

This adds memory and cpu costs, which do not seem needed.
Now typical servers have 256 or more cores, this adds considerable
tax to netns users.

icmp sockets are used from BH context, are not receiving packets,
and do not store any persistent state but the 'struct net' pointer.

icmp_xmit_lock() already makes sure to lock the chosen per-cpu
socket.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agotcp/dccp: get rid of inet_twsk_purge()
Eric Dumazet [Mon, 24 Jan 2022 20:24:54 +0000 (12:24 -0800)]
tcp/dccp: get rid of inet_twsk_purge()

Prior patches in the series made sure tw_timer_handler()
can be fired after netns has been dismantled/freed.

We no longer have to scan a potentially big TCP ehash
table at netns dismantle.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agotcp/dccp: no longer use twsk_net(tw) from tw_timer_handler()
Eric Dumazet [Mon, 24 Jan 2022 20:24:53 +0000 (12:24 -0800)]
tcp/dccp: no longer use twsk_net(tw) from tw_timer_handler()

We will soon get rid of inet_twsk_purge().

This means that tw_timer_handler() might fire after
a netns has been dismantled/freed.

Instead of adding a function (and data structure) to find a netns
from tw->tw_net_cookie, just update the SNMP counters
a bit earlier, when the netns is known to be alive.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agotcp/dccp: add tw->tw_bslot
Eric Dumazet [Mon, 24 Jan 2022 20:24:52 +0000 (12:24 -0800)]
tcp/dccp: add tw->tw_bslot

We want to allow inet_twsk_kill() working even if netns
has been dismantled/freed, to get rid of inet_twsk_purge().

This patch adds tw->tw_bslot to cache the bind bucket slot
so that inet_twsk_kill() no longer needs to dereference twsk_net(tw)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'ionic-fw-recovery'
David S. Miller [Tue, 25 Jan 2022 11:15:09 +0000 (11:15 +0000)]
Merge branch 'ionic-fw-recovery'

Shannon Nelson says:

====================
ionic: updates for stable FW recovery

Recent FW work has tightened up timings in its error recovery
handling and uncovered weaknesses in the driver's responses,
so this is a set of updates primarily for better handling of
the firmware's recovery mechanisms.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: replace set_vf data with union
Shannon Nelson [Mon, 24 Jan 2022 18:53:12 +0000 (10:53 -0800)]
ionic: replace set_vf data with union

This (ab)use of a data buffer made some static code checkers
rather itchy, so we replace the a generic data buffer with
the union in the struct ionic_vf_setattr_cmd.

Fixes: fbb39807e9ae ("ionic: support sr-iov operations")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: stretch heartbeat detection
Shannon Nelson [Mon, 24 Jan 2022 18:53:11 +0000 (10:53 -0800)]
ionic: stretch heartbeat detection

The driver can be premature in detecting stalled firmware
when the heartbeat is not updated because the firmware can
occasionally take a long time (more than 2 seconds) to service
a request, and doesn't update the heartbeat during that time.

The firmware heartbeat is not necessarily a steady 1 second
periodic beat, but better described as something that should
progress at least once in every DECVMD_TIMEOUT period.
The single-threaded design in the FW means that if a devcmd
or adminq request launches a large internal job, it is stuck
waiting for that job to finish before it can get back to
updating the heartbeat.  Since all requests are "guaranteed"
to finish within the DEVCMD_TIMEOUT period, the driver needs
to less aggressive in checking the heartbeat progress.

We change our current 2 second window to something bigger than
DEVCMD_TIMEOUT which should take care of most of the issue.
We stop checking for the heartbeat while waiting for a request,
as long as we're still watching for the FW status.  Lastly,
we make sure our FW status is up to date before running a
devcmd request.

Once we do this, we need to not check the heartbeat on DEV
commands because it may be stalled while we're on the fw_down
path.  Instead, we can rely on the is_fw_running check.

Fixes: b2b9a8d7ed13 ("ionic: avoid races in ionic_heartbeat_check")
Signed-off-by: Brett Creeley <brett@pensando.io>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: remove the dbid_inuse bitmap
Shannon Nelson [Mon, 24 Jan 2022 18:53:10 +0000 (10:53 -0800)]
ionic: remove the dbid_inuse bitmap

The dbid_inuse bitmap is not useful in this driver so remove it.

Fixes: 6461b446f2a0 ("ionic: Add interrupts and doorbells")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: disable napi when ionic_lif_init() fails
Brett Creeley [Mon, 24 Jan 2022 18:53:09 +0000 (10:53 -0800)]
ionic: disable napi when ionic_lif_init() fails

When the driver is going through reset, it will eventually call
ionic_lif_init(), which does a lot of re-initialization. One
of the re-initialization steps is to setup the adminq and
enable napi for it.  If something breaks after this point
we can end up with a kernel NULL pointer dereference through
ionic_adminq_napi.

Fix this by making sure to call napi_disable() in the cleanup
path of ionic_lif_init().  This forces any pending napi contexts
to finish and prevents them from being recalled before deleting
the napi context.

Fixes: 77ceb68e29cc ("ionic: Add notifyq support")
Signed-off-by: Brett Creeley <brett@pensando.io>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: Cleanups in the Tx hotpath code
Brett Creeley [Mon, 24 Jan 2022 18:53:08 +0000 (10:53 -0800)]
ionic: Cleanups in the Tx hotpath code

Buffer DMA mapping happens in ionic_tx_map_skb() and this function is
called from ionic_tx() and ionic_tx_tso(). If ionic_tx_map_skb()
succeeds, but a failure is encountered later in ionic_tx() or
ionic_tx_tso() we aren't unmapping the buffers. This can be fixed in
ionic_tx() by changing functions it calls to return void because they
always return 0. For ionic_tx_tso(), there's an actual possibility that
we leave the buffers mapped, so fix this by introducing the helper
function ionic_tx_desc_unmap_bufs(). This function is also re-used
in ionic_tx_clean().

Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling")
Signed-off-by: Brett Creeley <brett@pensando.io>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: Prevent filter add/del err msgs when the device is not available
Brett Creeley [Mon, 24 Jan 2022 18:53:07 +0000 (10:53 -0800)]
ionic: Prevent filter add/del err msgs when the device is not available

Currently when a request for add/deleting a filter is made when
ionic_heartbeat_check() returns failure the driver will be overly
verbose about failures, especially when these are usually temporary
fails and the request will be retried later. An example of this is
a filter add when the FW is in the middle of resetting:

IONIC_CMD_RX_FILTER_ADD (31) failed: IONIC_RC_ERROR (-6)
rx_filter add failed: ADDR 01:80:c2:00:00:0e

Fix this by checking for -ENXIO and other error values on filter
request fails before printing the error message.  Add similar
checking to the delete filter code.

Fixes: f91958cc9622 ("ionic: tame the filter no space message")
Signed-off-by: Brett Creeley <brett@pensando.io>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: Query FW when getting VF info via ndo_get_vf_config
Brett Creeley [Mon, 24 Jan 2022 18:53:06 +0000 (10:53 -0800)]
ionic: Query FW when getting VF info via ndo_get_vf_config

Currently when an administrator configures a VF via ndo_set_vf*,
the driver will send the set command to FW and then update the
cached value.  The cached value is then used when reporting
VF info via ndo_get_vf_config.

A problem is that the VF info may have been updated between
the last ndo_set_vf* and ndo_get_vf_info commands via some
other method, i.e. a VF changes its MAC address (assuming it's
allowed to do so) and since this is all managed by the FW,
this new value won't be reflected in the PF's cache of values.

To fix this, update the driver to always get the latest VF
information by making use of the IONIC_CMD_VF_GETATTR dev
command. The FW may not support getting all the attributes for
IONIC_CMD_VF_GETATTR, so the driver will only update the cached
VF config members if their associated IONIC_CMD_VF_GETATTR
was successful. Otherwise the cached VF config members will
remain the same as what was set in ndo_set_vf*.

Fixes: fbb39807e9ae ("ionic: support sr-iov operations")
Signed-off-by: Brett Creeley <brett@pensando.io>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: Allow flexibility for error reporting on dev commands
Brett Creeley [Mon, 24 Jan 2022 18:53:05 +0000 (10:53 -0800)]
ionic: Allow flexibility for error reporting on dev commands

When dev commands fail, an error message will always be printed,
which may be overly alarming the to system administrators,
especially if the driver shouldn't be printing the error due
to some unsupported capability.

Similar to recent adminq request changes, we can update the
dev command interface with the ability to selectively print
error messages to allow the driver to prevent printing errors
that are expected.

Signed-off-by: Brett Creeley <brett@pensando.io>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: Correctly print AQ errors if completions aren't received
Brett Creeley [Mon, 24 Jan 2022 18:53:04 +0000 (10:53 -0800)]
ionic: Correctly print AQ errors if completions aren't received

Recent changes went into the driver to allow flexibility when
printing error messages. Unfortunately this had the unexpected
consequence of printing confusing messages like the following:

IONIC_CMD_RX_FILTER_ADD (31) failed: IONIC_RC_SUCCESS (-6)

In cases like this the completion of the admin queue command never
completes, so the completion status is 0, hence IONIC_RC_SUCCESS
is printed even though the command clearly failed. For example,
this could happen when the driver tries to add a filter and at
the same time the FW goes through a reset, so the AQ command
never completes.

Fix this by forcing the FW completion status to IONIC_RC_ERROR
in cases where we never get the completion.

Fixes: 8c9d956ab6fb ("ionic: allow adminq requests to override default error message")
Signed-off-by: Brett Creeley <brett@pensando.io>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: fix up printing of timeout error
Shannon Nelson [Mon, 24 Jan 2022 18:53:03 +0000 (10:53 -0800)]
ionic: fix up printing of timeout error

Make sure we print the TIMEOUT string if we had a timeout
error, rather than printing the wrong status.

Fixes: 8c9d956ab6fb ("ionic: allow adminq requests to override default error message")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: better handling of RESET event
Shannon Nelson [Mon, 24 Jan 2022 18:53:02 +0000 (10:53 -0800)]
ionic: better handling of RESET event

When IONIC_EVENT_RESET is received, we only need to start the
fw_down process if we aren't already down, and we need to be
sure to set the FW_STOPPING state on the way.

If this is how we noticed that FW was stopped, it is most
likely from a FW update, and we'll see a new FW generation.
The update happens quickly enough that we might not see
fw_status==0, so we need to be sure things get restarted when
we see the fw_generation change.

Fixes: d2662072c094 ("ionic: monitor fw status generation")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: add FW_STOPPING state
Shannon Nelson [Mon, 24 Jan 2022 18:53:01 +0000 (10:53 -0800)]
ionic: add FW_STOPPING state

Between fw running and fw actually stopped into reset, we need
a fw_stopping concept to catch and block some actions while
we're transitioning to FW_RESET state.  This will help to be
sure the fw_up task is not scheduled until after the fw_down
task has completed.

On some rare occasion timing, it is possible for the fw_up task
to try to run before the fw_down task, then not get run after
the fw_down task has run, leaving the device in a down state.
This is possible if the watchdog goes off in between finding the
down transition and starting the fw_down task, where the later
watchdog sees the FW is back up and schedules a fw_up task.

Fixes: c672412f6172 ("ionic: remove lifs on fw reset")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: Don't send reset commands if FW isn't running
Brett Creeley [Mon, 24 Jan 2022 18:53:00 +0000 (10:53 -0800)]
ionic: Don't send reset commands if FW isn't running

It's possible the FW is already shutting down while the driver is being
removed and/or when the driver is going through reset. This can cause
unexpected/unnecessary errors to be printed:

eth0: DEV_CMD IONIC_CMD_PORT_RESET (12) error, IONIC_RC_ERROR (29) failed
eth1: DEV_CMD IONIC_CMD_RESET (3) error, IONIC_RC_ERROR (29) failed

Fix this by checking the FW status register before issuing the reset
commands.

Also, since err may not be assigned in ionic_port_reset(), assign it a
default value of 0, and remove an unnecessary log message.

Fixes: fbfb8031533c ("ionic: Add hardware init and device commands")
Signed-off-by: Brett Creeley <brett@pensando.io>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: separate function for watchdog init
Shannon Nelson [Mon, 24 Jan 2022 18:52:59 +0000 (10:52 -0800)]
ionic: separate function for watchdog init

Pull the watchdog init code out to a separate bite-sized
function.  Code cleaning for now, will be a useful change in
the near future.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: start watchdog after all is setup
Shannon Nelson [Mon, 24 Jan 2022 18:52:58 +0000 (10:52 -0800)]
ionic: start watchdog after all is setup

The watchdog expects the lif to fully exist when it goes off,
so lets not start the watchdog until all is ready in case there
is some quirky time dialation that makes probe take multiple
seconds.

Fixes: 089406bc5ad6 ("ionic: add a watchdog timer to monitor heartbeat")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: fix type complaint in ionic_dev_cmd_clean()
Shannon Nelson [Mon, 24 Jan 2022 18:52:57 +0000 (10:52 -0800)]
ionic: fix type complaint in ionic_dev_cmd_clean()

Sparse seems to have gotten a little more picky lately and
we need to revisit this bit of code to make sparse happy.

warning: incorrect type in initializer (different address spaces)
   expected union ionic_dev_cmd_regs *regs
   got union ionic_dev_cmd_regs [noderef] __iomem *dev_cmd_regs
warning: incorrect type in argument 2 (different address spaces)
   expected void [noderef] __iomem *
   got unsigned int *
warning: incorrect type in argument 1 (different address spaces)
   expected void volatile [noderef] __iomem *
   got union ionic_dev_cmd *

Fixes: d701ec326a31 ("ionic: clean up sparse complaints")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoipv4: get rid of fib_info_hash_{alloc|free}
Eric Dumazet [Mon, 24 Jan 2022 17:31:15 +0000 (09:31 -0800)]
ipv4: get rid of fib_info_hash_{alloc|free}

Use kvzalloc()/kvfree() instead of hand coded functions.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoip6_tunnel: allow routing IPv4 traffic in NBMA mode
Qing Deng [Sun, 23 Jan 2022 06:00:00 +0000 (14:00 +0800)]
ip6_tunnel: allow routing IPv4 traffic in NBMA mode

Since IPv4 routes support IPv6 gateways now, we can route IPv4 traffic in
NBMA tunnels.

Signed-off-by: Qing Deng <i@moy.cat>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: use bool values to pass bool param of phy_init_eee()
Jisheng Zhang [Sun, 23 Jan 2022 15:22:41 +0000 (23:22 +0800)]
net: use bool values to pass bool param of phy_init_eee()

The 2nd param of phy_init_eee(): clk_stop_enable is a bool param, use
true or false instead of 1/0.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220123152241.1480-1-jszhang@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: fec_ptp: remove redundant initialization of variable val
Colin Ian King [Sun, 23 Jan 2022 18:49:36 +0000 (18:49 +0000)]
net: fec_ptp: remove redundant initialization of variable val

Variable val is being initialized with a value that is never read,
it is being re-assigned later. The assignment is redundant and
can be removed.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Link: https://lore.kernel.org/r/20220123184936.113486-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: usb: asix: remove redundant assignment to variable reg
Colin Ian King [Sun, 23 Jan 2022 18:40:35 +0000 (18:40 +0000)]
net: usb: asix: remove redundant assignment to variable reg

Variable reg is being masked however the variable is never read
after this. The assignment is redundant and can be removed.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20220123184035.112785-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Jakub Kicinski [Mon, 24 Jan 2022 23:42:28 +0000 (15:42 -0800)]
Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2022-01-24

We've added 80 non-merge commits during the last 14 day(s) which contain
a total of 128 files changed, 4990 insertions(+), 895 deletions(-).

The main changes are:

1) Add XDP multi-buffer support and implement it for the mvneta driver,
   from Lorenzo Bianconi, Eelco Chaudron and Toke Høiland-Jørgensen.

2) Add unstable conntrack lookup helpers for BPF by using the BPF kfunc
   infra, from Kumar Kartikeya Dwivedi.

3) Extend BPF cgroup programs to export custom ret value to userspace via
   two helpers bpf_get_retval() and bpf_set_retval(), from YiFei Zhu.

4) Add support for AF_UNIX iterator batching, from Kuniyuki Iwashima.

5) Complete missing UAPI BPF helper description and change bpf_doc.py script
   to enforce consistent & complete helper documentation, from Usama Arif.

6) Deprecate libbpf's legacy BPF map definitions and streamline XDP APIs to
   follow tc-based APIs, from Andrii Nakryiko.

7) Support BPF_PROG_QUERY for BPF programs attached to sockmap, from Di Zhu.

8) Deprecate libbpf's bpf_map__def() API and replace users with proper getters
   and setters, from Christy Lee.

9) Extend libbpf's btf__add_btf() with an additional hashmap for strings to
   reduce overhead, from Kui-Feng Lee.

10) Fix bpftool and libbpf error handling related to libbpf's hashmap__new()
    utility function, from Mauricio Vásquez.

11) Add support to BTF program names in bpftool's program dump, from Raman Shukhau.

12) Fix resolve_btfids build to pick up host flags, from Connor O'Brien.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (80 commits)
  selftests, bpf: Do not yet switch to new libbpf XDP APIs
  selftests, xsk: Fix rx_full stats test
  bpf: Fix flexible_array.cocci warnings
  xdp: disable XDP_REDIRECT for xdp frags
  bpf: selftests: add CPUMAP/DEVMAP selftests for xdp frags
  bpf: selftests: introduce bpf_xdp_{load,store}_bytes selftest
  net: xdp: introduce bpf_xdp_pointer utility routine
  bpf: generalise tail call map compatibility check
  libbpf: Add SEC name for xdp frags programs
  bpf: selftests: update xdp_adjust_tail selftest to include xdp frags
  bpf: test_run: add xdp_shared_info pointer in bpf_test_finish signature
  bpf: introduce frags support to bpf_prog_test_run_xdp()
  bpf: move user_size out of bpf_test_init
  bpf: add frags support to xdp copy helpers
  bpf: add frags support to the bpf_xdp_adjust_tail() API
  bpf: introduce bpf_xdp_get_buff_len helper
  net: mvneta: enable jumbo frames if the loaded XDP program support frags
  bpf: introduce BPF_F_XDP_HAS_FRAGS flag in prog_flags loading the ebpf program
  net: mvneta: add frags support to XDP_TX
  xdp: add frags support to xdp_return_{buff/frame}
  ...
====================

Link: https://lore.kernel.org/r/20220124221235.18993-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoselftests, bpf: Do not yet switch to new libbpf XDP APIs
Daniel Borkmann [Mon, 24 Jan 2022 22:01:28 +0000 (23:01 +0100)]
selftests, bpf: Do not yet switch to new libbpf XDP APIs

Revert commit 544356524dd6 ("selftests/bpf: switch to new libbpf XDP APIs")
for now given this will heavily conflict with 4b27480dcaa7 ("bpf/selftests:
convert xdp_link test to ASSERT_* macros") upon merge. Andrii agreed to redo
the conversion cleanly after trees merged.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
3 years agoMerge tag 'linux-can-fixes-for-5.17-20220124' of git://git.kernel.org/pub/scm/linux...
Jakub Kicinski [Mon, 24 Jan 2022 20:17:58 +0000 (12:17 -0800)]
Merge tag 'linux-can-fixes-for-5.17-20220124' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2022-01-24

The first patch updates the email address of Brian Silverman from his
former employer to his private address.

The next patch fixes DT bindings information for the tcan4x5x SPI CAN
driver.

The following patch targets the m_can driver and fixes the
introduction of FIFO bulk read support.

Another patch for the tcan4x5x driver, which fixes the max register
value for the regmap config.

The last patch for the flexcan driver marks the RX mailbox support for
the MCF5441X as support.

* tag 'linux-can-fixes-for-5.17-20220124' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: flexcan: mark RX via mailboxes as supported on MCF5441X
  can: tcan4x5x: regmap: fix max register value
  can: m_can: m_can_fifo_{read,write}: don't read or write from/to FIFO if length is 0
  dt-bindings: can: tcan4x5x: fix mram-cfg RX FIFO config
  mailmap: update email address of Brian Silverman
====================

Link: https://lore.kernel.org/r/20220124175955.3464134-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agocan: flexcan: mark RX via mailboxes as supported on MCF5441X
Marc Kleine-Budde [Fri, 21 Jan 2022 08:22:34 +0000 (09:22 +0100)]
can: flexcan: mark RX via mailboxes as supported on MCF5441X

Most flexcan IP cores support 2 RX modes:
- FIFO
- mailbox

The flexcan IP core on the MCF5441X cannot receive CAN RTR messages
via mailboxes. However the mailbox mode is more performant. The commit

1c45f5778a3b ("can: flexcan: add ethtool support to change rx-rtr setting during runtime")

added support to switch from FIFO to mailbox mode on these cores.

After testing the mailbox mode on the MCF5441X by Angelo Dureghello,
this patch marks it (without RTR capability) as supported. Further the
IP core overview table is updated, that RTR reception via mailboxes is
not supported.

Link: https://lore.kernel.org/all/20220121084425.3141218-1-mkl@pengutronix.de
Tested-by: Angelo Dureghello <angelo@kernel-space.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agocan: tcan4x5x: regmap: fix max register value
Marc Kleine-Budde [Fri, 14 Jan 2022 17:50:54 +0000 (18:50 +0100)]
can: tcan4x5x: regmap: fix max register value

The MRAM of the tcan4x5x has a size of 2K and starts at 0x8000. There
are no further registers in the tcan4x5x making 0x87fc the biggest
addressable register.

This patch fixes the max register value of the regmap config from
0x8ffc to 0x87fc.

Fixes: 6e1caaf8ed22 ("can: tcan4x5x: fix max register value")
Link: https://lore.kernel.org/all/20220119064011.2943292-1-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agocan: m_can: m_can_fifo_{read,write}: don't read or write from/to FIFO if length is 0
Marc Kleine-Budde [Fri, 14 Jan 2022 14:35:01 +0000 (15:35 +0100)]
can: m_can: m_can_fifo_{read,write}: don't read or write from/to FIFO if length is 0

In order to optimize FIFO access, especially on m_can cores attached
to slow busses like SPI, in patch

e39381770ec9 ("can: m_can: Disable IRQs on FIFO bus errors")

bulk read/write support has been added to the m_can_fifo_{read,write}
functions.

That change leads to the tcan driver to call
regmap_bulk_{read,write}() with a length of 0 (for CAN frames with 0
data length). regmap treats this as an error:

| tcan4x5x spi1.0 tcan4x5x0: FIFO write returned -22

This patch fixes the problem by not calling the
cdev->ops->{read,write)_fifo() in case of a 0 length read/write.

Fixes: e39381770ec9 ("can: m_can: Disable IRQs on FIFO bus errors")
Link: https://lore.kernel.org/all/20220114155751.2651888-1-mkl@pengutronix.de
Cc: stable@vger.kernel.org
Cc: Matt Kline <matt@bitbashing.io>
Cc: Chandrasekar Ramakrishnan <rcsekar@samsung.com>
Reported-by: Michael Anochin <anochin@photo-meter.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agodt-bindings: can: tcan4x5x: fix mram-cfg RX FIFO config
Marc Kleine-Budde [Fri, 14 Jan 2022 17:47:41 +0000 (18:47 +0100)]
dt-bindings: can: tcan4x5x: fix mram-cfg RX FIFO config

This tcan4x5x only comes with 2K of MRAM, a RX FIFO with a dept of 32
doesn't fit into the MRAM. Use a depth of 16 instead.

Fixes: 4edd396a1911 ("dt-bindings: can: tcan4x5x: Add DT bindings for TCAN4x5X driver")
Link: https://lore.kernel.org/all/20220119062951.2939851-1-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agomailmap: update email address of Brian Silverman
Marc Kleine-Budde [Mon, 10 Jan 2022 08:20:19 +0000 (09:20 +0100)]
mailmap: update email address of Brian Silverman

Brian Silverman's address at bluerivertech.com is not valid anymore,
use Brian's private email address instead.

Link: https://lore.kernel.org/all/20220110082359.2019735-1-mkl@pengutronix.de
Cc: Brian Silverman <bsilver16384@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agoselftests, xsk: Fix rx_full stats test
Magnus Karlsson [Fri, 21 Jan 2022 12:35:08 +0000 (13:35 +0100)]
selftests, xsk: Fix rx_full stats test

Fix the rx_full stats test so that it correctly reports pass even when
the fill ring is not full of buffers.

Fixes: 872a1184dbf2 ("selftests: xsk: Put the same buffer only once in the fill ring")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20220121123508.12759-1-magnus.karlsson@gmail.com
3 years agobpf: Fix flexible_array.cocci warnings
kernel test robot [Sat, 22 Jan 2022 11:09:44 +0000 (12:09 +0100)]
bpf: Fix flexible_array.cocci warnings

Zero-length and one-element arrays are deprecated, see:
Documentation/process/deprecated.rst

Flexible-array members should be used instead.

Generated by: scripts/coccinelle/misc/flexible_array.cocci

Fixes: c1ff181ffabc ("selftests/bpf: Extend kfunc selftests")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Julia Lawall <julia.lawall@inria.fr>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/alpine.DEB.2.22.394.2201221206320.12220@hadrien
3 years agonet: stmmac: remove unused members in struct stmmac_priv
Jisheng Zhang [Sun, 23 Jan 2022 12:27:58 +0000 (20:27 +0800)]
net: stmmac: remove unused members in struct stmmac_priv

The tx_coalesce and mii_irq are not used at all now, so remove them.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: atlantic: Use the bitmap API instead of hand-writing it
Christophe JAILLET [Sun, 23 Jan 2022 06:53:46 +0000 (07:53 +0100)]
net: atlantic: Use the bitmap API instead of hand-writing it

Simplify code by using bitmap_weight() and bitmap_zero() instead of
hand-writing these functions.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoping: fix the sk_bound_dev_if match in ping_lookup
Xin Long [Sat, 22 Jan 2022 11:40:56 +0000 (06:40 -0500)]
ping: fix the sk_bound_dev_if match in ping_lookup

When 'ping' changes to use PING socket instead of RAW socket by:

   # sysctl -w net.ipv4.ping_group_range="0 100"

the selftests 'router_broadcast.sh' will fail, as such command

  # ip vrf exec vrf-h1 ping -I veth0 198.51.100.255 -b

can't receive the response skb by the PING socket. It's caused by mismatch
of sk_bound_dev_if and dif in ping_rcv() when looking up the PING socket,
as dif is vrf-h1 if dif's master was set to vrf-h1.

This patch is to fix this regression by also checking the sk_bound_dev_if
against sdif so that the packets can stil be received even if the socket
is not bound to the vrf device but to the real iif.

Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind")
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/smc: Transitional solution for clcsock race issue
Wen Gu [Sat, 22 Jan 2022 09:43:09 +0000 (17:43 +0800)]
net/smc: Transitional solution for clcsock race issue

We encountered a crash in smc_setsockopt() and it is caused by
accessing smc->clcsock after clcsock was released.

 BUG: kernel NULL pointer dereference, address: 0000000000000020
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP PTI
 CPU: 1 PID: 50309 Comm: nginx Kdump: loaded Tainted: G E     5.16.0-rc4+ #53
 RIP: 0010:smc_setsockopt+0x59/0x280 [smc]
 Call Trace:
  <TASK>
  __sys_setsockopt+0xfc/0x190
  __x64_sys_setsockopt+0x20/0x30
  do_syscall_64+0x34/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f16ba83918e
  </TASK>

This patch tries to fix it by holding clcsock_release_lock and
checking whether clcsock has already been released before access.

In case that a crash of the same reason happens in smc_getsockopt()
or smc_switch_to_fallback(), this patch also checkes smc->clcsock
in them too. And the caller of smc_switch_to_fallback() will identify
whether fallback succeeds according to the return value.

Fixes: fd57770dd198 ("net/smc: wait for pending work before clcsock release_sock")
Link: https://lore.kernel.org/lkml/5dd7ffd1-28e2-24cc-9442-1defec27375e@linux.ibm.com/T/
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoibmvnic: remove unused ->wait_capability
Sukadev Bhattiprolu [Sat, 22 Jan 2022 02:59:21 +0000 (18:59 -0800)]
ibmvnic: remove unused ->wait_capability

With previous bug fix, ->wait_capability flag is no longer needed and can
be removed.

Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoibmvnic: don't spin in tasklet
Sukadev Bhattiprolu [Sat, 22 Jan 2022 02:59:20 +0000 (18:59 -0800)]
ibmvnic: don't spin in tasklet

ibmvnic_tasklet() continuously spins waiting for responses to all
capability requests. It does this to avoid encountering an error
during initialization of the vnic. However if there is a bug in the
VIOS and we do not receive a response to one or more queries the
tasklet ends up spinning continuously leading to hard lock ups.

If we fail to receive a message from the VIOS it is reasonable to
timeout the login attempt rather than spin indefinitely in the tasklet.

Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>