]> www.infradead.org Git - users/hch/dma-mapping.git/log
users/hch/dma-mapping.git
22 months agowifi: rtw89: mac: functions to configure hardware engine and quota for WiFi 7 chips
Ping-Ke Shih [Fri, 24 Nov 2023 07:17:03 +0000 (15:17 +0800)]
wifi: rtw89: mac: functions to configure hardware engine and quota for WiFi 7 chips

Add functions to configure HCI, DMAC (data MAC), DLE (data link engine),
HFC (HCI flow control), PLE (payload engine) and etc for WiFi 7 chips.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231124071703.132549-9-pkshih@realtek.com
22 months agowifi: rtw89: mac: use pointer to access functions of hardware engine and quota
Ping-Ke Shih [Fri, 24 Nov 2023 07:17:02 +0000 (15:17 +0800)]
wifi: rtw89: mac: use pointer to access functions of hardware engine and quota

To share flow with WiFi 7 chips, abstract functions related hardware
engines and their quota, so use pointer to access them. This doesn't change
logic at all.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231124071703.132549-8-pkshih@realtek.com
22 months agowifi: rtw89: mac: move code related to hardware engine to individual functions
Ping-Ke Shih [Fri, 24 Nov 2023 07:17:01 +0000 (15:17 +0800)]
wifi: rtw89: mac: move code related to hardware engine to individual functions

WiFi 7 chips will use the same functionalities but different registers to
control hardware components, so move these stuff into functions, and then
we can implement these for WiFi 7 chips later. This patch doesn't change
logic.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231124071703.132549-7-pkshih@realtek.com
22 months agowifi: rtw89: mac: check queue empty according to chip gen
Zong-Zhe Yang [Fri, 24 Nov 2023 07:17:00 +0000 (15:17 +0800)]
wifi: rtw89: mac: check queue empty according to chip gen

This function, currently called by WoWLAN flow, polls until specific HW
queues are empty. The polling bit definitions are not totally the same
between WiFi 6 and 7 chips. In addition, the check conditions are also
a little different. So, we differentiate the implementations according to
chip gen.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231124071703.132549-6-pkshih@realtek.com
22 months agowifi: rtw89: refine element naming used by queue empty check
Zong-Zhe Yang [Fri, 24 Nov 2023 07:16:59 +0000 (15:16 +0800)]
wifi: rtw89: refine element naming used by queue empty check

In queue empty check, one group contains 32 queues. And, the two elements,
wde_qempty_acq_num and wde_qempty_mgq_sel, are number of group and select
of group. To avoid confusing them with queue number and queue selection,
we refine their naming.

(don't change logic at all)

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231124071703.132549-5-pkshih@realtek.com
22 months agowifi: rtw89: add reserved size as factor of DLE used size
Ping-Ke Shih [Fri, 24 Nov 2023 07:16:58 +0000 (15:16 +0800)]
wifi: rtw89: add reserved size as factor of DLE used size

DLE stands for Double Link Engine that is used to maintain buffer page.
To avoid linking to wrong pages, we check the used page size during
initialization and stop driver probe if the used size is unexpected.

Currently, we check the page size used by PLE (payload engine) and WDE
(WiFi descriptor engine). For coming WiFi 7 chips, additional reserved
size is added for BB as buffer to run LA mode, so add and check the
reserved size as well.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231124071703.132549-4-pkshih@realtek.com
22 months agowifi: rtw89: mac: add to get DLE reserved quota
Ping-Ke Shih [Fri, 24 Nov 2023 07:16:57 +0000 (15:16 +0800)]
wifi: rtw89: mac: add to get DLE reserved quota

The reserved quota of DLE (data link engine) is used for processing next
packet. Add this to get quota number, and then WiFi 7 chips can use them.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231124071703.132549-3-pkshih@realtek.com
22 months agowifi: rtw89: 8922a: extend and add quota number
Ping-Ke Shih [Fri, 24 Nov 2023 07:16:56 +0000 (15:16 +0800)]
wifi: rtw89: 8922a: extend and add quota number

Define 8922A buffer quota that are used by HCI control flow, payload
engine, descriptor engine and etc for operation modes, such as SCC (single
channel concurrence) and download firmware. Since WiFi 7 chips has more
buffer classifications, add fields and struct according to design.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231124071703.132549-2-pkshih@realtek.com
22 months agowifi: iwlwifi: fw: replace deprecated strncpy with strscpy_pad
Justin Stitt [Thu, 19 Oct 2023 17:44:59 +0000 (17:44 +0000)]
wifi: iwlwifi: fw: replace deprecated strncpy with strscpy_pad

strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

Based on the deliberate `sizeof(dest) ... - 1` pattern we can see that
both dump_info->dev_human_readable and dump_info->bus_human_readable are
intended to be NUL-terminated.

Moreover, since this seems to cross the file boundary let's NUL-pad to
ensure no behavior change.

strscpy_pad() covers both the NUL-termination and NUL-padding, let's use
it.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231019-strncpy-drivers-net-wireless-intel-iwlwifi-fw-dbg-c-v2-1-179b211a374b@google.com
22 months agowifi: rtw89: debug: remove wrapper of rtw89_debug()
Ping-Ke Shih [Wed, 22 Nov 2023 06:04:58 +0000 (14:04 +0800)]
wifi: rtw89: debug: remove wrapper of rtw89_debug()

The wrapper of rtw89_debug() is unnecessary, so just remove it.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231122060458.30878-5-pkshih@realtek.com
22 months agowifi: rtw89: debug: add debugfs entry to disable dynamic mechanism
Ping-Ke Shih [Wed, 22 Nov 2023 06:04:57 +0000 (14:04 +0800)]
wifi: rtw89: debug: add debugfs entry to disable dynamic mechanism

A dynamic mechanism is usually an algorithm to adjust registers to adapt
to different environment every two seconds. In field, it could get
unexpected result, so we need to stop it and adjust registers manually, and
then fine tune the algorithm.

To stop mechanisms to assist debugging, add a debugfs entry shown as

  Disabled DM: 0x1
  [0] DYNAMIC_EDCCA: X

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231122060458.30878-4-pkshih@realtek.com
22 months agowifi: rtw89: phy: dynamically adjust EDCCA threshold
Yi-Chen Chen [Wed, 22 Nov 2023 06:04:56 +0000 (14:04 +0800)]
wifi: rtw89: phy: dynamically adjust EDCCA threshold

Add dynamic mechanism EDCCA (Energy Detection Clear Channel Assessment)
in track work. Using a fixed-value threshold will make EDCCA particularly
sensitive and cause failure to transmit under certain circumstances.
Therefore, the threshold is dynamically adjusted to make EDCCA suitable
for any situation.

However, in some cases, we will adjust the EDCCA threshold to the highest
level so that urgent transmissions can be performed successfully, such as
scanning.

Finally, in order to observe the EDCCA report in time, add the EDCCA perIC
register macro and EDCCA HW report analysis. EDCCA logs can be displayed
by using the EDCCA debug mask.

Signed-off-by: Yi-Chen Chen <jamie_chen@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231122060458.30878-3-pkshih@realtek.com
22 months agowifi: rtw89: debug: add to check if debug mask is enabled
Ping-Ke Shih [Wed, 22 Nov 2023 06:04:55 +0000 (14:04 +0800)]
wifi: rtw89: debug: add to check if debug mask is enabled

The coming dynamic mechanism of EDCCA adjustment will add a function to
dump registers to reflect status. However, if we are not debugging
the mechanism, we don't print anything, so avoid reading registers by
checking debug mask to reduce IO.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231122060458.30878-2-pkshih@realtek.com
22 months agowifi: rtlwifi: rtl8821ae: phy: fix an undefined bitwise shift behavior
Su Hui [Mon, 27 Nov 2023 01:35:13 +0000 (09:35 +0800)]
wifi: rtlwifi: rtl8821ae: phy: fix an undefined bitwise shift behavior

Clang static checker warns:

drivers/net/wireless/realtek/rtlwifi/rtl8821ae/phy.c:184:49:
The result of the left shift is undefined due to shifting by '32',
which is greater or equal to the width of type 'u32'.
[core.UndefinedBinaryOperatorResult]

If the value of the right operand is negative or is greater than or
equal to the width of the promoted left operand, the behavior is
undefined.[1][2]

For example, when using different gcc's compilation optimization options
(-O0 or -O2), the result of '(u32)data << 32' is different. One is 0, the
other is old value of data. Let _rtl8821ae_phy_calculate_bit_shift()'s
return value less than 32 to fix this problem. Warn if bitmask is zero.

[1] https://stackoverflow.com/questions/11270492/what-does-the-c-standard-say-about-bitshifting-more-bits-than-the-width-of-type
[2] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf

Fixes: 21e4b0726dc6 ("rtlwifi: rtl8821ae: Move driver from staging to regular tree")
Signed-off-by: Su Hui <suhui@nfschina.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231127013511.26694-2-suhui@nfschina.com
22 months agowifi: rtlwifi: rtl8821ae: phy: remove some useless code
Su Hui [Mon, 27 Nov 2023 01:35:11 +0000 (09:35 +0800)]
wifi: rtlwifi: rtl8821ae: phy: remove some useless code

Clang static checker warns:

Value stored to 'v1' is never read [deadcode.DeadStores]
Value stored to 'channel' is never read [deadcode.DeadStores]

Remove them to save some place.

Signed-off-by: Su Hui <suhui@nfschina.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231127013511.26694-1-suhui@nfschina.com
22 months agobcma: Use PCI_HEADER_TYPE_MASK instead of literal
Ilpo Järvinen [Fri, 24 Nov 2023 09:09:18 +0000 (11:09 +0200)]
bcma: Use PCI_HEADER_TYPE_MASK instead of literal

Replace literal 0x7f with PCI_HEADER_TYPE_MASK.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231124090919.23687-6-ilpo.jarvinen@linux.intel.com
22 months agowifi: libertas: fix config name in dependency for SDIO support
Lukas Bulwahn [Wed, 22 Nov 2023 08:30:47 +0000 (09:30 +0100)]
wifi: libertas: fix config name in dependency for SDIO support

Commit 4b478bf6bdd8 ("wifi: libertas: drop 16-bit PCMCIA support") reworks
the dependencies for config LIBERTAS, and adds alternative dependencies for
USB, SDIO and SPI.

The config option SDIO however does not exist in the kernel tree. It was
probably intended to refer to the config option MMC, which represents
"MMC/SD/SDIO card support" and is used as dependency by various other
drivers that use SDIO.

Fix the dependency to the config option MMC for declaring the requirement
on provision of SDIO support.

Fixes: 4b478bf6bdd8 ("wifi: libertas: drop 16-bit PCMCIA support")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231122083047.12774-1-lukas.bulwahn@gmail.com
22 months agowifi: rtw88: debug: remove wrapper of rtw_dbg()
Ping-Ke Shih [Wed, 22 Nov 2023 06:14:29 +0000 (14:14 +0800)]
wifi: rtw88: debug: remove wrapper of rtw_dbg()

Remove unnecessary wrapper of rtw_dbg(), and just call it directly.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231122061429.34487-1-pkshih@realtek.com
22 months agowifi: brcmfmac: Convert to platform remove callback returning void
Uwe Kleine-König [Fri, 17 Nov 2023 09:31:02 +0000 (10:31 +0100)]
wifi: brcmfmac: Convert to platform remove callback returning void

The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231117093056.873834-13-u.kleine-koenig@pengutronix.de
22 months agowifi: rt2x00: Simplify bool conversion
Yang Li [Wed, 15 Nov 2023 01:00:17 +0000 (09:00 +0800)]
wifi: rt2x00: Simplify bool conversion

./drivers/net/wireless/ralink/rt2x00/rt2800lib.c:1331:47-52: WARNING: conversion to bool not needed here
./drivers/net/wireless/ralink/rt2x00/rt2800lib.c:1332:47-52: WARNING: conversion to bool not needed here

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7531
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231115010017.112081-1-yang.lee@linux.alibaba.com
22 months agonet: page_pool: fix general protection fault in page_pool_unlist
Eric Dumazet [Thu, 30 Nov 2023 09:22:59 +0000 (09:22 +0000)]
net: page_pool: fix general protection fault in page_pool_unlist

syzbot was able to trigger a crash [1] in page_pool_unlist()

page_pool_list() only inserts a page pool into a netdev page pool list
if a netdev was set in params.

Even if the kzalloc() call in page_pool_create happens to initialize
pool->user.list, I chose to be more explicit in page_pool_list()
adding one INIT_HLIST_NODE().

We could test in page_pool_unlist() if netdev was set,
but since netdev can be changed to lo, it seems more robust to
check if pool->user.list is hashed  before calling hlist_del().

[1]

Illegal XDP return value 4294946546 on prog  (id 2) dev N/A, expect packet loss!
general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 PID: 5064 Comm: syz-executor391 Not tainted 6.7.0-rc2-syzkaller-00533-ga379972973a8 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
RIP: 0010:__hlist_del include/linux/list.h:988 [inline]
RIP: 0010:hlist_del include/linux/list.h:1002 [inline]
RIP: 0010:page_pool_unlist+0xd1/0x170 net/core/page_pool_user.c:342
Code: df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 90 00 00 00 4c 8b a3 f0 06 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 75 68 48 85 ed 49 89 2c 24 74 24 e8 1b ca 07 f9 48 8d
RSP: 0018:ffffc900039ff768 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: ffff88814ae02000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff88814ae026f0
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff1d57fdc
R10: ffffffff8eabfee3 R11: ffffffff8aa0008b R12: 0000000000000000
R13: ffff88814ae02000 R14: dffffc0000000000 R15: 0000000000000001
FS:  000055555717a380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000002555398 CR3: 0000000025044000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 __page_pool_destroy net/core/page_pool.c:851 [inline]
 page_pool_release+0x507/0x6b0 net/core/page_pool.c:891
 page_pool_destroy+0x1ac/0x4c0 net/core/page_pool.c:956
 xdp_test_run_teardown net/bpf/test_run.c:216 [inline]
 bpf_test_run_xdp_live+0x1578/0x1af0 net/bpf/test_run.c:388
 bpf_prog_test_run_xdp+0x827/0x1530 net/bpf/test_run.c:1254
 bpf_prog_test_run kernel/bpf/syscall.c:4041 [inline]
 __sys_bpf+0x11bf/0x4920 kernel/bpf/syscall.c:5402
 __do_sys_bpf kernel/bpf/syscall.c:5488 [inline]
 __se_sys_bpf kernel/bpf/syscall.c:5486 [inline]
 __x64_sys_bpf+0x78/0xc0 kernel/bpf/syscall.c:5486

Fixes: 083772c9f972 ("net: page_pool: record pools per netdev")
Reported-and-tested-by: syzbot+f9f8efb58a4db2ca98d0@syzkaller.appspotmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20231130092259.3797753-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoMerge branch 'net-ethernet-convert-to-platform-remove-callback-returning-void'
Paolo Abeni [Thu, 30 Nov 2023 12:04:15 +0000 (13:04 +0100)]
Merge branch 'net-ethernet-convert-to-platform-remove-callback-returning-void'

Uwe Kleine-König says:

====================
net: ethernet: Convert to platform remove callback returning void

in (implicit) v1 of this series
(https://lore.kernel.org/netdev/20231117091655.872426-1-u.kleine-koenig@pengutronix.de)
I tried to address the resource leaks in the three cpsw drivers. However
this is hard to get right without being able to test the changes. So
here comes a series that just converts all drivers below
drivers/net/ethernet to use .remove_new() and adds a comment about the
potential leaks for someone else to fix the problem.

See commit 5c5a7680e67b ("platform: Provide a remove callback that
returns no value") for an extended explanation and the eventual goal.
The TL;DR; is to prevent bugs like the three noticed here.

Note this series results in no change of behaviour apart from improving
the error message for the three cpsw drivers from

remove callback returned a non-zero value. This will be ignored.

to

Failed to resume device (-ESOMETHING)
====================

Link: https://lore.kernel.org/r/20231128173823.867512-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
22 months agonet: ethernet: ezchip: Convert to platform remove callback returning void
Uwe Kleine-König [Tue, 28 Nov 2023 17:38:28 +0000 (18:38 +0100)]
net: ethernet: ezchip: Convert to platform remove callback returning void

The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
22 months agonet: ethernet: ti: cpsw-new: Convert to platform remove callback returning void
Uwe Kleine-König [Tue, 28 Nov 2023 17:38:27 +0000 (18:38 +0100)]
net: ethernet: ti: cpsw-new: Convert to platform remove callback returning void

The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Replace the error path returning a non-zero value by an error message
and a comment that there is more to do. With that this patch results in
no change of behaviour in this driver apart from improving the error
message.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
22 months agonet: ethernet: ti: cpsw: Convert to platform remove callback returning void
Uwe Kleine-König [Tue, 28 Nov 2023 17:38:26 +0000 (18:38 +0100)]
net: ethernet: ti: cpsw: Convert to platform remove callback returning void

The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Replace the error path returning a non-zero value by an error message
and a comment that there is more to do. With that this patch results in
no change of behaviour in this driver apart from improving the error
message.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
22 months agonet: ethernet: ti: am65-cpsw: Convert to platform remove callback returning void
Uwe Kleine-König [Tue, 28 Nov 2023 17:38:25 +0000 (18:38 +0100)]
net: ethernet: ti: am65-cpsw: Convert to platform remove callback returning void

The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Replace the error path returning a non-zero value by an error message
and a comment that there is more to do. With that this patch results in
no change of behaviour in this driver apart from improving the error
message.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
22 months agoMerge branch 'devlink-warn-about-existing-entities-during-reload-reinit'
Paolo Abeni [Thu, 30 Nov 2023 11:31:39 +0000 (12:31 +0100)]
Merge branch 'devlink-warn-about-existing-entities-during-reload-reinit'

Jiri Pirko says:

====================
devlink: warn about existing entities during reload-reinit

Recently there has been a couple of attempts from drivers to block
devlink reload in certain situations. Turned out, the drivers do not
properly tear down ports and related netdevs during reload.

To address this, add couple of checks to be done during devlink reload
reinit action. Also, extend documentation to be more explicit.
====================

Link: https://lore.kernel.org/r/20231128115255.773377-1-jiri@resnulli.us
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
22 months agodevlink: warn about existing entities during reload-reinit
Jiri Pirko [Tue, 28 Nov 2023 11:52:55 +0000 (12:52 +0100)]
devlink: warn about existing entities during reload-reinit

During reload-reinit, all entities except for params, resources, regions
and health reporter should be removed and re-added. Add a warning to
be triggered in case the driver behaves differently.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
22 months agoDocumentation: devlink: extend reload-reinit description
Jiri Pirko [Tue, 28 Nov 2023 11:52:54 +0000 (12:52 +0100)]
Documentation: devlink: extend reload-reinit description

Be more explicit about devlink entities that may stay and that have to
be removed during reload reinit action.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
22 months agoMerge branch 'clean-up-and-refactor-cookie_v46_check'
Jakub Kicinski [Thu, 30 Nov 2023 04:16:42 +0000 (20:16 -0800)]
Merge branch 'clean-up-and-refactor-cookie_v46_check'

Kuniyuki Iwashima says:

====================
tcp: Clean up and refactor cookie_v[46]_check().

This is a preparation series for upcoming arbitrary SYN Cookie
support with BPF. [0]

There are slight differences between cookie_v[46]_check().  Such a
discrepancy caused an issue in the past, and BPF SYN Cookie support
will add more churn.

The primary purpose of this series is to clean up and refactor
cookie_v[46]_check() to minimise such discrepancies and make the
BPF series easier to review.

[0]: https://lore.kernel.org/netdev/20231121184245.69569-1-kuniyu@amazon.com/
v2: https://lore.kernel.org/netdev/20231125011638.72056-1-kuniyu@amazon.com/
v1: https://lore.kernel.org/netdev/20231123012521.62841-1-kuniyu@amazon.com/
====================

Link: https://lore.kernel.org/r/20231129022924.96156-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agotcp: Factorise cookie-dependent fields initialisation in cookie_v[46]_check()
Kuniyuki Iwashima [Wed, 29 Nov 2023 02:29:24 +0000 (18:29 -0800)]
tcp: Factorise cookie-dependent fields initialisation in cookie_v[46]_check()

We will support arbitrary SYN Cookie with BPF, and then kfunc at
TC will preallocate reqsk and initialise some fields that should
not be overwritten later by cookie_v[46]_check().

To simplify the flow in cookie_v[46]_check(), we move such fields'
initialisation to cookie_tcp_reqsk_alloc() and factorise non-BPF
SYN Cookie handling into cookie_tcp_check(), where we validate the
cookie and allocate reqsk, as done by kfunc later.

Note that we set ireq->ecn_ok in two steps, the latter of which will
be shared by the BPF case.  As cookie_ecn_ok() is one-liner, now
it's inlined.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231129022924.96156-9-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agotcp: Factorise cookie-independent fields initialisation in cookie_v[46]_check().
Kuniyuki Iwashima [Wed, 29 Nov 2023 02:29:23 +0000 (18:29 -0800)]
tcp: Factorise cookie-independent fields initialisation in cookie_v[46]_check().

We will support arbitrary SYN Cookie with BPF, and then some reqsk fields
are initialised in kfunc, and others are done in cookie_v[46]_check().

This patch factorises the common part as cookie_tcp_reqsk_init() and
calls it in cookie_tcp_reqsk_alloc() to minimise the discrepancy between
cookie_v[46]_check().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231129022924.96156-8-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agotcp: Move TCP-AO bits from cookie_v[46]_check() to tcp_ao_syncookie().
Kuniyuki Iwashima [Wed, 29 Nov 2023 02:29:22 +0000 (18:29 -0800)]
tcp: Move TCP-AO bits from cookie_v[46]_check() to tcp_ao_syncookie().

We initialise treq->af_specific in cookie_tcp_reqsk_alloc() so that
we can look up a key later in tcp_create_openreq_child().

Initially, that change was added for MD5 by commit ba5a4fdd63ae ("tcp:
make sure treq->af_specific is initialized"), but it has not been used
since commit d0f2b7a9ca0a ("tcp: Disable header prediction for MD5
flow.").

Now, treq->af_specific is used only by TCP-AO, so, we can move that
initialisation into tcp_ao_syncookie().

In addition to that, l3index in cookie_v[46]_check() is only used for
tcp_ao_syncookie(), so let's move it as well.

While at it, we move down tcp_ao_syncookie() in cookie_v4_check() so
that it will be called after security_inet_conn_request() to make
functions order consistent with cookie_v6_check().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231129022924.96156-7-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agotcp: Don't initialise tp->tsoffset in tcp_get_cookie_sock().
Kuniyuki Iwashima [Wed, 29 Nov 2023 02:29:21 +0000 (18:29 -0800)]
tcp: Don't initialise tp->tsoffset in tcp_get_cookie_sock().

When we create a full socket from SYN Cookie, we initialise
tcp_sk(sk)->tsoffset redundantly in tcp_get_cookie_sock() as
the field is inherited from tcp_rsk(req)->ts_off.

  cookie_v[46]_check
  |- treq->ts_off = 0
  `- tcp_get_cookie_sock
     |- tcp_v[46]_syn_recv_sock
     |  `- tcp_create_openreq_child
     |    `- newtp->tsoffset = treq->ts_off
     `- tcp_sk(child)->tsoffset = tsoff

Let's initialise tcp_rsk(req)->ts_off with the correct offset
and remove the second initialisation of tcp_sk(sk)->tsoffset.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231129022924.96156-6-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agotcp: Don't pass cookie to __cookie_v[46]_check().
Kuniyuki Iwashima [Wed, 29 Nov 2023 02:29:20 +0000 (18:29 -0800)]
tcp: Don't pass cookie to __cookie_v[46]_check().

tcp_hdr(skb) and SYN Cookie are passed to __cookie_v[46]_check(), but
none of the callers passes cookie other than ntohl(th->ack_seq) - 1.

Let's fetch it in __cookie_v[46]_check() instead of passing the cookie
over and over.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231129022924.96156-5-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agotcp: Clean up goto labels in cookie_v[46]_check().
Kuniyuki Iwashima [Wed, 29 Nov 2023 02:29:19 +0000 (18:29 -0800)]
tcp: Clean up goto labels in cookie_v[46]_check().

We will support arbitrary SYN Cookie with BPF, and then reqsk
will be preallocated before cookie_v[46]_check().

Depending on how validation fails, we send RST or just drop skb.

To make the error handling easier, let's clean up goto labels.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231129022924.96156-4-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agotcp: Cache sock_net(sk) in cookie_v[46]_check().
Kuniyuki Iwashima [Wed, 29 Nov 2023 02:29:18 +0000 (18:29 -0800)]
tcp: Cache sock_net(sk) in cookie_v[46]_check().

sock_net(sk) is used repeatedly in cookie_v[46]_check().
Let's cache it in a variable.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231129022924.96156-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agotcp: Clean up reverse xmas tree in cookie_v[46]_check().
Kuniyuki Iwashima [Wed, 29 Nov 2023 02:29:17 +0000 (18:29 -0800)]
tcp: Clean up reverse xmas tree in cookie_v[46]_check().

We will grow and cut the xmas tree in cookie_v[46]_check().
This patch cleans it up to make later patches tidy.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231129022924.96156-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: mana: Fix spelling mistake "enforecement" -> "enforcement"
Colin Ian King [Tue, 28 Nov 2023 09:53:04 +0000 (09:53 +0000)]
net: mana: Fix spelling mistake "enforecement" -> "enforcement"

There is a spelling mistake in struct field hc_tx_err_sqpdid_enforecement.
Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Link: https://lore.kernel.org/r/20231128095304.515492-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: dsa: sja1105: Use units.h instead of the copy of a definition
Andy Shevchenko [Tue, 28 Nov 2023 17:50:27 +0000 (19:50 +0200)]
net: dsa: sja1105: Use units.h instead of the copy of a definition

BYTES_PER_KBIT is defined in units.h, use that definition.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20231128175027.394754-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoMerge branch 'mptcp-more-selftest-coverage-and-code-cleanup-for-net-next'
Jakub Kicinski [Thu, 30 Nov 2023 04:10:27 +0000 (20:10 -0800)]
Merge branch 'mptcp-more-selftest-coverage-and-code-cleanup-for-net-next'

Mat Martineau says:

====================
mptcp: More selftest coverage and code cleanup for net-next

Patches 1-5 and 7-8 add selftest coverage (and an associated subflow
counter in the kernel) to validate the recently-updated handling of
subflows with ID 0.

Patch 6 renames a label in the userspace path manager for clarity.

Patches 9-11 and 13-15 factor out common selftest code by moving certain
functions to mptcp_lib.sh

Patch 12 makes sure the random data file generated for selftest
payloads has the intended size.

v3: https://lore.kernel.org/r/20231115-send-net-next-2023107-v3-0-1ef58145a882@kernel.org
v2: https://lore.kernel.org/r/20231114-send-net-next-2023107-v2-0-b650a477362c@kernel.org
v1: https://lore.kernel.org/r/20231027-send-net-next-2023107-v1-0-03eff9452957@kernel.org
====================

Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-0-8d6b94150f6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoselftests: mptcp: add mptcp_lib_wait_local_port_listen
Geliang Tang [Tue, 28 Nov 2023 23:18:59 +0000 (15:18 -0800)]
selftests: mptcp: add mptcp_lib_wait_local_port_listen

To avoid duplicated code in different MPTCP selftests, we can add
and use helpers defined in mptcp_lib.sh.

wait_local_port_listen() helper is defined in diag.sh, mptcp_connect.sh,
mptcp_join.sh and simult_flows.sh, export it into mptcp_lib.sh and
rename it with mptcp_lib_ prefix. Use this new helper in all these
scripts.

Note: We only have IPv4 connections in this helper, not looking at IPv6
(tcp6) but that's OK because we only have IPv4 connections here in diag.sh.

Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-15-8d6b94150f6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoselftests: mptcp: add mptcp_lib_check_transfer
Geliang Tang [Tue, 28 Nov 2023 23:18:58 +0000 (15:18 -0800)]
selftests: mptcp: add mptcp_lib_check_transfer

To avoid duplicated code in different MPTCP selftests, we can add
and use helpers defined in mptcp_lib.sh.

check_transfer() and print_file_err() helpers are defined both in
mptcp_connect.sh and mptcp_sockopt.sh, export them into mptcp_lib.sh
and rename them with mptcp_lib_ prefix. And use them in all scripts.

Note: In mptcp_sockopt.sh it is OK to drop 'ret=1' in check_transfer()
because it will be set in run_tests() anyway.

Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-14-8d6b94150f6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoselftests: mptcp: add mptcp_lib_make_file
Geliang Tang [Tue, 28 Nov 2023 23:18:57 +0000 (15:18 -0800)]
selftests: mptcp: add mptcp_lib_make_file

To avoid duplicated code in different MPTCP selftests, we can add
and use helpers defined in mptcp_lib.sh.

make_file() helper in mptcp_sockopt.sh and userspace_pm.sh are the same.
Export it into mptcp_lib.sh and rename it as mptcp_lib_kill_wait(). Use
it in both mptcp_connect.sh and mptcp_join.sh.

Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-13-8d6b94150f6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoselftests: mptcp: add missing oflag=append
Geliang Tang [Tue, 28 Nov 2023 23:18:56 +0000 (15:18 -0800)]
selftests: mptcp: add missing oflag=append

In mptcp_connect.sh we are missing something like "oflag=append"
because this will write "${rem}" bytes at the beginning of the file
where there is already some random bytes. It should write that at
the end.

This patch adds this missing 'oflag=append' flag for 'dd' command in
make_file().

Suggested-by: Matthieu Baerts <matttbe@kernel.org>
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-12-8d6b94150f6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoselftests: mptcp: add mptcp_lib_get_counter
Geliang Tang [Tue, 28 Nov 2023 23:18:55 +0000 (15:18 -0800)]
selftests: mptcp: add mptcp_lib_get_counter

To avoid duplicated code in different MPTCP selftests, we can add
and use helpers defined in mptcp_lib.sh.

The helper get_counter() in mptcp_join.sh and get_mib_counter() in
mptcp_connect.sh have the same functionality, export get_counter() into
mptcp_lib.sh and rename it as mptcp_lib_get_counter(). Use this new
helper instead of get_counter() and get_mib_counter().

Use this helper in test_prio() in userspace_pm.sh too instead of
open-coding.

Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-11-8d6b94150f6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoselftests: mptcp: add mptcp_lib_is_v6
Geliang Tang [Tue, 28 Nov 2023 23:18:54 +0000 (15:18 -0800)]
selftests: mptcp: add mptcp_lib_is_v6

To avoid duplicated code in different MPTCP selftests, we can add
and use helpers defined in mptcp_lib.sh.

is_v6() helper is defined in mptcp_connect.sh, mptcp_join.sh and
mptcp_sockopt.sh, so export it into mptcp_lib.sh and rename it as
mptcp_lib_is_v6(). Use this new helper in all scripts.

Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-10-8d6b94150f6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoselftests: mptcp: add mptcp_lib_kill_wait
Geliang Tang [Tue, 28 Nov 2023 23:18:53 +0000 (15:18 -0800)]
selftests: mptcp: add mptcp_lib_kill_wait

To avoid duplicated code in different MPTCP selftests, we can add
and use helpers defined in mptcp_lib.sh.

Export kill_wait() helper in userspace_pm.sh into mptcp_lib.sh and
rename it as mptcp_lib_kill_wait(). It can be used to instead of
kill_wait() in mptcp_join.sh. Use the new helper in both scripts.

Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-9-8d6b94150f6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoselftests: mptcp: userspace pm send RM_ADDR for ID 0
Geliang Tang [Tue, 28 Nov 2023 23:18:52 +0000 (15:18 -0800)]
selftests: mptcp: userspace pm send RM_ADDR for ID 0

This patch adds a selftest for userspace PM to remove id 0 address.

Use userspace_pm_add_addr() helper to add an id 10 address, then use
userspace_pm_rm_addr() helper to remove id 0 address.

Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-8-8d6b94150f6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoselftests: mptcp: userspace pm remove initial subflow
Geliang Tang [Tue, 28 Nov 2023 23:18:51 +0000 (15:18 -0800)]
selftests: mptcp: userspace pm remove initial subflow

This patch adds a selftest for userspace PM to remove the initial
subflow.

Use userspace_pm_add_sf() to add a subflow, and pass initial IP address
to userspace_pm_rm_sf() to remove the initial subflow.

Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-7-8d6b94150f6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomptcp: userspace pm rename remove_err to out
Geliang Tang [Tue, 28 Nov 2023 23:18:50 +0000 (15:18 -0800)]
mptcp: userspace pm rename remove_err to out

The value of 'err' will not be only '-EINVAL', but can be '0' in some
cases.

So it's better to rename the label 'remove_err' to 'out' to avoid
confusions.

Suggested-by: Matthieu Baerts <matttbe@kernel.org>
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-6-8d6b94150f6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoselftests: mptcp: userspace pm create id 0 subflow
Geliang Tang [Tue, 28 Nov 2023 23:18:49 +0000 (15:18 -0800)]
selftests: mptcp: userspace pm create id 0 subflow

This patch adds a selftest to create id 0 subflow. Pass id 0 to the
helper userspace_pm_add_sf() to create id 0 subflow. chk_mptcp_info
shows one subflow but chk_subflows_total shows two subflows in each
namespace.

Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-5-8d6b94150f6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoselftests: mptcp: update userspace pm test helpers
Geliang Tang [Tue, 28 Nov 2023 23:18:48 +0000 (15:18 -0800)]
selftests: mptcp: update userspace pm test helpers

This patch adds a new argument namespace to userspace_pm_add_addr() and
userspace_pm_add_sf() to make these two helper more versatile.

Add two more versatile helpers for userspace pm remove subflow or address:
userspace_pm_rm_addr() and userspace_pm_rm_sf(). The original test helpers
userspace_pm_rm_sf_addr_ns1() and userspace_pm_rm_sf_addr_ns2() can be
replaced by these new helpers.

Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-4-8d6b94150f6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoselftests: mptcp: add chk_subflows_total helper
Geliang Tang [Tue, 28 Nov 2023 23:18:47 +0000 (15:18 -0800)]
selftests: mptcp: add chk_subflows_total helper

This patch adds a new helper chk_subflows_total(), in it use the newly
added counter mptcpi_subflows_total to get the "correct" amount of
subflows, including the initial one.

To be compatible with old 'ss' or kernel versions not supporting this
counter, get the total subflows by listing TCP connections that are
MPTCP subflows:

    ss -ti state state established state syn-sent state syn-recv |
        grep -c tcp-ulp-mptcp.

Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-3-8d6b94150f6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoselftests: mptcp: add evts_get_info helper
Geliang Tang [Tue, 28 Nov 2023 23:18:46 +0000 (15:18 -0800)]
selftests: mptcp: add evts_get_info helper

This patch adds a new helper get_info_value(), using 'sed' command to
parse the value of the given item name in the line with the given keyword,
to make chk_mptcp_info() and pedit_action_pkts() more readable.

Also add another helper evts_get_info() to use get_info_value() to parse
the output of 'pm_nl_ctl events' command, to make all the userspace pm
selftests more readable, both in mptcp_join.sh and userspace_pm.sh.

Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-2-8d6b94150f6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomptcp: add mptcpi_subflows_total counter
Geliang Tang [Tue, 28 Nov 2023 23:18:45 +0000 (15:18 -0800)]
mptcp: add mptcpi_subflows_total counter

If the initial subflow has been removed, we cannot know without checking
other counters, e.g. ss -ti <filter> | grep -c tcp-ulp-mptcp or
getsockopt(SOL_MPTCP, MPTCP_FULL_INFO, ...) (or others except MPTCP_INFO
of course) and then check mptcp_subflow_data->num_subflows to get the
total amount of subflows.

This patch adds a new counter mptcpi_subflows_total in mptcpi_flags to
store the total amount of subflows, including the initial one. A new
helper __mptcp_has_initial_subflow() is added to check whether the
initial subflow has been removed or not. With this helper, we can then
compute the total amount of subflows from mptcp_info by doing something
like:

    mptcpi_subflows_total = mptcpi_subflows +
            __mptcp_has_initial_subflow(msk).

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/428
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-1-8d6b94150f6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoMerge branch 'mlxsw-support-cff-flood-mode'
Jakub Kicinski [Thu, 30 Nov 2023 04:03:27 +0000 (20:03 -0800)]
Merge branch 'mlxsw-support-cff-flood-mode'

Petr Machata says:

====================
mlxsw: Support CFF flood mode

The registers to configure to initialize a flood table differ between the
controlled and CFF flood modes. In therefore needs to be an op. Add it,
hook up the current init to the existing families, and invoke the op.

PGT is an in-HW table that maps addresses to sets of ports. Then when some
HW process needs a set of ports as an argument, instead of embedding the
actual set in the dynamic configuration, what gets configured is the
address referencing the set. The HW then works with the appropriate PGT
entry.

Among other allocations, the PGT currently contains two large blocks for
bridge flooding: one for 802.1q and one for 802.1d. Within each of these
blocks are three tables, for unknown-unicast, multicast and broadcast
flooding:

      . . . |    802.1q    |    802.1d    | . . .
            | UC | MC | BC | UC | MC | BC |
             \______ _____/ \_____ ______/
                    v             v
                   FID flood vectors

Thus each FID (which corresponds to an 802.1d bridge or one VLAN in an
802.1q bridge) uses three flood vectors spread across a fairly large region
of PGT.

This way of organizing the flood table (called "controlled") is not very
flexible. E.g. to decrease a bridge scale and store more IP MC vectors, one
would need to completely rewrite the bridge PGT blocks, or resort to hacks
such as storing individual MC flood vectors into unused part of the bridge
table.

In order to address these shortcomings, Spectrum-2 and above support what
is called CFF flood mode, for Compressed FID Flooding. In CFF flood mode,
each FID has a little table of its own, with three entries adjacent to each
other, one for unknown-UC, one for MC, one for BC. This allows for a much
more fine-grained approach to PGT management, where bits of it are
allocated on demand.

      . . . | FID | FID | FID | FID | FID | . . .
            |U|M|B|U|M|B|U|M|B|U|M|B|U|M|B|
             \_____________ _____________/
                           v
                   FID flood vectors

Besides the FID table organization, the CFF flood mode also impacts Router
Subport (RSP) table. This table contains flood vectors for rFIDs, which are
FIDs that reference front panel ports or LAGs. The RSP table contains two
entries per front panel port and LAG, one for unknown-UC traffic, and one
for everything else. Currently, the FW allocates and manages the table in
its own part of PGT. rFIDs are marked with flood_rsp bit and managed
specially. In CFF mode, rFIDs are managed as all other FIDs. The driver
therefore has to allocate and maintain the flood vectors. Like with bridge
FIDs, this is more work, but increases flexibility of the system.

The FW currently supports both the controlled and CFF flood modes. To shed
complexity, in the future it should only support CFF flood mode. Hence this
patchset, which adds CFF flood mode support to mlxsw.

Since mlxsw needs to maintain both the controlled mode as well as CFF mode
support, we will keep the layout as compatible as possible. The bridge
tables will stay in the same overall shape, just their inner organization
will change from flood mode -> FID to FID -> flood mode. Likewise will RSP
be kept as a contiguous block of PGT memory, as was the case when the FW
maintained it.

- The way FIDs get configured under the CFF flood mode differs from the
  currently used controlled mode. The simple approach of having several
  globally visible arrays for spectrum.c to statically choose from no
  longer works.

  Patch #1 thus privatizes all FID initialization and finalization logic,
  and exposes it as ops instead.

- Patch #2 renames the ops that are specific to the controlled mode, to
  make room in the namespace for the CFF variants.

  Patch #3 extracts a helper to compute flood table base out of
  mlxsw_sp_fid_flood_table_mid().

- The op fid_setup configured fid_offset, i.e. the number of this FID
  within its family. For rFIDs in CFF mode, to determine this number, the
  driver will need to do fallible queries.

  Thus in patch #4, make the FID setup operation fallible as well.

- Flood mode initialization routine differs between the controlled and CFF
  flood modes. The controlled mode needs to configure flood table layout,
  which the CFF mode does not need to do.

  In patch #5, move mlxsw_sp_fid_flood_table_init() up so that the
  following patch can make use of it.

  In patch #6, add an op to be invoked per table (if defined).

- The current way of determining PGT allocation size depends on the number
  of FIDs and number of flood tables. RFIDs however have PGT footprint
  depending not on number of FIDs, but on number of ports and LAGs, because
  which ports an rFID should flood to does not depend on the FID itself,
  but on the port or LAG that it references.

  Therefore in patch #7, add FID family ops for determining PGT allocation
  size.

- As elaborated above, layout of PGT will differ between controlled and CFF
  flood modes. In CFF mode, it will further differ between rFIDs and other
  FIDs (as described at previous patch). The way to pack the SFMR register
  to configure a FID will likewise differ from controlled to CFF.

  Thus in patches #8 and #9 add FID family ops to determine PGT base
  address for a FID and to pack SFMR.

- Patches #10 and #11 add more bits for RSP support. In patch #10, add a
  new traffic type enumerator, for non-UC traffic. This is a combination of
  BC and MC traffic, but the way that mlxsw maps these mnemonic names to
  actual traffic type configurations requires that we have a new name to
  describe this class of traffic.

  Patch #11 then adds hooks necessary for RSP table maintenance. As ports
  come and go, and join and leave LAGs, it is necessary to update flood
  vectors that the rFIDs use. These new hooks will make that possible.

- Patches #12, #13 and #14 introduce flood profiles. These have been
  implicit so far, but the way that CFF flood mode works with profile IDs
  requires that we make them explicit.

  Thus in patch #12, introduce flood profile objects as a set of flood
  tables that FID families then refer to. The FID code currently only
  uses a single flood profile.

  In patch #13, add a flood profile ID to flood profile objects.

  In patch #14, when in CFF mode, configure SFFP according to the existing
  flood profiles (or the one that exists as of that point).

- Patches #15 and #16 add code to implement, respectively, bridge FIDs and
  RSP FIDs in CFF mode.

- In patch #17, toggle flood_mode_prefer_cff on Spectrum-2 and above, which
  makes the newly-added code live.
====================

Link: https://lore.kernel.org/r/cover.1701183891.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum: Use CFF mode where available
Petr Machata [Tue, 28 Nov 2023 15:50:50 +0000 (16:50 +0100)]
mlxsw: spectrum: Use CFF mode where available

Mark all Spectrum>2 systems as preferring CFF flood mode if supported by
the firmware.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/8a3d2ad96b943f7e3f53f998bd333a14e19cd641.1701183892.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum_fid: Add support for rFID family in CFF flood mode
Petr Machata [Tue, 28 Nov 2023 15:50:49 +0000 (16:50 +0100)]
mlxsw: spectrum_fid: Add support for rFID family in CFF flood mode

In this patch, add the artifacts for the rFID family that works in CFF
flood mode.

The same that was said about PGT organization and lookup in bridge FID
families applies for the rFID family as well. The main difference lies in
the fact that in the controlled flood mode, the FW was taking care of
maintaining the PGT tables for rFIDs. In CFF mode, the responsibility
shifts to the driver.

All rFIDs are based off either a front panel port, or a LAG port. For those
based off ports, we need to maintain at worst one PGT block for each port,
for those based off LAGs, one PGT block per LAG. This reflects in the
pgt_size callback, which determines the PGT footprint based on number of
ports and the LAG capacity.

A number of FIDs may end up using the same PGT base. Unlike with bridges,
where membership of a port in a given FID is highly dynamic, an rFID based
of a port will just always need to flood to that port.

Both the port and the LAG subtables need to be actively maintained. To that
end, the CFF rFID family implements fid_port_init and fid_port_fini
callbacks, which toggle the necessary bits.

Both FID-MID translation and SFMR packing then point into either the port
or the LAG subtable, to the block that corresponds to a given port or a
given LAG, depending on what port the RIF bound to the rFID uses.

As in the previous patch, the way CFF flood mode organizes PGT accesses
allows for much more smarts and dynamism. As in the previous patch, we
rather aim to keep things simple and static.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/962deb4367585d38250e80c685a34735c0c7f3ad.1701183892.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum_fid: Add a family for bridge FIDs in CFF flood mode
Petr Machata [Tue, 28 Nov 2023 15:50:48 +0000 (16:50 +0100)]
mlxsw: spectrum_fid: Add a family for bridge FIDs in CFF flood mode

In this patch, add the artifacts for 802.1d and 802.1q FID families that
work in CFF flood mode.

In CFF flood mode, the way flood vectors are looked up changes: there's a
per-FID PGT base, to which a small offset is added depending on type of
traffic. Thus each FID occupies a small contiguous block of PGT memory,
whereas in the controlled flood mode, flood vectors for a given FID were
spread across the PGT.

The term "flood table" as used by the spectrum_fid module, borrows from
controlled flood mode way of organizing the PGT table. There flood tables
were actual tables, contiguous in the PGT. In the CFF flood mode, they are
more abstract: a flood table becomes a collection of e.g. all first rows of
the per-FID PGT blocks. Nonetheless we retain the nomenclature.

FIDs are still configured through the SFMR register, but there are
different fields to set under CFF mode: PGT base and profile. Thus register
packing gets a dedicated op overload as well.

The new organization of PGT makes it possible to treat the PGT as a block
of an ordinary memory, allocate and deallocate on demand, and achieve
better flexibility. Here instead, we aim to keep the code as close as
possible to the previous controlled flood mode, support for which we need
to retain for Spectrum-1 and older FW versions anyway. Thus the PGT
footprint of the individual families is the same as before, just the
internal organization of the per-family PGT region differs. Hence the
pgt_size callback is reused between the controlled and CFF flood modes.

Since the dummy family has no flood tables in either the CTL mode or in
CFF mode, the existing one can be reused for the CFF family array.

Users should not notice any changes between the controlled and CFF flood
modes.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/ca40b8163e6d6a21f63ef299619acee953cf9519.1701183892.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum_fid: Initialize flood profiles in CFF mode
Petr Machata [Tue, 28 Nov 2023 15:50:47 +0000 (16:50 +0100)]
mlxsw: spectrum_fid: Initialize flood profiles in CFF mode

In CFF flood mode, the way flood vectors are looked up changes: there's a
per-FID PGT base, to which a small offset is added depending on type of
traffic. Thus each FID occupies a small contiguous block of PGT memory,
whereas in the controlled flood mode, flood vectors for a given FID were
spread across the PGT.

Each FID is associated with one of a handful of profiles. The profile and
the traffic type are then used as keys to look up the PGT offset. This
offset is then added to the per-FID PGT base. The profile / type / offset
mapping needs to be configured by the driver, and is only relevant in CFF
flood mode.

In this patch, add the SFFP initialization code. Only initialize the one
profile currently explicitly used. As follow-up patch add more profiles,
this code will pick them up and initialize as well.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/2c4733ed72d439444218969c032acad22cd4ed88.1701183892.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum_fid: Add profile_id to flood profile
Petr Machata [Tue, 28 Nov 2023 15:50:46 +0000 (16:50 +0100)]
mlxsw: spectrum_fid: Add profile_id to flood profile

In the CFF mode, flood profiles are identified by a unique numerical
identifier. This is used for configuration of FIDs and for configuration of
traffic-type to PGT offset rules. In both cases, the numerical identifier
serves as a handle for the flood profile. Add the identifier to the flood
profile structure.

There is currently only one flood profile in use explicitly, the one used
for all bridging. Eventually three will be necessary in total: one for
bridges, one for rFIDs, one for NVE underlay. A total of four profiles
are supported by the HW. Start allocating at 1, because 0 is currently
used for underlay NVE flood.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/19ea9c35ba8b522fa5f7eb6fd7bc1b68f0f66b41.1701183892.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum_fid: Add an object to keep flood profiles
Petr Machata [Tue, 28 Nov 2023 15:50:45 +0000 (16:50 +0100)]
mlxsw: spectrum_fid: Add an object to keep flood profiles

A flood profile is a mapping from traffic type to an offset at which a
flood vector should be looked up. In mlxsw so far, a flood profile was
somewhat implicitly represented by flood table array. When the CFF flood
mode will be introduced, the flood profile will become more explicit: each
will get a number and the profile ID / traffic-type / offset mapping will
actually need to be initialized in the hardware.

Therefore it is going to be handy to have a structure that keeps all the
components that compose a flood profile. Add this structure, currently with
just the flood table array bits. In the FID families that flood at all,
reference the flood profile instead of just the table array.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/15e113de114d3f41ce3fd2a14a2fa6a1b1d7e8f2.1701183892.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum_fid: Add hooks for RSP table maintenance
Petr Machata [Tue, 28 Nov 2023 15:50:44 +0000 (16:50 +0100)]
mlxsw: spectrum_fid: Add hooks for RSP table maintenance

In the CFF flood mode, the driver has to allocate a table within PGT, which
holds flood vectors for router subport FIDs. For LAGs, these flood vectors
have to obviously be maintained dynamically as port membership in a LAG
changes. But even for physical ports, the flood vectors have to be kept
valid, and may not contain enabled bits corresponding to non-existent
ports. It is therefore not possible to precompute the port part of the RSP
table, it has to be maintained as ports come and go due to splits.

To support the RSP table maintenance, add to FID ops two new ops:
fid_port_init and fid_port_fini, for when a port comes to existence, or
joins a lag, and vice versa. Invoke these ops from
mlxsw_sp_port_fids_init() and mlxsw_sp_port_fids_fini(), which are called
when port is added and removed, respectively. Also add two new hooks for
LAG maintenance, mlxsw_sp_fid_port_join_lag() / _leave_lag() which
transitively call into the same ops.

Later patches will actually add the op implementations themselves, this
just adds the scaffolding.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/234398a23540317abb25f74f920a5c8121faecf0.1701183892.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum_fid: Add a not-UC packet type
Petr Machata [Tue, 28 Nov 2023 15:50:43 +0000 (16:50 +0100)]
mlxsw: spectrum_fid: Add a not-UC packet type

In CFF flood mode, the rFID family will allocate two tables. One for
unknown UC traffic, one for everything else. Add a traffic type for the
everything else traffic.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/8fb968b2d1cc37137cd0110c98cdeb625b03ca99.1701183892.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum_fid: Add an op for packing SFMR
Petr Machata [Tue, 28 Nov 2023 15:50:42 +0000 (16:50 +0100)]
mlxsw: spectrum_fid: Add an op for packing SFMR

The way SFMR is packed differs between the controlled and CFF flood modes.
Add an op to dispatch it dynamically.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/f12fe7879a7086ee86343ee4db02c859f78f0534.1701183892.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum_fid: Add an op to get PGT address of a FID
Petr Machata [Tue, 28 Nov 2023 15:50:41 +0000 (16:50 +0100)]
mlxsw: spectrum_fid: Add an op to get PGT address of a FID

In the CFF flood mode, the way to determine a PGT address where a given FID
/ flood table resides is different from the controlled flood mode, which
mlxsw currently uses. Furthermore, this will differ between rFID family and
bridge families. The operation therefore needs to be dynamically
dispatched. To that end, add an op to FID-family ops.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Link: https://lore.kernel.org/r/00e8f6ad79009a9a77a5c95d596ea9574776dc95.1701183892.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum_fid: Add an op to get PGT allocation size
Petr Machata [Tue, 28 Nov 2023 15:50:40 +0000 (16:50 +0100)]
mlxsw: spectrum_fid: Add an op to get PGT allocation size

In the CFF flood mode, the PGT allocation size of RFID family will not
depend on number of FIDs, but rather number of ports and LAGs. Therefore
introduce a FID family operation to calculate the PGT allocation size.

The way that size is calculated in the CFF mode depends on calling fallible
functions. Thus express the op as returning an int, with the size returned
via a pointer argument.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/1174651b7160fcedbef50010ae4b68201112fe6f.1701183892.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum_fid: Add an op for flood table initialization
Petr Machata [Tue, 28 Nov 2023 15:50:39 +0000 (16:50 +0100)]
mlxsw: spectrum_fid: Add an op for flood table initialization

In controlled flood mode, for each bridge FID family (i.e., 802.1Q and
802.1D) and packet type (i.e., UUC/MC/BC), the hardware needs to be told
which PGT address to use as the base address for the flood table and how
to determine the offset from the base for each FID.

The above is not needed in CFF mode where each FID has its own flood
table instead of the FID family itself.

Therefore, create a new FID family operation for the above configuration
and only implement it for the 802.1Q and 802.1D families in controlled
flood mode.

No functional changes intended.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/06f71415eec75811585ec597e1dd101b6dff77e7.1701183892.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum_fid: Move mlxsw_sp_fid_flood_table_init() up
Petr Machata [Tue, 28 Nov 2023 15:50:38 +0000 (16:50 +0100)]
mlxsw: spectrum_fid: Move mlxsw_sp_fid_flood_table_init() up

Move the function to the point where it will need to be to be visible for
the 802.1d ops.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/aef09e26b0c2dd077531e665d7135b300bdaf0a8.1701183892.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum_fid: Make mlxsw_sp_fid_ops.setup return an int
Petr Machata [Tue, 28 Nov 2023 15:50:37 +0000 (16:50 +0100)]
mlxsw: spectrum_fid: Make mlxsw_sp_fid_ops.setup return an int

This operation will be fallible for rFIDs in CFF mode, which will be
introduced in follow-up patches. Have it return an int, and handle
the failures in the caller.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/75f1b85c0cb86bea5501fcc8657042f221a78b32.1701183892.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum_fid: Split a helper out of mlxsw_sp_fid_flood_table_mid()
Petr Machata [Tue, 28 Nov 2023 15:50:36 +0000 (16:50 +0100)]
mlxsw: spectrum_fid: Split a helper out of mlxsw_sp_fid_flood_table_mid()

In future patches, for CFF flood mode support, we will need a way to
determine a PGT base dynamically, as an op. Therefore, for symmetry,
split out a helper, mlxsw_sp_fid_pgt_base_ctl(), that determines a PGT base
in the controlled mode as well.

Now that the helper is available, use it in mlxsw_sp_fid_flood_table_init()
which currently invokes the FID->MID helper to that end.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/fd41c66a1df4df6499d3da34f40e7b9efa15bc3e.1701183892.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum_fid: Rename FID ops, families, arrays
Petr Machata [Tue, 28 Nov 2023 15:50:35 +0000 (16:50 +0100)]
mlxsw: spectrum_fid: Rename FID ops, families, arrays

Currently, mlxsw always uses a "controlled" flood mode on all Nvidia
Spectrum generations. The following patches will however introduce a
possibility to run a "CFF" (for Compressed FID Flooding) mode on newer
machines, if the FW supports it.

To reflect that, label all FID ops, FID families and FID family arrays with
a _ctl suffix. This will make it clearer what is what when the CFF families
are introduced in later patches.

Keep the dummy family intact. Since the dummy family has no flood tables
in either CTL or CFF mode, there are no flood-mode-specific callbacks.

Additionally, add a remark at two fields that they are only relevant when
flood mode is not CFF.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/96b6da5439bb662fa86e795bbcec9dc3ccfa59fd.1701183892.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum_fid: Privatize FID families
Petr Machata [Tue, 28 Nov 2023 15:50:34 +0000 (16:50 +0100)]
mlxsw: spectrum_fid: Privatize FID families

Currently, mlxsw always uses a "controlled" flood mode on all Nvidia
Spectrum generations. The following patches will however introduce a
possibility to run a "CFF" (for Compressed FID Flooding) mode on newer
machines, if the FW supports it.

Several operations will differ between how they need to be done in
controlled mode vs. CFF mode. Thus the per-FID-family ops will differ
between controlled and CFF, thus the FID family array as such will
differ depending on whether the mode negotiated with FW is controlled
or CFF.

The simple approach of having several globally visible arrays for
spectrum.c to statically choose from no longer works. Instead privatize all
FID initialization and finalization logic, and expose it as ops instead.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/d3fa390d97cf3dbd2f7a28741be69b311e2059e4.1701183891.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: phy: aquantia: drop wrong endianness conversion for addr and CRC
Christian Marangi [Tue, 28 Nov 2023 13:59:28 +0000 (14:59 +0100)]
net: phy: aquantia: drop wrong endianness conversion for addr and CRC

On further testing on BE target with kernel test robot, it was notice
that the endianness conversion for addr and CRC in fw_load_memory was
wrong.

Drop the cpu_to_le32 conversion for addr load as it's not needed.

Use get_unaligned_le32 instead of get_unaligned for FW data word load to
correctly convert data in the correct order to follow system endian.

Also drop the cpu_to_be32 for CRC calculation as it's wrong and would
cause different CRC on BE system.
The loaded word is swapped internally and MAILBOX calculates the CRC on
the swapped word. To correctly calculate the CRC to be later matched
with the one from MAILBOX, use an u8 struct and swap the word there to
keep the same order on both LE and BE for crc_ccitt_false function.
Also add additional comments on how the CRC verification for the loaded
section works.

CRC is calculated as we load the section and verified with the MAILBOX
only after the entire section is loaded to skip additional slowdown by
loop the section data again.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202311210414.sEJZjlcD-lkp@intel.com/
Fixes: e93984ebc1c8 ("net: phy: aquantia: add firmware load support")
Tested-by: Robert Marko <robimarko@gmail.com> # ipq8072 LE device
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/20231128135928.9841-1-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: phy: adin: allow control of Fast Link Down
Vincent Whitchurch [Mon, 27 Nov 2023 15:31:39 +0000 (16:31 +0100)]
net: phy: adin: allow control of Fast Link Down

Add support to allow Fast Link Down (aka "Enhanced link detection") to
be controlled via the ETHTOOL_PHY_FAST_LINK_DOWN tunable.  These PHYs
have this feature enabled by default.

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Acked-by: Nuno Sa <nuno.sa@analog.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20231127-adin-fld-v1-1-797f6423fd48@axis.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agor8169: improve handling task scheduling
Heiner Kallweit [Mon, 27 Nov 2023 17:20:11 +0000 (18:20 +0100)]
r8169: improve handling task scheduling

If we know that the task is going to be a no-op, don't even schedule it.
And remove the check for netif_running() in the worker function, the
check for flag RTL_FLAG_TASK_ENABLED is sufficient. Note that we can't
remove the check for flag RTL_FLAG_TASK_ENABLED in the worker function
because we have no guarantee when it will be executed.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/c65873a3-7394-4107-99a7-83f20030779c@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoMerge branch 'create-a-binding-for-the-marvell-mv88e6xxx-dsa-switches'
Jakub Kicinski [Thu, 30 Nov 2023 03:37:23 +0000 (19:37 -0800)]
Merge branch 'create-a-binding-for-the-marvell-mv88e6xxx-dsa-switches'

Linus Walleij says:

====================
Create a binding for the Marvell MV88E6xxx DSA switches

The Marvell switches are lacking DT bindings.

I need proper schema checking to add LED support to the
Marvell switch. Just how it is, it can't go on like this.

Some Device Tree fixes are included in the series, these
remove the major and most annoying warnings fallout noise:
some warnings remain, and these are of more serious nature,
such as missing phy-mode. They can be applied individually,
or to the networking tree with the rest of the patches.

Thanks to Andrew Lunn, Vladimir Oltean and Russell King
for excellent review and feedback!

This latest version employs special compatibles in the
odd ABI device trees.

v8: https://lore.kernel.org/r/20231114-marvell-88e6152-wan-led-v8-0-50688741691b@linaro.org
v7: https://lore.kernel.org/r/20231024-marvell-88e6152-wan-led-v7-0-2869347697d1@linaro.org
v6: https://lore.kernel.org/r/20231024-marvell-88e6152-wan-led-v6-0-993ab0949344@linaro.org
v5: https://lore.kernel.org/r/20231023-marvell-88e6152-wan-led-v5-0-0e82952015a7@linaro.org
v4: https://lore.kernel.org/r/20231018-marvell-88e6152-wan-led-v4-0-3ee0c67383be@linaro.org
v3: https://lore.kernel.org/r/20231016-marvell-88e6152-wan-led-v3-0-38cd449dfb15@linaro.org
v2: https://lore.kernel.org/r/20231014-marvell-88e6152-wan-led-v2-0-7fca08b68849@linaro.org
v1: https://lore.kernel.org/r/20231013-marvell-88e6152-wan-led-v1-0-0712ba99857c@linaro.org
====================

Link: https://lore.kernel.org/r/20231127-marvell-88e6152-wan-led-v9-0-272934e04681@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agodt-bindings: marvell: Add Marvell MV88E6060 DSA schema
Linus Walleij [Mon, 27 Nov 2023 15:43:08 +0000 (16:43 +0100)]
dt-bindings: marvell: Add Marvell MV88E6060 DSA schema

The Marvell MV88E6060 is one of the oldest DSA switches from
Marvell, and it has DT bindings used in the wild. Let's define
them properly.

It is different enough from the rest of the MV88E6xxx switches
that it deserves its own binding.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20231127-marvell-88e6152-wan-led-v9-5-272934e04681@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agodt-bindings: marvell: Rewrite MV88E6xxx in schema
Linus Walleij [Mon, 27 Nov 2023 15:43:07 +0000 (16:43 +0100)]
dt-bindings: marvell: Rewrite MV88E6xxx in schema

This is an attempt to rewrite the Marvell MV88E6xxx switch bindings
in YAML schema.

The current text binding says:
  WARNING: This binding is currently unstable. Do not program it into a
  FLASH never to be changed again. Once this binding is stable, this
  warning will be removed.

Well that never happened before we switched to YAML markup,
we can't have it like this, what about fixing the mess?

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20231127-marvell-88e6152-wan-led-v9-4-272934e04681@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agodt-bindings: net: ethernet-switch: Accept special variants
Linus Walleij [Mon, 27 Nov 2023 15:43:06 +0000 (16:43 +0100)]
dt-bindings: net: ethernet-switch: Accept special variants

Accept special node naming variants for Marvell switches with
special node names as ABI.

This is maybe not the prettiest but it avoids special-casing
the Marvell MV88E6xxx bindings by copying a lot of generic
binding code down into that one binding just to special-case
these unfixable nodes.

Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20231127-marvell-88e6152-wan-led-v9-3-272934e04681@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agodt-bindings: net: mvusb: Fix up DSA example
Linus Walleij [Mon, 27 Nov 2023 15:43:05 +0000 (16:43 +0100)]
dt-bindings: net: mvusb: Fix up DSA example

When adding a proper schema for the Marvell mx88e6xxx switch,
the scripts start complaining about this embedded example:

  dtschema/dtc warnings/errors:
  net/marvell,mvusb.example.dtb: switch@0: ports: '#address-cells'
  is a required property
  from schema $id: http://devicetree.org/schemas/net/dsa/marvell,mv88e6xxx.yaml#
  net/marvell,mvusb.example.dtb: switch@0: ports: '#size-cells'
  is a required property
  from schema $id: http://devicetree.org/schemas/net/dsa/marvell,mv88e6xxx.yaml#

Fix this up by extending the example with those properties in
the ports node.

While we are at it, rename "ports" to "ethernet-ports" and rename
"switch" to "ethernet-switch" as this is recommended practice.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20231127-marvell-88e6152-wan-led-v9-2-272934e04681@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agodt-bindings: net: dsa: Require ports or ethernet-ports
Linus Walleij [Mon, 27 Nov 2023 15:43:04 +0000 (16:43 +0100)]
dt-bindings: net: dsa: Require ports or ethernet-ports

Bindings using dsa.yaml#/$defs/ethernet-ports specify that
a DSA switch node need to have a ports or ethernet-ports
subnode, and that is actually required, so add requirements
using oneOf.

Suggested-by: Rob Herring <robh@kernel.org>
Acked-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20231127-marvell-88e6152-wan-led-v9-1-272934e04681@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoMerge branch 'tools-ynl-fixes-for-the-page-pool-sample-and-the-generation-process'
Jakub Kicinski [Thu, 30 Nov 2023 00:07:02 +0000 (16:07 -0800)]
Merge branch 'tools-ynl-fixes-for-the-page-pool-sample-and-the-generation-process'

Jakub Kicinski says:

====================
tools: ynl: fixes for the page-pool sample and the generation process

Minor fixes to the new sample and the Makefiles.
====================

Link: https://lore.kernel.org/r/20231129193622.2912353-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agotools: ynl: don't skip regeneration from make targets
Jakub Kicinski [Wed, 29 Nov 2023 19:36:22 +0000 (11:36 -0800)]
tools: ynl: don't skip regeneration from make targets

Commit 2b7ac0c87d98 ("tools: ynl-gen: don't touch the output file if
content is the same") is working too well. It was added so that
ynl-regen -f doesn't make us rebuild half of the kernel, if there
are no actual changes in any generated code.

When ynl-gen-c is called by make, however, we're better off trusting
make's tracking and overwrite the file. Otherwise if output is identical
we won't update file timestamps and make will retry code gen on every
invocation.

Link: https://lore.kernel.org/r/20231129193622.2912353-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agotools: ynl: order building samples after generated code
Jakub Kicinski [Wed, 29 Nov 2023 19:36:21 +0000 (11:36 -0800)]
tools: ynl: order building samples after generated code

Parallel builds of ynl:

  make -C tools/net/ynl/ -j 4

don't work correctly right now. samples get handled before
generated, so build of samples does not notice that protos.a
has changed. Order samples to be last.

Link: https://lore.kernel.org/r/20231129193622.2912353-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agotools: ynl: make sure we use local headers for page-pool
Jakub Kicinski [Wed, 29 Nov 2023 19:36:20 +0000 (11:36 -0800)]
tools: ynl: make sure we use local headers for page-pool

Building samples generates the following warning:

  In file included from page-pool.c:11:
  generated/netdev-user.h:21:45: warning: ‘enum netdev_xdp_rx_metadata’ declared inside parameter list will not be visible outside of this definition or declaration
   21 | const char *netdev_xdp_rx_metadata_str(enum netdev_xdp_rx_metadata value);
      |                                             ^~~~~~~~~~~~~~~~~~~~~~

Our magic way of including uAPI headers assumes the sample
name matches the family name. We need to copy the flags over.

Fixes: 637567e4a3ef ("tools: ynl: add sample for getting page-pool information")
Link: https://lore.kernel.org/r/20231129193622.2912353-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agotools: ynl: fix build of the page-pool sample
Jakub Kicinski [Wed, 29 Nov 2023 19:36:19 +0000 (11:36 -0800)]
tools: ynl: fix build of the page-pool sample

The name of the "destroyed" field in the reply was not changed
in the sample after we started calling it "detach_time".

page-pool.c: In function ‘main’:
page-pool.c:84:33: error: ‘struct <anonymous>’ has no member named ‘destroyed’
   84 |                 if (pp->_present.destroyed)
      |                                 ^

Fixes: 637567e4a3ef ("tools: ynl: add sample for getting page-pool information")
Link: https://lore.kernel.org/r/20231129193622.2912353-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoMerge branch 'fine-tune-flow-control-and-speed-configurations-in-microchip-ksz8xxx...
Jakub Kicinski [Wed, 29 Nov 2023 16:38:36 +0000 (08:38 -0800)]
Merge branch 'fine-tune-flow-control-and-speed-configurations-in-microchip-ksz8xxx-dsa-driver'

Oleksij Rempel says:

====================
Fine-Tune Flow Control and Speed Configurations in Microchip KSZ8xxx DSA Driver

This patch set focuses on enhancing the configurability of flow
control, speed, and duplex settings in the Microchip KSZ8xxx DSA driver.

The first patch allows more granular control over the CPU port's flow
control, speed, and duplex settings. The second patch introduces a
method for port configurations for port with integrated PHYs, primarily
concerning flow control based on duplex mode.
====================

Link: https://lore.kernel.org/r/20231127145101.3039399-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: dsa: microchip: make phylink_mac_link_up() not optional
Oleksij Rempel [Mon, 27 Nov 2023 14:51:01 +0000 (15:51 +0100)]
net: dsa: microchip: make phylink_mac_link_up() not optional

Last part of the driver do now support phylink_mac_link_up(). So, make it
not optional.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20231127145101.3039399-4-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: dsa: microchip: ksz8: Add function to configure ports with integrated PHYs
Oleksij Rempel [Mon, 27 Nov 2023 14:51:00 +0000 (15:51 +0100)]
net: dsa: microchip: ksz8: Add function to configure ports with integrated PHYs

This patch introduces the function 'ksz8_phy_port_link_up' to the
Microchip KSZ8xxx driver. This function is responsible for setting up
flow control and duplex settings for the ports that are integrated with
PHYs.

The KSZ8795 switch supports asymmetric pause control, which can't be
fully utilized since a single bit controls both RX and TX pause. Despite
this, the flow control can be adjusted based on the auto-negotiation
process, taking into account the capabilities of both link partners.

On the other hand, the KSZ8873's PORT_FORCE_FLOW_CTRL bit can be set by
the hardware bootstrap, which ignores the auto-negotiation result.
Therefore, even in auto-negotiation mode, we need to ensure that this
bit is correctly set.

When auto-negotiation isn't in use, we enforce symmetric pause control
for the KSZ8795 switch.

Please note, forcing flow control disable on a port while still
advertising pause support isn't possible. While this scenario
might not be practical or desired, it's important to be aware of this
limitation when working with the KSZ8873 and similar devices.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20231127145101.3039399-3-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: dsa: microchip: ksz8: Make flow control, speed, and duplex on CPU port configurable
Oleksij Rempel [Mon, 27 Nov 2023 14:50:59 +0000 (15:50 +0100)]
net: dsa: microchip: ksz8: Make flow control, speed, and duplex on CPU port configurable

Allow flow control, speed, and duplex settings on the CPU port to be
configurable. Previously, the speed and duplex relied on default switch
values, which limited flexibility. Additionally, flow control was
hardcoded and only functional in duplex mode. This update enhances the
configurability of these parameters.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20231127145101.3039399-2-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoMerge branch 'gve-add-support-for-non-4k-page-sizes'
Jakub Kicinski [Wed, 29 Nov 2023 16:32:38 +0000 (08:32 -0800)]
Merge branch 'gve-add-support-for-non-4k-page-sizes'

John Fraker says:

====================
gve: Add support for non-4k page sizes.

This patch series adds support for non-4k page sizes to the driver. Prior
to this patch series, the driver assumes a 4k page size in many small
ways, and will crash in a kernel compiled for a different page size.

This changeset aims to be a minimal changeset that unblocks certain arm
platforms with large page sizes.
====================

Link: https://lore.kernel.org/r/20231128002648.320892-1-jfraker@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agogve: Remove dependency on 4k page size.
John Fraker [Tue, 28 Nov 2023 00:26:48 +0000 (16:26 -0800)]
gve: Remove dependency on 4k page size.

Prior to this change, gve crashes when attempting to run in kernels with
page sizes other than 4k. This change removes unnecessary references to
PAGE_SIZE and replaces them with more meaningful constants.

Signed-off-by: Jordan Kimbrough <jrkim@google.com>
Signed-off-by: John Fraker <jfraker@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20231128002648.320892-6-jfraker@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agogve: Add page size register to the register_page_list command.
John Fraker [Tue, 28 Nov 2023 00:26:47 +0000 (16:26 -0800)]
gve: Add page size register to the register_page_list command.

This register is required on platforms with page sizes greater than 4k.
This is because the tx side of the driver vmaps the entire queue page
list of pages into a single flat address space, then uses the entire
space. Without communicating the guest page size to the backend, the
backend will only access the first 4k of each page in the queue page list.

Signed-off-by: Jordan Kimbrough <jrkim@google.com>
Signed-off-by: John Fraker <jfraker@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20231128002648.320892-5-jfraker@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agogve: Remove obsolete checks that rely on page size.
John Fraker [Tue, 28 Nov 2023 00:26:46 +0000 (16:26 -0800)]
gve: Remove obsolete checks that rely on page size.

These checks are safe to remove as they are no longer enforced by the
backend. Retaining them would require updating them to work differently
with page sizes larger than 4k.

Signed-off-by: Jordan Kimbrough <jrkim@google.com>
Signed-off-by: John Fraker <jfraker@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20231128002648.320892-4-jfraker@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agogve: Deprecate adminq_pfn for pci revision 0x1.
John Fraker [Tue, 28 Nov 2023 00:26:45 +0000 (16:26 -0800)]
gve: Deprecate adminq_pfn for pci revision 0x1.

adminq_pfn assumes a page size of 4k, causing this mechanism to break in
kernels compiled with different page sizes. A new PCI device revision was
needed for the device to be able to communicate with the driver how to
set up the admin queue prior to having access to the admin queue.

Signed-off-by: Jordan Kimbrough <jrkim@google.com>
Signed-off-by: John Fraker <jfraker@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20231128002648.320892-3-jfraker@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agogve: Perform adminq allocations through a dma_pool.
John Fraker [Tue, 28 Nov 2023 00:26:44 +0000 (16:26 -0800)]
gve: Perform adminq allocations through a dma_pool.

This allows the adminq to be smaller than a page, paving the way for
non 4k page support. This is to support platforms where PAGE_SIZE
is not 4k, such as some ARM platforms.

Signed-off-by: Jordan Kimbrough <jrkim@google.com>
Signed-off-by: John Fraker <jfraker@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20231128002648.320892-2-jfraker@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoMerge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next...
Jakub Kicinski [Wed, 29 Nov 2023 04:10:30 +0000 (20:10 -0800)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-11-27 (i40e, iavf)

This series contains updates to i40e and iavf drivers.

Ivan Vecera performs more cleanups on i40e and iavf drivers; removing
unused fields, defines, and unneeded fields.

Petr Oros utilizes iavf_schedule_aq_request() helper to replace open
coded equivalents.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  iavf: use iavf_schedule_aq_request() helper
  iavf: Remove queue tracking fields from iavf_adminq_ring
  i40e: Remove queue tracking fields from i40e_adminq_ring
  i40e: Remove AQ register definitions for VF types
  i40e: Delete unused and useless i40e_pf fields
====================

Link: https://lore.kernel.org/r/20231127211037.1135403-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agor8169: remove multicast filter limit
Heiner Kallweit [Mon, 27 Nov 2023 20:16:10 +0000 (21:16 +0100)]
r8169: remove multicast filter limit

Once upon a time, when r8169 was new, the multicast filter limit code
was copied from RTL8139 driver. There the filter limit is even
user-configurable.
The filtering is hash-based and we don't have perfect filtering.
Actually the mc filtering on RTL8125 still seems to be the same
as used on 8390/NE2000. So it's not clear to me which benefit it
should bring when switching to all-multi mode once a certain number
of filter bits is set. More the opposite: Filtering out at least
some unwanted mc traffic is better than no filtering.
Also the available chip documentation doesn't mention any restriction.
Therefore remove the filter limit.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/57076c05-3730-40d1-ab9a-5334b263e41a@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>