]> www.infradead.org Git - users/hch/block.git/log
users/hch/block.git
5 years agoath11k: add packet log support for QCA6390
Carl Huang [Wed, 30 Sep 2020 10:51:09 +0000 (13:51 +0300)]
ath11k: add packet log support for QCA6390

Add packet log support for QCA6390, otherwise the data connection will stall
within a minute or so.  Enable it via debugfs and use trace-cmd to capture the
pktlogs.

echo 0xffff 1 > /sys/kernel/debug/ath11k/qca6390\ hw2.0/mac0/pktlog_filter

The mon status ring doesn't support interrupt so far, so host starts
a timer to reap this ring. The timer handler also reaps the
rxdma_err_dst_ring in case of monitor mode.

As QCA6390 requires bss created ahead of starting vdev, so check
vdev_start_delay for monitor mode.

For QCA6390, it uses wbm_desc_rel_ring to return descriptors.
It also uses rx_refill_buf_ring to fill mon buffer instead of
rxdma_mon_buf_ring.

Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1

Signed-off-by: Carl Huang <cjhuang@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1601463073-12106-2-git-send-email-kvalo@codeaurora.org
5 years agoath11k: Use GFP_ATOMIC instead of GFP_KERNEL in idr_alloc
Govind Singh [Tue, 29 Sep 2020 17:15:36 +0000 (20:15 +0300)]
ath11k: Use GFP_ATOMIC instead of GFP_KERNEL in idr_alloc

With SLUB DEBUG CONFIG below crash is seen as kmem_cache_alloc
is being called in non-atomic context.
To fix this issue, use GFP_ATOMIC instead of GFP_KERNEL in idr_alloc.

BUG: sleeping function called from invalid context at mm/slab.h:393

[   59.805451] Call trace:
[   59.807971]  ___might_sleep+0x110/0x118
[   59.811915]  __might_sleep+0x50/0x84
[   59.815593]  kmem_cache_alloc+0x60/0x3e0
[   59.819630]  radix_tree_node_alloc+0x4c/0xe8
[   59.824014]  radix_tree_extend+0x8c/0x164
[   59.828135]  idr_get_free_cmn+0xa4/0x27c
[   59.832167]  idr_alloc_cmn+0x70/0xe8
[   59.835856]  ath11k_dp_rxbufs_replenish+0x1e8/0x310 [ath11k]
[   59.841687]  ath11k_dp_rxdma_ring_buf_setup+0x50/0x60 [ath11k]
[   59.847693]  ath11k_dp_rx_pdev_alloc+0x260/0x4d8 [ath11k]
[   59.853248]  ath11k_dp_pdev_alloc+0x40/0xc4 [ath11k]
[   59.858357]  ath11k_core_qmi_firmware_ready+0x3c4/0x490 [ath11k]
[   59.864538]  ath11k_qmi_driver_event_work+0x4c8/0x1178 [ath11k]
[   59.870620]  process_one_work+0x208/0x434

Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1

Signed-off-by: Govind Singh <govinds@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1601399736-3210-9-git-send-email-kvalo@codeaurora.org
5 years agoath11k: Use GFP_ATOMIC instead of GFP_KERNEL in ath11k_dp_htt_get_ppdu_desc
Wen Gong [Tue, 29 Sep 2020 17:15:35 +0000 (20:15 +0300)]
ath11k: Use GFP_ATOMIC instead of GFP_KERNEL in ath11k_dp_htt_get_ppdu_desc

With SLUB DEBUG CONFIG below crash is seen as kmem_cache_alloc
is being called in non-atomic context.

To fix this issue, use GFP_ATOMIC instead of GFP_KERNEL kzalloc.

[  357.217088] BUG: sleeping function called from invalid context at mm/slab.h:498
[  357.217091] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/0
[  357.217092] INFO: lockdep is turned off.
[  357.217095] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.9.0-rc5-wt-ath+ #196
[  357.217096] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0049.2018.0801.1601 08/01/2018
[  357.217097] Call Trace:
[  357.217098]  <IRQ>
[  357.217107]  ? ath11k_dp_htt_get_ppdu_desc+0xa9/0x170 [ath11k]
[  357.217110]  dump_stack+0x77/0xa0
[  357.217113]  ___might_sleep.cold+0xa6/0xb6
[  357.217116]  kmem_cache_alloc_trace+0x1f2/0x270
[  357.217122]  ath11k_dp_htt_get_ppdu_desc+0xa9/0x170 [ath11k]
[  357.217129]  ath11k_htt_pull_ppdu_stats.isra.0+0x96/0x270 [ath11k]
[  357.217135]  ath11k_dp_htt_htc_t2h_msg_handler+0xe7/0x1d0 [ath11k]
[  357.217137]  ? trace_hardirqs_on+0x1c/0x100
[  357.217143]  ath11k_htc_rx_completion_handler+0x207/0x370 [ath11k]
[  357.217149]  ath11k_ce_recv_process_cb+0x15e/0x1e0 [ath11k]
[  357.217151]  ? handle_irq_event+0x70/0xa8
[  357.217154]  ath11k_pci_ce_tasklet+0x10/0x30 [ath11k_pci]
[  357.217157]  tasklet_action_common.constprop.0+0xd4/0xf0
[  357.217160]  __do_softirq+0xc9/0x482
[  357.217162]  asm_call_on_stack+0x12/0x20
[  357.217163]  </IRQ>
[  357.217166]  do_softirq_own_stack+0x49/0x60
[  357.217167]  irq_exit_rcu+0x9a/0xd0
[  357.217169]  common_interrupt+0xa1/0x190
[  357.217171]  asm_common_interrupt+0x1e/0x40
[  357.217173] RIP: 0010:cpu_idle_poll.isra.0+0x2e/0x60
[  357.217175] Code: 8b 35 26 27 74 69 e8 11 c8 3d ff e8 bc fa 42 ff e8 e7 9f 4a ff fb 65 48 8b 1c 25 80 90 01 00 48 8b 03 a8 08 74 0b eb 1c f3 90 <48> 8b 03 a8 08 75 13 8b 0
[  357.217177] RSP: 0018:ffffffff97403ee0 EFLAGS: 00000202
[  357.217178] RAX: 0000000000000001 RBX: ffffffff9742b8c0 RCX: 0000000000b890ca
[  357.217180] RDX: 0000000000b890ca RSI: 0000000000000001 RDI: ffffffff968d0c49
[  357.217181] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
[  357.217182] R10: ffffffff9742b8c0 R11: 0000000000000046 R12: 0000000000000000
[  357.217183] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000066fdf520
[  357.217186]  ? cpu_idle_poll.isra.0+0x19/0x60
[  357.217189]  do_idle+0x5f/0xe0
[  357.217191]  cpu_startup_entry+0x14/0x20
[  357.217193]  start_kernel+0x443/0x464
[  357.217196]  secondary_startup_64+0xa4/0xb0

Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1

Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1601399736-3210-8-git-send-email-kvalo@codeaurora.org
5 years agoath11k: change to disable softirqs for ath11k_regd_update to solve deadlock
Wen Gong [Tue, 29 Sep 2020 17:15:34 +0000 (20:15 +0300)]
ath11k: change to disable softirqs for ath11k_regd_update to solve deadlock

After base_lock which occupy by ath11k_regd_update, the softirq run for
WMI_REG_CHAN_LIST_CC_EVENTID maybe arrived and it also need to accuire
the spin lock, then deadlock happend, change to disable softirqis to solve it.

[  235.576990] ================================
[  235.576991] WARNING: inconsistent lock state
[  235.576993] 5.9.0-rc5-wt-ath+ #196 Not tainted
[  235.576994] --------------------------------
[  235.576995] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[  235.576997] kworker/u16:1/98 [HC0[0]:SC0[0]:HE1:SE1] takes:
[  235.576998] ffff9655f75cad98 (&ab->base_lock){+.?.}-{2:2}, at: ath11k_regd_update+0x28/0x1d0 [ath11k]
[  235.577009] {IN-SOFTIRQ-W} state was registered at:
[  235.577013]   __lock_acquire+0x219/0x6e0
[  235.577015]   lock_acquire+0xb6/0x270
[  235.577018]   _raw_spin_lock+0x2c/0x70
[  235.577023]   ath11k_reg_chan_list_event.isra.0+0x10d/0x1e0 [ath11k]
[  235.577028]   ath11k_wmi_tlv_op_rx+0x3c3/0x560 [ath11k]
[  235.577033]   ath11k_htc_rx_completion_handler+0x207/0x370 [ath11k]
[  235.577039]   ath11k_ce_recv_process_cb+0x15e/0x1e0 [ath11k]
[  235.577041]   ath11k_pci_ce_tasklet+0x10/0x30 [ath11k_pci]
[  235.577043]   tasklet_action_common.constprop.0+0xd4/0xf0
[  235.577045]   __do_softirq+0xc9/0x482
[  235.577046]   asm_call_on_stack+0x12/0x20
[  235.577048]   do_softirq_own_stack+0x49/0x60
[  235.577049]   irq_exit_rcu+0x9a/0xd0
[  235.577050]   common_interrupt+0xa1/0x190
[  235.577052]   asm_common_interrupt+0x1e/0x40
[  235.577053]   cpu_idle_poll.isra.0+0x2e/0x60
[  235.577055]   do_idle+0x5f/0xe0
[  235.577056]   cpu_startup_entry+0x14/0x20
[  235.577058]   start_kernel+0x443/0x464
[  235.577060]   secondary_startup_64+0xa4/0xb0
[  235.577061] irq event stamp: 432035
[  235.577063] hardirqs last  enabled at (432035): [<ffffffff968d12b4>] _raw_spin_unlock_irqrestore+0x34/0x40
[  235.577064] hardirqs last disabled at (432034): [<ffffffff968d10d3>] _raw_spin_lock_irqsave+0x63/0x80
[  235.577066] softirqs last  enabled at (431998): [<ffffffff967115c1>] inet6_fill_ifla6_attrs+0x3f1/0x430
[  235.577067] softirqs last disabled at (431996): [<ffffffff9671159f>] inet6_fill_ifla6_attrs+0x3cf/0x430
[  235.577068]
[  235.577068] other info that might help us debug this:
[  235.577069]  Possible unsafe locking scenario:
[  235.577069]
[  235.577070]        CPU0
[  235.577070]        ----
[  235.577071]   lock(&ab->base_lock);
[  235.577072]   <Interrupt>
[  235.577073]     lock(&ab->base_lock);
[  235.577074]
[  235.577074]  *** DEADLOCK ***
[  235.577074]
[  235.577075] 3 locks held by kworker/u16:1/98:
[  235.577076]  #0: ffff9655f75b1d48 ((wq_completion)ath11k_qmi_driver_event){+.+.}-{0:0}, at: process_one_work+0x1d3/0x5d0
[  235.577079]  #1: ffffa33cc02f3e70 ((work_completion)(&ab->qmi.event_work)){+.+.}-{0:0}, at: process_one_work+0x1d3/0x5d0
[  235.577081]  #2: ffff9655f75cad50 (&ab->core_lock){+.+.}-{3:3}, at: ath11k_core_qmi_firmware_ready.part.0+0x4e/0x160 [ath11k]
[  235.577087]
[  235.577087] stack backtrace:
[  235.577088] CPU: 3 PID: 98 Comm: kworker/u16:1 Not tainted 5.9.0-rc5-wt-ath+ #196
[  235.577089] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0049.2018.0801.1601 08/01/2018
[  235.577095] Workqueue: ath11k_qmi_driver_event ath11k_qmi_driver_event_work [ath11k]
[  235.577096] Call Trace:
[  235.577100]  dump_stack+0x77/0xa0
[  235.577102]  mark_lock_irq.cold+0x15/0x3c
[  235.577104]  mark_lock+0x1d7/0x540
[  235.577105]  mark_usage+0xc7/0x140
[  235.577107]  __lock_acquire+0x219/0x6e0
[  235.577108]  ? sched_clock_cpu+0xc/0xb0
[  235.577110]  lock_acquire+0xb6/0x270
[  235.577116]  ? ath11k_regd_update+0x28/0x1d0 [ath11k]
[  235.577118]  ? atomic_notifier_chain_register+0x2d/0x40
[  235.577120]  _raw_spin_lock+0x2c/0x70
[  235.577125]  ? ath11k_regd_update+0x28/0x1d0 [ath11k]
[  235.577130]  ath11k_regd_update+0x28/0x1d0 [ath11k]
[  235.577136]  __ath11k_mac_register+0x3fb/0x480 [ath11k]
[  235.577141]  ath11k_mac_register+0x119/0x180 [ath11k]
[  235.577146]  ath11k_core_pdev_create+0x17/0xe0 [ath11k]
[  235.577150]  ath11k_core_qmi_firmware_ready.part.0+0x65/0x160 [ath11k]
[  235.577155]  ath11k_qmi_driver_event_work+0x1c5/0x230 [ath11k]
[  235.577158]  process_one_work+0x265/0x5d0
[  235.577160]  worker_thread+0x49/0x300
[  235.577161]  ? process_one_work+0x5d0/0x5d0
[  235.577163]  kthread+0x135/0x150
[  235.577164]  ? kthread_create_worker_on_cpu+0x60/0x60
[  235.577166]  ret_from_fork+0x22/0x30

Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1

Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1601399736-3210-7-git-send-email-kvalo@codeaurora.org
5 years agoath11k: disable monitor mode on QCA6390
Kalle Valo [Tue, 29 Sep 2020 17:15:33 +0000 (20:15 +0300)]
ath11k: disable monitor mode on QCA6390

QCA6390 does not support monitor mode at the moment so disable it altogether,
using a hack as mac80211 does not support disabling it otherwise. Add a boolean
to hw_params to know if hardware supports monitor mode.

IPQ8074 continues to support monitor mode normally.

Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1

Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1601399736-3210-6-git-send-email-kvalo@codeaurora.org
5 years agoath11k: pci: check TCSR_SOC_HW_VERSION
Kalle Valo [Tue, 29 Sep 2020 17:15:32 +0000 (20:15 +0300)]
ath11k: pci: check TCSR_SOC_HW_VERSION

There are different versions of QCA6390. Check TCSR_SOC_HW_VERSION to make sure
that the device is hw2.0, all the rest are unsupported.

This needs to be checked after ath11k_pci_claim() so move the whole switch choosing hw_ver.

Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1

Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1601399736-3210-5-git-send-email-kvalo@codeaurora.org
5 years agoath11k: add interface_modes to hw_params
Kalle Valo [Tue, 29 Sep 2020 17:15:31 +0000 (20:15 +0300)]
ath11k: add interface_modes to hw_params

As QCA6390 does not support mesh interfaces, move the interface_modes to
hw_params. Also create interface combinations dynamically so that it's easy to
change the values.

Now QCA6390 does not claim to support mesh interfaces to user space, but
IPQ8074 continues to do that.

Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1

Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1601399736-3210-4-git-send-email-kvalo@codeaurora.org
5 years agoath11k: fix AP mode for QCA6390
Carl Huang [Tue, 29 Sep 2020 17:15:30 +0000 (20:15 +0300)]
ath11k: fix AP mode for QCA6390

For QCA6390, station vdev needs to delay startup but not for AP mode. On AP
mode vdev starts up immediately after bss peer is created in chanctx assignment
context.

This patch does not affect IPQ8074 family of devices.

Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1

Signed-off-by: Carl Huang <cjhuang@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1601399736-3210-3-git-send-email-kvalo@codeaurora.org
5 years agoath11k: support loading ELF board files
Ben Greear [Tue, 29 Sep 2020 17:15:29 +0000 (20:15 +0300)]
ath11k: support loading ELF board files

The QCA6390 board I have, model 8291M-PR comes with an ELF board file.  To get
this to at least somewhat work, I renamed bdwlan.e04 to 'board.bin' and then
added this patch to check for ELF magic string in the beginning of the file.
If that is found, use type ELF.  After this the driver loads.

Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1

Signed-off-by: Ben Greear <greearb@candelatech.com>
[kvalo@codeaurora.org: use elf.h, minor cleanup]
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1601399736-3210-2-git-send-email-kvalo@codeaurora.org
5 years agoath11k: Correctly check errors for calls to debugfs_create_dir()
Alex Dewar [Sun, 27 Sep 2020 13:24:50 +0000 (14:24 +0100)]
ath11k: Correctly check errors for calls to debugfs_create_dir()

debugfs_create_dir() returns an ERR_PTR in case of error, but never a
null pointer. There are a number of places where error-checking code can
accordingly be simplified.

Addresses-Coverity: CID 1497150: Memory - illegal accesses (USE_AFTER_FREE)
Addresses-Coverity: CID 1497158: Memory - illegal accesses (USE_AFTER_FREE)
Addresses-Coverity: CID 1497160: Memory - illegal accesses (USE_AFTER_FREE)
Signed-off-by: Alex Dewar <alex.dewar90@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200927132451.585473-1-alex.dewar90@gmail.com
5 years agoath11k: mac: fix parenthesis alignment
Kalle Valo [Tue, 29 Sep 2020 08:46:00 +0000 (11:46 +0300)]
ath11k: mac: fix parenthesis alignment

Commit 6aea26ce5a4c ("mac80211: rework tx encapsulation offload API")
introduced a new checkpatch warning:

drivers/net/wireless/ath/ath11k/mac.c:4354: Alignment should match open parenthesis

Fix that.

Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1601369160-1252-1-git-send-email-kvalo@codeaurora.org
5 years agoath11k: Move non-fatal warn logs to dbg level
Govind Singh [Tue, 29 Sep 2020 06:25:03 +0000 (09:25 +0300)]
ath11k: Move non-fatal warn logs to dbg level

During driver load below warn logs are printed in the console.
Since driver may not implement all wmi events sent by fw and
all of them are non-fatal, move this log to debug level to
remove un-necessary warn message on console.

[876.898735] ath11k_pci 0000:06:00.0: Unknown eventid: 0x16005
[879.283250] ath11k_pci 0000:06:00.0: Unknown eventid: 0x1d00a

No functional changes. Compile tested only.

Signed-off-by: Govind Singh <govinds@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1600948691-6901-1-git-send-email-govinds@codeaurora.org
5 years agoath9k: Remove set but not used variable
Li Heng [Tue, 29 Sep 2020 06:25:03 +0000 (09:25 +0300)]
ath9k: Remove set but not used variable

This addresses the following gcc warning with "make W=1":

drivers/net/wireless/ath/ath9k/ar9580_1p0_initvals.h:1331:18: warning:
‘ar9580_1p0_pcie_phy_clkreq_enable_L1’ defined but not used [-Wunused-const-variable=]

drivers/net/wireless/ath/ath9k/ar9580_1p0_initvals.h:1338:18: warning:
‘ar9580_1p0_pcie_phy_clkreq_disable_L1’ defined but not used [-Wunused-const-variable=]

drivers/net/wireless/ath/ath9k/ar9580_1p0_initvals.h:1345:18: warning:
‘ar9580_1p0_pcie_phy_pll_on_clkreq’ defined but not used [-Wunused-const-variable=]

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Li Heng <liheng40@huawei.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1600831531-8573-1-git-send-email-liheng40@huawei.com
5 years agonet: vlan: Fixed signedness in vlan_group_prealloc_vid()
Florian Fainelli [Mon, 28 Sep 2020 02:31:50 +0000 (19:31 -0700)]
net: vlan: Fixed signedness in vlan_group_prealloc_vid()

After commit d0186842ec5f ("net: vlan: Avoid using BUG() in
vlan_proto_idx()"), vlan_proto_idx() was changed to return a signed
integer, however one of its called: vlan_group_prealloc_vid() was still
using an unsigned integer for its return value, fix that.

Fixes: d0186842ec5f ("net: vlan: Avoid using BUG() in vlan_proto_idx()")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'bnxt_en-Update-for-net-next'
David S. Miller [Sun, 27 Sep 2020 20:35:46 +0000 (13:35 -0700)]
Merge branch 'bnxt_en-Update-for-net-next'

Michael Chan says:

====================
bnxt_en: Update for net-next.

This patch series adds 2 main features to the bnxt_en driver: 200G
link speed support and FEC support with some refactoring of the
link speed logic.  The firmware interface is updated to have proper
support for these 2 features.  The ethtool preset max channel value
is also adjusted properly to account for XDP and TCs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Improve preset max value for ethtool -l.
Michael Chan [Sun, 27 Sep 2020 17:42:20 +0000 (13:42 -0400)]
bnxt_en: Improve preset max value for ethtool -l.

The current logic that calculates the preset maximum value for combined
channel does not take into account the rings used for XDP and mqprio
TCs.  Each of these features will reduce the number of TX rings.  Add
the logic to divide the TX rings accordingly based on whether the
device is currently in XDP mode and whether TCs are in use.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Implement ethtool set_fec_param() method.
Michael Chan [Sun, 27 Sep 2020 17:42:19 +0000 (13:42 -0400)]
bnxt_en: Implement ethtool set_fec_param() method.

This feature allows the user to set the different FEC modes on the NIC
port.  Any new setting will take effect immediately after a link toggle.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Report Active FEC encoding during link up.
Michael Chan [Sun, 27 Sep 2020 17:42:18 +0000 (13:42 -0400)]
bnxt_en: Report Active FEC encoding during link up.

The current code is reporting the FEC configured settings during link up.
Change it to report the more useful active FEC encoding that may be
negotiated or auto detected.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Report FEC settings to ethtool.
Michael Chan [Sun, 27 Sep 2020 17:42:17 +0000 (13:42 -0400)]
bnxt_en: Report FEC settings to ethtool.

Implement .get_fecparam() method to report the configured and active FEC
settings.  Also report the supported and advertised FEC settings to
the .get_link_ksettings() method.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: avoid link reset if speed is not changed
Edwin Peer [Sun, 27 Sep 2020 17:42:16 +0000 (13:42 -0400)]
bnxt_en: avoid link reset if speed is not changed

PORT_PHY_CONFIG is always sent with REQ_FLAGS_RESET_PHY set. This flag
must be set in order for the firmware to institute the requested PHY
change immediately, but it results in a link flap. This is unnecessary
and results in an improved user experience if the PHY reconfiguration
is avoided when the user requested speed does not constitute a change.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Handle ethernet link being disabled by firmware.
Michael Chan [Sun, 27 Sep 2020 17:42:15 +0000 (13:42 -0400)]
bnxt_en: Handle ethernet link being disabled by firmware.

On some 200G dual port NICs, if one port is configured to 200G,
firmware will disable the ethernet link on the other port.  Firmware
will send notification to the driver for the disabled port when this
happens.  Define a new field in the link_info structure to keep track
of this state.  The new phy_state field replaces the unused loop_back
field.

Log a message when the phy_state changes state.  In the disabled state,
disallow any PHY configurations on the disabled port as the firmware
will fail all calls to configure the PHY in this state.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: ethtool: support PAM4 link speeds up to 200G
Edwin Peer [Sun, 27 Sep 2020 17:42:14 +0000 (13:42 -0400)]
bnxt_en: ethtool: support PAM4 link speeds up to 200G

Add ethtool PAM4 link modes for:
        50000baseCR_Full
        100000baseCR2_Full
        200000baseCR4_Full

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: add basic infrastructure to support PAM4 link speeds
Edwin Peer [Sun, 27 Sep 2020 17:42:13 +0000 (13:42 -0400)]
bnxt_en: add basic infrastructure to support PAM4 link speeds

The firmware interface has added support for new link speeds using
PAM4 modulation.  Expand the bnxt_link_info structure to closely
mirror the new firmware structures.  Add logic to copy the PAM4
capabilities and settings from the firmware.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: refactor bnxt_get_fw_speed()
Edwin Peer [Sun, 27 Sep 2020 17:42:12 +0000 (13:42 -0400)]
bnxt_en: refactor bnxt_get_fw_speed()

It will be necessary to update more than one field in the link_info
structure when PAM4 speeds are added in a later patch. Instead of
merely translating ethtool speed values to firmware speed values,
change the responsiblity of this function to update all the necessary
link_info fields required to force the speed change to the desired
ethtool value. This also reduces code duplication somewhat at the two
call sites, which otherwise both have to independently update link_info
fields to turn off auto negotiation advertisements.

Also use the appropriate REQ_FORCE_LINK_SPEED definitions. These happen
to have the same values, but req_link_speed is utilimately passed as
force_link_speed in HWRM_PORT_PHY_CFG which is not defined in terms of
REQ_AUTO_LINK_SPEED.

Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: refactor code to limit speed advertising
Edwin Peer [Sun, 27 Sep 2020 17:42:11 +0000 (13:42 -0400)]
bnxt_en: refactor code to limit speed advertising

Extract the code for determining an advertised speed is no longer
supported into a separate function. This will avoid some code
duplication in a later patch when supporting PAM4 speeds, since
these speeds are specified in a separate field.

Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Update firmware interface spec to 1.10.1.65.
Michael Chan [Sun, 27 Sep 2020 17:42:10 +0000 (13:42 -0400)]
bnxt_en: Update firmware interface spec to 1.10.1.65.

The main changes include FEC, ECN statistics, HWRM_PORT_PHY_QCFG
response size reduction, and a new counter added to
ctx_hw_stats_ext struct to support the new 58818 chip.

The ctx_hw_stats_ext structure is now the superset supporting the new
58818 chips and the prior P5 chips.  Add a new flag to identify the new
chip and use constants for the chip specific ring statistics sizes
instead of the size of the structure.

Because the HWRM_PORT_PHY_QCFG response structure size has shrunk back
to 96 bytes, the workaround added earlier to limit the size of this
message for forwarding to the VF can be removed.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoptp: add stub function for ptp_get_msgtype()
Yangbo Lu [Sun, 27 Sep 2020 08:01:50 +0000 (16:01 +0800)]
ptp: add stub function for ptp_get_msgtype()

Added the missing stub function for ptp_get_msgtype().

Fixes: 036c508ba95e ("ptp: Add generic ptp message type function")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: marvell: mvpp2: Fix W=1 warning with !CONFIG_ACPI
Andrew Lunn [Sat, 26 Sep 2020 21:26:03 +0000 (23:26 +0200)]
net: marvell: mvpp2: Fix W=1 warning with !CONFIG_ACPI

rivers/net/ethernet/marvell/mvpp2/mvpp2_main.c:7084:36: warning: ‘mvpp2_acpi_match’ defined but not used [-Wunused-const-variable=]
 7084 | static const struct acpi_device_id mvpp2_acpi_match[] = {
      |                                    ^~~~~~~~~~~~~~~~

Wrap the definition inside #ifdef/#endif.

Compile tested only.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'mlxsw-Expose-transceiver-overheat-counter'
David S. Miller [Sun, 27 Sep 2020 20:27:01 +0000 (13:27 -0700)]
Merge branch 'mlxsw-Expose-transceiver-overheat-counter'

Ido Schimmel says:

====================
mlxsw: Expose transceiver overheat counter

Amit says:

An overheated transceiver can be the root cause of various network
problems such as link flapping. Counting the number of times a
transceiver's temperature was higher than its configured threshold can
therefore help in debugging such issues.

This patch set exposes a transceiver overheat counter via ethtool. This
is achieved by configuring the Spectrum ASIC to generate events whenever
a transceiver is overheated. The temperature thresholds are queried from
the transceiver (if available) and set to the default otherwise.

Example:

...
transceiver_overheat: 2

Patch set overview:

Patches #1-#3 add required device registers
Patches #4-#5 add required infrastructure in mlxsw to configure and
count overheat events
Patches #6-#9 gradually add support for the transceiver overheat counter
Patch #10 exposes the transceiver overheat counter via ethtool
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: spectrum_ethtool: Expose transceiver_overheat counter
Amit Cohen [Sun, 27 Sep 2020 07:50:15 +0000 (10:50 +0300)]
mlxsw: spectrum_ethtool: Expose transceiver_overheat counter

Add structures for port statistics which read from core and not directly
from registers.

When netdev's ethtool statistics are queried, query the corresponding
module's overheat counter from core and expose it as
"transceiver_overheat".

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: Update module's settings when module is plugged in
Amit Cohen [Sun, 27 Sep 2020 07:50:14 +0000 (10:50 +0300)]
mlxsw: Update module's settings when module is plugged in

Module temperature warning events are enabled for modules that have a
temperature sensor and configured according to the temperature
thresholds queried from the module.

When a module is unplugged we are guaranteed not to get temperature
warning events. However, when a module is plugged in we need to
potentially update its current settings (i.e., event enablement and
thresholds).

Register to port module plug/unplug events and update module's settings
upon plug in events.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: spectrum: Initialize netdev's module overheat counter
Amit Cohen [Sun, 27 Sep 2020 07:50:13 +0000 (10:50 +0300)]
mlxsw: spectrum: Initialize netdev's module overheat counter

The overheat counter is a per-module counter, but it is exposed as part
of the corresponding netdev's statistics. It should therefore be
presented to user space relative to the netdev's lifetime.

Query the counter just before registering the netdev, so that the value
exposed to user space will be relative to this initial value.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: Enable temperature event for all supported port module sensors
Amit Cohen [Sun, 27 Sep 2020 07:50:12 +0000 (10:50 +0300)]
mlxsw: Enable temperature event for all supported port module sensors

MTWE (Management Temperature Warning Event) is triggered for sensors
whose temperature event enable bit is enabled in the MTMP register.

Enable events for all the modules that have a temperature sensor.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: Update transceiver_overheat counter according to MTWE
Amit Cohen [Sun, 27 Sep 2020 07:50:11 +0000 (10:50 +0300)]
mlxsw: Update transceiver_overheat counter according to MTWE

MTWE (Management Temperature Warning Event) is triggered when module's
temperature is higher than its threshold.

Register for MTWE events and increase the module's overheat counter when
its corresponding sensor goes above the configured threshold.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: core: Add an infrastructure to track transceiver overheat counter
Amit Cohen [Sun, 27 Sep 2020 07:50:10 +0000 (10:50 +0300)]
mlxsw: core: Add an infrastructure to track transceiver overheat counter

Initialize an array that stores per-module overheat state and a counter
indicating how many times the module was in overheat state.

Export a function to query the counter according to module number.
Will be used later on by the switch driver (i.e., mlxsw_spectrum) to expose
module's overheat counter as part of ethtool statistics.

Initialize mlxsw_env after driver initialization to be able to query
number of modules from MGPIR register.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: core_hwmon: Query MTMP before writing to set only relevant fields
Amit Cohen [Sun, 27 Sep 2020 07:50:09 +0000 (10:50 +0300)]
mlxsw: core_hwmon: Query MTMP before writing to set only relevant fields

The MTMP register controls various temperature settings on a per-sensor
basis. Subsequent patches are going to alter some of these settings for
sensors found on port modules in response to certain events.

In order to prevent the current callers that write to MTMP from
overriding these settings, have them first query the register and then
change only the relevant register fields.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: reg: Add Ports Module Administrative and Operational Status Register
Amit Cohen [Sun, 27 Sep 2020 07:50:08 +0000 (10:50 +0300)]
mlxsw: reg: Add Ports Module Administrative and Operational Status Register

PMAOS register configures and retrieves the per module status.
The register is used also for enabling event for status change.

It will be used to enable PMPE (Port Module Plug/Unplug) event.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: reg: Add Port Module Plug/Unplug Event Register
Amit Cohen [Sun, 27 Sep 2020 07:50:07 +0000 (10:50 +0300)]
mlxsw: reg: Add Port Module Plug/Unplug Event Register

PMPE register reports any operational status change of a module.
It will be used for enabling temperature warning event when a module is
plugged in.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: reg: Add Management Temperature Warning Event Register
Amit Cohen [Sun, 27 Sep 2020 07:50:06 +0000 (10:50 +0300)]
mlxsw: reg: Add Management Temperature Warning Event Register

Add MTWE (Management Temperature Warning Event) register, which is used
for over temperature warning.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'hns3-next'
David S. Miller [Sun, 27 Sep 2020 20:25:22 +0000 (13:25 -0700)]
Merge branch 'hns3-next'

Huazhong Tan says:

====================
net: hns3: updates for -next

To facilitate code maintenance and compatibility, #1 and #2 add
device version to replace pci revision, #3 to #9 adds support for
querying device capabilities and specifications, then the driver
can use these query results to implement corresponding features
(some features will be implemented later).

And #10 is a minor cleanup since too many parameters for
hclge_shaper_para_calc().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: add a structure for IR shaper's parameter in hclge_shaper_para_calc()
Huazhong Tan [Sun, 27 Sep 2020 07:12:48 +0000 (15:12 +0800)]
net: hns3: add a structure for IR shaper's parameter in hclge_shaper_para_calc()

As function hclge_shaper_para_calc() has too many arguments to add
more, so encapsulate its three arguments ir_b, ir_u, ir_s into a
structure.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: add a check for device specifications queried from firmware
Guangbin Huang [Sun, 27 Sep 2020 07:12:47 +0000 (15:12 +0800)]
net: hns3: add a check for device specifications queried from firmware

The device specifications querying is unsupported by the old
firmware, in this case, these specifications are 0. However,
some specifications should not be 0 or will cause problem.

So after querying from firmware, some device specifications
are needed to check their value and set to default value if
their values are 0.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: replace the macro of max tm rate with the queried specification
Guangbin Huang [Sun, 27 Sep 2020 07:12:46 +0000 (15:12 +0800)]
net: hns3: replace the macro of max tm rate with the queried specification

The max tm rate is a fixed value(100Gb/s) now as it is defined by a
macro. In order to support other rates in different kinds of device,
it is better to use specification queried from firmware to replace
this macro.

As function hclge_shaper_para_calc() has too many arguments to add
more, so encapsulate its three arguments ir_b, ir_u, ir_s into a
structure.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: add support to query device specifications
Guangbin Huang [Sun, 27 Sep 2020 07:12:45 +0000 (15:12 +0800)]
net: hns3: add support to query device specifications

To improve code maintainability and compatibility, new commands
HCLGE_OPC_QUERY_DEV_SPECS for PF and HCLGEVF_OPC_QUERY_DEV_SPECS
for VF are introduced to query device specifications, instead of
statically defining specifications by checking the hardware version
or other methods.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: add debugfs to dump device capabilities
Guangbin Huang [Sun, 27 Sep 2020 07:12:44 +0000 (15:12 +0800)]
net: hns3: add debugfs to dump device capabilities

Adds debugfs to dump each device capability whether is supported.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: use capabilities queried from firmware
Guangbin Huang [Sun, 27 Sep 2020 07:12:43 +0000 (15:12 +0800)]
net: hns3: use capabilities queried from firmware

In order to improve code maintainability and compatibility, the
capabilities of new features are queried from firmware.

The member flag in struct hnae3_ae_dev indicates not only
capabilities, but some initialized status. As capabilities bits
queried from firmware is too many, it is better to use new member
to indicate them. So adds member capabs in struce hnae3_ae_dev.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: use capability flag to indicate FEC
Guangbin Huang [Sun, 27 Sep 2020 07:12:42 +0000 (15:12 +0800)]
net: hns3: use capability flag to indicate FEC

Currently, the revision of the pci device is used to identify
whether FEC is supported, which is not good for maintainability
and compatibility. So use a capability flag to do that.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: add support to query device capability
Guangbin Huang [Sun, 27 Sep 2020 07:12:41 +0000 (15:12 +0800)]
net: hns3: add support to query device capability

In order to improve code maintainability and compatibility,
add support to query the device capability by expanding the
existing version query command. The device capability refers
to the features supported by the device.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: delete redundant PCI revision judgement
Guangbin Huang [Sun, 27 Sep 2020 07:12:40 +0000 (15:12 +0800)]
net: hns3: delete redundant PCI revision judgement

Fibre device of PCI revision 0x20 don't support autoneg, and the ops
get_autoneg() return AUTONEG_DISABLE so function hns3_nway_reset()
will return earlier than judging PCI revision.

Function hclge_handle_rocee_ras_error() don't need to judge PCI
revision again because its caller hclge_handle_hw_ras_error() has
judged once.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: add device version to replace pci revision
Guangbin Huang [Sun, 27 Sep 2020 07:12:39 +0000 (15:12 +0800)]
net: hns3: add device version to replace pci revision

To better identify the device version, struct hnae3_handle adds a
member dev_version to replace pci revision. The dev_version consists
of hardware version and PCI revision. The hardware version is queried
from firmware by an existing firmware version query command.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonetdevsim: fix duplicated debugfs directory
Jakub Kicinski [Sat, 26 Sep 2020 01:19:13 +0000 (18:19 -0700)]
netdevsim: fix duplicated debugfs directory

The "ethtool" debugfs directory holds per-netdev knobs, so move
it from the device instance directory to the port directory.

This fixes the following warning when creating multiple ports:

 debugfs: Directory 'ethtool' with parent 'netdevsim1' already present!

Fixes: ff1f7c17fb20 ("netdevsim: add pause frame stats")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'Generic-adjustment-for-flow-dissector-in-DSA'
David S. Miller [Sat, 26 Sep 2020 21:17:59 +0000 (14:17 -0700)]
Merge branch 'Generic-adjustment-for-flow-dissector-in-DSA'

Vladimir Oltean says:

====================
Generic adjustment for flow dissector in DSA

This is the v2 of a series initially submitted in May:
https://www.spinics.net/lists/netdev/msg651866.html

The end goal is to get rid of the unintuitive code for the flow
dissector that currently exists in the taggers. It can all be replaced
by a single, common function.

Some background work needs to be done for that. Especially the ocelot
driver poses some problems, since it has a different tag length between
RX and TX, and I didn't want to make DSA aware of that, since I could
instead make the tag lengths equal.

Changes in v3:
- Added an optimization (08/15) that makes the generic case not need to
  call the .flow_dissect function pointer. Basically .flow_dissect now
  currently only exists for sja1105.
- Moved the .promisc_on_master property to the tagger structure.
- Added the .tail_tag property to the tagger structure.
- Disabled "suppresscc = all" from my .gitconfig.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: tag_rtl4_a: use the generic flow dissector procedure
Vladimir Oltean [Sat, 26 Sep 2020 19:32:15 +0000 (22:32 +0300)]
net: dsa: tag_rtl4_a: use the generic flow dissector procedure

Remove the .flow_dissect procedure, so the flow dissector will call the
generic variant which works for this tagging protocol.

Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: DENG Qingfang <dqfext@gmail.com>
Cc: Mauri Sandberg <sandberg@mailfence.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: tag_sja1105: use a custom flow dissector procedure
Vladimir Oltean [Sat, 26 Sep 2020 19:32:14 +0000 (22:32 +0300)]
net: dsa: tag_sja1105: use a custom flow dissector procedure

The sja1105 is a bit of a special snowflake, in that not all frames are
transmitted/received in the same way. L2 link-local frames are received
with the source port/switch ID information put in the destination MAC
address. For the rest, a tag_8021q header is used. So only the latter
frames displace the rest of the headers and need to use the generic flow
dissector procedure.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: tag_qca: use the generic flow dissector procedure
Vladimir Oltean [Sat, 26 Sep 2020 19:32:13 +0000 (22:32 +0300)]
net: dsa: tag_qca: use the generic flow dissector procedure

Remove the .flow_dissect procedure, so the flow dissector will call the
generic variant which works for this tagging protocol.

Cc: John Crispin <john@phrozen.org>
Cc: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: tag_mtk: use the generic flow dissector procedure
Vladimir Oltean [Sat, 26 Sep 2020 19:32:12 +0000 (22:32 +0300)]
net: dsa: tag_mtk: use the generic flow dissector procedure

Remove the .flow_dissect procedure, so the flow dissector will call the
generic variant which works for this tagging protocol.

Cc: DENG Qingfang <dqfext@gmail.com>
Cc: Sean Wang <sean.wang@mediatek.com>
Cc: John Crispin <john@phrozen.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: tag_edsa: use the generic flow dissector procedure
Vladimir Oltean [Sat, 26 Sep 2020 19:32:11 +0000 (22:32 +0300)]
net: dsa: tag_edsa: use the generic flow dissector procedure

Remove the .flow_dissect procedure, so the flow dissector will call the
generic variant which works for this tagging protocol.

Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: tag_dsa: use the generic flow dissector procedure
Vladimir Oltean [Sat, 26 Sep 2020 19:32:10 +0000 (22:32 +0300)]
net: dsa: tag_dsa: use the generic flow dissector procedure

Remove the .flow_dissect procedure, so the flow dissector will call the
generic variant which works for this tagging protocol.

Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: tag_brcm: use generic flow dissector procedure
Vladimir Oltean [Sat, 26 Sep 2020 19:32:09 +0000 (22:32 +0300)]
net: dsa: tag_brcm: use generic flow dissector procedure

There are 2 Broadcom tags in use, one places the DSA tag before the
Ethernet destination MAC address, and the other before the EtherType.
Nonetheless, both displace the rest of the headers, so this tagger can
use the generic flow dissector procedure which accounts for that.

The ASCII art drawing is a good reference though, so keep it but move it
somewhere else.

Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: flow_dissector: avoid indirect call to DSA .flow_dissect for generic case
Vladimir Oltean [Sat, 26 Sep 2020 19:32:08 +0000 (22:32 +0300)]
net: flow_dissector: avoid indirect call to DSA .flow_dissect for generic case

With the recent mitigations against speculative execution exploits,
indirect function calls are more expensive and it would be good to avoid
them where possible.

In the case of DSA, most switch taggers will shift the EtherType and
next headers by a fixed amount equal to that tag's length in bytes.
So we can use a generic procedure to determine that, without calling
into custom tagger code. However we still leave the flow_dissect method
inside struct dsa_device_ops as an override for the generic function.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: point out the tail taggers
Vladimir Oltean [Sat, 26 Sep 2020 19:32:07 +0000 (22:32 +0300)]
net: dsa: point out the tail taggers

The Marvell 88E6060 uses tag_trailer.c and the KSZ8795, KSZ9477 and
KSZ9893 switches also use tail tags.

Tell that to the DSA core, since this makes a difference for the flow
dissector. Most switches break the parsing of frame headers, but these
ones don't, so no flow dissector adjustment needs to be done for them.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: add a generic procedure for the flow dissector
Vladimir Oltean [Sat, 26 Sep 2020 19:32:06 +0000 (22:32 +0300)]
net: dsa: add a generic procedure for the flow dissector

For all DSA formats that don't use tail tags, it looks like behind the
obscure number crunching they're all doing the same thing: locating the
real EtherType behind the DSA tag. Nonetheless, this is not immediately
obvious, so create a generic helper for those DSA taggers that put the
header before the EtherType.

Another assumption for the generic function is that the DSA tags are of
equal length on RX and on TX. Prior to the previous patch, this was not
true for ocelot and for gswip. The problem was resolved for ocelot, but
for gswip it still remains, so that can't use this helper yet.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: make the .flow_dissect tagger callback return void
Vladimir Oltean [Sat, 26 Sep 2020 19:32:05 +0000 (22:32 +0300)]
net: dsa: make the .flow_dissect tagger callback return void

There is no tagger that returns anything other than zero, so just change
the return type appropriately.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: tag_ocelot: use a short prefix on both ingress and egress
Vladimir Oltean [Sat, 26 Sep 2020 19:32:04 +0000 (22:32 +0300)]
net: dsa: tag_ocelot: use a short prefix on both ingress and egress

There are 2 goals that we follow:

- Reduce the header size
- Make the header size equal between RX and TX

The issue that required long prefix on RX was the fact that the ocelot
DSA tag, being put before Ethernet as it is, would overlap with the area
that a DSA master uses for RX filtering (destination MAC address
mainly).

Now that we can ask DSA to put the master in promiscuous mode, in theory
we could remove the prefix altogether and call it a day, but it looks
like we can't. Using no prefix on ingress, some packets (such as ICMP)
would be received, while others (such as PTP) would not be received.
This is because the DSA master we use (enetc) triggers parse errors
("MAC rx frame errors") presumably because it sees Ethernet frames with
a bad length. And indeed, when using no prefix, the EtherType (bytes
12-13 of the frame, bits 96-111) falls over the REW_VAL field from the
extraction header, aka the PTP timestamp.

When turning the short (32-bit) prefix on, the EtherType overlaps with
bits 64-79 of the extraction header, which are a reserved area
transmitted as zero by the switch. The packets are not dropped by the
DSA master with a short prefix. Actually, the frames look like this in
tcpdump (below is a PTP frame, with an extra dsa_8021q tag - dadb 0482 -
added by a downstream sja1105).

89:0c:a9:f2:01:00 > 88:80:00:0a:00:1d, 802.3, length 0: LLC, \
dsap Unknown (0x10) Individual, ssap ProWay NM (0x0e) Response, \
ctrl 0x0004: Information, send seq 2, rcv seq 0, \
Flags [Response], length 78

0x0000:  8880 000a 001d 890c a9f2 0100 0000 100f  ................
0x0010:  0400 0000 0180 c200 000e 001f 7b63 0248  ............{c.H
0x0020:  dadb 0482 88f7 1202 0036 0000 0000 0000  .........6......
0x0030:  0000 0000 0000 0000 0000 001f 7bff fe63  ............{..c
0x0040:  0248 0001 1f81 0500 0000 0000 0000 0000  .H..............
0x0050:  0000 0000 0000 0000 0000 0000            ............

So the short prefix is our new default: we've shortened our RX frames by
12 octets, increased TX by 4, and headers are now equal between RX and
TX. Note that we still need promiscuous mode for the DSA master to not
drop it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: tag_sja1105: request promiscuous mode for master
Vladimir Oltean [Sat, 26 Sep 2020 19:32:03 +0000 (22:32 +0300)]
net: dsa: tag_sja1105: request promiscuous mode for master

Currently PTP is broken when ports are in standalone mode (the tagger
keeps printing this message):

sja1105 spi0.1: Expected meta frame, is 01-80-c2-00-00-0e in the DSA master multicast filter?

Sure, one might say "simply add 01-80-c2-00-00-0e to the master's RX
filter" but things become more complicated because:

- Actually all frames in the 01-80-c2-xx-xx-xx and 01-1b-19-xx-xx-xx
  range are trapped to the CPU automatically
- The switch mangles bytes 3 and 4 of the MAC address via the incl_srcpt
  ("include source port [in the DMAC]") option, which is how source port
  and switch id identification is done for link-local traffic on RX. But
  this means that an address installed to the RX filter would, at the
  end of the day, not correspond to the final address seen by the DSA
  master.

Assume RX filtering lists on DSA masters are typically too small to
include all necessary addresses for PTP to work properly on sja1105, and
just request promiscuous mode unconditionally.

Just an example:
Assuming the following addresses are trapped to the CPU:
01-80-c2-00-00-00 to 01-80-c2-00-00-ff
01-1b-19-00-00-00 to 01-1b-19-00-00-ff

These are 512 addresses.
Now let's say this is a board with 3 switches, and 4 ports per switch.
The 512 addresses become 6144 addresses that must be managed by the DSA
master's RX filtering lists.

This may be refined in the future, but for now, it is simply not worth
it to add the additional addresses to the master's RX filter, so simply
request it to become promiscuous as soon as the driver probes.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: allow drivers to request promiscuous mode on master
Vladimir Oltean [Sat, 26 Sep 2020 19:32:02 +0000 (22:32 +0300)]
net: dsa: allow drivers to request promiscuous mode on master

Currently DSA assumes that taggers don't mess with the destination MAC
address of the frames on RX. That is not always the case. Some DSA
headers are placed before the Ethernet header (ocelot), and others
simply mangle random bytes from the destination MAC address (sja1105
with its incl_srcpt option).

Currently the DSA master goes to promiscuous mode automatically when the
slave devices go too (such as when enslaved to a bridge), but in
standalone mode this is a problem that needs to be dealt with.

So give drivers the possibility to signal that their tagging protocol
will get randomly dropped otherwise, and let DSA deal with fixing that.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mscc: ocelot: move NPI port configuration to DSA
Vladimir Oltean [Sat, 26 Sep 2020 19:32:01 +0000 (22:32 +0300)]
net: mscc: ocelot: move NPI port configuration to DSA

Remove the ocelot_configure_cpu() function, which was in fact bringing
up 2 ports: the CPU port module, which both switchdev and DSA have, and
the NPI port, which only DSA has.

The (non-Ethernet) CPU port module is at a fixed index in the analyzer,
whereas the NPI port is selected through the "ethernet" property in the
device tree.

Therefore, the function to set up an NPI port is DSA-specific, so we
move it there, simplifying the ocelot switch library a little bit.

Cc: Horatiu Vultur <horatiu.vultur@microchip.com>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: UNGLinuxDriver <UNGLinuxDriver@microchip.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoRevert "vxlan: move encapsulation warning"
Jakub Kicinski [Sat, 26 Sep 2020 01:56:04 +0000 (18:56 -0700)]
Revert "vxlan: move encapsulation warning"

This reverts commit 546c044c9651e81a16833806feff6b369bb5de33.

Nothing prevents user from sending frames to "external" VxLAN devices.
In fact kernel itself may generate icmp chatter.

This is fine, such frames should be dropped.

The point of the "missing encapsulation" warning was that
frames with missing encap should not make it into vxlan_xmit_one().
And vxlan_xmit() drops them cleanly, so let it just do that.

Without this revert the warning is triggered by the udp_tunnel_nic.sh
test, but the minimal repro is:

$ ip link add vxlan0 type vxlan \
                  group 239.1.1.1 \
     dev lo \
     dstport 1234 \
     external
$ ip li set dev vxlan0 up

[  419.165981] vxlan0: Missing encapsulation instructions
[  419.166551] WARNING: CPU: 0 PID: 1041 at drivers/net/vxlan.c:2889 vxlan_xmit+0x15c0/0x1fc0 [vxlan]

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'devlink-flash-update-overwrite-mask'
David S. Miller [Sat, 26 Sep 2020 00:20:57 +0000 (17:20 -0700)]
Merge branch 'devlink-flash-update-overwrite-mask'

Jacob Keller says:

====================
devlink flash update overwrite mask

This series introduces support for a new attribute to the flash update
command: DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK.

This attribute is a bitfield which allows userspace to specify what set of
subfields to overwrite when performing a flash update for a device.

The intention is to support the ability to control the behavior of
overwriting the configuration and identifying fields in the Intel ice device
flash update process. This is necessary  as the firmware layout for the ice
device includes some settings and configuration within the same flash
section as the main firmware binary.

This series, and the accompanying iproute2 series, introduce support for the
attribute. Once applied, the overwrite support can be be invoked via
devlink:

  # overwrite settings
  devlink dev flash pci/0000:af:00.0 file firmware.bin overwrite settings

  # overwrite identifiers and settings
  devlink dev flash pci/0000:af:00.0 file firmware.bin overwrite settings overwrite identifiers

To aid in the safe addition of new parameters, first some refactoring is
done to the .flash_update function: its parameters are converted from a
series of function arguments into a structure. This makes it easier to add
the new parameter without changing the signature of the .flash_update
handler in the future. Additionally, a "supported_flash_update_params" field
is added to devlink_ops. This field is similar to the ethtool
"supported_coalesc_params" field. The devlink core will now check that the
DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT bit is set before forwarding the
component attribute. Similarly, the new overwrite attribute will also
require a supported bit.

Doing these refactors will aid in adding any other attributes in the future,
and creates a good pattern for other interfaces to use in the future. By
requiring drivers to opt-in, we reduce the risk of accidentally breaking
drivers when ever we add an additional parameter. We also reduce boiler
plate code in drivers which do not support the parameters.

Changes since v9:
* rebased to current net-next, no other changes

Changes since v7
* resend, hopefully avoiding the SMTP server issues I experienced on Friday

Changes since v6
* Rebased to current net-next to resolve conflicts
* Added changes to the ionic driver that recently merged flash update support
* Fixed the changes for mlxsw to apply to core instead of spectrum.c after
  the recent refactor.
* Picked up the review tags from Jakub

Changes since v5
* Fix *all* of the BIT usage to use _BITUL() (thanks Jakub!)

Changes since v4
* Renamed nla_overwrite to nla_overwrite_mask at Jiri's suggestion
* Added "by this device" to the netlink error messages for unsupported
  attributes
* Removed use of BIT() in the uapi header
* Fixed the commit message for the netdevsim patch
* Picked up Jakub's reviewed
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoice: add support for flash update overwrite mask
Jacob Keller [Fri, 25 Sep 2020 20:46:09 +0000 (13:46 -0700)]
ice: add support for flash update overwrite mask

Support the recently added DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK
parameter in the ice flash update handler. Convert the overwrite mask
bitfield into the appropriate preservation level used by the firmware
when updating.

Because there is no equivalent preservation level for overwriting only
identifiers, this combination is rejected by the driver as not supported
with an appropriate extended ACK message.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonetdevsim: add support for flash_update overwrite mask
Jacob Keller [Fri, 25 Sep 2020 20:46:08 +0000 (13:46 -0700)]
netdevsim: add support for flash_update overwrite mask

The devlink interface recently gained support for a new "overwrite mask"
parameter that allows specifying how various sub-sections of a flash
component are modified when updating.

Add support for this to netdevsim, to enable easily testing the
interface. Make the allowed overwrite mask values controllable via
a debugfs parameter. This enables testing a flow where the driver
rejects an unsupportable overwrite mask.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodevlink: introduce flash update overwrite mask
Jacob Keller [Fri, 25 Sep 2020 20:46:07 +0000 (13:46 -0700)]
devlink: introduce flash update overwrite mask

Sections of device flash may contain settings or device identifying
information. When performing a flash update, it is generally expected
that these settings and identifiers are not overwritten.

However, it may sometimes be useful to allow overwriting these fields
when performing a flash update. Some examples include, 1) customizing
the initial device config on first programming, such as overwriting
default device identifying information, or 2) reverting a device
configuration to known good state provided in the new firmware image, or
3) in case it is suspected that current firmware logic for managing the
preservation of fields during an update is broken.

Although some devices are able to completely separate these types of
settings and fields into separate components, this is not true for all
hardware.

To support controlling this behavior, a new
DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK is defined. This is an
nla_bitfield32 which will define what subset of fields in a component
should be overwritten during an update.

If no bits are specified, or of the overwrite mask is not provided, then
an update should not overwrite anything, and should maintain the
settings and identifiers as they are in the previous image.

If the overwrite mask has the DEVLINK_FLASH_OVERWRITE_SETTINGS bit set,
then the device should be configured to overwrite any of the settings in
the requested component with settings found in the provided image.

Similarly, if the DEVLINK_FLASH_OVERWRITE_IDENTIFIERS bit is set, the
device should be configured to overwrite any device identifiers in the
requested component with the identifiers from the image.

Multiple overwrite modes may be combined to indicate that a combination
of the set of fields that should be overwritten.

Drivers which support the new overwrite mask must set the
DEVLINK_SUPPORT_FLASH_UPDATE_OVERWRITE_MASK in the
supported_flash_update_params field of their devlink_ops.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodevlink: convert flash_update to use params structure
Jacob Keller [Fri, 25 Sep 2020 20:46:06 +0000 (13:46 -0700)]
devlink: convert flash_update to use params structure

The devlink core recently gained support for checking whether the driver
supports a flash_update parameter, via `supported_flash_update_params`.
However, parameters are specified as function arguments. Adding a new
parameter still requires modifying the signature of the .flash_update
callback in all drivers.

Convert the .flash_update function to take a new `struct
devlink_flash_update_params` instead. By using this structure, and the
`supported_flash_update_params` bit field, a new parameter to
flash_update can be added without requiring modification to existing
drivers.

As before, all parameters except file_name will require driver opt-in.
Because file_name is a necessary field to for the flash_update to make
sense, no "SUPPORTED" bitflag is provided and it is always considered
valid. All future additional parameters will require a new bit in the
supported_flash_update_params bitfield.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Chan <michael.chan@broadcom.com>
Cc: Bin Luo <luobin9@huawei.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Ido Schimmel <idosch@mellanox.com>
Cc: Danielle Ratson <danieller@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodevlink: check flash_update parameter support in net core
Jacob Keller [Fri, 25 Sep 2020 20:46:05 +0000 (13:46 -0700)]
devlink: check flash_update parameter support in net core

When implementing .flash_update, drivers which do not support
per-component update are manually checking the component parameter to
verify that it is NULL. Without this check, the driver might accept an
update request with a component specified even though it will not honor
such a request.

Instead of having each driver check this, move the logic into
net/core/devlink.c, and use a new `supported_flash_update_params` field
in the devlink_ops. Drivers which will support per-component update must
now specify this by setting DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT in
the supported_flash_update_params in their devlink_ops.

This helps ensure that drivers do not forget to check for a NULL
component if they do not support per-component update. This also enables
a slightly better error message by enabling the core stack to set the
netlink bad attribute message to indicate precisely the unsupported
attribute in the message.

Going forward, any new additional parameter to flash update will require
a bit in the supported_flash_update_params bitfield.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Chan <michael.chan@broadcom.com>
Cc: Bin Luo <luobin9@huawei.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Ido Schimmel <idosch@mellanox.com>
Cc: Danielle Ratson <danieller@mellanox.com>
Cc: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'simplify-TCP-loss-marking-code'
David S. Miller [Sat, 26 Sep 2020 00:17:14 +0000 (17:17 -0700)]
Merge branch 'simplify-TCP-loss-marking-code'

Yuchung Cheng says:

====================
simplify TCP loss marking code

The TCP loss marking is implemented by a set of intertwined
subroutines. TCP has several loss detection algorithms
(RACK, RFC6675/FACK, NewReno, etc) each calls a subset of
these routines to mark a packet lost. This has led to
various bugs (and fixes and fixes of fixes).

This patch set is to consolidate the loss marking code so
all detection algorithms call the same routine tcp_mark_skb_lost().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotcp: consolidate tcp_mark_skb_lost and tcp_skb_mark_lost
Yuchung Cheng [Fri, 25 Sep 2020 17:04:31 +0000 (10:04 -0700)]
tcp: consolidate tcp_mark_skb_lost and tcp_skb_mark_lost

tcp_skb_mark_lost is used by RFC6675-SACK and can easily be replaced
with the new tcp_mark_skb_lost handler.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotcp: simplify tcp_mark_skb_lost
Yuchung Cheng [Fri, 25 Sep 2020 17:04:30 +0000 (10:04 -0700)]
tcp: simplify tcp_mark_skb_lost

This patch consolidates and simplifes the loss marking logic used
by a few loss detections (RACK, RFC6675, NewReno). Previously
each detection uses a subset of several intertwined subroutines.
This unncessary complexity has led to bugs (and fixes of bug fixes).

tcp_mark_skb_lost now is the single one routine to mark a packet loss
when a loss detection caller deems an skb ist lost:

   1. rewind tp->retransmit_hint_skb if skb has lower sequence or
      all lost ones have been retransmitted.

   2. book-keeping: adjust flags and counts depending on if skb was
      retransmitted or not.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotcp: move tcp_mark_skb_lost
Yuchung Cheng [Fri, 25 Sep 2020 17:04:29 +0000 (10:04 -0700)]
tcp: move tcp_mark_skb_lost

A pure refactor to move tcp_mark_skb_lost to tcp_input.c to prepare
for the later loss marking consolidation.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotcp: consistently check retransmit hint
Yuchung Cheng [Fri, 25 Sep 2020 17:04:28 +0000 (10:04 -0700)]
tcp: consistently check retransmit hint

tcp_simple_retransmit() used for path MTU discovery may not adjust
the retransmit hint properly by deducting retrans_out before checking
it to adjust the hint. This patch fixes this by a correct routine
tcp_mark_skb_lost() already used by the RACK loss detection.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodpaa2-mac: Fix potential null pointer dereference
Gustavo A. R. Silva [Fri, 25 Sep 2020 17:03:23 +0000 (12:03 -0500)]
dpaa2-mac: Fix potential null pointer dereference

There is a null-check for _pcs_, but it is being dereferenced
prior to this null-check. So, if _pcs_ can actually be null,
then there is a potential null pointer dereference that should
be fixed by null-checking _pcs_ before being dereferenced.

Addresses-Coverity-ID: 1497159 ("Dereference before null check")
Fixes: 94ae899b2096 ("dpaa2-mac: add PCS support through the Lynx module")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'dpaa2-eth-small-updates'
David S. Miller [Sat, 26 Sep 2020 00:10:27 +0000 (17:10 -0700)]
Merge branch 'dpaa2-eth-small-updates'

Ioana Ciornei says:

====================
dpaa2-eth: small updates

This patch set is just a collection of small updates to the dpaa2-eth
driver.

First, we only need to check the availability of the DTS child node, not
both child and parent node. Then remove a call to
dpaa2_eth_link_state_update() which is now just a leftover and it's not
useful in how are things working now in the PHY integration. Lastly,
modify how the driver is behaving when the the flow steering table is
used between all the traffic classes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodpaa2-eth: install a single steering rule when SHARED_FS is enabled
Ionut-robert Aron [Fri, 25 Sep 2020 14:44:21 +0000 (17:44 +0300)]
dpaa2-eth: install a single steering rule when SHARED_FS is enabled

When SHARED_FS is enabled on a DPNI object the flow steering tables are
shared between all the traffic classes. Modify the driver so that we
only add a new flow steering entry on the TC#0 when this new option is
enabled.

Signed-off-by: Ionut-robert Aron <ionut-robert.aron@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodpaa2-eth: no need to check link state right after ndo_open
Ioana Ciornei [Fri, 25 Sep 2020 14:44:20 +0000 (17:44 +0300)]
dpaa2-eth: no need to check link state right after ndo_open

The call to dpaa2_eth_link_state_update() is a leftover from the time
when on DPAA2 platforms the PHYs were started at boot time so when an
ifconfig was issued on the associated interface, the link status needed
to be checked directly from the ndo_open() callback.
This is not needed anymore since we are now properly integrated with the
PHY layer thus a link interrupt will come directly from the PHY
eventually without the need to call the sync function.
Fix this up by removing the call to dpaa2_eth_link_state_update().

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodpaa2-mac: do not check for both child and parent DTS nodes
Ioana Ciornei [Fri, 25 Sep 2020 14:44:19 +0000 (17:44 +0300)]
dpaa2-mac: do not check for both child and parent DTS nodes

There is no need to check if both the MDIO controller node and its
child node, the PCS device, are available since there is no chance that
the child node would be enabled when the parent it's not.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'vxlan-clean-up'
David S. Miller [Fri, 25 Sep 2020 23:58:07 +0000 (16:58 -0700)]
Merge branch 'vxlan-clean-up'

Fabian Frederick says:

====================
vxlan: clean-up

This small patchet does some clean-up on vxlan.
Second version removes VXLAN_NL2FLAG macro relevant patches as suggested by Michal and David

I hope to have some feedback/ACK from vxlan developers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agovxlan: fix vxlan_find_sock() documentation for l3mdev
Fabian Frederick [Fri, 25 Sep 2020 13:17:17 +0000 (15:17 +0200)]
vxlan: fix vxlan_find_sock() documentation for l3mdev

Since commit aab8cc3630e32
("vxlan: add support for underlay in non-default VRF")

vxlan_find_sock() also checks if socket is assigned to the right
level 3 master device when lower device is not in the default VRF.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agovxlan: check rtnl_configure_link return code correctly
Fabian Frederick [Fri, 25 Sep 2020 13:16:59 +0000 (15:16 +0200)]
vxlan: check rtnl_configure_link return code correctly

rtnl_configure_link is always checked if < 0 for error code.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agovxlan: move encapsulation warning
Fabian Frederick [Fri, 25 Sep 2020 13:16:39 +0000 (15:16 +0200)]
vxlan: move encapsulation warning

vxlan_xmit_one() was only called from vxlan_xmit() without rdst and
info was already tested. Emit warning in that function instead

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agovxlan: add unlikely to vxlan_remcsum check
Fabian Frederick [Fri, 25 Sep 2020 13:16:18 +0000 (15:16 +0200)]
vxlan: add unlikely to vxlan_remcsum check

small optimization around checking as it's being done in all
receptions

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agovxlan: don't collect metadata if remote checksum is wrong
Fabian Frederick [Fri, 25 Sep 2020 13:16:02 +0000 (15:16 +0200)]
vxlan: don't collect metadata if remote checksum is wrong

call vxlan_remcsum() before md filling in vxlan_rcv()

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: bridge: mcast: remove only S,G port groups from sg_port hash
Nikolay Aleksandrov [Fri, 25 Sep 2020 10:25:49 +0000 (13:25 +0300)]
net: bridge: mcast: remove only S,G port groups from sg_port hash

We should remove a group from the sg_port hash only if it's an S,G
entry. This makes it correct and more symmetric with group add. Also
since *,G groups are not added to that hash we can hide a bug.

Fixes: 085b53c8beab ("net: bridge: mcast: add sg_port rhashtable")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: Add option for VLAN filter fail queue enable
Chuah, Kim Tatt [Fri, 25 Sep 2020 09:40:41 +0000 (17:40 +0800)]
net: stmmac: Add option for VLAN filter fail queue enable

Add option in plat_stmmacenet_data struct to enable VLAN Filter Fail
Queuing. This option allows packets that fail VLAN filter to be routed
to a specific Rx queue when Receive All is also set.

When this option is enabled:
- Enable VFFQ only when entering promiscuous mode, because Receive All
  will pass up all rx packets that failed address filtering (similar to
  promiscuous mode).
- VLAN-promiscuous mode is never entered to allow rx packet to fail VLAN
  filters and get routed to selected VFFQ Rx queue.

Reviewed-by: Voon Weifeng <weifeng.voon@intel.com>
Reviewed-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Chuah, Kim Tatt <kim.tatt.chuah@intel.com>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'Devlink-regions-for-SJA1105-DSA-driver'
David S. Miller [Fri, 25 Sep 2020 23:35:27 +0000 (16:35 -0700)]
Merge branch 'Devlink-regions-for-SJA1105-DSA-driver'

Vladimir Oltean says:

====================
Devlink regions for SJA1105 DSA driver

This series exposes the SJA1105 static config as a devlink region. This
can be used for debugging, for example with the sja1105_dump user space
program that I have derived from Andrew Lunn's mv88e6xxx_dump:

https://github.com/vladimiroltean/mv88e6xxx_dump/tree/sja1105

Changes in v2:
- Tear down devlink params on initialization failure.
- Add driver identification through devlink.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: sja1105: implement .devlink_info_get
Vladimir Oltean [Fri, 25 Sep 2020 23:04:21 +0000 (02:04 +0300)]
net: dsa: sja1105: implement .devlink_info_get

Return the driver name and ASIC ID so that generic user space
application are able to know they're looking at sja1105 devlink regions
when pretty-printing them.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: sja1105: expose static config as devlink region
Vladimir Oltean [Fri, 25 Sep 2020 23:04:20 +0000 (02:04 +0300)]
net: dsa: sja1105: expose static config as devlink region

As explained in Documentation/networking/dsa/sja1105.rst, this switch
has a static config held in the driver's memory and re-uploaded from
time to time into the device (after any major change).

The format of this static config is in fact described in UM10944.pdf and
it contains all the switch's settings (it also contains device ID, table
CRCs, etc, just like in the manual). So it is a useful and universal
devlink region to expose to user space, for debugging purposes.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: sja1105: move devlink param code to sja1105_devlink.c
Vladimir Oltean [Fri, 25 Sep 2020 23:04:19 +0000 (02:04 +0300)]
net: dsa: sja1105: move devlink param code to sja1105_devlink.c

We'll have more devlink code soon. Group it together in a separate
translation object.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'drivers-net-warning-clean'
David S. Miller [Fri, 25 Sep 2020 23:29:00 +0000 (16:29 -0700)]
Merge branch 'drivers-net-warning-clean'

Jesse Brandeburg says:

====================
make drivers/net/ethernet W=1 clean

The Goal: move to W=1 being default for drivers/net/ethernet, and
then use automation to catch more code issues (warnings) being
introduced.
The status: Getting much closer but not quite done for all
architectures.

After applying the patches below, the drivers/net/ethernet
directory can be built as modules with W=1 with no warnings (so
far on x64_64 arch only!). As Jakub pointed out, there is much
more work to do to clean up C=1, but that will be another series
of changes.

This series removes 1,247 warnings and hopefully allows the
ethernet directory to move forward from here without more
warnings being added. There is only one objtool warning now.

This version drops one of the Intel patches, as I couldn't
reproduce the original issue to document the warning.

Some of these patches are already sent and tested on Intel Wired
Lan, but the rest of the series titled drivers/net/ethernet
affects other drivers. The changes are all pretty
straightforward.
====================

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Saeed Mahameed <saeed@kernel.org>
5 years agodrivers/net/ethernet: clean up mis-targeted comments
Jesse Brandeburg [Fri, 25 Sep 2020 22:24:45 +0000 (15:24 -0700)]
drivers/net/ethernet: clean up mis-targeted comments

As part of the W=1 cleanups for ethernet, a million [*] driver
comments had to be cleaned up to get the W=1 compilation to
succeed. This change finally makes the drivers/net/ethernet tree
compile with W=1 set on the command line. NOTE: The kernel uses
kdoc style (see Documentation/process/kernel-doc.rst) when
documenting code, not doxygen or other styles.

After this patch the x86_64 build has no warnings from W=1, however
scripts/kernel-doc says there are 1545 more warnings in source files, that
I need to develop a script to fix in a followup patch.

The errors fixed here are all kdoc of a few classes, with a few outliers:
In file included from drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c:10:
drivers/net/ethernet/qlogic/netxen/netxen_nic.h:1193:18: warning: ‘FW_DUMP_LEVELS’ defined but not used [-Wunused-const-variable=]
 1193 | static const u32 FW_DUMP_LEVELS[] = { 0x3, 0x7, 0xf, 0x1f, 0x3f, 0x7f, 0xff };
      |                  ^~~~~~~~~~~~~~
... repeats 4 times...
drivers/net/ethernet/sun/cassini.c:2084:24: warning: suggest braces around empty body in an ‘else’ statement [-Wempty-body]
 2084 |    RX_USED_ADD(page, i);
drivers/net/ethernet/natsemi/ns83820.c: In function ‘phy_intr’:
drivers/net/ethernet/natsemi/ns83820.c:603:6: warning: variable ‘tbisr’ set but not used [-Wunused-but-set-variable]
  603 |  u32 tbisr, tanar, tanlpar;
      |      ^~~~~
drivers/net/ethernet/natsemi/ns83820.c: In function ‘ns83820_get_link_ksettings’:
drivers/net/ethernet/natsemi/ns83820.c:1207:11: warning: variable ‘tanar’ set but not used [-Wunused-but-set-variable]
 1207 |  u32 cfg, tanar, tbicr;
      |           ^~~~~
drivers/net/ethernet/packetengines/yellowfin.c:1063:18: warning: variable ‘yf_size’ set but not used [-Wunused-but-set-variable]
 1063 |   int data_size, yf_size;
      |                  ^~~~~~~

Normal kdoc fixes:
warning: Function parameter or member 'x' not described in 'y'
warning: Excess function parameter 'x' description in 'y'
warning: Cannot understand <string> on line <NNN> - I thought it was a doc line

[*] - ok it wasn't quite a million, but it felt like it.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agosfc: fix kdoc warning
Jesse Brandeburg [Fri, 25 Sep 2020 22:24:44 +0000 (15:24 -0700)]
sfc: fix kdoc warning

kernel-doc script as used by W=1, is confused by the macro
usage inside the header describing the efx_ptp_data struct.

drivers/net/ethernet/sfc/ptp.c:345: warning: Function parameter or member 'MC_CMD_PTP_IN_TRANSMIT_LENMAX' not described in 'efx_ptp_data'

After some discussion on the list, break this patch out to
a separate one, and fix the issue through a creative
macro declaration.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodrivers/net/ethernet: remove incorrectly formatted doc
Jesse Brandeburg [Fri, 25 Sep 2020 22:24:43 +0000 (15:24 -0700)]
drivers/net/ethernet: remove incorrectly formatted doc

As part of the W=1 series for ethernet, these drivers were
discovered to be using kdoc style comments but were not actually
doing kdoc. The kernel uses kdoc style when documenting code, not
doxygen or other styles.

Fixed Warnings:
drivers/net/ethernet/amazon/ena/ena_com.c:613: warning: Function parameter or member 'ena_dev' not described in 'ena_com_set_llq'
drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c:1540: warning: Cannot understand  * @brief Set VLAN filter table
drivers/net/ethernet/xilinx/ll_temac_main.c:114: warning: Function parameter or member 'lp' not described in 'temac_indirect_busywait'
drivers/net/ethernet/xilinx/ll_temac_main.c:129: warning: Function parameter or member 'lp' not described in 'temac_indirect_in32'
drivers/net/ethernet/xilinx/ll_temac_main.c:129: warning: Function parameter or member 'reg' not described in 'temac_indirect_in32'
drivers/net/ethernet/xilinx/ll_temac_main.c:147: warning: Function parameter or member 'lp' not described in 'temac_indirect_in32_locked'
drivers/net/ethernet/xilinx/ll_temac_main.c:147: warning: Function parameter or member 'reg' not described in 'temac_indirect_in32_locked'
drivers/net/ethernet/xilinx/ll_temac_main.c:172: warning: Function parameter or member 'lp' not described in 'temac_indirect_out32'
drivers/net/ethernet/xilinx/ll_temac_main.c:172: warning: Function parameter or member 'reg' not described in 'temac_indirect_out32'
drivers/net/ethernet/xilinx/ll_temac_main.c:172: warning: Function parameter or member 'value' not described in 'temac_indirect_out32'
drivers/net/ethernet/xilinx/ll_temac_main.c:188: warning: Function parameter or member 'lp' not described in 'temac_indirect_out32_locked'
drivers/net/ethernet/xilinx/ll_temac_main.c:188: warning: Function parameter or member 'reg' not described in 'temac_indirect_out32_locked'
drivers/net/ethernet/xilinx/ll_temac_main.c:188: warning: Function parameter or member 'value' not described in 'temac_indirect_out32_locked'
drivers/net/ethernet/xilinx/ll_temac_main.c:212: warning: Function parameter or member 'lp' not described in 'temac_dma_in32_be'
drivers/net/ethernet/xilinx/ll_temac_main.c:212: warning: Function parameter or member 'reg' not described in 'temac_dma_in32_be'
drivers/net/ethernet/xilinx/ll_temac_main.c:228: warning: Function parameter or member 'lp' not described in 'temac_dma_out32_be'
drivers/net/ethernet/xilinx/ll_temac_main.c:228: warning: Function parameter or member 'reg' not described in 'temac_dma_out32_be'
drivers/net/ethernet/xilinx/ll_temac_main.c:228: warning: Function parameter or member 'value' not described in 'temac_dma_out32_be'
drivers/net/ethernet/xilinx/ll_temac_main.c:247: warning: Function parameter or member 'lp' not described in 'temac_dma_dcr_in'
drivers/net/ethernet/xilinx/ll_temac_main.c:247: warning: Function parameter or member 'reg' not described in 'temac_dma_dcr_in'
drivers/net/ethernet/xilinx/ll_temac_main.c:255: warning: Function parameter or member 'lp' not described in 'temac_dma_dcr_out'
drivers/net/ethernet/xilinx/ll_temac_main.c:255: warning: Function parameter or member 'reg' not described in 'temac_dma_dcr_out'
drivers/net/ethernet/xilinx/ll_temac_main.c:255: warning: Function parameter or member 'value' not described in 'temac_dma_dcr_out'
drivers/net/ethernet/xilinx/ll_temac_main.c:265: warning: Function parameter or member 'lp' not described in 'temac_dcr_setup'
drivers/net/ethernet/xilinx/ll_temac_main.c:265: warning: Function parameter or member 'op' not described in 'temac_dcr_setup'
drivers/net/ethernet/xilinx/ll_temac_main.c:265: warning: Function parameter or member 'np' not described in 'temac_dcr_setup'
drivers/net/ethernet/xilinx/ll_temac_main.c:300: warning: Function parameter or member 'ndev' not described in 'temac_dma_bd_release'
drivers/net/ethernet/xilinx/ll_temac_main.c:330: warning: Function parameter or member 'ndev' not described in 'temac_dma_bd_init'
drivers/net/ethernet/xilinx/ll_temac_main.c:600: warning: Function parameter or member 'ndev' not described in 'temac_setoptions'
drivers/net/ethernet/xilinx/ll_temac_main.c:600: warning: Function parameter or member 'options' not described in 'temac_setoptions'

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>