]> www.infradead.org Git - users/jedix/linux-maple.git/log
users/jedix/linux-maple.git
12 months agowifi: ath12k: Dump additional Tx PDEV HTT stats
Ramya Gnanasekar [Thu, 27 Jun 2024 16:37:02 +0000 (19:37 +0300)]
wifi: ath12k: Dump additional Tx PDEV HTT stats

Support to dump additional Tx PDEV stats through HTT stats debugfs.
Following stats dump are supported:
        1. PDEV control path stat to dump Tx management frame count
        2. Tx PDEV SIFS histogram stats
        3. Tx MU MIMO PPDU stats for 802.11ac, 802.11ax and 802.11be

Sample Output:
---------------
echo 1 > /sys/kernel/debug/ath12k/pci-0000\:06\:00.0/mac0/htt_stats_type
cat /sys/kernel/debug/ath12k/pci-0000\:06\:00.0/mac0/htt_stats

HTT_TX_PDEV_STATS_CMN_TLV:
mac_id = 0
comp_delivered = 0
self_triggers = 13
......
......
HTT_TX_PDEV_STATS_CTRL_PATH_TX_STATS:
fw_tx_mgmt_subtype =  0:1, 1:0, 2:0, 3:0, 4:38, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:1, 12:0, 13:7, 14:0, 15:0

HTT_TX_PDEV_STATS_SIFS_HIST_TLV:
sifs_hist_status =  0:237, 1:185, 2:1, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0

HTT_TX_PDEV_AC_MU_PPDU_DISTRIBUTION_STATS:
ac_mu_mimo_num_seq_posted_nr4 = 0
ac_mu_mimo_num_ppdu_posted_per_burst_nr4 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0
ac_mu_mimo_num_ppdu_completed_per_burst_nr4 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0
ac_mu_mimo_num_seq_term_status_nr4 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0

ac_mu_mimo_num_seq_posted_nr8 = 0
ac_mu_mimo_num_ppdu_posted_per_burst_nr8 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0
ac_mu_mimo_num_ppdu_completed_per_burst_nr8 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0
ac_mu_mimo_num_seq_term_status_nr8 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0

HTT_TX_PDEV_AX_MU_PPDU_DISTRIBUTION_STATS:
ax_mu_mimo_num_seq_posted_nr4 = 0
ax_mu_mimo_num_ppdu_posted_per_burst_nr4 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0
ax_mu_mimo_num_ppdu_completed_per_burst_nr4 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0
ax_mu_mimo_num_seq_term_status_nr4 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0

ax_mu_mimo_num_seq_posted_nr8 = 0
ax_mu_mimo_num_ppdu_posted_per_burst_nr8 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0
ax_mu_mimo_num_ppdu_completed_per_burst_nr8 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0
ax_mu_mimo_num_seq_term_status_nr8 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0

HTT_TX_PDEV_BE_MU_PPDU_DISTRIBUTION_STATS:
be_mu_mimo_num_seq_posted_nr4 = 0
be_mu_mimo_num_ppdu_posted_per_burst_nr4 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0
be_mu_mimo_num_ppdu_completed_per_burst_nr4 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0
be_mu_mimo_num_seq_term_status_nr4 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0

be_mu_mimo_num_seq_posted_nr8 = 0
be_mu_mimo_num_ppdu_posted_per_burst_nr8 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0
be_mu_mimo_num_ppdu_completed_per_burst_nr8 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0
be_mu_mimo_num_seq_term_status_nr8 =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3

Signed-off-by: Ramya Gnanasekar <quic_rgnanase@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240626085854.2500681-5-quic_rgnanase@quicinc.com
12 months agowifi: ath12k: Add support to parse requested stats_type
Dinesh Karthikeyan [Thu, 27 Jun 2024 16:36:56 +0000 (19:36 +0300)]
wifi: ath12k: Add support to parse requested stats_type

Add extended htt stats parser and print the corresponding TLVs associated
with the requested htt_stats_type.
Add support for TX PDEV related htt stats.

Sample output:
--------------
echo 1 > /sys/kernel/debug/ath12k/pci-0000\:06\:00.0/mac0/htt_stats_type
cat /sys/kernel/debug/ath12k/pci-0000\:06\:00.0/mac0/htt_stats

HTT_TX_PDEV_STATS_CMN_TLV:
mac_id = 0
comp_delivered = 0
self_triggers = 256
hw_queued = 275
hw_reaped = 275
underrun = 241
hw_paused = 0
hw_flush = 0
hw_filt = 1
tx_abort = 0
ppdu_ok = 246
mpdu_requeued = 0
tx_xretry = 0
data_rc = 3
mpdu_dropped_xretry = 0
illegal_rate_phy_err = 0
cont_xretry = 0
tx_timeout = 0
tx_time_dur_data = 0
pdev_resets = 0
phy_underrun = 0
txop_ovf = 0
seq_posted = 247
seq_failed_queueing = 0
seq_completed = 247
seq_restarted = 0
seq_txop_repost_stop = 0
next_seq_cancel = 0
dl_mu_mimo_seq_posted = 0
dl_mu_ofdma_seq_posted = 0
ul_mu_mimo_seq_posted = 0
ul_mu_ofdma_seq_posted = 0
mu_mimo_peer_blacklisted = 0
seq_qdepth_repost_stop = 0
seq_min_msdu_repost_stop = 0
mu_seq_min_msdu_repost_stop = 0
seq_switch_hw_paused = 0
next_seq_posted_dsr = 0
seq_posted_isr = 0
seq_ctrl_cached = 0
mpdu_count_tqm = 0
msdu_count_tqm = 0
mpdu_removed_tqm = 0
msdu_removed_tqm = 0
remove_mpdus_max_retries = 0
mpdus_sw_flush = 0
mpdus_hw_filter = 0
mpdus_truncated = 0
mpdus_ack_failed = 0
mpdus_expired = 0
mpdus_seq_hw_retry = 0
ack_tlv_proc = 0
coex_abort_mpdu_cnt_valid = 0
coex_abort_mpdu_cnt = 5
num_total_ppdus_tried_ota = 5
num_data_ppdus_tried_ota = 0
local_ctrl_mgmt_enqued = 247
local_ctrl_mgmt_freed = 247
local_data_enqued = 0
local_data_freed = 0
mpdu_tried = 0
isr_wait_seq_posted = 0
tx_active_dur_us_low = 0
tx_active_dur_us_high = 0
fes_offsets_err_cnt = 0

HTT_TX_PDEV_STATS_URRN_TLV:
urrn_stats =  0:0, 1:241, 2:0

HTT_TX_PDEV_STATS_SIFS_TLV:
sifs_status =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0

HTT_TX_PDEV_STATS_FLUSH_TLV:
flush_errs =  0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0,
11:0, 12:0, 13:0, 14:0, 15:0, 16:0, 17:0, 18:0, 19:0, 20:0, 21:0, 22:0,
23:0, 24:0, 25:0, 26:0, 27:0, 28:0, 29:0, 30:0, 31:0, 32:0, 33:0, 34:0,
35:0, 36:0, 37:0, 38:0, 39:0, 40:0, 41:0, 42:0, 43:0, 44:0, 45:0, 46:0,
47:0, 48:0, 49:0, 50:0, 51:0, 52:0, 53:0, 54:0, 55:0, 56:0, 57:0, 58:0,
59:0, 60:0, 61:0, 62:0, 63:0, 64:0, 65:0, 66:0, 67:0, 68:0, 69:0, 70:0,
71:0, 72:0, 73:0, 74:0, 75:0, 76:0, 77:0, 78:0, 79:0, 80:0, 81:0, 82:0,
83:0, 84:0, 85:0, 86:0, 87:0, 88:0, 89:0, 90:0, 91:0, 92:0, 93:0, 94:0,
95:0, 96:0, 97:0, 98:0, 99:0, 100:0, 101:0, 102:0, 103:0, 104:0, 105:0,
106:0, 107:0, 108:0, 109:0, 110:0, 111:0, 112:0, 113:0, 114:0, 115:0,
116:0, 117:0, 118:0, 119:0, 120:0, 121:0, 122:0, 123:0, 124:0, 125:0,
126:0, 127:0

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3

Signed-off-by: Dinesh Karthikeyan <quic_dinek@quicinc.com>
Co-developed-by: Ramya Gnanasekar <quic_rgnanase@quicinc.com>
Signed-off-by: Ramya Gnanasekar <quic_rgnanase@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240626085854.2500681-4-quic_rgnanase@quicinc.com
12 months agowifi: ath12k: Add htt_stats_dump file ops support
Dinesh Karthikeyan [Thu, 27 Jun 2024 16:36:50 +0000 (19:36 +0300)]
wifi: ath12k: Add htt_stats_dump file ops support

Add dump_htt_stats file operation to dump the stats value requested
for the requested stats_type.
Stats sent from firmware will be cumulative. Hence add debugfs to reset
the requested stats type.

Example with one ath12k device:

ath12k
`-- pci-0000:06:00.0
    |-- mac0
        `-- htt_stats
        |-- htt_stats_type
        |-- htt_stats_reset

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3

Signed-off-by: Dinesh Karthikeyan <quic_dinek@quicinc.com>
Co-developed-by: Ramya Gnanasekar <quic_rgnanase@quicinc.com>
Signed-off-by: Ramya Gnanasekar <quic_rgnanase@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240626085854.2500681-3-quic_rgnanase@quicinc.com
12 months agowifi: ath12k: Add support to enable debugfs_htt_stats
Dinesh Karthikeyan [Wed, 26 Jun 2024 08:58:51 +0000 (14:28 +0530)]
wifi: ath12k: Add support to enable debugfs_htt_stats

Create debugfs_htt_stats file when ath12k debugfs support is enabled.
Add basic ath12k_debugfs_htt_stats_register and handle htt_stats_type
file operations.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3

Signed-off-by: Dinesh Karthikeyan <quic_dinek@quicinc.com>
Co-developed-by: Ramya Gnanasekar <quic_rgnanase@quicinc.com>
Signed-off-by: Ramya Gnanasekar <quic_rgnanase@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240626085854.2500681-2-quic_rgnanase@quicinc.com
12 months agowifi: ath12k: fix driver initialization for WoW unsupported devices
Rameshkumar Sundaram [Wed, 26 Jun 2024 02:42:57 +0000 (08:12 +0530)]
wifi: ath12k: fix driver initialization for WoW unsupported devices

Commit 4a3c212eee0e ("wifi: ath12k: add basic WoW functionalities") broke
driver initialization, now mac registration is allowed only for devices that
advertise WMI_TLV_SERVICE_WOW, but QCN9274 doesn't support WoW and hence mac
registration is aborted and driver is de-initialized.

Allow mac registration to proceed without WoW Support for devices
that don't advertise WMI_TLV_SERVICE_WOW.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Fixes: 4a3c212eee0e ("wifi: ath12k: add basic WoW functionalities")
Signed-off-by: Rameshkumar Sundaram <quic_ramess@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240626024257.709460-1-quic_ramess@quicinc.com
12 months agowifi: ath12k: fix peer metadata parsing
Karthikeyan Periyasamy [Mon, 24 Jun 2024 14:54:18 +0000 (20:24 +0530)]
wifi: ath12k: fix peer metadata parsing

Currently, the Rx data path only supports parsing peer metadata of version
zero. However, the QCN9274 platform configures the peer metadata version
as V1B. When V1B peer metadata is parsed using the version zero logic,
invalid data is populated, causing valid packets to be dropped. To address
this issue, refactor the peer metadata version and add the version based
parsing to populate the data from peer metadata correctly.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Fixes: 287033810990 ("wifi: ath12k: add support for peer meta data version")
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240624145418.2043461-1-quic_periyasa@quicinc.com
12 months agowifi: ath12k: Fix pdev id sent to firmware for single phy devices
Lingbo Kong [Fri, 21 Jun 2024 10:28:09 +0000 (15:58 +0530)]
wifi: ath12k: Fix pdev id sent to firmware for single phy devices

Pdev id from mac phy capabilities will be sent as a part of
HTT/WMI command to firmware. This causes issue with single pdev
devices where firmware does not respond to the WMI/HTT request
sent from host.

For single pdev devices firmware expects pdev id as 1 for 5 GHz/6 GHz
phy and 2 for 2 GHz band. Add wrapper ath12k_mac_get_target_pdev_id()
to help fetch right pdev for single pdev devices.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3

Signed-off-by: Lingbo Kong <quic_lingbok@quicinc.com>
Signed-off-by: Ramya Gnanasekar <quic_rgnanase@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240621102809.3984004-1-quic_rgnanase@quicinc.com
12 months agowifi: ath12k: handle keepalive during WoWLAN suspend and resume
Baochen Qiang [Wed, 19 Jun 2024 14:21:12 +0000 (17:21 +0300)]
wifi: ath12k: handle keepalive during WoWLAN suspend and resume

With WoWLAN enabled and after sleeping for a rather long time,
we are seeing that with some APs, it is not able to wake up
the STA though the correct wake up pattern has been configured.
This is because the host doesn't send keepalive command to
firmware, thus firmware will not send any packet to the AP and
after a specific time the AP kicks out the STA.

So enable keepalive before going to suspend and disable it after
resume back.

Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0-03427-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.15378.4

Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240604055407.12506-9-quic_bqiang@quicinc.com
12 months agowifi: ath12k: support GTK rekey offload
Baochen Qiang [Wed, 19 Jun 2024 14:21:12 +0000 (17:21 +0300)]
wifi: ath12k: support GTK rekey offload

Host sets GTK related info to firmware before WoW is enabled, and
gets rekey replay_count and then disables GTK rekey when WoW quits.

Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0-03427-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.15378.4

Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240604055407.12506-8-quic_bqiang@quicinc.com
12 months agowifi: ath12k: support ARP and NS offload
Baochen Qiang [Wed, 19 Jun 2024 14:21:12 +0000 (17:21 +0300)]
wifi: ath12k: support ARP and NS offload

Support ARP and NS offload in WoW state.

Tested this way: put machine A with QCA6390 to WoW state,
ping/ping6 machine A from another machine B, check sniffer to see
any ARP response and Neighbor Advertisement from machine A.

Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0-03427-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.15378.4

Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240604055407.12506-7-quic_bqiang@quicinc.com
12 months agowifi: ath12k: implement hardware data filter
Baochen Qiang [Wed, 19 Jun 2024 14:21:12 +0000 (17:21 +0300)]
wifi: ath12k: implement hardware data filter

Host needs to set hardware data filter before entering WoW to
let firmware drop needless broadcast/multicast frames to avoid
frequent wakeup. Host clears hardware data filter when leaving WoW.

Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0-03427-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.15378.4

Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240604055407.12506-6-quic_bqiang@quicinc.com
12 months agowifi: ath12k: add WoW net-detect functionality
Baochen Qiang [Tue, 4 Jun 2024 05:54:03 +0000 (13:54 +0800)]
wifi: ath12k: add WoW net-detect functionality

Implement net-detect feature by setting flag
WIPHY_WOWLAN_NET_DETECT if firmware supports this
feature. Driver sets the related PNO configuration
to firmware before entering WoW and firmware then
scans periodically and wakes up host if a specific
SSID is found.

Note that firmware crashes if we enable it for both
P2P vdev and station vdev simultaneously because
firmware can only support one vdev at a time. Since
there is rare scenario for a P2P vdev to do net-detect,
skip it for P2P vdevs.

Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0-03427-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.15378.4

Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240604055407.12506-5-quic_bqiang@quicinc.com
12 months agowifi: ath12k: add basic WoW functionalities
Baochen Qiang [Wed, 19 Jun 2024 14:21:11 +0000 (17:21 +0300)]
wifi: ath12k: add basic WoW functionalities

Implement basic WoW functionalities such as magic-packet, disconnect
and pattern. The logic is very similar to ath11k.

When WoW is configured, ath12k_core_suspend and ath12k_core_resume
are skipped (by checking ar->state) as we are not allowed to power
cycle firmware.

Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0-03427-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.15378.4

Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240604055407.12506-4-quic_bqiang@quicinc.com
12 months agowifi: ath12k: implement WoW enable and wakeup commands
Baochen Qiang [Wed, 19 Jun 2024 14:21:11 +0000 (17:21 +0300)]
wifi: ath12k: implement WoW enable and wakeup commands

Implement WoW enable and WoW wakeup commands which are needed
for suspend/resume.

Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0-03427-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.15378.4

Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240604055407.12506-3-quic_bqiang@quicinc.com
12 months agowifi: ath12k: add ATH12K_DBG_WOW log level
Baochen Qiang [Wed, 19 Jun 2024 14:21:11 +0000 (17:21 +0300)]
wifi: ath12k: add ATH12K_DBG_WOW log level

Currently there is no dedicated log level for WoW, so create it for future use.

Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0-03427-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.15378.4

Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240604055407.12506-2-quic_bqiang@quicinc.com
12 months agowifi: ath12k: fix mbssid max interface advertisement
Karthikeyan Periyasamy [Thu, 13 Jun 2024 15:38:13 +0000 (21:08 +0530)]
wifi: ath12k: fix mbssid max interface advertisement

The Current method for advertising the maximum MBSSID interface count
assumes single radio per wiphy (multi wiphy model). However, this
assumption is incorrect for multi radio per wiphy (single wiphy model).
Therefore, populate the parameter for each radio present in the MAC
abstraction layer (ah). This approach ensure scalability for both single
wiphy and multi wiphy models.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Fixes: 519a545cfee7 ("wifi: ath12k: advertise driver capabilities for MBSSID and EMA")
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240613153813.3509837-1-quic_periyasa@quicinc.com
12 months agowifi: ath12k: fix firmware crash due to invalid peer nss
Ajith C [Thu, 13 Jun 2024 05:35:28 +0000 (11:05 +0530)]
wifi: ath12k: fix firmware crash due to invalid peer nss

Currently, if the access point receives an association
request containing an Extended HE Capabilities Information
Element with an invalid MCS-NSS, it triggers a firmware
crash.

This issue arises when EHT-PHY capabilities shows support
for a bandwidth and MCS-NSS set for that particular
bandwidth is filled by zeros and due to this, driver obtains
peer_nss as 0 and sending this value to firmware causes
crash.

Address this issue by implementing a validation step for
the peer_nss value before passing it to the firmware. If
the value is greater than zero, proceed with forwarding
it to the firmware. However, if the value is invalid,
reject the association request to prevent potential
firmware crashes.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Signed-off-by: Ajith C <quic_ajithc@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240613053528.2541645-1-quic_ajithc@quicinc.com
12 months agowifi: ath12k: fix legacy peer association due to missing HT or 6 GHz capabilities
Pradeep Kumar Chitrapu [Wed, 12 Jun 2024 22:53:36 +0000 (15:53 -0700)]
wifi: ath12k: fix legacy peer association due to missing HT or 6 GHz capabilities

Currently SMPS configuration failed when the Information
Elements (IEs) did not contain HT or 6 GHz capabilities. This
caused legacy peer association to fail as legacy peers do not
have HT or 6 GHz capabilities. Fix this by not returning an
error when SMPS configuration fails due to the absence of HT
or 6 GHz capabilities.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Fixes: f0e61dc7ecf9 ("wifi: ath12k: refactor SMPS configuration")
Reported-by: Aditya Kumar Singh <quic_adisi@quicinc.com>
Reported-by: Zachary Smith <dr.z.smith@gmail.com>
Closes: https://lore.kernel.org/all/CAM=znoFPcXrn5GhDmDmo50Syic3-hXpWvD+vkv8KX5o_ZTo8kQ@mail.gmail.com/
Signed-off-by: Pradeep Kumar Chitrapu <quic_pradeepc@quicinc.com>
Reported-by: Aditya Kumar Singh <quic_adisi@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240612225336.2303119-1-quic_pradeepc@quicinc.com
12 months agowifi: ath12k: fix uninitialize symbol error on ath12k_peer_assoc_h_he()
Aaradhana Sahu [Tue, 11 Jun 2024 03:10:17 +0000 (08:40 +0530)]
wifi: ath12k: fix uninitialize symbol error on ath12k_peer_assoc_h_he()

Smatch throws following errors

drivers/net/wireless/ath/ath12k/mac.c:1922 ath12k_peer_assoc_h_he() error: uninitialized symbol 'rx_mcs_80'.
drivers/net/wireless/ath/ath12k/mac.c:1922 ath12k_peer_assoc_h_he() error: uninitialized symbol 'rx_mcs_160'.
drivers/net/wireless/ath/ath12k/mac.c:1924 ath12k_peer_assoc_h_he() error: uninitialized symbol 'rx_mcs_80'.

In ath12k_peer_assoc_h_he() rx_mcs_80 and rx_mcs_160 variables
remain uninitialized in the following conditions:
1. Whenever the value of mcs_80 become equal to
   IEEE80211_HE_MCS_NOT_SUPPORTED then rx_mcs_80 remains uninitialized.
2. Whenever phy capability is not supported 160 channel width and
   value of mcs_160 become equal to IEEE80211_HE_MCS_NOT_SUPPORTED
   then rx_mcs_160 remains uninitialized.

Initialize these variables during declaration.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.1.1-00188-QCAHKSWPL_SILICONZ-1

Signed-off-by: Aaradhana Sahu <quic_aarasahu@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240611031017.297927-3-quic_aarasahu@quicinc.com
12 months agowifi: ath12k: fix NULL pointer access in ath12k_mac_op_get_survey()
Aaradhana Sahu [Tue, 11 Jun 2024 03:10:16 +0000 (08:40 +0530)]
wifi: ath12k: fix NULL pointer access in ath12k_mac_op_get_survey()

Smatch throws below error

drivers/net/wireless/ath/ath12k/mac.c:8318 ath12k_mac_op_get_survey() error: we previously assumed 'sband' could be null

Currently, we access sband inside the null check of the sband
in ath12k_mac_op_get_survey().

Fix this issue by removing the entire if block, because decrement
idx is unnecessary since there are no more band to test.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Fixes: 70e3be54bbdd ("wifi: ath12k: fix survey dump collection in 6 GHz")
Signed-off-by: Aaradhana Sahu <quic_aarasahu@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240611031017.297927-2-quic_aarasahu@quicinc.com
12 months agowifi: ath11k: modify the calculation of the average signal strength in station mode
Lingbo Kong [Sat, 9 Mar 2024 12:11:29 +0000 (20:11 +0800)]
wifi: ath11k: modify the calculation of the average signal strength in station mode

Currently, the calculation of the average signal strength in station mode
is incorrect.

This is because before calculating the average signal strength, ath11k need
to determine whether the hardware and firmware support db2dbm, if the
hardware and firmware support db2dbm, do not need to add noise floor,
otherwise, need to add noise floor.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.23

Signed-off-by: Lingbo Kong <quic_lingbok@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240309121129.5379-1-quic_lingbok@quicinc.com
12 months agowifi: ath11k: fix ack signal strength calculation
Lingbo Kong [Tue, 11 Jun 2024 02:25:50 +0000 (10:25 +0800)]
wifi: ath11k: fix ack signal strength calculation

Currently, the calculation of ack signal strength is incorrect.

This is because before calculating the ack signal strength, ath11k need
to determine whether the hardware and firmware support db2dbm. If the
hardware and firmware support db2dbm, do not need to add noise floor,
otherwise, need to add noise floor.

Besides, the value of ack_rssi passed by firmware to ath11k should be a
signed number, so change its type to s8.

After that, "iw wlan0 station dump" show the correct ack signal strength.

Such as:
root@CDCCSTEX0799733-LIN:~# iw wlp88s0 station dump
Station 00:03:7f:12:df:df (on wlp88s0)
        inactive time:  75 ms
        rx bytes:       11599
        rx packets:     99
        tx bytes:       9029
        tx packets:     81
        tx retries:     4
        tx failed:      0
        rx drop misc:   2
        signal:         -16 dBm
        signal avg:     -24 dBm
        tx bitrate:     1560.0 MBit/s VHT-MCS 9 80MHz VHT-NSS 4
        tx duration:    9230 us
        rx bitrate:     1560.0 MBit/s VHT-MCS 9 80MHz VHT-NSS 4
        rx duration:    7201 us
        last ack signal:-23 dBm
        avg ack signal: -22 dBm

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1

Signed-off-by: Lingbo Kong <quic_lingbok@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240611022550.59078-1-quic_lingbok@quicinc.com
13 months agowifi: ath12k: Remove unused ath12k_base from ath12k_hw
Harshitha Prem [Mon, 17 Jun 2024 11:12:47 +0000 (14:12 +0300)]
wifi: ath12k: Remove unused ath12k_base from ath12k_hw

Currently, device (ab) reference in hardware abstraction (ah)
is not used anywhere. Also, with multiple device group abstraction,
hardware abstraction would be coupled with device group abstraction
rather than single device.

Hence, remove the ab reference from hardware abstraction.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Signed-off-by: Harshitha Prem <quic_hprem@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240529060939.4156281-1-quic_hprem@quicinc.com
13 months agowifi: ath12k: Fix WARN_ON during firmware crash in split-phy
Aaradhana Sahu [Mon, 17 Jun 2024 11:12:47 +0000 (14:12 +0300)]
wifi: ath12k: Fix WARN_ON during firmware crash in split-phy

Whenever firmware is crashed in split-phy below WARN_ON() triggered:

WARNING: CPU: 3 PID: 82 at net/mac80211/driver-ops.c:41 drv_stop+0xac/0xbc
Modules linked in: ath12k qmi_helpers
CPU: 3 PID: 82 Comm: kworker/3:2 Tainted: G      D W          6.9.0-next-20240520-00113-gd981a3784e15 #39
Hardware name: Qualcomm Technologies, Inc. IPQ9574/AP-AL02-C9 (DT)
Workqueue: events_freezable ieee80211_restart_work
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : drv_stop+0xac/0xbc
lr : ieee80211_stop_device+0x54/0x64
sp : ffff8000848dbb20
x29: ffff8000848dbb20 x28: 0000000000000790 x27: ffff000014d78900
x26: ffff000014d791f8 x25: ffff000007f0d9b0 x24: 0000000000000018
x23: 0000000000000001 x22: 0000000000000000 x21: ffff000014d78e10
x20: ffff800081dc0000 x19: ffff000014d78900 x18: ffffffffffffffff
x17: ffff7fffbca84000 x16: ffff800083fe0000 x15: ffff800081dc0b48
x14: 0000000000000076 x13: 0000000000000076 x12: 0000000000000001
x11: 0000000000000000 x10: 0000000000000a60 x9 : ffff8000848db980
x8 : ffff000000dddfc0 x7 : 0000000000000400 x6 : ffff800083b012d8
x5 : ffff800083b012d8 x4 : 0000000000000000 x3 : ffff000014d78398
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000014d78900
Call trace:
 drv_stop+0xac/0xbc
 ieee80211_stop_device+0x54/0x64
 ieee80211_do_stop+0x5a0/0x790
 ieee80211_stop+0x4c/0x178
 __dev_close_many+0xb0/0x150
 dev_close_many+0x88/0x130
 dev_close.part.171+0x44/0x74
 dev_close+0x1c/0x28
 cfg80211_shutdown_all_interfaces+0x44/0xfc
 ieee80211_restart_work+0xfc/0x14c
 process_scheduled_works+0x18c/0x2dc
 worker_thread+0x13c/0x314
 kthread+0x118/0x124
 ret_from_fork+0x10/0x20
---[ end trace 0000000000000000 ]---

The warning in question is from drv_stop():

if (WARN_ON(!local->started))
return;

The sequence of WARN_ON() is:
Thread 1:
-Firmware crash calls ath12k_core_reset().
-Call ieee80211_restart_hw() inside
 ath12k_core_post_reconfigure_recovery() which schedules worker
 for both hardware.
-Wait for completion of ab->recovery_start.

Thread 2 (worker thread):
-One hardware acquires rtnl_lock() inside ieee80211_restart_hw() and
 calls ath12k_mac_wait_reconfigure() into ath12k_mac_op_start().
-Hardware is waiting for ab->reconfigure_complete but at this time
 recovery_start_count value is 1 because another worker thread
 (local->restart_work) is still waiting for rtnl_lock().
 recovery_start_count is not equal to number of radios
 (2 in split-phy). So ab->recovery_start complete does not set
 due to this, thread 1 is still waiting and not able to perform
 hif power down up and firmware reload.
-Wait timeout happens for ab->reconfigure_complete and comeback
 to caller (ath12k_mac_op_start()) and sends WMI command to
 crashed firmware and gets error.
-This returns error to drv_start() and local->started is set to false.
-Hardware calls cfg80211_shutdown_all_interfaces() after receiving error
 inside ieee80211_restart_work() and goes to drv_stop(), here we trigger
 WARN_ON as local->started is false.

To fix this issue call ieee80211_restart_hw() after firmware has been
reloaded. Now, each hardware can send WMI command to firmware
successfully. With this fix we don't need to wait for
ab->recovery_start completion so remove
ath12k_mac_wait_reconfigure().

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.1.1-00209-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 HW2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3

Signed-off-by: Aaradhana Sahu <quic_aarasahu@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240529034405.2863150-1-quic_aarasahu@quicinc.com
13 months agodt-bindings: net: wireless: describe the ath12k PCI module
Bartosz Golaszewski [Wed, 5 Jun 2024 12:21:05 +0000 (14:21 +0200)]
dt-bindings: net: wireless: describe the ath12k PCI module

Add device-tree bindings for the ATH12K module found in the WCN7850
package.

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240605122106.23818-3-brgl@bgdev.pl
13 months agodt-bindings: net: wireless: qcom,ath11k: describe the ath11k on QCA6390
Bartosz Golaszewski [Wed, 5 Jun 2024 12:21:04 +0000 (14:21 +0200)]
dt-bindings: net: wireless: qcom,ath11k: describe the ath11k on QCA6390

Add a PCI compatible for the ATH11K module on QCA6390 and describe the
power inputs from the PMU that it consumes.

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240605122106.23818-2-brgl@bgdev.pl
13 months agowifi: ath12k: handle symlink cleanup for per pdev debugfs dentry
Aditya Kumar Singh [Tue, 11 Jun 2024 06:42:33 +0000 (09:42 +0300)]
wifi: ath12k: handle symlink cleanup for per pdev debugfs dentry

Whenever per pdev debugfs directory is created, a symlink to it is also
placed in ieee80211/phy* directory. During clean up of per pdev debugfs,
this symlink also needs to be cleaned up.

Add changes to clean up the symlink whenever the per pdev debugfs is
removed.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Signed-off-by: Aditya Kumar Singh <quic_adisi@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240529043043.2488031-4-quic_adisi@quicinc.com
13 months agowifi: ath12k: unregister per pdev debugfs
Aditya Kumar Singh [Tue, 11 Jun 2024 06:42:32 +0000 (09:42 +0300)]
wifi: ath12k: unregister per pdev debugfs

During normal de-initialization path or if any error happens while
registering the hardware, there is no support to unregister the per pdev
debugfs. Add support for the same.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Signed-off-by: Aditya Kumar Singh <quic_adisi@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240529043043.2488031-3-quic_adisi@quicinc.com
13 months agowifi: ath12k: fix per pdev debugfs registration
Aditya Kumar Singh [Tue, 11 Jun 2024 06:42:32 +0000 (09:42 +0300)]
wifi: ath12k: fix per pdev debugfs registration

Function ath12k_debugfs_register() is called once inside the function
ath12k_mac_hw_register(). However, with single wiphy model, there could
be multiple pdevs registered under single hardware. Hence, need to register
debugfs for each one of them.

Move the caller inside the loop which iterates over all underlying pdevs.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Fixes: f8bde02a26b9 ("wifi: ath12k: initial debugfs support")
Signed-off-by: Aditya Kumar Singh <quic_adisi@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240529043043.2488031-2-quic_adisi@quicinc.com
13 months agowifi: ath11k: fix wrong handling of CCMP256 and GCMP ciphers
Baochen Qiang [Tue, 11 Jun 2024 06:42:34 +0000 (09:42 +0300)]
wifi: ath11k: fix wrong handling of CCMP256 and GCMP ciphers

Currently for CCMP256, GCMP128 and GCMP256 ciphers, in ath11k_install_key()
IEEE80211_KEY_FLAG_GENERATE_IV_MGMT is not set. And in ath11k_mac_mgmt_tx_wmi()
a length of IEEE80211_CCMP_MIC_LEN is reserved for all ciphers.

This results in unexpected management frame drop in case either of above 3 ciphers
is used. The reason is, without IEEE80211_KEY_FLAG_GENERATE_IV_MGMT set, mac80211
will not generate CCMP/GCMP headers in frame for ath11k. Also MIC length reserved
is wrong. Such frame is dropped later by hardware:

ath11k_pci 0000:5a:00.0: mac tx mgmt frame, buf id 0
ath11k_pci 0000:5a:00.0: mgmt tx compl ev pdev_id 1, desc_id 0, status 1

From user point of view, we have observed very low throughput due to this issue:
action frames are all dropped so ADDBA response from DUT never reaches AP. AP
can not use aggregation thus throughput is low.

Fix this by setting IEEE80211_KEY_FLAG_GENERATE_IV_MGMT flag and by reserving proper
MIC length for those ciphers.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1

Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
Reported-by: Yaroslav Isakov <yaroslav.isakov@gmail.com>
Tested-by: Yaroslav Isakov <yaroslav.isakov@gmail.com>
Closes: https://lore.kernel.org/all/CADS+iDX5=JtJr0apAtAQ02WWBxgOFEv8G063vuGYwDTC8AVZaw@mail.gmail.com
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240605014826.22498-1-quic_bqiang@quicinc.com
13 months agowifi: ath12k: avoid unnecessary MSDU drop in the Rx error process
Karthikeyan Periyasamy [Tue, 11 Jun 2024 06:42:33 +0000 (09:42 +0300)]
wifi: ath12k: avoid unnecessary MSDU drop in the Rx error process

Currently, in the Rx error processing handler, once an MSDU drop is
detected, the subsequent MSDUs get unintentionally dropped due to the
previous drop flag being retained across all MSDU processing, leading
to the discarding of valid MSDUs. To resolve this issue, the drop flag
should be reset to false before processing each descriptor.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240604062641.2956288-1-quic_periyasa@quicinc.com
13 months agowifi: ath11k: use 'time_left' variable with wait_event_timeout()
Wolfram Sang [Tue, 11 Jun 2024 06:42:33 +0000 (09:42 +0300)]
wifi: ath11k: use 'time_left' variable with wait_event_timeout()

There is a confusing pattern in the kernel to use a variable named 'timeout' to
store the result of wait_event_timeout() causing patterns like:

timeout = wait_event_timeout(...)
if (!timeout) return -ETIMEDOUT;

with all kinds of permutations. Use 'time_left' as a variable to make the code
self explaining.

Fix to the proper variable type 'long' while here.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240603091541.8367-2-wsa+renesas@sang-engineering.com
13 months agowifi: ath11k: fix RCU documentation in ath11k_mac_op_ipv6_changed()
Baochen Qiang [Tue, 11 Jun 2024 06:42:33 +0000 (09:42 +0300)]
wifi: ath11k: fix RCU documentation in ath11k_mac_op_ipv6_changed()

Current documentation on RCU in ath11k_mac_op_ipv6_changed() says:

/* Note: read_lock_bh() calls rcu_read_lock() */
read_lock_bh(&idev->lock);

This is wrong because without enabling CONFIG_PREEMPT_RT
rcu_read_lock() is not called by read_lock_bh(). The reason
why current code works even in a CONFIG_PREEMPT_RT=n kernel
is because atomic_notifier_call_chain() already does that for
us, see:

int atomic_notifier_call_chain()
{
...
rcu_read_lock();
ret = notifier_call_chain(&nh->head, val, v, -1, NULL);
rcu_read_unlock();
...
}

and backtrace:

ath11k_mac_op_ipv6_changed
ieee80211_ifa6_changed
notifier_call_chain
atomic_notifier_call_chain

So update the comment to make it correct.

This is found during code review, compile tested only.

Fixes: feafe59c8975 ("wifi: ath11k: use RCU when accessing struct inet6_dev::ac_list")
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240531022411.6543-1-quic_bqiang@quicinc.com
13 months agowifi: ath12k: fix ACPI warning when resume
Baochen Qiang [Tue, 11 Jun 2024 06:42:33 +0000 (09:42 +0300)]
wifi: ath12k: fix ACPI warning when resume

Currently ACPI notification handler is installed when driver loads and only
gets removed when driver unloads. During resume after firmware is reloaded,
ath12k tries to install it by default. Since it is installed already, ACPI
subsystem rejects it and returns an error:

[   83.094206] ath12k_pci 0000:03:00.0: failed to install DSM notify callback: 7

Fix it by removing that handler when going to suspend. This also avoid any
possible ACPI call to firmware before firmware is reloaded/reinitialized.

Note ab->acpi also needs to be cleared in ath12k_acpi_stop() such that we
are in a clean state when ACPI structures are reinitialized in
ath12k_acpi_start().

Tested-on: WCN7850 HW2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3

Fixes: 576771c9fa21 ("wifi: ath12k: ACPI TAS support")
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240531024000.9291-1-quic_bqiang@quicinc.com
13 months agowifi: ath12k: modify remain on channel for single wiphy
Rameshkumar Sundaram [Tue, 11 Jun 2024 06:42:31 +0000 (09:42 +0300)]
wifi: ath12k: modify remain on channel for single wiphy

When multiple radios are advertised as a single wiphy which supports various
bands, vdev creation for the vif is deferred until channel is assigned to it.
If a remain on channel (RoC) request is received from mac80211, select the
corresponding radio (ar) based on channel and create a vdev on that radio to
initiate an RoC scan.

Note that on RoC completion this vdev is not deleted. If a new RoC/hw scan
request is seen on that same vif for a different band the vdev will be deleted
and created on the new radio supporting the request.

Also if the RoC scan is requested when the vdev is in started state, no
switching to new radio is allowed and RoC request can be accepted only on
channels within same radio.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3

Signed-off-by: Rameshkumar Sundaram <quic_ramess@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240528082739.1226758-1-quic_ramess@quicinc.com
13 months agoMerge branch 'intel-wired-lan-driver-updates-2024-06-03'
Jakub Kicinski [Tue, 11 Jun 2024 02:52:50 +0000 (19:52 -0700)]
Merge branch 'intel-wired-lan-driver-updates-2024-06-03'

Jacob Keller says:

====================
Intel Wired LAN Driver Updates 2024-06-03

This series includes miscellaneous improvements for the ice as well as a
cleanup to the Makefiles for all Intel net drivers.

Andy fixes all of the Intel net driver Makefiles to use the documented
'*-y' syntax for specifying object files to link into kernel driver
modules, rather than the '*-objs' syntax which works but is documented as
reserved for user-space host programs.

Jacob has a cleanup to refactor rounding logic in the ice driver into a
common roundup_u64 helper function.

Michal Schmidt replaces irq_set_affinity_hint() to use
irq_update_affinity_hint() which behaves better with user-applied affinity
settings.

v2: https://lore.kernel.org/r/20240605-next-2024-06-03-intel-next-batch-v2-0-39c23963fa78@intel.com
v1: https://lore.kernel.org/r/20240603-next-2024-06-03-intel-next-batch-v1-0-e0523b28f325@intel.com
====================

Link: https://lore.kernel.org/r/20240607-next-2024-06-03-intel-next-batch-v3-0-d1470cee3347@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoice: use irq_update_affinity_hint()
Michal Schmidt [Fri, 7 Jun 2024 21:22:34 +0000 (14:22 -0700)]
ice: use irq_update_affinity_hint()

irq_set_affinity_hint() is deprecated. Use irq_update_affinity_hint()
instead. This removes the side-effect of actually applying the affinity.

The driver does not really need to worry about spreading its IRQs across
CPUs. The core code already takes care of that.
On the contrary, when the driver applies affinities by itself, it breaks
the users' expectations:
 1. The user configures irqbalance with IRQBALANCE_BANNED_CPULIST in
    order to prevent IRQs from being moved to certain CPUs that run a
    real-time workload.
 2. ice reconfigures VSIs at runtime due to a MIB change
    (ice_dcb_process_lldp_set_mib_change). Reopening a VSI resets the
    affinity in ice_vsi_req_irq_msix().
 3. ice has no idea about irqbalance's config, so it may move an IRQ to
    a banned CPU. The real-time workload suffers unacceptable latency.

I am not sure if updating the affinity hints is at all useful, because
irqbalance ignores them since 2016 ([1]), but at least it's harmless.

This ice change is similar to i40e commit d34c54d1739c ("i40e: Use
irq_update_affinity_hint()").

[1] https://github.com/Irqbalance/irqbalance/commit/dcc411e7bfdd

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Sunil Goutham <sgoutham@marvell.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20240607-next-2024-06-03-intel-next-batch-v3-3-d1470cee3347@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoice: add and use roundup_u64 instead of open coding equivalent
Jacob Keller [Fri, 7 Jun 2024 21:22:33 +0000 (14:22 -0700)]
ice: add and use roundup_u64 instead of open coding equivalent

In ice_ptp_cfg_clkout(), the ice driver needs to calculate the nearest next
second of a current time value specified in nanoseconds. It implements this
using div64_u64, because the time value is a u64. It could use div_u64
since NSEC_PER_SEC is smaller than 32-bits.

Ideally this would be implemented directly with roundup(), but that can't
work on all platforms due to a division which requires using the specific
macros and functions due to platform restrictions, and to ensure that the
most appropriate and fast instructions are used.

The kernel doesn't currently provide any 64-bit equivalents for doing
roundup. Attempting to use roundup() on a 32-bit platform will result in a
link failure due to not having a direct 64-bit division.

The closest equivalent for this is DIV64_U64_ROUND_UP, which does a
division always rounding up. However, this only computes the division, and
forces use of the div64_u64 in cases where the divisor is a 32bit value and
could make use of div_u64.

Introduce DIV_U64_ROUND_UP based on div_u64, and then use it to implement
roundup_u64 which takes a u64 input value and a u32 rounding value.

The name roundup_u64 matches the naming scheme of div_u64, and future
patches could implement roundup64_u64 if they need to round by a multiple
that is greater than 32-bits.

Replace the logic in ice_ptp.c which does this equivalent with the newly
added roundup_u64.

Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20240607-next-2024-06-03-intel-next-batch-v3-2-d1470cee3347@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agonet: intel: Use *-y instead of *-objs in Makefile
Andy Shevchenko [Fri, 7 Jun 2024 21:22:32 +0000 (14:22 -0700)]
net: intel: Use *-y instead of *-objs in Makefile

*-objs suffix is reserved rather for (user-space) host programs while
usually *-y suffix is used for kernel drivers (although *-objs works
for that purpose for now).

Let's correct the old usages of *-objs in Makefiles.

Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20240607-next-2024-06-03-intel-next-batch-v3-1-d1470cee3347@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoisdn: add missing MODULE_DESCRIPTION() macros
Jeff Johnson [Fri, 7 Jun 2024 18:56:56 +0000 (11:56 -0700)]
isdn: add missing MODULE_DESCRIPTION() macros

make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/hfcpci.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/hfcmulti.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/hfcsusb.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/avmfritz.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/speedfax.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/mISDNinfineon.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/w6692.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/netjet.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/mISDNipac.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/mISDNisar.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/mISDN/mISDN_core.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/mISDN/mISDN_dsp.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/mISDN/l1oip.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240607-md-drivers-isdn-v1-1-81fb7001bc3a@quicinc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoMerge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf...
Jakub Kicinski [Tue, 11 Jun 2024 01:02:14 +0000 (18:02 -0700)]
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-06-06

We've added 54 non-merge commits during the last 10 day(s) which contain
a total of 50 files changed, 1887 insertions(+), 527 deletions(-).

The main changes are:

1) Add a user space notification mechanism via epoll when a struct_ops
   object is getting detached/unregistered, from Kui-Feng Lee.

2) Big batch of BPF selftest refactoring for sockmap and BPF congctl
   tests, from Geliang Tang.

3) Add BTF field (type and string fields, right now) iterator support
   to libbpf instead of using existing callback-based approaches,
   from Andrii Nakryiko.

4) Extend BPF selftests for the latter with a new btf_field_iter
   selftest, from Alan Maguire.

5) Add new kfuncs for a generic, open-coded bits iterator,
   from Yafang Shao.

6) Fix BPF selftests' kallsyms_find() helper under kernels configured
   with CONFIG_LTO_CLANG_THIN, from Yonghong Song.

7) Remove a bunch of unused structs in BPF selftests,
   from David Alan Gilbert.

8) Convert test_sockmap section names into names understood by libbpf
   so it can deduce program type and attach type, from Jakub Sitnicki.

9) Extend libbpf with the ability to configure log verbosity
   via LIBBPF_LOG_LEVEL environment variable, from Mykyta Yatsenko.

10) Fix BPF selftests with regards to bpf_cookie and find_vma flakiness
    in nested VMs, from Song Liu.

11) Extend riscv32/64 JITs to introduce shift/add helpers to generate Zba
    optimization, from Xiao Wang.

12) Enable BPF programs to declare arrays and struct fields with kptr,
    bpf_rb_root, and bpf_list_head, from Kui-Feng Lee.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (54 commits)
  selftests/bpf: Drop useless arguments of do_test in bpf_tcp_ca
  selftests/bpf: Use start_test in test_dctcp in bpf_tcp_ca
  selftests/bpf: Use start_test in test_dctcp_fallback in bpf_tcp_ca
  selftests/bpf: Add start_test helper in bpf_tcp_ca
  selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca
  libbpf: Auto-attach struct_ops BPF maps in BPF skeleton
  selftests/bpf: Add btf_field_iter selftests
  selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT
  libbpf: Remove callback-based type/string BTF field visitor helpers
  bpftool: Use BTF field iterator in btfgen
  libbpf: Make use of BTF field iterator in BTF handling code
  libbpf: Make use of BTF field iterator in BPF linker code
  libbpf: Add BTF field iterator
  selftests/bpf: Ignore .llvm.<hash> suffix in kallsyms_find()
  selftests/bpf: Fix bpf_cookie and find_vma in nested VM
  selftests/bpf: Test global bpf_list_head arrays.
  selftests/bpf: Test global bpf_rb_root arrays and fields in nested struct types.
  selftests/bpf: Test kptr arrays and kptrs in nested struct fields.
  bpf: limit the number of levels of a nested struct type.
  bpf: look into the types of the fields of a struct type recursively.
  ...
====================

Link: https://lore.kernel.org/r/20240606223146.23020-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoMerge tag 'wireless-next-2024-06-07' of git://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Tue, 11 Jun 2024 00:40:25 +0000 (17:40 -0700)]
Merge tag 'wireless-next-2024-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.11

The first "new features" pull request for v6.11 with changes both in
stack and in drivers. Nothing out of ordinary, except that we have
two conflicts this time:

net/mac80211/cfg.c
  https://lore.kernel.org/all/20240531124415.05b25e7a@canb.auug.org.au

drivers/net/wireless/microchip/wilc1000/netdev.c
  https://lore.kernel.org/all/20240603110023.23572803@canb.auug.org.au

Major changes:

cfg80211/mac80211
 * parse Transmit Power Envelope (TPE) data in mac80211 instead of in drivers

wilc1000
 * read MAC address during probe to make it visible to user space

iwlwifi
 * bump FW API to 91 for BZ/SC devices
 * report 64-bit radiotap timestamp
 * enable P2P low latency by default
 * handle Transmit Power Envelope (TPE) advertised by AP
 * start using guard()

rtlwifi
 * RTL8192DU support

ath12k
 * remove unsupported tx monitor handling
 * channel 2 in 6 GHz band support
 * Spatial Multiplexing Power Save (SMPS) in 6 GHz band support
 * multiple BSSID (MBSSID) and Enhanced Multi-BSSID Advertisements (EMA)
   support
 * dynamic VLAN support
 * add panic handler for resetting the firmware state

ath10k
 * add qcom,no-msa-ready-indicator Device Tree property
 * LED support for various chipsets

* tag 'wireless-next-2024-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (194 commits)
  wifi: ath12k: add hw_link_id in ath12k_pdev
  wifi: ath12k: add panic handler
  wifi: rtw89: chan: Use swap() in rtw89_swap_sub_entity()
  wifi: brcm80211: remove unused structs
  wifi: brcm80211: use sizeof(*pointer) instead of sizeof(type)
  wifi: ath12k: do not process consecutive RDDM event
  dt-bindings: net: wireless: ath11k: Drop "qcom,ipq8074-wcss-pil" from example
  wifi: ath12k: fix memory leak in ath12k_dp_rx_peer_frag_setup()
  wifi: rtlwifi: handle return value of usb init TX/RX
  wifi: rtlwifi: Enable the new rtl8192du driver
  wifi: rtlwifi: Add rtl8192du/sw.c
  wifi: rtlwifi: Constify rtl_hal_cfg.{ops,usb_interface_cfg} and rtl_priv.cfg
  wifi: rtlwifi: Add rtl8192du/dm.{c,h}
  wifi: rtlwifi: Add rtl8192du/fw.{c,h} and rtl8192du/led.{c,h}
  wifi: rtlwifi: Add rtl8192du/rf.{c,h}
  wifi: rtlwifi: Add rtl8192du/trx.{c,h}
  wifi: rtlwifi: Add rtl8192du/phy.{c,h}
  wifi: rtlwifi: Add rtl8192du/hw.{c,h}
  wifi: rtlwifi: Add new members to struct rtl_priv for RTL8192DU
  wifi: rtlwifi: Add rtl8192du/table.{c,h}
  ...

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================

Link: https://lore.kernel.org/r/20240607093517.41394C2BBFC@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoMerge branch 'fix-changing-dsa-conduit'
David S. Miller [Mon, 10 Jun 2024 12:48:06 +0000 (13:48 +0100)]
Merge branch 'fix-changing-dsa-conduit'

Marek Behún says:

====================
Fix changing DSA conduit

This series fixes an issue in the DSA code related to host interface UC
address installed into port FDB and port conduit address database when
live-changing port conduit.

The first patch refactores/deduplicates the installation/uninstallation
of the interface's MAC address and the second patch fixes the issue.

Cover letter for v1 and v2:
  https://patchwork.kernel.org/project/netdevbpf/cover/20240429163627.16031-1-kabel@kernel.org/
  https://patchwork.kernel.org/project/netdevbpf/cover/20240502122922.28139-1-kabel@kernel.org/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agonet: dsa: update the unicast MAC address when changing conduit
Marek Behún [Wed, 5 Jun 2024 13:33:29 +0000 (15:33 +0200)]
net: dsa: update the unicast MAC address when changing conduit

When changing DSA user interface conduit while the user interface is up,
DSA exhibits different behavior in comparison to when the interface is
down. This different behavior concerns the primary unicast MAC address
stored in the port standalone FDB and in the conduit device UC database.

If we put a switch port down while changing the conduit with
  ip link set sw0p0 down
  ip link set sw0p0 type dsa conduit conduit1
  ip link set sw0p0 up
we delete the address in dsa_user_close() and install the (possibly
different) address in dsa_user_open().

But when changing the conduit on the fly, the old address is not
deleted and the new one is not installed.

Since we explicitly want to support live-changing the conduit, uninstall
the old address before calling dsa_port_assign_conduit() and install the
(possibly different) new address after the call.

Because conduit change might also trigger address change (the user
interface is supposed to inherit the conduit interface MAC address if no
address is defined in hardware (dp->mac is a zero address)), move the
eth_hw_addr_inherit() call from dsa_user_change_conduit() to
dsa_port_change_conduit(), just before installing the new address.

Although this is in theory a flaw in DSA core, it needs not be
backported, since there is currently no DSA driver that can be affected
by this. The only DSA driver that supports changing conduit is felix,
and, as explained by Vladimir Oltean [1]:

  There are 2 reasons why with felix the bug does not manifest itself.

  First is because both the 'ocelot' and the alternate 'ocelot-8021q'
  tagging protocols have the 'promisc_on_conduit = true' flag. So the
  unicast address doesn't have to be in the conduit's RX filter -
  neither the old or the new conduit.

  Second, dsa_user_host_uc_install() theoretically leaves behind host
  FDB entries installed towards the wrong (old) CPU port. But in
  felix_fdb_add(), we treat any FDB entry requested towards any CPU port
  as if it was a multicast FDB entry programmed towards _all_ CPU ports.
  For that reason, it is installed towards the port mask of the PGID_CPU
  port group ID:

if (dsa_port_is_cpu(dp))
port = PGID_CPU;

Therefore no Fixes tag for this change.

[1] https://lore.kernel.org/netdev/20240507201827.47suw4fwcjrbungy@skbuf/
Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agonet: dsa: deduplicate code adding / deleting the port address to fdb
Marek Behún [Wed, 5 Jun 2024 13:33:28 +0000 (15:33 +0200)]
net: dsa: deduplicate code adding / deleting the port address to fdb

The sequence
  if (dsa_switch_supports_uc_filtering(ds))
    dsa_port_standalone_host_fdb_add(dp, addr, 0);
  if (!ether_addr_equal(addr, conduit->dev_addr))
    dev_uc_add(conduit, addr);
is executed both in dsa_user_open() and dsa_user_set_mac_addr().

Its reverse is executed both in dsa_user_close() and
dsa_user_set_mac_addr().

Refactor these sequences into new functions dsa_user_host_uc_install()
and dsa_user_host_uc_uninstall().

Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agoMerge branch 'rtnetlink-rtnl_lock'
David S. Miller [Mon, 10 Jun 2024 12:15:40 +0000 (13:15 +0100)]
Merge branch 'rtnetlink-rtnl_lock'

Jakub Kicinski says:

====================
rtnetlink: move rtnl_lock handling out of af_netlink

With the changes done in commit 5b4b62a169e1 ("rtnetlink: make
the "split" NLM_DONE handling generic") we can also move the
rtnl locking out of af_netlink.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agonet: netlink: remove the cb_mutex "injection" from netlink core
Jakub Kicinski [Thu, 6 Jun 2024 19:29:06 +0000 (12:29 -0700)]
net: netlink: remove the cb_mutex "injection" from netlink core

Back in 2007, in commit af65bdfce98d ("[NETLINK]: Switch cb_lock spinlock
to mutex and allow to override it") netlink core was extended to allow
subsystems to replace the dump mutex lock with its own lock.

The mechanism was used by rtnetlink to take rtnl_lock but it isn't
sufficiently flexible for other users. Over the 17 years since
it was added no other user appeared. Since rtnetlink needs conditional
locking now, and doesn't use it either, axe this feature complete.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agortnetlink: move rtnl_lock handling out of af_netlink
Jakub Kicinski [Thu, 6 Jun 2024 19:29:05 +0000 (12:29 -0700)]
rtnetlink: move rtnl_lock handling out of af_netlink

Now that we have an intermediate layer of code for handling
rtnl-level netlink dump quirks, we can move the rtnl_lock
taking there.

For dump handlers with RTNL_FLAG_DUMP_SPLIT_NLM_DONE we can
avoid taking rtnl_lock just to generate NLM_DONE, once again.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agonet: dsa: hellcreek: Replace kernel.h with what is used
Andy Shevchenko [Thu, 6 Jun 2024 16:15:49 +0000 (19:15 +0300)]
net: dsa: hellcreek: Replace kernel.h with what is used

kernel.h is included solely for some other existing headers.
Include them directly and get rid of kernel.h.

While at it, sort headers alphabetically for easier maintenance.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agoMerge branch 'tcp-up-pin-tw-timer'
David S. Miller [Mon, 10 Jun 2024 10:54:19 +0000 (11:54 +0100)]
Merge branch 'tcp-up-pin-tw-timer'

Florian Westphal says:

====================
net: tcp: un-pin tw timer

Changes since previous iteration:
 - Patch 1: update a comment, I copied Erics v7 RvB tag.
 - Patch 2: move bh off/on into hashdance_schedule and get rid of
   comment mentioning pinned tw timer.
   I did not copy Erics RvB tag over from v7 because of the change.
 - Patch 3 is unchanged, so I kept Erics RvB tag.

This is v8 of the series where the tw_timer is un-pinned to get rid of
interferences in isolated CPUs setups.

First patch makes necessary preparations, existing code relies on
TIMER_PINNED to avoid races.

Second patch un-pins the TW timer. Could be folded into the first one,
but it might help wrt. bisection.

Third patch is a minor cleanup to move a helper from .h to the only
remaining compilation unit.

Tested with iperf3 and stress-ng socket mode.
====================

Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agotcp: move inet_twsk_schedule helper out of header
Florian Westphal [Thu, 6 Jun 2024 15:11:39 +0000 (17:11 +0200)]
tcp: move inet_twsk_schedule helper out of header

Its no longer used outside inet_timewait_sock.c, so move it there.

Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agonet: tcp: un-pin the tw_timer
Florian Westphal [Thu, 6 Jun 2024 15:11:38 +0000 (17:11 +0200)]
net: tcp: un-pin the tw_timer

After previous patch, even if timer fires immediately on another CPU,
context that schedules the timer now holds the ehash spinlock, so timer
cannot reap tw socket until ehash lock is released.

BH disable is moved into hashdance_schedule.

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agonet: tcp/dccp: prepare for tw_timer un-pinning
Valentin Schneider [Thu, 6 Jun 2024 15:11:37 +0000 (17:11 +0200)]
net: tcp/dccp: prepare for tw_timer un-pinning

The TCP timewait timer is proving to be problematic for setups where
scheduler CPU isolation is achieved at runtime via cpusets (as opposed to
statically via isolcpus=domains).

What happens there is a CPU goes through tcp_time_wait(), arming the
time_wait timer, then gets isolated. TCP_TIMEWAIT_LEN later, the timer
fires, causing interference for the now-isolated CPU. This is conceptually
similar to the issue described in commit e02b93124855 ("workqueue: Unbind
kworkers before sending them to exit()")

Move inet_twsk_schedule() to within inet_twsk_hashdance(), with the ehash
lock held. Expand the lock's critical section from inet_twsk_kill() to
inet_twsk_deschedule_put(), serializing the scheduling vs descheduling of
the timer. IOW, this prevents the following race:

     tcp_time_wait()
       inet_twsk_hashdance()
  inet_twsk_deschedule_put()
    del_timer_sync()
       inet_twsk_schedule()

Thanks to Paolo Abeni for suggesting to leverage the ehash lock.

This also restores a comment from commit ec94c2696f0b ("tcp/dccp: avoid
one atomic operation for timewait hashdance") as inet_twsk_hashdance() had
a "Step 1" and "Step 3" comment, but the "Step 2" had gone missing.

inet_twsk_deschedule_put() now acquires the ehash spinlock to synchronize
with inet_twsk_hashdance_schedule().

To ease possible regression search, actual un-pin is done in next patch.

Link: https://lore.kernel.org/all/ZPhpfMjSiHVjQkTk@localhost.localdomain/
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Co-developed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agoMerge branch 'mlxsw-acl-fixes'
David S. Miller [Mon, 10 Jun 2024 10:14:53 +0000 (11:14 +0100)]
Merge branch 'mlxsw-acl-fixes'

Petr Machata says:

====================
mlxsw: ACL fixes

Ido Schimmel writes:

Patches #1-#3 fix various spelling mistakes I noticed while working on
the code base.

Patch #4 fixes a general protection fault by bailing out when the error
occurs and warning.

Patch #5 fixes the warning.

Patch #6 fixes ACL scale regression and firmware errors.

See the commit messages for more info.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agomlxsw: spectrum_acl: Fix ACL scale regression and firmware errors
Ido Schimmel [Thu, 6 Jun 2024 14:49:43 +0000 (16:49 +0200)]
mlxsw: spectrum_acl: Fix ACL scale regression and firmware errors

ACLs that reside in the algorithmic TCAM (A-TCAM) in Spectrum-2 and
newer ASICs can share the same mask if their masks only differ in up to
8 consecutive bits. For example, consider the following filters:

 # tc filter add dev swp1 ingress pref 1 proto ip flower dst_ip 192.0.2.0/24 action drop
 # tc filter add dev swp1 ingress pref 1 proto ip flower dst_ip 198.51.100.128/25 action drop

The second filter can use the same mask as the first (dst_ip/24) with a
delta of 1 bit.

However, the above only works because the two filters have different
values in the common unmasked part (dst_ip/24). When entries have the
same value in the common unmasked part they create undesired collisions
in the device since many entries now have the same key. This leads to
firmware errors such as [1] and to a reduced scale.

Fix by adjusting the hash table key to only include the value in the
common unmasked part. That is, without including the delta bits. That
way the driver will detect the collision during filter insertion and
spill the filter into the circuit TCAM (C-TCAM).

Add a test case that fails without the fix and adjust existing cases
that check C-TCAM spillage according to the above limitation.

[1]
mlxsw_spectrum2 0000:06:00.0: EMAD reg access failed (tid=3379b18a00003394,reg_id=3027(ptce3),type=write,status=8(resource not available))

Fixes: c22291f7cf45 ("mlxsw: spectrum: acl: Implement delta for ERP")
Reported-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agomlxsw: spectrum_acl_erp: Fix object nesting warning
Ido Schimmel [Thu, 6 Jun 2024 14:49:42 +0000 (16:49 +0200)]
mlxsw: spectrum_acl_erp: Fix object nesting warning

ACLs in Spectrum-2 and newer ASICs can reside in the algorithmic TCAM
(A-TCAM) or in the ordinary circuit TCAM (C-TCAM). The former can
contain more ACLs (i.e., tc filters), but the number of masks in each
region (i.e., tc chain) is limited.

In order to mitigate the effects of the above limitation, the device
allows filters to share a single mask if their masks only differ in up
to 8 consecutive bits. For example, dst_ip/25 can be represented using
dst_ip/24 with a delta of 1 bit. The C-TCAM does not have a limit on the
number of masks being used (and therefore does not support mask
aggregation), but can contain a limited number of filters.

The driver uses the "objagg" library to perform the mask aggregation by
passing it objects that consist of the filter's mask and whether the
filter is to be inserted into the A-TCAM or the C-TCAM since filters in
different TCAMs cannot share a mask.

The set of created objects is dependent on the insertion order of the
filters and is not necessarily optimal. Therefore, the driver will
periodically ask the library to compute a more optimal set ("hints") by
looking at all the existing objects.

When the library asks the driver whether two objects can be aggregated
the driver only compares the provided masks and ignores the A-TCAM /
C-TCAM indication. This is the right thing to do since the goal is to
move as many filters as possible to the A-TCAM. The driver also forbids
two identical masks from being aggregated since this can only happen if
one was intentionally put in the C-TCAM to avoid a conflict in the
A-TCAM.

The above can result in the following set of hints:

H1: {mask X, A-TCAM} -> H2: {mask Y, A-TCAM} // X is Y + delta
H3: {mask Y, C-TCAM} -> H4: {mask Z, A-TCAM} // Y is Z + delta

After getting the hints from the library the driver will start migrating
filters from one region to another while consulting the computed hints
and instructing the device to perform a lookup in both regions during
the transition.

Assuming a filter with mask X is being migrated into the A-TCAM in the
new region, the hints lookup will return H1. Since H2 is the parent of
H1, the library will try to find the object associated with it and
create it if necessary in which case another hints lookup (recursive)
will be performed. This hints lookup for {mask Y, A-TCAM} will either
return H2 or H3 since the driver passes the library an object comparison
function that ignores the A-TCAM / C-TCAM indication.

This can eventually lead to nested objects which are not supported by
the library [1].

Fix by removing the object comparison function from both the driver and
the library as the driver was the only user. That way the lookup will
only return exact matches.

I do not have a reliable reproducer that can reproduce the issue in a
timely manner, but before the fix the issue would reproduce in several
minutes and with the fix it does not reproduce in over an hour.

Note that the current usefulness of the hints is limited because they
include the C-TCAM indication and represent aggregation that cannot
actually happen. This will be addressed in net-next.

[1]
WARNING: CPU: 0 PID: 153 at lib/objagg.c:170 objagg_obj_parent_assign+0xb5/0xd0
Modules linked in:
CPU: 0 PID: 153 Comm: kworker/0:18 Not tainted 6.9.0-rc6-custom-g70fbc2c1c38b #42
Hardware name: Mellanox Technologies Ltd. MSN3700C/VMOD0008, BIOS 5.11 10/10/2018
Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
RIP: 0010:objagg_obj_parent_assign+0xb5/0xd0
[...]
Call Trace:
 <TASK>
 __objagg_obj_get+0x2bb/0x580
 objagg_obj_get+0xe/0x80
 mlxsw_sp_acl_erp_mask_get+0xb5/0xf0
 mlxsw_sp_acl_atcam_entry_add+0xe8/0x3c0
 mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0
 mlxsw_sp_acl_tcam_vchunk_migrate_one+0x16b/0x270
 mlxsw_sp_acl_tcam_vregion_rehash_work+0xbe/0x510
 process_one_work+0x151/0x370

Fixes: 9069a3817d82 ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agolib: objagg: Fix general protection fault
Ido Schimmel [Thu, 6 Jun 2024 14:49:41 +0000 (16:49 +0200)]
lib: objagg: Fix general protection fault

The library supports aggregation of objects into other objects only if
the parent object does not have a parent itself. That is, nesting is not
supported.

Aggregation happens in two cases: Without and with hints, where hints
are a pre-computed recommendation on how to aggregate the provided
objects.

Nesting is not possible in the first case due to a check that prevents
it, but in the second case there is no check because the assumption is
that nesting cannot happen when creating objects based on hints. The
violation of this assumption leads to various warnings and eventually to
a general protection fault [1].

Before fixing the root cause, error out when nesting happens and warn.

[1]
general protection fault, probably for non-canonical address 0xdead000000000d90: 0000 [#1] PREEMPT SMP PTI
CPU: 1 PID: 1083 Comm: kworker/1:9 Tainted: G        W          6.9.0-rc6-custom-gd9b4f1cca7fb #7
Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019
Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
RIP: 0010:mlxsw_sp_acl_erp_bf_insert+0x25/0x80
[...]
Call Trace:
 <TASK>
 mlxsw_sp_acl_atcam_entry_add+0x256/0x3c0
 mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0
 mlxsw_sp_acl_tcam_vchunk_migrate_one+0x16b/0x270
 mlxsw_sp_acl_tcam_vregion_rehash_work+0xbe/0x510
 process_one_work+0x151/0x370
 worker_thread+0x2cb/0x3e0
 kthread+0xd0/0x100
 ret_from_fork+0x34/0x50
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Fixes: 9069a3817d82 ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Reported-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agomlxsw: spectrum_acl_atcam: Fix wrong comment
Ido Schimmel [Thu, 6 Jun 2024 14:49:40 +0000 (16:49 +0200)]
mlxsw: spectrum_acl_atcam: Fix wrong comment

The key is encoded, not encrypted.

Fixes: c22291f7cf45 ("mlxsw: spectrum: acl: Implement delta for ERP")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agolib: test_objagg: Fix spelling
Ido Schimmel [Thu, 6 Jun 2024 14:49:39 +0000 (16:49 +0200)]
lib: test_objagg: Fix spelling

Fixes: 0a020d416d0a ("lib: introduce initial implementation of object aggregation manager")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agolib: objagg: Fix spelling
Ido Schimmel [Thu, 6 Jun 2024 14:49:38 +0000 (16:49 +0200)]
lib: objagg: Fix spelling

Fixes: 0a020d416d0a ("lib: introduce initial implementation of object aggregation manager")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agodmaengine: ti: k3-udma-glue: clean up return in k3_udma_glue_rx_get_irq()
Dan Carpenter [Thu, 6 Jun 2024 14:23:44 +0000 (17:23 +0300)]
dmaengine: ti: k3-udma-glue: clean up return in k3_udma_glue_rx_get_irq()

Currently the k3_udma_glue_rx_get_irq() function returns either negative
error codes or zero on error.  Generally, in the kernel, zero means
success so this be confusing and has caused bugs in the past.  Also the
"tx" version of this function only returns negative error codes.  Let's
clean this "rx" function so both functions match.

This patch has no effect on runtime.

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Acked-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agotools: ynl: make user space policies const
Jakub Kicinski [Wed, 5 Jun 2024 17:16:44 +0000 (10:16 -0700)]
tools: ynl: make user space policies const

Dan, who's working on C++ YNL, pointed out that the C code
does not make policies const. Sprinkle some 'const's around.

Reported-by: Dan Melnic <dmm@meta.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agopage_pool: remove WARN_ON() with OR
David Wei [Wed, 5 Jun 2024 16:19:24 +0000 (09:19 -0700)]
page_pool: remove WARN_ON() with OR

Having an OR in WARN_ON() makes me sad because it's impossible to tell
which condition is true when triggered.

Split a WARN_ON() with an OR in page_pool_disable_direct_recycling().

Signed-off-by: David Wei <dw@davidwei.uk>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agonet: ti: icssg-prueth: Add multicast filtering support
MD Danish Anwar [Wed, 5 Jun 2024 09:52:23 +0000 (15:22 +0530)]
net: ti: icssg-prueth: Add multicast filtering support

Add multicast filtering support for ICSSG Driver. Multicast addresses will
be updated by __dev_mc_sync() API. icssg_prueth_add_macst () and
icssg_prueth_del_mcast() will be sync and unsync APIs for the driver
respectively.

To add a mac_address for a port, driver needs to call icssg_fdb_add_del()
and pass the mac_address and BIT(port_id) to the API. The ICSSG firmware
will then configure the rules and allow filtering.

If a mac_address is added to port0 and the same mac_address needs to be
added for port1, driver needs to pass BIT(port0) | BIT(port1) to the
icssg_fdb_add_del() API. If driver just pass BIT(port1) then the entry for
port0 will be overwritten / lost. This is a design constraint on the
firmware side.

To overcome this in the driver, to add any mac_address for let's say portX
driver first checks if the same mac_address is already added for any other
port. If yes driver calls icssg_fdb_add_del() with BIT(portX) |
BIT(other_existing_port). If not, driver calls icssg_fdb_add_del() with
BIT(portX).

The same thing is applicable for deleting mac_addresses as well. This
logic is in icssg_prueth_add_mcast / icssg_prueth_del_mcast APIs.

Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agor8152: Set NET_ADDR_STOLEN if using passthru MAC
Milan Broz [Wed, 5 Jun 2024 15:33:40 +0000 (17:33 +0200)]
r8152: Set NET_ADDR_STOLEN if using passthru MAC

Some docks support MAC pass-through - MAC address
is taken from another device.

Driver should indicate that with NET_ADDR_STOLEN flag.

This should help to avoid collisions if network interface
names are generated with MAC policy.

Reported and discussed here
https://github.com/systemd/systemd/issues/33104

Signed-off-by: Milan Broz <gmazyland@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240605153340.25694-1-gmazyland@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoselftests/bpf: Drop useless arguments of do_test in bpf_tcp_ca
Geliang Tang [Thu, 30 May 2024 07:41:12 +0000 (15:41 +0800)]
selftests/bpf: Drop useless arguments of do_test in bpf_tcp_ca

bpf_map_lookup_elem() has been removed from do_test(), it makes the
sk_stg_map argument of do_test() useless. In addition, two exactly the
same opts are passed in all the places where do_test() is invoked, so
cli_opts argument can be dropped too.

This patch drops these two useless arguments of do_test() in bpf_tcp_ca.c.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/7056eab111d78a05bce29d2821228dc93f240de4.1717054461.git.tanggeliang@kylinos.cn
13 months agoselftests/bpf: Use start_test in test_dctcp in bpf_tcp_ca
Geliang Tang [Thu, 30 May 2024 07:41:11 +0000 (15:41 +0800)]
selftests/bpf: Use start_test in test_dctcp in bpf_tcp_ca

The "if (sk_stg_map)" block in do_test() is only used by test_dctcp(),
it makes sense to move it from do_test() into test_dctcp(). Then
do_test() can be used by other tests except test_dctcp().

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/9938916627b9810c877e5c03a621bc0ba5acf5c5.1717054461.git.tanggeliang@kylinos.cn
13 months agoselftests/bpf: Use start_test in test_dctcp_fallback in bpf_tcp_ca
Geliang Tang [Thu, 30 May 2024 07:41:10 +0000 (15:41 +0800)]
selftests/bpf: Use start_test in test_dctcp_fallback in bpf_tcp_ca

The newly added helper start_test() can be used in test_dctcp_fallback()
too, to replace start_server_str() and connect_to_fd_opts(). In that
way, two network_helper_opts srv_opts and cli_opts are used instead of
the previously shared opts.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/792ca3bb013fa06e618176da02d75e4f79a76733.1717054461.git.tanggeliang@kylinos.cn
13 months agoselftests/bpf: Add start_test helper in bpf_tcp_ca
Geliang Tang [Thu, 30 May 2024 07:41:09 +0000 (15:41 +0800)]
selftests/bpf: Add start_test helper in bpf_tcp_ca

For moving the "if (sk_stg_map)" block out of do_test(), extract the
code before this block as a new function start_test(). It creates
server-side and client-side sockets and returns them to the caller.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/48f2921ff9be958f5d3d28fe6bb7269a61cafa9f.1717054461.git.tanggeliang@kylinos.cn
13 months agoselftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca
Geliang Tang [Thu, 30 May 2024 07:41:08 +0000 (15:41 +0800)]
selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca

This patch uses connect_to_fd_opts() instead of using connect_fd_to_fd()
and settcpca() in do_test() in prog_tests/bpf_tcp_ca.c to accept a struct
network_helper_opts argument.

Then define a dctcp dedicated post_socket_cb callback stg_post_socket_cb(),
invoking both settcpca() and bpf_map_update_elem() in it, and set it in
test_dctcp(). For passing map_fd into stg_post_socket_cb() callback, a new
member map_fd is added in struct cb_opts.

Add another "const struct network_helper_opts *cli_opts" to do_test() to
separate it from the server "opts".

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/876ec90430865bc468e3b7f6fb2648420b075548.1717054461.git.tanggeliang@kylinos.cn
13 months agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 6 Jun 2024 18:33:09 +0000 (11:33 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

drivers/net/ethernet/pensando/ionic/ionic_txrx.c
  d9c04209990b ("ionic: Mark error paths in the data path as unlikely")
  491aee894a08 ("ionic: fix kernel panic in XDP_TX action")

net/ipv6/ip6_fib.c
  b4cb4a1391dc ("net: use unrcu_pointer() helper")
  b01e1c030770 ("ipv6: fix possible race in __fib6_drop_pcpu_from()")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agolibbpf: Auto-attach struct_ops BPF maps in BPF skeleton
Mykyta Yatsenko [Wed, 5 Jun 2024 17:51:35 +0000 (18:51 +0100)]
libbpf: Auto-attach struct_ops BPF maps in BPF skeleton

Similarly to `bpf_program`, support `bpf_map` automatic attachment in
`bpf_object__attach_skeleton`. Currently only struct_ops maps could be
attached.

On bpftool side, code-generate links in skeleton struct for struct_ops maps.
Similarly to `bpf_program_skeleton`, set links in `bpf_map_skeleton`.

On libbpf side, extend `bpf_map` with new `autoattach` field to support
enabling or disabling autoattach functionality, introducing
getter/setter for this field.

`bpf_object__(attach|detach)_skeleton` is extended with
attaching/detaching struct_ops maps logic.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240605175135.117127-1-yatsenko@meta.com
13 months agoMerge tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 6 Jun 2024 16:55:27 +0000 (09:55 -0700)]
Merge tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from BPF and big collection of fixes for WiFi core and
  drivers.

  Current release - regressions:

   - vxlan: fix regression when dropping packets due to invalid src
     addresses

   - bpf: fix a potential use-after-free in bpf_link_free()

   - xdp: revert support for redirect to any xsk socket bound to the
     same UMEM as it can result in a corruption

   - virtio_net:
      - add missing lock protection when reading return code from
        control_buf
      - fix false-positive lockdep splat in DIM
      - Revert "wifi: wilc1000: convert list management to RCU"

   - wifi: ath11k: fix error path in ath11k_pcic_ext_irq_config

  Previous releases - regressions:

   - rtnetlink: make the "split" NLM_DONE handling generic, restore the
     old behavior for two cases where we started coalescing those
     messages with normal messages, breaking sloppily-coded userspace

   - wifi:
      - cfg80211: validate HE operation element parsing
      - cfg80211: fix 6 GHz scan request building
      - mt76: mt7615: add missing chanctx ops
      - ath11k: move power type check to ASSOC stage, fix connecting to
        6 GHz AP
      - ath11k: fix WCN6750 firmware crash caused by 17 num_vdevs
      - rtlwifi: ignore IEEE80211_CONF_CHANGE_RETRY_LIMITS
      - iwlwifi: mvm: fix a crash on 7265

  Previous releases - always broken:

   - ncsi: prevent multi-threaded channel probing, a spec violation

   - vmxnet3: disable rx data ring on dma allocation failure

   - ethtool: init tsinfo stats if requested, prevent unintentionally
     reporting all-zero stats on devices which don't implement any

   - dst_cache: fix possible races in less common IPv6 features

   - tcp: auth: don't consider TCP_CLOSE to be in TCP_AO_ESTABLISHED

   - ax25: fix two refcounting bugs

   - eth: ionic: fix kernel panic in XDP_TX action

  Misc:

   - tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB"

* tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (107 commits)
  selftests: net: lib: set 'i' as local
  selftests: net: lib: avoid error removing empty netns name
  selftests: net: lib: support errexit with busywait
  net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool()
  ipv6: fix possible race in __fib6_drop_pcpu_from()
  af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill().
  af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen().
  af_unix: Use skb_queue_empty_lockless() in unix_release_sock().
  af_unix: Use unix_recvq_full_lockless() in unix_stream_connect().
  af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen.
  af_unix: Annotate data-races around sk->sk_sndbuf.
  af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG.
  af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb().
  af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg().
  af_unix: Annotate data-race of sk->sk_state in unix_accept().
  af_unix: Annotate data-race of sk->sk_state in unix_stream_connect().
  af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll().
  af_unix: Annotate data-race of sk->sk_state in unix_inq_len().
  af_unix: Annodate data-races around sk->sk_state for writers.
  af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer.
  ...

13 months agoMerge tag 'tomoyo-pr-20240606' of git://git.code.sf.net/p/tomoyo/tomoyo
Linus Torvalds [Thu, 6 Jun 2024 16:48:57 +0000 (09:48 -0700)]
Merge tag 'tomoyo-pr-20240606' of git://git.code.sf.net/p/tomoyo/tomoyo

Pull tomoyo fixlet from Tetsuo Handa:
 "Single patch to update project links, no behavior changes"

* tag 'tomoyo-pr-20240606' of git://git.code.sf.net/p/tomoyo/tomoyo:
  tomoyo: update project links

13 months agoMerge tag 'efi-fixes-for-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 6 Jun 2024 16:39:36 +0000 (09:39 -0700)]
Merge tag 'efi-fixes-for-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI fixes from Ard Biesheuvel:

 - Ensure that .discard sections are really discarded in the EFI zboot
   image build

 - Return proper error numbers from efi-pstore

 - Add __nocfi annotations to EFI runtime wrappers

* tag 'efi-fixes-for-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi: Add missing __nocfi annotations to runtime wrappers
  efi: pstore: Return proper errors on UEFI failures
  efi/libstub: zboot.lds: Discard .discard sections

13 months agoMerge branch 'selftests-net-lib-small-fixes'
Jakub Kicinski [Thu, 6 Jun 2024 15:23:44 +0000 (08:23 -0700)]
Merge branch 'selftests-net-lib-small-fixes'

Matthieu Baerts says:

====================
selftests: net: lib: small fixes

While looking at using 'lib.sh' for the MPTCP selftests [1], we found
some small issues with 'lib.sh'. Here they are:

- Patch 1: fix 'errexit' (set -e) support with busywait. 'errexit' is
  supported in some functions, not all. A fix for v6.8+.

- Patch 2: avoid confusing error messages linked to the cleaning part
  when the netns setup fails. A fix for v6.8+.

- Patch 3: set a variable as local to avoid accidentally changing the
  value of a another one with the same name on the caller side. A fix
  for v6.10-rc1+.

Link: https://lore.kernel.org/mptcp/5f4615c3-0621-43c5-ad25-55747a4350ce@kernel.org/T/
====================

Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-0-b3afadd368c9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoselftests: net: lib: set 'i' as local
Matthieu Baerts (NGI0) [Wed, 5 Jun 2024 09:21:18 +0000 (11:21 +0200)]
selftests: net: lib: set 'i' as local

Without this, the 'i' variable declared before could be overridden by
accident, e.g.

  for i in "${@}"; do
      __ksft_status_merge "${i}"  ## 'i' has been modified
      foo "${i}"                  ## using 'i' with an unexpected value
  done

After a quick look, it looks like 'i' is currently not used after having
been modified in __ksft_status_merge(), but still, better be safe than
sorry. I saw this while modifying the same file, not because I suspected
an issue somewhere.

Fixes: 596c8819cb78 ("selftests: forwarding: Have RET track kselftest framework constants")
Acked-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-3-b3afadd368c9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoselftests: net: lib: avoid error removing empty netns name
Matthieu Baerts (NGI0) [Wed, 5 Jun 2024 09:21:17 +0000 (11:21 +0200)]
selftests: net: lib: avoid error removing empty netns name

If there is an error to create the first netns with 'setup_ns()',
'cleanup_ns()' will be called with an empty string as first parameter.

The consequences is that 'cleanup_ns()' will try to delete an invalid
netns, and wait 20 seconds if the netns list is empty.

Instead of just checking if the name is not empty, convert the string
separated by spaces to an array. Manipulating the array is cleaner, and
calling 'cleanup_ns()' with an empty array will be a no-op.

Fixes: 25ae948b4478 ("selftests/net: add lib.sh")
Cc: stable@vger.kernel.org
Acked-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-2-b3afadd368c9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoselftests: net: lib: support errexit with busywait
Matthieu Baerts (NGI0) [Wed, 5 Jun 2024 09:21:16 +0000 (11:21 +0200)]
selftests: net: lib: support errexit with busywait

If errexit is enabled ('set -e'), loopy_wait -- or busywait and others
using it -- will stop after the first failure.

Note that if the returned status of loopy_wait is checked, and even if
errexit is enabled, Bash will not stop at the first error.

Fixes: 25ae948b4478 ("selftests/net: add lib.sh")
Cc: stable@vger.kernel.org
Acked-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-1-b3afadd368c9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoselftests/bpf: Add btf_field_iter selftests
Alan Maguire [Wed, 5 Jun 2024 15:33:14 +0000 (16:33 +0100)]
selftests/bpf: Add btf_field_iter selftests

The added selftests verify that for every BTF kind we iterate correctly
over consituent strings and ids.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240605153314.3727466-1-alan.maguire@oracle.com
13 months agoselftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT
Yonghong Song [Wed, 5 Jun 2024 20:12:03 +0000 (13:12 -0700)]
selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT

Alexei reported that send_signal test may fail with nested CONFIG_PARAVIRT
configs. In this particular case, the base VM is AMD with 166 cpus, and I
run selftests with regular qemu on top of that and indeed send_signal test
failed. I also tried with an Intel box with 80 cpus and there is no issue.

The main qemu command line includes:

  -enable-kvm -smp 16 -cpu host

The failure log looks like:

  $ ./test_progs -t send_signal
  [   48.501588] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [test_progs:2225]
  [   48.503622] Modules linked in: bpf_testmod(O)
  [   48.503622] CPU: 9 PID: 2225 Comm: test_progs Tainted: G           O       6.9.0-08561-g2c1713a8f1c9-dirty #69
  [   48.507629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
  [   48.511635] RIP: 0010:handle_softirqs+0x71/0x290
  [   48.511635] Code: [...] 10 0a 00 00 00 31 c0 65 66 89 05 d5 f4 fa 7e fb bb ff ff ff ff <49> c7 c2 cb
  [   48.518527] RSP: 0018:ffffc90000310fa0 EFLAGS: 00000246
  [   48.519579] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 00000000000006e0
  [   48.522526] RDX: 0000000000000006 RSI: ffff88810791ae80 RDI: 0000000000000000
  [   48.523587] RBP: ffffc90000fabc88 R08: 00000005a0af4f7f R09: 0000000000000000
  [   48.525525] R10: 0000000561d2f29c R11: 0000000000006534 R12: 0000000000000280
  [   48.528525] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
  [   48.528525] FS:  00007f2f2885cd00(0000) GS:ffff888237c40000(0000) knlGS:0000000000000000
  [   48.531600] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   48.535520] CR2: 00007f2f287059f0 CR3: 0000000106a28002 CR4: 00000000003706f0
  [   48.537538] Call Trace:
  [   48.537538]  <IRQ>
  [   48.537538]  ? watchdog_timer_fn+0x1cd/0x250
  [   48.539590]  ? lockup_detector_update_enable+0x50/0x50
  [   48.539590]  ? __hrtimer_run_queues+0xff/0x280
  [   48.542520]  ? hrtimer_interrupt+0x103/0x230
  [   48.544524]  ? __sysvec_apic_timer_interrupt+0x4f/0x140
  [   48.545522]  ? sysvec_apic_timer_interrupt+0x3a/0x90
  [   48.547612]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
  [   48.547612]  ? handle_softirqs+0x71/0x290
  [   48.547612]  irq_exit_rcu+0x63/0x80
  [   48.551585]  sysvec_apic_timer_interrupt+0x75/0x90
  [   48.552521]  </IRQ>
  [   48.553529]  <TASK>
  [   48.553529]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
  [   48.555609] RIP: 0010:finish_task_switch.isra.0+0x90/0x260
  [   48.556526] Code: [...] 9f 58 0a 00 00 48 85 db 0f 85 89 01 00 00 4c 89 ff e8 53 d9 bd 00 fb 66 90 <4d> 85 ed 74
  [   48.562524] RSP: 0018:ffffc90000fabd38 EFLAGS: 00000282
  [   48.563589] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83385620
  [   48.563589] RDX: ffff888237c73ae4 RSI: 0000000000000000 RDI: ffff888237c6fd00
  [   48.568521] RBP: ffffc90000fabd68 R08: 0000000000000000 R09: 0000000000000000
  [   48.569528] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881009d0000
  [   48.573525] R13: ffff8881024e5400 R14: ffff88810791ae80 R15: ffff888237c6fd00
  [   48.575614]  ? finish_task_switch.isra.0+0x8d/0x260
  [   48.576523]  __schedule+0x364/0xac0
  [   48.577535]  schedule+0x2e/0x110
  [   48.578555]  pipe_read+0x301/0x400
  [   48.579589]  ? destroy_sched_domains_rcu+0x30/0x30
  [   48.579589]  vfs_read+0x2b3/0x2f0
  [   48.579589]  ksys_read+0x8b/0xc0
  [   48.583590]  do_syscall_64+0x3d/0xc0
  [   48.583590]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
  [   48.586525] RIP: 0033:0x7f2f28703fa1
  [   48.587592] Code: [...] 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d c5 23 14 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0
  [   48.593534] RSP: 002b:00007ffd90f8cf88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
  [   48.595589] RAX: ffffffffffffffda RBX: 00007ffd90f8d5e8 RCX: 00007f2f28703fa1
  [   48.595589] RDX: 0000000000000001 RSI: 00007ffd90f8cfb0 RDI: 0000000000000006
  [   48.599592] RBP: 00007ffd90f8d2f0 R08: 0000000000000064 R09: 0000000000000000
  [   48.602527] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
  [   48.603589] R13: 00007ffd90f8d608 R14: 00007f2f288d8000 R15: 0000000000f6bdb0
  [   48.605527]  </TASK>

In the test, two processes are communicating through pipe. Further debugging
with strace found that the above splat is triggered as read() syscall could
not receive the data even if the corresponding write() syscall in another
process successfully wrote data into the pipe.

The failed subtest is "send_signal_perf". The corresponding perf event has
sample_period 1 and config PERF_COUNT_SW_CPU_CLOCK. sample_period 1 means every
overflow event will trigger a call to the BPF program. So I suspect this may
overwhelm the system. So I increased the sample_period to 100,000 and the test
passed. The sample_period 10,000 still has the test failed.

In other parts of selftest, e.g., [1], sample_freq is used instead. So I
decided to use sample_freq = 1,000 since the test can pass as well.

  [1] https://lore.kernel.org/bpf/20240604070700.3032142-1-song@kernel.org/

Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240605201203.2603846-1-yonghong.song@linux.dev
13 months agoMerge branch 'tcp-small-code-reorg'
Paolo Abeni [Thu, 6 Jun 2024 13:18:06 +0000 (15:18 +0200)]
Merge branch 'tcp-small-code-reorg'

Eric Dumazet says:

====================
tcp: small code reorg

Replace a WARN_ON_ONCE() that never triggered
to DEBUG_NET_WARN_ON_ONCE() in reqsk_free().

Move inet_reqsk_alloc() and reqsk_alloc()
to inet_connection_sock.c, to unclutter
net/ipv4/tcp_input.c and include/net/request_sock.h
====================

Link: https://lore.kernel.org/r/20240605071553.1365557-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agotcp: move reqsk_alloc() to inet_connection_sock.c
Eric Dumazet [Wed, 5 Jun 2024 07:15:53 +0000 (07:15 +0000)]
tcp: move reqsk_alloc() to inet_connection_sock.c

reqsk_alloc() has a single caller, no need to expose it
in include/net/request_sock.h.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agotcp: move inet_reqsk_alloc() close to inet_reqsk_clone()
Eric Dumazet [Wed, 5 Jun 2024 07:15:52 +0000 (07:15 +0000)]
tcp: move inet_reqsk_alloc() close to inet_reqsk_clone()

inet_reqsk_alloc() does not belong to tcp_input.c,
move it to inet_connection_sock.c instead.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agotcp: small changes in reqsk_put() and reqsk_free()
Eric Dumazet [Wed, 5 Jun 2024 07:15:51 +0000 (07:15 +0000)]
tcp: small changes in reqsk_put() and reqsk_free()

In reqsk_free(), use DEBUG_NET_WARN_ON_ONCE()
instead of WARN_ON_ONCE() for a condition which never fired.

In reqsk_put() directly call __reqsk_free(), there is no
point checking rsk_refcnt again right after a transition to zero.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agoMerge branch 'mptcp-misc-cleanups'
Paolo Abeni [Thu, 6 Jun 2024 13:13:48 +0000 (15:13 +0200)]
Merge branch 'mptcp-misc-cleanups'

Matthieu Baerts says:

====================
mptcp: misc. cleanups

Here is a small collection of miscellaneous cleanups:

- Patch 1 uses an MPTCP helper, instead of a TCP one, to do the same
  thing.

- Patch 2 adds a similar MPTCP helper, instead of using a TCP one
  directly.

- Patch 3 uses more appropriated terms in comments.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
====================

Link: https://lore.kernel.org/r/20240605-upstream-net-next-20240604-misc-cleanup-v1-0-ae2e35c3ecc5@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agomptcp: refer to 'MPTCP' socket in comments
Davide Caratti [Wed, 5 Jun 2024 07:15:42 +0000 (09:15 +0200)]
mptcp: refer to 'MPTCP' socket in comments

We used to call it 'master' socket at the early stages of MPTCP
development, but the correct wording is 'MPTCP' socket opposed to 'TCP
subflows': convert the last 3 comments to use a more appropriate term.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agomptcp: add mptcp_space_from_win helper
Geliang Tang [Wed, 5 Jun 2024 07:15:41 +0000 (09:15 +0200)]
mptcp: add mptcp_space_from_win helper

As a wrapper of __tcp_space_from_win(), this patch adds a MPTCP dedicated
space_from_win helper mptcp_space_from_win() in protocol.h to paired with
mptcp_win_from_space().

Use it instead of __tcp_space_from_win() in both mptcp_rcv_space_adjust()
and mptcp_set_rcvlowat().

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agomptcp: use mptcp_win_from_space helper
Geliang Tang [Wed, 5 Jun 2024 07:15:40 +0000 (09:15 +0200)]
mptcp: use mptcp_win_from_space helper

The MPTCP dedicated win_from_space helper mptcp_win_from_space() is defined
in protocol.h, use it in mptcp_rcv_space_adjust() instead of using the TCP
one. Here scaling_ratio is the same as msk->scaling_ratio.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agonet: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool()
Su Hui [Wed, 5 Jun 2024 03:47:43 +0000 (11:47 +0800)]
net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool()

Clang static checker (scan-build) warning:
net/ethtool/ioctl.c:line 2233, column 2
Called function pointer is null (null dereference).

Return '-EOPNOTSUPP' when 'ops->get_ethtool_phy_stats' is NULL to fix
this typo error.

Fixes: 201ed315f967 ("net/ethtool/ioctl: split ethtool_get_phy_stats into multiple helpers")
Signed-off-by: Su Hui <suhui@nfschina.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Hariprasad Kelam <hkelam@marvell.com>
Link: https://lore.kernel.org/r/20240605034742.921751-1-suhui@nfschina.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agonet: allow rps/rfs related configs to be switched
Jason Xing [Wed, 5 Jun 2024 02:29:32 +0000 (10:29 +0800)]
net: allow rps/rfs related configs to be switched

After John Sperbeck reported a compile error if the CONFIG_RFS_ACCEL
is off, I found that I cannot easily enable/disable the config
because of lack of the prompt when using 'make menuconfig'. Therefore,
I decided to change rps/rfc related configs altogether.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Link: https://lore.kernel.org/r/20240605022932.33703-1-kerneljasonxing@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agoipv6: fix possible race in __fib6_drop_pcpu_from()
Eric Dumazet [Tue, 4 Jun 2024 19:35:49 +0000 (19:35 +0000)]
ipv6: fix possible race in __fib6_drop_pcpu_from()

syzbot found a race in __fib6_drop_pcpu_from() [1]

If compiler reads more than once (*ppcpu_rt),
second read could read NULL, if another cpu clears
the value in rt6_get_pcpu_route().

Add a READ_ONCE() to prevent this race.

Also add rcu_read_lock()/rcu_read_unlock() because
we rely on RCU protection while dereferencing pcpu_rt.

[1]

Oops: general protection fault, probably for non-canonical address 0xdffffc0000000012: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097]
CPU: 0 PID: 7543 Comm: kworker/u8:17 Not tainted 6.10.0-rc1-syzkaller-00013-g2bfcfd584ff5 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
Workqueue: netns cleanup_net
 RIP: 0010:__fib6_drop_pcpu_from.part.0+0x10a/0x370 net/ipv6/ip6_fib.c:984
Code: f8 48 c1 e8 03 80 3c 28 00 0f 85 16 02 00 00 4d 8b 3f 4d 85 ff 74 31 e8 74 a7 fa f7 49 8d bf 90 00 00 00 48 89 f8 48 c1 e8 03 <80> 3c 28 00 0f 85 1e 02 00 00 49 8b 87 90 00 00 00 48 8b 0c 24 48
RSP: 0018:ffffc900040df070 EFLAGS: 00010206
RAX: 0000000000000012 RBX: 0000000000000001 RCX: ffffffff89932e16
RDX: ffff888049dd1e00 RSI: ffffffff89932d7c RDI: 0000000000000091
RBP: dffffc0000000000 R08: 0000000000000005 R09: 0000000000000007
R10: 0000000000000001 R11: 0000000000000006 R12: ffff88807fa080b8
R13: fffffbfff1a9a07d R14: ffffed100ff41022 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff8880b9200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b32c26000 CR3: 000000005d56e000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
  __fib6_drop_pcpu_from net/ipv6/ip6_fib.c:966 [inline]
  fib6_drop_pcpu_from net/ipv6/ip6_fib.c:1027 [inline]
  fib6_purge_rt+0x7f2/0x9f0 net/ipv6/ip6_fib.c:1038
  fib6_del_route net/ipv6/ip6_fib.c:1998 [inline]
  fib6_del+0xa70/0x17b0 net/ipv6/ip6_fib.c:2043
  fib6_clean_node+0x426/0x5b0 net/ipv6/ip6_fib.c:2205
  fib6_walk_continue+0x44f/0x8d0 net/ipv6/ip6_fib.c:2127
  fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2175
  fib6_clean_tree+0xd7/0x120 net/ipv6/ip6_fib.c:2255
  __fib6_clean_all+0x100/0x2d0 net/ipv6/ip6_fib.c:2271
  rt6_sync_down_dev net/ipv6/route.c:4906 [inline]
  rt6_disable_ip+0x7ed/0xa00 net/ipv6/route.c:4911
  addrconf_ifdown.isra.0+0x117/0x1b40 net/ipv6/addrconf.c:3855
  addrconf_notify+0x223/0x19e0 net/ipv6/addrconf.c:3778
  notifier_call_chain+0xb9/0x410 kernel/notifier.c:93
  call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1992
  call_netdevice_notifiers_extack net/core/dev.c:2030 [inline]
  call_netdevice_notifiers net/core/dev.c:2044 [inline]
  dev_close_many+0x333/0x6a0 net/core/dev.c:1585
  unregister_netdevice_many_notify+0x46d/0x19f0 net/core/dev.c:11193
  unregister_netdevice_many net/core/dev.c:11276 [inline]
  default_device_exit_batch+0x85b/0xae0 net/core/dev.c:11759
  ops_exit_list+0x128/0x180 net/core/net_namespace.c:178
  cleanup_net+0x5b7/0xbf0 net/core/net_namespace.c:640
  process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
  process_scheduled_works kernel/workqueue.c:3312 [inline]
  worker_thread+0x6c8/0xf70 kernel/workqueue.c:3393
  kthread+0x2c1/0x3a0 kernel/kthread.c:389
  ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

Fixes: d52d3997f843 ("ipv6: Create percpu rt6_info")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/r/20240604193549.981839-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agoMerge branch 'af_unix-fix-lockless-access-of-sk-sk_state-and-others-fields'
Paolo Abeni [Thu, 6 Jun 2024 10:57:17 +0000 (12:57 +0200)]
Merge branch 'af_unix-fix-lockless-access-of-sk-sk_state-and-others-fields'

Kuniyuki Iwashima says:

====================
af_unix: Fix lockless access of sk->sk_state and others fields.

The patch 1 fixes a bug where SOCK_DGRAM's sk->sk_state is changed
to TCP_CLOSE even if the socket is connect()ed to another socket.

The rest of this series annotates lockless accesses to the following
fields.

  * sk->sk_state
  * sk->sk_sndbuf
  * net->unx.sysctl_max_dgram_qlen
  * sk->sk_receive_queue.qlen
  * sk->sk_shutdown

Note that with this series there is skb_queue_empty() left in
unix_dgram_disconnected() that needs to be changed to lockless
version, and unix_peer(other) access there should be protected
by unix_state_lock().

This will require some refactoring, so another series will follow.

Changes:
  v2:
    * Patch 1: Fix wrong double lock

  v1: https://lore.kernel.org/netdev/20240603143231.62085-1-kuniyu@amazon.com/
====================

Link: https://lore.kernel.org/r/20240604165241.44758-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agoaf_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill().
Kuniyuki Iwashima [Tue, 4 Jun 2024 16:52:41 +0000 (09:52 -0700)]
af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill().

While dumping sockets via UNIX_DIAG, we do not hold unix_state_lock().

Let's use READ_ONCE() to read sk->sk_shutdown.

Fixes: e4e541a84863 ("sock-diag: Report shutdown for inet and unix sockets (v2)")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agoaf_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen().
Kuniyuki Iwashima [Tue, 4 Jun 2024 16:52:40 +0000 (09:52 -0700)]
af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen().

We can dump the socket queue length via UNIX_DIAG by specifying
UDIAG_SHOW_RQLEN.

If sk->sk_state is TCP_LISTEN, we return the recv queue length,
but here we do not hold recvq lock.

Let's use skb_queue_len_lockless() in sk_diag_show_rqlen().

Fixes: c9da99e6475f ("unix_diag: Fixup RQLEN extension report")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agoaf_unix: Use skb_queue_empty_lockless() in unix_release_sock().
Kuniyuki Iwashima [Tue, 4 Jun 2024 16:52:39 +0000 (09:52 -0700)]
af_unix: Use skb_queue_empty_lockless() in unix_release_sock().

If the socket type is SOCK_STREAM or SOCK_SEQPACKET, unix_release_sock()
checks the length of the peer socket's recvq under unix_state_lock().

However, unix_stream_read_generic() calls skb_unlink() after releasing
the lock.  Also, for SOCK_SEQPACKET, __skb_try_recv_datagram() unlinks
skb without unix_state_lock().

Thues, unix_state_lock() does not protect qlen.

Let's use skb_queue_empty_lockless() in unix_release_sock().

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agoaf_unix: Use unix_recvq_full_lockless() in unix_stream_connect().
Kuniyuki Iwashima [Tue, 4 Jun 2024 16:52:38 +0000 (09:52 -0700)]
af_unix: Use unix_recvq_full_lockless() in unix_stream_connect().

Once sk->sk_state is changed to TCP_LISTEN, it never changes.

unix_accept() takes advantage of this characteristics; it does not
hold the listener's unix_state_lock() and only acquires recvq lock
to pop one skb.

It means unix_state_lock() does not prevent the queue length from
changing in unix_stream_connect().

Thus, we need to use unix_recvq_full_lockless() to avoid data-race.

Now we remove unix_recvq_full() as no one uses it.

Note that we can remove READ_ONCE() for sk->sk_max_ack_backlog in
unix_recvq_full_lockless() because of the following reasons:

  (1) For SOCK_DGRAM, it is a written-once field in unix_create1()

  (2) For SOCK_STREAM and SOCK_SEQPACKET, it is changed under the
      listener's unix_state_lock() in unix_listen(), and we hold
      the lock in unix_stream_connect()

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agoaf_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen.
Kuniyuki Iwashima [Tue, 4 Jun 2024 16:52:37 +0000 (09:52 -0700)]
af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen.

net->unx.sysctl_max_dgram_qlen is exposed as a sysctl knob and can be
changed concurrently.

Let's use READ_ONCE() in unix_create1().

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agoaf_unix: Annotate data-races around sk->sk_sndbuf.
Kuniyuki Iwashima [Tue, 4 Jun 2024 16:52:36 +0000 (09:52 -0700)]
af_unix: Annotate data-races around sk->sk_sndbuf.

sk_setsockopt() changes sk->sk_sndbuf under lock_sock(), but it's
not used in af_unix.c.

Let's use READ_ONCE() to read sk->sk_sndbuf in unix_writable(),
unix_dgram_sendmsg(), and unix_stream_sendmsg().

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
13 months agoaf_unix: Annotate data-races around sk->sk_state in UNIX_DIAG.
Kuniyuki Iwashima [Tue, 4 Jun 2024 16:52:35 +0000 (09:52 -0700)]
af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG.

While dumping AF_UNIX sockets via UNIX_DIAG, sk->sk_state is read
locklessly.

Let's use READ_ONCE() there.

Note that the result could be inconsistent if the socket is dumped
during the state change.  This is common for other SOCK_DIAG and
similar interfaces.

Fixes: c9da99e6475f ("unix_diag: Fixup RQLEN extension report")
Fixes: 2aac7a2cb0d9 ("unix_diag: Pending connections IDs NLA")
Fixes: 45a96b9be6ec ("unix_diag: Dumping all sockets core")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>