]> www.infradead.org Git - users/dwmw2/linux.git/log
users/dwmw2/linux.git
2 years agonet/tcp: Wire TCP-AO to request sockets
Dmitry Safonov [Mon, 23 Oct 2023 19:22:02 +0000 (20:22 +0100)]
net/tcp: Wire TCP-AO to request sockets

Now when the new request socket is created from the listening socket,
it's recorded what MKT was used by the peer. tcp_rsk_used_ao() is
a new helper for checking if TCP-AO option was used to create the
request socket.
tcp_ao_copy_all_matching() will copy all keys that match the peer on the
request socket, as well as preparing them for the usage (creating
traffic keys).

Co-developed-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Co-developed-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet/tcp: Add TCP-AO sign to twsk
Dmitry Safonov [Mon, 23 Oct 2023 19:22:01 +0000 (20:22 +0100)]
net/tcp: Add TCP-AO sign to twsk

Add support for sockets in time-wait state.
ao_info as well as all keys are inherited on transition to time-wait
socket. The lifetime of ao_info is now protected by ref counter, so
that tcp_ao_destroy_sock() will destruct it only when the last user is
gone.

Co-developed-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Co-developed-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet/tcp: Add AO sign to RST packets
Dmitry Safonov [Mon, 23 Oct 2023 19:22:00 +0000 (20:22 +0100)]
net/tcp: Add AO sign to RST packets

Wire up sending resets to TCP-AO hashing.

Co-developed-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Co-developed-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet/tcp: Add tcp_parse_auth_options()
Dmitry Safonov [Mon, 23 Oct 2023 19:21:59 +0000 (20:21 +0100)]
net/tcp: Add tcp_parse_auth_options()

Introduce a helper that:
(1) shares the common code with TCP-MD5 header options parsing
(2) looks for hash signature only once for both TCP-MD5 and TCP-AO
(3) fails with -EEXIST if any TCP sign option is present twice, see
    RFC5925 (2.2):
    ">> A single TCP segment MUST NOT have more than one TCP-AO in its
    options sequence. When multiple TCP-AOs appear, TCP MUST discard
    the segment."

Co-developed-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Co-developed-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet/tcp: Add TCP-AO sign to outgoing packets
Dmitry Safonov [Mon, 23 Oct 2023 19:21:58 +0000 (20:21 +0100)]
net/tcp: Add TCP-AO sign to outgoing packets

Using precalculated traffic keys, sign TCP segments as prescribed by
RFC5925. Per RFC, TCP header options are included in sign calculation:
"The TCP header, by default including options, and where the TCP
checksum and TCP-AO MAC fields are set to zero, all in network-
byte order." (5.1.3)

tcp_ao_hash_header() has exclude_options parameter to optionally exclude
TCP header from hash calculation, as described in RFC5925 (9.1), this is
needed for interaction with middleboxes that may change "some TCP
options". This is wired up to AO key flags and setsockopt() later.

Similarly to TCP-MD5 hash TCP segment fragments.

From this moment a user can start sending TCP-AO signed segments with
one of crypto ahash algorithms from supported by Linux kernel. It can
have a user-specified MAC length, to either save TCP option header space
or provide higher protection using a longer signature.
The inbound segments are not yet verified, TCP-AO option is ignored and
they are accepted.

Co-developed-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Co-developed-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet/tcp: Calculate TCP-AO traffic keys
Dmitry Safonov [Mon, 23 Oct 2023 19:21:57 +0000 (20:21 +0100)]
net/tcp: Calculate TCP-AO traffic keys

Add traffic key calculation the way it's described in RFC5926.
Wire it up to tcp_finish_connect() and cache the new keys straight away
on already established TCP connections.

Co-developed-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Co-developed-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet/tcp: Prevent TCP-MD5 with TCP-AO being set
Dmitry Safonov [Mon, 23 Oct 2023 19:21:56 +0000 (20:21 +0100)]
net/tcp: Prevent TCP-MD5 with TCP-AO being set

Be as conservative as possible: if there is TCP-MD5 key for a given peer
regardless of L3 interface - don't allow setting TCP-AO key for the same
peer. According to RFC5925, TCP-AO is supposed to replace TCP-MD5 and
there can't be any switch between both on any connected tuple.
Later it can be relaxed, if there's a use, but in the beginning restrict
any intersection.

Note: it's still should be possible to set both TCP-MD5 and TCP-AO keys
on a listening socket for *different* peers.

Co-developed-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Co-developed-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet/tcp: Introduce TCP_AO setsockopt()s
Dmitry Safonov [Mon, 23 Oct 2023 19:21:55 +0000 (20:21 +0100)]
net/tcp: Introduce TCP_AO setsockopt()s

Add 3 setsockopt()s:
1. TCP_AO_ADD_KEY to add a new Master Key Tuple (MKT) on a socket
2. TCP_AO_DEL_KEY to delete present MKT from a socket
3. TCP_AO_INFO to change flags, Current_key/RNext_key on a TCP-AO sk

Userspace has to introduce keys on every socket it wants to use TCP-AO
option on, similarly to TCP_MD5SIG/TCP_MD5SIG_EXT.
RFC5925 prohibits definition of MKTs that would match the same peer,
so do sanity checks on the data provided by userspace. Be as
conservative as possible, including refusal of defining MKT on
an established connection with no AO, removing the key in-use and etc.

(1) and (2) are to be used by userspace key manager to add/remove keys.
(3) main purpose is to set RNext_key, which (as prescribed by RFC5925)
is the KeyID that will be requested in TCP-AO header from the peer to
sign their segments with.

At this moment the life of ao_info ends in tcp_v4_destroy_sock().

Co-developed-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Co-developed-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet/tcp: Add TCP-AO config and structures
Dmitry Safonov [Mon, 23 Oct 2023 19:21:54 +0000 (20:21 +0100)]
net/tcp: Add TCP-AO config and structures

Introduce new kernel config option and common structures as well as
helpers to be used by TCP-AO code.

Co-developed-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Co-developed-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet/tcp: Prepare tcp_md5sig_pool for TCP-AO
Dmitry Safonov [Mon, 23 Oct 2023 19:21:53 +0000 (20:21 +0100)]
net/tcp: Prepare tcp_md5sig_pool for TCP-AO

TCP-AO, similarly to TCP-MD5, needs to allocate tfms on a slow-path,
which is setsockopt() and use crypto ahash requests on fast paths,
which are RX/TX softirqs. Also, it needs a temporary/scratch buffer
for preparing the hash.

Rework tcp_md5sig_pool in order to support other hashing algorithms
than MD5. It will make it possible to share pre-allocated crypto_ahash
descriptors and scratch area between all TCP hash users.

Internally tcp_sigpool calls crypto_clone_ahash() API over pre-allocated
crypto ahash tfm. Kudos to Herbert, who provided this new crypto API.

I was a little concerned over GFP_ATOMIC allocations of ahash and
crypto_request in RX/TX (see tcp_sigpool_start()), so I benchmarked both
"backends" with different algorithms, using patched version of iperf3[2].
On my laptop with i7-7600U @ 2.80GHz:

                         clone-tfm                per-CPU-requests
TCP-MD5                  2.25 Gbits/sec           2.30 Gbits/sec
TCP-AO(hmac(sha1))       2.53 Gbits/sec           2.54 Gbits/sec
TCP-AO(hmac(sha512))     1.67 Gbits/sec           1.64 Gbits/sec
TCP-AO(hmac(sha384))     1.77 Gbits/sec           1.80 Gbits/sec
TCP-AO(hmac(sha224))     1.29 Gbits/sec           1.30 Gbits/sec
TCP-AO(hmac(sha3-512))    481 Mbits/sec            480 Mbits/sec
TCP-AO(hmac(md5))        2.07 Gbits/sec           2.12 Gbits/sec
TCP-AO(hmac(rmd160))     1.01 Gbits/sec            995 Mbits/sec
TCP-AO(cmac(aes128))     [not supporetd yet]      2.11 Gbits/sec

So, it seems that my concerns don't have strong grounds and per-CPU
crypto_request allocation can be dropped/removed from tcp_sigpool once
ciphers get crypto_clone_ahash() support.

[1]: https://lore.kernel.org/all/ZDefxOq6Ax0JeTRH@gondor.apana.org.au/T/#u
[2]: https://github.com/0x7f454c46/iperf/tree/tcp-md5-ao
Signed-off-by: Dmitry Safonov <dima@arista.com>
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMAINTAINERS: Remove linuxwwan@intel.com mailing list
Bagas Sanjaya [Wed, 25 Oct 2023 13:03:32 +0000 (20:03 +0700)]
MAINTAINERS: Remove linuxwwan@intel.com mailing list

Messages submitted to the ML bounce (address not found error). In
fact, the ML was mistagged as person maintainer instead of mailing
list.

Remove the ML to keep Cc: lists a bit shorter and not to spam
everyone's inbox with postmaster notifications.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231025130332.67995-2-bagasdotme@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'intel-wired-lan-driver-updates-for-2023-10-25-ice'
Jakub Kicinski [Fri, 27 Oct 2023 03:34:25 +0000 (20:34 -0700)]
Merge branch 'intel-wired-lan-driver-updates-for-2023-10-25-ice'

Jacob Keller says:

====================
Intel Wired LAN Driver Updates for 2023-10-25 (ice)

This series extends the ice driver with basic support for the E830 device
line. It does not include support for all device features, but enables basic
functionality to load and pass traffic.

Alice adds the 200G speed and PHY types supported by E830 hardware.

Dan extends the DDP package logic to support the E830 package segment.

Paul adds the basic registers and macros used by E830 hardware, and adds
support for handling variable length link status information from firmware.

Pawel removes some redundant zeroing of the PCI IDs list, and extends the
list to include the E830 device IDs.
====================

Link: https://lore.kernel.org/r/20231025214157.1222758-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: Hook up 4 E830 devices by adding their IDs
Pawel Chmielewski [Wed, 25 Oct 2023 21:41:57 +0000 (14:41 -0700)]
ice: Hook up 4 E830 devices by adding their IDs

As the previous patches provide support for E830 hardware, add E830
specific IDs to the PCI device ID table, so these devices can now be
probed by the kernel.

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231025214157.1222758-7-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: Remove redundant zeroing of the fields.
Pawel Chmielewski [Wed, 25 Oct 2023 21:41:56 +0000 (14:41 -0700)]
ice: Remove redundant zeroing of the fields.

Remove zeroing of the fields, as all the fields are in fact initialized
with zeros automatically

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231025214157.1222758-6-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: Add support for E830 DDP package segment
Dan Nowlin [Wed, 25 Oct 2023 21:41:55 +0000 (14:41 -0700)]
ice: Add support for E830 DDP package segment

Add support for E830 DDP package segment. For the E830 package,
signature buffers will not be included inline in the configuration
buffers. Instead, the signature buffers will be located in a
signature segment.

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
Co-developed-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231025214157.1222758-5-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: Add ice_get_link_status_datalen
Paul Greenwalt [Wed, 25 Oct 2023 21:41:54 +0000 (14:41 -0700)]
ice: Add ice_get_link_status_datalen

The Get Link Status data length can vary with different versions of
ice_aqc_get_link_status_data. Add ice_get_link_status_datalen() to return
datalen for the specific ice_aqc_get_link_status_data version.
Add new link partner fields to ice_aqc_get_link_status_data; PHY type,
FEC, and flow control.

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Co-developed-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
Signed-off-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231025214157.1222758-4-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: Add 200G speed/phy type use
Alice Michael [Wed, 25 Oct 2023 21:41:53 +0000 (14:41 -0700)]
ice: Add 200G speed/phy type use

Add the support for 200G phy speeds and the mapping for their
advertisement in link. Add the new PHY type bits for AQ command, as
needed for 200G E830 controllers.

Signed-off-by: Alice Michael <alice.michael@intel.com>
Co-developed-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
Signed-off-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231025214157.1222758-3-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: Add E830 device IDs, MAC type and registers
Paul Greenwalt [Wed, 25 Oct 2023 21:41:52 +0000 (14:41 -0700)]
ice: Add E830 device IDs, MAC type and registers

E830 is the 200G NIC family which uses the ice driver.

Add specific E830 registers. Embed macros to use proper register based on
(hw)->mac_type & name those macros to [ORIGINAL]_BY_MAC(hw). Registers
only available on one of the macs will need to be explicitly referred to
as E800_NAME instead of just NAME. PTP is not yet supported.

Co-developed-by: Milena Olech <milena.olech@intel.com>
Signed-off-by: Milena Olech <milena.olech@intel.com>
Co-developed-by: Dan Nowlin <dan.nowlin@intel.com>
Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
Co-developed-by: Scott Taylor <scott.w.taylor@intel.com>
Signed-off-by: Scott Taylor <scott.w.taylor@intel.com>
Co-developed-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
Signed-off-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231025214157.1222758-2-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'wireless-next-2023-10-26' of git://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Fri, 27 Oct 2023 03:27:57 +0000 (20:27 -0700)]
Merge tag 'wireless-next-2023-10-26' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.7

The third, and most likely the last, features pull request for v6.7.
Fixes all over and only few small new features.

Major changes:

iwlwifi
 - more Multi-Link Operation (MLO) work

ath12k
 - QCN9274: mesh support

ath11k
 - firmware-2.bin container file format support

* tag 'wireless-next-2023-10-26' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (155 commits)
  wifi: ray_cs: Remove unnecessary (void*) conversions
  Revert "wifi: ath11k: call ath11k_mac_fils_discovery() without condition"
  wifi: ath12k: Introduce and use ath12k_sta_to_arsta()
  wifi: ath12k: fix htt mlo-offset event locking
  wifi: ath12k: fix dfs-radar and temperature event locking
  wifi: ath11k: fix gtk offload status event locking
  wifi: ath11k: fix htt pktlog locking
  wifi: ath11k: fix dfs radar event locking
  wifi: ath11k: fix temperature event locking
  wifi: ath12k: rename the sc naming convention to ab
  wifi: ath12k: rename the wmi_sc naming convention to wmi_ab
  wifi: ath11k: add firmware-2.bin support
  wifi: ath11k: qmi: refactor ath11k_qmi_m3_load()
  wifi: rtw89: cleanup firmware elements parsing
  wifi: rt2x00: rework MT7620 PA/LNA RF calibration
  wifi: rt2x00: rework MT7620 channel config function
  wifi: rt2x00: improve MT7620 register initialization
  MAINTAINERS: wifi: rt2x00: drop Helmut Schaa
  wifi: wlcore: main: replace deprecated strncpy with strscpy
  wifi: wlcore: boot: replace deprecated strncpy with strscpy
  ...
====================

Link: https://lore.kernel.org/r/20231026090411.B2426C433CB@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'for-netdev' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/bpf...
Jakub Kicinski [Fri, 27 Oct 2023 03:02:40 +0000 (20:02 -0700)]
Merge tag 'for-netdev' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-10-26

We've added 51 non-merge commits during the last 10 day(s) which contain
a total of 75 files changed, 5037 insertions(+), 200 deletions(-).

The main changes are:

1) Add open-coded task, css_task and css iterator support.
   One of the use cases is customizable OOM victim selection via BPF,
   from Chuyi Zhou.

2) Fix BPF verifier's iterator convergence logic to use exact states
   comparison for convergence checks, from Eduard Zingerman,
   Andrii Nakryiko and Alexei Starovoitov.

3) Add BPF programmable net device where bpf_mprog defines the logic
   of its xmit routine. It can operate in L3 and L2 mode,
   from Daniel Borkmann and Nikolay Aleksandrov.

4) Batch of fixes for BPF per-CPU kptr and re-enable unit_size checking
   for global per-CPU allocator, from Hou Tao.

5) Fix libbpf which eagerly assumed that SHT_GNU_verdef ELF section
   was going to be present whenever a binary has SHT_GNU_versym section,
   from Andrii Nakryiko.

6) Fix BPF ringbuf correctness to fold smp_mb__before_atomic() into
   atomic_set_release(), from Paul E. McKenney.

7) Add a warning if NAPI callback missed xdp_do_flush() under
   CONFIG_DEBUG_NET which helps checking if drivers were missing
   the former, from Sebastian Andrzej Siewior.

8) Fix missed RCU read-lock in bpf_task_under_cgroup() which was throwing
   a warning under sleepable programs, from Yafang Shao.

9) Avoid unnecessary -EBUSY from htab_lock_bucket by disabling IRQ before
   checking map_locked, from Song Liu.

10) Make BPF CI linked_list failure test more robust,
    from Kumar Kartikeya Dwivedi.

11) Enable samples/bpf to be built as PIE in Fedora, from Viktor Malik.

12) Fix xsk starving when multiple xsk sockets were associated with
    a single xsk_buff_pool, from Albert Huang.

13) Clarify the signed modulo implementation for the BPF ISA standardization
    document that it uses truncated division, from Dave Thaler.

14) Improve BPF verifier's JEQ/JNE branch taken logic to also consider
    signed bounds knowledge, from Andrii Nakryiko.

15) Add an option to XDP selftests to use multi-buffer AF_XDP
    xdp_hw_metadata and mark used XDP programs as capable to use frags,
    from Larysa Zaremba.

16) Fix bpftool's BTF dumper wrt printing a pointer value and another
    one to fix struct_ops dump in an array, from Manu Bretelle.

* tag 'for-netdev' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (51 commits)
  netkit: Remove explicit active/peer ptr initialization
  selftests/bpf: Fix selftests broken by mitigations=off
  samples/bpf: Allow building with custom bpftool
  samples/bpf: Fix passing LDFLAGS to libbpf
  samples/bpf: Allow building with custom CFLAGS/LDFLAGS
  bpf: Add more WARN_ON_ONCE checks for mismatched alloc and free
  selftests/bpf: Add selftests for netkit
  selftests/bpf: Add netlink helper library
  bpftool: Extend net dump with netkit progs
  bpftool: Implement link show support for netkit
  libbpf: Add link-based API for netkit
  tools: Sync if_link uapi header
  netkit, bpf: Add bpf programmable net device
  bpf: Improve JEQ/JNE branch taken logic
  bpf: Fold smp_mb__before_atomic() into atomic_set_release()
  bpf: Fix unnecessary -EBUSY from htab_lock_bucket
  xsk: Avoid starving the xsk further down the list
  bpf: print full verifier states on infinite loop detection
  selftests/bpf: test if state loops are detected in a tricky case
  bpf: correct loop detection for iterators convergence
  ...
====================

Link: https://lore.kernel.org/r/20231026150509.2824-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMAINTAINERS: Maintainer change for ptp_vmw driver
Alexey Makhalov [Wed, 25 Oct 2023 23:19:31 +0000 (16:19 -0700)]
MAINTAINERS: Maintainer change for ptp_vmw driver

Deep has decided to transfer the maintainership of the VMware virtual
PTP clock driver (ptp_vmw) to Jeff. Update the MAINTAINERS file to
reflect this change.

Signed-off-by: Alexey Makhalov <amakhalov@vmware.com>
Acked-by: Deep Shah <sdeep@vmware.com>
Acked-by: Jeff Sipek <jsipek@vmware.com>
Link: https://lore.kernel.org/r/20231025231931.76842-1-amakhalov@vmware.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agobnxt_en: Fix 2 stray ethtool -S counters
Michael Chan [Thu, 26 Oct 2023 01:32:31 +0000 (18:32 -0700)]
bnxt_en: Fix 2 stray ethtool -S counters

The recent firmware interface change has added 2 counters in struct
rx_port_stats_ext. This caused 2 stray ethtool counters to be
displayed.

Since new counters are added from time to time, fix it so that the
ethtool logic will only display up to the maximum known counters.
These 2 counters are not used by production firmware yet.

Fixes: 754fbf604ff6 ("bnxt_en: Update firmware interface to 1.10.2.171")
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20231026013231.53271-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotools: ynl-gen: respect attr-cnt-name at the attr set level
Jakub Kicinski [Wed, 25 Oct 2023 18:27:39 +0000 (11:27 -0700)]
tools: ynl-gen: respect attr-cnt-name at the attr set level

Davide reports that we look for the attr-cnt-name in the wrong
object. We try to read it from the family, but the schema only
allows for it to exist at attr-set level.

Reported-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/all/CAKa-r6vCj+gPEUKpv7AsXqM77N6pB0evuh7myHq=585RA3oD5g@mail.gmail.com/
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231025182739.184706-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonetlink: specs: support conditional operations
Jakub Kicinski [Wed, 25 Oct 2023 16:22:53 +0000 (09:22 -0700)]
netlink: specs: support conditional operations

Page pool code is compiled conditionally, but the operations
are part of the shared netlink family. We can handle this
by reporting empty list of pools or -EOPNOTSUPP / -ENOSYS
but the cleanest way seems to be removing the ops completely
at compilation time. That way user can see that the page
pool ops are not present using genetlink introspection.
Same way they'd check if the kernel is "new enough" to
support the ops.

Extend the specs with the ability to specify the config
condition under which op (and its policies, etc.) should
be hidden.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231025162253.133159-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonetlink: make range pointers in policies const
Jakub Kicinski [Wed, 25 Oct 2023 16:22:04 +0000 (09:22 -0700)]
netlink: make range pointers in policies const

struct nla_policy is usually constant itself, but unless
we make the ranges inside constant we won't be able to
make range structs const. The ranges are not modified
by the core.

Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231025162204.132528-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet/mlx5: fix uninit value use
Przemek Kitszel [Wed, 25 Oct 2023 14:50:50 +0000 (16:50 +0200)]
net/mlx5: fix uninit value use

Avoid use of uninitialized state variable.

In case of mlx5e_tx_reporter_build_diagnose_output_sq_common() it's better
to still collect other data than bail out entirely.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/netdev/8bd30131-c9f2-4075-a575-7fa2793a1760@moroto.mountain
Fixes: d17f98bf7cc9 ("net/mlx5: devlink health: use retained error fmsg API")
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://lore.kernel.org/r/20231025145050.36114-1-przemyslaw.kitszel@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 26 Oct 2023 20:42:19 +0000 (13:42 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR.

Conflicts:

net/mac80211/rx.c
  91535613b609 ("wifi: mac80211: don't drop all unprotected public action frames")
  6c02fab72429 ("wifi: mac80211: split ieee80211_drop_unencrypted_mgmt() return value")

Adjacent changes:

drivers/net/ethernet/apm/xgene/xgene_enet_main.c
  61471264c018 ("net: ethernet: apm: Convert to platform remove callback returning void")
  d2ca43f30611 ("net: xgene: Fix unused xgene_enet_of_match warning for !CONFIG_OF")

net/vmw_vsock/virtio_transport.c
  64c99d2d6ada ("vsock/virtio: support to send non-linear skb")
  53b08c498515 ("vsock/virtio: initialize the_virtio_vsock before using VQs")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'net-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 26 Oct 2023 17:41:27 +0000 (07:41 -1000)]
Merge tag 'net-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from WiFi and netfilter.

  Most regressions addressed here come from quite old versions, with the
  exceptions of the iavf one and the WiFi fixes. No known outstanding
  reports or investigation.

  Fixes to fixes:

   - eth: iavf: in iavf_down, disable queues when removing the driver

  Previous releases - regressions:

   - sched: act_ct: additional checks for outdated flows

   - tcp: do not leave an empty skb in write queue

   - tcp: fix wrong RTO timeout when received SACK reneging

   - wifi: cfg80211: pass correct pointer to rdev_inform_bss()

   - eth: i40e: sync next_to_clean and next_to_process for programming
     status desc

   - eth: iavf: initialize waitqueues before starting watchdog_task

  Previous releases - always broken:

   - eth: r8169: fix data-races

   - eth: igb: fix potential memory leak in igb_add_ethtool_nfc_entry

   - eth: r8152: avoid writing garbage to the adapter's registers

   - eth: gtp: fix fragmentation needed check with gso"

* tag 'net-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (43 commits)
  iavf: in iavf_down, disable queues when removing the driver
  vsock/virtio: initialize the_virtio_vsock before using VQs
  net: ipv6: fix typo in comments
  net: ipv4: fix typo in comments
  net/sched: act_ct: additional checks for outdated flows
  netfilter: flowtable: GC pushes back packets to classic path
  i40e: Fix wrong check for I40E_TXR_FLAGS_WB_ON_ITR
  gtp: fix fragmentation needed check with gso
  gtp: uapi: fix GTPA_MAX
  Fix NULL pointer dereference in cn_filter()
  sfc: cleanup and reduce netlink error messages
  net/handshake: fix file ref count in handshake_nl_accept_doit()
  wifi: mac80211: don't drop all unprotected public action frames
  wifi: cfg80211: fix assoc response warning on failed links
  wifi: cfg80211: pass correct pointer to rdev_inform_bss()
  isdn: mISDN: hfcsusb: Spelling fix in comment
  tcp: fix wrong RTO timeout when received SACK reneging
  r8152: Block future register access if register access fails
  r8152: Rename RTL8152_UNPLUG to RTL8152_INACCESSIBLE
  r8152: Check for unplug in r8153b_ups_en() / r8153c_ups_en()
  ...

2 years agonetkit: Remove explicit active/peer ptr initialization
Nikolay Aleksandrov [Thu, 26 Oct 2023 09:41:05 +0000 (12:41 +0300)]
netkit: Remove explicit active/peer ptr initialization

Remove the explicit NULLing of active/peer pointers and rely on the
implicit one done at net device allocation.

Suggested-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231026094106.1505892-2-razor@blackwall.org
2 years agoselftests/bpf: Fix selftests broken by mitigations=off
Yafang Shao [Wed, 25 Oct 2023 03:11:44 +0000 (03:11 +0000)]
selftests/bpf: Fix selftests broken by mitigations=off

When we configure the kernel command line with 'mitigations=off' and set
the sysctl knob 'kernel.unprivileged_bpf_disabled' to 0, the commit
bc5bc309db45 ("bpf: Inherit system settings for CPU security mitigations")
causes issues in the execution of `test_progs -t verifier`. This is
because 'mitigations=off' bypasses Spectre v1 and Spectre v4 protections.

Currently, when a program requests to run in unprivileged mode
(kernel.unprivileged_bpf_disabled = 0), the BPF verifier may prevent
it from running due to the following conditions not being enabled:

  - bypass_spec_v1
  - bypass_spec_v4
  - allow_ptr_leaks
  - allow_uninit_stack

While 'mitigations=off' enables the first two conditions, it does not
enable the latter two. As a result, some test cases in
'test_progs -t verifier' that were expected to fail to run may run
successfully, while others still fail but with different error messages.
This makes it challenging to address them comprehensively.

Moreover, in the future, we may introduce more fine-grained control over
CPU mitigations, such as enabling only bypass_spec_v1 or bypass_spec_v4.

Given the complexity of the situation, rather than fixing each broken test
case individually, it's preferable to skip them when 'mitigations=off' is
in effect and introduce specific test cases for the new 'mitigations=off'
scenario. For instance, we can introduce new BTF declaration tags like
'__failure__nospec', '__failure_nospecv1' and '__failure_nospecv4'.

In this patch, the approach is to simply skip the broken test cases when
'mitigations=off' is enabled. The result of `test_progs -t verifier` as
follows after this commit,

Before this commit
==================

- without 'mitigations=off'
  - kernel.unprivileged_bpf_disabled = 2
    Summary: 74/948 PASSED, 388 SKIPPED, 0 FAILED
  - kernel.unprivileged_bpf_disabled = 0
    Summary: 74/1336 PASSED, 0 SKIPPED, 0 FAILED    <<<<
- with 'mitigations=off'
  - kernel.unprivileged_bpf_disabled = 2
    Summary: 74/948 PASSED, 388 SKIPPED, 0 FAILED
  - kernel.unprivileged_bpf_disabled = 0
    Summary: 63/1276 PASSED, 0 SKIPPED, 11 FAILED   <<<< 11 FAILED

After this commit
=================

- without 'mitigations=off'
  - kernel.unprivileged_bpf_disabled = 2
    Summary: 74/948 PASSED, 388 SKIPPED, 0 FAILED
  - kernel.unprivileged_bpf_disabled = 0
    Summary: 74/1336 PASSED, 0 SKIPPED, 0 FAILED    <<<<
- with this patch, with 'mitigations=off'
  - kernel.unprivileged_bpf_disabled = 2
    Summary: 74/948 PASSED, 388 SKIPPED, 0 FAILED
  - kernel.unprivileged_bpf_disabled = 0
    Summary: 74/948 PASSED, 388 SKIPPED, 0 FAILED   <<<< SKIPPED

Fixes: bc5bc309db45 ("bpf: Inherit system settings for CPU security mitigations")
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Closes: https://lore.kernel.org/bpf/CAADnVQKUBJqg+hHtbLeeC2jhoJAWqnmRAzXW3hmUCNSV9kx4sQ@mail.gmail.com
Link: https://lore.kernel.org/bpf/20231025031144.5508-1-laoar.shao@gmail.com
2 years agosamples/bpf: Allow building with custom bpftool
Viktor Malik [Wed, 25 Oct 2023 06:19:14 +0000 (08:19 +0200)]
samples/bpf: Allow building with custom bpftool

samples/bpf build its own bpftool boostrap to generate vmlinux.h as well
as some BPF objects. This is a redundant step if bpftool has been
already built, so update samples/bpf/Makefile such that it accepts a
path to bpftool passed via the BPFTOOL variable. The approach is
practically the same as tools/testing/selftests/bpf/Makefile uses.

Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/bd746954ac271b02468d8d951ff9f11e655d485b.1698213811.git.vmalik@redhat.com
2 years agosamples/bpf: Fix passing LDFLAGS to libbpf
Viktor Malik [Wed, 25 Oct 2023 06:19:13 +0000 (08:19 +0200)]
samples/bpf: Fix passing LDFLAGS to libbpf

samples/bpf/Makefile passes LDFLAGS=$(TPROGS_LDFLAGS) to libbpf build
without surrounding quotes, which may cause compilation errors when
passing custom TPROGS_USER_LDFLAGS.

For example:

    $ make -C samples/bpf/ TPROGS_USER_LDFLAGS="-Wl,--as-needed -specs=/usr/lib/gcc/x86_64-redhat-linux/13/libsanitizer.spec"
    make: Entering directory './samples/bpf'
    make -C ../../ M=./samples/bpf BPF_SAMPLES_PATH=./samples/bpf
    make[1]: Entering directory '.'
    make -C ./samples/bpf/../../tools/lib/bpf RM='rm -rf' EXTRA_CFLAGS="-Wall -O2 -Wmissing-prototypes -Wstrict-prototypes  -I./usr/include -I./tools/testing/selftests/bpf/ -I./samples/bpf/libbpf/include -I./tools/include -I./tools/perf -I./tools/lib -DHAVE_ATTR_TEST=0" \
            LDFLAGS=-Wl,--as-needed -specs=/usr/lib/gcc/x86_64-redhat-linux/13/libsanitizer.spec srctree=./samples/bpf/../../ \
            O= OUTPUT=./samples/bpf/libbpf/ DESTDIR=./samples/bpf/libbpf prefix= \
            ./samples/bpf/libbpf/libbpf.a install_headers
    make: invalid option -- 'c'
    make: invalid option -- '='
    make: invalid option -- '/'
    make: invalid option -- 'u'
    make: invalid option -- '/'
    [...]

Fix the error by properly quoting $(TPROGS_LDFLAGS).

Suggested-by: Donald Zickus <dzickus@redhat.com>
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/c690de6671cc6c983d32a566d33fd7eabd18b526.1698213811.git.vmalik@redhat.com
2 years agosamples/bpf: Allow building with custom CFLAGS/LDFLAGS
Viktor Malik [Wed, 25 Oct 2023 06:19:12 +0000 (08:19 +0200)]
samples/bpf: Allow building with custom CFLAGS/LDFLAGS

Currently, it is not possible to specify custom flags when building
samples/bpf. The flags are defined in TPROGS_CFLAGS/TPROGS_LDFLAGS
variables, however, when trying to override those from the make command,
compilation fails.

For example, when trying to build with PIE:

    $ make -C samples/bpf TPROGS_CFLAGS="-fpie" TPROGS_LDFLAGS="-pie"

This is because samples/bpf/Makefile updates these variables, especially
appends include paths to TPROGS_CFLAGS and these updates are overridden
by setting the variables from the make command.

This patch introduces variables TPROGS_USER_CFLAGS/TPROGS_USER_LDFLAGS
for this purpose, which can be set from the make command and their
values are propagated to TPROGS_CFLAGS/TPROGS_LDFLAGS.

Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/2d81100b830a71f0e72329cc7781edaefab75f62.1698213811.git.vmalik@redhat.com
2 years agobareudp: use ports to lookup route
Beniamino Galvani [Wed, 25 Oct 2023 09:44:41 +0000 (11:44 +0200)]
bareudp: use ports to lookup route

The source and destination ports should be taken into account when
determining the route destination; they can affect the result, for
example in case there are routing rules defined.

Signed-off-by: Beniamino Galvani <b.galvani@gmail.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20231025094441.417464-1-b.galvani@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agobpf: Add more WARN_ON_ONCE checks for mismatched alloc and free
Hou Tao [Sat, 21 Oct 2023 01:49:59 +0000 (09:49 +0800)]
bpf: Add more WARN_ON_ONCE checks for mismatched alloc and free

There are two possible mismatched alloc and free cases in BPF memory
allocator:

1) allocate from cache X but free by cache Y with a different unit_size
2) allocate from per-cpu cache but free by kmalloc cache or vice versa

So add more WARN_ON_ONCE checks in free_bulk() and __free_by_rcu() to
spot these mismatched alloc and free early.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231021014959.3563841-1-houtao@huaweicloud.com
2 years agoMerge tag 'nf-next-23-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilt...
Paolo Abeni [Thu, 26 Oct 2023 10:20:35 +0000 (12:20 +0200)]
Merge tag 'nf-next-23-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next. Mostly
nf_tables updates with two patches for connlabel and br_netfilter.

1) Rename function name to perform on-demand GC for rbtree elements,
   and replace async GC in rbtree by sync GC. Patches from Florian Westphal.

2) Use commit_mutex for NFT_MSG_GETRULE_RESET to ensure that two
   concurrent threads invoking this command do not underrun stateful
   objects. Patches from Phil Sutter.

3) Use single hook to deal with IP and ARP packets in br_netfilter.
   Patch from Florian Westphal.

4) Use atomic_t in netns->connlabel use counter instead of using a
   spinlock, also patch from Florian.

5) Cleanups for stateful objects infrastructure in nf_tables.
   Patches from Phil Sutter.

6) Flush path uses opaque set element offered by the iterator, instead of
   calling pipapo_deactivate() which looks up for it again.

7) Set backend .flush interface always succeeds, make it return void
   instead.

8) Add struct nft_elem_priv placeholder structure and use it by replacing
   void * to pass opaque set element representation from backend to frontend
   which defeats compiler type checks.

9) Shrink memory consumption of set element transactions, by reducing
   struct nft_trans_elem object size and reducing stack memory usage.

10) Use struct nft_elem_priv also for set backend .insert operation too.

11) Carry reset flag in nft_set_dump_ctx structure, instead of passing it
    as a function argument, from Phil Sutter.

netfilter pull request 23-10-25

* tag 'nf-next-23-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: nf_tables: Carry reset boolean in nft_set_dump_ctx
  netfilter: nf_tables: set->ops->insert returns opaque set element in case of EEXIST
  netfilter: nf_tables: shrink memory consumption of set elements
  netfilter: nf_tables: expose opaque set element as struct nft_elem_priv
  netfilter: nf_tables: set backend .flush always succeeds
  netfilter: nft_set_pipapo: no need to call pipapo_deactivate() from flush
  netfilter: nf_tables: Carry reset boolean in nft_obj_dump_ctx
  netfilter: nf_tables: nft_obj_filter fits into cb->ctx
  netfilter: nf_tables: Carry s_idx in nft_obj_dump_ctx
  netfilter: nf_tables: A better name for nft_obj_filter
  netfilter: nf_tables: Unconditionally allocate nft_obj_filter
  netfilter: nf_tables: Drop pointless memset in nf_tables_dump_obj
  netfilter: conntrack: switch connlabels to atomic_t
  br_netfilter: use single forward hook for ip and arp
  netfilter: nf_tables: Add locking for NFT_MSG_GETRULE_RESET requests
  netfilter: nf_tables: Introduce nf_tables_getrule_single()
  netfilter: nf_tables: Open-code audit log call in nf_tables_getrule()
  netfilter: nft_set_rbtree: prefer sync gc to async worker
  netfilter: nft_set_rbtree: rename gc deactivate+erase function
====================

Link: https://lore.kernel.org/r/20231025212555.132775-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoMerge branch 'net-ipv6-addrconf-ensure-that-temporary-addresses-preferred-lifetimes...
Jakub Kicinski [Thu, 26 Oct 2023 01:23:08 +0000 (18:23 -0700)]
Merge branch 'net-ipv6-addrconf-ensure-that-temporary-addresses-preferred-lifetimes-are-in-the-valid-range'

Alex Henrie says:

====================
net: ipv6/addrconf: ensure that temporary addresses' preferred lifetimes are in the valid range

No changes from v2, but there are only four patches now because the
first patch has already been applied.

https://lore.kernel.org/all/20230829054623.104293-1-alexhenrie24@gmail.com/
====================

Link: https://lore.kernel.org/r/20231024212312.299370-1-alexhenrie24@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoDocumentation: networking: explain what happens if temp_prefered_lft is too small...
Alex Henrie [Tue, 24 Oct 2023 21:23:10 +0000 (15:23 -0600)]
Documentation: networking: explain what happens if temp_prefered_lft is too small or too large

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20231024212312.299370-5-alexhenrie24@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoDocumentation: networking: explain what happens if temp_valid_lft is too small
Alex Henrie [Tue, 24 Oct 2023 21:23:09 +0000 (15:23 -0600)]
Documentation: networking: explain what happens if temp_valid_lft is too small

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20231024212312.299370-4-alexhenrie24@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ipv6/addrconf: clamp preferred_lft to the minimum required
Alex Henrie [Tue, 24 Oct 2023 21:23:08 +0000 (15:23 -0600)]
net: ipv6/addrconf: clamp preferred_lft to the minimum required

If the preferred lifetime was less than the minimum required lifetime,
ipv6_create_tempaddr would error out without creating any new address.
On my machine and network, this error happened immediately with the
preferred lifetime set to 1 second, after a few minutes with the
preferred lifetime set to 4 seconds, and not at all with the preferred
lifetime set to 5 seconds. During my investigation, I found a Stack
Exchange post from another person who seems to have had the same
problem: They stopped getting new addresses if they lowered the
preferred lifetime below 3 seconds, and they didn't really know why.

The preferred lifetime is a preference, not a hard requirement. The
kernel does not strictly forbid new connections on a deprecated address,
nor does it guarantee that the address will be disposed of the instant
its total valid lifetime expires. So rather than disable IPv6 privacy
extensions altogether if the minimum required lifetime swells above the
preferred lifetime, it is more in keeping with the user's intent to
increase the temporary address's lifetime to the minimum necessary for
the current network conditions.

With these fixes, setting the preferred lifetime to 3 or 4 seconds "just
works" because the extra fraction of a second is practically
unnoticeable. It's even possible to reduce the time before deprecation
to 1 or 2 seconds by also disabling duplicate address detection (setting
/proc/sys/net/ipv6/conf/*/dad_transmits to 0). I realize that that is a
pretty niche use case, but I know at least one person who would gladly
sacrifice performance and convenience to be sure that they are getting
the maximum possible level of privacy.

Link: https://serverfault.com/a/1031168/310447
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20231024212312.299370-3-alexhenrie24@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ipv6/addrconf: clamp preferred_lft to the maximum allowed
Alex Henrie [Tue, 24 Oct 2023 21:23:07 +0000 (15:23 -0600)]
net: ipv6/addrconf: clamp preferred_lft to the maximum allowed

Without this patch, there is nothing to stop the preferred lifetime of a
temporary address from being greater than its valid lifetime. If that
was the case, the valid lifetime was effectively ignored.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20231024212312.299370-2-alexhenrie24@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'ipv6-avoid-atomic-fragment-on-gso-output'
Jakub Kicinski [Thu, 26 Oct 2023 01:04:31 +0000 (18:04 -0700)]
Merge branch 'ipv6-avoid-atomic-fragment-on-gso-output'

Yan Zhai says:

====================
ipv6: avoid atomic fragment on GSO output

When the ipv6 stack output a GSO packet, if its gso_size is larger than
dst MTU, then all segments would be fragmented. However, it is possible
for a GSO packet to have a trailing segment with smaller actual size
than both gso_size as well as the MTU, which leads to an "atomic
fragment". Atomic fragments are considered harmful in RFC-8021. An
Existing report from APNIC also shows that atomic fragments are more
likely to be dropped even it is equivalent to a no-op [1].

The series contains following changes:
* drop feature RTAX_FEATURE_ALLFRAG, which has been broken. This helps
  simplifying other changes in this set.
* refactor __ip6_finish_output code to separate GSO and non-GSO packet
  processing, mirroring IPv4 side logic.
* avoid generating atomic fragment on GSO packets.

Link: https://www.potaroo.net/presentations/2022-03-01-ipv6-frag.pdf
V4: https://lore.kernel.org/netdev/cover.1698114636.git.yan@cloudflare.com/
V3: https://lore.kernel.org/netdev/cover.1697779681.git.yan@cloudflare.com/
V2: https://lore.kernel.org/netdev/ZS1%2Fqtr0dZJ35VII@debian.debian/
====================

Link: https://lore.kernel.org/r/cover.1698156966.git.yan@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv6: avoid atomic fragment on GSO packets
Yan Zhai [Tue, 24 Oct 2023 14:26:40 +0000 (07:26 -0700)]
ipv6: avoid atomic fragment on GSO packets

When the ipv6 stack output a GSO packet, if its gso_size is larger than
dst MTU, then all segments would be fragmented. However, it is possible
for a GSO packet to have a trailing segment with smaller actual size
than both gso_size as well as the MTU, which leads to an "atomic
fragment". Atomic fragments are considered harmful in RFC-8021. An
Existing report from APNIC also shows that atomic fragments are more
likely to be dropped even it is equivalent to a no-op [1].

Add an extra check in the GSO slow output path. For each segment from
the original over-sized packet, if it fits with the path MTU, then avoid
generating an atomic fragment.

Link: https://www.potaroo.net/presentations/2022-03-01-ipv6-frag.pdf
Fixes: b210de4f8c97 ("net: ipv6: Validate GSO SKB before finish IPv6 processing")
Reported-by: David Wragg <dwragg@cloudflare.com>
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Link: https://lore.kernel.org/r/90912e3503a242dca0bc36958b11ed03a2696e5e.1698156966.git.yan@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv6: refactor ip6_finish_output for GSO handling
Yan Zhai [Tue, 24 Oct 2023 14:26:37 +0000 (07:26 -0700)]
ipv6: refactor ip6_finish_output for GSO handling

Separate GSO and non-GSO packets handling to make the logic cleaner. For
GSO packets, frag_max_size check can be omitted because it is only
useful for packets defragmented by netfilter hooks. Both local output
and GRO logic won't produce GSO packets when defragment is needed. This
also mirrors what IPv4 side code is doing.

Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/0e1d4599f858e2becff5c4fe0b5f843236bc3fe8.1698156966.git.yan@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv6: drop feature RTAX_FEATURE_ALLFRAG
Yan Zhai [Tue, 24 Oct 2023 14:26:33 +0000 (07:26 -0700)]
ipv6: drop feature RTAX_FEATURE_ALLFRAG

RTAX_FEATURE_ALLFRAG was added before the first git commit:

https://www.mail-archive.com/bk-commits-head@vger.kernel.org/msg03399.html

The feature would send packets to the fragmentation path if a box
receives a PMTU value with less than 1280 byte. However, since commit
9d289715eb5c ("ipv6: stop sending PTB packets for MTU < 1280"), such
message would be simply discarded. The feature flag is neither supported
in iproute2 utility. In theory one can still manipulate it with direct
netlink message, but it is not ideal because it was based on obsoleted
guidance of RFC-2460 (replaced by RFC-8200).

The feature would always test false at the moment, so remove related
code or mark them as unused.

Signed-off-by: Yan Zhai <yan@cloudflare.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/d78e44dcd9968a252143ffe78460446476a472a1.1698156966.git.yan@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoiavf: in iavf_down, disable queues when removing the driver
Michal Schmidt [Wed, 25 Oct 2023 18:32:13 +0000 (11:32 -0700)]
iavf: in iavf_down, disable queues when removing the driver

In iavf_down, we're skipping the scheduling of certain operations if
the driver is being removed. However, the IAVF_FLAG_AQ_DISABLE_QUEUES
request must not be skipped in this case, because iavf_close waits
for the transition to the __IAVF_DOWN state, which happens in
iavf_virtchnl_completion after the queues are released.

Without this fix, "rmmod iavf" takes half a second per interface that's
up and prints the "Device resources not yet released" warning.

Fixes: c8de44b577eb ("iavf: do not process adminq tasks when __IAVF_IN_REMOVE_TASK is set")
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Tested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231025183213.874283-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'nf-23-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Jakub Kicinski [Wed, 25 Oct 2023 23:02:06 +0000 (16:02 -0700)]
Merge tag 'nf-23-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

This patch contains two late Netfilter's flowtable fixes for net:

1) Flowtable GC pushes back packets to classic path in every GC run,
   ie. every second. This is because NF_FLOW_HW_ESTABLISHED is only
   used by sched/act_ct (never set) and IPS_SEEN_REPLY might be unset
   by the time the flow is offloaded (this status bit is only reliable
   in the sched/act_ct datapath).

2) sched/act_ct logic to push back packets to classic path to reevaluate
   if UDP flow is unidirectional only applies if IPS_HW_OFFLOAD_BIT is
   set on and no hardware offload request is pending to be handled.
   From Vlad Buslov.

These two patches fixes two problems that were introduced in the
previous 6.5 development cycle.

* tag 'nf-23-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  net/sched: act_ct: additional checks for outdated flows
  netfilter: flowtable: GC pushes back packets to classic path
====================

Link: https://lore.kernel.org/r/20231025100819.2664-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agovsock/virtio: initialize the_virtio_vsock before using VQs
Alexandru Matei [Tue, 24 Oct 2023 19:17:42 +0000 (22:17 +0300)]
vsock/virtio: initialize the_virtio_vsock before using VQs

Once VQs are filled with empty buffers and we kick the host, it can send
connection requests. If the_virtio_vsock is not initialized before,
replies are silently dropped and do not reach the host.

virtio_transport_send_pkt() can queue packets once the_virtio_vsock is
set, but they won't be processed until vsock->tx_run is set to true. We
queue vsock->send_pkt_work when initialization finishes to send those
packets queued earlier.

Fixes: 0deab087b16a ("vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock")
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20231024191742.14259-1-alexandru.matei@uipath.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'mptcp-features-and-fixes-for-v6-7'
Jakub Kicinski [Wed, 25 Oct 2023 19:23:36 +0000 (12:23 -0700)]
Merge branch 'mptcp-features-and-fixes-for-v6-7'

Mat Martineau says:

====================
mptcp: Features and fixes for v6.7

Patch 1 adds a configurable timeout for the MPTCP connection when all
subflows are closed, to support break-before-make use cases.

Patch 2 is a fix for a 1-byte error in rx data counters with MPTCP
fastopen connections.

Patch 3 is a minor code cleanup.

Patches 4 & 5 add handling of rcvlowat for MPTCP sockets, with a
prerequisite patch to use a common scaling ratio between TCP and MPTCP.

Patch 6 improves efficiency of memory copying in MPTCP transmit code.

Patch 7 refactors syncing of socket options from the MPTCP socket to
its subflows.

Patches 8 & 9 help the MPTCP packet scheduler perform well by changing
the handling of notsent_lowat in subflows and how available buffer space
is calculated for MPTCP-level sends.
====================

Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-2-v1-0-9dc60939d371@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: refactor sndbuf auto-tuning
Paolo Abeni [Mon, 23 Oct 2023 20:44:42 +0000 (13:44 -0700)]
mptcp: refactor sndbuf auto-tuning

The MPTCP protocol account for the data enqueued on all the subflows
to the main socket send buffer, while the send buffer auto-tuning
algorithm set the main socket send buffer size as the max size among
the subflows.

That causes bad performances when at least one subflow is sndbuf
limited, e.g. due to very high latency, as the MPTCP scheduler can't
even fill such buffer.

Change the send-buffer auto-tuning algorithm to compute the main socket
send buffer size as the sum of all the subflows buffer size.

Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-2-v1-9-9dc60939d371@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: ignore notsent_lowat setting at the subflow level
Paolo Abeni [Mon, 23 Oct 2023 20:44:41 +0000 (13:44 -0700)]
mptcp: ignore notsent_lowat setting at the subflow level

Any latency related tuning taking action at the subflow level does
not really affect the user-space, as only the main MPTCP socket is
relevant.

Anyway any limiting setting may foul the MPTCP scheduler, not being
able to fully use the subflow-level cwin, leading to very poor b/w
usage.

Enforce notsent_lowat to be a no-op on every subflow.

Note that TCP_NOTSENT_LOWAT is currently not supported, and properly
dealing with that will require more invasive changes.

Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-2-v1-8-9dc60939d371@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: consolidate sockopt synchronization
Paolo Abeni [Mon, 23 Oct 2023 20:44:40 +0000 (13:44 -0700)]
mptcp: consolidate sockopt synchronization

Move the socket option synchronization for active subflows
at subflow creation time. This allows removing the now unused
unlocked variant of such helper.

While at that, clean-up a bit the mptcp_subflow_create_socket()
errors path.

Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-2-v1-7-9dc60939d371@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: use copy_from_iter helpers on transmit
Paolo Abeni [Mon, 23 Oct 2023 20:44:39 +0000 (13:44 -0700)]
mptcp: use copy_from_iter helpers on transmit

The perf traces show an high cost for the MPTCP transmit path memcpy.

It turn out that the helper currently in use carries quite a bit
of unneeded overhead, e.g. to map/unmap the memory pages.

Moving to the 'copy_from_iter' variant removes such overhead and
additionally gains the no-cache support.

Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-2-v1-6-9dc60939d371@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: give rcvlowat some love
Paolo Abeni [Mon, 23 Oct 2023 20:44:38 +0000 (13:44 -0700)]
mptcp: give rcvlowat some love

The MPTCP protocol allow setting sk_rcvlowat, but the value there
is currently ignored.

Additionally, the default subflows sk_rcvlowat basically disables per
subflow delayed ack: the MPTCP protocol move the incoming data from the
subflows into the msk socket as soon as the TCP stacks invokes the subflow
data_ready callback. Later, when __tcp_ack_snd_check() takes action,
the subflow-level copied_seq matches rcv_nxt, and that mandate for an
immediate ack.

Let the mptcp receive path be aware of such threshold, explicitly tracking
the amount of data available to be ready and checking vs sk_rcvlowat in
mptcp_poll() and before waking-up readers.

Additionally implement the set_rcvlowat() callback, to properly handle
the rcvbuf auto-tuning on sk_rcvlowat changes.

Finally to properly handle delayed ack, force the subflow level threshold
to 0 and instead explicitly ask for an immediate ack when the msk level th
is not reached.

Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-2-v1-5-9dc60939d371@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotcp: define initial scaling factor value as a macro
Paolo Abeni [Mon, 23 Oct 2023 20:44:37 +0000 (13:44 -0700)]
tcp: define initial scaling factor value as a macro

So that other users could access it. Notably MPTCP will use
it in the next patch.

No functional change intended.

Acked-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-2-v1-4-9dc60939d371@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: use plain bool instead of custom binary enum
Paolo Abeni [Mon, 23 Oct 2023 20:44:36 +0000 (13:44 -0700)]
mptcp: use plain bool instead of custom binary enum

The 'data_avail' subflow field is already used as plain boolean,
drop the custom binary enum type and switch to bool.

No functional changed intended.

Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-2-v1-3-9dc60939d371@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: properly account fastopen data
Paolo Abeni [Mon, 23 Oct 2023 20:44:35 +0000 (13:44 -0700)]
mptcp: properly account fastopen data

Currently the socket level counter aggregating the received data
does not take in account the data received via fastopen.

Address the issue updating the counter as required.

Fixes: 38967f424b5b ("mptcp: track some aggregate data counters")
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-2-v1-2-9dc60939d371@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: add a new sysctl for make after break timeout
Paolo Abeni [Mon, 23 Oct 2023 20:44:34 +0000 (13:44 -0700)]
mptcp: add a new sysctl for make after break timeout

The MPTCP protocol allows sockets with no alive subflows to stay
in ESTABLISHED status for and user-defined timeout, to allow for
later subflows creation.

Currently such timeout is constant - TCP_TIMEWAIT_LEN. Let the
user-space configure them via a newly added sysctl, to better cope
with busy servers and simplify (make them faster) the relevant
pktdrill tests.

Note that the new know does not apply to orphaned MPTCP socket
waiting for the data_fin handshake completion: they always wait
TCP_TIMEWAIT_LEN.

Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-2-v1-1-9dc60939d371@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'acpi-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Wed, 25 Oct 2023 17:51:56 +0000 (07:51 -1000)]
Merge tag 'acpi-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
 "Unbreak the ACPI NFIT driver after a recent change that inadvertently
  altered its behavior (Xiang Chen)"

* tag 'acpi-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: NFIT: Install Notify() handler before getting NFIT table

2 years agowifi: ray_cs: Remove unnecessary (void*) conversions
Wu Yunchuan [Fri, 20 Oct 2023 09:34:31 +0000 (17:34 +0800)]
wifi: ray_cs: Remove unnecessary (void*) conversions

No need cast (void *) to (struct net_device *).

Signed-off-by: Wu Yunchuan <yunchuan@nfschina.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231020093432.214001-1-yunchuan@nfschina.com
2 years agoMerge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git
Kalle Valo [Wed, 25 Oct 2023 17:10:23 +0000 (20:10 +0300)]
Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git

ath.git patches for v6.7.

Major changes:

ath12k

* QCN9274: mesh support

ath11k

* firmware-2.bin support

2 years agoRevert "Merge branch 'mv88e6xxx-dsa-bindings'"
Jakub Kicinski [Wed, 25 Oct 2023 14:22:37 +0000 (07:22 -0700)]
Revert "Merge branch 'mv88e6xxx-dsa-bindings'"

This reverts the following commits:

commit 53313ed25ba8 ("dt-bindings: marvell: Add Marvell MV88E6060 DSA schema")
commit 0f35369b4efe ("dt-bindings: marvell: Rewrite MV88E6xxx in schema")
commit 605a5f5d406d ("ARM64: dts: marvell: Fix some common switch mistakes")
commit bfedd8423643 ("ARM: dts: nxp: Fix some common switch mistakes")
commit 2b83557a588f ("ARM: dts: marvell: Fix some common switch mistakes")
commit ddae07ce9bb3 ("dt-bindings: net: mvusb: Fix up DSA example")
commit b5ef61718ad7 ("dt-bindings: net: dsa: Require ports or ethernet-ports")

As repoted by Vladimir, it breaks boot on the Turris MOX board.

Link: https://lore.kernel.org/all/20231025093632.fb2qdtunzaznd73z@skbuf/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoRevert "wifi: ath11k: call ath11k_mac_fils_discovery() without condition"
Kalle Valo [Mon, 23 Oct 2023 16:41:20 +0000 (19:41 +0300)]
Revert "wifi: ath11k: call ath11k_mac_fils_discovery() without condition"

This reverts commit e149353e6562f3e3246f75dfc4cca6a0cc5b4efc. The commit caused
QCA6390 hw2.0 firmware WLAN.HST.1.0.1-05266-QCAHSTSWPLZ_V2_TO_X86-1 to crash
during disconnect:

[71990.787525] ath11k_pci 0000:72:00.0: firmware crashed: MHI_CB_EE_RDDM

Closes: https://lore.kernel.org/all/87edhu3550.fsf@kernel.org/
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20231023164120.651151-1-kvalo@kernel.org
2 years agowifi: ath12k: Introduce and use ath12k_sta_to_arsta()
Jeff Johnson [Thu, 19 Oct 2023 16:57:50 +0000 (09:57 -0700)]
wifi: ath12k: Introduce and use ath12k_sta_to_arsta()

Currently, the logic to return an ath12k_sta pointer, given a
ieee80211_sta pointer, uses typecasting throughout the driver. In
general, conversion functions are preferable to typecasting since
using a conversion function allows the compiler to validate the types
of both the input and output parameters.

ath12k already defines a conversion function ath12k_vif_to_arvif() for
a similar conversion. So introduce ath12k_sta_to_arsta() for this use
case, and convert all of the existing typecasting to use this
function.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20231019-upstream-ath12k_sta_to_arsta-v1-1-06f06f693338@quicinc.com
2 years agowifi: ath12k: fix htt mlo-offset event locking
Johan Hovold [Thu, 19 Oct 2023 11:36:50 +0000 (13:36 +0200)]
wifi: ath12k: fix htt mlo-offset event locking

The ath12k active pdevs are protected by RCU but the htt mlo-offset
event handling code calling ath12k_mac_get_ar_by_pdev_id() was not
marked as a read-side critical section.

Mark the code in question as an RCU read-side critical section to avoid
any potential use-after-free issues.

Compile tested only.

Fixes: d889913205cf ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices")
Cc: stable@vger.kernel.org # v6.2
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20231019113650.9060-3-johan+linaro@kernel.org
2 years agowifi: ath12k: fix dfs-radar and temperature event locking
Johan Hovold [Thu, 19 Oct 2023 11:36:49 +0000 (13:36 +0200)]
wifi: ath12k: fix dfs-radar and temperature event locking

The ath12k active pdevs are protected by RCU but the DFS-radar and
temperature event handling code calling ath12k_mac_get_ar_by_pdev_id()
was not marked as a read-side critical section.

Mark the code in question as RCU read-side critical sections to avoid
any potential use-after-free issues.

Note that the temperature event handler looks like a place holder
currently but would still trigger an RCU lockdep splat.

Compile tested only.

Fixes: d889913205cf ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices")
Cc: stable@vger.kernel.org # v6.2
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20231019113650.9060-2-johan+linaro@kernel.org
2 years agowifi: ath11k: fix gtk offload status event locking
Johan Hovold [Thu, 19 Oct 2023 15:53:42 +0000 (17:53 +0200)]
wifi: ath11k: fix gtk offload status event locking

The ath11k active pdevs are protected by RCU but the gtk offload status
event handling code calling ath11k_mac_get_arvif_by_vdev_id() was not
marked as a read-side critical section.

Mark the code in question as an RCU read-side critical section to avoid
any potential use-after-free issues.

Compile tested only.

Fixes: a16d9b50cfba ("ath11k: support GTK rekey offload")
Cc: stable@vger.kernel.org # 5.18
Cc: Carl Huang <quic_cjhuang@quicinc.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20231019155342.31631-1-johan+linaro@kernel.org
2 years agowifi: ath11k: fix htt pktlog locking
Johan Hovold [Thu, 19 Oct 2023 11:25:21 +0000 (13:25 +0200)]
wifi: ath11k: fix htt pktlog locking

The ath11k active pdevs are protected by RCU but the htt pktlog handling
code calling ath11k_mac_get_ar_by_pdev_id() was not marked as a
read-side critical section.

Mark the code in question as an RCU read-side critical section to avoid
any potential use-after-free issues.

Compile tested only.

Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
Cc: stable@vger.kernel.org # 5.6
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20231019112521.2071-1-johan+linaro@kernel.org
2 years agowifi: ath11k: fix dfs radar event locking
Johan Hovold [Thu, 19 Oct 2023 15:31:15 +0000 (17:31 +0200)]
wifi: ath11k: fix dfs radar event locking

The ath11k active pdevs are protected by RCU but the DFS radar event
handling code calling ath11k_mac_get_ar_by_pdev_id() was not marked as a
read-side critical section.

Mark the code in question as an RCU read-side critical section to avoid
any potential use-after-free issues.

Compile tested only.

Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
Cc: stable@vger.kernel.org # 5.6
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20231019153115.26401-3-johan+linaro@kernel.org
2 years agowifi: ath11k: fix temperature event locking
Johan Hovold [Thu, 19 Oct 2023 15:31:14 +0000 (17:31 +0200)]
wifi: ath11k: fix temperature event locking

The ath11k active pdevs are protected by RCU but the temperature event
handling code calling ath11k_mac_get_ar_by_pdev_id() was not marked as a
read-side critical section as reported by RCU lockdep:

=============================
WARNING: suspicious RCU usage
6.6.0-rc6 #7 Not tainted
-----------------------------
drivers/net/wireless/ath/ath11k/mac.c:638 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
no locks held by swapper/0/0.
...
Call trace:
...
 lockdep_rcu_suspicious+0x16c/0x22c
 ath11k_mac_get_ar_by_pdev_id+0x194/0x1b0 [ath11k]
 ath11k_wmi_tlv_op_rx+0xa84/0x2c1c [ath11k]
 ath11k_htc_rx_completion_handler+0x388/0x510 [ath11k]

Mark the code in question as an RCU read-side critical section to avoid
any potential use-after-free issues.

Tested-on: WCN6855 hw2.1 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.23

Fixes: a41d10348b01 ("ath11k: add thermal sensor device support")
Cc: stable@vger.kernel.org # 5.7
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20231019153115.26401-2-johan+linaro@kernel.org
2 years agowifi: ath12k: rename the sc naming convention to ab
Karthikeyan Periyasamy [Wed, 18 Oct 2023 15:30:08 +0000 (21:00 +0530)]
wifi: ath12k: rename the sc naming convention to ab

In PCI and HAL interface layer module, the identifier sc is used
to represent an instance of ath12k_base structure. However,
within ath12k, the convention is to use "ab" to represent an SoC
"base" struct. So change the all instances of sc to ab.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.1.1-00125-QCAHKSWPL_SILICONZ-1

Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20231018153008.29820-3-quic_periyasa@quicinc.com
2 years agowifi: ath12k: rename the wmi_sc naming convention to wmi_ab
Karthikeyan Periyasamy [Wed, 18 Oct 2023 15:30:07 +0000 (21:00 +0530)]
wifi: ath12k: rename the wmi_sc naming convention to wmi_ab

In WMI layer module, the identifier wmi_sc is used to represent
an instance of ath12k_wmi_base structure. However, within ath12k,
the convention is to use "ab" to represent an SoC "base" struct.
So change the all instances of wmi_sc to wmi_ab.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.1.1-00125-QCAHKSWPL_SILICONZ-1

Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20231018153008.29820-2-quic_periyasa@quicinc.com
2 years agowifi: ath11k: add firmware-2.bin support
Anilkumar Kolli [Wed, 18 Oct 2023 08:37:06 +0000 (11:37 +0300)]
wifi: ath11k: add firmware-2.bin support

Firmware IE containers can dynamically provide various information
what firmware supports. Also it can embed more than one image so
updating firmware is easy, user just needs to update one file in
/lib/firmware/.

The firmware API 2 or higher will use the IE container format, the
current API 1 will not use the new format but it still is supported
for some time. Firmware API 2 files are named as firmware-2.bin
(which contains both amss.bin and m3.bin images) and API 1 files are
amss.bin and m3.bin.

Currently ath11k PCI driver provides firmware binary (amss.bin) path to
MHI driver, MHI driver reads firmware from filesystem and boots it. Add
provision to read firmware files from ath11k driver and provide the amss.bin
firmware data and size to MHI using a pointer.

Currently enum ath11k_fw_features is empty, the patches adding features will
add the flags.

With AHB devices there's no amss.bin or m3.bin, so no changes in how AHB
firmware files are used. But AHB devices can use future additions to the meta
data, for example in enum ath11k_fw_features.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.9

Co-developed-by: P Praneesh <quic_ppranees@quicinc.com>
Signed-off-by: P Praneesh <quic_ppranees@quicinc.com>
Signed-off-by: Anilkumar Kolli <quic_akolli@quicinc.com>
Co-developed-by: Kalle Valo <quic_kvalo@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20230727100430.3603551-4-kvalo@kernel.org
2 years agowifi: ath11k: qmi: refactor ath11k_qmi_m3_load()
Kalle Valo [Wed, 18 Oct 2023 08:37:06 +0000 (11:37 +0300)]
wifi: ath11k: qmi: refactor ath11k_qmi_m3_load()

Simple refactoring to make it easier to add firmware-2.bin support in the
following patch.

Earlier ath11k_qmi_m3_load() supported changing m3.bin contents while ath11k is
running. But that's not going to actually work, m3.bin is supposed to be the
same during the lifetime of ath11k, for example we don't support changing the
firmware capabilities on the fly. Due to this ath11k requests m3.bin firmware
file first and only then checks m3_mem->vaddr, so we are basically requesting
the firmware file even if it's not needed. Reverse the code so that m3_mem
buffer is checked first, and only if it doesn't exist, then m3.bin is requested
from user space.

Checking for m3_mem->size is redundant when m3_mem->vaddr is NULL, we would
not be able to use the buffer in that case. So remove the check for size.

Simplify the exit handling and use 'goto out'.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.9

Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20230727100430.3603551-3-kvalo@kernel.org
2 years agonet: ipv6: fix typo in comments
Deming Wang [Wed, 25 Oct 2023 06:16:56 +0000 (02:16 -0400)]
net: ipv6: fix typo in comments

The word "advertize" should be replaced by "advertise".

Signed-off-by: Deming Wang <wangdeming@inspur.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ipv4: fix typo in comments
Deming Wang [Wed, 25 Oct 2023 06:14:34 +0000 (02:14 -0400)]
net: ipv4: fix typo in comments

The word "advertize" should be replaced by "advertise".

Signed-off-by: Deming Wang <wangdeming@inspur.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet/sched: act_ct: additional checks for outdated flows
Vlad Buslov [Tue, 24 Oct 2023 19:58:57 +0000 (21:58 +0200)]
net/sched: act_ct: additional checks for outdated flows

Current nf_flow_is_outdated() implementation considers any flow table flow
which state diverged from its underlying CT connection status for teardown
which can be problematic in the following cases:

- Flow has never been offloaded to hardware in the first place either
because flow table has hardware offload disabled (flag
NF_FLOWTABLE_HW_OFFLOAD is not set) or because it is still pending on 'add'
workqueue to be offloaded for the first time. The former is incorrect, the
later generates excessive deletions and additions of flows.

- Flow is already pending to be updated on the workqueue. Tearing down such
flows will also generate excessive removals from the flow table, especially
on highly loaded system where the latency to re-offload a flow via 'add'
workqueue can be quite high.

When considering a flow for teardown as outdated verify that it is both
offloaded to hardware and doesn't have any pending updates.

Fixes: 41f2c7c342d3 ("net/sched: act_ct: Fix promotion of offloaded unreplied tuple")
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2 years agonetfilter: flowtable: GC pushes back packets to classic path
Pablo Neira Ayuso [Tue, 24 Oct 2023 19:09:47 +0000 (21:09 +0200)]
netfilter: flowtable: GC pushes back packets to classic path

Since 41f2c7c342d3 ("net/sched: act_ct: Fix promotion of offloaded
unreplied tuple"), flowtable GC pushes back flows with IPS_SEEN_REPLY
back to classic path in every run, ie. every second. This is because of
a new check for NF_FLOW_HW_ESTABLISHED which is specific of sched/act_ct.

In Netfilter's flowtable case, NF_FLOW_HW_ESTABLISHED never gets set on
and IPS_SEEN_REPLY is unreliable since users decide when to offload the
flow before, such bit might be set on at a later stage.

Fix it by adding a custom .gc handler that sched/act_ct can use to
deal with its NF_FLOW_HW_ESTABLISHED bit.

Fixes: 41f2c7c342d3 ("net/sched: act_ct: Fix promotion of offloaded unreplied tuple")
Reported-by: Vladimir Smelhaus <vl.sm@email.cz>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2 years agoamd/pds_core: core: No need for Null pointer check before kfree
Bragatheswaran Manickavel [Tue, 24 Oct 2023 18:20:51 +0000 (23:50 +0530)]
amd/pds_core: core: No need for Null pointer check before kfree

kfree()/vfree() internally perform NULL check on the
pointer handed to it and take no action if it indeed is
NULL. Hence there is no need for a pre-check of the memory
pointer before handing it to kfree()/vfree().

Issue reported by ifnullfree.cocci Coccinelle semantic
patch script.

Signed-off-by: Bragatheswaran Manickavel <bragathemanick0908@gmail.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'mv88e6xxx-dsa-bindings'
David S. Miller [Wed, 25 Oct 2023 09:28:00 +0000 (10:28 +0100)]
Merge branch 'mv88e6xxx-dsa-bindings'

Linus Walleij says:

====================
Create a binding for the Marvell MV88E6xxx DSA switches

The Marvell switches are lacking DT bindings.

I need proper schema checking to add LED support to the
Marvell switch. Just how it is, it can't go on like this.

Some Device Tree fixes are included in the series, these
remove the major and most annoying warnings fallout noise:
some warnings remain, and these are of more serious nature,
such as missing phy-mode. They can be applied individually,
or to the networking tree with the rest of the patches.

Thanks to Andrew Lunn, Vladimir Oltean and Russell King
for excellent review and feedback!

---
Changes in v7:
- Fix the elaborate spacing to satisfy yamllint in the
  ports/ethernet-ports requirement.
- Link to v6: https://lore.kernel.org/r/20231024-marvell-88e6152-wan-led-v6-0-993ab0949344@linaro.org

Changes in v6:
- Fix ports/ethernet-ports requirement with proper indenting
  (hopefully).
- Link to v5: https://lore.kernel.org/r/20231023-marvell-88e6152-wan-led-v5-0-0e82952015a7@linaro.org

Changes in v5:
- Consistently rename switch@n to ethernet-switch@n in all cleanup patches
- Consistently rename ports to ethernet-ports in all cleanup patches
- Consistently rename all port@n to ethernet-port@n in all cleanup patches
- Consistently rename all phy@n to ethernet-phy@n in all cleanup patches
- Restore the nodename on the Turris MOX which has a U-Boot binary using the
  nodename as ABI, put in a blurb warning about this so no-one else tries
  to change it in the future.
- Drop dsa.yaml direct references where we reference dsa.yaml#/$defs/ethernet-ports
- Replace the conjured MV88E6xxx example by a better one based on imx6qdl
  plus strictly named nodes and added reset-gpios for a more complete example,
  and another example using the interrupt controller based on
  armada-381-netgear-gs110emx.dts
- Bump lineage to 2008 as Vladimir says the code was developed starting 2008.
- Link to v4: https://lore.kernel.org/r/20231018-marvell-88e6152-wan-led-v4-0-3ee0c67383be@linaro.org

Changes in v4:
- Rebase the series on top of Rob's series
  "dt-bindings: net: Child node schema cleanups" (or the hex numbered
  ports will not work)
- Fix up a whitespacing error corrupting v3...
- Add a new patch making the generic DSA binding require ports or
  ethernet-ports in the switch node.
- Drop any corrections of port@a in the patches.
- Drop oneOf in the compatible enum for mv88e6xxx
- Use ethernet-switch, ethernet-ports and ethernet-phy in the examples
- Transclude the dsa.yaml#/$defs/ethernet-ports define for ports
- Move the DTS and binding fixes first, before the actual bindings,
  so they apply without (too many) warnings as fallout.
- Drop stray colon in text.
- Drop example port in the mveusb binding.
- Link to v3: https://lore.kernel.org/r/20231016-marvell-88e6152-wan-led-v3-0-38cd449dfb15@linaro.org

Changes in v3:
- Fix up a related mvusb example in a different binding that
  the scripts were complaining about.
- Fix up the wording on internal vs external MDIO buses in the
  mv88e6xxx binding document.
- Remove pointless label and put the right rev-mii into the
  MV88E6060 schema.
- Link to v2: https://lore.kernel.org/r/20231014-marvell-88e6152-wan-led-v2-0-7fca08b68849@linaro.org

Changes in v2:
- Break out a separate Marvell MV88E6060 binding file. I stand corrected.
- Drop the idea to rely on nodename mdio-external for the external
  MDIO bus, keep the compatible, drop patch for the driver.
- Fix more Marvell DT mistakes.
- Fix NXP DT mistakes in a separate patch.
- Fix Marvell ARM64 mistakes in a separate patch.
- Link to v1: https://lore.kernel.org/r/20231013-marvell-88e6152-wan-led-v1-0-0712ba99857c@linaro.org
====================

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodt-bindings: marvell: Add Marvell MV88E6060 DSA schema
Linus Walleij [Tue, 24 Oct 2023 13:20:33 +0000 (15:20 +0200)]
dt-bindings: marvell: Add Marvell MV88E6060 DSA schema

The Marvell MV88E6060 is one of the oldest DSA switches from
Marvell, and it has DT bindings used in the wild. Let's define
them properly.

It is different enough from the rest of the MV88E6xxx switches
that it deserves its own binding.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodt-bindings: marvell: Rewrite MV88E6xxx in schema
Linus Walleij [Tue, 24 Oct 2023 13:20:32 +0000 (15:20 +0200)]
dt-bindings: marvell: Rewrite MV88E6xxx in schema

This is an attempt to rewrite the Marvell MV88E6xxx switch bindings
in YAML schema.

The current text binding says:
  WARNING: This binding is currently unstable. Do not program it into a
  FLASH never to be changed again. Once this binding is stable, this
  warning will be removed.

Well that never happened before we switched to YAML markup,
we can't have it like this, what about fixing the mess?

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoARM64: dts: marvell: Fix some common switch mistakes
Linus Walleij [Tue, 24 Oct 2023 13:20:31 +0000 (15:20 +0200)]
ARM64: dts: marvell: Fix some common switch mistakes

Fix some errors in the Marvell MV88E6xxx switch descriptions:
- The top node had no address size or cells.
- switch0@0 is not OK, should be ethernet-switch@0.
- ports should be ethernet-ports
- port@0 should be ethernet-port@0
- PHYs should be named ethernet-phy@

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoARM: dts: nxp: Fix some common switch mistakes
Linus Walleij [Tue, 24 Oct 2023 13:20:30 +0000 (15:20 +0200)]
ARM: dts: nxp: Fix some common switch mistakes

Fix some errors in the Marvell MV88E6xxx switch descriptions:
- switch0@0 is not OK, should be ethernet-switch@0
- ports should be ethernet-ports
- port should be ethernet-port
- phy should be ethernet-phy

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoARM: dts: marvell: Fix some common switch mistakes
Linus Walleij [Tue, 24 Oct 2023 13:20:29 +0000 (15:20 +0200)]
ARM: dts: marvell: Fix some common switch mistakes

Fix some errors in the Marvell MV88E6xxx switch descriptions:
- The top node had no address size or cells.
- switch0@0 is not OK, should be ethernet-switch@0.
- The ports node should be named ethernet-ports
- The ethernet-ports node should have port@0 etc children, no
  plural "ports" in the children.
- Ports should be named ethernet-port@0 etc
- PHYs should be named ethernet-phy@0 etc

This serves as an example of fixes needed for introducing a
schema for the bindings, but the patch can simply be applied.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodt-bindings: net: mvusb: Fix up DSA example
Linus Walleij [Tue, 24 Oct 2023 13:20:28 +0000 (15:20 +0200)]
dt-bindings: net: mvusb: Fix up DSA example

When adding a proper schema for the Marvell mx88e6xxx switch,
the scripts start complaining about this embedded example:

  dtschema/dtc warnings/errors:
  net/marvell,mvusb.example.dtb: switch@0: ports: '#address-cells'
  is a required property
  from schema $id: http://devicetree.org/schemas/net/dsa/marvell,mv88e6xxx.yaml#
  net/marvell,mvusb.example.dtb: switch@0: ports: '#size-cells'
  is a required property
  from schema $id: http://devicetree.org/schemas/net/dsa/marvell,mv88e6xxx.yaml#

Fix this up by extending the example with those properties in
the ports node.

While we are at it, rename "ports" to "ethernet-ports" and rename
"switch" to "ethernet-switch" as this is recommended practice.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodt-bindings: net: dsa: Require ports or ethernet-ports
Linus Walleij [Tue, 24 Oct 2023 13:20:27 +0000 (15:20 +0200)]
dt-bindings: net: dsa: Require ports or ethernet-ports

Bindings using dsa.yaml#/$defs/ethernet-ports specify that
a DSA switch node need to have a ports or ethernet-ports
subnode, and that is actually required, so add requirements
using oneOf.

Suggested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosched: act_ct: switch to per-action label counting
Florian Westphal [Tue, 24 Oct 2023 11:05:51 +0000 (13:05 +0200)]
sched: act_ct: switch to per-action label counting

net->ct.labels_used was meant to convey 'number of ip/nftables rules
that need the label extension allocated'.

act_ct enables this for each net namespace, which voids all attempts
to avoid ct->ext allocation when possible.

Move this increment to the control plane to request label extension
space allocation only when its needed.

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agowifi: rtw89: cleanup firmware elements parsing
Dmitry Antipov [Fri, 20 Oct 2023 04:09:36 +0000 (07:09 +0300)]
wifi: rtw89: cleanup firmware elements parsing

When compiling with clang-18, I've noticed the following:

drivers/net/wireless/realtek/rtw89/fw.c:389:28: warning: cast to smaller
integer type 'enum rtw89_fw_type' from 'const void *' [-Wvoid-pointer-to-enum-cast]
  389 |         enum rtw89_fw_type type = (enum rtw89_fw_type)data;
      |                                   ^~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/wireless/realtek/rtw89/fw.c:569:13: warning: cast to smaller
integer type 'enum rtw89_rf_path' from 'const void *' [-Wvoid-pointer-to-enum-cast]
  569 |                 rf_path = (enum rtw89_rf_path)data;
      |                           ^~~~~~~~~~~~~~~~~~~~~~~~

So avoid brutal everything-to-const-void-and-back casts, introduce
'union rtw89_fw_element_arg' to pass parameters to element handler
callbacks, and adjust all of the related bits accordingly. Compile
tested only.

Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231020040940.33154-1-dmantipov@yandex.ru
2 years agowifi: rt2x00: rework MT7620 PA/LNA RF calibration
Shiji Yang [Thu, 19 Oct 2023 11:58:58 +0000 (19:58 +0800)]
wifi: rt2x00: rework MT7620 PA/LNA RF calibration

1. Move MT7620 PA/LNA calibration code to dedicated functions.
2. For external PA/LNA devices, restore RF and BBP registers before
   R-Calibration.
3. Do Rx DCOC calibration again before RXIQ calibration.
4. Add some missing LNA related registers' initialization.

Signed-off-by: Shiji Yang <yangshiji66@outlook.com>
Acked-by: Stanislaw Gruszka <stf_xl@wp.pl>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/TYAP286MB0315979F92DC563019B8F238BCD4A@TYAP286MB0315.JPNP286.PROD.OUTLOOK.COM
2 years agowifi: rt2x00: rework MT7620 channel config function
Shiji Yang [Thu, 19 Oct 2023 11:58:57 +0000 (19:58 +0800)]
wifi: rt2x00: rework MT7620 channel config function

1. Move the channel configuration code from rt2800_vco_calibration()
   to the rt2800_config_channel().
2. Use MT7620 SoC specific AGC initial LNA value instead of the
   RT5592's value.
3. BBP{195,196} pairing write has been replaced with
   rt2800_bbp_glrt_write() to reduce redundant code.

Signed-off-by: Shiji Yang <yangshiji66@outlook.com>
Acked-by: Stanislaw Gruszka <stf_xl@wp.pl>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/TYAP286MB0315622A4340BFFA530B1B86BCD4A@TYAP286MB0315.JPNP286.PROD.OUTLOOK.COM
2 years agowifi: rt2x00: improve MT7620 register initialization
Shiji Yang [Thu, 19 Oct 2023 11:58:56 +0000 (19:58 +0800)]
wifi: rt2x00: improve MT7620 register initialization

1. Do not hard reset the BBP. We can use soft reset instead. This
   change has some help to the calibration failure issue.
2. Enable falling back to legacy rate from the HT/RTS rate by
   setting the HT_FBK_TO_LEGACY register.
3. Implement MCS rate specific maximum PSDU size. It can improve
   the transmission quality under the low RSSI condition.
4. Set BBP_84 register value to 0x19. This is used for extension
   channel overlapping IOT.

Signed-off-by: Shiji Yang <yangshiji66@outlook.com>
Acked-by: Stanislaw Gruszka <stf_xl@wp.pl>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/TYAP286MB031553CCD4B7A3B89C85935DBCD4A@TYAP286MB0315.JPNP286.PROD.OUTLOOK.COM
2 years agonet: hns3: add some link modes for hisilicon device
Hao Chen [Tue, 24 Oct 2023 03:20:34 +0000 (11:20 +0800)]
net: hns3: add some link modes for hisilicon device

Add HCLGE_SUPPORT_50G_R1_BIT and HCLGE_SUPPORT_100G_R2_BIT two
capability bits and Corresponding link modes.

Signed-off-by: Hao Chen <chenhao418@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'dsa-microchip-WoL-support'
David S. Miller [Wed, 25 Oct 2023 07:47:33 +0000 (08:47 +0100)]
Merge branch 'dsa-microchip-WoL-support'

2 years agonet: dsa: microchip: ksz9477: add Wake on LAN support
Oleksij Rempel [Mon, 23 Oct 2023 09:33:38 +0000 (11:33 +0200)]
net: dsa: microchip: ksz9477: add Wake on LAN support

Add WoL support for KSZ9477 family of switches. This code was tested on
KSZ8563 chip.

KSZ9477 family of switches supports multiple PHY events:
- wake on Link Up
- wake on Energy Detect.
Since current UAPI can't differentiate between this PHY events, map all
of them to WAKE_PHY.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: dsa: microchip: use wakeup-source DT property to enable PME output
Oleksij Rempel [Mon, 23 Oct 2023 09:33:37 +0000 (11:33 +0200)]
net: dsa: microchip: use wakeup-source DT property to enable PME output

KSZ switches with WoL support signals wake event over PME pin. If this
pin is attached to some external PMIC or System Controller can't be
described as GPIO, the only way to describe it in the devicetree is to
use wakeup-source property. So, add support for this property and enable
PME switch output if this property is present.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodt-bindings: net: dsa: microchip: add wakeup-source property
Oleksij Rempel [Mon, 23 Oct 2023 09:33:36 +0000 (11:33 +0200)]
dt-bindings: net: dsa: microchip: add wakeup-source property

Add wakeup-source property to enable Wake on Lan functionality in the
switch.

Since PME wake pin is not always attached to the SoC, use wakeup-source
instead of wakeup-gpios

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: dsa: microchip: Add missing MAC address register offset for ksz8863
Oleksij Rempel [Mon, 23 Oct 2023 09:33:35 +0000 (11:33 +0200)]
net: dsa: microchip: Add missing MAC address register offset for ksz8863

Add the missing offset for the global MAC address register
(REG_SW_MAC_ADDR) for the ksz8863 family of switches.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agos390/qeth: replace deprecated strncpy with strscpy
Justin Stitt [Mon, 23 Oct 2023 19:39:39 +0000 (19:39 +0000)]
s390/qeth: replace deprecated strncpy with strscpy

strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

We expect new_entry->dbf_name to be NUL-terminated based on its use with
strcmp():
|       if (strcmp(entry->dbf_name, name) == 0) {

Moreover, NUL-padding is not required as new_entry is kzalloc'd just
before this assignment:
|       new_entry = kzalloc(sizeof(struct qeth_dbf_entry), GFP_KERNEL);

... rendering any future NUL-byte assignments (like the ones strncpy()
does) redundant.

Considering the above, a suitable replacement is `strscpy` [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Thorsten Winkler <twinkler@linux.ibm.com>
Tested-by: Thorsten Winkler <twinkler@linux.ibm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20231023-strncpy-drivers-s390-net-qeth_core_main-c-v1-1-e7ce65454446@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agos390/ctcm: replace deprecated strncpy with strscpy
Justin Stitt [Mon, 23 Oct 2023 19:35:07 +0000 (19:35 +0000)]
s390/ctcm: replace deprecated strncpy with strscpy

strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

We expect chid to be NUL-terminated based on its use with format
strings:

CTCM_DBF_TEXT_(SETUP, CTC_DBF_INFO, "%s(%s) %s", CTCM_FUNTAIL,
chid, ok ? "OK" : "failed");

Moreover, NUL-padding is not required as it is _only_ used in this one
instance with a format string.

Considering the above, a suitable replacement is `strscpy` [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.

We can also drop the +1 from chid's declaration as we no longer need to
be cautious about leaving a spot for a NUL-byte. Let's use the more
idiomatic strscpy usage of (dest, src, sizeof(dest)) as this more
closely ties the destination buffer to the length.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Thorsten Winkler <twinkler@linux.ibm.com>
Tested-by: Thorsten Winkler <twinkler@linux.ibm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20231023-strncpy-drivers-s390-net-ctcm_main-c-v1-1-265db6e78165@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>