Some drivers depend on shutdown being called for proper operation.
Unset HCI_USER_CHANNEL and call the full close routine since shutdown is
complementary to setup.
Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Suspend notifier should only be registered and unregistered once per
hdev. Simplify this by only registering during driver registration and
simply exiting early when HCI_USER_CHANNEL is set.
Reported-by: syzbot <syzkaller@googlegroups.com> Fixes: 359ee4f834f5 (Bluetooth: Unregister suspend with userchannel) Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Luiz Augusto von Dentz [Mon, 26 Sep 2022 22:44:42 +0000 (15:44 -0700)]
Bluetooth: hci_core: Fix not handling link timeouts propertly
Change that introduced the use of __check_timeout did not account for
link types properly, it always assumes ACL_LINK is used thus causing
hdev->acl_last_tx to be used even in case of LE_LINK and then again
uses ACL_LINK with hci_link_tx_to.
To fix this __check_timeout now takes the link type as parameter and
then procedure to use the right last_tx based on the link type and pass
it to hci_link_tx_to.
Fixes: 1b1d29e51499 ("Bluetooth: Make use of __check_timeout on hci_sched_le") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Tested-by: David Beinder <david@beinder.at>
Luiz Augusto von Dentz [Mon, 19 Sep 2022 18:10:17 +0000 (11:10 -0700)]
Bluetooth: hci_event: Make sure ISO events don't affect non-ISO connections
ISO events (CIS/BIS) shall only be relevant for connection with link
type of ISO_LINK, otherwise the controller is probably buggy or it is
the result of fuzzer tools such as syzkaller.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
syzbot is reporting NULL pointer dereference at hci_uart_tty_close() [1],
for rcu_sync_enter() is called without rcu_sync_init() due to
hci_uart_tty_open() ignoring percpu_init_rwsem() failure.
While we are at it, fix that hci_uart_register_device() ignores
percpu_init_rwsem() failure and hci_uart_unregister_device() does not
call percpu_free_rwsem().
Link: https://syzkaller.appspot.com/bug?extid=576dfca25381fb6fbc5f Reported-by: syzbot <syzbot+576dfca25381fb6fbc5f@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: 67d2f8781b9f00d1 ("Bluetooth: hci_ldisc: Allow sleeping while proto locks are held.") Fixes: d73e172816652772 ("Bluetooth: hci_serdev: Init hci_uart proto_lock to avoid oops") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: use hdev->workqueue when queuing hdev->{cmd,ncmd}_timer works
syzbot is reporting attempt to schedule hdev->cmd_work work from system_wq
WQ into hdev->workqueue WQ which is under draining operation [1], for
commit c8efcc2589464ac7 ("workqueue: allow chained queueing during
destruction") does not allow such operation.
The check introduced by commit 877afadad2dce8aa ("Bluetooth: When HCI work
queue is drained, only queue chained work") was incomplete.
Use hdev->workqueue WQ when queuing hdev->{cmd,ncmd}_timer works because
hci_{cmd,ncmd}_timeout() calls queue_work(hdev->workqueue). Also, protect
the queuing operation with RCU read lock in order to avoid calling
queue_delayed_work() after cancel_delayed_work() completed.
Link: https://syzkaller.appspot.com/bug?extid=243b7d89777f90f7613b Reported-by: syzbot <syzbot+243b7d89777f90f7613b@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: 877afadad2dce8aa ("Bluetooth: When HCI work queue is drained, only queue chained work") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: L2CAP: initialize delayed works at l2cap_chan_create()
syzbot is reporting cancel_delayed_work() without INIT_DELAYED_WORK() at
l2cap_chan_del() [1], for CONF_NOT_COMPLETE flag (which meant to prevent
l2cap_chan_del() from calling cancel_delayed_work()) is cleared by timer
which fires before l2cap_chan_del() is called by closing file descriptor
created by socket(AF_BLUETOOTH, SOCK_STREAM, BTPROTO_L2CAP).
l2cap_bredr_sig_cmd(L2CAP_CONF_REQ) and l2cap_bredr_sig_cmd(L2CAP_CONF_RSP)
are calling l2cap_ertm_init(chan), and they call l2cap_chan_ready() (which
clears CONF_NOT_COMPLETE flag) only when l2cap_ertm_init(chan) succeeded.
l2cap_sock_init() does not call l2cap_ertm_init(chan), and it instead sets
CONF_NOT_COMPLETE flag by calling l2cap_chan_set_defaults(). However, when
connect() is requested, "command 0x0409 tx timeout" happens after 2 seconds
from connect() request, and CONF_NOT_COMPLETE flag is cleared after 4
seconds from connect() request, for l2cap_conn_start() from
l2cap_info_timeout() callback scheduled by
Fix this problem by initializing delayed works used by L2CAP_MODE_ERTM
mode as soon as l2cap_chan_create() allocates a channel, like I did in
commit be8597239379f0f5 ("Bluetooth: initialize skb_queue_head at
l2cap_chan_create()").
Link: https://syzkaller.appspot.com/bug?extid=83672956c7aa6af698b3 Reported-by: syzbot <syzbot+83672956c7aa6af698b3@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
To fix this when the call __rfcomm_sock_close is now done without
holding the lock_sock since rfcomm_dlc_lock exists to protect
the dlc data there is no need to use lock_sock in that code path.
Bluetooth: hci_sync: allow advertise when scan without RPA
Address resolution will be paused during active scan to allow any
advertising reports reach the host. If LL privacy is enabled,
advertising will rely on the controller to generate new RPA.
If host is not using RPA, there is no need to stop advertising during
active scan because there is no need to generate RPA in the controller.
Signed-off-by: Zhengping Jiang <jiangzp@google.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: avoid hci_dev_test_and_set_flag() in mgmt_init_hdev()
syzbot is again reporting attempt to cancel uninitialized work
at mgmt_index_removed() [1], for setting of HCI_MGMT flag from
mgmt_init_hdev() from hci_mgmt_cmd() from hci_sock_sendmsg() can
race with testing of HCI_MGMT flag from mgmt_index_removed() from
hci_sock_bind() due to lack of serialization via hci_dev_lock().
Since mgmt_init_hdev() is called with mgmt_chan_list_lock held, we can
safely split hci_dev_test_and_set_flag() into hci_dev_test_flag() and
hci_dev_set_flag(). Thus, in order to close this race, set HCI_MGMT flag
after INIT_DELAYED_WORK() completed.
This is a local fix based on mgmt_chan_list_lock. Lack of serialization
via hci_dev_lock() might be causing different race conditions somewhere
else. But a global fix based on hci_dev_lock() should deserve a future
patch.
Link: https://syzkaller.appspot.com/bug?extid=844c7bf1b1aa4119c5de Reported-by: syzbot+844c7bf1b1aa4119c5de@syzkaller.appspotmail.com Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: 3f2893d3c142986a ("Bluetooth: don't try to cancel uninitialized works at mgmt_index_removed()") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Kiran K [Wed, 7 Sep 2022 07:19:45 +0000 (12:49 +0530)]
Bluetooth: btintel: Mark Intel controller to support LE_STATES quirk
HarrrisonPeak, CyclonePeak, SnowFieldPeak and SandyPeak controllers
are marked to support HCI_QUIRK_LE_STATES.
Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Chethan T N <chethan.tumkur.narayan@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Co-developed-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: Chris Lu <chris.lu@mediatek.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Luiz Augusto von Dentz [Thu, 8 Sep 2022 20:57:50 +0000 (13:57 -0700)]
Bluetooth: Fix HCIGETDEVINFO regression
Recent changes breaks HCIGETDEVINFO since it changes the size of
hci_dev_info.
Fixes: 26afbd826ee3 ("Bluetooth: Add initial implementation of CIS connections") Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
hci_read_buffer_size_sync shall not use HCI_OP_LE_READ_BUFFER_SIZE_V2
sinze that is LE specific, instead it is hci_le_read_buffer_size_sync
version that shall use it.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216382 Fixes: 26afbd826ee3 ("Bluetooth: Add initial implementation of CIS connections") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Brian Gix [Thu, 1 Sep 2022 19:19:13 +0000 (12:19 -0700)]
Bluetooth: Implement support for Mesh
The patch adds state bits, storage and HCI command chains for sending
and receiving Bluetooth Mesh advertising packets, and delivery to
requesting user space processes. It specifically creates 4 new MGMT
commands and 2 new MGMT events:
MGMT_OP_SET_MESH_RECEIVER - Sets passive scan parameters and a list of
AD Types which will trigger Mesh Packet Received events
MGMT_OP_MESH_READ_FEATURES - Returns information on how many outbound
Mesh packets can be simultaneously queued, and what the currently queued
handles are.
MGMT_OP_MESH_SEND - Command to queue a specific outbound Mesh packet,
with the number of times it should be sent, and the BD Addr to use.
Discrete advertisments are added to the ADV Instance list.
MGMT_OP_MESH_SEND_CANCEL - Command to cancel a prior outbound message
request.
MGMT_EV_MESH_DEVICE_FOUND - Event to deliver entire received Mesh
Advertisement packet, along with timing information.
MGMT_EV_MESH_PACKET_CMPLT - Event to indicate that an outbound packet is
no longer queued for delivery.
Signed-off-by: Brian Gix <brian.gix@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Sean Wang [Thu, 11 Aug 2022 00:49:07 +0000 (08:49 +0800)]
Bluetooth: btusb: mediatek: fix WMT failure during runtime suspend
WMT cmd/event doesn't follow up the generic HCI cmd/event handling, it
needs constantly polling control pipe until the host received the WMT
event, thus, we should require to specifically acquire PM counter on the
USB to prevent the interface from entering auto suspended while WMT
cmd/event in progress.
Fixes: a1c49c434e15 ("Bluetooth: btusb: Add protocol support for MediaTek MT7668U USB devices") Co-developed-by: Jing Cai <jing.cai@mediatek.com> Signed-off-by: Jing Cai <jing.cai@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Brian Gix [Tue, 16 Aug 2022 16:41:20 +0000 (09:41 -0700)]
Bluetooth: Move hci_abort_conn to hci_conn.c
hci_abort_conn() is a wrapper around a number of DISCONNECT and
CREATE_CONN_CANCEL commands that was being invoked from hci_request
request queues, which are now deprecated. There are two versions:
hci_abort_conn() which can be invoked from the hci_event thread, and
hci_abort_conn_sync() which can be invoked within a hci_sync cmd chain.
Signed-off-by: Brian Gix <brian.gix@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
The HCI_OP_READ_ENC_KEY_SIZE command is converted from using the
deprecated hci_request mechanism to use hci_send_cmd, with an
accompanying hci_cc_read_enc_key_size to handle it's return response.
Signed-off-by: Brian Gix <brian.gix@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Zhengping Jiang [Tue, 23 Aug 2022 17:28:08 +0000 (10:28 -0700)]
Bluetooth: hci_sync: hold hdev->lock when cleanup hci_conn
When disconnecting all devices, hci_conn_failed is used to cleanup
hci_conn object when the hci_conn object cannot be aborted.
The function hci_conn_failed requires the caller holds hdev->lock.
Fixes: 9b3628d79b46f ("Bluetooth: hci_sync: Cleanup hci_conn if it cannot be aborted") Signed-off-by: Zhengping Jiang <jiangzp@google.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Wolfram Sang [Thu, 18 Aug 2022 21:02:07 +0000 (23:02 +0200)]
Bluetooth: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.
Archie Pusaka [Tue, 23 Aug 2022 04:39:22 +0000 (12:39 +0800)]
Bluetooth: hci_event: Fix checking conn for le_conn_complete_evt
To prevent multiple conn complete events, we shouldn't look up the
conn with hci_lookup_le_connect, since it requires the state to be
BT_CONNECT. By the time the duplicate event is processed, the state
might have changed, so we end up processing the new event anyway.
Change the lookup function to hci_conn_hash_lookup_ba.
Luiz Augusto von Dentz [Thu, 18 Aug 2022 21:31:42 +0000 (14:31 -0700)]
Bluetooth: ISO: Fix not handling shutdown condition
In order to properly handle shutdown syscall the code shall not assume
that the how argument is always SHUT_RDWR resulting in SHUTDOWN_MASK as
that would result in poll to immediately report EPOLLHUP instead of
properly waiting for disconnect_cfm (Disconnect Complete) which is
rather important for the likes of BAP as the CIG may need to be
reprogrammed.
Fixes: ccf74f2390d6 ("Bluetooth: Add BTPROTO_ISO socket type") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Hans de Goede [Sun, 7 Aug 2022 20:57:40 +0000 (22:57 +0200)]
Bluetooth: hci_event: Fix vendor (unknown) opcode status handling
Commit c8992cffbe74 ("Bluetooth: hci_event: Use of a function table to
handle Command Complete") was (presumably) meant to only refactor things
without any functional changes.
But it does have one undesirable side-effect, before *status would always
be set to skb->data[0] and it might be overridden by some of the opcode
specific handling. While now it always set by the opcode specific handlers.
This means that if the opcode is not known *status does not get set any
more at all!
This behavior change has broken bluetooth support for BCM4343A0 HCIs,
the hci_bcm.c code tries to configure UART attached HCIs at a higher
baudraute using vendor specific opcodes. The BCM4343A0 does not
support this and this used to simply fail:
[ 25.646442] Bluetooth: hci0: BCM: failed to write clock (-56)
[ 25.646481] Bluetooth: hci0: Failed to set baudrate
After which things would continue with the initial baudraute. But now
that hci_cmd_complete_evt() no longer sets status for unknown opcodes
*status is left at 0. This causes the hci_bcm.c code to think the baudraute
has been changed on the HCI side and to also adjust the UART baudrate,
after which communication with the HCI is broken, leading to:
And non working bluetooth. Fix this by restoring the previous
default "*status = skb->data[0]" handling for unknown opcodes.
Fixes: c8992cffbe74 ("Bluetooth: hci_event: Use of a function table to handle Command Complete") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Brian Gix [Fri, 5 Aug 2022 23:42:35 +0000 (16:42 -0700)]
Bluetooth: convert hci_update_adv_data to hci_sync
hci_update_adv_data() is called from hci_event and hci_core due to
events from the controller. The prior function used the deprecated
hci_request method, and the new one uses hci_sync.c
Signed-off-by: Brian Gix <brian.gix@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Brian Gix [Fri, 5 Aug 2022 23:42:32 +0000 (16:42 -0700)]
Bluetooth: Move Adv Instance timer to hci_sync
The Advertising Instance expiration timer adv_instance_expire was
handled with the deprecated hci_request mechanism, rather than it's
replacement: hci_sync.
Signed-off-by: Brian Gix <brian.gix@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Uros Bizjak [Mon, 22 Aug 2022 14:32:43 +0000 (16:32 +0200)]
netdev: Use try_cmpxchg in napi_if_scheduled_mark_missed
Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
napi_if_scheduled_mark_missed. x86 CMPXCHG instruction returns
success in ZF flag, so this change saves a compare after cmpxchg
(and related move instruction in front of cmpxchg).
Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg
fails, enabling further code simplifications.
====================
net: devlink: sync flash and dev info commands
Purpose of this patchset is to introduce consistency between two devlink
commands:
devlink dev info
Shows versions of running default flash target and components.
devlink dev flash
Flashes default flash target or component name (if specified
on cmdline).
Currently it is up to the driver what versions to expose and what flash
update component names to accept. This is inconsistent. Thankfully, only
netdevsim currently using components so it is still time
to sanitize this.
This patchset makes sure, that devlink.c calls into driver for
component flash update only in case the driver exposes the same version
name.
Example:
$ devlink dev info
netdevsim/netdevsim10:
driver netdevsim
versions:
running:
fw.mgmt 10.20.30
stored:
fw.mgmt 10.20.30
$ devlink dev flash netdevsim/netdevsim10 file somefile.bin
[fw.mgmt] Preparing to flash
[fw.mgmt] Flashing 100%
[fw.mgmt] Flash select
[fw.mgmt] Flashing done
$ devlink dev flash netdevsim/netdevsim10 file somefile.bin component fw.mgmt
[fw.mgmt] Preparing to flash
[fw.mgmt] Flashing 100%
[fw.mgmt] Flash select
[fw.mgmt] Flashing done
$ devlink dev flash netdevsim/netdevsim10 file somefile.bin component dummy
Error: selected component is not supported by this device.
====================
Jiri Pirko [Wed, 24 Aug 2022 12:20:11 +0000 (14:20 +0200)]
net: devlink: limit flash component name to match version returned by info_get()
Limit the acceptance of component name passed to cmd_flash_update() to
match one of the versions returned by info_get(), marked by version type.
This makes things clearer and enforces 1:1 mapping between exposed
version and accepted flash component.
Check VERSION_TYPE_COMPONENT version type during cmd_flash_update()
execution by calling info_get() with different "req" context.
That causes info_get() to lookup the component name instead of
filling-up the netlink message.
Remove "UPDATE_COMPONENT" flag which becomes used.
Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 24 Aug 2022 12:20:10 +0000 (14:20 +0200)]
netdevsim: add version fw.mgmt info info_get() and mark as a component
Fix the only component user which is netdevsim. It uses component named
"fw.mgmt" in selftests. So add this version to info_get() output with
version type component.
Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 24 Aug 2022 12:20:09 +0000 (14:20 +0200)]
net: devlink: extend info_get() version put to indicate a flash component
Whenever the driver is called by his info_get() op, it may put multiple
version names and values to the netlink message. Extend by additional
helper devlink_info_version_running/stored_put_ext() that allows to
specify a version type that indicates when particular version name
represents a flash component.
This is going to be used in follow-up patch calling info_get() during
flash update command checking if version with this the version type
exists.
Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Zhengchao Shao [Wed, 24 Aug 2022 00:52:31 +0000 (08:52 +0800)]
net: sched: delete duplicate cleanup of backlog and qlen
qdisc_reset() is clearing qdisc->q.qlen and qdisc->qstats.backlog
_after_ calling qdisc->ops->reset. There is no need to clear them
again in the specific reset function.
Wenjuan Geng [Tue, 23 Aug 2022 09:01:22 +0000 (11:01 +0200)]
nfp: flower: support case of match on ct_state(0/0x3f)
is_post_ct_flow() function will process only ct_state ESTABLISHED,
then offload_pre_check() function will check FLOW_DISSECTOR_KEY_CT flag.
When config tc filter match ct_state(0/0x3f), dissector->used_keys
with FLOW_DISSECTOR_KEY_CT bit, function offload_pre_check() will
return false, so not offload. This is a special case that can be handled
safely.
Therefore, modify to let initial packet which won't go through conntrack
can be offloaded, as long as the cared ct fields are all zero.
Richard Gobert [Tue, 23 Aug 2022 07:10:49 +0000 (09:10 +0200)]
net: gro: skb_gro_header helper function
Introduce a simple helper function to replace a common pattern.
When accessing the GRO header, we fetch the pointer from frag0,
then test its validity and fetch it from the skb when necessary.
This leads to the pattern
skb_gro_header_fast -> skb_gro_header_hard -> skb_gro_header_slow
recurring many times throughout GRO code.
This patch replaces these patterns with a single inlined function
call, improving code readability.
====================
Add a second bind table hashed by port and address
Currently, there is one bind hashtable (bhash) that hashes by port only.
This patchset adds a second bind table (bhash2) that hashes by port and
address.
The motivation for adding bhash2 is to expedite bind requests in situations
where the port has many sockets in its bhash table entry (eg a large number
of sockets bound to different addresses on the same port), which makes checking
bind conflicts costly especially given that we acquire the table entry spinlock
while doing so, which can cause softirq cpu lockups and can prevent new tcp
connections.
We ran into this problem at Meta where the traffic team binds a large number
of IPs to port 443 and the bind() call took a significant amount of time
which led to cpu softirq lockups, which caused packet drops and other failures
on the machine.
When experimentally testing this on a local server for ~24k sockets bound to
the port, the results seen were:
ipv4:
before - 0.002317 seconds
with bhash2 - 0.000020 seconds
ipv6:
before - 0.002431 seconds
with bhash2 - 0.000021 seconds
The additions to the initial bhash2 submission [0] are:
* Updating bhash2 in the cases where a socket's rcv saddr changes after it has
* been bound
* Adding locks for bhash2 hashbuckets
Joanne Koong [Mon, 22 Aug 2022 18:10:23 +0000 (11:10 -0700)]
selftests/net: Add sk_bind_sendto_listen and sk_connect_zero_addr
This patch adds 2 new tests: sk_bind_sendto_listen and
sk_connect_zero_addr.
The sk_bind_sendto_listen test exercises the path where a socket's
rcv saddr changes after it has been added to the binding tables,
and then a listen() on the socket is invoked. The listen() should
succeed.
The sk_bind_sendto_listen test is copied over from one of syzbot's
tests: https://syzkaller.appspot.com/x/repro.c?x=1673a38df00000
The sk_connect_zero_addr test exercises the path where the socket was
never previously added to the binding tables and it gets assigned a
saddr upon a connect() to address 0.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Joanne Koong [Mon, 22 Aug 2022 18:10:22 +0000 (11:10 -0700)]
selftests/net: Add test for timing a bind request to a port with a populated bhash entry
This test populates the bhash table for a given port with
MAX_THREADS * MAX_CONNECTIONS sockets, and then times how long
a bind request on the port takes.
When populating the bhash table, we create the sockets and then bind
the sockets to the same address and port (SO_REUSEADDR and SO_REUSEPORT
are set). When timing how long a bind on the port takes, we bind on a
different address without SO_REUSEPORT set. We do not set SO_REUSEPORT
because we are interested in the case where the bind request does not
go through the tb->fastreuseport path, which is fragile (eg
tb->fastreuseport path does not work if binding with a different uid).
To run the script:
Usage: ./bind_bhash.sh [-6 | -4] [-p port] [-a address]
6: use ipv6
4: use ipv4
port: Port number
address: ip address
Without any arguments, ./bind_bhash.sh defaults to ipv6 using ip address
"2001:0db8:0:f101::1" on port 443.
On my local machine, I see:
ipv4:
before - 0.002317 seconds
with bhash2 - 0.000020 seconds
ipv6:
before - 0.002431 seconds
with bhash2 - 0.000021 seconds
Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Joanne Koong [Mon, 22 Aug 2022 18:10:21 +0000 (11:10 -0700)]
net: Add a bhash2 table hashed by port and address
The current bind hashtable (bhash) is hashed by port only.
In the socket bind path, we have to check for bind conflicts by
traversing the specified port's inet_bind_bucket while holding the
hashbucket's spinlock (see inet_csk_get_port() and
inet_csk_bind_conflict()). In instances where there are tons of
sockets hashed to the same port at different addresses, the bind
conflict check is time-intensive and can cause softirq cpu lockups,
as well as stops new tcp connections since __inet_inherit_port()
also contests for the spinlock.
This patch adds a second bind table, bhash2, that hashes by
port and sk->sk_rcv_saddr (ipv4) and sk->sk_v6_rcv_saddr (ipv6).
Searching the bhash2 table leads to significantly faster conflict
resolution and less time holding the hashbucket spinlock.
Please note a few things:
* There can be the case where the a socket's address changes after it
has been bound. There are two cases where this happens:
1) The case where there is a bind() call on INADDR_ANY (ipv4) or
IPV6_ADDR_ANY (ipv6) and then a connect() call. The kernel will
assign the socket an address when it handles the connect()
2) In inet_sk_reselect_saddr(), which is called when rebuilding the
sk header and a few pre-conditions are met (eg rerouting fails).
In these two cases, we need to update the bhash2 table by removing the
entry for the old address, and add a new entry reflecting the updated
address.
* The bhash2 table must have its own lock, even though concurrent
accesses on the same port are protected by the bhash lock. Bhash2 must
have its own lock to protect against cases where sockets on different
ports hash to different bhash hashbuckets but to the same bhash2
hashbucket.
This brings up a few stipulations:
1) When acquiring both the bhash and the bhash2 lock, the bhash2 lock
will always be acquired after the bhash lock and released before the
bhash lock is released.
2) There are no nested bhash2 hashbucket locks. A bhash2 lock is always
acquired+released before another bhash2 lock is acquired+released.
* The bhash table cannot be superseded by the bhash2 table because for
bind requests on INADDR_ANY (ipv4) or IPV6_ADDR_ANY (ipv6), every socket
bound to that port must be checked for a potential conflict. The bhash
table is the only source of port->socket associations.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Randy Dunlap [Wed, 24 Aug 2022 02:42:16 +0000 (19:42 -0700)]
net: ethernet: ti: davinci_mdio: fix build for mdio bitbang uses
davinci_mdio.c uses mdio bitbang APIs, so it should select
MDIO_BITBANG to prevent build errors.
arm-linux-gnueabi-ld: drivers/net/ethernet/ti/davinci_mdio.o: in function `davinci_mdio_remove':
drivers/net/ethernet/ti/davinci_mdio.c:649: undefined reference to `free_mdio_bitbang'
arm-linux-gnueabi-ld: drivers/net/ethernet/ti/davinci_mdio.o: in function `davinci_mdio_probe':
drivers/net/ethernet/ti/davinci_mdio.c:545: undefined reference to `alloc_mdio_bitbang'
arm-linux-gnueabi-ld: drivers/net/ethernet/ti/davinci_mdio.o: in function `davinci_mdiobb_read':
drivers/net/ethernet/ti/davinci_mdio.c:236: undefined reference to `mdiobb_read'
arm-linux-gnueabi-ld: drivers/net/ethernet/ti/davinci_mdio.o: in function `davinci_mdiobb_write':
drivers/net/ethernet/ti/davinci_mdio.c:253: undefined reference to `mdiobb_write'
Fixes: d04807b80691 ("net: ethernet: ti: davinci_mdio: Add workaround for errata i2329") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Ravi Gunasekaran <r-gunasekaran@ti.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Cc: Sudip Mukherjee (Codethink) <sudipm.mukherjee@gmail.com> Link: https://lore.kernel.org/r/20220824024216.4939-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David S. Miller [Wed, 24 Aug 2022 12:24:10 +0000 (13:24 +0100)]
Merge branch 'r8169-next'
Heiner Kallweit says:
====================
r8169: remove support for few unused chip versions
There's a number of chip versions that apparently never made it to the
mass market. Detection of these chip versions has been disabled for
few kernel versions now and nobody complained. Therefore remove
support for these chip versions.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 24 Aug 2022 12:19:39 +0000 (13:19 +0100)]
Merge tag 'mlx5-updates-2022-08-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
mlx5-updates-2022-08-22
Roi Dayan Says:
===============
Add support for SF tunnel offload
Mlx5 driver only supports VF tunnel offload.
To add support for SF tunnel offload the driver needs to:
1. Add send-to-vport metadata matching rules like done for VFs.
2. Set an indirect table for SF vport, same as VF vport.
info smaller sub functions for better maintainability.
rules from esw init phase to representor load phase.
SFs could be created after esw initialized and thus the send-to-vport
meta rules would not be created for those SFs.
By moving the creation of the rules to representor load phase
we ensure creating the rules also for SFs created later.
===============
Lama Kayal Says:
================
Make flow steering API loosely coupled from mlx5e_priv, in a manner to
introduce more readable and maintainable modules.
Make TC's private, let mlx5e_flow_steering struct be dynamically allocated,
and introduce its API to maintain the code via setters and getters
instead of publicly exposing it.
Introduce flow steering debug macros to provide an elegant finish to the
decoupled flow steering API, where errors related to flow steering shall
be reported via them.
All flow steering related files will drop any coupling to mlx5e_priv,
instead they will get the relevant members as input. Among these,
fs_tt_redirect, fs_tc, and arfs.
================
Jerry Ray [Mon, 22 Aug 2022 21:39:32 +0000 (16:39 -0500)]
micrel: ksz8851: fixes struct pointer issue
Issue found during code review. This bug has no impact as long as the
ks8851_net structure is the first element of the ks8851_net_spi structure.
As long as the offset to the ks8851_net struct is zero, the container_of()
macro is subtracting 0 and therefore no damage done. But if the
ks8851_net_spi struct is ever modified such that the ks8851_net struct
within it is no longer the first element of the struct, then the bug would
manifest itself and cause problems.
struct ks8851_net is contained within ks8851_net_spi.
ks is contained within kss.
kss is the priv_data of the netdev structure.
Signed-off-by: Jerry Ray <jerry.ray@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 22 Aug 2022 21:15:28 +0000 (21:15 +0000)]
tcp: annotate data-race around tcp_md5sig_pool_populated
tcp_md5sig_pool_populated can be read while another thread
changes its value.
The race has no consequence because allocations
are protected with tcp_md5sig_mutex.
This patch adds READ_ONCE() and WRITE_ONCE() to document
the race and silence KCSAN.
Reported-by: Abhishek Shah <abhishek.shah@columbia.edu> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksandr Mazur [Mon, 22 Aug 2022 18:03:15 +0000 (21:03 +0300)]
net: marvell: prestera: implement br_port_locked flag offloading
Both <port> br_port_locked and <lag> interfaces's flag
offloading is supported. No new ABI is being added,
rather existing (port_param_set) API call gets extended.
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
V2:
add missing receipents (linux-kernel, netdev) Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 24 Aug 2022 08:52:04 +0000 (09:52 +0100)]
Merge branch 'j7200-support'
Siddharth Vadapalli says:
====================
J7200: CPSW5G: Add support for QSGMII mode to am65-cpsw driver
Add support for QSGMII mode to am65-cpsw driver.
Change log:
v4-> v5:
1. Move ti,j7200-cpswxg-nuss compatible to the line above the
ti,j721e-cpsw-nuss compatible.
2. Add allOf and move if-then statements within it to allow future if-then
statements to be added easily.
v3 -> v4:
1. Update bindings to disallow ports based on compatible, instead of
adding a new if/then statement for the new compatible.
2. Add Else-If condition for RMII mode in the set of supported interfaces.
Support for RMII mode is already present in the driver and I had
missed out adding a condition for RMII mode in the previous patches.
v2 -> v3:
1. In ti,k3-am654-cpsw-nuss.yaml, restrict if/then statement to port
nodes.
v1 -> v2:
1. Add new compatible for CPSW5G in ti,k3-am654-cpsw-nuss.yaml and extend
properties for new compatible.
2. Add extra_modes member to struct am65_cpsw_pdata to be used for QSGMII
mode by new compatible.
3. Add check for phylink supported modes to ensure that only one phy mode
is advertised as supported.
4. Check if extra_modes supports QSGMII mode in am65_cpsw_nuss_mac_config()
for register write.
5. Add check for assigning port->sgmii_base only when extra_modes is valid.
Siddharth Vadapalli [Mon, 22 Aug 2022 07:01:25 +0000 (12:31 +0530)]
net: ethernet: ti: am65-cpsw: Move phy_set_mode_ext() to correct location
In TI's J7200 SoC CPSW5G ports, each of the 4 ports can be configured
as a QSGMII main or QSGMII-SUB port. This configuration is performed
by phy-gmii-sel driver on invoking the phy_set_mode_ext() function.
It is necessary for the QSGMII main port to be configured before any of
the QSGMII-SUB interfaces are brought up. Currently, the QSGMII-SUB
interfaces come up before the QSGMII main port is configured.
Fix this by moving the call to phy_set_mode_ext() from
am65_cpsw_nuss_ndo_slave_open() to am65_cpsw_nuss_init_slave_ports(),
thereby ensuring that the QSGMII main port is configured before any of
the QSGMII-SUB ports are brought up.
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Siddharth Vadapalli [Mon, 22 Aug 2022 07:01:23 +0000 (12:31 +0530)]
dt-bindings: net: ti: k3-am654-cpsw-nuss: Update bindings for J7200 CPSW5G
Update bindings for TI K3 J7200 SoC which contains 5 ports (4 external
ports) CPSW5G module and add compatible for it.
Changes made:
- Add new compatible ti,j7200-cpswxg-nuss for CPSW5G.
- Extend pattern properties for new compatible.
- Change maximum number of CPSW ports to 4 for new compatible.
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Menglong Dong [Sun, 21 Aug 2022 05:18:58 +0000 (13:18 +0800)]
net: skb: prevent the split of kfree_skb_reason() by gcc
Sometimes, gcc will optimize the function by spliting it to two or
more functions. In this case, kfree_skb_reason() is splited to
kfree_skb_reason and kfree_skb_reason.part.0. However, the
function/tracepoint trace_kfree_skb() in it needs the return address
of kfree_skb_reason().
This split makes the call chains becomes:
kfree_skb_reason() -> kfree_skb_reason.part.0 -> trace_kfree_skb()
which makes the return address that passed to trace_kfree_skb() be
kfree_skb().
Therefore, introduce '__fix_address', which is the combination of
'__noclone' and 'noinline', and apply it to kfree_skb_reason() to
prevent to from being splited or made inline.
(Is it better to simply apply '__noclone oninline' to kfree_skb_reason?
I'm thinking maybe other functions have the same problems)
Meanwhile, wrap 'skb_unref()' with 'unlikely()', as the compiler thinks
it is likely return true and splits kfree_skb_reason().
Signed-off-by: Menglong Dong <imagedong@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 24 Aug 2022 00:43:30 +0000 (17:43 -0700)]
Merge branch 'add-interface-mode-select-and-rmii'
Wei Fang says:
====================
add interface mode select and RMII
From: Wei Fang <wei.fang@nxp.com>
The patches add the below feature support for both TJA1100 and
TJA1101 PHYs cards:
- Add MII and RMII mode support.
- Add REF_CLK input/output support for RMII mode.
====================
TJA110x REF_CLK can be configured as interface reference clock
intput or output when the RMII mode enabled. This patch add the
property to make the REF_CLK can be configurable.
Signed-off-by: Wei Fang <wei.fang@nxp.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
mlxsw: Introduce modular system support by minimal driver
Vadim Pasternak writes:
This patchset adds line cards support in mlxsw_minimal, which is used
for monitoring purposes on BMC systems. The BMC is connected to the
ASIC over I2C bus, unlike the host CPU that is connected to the ASIC
via PCI bus.
The BMC system needs to be notified whenever line cards become active
or inactive, so that, for example, netdevs will be registered /
unregistered by mlxsw_minimal. However, traps cannot be generated
towards the BMC over the I2C bus. To overcome that, the I2C bus driver
(i.e., mlxsw_i2c) registers an handler for an IRQ that is fired upon
specific system wide changes, like line card activation and
deactivation.
The generated event is handled by mlxsw_core, which checks whether
anything changed in the state of available line cards. If a line card
becomes active or inactive, interested parties such as mlxsw_minimal
are notified via their registered line card event callback.
Patch set overview:
Patches #1 is preparations.
Patches #2-#3 extend mlxsw_core with an infrastructure to handle the
previously mentioned system events.
Patch #4 extends the I2C bus driver to register an handler for the IRQ
fired upon specific system wide changes.
Patches #5-#8 gradually add line cards support in mlxsw_minimal by
dynamically registering / unregistering netdevs for ports found on
line cards, whenever a line card becomes active / inactive.
====================
Vadim Pasternak [Sun, 21 Aug 2022 16:20:18 +0000 (18:20 +0200)]
mlxsw: minimal: Extend to support line card dynamic operations
Implement line card operation callbacks got_active() / got_inactive().
The purpose of these callback to create / remove line card ports after
line card is getting active / inactive.
Implement line ports_remove_selected() callback to support line card
un-provisioning flow through 'devlink'.
Add line card operation registration and de-registration APIs.
Add module offset for line card. Offset for main board iz zero.
For line card in slot #n offset is calculated as (#n - 1) multiplied by
maximum modules number.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Pasternak [Sun, 21 Aug 2022 16:20:17 +0000 (18:20 +0200)]
mlxsw: minimal: Extend module to port mapping with slot index
The interfaces for ports found on line card are created and removed
dynamically after line card is getting active or inactive.
Introduce per line card array with module to port mapping.
For each port get 'slot_index' through PMLP register and set port
mapping for the relevant [slot_index][module] entry.
Split module and port allocation into separate routines.
Split per line card port creation and removing into separate routines.
Motivation to re-use these routines for line card operations.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Pasternak [Sun, 21 Aug 2022 16:20:15 +0000 (18:20 +0200)]
mlxsw: minimal: Extend APIs with slot index for modular system support
Add 'slot_index' field to port structure.
Replace zero slot_index argument with 'slot_index' in 'ethtool'
related APIs.
Add 'slot_index' argument to port initialization and
de-initialization related APIs.
Motivation is to prepare minimal driver for modular system support.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Pasternak [Sun, 21 Aug 2022 16:20:14 +0000 (18:20 +0200)]
mlxsw: i2c: Add support for system interrupt handling
Extend i2c bus driver with interrupt handler to support system specific
hotplug events, related to line card state change.
Provide system IRQ line for interrupt handler. IRQ line Id could be
provided through the platform data if available, or could be set to the
default value.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Pasternak [Sun, 21 Aug 2022 16:20:13 +0000 (18:20 +0200)]
mlxsw: core_linecards: Register a system event handler
Add line card system event handler. Register it with core. It is
triggered by system interrupts raised from chassis programmable logic
devices to CPU. The purpose is to handle line card state changes over
I2C bus.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Pasternak [Sun, 21 Aug 2022 16:20:12 +0000 (18:20 +0200)]
mlxsw: core: Add registration APIs for system event handler
The purpose of system event handler is to handle system interrupts.
Such interrupts are raised to CPU from system programmable logic
devices, upon specific system wide changes, like line card activation
and deactivation.
The purpose is to create an alternative to trap mechanism, which
delivers these events to driver over PCI bus, but not available for
the driver working over I2C bus.
Mechanism is system dependent and applicable only for the systems
equipped with programmable devices with custom logic.
Add APIs for event handler registration and un-registration and API
which should be invoked from the registered callbacks when system
interrupt is raised to CPU.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Pasternak [Sun, 21 Aug 2022 16:20:11 +0000 (18:20 +0200)]
mlxsw: core_linecards: Separate line card init and fini flow
Currently, each line card is initialized using the following steps:
1. Initializing its various fields (e.g., slot index).
2. Creating the corresponding devlink object.
3. Enabling events (i.e., traps) for changes in line card status.
4. Querying and processing line card status.
Unlike traps, the IRQ that notifies the CPU about line card status
changes cannot be enabled / disabled on a per line card basis.
If a handler is registered before the line cards are initialized, the
handler risks accessing uninitialized memory.
On the other hand, if the handler is registered after initialization,
we risk missing events. For example, in step 4, the driver might see
that a line card is in ready state and will tell the device to enable
it. When enablement is done, the line card will be activated and the
IRQ will be triggered. Since a handler was not registered, the event
will be missed.
Solve this by splitting the initialization sequence into two steps
(1-2 and 3-4). In a subsequent patch, the handler will be registered
between both steps.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 19 Aug 2022 20:02:21 +0000 (13:02 -0700)]
docs: netlink: basic introduction to Netlink
Provide a bit of a brain dump of netlink related information
as documentation. Hopefully this will be useful to people
trying to navigate implementing YAML based parsing in languages
we won't be able to help with.
I started writing this doc while trying to figure out what
it'd take to widen the applicability of YAML to good old rtnl,
but the doc grew beyond that as it usually happens.
In all honesty a lot of this information is new to me as I usually
follow the "copy an existing example, drink to forget" process
of writing netlink user space, so reviews will be much appreciated.
Jakub Kicinski [Fri, 19 Aug 2022 20:02:20 +0000 (13:02 -0700)]
net: improve and fix netlink kdoc
Subsequent patch will render the kdoc from
include/uapi/linux/netlink.h into Documentation.
We need to fix the warnings. While at it move
the comments on struct nlmsghdr to a proper
kdoc comment.
Sergei Antonov [Sun, 21 Aug 2022 16:08:44 +0000 (19:08 +0300)]
net: ftmac100: set max_mtu to allow DSA overhead setting
In case ftmac100 is used with a DSA switch, Linux wants to set MTU
to 1504 to accommodate for DSA overhead. With the default max_mtu
it leads to the error message:
ftmac100 92000000.mac eth0: error -22 setting MTU to 1504 to include DSA overhead
ftmac100 supports packet length 1518 (MAX_PKT_SIZE constant), so it is
safe to report it in max_mtu.
====================
DSA changes for multiple CPU ports (part 3)
Those who have been following part 1:
https://patchwork.kernel.org/project/netdevbpf/cover/20220511095020.562461-1-vladimir.oltean@nxp.com/
and part 2:
https://patchwork.kernel.org/project/netdevbpf/cover/20220521213743.2735445-1-vladimir.oltean@nxp.com/
will know that I am trying to enable the second internal port pair from
the NXP LS1028A Felix switch for DSA-tagged traffic via "ocelot-8021q".
This series represents part 3 of that effort.
Covered here are some preparations in DSA for handling multiple DSA
masters:
- when changing the tagging protocol via sysfs
- when the masters go down
as well as preparation for monitoring the upper devices of a DSA master
(to support DSA masters under a LAG).
There are also 2 small preparations for the ocelot driver, for the case
where multiple tag_8021q CPU ports are used in a LAG. Both those changes
have to do with PGID forwarding domains.
Compared to v1, the patches were trimmed down to just another
preparation stage, and the UAPI changes were pushed further out to part 4.
https://patchwork.kernel.org/project/netdevbpf/cover/20220523104256.3556016-1-olteanv@gmail.com/
Compared to v2, I had to export a symbol I forgot to
(ocelot_port_teardown_dsa_8021q_cpu), to avoid a build breakage when the
felix and seville drivers are built as modules.
====================