====================
ipv6: avoid atomic fragment on GSO output
When the ipv6 stack outputs a GSO packet, if its gso_size is larger than
the dst MTU, then all segments would be fragmented. However, it is possible
for a GSO packet to have a trailing segment with smaller actual size
than both gso_size and the MTU, which leads to an "atomic
fragment". Atomic fragments are considered harmful in RFC-8021. An
existing report from APNIC also shows that atomic fragments are more
likely to be dropped even though they are equivalent to a no-op [1].
The series contains the following changes:
* drop feature RTAX_FEATURE_ALLFRAG, which has been broken. This helps
simplify other changes in this set.
* refactor __ip6_finish_output code to separate GSO and non-GSO packet
processing, mirroring IPv4 side logic.
* avoid generating atomic fragments on GSO packets.
Yan Zhai [Tue, 24 Oct 2023 14:26:40 +0000 (07:26 -0700)]
ipv6: avoid atomic fragment on GSO packets
When the ipv6 stack outputs a GSO packet, if its gso_size is larger than
the dst MTU, then all segments would be fragmented. However, it is possible
for a GSO packet to have a trailing segment with smaller actual size
than both gso_size and the MTU, which leads to an "atomic
fragment". Atomic fragments are considered harmful in RFC-8021. An
existing report from APNIC also shows that atomic fragments are more
likely to be dropped even though they are equivalent to a no-op [1].
Add an extra check in the GSO slow output path. For each segment from
the original over-sized packet, if it fits with the path MTU, then avoid
generating an atomic fragment.
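The per-segment check described above can be illustrated with a small userspace sketch. This is not the kernel code: classify_segment() and count_unfragmented() are hypothetical names, and lengths are treated as whole-packet sizes, ignoring header details.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome for one segment of an over-sized GSO packet. */
enum seg_action {
	SEG_SEND_AS_IS,	/* fits the path MTU: no fragment header needed */
	SEG_FRAGMENT	/* larger than the path MTU: must be fragmented */
};

/* Decide per segment, instead of fragmenting every segment blindly. */
enum seg_action classify_segment(size_t seg_len, size_t mtu)
{
	assert(mtu > 0);
	return seg_len <= mtu ? SEG_SEND_AS_IS : SEG_FRAGMENT;
}

/*
 * Split total_len into gso_size-sized segments (the last one may be
 * shorter) and count how many would be sent without fragmentation.
 */
size_t count_unfragmented(size_t total_len, size_t gso_size, size_t mtu)
{
	size_t n = 0;

	assert(gso_size > 0);
	while (total_len > 0) {
		size_t seg = total_len < gso_size ? total_len : gso_size;

		if (classify_segment(seg, mtu) == SEG_SEND_AS_IS)
			n++;
		total_len -= seg;
	}
	return n;
}
```

With a 3500-byte payload, a gso_size of 1500 and an MTU of 1400, only the trailing 500-byte segment fits the MTU, so it is the one segment that would otherwise have become an atomic fragment.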
Yan Zhai [Tue, 24 Oct 2023 14:26:37 +0000 (07:26 -0700)]
ipv6: refactor ip6_finish_output for GSO handling
Separate GSO and non-GSO packets handling to make the logic cleaner. For
GSO packets, frag_max_size check can be omitted because it is only
useful for packets defragmented by netfilter hooks. Both local output
and GRO logic won't produce GSO packets when defragment is needed. This
also mirrors what IPv4 side code is doing.
The feature would send packets to the fragmentation path if a box
receives a PMTU value of less than 1280 bytes. However, since commit 9d289715eb5c ("ipv6: stop sending PTB packets for MTU < 1280"), such
messages are simply discarded. The feature flag is not supported
by the iproute2 utility either. In theory one can still manipulate it with a direct
netlink message, but that is not ideal because it was based on the obsoleted
guidance of RFC-2460 (replaced by RFC-8200).
The feature would always test false at the moment, so remove the related
code or mark it as unused.
Jakub Kicinski [Wed, 25 Oct 2023 19:23:36 +0000 (12:23 -0700)]
Merge branch 'mptcp-features-and-fixes-for-v6-7'
Mat Martineau says:
====================
mptcp: Features and fixes for v6.7
Patch 1 adds a configurable timeout for the MPTCP connection when all
subflows are closed, to support break-before-make use cases.
Patch 2 is a fix for a 1-byte error in rx data counters with MPTCP
fastopen connections.
Patch 3 is a minor code cleanup.
Patches 4 & 5 add handling of rcvlowat for MPTCP sockets, with a
prerequisite patch to use a common scaling ratio between TCP and MPTCP.
Patch 6 improves efficiency of memory copying in MPTCP transmit code.
Patch 7 refactors syncing of socket options from the MPTCP socket to
its subflows.
Patches 8 & 9 help the MPTCP packet scheduler perform well by changing
the handling of notsent_lowat in subflows and how available buffer space
is calculated for MPTCP-level sends.
====================
Paolo Abeni [Mon, 23 Oct 2023 20:44:42 +0000 (13:44 -0700)]
mptcp: refactor sndbuf auto-tuning
The MPTCP protocol accounts for the data enqueued on all the subflows
to the main socket send buffer, while the send buffer auto-tuning
algorithm sets the main socket send buffer size as the max size among
the subflows.
That causes bad performance when at least one subflow is sndbuf
limited, e.g. due to very high latency, as the MPTCP scheduler can't
even fill such a buffer.
Change the send-buffer auto-tuning algorithm to compute the main socket
send buffer size as the sum of all the subflows' buffer sizes.
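The difference between the two sizing policies can be sketched in plain C. This is only a model of the arithmetic described above; msk_sndbuf_max() and msk_sndbuf_sum() are hypothetical names, not functions from the MPTCP code.

```c
#include <assert.h>
#include <stddef.h>

/* Old policy (as described above): largest subflow send buffer. */
size_t msk_sndbuf_max(const size_t *subflow_sndbuf, size_t n)
{
	size_t best = 0;

	assert(subflow_sndbuf != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (subflow_sndbuf[i] > best)
			best = subflow_sndbuf[i];
	}
	return best;
}

/* New policy (as described above): sum of all subflow send buffers. */
size_t msk_sndbuf_sum(const size_t *subflow_sndbuf, size_t n)
{
	size_t total = 0;

	assert(subflow_sndbuf != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		total += subflow_sndbuf[i];
	return total;
}
```

Because data queued on every subflow is charged to the main socket, the sum leaves room for each subflow's share, whereas the max can be exhausted by a single sndbuf-limited subflow.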
Paolo Abeni [Mon, 23 Oct 2023 20:44:40 +0000 (13:44 -0700)]
mptcp: consolidate sockopt synchronization
Move the socket option synchronization for active subflows
to subflow creation time. This allows removing the now unused
unlocked variant of such helper.
While at it, clean up the mptcp_subflow_create_socket()
error path a bit.
Paolo Abeni [Mon, 23 Oct 2023 20:44:38 +0000 (13:44 -0700)]
mptcp: give rcvlowat some love
The MPTCP protocol allows setting sk_rcvlowat, but the value there
is currently ignored.
Additionally, the default subflows sk_rcvlowat basically disables per
subflow delayed ack: the MPTCP protocol moves the incoming data from the
subflows into the msk socket as soon as the TCP stack invokes the subflow
data_ready callback. Later, when __tcp_ack_snd_check() takes action,
the subflow-level copied_seq matches rcv_nxt, and that mandates an
immediate ack.
Let the mptcp receive path be aware of such threshold, explicitly tracking
the amount of data available to be read and checking it against sk_rcvlowat in
mptcp_poll() and before waking up readers.
Additionally implement the set_rcvlowat() callback, to properly handle
the rcvbuf auto-tuning on sk_rcvlowat changes.
Finally, to properly handle delayed ack, force the subflow level threshold
to 0 and instead explicitly ask for an immediate ack when the msk level
threshold is not reached.
Paolo Abeni [Mon, 23 Oct 2023 20:44:35 +0000 (13:44 -0700)]
mptcp: properly account fastopen data
Currently the socket level counter aggregating the received data
does not take into account the data received via fastopen.
Address the issue by updating the counter as required.
Fixes: 38967f424b5b ("mptcp: track some aggregate data counters") Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-2-v1-2-9dc60939d371@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Mon, 23 Oct 2023 20:44:34 +0000 (13:44 -0700)]
mptcp: add a new sysctl for make after break timeout
The MPTCP protocol allows sockets with no alive subflows to stay
in ESTABLISHED status for a user-defined timeout, to allow for
later subflow creation.
Currently such timeout is constant - TCP_TIMEWAIT_LEN. Let
user-space configure it via a newly added sysctl, to better cope
with busy servers and simplify (make them faster) the relevant
packetdrill tests.
Note that the new knob does not apply to orphaned MPTCP sockets
waiting for the data_fin handshake completion: they always wait
TCP_TIMEWAIT_LEN.
Bragatheswaran Manickavel [Tue, 24 Oct 2023 18:20:51 +0000 (23:50 +0530)]
amd/pds_core: core: No need for Null pointer check before kfree
kfree()/vfree() internally perform a NULL check on the
pointer handed to them and take no action if it indeed is
NULL. Hence there is no need for a pre-check of the memory
pointer before handing it to kfree()/vfree().
Issue reported by ifnullfree.cocci Coccinelle semantic
patch script.
Signed-off-by: Bragatheswaran Manickavel <bragathemanick0908@gmail.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
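The same pattern exists in userspace C, where free(NULL) is defined to do nothing. The sketch below uses free() as a stand-in for kfree(); struct demo_ctx and demo_ctx_release() are hypothetical names chosen for illustration, not code from the pds_core driver.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object owning one optional heap buffer. */
struct demo_ctx {
	char *buf;
};

/*
 * Release the buffer without an "if (ctx->buf)" pre-check:
 * free(NULL) is a no-op, just as the message says of kfree()/vfree().
 */
void demo_ctx_release(struct demo_ctx *ctx)
{
	assert(ctx != NULL);
	free(ctx->buf);
	ctx->buf = NULL;	/* make repeated release calls harmless */
}
```

Dropping the pre-check removes a branch that only duplicates what the allocator already does.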
David S. Miller [Wed, 25 Oct 2023 09:28:00 +0000 (10:28 +0100)]
Merge branch 'mv88e6xxx-dsa-bindings'
Linus Walleij says:
====================
Create a binding for the Marvell MV88E6xxx DSA switches
The Marvell switches are lacking DT bindings.
I need proper schema checking to add LED support to the
Marvell switch. Just how it is, it can't go on like this.
Some Device Tree fixes are included in the series, these
remove the major and most annoying warnings fallout noise:
some warnings remain, and these are of more serious nature,
such as missing phy-mode. They can be applied individually,
or to the networking tree with the rest of the patches.
Thanks to Andrew Lunn, Vladimir Oltean and Russell King
for excellent review and feedback!
---
Changes in v7:
- Fix the elaborate spacing to satisfy yamllint in the
ports/ethernet-ports requirement.
- Link to v6: https://lore.kernel.org/r/20231024-marvell-88e6152-wan-led-v6-0-993ab0949344@linaro.org
Changes in v6:
- Fix ports/ethernet-ports requirement with proper indenting
(hopefully).
- Link to v5: https://lore.kernel.org/r/20231023-marvell-88e6152-wan-led-v5-0-0e82952015a7@linaro.org
Changes in v5:
- Consistently rename switch@n to ethernet-switch@n in all cleanup patches
- Consistently rename ports to ethernet-ports in all cleanup patches
- Consistently rename all port@n to ethernet-port@n in all cleanup patches
- Consistently rename all phy@n to ethernet-phy@n in all cleanup patches
- Restore the nodename on the Turris MOX which has a U-Boot binary using the
nodename as ABI, put in a blurb warning about this so no-one else tries
to change it in the future.
- Drop dsa.yaml direct references where we reference dsa.yaml#/$defs/ethernet-ports
- Replace the conjured MV88E6xxx example by a better one based on imx6qdl
plus strictly named nodes and added reset-gpios for a more complete example,
and another example using the interrupt controller based on
armada-381-netgear-gs110emx.dts
- Bump lineage to 2008 as Vladimir says the code was developed starting 2008.
- Link to v4: https://lore.kernel.org/r/20231018-marvell-88e6152-wan-led-v4-0-3ee0c67383be@linaro.org
Changes in v4:
- Rebase the series on top of Rob's series
"dt-bindings: net: Child node schema cleanups" (or the hex numbered
ports will not work)
- Fix up a whitespacing error corrupting v3...
- Add a new patch making the generic DSA binding require ports or
ethernet-ports in the switch node.
- Drop any corrections of port@a in the patches.
- Drop oneOf in the compatible enum for mv88e6xxx
- Use ethernet-switch, ethernet-ports and ethernet-phy in the examples
- Transclude the dsa.yaml#/$defs/ethernet-ports define for ports
- Move the DTS and binding fixes first, before the actual bindings,
so they apply without (too many) warnings as fallout.
- Drop stray colon in text.
- Drop example port in the mvusb binding.
- Link to v3: https://lore.kernel.org/r/20231016-marvell-88e6152-wan-led-v3-0-38cd449dfb15@linaro.org
Changes in v3:
- Fix up a related mvusb example in a different binding that
the scripts were complaining about.
- Fix up the wording on internal vs external MDIO buses in the
mv88e6xxx binding document.
- Remove pointless label and put the right rev-mii into the
MV88E6060 schema.
- Link to v2: https://lore.kernel.org/r/20231014-marvell-88e6152-wan-led-v2-0-7fca08b68849@linaro.org
Changes in v2:
- Break out a separate Marvell MV88E6060 binding file. I stand corrected.
- Drop the idea to rely on nodename mdio-external for the external
MDIO bus, keep the compatible, drop patch for the driver.
- Fix more Marvell DT mistakes.
- Fix NXP DT mistakes in a separate patch.
- Fix Marvell ARM64 mistakes in a separate patch.
- Link to v1: https://lore.kernel.org/r/20231013-marvell-88e6152-wan-led-v1-0-0712ba99857c@linaro.org
====================
Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
The Marvell MV88E6060 is one of the oldest DSA switches from
Marvell, and it has DT bindings used in the wild. Let's define
them properly.
It is different enough from the rest of the MV88E6xxx switches
that it deserves its own binding.
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Tue, 24 Oct 2023 13:20:32 +0000 (15:20 +0200)]
dt-bindings: marvell: Rewrite MV88E6xxx in schema
This is an attempt to rewrite the Marvell MV88E6xxx switch bindings
in YAML schema.
The current text binding says:
WARNING: This binding is currently unstable. Do not program it into a
FLASH never to be changed again. Once this binding is stable, this
warning will be removed.
Well that never happened before we switched to YAML markup,
we can't have it like this, what about fixing the mess?
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Tue, 24 Oct 2023 13:20:31 +0000 (15:20 +0200)]
ARM64: dts: marvell: Fix some common switch mistakes
Fix some errors in the Marvell MV88E6xxx switch descriptions:
- The top node had no address size or cells.
- switch0@0 is not OK, should be ethernet-switch@0.
- ports should be ethernet-ports
- port@0 should be ethernet-port@0
- PHYs should be named ethernet-phy@
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Tue, 24 Oct 2023 13:20:30 +0000 (15:20 +0200)]
ARM: dts: nxp: Fix some common switch mistakes
Fix some errors in the Marvell MV88E6xxx switch descriptions:
- switch0@0 is not OK, should be ethernet-switch@0
- ports should be ethernet-ports
- port should be ethernet-port
- phy should be ethernet-phy
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Tue, 24 Oct 2023 13:20:29 +0000 (15:20 +0200)]
ARM: dts: marvell: Fix some common switch mistakes
Fix some errors in the Marvell MV88E6xxx switch descriptions:
- The top node had no address size or cells.
- switch0@0 is not OK, should be ethernet-switch@0.
- The ports node should be named ethernet-ports
- The ethernet-ports node should have port@0 etc children, no
plural "ports" in the children.
- Ports should be named ethernet-port@0 etc
- PHYs should be named ethernet-phy@0 etc
This serves as an example of fixes needed for introducing a
schema for the bindings, but the patch can simply be applied.
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Tue, 24 Oct 2023 13:20:28 +0000 (15:20 +0200)]
dt-bindings: net: mvusb: Fix up DSA example
When adding a proper schema for the Marvell mv88e6xxx switch,
the scripts start complaining about this embedded example:
dtschema/dtc warnings/errors:
net/marvell,mvusb.example.dtb: switch@0: ports: '#address-cells'
is a required property
from schema $id: http://devicetree.org/schemas/net/dsa/marvell,mv88e6xxx.yaml#
net/marvell,mvusb.example.dtb: switch@0: ports: '#size-cells'
is a required property
from schema $id: http://devicetree.org/schemas/net/dsa/marvell,mv88e6xxx.yaml#
Fix this up by extending the example with those properties in
the ports node.
While we are at it, rename "ports" to "ethernet-ports" and rename
"switch" to "ethernet-switch" as this is recommended practice.
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Tue, 24 Oct 2023 13:20:27 +0000 (15:20 +0200)]
dt-bindings: net: dsa: Require ports or ethernet-ports
Bindings using dsa.yaml#/$defs/ethernet-ports specify that
a DSA switch node needs to have a ports or ethernet-ports
subnode, and that is actually required, so add requirements
using oneOf.
Suggested-by: Rob Herring <robh@kernel.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Tue, 24 Oct 2023 11:05:51 +0000 (13:05 +0200)]
sched: act_ct: switch to per-action label counting
net->ct.labels_used was meant to convey 'number of ip/nftables rules
that need the label extension allocated'.
act_ct enables this for each net namespace, which voids all attempts
to avoid ct->ext allocation when possible.
Move this increment to the control plane to request label extension
space allocation only when it's needed.
Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Tue, 24 Oct 2023 03:20:34 +0000 (11:20 +0800)]
net: hns3: add some link modes for hisilicon device
Add two capability bits, HCLGE_SUPPORT_50G_R1_BIT and
HCLGE_SUPPORT_100G_R2_BIT, and the corresponding link modes.
Signed-off-by: Hao Chen <chenhao418@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel [Mon, 23 Oct 2023 09:33:38 +0000 (11:33 +0200)]
net: dsa: microchip: ksz9477: add Wake on LAN support
Add WoL support for KSZ9477 family of switches. This code was tested on
KSZ8563 chip.
KSZ9477 family of switches supports multiple PHY events:
- wake on Link Up
- wake on Energy Detect.
Since the current UAPI can't differentiate between these PHY events, map all
of them to WAKE_PHY.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel [Mon, 23 Oct 2023 09:33:37 +0000 (11:33 +0200)]
net: dsa: microchip: use wakeup-source DT property to enable PME output
KSZ switches with WoL support signal wake events over the PME pin. If this
pin is attached to some external PMIC or System Controller that can't be
described as a GPIO, the only way to describe it in the devicetree is to
use the wakeup-source property. So, add support for this property and enable
the PME switch output if this property is present.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel [Mon, 23 Oct 2023 09:33:35 +0000 (11:33 +0200)]
net: dsa: microchip: Add missing MAC address register offset for ksz8863
Add the missing offset for the global MAC address register
(REG_SW_MAC_ADDR) for the ksz8863 family of switches.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Justin Stitt [Mon, 23 Oct 2023 19:39:39 +0000 (19:39 +0000)]
s390/qeth: replace deprecated strncpy with strscpy
strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.
We expect new_entry->dbf_name to be NUL-terminated based on its use with
strcmp():
| if (strcmp(entry->dbf_name, name) == 0) {
Moreover, NUL-padding is not required as new_entry is kzalloc'd just
before this assignment:
| new_entry = kzalloc(sizeof(struct qeth_dbf_entry), GFP_KERNEL);
... rendering any future NUL-byte assignments (like the ones strncpy()
does) redundant.
Considering the above, a suitable replacement is `strscpy` [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.
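The semantics relied on above can be modelled in userspace. The function below is a simplified stand-in written for illustration, not the kernel's strscpy() implementation: model_strscpy() is a hypothetical name, and it returns a long where the kernel helper returns ssize_t.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Userspace model of the behavior described above: always NUL-terminate
 * the destination (when size > 0), never NUL-pad the remainder, and
 * report truncation with -E2BIG instead of silently cutting the string.
 * Returns the number of characters copied, not counting the NUL.
 */
long model_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	assert(dst != NULL && src != NULL);
	if (size == 0)
		return -E2BIG;

	/* Copy at most size - 1 characters, leaving room for the NUL. */
	for (i = 0; i + 1 < size && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	return src[i] == '\0' ? (long)i : -E2BIG;
}
```

Unlike strncpy(), a too-long source still leaves a terminated string in the destination, and the negative return value lets the caller detect the truncation.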
Justin Stitt [Mon, 23 Oct 2023 19:35:07 +0000 (19:35 +0000)]
s390/ctcm: replace deprecated strncpy with strscpy
strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.
We expect chid to be NUL-terminated based on its use with format
strings:
Moreover, NUL-padding is not required as it is _only_ used in this one
instance with a format string.
Considering the above, a suitable replacement is `strscpy` [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.
We can also drop the +1 from chid's declaration as we no longer need to
be cautious about leaving a spot for a NUL-byte. Let's use the more
idiomatic strscpy usage of (dest, src, sizeof(dest)) as this more
closely ties the destination buffer to the length.
Lorenzo Bianconi [Mon, 23 Oct 2023 22:00:19 +0000 (00:00 +0200)]
net: ethernet: mtk_wed: fix firmware loading for MT7986 SoC
The WED mcu firmware does not contain all the memory regions defined in
the dts reserved_memory node (e.g. MT7986 WED firmware does not contain
cpu-boot region).
Reverse the mtk_wed_mcu_run_firmware() logic to check all the fw
sections are defined in the dts reserved_memory node.
Fixes: c6d961aeaa77 ("net: ethernet: mtk_wed: move mem_region array out of mtk_wed_mcu_load_firmware") Tested-by: Frank Wunderlich <frank-w@public-files.de> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/d983cbfe8ea562fef9264de8f0c501f7d5705bd5.1698098381.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
net: ethernet: renesas: infrastructure preparations for upcoming driver
Before we upstream a new driver, Niklas and I thought that a few
cleanups for Kconfig/Makefile will help readability and maintainability.
Here they are, looking forward to comments.
====================
Wolfram Sang [Sun, 22 Oct 2023 20:53:16 +0000 (22:53 +0200)]
net: ethernet: renesas: drop SoC names in Kconfig
Mentioning SoCs in Kconfig descriptions tends to get stale (e.g. RAVB is
missing RZV2M) or imprecise (e.g. SH_ETH is not available on all
R8A779x). Drop them instead of providing vague information. Improve the
file description a tad while here.
Wolfram Sang [Sun, 22 Oct 2023 20:53:15 +0000 (22:53 +0200)]
net: ethernet: renesas: group entries in Makefile
A new Renesas driver shall be added soon. Prepare the Makefile by
grouping the specific objects to the Kconfig symbol for better
readability. Improve the file description a tad while here.
Replace ifconfig with the ip command; on a system where ifconfig is
not available this script will not work correctly.
Test result with this patchset:
sudo make TARGETS="net" kselftest
....
TAP version 13
1..1
timeout set to 1500
selftests: net: route_localnet.sh
run arp_announce test
net.ipv4.conf.veth0.route_localnet = 1
net.ipv4.conf.veth1.route_localnet = 1
net.ipv4.conf.veth0.arp_announce = 2
net.ipv4.conf.veth1.arp_announce = 2
PING 127.25.3.14 (127.25.3.14) from 127.25.3.4 veth0: 56(84)
bytes of data.
64 bytes from 127.25.3.14: icmp_seq=1 ttl=64 time=0.038 ms
64 bytes from 127.25.3.14: icmp_seq=2 ttl=64 time=0.068 ms
64 bytes from 127.25.3.14: icmp_seq=3 ttl=64 time=0.068 ms
64 bytes from 127.25.3.14: icmp_seq=4 ttl=64 time=0.068 ms
64 bytes from 127.25.3.14: icmp_seq=5 ttl=64 time=0.068 ms
--- 127.25.3.14 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4073ms
rtt min/avg/max/mdev = 0.038/0.062/0.068/0.012 ms
ok
run arp_ignore test
net.ipv4.conf.veth0.route_localnet = 1
net.ipv4.conf.veth1.route_localnet = 1
net.ipv4.conf.veth0.arp_ignore = 3
net.ipv4.conf.veth1.arp_ignore = 3
PING 127.25.3.14 (127.25.3.14) from 127.25.3.4 veth0: 56(84)
bytes of data.
64 bytes from 127.25.3.14: icmp_seq=1 ttl=64 time=0.032 ms
64 bytes from 127.25.3.14: icmp_seq=2 ttl=64 time=0.065 ms
64 bytes from 127.25.3.14: icmp_seq=3 ttl=64 time=0.066 ms
64 bytes from 127.25.3.14: icmp_seq=4 ttl=64 time=0.065 ms
64 bytes from 127.25.3.14: icmp_seq=5 ttl=64 time=0.065 ms
--- 127.25.3.14 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4092ms
rtt min/avg/max/mdev = 0.032/0.058/0.066/0.013 ms
ok
ok 1 selftests: net: route_localnet.sh
...
====================
Switch DSA to inclusive terminology
One of the action items following Netconf'23 is to switch subsystems to
use inclusive terminology. DSA has been making extensive use of the
"master" and "slave" words which are now replaced by "conduit" and
"user" respectively.
====================
Florian Fainelli [Mon, 23 Oct 2023 18:17:28 +0000 (11:17 -0700)]
net: dsa: Use conduit and user terms
Use more inclusive terms throughout the DSA subsystem by moving away
from "master" which is replaced by "conduit" and "slave" which is
replaced by "user". No functional changes.
Acked-by: Rob Herring <robh@kernel.org> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20231023181729.1191071-2-florian.fainelli@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Compiler warns about a possible format-overflow in tsnep_request_irq():
drivers/net/ethernet/engleder/tsnep_main.c:884:55: warning: 'sprintf' may write a terminating nul past the end of the destination [-Wformat-overflow=]
sprintf(queue->name, "%s-rx-%d", name,
^
drivers/net/ethernet/engleder/tsnep_main.c:881:55: warning: 'sprintf' may write a terminating nul past the end of the destination [-Wformat-overflow=]
sprintf(queue->name, "%s-tx-%d", name,
^
drivers/net/ethernet/engleder/tsnep_main.c:878:49: warning: '-txrx-' directive writing 6 bytes into a region of size between 5 and 25 [-Wformat-overflow=]
sprintf(queue->name, "%s-txrx-%d", name,
^~~~~~
Actually, overflow cannot happen. Name is limited to IFNAMSIZ, because
netdev_name() is called during ndo_open(). queue_index is a single character,
because fewer than 10 queues are supported.
Fix the warning with snprintf(). Additionally increase the buffer to 32 bytes,
because those 7 additional bytes were unused anyway.
Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202310182028.vmDthIUa-lkp@intel.com/ Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231023183856.58373-1-gerhard@engleder-embedded.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
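A bounded formatting helper in the spirit of this fix might look as follows. It is a userspace sketch only: demo_format_queue_name() and DEMO_QUEUE_NAME_LEN are hypothetical names, and the 32-byte size is simply the value mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_QUEUE_NAME_LEN 32	/* buffer size mentioned in the message */

/*
 * Format "<ifname>-<kind>-<index>" into buf without ever writing past
 * buf_len. Returns the string length on success, or -1 if the name had
 * to be truncated or formatting failed.
 */
int demo_format_queue_name(char *buf, size_t buf_len, const char *ifname,
			   const char *kind, int index)
{
	int n;

	assert(buf != NULL && buf_len > 0);
	n = snprintf(buf, buf_len, "%s-%s-%d", ifname, kind, index);
	if (n < 0 || (size_t)n >= buf_len)
		return -1;
	return n;
}
```

Passing the buffer size to snprintf() gives the compiler the bound it could not prove for sprintf(), which is what silences -Wformat-overflow.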
====================
net: deduplicate netdev name allocation
After recent fixes we have even more duplicated code in netdev name
allocation helpers. There are two complications in this code.
First, __dev_alloc_name() clobbers its output arg even if allocation
fails, forcing callers to do extra copies. Second, as our experience with
commit 55a5ec9b7710 ("Revert "net: core: dev_get_valid_name is now the same as dev_alloc_name_ns"") and
commit 029b6d140550 ("Revert "net: core: maybe return -EEXIST in __dev_alloc_name"")
taught us, user space is very sensitive to the exact error codes.
Align the callers of __dev_alloc_name(), and remove some of its
complexity.
Jakub Kicinski [Mon, 23 Oct 2023 15:23:44 +0000 (08:23 -0700)]
net: trust the bitmap in __dev_alloc_name()
Prior to restructuring, __dev_alloc_name() handled both printf
and non-printf names. In a clever attempt at code reuse it
always prints the name into a buffer and checks if it's
a duplicate.
Trust the bitmap, and return an error if it's full.
This shrinks the possible ID space by one from 32K to 32K - 1,
as previously the max value would have been tried as a valid ID.
It seems very unlikely that anyone would care as we heard
no requests to increase the max beyond 32k.
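The "trust the bitmap" idea can be sketched in userspace as a search for the first clear bit that fails outright when every bit is set. demo_first_free_id() is a hypothetical helper, and the -1 return is a placeholder; the error code the kernel function actually returns is not stated here.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Return the first ID whose bit is clear in the in-use bitmap, or -1
 * when all max_ids bits are set. There is no follow-up attempt to
 * print and re-check a name: a full bitmap is treated as final.
 */
int demo_first_free_id(const unsigned char *inuse, size_t max_ids)
{
	assert(inuse != NULL);
	for (size_t id = 0; id < max_ids; id++) {
		unsigned char mask = (unsigned char)(1u << (id % CHAR_BIT));

		if (!(inuse[id / CHAR_BIT] & mask))
			return (int)id;
	}
	return -1;
}
```

Returning an error as soon as the bitmap is full is what removes the one extra ID that used to be tried past the end of the bitmap.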
Jakub Kicinski [Mon, 23 Oct 2023 15:23:43 +0000 (08:23 -0700)]
net: reduce indentation of __dev_alloc_name()
All callers of __dev_alloc_name() go through dev_prep_valid_name(),
which handles the non-printf case. Focus __dev_alloc_name() on
the sprintf case and remove the indentation level.
Minor functional change of returning -EINVAL if % is not found,
which should now never happen.
Jakub Kicinski [Mon, 23 Oct 2023 15:23:42 +0000 (08:23 -0700)]
net: make dev_alloc_name() call dev_prep_valid_name()
__dev_alloc_name() handles both the sprintf and non-sprintf
target names. This complicates the code.
dev_prep_valid_name() already handles the non-sprintf case
before calling __dev_alloc_name(), so make the only other caller
also go through dev_prep_valid_name(). This way we can drop
the non-sprintf handling in __dev_alloc_name() in one of
the next changes.
commit 55a5ec9b7710 ("Revert "net: core: dev_get_valid_name is now the same as dev_alloc_name_ns"") and
commit 029b6d140550 ("Revert "net: core: maybe return -EEXIST in __dev_alloc_name"")
tell us that we can't start returning -EEXIST from dev_alloc_name()
on name duplicates. Bite the bullet and pass the expected errno to
dev_prep_valid_name().
dev_prep_valid_name() must now propagate out the allocated id
for printf names.
Jakub Kicinski [Mon, 23 Oct 2023 15:23:41 +0000 (08:23 -0700)]
net: don't use input buffer of __dev_alloc_name() as a scratch space
Callers of __dev_alloc_name() want to pass dev->name as
the output buffer. Make __dev_alloc_name() not clobber
that buffer on failure, and remove the workarounds
in callers.
dev_alloc_name_ns() is now completely unnecessary.
The extra strscpy() added here will be gone by the end
of the patch series.
Davide Caratti [Mon, 23 Oct 2023 18:17:07 +0000 (11:17 -0700)]
net: mptcp: convert netlink from small_ops to ops
In the current MPTCP control plane, all operations use a netlink
attribute of the same type "MPTCP_PM_ATTR". However, add/del/get/flush
operations only parse the first element in the message (the one that
describes MPTCP endpoints, which was named MPTCP_PM_ATTR_ADDR and is
mostly used in ADD_ADDR operations; the similarity of "attr",
"addr" and "add" might cause some confusion to human readers).
Convert MPTCP from 'small_ops' to 'ops', thus allowing different attributes
for each single operation, which hopefully makes all this clearer to human
readers.
- use a separate attribute set for add/del/get/flush address operations,
binary compatible with the existing one, to store the endpoint address.
MPTCP_PM_ENDPOINT_ADDR is added to the uAPI (with the same value as
MPTCP_PM_ATTR_ADDR) for these operations.
- convert mptcp_pm_ops[] and add policy files accordingly.
This prepares the MPTCP control plane to be described as a YAML spec.
Liu Jian [Mon, 23 Oct 2023 06:47:29 +0000 (14:47 +0800)]
net: sched: sch_qfq: Use non-work-conserving warning handler
A helper function for printing non-work-conserving alarms is added in
commit b00355db3f88 ("pkt_sched: sch_hfsc: sch_htb: Add non-work-conserving
warning handler."). In this commit, use qdisc_warn_nonwc() instead of
WARN_ONCE() to handle the non-work-conserving warning in qfq Qdisc.
Adam Ford [Sun, 22 Oct 2023 15:19:11 +0000 (10:19 -0500)]
net: ethernet: davinci_emac: Use MAC Address from Device Tree
Currently there is a device tree entry called "local-mac-address"
which can be filled by the bootloader or manually set. This is
useful when the user does not want to use the MAC address
programmed into the SoC.
Currently, the davinci_emac reads the MAC from the DT, copies
it from pdata->mac_addr to priv->mac_addr, then blindly overwrites
it by reading from registers in the SoC, and falls back to a
random MAC if it's still not valid. This completely ignores any
MAC address in the device tree.
In order to use the local-mac-address, check to see if the contents
of priv->mac_addr are valid before falling back to reading from the
SoC when the MAC address is not valid.
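The resulting selection order (device tree first, then SoC registers, then random) can be sketched as below. This is a userspace model under stated assumptions: demo_mac_is_valid() and demo_pick_mac() are hypothetical names, and validity is taken to mean "not all-zero and not multicast".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_ALEN 6

enum demo_mac_source {
	DEMO_MAC_FROM_DT,	/* local-mac-address from the device tree */
	DEMO_MAC_FROM_SOC,	/* address read back from SoC registers */
	DEMO_MAC_RANDOM		/* neither was valid: caller generates one */
};

/* Assumed validity rule: not all-zero and not a multicast address. */
bool demo_mac_is_valid(const uint8_t *mac)
{
	static const uint8_t zero[DEMO_ETH_ALEN];

	return !(mac[0] & 0x01) && memcmp(mac, zero, DEMO_ETH_ALEN) != 0;
}

/* Prefer the device tree MAC and only then fall back to the SoC. */
enum demo_mac_source demo_pick_mac(const uint8_t *dt_mac,
				   const uint8_t *soc_mac, uint8_t *out)
{
	assert(dt_mac != NULL && soc_mac != NULL && out != NULL);
	if (demo_mac_is_valid(dt_mac)) {
		memcpy(out, dt_mac, DEMO_ETH_ALEN);
		return DEMO_MAC_FROM_DT;
	}
	if (demo_mac_is_valid(soc_mac)) {
		memcpy(out, soc_mac, DEMO_ETH_ALEN);
		return DEMO_MAC_FROM_SOC;
	}
	return DEMO_MAC_RANDOM;	/* out is left untouched */
}
```

Checking the device tree value before touching the SoC registers is the whole point of the change: the previous order overwrote a perfectly valid configured address.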
Abel Wu [Thu, 19 Oct 2023 12:00:26 +0000 (20:00 +0800)]
sock: Ignore memcg pressure heuristics when raising allocated
Before commit e1aab161e013 ("socket: initial cgroup code.") made
sockets aware of net-memcg's memory pressure, a socket was allowed
to raise its memory usage if it was below average, even when under
the protocol's pressure. This provides fairness among the sockets of
the same protocol.
That commit changed this, because the heuristic also takes effect
when only memcg is under pressure, which makes no sense.
So revert that behavior.
After reverting, __sk_mem_raise_allocated() no longer considers
memcg's pressure. As memcgs are isolated from each other w.r.t.
memory accounting, consuming one's budget won't affect others.
So, except in the places where buffer sizes need to be tuned,
allow workloads to use the memory they are provisioned.
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231019120026.42215-3-wuyun.abel@bytedance.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Abel Wu [Thu, 19 Oct 2023 12:00:25 +0000 (20:00 +0800)]
sock: Doc behaviors for pressure heurisitics
There are now two accounting infrastructures for skmem, while the
heuristics in __sk_mem_raise_allocated() were actually introduced
before memcg was born.
Add some comments to clarify whether they can be applied to both
infrastructures or not.
Suggested-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231019120026.42215-2-wuyun.abel@bytedance.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Vishvambar Panth S [Fri, 20 Oct 2023 18:58:01 +0000 (00:28 +0530)]
net: microchip: lan743x: improve throughput with rx timestamp config
Currently all RX frames are timestamped which results in a performance
penalty when timestamping is not needed. The default is now being
changed to not timestamp any Rx frames (HWTSTAMP_FILTER_NONE), but
support has been added to allow changing the desired RX timestamping
mode (HWTSTAMP_FILTER_ALL - which was the previous setting and
HWTSTAMP_FILTER_PTP_V2_EVENT are now supported) using
SIOCSHWTSTAMP. All settings were tested using the hwstamp_ctl application.
It is also noted that ptp4l, when started, preconfigures the device to
timestamp using HWTSTAMP_FILTER_PTP_V2_EVENT, so this driver continues
to work properly "out of the box".
Test setup: x64 PC with LAN7430 ---> x64 PC as partner
====================
introduce page_pool_alloc() related API
In [1] & [2] & [3], there are use cases for veth and virtio_net
to use frag support in page pool to reduce memory usage, and they
may request different frag sizes depending on the head/tail
room space for xdp_frame/shinfo and the mtu/packet size. When the
requested frag size is so large that a single page cannot
be split into more than one frag, using frag support only adds a
performance penalty because of the extra frag count handling
for frag support.
So this patchset provides a page pool API for the driver to
allocate memory with the least memory utilization and performance
penalty when it doesn't know the size of memory it needs
beforehand.
Yunsheng Lin [Fri, 20 Oct 2023 09:59:50 +0000 (17:59 +0800)]
page_pool: introduce page_pool_alloc() API
Currently page pool supports the below use cases:
use case 1: allocate page without page splitting using
page_pool_alloc_pages() API if the driver knows
that the memory it needs is always bigger than
half of the page allocated from page pool.
use case 2: allocate page frag with page splitting using
page_pool_alloc_frag() API if the driver knows
that the memory it needs is always smaller than
or equal to half of the page allocated from
page pool.
There is an emerging use case [1] & [2] that is a mix of the
above two cases: the driver doesn't know the size of memory it
needs beforehand, so the driver may use something like below to
allocate memory with the least memory utilization and performance
penalty:
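The snippet the message refers to is not reproduced in this log. A sketch of the kind of driver-side branching it describes, inferred from use cases 1 and 2 above and modelled in plain C, could look as follows. enum alloc_kind and choose_alloc() are hypothetical names; in a real driver the two branches would call page_pool_alloc_pages() and page_pool_alloc_frag().

```c
#include <stddef.h>

enum alloc_kind { ALLOC_WHOLE_PAGE, ALLOC_PAGE_FRAG };

/* Hypothetical helper: decide between an unsplit page and a fragment.
 * If two requests of this size cannot share one page of max_size
 * bytes, splitting only adds frag count overhead. */
enum alloc_kind choose_alloc(size_t size, size_t max_size)
{
	if (size * 2 > max_size)
		return ALLOC_WHOLE_PAGE;	/* would use page_pool_alloc_pages() */
	return ALLOC_PAGE_FRAG;		/* would use page_pool_alloc_frag() */
}
```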
To avoid the driver doing something like above, add the
page_pool_alloc() API to support the above use case, and report
the true size of memory that is actually allocated by writing it
back to the driver through '*size', in order to avoid exacerbating
the truesize underestimate problem.
Rename page_pool_free() which is used in the destroy process to
__page_pool_destroy() to avoid confusion with the newly added
API.
Yunsheng Lin [Fri, 20 Oct 2023 09:59:49 +0000 (17:59 +0800)]
page_pool: remove PP_FLAG_PAGE_FRAG
PP_FLAG_PAGE_FRAG is not really needed after pp_frag_count
handling is unified and page_pool_alloc_frag() is supported
in 32-bit arch with 64-bit DMA, so remove it.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
CC: Lorenzo Bianconi <lorenzo@kernel.org>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: Liang Chen <liangchen.linux@gmail.com>
CC: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20231020095952.11055-3-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yunsheng Lin [Fri, 20 Oct 2023 09:59:48 +0000 (17:59 +0800)]
page_pool: unify frag_count handling in page_pool_is_last_frag()
Currently when page_pool_create() is called with the
PP_FLAG_PAGE_FRAG flag, page_pool_alloc_pages() is only
allowed to be called under the below constraints:
1. page_pool_fragment_page() needs to be called to set up
page->pp_frag_count immediately.
2. page_pool_defrag_page() often needs to be called to drain
page->pp_frag_count when no user will be holding on to
that page any more.
Those constraints exist in order to support splitting a page
into multiple fragments.
And those constraints have some overhead because of the
cache line dirtying/bouncing and atomic update.
Those constraints are unavoidable when we need a page to be
split into more than one fragment, but there are also cases
where we want to avoid the above constraints and their
overhead, because a page can't be split as it can only hold
one fragment of the size requested by the user, depending on
the use case:
use case 1: allocate page without page splitting.
use case 2: allocate page with page splitting.
use case 3: allocate page with or without page splitting
depending on the fragment size.
Currently page pool only provides the page_pool_alloc_pages() and
page_pool_alloc_frag() APIs to enable 1 & 2 separately,
so we cannot use a combination of 1 & 2 to enable 3; it is
not possible yet because of the per page_pool flag
PP_FLAG_PAGE_FRAG.
So in order to allow allocating an unsplit page without the
overhead of a split page while still allowing allocation of split
pages, we need to remove the per page_pool flag in
page_pool_is_last_frag(). As best as I can think of, there
are two methods, as below:
1. Add a per page flag/bit to indicate whether a page is split or
not, which means we might need to update that flag/bit
every time the page is recycled, dirtying the cache line
of 'struct page' for use case 1.
2. Unify the page->pp_frag_count handling for both split and
unsplit pages by assuming all pages in the page pool are split
into one big fragment initially.
As page pool already supports use case 1 without dirtying the
cache line of 'struct page' whenever a page is recyclable, we
need to support the above use case 3 with minimal overhead,
especially not adding any noticeable overhead for use case 1,
and we are already doing an optimization by not updating
pp_frag_count in page_pool_defrag_page() for the last fragment
user, this patch chooses to unify the pp_frag_count handling
to support the above use case 3.
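The unified handling can be illustrated with a small userspace model using C11 atomics. This is a sketch under assumptions, not the kernel implementation: struct model_page and the model_* functions are hypothetical stand-ins for 'struct page', page_pool_fragment_page(), page_pool_defrag_page() and page_pool_is_last_frag().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for the pp_frag_count field of 'struct page'. */
struct model_page {
	atomic_long frag_count;
};

/* Every page starts out as one big fragment, split or not. */
void model_page_init(struct model_page *p)
{
	atomic_init(&p->frag_count, 1);
}

/* Splitting a page records how many fragments were handed out. */
void model_page_split(struct model_page *p, long nr)
{
	atomic_store(&p->frag_count, nr);
}

/* Drop nr references and return how many remain. When the caller
 * holds all remaining references, the count is left untouched, so
 * the last user does not pay for an atomic read-modify-write. */
long model_page_defrag(struct model_page *p, long nr)
{
	if (atomic_load(&p->frag_count) == nr)
		return 0;
	return atomic_fetch_sub(&p->frag_count, nr) - nr;
}

bool model_is_last_frag(struct model_page *p)
{
	return model_page_defrag(p, 1) == 0;
}
```

In this model an unsplit page is always reported as the last fragment without writing to the counter, which matches the goal of adding no noticeable overhead for use case 1.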
With this patch applied, micro-benchmark testing in [1] shows
no noticeable performance degradation and gives some
justification for unifying the frag_count handling.
Jakub Kicinski [Mon, 23 Oct 2023 23:44:18 +0000 (16:44 -0700)]
Merge tag 'for-net-next-2023-10-23' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Luiz Augusto von Dentz says:
====================
bluetooth-next pull request for net-next:
- Add 0bda:b85b for Fn-Link RTL8852BE
- ISO: Many fixes for broadcast support
- Mark bcm4378/bcm4387 as BROKEN_LE_CODED
- Add support ITTIM PE50-M75C
- Add RTW8852BE device 13d3:3570
- Add support for QCA2066
- Add support for Intel Misty Peak - 8087:0038
* tag 'for-net-next-2023-10-23' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next:
Bluetooth: hci_sync: Fix Opcode prints in bt_dev_dbg/err
Bluetooth: Fix double free in hci_conn_cleanup
Bluetooth: btmtksdio: enable bluetooth wakeup in system suspend
Bluetooth: btusb: Add 0bda:b85b for Fn-Link RTL8852BE
Bluetooth: hci_bcm4377: Mark bcm4378/bcm4387 as BROKEN_LE_CODED
Bluetooth: ISO: Copy BASE if service data matches EIR_BAA_SERVICE_UUID
Bluetooth: Make handle of hci_conn be unique
Bluetooth: btusb: Add date->evt_skb is NULL check
Bluetooth: ISO: Fix bcast listener cleanup
Bluetooth: msft: __hci_cmd_sync() doesn't return NULL
Bluetooth: ISO: Match QoS adv handle with BIG handle
Bluetooth: ISO: Allow binding a bcast listener to 0 bises
Bluetooth: btusb: Add RTW8852BE device 13d3:3570 to device tables
Bluetooth: qca: add support for QCA2066
Bluetooth: ISO: Set CIS bit only for devices with CIS support
Bluetooth: Add support for Intel Misty Peak - 8087:0038
Bluetooth: Add support ITTIM PE50-M75C
Bluetooth: ISO: Pass BIG encryption info through QoS
Bluetooth: ISO: Fix BIS cleanup
====================
====================
devlink: finish conversion to generated split_ops
This patchset converts the remaining genetlink commands to generated
split_ops and removes the existing small_ops arrays entirely,
along with the shared netlink attribute policy.
Patches #1-#6 are just small preparations and small fixes in multiple
places. Note that a couple of patches contain the "Fixes"
tag, but there is no need to put them into the -net tree.
Patch #7 is a simple rename preparation.
Patch #8 is the main one in this set and adds the actual definitions of
cmds into the yaml file.
Patches #9-#10 finalize the change, removing bits that are no longer in
use.
====================
Jiri Pirko [Sat, 21 Oct 2023 11:27:09 +0000 (13:27 +0200)]
netlink: specs: devlink: add the remaining command to generate complete split_ops
Currently, some of the commands are not described in the devlink yaml file
and are manually filled in net/devlink/netlink.c in small_ops. To make
them all part of split_ops, add definitions of the rest of the commands
along with the needed attributes and enums.
Note that this focuses on the kernel side. The requests are fully
described in order to generate split_ops along with policies.
A follow-up will describe the replies in order to make the userspace
helpers complete.
Jiri Pirko [Sat, 21 Oct 2023 11:27:08 +0000 (13:27 +0200)]
devlink: rename netlink callback to be aligned with the generated ones
All remaining doit and dumpit netlink callback functions are going to be
used by generated split ops. They expect a certain name format. Rename the
callbacks to be aligned with the generated names.
Jiri Pirko [Sat, 21 Oct 2023 11:27:04 +0000 (13:27 +0200)]
tools: ynl-gen: render rsp_parse() helpers if cmd has only dump op
Due to the check in the RenderInfo class constructor, the type_consistent
flag is set to False to avoid rendering the same response parsing
helper for do and dump ops. However, in case there is no do, the helper
needs to be rendered for the dump op. So split the check to achieve that.
Jiri Pirko [Sat, 21 Oct 2023 11:27:03 +0000 (13:27 +0200)]
tools: ynl-gen: introduce support for bitfield32 attribute type
Introduce support for attribute type bitfield32.
Note that since the generated code works with struct nla_bitfield32,
the generator adds netlink.h to the list of includes for userspace
headers in case any bitfield32 is present.
Note that this is added only to genetlink-legacy scheme as requested
by Jakub Kicinski.
Jiri Pirko [Sat, 21 Oct 2023 11:27:02 +0000 (13:27 +0200)]
genetlink: don't merge dumpit split op for different cmds into single iter
Currently, split ops of doit and dumpit are merged into a single iter
item when they are adjacent. However, there is no guarantee that the
dumpit op is for the same cmd as the doit op.
Fix this by checking if cmd is the same for both.
This problem does not occur in existing families.
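The check can be modelled outside the kernel as below. This is an illustrative sketch only: struct split_op, struct merged_op, the OP_DO/OP_DUMP flags and merge_split_ops() are hypothetical names and do not correspond to the genetlink data structures.

```c
#include <stddef.h>

#define OP_DO	0x1
#define OP_DUMP	0x2

struct split_op {
	int cmd;
	unsigned int flags;	/* exactly one of OP_DO or OP_DUMP */
};

struct merged_op {
	int cmd;
	unsigned int flags;	/* OP_DO, OP_DUMP, or both */
};

/* Walk the split ops and fold a doit that is directly followed by a
 * dumpit into one entry, but only when both are for the same cmd.
 * Returns the number of entries written to out (at most n). */
size_t merge_split_ops(const struct split_op *ops, size_t n,
		       struct merged_op *out)
{
	size_t i = 0, count = 0;

	while (i < n) {
		out[count].cmd = ops[i].cmd;
		out[count].flags = ops[i].flags;

		if ((ops[i].flags & OP_DO) && i + 1 < n &&
		    (ops[i + 1].flags & OP_DUMP) &&
		    ops[i + 1].cmd == ops[i].cmd) {	/* the added cmd check */
			out[count].flags |= OP_DUMP;
			i++;	/* the dumpit was consumed by this entry */
		}
		count++;
		i++;
	}
	return count;
}
```

Without the cmd comparison, a doit for one command followed by a dumpit for a different command would be reported as a single item.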
In idpf_vc_core_init, the mailbox work is queued
on a mailbox workqueue but it is not cancelled on error.
This results in a call trace when idpf_mbx_task tries
to access the freed mailbox queue pointer. Fix it by
cancelling the mailbox work in the error path.
Fixes: 4930fbf419a7 ("idpf: add core init and interrupt request")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231023202655.173369-3-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michal Kubiak [Mon, 23 Oct 2023 20:26:54 +0000 (13:26 -0700)]
idpf: set scheduling mode for completion queue
The HW must be programmed differently for queue-based scheduling mode.
To program the completion queue context correctly, the control plane
must know the scheduling mode not only for the Tx queue, but also for
the completion queue.
Unfortunately, currently the driver sets the scheduling mode only for
the Tx queues.
Propagate the scheduling mode data for the completion queue as
well when sending the queue configuration messages.
Fixes: 1c325aac10a8 ("idpf: configure resources for TX queues")
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Alan Brady <alan.brady@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231023202655.173369-2-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Fri, 20 Oct 2023 20:12:54 +0000 (20:12 +0000)]
net_sched: sch_fq: fastpath needs to take care of sk->sk_pacing_status
If packets of a TCP flow take the fast path, we need to make sure
sk->sk_pacing_status is set to SK_PACING_FQ, otherwise TCP might
fall back to internal pacing, which is not optimal.
Fixes: 076433bd78d7 ("net_sched: sch_fq: add fast path for mostly idle qdisc")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20231020201254.732527-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ZhengHan Wang [Wed, 18 Oct 2023 10:30:55 +0000 (12:30 +0200)]
Bluetooth: Fix double free in hci_conn_cleanup
syzbot reports a slab use-after-free in hci_conn_hash_flush [1].
After releasing an object using hci_conn_del_sysfs in the
hci_conn_cleanup function, releasing the same object again
using the hci_dev_put and hci_conn_put functions causes a double free.
Here's a simplified flow:
This patch drops the hci_dev_put and hci_conn_put function
calls in the hci_conn_cleanup function, because the object is
freed in the hci_conn_del_sysfs function.
This patch also fixes the refcounting in hci_conn_add_sysfs() and
hci_conn_del_sysfs() to take into account device_add() failures.
Guan Wentao [Thu, 12 Oct 2023 11:21:17 +0000 (19:21 +0800)]
Bluetooth: btusb: Add 0bda:b85b for Fn-Link RTL8852BE
Add PID/VID 0bda:b85b for the Realtek RTL8852BE USB bluetooth part.
The PID/VID was reported by a patch last year. [1]
Some SBCs, like the rockpi 5B A8 module, contain the device.
It can also be found on websites. [2] [3]
Here are the device tables in /sys/kernel/debug/usb/devices.
Janne Grunau [Mon, 16 Oct 2023 07:13:08 +0000 (09:13 +0200)]
Bluetooth: hci_bcm4377: Mark bcm4378/bcm4387 as BROKEN_LE_CODED
bcm4378 and bcm4387 claim to support LE Coded PHY but fail to pair
(reliably) with BLE devices if it is enabled.
On bcm4378 pairing usually succeeds after 2-3 tries. On bcm4387
pairing appears to be completely broken.
Ziyang Xuan [Wed, 11 Oct 2023 09:57:31 +0000 (17:57 +0800)]
Bluetooth: Make handle of hci_conn be unique
The handle of a new hci_conn is always HCI_CONN_HANDLE_MAX + 1 if
the handle of the first hci_conn entry in hci_dev->conn_hash->list
is not HCI_CONN_HANDLE_MAX + 1. Use ida to manage the allocation of
hci_conn->handle to make it unique.
Fixes: 9f78191cc9f1 ("Bluetooth: hci_conn: Always allocate unique handles")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
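Why an ID allocator gives unique handles can be shown with a small userspace model. This is not the kernel's ida; it is a fixed-size array stand-in that hands out the lowest free ID, and struct id_pool, the id_pool_* functions and POOL_SIZE are hypothetical names chosen for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 64

struct id_pool {
	unsigned int base;	/* first ID handed out */
	bool used[POOL_SIZE];	/* used[i] tracks ID base + i */
};

void id_pool_init(struct id_pool *p, unsigned int base)
{
	p->base = base;
	memset(p->used, 0, sizeof(p->used));
}

/* Return the lowest free ID, or -1 if the pool is exhausted. */
int id_pool_alloc(struct id_pool *p)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (!p->used[i]) {
			p->used[i] = true;
			return (int)(p->base + i);
		}
	}
	return -1;
}

/* Release an ID so that it can be handed out again. */
void id_pool_free(struct id_pool *p, unsigned int id)
{
	if (id >= p->base && id - p->base < POOL_SIZE)
		p->used[id - p->base] = false;
}
```

Because the allocator tracks every ID in use, two live connections can never be given the same handle, regardless of the order of entries in any list.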
The __hci_cmd_sync() function doesn't return NULL. Checking for NULL
doesn't make the code safer, it just confuses people.
When a function returns both error pointers and NULL then generally the
NULL is a kind of success case. For example, maybe we look up an item
then errors mean we ran out of memory but NULL means the item is not
found. Or if we request a feature, then error pointers mean that there
was an error but NULL means that the feature has been deliberately
turned off.
In this code it's different. The NULL is handled as if there is a bug
in __hci_cmd_sync() where it accidentally returns NULL instead of a
proper error code. This was done consistently until commit 9e14606d8f38
("Bluetooth: msft: Extended monitor tracking by address filter") which
deleted the work around for the potential future bug and treated NULL as
success.
Predicting potential future bugs is complicated, but we should just fix
them instead of working around them. Instead of debating whether NULL
is failure or success, let's just say it's currently impossible and
delete the dead code.
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
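The convention described in the message above, where error pointers mean failure and NULL is a kind of success such as "not found", can be sketched in userspace C. The helpers err_ptr(), ptr_err() and is_err() are simplified stand-ins for the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(); struct item and find_item() are hypothetical and exist only for the example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct item {
	int key;
	int value;
};

static struct item table[] = { {1, 10}, {2, 20} };

/* Returns an error pointer for an invalid key, NULL when the key is
 * simply not present, and a valid pointer when it is found. */
struct item *find_item(int key)
{
	if (key < 0)
		return err_ptr(-EINVAL);

	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (table[i].key == key)
			return &table[i];
	}
	return NULL;
}
```

A caller of such a function checks is_err() first and then treats NULL as a legitimate outcome. A function that never returns NULL needs only the is_err() check.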
Iulia Tanasescu [Tue, 3 Oct 2023 14:37:39 +0000 (17:37 +0300)]
Bluetooth: ISO: Match QoS adv handle with BIG handle
In case the user binds multiple sockets for the same BIG, the BIG
handle should be matched with the associated adv handle, if it has
already been allocated previously.
Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>