Linus Torvalds [Thu, 24 Nov 2022 19:14:09 +0000 (11:14 -0800)]
Merge tag 'soc-fixes-6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"There are a bunch of late fixes that just came in, in particular a
longer series for Rockchip devicetree files, but most of those just
address cosmetic errors that were found during the binding validation.
There are a couple of code changes:
- A regression fix to the IXP42x PCI bus
- A fix for a memory leak on optee, and another one for mach-mxs
- Two fixes for the sunxi rsb bus driver, to address problems with
the shutdown logic
The rest are small but important devicetree fixes for a number of
individual boards, addressing issues across all platforms:
- arm global timer on older rockchip SoCs is unstable and needs to be
disabled in favor of a more reliable clocksource
- Corrections to fix bluetooth, mmc, and networking on a few Rockchip
boards
- at91/sam9g20ek UDC needs a pin controller config change
- an omap board runs into mmc probe errors because of regulator nodes
in the wrong place
- imx8mp-evk has a minor inaccuracy with its pin config, but without
user visible impact
- The Allwinner H6 Hantro G2 video decoder needs an IOMMU reference
to prevent the driver from crashing"
* tag 'soc-fixes-6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (30 commits)
bus: ixp4xx: Don't touch bit 7 on IXP42x
ARM: dts: imx6q-prti6q: Fix ref/tcxo-clock-frequency properties
arm64: dts: imx8mp-evk: correct pcie pad settings
ARM: mxs: fix memory leak in mxs_machine_init()
ARM: dts: at91: sam9g20ek: enable udc vbus gpio pinctrl
tee: optee: fix possible memory leak in optee_register_device()
arm64: dts: allwinner: h6: Add IOMMU reference to Hantro G2
media: dt-bindings: allwinner: h6-vpu-g2: Add IOMMU reference property
bus: sunxi-rsb: Support atomic transfers
bus: sunxi-rsb: Remove the shutdown callback
ARM: dts: rockchip: disable arm_global_timer on rk3066 and rk3188
arm64: dts: rockchip: Fix Pine64 Quartz4-B PMIC interrupt
ARM: dts: am335x-pcm-953: Define fixed regulators in root node
ARM: dts: rockchip: rk3188: fix lcdc1-rgb24 node name
arm64: dts: rockchip: fix ir-receiver node names
ARM: dts: rockchip: fix ir-receiver node names
arm64: dts: rockchip: fix adc-keys sub node names
ARM: dts: rockchip: fix adc-keys sub node names
arm: dts: rockchip: remove clock-frequency from rtc
arm: dts: rockchip: fix node name for hym8563 rtc
...
Linus Torvalds [Thu, 24 Nov 2022 19:09:01 +0000 (11:09 -0800)]
Merge tag 'loongarch-fixes-6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Fix two build warnings, a copy_thread() bug, two page table
manipulation bugs, and some trivial cleanups"
* tag 'loongarch-fixes-6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
docs/zh_CN/LoongArch: Fix wrong description of FPRs Note
LoongArch: Fix unsigned comparison with less than zero
LoongArch: Set _PAGE_DIRTY only if _PAGE_MODIFIED is set in {pmd,pte}_mkwrite()
LoongArch: Set _PAGE_DIRTY only if _PAGE_WRITE is set in {pmd,pte}_mkdirty()
LoongArch: Clear FPU/SIMD thread info flags for kernel thread
LoongArch: SMP: Change prefix from loongson3 to loongson
LoongArch: Combine acpi_boot_table_init() and acpi_boot_init()
LoongArch: Makefile: Use "grep -E" instead of "egrep"
Linus Torvalds [Thu, 24 Nov 2022 18:22:42 +0000 (10:22 -0800)]
Merge tag 'ext4_for_linus_stable2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"Fix a regression in the lazytime code that was introduced in v6.1-rc1,
and a use-after-free that can be triggered by a maliciously corrupted
file system"
* tag 'ext4_for_linus_stable2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
fs: do not update freeing inode i_io_list
ext4: fix use-after-free in ext4_ext_shift_extents
Arnd Bergmann [Thu, 24 Nov 2022 14:36:13 +0000 (15:36 +0100)]
Merge tag 'v6.2-rockchip-dts32-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into arm/fixes
Disabling of the unreliable arm-global-timer on earliest
Rockchip SoCs, due to its frequency being bound to the
changing cpu clock.
* tag 'v6.2-rockchip-dts32-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
ARM: dts: rockchip: disable arm_global_timer on rk3066 and rk3188
Yu Liao [Wed, 23 Nov 2022 08:22:36 +0000 (16:22 +0800)]
net: thunderx: Fix the ACPI memory leak
The ACPI buffer memory (string.pointer) should be freed because the buffer
is not used after returning from bgx_acpi_match_id(). Free it to prevent a
memory leak.
Xiongfeng Wang [Wed, 23 Nov 2022 06:59:19 +0000 (14:59 +0800)]
octeontx2-af: Fix reference count issue in rvu_sdp_init()
pci_get_device() will decrease the reference count for the *from*
parameter. So we don't need to call put_device() to decrease the
reference. Let's remove the put_device() in the loop and only decrease
the reference count of the returned 'pdev' after the last iteration, because
it will not be passed to pci_get_device() as an input parameter. We don't need
to check if 'pdev' is NULL because it is already checked inside
pci_dev_put(). Also add pci_dev_put() for the error path.
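The reference handling described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: mock_dev, mock_bus, mock_get_device(), mock_dev_put() and mock_scan() are hypothetical names invented here, not kernel APIs. The model only mimics the documented behaviour that the iterator drops the reference on its *from* argument and returns the next device with a reference held.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct mock_dev {
	int refcount;
};

#define NUM_DEVS 3
static struct mock_dev mock_bus[NUM_DEVS];

/* Drop one reference. Like pci_dev_put(), a NULL argument is a no-op. */
static void mock_dev_put(struct mock_dev *dev)
{
	if (!dev)
		return;
	assert(dev->refcount > 0);
	dev->refcount--;
}

/*
 * Return the device after 'from' (the first one if 'from' is NULL) with a
 * reference taken, and drop the reference that was held on 'from'.
 */
static struct mock_dev *mock_get_device(struct mock_dev *from)
{
	size_t next = from ? (size_t)(from - mock_bus) + 1 : 0;

	mock_dev_put(from);
	if (next >= NUM_DEVS)
		return NULL;
	mock_bus[next].refcount++;
	return &mock_bus[next];
}

/* Visit at most 'limit' devices, then return the sum of all refcounts. */
static int mock_scan(int limit)
{
	struct mock_dev *dev = NULL;
	int visited = 0, total = 0;
	size_t i;

	/* No put inside the loop: the iterator already drops 'from'. */
	while (visited < limit && (dev = mock_get_device(dev)) != NULL)
		visited++;

	/* Only the last returned device (if any) still holds a reference. */
	mock_dev_put(dev);

	for (i = 0; i < NUM_DEVS; i++)
		total += mock_bus[i].refcount;
	return total;
}
```

In this model, mock_scan() leaves every refcount at zero whether the loop stops early or runs off the end of the list, which is the balance the patch aims for.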
Li Zetao [Tue, 22 Nov 2022 15:00:46 +0000 (23:00 +0800)]
virtio_net: Fix probe failed when modprobe virtio_net
When doing the following test steps, an error was found:
step 1: modprobe virtio_net succeeded
# modprobe virtio_net <-- OK
step 2: fault injection in register_netdevice()
# modprobe -r virtio_net <-- OK
# ...
FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 0
CPU: 0 PID: 3521 Comm: modprobe
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
Call Trace:
<TASK>
...
should_failslab+0xa/0x20
...
dev_set_name+0xc0/0x100
netdev_register_kobject+0xc2/0x340
register_netdevice+0xbb9/0x1320
virtnet_probe+0x1d72/0x2658 [virtio_net]
...
</TASK>
virtio_net: probe of virtio0 failed with error -22
step 3: modprobe virtio_net failed
# modprobe virtio_net <-- failed
virtio_net: probe of virtio0 failed with error -2
The root cause of the problem is that the queues are not
disabled on the error handling path when register_netdevice()
fails in virtnet_probe(), resulting in an "-ENOENT" error
returned from setup_vq() in the next modprobe call.
virtio_pci_modern_device uses virtqueues to send or
receive messages, and "queue_enable" records whether the
queues are available. In vp_modern_find_vqs(), all queues
will be selected and activated, but once queues are enabled
there is no way to go back except reset.
Fix it by resetting the virtio device on the error handling path. This
makes error handling follow the same order as normal device
cleanup in virtnet_remove() which does: unregister, destroy
failover, then reset. And that flow is better tested than
error handling so we can be reasonably sure it works well.
Fixes: 024655555021 ("virtio_net: fix use after free on allocation failure")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20221122150046.3910638-1-lizetao1@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Vladimir Oltean [Tue, 22 Nov 2022 13:09:36 +0000 (15:09 +0200)]
net: enetc: preserve TX ring priority across reconfiguration
In the blamed commit, a rudimentary reallocation procedure for RX buffer
descriptors was implemented, for the situation when their format changes
between normal (no PTP) and extended (PTP).
enetc_hwtstamp_set() calls enetc_close() and enetc_open() in a sequence,
and this sequence loses information which was previously configured in
the TX BDR Mode Register, specifically via the enetc_set_bdr_prio() call.
The TX ring priority is configured by tc-mqprio and tc-taprio, and
affects important things for TSN such as the TX time of packets. The
issue manifests itself most visibly by the fact that isochron --txtime
reports premature packet transmissions when PTP is first enabled on an
enetc interface.
Save the TX ring priority in a new field in struct enetc_bdr (occupies a
2 byte hole on arm64) in order to make this survive a ring reconfiguration.
Fixes: 434cebabd3a2 ("enetc: Add dynamic allocation of extended Rx BD rings")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Link: https://lore.kernel.org/r/20221122130936.1704151-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
nfc: st-nci: Restructure validating logic in EVT_TRANSACTION
These are the same 3 patches that were applied in st21nfca here:
https://lore.kernel.org/netdev/20220607025729.1673212-1-mfaltesek@google.com
with a couple minor differences.
st-nci has nearly identical code to that of st21nfca for EVT_TRANSACTION,
except that there are two extra validation checks that are not present
in the st-nci code.
The 3/3 patch as coded for st21nfca pulls those checks in, bringing both
drivers into parity.
====================
Martin Faltesek [Tue, 22 Nov 2022 00:42:46 +0000 (18:42 -0600)]
nfc: st-nci: fix incorrect sizing calculations in EVT_TRANSACTION
The transaction buffer is allocated by using the size of the packet buf,
and subtracting two which seems intended to remove the two tags which are
not present in the target structure. This calculation leads to under
counting memory because of differences between the packet contents and the
target structure. The aid_len field is a u8 in the packet, but a u32 in
the structure, resulting in at least 3 bytes always being under counted.
Further, the aid data is a variable length field in the packet, but fixed
in the structure, so if this field is less than the max, the difference is
added to the under counting.
To fix, perform validation checks progressively to safely reach the
next field, to determine the size of both buffers and verify both tags.
Once all validation checks pass, allocate the buffer and copy the data.
This eliminates freeing memory on the error path, as validation checks are
moved ahead of memory allocation.
Reported-by: Denis Efremov <denis.e.efremov@oracle.com>
Reviewed-by: Guenter Roeck <groeck@google.com>
Fixes: 5d1ceb7f5e56 ("NFC: st21nfcb: Add HCI transaction event support")
Signed-off-by: Martin Faltesek <mfaltesek@google.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
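The "validate progressively, then allocate" approach described in this commit message can be sketched in plain userspace C. This is not the driver code: the tag values, AID_MAX, the layout of struct transaction, and parse_transaction() are assumptions made for illustration only. The point is that the allocation size is derived from the target structure rather than from the packet length, and that nothing is allocated until every check has passed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative values and layout, chosen for this sketch only. */
#define AID_TAG    0x81
#define PARAMS_TAG 0x82
#define AID_MAX    16

struct transaction {
	uint32_t aid_len;       /* one byte on the wire, 32 bits here */
	uint8_t  aid[AID_MAX];  /* variable on the wire, fixed size here */
	uint32_t params_len;
	uint8_t  params[];      /* flexible array member */
};

/*
 * Assumed packet layout:
 *   AID_TAG, aid_len, aid[aid_len], PARAMS_TAG, params_len, params[...]
 * Returns a heap-allocated structure, or NULL if the packet is malformed.
 */
static struct transaction *parse_transaction(const uint8_t *pkt, size_t len)
{
	struct transaction *t;
	size_t aid_len, params_len, off;

	assert(pkt != NULL);

	/* The AID tag and length bytes must exist before they are read. */
	if (len < 2 || pkt[0] != AID_TAG)
		return NULL;
	aid_len = pkt[1];
	off = 2;

	/* The AID plus the next tag and length bytes must fit. */
	if (aid_len > AID_MAX || len - off < aid_len + 2)
		return NULL;
	if (pkt[off + aid_len] != PARAMS_TAG)
		return NULL;
	params_len = pkt[off + aid_len + 1];

	/* The parameter bytes must fit in what remains of the packet. */
	if (len - off - aid_len - 2 < params_len)
		return NULL;

	/* All checks passed: size the buffer from the target structure. */
	t = calloc(1, sizeof(*t) + params_len);
	if (!t)
		return NULL;
	t->aid_len = (uint32_t)aid_len;
	memcpy(t->aid, pkt + off, aid_len);
	t->params_len = (uint32_t)params_len;
	memcpy(t->params, pkt + off + aid_len + 2, params_len);
	return t;
}
```

Because every rejection happens before calloc(), the error paths have nothing to free, which mirrors the simplification the commit message mentions.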
Martin Faltesek [Tue, 22 Nov 2022 00:42:44 +0000 (18:42 -0600)]
nfc: st-nci: fix incorrect validating logic in EVT_TRANSACTION
The first validation check for EVT_TRANSACTION has two different checks
tied together with logical AND. One is a check for minimum packet length,
and the other is for a valid aid_tag. If either condition is true (fails),
then an error should be triggered. The fix is to change && to ||.
Reported-by: Denis Efremov <denis.e.efremov@oracle.com>
Reviewed-by: Guenter Roeck <groeck@google.com>
Fixes: 5d1ceb7f5e56 ("NFC: st21nfcb: Add HCI transaction event support")
Signed-off-by: Martin Faltesek <mfaltesek@google.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
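The difference between the two operators in this check can be shown with a small standalone comparison. This is a sketch under assumptions: AID_TAG and both helper names are invented for illustration, and the real driver check may look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define AID_TAG 0x81 /* illustrative tag value for this sketch */

/*
 * Problematic form: the packet is rejected only if it is too short AND
 * has the wrong tag, so a packet failing just one check slips through.
 * (It would also read pkt[0] when len is 0.)
 */
static bool header_invalid_and(const uint8_t *pkt, size_t len)
{
	assert(pkt != NULL);
	return len < 2 && pkt[0] != AID_TAG;
}

/*
 * Corrected form: either failure rejects the packet, and the length
 * check short-circuits before pkt[0] is read.
 */
static bool header_invalid_or(const uint8_t *pkt, size_t len)
{
	assert(pkt != NULL);
	return len < 2 || pkt[0] != AID_TAG;
}
```

With a four-byte packet that starts with the wrong tag, the first helper reports the header as valid while the second rejects it.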
Jakub Kicinski [Thu, 24 Nov 2022 03:18:58 +0000 (19:18 -0800)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
ipsec 2022-11-23
1) Fix "disable_policy" on ipv4 early demux. Packets after
the initial packet in a flow might be incorrectly dropped
on early demux if there are no matching policies.
From Eyal Birger.
2) Fix a kernel warning in case XFRM encap type is not
available. From Eyal Birger.
3) Fix ESN wrap around for GSO to avoid a double usage of a
sequence number. From Christian Langrock.
4) Fix a send_acquire race with pfkey_register.
From Herbert Xu.
5) Fix a list corruption panic in __xfrm_state_delete().
From Thomas Jarosch.
6) Fix an unchecked return value in xfrm6_init().
From Chen Zhongjin.
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
xfrm: Fix ignored return value in xfrm6_init()
xfrm: Fix oops in __xfrm_state_delete()
af_key: Fix send_acquire race with pfkey_register
xfrm: replay: Fix ESN wrap around for GSO
xfrm: lwtunnel: squelch kernel warning in case XFRM encap type is not available
xfrm: fix "disable_policy" on ipv4 early demux
====================
1) Fix regression in ipset hash:ip with IPv4 range, from Vishwanath Pai.
This is fixing up a bug introduced in the 6.0 release.
2) The "netfilter: ipset: enforce documented limit to prevent allocating
huge memory" patch contained a wrong condition which made it impossible to
add up to 64 clashing elements to a hash:net,iface type of set, although
this is a documented feature of the set type. The patch fixes the condition
and thus makes it possible to add the elements while still preventing the
allocation of huge memory, from Jozsef Kadlecsik. This has been broken
for several releases.
3) Missing locking when updating the flow block list which might lead
a reader to crash. This has been broken since the introduction of the
flowtable hardware offload support.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: flowtable_offload: add missing locking
netfilter: ipset: restore allowing 64 clashing elements in hash:net,iface
netfilter: ipset: regression in ip_set_hash_ip.c
====================
Linus Torvalds [Wed, 23 Nov 2022 22:45:33 +0000 (14:45 -0800)]
Merge tag 'pci-v6.1-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci fixes from Bjorn Helgaas:
- Update MAINTAINERS to add Manivannan Sadhasivam as Qcom PCIe RC
maintainer (replacing Stanimir Varbanov) and include DT PCI bindings
in the "PCI native host bridge and endpoint drivers" entry.
* tag 'pci-v6.1-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
MAINTAINERS: Include PCI bindings in host bridge entry
MAINTAINERS: Add Manivannan Sadhasivam as Qcom PCIe RC maintainer
Linus Torvalds [Wed, 23 Nov 2022 19:19:06 +0000 (11:19 -0800)]
Merge tag 'spi-fix-v6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few fixes, all device specific.
The most important ones are for the i.MX driver which had a couple of
nasty data corruption inducing errors appear after the change to
support PIO mode in the last merge window (one introduced by the
change and one latent one which the PIO changes exposed).
Thanks to Frieder, Fabio, Marc and Marek for jumping on that and
resolving the issues quickly once they were found"
* tag 'spi-fix-v6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spi-imx: spi_imx_transfer_one(): check for DMA transfer first
spi: tegra210-quad: Fix duplicate resource error
spi: dw-dma: decrease reference count in dw_spi_dma_init_mfld()
spi: spi-imx: Fix spi_bus_clk if requested clock is higher than input clock
spi: mediatek: Fix DEVAPC Violation at KO Remove
Linus Torvalds [Wed, 23 Nov 2022 19:06:09 +0000 (11:06 -0800)]
Merge tag '9p-for-6.1-rc7' of https://github.com/martinetd/linux
Pull 9p fixes from Dominique Martinet:
- 9p now uses a variable size for its recv buffer, but not every place
had been updated properly to use it, and some buffer overflows were
found and needed fixing.
There's still one place where msize is incorrectly used in a safety
check (p9_check_errors), but all paths leading to it should already
be avoiding overflows and that patch took a bit more time to get
right for zero-copy requests so I'll send it for 6.2
- yet another race condition in p9_conn_cancel introduced by a fix for
a syzbot report in the same place. Maybe at some point we'll get it
right without burning it all down...
* tag '9p-for-6.1-rc7' of https://github.com/martinetd/linux:
9p/xen: check logical size for buffer size
9p/fd: Use P9_HDRSZ for header size
9p/fd: Fix write overflow in p9_read_work
9p/fd: fix issue of list_del corruption in p9_fd_cancel()
David Howells [Mon, 21 Nov 2022 16:31:34 +0000 (16:31 +0000)]
fscache: fix OOB Read in __fscache_acquire_volume
The type of a->key[0] is char in fscache_volume_same(). If the length
of cache volume key is greater than 127, the value of a->key[0] is less
than 0. In this case, klen becomes much larger than 255 after type
conversion, because the type of klen is size_t. As a result, memcmp()
is read out of bounds.
This causes a slab-out-of-bounds Read in __fscache_acquire_volume(), as
reported by Syzbot.
Fix this by changing the type of the stored key to "u8 *" rather than
"char *" (it isn't a simple string anyway). Also put in a check that
the volume name doesn't exceed NAME_MAX.
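The sign-extension effect described above is easy to reproduce in userspace. In this minimal sketch, klen_signed() and klen_unsigned() are hypothetical helpers written for illustration; "signed char" is used explicitly because the signedness of plain "char" is implementation-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length byte read through a signed pointer: values above 127 go negative. */
static size_t klen_signed(const signed char *key)
{
	assert(key != NULL);
	/* A negative value converts to a huge size_t. */
	return (size_t)key[0];
}

/* Length byte read through an unsigned pointer: always in 0..255. */
static size_t klen_unsigned(const uint8_t *key)
{
	assert(key != NULL);
	return (size_t)key[0];
}
```

For a stored length byte of 200, the unsigned read yields 200, while the signed read yields a value far above 255, which is the kind of length that sends memcmp() out of bounds.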
Santiago Ruano Rincón [Mon, 21 Nov 2022 20:53:05 +0000 (21:53 +0100)]
net/cdc_ncm: Fix multicast RX support for CDC NCM devices with ZLP
ZLP for DisplayLink ethernet devices was enabled in 6.0: 266c0190aee3 ("net/cdc_ncm: Enable ZLP for DisplayLink ethernet devices").
The related driver_info should be the "same as cdc_ncm_info, but with
FLAG_SEND_ZLP". However, set_rx_mode that enables handling multicast
traffic was missing in the new cdc_ncm_zlp_info.
usbnet_cdc_update_filter rx mode was introduced in Linux 5.9 with: e10dcb1b6ba7 ("net: cdc_ncm: hook into set_rx_mode to admit multicast traffic")
Without this hook, multicast, and then IPv6 SLAAC, is broken.
Fixes: 266c0190aee3 ("net/cdc_ncm: Enable ZLP for DisplayLink ethernet devices")
Signed-off-by: Santiago Ruano Rincón <santiago.ruano-rincon@imt-atlantique.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Tronchin [Mon, 21 Nov 2022 12:54:55 +0000 (13:54 +0100)]
net: usb: qmi_wwan: add u-blox 0x1342 composition
Add RmNet support for LARA-L6.
LARA-L6 module can be configured (by AT interface) in three different
USB modes:
* Default mode (Vendor ID: 0x1546 Product ID: 0x1341) with 4 serial
interfaces
* RmNet mode (Vendor ID: 0x1546 Product ID: 0x1342) with 4 serial
interfaces and 1 RmNet virtual network interface
* CDC-ECM mode (Vendor ID: 0x1546 Product ID: 0x1343) with 4 serial
interfaces and 1 CDC-ECM virtual network interface
In RmNet mode LARA-L6 exposes the following interfaces:
If 0: Diagnostic
If 1: AT parser
If 2: AT parser
If 3: AT parser/alternative functions
If 4: RMNET interface
Signed-off-by: Davide Tronchin <davide.tronchin.94@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Sitnicki [Mon, 21 Nov 2022 08:54:26 +0000 (09:54 +0100)]
l2tp: Don't sleep and disable BH under writer-side sk_callback_lock
When holding a reader-writer spin lock we cannot sleep. Calling
setup_udp_tunnel_sock() with write lock held violates this rule, because we
end up calling percpu_down_read(), which might sleep, as syzbot reports
[1]:
Trim the writer-side critical section for sk_callback_lock down to the
minimum, so that it covers only operations on sk_user_data.
Also, when grabbing the sk_callback_lock, we always need to disable BH, as
Eric points out. Failing to do so leads to deadlocks because we acquire
sk_callback_lock in softirq context, which can get stuck waiting on us if:
v2:
- Check and set sk_user_data while holding sk_callback_lock for both
L2TP encapsulation types (IP and UDP) (Tetsuo)
Cc: Tom Parkin <tparkin@katalix.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Fixes: b68777d54fac ("l2tp: Serialize access to sk_user_data with sk_callback_lock")
Reported-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+703d9e154b3b58277261@syzkaller.appspotmail.com
Reported-by: syzbot+50680ced9e98a61f7698@syzkaller.appspotmail.com
Reported-by: syzbot+de987172bb74a381879b@syzkaller.appspotmail.com
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuan Can [Mon, 21 Nov 2022 03:32:26 +0000 (03:32 +0000)]
net: dm9051: Fix missing dev_kfree_skb() in dm9051_loop_rx()
dm9051_loop_rx() returns without releasing the skb when dm9051_stop_mrcmd()
returns an error. Free the skb to avoid this leak.
Fixes: 2dc95a4d30ed ("net: Add dm9051 driver")
Signed-off-by: Yuan Can <yuancan@huawei.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Hai [Sun, 20 Nov 2022 06:24:38 +0000 (14:24 +0800)]
arcnet: fix potential memory leak in com20020_probe()
In com20020_probe(), if com20020_config() fails, dev and info
will not be freed, which will lead to a memory leak.
This patch adds freeing dev and info after com20020_config()
fails to fix this bug.
Compile tested only.
Fixes: 15b99ac17295 ("[PATCH] pcmcia: add return value to _config() functions")
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dominique Martinet [Fri, 18 Nov 2022 13:44:41 +0000 (22:44 +0900)]
9p/xen: check logical size for buffer size
trans_xen did not check that the data fits into the buffer before copying
from the xen ring, but we probably should.
Add a check that just skips the request and returns an error to
userspace if it does not fit.
Jakub Kicinski [Wed, 23 Nov 2022 04:20:58 +0000 (20:20 -0800)]
Merge tag 'mlx5-fixes-2022-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 fixes 2022-11-21
This series provides bug fixes to mlx5 driver.
* tag 'mlx5-fixes-2022-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5e: Fix possible race condition in macsec extended packet number update routine
net/mlx5e: Fix MACsec update SecY
net/mlx5e: Fix MACsec SA initialization routine
net/mlx5e: Remove leftovers from old XSK queues enumeration
net/mlx5e: Offload rule only when all encaps are valid
net/mlx5e: Fix missing alignment in size of MTT/KLM entries
net/mlx5: Fix sync reset event handler error flow
net/mlx5: E-Switch, Set correctly vport destination
net/mlx5: Lag, avoid lockdep warnings
net/mlx5: Fix handling of entry refcount when command is not issued to FW
net/mlx5: cmdif, Print info on any firmware cmd failure to tracepoint
net/mlx5: SF: Fix probing active SFs during driver probe phase
net/mlx5: Fix FW tracer timestamp calculation
net/mlx5: Do not query pci info while pci disabled
====================
Yan Cangang [Sun, 20 Nov 2022 05:52:59 +0000 (13:52 +0800)]
net: ethernet: mtk_eth_soc: fix memory leak in error path
In mtk_ppe_init(), when dmam_alloc_coherent() or devm_kzalloc() fails,
the rhashtable ppe->l2_flows isn't destroyed. Fix it.
In mtk_probe(), when mtk_ppe_init(), mtk_eth_offload_init() or
register_netdev() fails, the same problem occurs. Fix it.
Fixes: 33fc42de3327 ("net: ethernet: mtk_eth_soc: support creating mac address based offload entries")
Signed-off-by: Yan Cangang <nalanzeyu@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ziyang Xuan [Sun, 20 Nov 2022 03:54:05 +0000 (11:54 +0800)]
net: ethernet: mtk_eth_soc: fix potential memory leak in mtk_rx_alloc()
When dma_map_single() fails in mtk_rx_alloc(), the function returns directly.
But the memory allocated for the local variable data is not freed, and
data has not been attached to ring->data[i] yet, so the
memory allocated for it will not be freed outside
mtk_rx_alloc() either. Thus a memory leak would occur in this scenario.
Add skb_free_frag(data) when dma_map_single() fails.
Fixes: 23233e577ef9 ("net: ethernet: mtk_eth_soc: rely on page_pool for single page buffers")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Link: https://lore.kernel.org/r/20221120035405.1464341-1-william.xuanziyang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
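The leak pattern described in this commit message (a local buffer that fails a later step before it is attached to the ring) can be modelled in userspace. This is only a sketch: struct ring, buf_alloc(), buf_free(), map_buffer(), ring_fill() and ring_release() are hypothetical stand-ins, with malloc()/free() used in place of the kernel allocators and a fake mapping step in place of dma_map_single().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define RING_SIZE 4

struct ring {
	void *data[RING_SIZE];
};

static int live_buffers; /* outstanding allocations in this model */

static void *buf_alloc(void)
{
	void *p = malloc(64);

	if (p)
		live_buffers++;
	return p;
}

static void buf_free(void *p)
{
	if (!p)
		return;
	assert(live_buffers > 0);
	free(p);
	live_buffers--;
}

/* Stand-in for a mapping step; fails for slot index 'fail_at'. */
static int map_buffer(int slot, int fail_at)
{
	return slot == fail_at ? -1 : 0;
}

static int ring_fill(struct ring *r, int fail_at)
{
	int i;

	for (i = 0; i < RING_SIZE; i++) {
		void *data = buf_alloc();

		if (!data)
			return -1;
		if (map_buffer(i, fail_at)) {
			/* Not yet in r->data[i]: nobody else can free it. */
			buf_free(data);
			return -1;
		}
		r->data[i] = data;
	}
	return 0;
}

/* Generic cleanup: only sees buffers that were attached to the ring. */
static void ring_release(struct ring *r)
{
	int i;

	for (i = 0; i < RING_SIZE; i++) {
		buf_free(r->data[i]);
		r->data[i] = NULL;
	}
}
```

Without the buf_free() call on the failure branch, the counter would stay at one after ring_release(), because the cleanup routine only walks buffers that made it into the ring.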
====================
dccp/tcp: Fix bhash2 issues related to WARN_ON() in inet_csk_get_port().
syzkaller was hitting a WARN_ON() in inet_csk_get_port() in the 4th patch,
which was because we forgot to fix up bhash2 bucket when connect() for a
socket bound to a wildcard address fails in __inet_stream_connect().
There was a similar report [0], but its repro does not fire the WARN_ON() due
to inconsistent error handling.
When connect() for a socket bound to a wildcard address fails, saddr may or
may not be reset depending on where the failure happens. When we fail in
__inet_stream_connect(), sk->sk_prot->disconnect() resets saddr. OTOH, in
(dccp|tcp)_v[46]_connect(), if we fail after inet6?_hash_connect(), we
forget to reset saddr.
We fix this inconsistent error handling in the 1st patch, and then we'll
fix the bhash2 WARN_ON() issue.
Note that there is still an issue in that we reset saddr without checking
if there are conflicting sockets in bhash and bhash2, but this should be
another series.
Kuniyuki Iwashima [Sat, 19 Nov 2022 01:49:14 +0000 (17:49 -0800)]
dccp/tcp: Fixup bhash2 bucket when connect() fails.
If a socket bound to a wildcard address fails to connect(), we
only reset saddr and keep the port. Then, we have to fix up the
bhash2 bucket; otherwise, the bucket has an inconsistent address
in the list.
Also, listen() for such a socket will fire the WARN_ON() in
inet_csk_get_port(). [0]
Note that when a system runs out of memory, we give up fixing the
bucket and unlink sk from bhash and bhash2 by inet_put_port().
Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kuniyuki Iwashima [Sat, 19 Nov 2022 01:49:13 +0000 (17:49 -0800)]
dccp/tcp: Update saddr under bhash's lock.
When we call connect() for a socket bound to a wildcard address, we update
saddr locklessly. However, it could result in a data race; another thread
iterating over bhash might see a corrupted address.
Kuniyuki Iwashima [Sat, 19 Nov 2022 01:49:11 +0000 (17:49 -0800)]
dccp/tcp: Reset saddr on failure after inet6?_hash_connect().
When connect() is called on a socket bound to the wildcard address,
we change the socket's saddr to a local address. If the socket
fails to connect() to the destination, we have to reset the saddr.
However, when an error occurs after inet6?_hash_connect() in
(dccp|tcp)_v[46]_connect(), we forget to reset saddr and leave
the socket bound to the address.
From the user's point of view, whether saddr is reset or not varies
with errno. Let's fix this inconsistent behaviour.
Note that after this patch, the repro [0] will trigger the WARN_ON()
in inet_csk_get_port() again, but this patch is not buggy and rather
fixes a bug papering over the bhash2's bug for which we need another
fix.
For the record, the repro causes -EADDRNOTAVAIL in inet6_hash_connect()
by this sequence:
Tiezhu Yang [Tue, 22 Nov 2022 13:20:57 +0000 (21:20 +0800)]
docs/zh_CN/LoongArch: Fix wrong description of FPRs Note
The Chinese translation of FPRs Note is not consistent with the original
English version, $v0/$v1 should be $fv0/$fv1, $a0/$a1 should be $fa0/$fa1,
fix them.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
It turns out that while bit 7 is marked "reserved" it is
not unused, so masking it off as zero is dangerous, and
breaks flash access on some systems such as the NSLU2.
Be more careful and avoid masking off any of the reserved
bits 7, 8, 9 or 30. Only keep masking EXP_WORD (bit 2)
on IXP43x which is necessary in some setups.
Svyatoslav Feldsherov [Tue, 15 Nov 2022 20:20:01 +0000 (20:20 +0000)]
fs: do not update freeing inode i_io_list
After commit cbfecb927f42 ("fs: record I_DIRTY_TIME even if inode
already has I_DIRTY_INODE") writeback_single_inode can push inode with
I_DIRTY_TIME set to b_dirty_time list. In case of freeing inode with
I_DIRTY_TIME set this can happen after deletion of inode from i_io_list
at evict. Stack trace is following.
This will lead to use after free in flusher thread.
A similar issue can be triggered if writeback_single_inode in the
stack trace updates inode->i_io_list. Add an explicit check to avoid it.
Fixes: cbfecb927f42 ("fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE")
Reported-by: syzbot+6ba92bd00d5093f7e371@syzkaller.appspotmail.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Svyatoslav Feldsherov <feldsherov@google.com>
Link: https://lore.kernel.org/r/20221115202001.324188-1-feldsherov@google.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Felix Fietkau [Mon, 21 Nov 2022 18:26:15 +0000 (19:26 +0100)]
netfilter: flowtable_offload: add missing locking
nf_flow_table_block_setup and the driver TC_SETUP_FT call can modify the flow
block cb list while they are being traversed elsewhere, causing a crash.
Add a write lock around the calls to protect readers
Fixes: c29f74e0df7a ("netfilter: nf_flow_table: hardware offload support")
Reported-by: Chad Monroe <chad.monroe@smartrg.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Jozsef Kadlecsik [Tue, 22 Nov 2022 19:18:58 +0000 (20:18 +0100)]
netfilter: ipset: restore allowing 64 clashing elements in hash:net,iface
The commit 510841da1fcc ("netfilter: ipset: enforce documented limit to
prevent allocating huge memory") was too strict and prevented adding up to
64 clashing elements to a hash:net,iface type of set. This patch fixes the
issue and now the type behaves as documented.
This problem is caused by rotten packets, which are received after
polling but before interrupts are enabled again. This can be fixed by
checking for pending work and rescheduling if necessary after interrupts
have been enabled again.
Yang Yingliang [Sat, 19 Nov 2022 07:02:02 +0000 (15:02 +0800)]
bnx2x: fix pci device refcount leak in bnx2x_vf_is_pcie_pending()
As the comment of pci_get_domain_bus_and_slot() says, it returns
a PCI device with its refcount incremented; when finished using it,
the caller must decrement the reference count by calling
pci_dev_put(). Call pci_dev_put() before returning from
bnx2x_vf_is_pcie_pending() to avoid a refcount leak.
Fixes: b56e9670ffa4 ("bnx2x: Prepare device and initialize VF database")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20221119070202.1407648-1-yangyingliang@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Xin Long [Fri, 18 Nov 2022 21:33:03 +0000 (16:33 -0500)]
net: sched: allow act_ct to be built without NF_NAT
In commit f11fe1dae1c4 ("net/sched: Make NET_ACT_CT depends on NF_NAT"),
the build failure when NF_NAT is m and NET_ACT_CT is y was fixed by
adding a dependency on NF_NAT for NET_ACT_CT. However, this also means
NET_ACT_CT cannot be built without NF_NAT, which is not expected. This
patch fixes it by changing the dependency to "(!NF_NAT || NF_NAT)".
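Why "(!NF_NAT || NF_NAT)" is not a no-op follows from Kconfig's tristate arithmetic, where n, m and y behave like 0, 1 and 2, "!" is 2 minus the value, and "||" is the maximum. The following userspace sketch models just that arithmetic. The enum and helper names are invented for illustration; the return values stand for the upper limit the dependency places on NET_ACT_CT.

```c
#include <assert.h>

enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig "!": 2 minus the value, so !n = y, !m = m, !y = n. */
static enum tristate tri_not(enum tristate a)
{
	assert(a >= TRI_N && a <= TRI_Y);
	return (enum tristate)(TRI_Y - a);
}

/* Kconfig "||": the maximum of both operands. */
static enum tristate tri_or(enum tristate a, enum tristate b)
{
	return a > b ? a : b;
}

/* Upper limit imposed by "depends on NF_NAT". */
static enum tristate limit_plain(enum tristate nf_nat)
{
	return nf_nat;
}

/* Upper limit imposed by "depends on (!NF_NAT || NF_NAT)". */
static enum tristate limit_optional(enum tristate nf_nat)
{
	return tri_or(tri_not(nf_nat), nf_nat);
}
```

In this model the plain dependency forbids the option entirely when NF_NAT is n, while the optional form allows y when NF_NAT is n or y and caps the option at m only when NF_NAT is m, which is the case that caused the original build failure.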
Liu Jian [Thu, 17 Nov 2022 12:59:18 +0000 (20:59 +0800)]
net: sparx5: fix error handling in sparx5_port_open()
If phylink_of_phy_connect() fails, the port should be disabled.
If sparx5_serdes_set()/phy_power_on() fails, the port should be
disabled and the phylink should be stopped and disconnected.
Fixes: 946e7fd5053a ("net: sparx5: add port module support")
Fixes: f3cad2611a77 ("net: sparx5: add hostmode with phylink support")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Tested-by: Bjarni Jonasson <bjarni.jonasson@microchip.com>
Reviewed-by: Steen Hegelund <steen.hegelund@microchip.com>
Link: https://lore.kernel.org/r/20221117125918.203997-1-liujian56@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Wang ShaoBo [Fri, 18 Nov 2022 06:24:47 +0000 (14:24 +0800)]
net: wwan: iosm: use ACPI_FREE() but not kfree() in ipc_pcie_read_bios_cfg()
acpi_evaluate_dsm() should be coupled with ACPI_FREE() to free the ACPI
memory, because we need to track the allocation of acpi_object when
ACPI_DBG_TRACK_ALLOCATIONS is enabled, so use ACPI_FREE() instead of kfree().
Thomas Jarosch [Wed, 2 Nov 2022 10:18:48 +0000 (11:18 +0100)]
xfrm: Fix oops in __xfrm_state_delete()
Kernel 5.14 added a new "byseq" index to speed up xfrm_state lookups
by sequence number in commit fe9f1d8779cb ("xfrm: add state hashtable
keyed by seq").
While the patch was thorough, the function pfkey_send_new_mapping()
in net/af_key.c also modifies x->km.seq and never added
the current xfrm_state to the "byseq" index.
Exact location of the crash in __xfrm_state_delete():
if (x->km.seq)
hlist_del_rcu(&x->byseq);
The hlist_node "byseq" was never populated.
The bug only triggers if a new NAT traversal mapping (changed IP or port)
is detected in esp_input_done2() / esp6_input_done2(), which in turn
indirectly calls pfkey_send_new_mapping() *if* the kernel is compiled
with CONFIG_NET_KEY and "af_key" is active.
The PF_KEYv2 message SADB_X_NAT_T_NEW_MAPPING is not part of RFC 2367.
Various implementations have been examined to see how they handle
the "sadb_msg_seq" header field:
- racoon (Android): does not process SADB_X_NAT_T_NEW_MAPPING
- strongswan: does not care about sadb_msg_seq
- openswan: does not care about sadb_msg_seq
There is no standard for how PF_KEYv2 sadb_msg_seq should be populated
for SADB_X_NAT_T_NEW_MAPPING and it's not used in popular
implementations either. Herbert Xu suggested we should just
use the current km.seq value as is. This fixes the root cause
of the oops since we no longer modify km.seq itself.
The update of "km.seq" looks like a copy'n'paste error
from pfkey_send_acquire(). SADB_ACQUIRE must indeed assign a unique km.seq
number according to RFC 2367. It has been verified that code paths
involving pfkey_send_acquire() don't cause the same Oops.
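The inconsistency described above can be modelled in a few lines of
userspace C. This is only a sketch: struct state_sketch, next_seq and
all three functions are hypothetical stand-ins, not the code in
net/af_key.c or net/xfrm. It shows that assigning a fresh sequence
number without indexing the state leaves it in a shape where the
"if (x->km.seq)" check would unlink a node that was never inserted,
while reporting the current value as is keeps the state consistent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct xfrm_state; only what the sketch needs. */
struct state_sketch {
    uint32_t seq;   /* plays the role of x->km.seq */
    bool on_byseq;  /* true once the state was inserted into the "byseq" index */
};

uint32_t next_seq = 1;  /* hypothetical global sequence counter */

/* Pre-fix behaviour as described above: assign a new seq, never index the state. */
uint32_t new_mapping_buggy(struct state_sketch *x)
{
    assert(x != NULL);
    x->seq = next_seq++;
    return x->seq;  /* value that would be placed in sadb_msg_seq */
}

/* Post-fix behaviour: report the current seq as is, leave the state untouched. */
uint32_t new_mapping_fixed(const struct state_sketch *x)
{
    assert(x != NULL);
    return x->seq;
}

/* Models the delete path: a non-zero seq is taken to mean "unlink from byseq". */
bool delete_would_unlink_unindexed(const struct state_sketch *x)
{
    assert(x != NULL);
    return x->seq != 0 && !x->on_byseq;
}
```

A state that starts with seq == 0 and goes through new_mapping_buggy()
ends up with delete_would_unlink_unindexed() returning true; going
through new_mapping_fixed() instead leaves it returning false.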
PF_KEYv2 SADB_X_NAT_T_NEW_MAPPING support was originally added here:
https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
In particular, implement ESPinUDP encapsulation for IPsec
Nat Traversal.
A note on triggering the bug: I was not able to trigger it using VMs.
There is one VPN using a high latency link on our production VPN server
that triggered it like once a day though.
Jacob Keller [Fri, 18 Nov 2022 22:27:29 +0000 (14:27 -0800)]
ice: fix handling of burst Tx timestamps
Commit 1229b33973c7 ("ice: Add low latency Tx timestamp read") refactored
PTP timestamping logic to use a threaded IRQ instead of a separate kthread.
This implementation introduced ice_misc_intr_thread_fn and redefined the
ice_ptp_process_ts function interface to return a value of whether or not
the timestamp processing was complete.
ice_misc_intr_thread_fn would take the return value from ice_ptp_process_ts
and convert it into either IRQ_HANDLED if there were no more timestamps to
be processed, or IRQ_WAKE_THREAD if the thread should continue processing.
This is not correct, as the kernel does not re-schedule threaded IRQ
functions automatically. IRQ_WAKE_THREAD can only be used by the main IRQ
function.
This results in the ice_ptp_process_ts function (and in turn the
ice_ptp_tx_tstamp function) only being called exactly once per
interrupt.
If an application sends a burst of Tx timestamps without waiting for a
response, the interrupt will trigger for the first timestamp. However,
later timestamps may not have arrived yet. This can result in dropped or
discarded timestamps. Worse, on E822 hardware this results in the interrupt
logic getting stuck such that no future interrupts will be triggered. The
result is complete loss of Tx timestamp functionality.
Fix this by modifying the ice_misc_intr_thread_fn to perform its own
polling of the ice_ptp_process_ts function. We sleep for a few microseconds
between attempts to avoid wasting significant CPU time. The value was
chosen to allow time for the Tx timestamps to complete without wasting so
much time that we overrun application wait budgets in the worst case.
The ice_ptp_process_ts function also currently returns false in the event
that the Tx tracker is not initialized. This would result in the threaded
IRQ handler never exiting if it gets started while the tracker is not
initialized.
Fix the function to appropriately return true when the tracker is not
initialized.
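The two fixes above can be sketched in userspace C. Everything here is
a hypothetical model (struct tracker_sketch, process_ts, wait_briefly,
thread_fn), not the ice driver code: the thread function keeps calling
the processing function until it reports completion, and the processing
function reports completion right away when the tracker is not
initialized so that the loop cannot spin forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the Tx timestamp tracker. */
struct tracker_sketch {
    bool initialized;
    int pending;  /* timestamps requested but not yet read */
    int ready;    /* timestamps that have arrived and can be read now */
};

/* Returns true when processing is complete, false if the caller should retry. */
bool process_ts(struct tracker_sketch *t)
{
    assert(t != NULL);
    if (!t->initialized)
        return true;  /* nothing to wait for; false here would loop forever */
    while (t->ready > 0 && t->pending > 0) {
        t->ready--;
        t->pending--;
    }
    return t->pending == 0;
}

/* Hypothetical hook: models one more timestamp arriving between polls. */
void wait_briefly(struct tracker_sketch *t)
{
    if (t->ready < t->pending)
        t->ready++;
}

/* Polls until done instead of asking to be woken again. Returns the poll count. */
int thread_fn(struct tracker_sketch *t)
{
    int polls = 1;

    while (!process_ts(t)) {
        wait_briefly(t);  /* the real handler sleeps a few microseconds here */
        polls++;
    }
    return polls;
}
```

With three pending timestamps of which one has already arrived,
thread_fn() needs three polls to drain them; with an uninitialized
tracker it returns after a single poll.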
Note that this will not reproduce with default ptp4l behavior, as the
program always synchronously waits for a timestamp response before sending
another timestamp request.
Reported-by: Siddaraju DH <siddaraju.dh@intel.com> Fixes: 1229b33973c7 ("ice: Add low latency Tx timestamp read") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20221118222729.1565317-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 22 Nov 2022 04:50:12 +0000 (20:50 -0800)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-11-18 (iavf)
Ivan Vecera resolves issues related to reset by adding back call to
netif_tx_stop_all_queues() and adding calls to dev_close() to ensure
device is properly closed during reset.
Stefan Assmann removes waiting for setting of MAC address as this breaks
ARP.
Slawomir adds setting of __IAVF_IN_REMOVE_TASK bit to prevent deadlock
between remove and shutdown.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
iavf: Fix race condition between iavf_shutdown and iavf_remove
iavf: remove INITIAL_MAC_SET to allow gARP to work properly
iavf: Do not restart Tx queues after reset task failure
iavf: Fix a crash during reset task
====================
====================
tipc: fix two race issues in tipc_conn_alloc
The race exists between tipc_topsrv_accept() and tipc_conn_close():
one is allocating the con while the other is freeing it, and there
is no proper lock protecting it. Therefore, a null-pointer-deref
and a use-after-free may be triggered; see details in each patch.
====================
Xin Long [Fri, 18 Nov 2022 21:45:01 +0000 (16:45 -0500)]
tipc: add an extra conn_get in tipc_conn_alloc
One extra conn_get() is needed in tipc_conn_alloc(), because after
tipc_conn_alloc() is called, tipc_conn_close() may free this
con before it is dereferenced in tipc_topsrv_accept():
This patch fixes it by holding a reference in tipc_conn_alloc() and
releasing it in tipc_topsrv_accept() after all accesses. Note that when
doing this in tipc_topsrv_kern_subscr(), tipc_conn_rcv_sub() returns
0 or -1 only, so we don't need to check for "> 0".
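The reference-holding idea can be illustrated with a single-threaded
userspace sketch. The names below (struct conn_sketch, conn_alloc,
conn_get, conn_put, conn_close) are hypothetical and the counter is a
plain int rather than a kref; the point is only that an allocation
which returns with an extra reference for the caller survives a close
that happens before the caller is done.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the connection object. */
struct conn_sketch {
    int refcnt;
    bool *freed;  /* set when the last reference is dropped; test aid only */
};

void conn_get(struct conn_sketch *c)
{
    c->refcnt++;
}

void conn_put(struct conn_sketch *c)
{
    assert(c->refcnt > 0);
    if (--c->refcnt == 0) {
        *c->freed = true;
        free(c);
    }
}

/* Returns with two references: one for the table, one extra for the caller. */
struct conn_sketch *conn_alloc(bool *freed)
{
    struct conn_sketch *c = malloc(sizeof(*c));

    if (!c)
        return NULL;
    c->refcnt = 1;   /* reference owned by the connection table */
    c->freed = freed;
    *freed = false;
    conn_get(c);     /* extra reference held on behalf of the caller */
    return c;
}

/* Models a close racing with the caller: drops the table's reference. */
void conn_close(struct conn_sketch *c)
{
    conn_put(c);
}
```

After conn_alloc(), a conn_close() leaves the object alive with one
reference; it is freed only when the caller issues its own conn_put().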
Fixes: c5fa7b3cf3cb ("tipc: introduce new TIPC server infrastructure") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
It was caused by !con->sock in tipc_conn_close(). In tipc_topsrv_accept(),
con is allocated in conn_idr then its sock is set:
con = tipc_conn_alloc();
... <----[1]
con->sock = newsock;
If tipc_conn_close() is called at any point in [1], the null-pointer-deref
is triggered by con->sock->sk because con->sock is not yet set.
This patch fixes it by moving the con->sock setting to tipc_conn_alloc()
under s->idr_lock. So that con->sock can never be NULL when getting the
con from s->conn_idr. It will be also safer to move con->server and flag
CF_CONNECTED setting under s->idr_lock, as they should all be set before
tipc_conn_alloc() is called.
Fixes: c5fa7b3cf3cb ("tipc: introduce new TIPC server infrastructure") Reported-by: Wei Chen <harperchen1110@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Emeel Hakim [Sun, 30 Oct 2022 09:19:52 +0000 (11:19 +0200)]
net/mlx5e: Fix possible race condition in macsec extended packet number update routine
Currently the extended packet number (EPN) update routine accesses the
macsec object without holding the general macsec lock, and hence faces
a possible race condition when an EPN update occurs while the SA is
being updated or deleted.
Fix by holding the general macsec lock before accessing the object.
Fixes: 4411a6c0abd3 ("net/mlx5e: Support MACsec offload extended packet number (EPN)") Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Emeel Hakim [Sun, 30 Oct 2022 09:52:42 +0000 (11:52 +0200)]
net/mlx5e: Fix MACsec update SecY
Currently, updating SecY destroys and re-creates the RX SA objects.
The re-created RX SA objects are not identical to the destroyed
objects: they disagree on the encryption enabled property, which
holds the value false after recreation. This value is not
supported with offload, which leads to no traffic after an update.
Fix by recreating identical objects.
Fixes: 5a39816a75e5 ("net/mlx5e: Add MACsec offload SecY support") Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Emeel Hakim [Sun, 30 Oct 2022 09:43:24 +0000 (11:43 +0200)]
net/mlx5e: Fix MACsec SA initialization routine
Currently, as part of the MACsec SA initialization routine, the
extended packet number (EPN) object attribute is always
set without checking whether EPN is actually enabled,
which could lead to a NULL dereference.
Fix by adding such a check.
Fixes: 4411a6c0abd3 ("net/mlx5e: Support MACsec offload extended packet number (EPN)") Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Tariq Toukan [Mon, 14 Nov 2022 09:56:11 +0000 (11:56 +0200)]
net/mlx5e: Remove leftovers from old XSK queues enumeration
Before the cited commit, for N channels, a dedicated set of N queues was
created to support XSK, in indices [N, 2N-1], doubling the number of
queues.
In addition, changing the number of channels was prohibited, as it would
shift the indices.
Remove these two leftovers, as we moved XSK to a new queueing scheme,
starting from index 0.
Fixes: 3db4c85cde7a ("net/mlx5e: xsk: Use queue indices starting from 0 for XSK queues") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Chris Mi [Thu, 17 Nov 2022 05:45:45 +0000 (07:45 +0200)]
net/mlx5e: Offload rule only when all encaps are valid
The cited commit adds a for loop to support multiple encapsulations.
But it only checks whether the last encap is valid.
Fix it by setting the slow path flag when any of the encaps is invalid.
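A minimal sketch of the difference, in plain C. Both functions and
their names are hypothetical and operate on an array of validity flags
rather than on mlx5 flow attributes; the first keeps only the result of
the last iteration, the second reports the slow path as soon as any
entry is invalid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Pre-fix shape: each iteration overwrites the result, so only the last counts. */
bool all_valid_buggy(const bool *encap_valid, size_t n)
{
    bool valid = true;

    assert(encap_valid != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        valid = encap_valid[i];
    return valid;
}

/* Post-fix idea: a single invalid encap is enough to require the slow path. */
bool needs_slow_path(const bool *encap_valid, size_t n)
{
    assert(encap_valid != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!encap_valid[i])
            return true;
    }
    return false;
}
```

For the input { false, true } the first function still reports
"valid", while the second correctly asks for the slow path.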
Fixes: f493f15534ec ("net/mlx5e: Move flow attr reformat action bit to per dest flags") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Moshe Shemesh [Sat, 29 Oct 2022 06:03:48 +0000 (09:03 +0300)]
net/mlx5: Fix sync reset event handler error flow
When sync reset now event handling fails on mlx5_pci_link_toggle() then
no reset was done. However, since mlx5_cmd_fast_teardown_hca() was
already done, the firmware function is closed and the driver is left
without firmware functionality.
Fix it by setting the device error state and reopening the firmware resources.
Reopening is done by the thread that was called for devlink reload
fw_activate as it already holds the devlink lock.
Fixes: 5ec697446f46 ("net/mlx5: Add support for devlink reload action fw activate") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Roi Dayan [Wed, 16 Nov 2022 09:10:15 +0000 (11:10 +0200)]
net/mlx5: E-Switch, Set correctly vport destination
The cited commit moved from using the reformat_id integer to the
packet_reformat pointer, which introduced the possibility of a null
pointer dereference.
When the packet reformat flag is set, the pkt_reformat pointer must
exist, so checking MLX5_ESW_DEST_ENCAP is not enough; we need
to make sure pkt_reformat is valid and check for MLX5_ESW_DEST_ENCAP_VALID.
If the dest encap valid flag is not set, then pkt_reformat can be
either an invalid address or null.
Also, to make sure we don't try to access an invalid pkt_reformat, set it to
null when invalidated, and invalidate it before calling the add flow code, as
this is logically more correct and safer.
Fixes: 2b688ea5efde ("net/mlx5: Add flow steering actions to fs_cmd shim layer") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Chris Mi <cmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Eli Cohen [Mon, 15 Aug 2022 08:25:26 +0000 (11:25 +0300)]
net/mlx5: Lag, avoid lockdep warnings
ldev->lock is used to serialize lag change operations. Since multiport
eswitch functionality was added, we now change the mode dynamically.
However, acquiring ldev->lock is not allowed as it could possibly lead
to a deadlock as reported by the lockdep mechanism.
[ 836.154963] WARNING: possible circular locking dependency detected
[ 836.155850] 5.19.0-rc5_net_56b7df2 #1 Not tainted
[ 836.156549] ------------------------------------------------------
[ 836.157418] handler1/12198 is trying to acquire lock:
[ 836.158178] ffff888187d52b58 (&ldev->lock){+.+.}-{3:3}, at: mlx5_lag_do_mirred+0x3b/0x70 [mlx5_core]
[ 836.159575]
[ 836.159575] but task is already holding lock:
[ 836.160474] ffff8881d4de2930 (&block->cb_lock){++++}-{3:3}, at: tc_setup_cb_add+0x5b/0x200
[ 836.161669] which lock already depends on the new lock.
[ 836.162905]
[ 836.162905] the existing dependency chain (in reverse order) is:
[ 836.164008] -> #3 (&block->cb_lock){++++}-{3:3}:
[ 836.164946] down_write+0x25/0x60
[ 836.165548] tcf_block_get_ext+0x1c6/0x5d0
[ 836.166253] ingress_init+0x74/0xa0 [sch_ingress]
[ 836.167028] qdisc_create.constprop.0+0x130/0x5e0
[ 836.167805] tc_modify_qdisc+0x481/0x9f0
[ 836.168490] rtnetlink_rcv_msg+0x16e/0x5a0
[ 836.169189] netlink_rcv_skb+0x4e/0xf0
[ 836.169861] netlink_unicast+0x190/0x250
[ 836.170543] netlink_sendmsg+0x243/0x4b0
[ 836.171226] sock_sendmsg+0x33/0x40
[ 836.171860] ____sys_sendmsg+0x1d1/0x1f0
[ 836.172535] ___sys_sendmsg+0xab/0xf0
[ 836.173183] __sys_sendmsg+0x51/0x90
[ 836.173836] do_syscall_64+0x3d/0x90
[ 836.174471] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 836.175282]
Moshe Shemesh [Thu, 17 Nov 2022 07:07:20 +0000 (09:07 +0200)]
net/mlx5: Fix handling of entry refcount when command is not issued to FW
In case the command interface is down, or the command is not allowed, the
driver did not increment the entry refcount, but might have decremented it
as part of forced completion handling.
Fix that by always incrementing and decrementing the refcount to make it
symmetric for all flows.
Fixes: 50b2412b7e78 ("net/mlx5: Avoid possible free of command entry while timeout comp handler") Signed-off-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reported-by: Jack Wang <jinpu.wang@ionos.com> Tested-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Moshe Shemesh [Tue, 31 May 2022 06:14:03 +0000 (09:14 +0300)]
net/mlx5: cmdif, Print info on any firmware cmd failure to tracepoint
While moving to the new CMD API (quiet API), some pre-existing flows may call
the new API function that, in case of error, returns the error instead of
printing it as previously done.
For such flows we bring back the print, but to a tracepoint this time, so that
sys admins can check for errors, especially for commands using the new quiet API.
Tracepoint output example:
devlink-1333 [001] ..... 822.746922: mlx5_cmd: ACCESS_REG(0x805) op_mod(0x0) failed, status bad resource(0x5), syndrome (0xb06e1f), err(-22)
Fixes: f23519e542e5 ("net/mlx5: cmdif, Add new api for command execution") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Shay Drory [Thu, 4 Aug 2022 09:38:41 +0000 (12:38 +0300)]
net/mlx5: SF: Fix probing active SFs during driver probe phase
When SF devices and SF port representors are located on different
functions, unloading and reloading the SF parent driver doesn't recreate
the existing SFs present in the device.
Fix it by querying SFs and probing active SFs during the driver probe phase.
Moshe Shemesh [Thu, 20 Oct 2022 09:25:59 +0000 (12:25 +0300)]
net/mlx5: Fix FW tracer timestamp calculation
Fix a bug in the calculation of the FW tracer timestamp. Decreasing one in
the calculation should affect only bits 52_7 and not affect bits 6_0 of the
timestamp, otherwise bits 6_0 are always set in this calculation.
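The arithmetic behind "bits 6_0 are always set" can be checked with a
small C sketch. The mask definitions and both function names are
assumptions made for illustration, and the exact expression used by the
patch is not quoted in this log: the first function subtracts after
masking, so the borrow fills the seven low bits; the second masks after
subtracting, which is one way to keep the low bits free for the value
that is OR-ed in.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mask layout: 46 bits starting at bit 7, and the 7 low bits. */
#define MASK_52_7 (((UINT64_C(1) << 46) - 1) << 7)
#define MASK_6_0  UINT64_C(0x7f)

/* Subtract after masking: the low 7 bits are zero, so "- 1" turns them all on. */
uint64_t ts_sub_after_mask(uint64_t event_ts, uint64_t str_ts)
{
    return ((event_ts & MASK_52_7) - 1) | (str_ts & MASK_6_0);
}

/* Mask after subtracting: bits 6..0 come only from str_ts. */
uint64_t ts_mask_after_sub(uint64_t event_ts, uint64_t str_ts)
{
    assert(event_ts != 0);  /* event_ts - 1 would wrap around at zero */
    return ((event_ts - 1) & MASK_52_7) | (str_ts & MASK_6_0);
}
```

For event_ts = 0x1000 and str_ts = 0x05, the first form yields 0xfff
(low bits all ones) and the second yields 0xf85 (low bits 0x05).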
Roy Novich [Sun, 24 Jul 2022 06:49:07 +0000 (09:49 +0300)]
net/mlx5: Do not query pci info while pci disabled
The driver should not interact with PCI while PCI is disabled. Trying to
do so may result in being unable to get vital signs during PCI reset;
the driver then times out and fails to recover.
Fixes: fad1783a6d66 ("net/mlx5: Print more info on pci error handlers") Signed-off-by: Roy Novich <royno@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Arnd Bergmann [Mon, 21 Nov 2022 14:58:38 +0000 (15:58 +0100)]
Merge tag 'am335x-pcm-953-regulators' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes
Regulator changes for am335x-pcm-953
This is for deferred probe issue on am335x-pcm-953 sdhci-omap regulator.
* tag 'am335x-pcm-953-regulators' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: am335x-pcm-953: Define fixed regulators in root node
Vishwanath Pai [Wed, 28 Sep 2022 18:26:50 +0000 (14:26 -0400)]
netfilter: ipset: regression in ip_set_hash_ip.c
This patch introduced a regression: commit 48596a8ddc46 ("netfilter:
ipset: Fix adding an IPv4 range containing more than 2^31 addresses")
The variable e.ip is passed to adtfn() function which finally adds the
ip address to the set. The patch above refactored the for loop and moved
e.ip = htonl(ip) to the end of the for loop.
What this means is that if the value of "ip" changes between the first
assignment of e.ip and the for loop, then e.ip holds a
different IP address than "ip".
Test case:
$ ipset create jdtest_tmp hash:ip family inet hashsize 2048 maxelem 100000
$ ipset add jdtest_tmp 10.0.1.1/31
ipset v6.21.1: Element cannot be added to the set: it's already added
The value of ip gets updated inside the "else if (tb[IPSET_ATTR_CIDR])"
block but e.ip is still pointing to the old value.
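The test case above can be reproduced with a userspace model. This is a
sketch under several assumptions: struct set_sketch and all functions
are hypothetical, addresses are kept in host byte order instead of
going through htonl(), and the set is a tiny array rather than a hash.
The regressed loop adds the stale element first and then the same
address again; refreshing the element at the top of each iteration adds
both addresses of the /31.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SET_CAP 8

/* Hypothetical miniature set; a linear array instead of a hash. */
struct set_sketch {
    uint32_t ips[SET_CAP];
    size_t n;
};

/* Returns false if the element exists already (the "already added" error). */
bool set_add(struct set_sketch *s, uint32_t ip)
{
    for (size_t i = 0; i < s->n; i++) {
        if (s->ips[i] == ip)
            return false;
    }
    if (s->n == SET_CAP)
        return false;
    s->ips[s->n++] = ip;
    return true;
}

/* Regressed shape: e_ip is set before masking and refreshed only at loop end. */
bool add_range_buggy(struct set_sketch *s, uint32_t addr, unsigned int cidr)
{
    assert(cidr <= 32);
    uint32_t e_ip = addr;  /* first assignment, from the unmasked address */
    uint32_t mask = cidr ? ~UINT32_C(0) << (32 - cidr) : 0;
    uint32_t ip = addr & mask;  /* "ip" changes here, e_ip does not */
    uint32_t ip_to = ip | ~mask;

    while (ip <= ip_to) {
        if (!set_add(s, e_ip))
            return false;
        ip++;
        e_ip = ip;
        if (e_ip == 0)  /* wrapped past the top of the address space */
            break;
    }
    return true;
}

/* Fixed shape: the element is refreshed from "ip" at the top of each iteration. */
bool add_range_fixed(struct set_sketch *s, uint32_t addr, unsigned int cidr)
{
    assert(cidr <= 32);
    uint32_t mask = cidr ? ~UINT32_C(0) << (32 - cidr) : 0;
    uint32_t ip = addr & mask;
    uint32_t ip_to = ip | ~mask;

    while (ip <= ip_to) {
        uint32_t e_ip = ip;

        if (!set_add(s, e_ip))
            return false;
        ip++;
        if (ip == 0)
            break;
    }
    return true;
}
```

With addr = 0x0A000101 (10.0.1.1) and cidr = 31, the regressed loop
fails on its second iteration, while the fixed loop stores 10.0.1.0 and
10.0.1.1.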
Fixes: 48596a8ddc46 ("netfilter: ipset: Fix adding an IPv4 range containing more than 2^31 addresses") Reviewed-by: Joshua Hunt <johunt@akamai.com> Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Dan Carpenter [Fri, 18 Nov 2022 15:07:54 +0000 (18:07 +0300)]
octeontx2-af: cn10k: mcs: Fix copy and paste bug in mcs_bbe_intr_handler()
This code accidentally uses the RX macro twice instead of the RX and TX.
Fixes: 6c635f78c474 ("octeontx2-af: cn10k: mcs: Handle MCS block interrupts") Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Fri, 18 Nov 2022 04:21:52 +0000 (20:21 -0800)]
ipv4/fib: Replace zero-length array with DECLARE_FLEX_ARRAY() helper
Zero-length arrays are deprecated[1] and are being replaced with
flexible array members in support of the ongoing efforts to tighten the
FORTIFY_SOURCE routines on memcpy(), correctly instrument array indexing
with UBSAN_BOUNDS, and to globally enable -fstrict-flex-arrays=3.
Replace zero-length array with flexible-array member in struct key_vector.
This results in no differences in binary output.
[1] https://github.com/KSPP/linux/issues/78
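For readers unfamiliar with the distinction, here is a small standard C
sketch of a flexible array member. It does not use the kernel's
DECLARE_FLEX_ARRAY() helper, and struct vec_sketch and vec_alloc are
hypothetical names: the trailing array is declared with empty brackets
instead of [0], and the storage for it is added to the allocation size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical structure with a trailing variable-length part. */
struct vec_sketch {
    size_t len;
    unsigned long slots[];  /* flexible array member, instead of slots[0] */
};

/* Allocates the header plus room for len zero-initialized trailing elements. */
struct vec_sketch *vec_alloc(size_t len)
{
    struct vec_sketch *v = calloc(1, sizeof(*v) + len * sizeof(v->slots[0]));

    if (v)
        v->len = len;
    return v;
}
```

A caller would use v->slots[i] for i < v->len and release the whole
object with a single free(v).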
Cc: Jakub Kicinski <kuba@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: David Ahern <dsahern@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Díaz [Fri, 18 Nov 2022 03:44:21 +0000 (21:44 -0600)]
selftests/net: Find nettest in current directory
The `nettest` binary, built from `selftests/net/nettest.c`,
was expected to be found in the path during test execution of
`fcnal-test.sh` and `pmtu.sh`, leading to tests getting
skipped when the binary is not installed in the system, as can
be seen in these logs found in the wild [1]:
# TEST: vti4: PMTU exceptions [SKIP]
[ 350.600250] IPv6: ADDRCONF(NETDEV_CHANGE): veth_b: link becomes ready
[ 350.607421] IPv6: ADDRCONF(NETDEV_CHANGE): veth_a: link becomes ready
# 'nettest' command not found; skipping tests
# xfrm6udp not supported
# TEST: vti6: PMTU exceptions (ESP-in-UDP) [SKIP]
[ 351.605102] IPv6: ADDRCONF(NETDEV_CHANGE): veth_b: link becomes ready
[ 351.612243] IPv6: ADDRCONF(NETDEV_CHANGE): veth_a: link becomes ready
# 'nettest' command not found; skipping tests
# xfrm4udp not supported
The `unicast_extensions.sh` tests also rely on `nettest`, but
it runs fine there because it looks for the binary in the
current working directory [2]:
The same mechanism that works for the Unicast extensions tests
is here copied over to the PMTU and functional tests.
====================
The following patchset contains late Netfilter fixes for net:
1) Use READ_ONCE()/WRITE_ONCE() to update ct->mark, from Daniel Xu.
Not reported by syzbot, but I presume KASAN would trigger
a splat on this. This is a rather old issue, predating git history.
2) Do not set up extensions for set element with end interval flag
set on. This leads to bogusly skipping these elements as expired
when listing the set/map to userspace as well as increasing
memory consumption when stateful expressions are used. This issue
has been present since 4.18, when timeout support for rbtree set
was added.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Mon, 21 Nov 2022 11:26:57 +0000 (12:26 +0100)]
Merge tag 'imx-fixes-6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes
i.MX fixes for 6.1, part 3:
- Fix a small memory leak in mach-mxs code.
- Correct PCIe pad configuration for imx8mp-evk board.
- Fix ref/tcxo clock frequency property for imx6q-prti6q board.
* tag 'imx-fixes-6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx6q-prti6q: Fix ref/tcxo-clock-frequency properties
arm64: dts: imx8mp-evk: correct pcie pad settings
ARM: mxs: fix memory leak in mxs_machine_init()
Huacai Chen [Mon, 21 Nov 2022 11:02:57 +0000 (19:02 +0800)]
LoongArch: Set _PAGE_DIRTY only if _PAGE_MODIFIED is set in {pmd,pte}_mkwrite()
Set _PAGE_DIRTY only if _PAGE_MODIFIED is set in {pmd,pte}_mkwrite().
Otherwise, _PAGE_DIRTY silences the TLB modify exception and leaves us
no chance to mark a pmd/pte dirty (_PAGE_MODIFIED) for software.
Huacai Chen [Mon, 21 Nov 2022 11:02:57 +0000 (19:02 +0800)]
LoongArch: Set _PAGE_DIRTY only if _PAGE_WRITE is set in {pmd,pte}_mkdirty()
Currently {pmd,pte}_mkdirty() sets the _PAGE_DIRTY bit unconditionally, which
causes random segmentation faults after commit 0ccf7f168e17bb7e ("mm/thp: carry
over dirty bit when thp splits on pmd").
The reason is: on fork(), the parent process uses pmd_wrprotect() to clear
the huge page's _PAGE_WRITE and _PAGE_DIRTY (for COW); then pte_mkdirty() sets
_PAGE_DIRTY as well as _PAGE_MODIFIED while splitting dirty huge pages;
once _PAGE_DIRTY is set, there is no TLB modify exception, so the COW
mechanism fails; and finally memory corruption occurs between the parent
and child processes.
So, we should set _PAGE_DIRTY only when _PAGE_WRITE is set in
{pmd,pte}_mkdirty().
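The bit rules from this patch and from the {pmd,pte}_mkwrite() patch
before it can be modelled in userspace C. The bit positions and the
_sk names below are placeholders for illustration, not the real
LoongArch page table layout or helpers. In this model the hardware
dirty bit is set only when both the write permission and the software
modified bit are present, so a write-protected entry keeps faulting on
write even after it has been marked dirty.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit positions; not the real LoongArch layout. */
#define PAGE_WRITE_SK    (UINT64_C(1) << 0)
#define PAGE_DIRTY_SK    (UINT64_C(1) << 1)  /* hardware: no modify exception */
#define PAGE_MODIFIED_SK (UINT64_C(1) << 2)  /* software dirty tracking */

/* Grant write permission; raise DIRTY only if software already saw a write. */
uint64_t mkwrite_sk(uint64_t pte)
{
    pte |= PAGE_WRITE_SK;
    if (pte & PAGE_MODIFIED_SK)
        pte |= PAGE_DIRTY_SK;
    return pte;
}

/* Record a write for software; raise DIRTY only if the entry is writable. */
uint64_t mkdirty_sk(uint64_t pte)
{
    pte |= PAGE_MODIFIED_SK;
    if (pte & PAGE_WRITE_SK)
        pte |= PAGE_DIRTY_SK;
    return pte;
}

/* Write-protect for COW: clear both the permission and the hardware bit. */
uint64_t wrprotect_sk(uint64_t pte)
{
    return pte & ~(PAGE_WRITE_SK | PAGE_DIRTY_SK);
}

/* Invariant of the model: DIRTY implies both WRITE and MODIFIED. */
void check_pte_sk(uint64_t pte)
{
    if (pte & PAGE_DIRTY_SK)
        assert((pte & PAGE_WRITE_SK) && (pte & PAGE_MODIFIED_SK));
}
```

Marking a write-protected entry dirty, as happens when a dirty huge
page is split after fork(), sets only the modified bit in this model,
so the next write still raises the exception that drives COW.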
Cc: stable@vger.kernel.org Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Huacai Chen [Mon, 21 Nov 2022 11:02:57 +0000 (19:02 +0800)]
LoongArch: Clear FPU/SIMD thread info flags for kernel thread
If a kernel thread is created by a user thread, it may carry FPU/SIMD
thread info flags (TIF_USEDFPU, TIF_USEDSIMD, etc.). Then it will be
considered an FPU owner, and the kernel will try to save its FPU/SIMD
context and cause such errors:
This can be easily triggered by ltp testcase syscalls/io_uring02 and it
can also be easily fixed by clearing the FPU/SIMD thread info flags for
kernel threads in copy_thread().
Cc: stable@vger.kernel.org Reported-by: Qi Hu <huqi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Huacai Chen [Mon, 21 Nov 2022 11:02:57 +0000 (19:02 +0800)]
LoongArch: SMP: Change prefix from loongson3 to loongson
SMP operations can be shared by Loongson-2 series and Loongson-3 series,
so we change the prefix from loongson3 to loongson for all functions and
data structures.
Tiezhu Yang [Mon, 21 Nov 2022 11:02:57 +0000 (19:02 +0800)]
LoongArch: Makefile: Use "grep -E" instead of "egrep"
The latest version of grep claims that egrep is now obsolete, so the build
now contains warnings that look like:
egrep: warning: egrep is obsolescent; using grep -E
Fix this up by changing the LoongArch Makefile to use "grep -E" instead.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Lu Wei [Thu, 17 Nov 2022 15:07:22 +0000 (23:07 +0800)]
net: microchip: sparx5: Fix return value in sparx5_tc_setup_qdisc_ets()
Function sparx5_tc_setup_qdisc_ets() always returns a negative value
because it returns -EOPNOTSUPP at the end. This patch returns the
result of sparx5_tc_ets_add() and sparx5_tc_ets_del() directly.
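A sketch of the control flow in plain C. The enum, the error constant
and every function name here are hypothetical, and the "before" shape
is a guess at the pattern the commit message describes rather than a
copy of the driver: when each case breaks out of the switch, the
trailing error return hides the result of the helper, whereas returning
the helper's result directly lets success reach the caller.

```c
#include <assert.h>

#define EOPNOTSUPP_SK 95  /* hypothetical stand-in for EOPNOTSUPP */

enum ets_cmd_sk { ETS_REPLACE_SK, ETS_DESTROY_SK, ETS_OTHER_SK };

/* Stand-ins for the add/del helpers; both succeed in this sketch. */
int ets_add_sk(void) { return 0; }
int ets_del_sk(void) { return 0; }

/* Before: results are dropped and every path ends in the error return. */
int setup_ets_before(enum ets_cmd_sk cmd)
{
    switch (cmd) {
    case ETS_REPLACE_SK:
        ets_add_sk();
        break;
    case ETS_DESTROY_SK:
        ets_del_sk();
        break;
    default:
        break;
    }
    return -EOPNOTSUPP_SK;
}

/* After: the helper's result is returned directly. */
int setup_ets_after(enum ets_cmd_sk cmd)
{
    switch (cmd) {
    case ETS_REPLACE_SK:
        return ets_add_sk();
    case ETS_DESTROY_SK:
        return ets_del_sk();
    default:
        return -EOPNOTSUPP_SK;
    }
}
```

Only unsupported commands report -EOPNOTSUPP_SK in the second version.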
Fixes: 211225428d65 ("net: microchip: sparx5: add support for offloading ets qdisc") Signed-off-by: Lu Wei <luwei32@huawei.com> Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Shang XiaoJing [Thu, 17 Nov 2022 11:37:14 +0000 (19:37 +0800)]
nfc: s3fwrn5: Fix potential memory leak in s3fwrn5_nci_send()
s3fwrn5_nci_send() does not free the skb when the check before
s3fwrn5_write() fails. As a result, the skb is leaked. Free the
skb when the check fails.
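The ownership rule can be sketched in userspace C. All names are
hypothetical (struct buf_sk, buf_alloc, buf_free, send_sk) and a
counter of live buffers stands in for a leak detector: once the send
function owns the buffer, the early error path has to release it just
like the normal path does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

int live_bufs = 0;  /* number of buffers allocated and not yet freed */

/* Hypothetical stand-in for an skb. */
struct buf_sk {
    int len;
};

struct buf_sk *buf_alloc(int len)
{
    struct buf_sk *b = malloc(sizeof(*b));

    if (b) {
        b->len = len;
        live_bufs++;
    }
    return b;
}

void buf_free(struct buf_sk *b)
{
    assert(b != NULL);
    free(b);
    live_bufs--;
}

/* Takes ownership of b on every path, including the failed pre-check. */
int send_sk(bool device_ready, struct buf_sk *b)
{
    if (!device_ready) {
        buf_free(b);  /* without this, the early return leaks the buffer */
        return -1;
    }
    /* Stand-in for the actual write, which consumes the buffer. */
    buf_free(b);
    return 0;
}
```

After send_sk() returns, live_bufs is back to its previous value
whether or not the pre-check passed.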
Fixes: c04c674fadeb ("nfc: s3fwrn5: Add driver for Samsung S3FWRN5 NFC Chip") Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Suggested-by: Pavel Machek <pavel@denx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Shang XiaoJing [Thu, 17 Nov 2022 11:37:13 +0000 (19:37 +0800)]
nfc: nxp-nci: Fix potential memory leak in nxp_nci_send()
nxp_nci_send() does not free the skb when the check before
write() fails. As a result, the skb is leaked. Free the skb when the
check fails.
Fixes: dece45855a8b ("NFC: nxp-nci: Add support for NXP NCI chips") Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Suggested-by: Pavel Machek <pavel@denx.de> Signed-off-by: David S. Miller <davem@davemloft.net>