Matthew Wilcox (Oracle) [Wed, 25 Dec 2024 09:46:43 +0000 (04:46 -0500)]
gfs2: Use b_folio in gfs2_log_write_bh()
We are preparing to remove bh->b_page. gfs2_log_write() should continue
to operate on pages as some of the memory being logged does not come
from folios, so convert from folio to page in this function.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Linus Torvalds [Thu, 13 Feb 2025 21:13:37 +0000 (13:13 -0800)]
Merge tag 'spi-fix-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A small collection of driver specific fixes, none standing out in
particular"
* tag 'spi-fix-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: sn-f-ospi: Fix division by zero
spi: pxa2xx: Fix regression when toggling chip select on LPSS devices
spi: atmel-quadspi: Fix warning in doc-comment
Linus Torvalds [Thu, 13 Feb 2025 21:09:01 +0000 (13:09 -0800)]
Merge tag 'regulator-fix-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"The main change here is a revert for a cleanup that was done in the
core, attempting to resolve some confusion about how we handle systems
where we've somehow managed to end up with both platform data and
device tree data for the same device. Unfortunately it turns out there
are actually a few systems that deliberately do this and were broken
by the change so we've just reverted it.
There's also a new Qualcomm device ID"
* tag 'regulator-fix-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: core: let dt properties override driver init_data
regulator: qcom_smd: Add l2, l5 sub-node to mp5496 regulator
Linus Torvalds [Thu, 13 Feb 2025 20:17:04 +0000 (12:17 -0800)]
Merge tag 'net-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter, wireless and bluetooth.
Kalle Valo steps down after serving as the WiFi driver maintainer for
over a decade.
Current release - fix to a fix:
- vsock: orphan socket after transport release, avoid null-deref
- Bluetooth: L2CAP: fix corrupted list in hci_chan_del
Current release - regressions:
- eth:
- stmmac: correct Rx buffer layout when SPH is enabled
- iavf: fix a locking bug in an error path
- rxrpc: fix alteration of headers whilst zerocopy pending
- s390/qeth: move netif_napi_add_tx() and napi_enable() from under BH
- Revert "netfilter: flowtable: teardown flow if cached mtu is stale"
Current release - new code bugs:
- rxrpc: fix ipv6 path MTU discovery, only ipv4 worked
- pse-pd: fix deadlock in current limit functions
Previous releases - regressions:
- rtnetlink: fix netns refleak with rtnl_setlink()
- wifi: brcmfmac: use random seed flag for BCM4355 and BCM4364
firmware
Previous releases - always broken:
- add missing RCU protection of struct net throughout the stack
- can: rockchip: bail out if skb cannot be allocated
- eth: ti: am65-cpsw: base XDP support fixes
Misc:
- ethtool: tsconfig: update the format of hwtstamp flags, changes the
uAPI but this uAPI was not in any release yet"
* tag 'net-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
net: pse-pd: Fix deadlock in current limit functions
rxrpc: Fix ipv6 path MTU discovery
Reapply "net: skb: introduce and use a single page frag cache"
s390/qeth: move netif_napi_add_tx() and napi_enable() from under BH
mlxsw: Add return value check for mlxsw_sp_port_get_stats_raw()
ipv6: mcast: add RCU protection to mld_newpack()
team: better TEAM_OPTION_TYPE_STRING validation
Bluetooth: L2CAP: Fix corrupted list in hci_chan_del
Bluetooth: btintel_pcie: Fix a potential race condition
Bluetooth: L2CAP: Fix slab-use-after-free Read in l2cap_send_cmd
net: ethernet: ti: am65_cpsw: fix tx_cleanup for XDP case
net: ethernet: ti: am65-cpsw: fix RX & TX statistics for XDP_TX case
net: ethernet: ti: am65-cpsw: fix memleak in certain XDP cases
vsock/test: Add test for SO_LINGER null ptr deref
vsock: Orphan socket after transport release
MAINTAINERS: Add sctp headers to the general netdev entry
Revert "netfilter: flowtable: teardown flow if cached mtu is stale"
iavf: Fix a locking bug in an error path
rxrpc: Fix alteration of headers whilst zerocopy pending
net: phylink: make configuring clock-stop dependent on MAC support
...
Linus Torvalds [Thu, 13 Feb 2025 20:06:29 +0000 (12:06 -0800)]
Merge tag 'for-6.14-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fix stale page cache after race between readahead and direct IO write
- fix hole expansion when writing at an offset beyond EOF, the range
will not be zeroed
- use proper way to calculate offsets in folio ranges
* tag 'for-6.14-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix hole expansion when writing at an offset beyond EOF
btrfs: fix stale page cache after race between readahead and direct IO write
btrfs: fix two misuses of folio_shift()
Linus Torvalds [Thu, 13 Feb 2025 19:58:11 +0000 (11:58 -0800)]
Merge tag 'bcachefs-2025-02-12' of git://evilpiepirate.org/bcachefs
Pull bcachefs fixes from Kent Overstreet:
"Just small stuff.
As a general announcement, on disk format is now frozen in my master
branch - future on disk format changes will be optional, not required.
- More fixes for going read-only: the previous fix was insufficient,
but with more work on ordering journal reclaim flushing (and a
btree node accounting fix so we don't split until we have to) the
tiering_replication test now consistently goes read-only in less
than a second.
- fix for fsck when we have reflink pointers to missing indirect
extents
- some transaction restart handling fixes from Alan; the "Pass
_orig_restart_count to trans_was_restarted" likely fixes some rare
undefined behaviour heisenbugs"
* tag 'bcachefs-2025-02-12' of git://evilpiepirate.org/bcachefs:
bcachefs: Reuse transaction
bcachefs: Pass _orig_restart_count to trans_was_restarted
bcachefs: CONFIG_BCACHEFS_INJECT_TRANSACTION_RESTARTS
bcachefs: Fix want_new_bset() so we write until the end of the btree node
bcachefs: Split out journal pins by btree level
bcachefs: Fix use after free
bcachefs: Fix marking reflink pointers to missing indirect extents
Kory Maincent [Wed, 12 Feb 2025 15:17:51 +0000 (16:17 +0100)]
net: pse-pd: Fix deadlock in current limit functions
Fix a deadlock in pse_pi_get_current_limit and pse_pi_set_current_limit
caused by consecutive mutex_lock calls. One in the function itself and
another in pse_pi_get_voltage.
Resolve the issue by using the unlocked version of pse_pi_get_voltage
instead.
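The locked/unlocked helper split described above can be sketched in userspace as follows. This is only an illustration: every toy_ name is hypothetical, and the lock is a simple flag that reports a would-be self-deadlock instead of hanging the way a real non-recursive mutex would.

```c
#include <stdbool.h>

/* Toy non-recursive lock (hypothetical, for illustration only). */
struct toy_mutex {
	bool held;
};

/* Returns false where a real non-recursive mutex would deadlock. */
bool toy_lock(struct toy_mutex *m)
{
	if (m->held)
		return false;
	m->held = true;
	return true;
}

void toy_unlock(struct toy_mutex *m)
{
	m->held = false;
}

struct toy_pse {
	struct toy_mutex lock;
	int voltage_uv;
};

/* Unlocked variant: the caller must already hold dev->lock. */
int toy_get_voltage_nolock(struct toy_pse *dev)
{
	return dev->voltage_uv;
}

/* Locked variant: takes the lock itself; -1 models a self-deadlock. */
int toy_get_voltage(struct toy_pse *dev)
{
	int uv;

	if (!toy_lock(&dev->lock))
		return -1;
	uv = toy_get_voltage_nolock(dev);
	toy_unlock(&dev->lock);
	return uv;
}

/* Buggy shape: holds the lock, then calls the locked getter. */
int toy_get_current_limit_buggy(struct toy_pse *dev)
{
	int uv;

	if (!toy_lock(&dev->lock))
		return -1;
	uv = toy_get_voltage(dev);	/* second acquire fails */
	toy_unlock(&dev->lock);
	return uv;
}

/* Fixed shape: holds the lock and calls the unlocked getter. */
int toy_get_current_limit_fixed(struct toy_pse *dev)
{
	int uv;

	if (!toy_lock(&dev->lock))
		return -1;
	uv = toy_get_voltage_nolock(dev);
	toy_unlock(&dev->lock);
	return uv;
}
```

With a device whose voltage_uv is 48000000, the buggy shape returns -1 in this model, while the fixed shape returns the stored voltage.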
David Howells [Wed, 12 Feb 2025 11:21:24 +0000 (11:21 +0000)]
rxrpc: Fix ipv6 path MTU discovery
rxrpc path MTU discovery currently only makes use of ICMPv4, but not
ICMPv6, which means that pmtud for IPv6 doesn't work correctly. Fix it to
check for ICMPv6 messages also.
Fixes: eeaedc5449d9 ("rxrpc: Implement path-MTU probing using padded PING ACKs (RFC8899)")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/3517283.1739359284@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 13 Feb 2025 17:41:33 +0000 (09:41 -0800)]
Merge tag 'for-net-2025-02-13' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- btintel_pcie: Fix a potential race condition
- L2CAP: Fix slab-use-after-free Read in l2cap_send_cmd
- L2CAP: Fix corrupted list in hci_chan_del
* tag 'for-net-2025-02-13' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: L2CAP: Fix corrupted list in hci_chan_del
Bluetooth: btintel_pcie: Fix a potential race condition
Bluetooth: L2CAP: Fix slab-use-after-free Read in l2cap_send_cmd
====================
Jakub Kicinski [Thu, 13 Feb 2025 17:38:50 +0000 (09:38 -0800)]
Merge tag 'nf-25-02-13' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following batch contains one revert for:
1) Revert flowtable entry teardown cycle when skbuff exceeds mtu to
deal with DF flag unset scenarios. This reverts a patch that came in
during the previous merge window (available in 6.14-rc releases).
* tag 'nf-25-02-13' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
Revert "netfilter: flowtable: teardown flow if cached mtu is stale"
====================
Sabrina reports that the revert may trigger warnings due to intervening
changes, especially the ability to raise MAX_SKB_FRAGS. Let's drop it
and revisit once that part is also ironed out.
Linus Torvalds [Thu, 13 Feb 2025 16:43:46 +0000 (08:43 -0800)]
Merge tag 'loongarch-fixes-6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Fix bugs about idle, kernel_page_present(), IP checksum and KVM, plus
some trivial cleanups"
* tag 'loongarch-fixes-6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: KVM: Set host with kernel mode when switch to VM mode
LoongArch: KVM: Remove duplicated cache attribute setting
LoongArch: KVM: Fix typo issue about GCFG feature detection
LoongArch: csum: Fix OoB access in IP checksum code for negative lengths
LoongArch: Remove the deprecated notifier hook mechanism
LoongArch: Use str_yes_no() helper function for /proc/cpuinfo
LoongArch: Fix kernel_page_present() for KPRANGE/XKPRANGE
LoongArch: Fix idle VS timer enqueue
Linus Torvalds [Thu, 13 Feb 2025 16:41:48 +0000 (08:41 -0800)]
Merge tag 'platform-drivers-x86-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
- thinkpad_acpi:
- Fix registration of tpacpi platform driver
- Support fan speed in ticks per revolution (Thinkpad X120e)
- Support V9 DYTC profiles (new Thinkpad AMD platforms)
- int3472: Handle GPIO "enable" vs "reset" variation (ov7251)
* tag 'platform-drivers-x86-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: thinkpad_acpi: Fix registration of tpacpi platform driver
platform/x86: int3472: Call "reset" GPIO "enable" for INT347E
platform/x86: int3472: Use correct type for "polarity", call it gpio_flags
platform/x86: thinkpad_acpi: Support for V9 DYTC platform profiles
platform/x86: thinkpad_acpi: Fix invalid fan speed on ThinkPad X120e
Alexandra Winter [Wed, 12 Feb 2025 16:36:59 +0000 (17:36 +0100)]
s390/qeth: move netif_napi_add_tx() and napi_enable() from under BH
Like other drivers, qeth calls local_bh_enable() after napi_schedule()
to kick-start softirqs [0].
Since netif_napi_add_tx() and napi_enable() now take the netdev_lock()
mutex [1], move them out from under the BH protection. Same solution as in
commit a60558644e20 ("wifi: mt76: move napi_enable() from under BH")
Wentao Liang [Wed, 12 Feb 2025 15:23:11 +0000 (23:23 +0800)]
mlxsw: Add return value check for mlxsw_sp_port_get_stats_raw()
Add a check for the return value of mlxsw_sp_port_get_stats_raw()
in __mlxsw_sp_port_get_stats(). If mlxsw_sp_port_get_stats_raw()
returns an error, exit the function to prevent further processing
with potentially invalid data.
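The pattern described above, checking a helper's return value before consuming its output buffer, can be sketched as below. The toy_ functions and counter values are hypothetical stand-ins, not the mlxsw code; unlike this sketch, the real caller may not return an error code itself.

```c
#include <errno.h>

/* Hypothetical raw-counter reader: fills buf on success, returns -EIO and
 * leaves buf untouched on failure. */
int toy_read_raw(int fail, unsigned long buf[2])
{
	if (fail)
		return -EIO;
	buf[0] = 100;
	buf[1] = 200;
	return 0;
}

/* Caller that exits early instead of parsing an unfilled buffer. */
int toy_get_stats(int fail, unsigned long *rx, unsigned long *tx)
{
	unsigned long buf[2];	/* not initialized, like an on-stack payload */
	int err;

	err = toy_read_raw(fail, buf);
	if (err)
		return err;	/* buf holds no valid data: do not touch it */

	*rx = buf[0];
	*tx = buf[1];
	return 0;
}
```

On failure the caller's rx and tx values are left as they were, rather than being overwritten with whatever happened to be on the stack.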
Fixes: 614d509aa1e7 ("mlxsw: Move ethtool_ops to spectrum_ethtool.c")
Cc: stable@vger.kernel.org # 5.9+
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20250212152311.1332-1-vulab@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Luiz Augusto von Dentz [Thu, 6 Feb 2025 20:54:45 +0000 (15:54 -0500)]
Bluetooth: L2CAP: Fix corrupted list in hci_chan_del
This fixes the following trace by reworking the locking of l2cap_conn.
Instead of only locking when changing the chan_l list, this promotes
chan_lock to a general lock of l2cap_conn, so whenever it is held
it prevents the likes of l2cap_conn_del from running:
Kiran K [Fri, 31 Jan 2025 13:00:19 +0000 (18:30 +0530)]
Bluetooth: btintel_pcie: Fix a potential race condition
On an HCI_OP_RESET command, the firmware raises an alive interrupt. The driver
needs to wait for this before sending other commands. This patch fixes a
potential missed alive interrupt, which can cause HCI_OP_RESET to time out.
Expected flow:
If tx command is HCI_OP_RESET,
1. set data->gp0_received = false
2. send HCI_OP_RESET
3. wait for alive interrupt
Actual flow having potential race:
If tx command is HCI_OP_RESET,
1. send HCI_OP_RESET
1a. Firmware raises alive interrupt here and in ISR
data->gp0_received is set to true
2. set data->gp0_received = false
3. wait for alive interrupt
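The two flows above can be modelled deterministically in userspace. In this sketch (all toy_ names are hypothetical), the simulated firmware answers so quickly that the alive interrupt fires before the send call returns, which is the worst case for the racy ordering.

```c
#include <stdbool.h>

struct toy_dev {
	bool gp0_received;	/* set by the simulated alive interrupt */
};

/* Simulated transmit: the alive interrupt fires before this returns,
 * matching step 1a of the racy flow. */
void toy_send_reset(struct toy_dev *d)
{
	d->gp0_received = true;	/* ISR side effect */
}

/* Stand-in for the wait: a real driver would block with a timeout; here
 * false simply means the wait would have timed out. */
bool toy_wait_alive(const struct toy_dev *d)
{
	return d->gp0_received;
}

/* Racy order: send first, then clear the flag. */
bool toy_reset_racy(struct toy_dev *d)
{
	toy_send_reset(d);
	d->gp0_received = false;	/* wipes the interrupt that already fired */
	return toy_wait_alive(d);
}

/* Expected order: clear the flag, then send, then wait. */
bool toy_reset_fixed(struct toy_dev *d)
{
	d->gp0_received = false;
	toy_send_reset(d);
	return toy_wait_alive(d);
}
```

In this model toy_reset_racy() reports a timeout and toy_reset_fixed() sees the interrupt.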
Signed-off-by: Kiran K <kiran.k@intel.com>
Fixes: 05c200c8f029 ("Bluetooth: btintel_pcie: Add handshake between driver and firmware")
Reported-by: Bjorn Helgaas <helgaas@kernel.org>
Closes: https://patchwork.kernel.org/project/bluetooth/patch/20241001104451.626964-1-kiran.k@intel.com/
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Luiz Augusto von Dentz [Thu, 16 Jan 2025 15:35:03 +0000 (10:35 -0500)]
Bluetooth: L2CAP: Fix slab-use-after-free Read in l2cap_send_cmd
After the hci sync command releases l2cap_conn, the hci receive data work
queue references the released l2cap_conn when sending to the upper layer.
Add hci dev lock to the hci receive data work queue to synchronize the two.
[1]
BUG: KASAN: slab-use-after-free in l2cap_send_cmd+0x187/0x8d0 net/bluetooth/l2cap_core.c:954
Read of size 8 at addr ffff8880271a4000 by task kworker/u9:2/5837
Reported-by: syzbot+31c2f641b850a348a734@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=31c2f641b850a348a734
Tested-by: syzbot+31c2f641b850a348a734@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Roger Quadros [Mon, 10 Feb 2025 14:52:17 +0000 (16:52 +0200)]
net: ethernet: ti: am65_cpsw: fix tx_cleanup for XDP case
For XDP transmit case, swdata doesn't contain SKB but the
XDP Frame. Infer the correct swdata based on buffer type
and return the XDP Frame for XDP transmit case.
Roger Quadros [Mon, 10 Feb 2025 14:52:16 +0000 (16:52 +0200)]
net: ethernet: ti: am65-cpsw: fix RX & TX statistics for XDP_TX case
For successful XDP_TX and XDP_REDIRECT cases, the packet was received
successfully so update RX statistics. Use original received
packet length for that.
TX packets statistics are incremented on TX completion so don't
update it while TX queueing.
If xdp_convert_buff_to_frame() fails, increment tx_dropped.
Roger Quadros [Mon, 10 Feb 2025 14:52:15 +0000 (16:52 +0200)]
net: ethernet: ti: am65-cpsw: fix memleak in certain XDP cases
If the XDP program doesn't result in XDP_PASS then we leak the
memory allocated by am65_cpsw_build_skb().
It is pointless to allocate SKB memory before running the XDP
program as we would be wasting CPU cycles for cases other than XDP_PASS.
Move the SKB allocation after evaluating the XDP program result.
This fixes the memleak. A performance boost is seen for XDP_DROP test.
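The reordering can be illustrated with a toy receive path. This is a sketch under stated assumptions: the toy_ names are hypothetical, and a counter of outstanding buffers stands in for real SKB memory so that the leak is observable without actually leaking anything.

```c
enum toy_verdict { TOY_PASS, TOY_DROP };

int toy_outstanding;	/* buffers built but not yet freed */

/* Models building an SKB: returns a non-zero handle. */
int toy_build_skb(void)
{
	toy_outstanding++;
	return 1;
}

/* Models freeing an SKB; a zero handle is a no-op. */
void toy_free_skb(int skb)
{
	if (skb)
		toy_outstanding--;
}

/* Buggy order: build first, then drop the handle unless the verdict is PASS. */
int toy_rx_buggy(enum toy_verdict v)
{
	int skb = toy_build_skb();

	if (v != TOY_PASS)
		return 0;	/* handle lost without toy_free_skb(): leak */
	return skb;
}

/* Fixed order: evaluate the verdict first, build only for PASS. */
int toy_rx_fixed(enum toy_verdict v)
{
	if (v != TOY_PASS)
		return 0;
	return toy_build_skb();
}
```

For a TOY_DROP verdict the buggy path leaves one buffer outstanding, while the fixed path never builds it, which also mirrors why a drop-heavy workload would do less work.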
Bibo Mao [Thu, 13 Feb 2025 04:02:56 +0000 (12:02 +0800)]
LoongArch: KVM: Set host with kernel mode when switch to VM mode
The PRMD register is only meaningful at the beginning of exception
entry, and it is overwritten by a nested irq or exception.
When the CPU runs in VM mode, interrupts need to be enabled on the host. And
the mode for the host had better be kernel mode rather than random or user mode.
When a VM is running, the running mode shown by the top command comes from the
CRMD register, and the running mode should be kernel mode since a kernel
function is shown as executing by the perf command. The mode needs to be
consistent between the top and perf commands.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Cache attribute comes from GPA->HPA secondary mmu page table and is
configured when kvm is enabled. It is the same for all VMs, so remove
duplicated cache attribute setting on vCPU context switch.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Bibo Mao [Thu, 13 Feb 2025 04:02:56 +0000 (12:02 +0800)]
LoongArch: KVM: Fix typo issue about GCFG feature detection
This is a typo and a misuse of the GCFG feature macro. The code
is wrong, but it does not cause an obvious problem since GCFG is
set again on vCPU context switch.
Yuli Wang [Thu, 13 Feb 2025 04:02:40 +0000 (12:02 +0800)]
LoongArch: Remove the deprecated notifier hook mechanism
The notifier hook mechanism in proc and cpuinfo is actually unnecessary
for LoongArch because it's not used anywhere.
It was originally added to the MIPS code in commit d6d3c9afaab4 ("MIPS:
MT: proc: Add support for printing VPE and TC ids"), and LoongArch then
inherited it.
But as the kernel code stands now, this notifier hook mechanism doesn't
really make sense for either LoongArch or MIPS.
In addition, the seq_file forward declaration needs to be moved to its
proper place, as only the show_ipi_list() function in smp.c requires it.
Huacai Chen [Thu, 13 Feb 2025 04:02:35 +0000 (12:02 +0800)]
LoongArch: Fix kernel_page_present() for KPRANGE/XKPRANGE
Now kernel_page_present() always returns true for KPRANGE/XKPRANGE
addresses. This isn't correct because hibernation (ACPI S4) uses it
to distinguish whether a page is saveable. If all KPRANGE/XKPRANGE
addresses are considered as saveable, then reserved memory such as
EFI_RUNTIME_SERVICES_CODE / EFI_RUNTIME_SERVICES_DATA will also be
saved and restored.
Fix this by returning true only if the KPRANGE/XKPRANGE address is in
memblock.memory.
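A minimal sketch of the changed check follows. The region table is a made-up stand-in for memblock.memory, and the toy_ functions are hypothetical; the point is only that a directly mapped address counts as present when it falls inside a usable RAM range, not unconditionally.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_region {
	uint64_t base;
	uint64_t size;
};

/* Hypothetical stand-in for memblock.memory: usable RAM ranges only.
 * Anything outside them (for example firmware-reserved ranges) is absent. */
static const struct toy_region toy_memory[] = {
	{ 0x00200000, 0x0e000000 },
	{ 0x90000000, 0x70000000 },
};

bool toy_is_memory(uint64_t pa)
{
	for (size_t i = 0; i < sizeof(toy_memory) / sizeof(toy_memory[0]); i++) {
		if (pa >= toy_memory[i].base &&
		    pa - toy_memory[i].base < toy_memory[i].size)
			return true;
	}
	return false;
}

/* Old behaviour: every directly mapped address is reported as present. */
bool toy_page_present_old(uint64_t pa)
{
	(void)pa;
	return true;
}

/* New behaviour: present only if the address is in the memory table. */
bool toy_page_present_new(uint64_t pa)
{
	return toy_is_memory(pa);
}
```

An address such as 0x0f000000, which lies in the gap between the two example ranges, is reported present by the old check and absent by the new one, so a hibernation-style scan would skip it.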
Marco Crivellari [Thu, 13 Feb 2025 04:02:35 +0000 (12:02 +0800)]
LoongArch: Fix idle VS timer enqueue
LoongArch re-enables interrupts on its idle routine and performs a
TIF_NEED_RESCHED check afterwards before putting the CPU to sleep.
The IRQs firing between the check and the idle instruction may set the
TIF_NEED_RESCHED flag. In order to deal with such a race, IRQs
interrupting __arch_cpu_idle() roll back their return address to the
beginning of __arch_cpu_idle() so that TIF_NEED_RESCHED is checked
again before going back to sleep.
However idle IRQs can also queue timers that may require a tick
reprogramming through a new generic idle loop iteration but those timers
would go unnoticed here because __arch_cpu_idle() only checks
TIF_NEED_RESCHED. It doesn't check for pending timers.
Fix this by fast-forwarding the idle IRQs' return address to the end of the
idle routine instead of the beginning, so that the generic idle loop can
handle both TIF_NEED_RESCHED and pending timers.
Michal Luczaj [Mon, 10 Feb 2025 12:15:00 +0000 (13:15 +0100)]
vsock: Orphan socket after transport release
During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereference in virtio_transport_wait_close().
Orphan the socket only after transport release.
Partially reverts the 'Fixes:' commit.
KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
lock_acquire+0x19e/0x500
_raw_spin_lock_irqsave+0x47/0x70
add_wait_queue+0x46/0x230
virtio_transport_release+0x4e7/0x7f0
__vsock_release+0xfd/0x490
vsock_release+0x90/0x120
__sock_release+0xa3/0x250
sock_close+0x14/0x20
__fput+0x35e/0xa90
__x64_sys_close+0x78/0xd0
do_syscall_64+0x93/0x1b0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
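The ordering problem behind this trace can be modelled in a few lines. This is a sketch only: the toy_ types and functions are hypothetical, toy_orphan() models just the one side effect named in the commit message (clearing the wait queue pointer), and a false return marks the spot where the kernel would dereference NULL.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_wq {
	int waiters;
};

struct toy_sock {
	struct toy_wq *wq;	/* models sk->sk_wq */
	bool linger;		/* models SO_LINGER being enabled */
};

/* Models the relevant side effect of orphaning: the wait queue is cleared. */
void toy_orphan(struct toy_sock *sk)
{
	sk->wq = NULL;
}

/* Models a transport release that waits for close when lingering.
 * Returns false where a NULL wait queue would be dereferenced. */
bool toy_transport_release(struct toy_sock *sk)
{
	if (!sk->linger)
		return true;
	if (!sk->wq)
		return false;
	sk->wq->waiters++;
	return true;
}

/* Buggy order: orphan first, then release the transport. */
bool toy_release_buggy(struct toy_sock *sk)
{
	toy_orphan(sk);
	return toy_transport_release(sk);
}

/* Fixed order: release the transport, then orphan. */
bool toy_release_fixed(struct toy_sock *sk)
{
	bool ok = toy_transport_release(sk);

	toy_orphan(sk);
	return ok;
}
```

With lingering enabled, the buggy order reaches the wait with no wait queue, while the fixed order waits first and clears the pointer afterwards.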
Reported-by: syzbot+9d55b199192a4be7d02c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9d55b199192a4be7d02c
Fixes: fcdd2242c023 ("vsock: Keep the binding until socket destruction")
Tested-by: Luigi Leonardi <leonardi@redhat.com>
Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250210-vsock-linger-nullderef-v3-1-ef6244d02b54@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 13 Feb 2025 03:53:03 +0000 (19:53 -0800)]
Merge branch '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-02-11 (idpf, ixgbe, igc)
For idpf:
Sridhar fixes a couple issues in handling of RSC packets.
Josh adds a call to set_real_num_queues() to keep queue count in sync.
For ixgbe:
Piotr removes an IS_ERR() check that was missed when ERR_PTR usage was removed.
For igc:
Zdenek Bouska fixes reporting of Rx timestamp with AF_XDP.
Siang sets buffer type on empty frame to ensure proper handling.
* '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
igc: Set buffer type for empty frames in igc_init_empty_frame
igc: Fix HW RX timestamp when passed by ZC XDP
ixgbe: Fix possible skb NULL pointer dereference
idpf: call set_real_num_queues in idpf_open
idpf: record rx queue in skb for RSC packets
idpf: fix handling rsc packet with a single segment
====================
Mark Pearson [Tue, 11 Feb 2025 17:36:11 +0000 (12:36 -0500)]
platform/x86: thinkpad_acpi: Fix registration of tpacpi platform driver
The recent platform profile changes prevent the tpacpi platform driver
from registering. This error is seen in the kernel logs, and the
various tpacpi entries are not created:
[ 7550.642171] platform thinkpad_acpi: Resources present before probing
This happens because devm_platform_profile_register() is called before
tpacpi_pdev probes (thanks to Kurt Borja for identifying the root
cause).
For now, revert to the old platform_profile_register() to fix the
issue. This is a quick fix and will be re-implemented later, as more
testing is needed for a full solution.
Tested on X1 Carbon G12.
Fixes: 31658c916fa6 ("platform/x86: thinkpad_acpi: Use devm_platform_profile_register()")
Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Reviewed-by: Kurt Borja <kuurtb@gmail.com>
Link: https://lore.kernel.org/r/20250211173620.16522-1-mpearson-lenovo@squebb.ca
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
IPv4 packets with no DF flag set result in frequent flow entry
teardown cycles; this is visible in the network topology that is used in
the nft_flowtable.sh test.
The nft_flowtable.sh test occasionally fails, reporting that the dscp_fwd
test sees no packets going through the flowtable path.
Fixes: b8baac3b9c5c ("netfilter: flowtable: teardown flow if cached mtu is stale")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
David Howells [Sun, 9 Feb 2025 20:07:55 +0000 (20:07 +0000)]
rxrpc: Fix alteration of headers whilst zerocopy pending
AF_RXRPC now uses MSG_SPLICE_PAGES to do zerocopy of the DATA packets when
it transmits them, but to reduce the number of descriptors required in the
DMA ring, it allocates a space for the protocol header in the memory
immediately before the data content so that it can include both in a single
descriptor. This is used for either the main RX header or the smaller
jumbo subpacket header as appropriate:
Now, when it stitches a large jumbo packet together from a number of
individual DATA packets (each of which is 1412 bytes of data), it uses the
full RX header from the first and then the jumbo subpacket header for the
rest of the components:
As mentioned, the main RX header and the jumbo header overlay one another
in memory and the formats don't match, so switching from one to the other
means rearranging the fields and adjusting the flags.
However, now that TLP has been included, it wants to retransmit the last
subpacket as a new data packet on its own, which means switching between
the header formats... and if the transmission is still pending, because of
the MSG_SPLICE_PAGES, we end up corrupting the jumbo subheader.
This has a variety of effects, with the RX service number overwriting the
jumbo checksum/key number field and the RX checksum overwriting the jumbo
flags - resulting in, at the very least, a confused connection-level abort
from the peer.
Fix this by leaving the jumbo header in the allocation with the data, but
allocating the RX header from the page frag allocator and concocting it on
the fly at the point of transmission as it does for ACK packets.
Fixes: 7c482665931b ("rxrpc: Implement RACK/TLP to deal with transmission stalls [RFC8985]")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/2181712.1739131675@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Sat, 8 Feb 2025 11:52:23 +0000 (11:52 +0000)]
net: phylink: make configuring clock-stop dependent on MAC support
We should not be configuring the PHY's clock-stop settings unless the
MAC supports phylink managed EEE. Make this dependent on MAC support.
This was noticed in a suspicious RCU usage report from the kernel
test robot (the suspicious RCU usage due to calling phy_detach()
remains unaddressed, but is triggered by the error this was
generating.)
Fixes: 03abf2a7c654 ("net: phylink: add EEE management")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1tgjNn-003q0w-Pw@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Filipe Manana [Wed, 5 Feb 2025 17:36:48 +0000 (17:36 +0000)]
btrfs: fix hole expansion when writing at an offset beyond EOF
At btrfs_write_check() if our file's i_size is not sector size aligned and
we have a write that starts at an offset larger than the i_size that falls
within the same page of the i_size, then we end up not zeroing the file
range [i_size, write_offset).
ret = btrfs_cont_expand(BTRFS_I(inode), oldsize, end_pos);
if (ret)
return ret;
}
So if our file's i_size is 90269 bytes and a write at offset 90365 bytes
comes in, we get 'start_pos' set to 90112 bytes, which is less than the
i_size and therefore we don't zero out the range [90269, 90365) by
calling btrfs_cont_expand().
This is an old bug introduced in commit 9036c10208e1 ("Btrfs: update hole
handling v2"), from 2008, and the buggy code got moved around over the
years.
Fix this by discarding 'start_pos' and comparing against the write offset
('pos') without any alignment.
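The arithmetic in the example above can be checked with a small sketch. The toy_ helpers are hypothetical, and a 4096-byte sector size is assumed because it reproduces the 90112 value quoted in the text; the two predicates model only the comparison, not btrfs_cont_expand() itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round x down to a multiple of align (align must be a power of two). */
uint64_t toy_round_down(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

/* Old check: compares the sector-aligned write start against i_size. */
bool toy_expands_old(uint64_t pos, uint64_t i_size, uint64_t sectorsize)
{
	return toy_round_down(pos, sectorsize) > i_size;
}

/* New check: compares the unaligned write offset against i_size. */
bool toy_expands_new(uint64_t pos, uint64_t i_size)
{
	return pos > i_size;
}
```

For i_size = 90269 and pos = 90365, the aligned start is 90112, so the old predicate is false and the range [90269, 90365) would not be zeroed, while the new predicate is true. For a write starting in a later sector, both predicates agree.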
This bug was recently exposed by test case generic/363 which tests this
scenario by polluting ranges beyond EOF with an mmap write and then verifying
that after a file increases we get zeroes for the range which is supposed
to be a hole and not what we wrote with the previous mmaped write.
We're only seeing this exposed now because generic/363 used to run only
on xfs until last Sunday's fstests update.
generic/363 0s ... [failed, exit status 1]- output mismatch (see /home/fdmanana/git/hub/xfstests/results//generic/363.out.bad)
--- tests/generic/363.out 2025-02-05 15:31:14.013646509 +0000
+++ /home/fdmanana/git/hub/xfstests/results//generic/363.out.bad 2025-02-05 17:25:33.112630781 +0000
@@ -1 +1,46 @@
QA output created by 363
+READ BAD DATA: offset = 0xdcad, size = 0xd921, fname = /home/fdmanana/btrfs-tests/dev/junk
+OFFSET GOOD BAD RANGE
+0x1609d 0x0000 0x3104 0x0
+operation# (mod 256) for the bad data may be 4
+0x1609e 0x0000 0x0472 0x1
+operation# (mod 256) for the bad data may be 4
...
(Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/generic/363.out /home/fdmanana/git/hub/xfstests/results//generic/363.out.bad' to see the entire diff)
Ran: generic/363
Failures: generic/363
Failed 1 of 1 tests
Filipe Manana [Tue, 4 Feb 2025 11:02:32 +0000 (11:02 +0000)]
btrfs: fix stale page cache after race between readahead and direct IO write
After commit ac325fc2aad5 ("btrfs: do not hold the extent lock for entire
read") we can now trigger a race between a task doing a direct IO write
and readahead. When this race is triggered it results in tasks getting
stale data when they attempt to do a buffered read (including the task that
did the direct IO write).
This race can be sporadically triggered with test case generic/418, failing
like this:
generic/418 14s ... - output mismatch (see /home/fdmanana/git/hub/xfstests/results//generic/418.out.bad)
--- tests/generic/418.out 2020-06-10 19:29:03.850519863 +0100
+++ /home/fdmanana/git/hub/xfstests/results//generic/418.out.bad 2025-02-03 15:42:36.974609476 +0000
@@ -1,2 +1,5 @@
QA output created by 418
+cmpbuf: offset 0: Expected: 0x1, got 0x0
+[6:0] FAIL - comparison failed, offset 24576
+diotest -wp -b 4096 -n 8 -i 4 failed at loop 3
Silence is golden
...
(Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/generic/418.out /home/fdmanana/git/hub/xfstests/results//generic/418.out.bad' to see the entire diff)
Ran: generic/418
Failures: generic/418
Failed 1 of 1 tests
The race happens like this:
1) A file has a prealloc extent for the range [16K, 28K);
2) Task A starts a direct IO write against file range [24K, 28K).
At the start of the direct IO write it invalidates the page cache at
__iomap_dio_rw() with kiocb_invalidate_pages() for the 4K page at file
offset 24K;
3) Task A enters btrfs_dio_iomap_begin() and locks the extent range
[24K, 28K);
4) Task B starts a readahead for file range [16K, 28K), entering
btrfs_readahead().
First it attempts to read the page at offset 16K by entering
btrfs_do_readpage(), where it calls get_extent_map(), locks the range
[16K, 20K) and gets the extent map for the range [16K, 28K), caching
it into the 'em_cached' variable declared in the local stack of
btrfs_readahead(), and then unlocks the range [16K, 20K).
Since the extent map has the prealloc flag, at btrfs_do_readpage() we
zero out the page's content and don't submit any bio to read the page
from the extent.
Then it attempts to read the page at offset 20K entering
btrfs_do_readpage() where we reuse the previously cached extent map
(decided by get_extent_map()) since it spans the page's range and
it's still in the inode's extent map tree.
Just like for the previous page, we zero out the page's content since
the extent map has the prealloc flag set.
Then it attempts to read the page at offset 24K entering
btrfs_do_readpage() where we reuse the previously cached extent map
(decided by get_extent_map()) since it spans the page's range and
it's still in the inode's extent map tree.
Just like for the previous pages, we zero out the page's content since
the extent map has the prealloc flag set. Note that we didn't lock the
extent range [24K, 28K), so we didn't synchronize with the ongoing
direct IO write being performed by task A;
5) Task A enters btrfs_create_dio_extent() and creates an ordered extent
for the range [24K, 28K), with the flags BTRFS_ORDERED_DIRECT and
BTRFS_ORDERED_PREALLOC set;
6) Task A unlocks the range [24K, 28K) at btrfs_dio_iomap_begin();
7) The ordered extent enters btrfs_finish_one_ordered() and locks the
range [24K, 28K);
8) Task A enters fs/iomap/direct-io.c:iomap_dio_complete() and it tries
to invalidate the page at offset 24K by calling
kiocb_invalidate_post_direct_write(), resulting in a call chain that
ends up at btrfs_release_folio().
The btrfs_release_folio() call ends up returning false because the range
for the page at file offset 24K is currently locked by the task doing
the ordered extent completion in the previous step (7), so we have:
This last function checking that the range is locked and returning false
and propagating it up to btrfs_release_folio().
So this results in a failure to invalidate the page and
kiocb_invalidate_post_direct_write() triggers this message logged in
dmesg:
Page cache invalidation failure on direct I/O. Possible data corruption due to collision with buffered I/O!
After this we leave the page cache with stale data for the file range
[24K, 28K), filled with zeroes instead of the data written by direct IO
write (all bytes with a 0x01 value), so any task attempting to read with
buffered IO, including the task that did the direct IO write, will get
all bytes in the range with a 0x00 value instead of the written data.
Fix this by locking the range, with btrfs_lock_and_flush_ordered_range(),
at the two callers of btrfs_do_readpage() instead of doing it at
get_extent_map(), just like we did before commit ac325fc2aad5 ("btrfs: do
not hold the extent lock for entire read"), and unlocking the range after
all the calls to btrfs_do_readpage(). This way we never reuse a cached
extent map without flushing any pending ordered extents from a concurrent
direct IO write.
Fixes: ac325fc2aad5 ("btrfs: do not hold the extent lock for entire read") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Linus Torvalds [Tue, 11 Feb 2025 18:19:36 +0000 (10:19 -0800)]
Merge tag 'tomoyo-pr-20250211' of git://git.code.sf.net/p/tomoyo/tomoyo
Pull tomoyo fixes from Tetsuo Handa:
"Redo of pathname patternization and fix spelling errors"
* tag 'tomoyo-pr-20250211' of git://git.code.sf.net/p/tomoyo/tomoyo:
tomoyo: use better patterns for procfs in learning mode
tomoyo: fix spelling errors
tomoyo: fix spelling error
Patrick Bellasi [Wed, 5 Feb 2025 14:04:41 +0000 (14:04 +0000)]
x86/cpu/kvm: SRSO: Fix possible missing IBPB on VM-Exit
In [1] the meaning of the synthetic IBPB flags has been redefined for a
better separation of concerns:
- ENTRY_IBPB -- issue IBPB on entry only
- IBPB_ON_VMEXIT -- issue IBPB on VM-Exit only
and the Retbleed mitigations have been updated to match these new
semantics.
Commit [2] was merged shortly before [1], and their interaction was not
handled properly. This resulted in IBPB not being triggered on VM-Exit
in all SRSO mitigation configs requesting an IBPB there.
Specifically, an IBPB on VM-Exit is triggered only when
X86_FEATURE_IBPB_ON_VMEXIT is set. However:
- X86_FEATURE_IBPB_ON_VMEXIT is not set for "spec_rstack_overflow=ibpb",
because before [1] having X86_FEATURE_ENTRY_IBPB was enough. Hence,
an IBPB is triggered on entry but the expected IBPB on VM-exit is
not.
- X86_FEATURE_IBPB_ON_VMEXIT is also not set for
"spec_rstack_overflow=ibpb-vmexit" if X86_FEATURE_ENTRY_IBPB is
already set.
That's because before [1] this was effectively redundant. Hence, e.g.
a "retbleed=ibpb spec_rstack_overflow=ibpb-vmexit" config mistakenly
reports the machine still vulnerable to SRSO, despite an IBPB being
triggered both on entry and VM-Exit, because of the Retbleed selected
mitigation config.
- UNTRAIN_RET_VM still won't actually do anything unless
CONFIG_MITIGATION_IBPB_ENTRY is set.
For "spec_rstack_overflow=ibpb", enable IBPB on both entry and VM-Exit
and clear X86_FEATURE_RSB_VMEXIT which is made superfluous by
X86_FEATURE_IBPB_ON_VMEXIT. This effectively makes this mitigation
option similar to the one for 'retbleed=ibpb', so re-order the code
for the RETBLEED_MITIGATION_IBPB option to be less confusing by
enabling all features before disabling the ones that are not needed.
For "spec_rstack_overflow=ibpb-vmexit", guard this mitigation setting
with CONFIG_MITIGATION_IBPB_ENTRY to ensure UNTRAIN_RET_VM sequence is
effectively compiled in. Drop instead the CONFIG_MITIGATION_SRSO guard,
since none of the SRSO compile cruft is required in this configuration.
Also, check only that the required microcode is present to effectively
enable the IBPB on VM-Exit.
Finally, update the KConfig description for CONFIG_MITIGATION_IBPB_ENTRY
to list also all SRSO config settings enabled by this guard.
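The intended flag semantics can be sketched in user-space C. This is only a model of the behaviour described above, not the code in arch/x86: struct ibpb_flags and the three helpers are hypothetical names, and the booleans merely stand in for the synthetic feature bits.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the synthetic feature bits named above. */
struct ibpb_flags {
	bool entry_ibpb;     /* models X86_FEATURE_ENTRY_IBPB */
	bool ibpb_on_vmexit; /* models X86_FEATURE_IBPB_ON_VMEXIT */
	bool rsb_vmexit;     /* models X86_FEATURE_RSB_VMEXIT */
};

/* "spec_rstack_overflow=ibpb": IBPB on both entry and VM-Exit. */
static void srso_select_ibpb(struct ibpb_flags *f)
{
	/* Enable everything that is needed first ... */
	f->entry_ibpb = true;
	f->ibpb_on_vmexit = true;
	/* ... then drop what the VM-Exit IBPB makes superfluous. */
	f->rsb_vmexit = false;
}

/* "spec_rstack_overflow=ibpb-vmexit": IBPB on VM-Exit only. */
static void srso_select_ibpb_vmexit(struct ibpb_flags *f)
{
	f->ibpb_on_vmexit = true;
}

/* After [1], only the VM-Exit flag decides whether VM-Exit issues an IBPB. */
static bool ibpb_issued_on_vmexit(const struct ibpb_flags *f)
{
	return f->ibpb_on_vmexit;
}
```

In this model, having only entry_ibpb set no longer implies an IBPB on VM-Exit, which is the gap the patch closes.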
Fixes: 864bcaa38ee4 ("x86/cpu/kvm: Provide UNTRAIN_RET_VM") [1] Fixes: d893832d0e1e ("x86/srso: Add IBPB on VMEXIT") [2] Reported-by: Yosry Ahmed <yosryahmed@google.com> Signed-off-by: Patrick Bellasi <derkling@google.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sakari Ailus [Tue, 11 Feb 2025 07:28:40 +0000 (09:28 +0200)]
platform/x86: int3472: Call "reset" GPIO "enable" for INT347E
The DT bindings for ov7251 specify "enable" GPIO (xshutdown in
documentation) but the int3472 indiscriminately provides this as a "reset"
GPIO to sensor drivers. Take this into account by assigning it as "enable"
with active high polarity for INT347E devices, i.e. ov7251. "reset" with
active low polarity remains the default GPIO name for other devices.
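A minimal sketch of the naming rule, assuming the decision is keyed on the ACPI HID string. struct gpio_naming and int3472_reset_gpio_naming() are hypothetical names for illustration and do not mirror the driver's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical result type: the name and polarity handed to the sensor driver. */
struct gpio_naming {
	const char *con_id; /* GPIO name seen by the sensor driver */
	bool active_low;    /* line polarity */
};

/* Pick the name and polarity for the GPIO that int3472 calls "reset". */
static struct gpio_naming int3472_reset_gpio_naming(const char *hid)
{
	/* Default for all other devices: "reset", active low. */
	struct gpio_naming n = { "reset", true };

	/* INT347E (ov7251) expects "enable" with active high polarity. */
	if (hid && strcmp(hid, "INT347E") == 0) {
		n.con_id = "enable";
		n.active_low = false;
	}
	return n;
}
```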
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20250211072841.7713-3-sakari.ailus@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Sakari Ailus [Tue, 11 Feb 2025 07:28:39 +0000 (09:28 +0200)]
platform/x86: int3472: Use correct type for "polarity", call it gpio_flags
The type of the flags field in struct gpiod_lookup is unsigned long. Thus
use unsigned long for values to be assigned to that field. Similarly, call
the field gpio_flags, which is what it really is.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20250211072841.7713-2-sakari.ailus@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Song Yoong Siang [Wed, 5 Feb 2025 02:36:03 +0000 (10:36 +0800)]
igc: Set buffer type for empty frames in igc_init_empty_frame
Set the buffer type to IGC_TX_BUFFER_TYPE_SKB for empty frame in the
igc_init_empty_frame function. This ensures that the buffer type is
correctly identified and handled during Tx ring cleanup.
Fixes: db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit") Cc: stable@vger.kernel.org # 6.2+ Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Zdenek Bouska [Tue, 28 Jan 2025 12:26:48 +0000 (13:26 +0100)]
igc: Fix HW RX timestamp when passed by ZC XDP
Fixes HW RX timestamp in the following scenario:
- AF_PACKET socket with enabled HW RX timestamps is created
- AF_XDP socket with enabled zero copy is created
- frame is forwarded to the BPF program, where the timestamp should
still be readable (extracted by igc_xdp_rx_timestamp(), kfunc
behind bpf_xdp_metadata_rx_timestamp())
- the frame got XDP_PASS from BPF program, redirecting to the stack
- AF_PACKET socket receives the frame with HW RX timestamp
Moves the skb timestamp setting from igc_dispatch_skb_zc() to
igc_construct_skb_zc() so that igc_construct_skb_zc() is similar to
igc_construct_skb().
This issue can also be reproduced by running:
# tools/testing/selftests/bpf/xdp_hw_metadata enp1s0
When a frame with the wrong port 9092 (instead of 9091) is used:
# echo -n xdp | nc -u -q1 192.168.10.9 9092
then the RX timestamp is missing and xdp_hw_metadata prints:
skb hwtstamp is not found!
With this fix or when copy mode is used:
# tools/testing/selftests/bpf/xdp_hw_metadata -c enp1s0
then RX timestamp is found and xdp_hw_metadata prints:
found skb hwtstamp = 1736509937.852786132
Fixes: 069b142f5819 ("igc: Add support for PTP .getcyclesx64()") Signed-off-by: Zdenek Bouska <zdenek.bouska@siemens.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Florian Bezdeka <florian.bezdeka@siemens.com> Reviewed-by: Song Yoong Siang <yoong.siang.song@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Piotr Kwapulinski [Fri, 31 Jan 2025 12:14:50 +0000 (13:14 +0100)]
ixgbe: Fix possible skb NULL pointer dereference
The commit c824125cbb18 ("ixgbe: Fix passing 0 to ERR_PTR in
ixgbe_run_xdp()") stopped utilizing the ERR-like macros for xdp status
encoding. Propagate this logic to the ixgbe_put_rx_buffer().
The commit also relaxed the skb NULL pointer check - caught by Smatch.
Restore this check.
Fixes: c824125cbb18 ("ixgbe: Fix passing 0 to ERR_PTR in ixgbe_run_xdp()") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/intel-wired-lan/2c7d6c31-192a-4047-bd90-9566d0e14cc0@stanley.mountain/ Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Joshua Hay [Wed, 5 Feb 2025 02:08:11 +0000 (18:08 -0800)]
idpf: call set_real_num_queues in idpf_open
On initial driver load, alloc_etherdev_mqs is called with whatever max
queue values are provided by the control plane. However, if the driver
is loaded on a system where num_online_cpus() returns less than the max
queues, the netdev will think there are more queues than are actually
available. Only num_online_cpus() will be allocated, but
skb_get_queue_mapping(skb) could possibly return an index beyond the
range of allocated queues. Consequently, the packet is silently dropped
and it appears as if TX is broken.
Set the real number of queues during open so the netdev knows how many
queues will be allocated.
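The mismatch can be illustrated with a small user-space model. Both helpers are hypothetical, and the sketch assumes the number of allocated queues is simply the smaller of the control plane maximum and the online CPU count.

```c
#include <assert.h>

/* Queues actually allocated: capped by the number of online CPUs. */
static unsigned int real_num_queues(unsigned int max_queues,
				    unsigned int online_cpus)
{
	return max_queues < online_cpus ? max_queues : online_cpus;
}

/* A queue index picked by the stack is usable only inside the allocated range. */
static int queue_index_valid(unsigned int index, unsigned int allocated)
{
	return index < allocated;
}
```

With 16 advertised queues and 4 online CPUs, an index of 7 is out of range unless the netdev is told that only 4 queues are real.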
Fixes: 1c325aac10a8 ("idpf: configure resources for TX queues") Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Sridhar Samudrala [Sat, 11 Jan 2025 00:29:22 +0000 (16:29 -0800)]
idpf: fix handling rsc packet with a single segment
Handle an RSC packet with a single segment the same way as a
multi-segment RSC packet so that CHECKSUM_PARTIAL is set in the
skb->ip_summed field. The current code passes CHECKSUM_NONE,
which results in the TCP GRO layer doing the checksum in SW and
hiding the issue. This will fail when using dmabufs as payload
buffers, as the skb frag would be unreadable.
Fixes: 3a8845af66ed ("idpf: add RX splitq napi poll support") Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
When submitting the change above, it was thought that the origin of the
init_data should be a clear choice, from the driver or from DT but not
both.
It turns out some devices, such as qcom-msm8974-lge-nexus5-hammerhead,
relied on the old behaviour to override the init_data provided by the
driver, making it some kind of default if none is provided by the platform.
Using the init_data provided by the driver when it is present broke these
devices, so revert the change to fix up the situation and add a comment
to make things a bit more clear.
Reported-by: Luca Weiss <luca@lucaweiss.eu> Closes: https://lore.kernel.org/lkml/5857103.DvuYhMxLoT@lucaweiss.eu Fixes: cd7a38c40b23 ("regulator: core: do not silently ignore provided init_data") Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Link: https://patch.msgid.link/20250211-regulator-init-data-fixup-v1-1-5ce1c6cff990@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
Kent Overstreet [Mon, 10 Feb 2025 22:46:36 +0000 (17:46 -0500)]
bcachefs: Fix want_new_bset() so we write until the end of the btree node
want_new_bset() returns the address of a new bset to initialize if we
wish to do so in a btree node - either because the previous one is too
big, or because it's been written.
The case for 'previous bset was written' was wrong: it's only supposed
to check for if we have space in the node for one more block, but
because it subtracted the header from the space available it would never
initialize a new bset if we were down to the last block in a node.
Fixing this results in fewer btree node splits/compactions, which fixes
a bug with flushing the journal to go read-only sometimes not
terminating or taking excessively long.
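The off-by-a-header problem can be shown with a simplified space check. This is a sketch under assumptions, not the bcachefs code: the function names, the byte-based units and the parameters are all hypothetical, and written is assumed never to exceed node_size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Old behaviour (modelled): the header is subtracted before comparing. */
static bool room_for_block_buggy(size_t node_size, size_t written,
				 size_t block_size, size_t header_size)
{
	size_t avail = node_size - written;

	return avail >= header_size && avail - header_size >= block_size;
}

/* Fixed behaviour (modelled): only ask whether one more block fits. */
static bool room_for_block_fixed(size_t node_size, size_t written,
				 size_t block_size)
{
	return node_size - written >= block_size;
}
```

When exactly one block is left, the first check fails and the second succeeds, which matches the "last block in a node" case described above.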
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 10 Feb 2025 16:34:59 +0000 (11:34 -0500)]
bcachefs: Split out journal pins by btree level
This lets us flush the journal to go read-only more effectively.
Flushing the journal and going read-only requires halting mutually
recursive processes, which strictly speaking are not guaranteed to
terminate.
Flushing btree node journal pins will kick off a btree node write, and
btree node writes on completion must do another btree update to the
parent node to update the 'sectors_written' field for that node's key.
If the parent node is full and requires a split or compaction, that's
going to generate a whole bunch of additional btree updates - alloc
info, LRU btree, and more - which then have to be flushed, and the cycle
repeats.
This process will terminate much more effectively if we tweak journal
reclaim to flush btree updates leaf to root: i.e., don't flush updates
for a given btree node (kicking off a write, and consuming space within
that node up to the next block boundary) if there might still be
unflushed updates in child nodes.
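The leaf-to-root ordering amounts to always picking the lowest btree level that still has unflushed pins. A minimal sketch, assuming pins are counted per level with level 0 being the leaves; MAX_LEVELS and next_level_to_flush() are hypothetical and not taken from the bcachefs sources.

```c
#include <assert.h>

#define MAX_LEVELS 4 /* hypothetical depth limit for this sketch */

/*
 * pending[level] is the number of unflushed journal pins at that btree
 * level; level 0 is the leaves. Returns the level to flush next, or -1
 * when nothing is pending.
 */
static int next_level_to_flush(const unsigned int pending[MAX_LEVELS])
{
	for (int level = 0; level < MAX_LEVELS; level++)
		if (pending[level])
			return level;
	return -1;
}
```

An interior node is only chosen once every level below it has drained, so a parent is not written while its children may still generate updates to it.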
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Rob Herring (Arm) [Fri, 24 Jan 2025 19:16:44 +0000 (13:16 -0600)]
mfd: syscon: Restore device_node_to_regmap() for non-syscon nodes
Commit ba5095ebbc7a ("mfd: syscon: Allow syscon nodes without a
"syscon" compatible") broke drivers which call device_node_to_regmap()
on nodes without a "syscon" compatible. Restore the prior behavior for
device_node_to_regmap().
This also makes using device_node_to_regmap() incompatible with
of_syscon_register_regmap() again, so add kerneldoc for
device_node_to_regmap() and syscon_node_to_regmap() to make it clear
how and when each one should be used.
Fixes: ba5095ebbc7a ("mfd: syscon: Allow syscon nodes without a "syscon" compatible") Reported-by: Vaishnav Achath <vaishnav.a@ti.com> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Chen-Yu Tsai <wenst@chromium.org> Tested-by: Nishanth Menon <nm@ti.com> Tested-by: Daniel Golle <daniel@makrotopia.org> Tested-by: Frank Wunderlich <frank-w@public-files.de> Tested-by: Dhruva Gole <d-gole@ti.com> Tested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20250124191644.2309790-1-robh@kernel.org Signed-off-by: Lee Jones <lee@kernel.org>
Paolo Abeni [Tue, 11 Feb 2025 09:39:46 +0000 (10:39 +0100)]
Merge tag 'batadv-net-pullrequest-20250207' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
Here are some batman-adv bugfixes:
- Fix panic during interface removal in BATMAN V, by Andy Strohman
- Cleanup BATMAN V/ELP metric handling, by Sven Eckelmann (2 patches)
- Fix incorrect offset in batadv_tt_tvlv_ogm_handler_v1(),
by Remi Pommarel
* tag 'batadv-net-pullrequest-20250207' of git://git.open-mesh.org/linux-merge:
batman-adv: Fix incorrect offset in batadv_tt_tvlv_ogm_handler_v1()
batman-adv: Drop unmanaged ELP metric worker
batman-adv: Ignore neighbor throughput metrics in error case
batman-adv: fix panic during interface removal
====================
Thomas Weißschuh [Fri, 7 Feb 2025 09:39:06 +0000 (10:39 +0100)]
ptp: vmclock: Remove goto-based cleanup logic
vmclock_probe() uses an "out:" label to return from the function on
error. This indicates that some cleanup operation is necessary.
However the label does not do anything as all resources are managed
through devres, making the code slightly harder to read.
Remove the label and just return directly.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Thomas Weißschuh [Fri, 7 Feb 2025 09:39:05 +0000 (10:39 +0100)]
ptp: vmclock: Clean up miscdev and ptp clock through devres
Most resources owned by the vmclock device are managed through devres.
Only the miscdev and ptp clock are managed manually.
This makes the code slightly harder to understand than necessary.
Switch them over to devres and remove the now unnecessary drvdata.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Thomas Weißschuh [Fri, 7 Feb 2025 09:39:04 +0000 (10:39 +0100)]
ptp: vmclock: Don't unregister misc device if it was not registered
vmclock_remove() tries to detect the successful registration of the misc
device based on its minor number.
However that check is incorrect if the misc device registration was not
attempted in the first place.
Always initialize the minor number, so the check works properly.
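A user-space sketch of why the sentinel must be set up front. It assumes the remove path compares the minor against the "dynamic" sentinel, which is 255 for MISC_DYNAMIC_MINOR in the kernel; struct fake_miscdev and the helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define DYNAMIC_MINOR 255 /* models MISC_DYNAMIC_MINOR */

struct fake_miscdev {
	int minor;
};

/* The fix: always start from the sentinel, even if registration is skipped. */
static void state_init(struct fake_miscdev *m)
{
	m->minor = DYNAMIC_MINOR;
}

/* Registration replaces the sentinel with the minor that was assigned. */
static void fake_register(struct fake_miscdev *m, int assigned_minor)
{
	m->minor = assigned_minor;
}

/* Check modelled on the remove path: deregister only if registered. */
static bool needs_deregister(const struct fake_miscdev *m)
{
	return m->minor != DYNAMIC_MINOR;
}
```

A zero-initialized structure has minor 0, which is not the sentinel, so without state_init() the check would wrongly report a device that was never registered.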
Fixes: 205032724226 ("ptp: Add support for the AMZNC10C 'vmclock' device") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Thomas Weißschuh [Fri, 7 Feb 2025 09:39:03 +0000 (10:39 +0100)]
ptp: vmclock: Set driver data before its usage
If vmclock_ptp_register() fails during probing, vmclock_remove() is
called to clean up the ptp clock and misc device.
It uses dev_get_drvdata() to access the vmclock state.
However the driver data is not yet set at this point.
Assign the driver data earlier.
Fixes: 205032724226 ("ptp: Add support for the AMZNC10C 'vmclock' device") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
David Woodhouse [Fri, 7 Feb 2025 09:39:02 +0000 (10:39 +0100)]
ptp: vmclock: Add .owner to vmclock_miscdev_fops
Without the .owner field, the module can be unloaded while /dev/vmclock0
is open, leading to an oops.
Fixes: 205032724226 ("ptp: Add support for the AMZNC10C 'vmclock' device") Cc: stable@vger.kernel.org Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 11 Feb 2025 03:24:05 +0000 (19:24 -0800)]
Merge tag 'linux-can-fixes-for-6.14-20250208' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2025-02-08
The first patch is by Reyders Morales and fixes a code example in the
CAN ISO15765-2 documentation.
The next patch is contributed by Alexander Hölzl and fixes sending of
J1939 messages with zero data length.
Fedor Pchelkin's patch for the ctucanfd driver adds a missing handling
for an skb allocation error.
Krzysztof Kozlowski contributes a patch for the c_can driver to fix
unbalanced runtime PM disable in error path.
The next patch is by Vincent Mailhol and fixes a NULL pointer
dereference on udev->serial in the etas_es58x driver.
The last patch is by Robin van der Gracht and fixes the handling for an skb
allocation error.
* tag 'linux-can-fixes-for-6.14-20250208' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: rockchip: rkcanfd_handle_rx_fifo_overflow_int(): bail out if skb cannot be allocated
can: etas_es58x: fix potential NULL pointer dereference on udev->serial
can: c_can: fix unbalanced runtime PM disable in error path
can: ctucanfd: handle skb allocation failure
can: j1939: j1939_sk_send_loop(): fix unable to send messages with data length zero
Documentation/networking: fix basic node example document ISO 15765-2
====================
Jakub Kicinski [Tue, 11 Feb 2025 02:13:06 +0000 (18:13 -0800)]
Merge tag 'wireless-2025-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Kalle Valo says:
====================
wireless fixes for v6.14-rc3
We have only one fix for ath12k and one fix for brcmfmac. Also this
will be my last pull request as I'm stepping down as wireless driver
maintainer.
* tag 'wireless-2025-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
MAINTAINERS: wifi: remove Kalle
MAINTAINERS: wifi: ath: remove Kalle
wifi: brcmfmac: use random seed flag for BCM4355 and BCM4364 firmware
wifi: ath12k: fix handling of 6 GHz rules
====================
Eric Dumazet [Fri, 7 Feb 2025 13:58:39 +0000 (13:58 +0000)]
ndisc: extend RCU protection in ndisc_send_skb()
ndisc_send_skb() can be called without RTNL or RCU held.
Acquire rcu_read_lock() earlier, so that we can use dev_net_rcu()
and avoid a potential UAF.
Fixes: 1762f7e88eb3 ("[NETNS][IPV6] ndisc - make socket control per namespace") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250207135841.1948589-8-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Fri, 7 Feb 2025 13:58:36 +0000 (13:58 +0000)]
arp: use RCU protection in arp_xmit()
arp_xmit() can be called without RTNL or RCU protection.
Use RCU protection to avoid potential UAF.
Fixes: 29a26a568038 ("netfilter: Pass struct net into the netfilter hooks") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250207135841.1948589-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Fri, 7 Feb 2025 13:58:33 +0000 (13:58 +0000)]
ndisc: ndisc_send_redirect() must use dev_get_by_index_rcu()
ndisc_send_redirect() is called under RCU protection, not RTNL.
It must use dev_get_by_index_rcu() instead of __dev_get_by_index()
Fixes: 2f17becfbea5 ("vrf: check the original netdevice for generating redirect") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Suryaputra <ssuryaextr@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250207135841.1948589-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Furong Xu [Fri, 7 Feb 2025 08:56:39 +0000 (16:56 +0800)]
net: stmmac: Apply new page pool parameters when SPH is enabled
Commit df542f669307 ("net: stmmac: Switch to zero-copy in
non-XDP RX path") makes DMA write received frame into buffer at offset
of NET_SKB_PAD and sets page pool parameters to sync from offset of
NET_SKB_PAD. But when Header Payload Split is enabled, the header is
written at offset of NET_SKB_PAD, while the payload is written at
offset of zero. An incorrect offset parameter for the payload breaks DMA
coherence [1] since both CPU and DMA touch the page buffer from offset
of zero which is not handled by the page pool sync parameter.
And in case the DMA cannot split the received frame, for example,
a large L2 frame, pp_params.max_len should grow to match the tail
of entire frame.
Linus Torvalds [Mon, 10 Feb 2025 21:11:24 +0000 (13:11 -0800)]
Merge tag 'nfsd-6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
"Fixes for new bugs:
- A fix for CB_GETATTR reply decoding was not quite correct
- Fix the NFSD connection limiting logic
- Fix a bug in the new session table resizing logic
Bugs that pre-date v6.14:
- Support for courteous clients (5.19) introduced a shutdown hang
- Fix a crash in the filecache laundrette (6.9)
- Fix a zero-day crash in NFSD's NFSv3 ACL implementation"
* tag 'nfsd-6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
NFSD: Fix CB_GETATTR status fix
NFSD: fix hang in nfsd4_shutdown_callback
nfsd: fix __fh_verify for localio
nfsd: fix uninitialised slot info when a request is retried
nfsd: validate the nfsd_serv pointer before calling svc_wake_up
nfsd: clear acl_access/acl_default after releasing them
Chuck Lever [Mon, 10 Feb 2025 16:43:31 +0000 (11:43 -0500)]
NFSD: Fix CB_GETATTR status fix
Jeff says:
Now that I look, 1b3e26a5ccbf is wrong. The patch on the ml was correct, but
the one that got committed is different. It should be:
status = decode_cb_op_status(xdr, OP_CB_GETATTR, &cb->cb_status);
if (unlikely(status || cb->cb_status))
If "status" is non-zero, decoding failed (usu. BADXDR), but we also want to
bail out and not decode the rest of the call if the decoded cb_status is
non-zero. That's not happening here, cb_seq_status has already been checked and
is non-zero, so this ends up trying to decode the rest of the CB_GETATTR reply
when it doesn't exist.
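The effect of the corrected condition can be modelled outside the kernel. In this sketch fake_decode_op_status() is a hypothetical stand-in for decode_cb_op_status(), and struct fake_cb carries only the fields needed to show the control flow.

```c
#include <assert.h>

struct fake_cb {
	int cb_status;     /* status reported by the client for CB_GETATTR */
	int attrs_decoded; /* set once the rest of the reply was decoded */
};

/* Stand-in decoder: returns a decode error, or stores the op status. */
static int fake_decode_op_status(int decode_err, int op_status, int *out)
{
	if (decode_err)
		return decode_err;
	*out = op_status;
	return 0;
}

static int fake_dec_cb_getattr(struct fake_cb *cb, int decode_err,
			       int op_status)
{
	int status = fake_decode_op_status(decode_err, op_status,
					   &cb->cb_status);

	/* Bail out on a decode failure OR a non-zero operation status. */
	if (status || cb->cb_status)
		return status;

	/* Only a successful operation carries attributes to decode. */
	cb->attrs_decoded = 1;
	return 0;
}
```

When the operation status is non-zero the function returns without touching the attributes, so it never reads past the end of a reply that carries no attribute data.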
Reported-by: Jeff Layton <jlayton@kernel.org> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219737 Fixes: 1b3e26a5ccbf ("NFSD: fix decoding in nfs4_xdr_dec_cb_getattr") Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Dai Ngo [Thu, 30 Jan 2025 19:01:27 +0000 (11:01 -0800)]
NFSD: fix hang in nfsd4_shutdown_callback
If nfs4_client is in courtesy state then there is no point in sending
the callback. This causes nfsd4_shutdown_callback to hang since
cl_cb_inflight is not 0. This hang lasts about 15 minutes until TCP
notifies NFSD that the connection was dropped.
This patch modifies nfsd4_run_cb_work to skip the RPC call if
nfs4_client is in courtesy state.
Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Fixes: 66af25799940 ("NFSD: add courteous server support for thread with only delegation") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Olga Kornievskaia [Tue, 28 Jan 2025 16:58:06 +0000 (11:58 -0500)]
nfsd: fix __fh_verify for localio
__fh_verify() added a call to svc_xprt_set_valid() to help do connection
management, but in the LOCALIO path the rqstp argument is NULL, leading
to a NULL pointer dereference and a crash.
Fixes: eccbbc7c00a5 ("nfsd: don't use sv_nrthreads in connection limiting calculations.") Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
NeilBrown [Mon, 27 Jan 2025 23:05:03 +0000 (10:05 +1100)]
nfsd: fix uninitialised slot info when a request is retried
A recent patch moved the assignment of seq->maxslots from before the
test for a resent request (which ends with a goto) to after, resulting
in it not being run in that case. This results in the server returning
bogus "high slot id" and "target high slot id" values.
The assignments to ->maxslots and ->target_maxslots need to be *after*
the out: label so that the correct values are returned in replies to
requests that are served from cache.
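The placement issue is the usual goto pitfall: code skipped by the jump must live after the label if it has to run on every path. A minimal sketch with hypothetical names; struct fake_seq and fake_sequence() are not the nfsd structures, and the real assignments are more involved than a plain copy.

```c
#include <assert.h>
#include <stdbool.h>

struct fake_seq {
	unsigned int maxslots;
	unsigned int target_maxslots;
	bool from_cache;
};

static void fake_sequence(struct fake_seq *seq, bool is_retry,
			  unsigned int session_slots)
{
	if (is_retry) {
		/* Resent request: the reply is served from the cache. */
		seq->from_cache = true;
		goto out;
	}
	seq->from_cache = false;
	/* ... normal processing of a new request would go here ... */
out:
	/* After the label, so cached replies get valid values too. */
	seq->maxslots = session_slots;
	seq->target_maxslots = session_slots;
}
```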
Fixes: 60aa6564317d ("nfsd: allocate new session-based DRC slots on demand.") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Linus Torvalds [Mon, 10 Feb 2025 17:50:01 +0000 (09:50 -0800)]
Merge tag 'hid-for-linus-2025021001' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- build/dependency fixes for hid-lenovo and hid-intel-thc (Arnd
Bergmann)
- functional fixes for hid-corsair-void (Stuart Hayhurst)
- workqueue handling and ordering fix for hid-steam (Vicki Pfau)
- Gamepad mode vs. Lizard mode fix for hid-steam (Vicki Pfau)
- OOB read fix for hid-thrustmaster (Tulio Fernandes)
- fix for very long timeout on certain firmware in intel-ish-hid (Zhang
Lixu)
- other assorted small code fixes and device ID additions
* tag 'hid-for-linus-2025021001' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: hid-steam: Don't use cancel_delayed_work_sync in IRQ context
HID: hid-steam: Move hidraw input (un)registering to work
HID: hid-thrustmaster: fix stack-out-of-bounds read in usb_check_int_endpoints()
HID: apple: fix up the F6 key on the Omoton KB066 keyboard
HID: hid-apple: Apple Magic Keyboard a3203 USB-C support
samples/hid: fix broken vmlinux path for VMLINUX_BTF
samples/hid: remove unnecessary -I flags from libbpf EXTRA_CFLAGS
HID: topre: Fix n-key rollover on Realforce R3S TKL boards
HID: intel-ish-hid: ipc: Add Panther Lake PCI device IDs
HID: multitouch: Add NULL check in mt_input_configured
HID: winwing: Add NULL check in winwing_init_led()
HID: hid-steam: Fix issues with disabling both gamepad mode and lizard mode
HID: ignore non-functional sensor in HP 5MP Camera
HID: intel-thc: fix CONFIG_HID dependency
HID: lenovo: select CONFIG_ACPI_PLATFORM_PROFILE
HID: intel-ish-hid: Send clock sync message immediately after reset
HID: intel-ish-hid: fix the length of MNG_SYNC_FW_CLOCK in doorbell
HID: corsair-void: Initialise memory for psy_cfg
HID: corsair-void: Add missing delayed work cancel for headset status
Linus Torvalds [Mon, 10 Feb 2025 17:40:45 +0000 (09:40 -0800)]
Merge tag 'pinctrl-v6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
- A series of IRQ and behaviour stabilization fixes for the CY8C95x0
pin control expander
- A print format fix for the generic debugfs output
* tag 'pinctrl-v6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: pinconf-generic: Print unsigned value if a format is registered
pinctrl: cy8c95x0: Respect IRQ trigger settings from firmware
pinctrl: cy8c95x0: Rename PWMSEL to SELPWM
pinctrl: cy8c95x0: Enable regmap locking for debug
pinctrl: cy8c95x0: Avoid accessing reserved registers
pinctrl: cy8c95x0: Fix off-by-one in the regmap range settings
platform/x86: thinkpad_acpi: Fix invalid fan speed on ThinkPad X120e
On ThinkPad X120e, fan speed is reported in ticks per revolution
rather than RPM.
Recalculate the fan speed value reported for ThinkPad X120e
to RPM based on a 22.5 kHz clock.
Based on the information on
https://www.thinkwiki.org/wiki/How_to_control_fan_speed,
the same problem is highly likely to be relevant to at least Edge11,
but Edge11 is not addressed in this patch.
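The conversion follows from the clock rate: a 22.5 kHz clock produces 22500 * 60 ticks per minute, and dividing that by the ticks counted per revolution gives revolutions per minute. A sketch of the arithmetic; the macro and function names are hypothetical, and the zero guard is an assumption added to keep the example safe.

```c
#include <assert.h>

#define FAN_CLOCK_HZ         22500u               /* 22.5 kHz clock */
#define FAN_TICKS_PER_MINUTE (FAN_CLOCK_HZ * 60u) /* 1350000 ticks */

/* Convert a reading in ticks per revolution to RPM. */
static unsigned int fan_ticks_to_rpm(unsigned int ticks_per_rev)
{
	/* Treat a zero reading as a stopped fan to avoid dividing by zero. */
	if (ticks_per_rev == 0)
		return 0;
	return FAN_TICKS_PER_MINUTE / ticks_per_rev;
}
```

For example, a raw reading of 450 ticks per revolution corresponds to 1350000 / 450 = 3000 RPM.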
Linus Torvalds [Sun, 9 Feb 2025 18:05:32 +0000 (10:05 -0800)]
Merge tag 'kbuild-fixes-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Suppress false-positive -Wformat-{overflow,truncation}-non-kprintf
warnings regardless of the W= option
- Avoid CONFIG_TRIM_UNUSED_KSYMS dropping symbols passed to symbol_get()
- Fix a build regression of the Debian linux-headers package
* tag 'kbuild-fixes-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: install-extmod-build: add missing quotation marks for CC variable
kbuild: fix misspelling in scripts/Makefile.lib
kbuild: keep symbols for symbol_get() even with CONFIG_TRIM_UNUSED_KSYMS
scripts/Makefile.extrawarn: Do not show clang's non-kprintf warnings at W=1
Linus Torvalds [Sun, 9 Feb 2025 17:47:06 +0000 (09:47 -0800)]
Merge tag 'pm-6.14-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"Fix a recently introduced kernel crash due to a NULL pointer
dereference during system-wide suspend (Rafael Wysocki)"
* tag 'pm-6.14-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: sleep: core: Restrict power.set_active propagation