Yang Erkun [Mon, 21 Oct 2024 08:25:40 +0000 (16:25 +0800)]
nfsd: cancel nfsd_shrinker_work using sync mode in nfs4_state_shutdown_net
In the normal case, when we excute `echo 0 > /proc/fs/nfsd/threads`, the
function `nfs4_state_destroy_net` in `nfs4_state_shutdown_net` will
release all resources related to the hashed `nfs4_client`. If the
`nfsd_client_shrinker` is running concurrently, the `expire_client`
function will first unhash this client and then destroy it. This can
lead to the following warning. Additionally, numerous use-after-free
errors may occur as well.
====================================================================
BUG nfsd_file_mark (Tainted: G B W ): Objects remaining
nfsd_file_mark on __kmem_cache_shutdown()
--------------------------------------------------------------------
To resolve this issue, cancel `nfsd_shrinker_work` using synchronous
mode in nfs4_state_shutdown_net.
Fixes: 7c24fa225081 ("NFSD: replace delayed_work with work_struct for nfsd_client_shrinker") Signed-off-by: Yang Erkun <yangerkun@huaweicloud.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Bibo Mao [Mon, 21 Oct 2024 14:11:19 +0000 (22:11 +0800)]
LoongArch: Set initial pte entry with PAGE_GLOBAL for kernel space
There are two pages in one TLB entry on LoongArch system. For kernel
space, it requires both two pte entries (buddies) with PAGE_GLOBAL bit
set, otherwise HW treats it as non-global tlb, there will be potential
problems if tlb entry for kernel space is not global. Such as fail to
flush kernel tlb with the function local_flush_tlb_kernel_range() which
supposed only flush tlb with global bit.
Kernel address space areas include percpu, vmalloc, vmemmap, fixmap and
kasan areas. For these areas both two consecutive page table entries
should be enabled with PAGE_GLOBAL bit. So with function set_pte() and
pte_clear(), pte buddy entry is checked and set besides its own pte
entry. However it is not atomic operation to set both two pte entries,
there is problem with test_vmalloc test case.
So function kernel_pte_init() is added to init a pte table when it is
created for kernel address space, and the default initial pte value is
PAGE_GLOBAL rather than zero at beginning. Then only its own pte entry
need update with function set_pte() and pte_clear(), nothing to do with
the pte buddy entry.
Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Thomas Weißschuh [Mon, 21 Oct 2024 14:11:19 +0000 (22:11 +0800)]
LoongArch: Don't crash in stack_top() for tasks without vDSO
Not all tasks have a vDSO mapped, for example kthreads never do. If such
a task ever ends up calling stack_top(), it will derefence the NULL vdso
pointer and crash.
Huacai Chen [Mon, 21 Oct 2024 14:11:19 +0000 (22:11 +0800)]
LoongArch: Set correct size for vDSO code mapping
The current size of vDSO code mapping is hardcoded to PAGE_SIZE. This
cannot work for 4KB page size after commit 18efd0b10e0fd77 ("LoongArch:
vDSO: Wire up getrandom() vDSO implementation") because the code size
increases to 8KB. Thus set the code mapping size to its real size, i.e.
PAGE_ALIGN(vdso_end - vdso_start).
Fixes: 18efd0b10e0fd77 ("LoongArch: vDSO: Wire up getrandom() vDSO implementation") Reviewed-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Huacai Chen [Mon, 21 Oct 2024 14:11:19 +0000 (22:11 +0800)]
LoongArch: Enable IRQ if do_ale() triggered in irq-enabled context
Unaligned access exception can be triggered in irq-enabled context such
as user mode, in this case do_ale() may call get_user() which may cause
sleep. Then we will get:
Huacai Chen [Mon, 21 Oct 2024 14:11:18 +0000 (22:11 +0800)]
LoongArch: Get correct cores_per_package for SMT systems
In loongson_sysconf, The "core" of cores_per_node and cores_per_package
stands for a logical core, which means in a SMT system it stands for a
thread indeed. This information is gotten from SMBIOS Type4 Structure,
so in order to get a correct cores_per_package for both SMT and non-SMT
systems in parse_cpu_table() we should use SMBIOS_THREAD_PACKAGE_OFFSET
instead of SMBIOS_CORE_PACKAGE_OFFSET.
Cc: stable@vger.kernel.org Reported-by: Chao Li <lichao@loongson.cn> Tested-by: Chao Li <lichao@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Yanteng Si [Mon, 21 Oct 2024 14:11:18 +0000 (22:11 +0800)]
LoongArch: Use "Exception return address" to comment ERA
The information contained in the comment for LOONGARCH_CSR_ERA is even
less informative than the macro itself, which can cause confusion for
junior developers. Let's use the full English term.
Signed-off-by: Yanteng Si <siyanteng@cqsoftware.com.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Marek Maslanka [Sat, 12 Oct 2024 18:26:55 +0000 (18:26 +0000)]
platform/x86:intel/pmc: Revert "Enable the ACPI PM Timer to be turned off when suspended"
Commit e86c8186d03a ("platform/x86:intel/pmc: Enable the ACPI PM Timer to
be turned off when suspended") can cause the suspend process to hang as
the pmcdev->lock in the pmc_core_acpi_pm_timer_suspend_resume might already
be held by the pmc_core_mphy_pg_show or pmc_core_pll_show if the userspace
gets frozen when these functions are being executed.
Also, pmc_core_acpi_pm_timer_suspend_resume must not sleep, as this
function is called indirectly by the tick_freeze function in
kernel/time/tick-common.c, which holds the spinlock.
Revert the changes for now to fix these issues.
Fixes: e86c8186d03a ("platform/x86:intel/pmc: Enable the ACPI PM Timer to be turned off when suspended") Reported-by: Luca Coelho <luca@coelho.fi> Closes: https://lore.kernel.org/lkml/40555604c3f4be43bf72e72d5409eaece4be9320.camel@coelho.fi/ Signed-off-by: Marek Maslanka <mmaslanka@google.com> Link: https://lore.kernel.org/r/20241012182656.2107178-1-mmaslanka@google.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Javier Carrasco [Sun, 13 Oct 2024 18:11:29 +0000 (20:11 +0200)]
drm/bridge: tc358767: fix missing of_node_put() in for_each_endpoint_of_node()
for_each_endpoint_of_node() requires a call to of_node_put() for every
early exit. A new error path was added to the loop without observing
this requirement.
Add the missing call to of_node_put() in the error path.
Abel Vesa [Fri, 18 Oct 2024 12:49:34 +0000 (15:49 +0300)]
drm/bridge: Fix assignment of the of_node of the parent to aux bridge
The assignment of the of_node to the aux bridge needs to mark the
of_node as reused as well, otherwise resource providers like pinctrl will
report a gpio as already requested by a different device when both pinconf
and gpios property are present.
Fix that by using the device_set_of_node_from_dev() helper instead.
In context of my work on composefs/bootc I've been testing the new support
for directly mounting files with erofs (ie: without a loopback device) and
it's working well. Thanks for adding this feature --- it's a huge quality
of life improvement for us.
I've observed a strange behaviour, though: when mounting a file as an
erofs, if you read() the filesystem context fd, you always get the
following error message reported: Can't lookup blockdev.
That's caused by the code in erofs_fc_get_tree() trying to call
get_tree_bdev() and recovering from the error in case it was ENOTBLK and
CONFIG_EROFS_FS_BACKED_BY_FILE. Unfortunately, get_tree_bdev() logs the
error directly on the fs_context, so you get the error message even on
successful mounts.
It looks something like this at the syscall level:
This is kernel 6.12.0-0.rc0.20240926git11a299a7933e.13.fc42.x86_64 from
Fedora Rawhide.
It's a pretty minor issue, but it sent me on a wild goose chase for an hour
or two, so probably it should get fixed before the final release.
Gao Xiang <hsiangkao@linux.alibaba.com>:
Fix this by providing a get_tree_bdev_flags() helper which can be used
to silence such warnings.
* patches from https://lore.kernel.org/r/20241009033151.2334888-1-hsiangkao@linux.alibaba.com:
erofs: use get_tree_bdev_flags() to avoid misleading messages
fs/super.c: introduce get_tree_bdev_flags()
Gao Xiang [Wed, 9 Oct 2024 03:31:50 +0000 (11:31 +0800)]
fs/super.c: introduce get_tree_bdev_flags()
As Allison reported [1], currently get_tree_bdev() will store
"Can't lookup blockdev" error message. Although it makes sense for
pure bdev-based fses, this message may mislead users who try to use
EROFS file-backed mounts since get_tree_nodev() is used as a fallback
then.
Add get_tree_bdev_flags() to specify extensible flags [2] and
GET_TREE_BDEV_QUIET_LOOKUP to silence "Can't lookup blockdev" message
since it's misleading to EROFS file-backed mounts now.
Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241009033151.2334888-1-hsiangkao@linux.alibaba.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
Shubham Panwar [Sun, 20 Oct 2024 09:50:46 +0000 (15:20 +0530)]
ACPI: button: Add DMI quirk for Samsung Galaxy Book2 to fix initial lid detection issue
Add a DMI quirk for Samsung Galaxy Book2 to fix an initial lid state
detection issue.
The _LID device incorrectly returns the lid status as "closed" during
boot, causing the system to enter a suspend loop right after booting.
The quirk ensures that the correct lid state is reported initially,
preventing the system from immediately suspending after startup. It
only addresses the initial lid state detection and ensures proper
system behavior upon boot.
Christian Heusel [Thu, 17 Oct 2024 11:16:26 +0000 (13:16 +0200)]
ACPI: resource: Add LG 16T90SP to irq1_level_low_skip_override[]
The LG Gram Pro 16 2-in-1 (2024) the 16T90SP has its keybopard IRQ (1)
described as ActiveLow in the DSDT, which the kernel overrides to EdgeHigh
which breaks the keyboard.
Add the 16T90SP to the irq1_level_low_skip_override[] quirk table to fix
this.
Reported-by: Dirk Holten <dirk.holten@gmx.de> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219382 Cc: All applicable <stable@vger.kernel.org> Suggested-by: Dirk Holten <dirk.holten@gmx.de> Signed-off-by: Christian Heusel <christian@heusel.eu> Link: https://patch.msgid.link/20241017-lg-gram-pro-keyboard-v2-1-7c8fbf6ff718@heusel.eu Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Koba Ko [Sat, 12 Oct 2024 20:50:10 +0000 (04:50 +0800)]
ACPI: PRM: Find EFI_MEMORY_RUNTIME block for PRM handler and context
PRMT needs to find the correct type of block to translate the PA-VA
mapping for EFI runtime services.
The issue arises because the PRMT is finding a block of type
EFI_CONVENTIONAL_MEMORY, which is not appropriate for runtime services
as described in Section 2.2.2 (Runtime Services) of the UEFI
Specification [1]. Since the PRM handler is a type of runtime service,
this causes an exception when the PRM handler is called.
[Firmware Bug]: Unable to handle paging request in EFI runtime service
WARNING: CPU: 22 PID: 4330 at drivers/firmware/efi/runtime-wrappers.c:341
__efi_queue_work+0x11c/0x170
Call trace:
Let PRMT find a block with EFI_MEMORY_RUNTIME for PRM handler and PRM
context.
If no suitable block is found, a warning message will be printed, but
the procedure continues to manage the next PRM handler.
However, if the PRM handler is actually called without proper allocation,
it would result in a failure during error handling.
By using the correct memory types for runtime services, ensure that the
PRM handler and the context are properly mapped in the virtual address
space during runtime, preventing the paging request error.
The issue is really that only memory that has been remapped for runtime
by the firmware can be used by the PRM handler, and so the region needs
to have the EFI_MEMORY_RUNTIME attribute.
Link: https://uefi.org/sites/default/files/resources/UEFI_Spec_2_10_Aug29.pdf Fixes: cefc7ca46235 ("ACPI: PRM: implement OperationRegion handler for the PlatformRtMechanism subtype") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Koba Ko <kobak@nvidia.com> Reviewed-by: Matthew R. Ochs <mochs@nvidia.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://patch.msgid.link/20241012205010.4165798-1-kobak@nvidia.com
[ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Yuan Can [Fri, 18 Oct 2024 02:12:05 +0000 (10:12 +0800)]
powercap: dtpm_devfreq: Fix error check against dev_pm_qos_add_request()
The caller of the function dev_pm_qos_add_request() checks again a non
zero value but dev_pm_qos_add_request() can return '1' if the request
already exists. Therefore, the setup function fails while the QoS
request actually did not failed.
Fix that by changing the check against a negative value like all the
other callers of the function.
Fixes: e44655617317 ("powercap/drivers/dtpm: Add dtpm devfreq with energy model support") Signed-off-by: Yuan Can <yuancan@huawei.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/20241018021205.46460-1-yuancan@huawei.com
[ rjw: Subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Christian Loehle [Thu, 17 Oct 2024 22:00:33 +0000 (23:00 +0100)]
cpufreq: docs: Reflect latency changes in docs
There were two changes related to transition latency recently.
Namely commit e13aa799c2a6 ("cpufreq: Change default transition delay
to 2ms") and
commit 37c6dccd6837 ("cpufreq: Remove LATENCY_MULTIPLIER").
Both changed the defaults / maximums so let the documentation
reflect that.
so to fix, it is enough to make bitmap an array of u64.
Fixes: 941168f8b40e5 ("virtio_net: support device stats") Reported-by: "Colin King (gmail)" <colin.i.king@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://patch.msgid.link/53e2bd6728136d5916e384a7840e5dc7eebff832.1729099611.git.mst@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Eric Dumazet [Tue, 15 Oct 2024 19:41:18 +0000 (19:41 +0000)]
net: fix races in netdev_tx_sent_queue()/dev_watchdog()
Some workloads hit the infamous dev_watchdog() message:
"NETDEV WATCHDOG: eth0 (xxxx): transmit queue XX timed out"
It seems possible to hit this even for perfectly normal
BQL enabled drivers:
1) Assume a TX queue was idle for more than dev->watchdog_timeo
(5 seconds unless changed by the driver)
2) Assume a big packet is sent, exceeding current BQL limit.
3) Driver ndo_start_xmit() puts the packet in TX ring,
and netdev_tx_sent_queue() is called.
4) QUEUE_STATE_STACK_XOFF could be set from netdev_tx_sent_queue()
before txq->trans_start has been written.
5) txq->trans_start is written later, from netdev_start_xmit()
if (rc == NETDEV_TX_OK)
txq_trans_update(txq)
dev_watchdog() running on another cpu could read the old
txq->trans_start, and then see QUEUE_STATE_STACK_XOFF, because 5)
did not happen yet.
To solve the issue, write txq->trans_start right before one XOFF bit
is set :
- _QUEUE_STATE_DRV_XOFF from netif_tx_stop_queue()
- __QUEUE_STATE_STACK_XOFF from netdev_tx_sent_queue()
From dev_watchdog(), we have to read txq->state before txq->trans_start.
Add memory barriers to enforce correct ordering.
In the future, we could avoid writing over txq->trans_start for normal
operations, and rename this field to txq->xoff_start_time.
Fixes: bec251bc8b6a ("net: no longer stop all TX queues in dev_watchdog()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20241015194118.3951657-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Lin Ma [Tue, 15 Oct 2024 13:16:21 +0000 (21:16 +0800)]
net: wwan: fix global oob in wwan_rtnl_policy
The variable wwan_rtnl_link_ops assign a *bigger* maxtype which leads to
a global out-of-bounds read when parsing the netlink attributes. Exactly
same bug cause as the oob fixed in commit b33fb5b801c6 ("net: qualcomm:
rmnet: fix global oob in rmnet_policy").
==================================================================
BUG: KASAN: global-out-of-bounds in validate_nla lib/nlattr.c:388 [inline]
BUG: KASAN: global-out-of-bounds in __nla_validate_parse+0x19d7/0x29a0 lib/nlattr.c:603
Read of size 1 at addr ffffffff8b09cb60 by task syz.1.66276/323862
====================
fsl/fman: Fix refcount handling of fman-related devices
The series is intended to fix refcount handling for fman-related "struct
device" objects - the devices are not released upon driver removal or in
the error paths during probe. This leads to device reference leaks.
The device pointers are now saved to struct mac_device and properly handled
in the driver's probe and removal functions.
Originally reported by Simon Horman <horms@kernel.org>
(https://lore.kernel.org/all/20240702133651.GK598357@kernel.org/)
Aleksandr Mishin [Tue, 15 Oct 2024 06:01:22 +0000 (09:01 +0300)]
fsl/fman: Fix refcount handling of fman-related devices
In mac_probe() there are multiple calls to of_find_device_by_node(),
fman_bind() and fman_port_bind() which takes references to of_dev->dev.
Not all references taken by these calls are released later on error path
in mac_probe() and in mac_remove() which lead to reference leaks.
Add references release.
Fixes: 3933961682a3 ("fsl/fman: Add FMan MAC driver") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Aleksandr Mishin [Tue, 15 Oct 2024 06:01:21 +0000 (09:01 +0300)]
fsl/fman: Save device references taken in mac_probe()
In mac_probe() there are calls to of_find_device_by_node() which takes
references to of_dev->dev. These references are not saved and not released
later on error path in mac_probe() and in mac_remove().
Add new fields into mac_device structure to save references taken for
future use in mac_probe() and mac_remove().
This is a preparation for further reference leaks fix.
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
fuse_writepages() might be called with no dirty pages after all writable
opens were closed. In this case __fuse_write_file_get() will return NULL
which will trigger the WARNING.
The exact conditions under which this is triggered is unclear and syzbot
didn't find a reproducer yet.
Peter Collingbourne [Fri, 18 Oct 2024 22:16:43 +0000 (15:16 -0700)]
bpf, arm64: Fix address emission with tag-based KASAN enabled
When BPF_TRAMP_F_CALL_ORIG is enabled, the address of a bpf_tramp_image
struct on the stack is passed during the size calculation pass and
an address on the heap is passed during code generation. This may
cause a heap buffer overflow if the heap address is tagged because
emit_a64_mov_i64() will emit longer code than it did during the size
calculation pass. The same problem could occur without tag-based
KASAN if one of the 16-bit words of the stack address happened to
be all-ones during the size calculation pass. Fix the problem by
assuming the worst case (4 instructions) when calculating the size
of the bpf_tramp_image address emission.
Eric Biggers [Sun, 20 Oct 2024 17:56:24 +0000 (10:56 -0700)]
ALSA: hda/tas2781: select CRC32 instead of CRC32_SARWATE
Fix the kconfig option for the tas2781 HDA driver to select CRC32 rather
than CRC32_SARWATE. CRC32_SARWATE is an option from the kconfig
'choice' that selects the specific CRC32 implementation. Selecting a
'choice' option seems to have no effect, but even if it did work, it
would be incorrect for a random driver to override the user's choice.
CRC32 is the correct option to select for crc32() to be available.
José Relvas [Sun, 20 Oct 2024 10:27:56 +0000 (11:27 +0100)]
ALSA: hda/realtek: Add subwoofer quirk for Acer Predator G9-593
The Acer Predator G9-593 has a 2+1 speaker system which isn't probed
correctly.
This patch adds a quirk with the proper pin connections.
Note that I do not own this laptop, so I cannot guarantee that this
fixes the issue.
Testing was done by other users here:
https://discussion.fedoraproject.org/t/-/118482
This model appears to have two different dev IDs...
- 0x1177 (as seen on the forum link above)
- 0x1178 (as seen on https://linux-hardware.org/?probe=127df9999f)
I don't think the audio system was changed between model revisions, so
the patch applies for both IDs.
Andrey Shumilin [Fri, 18 Oct 2024 06:00:18 +0000 (09:00 +0300)]
ALSA: firewire-lib: Avoid division by zero in apply_constraint_to_size()
The step variable is initialized to zero. It is changed in the loop,
but if it's not changed it will remain zero. Add a variable check
before the division.
The observed behavior was introduced by commit 826b5de90c0b
("ALSA: firewire-lib: fix insufficient PCM rule for period/buffer size"),
and it is difficult to show that any of the interval parameters will
satisfy the snd_interval_test() condition with data from the
amdtp_rate_table[] table.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Arnd Bergmann [Tue, 15 Oct 2024 15:21:48 +0000 (15:21 +0000)]
i915: fix DRM_I915_GVT_KVMGT dependencies
Depending on x86 and KVM is not enough, as the kvm helper functions
that get called here are controlled by CONFIG_KVM_X86, which is
disabled if both KVM_INTEL and KVM_AMD are turned off.
Qiao Ma [Tue, 15 Oct 2024 06:01:48 +0000 (14:01 +0800)]
uprobe: avoid out-of-bounds memory access of fetching args
Uprobe needs to fetch args into a percpu buffer, and then copy to ring
buffer to avoid non-atomic context problem.
Sometimes user-space strings, arrays can be very large, but the size of
percpu buffer is only page size. And store_trace_args() won't check
whether these data exceeds a single page or not, caused out-of-bounds
memory access.
It could be reproduced by following steps:
1. build kernel with CONFIG_KASAN enabled
2. save follow program as test.c
// If string length large than MAX_STRING_SIZE, the fetch_store_strlen()
// will return 0, cause __get_data_size() return shorter size, and
// store_trace_args() will not trigger out-of-bounds access.
// So make string length less than 4096.
\#define STRLEN 4093
void generate_string(char *str, int n)
{
int i;
for (i = 0; i < n; ++i)
{
char c = i % 26 + 'a';
str[i] = c;
}
str[n-1] = '\0';
}
Linus Torvalds [Sun, 20 Oct 2024 21:08:17 +0000 (14:08 -0700)]
Merge tag 'for-net-2024-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Pull bluetooth fixes from Luiz Augusto Von Dentz:
- ISO: Fix multiple init when debugfs is disabled
- Call iso_exit() on module unload
- Remove debugfs directory on module init failure
- btusb: Fix not being able to reconnect after suspend
- btusb: Fix regression with fake CSR controllers 0a12:0001
- bnep: fix wild-memory-access in proto_unregister
Note: normally the bluetooth fixes go through the networking tree, but
this missed the weekly merge, and two of the commits fix regressions
that have caused a fair amount of noise and have now hit stable too:
So I'm pulling it directly just to expedite things and not miss yet
another -rc release. This is not meant to become a new pattern.
* tag 'for-net-2024-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: btusb: Fix regression with fake CSR controllers 0a12:0001
Bluetooth: bnep: fix wild-memory-access in proto_unregister
Bluetooth: btusb: Fix not being able to reconnect after suspend
Bluetooth: Remove debugfs directory on module init failure
Bluetooth: Call iso_exit() on module unload
Bluetooth: ISO: Fix multiple init when debugfs is disabled
Linus Torvalds [Sun, 20 Oct 2024 20:55:46 +0000 (13:55 -0700)]
Merge tag 'pinctrl-v6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Mostly error path fixes, but one pretty serious interrupt problem in
the Ocelot driver as well:
- Fix two error paths and a missing semicolon in the Intel driver
- Add a missing ACPI ID for the Intel Panther Lake
- Check return value of devm_kasprintf() in the Apple and STM32
drivers
- Add a missing mutex_destroy() in the aw9523 driver
- Fix a double free in cv1800_pctrl_dt_node_to_map() in the Sophgo
driver
- Fix a double free in ma35_pinctrl_dt_node_to_map_func() in the
Nuvoton driver
- Fix a bug in the Ocelot interrupt handler making the system hang"
* tag 'pinctrl-v6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: ocelot: fix system hang on level based interrupts
pinctrl: nuvoton: fix a double free in ma35_pinctrl_dt_node_to_map_func()
pinctrl: sophgo: fix double free in cv1800_pctrl_dt_node_to_map()
pinctrl: intel: platform: Add Panther Lake to the list of supported
pinctrl: aw9523: add missing mutex_destroy
pinctrl: stm32: check devm_kasprintf() returned value
pinctrl: apple: check devm_kasprintf() returned value
pinctrl: intel: platform: use semicolon instead of comma in ncommunities assignment
pinctrl: intel: platform: fix error path in device_for_each_child_node()
Kent Overstreet [Sat, 19 Oct 2024 22:27:09 +0000 (18:27 -0400)]
bcachefs: Workaround for kvmalloc() not supporting > INT_MAX allocations
kvmalloc() doesn't support allocations > INT_MAX, but vmalloc() does -
the limit should be lifted, but we can work around this for now.
A user with a 75 TB filesystem reported the following journal replay
error:
https://github.com/koverstreet/bcachefs/issues/769
In journal replay we have to sort and dedup all the keys from the
journal, which means we need a large contiguous allocation. Given that
the user has 128GB of ram, the 2GB limit on allocation size has become
far too small.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Linus Torvalds [Sun, 20 Oct 2024 20:10:44 +0000 (13:10 -0700)]
Merge tag 'char-misc-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull misc driver fixes from Greg KH:
"Here are a number of small char/misc/iio driver fixes for 6.12-rc4:
- loads of small iio driver fixes for reported problems
- parport driver out-of-bounds fix
- Kconfig description and MAINTAINERS file updates
All of these, except for the Kconfig and MAINTAINERS file updates have
been in linux-next all week. Those other two are just documentation
changes and will have no runtime issues and were merged on Friday"
* tag 'char-misc-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (39 commits)
misc: rtsx: list supported models in Kconfig help
MAINTAINERS: Remove some entries due to various compliance requirements.
misc: microchip: pci1xxxx: add support for NVMEM_DEVID_AUTO for OTP device
misc: microchip: pci1xxxx: add support for NVMEM_DEVID_AUTO for EEPROM device
parport: Proper fix for array out-of-bounds access
iio: frequency: admv4420: fix missing select REMAP_SPI in Kconfig
iio: frequency: {admv4420,adrf6780}: format Kconfig entries
iio: adc: ad4695: Add missing Kconfig select
iio: adc: ti-ads8688: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
iio: hid-sensors: Fix an error handling path in _hid_sensor_set_report_latency()
iioc: dac: ltc2664: Fix span variable usage in ltc2664_channel_config()
iio: dac: stm32-dac-core: add missing select REGMAP_MMIO in Kconfig
iio: dac: ltc1660: add missing select REGMAP_SPI in Kconfig
iio: dac: ad5770r: add missing select REGMAP_SPI in Kconfig
iio: amplifiers: ada4250: add missing select REGMAP_SPI in Kconfig
iio: frequency: adf4377: add missing select REMAP_SPI in Kconfig
iio: resolver: ad2s1210: add missing select (TRIGGERED_)BUFFER in Kconfig
iio: resolver: ad2s1210 add missing select REGMAP in Kconfig
iio: proximity: mb1232: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
iio: pressure: bm1390: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
...
Linus Torvalds [Sun, 20 Oct 2024 19:57:53 +0000 (12:57 -0700)]
Merge tag 'usb-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB driver fixes from Greg KH:
"Here are some small USB driver fixes and new device ids for 6.12-rc4:
- xhci driver fixes for a number of reported issues
- new usb-serial driver ids
- dwc3 driver fixes for reported problems.
- usb gadget driver fixes for reported problems
- typec driver fixes
- MAINTAINER file updates
All of these have been in linux-next this week with no reported issues"
* tag 'usb-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: serial: option: add Telit FN920C04 MBIM compositions
USB: serial: option: add support for Quectel EG916Q-GL
xhci: dbc: honor usb transfer size boundaries.
usb: xhci: Fix handling errors mid TD followed by other errors
xhci: Mitigate failed set dequeue pointer commands
xhci: Fix incorrect stream context type macro
USB: gadget: dummy-hcd: Fix "task hung" problem
usb: gadget: f_uac2: fix return value for UAC2_ATTRIBUTE_STRING store
usb: dwc3: core: Fix system suspend on TI AM62 platforms
xhci: tegra: fix checked USB2 port number
usb: dwc3: Wait for EndXfer completion before restoring GUSB2PHYCFG
usb: typec: qcom-pmic-typec: fix sink status being overwritten with RP_DEF
usb: typec: altmode should keep reference to parent
MAINTAINERS: usb: raw-gadget: add bug tracker link
MAINTAINERS: Add an entry for the LJCA drivers
Linus Torvalds [Sun, 20 Oct 2024 19:04:32 +0000 (12:04 -0700)]
Merge tag 'x86_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Explicitly disable the TSC deadline timer when going idle to address
some CPU errata in that area
- Do not apply the Zenbleed fix on anything else except AMD Zen2 on the
late microcode loading path
- Clear CPU buffers later in the NMI exit path on 32-bit to avoid
register clearing while they still contain sensitive data, for the
RDFS mitigation
- Do not clobber EFLAGS.ZF with VERW on the opportunistic SYSRET exit
path on 32-bit
- Fix parsing issues of memory bandwidth specification in sysfs for
resctrl's memory bandwidth allocation feature
- Other small cleanups and improvements
* tag 'x86_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic: Always explicitly disarm TSC-deadline timer
x86/CPU/AMD: Only apply Zenbleed fix for Zen2 during late microcode load
x86/bugs: Use code segment selector for VERW operand
x86/entry_32: Clear CPU buffers after register restore in NMI return
x86/entry_32: Do not clobber user EFLAGS.ZF
x86/resctrl: Annotate get_mem_config() functions as __init
x86/resctrl: Avoid overflow in MB settings in bw_validate()
x86/amd_nb: Add new PCI ID for AMD family 1Ah model 20h
Linus Torvalds [Sun, 20 Oct 2024 18:30:56 +0000 (11:30 -0700)]
Merge tag 'sched_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduling fixes from Borislav Petkov:
- Add PREEMPT_RT maintainers
- Fix another aspect of delayed dequeued tasks wrt determining their
state, i.e., whether they're runnable or blocked
- Handle delayed dequeued tasks and their migration wrt PSI properly
- Fix the situation where a delayed dequeue task gets enqueued into a
new class, which should not happen
- Fix a case where memory allocation would happen while the runqueue
lock is held, which is a no-no
- Do not over-schedule when tasks with shorter slices preempt the
currently running task
- Make sure delayed to deque entities are properly handled before
unthrottling
- Other smaller cleanups and improvements
* tag 'sched_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
MAINTAINERS: Add an entry for PREEMPT_RT.
sched/fair: Fix external p->on_rq users
sched/psi: Fix mistaken CPU pressure indication after corrupted task state bug
sched/core: Dequeue PSI signals for blocked tasks that are delayed
sched: Fix delayed_dequeue vs switched_from_fair()
sched/core: Disable page allocation in task_tick_mm_cid()
sched/deadline: Use hrtick_enabled_dl() before start_hrtick_dl()
sched/eevdf: Fix wakeup-preempt by checking cfs_rq->nr_running
sched: Fix sched_delayed vs cfs_bandwidth
Paolo Bonzini [Sun, 20 Oct 2024 16:10:56 +0000 (12:10 -0400)]
Merge tag 'kvmarm-fixes-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.12, take #2
- Fix the guest view of the ID registers, making the relevant fields
writable from userspace (affecting ID_AA64DFR0_EL1 and ID_AA64PFR1_EL1)
- Correcly expose S1PIE to guests, fixing a regression introduced
in 6.12-rc1 with the S1POE support
- Fix the recycling of stage-2 shadow MMUs by tracking the context
(are we allowed to block or not) as well as the recycling state
- Address a couple of issues with the vgic when userspace misconfigures
the emulation, resulting in various splats. Headaches courtesy
of our Syzkaller friends
Cyan Yang [Thu, 19 Sep 2024 16:01:26 +0000 (00:01 +0800)]
RISCV: KVM: use raw_spinlock for critical section in imsic
For the external interrupt updating procedure in imsic, there was a
spinlock to protect it already. But since it should not be preempted in
any cases, we should turn to use raw_spinlock to prevent any preemption
in case PREEMPT_RT was enabled.
Signed-off-by: Cyan Yang <cyan.yang@sifive.com> Reviewed-by: Yong-Xuan Wang <yongxuan.wang@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org>
Message-ID: <20240919160126.44487-1-cyan.yang@sifive.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Thu, 3 Oct 2024 23:43:27 +0000 (16:43 -0700)]
KVM: selftests: Fix out-of-bounds reads in CPUID test's array lookups
When looking for a "mangled", i.e. dynamic, CPUID entry, terminate the
walk based on the number of array _entries_, not the size in bytes of
the array. Iterating based on the total size of the array can result in
false passes, e.g. if the random data beyond the array happens to match
a CPUID entry's function and index.
Fixes: fb18d053b7f8 ("selftest: kvm: x86: test KVM_GET_CPUID2 and guest visible CPUIDs against KVM_GET_SUPPORTED_CPUID") Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-ID: <20241003234337.273364-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM: selftests: x86: Avoid using SSE/AVX instructions
Some distros switched gcc to '-march=x86-64-v3' by default and while it's
hard to find a CPU which doesn't support it today, many KVM selftests fail
with
==== Test Assertion Failure ====
lib/x86_64/processor.c:570: Unhandled exception in guest
pid=72747 tid=72747 errno=4 - Interrupted system call
Unhandled exception '0x6' at guest RIP '0x4104f7'
The failure is easy to reproduce elsewhere with
$ make clean && CFLAGS='-march=x86-64-v3' make -j && ./x86_64/kvm_pv_test
The root cause of the problem seems to be that with '-march=x86-64-v3' GCC
uses AVX* instructions (VMOVQ in the example above) and without prior
XSETBV() in the guest this results in #UD. It is certainly possible to add
it there, e.g. the following saves the day as well:
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-ID: <20240920154422.2890096-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Hangbin Liu [Fri, 18 Oct 2024 00:53:01 +0000 (00:53 +0000)]
MAINTAINERS: add samples/pktgen to NETWORKING [GENERAL]
samples/pktgen is missing in the MAINTAINERS file.
Suggested-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Message-ID: <20241018005301.10052-1-liuhangbin@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Peter Rashleigh [Wed, 16 Oct 2024 04:08:22 +0000 (21:08 -0700)]
net: dsa: mv88e6xxx: Fix error when setting port policy on mv88e6393x
mv88e6393x_port_set_policy doesn't correctly shift the ptr value when
converting the policy format between the old and new styles, so the
target register ends up with the ptr being written over the data bits.
Shift the pointer to align with the format expected by
mv88e6393x_port_policy_write().
Fixes: 6584b26020fc ("net: dsa: mv88e6xxx: implement .port_set_policy for Amethyst") Signed-off-by: Peter Rashleigh <peter@rashleigh.ca> Reviewed-by: Simon Horman <horms@kernel.org>
Message-ID: <20241016040822.3917-1-peter@rashleigh.ca> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Sean Christopherson [Wed, 9 Oct 2024 14:08:38 +0000 (07:08 -0700)]
KVM: nSVM: Ignore nCR3[4:0] when loading PDPTEs from memory
Ignore nCR3[4:0] when loading PDPTEs from memory for nested SVM, as bits
4:0 of CR3 are ignored when PAE paging is used, and thus VMRUN doesn't
enforce 32-byte alignment of nCR3.
In the absolute worst case scenario, failure to ignore bits 4:0 can result
in an out-of-bounds read, e.g. if the target page is at the end of a
memslot, and the VMM isn't using guard pages.
Per the APM:
The CR3 register points to the base address of the page-directory-pointer
table. The page-directory-pointer table is aligned on a 32-byte boundary,
with the low 5 address bits 4:0 assumed to be 0.
And the SDM's much more explicit:
4:0 Ignored
Note, KVM gets this right when loading PDPTRs, it's only the nSVM flow
that is broken.
Fixes: e4e517b4be01 ("KVM: MMU: Do not unconditionally read PDPTE from guest memory") Reported-by: Kirk Swidowski <swidowski@google.com> Cc: Andy Nguyen <theflow@google.com> Cc: 3pvd <3pvd@google.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20241009140838.1036226-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Maxim Levitsky [Wed, 9 Oct 2024 17:50:00 +0000 (10:50 -0700)]
KVM: VMX: reset the segment cache after segment init in vmx_vcpu_reset()
Reset the segment cache after segment initialization in vmx_vcpu_reset()
to harden KVM against caching stale/uninitialized data. Without the
recent fix to bypass the cache in kvm_arch_vcpu_put(), the following
scenario is possible:
- vCPU is just created, and the vCPU thread is preempted before
SS.AR_BYTES is written in vmx_vcpu_reset().
- When scheduling out the vCPU task, kvm_arch_vcpu_in_kernel() =>
vmx_get_cpl() reads and caches '0' for SS.AR_BYTES.
- vmx_vcpu_reset() => seg_setup() configures SS.AR_BYTES, but doesn't
invoke vmx_segment_cache_clear() to invalidate the cache.
As a result, KVM retains a stale value in the cache, which can be read,
e.g. via KVM_GET_SREGS. Usually this is not a problem because the VMX
segment cache is reset on each VM-Exit, but if the userspace VMM (e.g KVM
selftests) reads and writes system registers just after the vCPU was
created, _without_ modifying SS.AR_BYTES, userspace will write back the
stale '0' value and ultimately will trigger a VM-Entry failure due to
incorrect SS segment type.
Invalidating the cache after writing the VMCS doesn't address the general
issue of cache accesses from IRQ context being unsafe, but it does prevent
KVM from clobbering the VMCS, i.e. mitigates the harm done _if_ KVM has a
bug that results in an unsafe cache access.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Fixes: 2fb92db1ec08 ("KVM: VMX: Cache vmcs segment fields")
[sean: rework changelog to account for previous patch] Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20241009175002.1118178-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Wed, 9 Oct 2024 19:23:45 +0000 (12:23 -0700)]
KVM: x86: Clean up documentation for KVM_X86_QUIRK_SLOT_ZAP_ALL
Massage the documentation for KVM_X86_QUIRK_SLOT_ZAP_ALL to call out that
it applies to moved memslots as well as deleted memslots, to avoid KVM's
"fast zap" terminology (which has no meaning for userspace), and to reword
the documented targeted zap behavior to specifically say that KVM _may_
zap a subset of all SPTEs. As evidenced by the fix to zap non-leafs SPTEs
with gPTEs, formally documenting KVM's exact internal behavior is risky
and unnecessary.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20241009192345.1148353-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Wed, 9 Oct 2024 19:23:44 +0000 (12:23 -0700)]
KVM: x86/mmu: Add lockdep assert to enforce safe usage of kvm_unmap_gfn_range()
Add a lockdep assertion in kvm_unmap_gfn_range() to ensure that either
mmu_invalidate_in_progress is elevated, or that the range is being zapped
due to memslot removal (loosely detected by slots_lock being held).
Zapping SPTEs without mmu_invalidate_{in_progress,seq} protection is unsafe
as KVM's page fault path snapshots state before acquiring mmu_lock, and
thus can create SPTEs with stale information if vCPUs aren't forced to
retry faults (due to seeing an in-progress or past MMU invalidation).
Memslot removal is a special case, as the memslot is retrieved outside of
mmu_invalidate_seq, i.e. doesn't use the "standard" protections, and
instead relies on SRCU synchronization to ensure any in-flight page faults
are fully resolved before zapping SPTEs.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20241009192345.1148353-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Wed, 9 Oct 2024 19:23:43 +0000 (12:23 -0700)]
KVM: x86/mmu: Zap only SPs that shadow gPTEs when deleting memslot
When performing a targeted zap on memslot removal, zap only MMU pages that
shadow guest PTEs, as zapping all SPs that "match" the gfn is inexact and
unnecessary. Furthermore, for_each_gfn_valid_sp() arguably shouldn't
exist, because it doesn't do what most people would it expect it to do.
The "round gfn for level" adjustment that is done for direct SPs (no gPTE)
means that the exact gfn comparison will not get a match, even when a SP
does "cover" a gfn, or was even created specifically for a gfn.
For memslot deletion specifically, KVM's behavior will vary significantly
based on the size and alignment of a memslot, and in weird ways. E.g. for
a 4KiB memslot, KVM will zap more SPs if the slot is 1GiB aligned than if
it's only 4KiB aligned. And as described below, zapping SPs in the
aligned case overzaps for direct MMUs, as odds are good the upper-level
SPs are serving other memslots.
To iterate over all potentially-relevant gfns, KVM would need to make a
pass over the hash table for each level, with the gfn used for lookup
rounded for said level. And then check that the SP is of the correct
level, too, e.g. to avoid over-zapping.
But even then, KVM would massively overzap, as processing every level is
all but guaranteed to zap SPs that serve other memslots, especially if the
memslot being removed is relatively small. KVM could mitigate that issue
by processing only levels that can be possible guest huge pages, i.e. are
less likely to be re-used for other memslot, but while somewhat logical,
that's quite arbitrary and would be a bit of a mess to implement.
So, zap only SPs with gPTEs, as the resulting behavior is easy to describe,
is predictable, and is explicitly minimal, i.e. KVM only zaps SPs that
absolutely must be zapped.
Cc: Yan Zhao <yan.y.zhao@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Message-ID: <20241009192345.1148353-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Kirill A. Shutemov [Tue, 15 Oct 2024 09:58:17 +0000 (12:58 +0300)]
x86/kvm: Override default caching mode for SEV-SNP and TDX
AMD SEV-SNP and Intel TDX have limited access to MTRR: either it is not
advertised in CPUID or it cannot be programmed (on TDX, due to #VE on
CR0.CD clear).
This results in guests using uncached mappings where it shouldn't and
pmd/pud_set_huge() failures due to non-uniform memory type reported by
mtrr_type_lookup().
Override MTRR state, making it WB by default as the kernel does for
Hyper-V guests.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Suggested-by: Binbin Wu <binbin.wu@intel.com> Cc: Juergen Gross <jgross@suse.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Juergen Gross <jgross@suse.com>
Message-ID: <20241015095818.357915-1-kirill.shutemov@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Dr. David Alan Gilbert [Tue, 1 Oct 2024 14:13:54 +0000 (15:13 +0100)]
KVM: Remove unused kvm_vcpu_gfn_to_pfn_atomic
The last use of kvm_vcpu_gfn_to_pfn_atomic was removed by commit 1bbc60d0c7e5 ("KVM: x86/mmu: Remove MMU auditing")
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Message-ID: <20241001141354.18009-3-linux@treblig.org>
[Adjust Documentation/virt/kvm/locking.rst. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Dr. David Alan Gilbert [Tue, 1 Oct 2024 14:13:53 +0000 (15:13 +0100)]
KVM: Remove unused kvm_vcpu_gfn_to_pfn
The last use of kvm_vcpu_gfn_to_pfn was removed by commit b1624f99aa8f ("KVM: Remove kvm_vcpu_gfn_to_page() and kvm_vcpu_gpa_to_page()")
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Message-ID: <20241001141354.18009-2-linux@treblig.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linus Torvalds [Sun, 20 Oct 2024 00:04:52 +0000 (17:04 -0700)]
Merge tag 'io_uring-6.12-20241019' of git://git.kernel.dk/linux
Pull one more io_uring fix from Jens Axboe:
"Fix for a regression introduced in 6.12-rc2, where a condition check
was negated and hence -EAGAIN would bubble back up up to userspace
rather than trigger a retry condition"
* tag 'io_uring-6.12-20241019' of git://git.kernel.dk/linux:
io_uring/rw: fix wrong NOWAIT check in io_rw_init_file()
Aleksandr Mishin [Thu, 17 Oct 2024 10:06:51 +0000 (13:06 +0300)]
octeon_ep: Add SKB allocation failures handling in __octep_oq_process_rx()
build_skb() returns NULL in case of a memory allocation failure so handle
it inside __octep_oq_process_rx() to avoid NULL pointer dereference.
__octep_oq_process_rx() is called during NAPI polling by the driver. If
skb allocation fails, keep on pulling packets out of the Rx DMA queue: we
shouldn't break the polling immediately and thus falsely indicate to the
octep_napi_poll() that the Rx pressure is going down. As there is no
associated skb in this case, don't process the packets and don't push them
up the network stack - they are skipped.
Helper function is implemented to unmmap/flush all the fragment buffers
used by the dropped packet. 'alloc_failures' counter is incremented to
mark the skb allocation error in driver statistics.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 37d79d059606 ("octeon_ep: add Tx/Rx processing and interrupt support") Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Aleksandr Mishin [Thu, 17 Oct 2024 10:06:50 +0000 (13:06 +0300)]
octeon_ep: Implement helper for iterating packets in Rx queue
The common code with some packet and index manipulations is extracted and
moved to newly implemented helper to make the code more readable and avoid
duplication. This is a preparation for skb allocation failure handling.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Suggested-by: Simon Horman <horms@kernel.org> Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Vadim Fedorenko [Wed, 16 Oct 2024 19:52:34 +0000 (12:52 -0700)]
bnxt_en: replace ptp_lock with irqsave variant
In netpoll configuration the completion processing can happen in hard
irq context which will break with spin_lock_bh() for fullfilling RX
timestamp in case of all packets timestamping. Replace it with
spin_lock_irqsave() variant.
Fixes: 7f5515d19cd7 ("bnxt_en: Get the RX packet timestamp") Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Message-ID: <20241016195234.2622004-1-vadfed@meta.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Michel Alex [Wed, 16 Oct 2024 12:11:15 +0000 (12:11 +0000)]
net: phy: dp83822: Fix reset pin definitions
This change fixes a rare issue where the PHY fails to detect a link
due to incorrect reset behavior.
The SW_RESET definition was incorrectly assigned to bit 14, which is the
Digital Restart bit according to the datasheet. This commit corrects
SW_RESET to bit 15 and assigns DIG_RESTART to bit 14 as per the
datasheet specifications.
The SW_RESET define is only used in the phy_reset function, which fully
re-initializes the PHY after the reset is performed. The change in the
bit definitions should not have any negative impact on the functionality
of the PHY.
v2:
- added Fixes tag
- improved commit message
Cc: stable@vger.kernel.org Fixes: 5dc39fd5ef35 ("net: phy: DP83822: Add ability to advertise Fiber connection") Signed-off-by: Alex Michel <alex.michel@wiedemann-group.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Message-ID: <AS1P250MB0608A798661549BF83C4B43EA9462@AS1P250MB0608.EURP250.PROD.OUTLOOK.COM> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Jakub Kicinski [Tue, 15 Oct 2024 15:30:05 +0000 (08:30 -0700)]
MAINTAINERS: add Simon as an official reviewer
Simon has been diligently and consistently reviewing networking
changes for at least as long as our development statistics
go back. Often if not usually topping the list of reviewers.
Make his role official.
Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org>
Message-ID: <20241015153005.2854018-1-kuba@kernel.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Jakub Boehm [Tue, 15 Oct 2024 15:16:04 +0000 (17:16 +0200)]
net: plip: fix break; causing plip to never transmit
Since commit 71ae2cb30531 ("net: plip: Fix fall-through warnings for Clang")
plip was not able to send any packets, this patch replaces one
unintended break; with fallthrough; which was originally missed by
commit 9525d69a3667 ("net: plip: mark expected switch fall-throughs").
I have verified with a real hardware PLIP connection that everything
works once again after applying this patch.
Fixes: 71ae2cb30531 ("net: plip: Fix fall-through warnings for Clang") Signed-off-by: Jakub Boehm <boehm.jakub@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org>
Message-ID: <20241015-net-plip-tx-fix-v1-1-32d8be1c7e0b@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Wang Hai [Tue, 15 Oct 2024 14:48:02 +0000 (22:48 +0800)]
be2net: fix potential memory leak in be_xmit()
The be_xmit() returns NETDEV_TX_OK without freeing skb
in case of be_xmit_enqueue() fails, add dev_kfree_skb_any() to fix it.
Fixes: 760c295e0e8d ("be2net: Support for OS2BMC.") Signed-off-by: Wang Hai <wanghai38@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Message-ID: <20241015144802.12150-1-wanghai38@huawei.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Wang Hai [Tue, 15 Oct 2024 14:41:48 +0000 (22:41 +0800)]
net/sun3_82586: fix potential memory leak in sun3_82586_send_packet()
The sun3_82586_send_packet() returns NETDEV_TX_OK without freeing skb
in case of skb->len being too long, add dev_kfree_skb() to fix it.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Wang Hai <wanghai38@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org>
Message-ID: <20241015144148.7918-1-wanghai38@huawei.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Kory Maincent [Tue, 15 Oct 2024 13:02:54 +0000 (15:02 +0200)]
net: pse-pd: Fix out of bound for loop
Adjust the loop limit to prevent out-of-bounds access when iterating over
PI structures. The loop should not reach the index pcdev->nr_lines since
we allocate exactly pcdev->nr_lines number of PI structures. This fix
ensures proper bounds are maintained during iterations.
Fixes: 9be9567a7c59 ("net: pse-pd: Add support for PSE PIs") Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Message-ID: <20241015130255.125508-1-kory.maincent@bootlin.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Linus Torvalds [Sat, 19 Oct 2024 19:52:19 +0000 (12:52 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Fixes all in drivers. The largest is the mpi3mr which corrects a phy
count limit that should only apply to the controller but was being
incorrectly applied to expander phys"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: target: core: Fix null-ptr-deref in target_alloc_device()
scsi: mpi3mr: Validate SAS port assignments
scsi: ufs: core: Set SDEV_OFFLINE when UFS is shut down
scsi: ufs: core: Requeue aborted request
scsi: ufs: core: Fix the issue of ICU failure
Linus Torvalds [Sat, 19 Oct 2024 19:42:14 +0000 (12:42 -0700)]
Merge tag 'ftrace-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ftrace fixes from Steven Rostedt:
"A couple of fixes to function graph infrastructure:
- Fix allocation of idle shadow stack allocation during hotplug
If function graph tracing is started when a CPU is offline, if it
were come online during the trace then the idle task that
represents the CPU will not get a shadow stack allocated for it.
This means all function graph hooks that happen while that idle
task is running (including in interrupt mode) will have all its
events dropped.
Switch over to the CPU hotplug mechanism that will have any newly
brought on line CPU get a callback that can allocate the shadow
stack for its idle task.
- Fix allocation size of the ret_stack_list array
When function graph tracing converted over to allowing more than
one user at a time, it had to convert its shadow stack from an
array of ret_stack structures to an array of unsigned longs. The
shadow stacks are allocated in batches of 32 at a time and assigned
to every running task. The batch is held by the ret_stack_list
array.
But when the conversion happened, instead of allocating an array of
32 pointers, it was allocated as a ret_stack itself (PAGE_SIZE).
This ret_stack_list gets passed to a function that iterates over
what it believes is its size defined by the
FTRACE_RETSTACK_ALLOC_SIZE macro (which is 32).
Luckily (PAGE_SIZE) is greater than 32 * sizeof(long), otherwise
this would have been an array overflow. This still should be fixed
and the ret_stack_list should be allocated to the size it is
expected to be as someday it may end up being bigger than
SHADOW_STACK_SIZE"
* tag 'ftrace-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
fgraph: Allocate ret_stack_list with proper size
fgraph: Use CPU hotplug mechanism to initialize idle shadow stacks
Linus Torvalds [Sat, 19 Oct 2024 18:48:14 +0000 (11:48 -0700)]
Merge tag 'ipe-pr-20241018' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe
Pull ipe fixes from Fan Wu:
"This addresses several issues identified by Luca when attempting to
enable IPE on Debian and systemd:
- address issues with IPE policy update errors and policy update
version check, improving the clarity of error messages for better
understanding by userspace programs.
- enable IPE policies to be signed by secondary and platform
keyrings, facilitating broader use across general Linux
distributions like Debian.
- updates the IPE entry in the MAINTAINERS file to reflect the new
tree URL and my updated email from kernel.org"
* tag 'ipe-pr-20241018' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe:
MAINTAINERS: update IPE tree url and Fan Wu's email
ipe: fallback to platform keyring also if key in trusted keyring is rejected
ipe: allow secondary and platform keyrings to install/update policies
ipe: also reject policy updates with the same version
ipe: return -ESTALE instead of -EINVAL on update when new policy has a lower version
Linus Torvalds [Sat, 19 Oct 2024 17:18:03 +0000 (10:18 -0700)]
Merge tag 'input-for-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
- a fix for Zinitix driver to not fail probing if the property enabling
touch keys functionality is not defined. Support for touch keys was
added in 6.12 merge window so this issue does not affect users of
released kernels
- a couple new vendor/device IDs in xpad driver to enable support for
more hardware
* tag 'input-for-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: zinitix - don't fail if linux,keycodes prop is absent
Input: xpad - add support for MSI Claw A1M
Input: xpad - add support for 8BitDo Ultimate 2C Wireless Controller
Linus Torvalds [Sat, 19 Oct 2024 15:44:10 +0000 (08:44 -0700)]
Merge tag '9p-for-6.12-rc4' of https://github.com/martinetd/linux
Pull 9p fixes from Dominique Martinet:
"Mashed-up update that I sat on too long:
- fix for multiple slabs created with the same name
- enable multipage folios
- theorical fix to also look for opened fids by inode if none was
found by dentry"
[ Enabling multi-page folios should have been done during the merge
window, but it's a one-liner, and the actual meat of the enablement
is in netfs and already in use for other filesystems... - Linus ]
* tag '9p-for-6.12-rc4' of https://github.com/martinetd/linux:
9p: Avoid creating multiple slab caches with the same name
9p: Enable multipage folios
9p: v9fs_fid_find: also lookup by inode if not found dentry
Linus Torvalds [Sat, 19 Oct 2024 15:32:47 +0000 (08:32 -0700)]
Merge tag 'rust-fixes-6.12-2' of https://github.com/Rust-for-Linux/linux
Pull rust fixes from Miguel Ojeda:
"Toolchain and infrastructure:
- Fix several issues with the 'rustc-option' macro. It includes a
refactor from Masahiro of three '{cc,rust}-*' macros, which is not
a fix but avoids repeating the same commands (which would be
several lines in the case of 'rustc-option').
- Fix conditions for 'CONFIG_HAVE_CFI_ICALL_NORMALIZE_INTEGERS'. It
includes the addition of 'CONFIG_RUSTC_LLVM_VERSION', which is not
a fix but is needed for the actual fix.
And a trivial grammar fix"
* tag 'rust-fixes-6.12-2' of https://github.com/Rust-for-Linux/linux:
cfi: fix conditions for HAVE_CFI_ICALL_NORMALIZE_INTEGERS
kbuild: rust: add `CONFIG_RUSTC_LLVM_VERSION`
kbuild: fix issues with rustc-option
kbuild: refactor cc-option-yn, cc-disable-warning, rust-option-yn macros
lib/Kconfig.debug: fix grammar in RUST_BUILD_ASSERT_ALLOW
Jens Axboe [Sat, 19 Oct 2024 15:16:51 +0000 (09:16 -0600)]
io_uring/rw: fix wrong NOWAIT check in io_rw_init_file()
A previous commit improved how !FMODE_NOWAIT is dealt with, but
inadvertently negated a check whilst doing so. This caused -EAGAIN to be
returned from reading files with O_NONBLOCK set. Fix up the check for
REQ_F_SUPPORT_NOWAIT.
Steven Rostedt [Sat, 19 Oct 2024 01:52:12 +0000 (21:52 -0400)]
fgraph: Allocate ret_stack_list with proper size
The ret_stack_list is an array of ret_stack shadow stacks for the function
graph usage. When the first function graph is enabled, all tasks in the
system get a shadow stack. The ret_stack_list is a 32 element array of
pointers to these shadow stacks. It allocates the shadow stack in batches
(32 stacks at a time), assigns them to running tasks, and continues until
all tasks are covered.
When the function graph shadow stack changed from an array of
ftrace_ret_stack structures to an array of longs, the allocation of
ret_stack_list went from allocating an array of 32 elements to just a
block defined by SHADOW_STACK_SIZE. Luckily, that's defined as PAGE_SIZE
and is much more than enough to hold 32 pointers. But it is way overkill
for the amount needed to allocate.
Change the allocation of ret_stack_list back to a kcalloc() of
FTRACE_RETSTACK_ALLOC_SIZE pointers.
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20241018215212.23f13f40@rorschach Fixes: 42675b723b484 ("function_graph: Convert ret_stack to a series of longs") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt [Sat, 19 Oct 2024 01:43:00 +0000 (21:43 -0400)]
fgraph: Use CPU hotplug mechanism to initialize idle shadow stacks
The function graph infrastructure allocates a shadow stack for every task
when enabled. This includes the idle tasks. The first time the function
graph is invoked, the shadow stacks are created and never freed until the
task exits. This includes the idle tasks.
Only the idle tasks that were for online CPUs had their shadow stacks
created when function graph tracing started. If function graph tracing is
enabled and a CPU comes online, the idle task representing that CPU will
not have its shadow stack created, and all function graph tracing for that
idle task will be silently dropped.
Instead, use the CPU hotplug mechanism to allocate the idle shadow stacks.
This will include idle tasks for CPUs that come online during tracing.
Linus Torvalds [Fri, 18 Oct 2024 23:27:14 +0000 (16:27 -0700)]
Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Daniel Borkmann:
- Fix BPF verifier to not affect subreg_def marks in its range
propagation (Eduard Zingerman)
- Fix a truncation bug in the BPF verifier's handling of
coerce_reg_to_size_sx (Dimitar Kanaliev)
- Fix the BPF verifier's delta propagation between linked registers
under 32-bit addition (Daniel Borkmann)
- Fix a NULL pointer dereference in BPF devmap due to missing rxq
information (Florian Kauer)
- Fix a memory leak in bpf_core_apply (Jiri Olsa)
- Fix an UBSAN-reported array-index-out-of-bounds in BTF parsing for
arrays of nested structs (Hou Tao)
- Fix build ID fetching where memory areas backing the file were
created with memfd_secret (Andrii Nakryiko)
- Fix BPF task iterator tid filtering which was incorrectly using pid
instead of tid (Jordan Rome)
- Several fixes for BPF sockmap and BPF sockhash redirection in
combination with vsocks (Michal Luczaj)
- Fix riscv BPF JIT and make BPF_CMPXCHG fully ordered (Andrea Parri)
- Fix riscv BPF JIT under CONFIG_CFI_CLANG to prevent the possibility
of an infinite BPF tailcall (Pu Lehui)
- Fix a build warning from resolve_btfids that bpf_lsm_key_free cannot
be resolved (Thomas Weißschuh)
- Fix a bug in kfunc BTF caching for modules where the wrong BTF object
was returned (Toke Høiland-Jørgensen)
- Fix a BPF selftest compilation error in cgroup-related tests with
musl libc (Tony Ambardar)
- Several fixes to BPF link info dumps to fill missing fields (Tyrone
Wu)
- Add BPF selftests for kfuncs from multiple modules, checking that the
correct kfuncs are called (Simon Sundberg)
- Ensure that internal and user-facing bpf_redirect flags don't overlap
(Toke Høiland-Jørgensen)
- Switch to use kvzmalloc to allocate BPF verifier environment (Rik van
Riel)
- Use raw_spinlock_t in BPF ringbuf to fix a sleep in atomic splat
under RT (Wander Lairson Costa)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (38 commits)
lib/buildid: Handle memfd_secret() files in build_id_parse()
selftests/bpf: Add test case for delta propagation
bpf: Fix print_reg_state's constant scalar dump
bpf: Fix incorrect delta propagation between linked registers
bpf: Properly test iter/task tid filtering
bpf: Fix iter/task tid filtering
riscv, bpf: Make BPF_CMPXCHG fully ordered
bpf, vsock: Drop static vsock_bpf_prot initialization
vsock: Update msg_count on read_skb()
vsock: Update rx_bytes on read_skb()
bpf, sockmap: SK_DROP on attempted redirects of unsupported af_vsock
selftests/bpf: Add asserts for netfilter link info
bpf: Fix link info netfilter flags to populate defrag flag
selftests/bpf: Add test for sign extension in coerce_subreg_to_size_sx()
selftests/bpf: Add test for truncation after sign extension in coerce_reg_to_size_sx()
bpf: Fix truncation bug in coerce_reg_to_size_sx()
selftests/bpf: Assert link info uprobe_multi count & path_size if unset
bpf: Fix unpopulated path_size when uprobe_multi fields unset
selftests/bpf: Fix cross-compiling urandom_read
selftests/bpf: Add test for kfunc module order
...
Linus Torvalds [Fri, 18 Oct 2024 23:11:17 +0000 (16:11 -0700)]
Merge tag 'linux_kselftest-fixes-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fix from Shuah Khan:
- fix test makefile to install tests directory without which the test
fails with errors
* tag 'linux_kselftest-fixes-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftest: hid: add the missing tests directory
Nikita Travkin [Fri, 4 Oct 2024 16:17:30 +0000 (21:17 +0500)]
Input: zinitix - don't fail if linux,keycodes prop is absent
When initially adding the touchkey support, a mistake was made in the
property parsing code. The possible negative errno from
device_property_count_u32() was never checked, which was an oversight
left from converting to it from the of_property as part of the review
fixes.
Re-add the correct handling of the absent property, in which case zero
touchkeys should be assumed, which would disable the feature.
- Cleanup of checking for always-set tag_set (SurajSonawane2415)
- Fix for a crash with CPU hotplug notifiers (Ming)
- Don't allow zero-copy ublk on unprivileged device (Ming)
- Use array_index_nospec() for CDROM (Josh)
- Remove dead code in drbd (David)
- Tweaks to elevator loading (Breno)
* tag 'block-6.12-20241018' of git://git.kernel.dk/linux:
cdrom: Avoid barrier_nospec() in cdrom_ioctl_media_changed()
nvme: use helper nvme_ctrl_state in nvme_keep_alive_finish function
nvme: make keep-alive synchronous operation
nvme-loop: flush off pending I/O while shutting down loop controller
nvme-pci: fix race condition between reset and nvme_dev_disable()
ublk: don't allow user copy for unprivileged device
blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race
nvme-multipath: defer partition scanning
blk-mq: setup queue ->tag_set before initializing hctx
elevator: Remove argument from elevator_find_get
elevator: do not request_module if elevator exists
drbd: Remove unused conn_lowest_minor
nvme: disable CC.CRIME (NVME_CC_CRIME)
nvme: delete unnecessary fallthru comment
nvmet-rdma: use sbitmap to replace rsp free list
block: Fix elevator_get_default() checking for NULL q->tag_set
nvme: tcp: avoid race between queue_lock lock and destroy
nvmet-passthru: clear EUID/NGUID/UUID while using loop target
block: fix blk_rq_map_integrity_sg kernel-doc
Linus Torvalds [Fri, 18 Oct 2024 22:38:37 +0000 (15:38 -0700)]
Merge tag 'io_uring-6.12-20241018' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
- Fix a regression this merge window where cloning of registered
buffers didn't take into account the dummy_ubuf
- Fix a race with reading how many SQRING entries are available,
causing userspace to need to loop around io_uring_sqring_wait()
rather than being able to rely on SQEs being available when it
returned
- Ensure that the SQPOLL thread is TASK_RUNNING before running
task_work off the cancelation exit path
* tag 'io_uring-6.12-20241018' of git://git.kernel.dk/linux:
io_uring/sqpoll: ensure task state is TASK_RUNNING when running task_work
io_uring/rsrc: ignore dummy_ubuf for buffer cloning
io_uring/sqpoll: close race on waiting for sqring entries
Mark Brown [Fri, 18 Oct 2024 20:47:03 +0000 (21:47 +0100)]
ASoC/SoundWire: clean up link DMA during stop for IPC4
Merge series from Bard Liao <yung-chuan.liao@linux.intel.com>:
Clean up the link DMA for playback during stop for IPC4 is required to
reset the DMA read/write pointers when the stream is prepared and
restarted after a call to snd_pcm_drain()/snd_pcm_drop().
The change is mainly on ASoC. We may go via ASoC tree with Vinod's
Acked-by tag
Ranjani Sridharan (4):
ASoC: SOF: ipc4-topology: Do not set ALH node_id for aggregated DAIs
ASoC: SOF: Intel: hda: Handle prepare without close for non-HDA DAI's
soundwire: intel_ace2x: Send PDI stream number during prepare
ASoC: SOF: Intel: hda: Always clean up link DMA during stop
Olga Kornievskaia [Fri, 18 Oct 2024 19:24:58 +0000 (15:24 -0400)]
nfsd: fix race between laundromat and free_stateid
There is a race between laundromat handling of revoked delegations
and a client sending free_stateid operation. Laundromat thread
finds that delegation has expired and needs to be revoked so it
marks the delegation stid revoked and it puts it on a reaper list
but then it unlock the state lock and the actual delegation revocation
happens without the lock. Once the stid is marked revoked a racing
free_stateid processing thread does the following (1) it calls
list_del_init() which removes it from the reaper list and (2) frees
the delegation stid structure. The laundromat thread ends up not
calling the revoke_delegation() function for this particular delegation
but that means it will no release the lock lease that exists on
the file.
Now, a new open for this file comes in and ends up finding that
lease list isn't empty and calls nfsd_breaker_owns_lease() which ends
up trying to derefence a freed delegation stateid. Leading to the
followint use-after-free KASAN warning:
This patch proposes a fixed that's based on adding 2 new additional
stid's sc_status values that help coordinate between the laundromat
and other operations (nfsd4_free_stateid() and nfsd4_delegreturn()).
First to make sure, that once the stid is marked revoked, it is not
removed by the nfsd4_free_stateid(), the laundromat take a reference
on the stateid. Then, coordinating whether the stid has been put
on the cl_revoked list or we are processing FREE_STATEID and need to
make sure to remove it from the list, each check that state and act
accordingly. If laundromat has added to the cl_revoke list before
the arrival of FREE_STATEID, then nfsd4_free_stateid() knows to remove
it from the list. If nfsd4_free_stateid() finds that operations arrived
before laundromat has placed it on cl_revoke list, it marks the state
freed and then laundromat will no longer add it to the list.
Also, for nfsd4_delegreturn() when looking for the specified stid,
we need to access stid that are marked removed or freeable, it means
the laundromat has started processing it but hasn't finished and this
delegreturn needs to return nfserr_deleg_revoked and not
nfserr_bad_stateid. The latter will not trigger a FREE_STATEID and the
lack of it will leave this stid on the cl_revoked list indefinitely.
Fixes: 2d4a532d385f ("nfsd: ensure that clp->cl_revoked list is protected by clp->cl_lock") CC: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
ipe: fallback to platform keyring also if key in trusted keyring is rejected
If enabled, we fallback to the platform keyring if the trusted keyring
doesn't have the key used to sign the ipe policy. But if pkcs7_verify()
rejects the key for other reasons, such as usage restrictions, we do not
fallback. Do so, following the same change in dm-verity.
Signed-off-by: Luca Boccassi <bluca@debian.org> Suggested-by: Serge Hallyn <serge@hallyn.com>
[FW: fixed some line length issues and a typo in the commit message] Signed-off-by: Fan Wu <wufan@kernel.org>
Linus Torvalds [Fri, 18 Oct 2024 18:37:12 +0000 (11:37 -0700)]
Merge tag 'v6.12-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- Fix possible double free setting xattrs
- Fix slab out of bounds with large ioctl payload
- Remove three unused functions, and an unused variable that could be
confusing
* tag 'v6.12-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Remove unused functions
smb/client: Fix logically dead code
smb: client: fix OOBs when building SMB2_IOCTL request
smb: client: fix possible double free in smb2_set_ea()
Linus Torvalds [Fri, 18 Oct 2024 18:28:39 +0000 (11:28 -0700)]
Merge tag 'xfs-6.12-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Carlos Maiolino:
- Fix integer overflow in xrep_bmap
- Fix stale dealloc punching for COW IO
* tag 'xfs-6.12-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: punch delalloc extents from the COW fork for COW writes
xfs: set IOMAP_F_SHARED for all COW fork allocations
xfs: share more code in xfs_buffered_write_iomap_begin
xfs: support the COW fork in xfs_bmap_punch_delalloc_range
xfs: IOMAP_ZERO and IOMAP_UNSHARE already hold invalidate_lock
xfs: take XFS_MMAPLOCK_EXCL xfs_file_write_zero_eof
xfs: factor out a xfs_file_write_zero_eof helper
iomap: move locking out of iomap_write_delalloc_release
iomap: remove iomap_file_buffered_write_punch_delalloc
iomap: factor out a iomap_last_written_block helper
xfs: fix integer overflow in xrep_bmap
Linus Torvalds [Fri, 18 Oct 2024 18:16:01 +0000 (11:16 -0700)]
Merge tag 'pm-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix two issues in the amd-pstate cpufreq driver and update the
intel_rapl power capping driver with a new processor ID.
Specifics:
- Enable ACPI CPPC in amd_pstate_register_driver() after disabling it
in amd_pstate_unregister_driver() when switching driver operation
modes (Dhananjay Ugwekar)
- Make amd-pstate use nominal performance as the maximum performance
level when boost is disabled (Mario Limonciello)
- Add ArrowLake-H to the list of processors where PL4 is supported in
the MSR part of the intel_rapl power capping driver (Srinivas
Pandruvada)"
* tag 'pm-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
powercap: intel_rapl_msr: Add PL4 support for ArrowLake-H
cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled
cpufreq/amd-pstate: Fix amd_pstate mode switch on shared memory systems
Linus Torvalds [Fri, 18 Oct 2024 18:13:53 +0000 (11:13 -0700)]
Merge tag 'hwmon-for-v6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fix from Guenter Roeck:
"Fix auto-detect regression in jc42 driver"
* tag 'hwmon-for-v6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
[PATCH} hwmon: (jc42) Properly detect TSE2004-compliant devices again
Linus Torvalds [Fri, 18 Oct 2024 18:03:21 +0000 (11:03 -0700)]
Merge tag 'drm-fixes-2024-10-18' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Weekly fixes, msm and xe are the two main ones, with a bunch of
scattered fixes including a largish revert in mgag200, then amdgpu,
vmwgfx and scattering of other minor ones.
All seems pretty regular.
msm:
- Display:
- move CRTC resource assignment to atomic_check otherwise to make
consecutive calls to atomic_check() consistent
- fix rounding / sign-extension issues with pclk calculation in
case of DSC
- cleanups to drop incorrect null checks in dpu snapshots
- fix to use kvzalloc in dpu snapshot to avoid allocation issues
in heavily loaded system cases
- Fix to not program merge_3d block if dual LM is not being used
- Fix to not flush merge_3d block if its not enabled otherwise
this leads to false timeouts
- GPU:
- a7xx: add a fence wait before SMMU table update
xe:
- New workaround to Xe2 (Aradhya)
- Fix unbalanced rpm put (Matthew Auld)
- Remove fragile lock optimization (Matthew Brost)
- Fix job release, delegating it to the drm scheduler (Matthew Brost)
- Fix timestamp bit width for Xe2 (Lucas)
- Fix external BO's dma-resv usag (Matthew Brost)
- Fix returning success for timeout in wait_token (Nirmoy)
- Initialize fence to avoid it being detected as signaled (Matthew
Auld)
- Improve cache flush for BMG (Matthew Auld)
- Don't allow hflip for tile4 framebuffer on Xe2 (Juha-Pekka)
host1x:
- Fix boot on Tegra186
- Set DMA parameters
mgag200:
- Revert VBLANK support
panel:
- himax-hx83192: Adjust power and gamma
qaic:
- Sgtable loop fixes
vmwgfx:
- Limit display layout allocatino size
- Handle allocation errors in connector checks
- Clean up KMS code for 2d-only setup
- Report surface-check errors correctly
- Remove NULL test around kvfree()"
* tag 'drm-fixes-2024-10-18' of https://gitlab.freedesktop.org/drm/kernel: (45 commits)
drm/ast: vga: Clear EDID if no display is connected
drm/ast: sil164: Clear EDID if no display is connected
Revert "drm/mgag200: Add vblank support"
drm/amdgpu/swsmu: default to fullscreen 3D profile for dGPUs
drm/i915/display: Don't allow tile4 framebuffer to do hflip on display20 or greater
drm/xe/bmg: improve cache flushing behaviour
drm/xe/xe_sync: initialise ufence.signalled
drm/xe/ufence: ufence can be signaled right after wait_woken
drm/xe: Use bookkeep slots for external BO's in exec IOCTL
drm/xe/query: Increase timestamp width
drm/xe: Don't free job in TDR
drm/xe: Take job list lock in xe_sched_add_pending_job
drm/xe: fix unbalanced rpm put() with declare_wedged()
drm/xe: fix unbalanced rpm put() with fence_fini()
drm/xe/xe2lpg: Extend Wa_15016589081 for xe2lpg
drm/i915/dp_mst: Don't require DSC hblank quirk for a non-DSC compatible mode
drm/i915/dp_mst: Handle error during DSC BW overhead/slice calculation
drm/msm/a6xx+: Insert a fence wait before SMMU table update
drm/msm/dpu: don't always program merge_3d block
drm/msm/dpu: Don't always set merge_3d pending flush
...