Dave Airlie [Fri, 14 Mar 2025 01:09:31 +0000 (11:09 +1000)]
Merge tag 'drm-misc-fixes-2025-03-13' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
A null pointer check for gma500, two clippy fixes for panic, a fix for
an interaction between DPMS and atomic leading to dropped frames, and
a locking fix for dp_mst
Yifan Zha [Wed, 5 Mar 2025 05:14:55 +0000 (13:14 +0800)]
drm/amd/amdkfd: Evict all queues even HWS remove queue failed
[Why]
If reset is detected and kfd need to evict working queues, HWS moving queue will be failed.
Then remaining queues are not evicted and in active state.
After reset done, kfd uses HWS to termination remaining activated queues but HWS is resetted.
So remove queue will be failed again.
[How]
Keep removing all queues even if HWS returns failed.
It will not affect cpsch as it checks reset_domain->sem.
v2: If any queue failed, evict queue returns error.
v3: Declare err inside the if-block.
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 42c854b8fb0cce512534aa2b7141948e80c6ebb0) Cc: stable@vger.kernel.org
Imre Deak [Fri, 7 Mar 2025 18:31:52 +0000 (20:31 +0200)]
drm/dp_mst: Fix locking when skipping CSN before topology probing
The handling of the MST Connection Status Notify message is skipped if
the probing of the topology is still pending. Acquiring the
drm_dp_mst_topology_mgr::probe_lock for this in
drm_dp_mst_handle_up_req() is problematic: the task/work this function
is called from is also responsible for handling MST down-request replies
(in drm_dp_mst_handle_down_rep()). Thus drm_dp_mst_link_probe_work() -
holding already probe_lock - could be blocked waiting for an MST
down-request reply while drm_dp_mst_handle_up_req() is waiting for
probe_lock while processing a CSN message. This leads to the probe
work's down-request message timing out.
A scenario similar to the above leading to a down-request timeout is
handling a CSN message in drm_dp_mst_handle_conn_stat(), holding the
probe_lock and sending down-request messages while a second CSN message
sent by the sink subsequently is handled by drm_dp_mst_handle_up_req().
Fix the above by moving the logic to skip the CSN handling to
drm_dp_mst_process_up_req(). This function is called from a work
(separate from the task/work handling new up/down messages), already
holding probe_lock. This solves the above timeout issue, since handling
of down-request replies won't be blocked by probe_lock.
Fixes: ddf983488c3e ("drm/dp_mst: Skip CSN if topology probing is not done yet") Cc: Wayne Lin <Wayne.Lin@amd.com> Cc: Lyude Paul <lyude@redhat.com> Cc: stable@vger.kernel.org # v6.6+ Reviewed-by: Wayne Lin <Wayne.Lin@amd.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250307183152.3822170-1-imre.deak@intel.com
Mario Limonciello [Fri, 28 Feb 2025 19:18:14 +0000 (13:18 -0600)]
drm/amd/display: Fix slab-use-after-free on hdcp_work
[Why]
A slab-use-after-free is reported when HDCP is destroyed but the
property_validate_dwork queue is still running.
[How]
Cancel the delayed work when destroying workqueue.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4006 Fixes: da3fd7ac0bcf ("drm/amd/display: Update CP property based on HW query") Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 725a04ba5a95e89c89633d4322430cfbca7ce128) Cc: stable@vger.kernel.org
Alex Hung [Thu, 27 Feb 2025 23:36:25 +0000 (16:36 -0700)]
drm/amd/display: Assign normalized_pix_clk when color depth = 14
[WHY & HOW]
A warning message "WARNING: CPU: 4 PID: 459 at ... /dc_resource.c:3397
calculate_phy_pix_clks+0xef/0x100 [amdgpu]" occurs because the
display_color_depth == COLOR_DEPTH_141414 is not handled. This is
observed in Radeon RX 6600 XT.
It is fixed by assigning pix_clk * (14 * 3) / 24 - same as the rests.
Also fixes the indentation in get_norm_pix_clk.
Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 274a87eb389f58eddcbc5659ab0b180b37e92775) Cc: stable@vger.kernel.org
Mario Limonciello [Sun, 23 Feb 2025 06:04:35 +0000 (00:04 -0600)]
drm/amd/display: Restore correct backlight brightness after a GPU reset
[Why]
GPU reset will attempt to restore cached state, but brightness doesn't
get restored. It will come back at 100% brightness, but userspace thinks
it's the previous value.
[How]
When running resume sequence if GPU is in reset restore brightness
to previous value.
Acked-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5e19e2b57b6bb640d68dfc7991e1e182922cf867) Cc: stable@vger.kernel.org
Mario Limonciello [Sun, 23 Feb 2025 05:37:32 +0000 (23:37 -0600)]
drm/amd/display: fix default brightness
[Why]
To avoid flickering during boot default brightness level set by BIOS
should be maintained for as much of the boot as feasible.
commit 2fe87f54abdc ("drm/amd/display: Set default brightness according
to ACPI") attempted to set the right levels for AC vs DC, but brightness
still got reset to maximum level in initialization code for
setup_backlight_device().
[How]
Remove the hardcoded initialization in setup_backlight_device() and
instead program brightness value to match BIOS (AC or DC). This avoids a
brightness flicker from kernel changing the value. Userspace may however
still change it during boot.
Fixes: 2fe87f54abdc ("drm/amd/display: Set default brightness according to ACPI") Acked-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0747acf3311229e22009bec4a9e7fc30c879e842) Cc: stable@vger.kernel.org
Leo Li [Thu, 20 Feb 2025 21:20:26 +0000 (16:20 -0500)]
drm/amd/display: Disable unneeded hpd interrupts during dm_init
[Why]
It seems HPD interrupts are enabled by default for all connectors, even
if the hpd source isn't valid. An eDP for example, does not have a valid
hpd source (but does have a valid hpdrx source; see construct_phy()).
Thus, eDPs should have their hpd interrupt disabled.
In the past, this wasn't really an issue. Although the driver gets
interrupted, then acks by writing to hw registers, there weren't any
subscribed handlers that did anything meaningful (see
register_hpd_handlers()).
But things changed with the introduction of IPS. s2idle requires that
the driver allows IPS for DMUB fw to put hw to sleep. Since register
access requires hw to be awake, the driver will block IPS entry to do
so. And no IPS means no hw sleep during s2idle.
This was the observation on DCN35 systems with an eDP. During suspend,
the eDP toggled its hpd pin as part of the panel power down sequence.
The driver was then interrupted, and acked by writing to registers,
blocking IPS entry.
[How]
Since DC marks eDP connections as having invalid hpd sources (see
construct_phy()), DM should disable them at the hw level. Do so in
amdgpu_dm_hpd_init() by disabling all hpd ints first, then selectively
enabling ones for connectors that have valid hpd sources.
Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7b1ba19eb15f88e70782642ce2d934211269337b) Cc: stable@vger.kernel.org
Mario Limonciello [Thu, 6 Mar 2025 18:51:24 +0000 (12:51 -0600)]
drm/amd: Keep display off while going into S4
When userspace invokes S4 the flow is:
1) amdgpu_pmops_prepare()
2) amdgpu_pmops_freeze()
3) Create hibernation image
4) amdgpu_pmops_thaw()
5) Write out image to disk
6) Turn off system
Then on resume amdgpu_pmops_restore() is called.
This flow has a problem that because amdgpu_pmops_thaw() is called
it will call amdgpu_device_resume() which will resume all of the GPU.
This includes turning the display hardware back on and discovering
connectors again.
This is an unexpected experience for the display to turn back on.
Adjust the flow so that during the S4 sequence display hardware is
not turned back on.
Reported-by: Xaver Hugl <xaver.hugl@gmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2038 Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Link: https://lore.kernel.org/r/20250306185124.44780-1-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 68bfdc8dc0a1a7fdd9ab61e69907ae71a6fd3d91)
Starting from 6.11, AMDGPU driver, while being loaded with amdgpu.dc=1,
due to lack of .is_two_pixels_per_container function in dce60_tg_funcs,
causes a NULL pointer dereference on PCs with old GPUs, such as R9 280X.
So this fix adds missing .is_two_pixels_per_container to dce60_tg_funcs.
Reported-by: Rosen Penev <rosenp@gmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3942 Fixes: e6a901a00822 ("drm/amd/display: use even ODM slice width for two pixels per container") Signed-off-by: Aliaksei Urbanski <aliaksei.urbanski@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bd4b125eb949785c6f8a53b0494e32795421209d) Cc: stable@vger.kernel.org
Ville Syrjälä [Wed, 19 Feb 2025 16:02:39 +0000 (18:02 +0200)]
drm/atomic: Filter out redundant DPMS calls
Video players (eg. mpv) do periodic XResetScreenSaver() calls to
keep the screen on while the video playing. The modesetting ddx
plumbs these straight through into the kernel as DPMS setproperty
ioctls, without any filtering whatsoever. When implemented via
atomic these end up as empty commits on the crtc (which will
nonetheless take one full frame), which leads to a dropped
frame every time XResetScreenSaver() is called.
Let's just filter out redundant DPMS property changes in the
kernel to avoid this issue.
v2: Explain the resulting commits a bit better (Sima)
Document the behaviour in uapi docs (Sima)
Linus Torvalds [Sun, 9 Mar 2025 19:23:14 +0000 (09:23 -1000)]
Merge tag 'kbuild-fixes-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Use the specified $(LD) when building userprogs with Clang
- Pass the correct target triple when compile-testing UAPI headers
with Clang
- Fix pacman-pkg build error with KBUILD_OUTPUT
* tag 'kbuild-fixes-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: install-extmod-build: Fix build when specifying KBUILD_OUTPUT
docs: Kconfig: fix defconfig description
kbuild: hdrcheck: fix cross build with clang
kbuild: userprogs: use correct lld when linking through clang
Linus Torvalds [Sun, 9 Mar 2025 19:14:07 +0000 (09:14 -1000)]
Merge tag 'usb-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB driver fixes for some reported issues. These
contain:
- typec driver fixes
- dwc3 driver fixes
- xhci driver fixes
- renesas controller fixes
- gadget driver fixes
- a new USB quirk added
All of these have been in linux-next with no reported issues"
* tag 'usb-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: typec: ucsi: Fix NULL pointer access
usb: quirks: Add DELAY_INIT and NO_LPM for Prolific Mass Storage Card Reader
usb: xhci: Fix host controllers "dying" after suspend and resume
usb: dwc3: Set SUSPENDENABLE soon after phy init
usb: hub: lack of clearing xHC resources
usb: renesas_usbhs: Flush the notify_hotplug_work
usb: renesas_usbhs: Use devm_usb_get_phy()
usb: renesas_usbhs: Call clk_put()
usb: dwc3: gadget: Prevent irq storm when TH re-executes
usb: gadget: Check bmAttributes only if configuration is valid
xhci: Restrict USB4 tunnel detection for USB3 devices to Intel hosts
usb: xhci: Enable the TRB overfetch quirk on VIA VL805
usb: gadget: Fix setting self-powered state on suspend
usb: typec: ucsi: increase timeout for PPM reset operations
acpi: typec: ucsi: Introduce a ->poll_cci method
usb: typec: tcpci_rt1711h: Unmask alert interrupts to fix functionality
usb: gadget: Set self-powered based on MaxPower and bmAttributes
usb: gadget: u_ether: Set is_suspend flag if remote wakeup fails
usb: atm: cxacru: fix a flaw in existing endpoint checks
Linus Torvalds [Sun, 9 Mar 2025 19:11:42 +0000 (09:11 -1000)]
Merge tag 'driver-core-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg KH:
"Here is a single driver core fix that resolves a reported memory leak.
It's been in linux-next for 2 weeks now with no reported problems"
* tag 'driver-core-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
drivers: core: fix device leak in __fw_devlink_relax_cycles()
Linus Torvalds [Sun, 9 Mar 2025 19:07:54 +0000 (09:07 -1000)]
Merge tag 'char-misc-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc/IIO driver fixes from Greg KH:
"Here are a number of misc and char and iio driver fixes that have been
sitting in my tree for way too long. They contain:
- iio driver fixes for reported issues
- regression fix for rtsx_usb card reader
- mei and mhi driver fixes
- small virt driver fixes
- ntsync permissions fix
- other tiny driver fixes for reported problems.
All of these have been in linux-next for quite a while with no
reported issues"
* tag 'char-misc-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (30 commits)
Revert "drivers/card_reader/rtsx_usb: Restore interrupt based detection"
ntsync: Check wait count based on byte size.
bus: simple-pm-bus: fix forced runtime PM use
char: misc: deallocate static minor in error path
eeprom: digsy_mtc: Make GPIO lookup table match the device
drivers: virt: acrn: hsm: Use kzalloc to avoid info leak in pmcmd_ioctl
binderfs: fix use-after-free in binder_devices
slimbus: messaging: Free transaction ID in delayed interrupt scenario
vbox: add HAS_IOPORT dependency
cdx: Fix possible UAF error in driver_override_show()
intel_th: pci: Add Panther Lake-P/U support
intel_th: pci: Add Panther Lake-H support
intel_th: pci: Add Arrow Lake support
intel_th: msu: Fix less trivial kernel-doc warnings
intel_th: msu: Fix kernel-doc warnings
MAINTAINERS: change maintainer for FSI
ntsync: Set the permissions to be 0666
bus: mhi: host: pci_generic: Use pci_try_reset_function() to avoid deadlock
mei: vsc: Use "wakeuphostint" when getting the host wakeup GPIO
mei: me: add panther lake P DID
...
Linus Torvalds [Sun, 9 Mar 2025 19:04:08 +0000 (09:04 -1000)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"arm64:
- Fix a couple of bugs affecting pKVM's PSCI relay implementation
when running in the hVHE mode, resulting in the host being entered
with the MMU in an unknown state, and EL2 being in the wrong mode
x86:
- Set RFLAGS.IF in C code on SVM to get VMRUN out of the STI shadow
- Ensure DEBUGCTL is context switched on AMD to avoid running the
guest with the host's value, which can lead to unexpected bus lock
#DBs
- Suppress DEBUGCTL.BTF on AMD (to match Intel), as KVM doesn't
properly emulate BTF. KVM's lack of context switching has meant BTF
has always been broken to some extent
- Always save DR masks for SNP vCPUs if DebugSwap is *supported*, as
the guest can enable DebugSwap without KVM's knowledge
- Fix a bug in mmu_stress_tests where a vCPU could finish the "writes
to RO memory" phase without actually generating a write-protection
fault
- Fix a printf() goof in the SEV smoke test that causes build
failures with -Werror
- Explicitly zero EAX and EBX in CPUID.0x8000_0022 output when
PERFMON_V2 isn't supported by KVM"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Explicitly zero EAX and EBX when PERFMON_V2 isn't supported by KVM
KVM: selftests: Fix printf() format goof in SEV smoke test
KVM: selftests: Ensure all vCPUs hit -EFAULT during initial RO stage
KVM: SVM: Don't rely on DebugSwap to restore host DR0..DR3
KVM: SVM: Save host DR masks on CPUs with DebugSwap
KVM: arm64: Initialize SCTLR_EL1 in __kvm_hyp_init_cpu()
KVM: arm64: Initialize HCR_EL2.E2H early
KVM: x86: Snapshot the host's DEBUGCTL after disabling IRQs
KVM: SVM: Manually context switch DEBUGCTL if LBR virtualization is disabled
KVM: x86: Snapshot the host's DEBUGCTL in common x86
KVM: SVM: Suppress DEBUGCTL.BTF on AMD
KVM: SVM: Drop DEBUGCTL[5:2] from guest's effective value
KVM: selftests: Assert that STI blocking isn't set after event injection
KVM: SVM: Set RFLAGS.IF=1 in C code, to get VMRUN out of the STI shadow
Paolo Bonzini [Sun, 9 Mar 2025 07:44:06 +0000 (03:44 -0400)]
Merge tag 'kvm-x86-fixes-6.14-rcN.2' of https://github.com/kvm-x86/linux into HEAD
KVM x86 fixes for 6.14-rcN #2
- Set RFLAGS.IF in C code on SVM to get VMRUN out of the STI shadow.
- Ensure DEBUGCTL is context switched on AMD to avoid running the guest with
the host's value, which can lead to unexpected bus lock #DBs.
- Suppress DEBUGCTL.BTF on AMD (to match Intel), as KVM doesn't properly
emulate BTF. KVM's lack of context switching has meant BTF has always been
broken to some extent.
- Always save DR masks for SNP vCPUs if DebugSwap is *supported*, as the guest
can enable DebugSwap without KVM's knowledge.
- Fix a bug in mmu_stress_tests where a vCPU could finish the "writes to RO
memory" phase without actually generating a write-protection fault.
- Fix a printf() goof in the SEV smoke test that causes build failures with
-Werror.
- Explicitly zero EAX and EBX in CPUID.0x8000_0022 output when PERFMON_V2
isn't supported by KVM.
Paolo Bonzini [Sun, 9 Mar 2025 07:43:56 +0000 (03:43 -0400)]
Merge tag 'kvmarm-fixes-6.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.14, take #4
- Fix a couple of bugs affecting pKVM's PSCI relay implementation
when running in the hVHE mode, resulting in the host being entered
with the MMU in an unknown state, and EL2 being in the wrong mode.
Linus Torvalds [Sun, 9 Mar 2025 00:34:06 +0000 (14:34 -1000)]
Merge tag 'mm-hotfixes-stable-2025-03-08-16-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"33 hotfixes. 24 are cc:stable and the remainder address post-6.13
issues or aren't considered necessary for -stable kernels.
26 are for MM and 7 are for non-MM.
- "mm: memory_failure: unmap poisoned folio during migrate properly"
from Ma Wupeng fixes a couple of two year old bugs involving the
migration of hwpoisoned folios.
- "selftests/damon: three fixes for false results" from SeongJae Park
fixes three one year old bugs in the SAMON selftest code.
The remainder are singletons and doubletons. Please see the individual
changelogs for details"
* tag 'mm-hotfixes-stable-2025-03-08-16-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (33 commits)
mm/page_alloc: fix uninitialized variable
rapidio: add check for rio_add_net() in rio_scan_alloc_net()
rapidio: fix an API misues when rio_add_net() fails
MAINTAINERS: .mailmap: update Sumit Garg's email address
Revert "mm/page_alloc.c: don't show protection in zone's ->lowmem_reserve[] for empty zone"
mm: fix finish_fault() handling for large folios
mm: don't skip arch_sync_kernel_mappings() in error paths
mm: shmem: remove unnecessary warning in shmem_writepage()
userfaultfd: fix PTE unmapping stack-allocated PTE copies
userfaultfd: do not block on locking a large folio with raised refcount
mm: zswap: use ATOMIC_LONG_INIT to initialize zswap_stored_pages
mm: shmem: fix potential data corruption during shmem swapin
mm: fix kernel BUG when userfaultfd_move encounters swapcache
selftests/damon/damon_nr_regions: sort collected regiosn before checking with min/max boundaries
selftests/damon/damon_nr_regions: set ops update for merge results check to 100ms
selftests/damon/damos_quota: make real expectation of quota exceeds
include/linux/log2.h: mark is_power_of_2() with __always_inline
NFS: fix nfs_release_folio() to not deadlock via kcompactd writeback
mm, swap: avoid BUG_ON in relocate_cluster()
mm: swap: use correct step in loop to wait all clusters in wait_for_allocation()
...
Linus Torvalds [Sat, 8 Mar 2025 19:29:54 +0000 (09:29 -1000)]
Merge tag 'x86-urgent-2025-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more x86 fixes from Ingo Molnar:
- Add more model IDs to the AMD microcode version check, more people
are hitting these checks
- Fix a Xen guest boot warning related to AMD northbridge setup
- Fix SEV guest bugs related to a recent changes in its locking logic
- Fix a missing definition of PTRS_PER_PMD that assembly builds can hit
* tag 'x86-urgent-2025-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode/AMD: Add some forgotten models to the SHA check
x86/mm: Define PTRS_PER_PMD for assembly code too
virt: sev-guest: Move SNP Guest Request data pages handling under snp_cmd_mutex
virt: sev-guest: Allocate request data dynamically
x86/amd_nb: Use rdmsr_safe() in amd_get_mmconfig_range()
Linus Torvalds [Sat, 8 Mar 2025 17:21:41 +0000 (07:21 -1000)]
Merge tag 'loongarch-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Fix bugs in kernel build, hibernation, memory management and KVM"
* tag 'loongarch-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: KVM: Fix GPA size issue about VM
LoongArch: KVM: Reload guest CSR registers after sleep
LoongArch: KVM: Add interrupt checking for AVEC
LoongArch: Set hugetlb mmap base address aligned with pmd size
LoongArch: Set max_pfn with the PFN of the last page
LoongArch: Use polling play_dead() when resuming from hibernation
LoongArch: Eliminate superfluous get_numa_distances_cnt()
LoongArch: Convert unreachable() to BUG()
Bibo Mao [Sat, 8 Mar 2025 05:52:04 +0000 (13:52 +0800)]
LoongArch: KVM: Fix GPA size issue about VM
Physical address space is 48 bit on Loongson-3A5000 physical machine,
however it is 47 bit for VM on Loongson-3A5000 system. Size of physical
address space of VM is the same with the size of virtual user space (a
half) of physical machine.
Variable cpu_vabits represents user address space, kernel address space
is not included (user space and kernel space are both a half of total).
Here cpu_vabits, rather than cpu_vabits - 1, is to represent the size of
guest physical address space.
Also there is strict checking about page fault GPA address, inject error
if it is larger than maximum GPA address of VM.
Cc: stable@vger.kernel.org Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Bibo Mao [Sat, 8 Mar 2025 05:52:01 +0000 (13:52 +0800)]
LoongArch: KVM: Reload guest CSR registers after sleep
On host, the HW guest CSR registers are lost after suspend and resume
operation. Since last_vcpu of boot CPU still records latest vCPU pointer
so that the guest CSR register skips to reload when boot CPU resumes and
vCPU is scheduled.
Here last_vcpu is cleared so that guest CSR registers will reload from
scheduled vCPU context after suspend and resume.
Cc: stable@vger.kernel.org Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Bibo Mao [Sat, 8 Mar 2025 05:51:59 +0000 (13:51 +0800)]
LoongArch: KVM: Add interrupt checking for AVEC
There is a newly added macro INT_AVEC with CSR ESTAT register, which is
bit 14 used for LoongArch AVEC support. AVEC interrupt status bit 14 is
supported with macro CSR_ESTAT_IS, so here replace the hard-coded value
0x1fff with macro CSR_ESTAT_IS so that the AVEC interrupt status is also
supported by KVM.
Cc: stable@vger.kernel.org Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The problem is that base address allocated from hugetlbfs is not aligned
with pmd size. Here add a checking for hugetlbfs and align base address
with pmd size. After this patch the test case "testcases/bin/hugefork02"
passes to run.
This is similar to the commit 7f24cbc9c4d42db8a3c8484d1 ("mm/mmap: teach
generic_get_unmapped_area{_topdown} to handle hugetlb mappings").
Bibo Mao [Sat, 8 Mar 2025 05:51:32 +0000 (13:51 +0800)]
LoongArch: Set max_pfn with the PFN of the last page
The current max_pfn equals to zero. In this case, it causes user cannot
get some page information through /proc filesystem such as kpagecount.
The following message is displayed by stress-ng test suite with command
"stress-ng --verbose --physpage 1 -t 1".
# stress-ng --verbose --physpage 1 -t 1
stress-ng: error: [1691] physpage: cannot read page count for address 0x134ac000 in /proc/kpagecount, errno=22 (Invalid argument)
stress-ng: error: [1691] physpage: cannot read page count for address 0x7ffff207c3a8 in /proc/kpagecount, errno=22 (Invalid argument)
stress-ng: error: [1691] physpage: cannot read page count for address 0x134b0000 in /proc/kpagecount, errno=22 (Invalid argument)
...
After applying this patch, the kernel can pass the test.
# stress-ng --verbose --physpage 1 -t 1
stress-ng: debug: [1701] physpage: [1701] started (instance 0 on CPU 3)
stress-ng: debug: [1701] physpage: [1701] exited (instance 0 on CPU 3)
stress-ng: debug: [1700] physpage: [1701] terminated (success)
Cc: stable@vger.kernel.org # 6.8+ Fixes: ff6c3d81f2e8 ("NUMA: optimize detection of memory with no node id assigned by firmware") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Huacai Chen [Sat, 8 Mar 2025 05:51:32 +0000 (13:51 +0800)]
LoongArch: Use polling play_dead() when resuming from hibernation
When CONFIG_RANDOM_KMALLOC_CACHES or other randomization infrastructrue
enabled, the idle_task's stack may different between the booting kernel
and target kernel. So when resuming from hibernation, an ACTION_BOOT_CPU
IPI wakeup the idle instruction in arch_cpu_idle_dead() and jump to the
interrupt handler. But since the stack pointer is changed, the interrupt
handler cannot restore correct context.
So rename the current arch_cpu_idle_dead() to idle_play_dead(), make it
as the default version of play_dead(), and the new arch_cpu_idle_dead()
call play_dead() directly. For hibernation, implement an arch-specific
hibernate_resume_nonboot_cpu_disable() to use the polling version (idle
instruction is replace by nop, and irq is disabled) of play_dead(), i.e.
poll_play_dead(), to avoid IPI handler corrupting the idle_task's stack
when resuming from hibernation.
This solution is a little similar to commit 406f992e4a372dafbe3c ("x86 /
hibernate: Use hlt_play_dead() when resuming from hibernation").
Tiezhu Yang [Sat, 8 Mar 2025 05:50:45 +0000 (13:50 +0800)]
LoongArch: Convert unreachable() to BUG()
When compiling on LoongArch, there exists the following objtool warning
in arch/loongarch/kernel/machine_kexec.o:
kexec_reboot() falls through to next function crash_shutdown_secondary()
Avoid using unreachable() as it can (and will in the absence of UBSAN)
generate fall-through code. Use BUG() so we get a "break BRK_BUG" trap
(with unreachable annotation).
Linus Torvalds [Sat, 8 Mar 2025 02:21:02 +0000 (16:21 -1000)]
Merge tag 's390-6.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:
- Fix return address recovery of traced function in ftrace to ensure
reliable stack unwinding
- Fix compiler warnings and runtime crashes of vDSO selftests on s390
by introducing a dedicated GNU hash bucket pointer with correct
32-bit entry size
- Fix test_monitor_call() inline asm, which misses CC clobber, by
switching to an instruction that doesn't modify CC
* tag 's390-6.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/ftrace: Fix return address recovery of traced function
selftests/vDSO: Fix GNU hash table entry size for s390x
s390/traps: Fix test_monitor_call() inline assembly
Ingo Molnar [Thu, 6 Mar 2025 22:00:16 +0000 (23:00 +0100)]
x86/mm: Define PTRS_PER_PMD for assembly code too
Andy reported the following build warning from head_32.S:
In file included from arch/x86/kernel/head_32.S:29:
arch/x86/include/asm/pgtable_32.h:59:5: error: "PTRS_PER_PMD" is not defined, evaluates to 0 [-Werror=undef]
59 | #if PTRS_PER_PMD > 1
The reason is that on 2-level i386 paging the folded in PMD's
PTRS_PER_PMD constant is not defined in assembly headers,
only in generic MM C headers.
Instead of trying to fish out the definition from the generic
headers, just define it - it even has a comment for it already...
Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/Z8oa8AUVyi2HWfo9@gmail.com
Linus Torvalds [Fri, 7 Mar 2025 22:17:42 +0000 (12:17 -1000)]
Merge tag 'acpi-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Restore the previous behavior of the ACPI platform_profile sysfs
interface that has been changed recently in a way incompatible with
the existing user space (Mario Limonciello)"
* tag 'acpi-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
platform/x86/amd: pmf: Add balanced-performance to hidden choices
platform/x86/amd: pmf: Add 'quiet' to hidden choices
ACPI: platform_profile: Add support for hidden choices
Linus Torvalds [Fri, 7 Mar 2025 21:17:30 +0000 (11:17 -1000)]
Merge tag 'for-6.14-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fix leaked extent map after error when reading chunks
- replace use of deprecated strncpy
- in zoned mode, fixed range when ulocking extent range, causing a hang
* tag 'for-6.14-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix a leaked chunk map issue in read_one_chunk()
btrfs: replace deprecated strncpy() with strscpy()
btrfs: zoned: fix extent range end unlock in cow_file_range()
Linus Torvalds [Fri, 7 Mar 2025 21:12:33 +0000 (11:12 -1000)]
Merge tag 'block-6.14-20250306' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
- NVMe pull request via Keith:
- TCP use after free fix on polling (Sagi)
- Controller memory buffer cleanup fixes (Icenowy)
- Free leaking requests on bad user passthrough commands (Keith)
- TCP error message fix (Maurizio)
- TCP corruption fix on partial PDU (Maurizio)
- TCP memory ordering fix for weakly ordered archs (Meir)
- Type coercion fix on message error for TCP (Dan)
- Name the RQF flags enum, fixing issues with anon enums and BPF import
of it
- ublk parameter setting fix
- GPT partition 7-bit conversion fix
* tag 'block-6.14-20250306' of git://git.kernel.dk/linux:
block: Name the RQF flags enum
nvme-tcp: fix signedness bug in nvme_tcp_init_connection()
block: fix conversion of GPT partition name to 7-bit
ublk: set_params: properly check if parameters can be applied
nvmet-tcp: Fix a possible sporadic response drops in weakly ordered arch
nvme-tcp: fix potential memory corruption in nvme_tcp_recv_pdu()
nvme-tcp: Fix a C2HTermReq error message
nvmet: remove old function prototype
nvme-ioctl: fix leaked requests on mapping error
nvme-pci: skip CMB blocks incompatible with PCI P2P DMA
nvme-pci: clean up CMBMSC when registering CMB fails
nvme-tcp: fix possible UAF in nvme_tcp_poll
Linus Torvalds [Fri, 7 Mar 2025 21:09:33 +0000 (11:09 -1000)]
Merge tag 'io_uring-6.14-20250306' of git://git.kernel.dk/linux
Pull io_uring fix from Jens Axboe:
"A single fix for a regression introduced in the 6.14 merge window,
causing stalls/hangs with IOPOLL reads or writes"
* tag 'io_uring-6.14-20250306' of git://git.kernel.dk/linux:
io_uring/rw: ensure reissue path is correctly handled for IOPOLL
- Fix possible memory corruption in child_cfs_rq_on_list()
* tag 'sched-urgent-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/rt: Update limit of sched_rt sysctl in documentation
sched/deadline: Use online cpus for validating runtime
sched/fair: Fix potential memory corruption in child_cfs_rq_on_list
Linus Torvalds [Fri, 7 Mar 2025 20:38:33 +0000 (10:38 -1000)]
Merge tag 'perf-urgent-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event fixes from Ingo Molnar:
"Fix a race between PMU registration and event creation, and fix
pmus_lock vs. pmus_srcu lock ordering"
* tag 'perf-urgent-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix perf_pmu_register() vs. perf_init_event()
perf/core: Fix pmus_lock vs. pmus_srcu ordering
Linus Torvalds [Fri, 7 Mar 2025 17:51:27 +0000 (07:51 -1000)]
Merge tag 'hwmon-for-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- xgene-hwmon: Fix a NULL vs IS_ERR_OR_NULL() check
- ad7314: Return error if leading zero bits are non-zero
- ntc_thermistor: Update/fix the ncpXXxh103 sensor table
- pmbus: Initialise page count in pmbus_identify()
- peci/dimmtemp: Do not provide fake threshold data
* tag 'hwmon-for-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: fix a NULL vs IS_ERR_OR_NULL() check in xgene_hwmon_probe()
hwmon: (ad7314) Validate leading zero bits and return error
hwmon: (ntc_thermistor) Fix the ncpXXxh103 sensor table
hwmon: (pmbus) Initialise page count in pmbus_identify()
hwmon: (peci/dimmtemp) Do not provide fake thresholds data
Linus Torvalds [Fri, 7 Mar 2025 17:29:13 +0000 (07:29 -1000)]
Merge tag 'platform-drivers-x86-v6.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
- amd/pmf:
- Initialize 'cb_mutex'
- Support for new version of PMF-TA
- intel-hid: Fix volume buttons on Microsoft Surface Go 4 tablet
- intel/vsec: Add Diamond Rapids support
- thinkpad_acpi: Add battery quirk for ThinkPad X131e
* tag 'platform-drivers-x86-v6.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86/amd/pmf: Update PMF Driver for Compatibility with new PMF-TA
platform/x86/amd/pmf: Propagate PMF-TA return codes
platform/x86/intel/vsec: Add Diamond Rapids support
platform/x86: thinkpad_acpi: Add battery quirk for ThinkPad X131e
platform/x86: intel-hid: fix volume buttons on Microsoft Surface Go 4 tablet
platform/x86/amd/pmf: Initialize and clean up `cb_mutex`
Linus Torvalds [Fri, 7 Mar 2025 17:24:41 +0000 (07:24 -1000)]
Merge tag 'sound-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"There is a single change in ALSA core (for sequencer code for the
module auto-loading in a wrong timing) while the all rest are various
HD- and USB-audio fixes.
Many of them are boring device-specific quirks, and should be safe to
take"
* tag 'sound-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek: Add support for ASUS Zenbook UM3406KA Laptops using CS35L41 HDA
ALSA: hda/realtek: Add support for ASUS B5405 and B5605 Laptops using CS35L41 HDA
ALSA: hda/realtek: Add support for ASUS B3405 and B3605 Laptops using CS35L41 HDA
ALSA: hda/realtek: Add support for various ASUS Laptops using CS35L41 HDA
ALSA: hda/realtek: Add support for ASUS ROG Strix G614 Laptops using CS35L41 HDA
ALSA: hda/realtek: Add support for ASUS ROG Strix GA603 Laptops using CS35L41 HDA
ALSA: hda/realtek: Add support for ASUS ROG Strix G814 Laptop using CS35L41 HDA
ALSA: hda: intel: Add Dell ALC3271 to power_save denylist
ALSA: hda/realtek: update ALC222 depop optimize
ALSA: hda: realtek: fix incorrect IS_REACHABLE() usage
ALSA: usx2y: validate nrpacks module parameter on probe
ALSA: hda/realtek - add supported Mic Mute LED for Lenovo platform
ALSA: seq: Avoid module auto-load handling at event delivery
ALSA: hda: Fix speakers on ASUS EXPERTBOOK P5405CSA 1.0
ALSA: hda/realtek: Fix Asus Z13 2025 audio
ALSA: hda/realtek: Remove (revert) duplicate Ally X config
Miguel Ojeda [Sat, 1 Mar 2025 23:16:02 +0000 (00:16 +0100)]
drm/panic: fix overindented list items in documentation
Starting with the upcoming Rust 1.86.0 (to be released 2025-04-03),
Clippy warns:
error: doc list item overindented
--> drivers/gpu/drm/drm_panic_qr.rs:914:5
|
914 | /// will be encoded as binary segment, otherwise it will be encoded
| ^^^ help: try using ` ` (2 spaces)
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#doc_overindented_list_items
The overindentation is slightly hard to notice, since all the items
start with a backquote that makes it look OK, but it is there.
Thus fix it.
Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Fixes: cb5164ac43d0 ("drm/panic: Add a QR code panic screen") Cc: stable@vger.kernel.org # Needed in 6.12.y and 6.13.y only (Rust is pinned in older LTSs). Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250301231602.917580-2-ojeda@kernel.org
Miguel Ojeda [Sat, 1 Mar 2025 23:16:01 +0000 (00:16 +0100)]
drm/panic: use `div_ceil` to clean Clippy warning
Starting with the upcoming Rust 1.86.0 (to be released 2025-04-03),
Clippy warns:
error: manually reimplementing `div_ceil`
--> drivers/gpu/drm/drm_panic_qr.rs:548:26
|
548 | let pad_offset = (offset + 7) / 8;
| ^^^^^^^^^^^^^^^^ help: consider using `.div_ceil()`: `offset.div_ceil(8)`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_div_ceil
And similarly for `stride`. Thus apply the suggestion to both.
The behavior (and thus codegen) is not exactly equivalent [1][2], since
`div_ceil()` returns the right value for the values that currently
would overflow.
Alexey Kardashevskiy [Fri, 7 Mar 2025 01:37:00 +0000 (12:37 +1100)]
virt: sev-guest: Move SNP Guest Request data pages handling under snp_cmd_mutex
Compared to the SNP Guest Request, the "Extended" version adds data pages for
receiving certificates. If not enough pages provided, the HV can report to the
VM how much is needed so the VM can reallocate and repeat.
Commit
ae596615d93d ("virt: sev-guest: Reduce the scope of SNP command mutex")
moved handling of the allocated/desired pages number out of scope of said
mutex and create a possibility for a race (multiple instances trying to
trigger Extended request in a VM) as there is just one instance of
snp_msg_desc per /dev/sev-guest and no locking other than snp_cmd_mutex.
Fix the issue by moving the data blob/size and the GHCB input struct
(snp_req_data) into snp_guest_req which is allocated on stack now and accessed
by the GHCB caller under that mutex.
Stop allocating SEV_FW_BLOB_MAX_SIZE in snp_msg_alloc() as only one of four
callers needs it. Free the received blob in get_ext_report() right after it is
copied to the userspace. Possible future users of snp_send_guest_request() are
likely to have different ideas about the buffer size anyways.
Fixes: ae596615d93d ("virt: sev-guest: Reduce the scope of SNP command mutex") Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Nikunj A Dadhania <nikunj@amd.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250307013700.437505-3-aik@amd.com
Nikunj A Dadhania [Thu, 6 Mar 2025 08:17:21 +0000 (19:17 +1100)]
virt: sev-guest: Allocate request data dynamically
Commit
ae596615d93d ("virt: sev-guest: Reduce the scope of SNP command mutex")
narrowed the command mutex scope to snp_send_guest_request(). However,
GET_REPORT, GET_DERIVED_KEY, and GET_EXT_REPORT share the req structure in
snp_guest_dev. Without the mutex protection, concurrent requests can overwrite
each other's data. Fix it by dynamically allocating the request structure.
Fixes: ae596615d93d ("virt: sev-guest: Reduce the scope of SNP command mutex") Closes: https://github.com/AMDESE/AMDSEV/issues/265 Reported-by: andreas.stuehrk@yaxi.tech Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250307013700.437505-2-aik@amd.com
Linus Torvalds [Fri, 7 Mar 2025 04:25:35 +0000 (18:25 -1000)]
fs/pipe: add simpler helpers for common cases
The fix to atomically read the pipe head and tail state when not holding
the pipe mutex has caused a number of headaches due to the size change
of the involved types.
It turns out that we don't have _that_ many places that access these
fields directly and were affected, but we have more than we strictly
should have, because our low-level helper functions have been designed
to have intimate knowledge of how the pipes work.
And as a result, that random noise of direct 'pipe->head' and
'pipe->tail' accesses makes it harder to pinpoint any actual potential
problem spots remaining.
For example, we didn't have a "is the pipe full" helper function, but
instead had a "given these pipe buffer indexes and this pipe size, is
the pipe full". That's because some low-level pipe code does actually
want that much more complicated interface.
But most other places literally just want a "is the pipe full" helper,
and not having it meant that those places ended up being unnecessarily
much too aware of this all.
It would have been much better if only the very core pipe code that
cared had been the one aware of this all.
So let's fix it - better late than never. This just introduces the
trivial wrappers for "is this pipe full or empty" and to get how many
pipe buffers are used, so that instead of writing
if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
the places that literally just want to know if a pipe is full can just
say
if (pipe_is_full(pipe))
instead. The existing trivial cases were converted with a 'sed' script.
This cuts down on the places that access pipe->head and pipe->tail
directly outside of the pipe code (and core splice code) quite a lot.
The splice code in particular still revels in doing the direct low-level
accesses, and the fuse fuse_dev_splice_write() code also seems a bit
unnecessarily eager to go very low-level, but it's at least a bit better
than it used to be.
xe:
- Remove double page flip on initial plane
- Properly setup userptr pfn_flags_mask
- Fix GT "for each engine" workarounds
- Fix userptr races and missed validations
- Userptr invalid page access fixes
- Cleanup some style nits
amdgpu:
- Fix NULL check in DC code
- SMU 14 fix
amdkfd:
- Fix NULL check in queue validation
radeon:
- RS400 HyperZ fix"
* tag 'drm-fixes-2025-03-07' of https://gitlab.freedesktop.org/drm/kernel: (22 commits)
drm/bochs: Fix DPMS regression
drm/xe/userptr: Unmap userptrs in the mmu notifier
drm/xe/hmm: Don't dereference struct page pointers without notifier lock
drm/xe/hmm: Style- and include fixes
drm/xe: Add staging tree for VM binds
drm/xe: Fix fault mode invalidation with unbind
drm/xe/vm: Fix a misplaced #endif
drm/xe/vm: Validate userptr during gpu vma prefetching
drm/amd/pm: always allow ih interrupt from fw
drm/radeon: Fix rs400_gpu_init for ATI mobility radeon Xpress 200M
drm/amdkfd: Fix NULL Pointer Dereference in KFD queue
drm/amd/display: Fix null check for pipe_ctx->plane_state in resource_build_scaling_params
drm/xe: Fix GT "for each engine" workarounds
drm/xe/userptr: properly setup pfn_flags_mask
drm/i915/mst: update max stream count to match number of pipes
drm/xe: Remove double pageflip
drm/sched: Fix preprocessor guard
drm/imagination: Fix timestamps in firmware traces
drm/imagination: only init job done fences once
drm/imagination: Hold drm_gem_gpuva lock for unmap
...
Breno Leitao [Thu, 6 Mar 2025 16:27:51 +0000 (08:27 -0800)]
block: Name the RQF flags enum
Commit 5f89154e8e9e3445f9b59 ("block: Use enum to define RQF_x bit
indexes") converted the RQF flags to an anonymous enum, which was
a beneficial change. This patch goes one step further by naming the enum
as "rqf_flags".
This naming enables exporting these flags to BPF clients, eliminating
the need to duplicate these flags in BPF code. Instead, BPF clients can
now access the same kernel-side values through CO:RE (Compile Once, Run
Everywhere), as shown in this example:
Linus Torvalds [Thu, 6 Mar 2025 23:52:15 +0000 (13:52 -1000)]
Merge tag 'bcachefs-2025-03-06' of git://evilpiepirate.org/bcachefs
Pull bcachefs fixes from Kent Overstreet:
- Fix a compatibility issue: we shouldn't be setting incompat feature
bits unless explicitly requested
- Fix another bug where the journal alloc/resize path could spuriously
fail with -BCH_ERR_open_buckets_empty
- Copygc shouldn't run on read-only devices: fragmentation isn't an
issue if we're not currently writing to a given device, and it may
not have anywhere to move the data to
* tag 'bcachefs-2025-03-06' of git://evilpiepirate.org/bcachefs:
bcachefs: copygc now skips non-rw devices
bcachefs: Fix bch2_dev_journal_alloc() spuriously failing
bcachefs: Don't set BCH_FEATURE_incompat_version_field unless requested
Kent Overstreet [Fri, 28 Feb 2025 16:34:41 +0000 (11:34 -0500)]
bcachefs: copygc now skips non-rw devices
There's no point in doing copygc on non-rw devices: the fragmentation
doesn't matter if we're not writing to them, and we may not have
anywhere to put the data on our other devices.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Previously, we fixed journal resize spuriousl failing with
-BCH_ERR_open_buckets_empty, but initial journal allocation was missed
because it didn't invoke the "block on allocator" loop at all.
Factor out the "loop on allocator" code to fix that.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Ard Biesheuvel [Thu, 6 Mar 2025 15:59:16 +0000 (16:59 +0100)]
x86/boot: Sanitize boot params before parsing command line
The 5-level paging code parses the command line to look for the 'no5lvl'
string, and does so very early, before sanitize_boot_params() has been
called and has been given the opportunity to wipe bogus data from the
fields in boot_params that are not covered by struct setup_header, and
are therefore supposed to be initialized to zero by the bootloader.
This triggers an early boot crash when using syslinux-efi to boot a
recent kernel built with CONFIG_X86_5LEVEL=y and CONFIG_EFI_STUB=n, as
the 0xff padding that now fills the unused PE/COFF header is copied into
boot_params by the bootloader, and interpreted as the top half of the
command line pointer.
Fix this by sanitizing the boot_params before use. Note that there is no
harm in calling this more than once; subsequent invocations are able to
spot that the boot_params have already been cleaned up.
- wifi: iwlwifi:
- fix A-MSDU TSO preparation
- free pages allocated when failing to build A-MSDU
- ipv6: fix dst ref loop in ila lwtunnel
- mptcp: fix 'scheduling while atomic' in
mptcp_pm_nl_append_new_local_addr
- bluetooth: add check for mgmt_alloc_skb() in
mgmt_device_connected()
- ethtool: allow NULL nlattrs when getting a phy_device
- eth: be2net: fix sleeping while atomic bugs in
be_ndo_bridge_getlink
Previous releases - always broken:
- core: support TCP GSO case for a few missing flags
- wifi: mac80211:
- fix vendor-specific inheritance
- cleanup sta TXQs on flush
- llc: do not use skb_get() before dev_queue_xmit()
- eth: ipa: nable checksum for IPA_ENDPOINT_AP_MODEM_{RX,TX}
for v4.7"
* tag 'net-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (41 commits)
net: ipv6: fix missing dst ref drop in ila lwtunnel
net: ipv6: fix dst ref loop in ila lwtunnel
mctp i3c: handle NULL header address
net: dsa: mt7530: Fix traffic flooding for MMIO devices
net-timestamp: support TCP GSO case for a few missing flags
vlan: enforce underlying device type
mptcp: fix 'scheduling while atomic' in mptcp_pm_nl_append_new_local_addr
net: ethtool: netlink: Allow NULL nlattrs when getting a phy_device
ppp: Fix KMSAN uninit-value warning with bpf
net: ipa: Enable checksum for IPA_ENDPOINT_AP_MODEM_{RX,TX} for v4.7
net: ipa: Fix QSB data for v4.7
net: ipa: Fix v4.7 resource group names
net: hns3: make sure ptp clock is unregister and freed if hclge_ptp_get_cycle returns an error
wifi: nl80211: disable multi-link reconfiguration
net: dsa: rtl8366rb: don't prompt users for LED control
be2net: fix sleeping while atomic bugs in be_ndo_bridge_getlink
llc: do not use skb_get() before dev_queue_xmit()
wifi: cfg80211: regulatory: improve invalid hints checking
caif_virtio: fix wrong pointer check in cfv_probe()
net: gso: fix ownership in __udp_gso_segment
...
Linus Torvalds [Thu, 6 Mar 2025 19:19:15 +0000 (09:19 -1000)]
Merge tag 'v6.14-rc5-smb3-fixes' of git://git.samba.org/ksmbd
Pull smb fixes from Steve French:
"Five SMB server fixes, two related client fixes, and minor MAINTAINERS
update:
- Two SMB3 lock fixes fixes (including use after free and bug on fix)
- Fix to race condition that can happen in processing IPC responses
- Four ACL related fixes: one related to endianness of num_aces, and
two related fixes to the checks for num_aces (for both client and
server), and one fixing missing check for num_subauths which can
cause memory corruption
- And minor update to email addresses in MAINTAINERS file"
* tag 'v6.14-rc5-smb3-fixes' of git://git.samba.org/ksmbd:
cifs: fix incorrect validation for num_aces field of smb_acl
ksmbd: fix incorrect validation for num_aces field of smb_acl
smb: common: change the data type of num_aces to le16
ksmbd: fix bug on trap in smb2_lock
ksmbd: fix use-after-free in smb2_lock
ksmbd: fix type confusion via race condition when using ipc_msg_send_request
ksmbd: fix out-of-bounds in parse_sec_desc()
MAINTAINERS: update email address in cifs and ksmbd entry
Linus Torvalds [Thu, 6 Mar 2025 18:18:48 +0000 (08:18 -1000)]
Merge tag 'exfat-for-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat fixes from Namjae Jeon:
- Optimize new cluster allocation by correctly find empty entry slot
- Add a check to prevent excessive bitmap clearing due to invalid
data size of file/dir entry
- Fix incorrect error return for zero-byte writes
* tag 'exfat-for-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: add a check for invalid data size
exfat: short-circuit zero-byte writes in exfat_file_write_iter
exfat: fix soft lockup in exfat_clear_bitmap
exfat: fix just enough dentries but allocate a new cluster to dir
Linus Torvalds [Thu, 6 Mar 2025 18:04:49 +0000 (08:04 -1000)]
Merge tag 'vfs-6.14-rc6.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- Fix spelling mistakes in idmappings.rst
- Fix RCU warnings in override_creds()/revert_creds()
- Create new pid namespaces with default limit now that pid_max is
namespaced
* tag 'vfs-6.14-rc6.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
pid: Do not set pid_max in new pid namespaces
doc: correcting two prefix errors in idmappings.rst
cred: Fix RCU warnings in override/revert_creds
Linus Torvalds [Thu, 6 Mar 2025 17:53:25 +0000 (07:53 -1000)]
fs/pipe: fix pipe buffer index use in FUSE
This was another case that Rasmus pointed out where the direct access to
the pipe head and tail pointers broke on 32-bit configurations due to
the type changes.
As with the pipe FIONREAD case, fix it by using the appropriate helper
functions that deal with the right pipe index sizing.
Linus Torvalds [Thu, 6 Mar 2025 17:33:58 +0000 (07:33 -1000)]
fs/pipe: do not open-code pipe head/tail logic in FIONREAD
Rasmus points out that we do indeed have other cases of breakage from
the type changes that were introduced on 32-bit targets in order to read
the pipe head and tail values atomically (commit 3d252160b818: "fs/pipe:
Read pipe->{head,tail} atomically outside pipe->mutex").
Fix it up by using the proper helper functions that now deal with the
pipe buffer index types properly. This makes the code simpler and more
obvious.
The compiler does the CSE and loop hoisting of the pipe ring size
masking that we used to do manually, so open-coding this was never a
good idea.
Linus Torvalds [Thu, 6 Mar 2025 17:30:42 +0000 (07:30 -1000)]
fs/pipe: express 'pipe_empty()' in terms of 'pipe_occupancy()'
That's what 'pipe_full()' does, so it's more consistent. But more
importantly it gets the type limits right when the pipe head and tail
are no longer necessarily 'unsigned int'.
Andrei Kuchynski [Wed, 5 Mar 2025 11:17:39 +0000 (11:17 +0000)]
usb: typec: ucsi: Fix NULL pointer access
Resources should be released only after all threads that utilize them
have been destroyed.
This commit ensures that resources are not released prematurely by waiting
for the associated workqueue to complete before deallocating them.
Cc: stable <stable@kernel.org> Fixes: b9aa02ca39a4 ("usb: typec: ucsi: Add polling mechanism for partner tasks like alt mode checking") Signed-off-by: Andrei Kuchynski <akuchynski@chromium.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20250305111739.1489003-2-akuchynski@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Miao Li [Tue, 4 Mar 2025 07:07:57 +0000 (15:07 +0800)]
usb: quirks: Add DELAY_INIT and NO_LPM for Prolific Mass Storage Card Reader
When used on Huawei hisi platforms, Prolific Mass Storage Card Reader
which the VID:PID is in 067b:2731 might fail to enumerate at boot time
and doesn't work well with LPM enabled, combination quirks:
USB_QUIRK_DELAY_INIT + USB_QUIRK_NO_LPM
fixed the problems.
Fabrizio Castro [Wed, 5 Mar 2025 16:37:50 +0000 (16:37 +0000)]
gpio: rcar: Fix missing of_node_put() call
of_parse_phandle_with_fixed_args() requires its caller to
call into of_node_put() on the node pointer from the output
structure, but such a call is currently missing.
Haoxiang Li [Mon, 3 Mar 2025 02:42:33 +0000 (10:42 +0800)]
btrfs: fix a leaked chunk map issue in read_one_chunk()
Add btrfs_free_chunk_map() to free the memory allocated
by btrfs_alloc_chunk_map() if btrfs_add_chunk_map() fails.
Fixes: 7dc66abb5a47 ("btrfs: use a dedicated data structure for chunk maps") CC: stable@vger.kernel.org Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com> Signed-off-by: David Sterba <dsterba@suse.com>
Jens Axboe [Thu, 6 Mar 2025 11:32:46 +0000 (04:32 -0700)]
Merge tag 'nvme-6.14-2025-03-05' of git://git.infradead.org/nvme into block-6.14
Pull NVMe fixe from Keith:
"nvme fixes for Linux 6.14
- TCP use after free fix on polling (Sagi)
- Controller memory buffer cleanup fixes (Icenowy)
- Free leaking requests on bad user passthrough commands (Keith)
- TCP error message fix (Maurizio)
- TCP corruption fix on partial PDU (Maurizio)
- TCP memory ordering fix for weakly ordered archs (Meir)
- Type coercion fix on message error for TCP (Dan)"
* tag 'nvme-6.14-2025-03-05' of git://git.infradead.org/nvme:
nvme-tcp: fix signedness bug in nvme_tcp_init_connection()
nvmet-tcp: Fix a possible sporadic response drops in weakly ordered arch
nvme-tcp: fix potential memory corruption in nvme_tcp_recv_pdu()
nvme-tcp: Fix a C2HTermReq error message
nvmet: remove old function prototype
nvme-ioctl: fix leaked requests on mapping error
nvme-pci: skip CMB blocks incompatible with PCI P2P DMA
nvme-pci: clean up CMBMSC when registering CMB fails
nvme-tcp: fix possible UAF in nvme_tcp_poll
Inochi Amaoto [Thu, 6 Mar 2025 06:39:51 +0000 (14:39 +0800)]
kbuild: install-extmod-build: Fix build when specifying KBUILD_OUTPUT
Since commit 5f73e7d0386d ("kbuild: refactor cross-compiling
linux-headers package"), the linux-headers pacman package fails
to build when "O=" is set. The build system complains:
/mnt/chroot/linux/scripts/Makefile.build:41: mnt/chroots/linux-mainline/pacman/linux-upstream/pkg/linux-upstream-headers/usr//lib/modules/6.14.0-rc3-00350-g771dba31fffc/build/scripts/Makefile: No such file or directory
This is because the "srcroot" variable is set to "." and the
"build" variable is set to the absolute path. This makes the
"src" variables point to wrong directory.
Change the "build" variable to a relative path to "." to
fix build.
Ivan Abramov [Thu, 6 Mar 2025 11:20:45 +0000 (14:20 +0300)]
drm/gma500: Add NULL check for pci_gfx_root in mid_get_vbt_data()
Since pci_get_domain_bus_and_slot() can return NULL, add NULL check for
pci_gfx_root in the mid_get_vbt_data().
This change is similar to the checks implemented in mid_get_fuse_settings()
and mid_get_pci_revID(), which were introduced by commit 0cecdd818cd7
("gma500: Final enables for Oaktrail") as "additional minor
bulletproofing".
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Justin Iurman [Tue, 4 Mar 2025 18:10:39 +0000 (19:10 +0100)]
net: ipv6: fix dst ref loop in ila lwtunnel
This patch follows commit 92191dd10730 ("net: ipv6: fix dst ref loops in
rpl, seg6 and ioam6 lwtunnels") and, on a second thought, the same patch
is also needed for ila (even though the config that triggered the issue
was pathological, but still, we don't want that to happen).
Shrikanth Hegde [Thu, 6 Mar 2025 05:29:53 +0000 (10:59 +0530)]
sched/deadline: Use online cpus for validating runtime
The ftrace selftest reported a failure because writing -1 to
sched_rt_runtime_us returns -EBUSY. This happens when the possible
CPUs are different from active CPUs.
Active CPUs are part of one root domain, while remaining CPUs are part
of def_root_domain. Since active cpumask is being used, this results in
cpus=0 when a non active CPUs is used in the loop.
Fix it by looping over the online CPUs instead for validating the
bandwidth calculations.
Michal Koutný [Wed, 5 Mar 2025 14:58:49 +0000 (15:58 +0100)]
pid: Do not set pid_max in new pid namespaces
It is already difficult for users to troubleshoot which of multiple pid
limits restricts their workload. The per-(hierarchical-)NS pid_max would
contribute to the confusion.
Also, the implementation copies the limit upon creation from
parent, this pattern showed cumbersome with some attributes in legacy
cgroup controllers -- it's subject to race condition between parent's
limit modification and children creation and once copied it must be
changed in the descendant.
Let's do what other places do (ucounts or cgroup limits) -- create new
pid namespaces without any limit at all. The global limit (actually any
ancestor's limit) is still effectively in place, we avoid the
set/unshare race and bumps of global (ancestral) limit have the desired
effect on pid namespace that do not care.
Takashi Iwai [Tue, 4 Mar 2025 13:41:57 +0000 (14:41 +0100)]
drm/bochs: Fix DPMS regression
The recent rewrite with the use of regular atomic helpers broke the
DPMS unblanking on X11. Fix it by moving the call of
bochs_hw_blank(false) from CRTC mode_set_nofb() to atomic_enable().
Local variable compact_result created at:
__alloc_pages_slowpath+0x66/0x16c0 mm/page_alloc.c:4218
__alloc_frozen_pages_noprof+0xa4c/0xe00 mm/page_alloc.c:4752
Haoxiang Li [Thu, 27 Feb 2025 04:11:31 +0000 (12:11 +0800)]
rapidio: add check for rio_add_net() in rio_scan_alloc_net()
The return value of rio_add_net() should be checked. If it fails,
put_device() should be called to free the memory and give up the reference
initialized in rio_add_net().
Link: https://lkml.kernel.org/r/20250227041131.3680761-1-haoxiang_li2024@163.com Fixes: e6b585ca6e81 ("rapidio: move net allocation into core code") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Haoxiang Li [Thu, 27 Feb 2025 07:34:09 +0000 (15:34 +0800)]
rapidio: fix an API misues when rio_add_net() fails
rio_add_net() calls device_register() and fails when device_register()
fails. Thus, put_device() should be used rather than kfree(). Add
"mport->net = NULL;" to avoid a use after free issue.
Link: https://lkml.kernel.org/r/20250227073409.3696854-1-haoxiang_li2024@163.com Fixes: e8de370188d0 ("rapidio: add mport char device driver") Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Yang Yingliang <yangyingliang@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Gabriel Krisman Bertazi [Wed, 26 Feb 2025 03:22:58 +0000 (22:22 -0500)]
Revert "mm/page_alloc.c: don't show protection in zone's ->lowmem_reserve[] for empty zone"
Commit 96a5c186efff ("mm/page_alloc.c: don't show protection in zone's
->lowmem_reserve[] for empty zone") removes the protection of lower zones
from allocations targeting memory-less high zones. This had an unintended
impact on the pattern of reclaims because it makes the high-zone-targeted
allocation more likely to succeed in lower zones, which adds pressure to
said zones. I.e, the following corresponding checks in
zone_watermark_ok/zone_watermark_fast are less likely to trigger:
if (free_pages <= min + z->lowmem_reserve[highest_zoneidx])
return false;
As a result, we are observing an increase in reclaim and kswapd scans, due
to the increased pressure. This was initially observed as increased
latency in filesystem operations when benchmarking with fio on a machine
with some memory-less zones, but it has since been associated with
increased contention in locks related to memory reclaim. By reverting
this patch, the original performance was recovered on that machine.
The original commit was introduced as a clarification of the
/proc/zoneinfo output, so it doesn't seem there are usecases depending on
it, making the revert a simple solution.
For reference, I collected vmstat with and without this patch on a freshly
booted system running intensive randread io from an nvme for 5 minutes. I
got:
33M scans is similar to what we had in kernels predating this patch.
These numbers is fairly representative of the workload on this machine, as
measured in several runs. So we are talking about a 2-order of magnitude
increase.
Link: https://lkml.kernel.org/r/20250226032258.234099-1-krisman@suse.de Fixes: 96a5c186efff ("mm/page_alloc.c: don't show protection in zone's ->lowmem_reserve[] for empty zone") Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Baoquan He <bhe@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Brian Geffon [Wed, 26 Feb 2025 16:23:41 +0000 (11:23 -0500)]
mm: fix finish_fault() handling for large folios
When handling faults for anon shmem finish_fault() will attempt to install
ptes for the entire folio. Unfortunately if it encounters a single
non-pte_none entry in that range it will bail, even if the pte that
triggered the fault is still pte_none. When this situation happens the
fault will be retried endlessly never making forward progress.
This patch fixes this behavior and if it detects that a pte in the range
is not pte_none it will fall back to setting a single pte.
[bgeffon@google.com: tweak whitespace] Link: https://lkml.kernel.org/r/20250227133236.1296853-1-bgeffon@google.com Link: https://lkml.kernel.org/r/20250226162341.915535-1-bgeffon@google.com Fixes: 43e027e41423 ("mm: memory: extend finish_fault() to support large folio") Signed-off-by: Brian Geffon <bgeffon@google.com> Suggested-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reported-by: Marek Maslanka <mmaslanka@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickens <hughd@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Zi Yan <ziy@nvidia.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ryan Roberts [Wed, 26 Feb 2025 12:16:09 +0000 (12:16 +0000)]
mm: don't skip arch_sync_kernel_mappings() in error paths
Fix callers that previously skipped calling arch_sync_kernel_mappings() if
an error occurred during a pgtable update. The call is still required to
sync any pgtable updates that may have occurred prior to hitting the error
condition.
These are theoretical bugs discovered during code review.
Link: https://lkml.kernel.org/r/20250226121610.2401743-1-ryan.roberts@arm.com Fixes: 2ba3e6947aed ("mm/vmalloc: track which page-table levels were modified") Fixes: 0c95cba49255 ("mm: apply_to_pte_range warn and fail if a large pte is encountered") Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Christop Hellwig <hch@infradead.org> Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: shmem: remove unnecessary warning in shmem_writepage()
Although the scenario where shmem_writepage() is called with info->flags &
VM_LOCKED is unlikely to happen, it's still possible, as evidenced by
syzbot [1]. However, the warning in this case isn't necessary because the
situation is already handled correctly [2].
Current implementation of move_pages_pte() copies source and destination
PTEs in order to detect concurrent changes to PTEs involved in the move.
However these copies are also used to unmap the PTEs, which will fail if
CONFIG_HIGHPTE is enabled because the copies are allocated on the stack.
Fix this by using the actual PTEs which were kmap()ed.
Link: https://lkml.kernel.org/r/20250226185510.2732648-3-surenb@google.com Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reported-by: Peter Xu <peterx@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Barry Song <21cnbao@gmail.com> Cc: Barry Song <v-songbaohua@oppo.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Suren Baghdasaryan [Wed, 26 Feb 2025 18:55:08 +0000 (10:55 -0800)]
userfaultfd: do not block on locking a large folio with raised refcount
Lokesh recently raised an issue about UFFDIO_MOVE getting into a deadlock
state when it goes into split_folio() with raised folio refcount.
split_folio() expects the reference count to be exactly mapcount +
num_pages_in_folio + 1 (see can_split_folio()) and fails with EAGAIN
otherwise.
If multiple processes are trying to move the same large folio, they raise
the refcount (all tasks succeed in that) then one of them succeeds in
locking the folio, while others will block in folio_lock() while keeping
the refcount raised. The winner of this race will proceed with calling
split_folio() and will fail returning EAGAIN to the caller and unlocking
the folio. The next competing process will get the folio locked and will
go through the same flow. In the meantime the original winner will be
retried and will block in folio_lock(), getting into the queue of waiting
processes only to repeat the same path. All this results in a livelock.
An easy fix would be to avoid waiting for the folio lock while holding
folio refcount, similar to madvise_free_huge_pmd() where folio lock is
acquired before raising the folio refcount. Since we lock and take a
refcount of the folio while holding the PTE lock, changing the order of
these operations should not break anything.
Modify move_pages_pte() to try locking the folio first and if that fails
and the folio is large then return EAGAIN without touching the folio
refcount. If the folio is single-page then split_folio() is not called,
so we don't have this issue. Lokesh has a reproducer [1] and I verified
that this change fixes the issue.
Sun YangKai [Wed, 26 Feb 2025 15:32:43 +0000 (23:32 +0800)]
mm: zswap: use ATOMIC_LONG_INIT to initialize zswap_stored_pages
This is currently the only atomic_long_t variable initialized by
ATOMIC_INIT macro found in the kernel by using `grep -r atomic_long_t |
grep ATOMIC_INIT`
This was introduced in 6e1fa555ec77, in which we modified the type of
zswap_stored_pages to atomic_long_t, but didn't change the initialization.
Link: https://lkml.kernel.org/r/20250226153253.19179-1-sunk67188@gmail.com Fixes: 6e1fa555ec77 ("mm: zswap: modify zswap_stored_pages to be atomic_long_t") Signed-off-by: Sun YangKai <sunk67188@gmail.com> Acked-by: Yosry Ahmed <yosry.ahmed@linux.dev> Acked-by: David Hildenbrand <david@redhat.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kanchana P Sridhar <kanchana.p.sridhar@intel.com> Cc: Nhat Pham <nphamcs@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Baolin Wang [Tue, 25 Feb 2025 09:52:55 +0000 (17:52 +0800)]
mm: shmem: fix potential data corruption during shmem swapin
Alex and Kairui reported some issues (system hang or data corruption) when
swapping out or swapping in large shmem folios. This is especially easy
to reproduce when the tmpfs is mount with the 'huge=within_size'
parameter. Thanks to Kairui's reproducer, the issue can be easily
replicated.
The root cause of the problem is that swap readahead may asynchronously
swap in order 0 folios into the swap cache, while the shmem mapping can
still store large swap entries. Then an order 0 folio is inserted into
the shmem mapping without splitting the large swap entry, which overwrites
the original large swap entry, leading to data corruption.
When getting a folio from the swap cache, we should split the large swap
entry stored in the shmem mapping if the orders do not match, to fix this
issue.
Link: https://lkml.kernel.org/r/2fe47c557e74e9df5fe2437ccdc6c9115fa1bf70.1740476943.git.baolin.wang@linux.alibaba.com Fixes: 809bc86517cc ("mm: shmem: support large folio swap out") Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reported-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca> Reported-by: Kairui Song <ryncsn@gmail.com> Closes: https://lore.kernel.org/all/1738717785.im3r5g2vxc.none@localhost/ Tested-by: Kairui Song <kasong@tencent.com> Cc: David Hildenbrand <david@redhat.com> Cc: Lance Yang <ioworker0@gmail.com> Cc: Matthew Wilcow <willy@infradead.org> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>