Merge tag 'xfs-fixes-6.16-rc5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Carlos Maiolino:
- Fix umount hang with unflushable inodes (and add new tracepoint used
for debugging this)
- Fix ABBA deadlock in xfs_reclaim_inode() vs xfs_ifree_cluster()
- Fix dquot buffer pin deadlock
* tag 'xfs-fixes-6.16-rc5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: add FALLOC_FL_ALLOCATE_RANGE to supported flags mask
xfs: fix unmount hang with unflushable inodes stuck in the AIL
xfs: factor out stale buffer item completion
xfs: rearrange code in xfs_buf_item.c
xfs: add tracepoints for stale pinned inode state debug
xfs: avoid dquot buffer pin deadlock
xfs: catch stale AGF/AGF metadata
xfs: xfs_ifree_cluster vs xfs_iflush_shutdown_abort deadlock
xfs: actually use the xfs_growfs_check_rtgeom tracepoint
xfs: Improve error handling in xfs_mru_cache_create()
xfs: move xfs_submit_zoned_bio a bit
xfs: use xfs_readonly_buftarg in xfs_remount_rw
xfs: remove NULL pointer checks in xfs_mru_cache_insert
xfs: check for shutdown before going to sleep in xfs_select_zone
Merge tag 'mmc-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Apply BROKEN_SD_DISCARD quirk earlier during init
- Silence some confusing error messages for SD UHS-II cards
MMC host:
- mtk-sd:
- Prevent memory corruption from DMA map failure
- Fix a pagefault in dma_unmap_sg() for not prepared data
- sdhci: Revert "Disable SD card clock before changing parameters"
- sdhci-of-k1: Fix error code in probe()
- sdhci-uhs2: Silence some confusing error messages for SD UHS-II cards"
* tag 'mmc-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mtk-sd: reset host->mrq on prepare_data() error
Revert "mmc: sdhci: Disable SD card clock before changing parameters"
mmc: sdhci-uhs2: Adjust some error messages and register dump for SD UHS-II card
mmc: sdhci: Add a helper function for dump register in dynamic debug mode
mmc: core: Adjust some error messages for SD UHS-II cards
mtk-sd: Prevent memory corruption from DMA map failure
mtk-sd: Fix a pagefault in dma_unmap_sg() for not prepared data
mmc: sdhci-of-k1: Fix error code in probe()
mmc: core: sd: Apply BROKEN_SD_DISCARD quirk earlier
Merge tag 's390-6.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Alexander Gordeev:
- Fix PCI error recovery and bring it in line with AER/EEH
* tag 's390-6.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/pci: Allow automatic recovery with minimal driver support
s390/pci: Do not try re-enabling load/store if device is disabled
s390/pci: Fix stale function handles in error handling
Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd fixes from Jason Gunthorpe:
"Some changes to the userspace selftest framework cause the iommufd
tests to start failing. This turned out to be bugs in the iommufd side
that were just getting uncovered.
- Deal with MAP_HUGETLB mmaping more than requested even when in
MAP_FIXED mode
- Fixup missing error flow cleanup in the test
- Check that the memory allocations suceeded
- Suppress some bogus gcc 'may be used uninitialized' warnings"
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
iommufd/selftest: Fix build warnings due to uninitialized mfd
iommufd/selftest: Add asserts testing global mfd
iommufd/selftest: Add missing close(mfd) in memfd_mmap()
iommufd/selftest: Fix iommufd_dirty_tracking with large hugepage sizes
Merge tag 'nfs-for-6.16-2' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client fixes from Anna Schumaker:
- Fix loop in GSS sequence number cache
- Clean up /proc/net/rpc/nfs if nfs_fs_proc_net_init() fails
- Fix a race to wake on NFS_LAYOUT_DRAIN
- Fix handling of NFS level errors in I/O
* tag 'nfs-for-6.16-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
NFSv4/flexfiles: Fix handling of NFS level errors in I/O
NFSv4/pNFS: Fix a race to wake on NFS_LAYOUT_DRAIN
nfs: Clean up /proc/net/rpc/nfs when nfs_fs_proc_net_init() fails.
sunrpc: fix loop in gss seqno cache
Linus Torvalds [Mon, 30 Jun 2025 23:32:43 +0000 (16:32 -0700)]
Merge tag 'io_uring-6.16-20250630' of git://git.kernel.dk/linux
Pull io_uring fix from Jens Axboe:
"Now that anonymous inodes set S_IFREG, this breaks the io_uring
read/write retries for short reads/writes. As things like timerfd and
eventfd are anon inodes, applications that previously did:
Linus Torvalds [Mon, 30 Jun 2025 15:27:38 +0000 (08:27 -0700)]
Merge tag 'rtc-6.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC fixes from Alexandre Belloni:
"Some fixes for 6.16. The cmos one is important for PREEMPT_RT. I've
also added the s5m changes as they had a dependency on the MFD pull
request that was included in 6.16-rc1 and we didn't synchronize before
the merge window and they won't hurt.
- cmos: use spin_lock_irqsave in cmos_interrupt
- pcf2127: fix SPI command byte for PCF2131
- s5m: add S2MPG10 support"
* tag 'rtc-6.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
rtc: pcf2127: add missing semicolon after statement
rtc: pcf2127: fix SPI command byte for PCF2131
rtc: cmos: use spin_lock_irqsave in cmos_interrupt
rtc: s5m: replace open-coded read/modify/write registers with regmap helpers
rtc: s5m: replace regmap_update_bits with regmap_clear/set_bits
rtc: s5m: switch to devm_device_init_wakeup
rtc: s5m: fix a typo: peding -> pending
rtc: s5m: add support for S2MPG10 RTC
rtc: s5m: prepare for external regmap
rtc: s5m: cache device type during probe
Youling Tang [Mon, 30 Jun 2025 01:11:48 +0000 (09:11 +0800)]
xfs: add FALLOC_FL_ALLOCATE_RANGE to supported flags mask
Add FALLOC_FL_ALLOCATE_RANGE to the set of supported fallocate flags in
XFS_FALLOC_FL_SUPPORTED. This change improves code clarity and maintains
by explicitly showing this flag in the supported flags mask.
Note that since FALLOC_FL_ALLOCATE_RANGE is defined as 0x00, this addition
has no functional modifications.
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Jens Axboe [Sun, 29 Jun 2025 22:48:28 +0000 (16:48 -0600)]
io_uring: gate REQ_F_ISREG on !S_ANON_INODE as well
io_uring marks a request as dealing with a regular file on S_ISREG. This
drives things like retries on short reads or writes, which is generally
not expected on a regular file (or bdev). Applications tend to not
expect that, so io_uring tries hard to ensure it doesn't deliver short
IO on regular files.
However, a recent commit added S_IFREG to anonymous inodes. When
io_uring is used to read from various things that are backed by anon
inodes, like eventfd, timerfd, etc, then it'll now all of a sudden wait
for more data when rather than deliver what was read or written in a
single operation. This breaks applications that issue reads on anon
inodes, if they ask for more data than a single read delivers.
Add a check for !S_ANON_INODE as well before setting REQ_F_ISREG to
prevent that.
Cc: Christian Brauner <brauner@kernel.org> Cc: stable@vger.kernel.org Link: https://github.com/ghostty-org/ghostty/discussions/7720 Fixes: cfd86ef7e8e7 ("anon_inode: use a proper mode internally") Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Sun, 29 Jun 2025 16:25:55 +0000 (09:25 -0700)]
Merge tag 'staging-6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fix from Greg KH:
"Here is a single staging driver fix for 6.16-rc4. It resolves a build
error in the rtl8723bs driver for some versions of clang on arm64 when
checking the frame size with -Wframe-larger-than.
It has been in linux-next for a while now with no reported issues"
* tag 'staging-6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: rtl8723bs: Avoid memset() in aes_cipher() and aes_decipher()
Linus Torvalds [Sun, 29 Jun 2025 16:21:27 +0000 (09:21 -0700)]
Merge tag 'tty-6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are five small serial and tty and vt fixes for 6.16-rc4. Included
in here are:
- kerneldoc fixes for recent vt changes
- imx serial driver fix
- of_node sysfs fix for a regression
- vt missing notification fix
- 8250 dt bindings fix
All of these have been in linux-next for a while with no reported issues"
* tag 'tty-6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
dt-bindings: serial: 8250: Make clocks and clock-frequency exclusive
serial: imx: Restore original RXTL for console to fix data loss
serial: core: restore of_node information in sysfs
vt: fix kernel-doc warnings in ucs_get_fallback()
vt: add missing notification when switching back to text mode
Linus Torvalds [Sun, 29 Jun 2025 15:43:54 +0000 (08:43 -0700)]
Merge tag 'edac_urgent_for_v6.16_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC fix from Borislav Petkov:
- Consider secondary address mask registers in amd64_edac in order to
get the correct total memory size of the system
* tag 'edac_urgent_for_v6.16_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/amd64: Fix size calculation for Non-Power-of-Two DIMMs
Linus Torvalds [Sun, 29 Jun 2025 15:28:24 +0000 (08:28 -0700)]
Merge tag 'x86_urgent_for_v6.16_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Make sure DR6 and DR7 are initialized to their architectural values
and not accidentally cleared, leading to misconfigurations
* tag 'x86_urgent_for_v6.16_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/traps: Initialize DR7 by writing its architectural reset value
x86/traps: Initialize DR6 by writing its architectural reset value
Linus Torvalds [Sun, 29 Jun 2025 15:16:02 +0000 (08:16 -0700)]
Merge tag 'perf_urgent_for_v6.16_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Borislav Petkov:
- Make sure an AUX perf event is really disabled when it overruns
* tag 'perf_urgent_for_v6.16_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/aux: Fix pending disable flow when the AUX ring buffer overruns
Linus Torvalds [Sat, 28 Jun 2025 22:23:17 +0000 (15:23 -0700)]
Merge tag 'i2c-for-6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
- imx: fix SMBus protocol compliance during block read
- omap: fix error handling path in probe
- robotfuzz, tiny-usb: prevent zero-length reads
- x86, designware, amdisp: fix build error when modules are disabled
(agreed to go in via i2c)
- scx200_acb: fix build error because of missing HAS_IOPORT
* tag 'i2c-for-6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: scx200_acb: depends on HAS_IOPORT
i2c: omap: Fix an error handling path in omap_i2c_probe()
platform/x86: Use i2c adapter name to fix build errors
i2c: amd-isp: Initialize unique adapter name
i2c: designware: Initialize adapter name only when not set
i2c: tiny-usb: disable zero-length read messages
i2c: robotfuzz-osif: disable zero-length read messages
i2c: imx: fix emulated smbus block read
Linus Torvalds [Sat, 28 Jun 2025 18:39:24 +0000 (11:39 -0700)]
Merge tag 'trace-v6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fix from Steven Rostedt:
- Fix possible UAF on error path in filter_free_subsystem_filters()
When freeing a subsystem filter, the filter for the subsystem is
passed in to be freed and all the events within the subsystem will
have their filter freed too. In order to free without waiting for RCU
synchronization, list items are allocated to hold what is going to be
freed to free it via a call_rcu(). If the allocation of these items
fails, it will call the synchronization directly and free after that
(causing a bit of delay for the user).
The subsystem filter is first added to this list and then the filters
for all the events under the subsystem. The bug is if one of the
allocations of the list items for the event filters fail to allocate,
it jumps to the "free_now" label which will free the subsystem
filter, then all the items on the allocated list, and then the event
filters that were not added to the list yet. But because the
subsystem filter was added first, it gets freed twice.
The solution is to add the subsystem filter after the events, and
then if any of the allocations fail it will not try to free any of
them twice
* tag 'trace-v6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix filter logic error
Linus Torvalds [Sat, 28 Jun 2025 18:35:11 +0000 (11:35 -0700)]
Merge tag 'loongarch-fixes-6.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
- replace __ASSEMBLY__ with __ASSEMBLER__ in headers like others
- fix build warnings about export.h
- reserve the EFI memory map region for kdump
- handle __init vs inline mismatches
- fix some KVM bugs
* tag 'loongarch-fixes-6.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: KVM: Disable updating of "num_cpu" and "feature"
LoongArch: KVM: Check validity of "num_cpu" from user space
LoongArch: KVM: Check interrupt route from physical CPU
LoongArch: KVM: Fix interrupt route update with EIOINTC
LoongArch: KVM: Add address alignment check for IOCSR emulation
LoongArch: KVM: Avoid overflow with array index
LoongArch: Handle KCOV __init vs inline mismatches
LoongArch: Reserve the EFI memory map region
LoongArch: Fix build warnings about export.h
LoongArch: Replace __ASSEMBLY__ with __ASSEMBLER__ in headers
Niklas Schnelle [Wed, 25 Jun 2025 09:28:30 +0000 (11:28 +0200)]
s390/pci: Allow automatic recovery with minimal driver support
According to Documentation/PCI/pci-error-recovery.rst only the
error_detected() callback in the err_handler struct is mandatory for
a driver to support error recovery. So far s390's error recovery chose
a stricter approach also requiring slot_reset() and resume().
Relax this requirement and only require error_detected(). If a callback
is not implemented EEH and AER treat this as PCI_ERS_RESULT_NONE. This
return value is otherwise used by drivers abstaining from their vote
on how to proceed with recovery and currently also not supported by
s390's recovery code.
So to support missing callbacks in-line with other implementors of the
recovery flow, also handle PCI_ERS_RESULT_NONE. Since s390 only does per
PCI function recovery and does not do voting, treat PCI_ERS_RESULT_NONE
optimistically and proceed through recovery unless other failures
prevent this.
Reviewed-by: Farhan Ali <alifm@linux.ibm.com> Reviewed-by: Julian Ruess <julianr@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Niklas Schnelle [Wed, 25 Jun 2025 09:28:29 +0000 (11:28 +0200)]
s390/pci: Do not try re-enabling load/store if device is disabled
If a device is disabled unblocking load/store on its own is not useful
as a full re-enable of the function is necessary anyway. Note that SCLP
Write Event Data Action Qualifier 0 (Reset) leaves the device disabled
and triggers this case unless the driver already requests a reset.
Niklas Schnelle [Wed, 25 Jun 2025 09:28:28 +0000 (11:28 +0200)]
s390/pci: Fix stale function handles in error handling
The error event information for PCI error events contains a function
handle for the respective function. This handle is generally captured at
the time the error event was recorded. Due to delays in processing or
cascading issues, it may happen that during firmware recovery multiple
events are generated. When processing these events in order Linux may
already have recovered an affected function making the event information
stale. Fix this by doing an unconditional CLP List PCI function
retrieving the current function handle with the zdev->state_lock held
and ignoring the event if its function handle is stale.
- Fix for regression in handling native Windows symlinks
- Three smbdirect fixes:
- oops in RDMA response processing
- smbdirect memcpy issue
- fix smbdirect regression with large writes (smbdirect test cases
now all passing)
- Fix for "FAILED_TO_PARSE" warning in trace-cmd report output
* tag 'v6.16-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Fix reading into an ITER_FOLIOQ from the smbdirect code
cifs: Fix the smbd_response slab to allow usercopy
smb: client: fix potential deadlock when reconnecting channels
smb: client: remove \t from TP_printk statements
smb: client: let smbd_post_send_iter() respect the peers max_send_size and transmit all data
smb: client: fix regression with native SMB symlinks
Linus Torvalds [Sat, 28 Jun 2025 03:34:10 +0000 (20:34 -0700)]
Merge tag 'mm-hotfixes-stable-2025-06-27-16-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"16 hotfixes.
6 are cc:stable and the remainder address post-6.15 issues or aren't
considered necessary for -stable kernels. 5 are for MM"
* tag 'mm-hotfixes-stable-2025-06-27-16-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
MAINTAINERS: add Lorenzo as THP co-maintainer
mailmap: update Duje Mihanović's email address
selftests/mm: fix validate_addr() helper
crashdump: add CONFIG_KEYS dependency
mailmap: correct name for a historical account of Zijun Hu
mailmap: add entries for Zijun Hu
fuse: fix runtime warning on truncate_folio_batch_exceptionals()
scripts/gdb: fix dentry_name() lookup
mm/damon/sysfs-schemes: free old damon_sysfs_scheme_filter->memcg_path on write
mm/alloc_tag: fix the kmemleak false positive issue in the allocation of the percpu variable tag->counters
lib/group_cpus: fix NULL pointer dereference from group_cpus_evenly()
mm/hugetlb: remove unnecessary holding of hugetlb_lock
MAINTAINERS: add missing files to mm page alloc section
MAINTAINERS: add tree entry to mm init block
mm: add OOM killer maintainer structure
fs/proc/task_mmu: fix PAGE_IS_PFNZERO detection for the huge zero folio
Linus Torvalds [Sat, 28 Jun 2025 03:22:18 +0000 (20:22 -0700)]
Merge tag 'riscv-for-linus-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V Fixes for 5.16-rc4
- .rodata is no longer linkd into PT_DYNAMIC.
It was not supposed to be there in the first place and resulted in
invalid (but unused) entries. This manifests as at least warnings in
llvm-readelf
- A fix for runtime constants with all-0 upper 32-bits. This should
only manifest on MMU=n kernels
- A fix for context save/restore on systems using the T-Head vector
extensions
- A fix for a conflicting "+r"/"r" register constraint in the VDSO
getrandom syscall wrapper, which is undefined behavior in clang
- A fix for a missing register clobber in the RVV raid6 implementation.
This manifests as a NULL pointer reference on some compilers, but
could trigger in other ways
- Misaligned accesses from userspace at faulting addresses are now
handled correctly
- A fix for an incorrect optimization that allowed access_ok() to mark
invalid addresses as accessible, which can result in userspace
triggering BUG()s
- A few fixes for build warnings, and an update to Drew's email address
* tag 'riscv-for-linus-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: export boot_cpu_hartid
Revert "riscv: Define TASK_SIZE_MAX for __access_ok()"
riscv: Fix sparse warning in vendor_extensions/sifive.c
Revert "riscv: misaligned: fix sleeping function called during misaligned access handling"
MAINTAINERS: Update Drew Fustini's email address
RISC-V: uaccess: Wrap the get_user_8 uaccess macro
raid6: riscv: Fix NULL pointer dereference caused by a missing clobber
RISC-V: vDSO: Correct inline assembly constraints in the getrandom syscall wrapper
riscv: vector: Fix context save/restore with xtheadvector
riscv: fix runtime constant support for nommu kernels
riscv: vdso: Exclude .rodata from the PT_DYNAMIC segment
Linus Torvalds [Sat, 28 Jun 2025 02:38:36 +0000 (19:38 -0700)]
Merge tag 'drm-fixes-2025-06-28' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Regular weekly drm updates, nothing out of the ordinary, amdgpu, xe,
i915 and a few misc bits. Seems about right for this time in the
release cycle.
core:
- fix drm_writeback_connector_cleanup function signature
- use correct HDMI audio bridge in drm_connector_hdmi_audio_init
bridge:
- SN65DSI86: fix HPD
amdgpu:
- Cleaner shader support for additional GFX9 GPUs
- MES firmware compatibility fixes
- Discovery error reporting fixes
- SDMA6/7 userq fixes
- Backlight fix
- EDID sanity check
i915:
- Fix for SNPS PHY HDMI for 1080p@120Hz
- Correct DP AUX DPCD probe address
- Followup build fix for GCOV and AutoFDO enabled config
xe:
- Missing error check
- Fix xe_hwmon_power_max_write
- Move flushes
- Explicitly exit CT safe mode on unwind
- Process deferred GGTT node removals on device unwind"
* tag 'drm-fixes-2025-06-28' of https://gitlab.freedesktop.org/drm/kernel:
drm/xe: Process deferred GGTT node removals on device unwind
drm/xe/guc: Explicitly exit CT safe mode on unwind
drm/xe: move DPT l2 flush to a more sensible place
drm/xe: Move DSB l2 flush to a more sensible place
drm/bridge: ti-sn65dsi86: Add HPD for DisplayPort connector type
drm/i915: fix build error some more
drm/xe/hwmon: Fix xe_hwmon_power_max_write
drm/xe/display: Add check for alloc_ordered_workqueue()
drm/amd/display: Add sanity checks for drm_edid_raw()
drm/amd/display: Fix AMDGPU_MAX_BL_LEVEL value
drm/amdgpu/sdma7: add ucode version checks for userq support
drm/amdgpu/sdma6: add ucode version checks for userq support
drm/amd: Adjust output for discovery error handling
drm/amdgpu/mes: add compatibility checks for set_hw_resource_1
drm/amdgpu/gfx9: Add Cleaner Shader Support for GFX9.x GPUs
drm/bridge-connector: Fix bridge in drm_connector_hdmi_audio_init()
drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS
drm/i915/snps_hdmi_pll: Fix 64-bit divisor truncation by using div64_u64
drm: writeback: Fix drm_writeback_connector_cleanup signature
Linus Torvalds [Sat, 28 Jun 2025 00:58:32 +0000 (17:58 -0700)]
Merge tag 'cxl-fixes-6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull Compute Express Link (CXL) fixes from Dave Jiang:
"These fixes address a few issues in the CXL subsystem, including
dealing with some bugs in the CXL EDAC and RAS drivers:
- Fix return value of cxlctl_validate_set_features()
- Fix min_scrub_cycle of a region miscaculation and add additional
documentation
- Fix potential memory leak issues for CXL EDAC
- Fix CPER handler device confusion for CXL RAS
- Fix using wrong repair type to check DRAM event record"
* tag 'cxl-fixes-6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl/edac: Fix using wrong repair type to check dram event record
cxl/ras: Fix CPER handler device confusion
cxl/edac: Fix potential memory leak issues
cxl/Documentation: Add more description about min/max scrub cycle
cxl/edac: Fix the min_scrub_cycle of a region miscalculation
cxl: fix return value in cxlctl_validate_set_features()
Linus Torvalds [Sat, 28 Jun 2025 00:32:30 +0000 (17:32 -0700)]
Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fix from Eric Biggers:
"Fix a regression where the purgatory code sometimes fails to build"
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: sha256: Mark sha256_choose_blocks as __always_inline
Dave Airlie [Fri, 27 Jun 2025 20:53:00 +0000 (06:53 +1000)]
Merge tag 'drm-misc-fixes-2025-06-26' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
drm-misc-fixes for v6.16-rc4:
- Fix function signature of drm_writeback_connector_cleanup.
- Use correct HDMI audio bridge in drm_connector_hdmi_audio_init.
- Make HPD work on SN65DSI86.
Edward Adam Davis [Tue, 24 Jun 2025 06:38:46 +0000 (14:38 +0800)]
tracing: Fix filter logic error
If the processing of the tr->events loop fails, the filter that has been
added to filter_head will be released twice in free_filter_list(&head->rcu)
and __free_filter(filter).
After adding the filter of tr->events, add the filter to the filter_head
process to avoid triggering uaf.
Link: https://lore.kernel.org/tencent_4EF87A626D702F816CD0951CE956EC32CD0A@qq.com Fixes: a9d0aab5eb33 ("tracing: Fix regression of filter waiting a long time on RCU synchronization") Reported-by: syzbot+daba72c4af9915e9c894@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=daba72c4af9915e9c894 Tested-by: syzbot+daba72c4af9915e9c894@syzkaller.appspotmail.com Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Linus Torvalds [Fri, 27 Jun 2025 19:08:36 +0000 (12:08 -0700)]
Merge tag 'acpi-6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Revert a commit that attempted to fix a memory leak in an error code
path and introduced a different issue (Zhe Qiao)"
* tag 'acpi-6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "PCI/ACPI: Fix allocated memory release on error in pci_acpi_scan_root()"
Linus Torvalds [Fri, 27 Jun 2025 16:02:33 +0000 (09:02 -0700)]
Merge tag 'block-6.16-20250626' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
- Fixes for ublk:
- fix C++ narrowing warnings in the uapi header
- update/improve UBLK_F_SUPPORT_ZERO_COPY comment in uapi header
- fix for the ublk ->queue_rqs() implementation, limiting a batch
to just the specific task AND ring
- ublk_get_data() error handling fix
- sanity check more arguments in ublk_ctrl_add_dev()
- selftest addition
- NVMe pull request via Christoph:
- reset delayed remove_work after reconnect
- fix atomic write size validation
- Fix for a warning introduced in bdev_count_inflight_rw() in this
merge window
* tag 'block-6.16-20250626' of git://git.kernel.dk/linux:
block: fix false warning in bdev_count_inflight_rw()
ublk: sanity check add_dev input for underflow
nvme: fix atomic write size validation
nvme: refactor the atomic write unit detection
nvme: reset delayed remove_work after reconnect
ublk: setup ublk_io correctly in case of ublk_get_data() failure
ublk: update UBLK_F_SUPPORT_ZERO_COPY comment in UAPI header
ublk: fix narrowing warnings in UAPI header
selftests: ublk: don't take same backing file for more than one ublk devices
ublk: build batch from IOs in same io_ring_ctx and io task
Linus Torvalds [Fri, 27 Jun 2025 15:55:57 +0000 (08:55 -0700)]
Merge tag 'io_uring-6.16-20250626' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
- Two tweaks for a recent fix: fixing a memory leak if multiple iovecs
were initially mapped but only the first was used and hence turned
into a UBUF rathan than an IOVEC iterator, and catching a case where
a retry would be done even if the previous segment wasn't full
- Small series fixing an issue making the vm unhappy if debugging is
turned on, hitting a VM_BUG_ON_PAGE()
- Fix a resource leak in io_import_dmabuf() in the error handling case,
which is a regression in this merge window
- Mark fallocate as needing to be write serialized, as is already done
for truncate and buffered writes
* tag 'io_uring-6.16-20250626' of git://git.kernel.dk/linux:
io_uring/kbuf: flag partial buffer mappings
io_uring/net: mark iov as dynamically allocated even for single segments
io_uring: fix resource leak in io_import_dmabuf()
io_uring: don't assume uaddr alignment in io_vec_fill_bvec
io_uring/rsrc: don't rely on user vaddr alignment
io_uring/rsrc: fix folio unpinning
io_uring: make fallocate be hashed work
Linus Torvalds [Fri, 27 Jun 2025 15:26:25 +0000 (08:26 -0700)]
Merge tag 's390-6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Alexander Gordeev:
- Fix incorrectly dropped dereferencing of the stack nth entry
introduced with a previous KASAN false positive fix
- Use a proper memdup_array_user() helper to prevent overflow in a
protected key size calculation
* tag 's390-6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/ptrace: Fix pointer dereferencing in regs_get_kernel_stack_nth()
s390/pkey: Prevent overflow in size calculation for memdup_user()
It always seems to be generic/753 that triggers this, and repeating
a quick group test run triggers it every 10-15 iterations. Hence it
generally triggers once up every 30-40 minutes of test time. just
running generic/753 by itself or concurrently with a limited group
of tests doesn't reproduce this issue at all.
Tracing on a hung system shows the AIL repeating every 50ms a log
force followed by an attempt to push pinned, aborted inodes from the
AIL (trimmed for brevity):
The inode log items are marked as aborted, which means that either:
a) a transaction commit has occurred, seen an error or shutdown, and
called xfs_trans_free_items() to abort the items. This should happen
before any pinning of log items occurs.
or
b) a dirty transaction has been cancelled. This should also happen
before any pinning of log items occurs.
or
c) AIL insertion at journal IO completion is marked as aborted. In
this case, the log item is pinned by the CIL until journal IO
completes and hence needs to be unpinned. This is then done after
the ->iop_committed() callback is run, so the pin count should be
balanced correctly.
Yet none of these seemed to be occurring. Further tracing indicated
this:
d) Shutdown during CIL pushing resulting in log item completion
being called from checkpoint abort processing. Items are unpinned
and released without serialisation against each other, journal IO
completion or transaction commit completion.
In this case, we may still have a transaction commit in flight that
holds a reference to a xfs_buf_log_item (BLI) after CIL insertion.
e.g. a synchronous transaction will flush the CIL before the
transaction is torn down. The concurrent CIL push then aborts
insertion it and drops the commit/AIL reference to the BLI. This can
leave the transaction commit context with the last reference to the
BLI which is dropped here:
xfs_trans_free_items()
->iop_release
xfs_buf_item_release
xfs_buf_item_put
if (XFS_LI_ABORTED)
xfs_trans_ail_delete
xfs_buf_item_relse()
Unlike the journal completion ->iop_unpin path, this path does not
run stale buffer completion process when it drops the last
reference, hence leaving the stale inodes attached to the buffer
sitting the AIL. There are no other references to those inodes, so
there is no other mechanism to remove them from the AIL. Hence
unmount hangs.
The buffer lock context for stale buffers is passed to the last BLI
reference. This is normally the last BLI unpin on journal IO
completion. The unpin then processes the stale buffer completion and
releases the buffer lock. However, if the final unpin from journal
IO completion (or CIL push abort) does not hold the last reference
to the BLI, there -must- still be a transaction context that
references the BLI, and so that context must perform the stale
buffer completion processing before the buffer is unlocked and the
BLI torn down.
The fix for this is to rework the xfs_buf_item_relse() path to run
stale buffer completion processing if it drops the last reference to
the BLI. We still hold the buffer locked, so the buffer owner and
lock context is the same as if we passed the BLI and buffer to the
->iop_unpin() context to finish stale process on journal commit.
However, we have to be careful here. In a shutdown state, we can be
freeing dirty BLIs from xfs_buf_item_put() via xfs_trans_brelse()
and xfs_trans_bdetach(). The existing code handles this case by
considering shutdown state as "aborted", but in doing so
largely masks the failure to clean up stale BLI state from the
xfs_buf_item_relse() path. i.e regardless of the shutdown state and
whether the item is in the AIL, we must finish the stale buffer
cleanup if we are are dropping the last BLI reference from the
->iop_relse path in transaction commit context.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Dave Chinner [Wed, 25 Jun 2025 22:48:59 +0000 (08:48 +1000)]
xfs: factor out stale buffer item completion
The stale buffer item completion handling is currently only done
from BLI unpinning. We need to perform this function from where-ever
the last reference to the BLI is dropped, so first we need to
factor this code out into a helper.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Dave Chinner [Wed, 25 Jun 2025 22:48:58 +0000 (08:48 +1000)]
xfs: rearrange code in xfs_buf_item.c
The code to initialise, release and free items is all the way down
the bottom of the file. Upcoming fixes need to these functions
earlier in the file, so move them to the top.
There is one code change in this move - the parameter to
xfs_buf_item_relse() is changed from the xfs_buf to the
xfs_buf_log_item - the thing that the function is releasing.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Dave Chinner [Wed, 25 Jun 2025 22:48:57 +0000 (08:48 +1000)]
xfs: add tracepoints for stale pinned inode state debug
I needed more insight into how stale inodes were getting stuck on
the AIL after a forced shutdown when running fsstress. These are the
tracepoints I added for that purpose.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
The CIL push has seen the deadlock, so it has aborted the push and
is running CIL checkpoint completion to abort all the items in the
checkpoint. This calls ->iop_unpin(remove = true) to clean up the
log items in the checkpoint.
When a buffer log item is unpined like this, it needs to lock the
buffer to run io completion to correctly fail the buffer and run all
the required completions to fail attached log items as well. In this
case, the attempt to lock the buffer on unpin is hanging because the
buffer is already locked.
I suspected a leaked XFS_BLI_HOLD state because of XFS_BLI_STALE
handling changes I was testing, so I went looking for
pin events on HOLD buffers and unpin events on locked buffer. That
isolated this one buffer with these two events:
Yup, another test run by check-parallel is running drop_caches
concurrently and the dquot shrinker for the hung filesystem is
running. That's trying to flush a dirty dquot from reclaim context,
and it waiting on a log force to complete. xfs_qm_dqflush is called
with the dquot buffer held locked, and so we've called
xfs_log_force() with that buffer locked.
Now the log force is waiting for a workqueue flush to complete, and
that workqueue flush is waiting of CIL checkpoint processing to
finish.
The CIL checkpoint processing is aborting all the log items it has,
and that requires locking aborted buffers to cancel them.
Now, normally this isn't a problem if we are issuing a log force
to unpin an object, because the ->iop_unpin() method wakes pin
waiters first. That results in the pin waiter finishing off whatever
it was doing, dropping the lock and then xfs_buf_item_unpin() can
lock the buffer and fail it.
However, xfs_qm_dqflush() is waiting on the -dquot- unpin event, not
the dquot buffer unpin event, and so it never gets woken and so does
not drop the buffer lock.
Inodes do not have this problem, as they can only be written from
one spot (->iop_push) whilst dquots can be written from multiple
places (memory reclaim, ->iop_push, xfs_dq_dqpurge, and quotacheck).
The reason that the dquot buffer has an attached buffer log item is
that it has been recently allocated. Initialisation of the dquot
buffer logs the buffer directly, thereby pinning it in memory. We
then modify the dquot in a separate operation, and have memory
reclaim racing with a shutdown and we trigger this deadlock.
check-parallel reproduces this reliably on 1kB FSB filesystems with
quota enabled because it does all of these things concurrently
without having to explicitly write tests to exercise these corner
case conditions.
xfs_qm_dquot_logitem_push() doesn't have this deadlock because it
checks if the dquot is pinned before locking the dquot buffer and
skipping it if it is pinned. This means the xfs_qm_dqunpin_wait()
log force in xfs_qm_dqflush() never triggers and we unlock the
buffer safely allowing a concurrent shutdown to fail the buffer
appropriately.
xfs_qm_dqpurge() could have this problem as it is called from
quotacheck and we might have allocated dquot buffers when recording
the quota updates. This can be fixed by calling
xfs_qm_dqunpin_wait() before we lock the dquot buffer. Because we
hold the dquot locked, nothing will be able to add to the pin count
between the unpin_wait and the dqflush callout, so this now makes
xfs_qm_dqpurge() safe against this race.
xfs_qm_dquot_isolate() can also be fixed this same way but, quite
frankly, we shouldn't be doing IO in memory reclaim context. If the
dquot is pinned or dirty, simply rotate it and let memory reclaim
come back to it later, same as we do for inodes.
This then gets rid of the nasty issue in xfs_qm_flush_one() where
quotacheck writeback races with memory reclaim flushing the dquots.
We can lift xfs_qm_dqunpin_wait() up into this code, then get rid of
the "can't get the dqflush lock" buffer write to cycle the dqlfush
lock and enable it to be flushed again. checking if the dquot is
pinned and returning -EAGAIN so that the dquot walk will revisit the
dquot again later.
Finally, with xfs_qm_dqunpin_wait() lifted into all the callers,
we can remove it from the xfs_qm_dqflush() code.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Dave Chinner [Wed, 25 Jun 2025 22:48:55 +0000 (08:48 +1000)]
xfs: catch stale AGF/AGF metadata
There is a race condition that can trigger in dmflakey fstests that
can result in asserts in xfs_ialloc_read_agi() and
xfs_alloc_read_agf() firing. The asserts look like this:
Essentially, it is the same problem. When _flakey_drop_and_remount()
loads the drop-writes table, it makes all writes silently fail. Writes
are reported to the fs as completed successfully, but they are not
issued to the backing store. The filesystem sees the successful
write completion and marks the metadata buffer clean and removes it
from the AIL.
If this happens at the same time as memory pressure is occuring,
the now-clean AGF and/or AGI buffers can be reclaimed from memory.
Shortly afterwards, but before _flakey_drop_and_remount() runs
unmount, background writeback is kicked and it tries to allocate
blocks for the dirty pages in memory. This then tries to access the
AGF buffer we just turfed out of memory. It's not found, so it gets
read in from disk.
This is all fine, except for the fact that the last writeback of the
AGF did not actually reach disk. The AGF on disk is stale compared
to the in-memory state held by the perag, and so they don't match
and the assert fires.
Then other operations on that inode hang because the task was killed
whilst holding inode locks. e.g:
Memory pressure is one way to trigger this, another is to run "echo
3 > /proc/sys/vm/drop_caches" randomly while tests are running.
Regardless of how it is triggered, this effectively takes down the
system once umount hangs because it's holding a sb->s_umount lock
exclusive and now every sync(1) call gets stuck on it.
Fix this by replacing the asserts with a corruption detection check
and a shutdown.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Dave Chinner [Wed, 25 Jun 2025 22:48:54 +0000 (08:48 +1000)]
xfs: xfs_ifree_cluster vs xfs_iflush_shutdown_abort deadlock
Lock order of xfs_ifree_cluster() is cluster buffer -> try ILOCK
-> IFLUSHING, except for the last inode in the cluster that is
triggering the free. In that case, the lock order is ILOCK ->
cluster buffer -> IFLUSHING.
xfs_iflush_cluster() uses cluster buffer -> try ILOCK -> IFLUSHING,
so this can safely run concurrently with xfs_ifree_cluster().
xfs_inode_item_precommit() uses ILOCK -> cluster buffer, but this
cannot race with xfs_ifree_cluster() so being in a different order
will not trigger a deadlock.
xfs_reclaim_inode() during a filesystem shutdown uses ILOCK ->
IFLUSHING -> cluster buffer via xfs_iflush_shutdown_abort(), and
this deadlocks against xfs_ifree_cluster() like so:
We can't change the lock order of xfs_ifree_cluster() - XFS_ISTALE
and XFS_IFLUSHING are serialised through to journal IO completion
by the cluster buffer lock being held.
There's quite a few asserts in the code that check that XFS_ISTALE
does not occur out of sync with buffer locking (e.g. in
xfs_iflush_cluster). There's also a dependency on the inode log item
being removed from the buffer before XFS_IFLUSHING is cleared, also
with asserts that trigger on this.
Further, we don't have a requirement for the inode to be locked when
completing or aborting inode flushing because all the inode state
updates are serialised by holding the cluster buffer lock across the
IO to completion.
We can't check for XFS_IRECLAIM in xfs_ifree_mark_inode_stale() and
skip the inode, because there is no guarantee that the inode will be
reclaimed. Hence it *must* be marked XFS_ISTALE regardless of
whether reclaim is preparing to free that inode. Similarly, we can't
check for IFLUSHING before locking the inode because that would
result in dirty inodes not being marked with ISTALE in the event of
racing with XFS_IRECLAIM.
Hence we have to address this issue from the xfs_reclaim_inode()
side. It is clear that we cannot hold the inode locked here when
calling xfs_iflush_shutdown_abort() because it is the inode->buffer
lock order that causes the deadlock against xfs_ifree_cluster().
Hence we need to drop the ILOCK before aborting the inode in the
shutdown case. Once we've aborted the inode, we can grab the ILOCK
again and then immediately reclaim it as it is now guaranteed to be
clean.
Note that dropping the ILOCK in xfs_reclaim_inode() means that it
can now be locked by xfs_ifree_mark_inode_stale() and seen whilst in
this state. This is safe because we have left the XFS_IFLUSHING flag
on the inode and so xfs_ifree_mark_inode_stale() will simply set
XFS_ISTALE and move to the next inode. An ASSERT check in this path
needs to be tweaked to take into account this new shutdown
interaction.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Johannes Berg [Fri, 6 Jun 2025 07:56:52 +0000 (09:56 +0200)]
i2c: scx200_acb: depends on HAS_IOPORT
It already depends on X86_32, but that's also set for ARCH=um.
Recent changes made UML no longer have IO port access since
it's not needed, but this driver uses it. Build it only for
HAS_IOPORT. This is pretty much the same as depending on X86,
but on the off-chance that HAS_IOPORT will ever be optional
on x86 HAS_IOPORT is the real prerequisite.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Bibo Mao [Fri, 27 Jun 2025 10:27:44 +0000 (18:27 +0800)]
LoongArch: KVM: Disable updating of "num_cpu" and "feature"
Property "num_cpu" and "feature" are read-only once eiointc is created,
which are set with KVM_DEV_LOONGARCH_EXTIOI_GRP_CTRL attr group before
device creation.
Attr group KVM_DEV_LOONGARCH_EXTIOI_GRP_SW_STATUS is to update register
and software state for migration and reset usage, property "num_cpu" and
"feature" can not be update again if it is created already.
Here discard write operation with property "num_cpu" and "feature" in
attr group KVM_DEV_LOONGARCH_EXTIOI_GRP_CTRL.
Cc: stable@vger.kernel.org Fixes: 1ad7efa552fd ("LoongArch: KVM: Add EIOINTC user mode read and write functions") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Bibo Mao [Fri, 27 Jun 2025 10:27:44 +0000 (18:27 +0800)]
LoongArch: KVM: Check validity of "num_cpu" from user space
The maximum supported cpu number is EIOINTC_ROUTE_MAX_VCPUS about
irqchip EIOINTC, here add validation about cpu number to avoid array
pointer overflow.
Cc: stable@vger.kernel.org Fixes: 1ad7efa552fd ("LoongArch: KVM: Add EIOINTC user mode read and write functions") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Bibo Mao [Fri, 27 Jun 2025 10:27:44 +0000 (18:27 +0800)]
LoongArch: KVM: Check interrupt route from physical CPU
With EIOINTC interrupt controller, physical CPU ID is set for irq route.
However the function kvm_get_vcpu() is used to get destination vCPU when
delivering irq. With API kvm_get_vcpu(), the logical CPU ID is used.
With API kvm_get_vcpu_by_cpuid(), vCPU ID can be searched from physical
CPU ID.
Cc: stable@vger.kernel.org Fixes: 3956a52bc05b ("LoongArch: KVM: Add EIOINTC read and write functions") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Bibo Mao [Fri, 27 Jun 2025 10:27:44 +0000 (18:27 +0800)]
LoongArch: KVM: Fix interrupt route update with EIOINTC
With function eiointc_update_sw_coremap(), there is forced assignment
like val = *(u64 *)pvalue. Parameter pvalue may be pointer to char type
or others, there is problem with forced assignment with u64 type.
Here the detailed value is passed rather address pointer.
Cc: stable@vger.kernel.org Fixes: 3956a52bc05b ("LoongArch: KVM: Add EIOINTC read and write functions") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Bibo Mao [Fri, 27 Jun 2025 10:27:44 +0000 (18:27 +0800)]
LoongArch: KVM: Add address alignment check for IOCSR emulation
IOCSR instruction supports 1/2/4/8 bytes access, the address should be
naturally aligned with its access size. Here address alignment check is
added in the EIOINTC kernel emulation.
Cc: stable@vger.kernel.org Fixes: 3956a52bc05b ("LoongArch: KVM: Add EIOINTC read and write functions") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Linus Torvalds [Fri, 27 Jun 2025 02:49:12 +0000 (19:49 -0700)]
Merge tag 'bcachefs-2025-06-26' of git://evilpiepirate.org/bcachefs
Pull bcachefs fixes from Kent Overstreet:
- Lots of small check/repair fixes, primarily in subvol loop and
directory structure loop (when involving snapshots).
- Fix a few 6.16 regressions: rare UAF in the foreground allocator path
when taking a transaction restart from the transaction bump
allocator, and some small fallout from the change to log the error
being corrected in the journal when repairing errors, also some
fallout from the btree node read error logging improvements.
(Alan, Bharadwaj)
- New option: journal_rewind
This lets the entire filesystem be reset to an earlier point in time.
Note that this is only a disaster recovery tool, and right now there
are major caveats to using it (discards should be disabled, in
particular), but it successfully restored the filesystem of one of
the users who was bit by the subvolume deletion bug and didn't have
backups. I'll likely be making some changes to the discard path in
the future to make this a reliable recovery tool.
- Some new btree iterator tracepoints, for tracking down some
livelock-ish behaviour we've been seeing in the main data write path.
* tag 'bcachefs-2025-06-26' of git://evilpiepirate.org/bcachefs: (51 commits)
bcachefs: Plumb correct ip to trans_relock_fail tracepoint
bcachefs: Ensure we rewind to run recovery passes
bcachefs: Ensure btree node scan runs before checking for scanned nodes
bcachefs: btree_root_unreadable_and_scan_found_nothing should not be autofix
bcachefs: fix bch2_journal_keys_peek_prev_min() underflow
bcachefs: Use wait_on_allocator() when allocating journal
bcachefs: Check for bad write buffer key when moving from journal
bcachefs: Don't unlock the trans if ret doesn't match BCH_ERR_operation_blocked
bcachefs: Fix range in bch2_lookup_indirect_extent() error path
bcachefs: fix spurious error_throw
bcachefs: Add missing bch2_err_class() to fileattr_set()
bcachefs: Add missing key type checks to check_snapshot_exists()
bcachefs: Don't log fsck err in the journal if doing repair elsewhere
bcachefs: Fix *__bch2_trans_subbuf_alloc() error path
bcachefs: Fix missing newlines before ero
bcachefs: fix spurious error in read_btree_roots()
bcachefs: fsck: Fix oops in key_visible_in_snapshot()
bcachefs: fsck: fix unhandled restart in topology repair
bcachefs: fsck: Fix check_directory_structure when no check_dirents
bcachefs: Fix restart handling in btree_node_scrub_work()
...
Linus Torvalds [Thu, 26 Jun 2025 19:26:39 +0000 (12:26 -0700)]
Merge tag 'devicetree-fixes-for-6.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Convert altr,uart-1.0 and altr,juart-1.0 to DT schema. These were
applied for nios2, but never sent upstream.
- Fix extra '/' in fsl,ls1028a-reset '$id' path
- Fix warnings in ti,sn65dsi83 schema due to unnecessary $ref.
* tag 'devicetree-fixes-for-6.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: serial: Convert altr,uart-1.0 to DT schema
dt-bindings: serial: Convert altr,juart-1.0 to DT schema
dt-bindings: soc: fsl,ls1028a-reset: Drop extra "/" in $id
dt-bindings: drm/bridge: ti-sn65dsi83: drop $ref to fix lvds-vod* warnings
Jens Axboe [Thu, 26 Jun 2025 18:17:48 +0000 (12:17 -0600)]
io_uring/kbuf: flag partial buffer mappings
A previous commit aborted mapping more for a non-incremental ring for
bundle peeking, but depending on where in the process this peeking
happened, it would not necessarily prevent a retry by the user. That can
create gaps in the received/read data.
Add struct buf_sel_arg->partial_map, which can pass this information
back. The networking side can then map that to internal state and use it
to gate retry as well.
Since this necessitates a new flag, change io_sr_msg->retry to a
retry_flags member, and store both the retry and partial map condition
in there.
Cc: stable@vger.kernel.org Fixes: 26ec15e4b0c1 ("io_uring/kbuf: don't truncate end buffer for multiple buffer peeks") Signed-off-by: Jens Axboe <axboe@kernel.dk>
Trond Myklebust [Thu, 19 Jun 2025 19:16:11 +0000 (15:16 -0400)]
NFSv4/flexfiles: Fix handling of NFS level errors in I/O
Allow the flexfiles error handling to recognise NFS level errors (as
opposed to RPC level errors) and handle them separately. The main
motivator is the NFSERR_PERM errors that get returned if the NFS client
connects to the data server through a port number that is lower than
1024. In that case, the client should disconnect and retry a READ on a
different data server, or it should retry a WRITE after reconnecting.
Reviewed-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Fixes: d67ae825a59d ("pnfs/flexfiles: Add the FlexFile Layout Driver") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
David Howells [Wed, 2 Apr 2025 19:27:26 +0000 (20:27 +0100)]
cifs: Fix reading into an ITER_FOLIOQ from the smbdirect code
When performing a file read from RDMA, smbd_recv() prints an "Invalid msg
type 4" error and fails the I/O. This is due to the switch-statement there
not handling the ITER_FOLIOQ handed down from netfslib.
Fix this by collapsing smbd_recv_buf() and smbd_recv_page() into
smbd_recv() and just using copy_to_iter() instead of memcpy(). This
future-proofs the function too, in case more ITER_* types are added.
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Reported-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: David Howells <dhowells@redhat.com>
cc: Tom Talpey <tom@talpey.com>
cc: Paulo Alcantara (Red Hat) <pc@manguebit.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
David Howells [Wed, 25 Jun 2025 13:15:04 +0000 (14:15 +0100)]
cifs: Fix the smbd_response slab to allow usercopy
The handling of received data in the smbdirect client code involves using
copy_to_iter() to copy data from the smbd_reponse struct's packet trailer
to a folioq buffer provided by netfslib that encapsulates a chunk of
pagecache.
If, however, CONFIG_HARDENED_USERCOPY=y, this will result in the checks
then performed in copy_to_iter() oopsing with something like the following:
CIFS: Attempting to mount //172.31.9.1/test
CIFS: VFS: RDMA transport established
usercopy: Kernel memory exposure attempt detected from SLUB object 'smbd_response_0000000091e24ea1' (offset 81, size 63)!
------------[ cut here ]------------
kernel BUG at mm/usercopy.c:102!
...
RIP: 0010:usercopy_abort+0x6c/0x80
...
Call Trace:
<TASK>
__check_heap_object+0xe3/0x120
__check_object_size+0x4dc/0x6d0
smbd_recv+0x77f/0xfe0 [cifs]
cifs_readv_from_socket+0x276/0x8f0 [cifs]
cifs_read_from_socket+0xcd/0x120 [cifs]
cifs_demultiplex_thread+0x7e9/0x2d50 [cifs]
kthread+0x396/0x830
ret_from_fork+0x2b8/0x3b0
ret_from_fork_asm+0x1a/0x30
The problem is that the smbd_response slab's packet field isn't marked as
being permitted for usercopy.
Fix this by passing parameters to kmem_slab_create() to indicate that
copy_to_iter() is permitted from the packet region of the smbd_response
slab objects, less the header space.
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Reported-by: Stefan Metzmacher <metze@samba.org> Link: https://lore.kernel.org/r/acb7f612-df26-4e2a-a35d-7cd040f513e1@samba.org/ Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Stefan Metzmacher <metze@samba.org> Tested-by: Stefan Metzmacher <metze@samba.org>
cc: Paulo Alcantara <pc@manguebit.com>
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
Paulo Alcantara [Wed, 25 Jun 2025 15:22:38 +0000 (12:22 -0300)]
smb: client: fix potential deadlock when reconnecting channels
Fix cifs_signal_cifsd_for_reconnect() to take the correct lock order
and prevent the following deadlock from happening
======================================================
WARNING: possible circular locking dependency detected
6.16.0-rc3-build2+ #1301 Tainted: G S W
------------------------------------------------------
cifsd/6055 is trying to acquire lock: ffff88810ad56038 (&tcp_ses->srv_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x134/0x200
but task is already holding lock: ffff888119c64330 (&ret_buf->chan_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0xcf/0x200
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
3 locks held by cifsd/6055:
#0: ffffffff857de398 (&cifs_tcp_ses_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x7b/0x200
#1: ffff888119c64060 (&ret_buf->ses_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x9c/0x200
#2: ffff888119c64330 (&ret_buf->chan_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0xcf/0x200
Cc: linux-cifs@vger.kernel.org Reported-by: David Howells <dhowells@redhat.com> Fixes: d7d7a66aacd6 ("cifs: avoid use of global locks for high contention data") Reviewed-by: David Howells <dhowells@redhat.com> Tested-by: David Howells <dhowells@redhat.com> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* tag 'nvme-6.16-2025-06-26' of git://git.infradead.org/nvme:
nvme: fix atomic write size validation
nvme: refactor the atomic write unit detection
nvme: reset delayed remove_work after reconnect
Michal Wajdeczko [Thu, 12 Jun 2025 22:09:36 +0000 (00:09 +0200)]
drm/xe: Process deferred GGTT node removals on device unwind
While we are indirectly draining our dedicated workqueue ggtt->wq
that we use to complete asynchronous removal of some GGTT nodes,
this happends as part of the managed-drm unwinding (ggtt_fini_early),
which could be later then manage-device unwinding, where we could
already unmap our MMIO/GMS mapping (mmio_fini).
This was recently observed during unsuccessful VF initialization:
Michal Wajdeczko [Thu, 12 Jun 2025 22:09:37 +0000 (00:09 +0200)]
drm/xe/guc: Explicitly exit CT safe mode on unwind
During driver probe we might be briefly using CT safe mode, which
is based on a delayed work, but usually we are able to stop this
once we have IRQ fully operational. However, if we abort the probe
quite early then during unwind we might try to destroy the workqueue
while there is still a pending delayed work that attempts to restart
itself which triggers a WARN.
This was recently observed during unsuccessful VF initialization:
Maarten Lankhorst [Fri, 6 Jun 2025 10:45:47 +0000 (11:45 +0100)]
drm/xe: Move DSB l2 flush to a more sensible place
Flushing l2 is only needed after all data has been written.
Fixes: 01570b446939 ("drm/xe/bmg: implement Wa_16023588340") Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: stable@vger.kernel.org # v6.12+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://lore.kernel.org/r/20250606104546.1996818-3-matthew.auld@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 0dd2dd0182bc444a62652e89d08c7f0e4fde15ba) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Kees Cook [Thu, 26 Jun 2025 12:07:18 +0000 (20:07 +0800)]
LoongArch: Handle KCOV __init vs inline mismatches
When the KCOV is enabled all functions get instrumented, unless
the __no_sanitize_coverage attribute is used. To prepare for
__no_sanitize_coverage being applied to __init functions, we have to
handle differences in how GCC's inline optimizations get resolved.
For LoongArch this exposed several places where __init annotations
were missing but ended up being "accidentally correct". So fix these
cases.
Ming Wang [Thu, 26 Jun 2025 12:07:18 +0000 (20:07 +0800)]
LoongArch: Reserve the EFI memory map region
The EFI memory map at 'boot_memmap' is crucial for kdump to understand
the primary kernel's memory layout. This memory region, typically part
of EFI Boot Services (BS) data, can be overwritten after ExitBootServices
if not explicitly preserved by the kernel.
This commit addresses this by:
1. Calling memblock_reserve() to reserve the entire physical region
occupied by the EFI memory map (header + descriptors). This prevents
the primary kernel from reallocating and corrupting this area.
2. Setting the EFI_PRESERVE_BS_REGIONS flag in efi.flags. This indicates
that efforts have been made to preserve critical BS code/data regions
which can be useful for other kernel subsystems or debugging.
These changes ensure the original EFI memory map data remains intact,
improving kdump reliability and potentially aiding other EFI-related
functionalities that might rely on preserved BS code/data.
Signed-off-by: Ming Wang <wangming01@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Huacai Chen [Thu, 26 Jun 2025 12:07:18 +0000 (20:07 +0800)]
LoongArch: Fix build warnings about export.h
After commit a934a57a42f64a4 ("scripts/misc-check: check missing #include
<linux/export.h> when W=1") and 7d95680d64ac8e836c ("scripts/misc-check:
check unnecessary #include <linux/export.h> when W=1"), we get some build
warnings with W=1:
arch/loongarch/kernel/acpi.c: warning: EXPORT_SYMBOL() is used, but #include <linux/export.h> is missing
arch/loongarch/kernel/alternative.c: warning: EXPORT_SYMBOL() is used, but #include <linux/export.h> is missing
arch/loongarch/kernel/kfpu.c: warning: EXPORT_SYMBOL() is used, but #include <linux/export.h> is missing
arch/loongarch/kernel/traps.c: warning: EXPORT_SYMBOL() is used, but #include <linux/export.h> is missing
arch/loongarch/kernel/unwind_guess.c: warning: EXPORT_SYMBOL() is used, but #include <linux/export.h> is missing
arch/loongarch/kernel/unwind_orc.c: warning: EXPORT_SYMBOL() is used, but #include <linux/export.h> is missing
arch/loongarch/kernel/unwind_prologue.c: warning: EXPORT_SYMBOL() is used, but #include <linux/export.h> is missing
arch/loongarch/lib/crc32-loongarch.c: warning: EXPORT_SYMBOL() is used, but #include <linux/export.h> is missing
arch/loongarch/lib/csum.c: warning: EXPORT_SYMBOL() is used, but #include <linux/export.h> is missing
arch/loongarch/kernel/elf.c: warning: EXPORT_SYMBOL() is not used, but #include <linux/export.h> is present
arch/loongarch/kernel/paravirt.c: warning: EXPORT_SYMBOL() is not used, but #include <linux/export.h> is present
arch/loongarch/pci/pci.c: warning: EXPORT_SYMBOL() is not used, but #include <linux/export.h> is present
Thomas Huth [Thu, 26 Jun 2025 12:07:10 +0000 (20:07 +0800)]
LoongArch: Replace __ASSEMBLY__ with __ASSEMBLER__ in headers
While the GCC and Clang compilers already define __ASSEMBLER__
automatically when compiling assembler code, __ASSEMBLY__ is a macro
that only gets defined by the Makefiles in the kernel. This is bad
since macros starting with two underscores are names that are reserved
by the C language. It can also be very confusing for the developers
when switching between userspace and kernelspace coding, or when
dealing with uapi headers that rather should use __ASSEMBLER__ instead.
So let's now standardize on the __ASSEMBLER__ macro that is provided
by the compilers.
This is almost a completely mechanical patch (done with a simple
"sed -i" statement), with one comment tweaked manually in the
arch/loongarch/include/asm/cpu.h file (it was missing the trailing
underscores).
Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Christoph Hellwig [Wed, 11 Jun 2025 04:54:56 +0000 (06:54 +0200)]
nvme: fix atomic write size validation
Don't mix the namespace and controller values, and validate the
per-controller limit when probing the controller. This avoid spurious
failures for controllers with namespaces that have different namespaces
with different logical block sizes, or report the per-namespace values
only for some namespaces.
It also fixes a missing queue_limits_cancel_update in an error path by
removing that error path.
Fixes: 8695f060a029 ("nvme: all namespaces in a subsystem must adhere to a common atomic write size") Reported-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: John Garry <john.g.garry@oracle.com> Tested-by: Yi Zhang <yi.zhang@redhat.com>
Keith Busch [Wed, 11 Jun 2025 04:50:47 +0000 (06:50 +0200)]
nvme: reset delayed remove_work after reconnect
The remove_work will proceed with permanently disconnecting on the
initial final path failure if the head shows no paths after the delay.
If a new path connects while the remove_work is pending, and if that new
path happens to disconnect before that remove_work executes, the delayed
removal should reset based on the most recent path disconnect time, but
queue_delayed_work() won't do anything if the work is already pending.
Attempt to cancel the delayed work when a new path connects, and use
mod_delayed_work() in case the remove_work remains pending anyway.
Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Jiawen Wu [Wed, 25 Jun 2025 02:39:24 +0000 (10:39 +0800)]
net: libwx: fix the creation of page_pool
'rx_ring->size' means the count of ring descriptors multiplied by the
size of one descriptor. When increasing the count of ring descriptors,
it may exceed the limit of pool size.
[ 864.209610] page_pool_create_percpu() gave up with errno -7
[ 864.209613] txgbe 0000:11:00.0: Page pool creation failed: -7
Fix to set the pool_size to the count of ring descriptors.
Fixes: 850b971110b2 ("net: libwx: Allocate Rx and Tx resources") Cc: stable@vger.kernel.org Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/434C72BFB40E350A+20250625023924.21821-1-jiawenwu@trustnetic.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 24 Jun 2025 18:32:58 +0000 (11:32 -0700)]
net: selftests: fix TCP packet checksum
The length in the pseudo header should be the length of the L3 payload
AKA the L4 header+payload. The selftest code builds the packet from
the lower layers up, so all the headers are pushed already when it
constructs L4. We need to subtract the lower layer headers from skb->len.
Leo Yan [Wed, 25 Jun 2025 17:07:37 +0000 (18:07 +0100)]
perf/aux: Fix pending disable flow when the AUX ring buffer overruns
If an AUX event overruns, the event core layer intends to disable the
event by setting the 'pending_disable' flag. Unfortunately, the event
is not actually disabled afterwards.
the 'pending_disable' flag was changed to a boolean. However, the
AUX event code was not updated accordingly. The flag ends up holding a
CPU number. If this number is zero, the flag is taken as false and the
IRQ work is never triggered.
Later, with commit:
2b84def990d3 ("perf: Split __perf_pending_irq() out of perf_pending_irq()")
a new IRQ work 'pending_disable_irq' was introduced to handle event
disabling. The AUX event path was not updated to kick off the work queue.
To fix this bug, when an AUX ring buffer overrun is detected, call
perf_event_disable_inatomic() to initiate the pending disable flow.
Also update the outdated comment for setting the flag, to reflect the
boolean values (0 or 1).
Fixes: 2b84def990d3 ("perf: Split __perf_pending_irq() out of perf_pending_irq()") Fixes: ca6c21327c6a ("perf: Fix missing SIGTRAPs") Signed-off-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Liang Kan <kan.liang@linux.intel.com> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: linux-perf-users@vger.kernel.org Link: https://lore.kernel.org/r/20250625170737.2918295-1-leo.yan@arm.com
Salvatore Bonaccorso [Wed, 25 Jun 2025 18:41:28 +0000 (20:41 +0200)]
ALSA: hda/realtek: Fix built-in mic on ASUS VivoBook X507UAR
The built-in mic of ASUS VivoBook X507UAR is broken recently by the fix
of the pin sort. The fixup ALC256_FIXUP_ASUS_MIC_NO_PRESENCE is working
for addressing the regression, too.
Fixes: 3b4309546b48 ("ALSA: hda: Fix headset detection failure due to unstable sort") Reported-by: Igor Tamara <igor.tamara@gmail.com> Closes: https://bugs.debian.org/1108069 Signed-off-by: Salvatore Bonaccorso <carnil@debian.org> Link: https://lore.kernel.org/CADdHDco7_o=4h_epjEAb92Dj-vUz_PoTC2-W9g5ncT2E0NzfeQ@mail.gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
Linus Torvalds [Thu, 26 Jun 2025 04:09:02 +0000 (21:09 -0700)]
Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:
- Fix use-after-free in libbpf when map is resized (Adin Scannell)
- Fix verifier assumptions about 2nd argument of bpf_sysctl_get_name
(Jerome Marchand)
- Fix verifier assumption of nullness of d_inode in dentry (Song Liu)
- Fix global starvation of LRU map (Willem de Bruijn)
- Fix potential NULL dereference in btf_dump__free (Yuan Chen)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: adapt one more case in test_lru_map to the new target_free
libbpf: Fix possible use-after-free for externs
selftests/bpf: Convert test_sysctl to prog_tests
bpf: Specify access type of bpf_sysctl_get_name args
libbpf: Fix null pointer dereference in btf_dump__free on allocation failure
bpf: Adjust free target to avoid global starvation of LRU map
bpf: Mark dentry->d_inode as trusted_or_null
Kent Overstreet [Wed, 25 Jun 2025 04:48:14 +0000 (00:48 -0400)]
bcachefs: Ensure we rewind to run recovery passes
Fix a 6.16 regression from the recovery pass rework, which introduced a
bug where calling bch2_run_explicit_recovery_pass() would only return
the error code to rewind recovery for the first call that scheduled that
recovery pass.
If the error code from the first call was swallowed (because it was
called by an asynchronous codepath), subsequent calls would go "ok, this
pass is already marked as needing to run" and return 0.
Fixing this ensures that check_topology bails out to run btree_node_scan
before doing any repair.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 25 Jun 2025 16:45:11 +0000 (12:45 -0400)]
bcachefs: Ensure btree node scan runs before checking for scanned nodes
Previously, calling bch2_btree_has_scanned_nodes() when btree node
scan hadn't actually run would erroniously return false - causing us to
think a btree was entirely gone.
This fixes a 6.16 regression from moving the scheduling of btree node
scan out of bch2_btree_lost_data() (fixing the bug where we'd schedule
it persistently in the superblock) and only scheduling it when
check_toploogy() is asking for scanned btree nodes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Linus Torvalds [Thu, 26 Jun 2025 03:48:48 +0000 (20:48 -0700)]
Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull mount fixes from Al Viro:
"Several mount-related fixes"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
userns and mnt_idmap leak in open_tree_attr(2)
attach_recursive_mnt(): do not lock the covering tree when sliding something under it
replace collect_mounts()/drop_collected_mounts() with a safer variant
Kuniyuki Iwashima [Tue, 24 Jun 2025 21:45:00 +0000 (14:45 -0700)]
atm: Release atm_dev_mutex after removing procfs in atm_dev_deregister().
syzbot reported a warning below during atm_dev_register(). [0]
Before creating a new device and procfs/sysfs for it, atm_dev_register()
looks up a duplicated device by __atm_dev_lookup(). These operations are
done under atm_dev_mutex.
However, when removing a device in atm_dev_deregister(), it releases the
mutex just after removing the device from the list that __atm_dev_lookup()
iterates over.
So, there will be a small race window where the device does not exist on
the device list but procfs/sysfs are still not removed, triggering the
splat.
Let's hold the mutex until procfs/sysfs are removed in
atm_dev_deregister().
Lorenzo Stoakes [Wed, 25 Jun 2025 09:52:31 +0000 (10:52 +0100)]
MAINTAINERS: add Lorenzo as THP co-maintainer
I am doing a great deal of review and getting ever more involved in THP
with intent to do more so in future also, so add myself as co-maintainer
to help David with workload.
Link: https://lkml.kernel.org/r/20250625095231.42874-1-lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: Dev Jain <dev.jain@arm.com> Acked-by: Zi Yan <ziy@nvidia.com> Acked-by: Oscar Salvador <osalvador@suse.de> Cc: Barry Song <baohua@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mariano Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Dev Jain [Fri, 20 Jun 2025 11:11:50 +0000 (16:41 +0530)]
selftests/mm: fix validate_addr() helper
validate_addr() checks whether the address returned by mmap() lies in the
low or high VA space, according to whether a high addr hint was passed or
not. The fix commit mentioned below changed the code in such a way that
this function will always return failure when passed high_addr == 1; addr
will be >= HIGH_ADDR_MARK always, we will fall down to "if (addr >
HIGH_ADDR_MARK)" and return failure. Fix this.
Link: https://lkml.kernel.org/r/20250620111150.50344-1-dev.jain@arm.com Fixes: d1d86ce28d0f ("selftests/mm: virtual_address_range: conform to TAP format output") Signed-off-by: Dev Jain <dev.jain@arm.com> Reviewed-by: Donet Tom <donettom@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Arnd Bergmann [Fri, 20 Jun 2025 11:21:22 +0000 (13:21 +0200)]
crashdump: add CONFIG_KEYS dependency
The dm_crypt code fails to build without CONFIG_KEYS:
kernel/crash_dump_dm_crypt.c: In function 'restore_dm_crypt_keys_to_thread_keyring':
kernel/crash_dump_dm_crypt.c:105:9: error: unknown type name 'key_ref_t'; did you mean 'key_ref_put'?
There is a mix of 'select KEYS' and 'depends on KEYS' in Kconfig,
so there is no single obvious solution here, but generally using 'depends on'
makes more sense and is less likely to cause dependency loops.
Link: https://lkml.kernel.org/r/20250620112140.3396316-1-arnd@kernel.org Fixes: 62f17d9df692 ("crash_dump: retrieve dm crypt keys in kdump kernel") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Alexander Graf <graf@amazon.com> Cc: Baoquan He <bhe@redhat.com> Cc: Coiby Xu <coxu@redhat.com> Cc: Dave Vasilevsky <dave@vasilevsky.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Zijun Hu [Fri, 20 Jun 2025 11:53:53 +0000 (19:53 +0800)]
mailmap: add entries for Zijun Hu
Map my old qualcomm email addresses:
Zijun Hu <quic_zijuhu@quicinc.com>
Zijun Hu <zijuhu@codeaurora.org>
To the current one:
Zijun Hu <zijun.hu@oss.qualcomm.com>
Haiyue Wang [Sat, 21 Jun 2025 17:13:51 +0000 (01:13 +0800)]
fuse: fix runtime warning on truncate_folio_batch_exceptionals()
The WARN_ON_ONCE is introduced on truncate_folio_batch_exceptionals() to
capture whether the filesystem has removed all DAX entries or not.
And the fix has been applied on the filesystem xfs and ext4 by the commit 0e2f80afcfa6 ("fs/dax: ensure all pages are idle prior to filesystem
unmount").
Apply the missed fix on filesystem fuse to fix the runtime warning: