Kent Overstreet [Thu, 17 Apr 2025 18:09:56 +0000 (14:09 -0400)]
bcachefs: Fix snapshotting a subvolume, then renaming it
Subvolume roots and the dirents that point to them are special; they
don't obey the normal snapshot versioning rules because they cross
snapshot boundaries.
We don't keep around older versions of subvolume dirents on rename - we
don't need to, because subvolume dirents are only visible in the parent
subvolume, and we wouldn't be able to match up the different dirent and
inode versions due to crossing the snapshot ID boundary.
That means that when we rename a subvolume, that's been snapshotted, the
older version of the subvolume root will become dangling - it won't have
a dirent that points to it.
That's expected, we just need to tell fsck that this is ok.
Fixes: https://github.com/koverstreet/bcachefs/issues/856 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 10 Apr 2025 01:04:17 +0000 (21:04 -0400)]
bcachefs: Don't print data read retry success on non-errors
We may end up in the data read retry path when reading cached data and
racing with invalidation, or on checksum error when we were reading into
a userspace buffer that might have been modified while the read was in
flight.
These aren't real errors, so we shouldn't print the 'retry success'
message.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Alan Huang [Sat, 12 Apr 2025 10:20:49 +0000 (18:20 +0800)]
bcachefs: Add missing error handling
Reported-by: syzbot+d10151bf01574a09a915@syzkaller.appspotmail.com Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Gabriel Shahrouzi [Sat, 12 Apr 2025 18:39:33 +0000 (14:39 -0400)]
bcachefs: Prevent granting write refs when filesystem is read-only
Fix a shutdown WARNING in bch2_dev_free caused by active write I/O
references (ca->io_ref[WRITE]) on a device being freed.
The problem occurs when:
- The filesystem is marked read-only (BCH_FS_rw clear in c->flags).
- A subsequent operation (e.g., error handling for device removal)
incorrectly tries to grant write references back to a device.
- During final shutdown, the read-only flag causes the system to skip
stopping write I/O references (bch2_dev_io_ref_stop(ca, WRITE)).
- The leftover active write reference triggers the WARN_ON in
bch2_dev_free.
Prevent this by checking if the filesystem is read-only before
attempting to grant write references to a device in the problematic
code path. Ensure consistency between the filesystem state flag
and the device I/O reference state during shutdown.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Eric Biggers [Wed, 2 Apr 2025 16:23:38 +0000 (09:23 -0700)]
bcachefs: Remove unnecessary softdep on xxhash
As with the other algorithms, bcachefs does not access xxhash through
the crypto API. So there is no need to use a module softdep to ensure
that it is loaded.
Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Eric Biggers [Wed, 2 Apr 2025 04:33:33 +0000 (21:33 -0700)]
bcachefs: use library APIs for ChaCha20 and Poly1305
Just use the ChaCha20 and Poly1305 libraries instead of the clunky
crypto API. This is much simpler. It is also slightly faster, since
the libraries provide more direct access to the same
architecture-optimized ChaCha20 and Poly1305 code.
I've tested that existing encrypted bcachefs filesystems can be continue
to be accessed with this patch applied.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 5 Apr 2025 16:26:43 +0000 (12:26 -0400)]
bcachefs: Fix UAF in bchfs_read()
Commit 3ba0240a8789 fixed a bug in the read retry path in __bch2_read(),
and changed bchfs_read() to match - to avoid a landmine if
bch2_read_extent() ever starts returning transaction restarts.
But that was incorrect, because bchfs_read() doesn't use a separate
stack allocated bvec_iter, it uses the one in the rbio being submitted.
Add a comment explaining the issue, and revert the buggy change.
Fixes: 3ba0240a8789 ("bcachefs: Fix silent short reads in data read retry path") Reported-by: syzbot+2deb10b8dc9aae6fab67@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Thomas Weißschuh [Wed, 2 Apr 2025 20:21:57 +0000 (21:21 +0100)]
tools/include: make uapi/linux/types.h usable from assembly
The "real" linux/types.h UAPI header gracefully degrades to a NOOP when
included from assembly code.
Mirror this behaviour in the tools/ variant.
Test for __ASSEMBLER__ over __ASSEMBLY__ as the former is provided by the
toolchain automatically.
Reported-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/lkml/af553c62-ca2f-4956-932c-dd6e3a126f58@sirena.org.uk/ Fixes: c9fbaa879508 ("selftests: vDSO: parse_vdso: Use UAPI headers instead of libc headers") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Link: https://patch.msgid.link/20250321-uapi-consistency-v1-1-439070118dc0@linutronix.de Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'soundwire-6.15-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire
Pull soundwire fix from Vinod Koul:
- add missing config symbol CONFIG_SND_HDA_EXT_CORE required for asoc
driver CONFIG_SND_SOF_SOF_HDA_SDW_BPT
* tag 'soundwire-6.15-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
ASoC: SOF: Intel: Let SND_SOF_SOF_HDA_SDW_BPT select SND_HDA_EXT_CORE
Len Brown [Sun, 6 Apr 2025 18:49:20 +0000 (14:49 -0400)]
tools/power turbostat: v2025.05.06
Support up to 8192 processors
Add cpuidle governor debug telemetry, disabled by default
Update default output to exclude cpuidle invocation counts
Bug fixes
Merge tag 'perf-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event fix from Ingo Molnar:
"Fix a perf events time accounting bug"
* tag 'perf-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix child_total_time_enabled accounting bug at task exit
Merge tag 'sched-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
- Fix a nonsensical Kconfig combination
- Remove an unnecessary rseq-notification
* tag 'sched-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rseq: Eliminate useless task_work on execve
sched/isolation: Make CONFIG_CPU_ISOLATION depend on CONFIG_SMP
... and don't error out so hard on missing module descriptions.
Before commit 6c6c1fc09de3 ("modpost: require a MODULE_DESCRIPTION()")
we used to warn about missing module descriptions, but only when
building with extra warnigns (ie 'W=1').
After that commit the warning became an unconditional hard error.
And it turns out not all modules have been converted despite the claims
to the contrary. As reported by Damian Tometzki, the slub KUnit test
didn't have a module description, and apparently nobody ever really
noticed.
The reason nobody noticed seems to be that the slub KUnit tests get
disabled by SLUB_TINY, which also ends up disabling a lot of other code,
both in tests and in slub itself. And so anybody doing full build tests
didn't actually see this failre.
So let's disable SLUB_TINY for build-only tests, since it clearly ends
up limiting build coverage. Also turn the missing module descriptions
error back into a warning, but let's keep it around for non-'W=1'
builds.
Len Brown [Sun, 6 Apr 2025 15:18:39 +0000 (11:18 -0400)]
tools/power turbostat: report CoreThr per measurement interval
The CoreThr column displays total thermal throttling events
since boot time.
Change it to report events during the measurement interval.
This is more useful for showing a user the current conditions.
Total events since boot time are still available to the user via
/sys/devices/system/cpu/cpu*/thermal_throttle/*
Document CoreThr on turbostat.8
Fixes: eae97e053fe30 ("turbostat: Support thermal throttle count print") Reported-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Cc: Chen Yu <yu.c.chen@intel.com>
Justin Ernst [Wed, 19 Mar 2025 20:27:31 +0000 (15:27 -0500)]
tools/power turbostat: Increase CPU_SUBSET_MAXCPUS to 8192
On systems with >= 1024 cpus (in my case 1152), turbostat fails with the error output:
"turbostat: /sys/fs/cgroup/cpuset.cpus.effective: cpu str malformat 0-1151"
A similar error appears with the use of turbostat --cpu when the inputted cpu
range contains a cpu number >= 1024:
# turbostat -c 1100-1151
"--cpu 1100-1151" malformed
...
Both errors are caused by parse_cpu_str() reaching its limit of CPU_SUBSET_MAXCPUS.
It's a good idea to limit the maximum cpu number being parsed, but 1024 is too low.
For a small increase in compute and allocated memory, increasing CPU_SUBSET_MAXCPUS
brings support for parsing cpu numbers >= 1024.
Increase CPU_SUBSET_MAXCPUS to 8192, a common setting for CONFIG_NR_CPUS on x86_64.
Signed-off-by: Justin Ernst <justin.ernst@hpe.com> Signed-off-by: Len Brown <len.brown@intel.com>
Merge tag 'timers-cleanups-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer cleanups from Thomas Gleixner:
"A set of final cleanups for the timer subsystem:
- Convert all del_timer[_sync]() instances over to the new
timer_delete[_sync]() API and remove the legacy wrappers.
Conversion was done with coccinelle plus some manual fixups as
coccinelle chokes on scoped_guard().
- The final cleanup of the hrtimer_init() to hrtimer_setup()
conversion.
This has been delayed to the end of the merge window, so that all
patches which have been merged through other trees are in mainline
and all new users are catched.
Doing this right before rc1 ensures that new code which is merged post
rc1 is not introducing new instances of the original functionality"
* tag 'timers-cleanups-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tracing/timers: Rename the hrtimer_init event to hrtimer_setup
hrtimers: Rename debug_init_on_stack() to debug_setup_on_stack()
hrtimers: Rename debug_init() to debug_setup()
hrtimers: Rename __hrtimer_init_sleeper() to __hrtimer_setup_sleeper()
hrtimers: Remove unnecessary NULL check in hrtimer_start_range_ns()
hrtimers: Make callback function pointer private
hrtimers: Merge __hrtimer_init() into __hrtimer_setup()
hrtimers: Switch to use __htimer_setup()
hrtimers: Delete hrtimer_init()
treewide: Convert new and leftover hrtimer_init() users
treewide: Switch/rename to timer_delete[_sync]()
Merge tag 'irq-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more irq updates from Thomas Gleixner:
"A set of updates for the interrupt subsystem:
- A treewide cleanup for the irq_domain code, which makes the naming
consistent and gets rid of the original oddity of naming domains
'host'.
This is a trivial mechanical change and is done late to ensure that
all instances have been catched and new code merged post rc1 wont
reintroduce new instances.
- A trivial consistency fix in the migration code
The recent introduction of irq_force_complete_move() in the core
code, causes a problem for the nostalgia crowd who maintains ia64
out of tree.
The code assumes that hierarchical interrupt domains are enabled
and dereferences irq_data::parent_data unconditionally. That works
in mainline because both architectures which enable that code have
hierarchical domains enabled. Though it breaks the ia64 build,
which enables the functionality, but does not have hierarchical
domains.
While it's not really a problem for mainline today, this
unconditional dereference is inconsistent and trivially fixable by
using the existing helper function irqd_get_parent_data(), which
has the appropriate #ifdeffery in place"
* tag 'irq-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/migration: Use irqd_get_parent_data() in irq_force_complete_move()
irqdomain: Stop using 'host' for domain
irqdomain: Rename irq_get_default_host() to irq_get_default_domain()
irqdomain: Rename irq_set_default_host() to irq_set_default_domain()
Merge tag 'timers-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
"A revert to fix a adjtimex() regression:
The recent change to prevent that time goes backwards for the coarse
time getters due to immediate multiplier adjustments via adjtimex(),
changed the way how the timekeeping core treats that.
That change result in a regression on the adjtimex() side, which is
user space visible:
1) The forwarding of the base time moves the update out of the
original period and establishes a new one. That's changing the
behaviour of the [PF]LL control, which user space expects to be
applied periodically.
2) The clearing of the accumulated NTP error due to #1, changes the
behaviour as well.
An attempt to delay the multiplier/frequency update to the next tick
did not solve the problem as userspace expects that the multiplier or
frequency updates are in effect, when the syscall returns.
There is a different solution for the coarse time problem available,
so revert the offending commit to restore the existing adjtimex()
behaviour"
* tag 'timers-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "timekeeping: Fix possible inconsistencies in _COARSE clockids"
Merge tag 'sh-for-v6.15-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux
Pull sh updates from John Paul Adrian Glaubitz:
"One important fix and one small configuration update.
The first patch by Artur Rojek fixes an issue with the J2 firmware
loader not being able to find the location of the device tree blob due
to insufficient alignment of the .bss section which rendered J2 boards
unbootable.
The second patch by Johan Korsnes updates the defconfigs on sh to drop
the CONFIG_NET_CLS_TCINDEX configuration option which became obsolete
after 8c710f75256b ("net/sched: Retire tcindex classifier").
Summary:
- sh: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX
- sh: Align .bss section padding to 8-byte boundary"
* tag 'sh-for-v6.15-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
sh: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX
sh: Align .bss section padding to 8-byte boundary
Merge tag 'kbuild-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Improve performance in gendwarfksyms
- Remove deprecated EXTRA_*FLAGS and KBUILD_ENABLE_EXTRA_GCC_CHECKS
- Support CONFIG_HEADERS_INSTALL for ARCH=um
- Use more relative paths to sources files for better reproducibility
- Support the loong64 Debian architecture
- Add Kbuild bash completion
- Introduce intermediate vmlinux.unstripped for architectures that need
static relocations to be stripped from the final vmlinux
- Fix versioning in Debian packages for -rc releases
- Treat missing MODULE_DESCRIPTION() as an error
- Convert Nios2 Makefiles to use the generic rule for built-in DTB
- Add debuginfo support to the RPM package
* tag 'kbuild-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (40 commits)
kbuild: rpm-pkg: build a debuginfo RPM
kconfig: merge_config: use an empty file as initfile
nios2: migrate to the generic rule for built-in DTB
rust: kbuild: skip `--remap-path-prefix` for `rustdoc`
kbuild: pacman-pkg: hardcode module installation path
kbuild: deb-pkg: don't set KBUILD_BUILD_VERSION unconditionally
modpost: require a MODULE_DESCRIPTION()
kbuild: make all file references relative to source root
x86: drop unnecessary prefix map configuration
kbuild: deb-pkg: add comment about future removal of KDEB_COMPRESS
kbuild: Add a help message for "headers"
kbuild: deb-pkg: remove "version" variable in mkdebian
kbuild: deb-pkg: fix versioning for -rc releases
Documentation/kbuild: Fix indentation in modules.rst example
x86: Get rid of Makefile.postlink
kbuild: Create intermediate vmlinux build with relocations preserved
kbuild: Introduce Kconfig symbol for linking vmlinux with relocations
kbuild: link-vmlinux.sh: Make output file name configurable
kbuild: do not generate .tmp_vmlinux*.map when CONFIG_VMLINUX_MAP=y
Revert "kheaders: Ignore silly-rename files"
...
Merge tag 'drm-next-2025-04-05' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Weekly fixes, mostly from the end of last week, this week was very
quiet, maybe you scared everyone away. It's mostly amdgpu, and xe,
with some i915, adp and bridge bits, since I think this is overly
quiet I'd expect rc2 to be a bit more lively.
bridge:
- tda998x: Select CONFIG_DRM_KMS_HELPER
amdgpu:
- Guard against potential division by 0 in fan code
- Zero RPM support for SMU 14.0.2
- Properly handle SI and CIK support being disabled
- PSR fixes
- DML2 fixes
- DP Link training fix
- Vblank fixes
- RAS fixes
- Partitioning fix
- SDMA fix
- SMU 13.0.x fixes
- Rom fetching fix
- MES fixes
- Queue reset fix
xe:
- Fix NULL pointer dereference on error path
- Add missing HW workaround for BMG
- Fix survivability mode not triggering
- Fix build warning when DRM_FBDEV_EMULATION is not set
i915:
- Bounds check for scalers in DSC prefill latency computation
- Fix build by adding a missing include
adp:
- Fix error handling in plane setup"
# -----BEGIN PGP SIGNATURE-----
* tag 'drm-next-2025-04-05' of https://gitlab.freedesktop.org/drm/kernel: (34 commits)
drm/i2c: tda998x: select CONFIG_DRM_KMS_HELPER
drm/amdgpu/gfx12: fix num_mec
drm/amdgpu/gfx11: fix num_mec
drm/amd/pm: Add gpu_metrics_v1_8
drm/amdgpu: Prefer shadow rom when available
drm/amd/pm: Update smu metrics table for smu_v13_0_6
drm/amd/pm: Remove host limit metrics support
Remove unnecessary firmware version check for gc v9_4_2
drm/amdgpu: stop unmapping MQD for kernel queues v3
Revert "drm/amdgpu/sdma_v4_4_2: update VM flush implementation for SDMA"
drm/amdgpu: Parse all deferred errors with UMC aca handle
drm/amdgpu: Update ta ras block
drm/amdgpu: Add NPS2 to DPX compatible mode
drm/amdgpu: Use correct gfx deferred error count
drm/amd/display: Actually do immediate vblank disable
drm/amd/display: prevent hang on link training fail
Revert "drm/amd/display: dml2 soc dscclk use DPM table clk setting"
drm/amd/display: Increase vblank offdelay for PSR panels
drm/amd: Handle being compiled without SI or CIK support better
drm/amd/pm: Add zero RPM enabled OD setting support for SMU14.0.2
...
Uday Shankar [Mon, 31 Mar 2025 22:46:32 +0000 (16:46 -0600)]
kbuild: rpm-pkg: build a debuginfo RPM
The rpm-pkg make target currently suffers from a few issues related to
debuginfo:
1. debuginfo for things built into the kernel (vmlinux) is not available
in any RPM produced by make rpm-pkg. This makes using tools like
systemtap against a make rpm-pkg kernel impossible.
2. debug source for the kernel is not available. This means that
commands like 'disas /s' in gdb, which display source intermixed with
assembly, can only print file names/line numbers which then must be
painstakingly resolved to actual source in a separate editor.
3. debuginfo for modules is available, but it remains bundled with the
.ko files that contain module code, in the main kernel RPM. This is a
waste of space for users who do not need to debug the kernel (i.e.
most users).
Address all of these issues by additionally building a debuginfo RPM
when the kernel configuration allows for it, in line with standard
patterns followed by RPM distributors. With these changes:
1. systemtap now works (when these changes are backported to 6.11, since
systemtap lags a bit behind in compatibility), as verified by the
following simple test script:
129 op_str = blk_op_name[op];
130
131 return op_str;
132 }
0xffffffff814c8768 <+40>: jmp 0xffffffff81d01360 <__x86_return_thunk>
End of assembler dump.
3. The size of the main kernel package goes down substantially,
especially if many modules are built (quite typical). Here is a
comparison of installed size of the kernel package (configured with
allmodconfig, dwarf4 debuginfo, and module compression turned off)
before and after this patch:
# rpm -qi kernel-6.13* | grep -E '^(Version|Size)'
Version : 6.13.0postpatch+
Size : 1382874089
Version : 6.13.0prepatch+
Size : 17870795887
This is a ~92% size reduction.
Note that a debuginfo package can only be produced if the following
configs are set:
- CONFIG_DEBUG_INFO=y
- CONFIG_MODULE_COMPRESS=n
- CONFIG_DEBUG_INFO_SPLIT=n
The first of these is obvious - we can't produce debuginfo if the build
does not generate it. The second two requirements can in principle be
removed, but doing so is difficult with the current approach, which uses
a generic rpmbuild script find-debuginfo.sh that processes all packaged
executables. If we want to remove those requirements the best path
forward is likely to add some debuginfo extraction/installation logic to
the modules_install target (controllable by flags). That way, it's
easier to operate on modules before they're compressed, and the logic
can be reused by all packaging targets.
Daniel Gomez [Fri, 28 Mar 2025 14:28:37 +0000 (14:28 +0000)]
kconfig: merge_config: use an empty file as initfile
The scripts/kconfig/merge_config.sh script requires an existing
$INITFILE (or the $1 argument) as a base file for merging Kconfig
fragments. However, an empty $INITFILE can serve as an initial starting
point, later referenced by the KCONFIG_ALLCONFIG Makefile variable
if -m is not used. This variable can point to any configuration file
containing preset config symbols (the merged output) as stated in
Documentation/kbuild/kconfig.rst. When -m is used $INITFILE will
contain just the merge output requiring the user to run make (i.e.
KCONFIG_ALLCONFIG=<$INITFILE> make <allnoconfig/alldefconfig> or make
olddefconfig).
Instead of failing when `$INITFILE` is missing, create an empty file and
use it as the starting point for merges.
Signed-off-by: Daniel Gomez <da.gomez@samsung.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Sun, 22 Dec 2024 00:30:53 +0000 (09:30 +0900)]
nios2: migrate to the generic rule for built-in DTB
Commit 654102df2ac2 ("kbuild: add generic support for built-in boot
DTBs") introduced generic support for built-in DTBs.
Select GENERIC_BUILTIN_DTB when built-in DTB support is enabled.
To keep consistency across architectures, this commit also renames
CONFIG_NIOS2_DTB_SOURCE_BOOL to CONFIG_BUILTIN_DTB, and
CONFIG_NIOS2_DTB_SOURCE to CONFIG_BUILTIN_DTB_NAME.
Johan Korsnes [Sun, 23 Mar 2025 19:13:30 +0000 (20:13 +0100)]
sh: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX
This option was removed from Kconfig in 8c710f75256b ("net/sched:
Retire tcindex classifier") but from the defconfigs.
Fixes: 8c710f75256b ("net/sched: Retire tcindex classifier") Signed-off-by: Johan Korsnes <johan.korsnes@gmail.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Artur Rojek [Sun, 16 Feb 2025 17:55:44 +0000 (18:55 +0100)]
sh: Align .bss section padding to 8-byte boundary
J2-based devices expect to find a device tree blob at the end of the
.bss section. As of a77725a9a3c5 ("scripts/dtc: Update to upstream
version v1.6.1-19-g0a3a9d3449c8"), libfdt enforces 8-byte alignment
for the DTB, causing J2 devices to fail early in sh_fdt_init().
As the J2 loader firmware calculates the DTB location based on the kernel
image .bss section size rather than the __bss_stop symbol offset, the
required alignment can't be enforced with BSS_SECTION(0, PAGE_SIZE, 8).
To fix this, inline a modified version of the above macro which grows
.bss by the required size. While this change affects all existing SH
boards, it should be benign on platforms which don't need this alignment.
Signed-off-by: Artur Rojek <contact@artur-rojek.eu> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Tested-by: Rob Landley <rob@landley.net> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Merge tag 'input-for-v6.15-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
- a brand new driver for touchpads and touchbars in newer Apple devices
- support for Berlin-A series in goodix-berlin touchscreen driver
- improvements to matrix_keypad driver to better handle GPIOs toggling
- assorted small cleanups in other input drivers
* tag 'input-for-v6.15-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: goodix_berlin - add support for Berlin-A series
dt-bindings: input: goodix,gt9916: Document gt9897 compatible
dt-bindings: input: matrix_keypad - add wakeup-source property
dt-bindings: input: matrix_keypad - add missing property
Input: pm8941-pwrkey - fix dev_dbg() output in pm8941_pwrkey_irq()
Input: synaptics - hide unused smbus_pnp_ids[] array
Input: apple_z2 - fix potential confusion in Kconfig
Input: matrix_keypad - use fsleep for delays after activating columns
Input: matrix_keypad - add settle time after enabling all columns
dt-bindings: input: matrix_keypad: add settle time after enabling all columns
dt-bindings: input: matrix_keypad: convert to YAML
dt-bindings: input: Correct indentation and style in DTS example
MAINTAINERS: Add entries for Apple Z2 touchscreen driver
Input: apple_z2 - add a driver for Apple Z2 touchscreens
dt-bindings: input: touchscreen: Add Z2 controller
Input: Switch to use hrtimer_setup()
Input: drop vb2_ops_wait_prepare/finish
Nam Cao [Wed, 5 Feb 2025 10:55:20 +0000 (11:55 +0100)]
hrtimers: Rename debug_init_on_stack() to debug_setup_on_stack()
All the hrtimer_init*() functions have been renamed to hrtimer_setup*().
Rename debug_init_on_stack() to debug_setup_on_stack() as well, to keep the
names consistent.
Nam Cao [Wed, 5 Feb 2025 10:55:18 +0000 (11:55 +0100)]
hrtimers: Rename __hrtimer_init_sleeper() to __hrtimer_setup_sleeper()
All the hrtimer_init*() functions have been renamed to hrtimer_setup*().
Rename __hrtimer_init_sleeper() to __hrtimer_setup_sleeper() as well, to
keep the names consistent.
Nam Cao [Wed, 5 Feb 2025 10:55:17 +0000 (11:55 +0100)]
hrtimers: Remove unnecessary NULL check in hrtimer_start_range_ns()
The struct hrtimer::function field can only be changed using
hrtimer_setup*() or hrtimer_update_function(), and both already null-check
'function'. Therefore, null-checking 'function' in hrtimer_start_range_ns()
is not necessary.
Nam Cao [Wed, 5 Feb 2025 10:55:16 +0000 (11:55 +0100)]
hrtimers: Make callback function pointer private
Make the struct hrtimer::function field private, to prevent users from
changing this field in an unsafe way. hrtimer_update_function() should be
used if the callback function needs to be changed.
Nam Cao [Wed, 5 Feb 2025 10:55:11 +0000 (11:55 +0100)]
hrtimers: Switch to use __htimer_setup()
__hrtimer_init_sleeper() calls __hrtimer_init() and also sets up the
callback function. But there is already __hrtimer_setup() which does both
actions.
Switch to use __hrtimer_setup() to simplify the code.
Merge tag 'v6.15-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"This fixes a race condition in the newly added eip93 driver"
* tag 'v6.15-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: inside-secure/eip93 - acquire lock on eip93_put_descriptor hash
Merge tag 's390-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Vasily Gorbik:
- Fix machine check handler _CIF_MCCK_GUEST bit setting by adding the
missing base register for relocated lowcore address
- Fix build failure on older linkers by conditionally adding the
-no-pie linker option only when it is supported
- Fix inaccurate kernel messages in vfio-ap by providing descriptive
error notifications for AP queue sharing violations
- Fix PCI isolation logic by ensuring non-VF devices correctly return
false in zpci_bus_is_isolated_vf()
- Fix PCI DMA range map setup by using dma_direct_set_offset() to add a
proper sentinel element, preventing potential overruns and
translation errors
- Cleanup header dependency problems with asm-offsets.c
- Add fault info for unexpected low-address protection faults in user
mode
- Add support for HOTPLUG_SMT, replacing the arch-specific "nosmt"
handling with common code handling
- Use bitop functions to implement CPU flag helper functions to ensure
that bits cannot get lost if modified in different contexts on a CPU
- Remove unused machine_flags for the lowcore
* tag 's390-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/vfio-ap: Fix no AP queue sharing allowed message written to kernel log
s390/pci: Fix dev.dma_range_map missing sentinel element
s390/mm: Dump fault info in case of low address protection fault
s390/smp: Add support for HOTPLUG_SMT
s390: Fix linker error when -no-pie option is unavailable
s390/processor: Use bitop functions for cpu flag helper functions
s390/asm-offsets: Remove ASM_OFFSETS_C
s390/asm-offsets: Include ftrace_regs.h instead of ftrace.h
s390/kvm: Split kvm_host header file
s390/pci: Fix zpci_bus_is_isolated_vf() for non-VFs
s390/lowcore: Remove unused machine_flags
s390/entry: Fix setting _CIF_MCCK_GUEST with lowcore relocation
Merge tag '6.15-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull more smb client updates from Steve French:
- reconnect fixes: three for updating rsize/wsize and an SMB1 reconnect
fix
- RFC1001 fixes: fixing connections to nonstandard ports, and negprot
retries
- fix mfsymlinks to old servers
- make mapping of open flags for SMB1 more accurate
- permission fixes: adding retry on open for write, and one for stat to
workaround unexpected access denied
- add two new xattrs, one for retrieving SACL and one for retrieving
owner (without having to retrieve the whole ACL)
- fix mount parm validation for echo_interval
- minor cleanup (including removing now unneeded cifs_truncate_page)
* tag '6.15-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update internal version number
cifs: Implement is_network_name_deleted for SMB1
cifs: Remove cifs_truncate_page() as it should be superfluous
cifs: Do not add FILE_READ_ATTRIBUTES when using GENERIC_READ/EXECUTE/ALL
cifs: Improve SMB2+ stat() to work also without FILE_READ_ATTRIBUTES
cifs: Add fallback for SMB2 CREATE without FILE_READ_ATTRIBUTES
cifs: Fix querying and creating MF symlinks over SMB1
cifs: Fix access_flags_to_smbopen_mode
cifs: Fix negotiate retry functionality
cifs: Improve handling of NetBIOS packets
cifs: Allow to disable or force initialization of NetBIOS session
cifs: Add a new xattr system.smb3_ntsd_owner for getting or setting owner
cifs: Add a new xattr system.smb3_ntsd_sacl for getting or setting SACLs
smb: client: Update IO sizes after reconnection
smb: client: Store original IO parameters and prevent zero IO sizes
smb:client: smb: client: Add reverse mapping from tcon to superblocks
cifs: remove unreachable code in cifs_get_tcp_session()
cifs: fix integer overflow in match_server()
Merge tag 'ntb-6.15' of https://github.com/jonmason/ntb
Pull ntb fixes from Jon Mason:
"Bug fixes for NTB Switchtec driver mw negative shift, Intel NTB link
status db, ntb_perf double unmap (in error case), and MSI 64bit
arithmetic.
Also, add new AMD NTB PCI IDs, update AMD NTB maintainer, and pull in
patch to reduce the stack usage in IDT driver"
* tag 'ntb-6.15' of https://github.com/jonmason/ntb:
ntb_hw_amd: Add NTB PCI ID for new gen CPU
ntb: reduce stack usage in idt_scan_mws
ntb: use 64-bit arithmetic for the MSI doorbell mask
MAINTAINERS: Update AMD NTB maintainers
ntb_perf: Delete duplicate dmaengine_unmap_put() call in perf_copy_chunk()
ntb: intel: Fix using link status DB's
ntb_hw_switchtec: Fix shift-out-of-bounds in switchtec_ntb_mw_set_trans
Miroslav reported that the changes for handling the inconsistencies in the
coarse time getters result in a regression on the adjtimex() side.
There are two issues:
1) The forwarding of the base time moves the update out of the original
period and establishes a new one.
2) The clearing of the accumulated NTP error is changing the behaviour as
well.
Userspace expects that multiplier/frequency updates are in effect, when the
syscall returns, so delaying the update to the next tick is not solving the
problem either.
Revert the change, so that the established expectations of user space
implementations (ntpd, chronyd) are restored. The re-introduced
inconsistency of the coarse time getters will be addressed in a subsequent
fix.
Fixes: 757b000f7b93 ("timekeeping: Fix possible inconsistencies in _COARSE clockids") Reported-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/Z-qsg6iDGlcIJulJ@localhost
Merge tag 'riscv-for-linus-6.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- The sub-architecture selection Kconfig system has been cleaned up,
the documentation has been improved, and various detections have been
fixed
- The vector-related extensions dependencies are now validated when
parsing from device tree and in the DT bindings
- Misaligned access probing can be overridden via a kernel command-line
parameter, along with various fixes to misalign access handling
- Support for relocatable !MMU kernels builds
- Support for hpge pfnmaps, which should improve TLB utilization
- Support for runtime constants, which improves the d_hash()
performance
- Support for bfloat16, Zicbom, Zaamo, Zalrsc, Zicntr, Zihpm
- Various fixes, including:
- We were missing a secondary mmu notifier call when flushing the
tlb which is required for IOMMU
- Fix ftrace panics by saving the registers as expected by ftrace
- Fix a couple of stimecmp usage related to cpu hotplug
- purgatory_start is now aligned as per the STVEC requirements
- A fix for hugetlb when calculating the size of non-present PTEs
* tag 'riscv-for-linus-6.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (65 commits)
riscv: Add norvc after .option arch in runtime const
riscv: Make sure toolchain supports zba before using zba instructions
riscv/purgatory: 4B align purgatory_start
riscv/kexec_file: Handle R_RISCV_64 in purgatory relocator
selftests: riscv: fix v_exec_initval_nolibc.c
riscv: Fix hugetlb retrieval of number of ptes in case of !present pte
riscv: print hartid on bringup
riscv: Add norvc after .option arch in runtime const
riscv: Remove CONFIG_PAGE_OFFSET
riscv: Support CONFIG_RELOCATABLE on riscv32
asm-generic: Always define Elf_Rel and Elf_Rela
riscv: Support CONFIG_RELOCATABLE on NOMMU
riscv: Allow NOMMU kernels to access all of RAM
riscv: Remove duplicate CONFIG_PAGE_OFFSET definition
RISC-V: errata: Use medany for relocatable builds
dt-bindings: riscv: document vector crypto requirements
dt-bindings: riscv: add vector sub-extension dependencies
dt-bindings: riscv: d requires f
RISC-V: add f & d extension validation checks
RISC-V: add vector crypto extension validation checks
...
Merge tag 'net-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter.
Current release - regressions:
- four fixes for the netdev per-instance locking
Current release - new code bugs:
- consolidate more code between existing Rx zero-copy and uring so
that the latter doesn't miss / have to duplicate the safety checks
Previous releases - regressions:
- ipv6: fix omitted Netlink attributes when using SKIP_STATS
Previous releases - always broken:
- net: fix geneve_opt length integer overflow
- udp: fix multiple wrap arounds of sk->sk_rmem_alloc when it
approaches INT_MAX
- dsa: mvpp2: add a lock to avoid corruption of the shared TCAM
- dsa: airoha: fix issues with traffic QoS configuration / offload,
and flow table offload
Misc:
- touch up the Netlink YAML specs of old families to make them usable
for user space C codegen"
* tag 'net-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (56 commits)
selftests: net: amt: indicate progress in the stress test
netlink: specs: rt_route: pull the ifa- prefix out of the names
netlink: specs: rt_addr: pull the ifa- prefix out of the names
netlink: specs: rt_addr: fix get multi command name
netlink: specs: rt_addr: fix the spec format / schema failures
net: avoid false positive warnings in __net_mp_close_rxq()
net: move mp dev config validation to __net_mp_open_rxq()
net: ibmveth: make veth_pool_store stop hanging
arcnet: Add NULL check in com20020pci_probe()
ipv6: Do not consider link down nexthops in path selection
ipv6: Start path selection from the first nexthop
usbnet:fix NPE during rx_complete
net: octeontx2: Handle XDP_ABORTED and XDP invalid as XDP_DROP
net: fix geneve_opt length integer overflow
io_uring/zcrx: fix selftests w/ updated netdev Python helpers
selftests: net: use netdevsim in netns test
docs: net: document netdev notifier expectations
net: dummy: request ops lock
netdevsim: add dummy device notifiers
net: rename rtnl_net_debug to lock_debug
...
Merge tag 'spi-fix-v6.15-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A small collection of fixes that came in during the merge window,
everything is driver specific with nothing standing out particularly"
* tag 'spi-fix-v6.15-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: bcm2835: Restore native CS probing when pinctrl-bcm2835 is absent
spi: bcm2835: Do not call gpiod_put() on invalid descriptor
spi: cadence-qspi: revert "Improve spi memory performance"
spi: cadence: Fix out-of-bounds array access in cdns_mrvl_xspi_setup_clock()
spi: fsl-qspi: use devm function instead of driver remove
spi: SPI_QPIC_SNAND should be tristate and depend on MTD
spi-rockchip: Fix register out of bounds access
Merge tag 'soc-drivers-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull more SoC driver updates from Arnd Bergmann:
"This is the promised follow-up to the soc drivers branch, adding minor
updates to omap and freescale drivers.
Most notably, Ioana Ciornei takes over maintenance of the DPAA bus
driver used in some NXP (originally Freescale) chips"
* tag 'soc-drivers-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
bus: fsl-mc: Remove deadcode
MAINTAINERS: add the linuppc-dev list to the fsl-mc bus entry
MAINTAINERS: fix nonexistent dtbinding file name
MAINTAINERS: add myself as maintainer for the fsl-mc bus
irqdomain: soc: Switch to irq_find_mapping()
Input: tsc2007 - accept standard properties
Merge tag 'platform-drivers-x86-v6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
- thinkpad_acpi:
- Fix NULL pointer dereferences while probing
- Disable ACPI fan access for T495* and E560
- ISST: Correct command storage data length
* tag 'platform-drivers-x86-v6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
MAINTAINERS: consistently use my dedicated email address
platform/x86: ISST: Correct command storage data length
platform/x86: thinkpad_acpi: disable ACPI fan access for T495* and E560
platform/x86: thinkpad_acpi: Fix NULL pointer dereferences while probing
Thomas Gleixner [Fri, 4 Apr 2025 14:51:19 +0000 (16:51 +0200)]
genirq/migration: Use irqd_get_parent_data() in irq_force_complete_move()
Frank reported, that the common irq_force_complete_move() breaks the out of
tree build of ia64. The reason is that ia64 uses the migration code, but
does not have hierarchical interrupt domains enabled.
This went unnoticed in mainline as both x86 and RISC-V have hierarchical
domains enabled. Not that it matters for mainline, but it's still
inconsistent.
Use irqd_get_parent_data() instead of accessing the parent_data field
directly. The helper returns NULL when hierarchical domains are disabled
otherwise it accesses the parent_data field of the domain.
No functional change.
Fixes: 751dc837dabd ("genirq: Introduce common irq_force_complete_move() implementation") Reported-by: Frank Scheiner <frank.scheiner@web.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Frank Scheiner <frank.scheiner@web.de> Link: https://lore.kernel.org/all/87h634ugig.ffs@tglx
Jakub Kicinski [Thu, 3 Apr 2025 14:56:36 +0000 (07:56 -0700)]
selftests: net: amt: indicate progress in the stress test
Our CI expects output from the test at least once every 10 minutes.
The AMT test when running on debug kernel is just on the edge
of that time for the stress test. Improve the output:
- print the name of the test first, before starting it,
- output a dot every 10% of the way.
Output after:
TEST: amt discovery [ OK ]
TEST: IPv4 amt multicast forwarding [ OK ]
TEST: IPv6 amt multicast forwarding [ OK ]
TEST: IPv4 amt traffic forwarding torture .......... [ OK ]
TEST: IPv6 amt traffic forwarding torture .......... [ OK ]
It is confusing to see 'host' and 'domain' to be used as 'domain'. Given
this header is all about domains, switch the remaining 'host' uses to
'domain'.
Jakub Kicinski [Thu, 3 Apr 2025 01:37:06 +0000 (18:37 -0700)]
netlink: specs: rt_route: pull the ifa- prefix out of the names
YAML specs don't normally include the C prefix name in the name
of the YAML attr. Remove the ifa- prefix from all attributes
in route-attrs and metrics and specify name-prefix instead.
This is a bit risky, hopefully there aren't many users out there.
Jakub Kicinski [Thu, 3 Apr 2025 01:37:05 +0000 (18:37 -0700)]
netlink: specs: rt_addr: pull the ifa- prefix out of the names
YAML specs don't normally include the C prefix name in the name
of the YAML attr. Remove the ifa- prefix from all attributes
in addr-attrs and specify name-prefix instead.
This is a bit risky, hopefully there aren't many users out there.
====================
net: make memory provider install / close paths more common
We seem to be fixing bugs in config path for devmem which also exist
in the io_uring ZC path. Let's try to make the two paths more common,
otherwise this is bound to keep happening.
Jakub Kicinski [Thu, 3 Apr 2025 01:34:05 +0000 (18:34 -0700)]
net: avoid false positive warnings in __net_mp_close_rxq()
Commit under Fixes solved the problem of spurious warnings when we
uninstall an MP from a device while its down. The __net_mp_close_rxq()
which is used by io_uring was not fixed. Move the fix over and reuse
__net_mp_close_rxq() in the devmem path.
Acked-by: Stanislav Fomichev <sdf@fomichev.me> Fixes: a70f891e0fa0 ("net: devmem: do not WARN conditionally after netdev_rx_queue_restart()") Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20250403013405.2827250-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 3 Apr 2025 01:34:04 +0000 (18:34 -0700)]
net: move mp dev config validation to __net_mp_open_rxq()
devmem code performs a number of safety checks to avoid having
to reimplement all of them in the drivers. Move those to
__net_mp_open_rxq() and reuse that function for binding to make
sure that io_uring ZC also benefits from them.
While at it rename the queue ID variable to rxq_idx in
__net_mp_open_rxq(), we touch most of the relevant lines.
The XArray insertion is reordered after the netdev_rx_queue_restart()
call, otherwise we'd need to duplicate the queue index check
or risk inserting an invalid pointer. The XArray allocation
failures should be extremely rare.
Reviewed-by: Mina Almasry <almasrymina@google.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Fixes: 6e18ed929d3b ("net: add helpers for setting a memory provider on an rx queue") Link: https://patch.msgid.link/20250403013405.2827250-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dave Marquardt [Wed, 2 Apr 2025 15:44:03 +0000 (10:44 -0500)]
net: ibmveth: make veth_pool_store stop hanging
v2:
- Created a single error handling unlock and exit in veth_pool_store
- Greatly expanded commit message with previous explanatory-only text
Summary: Use rtnl_mutex to synchronize veth_pool_store with itself,
ibmveth_close and ibmveth_open, preventing multiple calls in a row to
napi_disable.
Background: Two (or more) threads could call veth_pool_store through
writing to /sys/devices/vio/30000002/pool*/*. You can do this easily
with a little shell script. This causes a hang.
I configured LOCKDEP, compiled ibmveth.c with DEBUG, and built a new
kernel. I ran this test again and saw:
The control APIs are not idempotent. Control API calls are safe
against concurrent use of datapath APIs but an incorrect sequence of
control API calls may result in crashes, deadlocks, or race
conditions. For example, calling napi_disable() multiple times in a
row will deadlock.
In the normal open and close paths, rtnl_mutex is acquired to prevent
other callers. This is missing from veth_pool_store. Use rtnl_mutex in
veth_pool_store fixes these hangs.
Signed-off-by: Dave Marquardt <davemarq@linux.ibm.com> Fixes: 860f242eb534 ("[PATCH] ibmveth change buffer pools dynamically") Reviewed-by: Nick Child <nnac123@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250402154403.386744-1-davemarq@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Henry Martin [Wed, 2 Apr 2025 13:50:36 +0000 (21:50 +0800)]
arcnet: Add NULL check in com20020pci_probe()
devm_kasprintf() returns NULL when memory allocation fails. Currently,
com20020pci_probe() does not check for this case, which results in a
NULL pointer dereference.
Add NULL check after devm_kasprintf() to prevent this issue and ensure
no resources are left allocated.
ipv6: Do not consider link down nexthops in path selection
Nexthops whose link is down are not supposed to be considered during
path selection when the "ignore_routes_with_linkdown" sysctl is set.
This is done by assigning them a negative region boundary.
However, when comparing the computed hash (unsigned) with the region
boundary (signed), the negative region boundary is treated as unsigned,
resulting in incorrect nexthop selection.
Fix by treating the computed hash as signed. Note that the computed hash
is always in range of [0, 2^31 - 1].
Fixes: 3d709f69a3e7 ("ipv6: Use hash-threshold instead of modulo-N") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250402114224.293392-3-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cited commit transitioned IPv6 path selection to use hash-threshold
instead of modulo-N. With hash-threshold, each nexthop is assigned a
region boundary in the multipath hash function's output space and a
nexthop is chosen if the calculated hash is smaller than the nexthop's
region boundary.
Hash-threshold does not work correctly if path selection does not start
with the first nexthop. For example, if fib6_select_path() is always
passed the last nexthop in the group, then it will always be chosen
because its region boundary covers the entire hash function's output
space.
Fix this by starting the selection process from the first nexthop and do
not consider nexthops for which rt6_score_route() provided a negative
score.
Fixes: 3d709f69a3e7 ("ipv6: Use hash-threshold instead of modulo-N") Reported-by: Stanislav Fomichev <stfomichev@gmail.com> Closes: https://lore.kernel.org/netdev/Z9RIyKZDNoka53EO@mini-arch/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250402114224.293392-2-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ying Lu [Wed, 2 Apr 2025 08:58:59 +0000 (16:58 +0800)]
usbnet:fix NPE during rx_complete
Missing usbnet_going_away Check in Critical Path.
The usb_submit_urb function lacks a usbnet_going_away
validation, whereas __usbnet_queue_skb includes this check.
This inconsistency creates a race condition where:
A URB request may succeed, but the corresponding SKB data
fails to be queued.
Subsequent processes:
(e.g., rx_complete → defer_bh → __skb_unlink(skb, list))
attempt to access skb->next, triggering a NULL pointer
dereference (Kernel Panic).
Lorenzo Bianconi [Tue, 1 Apr 2025 09:02:12 +0000 (11:02 +0200)]
net: octeontx2: Handle XDP_ABORTED and XDP invalid as XDP_DROP
In the current implementation octeontx2 manages XDP_ABORTED and XDP
invalid as XDP_PASS forwarding the skb to the networking stack.
Align the behaviour to other XDP drivers handling XDP_ABORTED and XDP
invalid as XDP_DROP.
Please note this patch has just compile tested.
Merge tag 'x86-urgent-2025-04-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Fix a performance regression on AMD iGPU and dGPU drivers, related to
the unintended activation of DMA bounce buffers that regressed game
performance if KASLR disturbed things just enough
- Fix a copy_user_generic() performance regression on certain older
non-FSRM/ERMS CPUs
- Fix a Clang build warning due to a semantic merge conflict the Kunit
tree generated with the x86 tree
- Fix FRED related system hang during S4 resume
- Remove an unused API
* tag 'x86-urgent-2025-04-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fred: Fix system hang during S4 resume with FRED enabled
x86/platform/iosf_mbi: Remove unused iosf_mbi_unregister_pmic_bus_access_notifier()
x86/mm/init: Handle the special case of device private pages in add_pages(), to not increase max_pfn and trigger dma_addressing_limited() bounce buffers
x86/tools: Drop duplicate unlikely() definition in insn_decoder_test.c
x86/uaccess: Improve performance by aligning writes to 8 bytes in copy_user_generic(), on non-FSRM/ERMS CPUs
Merge tag 'sound-fix-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of device-specific fixes that have been gathered since
the previous pull:
- A few more HD-audio quirks and fixups
- A series of Qualcomm AudioReach fixes
- Various small fixes for ASoC rt5665, WSA, SOF and Cirrus"
* tag 'sound-fix-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek: Fix built-in mic on another ASUS VivoBook model
ALSA: hda/realtek - Support mute led function for HP platform
ASoC: imx-card: Add NULL check in imx_card_probe()
ASoC: codecs: rt5665: Fix some error handling paths in rt5665_probe()
ASoC: q6apm-dai: make use of q6apm_get_hw_pointer
ASoC: qdsp6: q6apm-dai: fix capture pipeline overruns.
ASoC: qdsp6: q6apm-dai: set 10 ms period and buffer alignment.
ASoC: q6apm: add q6apm_get_hw_pointer helper
ASoC: q6apm-dai: schedule all available frames to avoid dsp under-runs
ASoC: SOF: hda/ptl: Move mic privacy change notification sending to a work
ALSA/hda: intel-sdw-acpi: Remove (explicitly) unused header
ALSA: hda/realtek: Enable Mute LED on HP OMEN 16 Laptop xd000xx
ALSA: hda/tas2781: Upgrade calibratd-data writing code to support Alpha and Beta dsp firmware
ASoC: qdsp6: q6asm-dai: fix q6asm_dai_compr_set_params error path
ALSA: hda/realtek: Fix built-in mic breakage on ASUS VivoBook X515JA
ASoC: sma1307: Fix error handling in sma1307_setting_loaded()
ASoC: codecs: wsa884x: Correct VI sense channel mask
ASoC: codecs: wsa883x: Correct VI sense channel mask
firmware: cs_dsp: Ensure cs_dsp_load[_coeff]() returns 0 on success
Merge tag 'omap-for-v6.14/drivers-signed' of https://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-omap into soc/drivers-2
arm/omap: drivers: updates for v6.14
* tag 'omap-for-v6.14/drivers-signed' of https://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-omap:
Input: tsc2007 - accept standard properties
Merge tag 'soc_fsl-6.15-1' of https://github.com/chleroy/linux into soc/drivers-2
FSL SOC Changes for 6.15:
- irqdomain cleanups from Jiry
- Add Ioana as Maintainer of fsl-mc bus and remove Laurentiu and Stuart
- Remove deadcode from fsl-mc bus
* tag 'soc_fsl-6.15-1' of https://github.com/chleroy/linux:
bus: fsl-mc: Remove deadcode
MAINTAINERS: add the linuppc-dev list to the fsl-mc bus entry
MAINTAINERS: fix nonexistent dtbinding file name
MAINTAINERS: add myself as maintainer for the fsl-mc bus
irqdomain: soc: Switch to irq_find_mapping()
Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull dcache fixes from Al Viro:
"Fixes for bugs caught as part of tree-in-dcache work.
Mostly dentry refcount mishandling"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
hypfs_create_cpu_files(): add missing check for hypfs_mkdir() failure
qibfs: fix _another_ leak
spufs: fix a leak in spufs_create_context()
spufs: fix gang directory lifetimes
spufs: fix a leak on spufs_new_file() failure
Jakub Kicinski [Thu, 3 Apr 2025 23:23:00 +0000 (16:23 -0700)]
Merge tag 'nf-25-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following batch contains Netfilter fixes for net:
1) conncount incorrectly removes element for non-dynamic sets,
these elements represent a static control plane configuration,
leave them in place.
2) syzbot found a way to unregister a basechain that has been never
registered from the chain update path, fix from Florian Westphal.
3) Fix incorrect pointer arithmetics in geneve support for tunnel,
from Lin Ma.
* tag 'nf-25-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nft_tunnel: fix geneve_opt type confusion addition
netfilter: nf_tables: don't unregister hook when table is dormant
netfilter: nft_set_hash: GC reaps elements with conncount for dynamic sets only
====================
Merge tag 'v6.15rc-part2-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
"Four ksmbd SMB3 server fixes, all also for stable"
* tag 'v6.15rc-part2-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix null pointer dereference in alloc_preauth_hash()
ksmbd: validate zero num_subauth before sub_auth is accessed
ksmbd: fix overflow in dacloffset bounds check
ksmbd: fix session use-after-free in multichannel connection
Merge tag 'trace-ringbuffer-v6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ring-buffer updates from Steven Rostedt:
"Persistent buffer cleanups and simplifications.
It was mistaken that the physical memory returned from "reserve_mem"
had to be vmap()'d to get to it from a virtual address. But
reserve_mem already maps the memory to the virtual address of the
kernel so a simple phys_to_virt() can be used to get to the virtual
address from the physical memory returned by "reserve_mem". With this
new found knowledge, the code can be cleaned up and simplified.
- Enforce that the persistent memory is page aligned
As the buffers using the persistent memory are all going to be
mapped via pages, make sure that the memory given to the tracing
infrastructure is page aligned. If it is not, it will print a
warning and fail to map the buffer.
- Use phys_to_virt() to get the virtual address from reserve_mem
Instead of calling vmap() on the physical memory returned from
"reserve_mem", use phys_to_virt() instead.
As the memory returned by "memmap" or any other means where a
physical address is given to the tracing infrastructure, it still
needs to be vmap(). Since this memory can never be returned back to
the buddy allocator nor should it ever be memmory mapped to user
space, flag this buffer and up the ref count. The ref count will
keep it from ever being freed, and the flag will prevent it from
ever being memory mapped to user space.
- Use vmap_page_range() for memmap virtual address mapping
For the memmap buffer, instead of allocating an array of struct
pages, assigning them to the contiguous phsycial memory and then
passing that to vmap(), use vmap_page_range() instead
- Replace flush_dcache_folio() with flush_kernel_vmap_range()
Instead of calling virt_to_folio() and passing that to
flush_dcache_folio(), just call flush_kernel_vmap_range() directly.
This also fixes a bug where if a subbuffer was bigger than
PAGE_SIZE only the PAGE_SIZE portion would be flushed"
* tag 'trace-ringbuffer-v6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ring-buffer: Use flush_kernel_vmap_range() over flush_dcache_folio()
tracing: Use vmap_page_range() to map memmap ring buffer
tracing: Have reserve_mem use phys_to_virt() and separate from memmap buffer
tracing: Enforce the persistent ring buffer to be page aligned
Merge tag 'block-6.15-20250403' of git://git.kernel.dk/linux
Pull more block updates from Jens Axboe:
- NVMe pull request via Keith:
- PCI endpoint target cleanup (Damien)
- Early import for uring_cmd fixed buffer (Caleb)
- Multipath documentation and notification improvements (John)
- Invalid pci sq doorbell write fix (Maurizio)
- Queue init locking fix
- Remove dead nsegs parameter from blk_mq_get_new_requests()
* tag 'block-6.15-20250403' of git://git.kernel.dk/linux:
block: don't grab elevator lock during queue initialization
nvme-pci: skip nvme_write_sq_db on empty rqlist
nvme-multipath: change the NVME_MULTIPATH config option
nvme: update the multipath warning in nvme_init_ns_head
nvme/ioctl: move fixed buffer lookup to nvme_uring_cmd_io()
nvme/ioctl: move blk_mq_free_request() out of nvme_map_user_request()
nvme/ioctl: don't warn on vectorized uring_cmd with fixed buffer
nvmet: pci-epf: Keep completion queues mapped
block: remove unused nseg parameter
For igc:
Joe Damato removes unmapping of XSK queues from NAPI instance.
Zdenek Bouska swaps condition checks/call to prevent AF_XDP Tx drops
with low budget value.
For e1000e:
Vitaly adjusts Kumeran interface configuration to prevent MDI errors.
For ixgbe:
Piotr clears PHY high values on media type detection to ensure stale
values are not used.
For idpf:
Emil adjusts shutdown calls to prevent NULL pointer dereference.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
idpf: fix adapter NULL pointer dereference on reboot
ixgbe: fix media type detection for E610 device
e1000e: change k1 configuration on MTP and later platforms
igc: Fix TX drops in XDP ZC
igc: Fix XSK queue NAPI ID mapping
====================
Merge tag 'io_uring-6.15-20250403' of git://git.kernel.dk/linux
Pull more io_uring updates from Jens Axboe:
"Set of fixes/updates for io_uring that should go into this release.
The ublk bits could've gone via either tree - usually I put them in
block, but they got a bit mixed this series with the zero-copy
supported that ended up dipping into both trees.
This contains:
- Fix for sendmsg zc, include in pinned pages accounting like we do
for the other zc types
- Series for ublk fixing request aborting, doing various little
cleanups, fixing some zc issues, and adding queue_rqs support
- Another ublk series doing some code cleanups
- Series cleaning up the io_uring send path, mostly in preparation
for registered buffers
- Series doing little MSG_RING cleanups
- Fix for the newly added zc rx, fixing len being 0 for the last
invocation of the callback
- Add vectored registered buffer support for ublk. With that, then
ublk also supports this feature in the kernel revision where it
could generically introduced for rw/net
- A bunch of selftest additions for ublk. This is the majority of the
diffstat
- Silence a KCSAN data race warning for io-wq
- Various little cleanups and fixes"
* tag 'io_uring-6.15-20250403' of git://git.kernel.dk/linux: (44 commits)
io_uring: always do atomic put from iowq
selftests: ublk: enable zero copy for stripe target
io_uring: support vectored kernel fixed buffer
block: add for_each_mp_bvec()
io_uring: add validate_fixed_range() for validate fixed buffer
selftests: ublk: kublk: fix an error log line
selftests: ublk: kublk: use ioctl-encoded opcodes
io_uring/zcrx: return early from io_zcrx_recv_skb if readlen is 0
io_uring/net: avoid import_ubuf for regvec send
io_uring/rsrc: check size when importing reg buffer
io_uring: cleanup {g,s]etsockopt sqe reading
io_uring: hide caches sqes from drivers
io_uring: make zcrx depend on CONFIG_IO_URING
io_uring: add req flag invariant build assertion
Documentation: ublk: remove dead footnote
selftests: ublk: specify io_cmd_buf pointer type
ublk: specify io_cmd_buf pointer type
io_uring: don't pass ctx to tw add remote helper
io_uring/msg: initialise msg request opcode
io_uring/msg: rename io_double_lock_ctx()
...
Lin Ma [Wed, 2 Apr 2025 16:56:32 +0000 (00:56 +0800)]
net: fix geneve_opt length integer overflow
struct geneve_opt uses 5 bit length for each single option, which
means every vary size option should be smaller than 128 bytes.
However, all current related Netlink policies cannot promise this
length condition and the attacker can exploit a exact 128-byte size
option to *fake* a zero length option and confuse the parsing logic,
further achieve heap out-of-bounds read.
Fix these issues by enforing correct length condition in related
policies.
Fixes: 925d844696d9 ("netfilter: nft_tunnel: add support for geneve opts") Fixes: 4ece47787077 ("lwtunnel: add options setting and dumping for geneve") Fixes: 0ed5269f9e41 ("net/sched: add tunnel option support to act_tunnel_key") Fixes: 0a6e77784f49 ("net/sched: allow flower to match tunnel options") Signed-off-by: Lin Ma <linma@zju.edu.cn> Reviewed-by: Xin Long <lucien.xin@gmail.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Link: https://patch.msgid.link/20250402165632.6958-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>