Linus Torvalds [Fri, 28 Mar 2025 19:41:36 +0000 (12:41 -0700)]
Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC fixes from Eric Biggers:
"Fix out-of-scope array bugs in arm and arm64's crc_t10dif_arch()"
* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
arm64/crc-t10dif: fix use of out-of-scope array in crc_t10dif_arch()
arm/crc-t10dif: fix use of out-of-scope array in crc_t10dif_arch()
Linus Torvalds [Fri, 28 Mar 2025 19:37:13 +0000 (12:37 -0700)]
Merge tag 'landlock-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock updates from Mickaël Salaün:
"This brings two main changes to Landlock:
- A signal scoping fix with a new interface for user space to know if
it is compatible with the running kernel.
- Audit support to give visibility on why access requests are denied,
including the origin of the security policy, missing access rights,
and description of object(s). This was designed to limit log spam
as much as possible while still alerting about unexpected blocked
access.
With these changes come new and improved documentation, and a lot of
new tests"
* tag 'landlock-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: (36 commits)
landlock: Add audit documentation
selftests/landlock: Add audit tests for network
selftests/landlock: Add audit tests for filesystem
selftests/landlock: Add audit tests for abstract UNIX socket scoping
selftests/landlock: Add audit tests for ptrace
selftests/landlock: Test audit with restrict flags
selftests/landlock: Add tests for audit flags and domain IDs
selftests/landlock: Extend tests for landlock_restrict_self(2)'s flags
selftests/landlock: Add test for invalid ruleset file descriptor
samples/landlock: Enable users to log sandbox denials
landlock: Add LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF
landlock: Add LANDLOCK_RESTRICT_SELF_LOG_*_EXEC_* flags
landlock: Log scoped denials
landlock: Log TCP bind and connect denials
landlock: Log truncate and IOCTL denials
landlock: Factor out IOCTL hooks
landlock: Log file-related denials
landlock: Log mount-related denials
landlock: Add AUDIT_LANDLOCK_DOMAIN and log domain status
landlock: Add AUDIT_LANDLOCK_ACCESS and log ptrace denials
...
Linus Torvalds [Fri, 28 Mar 2025 19:09:33 +0000 (12:09 -0700)]
Merge tag 'caps-pr-20250327' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux
Pull capabilities update from Serge Hallyn:
"This contains just one patch that removes a helper function whose last
user (smack) stopped using it in 2018"
* tag 'caps-pr-20250327' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux:
capability: Remove unused has_capability
Linus Torvalds [Fri, 28 Mar 2025 19:06:58 +0000 (12:06 -0700)]
Merge tag 'integrity-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull ima updates from Mimi Zohar:
"Two performance improvements, which minimize the number of integrity
violations"
* tag 'integrity-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
ima: limit the number of ToMToU integrity violations
ima: limit the number of open-writers integrity violations
Thomas says:
"I just noticed that for some incomprehensible reason, probably sheer
incompetence when trying to utilize b4, I managed to merge an outdated
_and_ buggy version of that series.
Can you please revert that merge completely?"
Done.
Requested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 28 Mar 2025 03:20:15 +0000 (20:20 -0700)]
Merge tag 'm68knommu-for-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
Pull m68knommu updates from Greg Ungerer:
- remove unused include of linux/fb.h
- use strscpy() instead of strncpy()
* tag 'm68knommu-for-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k: mm: Replace deprecated strncpy() with strscpy()
m68k: Do not include <linux/fb.h>
Linus Torvalds [Fri, 28 Mar 2025 02:39:08 +0000 (19:39 -0700)]
Merge tag 'powerpc-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Madhavan Srinivasan:
- Remove support for IBM Cell Blades
- SMP support for microwatt platform
- Support for inline static calls on PPC32
- Enable pmu selftests for power11 platform
- Enable hardware trace macro (HTM) hcall support
- Support for limited address mode capability
- Changes to RMA size from 512 MB to 768 MB to handle fadump
- Misc fixes and cleanups
Thanks to Abhishek Dubey, Amit Machhiwal, Andreas Schwab, Arnd Bergmann,
Athira Rajeev, Avnish Chouhan, Christophe Leroy, Disha Goel, Donet Tom,
Gaurav Batra, Gautam Menghani, Hari Bathini, Kajol Jain, Kees Cook,
Mahesh Salgaonkar, Michael Ellerman, Paul Mackerras, Ritesh Harjani
(IBM), Sathvika Vasireddy, Segher Boessenkool, Sourabh Jain, Vaibhav
Jain, and Venkat Rao Bagalkote.
* tag 'powerpc-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (61 commits)
powerpc/kexec: fix physical address calculation in clear_utlb_entry()
crypto: powerpc: Mark ghashp8-ppc.o as an OBJECT_FILES_NON_STANDARD
powerpc: Fix 'intra_function_call not a direct call' warning
powerpc/perf: Fix ref-counting on the PMU 'vpa_pmu'
KVM: PPC: Enable CAP_SPAPR_TCE_VFIO on pSeries KVM guests
powerpc/prom_init: Fixup missing #size-cells on PowerBook6,7
powerpc/microwatt: Add SMP support
powerpc: Define config option for processors with broadcast TLBIE
powerpc/microwatt: Define an idle power-save function
powerpc/microwatt: Device-tree updates
powerpc/microwatt: Select COMMON_CLK in order to get the clock framework
net: toshiba: Remove reference to PPC_IBM_CELL_BLADE
net: spider_net: Remove powerpc Cell driver
cpufreq: ppc_cbe: Remove powerpc Cell driver
genirq: Remove IRQ_EDGE_EOI_HANDLER
docs: Remove reference to removed CBE_CPUFREQ_SPU_GOVERNOR
powerpc: Remove UDBG_RTAS_CONSOLE
powerpc/io: Use standard barrier macros in io.c
powerpc/io: Rename _insw_ns() etc.
powerpc/io: Use generic raw accessors
...
Linus Torvalds [Fri, 28 Mar 2025 02:31:34 +0000 (19:31 -0700)]
Merge tag 'probes-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes updates from Masami Hiramatsu:
- probe-events: Add comments about entry data storing code to clarify
where and how the entry data is stored for function return events.
- probe-events: Log an error when the number of arguments is exceeded,
to help users identify the cause via the tracefs/error_log file.
- Improve the ftracetest selftests:
- Expand the tprobe event test to check if it can correctly find the
wrong format tracepoint name.
- Add new syntax error test to check whether error_log correctly
indicates a wrong character in the tracepoint name.
- Add a new dynamic events argument limitation test case which
checks max number of probe arguments.
* tag 'probes-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: probe-events: Add comments about entry data storing code
selftests/ftrace: Add dynamic events argument limitation test case
selftests/ftrace: Add new syntax error test
selftests/ftrace: Expand the tprobe event test to check wrong format
tracing: probe-events: Log error for exceeding the number of arguments
Linus Torvalds [Fri, 28 Mar 2025 02:26:10 +0000 (19:26 -0700)]
Merge tag 'livepatching-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching
Pull livepatching updates from Petr Mladek:
- Add a selftest for tracing of a livepatched function
- Skip a selftest when kprobes are not using ftrace
- Some documentation clean up
* tag 'livepatching-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
selftests: livepatch: test if ftrace can trace a livepatched function
selftests: livepatch: add new ftrace helpers functions
selftest/livepatch: Only run test-kprobe with CONFIG_KPROBES_ON_FTRACE
docs: livepatch: move text out of code block
livepatch: Add comment to clarify klp_add_nops()
Linus Torvalds [Fri, 28 Mar 2025 02:22:24 +0000 (19:22 -0700)]
Merge tag 'printk-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
- New option "printk.debug_non_panic_cpus" allows storing printk
messages from non-panic CPUs during panic. It might be useful when
panic() fails. It is disabled by default because leaving it off
increases the chance of seeing the messages printed before panic()
and on the panic CPU.
- New build option "CONFIG_NULL_TTY_DEFAULT_CONSOLE" allows building
a kernel without virtual terminal support that prefers ttynull
over the serial console.
- Do not unblank suspended consoles.
- Some code clean up.
* tag 'printk-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk/panic: Add option to allow non-panic CPUs to write to the ring buffer.
printk: Add an option to allow ttynull to be a default console device
printk: Check CON_SUSPEND when unblanking a console
printk: Rename console_start to console_resume
printk: Rename console_stop to console_suspend
printk: Rename resume_console to console_resume_all
printk: Rename suspend_console to console_suspend_all
Linus Torvalds [Fri, 28 Mar 2025 02:06:07 +0000 (19:06 -0700)]
Merge tag 'linux_kselftest-kunit-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit updates from Shuah Khan:
"kunit tool:
- Changes to kunit tool to use qboot on QEMU x86_64, and build GDB
scripts
- Fixes kunit tool bug in parsing test plan
- Adds test to kunit tool to check parsing late test plan
kunit:
- Clarifies kunit_skip() argument name
- Adds Kunit check for the longest symbol length
- Changes qemu_configs for sparc to use Zilog console"
* tag 'linux_kselftest-kunit-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: tool: add test to check parsing late test plan
kunit: tool: Fix bug in parsing test plan
Kunit to check the longest symbol length
kunit: Clarify kunit_skip() argument name
kunit: tool: Build GDB scripts
kunit: qemu_configs: sparc: use Zilog console
kunit: tool: Use qboot on QEMU x86_64
Linus Torvalds [Fri, 28 Mar 2025 01:57:58 +0000 (18:57 -0700)]
Merge tag 'linux_kselftest-next-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest updates from Shuah Khan:
- Fix bugs and clean up code in tracing, ftrace, and user_events tests
- Add missing executables to ftrace gitignore
* tag 'linux_kselftest-next-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/ftrace: add 'poll' binary to gitignore
selftests/ftrace: Use readelf to find entry point in uprobe test
selftests/user_events: Fix failures caused by test code
selftests/tracing: Allow some more tests to run in instances
selftests/ftrace: Clean up triggers after setting them
selftests/tracing: Test only toplevel README file not the instances
Linus Torvalds [Fri, 28 Mar 2025 01:22:46 +0000 (18:22 -0700)]
Merge tag 'ktest-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest
Pull ktest update from Steven Rostedt:
- Fix failure when the directory of the log file does not exist
If a LOG_FILE option is set for ktest to log its messages and the
directory path does not exist, ktest fails. Have ktest attempt
to create the directory where the log file should live and, if that
succeeds, continue testing.
* tag 'ktest-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest: Fix Test Failures Due to Missing LOG_FILE Directories
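The LOG_FILE behavior described in the ktest pull above can be illustrated with a minimal sketch. ktest itself is a Perl script, so this is not its code; the function name open_log_file is hypothetical and only shows the "create the missing directory, then continue" idea.

```python
import os


def open_log_file(log_file):
    """Open log_file for appending, creating its directory first if needed.

    Hypothetical helper: mirrors the described fix, not ktest's actual code.
    """
    directory = os.path.dirname(log_file)
    if directory:
        # exist_ok=True makes this a no-op when the directory is already there.
        os.makedirs(directory, exist_ok=True)
    return open(log_file, "a")
```

If the directory cannot be created, os.makedirs raises OSError, which corresponds to the case where testing does not continue.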
Linus Torvalds [Fri, 28 Mar 2025 00:03:01 +0000 (17:03 -0700)]
Merge tag 'trace-tools-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing tooling updates from Steven Rostedt:
- Allow RTLA to collect data via BPF
The current implementation of rtla uses libtracefs and libtraceevent
to pull sample events generated by the timerlat tracer from the trace
buffer. rtla then processes the sample by updating the histogram and
summary (current, maximum, minimum, and sum values) as well as checks
if tracing has been stopped due to threshold overflow.
In use cases where a large number of samples is being generated, that
is, with measurements running on many CPUs and with a low interval,
this sample processing design causes a significant CPU load on the
rtla side. Furthermore, with >100 CPUs and 100us interval, rtla was
reported as not being able to keep up with the samples and dropping
most of them, leading to it being unusable.
Change the way the timerlat trace processes samples by attaching a
BPF program to the trace event using the BPF skeleton feature of
bpftool. Unlike the current implementation, the BPF implementation
does not check whether tracing is stopped (in BPF mode, tracing is
always off to improve performance), but waits for a write to a BPF
ringbuffer instead. This allows rtla to exit immediately when a
threshold is violated, without waiting for the next iteration of the
while loop.
If the requirements for the BPF implementation are not met, either at
build time or at run time, the current implementation is used as
fallback. Which implementation is being used can be seen when running
rtla timerlat with "-D" option. rtla can be forced to run in non-BPF
mode by setting the RTLA_NO_BPF option to 1, for debugging purposes.
- Fix LD_FLAGS from being dropped in build
- Refactor code to remove duplication of save_trace_to_file
- Always set options and do not rely on default settings
Do not rely on the default kernel settings of the tracers when
starting. They could have been changed by the user which gives
inconsistent results. Always set the options that rtla expects.
- Add creation of ctags and TAGS for traversing code
* tag 'trace-tools-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rtla: Add the ability to create ctags and etags
rtla/tests: Test setting default options
rtla/tests: Reset osnoise options before check
rtla: Always set all tracer options
rtla/osnoise: Set OSNOISE_WORKLOAD to true
rtla: Unify apply_config between top and hist
rtla/osnoise: Unify params struct
rtla: Fix segfault in save_trace_to_file call
tools/build: Use SYSTEM_BPFTOOL for system bpftool
rtla: Refactor save_trace_to_file
tools/rv: Keep user LDFLAGS in build
rtla/timerlat: Test BPF mode
rtla/timerlat_top: Use BPF to collect samples
rtla/timerlat_top: Move divisor to update
rtla/timerlat_hist: Use BPF to collect samples
rtla/timerlat: Add BPF skeleton to collect samples
rtla: Add optional dependency on BPF tooling
tools/build: Add bpftool-skeletons feature test
rtla/timerlat: Unify params struct
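The per-sample work described in the rtla pull above (updating the histogram and the current, maximum, minimum, and sum values, then checking the stop threshold) can be sketched as follows. This is an illustration only, not rtla's C code or its BPF program; the class name, field names, and the overflow bucket are assumptions.

```python
class LatencySummary:
    """Running summary of latency samples in microseconds (hypothetical)."""

    def __init__(self, bucket_size_us=1, entries=256):
        self.bucket_size_us = bucket_size_us
        # One counter per bucket, plus a final slot that counts overflows.
        self.histogram = [0] * (entries + 1)
        self.count = 0
        self.current = 0
        self.minimum = None
        self.maximum = 0
        self.total = 0

    def update(self, latency_us, stop_threshold_us=0):
        """Account for one sample; return True if the threshold was exceeded."""
        bucket = min(latency_us // self.bucket_size_us, len(self.histogram) - 1)
        self.histogram[bucket] += 1
        self.count += 1
        self.current = latency_us
        self.total += latency_us
        self.maximum = max(self.maximum, latency_us)
        if self.minimum is None or latency_us < self.minimum:
            self.minimum = latency_us
        # A threshold of 0 means "no threshold configured" in this sketch.
        return stop_threshold_us > 0 and latency_us > stop_threshold_us
```

Doing this once per sample in user space is the cost that grows with CPU count and shrinking intervals; the pull message describes moving the collection into a BPF program attached to the trace event.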
Linus Torvalds [Thu, 27 Mar 2025 23:22:12 +0000 (16:22 -0700)]
Merge tag 'trace-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt:
- Add option traceoff_after_boot
In order to debug kernel boot, it sometimes is helpful to enable
tracing via the kernel command line. Unfortunately, by the time the
login prompt appears, the trace is overwritten by the init process
and other user space start up applications.
Adding the "traceoff_after_boot" option disables tracing when the
kernel passes control to init, which lets developers see the traces
that occurred during boot.
- Clean up the mmflags macros that display the GFP flags in trace
events
The macros to print the GFP flags for trace events had a bit of
duplication. The code was restructured to remove duplication and in
the process it also adds some flags that were missed before.
- Removed some dead code and scripts/draw_functrace.py
draw_functrace.py hasn't worked in years and as nobody complained
about it, remove it.
- Constify struct event_trigger_ops
The event_trigger_ops is just a structure that has function pointers
that are assigned when the variables are created. These variables
should all be constants.
- Other minor clean ups and fixes
* tag 'trace-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Replace strncpy with memcpy for fixed-length substring copy
tracing: Fix synth event printk format for str fields
tracing: Do not use PERF enums when perf is not defined
tracing: Ensure module defining synth event cannot be unloaded while tracing
tracing: fix return value in __ftrace_event_enable_disable for TRACE_REG_UNREGISTER
tracing/osnoise: Fix possible recursive locking for cpus_read_lock()
tracing: Align synth event print fmt
tracing: gfp: vsprintf: Do not print "none" when using %pGg printf format
tracepoint: Print the function symbol when tracepoint_debug is set
tracing: Constify struct event_trigger_ops
scripts/tracing: Remove scripts/tracing/draw_functrace.py
tracing: Update MAINTAINERS file to include tracepoint.c
tracing/user_events: Slightly simplify user_seq_show()
tracing/user_events: Don't use %pK through printk
tracing: gfp: Remove duplication of recording GFP flags
tracing: Remove orphaned event_trace_printk
ring-buffer: Fix typo in comment about header page pointer
tracing: Add traceoff_after_boot option
Linus Torvalds [Thu, 27 Mar 2025 23:03:52 +0000 (16:03 -0700)]
Merge tag 'trace-latency-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull latency tracing updates from Steven Rostedt:
- Add some trace events to osnoise and timerlat sample generation
This adds more information to the osnoise and timerlat tracers as
well as allows BPF programs to be attached to these locations to
extract even more data.
- Fix to DECLARE_TRACE_CONDITION() macro
It wasn't used before but now will be, and it happened to be broken,
causing the build to fail.
- Add scheduler specification monitors to runtime verifier (RV)
This is a continuation of Daniel Bristot's work.
RV allows monitors to run and react concurrently. Running the
cumulative model is equivalent to running single components using the
same reactors, with the advantage that it's easier to point out which
specification failed in case of error.
This update introduces nested monitors to RV. In short, the sysfs
monitor folder will contain a monitor named sched, which is nothing
but an empty container for other monitors. Controlling the sched
monitor (enable, disable, set reactors) controls all nested monitors.
The following scheduling monitors are added:
- sco: scheduling context operations
Monitor to ensure sched_set_state happens only in thread context
- tss: task switch while scheduling
Monitor to ensure sched_switch happens only in scheduling context
- snroc: set non runnable on its own context
Monitor to ensure set_state happens only in the respective task's context
- scpd: schedule called with preemption disabled
Monitor to ensure schedule is called with preemption disabled
- snep: schedule does not enable preempt
Monitor to ensure schedule does not enable preempt
- sncid: schedule not called with interrupt disabled
Monitor to ensure schedule is not called with interrupt disabled
* tag 'trace-latency-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tools/rv: Allow rv list to filter for container
Documentation/rv: Add docs for the sched monitors
verification/dot2k: Add support for nested monitors
tools/rv: Add support for nested monitors
rv: Add scpd, snep and sncid per-cpu monitors
rv: Add snroc per-task monitor
rv: Add sco and tss per-cpu monitors
rv: Add option for nested monitors and include sched
sched: Add sched tracepoints for RV task model
rv: Add license identifiers to monitor files
tracing: Fix DECLARE_TRACE_CONDITION
trace/osnoise: Add trace events for samples
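The nested-monitor behavior described in the RV pull above, where controlling the sched container controls every monitor inside it, can be modeled with a small sketch. The classes and method names are hypothetical and do not reflect the kernel's data structures or the sysfs interface; the reactor names are placeholder strings.

```python
class Monitor:
    """A single monitor with an enabled flag and a reactor name."""

    def __init__(self, name):
        self.name = name
        self.enabled = False
        self.reactor = "nop"  # placeholder reactor name

    def set_enabled(self, enabled):
        self.enabled = enabled

    def set_reactor(self, reactor):
        self.reactor = reactor


class ContainerMonitor(Monitor):
    """An empty container: it checks nothing itself, only forwards control."""

    def __init__(self, name, children):
        super().__init__(name)
        self.children = list(children)

    def set_enabled(self, enabled):
        super().set_enabled(enabled)
        for child in self.children:
            child.set_enabled(enabled)

    def set_reactor(self, reactor):
        super().set_reactor(reactor)
        for child in self.children:
            child.set_reactor(reactor)
```

Each nested monitor can still be controlled individually in this sketch, which matches the stated advantage that it is easier to point out which specification failed.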
Linus Torvalds [Thu, 27 Mar 2025 22:57:29 +0000 (15:57 -0700)]
Merge tag 'ftrace-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ftrace updates from Steven Rostedt:
- Record function parameters for function and function graph tracers
Options have been added to the function tracer (func-args) and the
function graph tracer (funcgraph-args). When set, the tracers record
the registers that hold the arguments into each function event. When
the trace is read, BTF is used to print those arguments. Most archs
support up to 6 arguments (depending on the complexity of the
arguments) and those are printed.
If a function has more arguments than what was recorded, the output
will end with " ... )".
- The rest of the changes are minor clean ups and fixes
* tag 'ftrace-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Use hashtable.h for event_hash
tracing: Fix use-after-free in print_graph_function_flags during tracer switching
function_graph: Remove the unused variable func
ftrace: Add arguments to function tracer
ftrace: Have funcgraph-args take affect during tracing
ftrace: Add support for function argument to graph tracer
ftrace: Add print_function_args()
ftrace: Have ftrace_free_filter() WARN and exit if ops is active
fgraph: Correct typo in ftrace_return_to_handler comment
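The truncation rule described in the ftrace pull above (print the recorded arguments, and end with " ... )" when the function has more parameters than were recorded) can be sketched as below. The exact layout of the real trace output is not given here, so the name=value formatting is an assumption, as are the function name and the idea of passing parameter names in as a list (in the kernel they come from BTF).

```python
MAX_RECORDED_ARGS = 6  # "up to 6 arguments" on most archs, per the text above


def format_call(func, param_names, recorded_values):
    """Render a call with as many arguments as were recorded (hypothetical)."""
    values = list(recorded_values)[:MAX_RECORDED_ARGS]
    parts = [f"{name}={value:#x}" for name, value in zip(param_names, values)]
    truncated = len(param_names) > len(values)
    if truncated:
        parts.append("...")
    # A truncated line closes with " )" so that it ends in " ... )".
    return f"{func}(" + ", ".join(parts) + (" )" if truncated else ")")
```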
Linus Torvalds [Thu, 27 Mar 2025 22:44:34 +0000 (15:44 -0700)]
Merge tag 'trace-sorttable-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing / sorttable updates from Steven Rostedt:
- Implement arm64 build time sorting of the mcount location table
When gcc is used to build arm64, the mcount_loc section is all zeros
in the vmlinux elf file. The addresses are stored in the Elf_Rela
location.
To sort at build time, an array is allocated and the addresses are
added to it via the content of the mcount_loc section as well as the
Elf_Rela data. After sorting, the information is put back into the
Elf_Rela which now has the section sorted.
- Make sorting of mcount location table for arm64 work with clang as
well
When clang is used, the mcount_loc section contains the addresses,
unlike the gcc build. An array is still created and the sorting works
for both methods.
- Remove weak functions from the mcount_loc section
Have the sorttable code pass in the data of functions defined via
'nm -S' which shows the functions as well as their sizes. Using this
information the sorttable code can determine if a function in the
mcount_loc section was weak and overridden. If the function is not
found, it is set to be zero. On boot, when the mcount_loc section is
read and the ftrace table is created, if the address in the
mcount_loc is not in the kernel core text then it is removed and not
added to the ftrace_filter_functions (the functions that can be
attached by ftrace callbacks).
- Update and fix the reporting of how much data is used for ftrace
functions
On boot, a report is printed of how many pages were used by the
ftrace table as well as how they were grouped (the table holds a list
of sections that are groups of pages that were able to be allocated).
The removing of the weak functions required the accounting to be
updated.
* tag 'trace-sorttable-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
scripts/sorttable: Allow matches to functions before function entry
scripts/sorttable: Use normal sort if theres no relocs in the mcount section
ftrace: Check against is_kernel_text() instead of kaslr_offset()
ftrace: Test mcount_loc addr before calling ftrace_call_addr()
ftrace: Have ftrace pages output reflect freed pages
ftrace: Update the mcount_loc check of skipped entries
scripts/sorttable: Zero out weak functions in mcount_loc table
scripts/sorttable: Always use an array for the mcount_loc sorting
scripts/sorttable: Have mcount rela sort use direct values
arm64: scripts/sorttable: Implement sorting mcount_loc at boot for arm64
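The weak-function handling described in the sorttable pull above can be sketched in two steps: at build time, any mcount_loc address that does not fall inside a known function (start and size, as reported by 'nm -S') is set to zero and the table is sorted; at boot, addresses outside the kernel text are dropped. This is a simplified illustration, not scripts/sorttable.c: the function names are hypothetical, and details such as allowing a match slightly before the function entry are left out.

```python
import bisect


def zero_weak_and_sort(mcount_locs, functions):
    """Zero addresses not covered by any (start, size) function, then sort."""
    funcs = sorted(functions)
    starts = [start for start, _ in funcs]
    table = []
    for addr in mcount_locs:
        # Find the last function starting at or before addr.
        i = bisect.bisect_right(starts, addr) - 1
        inside = i >= 0 and addr < funcs[i][0] + funcs[i][1]
        table.append(addr if inside else 0)
    table.sort()
    return table


def boot_filter(table, text_start, text_end):
    """Keep only entries inside the kernel text range [text_start, text_end)."""
    return [addr for addr in table if text_start <= addr < text_end]
```

Because zero is never inside the kernel text range, the zeroed entries are the ones discarded by the boot-time check.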
Linus Torvalds [Thu, 27 Mar 2025 20:27:08 +0000 (13:27 -0700)]
Merge tag 'ext4-for_linus-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"Ext4 bug fixes and cleanups, including:
- hardening against maliciously fuzzed file systems
- backwards compatibility for the brief period when we attempted to
ignore zero-width characters
- avoid potentially BUG'ing if there is a file system corruption
found during the file system unmount
- fix free space reporting by statfs when project quotas are enabled
and the free space is less than the remaining project quota
Also improve performance when replaying a journal with a very large
number of revoke records (applicable for Lustre volumes)"
* tag 'ext4-for_linus-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (71 commits)
ext4: fix OOB read when checking dotdot dir
ext4: on a remount, only log the ro or r/w state when it has changed
ext4: correct the error handle in ext4_fallocate()
ext4: Make sb update interval tunable
ext4: avoid journaling sb update on error if journal is destroying
ext4: define ext4_journal_destroy wrapper
ext4: hash: simplify kzalloc(n * 1, ...) to kzalloc(n, ...)
jbd2: add a missing data flush during file and fs synchronization
ext4: don't over-report free space or inodes in statvfs
ext4: clear DISCARD flag if device does not support discard
jbd2: remove jbd2_journal_unfile_buffer()
ext4: reorder capability check last
ext4: update the comment about mb_optimize_scan
jbd2: fix off-by-one while erasing journal
ext4: remove references to bh->b_page
ext4: goto right label 'out_mmap_sem' in ext4_setattr()
ext4: fix out-of-bound read in ext4_xattr_inode_dec_ref_all()
ext4: introduce ITAIL helper
jbd2: remove redundant function jbd2_journal_has_csum_v2or3_feature
ext4: remove redundant function ext4_has_metadata_csum
...
Linus Torvalds [Thu, 27 Mar 2025 20:20:07 +0000 (13:20 -0700)]
Merge tag 'bcachefs-2025-03-24' of git://evilpiepirate.org/bcachefs
Pull bcachefs updates from Kent Overstreet:
"On disk format is now soft frozen: no more required/automatic upgrades
are anticipated before taking off the experimental label.
Major changes/features since 6.14:
- Scrub
- Blocksize greater than page size support
- A number of "rebalance spinning and doing no work" issues have been
fixed; we now check if the write allocation will succeed in
bch2_data_update_init(), before kicking off the read.
There's still more work to do in this area. Later we may want to
add another bitset btree, like rebalance_work, to track "extents
that rebalance was requested to move but couldn't", e.g. due to
destination target having insufficient online devices.
- We can now support scaling well into the petabyte range: latest
bcachefs-tools will pick an appropriate bucket size at format time
to ensure fsck can run in available memory (e.g. a server with
256GB of ram and 100PB of storage would want 16MB buckets).
Cached replicas now get backpointers, which means we no longer rely
on incrementing bucket generation numbers to invalidate cached
data: this lets us get rid of the bucket generation number garbage
collection, which had to periodically rescan all extents to
recompute bucket oldest_gen.
Bucket generation numbers are now only used as a consistency check,
but they're quite useful for that.
- 1.22: stripe backpointers
Stripes now have backpointers: erasure coded stripes have their own
checksums, separate from the checksums for the extents they contain
(and stripe checksums also cover the parity blocks). This is
required for implementing scrub for stripes.
- 1.23: stripe lru (scalability improvement)
Persistent lru for stripes, ordered by "number of empty blocks".
This is used by the stripe creation path, which depending on free
space may create a new stripe out of a partially empty existing
stripe instead of starting a brand new stripe.
This replaces an in-memory heap, and means we no longer have to
read in the stripes btree at startup.
- 1.24: casefolding
Case insensitive directory support, courtesy of Valve.
This is an incompatible feature, to enable mount with
-o version_upgrade=incompatible
- 1.25: extent_flags
Another incompatible feature requiring explicit opt-in to enable.
This adds a flags entry to extents, and a flag bit that marks
extents as poisoned.
A poisoned extent is an extent that was unreadable due to checksum
errors. We can't move such extents without giving them a new
checksum, and we may have to move them (for e.g. copygc or device
evacuate). We also don't want to delete them: in the future we'll
have an API that lets userspace ignore checksum errors and attempt
to deal with simple bitrot itself. Marking them as poisoned lets us
continue to return the correct error to userspace on normal read
calls.
Other changes/features:
- BCH_IOCTL_QUERY_COUNTERS: this is used by the new 'bcachefs fs top'
command, which shows a live view of all internal filesystem
counters.
- Improved journal pipelining: we can now have 16 journal writes in
flight concurrently, up from 4. We're logging significantly more to
the journal than we used to with all the recent disk accounting
changes and additions, so some users should see a performance
increase on some workloads.
- BCH_MEMBER_STATE_failed: previously, we would do no IO at all to
devices marked as failed. Now we will attempt to read from them,
but only if we have no better options.
- New option, write_error_timeout: devices will be kicked out of the
filesystem if all writes have been failing for x number of seconds.
We now also kick devices out when notified by blk_holder_ops that
they've gone offline.
- Device option handling improvements: the discard option should now
be working as expected (additionally, in -tools, all device options
that can be set at format time can now be set at device add time,
i.e. data_allowed, state).
- We now try harder to read data after a checksum error: we'll do
additional retries, if necessary, to a device after it gave us
data with a checksum error.
- More self healing work: the full inode <-> dirent consistency
checks that are currently run by fsck are now also run every time
we do a lookup, meaning we'll be able to correct errors at runtime.
Runtime self healing will be flipped on after the new changes have
seen more testing; currently they're just checking for consistency.
- KMSAN fixes: our KMSAN builds should be nearly clean now, which
will put a massive dent in the syzbot dashboard"
* tag 'bcachefs-2025-03-24' of git://evilpiepirate.org/bcachefs: (180 commits)
bcachefs: Kill unnecessary bch2_dev_usage_read()
bcachefs: btree node write errors now print btree node
bcachefs: Fix race in print_chain()
bcachefs: btree_trans_restart_foreign_task()
bcachefs: bch2_disk_accounting_mod2()
bcachefs: zero init journal bios
bcachefs: Eliminate padding in move_bucket_key
bcachefs: Fix a KMSAN splat in btree_update_nodes_written()
bcachefs: kmsan asserts
bcachefs: Fix kmsan warnings in bch2_extent_crc_pack()
bcachefs: Disable asm memcpys when kmsan enabled
bcachefs: Handle backpointers with unknown data types
bcachefs: Count BCH_DATA_parity backpointers correctly
bcachefs: Run bch2_check_dirent_target() at lookup time
bcachefs: Refactor bch2_check_dirent_target()
bcachefs: Move bch2_check_dirent_target() to namei.c
bcachefs: fs-common.c -> namei.c
bcachefs: EIO cleanup
bcachefs: bch2_write_prep_encoded_data() now returns errcode
bcachefs: Simplify bch2_write_op_error()
...
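The bucket-size example in the bcachefs pull above (256GB of RAM, 100PB of storage, 16MB buckets) can be turned into a quick calculation of how many buckets result and how much RAM that leaves per bucket. This is only arithmetic under the assumption of binary units; it does not model how much memory fsck actually needs per bucket, and the helper names are hypothetical.

```python
MiB = 2**20
GiB = 2**30
PiB = 2**50


def bucket_count(capacity_bytes, bucket_bytes):
    """Number of buckets needed to cover the capacity (rounded up)."""
    return -(-capacity_bytes // bucket_bytes)


def ram_per_bucket(ram_bytes, capacity_bytes, bucket_bytes):
    """Bytes of RAM available per bucket if all RAM were spent on buckets."""
    return ram_bytes / bucket_count(capacity_bytes, bucket_bytes)
```

With these assumptions, bucket_count(100 * PiB, 16 * MiB) is about 6.7 billion buckets, and ram_per_bucket(256 * GiB, 100 * PiB, 16 * MiB) comes to roughly 41 bytes per bucket. Halving the bucket size doubles the bucket count and halves that budget, which is why the bucket size is picked at format time.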
Linus Torvalds [Thu, 27 Mar 2025 20:17:39 +0000 (13:17 -0700)]
Merge tag 'jfs-6.14' of github.com:kleikamp/linux-shaggy
Pull jfs updates from David Kleikamp:
"Various bug fixes and cleanups for JFS"
* tag 'jfs-6.14' of github.com:kleikamp/linux-shaggy:
jfs: add index corruption check to DT_GETPAGE()
fs/jfs: consolidate sanity checking in dbMount
jfs: add sanity check for agwidth in dbMount
jfs: Prevent copying of nlink with value 0 from disk inode
fs/jfs: Prevent integer overflow in AG size calculation
fs/jfs: cast inactags to s64 to prevent potential overflow
jfs: Fix uninit-value access of imap allocated in the diMount() function
jfs: fix slab-out-of-bounds read in ea_get()
jfs: add check read-only before truncation in jfs_truncate_nolock()
jfs: add check read-only before txBeginAnon() call
jfs: reject on-disk inodes of an unsupported type
jfs: Remove reference to bh->b_page
jfs: Delete a couple tabs in jfs_reconfigure()
Linus Torvalds [Thu, 27 Mar 2025 20:07:00 +0000 (13:07 -0700)]
Merge tag 'xfs-6.15-merge' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs updates from Carlos Maiolino:
- XFS zoned allocator: Enables XFS to support zoned devices using its
real-time allocator
- Use folios/vmalloc for buffer cache backing memory
- Some code cleanups and bug fixes
* tag 'xfs-6.15-merge' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (70 commits)
xfs: remove the flags argument to xfs_buf_get_uncached
xfs: remove the flags argument to xfs_buf_read_uncached
xfs: remove xfs_buf_free_maps
xfs: remove xfs_buf_get_maps
xfs: call xfs_buf_alloc_backing_mem from _xfs_buf_alloc
xfs: remove unnecessary NULL check before kvfree()
xfs: don't wake zone space waiters without m_zone_info
xfs: don't increment m_generation for all errors in xfs_growfs_data
xfs: fix a missing unlock in xfs_growfs_data
xfs: Remove duplicate xfs_rtbitmap.h header
xfs: trigger zone GC when out of available rt blocks
xfs: trace what memory backs a buffer
xfs: cleanup mapping tmpfs folios into the buffer cache
xfs: use vmalloc instead of vm_map_area for buffer backing memory
xfs: buffer items don't straddle pages anymore
xfs: kill XBF_UNMAPPED
xfs: convert buffer cache to use high order folios
xfs: remove the kmalloc to page allocator fallback
xfs: refactor backing memory allocations for buffers
xfs: remove xfs_buf_is_vmapped
...
Linus Torvalds [Thu, 27 Mar 2025 20:04:31 +0000 (13:04 -0700)]
Merge tag 'dlm-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
- two fixes to the recent rcu lookup optimizations
- a change allowing TCP to be configured with the first of multiple IP
addresses
* tag 'dlm-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: make tcp still work in multi-link env
dlm: fix error if active rsb is not hashed
dlm: fix error if inactive rsb is not hashed
dlm: prevent NPD when writing a positive value to event_done
dlm: increase max number of links for corosync3/knet
Linus Torvalds [Thu, 27 Mar 2025 19:55:54 +0000 (12:55 -0700)]
Merge tag 'f2fs-for-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, there are three major updates: (1) folio conversion,
(2) refactoring for mount API conversion, (3) some performance
improvement such as direct IO, checkpoint speed, and IO priority
hints.
For stability, there are patches which add more sanity checks and
fix some major issues like i_size in atomic write operations and
write pointer recovery in zoned devices.
Enhancements:
- huge folio conversion work by Matthew Wilcox
- clean up for mount API conversion by Eric Sandeen
- improve direct IO speed in the overwrite case
- add some sanity check on node consistency
- set highest IO priority for checkpoint thread
- keep POSIX_FADV_NOREUSE ranges and add sysfs entry to reclaim pages
- add ioctl to get IO priority hint
- add carve_out sysfs node for fsstat
Bug fixes:
- disable nat_bits during umount to avoid potential nat entry corruption
- fix missing i_size update on atomic writes
- fix missing discard for active segments
- fix running out of free segments
- fix out-of-bounds access in f2fs_truncate_inode_blocks()
- call f2fs_recover_quota_end() correctly
- fix potential deadloop in prepare_compress_overwrite()
- fix the missing write pointer correction for zoned device
- fix to avoid panic once fallocation fails for pinfile
- don't retry IO for corrupted data scenario
There are many other clean up patches and minor bug fixes as usual"
* tag 'f2fs-for-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (68 commits)
f2fs: fix missing discard for active segments
f2fs: optimize f2fs DIO overwrites
f2fs: fix to avoid atomicity corruption of atomic file
f2fs: pass sbi rather than sb to parse_options()
f2fs: pass sbi rather than sb to quota qf_name helpers
f2fs: defer readonly check vs norecovery
f2fs: Pass sbi rather than sb to f2fs_set_test_dummy_encryption
f2fs: make LAZYTIME a mount option flag
f2fs: make INLINECRYPT a mount option flag
f2fs: factor out an f2fs_default_check function
f2fs: consolidate unsupported option handling errors
f2fs: use f2fs_sb_has_device_alias during option parsing
f2fs: add carve_out sysfs node
f2fs: fix to avoid running out of free segments
f2fs: Remove f2fs_write_node_page()
f2fs: Remove f2fs_write_meta_page()
f2fs: Remove f2fs_write_data_page()
f2fs: Remove check for ->writepage
Revert "f2fs: rebuild nat_bits during umount"
f2fs: fix to avoid accessing uninitialized curseg
...
Linus Torvalds [Thu, 27 Mar 2025 19:51:48 +0000 (12:51 -0700)]
Merge tag 'for-6.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"User visible changes:
- fall back to buffered write if direct io is done on a file that
requires checksums
- this avoids a problem with checksum mismatch errors, observed
e.g. on virtual images when writes to pages under writeback
cause the checksum mismatch reports
- this may lead to some performance degradation but currently the
recommended setup for VM images is to use the NOCOW file
attribute that also disables checksums
- fast/realtime zstd levels -15 to -1
- supported by mount options (compress=zstd:-5) and defrag ioctl
- improved speed, reduced compression ratio, check the commit for
sample measurements
- defrag ioctl extended to accept negative compression levels
- subpage mode
- remove warning when subpage mode is used, the feature is now
reasonably complete and tested
- in debug mode, allow creating 2K b-tree nodes so that subpage can
be tested on x86_64 with 4K pages too
Performance improvements:
- in send, better file path caching improves runtime (on sample load
by -30%)
- on s390x with hardware zlib support prepare the input buffer in a
better way to get the best results from the acceleration
- minor speed improvement in encoded read, avoid memory allocation in
synchronous mode
Core:
- enable stable writes on inodes, replacing manually waiting for
writeback and allowing to skip that on inodes without checksums
- add last checks and warnings for out-of-band dirty writes to pages,
requiring a fixup ("fixup worker"), this should not be necessary
since 5.8 where get_user_page() and pin_user_pages*() prevent this
- long history behind that, we'll be happy to remove the whole
infrastructure in the near future
- more folio API conversions and preparations for large folio support
- subpage cleanups and refactoring, split handling of data and
metadata to allow future support for large folios
- readpage works as block-by-block, no change for normal mode, this
is preparation for future subpage updates
- block group refcount fixes and hardening
- delayed iput fixes
- in zoned mode, fix zone activation on filesystem with missing
devices
Cleanups:
- inode parameter cleanups
- path auto-freeing updates
- code flow simplifications in send
- redundant parameter cleanups"
* tag 'for-6.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (164 commits)
btrfs: zoned: fix zone finishing with missing devices
btrfs: zoned: fix zone activation with missing devices
btrfs: remove end_no_trans label from btrfs_log_inode_parent()
btrfs: simplify condition for logging new dentries at btrfs_log_inode_parent()
btrfs: remove redundant else statement from btrfs_log_inode_parent()
btrfs: use memcmp_extent_buffer() at replay_one_extent()
btrfs: update outdated comment for overwrite_item()
btrfs: use variables to store extent buffer and slot at overwrite_item()
btrfs: avoid unnecessary memory allocation and copy at overwrite_item()
btrfs: don't clobber ret in btrfs_validate_super()
btrfs: prepare btrfs_page_mkwrite() for large folios
btrfs: prepare extent_io.c for future large folio support
btrfs: prepare btrfs_launcher_folio() for large folios support
btrfs: replace PAGE_SIZE with folio_size for subpage.[ch]
btrfs: add a size parameter to btrfs_alloc_subpage()
btrfs: subpage: make btrfs_is_subpage() check against a folio
btrfs: add extra warning if delayed iput is added when it's not allowed
btrfs: avoid redundant path slot assignment in btrfs_search_forward()
btrfs: remove unnecessary btrfs_key local variable in btrfs_search_forward()
btrfs: simplify the return value handling in search_ioctl()
...
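The negative zstd levels described in the btrfs pull message above can be illustrated with a small option parser. This is only a sketch in Python, not the btrfs mount option code: the function name parse_zstd_level is hypothetical, the fast range -15 to -1 comes from the pull message, and the upper bound for regular levels (max_regular_level, default 15) is an assumption made for the example.

```python
# Fast/realtime level range as stated in the pull message.
FAST_LEVEL_MIN = -15
FAST_LEVEL_MAX = -1


def parse_zstd_level(option, max_regular_level=15):
    """Parse a value such as "zstd:-5" and return the level.

    Returns None when no level is given, meaning "use the default".
    max_regular_level is an assumed bound, not taken from the source.
    """
    algo, sep, level_text = option.partition(":")
    if algo != "zstd":
        raise ValueError("not a zstd option: %r" % option)
    if not sep:
        return None  # plain "zstd", no explicit level
    level = int(level_text)  # raises ValueError on malformed input
    if FAST_LEVEL_MIN <= level <= FAST_LEVEL_MAX:
        return level  # fast level: quicker, lower compression ratio
    if 1 <= level <= max_regular_level:
        return level  # regular positive level
    raise ValueError("zstd level out of range: %d" % level)
```

With this sketch, the value part of compress=zstd:-5 parses to -5, while zstd:-16 is rejected as out of range.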
Linus Torvalds [Thu, 27 Mar 2025 19:40:40 +0000 (12:40 -0700)]
Merge tag 'erofs-for-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs updates from Gao Xiang:
"In this cycle, EROFS 48-bit block addressing is available to support
massive datasets for model training and other large data archive use
cases.
In addition, byte-oriented encoded extents have been supported to
reduce metadata sizes when using large configurations as well as to
improve Zstd compression speed.
There are some bugfixes and cleanups as usual.
Summary:
- Support 48-bit block addressing for large images
- Introduce encoded extents to reduce metadata on larger pclusters
- Enable unaligned compressed data to improve Zstd compression speed
- Allow 16-byte volume names again
- Minor cleanups"
* tag 'erofs-for-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: enable 48-bit layout support
erofs: support unaligned encoded data
erofs: implement encoded extent metadata
erofs: add encoded extent on-disk definition
erofs: initialize decompression early
erofs: support dot-omitted directories
erofs: implement 48-bit block addressing for unencoded inodes
erofs: add 48-bit block addressing on-disk support
erofs: simplify erofs_{read,fill}_inode()
erofs: get rid of erofs_map_blocks_flatmode()
erofs: move {in,out}pages into struct z_erofs_decompress_req
erofs: clean up header parsing for ztailpacking and fragments
erofs: simplify tail inline pcluster handling
erofs: allow 16-byte volume name again
erofs: get rid of erofs_kmap_type
erofs: use Z_EROFS_LCLUSTER_TYPE_MAX to simplify switches
Linus Torvalds [Thu, 27 Mar 2025 19:09:25 +0000 (12:09 -0700)]
Merge tag 'gfs2-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 updates from Andreas Gruenbacher:
- Fix two bugs related to locking request cancelation (locking request
being retried instead of canceled; canceling the wrong locking
request)
- Prevent a race between inode creation and deferred delete analogous
to commit ffd1cf0443a2 from 6.13. This now allows gfs2_evict_inode()
to be simplified further without introducing mysterious problems
- When an inode delete should be verified / retried "later" but that
isn't possible, skip the delete instead of carrying it out
immediately. This broke in 6.13
- More folio conversions from Matthew Wilcox (plus a fix from Dan
Carpenter)
- Various minor fixes and cleanups
* tag 'gfs2-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (22 commits)
gfs2: some comment clarifications
gfs2: Fix a NULL vs IS_ERR() bug in gfs2_find_jhead()
gfs2: Convert gfs2_meta_read_endio() to use a folio
gfs2: Convert gfs2_end_log_write_bh() to work on a folio
gfs2: Convert gfs2_find_jhead() to use a folio
gfs2: Convert gfs2_jhead_pg_srch() to gfs2_jhead_folio_search()
gfs2: Use b_folio in gfs2_check_magic()
gfs2: Use b_folio in gfs2_submit_bhs()
gfs2: Use b_folio in gfs2_trans_add_meta()
gfs2: Use b_folio in gfs2_log_write_bh()
gfs2: skip if we cannot defer delete
gfs2: remove redundant warnings
gfs2: minor evict fix
gfs2: Prevent inode creation race (2)
gfs2: Fix additional unlikely request cancelation race
gfs2: Fix request cancelation bug
gfs2: Check for empty queue in run_queue
gfs2: Remove more dead code in add_to_queue
gfs2: Replace GIF_DEFER_DELETE with GLF_DEFER_DELETE
gfs2: glock holder GL_NOPID fix
...
Linus Torvalds [Thu, 27 Mar 2025 16:46:53 +0000 (09:46 -0700)]
Merge tag 'asm-generic-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
"This is mainly a set of cleanups of asm-generic/io.h, resolving problems
with inconsistent semantics of ioread64/iowrite64 that were causing
runtime and build issues.
The "GENERIC_IOMAP" version that switches between inb()/outb() and
readb()/writeb() style accessors is now only used on architectures
that have PC-style ISA devices that are not memory mapped (x86, uml,
m68k-q40 and powerpc-powernv), while alpha and parisc use a more
complicated variant and everything else just maps the ioread
interfaces to plain MMIO (readb/writeb etc).
In addition there are two small changes from Raag Jadav to simplify
the asm-generic/io.h indirect inclusions and from Jann Horn to fix a
corner case with read_word_at_a_time"
* tag 'asm-generic-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
rwonce: fix crash by removing READ_ONCE() for unaligned read
rwonce: handle KCSAN like KASAN in read_word_at_a_time()
m68k: coldfire: select PCI_IOMAP for PCI
mips: export pci_iounmap()
mips: fix PCI_IOBASE definition
m68k/nommu: stop using GENERIC_IOMAP
mips: drop GENERIC_IOMAP wrapper
powerpc: asm/io.h: remove split ioread64/iowrite64 helpers
parisc: stop using asm-generic/iomap.h
sh: remove duplicate ioread/iowrite helpers
alpha: stop using asm-generic/iomap.h
io.h: drop unused headers
drm/draw: include missing headers
asm-generic/io.h: rework split ioread64/iowrite64 helpers
Mimi Zohar [Mon, 27 Jan 2025 15:45:48 +0000 (10:45 -0500)]
ima: limit the number of ToMToU integrity violations
Each time a file in policy, that is already opened for read, is opened
for write, a Time-of-Measure-Time-of-Use (ToMToU) integrity violation
audit message is emitted and a violation record is added to the IMA
measurement list. This occurs even if a ToMToU violation has already
been recorded.
Limit the number of ToMToU integrity violations per file open for read.
Note: The IMA_MAY_EMIT_TOMTOU atomic flag must be set from the reader
side based on policy. This may result in a per file open for read
ToMToU violation.
Since IMA_MUST_MEASURE is only used for violations, rename the atomic
IMA_MUST_MEASURE flag to IMA_MAY_EMIT_TOMTOU.
Cc: stable@vger.kernel.org # applies cleanly up to linux-6.6
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Tested-by: Petr Vorel <pvorel@suse.cz>
Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
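The rate limiting described in the ToMToU commit above can be pictured as an arm-once, fire-once flag. The following Python model is a sketch under assumptions, not the IMA code: the class and method names are hypothetical, and it assumes the writer side tests and clears the flag that the reader side set, which the commit message implies but does not spell out.

```python
import threading


class InodeViolationState:
    """Toy model of per-inode state; not the kernel's data structure."""

    def __init__(self):
        self._lock = threading.Lock()
        self._may_emit_tomtou = False  # stands in for IMA_MAY_EMIT_TOMTOU
        self.violations = []

    def open_for_read(self):
        # Reader side: arm the flag (in the kernel this depends on policy).
        with self._lock:
            self._may_emit_tomtou = True

    def open_for_write(self):
        # Writer side: atomically test and clear the flag, so only the
        # first writer after a read-open records a violation.
        with self._lock:
            armed = self._may_emit_tomtou
            self._may_emit_tomtou = False
        if armed:
            self.violations.append("ToMToU")
        return armed
```

In this model, repeated write-opens after a single read-open record one violation, and a later read-open arms the flag again, which matches the note that a violation may still occur per file open for read.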
Linus Torvalds [Thu, 27 Mar 2025 16:37:18 +0000 (09:37 -0700)]
Merge tag 'soc-arm-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC arm platform updates from Arnd Bergmann:
"The at91 platform gains support for SAMA7D65, a new variant of the
Cortex-A7 based SAMA7G5 with a graphics output.
The i.MX, Renesas and davinci platforms each get one minor bugfix"
* tag 'soc-arm-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
ARM: davinci: always enable CONFIG_ARCH_DAVINCI_DA850
ARM: imx: mark imx53_suspend_sz as unused
ARM: at91: pm: Enable ULP0/ULP1 for SAMA7D65
ARM: at91: pm: Add Backup mode for SAMA7D65
ARM: at91: pm: add DT compatible support for sama7d65
ARM: at91: pm: fix at91_suspend_finish for ZQ calibration
dt-bindings: ARM: at91: add Calao USB boards
dt-bindings: ARM: at91: make separate entry for Olimex board
ARM: at91: Add Support in SoC driver for SAMA7D65
dt-bindings: atmel-sysreg: Add SAMA7D65 Chip ID
ARM: shmobile: rcar-gen2: Remove CMA reservation code
Mimi Zohar [Mon, 27 Jan 2025 15:24:13 +0000 (10:24 -0500)]
ima: limit the number of open-writers integrity violations
Each time a file in policy, that is already opened for write, is opened
for read, an open-writers integrity violation audit message is emitted
and a violation record is added to the IMA measurement list. This
occurs even if an open-writers violation has already been recorded.
Limit the number of open-writers integrity violations for an existing
file open for write to one. After the existing file open for write
closes (__fput), subsequent open-writers integrity violations may be
emitted.
Cc: stable@vger.kernel.org # applies cleanly up to linux-6.6
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Tested-by: Petr Vorel <pvorel@suse.cz>
Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Linus Torvalds [Thu, 27 Mar 2025 16:14:30 +0000 (09:14 -0700)]
Merge tag 'soc-defconfig-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC defconfig updates from Arnd Bergmann:
"A small set of updates for the arm64 defconfig to enable more drivers,
plus a bit of housekeeping on some of the arm32 defconfigs on
particular SoC families"
* tag 'soc-defconfig-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
arm: defconfig: drop RT_GROUP_SCHED=y from bcm2835/tegra/omap2plus
arm64: defconfig: Enable USB retimer and redriver
arm64: defconfig: Build NSS Clock Controller driver for IPQ9574
arm64: defconfig: Enable SPI NAND flashes
arm64: defconfig: Enable Synopsys HDMI receiver
arm64: defconfig: Enable Rockchip UFS host driver
arm64: defconfig: enable Qualcomm IRIS & VIDEOCC_8550 as module
arm64: defconfig: Enable HSR protocol driver
arm64: defconfig: Enable gb_beagleplay
arm64: defconfig: enable DRM_DISPLAY_CONNECTOR as a module
arm64: defconfig: Enable Qualcomm QCM2290 GPU clock controller
ARM: shmobile: defconfig: Supplement DTB with ATAG information
Linus Torvalds [Thu, 27 Mar 2025 16:05:55 +0000 (09:05 -0700)]
Merge tag 'soc-drivers-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC driver updates from Arnd Bergmann:
"These are the updates for SoC specific drivers and related subsystems:
- Firmware driver updates for SCMI, FF-A and SMCCC firmware
interfaces, adding support for additional firmware features
including SoC identification and FF-A SRI callbacks as well as
various bugfixes
- Memory controller updates for Nvidia and Mediatek
- Reset controller support for microchip sam9x7 and imx8qxp/imx8qm
- New hardware support for multiple Mediatek, Renesas and Samsung
Exynos chips
- Minor updates on Zynq, Qualcomm, Amlogic, TI, Samsung, Nvidia and
Apple chips
There will be a follow up with a few more driver updates that are
still causing build regressions at the moment"
* tag 'soc-drivers-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (97 commits)
irqchip: Add support for Amlogic A4 and A5 SoCs
dt-bindings: interrupt-controller: Add support for Amlogic A4 and A5 SoCs
reset: imx: fix incorrect module device table
dt-bindings: power: qcom,kpss-acc-v2: add qcom,msm8916-acc compatible
bus: qcom-ssc-block-bus: Fix the error handling path of qcom_ssc_block_bus_probe()
bus: qcom-ssc-block-bus: Remove some duplicated iounmap() calls
soc: qcom: pd-mapper: Add support for SDM630/636
reset: imx: Add SCU reset driver for i.MX8QXP and i.MX8QM
dt-bindings: firmware: imx: add property reset-controller
dt-bindings: reset: atmel,at91sam9260-reset: add sam9x7
memory: mtk-smi: Add ostd setting for mt8192
dt-bindings: soc: samsung: exynos-usi: Drop unnecessary status from example
firmware: tegra: bpmp: Fix typo in bpmp-abi.h
soc/tegra: pmc: Use str_enable_disable-like helpers
soc: samsung: include linux/array_size.h where needed
firmware: arm_scmi: use ioread64() instead of ioread64_hi_lo()
soc: mediatek: mtk-socinfo: Add extra entry for MT8395AV/ZA Genio 1200
soc: mediatek: mt8188-mmsys: Add support for DSC on VDO0
soc: mediatek: mmsys: Migrate all tables to MMSYS_ROUTE() macro
soc: mediatek: mt8365-mmsys: Fix routing table masks and values
...
Linus Torvalds [Thu, 27 Mar 2025 16:01:37 +0000 (09:01 -0700)]
Merge tag 'soc-dt-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC devicetree updates from Arnd Bergmann:
"There is new support for additional on-chip devices on Apple,
Mediatek, Renesas, Rockchip, Samsung, Google, TI, ST, Nvidia and
Amlogic devices.
The Arm Morello reference platform gets a devicetree for booting in
normal aarch64 mode. The hardware supports experimental CHERI support,
which requires a modified kernel.
The AMD (formerly Xilinx) Versal NET SoC gets added, this is a
combined FPGA with Cortex-A78 CPUs in a SoC.
Six new ST STM32MP2 SoC variants are added. Like the earlier
STM32MP25, the MP211, MP213, MP215, MP231, MP233 and MP235 models are
based on one or two Cortex-A35 cores but each feature a different set
of I/O devices.
Mediatek MT8370 is a minor variation of MT8390 with fewer CPU and GPU
cores
Apple T2 is the baseboard management controller on earlier Intel CPU
based Macs, with 16 models now gaining initial support.
All the above come with dts files for the reference boards. In
addition, these boards are added for the SoCs that are already
supported:
- The Milk-V Jupiter board based on SpacemiT K1/M1
- NetCube Systems Kumquat board based on the 32-bit Allwinner V3s SoC
- Three boards based on 32-bit stm32mp1
- 11 distinct board variants from Toradex and one from Variscite, all
based on i.MX6
- Google Pixel 6 Pro phone based on gs101 (Tensor)
- Three additional variants of the i.MX8MP based "Skov" board
- A second variant of the i.MX95 EVK board
- Two boards based on Renesas SoCs
- Four boards based on the Rockchip RK35xx series, plus the RK3588 'MNT
Reform 2' laptop"
* tag 'soc-dt-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (538 commits)
arm64: dts: Add gpio_intc node for Amlogic A5 SoCs
arm64: dts: Add gpio_intc node for Amlogic A4 SoCs
arm64: dts: hi3660: Add property for fixing CPUIdle
arm64: dts: rockchip: remove ethm0_clk0_25m_out from Sige5 gmac0
arm64: dts: marvell: Use preferred node names for "simple-bus"
arm64: dts: marvell: Drop unused CP11X_TYPE define
arm64: dts: marvell: Move arch timer and pmu nodes to top-level
arm64: dts: rockchip: Fix PWM pinctrl names
arm64: dts: rockchip: fix RK3576 SCMI clock IDs
dt-bindings: clock: rk3576: add SCMI clocks
arm64: dts: rockchip: Fix pcie reset gpio on Orange Pi 5 Max
arm64: dts: amd/seattle: Drop undocumented "spi-controller" properties
arm64: dts: amd/seattle: Fix bus, mmc, and ethernet node names
arm64: dts: amd/seattle: Move and simplify fixed clocks
arm64: dts: amd/seattle: Base Overdrive B1 on top of B0 version
arm64: dts: rockchip: Enable HDMI audio output for ArmSoM Sige7
arm64: dts: rockchip: Enable onboard eMMC on Radxa E20C
arm64: dts: rockchip: Add SDHCI controller for RK3528
arm64: dts: rockchip: Remove bluetooth node from rock-3a
arm64: dts: rockchip: Move rk356x scmi SHMEM to reserved memory
...
Ayush Jain [Fri, 7 Mar 2025 04:38:54 +0000 (04:38 +0000)]
ktest: Fix Test Failures Due to Missing LOG_FILE Directories
Handle missing parent directories for LOG_FILE path to prevent test
failures. If the parent directories don't exist, create them to ensure
the tests proceed successfully.
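The fix described in the ktest commit above amounts to creating the log file's directory chain before opening it. ktest itself is a Perl script, so the Python below is only an illustration of the idea; the helper names ensure_log_parent and open_log are hypothetical.

```python
import os


def ensure_log_parent(log_file):
    """Create the parent directories of log_file if they are missing."""
    parent = os.path.dirname(log_file)
    if parent:  # empty when log_file is a bare file name
        # exist_ok=True makes the call a no-op for existing directories.
        os.makedirs(parent, exist_ok=True)
    return parent


def open_log(log_file):
    """Open log_file for appending, creating its directory chain first."""
    ensure_log_parent(log_file)
    return open(log_file, "a")
```

Creating the directories unconditionally with exist_ok=True avoids a separate existence check and the race that comes with it.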
tracing: probe-events: Add comments about entry data storing code
Add comments about entry data storing code to __store_entry_arg() and
traceprobe_get_entry_data_size(). These are a bit complicated because of
building the entry data storing code and scanning it.
This just adds comments, no behavior change.
Link: https://lore.kernel.org/all/174061715004.501424.333819546601401102.stgit@devnote2/
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Closes: https://lore.kernel.org/all/20250226102223.586d7119@gandalf.local.home/
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Linus Torvalds [Thu, 27 Mar 2025 04:48:21 +0000 (21:48 -0700)]
Merge tag 'net-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Continue Netlink conversions to per-namespace RTNL lock
(IPv4 routing, routing rules, routing next hops, ARP ioctls)
- Continue extending the use of netdev instance locks. As a driver
opt-in protect queue operations and (in due course) ethtool
operations with the instance lock and not RTNL lock.
- Support collecting TCP timestamps (data submitted, sent, acked) in
BPF, allowing for transparent (to the application) and lower
overhead tracking of TCP RPC performance.
- Tweak existing networking Rx zero-copy infra to support zero-copy
Rx via io_uring.
- Optimize MPTCP performance in single subflow mode by 29%.
- Enable GRO on packets which went thru XDP CPU redirect (were queued
for processing on a different CPU), improving TCP stream
performance up to 2x.
- Improve performance of contended connect() by 200% by searching for
an available 4-tuple under RCU rather than a spin lock. Bring an
additional 229% improvement by tweaking hash distribution.
- Avoid unconditionally touching sk_tsflags on RX, improving
performance under UDP flood by as much as 10%.
- Avoid skb_clone() dance in ping_rcv() to improve performance under
ping flood.
- Avoid FIB lookup in netfilter if socket is available, 20% perf win.
- Rework network device creation (in-kernel) API to more clearly
identify network namespaces and their roles. There are up to 4
namespace roles but we used to have just 2 netns pointer arguments,
interpreted differently based on context.
- Use sysfs_break_active_protection() instead of trylock to avoid
deadlocks between unregistering objects and sysfs access.
- Add a new sysctl and sockopt for capping max retransmit timeout in
TCP.
- Support masking port and DSCP in routing rule matches.
- Support dumping IPv4 multicast addresses with RTM_GETMULTICAST.
- Support specifying at what time packet should be sent on AF_XDP
sockets.
- Expose TCP ULP diagnostic info (for TLS and MPTCP) to non-admin
users.
- Add Netlink YAML spec for WiFi (nl80211) and conntrack.
- Introduce EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL() for symbols
which only need to be exported when IPv6 support is built as a
module.
- Age FDB entries based on Rx not Tx traffic in VxLAN, similar to
normal bridging.
- Allow users to specify source port range for GENEVE tunnels.
- netconsole: allow attaching kernel release, CPU ID and task name to
messages as metadata
Driver API:
- Continue rework / fixing of Energy Efficient Ethernet (EEE) across
the SW layers. Delegate the responsibilities to phylink where
possible. Improve its handling in phylib.
- Support symmetric OR-XOR RSS hashing algorithm.
- Support tracking and preserving IRQ affinity by NAPI itself.
- Support loopback mode speed selection for interface selftests.
Device drivers:
- Remove the IBM LCS driver for s390
- Remove the sb1000 cable modem driver
- Add support for SFP module access over SMBus
- Add MCTP transport driver for MCTP-over-USB
- Enable XDP metadata support in multiple drivers
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- add PCIe TLP Processing Hints (TPH) support for new AMD
platforms
- support dumping RoCE queue state for debug
- opt into instance locking
- Intel (100G, ice, idpf):
- ice: rework MSI-X IRQ management and distribution
- ice: support for E830 devices
- iavf: add support for Rx timestamping
- iavf: opt into instance locking
- nVidia/Mellanox:
- mlx4: use page pool memory allocator for Rx
- mlx5: support for one PTP device per hardware clock
- mlx5: support for 200Gbps per-lane link modes
- mlx5: move IPSec policy check after decryption
- AMD/Solarflare:
- support FW flashing via devlink
- Cisco (enic):
- use page pool memory allocator for Rx
- enable 32, 64 byte CQEs
- get max rx/tx ring size from the device
- Meta (fbnic):
- support flow steering and RSS configuration
- report queue stats
- support TCP segmentation
- support IRQ coalescing
- support ring size configuration
- Marvell/Cavium:
- support AF_XDP
- Wangxun:
- support for PTP clock and timestamping
- Huawei (hibmcge):
- checksum offload
- add more statistics
- Ethernet virtual:
- VirtIO net:
- aggressively suppress Tx completions, improve perf by 96%
with 1 CPU and 55% with 2 CPUs
- expose NAPI to IRQ mapping and persist NAPI settings
- Google (gve):
- support XDP in DQO RDA Queue Format
- opt into instance locking
- Microsoft vNIC:
- support BIG TCP
- Ethernet NICs consumer, and embedded:
- Synopsys (stmmac):
- cleanup Tx and Tx clock setting and other link-focused
cleanups
- enable SGMII and 2500BASEX mode switching for Intel platforms
- support Sophgo SG2044
- Broadcom switches (b53):
- support for BCM53101
- TI:
- iep: add perout configuration support
- icssg: support XDP
- Cadence (macb):
- implement BQL
- Xilinx (axinet):
- support dynamic IRQ moderation and changing coalescing at
runtime
- implement BQL
- report standard stats
- MediaTek:
- support phylink managed EEE
- Intel:
- igc: don't restart the interface on every XDP program change
- RealTek (r8169):
- support reading registers of internal PHYs directly
- increase max jumbo packet size on RTL8125/RTL8126
- Airoha:
- support for RISC-V NPU packet processing unit
- enable scatter-gather and support MTU up to 9kB
- Tehuti (tn40xx):
- support cards with TN4010 MAC and an Aquantia AQR105 PHY
- Ethernet PHYs:
- support for TJA1102S, TJA1121
- dp83tg720: add randomized polling intervals for link detection
- dp83822: support changing the transmit amplitude voltage
- support for LEDs on 88q2xxx
- CAN:
- canxl: support Remote Request Substitution bit access
- flexcan: add S32G2/S32G3 SoC
- WiFi:
- remove cooked monitor support
- strict mode for better AP testing
- basic EPCS support
- OMI RX bandwidth reduction support
- batman-adv: add support for jumbo frames
- WiFi drivers:
- RealTek (rtw88):
- support RTL8814AE and RTL8814AU
- RealTek (rtw89):
- switch using wiphy_lock and wiphy_work
- add BB context to manipulate two PHY as preparation of MLO
- improve BT-coexistence mechanism to play A2DP smoothly
- Intel (iwlwifi):
- add new iwlmld sub-driver for latest HW/FW combinations
- MediaTek (mt76):
- preparation for mt7996 Multi-Link Operation (MLO) support
- Qualcomm/Atheros (ath12k):
- continued work on MLO
- Silabs (wfx):
- Wake-on-WLAN support
- Bluetooth:
- add support for skb TX SND/COMPLETION timestamping
- hci_core: enable buffer flow control for SCO/eSCO
- coredump: log devcd dumps into the monitor
- Bluetooth drivers:
- intel: add support to configure TX power
- nxp: handle bootloader error during cmd5 and cmd7"
* tag 'net-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1681 commits)
unix: fix up for "apparmor: add fine grained af_unix mediation"
mctp: Fix incorrect tx flow invalidation condition in mctp-i2c
net: usb: asix: ax88772: Increase phy_name size
net: phy: Introduce PHY_ID_SIZE — minimum size for PHY ID string
net: libwx: fix Tx L4 checksum
net: libwx: fix Tx descriptor content for some tunnel packets
atm: Fix NULL pointer dereference
net: tn40xx: add pci-id of the aqr105-based Tehuti TN4010 cards
net: tn40xx: prepare tn40xx driver to find phy of the TN9510 card
net: tn40xx: create swnode for mdio and aqr105 phy and add to mdiobus
net: phy: aquantia: add essential functions to aqr105 driver
net: phy: aquantia: search for firmware-name in fwnode
net: phy: aquantia: add probe function to aqr105 for firmware loading
net: phy: Add swnode support to mdiobus_scan
gve: add XDP DROP and PASS support for DQ
gve: update XDP allocation path support RX buffer posting
gve: merge packet buffer size fields
gve: update GQ RX to use buf_size
gve: introduce config-based allocation for XDP
gve: remove xdp_xsk_done and xdp_xsk_wakeup statistics
...
Linus Torvalds [Thu, 27 Mar 2025 04:35:28 +0000 (21:35 -0700)]
Merge tag 'zstd-linus-v6.15-rc1' of https://github.com/terrelln/linux
Pull zstd updates from Nick Terrell:
"Update zstd to the latest upstream release v1.5.7.
The two major motivations for updating Zstandard are to keep the code
up to date, and to expose APIs needed by Intel for the QAT
compression accelerator.
Imported cleanly from the upstream tag v1.5.7-kernel, which is signed
by upstream's signing key EF8FE99528B52FFD"
Linus Torvalds [Thu, 27 Mar 2025 04:02:05 +0000 (21:02 -0700)]
Merge tag 'sysctl-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl
Pull sysctl updates from Joel Granados:
- Move vm_table members out of kernel/sysctl.c
All vm_table array members have moved to their respective subsystems
leading to the removal of vm_table from kernel/sysctl.c. This
increases modularity by placing the ctl_tables closer to where they
are actually used and at the same time reducing the chances of merge
conflicts in kernel/sysctl.c.
- ctl_table range fixes
Replace the proc_handler function that checks variable ranges in
coredump_sysctls and vdso_table with the one that actually uses the
extra{1,2} pointers as min/max values. This tightens the range of the
values that users can pass into the kernel effectively preventing
{under,over}flows.
- Misc fixes
Correct grammar errors and typos in test messages. Update sysctl
files in MAINTAINERS. Constify alignment_tbl and remove the array
size from its declaration.
* tag 'sysctl-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: (22 commits)
selftests/sysctl: fix wording of help messages
selftests: fix spelling/grammar errors in sysctl/sysctl.sh
MAINTAINERS: Update sysctl file list in MAINTAINERS
sysctl: Fix underflow value setting risk in vm_table
coredump: Fixes core_pipe_limit sysctl proc_handler
sysctl: remove unneeded include
sysctl: remove the vm_table
sh: vdso: move the sysctl to arch/sh/kernel/vsyscall/vsyscall.c
x86: vdso: move the sysctl to arch/x86/entry/vdso/vdso32-setup.c
fs: dcache: move the sysctl to fs/dcache.c
sunrpc: simplify rpcauth_cache_shrink_count()
fs: drop_caches: move sysctl to fs/drop_caches.c
fs: fs-writeback: move sysctl to fs/fs-writeback.c
mm: nommu: move sysctl to mm/nommu.c
security: min_addr: move sysctl to security/min_addr.c
mm: mmap: move sysctl to mm/mmap.c
mm: util: move sysctls to mm/util.c
mm: vmscan: move vmscan sysctls to mm/vmscan.c
mm: swap: move sysctl to mm/swap.c
mm: filemap: move sysctl to mm/filemap.c
...
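The ctl_table range fix in the sysctl pull above comes down to checking a written value against the bounds behind the extra1/extra2 pointers before storing it. A minimal user-space sketch of that idea follows; `struct demo_ctl` and `demo_write_minmax()` are hypothetical simplifications for illustration, not the kernel's `struct ctl_table` or its handlers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a sysctl table entry. */
struct demo_ctl {
	int *data;         /* variable backing the sysctl */
	const int *extra1; /* optional lower bound, NULL if unbounded */
	const int *extra2; /* optional upper bound, NULL if unbounded */
};

/* Store val only if it lies within [*extra1, *extra2]; reject it otherwise. */
static int demo_write_minmax(const struct demo_ctl *ctl, int val)
{
	assert(ctl && ctl->data);
	if (ctl->extra1 && val < *ctl->extra1)
		return -EINVAL;
	if (ctl->extra2 && val > *ctl->extra2)
		return -EINVAL;
	*ctl->data = val;
	return 0;
}
```

A handler that ignores the bounds would store any value, which is how under- or overflows can reach the variable; rejecting out-of-range writes leaves the stored value untouched.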
Linus Torvalds [Thu, 27 Mar 2025 03:10:09 +0000 (20:10 -0700)]
Merge tag 'iommu-updates-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu updates from Joerg Roedel:
"Core iommufd dependencies from Jason:
- Change the iommufd fault handle into an always present hwpt handle
in the domain
- Give iommufd its own SW_MSI implementation along with some IRQ
layer rework
- Improvements to the handle attach API
Core fixes for probe-issues from Robin
Intel VT-d changes:
- Checking for SVA support in domain allocation and attach paths
- Move PCI ATS and PRI configuration into probe paths
- Fix a potential hang on reboot -f
- Miscellaneous cleanups
AMD-Vi changes:
- Support for up to 2k IRQs per PCI device function
- Set of smaller fixes
ARM-SMMU changes:
- SMMUv2 devicetree binding updates for Qualcomm implementations
(QCS8300 GPU and MSM8937)
- Clean up SMMUv2 runtime PM implementation to help with wider rework
of pm_runtime_put_autosuspend()
S390 IOMMU changes:
- Support for IOMMU passthrough
Apple Dart changes:
- Driver adjustments to meet ISP device requirements
- Null-ptr deref fix
- Disable subpage protection for DART 1"
* tag 'iommu-updates-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (54 commits)
iommu/vt-d: Fix possible circular locking dependency
iommu/vt-d: Don't clobber posted vCPU IRTE when host IRQ affinity changes
iommu/vt-d: Put IRTE back into posted MSI mode if vCPU posting is disabled
iommu: apple-dart: fix potential null pointer deref
iommu/rockchip: Retire global dma_dev workaround
iommu/rockchip: Register in a sensible order
iommu/rockchip: Allocate per-device data sensibly
iommu/mediatek-v1: Support COMPILE_TEST
iommu/amd: Enable support for up to 2K interrupts per function
iommu/amd: Rename DTE_INTTABLEN* and MAX_IRQS_PER_TABLE macro
iommu/amd: Replace slab cache allocator with page allocator
iommu/amd: Introduce generic function to set multibit feature value
iommu: Don't warn prematurely about dodgy probes
iommu/arm-smmu: Set rpm auto_suspend once during probe
dt-bindings: arm-smmu: Document QCS8300 GPU SMMU
iommu: Get DT/ACPI parsing into the proper probe path
iommu: Keep dev->iommu state consistent
iommu: Resolve ops in iommu_init_device()
iommu: Handle race with default domain setup
iommu: Unexport iommu_fwspec_free()
...
Linus Torvalds [Thu, 27 Mar 2025 02:57:34 +0000 (19:57 -0700)]
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"Updates to the usual drivers (scsi_debug, ufs, lpfc, st, fnic, mpi3mr,
mpt3sas) and the removal of cxlflash.
The only non-trivial core change is an addition to unit attention
handling to recognize UAs for power on/reset and new media so the tape
driver can use it"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (107 commits)
scsi: st: Tighten the page format heuristics with MODE SELECT
scsi: st: ERASE does not change tape location
scsi: st: Fix array overflow in st_setup()
scsi: target: tcm_loop: Fix wrong abort tag
scsi: lpfc: Restore clearing of NLP_UNREG_INP in ndlp->nlp_flag
scsi: hisi_sas: Fixed failure to issue vendor specific commands
scsi: fnic: Remove unnecessary NUL-terminations
scsi: fnic: Remove redundant flush_workqueue() calls
scsi: core: Use a switch statement when attaching VPD pages
scsi: ufs: renesas: Add initialization code for R-Car S4-8 ES1.2
scsi: ufs: renesas: Add reusable functions
scsi: ufs: renesas: Refactor 0x10ad/0x10af PHY settings
scsi: ufs: renesas: Remove register control helper function
scsi: ufs: renesas: Add register read to remove save/set/restore
scsi: ufs: renesas: Replace init data by init code
scsi: ufs: dt-bindings: renesas,ufs: Add calibration data
scsi: mpi3mr: Task Abort EH Support
scsi: storvsc: Don't report the host packet status as the hv status
scsi: isci: Make most module parameters static
scsi: megaraid_sas: Make most module parameters static
...
Linus Torvalds [Thu, 27 Mar 2025 02:49:02 +0000 (19:49 -0700)]
Merge tag 'ata-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata updates from Niklas Cassel:
- Add 'external' to the libata.force module parameter, in order to
allow a user to work around broken firmware (me)
- Use the str_up_down() helper in the sata_via driver (Salah Triki)
- Convert the Freescale PowerQUICC SATA device tree binding to YAML
(J. Neuschäfer)
- Do not use ATAPI DMA for a device that only supports PIO (me)
- Add Marvell 88SE9215 PCI device ID to the ahci driver. Since the
controller has quirks, it cannot rely on the generic AHCI PCI class
code entry (Daniel Kral)
- Improve the return value of atapi_check_dma() (Huacai Chen)
- Fix the NCQ Non-Data log not supported print to actually reference
the correct log (me)
- Make Marvell 88SE9215 prefer DMA for ATAPI devices (Huacai Chen)
- Simplify the AHCI IRQ vector allocations by performing the IRQ vector
allocations in the same function, regardless of IRQ type (Tomas
Henzl)
* tag 'ata-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: ahci: simplify init function
ahci: Marvell 88SE9215 controllers prefer DMA for ATAPI
ata: libata: Fix NCQ Non-Data log not supported print
ata: libata: Improve return value of atapi_check_dma()
ahci: add PCI ID for Marvell 88SE9215 SATA Controller
ata: libata-eh: Do not use ATAPI DMA for a device limited to PIO mode
dt-bindings: ata: Convert fsl,pq-sata to YAML
ata: sata_via: Use str_up_down() helper in vt6420_prereset()
ata: libata-core: Add 'external' to the libata.force kernel parameter
Linus Torvalds [Thu, 27 Mar 2025 01:08:55 +0000 (18:08 -0700)]
Merge tag 'for-6.15/block-20250322' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:
- Fixes for integrity handling
- NVMe pull request via Keith:
- Secure concatenation for TCP transport (Hannes)
- Multipath sysfs visibility (Nilay)
- Various cleanups (Qasim, Baruch, Wang, Chen, Mike, Damien, Li)
- Correct use of 64-bit BARs for pci-epf target (Niklas)
- Socket fix for selinux when used in containers (Peijie)
- MD pull request via Yu:
- fix recovery can preempt resync (Li Nan)
- fix md-bitmap IO limit (Su Yue)
- fix raid10 discard with REQ_NOWAIT (Xiao Ni)
- fix raid1 memory leak (Zheng Qixing)
- fix mddev uaf (Yu Kuai)
- fix raid1,raid10 IO flags (Yu Kuai)
- some refactor and cleanup (Yu Kuai)
- Series cleaning up and fixing bugs in the bad block handling code
- Improve support for write failure simulation in null_blk
- Various lock ordering fixes
- Fixes for locking for debugfs attributes
- Various ublk related fixes and improvements
- Cleanups for blk-rq-qos wait handling
- blk-throttle fixes
- Fixes for loop dio and sync handling
- Fixes and cleanups for the auto-PI code
- Block side support for hardware encryption keys in blk-crypto
- Various cleanups and fixes
* tag 'for-6.15/block-20250322' of git://git.kernel.dk/linux: (105 commits)
nvmet: replace max(a, min(b, c)) by clamp(val, lo, hi)
nvme-tcp: fix selinux denied when calling sock_sendmsg
nvmet: pci-epf: Always configure BAR0 as 64-bit
nvmet: Remove duplicate uuid_copy
nvme: zns: Simplify nvme_zone_parse_entry()
nvmet: pci-epf: Remove redundant 'flush_workqueue()' calls
nvmet-fc: Remove unused functions
nvme-pci: remove stale comment
nvme-fc: Utilise min3() to simplify queue count calculation
nvme-multipath: Add visibility for queue-depth io-policy
nvme-multipath: Add visibility for numa io-policy
nvme-multipath: Add visibility for round-robin io-policy
nvmet: add tls_concat and tls_key debugfs entries
nvmet-tcp: support secure channel concatenation
nvmet: Add 'sq' argument to alloc_ctrl_args
nvme-fabrics: reset admin connection for secure concatenation
nvme-tcp: request secure channel concatenation
nvme-keyring: add nvme_tls_psk_refresh()
nvme: add nvme_auth_derive_tls_psk()
nvme: add nvme_auth_generate_digest()
...
Linus Torvalds [Thu, 27 Mar 2025 00:56:00 +0000 (17:56 -0700)]
Merge tag 'for-6.15/io_uring-20250322' of git://git.kernel.dk/linux
Pull io_uring updates from Jens Axboe:
"This is the first of the io_uring pull requests for the 6.15 merge
window; there will be others once the net tree has gone in. This
contains:
- Cleanup and unification of cancelation handling across various
request types.
- Improvement for bundles, supporting them both for incrementally
consumed buffers, and for non-multishot requests.
- Enable toggling of using iowait while waiting on io_uring events or
not. Unfortunately this is still tied with CPU frequency boosting
on short waits, as the scheduler side has not been very receptive
to splitting the (useless) iowait stat from the cpufreq implied
boost.
- Add support for kbuf nodes, enabling zero-copy support for the ublk
block driver.
- Various cleanups for resource node handling.
- Series greatly cleaning up the legacy provided (non-ring based)
buffers. For years, we've been pushing the ring provided buffers as
the way to go, and that is what people have been using. Reduce the
complexity and code associated with legacy provided buffers.
- Series cleaning up the compat handling.
- Series improving and cleaning up the recvmsg/sendmsg iovec and msg
handling.
- Series of cleanups for io-wq.
- Start adding a bunch of selftests. The liburing repository
generally carries feature and regression tests for everything, but
at least for ublk initially, we'll try and go the route of having
it in selftests as well. We'll see how this goes, might decide to
migrate more tests this way in the future.
- Various little cleanups and fixes"
* tag 'for-6.15/io_uring-20250322' of git://git.kernel.dk/linux: (108 commits)
selftests: ublk: add stripe target
selftests: ublk: simplify loop io completion
selftests: ublk: enable zero copy for null target
selftests: ublk: prepare for supporting stripe target
selftests: ublk: move common code into common.c
selftests: ublk: increase max buffer size to 1MB
selftests: ublk: add single sqe allocator helper
selftests: ublk: add generic_01 for verifying sequential IO order
selftests: ublk: fix starting ublk device
io_uring: enable toggle of iowait usage when waiting on CQEs
selftests: ublk: fix write cache implementation
selftests: ublk: add variable for user to not show test result
selftests: ublk: don't show `modprobe` failure
selftests: ublk: add one dependency header
io_uring/kbuf: enable bundles for incrementally consumed buffers
Revert "io_uring/rsrc: simplify the bvec iter count calculation"
selftests: ublk: improve test usability
selftests: ublk: add stress test for covering IO vs. killing ublk server
selftests: ublk: add one stress test for covering IO vs. removing device
selftests: ublk: load/unload ublk_drv when preparing & cleaning up tests
...
Jann Horn [Wed, 26 Mar 2025 21:04:36 +0000 (22:04 +0100)]
rwonce: fix crash by removing READ_ONCE() for unaligned read
When arm64 is built with LTO, it upgrades READ_ONCE() to ldar / ldapr
(load-acquire) to avoid issues that can be caused by the compiler
optimizing away implicit address dependencies.
Unlike plain loads, these load-acquire instructions actually require an
aligned address.
For now, fix it by removing the READ_ONCE() that the buggy commit
introduced.
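A common portable way to read from a possibly unaligned address is to copy the bytes with memcpy() rather than dereference a cast pointer. The sketch below illustrates that general technique only; it is not the code from this commit, and `demo_load_u32()` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address with no alignment guarantee. */
static uint32_t demo_load_u32(const void *p)
{
	uint32_t v;

	assert(p);
	memcpy(&v, p, sizeof(v)); /* byte-wise copy: no aligned access required */
	return v;
}
```

Compilers typically turn such a fixed-size memcpy() into a plain load where the architecture permits it, so the pattern costs little on targets that tolerate unaligned access.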
Linus Torvalds [Wed, 26 Mar 2025 20:30:27 +0000 (13:30 -0700)]
Merge tag 'timers-clocksource-2025-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull clocksource/event updates from Thomas Gleixner:
- Add support for suspend/resume in the STM32 LP-Timer driver with a
follow up fix, which uses the proper method to set up the timer as an
optional wakeup source instead of trying to force it as a mandatory
wakeup source.
- The usual device tree updates to enable new SoC models in existing
drivers.
- Trivial spelling, style and indentation fixes
* tag 'timers-clocksource-2025-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
dt-bindings: timer: Add SiFive CLINT2
clocksource/drivers/stm32-lptimer: Use wakeup capable instead of init wakeup
clocksource/drivers/exynos_mct: Fixed a spelling error
clocksource/drivers/stm32-lptimer: Add support for suspend / resume
dt-bindings: timer: exynos4210-mct: add samsung,exynos2200-mct-peris compatible
dt-bindings: timer: exynos4210-mct: Add samsung,exynos990-mct compatible
dt-bindings: timer: Correct indentation and style in DTS example
Linus Torvalds [Wed, 26 Mar 2025 20:20:22 +0000 (13:20 -0700)]
Merge tag 'irq-urgent-2025-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull MSI irq fix from Thomas Gleixner:
"An urgent fix for the XEN related PCI/MSI changes:
XEN used a global variable to disable the masking of MSI interrupts as
XEN handles that on the hypervisor side. This turned out to be a
problem with VMD as the PCI devices behind a VMD bridge are not always
handled by the hypervisor and then require masking by guest.
To solve this the global variable was replaced by an interrupt domain
specific flag, which is set by the generic XEN PCI/MSI domain, but not
by VMD or any other domain in the system.
So far, so good. But the implementation (and the reviewer) missed the
fact that accessing the domain flag cannot be done directly because
there are at least two situations where this fails.
Legacy architectures do not provide interrupt domains at all. The
new MSI parent domains are not required to have a domain info pointer.
Both cases result in an unconditional NULL pointer dereference.
The PCI/MSI code already has a function to query the MSI domain
specific flag in a safe way, which handles all possible cases of
PCI/MSI backends.
So the fix is simply to replace the open coded checks by invoking the
safe helper to query the flag"
* tag 'irq-urgent-2025-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
PCI/MSI: Handle the NOMASK flag correctly for all PCI/MSI backends
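The failure mode described in this pull request can be sketched in plain C. The types and the helper name below are hypothetical simplifications, not the kernel's real MSI structures or API; the sketch only shows why a flag query has to tolerate a missing domain or a missing info pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's real structures. */
#define DEMO_FLAG_NOMASK 0x1u

struct demo_domain_info { unsigned int flags; };
struct demo_domain { struct demo_domain_info *info; /* may be NULL */ };
struct demo_dev { struct demo_domain *domain; /* NULL on legacy setups */ };

/* Query a domain flag without assuming the domain or its info exists. */
static bool demo_domain_has_flag(const struct demo_dev *dev, unsigned int flag)
{
	assert(dev);
	if (!dev->domain || !dev->domain->info)
		return false;
	return (dev->domain->info->flags & flag) != 0;
}
```

An open-coded `dev->domain->info->flags & flag` would dereference NULL in both of the cases the message lists, whereas routing every caller through one helper keeps the checks in a single place.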
Linus Torvalds [Wed, 26 Mar 2025 17:28:36 +0000 (10:28 -0700)]
Merge tag 'mtd/for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull mtd updates from Miquel Raynal:
"MTD changes:
- The atmel,dataflash binding has been converted to yaml and the
physmap one constrained. Some logs are improved, error paths are
reworked a bit, and a few patches target the use of
str_enabled_disabled().
Raw NAND changes:
- i.MX8 and i.MX31 now have their own compatible, the Qcom driver got
cleaned, the Broadcom driver got fixed.
SPI NAND changes:
- OTP support has been added, and the ESMT and Micron manufacturer
drivers implement it.
- Read retry support has been added, and the Macronix manufacturer
driver implements it.
SPI NOR changes:
- Add support for a few flashes, plus a few cleanup patches for the
core driver, which touch the header inclusion list and start
using the scope-based mutex cleanup helpers.
There is also a bunch of minor improvements and fixes in drivers
and bindings"
* tag 'mtd/for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (34 commits)
dt-bindings: mtd: atmel,dataflash: convert txt to yaml
mtd: mchp48l640: Use str_enable_disable() in mchp48l640_write_prepare()
mtd: rawnand: gpmi: Use str_enabled_disabled() in gpmi_nand_attach_chip()
mtd: mtdpart: Do not supply NULL to printf()
dt-bindings: mtd: gpmi-nand: Add compatible string for i.MX8 chips
mtd: nand: Fix a kdoc comment
mtd: spinand: Improve spinand_info macros style
mtd: spi-nor: drop unused <linux/of_platform.h>
mtd: spi-nor: explicitly include <linux/of.h>
mtd: spi-nor: explicitly include <linux/math64.h>
mtd: spi-nor: macronix: add support for mx66{l2, u1}g45g
mtd: spi-nor: macronix: Add post_sfdp fixups for Quad Input Page Program
mtd: Fix error handling in mtd_device_parse_register() error path
mtd: capture device name setting failure when adding mtd
mtd: Add check for devm_kcalloc()
mtd: Replace kcalloc() with devm_kcalloc()
dt-bindings: mtd: physmap: Ensure all properties are defined
mtd: rawnand: brcmnand: fix PM resume warning
dt-bindings: mtd: mxc-nand: Document fsl,imx31-nand
mtd: spinand: macronix: Add support for read retry
...
Linus Torvalds [Wed, 26 Mar 2025 17:05:43 +0000 (10:05 -0700)]
Merge tag 'hid-for-linus-2025032601' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID updates from Jiri Kosina:
- PlayStation 5 controllers support (Alex Henrie)
- big revamp and modernization of the aged hid-pidff force feedback
driver (Tomasz Pakuła)
- conversion of hid-lg-g15 to standard multicolor LED API (Kate Hsuan)
- improvement of behavior of Human Presence Sensor (HPD) in amd_sfh
driver (Mario Limonciello)
- other assorted fixes, code cleanups and device ID additions
* tag 'hid-for-linus-2025032601' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (70 commits)
HID: remove superfluous (and wrong) Makefile entry for CONFIG_INTEL_ISH_FIRMWARE_DOWNLOADER
HID: Intel-thc-hid: Intel-quickspi: Correct device state names gramatically
HID: wacom: Remove static WACOM_PKGLEN_MAX limit
HID: amd_sfh: Don't show wrong status for amd_sfh_hpd_info()
HID: amd_sfh: Default to HPD disabled
HID: amd_sfh: Allow configuring whether HPD is enabled or disabled
HID: pidff: Fix set_device_control()
HID: pidff: Fix 90 degrees direction name North -> East
HID: pidff: Compute INFINITE value instead of using hardcoded 0xffff
HID: pidff: Clamp effect playback LOOP_COUNT value
HID: pidff: Rename two functions to align them with naming convention
HID: lenovo: silence unreachable code warning
HID: lenovo: Fix to ensure the data as __le32 instead of u32
HID: bpf: add a v6.11+ compatible BPF fixup for the XPPen ACK05 remote
HID: bpf: new hid_bpf_async.h common header
HID: bpf: import new kfunc from v6.10 & v6.11
HID: bpf: add support for the XP-Pen Artist Pro 19 (gen2)
HID: bpf: Added updated Kamvas Pro 19 descriptor
HID: bpf: Suppress bogus F13 trigger on Sirius keyboard full fan shortcut
HID: bpf: Add support for the default firmware mode of the Huion K20
...
* tag 'platform-drivers-x86-v6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (75 commits)
platform/x86: x86-android-tablets: Add select POWER_SUPPLY to Kconfig
platform/x86/amd/pmf: convert timeouts to secs_to_jiffies()
platform/x86: thinkpad_acpi: convert timeouts to secs_to_jiffies()
irqdomain: platform/x86: Switch to irq_domain_create_linear()
platform/x86/amd/pmc: fix leak in probe()
tools/power/x86/intel-speed-select: v1.22 release
tools/power/x86/intel-speed-select: Prefix header search path with sysroot
tools/power/x86/intel-speed-select: Die ID for IO dies
tools/power/x86/intel-speed-select: Fix the condition to check multi die system
tools/power/x86/intel-speed-select: Prevent increasing MAX_DIE_PER_PACKAGE
platform/x86/amd/pmc: Use managed APIs for mutex
platform/x86/amd/pmc: Remove unnecessary line breaks
platform/x86/amd/pmc: Move macros and structures to the PMC header file
platform/x86/amd/pmc: Notify user when platform does not support s0ix transition
platform/x86: dell-ddv: Use the power supply extension mechanism
platform/x86: dell-ddv: Use devm_battery_hook_register
platform/x86: dell-ddv: Fix temperature calculation
platform/x86: thinkpad_acpi: check the return value of devm_mutex_init()
platform/x86: samsung-galaxybook: Fix block_recording not supported logic
platform/x86: dell-uart-backlight: Make dell_uart_bl_serdev_driver static
...
Miquel Raynal [Wed, 26 Mar 2025 16:49:15 +0000 (17:49 +0100)]
Merge tag 'nand/for-6.15' into mtd/next
* Raw NAND changes:
i.MX8 and i.MX31 now have their own compatible, the Qcom driver got
cleaned, the Broadcom driver got fixed.
* SPI NAND changes:
Two main features have been added:
- OTP support has been added, and the ESMT and Micron manufacturer
drivers implement it.
- Read retry support has been added, and the Macronix manufacturer
driver implements it.
There is also a bunch of minor improvements and fixes in drivers and
bindings.
Miquel Raynal [Wed, 26 Mar 2025 16:49:01 +0000 (17:49 +0100)]
Merge tag 'spi-nor/for-6.15' into mtd/next
SPI NOR adds support for a few flashes, plus a few cleanup patches for
the core driver, which touch the header inclusion list and start using
the scope-based mutex cleanup helpers.
Linus Torvalds [Wed, 26 Mar 2025 16:41:55 +0000 (09:41 -0700)]
Merge tag 'sound-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"We've received lots of commits at this time, as a result of various
cleanup and refactoring work as well as a few new drivers and the
generic SoundWire support. Most of the changes are device-specific,
with few core changes. Some highlights below:
Core:
- A couple of (rather minor) race fixes in ALSA sequencer code
- A regression fix in ALSA timer code that may lead to a deadlock
ASoC:
- A large series of code conversion to use modern terminology for the
clocking configuration
- Conversions of PM ops with the modern macros in all ASoC drivers
- Clarification of the control operations
- Preparatory work for more generic SoundWire SDCA controls
- Support for AMD ACP 7.x, AWINC WM88166, Everest ES8388, Intel AVS
PEAKVOL and GAIN DSP modules, Mediatek MT8188 DMIC, NXP i.MX95,
nVidia Tegra interconnects, Rockchip RK3588 S/PDIF, Texas
Instruments SN012776 and TAS5770L, and Wolfson WM8904 DMICs
Others:
- Conversions of PM ops with the modern macros in the rest of the drivers
- USB-audio quirks and fixes for Presonus Studio, DJM-A9, CME
- HD-audio quirks and fixes for ASUS, HP, Lenovo, and others"
* tag 'sound-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (651 commits)
ALSA: hda: tas2781-i2c: Remove unnecessary NULL check before release_firmware()
ALSA: hda: cs35l56: Remove unnecessary NULL check before release_firmware()
ALSA: hda/realtek: Bass speaker fixup for ASUS UM5606KA
ALSA: hda/realtek: Fix built-in mic assignment on ASUS VivoBook X515UA
ALSA: hda/realtek: Add support for various HP Laptops using CS35L41 HDA
ALSA: timer: Don't take register_mutex with copy_from/to_user()
ASoC: SDCA: Correct handling of selected mode DisCo property
ASoC: amd: yc: update quirk data for new Lenovo model
ALSA: hda/realtek: fix micmute LEDs on HP Laptops with ALC3247
ALSA: hda/realtek: fix micmute LEDs on HP Laptops with ALC3315
ASoC: SOF: mediatek: Commonize duplicated functions
ASoC: dmic: Fix NULL pointer dereference
ASoC: wm8904: add DMIC support
ASoC: wm8904: get platform data from DT
ASoC: dt-bindings: wm8904: Add DMIC, GPIO, MIC and EQ support
ASoC: wm8904: Don't touch GPIO configs set to 0xFFFF
of: Add of_property_read_u16_index
ALSA: oxygen: Fix dependency on CONFIG_PM_SLEEP
ASoC: ops: Apply platform_max after deciding control type
ASoC: ops: Remove some unnecessary local variables
...
Stephen Rothwell [Wed, 26 Mar 2025 04:01:48 +0000 (15:01 +1100)]
unix: fix up for "apparmor: add fine grained af_unix mediation"
After merging the apparmor tree, today's linux-next build (x86_64
allmodconfig) failed like this:
security/apparmor/af_unix.c: In function 'unix_state_double_lock':
security/apparmor/af_unix.c:627:17: error: implicit declaration of function 'unix_state_lock'; did you mean 'unix_state_double_lock'? [-Wimplicit-function-declaration]
627 | unix_state_lock(sk1);
| ^~~~~~~~~~~~~~~
| unix_state_double_lock
security/apparmor/af_unix.c: In function 'unix_state_double_unlock':
security/apparmor/af_unix.c:642:17: error: implicit declaration of function 'unix_state_unlock'; did you mean 'unix_state_double_lock'? [-Wimplicit-function-declaration]
642 | unix_state_unlock(sk1);
| ^~~~~~~~~~~~~~~~~
| unix_state_double_lock
Caused by commit
c05e705812d1 ("apparmor: add fine grained af_unix mediation")
interacting with commit
84960bf24031 ("af_unix: Move internal definitions to net/unix/.")
Tomas Glozar [Thu, 20 Mar 2025 09:25:00 +0000 (10:25 +0100)]
rtla/tests: Test setting default options
Add function to test engine to test with pre-set osnoise options, and
use it to test whether osnoise period (as an example) is set correctly.
The test works by pre-setting a high period of 10 minutes and stop on
threshold. Thus, it is easy to check whether rtla is properly resetting
the period to default: if it is, the test will complete on time, since
the first sample will overflow the threshold. If not, it will time out.
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250320092500.101385-7-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Reviewed-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tomas Glozar [Thu, 20 Mar 2025 09:24:59 +0000 (10:24 +0100)]
rtla/tests: Reset osnoise options before check
Remove any dangling tracing instances from previous improperly exited
runs of rtla, and reset osnoise options to default before running a test
case.
This ensures that the test results are deterministic. Specific test
cases checked that rtla behaves correctly even when the tracer state is
not clean will be added later.
Tomas Glozar [Thu, 20 Mar 2025 09:24:58 +0000 (10:24 +0100)]
rtla: Always set all tracer options
rtla currently only sets tracer options that are explicitly set by the
user, with the exception of OSNOISE_WORKLOAD.
This leads to improper behavior in case rtla is run with those options
not set to the default value. rtla does reset them to the original
value upon exiting, but that does not protect it from starting with
non-default values set either by an improperly exited rtla or by another
user of the tracers.
Fix the problem by setting the default value for all tracer options if
the user has not provided their own value.
For most of the options, it's enough to just drop the if clause checking
for the value being set. For cpus, "all" is used as the default value,
and for osnoise default period and runtime, default values of
the osnoise_data variable in trace_osnoise.c are used.
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250320092500.101385-5-tglozar@redhat.com
Fixes: 1eceb2fc2ca5 ("rtla/osnoise: Add osnoise top mode")
Fixes: 829a6c0b5698 ("rtla/osnoise: Add the hist mode")
Fixes: a828cd18bc4a ("rtla: Add timerlat tool and timelart top mode")
Fixes: 1eeb6328e8b3 ("rtla/timerlat: Add timerlat hist mode")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Reviewed-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
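The behavior change in "rtla: Always set all tracer options" can be pictured as picking an effective value for each option and always writing it. This is a sketch under stated assumptions: the `DEMO_UNSET` sentinel and `demo_effective_option()` are hypothetical and do not reflect how rtla actually represents unset parameters.

```c
#include <assert.h>

/* Hypothetical sentinel meaning "the user did not pass this option". */
#define DEMO_UNSET (-1LL)

/* Return the value to write: the user's choice if given, else the default. */
static long long demo_effective_option(long long user_value,
				       long long default_value)
{
	assert(default_value != DEMO_UNSET);
	return user_value != DEMO_UNSET ? user_value : default_value;
}
```

Writing the result unconditionally, instead of skipping the write when the user gave no value, is what keeps a stale setting left behind by an earlier run from leaking into the next one.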
This situation can also happen when running rtla-osnoise after an
improperly exited rtla-timerlat run.
Set OSNOISE_WORKLOAD in rtla-osnoise, too, similarly to what we
already did for timerlat in commit 217f0b1e990e ("rtla/timerlat_top: Set
OSNOISE_WORKLOAD for kernel threads") and commit d8d866171a41
("rtla/timerlat_hist: Set OSNOISE_WORKLOAD for kernel threads").
Note that there is no user workload mode for rtla-osnoise yet, so
OSNOISE_WORKLOAD is always set to true.
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250320092500.101385-4-tglozar@redhat.com
Fixes: 1eceb2fc2ca5 ("rtla/osnoise: Add osnoise top mode")
Fixes: 829a6c0b5698 ("rtla/osnoise: Add the hist mode")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Reviewed-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tomas Glozar [Thu, 20 Mar 2025 09:24:56 +0000 (10:24 +0100)]
rtla: Unify apply_config between top and hist
The functions osnoise_top_apply_config and osnoise_hist_apply_config, as
well as timerlat_top_apply_config and timerlat_hist_apply_config, are
mostly the same.
Move the common parts into separate functions osnoise_apply_config
and timerlat_apply_config.
For rtla-timerlat, also unify params->user_hist and params->user_top
into one field called params->user_data, and move several fields used
only by timerlat-top into the top-only section of struct
timerlat_params.
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250320092500.101385-3-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Reviewed-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tomas Glozar [Thu, 20 Mar 2025 09:24:55 +0000 (10:24 +0100)]
rtla/osnoise: Unify params struct
Instead of having separate structs osnoise_top_params and
osnoise_hist_params, use one struct osnoise_params for both.
This allows code using the structs to be shared between osnoise-top and
osnoise-hist.
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250320092500.101385-2-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Reviewed-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tomas Glozar [Thu, 13 Mar 2025 14:10:34 +0000 (15:10 +0100)]
rtla: Fix segfault in save_trace_to_file call
Running rtla with exit on threshold, but without saving trace leads to a
segmentation fault:
$ rtla timerlat hist -T 10
...
Max timerlat IRQ latency from idle: 4.29 us in cpu 0
Segmentation fault
This is caused by a null pointer dereference in the call of
save_trace_to_file, which attempts to dereference an uninitialized
osnoise_tool variable:
save_trace_to_file(record->trace.inst, params->trace_output);
^ this is uninitialized if params->trace_output is
not set
Fix this by not attempting to dereference "record" if it is NULL and
passing NULL instead. As a safety measure, the first field is also
checked for NULL inside save_trace_to_file.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Link: https://lore.kernel.org/20250313141034.299117-1-tglozar@redhat.com
Fixes: dc4d4e7c72d1 ("rtla: Refactor save_trace_to_file")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
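The shape of the fix described above, guarding the caller and also checking inside the callee, might look roughly like this. The types and function names are hypothetical stand-ins for illustration and are not the rtla sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the rtla types. */
struct demo_instance { int id; };
struct demo_tool { struct demo_instance *inst; };

/* Returns 0 on success, -1 if there is nothing to save. */
static int demo_save_trace(const struct demo_instance *inst, const char *path)
{
	if (!inst || !path)
		return -1; /* safety check instead of dereferencing NULL */
	return 0;          /* a real implementation would write the trace here */
}

/* Caller side: only look inside record when it was actually created. */
static int demo_stop_and_save(const struct demo_tool *record, const char *path)
{
	return demo_save_trace(record ? record->inst : NULL, path);
}
```

Checking on both sides is redundant on purpose: the caller no longer dereferences a missing record, and the callee stays safe if another caller passes NULL later.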
Tomas Glozar [Wed, 26 Mar 2025 00:40:18 +0000 (01:40 +0100)]
tools/build: Use SYSTEM_BPFTOOL for system bpftool
The feature test for system bpftool uses BPFTOOL as the variable to set
its path, defaulting to just "bpftool" if not set by the user.
This conflicts with selftests and a few other utilities, which expect
BPFTOOL to be set to the in-tree bpftool path by default. For example,
bpftool selftests fail to build:
$ make -C tools/testing/selftests/bpf/
make: Entering directory '/home/tglozar/dev/linux/tools/testing/selftests/bpf'
make: *** No rule to make target 'bpftool', needed by '/home/tglozar/dev/linux/tools/testing/selftests/bpf/tools/include/vmlinux.h'. Stop.
make: Leaving directory '/home/tglozar/dev/linux/tools/testing/selftests/bpf'
Fix the problem by renaming the variable used for system bpftool from
BPFTOOL to SYSTEM_BPFTOOL, so that the new usage does not conflict with
the existing one of BPFTOOL.
Daniel Hsu [Tue, 25 Mar 2025 08:10:08 +0000 (16:10 +0800)]
mctp: Fix incorrect tx flow invalidation condition in mctp-i2c
Previously, the condition for invalidating the tx flow in
mctp_i2c_invalidate_tx_flow() checked if `rc` was nonzero.
However, this could incorrectly trigger the invalidation
even when `rc > 0` was returned as a success status.
This patch updates the condition to explicitly check for `rc < 0`,
ensuring that only error cases trigger the invalidation.
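As a rough sketch of the corrected condition (the helper name is
hypothetical and is not the driver's function), only negative return
values are treated as errors:

```c
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether a transmit result should invalidate
 * the tx flow. Negative values are errors; zero and positive values are
 * treated as success, so they must not trigger the invalidation.
 */
static bool tx_result_needs_invalidation(int rc)
{
	return rc < 0;
}
```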
Signed-off-by: Daniel Hsu <Daniel-Hsu@quantatw.com> Reviewed-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Mickaël Salaün [Thu, 20 Mar 2025 19:07:17 +0000 (20:07 +0100)]
landlock: Add audit documentation
Because audit is dedicated to the system administrator, create a new
entry in Documentation/admin-guide/LSM . Extend other Landlock
documentation's pages with this new one.
Mickaël Salaün [Thu, 20 Mar 2025 19:07:13 +0000 (20:07 +0100)]
selftests/landlock: Add audit tests for ptrace
Add tests for all ptrace actions checking "blockers=ptrace" records.
This also improves PTRACE_TRACEME and PTRACE_ATTACH tests by making sure
that the restriction comes from Landlock, and with the expected
process. These extended tests are like enhanced errno checks that make
sure Landlock enforcement is consistent.
Mickaël Salaün [Thu, 20 Mar 2025 19:07:11 +0000 (20:07 +0100)]
selftests/landlock: Add tests for audit flags and domain IDs
Add audit_test.c to check with and without LANDLOCK_RESTRICT_SELF_*
flags against the two Landlock audit record types:
AUDIT_LANDLOCK_ACCESS and AUDIT_LANDLOCK_DOMAIN.
Check consistency of domain IDs per layer in AUDIT_LANDLOCK_ACCESS and
AUDIT_LANDLOCK_DOMAIN messages: denied access, domain allocation, and
domain deallocation.
These tests use signal scoping to make it simple. They are not in the
scoped_signal_test.c file but in the new dedicated audit_test.c file.
Tests are run with audit filters to ensure the audit records come from
the test program. Moreover, because there can only be one audit
process, tests would fail if run in parallel. Because of audit
limitations, tests can only be run in the initial namespace.
The audit test helpers were inspired by libaudit and
tools/testing/selftests/net/netfilter/audit_logread.c
Mickaël Salaün [Thu, 20 Mar 2025 19:07:10 +0000 (20:07 +0100)]
selftests/landlock: Extend tests for landlock_restrict_self(2)'s flags
Add the base_test's restrict_self_fd_flags tests to align with previous
restrict_self_fd tests but with the new
LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF flag.
Add the restrict_self_flags tests to check that
LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF,
LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON, and
LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF are valid but not the next
bit. Some checks are similar to restrict_self_checks_ordering's ones.
Mickaël Salaün [Thu, 20 Mar 2025 19:07:09 +0000 (20:07 +0100)]
selftests/landlock: Add test for invalid ruleset file descriptor
To align with fs_test's layout1.inval and layout0.proc_nsfs which test
EBADFD for landlock_add_rule(2), create a new base_test's
restrict_self_fd which tests EBADFD for landlock_restrict_self(2).
Mickaël Salaün [Thu, 20 Mar 2025 19:07:08 +0000 (20:07 +0100)]
samples/landlock: Enable users to log sandbox denials
By default, denials from within the sandbox are not logged. Indeed, the
sandboxer's security policy might not be fitted to the set of sandboxed
processes that could be spawned (e.g. from a shell).
For testing purposes, parse the LL_FORCE_LOG environment variable to log
every sandbox denial, including after launching the initial sandboxed
program, thanks to LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON.
Add LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF for the case of sandboxer
tools, init systems, or runtime containers launching programs sandboxing
themselves in an inconsistent way. Setting this flag should only
depend on runtime configuration (i.e. not hardcoded).
We don't create a new ruleset's option because this should not be part
of the security policy: only the task that enforces the policy (not the
one that creates it) knows if itself or its children may request denied
actions.
This is the first and only flag that can be set without actually
restricting the caller (i.e. without providing a ruleset).
Extend struct landlock_cred_security with a u8 log_subdomains_off.
struct landlock_file_security is still 16 bytes.
Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Closes: https://github.com/landlock-lsm/linux/issues/3 Link: https://lore.kernel.org/r/20250320190717.2287696-19-mic@digikod.net
[mic: Fix comment] Signed-off-by: Mickaël Salaün <mic@digikod.net>
Most of the time we want to log denied accesses because they should not
happen and such information helps diagnose issues. However, when
sandboxing processes that we know will try to access denied resources
(e.g. unknown, bogus, or malicious binary), we might want to not log
related access requests that might fill up logs.
By default, denied requests are logged until the task calls execve(2).
If the LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF flag is set, denied
requests will not be logged for the same executed file.
If the LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON flag is set, denied
requests from after an execve(2) call will be logged.
The rationale is that a program should know its own behavior, but not
necessarily the behavior of other programs.
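The default and the two flags above can be summarized with a small
decision function. This is only a sketch of the described behavior, not
the kernel implementation: the SKETCH_* constants and should_log_denial()
are illustrative names, and the authoritative flag values are the ones in
the Landlock UAPI header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits standing in for the LANDLOCK_RESTRICT_SELF_LOG_*
 * flags; the real values are defined by the kernel UAPI header. */
#define SKETCH_LOG_SAME_EXEC_OFF (1U << 0)
#define SKETCH_LOG_NEW_EXEC_ON   (1U << 1)

/*
 * Decide whether a denial should be logged, given the flags passed when the
 * domain was enforced and whether the task has called execve(2) since then.
 */
static bool should_log_denial(uint32_t flags, bool crossed_exec)
{
	if (crossed_exec)
		/* After execve(2): quiet by default, opt in with NEW_EXEC_ON. */
		return (flags & SKETCH_LOG_NEW_EXEC_ON) != 0;

	/* Same executed file: logged by default, opt out with SAME_EXEC_OFF. */
	return (flags & SKETCH_LOG_SAME_EXEC_OFF) == 0;
}
```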
Because LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF is set for a specific
Landlock domain, it makes it possible to selectively mask some access
requests that would be logged by a parent domain, which might be handy
for unprivileged processes to limit logs. However, system
administrators should still use the audit filtering mechanism. There is
intentionally no audit nor sysctl configuration to re-enable these logs.
This is delegated to the user space program.
Increment the Landlock ABI version to reflect this interface change.
Mickaël Salaün [Thu, 20 Mar 2025 19:07:03 +0000 (20:07 +0100)]
landlock: Log truncate and IOCTL denials
Add audit support to the file_truncate and file_ioctl hooks.
Add a deny_masks_t type and related helpers to store the domain's layer
level per optional access rights (i.e. LANDLOCK_ACCESS_FS_TRUNCATE and
LANDLOCK_ACCESS_FS_IOCTL_DEV) when opening a file, which cannot be
inferred later. In practice, the landlock_file_security aligned blob size is
still 16 bytes because this new one-byte deny_masks field follows the
existing two-byte allowed_access field and precedes the packed
fown_subject.
Implementing deny_masks_t with a bitfield instead of a struct enables a
generic implementation to store and extract layer levels.
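To illustrate why a bitfield allows generic store and extract helpers,
here is a sketch that packs one layer level per optional access right into
a single byte. It assumes each level fits in 4 bits; the type, enum, and
function names are hypothetical and do not mirror the kernel's helpers.

```c
#include <stdint.h>

/* Hypothetical one-byte container: one 4-bit layer level per slot. */
typedef uint8_t deny_masks_sketch_t;

/* Slots for the two optional access rights named in the commit message. */
enum optional_access {
	OPT_TRUNCATE = 0,
	OPT_IOCTL_DEV = 1,
};

#define BITS_PER_LEVEL 4U
#define LEVEL_MASK     0xFU

/* Store a layer level in the given slot, leaving the other slot untouched. */
static deny_masks_sketch_t set_level(deny_masks_sketch_t masks,
				     enum optional_access slot,
				     unsigned int level)
{
	unsigned int shift = (unsigned int)slot * BITS_PER_LEVEL;
	unsigned int cleared = masks & ~(LEVEL_MASK << shift);

	return (deny_masks_sketch_t)(cleared | ((level & LEVEL_MASK) << shift));
}

/* Extract the layer level previously stored in the given slot. */
static unsigned int get_level(deny_masks_sketch_t masks,
			      enum optional_access slot)
{
	return (masks >> ((unsigned int)slot * BITS_PER_LEVEL)) & LEVEL_MASK;
}
```

The same two helpers work for any slot, which is the property a
per-field struct would not give for free.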
Add KUnit tests to check the identification of a layer level from a
deny_masks_t, and the computation of a deny_masks_t from an access right
with its layer level or a layer_mask_t array.
Mickaël Salaün [Thu, 20 Mar 2025 19:07:02 +0000 (20:07 +0100)]
landlock: Factor out IOCTL hooks
Compat and non-compat IOCTL hooks are almost the same, except to compare
the IOCTL command. Factor out these two IOCTL hooks to highlight the
difference and minimize audit changes (see next commit).
We could pack blocker names (e.g. "fs:make_reg,refer") but that would
increase complexity for the kernel and log parsers. Moreover, this
could not handle blockers of different classes (e.g. fs and net). Make
it simple and flexible instead.
Add KUnit tests to check the identification from a layer_mask_t array of
the first layer level denying such request.
Mickaël Salaün [Thu, 20 Mar 2025 19:06:59 +0000 (20:06 +0100)]
landlock: Add AUDIT_LANDLOCK_DOMAIN and log domain status
Asynchronously log domain information when it first denies an access.
This minimizes the amount of generated logs, which makes it possible to
always log denials for the current execution since they should not
happen. These records are identified with the new AUDIT_LANDLOCK_DOMAIN
type.
The AUDIT_LANDLOCK_DOMAIN message contains:
- the "domain" ID which is described;
- the "status" which can either be "allocated" or "deallocated";
- the "mode" which is for now only "enforcing";
- for the "allocated" status, a minimal set of properties to easily
identify the task that loaded the domain's policy with
landlock_restrict_self(2): "pid", "uid", executable path ("exe"), and
command line ("comm");
- for the "deallocated" state, the number of "denials" accounted to this
domain, which is at least 1.
This requires each domain to save these task properties at creation
time in the new struct landlock_details. A reference to the PID is kept
for the lifetime of the domain to avoid race conditions when
investigating the related task. The executable path is resolved and
stored to not keep a reference to the filesystem and block related
actions. All these metadata are stored for the lifetime of the related
domain and should then be minimal. The required memory is not accounted
to the task calling landlock_restrict_self(2) contrary to most other
Landlock allocations (see related comment).
The AUDIT_LANDLOCK_DOMAIN record follows the first AUDIT_LANDLOCK_ACCESS
record for the same domain, which is always followed by AUDIT_SYSCALL
and AUDIT_PROCTITLE. This is in line with the audit logic to first
record the cause of an event, and then add context with other types of
record.
Log domain deletion with the "deallocated" state when a domain was
previously logged. This makes it possible for log parsers to free
potential resources when a domain ID will never show again.
The number of denied access requests is useful to easily check how many
access requests a domain blocked and potentially if some of them are
missing in logs because of audit rate limiting, audit rules, or Landlock
log configuration flags (see following commit).
Audit event sample for a deletion of a domain that denied something:
Cc: Günther Noack <gnoack@google.com> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250320190717.2287696-11-mic@digikod.net
[mic: Update comment and GFP flag for landlock_log_drop_domain()] Signed-off-by: Mickaël Salaün <mic@digikod.net>
Mickaël Salaün [Thu, 20 Mar 2025 19:06:58 +0000 (20:06 +0100)]
landlock: Add AUDIT_LANDLOCK_ACCESS and log ptrace denials
Add a new AUDIT_LANDLOCK_ACCESS record type dedicated to an access
request denied by a Landlock domain. AUDIT_LANDLOCK_ACCESS indicates
that something unexpected happened.
For now, only denied accesses are logged, which means that any
AUDIT_LANDLOCK_ACCESS record is always followed by a SYSCALL record with
"success=no". However, log parsers should check this syscall property
because this is the only sign that a request was denied. Indeed, we
could have "success=yes" if Landlock would support a "permissive" mode.
We could also add a new field to AUDIT_LANDLOCK_DOMAIN for this mode
(see following commit).
By default, the only logged access requests are those coming from the
same executed program that enforced the Landlock restriction on itself.
In other words, no audit records are created for a task after it called
execve(2). This is required to avoid log spam because programs may only
be aware of their own restrictions, but not the inherited ones.
Following commits will make it possible to conditionally generate
AUDIT_LANDLOCK_ACCESS records according to dedicated
landlock_restrict_self(2)'s flags.
The AUDIT_LANDLOCK_ACCESS message contains:
- the "domain" ID restricting the action on an object,
- the "blockers" that are missing to allow the requested access,
- a set of fields identifying the related object (e.g. task identified
with "opid" and "ocomm").
The blockers are implicit restrictions (e.g. ptrace), or explicit access
rights (e.g. filesystem), or explicit scopes (e.g. signal). This field
contains a list of at least one element, each separated with a comma.
The initial blocker is "ptrace", which describes all implicit Landlock
restrictions related to ptrace (e.g. deny tracing of tasks outside a
sandbox).
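On the log-parser side, a comma-separated "blockers" value can be checked
for a given element along these lines. This is a user-space sketch with a
hypothetical helper name; it assumes the value has already been extracted
from the record and contains no surrounding quotes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if the comma-separated "blockers" value contains "name" as a
 * whole element (not merely as a prefix of another element).
 */
static bool blockers_contains(const char *blockers, const char *name)
{
	size_t name_len = strlen(name);
	const char *cur = blockers;

	while (*cur) {
		/* Length of the current element, up to the next comma or end. */
		size_t len = strcspn(cur, ",");

		if (len == name_len && strncmp(cur, name, len) == 0)
			return true;

		cur += len;
		if (*cur == ',')
			cur++;
	}
	return false;
}
```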
Add audit support to ptrace_access_check and ptrace_traceme hooks. For
the ptrace_access_check case, we log the current/parent domain and the
child task. For the ptrace_traceme case, we log the parent domain and
the current/child task. Indeed, the requester and the target are the
current task, but the action would be performed by the parent task.
Mickaël Salaün [Thu, 20 Mar 2025 19:06:57 +0000 (20:06 +0100)]
landlock: Identify domain execution crossing
Extend struct landlock_cred_security with a domain_exec bitmask to
identify which Landlock domains were created by the current task's bprm.
The whole bitmask is reset on each execve(2) call.
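The idea can be sketched as one bit per domain layer, set when a domain is
created and cleared as a whole on execve(2). The struct and function names
below are hypothetical, and the 16-bit width is an assumption made for the
illustration; callers are expected to pass a layer index below 16.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the credential blob: one bit per domain layer. */
struct cred_sketch {
	uint16_t domain_exec;
};

/* Record that the domain at "layer" was created by the current execution. */
static void mark_domain_created(struct cred_sketch *cred, unsigned int layer)
{
	cred->domain_exec |= (uint16_t)(1U << layer);
}

/* On execve(2), no existing domain belongs to the new execution anymore. */
static void on_execve(struct cred_sketch *cred)
{
	cred->domain_exec = 0;
}

/* Was the domain at "layer" created since the last execve(2)? */
static bool created_by_current_exec(const struct cred_sketch *cred,
				    unsigned int layer)
{
	return ((cred->domain_exec >> layer) & 1U) != 0;
}
```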
Mickaël Salaün [Thu, 20 Mar 2025 19:06:56 +0000 (20:06 +0100)]
landlock: Prepare to use credential instead of domain for fowner
This cosmetic change is needed for audit support, specifically to be
able to filter according to cross-execution boundaries.
struct landlock_file_security's size stays the same for now but it will
increase with struct landlock_cred_security's size.
Only save Landlock domain in hook_file_set_fowner() if the current
domain has LANDLOCK_SCOPE_SIGNAL, which was previously done for each
hook_file_send_sigiotask() call. This should slightly improve
performance.
Replace hardcoded LANDLOCK_SCOPE_SIGNAL with the signal_scope.scope
variable.
Use scoped guards for RCU read-side critical sections.