www.infradead.org Git - users/jedix/linux-maple.git/log

]> www.infradead.org Git - users/jedix/linux-maple.git/log

projects / users / jedix / linux-maple.git / log

summary | shortlog | log | commit | commitdiff | tree
first ⋅ prev ⋅ next

commit | commitdiff | tree

Catalin Marinas [Tue, 12 Nov 2024 14:35:05 +0000 (14:35 +0000)]

kselftest/arm64: Fix missing printf() argument in gcs/gcs-stress.c

Compiling the child_cleanup() function results in:

gcs-stress.c: In function ‘child_cleanup’:
gcs-stress.c:266:75: warning: format ‘%d’ expects a matching ‘int’ argument [-Wformat=]
  266 |                                 ksft_print_msg("%s: Exited due to signal %d\n",
      |                                                                          ~^
      |                                                                           |
      |                                                                           int

Add the missing child->exit_signal argument.

Fixes: 05e6cfff58c4 ("kselftest/arm64: Add a GCS stress test")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Thu, 31 Oct 2024 19:21:38 +0000 (19:21 +0000)]

arm64/gcs: Fix outdated ptrace documentation

The ptrace documentation for GCS was written prior to the implementation of
clone3() when we still blocked enabling of GCS via ptrace. This restriction
was relaxed as part of implementing clone3() support since we implemented
support for the GCS not being managed by the kernel but the documentation
still mentions the restriction. Update the documentation to reflect what
was merged.

We have not yet merged clone3() itself but all the support other than in
clone() itself is there.

Fixes: 7058bf87cd59 ("arm64/gcs: Document the ABI for Guarded Control Stacks")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241031-arm64-gcs-doc-disable-v1-1-d7f6ded62046@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Fri, 11 Oct 2024 14:36:25 +0000 (15:36 +0100)]

kselftest/arm64: Ensure stable names for GCS stress test results

The GCS stress test program currently uses the PID of the threads it
creates in the test names it reports, resulting in unstable test names
between runs. Fix this by using a thread number instead.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241011-arm64-gcs-stress-stable-name-v1-1-4950f226218e@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Sat, 5 Oct 2024 00:17:18 +0000 (01:17 +0100)]

kselftest/arm64: Validate that GCS push and write permissions work

Add trivial assembly programs which give themselves the appropriate
permissions and then execute GCSPUSHM and GCSSTR, they will report errors
by generating signals on the non-permitted instructions. Not using libc
minimises the interaction with any policy set for the system but we skip on
failure to get the permissions in case the system is locked down to make
them inaccessible.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241005-arm64-gcs-test-flags-v1-1-03cb9786c5cd@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:18 +0000 (23:59 +0100)]

kselftest/arm64: Enable GCS for the FP stress tests

While it's a bit off topic for them the floating point stress tests do give
us some coverage of context thrashing cases, and also of active signal
delivery separate to the relatively complicated framework in the actual
signals tests. Have the tests enable GCS on startup, ignoring failures so
they continue to work as before on systems without GCS.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Tested-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-39-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:17 +0000 (23:59 +0100)]

kselftest/arm64: Add a GCS stress test

Add a stress test which runs one more process than we have CPUs spinning
through a very recursive function with frequent syscalls immediately prior
to return and signals being injected every 100ms. The goal is to flag up
any scheduling related issues, for example failure to ensure that barriers
are inserted when moving a GCS using task to another CPU. The test runs for
a configurable amount of time, defaulting to 10 seconds.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Tested-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-38-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:16 +0000 (23:59 +0100)]

kselftest/arm64: Add GCS signal tests

Do some testing of the signal handling for GCS, checking that a GCS
frame has the expected information in it and that the expected signals
are delivered with invalid operations.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Tested-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-37-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:15 +0000 (23:59 +0100)]

kselftest/arm64: Add test coverage for GCS mode locking

Verify that we can lock individual GCS mode bits, that other modes
aren't affected and as a side effect also that every combination of
modes can be enabled.

Normally the inability to reenable GCS after disabling it would be an
issue with testing but fortunately the kselftest_harness runs each test
within a fork()ed child. This can be inconvenient for some kinds of
testing but here it means that each test is in a separate thread and
therefore won't be affected by other tests in the suite.

Once we get toolchains with support for enabling GCS by default we will
need to take care to not do that in the build system but there are no
such toolchains yet so it is not yet an issue.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Tested-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-36-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:14 +0000 (23:59 +0100)]

kselftest/arm64: Add a GCS test program built with the system libc

There are things like threads which nolibc struggles with which we want
to add coverage for, and the ABI allows us to test most of these even if
libc itself does not understand GCS so add a test application built
using the system libc.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Tested-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-35-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:13 +0000 (23:59 +0100)]

kselftest/arm64: Add very basic GCS test program

This test program just covers the basic GCS ABI, covering aspects of the
ABI as standalone features without attempting to integrate things.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Tested-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-34-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:12 +0000 (23:59 +0100)]

kselftest/arm64: Always run signals tests with GCS enabled

Since it is not possible to return from the function that enabled GCS
without disabling GCS it is very inconvenient to use the signal handling
tests to cover GCS when GCS is not enabled by the toolchain and runtime,
something that no current distribution does. Since none of the testcases
do anything with stacks that would cause problems with GCS we can sidestep
this issue by unconditionally enabling GCS on startup and exiting with a
call to exit() rather than a return from main().

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-33-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:11 +0000 (23:59 +0100)]

kselftest/arm64: Allow signals tests to specify an expected si_code

Currently we ignore si_code unless the expected signal is a SIGSEGV, in
which case we enforce it being SEGV_ACCERR. Allow test cases to specify
exactly which si_code should be generated so we can validate this, and
test for other segfault codes.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-32-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:10 +0000 (23:59 +0100)]

kselftest/arm64: Add framework support for GCS to signal handling tests

Teach the framework about the GCS signal context, avoiding warnings on
the unknown context.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-31-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:09 +0000 (23:59 +0100)]

kselftest/arm64: Add GCS as a detected feature in the signal tests

In preparation for testing GCS related signal handling add it as a feature
we check for in the signal handling support code.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-30-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:08 +0000 (23:59 +0100)]

kselftest/arm64: Verify the GCS hwcap

Add coverage of the GCS hwcap to the hwcap selftest, using a read of
GCSPR_EL0 to generate SIGILL without having to worry about enabling GCS.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Tested-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-29-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:07 +0000 (23:59 +0100)]

arm64: Add Kconfig for Guarded Control Stack (GCS)

Provide a Kconfig option allowing the user to select if GCS support is
built into the kernel.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-28-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:06 +0000 (23:59 +0100)]

arm64/ptrace: Expose GCS via ptrace and core files

Provide a new register type NT_ARM_GCS reporting the current GCS mode
and pointer for EL0. Due to the interactions with allocation and
deallocation of Guarded Control Stacks we do not permit any changes to
the GCS mode via ptrace, only GCSPR_EL0 may be changed.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-27-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:05 +0000 (23:59 +0100)]

arm64/signal: Expose GCS state in signal frames

Add a context for the GCS state and include it in the signal context when
running on a system that supports GCS. We reuse the same flags that the
prctl() uses to specify which GCS features are enabled and also provide the
current GCS pointer.

We do not support enabling GCS via signal return, there is a conflict
between specifying GCSPR_EL0 and allocation of a new GCS and this is not
an ancticipated use case. We also enforce GCS configuration locking on
signal return.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Acked-by: Yury Khrustalev <yury.khrustalev@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-26-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:04 +0000 (23:59 +0100)]

arm64/signal: Set up and restore the GCS context for signal handlers

When invoking a signal handler we use the GCS configuration and stack
for the current thread.

Since we implement signal return by calling the signal handler with a
return address set up pointing to a trampoline in the vDSO we need to
also configure any active GCS for this by pushing a frame for the
trampoline onto the GCS.  If we do not do this then signal return will
generate a GCS protection fault.

In order to guard against attempts to bypass GCS protections via signal
return we only allow returning with GCSPR_EL0 pointing to an address
where it was previously preempted by a signal.  We do this by pushing a
cap onto the GCS, this takes the form of an architectural GCS cap token
with the top bit set and token type of 0 which we add on signal entry
and validate and pop off on signal return.  The combination of the top
bit being set and the token type mean that this can't be interpreted as
a valid token or address.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-25-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:03 +0000 (23:59 +0100)]

arm64/mm: Implement map_shadow_stack()

As discussed extensively in the changelog for the addition of this
syscall on x86 ("x86/shstk: Introduce map_shadow_stack syscall") the
existing mmap() and madvise() syscalls do not map entirely well onto the
security requirements for guarded control stacks since they lead to
windows where memory is allocated but not yet protected or stacks which
are not properly and safely initialised. Instead a new syscall
map_shadow_stack() has been defined which allocates and initialises a
shadow stack page.

Implement this for arm64. Two flags are provided, allowing applications
to request that the stack be initialised with a valid cap token at the
top of the stack and optionally also an end of stack marker above that.
We support requesting an end of stack marker alone but since this is a
NULL pointer it is indistinguishable from not initialising anything by
itself.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Yury Khrustalev <yury.khrustalev@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-24-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:02 +0000 (23:59 +0100)]

arm64/gcs: Implement shadow stack prctl() interface

Implement the architecture neutral prctl() interface for setting the
shadow stack status, this supports setting and reading the current GCS
configuration for the current thread.

Userspace can enable basic GCS functionality and additionally also
support for GCS pushes and arbitrary GCS stores.  It is expected that
this prctl() will be called very early in application startup, for
example by the dynamic linker, and not subsequently adjusted during
normal operation.  Users should carefully note that after enabling GCS
for a thread GCS will become active with no call stack so it is not
normally possible to return from the function that invoked the prctl().

State is stored per thread, enabling GCS for a thread causes a GCS to be
allocated for that thread.

Userspace may lock the current GCS configuration by specifying
PR_SHADOW_STACK_ENABLE_LOCK, this prevents any further changes to the
GCS configuration via any means.

If GCS is not being enabled then all flags other than _LOCK are ignored,
it is not possible to enable stores or pops without enabling GCS.

When disabling the GCS we do not free the allocated stack, this allows
for inspection of the GCS after disabling as part of fault reporting.
Since it is not an expected use case and since it presents some
complications in determining what to do with previously initialsed data
on the GCS attempts to reenable GCS after this are rejected.  This can
be revisted if a use case arises.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-23-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:01 +0000 (23:59 +0100)]

arm64/gcs: Ensure that new threads have a GCS

When a new thread is created by a thread with GCS enabled the GCS needs
to be specified along with the regular stack.

Unfortunately plain clone() is not extensible and existing clone3()
users will not specify a stack so all existing code would be broken if
we mandated specifying the stack explicitly. For compatibility with
these cases and also x86 (which did not initially implement clone3()
support for shadow stacks) if no GCS is specified we will allocate one
so when a thread is created which has GCS enabled allocate one for it.
We follow the extensively discussed x86 implementation and allocate
min(RLIMIT_STACK/2, 2G). Since the GCS only stores the call stack and not
any variables this should be more than sufficient for most applications.

GCSs allocated via this mechanism will be freed when the thread exits.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Acked-by: Yury Khrustalev <yury.khrustalev@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-22-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:59:00 +0000 (23:59 +0100)]

arm64/gcs: Context switch GCS state for EL0

There are two registers controlling the GCS state of EL0, GCSPR_EL0 which
is the current GCS pointer and GCSCRE0_EL1 which has enable bits for the
specific GCS functionality enabled for EL0. Manage these on context switch
and process lifetime events, GCS is reset on exec(). Also ensure that
any changes to the GCS memory are visible to other PEs and that changes
from other PEs are visible on this one by issuing a GCSB DSYNC when
moving to or from a thread with GCS.

Since the current GCS configuration of a thread will be visible to
userspace we store the configuration in the format used with userspace
and provide a helper which configures the system register as needed.

On systems that support GCS we always allow access to GCSPR_EL0, this
facilitates reporting of GCS faults if userspace implements disabling of
GCS on error - the GCS can still be discovered and examined even if GCS
has been disabled.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-21-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:59 +0000 (23:58 +0100)]

arm64/mm: Handle GCS data aborts

All GCS operations at EL0 must happen on a page which is marked as
having UnprivGCS access, including read operations.  If a GCS operation
attempts to access a page without this then it will generate a data
abort with the GCS bit set in ESR_EL1.ISS2.

EL0 may validly generate such faults, for example due to copy on write
which will cause the GCS data to be stored in a read only page with no
GCS permissions until the actual copy happens.  Since UnprivGCS allows
both reads and writes to the GCS (though only through GCS operations) we
need to ensure that the memory management subsystem handles GCS accesses
as writes at all times.  Do this by adding FAULT_FLAG_WRITE to any GCS
page faults, adding handling to ensure that invalid cases are identfied
as such early so the memory management core does not think they will
succeed.  The core cannot distinguish between VMAs which are generally
writeable and VMAs which are only writeable through GCS operations.

EL1 may validly write to EL0 GCS for management purposes (eg, while
initialising with cap tokens).

We also report any GCS faults in VMAs not marked as part of a GCS as
access violations, causing a fault to be delivered to userspace if it
attempts to do GCS operations outside a GCS.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-20-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:58 +0000 (23:58 +0100)]

arm64/traps: Handle GCS exceptions

A new exception code is defined for GCS specific faults other than
standard load/store faults, for example GCS token validation failures,
add handling for this. These faults are reported to userspace as
segfaults with code SEGV_CPERR (protection error), mirroring the
reporting for x86 shadow stack errors.

GCS faults due to memory load/store operations generate data aborts with
a flag set, these will be handled separately as part of the data abort
handling.

Since we do not currently enable GCS for EL1 we should not get any faults
there but while we're at it we wire things up there, treating any GCS
fault as fatal.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-19-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:57 +0000 (23:58 +0100)]

arm64/hwcap: Add hwcap for GCS

Provide a hwcap to enable userspace to detect support for GCS.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Yury Khrustalev <yury.khrustalev@arm.com>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-18-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:56 +0000 (23:58 +0100)]

arm64/idreg: Add overrride for GCS

Hook up an override for GCS, allowing it to be disabled from the command
line by specifying arm64.nogcs in case there are problems.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-17-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:54 +0000 (23:58 +0100)]

arm64/mm: Map pages for guarded control stack

Map pages flagged as being part of a GCS as such rather than using the
full set of generic VM flags.

This is done using a conditional rather than extending the size of
protection_map since that would make for a very sparse array.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-15-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:53 +0000 (23:58 +0100)]

mm: Define VM_SHADOW_STACK for arm64 when we support GCS

Use VM_HIGH_ARCH_5 for guarded control stack pages.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-14-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:52 +0000 (23:58 +0100)]

arm64/mm: Allocate PIE slots for EL0 guarded control stack

Pages used for guarded control stacks need to be described to the hardware
using the Permission Indirection Extension, GCS is not supported without
PIE. In order to support copy on write for guarded stacks we allocate two
values, one for active GCSs and one for GCS pages marked as read only prior
to copy.

Since the actual effect is defined using PIE the specific bit pattern used
does not matter to the hardware but we choose two values which differ only
in PTE_WRITE in order to help share code with non-PIE cases.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-13-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:51 +0000 (23:58 +0100)]

arm64/cpufeature: Runtime detection of Guarded Control Stack (GCS)

Add a cpufeature for GCS, allowing other code to conditionally support it
at runtime.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-12-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:50 +0000 (23:58 +0100)]

arm64/gcs: Provide basic EL2 setup to allow GCS usage at EL0 and EL1

There is a control HCRX_EL2.GCSEn which must be set to allow GCS
features to take effect at lower ELs and also fine grained traps for GCS
usage at EL0 and EL1. Configure all these to allow GCS usage by EL0 and
EL1.

We also initialise GCSCR_EL1 and GCSCRE0_EL1 to ensure that we can
execute function call instructions without faulting regardless of the
state when the kernel is started.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-11-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:49 +0000 (23:58 +0100)]

arm64/gcs: Provide put_user_gcs()

In order for EL1 to write to an EL0 GCS it must use the GCSSTTR instruction
rather than a normal STTR. Provide a put_user_gcs() which does this.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-10-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:48 +0000 (23:58 +0100)]

arm64/gcs: Add manual encodings of GCS instructions

Define C callable functions for GCS instructions used by the kernel. In
order to avoid ambitious toolchain requirements for GCS support these are
manually encoded, this means we have fixed register numbers which will be
a bit limiting for the compiler but none of these should be used in
sufficiently fast paths for this to be a problem.

Note that GCSSTTR is used to store to EL0.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-9-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:47 +0000 (23:58 +0100)]

arm64/sysreg: Add definitions for architected GCS caps

The architecture defines a format for guarded control stack caps, used
to mark the top of an unused GCS in order to limit the potential for
exploitation via stack switching. Add definitions associated with these.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-8-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:46 +0000 (23:58 +0100)]

arm64/gcs: Document the ABI for Guarded Control Stacks

Add some documentation of the userspace ABI for Guarded Control Stacks.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Yury Khrustalev <yury.khrustalev@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-7-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:45 +0000 (23:58 +0100)]

arm64: Document boot requirements for Guarded Control Stacks

FEAT_GCS introduces a number of new system registers, we require that
access to these registers is not trapped when we identify that the feature
is present. There is also a HCRX_EL2 control to make GCS operations
functional.

Since if GCS is enabled any function call instruction will cause a fault
we also require that the feature be specifically disabled, existing
kernels implicitly have this requirement and especially given that the
MMU must be disabled it is difficult to see a situation where leaving
GCS enabled would be reasonable.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-6-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:44 +0000 (23:58 +0100)]

mman: Add map_shadow_stack() flags

In preparation for adding arm64 GCS support make the map_shadow_stack()
SHADOW_STACK_SET_TOKEN flag generic and add _SET_MARKER. The existing
flag indicates that a token usable for stack switch should be added to
the top of the newly mapped GCS region while the new flag indicates that
a top of stack marker suitable for use by unwinders should be added
above that.

For arm64 the top of stack marker is all bits 0.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Yury Khrustalev <yury.khrustalev@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-5-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:43 +0000 (23:58 +0100)]

prctl: arch-agnostic prctl for shadow stack

Three architectures (x86, aarch64, riscv) have announced support for
shadow stacks with fairly similar functionality.  While x86 is using
arch_prctl() to control the functionality neither arm64 nor riscv uses
that interface so this patch adds arch-agnostic prctl() support to
get and set status of shadow stacks and lock the current configuation to
prevent further changes, with support for turning on and off individual
subfeatures so applications can limit their exposure to features that
they do not need.  The features are:

  - PR_SHADOW_STACK_ENABLE: Tracking and enforcement of shadow stacks,
    including allocation of a shadow stack if one is not already
    allocated.
  - PR_SHADOW_STACK_WRITE: Writes to specific addresses in the shadow
    stack.
  - PR_SHADOW_STACK_PUSH: Push additional values onto the shadow stack.

These features are expected to be inherited by new threads and cleared
on exec(), unknown features should be rejected for enable but accepted
for locking (in order to allow for future proofing).

This is based on a patch originally written by Deepak Gupta but modified
fairly heavily, support for indirect landing pads is removed, additional
modes added and the locking interface reworked.  The set status prctl()
is also reworked to just set flags, if setting/reading the shadow stack
pointer is required this could be a separate prctl.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Yury Khrustalev <yury.khrustalev@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-4-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:42 +0000 (23:58 +0100)]

arm64/mm: Restructure arch_validate_flags() for extensibility

Currently arch_validate_flags() is written in a very non-extensible
fashion, returning immediately if MTE is not supported and writing the MTE
check as a direct return. Since we will want to add more checks for GCS
refactor the existing code to be more extensible, no functional change
intended.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-3-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:41 +0000 (23:58 +0100)]

mm: Define VM_HIGH_ARCH_6

The addition of protection keys means that on arm64 we now use all of the
currently defined VM_HIGH_ARCH_x bits. In order to allow us to allocate a
new flag for GCS pages define VM_HIGH_ARCH_6.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-2-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mark Brown [Tue, 1 Oct 2024 22:58:40 +0000 (23:58 +0100)]

mm: Introduce ARCH_HAS_USER_SHADOW_STACK

Since multiple architectures have support for shadow stacks and we need to
select support for this feature in several places in the generic code
provide a generic config option that the architectures can select.

Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Tested-by: Kees Cook <kees@kernel.org>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-1-222b78d87eee@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2024 22:06:19 +0000 (15:06 -0700)]

Linux 6.12-rc1

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2024 21:47:33 +0000 (14:47 -0700)]

x86: kvm: fix build error

The cpu_emergency_register_virt_callback() function is used
unconditionally by the x86 kvm code, but it is declared (and defined)
conditionally:

  #if IS_ENABLED(CONFIG_KVM_INTEL) || IS_ENABLED(CONFIG_KVM_AMD)
  void cpu_emergency_register_virt_callback(cpu_emergency_virt_cb *callback);
  ...

leading to a build error when neither KVM_INTEL nor KVM_AMD support is
enabled:

  arch/x86/kvm/x86.c: In function ‘kvm_arch_enable_virtualization’:
  arch/x86/kvm/x86.c:12517:9: error: implicit declaration of function ‘cpu_emergency_register_virt_callback’ [-Wimplicit-function-declaration]
  12517 |         cpu_emergency_register_virt_callback(kvm_x86_ops.emergency_disable_virtualization_cpu);
        |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  arch/x86/kvm/x86.c: In function ‘kvm_arch_disable_virtualization’:
  arch/x86/kvm/x86.c:12522:9: error: implicit declaration of function ‘cpu_emergency_unregister_virt_callback’ [-Wimplicit-function-declaration]
  12522 |         cpu_emergency_unregister_virt_callback(kvm_x86_ops.emergency_disable_virtualization_cpu);
        |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fix the build by defining empty helper functions the same way the old
cpu_emergency_disable_virtualization() function was dealt with for the
same situation.

Maybe we could instead have made the call sites conditional, since the
callers (kvm_arch_{en,dis}able_virtualization()) have an empty weak
fallback.  I'll leave that to the kvm people to argue about, this at
least gets the build going for that particular config.

Fixes: 590b09b1d88e ("KVM: x86: Register "emergency disable" callbacks when virt is enabled")
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Kai Huang <kai.huang@intel.com>
Cc: Chao Gao <chao.gao@intel.com>
Cc: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2024 16:53:04 +0000 (09:53 -0700)]

Merge tag 'mailbox-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox

Pull mailbox updates from Jassi Brar:

- fix kconfig dependencies (mhu-v3, omap2+)

- use devie name instead of genereic imx_mu_chan as interrupt name
   (imx)

- enable sa8255p and qcs8300 ipc controllers (qcom)

- Fix timeout during suspend mode (bcm2835)

- convert to use use of_property_match_string (mailbox)

- enable mt8188 (mediatek)

- use devm_clk_get_enabled helpers (spreadtrum)

- fix device-id typo (rockchip)

* tag 'mailbox-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox:
  mailbox, remoteproc: omap2+: fix compile testing
  dt-bindings: mailbox: qcom-ipcc: Document QCS8300 IPCC
  dt-bindings: mailbox: qcom-ipcc: document the support for SA8255p
  dt-bindings: mailbox: mtk,adsp-mbox: Add compatible for MT8188
  mailbox: Use of_property_match_string() instead of open-coding
  mailbox: bcm2835: Fix timeout during suspend mode
  mailbox: sprd: Use devm_clk_get_enabled() helpers
  mailbox: rockchip: fix a typo in module autoloading
  mailbox: imx: use device name in interrupt name
  mailbox: ARM_MHU_V3 should depend on ARM64

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2024 16:47:33 +0000 (09:47 -0700)]

Merge tag 'i2c-for-6.12-rc1-additional_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:

- fix DesignWare driver ENABLE-ABORT sequence, ensuring ABORT can
   always be sent when needed

- check for PCLK in the SynQuacer controller as an optional clock,
   allowing ACPI to directly provide the clock rate

- KEBA driver Kconfig dependency fix

- fix XIIC driver power suspend sequence

* tag 'i2c-for-6.12-rc1-additional_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: xiic: Fix pm_runtime_set_suspended() with runtime pm enabled
  i2c: keba: I2C_KEBA should depend on KEBA_CP500
  i2c: synquacer: Deal with optional PCLK correctly
  i2c: designware: fix controller is holding SCL low while ENABLE bit is disabled

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2024 16:35:10 +0000 (09:35 -0700)]

Merge tag 'dma-mapping-6.12-2024-09-29' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fix from Christoph Hellwig:

- handle chained SGLs in the new tracing code (Christoph Hellwig)

* tag 'dma-mapping-6.12-2024-09-29' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: fix DMA API tracing for chained scatterlists

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2024 16:22:34 +0000 (09:22 -0700)]

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull more SCSI updates from James Bottomley:
"These are mostly minor updates.

  There are two drivers (lpfc and mpi3mr) which missed the initial
  pull and a core change to retry a start/stop unit which affect
  suspend/resume"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (32 commits)
  scsi: lpfc: Update lpfc version to 14.4.0.5
  scsi: lpfc: Support loopback tests with VMID enabled
  scsi: lpfc: Revise TRACE_EVENT log flag severities from KERN_ERR to KERN_WARNING
  scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance
  scsi: lpfc: Fix kref imbalance on fabric ndlps from dev_loss_tmo handler
  scsi: lpfc: Restrict support for 32 byte CDBs to specific HBAs
  scsi: lpfc: Update phba link state conditional before sending CMF_SYNC_WQE
  scsi: lpfc: Add ELS_RSP cmd to the list of WQEs to flush in lpfc_els_flush_cmd()
  scsi: mpi3mr: Update driver version to 8.12.0.0.50
  scsi: mpi3mr: Improve wait logic while controller transitions to READY state
  scsi: mpi3mr: Update MPI Headers to revision 34
  scsi: mpi3mr: Use firmware-provided timestamp update interval
  scsi: mpi3mr: Enhance the Enable Controller retry logic
  scsi: sd: Fix off-by-one error in sd_read_block_characteristics()
  scsi: pm8001: Do not overwrite PCI queue mapping
  scsi: scsi_debug: Remove a useless memset()
  scsi: pmcraid: Convert comma to semicolon
  scsi: sd: Retry START STOP UNIT commands
  scsi: mpi3mr: A performance fix
  scsi: ufs: qcom: Update MODE_MAX cfg_bw value
  ...

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2024 16:17:44 +0000 (09:17 -0700)]

Merge tag 'bcachefs-2024-09-28' of git://evilpiepirate.org/bcachefs

Pull more bcachefs updates from Kent Overstreet:
"Assorted minor syzbot fixes, and for bigger stuff:

  Fix two disk accounting rewrite bugs:

   - Disk accounting keys use the version field of bkey so that journal
     replay can tell which updates have been applied to the btree.

     This is set in the transaction commit path, after we've gotten our
     journal reservation (and our time ordering), but the
     BCH_TRANS_COMMIT_skip_accounting_apply flag that journal replay
     uses was incorrectly skipping this for new updates generated prior
     to journal replay.

     This fixes the underlying cause of an assertion pop in
     disk_accounting_read.

   - A couple of fixes for disk accounting + device removal.

     Checking if acocunting replicas entries were marked in the
     superblock was being done at the wrong point, when deltas in the
     journal could still zero them out, and then additionally we'd try
     to add a missing replicas entry to the superblock without checking
     if it referred to an invalid (removed) device.

  A whole slew of repair fixes:

   - fix infinite loop in propagate_key_to_snapshot_leaves(), this fixes
     an infinite loop when repairing a filesystem with many snapshots

   - fix incorrect transaction restart handling leading to occasional
     "fsck counted ..." warnings

   - fix warning in __bch2_fsck_err() for bkey fsck errors

   - check_inode() in fsck now correctly checks if the filesystem was
     clean

   - there shouldn't be pending logged ops if the fs was clean, we now
     check for this

   - remove_backpointer() doesn't remove a dirent that doesn't actually
     point to the inode

   - many more fsck errors are AUTOFIX"

* tag 'bcachefs-2024-09-28' of git://evilpiepirate.org/bcachefs: (35 commits)
  bcachefs: check_subvol_path() now prints subvol root inode
  bcachefs: remove_backpointer() now checks if dirent points to inode
  bcachefs: dirent_points_to_inode() now warns on mismatch
  bcachefs: Fix lost wake up
  bcachefs: Check for logged ops when clean
  bcachefs: BCH_FS_clean_recovery
  bcachefs: Convert disk accounting BUG_ON() to WARN_ON()
  bcachefs: Fix BCH_TRANS_COMMIT_skip_accounting_apply
  bcachefs: Check for accounting keys with bversion=0
  bcachefs: rename version -> bversion
  bcachefs: Don't delete unlinked inodes before logged op resume
  bcachefs: Fix BCH_SB_ERRS() so we can reorder
  bcachefs: Fix fsck warnings from bkey validation
  bcachefs: Move transaction commit path validation to as late as possible
  bcachefs: Fix disk accounting attempting to mark invalid replicas entry
  bcachefs: Fix unlocked access to c->disk_sb.sb in bch2_replicas_entry_validate()
  bcachefs: Fix accounting read + device removal
  bcachefs: bch_accounting_mode
  bcachefs: fix transaction restart handling in check_extents(), check_dirents()
  bcachefs: kill inode_walker_entry.seen_this_pos
  ...

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2024 16:10:00 +0000 (09:10 -0700)]

Merge tag 'x86-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
"Fix TDX MMIO #VE fault handling, and add two new Intel model numbers
  for 'Pantherlake' and 'Diamond Rapids'"

* tag 'x86-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Add two Intel CPU model numbers
  x86/tdx: Fix "in-kernel MMIO" check

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2024 15:51:30 +0000 (08:51 -0700)]

Merge tag 'locking-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:
"lockdep:
    - Fix potential deadlock between lockdep and RCU (Zhiguo Niu)
    - Use str_plural() to address Coccinelle warning (Thorsten Blum)
    - Add debuggability enhancement (Luis Claudio R. Goncalves)

  static keys & calls:
    - Fix static_key_slow_dec() yet again (Peter Zijlstra)
    - Handle module init failure correctly in static_call_del_module()
      (Thomas Gleixner)
    - Replace pointless WARN_ON() in static_call_module_notify() (Thomas
      Gleixner)

  <linux/cleanup.h>:
    - Add usage and style documentation (Dan Williams)

  rwsems:
    - Move is_rwsem_reader_owned() and rwsem_owner() under
      CONFIG_DEBUG_RWSEMS (Waiman Long)

  atomic ops, x86:
    - Redeclare x86_32 arch_atomic64_{add,sub}() as void (Uros Bizjak)
    - Introduce the read64_nonatomic macro to x86_32 with cx8 (Uros
      Bizjak)"

Signed-off-by: Ingo Molnar <mingo@kernel.org>
* tag 'locking-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rwsem: Move is_rwsem_reader_owned() and rwsem_owner() under CONFIG_DEBUG_RWSEMS
  jump_label: Fix static_key_slow_dec() yet again
  static_call: Replace pointless WARN_ON() in static_call_module_notify()
  static_call: Handle module init failure correctly in static_call_del_module()
  locking/lockdep: Simplify character output in seq_line()
  lockdep: fix deadlock issue between lockdep and rcu
  lockdep: Use str_plural() to fix Coccinelle warning
  cleanup: Add usage and style documentation
  lockdep: suggest the fix for "lockdep bfs error:-1" on print_bfs_bug
  locking/atomic/x86: Redeclare x86_32 arch_atomic64_{add,sub}() as void
  locking/atomic/x86: Introduce the read64_nonatomic macro to x86_32 with cx8

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2024 15:44:28 +0000 (08:44 -0700)]

Merge tag 'cocci-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux

Pull coccinelle updates from Julia Lawall:
"Extend string_choices.cocci to use more available helpers

  Ten patches from Hongbo Li extending string_choices.cocci with the
  complete set of functions offered by include/linux/string_choices.h.

  One patch from myself reducing the number of redundant cases that are
  checked by Coccinelle, giving a small performance improvement"

* tag 'cocci-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
  Reduce Coccinelle choices in string_choices.cocci
  coccinelle: Remove unnecessary parentheses for only one possible change.
  coccinelle: Add rules to find str_yes_no() replacements
  coccinelle: Add rules to find str_on_off() replacements
  coccinelle: Add rules to find str_write_read() replacements
  coccinelle: Add rules to find str_read_write() replacements
  coccinelle: Add rules to find str_enable{d}_disable{d}() replacements
  coccinelle: Add rules to find str_lo{w}_hi{gh}() replacements
  coccinelle: Add rules to find str_hi{gh}_lo{w}() replacements
  coccinelle: Add rules to find str_false_true() replacements
  coccinelle: Add rules to find str_true_false() replacements

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2024 15:37:03 +0000 (08:37 -0700)]

Merge tag 'linux_kselftest-next-6.12-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fix from Shuah Khan:
"One urgent fix to vDSO as automated testing is failing due to this
bug"

* tag 'linux_kselftest-next-6.12-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: vDSO: align stack for O2-optimized memcpy

commit | commitdiff | tree

Ingo Molnar [Sun, 29 Sep 2024 06:57:18 +0000 (08:57 +0200)]

Merge branch 'locking/core' into locking/urgent, to pick up pending commits

Merge all pending locking commits into a single branch.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

commit | commitdiff | tree

Julia Lawall [Sat, 28 Sep 2024 19:26:22 +0000 (21:26 +0200)]

Reduce Coccinelle choices in string_choices.cocci

The isomorphism neg_if_exp negates the test of a ?: conditional,
making it unnecessary to have an explicit case for a negated test
with the branches inverted.

At the same time, we can disable neg_if_exp in cases where a
different API function may be more suitable for a negated test.

Finally, in the non-patch cases, E matches an expression with
parentheses around it, so there is no need to mention ()
explicitly in the pattern. The () are still needed in the patch
cases, because we want to drop them, if they are present.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>

commit | commitdiff | tree

Hongbo Li [Wed, 11 Sep 2024 01:09:27 +0000 (09:09 +0800)]

coccinelle: Remove unnecessary parentheses for only one possible change.

The parentheses are only needed if there is a disjunction, ie a
set of possible changes. If there is only one pattern, we can
remove these parentheses. Just like the format:

  -  x
  +  y

not:

  (
  -  x
  +  y
  )

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>

commit | commitdiff | tree

Hongbo Li [Wed, 11 Sep 2024 01:09:26 +0000 (09:09 +0800)]

coccinelle: Add rules to find str_yes_no() replacements

As other rules done, we add rules for str_yes_no()
to check the relative opportunities.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>

commit | commitdiff | tree

Hongbo Li [Wed, 11 Sep 2024 01:09:25 +0000 (09:09 +0800)]

coccinelle: Add rules to find str_on_off() replacements

As other rules done, we add rules for str_on_off()
to check the relative opportunities.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>

commit | commitdiff | tree

Hongbo Li [Wed, 11 Sep 2024 01:09:24 +0000 (09:09 +0800)]

coccinelle: Add rules to find str_write_read() replacements

As other rules done, we add rules for str_write_read()
to check the relative opportunities.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>

commit | commitdiff | tree

Hongbo Li [Wed, 11 Sep 2024 01:09:23 +0000 (09:09 +0800)]

coccinelle: Add rules to find str_read_write() replacements

As other rules done, we add rules for str_read_write()
to check the relative opportunities.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>

commit | commitdiff | tree

Hongbo Li [Wed, 11 Sep 2024 01:09:22 +0000 (09:09 +0800)]

coccinelle: Add rules to find str_enable{d}_disable{d}() replacements

As other rules done, we add rules for str_enable{d}_
disable{d}() to check the relative opportunities.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>

commit | commitdiff | tree

Hongbo Li [Wed, 11 Sep 2024 01:09:21 +0000 (09:09 +0800)]

coccinelle: Add rules to find str_lo{w}_hi{gh}() replacements

As other rules done, we add rules for str_lo{w}_hi{gh}()
to check the relative opportunities.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>

commit | commitdiff | tree

Hongbo Li [Wed, 11 Sep 2024 01:09:20 +0000 (09:09 +0800)]

coccinelle: Add rules to find str_hi{gh}_lo{w}() replacements

As other rules done, we add rules for str_hi{gh}_lo{w}()
to check the relative opportunities.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>

commit | commitdiff | tree

Hongbo Li [Wed, 11 Sep 2024 01:09:19 +0000 (09:09 +0800)]

coccinelle: Add rules to find str_false_true() replacements

As done with str_true_false(), add checks for str_false_true()
opportunities. A simple test can find over 9 cases currently
exist in the tree.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>

commit | commitdiff | tree

Hongbo Li [Wed, 11 Sep 2024 01:09:18 +0000 (09:09 +0800)]

coccinelle: Add rules to find str_true_false() replacements

After str_true_false() has been introduced in the tree,
we can add rules for finding places where str_true_false()
can be used. A simple test can find over 10 locations.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>

commit | commitdiff | tree

Linus Torvalds [Sat, 28 Sep 2024 16:20:14 +0000 (09:20 -0700)]

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull x86 kvm updates from Paolo Bonzini:
"x86:

   - KVM currently invalidates the entirety of the page tables, not just
     those for the memslot being touched, when a memslot is moved or
     deleted.

     This does not traditionally have particularly noticeable overhead,
     but Intel's TDX will require the guest to re-accept private pages
     if they are dropped from the secure EPT, which is a non starter.

     Actually, the only reason why this is not already being done is a
     bug which was never fully investigated and caused VM instability
     with assigned GeForce GPUs, so allow userspace to opt into the new
     behavior.

   - Advertise AVX10.1 to userspace (effectively prep work for the
     "real" AVX10 functionality that is on the horizon)

   - Rework common MSR handling code to suppress errors on userspace
     accesses to unsupported-but-advertised MSRs

     This will allow removing (almost?) all of KVM's exemptions for
     userspace access to MSRs that shouldn't exist based on the vCPU
     model (the actual cleanup is non-trivial future work)

   - Rework KVM's handling of x2APIC ICR, again, because AMD (x2AVIC)
     splits the 64-bit value into the legacy ICR and ICR2 storage,
     whereas Intel (APICv) stores the entire 64-bit value at the ICR
     offset

   - Fix a bug where KVM would fail to exit to userspace if one was
     triggered by a fastpath exit handler

   - Add fastpath handling of HLT VM-Exit to expedite re-entering the
     guest when there's already a pending wake event at the time of the
     exit

   - Fix a WARN caused by RSM entering a nested guest from SMM with
     invalid guest state, by forcing the vCPU out of guest mode prior to
     signalling SHUTDOWN (the SHUTDOWN hits the VM altogether, not the
     nested guest)

   - Overhaul the "unprotect and retry" logic to more precisely identify
     cases where retrying is actually helpful, and to harden all retry
     paths against putting the guest into an infinite retry loop

   - Add support for yielding, e.g. to honor NEED_RESCHED, when zapping
     rmaps in the shadow MMU

   - Refactor pieces of the shadow MMU related to aging SPTEs in
     prepartion for adding multi generation LRU support in KVM

   - Don't stuff the RSB after VM-Exit when RETPOLINE=y and AutoIBRS is
     enabled, i.e. when the CPU has already flushed the RSB

   - Trace the per-CPU host save area as a VMCB pointer to improve
     readability and cleanup the retrieval of the SEV-ES host save area

   - Remove unnecessary accounting of temporary nested VMCB related
     allocations

   - Set FINAL/PAGE in the page fault error code for EPT violations if
     and only if the GVA is valid. If the GVA is NOT valid, there is no
     guest-side page table walk and so stuffing paging related metadata
     is nonsensical

   - Fix a bug where KVM would incorrectly synthesize a nested VM-Exit
     instead of emulating posted interrupt delivery to L2

   - Add a lockdep assertion to detect unsafe accesses of vmcs12
     structures

   - Harden eVMCS loading against an impossible NULL pointer deref
     (really truly should be impossible)

   - Minor SGX fix and a cleanup

   - Misc cleanups

  Generic:

   - Register KVM's cpuhp and syscore callbacks when enabling
     virtualization in hardware, as the sole purpose of said callbacks
     is to disable and re-enable virtualization as needed

   - Enable virtualization when KVM is loaded, not right before the
     first VM is created

     Together with the previous change, this simplifies a lot the logic
     of the callbacks, because their very existence implies
     virtualization is enabled

   - Fix a bug that results in KVM prematurely exiting to userspace for
     coalesced MMIO/PIO in many cases, clean up the related code, and
     add a testcase

   - Fix a bug in kvm_clear_guest() where it would trigger a buffer
     overflow _if_ the gpa+len crosses a page boundary, which thankfully
     is guaranteed to not happen in the current code base. Add WARNs in
     more helpers that read/write guest memory to detect similar bugs

  Selftests:

   - Fix a goof that caused some Hyper-V tests to be skipped when run on
     bare metal, i.e. NOT in a VM

   - Add a regression test for KVM's handling of SHUTDOWN for an SEV-ES
     guest

   - Explicitly include one-off assets in .gitignore. Past Sean was
     completely wrong about not being able to detect missing .gitignore
     entries

   - Verify userspace single-stepping works when KVM happens to handle a
     VM-Exit in its fastpath

   - Misc cleanups"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits)
  Documentation: KVM: fix warning in "make htmldocs"
  s390: Enable KVM_S390_UCONTROL config in debug_defconfig
  selftests: kvm: s390: Add VM run test case
  KVM: SVM: let alternatives handle the cases when RSB filling is required
  KVM: VMX: Set PFERR_GUEST_{FINAL,PAGE}_MASK if and only if the GVA is valid
  KVM: x86/mmu: Use KVM_PAGES_PER_HPAGE() instead of an open coded equivalent
  KVM: x86/mmu: Add KVM_RMAP_MANY to replace open coded '1' and '1ul' literals
  KVM: x86/mmu: Fold mmu_spte_age() into kvm_rmap_age_gfn_range()
  KVM: x86/mmu: Morph kvm_handle_gfn_range() into an aging specific helper
  KVM: x86/mmu: Honor NEED_RESCHED when zapping rmaps and blocking is allowed
  KVM: x86/mmu: Add a helper to walk and zap rmaps for a memslot
  KVM: x86/mmu: Plumb a @can_yield parameter into __walk_slot_rmaps()
  KVM: x86/mmu: Move walk_slot_rmaps() up near for_each_slot_rmap_range()
  KVM: x86/mmu: WARN on MMIO cache hit when emulating write-protected gfn
  KVM: x86/mmu: Detect if unprotect will do anything based on invalid_list
  KVM: x86/mmu: Subsume kvm_mmu_unprotect_page() into the and_retry() version
  KVM: x86: Rename reexecute_instruction()=>kvm_unprotect_and_retry_on_failure()
  KVM: x86: Update retry protection fields when forcing retry on emulation failure
  KVM: x86: Apply retry protection to "unprotect on failure" path
  KVM: x86: Check EMULTYPE_WRITE_PF_TO_SP before unprotecting gfn
  ...

commit | commitdiff | tree

Linus Torvalds [Sat, 28 Sep 2024 16:11:46 +0000 (09:11 -0700)]

Merge tag 's390-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull more s390 updates from Vasily Gorbik:

- Clean up and improve vdso code: use SYM_* macros for function and
   data annotations, add CFI annotations to fix GDB unwinding, optimize
   the chacha20 implementation

- Add vfio-ap driver feature advertisement for use by libvirt and
   mdevctl

* tag 's390-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/vfio-ap: Driver feature advertisement
  s390/vdso: Use one large alternative instead of an alternative branch
  s390/vdso: Use SYM_DATA_START_LOCAL()/SYM_DATA_END() for data objects
  tools: Add additional SYM_*() stubs to linkage.h
  s390/vdso: Use macros for annotation of asm functions
  s390/vdso: Add CFI annotations to __arch_chacha20_blocks_nostack()
  s390/vdso: Fix comment within __arch_chacha20_blocks_nostack()
  s390/vdso: Get rid of permutation constants

commit | commitdiff | tree

Linus Torvalds [Sat, 28 Sep 2024 16:06:15 +0000 (09:06 -0700)]

Merge tag 'modules-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull module updates from Luis Chamberlain:
"There are a few fixes / cleanups from Vincent, Chunhui, and Petr, but
  the most important part of this pull request is the Rust community
  stepping up to help maintain both C / Rust code for future Rust module
  support. We grow the set of modules maintainers by three now, and with
  this hope to scale to help address what's needed to properly support
  future Rust module support.

  A lot of exciting stuff coming in future kernel releases.

  This has been on linux-next for ~ 3 weeks now with no issues"

* tag 'modules-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  module: Refine kmemleak scanned areas
  module: abort module loading when sysfs setup suffer errors
  MAINTAINERS: scale modules with more reviewers
  module: Clean up the description of MODULE_SIG_<type>
  module: Split modules_install compression and in-kernel decompression

commit | commitdiff | tree

Linus Torvalds [Sat, 28 Sep 2024 16:00:38 +0000 (09:00 -0700)]

Merge tag 'fbdev-for-6.12-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev

Pull fbdev fixes from Helge Deller:

- crash fix in fbcon_putcs

- avoid a possible string memory overflow in sisfb

- minor code optimizations in omapfb and fbcon

* tag 'fbdev-for-6.12-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
  fbdev: sisfb: Fix strbuf array overflow
  fbcon: break earlier in search_fb_in_map and search_for_mapped_con
  fbdev: omapfb: Call of_node_put(ep) only once in omapdss_of_find_source_for_first_ep()
  fbcon: Fix a NULL pointer dereference issue in fbcon_putcs

commit | commitdiff | tree

Linus Torvalds [Sat, 28 Sep 2024 15:47:46 +0000 (08:47 -0700)]

Merge tag 'drm-next-2024-09-28' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
"Regular fixes for the week to end the merge window, i915 and xe have a
  few each, amdgpu makes up most of it with a bunch of SR-IOV related
  fixes amongst others.

  i915:
   - Fix BMG support to UHBR13.5
   - Two PSR fixes
   - Fix colorimetry detection for DP

  xe:
   - Fix macro for checking minimum GuC version
   - Fix CCS offset calculation for some BMG SKUs
   - Fix locking on memory usage reporting via fdinfo and BO destroy
   - Fix GPU page fault handler on a closed VM
   - Fix overflow in oa batch buffer

  amdgpu:
   - MES 12 fix
   - KFD fence sync fix
   - SR-IOV fixes
   - VCN 4.0.6 fix
   - SDMA 7.x fix
   - Bump driver version to note cleared VRAM support
   - SWSMU fix
   - CU occupancy logic fix
   - SDMA queue fix"

* tag 'drm-next-2024-09-28' of https://gitlab.freedesktop.org/drm/kernel: (79 commits)
  drm/amd/pm: update workload mask after the setting
  drm/amdgpu: bump driver version for cleared VRAM
  drm/amdgpu: fix vbios fetching for SR-IOV
  drm/amdgpu: fix PTE copy corruption for sdma 7
  drm/amdkfd: Add SDMA queue quantum support for GFX12
  drm/amdgpu/vcn: enable AV1 on both instances
  drm/amdkfd: Fix CU occupancy for GFX 9.4.3
  drm/amdkfd: Update logic for CU occupancy calculations
  drm/amdgpu: skip coredump after job timeout in SRIOV
  drm/amdgpu: sync to KFD fences before clearing PTEs
  drm/amdgpu/mes12: set enable_level_process_quantum_check
  drm/i915/dp: Fix colorimetry detection
  drm/amdgpu/mes12: reduce timeout
  drm/amdgpu/mes11: reduce timeout
  drm/amdgpu: use GEM references instead of TTMs v2
  drm/amd/display: Allow backlight to go below `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT`
  drm/amd/display: Fix kdoc entry for 'tps' in 'dc_process_dmub_dpia_set_tps_notification'
  drm/amdgpu: update golden regs for gfx12
  drm/amdgpu: clean up vbios fetching code
  drm/amd/display: handle nulled pipe context in DCE110's set_drr()
  ...

commit | commitdiff | tree

Linus Torvalds [Sat, 28 Sep 2024 15:40:36 +0000 (08:40 -0700)]

Merge tag 'ceph-for-6.12-rc1' of https://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
"Three CephFS fixes from Xiubo and Luis and a bunch of assorted
  cleanups"

* tag 'ceph-for-6.12-rc1' of https://github.com/ceph/ceph-client:
  ceph: remove the incorrect Fw reference check when dirtying pages
  ceph: Remove empty definition in header file
  ceph: Fix typo in the comment
  ceph: fix a memory leak on cap_auths in MDS client
  ceph: flush all caps releases when syncing the whole filesystem
  ceph: rename ceph_flush_cap_releases() to ceph_flush_session_cap_releases()
  libceph: use min() to simplify code in ceph_dns_resolve_name()
  ceph: Convert to use jiffies macro
  ceph: Remove unused declarations

commit | commitdiff | tree

Linus Torvalds [Sat, 28 Sep 2024 15:35:21 +0000 (08:35 -0700)]

Merge tag 'v6.12-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:

- fix querying dentry for char/block special files

- small cleanup patches

* tag 'v6.12-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: Correct typos in multiple comments across various files
  ksmbd: fix open failure from block and char device file
  ksmbd: remove unsafe_memcpy use in session setup
  ksmbd: Replace one-element arrays with flexible-array members
  ksmbd: fix warning: comparison of distinct pointer types lacks a cast

commit | commitdiff | tree

Linus Torvalds [Sat, 28 Sep 2024 15:30:27 +0000 (08:30 -0700)]

Merge tag '6.12rc-more-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull xmb client fixes from Steve French:

- Noisy log message cleanup

- Important netfs fix for cifs crash in generic/074

- Three minor improvements to use of hashing (multichannel and mount
   improvements)

- Fix decryption crash for large read with small esize

* tag '6.12rc-more-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: make SHA-512 TFM ephemeral
  smb: client: make HMAC-MD5 TFM ephemeral
  smb: client: stop flooding dmesg in smb2_calc_signature()
  smb: client: allocate crypto only for primary server
  smb: client: fix UAF in async decryption
  netfs: Fix write oops in generic/346 (9p) and generic/074 (cifs)

commit | commitdiff | tree

Kent Overstreet [Tue, 24 Sep 2024 02:32:47 +0000 (22:32 -0400)]

bcachefs: check_subvol_path() now prints subvol root inode

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Tue, 24 Sep 2024 02:27:13 +0000 (22:27 -0400)]

bcachefs: remove_backpointer() now checks if dirent points to inode

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Tue, 24 Sep 2024 02:22:00 +0000 (22:22 -0400)]

bcachefs: dirent_points_to_inode() now warns on mismatch

if an inode backpointer points to a dirent that doesn't point back,
that's an error we should warn about.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Alan Huang [Tue, 27 Aug 2024 15:14:48 +0000 (23:14 +0800)]

bcachefs: Fix lost wake up

If the reader acquires the read lock and then the writer enters the slow
path, while the reader proceeds to the unlock path, the following scenario
can occur without the change:

writer: pcpu_read_count(lock) return 1 (so __do_six_trylock will return 0)
reader: this_cpu_dec(*lock->readers)
reader: smp_mb()
reader: state = atomic_read(&lock->state) (there is no waiting flag set)
writer: six_set_bitmask()

then the writer will sleep forever.

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Thu, 26 Sep 2024 20:23:30 +0000 (16:23 -0400)]

bcachefs: Check for logged ops when clean

If we shut down successfully, there shouldn't be any logged ops to
resume.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Thu, 26 Sep 2024 20:19:58 +0000 (16:19 -0400)]

bcachefs: BCH_FS_clean_recovery

Add a filesystem flag to indicate whether we did a clean recovery -
using c->sb.clean after we've got rw is incorrect, since c->sb is
updated whenever we write the superblock.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Sat, 28 Sep 2024 01:05:59 +0000 (21:05 -0400)]

bcachefs: Convert disk accounting BUG_ON() to WARN_ON()

We had a bug where disk accounting keys didn't always have their version
field set in journal replay; change the BUG_ON() to a WARN(), and
exclude this case since it's now checked for elsewhere (in the bkey
validate function).

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Thu, 26 Sep 2024 19:59:29 +0000 (15:59 -0400)]

bcachefs: Fix BCH_TRANS_COMMIT_skip_accounting_apply

This was added to avoid double-counting accounting keys in journal
replay. But applied incorrectly (easily done since it applies to the
transaction commit, not a particular update), it leads to skipping
in-mem accounting for real accounting updates, and failure to give them
a version number - which leads to journal replay becoming very confused
the next time around.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Thu, 26 Sep 2024 19:58:02 +0000 (15:58 -0400)]

bcachefs: Check for accounting keys with bversion=0

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Thu, 26 Sep 2024 19:49:17 +0000 (15:49 -0400)]

bcachefs: rename version -> bversion

give bversions a more distinct name, to aid in grepping

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Thu, 26 Sep 2024 19:19:17 +0000 (15:19 -0400)]

bcachefs: Don't delete unlinked inodes before logged op resume

Previously, check_inode() would delete unlinked inodes if they weren't
on the deleted list - this code dating from before there was a deleted
list.

But, if we crash during a logged op (truncate or finsert/fcollapse) of
an unlinked file, logged op resume will get confused if the inode has
already been deleted - instead, just add it to the deleted list if it
needs to be there; delete_dead_inodes runs after logged op resume.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Thu, 26 Sep 2024 19:30:17 +0000 (15:30 -0400)]

bcachefs: Fix BCH_SB_ERRS() so we can reorder

BCH_SB_ERRS() has a field for the actual enum val so that we can reorder
to reorganize, but the way BCH_SB_ERR_MAX was defined didn't allow for
this.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Thu, 26 Sep 2024 20:51:19 +0000 (16:51 -0400)]

bcachefs: Fix fsck warnings from bkey validation

__bch2_fsck_err() warns if the current task has a btree_trans object and
it wasn't passed in, because if it has to prompt for user input it has
to be able to unlock it.

But plumbing the btree_trans through bkey_validate(), as well as
transaction restarts, is problematic - so instead make bkey fsck errors
FSCK_AUTOFIX, which doesn't need to warn.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Thu, 26 Sep 2024 20:50:29 +0000 (16:50 -0400)]

bcachefs: Move transaction commit path validation to as late as possible

In order to check for accounting keys with version=0, we need to run
validation after they've been assigned version numbers.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Wed, 25 Sep 2024 22:17:52 +0000 (18:17 -0400)]

bcachefs: Fix disk accounting attempting to mark invalid replicas entry

This fixes the following bug, where a disk accounting key has an invalid
replicas entry, and we attempt to add it to the superblock:

bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): starting version 1.12: rebalance_work_acct_fix opts=metadata_replicas=2,data_replicas=2,foreground_target=ssd,background_target=hdd,nopromote_whole_extents,verbose,fsck,fix_errors=yes
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): recovering from clean shutdown, journal seq 15211644
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): accounting_read...
accounting not marked in superblock replicas
  replicas cached: 1/1 [0], fixing
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): sb invalid before write: Invalid superblock section replicas_v0: invalid device 0 in entry cached: 1/1 [0]
replicas_v0 (size 88):
user: 2 [3 5] user: 2 [1 4] cached: 1 [2] btree: 2 [1 2] user: 2 [2 5] cached: 1 [0] cached: 1 [4] journal: 2 [1 5] user: 2 [1 2] user: 2 [2 3] user: 2 [3 4] user: 2 [4 5] cached: 1 [1] cached: 1 [3] cached: 1 [5] journal: 2 [1 2] journal: 2 [2 5] btree: 2 [2 5] user: 2 [1 3] user: 2 [1 5] user: 2 [2 4]

bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): inconsistency detected - emergency read only at journal seq 15211644
accounting not marked in superblock replicas
  replicas user: 1/1 [3], fixing
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): sb invalid before write: Invalid superblock section replicas_v0: invalid device 0 in entry cached: 1/1 [0]
replicas_v0 (size 96):
user: 2 [3 5] user: 2 [1 3] cached: 1 [2] btree: 2 [1 2] user: 2 [2 4] cached: 1 [0] cached: 1 [4] journal: 2 [1 5] user: 1 [3] user: 2 [1 5] user: 2 [3 4] user: 2 [4 5] cached: 1 [1] cached: 1 [3] cached: 1 [5] journal: 2 [1 2] journal: 2 [2 5] btree: 2 [2 5] user: 2 [1 2] user: 2 [1 4] user: 2 [2 3] user: 2 [2 5]

accounting not marked in superblock replicas
  replicas user: 1/2 [3 7], fixing
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): sb invalid before write: Invalid superblock section replicas_v0: invalid device 7 in entry user: 1/2 [3 7]
replicas_v0 (size 96):
user: 2 [3 7] user: 2 [1 3] cached: 1 [2] btree: 2 [1 2] user: 2 [2 4] cached: 1 [0] cached: 1 [4] journal: 2 [1 5] user: 1 [3] user: 2 [1 5] user: 2 [3 4] user: 2 [4 5] cached: 1 [1] cached: 1 [3] cached: 1 [5] journal: 2 [1 2] journal: 2 [2 5] btree: 2 [2 5] user: 2 [1 2] user: 2 [1 4] user: 2 [2 3] user: 2 [2 5] user: 2 [3 5]

done
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): alloc_read... done
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): stripes_read... done
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): snapshots_read... done

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Wed, 25 Sep 2024 22:17:31 +0000 (18:17 -0400)]

bcachefs: Fix unlocked access to c->disk_sb.sb in bch2_replicas_entry_validate()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Wed, 25 Sep 2024 20:46:06 +0000 (16:46 -0400)]

bcachefs: Fix accounting read + device removal

accounting read was checking if accounting replicas entries were marked
in the superblock prior to applying accounting from the journal,
which meant that a recently removed device could spuriously trigger a
"not marked in superblocked" error (when journal entries zero out the
offending counter).

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Wed, 25 Sep 2024 02:53:56 +0000 (22:53 -0400)]

bcachefs: bch_accounting_mode

Minor refactoring - replace multiple bool arguments with an enum; prep
work for fixing a bug in accounting read.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Tue, 24 Sep 2024 02:32:58 +0000 (22:32 -0400)]

bcachefs: fix transaction restart handling in check_extents(), check_dirents()

Dealing with outside state within a btree transaction is always tricky.

check_extents() and check_dirents() have to accumulate counters for
i_sectors and i_nlink (for subdirectories). There were two bugs:

- transaction commit may return a restart; therefore we have to commit
  before accumulating to those counters
- get_inode_all_snapshots() may return a transaction restart, before
  updating w->last_pos; then, on the restart,
  check_i_sectors()/check_subdir_count() would see inodes that were not
  for w->last_pos

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Tue, 24 Sep 2024 02:29:05 +0000 (22:29 -0400)]

bcachefs: kill inode_walker_entry.seen_this_pos

dead code

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Tue, 24 Sep 2024 23:31:22 +0000 (19:31 -0400)]

bcachefs: Fix incorrect IS_ERR_OR_NULL usage

Returning a positive integer instead of an error code causes error paths
to become very confused.

Closes: syzbot+c0360e8367d6d8d04a66@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Hongbo Li [Tue, 24 Sep 2024 01:41:46 +0000 (09:41 +0800)]

bcachefs: fix the memory leak in exception case

The pointer clean points the memory allocated by kmemdup, when the
return value of bch2_sb_clean_validate_late is not zero. The memory
pointed by clean is leaked. So we should free it in this case.

Fixes: a37ad1a3aba9 ("bcachefs: sb-clean.c")
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Hongbo Li [Tue, 24 Sep 2024 01:42:24 +0000 (09:42 +0800)]

bcachefs: fast exit when darray_make_room failed

In downgrade_table_extra, the return value is needed. When it
return failed, we should exit immediately.

Fixes: 7773df19c35f ("bcachefs: metadata version bucket_stripe_sectors")
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Tue, 24 Sep 2024 02:05:14 +0000 (22:05 -0400)]

bcachefs: Fix iterator leak in check_subvol()

A couple small error handling fixes

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Tue, 24 Sep 2024 02:06:04 +0000 (22:06 -0400)]

bcachefs: Add snapshot to bch_inode_unpacked

this allows for various cleanups in fsck

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Diogo Jahchan Koike [Mon, 23 Sep 2024 22:22:14 +0000 (19:22 -0300)]

bcachefs: assign return error when iterating through layout

syzbot reported a null ptr deref in __copy_user [0]

In __bch2_read_super, when a corrupt backup superblock matches the
default opts offset, no error is assigned to ret and the freed superblock
gets through, possibly being assigned as the best sb in bch2_fs_open and
being later dereferenced, causing a fault. Assign EINVALID to ret when
iterating through layout.

[0]: https://syzkaller.appspot.com/bug?extid=18a5c5e8a9c856944876

Reported-by: syzbot+18a5c5e8a9c856944876@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=18a5c5e8a9c856944876
Signed-off-by: Diogo Jahchan Koike <djahchankoike@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Mon, 23 Sep 2024 22:42:39 +0000 (18:42 -0400)]

bcachefs: Fix srcu warning in check_topology

check_topology doesn't need the srcu lock and doesn't use normal btree
transactions - we can just drop the srcu lock.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

Unnamed repository; edit this file 'description' to name the repository.