Christophe Leroy [Wed, 10 Mar 2021 17:46:49 +0000 (17:46 +0000)]
powerpc/uaccess: Remove calls to __get_user_bad() and __put_user_bad()
__get_user_bad() and __put_user_bad() are functions that are
declared but not defined, in order to make the link fail in
case they are called.
Nowadays, we have BUILD_BUG() and BUILD_BUG_ON() for that, and
they have the advantage to break the build earlier as it breaks
it at compile time instead of link time.
Christophe Leroy [Wed, 10 Mar 2021 17:46:48 +0000 (17:46 +0000)]
powerpc/uaccess: Remove __chk_user_ptr() in __get/put_user
Commit d02f6b7dab82 ("powerpc/uaccess: Evaluate macro arguments once,
before user access is allowed") changed the __chk_user_ptr()
argument from the passed ptr pointer to the locally
declared __gu_addr. But __gu_addr is locally defined as __user
so the check is pointless.
During kernel build __chk_user_ptr() voids and is only evaluated
during sparse checks so it should have been armless to leave the
original pointer check there.
Nevertheless, this check is indeed redundant with the assignment
above which casts the ptr pointer to the local __user __gu_addr.
In case of mismatch, sparse will detect it there, so the
__check_user_ptr() is not needed anywhere else than in access_ok().
Christophe Leroy [Wed, 10 Mar 2021 17:46:47 +0000 (17:46 +0000)]
powerpc/uaccess: Remove __unsafe_put_user_goto()
__unsafe_put_user_goto() is just an intermediate layer to
__put_user_size_goto() without added value other than doing
the __user pointer type checking.
Do the __user pointer type checking in __put_user_size_goto()
and remove __unsafe_put_user_goto().
Commit 6bfd93c32a50 ("powerpc: Fix incorrect might_sleep in
__get_user/__put_user on kernel addresses") added a check to not call
might_sleep() on kernel addresses. This was to enable the use of
__get_user() in the alignment exception handler for any address.
Then commit 95156f0051cb ("lockdep, mm: fix might_fault() annotation")
added a check of the address space in might_fault(), based on
set_fs() logic. But this didn't solve the powerpc alignment exception
case as it didn't call set_fs(KERNEL_DS).
Nowadays, set_fs() is gone, previous patch fixed the alignment
exception handler and __get_user/__put_user are not supposed to be
used anymore to read kernel memory.
Therefore the is_kernel_addr() check has become useless and can be
removed.
Christophe Leroy [Wed, 10 Mar 2021 17:46:45 +0000 (17:46 +0000)]
powerpc/align: Don't use __get_user_instr() on kernel addresses
In the old days, when we didn't have kernel userspace access
protection and had set_fs(), it was wise to use __get_user()
and friends to read kernel memory.
Nowadays, get_user() is granting userspace access and is exclusively
for userspace access.
In alignment exception handler, use probe_kernel_read_inst()
instead of __get_user_instr() for reading instructions in kernel.
This will allow to remove the is_kernel_addr() check in
__get/put_user() in a following patch.
Christophe Leroy [Wed, 10 Mar 2021 17:46:43 +0000 (17:46 +0000)]
powerpc/uaccess: Remove __get/put_user_inatomic()
Powerpc is the only architecture having _inatomic variants of
__get_user() and __put_user() accessors. They were introduced
by commit e68c825bb016 ("[POWERPC] Add inatomic versions of __get_user
and __put_user").
Those variants expand to the _nosleep macros instead of expanding
to the _nocheck macros. The only difference between the _nocheck
and the _nosleep macros is the call to might_fault().
Since commit 662bbcb2747c ("mm, sched: Allow uaccess in atomic with
pagefault_disable()"), __get/put_user() can be used in atomic parts
of the code, therefore __get/put_user_inatomic() have become useless.
Remove __get_user_inatomic() and __put_user_inatomic().
Christophe Leroy [Fri, 12 Mar 2021 13:25:11 +0000 (13:25 +0000)]
powerpc/align: Convert emulate_spe() to user_access_begin
This patch converts emulate_spe() to using user_access_begin
logic.
Since commit 662bbcb2747c ("mm, sched: Allow uaccess in atomic with
pagefault_disable()"), might_fault() doesn't fire when called from
sections where pagefaults are disabled, which must be the case
when using _inatomic variants of __get_user and __put_user. So
the might_fault() in user_access_begin() is not a problem.
There was a verification of user_mode() together with the access_ok(),
but there is a second verification of user_mode() just after, that
leads to immediate return. The access_ok() is now part of the
user_access_begin which is called after that other user_mode()
verification, so no need to check user_mode() again.
Michael Ellerman [Tue, 16 Mar 2021 01:09:38 +0000 (12:09 +1100)]
powerpc/pseries: Only register vio drivers if vio bus exists
The vio bus is a fake bus, which we use on pseries LPARs (guests) to
discover devices provided by the hypervisor. There's no need or sense
in creating the vio bus on bare metal systems.
Which is why commit 4336b9337824 ("powerpc/pseries: Make vio and
ibmebus initcalls pseries specific") made the initialisation of the
vio bus only happen in LPARs.
However as a result of that commit we now see errors at boot on bare
metal systems:
Driver 'hvc_console' was unable to register with bus_type 'vio' because the bus was not initialized.
Driver 'tpm_ibmvtpm' was unable to register with bus_type 'vio' because the bus was not initialized.
This happens because those drivers are built-in, and are calling
vio_register_driver(). It in turn calls driver_register() with a
reference to vio_bus_type, but we haven't registered vio_bus_type with
the driver core.
Fix it by also guarding vio_register_driver() with a check to see if
we are on pseries.
Fixes: 4336b9337824 ("powerpc/pseries: Make vio and ibmebus initcalls pseries specific") Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Link: https://lore.kernel.org/r/20210316010938.525657-1-mpe@ellerman.id.au
Daniel Henrique Barboza [Tue, 23 Mar 2021 20:50:56 +0000 (17:50 -0300)]
powerpc/pseries/hotplug-cpu: Show 'last online CPU' error in dlpar_cpu_offline()
One of the reasons that dlpar_cpu_offline can fail is when attempting to
offline the last online CPU of the kernel. This can be observed in a
pseries QEMU guest that has hotplugged CPUs. If the user offlines all
other CPUs of the guest, and a hotplugged CPU is now the last online
CPU, trying to reclaim it will fail.
The current error message in this situation returns rc with -EBUSY and a
generic explanation, e.g.:
pseries-hotplug-cpu: Failed to offline CPU PowerPC,POWER9, rc: -16
EBUSY can be caused by other conditions, such as cpu_hotplug_disable
being true. Throwing a more specific error message for this case,
instead of just "Failed to offline CPU", makes it clearer that the error
is in fact a known error situation instead of other generic/unknown
cause.
This patch adds a 'last online' check in dlpar_cpu_offline() to catch
the 'last online CPU' offline error, eturning a more informative error
message:
pseries-hotplug-cpu: Unable to remove last online CPU PowerPC,POWER9
He Ying [Tue, 16 Mar 2021 04:11:48 +0000 (00:11 -0400)]
powerpc/setup_64: Fix sparse warnings
Sparse warns:
warning: symbol 'rfi_flush' was not declared.
warning: symbol 'entry_flush' was not declared.
warning: symbol 'uaccess_flush' was not declared.
Define 'entry_flush' and 'uaccess_flush' as static because they are
not referenced outside the file. Include asm/security_features.h in
which 'rfi_flush' is declared.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: He Ying <heying24@huawei.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210316041148.29694-1-heying24@huawei.com
Christophe Leroy [Tue, 16 Mar 2021 07:57:16 +0000 (07:57 +0000)]
powerpc: Fix arch_stack_walk() to have running function as first entry
It seems like other architectures, namely x86 and arm64 and riscv
at least, include the running function as top entry when saving
stack trace with save_stack_trace_regs().
Functionnalities like KFENCE expect it.
Do the same on powerpc, it allows KFENCE and other users to
properly identify the faulting function as depicted below.
Before the patch KFENCE was identifying finish_task_switch.isra
as the faulting function.
Christophe Leroy [Tue, 16 Mar 2021 07:57:14 +0000 (07:57 +0000)]
powerpc: Rename 'tsk' parameter into 'task'
To better match generic code, rename 'tsk' to 'task' in
some stacktrace functions in preparation of following
patch which converts powerpc to generic ARCH_STACKWALK.
powerpc/book3s64/kuap: Move Kconfig varriables to BOOK3S_64
With below two commits:
commit c91435d95c49 ("powerpc/book3s64/hash/kuep: Enable KUEP on hash")
commit b2ff33a10c8b ("powerpc/book3s64/hash/kuap: Enable kuap on hash")
the kernel now supports kuap/kuep with hash translation. Hence select the
Kconfig even when radix is disabled.
Nicholas Piggin [Tue, 16 Mar 2021 10:52:05 +0000 (20:52 +1000)]
powerpc/64s: Fix hash fault to use TRAP accessor
Hash faults use the trap vector to decide whether this is an
instruction or data fault. This should use the TRAP accessor
rather than open access regs->trap.
This won't cause a problem at the moment because 64s only uses
trap flags for system call interrupts (the norestart flag), but
that could change if any other trap flags get used in future.
Fixes: a4922f5442e7e ("powerpc/64s: move the hash fault handling logic to C") Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210316105205.407767-1-npiggin@gmail.com
In fault.c, #ifdef CONFIG_PPC_MEM_KEYS is not needed because all
functions are always defined, and arch_vma_access_permitted()
always returns true when CONFIG_PPC_MEM_KEYS is not defined so
access_pkey_error() will return false so bad_access_pkey()
will never be called.
Include linux/pkeys.h to get a definition of vma_pkeys() for
bad_access_pkey().
Michael Ellerman [Sun, 14 Mar 2021 09:33:20 +0000 (20:33 +1100)]
powerpc/64s: Fold update_current_thread_[i]amr() into their only callers
lkp reported warnings in some configuration due to
update_current_thread_amr() being unused:
arch/powerpc/mm/book3s64/pkeys.c:284:20: error: unused function 'update_current_thread_amr'
static inline void update_current_thread_amr(u64 value)
Which is because it's only use is inside an ifdef. We could move it
inside the ifdef, but it's a single line function and only has one
caller, so just fold it in.
Similarly update_current_thread_iamr() is small and only called once,
so fold it in also.
Yang Li [Mon, 15 Mar 2021 07:24:56 +0000 (15:24 +0800)]
powerpc/xive: use true and false for bool variable
fixed the following coccicheck:
./arch/powerpc/sysdev/xive/spapr.c:552:8-9: WARNING: return of 0/1 in
function 'xive_spapr_match' with return type bool
Christophe Leroy [Mon, 15 Mar 2021 11:01:26 +0000 (11:01 +0000)]
powerpc/asm-offsets: GPR14 is not needed either
Commit aac6a91fea93 ("powerpc/asm: Remove unused symbols in
asm-offsets.c") removed GPR15 to GPR31 but kept GPR14,
probably because it pops up in a couple of comments when doing
a grep.
However, it was never used either, so remove it as well.
Christophe Leroy [Mon, 15 Mar 2021 12:00:09 +0000 (12:00 +0000)]
powerpc/math: Fix missing __user qualifier for get_user() and other sparse warnings
Sparse reports the following problems:
arch/powerpc/math-emu/math.c:228:21: warning: Using plain integer as NULL pointer
arch/powerpc/math-emu/math.c:228:31: warning: Using plain integer as NULL pointer
arch/powerpc/math-emu/math.c:228:41: warning: Using plain integer as NULL pointer
arch/powerpc/math-emu/math.c:228:51: warning: Using plain integer as NULL pointer
arch/powerpc/math-emu/math.c:237:13: warning: incorrect type in initializer (different address spaces)
arch/powerpc/math-emu/math.c:237:13: expected unsigned int [noderef] __user *_gu_addr
arch/powerpc/math-emu/math.c:237:13: got unsigned int [usertype] *
arch/powerpc/math-emu/math.c:226:1: warning: symbol 'do_mathemu' was not declared. Should it be static?
Add missing __user qualifier when casting pointer used in get_user()
Use NULL instead of 0 to initialise opX local variables.
Add a prototype for do_mathemu() (Added in processor.h like sparc)
Christophe Leroy [Fri, 12 Mar 2021 12:50:50 +0000 (12:50 +0000)]
powerpc/8xx: Create C version of kuap save/restore/check helpers
In preparation of porting PPC32 to C syscall entry/exit,
create C version of kuap_save_and_lock() and kuap_user_restore() and
kuap_kernel_restore() and kuap_assert_locked() and
kuap_get_and_assert_locked() on 8xx.
Christophe Leroy [Fri, 12 Mar 2021 12:50:49 +0000 (12:50 +0000)]
powerpc/32s: Create C version of kuap save/restore/check helpers
In preparation of porting PPC32 to C syscall entry/exit,
create C version of kuap_save_and_lock() and kuap_user_restore() and
kuap_kernel_restore() and kuap_assert_locked() and
kuap_get_and_assert_locked() on book3s/32.
Christophe Leroy [Fri, 12 Mar 2021 12:50:48 +0000 (12:50 +0000)]
powerpc/64s: Make kuap_check_amr() and kuap_get_and_check_amr() generic
In preparation of porting powerpc32 to C syscall entry/exit,
rename kuap_check_amr() and kuap_get_and_check_amr() as
kuap_assert_locked() and kuap_get_and_assert_locked(), and move in the
generic asm/kup.h the stub for when CONFIG_PPC_KUAP is not selected.
Christophe Leroy [Fri, 12 Mar 2021 12:50:41 +0000 (12:50 +0000)]
powerpc/32: Dismantle EXC_XFER_STD/LITE/TEMPLATE
In order to get more control in exception prolog, dismantle
all non standard exception macros, finishing with EXC_XFER_STD
and EXC_XFER_LITE and EXC_XFER_TEMPLATE.
Also remove transfer_to_handler_full and ret_from_except and
ret_from_except_full as they are not used anymore.
Last parameter of EXCEPTION() is now ignored, will be removed
in a later patch to avoid too much churn.
Christophe Leroy [Fri, 12 Mar 2021 12:50:40 +0000 (12:50 +0000)]
powerpc/32: Only restore non volatile registers when required
Until now, non volatile registers were restored everytime they
were saved, ie using EXC_XFER_STD meant saving and restoring
them while EXC_XFER_LITE meant neither saving not restoring them.
Now that they are always saved, EXC_XFER_STD means to restore
them and EXC_XFER_LITE means to not restore them.
Most of the users of EXC_XFER_STD only need to retrieve the
non volatile registers. For them there is no need to restore
the non volatile registers as they have not been modified.
Only very few exceptions require non volatile registers restore.
Opencode the few places which require saving of non volatile
registers.
Christophe Leroy [Fri, 12 Mar 2021 12:50:39 +0000 (12:50 +0000)]
powerpc/32: Add a prepare_transfer_to_handler macro for exception prologs
In order to increase flexibility, add a macro that will for now
call transfer_to_handler.
As transfer_to_handler doesn't do the actual transfer anymore,
also name it prepare_transfer_to_handler. The following patches
will progressively remove the use of transfer_to_handler label.
Christophe Leroy [Fri, 12 Mar 2021 12:50:35 +0000 (12:50 +0000)]
powerpc/32: Don't save thread.regs on interrupt entry
Since commit 06d67d54741a ("powerpc: make process.c suitable for both
32-bit and 64-bit"), thread.regs is set on task creation, no need to
set it again and again at each interrupt entry as it never change.
Christophe Leroy [Fri, 12 Mar 2021 12:50:33 +0000 (12:50 +0000)]
powerpc/32: Always save non volatile registers on exception entry
In preparation of handling exception entry and exit in C,
in order to simplify the handling, always save non volatile registers
when entering an exception.
Christophe Leroy [Fri, 12 Mar 2021 12:50:32 +0000 (12:50 +0000)]
powerpc/32: Perform normal function call in exception entry
Now that the MMU is re-enabled before calling the transfer function,
we don't need anymore that hack with the address of the handler and
the return function sitting just after the 'bl' to the transfer
fonction, that function is retrieving via a read relative to 'lr'.
Do a regular call to the transfer function, then to the handler,
then branch to the return function.
Christophe Leroy [Fri, 12 Mar 2021 12:50:29 +0000 (12:50 +0000)]
powerpc/32: Move exception prolog code into .text once MMU is back on
The space in the head section is rather constrained by the fact that
exception vectors are spread every 0x100 bytes and sometimes we
need to have "out of line" code because it doesn't fit.
Now that we are enabling MMU early in the prolog, take that opportunity
to jump somewhere else in the .text section where we don't have any
space constraint.
Christophe Leroy [Fri, 12 Mar 2021 12:50:25 +0000 (12:50 +0000)]
powerpc/32: Statically initialise first emergency context
The check of the emergency context initialisation in
vmap_stack_overflow is buggy for the SMP case, as it
compares r1 with 0 while in the SMP case r1 is offseted
by the CPU id.
Instead of fixing it, just perform static initialisation
of the first emergency context.
Christophe Leroy [Fri, 12 Mar 2021 12:50:23 +0000 (12:50 +0000)]
powerpc/32: Tag DAR in EXCEPTION_PROLOG_2 for the 8xx
8xx requires to tag the DAR with a magic value in order to
fixup DAR on faults generated by 'dcbX', as the 8xx
forgets to update the DAR for those faults.
Do the tagging as early as possible, that is before enabling MMU.
Christophe Leroy [Fri, 12 Mar 2021 12:50:21 +0000 (12:50 +0000)]
powerpc/32: Remove ksp_limit
ksp_limit is there to help detect stack overflows.
That is specific to ppc32 as it was removed from ppc64 in
commit cbc9565ee826 ("powerpc: Remove ksp_limit on ppc64").
There are other means for detecting stack overflows.
As ppc64 has proven to not need it, ppc32 should be able to do
without it too.
Christophe Leroy [Fri, 12 Mar 2021 12:50:16 +0000 (12:50 +0000)]
powerpc/40x: Prepare normal exception handler for enabling MMU early
Ensure normal exception handler are able to manage stuff with
MMU enabled. For that we use CONFIG_VMAP_STACK related code
allthough there is no intention to really activate CONFIG_VMAP_STACK
on powerpc 40x for the moment.
40x uses SPRN_DEAR instead of SPRN_DAR and SPRN_ESR instead of
SPRN_DSISR. Take it into account in common macros.
40x MSR value doesn't fit on 15 bits, use LOAD_REG_IMMEDIATE() in
common macros that will be used also with 40x.
Christophe Leroy [Fri, 12 Mar 2021 12:50:11 +0000 (12:50 +0000)]
powerpc/40x: Don't use SPRN_SPRG_SCRATCH0/1 in TLB miss handlers
SPRN_SPRG_SCRATCH5 is used to save SPRN_PID.
SPRN_SPRG_SCRATCH6 is already available.
SPRN_PID is only 8 bits. We have r12 that contains CR.
We only need to preserve CR0, so we have space available in r12
to save PID.
Keep PID in r12 and free up SPRN_SPRG_SCRATCH5.
Then In TLB miss handlers, instead of using SPRN_SPRG_SCRATCH0 and
SPRN_SPRG_SCRATCH1, use SPRN_SPRG_SCRATCH5 and SPRN_SPRG_SCRATCH6
to avoid future conflicts with normal exception prologs.
Christophe Leroy [Fri, 12 Mar 2021 12:50:10 +0000 (12:50 +0000)]
powerpc/traps: Declare unrecoverable_exception() as __noreturn
unrecoverable_exception() is never expected to return, most callers
have an infiniteloop in case it returns.
Ensure it really never returns by terminating it with a BUG(), and
declare it __no_return.
It always GCC to really simplify functions calling it. In the exemple
below, it avoids the stack frame in the likely fast path and avoids
code duplication for the exit.
Ravi Bangoria [Thu, 11 Mar 2021 09:15:38 +0000 (14:45 +0530)]
powerpc/uprobes: Validation for prefixed instruction
As per ISA 3.1, prefixed instruction should not cross 64-byte
boundary. So don't allow Uprobe on such prefixed instruction.
There are two ways probed instruction is changed in mapped pages.
First, when Uprobe is activated, it searches for all the relevant
pages and replace instruction in them. In this case, if that probe
is on the 64-byte unaligned prefixed instruction, error out
directly. Second, when Uprobe is already active and user maps a
relevant page via mmap(), instruction is replaced via mmap() code
path. But because Uprobe is invalid, entire mmap() operation can
not be stopped. In this case just print an error and continue.
Christopher M. Riedl [Sat, 27 Feb 2021 01:12:59 +0000 (19:12 -0600)]
powerpc/signal: Use __get_user() to copy sigset_t
Usually sigset_t is exactly 8B which is a "trivial" size and does not
warrant using __copy_from_user(). Use __get_user() directly in
anticipation of future work to remove the trivial size optimizations
from __copy_from_user().
The ppc32 implementation of get_sigset_t() previously called
copy_from_user() which, unlike __copy_from_user(), calls access_ok().
Replacing this w/ __get_user() (no access_ok()) is fine here since both
callsites in signal_32.c are preceded by an earlier access_ok().
Daniel Axtens [Sat, 27 Feb 2021 01:12:58 +0000 (19:12 -0600)]
powerpc/signal64: Rewrite rt_sigreturn() to minimise uaccess switches
Add uaccess blocks and use the 'unsafe' versions of functions doing user
access where possible to reduce the number of times uaccess has to be
opened/closed.
Co-developed-by: Christopher M. Riedl <cmr@codefail.de> Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Christopher M. Riedl <cmr@codefail.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210227011259.11992-10-cmr@codefail.de
Daniel Axtens [Sat, 27 Feb 2021 01:12:57 +0000 (19:12 -0600)]
powerpc/signal64: Rewrite handle_rt_signal64() to minimise uaccess switches
Add uaccess blocks and use the 'unsafe' versions of functions doing user
access where possible to reduce the number of times uaccess has to be
opened/closed.
There is no 'unsafe' version of copy_siginfo_to_user, so move it
slightly to allow for a "longer" uaccess block.
Co-developed-by: Christopher M. Riedl <cmr@codefail.de> Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Christopher M. Riedl <cmr@codefail.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210227011259.11992-9-cmr@codefail.de
Previously restore_sigcontext() performed a costly KUAP switch on every
uaccess operation. These repeated uaccess switches cause a significant
drop in signal handling performance.
Rewrite restore_sigcontext() to assume that a userspace read access
window is open by replacing all uaccess functions with their 'unsafe'
versions. Modify the callers to first open, call
unsafe_restore_sigcontext(), and then close the uaccess window.
Previously setup_sigcontext() performed a costly KUAP switch on every
uaccess operation. These repeated uaccess switches cause a significant
drop in signal handling performance.
Rewrite setup_sigcontext() to assume that a userspace write access window
is open by replacing all uaccess functions with their 'unsafe' versions.
Modify the callers to first open, call unsafe_setup_sigcontext() and
then close the uaccess window.
Christopher M. Riedl [Sat, 27 Feb 2021 01:12:54 +0000 (19:12 -0600)]
powerpc/signal64: Remove TM ifdefery in middle of if/else block
Both rt_sigreturn() and handle_rt_signal_64() contain TM-related ifdefs
which break-up an if/else block. Provide stubs for the ifdef-guarded TM
functions and remove the need for an ifdef in rt_sigreturn().
Rework the remaining TM ifdef in handle_rt_signal64() similar to
commit f1cf4f93de2f ("powerpc/signal32: Remove ifdefery in middle of if/else").
Unlike in the commit for ppc32, the ifdef can't be removed entirely
since uc_transact in sigframe depends on CONFIG_PPC_TRANSACTIONAL_MEM.
Christopher M. Riedl [Sat, 27 Feb 2021 01:12:53 +0000 (19:12 -0600)]
powerpc: Reference parameter in MSR_TM_ACTIVE() macro
Unlike the other MSR_TM_* macros, MSR_TM_ACTIVE does not reference or
use its parameter unless CONFIG_PPC_TRANSACTIONAL_MEM is defined. This
causes an 'unused variable' compile warning unless the variable is also
guarded with CONFIG_PPC_TRANSACTIONAL_MEM.
Reference but do nothing with the argument in the macro to avoid a
potential compile warning.
Christopher M. Riedl [Sat, 27 Feb 2021 01:12:52 +0000 (19:12 -0600)]
powerpc/signal64: Remove non-inline calls from setup_sigcontext()
The majority of setup_sigcontext() can be refactored to execute in an
"unsafe" context assuming an open uaccess window except for some
non-inline function calls. Move these out into a separate
prepare_setup_sigcontext() function which must be called first and
before opening up a uaccess window. Non-inline function calls should be
avoided during a uaccess window for a few reasons:
- KUAP should be enabled for as much kernel code as possible.
Opening a uaccess window disables KUAP which means any code
executed during this time contributes to a potential attack
surface.
- Non-inline functions default to traceable which means they are
instrumented for ftrace. This adds more code which could run
with KUAP disabled.
- Powerpc does not currently support the objtool UACCESS checks.
All code running with uaccess must be audited manually which
means: less code -> less work -> fewer problems (in theory).
A follow-up commit converts setup_sigcontext() to be "unsafe".
Davidlohr Bueso [Tue, 9 Mar 2021 01:59:50 +0000 (17:59 -0800)]
powerpc/qspinlock: Use generic smp_cond_load_relaxed
49a7d46a06c3 (powerpc: Implement smp_cond_load_relaxed()) added
busy-waiting pausing with a preferred SMT priority pattern, lowering
the priority (reducing decode cycles) during the whole loop slowpath.
However, data shows that while this pattern works well with simple
spinlocks, queued spinlocks benefit more being kept in medium priority,
with a cpu_relax() instead, being a low+medium combo on powerpc.
Data is from three benchmarks on a Power9: 9008-22L 64 CPUs with
2 sockets and 8 threads per core.
1. locktorture.
This is data for the lowest and most artificial/pathological level,
with increasing thread counts pounding on the lock. Metrics are total
ops/minute. Despite some small hits in the 4-8 range, scenarios are
either neutral or favorable to this patch.
Here a client will do one-way throughput tests to a localhost server, with
increasing message sizes, dealing with the sk_lock. This patch shows to put
the performance of the qspinlock back to par with that of the simple lock:
Davidlohr Bueso [Tue, 9 Mar 2021 01:59:49 +0000 (17:59 -0800)]
powerpc/spinlock: Unserialize spin_is_locked
c6f5d02b6a0f (locking/spinlocks/arm64: Remove smp_mb() from
arch_spin_is_locked()) made it pretty official that the call
semantics do not imply any sort of barriers, and any user that
gets creative must explicitly do any serialization.
This creativity, however, is nowadays pretty limited:
1. spin_unlock_wait() has been removed from the kernel in favor
of a lock/unlock combo. Furthermore, queued spinlocks have now
for a number of years no longer relied on _Q_LOCKED_VAL for the
call, but any non-zero value to indicate a locked state. There
were cases where the delayed locked store could lead to breaking
mutual exclusion with crossed locking; such as with sysv ipc and
netfilter being the most extreme.
2. The auditing Andrea did in verified that remaining spin_is_locked()
no longer rely on such semantics. Most callers just use it to assert
a lock is taken, in a debug nature. The only user that gets cute is
NOLOCK qdisc, as of:
96009c7d500e (sched: replace __QDISC_STATE_RUNNING bit with a spin lock)
... which ironically went in the next day after c6f5d02b6a0f. This
change replaces test_bit() with spin_is_locked() to know whether
to take the busylock heuristic to reduce contention on the main
qdisc lock. So any races against spin_is_locked() for archs that
use LL/SC for spin_lock() will be benign and not break any mutual
exclusion; furthermore, both the seqlock and busylock have the same
scope.