]> www.infradead.org Git - users/hch/misc.git/commit
rqspinlock: Hardcode cond_acquire loops for arm64
authorKumar Kartikeya Dwivedi <memxor@gmail.com>
Sun, 16 Mar 2025 04:05:24 +0000 (21:05 -0700)
committerAlexei Starovoitov <ast@kernel.org>
Wed, 19 Mar 2025 15:03:04 +0000 (08:03 -0700)
commitebababcd03729db14b2dd911d6600af84415509c
tree74eafacaf3d4a8447b52b705cf40d145b4698b67
parent14c48ee81452752d85e55a9b55e18cd90d251193
rqspinlock: Hardcode cond_acquire loops for arm64

Currently, for rqspinlock usage, the implementation of
smp_cond_load_acquire (and thus, atomic_cond_read_acquire) are
susceptible to stalls on arm64, because they do not guarantee that the
conditional expression will be repeatedly invoked if the address being
loaded from is not written to by other CPUs. When support for
event-streams is absent (which unblocks stuck WFE-based loops every
~100us), we may end up being stuck forever.

This causes a problem for us, as we need to repeatedly invoke the
RES_CHECK_TIMEOUT in the spin loop to break out when the timeout
expires.

Let us import the smp_cond_load_acquire_timewait implementation Ankur is
proposing in [0], and then fallback to it once it is merged.

While we rely on the implementation to amortize the cost of sampling
check_timeout for us, it will not happen when event stream support is
unavailable. This is not the common case, and it would be difficult to
fit our logic in the time_expr_ns >= time_limit_ns comparison, hence
just let it be.

  [0]: https://lore.kernel.org/lkml/20250203214911.898276-1-ankur.a.arora@oracle.com

Cc: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250316040541.108729-9-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
arch/arm64/include/asm/rqspinlock.h [new file with mode: 0644]
kernel/bpf/rqspinlock.c