www.infradead.org Git - users/dwmw2/linux.git/commit
x86/fpu: Simplify the switch_fpu_prepare() + switch_fpu_finish() logic
author Oleg Nesterov <oleg@redhat.com>
Sat, 3 May 2025 14:38:30 +0000 (16:38 +0200)
committer Ingo Molnar <mingo@kernel.org>
Sun, 4 May 2025 08:29:24 +0000 (10:29 +0200)
commit 730faa15a069f4025a0f8c2a5244c3067da7ecbe
tree 589707c08ed0a95c5de412a97998fec17fc2374d
parent a78701fe4befbe3c1720f84c893d5565edbbd11b

Now that switch_fpu_finish() doesn't load the FPU state, it makes more
sense to fold it into switch_fpu_prepare(), renamed to switch_fpu(), and,
more importantly, to use the "prev_p" task as the target for
TIF_NEED_FPU_LOAD. It doesn't make any sense to delay
set_tsk_thread_flag(TIF_NEED_FPU_LOAD) until "prev_p" is scheduled again.

There is no need to worry about the very first context switch: fpu_clone()
must always set TIF_NEED_FPU_LOAD.

Also, shift the test_tsk_thread_flag(TIF_NEED_FPU_LOAD) from the callers
to switch_fpu().
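
The flow described above can be illustrated with a small userspace model.
This is only a sketch under stated assumptions, not the kernel code from
this patch: every model_* name, the MODEL_PF_* bits, and the single global
"register" are hypothetical stand-ins, and model_return_to_user() assumes
that the deferred reload happens on the way back to user mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; not the real definitions. */
#define MODEL_PF_KTHREAD      0x1u
#define MODEL_PF_USER_WORKER  0x2u

struct model_task {
	unsigned int flags;   /* MODEL_PF_* bits */
	bool need_fpu_load;   /* models TIF_NEED_FPU_LOAD */
	int saved_regs;       /* models the in-memory FPU state */
	int last_cpu;         /* CPU on which the state was last saved */
};

/* Models the live FPU registers of the one CPU in this toy. */
static int model_cpu_regs;

/* Models fpu_clone(): a new task always starts with the flag set. */
static void model_fpu_clone(struct model_task *t, unsigned int flags)
{
	t->flags = flags;
	t->need_fpu_load = true;
	t->saved_regs = 0;
	t->last_cpu = -1;
}

/*
 * Models the folded switch_fpu(): the flag test, the register save and
 * the flag set all act on the outgoing task ("prev"), at switch-out time.
 */
static void model_switch_fpu(struct model_task *prev, int cpu)
{
	assert(prev != NULL);

	/* Flag already set: the registers do not hold prev's state. */
	if (prev->need_fpu_load)
		return;
	/* Assumption: kernel threads and user workers are skipped. */
	if (prev->flags & (MODEL_PF_KTHREAD | MODEL_PF_USER_WORKER))
		return;

	prev->need_fpu_load = true;          /* set now, not when prev runs again */
	prev->saved_regs = model_cpu_regs;   /* save the live registers */
	prev->last_cpu = cpu;
}

/* Assumed model of the deferred reload on return to user mode. */
static void model_return_to_user(struct model_task *t)
{
	if (t->need_fpu_load) {
		model_cpu_regs = t->saved_regs;
		t->need_fpu_load = false;
	}
}
```

In this model, a freshly cloned task passes through model_switch_fpu()
without saving anything because its flag is already set, which mirrors the
"very first context switch" remark above. A task that has returned to user
mode and is then switched out gets its state saved and its flag set in a
single step.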

Note that the "PF_KTHREAD | PF_USER_WORKER" check can be removed but
this deserves a separate patch which can change more functions, say,
kernel_fpu_begin_mask().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Chang S . Bae <chang.seok.bae@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250503143830.GA8982@redhat.com
arch/x86/include/asm/fpu/sched.h
arch/x86/kernel/process_32.c
arch/x86/kernel/process_64.c