From 6f670b9d3ca532b2bf723d40b3e0fb74659aa289 Mon Sep 17 00:00:00 2001
From: zhongjinji
Date: Thu, 14 Aug 2025 21:55:53 +0800
Subject: [PATCH] futex: introduce function process_has_robust_futex()

Patch series "mm/oom_kill: Only delay OOM reaper for processes using
robust futexes", v4.

The OOM reaper quickly reclaims a process's memory when the system hits
OOM, helping the system recover. Without the OOM reaper, if a process
frozen by cgroup v1 is OOM killed, the victim's memory cannot be freed,
leaving the system in a poor state. Even if the process is not frozen by
cgroup v1, reclaiming victims' memory remains important, as having one
more process working speeds up memory release.

When processes holding robust futexes are OOM killed but waiters on those
futexes remain alive, the robust futexes might be reaped before
futex_cleanup() runs. This can cause the waiters to block indefinitely
[1].

To prevent this issue, the OOM reaper's work is delayed by 2 seconds [1].
Since many killed processes exit within 2 seconds, the OOM reaper rarely
runs after this delay. However, robust futex users are few, so delaying
OOM reap for all victims is unnecessary.

If each thread's robust_list in a process is NULL, the process holds no
robust futexes. For such processes, the OOM reaper should not be delayed.
For processes holding robust futexes, to avoid issue [1], the OOM reaper
must still be delayed.

Patch 1 introduces process_has_robust_futex() to detect whether a process
uses robust futexes. Patch 2 delays the OOM reaper only for processes
holding robust futexes, improving OOM reaper performance. Patch 3 makes
the OOM reaper and exit_mmap() traverse the maple tree in opposite orders
to reduce PTE lock contention caused by unmapping the same vma.

This patch (of 3):

When the holders of robust futexes are OOM killed but the waiters on
robust futexes are still alive, the robust futexes might be reaped before
futex_cleanup() runs. This can cause the waiters to block indefinitely [1].
To prevent this issue, the OOM reaper's work is delayed by 2 seconds [1].
However, the OOM reaper now rarely runs since many killed processes exit
within 2 seconds.

Because robust futex users are few, delay the reaper's execution only for
processes holding robust futexes to improve the performance of the OOM
reaper.

Introduce the function process_has_robust_futex() to detect whether a
process uses robust futexes. If every thread's robust_list in a process is
NULL, the process holds no robust futexes. Otherwise, the process holds
robust futexes.

Link: https://lkml.kernel.org/r/20250814135555.17493-1-zhongjinji@honor.com
Link: https://lkml.kernel.org/r/20250814135555.17493-2-zhongjinji@honor.com
Link: https://lore.kernel.org/all/20220414144042.677008-1-npache@redhat.com/T/#u [1]
Signed-off-by: zhongjinji
Cc: Andre Almeida
Cc: Darren Hart
Cc: Davidlohr Bueso
Cc: David Rientjes
Cc: Ingo Molnar
Cc: Liam Howlett
Cc: Mariano Pache
Cc: Michal Hocko
Cc: Peter Zijlstra
Cc: Shakeel Butt
Cc: Joel Savitz
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
---
 include/linux/futex.h |  5 +++++
 kernel/futex/core.c   | 30 ++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/include/linux/futex.h b/include/linux/futex.h
index 9e9750f04980..39540b7ae2a1 100644
--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -81,6 +81,7 @@ void futex_exec_release(struct task_struct *tsk);
 long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
 	      u32 __user *uaddr2, u32 val2, u32 val3);
 int futex_hash_prctl(unsigned long arg2, unsigned long arg3, unsigned long arg4);
+bool process_has_robust_futex(struct task_struct *tsk);
 
 #ifdef CONFIG_FUTEX_PRIVATE_HASH
 int futex_hash_allocate_default(void);
@@ -108,6 +109,10 @@ static inline int futex_hash_prctl(unsigned long arg2, unsigned long arg3, unsig
 {
 	return -EINVAL;
 }
+static inline bool process_has_robust_futex(struct task_struct *tsk)
+{
+	return false;
+}
 static inline int futex_hash_allocate_default(void)
 {
 	return 0;
diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index d9bb5567af0c..01b6561ab4f6 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -1961,6 +1961,36 @@ int futex_hash_prctl(unsigned long arg2, unsigned long arg3, unsigned long arg4)
 	return ret;
 }
 
+/*
+ * process_has_robust_futex() - check whether the given task holds robust futexes.
+ * @tsk: task struct of the task to consider
+ *
+ * If any thread in the task has a non-NULL robust_list or compat_robust_list,
+ * it indicates that the task holds robust futexes.
+ */
+bool process_has_robust_futex(struct task_struct *tsk)
+{
+	struct task_struct *t;
+	bool ret = false;
+
+	rcu_read_lock();
+	for_each_thread(tsk, t) {
+		if (unlikely(t->robust_list)) {
+			ret = true;
+			break;
+		}
+#ifdef CONFIG_COMPAT
+		if (unlikely(t->compat_robust_list)) {
+			ret = true;
+			break;
+		}
+#endif
+	}
+	rcu_read_unlock();
+
+	return ret;
+}
+
 static int __init futex_init(void)
 {
 	unsigned long hashsize, i;
-- 
2.51.0
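
Note (not part of the patch): the check above, and the way patch 2 of the
series is described as using it, can be modelled in userspace. The sketch
below is only an illustration under stated assumptions. struct mock_thread,
mock_process_has_robust_futex(), mock_oom_reaper_delay_ms() and
MOCK_REAPER_DELAY_MS are hypothetical names that do not exist in the kernel,
the thread group is modelled as a plain array, and the RCU locking of the
real function is left out. The 2000 ms value mirrors the 2 second delay
mentioned in the changelog.

```c
#include <assert.h>   /* used only by callers that want to self-check */
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the per-thread fields the patch reads. */
struct mock_thread {
	void *robust_list;        /* models t->robust_list */
	void *compat_robust_list; /* models t->compat_robust_list (CONFIG_COMPAT) */
};

/*
 * Models the loop in process_has_robust_futex(): the process counts as a
 * robust futex user as soon as one thread has a non-NULL list head.
 * The array walk replaces for_each_thread(); no locking is modelled.
 */
static bool mock_process_has_robust_futex(const struct mock_thread *threads,
					  size_t nr_threads)
{
	for (size_t i = 0; i < nr_threads; i++) {
		if (threads[i].robust_list || threads[i].compat_robust_list)
			return true;
	}
	return false;
}

/* Assumed delay, taken from the "2 seconds" stated in the changelog. */
#define MOCK_REAPER_DELAY_MS 2000u

/*
 * Hypothetical caller policy matching the description of patch 2:
 * delay the reaper only when the victim uses robust futexes.
 */
static unsigned int mock_oom_reaper_delay_ms(const struct mock_thread *threads,
					     size_t nr_threads)
{
	return mock_process_has_robust_futex(threads, nr_threads) ?
		MOCK_REAPER_DELAY_MS : 0u;
}
```

In this model a process whose threads all have NULL list heads is reaped
immediately, while a single thread with a registered list keeps the full
delay for the whole process.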