From 2a0665516eed8a467501c8409c93085d6e299bcf Mon Sep 17 00:00:00 2001 From: Honglei Wang Date: Thu, 23 Feb 2017 10:50:42 +0800 Subject: [PATCH] timers: Use proper base migration in add_timer_on() Orabug 25079989 hit timer race condition which is fixed by upstream commit 22b886dd (timers: Use proper base migration in add_timer_on()). Commit 22b886dd is base on v4.4 code whose definition of timer_list is different from uek4. It'll have to modify the timer_list struct if backport it directly with all his base commits. That's a big changeset and there will be kABI check issue. So this patch implements same logic (which is used in __mod_timer() for the same race condition as well) as 22b886dd and works with uek4 timer_list struct. Following is the description of commit 22b886dd: timers: Use proper base migration in add_timer_on() Regardless of the previous CPU a timer was on, add_timer_on() currently simply sets timer->flags to the new CPU. As the caller must be seeing the timer as idle, this is locally fine, but the timer leaving the old base while unlocked can lead to race conditions as follows. Let's say timer was on cpu 0. cpu 0 cpu 1 ----------------------------------------------------------------------------- del_timer(timer) succeeds del_timer(timer) lock_timer_base(timer) locks cpu_0_base add_timer_on(timer, 1) spin_lock(&cpu_1_base->lock) timer->flags set to cpu_1_base operates on @timer operates on @timer This triggered with mod_delayed_work_on() which contains "if (del_timer()) add_timer_on()" sequence eventually leading to the following oops. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [] detach_if_pending+0x69/0x1a0 ... Workqueue: wqthrash wqthrash_workfunc [wqthrash] task: ffff8800172ca680 ti: ffff8800172d0000 task.ti: ffff8800172d0000 RIP: 0010:[] [] detach_if_pending+0x69/0x1a0 ... Call Trace: [] del_timer+0x44/0x60 [] try_to_grab_pending+0xb6/0x160 [] mod_delayed_work_on+0x33/0x80 [] wqthrash_workfunc+0x61/0x90 [wqthrash] [] process_one_work+0x1e8/0x650 [] worker_thread+0x4e/0x450 [] kthread+0xef/0x110 [] ret_from_fork+0x3f/0x70 Fix it by updating add_timer_on() to perform proper migration as __mod_timer() does. Orabug: 25079989 Signed-off-by: Honglei Wang Acked-by: Joe Jin --- kernel/time/timer.c | 22 +++++++++++++++++++--- 1 file changed, 19 insertions(+), 3 deletions(-) diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 2ece3aa5069c..0686ff60f11e 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -966,13 +966,29 @@ EXPORT_SYMBOL(add_timer); */ void add_timer_on(struct timer_list *timer, int cpu) { - struct tvec_base *base = per_cpu(tvec_bases, cpu); + struct tvec_base *new_base = per_cpu(tvec_bases, cpu); + struct tvec_base *base; unsigned long flags; timer_stats_timer_set_start_info(timer); BUG_ON(timer_pending(timer) || !timer->function); - spin_lock_irqsave(&base->lock, flags); - timer_set_base(timer, base); + + /* + * If @timer was on a different CPU, it should be migrated with the + * old base locked to prevent other operations proceeding with the + * wrong base locked. See lock_timer_base(). + */ + base = lock_timer_base(timer, &flags); + + if (base != new_base) { + /* See the comment in lock_timer_base() */ + timer_set_base(timer, NULL); + spin_unlock(&base->lock); + base = new_base; + spin_lock(&base->lock); + timer_set_base(timer, base); + } + debug_activate(timer, timer->expires); internal_add_timer(base, timer); spin_unlock_irqrestore(&base->lock, flags); -- 2.50.1