From ca109491f612aab5c8152207631c0444f63da97f Mon Sep 17 00:00:00 2001 From: Peter Zijlstra Date: Tue, 25 Nov 2008 12:43:51 +0100 Subject: hrtimer: removing all ur callback modes Impact: cleanup, move all hrtimer processing into hardirq context This is an attempt at removing some of the hrtimer complexity by reducing the number of callback modes to 1. This means that all hrtimer callback functions will be ran from HARD-irq context. I went through all the 30 odd hrtimer callback functions in the kernel and saw only one that I'm not quite sure of, which is the one in net/can/bcm.c - hence I'm CC-ing the folks responsible for that code. Furthermore, the hrtimer core now calls callbacks directly with IRQs disabled in case you try to enqueue an expired timer. If this timer is a periodic timer (which should use hrtimer_forward() to advance its time) then it might be possible to end up in an inf. recursive loop due to the fact that hrtimer_forward() doesn't round up to the next timer granularity, and therefore keeps on calling the callback - obviously this needs a fix. Aside from that, this seems to compile and actually boot on my dual core test box - although I'm sure there are some bugs in, me not hitting any makes me certain :-) Signed-off-by: Peter Zijlstra Signed-off-by: Ingo Molnar --- kernel/time/ntp.c | 4 ++-- kernel/time/tick-sched.c | 1 - 2 files changed, 2 insertions(+), 3 deletions(-) (limited to 'kernel/time') diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c index 8ff15e5d486..f5f793d9241 100644 --- a/kernel/time/ntp.c +++ b/kernel/time/ntp.c @@ -131,7 +131,7 @@ static enum hrtimer_restart ntp_leap_second(struct hrtimer *timer) { enum hrtimer_restart res = HRTIMER_NORESTART; - write_seqlock_irq(&xtime_lock); + write_seqlock(&xtime_lock); switch (time_state) { case TIME_OK: @@ -164,7 +164,7 @@ static enum hrtimer_restart ntp_leap_second(struct hrtimer *timer) } update_vsyscall(&xtime, clock); - write_sequnlock_irq(&xtime_lock); + write_sequnlock(&xtime_lock); return res; } diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 342fc9ccab4..502a81e2639 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -681,7 +681,6 @@ void tick_setup_sched_timer(void) */ hrtimer_init(&ts->sched_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS); ts->sched_timer.function = tick_sched_timer; - ts->sched_timer.cb_mode = HRTIMER_CB_IRQSAFE_PERCPU; /* Get the next period (per cpu) */ hrtimer_set_expires(&ts->sched_timer, tick_init_jiffy_update()); -- cgit v1.2.3 From fa116ea35ec7f40e890972324409e99eed008d56 Mon Sep 17 00:00:00 2001 From: Heiko Carstens Date: Thu, 11 Dec 2008 17:04:11 +0100 Subject: nohz: no softirq pending warnings for offline cpus Impact: remove false positive warning After a cpu was taken down during cpu hotplug (read: disabled for interrupts) it still might have pending softirqs. However take_cpu_down makes sure that the idle task will run next instead of ksoftirqd on the taken down cpu. The idle task will call tick_nohz_stop_sched_tick which might warn about pending softirqs just before the cpu kills itself completely. However the pending softirqs on the dead cpu aren't a problem because they will be moved to an online cpu during CPU_DEAD handling. So make sure we warn only for online cpus. Signed-off-by: Heiko Carstens Signed-off-by: Ingo Molnar --- kernel/time/tick-sched.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'kernel/time') diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 342fc9ccab4..dc17ffcf191 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -247,7 +247,7 @@ void tick_nohz_stop_sched_tick(int inidle) if (need_resched()) goto end; - if (unlikely(local_softirq_pending())) { + if (unlikely(local_softirq_pending() && cpu_online(cpu))) { static int ratelimit; if (ratelimit < 10) { -- cgit v1.2.3 From 001474491fabeca233168a8598f721c808040f90 Mon Sep 17 00:00:00 2001 From: "Woodruff, Richard" Date: Mon, 1 Dec 2008 14:18:11 -0800 Subject: nohz: suppress needless timer reprogramming In my device I get many interrupts from a high speed USB device in a very short period of time. The system spends a lot of time reprogramming the hardware timer which is in a slower timing domain as compared to the CPU. This results in the CPU spending a huge amount of time waiting for the timer posting to be done. All of this reprogramming is useless as the wake up time has not changed. As measured using ETM trace this drops my reprogramming penalty from almost 60% CPU load down to 15% during high interrupt rate. I can send traces to show this. Suppress setting of duplicate timer event when timer already stopped. Timer programming can be very costly and can result in long cpu stall/wait times. [akpm@linux-foundation.org: coding-style fixes] [tglx@linutronix.de: move the check to the right place and avoid raising the softirq for nothing] Signed-off-by: Richard Woodruff Cc: johnstul@us.ibm.com Signed-off-by: Andrew Morton Signed-off-by: Thomas Gleixner --- kernel/time/tick-sched.c | 41 ++++++++++++++++++++++++----------------- 1 file changed, 24 insertions(+), 17 deletions(-) (limited to 'kernel/time') diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index dc17ffcf191..87fc34f21db 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -282,8 +282,31 @@ void tick_nohz_stop_sched_tick(int inidle) /* Schedule the tick, if we are at least one jiffie off */ if ((long)delta_jiffies >= 1) { + /* + * calculate the expiry time for the next timer wheel + * timer + */ + expires = ktime_add_ns(last_update, tick_period.tv64 * + delta_jiffies); + + /* + * If this cpu is the one which updates jiffies, then + * give up the assignment and let it be taken by the + * cpu which runs the tick timer next, which might be + * this cpu as well. If we don't drop this here the + * jiffies might be stale and do_timer() never + * invoked. + */ + if (cpu == tick_do_timer_cpu) + tick_do_timer_cpu = TICK_DO_TIMER_NONE; + if (delta_jiffies > 1) cpu_set(cpu, nohz_cpu_mask); + + /* Skip reprogram of event if its not changed */ + if (ts->tick_stopped && ktime_equal(expires, dev->next_event)) + goto out; + /* * nohz_stop_sched_tick can be called several times before * the nohz_restart_sched_tick is called. This happens when @@ -306,17 +329,6 @@ void tick_nohz_stop_sched_tick(int inidle) rcu_enter_nohz(); } - /* - * If this cpu is the one which updates jiffies, then - * give up the assignment and let it be taken by the - * cpu which runs the tick timer next, which might be - * this cpu as well. If we don't drop this here the - * jiffies might be stale and do_timer() never - * invoked. - */ - if (cpu == tick_do_timer_cpu) - tick_do_timer_cpu = TICK_DO_TIMER_NONE; - ts->idle_sleeps++; /* @@ -332,12 +344,7 @@ void tick_nohz_stop_sched_tick(int inidle) goto out; } - /* - * calculate the expiry time for the next timer wheel - * timer - */ - expires = ktime_add_ns(last_update, tick_period.tv64 * - delta_jiffies); + /* Mark expiries */ ts->idle_expires = expires; if (ts->nohz_mode == NOHZ_MODE_HIGHRES) { -- cgit v1.2.3