path: root/kernel
Age | Commit message | Author
2008-11-06 | cpumask: introduce new API, without changing anything | Rusty Russell

Impact: introduce new APIs

We want to deprecate cpumasks on the stack, as we are headed for gynormous numbers of CPUs. Eventually, we want to head towards an undefined 'struct cpumask' so they can never be declared on stack.

1) New cpumask functions which take pointers instead of copies. (cpus_* -> cpumask_*)
2) Several new helpers to reduce requirements for temporary cpumasks (cpumask_first_and, cpumask_next_and, cpumask_any_and)
3) Helpers for declaring cpumasks on or offstack for large NR_CPUS (cpumask_var_t, alloc_cpumask_var and free_cpumask_var)
4) 'struct cpumask' for explicitness and to mark new-style code.
5) Make iterator functions stop at nr_cpu_ids (a runtime constant), not NR_CPUS for time efficiency and for smaller dynamic allocations in future.
6) cpumask_copy() so we can allocate less than a full cpumask eventually (for alloc_cpumask_var), and so we can eliminate the 'struct cpumask' definition eventually.
7) work_on_cpu() helper for doing task on a CPU, rather than saving old cpumask for current thread and manipulating it.
8) smp_call_function_many() which is smp_call_function_mask() except taking a cpumask pointer.

Note that this patch simply introduces the new functions and leaves the obsolescent ones in place. This is to simplify the transition patches.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
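Points 1, 2 and 5 above can be illustrated with a small userspace model. This is only a sketch: every `toy_` name, the 128-CPU maximum and the runtime count of 8 are hypothetical, and the code is not the kernel implementation. It shows a helper that takes mask pointers, needs no temporary mask for the intersection, and stops scanning at a runtime CPU count rather than the compile-time maximum.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define TOY_NR_CPUS 128   /* hypothetical compile-time maximum */
#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define TOY_LONGS ((TOY_NR_CPUS + TOY_BITS_PER_LONG - 1) / TOY_BITS_PER_LONG)

struct toy_cpumask {
    unsigned long bits[TOY_LONGS];
};

/* Hypothetical runtime number of possible CPUs (stands in for nr_cpu_ids). */
static unsigned int toy_nr_cpu_ids = 8;

static void toy_cpumask_set_cpu(unsigned int cpu, struct toy_cpumask *m)
{
    assert(m != NULL && cpu < TOY_NR_CPUS);
    m->bits[cpu / TOY_BITS_PER_LONG] |= 1UL << (cpu % TOY_BITS_PER_LONG);
}

static int toy_cpumask_test_cpu(unsigned int cpu, const struct toy_cpumask *m)
{
    return (int)((m->bits[cpu / TOY_BITS_PER_LONG] >>
                  (cpu % TOY_BITS_PER_LONG)) & 1UL);
}

/*
 * Return the first CPU set in both masks, or toy_nr_cpu_ids if there is
 * none. Both masks are passed by pointer and no temporary mask is built.
 */
static unsigned int toy_cpumask_first_and(const struct toy_cpumask *a,
                                          const struct toy_cpumask *b)
{
    unsigned int cpu;

    assert(a != NULL && b != NULL);
    for (cpu = 0; cpu < toy_nr_cpu_ids; cpu++)
        if (toy_cpumask_test_cpu(cpu, a) && toy_cpumask_test_cpu(cpu, b))
            return cpu;
    return toy_nr_cpu_ids;
}
```

A per-bit scan keeps the sketch short; a real implementation would presumably work a word at a time.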
2008-11-03 | tracing, ring-buffer: add paranoid checks for loops | Steven Rostedt

While writing a new tracer, I had a bug where I caused the ring-buffer to recurse in a bad way. The bug was with the tracer I was writing and not the ring-buffer itself. But it took a long time to find the problem.

This patch adds paranoid checks into the ring-buffer infrastructure that will catch bugs of this nature.

Note: I put the bug back in the tracer and this patch showed the error nicely and prevented the lockup.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
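A paranoid loop check of the kind described can be sketched as a bounded retry counter. The names and the bound of 3 below are hypothetical and do not come from the patch; the point is only that a retry loop which "cannot" spin forever still counts its iterations and bails out, disabling the buffer, instead of locking up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_LOOPS 3   /* hypothetical upper bound on legitimate retries */

struct toy_buffer {
    bool disabled;        /* set once a paranoid check has fired */
};

/*
 * Model of a reserve operation that may have to retry. retries_needed is
 * how many times the attempt is interrupted before it can succeed.
 * Returns the number of loop iterations used, or -1 if the check fired.
 */
static int toy_reserve(struct toy_buffer *buf, int retries_needed)
{
    int nr_loops = 0;

    assert(buf != NULL);
again:
    /* Paranoid check: more iterations than should ever happen means a bug. */
    if (++nr_loops > TOY_MAX_LOOPS) {
        buf->disabled = true;
        return -1;
    }
    if (retries_needed-- > 0)
        goto again;
    return nr_loops;
}
```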
2008-11-03 | ftrace: use kretprobe trampoline name to test in output | Steven Rostedt

Impact: ia64+tracing build fix

When a function is kprobed, the return address is set to the kprobe_trampoline, or something similar. This caused the output of the trace to look confusing when the parent seemed to be this "kprobe_trampoline" function.

To fix this, Abhishek Sagar added a test of the instruction pointer of the parent to see if it matched the kprobe_trampoline. If it did, the output would print a "[unknown/kretprobe'd]" instead.

Unfortunately, not all archs do this the same way, and the trampoline function may not be exported, which causes failures in builds.

This patch will compare the name instead of the pointer to see if it matches. This prevents us from depending on a function being exported, and should work on all archs. The worst that can happen is that an arch might use a different name and then we go back to the confusing output. At least the arch will still build.

Reported-by: Abhishek Sagar <sagar.abhishek@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Abhishek Sagar <sagar.abhishek@gmail.com>
Acked-by: Abhishek Sagar <sagar.abhishek@gmail.com>
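The name-based test can be sketched as a plain string comparison. The function name and its parameters are hypothetical; in particular, the trampoline name is passed in by the caller because the message itself notes that an arch may use a different name.

```c
#include <assert.h>
#include <string.h>

/*
 * Return the label to print for a parent symbol. If the resolved symbol
 * name matches the trampoline name, print a placeholder instead. Only
 * names are compared, so no trampoline symbol has to be exported.
 */
static const char *toy_parent_label(const char *sym,
                                    const char *trampoline_name)
{
    assert(sym != NULL && trampoline_name != NULL);
    if (strcmp(sym, trampoline_name) == 0)
        return "[unknown/kretprobe'd]";
    return sym;
}
```

If the arch uses a name the caller did not expect, the comparison simply fails and the raw symbol is printed, which matches the "worst case" described in the message.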
2008-11-03 | tracing, alpha: undefined reference to `save_stack_trace' | Al Viro

Impact: build fix on !stacktrace architectures

Only select STACKTRACE on architectures that have STACKTRACE_SUPPORT ... since we also need to ifdef out the guts of ftrace_trace_stack().

We also want to disallow setting TRACE_ITER_STACKTRACE in trace_flags on such configs, but that can wait.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-01 | PM_TEST_SUSPEND should depend on RTC_CLASS, not RTC_LIB | Al Viro

Insufficient dependency - we really want CONFIG_RTC_CLASS=y there. That will give us CONFIG_RTC_LIB=y, so the old dependency can be simply replaced.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-01 | reserve_region_with_split: Fix GFP_KERNEL usage under spinlock | Linus Torvalds

This one apparently doesn't generate any warnings, because the function is only used during system bootup, when the warnings are disabled. But it's still very wrong.

The __reserve_region_with_split() function is called with the resource_lock held for writing, so it must only ever do GFP_ATOMIC allocations.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 | Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: remove sched-design.txt from 00-INDEX
  sched: change sched_debug's mode to 0444
2008-10-30 | Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds

* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ftrace: handle archs that do not support irqs_disabled_flags
2008-10-30 | Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  resources: fix x86info results ioremap.c:226 __ioremap_caller+0xf2/0x2d6() WARNINGs
2008-10-31 | ftrace: handle archs that do not support irqs_disabled_flags | Steven Rostedt

Impact: build fix on non-lockdep architectures

Some architectures do not support a way to read the irq flags that is set from "local_irq_save(flags)" to determine if interrupts were disabled or enabled. Ftrace uses this information to display to the user if the trace occurred with interrupts enabled or disabled.

Besides the fact that those archs that do not support this will fail to compile, unless they fix it, we do not want to have the trace simply say interrupts were not disabled or they were enabled, without knowing the real answer.

This patch adds a 'X' in the output to let the user know that the architecture they are running on does not support a way for the tracer to determine if interrupts were enabled or disabled. It also lets those same archs compile with tracing enabled.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-30 | Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds

* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ftrace: fix trace_nop config select
  ftrace: perform an initialization for ftrace to enable it
2008-10-30 | 'kill sig -1' must only apply to caller's namespace | Sukadev Bhattiprolu

Currently "kill <sig> -1" kills processes in all namespaces and breaks the isolation of namespaces. An earlier attempt to fix this was discussed at:

http://lkml.org/lkml/2008/7/23/148

As suggested by Oleg Nesterov in that thread, use "task_pid_vnr() > 1" check since task_pid_vnr() returns 0 if process is outside the caller's namespace.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Tested-by: Daniel Hokka Zakrisson <daniel@hozac.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
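The effect of the "task_pid_vnr() > 1" check can be modelled in userspace as a filter over per-task virtual pids. This is a hedged sketch: `struct toy_task` and `toy_count_kill_targets()` are hypothetical names, and the `vnr` field merely stands in for the value the message says task_pid_vnr() returns (0 for a task outside the caller's namespace).

```c
#include <assert.h>
#include <stddef.h>

struct toy_task {
    int vnr;   /* pid as seen from the caller's namespace, 0 if not visible */
};

/*
 * Count the tasks a "kill <sig> -1" would signal under the described
 * check: vnr > 1 skips both tasks outside the namespace (vnr == 0) and
 * the namespace's init task (vnr == 1).
 */
static size_t toy_count_kill_targets(const struct toy_task *tasks, size_t n)
{
    size_t i, hits = 0;

    assert(tasks != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (tasks[i].vnr > 1)
            hits++;
    return hits;
}
```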
2008-10-30 | kernel/profile: fix profile_init() section mismatch | Paul Mundt

profile_init() calls into alloc_bootmem() during early initialization. While alloc_bootmem() is __init, the reference itself is safe in that it is tucked below a !slab_is_available() check. So, flag profile_init() as __ref.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 | freezer_cg: simplify freezer_change_state() | Li Zefan

Just call unfreeze_cgroup() if goal_state == THAWED, and call try_to_freeze_cgroup() if goal_state == FROZEN.

No behavior has been changed.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Cedric Le Goater <clg@fr.ibm.com>
Acked-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 | freezer_cg: use thaw_process() in unfreeze_cgroup() | Li Zefan

Don't duplicate the implementation of thaw_process().

[akpm@linux-foundation.org: make __thaw_process() static]

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Acked-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 | freezer_cg: remove redundant check in freezer_can_attach() | Li Zefan

It is sufficient to check if @task is frozen; there is no need to check if the original freezer is frozen.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Cedric Le Goater <clg@fr.ibm.com>
Acked-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 | freezer_cg: fix improper BUG_ON() causing oops | Li Zefan

The BUG_ON() should be protected by freezer->lock, otherwise it can be triggered easily when a task has been unfrozen but the corresponding cgroup hasn't been changed to FROZEN state.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Cedric Le Goater <clg@fr.ibm.com>
Acked-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 | sched: change sched_debug's mode to 0444 | Li Zefan

Impact: change /proc/sched_debug from rw-r--r-- to r--r--r--

/proc/sched_debug is read-only.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-29 | ftrace: fix trace_nop config select | Steven Rostedt

Impact: build fix on non-function-tracing architectures

The trace_nop is the tracer that is defined when no tracer is set in the ftrace infrastructure. The trace_nop was mistakenly selected by HAVE_FTRACE due to the confusion between ftrace infrastructure and the ftrace function tracer (which has been solved by renaming the function tracer).

This patch changes the select to the appropriate TRACING.

This patch should fix compile errors on architectures that do not define the FUNCTION_TRACER.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 | resources: fix x86info results ioremap.c:226 __ioremap_caller+0xf2/0x2d6() WARNINGs | Suresh Siddha

Impact: avoid false-positive WARN_ON()

Andi Kleen reported:

> When running x86info on a 2.6.27-git8 system I get
>
> resource map sanity check conflict: 0x9e000 0x9efff 0x10000 0x9e7ff System RAM
> ------------[ cut here ]------------
> WARNING: at /home/lsrc/linux/arch/x86/mm/ioremap.c:226 __ioremap_caller+0xf2/0x2d6()
> ...

Some of the pages below the 1MB ISA addresses will be shared typically by both BIOS and system usable RAM. For example:

  BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
  BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)

x86info reads the low physical address using /dev/mem, which internally uses ioremap() for accessing non RAM pages. ioremap() of such low pages conflicts with multiple resource entities leading to the above warning.

Change the iomem_map_sanity_check() to allow mapping a page spanning multiple resource entities (minimum granularity that one can map is a page anyhow).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
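The difference between a byte-granular and a page-granular containment check can be sketched with the numbers from the report. Several things here are assumptions: the 4 KiB page size, the `toy_` names, and the reading of the conflict line as "requested range 0x9e000-0x9efff, resource 0x10000-0x9e7ff". The sketch is not the kernel's iomem_map_sanity_check(); it only shows why comparing page frame numbers accepts a page that spans two resource entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12                      /* assumes 4 KiB pages */
#define TOY_PFN_DOWN(x) ((x) >> TOY_PAGE_SHIFT)

struct toy_resource {
    uint64_t start;
    uint64_t end;                              /* inclusive */
};

/* Byte-granular test: the mapping must lie entirely inside the resource. */
static bool toy_covers_bytes(const struct toy_resource *r,
                             uint64_t addr, uint64_t size)
{
    assert(r != NULL && size > 0);
    return r->start <= addr && r->end >= addr + size - 1;
}

/* Page-granular test: compare page frame numbers instead of addresses. */
static bool toy_covers_pages(const struct toy_resource *r,
                             uint64_t addr, uint64_t size)
{
    assert(r != NULL && size > 0);
    return TOY_PFN_DOWN(r->start) <= TOY_PFN_DOWN(addr) &&
           TOY_PFN_DOWN(r->end) >= TOY_PFN_DOWN(addr + size - 1);
}
```

With the reported values, the byte-granular test rejects the mapping because the resource ends at 0x9e7ff, in the middle of the page, while the page-granular test accepts it because both ends fall in page frame 0x9e.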
2008-10-28 | ftrace: perform an initialization for ftrace to enable it | Frederic Weisbecker

Impact: corrects a bug which made the non-dyn function tracer not functional

With latest git, the non-dynamic function tracer didn't get any trace. The problem was that ftrace_enabled wasn't initialized to 1, because ftrace has no init function when DYNAMIC_FTRACE is disabled. So when a tracer tried to register an ftrace_ops struct, __register_ftrace_function failed to set the hook.

This patch corrects it by adding an init function to initialize ftrace during boot.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 | Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds

* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits)
  ftrace: fix current_tracer error return
  tracing: fix a build error on alpha
  ftrace: use a real variable for ftrace_nop in x86
  tracing/ftrace: make boot tracer select the sched_switch tracer
  tracepoint: check if the probe has been registered
  asm-generic: define DIE_OOPS in asm-generic
  trace: fix printk warning for u64
  ftrace: warning in kernel/trace/ftrace.c
  ftrace: fix build failure
  ftrace, powerpc, sparc64, x86: remove notrace from arch ftrace file
  ftrace: remove ftrace hash
  ftrace: remove mcount set
  ftrace: remove daemon
  ftrace: disable dynamic ftrace for all archs that use daemon
  ftrace: add ftrace warn on to disable ftrace
  ftrace: only have ftrace_kill atomic
  ftrace: use probe_kernel
  ftrace: comment arch ftrace code
  ftrace: return error on failed modified text.
  ftrace: dynamic ftrace process only text section
  ...
2008-10-28 | Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  lockdep: fix irqs on/off ip tracing
  lockdep: minor fix for debug_show_all_locks()
  x86: restore the old swiotlb alloc_coherent behavior
  x86: use GFP_DMA for 24bit coherent_dma_mask
  swiotlb: remove panic for alloc_coherent failure
  xen: compilation fix of drivers/xen/events.c on IA64
  xen: portability clean up and some minor clean up for xencomm.c
  xen: don't reload cr3 on suspend
  kernel/resource: fix reserve_region_with_split() section mismatch
  printk: remove unused code from kernel/printk.c
2008-10-28 | Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds

* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  irq: make variable static
2008-10-28 | Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: fix documentation reference for sched_min_granularity_ns
  sched: virtual time buddy preemption
  sched: re-instate vruntime based wakeup preemption
  sched: weaken sync hint
  sched: more accurate min_vruntime accounting
  sched: fix a find_busiest_group buglet
  sched: add CONFIG_SMP consistency
2008-10-28 | ftrace: fix current_tracer error return | Steven Rostedt

The commit (in linux-tip) c2931e05ec5965597cbfb79ad332d4a29aeceb23 ( ftrace: return an error when setting a nonexistent tracer ) added useful code that would error when a bad tracer was written into the current_tracer file.

But this had a bug if the amount written was more than the amount read by that code. The first iteration would set the tracer correctly, but since it did not consume the rest of what was written (usually whitespace), the userspace utility would continue to write what was not consumed. This second iteration would fail to find a tracer and return -EINVAL. Funny thing is that the tracer would have already been set.

This patch just consumes all the data that is written to the file.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
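The "consume everything" behaviour can be sketched with a toy write handler. Everything named `toy_` is hypothetical, as are the 16-byte internal buffer and the list of tracer names. The handler inspects only the first bytes of the input but reports the full count as consumed, so a caller that retries on short writes has nothing left to send.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define TOY_NAME_MAX 16   /* hypothetical size of the handler's name buffer */

static const char *const toy_tracers[] = { "nop", "function", "sched_switch" };
static const char *toy_current = "nop";

/*
 * Model of a write handler for a "current tracer" file. Returns the
 * number of bytes consumed, or -1 if the name is not a known tracer.
 */
static long toy_set_tracer_write(const char *ubuf, size_t cnt)
{
    char name[TOY_NAME_MAX + 1];
    size_t len = cnt < TOY_NAME_MAX ? cnt : TOY_NAME_MAX;
    size_t i;

    assert(ubuf != NULL);
    memcpy(name, ubuf, len);
    name[len] = '\0';

    /* Strip trailing whitespace from the part that was copied. */
    while (len > 0 && isspace((unsigned char)name[len - 1]))
        name[--len] = '\0';

    for (i = 0; i < sizeof(toy_tracers) / sizeof(toy_tracers[0]); i++) {
        if (strcmp(name, toy_tracers[i]) == 0) {
            toy_current = toy_tracers[i];
            /* Report the whole write as consumed, not just len bytes. */
            return (long)cnt;
        }
    }
    return -1;
}
```

Returning `len` instead of `cnt` on success would reproduce the described bug: the caller would write the leftover whitespace again, and that second call would fail the lookup even though the tracer had already been set.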
2008-10-28 | lockdep: fix irqs on/off ip tracing | Heiko Carstens

Impact: fix lockdep lock-api-caller output when irqsoff tracing is enabled

81d68a96 "ftrace: trace irq disabled critical timings" added wrappers around trace_hardirqs_on/off_caller. However these functions use __builtin_return_address(0) to figure out which function actually disabled or enabled irqs. The result is that we save the ips of trace_hardirqs_on/off instead of the real caller. Not very helpful.

However since the patch from Steven the ip already gets passed. So use that and get rid of __builtin_return_address(0) in these two functions.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 | lockdep: minor fix for debug_show_all_locks() | Qinghuang Feng

When we ultimately fail to get tasklist_lock (count equals 0), we should only print " ignoring it.\n", and not print " locked it.\n" needlessly.

Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 | tracing: fix a build error on alpha | Frederic Weisbecker

Impact: build fix on Alpha

When tracing is enabled, some archs include <linux/irqflags.h> in their <asm/system.h> but others like alpha or m68k don't.

Build error on alpha:

  kernel/trace/trace.c: In function 'tracing_cpumask_write':
  kernel/trace/trace.c:2145: error: implicit declaration of function 'raw_local_irq_disable'
  kernel/trace/trace.c:2162: error: implicit declaration of function 'raw_local_irq_enable'

Tested on Alpha through a cross-compiler (should correct a similar issue on m68k).

Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-27 | tracing/ftrace: make boot tracer select the sched_switch tracer | Frederic Weisbecker

Impact: build fix

If the boot tracer is selected but not the sched_switch, there will be a build failure:

  kernel/built-in.o: In function `boot_trace_init':
  trace_boot.c:(.text+0x5ee38): undefined reference to `sched_switch_trace'
  kernel/built-in.o: In function `disable_boot_trace':
  (.text+0x5eee1): undefined reference to `tracing_stop_cmdline_record'
  kernel/built-in.o: In function `enable_boot_trace':
  (.text+0x5ef11): undefined reference to `tracing_start_cmdline_record'

This patch fixes it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-27 | tracepoint: check if the probe has been registered | Frederic Weisbecker

Impact: fix kernel crash that can trigger during tracing

If we try to remove a probe that has not already been registered, the tracepoint_entry_remove_probe() function will dereference a NULL pointer.

Check the probe before removing it to avoid crashes.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
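The guard described here can be sketched with a NULL-terminated probe array. The types and names below are hypothetical and much simpler than the tracepoint code. The point is only that removal first checks whether anything was registered at all, then whether this particular probe is in the list.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*toy_probe_fn)(void);

struct toy_entry {
    toy_probe_fn *probes;   /* NULL-terminated array, or NULL if none */
};

/* Two dummy probes so the sketch has something to register. */
static void toy_probe_a(void) { }
static void toy_probe_b(void) { }

/* Remove probe from the entry. Returns 0 on success, -1 if not registered. */
static int toy_remove_probe(struct toy_entry *e, toy_probe_fn probe)
{
    size_t i, j;

    assert(e != NULL);
    if (e->probes == NULL)          /* guard: nothing was ever registered */
        return -1;

    for (i = 0; e->probes[i] != NULL; i++) {
        if (e->probes[i] == probe) {
            /* Shift the tail (including the terminator) down by one. */
            for (j = i; e->probes[j] != NULL; j++)
                e->probes[j] = e->probes[j + 1];
            return 0;
        }
    }
    return -1;                      /* this probe was not in the list */
}
```

Without the first check, the loop would read through a NULL `probes` pointer, which is the kind of crash the message describes.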
2008-10-27 | trace: fix printk warning for u64 | Stephen Rothwell

A powerpc ppc64_defconfig build produces these warnings:

  kernel/trace/ring_buffer.c: In function 'rb_add_time_stamp':
  kernel/trace/ring_buffer.c:969: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type 'u64'
  kernel/trace/ring_buffer.c:969: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'u64'
  kernel/trace/ring_buffer.c:969: warning: format '%llu' expects type 'long long unsigned int', but argument 4 has type 'u64'

Just cast the u64s to unsigned long long like we do everywhere else.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
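The cast can be shown in userspace with `uint64_t` standing in for the kernel's `u64`. The function name and the message format are hypothetical. On platforms where the 64-bit type is `unsigned long` rather than `unsigned long long`, passing it straight to `%llu` triggers the format warning quoted above; the explicit cast makes the argument type match the format on every platform.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Format two 64-bit values with %llu, casting so the types always match. */
static int toy_format_stamp(char *buf, size_t len,
                            uint64_t delta, uint64_t ts)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "delta=%llu ts=%llu",
                    (unsigned long long)delta,
                    (unsigned long long)ts);
}
```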
2008-10-27 | Merge commit 'v2.6.28-rc2' into tracing/urgent | Ingo Molnar
2008-10-26 | cgroup: remove unused variable | Stephen Rothwell

  /scratch/sfr/next/kernel/cgroup.c: In function 'cgroup_tasks_start':
  /scratch/sfr/next/kernel/cgroup.c:2107: warning: unused variable 'i'

Introduced in commit cc31edceee04a7b87f2be48f9489ebb72d264844 "cgroups: convert tasks file to use a seq_file with shared pid array".

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-25 | Revert "Call init_workqueues before pre smp initcalls." | Linus Torvalds

This reverts commit a802dd0eb5fc97a50cf1abb1f788a8f6cc5db635 by moving the call to init_workqueues() back where it belongs - after SMP has been initialized.

It also moves stop_machine_init() - which needs workqueues - to a later phase using a core_initcall() instead of early_initcall(). That should satisfy all ordering requirements, and was apparently the reason why init_workqueues() was moved to be too early.

Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-24 | ftrace: warning in kernel/trace/ftrace.c | Ingo Molnar

this warning:

  kernel/trace/ftrace.c:189: warning: ‘frozen_record_count’ defined but not used

triggers because frozen_record_count is only used in the KCONFIG_MARKERS case. Move the variable there.

Alas, this frozen-record facility seems to have little use. The frozen_record_count variable is not used by anything, nor the flags.

So this section might need a bit of dead-code-removal care as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24 | sched: virtual time buddy preemption | Peter Zijlstra

Since we moved wakeup preemption back to virtual time, it makes sense to move the buddy stuff back as well. The purpose of the buddy scheduling is to allow a quickly scheduling pair of tasks to run away from the group as far as a regular busy task would be allowed under wakeup preemption.

This has the advantage that the pair can ping-pong for a while, enjoying cache-hotness. Without buddy scheduling other tasks would interleave destroying the cache.

Also, it saves a word in cfs_rq.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24 | sched: re-instate vruntime based wakeup preemption | Peter Zijlstra

The advantage is that vruntime based wakeup preemption has a better conceptual model. Here wakeup_gran = 0 means: preempt when 'fair'. Therefore wakeup_gran is the granularity of unfairness we allow in order to make progress.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
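One way to read the conceptual model is as a threshold on a virtual-runtime difference. The sketch below is an interpretation, not the scheduler's code: the function name, the parameter names and the exact comparison are assumptions. With a granularity of 0 the woken task preempts as soon as it is behind the running task in virtual time; a larger granularity tolerates that much unfairness before preempting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a freshly woken task should preempt the running one.
 * vdiff is how far the running task is ahead of the woken task in
 * virtual runtime; the signed subtraction keeps this wrap-safe.
 */
static bool toy_wakeup_preempt(uint64_t curr_vruntime,
                               uint64_t woken_vruntime,
                               int64_t wakeup_gran)
{
    int64_t vdiff = (int64_t)(curr_vruntime - woken_vruntime);

    assert(wakeup_gran >= 0);
    return vdiff > wakeup_gran;
}
```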
2008-10-24 | sched: weaken sync hint | Mike Galbraith

Mysql+oltp and pgsql+oltp peaks are still shifted right. The below puts the peaks back to 1 client/server pair per core.

Use the avg_overlap information to weaken the sync hint.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24 | sched: more accurate min_vruntime accounting | Peter Zijlstra

Mike noticed the current min_vruntime tracking can go wrong and skip the current task. If the only remaining task in the tree is a nice 19 task with huge vruntime, new tasks will be inserted too far to the right too, causing some interactivity issues.

min_vruntime can only change due to the leftmost entry disappearing (dequeue_entity()), or by the leftmost entry being incremented past the next entry, which elects a new leftmost (__update_curr()).

Due to the current entry not being part of the actual tree, we have to compare the leftmost tree entry with the current entry, and take the leftmost of these two.

So create an update_min_vruntime() function that computes the leftmost vruntime in the system (either tree or current) and increases the cfs_rq->min_vruntime if the computed value is larger than the previously found min_vruntime. And call this from the two sites we've identified that can change min_vruntime.

Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
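The rule described above (take the leftmost of the current entry and the leftmost tree entry, then only ever move min_vruntime forward) can be sketched in userspace. The struct layout and all `toy_` names are hypothetical; the wrap-safe comparisons through a signed difference are an assumption about how such 64-bit counters are usually compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_cfs_rq {
    uint64_t min_vruntime;
    bool has_curr;          /* a task is currently running (not in the tree) */
    uint64_t curr_vruntime;
    bool has_leftmost;      /* the tree is non-empty */
    uint64_t leftmost_vruntime;
};

/* Wrap-safe min/max: compare through the signed difference. */
static uint64_t toy_min(uint64_t a, uint64_t b)
{
    return (int64_t)(b - a) < 0 ? b : a;
}

static uint64_t toy_max(uint64_t a, uint64_t b)
{
    return (int64_t)(b - a) > 0 ? b : a;
}

static void toy_update_min_vruntime(struct toy_cfs_rq *rq)
{
    uint64_t v;

    assert(rq != NULL);
    v = rq->min_vruntime;
    if (rq->has_curr)
        v = rq->curr_vruntime;
    if (rq->has_leftmost)
        v = rq->has_curr ? toy_min(v, rq->leftmost_vruntime)
                         : rq->leftmost_vruntime;

    /* min_vruntime only ever increases. */
    rq->min_vruntime = toy_max(rq->min_vruntime, v);
}
```

In this model the function would be called from the two places the message identifies: when an entry leaves the tree and when the running entry's vruntime is advanced.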
2008-10-24 | sched: fix a find_busiest_group buglet | Peter Zijlstra

In one of the group load balancer patches:

  commit 408ed066b11cf9ee4536573b4269ee3613bd735e
  Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Date: Fri Jun 27 13:41:28 2008 +0200
  Subject: sched: hierarchical load vs find_busiest_group

The following change:

  - if (max_load - this_load + SCHED_LOAD_SCALE_FUZZ >=
  + if (max_load - this_load + 2*busiest_load_per_task >=
      busiest_load_per_task * imbn) {

made the condition always true, because imbn is [1,2]. Therefore, remove the 2*, and give it a fair chance.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
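The "always true" claim can be checked numerically. The sketch assumes that max_load is at least this_load at this point (an assumption made here, not stated in the message) and that imbn is 1 or 2. Under those assumptions the left-hand side with the 2* factor is at least 2*busiest_load_per_task, which is never smaller than busiest_load_per_task * imbn. The function names and the short parameter name `blpt` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition as introduced by the earlier patch (with the 2* factor). */
static bool toy_cond_with_2x(unsigned long max_load, unsigned long this_load,
                             unsigned long blpt, unsigned long imbn)
{
    assert(max_load >= this_load && (imbn == 1 || imbn == 2));
    return max_load - this_load + 2 * blpt >= blpt * imbn;
}

/* Condition with the 2* removed, as the message proposes. */
static bool toy_cond_fixed(unsigned long max_load, unsigned long this_load,
                           unsigned long blpt, unsigned long imbn)
{
    assert(max_load >= this_load && (imbn == 1 || imbn == 2));
    return max_load - this_load + blpt >= blpt * imbn;
}
```

For example, with equal loads, a per-task load of 512 and imbn = 2, the first form gives 1024 >= 1024 (true) while the second gives 512 >= 1024 (false), so the comparison can now actually fail.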
2008-10-24 | Merge commit 'v2.6.28-rc1' into sched/urgent | Ingo Molnar
2008-10-23 | kernel/resource: fix reserve_region_with_split() section mismatch | Paul Mundt

Impact: cleanup, small kernel text size reduction, no functionality changed

reserve_region_with_split() calls into __reserve_region_with_split(), which is an __init function.

The only caller of reserve_region_with_split() is an __init function, so make it __init too.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-23 | printk: remove unused code from kernel/printk.c | Roel Kluin

Both log_buf_copy() and log_buf_len are unused.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-23 | Fix compile warning in kernel/params.c | Linus Torvalds

Move free_module_param_attrs() into the CONFIG_MODULES section, since it's only used inside there. Thus avoiding the warning

  kernel/params.c:514: warning: 'free_module_param_attrs' defined but not used

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-23 | Merge branch 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc | Linus Torvalds

* 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc: (35 commits)
  proc: remove fs/proc/proc_misc.c
  proc: move /proc/vmcore creation to fs/proc/vmcore.c
  proc: move pagecount stuff to fs/proc/page.c
  proc: move all /proc/kcore stuff to fs/proc/kcore.c
  proc: move /proc/schedstat boilerplate to kernel/sched_stats.h
  proc: move /proc/modules boilerplate to kernel/module.c
  proc: move /proc/diskstats boilerplate to block/genhd.c
  proc: move /proc/zoneinfo boilerplate to mm/vmstat.c
  proc: move /proc/vmstat boilerplate to mm/vmstat.c
  proc: move /proc/pagetypeinfo boilerplate to mm/vmstat.c
  proc: move /proc/buddyinfo boilerplate to mm/vmstat.c
  proc: move /proc/vmallocinfo to mm/vmalloc.c
  proc: move /proc/slabinfo boilerplate to mm/slub.c, mm/slab.c
  proc: move /proc/slab_allocators boilerplate to mm/slab.c
  proc: move /proc/interrupts boilerplate code to fs/proc/interrupts.c
  proc: move /proc/stat to fs/proc/stat.c
  proc: move rest of /proc/partitions code to block/genhd.c
  proc: move /proc/cpuinfo code to fs/proc/cpuinfo.c
  proc: move /proc/devices code to fs/proc/devices.c
  proc: move rest of /proc/locks to fs/locks.c
  ...
2008-10-23 | Merge branch 'v28-range-hrtimers-for-linus-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds

* 'v28-range-hrtimers-for-linus-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (37 commits)
  hrtimers: add missing docbook comments to struct hrtimer
  hrtimers: simplify hrtimer_peek_ahead_timers()
  hrtimers: fix docbook comments
  DECLARE_PER_CPU needs linux/percpu.h
  hrtimers: fix typo
  rangetimers: fix the bug reported by Ingo for real
  rangetimer: fix BUG_ON reported by Ingo
  rangetimer: fix x86 build failure for the !HRTIMERS case
  select: fix alpha OSF wrapper
  select: fix alpha OSF wrapper
  hrtimer: peek at the timer queue just before going idle
  hrtimer: make the futex() system call use the per process slack value
  hrtimer: make the nanosleep() syscall use the per process slack
  hrtimer: fix signed/unsigned bug in slack estimator
  hrtimer: show the timer ranges in /proc/timer_list
  hrtimer: incorporate feedback from Peter Zijlstra
  hrtimer: add a hrtimer_start_range() function
  hrtimer: another build fix
  hrtimer: fix build bug found by Ingo
  hrtimer: make select() and poll() use the hrtimer range feature
  ...
2008-10-23 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev | Linus Torvalds

* git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev: (66 commits)
  [PATCH] kill the rest of struct file propagation in block ioctls
  [PATCH] get rid of struct file use in blkdev_ioctl() BLKBSZSET
  [PATCH] get rid of blkdev_locked_ioctl()
  [PATCH] get rid of blkdev_driver_ioctl()
  [PATCH] sanitize blkdev_get() and friends
  [PATCH] remember mode of reiserfs journal
  [PATCH] propagate mode through swsusp_close()
  [PATCH] propagate mode through open_bdev_excl/close_bdev_excl
  [PATCH] pass fmode_t to blkdev_put()
  [PATCH] kill the unused bsize on the send side of /dev/loop
  [PATCH] trim file propagation in block/compat_ioctl.c
  [PATCH] end of methods switch: remove the old ones
  [PATCH] switch sr
  [PATCH] switch sd
  [PATCH] switch ide-scsi
  [PATCH] switch tape_block
  [PATCH] switch dcssblk
  [PATCH] switch dasd
  [PATCH] switch mtd_blkdevs
  [PATCH] switch mmc
  ...
2008-10-23 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 | Linus Torvalds

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (46 commits)
  [PATCH] fs: add a sanity check in d_free
  [PATCH] i_version: remount support
  [patch] vfs: make security_inode_setattr() calling consistent
  [patch 1/3] FS_MBCACHE: don't needlessly make it built-in
  [PATCH] move executable checking into ->permission()
  [PATCH] fs/dcache.c: update comment of d_validate()
  [RFC PATCH] touch_mnt_namespace when the mount flags change
  [PATCH] reiserfs: add missing llseek method
  [PATCH] fix ->llseek for more directories
  [PATCH vfs-2.6 6/6] vfs: add LOOKUP_RENAME_TARGET intent
  [PATCH vfs-2.6 5/6] vfs: remove LOOKUP_PARENT from non LOOKUP_PARENT lookup
  [PATCH vfs-2.6 4/6] vfs: remove unnecessary fsnotify_d_instantiate()
  [PATCH vfs-2.6 3/6] vfs: add __d_instantiate() helper
  [PATCH vfs-2.6 2/6] vfs: add d_ancestor()
  [PATCH vfs-2.6 1/6] vfs: replace parent == dentry->d_parent by IS_ROOT()
  [PATCH] get rid of on-stack dentry in udf
  [PATCH 2/2] anondev: switch to IDA
  [PATCH 1/2] anondev: init IDR statically
  [JFFS2] Use d_splice_alias() not d_add() in jffs2_lookup()
  [PATCH] Optimise NFS readdir hack slightly.
  ...
2008-10-23 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus | Linus Torvalds

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  stop_machine: fix error code handling on multiple cpus
  stop_machine: use workqueues instead of kernel threads
  workqueue: introduce create_rt_workqueue
  Call init_workqueues before pre smp initcalls.
  Make panic= and panic_on_oops into core_params
  Make initcall_debug a core_param
  core_param() for genuinely core kernel parameters
  param: Fix duplicate module prefixes
  module: check kernel param length at compile time, not runtime
  Remove stop_machine during module load v2
  module: simplify load_module.