Kernel - My Linux kernel repository

Age	Commit message (Collapse)	Author
2008-11-30	xen_play_dead() is __cpuinit	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	xen_setup_vcpu_info_placement() is not init on x86	Al Viro
	... so get xen-ops.h in agreement with xen/smp.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	kvm_setup_secondary_clock() is cpuinit	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	enable_IR_x2apic() needs to be __init	Al Viro
	calls __init, called only from __init Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	pci_setup() is init, not devinit	Al Viro
	for fsck sake, it's used only when parsing kernel command line... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	alpha: pcibios_resource_to_bus() is callable from normal code	Al Viro
	pci_enable_rom(), specifically. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	tricky one: hisax sections	Al Viro
	a) hisax_init_pcmcia() needs to be defined only if we have CONFIG_HOTPLUG (no PCMCIA support otherwise) and can be declared __devinit. b) HiSax_inithardware() can go __init c) hisax_register() is passing to checkcard() full-blown hisax_cs_setup_card(): checkcard(i, id, NULL, hisax_d_if->owner, hisax_cs_setup_card); The problem with it is that * hisax_cs_setup_card() is __devinit * hisax_register() is not * hisax_cs_setup_card() is a switch from hell, calling a lot of setup_some_weirdcard() depending on card->typ. _These_ are also __devinit. However, in hisax_register() we have card->typ equal to ISDN_CTYPE_DYNAMIC, which reduces hisax_cs_setup_card() to "nevermind all that crap, just do nothing and return 2". So we add a trimmed-down callback doing just that and passed to checkcard() by hisax_register(). _This_ is non-init (we can stand the impact on .text size). Voila - no section warnings from drivers/isdn Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	cpuinit fixes in kernel/*	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	uninorth-agp section mess	Al Viro
	'aperture' is declared devinitdata (the whole word of it) and is used from ->fetch_size() which can, AFAICS, be used on !HOTPLUG after init time. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	rapidio section noise	Al Viro
	functions calling devinit and called only from devinit Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	section errors in smc911x/smc91x	Al Viro
	a) ->probe() can be __devinit; no need to put it into .text b) calling __init stuff from it, OTOH, is wrong c) ->remove() is __devexit fodder Acked-by: rmk+kernel@arm.linux.org.uk Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	fix the section noise in sparc head.S	Al Viro
	usual .text.head trick Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	m32r: section noise in head.S	Al Viro
	usual "introduce .text.head, put it in front of TEXT_TEXT in vmlinux.lds.S, make the stuff up to jump to start_kernel live in it", same as on other targets. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	section misannotation in ibmtr_cs	Al Viro
	ibmtr_resume() is calling ibmtr_probe(), which is devinit. Whether that's the right thing to do there is a separate question, but since it's PCMCIA and thus will never compile without HOTPLUG... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	ixgbe section fixes	Al Viro
	ixgbe_init_interrupt_scheme() is called from ixgbe_resume(). Build that with CONFIG_PM and without CONFIG_HOTPLUG and you've got a problem. Several helpers called by it also are misannotated __devinit. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	rackmeter section fixes	Al Viro
	* rackmeter_remove() reference needs devexit_p * rackmeter_setup() is calls devinit and is called only from devinit Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	gdth section fixes	Al Viro
	PCI side of driver should be devinit, not init Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	of_platform_driver noise on sparce	Al Viro
	switch to __init for those; unlike powerpc sparc has no hotplug support for that stuff and their ->probe() tends to call __init functions while being declared __devinit. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	advansys fix on ISA-less configs	Al Viro
	The code if (shost->dma_channel != NO_ISA_DMA) free_dma(shost->dma_channel); in there is triggerable only if we have CONFIG_ISA (we only set ->dma_channel to something other than NO_ISA_DMA under #ifdef CONFIG_ISA). OTOH, free_dma() is not guaranteed to be there in absense of CONFIG_ISA. IOW, driver runs into undefined symbols on PCI-but-not-ISA configs (e.g. on frv) and it's a false positive. Fix: put the entire if () under #ifdef CONFIG_ISA; behaviour doesn't change and dependency on free_dma() disappears for !CONFIG_ISA. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	W1_MASTER_DS1WM should depend on HAVE_CLK	Al Viro
	Uses clk_...() a lot Acked-by: rmk+kernel@arm.linux.org.uk Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	icside section warnings	Al Viro
	icside_register_v[56] is called from (__devinit) icside_probe Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	fix talitos	Al Viro
	talitos_remove() can be called from talitos_probe() on failure exit path, so it can't be __devexit. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	istallion section warnings	Al Viro
	stli_findeisabrds() and stli_initbrds() are using __init and called only from __init. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	sparc64 trivial section misannotations	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	sparc32 cpuinit flase positives	Al Viro
	All noise since we don't have CPU hotplug there. However, they did expose something very odd-looking in there - poke_viking() does a bunch of identical btfixup each time it's called (i.e. for each CPU). That one is left alone for now; just the trivial misannotation fixes. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	powerpc set_huge_psize() false positive	Al Viro
	called only from __init, calls __init. Incidentally, it ought to be static in file. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	false __cpuinit positives on alpha	Al Viro
	pure noise - alpha doesn't have CPU hotplug Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30	meminit section warnings	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-29	sched: prevent divide by zero error in cpu_avg_load_per_task, update	Ingo Molnar
	Regarding the bug addressed in: 4cd4262: sched: prevent divide by zero error in cpu_avg_load_per_task Linus points out that the fix is not complete: > There's nothing that keeps gcc from deciding not to reload > rq->nr_running. > > Of course, in _practice_, I don't think gcc ever will (if it decides > that it will spill, gcc is likely going to decide that it will > literally spill the local variable to the stack rather than decide to > reload off the pointer), but it's a valid compiler optimization, and > it even has a name (rematerialization). > > So I suspect that your patch does fix the bug, but it still leaves the > fairly unlikely _potential_ for it to re-appear at some point. > > We have ACCESS_ONCE() as a macro to guarantee that the compiler > doesn't rematerialize a pointer access. That also would clarify > the fact that we access something unsafe outside a lock. So make sure our nr_running value is immutable and cannot change after we check it for nonzero. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-29	sched, cpusets: fix warning in kernel/cpuset.c	Ingo Molnar
	this warning: kernel/cpuset.c: In function ‘generate_sched_domains’: kernel/cpuset.c:588: warning: ‘ndoms’ may be used uninitialized in this function triggers because GCC does not recognize that ndoms stays uninitialized only if doms is NULL - but that flow is covered at the end of generate_sched_domains(). Help out GCC by initializing this variable to 0. (that's prudent anyway) Also, this function needs a splitup and code flow simplification: with 160 lines length it's clearly too long. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-29	ieee1394: sbp2: fix race condition in state change	Stefan Richter
	An intermediate transition from _RUNNING to _IN_SHUTDOWN could have been missed by the former code. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-11-29	ieee1394: fix list corruption (reported at module removal)	Stefan Richter
	If there is more than one FireWire controller present, dummy_zero_addr and dummy_max_addr were added multiple times to different lists, thus corrupting the lists. Fix this by allocating them dynamically per host instead of just once globally. (Perhaps a better address space allocation algorithm could rid us of the two dummy address spaces.) Fixes http://bugzilla.kernel.org/show_bug.cgi?id=10129 . Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-11-28	mlx4_core: Save/restore default port IB capability mask	Jack Morgenstein
	Commit 7ff93f8b ("mlx4_core: Multiple port type support") introduced support for different port types. As part of that support, SET_PORT is invoked to set the port type during driver startup. However, as a side-effect, for IB ports the invocation of this command also sets the port's capability mask to zero (losing the default value set by FW). To fix this, get the default ib port capabilities (via a MAD_IFC Port Info query) during driver startup, and save them for use in the mlx4_SET_PORT command when setting the port-type to Infiniband. This patch fixes problems with subnet manager (SM) failover such as <https://bugs.openfabrics.org/show_bug.cgi?id=1183>, which occurred because the IsTrapSupported bit in the capability mask was zeroed. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-11-28	toshiba_acpi: close race in toshiba_acpi driver	Arjan van de Ven
	the toshiba ACPI driver will, in a failure case, free the rfkill state before stopping the polling timer that would use this state. More interesting, in the same failure case handling, it calls the exit function, which also frees the rfkill state, but after stopping the polling. If the race happens, a NULL pointer is passed to rfkill_force_state() which then causes a nice dereference. Fix the race by just not doing the too-early freeing of the rfkill state. This appears to be the cause of a hot issue on kerneloops.org; while I have no solid evidence of that this patch will fix the issue, the race appears rather real. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2008-11-28	i2c-parport: Fix misplaced parport_release call	Jean Delvare
	We shouldn't release the parallel port until we are actually done with it. Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-11-28	i2c: Remove i2c clients in reverse order	Jean Delvare
	i2c clients should be removed in reverse order compared to the probe (actually: bind) order. This matters when several clients depend on each other. Signed-off-by: Jean Delvare <khali@linux-fr.org> Tested-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
2008-11-28	i2c/isp1301_omap: Build fixes	David Brownell
	Build fixes for isp1301_omap; no behavior changes: - fix incorrect probe() signature (it changed many months ago) - provide missing functions on H3 and H4 boards - "sparse" fixes (static, NULL-vs-0) The H3 build bits subset some of the stuff that was previously in the OMAP tree but never went to mainline. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-11-28	HID: Apple ALU wireless keyboards are bluetooth devices	Jan Scholz
	While parsing 'hid_blacklist' in the apple alu wireless keyboard is not found. This happens because in the blacklist it is declared with HID_USB_DEVICE although the keyboards are really bluetooth devices. The same holds for 'apple_devices' list. This patch fixes it by changing HID_USB_DEVICE to HID_BLUETOOTH_DEVICE in those two lists. Signed-off-by: Jan Scholz <Scholz@fias.uni-frankfurt.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-11-27	Merge branch 'for-rmk' of ↵	Russell King
	git://git.kernel.org/pub/scm/linux/kernel/git/ycmiao/pxa-linux-2.6
2008-11-27	Allow architectures to override copy_user_highpage()	Russell King
	With aliasing VIPT cache support, the ARM implementation of clear_user_page() and copy_user_page() sets up a temporary kernel space mapping such that we have the same cache colour as the userspace page. This avoids having to consider any userspace aliases from this operation. However, when highmem is enabled, kmap_atomic() have to setup mappings. The copy_user_highpage() and clear_user_highpage() call these functions before delegating the copies to copy_user_page() and clear_user_page(). The effect of this is that each of the _user_highpage() functions setup their own kmap mapping, followed by the _user_page() functions setting up another mapping. This is rather wasteful. Thankfully, copy_user_highpage() can be overriden by architectures by defining __HAVE_ARCH_COPY_USER_HIGHPAGE. However, replacement of clear_user_highpage() is more difficult because its inline definition is not conditional. It seems that you're expected to define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE and provide a replacement __alloc_zeroed_user_highpage() implementation instead. The allocation itself is fine, so we don't want to override that. What we really want to do is to override clear_user_highpage() with our own version which doesn't kmap_atomic() unnecessarily. Other VIPT architectures (PARISC and SH) would also like to override this function as well. Acked-by: Hugh Dickins <hugh@veritas.com> Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-11-27	udf: Fix BUG_ON() in destroy_inode()	Jan Kara
	udf_clear_inode() can leave behind buffers on mapping's i_private list (when we truncated preallocation). Call invalidate_inode_buffers() so that the list is properly cleaned-up before we return from udf_clear_inode(). This is ugly and suggest that we should cleanup preallocation earlier than in clear_inode() but currently there's no such call available since drop_inode() is called under inode lock and thus is unusable for disk operations. Signed-off-by: Jan Kara <jack@suse.cz>
2008-11-27	[ARM] pxa/palmtx: misc fixes to use generic GPIO API	Marek Vasut
	Signed-off-by: Marek Vasut <marek.vasut@gmail.com> Signed-off-by: Eric Miao <eric.miao@marvell.com>
2008-11-27	x86: always define DECLARE_PCI_UNMAP* macros	Joerg Roedel
	Impact: fix boot crash on AMD IOMMU if CONFIG_GART_IOMMU is off Currently these macros evaluate to a no-op except the kernel is compiled with GART or Calgary support. But we also need these macros when we have SWIOTLB, VT-d or AMD IOMMU in the kernel. Since we always compile at least with SWIOTLB we can define these macros always. This patch is also for stable backport for the same reason the SWIOTLB default selection patch is. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-27	Merge branch 'omap-fixes' of ↵	Russell King
	git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6
2008-11-27	[S390] Update default configuration.	Martin Schwidefsky
	Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-27	[S390] Fix alignment of initial kernel stack.	Heiko Carstens
	We need an alignment of 16384 bytes for the initial kernel stack if the kernel is configured for 16384 bytes stacks but the linker script currently guarantees only an alignment of 8192 bytes. So fix this and simply use THREAD_SIZE as alignment value which will always do the right thing. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-27	[S390] pgtable.h: Fix oops in unmap_vmas for KVM processes	Christian Borntraeger
	When running several kvm processes with lots of memory overcommitment, we have seen an oops during process shutdown: ------------[ cut here ]------------ Kernel BUG at 0000000000193434 [verbose debug info unavailable] addressing exception: 0005 [#1] PREEMPT SMP Modules linked in: kvm sunrpc qeth_l2 dm_mod qeth ccwgroup CPU: 10 Not tainted 2.6.28-rc4-kvm-bigiron-00521-g0ccca08-dirty #8 Process kuli (pid: 14460, task: 0000000149822338, ksp: 0000000024f57650) Krnl PSW : 0704e00180000000 0000000000193434 (unmap_vmas+0x884/0xf10) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 EA:3 Krnl GPRS: 0000000000000002 0000000000000000 000000051008d000 000003e05e6034e0 00000000001933f6 00000000000001e9 0000000407259e0a 00000002be88c400 00000200001c1000 0000000407259608 0000000407259e08 0000000024f577f0 0000000407259e09 0000000000445fa8 00000000001933f6 0000000024f577f0 Krnl Code: 0000000000193426: eb22000c000d sllg %r2,%r2,12 000000000019342c: a7180000 lhi %r1,0 0000000000193430: b2290012 iske %r1,%r2 >0000000000193434: a7110002 tmll %r1,2 0000000000193438: a7840006 brc 8,193444 000000000019343c: 9602c000 oi 0(%r12),2 0000000000193440: 96806000 oi 0(%r6),128 0000000000193444: a7110004 tmll %r1,4 Call Trace: ([<00000000001933f6>] unmap_vmas+0x846/0xf10) [<0000000000199680>] exit_mmap+0x210/0x458 [<000000000012a8f8>] mmput+0x54/0xfc [<000000000012f714>] exit_mm+0x134/0x144 [<000000000013120c>] do_exit+0x240/0x878 [<00000000001318dc>] do_group_exit+0x98/0xc8 [<000000000013e6b0>] get_signal_to_deliver+0x30c/0x358 [<000000000010bee0>] do_signal+0xec/0x860 [<0000000000112e30>] sysc_sigpending+0xe/0x22 [<000002000013198a>] 0x2000013198a INFO: lockdep is turned off. Last Breaking-Event-Address: [<00000000001a68d0>] free_swap_and_cache+0x1a0/0x1a4 <4>---[ end trace bc19f1d51ac9db7c ]--- The faulting instruction is the storage key operation (iske) in ptep_rcp_copy (called by pte_clear, called by unmap_vmas). iske reads dirty and reference bit information for a physical page and requires a valid physical address. Since we are in pte_clear, we cannot rely on the pte containing a valid address. Fortunately we dont need these information in pte_clear - after all there is no mapping. The best fix is to remove the needless call to ptep_rcp_copy that contains the iske. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-27	[S390] fix/cleanup sched_clock	Christian Borntraeger
	CONFIG_PRINTK_TIME reveals that sched_clock has a wrong offset during boot: .. [ 0.000000] Movable zone: 0 pages used for memmap [ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 775679 [ 0.000000] Kernel command line: dasd=4b6c root=/dev/dasda1 ro noinitrd [ 0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes) [6920575.975232] console [ttyS0] enabled [6920575.987586] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes) [6920575.991404] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes) .. The s390 implementation of sched_clock uses the store clock instruction and subtracts jiffies_timer_cc. jiffies_timer_cc is a local variable in arch/s390/kernel/time.c and only used for sched_clock and monotonic clock. For historical reasons there is an offset on that value. With todays code this offset is unnecessary. By removing that offset we can get a sched_clock which returns the nanoseconds after time_init. This improves CONFIG_PRINTK_TIME. Since sched_clock is the only user, I have also renamed jiffies_timer_cc to sched_clock_base_cc. In addition, the local variable init_timer_cc is redundant and can be romved as well. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-27	[S390] fix system call parameter functions.	Martin Schwidefsky
	syscall_get_nr() currently returns a valid result only if the call chain of the traced process includes do_syscall_trace_enter(). But collect_syscall() can be called for any sleeping task, the result of syscall_get_nr() in general is completely bogus. To make syscall_get_nr() work for any sleeping task the traps field in pt_regs is replace with svcnr - the system call number the process is executing. If svcnr == 0 the process is not on a system call path. The syscall_get_arguments and syscall_set_arguments use regs->gprs[2] for the first system call parameter. This is incorrect since gprs[2] may have been overwritten with the system call number if the call chain includes do_syscall_trace_enter. Use regs->orig_gprs2 instead. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-27	sched: prevent divide by zero error in cpu_avg_load_per_task	Steven Rostedt
	Impact: fix divide by zero crash in scheduler rebalance irq While testing the branch profiler, I hit this crash: divide error: 0000 [#1] PREEMPT SMP [...] RIP: 0010:[<ffffffff8024a008>] [<ffffffff8024a008>] cpu_avg_load_per_task+0x50/0x7f [...] Call Trace: <IRQ> <0> [<ffffffff8024fd43>] find_busiest_group+0x3e5/0xcaa [<ffffffff8025da75>] rebalance_domains+0x2da/0xa21 [<ffffffff80478769>] ? find_next_bit+0x1b2/0x1e6 [<ffffffff8025e2ce>] run_rebalance_domains+0x112/0x19f [<ffffffff8026d7c2>] __do_softirq+0xa8/0x232 [<ffffffff8020ea7c>] call_softirq+0x1c/0x3e [<ffffffff8021047a>] do_softirq+0x94/0x1cd [<ffffffff8026d5eb>] irq_exit+0x6b/0x10e [<ffffffff8022e6ec>] smp_apic_timer_interrupt+0xd3/0xff [<ffffffff8020e4b3>] apic_timer_interrupt+0x13/0x20 The code for cpu_avg_load_per_task has: if (rq->nr_running) rq->avg_load_per_task = rq->load.weight / rq->nr_running; The runqueue lock is not held here, and there is nothing that prevents the rq->nr_running from going to zero after it passes the if condition. The branch profiler simply made the race window bigger. This patch saves off the rq->nr_running to a local variable and uses that for both the condition and the division. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>