aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2008-11-06ftrace: soft tracing stop and startSteven Rostedt
Impact: add way to quickly start stop tracing from the kernel This patch adds a soft stop and start to the trace. This simply disables function tracing via the ftrace_disabled flag, and disables the trace buffers to prevent recording. The tracing code may still be executed, but the trace will not be recorded. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06ftrace: add quick function trace stopSteven Rostedt
Impact: quick start and stop of function tracer This patch adds a way to disable the function tracer quickly without the need to run kstop_machine. It adds a new variable called function_trace_stop which will stop the calls to functions from mcount when set. This is just an on/off switch and does not handle recursion like preempt_disable(). It's main purpose is to help other tracers/debuggers start and stop tracing fuctions without the need to call kstop_machine. The config option HAVE_FUNCTION_TRACE_MCOUNT_TEST is added for archs that implement the testing of the function_trace_stop in the mcount arch dependent code. Otherwise, the test is done in the C code. x86 is the only arch at the moment that supports this. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06Merge branch 'tracing/fastboot' into tracing/ftraceIngo Molnar
2008-11-04tracing/ftrace: fix a bug when switch current tracer to sched tracerFrederic Weisbecker
Impact: fix boot tracer + sched tracer coupling bug Fix a bug that made the sched_switch tracer unable to run if set as the current_tracer after the boot tracer. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04tracing/ftrace: types and naming corrections for sched tracerFrederic Weisbecker
Impact: cleanup This patch applies some corrections suggested by Steven Rostedt. Change the type of shed_ref into int since it is used into a Mutex, we don't need it anymore as an atomic variable in the sched_switch tracer. Also change the name of the register mutex. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04tracing/fastboot: use sched switch tracer from boot tracerFrederic Weisbecker
Impact: enhance boot trace output with scheduling events Use the sched_switch tracer from the boot tracer. We also can trace schedule events inside the initcalls. Sched tracing is disabled after the initcall has finished and then reenabled before the next one is started. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04tracing/ftrace: remove unused code in sched_switch tracerFrederic Weisbecker
Impact: cleanup When init_sched_switch_trace() is called, it has no reason to start the sched tracer if the sched_ref is not zero. _ If this is non-zero, the tracer is already used, but we can register it to the tracing engine. There is already a security which avoid the tracer probes not to be resgistered twice. _ If this is zero, this block will not be used. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04tracing/ftrace: fix a race condition in sched_switch tracerFrederic Weisbecker
Impact: fix race condition in sched_switch tracer This patch fixes a race condition in the sched_switch tracer. If several tasks (IE: concurrent initcalls) are playing with tracing_start_cmdline_record() and tracing_stop_cmdline_record(), the following situation could happen: _ Task A and B are using the same tracepoint probe. Task A holds it. Task B is sleeping and doesn't hold it. _ Task A frees the sched tracer, then sched_ref is decremented to 0. _ Task A is preempted and hadn't yet unregistered its tracepoint probe, then B runs. _ B increments sched_ref, sees it's 1 and then guess it has to register its probe. But it has not been freed by task A. _ A lot of bad things can happen after that... Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04tracing/fastboot: Enable boot tracing only during initcallsFrederic Weisbecker
Impact: modify boot tracer We used to disable the initcall tracing at a specified time (IE: end of builtin initcalls). But we don't need it anymore. It will be stopped when initcalls are finished. However we want two things: _Start this tracing only after pre-smp initcalls are finished. _Since we are planning to trace sched_switches at the same time, we want to enable them only during the initcall execution. For this purpose, this patch introduce two functions to enable/disable the sched_switch tracing during boot. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04ftrace: sysctl typoPeter Zijlstra
Impact: fix sysctl name typo Steve must have needed more coffee ;-) Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04ftrace: sysrq-z to dump the buffersPeter Zijlstra
Impact: add SysRq-z support to dump trace buffers Allows one to force an ftrace dump from sysrq Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04ftrace: function tracer with irqs disabledSteven Rostedt
Impact: disable interrupts during trace entry creation (as opposed to preempt) To help with performance, I set the ftracer to not disable interrupts, and only to disable preemption. If an interrupt occurred, it would not be traced, because the function tracer protects itself from recursion. This may be faster, but the trace output might miss some traces. This patch makes the fuction trace disable interrupts, but it also adds a runtime feature to disable preemption instead. It does this by having two different tracer functions. When the function tracer is enabled, it will check to see which version is requested (irqs disabled or preemption disabled). Then it will use the corresponding function as the tracer. Irq disabling is the default behaviour, but if the user wants better performance, with the chance of missing traces, then they can choose the preempt disabled version. Running hackbench 3 times with the irqs disabled and 3 times with the preempt disabled function tracer yielded: tracing type times entries recorded ------------ -------- ---------------- irq disabled 43.393 166433066 43.282 166172618 43.298 166256704 preempt disabled 38.969 159871710 38.943 159972935 39.325 161056510 Average: irqs disabled: 43.324 166287462 preempt disabled: 39.079 160300385 preempt is 10.8 percent faster than irqs disabled. I wrote a patch to count function trace recursion and reran hackbench. With irq disabled: 1,150 times the function tracer did not trace due to recursion. with preempt disabled: 5,117,718 times. The thousand times with irq disabled could be due to NMIs, or simply a case where it called a function that was not protected by notrace. But we also see that a large amount of the trace is lost with the preempt version. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04ftrace: insert in the ftrace_preempt_disable()/enable() functionsSteven Rostedt
Impact: use new, consolidated APIs in ftrace plugins This patch replaces the schedule safe preempt disable code with the ftrace_preempt_disable() and ftrace_preempt_enable() safe functions. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04ftrace: introduce ftrace_preempt_disable()/enable()Steven Rostedt
Impact: add new ftrace-plugin internal APIs Parts of the tracer needs to be careful about schedule recursion. If the NEED_RESCHED flag is set, a preempt_enable will call schedule. Inside the schedule function, the NEED_RESCHED flag is cleared. The problem arises when a trace happens in the schedule function but before NEED_RESCHED is cleared. The race is as follows: schedule() >> tracer called trace_function() preempt_disable() [ record trace ] preempt_enable() <<- here's the issue. [check NEED_RESCHED] schedule() [ Repeat the above, over and over again ] The naive approach is simply to use preempt_enable_no_schedule instead. The problem with that approach is that, although we solve the schedule recursion issue, we now might lose a preemption check when not in the schedule function. trace_function() preempt_disable() [ record trace ] [Interrupt comes in and sets NEED_RESCHED] preempt_enable_no_resched() [continue without scheduling] The way ftrace handles this problem is with the following approach: int resched; resched = need_resched(); preempt_disable_notrace(); [record trace] if (resched) preempt_enable_no_sched_notrace(); else preempt_enable_notrace(); This may seem like the opposite of what we want. If resched is set then we call the "no_sched" version?? The reason we do this is because if NEED_RESCHED is set before we disable preemption, there's two reasons for that: 1) we are in an atomic code path 2) we are already on our way to the schedule function, and maybe even in the schedule function, but have yet to clear the flag. Both the above cases we do not want to schedule. This solution has already been implemented within the ftrace infrastructure. But the problem is that it has been implemented several times. This patch encapsulates this code to two nice functions. resched = ftrace_preempt_disable(); [ record trace] ftrace_preempt_enable(resched); This way the tracers do not need to worry about getting it right. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-03Merge branches 'tracing/ftrace', 'tracing/markers', 'tracing/mmiotrace', ↵Ingo Molnar
'tracing/nmisafe', 'tracing/tracepoints' and 'tracing/urgent' into tracing/core
2008-11-03tracepoint: introduce *_noupdate APIs.Lai Jiangshan
Impact: add new tracepoint APIs to allow the batched registration of probes new APIs separate tracepoint_probe_register(), tracepoint_probe_unregister() into 2 steps. The first step of them is just update tracepoint_entry, not connect or disconnect. this patch introduces tracepoint_probe_update_all() for update all. these APIs are very useful for registering lots of probes but just updating once. Another very important thing is that *_noupdate APIs do not require module_mutex. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-03tracepoint: simplification for tracepoints using RCULai Jiangshan
Impact: simplify implementation Now, unused memory is handled by struct tp_probes. old code use these three field to handle unused memory. struct tracepoint_entry { ... struct rcu_head rcu; void *oldptr; unsigned char rcu_pending:1; ... }; in this way, unused memory is handled by struct tracepoint_entry. it bring reenter bug(it was fixed) and tracepoint.c is filled full of ".*rcu.*" code statements. this patch removes all these. and: rcu_barrier_sched() is removed. Do not need regain tracepoints_mutex after tracepoint_update_probes() several little cleanup. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-03tracing, alpha: undefined reference to `save_stack_trace'Al Viro
Impact: build fix on !stacktrace architectures only select STACKTRACE on architectures that have STACKTRACE_SUPPORT ... since we also need to ifdef out the guts of ftrace_trace_stack(). We also want to disallow setting TRACE_ITER_STACKTRACE in trace_flags on such configs, but that can wait. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-03ftrace: ftrace_dump_on_oops=[tracer]Peter Zijlstra
Impact: add new (optional) debug boot option In order to facilitate early boot trouble, allow one to specify a tracer on the kernel boot line. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-03Merge commit 'v2.6.28-rc3' into tracing/ftraceIngo Molnar
2008-11-02Linux v2.6.28-rc3Linus Torvalds
2008-11-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: ide-gd: re-get capacity on revalidate tx4938ide: Avoid underflow on calculation of a wait cycle tx4938ide: Do not call devm_ioremap for whole 128KB tx4938ide: Check minimum cycle time and SHWT range (v2) ide: Switch to a common address ide-cd: fix DMA alignment regression
2008-11-02ide-gd: re-get capacity on revalidateBorislav Petkov
We need to re-get a removable media's capacity when revalidating the disk so that its partitions get rescanned by the block layer. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: axboe@kernel.dk Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-11-02tx4938ide: Avoid underflow on calculation of a wait cycleAtsushi Nemoto
Make 'wt' variable signed while it can be negative during calculation. Suggested-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Cc: sshtylyov@ru.mvista.com Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-11-02tx4938ide: Do not call devm_ioremap for whole 128KBAtsushi Nemoto
Call devm_ioremap() for CS0 and CS1 separetely. And some style cleanups. Suggested-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Cc: ralf@linux-mips.org Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-11-02tx4938ide: Check minimum cycle time and SHWT range (v2)Atsushi Nemoto
SHWT value is used as address valid to -CSx assertion and -CSx to -DIOx assertion setup time, and contrarywise, -DIOx to -CSx release and -CSx release to address invalid hold time, so it actualy applies 4 times and so constitutes -DIOx recovery time. Check requirement of the recovery time and cycle time. Also check SHWT maximum value. Suggested-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Cc: ralf@linux-mips.org Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-11-02ide: Switch to a common addressAlan Cox
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-11-02ide-cd: fix DMA alignment regressionBorislav Petkov
e5318b531b008c79d2a0c0df06a7b8628da38e2f ("ide: use the dma safe check for REQ_TYPE_ATA_PC") introduced a regression which caused some ATAPI drives to turn off DMA for REQ_TYPE_BLOCK_PC commands while burning and thus degrading performance and ultimately causing an excessive amount of underruns. The issue is documented also in: http://bugzilla.kernel.org/show_bug.cgi?id=11742. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Tested-by: Valerio Passini <valerio.passini@unicam.it> [bart: fixup patch description per comments from Sergei Shtylyov] Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-11-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc64: Fix PCI resource mapping on sparc64 sparc64: Kill annoying warning when building compat_binfmt_elf.o sparc32: kernel/trace/trace.c wants DIE_OOPS sparc64: Fix __copy_{to,from}_user_inatomic defines.
2008-11-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits) af_unix: netns: fix problem of return value IRDA: remove double inclusion of module.h udp: multicast packets need to check namespace net: add documentation for skb recycling key: fix setkey(8) policy set breakage bpa10x: free sk_buff with kfree_skb xfrm: do not leak ESRCH to user space net: Really remove all of LOOPBACK_TSO code. netfilter: nf_conntrack_proto_gre: switch to register_pernet_gen_subsys() netns: add register_pernet_gen_subsys/unregister_pernet_gen_subsys net: delete excess kernel-doc notation pppoe: Fix socket leak. gianfar: Don't reset TBI<->SerDes link if it's already up gianfar: Fix race in TBI/SerDes configuration at91_ether: request/free GPIO for PHY interrupt amd8111e: fix dma_free_coherent context atl1: fix vlan tag regression SMC91x: delete unused local variable "lp" myri10ge: fix stop/go mmio ordering bonding: fix panic when taking bond interface down before removing module ...
2008-11-02linux/string.h: fix comment typoJeff Garzik
s/user/used/ Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-02sparc64: Fix PCI resource mapping on sparc64Max Dmitrichenko
There is a problem discovered in recent versions of ATI Mach64 driver in X.org on sparc64 architecture. In short, the driver fails to mmap MMIO aperture (PCI resource #2). I've found that kernel's __pci_mmap_make_offset() returns EINVAL. It checks whether user attempts to mmap more than the resource length, which is 0x1000 bytes in our case. But PAGE_SIZE on SPARC64 is 0x2000 and this is what actually is being mmaped. So __pci_mmap_make_offset() failed for this PCI resource. Signed-off-by: Max Dmitrichenko <dmitrmax@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-02sparc64: Kill annoying warning when building compat_binfmt_elf.oDavid S. Miller
GCC warns because some tests against 32-bit values never evaluate to true due to how TASK_SIZE is defined. I always wanted to mimick powerpc's definition of TASK_SIZE, which is simply TASK_SIZE_OF(current) and that also fixes the warning. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-01sparc32: kernel/trace/trace.c wants DIE_OOPSAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-01sparc64: Fix __copy_{to,from}_user_inatomic defines.Hugh Dickins
Alexander Beregalov reports oops in __bzero() called from copy_from_user_fixup() called from iov_iter_copy_from_user_atomic(), when running dbench on tmpfs on sparc64: its __copy_from_user_inatomic and __copy_to_user_inatomic should be avoiding, not calling, the fixups. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-01af_unix: netns: fix problem of return valueJianjun Kong
fix problem of return value net/unix/af_unix.c: unix_net_init() when error appears, it should return 'error', not always return 0. Signed-off-by: Jianjun Kong <jianjun@zeuux.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-01IRDA: remove double inclusion of module.hAlexander Beregalov
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-01udp: multicast packets need to check namespaceEric Dumazet
Current UDP multicast delivery is not namespace aware. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-01net: add documentation for skb recyclingStephen Hemminger
Commit 04a4bb55bcf35b63d40fd2725e58599ff8310dd7 ("net: add skb_recycle_check() to enable netdriver skb recycling") added a method for network drivers to recycle skbuffs, but while use of this mechanism was documented in the commit message, it should really have been added as a docbook comment as well -- this patch does that. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-01sparc32: kernel/trace/trace.c wants DIE_OOPSAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-01el3_common_init() should be __devinit, not __initAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-01missing dependencies on HAVE_CLK in drivers/mfdAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-01section fixes for cirrusfbAl Viro
cirrusfb_zorro_unmap() may be called both from __devexit and (on cleanup path) from __devinit. So it needs to be a normal function, same as for cirrusfb_pci_unmap() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-01PM_TEST_SUSPEND should depend on RTC_CLASS, not RTC_LIBAl Viro
Insufficient dependency - we really want CONFIG_RTC_CLASS=y there. That will give us CONFIG_RTC_LIB=y, so the old dependency can be simply replaced. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-01oss: fix O_NONBLOCK in dmasound_coreAl Viro
We broke O_NONBLOCK handling in OSS dmasound_core in 2.3.11-pre3 - the original code copied f_flags to open_mode and then checked for O_NONBLOCK in there, but that got changed to copying f_mode and O_NONBLOCK has not reached that field in any kernel version. Since we do not care for any other bits, the fix is obvious... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-01Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: fix AMDC1E and XTOPOLOGY conflict in cpufeature x86: build fix
2008-11-01init/do_mounts_md.c: remove duplicated #includeHuang Weiyi
Removed duplicated #include <linux/delay.h> in init/do_mounts_md.c. The same compile error ("error: implicit declaration of function 'msleep'") got fixed twice: - f8b77d39397e1510b1a3bcfd385ebd1a45aae77f ("init/do_mounts_md.c: msleep compile fix") - 73b4a24f5ff09389ba6277c53a266b142f655ed2 ("init/do_mounts_md.c must #include <linux/delay.h>") by people adding the <linux/delay.h> include in two slightly different places. Andrew's quilt scripts happily ignore the fuzz, and will re-apply the patch even though they had conflicts. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-01x86: Clean up late e820 resource allocationLinus Torvalds
This makes the late e820 resources use 'insert_resource_expand_to_fit()' instead of doing a 'reserve_region_with_split()', and also avoids marking them as IORESOURCE_BUSY. This results in us being perfectly happy to use pre-existing PCI resources even if they were marked as being in a reserved region, while still avoiding any _new_ allocations in the reserved regions. It also makes for a simpler and more accurate resource tree. Example resource allocation from Jonathan Corbet, who has firmware that has an e820 reserved entry that covered a big range (e0000000-fed003ff), and that had various PCI resources in it set up by firmware. With old kernels, the reserved range would force us to re-allocate all pre-existing PCI resources, and his reserved range would end up looking like this: e0000000-fed003ff : reserved fec00000-fec00fff : IOAPIC 0 fed00000-fed003ff : HPET 0 where only the pre-allocated special regions (IOAPIC and HPET) were kept around. With 2.6.28-rc2, which uses 'reserve_region_with_split()', Jonathan's resource tree looked like this: e0000000-fe7fffff : reserved fe800000-fe8fffff : PCI Bus 0000:01 fe800000-fe8fffff : reserved fe900000-fe9d9aff : reserved fe9d9b00-fe9d9bff : 0000:00:1f.3 fe9d9b00-fe9d9bff : reserved fe9d9c00-fe9d9fff : 0000:00:1a.7 fe9d9c00-fe9d9fff : reserved fe9da000-fe9dafff : 0000:00:03.3 fe9da000-fe9dafff : reserved fe9db000-fe9dbfff : 0000:00:19.0 fe9db000-fe9dbfff : reserved fe9dc000-fe9dffff : 0000:00:1b.0 fe9dc000-fe9dffff : reserved fe9e0000-fe9fffff : 0000:00:19.0 fe9e0000-fe9fffff : reserved fea00000-fea7ffff : 0000:00:02.0 fea00000-fea7ffff : reserved fea80000-feafffff : 0000:00:02.1 fea80000-feafffff : reserved feb00000-febfffff : 0000:00:02.0 feb00000-febfffff : reserved fec00000-fed003ff : reserved fec00000-fec00fff : IOAPIC 0 fed00000-fed003ff : HPET 0 and because the reserved entry had been split and moved into the individual resources, and because it used the IORESOURCE_BUSY flag, the drivers that actually wanted to _use_ those resources couldn't actually attach to them: e1000e 0000:00:19.0: BAR 0: can't reserve mem region [0xfe9e0000-0xfe9fffff] HDA Intel 0000:00:1b.0: BAR 0: can't reserve mem region [0xfe9dc000-0xfe9dffff] with this patch, the resource tree instead becomes e0000000-fed003ff : reserved fe800000-fe8fffff : PCI Bus 0000:01 fe9d9b00-fe9d9bff : 0000:00:1f.3 fe9d9c00-fe9d9fff : 0000:00:1a.7 fe9d9c00-fe9d9fff : ehci_hcd fe9da000-fe9dafff : 0000:00:03.3 fe9db000-fe9dbfff : 0000:00:19.0 fe9db000-fe9dbfff : e1000e fe9dc000-fe9dffff : 0000:00:1b.0 fe9dc000-fe9dffff : ICH HD audio fe9e0000-fe9fffff : 0000:00:19.0 fe9e0000-fe9fffff : e1000e fea00000-fea7ffff : 0000:00:02.0 fea80000-feafffff : 0000:00:02.1 feb00000-febfffff : 0000:00:02.0 fec00000-fec00fff : IOAPIC 0 fed00000-fed003ff : HPET 0 ie the one reserved region now ends up surrounding all the PCI resources that were allocated inside of it by firmware, and because it is not marked BUSY, drivers have no problem attaching to the pre-allocated resources. Reported-and-tested-by: Jonathan Corbet <corbet@lwn.net> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Robert Hancock <hancockr@shaw.ca> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-01reserve_region_with_split: Fix GFP_KERNEL usage under spinlockLinus Torvalds
This one apparently doesn't generate any warnings, because the function is only used during system bootup, when the warnings are disabled. But it's still very wrong. The __reserve_region_with_split() function is called with the resource_lock held for writing, so it must only ever do GFP_ATOMIC allocations. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-01Merge branch 'link_removal' of git://www.jni.nu/crisLinus Torvalds
* 'link_removal' of git://www.jni.nu/cris: [CRIS] Remove links from CRIS build [CRIS] Merge asm-offsets.c for both arches into one file.