aboutsummaryrefslogtreecommitdiff
path: root/kernel/trace
AgeCommit message (Collapse)Author
2008-12-03trace: fix output of stack traceSteven Rostedt
Impact: fix to output of stack trace If a function is not found in the stack of the stack tracer, the number printed is quite strange. This fixes the algorithm to handle missing functions better. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-03tracing/function-graph-tracer: enabled by defaultIngo Molnar
CONFIG_FUNCTION_GRAPH_TRACER depends on FUNCTION_TRACER already, (turning it non-default) so it so making it default-n is pointless. So enable it by default - it's a nice extension of the function tracer. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-03tracing/function-graph-tracer: improve duration outputFrederic Weisbecker
Impact: better trace output of duration for long calls The old duration output didn't exceeded 9999.999 us to fit the column and the nanosecs were always 3 numbers. As Ingo suggested, it's better to have the whole microseconds elapsed time and shift the nanosecs precision if needed to fit the maximum 7 numbers. And usec need more number, the case should be rare and important enough to break a bit the column alignment to show it. So, depending of the duration value, we now have these patterns: u.nnn us uu.nnn us uuu.nnn us uuuu.nnn us uuuuu.nn us uuuuuu.n us uuuuuuuu..... us Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-03tracing/function-graph-tracer: display unified style cmdline and pidFrederic Weisbecker
Impact: extend function-graph output: let one know which thread called a function This patch implements a helper function to print the couple cmdline/pid. Its output is provided during task switching and on each row if the new "funcgraph-proc" defualt-off option is set through trace_options file. The output is center aligned and never exceeds 14 characters. The cmdline is truncated over 7 chars. But note that if the pid exceeds 6 characters, the column will overflow (but the situation is abnormal). Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-03ftrace: function graph return for function entrySteven Rostedt
Impact: feature, let entry function decide to trace or not This patch lets the graph tracer entry function decide if the tracing should be done at the end as well. This requires all function graph entry functions return 1 if it should trace, or 0 if the return should not be traced. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-03ring-buffer: change "page" variable names to "bpage"Steven Rostedt
Impact: clean up Andrew Morton pointed out that the kernel convention of a variable named page should be of type page struct. The ring buffer uses a variable named "page" for a pointer to something else. This patch converts those to be called "bpage" (as in "buffer page"). Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-03ftrace: add ftrace_graph_stop()Steven Rostedt
Impact: new ftrace_graph_stop function While developing more features of function graph, I hit a bug that caused the WARN_ON to trigger in the prepare_ftrace_return function. Well, it was hard for me to find out that was happening because the bug would not print, it would just cause a hard lockup or reboot. The reason is that it is not safe to call printk from this function. Looking further, I also found that it calls unregister_ftrace_graph, which grabs a mutex and calls kstop machine. This would definitely lock the box up if it were to trigger. This patch adds a fast and safe ftrace_graph_stop() which will stop the function tracer. Then it is safe to call the WARN ON. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-03ring-buffer: read page interfaceSteven Rostedt
Impact: new API to ring buffer This patch adds a new interface into the ring buffer that allows a page to be read from the ring buffer on a given CPU. For every page read, one must also be given to allow for a "swap" of the pages. rpage = ring_buffer_alloc_read_page(buffer); if (!rpage) goto err; ret = ring_buffer_read_page(buffer, &rpage, cpu, full); if (!ret) goto empty; process_page(rpage); ring_buffer_free_read_page(rpage); The caller of these functions must handle any waits that are needed to wait for new data. The ring_buffer_read_page will simply return 0 if there is no data, or if "full" is set and the writer is still on the current page. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-03ring-buffer: move some metadata into buffer pageSteven Rostedt
Impact: get ready for splice changes This patch moves the commit and timestamp into the beginning of each data page of the buffer. This change will allow the page to be moved to another location (disk, network, etc) and still have information in the page to be able to read it. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-03ftrace: replace raw_local_irq_save with local_irq_saveSteven Rostedt
Impact: fix for lockdep and ftrace The raw_local_irq_save/restore confuses lockdep. This patch converts them to the local_irq_save/restore variants. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-03Merge branches 'tracing/ftrace' and 'tracing/function-graph-tracer' into ↵Ingo Molnar
tracing/core
2008-12-02tracing/function-graph-tracer: support for x86-64Frederic Weisbecker
Impact: extend and enable the function graph tracer to 64-bit x86 This patch implements the support for function graph tracer under x86-64. Both static and dynamic tracing are supported. This causes some small CPP conditional asm on arch/x86/kernel/ftrace.c I wanted to use probe_kernel_read/write to make the return address saving/patching code more generic but it causes tracing recursion. That would be perhaps useful to implement a notrace version of these function for other archs ports. Note that arch/x86/process_64.c is not traced, as in X86-32. I first thought __switch_to() was responsible of crashes during tracing because I believed current task were changed inside but that's actually not the case (actually yes, but not the "current" pointer). So I will have to investigate to find the functions that harm here, to enable tracing of the other functions inside (but there is no issue at this time, while process_64.c stays out of -pg flags). A little possible race condition is fixed inside this patch too. When the tracer allocate a return stack dynamically, the current depth is not initialized before but after. An interrupt could occur at this time and, after seeing that the return stack is allocated, the tracer could try to trace it with a random uninitialized depth. It's a prevention, even if I hadn't problems with it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tim Bird <tim.bird@am.sony.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-02function trace: fix a bug of single thread function traceLiming Wang
Impact: fix "no output from tracer" bug caused by ftrace_update_pid_func() When disabling single thread function trace using "echo -1 > set_ftrace_pid", the normal function trace has to restore to original function, otherwise the normal function trace will not work well. Without this commit, something like below: $ ps |grep 850 850 root 2556 S -/bin/sh $ echo 850 > /debug/tracing/set_ftrace_pid $ echo function > /debug/tracing/current_tracer $ echo 1 > /debug/tracing/tracing_enabled $ sleep 1 $ echo 0 > /debug/tracing/tracing_enabled $ cat /debug/tracing/trace_pipe |wc -l 59704 $ echo -1 > /debug/tracing/set_ftrace_pid $ echo 1 > /debug/tracing/tracing_enabled $ sleep 1 $ echo 0 > /debug/tracing/tracing_enabled $ more /debug/tracing/trace_pipe <====== nothing output now! it should output trace record. Signed-off-by: Liming Wang <liming.wang@windriver.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-02Merge branches 'tracing/branch-tracer', 'tracing/ftrace', ↵Ingo Molnar
'tracing/function-graph-tracer', 'tracing/markers', 'tracing/powerpc', 'tracing/stack-tracer' and 'tracing/tracepoints' into tracing/core
2008-11-29tracing/branch-tracer: include missing irqflags.hFrederic Weisbecker
Impact: fix build error on branch tracer This should fix a build error reported on alpha in linux-next: CC kernel/trace/trace_branch.o kernel/trace/trace_branch.c: In function 'probe_likely_condition': kernel/trace/trace_branch.c:44: error: implicit declaration of function 'raw_local_irq_save' kernel/trace/trace_branch.c:76: error: implicit declaration of function 'raw_local_irq_restore' Unfortunately, I can't test it since I don't have any Alpha build environment. Reported-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-28ftrace: improve seq_operation of ftraceLiming Wang
Impact: make ftrace position computing more sane First remove useless ->pos field. Then we needn't check seq_printf in .show like other place. Signed-off-by: Liming Wang <liming.wang@windriver.com> Reviewed-by: Bruce Ashfield <bruce.ashfield@windriver.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-28tracing, alpha: fix build: add missing #ifdef CONFIG_STACKTRACETörök Edwin
There are architectures that still have no stacktrace support. Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-28tracing/function-graph-tracer: more output tweaksIngo Molnar
Impact: prettify the output some more Before: 0) | sys_read() { 0) 0.796 us | fget_light(); 0) | vfs_read() { 0) | rw_verify_area() { 0) | security_file_permission() { ------------8<---------- thread sshd-1755 ------------8<---------- After: 0) | sys_read() { 0) 0.796 us | fget_light(); 0) | vfs_read() { 0) | rw_verify_area() { 0) | security_file_permission() { ------------------------------------------ | 1) migration/0--1 => sshd-1755 ------------------------------------------ Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-28tracing/function-graph-tracer: adjustments of the trace informationsFrederic Weisbecker
Impact: increase the visual qualities of the call-graph-tracer output This patch applies various trace output formatting changes: - CPU is now a decimal number, followed by a parenthesis. - Overhead is now on the second column (gives a good visibility) - Cost is now on the third column, can't exceed 9999.99 us. It is followed by a virtual line based on a "|" character. - Functions calls are now the last column on the right. This way, we haven't dynamic column (which flow is harder to follow) on its right. - CPU and Overhead have their own option flag. They are default-on but you can disable them easily: echo nofuncgraph-cpu > trace_options echo nofuncgraph-overhead > trace_options TODO: _ Refactoring of the thread switch output. _ Give a default-off option to output the thread and its pid on each row. _ Provide headers _ .... Here is an example of the new trace style: 0) | mutex_unlock() { 0) 0.639 us | __mutex_unlock_slowpath(); 0) 1.607 us | } 0) | remove_wait_queue() { 0) 0.616 us | _spin_lock_irqsave(); 0) 0.616 us | _spin_unlock_irqrestore(); 0) 2.779 us | } 0) 0.495 us | n_tty_set_room(); 0) ! 9999.999 us | } 0) | tty_ldisc_deref() { 0) 0.615 us | _spin_lock_irqsave(); 0) 0.616 us | _spin_unlock_irqrestore(); 0) 2.793 us | } 0) | current_fs_time() { 0) 0.488 us | current_kernel_time(); 0) 0.495 us | timespec_trunc(); 0) 2.486 us | } 0) ! 9999.999 us | } 0) ! 9999.999 us | } 0) ! 9999.999 us | } 0) | sys_read() { 0) 0.796 us | fget_light(); 0) | vfs_read() { 0) | rw_verify_area() { 0) | security_file_permission() { 0) 0.488 us | cap_file_permission(); 0) 1.720 us | } 0) 3. 4 us | } 0) | tty_read() { 0) 0.488 us | tty_paranoia_check(); 0) | tty_ldisc_ref_wait() { 0) | tty_ldisc_try() { 0) 0.615 us | _spin_lock_irqsave(); 0) 0.615 us | _spin_unlock_irqrestore(); 0) 5.436 us | } 0) 6.427 us | } Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-27tracing/function-graph-tracer: enhancements for the trace outputFrederic Weisbecker
Impact: enhance the output of the graph-tracer This patch applies some ideas of Ingo Molnar and Steven Rostedt. * Output leaf functions in one line with parenthesis, semicolon and duration output. * Add a second column (after cpu) for an overhead sign. if duration > 100 us, "!" if duration > 10 us, "+" else " " * Print output in us with remaining nanosec: u.n * Print duration on the right end, following the indentation of the functions. Use also visual clues: "-" on entry call (no duration to output) and "+" on return (duration output). The name of the tracer has been fixed as well: function-branch becomes function_branch. Here is an example of the new output: CPU[000] dequeue_entity() { - CPU[000] update_curr() { - CPU[000] update_min_vruntime(); + 0.512 us CPU[000] } + 1.504 us CPU[000] clear_buddies(); + 0.481 us CPU[000] update_min_vruntime(); + 0.504 us CPU[000] } + 4.557 us CPU[000] hrtick_update() { - CPU[000] hrtick_start_fair(); + 0.489 us CPU[000] } + 1.443 us CPU[000] + } + 14.655 us CPU[000] + } + 15.678 us CPU[000] + } + 16.686 us CPU[000] msecs_to_jiffies(); + 0.481 us CPU[000] put_prev_task_fair(); + 0.504 us CPU[000] pick_next_task_fair(); + 0.482 us CPU[000] pick_next_task_rt(); + 0.504 us CPU[000] pick_next_task_fair(); + 0.481 us CPU[000] pick_next_task_idle(); + 0.489 us CPU[000] _spin_trylock(); + 0.655 us CPU[000] _spin_unlock(); + 0.609 us CPU[000] ------------8<---------- thread bash-2794 ------------8<---------- CPU[000] finish_task_switch() { - CPU[000] _spin_unlock_irq(); + 0.722 us CPU[000] } + 2.369 us CPU[000] ! } + 501972.605 us CPU[000] ! } + 501973.763 us CPU[000] copy_from_read_buf() { - CPU[000] _spin_lock_irqsave(); + 0.670 us CPU[000] _spin_unlock_irqrestore(); + 0.699 us CPU[000] copy_to_user() { - CPU[000] might_fault() { - CPU[000] __might_sleep(); + 0.503 us CPU[000] } + 1.632 us CPU[000] __copy_to_user_ll(); + 0.542 us CPU[000] } + 3.858 us CPU[000] tty_audit_add_data() { - CPU[000] _spin_lock_irq(); + 0.609 us CPU[000] _spin_unlock_irq(); + 0.624 us CPU[000] } + 3.196 us CPU[000] _spin_lock_irqsave(); + 0.624 us CPU[000] _spin_unlock_irqrestore(); + 0.625 us CPU[000] + } + 13.611 us CPU[000] copy_from_read_buf() { - CPU[000] _spin_lock_irqsave(); + 0.624 us CPU[000] _spin_unlock_irqrestore(); + 0.616 us CPU[000] } + 2.820 us CPU[000] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-27Merge branches 'tracing/blktrace', 'tracing/ftrace', ↵Ingo Molnar
'tracing/function-graph-tracer' and 'tracing/power-tracer' into tracing/core
2008-11-27ftrace: prevent recursionLai Jiangshan
Impact: prevent unnecessary stack recursion if the resched flag was set before we entered, then don't reschedule. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26tracing: add "power-tracer": C/P state tracer to help power optimizationArjan van de Ven
Impact: new "power-tracer" ftrace plugin This patch adds a C/P-state ftrace plugin that will generate detailed statistics about the C/P-states that are being used, so that we can look at detailed decisions that the C/P-state code is making, rather than the too high level "average" that we have today. An example way of using this is: mount -t debugfs none /sys/kernel/debug echo cstate > /sys/kernel/debug/tracing/current_tracer echo 1 > /sys/kernel/debug/tracing/tracing_enabled sleep 1 echo 0 > /sys/kernel/debug/tracing/tracing_enabled cat /sys/kernel/debug/tracing/trace | perl scripts/trace/cstate.pl > out.svg Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26ftrace: add cpu annotation for function graph tracerSteven Rostedt
Impact: enhancement for function graph tracer When run on a SMP box, the function graph tracer is confusing because it shows the different CPUS as changes in the trace. This patch adds the annotation of 'CPU[###]' where ### is a three digit number. The output will look similar to this: CPU[001] dput() { CPU[000] } 726 CPU[001] } 487 CPU[000] do_softirq() { CPU[001] } 2221 CPU[000] __do_softirq() { CPU[000] __local_bh_disable() { CPU[001] unroll_tree_refs() { CPU[000] } 569 CPU[001] } 501 CPU[000] rcu_process_callbacks() { CPU[001] kfree() { What makes this nice is that now you can grep the file and produce readable format for a particular CPU. # cat /debug/tracing/trace > /tmp/trace # grep '^CPU\[000\]' /tmp/trace > /tmp/trace0 # grep '^CPU\[001\]' /tmp/trace > /tmp/trace1 Will give you: # head /tmp/trace0 CPU[000] ------------8<---------- thread sshd-3899 ------------8<---------- CPU[000] inotify_dentry_parent_queue_event() { CPU[000] } 2531 CPU[000] inotify_inode_queue_event() { CPU[000] } 505 CPU[000] } 69626 CPU[000] } 73089 CPU[000] audit_syscall_exit() { CPU[000] path_put() { CPU[000] dput() { # head /tmp/trace1 CPU[001] ------------8<---------- thread pcscd-3446 ------------8<---------- CPU[001] } 4186 CPU[001] dput() { CPU[001] } 543 CPU[001] vfs_permission() { CPU[001] inode_permission() { CPU[001] shmem_permission() { CPU[001] generic_permission() { CPU[001] } 501 CPU[001] } 2205 Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26ftrace: add thread comm to function graph tracerSteven Rostedt
Impact: enhancement to function graph tracer Export the trace_find_cmdline so the function graph tracer can use it to print the comms of the threads. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26ftrace: let function tracing and function return run togetherSteven Rostedt
Impact: feature This patch enables function tracing and function return to run together. I've tested this by enabling the stack tracer and return tracer, where both the function entry and function return are used together with dynamic ftrace. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26ftrace: use code patching for ftrace graph tracerSteven Rostedt
Impact: more efficient code for ftrace graph tracer This patch uses the dynamic patching, when available, to patch the function graph code into the kernel. This patch will ease the way for letting both function tracing and function graph tracing run together. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26ftrace: add function tracing to single threadSteven Rostedt
Impact: feature to function trace a single thread This patch adds the ability to function trace a single thread. The file: /debugfs/tracing/set_ftrace_pid contains the pid to trace. Valid pids are any positive integer. Writing any negative number to this file will disable the pid tracing and the function tracer will go back to tracing all of threads. This feature works with both static and dynamic function tracing. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26tracing/function-return-tracer: set a more human readable outputFrederic Weisbecker
Impact: feature This patch sets a C-like output for the function graph tracing. For this aim, we now call two handler for each function: one on the entry and one other on return. This way we can draw a well-ordered call stack. The pid of the previous trace is loosely stored to be compared against the one of the current trace to see if there were a context switch. Without this little feature, the call tree would seem broken at some locations. We could use the sched_tracer to capture these sched_events but this way of processing is much more simpler. 2 spaces have been chosen for indentation to fit the screen while deep calls. The time of execution in nanosecs is printed just after closed braces, it seems more easy this way to find the corresponding function. If the time was printed as a first column, it would be not so easy to find the corresponding function if it is called on a deep depth. I plan to output the return value but on 32 bits CPU, the return value can be 32 or 64, and its difficult to guess on which case we are. I don't know what would be the better solution on X86-32: only print eax (low-part) or even edx (high-part). Actually it's thee same problem when a function return a 8 bits value, the high part of eax could contain junk values... Here is an example of trace: sys_read() { fget_light() { } 526 vfs_read() { rw_verify_area() { security_file_permission() { cap_file_permission() { } 519 } 1564 } 2640 do_sync_read() { pipe_read() { __might_sleep() { } 511 pipe_wait() { prepare_to_wait() { } 760 deactivate_task() { dequeue_task() { dequeue_task_fair() { dequeue_entity() { update_curr() { update_min_vruntime() { } 504 } 1587 clear_buddies() { } 512 add_cfs_task_weight() { } 519 update_min_vruntime() { } 511 } 5602 dequeue_entity() { update_curr() { update_min_vruntime() { } 496 } 1631 clear_buddies() { } 496 update_min_vruntime() { } 527 } 4580 hrtick_update() { hrtick_start_fair() { } 488 } 1489 } 13700 } 14949 } 16016 msecs_to_jiffies() { } 496 put_prev_task_fair() { } 504 pick_next_task_fair() { } 489 pick_next_task_rt() { } 496 pick_next_task_fair() { } 489 pick_next_task_idle() { } 489 ------------8<---------- thread 4 ------------8<---------- finish_task_switch() { } 1203 do_softirq() { __do_softirq() { __local_bh_disable() { } 669 rcu_process_callbacks() { __rcu_process_callbacks() { cpu_quiet() { rcu_start_batch() { } 503 } 1647 } 3128 __rcu_process_callbacks() { } 542 } 5362 _local_bh_enable() { } 587 } 8880 } 9986 kthread_should_stop() { } 669 deactivate_task() { dequeue_task() { dequeue_task_fair() { dequeue_entity() { update_curr() { calc_delta_mine() { } 511 update_min_vruntime() { } 511 } 2813 Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26tracing/function-return-tracer: change the name into function-graph-tracerFrederic Weisbecker
Impact: cleanup This patch changes the name of the "return function tracer" into function-graph-tracer which is a more suitable name for a tracing which makes one able to retrieve the ordered call stack during the code flow. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25x86, bts, ftrace: a BTS ftrace plug-in prototypeMarkus Metzger
Impact: add new ftrace plugin A prototype for a BTS ftrace plug-in. The tracer collects branch trace in a cyclic buffer for each cpu. The tracer is not configurable and the trace for each snapshot is appended when doing cat /debug/tracing/trace. This is a proof of concept that will be extended with future patches to become a (hopefully) useful tool. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25x86, ftrace: call trace->open() before stopping tracing; add ↵Markus Metzger
trace->print_header() Add a callback to allow an ftrace plug-in to write its own header. Move the call to trace->open() up a few lines. The changes are required by the BTS ftrace plug-in. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25hrtimer: removing all ur callback modesPeter Zijlstra
Impact: cleanup, move all hrtimer processing into hardirq context This is an attempt at removing some of the hrtimer complexity by reducing the number of callback modes to 1. This means that all hrtimer callback functions will be ran from HARD-irq context. I went through all the 30 odd hrtimer callback functions in the kernel and saw only one that I'm not quite sure of, which is the one in net/can/bcm.c - hence I'm CC-ing the folks responsible for that code. Furthermore, the hrtimer core now calls callbacks directly with IRQs disabled in case you try to enqueue an expired timer. If this timer is a periodic timer (which should use hrtimer_forward() to advance its time) then it might be possible to end up in an inf. recursive loop due to the fact that hrtimer_forward() doesn't round up to the next timer granularity, and therefore keeps on calling the callback - obviously this needs a fix. Aside from that, this seems to compile and actually boot on my dual core test box - although I'm sure there are some bugs in, me not hitting any makes me certain :-) Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-24Merge branches 'tracing/branch-tracer', 'tracing/fastboot', ↵Ingo Molnar
'tracing/ftrace', 'tracing/function-return-tracer', 'tracing/power-tracer', 'tracing/powerpc', 'tracing/ring-buffer', 'tracing/stack-tracer' and 'tracing/urgent' into tracing/core
2008-11-23x86, mmiotrace: fix buffer overrun detectionPekka Paalanen
Impact: fix mmiotrace overrun tracing When ftrace framework moved to use the ring buffer facility, the buffer overrun detection was broken after 2.6.27 by commit | commit 3928a8a2d98081d1bc3c0a84a2d70e29b90ecf1c | Author: Steven Rostedt <rostedt@goodmis.org> | Date: Mon Sep 29 23:02:41 2008 -0400 | | ftrace: make work with new ring buffer | | This patch ports ftrace over to the new ring buffer. The detection is now fixed by using the ring buffer API. When mmiotrace detects a buffer overrun, it will report the number of lost events. People reading an mmiotrace log must know if something was missed, otherwise the data may not make sense. Signed-off-by: Pekka Paalanen <pq@iki.fi> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23tracing/function-return-tracer: don't trace kfree while it frees the return ↵Frederic Weisbecker
stack Impact: fix a crash While I killed the cat process, I got sometimes the following (but rare) crash: [ 65.689027] Pid: 2969, comm: cat Not tainted (2.6.28-rc6-tip #83) AMILO Li 2727 [ 65.689027] EIP: 0060:[<00000000>] EFLAGS: 00010082 CPU: 1 [ 65.689027] EIP is at 0x0 [ 65.689027] EAX: 00000000 EBX: f66cd780 ECX: c019a64a EDX: f66cd780 [ 65.689027] ESI: 00000286 EDI: f66cd780 EBP: f630be2c ESP: f630be24 [ 65.689027] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [ 65.689027] Process cat (pid: 2969, ti=f630a000 task=f66cd780 task.ti=f630a000) [ 65.689027] Stack: [ 65.689027] 00000012 f630bd54 f630be7c c012c853 00000000 c0133cc9 f66cda54 f630be5c [ 65.689027] f630be68 f66cda54 f66cd88c f66cd878 f7070000 00000001 f630be90 c0135dbc [ 65.689027] f614a614 f630be68 f630be68 f65ba200 00000002 f630bf10 f630be90 c012cad6 [ 65.689027] Call Trace: [ 65.689027] [<c012c853>] ? do_exit+0x603/0x850 [ 65.689027] [<c0133cc9>] ? next_signal+0x9/0x40 [ 65.689027] [<c0135dbc>] ? dequeue_signal+0x8c/0x180 [ 65.689027] [<c012cad6>] ? do_group_exit+0x36/0x90 [ 65.689027] [<c013709c>] ? get_signal_to_deliver+0x20c/0x390 [ 65.689027] [<c0102b69>] ? do_notify_resume+0x99/0x8b0 [ 65.689027] [<c02e6d1a>] ? tty_ldisc_deref+0x5a/0x80 [ 65.689027] [<c014db9b>] ? trace_hardirqs_on+0xb/0x10 [ 65.689027] [<c02e6d1a>] ? tty_ldisc_deref+0x5a/0x80 [ 65.689027] [<c02e39b0>] ? n_tty_write+0x0/0x340 [ 65.689027] [<c02e1812>] ? redirected_tty_write+0x82/0x90 [ 65.689027] [<c019ee99>] ? vfs_write+0x99/0xd0 [ 65.689027] [<c02e1790>] ? redirected_tty_write+0x0/0x90 [ 65.689027] [<c019f342>] ? sys_write+0x42/0x70 [ 65.689027] [<c01035ca>] ? work_notifysig+0x13/0x19 [ 65.689027] Code: Bad EIP value. [ 65.689027] EIP: [<00000000>] 0x0 SS:ESP 0068:f630be24 This is because on do_exit(), kfree is called to free the return addresses stack but kfree is traced and stored its return address in this stack. This patch fixes it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23tracing/stack-tracer: avoid races accessing fileTörök Edwin
Impact: fix race vma->vm_file reference is only stable while holding the mmap_sem, so move usage of it to within the critical section. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23tracing/stack-tracer: introduce CONFIG_USER_STACKTRACE_SUPPORTTörök Edwin
Impact: cleanup User stack tracing is just implemented for x86, but it is not x86 specific. Introduce a generic config flag, that is currently enabled only for x86. When other arches implement it, they will have to SELECT USER_STACKTRACE_SUPPORT. Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23tracing/stack-tracer: fix locking and refcountsTörök Edwin
Impact: fix refcounting/object-access bug Hold mmap_sem while looking up/accessing vma. Hold the RCU lock while using the task we looked up. Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23tracing/stack-tracer: fix style issuesTörök Edwin
Impact: cleanup Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23trace: fix compiler warning in branch profilerSteven Rostedt
Impact: fix compiler warning The ftrace_pointers used in the branch profiler are constant values. They should never change. But the compiler complains when they are passed into the debugfs_create_file as a data pointer, because the function discards the qualifier. This patch typecasts the parameter to debugfs_create_file back to a void pointer. To remind the callbacks that they are pointing to a constant value, I also modified the callback local pointers to be const struct ftrace_pointer * as well. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23ftrace: add ftrace_off_permanentSteven Rostedt
Impact: add new API to disable all of ftrace on anomalies It case of a serious anomaly being detected (like something caught by lockdep) it is a good idea to disable all tracing immediately, without grabing any locks. This patch adds ftrace_off_permanent that disables the tracers, function tracing and ring buffers without a way to enable them again. This should only be used when something serious has been detected. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23ring-buffer: add tracing_off_permanentSteven Rostedt
Impact: feature to permanently disable ring buffer This patch adds a API to the ring buffer code that will permanently disable the ring buffer from ever recording. This should only be called when some serious anomaly is detected, and the system may be in an unstable state. When that happens, shutting down the recording to the ring buffers may be appropriate. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23trace: profile all if conditionalsSteven Rostedt
Impact: feature to profile if statements This patch adds a branch profiler for all if () statements. The results will be found in: /debugfs/tracing/profile_branch For example: miss hit % Function File Line ------- --------- - -------- ---- ---- 0 1 100 x86_64_start_reservations head64.c 127 0 1 100 copy_bootdata head64.c 69 1 0 0 x86_64_start_kernel head64.c 111 32 0 0 set_intr_gate desc.h 319 1 0 0 reserve_ebda_region head.c 51 1 0 0 reserve_ebda_region head.c 47 0 1 100 reserve_ebda_region head.c 42 0 0 X maxcpus main.c 165 Miss means the branch was not taken. Hit means the branch was taken. The percent is the percentage the branch was taken. This adds a significant amount of overhead and should only be used by those analyzing their system. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23trace: branch profiling should not print percent without dataSteven Rostedt
Impact: cleanup on output of branch profiler When a branch has not been taken, it does not make sense to show a percentage incorrect or hit. This patch changes the behaviour to print out a 'X' when the branch has not been executed yet. For example: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2096 0 0 do_arch_prctl process_64.c 832 0 0 X do_arch_prctl process_64.c 804 2604 0 0 IS_ERR err.h 34 130228 5765 4 __switch_to process_64.c 673 0 0 X enable_TSC process_64.c 448 0 0 X disable_TSC process_64.c 431 Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23trace: consolidate unlikely and likely profilerSteven Rostedt
Impact: clean up to make one profiler of like and unlikely tracer The likely and unlikely profiler prints out the file and line numbers of the annotated branches that it is profiling. It shows the number of times it was correct or incorrect in its guess. Having two different files or sections for that matter to tell us if it was a likely or unlikely is pretty pointless. We really only care if it was correct or not. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23tracing: identify which executable object the userspace address belongs toTörök Edwin
Impact: modify+improve the userstacktrace tracing visualization feature Store thread group leader id, and use it to lookup the address in the process's map. We could have looked up the address on thread's map, but the thread might not exist by the time we are called. The process might not exist either, but if you are reading trace_pipe, that is unlikely. Example usage: mount -t debugfs nodev /sys/kernel/debug cd /sys/kernel/debug/tracing echo userstacktrace >iter_ctrl echo sym-userobj >iter_ctrl echo sched_switch >current_tracer echo 1 >tracing_enabled cat trace_pipe >/tmp/trace& .... run application ... echo 0 >tracing_enabled cat /tmp/trace You'll see stack entries like: /lib/libpthread-2.7.so[+0xd370] You can convert them to function/line using: addr2line -fie /lib/libpthread-2.7.so 0xd370 Or: addr2line -fie /usr/lib/debug/libpthread-2.7.so 0xd370 For non-PIC/PIE executables this won't work: a.out[+0x73b] You need to run the following: addr2line -fie a.out 0x40073b (where 0x400000 is the default load address of a.out) Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23tracing: add support for userspace stacktraces in tracing/iter_ctrlTörök Edwin
Impact: add new (default-off) tracing visualization feature Usage example: mount -t debugfs nodev /sys/kernel/debug cd /sys/kernel/debug/tracing echo userstacktrace >iter_ctrl echo sched_switch >current_tracer echo 1 >tracing_enabled .... run application ... echo 0 >tracing_enabled Then read one of 'trace','latency_trace','trace_pipe'. To get the best output you can compile your userspace programs with frame pointers (at least glibc + the app you are tracing). Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23tracing/function-return-tracer: store return stack into task_struct and ↵Frederic Weisbecker
allocate it dynamically Impact: use deeper function tracing depth safely Some tests showed that function return tracing needed a more deeper depth of function calls. But it could be unsafe to store these return addresses to the stack. So these arrays will now be allocated dynamically into task_struct of current only when the tracer is activated. Typical scheme when tracer is activated: - allocate a return stack for each task in global list. - fork: allocate the return stack for the newly created task - exit: free return stack of current - idle init: same as fork I chose a default depth of 50. I don't have overruns anymore. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23Merge branches 'tracing/profiling', 'tracing/options' and 'tracing/urgent' ↵Ingo Molnar
into tracing/core