aboutsummaryrefslogtreecommitdiff
path: root/kernel/trace/trace.h
AgeCommit message (Collapse)Author
2009-03-05tracing: rename ftrace_printk() => trace_printk()Ingo Molnar
Impact: cleanup Use a more generic name - this also allows the prototype to move to kernel.h and be generally available to kernel developers who want to do some quick tracing. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-04tracing: add latency output format optionSteven Rostedt
With the removal of the latency_trace file, we lost the ability to see some of the finer details in a trace. Like the state of interrupts enabled, the preempt count, need resched, and if we are in an interrupt handler, softirq handler or not. This patch simply creates an option to bring back the old format. This also removes the warning about an unused variable that held the latency_trace file operations. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-03tracing: add binary buffer files for use with spliceSteven Rostedt
Impact: new feature This patch creates a directory of files that correspond to the per CPU ring buffers. These are binary files and are made to be used with splice. This is the fastest way to extract data from the ftrace ring buffers. Thanks to Jiaying Zhang for pushing me to get this code fixed, and to Eduard - Gabriel Munteanu for his splice code that helped me debug my code. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-02tracing: add format file to describe event struct fieldsSteven Rostedt
This patch adds the "format" file to the trace point event directory. This is based off of work by Tom Zanussi, in which a file is exported to be tread from user land such that a user space app may read the binary record stored in the ring buffer. # cat /debug/tracing/events/sched/sched_switch/format field:pid_t prev_pid; offset:12; size:4; field:int prev_prio; offset:16; size:4; field special:char next_comm[TASK_COMM_LEN]; offset:20; size:16; field:pid_t next_pid; offset:36; size:4; field:int next_prio; offset:40; size:4; Idea-from: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-02tracing: make trace_seq_reset global and rename to trace_seq_initSteven Rostedt
Impact: clean up The trace_seq functions may be used separately outside of the ftrace iterator. The trace_seq_reset is needed for these operations. This patch also renames trace_seq_reset to the more appropriate trace_seq_init. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-28tracing: add raw fast tracing interface for trace eventsSteven Rostedt
This patch adds the interface to enable the C style trace points. In the directory /debugfs/tracing/events/subsystem/event We now have three files: enable : values 0 or 1 to enable or disable the trace event. available_types: values 'raw' and 'printf' which indicate the tracing types available for the trace point. If a developer does not use the TRACE_EVENT_FORMAT macro and just uses the TRACE_FORMAT macro, then only 'printf' will be available. This file is read only. type: values 'raw' or 'printf'. This indicates which type of tracing is active for that trace point. 'printf' is the default and if 'raw' is not available, this file is read only. # echo raw > /debug/tracing/events/sched/sched_wakeup/type # echo 1 > /debug/tracing/events/sched/sched_wakeup/enable Will enable the C style tracing for the sched_wakeup trace point. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-28tracing: add raw trace point recording infrastructureSteven Rostedt
Impact: lower overhead tracing The current event tracer can automatically pick up trace points that are registered with the TRACE_FORMAT macro. But it required a printf format string and parsing. Although, this adds the ability to get guaranteed information like task names and such, it took a hit in overhead processing. This processing can add about 500-1000 nanoseconds overhead, but in some cases that too is considered too much and we want to shave off as much from this overhead as possible. Tom Zanussi recently posted tracing patches to lkml that are based on a nice idea about capturing the data via C structs using STRUCT_ENTER, STRUCT_EXIT type of macros. I liked that method very much, but did not like the implementation that required a developer to add data/code in several disjoint locations. This patch extends the event_tracer macros to do a similar "raw C" approach that Tom Zanussi did. But instead of having the developers needing to tweak a bunch of code all over the place, they can do it all in one macro - preferably placed near the code that it is tracing. That makes it much more likely that tracepoints will be maintained on an ongoing basis by the code they modify. The new macro TRACE_EVENT_FORMAT is created for this approach. (Note, a developer may still utilize the more low level DECLARE_TRACE macros if they don't care about getting their traces automatically in the event tracer.) They can also use the existing TRACE_FORMAT if they don't need to code the tracepoint in C, but just want to use the convenience of printf. So if the developer wants to "hardwire" a tracepoint in the fastest possible way, and wants to acquire their data via a user space utility in a raw binary format, or wants to see it in the trace output but not sacrifice any performance, then they can implement the faster but more complex TRACE_EVENT_FORMAT macro. Here's what usage looks like: TRACE_EVENT_FORMAT(name, TPPROTO(proto), TPARGS(args), TPFMT(fmt, fmt_args), TRACE_STUCT( TRACE_FIELD(type1, item1, assign1) TRACE_FIELD(type2, item2, assign2) [...] ), TPRAWFMT(raw_fmt) ); Note name, proto, args, and fmt, are all identical to what TRACE_FORMAT uses. name: is the unique identifier of the trace point proto: The proto type that the trace point uses args: the args in the proto type fmt: printf format to use with the event printf tracer fmt_args: the printf argments to match fmt TRACE_STRUCT starts the ability to create a structure. Each item in the structure is defined with a TRACE_FIELD TRACE_FIELD(type, item, assign) type: the C type of item. item: the name of the item in the stucture assign: what to assign the item in the trace point callback raw_fmt is a way to pretty print the struct. It must match the order of the items are added in TRACE_STUCT An example of this would be: TRACE_EVENT_FORMAT(sched_wakeup, TPPROTO(struct rq *rq, struct task_struct *p, int success), TPARGS(rq, p, success), TPFMT("task %s:%d %s", p->comm, p->pid, success?"succeeded":"failed"), TRACE_STRUCT( TRACE_FIELD(pid_t, pid, p->pid) TRACE_FIELD(int, success, success) ), TPRAWFMT("task %d success=%d") ); This creates us a unique struct of: struct { pid_t pid; int success; }; And the way the call back would assign these values would be: entry->pid = p->pid; entry->success = success; The nice part about this is that the creation of the assignent is done via macro magic in the event tracer. Once the TRACE_EVENT_FORMAT is created, the developer will then have a faster method to record into the ring buffer. They do not need to worry about the tracer itself. The developer would only need to touch the files in include/trace/*.h Again, I would like to give special thanks to Tom Zanussi for this nice idea. Idea-from: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-28tracing: add interface to write into current tracer bufferSteven Rostedt
Right now all tracers must manage their own trace buffers. This was to enforce tracers to be independent in case we finally decide to allow each tracer to have their own trace buffer. But now we are adding event tracing that writes to the current tracer's buffer. This adds an interface to allow events to write to the current tracer buffer without having to manage its own. Since event tracing has no "tracer", and is just a way to hook into any other tracer. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-25tracing/core: make the read callbacks reentrantsFrederic Weisbecker
Now that several per-cpu files can be read or spliced at the same, we want the read/splice callbacks for tracing files to be reentrants. Until now, a single global mutex (trace_types_lock) serialized the access to tracing_read_pipe(), tracing_splice_read_pipe(), and the seq helpers. Ie: it means that if a user tries to read trace_pipe0 and trace_pipe1 at the same time, the access to the function tracing_read_pipe() is contended and one reader must wait for the other to finish its read call. The trace_type_lock mutex is mostly here to serialize the access to the global current tracer (current_trace), which can be changed concurrently. Although the iter struct keeps a private pointer to this tracer, its callbacks can be changed by another function. The method used here is to not keep anymore private reference to the tracer inside the iterator but to make a copy of it inside the iterator. Then it checks on subsequents read calls if the tracer has changed. This is not costly because the current tracer is not expected to be changed often, so we use a branch prediction for that. Moreover, we add a private mutex to the iterator (there is one iterator per file descriptor) to serialize the accesses in case of multiple consumers per file descriptor (which would be a silly idea from the user). Note that this is not to protect the ring buffer, since the ring buffer already serializes the readers accesses. This is to prevent from traces weirdness in case of concurrent consumers. But these mutexes can be dropped anyway, that would not result in any crash. Just tell me what you think about it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25tracing/core: introduce per cpu tracing filesFrederic Weisbecker
Impact: split up tracing output per cpu Currently, on the tracing debugfs directory, three files are available to the user to let him extracting the trace output: - trace is an iterator through the ring-buffer. It's a reader but not a consumer It doesn't block when no more traces are available. - trace pretty similar to the former, except that it adds more informations such as prempt count, irq flag, ... - trace_pipe is a reader and a consumer, it will also block waiting for traces if necessary (heh, yes it's a pipe). The traces coming from different cpus are curretly mixed up inside these files. Sometimes it messes up the informations, sometimes it's useful, depending on what does the tracer capture. The tracing_cpumask file is useful to filter the output and select only the traces captured a custom defined set of cpus. But still it is not enough powerful to extract at the same time one trace buffer per cpu. So this patch creates a new directory: /debug/tracing/per_cpu/. Inside this directory, you will now find one trace_pipe file and one trace file per cpu. Which means if you have two cpus, you will have: trace0 trace1 trace_pipe0 trace_pipe1 And of course, reading these files will have the same effect than with the usual tracing files, except that you will only see the traces from the given cpu. The original all-in-one cpu trace file are still available on their original place. Until now, only one consumer was allowed on trace_pipe to avoid racy consuming on the ring-buffer. Now the approach changed a bit, you can have only one consumer per cpu. Which means you are allowed to read concurrently trace_pipe0 and trace_pipe1 But you can't have two readers on trace_pipe0 or trace_pipe1. Following the same logic, if there is one reader on the common trace_pipe, you can not have at the same time another reader on trace_pipe0 or in trace_pipe1. Because in trace_pipe is already a consumer in all cpu buffers in essence. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-18tracing/core: use appropriate waiting on trace_pipeFrederic Weisbecker
Impact: api and pipe waiting change Currently, the waiting used in tracing_read_pipe() is done through a 100 msecs schedule_timeout() loop which periodically check if there are traces on the buffer. This can cause small latencies for programs which are reading the incoming events. This patch makes the reader waiting for the trace_wait waitqueue except for few tracers such as the sched and functions tracers which might be already hold the runqueue lock while waking up the reader. This is performed through a new callback wait_pipe() on struct tracer. If none is implemented on a specific tracer, the default waiting for trace_wait queue is attached. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09tracing: splice support for tracing_pipeEduard - Gabriel Munteanu
Added and implemented tracing_pipe_fops->splice_read(). This allows userspace programs to get tracing data more efficiently. Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-09tracing/function-graph-tracer: handle the leaf functions from trace_pipeFrederic Weisbecker
When one cats the trace file, the leaf functions are printed without brackets: function(); whereas in the trace_pipe file we'll see the following: function() { } This is because the ring_buffer handling is not the same between those two files. On the trace file, when an entry is printed, the iterator advanced and then we can check the next entry. There is no iterator with trace_pipe, the current entry to print has been peeked and not consumed. So checking the next entry will still return the current one while we don't consume it. This patch introduces a new value for the output callbacks to ask the tracing core to not consume the current entry after printing it. We need it because we will have to consume the current entry ourself to check the next one. Now the trace_pipe is able to handle well the leaf functions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09tracing/power: move the power trace headers to a dedicated fileFrederic Weisbecker
Impact: cleanup Move the power tracer headers to trace/power.h to keep ftrace.h and power bits more easy to maintain as separated topics. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09tracing/function-graph-tracer: provide a selftest for the function graph tracerFrederic Weisbecker
Making it more easy to do a basic regression test for this tracer. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09Merge branch 'tip/tracing/core/devel' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace Conflicts: kernel/trace/trace_hw_branches.c
2009-02-07trace: trivial fixes in comment typos.Wenji Huang
Impact: clean up Fixed several typos in the comments. Signed-off-by: Wenji Huang <wenji.huang@oracle.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-07trace: remove deprecated entry->cpuSteven Rostedt
Impact: fix to prevent developers from using entry->cpu With the new ring buffer infrastructure, the cpu for the entry is implicit with which CPU buffer it is on. The original code use to record the current cpu into the generic entry header, which can be retrieved by entry->cpu. When the ring buffer was introduced, the users were convert to use the the cpu number of which cpu ring buffer was in use (this was passed to the tracers by the iterator: iter->cpu). Unfortunately, the cpu item in the entry structure was never removed. This allowed for developers to use it instead of the proper iter->cpu, unknowingly, using an uninitialized variable. This was not the fault of the developers, since it would seem like the logical place to retrieve the cpu identifier. This patch removes the cpu item from the entry structure and fixes all the users that should have been using iter->cpu. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-06trace: Call tracing_reset_online_cpus before tracer->init()Arnaldo Carvalho de Melo
Impact: cleanup To make it easy for ftrace plugin writers, as this was open coded in the existing plugins Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frédéric Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-06tracing: Introduce trace_buffer_{lock_reserve,unlock_commit}Arnaldo Carvalho de Melo
Impact: new API These new functions do what previously was being open coded, reducing the number of details ftrace plugin writers have to worry about. It also standardizes the handling of stacktrace, userstacktrace and other trace options we may introduce in the future. With this patch, for instance, the blk tracer (and some others already in the tree) can use the "userstacktrace" /d/tracing/trace_options facility. $ codiff /tmp/vmlinux.before /tmp/vmlinux.after linux-2.6-tip/kernel/trace/trace.c: trace_vprintk | -5 trace_graph_return | -22 trace_graph_entry | -26 trace_function | -45 __ftrace_trace_stack | -27 ftrace_trace_userstack | -29 tracing_sched_switch_trace | -66 tracing_stop | +1 trace_seq_to_user | -1 ftrace_trace_special | -63 ftrace_special | +1 tracing_sched_wakeup_trace | -70 tracing_reset_online_cpus | -1 13 functions changed, 2 bytes added, 355 bytes removed, diff: -353 linux-2.6-tip/block/blktrace.c: __blk_add_trace | -58 1 function changed, 58 bytes removed, diff: -58 linux-2.6-tip/kernel/trace/trace.c: trace_buffer_lock_reserve | +88 trace_buffer_unlock_commit | +86 2 functions changed, 174 bytes added, diff: +174 /tmp/vmlinux.after: 16 functions changed, 176 bytes added, 413 bytes removed, diff: -237 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frédéric Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05trace: Remove unused trace_array_cpu parameterArnaldo Carvalho de Melo
Impact: cleanup Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-03trace: better manage the context info for eventsFrederic Weisbecker
Impact: make trace_event more convenient for tracers All tracers (for the moment) that use the struct trace_event want to have the context info printed before their own output: the pid/cmdline, cpu, and timestamp. But some other tracers that want to implement their trace_event callbacks will not necessary need these information or they may want to format them as they want. This patch adds a new default-enabled trace option: TRACE_ITER_CONTEXT_INFO When disabled through: echo nocontext-info > /debugfs/tracing/trace_options The pid, cpu and timestamps headers will not be printed. IE with the sched_switch tracer with context-info (default): bash-2935 [001] 100.356561: 2935:120:S ==> [001] 0:140:R <idle> <idle>-0 [000] 100.412804: 0:140:R + [000] 11:115:S events/0 <idle>-0 [000] 100.412816: 0:140:R ==> [000] 11:115:R events/0 events/0-11 [000] 100.412829: 11:115:S ==> [000] 0:140:R <idle> Without context-info: 2935:120:S ==> [001] 0:140:R <idle> 0:140:R + [000] 11:115:S events/0 0:140:R ==> [000] 11:115:R events/0 11:115:S ==> [000] 0:140:R <idle> A tracer can disable it at runtime by clearing the bit TRACE_ITER_CONTEXT_INFO in trace_flags. The print routines were renamed to trace_print_context and trace_print_lat_context, so that they can be used by tracers if they want to use them for one of the trace_event callbacks. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-26blktrace: add ftrace pluginArnaldo Carvalho de Melo
Impact: New way of using the blktrace infrastructure This drops the requirement of userspace utilities to use the blktrace facility. Configuration is done thru sysfs, adding a "trace" directory to the partition directory where blktrace can be enabled for the associated request_queue. The same filters present in the IOCTL interface are present as sysfs device attributes. The /sys/block/sdX/sdXN/trace/enable file allows tracing without any filters. The other files in this directory: pid, act_mask, start_lba and end_lba can be used with the same meaning as with the IOCTL interface. Using the sysfs interface will only setup the request_queue->blk_trace fields, tracing will only take place when the "blk" tracer is selected via the ftrace interface, as in the following example: To see the trace, one can use the /d/tracing/trace file or the /d/tracign/trace_pipe file, with semantics defined in the ftrace documentation in Documentation/ftrace.txt. [root@f10-1 ~]# cat /t/trace kjournald-305 [000] 3046.491224: 8,1 A WBS 6367 + 8 <- (8,1) 6304 kjournald-305 [000] 3046.491227: 8,1 Q R 6367 + 8 [kjournald] kjournald-305 [000] 3046.491236: 8,1 G RB 6367 + 8 [kjournald] kjournald-305 [000] 3046.491239: 8,1 P NS [kjournald] kjournald-305 [000] 3046.491242: 8,1 I RBS 6367 + 8 [kjournald] kjournald-305 [000] 3046.491251: 8,1 D WB 6367 + 8 [kjournald] kjournald-305 [000] 3046.491610: 8,1 U WS [kjournald] 1 <idle>-0 [000] 3046.511914: 8,1 C RS 6367 + 8 [6367] [root@f10-1 ~]# The default line context (prefix) format is the one described in the ftrace documentation, with the blktrace specific bits using its existing format, described in blkparse(8). If one wants to have the classic blktrace formatting, this is possible by using: [root@f10-1 ~]# echo blk_classic > /t/trace_options [root@f10-1 ~]# cat /t/trace 8,1 0 3046.491224 305 A WBS 6367 + 8 <- (8,1) 6304 8,1 0 3046.491227 305 Q R 6367 + 8 [kjournald] 8,1 0 3046.491236 305 G RB 6367 + 8 [kjournald] 8,1 0 3046.491239 305 P NS [kjournald] 8,1 0 3046.491242 305 I RBS 6367 + 8 [kjournald] 8,1 0 3046.491251 305 D WB 6367 + 8 [kjournald] 8,1 0 3046.491610 305 U WS [kjournald] 1 8,1 0 3046.511914 0 C RS 6367 + 8 [6367] [root@f10-1 ~]# Using the ftrace standard format allows more flexibility, such as the ability of asking for backtraces via trace_options: [root@f10-1 ~]# echo noblk_classic > /t/trace_options [root@f10-1 ~]# echo stacktrace > /t/trace_options [root@f10-1 ~]# cat /t/trace kjournald-305 [000] 3318.826779: 8,1 A WBS 6375 + 8 <- (8,1) 6312 kjournald-305 [000] 3318.826782: <= submit_bio <= submit_bh <= sync_dirty_buffer <= journal_commit_transaction <= kjournald <= kthread <= child_rip kjournald-305 [000] 3318.826836: 8,1 Q R 6375 + 8 [kjournald] kjournald-305 [000] 3318.826837: <= generic_make_request <= submit_bio <= submit_bh <= sync_dirty_buffer <= journal_commit_transaction <= kjournald <= kthread Please read the ftrace documentation to use aditional, standardized tracing filters such as /d/tracing/trace_cpumask, etc. See also /d/tracing/trace_mark to add comments in the trace stream, that is equivalent to the /d/block/sdaN/msg interface. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20x86, ftrace, hw-branch-tracer: dump trace on oopsMarkus Metzger
Dump the branch trace on an oops (based on ftrace_dump_on_oops). Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16ftrace: remove static from function tracer functionsSteven Rostedt
Impact: clean up After reorganizing the functions in trace.c and trace_function.c, they no longer need to be in global context. This patch makes the functions and one variable into static. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16ftrace: add stack trace to function tracerSteven Rostedt
Impact: new feature to stack trace any function Chris Mason asked about being able to pick and choose a function and get a stack trace from it. This feature enables his request. # echo io_schedule > /debug/tracing/set_ftrace_filter # echo function > /debug/tracing/current_tracer # echo func_stack_trace > /debug/tracing/trace_options Produces the following in /debug/tracing/trace: kjournald-702 [001] 135.673060: io_schedule <-sync_buffer kjournald-702 [002] 135.673671: <= sync_buffer <= __wait_on_bit <= out_of_line_wait_on_bit <= __wait_on_buffer <= sync_dirty_buffer <= journal_commit_transaction <= kjournald Note, be careful about turning this on without filtering the functions. You may find that you have a 10 second lag between typing and seeing what you typed. This is why the stack trace for the function tracer does not use the same stack_trace flag as the other tracers use. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-14tracing/ftrace: separate events tracing and stats tracing engineFrederic Weisbecker
Impact: tracing's Api change Currently, the stat tracing depends on the events tracing. When you switch to a new tracer, the stats files of the previous tracer will disappear. But it's more scalable to separate those two engines. This way, we can keep the stat files of one or several tracers when we want, without bothering of multiple tracer stat files or tracer switching. To build/destroys its stats files, a tracer just have to call register_stat_tracer/unregister_stat_tracer everytimes it wants to. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-11tracing/ftrace: handle more than one stat file per tracerFrederic Weisbecker
Impact: new API for tracers Make the stat tracing API reentrant. And also provide the new directory /debugfs/tracing/trace_stat which will contain all the stat files for the current active tracer. Now a tracer will, if desired, want to provide a zero terminated array of tracer_stat structures. Each one contains the callbacks necessary for one stat file. It have to provide at least a name for its stat file, an iterator with stat_start/start_next callback and an output callback for one stat entry. Also adapt the branch tracer to this new API. We create two files "all" and "annotated" inside the /debugfs/tracing/trace_stat directory, making the both stats simultaneously available instead of needing to change an option to switch from one stat file to another. The output of these stats haven't changed. Changes in v2: _ Apply the previous memory leak fix (rebase against tip/master) Changes in v3: _ Merge the patch that adapted the branch tracer to this Api in this patch to not break the kernel build. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-06Merge branches 'tracing/kmemtrace2' and 'tracing/ftrace' into tracing/urgentIngo Molnar
2009-01-06Merge branch 'linus' into tracing/kmemtrace2Ingo Molnar
2009-01-01cpumask: convert kernel trace functions furtherRusty Russell
Impact: Reduce future memory usage, use new cpumask API. Since the last patch was created and acked, more old cpumask users slipped into kernel/trace. Mostly trivial conversions, except struct trace_iterator's "started" member becomes a cpumask_var_t. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-12-30tracing/kmemtrace: normalize the raw tracer event to the unified tracing APIFrederic Weisbecker
Impact: new tracer plugin This patch adapts kmemtrace raw events tracing to the unified tracing API. To enable and use this tracer, just do the following: echo kmemtrace > /debugfs/tracing/current_tracer cat /debugfs/tracing/trace You will have the following output: # tracer: kmemtrace # # # ALLOC TYPE REQ GIVEN FLAGS POINTER NODE CALLER # FREE | | | | | | | | # | type_id 1 call_site 18446744071565527833 ptr 18446612134395152256 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 0 call_site 18446744071565636711 ptr 18446612134345164672 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 0 call_site 18446744071565636711 ptr 18446612134345164912 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 0 call_site 18446744071565636711 ptr 18446612134345165152 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1 type_id 0 call_site 18446744071566144042 ptr 18446612134346191680 bytes_req 1304 bytes_alloc 1312 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 That was to stay backward compatible with the format output produced in inux/tracepoint.h. This is the default ouput, but note that I tried something else. If you change an option: echo kmem_minimalistic > /debugfs/trace_options and then cat /debugfs/trace, you will have the following output: # tracer: kmemtrace # # # ALLOC TYPE REQ GIVEN FLAGS POINTER NODE CALLER # FREE | | | | | | | | # | - C 0xffff88007c088780 file_free_rcu + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname + K 240 240 000000d0 0xffff8800790dc780 -1 d_alloc - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname + K 240 240 000000d0 0xffff8800790dc870 -1 d_alloc - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname + K 240 240 000000d0 0xffff8800790dc960 -1 d_alloc + K 1304 1312 000000d0 0xffff8800791d7340 -1 reiserfs_alloc_inode - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname - C 0xffff88007cad6000 putname + K 992 1000 000000d0 0xffff880079045b58 -1 alloc_inode + K 768 1024 000080d0 0xffff88007c096400 -1 alloc_pipe_info + K 240 240 000000d0 0xffff8800790dca50 -1 d_alloc + K 272 320 000080d0 0xffff88007c088780 -1 get_empty_filp + K 272 320 000080d0 0xffff88007c088000 -1 get_empty_filp Yeah I shall confess kmem_minimalistic should be: kmem_alternative. Whatever, I find it more readable but this a personal opinion of course. We can drop it if you want. On the ALLOC/FREE column, + means an allocation and - a free. On the type column, you have K = kmalloc, C = cache, P = page I would like the flags to be GFP_* strings but that would not be easy to not break the column with strings.... About the node...it seems to always be -1. I don't know why but that shouldn't be difficult to find. I moved linux/tracepoint.h to trace/tracepoint.h as well. I think that would be more easy to find the tracer headers if they are all in their common directory. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-29tracing/ftrace: make trace_find_cmdline() generally availableIngo Molnar
Impact: build fix On !CONFIG_CONTEXT_SWITCH_TRACER trace_find_cmdline() is not defined: kernel/trace/trace_output.c: In function 'trace_ctxwake_print': kernel/trace/trace_output.c:499: error: implicit declaration of function 'trace_find_cmdline' kernel/trace/trace_output.c:499: warning: assignment makes pointer from integer without a cast Move it to the generic section in trace.h. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-29tracing/ftrace: provide the base infrastructure for histogram tracingFrederic Weisbecker
Impact: extend the tracing API The goal of this patch is to normalize and make more easy the implementation of statistical (histogram) tracing. It implements a trace_stat file into the /debugfs/tracing directory where one can print a one-shot output of statistics/histogram entries. A tracer has to provide two basic iterator callbacks: stat_start() => the first entry stat_next(prev, idx) => the next one. Note that it is adapted for arrays or hash tables or lists.... since it provides a pointer to the previous entry and the current index of the iterator. These two callbacks are called to get a snapshot of the statistics at each opening of the trace_stat file because. The values are so updated between two "cat trace_stat". And the tracer is free to lock its datas during the iteration to keep consistent values. Since it is almost always interesting to sort statisticals values to address the problems by priority, this infrastructure provides a "sorting" of the stat entries too if desired. A tracer has just to provide a stat_cmp callback to compare two entries and the stat tracing infrastructure will build a sorted list of the given entries. A last callback, called stat_headers, can be implemented by a tracer to output headers on its trace. If one of these callbacks is changed on runtime, it just have to signal it to the stat tracing API by calling the init_tracer_stat() helper. Changes in V2: - Fix a memory leak if the user opens multiple times the trace_stat file without closing it. Now we always free our list before rebuilding it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-29ftrace: set up trace event hash infrastructureSteven Rostedt
Impact: simplify/generalize/refactor trace.c The trace.c file is becoming more difficult to maintain due to the growing number of events. There is several formats that an event may be printed. This patch sets up the infrastructure of an event hash to allow for events to register how they should be printed. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-29ftrace: remove obsolete print continue functionalitySteven Rostedt
Impact: cleanup, remove obsolete code Now that the ring buffer used by ftrace allows for variable length entries, we do not need the 'cont' feature of the buffer. This code makes other parts of ftrace more complex and by removing this it simplifies the ftrace code. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-19ftrace: introduce tracing_reset_online_cpus() helperPekka J Enberg
Impact: cleanup This patch factors out common code from multiple tracers into a tracing_reset_online_cpus() function and converts the tracers to use it. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-17tracing/ftrace: add the printk-msg-only optionFrederic Weisbecker
Impact: display ftrace_printk messages "as is" By default, ftrace_printk() messages find their output with some other informations like pid, caller, ... Sometimes a developer just want to have the ftrace_printk left "as is", without other information. This is done by providing a default-off option called printk-msg-only. To enable it, just do `echo printk-msg-only > /debugfs/tracing/trace_options` Before the patch: <...>-2739 [000] 145.692153: __might_sleep: I'm an ftrace_printk msg in __might_sleep <...>-2739 [000] 145.692155: __might_sleep: I'm another ftrace_printk msg in __might_sleep After the patch and the printk-msg-only option enabled: I'm an ftrace_printk msg in __might_sleep I'm another ftrace_printk msg in __might_sleep Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-12x86, bts, ftrace: adapt the hw-branch-tracer to the ds.c interfaceMarkus Metzger
Impact: restructure code, cleanup Remove BTS bits from the hw-branch-tracer (renamed from bts-tracer) and use the ds interface. Signed-off-by: Markus Metzger <markut.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-05tracing/ftrace: fix the check of ftrace_trace_taskFrederic Weisbecker
Impact: fix default empty traces on function-graph-tracer The actual ftrace_trace_task() checks if ftrace_pid_trace is allocated and return 1 if it is true. If it is NULL, it will check the bit of pid tracing flag for the current task (which are not set by default). So by default, a task is not traced. Actually all tasks should be traced by default and filter_by_pid when ftrace_pid_trace is allocated. The appropriate condition should be to return 1 if filter_by_pid is set. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acke-dby: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-05Merge branches 'tracing/ftrace', 'tracing/function-graph-tracer' and ↵Ingo Molnar
'tracing/urgent' into tracing/core
2008-12-04tracing/function-graph-tracer: handle ftrace_printk entriesFrederic Weisbecker
Handle the TRACE_PRINT entries from the function grapg tracer and output them as a C comment just below the function that called it, as if it was a comment inside this function. Example with an ftrace_printk inside might_sleep() function: void __might_sleep(char *file, int line) { static unsigned long prev_jiffy; /* ratelimiting */ ftrace_printk("Hi I'm a comment in might_sleep() :-)"); A chunk of a resulting trace: 0) | _reiserfs_free_block() { 0) | reiserfs_read_bitmap_block() { 0) | __bread() { 0) | __getblk() { 0) | __find_get_block() { 0) 0.698 us | mark_page_accessed(); 0) 2.267 us | } 0) | __might_sleep() { 0) | /* Hi I'm a comment in might_sleep() :-) */ 0) 1.321 us | } 0) 5.872 us | } 0) 7.313 us | } 0) 8.718 us | } And this patch brings two minor fixes: - The newline after a switch-out task has disappeared - The "|" sign just before the cpu number on task-switch has been deleted. 0) 0.616 us | pick_next_task_rt(); 0) 1.457 us | _spin_trylock(); 0) 0.653 us | _spin_unlock(); 0) 0.728 us | _spin_trylock(); 0) 0.631 us | _spin_unlock(); 0) 0.729 us | native_load_sp0(); 0) 0.593 us | native_load_tls(); ------------------------------------------ 0) cat-2834 => migrati-3 ------------------------------------------ 0) | finish_task_switch() { 0) 0.841 us | _spin_unlock_irq(); 0) 0.616 us | post_schedule_rt(); 0) 3.882 us | } Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-04tracing: fix typo and missing inline functionIngo Molnar
Impact: fix build bugs Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-04ftrace: use struct pidSteven Rostedt
Impact: clean up, extend PID filtering to PID namespaces Eric Biederman suggested using the struct pid for filtering on pids in the kernel. This patch is based off of a demonstration of an implementation that Eric sent me in an email. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-04ftrace: trace single pid for function graph tracerSteven Rostedt
Impact: New feature This patch makes the changes to set_ftrace_pid apply to the function graph tracer. # echo $$ > /debugfs/tracing/set_ftrace_pid # echo function_graph > /debugfs/tracing/current_tracer Will cause only the current task to be traced. Note, the trace flags are also inherited by child processes, so the children of the shell will also be traced. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-04ftrace: graph of a single functionSteven Rostedt
This patch adds the file: /debugfs/tracing/set_graph_function which can be used along with the function graph tracer. When this file is empty, the function graph tracer will act as usual. When the file has a function in it, the function graph tracer will only trace that function. For example: # echo blk_unplug > /debugfs/tracing/set_graph_function # cat /debugfs/tracing/trace [...] ------------------------------------------ | 2) make-19003 => kjournald-2219 ------------------------------------------ 2) | blk_unplug() { 2) | dm_unplug_all() { 2) | dm_get_table() { 2) 1.381 us | _read_lock(); 2) 0.911 us | dm_table_get(); 2) 1. 76 us | _read_unlock(); 2) + 12.912 us | } 2) | dm_table_unplug_all() { 2) | blk_unplug() { 2) 0.778 us | generic_unplug_device(); 2) 2.409 us | } 2) 5.992 us | } 2) 0.813 us | dm_table_put(); 2) + 29. 90 us | } 2) + 34.532 us | } You can add up to 32 functions into this file. Currently we limit it to 32, but this may change with later improvements. To add another function, use the append '>>': # echo sys_read >> /debugfs/tracing/set_graph_function # cat /debugfs/tracing/set_graph_function blk_unplug sys_read Using the '>' will clear out the function and write anew: # echo sys_write > /debug/tracing/set_graph_function # cat /debug/tracing/set_graph_function sys_write Note, if you have function graph running while doing this, the small time between clearing it and updating it will cause the graph to record all functions. This should not be an issue because after it sets the filter, only those functions will be recorded from then on. If you need to only record a particular function then set this file first before starting the function graph tracer. In the future this side effect may be corrected. The set_graph_function file is similar to the set_ftrace_filter but it does not take wild cards nor does it allow for more than one function to be set with a single write. There is no technical reason why this is the case, I just do not have the time yet to implement that. Note, dynamic ftrace must be enabled for this to appear because it uses the dynamic ftrace records to match the name to the mcount call sites. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-03ftrace: function graph return for function entrySteven Rostedt
Impact: feature, let entry function decide to trace or not This patch lets the graph tracer entry function decide if the tracing should be done at the end as well. This requires all function graph entry functions return 1 if it should trace, or 0 if the return should not be traced. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-27Merge branches 'tracing/blktrace', 'tracing/ftrace', ↵Ingo Molnar
'tracing/function-graph-tracer' and 'tracing/power-tracer' into tracing/core
2008-11-26tracing: add "power-tracer": C/P state tracer to help power optimizationArjan van de Ven
Impact: new "power-tracer" ftrace plugin This patch adds a C/P-state ftrace plugin that will generate detailed statistics about the C/P-states that are being used, so that we can look at detailed decisions that the C/P-state code is making, rather than the too high level "average" that we have today. An example way of using this is: mount -t debugfs none /sys/kernel/debug echo cstate > /sys/kernel/debug/tracing/current_tracer echo 1 > /sys/kernel/debug/tracing/tracing_enabled sleep 1 echo 0 > /sys/kernel/debug/tracing/tracing_enabled cat /sys/kernel/debug/tracing/trace | perl scripts/trace/cstate.pl > out.svg Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26ftrace: add thread comm to function graph tracerSteven Rostedt
Impact: enhancement to function graph tracer Export the trace_find_cmdline so the function graph tracer can use it to print the comms of the threads. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>