path: root/kernel
Age  Commit message  Author
2009-11-26  hw-breakpoints: Simplify error handling in breakpoint creation requests  (Frederic Weisbecker)
This simplifies the error handling when we create a breakpoint. We don't need to check the NULL return value corner case anymore since we have improved perf_event_create_kernel_counter() to always return an error code in the failure case. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Prasad <prasad@linux.vnet.ibm.com> LKML-Reference: <1259210142-5714-3-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-26  hw-breakpoints: Improve in-kernel event creation error granularity  (Frederic Weisbecker)
In the failure case, perf_event_create_kernel_counter() returns NULL instead of an error code, which leaves the outermost callers unable to tell the user what actually went wrong. They often just return -EINVAL, which helps nobody when the real cause is, say, a memory allocation failure. This patch makes perf_event_create_kernel_counter() always return a detailed error code. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Prasad <prasad@linux.vnet.ibm.com> LKML-Reference: <1259210142-5714-2-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
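These two changes converge on the usual kernel ERR_PTR convention; a minimal sketch of the caller pattern they enable (argument names are illustrative, not quoted from the patches):

	struct perf_event *bp;

	bp = perf_event_create_kernel_counter(attr, cpu, pid, triggered);
	if (IS_ERR(bp))
		return PTR_ERR(bp);	/* -ENOMEM, -EINVAL, ... propagated as-is */

	/* the old 'if (!bp)' NULL check is no longer needed */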
2009-11-26  ksym_tracer: Fix breakpoint removal after modification  (Frederic Weisbecker)
The error path of a breakpoint modification is broken in the ksym tracer. A modified breakpoint hlist node is immediately released after its removal. Also we leak a breakpoint in this case. Fix the path. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Prasad <prasad@linux.vnet.ibm.com> LKML-Reference: <1259210142-5714-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-25  trace/syscalls: Change ret param in struct syscall_trace_exit to long  (Tom Zanussi)
Commit ee949a86b3aef15845ea677aa60231008de62672 ("tracing/syscalls: Use long for syscall ret format and field definitions") changed the syscall exit return type to long, but forgot to change it in the struct. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1259133299-23594-3-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-24  perf_events: Fix bad software/trace event recursion counting  (Frederic Weisbecker)
Commit 4ed7c92d68a5387ba5f7030dc76eab03558e27f5 (perf_events: Undo some recursion damage) introduced bad reference counting of the recursion context: putting the context behaves like getting it, which drops every software/trace event after the first one in a context. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1259091502-5171-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-24  perf_events: Fix bogus copy_to_user() in perf_event_read_group()  (Stephane Eranian)
When using an event group, the value and id for non-leader events were wrong due to an invalid offset into the outgoing buffer. Signed-off-by: Stephane Eranian <eranian@google.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: paulus@samba.org Cc: perfmon2-devel@lists.sourceforge.net LKML-Reference: <4b0b71e1.0508d00a.075e.ffff84a3@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23  perf: Add kernel side syscall events support for breakpoints  (Frederic Weisbecker)
Add the remaining bits needed to support breakpoints created through the perf syscall. We don't use the software counter interface because:

 - we don't need to check against recursion, as this is already done at the hw-breakpoints arch level;

 - we already know which perf event we are dealing with by the time the event is to be committed.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Prasad <prasad@linux.vnet.ibm.com> LKML-Reference: <1258987355-8751-3-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
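For orientation, a hedged sketch of creating such a breakpoint from user space through the perf syscall; the header paths and attr field values are assumptions based on the hw-breakpoint attribute extensions of this series, not code taken from the patch:

	#include <linux/perf_event.h>
	#include <linux/hw_breakpoint.h>
	#include <string.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	static int open_watchpoint(unsigned long addr, pid_t pid)
	{
		struct perf_event_attr attr;

		memset(&attr, 0, sizeof(attr));
		attr.size          = sizeof(attr);
		attr.type          = PERF_TYPE_BREAKPOINT;	/* hardware breakpoint pmu */
		attr.bp_type       = HW_BREAKPOINT_W;		/* trigger on writes */
		attr.bp_addr       = addr;			/* address to watch */
		attr.bp_len        = HW_BREAKPOINT_LEN_4;	/* 4-byte wide watchpoint */
		attr.sample_period = 1;				/* every hit is an event */

		/* one event on the given task, any cpu, no group, no flags */
		return syscall(__NR_perf_event_open, &attr, pid, -1, -1, 0);
	}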
2009-11-23  hw-breakpoints: Check the breakpoint params from perf tools  (Frederic Weisbecker)
Perf tools create perf events disabled in the beginning. Breakpoints are then treated like ptrace temporary breakpoints, only meant to reserve a breakpoint slot until we get all the necessary information from the user. In that case we don't check the breakpointed address, as it is NULL in the ptrace case. But perf tools don't have the same purpose: events are created disabled only so that all of them can be created before they are all enabled together. We want to check the breakpoint parameters in this case. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Prasad <prasad@linux.vnet.ibm.com> LKML-Reference: <1258987355-8751-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23  hw-breakpoint: Attribute authorship of hw-breakpoint related files  (K.Prasad)
Attribute authorship to developers of hw-breakpoint related files. Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091123154713.GA5593@in.ibm.com> [ v2: moved it to latest -tip ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23  perf_events: Restore sanity to scaling land  (Peter Zijlstra)
It is quite possible to call update_event_times() on a context that isn't actually running and thereby confuse the thing. perf stat was reporting !100% scale values for software counters (commit 2e2af50b, "perf_events: Disable events when we detach them", solved the worst of that, but there was still some left). What happens is that because we are not self-reaping (we have a caring parent) there is a window between the last schedule (out) and having do_exit() called, which detaches the events. This period would be accounted as enabled,!running because event->state == INACTIVE, even though !event->ctx->is_active. Similar issues could be observed by calling read() on an event while the attached task was not scheduled in. Solve this by teaching update_event_times() about ctx->is_active. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1258984836.4531.480.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23  perf_events: Undo some recursion damage  (Peter Zijlstra)
Make perf_swevent_get_recursion_context() return a context number and disable preemption. This can later be used to remove the IRQ disabling from the trace path and to index the per-cpu buffer. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091123103819.993226816@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
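As a rough illustration of the calling convention this leaves us with (a sketch assuming the post-patch int-returning signatures, not the exact code of any caller):

	int rctx;

	rctx = perf_swevent_get_recursion_context();
	if (rctx < 0)
		return;		/* already inside a swevent at this recursion level */

	/* ... build and emit the software/trace event ... */

	perf_swevent_put_recursion_context(rctx);	/* must pair with the get */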
2009-11-23  perf_events: Fix __perf_event_exit_task() vs. update_event_times() locking  (Peter Zijlstra)
Move the update_event_times() call in __perf_event_exit_task() into list_del_event() because that holds the proper lock (ctx->lock) and seems a more natural place to do the last time update. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091123103819.842455480@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23  perf_events: Update the context time on exit  (Peter Zijlstra)
It appeared we did call update_event_times() on exit, but we failed to update the context time, which renders the former moot. Locking is a bit iffy, we call update_event_times under ctx->mutex instead of ctx->lock - the next patch fixes this. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091123103819.764207355@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23  perf_events: Disable events when we detach them  (Peter Zijlstra)
If we leave the event in STATE_INACTIVE, any read of the event after the detach will increase the running count but not the enabled count and cause funny scaling artefacts. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091123103819.689055515@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
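For context, the scaling the message refers to is the usual enabled/running extrapolation done when reading counters; a small tool-side sketch of that arithmetic (illustrative only, not code from this patch):

	/* Extrapolate a raw count the way perf tooling typically scales it:
	 * if the event only ran for part of the time it was enabled, scale
	 * the count up by enabled/running. A stale running time (the bug
	 * described above) therefore skews the result. */
	static double scaled_count(unsigned long long count,
				   unsigned long long enabled,
				   unsigned long long running)
	{
		if (!running)
			return 0.0;
		return (double)count * (double)enabled / (double)running;
	}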
2009-11-23  perf_events: Fix style nits  (Peter Zijlstra)
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091123103819.613427378@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23  perf_events: Undo copy/paste damage  (Peter Zijlstra)
We had two almost identical functions, avoid the duplication. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <20091123103819.537537928@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23  perf_events: Optimize the swcounter hotpath  (Ingo Molnar)
The structure init creates a big memcpy, which shows up big time in perf annotate output:

         :      ffffffff810a859d <__perf_sw_event>:
    1.68 :      ffffffff810a859d:  55                    push   %rbp
    1.69 :      ffffffff810a859e:  41 89 fa              mov    %edi,%r10d
    0.01 :      ffffffff810a85a1:  49 89 c9              mov    %rcx,%r9
    0.00 :      ffffffff810a85a4:  31 c0                 xor    %eax,%eax
    1.71 :      ffffffff810a85a6:  b9 16 00 00 00        mov    $0x16,%ecx
    0.00 :      ffffffff810a85ab:  48 89 e5              mov    %rsp,%rbp
    0.00 :      ffffffff810a85ae:  48 83 ec 60           sub    $0x60,%rsp
    1.52 :      ffffffff810a85b2:  48 8d 7d a0           lea    -0x60(%rbp),%rdi
   85.20 :      ffffffff810a85b6:  f3 ab                 rep stos %eax,%es:(%rdi)

None of the callees depends on the structure being pre-initialized, so only initialize ->addr. This gets rid of the memcpy overhead. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
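A hedged before/after sketch of the kind of change described (field and variable names are illustrative, not the literal diff):

	/* before: the designated initializer zero-fills the whole on-stack
	 * struct, which is the rep stos seen in the profile above:
	 *
	 *	struct perf_sample_data data = { .addr = addr };
	 */

	/* after: leave the struct uninitialized and set only the field the
	 * callees actually consume: */
	struct perf_sample_data data;

	data.addr = addr;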
2009-11-23  perf events: Do not generate function trace entries in perf code  (Ingo Molnar)
Decreases perf overhead when function tracing is enabled, by about 50%. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
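This is presumably done with the usual kbuild knob for keeping a file out of ftrace; a sketch of what that looks like in a kernel Makefile (the exact object name and guard are assumptions, not quoted from the patch):

	# kernel/Makefile (sketch): drop the mcount profiling hooks from the
	# perf core object when the function tracer is enabled
	ifdef CONFIG_FUNCTION_TRACER
	CFLAGS_REMOVE_perf_event.o = -pg
	endif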
2009-11-22  perf_events: Fix modular build  (Ingo Molnar)
Fix:

  ERROR: "perf_swevent_put_recursion_context" [fs/ext4/ext4.ko] undefined!
  ERROR: "perf_swevent_get_recursion_context" [fs/ext4/ext4.ko] undefined!

Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Jason Baron <jbaron@redhat.com> LKML-Reference: <1258864015-10579-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
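Undefined-symbol errors from modules are normally cured by exporting the symbols; a sketch of the likely fix (assumed from the error messages, not quoted from the patch):

	/* kernel/perf_event.c (sketch): make the recursion helpers usable
	 * from modular code such as ext4's trace events */
	EXPORT_SYMBOL_GPL(perf_swevent_get_recursion_context);
	EXPORT_SYMBOL_GPL(perf_swevent_put_recursion_context);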
2009-11-22  perf_event: Remove redundant zero fill  (Márton Németh)
The buffer is first zeroed out by memset(). Then strncpy() is used to fill in the content. strncpy() also pads the string up to the specified length, which is redundant after the memset(), and it does not ensure that the string is properly NUL-terminated. Use strlcpy() instead.

The semantic match that finds this kind of pattern is as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression buffer;
expression size;
expression str;
@@
	memset(buffer, 0, size);
	...
-	strncpy(
+	strlcpy(
	buffer, str, sizeof(buffer));

@@
expression buffer;
expression size;
expression str;
@@
	memset(&buffer, 0, size);
	...
-	strncpy(
+	strlcpy(
	&buffer, str, sizeof(buffer));

@@
expression buffer;
identifier field;
expression size;
expression str;
@@
	memset(buffer, 0, size);
	...
-	strncpy(
+	strlcpy(
	buffer->field, str, sizeof(buffer->field));

@@
expression buffer;
identifier field;
expression size;
expression str;
@@
	memset(&buffer, 0, size);
	...
-	strncpy(
+	strlcpy(
	buffer.field, str, sizeof(buffer.field));
// </smpl>

On strncpy() vs strlcpy() see http://www.gratisoft.us/todd/papers/strlcpy.html . Signed-off-by: Márton Németh <nm127@freemail.hu> Cc: Julia Lawall <julia@diku.dk> Cc: cocci@diku.dk Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <4B086547.5040100@freemail.hu> Signed-off-by: Ingo Molnar <mingo@elte.hu>
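To make the difference concrete, a small kernel-flavoured sketch of the pattern before and after (illustrative, not taken from the patched code):

	/* before: zero fill followed by a padding, possibly unterminated copy */
	static void set_name_strncpy(char *dst, size_t dst_size, const char *src)
	{
		memset(dst, 0, dst_size);
		strncpy(dst, src, dst_size);	/* pads again; no NUL if src is too long */
	}

	/* after: strlcpy() copies at most dst_size - 1 bytes and always
	 * NUL-terminates, so no memset() is needed */
	static void set_name_strlcpy(char *dst, size_t dst_size, const char *src)
	{
		strlcpy(dst, src, dst_size);
	}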
2009-11-22  hw-breakpoints: Remove x86 specific headers from core file  (Frederic Weisbecker)
Remove asm/processor.h and asm/debugreg.h as these headers are not used anymore in the hw-breakpoints core file. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Prasad <prasad@linux.vnet.ibm.com> LKML-Reference: <1258863695-10464-3-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-22  tracing: Forget about the NMI buffer for syscall events  (Frederic Weisbecker)
We are never in an NMI context when we commit a syscall trace to perf. So just forget about the nmi buffer there. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jason Baron <jbaron@redhat.com> LKML-Reference: <1258863695-10464-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-22  tracing: Use the perf recursion protection from trace event  (Frederic Weisbecker)
When we commit a trace to perf, we first check whether we are recursing in the same buffer so that we don't mess up the buffer with a recursing trace. But later on, we do the same check from perf to avoid commit recursion. The recursion check is desirable early, before we touch the buffer, but we only want to do it once. So export the recursion protection from perf and use it from the trace events before submitting a trace. v2: Put appropriate Reported-by tag Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Jason Baron <jbaron@redhat.com> LKML-Reference: <1258864015-10579-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-21  perf_events: Fix default watermark calculation  (Stephane Eranian)
This patch fixes the default watermark value for the sampling buffer. With the existing calculation (watermark = max(PAGE_SIZE, max_size / 2)), no notification was ever received when the buffer was exactly one page, because the threshold could never be crossed (there are no partial samples). In certain configurations there was no way to detect the problem, because there was not enough space left to store the LOST record. In fact, there may be a more generic problem here: the kernel should ensure that there is always enough space to store one LOST record. This patch sets the default watermark to half the buffer size. With such a limit, we are guaranteed to get a notification even with a single-page buffer, assuming no sample is bigger than a page. Signed-off-by: Stephane Eranian <eranian@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091120212509.344964101@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> LKML-Reference: <1256302576-6169-1-git-send-email-eranian@gmail.com>
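Worked through for the single-page case the message describes (a sketch of the arithmetic, assuming 4 KiB pages; not code from the patch):

	size_t buf_size = 4096;		/* one page of sampling buffer */

	/* old default: equal to the whole buffer, so complete samples can
	 * never cross it and no wakeup is ever generated */
	size_t old_watermark = max(PAGE_SIZE, buf_size / 2);	/* = 4096 */

	/* new default: half the buffer, crossed by any sample smaller than
	 * half a page, so a notification is guaranteed */
	size_t new_watermark = buf_size / 2;			/* = 2048 */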
2009-11-21  perf: Fix locking for PERF_FORMAT_GROUP  (Peter Zijlstra)
We should hold event->child_mutex when iterating the inherited counters, and we should hold ctx->mutex when iterating siblings. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091120212509.251030114@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-21  perf: Fix event scaling for inherited counters  (Peter Zijlstra)
Properly account the full hierarchy of counters for both the count (we already did so) and the scale times (new). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091120212509.153379276@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-21  perf: Fix time locking  (Peter Zijlstra)
Most sites updating ctx->time and event times do so under ctx->lock; make sure they all do. This was made possible by removing the __perf_event_read() call from __perf_event_sync_stat(), which already had this lock taken. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091120212509.102316434@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-21  perf: Simplify __perf_event_read  (Peter Zijlstra)
cpuctx is always active, and a task context is always active for current; the preceding condition verifies that if it is a task context, it is for current. Hence we can assume ctx->is_active. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091120212509.000272254@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-21  perf: Simplify __perf_event_sync_stat  (Peter Zijlstra)
Removes constraints from __perf_event_read() by leaving it with a single callsite; previously one callsite held ctx->lock and the other did not. Removes some superfluous code from __perf_event_sync_stat(). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091120212508.918544317@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-21  perf: Optimize __perf_event_read()  (Peter Zijlstra)
Both callers already have IRQs disabled, so there is no need to do so again. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091120212508.863685796@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-21  perf: Optimize perf_event_task_sched_out  (Peter Zijlstra)
Move the update_context_time() call out of the perf_event_task_sched_out() path and into the branch where it is needed. The call was both superfluous, because __perf_event_sched_out() already does it, and wrong, because it was done without holding ctx->lock. Place it in perf_event_sync_stat(), which is the only place it is needed and which already holds ctx->lock. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091120212508.779516394@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-21  perf: Fix PERF_FORMAT_GROUP scale info  (Peter Zijlstra)
As Corey reported, the total_enabled and total_running times could occasionally be 0, even though there were events counted. It turns out this is because we record the times before reading the counter while the latter updates the times. This patch corrects that. While looking at this code I found that there is a lot of locking iffyness around, the following patches correct most of that. Reported-by: Corey Ashford <cjashfor@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091120212508.685559857@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-21  perf: Optimize perf_event_mmap_ctx()  (Peter Zijlstra)
Remove a rcu_read_{,un}lock() pair and a few conditionals. We can remove the rcu_read_lock() by increasing the scope of one in the calling function. We can do away with the system_state check if the machine still boots after this patch (seems to be the case). We can do away with the list_empty() check because the bare list_for_each_entry_rcu() reduces to that now that we've removed everything else. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091120212508.606459548@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-21  perf: Optimize perf_event_comm_ctx()  (Peter Zijlstra)
Remove a rcu_read_{,un}lock() pair and a few conditionals. We can remove the rcu_read_lock() by increasing the scope of one in the calling function. We can do away with the system_state check if the machine still boots after this patch (seems to be the case). We can do away with the list_empty() check because the bare list_for_each_entry_rcu() reduces to that now that we've removed everything else. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091120212508.527608793@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-21  perf: Optimize perf_event_task_ctx()  (Peter Zijlstra)
Remove a rcu_read_{,un}lock() pair and a few conditionals. We can remove the rcu_read_lock() by increasing the scope of one in the calling function. We can do away with the system_state check if the machine still boots after this patch (seems to be the case). We can do away with the list_empty() check because the bare list_for_each_entry_rcu() reduces to that now that we've removed everything else. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091120212508.452227115@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-21  perf: Optimize perf_swevent_ctx_event()  (Peter Zijlstra)
Remove a rcu_read_{,un}lock() pair and a few conditionals. We can remove the rcu_read_lock() by increasing the scope of one in the calling function. We can do away with the system_state check if the machine still boots after this patch (seems to be the case). We can do away with the list_empty() check because the bare list_for_each_entry_rcu() reduces to that now that we've removed everything else. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091120212508.378188589@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-21  perf: Optimize some swcounter attr.sample_period==1 paths  (Peter Zijlstra)
Avoid the rather expensive perf_swevent_set_period() if we know we have to sample every single event anyway. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091120212508.299508332@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-21  perf: Allow for custom overflow handlers  (Peter Zijlstra)
In-kernel perf users might wish to take custom actions on the sample interrupt. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091120212508.222339539@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
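Conceptually, the overflow path then dispatches to a per-event handler when one is installed; a hedged sketch of that dispatch (the handler's exact signature in this era is an assumption on my part, not quoted from the patch):

	/* in __perf_event_overflow() (sketch): prefer a handler installed by
	 * an in-kernel user, otherwise fall back to the normal output path */
	if (event->overflow_handler)
		event->overflow_handler(event, nmi, data, regs);
	else
		perf_event_output(event, nmi, data, regs);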
2009-11-21  Merge branch 'tracing/hw-breakpoints' into perf/core  (Ingo Molnar)
Conflicts:
	arch/x86/kernel/kprobes.c
	kernel/trace/Makefile

Merge reason: hw-breakpoints perf integration is looking good in testing and in reviews, plus conflicts are mounting up - so merge & resolve. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-17  Merge branch 'perf/core' into perf/probes  (Ingo Molnar)
Resolved merge conflict in tools/perf/Makefile Merge reason: we want to queue up a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-16  perf_event: Optimize perf_output_lock()  (Peter Zijlstra)
The purpose of perf_output_{un,}lock() is to:

 1) avoid publishing incomplete data
    [ possible when publishing a head that is ahead of an entry that is still being written ]

 2) guarantee fwd progress
    [ a simple refcount on pending writers doesn't need to drop to 0, making it so would end up implementing something like forced quiescent states of RCU ]

To satisfy the above without undue complexity it serializes between CPUs; this means that a pending writer can only be the same cpu in a nested context, and since (under normal operation) a cpu always makes progress we're good -- if the head is only published when the bottom-most writer completes. Now we don't need to disable IRQs in order to serialize between CPUs; disabling preemption ought to be sufficient, especially since we already deal with nesting due to NMIs. This avoids potentially expensive (and needless) local IRQ disable/enable ops. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1258373161.26714.254.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-15  Merge branches 'perf/powerpc' and 'perf/bench' into perf/core  (Ingo Molnar)
Merge reason: Both 'perf bench' and the pending PowerPC changes are now ready for the next merge window. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-15  Merge commit 'v2.6.32-rc7' into perf/core  (Ingo Molnar)
Merge reason: pick up perf fixlets Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-13  tracing: Rename 'lockdep' event subsystem into 'lock'  (Frederic Weisbecker)
The lockdep events subsystem gathers various locking-related events such as the request, release, contention or acquisition of a lock. The name of this event subsystem is a bit of a misnomer, since these events are not really related to lockdep but more generally to locking: they do not report lock dependencies or possible deadlock scenarios, just pure locking events. Hence this rename. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <1258103194-843-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-11  Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip  (Linus Torvalds)
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
	highmem: Fix debug_kmap_atomic() to also handle KM_IRQ_PTE, KM_NMI, and KM_NMI_PTE
	highmem: Fix race in debug_kmap_atomic() which could cause warn_count to underflow
	rcu: Fix long-grace-period race between forcing and initialization
	uids: Prevent tear down race
2009-11-11  Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip  (Linus Torvalds)
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
	genirq: try_one_irq() must be called with irq disabled
2009-11-10  hw-breakpoints: Fix broken hw-breakpoint sample module  (Frederic Weisbecker)
The hw-breakpoint sample module has been broken during the hw-breakpoint internals refactoring. Propagate the changes to it. Reported-by: "K. Prasad" <prasad@linux.vnet.ibm.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-11-10  ksym_tracer: Support read accesses independent of read/write.  (Paul Mundt)
All of the infrastructure already exists to support read accesses for platforms that support a read access independently of read/write (such as in the case of the SuperH UBC). This just trivially hooks up the read case by itself. Signed-off-by: Paul Mundt <lethal@linux-sh.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jan Kiszka <jan.kiszka@web.de> Cc: Jiri Slaby <jirislaby@gmail.com> Cc: Avi Kivity <avi@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Arjan van de Ven <arjan@linux.intel.com> LKML-Reference: <20091109083733.GA25848@linux-sh.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-11-08  ksym_tracer: Remove KSYM_SELFTEST_ENTRY  (Li Zefan)
The macro used to be used in both trace_selftest.c and trace_ksym.c, but no longer, so remove it from header file. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Prasad <prasad@linux.vnet.ibm.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-11-08  hw-breakpoints: Arbitrate access to pmu following registers constraints  (Frederic Weisbecker)
Allow or refuse to build a counter using the breakpoints pmu following given constraints.

We keep track of the pmu users by using three per cpu variables:

 - nr_cpu_bp_pinned stores the number of pinned cpu breakpoints counters in the given cpu

 - nr_bp_flexible stores the number of non-pinned breakpoints counters in the given cpu

 - task_bp_pinned stores the number of pinned task breakpoints in a cpu

The latter is not a simple counter but gathers the number of tasks that have n pinned breakpoints. Considering HBP_NUM the number of available breakpoint address registers:

   task_bp_pinned[0] is the number of tasks having 1 breakpoint
   task_bp_pinned[1] is the number of tasks having 2 breakpoints
   [...]
   task_bp_pinned[HBP_NUM - 1] is the number of tasks having the
   maximum number of registers (HBP_NUM).

When a breakpoint counter is created and wants an access to the pmu, we evaluate the following constraints:

== Non-pinned counter ==

- If attached to a single cpu, check:

    (per_cpu(nr_bp_flexible, cpu)
     || (per_cpu(nr_cpu_bp_pinned, cpu) + max(per_cpu(task_bp_pinned, cpu)))) < HBP_NUM

  -> If there are already non-pinned counters in this cpu, it means there is already a free slot for them. Otherwise, we check that the maximum number of per task breakpoints (for this cpu) plus the number of per cpu breakpoints (for this cpu) doesn't cover every register.

- If attached to every cpu, check:

    (per_cpu(nr_bp_flexible, *)
     || (max(per_cpu(nr_cpu_bp_pinned, *)) + max(per_cpu(task_bp_pinned, *)))) < HBP_NUM

  -> This is roughly the same, except we check the number of per cpu bp for every cpu and we keep the max one. Same for the per task breakpoints.

== Pinned counter ==

- If attached to a single cpu, check:

    ((per_cpu(nr_bp_flexible, cpu) > 1)
     + per_cpu(nr_cpu_bp_pinned, cpu) + max(per_cpu(task_bp_pinned, cpu))) < HBP_NUM

  -> Same checks as before. But now the nr_bp_flexible, if any, must keep one register at least (or flexible breakpoints will never be fed).

- If attached to every cpu, check:

    ((per_cpu(nr_bp_flexible, *) > 1)
     + max(per_cpu(nr_cpu_bp_pinned, *)) + max(per_cpu(task_bp_pinned, *))) < HBP_NUM

Changes in v2:
 - Counter -> event rename

Changes in v5:
 - Fix unreleased non-pinned task-bound-only counters. We only released them on the first cpu. (Thanks to Paul Mackerras for reporting that)

Changes in v6:
 - Currently, event scheduling is done in this order: cpu context pinned + cpu context non-pinned + task context pinned + task context non-pinned events. So our constraints are right in theory but not in practice, because non-pinned counters may be scheduled before we can apply every possible pinned counter. So consider non-pinned counters as pinned for now.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jan Kiszka <jan.kiszka@web.de> Cc: Jiri Slaby <jirislaby@gmail.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Avi Kivity <avi@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Paul Mundt <lethal@linux-sh.org>
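A loose C rendering of the "non-pinned counter, single cpu" constraint quoted above; max_task_bp_pinned() is a hypothetical helper standing in for max(per_cpu(task_bp_pinned, cpu)), and this mirrors the formula in the message, not the exact kernel implementation:

	static bool flexible_bp_fits(int cpu)
	{
		unsigned int flexible    = per_cpu(nr_bp_flexible, cpu);
		unsigned int pinned_cpu  = per_cpu(nr_cpu_bp_pinned, cpu);
		unsigned int pinned_task = max_task_bp_pinned(cpu);	/* hypothetical helper */

		if (flexible)
			return true;	/* flexible events already share a slot */

		/* otherwise the pinned users must leave at least one register free */
		return pinned_cpu + pinned_task < HBP_NUM;
	}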