aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2009-10-13perf tools: Remove expensive old debug code from perf topMike Galbraith
Calling gettimeofday() at high frequency is painful for handicapped boxen. The spot calling gettimeofday() is old unneeded debug code, so remove it. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1255438640.7173.1.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13perf tools: Do not manually count string lengthsVincent Legoll
Use strlen & macros instead of manually counting string lengths as this is error prone and may lend to bugs. Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com> Cc: Linus Torvalds <torvalds@osdl.org> LKML-Reference: <4727185d0910130118m5387058dndb02ac9b384af9f0@mail.gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12perf sched: Add -C option to measure on a specific CPUMike Galbraith
To refresh, trying to sched record only one CPU results in bogus latencies as below. I fixed^Wmade it stop doing the bad thing today, by following task migration events properly. Before: marge:/root/tmp # taskset -c 1 perf sched record -C 0 -- sleep 10 marge:/root/tmp # perf sched lat ----------------------------------------------------------------------------------------- Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | ----------------------------------------------------------------------------------------- Xorg:4943 | 1.290 ms | 1 | avg: 1670.132 ms | max: 1670.132 ms | hald-addon-stor:3569 | 0.091 ms | 3 | avg: 658.609 ms | max: 1975.797 ms | hald-addon-stor:3573 | 0.209 ms | 4 | avg: 499.138 ms | max: 1990.565 ms | audispd:4270 | 0.012 ms | 1 | avg: 0.015 ms | max: 0.015 ms | .... marge:/root/tmp # perf sched trace|grep 'Xorg:4943' swapper-0 [000] 401.184013288: sched_stat_runtime: task: Xorg:4943 runtime: 1233188 [ns], vruntime: 19105169779 [ns] rt2870TimerQHan-4947 [000] 402.854140127: sched_stat_wait: task: Xorg:4943 wait: 580073 [ns] rt2870TimerQHan-4947 [000] 402.854141770: sched_migrate_task: task Xorg:4943 [140] from: 1 to: 0 rt2870TimerQHan-4947 [000] 402.854143854: sched_stat_wait: task: Xorg:4943 wait: 0 [ns] rt2870TimerQHan-4947 [000] 402.854145397: sched_switch: task rt2870TimerQHan:4947 [140] (D) ==> Xorg:4943 [140] Xorg-4943 [000] 402.854193133: sched_stat_runtime: task: Xorg:4943 runtime: 56546 [ns], vruntime: 11766332500 [ns] Xorg-4943 [000] 402.854196842: sched_switch: task Xorg:4943 [140] (S) ==> swapper:0 [140] After: marge:/root/tmp # taskset -c 1 perf sched record -C 0 -- sleep 10 marge:/root/tmp # perf sched lat ----------------------------------------------------------------------------------------- Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | ----------------------------------------------------------------------------------------- amarokapp:11150 | 271.297 ms | 878 | avg: 0.130 ms | max: 1.057 ms | konsole:5965 | 1.370 ms | 12 | avg: 0.092 ms | max: 0.855 ms | Xorg:4943 | 179.980 ms | 1109 | avg: 0.087 ms | max: 1.206 ms | hald-addon-stor:3574 | 0.212 ms | 9 | avg: 0.040 ms | max: 0.169 ms | hald-addon-stor:3570 | 0.223 ms | 9 | avg: 0.037 ms | max: 0.223 ms | klauncher:5864 | 0.550 ms | 8 | avg: 0.032 ms | max: 0.048 ms | The 'Maximum delay ms' results are now sane. Signed-off-by: Mike Galbraith <efault@gmx.de> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12perf tools: Fix counter sample frequency breakageMike Galbraith
Commit 42e59d7d19dc4b4 switched to a default sample frequency of 1KHz, which overrides any user supplied count, causing sched, top and timechart to miss events due to their discrete events being flagged PERF_SAMPLE_PERIOD. Override default sample frequency when the user profides a period count, and make both record and top honor that user supplied option. Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arjan van de Ven <arjan@infradead.org> LKML-Reference: <1255326963.15107.2.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08perf tools: Provide backward compatibility with previous perf.data versionFrederic Weisbecker
We have merged the trace.info file into perf.data by adding one section in the perf headers. This makes it incompatible with previous version: the new perf tools can't read the older perf.data. To support the previous format, we check the headers size. If they have the same size than in the previous format, then ignore the trace info section that doesn't exist. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1255032449-12022-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08perf tools: Fix thread comm resolution in perf schedFrederic Weisbecker
This reverts commit 9a92b479b2f088ee2d3194243f4c8e59b1b8c9c2 ("perf tools: Improve thread comm resolution in perf sched") and fixes the real bug. The bug was elsewhere: We are failing to resolve thread names in perf sched because the table of threads we are building, on top of comm events, has a per process granularity. But perf sched, unlike the other perf tools, needs a per thread granularity as we are profiling every tasks individually. So fix it by building our threads table using the tid instead of the pid as the thread identifier. v2: Revert the previous fix - it is not really needed Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1255028657-11158-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08perf tools: Improve kernel/modules symbol lookupArnaldo Carvalho de Melo
This removes the ovelapping of vmlinux addresses with modules, using the ELF section name when using --vmlinux and creating a unique DSO name when using /proc/kallsyms ([kernel].N). This is done by creating multiple 'struct map' instances for address ranges backed by DSOs that have just the symbols for that range and a name that is derived from the ELF section name.o Now it is possible to ask for just the symbols in some particular kernel section: $ perf report -m --vmlinux ../build/tip-recvmmsg/vmlinux \ --dsos [kernel].vsyscall_fn | head -15 52.73% Xorg [.] vread_hpet 18.61% firefox [.] vread_hpet 14.50% npviewer.bin [.] vread_hpet 6.83% compiz [.] vread_hpet 5.73% glxgears [.] vread_hpet 0.63% java [.] vread_hpet 0.30% gnome-terminal [.] vread_hpet 0.23% perf [.] vread_hpet 0.18% xchat [.] vread_hpet $ Now we don't have to first lookup the list of modules and then, if it fails, vmlinux symbols, its just a simple lookup for the map then the symbols, just like for threads. Reports generated using /proc/kallsyms and --vmlinux should provide the same results, modulo the DSO name for sections other than ".text". But they don't right now because things like: ffffffff81011c20-ffffffff81012068 system_call ffffffff81011c30-ffffffff81011c9b system_call_after_swapgs ffffffff81011c9c-ffffffff81011cb6 system_call_fastpath ffffffff81011cb7-ffffffff81011cbb ret_from_sys_call I.e. overlapping symbols, again some ASM special case that we have to fixup. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1254934136-8503-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08perf tools: Up the verbose level for some really verbose stuffArnaldo Carvalho de Melo
Like printing every symbol created. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1254923340-4870-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08perf tools: Improve thread comm resolution in perf schedFrederic Weisbecker
When we get sched traces that involve a task that was already created before opening the event, we won't have the comm event for it. So if we can't find the comm event for a given thread, we look at the traces that may contain these informations. Before: ata/1:371 | 0.000 ms | 1 | avg: 3988.693 ms | max: 3988.693 ms | kondemand/1:421 | 0.096 ms | 3 | avg: 345.346 ms | max: 1035.989 ms | kondemand/0:420 | 0.025 ms | 3 | avg: 421.332 ms | max: 964.014 ms | :5124:5124 | 0.103 ms | 5 | avg: 74.082 ms | max: 277.194 ms | :6244:6244 | 0.691 ms | 9 | avg: 125.655 ms | max: 271.306 ms | firefox:5080 | 0.924 ms | 5 | avg: 53.833 ms | max: 257.828 ms | npviewer.bin:6225 | 21.871 ms | 53 | avg: 22.462 ms | max: 220.835 ms | :6245:6245 | 9.631 ms | 21 | avg: 41.864 ms | max: 213.349 ms | After: ata/1:371 | 0.000 ms | 1 | avg: 3988.693 ms | max: 3988.693 ms | kondemand/1:421 | 0.096 ms | 3 | avg: 345.346 ms | max: 1035.989 ms | kondemand/0:420 | 0.025 ms | 3 | avg: 421.332 ms | max: 964.014 ms | firefox:5124 | 0.103 ms | 5 | avg: 74.082 ms | max: 277.194 ms | npviewer.bin:6244 | 0.691 ms | 9 | avg: 125.655 ms | max: 271.306 ms | firefox:5080 | 0.924 ms | 5 | avg: 53.833 ms | max: 257.828 ms | npviewer.bin:6225 | 21.871 ms | 53 | avg: 22.462 ms | max: 220.835 ms | npviewer.bin:6245 | 9.631 ms | 21 | avg: 41.864 ms | max: 213.349 ms | Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1255012632-7882-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08perf tools: Unify perf.data mapping and events handlingFrederic Weisbecker
This librarizes the perf.data file mapping and handling in various perf tools, roughly reducing the amount of code and fixing the places that mmap from beginning of the file whereas we want to mmap from the beginning of the data, leading to page fault because the mmap window is too small since the trace info are written in the file too. TODO: - convert perf timechart too Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arjan van de Ven <arjan@infradead.org> LKML-Reference: <20091007104729.GD5043@nowhere> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-07perf tools: Merge trace.info content into perf.dataFrederic Weisbecker
This drops the trace.info file and move its contents into the common perf.data file. This is done by creating a new trace_info section into this file. A user of perf headers needs to call perf_header__set_trace_info() to save the trace meta informations into the perf.data file. A file created by perf after his patch is unsupported by previous version because the size of the headers have increased. That said, it's two new fields that have been added in the end of the headers, and those could be ignored by previous versions if they just handled the dynamic header size and then ignore the unknow part. The offsets guarantee the compatibility. We'll do a -stable fix for that. But current previous versions handle the header size using its static size, not dynamic, then it's not backward compatible with trace records. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091006213643.GA5343@nowhere> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-07perf tools: Start the perf.data mapping at data offset in perf traceFrederic Weisbecker
Currently, we are mapping perf.data in the beginning of the file and use the data offset as a buffer offset. This may exceed the mapping area if the data offset is upper than page_size * mmap_window and result in a page fault (thing that happen if we merge trace.info in perf.data). Instead, let's start the mapping in the page that matches our data offset. v2: Drop a junk from another patch (trace_report() removal) Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <1254856886-10348-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06perf tools: Default to 1 KHz auto-sampling freq eventsIngo Molnar
Use auto-freq events by default in perf record and perf top. This allows more consistent hardware event sampling, regardless of the intensity of the underlying event. It also keeps us from over-sampling on larger/busier systems. (also make surrounding initializations more consistent) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06perf trace: Add string/dynamic cases to format_flagsTom Zanussi
Needed for distinguishing string fields in event stream processing. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: rostedt@goodmis.org Cc: lizf@cn.fujitsu.com Cc: hch@infradead.org Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1254809398-8078-4-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06perf trace: Add subsystem string to struct eventTom Zanussi
Needed to fully qualify event names for event stream processing. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: rostedt@goodmis.org Cc: lizf@cn.fujitsu.com Cc: hch@infradead.org Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1254809398-8078-3-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06tracing/events: Add 'signed' field to format filesTom Zanussi
The sign info used for filters in the kernel is also useful to applications that process the trace stream. Add it to the format files and make it available to userspace. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: rostedt@goodmis.org Cc: lizf@cn.fujitsu.com Cc: hch@infradead.org Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1254809398-8078-2-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06Merge branch 'perf/urgent' into perf/coreIngo Molnar
Merge reason: Upcoming patch is dependent on a fix in perf/urgent. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06perf_event: Provide vmalloc() based mmap() backingPeter Zijlstra
Some architectures such as Sparc, ARM and MIPS (basically everything with flush_dcache_page()) need to deal with dcache aliases by carefully placing pages in both kernel and user maps. These architectures typically have to use vmalloc_user() for this. However, on other architectures, vmalloc() is not needed and has the downsides of being more restricted and slower than regular allocations. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: David Miller <davem@davemloft.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1254830228.21044.272.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06perf tools: elf_sym__is_function() should accept "zero" sized functionsArnaldo Carvalho de Melo
Asm routines that end up having size equal to zero are not really zero sized, and as now we do kernel_maps__fixup_sym_end, at least for kernel routines this gets fixed. A similar fixup needs to be done for the userspace bits as well, but as this fixup started only because in /proc/kallsyms we don't have the end address nor the function size, it appeared here first. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1254796503-27203-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06perf trace: Update eval_flag() flags array to match interrupt.hTom Zanussi
Add missing BLOCK_IOPOLL_SOFTIRQ entry. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: rostedt@goodmis.org Cc: lizf@cn.fujitsu.com Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1254808849-7829-3-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06perf trace: Remove unused code in builtin-trace.cTom Zanussi
And some minor whitespace cleanup. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: rostedt@goodmis.org Cc: lizf@cn.fujitsu.com Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1254808849-7829-2-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-05perf report: Use kernel_maps__find_symbol as fallback to find vdsos, etcArnaldo Carvalho de Melo
In resolve_symbol, as we're moving to breaking the kernel symbols list per address ranges, i.e. kernel linking sections, so that we don't have a big kernel_map that in its range covers what is in the modules. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-05perf tools: /proc/modules names don't always match its nameArnaldo Carvalho de Melo
$ cut -d' ' -f1 /proc/modules|grep _|wc -l 29 $ cut -d' ' -f1 /proc/modules|grep _|sed 's/$/.ko'/g|while read n;do find /lib/modules/`uname -r` -name $n;done|wc -l 12 For instance: $ grep ^aes_x86 /proc/modules aes_x86_64 9056 2 - Live 0xffffffffa0091000 $ l /lib/modules/2.6.31-tip/kernel/arch/x86/crypto/aes-x86_64.ko -rw-r--r-- 1 root root 136438 2009-09-22 19:05 /lib/modules/2.6.31-tip/kernel/arch/x86/crypto/aes-x86_64.ko Handle that by introducing a strxfrchar routine that replaces dashes with underscores when matching file names to loaded modules. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-05perf tools: Create maps for modules when processing kallsymsArnaldo Carvalho de Melo
So that we get kallsyms processing closer to vmlinux + modules symtabs processing. One change in behaviour is that since when one specifies --vmlinux -m should be used to ask for modules, so it is now for kallsyms as well. Also continue if one manages to load the vmlinux data but module processing fails, so that at least some analisys can be done with part of the needed symbols. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-05perf top: Keep the default of asking for kernel module symbolsArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-04perf: Propagate term signal to childChris Wilson
If we launch the child on behalf of the user, ensure that it dies along with ourselves when we are interrupted. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> LKML-Reference: <1254616502-4728-1-git-send-email-chris@chris-wilson.co.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-04perf tools: Remove show_mask bitmaskArnaldo Carvalho de Melo
As it was not being exposed via any command line and with --dsos/--comms we can do this and even more, like asking for just kernel + some module: [root@doppio linux-2.6-tip]# perf report --dsos \[kernel\],\[drm\] --vmlinux /home/acme/git/build/tip-recvmmsg/vmlinux --modules | head -15 # Samples: 619669 # # Overhead Command Shared Object Symbol # ........ ............... ............. ...... # 7.12% swapper [kernel] [k] read_hpet 6.86% init [kernel] [k] read_hpet 6.22% init [kernel] [k] mwait_idle_with_hints 5.34% swapper [kernel] [k] mwait_idle_with_hints 3.01% firefox [kernel] [.] vread_hpet 2.14% Xorg [drm] [k] drm_clflush_pages 2.09% pidgin [kernel] [.] vread_hpet 1.58% npviewer.bin [kernel] [.] vread_hpet 1.37% swapper [kernel] [k] hpet_next_event 1.23% Xorg [kernel] [k] read_hpet [root@doppio linux-2.6-tip]# Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091003233048.GA30535@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-03perf tools: Move hist_entry__add common code to hist.cArnaldo Carvalho de Melo
Now perf report and annotate do the callgraph/hit processing in their specialized hist_entry__add functions. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-02perf tools: Rewrite and improve support for kernel modulesArnaldo Carvalho de Melo
Representing modules as struct map entries, backed by a DSO, etc, using /proc/modules to find where the module is loaded. DSOs now can have a short and long name, so that in verbose mode we can show exactly which .ko or vmlinux image was used. As kernel modules now are a DSO separate from the kernel, we can ask for just the hits for a particular set of kernel modules, just like we can do with shared libraries: [root@doppio linux-2.6-tip]# perf report -n --vmlinux /home/acme/git/build/tip-recvmmsg/vmlinux --modules --dsos \[drm\] | head -15 84.58% 13266 Xorg [k] drm_clflush_pages 4.02% 630 Xorg [k] trace_kmalloc.clone.0 3.95% 619 Xorg [k] drm_ioctl 2.07% 324 Xorg [k] drm_addbufs 1.68% 263 Xorg [k] drm_gem_close_ioctl 0.77% 120 Xorg [k] drm_setmaster_ioctl 0.70% 110 Xorg [k] drm_lastclose 0.68% 106 Xorg [k] drm_open 0.54% 85 Xorg [k] drm_mm_search_free [root@doppio linux-2.6-tip]# Specifying --dsos /lib/modules/2.6.31-tip/kernel/drivers/gpu/drm/drm.ko would have the same effect. Allowing specifying just 'drm.ko' is left for another patch. Processing kallsyms so that per kernel module struct map are instantiated was also left for another patch. That will allow removing the module name from each of its symbols. struct symbol was reduced by removing the ->module backpointer and moving it (well now the map) to struct symbol_entry in perf top, that is its only user right now. The total linecount went down by ~500 lines. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Avi Kivity <avi@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-01perf tools: Run generate-cmdlist.sh properlyMulyadi Santosa
Right now generate-cmdlist.sh is not executable, so we should call it as an argument ".". This fixes cases where due to different umask defaults the generate-cmdlist.sh script is not executable in a kernel tree checkout. Signed-off-by: Mulyadi Santosa <mulyadi.santosa@gmail.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <f284c33d0909251201w422e9687x8cd3a784e85adf7d@mail.gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-01perf timechart: Add a power-only modeArjan van de Ven
For doing work on the Linux power management components, I need to make long (30+ seconds) traces. Currently, this then results in a HUGE svg file, with mostly process data that isn't interesting. This patch adds a --power-only mode to perf timechart that only outputs the CPU power section of the SVG; this significantly reduces the size of the SVG file, making even 30+ second traces viewable with inkscape. As a minor tweak for the same effect, the minimum text size is decreased; current inkscape cannot zoom in deep enough to show text this small, but it reduces inkscape compute time. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: peterz@infradead.org LKML-Reference: <20090924154013.0675ab71@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-01perf top: Remove dead {min,max}_ip unused variablesArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20090924212400.GA15321@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-30perf trace: Remove dead codeArnaldo Carvalho de Melo
Several variables are not used at all, cut'n'paste leftovers. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> LKML-Reference: <20090928200818.GF3361@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-30perf sched: Remove dead codeArnaldo Carvalho de Melo
Several variables are not used at all, cut'n'paste leftovers. Also check if the sample_type is RAW earlier, to avoid needless searches. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-30perf tools: Use rb_tree for mapsArnaldo Carvalho de Melo
Threads can have many and kernel modules will be represented as a tree of maps as well. Ah, and for a perf.data with 146607 samples: Before: [root@doppio ~]# perf stat -r 5 perf report > /dev/null Performance counter stats for 'perf report' (5 runs): 699.823680 task-clock-msecs # 0.991 CPUs ( +- 0.454% ) 74 context-switches # 0.000 M/sec ( +- 1.709% ) 2 CPU-migrations # 0.000 M/sec ( +- 17.008% ) 23114 page-faults # 0.033 M/sec ( +- 0.000% ) 1381257019 cycles # 1973.721 M/sec ( +- 0.290% ) 1456894438 instructions # 1.055 IPC ( +- 0.007% ) 18779818 cache-references # 26.835 M/sec ( +- 0.380% ) 641799 cache-misses # 0.917 M/sec ( +- 1.200% ) 0.705972729 seconds time elapsed ( +- 0.501% ) [root@doppio ~]# After Performance counter stats for 'perf report' (5 runs): 691.261451 task-clock-msecs # 0.993 CPUs ( +- 0.307% ) 72 context-switches # 0.000 M/sec ( +- 0.829% ) 6 CPU-migrations # 0.000 M/sec ( +- 18.409% ) 23127 page-faults # 0.033 M/sec ( +- 0.000% ) 1366395876 cycles # 1976.670 M/sec ( +- 0.153% ) 1443136016 instructions # 1.056 IPC ( +- 0.012% ) 17956402 cache-references # 25.976 M/sec ( +- 0.325% ) 661924 cache-misses # 0.958 M/sec ( +- 1.335% ) 0.696127275 seconds time elapsed ( +- 0.377% ) I.e. we see some speedup too. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> LKML-Reference: <20090928174846.GA3361@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-30perf tools: Put common histogram functions in their own fileJohn Kacur
Move histogram related functions into their own files (hist.c and hist.h) and make use of them in builtin-annotate.c and builtin-report.c. Signed-off-by: John Kacur <jkacur@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <alpine.LFD.2.00.0909281531180.8316@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-30perf top: Add poll_idle to the skip listArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20090925220239.GA5488@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24perf tools: Create util/sort.and use itJohn Kacur
Create util/sort.[ch] and move common functionality for builtin-report.c and builtin-annotate.c there, and make use of it. Signed-off-by: John Kacur <jkacur@redhat.com> LKML-Reference: <alpine.LFD.2.00.0909241758390.11383@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24perf tools: Protect header files with a consistent styleJohn Kacur
There was a colorful mix of header guards - standardize them. Signed-off-by: John Kacur <jkacur@redhat.com> LKML-Reference: <alpine.LFD.2.00.0909241756530.11383@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24perf annotate: Add the cmp_null function and make use of itJohn Kacur
This function exists in builtin-report.c but not in builtin-annotate.c Functions that use cmp_null are shorter and clearer. Synchronizing functions between these two files will also make it easier to potential share code in the future. Signed-off-by: John Kacur <jkacur@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <alpine.LFD.2.00.0909241754031.11383@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24perf tools: Dont use openat()Eric Dumazet
openat() is still a young glibc facility, better to not use it in a non performance critical program (perf list) Many machines have older glibc (RHEL 4 Update 5 -> glibc-2.3.4-2.36 on my dev machine for example). Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ulrich Drepper <drepper@redhat.com> LKML-Reference: <4ABB767D.6080004@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24perf tools: Fix buffer allocationEric Dumazet
"perf top" cores dump on my dev machine, if run from a directory where vmlinux is present: *** glibc detected *** malloc(): memory corruption: 0x085670d0 *** Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: <stable@kernel.org> LKML-Reference: <4ABB6EB7.7000002@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24perf tools: .gitignore += perf*.htmlKirill Smelkov
I've tried building the docs in tools/perf/Documentation/ , and after that `git status` showed dozen of untracked htmls. Let's ignore them. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> LKML-Reference: <1253790022-10300-1-git-send-email-kirr@mns.spb.ru> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24perf tools: Handle relative paths while loading module symbolsMike Galbraith
Inform util/module.c::mod_dso__load_module_paths() that relative paths do exist in some modules.dep, and make it fail noisily should it encounter a path that it doesn't understand, or a module it cannot open. Reported-by: Avi Kivity <avi@redhat.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: rostedt@goodmis.org Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> LKML-Reference: <1253779628.10513.8.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-23perf tools: Fix module symbol loading bugMike Galbraith
Avi Kivity reported 'perf annotate' failures with modules, the requested function was not annotated. If there are no modules currently loaded, or the last module scanned is not loaded, dso__load_modules() steps on the value from dso__load_vmlinux(), so we happily load the kallsyms symbols on top of what we've already loaded. Fix that such that the total count of symbols loaded is returned. Should module symbol load fail after parsing of vmlinux, is's a hard failure, so do not silently fall-back to kallsyms. Reported-by: Avi Kivity <avi@redhat.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: rostedt@goodmis.org Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> LKML-Reference: <1253697658.11461.36.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-22perf stat: Fix zero total printoutsIngo Molnar
Before: 0 sched:sched_switch # nan M/sec After: 0 sched:sched_switch # 0.000 M/sec Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-21perf: Do the big rename: Performance Counters -> Performance EventsIngo Molnar
Bye-bye Performance Counters, welcome Performance Events! In the past few months the perfcounters subsystem has grown out its initial role of counting hardware events, and has become (and is becoming) a much broader generic event enumeration, reporting, logging, monitoring, analysis facility. Naming its core object 'perf_counter' and naming the subsystem 'perfcounters' has become more and more of a misnomer. With pending code like hw-breakpoints support the 'counter' name is less and less appropriate. All in one, we've decided to rename the subsystem to 'performance events' and to propagate this rename through all fields, variables and API names. (in an ABI compatible fashion) The word 'event' is also a bit shorter than 'counter' - which makes it slightly more convenient to write/handle as well. Thanks goes to Stephane Eranian who first observed this misnomer and suggested a rename. User-space tooling and ABI compatibility is not affected - this patch should be function-invariant. (Also, defconfigs were not touched to keep the size down.) This patch has been generated via the following script: FILES=$(find * -type f | grep -vE 'oprofile|[^K]config') sed -i \ -e 's/PERF_EVENT_/PERF_RECORD_/g' \ -e 's/PERF_COUNTER/PERF_EVENT/g' \ -e 's/perf_counter/perf_event/g' \ -e 's/nb_counters/nb_events/g' \ -e 's/swcounter/swevent/g' \ -e 's/tpcounter_event/tp_event/g' \ $FILES for N in $(find . -name perf_counter.[ch]); do M=$(echo $N | sed 's/perf_counter/perf_event/g') mv $N $M done FILES=$(find . -name perf_event.*) sed -i \ -e 's/COUNTER_MASK/REG_MASK/g' \ -e 's/COUNTER/EVENT/g' \ -e 's/\<event\>/event_id/g' \ -e 's/counter/event/g' \ -e 's/Counter/Event/g' \ $FILES ... to keep it as correct as possible. This script can also be used by anyone who has pending perfcounters patches - it converts a Linux kernel tree over to the new naming. We tried to time this change to the point in time where the amount of pending patches is the smallest: the end of the merge window. Namespace clashes were fixed up in a preparatory patch - and some stylistic fallout will be fixed up in a subsequent patch. ( NOTE: 'counters' are still the proper terminology when we deal with hardware registers - and these sed scripts are a bit over-eager in renaming them. I've undone some of that, but in case there's something left where 'counter' would be better than 'event' we can undo that on an individual basis instead of touching an otherwise nicely automated patch. ) Suggested-by: Stephane Eranian <eranian@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-20perf util: SVG performance improvementsArjan van de Ven
Tweak the output SVG to increase performance in SVG viewers by limiting the different types of font sizes and by smarter transformations on the text. At least with Inkscape this gives a notable performance improvement during zoom and scrolling. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090920181438.3a49cb93@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-20perf util: Make the timechart SVG width dynamicArjan van de Ven
This patch adds a command line option for timechart that allows the user to specify the width of the SVG file. This patch also makes sure that each second of recording has at least 200 units (pixels at 96 DPI) of width. This impacts recordings longer than 5 seconds; recordings shorter than 5 second will scale up to have a width of 1000 units for the whole recording (as before). Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090920181416.69570c5d@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-20perf timechart: Show the duration of scheduler delays in the SVGArjan van de Ven
Given that scheduler latencies are the hot thing nowadays, show the duration of said latencies in the SVG in text form. In addition, if the latency is more than 10 msec, pick a brighter yellow color as a way to point these long delays out. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090920181353.796f4509@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>