aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/kernel/cpu
AgeCommit message (Collapse)Author
2010-05-14x86, cacheinfo: Turn off L3 cache index disable feature in virtualized ↵Frank Arnold
environments When running a quest kernel on xen we get: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 IP: [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x2ca/0x3df PGD 0 Oops: 0000 [#1] SMP last sysfs file: CPU 0 Modules linked in: Pid: 0, comm: swapper Tainted: G W 2.6.34-rc3 #1 /HVM domU RIP: 0010:[<ffffffff8142f2fb>] [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x 2ca/0x3df RSP: 0018:ffff880002203e08 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000060 RDX: 0000000000000000 RSI: 0000000000000040 RDI: 0000000000000000 RBP: ffff880002203ed8 R08: 00000000000017c0 R09: ffff880002203e38 R10: ffff8800023d5d40 R11: ffffffff81a01e28 R12: ffff880187e6f5c0 R13: ffff880002203e34 R14: ffff880002203e58 R15: ffff880002203e68 FS: 0000000000000000(0000) GS:ffff880002200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000038 CR3: 0000000001a3c000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a44020) Stack: ffffffff810d7ecb ffff880002203e20 ffffffff81059140 ffff880002203e30 <0> ffffffff810d7ec9 0000000002203e40 000000000050d140 ffff880002203e70 <0> 0000000002008140 0000000000000086 ffff880040020140 ffffffff81068b8b Call Trace: <IRQ> [<ffffffff810d7ecb>] ? sync_supers_timer_fn+0x0/0x1c [<ffffffff81059140>] ? mod_timer+0x23/0x25 [<ffffffff810d7ec9>] ? arm_supers_timer+0x34/0x36 [<ffffffff81068b8b>] ? hrtimer_get_next_event+0xa7/0xc3 [<ffffffff81058e85>] ? get_next_timer_interrupt+0x19a/0x20d [<ffffffff8142fa23>] get_cpu_leaves+0x5c/0x232 [<ffffffff8106a7b1>] ? sched_clock_local+0x1c/0x82 [<ffffffff8106a9a0>] ? sched_clock_tick+0x75/0x7a [<ffffffff8107748c>] generic_smp_call_function_single_interrupt+0xae/0xd0 [<ffffffff8101f6ef>] smp_call_function_single_interrupt+0x18/0x27 [<ffffffff8100a773>] call_function_single_interrupt+0x13/0x20 <EOI> [<ffffffff8143c468>] ? notifier_call_chain+0x14/0x63 [<ffffffff810295c6>] ? native_safe_halt+0xc/0xd [<ffffffff810114eb>] ? default_idle+0x36/0x53 [<ffffffff81008c22>] cpu_idle+0xaa/0xe4 [<ffffffff81423a9a>] rest_init+0x7e/0x80 [<ffffffff81b10dd2>] start_kernel+0x40e/0x419 [<ffffffff81b102c8>] x86_64_start_reservations+0xb3/0xb7 [<ffffffff81b103c4>] x86_64_start_kernel+0xf8/0x107 Code: 14 d5 40 ff ae 81 8b 14 02 31 c0 3b 15 47 1c 8b 00 7d 0e 48 8b 05 36 1c 8b 00 48 63 d2 48 8b 04 d0 c7 85 5c ff ff ff 00 00 00 00 <8b> 70 38 48 8d 8d 5c ff ff ff 48 8b 78 10 ba c4 01 00 00 e8 eb RIP [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x2ca/0x3df RSP <ffff880002203e08> CR2: 0000000000000038 ---[ end trace a7919e7f17c0a726 ]--- The L3 cache index disable feature of AMD CPUs has to be disabled if the kernel is running as guest on top of a hypervisor because northbridge devices are not available to the guest. Currently, this fixes a boot crash on top of Xen. In the future this will become an issue on KVM as well. Check if northbridge devices are present and do not enable the feature if there are none. [ hpa: backported to 2.6.34 ] Signed-off-by: Frank Arnold <frank.arnold@amd.com> LKML-Reference: <1271945222-5283-3-git-send-email-bp@amd64.org> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: <stable@kernel.org>
2010-05-03powernow-k8: Fix frequency reportingMark Langsdorf
With F10, model 10, all valid frequencies are in the ACPI _PST table. Cc: <stable@kernel.org> # 33.x 32.x Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> LKML-Reference: <1270065406-1814-6-git-send-email-bp@amd64.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Thomas Renninger <trenn@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-28Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip: x86: Disable large pages on CPUs with Atom erratum AAE44 x86-64: Clear a 64-bit FS/GS base on fork if selector is nonzero x86, mrst: Conditionally register cpu hotplug notifier for apbt
2010-04-24VMware Balloon driverDmitry Torokhov
This is a standalone version of VMware Balloon driver. Ballooning is a technique that allows hypervisor dynamically limit the amount of memory available to the guest (with guest cooperation). In the overcommit scenario, when hypervisor set detects that it needs to shuffle some memory, it instructs the driver to allocate certain number of pages, and the underlying memory gets returned to the hypervisor. Later hypervisor may return memory to the guest by reattaching memory to the pageframes and instructing the driver to "deflate" balloon. We are submitting a standalone driver because KVM maintainer (Avi Kivity) expressed opinion (rightly) that our transport does not fit well into virtqueue paradigm and thus it does not make much sense to integrate with virtio. There were also some concerns whether current ballooning technique is the right thing. If there appears a better framework to achieve this we are prepared to evaluate and switch to using it, but in the meantime we'd like to get this driver upstream. We want to get the driver accepted in distributions so that users do not have to deal with an out-of-tree module and many distributions have "upstream first" requirement. The driver has been shipping for a number of years and users running on VMware platform will have it installed as part of VMware Tools even if it will not come from a distribution, thus there should not be additional risk in pulling the driver into mainline. The driver will only activate if host is VMware so everyone else should not be affected at all. Signed-off-by: Dmitry Torokhov <dtor@vmware.com> Cc: Avi Kivity <avi@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-23x86: Disable large pages on CPUs with Atom erratum AAE44H. Peter Anvin
Atom erratum AAE44/AAF40/AAG38/AAH41: "If software clears the PS (page size) bit in a present PDE (page directory entry), that will cause linear addresses mapped through this PDE to use 4-KByte pages instead of using a large page after old TLB entries are invalidated. Due to this erratum, if a code fetch uses this PDE before the TLB entry for the large page is invalidated then it may fetch from a different physical address than specified by either the old large page translation or the new 4-KByte page translation. This erratum may also cause speculative code fetches from incorrect addresses." [http://download.intel.com/design/processor/specupdt/319536.pdf] Where as commit 211b3d03c7400f48a781977a50104c9d12f4e229 seems to workaround errata AAH41 (mixed 4K TLBs) it reduces the window of opportunity for the bug to occur and does not totally remove it. This patch disables mixed 4K/4MB page tables totally avoiding the page splitting and not tripping this processor issue. This is based on an original patch by Colin King. Originally-by: Colin Ian King <colin.king@canonical.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <1269271251-19775-1-git-send-email-colin.king@canonical.com> Cc: <stable@kernel.org>
2010-04-06perf, x86: Enable Nehalem-EX supportVince Weaver
According to Intel Software Devel Manual Volume 3B, the Nehalem-EX PMU is just like regular Nehalem (except for the uncore support, which is completely different). Signed-off-by: Vince Weaver <vweaver1@eecs.utk.edu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Lin Ming <ming.m.lin@intel.com> LKML-Reference: <alpine.DEB.2.00.1004060956580.1417@cl320.eecs.utk.edu> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-05Merge branch 'master' into export-slabhTejun Heo
2010-04-02perf, x86: Fix callgraphs of 32-bit processes on 64-bit kernelsTorok Edwin
When profiling a 32-bit process on a 64-bit kernel, callgraph tracing stopped after the first function, because it has seen a garbage memory address (tried to interpret the frame pointer, and return address as a 64-bit pointer). Fix this by using a struct stack_frame with 32-bit pointers when the TIF_IA32 flag is set. Note that TIF_IA32 flag must be used, and not is_compat_task(), because the latter is only set when the 32-bit process is executing a syscall, which may not always be the case (when tracing page fault events for example). Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paul Mackerras <paulus@samba.org> Cc: x86@kernel.org Cc: linux-kernel@vger.kernel.org LKML-Reference: <1268820436-13145-1-git-send-email-edwintorok@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-02perf, x86: Fix AMD hotplug & constraint initializationPeter Zijlstra
Commit 3f6da39 ("perf: Rework and fix the arch CPU-hotplug hooks") moved the amd northbridge allocation from CPUS_ONLINE to CPUS_PREPARE_UP however amd_nb_id() doesn't work yet on prepare so it would simply bail basically reverting to a state where we do not properly track node wide constraints - causing weird perf results. Fix up the AMD NorthBridge initialization code by allocating from CPU_UP_PREPARE and installing it from CPU_STARTING once we have the proper nb_id. It also properly deals with the allocation failing. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> [ robustify using amd_has_nb() ] Signed-off-by: Stephane Eranian <eranian@google.com> LKML-Reference: <1269353485.5109.48.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-01perf: Use hot regs with software sched switch/migrate eventsFrederic Weisbecker
Scheduler's task migration events don't work because they always pass NULL regs perf_sw_event(). The event hence gets filtered in perf_swevent_add(). Scheduler's context switches events use task_pt_regs() to get the context when the event occured which is a wrong thing to do as this won't give us the place in the kernel where we went to sleep but the place where we left userspace. The result is even more wrong if we switch from a kernel thread. Use the hot regs snapshot for both events as they belong to the non-interrupt/exception based events family. Unlike page faults or so that provide the regs matching the exact origin of the event, we need to save the current context. This makes the task migration event working and fix the context switch callchains and origin ip. Example: perf record -a -e cs Before: 10.91% ksoftirqd/0 0 [k] 0000000000000000 | --- (nil) perf_callchain perf_prepare_sample __perf_event_overflow perf_swevent_overflow perf_swevent_add perf_swevent_ctx_event do_perf_sw_event __perf_sw_event perf_event_task_sched_out schedule run_ksoftirqd kthread kernel_thread_helper After: 23.77% hald-addon-stor [kernel.kallsyms] [k] schedule | --- schedule | |--60.00%-- schedule_timeout | wait_for_common | wait_for_completion | blk_execute_rq | scsi_execute | scsi_execute_req | sr_test_unit_ready | | | |--66.67%-- sr_media_change | | media_changed | | cdrom_media_changed | | sr_block_media_changed | | check_disk_change | | cdrom_open v2: Always build perf_arch_fetch_caller_regs() now that software events need that too. They don't need it from modules, unlike trace events, so we keep the EXPORT_SYMBOL in trace_event_perf.c Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-22x86 / perf: Fix suspend to RAM on HP nx6325Rafael J. Wysocki
Commit 3f6da3905398826d85731247e7fbcf53400c18bd (perf: Rework and fix the arch CPU-hotplug hooks) broke suspend to RAM on my HP nx6325 (and most likely on other AMD-based boxes too) by allowing amd_pmu_cpu_offline() to be executed for CPUs that are going offline as part of the suspend process. The problem is that cpuhw->amd_nb may be NULL already, so the function should make sure it's not NULL before accessing the object pointed to by it. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-18Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (35 commits) perf: Fix unexported generic perf_arch_fetch_caller_regs perf record: Don't try to find buildids in a zero sized file perf: export perf_trace_regs and perf_arch_fetch_caller_regs perf, x86: Fix hw_perf_enable() event assignment perf, ppc: Fix compile error due to new cpu notifiers perf: Make the install relative to DESTDIR if specified kprobes: Calculate the index correctly when freeing the out-of-line execution slot perf tools: Fix sparse CPU numbering related bugs perf_event: Fix oops triggered by cpu offline/online perf: Drop the obsolete profile naming for trace events perf: Take a hot regs snapshot for trace events perf: Introduce new perf_fetch_caller_regs() for hot regs snapshot perf/x86-64: Use frame pointer to walk on irq and process stacks lockdep: Move lock events under lockdep recursion protection perf report: Print the map table just after samples for which no map was found perf report: Add multiple event support perf session: Change perf_session post processing functions to take histogram tree perf session: Add storage for seperating event types in report perf session: Change add_hist_entry to take the tree root instead of session perf record: Add ID and to recorded event data when recording multiple events ...
2010-03-17perf: Fix unexported generic perf_arch_fetch_caller_regsFrederic Weisbecker
perf_arch_fetch_caller_regs() is exported for the overriden x86 version, but not for the generic weak version. As a general rule, weak functions should not have their symbol exported in the same file they are defined. So let's export it on trace_event_perf.c as it is used by trace events only. This fixes: ERROR: ".perf_arch_fetch_caller_regs" [fs/xfs/xfs.ko] undefined! ERROR: ".perf_arch_fetch_caller_regs" [arch/powerpc/platforms/cell/spufs/spufs.ko] undefined! -v2: And also only build it if trace events are enabled. -v3: Fix changelog mistake Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1268697902-9518-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-14x86/mce: Fix build bug with CONFIG_PROVE_LOCKING=y && CONFIG_X86_MCE_INTEL=yIngo Molnar
Commit f56e8a076 "x86/mce: Fix RCU lockdep splats" introduced the following build bug: arch/x86/kernel/cpu/mcheck/mce.c: In function 'mce_log': arch/x86/kernel/cpu/mcheck/mce.c:166: error: 'mce_read_mutex' undeclared (first use in this function) arch/x86/kernel/cpu/mcheck/mce.c:166: error: (Each undeclared identifier is reported only once arch/x86/kernel/cpu/mcheck/mce.c:166: error: for each function it appears in.) Move the in-the-middle-of-file lock variable up to the variable definition section, the top of the .c file. Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1267830207-9474-3-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-13Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Fix pick_next_highest_task_rt() for cgroups sched: Cleanup: remove unused variable in try_to_wake_up() x86: Fix sched_clock_cpu for systems with unsynchronized TSC
2010-03-13Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, k8 nb: Fix boot crash: enable k8_northbridges unconditionally on AMD systems x86, UV: Fix target_cpus() in x2apic_uv_x.c x86: Reduce per cpu warning boot up messages x86: Reduce per cpu MCA boot up messages x86_64, cpa: Don't work hard in preserving kernel 2M mappings when using 4K already
2010-03-13Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: locking: Make sparse work with inline spinlocks and rwlocks x86/mce: Fix RCU lockdep splats rcu: Increase RCU CPU stall timeouts if PROVE_RCU ftrace: Replace read_barrier_depends() with rcu_dereference_raw() rcu: Suppress RCU lockdep warnings during early boot rcu, ftrace: Fix RCU lockdep splat in ftrace_perf_buf_prepare() rcu: Suppress __mpol_dup() false positive from RCU lockdep rcu: Make rcu_read_lock_sched_held() handle !PREEMPT rcu: Add control variables to lockdep_rcu_dereference() diagnostics rcu, cgroup: Relax the check in task_subsys_state() as early boot is now handled by lockdep-RCU rcu: Use wrapper function instead of exporting tasklist_lock sched, rcu: Fix rcu_dereference() for RCU-lockdep rcu: Make task_subsys_state() RCU-lockdep checks handle boot-time use rcu: Fix holdoff for accelerated GPs for last non-dynticked CPU x86/gart: Unexport gart_iommu_aperture Fix trivial conflicts in kernel/trace/ftrace.c
2010-03-11perf: export perf_trace_regs and perf_arch_fetch_caller_regsXiao Guangrong
Export perf_trace_regs and perf_arch_fetch_caller_regs since module will use these. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> [ use EXPORT_PER_CPU_SYMBOL_GPL() ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4B989C1B.2090407@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11perf, x86: Fix hw_perf_enable() event assignmentPeter Zijlstra
What happens is that we schedule badly like: <...>-1987 [019] 280.252808: x86_pmu_start: event-46/1300c0: idx: 0 <...>-1987 [019] 280.252811: x86_pmu_start: event-47/1300c0: idx: 1 <...>-1987 [019] 280.252812: x86_pmu_start: event-48/1300c0: idx: 2 <...>-1987 [019] 280.252813: x86_pmu_start: event-49/1300c0: idx: 3 <...>-1987 [019] 280.252814: x86_pmu_start: event-50/1300c0: idx: 32 <...>-1987 [019] 280.252825: x86_pmu_stop: event-46/1300c0: idx: 0 <...>-1987 [019] 280.252826: x86_pmu_stop: event-47/1300c0: idx: 1 <...>-1987 [019] 280.252827: x86_pmu_stop: event-48/1300c0: idx: 2 <...>-1987 [019] 280.252828: x86_pmu_stop: event-49/1300c0: idx: 3 <...>-1987 [019] 280.252829: x86_pmu_stop: event-50/1300c0: idx: 32 <...>-1987 [019] 280.252834: x86_pmu_start: event-47/1300c0: idx: 1 <...>-1987 [019] 280.252834: x86_pmu_start: event-48/1300c0: idx: 2 <...>-1987 [019] 280.252835: x86_pmu_start: event-49/1300c0: idx: 3 <...>-1987 [019] 280.252836: x86_pmu_start: event-50/1300c0: idx: 32 <...>-1987 [019] 280.252837: x86_pmu_start: event-51/1300c0: idx: 32 *FAIL* This happens because we only iterate the n_running events in the first pass, and reset their index to -1 if they don't match to force a re-assignment. Now, in our RR example, n_running == 0 because we fully unscheduled, so event-50 will retain its idx==32, even though in scheduling it will have gotten idx=0, and we don't trigger the re-assign path. The easiest way to fix this is the below patch, which simply validates the full assignment in the second pass. Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1268311069.5037.31.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11x86: Reduce per cpu MCA boot up messagesMike Travis
Don't write per cpu MCA boot up messages. Signed-of-by: Mike Travis <travis@sgi.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: x86@kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11x86/mce: Fix RCU lockdep splatsPaul E. McKenney
Create an rcu_dereference_check_mce() that checks for RCU-sched read side and mce_read_mutex being held on update side. Replace uses of rcu_dereference() in arch/x86/kernel/cpu/mcheck/mce.c with this new macro. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1267830207-9474-3-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10perf: Introduce new perf_fetch_caller_regs() for hot regs snapshotFrederic Weisbecker
Events that trigger overflows by interrupting a context can use get_irq_regs() or task_pt_regs() to retrieve the state when the event triggered. But this is not the case for some other class of events like trace events as tracepoints are executed in the same context than the code that triggered the event. It means we need a different api to capture the regs there, namely we need a hot snapshot to get the most important informations for perf: the instruction pointer to get the event origin, the frame pointer for the callchain, the code segment for user_mode() tests (we always use __KERNEL_CS as trace events always occur from the kernel) and the eflags for further purposes. v2: rename perf_save_regs to perf_fetch_caller_regs as per Masami's suggestion. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Archs <linux-arch@vger.kernel.org>
2010-03-10perf, x86: Fix double enable callsPeter Zijlstra
hw_perf_enable() would enable already enabled events. This causes problems with code that assumes that ->enable/->disable calls are balanced (like the LBR code does). What happens is that events that were already running and left in place would get enabled again. Avoid this by only enabling new events that match their previous assignment. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10perf, x86: Fix double disable callsPeter Zijlstra
hw_perf_enable() would disable events that were not yet enabled. This causes problems with code that assumes that ->enable/->disable calls are balanced (like the LBR code does). What happens is that we disable newly added counters that match their previous assignment, even though they are not yet programmed on the hardware. Avoid this by only doing the first pass over the existing events. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10perf, x86: Properly account n_addedPeter Zijlstra
Make sure n_added is properly accounted so that we can rely on the value to reflect the number of added counters. This is needed if its going to be used for more than a boolean check. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10perf, x86: Avoid double disable on throttle vs ioctl(PERF_IOC_DISABLE)Peter Zijlstra
Calling ioctl(PERF_EVENT_IOC_DISABLE) on a thottled counter would result in a double disable, cure this by using x86_pmu_{start,stop} for throttle/unthrottle and teach x86_pmu_stop() to check ->active_mask. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10perf, x86: Fix x86_pmu_startPeter Zijlstra
pmu::start should undo pmu::stop, make it so. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10perf, x86: Use unlocked bitopsPeter Zijlstra
There is no concurrency on these variables, so don't use LOCK'ed ops. As to the intel_pmu_handle_irq() status bit clean, nobody uses that so remove it all together. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <20100304140100.240023029@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10perf, x86: Change x86_pmu.{enable,disable} calling conventionPeter Zijlstra
Pass the full perf_event into the x86_pmu functions so that those may make use of more than the hw_perf_event, and while doing this, remove the superfluous second argument. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <20100304140100.165166129@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10perf, x86: Remove superfluous arguments to x86_perf_event_update()Peter Zijlstra
The second and third argument to x86_perf_event_update() are superfluous since they are simple expressions of the first argument. Hence remove them. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <20100304140100.089468871@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10perf, x86: Remove superfluous arguments to x86_perf_event_set_period()Peter Zijlstra
The second and third argument to x86_perf_event_set_period() are superfluous since they are simple expressions of the first argument. Hence remove them. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <20100304140100.006500906@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10perf, x86, Do not user perf_disable from NMI contextPeter Zijlstra
Explicitly use intel_pmu_{disable,enable}_all() in intel_pmu_handle_irq() to avoid the NMI race conditions in perf_{disable,enable} Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10perf: Rework and fix the arch CPU-hotplug hooksPeter Zijlstra
Remove the hw_perf_event_*() hotplug hooks in favour of per PMU hotplug notifiers. This has the advantage of reducing the static weak interface as well as exposing all hotplug actions to the PMU. Use this to fix x86 hotplug usage where we did things in ONLINE which should have been done in UP_PREPARE or STARTING. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mundt <lethal@linux-sh.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <20100305154128.736225361@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10perf: Provide generic perf_sample_data initializationPeter Zijlstra
This makes it easier to extend perf_sample_data and fixes a bug on arm and sparc, which failed to set ->raw to NULL, which can cause crashes when combined with PERF_SAMPLE_RAW. It also optimizes PowerPC and tracepoint, because the struct initialization is forced to zero out the whole structure. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Jean Pihet <jpihet@mvista.com> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Cc: Jamie Iles <jamie.iles@picochip.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> Cc: stable@kernel.org LKML-Reference: <20100304140100.315416040@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-09Merge commit 'v2.6.34-rc1' into perf/urgentIngo Molnar
Conflicts: tools/perf/util/probe-event.c Merge reason: Pick up -rc1 and resolve the conflict as well. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-07sysfs: Use sysfs_attr_init and sysfs_bin_attr_init on dynamic attributesEric W. Biederman
These are the non-static sysfs attributes that exist on my test machine. Fix them to use sysfs_attr_init or sysfs_bin_attr_init as appropriate. It simply requires making a sysfs attribute present to see this. So this is a little bit tedious but otherwise not too bad. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07Driver core: Constify struct sysfs_ops in struct kobj_typeEmese Revfy
Constify struct sysfs_ops. This is part of the ops structure constification effort started by Arjan van de Ven et al. Benefits of this constification: * prevents modification of data that is shared (referenced) by many other structure instances at runtime * detects/prevents accidental (but not intentional) modification attempts on archs that enforce read-only kernel data at runtime * potentially better optimized code as the compiler can assume that the const data cannot be changed * the compiler/linker move const data into .rodata and therefore exclude them from false sharing Signed-off-by: Emese Revfy <re.emese@gmail.com> Acked-by: David Teigland <teigland@redhat.com> Acked-by: Matt Domsch <Matt_Domsch@dell.com> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by: Hans J. Koch <hjk@linutronix.de> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Jens Axboe <jens.axboe@oracle.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] Fix cast warning in pcc driver. [CPUFREQ] Processor Clocking Control interface driver
2010-03-06bitops: rename for_each_bit() to for_each_set_bit()Akinobu Mita
Rename for_each_bit to for_each_set_bit in the kernel source tree. To permit for_each_clear_bit(), should that ever be added. The patch includes a macro to map the old for_each_bit() onto the new for_each_set_bit(). This is a (very) temporary thing to ease the migration. [akpm@linux-foundation.org: add temporary for_each_bit()] Suggested-by: Alexey Dobriyan <adobriyan@gmail.com> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Russell King <rmk@arm.linux.org.uk> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Artem Bityutskiy <dedekind@infradead.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-05x86: fix mtrr missing kernel-docRandy Dunlap
Fix missing kernel-doc notation in mtrr/main.c: Warning(arch/x86/kernel/cpu/mtrr/main.c:152): No description found for parameter 'info' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-03Merge branch 'x86-bootmem-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-bootmem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits) early_res: Need to save the allocation name in drop_range_partial() sparsemem: Fix compilation on PowerPC early_res: Add free_early_partial() x86: Fix non-bootmem compilation on PowerPC core: Move early_res from arch/x86 to kernel/ x86: Add find_fw_memmap_area Move round_up/down to kernel.h x86: Make 32bit support NO_BOOTMEM early_res: Enhance check_and_double_early_res x86: Move back find_e820_area to e820.c x86: Add find_early_area_size x86: Separate early_res related code from e820.c x86: Move bios page reserve early to head32/64.c sparsemem: Put mem map for one node together. sparsemem: Put usemap for one node together x86: Make 64 bit use early_res instead of bootmem before slab x86: Only call dma32_reserve_bootmem 64bit !CONFIG_NUMA x86: Make early_node_mem get mem > 4 GB if possible x86: Dynamically increase early_res array size x86: Introduce max_early_res and early_res_count ...
2010-03-02perf_events, x86: Fixup fixed counter constraintsPeter Zijlstra
Patch 1da53e0230 ("perf_events, x86: Improve x86 event scheduling") lost us one of the fixed purpose counters and then ed8777fc13 ("perf_events, x86: Fix event constraint masks") broke it even further. Widen the fixed event mask to event+umask and specify the full config for each of the 3 fixed purpose counters. Then let the init code fill out the placement for the GP regs based on the cpuid info. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-02perf, x86: Restrict the ANY flagPeter Zijlstra
The ANY flag can show SMT data of another task (like 'top'), so we want to disable it when system-wide profiling is disabled. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-02x86: Fix sched_clock_cpu for systems with unsynchronized TSCDimitri Sivanich
On UV systems, the TSC is not synchronized across blades. The sched_clock_cpu() function is returning values that can go backwards (I've seen as much as 8 seconds) when switching between cpus. As each cpu comes up, early_init_intel() will currently set the sched_clock_stable flag true. When mark_tsc_unstable() runs, it clears the flag, but this only occurs once (the first time a cpu comes up whose TSC is not synchronized with cpu 0). After this, early_init_intel() will set the flag again as the next cpu comes up. Only set sched_clock_stable if tsc has not been marked unstable. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100301174815.GC8224@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-01Merge branch 'acpica' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'acpica' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: ACPI: replace acpi_integer by u64 ACPICA: Update version to 20100121. ACPICA: Remove unused uint32_struct type ACPICA: Disassembler: Remove obsolete "Integer64" field in parse object ACPICA: Remove obsolete ACPI_INTEGER (acpi_integer) type ACPICA: Predefined name repair: fix NULL package elements ACPICA: AcpiGetDevices: Eliminate unnecessary _STA calls ACPICA: Update all ACPICA copyrights and signons to 2010 ACPICA: Update for new gcc-4 warning options
2010-03-01perf, x86: rename macro in ARCH_PERFMON_EVENTSEL_ENABLERobert Richter
For consistency reasons this patch renames ARCH_PERFMON_EVENTSEL0_ENABLE to ARCH_PERFMON_EVENTSEL_ENABLE. The following is performed: $ sed -i -e s/ARCH_PERFMON_EVENTSEL0_ENABLE/ARCH_PERFMON_EVENTSEL_ENABLE/g \ arch/x86/include/asm/perf_event.h arch/x86/kernel/cpu/perf_event.c \ arch/x86/kernel/cpu/perf_event_p6.c \ arch/x86/kernel/cpu/perfctr-watchdog.c \ arch/x86/oprofile/op_model_amd.c arch/x86/oprofile/op_model_ppro.c Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-02-28Merge branch 'x86-mtrr-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mtrr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Convert set_atomicity_lock to raw_spinlock x86, mtrr: Kill over the top warn x86, mtrr: Constify struct mtrr_ops
2010-02-28Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, cacheinfo: Enable L3 CID only on AMD x86, cacheinfo: Remove NUMA dependency, fix for AMD Fam10h rev D1 x86, cpu: Print AMD virtualization features in /proc/cpuinfo x86, cacheinfo: Calculate L3 indices x86, cacheinfo: Add cache index disable sysfs attrs only to L3 caches x86, cacheinfo: Fix disabling of L3 cache indices intel-agp: Switch to wbinvd_on_all_cpus x86, lib: Add wbinvd smp helpers
2010-02-28Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Remove trailing spaces in messages x86, mtrr: Remove unused mtrr/state.c x86, trivial: Fix grammo in tsc comment about Geode TSC reliability