aboutsummaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2009-09-04x86: Use hard_smp_processor_id() to get apic id for AMD K8 cpusYinghai Lu
Otherwise, system with apci id lifting will have wrong apicid in /proc/cpuinfo. and use that in srat_detect_node(). Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> LKML-Reference: <4A998CCA.1040407@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-03x86, sched: Workaround broken sched domain creation for AMD Magny-CoursAndreas Herrmann
Current sched domain creation code can't handle multi-node processors. When switching to power_savings scheduling errors show up and system might hang later on (due to broken sched domain hierarchy): # echo 0 >> /sys/devices/system/cpu/sched_mc_power_savings CPU0 attaching sched-domain: domain 0: span 0-5 level MC groups: 0 1 2 3 4 5 domain 1: span 0-23 level NODE groups: 0-5 6-11 18-23 12-17 ... # echo 1 >> /sys/devices/system/cpu/sched_mc_power_savings CPU0 attaching sched-domain: domain 0: span 0-11 level MC groups: 0 1 2 3 4 5 6 7 8 9 10 11 ERROR: parent span is not a superset of domain->span domain 1: span 0-5 level CPU ERROR: domain->groups does not contain CPU0 groups: 6-11 (__cpu_power = 12288) ERROR: groups don't span domain->span domain 2: span 0-23 level NODE groups: ERROR: domain->cpu_power not set ERROR: groups don't span domain->span ... Fixing all aspects of power-savings scheduling for Magny-Cours needs some larger changes in the sched domain creation code. As a short-term and temporary workaround avoid the problems by extending "the worst possible hack" ;-( and always use llc_shared_map on AMD Magny-Cours when MC domain span is calculated. With this I get: # echo 1 >> /sys/devices/system/cpu/sched_mc_power_savings CPU0 attaching sched-domain: domain 0: span 0-5 level MC groups: 0 1 2 3 4 5 domain 1: span 0-5 level CPU groups: 0-5 (__cpu_power = 6144) domain 2: span 0-23 level NODE groups: 0-5 (__cpu_power = 6144) 6-11 (__cpu_power = 6144) 18-23 (__cpu_power = 6144) 12-17 (__cpu_power = 6144) ... I.e. no errors during sched domain creation, no system hangs, and also mc_power_savings scheduling works to a certain extend. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-09-03x86, mcheck: Use correct cpumask for shared bank4Andreas Herrmann
This fixes threshold_bank4 support on multi-node processors. The correct mask to use is llc_shared_map, representing an internal node on Magny-Cours. We need to create 2 sets of symlinks for sibling shared banks -- one set for each internal node, symlinks of each set should target the first core on same internal node. Currently only one set is created where all symlinks are targeting the first core of the entire socket. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-09-03x86, cacheinfo: Fixup L3 cache information for AMD multi-node processorsAndreas Herrmann
L3 cache size, associativity and shared_cpu information need to be adapted to show information for an internal node instead of the entire physical package. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-09-03x86: Fix CPU llc_shared_map information for AMD Magny-CoursAndreas Herrmann
Construct entire NodeID and use it as cpu_llc_id. Thus internal node siblings are stored in llc_shared_map. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-09-03x86, msr: Fix msr-reg.S compilation with gas 2.16.1, on 32-bit tooIngo Molnar
The macro was defined in the 32-bit path as well - breaking the build on 32-bit platforms: arch/x86/lib/msr-reg.S: Assembler messages: arch/x86/lib/msr-reg.S:53: Error: Bad macro parameter list arch/x86/lib/msr-reg.S:100: Error: invalid character '_' in mnemonic arch/x86/lib/msr-reg.S:101: Error: invalid character '_' in mnemonic Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <tip-f6909f394c2d4a0a71320797df72d54c49c5927e@git.kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-01x86: Move kernel_fpu_using to irq_fpu_usable in asm/i387.hHuang Ying
This function measures whether the FPU/SSE state can be touched in interrupt context. If the interrupted code is in user space or has no valid FPU/SSE context (CR0.TS == 1), FPU/SSE state can be used in IRQ or soft_irq context too. This is used by AES-NI accelerated AES implementation and PCLMULQDQ accelerated GHASH implementation. v3: - Renamed to irq_fpu_usable to reflect the purpose of the function. v2: - Renamed to irq_is_fpu_using to reflect the real situation. Signed-off-by: Huang Ying <ying.huang@intel.com> CC: H. Peter Anvin <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-09-01x86, msr: fix msr-reg.S compilation with gas 2.16.1H. Peter Anvin
msr-reg.S used the :req option on a macro argument, which wasn't supported by gas 2.16.1 (but apparently by some earlier versions of gas, just to be confusing.) It isn't necessary, so just remove it. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <petkovbb@googlemail.com>
2009-09-01Merge branch 'x86/paravirt' into x86/cpuIngo Molnar
Conflicts: arch/x86/include/asm/paravirt.h Manual merge: arch/x86/include/asm/paravirt_types.h Merge reason: x86/paravirt conflicts non-trivially with x86/cpu, resolve it. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-31x86, msr: Export the register-setting MSR functions via /dev/*/msrH. Peter Anvin
Make it possible to access the all-register-setting/getting MSR functions via the MSR driver. This is implemented as an ioctl() on the standard MSR device node. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <petkovbb@gmail.com>
2009-08-31x86, msr: Create _on_cpu helpers for {rw,wr}msr_safe_regs()H. Peter Anvin
Create _on_cpu helpers for {rw,wr}msr_safe_regs() analogously with the other MSR functions. This will be necessary to add support for these to the MSR driver. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <petkovbb@gmail.com>
2009-08-31x86, msr: Have the _safe MSR functions return -EIO, not -EFAULTH. Peter Anvin
For some reason, the _safe MSR functions returned -EFAULT, not -EIO. However, the only user which cares about the return code as anything other than a boolean is the MSR driver, which wants -EIO. Change it to -EIO across the board. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Alok Kataria <akataria@vmware.com> Cc: Rusty Russell <rusty@rustcorp.com.au>
2009-08-31x86, msr: CFI annotations, cleanups for msr-reg.SH. Peter Anvin
Add CFI annotations for native_{rd,wr}msr_safe_regs(). Simplify the 64-bit implementation: we don't allow the upper half registers to be set, and so we can use them to carry state across the operation. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <petkovbb@gmail.com> LKML-Reference: <1251705011-18636-1-git-send-email-petkovbb@gmail.com>
2009-08-31x86, asm: Make _ASM_EXTABLE() usable from assembly codeH. Peter Anvin
We have had this convenient macro _ASM_EXTABLE() to generate exception table entry in inline assembly. Make it also usable for pure assembly. Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-08-31x86, asm: Add 32-bit versions of the combined CFI macrosH. Peter Anvin
Add 32-bit versions of the combined CFI macros, equivalent to the 64-bit ones except, obviously, operating on 32-bit stack words. Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-08-31x86, AMD: Disable wrongly set X86_FEATURE_LAHF_LM CPUID bitBorislav Petkov
fbd8b1819e80ac5a176d085fdddc3a34d1499318 turns off the bit for /proc/cpuinfo. However, a proper/full fix would be to additionally turn off the bit in the CPUID output so that future callers get correct CPU features info. Do that by basically reversing what the BIOS wrongfully does at boot. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1251705011-18636-3-git-send-email-petkovbb@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-08-31x86, msr: Rewrite AMD rd/wrmsr variantsBorislav Petkov
Switch them to native_{rd,wr}msr_safe_regs and remove pv_cpu_ops.read_msr_amd. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> LKML-Reference: <1251705011-18636-2-git-send-email-petkovbb@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-08-31x86, msr: Add rd/wrmsr interfaces with preset registersBorislav Petkov
native_{rdmsr,wrmsr}_safe_regs are two new interfaces which allow presetting of a subset of eight x86 GPRs before executing the rd/wrmsr instructions. This is needed at least on AMD K8 for accessing an erratum workaround MSR. Originally based on an idea by H. Peter Anvin. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> LKML-Reference: <1251705011-18636-1-git-send-email-petkovbb@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-08-23x86: add specific support for Intel Atom architectureTobias Doerffel
Add another option when selecting CPU family so the kernel can be optimized for Intel Atom CPUs. If GCC supports tuning options for Intel Atom they will be used. Signed-off-by: Tobias Doerffel <tobias.doerffel@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <1251018457-19157-1-git-send-email-tobias.doerffel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-23Merge commit 'v2.6.31-rc7' into x86/cpuIngo Molnar
2009-08-21x86: don't call '->send_IPI_mask()' with an empty maskLinus Torvalds
As noted in 83d349f35e1ae72268c5104dbf9ab2ae635425d4 ("x86: don't send an IPI to the empty set of CPU's"), some APIC's will be very unhappy with an empty destination mask. That commit added a WARN_ON() for that case, and avoided the resulting problem, but didn't fix the underlying reason for why those empty mask cases happened. This fixes that, by checking the result of 'cpumask_andnot()' of the current CPU actually has any other CPU's left in the set of CPU's to be sent a TLB flush, and not calling down to the IPI code if the mask is empty. The reason this started happening at all is that we started passing just the CPU mask pointers around in commit 4595f9620 ("x86: change flush_tlb_others to take a const struct cpumask"), and when we did that, the cpumask was no longer thread-local. Before that commit, flush_tlb_mm() used to create it's own copy of 'mm->cpu_vm_mask' and pass that copy down to the low-level flush routines after having tested that it was not empty. But after changing it to just pass down the CPU mask pointer, the lower level TLB flush routines would now get a pointer to that 'mm->cpu_vm_mask', and that could still change - and become empty - after the test due to other CPU's having flushed their own TLB's. See http://bugzilla.kernel.org/show_bug.cgi?id=13933 for details. Tested-by: Thomas Björnell <thomas.bjornell@gmail.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-21x86: don't send an IPI to the empty set of CPU'sLinus Torvalds
The default_send_IPI_mask_logical() function uses the "flat" APIC mode to send an IPI to a set of CPU's at once, but if that set happens to be empty, some older local APIC's will apparently be rather unhappy. So just warn if a caller gives us an empty mask, and ignore it. This fixes a regression in 2.6.30.x, due to commit 4595f9620 ("x86: change flush_tlb_others to take a const struct cpumask"), documented here: http://bugzilla.kernel.org/show_bug.cgi?id=13933 which causes a silent lock-up. It only seems to happen on PPro, P2, P3 and Athlon XP cores. Most developers sadly (or not so sadly, if you're a developer..) have more modern CPU's. Also, on x86-64 we don't use the flat APIC mode, so it would never trigger there even if the APIC didn't like sending an empty IPI mask. Reported-by: Pavel Vilim <wylda@volny.cz> Reported-and-tested-by: Thomas Björnell <thomas.bjornell@gmail.com> Reported-and-tested-by: Martin Rogge <marogge@onlinehome.de> Cc: Mike Travis <travis@sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-18Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: use the right flag for get_vm_area() percpu, sparc64: fix sparse possible cpu map handling init: set nr_cpu_ids before setup_per_cpu_areas()
2009-08-17x86, mce: Don't initialize MCEs on unknown CPUsIngo Molnar
An older test-box started hanging at the following point during bootup: [ 0.022996] Mount-cache hash table entries: 512 [ 0.024996] Initializing cgroup subsys debug [ 0.025996] Initializing cgroup subsys cpuacct [ 0.026995] Initializing cgroup subsys devices [ 0.027995] Initializing cgroup subsys freezer [ 0.028995] mce: CPU supports 5 MCE banks I've bisected it down to commit 4efc0670 ("x86, mce: use 64bit machine check code on 32bit"), which utilizes the MCE code on 32-bit systems too. The problem is caused by this detail in my config: # CONFIG_CPU_SUP_INTEL is not set This disables the quirks in mce_cpu_quirks() but still enables MCE support - which then hangs due to the missing quirk workaround needed on this CPU: if (c->x86 == 6 && c->x86_model < 0x1A && banks > 0) mce_banks[0].init = 0; The safe solution is to not initialize MCEs if we dont know on what CPU we are running (or if that CPU's support code got disabled in the config). Also be a bit more defensive on 32-bit systems: dont do a boot-time dump of pending MCEs not just on the specific system that we found a problem with (Pentium-M), but earlier ones as well. Now this problem is probably not common and disabling CPU support is rare - but still being more defensive in something we turned on for a wide range of CPUs is prudent. Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> LKML-Reference: Message-ID: <4A88E3E4.40506@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-17x86, mce: don't log boot MCEs on Pentium M (model == 13) CPUsBartlomiej Zolnierkiewicz
On my legacy Pentium M laptop (Acer Extensa 2900) I get bogus MCE on a cold boot with CONFIG_X86_NEW_MCE enabled, i.e. (after decoding it with mcelog): MCE 0 HARDWARE ERROR. This is *NOT* a software problem! Please contact your hardware vendor CPU 0 BANK 1 MCG status: MCi status: Error overflow Uncorrected error Error enabled Processor context corrupt MCA: Data CACHE Level-1 UNKNOWN Error STATUS f200000000000195 MCGSTATUS 0 [ The other STATUS values observed: f2000000000001b5 (... UNKNOWN error) and f200000000000115 (... READ Error). To verify that this is not a CONFIG_X86_NEW_MCE bug I also modified the CONFIG_X86_OLD_MCE code (which doesn't log any MCEs) to dump content of STATUS MSR before it is cleared during initialization. ] Since the bogus MCE results in a kernel taint (which in turn disables lockdep support) don't log boot MCEs on Pentium M (model == 13) CPUs by default ("mce=bootlog" boot parameter can be be used to get the old behavior). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Reviewed-by: Andi Kleen <andi@firstfloor.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-16x86: Annotate section mismatch warnings in kernel/apic/x2apic_uv_x.cLeonardo Potenza
The function uv_acpi_madt_oem_check() has been marked __init, the struct apic_x2apic_uv_x has been marked __refdata. The aim is to address the following section mismatch messages: WARNING: arch/x86/kernel/apic/built-in.o(.data+0x1368): Section mismatch in reference from the variable apic_x2apic_uv_x to the function .cpuinit.text:uv_wakeup_secondary() The variable apic_x2apic_uv_x references the function __cpuinit uv_wakeup_secondary() If the reference is valid then annotate the variable with __init* or __refdata (see linux/init.h) or name the variable: *driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console, WARNING: arch/x86/kernel/built-in.o(.data+0x68e8): Section mismatch in reference from the variable apic_x2apic_uv_x to the function .cpuinit.text:uv_wakeup_secondary() The variable apic_x2apic_uv_x references the function __cpuinit uv_wakeup_secondary() If the reference is valid then annotate the variable with __init* or __refdata (see linux/init.h) or name the variable: *driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console, WARNING: arch/x86/built-in.o(.text+0x7b36f): Section mismatch in reference from the function uv_acpi_madt_oem_check() to the function .init.text:early_ioremap() The function uv_acpi_madt_oem_check() references the function __init early_ioremap(). This is often because uv_acpi_madt_oem_check lacks a __init annotation or the annotation of early_ioremap is wrong. WARNING: arch/x86/built-in.o(.text+0x7b38d): Section mismatch in reference from the function uv_acpi_madt_oem_check() to the function .init.text:early_iounmap() The function uv_acpi_madt_oem_check() references the function __init early_iounmap(). This is often because uv_acpi_madt_oem_check lacks a __init annotation or the annotation of early_iounmap is wrong. WARNING: arch/x86/built-in.o(.data+0x8668): Section mismatch in reference from the variable apic_x2apic_uv_x to the function .cpuinit.text:uv_wakeup_secondary() The variable apic_x2apic_uv_x references the function __cpuinit uv_wakeup_secondary() If the reference is valid then annotate the variable with __init* or __refdata (see linux/init.h) or name the variable: *driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console, Signed-off-by: Leonardo Potenza <lpotenza@inwind.it> LKML-Reference: <200908161855.48302.lpotenza@inwind.it> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-16x86, mce: therm_throt: Don't log redundant normalityHugh Dickins
0d01f31439c1e4d602bf9fdc924ab66f407f5e38 "x86, mce: therm_throt - change when we print messages" removed redundant announcements of "Temperature/speed normal". They're not worth logging and remove their accompanying "Machine check events logged" messages as well from the console. Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Dmitry Torokhov <dtor@mail.ru> LKML-Reference: <Pine.LNX.4.64.0908161544100.7929@sister.anvils> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-15x86: Fix UV BAU destination subnode idCliff Wickman
The SGI UV Broadcast Assist Unit is used to send TLB shootdown messages to remote nodes of the system. The header of the message must contain the subnode id of the block in the receiving hub that handles such messages. It should always be 0x10, the id of the "LB" block. It had previously been documented as a "must be zero" field. Signed-off-by: Cliff Wickman <cpw@sgi.com> Acked-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <E1Mc1x7-0005Ce-6t@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-14percpu, sparc64: fix sparse possible cpu map handlingTejun Heo
percpu code has been assuming num_possible_cpus() == nr_cpu_ids which is incorrect if cpu_possible_map contains holes. This causes percpu code to access beyond allocated memories and vmalloc areas. On a sparc64 machine with cpus 0 and 2 (u60), this triggers the following warning or fails boot. WARNING: at /devel/tj/os/work/mm/vmalloc.c:106 vmap_page_range_noflush+0x1f0/0x240() Modules linked in: Call Trace: [00000000004b17d0] vmap_page_range_noflush+0x1f0/0x240 [00000000004b1840] map_vm_area+0x20/0x60 [00000000004b1950] __vmalloc_area_node+0xd0/0x160 [0000000000593434] deflate_init+0x14/0xe0 [0000000000583b94] __crypto_alloc_tfm+0xd4/0x1e0 [00000000005844f0] crypto_alloc_base+0x50/0xa0 [000000000058b898] alg_test_comp+0x18/0x80 [000000000058dad4] alg_test+0x54/0x180 [000000000058af00] cryptomgr_test+0x40/0x60 [0000000000473098] kthread+0x58/0x80 [000000000042b590] kernel_thread+0x30/0x60 [0000000000472fd0] kthreadd+0xf0/0x160 ---[ end trace 429b268a213317ba ]--- This patch fixes generic percpu functions and sparc64 setup_per_cpu_areas() so that they handle sparse cpu_possible_map properly. Please note that on x86, cpu_possible_map() doesn't contain holes and thus num_possible_cpus() == nr_cpu_ids and this patch doesn't cause any behavior difference. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu>
2009-08-13Merge branch 'perfcounters-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf_counter: Report the cloning task as parent on perf_counter_fork() perf_counter: Fix an ipi-deadlock perf: Rework/fix the whole read vs group stuff perf_counter: Fix swcounter context invariance perf report: Don't show unresolved DSOs and symbols when -S/-d is used perf tools: Add a general option to enable raw sample records perf tools: Add a per tracepoint counter attribute to get raw sample perf_counter: Provide hw_perf_counter_setup_online() APIs perf list: Fix large list output by using the pager perf_counter, x86: Fix/improve apic fallback perf record: Add missing -C option support for specifying profile cpu perf tools: Fix dso__new handle() to handle deleted DSOs perf tools: Fix fallback to cplus_demangle() when bfd_demangle() is not available perf report: Show the tid too in -D perf record: Fix .tid and .pid fill-in when synthesizing events perf_counter, x86: Fix generic cache events on P6-mobile CPUs perf_counter, x86: Fix lapic printk message
2009-08-12perf_counter, x86: Fix/improve apic fallbackIngo Molnar
Johannes Stezenbach reported that his Pentium-M based laptop does not have the local APIC enabled by default, and hence perfcounters do not get initialized. Add a fallback for this case: allow non-sampled counters and return with an error on sampled counters. This allows 'perf stat' to work out of box - and allows 'perf top' and 'perf record' to fall back on a hrtimer based sampling method. ( Passing 'lapic' on the boot line will allow hardware sampling to occur - but if the APIC is disabled permanently by the hardware then this fallback still allows more systems to use perfcounters. ) Also decouple perfcounter support from X86_LOCAL_APIC. -v2: fix typo breaking counters on all other systems ... Reported-by: Johannes Stezenbach <js@sig21.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-12x86: Fix oops in identify_cpu() on CPUs without CPUIDOndrej Zary
Kernel is broken for x86 CPUs without CPUID since 2.6.28. It crashes with NULL pointer dereference in identify_cpu(): 766 generic_identify(c); 767 768--> if (this_cpu->c_identify) 769 this_cpu->c_identify(c); this_cpu is NULL. This is because it's only initialized in get_cpu_vendor() function, which is not called if the CPU has no CPUID instruction. Signed-off-by: Ondrej Zary <linux@rainbow-software.org> LKML-Reference: <200908112000.15993.linux@rainbow-software.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-11x86: Clear incorrectly forced X86_FEATURE_LAHF_LM flagKevin Winchester
Due to an erratum with certain AMD Athlon 64 processors, the BIOS may need to force enable the LAHF_LM capability. Unfortunately, in at least one case, the BIOS does this even for processors that do not support the functionality. Add a specific check that will clear the feature bit for processors known not to support the LAHF/SAHF instructions. Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Acked-by: Borislav Petkov <petkovbb@googlemail.com> LKML-Reference: <4A80A5AD.2000209@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-11perf_counter, x86: Fix generic cache events on P6-mobile CPUsIngo Molnar
Johannes Stezenbach reported that 'perf stat' does not count cache-miss and cache-references events on his Pentium-M based laptop. This is because we left them blank in p6_perfmon_event_map[], fill them in. Reported-by: Johannes Stezenbach <js@sig21.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-11perf_counter, x86: Fix lapic printk messageIngo Molnar
Instead of this garbled bootup on UP Pentium-M systems: [ 0.015048] Performance Counters: [ 0.016004] no Local APIC, try rebooting with lapicno PMU driver, software counters only. Print: [ 0.015050] Performance Counters: [ 0.016004] no APIC, boot with the "lapic" boot parameter to force-enable it. [ 0.017003] no PMU driver, software counters only. Cf: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-11x86, mce: therm_throt - change when we print messagesDmitry Torokhov
My Latitude d630 seems to be handling thermal events in SMI by lowering the max frequency of the CPU till it cools down but still leaks the "everything is normal" events. This spams the console and with high priority printks. Adjust therm_throt driver to only print messages about the fact that temperatire returned back to normal when leaving the throttling state. Also lower the severity of "back to normal" message from KERN_CRIT to KERN_INFO. Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Acked-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <20090810051513.0558F526EC9@mailhub.coreip.homeip.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-10x86: Add reboot quirk for every 5 series MacBook/ProShunichi Fuji
Reboot does not work on my MacBook Pro 13 inch (MacBookPro5,5) too. It seems all unibody MacBook and MacBookPro require PCI reboot handling, i guess. Following model/machine ID list shows unibody MacBook/Pro have the 5 series of model number: http://www.everymac.com/systems/by_capability/macs-by-machine-model-machine-id.html Signed-off-by: Shunichi Fuji <palglowr@gmail.com> Cc: Ozan Çağlayan <ozan@pardus.org.tr> LKML-Reference: <30046e3b0908101134p6487ddbftd8776e4ddef204be@mail.gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-10x86: Fix serialization in pit_expect_msb()Linus Torvalds
Wei Chong Tan reported a fast-PIT-calibration corner-case: | pit_expect_msb() is vulnerable to SMI disturbance corner case | in some platforms which causes /proc/cpuinfo to show wrong | CPU MHz value when quick_pit_calibrate() jumps to success | section. I think that the real issue isn't even an SMI - but the fact that in the very last iteration of the loop, there's no serializing instruction _after_ the last 'rdtsc'. So even in the absense of SMI's, we do have a situation where the cycle counter was read without proper serialization. The last check should be done outside the outer loop, since _inside_ the outer loop, we'll be testing that the PIT has the right MSB value has the right value in the next iteration. So only the _last_ iteration is special, because that's the one that will not check the PIT MSB value any more, and because the final 'get_cycles()' isn't serialized. In other words: - I'd like to move the PIT MSB check to after the last iteration, rather than in every iteration - I think we should comment on the fact that it's also a serializing instruction and so 'fences in' the TSC read. Here's a suggested replacement. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: "Tan, Wei Chong" <wei.chong.tan@intel.com> Tested-by: "Tan, Wei Chong" <wei.chong.tan@intel.com> LKML-Reference: <B28277FD4E0F9247A3D55704C440A140D5D683F3@pgsmsx504.gar.corp.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09Merge branch 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
* 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Avoid redelivery of edge interrupt before next edge KVM: MMU: limit rmap chain length KVM: ia64: fix build failures due to ia64/unsigned long mismatches KVM: Make KVM_HPAGES_PER_HPAGE unsigned long to avoid build error on powerpc KVM: fix ack not being delivered when msi present KVM: s390: fix wait_queue handling KVM: VMX: Fix locking imbalance on emulation failure KVM: VMX: Fix locking order in handle_invalid_guest_state KVM: MMU: handle n_free_mmu_pages > n_alloc_mmu_pages in kvm_mmu_change_mmu_pages KVM: SVM: force new asid on vcpu migration KVM: x86: verify MTRR/PAT validity KVM: PIT: fix kpit_elapsed division by zero KVM: Fix KVM_GET_MSR_INDEX_LIST
2009-08-09x86: fix buffer overflow in efi_init()Roel Kluin
If the vendor name (from c16) can be longer than 100 bytes (or missing a terminating null), then the null is written past the end of vendor[]. Found with Parfait, http://research.sun.com/projects/parfait/ Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Huang Ying <ying.huang@intel.com>
2009-08-08x86: Add quirk to make Apple MacBookPro5,1 use reboot=pciOzan Çağlayan
MacBookPro5,1 is not able to reboot unless reboot=pci is set. This patch forces it through a DMI quirk specific to this device. Signed-off-by: Ozan Çağlayan <ozan@pardus.org.tr> LKML-Reference: <1249403971-6543-1-git-send-email-ozan@pardus.org.tr> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-08x86: Fix MSI-X initialization by using online_mask for x2apic target_cpusYinghai Lu
found a system where x2apic reports an MSI-X irq initialization failure: [ 302.859446] igbvf 0000:81:10.4: enabling device (0000 -> 0002) [ 302.874369] igbvf 0000:81:10.4: using 64bit DMA mask [ 302.879023] igbvf 0000:81:10.4: using 64bit consistent DMA mask [ 302.894386] igbvf 0000:81:10.4: enabling bus mastering [ 302.898171] igbvf 0000:81:10.4: setting latency timer to 64 [ 302.914050] reserve_memtype added 0xefb08000-0xefb0c000, track uncached-minus, req uncached-minus, ret uncached-minus [ 302.933839] reserve_memtype added 0xefb28000-0xefb29000, track uncached-minus, req uncached-minus, ret uncached-minus [ 302.940367] alloc irq_desc for 265 on node 4 [ 302.956874] alloc kstat_irqs on node 4 [ 302.959452] alloc irq_2_iommu on node 0 [ 302.974328] igbvf 0000:81:10.4: irq 265 for MSI/MSI-X [ 302.977778] alloc irq_desc for 266 on node 4 [ 302.980347] alloc kstat_irqs on node 4 [ 302.995312] free_memtype request 0xefb28000-0xefb29000 [ 302.998816] igbvf 0000:81:10.4: Failed to initialize MSI-X interrupts. ... it turns out that when trying to enable MSI-X, __assign_irq_vector(new, cfg_new, apic->target_cpus()) can not get vector because for x2apic target-cpus returns cpumask_of(0) Update that to online_mask like xapic. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <4A785AFF.3050902@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-06KVM: MMU: limit rmap chain lengthMarcelo Tosatti
Otherwise the host can spend too long traversing an rmap chain, which happens under a spinlock. Cc: stable@kernel.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-08-05KVM: VMX: Fix locking imbalance on emulation failureJan Kiszka
We have to disable preemption and IRQs on every exit from handle_invalid_guest_state, otherwise we generate at least a preempt_disable imbalance. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-08-05KVM: VMX: Fix locking order in handle_invalid_guest_stateJan Kiszka
Release and re-acquire preemption and IRQ lock in the same order as vcpu_enter_guest does. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-08-05KVM: MMU: handle n_free_mmu_pages > n_alloc_mmu_pages in ↵Marcelo Tosatti
kvm_mmu_change_mmu_pages kvm_mmu_change_mmu_pages mishandles the case where n_alloc_mmu_pages is smaller then n_free_mmu_pages, by not checking if the result of the subtraction is negative. Its a valid condition which can happen if a large number of pages has been recently freed. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-08-05KVM: SVM: force new asid on vcpu migrationMarcelo Tosatti
If a migrated vcpu matches the asid_generation value of the target pcpu, there will be no TLB flush via TLB_CONTROL_FLUSH_ALL_ASID. The check for vcpu.cpu in pre_svm_run is meaningless since svm_vcpu_load already updated it on schedule in. Such vcpu will VMRUN with stale TLB entries. Based on original patch from Joerg Roedel (http://patchwork.kernel.org/patch/10021/) Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Acked-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-08-05KVM: x86: verify MTRR/PAT validityMarcelo Tosatti
Do not allow invalid memory types in MTRR/PAT (generating a #GP otherwise). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-08-05KVM: PIT: fix kpit_elapsed division by zeroMarcelo Tosatti
Fix division by zero triggered by latch count command on uninitialized counter. Cc: stable@kernel.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-08-05KVM: Fix KVM_GET_MSR_INDEX_LISTJan Kiszka
So far, KVM copied the emulated_msrs (only MSR_IA32_MISC_ENABLE) to a wrong address in user space due to broken pointer arithmetic. This caused subtle corruption up there (missing MSR_IA32_MISC_ENABLE had probably no practical relevance). Moreover, the size check for the user-provided kvm_msr_list forgot about emulated MSRs. Cc: stable@kernel.org Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>