aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/kernel/cpu/mcheck/mce_64.c
AgeCommit message (Collapse)Author
2008-08-23x86 MCE: Fix CPU hotplug problem with multiple multicore AMD CPUsRafael J. Wysocki
During CPU hot-remove the sysfs directory created by threshold_create_bank(), defined in arch/x86/kernel/cpu/mcheck/mce_amd_64.c, has to be removed before its parent directory, created by mce_create_device(), defined in arch/x86/kernel/cpu/mcheck/mce_64.c . Moreover, when the CPU in question is hotplugged again, obviously the latter has to be created before the former. At present, the right ordering is not enforced, because all of these operations are carried out by CPU hotplug notifiers which are not appropriately ordered with respect to each other. This leads to serious problems on systems with two or more multicore AMD CPUs, among other things during suspend and hibernation. Fix the problem by placing threshold bank CPU hotplug callbacks in mce_cpu_callback(), so that they are invoked at the right places, if defined. Additionally, use kobject_del() to remove the sysfs directory associated with the kobject created by kobject_create_and_add() in threshold_create_bank(), to prevent the kernel from crashing during CPU hotplug operations on systems with two or more multicore AMD CPUs. This patch fixes bug #11337. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Andi Kleen <andi@firstfloor.org> Tested-by: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-23Merge branch 'cpus4096-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits) NR_CPUS: Replace NR_CPUS in speedstep-centrino.c cpumask: Provide a generic set of CPUMASK_ALLOC macros, FIXUP NR_CPUS: Replace NR_CPUS in cpufreq userspace routines NR_CPUS: Replace per_cpu(..., smp_processor_id()) with __get_cpu_var NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genapic_flat_64.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genx2apic_uv_x.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/proc.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/mcheck/mce_64.c cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c, fix cpumask: Use optimized CPUMASK_ALLOC macros in the centrino_target cpumask: Provide a generic set of CPUMASK_ALLOC macros cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c cpumask: Optimize cpumask_of_cpu in kernel/time/tick-common.c cpumask: Optimize cpumask_of_cpu in drivers/misc/sgi-xp/xpc_main.c cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/ldt.c cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/io_apic_64.c cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr Revert "cpumask: introduce new APIs" cpumask: make for_each_cpu_mask a bit smaller net: Pass reference to cpumask variable in net/sunrpc/svc.c ... Fix up trivial conflicts in drivers/cpufreq/cpufreq.c manually
2008-07-21sysdev: Convert the x86 mce tolerant sysdev attribute to generic attributeAndi Kleen
Use the new generic int attribute accessors for the x86 mce tolerant attribute. Simple example to illustrate the new macros. There are much more places all over the tree that could be converted like this. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21sysdev: Pass the attribute to the low level sysdev show/store functionAndi Kleen
This allow to dynamically generate attributes and share show/store functions between attributes. Right now most attributes are generated by special macros and lots of duplicated code. With the attribute passed it's instead possible to attach some data to the attribute and then use that in shared low level functions to do different things. I need this for the dynamically generated bank attributes in the x86 machine check code, but it'll allow some further cleanups. I converted all users in tree to the new show/store prototype. It's a single huge patch to avoid unbisectable sections. Runtime tested: x86-32, x86-64 Compiled only: ia64, powerpc Not compile tested/only grep converted: sh, arm, avr32 Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-20NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/mcheck/mce_64.cMike Travis
* nr_cpu_ids should be used to allocate arrays based on the number of cpu's present. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-15Merge branch 'generic-ipi' into generic-ipi-for-linusIngo Molnar
Conflicts: arch/powerpc/Kconfig arch/s390/kernel/time.c arch/x86/kernel/apic_32.c arch/x86/kernel/cpu/perfctr-watchdog.c arch/x86/kernel/i8259_64.c arch/x86/kernel/ldt.c arch/x86/kernel/nmi_64.c arch/x86/kernel/smpboot.c arch/x86/xen/smp.c include/asm-x86/hw_irq_32.h include/asm-x86/hw_irq_64.h include/asm-x86/mach-default/irq_vectors.h include/asm-x86/mach-voyager/irq_vectors.h include/asm-x86/smp.h kernel/Makefile Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-14Merge branch 'bkl-removal' of git://git.lwn.net/linux-2.6Linus Torvalds
* 'bkl-removal' of git://git.lwn.net/linux-2.6: (146 commits) IB/umad: BKL is not needed for ib_umad_open() IB/uverbs: BKL is not needed for ib_uverbs_open() bf561-coreb: BKL unneeded for open() Call fasync() functions without the BKL snd/PCM: fasync BKL pushdown ipmi: fasync BKL pushdown ecryptfs: fasync BKL pushdown Bluetooth VHCI: fasync BKL pushdown tty_io: fasync BKL pushdown tun: fasync BKL pushdown i2o: fasync BKL pushdown mpt: fasync BKL pushdown Remove BKL from remote_llseek v2 Make FAT users happier by not deadlocking x86-mce: BKL pushdown vmwatchdog: BKL pushdown vmcp: BKL pushdown via-pmu: BKL pushdown uml-random: BKL pushdown uml-mmapper: BKL pushdown ...
2008-07-03x86, mce_64.c: mce_cpu_quirks being ignoredVenki Pallipadi
Quirks getting ignored was a bug. Below patch fixes the bug, until we have the dynamic banks support. Sysfs choice configuration should not have any issues with the earlier patch as we look for NR_SYSFS_BANKS in do_machine_check(). Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Max Asbock <masbock@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-02x86-mce: BKL pushdownArnd Bergmann
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2008-06-26on_each_cpu(): kill unused 'retry' parameterJens Axboe
It's not even passed on to smp_call_function() anymore, since that was removed. So kill it. Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-06-18x86: correctly report NR_BANKS in mce_64.cDaniel Rahn
attached is a no-brainer that makes kernel correctly report NR_BANKS for MCE. We are right now limited to NR_BANKS==6, but the error message will use the available number of banks instead of the defined maximum. For a Nehalem based system it will print: "MCE: warning: using only 9 banks" while the correct message would be "MCE: warning: using only 6 banks" Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-12x86: remove 6 bank limitation in 64 bit MCE reporting codeVenki Pallipadi
Eliminate the 6 bank restriction in 64 bit mce reporting code. This restriction is artificial (due to static creation of sysfs files) and 32 bit code does not have any such restriction. This change helps in reporting the details of machine checks on a machine check exception with errors in bank 6 and above on CPUs that support those banks. Without the patch, machine check errors in those banks are not reported. We still have 128 (MCE_EXTENDED_BANK) bank restriction instead of max 256 supported in hardware. That is not changed in the patch below as it will have some user level mcelog utility dependency, with bank 128 being used for thermal reporting currently. The patch below does not create sysfs control (bankNctl) for banks higher than 6 as well. That needs some pre-cleanup in /sysfs mce layout, removal of per cpu /sysfs entries for bankctl as they are really global system level control today. That change will follow. This basic change is critical to report the detailed errors on banks higher than 6. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-26x86-64: extend MCE CPU quirk handlingJan Beulich
At least on my Barcelona, I see MCE log entries after cold boot caused by BIOS not properly clearing the respective registers. Therefore, this patch extends the workaround to families 0x10 and 0x11 (the latter just for completeness, I have nothing to verify this against). At the same time, provide a way to make these entries visible via the 'mce=bootlog' command line option even on these machines. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30x86: fix section mismatch warning in mcheck/mce_64.cSam Ravnborg
Fix following warning: WARNING: arch/x86/kernel/cpu/mcheck/built-in.o(.text+0x752): Section mismatch: reference to .cpuinit.text:mce_create_device in 'mce_cpu_callback' mce_cpu_callback() is only used by mce_cpu_notofier. The notifier is only used for hotplugable cpu's as it is registered using register_hotcpu_notifier(), Annotate them both __cpuinit to fix the warning. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: change x86 machine check handler to use unlocked_ioctl insteadNikanth Karthikesan
The machine check handler registers ioctl handler that is called with the BKL held. Changing to register unlocked_ioctl instead. Also mce ioctl handler does not seem to need any lock protection. To: Andi Kleen <andi@firstfloor.org> Cc: linux-kernel@vger.kernel.org Cc: kernel-janitors@vger.kernel.org Change the Machine check handler to use unlocked_ioctl instead of ioctl handler. Also the mce ioctl handler does not need any lock protection. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86-64: honor notify_die() returning NOTIFY_STOPJan Beulich
This requires making die() return a value, making its callers honor this (and be prepared that it may return), and making oops_end() have two additional parameters. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30mcheck mce_64: mce_read_sem to mutexDaniel Walker
Converted to a mutex, and changed the name to mce_read_mutex. Signed-off-by: Daniel Walker <dwalker@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: rename the struct pt_regs members for 32/64-bit consistencyH. Peter Anvin
We have a lot of code which differs only by the naming of specific members of structures that contain registers. In order to enable additional unifications, this patch drops the e- or r- size prefix from the register names in struct pt_regs, and drops the x- prefixes for segment registers on the 32-bit side. This patch also performs the equivalent renames in some additional places that might be candidates for unification in the future. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: add set/clear_cpu_cap operationsJeremy Fitzhardinge
The patch to suppress bitops-related warnings added a pile of ugly casts. Many of these were related to the management of x86 CPU capabilities. Clean these up by adding specific set/clear_cpu_cap macros, and use them consistently. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86 mce_64.c: make struct mcelog staticAdrian Bunk
This patch makes the needlessly global struct mcelog static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-24Driver core: change sysdev classes to use dynamic kobject namesKay Sievers
All kobjects require a dynamically allocated name now. We no longer need to keep track if the name is statically assigned, we can just unconditionally free() all kobject names on cleanup. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-17x86: fix cpu-hotplug regressionAndreas Herrmann
Commit d435d862baca3e25e5eec236762a43251b1e7ffc ("cpu hotplug: mce: fix cpu hotplug error handling") changed the error handling in mce_cpu_callback. In cases where not all CPUs are brought up during boot (e.g. using maxcpus and additional_cpus parameters) mce_cpu_callback now returns NOTFIY_BAD because for such CPUs cpu_data is not completely filled when the notifier is called. Thus mce_create_device fails right at its beginning: if (!mce_available(&cpu_data[cpu])) return -EIO; As a quick fix I suggest to check boot_cpu_data for MCE. To reproduce this regression: (1) boot with maxcpus=2 addtional_cpus=2 on a 4 CPU x86-64 system (2) # echo 1 >/sys/devices/system/cpu/cpu2/online -bash: echo: write error: Invalid argument dmesg shows: _cpu_up: attempt to bring up CPU 2 failed Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-11-14x86: don't call mce_create_device on CPU_UP_PREPAREAndreas Herrmann
Fix regression introduced with d435d862baca3e25e5eec236762a43251b1e7ffc ("cpu hotplug: mce: fix cpu hotplug error handling"). A CPU which was not brought up during boot (using maxcpus and additional_cpus parameters) couldn't be onlined anymore. For such a CPU it seemed that MCE was not supported during CPU_UP_PREPARE-time which caused mce_cpu_callback to return NOTIFY_BAD to notifier_call_chain. To fix this we: - call mce_create_device for CPU_ONLINE event (instead of CPU_UP_PREPARE), - avoid mce_remove_device() for the CPU that is not correctly initialized by mce_create_device() failure, - make mce_cpu_callback always return NOTIFY_OK for CPU_ONLINE event. Because CPU_ONLINE callback return value is always ignored. [akinobu.mita@gmail.com: avoid mce_remove_device() for not initialized device] [akinobu.mita@gmail.com: make mce_cpu_callback always return NOTIFY_OK] Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-23x86: whitespace cleanup of mce_64.cThomas Gleixner
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-23x86: consolidate the cpu/ related code usageThomas Gleixner
The x86_64 arch/x86/kernel/Makefile uses references into arch/x86/kernel/cpu/... to use code from there. Unifiy it with the nicely structured i386 way and reuse the existing subdirectory make rules. Also move the machine check related source into ...kernel/cpu/mcheck, where the other machine check related code is. No code change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>