aboutsummaryrefslogtreecommitdiff
path: root/arch/i386/kernel
AgeCommit message (Collapse)Author
2007-02-03[CPUFREQ] Longhaul - Remove "ignore_latency" optionRafał Bilski
There is no need to have this option in Longhaul anymore. It was for laptop with CLE266 chipset in times, when only ACPI C3 was used to switch frequency. Now we have native support not only for CLE266, but CN400 too. Would be good to have support for PN266, but I can't find datasheet for it. Looks like BIOS for CPU's faster then 1GHz don't support ACPI C2 nor C3. Signed-off-by: Rafał Bilski <rafalbilski@interia.pl> Signed-off-by: Dave Jones <davej@redhat.com>
2007-02-02Revert "[PATCH] fix typo in geode_configre()@cyrix.c"Linus Torvalds
This reverts commit e4f0ae0ea63caceff37a13f281a72652b7ea71ba. It's not wrong, but it's not right either, and everybody seems to agree that the right fix is probably to do the ccr3 write after the ccr4 one (and that we also should clean it up a bit). And after that we need to really validate that all the bits that we write to ccr4 actually do work. The old 2.6.19 code was insane, and basically didn't change ccr4 at all (even though it certainly looks like it was the *intent* to do so). So let's revert the change that may fix things, just because it's not what was actually ever tested when the code was written, even if it _was_ the intent. There's a discussion on http://lkml.org/lkml/2007/1/9/63 that was started by the patch that now gets reverted, and that discussion may well contain the proper long-term fix. Suggested-by: Adrian Bunk <bunk@stusta.de> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreqLinus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] Remove unneeded errata workaround from p4-clockmod. [CPUFREQ] check sysfs_create_link return value
2007-01-30[PATCH] i386: In assign_irq_vector look at all vectors before giving upEric W. Biederman
When the world was a simple and static place setting up irqs was easy. It sufficed to allocate a linux irq number and a find a free cpu vector we could receive that linux irq on. In those days it was a safe assumption that any allocated vector was actually in use so after one global pass through all of the vectors we would have none left. These days things are much more dynamic with interrupt controllers (in the form of MSI or MSI-X) appearing on plug in cards and linux irqs appearing and disappearing. As these irqs come and go vectors are allocated and freed, invalidating the ancient assumption that all allocated vectors stayed in use forever. So this patch modifies the vector allocator to walk through every possible vector before giving up, and to check to see if a vector is in use before assigning it. With these changes we stop leaking freed vectors and it becomes possible to allocate and free irq vectors all day long. This changed was modeled after the vector allocator on x86_64 where this limitation has already been removed. In essence we don't update the static variables that hold the position of the last vector we allocated until have successfully allocated another vector. This allows us to detect if we have completed one complete scan through all of the possible vectors. Acked-by: Auke Kok <auke-jan.h.kok@intel.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-29[CPUFREQ] Remove unneeded errata workaround from p4-clockmod.Dave Jones
This workaround unnecessarily cripples functionality to work around an errata that doesn't seem possible to hit due to us using the automatic clock throttling in the p4 mcheck code. See http://lkml.org/lkml/2006/10/28/148 for complete reasoning and lack of disconsent. Signed-off-by: Dave Jones <davej@redhat.com>
2007-01-26[PATCH] i386 vDSO: use VM_ALWAYSDUMPRoland McGrath
This patch fixes core dumps to include the vDSO vma, which is left out now. It removes the special-case core writing macros, which were not doing the right thing for the vDSO vma anyway. Instead, it uses VM_ALWAYSDUMP in the vma; there is no need for the fixmap page to be installed. It handles the CONFIG_COMPAT_VDSO case by making elf_core_dump use the fake vma from get_gate_vma after real vmas in the same way the /proc/PID/maps code does. This changes core dumps so they no longer include the non-PT_LOAD phdrs from the vDSO. I made the change to add them in the first place, but in turned out that nothing ever wanted them there since the advent of NT_AUXV. It's cleaner to leave them out, and just let the phdrs inside the vDSO image speak for themselves. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-26[PATCH] Fix CONFIG_COMPAT_VDSORoland McGrath
I wouldn't mind if CONFIG_COMPAT_VDSO went away entirely. But if it's there, it should work properly. Currently it's quite haphazard: both real vma and fixmap are mapped, both are put in the two different AT_* slots, sysenter returns to the vma address rather than the fixmap address, and core dumps yet are another story. This patch makes CONFIG_COMPAT_VDSO disable the real vma and use the fixmap area consistently. This makes it actually compatible with what the old vdso implementation did. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-23[PATCH] paravirt: mark the paravirt_ops export internalIngo Molnar
The paravirt subsystem is still in flux so all exports from it are definitely internal use only. The APIs around this /will/ change. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <ak@suse.de> Cc: Zachary Amsden <zach@vmware.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-23[PATCH] Revert nmi_known_cpu() check during boot option parsingVenkatesh Pallipadi
Commit f2802e7f571c05f9a901b1f5bd144aa730ccc88e and its x86 version (b7471c6da94d30d3deadc55986cc38d1ff57f9ca) adds nmi_known_cpu() check while parsing boot options in x86_64 and i386. With that, "nmi_watchdog=2" stops working for me on Intel Core 2 CPU based system. The problem is, setup_nmi_watchdog is called while parsing the boot option and identify_cpu is not done yet. So, the return value of nmi_known_cpu() is not valid at this point. So revert that check. This should not have any adverse effect as the nmi_known_cpu() check is done again later in enable_lapic_nmi_watchdog(). Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Andi Kleen <ak@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-22[PATCH] x86: fix PDA variables to work during bootJames Bottomley
The current PDA code, which went in in post 2.6.19 has a flaw in that it doesn't correctly cycle the GDT and %GS segment through the boot PDA, the CPU PDA and finally the per-cpu PDA. The bug generally doesn't show up if the boot CPU id is zero, but everything falls apart for a non zero boot CPU id. The basically kills voyager which is perfectly capable of doing non zero CPU id boots, so voyager currently won't boot without this. The fix is to be careful and actually do the GDT setups correctly. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com> Cc: Andi Kleen <ak@suse.de> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-11Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: Revert "ACPI: ibm-acpi: make non-generic bay support optional" ACPI: update MAINTAINERS ACPI: schedule obsolete features for deletion ACPI: delete two spurious ACPI messages ACPI: rename cstate_entry_s to cstate_entry ACPI: ec: enable printk on cmdline use ACPI: Altix: ACPI _PRT support
2007-01-11[PATCH] fix typo in geode_configre()@cyrix.ctakada
We write back the wrong register when configuring the Geode processor. Instead of storing to CCR4, it stores to CCR3. Cc: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-11[PATCH] i386: sched_clock using init data tsc_disable fixVivek Goyal
o sched_clock() a non-init function is using init data tsc_disable. This is flagged by MODPOST on i386 if CONFIG_RELOCATABLE=y WARNING: vmlinux - Section mismatch: reference to .init.data:tsc_disable from .text between 'sched_clock' (at offset 0xc0109d58) and 'tsc_update_callback' Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Cc: Andi Kleen <ak@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-11Pull trivial into release branchLen Brown
2007-01-10ACPI: rename cstate_entry_s to cstate_entryVenkatesh Pallipadi
style change only. Signed-off-by: Len Brown <len.brown@intel.com>
2007-01-11[PATCH] i386: Convert some functions to __init to avoid MODPOST warningsVivek Goyal
o Some functions which should have been in init sections as they are called only once. Put them in init sections. Otherwise MODPOST generates warning as these functions are placed in .text and they end up accessing something in init sections. WARNING: vmlinux - Section mismatch: reference to .init.text:migration_init from .text between 'do_pre_smp_initcalls' (at offset 0xc01000d1) and 'run_init_process' Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Andi Kleen <ak@suse.de>
2007-01-11[PATCH] i386: cpu hotplug/smpboot misc MODPOST warning fixesVivek Goyal
o Misc smpboot/cpu hotplug path cleanups. I did those to supress the warnings generated by MODPOST. These warnings are visible only if CONFIG_RELOCATABLE=y. o CONFIG_RELOCATABLE compiles the kernel with --emit-relocs option. This option retains relocation information in vmlinux file and MODPOST is quick to spit out "Section mismatch" warnings. o This patch fixes some of those warnings. Many of the functions in smpboot case are __devinit type and they in turn accesses text/data which if of type __cpuinit. Now if CONFIG_HOTPLUG=y and CONFIG_HOTPLUG_CPU=n then we end up in cases where a function in .text segment is calling another function in .init.text segment and MODPOST emits warning. WARNING: vmlinux - Section mismatch: reference to .init.text:identify_cpu from .text between 'smp_store_cpu_info' (at offset 0xc011020d) and 'do_boot_cpu' WARNING: vmlinux - Section mismatch: reference to .init.text:init_gdt from .text between 'do_boot_cpu' (at offset 0xc01102ca) and '__cpu_up' WARNING: vmlinux - Section mismatch: reference to .init.text:print_cpu_info from .text between 'do_boot_cpu' (at offset 0xc01105d0) and '__cpu_up' o It also fixes the issues where CONFIG_HOTPLUG_CPU=y and start_secondary() is calling smp_callin() which in-turn calls synchronize_tsc_ap() which is of type __init. This should have meant broken CPU hotplug. WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'start_secondary' (at offset 0xc011603f) and 'initialize_secondary' WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'MP_processor_info' (at offset 0xc0116a4f) and 'mp_register_lapic' WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'MP_processor_info' (at offset 0xc0116a4f) and 'mp_register_lapic' Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Andi Kleen <ak@suse.de>
2007-01-05[PATCH] i386: modpost smpboot code warning fixVivek Goyal
o Currently synchronize_tsc_ap() is of type __init. It is called by smp_callin() which is of type __cpuinit. So synchronize_tsc_ap() should be of type __cpuinit. o Modpost generates warnings for i386 if CONFIG_RELOCATABLE=y and CONFIG_HOTPLUG_CPU=y WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'start_secondary' (at offset 0xc01164dc) and 'initialize_secondary' WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'start_secondary' (at offset 0xc01164e8) and 'initialize_secondary' o tsc is of type __initdata. It should be of type __cpuinitdata. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Cc: Andi Kleen <ak@suse.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-05[PATCH] i386: fix another modpost warningVivek Goyal
o MODPOST generates warning for i386 if kernel is compiled with CONFIG_RELOCATABLE=y WARNING: vmlinux - Section mismatch: reference to .init.data: from .data between 'this_cpu' (at offset 0xc05194d0) and 'cpuinfo_op' o this_cpu pointer should be of type __cpuinitdata. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Cc: Andi Kleen <ak@suse.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-05[PATCH] i386: fix modpost warning in SMP trampoline codeVivek Goyal
o MODPOST generates warning for i386 if kernel is compiled with CONFIG_RELOCATABLE=y WARNING: vmlinux - Section mismatch: reference to .init.text:startup_32_smp from .data between 'trampoline_data' (at offset 0xc0519cf8) and 'boot_gdt' o trampoline code/data can go into init section is CPU hotplug is not enabled. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Cc: Andi Kleen <ak@suse.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-04Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: ACPI: asus_acpi: new MAINTAINER ACPI: fix section mis-match build warning ACPI: increase ACPI_MAX_REFERENCE_COUNT for larger systems ACPI: EC: move verbose printk to debug build only backlight: fix backlight_device_register compile failures
2007-01-03Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreqLinus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] longhaul: Kill off warnings introduced by recent changes. [CPUFREQ] Uninitialized use of cmd.val in arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c:acpi_cpufreq_target() [CPUFREQ] Longhaul - Always guess FSB [CPUFREQ] Longhaul - Fix up powersaver assumptions. [CPUFREQ] longhaul: Fix up unreachable code. [CPUFREQ] speedstep-centrino: missing space and bracket [CPUFREQ] Bug fix for acpi-cpufreq and cpufreq_stats oops on frequency change notification [CPUFREQ] select consistently
2007-01-02[CPUFREQ] longhaul: Kill off warnings introduced by recent changes.Dave Jones
Bunch of unused vars + one case where gcc isn't smart enough. Signed-off-by: Dave Jones <davej@redhat.com>
2007-01-02[CPUFREQ] Uninitialized use of cmd.val in ↵Guillaume Chazarain
arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c:acpi_cpufreq_target() cmd.val was used uninitialized on the line below. Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>
2007-01-02[CPUFREQ] Longhaul - Always guess FSBRafał Bilski
This is patch that solves Ebox mini PC issue and make FSB code more specification compilant. At start guess_fsb function is guessing 200MHz FSB too. It is better to make it in this way because, thanks to this function, driver will fail for bogus FSB values caused by bogus multiplier value. For PowerSaver processors we can't depend on Max / MinMHzFSB because these values are only used for PowerSaver 2.0 and 3.0. Most processors on which Longhaul is used are PowerSaver 1.0 only. I'm changing code for older CPU's too, but not so much as previously, and this code was already used for Ezra. Using MinMHzBR for Ezra-T is outside spec. It is for voltage scaling purpose and don't have to be equal to minmult (but it is). Same for Nehemiah (it isn't for sure). Added mult - current multiplier value. Signed-off-by: Rafał Bilski <rafalbilski@interia.pl> Signed-off-by: Dave Jones <davej@redhat.com>
2007-01-02ACPI: fix section mis-match build warningLen Brown
Dunno why this pops out in only in the allmodconfig build. Though the warning is accurate, all the callers of the flagged non __init function are __init, this is not a functional change. WARNING: vmlinux - Section mismatch: reference to .init.data:acpi_sci_flags from .text between 'acpi_sci_ioapic_setup' (at offset 0xc010f0a 6) and 'acpi_gsi_to_irq' WARNING: vmlinux - Section mismatch: reference to .init.text:mp_override_legacy_irq from .text between 'acpi_sci_ioapic_setup' (at offset 0 xc010f0de) and 'acpi_gsi_to_irq' WARNING: vmlinux - Section mismatch: reference to .init.data:acpi_sci_override_gsi from .text between 'acpi_sci_ioapic_setup' (at offset 0x c010f0e4) and 'acpi_gsi_to_irq' Signed-off-by: Len Brown <len.brown@intel.com>
2006-12-29[CPUFREQ] Longhaul - Fix up powersaver assumptions.Rafał Bilski
ACPI PM2 register was fallback for "Longhaul ver. 1" CPU's. My assumption that this register isn't present at "PowerSaver" motherboards is so far true, but current code will not work correctly in other case. There are three possible supports: ACPI C3, PM2 and northbridge. That was my assumption that ACPI C3 and northbridge is for PS and northbridge and PM2 is for V1. In current code we can only check if it is ACPI support or not by port22_en. So remove port22_en and add longhaul_flags. If USE_ACPI_C3 and USE_NORTHBRIDGE are both clear then it means ACPI PM2 support. Also change order of support probe from ACPI C3, PM2, northbridge to ACPI C3, northbridge, ACPI PM2. Paranoid protection against port 0x22 cast as ACPI PM2 register. Bit 1 clear in such case - lockup on AGP DMA. And obvious (now) fixup for do_powersaver. Use cx->address only for ACPI C3 ("PowerSaver" processor using PM2 support). Signed-off-by: Rafaż Bilski <rafalbilski@interia.pl> Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-28[CPUFREQ] longhaul: Fix up unreachable code.Dave Jones
Signed-off-by: Rafał Bilski <rafalbilski@interia.pl> Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-22[CPUFREQ] speedstep-centrino: missing space and bracketBrice Goglin
A space and a bracket are missing (and indentation is wrong). Signed-off-by: Brice Goglin <Brice.Goglin@ens-lyon.org> Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-22[CPUFREQ] Bug fix for acpi-cpufreq and cpufreq_stats oops on frequency ↵Venkatesh Pallipadi
change notification Fixes the oops in cpufreq_stats with acpi_cpufreq driver. The issue was that the frequency was reported as 0 in acpi-cpufreq.c. The bug is due to different indicies for freq_table and ACPI perf table. Also adds a check in cpufreq_stats to check for error return from freq_table_get_index() and avoid using the error return value. Patch fixes the issue reported at http://www.ussg.iu.edu/hypermail/linux/kernel/0611.2/0629.html and also other similar issue here http://bugme.osdl.org/show_bug.cgi?id=7383 comment 53 Signed-off-by: Dhaval Giani <dhaval.giani@gmail.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-22Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (68 commits) ACPI: replace kmalloc+memset with kzalloc ACPI: Add support for acpi_load_table/acpi_unload_table_id fbdev: update after backlight argument change ACPI: video: Add dev argument for backlight_device_register ACPI: Implement acpi_video_get_next_level() ACPI: Kconfig - depend on PM rather than selecting it ACPI: fix NULL check in drivers/acpi/osl.c ACPI: make drivers/acpi/ec.c:ec_ecdt static ACPI: prevent processor module from loading on failures ACPI: fix single linked list manipulation ACPI: ibm_acpi: allow clean removal ACPI: fix git automerge failure ACPI: ibm_acpi: respond to workqueue update ACPI: dock: add uevent to indicate change in device status ACPI: ec: Lindent once again ACPI: ec: Change #define to enums there possible. ACPI: ec: Style changes. ACPI: ec: Acquire Global Lock under EC mutex. ACPI: ec: Drop udelay() from poll mode. Loop by reading status field instead. ACPI: ec: Rename gpe_bit to gpe ...
2006-12-22[PATCH] sched: fix bad missed wakeups in the i386, x86_64, ia64, ACPI and ↵Ingo Molnar
APM idle code Fernando Lopez-Lezcano reported frequent scheduling latencies and audio xruns starting at the 2.6.18-rt kernel, and those problems persisted all until current -rt kernels. The latencies were serious and unjustified by system load, often in the milliseconds range. After a patient and heroic multi-month effort of Fernando, where he tested dozens of kernels, tried various configs, boot options, test-patches of mine and provided latency traces of those incidents, the following 'smoking gun' trace was captured by him: _------=> CPU# / _-----=> irqs-off | / _----=> need-resched || / _---=> hardirq/softirq ||| / _--=> preempt-depth |||| / ||||| delay cmd pid ||||| time | caller \ / ||||| \ | / IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup (try_to_wake_up) IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup <<...>-5856> (37 0) IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup (c01262ba 0 0) IRQ_19-1479 1D..1 0us : resched_task (try_to_wake_up) IRQ_19-1479 1D..1 0us : __spin_unlock_irqrestore (try_to_wake_up) ... <idle>-0 1...1 11us!: default_idle (cpu_idle) ... <idle>-0 0Dn.1 602us : smp_apic_timer_interrupt (c0103baf 1 0) ... <...>-5856 0D..2 618us : __switch_to (__schedule) <...>-5856 0D..2 618us : __schedule <<idle>-0> (20 162) <...>-5856 0D..2 619us : __spin_unlock_irq (__schedule) <...>-5856 0...1 619us : trace_stop_sched_switched (__schedule) <...>-5856 0D..1 619us : trace_stop_sched_switched <<...>-5856> (37 0) what is visible in this trace is that CPU#1 ran try_to_wake_up() for PID:5856, it placed PID:5856 on CPU#0's runqueue and ran resched_task() for CPU#0. But it decided to not send an IPI that no CPU - due to TS_POLLING. But CPU#0 never woke up after its NEED_RESCHED bit was set, and only rescheduled to PID:5856 upon the next lapic timer IRQ. The result was a 600+ usecs latency and a missed wakeup! the bug turned out to be an idle-wakeup bug introduced into the mainline kernel this summer via an optimization in the x86_64 tree: commit 495ab9c045e1b0e5c82951b762257fe1c9d81564 Author: Andi Kleen <ak@suse.de> Date: Mon Jun 26 13:59:11 2006 +0200 [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status During some profiling I noticed that default_idle causes a lot of memory traffic. I think that is caused by the atomic operations to clear/set the polling flag in thread_info. There is actually no reason to make this atomic - only the idle thread does it to itself, other CPUs only read it. So I moved it into ti->status. the problem is this type of change: if (!hlt_counter && boot_cpu_data.hlt_works_ok) { - clear_thread_flag(TIF_POLLING_NRFLAG); + current_thread_info()->status &= ~TS_POLLING; smp_mb__after_clear_bit(); while (!need_resched()) { local_irq_disable(); this changes clear_thread_flag() to an explicit clearing of TS_POLLING. clear_thread_flag() is defined as: clear_bit(flag, &ti->flags); and clear_bit() is a LOCK-ed atomic instruction on all x86 platforms: static inline void clear_bit(int nr, volatile unsigned long * addr) { __asm__ __volatile__( LOCK_PREFIX "btrl %1,%0" hence smp_mb__after_clear_bit() is defined as a simple compile barrier: #define smp_mb__after_clear_bit() barrier() but the explicit TS_POLLING clearing introduced by the patch: + current_thread_info()->status &= ~TS_POLLING; is not an atomic op! So the clearing of the TS_POLLING bit is freely reorderable with the reading of the NEED_RESCHED bit - and both now reside in different memory addresses. CPU idle wakeup very much depends on ordered memory ops, the clearing of the TS_POLLING flag must always be done before we test need_resched() and hit the idle instruction(s). [Symmetrically, the wakeup code needs to set NEED_RESCHED before it tests the TS_POLLING flag, so memory ordering is paramount.] Fernando's dual-core Athlon64 system has a sufficiently advanced memory ordering model so that it triggered this scenario very often. ( And it also turned out that the reason why these latencies never triggered on my testsystems is that i routinely use idle=poll, which was the only idle variant not affected by this bug. ) The fix is to change the smp_mb__after_clear_bit() to an smp_mb(), to act as an absolute barrier between the TS_POLLING write and the NEED_RESCHED read. This affects almost all idling methods (default, ACPI, APM), on all 3 x86 architectures: i386, x86_64, ia64. Signed-off-by: Ingo Molnar <mingo@elte.hu> Tested-by: Fernando Lopez-Lezcano <nando@ccrma.Stanford.EDU> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-22[PATCH] ptrace: Fix EFL_OFFSET value according to i386 pda changesJeremy Fitzhardinge
The PDA patches introduced a bug in ptrace: it reads eflags from the wrong place on the target's stack, but writes it back to the correct place. The result is a corrupted eflags, which is most visible when it turns interrupts off unexpectedly. This patch fixes this by making the ptrace code a little less fragile. It changes [gs]et_stack_long to take a straightforward byte offset into struct pt_regs, rather than requiring all callers to do a sizeof(struct pt_regs) offset adjustment. This means that the eflag's offset (EFL_OFFSET) on the target stack can be simply computed with offsetof(). Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Frederik Deweerdt <deweerdt@free.fr> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-22[PATCH] microcode: fix mc_cpu_notifier section warningJean Delvare
Structure mc_cpu_notifier references a __cpuinit function, but isn't declared __cpuinitdata itself: WARNING: arch/i386/kernel/microcode.o - Section mismatch: reference to .init.text: from .data after 'mc_cpu_notifier' (at offset 0x118) Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-22[PATCH] compile error of register_memory()Yasunori Goto
register_memory() becomes double definition in 2.6.20-rc1. It is defined in arch/i386/kernel/setup.c as static definition in 2.6.19. But it is moved to arch/i386/kernel/e820.c in 2.6.20-rc1. And same name function is defined in driver/base/memory.c too. So, it becomes cause of compile error of duplicate definition if memory hotplug option is on. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-20merge linus into test branchLen Brown
2006-12-17Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreqLinus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] longhaul compile fix. [CPUFREQ] Advise not to use longhaul on VIA C7. [CPUFREQ] set policy->curfreq on initialization [CPUFREQ] Trivial cleanup for acpi read/write port in acpi-cpufreq.c [CPUFREQ] fixes typo in cpufreq.c
2006-12-17[CPUFREQ] longhaul compile fix.Dave Jones
Some gcc's are more anal than others about empty switch labels. error: label at end of compound statement Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-17[CPUFREQ] Advise not to use longhaul on VIA C7.Dave Jones
C7's are centrino speedstep-alike. Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-17[CPUFREQ] set policy->curfreq on initializationMattia Dongili
Check the correct variable and set policy->cur upon acpi-cpufreq initialization to allow the userspace governor to be used as default. Signed-off-by: Mattia Dongili <malattia@linux.it> Acked-by: "Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-16ACPI: fix git automerge failureLen Brown
Signed-off-by: Len Brown <len.brown@intel.com>
2006-12-16Pull trivial into test branchLen Brown
Conflicts: drivers/acpi/ec.c
2006-12-15Remove stack unwinder for nowLinus Torvalds
It has caused more problems than it ever really solved, and is apparently not getting cleaned up and fixed. We can put it back when it's stable and isn't likely to make warning or bug events worse. In the meantime, enable frame pointers for more readable stack traces. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13[CPUFREQ] Trivial cleanup for acpi read/write port in acpi-cpufreq.cVenkatesh Pallipadi
Small cleanup in acpi-cpufreq. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-13[PATCH] getting rid of all casts of k[cmz]alloc() callsRobert P. J. Day
Run this: #!/bin/sh for f in $(grep -Erl "\([^\)]*\) *k[cmz]alloc" *) ; do echo "De-casting $f..." perl -pi -e "s/ ?= ?\([^\)]*\) *(k[cmz]alloc) *\(/ = \1\(/" $f done And then go through and reinstate those cases where code is casting pointers to non-pointers. And then drop a few hunks which conflicted with outstanding work. Cc: Russell King <rmk@arm.linux.org.uk>, Ian Molton <spyro@f2s.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Greg KH <greg@kroah.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Paul Fulghum <paulkf@microgate.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Karsten Keil <kkeil@suse.de> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Jeff Garzik <jeff@garzik.org> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Ian Kent <raven@themaw.net> Cc: Steven French <sfrench@us.ibm.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: Jaroslav Kysela <perex@suse.cz> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13[PATCH] update Tigran's email addressesTigran Aivazian
As Adrian pointed out recently, there were still a couple of places where I should have fixed my email address. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13[PATCH] arch/i386/kernel/smpboot.c: remove unneeded ifdefAndrew Morton
#ifdef CONFIG_SMP in a file which isn't compiled in non-SMP kernels. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-12Merge ../linusDave Jones
Conflicts: drivers/cpufreq/cpufreq.c
2006-12-12[CPUFREQ] Longhaul - Add support for CN400Rafał Bilski
Support for CN400 northbridge when ACPI C3 isn't available. Tested on Epia SP13000. Thanks to Robert for testing it. Signed-off-by: Rafał Bilski <rafalbilski@interia.pl> Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-12[CPUFREQ] Longhaul - fix 200MHz FSBRafał Bilski
On board of Epia SP13000 is 10x133Mhz VIA Nehemiah. It is reported as 10x200MHz. This patch is fixing this issue. Signed-off-by: Rafał Bilski <rafalbilski@interia.pl> Signed-off-by: Dave Jones <davej@redhat.com>