aboutsummaryrefslogtreecommitdiff
path: root/include/asm-i386
AgeCommit message (Collapse)Author
2006-10-14ACPI: Processor native C-states using MWAITVenkatesh Pallipadi
Intel processors starting with the Core Duo support support processor native C-state using the MWAIT instruction. Refer: Intel Architecture Software Developer's Manual http://www.intel.com/design/Pentium4/manuals/253668.htm Platform firmware exports the support for Native C-state to OS using ACPI _PDC and _CST methods. Refer: Intel Processor Vendor-Specific ACPI: Interface Specification http://www.intel.com/technology/iapc/acpi/downloads/302223.htm With Processor Native C-state, we use 'MWAIT' instruction on the processor to enter different C-states (C1, C2, C3). We won't use the special IO ports to enter C-state and no SMM mode etc required to enter C-state. Overall this will mean better C-state support. One major advantage of using MWAIT for all C-states is, with this and "treat interrupt as break event" feature of MWAIT, we can now get accurate timing for the time spent in C1, C2, .. states. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Len Brown <len.brown@intel.com>
2006-10-12[VOYAGER] fix up ptregs removal messJames Bottomley
Apparently whoever converted voyager never actually checked that the patch would compile ... Remove as much of the pt_regs references as possible and move the remaining ones into line with what's in x86 generic. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-10-12[VOYAGER] fix up attribute packed specifiers in voyager.hJames Bottomley
The old style (attribute on each structure entry) never really worked. Move it to an attribute per structure Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-10-11[PATCH] uaccess.h: match kernel-doc and function namesRandy Dunlap
Place kernel-doc function comment header immediately before the function that is being documented. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11[PATCH] Consolidate check_signatureMatthew Wilcox
There's nothing arch-specific about check_signature(), so move it to <linux/io.h>. Use a cross between the Alpha and i386 implementations as the generic one. Signed-off-by: Matthew Wilcox <willy@parisc-linux.org> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11[PATCH] epoll_pwait()Davide Libenzi
Implement the epoll_pwait system call, that extend the event wait mechanism with the same logic ppoll and pselect do. The definition of epoll_pwait is: int epoll_pwait(int epfd, struct epoll_event *events, int maxevents, int timeout, const sigset_t *sigmask, size_t sigsetsize); The difference between the vanilla epoll_wait and epoll_pwait is that the latter allows the caller to specify a signal mask to be set while waiting for events. Hence epoll_pwait will wait until either one monitored event, or an unmasked signal happen. If sigmask is NULL, the epoll_pwait system call will act exactly like epoll_wait. For the POSIX definition of pselect, information is available here: http://www.opengroup.org/onlinepubs/009695399/functions/select.html Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Andi Kleen <ak@muc.de> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-08[PATCH] i386/x86_64: Remove global IO_APIC_VECTOREric W. Biederman
Which vector an irq is assigned to now varies dynamically and is not needed outside of io_apic.c. So remove the possibility of accessing the information outside of io_apic.c and remove the silly macro that makes looking for users of irq_vector difficult. The fact this compiles ensures there aren't any more pieces of the old CONFIG_PCI_MSI weirdness that I failed to remove. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-06[PATCH] make kernels with CONFIG_X86_GENERIC and !CONFIG_SMP compilableJiri Kosina
CONFIG_X86_GENERIC is not exclusively CONFIG_SMP, as mach-default/ could be compiled also for UP archs. The patch fixes compilation error in include/asm/mach-summit/mach_apic.h in case CONFIG_X86_GENERIC && !CONFIG_SMP Signed-off-by: Jiri Kosina <jikos@jikos.cz> Acked-by: Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-05IRQ: Maintain regs pointer globally rather than passing to IRQ handlersDavid Howells
Maintain a per-CPU global "struct pt_regs *" variable which can be used instead of passing regs around manually through all ~1800 interrupt handlers in the Linux kernel. The regs pointer is used in few places, but it potentially costs both stack space and code to pass it around. On the FRV arch, removing the regs parameter from all the genirq function results in a 20% speed up of the IRQ exit path (ie: from leaving timer_interrupt() to leaving do_IRQ()). Where appropriate, an arch may override the generic storage facility and do something different with the variable. On FRV, for instance, the address is maintained in GR28 at all times inside the kernel as part of general exception handling. Having looked over the code, it appears that the parameter may be handed down through up to twenty or so layers of functions. Consider a USB character device attached to a USB hub, attached to a USB controller that posts its interrupts through a cascaded auxiliary interrupt controller. A character device driver may want to pass regs to the sysrq handler through the input layer which adds another few layers of parameter passing. I've build this code with allyesconfig for x86_64 and i386. I've runtested the main part of the code on FRV and i386, though I can't test most of the drivers. I've also done partial conversion for powerpc and MIPS - these at least compile with minimal configurations. This will affect all archs. Mostly the changes should be relatively easy. Take do_IRQ(), store the regs pointer at the beginning, saving the old one: struct pt_regs *old_regs = set_irq_regs(regs); And put the old one back at the end: set_irq_regs(old_regs); Don't pass regs through to generic_handle_irq() or __do_IRQ(). In timer_interrupt(), this sort of change will be necessary: - update_process_times(user_mode(regs)); - profile_tick(CPU_PROFILING, regs); + update_process_times(user_mode(get_irq_regs())); + profile_tick(CPU_PROFILING); I'd like to move update_process_times()'s use of get_irq_regs() into itself, except that i386, alone of the archs, uses something other than user_mode(). Some notes on the interrupt handling in the drivers: (*) input_dev() is now gone entirely. The regs pointer is no longer stored in the input_dev struct. (*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does something different depending on whether it's been supplied with a regs pointer or not. (*) Various IRQ handler function pointers have been moved to type irq_handler_t. Signed-Off-By: David Howells <dhowells@redhat.com> (cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
2006-10-04Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/confighLinus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davej/configh: Remove all inclusions of <linux/config.h> Manually resolved trivial path conflicts due to removed files in the sound/oss/ subdirectory.
2006-10-04[PATCH] msi: refactor and move the msi irq_chip into the arch codeEric W. Biederman
It turns out msi_ops was simply not enough to abstract the architecture specific details of msi. So I have moved the resposibility of constructing the struct irq_chip to the architectures, and have two architecture specific functions arch_setup_msi_irq, and arch_teardown_msi_irq. For simple architectures those functions can do all of the work. For architectures with platform dependencies they can call into the appropriate platform code. With this msi.c is finally free of assuming you have an apic, and this actually takes less code. The helpers for the architecture specific code are declared in the linux/msi.h to keep them separate from the msi functions used by drivers in linux/pci.h Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Tony Luck <tony.luck@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] Initial generic hypertransport interrupt supportEric W. Biederman
This patch implements two functions ht_create_irq and ht_destroy_irq for use by drivers. Several other functions are implemented as helpers for arch specific irq_chip handlers. The driver for the card I tested this on isn't yet ready to be merged. However this code is and hypertransport irqs are in use in a few other places in the kernel. Not that any of this will get merged before 2.6.19 Because the ipath-ht400 is slightly out of spec this code will need to be generalized to work there. I think all of the powerpc uses are for a plain interrupt controller in a chipset so support for native hypertransport devices is a little less interesting. However I think this is a half way decent model on how to separate arch specific and generic helper code, and I think this is a functional model of how to get the architecture dependencies out of the msi code. [akpm@osdl.org: Kconfig fix] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Greg KH <greg@kroah.com> Cc: Andi Kleen <ak@muc.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] genirq: i386 irq: Remove the msi assumption that irq == vectorEric W. Biederman
This patch removes the change in behavior of the irq allocation code when CONFIG_PCI_MSI is defined. Removing all instances of the assumption that irq == vector. create_irq is rewritten to first allocate a free irq and then to assign that irq a vector. assign_irq_vector is made static and the AUTO_ASSIGN case which allocates an vector not bound to an irq is removed. The ioapic vector methods are removed, and everything now works with irqs. The definition of NR_IRQS no longer depends on CONFIG_PCI_MSI [akpm@osdl.org: cleanup] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] genirq: i386 irq: Move msi message composition into io_apic.cEric W. Biederman
This removes the hardcoded assumption that irq == vector in the msi composition code, and it allows the msi message composition to setup logical mode, or lowest priorirty delivery mode as we do for other apic interrupts, and with the same selection criteria. Basically this moves the problem of what is in the msi message into the architecture irq management code where it belongs. Not in a generic layer that doesn't have enough information to compose msi messages properly. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] genirq: convert the i386 architecture to irq-chipsIngo Molnar
This patch converts all the i386 PIC controllers (except VisWS and Voyager, which I could not test - but which should still work as old-style IRQ layers) to the new and simpler irq-chip interrupt handling layer. [akpm@osdl.org: build fix] [mingo@elte.hu: enable fasteoi handler for i386 level-triggered IO-APIC irqs] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04Remove all inclusions of <linux/config.h>Dave Jones
kbuild explicitly includes this at build time. Signed-off-by: Dave Jones <davej@redhat.com>
2006-10-03[PATCH] i383 numa: fix numaq/summit apicid conflictKeith Mannthey
This allows numaq to properly align cpus to their given node during boot. Pass logical apicid to apicid_to_node and allow the summit sub-arch to use physical apicid (hard_smp_processor_id()). Tested against numaq and summit based systems with no issues. Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03[PATCH] sched: introduce child field in sched_domainSiddha, Suresh B
Introduce the child field in sched_domain struct and use it in sched_balance_self(). We will also use this field in cleaning up the sched group cpu_power setup(done in a different patch) code. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03[PATCH] kernel-doc for kernel/dma.cRandy Dunlap
Add kernel-doc function headers in kernel/dma.c and use it in DocBook. Clean up kernel-doc in mca_dma.h (the colon (':') represents a section header). Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] remove remaining errno and __KERNEL_SYSCALLS__ referencesArnd Bergmann
The last in-kernel user of errno is gone, so we should remove the definition and everything referring to it. This also removes the now-unused lib/execve.c file that was introduced earlier. Also remove every trace of __KERNEL_SYSCALLS__ that still remained in the kernel. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Andi Kleen <ak@muc.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Hirokazu Takata <takata.hirokazu@renesas.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Chris Zankel <chris@zankel.net> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] namespaces: utsname: use init_utsname when appropriateSerge E. Hallyn
In some places, particularly drivers and __init code, the init utsns is the appropriate one to use. This patch replaces those with a the init_utsname helper. Changes: Removed several uses of init_utsname(). Hope I picked all the right ones in net/ipv4/ipconfig.c. These are now changed to utsname() (the per-process namespace utsname) in the previous patch (2/7) [akpm@osdl.org: CIFS fix] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Andrey Savochkin <saw@sw.ru> Cc: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] namespaces: utsname: switch to using uts namespacesSerge E. Hallyn
Replace references to system_utsname to the per-process uts namespace where appropriate. This includes things like uname. Changes: Per Eric Biederman's comments, use the per-process uts namespace for ELF_PLATFORM, sunrpc, and parts of net/ipv4/ipconfig.c [jdike@addtoit.com: UML fix] [clg@fr.ibm.com: cleanup] [akpm@osdl.org: build fix] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Andrey Savochkin <saw@sw.ru> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] Add regs_return_value() helperAnanth N Mavinakayanahalli
Add the regs_return_value() macro to extract the return value in an architecture agnostic manner, given the pt_regs. Other architecture maintainers may want to add similar helpers. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] paravirt: update pte hookZachary Amsden
Add a pte_update_hook which notifies about pte changes that have been made without using the set_pte / clear_pte interfaces. This allows shadow mode hypervisors which do not trap on page table access to maintain synchronized shadows. It also turns out, there was one pte update in PAE mode that wasn't using any accessor interface at all for setting NX protection. Considering it is PAE specific, and the accessor is i386 specific, I didn't want to add a generic encapsulation of this behavior yet. Signed-off-by: Zachary Amsden <zach@vmware.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] paravirt: remove set pte atomicZachary Amsden
Now that ptep_establish has a definition in PAE i386 3-level paging code, the only paging model which is insane enough to have multi-word hardware PTEs which are not efficient to set atomically, we can remove the ghost of set_pte_atomic from other architectures which falesly duplicated it, and remove all knowledge of it from the generic pgtable code. set_pte_atomic is now a private pte operator which is specific to i386 Signed-off-by: Zachary Amsden <zach@vmware.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] paravirt: optimize ptep establish for paeZachary Amsden
The ptep_establish macro is only used on user-level PTEs, for P->P mapping changes. Since these always happen under protection of the pagetable lock, the strong synchronization of a 64-bit cmpxchg is not needed, in fact, not even a lock prefix needs to be used. We can simply instead clear the P-bit, followed by a normal set. The write ordering is still important to avoid the possibility of the TLB snooping a partially written PTE and getting a bad mapping installed. Signed-off-by: Zachary Amsden <zach@vmware.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] paravirt: kpte flushZachary Amsden
Create a new PTE function which combines clearing a kernel PTE with the subsequent flush. This allows the two to be easily combined into a single hypercall or paravirt-op. More subtly, reverse the order of the flush for kmap_atomic. Instead of flushing on establishing a mapping, flush on clearing a mapping. This eliminates the possibility of leaving stale kmap entries which may still have valid TLB mappings. This is required for direct mode hypervisors, which need to reprotect all mappings of a given page when changing the page type from a normal page to a protected page (such as a page table or descriptor table page). But it also provides some nicer semantics for real hardware, by providing extra debug-proofing against using stale mappings, as well as ensuring that no stale mappings exist when changing the cacheability attributes of a page, which could lead to cache conflicts when two different types of mappings exist for the same page. Signed-off-by: Zachary Amsden <zach@vmware.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] paravirt: combine flush accessed dirty.patchZachary Amsden
Remove ptep_test_and_clear_{dirty|young} from i386, and instead use the dominating functions, ptep_clear_flush_{dirty|young}. This allows the TLB page flush to be contained in the same macro, and allows for an eager optimization - if reading the PTE initially returned dirty/accessed, we can assume the fact that no subsequent update to the PTE which cleared accessed / dirty has occurred, as the only way A/D bits can change without holding the page table lock is if a remote processor clears them. This eliminates an extra branch which came from the generic version of the code, as we know that no other CPU could have cleared the A/D bit, so the flush will always be needed. We still export these two defines, even though we do not actually define the macros in the i386 code: #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_DIRTY The reason for this is that the only use of these functions is within the generic clear_flush functions, and we want a strong guarantee that there are no other users of these functions, so we want to prevent the generic code from defining them for us. Signed-off-by: Zachary Amsden <zach@vmware.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] stack overflow safe kdump: safe_smp_processor_id()Fernando Vazquez
This is a the first of a series of patch-sets aiming at making kdump more robust against stack overflows. This patch set does the following: * Add safe_smp_processor_id function to i386 architecture (this function was inspired by the x86_64 function of the same name). * Substitute "smp_processor_id" with the stack overflow-safe "safe_smp_processor_id" in the reboot path to the second kernel. This patch: On the event of a stack overflow critical data that usually resides at the bottom of the stack is likely to be stomped and, consequently, its use should be avoided. In particular, in the i386 and IA64 architectures the macro smp_processor_id ultimately makes use of the "cpu" member of struct thread_info which resides at the bottom of the stack. x86_64, on the other hand, is not affected by this problem because it benefits from the use of the PDA infrastructure. To circumvent this problem I suggest implementing "safe_smp_processor_id()" (it already exists in x86_64) for i386 and IA64 and use it as a replacement for smp_processor_id in the reboot path to the dump capture kernel. This is a possible implementation for i386. Signed-off-by: Fernando Vazquez <fernando@intellilink.co.jp> Looks-reasonable-to: Andi Kleen <ak@muc.de> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@in.ibm.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] Directed yield: cpu_relax variants for spinlocks and rw-locksMartin Schwidefsky
On systems running with virtual cpus there is optimization potential in regard to spinlocks and rw-locks. If the virtual cpu that has taken a lock is known to a cpu that wants to acquire the same lock it is beneficial to yield the timeslice of the virtual cpu in favour of the cpu that has the lock (directed yield). With CONFIG_PREEMPT="n" this can be implemented by the architecture without common code changes. Powerpc already does this. With CONFIG_PREEMPT="y" the lock loops are coded with _raw_spin_trylock, _raw_read_trylock and _raw_write_trylock in kernel/spinlock.c. If the lock could not be taken cpu_relax is called. A directed yield is not possible because cpu_relax doesn't know anything about the lock. To be able to yield the lock in favour of the current lock holder variants of cpu_relax for spinlocks and rw-locks are needed. The new _raw_spin_relax, _raw_read_relax and _raw_write_relax primitives differ from cpu_relax insofar that they have an argument: a pointer to the lock structure. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-30[PATCH] x86: Clean up x86 NMI sysctlsAndi Kleen
Use prototypes in headers Don't define panic_on_unrecovered_nmi for all architectures Cc: dzickus@redhat.com Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-29[PATCH] simplify update_times (avoid jiffies/jiffies_64 aliasing problem)Atsushi Nemoto
Pass ticks to do_timer() and update_times(), and adjust x86_64 and s390 timer interrupt handler with this change. Currently update_times() calculates ticks by "jiffies - wall_jiffies", but callers of do_timer() should know how many ticks to update. Passing ticks get rid of this redundant calculation. Also there are another redundancy pointed out by Martin Schwidefsky. This cleanup make a barrier added by 5aee405c662ca644980c184774277fc6d0769a84 needless. So this patch removes it. As a bonus, this cleanup make wall_jiffies can be removed easily, since now wall_jiffies is always synced with jiffies. (This patch does not really remove wall_jiffies. It would be another cleanup patch) Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Cc: Andi Kleen <ak@muc.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Acked-by: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Mikael Starvik <starvik@axis.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Hirokazu Takata <takata.hirokazu@renesas.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Chris Zankel <chris@zankel.net> Acked-by: "Luck, Tony" <tony.luck@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29[PATCH] Use valid_dma_direction() in include/asm-i386/dma-mapping.hRolf Eike Beer
Now that the generic DMA code has a function to decide if a given DMA mapping is valid use it. This will catch cases where direction is not any of the defined enum values but some random number outside the valid range. The current implementation will only catch the defined but invalid case DMA_NONE. Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de> Acked-by: Muli Ben-Yehuda <muli@il.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29[PATCH] convert i386 Summit subarch to use SRAT info for apicid_to_node callskeith mannthey
Convert the i386 summit subarch apicid_to_node to use node information provided by the SRAT. It was discussed a little on LKML a few weeks ago and was seen as an acceptable fix. The current way of obtaining the nodeid static inline int apicid_to_node(int logical_apicid) { return logical_apicid >> 5; } is just not correct for all summit systems/bios. Assuming the apicid matches the Linux node number require a leap of faith that the bios mapped out the apicids a set way. Modern summit HW (IBM x460) does not layout its bios in the manner for various reasons and is unable to boot i386 numa. The best way to get the correct apicid to node information is from the SRAT table during boot. It lays out what apicid belongs to what node. I use this information to create a table for use at run time. Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] i386: Use early clobbers for semaphores nowAndi Kleen
The new code does clobber the result early, so make sure to tell gcc to not put it into the same register as a input argument Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andrew Morton <akpm@osdl.org> Acked-by: Kyle McMartin <kyle@parisc-linux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] consistently use MAX_ERRNO in __syscall_returnRandy Dunlap
Consistently use MAX_ERRNO when checking for errors in __syscall_return(). [ralf@linux-mips.org: build fix] Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6Linus Torvalds
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (225 commits) [PATCH] Don't set calgary iommu as default y [PATCH] i386/x86-64: New Intel feature flags [PATCH] x86: Add a cumulative thermal throttle event counter. [PATCH] i386: Make the jiffies compares use the 64bit safe macros. [PATCH] x86: Refactor thermal throttle processing [PATCH] Add 64bit jiffies compares (for use with get_jiffies_64) [PATCH] Fix unwinder warning in traps.c [PATCH] x86: Allow disabling early pci scans with pci=noearly or disallowing conf1 [PATCH] x86: Move direct PCI scanning functions out of line [PATCH] i386/x86-64: Make all early PCI scans dependent on CONFIG_PCI [PATCH] Don't leak NT bit into next task [PATCH] i386/x86-64: Work around gcc bug with noreturn functions in unwinder [PATCH] Fix some broken white space in ia32_signal.c [PATCH] Initialize argument registers for 32bit signal handlers. [PATCH] Remove all traces of signal number conversion [PATCH] Don't synchronize time reading on single core AMD systems [PATCH] Remove outdated comment in x86-64 mmconfig code [PATCH] Use string instructions for Core2 copy/clear [PATCH] x86: - restore i8259A eoi status on resume [PATCH] i386: Split multi-line printk in oops output. ...
2006-09-26[PATCH] Split i386 and x86_64 ptrace.hJeff Dike
The use of SEGMENT_RPL_MASK in the i386 ptrace.h introduced by x86-allow-a-kernel-to-not-be-in-ring-0.patch broke the UML build, as UML includes the underlying architecture's ptrace.h, but has no easy access to the x86 segment definitions. Rather than kludging around this, as in the past, this patch splits the userspace-usable parts, which are the bits that UML needs, of ptrace.h into ptrace-abi.h, which is included back into ptrace.h. Thus, there is no net effect on i386. As a side-effect, this creates a ptrace header which is close to being usable in /usr/include. x86_64 is also treated in this way for consistency. There was some trailing whitespace there, which is cleaned up. Signed-off-by: Jeff Dike <jdike@addtoit.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26[PATCH] x86: trivial move of ptep_set_access_flagsRusty Russell
Move ptep_set_access_flags to be closer to the other ptep accessors, and make the indentation standard. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26[PATCH] x86: trivial move of __HAVE macros in i386 pagetable headersRusty Russell
Move the __HAVE_ARCH_PTEP defines to accompany the function definitions. Anything else is just a complete nightmare to track through the 2/3-level paging code, and this caused duplicate definitions to be needed (pte_same), which could have easily been taken care of with the asm-generic pgtable functions. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26[PATCH] x86: make __FIXADDR_TOP variable to allow it to make space for a ↵Jeremy Fitzhardinge
hypervisor Make __FIXADDR_TOP a variable, so that it can be set to not get in the way of address space a hypervisor may want to reserve. Original patch by Gerd Hoffmann <kraxel@suse.de> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Gerd Hoffmann <kraxel@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26[PATCH] x86: roll all the cpuid asm into one __cpuid callRusty Russell
It's a little neater, and also means only one place to patch for paravirtualization. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26[PATCH] x86: implement always-locked bit ops, for memory shared with an SMP ↵Chris Wright
hypervisor Add "always lock'd" implementations of set_bit, clear_bit and change_bit and the corresponding test_and_ functions. Also add "always lock'd" implementation of cmpxchg. These give guaranteed strong synchronisation and are required for non-SMP kernels running on an SMP hypervisor. Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26[PATCH] Use BUG_ON(foo) instead of "if (foo) BUG()" in ↵Rolf Eike Beer
include/asm-i386/dma-mapping.h Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26[PATCH] Standardize pxx_page macrosDave McCracken
One of the changes necessary for shared page tables is to standardize the pxx_page macros. pte_page and pmd_page have always returned the struct page associated with their entry, while pte_page_kernel and pmd_page_kernel have returned the kernel virtual address. pud_page and pgd_page, on the other hand, return the kernel virtual address. Shared page tables needs pud_page and pgd_page to return the actual page structures. There are very few actual users of these functions, so it is simple to standardize their usage. Since this is basic cleanup, I am submitting these changes as a standalone patch. Per Hugh Dickins' comments about it, I am also changing the pxx_page_kernel macros to pxx_page_vaddr to clarify their meaning. Signed-off-by: Dave McCracken <dmccr@us.ibm.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26[PATCH] convert i386 NUMA KVA space to bootmemkeith mannthey
Address a long standing issue of booting with an initrd on an i386 numa system. Currently (and always) the numa kva area is mapped into low memory by finding the end of low memory and moving that mark down (thus creating space for the kva). The issue with this is that Grub loads initrds into this similar space so when the kernel check the initrd it finds it outside max_low_pfn and disables it (it thinks the initrd is not mapped into usable memory) thus initrd enabled kernels can't boot i386 numa :( My solution to the problem just converts the numa kva area to use the bootmem allocator to save it's area (instead of moving the end of low memory). Using bootmem allows the kva area to be mapped into more diverse addresses (not just the end of low memory) and enables the kva area to be mapped below the initrd if present. I have tested this patch on numaq(no initrd) and summit(initrd) i386 numa based systems. [akpm@osdl.org: cleanups] Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26[PATCH] x86: Add a cumulative thermal throttle event counter.Dmitriy Zavin
The counter is exported to /sys that keeps track of the number of thermal events, such that the user knows how bad the thermal problem might be (since the logging to syslog and mcelog is rate limited). AK: Fixed cpu hotplug locking Signed-off-by: Dmitriy Zavin <dmitriyz@google.com> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26[PATCH] x86: Refactor thermal throttle processingDmitriy Zavin
Refactor the event processing (syslog messaging and rate limiting) into separate file therm_throt.c. This allows consistent reporting of CPU thermal throttle events. After ACK'ing the interrupt, if the event is current, the user (p4.c/mce_intel.c) calls therm_throt_process to log (and rate limit) the event. If that function returns 1, the user has the option to log things further (such as to mce_log in x86_64). AK: minor cleanup Signed-off-by: Dmitriy Zavin <dmitriyz@google.com> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26[PATCH] i386/x86-64: Work around gcc bug with noreturn functions in unwinderJan Beulich
Current gcc generates calls not jumps to noreturn functions. When that happens the return address can point to the next function, which confuses the unwinder. This patch works around it by marking asynchronous exception frames in contrast normal call frames in the unwind information. Then teach the unwinder to decode this. For normal call frames the unwinder now subtracts one from the address which avoids this problem. The standard libgcc unwinder uses the same trick. It doesn't include adjustment of the printed address (i.e. for the original example, it'd still be kernel_math_error+0 that gets displayed, but the unwinder wouldn't get confused anymore. This only works with binutils 2.6.17+ and some versions of H.J.Lu's 2.6.16 unfortunately because earlier binutils don't support .cfi_signal_frame [AK: added automatic detection of the new binutils and wrote description] Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26[PATCH] i386: Fix pack_descriptor()Jeremy Fitzhardinge
Fix pack_descriptor: 1. flags are bits 20-23 in the high word 2. limit's 4 msb are bits 16-19 in the high word These haven't mattered so far, because all users have had small limits and a flags setting of 0. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> ===================================================================