aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2007-10-03[POWERPC] Create and use CONFIG_WORD_SIZEStephen Rothwell
Linus made this suggestion for the x86 merge and this starts the process for powerpc. We assume that CONFIG_PPC64 implies CONFIG_PPC_MERGE and CONFIG_PPC_STD_MMU_32 implies CONFIG_PPC_STD_MMU. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-03[POWERPC] Separate out legacy machine check exception parsersOlof Johansson
Move out the old-style exception parsers to a separate function, and don't call it on platforms that have a platform-specific handler. It would make sense to move out the generic versions into their platforms instead, but that can be done gradually down the road. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-03[POWERPC] clk.h interface for platformsDomen Puncer
This provides an implementation of the <linux/clk.h> interface for arch/powerpc using a set of function pointers in clk_functions. Platforms that want to support this interface should fill clk_functions and select CONFIG_PPC_CLOCK in Kconfig. Signed-off-by: Domen Puncer <domen.puncer@telargo.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-03[POWERPC] Inline u3msi_compose_msi_msg()Michael Ellerman
In the MPIC U3 MSI code, we call u3msi_compose_msi_msg() once for each MSI. This is overkill, as the address is per pci device, not per MSI. So setup the address once, and just set the data per MSI. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-03[POWERPC] Simplify rtas_change_msi() error semanticsMichael Ellerman
Currently rtas_change_msi() returns either the error code from RTAS, or if the RTAS call succeeded the number of irqs that were configured by RTAS. This makes checking the return value more complicated than it needs to be. Instead, have rtas_change_msi() check that the number of irqs configured by RTAS is equal to what we requested - and return an error otherwise. This makes the return semantics match the usual 0 for success, something else for error. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-03[POWERPC] Simplify error logic in rtas_setup_msi_irqs()Michael Ellerman
rtas_setup_msi_irqs() doesn't need to call teardown() itself, the generic code will do this for us as long as we return a non-zero value. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-03[POWERPC] Simplify error logic in u3msi_setup_msi_irqs()Michael Ellerman
u3msi_setup_msi_irqs() doesn't need to call teardown() itself, the generic code will do this for us as long as we return a non zero value. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-03[POWERPC] Make sure to of_node_get() the result of pci_device_to_OF_node()Michael Ellerman
pci_device_to_OF_node() returns the device node attached to a PCI device, but doesn't actually grab a reference - we need to do it ourselves. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-03[POWERPC] Move embedded6xx into multiplatformArnd Bergmann
The various embedded 6xx systems can easily coexist in one kernel together with the other 6xx based systems, so there is no strict reason to keep them separate. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-02[POWERPC] pseries: device node status can be "ok" or "okay"Linas Vepstas
It seems that some versions of firmware will report a device node status as the string "okay". As we are not expecting this string, the device node will be ignored by the EEH subsystem. Which means EEH will not be enabled. When EEH is not enabled, PCI errors will be converted into Machine Check exceptions, and we'll have a very unhappy system. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-22[POWERPC] Fix build errors when BLOCK=nEmil Medve
These are the symptom error messages: CC arch/powerpc/kernel/setup_32.o In file included from include/linux/blkdev.h:17, from include/linux/ide.h:13, from arch/powerpc/kernel/setup_32.c:13: include/linux/bsg.h:67: warning: 'struct request_queue' declared inside parameter list include/linux/bsg.h:67: warning: its scope is only this definition or declaration, which is probably not what you want include/linux/bsg.h:71: warning: 'struct request_queue' declared inside parameter list In file included from arch/powerpc/kernel/setup_32.c:13: include/linux/ide.h:857: error: field 'wrq' has incomplete type CC arch/powerpc/kernel/ppc_ksyms.o In file included from include/linux/blkdev.h:17, from include/linux/ide.h:13, from arch/powerpc/kernel/ppc_ksyms.c:15: include/linux/bsg.h:67: warning: 'struct request_queue' declared inside parameter list include/linux/bsg.h:67: warning: its scope is only this definition or declaration, which is probably not what you want include/linux/bsg.h:71: warning: 'struct request_queue' declared inside parameter list In file included from arch/powerpc/kernel/ppc_ksyms.c:15: include/linux/ide.h:857: error: field 'wrq' has incomplete type The fix tries to use the smallest scope CONFIG_* symbols that will fix the build problem. In this case <linux/ide.h> needs to be included only if IDE=y or IDE=m were selected. Also, ppc_ide_md is needed only if BLK_DEV_IDE=y or BLK_DEV_IDE=m Moved the EXPORT_SYMBOL(ppc_ide_md) from ppc_ksysms.c next to its declaration in setup_32.c which made <linux/ide.h> not needed. With <linux/ide.h> gone from ppc_ksyms.c, <asm/cacheflush.h> is needed to address the following warnings and errors: CC arch/powerpc/kernel/ppc_ksyms.o arch/powerpc/kernel/ppc_ksyms.c:122: error: '__flush_icache_range' undeclared here (not in a function) arch/powerpc/kernel/ppc_ksyms.c:122: warning: type defaults to 'int' in declaration of '__flush_icache_range' arch/powerpc/kernel/ppc_ksyms.c:123: error: 'flush_dcache_range' undeclared here (not in a function) arch/powerpc/kernel/ppc_ksyms.c:123: warning: type defaults to 'int' in declaration of 'flush_dcache_range' Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-22[POWERPC] Fix platinumfb framebufferBenjamin Herrenschmidt
Current kernels have a non-working platinumfb due to some resource management issues. This fixes it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-22[POWERPC] Avoid pointless WARN_ON(irqs_disabled()) from panic codepathSatyam Sharma
> ------------[ cut here ]------------ > Badness at arch/powerpc/kernel/smp.c:202 comes when smp_call_function_map() has been called with irqs disabled, which is illegal. However, there is a special case, the panic() codepath, when we do not want to warn about this -- warning at that time is pointless anyway, and only serves to scroll away the *real* cause of the panic and distracts from the real bug. * So let's extract the WARN_ON() from smp_call_function_map() into all its callers -- smp_call_function() and smp_call_function_single() * Also, introduce another caller of smp_call_function_map(), namely __smp_call_function() (and make smp_call_function() a wrapper over this) which does *not* warn about disabled irqs * Use this __smp_call_function() from the panic codepath's smp_send_stop() We also end having to move code of smp_send_stop() below the definition of __smp_call_function(). Signed-off-by: Satyam Sharma <satyam@infradead.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-22[POWERPC] Support setting affinity for U3/U4 MSI sourcesOlof Johansson
Hook up affinity-setting for U3/U4 MSI interrupt sources. Tested on Quad G5 with myri10ge. Signed-off-by: Olof Johansson <olof@lixom.net> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-22[POWERPC] add Kconfig option for optimizing for cellArnd Bergmann
Since the PPE on cell is an in-order core, it suffers significantly from wrong instruction scheduling. This adds a Kconfig option that enables passing -mtune=cell to gcc in order to generate object code that runs well on cell. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-22[POWERPC] cell: Don't cast the result of of_get_property()Jeremy Kerr
The cast to u32 * isn't required, of_get_property returns a void *. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-22[POWERPC] Convert define_machine(mpc885_ads) to C99 initializer syntaxTony Breeds
Make the define_machine() block for mpc885_ads more greppable and consistent with other examples in tree. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-22[POWERPC] mpc5200: Add cuimage support for mpc5200 boardsGrant Likely
Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-22[POWERPC] mpc8349: Add linux,network-index to ethernet nodes in device treeGrant Likely
cuImage needs to know the logical index of the ethernet devices in order to assign mac addresses. This adds the needed properties. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: David Gibson <david@gibson.dropbear.id.au> CC: Scott Wood <scottwood@freescale.com> CC: Kumar Gala <galak@kernel.crashing.org> CC: Timur Tabi <timur@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-22[POWERPC] Fix ppc kernels after build-id additionMeelis Roos
This patch fixes arch/ppc kernels, at least for prep subarch, after build-id addition. Without this, kernels were 3 times the size and bootloader refused to load them. Now they are back to normal again. Tested only with Roland McGrath's "Use LDFLAGS_MODULE only for .ko links" patch applied - boots and works fine. Signed-off-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-22[POWERPC] Prevent direct inclusion of <asm/rwsem.h>.Robert P. J. Day
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-22[POWERPC] adbhid: Enable KEY_FN key reportingAristeu Rozanski
When a Fn key is used in combination with another key in ADB keyboards it will generate a Fn event and then a second event that can be a different key than pressed (Fn + F1 for instance can generate Fn + brightness down if it's configured like that). This enables the reporting of the Fn key to the input system. As Fn is a dead key for most purposes, it's useful to report it so applications can make use of it. One example is apple_mouse (https://jake.ruivo.org/uinputd/trunk/apple_mouse/) that emulates the second and third keys using a combination of keyboard keys and the mouse button. Other applications may use the KEY_FN as a modifier as well. I've been updating and using this patch for months without problems. Signed-off-by: Aristeu Rozanski <aris@ruivo.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-22[POWERPC] Use __attribute__ in asm-powerpcMike Frysinger
Pretty much everyone uses "__attribute__" or "attribute", no one uses "__attribute". This tweaks the three places in asm-powerpc where this comes up. While only asm-powerpc/types.h is interesting (for userspace), I did asm-powerpc/processor.h as well for consistency. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-22[POWERPC] Add Marvell mv64x60 udbg putc/getc functionsDale Farnsworth
Commit 69331af, "Fixes and cleanups for earlyprintk aka boot console", resulted in printk output prior to the initialization of the mpsc console driver not being printed. That commit causes the mpsc's CON_PRINTBUFFER flag to be cleared since udbg should have printed the previous output. I guess we can no longer ignore udbg. :) This patch provides udbg_putc() and udbg_getc() functions for the Marvell mv64x60 chips. These functions are enabled if an mv64x60 port is to be used as the console as determined from the device tree. Signed-off-by: Dale Farnsworth <dale@farnsworth.org> Acked-by: Mark A. Greer <mgreer@mvista.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-22[POWERPC] Optionally use new device number for pmac_zilogDavid Woodhouse
This adds the option for the pmac_zilog driver to use the major/minor numbers recently allocated specifically for it (/dev/ttyPZn) instead of the /dev/ttySn numbers. The advantage of doing this is that it allows the pmac_zilog and 8250 drivers to coexist. The disadvantage of doing this is that it is a user-visible ABI change and it will break existing working setups on powermacs, and could be confusing to users. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-20[POWERPC] Cleanups for physmap_of.c (v2)David Gibson
This patch includes a whole batch of smallish cleanups for drivers/mtd/physmap_of.c. - A bunch of uneeded #includes are removed - We switch to the modern linux/of.h etc. in place of asm/prom.h - Use some helper macros to avoid some ugly inline #ifdefs - A few lines of unreachable code are removed - A number of indentation / line-wrapping fixes - More consistent use of kernel idioms such as if (!p) instead of if (p == NULL) - Clarify some printk()s and other informative strings. - parse_obsolete_partitions() now returns 0 if no partition information is found, instead of returning -ENOENT which the caller had to handle specially. - (the big one) Despite the name, this driver really has nothing to do with drivers/mtd/physmap.c. The fact that the flash chips must be physically direct mapped is a constrant, but doesn't really say anything about the actual purpose of this driver, which is to instantiate MTD devices based on information from the device tree. Therefore the physmap name is replaced everywhere within the file with "of_flash". The file itself and the Kconfig option is not renamed for now (so that the diff is actually a diff). That can come later. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2007-09-20[POWERPC] 4xx: Fix Sequoia MAL0 and EMAC dts entries.Valentine Barshak
According to PowerPC 440EPx documentation, MAL0 is comprised of four channels (two transmit and two receive). Each channel is dedicated to one of two EMAC cores. This patch fixes Sequoia DTS MAL0 entry and EMAC entries, assigning correct channel numbers to EMACs. Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2007-09-20[POWERPC] 4xx: Fix Bamboo MAL0 dts entry.Valentine Barshak
According to PowerPC 440EP documentation, MAL0 consists of 6 channels (4 transmit channels and 2 receive channels) This patch fixes Bamboo DTS MAL0 "num-rx-chans" entry. Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2007-09-20[POWERPC] Add 64-bit resources support to pci_iomapValentine Barshak
The patch adds support for the 64-bit resources to the PCI iomap code. Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2007-09-19[POWERPC] Update PowerPC 4xx entry in MAINTAINERSJosh Boyer
Add myself as PowerPC 4xx maintainer and list the git tree Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm..com>
2007-09-19[POWERPC] 4xx: Implement udbg_getc() for 440Hollis Blanchard
Implement udbg_getc() for 440, which fixes xmon input. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2007-09-19[POWERPC] 4xx: Convert Seqouia flash mappings to new bindingJosh Boyer
A new binding for flash devices was recently introduced. This updates the Sequoia DTS to use the new binding and enabled MTD in the defconfig. Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Stefan Roese <sr@denx.de>
2007-09-19[POWERPC] 4xx: Convert Walnut flash mappings to new bindingJosh Boyer
A new binding for flash devices was recently introduced. This updates the Walnut DTS to use the new binding. Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Acked-by: David Gibson <david@gibson.dropbear.id.au>
2007-09-19[POWERPC] Make partitions optional in physmap_ofJosh Boyer
The latest physmap_of driver has a small error where it will fail the probe with: physmap-flash: probe of fff00000.small-flas failed with error -2 if there are no partition subnodes in the device tree and the old style binding is not used. Since partition definitions are optional, the probe should still succeed. Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Acked-by: David Gibson <david@gibson.dropbear.id.au>
2007-09-19[POWERPC] cuimage for Bamboo boardJosh Boyer
Add a cuboot wrapper for the Bamboo board. Additionally, we enable MAC address fixups for both cuboot and treeboot. This also removes some obsoleted linker declarations that have been moved into ops.h Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Acked-by: David Gibson <david@gibson.dropbear.id.au>
2007-09-20Merge branch 'linux-2.6'Paul Mackerras
2007-09-19Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linusLinus Torvalds
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: [MIPS] cpu-bugs64.c: GCC 3.3 constraint workaround [MIPS] DEC: Initialise ioasic_ssr_lock
2007-09-19Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvbLinus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: V4L/DVB (6173a): Documentation: Remove reference to dead "cpia_pp=" boot-time option Revert "V4L/DVB (6173a): Documentation: Remove reference to dead "cpia_pp=" boot-time option"
2007-09-19Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6Linus Torvalds
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6: [XFS] Avoid replaying inode buffer initialisation log items if on-disk version is newer. [XFS] Ensure file size updates have been completed before writing inode to disk. [XFS] On-demand reaping of the MRU cache
2007-09-19Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SUNSAB]: Fix several bugs.
2007-09-19Merge master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6: ide: remove unused variables from drivers/ide/ppc/pmac.c ide: ST320413A has the same problem as ST340823A
2007-09-19Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] Fix timekeeping on PowerPC 601 [POWERPC] Don't expose clock vDSO functions when CPU has no timebase [POWERPC] spusched: Fix null pointer dereference in find_victim
2007-09-19x86-64: page faults from user mode are always user faultsLinus Torvalds
Randy Dunlap noticed an interesting "crashme" behaviour on his dual Prescott Xeon setup, where he gets page faults with the error code having a zero "user" bit, but the register state points back to user mode. This may be a CPU microcode buglet triggered by some strange instruction pattern that crashme generates, and loading a microcode update seems to possibly have fixed it. Regardless, we really should trust the register state more than the error code, since it's really the register state that determines whether we can actually send a signal, or whether we're in kernel mode and need to oops/kill the process in the case of a page fault. Cc: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-19[MIPS] cpu-bugs64.c: GCC 3.3 constraint workaroundMaciej W. Rozycki
Add a workaround to address warnings generated on the "n" constraint by GCC 3.3 and below. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-09-19[MIPS] DEC: Initialise ioasic_ssr_lockMaciej W. Rozycki
Fix the definition of the ioasic_ssr_lock spinlock to include a proper initialisation. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-09-19Driver core: fix deprectated sysfs structure for nested class devicesDmitry Torokhov
Nested class devices used to have 'device' symlink point to a real (physical) device instead of a parent class device. When converting subsystems to struct device we need to keep doing what class devices did if CONFIG_SYSFS_DEPRECATED is Y, otherwise parts of udev break. Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Cc: Kay Sievers <kay.sievers@vrfy.org> Acked-by: Greg KH <greg@kroah.com> Tested-by: Anssi Hannula <anssi.hannula@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-19uml: fix irqstack crashJeff Dike
This patch fixes a crash caused by an interrupt coming in when an IRQ stack is being torn down. When this happens, handle_signal will loop, setting up the IRQ stack again because the tearing down had finished, and handling whatever signals had come in. However, to_irq_stack returns a mask of pending signals to be handled, plus bit zero is set if the IRQ stack was already active, and thus shouldn't be torn down. This causes a problem because when handle_signal goes around the loop, sig will be zero, and to_irq_stack will duly set bit zero in the returned mask, faking handle_signal into believing that it shouldn't tear down the IRQ stack and return thread_info pointers back to their original values. This will eventually cause a crash, as the IRQ stack thread_info will continue pointing to the original task_struct and an interrupt will look into it after it has been freed. The fix is to stop passing a signal number into to_irq_stack. Rather, the pending signals mask is initialized beforehand with the bit for sig already set. References to sig in to_irq_stack can be replaced with references to the mask. [akpm@linux-foundation.org: use UL] Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-19Fix NUMA Memory Policy Reference CountingLee Schermerhorn
This patch proposes fixes to the reference counting of memory policy in the page allocation paths and in show_numa_map(). Extracted from my "Memory Policy Cleanups and Enhancements" series as stand-alone. Shared policy lookup [shmem] has always added a reference to the policy, but this was never unrefed after page allocation or after formatting the numa map data. Default system policy should not require additional ref counting, nor should the current task's task policy. However, show_numa_map() calls get_vma_policy() to examine what may be [likely is] another task's policy. The latter case needs protection against freeing of the policy. This patch adds a reference count to a mempolicy returned by get_vma_policy() when the policy is a vma policy or another task's mempolicy. Again, shared policy is already reference counted on lookup. A matching "unref" [__mpol_free()] is performed in alloc_page_vma() for shared and vma policies, and in show_numa_map() for shared and another task's mempolicy. We can call __mpol_free() directly, saving an admittedly inexpensive inline NULL test, because we know we have a non-NULL policy. Handling policy ref counts for hugepages is a bit trickier. huge_zonelist() returns a zone list that might come from a shared or vma 'BIND policy. In this case, we should hold the reference until after the huge page allocation in dequeue_hugepage(). The patch modifies huge_zonelist() to return a pointer to the mempolicy if it needs to be unref'd after allocation. Kernel Build [16cpu, 32GB, ia64] - average of 10 runs: w/o patch w/ refcount patch Avg Std Devn Avg Std Devn Real: 100.59 0.38 100.63 0.43 User: 1209.60 0.37 1209.91 0.31 System: 81.52 0.42 81.64 0.34 Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Andi Kleen <ak@suse.de> Cc: Christoph Lameter <clameter@sgi.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-19Fix user namespace exiting OOPsPavel Emelyanov
It turned out, that the user namespace is released during the do_exit() in exit_task_namespaces(), but the struct user_struct is released only during the put_task_struct(), i.e. MUCH later. On debug kernels with poisoned slabs this will cause the oops in uid_hash_remove() because the head of the chain, which resides inside the struct user_namespace, will be already freed and poisoned. Since the uid hash itself is required only when someone can search it, i.e. when the namespace is alive, we can safely unhash all the user_struct-s from it during the namespace exiting. The subsequent free_uid() will complete the user_struct destruction. For example simple program #include <sched.h> char stack[2 * 1024 * 1024]; int f(void *foo) { return 0; } int main(void) { clone(f, stack + 1 * 1024 * 1024, 0x10000000, 0); return 0; } run on kernel with CONFIG_USER_NS turned on will oops the kernel immediately. This was spotted during OpenVZ kernel testing. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Acked-by: "Serge E. Hallyn" <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-19Convert uid hash to hlistPavel Emelyanov
Surprisingly, but (spotted by Alexey Dobriyan) the uid hash still uses list_heads, thus occupying twice as much place as it could. Convert it to hlist_heads. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>