aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2009-06-18kexec: sysrq: simplify sysrq-c handlerNeil Horman
Currently the sysrq-c handler is bit over-engineered. Its behavior is dependent on a few compile time and run time factors that alter its behavior which is really unnecessecary. If CONFIG_KEXEC is not configured, sysrq-c, crashes the system with a NULL pointer dereference. If CONFIG_KEXEC is configured, it calls crash_kexec directly, which implies that the kexec kernel will either be booted (if its been previously loaded), or it will simply do nothing (the no kexec kernel has been loaded). It would be much easier to just simplify the whole thing to dereference a NULL pointer all the time regardless of configuration. That way, it will always try to crash the system, and if a kexec kernel has been loaded into reserved space, it will still boot from the page fault trap handler (assuming panic_on_oops is set appropriately). [akpm@linux-foundation.org: build fix] Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Cc: Brayan Arraes <brayan@yack.com.br> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18w1-gpio: add external pull-up enable callbackDaniel Mack
On embedded devices, sleep mode conditions can be tricky to handle, Especially when processors tend to pull-down the w1 bus during sleep. Bus slaves (such as the ds2760) may interpret this as a reason for power-down conditions and entirely switch off the device. This patch adds a callback function pointer to let users switch on and off the external pull-up resistor. This lets the outside world know whether the processor is currently actively driving the bus or not. When this callback is not provided, the code behaviour won't change. Signed-off-by: Daniel Mack <daniel@caiaq.de> Acked-by: Ville Syrjala <syrjala@sci.fi> Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18dma-mapping: ia64: add CONFIG_DMA_API_DEBUG supportFUJITA Tomonori
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Joerg Roedel <joerg.roedel@amd.com> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: "Luck, Tony" <tony.luck@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc; "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18dma-mapping: ia64: use asm-generic/dma-mapping-common.hFUJITA Tomonori
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Joerg Roedel <joerg.roedel@amd.com> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: "Luck, Tony" <tony.luck@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc; "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18dma-mapping: x86: use asm-generic/dma-mapping-common.hFUJITA Tomonori
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Joerg Roedel <joerg.roedel@amd.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18dma-mapping: add asm-generic/dma-mapping-common.hFUJITA Tomonori
We unified x86 and IA64's handling of multiple dma mapping operations (struct dma_map_ops in linux/dma-mapping.h) so we can remove duplication in their arch/include/asm/dma-mapping.h. This patchset adds include/asm-generic/dma-mapping-common.h that provides some generic dma mapping function definitions for the users of struct dma_map_ops. This enables us to remove about 100 lines. This also enables us to easily add CONFIG_DMA_API_DEBUG support, which only x86 supports for now. The 4th patch adds CONFIG_DMA_API_DEBUG support to IA64 by adding only 8 lines. This patch: This header file provides some mapping function definitions that the users of struct dma_map_ops can use. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Joerg Roedel <joerg.roedel@amd.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18gcov: enable GCOV_PROFILE_ALL for x86_64Peter Oberparleiter
Enable gcov profiling of the entire kernel on x86_64. Required changes include disabling profiling for: * arch/kernel/acpi/realmode and arch/kernel/boot/compressed: not linked to main kernel * arch/vdso, arch/kernel/vsyscall_64 and arch/kernel/hpet: profiling causes segfaults during boot (incompatible context) Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Huang Ying <ying.huang@intel.com> Cc: Li Wei <W.Li@Sun.COM> Cc: Michael Ellerman <michaele@au1.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Heiko Carstens <heicars2@linux.vnet.ibm.com> Cc: Martin Schwidefsky <mschwid2@linux.vnet.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: WANG Cong <xiyou.wangcong@gmail.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18gcov: add gcov profiling infrastructurePeter Oberparleiter
Enable the use of GCC's coverage testing tool gcov [1] with the Linux kernel. gcov may be useful for: * debugging (has this code been reached at all?) * test improvement (how do I change my test to cover these lines?) * minimizing kernel configurations (do I need this option if the associated code is never run?) The profiling patch incorporates the following changes: * change kbuild to include profiling flags * provide functions needed by profiling code * present profiling data as files in debugfs Note that on some architectures, enabling gcc's profiling option "-fprofile-arcs" for the entire kernel may trigger compile/link/ run-time problems, some of which are caused by toolchain bugs and others which require adjustment of architecture code. For this reason profiling the entire kernel is initially restricted to those architectures for which it is known to work without changes. This restriction can be lifted once an architecture has been tested and found compatible with gcc's profiling. Profiling of single files or directories is still available on all platforms (see config help text). [1] http://gcc.gnu.org/onlinedocs/gcc/Gcov.html Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Huang Ying <ying.huang@intel.com> Cc: Li Wei <W.Li@Sun.COM> Cc: Michael Ellerman <michaele@au1.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Heiko Carstens <heicars2@linux.vnet.ibm.com> Cc: Martin Schwidefsky <mschwid2@linux.vnet.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: WANG Cong <xiyou.wangcong@gmail.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18seq_file: add function to write binary dataPeter Oberparleiter
seq_write() can be used to construct seq_files containing arbitrary data. Required by the gcov-profiling interface to synthesize binary profiling data files. Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Huang Ying <ying.huang@intel.com> Cc: Li Wei <W.Li@Sun.COM> Cc: Michael Ellerman <michaele@au1.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Heiko Carstens <heicars2@linux.vnet.ibm.com> Cc: Martin Schwidefsky <mschwid2@linux.vnet.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: WANG Cong <xiyou.wangcong@gmail.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18kernel: constructor supportPeter Oberparleiter
Call constructors (gcc-generated initcall-like functions) during kernel start and module load. Constructors are e.g. used for gcov data initialization. Disable constructor support for usermode Linux to prevent conflicts with host glibc. Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Huang Ying <ying.huang@intel.com> Cc: Li Wei <W.Li@Sun.COM> Cc: Michael Ellerman <michaele@au1.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Heiko Carstens <heicars2@linux.vnet.ibm.com> Cc: Martin Schwidefsky <mschwid2@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18edac: Kconfig: fix the meaning of EDAC abbreviationGeunSik Lim
Fix the meaning of EDAC(Error Detection And Correction) correctly. [akpm@linux-foundation.org: add missing space] Signed-off-by: GeunSik Lim <geunsik.lim@samsung.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18edac: add missing __devexit_p()Mike Frysinger
The remove function uses __devexit, so the .remove assignment needs __devexit_p() to fix a build error with hotplug disabled. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18edac: cpc925 MC platform device setupHarry Ciao
Fix up the number of cells for the values of CPC925 Memory Controller, and setup related platform device during system booting up, against which CPC925 Memory Controller EDAC driver would be matched. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Kumar Gala <galak@gate.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18edac: add edac_device_alloc_index()Harry Ciao
Add edac_device_alloc_index(), because for MAPLE platform there may exist several EDAC driver modules that could make use of edac_device_ctl_info structure at the same time. The index allocation for these structures should be taken care of by EDAC core. [akpm@linux-foundation.org: cleanups] Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Kumar Gala <galak@gate.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18edac: add CPC925 Memory Controller driverHarry Ciao
Introduce IBM CPC925 EDAC driver, which makes use of ECC, CPU and HyperTransport Link error detections and corrections on the IBM CPC925 Bridge and Memory Controller. [akpm@linux-foundation.org: cleanup] Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Kumar Gala <galak@gate.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18futex: documentation: fix inconsistent description of futex list_op_pendingMatt Helsley
Strictly speaking list_op_pending points to the 'lock entry', not the 'lock word' (which is actually at 'offset' from 'lock entry'). We can infer this based on reading the code in kernel/futex.c: struct robust_list __user *entry, *next_entry, *pending; ... if (fetch_robust_entry(&pending, &head->list_op_pending, &pip)) return; ... if (pending) handle_futex_death((void __user *)pending + futex_offset, curr, pip); Which is also consistent with the rest of the docs on robust futex lists. Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linuxtronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18ipcns: move free_ipcs() protoAlexey Dobriyan
Function is really private to ipc/ and avoid struct kern_ipc_perm forward declaration. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18ipcns: make free_ipc_ns() staticAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18nsproxy: extract create_nsproxy()Alexey Dobriyan
clone_nsproxy() does useless copying of old nsproxy -- every pointer will be rewritten to new ns or to old ns. Remove copying, rename clone_nsproxy(), create_nsproxy() will be used by C/R code to create fresh nsproxy on restart. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18ipcns: extract create_ipc_ns()Alexey Dobriyan
clone_ipc_ns() is misnamed, it doesn't clone anything and doesn't use passed parameter. Rename it. create_ipc_ns() will be used by C/R to create fresh ipcns. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18ipcns: remove useless get/put while CLONE_NEWIPCAlexey Dobriyan
copy_ipcs() doesn't actually copy anything. If new ipcns is created, it's created from scratch, in this case get/put on old ipcns isn't needed. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18utsns: extract creeate_uts_ns()Alexey Dobriyan
create_uts_ns() will be used by C/R to create fresh uts_ns. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18pidns: rewrite copy_pid_ns()Alexey Dobriyan
copy_pid_ns() is a perfect example of a case where unwinding leads to more code and makes it less clear. Watch the diffstat. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Reviewed-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18pidns: make create_pid_namespace() accept parent pidnsAlexey Dobriyan
create_pid_namespace() creates everything, but caller has to assign parent pidns by hand, which is unnatural. At the moment of call new ->level has to be taken from somewhere and parent pidns is already available. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18pids: clean up find_task_by_pid variantsChristoph Hellwig
find_task_by_pid_type_ns is only used to implement find_task_by_vpid and find_task_by_pid_ns, but both of them pass PIDTYPE_PID as first argument. So just fold find_task_by_pid_type_ns into find_task_by_pid_ns and use find_task_by_pid_ns to implement find_task_by_vpid. While we're at it also remove the exports for find_task_by_pid_ns and find_task_by_vpid - we don't have any modular callers left as the only modular caller of he old pre pid namespace find_task_by_pid (gfs2) was switched to pid_task which operates on a struct pid pointer instead of a pid_t. Given the confusion about pid_t values vs namespace that's generally the better option anyway and I think we're better of restricting modules to do it that way. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18sysctl.c: remove unused variableSukanto Ghosh
Remoce the unused variable 'val' from __do_proc_dointvec() The integer has been declared and used as 'val = -val' and there is no reference to it anywhere. Signed-off-by: Sukanto Ghosh <sukanto.cse.iitb@gmail.com> Cc: Jaswinder Singh Rajput <jaswinder@kernel.org> Cc: Sukanto Ghosh <sukanto.cse.iitb@gmail.com> Cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18ppdev: reduce kernel log spamMichael Buesch
One of my programs frequently grabs the parport, does something with it and then drops it again. This results in spamming of the kernel log with "... registered pardevice" "... unregistered pardevice" These messages are completely useless, except for debugging ppdev, probably. So put them under DEBUG (or dynamic debug). Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18Char: isicom: fix build warningJiri Slaby
Fix this: isicom.c: In function `isicom_probe': isicom.c:1587: warning: `signature' may be used uninitialized in this function by uninitialized_var(), because if the signature is not initialized in reset_card(), we won't use it. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18drivers/char/mem.c: memory_open() cleanup: lookup minor device number from ↵Adriano dos Santos Fernandes
devlist memory_open() ignores devlist and does a switch for each item, duplicating code and conditional definitions. Clean it up by adding backing_dev_info to devlist and use it to lookup for the minor device. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Adriano dos Santos Fernandes <adrianosf@uol.com.br> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18ipc: use __ARCH_WANT_IPC_PARSE_VERSION in ipc/util.hArnd Bergmann
The definition of ipc_parse_version depends on __ARCH_WANT_IPC_PARSE_VERSION, but the header file declares it conditionally based on the architecture. Use the macro consistently to make it easier to add new architectures. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18kthreads: simplify migration_thread() exit pathOleg Nesterov
Now that kthread_stop() can be used even if the task has already exited, we can kill the "wait_to_die:" loop in migration_thread(). But we must pin rq->migration_thread after creation. Actually, I don't think CPU_UP_CANCELED or CPU_DEAD should wait for ->migration_thread exit. Perhaps we can simplify this code a bit more. migration_call() can set ->should_stop and forget about this thread. But we need a new helper in kthred.c for that. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Vitaliy Gusev <vgusev@openvz.org Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18kthreads: rework kthread_stop()Oleg Nesterov
Based on Eric's patch which in turn was based on my patch. kthread_stop() has the nasty problems: - it runs unpredictably long with the global semaphore held. - it deadlocks if kthread itself does kthread_stop() before it obeys the kthread_should_stop() request. - it is not useable if kthread exits on its own, see for example the ugly "wait_to_die:" hack in migration_thread() - it is not possible to just tell kthread it should stop, we must always wait for its exit. With this patch kthread() allocates all neccesary data (struct kthread) on its own stack, globals kthread_stop_xxx are deleted. ->vfork_done is used as a pointer into "struct kthread", this means kthread_stop() can easily wait for kthread's exit. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Vitaliy Gusev <vgusev@openvz.org Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18kthreads: simplify the startup synchronizationOleg Nesterov
We use two completions two create the kernel thread, this is a bit ugly. kthread() wakes up create_kthread() via ->started, then create_kthread() wakes up the caller kthread_create() via ->done. But kthread() does not need to wait for kthread(), it can just return. Instead kthread() itself can wake up the caller of kthread_create(). Kill kthread_create_info->started, ->done is enough. This improves the scalability a bit and sijmplifies the code. The only problem if kernel_thread() fails, in that case create_kthread() must do complete(&create->done). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Vitaliy Gusev <vgusev@openvz.org Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18mm: exit.c reorder wait_opts to remove padding on 64 bit buildsRichard Kennedy
Reorder struct wait_opts to remove 8 bytes of alignment padding on 64 bit builds. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18do_wait: fix the theoretical race with stop/trace/contOleg Nesterov
do_wait: current->state = TASK_INTERRUPTIBLE; read_lock(&tasklist_lock); ... search for the task to reap ... In theory, the ->state changing can leak into the critical section. Since the child can change its status under read_lock(tasklist) in parallel (finish_stop/ptrace_stop), we can miss the wakeup if __wake_up_parent() sees us in TASK_RUNNING state. Add the barrier. Also, use __set_current_state() to set TASK_RUNNING. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18do_wait: kill the old BUG_ON, use while_each_thread()Oleg Nesterov
do_wait() does BUG_ON(tsk->signal != current->signal), this looks like a raher obsolete check. At least, I don't think do_wait() is the best place to verify that all threads have the same ->signal. Remove it. Also, change the code to use while_each_thread(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18do_wait: simplify retval/tsk_result/notask_error messOleg Nesterov
Now that we don't pass &retval down to other helpers we can simplify the code more. - kill tsk_result, just use retval - add the "notask" label right after the main loop, and s/got end/goto notask/ after the fastpath pid check. This way we don't need to initialize retval before this check and the code becomes a bit more clean, if this pid has no attached tasks we should just skip the list search. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18introduce "struct wait_opts" to simplify do_wait() patchesOleg Nesterov
Introduce "struct wait_opts" which holds the parameters for misc helpers in do_wait() pathes. This adds 13 lines to kernel/exit.c, but saves 256 bytes from .o and imho makes the code much more readable. This patch temporary uglifies rusage/siginfo code a little bit, will be addressed by further cleanups. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Ingo Molnar <mingo@elte.hu> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18shift "ptrace implies WUNTRACED" from ptrace_do_wait() to wait_task_stopped()Oleg Nesterov
No functional changes, preparation for the next patch. ptrace_do_wait() adds WUNTRACED to options for wait_task_stopped() which should always accept the stopped tracee, even if do_wait() was called without WUNTRACED. Change wait_task_stopped() to check "ptrace || WUNTRACED" instead. This makes the code more explicit, and "int options" argument becomes const in do_wait() pathes. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18elf_core_dump: use rcu_read_lock() to access ->real_parentOleg Nesterov
In theory it is not safe to dereference ->parent/real_parent without tasklist or rcu lock, we can race with re-parenting. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18copy_process(): remove the unneeded clear_tsk_thread_flag(TIF_SIGPENDING)Oleg Nesterov
The forked child can have TIF_SIGPENDING if it was copied from parent's ti->flags. But this is harmless and actually almost never happens, because copy_process() can't succeed if signal_pending() == T. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18wait_task_zombie: do not use thread_group_cputime()Oleg Nesterov
There is no reason for thread_group_cputime() in wait_task_zombie(), there must be no other threads. This call was previously needed to collect the per-cpu data which we do not have any longer. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Roland McGrath <roland@redhat.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Vitaly Mayatskikh <vmayatsk@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18ptrace: don't take tasklist to get/set ->last_siginfoOleg Nesterov
Change ptrace_getsiginfo/ptrace_setsiginfo to use lock_task_sighand() without tasklist_lock. Perhaps it makes sense to make a single helper with "bool rw" argument. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usageOleg Nesterov
If the non-traced sub-thread calls do_notify_parent_cldstop(), we send the notification to group_leader->real_parent and we report group_leader's pid. But, if group_leader is traced we use the wrong ->parent->nsproxy->pid_ns, the tracer and parent can live in different namespaces. Change the code to use "parent" instead of tsk->parent. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18ptrace: wait_task_zombie: s/->parent/->real_parent/Oleg Nesterov
Change wait_task_zombie() to use ->real_parent instead of ->parent. We could even use current afaics, but ->real_parent is more clean. We know that the child is not ptrace_reparented() and thus they are equal. But we should avoid using task_struct->parent, we are going to remove it. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18ptrace_get_task_struct: s/tasklist/rcu/, make it staticOleg Nesterov
- Use rcu_read_lock() instead of tasklist_lock to find/get the task in ptrace_get_task_struct(). - Make it static, it has no callers outside of ptrace.c. - The comment doesn't match the reality, this helper does not do any checks. Beacuse it is really trivial and static I removed the whole comment. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18ptrace: do not use task_lock() for attachOleg Nesterov
Remove the "Nasty, nasty" lock dance in ptrace_attach()/ptrace_traceme() - from now task_lock() has nothing to do with ptrace at all. With the recent changes nobody uses task_lock() to serialize with ptrace, but in fact it was never needed and it was never used consistently. However ptrace_attach() calls __ptrace_may_access() and needs task_lock() to pin task->mm for get_dumpable(). But we can call __ptrace_may_access() before we take tasklist_lock, ->cred_exec_mutex protects us against do_execve() path which can change creds and MMF_DUMP* flags. (ugly, but we can't use ptrace_may_access() because it hides the error code, so we have to take task_lock() and use __ptrace_may_access()). NOTE: this change assumes that LSM hooks, security_ptrace_may_access() and security_ptrace_traceme(), can be called without task_lock() held. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Chris Wright <chrisw@sous-sol.org> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18ptrace: cleanup check/set of PT_PTRACED during attachOleg Nesterov
ptrace_attach() and ptrace_traceme() are the last functions which look as if the untraced task can have task->ptrace != 0, this must not be possible. Change the code to just check ->ptrace != 0 and s/|=/=/ to set PT_PTRACED. Also, a couple of trivial whitespace cleanups in ptrace_attach(). And move ptrace_traceme() up near ptrace_attach() to keep them close to each other. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Chris Wright <chrisw@sous-sol.org> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18ptrace: ptrace_attach: check PF_KTHREAD + exit_state instead of ->mmOleg Nesterov
- Add PF_KTHREAD check to prevent attaching to the kernel thread with a borrowed ->mm. With or without this change we can race with daemonize() which can set PF_KTHREAD or clear ->mm after ptrace_attach() does the check, but this doesn't matter because reparent_to_kthreadd() does ptrace_unlink(). - Kill "!task->mm" check. We don't really care about ->mm != NULL, and the task can call exit_mm() right after we drop task_lock(). What we need is to make sure we can't attach after exit_notify(), check task->exit_state != 0 instead. Also, move the "already traced" check down for cosmetic reasons. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Chris Wright <chrisw@sous-sol.org> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18ptrace: do not use task->ptrace directly in core kernelOleg Nesterov
No functional changes. - Nobody except ptrace.c & co should use ptrace flags directly, we have task_ptrace() for that. - No need to specially check PT_PTRACED, we must not have other PT_ bits set without PT_PTRACED. And no need to know this flag exists. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>