aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2007-07-16Merge branch 'upstream-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (32 commits) [PATCH] ocfs2: zero_user_page conversion ocfs2: Support xfs style space reservation ioctls ocfs2: support for removing file regions ocfs2: update truncate handling of partial clusters ocfs2: btree support for removal of arbirtrary extents ocfs2: Support creation of unwritten extents ocfs2: support writing of unwritten extents ocfs2: small cleanup of ocfs2_write_begin_nolock() ocfs2: btree changes for unwritten extents ocfs2: abstract btree growing calls ocfs2: use all extent block suballocators ocfs2: plug truncate into cached dealloc routines ocfs2: simplify deallocation locking ocfs2: harden buffer check during mapping of page blocks ocfs2: shared writeable mmap ocfs2: factor out write aops into nolock variants ocfs2: rework ocfs2_buffered_write_cluster() ocfs2: take ip_alloc_sem during entire truncate ocfs2: Add "preferred slot" mount option [KJ PATCH] Replacing memset(<addr>,0,PAGE_SIZE) with clear_page() in fs/ocfs2/dlm/dlmrecovery.c ...
2007-07-16Merge branch 'bsg' of git://git.kernel.dk/data/git/linux-2.6-blockLinus Torvalds
* 'bsg' of git://git.kernel.dk/data/git/linux-2.6-block: (25 commits) bsg: Kconfig updates bsg: add SCSI transport-level request support bsg: add bidi support add a struct request pointer to the request structure bsg: fix the deadlock on discarding done commands bsg: fix a blocking read bug bsg: minor bug fixes improve bsg device allocation bind bsg to all SCSI devices bsg: bind bsg to request_queue instead of gendisk bsg: add a request_queue argument to scsi_cmd_ioctl() bsg: simplify __bsg_alloc_command failpath bsg: add cheasy error checks for sysfs stuff Add queue resizing support Replace s32, u32 and u64 with __s32, __u32 and __u64 in bsg.h for userspace bsg: silence a bogus gcc warning bsg: style cleanup bsg: use u32 etc instead of uint32_t bsg: add SG_IO to SG v4 bsg: replace SG v3 with SG v4 ...
2007-07-16ext4: statfs speed upBadari Pulavarty
This is a patch that speeds up statfs. It is very simple - the "overhead" calculation, which takes a huge amount of time for large filesystems, never changes unless the size of the filesystem itself changes. That means we can store it in memory and only recalculate if the filesystem has been resized (almost never). It also fixes a minor problem that we never update the on-disk superblock free blocks/inodes counts until the filesystem is unmounted. While not fatal, we may as well update that on disk when we have the information, and it makes things like debugfs and dumpe2fs report a bit more accurate info. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16ext3: statfs speed upBadari Pulavarty
This is a patch that speeds up statfs. It is very simple - the "overhead" calculation, which takes a huge amount of time for large filesystems, never changes unless the size of the filesystem itself changes. That means we can store it in memory and only recalculate if the filesystem has been resized (almost never). It also fixes a minor problem that we never update the on-disk superblock free blocks/inodes counts until the filesystem is unmounted. While not fatal, we may as well update that on disk when we have the information, and it makes things like debugfs and dumpe2fs report a bit more accurate info. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16ext2: statfs speed upBadari Pulavarty
This is a patch that speeds up statfs. It is very simple - the "overhead" calculation, which takes a huge amount of time for large filesystems, never changes unless the size of the filesystem itself changes. That means we can store it in memory and only recalculate if the filesystem has been resized (almost never). It also fixes a minor problem that we never update the on-disk superblock free blocks/inodes counts until the filesystem is unmounted. While not fatal, we may as well update that on disk when we have the information, and it makes things like debugfs and dumpe2fs report a bit more accurate info. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16remove mm/backing-dev.c:congestion_wait_interruptible()Adrian Bunk
congestion_wait_interruptible() is no longer used. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16make cancel_xxx_work_sync() return a booleanOleg Nesterov
Change cancel_work_sync() and cancel_delayed_work_sync() to return a boolean indicating whether the work was actually cancelled. A zero return value means that the work was not pending/queued. Without that kind of change it is not possible to avoid flush_workqueue() sometimes, see the next patch as an example. Also, this patch unifies both functions and kills the (unlikely) busy-wait loop. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Jarek Poplawski <jarkao2@o2.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16rename cancel_rearming_delayed_work() to cancel_delayed_work_sync()Oleg Nesterov
Imho, the current naming of cancel_xxx workqueue functions is very confusing. cancel_delayed_work() cancel_rearming_delayed_work() cancel_rearming_delayed_workqueue() // obsolete cancel_work_sync() This looks as if the first 2 functions differ in "type" of their argument which is not true any longer, nowadays the difference is the behaviour. The semantics of cancel_rearming_delayed_work(dwork) was changed significantly, it doesn't require that dwork rearms itself, and cancels dwork synchronously. Rename it to cancel_delayed_work_sync(). This matches cancel_delayed_work() and cancel_work_sync(). Re-create cancel_rearming_delayed_work() as a simple inline obsolete wrapper, like cancel_rearming_delayed_workqueue(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Jarek Poplawski <jarkao2@o2.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16Remove the last few UMSDOS leftoversJesper Juhl
The UMSDOS filesystem was removed back in 2.6.11, but some tiny bits stuck around. This patch removes the few remaining leftovers. The only things left behind after this are the entries in the CREDITS file and the ioctl number in Documentation/ioctl-number.txt as documentation. This third (hopefully final) version of the patch doesn't edit the arch/um/config.release file, since Jeff Dike pointed out to me that it should die completely, and asked me to remove it from my patch as he'll send in a seperate patch removing the file completely. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16generic bug: use show_regs() instead of dump_stack()Heiko Carstens
The current generic bug implementation has a call to dump_stack() in case a WARN_ON(whatever) gets hit. Since report_bug(), which calls dump_stack(), gets called from an exception handler we can do better: just pass the pt_regs structure to report_bug() and pass it to show_regs() in case of a warning. This will give more debug informations like register contents, etc... In addition this avoids some pointless lines that dump_stack() emits, since it includes a stack backtrace of the exception handler which is of no interest in case of a warning. E.g. on s390 the following lines are currently always present in a stack backtrace if dump_stack() gets called from report_bug(): [<000000000001517a>] show_trace+0x92/0xe8) [<0000000000015270>] show_stack+0xa0/0xd0 [<00000000000152ce>] dump_stack+0x2e/0x3c [<0000000000195450>] report_bug+0x98/0xf8 [<0000000000016cc8>] illegal_op+0x1fc/0x21c [<00000000000227d6>] sysc_return+0x0/0x10 Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Andi Kleen <ak@suse.de> Cc: Kyle McMartin <kyle@parisc-linux.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16uninline check_signature()Andrew Morton
This is a rather bizarre thing to have inlined in io.h. Stick it in lib/ instead. While we're there, despaghetti it a bit, and fix its off-by-one behaviour when passed a zero length. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16make seccomp zerocost in scheduleAndrea Arcangeli
This follows a suggestion from Chuck Ebbert on how to make seccomp absolutely zerocost in schedule too. The only remaining footprint of seccomp is in terms of the bzImage size that becomes a few bytes (perhaps even a few kbytes) larger, measure it if you care in the embedded. Signed-off-by: Andrea Arcangeli <andrea@cpushare.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16move seccomp from /proc to a prctlAndrea Arcangeli
This reduces the memory footprint and it enforces that only the current task can enable seccomp on itself (this is a requirement for a strightforward [modulo preempt ;) ] TIF_NOTSC implementation). Signed-off-by: Andrea Arcangeli <andrea@cpushare.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16cdrom: replace hard-coded constants by kernel.h macro.Robert P. J. Day
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16namespace: ensure clone_flags are always stored in an unsigned longEric W. Biederman
While working on unshare support for the network namespace I noticed we were putting clone flags in an int. Which is weird because the syscall uses unsigned long and we at least need an unsigned to properly hold all of the unshare flags. So to make the code consistent, this patch updates the code to use unsigned long instead of int for the clone flags in those places where we get it wrong today. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16Remove final two references to "__obsolete_setup" macroRobert P. J. Day
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16remove unused lock_cpu_hotplug_interruptible definitionNathan Lynch
aa95387774039096c11803c04011f1aa42d85758 removed the implementation of lock_cpu_hotplug_interruptible and all users of it. This stub definition for !CONFIG_HOTPLUG_CPU was left over -- kill it now. Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16COBALT: remove all references to Cobalt NVRAMRobert P. J. Day
Remove not only the references to Cobalt NVRAM, but the header file as well. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Acked-by: Tim Hockin <thockin@hockin.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16user namespace: add unshareSerge E. Hallyn
This patch enables the unshare of user namespaces. It adds a new clone flag CLONE_NEWUSER and implements copy_user_ns() which resets the current user_struct and adds a new root user (uid == 0) For now, unsharing the user namespace allows a process to reset its user_struct accounting and uid 0 in the new user namespace should be contained using appropriate means, for instance selinux The plan, when the full support is complete (all uid checks covered), is to keep the original user's rights in the original namespace, and let a process become uid 0 in the new namespace, with full capabilities to the new namespace. Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Acked-by: Pavel Emelianov <xemul@openvz.org> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Kirill Korotaev <dev@sw.ru> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Andrew Morgan <agm@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16user namespace: add the frameworkCedric Le Goater
Basically, it will allow a process to unshare its user_struct table, resetting at the same time its own user_struct and all the associated accounting. A new root user (uid == 0) is added to the user namespace upon creation. Such root users have full privileges and it seems that theses privileges should be controlled through some means (process capabilities ?) The unshare is not included in this patch. Changes since [try #4]: - Updated get_user_ns and put_user_ns to accept NULL, and get_user_ns to return the namespace. Changes since [try #3]: - moved struct user_namespace to files user_namespace.{c,h} Changes since [try #2]: - removed struct user_namespace* argument from find_user() Changes since [try #1]: - removed struct user_namespace* argument from find_user() - added a root_user per user namespace Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Acked-by: Pavel Emelianov <xemul@openvz.org> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Kirill Korotaev <dev@sw.ru> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Andrew Morgan <agm@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16remove CONFIG_UTS_NS and CONFIG_IPC_NSCedric Le Goater
CONFIG_UTS_NS and CONFIG_IPC_NS have very little value as they only deactivate the unshare of the uts and ipc namespaces and do not improve performance. Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Acked-by: "Serge E. Hallyn" <serue@us.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Pavel Emelianov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16Audit: add TTY input auditingMiloslav Trmac
Add TTY input auditing, used to audit system administrator's actions. This is required by various security standards such as DCID 6/3 and PCI to provide non-repudiation of administrator's actions and to allow a review of past actions if the administrator seems to overstep their duties or if the system becomes misconfigured for unknown reasons. These requirements do not make it necessary to audit TTY output as well. Compared to an user-space keylogger, this approach records TTY input using the audit subsystem, correlated with other audit events, and it is completely transparent to the user-space application (e.g. the console ioctls still work). TTY input auditing works on a higher level than auditing all system calls within the session, which would produce an overwhelming amount of mostly useless audit events. Add an "audit_tty" attribute, inherited across fork (). Data read from TTYs by process with the attribute is sent to the audit subsystem by the kernel. The audit netlink interface is extended to allow modifying the audit_tty attribute, and to allow sending explanatory audit events from user-space (for example, a shell might send an event containing the final command, after the interactive command-line editing and history expansion is performed, which might be difficult to decipher from the TTY input alone). Because the "audit_tty" attribute is inherited across fork (), it would be set e.g. for sshd restarted within an audited session. To prevent this, the audit_tty attribute is cleared when a process with no open TTY file descriptors (e.g. after daemon startup) opens a TTY. See https://www.redhat.com/archives/linux-audit/2007-June/msg00000.html for a more detailed rationale document for an older version of this patch. [akpm@linux-foundation.org: build fix] Signed-off-by: Miloslav Trmac <mitr@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Paul Fulghum <paulkf@microgate.com> Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: Steve Grubb <sgrubb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16Improve behaviour of spurious IRQ detectAlan Cox
Currently we handle spurious IRQ activity based upon seeing a lot of invalid interrupts, and we clear things back on the base of lots of valid interrupts. Unfortunately in some cases you get legitimate invalid interrupts caused by timing asynchronicity between the PCI bus and the APIC bus when disabling interrupts and pulling other tricks. In this case although the spurious IRQs are not a problem our unhandled counters didn't clear and they act as a slow running timebomb. (This is effectively what the serial port/tty problem that was fixed by clearing counters when registering a handler showed up) It's easy enough to add a second parameter - time. This means that if we see a regular stream of harmless spurious interrupts which are not harming processing we don't go off and do something stupid like disable the IRQ after a month of running. OTOH lockups and performance killers show up a lot more than 10/second [akpm@linux-foundation.org: cleanup] Signed-off-by: Alan Cox <alan@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16fix typo in prefetch.hDave Jones
Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16taskstats: add context-switch countersMaxim Uvarov
Make available to the user the following task and process performance statistics: * Involuntary Context Switches (task_struct->nivcsw) * Voluntary Context Switches (task_struct->nvcsw) Statistics information is available from: 1. taskstats interface (Documentation/accounting/) 2. /proc/PID/status (task only). This data is useful for detecting hyperactivity patterns between processes. [akpm@linux-foundation.org: cleanup] Signed-off-by: Maxim Uvarov <muvarov@ru.mvista.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Cc: Jonathan Lim <jlim@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16Drop an empty isicom.h from being exported to user space.Robert P. J. Day
Drop <linux/isicom.h> from being exported to user space since it would be only an empty file. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Acked-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16remove sonypi_camera_command()Adrian Bunk
Remove the no longer used sonypi_camera_command(). Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Mattia Dongili <malattia@linux.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16Kernel utf-8 handlingJan Engelhardt
This patch fixes dead keys and copy/paste of non-ASCII characters in UTF-8 mode on Linux console. See more details about the original patch at: http://chris.heathens.co.nz/linux/utf8.html Already posted on (Oldest) http://lkml.org/lkml/2003/5/31/148 http://lkml.org/lkml/2005/12/24/69 (Recent) http://lkml.org/lkml/2006/8/7/75 [bunk@stusta.de: make drivers/char/selection.c:store_utf8() static] Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Cc: Alexander E. Patrakov <patrakov@ums.usu.ru> Cc: Dmitry Torokhov <dtor@mail.ru> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16Remove capability.h from mm.hAlexey Dobriyan
I forgot to remove capability.h from mm.h while removing sched.h! This patch remedies that, because the only inline function which was using CAP_something was made out of line. Cross-compile tested without regressions on: all powerpc defconfigs all mips defconfigs all m68k defconfigs all arm defconfigs all ia64 defconfigs alpha alpha-allnoconfig alpha-defconfig alpha-up arm i386 i386-allnoconfig i386-defconfig i386-up ia64 ia64-allnoconfig ia64-defconfig ia64-up m68k mips parisc parisc-allnoconfig parisc-defconfig parisc-up powerpc powerpc-up s390 s390-allnoconfig s390-defconfig s390-up sparc sparc-allnoconfig sparc-defconfig sparc-up sparc64 sparc64-allnoconfig sparc64-defconfig sparc64-up um-x86_64 x86_64 x86_64-allnoconfig x86_64-defconfig x86_64-up as well as my two usual configs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16O_CLOEXEC for SCM_RIGHTSUlrich Drepper
Part two in the O_CLOEXEC saga: adding support for file descriptors received through Unix domain sockets. The patch is once again pretty minimal, it introduces a new flag for recvmsg and passes it just like the existing MSG_CMSG_COMPAT flag. I think this bit is not used otherwise but the networking people will know better. This new flag is not recognized by recvfrom and recv. These functions cannot be used for that purpose and the asymmetry this introduces is not worse than the already existing MSG_CMSG_COMPAT situations. The patch must be applied on the patch which introduced O_CLOEXEC. It has to remove static from the new get_unused_fd_flags function but since scm.c cannot live in a module the function still hasn't to be exported. Here's a test program to make sure the code works. It's so much longer than the actual patch... #include <errno.h> #include <error.h> #include <fcntl.h> #include <stdio.h> #include <string.h> #include <unistd.h> #include <sys/socket.h> #include <sys/un.h> #ifndef O_CLOEXEC # define O_CLOEXEC 02000000 #endif #ifndef MSG_CMSG_CLOEXEC # define MSG_CMSG_CLOEXEC 0x40000000 #endif int main (int argc, char *argv[]) { if (argc > 1) { int fd = atol (argv[1]); printf ("child: fd = %d\n", fd); if (fcntl (fd, F_GETFD) == 0 || errno != EBADF) { puts ("file descriptor valid in child"); return 1; } return 0; } struct sockaddr_un sun; strcpy (sun.sun_path, "./testsocket"); sun.sun_family = AF_UNIX; char databuf[] = "hello"; struct iovec iov[1]; iov[0].iov_base = databuf; iov[0].iov_len = sizeof (databuf); union { struct cmsghdr hdr; char bytes[CMSG_SPACE (sizeof (int))]; } buf; struct msghdr msg = { .msg_iov = iov, .msg_iovlen = 1, .msg_control = buf.bytes, .msg_controllen = sizeof (buf) }; struct cmsghdr *cmsg = CMSG_FIRSTHDR (&msg); cmsg->cmsg_level = SOL_SOCKET; cmsg->cmsg_type = SCM_RIGHTS; cmsg->cmsg_len = CMSG_LEN (sizeof (int)); msg.msg_controllen = cmsg->cmsg_len; pid_t child = fork (); if (child == -1) error (1, errno, "fork"); if (child == 0) { int sock = socket (PF_UNIX, SOCK_STREAM, 0); if (sock < 0) error (1, errno, "socket"); if (bind (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0) error (1, errno, "bind"); if (listen (sock, SOMAXCONN) < 0) error (1, errno, "listen"); int conn = accept (sock, NULL, NULL); if (conn == -1) error (1, errno, "accept"); *(int *) CMSG_DATA (cmsg) = sock; if (sendmsg (conn, &msg, MSG_NOSIGNAL) < 0) error (1, errno, "sendmsg"); return 0; } /* For a test suite this should be more robust like a barrier in shared memory. */ sleep (1); int sock = socket (PF_UNIX, SOCK_STREAM, 0); if (sock < 0) error (1, errno, "socket"); if (connect (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0) error (1, errno, "connect"); unlink (sun.sun_path); *(int *) CMSG_DATA (cmsg) = -1; if (recvmsg (sock, &msg, MSG_CMSG_CLOEXEC) < 0) error (1, errno, "recvmsg"); int fd = *(int *) CMSG_DATA (cmsg); if (fd == -1) error (1, 0, "no descriptor received"); char fdname[20]; snprintf (fdname, sizeof (fdname), "%d", fd); execl ("/proc/self/exe", argv[0], fdname, NULL); puts ("execl failed"); return 1; } [akpm@linux-foundation.org: Fix fastcall inconsistency noted by Michael Buesch] [akpm@linux-foundation.org: build fix] Signed-off-by: Ulrich Drepper <drepper@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Michael Buesch <mb@bu3sch.de> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16Add a flag to indicate deferrable timers in /proc/timer_statsVenki Pallipadi
Add a flag in /proc/timer_stats to indicate deferrable timers. This will let developers/users to differentiate between types of tiemrs in /proc/timer_stats. Deferrable timer and normal timer will appear in /proc/timer_stats as below. 10D, 1 swapper queue_delayed_work_on (delayed_work_timer_fn) 10, 1 swapper queue_delayed_work_on (delayed_work_timer_fn) Also version of timer_stats changes from v0.1 to v0.2 Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16remove odd and misleading comments from uio.hChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16dma-mapping: prevent dma dependent code from linking on !HAS_DMA archsDan Williams
Continuing the work started in 411f0f3edc141a582190d3605cadd1d993abb6df ... This enables code with a dma path, that compiles away, to build without requiring additional code factoring. It also prevents code that calls dma_alloc_coherent and dma_free_coherent from linking whereas previously the code would hit a BUG() at run time. Finally, it allows archs that set !HAS_DMA to delete their asm/dma-mapping.h file. Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: John W. Linville <linville@tuxdriver.com> Cc: Kyle McMartin <kyle@parisc-linux.org> Cc: James Bottomley <James.Bottomley@SteelEye.com> Cc: Tejun Heo <htejun@gmail.com> Cc: Jeff Garzik <jeff@garzik.org> Cc: <geert@linux-m68k.org> Cc: <zippel@linux-m68k.org> Cc: <spyro@f2s.com> Cc: <ysato@users.sourceforge.jp> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16fs: clarify "dummy" member in struct inodes_stat_tStefan Richter
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16AFS: implement file lockingDavid Howells
Implement file locking for AFS. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16Remove unnecessary includes of spinlock.h under include/linuxRobert P. J. Day
Remove the obviously unnecessary includes of <linux/spinlock.h> under the include/linux/ directory, and fix the couple errors that are introduced as a result of that. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16fat: gcc 4.3 warning fixOGAWA Hirofumi
This patch fixes the following warnings. fs/fat/dir.c: In function 'fat_parse_long': include/linux/msdos_fs.h:294: warning: array subscript is above array bounds include/linux/msdos_fs.h:295: warning: array subscript is above array bounds include/linux/msdos_fs.h:295: warning: array subscript is above array bounds The ->name is defined as "name[8], ext[3]", but fat_checksum() uses those as name[11]. There is no actual problem, but it's not a good manner. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16percpu_counters(): use cpu notifiersAndrew Morton
per-cpu counters presently must iterate over all possible CPUs in the exhaustive percpu_counter_sum(). But it can be much better to only iterate over the presently-online CPUs. To do this, we must arrange for an offlined CPU's count to be spilled into the counter's central count. We can do this for all percpu_counters in the machine by linking them into a single global list and walking that list at CPU_DEAD time. (I hope. Might have race windows in which the percpu_counter_sum() count is inaccurate?) Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16fuse warning fixAndrew Morton
gcc-4.3: fs/fuse/dir.c: In function 'parse_dirfile': fs/fuse/dir.c:833: warning: cast from pointer to integer of different size fs/fuse/dir.c:835: warning: cast from pointer to integer of different size [miklos@szeredi.hu: use offsetof] Acked-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16Use mutexes instead of semaphores in I2O driverMatthias Kaehlcke
The I2O driver uses two semaphores as mutexes. Use the mutex API instead of the (binary) semaphores. Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16Prevent an O_NDELAY writer from blocking when a tty write is blocked by the ↵Alan Cox
tty atomic writer mutex Without this a tty write could block if a previous blocking tty write was in progress on the same tty and blocked by a line discipline or hardware event. Originally found and reported by Dave Johnson. Signed-off-by: Alan Cox <alan@redhat.com> Acked-by: Dave Johnson <djohnson+linux-kernel@sw.starentnetworks.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16Use boot based time for process start time and boot time in /procTomas Janousek
Commit 411187fb05cd11676b0979d9fbf3291db69dbce2 caused boot time to move and process start times to become invalid after suspend. Using boot based time for those restores the old behaviour and fixes the issue. [akpm@linux-foundation.org: little cleanup] Signed-off-by: Tomas Janousek <tjanouse@redhat.com> Cc: Tomas Smetana <tsmetana@redhat.com> Acked-by: John Stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16Introduce boot based timeTomas Janousek
The commits 411187fb05cd11676b0979d9fbf3291db69dbce2 (GTOD: persistent clock support) c1d370e167d66b10bca3b602d3740405469383de (i386: use GTOD persistent clock support) changed the monotonic time so that it no longer jumps after resume, but it's not possible to use it for boot time and process start time calculations then. Also, the uptime no longer increases during suspend. I add a variable to track the wall_to_monotonic changes, a function to get the real boot time and a function to get the boot based time from the monotonic one. [akpm@linux-foundation.org: remove exports, add comment] Signed-off-by: Tomas Janousek <tjanouse@redhat.com> Cc: Tomas Smetana <tsmetana@redhat.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16fix jvc cdrom drive lockupZhang, Yanmin
Before calling init_hwif_default, ide_unregister gets lock ide_lock and disables irq. init_hwif_default calls ide_default_io_base which calls pci_get_device and later pci_get_subsys tries to apply for semaphore pci_bus_sem and goes to sleep. Mostly, pci_get_device should be called when irq is turned on. ide_default_io_base just needs find if list pci_devices is empty. Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> Cc: Greg KH <greg@kroah.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16introduce write_trylock_irqsave()Satyam Sharma
Introduce a write_trylock_irqsave() implementation. Similar in style to the implementation of spin_trylock_irqsave() in mainline. Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in> Cc: Sripathi Kodi <sripathik@in.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16Fix rmmod/read/write races in /proc entriesAlexey Dobriyan
Fix following races: =========================================== 1. Write via ->write_proc sleeps in copy_from_user(). Module disappears meanwhile. Or, more generically, system call done on /proc file, method supplied by module is called, module dissapeares meanwhile. pde = create_proc_entry() if (!pde) return -ENOMEM; pde->write_proc = ... open write copy_from_user pde = create_proc_entry(); if (!pde) { remove_proc_entry(); return -ENOMEM; /* module unloaded */ } *boom* ========================================== 2. bogo-revoke aka proc_kill_inodes() remove_proc_entry vfs_read proc_kill_inodes [check ->f_op validness] [check ->f_op->read validness] [verify_area, security permissions checks] ->f_op = NULL; if (file->f_op->read) /* ->f_op dereference, boom */ NOTE, NOTE, NOTE: file_operations are proxied for regular files only. Let's see how this scheme behaves, then extend if needed for directories. Directories creators in /proc only set ->owner for them, so proxying for directories may be unneeded. NOTE, NOTE, NOTE: methods being proxied are ->llseek, ->read, ->write, ->poll, ->unlocked_ioctl, ->ioctl, ->compat_ioctl, ->open, ->release. If your in-tree module uses something else, yell on me. Full audit pending. [akpm@linux-foundation.org: build fix] Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16split mmapMiklos Szeredi
This is a straightforward split of do_mmap_pgoff() into two functions: - do_mmap_pgoff() checks the parameters, and calculates the vma flags. Then it calls - mmap_region(), which does the actual mapping Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16slob: initial NUMA supportPaul Mundt
This adds preliminary NUMA support to SLOB, primarily aimed at systems with small nodes (tested all the way down to a 128kB SRAM block), whether asymmetric or otherwise. We follow the same conventions as SLAB/SLUB, preferring current node placement for new pages, or with explicit placement, if a node has been specified. Presently on UP NUMA this has the side-effect of preferring node#0 allocations (since numa_node_id() == 0, though this could be reworked if we could hand off a pfn to determine node placement), so single-CPU NUMA systems will want to place smaller nodes further out in terms of node id. Once a page has been bound to a node (via explicit node id typing), we only do block allocations from partial free pages that have a matching node id in the page flags. The current implementation does have some scalability problems, in that all partial free pages are tracked in the global freelist (with contention due to the single spinlock). However, these are things that are being reworked for SMP scalability first, while things like per-node freelists can easily be built on top of this sort of functionality once it's been added. More background can be found in: http://marc.info/?l=linux-mm&m=118117916022379&w=2 http://marc.info/?l=linux-mm&m=118170446306199&w=2 http://marc.info/?l=linux-mm&m=118187859420048&w=2 and subsequent threads. Acked-by: Christoph Lameter <clameter@sgi.com> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16kill vmalloc_earlyreserveJan Beulich
This symbol got orphaned quite a while ago. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16invalidate_mapping_pages(): add cond_reschedAndrew Morton
invalidate_mapping_pages() can sometimes take a long time (millions of pages to free). Long enough for the softlockup detector to trigger. We used to have a cond_resched() in there but I took it out because the drop_caches code calls invalidate_mapping_pages() under inode_lock. The patch adds a nasty flag and puts the cond_resched() back. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>