aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2007-05-08SPIN_LOCK_UNLOCKED cleanup in init_task.hMilind Arun Choudhary
SPIN_LOCK_UNLOCKED cleanup,use __SPIN_LOCK_UNLOCKED instead Signed-off-by: Milind Arun Choudhary <milindchoudhary@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08EFI: warn only for pre-1.00 system tablesBjorn Helgaas
We used to warn unless the EFI system table major revision was exactly 1. But EFI 2.00 firmware is starting to appear, and the 2.00 changes don't affect anything in Linux. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Andi Kleen <ak@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08fix hotplug for legacy platform driversDavid Brownell
We've had various reports of some legacy "probe the hardware" style platform drivers having nasty problems with hotplug support. The core issue is that those legacy drivers don't fully conform to the driver model. They assume a role that should be the responsibility of infrastructure code: creating device nodes. The "modprobe" step in hotplugging relies on drivers to have split those roles into different modules. The lack of this split causes the problems. When a driver creates nodes for devices that don't exist (sending a hotplug event), then exits (aborting one modprobe) before the "modprobe $MODALIAS" step completes (by failing, since it's in the middle of a modprobe), the result can be an endless loop of modprobe invocations ... badness. This fix uses the newish per-device flag controlling issuance of "add" events. (A previous version of this patch used a per-device "driver can hotplug" flag, which only scrubbed $MODALIAS from the environment rather than suppressing the entire hotplug event.) It also shrinks that flag to one bit, saving a word in "struct device". So the net of this patch is removing some nasty failures with legacy drivers, while retaining hotplug capability for the majority of platform drivers. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Greg KH <gregkh@suse.de> Cc: Andres Salomon <dilinger@debian.org> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08cleanup compat ioctl handlingChristoph Hellwig
Merge all compat ioctl handling into compat_ioctl.c instead of splitting it over compat.c and compat_ioctl.c. This also allows to get rid of ioctl32.h Signed-off-by: Christoph Hellwig <hch@lst.de> Looks-good-to: Andi Kleen <ak@suse.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Pad irq_desc to internode cacheline sizeRavikiran G Thirumalai
We noticed a drop in n/w performance due to the irq_desc being cacheline aligned rather than internode aligned. We see 50% of expected performance when two e1000 nics local to two different nodes have consecutive irq descriptors allocated, due to false sharing. Note that this patch does away with cacheline padding for the UP case, as it does not seem useful for UP configurations. Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08merge compat_ioctl.h into compat_ioctl.cChristoph Hellwig
Now that there is no arch-specific compat ioctl handling left there is not point in having a separate copat_ioctl.h, so merge it into compat_ioctl.c Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Deprecate SA_xxx interrupt flags -V2Thomas Gleixner
The deprecation of the SA_xxx interrupt flags did not emit deprecated warnings. Andrew said about the removal of the deprecated flag defines: > This is going to break a lot of external stuff. We should have found > a way to make usage of SA_* emit deprecated warnings (or _some_ > warning) to warn people of impending doom. But I can't immediately > find a way of doing that. if we _can_ find a way of doing this, I > suspect we'll need to do it, and give people another six months. It's > going to get ugly out there. We shall see... Define the deprecated flags as a call to a __deprecated inline function so a warning is emitted on compile time. Extend the reprieve of out of tree drivers to 9/2007. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Fix race between cat /proc/slab_allocators and rmmodAlexey Dobriyan
Same story as with cat /proc/*/wchan race vs rmmod race, only /proc/slab_allocators want more info than just symbol name. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Fix race between cat /proc/*/wchan and rmmod et alAlexey Dobriyan
kallsyms_lookup() can go iterating over modules list unprotected which is OK for emergency situations (oops), but not OK for regular stuff like /proc/*/wchan. Introduce lookup_symbol_name()/lookup_module_symbol_name() which copy symbol name into caller-supplied buffer or return -ERANGE. All copying is done with module_mutex held, so... Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Fix race between rmmod and cat /proc/kallsymsAlexey Dobriyan
module_get_kallsym() leaks "struct module *" outside of module_mutex which is no-no, because module can dissapear right after mutex unlock. Copy all needed information from inside module_mutex into caller-supplied space. [bunk@stusta.de: is_exported() can now become static] Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Simplify module_get_kallsym() by dropping length argAlexey Dobriyan
module_get_kallsym() could in theory truncate module symbol name to fit in buffer, but nobody does this. Always use KSYM_NAME_LEN + 1 bytes for name. Suggested by lg^WRusty. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Kprobes: print details of kretprobe on assertion failureAnanth N Mavinakayanahalli
In certain cases like when the real return address can't be found or when the number of tracked calls to a kretprobed function is less than the number of returns, we may not be able to find the correct return address after processing a kretprobe. Currently we just do a BUG_ON, but no information is provided about the actual failing kretprobe. Print out details of the kretprobe before calling BUG(). Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Maneesh Soni <maneesh@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08kdump/kexec: calculate note size at compile timeSimon Horman
Currently the size of the per-cpu region reserved to save crash notes is set by the per-architecture value MAX_NOTE_BYTES. Which in turn is currently set to 1024 on all supported architectures. While testing ia64 I recently discovered that this value is in fact too small. The particular setup I was using actually needs 1172 bytes. This lead to very tedious failure mode where the tail of one elf note would overwrite the head of another if they ended up being alocated sequentially by kmalloc, which was often the case. It seems to me that a far better approach is to caclculate the size that the area needs to be. This patch does just that. If a simpler stop-gap patch for ia64 to be squeezed into 2.6.21(.X) is needed then this should be as easy as making MAX_NOTE_BYTES larger in arch/asm-ia64/kexec.h. Perhaps 2048 would be a good choice. However, I think that the approach in this patch is a much more robust idea. Acked-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08remove artificial software max_loop limitKen Chen
Remove artificial maximum 256 loop device that can be created due to a legacy device number limit. Searching through lkml archive, there are several instances where users complained about the artificial limit that the loop driver impose. There is no reason to have such limit. This patch rid the limit entirely and make loop device and associated block queue instantiation on demand. With on-demand instantiation, it also gives the benefit of not wasting memory if these devices are not in use (compare to current implementation that always create 8 loop devices), a net improvement in both areas. This version is both tested with creation of large number of loop devices and is compatible with existing losetup/mount user land tools. There are a number of people who worked on this and provided valuable suggestions, in no particular order, by: Jens Axboe Jan Engelhardt Christoph Hellwig Thomas M Signed-off-by: Ken Chen <kenchen@google.com> Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08add touch_all_softlockup_watchdogs()Jeremy Fitzhardinge
Add touch_all_softlockup_watchdogs() to allow the softlockup watchdog timers on all cpus to be updated. This is used to prevent sysrq-t from generating a spurious watchdog message when generating lots of output. Softlockup watchdogs use sched_clock() as its timebase, which is inherently per-cpu (at least, when it is measuring unstolen time). Because of this, it isn't possible for one CPU to directly update the other CPU's timers, but it is possible to tell the other CPUs to do update themselves appropriately. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Acked-by: Chris Lalancette <clalance@redhat.com> Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Rick Lindsley <ricklind@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Move timekeeping code to timekeeping.cjohn stultz
Move the timekeeping code out of kernel/timer.c and into kernel/time/timekeeping.c. I made no cleanups or other changes in transit. [akpm@linux-foundation.org: build fix] Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08time: SMP friendly alignment of struct clocksourceEric Dumazet
struct clocksource is a critical data structure. Most of its fields are read only, some of them are heavily modified at each timer interrupt. It makes sense to separate those fields and make sure they all share one cache line, or at least the minimum for machines with small cache lines. [akpm@linux-foundation.org: build fix] Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08<linux/sysdev.h> needs to include <linux/module.h>Ralf Baechle
sysdev.h uses THIS_MODULE so should include <linux/module.h>. [akpm@linux-foundation.org: couple of fixes] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: Andi Kleen <ak@suse.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Add a new deferrable delayed work initVenki Pallipadi
Add a new deferrable delayed work init. This can be used to schedule work that are 'unimportant' when CPU is idle and can be called later, when CPU eventually comes out of idle. Use this init in cpufreq ondemand governor. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Add support for deferrable timersVenki Pallipadi
Introduce a new flag for timers - deferrable: Timers that work normally when system is busy. But, will not cause CPU to come out of idle (just to service this timer), when CPU is idle. Instead, this timer will be serviced when CPU eventually wakes up with a subsequent non-deferrable timer. The main advantage of this is to avoid unnecessary timer interrupts when CPU is idle. If the routine currently called by a timer can wait until next event without any issues, this new timer can be used to setup timer event for that routine. This, with dynticks, allows CPUs to be lazy, allowing them to stay in idle for extended period of time by reducing unnecesary wakeup and thereby reducing the power consumption. This patch: Builds this new timer on top of existing timer infrastructure. It uses last bit in 'base' pointer of timer_list structure to store this deferrable timer flag. __next_timer_interrupt() function skips over these deferrable timers when CPU looks for next timer event for which it has to wake up. This is exported by a new interface init_timer_deferrable() that can be called in place of regular init_timer(). [akpm@linux-foundation.org: Privatise a #define] Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08parport->dev driver model supportDavid Brownell
Currently a parport_driver can't get a handle on the device node for the underlying parport (PNPACPI, PCI, etc). That prevents correct placement of sysfs child nodes, which can affect things like power management. This patch adds a field to "struct parport" pointing to that device node, and updates non-legacy port drivers to initialize that device pointer. That field replaces the analagous PCI-only support in parport_pc. [akpm@linux-foundation.org: fix powerpc build] Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Delete unused header file linux/awe_voice.hRobert P. J. Day
Delete the unused header file include/linux/awe_voice.h, as well as its corresponding Kbuild entry. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08make remove_inode_dquot_ref() staticAdrian Bunk
remove_inode_dquot_ref() can now become static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Remove do_sync_file_range()Mark Fasheh
Remove do_sync_file_range() and convert callers to just use do_sync_mapping_range(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08move die notifier handling to common codeChristoph Hellwig
This patch moves the die notifier handling to common code. Previous various architectures had exactly the same code for it. Note that the new code is compiled unconditionally, this should be understood as an appel to the other architecture maintainer to implement support for it aswell (aka sprinkling a notify_die or two in the proper place) arm had a notifiy_die that did something totally different, I renamed it to arm_notify_die as part of the patch and made it static to the file it's declared and used at. avr32 used to pass slightly less information through this interface and I brought it into line with the other architectures. [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix vmalloc_sync_all bustage] [bryan.wu@analog.com: fix vmalloc_sync_all in nommu] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: <linux-arch@vger.kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Bryan Wu <bryan.wu@analog.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08tty: introduce no_tty and use it in selinuxEric W. Biederman
While researching the tty layer pid leaks I found a weird case in selinux when we drop a controlling tty because of inadequate permissions we don't do the normal hangup processing. Which is a problem if it happens the session leader has exec'd something that can no longer access the tty. We already have code in the kernel to handle this case in the form of the TIOCNOTTY ioctl. So this patch factors out a helper function that is the essence of that ioctl and calls it from the selinux code. This removes the inconsistency in handling dropping of a controlling tty and who knows it might even make some part of user space happy because it received a SIGHUP it was expecting. In addition since this removes the last user of proc_set_tty outside of tty_io.c proc_set_tty is made static and removed from tty.h Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: James Morris <jmorris@namei.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08enlarge console.nameAndrew Morton
console.name[] is eight chars, but so is "earlyvga". So when we try to print console->name when using earlyvga it runs off the end of the string. Make it bigger. Diagnosed-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08futex: get_futex_key, get_key_refs and drop_key_refsRusty Russell
lguest uses the convenient futex infrastructure for inter-domain I/O, so expose get_futex_key, get_key_refs (renamed get_futex_key_refs) and drop_key_refs (renamed drop_futex_key_refs). Also means we need to expose the union that these use. No code changes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08cyclades: remove custom typesKlaus Kudielka
Switch from private uclong, etc over to standard types. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08fix cyclades.h for x86_64 (and probably others)Klaus Kudielka
At least on x86_64 the present cyclades.h is broken due to the wrong size of uclong. This affects, of course, both the kernel and the user-level utilities. The symptom is that cyzload refuses to load the firmware. I also managed to freeze the machine when unloading the module. The patch below fixes this in an architecture-independent way. I have tested it with 2.6.19 and the driver works fine again with a Cyclades-Z on an Athlon 64 X2. [akpm@linux-foundation.org: fix warnings] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Kprobes: Make kprobe.symbol_name constAnanth N Mavinakayanahalli
Kprobes doesn't scribble the kprobe.symbol_name field. Its only set by the module when registering the probe. Modules that exercise good hygiene using the "const" qualifier will see warnings... warning: assignment discards qualifiers from pointer target type Make struct kprobe.symbol_name const char * Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Jim Keniston <jkenisto@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08VFS: delay the dentry name generation on sockets and pipesEric Dumazet
1) Introduces a new method in 'struct dentry_operations'. This method called d_dname() might be called from d_path() to build a pathname for special filesystems. It is called without locks. Future patches (if we succeed in having one common dentry for all pipes/sockets) may need to change prototype of this method, but we now use : char *d_dname(struct dentry *dentry, char *buffer, int buflen); 2) Adds a dynamic_dname() helper function that eases d_dname() implementations 3) Defines d_dname method for sockets : No more sprintf() at socket creation. This is delayed up to the moment someone does an access to /proc/pid/fd/... 4) Defines d_dname method for pipes : No more sprintf() at pipe creation. This is delayed up to the moment someone does an access to /proc/pid/fd/... A benchmark consisting of 1.000.000 calls to pipe()/close()/close() gives a *nice* speedup on my Pentium(M) 1.6 Ghz : 3.090 s instead of 3.450 s Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Christoph Hellwig <hch@infradead.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Fix race between proc_get_inode() and remove_proc_entry()Alexey Dobriyan
proc_lookup remove_proc_entry =========== ================= lock_kernel(); spin_lock(&proc_subdir_lock); [find PDE with refcount 0] spin_unlock(&proc_subdir_lock); spin_lock(&proc_subdir_lock); [find PDE with refcount 0] [check refcount and free PDE] spin_unlock(&proc_subdir_lock); proc_get_inode: de_get(de); /* boom */ Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08add filesystem subtype supportMiklos Szeredi
There's a slight problem with filesystem type representation in fuse based filesystems. From the kernel's view, there are just two filesystem types: fuse and fuseblk. From the user's view there are lots of different filesystem types. The user is not even much concerned if the filesystem is fuse based or not. So there's a conflict of interest in how this should be represented in fstab, mtab and /proc/mounts. The current scheme is to encode the real filesystem type in the mount source. So an sshfs mount looks like this: sshfs#user@server:/ /mnt/server fuse rw,nosuid,nodev,... This url-ish syntax works OK for sshfs and similar filesystems. However for block device based filesystems (ntfs-3g, zfs) it doesn't work, since the kernel expects the mount source to be a real device name. A possibly better scheme would be to encode the real type in the type field as "type.subtype". So fuse mounts would look like this: /dev/hda1 /mnt/windows fuseblk.ntfs-3g rw,... user@server:/ /mnt/server fuse.sshfs rw,nosuid,nodev,... This patch adds the necessary code to the kernel so that this can be correctly displayed in /proc/mounts. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08init dma masks in pnp_devDavid Brownell
PNP now initializes device dma masks, which prevents oopses when generic dma calls are made using pnp device nodes. This assumes PNP only uses ISA DMA, with 24 bit addresses; and that it's safe to init those masks for all devices (rather than finding out which devices have been assigned DMA channels, and handling only those). Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Adam Belay <abelay@novell.com> Cc: Jaroslav Kysela <perex@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Merge sys_clone()/sys_unshare() nsproxy and namespace handlingBadari Pulavarty
sys_clone() and sys_unshare() both makes copies of nsproxy and its associated namespaces. But they have different code paths. This patch merges all the nsproxy and its associated namespace copy/clone handling (as much as possible). Posted on container list earlier for feedback. - Create a new nsproxy and its associated namespaces and pass it back to caller to attach it to right process. - Changed all copy_*_ns() routines to return a new copy of namespace instead of attaching it to task->nsproxy. - Moved the CAP_SYS_ADMIN checks out of copy_*_ns() routines. - Removed unnessary !ns checks from copy_*_ns() and added BUG_ON() just incase. - Get rid of all individual unshare_*_ns() routines and make use of copy_*_ns() instead. [akpm@osdl.org: cleanups, warning fix] [clg@fr.ibm.com: remove dup_namespaces() declaration] [serue@us.ibm.com: fix CONFIG_IPC_NS=n, clone(CLONE_NEWIPC) retval] [akpm@linux-foundation.org: fix build with CONFIG_SYSVIPC=n] Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Serge Hallyn <serue@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: <containers@lists.osdl.org> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08IRQ: add __must_check to request_irqMonakhov Dmitriy
This could help to find buggy drivers where request_irq return value wasn't checked. There's just no reason to ignore errors which can and do occur. Anyone who got warning during compilation have to realise what it is't realy safe code. Signed-off-by: Monakhov Dmitriy <dmonakhov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08reiserfs: shrink superblock if no xattrsAlexey Dobriyan
This makes in-core superblock fit into one cacheline here. Before: struct dentry * xattr_root; /* 124 4 */ /* --- cacheline 1 boundary (128 bytes) --- */ struct rw_semaphore xattr_dir_sem; /* 128 12 */ int j_errno; /* 140 4 */ }; /* size: 144, cachelines: 2 */ /* sum members: 142, holes: 1, sum holes: 2 */ /* last cacheline: 16 bytes */ After: int j_errno; /* 124 4 */ /* --- cacheline 1 boundary (128 bytes) --- */ }; /* size: 128, cachelines: 1 */ /* sum members: 126, holes: 1, sum holes: 2 */ Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Fix compilation of drivers with -O0Michal Schmidt
It is sometimes useful to compile individual drivers with optimization disabled for easier debugging. Currently drivers which use htonl() and similar functions don't compile with -O0. This patch fixes it. It also removes obsolete and misleading comments. This header is not for userspace, so we don't have to care about strange programs these comments mention. (akpm: -O0 probably isn't a good idea, but this code looks pretty crufty and unuseful) Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08init/do_mounts.c: proper prepare_namespace() prototypeAdrian Bunk
Add a proper protype for prepare_namespace() in include/linux/init.h. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08use use SEEK_MAX to validate user lseek argumentsChris Snook
Add SEEK_MAX and use it to validate lseek arguments from userspace. Signed-off-by: Chris Snook <csnook@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Fix constant folding and poor optimization in byte swapping codeTrent Piepho
Constant folding does not work for the swabXX() byte swapping functions, and the C versions optimize poorly. Attempting to initialize a global variable to swab16(0x1234) or put something like "case swab32(42):" in a switch statement will not compile. It can work, swab.h just isn't doing it correctly. This patch fixes that. Contrary to the comment in asm-i386/byteorder.h, gcc does not recognize the "C" version of swab16 and turn it into efficient code. gcc can do this, just not with the current code. The simple function: u16 foo(u16 x) { return swab16(x); } Would compile to: movzwl %ax, %eax movl %eax, %edx shrl $8, %eax sall $8, %edx orl %eax, %edx With this patch, it will compile to: rolw $8, %ax I also attempted to document the maze different macros/inline functions that are used to create the final product. Signed-off-by: Trent Piepho <xyzzy@speakeasy.org> Cc: Francois-Rene Rideau <fare@tunes.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08reduce size of task_struct on 64-bit machinesWilliam Cohen
This past week I was playing around with that pahole tool (http://oops.ghostprotocols.net:81/acme/dwarves/) and looking at the size of various struct in the kernel. I was surprised by the size of the task_struct on x86_64, approaching 4K. I looked through the fields in task_struct and found that a number of them were declared as "unsigned long" rather than "unsigned int" despite them appearing okay as 32-bit sized fields. On x86_64 "unsigned long" ends up being 8 bytes in size and forces 8 byte alignment. Is there a reason there a reason they are "unsigned long"? The patch below drops the size of the struct from 3808 bytes (60 64-byte cachelines) to 3760 bytes (59 64-byte cachelines). A couple other fields in the task struct take a signficant amount of space: struct thread_struct thread; 688 struct held_lock held_locks[30]; 1680 CONFIG_LOCKDEP is turned on in the .config [akpm@linux-foundation.org: fix printk warnings] Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08simplify the stacktrace codeChristoph Hellwig
Simplify the stacktrace code: - remove the unused task argument to save_stack_trace, it's always current - remove the all_contexts flag, it's alwasy 0 Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Andi Kleen <ak@suse.de> Cc: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Factor outstanding I/O error handlingGuillaume Chazarain
Cleanup: setting an outstanding error on a mapping was open coded too many times. Factor it out in mapping_set_error(). Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08mm: move common segment checks to separate helper functionDmitriy Monakhov
[akpm@linux-foundation.org: cleanup] Signed-off-by: Monakhov Dmitriy <dmonakhov@openvz.org> Cc: Christoph Hellwig <hch@lst.de> Acked-by: Anton Altaparmakov <aia21@cam.ac.uk> Acked-by: David Chinner <dgc@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Increase slab redzone to 64bitsDavid Woodhouse
There are two problems with the existing redzone implementation. Firstly, it's causing misalignment of structures which contain a 64-bit integer, such as netfilter's 'struct ipt_entry' -- causing netfilter modules to fail to load because of the misalignment. (In particular, the first check in net/ipv4/netfilter/ip_tables.c::check_entry_size_and_hooks()) On ppc32 and sparc32, amongst others, __alignof__(uint64_t) == 8. With slab debugging, we use 32-bit redzones. And allocated slab objects aren't sufficiently aligned to hold a structure containing a uint64_t. By _just_ setting ARCH_KMALLOC_MINALIGN to __alignof__(u64) we'd disable redzone checks on those architectures. By using 64-bit redzones we avoid that loss of debugging, and also fix the other problem while we're at it. When investigating this, I noticed that on 64-bit platforms we're using a 32-bit value of RED_ACTIVE/RED_INACTIVE in the 64-bit memory location set aside for the redzone. Which means that the four bytes immediately before or after the allocated object at 0x00,0x00,0x00,0x00 for LE and BE machines, respectively. Which is probably not the most useful choice of poison value. One way to fix both of those at once is just to switch to 64-bit redzones in all cases. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Christoph Lameter <clameter@engr.sgi.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07Merge branch 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
* 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux: gfs2: nfs lock support for gfs2 lockd: add code to handle deferred lock requests lockd: always preallocate block in nlmsvc_lock() lockd: handle test_lock deferrals lockd: pass cookie in nlmsvc_testlock lockd: handle fl_grant callbacks lockd: save lock state on deferral locks: add fl_grant callback for asynchronous lock return nfsd4: Convert NFSv4 to new lock interface locks: add lock cancel command locks: allow {vfs,posix}_lock_file to return conflicting lock locks: factor out generic/filesystem switch from setlock code locks: factor out generic/filesystem switch from test_lock locks: give posix_test_lock same interface as ->lock locks: make ->lock release private data before returning in GETLK case locks: create posix-to-flock helper functions locks: trivial removal of unnecessary parentheses
2007-05-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmwLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (34 commits) [GFS2] Uncomment sprintf_symbol calling code [DLM] lowcomms style [GFS2] printk warning fixes [GFS2] Patch to fix mmap of stuffed files [GFS2] use lib/parser for parsing mount options [DLM] Lowcomms nodeid range & initialisation fixes [DLM] Fix dlm_lowcoms_stop hang [DLM] fix mode munging [GFS2] lockdump improvements [GFS2] Patch to detect corrupt number of dir entries in leaf and/or inode blocks [GFS2] bz 236008: Kernel gpf doing cat /debugfs/gfs2/xxx (lock dump) [DLM] fs/dlm/ast.c should #include "ast.h" [DLM] Consolidate transport protocols [DLM] Remove redundant assignment [GFS2] Fix bz 234168 (ignoring rgrp flags) [DLM] change lkid format [DLM] interface for purge (2/2) [DLM] add orphan purging code (1/2) [DLM] split create_message function [GFS2] Set drop_count to 0 (off) by default ...
2007-05-07Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [NET]: rfkill: add support for input key to control wireless radio [NET] net/core: Fix error handling [TG3]: Update version and reldate. [TG3]: Eliminate spurious interrupts. [TG3]: Add ASPM workaround. [Bluetooth] Correct SCO buffer for another Broadcom based dongle [Bluetooth] Add support for Targus ACB10US USB dongle [Bluetooth] Disconnect L2CAP connection after last RFCOMM DLC [Bluetooth] Check that device is in rfcomm_dev_list before deleting [Bluetooth] Use in-kernel sockets API [Bluetooth] Attach host adapters to the Bluetooth bus [Bluetooth] Fix L2CAP and HCI setsockopt() information leaks