Kernel - My Linux kernel repository

Age	Commit message (Collapse)	Author
2008-10-17	9p: eliminate depricated conv functions	Eric Van Hensbergen
	Remove depricated conv functions which have been replaced with new protocol routines. This patch also reworks the one instance of the file-system code which directly calls conversion routines (to accomplish unpacking dirreads). Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17	9p: rework client code to use new protocol support functions	Eric Van Hensbergen
	Now that the new protocol functions are in place, this patch switches the client code to using the new support code. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17	9p: move dirread to fs layer	Eric Van Hensbergen
	Currently reading a directory is implemented in the client code. This function is not actually a wire operation, but a meta operation which calls read operations and processes the results. This patch moves this functionality to the fs layer and calls component wire operations instead of constructing their packets. This provides a cleaner separation and will help when we reorganize the client functions and protocol processing methods. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17	9p: adjust 9p vfs write operation	Eric Van Hensbergen
	Currently, the 9p net wire operation ensures that all data is sent by sending multiple packets if the data requested is larger than the msize. This is better handled in the vfs code so that we can simplify wire operations to being concerned with only putting data onto and taking data off of the wire. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17	9p: move readn meta-function from client to fs layer	Eric Van Hensbergen
	There are a couple of methods in the client code which aren't actually wire operations. To keep things organized cleaner, these operations are being moved to the fs layer. This patch moves the readn meta-function (which executes multiple wire reads until a buffer is full) to the fs layer. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17	9p: consolidate read/write functions	Eric Van Hensbergen
	Currently there are two separate versions of read and write. One for dealing with user buffers and the other for dealing with kernel buffers. There is a tremendous amount of code duplication in the otherwise identical versions of these functions. This patch adds an additional user buffer parameter to read and write and conditionalizes handling of the buffer on whether the kernel buffer or the user buffer is populated. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17	9p: consolidate transport structure	Eric Van Hensbergen
	Right now there is a transport module structure which provides per-transport type functions and data and a transport structure which contains per-instance public data as well as function pointers to instance specific functions. This patch moves public transport visible instance data to the client structure (which in some cases had duplicate data) and consolidates the functions into the transport module structure. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-16	Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6	Linus Torvalds
	* git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (53 commits) NFS: Fix a resolution problem with nfs_inode->cache_change_attribute NFS: Fix the resolution problem with nfs_inode_attrs_need_update() NFS: Changes to inode->i_nlinks must set the NFS_INO_INVALID_ATTR flag RPC/RDMA: ensure connection attempt is complete before signalling. RPC/RDMA: correct the reconnect timer backoff RPC/RDMA: optionally emit useful transport info upon connect/disconnect. RPC/RDMA: reformat a debug printk to keep lines together. RPC/RDMA: harden connection logic against missing/late rdma_cm upcalls. RPC/RDMA: fix connect/reconnect resource leak. RPC/RDMA: return a consistent error, when connect fails. RPC/RDMA: adhere to protocol for unpadded client trailing write chunks. RPC/RDMA: avoid an oops due to disconnect racing with async upcalls. RPC/RDMA: maintain the RPC task bytes-sent statistic. RPC/RDMA: suppress retransmit on RPC/RDMA clients. RPC/RDMA: fix connection IRD/ORD setting RPC/RDMA: support FRMR client memory registration. RPC/RDMA: check selected memory registration mode at runtime. RPC/RDMA: add data types and new FRMR memory registration enum. RPC/RDMA: refactor the inline memory registration code. NFS: fix nfs_parse_ip_address() corner case ...
2008-10-16	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (46 commits) UIO: Fix mapping of logical and virtual memory UIO: add automata sercos3 pci card support UIO: Change driver name of uio_pdrv UIO: Add alignment warnings for uio-mem Driver core: add bus_sort_breadthfirst() function NET: convert the phy_device file to use bus_find_device_by_name kobject: Cleanup kobject_rename and !CONFIG_SYSFS kobject: Fix kobject_rename and !CONFIG_SYSFS sysfs: Make dir and name args to sysfs_notify() const platform: add new device registration helper sysfs: use ilookup5() instead of ilookup5_nowait() PNP: create device attributes via default device attributes Driver core: make bus_find_device_by_name() more robust usb: turn dev_warn+WARN_ON combos into dev_WARN debug: use dev_WARN() rather than WARN_ON() in device_pm_add() debug: Introduce a dev_WARN() function sysfs: fix deadlock device model: Do a quickcheck for driver binding before doing an expensive check Driver core: Fix cleanup in device_create_vargs(). Driver core: Clarify device cleanup. ...
2008-10-16	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: module: remove CONFIG_KMOD in comment after #endif remove CONFIG_KMOD from fs remove CONFIG_KMOD from drivers Manually fix conflict due to include cleanups in drivers/md/md.c
2008-10-16	Merge branch 'personality' of git://git390.osdl.marist.edu/pub/scm/linux-2.6	Linus Torvalds
	* 'personality' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: [PATCH] remove unused ibcs2/PER_SVR4 in SET_PERSONALITY
2008-10-16	Configure out AIO support	Thomas Petazzoni
	This patchs adds the CONFIG_AIO option which allows to remove support for asynchronous I/O operations, that are not necessarly used by applications, particularly on embedded devices. As this is a size-reduction option, it depends on CONFIG_EMBEDDED. It allows to save ~7 kilobytes of kernel code/data: text data bss dec hex filename 1115067 119180 217088 1451335 162547 vmlinux 1108025 119048 217088 1444161 160941 vmlinux.new -7042 -132 0 -7174 -1C06 +/- This patch has been originally written by Matt Mackall <mpm@selenic.com>, and is part of the Linux Tiny project. [randy.dunlap@oracle.com: build fix] Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Zach Brown <zach.brown@oracle.com> Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	afs: convert to new aops	Nick Piggin
	Cannot assume writes will fully complete, so this conversion goes the easy way and always brings the page uptodate before the write. [dhowells@redhat.com: style tweaks] Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	pid_ns: de_thread: kill the now unneeded ->child_reaper change	Oleg Nesterov
	de_thread() checks if the old leader was the ->child_reaper, this is not possible any longer. With the previous patch ->group_leader itself will change ->child_reaper on exit. Henceforth find_new_reaper() is the only function (apart from initialization) which plays with ->child_reaper. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	proc: move sysrq-trigger out of fs/proc/	Alexey Dobriyan
	Move it into sysrq.c, along with the rest of the sysrq implementation. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	block: sanitize invalid partition table entries	Kay Sievers
	We currently follow blindly what the partition table lies about the disk, and let the kernel create block devices which can not be accessed. Trying to identify the device leads to kernel logs full of: sdb: rw=0, want=73392, limit=28800 attempt to access beyond end of device Here is an example of a broken partition table, where sda2 starts behind the end of the disk, and sdb3 is larger than the entire disk: Disk /dev/sdb: 14 MB, 14745600 bytes 1 heads, 29 sectors/track, 993 cylinders, total 28800 sectors Device Boot Start End Blocks Id System /dev/sdb1 29 7800 3886 83 Linux /dev/sdb2 37801 45601 3900+ 83 Linux /dev/sdb3 15602 73402 28900+ 83 Linux /dev/sdb4 23403 28796 2697 83 Linux The kernel creates these completely invalid devices, which can not be accessed, or may lead to other unpredictable failures: grep . /sys/class/block/sdb/{start,size} /sys/class/block/sdb/size:28800 /sys/class/block/sdb1/start:29 /sys/class/block/sdb1/size:7772 /sys/class/block/sdb2/start:37801 /sys/class/block/sdb2/size:7801 /sys/class/block/sdb3/start:15602 /sys/class/block/sdb3/size:57801 /sys/class/block/sdb4/start:23403 /sys/class/block/sdb4/size:5394 With this patch, we ignore partitions which start behind the end of the disk, and limit partitions to the end of the disk if they pretend to be larger: grep . /sys/class/block/sdb/{start,size} /sys/class/block/sdb/size:28800 /sys/class/block/sdb1/start:29 /sys/class/block/sdb1/size:7772 /sys/class/block/sdb3/start:15602 /sys/class/block/sdb3/size:13198 /sys/class/block/sdb4/start:23403 /sys/class/block/sdb4/size:5394 These warnings are printed to the kernel log: sdb: p2 ignored, start 37801 is behind the end of the disk sdb: p3 size 57801 limited to end of disk Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Herton Ronaldo Krzesinski <herton@mandriva.com.br> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	fs/partitions/acorn.c: remove dead code	Adrian Bunk
	I missed this when I did the arm26 removal. Reported-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	COMPAT_BINFMT_ELF definition tweak	Alexey Dobriyan
	Don't repeat BINFMT_ELF definition, simply multiply COMPAT and BINFMT_ELF. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	binfmt_elf_fdpic: wire up AT_EXECFD, AT_EXECFN, AT_SECURE	Paul Mundt
	These auxvec entries are the only ones left unhandled out of the current base implementation. This syncs up binfmt_elf_fdpic with linux/auxvec.h and current binfmt_elf. Signed-off-by: Paul Mundt <lethal@linux-sh.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	binfmt_elf_fdpic: convert initial stack alignment to arch_align_stack()	Paul Mundt
	binfmt_elf_fdpic seems to have grabbed a hard-coded hack from an ancient version of binfmt_elf in order to try and fix up initial stack alignment on multi-threaded x86, which while in addition to being unused, was also pushed down beyond the first set of operations on the stack pointer, negating the entire purpose. These days, we have an architecture independent arch_align_stack(), so we switch to using that instead. Move the initial alignment up before the initial stores while we're at it. Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	binfmt_elf_fdpic: support auxvec base platform string	Paul Mundt
	Commit 483fad1c3fa1060d7e6710e84a065ad514571739 ("ELF loader support for auxvec base platform string") introduced AT_BASE_PLATFORM, but only implemented it for binfmt_elf. Given that AT_VECTOR_SIZE_BASE is unconditionally enlarged for us, and it's only optionally added in for the platforms that set ELF_BASE_PLATFORM, wire it up for binfmt_elf_fdpic, too. Signed-off-by: Paul Mundt <lethal@linux-sh.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	quota: remove CVS keywords	Adrian Bunk
	Remove CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	fs/reiserfs: use an IS_ERR test rather than a NULL test	Julien Brunel
	In case of error, the function open_xa_dir returns an ERR pointer, but never returns a NULL pointer. So a NULL test that comes after an IS_ERR test should be deleted. The semantic match that finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @match_bad_null_test@ expression x, E; statement S1,S2; @@ x = open_xa_dir(...) ... when != x = E ( * if (x == NULL && ...) S1 else S2 \| * if (x == NULL \|\| ...) S1 else S2 ) // </smpl> Signed-off-by: Julien Brunel <brunel@diku.dk> Signed-off-by: Julia Lawall <julia@diku.dk> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	reiserfs/procfs.c: remove CVS keywords	Adrian Bunk
	Remove CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	hfs: fix namelength memory corruption	Eric Sesterhenn
	Fix a stack corruption caused by a corrupted hfs filesystem. If the catalog name length is corrupted the memcpy overwrites the catalog btree structure. Since the field is limited to HFS_NAMELEN bytes in the structure and the file format, we throw an error if it is too long. Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	hfsplus: check read_mapping_page() return value	Eric Sesterhenn
	While testing more corrupted images with hfsplus, i came across one which triggered the following bug: [15840.675016] BUG: unable to handle kernel paging request at fffffffb [15840.675016] IP: [<c0116a4f>] kmap+0x15/0x56 [15840.675016] pde = 00008067 pte = 00000000 [15840.675016] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC [15840.675016] Modules linked in: [15840.675016] [15840.675016] Pid: 11575, comm: ln Not tainted (2.6.27-rc4-00123-gd3ee1b4-dirty #29) [15840.675016] EIP: 0060:[<c0116a4f>] EFLAGS: 00010202 CPU: 0 [15840.675016] EIP is at kmap+0x15/0x56 [15840.675016] EAX: 00000246 EBX: fffffffb ECX: 00000000 EDX: cab919c0 [15840.675016] ESI: 000007dd EDI: cab0bcf4 EBP: cab0bc98 ESP: cab0bc94 [15840.675016] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 [15840.675016] Process ln (pid: 11575, ti=cab0b000 task=cab919c0 task.ti=cab0b000) [15840.675016] Stack: 00000000 cab0bcdc c0231cfb 00000000 cab0bce0 00000800 ca9290c0 fffffffb [15840.675016] cab145d0 cab919c0 cab15998 22222222 22222222 22222222 00000001 cab15960 [15840.675016] 000007dd cab0bcf4 cab0bd04 c022cb3a cab0bcf4 cab15a6c ca9290c0 00000000 [15840.675016] Call Trace: [15840.675016] [<c0231cfb>] ? hfsplus_block_allocate+0x6f/0x2d3 [15840.675016] [<c022cb3a>] ? hfsplus_file_extend+0xc4/0x1db [15840.675016] [<c022ce41>] ? hfsplus_get_block+0x8c/0x19d [15840.675016] [<c06adde4>] ? sub_preempt_count+0x9d/0xab [15840.675016] [<c019ece6>] ? __block_prepare_write+0x147/0x311 [15840.675016] [<c0161934>] ? __grab_cache_page+0x52/0x73 [15840.675016] [<c019ef4f>] ? block_write_begin+0x79/0xd5 [15840.675016] [<c022cdb5>] ? hfsplus_get_block+0x0/0x19d [15840.675016] [<c019f22a>] ? cont_write_begin+0x27f/0x2af [15840.675016] [<c022cdb5>] ? hfsplus_get_block+0x0/0x19d [15840.675016] [<c0139ebe>] ? tick_program_event+0x28/0x4c [15840.675016] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd [15840.675016] [<c022b723>] ? hfsplus_write_begin+0x2d/0x32 [15840.675016] [<c022cdb5>] ? hfsplus_get_block+0x0/0x19d [15840.675016] [<c0161988>] ? pagecache_write_begin+0x33/0x107 [15840.675016] [<c01879e5>] ? __page_symlink+0x3c/0xae [15840.675016] [<c019ad34>] ? __mark_inode_dirty+0x12f/0x137 [15840.675016] [<c0187a70>] ? page_symlink+0x19/0x1e [15840.675016] [<c022e6eb>] ? hfsplus_symlink+0x41/0xa6 [15840.675016] [<c01886a9>] ? vfs_symlink+0x99/0x101 [15840.675016] [<c018a2f6>] ? sys_symlinkat+0x6b/0xad [15840.675016] [<c018a348>] ? sys_symlink+0x10/0x12 [15840.675016] [<c01038bd>] ? sysenter_do_call+0x12/0x31 [15840.675016] ======================= [15840.675016] Code: 00 00 75 10 83 3d 88 2f ec c0 02 75 07 89 d0 e8 12 56 05 00 5d c3 55 ba 06 00 00 00 89 e5 53 89 c3 b8 3d eb 7e c0 e8 16 74 00 00 <8b> 03 c1 e8 1e 69 c0 d8 02 00 00 05 b8 69 8e c0 2b 80 c4 02 00 [15840.675016] EIP: [<c0116a4f>] kmap+0x15/0x56 SS:ESP 0068:cab0bc94 [15840.675016] ---[ end trace 4fea40dad6b70e5f ]--- This happens because the return value of read_mapping_page() is passed on to kmap unchecked. The bug is triggered after the first read_mapping_page() in hfsplus_block_allocate(), this patch fixes all three usages in this functions but leaves the ones further down in the file unchanged. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	hfsplus: fix Buffer overflow with a corrupted image	Eric Sesterhenn
	When an hfsplus image gets corrupted it might happen that the catalog namelength field gets b0rked. If we mount such an image the memcpy() in hfsplus_cat_build_key_uni() writes more than the 255 that fit in the name field. Depending on the size of the overwritten data, we either only get memory corruption or also trigger an oops like this: [ 221.628020] BUG: unable to handle kernel paging request at c82b0000 [ 221.629066] IP: [<c022d4b1>] hfsplus_find_cat+0x10d/0x151 [ 221.629066] pde = 0ea29163 pte = 082b0160 [ 221.629066] Oops: 0002 [#1] PREEMPT DEBUG_PAGEALLOC [ 221.629066] Modules linked in: [ 221.629066] [ 221.629066] Pid: 4845, comm: mount Not tainted (2.6.27-rc4-00123-gd3ee1b4-dirty #28) [ 221.629066] EIP: 0060:[<c022d4b1>] EFLAGS: 00010206 CPU: 0 [ 221.629066] EIP is at hfsplus_find_cat+0x10d/0x151 [ 221.629066] EAX: 00000029 EBX: 00016210 ECX: 000042c2 EDX: 00000002 [ 221.629066] ESI: c82d70ca EDI: c82b0000 EBP: c82d1bcc ESP: c82d199c [ 221.629066] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 [ 221.629066] Process mount (pid: 4845, ti=c82d1000 task=c8224060 task.ti=c82d1000) [ 221.629066] Stack: c080b3c4 c82aa8f8 c82d19c2 00016210 c080b3be c82d1bd4 c82aa8f0 00000300 [ 221.629066] 01000000 750008b1 74006e00 74006900 65006c00 c82d6400 c013bd35 c8224060 [ 221.629066] 00000036 00000046 c82d19f0 00000082 c8224548 c8224060 00000036 c0d653cc [ 221.629066] Call Trace: [ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd [ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b [ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd [ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b [ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd [ 221.629066] [<c0107aa3>] ? native_sched_clock+0x82/0x96 [ 221.629066] [<c01302d2>] ? __kernel_text_address+0x1b/0x27 [ 221.629066] [<c010487a>] ? dump_trace+0xca/0xd6 [ 221.629066] [<c0109e32>] ? save_stack_address+0x0/0x2c [ 221.629066] [<c0109eaf>] ? save_stack_trace+0x1c/0x3a [ 221.629066] [<c013b571>] ? save_trace+0x37/0x8d [ 221.629066] [<c013b62e>] ? add_lock_to_list+0x67/0x8d [ 221.629066] [<c013ea1c>] ? validate_chain+0x8a4/0x9f4 [ 221.629066] [<c013553d>] ? down+0xc/0x2f [ 221.629066] [<c013f1f6>] ? __lock_acquire+0x68a/0x6e0 [ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd [ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b [ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd [ 221.629066] [<c0107aa3>] ? native_sched_clock+0x82/0x96 [ 221.629066] [<c013da5d>] ? mark_held_locks+0x43/0x5a [ 221.629066] [<c013dc3a>] ? trace_hardirqs_on+0xb/0xd [ 221.629066] [<c013dbf4>] ? trace_hardirqs_on_caller+0xf4/0x12f [ 221.629066] [<c06abec8>] ? _spin_unlock_irqrestore+0x42/0x58 [ 221.629066] [<c013555c>] ? down+0x2b/0x2f [ 221.629066] [<c022aa68>] ? hfsplus_iget+0xa0/0x154 [ 221.629066] [<c022b0b9>] ? hfsplus_fill_super+0x280/0x447 [ 221.629066] [<c0107aa3>] ? native_sched_clock+0x82/0x96 [ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b [ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b [ 221.629066] [<c013f1f6>] ? __lock_acquire+0x68a/0x6e0 [ 221.629066] [<c041c9e4>] ? string+0x2b/0x74 [ 221.629066] [<c041cd16>] ? vsnprintf+0x2e9/0x512 [ 221.629066] [<c010487a>] ? dump_trace+0xca/0xd6 [ 221.629066] [<c0109eaf>] ? save_stack_trace+0x1c/0x3a [ 221.629066] [<c0109eaf>] ? save_stack_trace+0x1c/0x3a [ 221.629066] [<c013b571>] ? save_trace+0x37/0x8d [ 221.629066] [<c013b62e>] ? add_lock_to_list+0x67/0x8d [ 221.629066] [<c013ea1c>] ? validate_chain+0x8a4/0x9f4 [ 221.629066] [<c01354d3>] ? up+0xc/0x2f [ 221.629066] [<c013f1f6>] ? __lock_acquire+0x68a/0x6e0 [ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd [ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b [ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd [ 221.629066] [<c0107aa3>] ? native_sched_clock+0x82/0x96 [ 221.629066] [<c041cfb7>] ? snprintf+0x1b/0x1d [ 221.629066] [<c01ba466>] ? disk_name+0x25/0x67 [ 221.629066] [<c0183960>] ? get_sb_bdev+0xcd/0x10b [ 221.629066] [<c016ad92>] ? kstrdup+0x2a/0x4c [ 221.629066] [<c022a7b3>] ? hfsplus_get_sb+0x13/0x15 [ 221.629066] [<c022ae39>] ? hfsplus_fill_super+0x0/0x447 [ 221.629066] [<c0183583>] ? vfs_kern_mount+0x3b/0x76 [ 221.629066] [<c0183602>] ? do_kern_mount+0x32/0xba [ 221.629066] [<c01960d4>] ? do_new_mount+0x46/0x74 [ 221.629066] [<c0196277>] ? do_mount+0x175/0x193 [ 221.629066] [<c013dbf4>] ? trace_hardirqs_on_caller+0xf4/0x12f [ 221.629066] [<c01663b2>] ? __get_free_pages+0x1e/0x24 [ 221.629066] [<c06ac07b>] ? lock_kernel+0x19/0x8c [ 221.629066] [<c01962e6>] ? sys_mount+0x51/0x9b [ 221.629066] [<c01962f9>] ? sys_mount+0x64/0x9b [ 221.629066] [<c01038bd>] ? sysenter_do_call+0x12/0x31 [ 221.629066] ======================= [ 221.629066] Code: 89 c2 c1 e2 08 c1 e8 08 09 c2 8b 85 e8 fd ff ff 66 89 50 06 89 c7 53 83 c7 08 56 57 68 c4 b3 80 c0 e8 8c 5c ef ff 89 d9 c1 e9 02 <f3> a5 89 d9 83 e1 03 74 02 f3 a4 83 c3 06 8b 95 e8 fd ff ff 0f [ 221.629066] EIP: [<c022d4b1>] hfsplus_find_cat+0x10d/0x151 SS:ESP 0068:c82d199c [ 221.629066] ---[ end trace e417a1d67f0d0066 ]--- Since hfsplus_cat_build_key_uni() returns void and only has one callsite, the check is performed at the callsite. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	hfsplus: quieten down mounting hfsplus journaled fs read only	Mike Crowe
	Check whether the file system was to be mounted read only anyway before warning about changing the mount to read only. Signed-off-by: Mike Crowe <mac@mcrowe.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	befs: annotate fs32 on tests for superblock endianness	Harvey Harrison
	Does compile-time byteswapping rather than runtime. Noticed by sparse: fs/befs/super.c:29:6: warning: cast to restricted __le32 fs/befs/super.c:29:6: warning: cast from restricted fs32 fs/befs/super.c:31:11: warning: cast to restricted __be32 fs/befs/super.c:31:11: warning: cast from restricted fs32 fs/befs/super.c:31:11: warning: cast to restricted __be32 fs/befs/super.c:31:11: warning: cast from restricted fs32 fs/befs/super.c:31:11: warning: cast to restricted __be32 fs/befs/super.c:31:11: warning: cast from restricted fs32 fs/befs/super.c:31:11: warning: cast to restricted __be32 fs/befs/super.c:31:11: warning: cast from restricted fs32 fs/befs/super.c:31:11: warning: cast to restricted __be32 fs/befs/super.c:31:11: warning: cast from restricted fs32 fs/befs/super.c:31:11: warning: cast to restricted __be32 fs/befs/super.c:31:11: warning: cast from restricted fs32 fs/befs/linuxvfs.c:811:7: warning: cast to restricted __le32 fs/befs/linuxvfs.c:811:7: warning: cast from restricted fs32 fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32 fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32 fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32 fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32 fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32 fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32 fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32 fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32 fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32 fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32 fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32 fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: "Sergey S. Kostyliov" <rathamahata@php4.ru> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	ext2: avoid printk floods in the face of directory corruption	Eric Sandeen
	A very large directory with many read failures (either due to storage problems, or due to invalid size & blocks from corruption) will generate a printk storm as the filesystem continues to try to read all the blocks. This flood of messages can tie up the box until it is complete - which may be a very long time, especially for very large corrupted values. This is fixed by only reporting the corruption once each time we try to read the directory. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Eugene Teo <eugeneteo@kernel.sg> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	ext2: fix ext2 block reservation early ENOSPC issue	Mingming Cao
	We could run into ENOSPC error on ext2, even when there is free blocks on the filesystem. The problem is triggered in the case the goal block group has 0 free blocks , and the rest block groups are skipped due to the check of "free_blocks < windowsz/2". Current code could fall back to non reservation allocation to prevent early ENOSPC after examing all the block groups with reservation on , but this code was bypassed if the reservation window is turned off already, which is true in this case. This patch fixed two issues: 1) We don't need to turn off block reservation if the goal block group has 0 free blocks left and continue search for the rest of block groups. Current code the intention is to turn off the block reservation if the goal allocation group has a few (some) free blocks left (not enough for make the desired reservation window),to try to allocation in the goal block group, to get better locality. But if the goal blocks have 0 free blocks, it should leave the block reservation on, and continues search for the next block groups,rather than turn off block reservation completely. 2) we don't need to check the window size if the block reservation is off. The problem was originally found and fixed in ext4. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	autofs4: add miscellaneous device for ioctls	Ian Kent
	Add a miscellaneous device to the autofs4 module for routing ioctls. This provides the ability to obtain an ioctl file handle for an autofs mount point that is possibly covered by another mount. The actual problem with autofs is that it can't reconnect to existing mounts. Immediately one things of just adding the ability to remount autofs file systems would solve it, but alas, that can't work. This is because autofs direct mounts and the implementation of "on demand mount and expire" of nested mount trees have the file system mounted on top of the mount trigger dentry. To resolve this a miscellaneous device node for routing ioctl commands to these mount points has been implemented in the autofs4 kernel module and a library added to autofs. This provides the ability to open a file descriptor for these over mounted autofs mount points. Please refer to Documentation/filesystems/autofs4-mount-control.txt for a discussion of the problem, implementation alternatives considered and a description of the interface. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: build fix] Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	autofs4: track uid and gid of last mount requester	Ian Kent
	Track the uid and gid of the last process to request a mount for on an autofs dentry. [akpm@linux-foundation.org: fix tpyo in comment] Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	autofs4: cleanup autofs mount type usage	Ian Kent
	Usage of the AUTOFS_TYPE_* defines is a little confusing and appears inconsistent. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	eCryptfs: remove netlink transport	Tyler Hicks
	The netlink transport code has not worked for a while and the miscdev transport is a simpler solution. This patch removes the netlink code and makes the miscdev transport the only eCryptfs kernel to userspace transport. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Dustin Kirkland <kirkland@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	ecryptfs: convert to use new aops	Badari Pulavarty
	Convert ecryptfs to use write_begin/write_end Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	eCryptfs: remove retry loop in ecryptfs_readdir()	Michael Halcrow
	The retry block in ecryptfs_readdir() has been in the eCryptfs code base for a while, apparently for no good reason. This loop could potentially run without terminating. This patch removes the loop, instead erroring out if vfs_readdir() on the lower file fails. Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Reported-by: Al Viro <viro@ZinIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	Allow recursion in binfmt_script and binfmt_misc	Kirill A. Shutemov
	binfmt_script and binfmt_misc disallow recursion to avoid stack overflow using sh_bang and misc_bang. It causes problem in some cases: $ echo '#!/bin/ls' > /tmp/t0 $ echo '#!/tmp/t0' > /tmp/t1 $ echo '#!/tmp/t1' > /tmp/t2 $ chmod +x /tmp/t* $ /tmp/t2 zsh: exec format error: /tmp/t2 Similar problem with binfmt_misc. This patch introduces field 'recursion_depth' into struct linux_binprm to track recursion level in binfmt_misc and binfmt_script. If recursion level more then BINPRM_MAX_RECURSION it generates -ENOEXEC. [akpm@linux-foundation.org: make linux_binprm.recursion_depth a uint] Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	alpha: introduce field 'taso' into struct linux_binprm	Kirill A. Shutemov
	This change is Alpha-specific. It adds field 'taso' into struct linux_binprm to remember if the application is TASO. Previously, field sh_bang was used for this purpose. Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	binfmt_som.c: add MODULE_LICENSE	Adrian Bunk
	Add the missing MODULE_LICENSE("GPL"). Reported-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	compat: move cp_compat_stat to common code	Christoph Hellwig
	struct stat / compat_stat is the same on all architectures, so cp_compat_stat should be, too. Turns out it is, except that various architectures have slightly and some high2lowuid/high2lowgid or the direct assignment instead of the SET_UID/SET_GID that expands to the correct one anyway. This patch replaces the arch-specific cp_compat_stat implementations with a common one based on the x86-64 one. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: David S. Miller <davem@davemloft.net> [ sparc bits ] Acked-by: Kyle McMartin <kyle@mcmartin.ca> [ parisc bits ] Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	Remove Andrew Morton's old email accounts	Francois Cami
	People can use the real name an an index into MAINTAINERS to find the current email address. Signed-off-by: Francois Cami <francois.cami@free.fr> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	epoll: drop unnecessary test	Davide Libenzi
	Thomas found that there is an unnecessary (always true) test in ep_send_events(). The callback never inserts into ->rdllink while the send loop is performed, and also does the ~EP_PRIVATE_BITS test. Given we're holding the mutex during this time, the conditions tested inside the loop are always true. This patch drops the test done inside the re-insertion loop. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	exec.c, compat.c: fix count(), compat_count() bounds checking	Jason Baron
	With MAX_ARG_STRINGS set to 0x7FFFFFFF, and being passed to 'count()' and compat_count(), it would appear that the current max bounds check of fs/exec.c:394: if(++i > max) return -E2BIG; would never trigger. Since 'i' is of type int, so values would wrap and the function would continue looping. Simple fix seems to be chaning ++i to i++ and checking for '>='. Signed-off-by: Jason Baron <jbaron@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: "Ollie Wild" <aaw@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	uclinux: fix gzip header parsing in binfmt_flat.c	Volodymyr G. Lukiianyk
	There are off-by-one errors in decompress_exec() when calculating the length of optional "original file name" and "comment" fields: the "ret" index is not incremented when terminating '\0' character is reached. The check of the buffer overflow (after an "extra-field" length was taken into account) is also fixed. I've encountered this off-by-one error when tried to reuse gzip-header-parsing part of the decompress_exec() function. There was an "original file name" field in the payload (with miscalculated length) and zlib_inflate() returned Z_DATA_ERROR. But after the fix similar to this one all worked fine. Signed-off-by: Volodymyr G Lukiianyk <volodymyrgl@gmail.com> Acked-by: Greg Ungerer <gerg@snapgear.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16	kobject: Cleanup kobject_rename and !CONFIG_SYSFS	Eric W. Biederman
	It finally dawned on me what the clean fix to sysfs_rename_dir calling kobject_set_name is. Move the work into kobject_rename where it belongs. The callers serialize us anyway so this is safe. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-10-16	sysfs: Make dir and name args to sysfs_notify() const	Trent Piepho
	Because they can be, and because code like this produces a warning if they're not: struct device_attribute dev_attr; sysfs_notify(&kobj, NULL, dev_attr.attr.name); Signed-off-by: Trent Piepho <tpiepho@freescale.com> CC: Neil Brown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-10-16	sysfs: use ilookup5() instead of ilookup5_nowait()	Tejun Heo
	As inode creation is protected by sysfs_mutex, ilookup5_nowait() always either fails to find at all or finds one which is fully initialized, so using ilookup5_nowait() or ilookup5() doesn't make any difference. Switch to ilookup5() as it's planned to be removed. This change also makes lookup return value handling a bit simpler. This change was suggested by Al Viro. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Al Viro <viro@hera.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-10-16	sysfs: fix deadlock	Nick Piggin
	On Thu, Sep 11, 2008 at 10:27:10AM +0200, Ingo Molnar wrote: > and it's working fine on most boxes. One testbox found this new locking > scenario: > > PM: Adding info for No Bus:vcsa7 > EDAC DEBUG: MC0: i82860_check() > > ======================================================= > [ INFO: possible circular locking dependency detected ] > 2.6.27-rc6-tip #1 > ------------------------------------------------------- > X/4873 is trying to acquire lock: > (&bb->mutex){--..}, at: [<c020ba20>] mmap+0x40/0xa0 > > but task is already holding lock: > (&mm->mmap_sem){----}, at: [<c0125a1e>] sys_mmap2+0x8e/0xc0 > > which lock already depends on the new lock. > > > the existing dependency chain (in reverse order) is: > > -> #1 (&mm->mmap_sem){----}: > [<c017dc96>] validate_chain+0xa96/0xf50 > [<c017ef2b>] __lock_acquire+0x2cb/0x5b0 > [<c017f299>] lock_acquire+0x89/0xc0 > [<c01aa8fb>] might_fault+0x6b/0x90 > [<c040b618>] copy_to_user+0x38/0x60 > [<c020bcfb>] read+0xfb/0x170 > [<c01c09a5>] vfs_read+0x95/0x110 > [<c01c1443>] sys_pread64+0x63/0x80 > [<c012146f>] sysenter_do_call+0x12/0x43 > [<ffffffff>] 0xffffffff > > -> #0 (&bb->mutex){--..}: > [<c017d8b7>] validate_chain+0x6b7/0xf50 > [<c017ef2b>] __lock_acquire+0x2cb/0x5b0 > [<c017f299>] lock_acquire+0x89/0xc0 > [<c0d6f2ab>] __mutex_lock_common+0xab/0x3c0 > [<c0d6f698>] mutex_lock_nested+0x38/0x50 > [<c020ba20>] mmap+0x40/0xa0 > [<c01b111e>] mmap_region+0x14e/0x450 > [<c01b170f>] do_mmap_pgoff+0x2ef/0x310 > [<c0125a3d>] sys_mmap2+0xad/0xc0 > [<c012146f>] sysenter_do_call+0x12/0x43 > [<ffffffff>] 0xffffffff > > other info that might help us debug this: > > 1 lock held by X/4873: > #0: (&mm->mmap_sem){----}, at: [<c0125a1e>] sys_mmap2+0x8e/0xc0 > > stack backtrace: > Pid: 4873, comm: X Not tainted 2.6.27-rc6-tip #1 > [<c017cd09>] print_circular_bug_tail+0x79/0xc0 > [<c017d8b7>] validate_chain+0x6b7/0xf50 > [<c017a5b5>] ? trace_hardirqs_off_caller+0x15/0xb0 > [<c017ef2b>] __lock_acquire+0x2cb/0x5b0 > [<c017f299>] lock_acquire+0x89/0xc0 > [<c020ba20>] ? mmap+0x40/0xa0 > [<c0d6f2ab>] __mutex_lock_common+0xab/0x3c0 > [<c020ba20>] ? mmap+0x40/0xa0 > [<c0d6f698>] mutex_lock_nested+0x38/0x50 > [<c020ba20>] ? mmap+0x40/0xa0 > [<c020ba20>] mmap+0x40/0xa0 > [<c01b111e>] mmap_region+0x14e/0x450 > [<c01afb88>] ? arch_get_unmapped_area_topdown+0xf8/0x160 > [<c01b170f>] do_mmap_pgoff+0x2ef/0x310 > [<c0125a3d>] sys_mmap2+0xad/0xc0 > [<c012146f>] sysenter_do_call+0x12/0x43 > [<c0120000>] ? __switch_to+0x130/0x220 > ======================= > evbug.c: Event. Dev: input3, Type: 20, Code: 0, Value: 500 > warning: `sudo' uses deprecated v2 capabilities in a way that may be insecure. > > i've attached the config. > > at first sight it looks like a genuine bug in fs/sysfs/bin.c? Yes, it is a real bug by the looks. bin.c takes bb->mutex under mmap_sem when it is mmapped, and then does its copy_*_user under bb->mutex too. Here is a basic fix for the sysfs lor. From: Nick Piggin <npiggin@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-10-16	sysfs: Support sysfs_notify from atomic context with new sysfs_notify_dirent	Neil Brown
	Support sysfs_notify from atomic context with new sysfs_notify_dirent sysfs_notify currently takes sysfs_mutex. This means that it cannot be called in atomic context. sysfs_mutex is sometimes held over a malloc (sysfs_rename_dir) so it can block on low memory. In md I want to be able to notify on a sysfs attribute from atomic context, and I don't want to block on low memory because I could be in the writeout path for freeing memory. So: - export the "sysfs_dirent" structure along with sysfs_get, sysfs_put and sysfs_get_dirent so I can get the sysfs_dirent that I want to notify on and hold it in an md structure. - split sysfs_notify_dirent out of sysfs_notify so the sysfs_dirent can be notified on with no blocking (just a spinlock). Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>