aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2009-05-13Remove implementation of readpage from the hugetlbfs_aopsMel Gorman
The core VM assumes the page size used by the address_space in inode->i_mapping is PAGE_SIZE but hugetlbfs breaks this assumption by inserting pages into the page cache at offsets the core VM considers unexpected. This would not be a problem except that hugetlbfs also provide a ->readpage implementation. As it exists, the core VM can assume the base page size is being used, allocate pages on behalf of the filesystem, insert them into the page cache and call ->readpage to populate them. These pages are the wrong size and at the wrong offset for hugetlbfs causing confusion. This patch deletes the ->readpage implementation for hugetlbfs on the grounds the core VM should not be allocating and populating pages on behalf of hugetlbfs. There should be no existing users of the ->readpage implementation so it should not cause a regression. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-12Merge branch 'for-2.6.30' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
* 'for-2.6.30' of git://linux-nfs.org/~bfields/linux: nfsd: silence lockdep warning lockd: fix list corruption on lockd restart nfsd4: check for negative dentry before use in nfsv4 readdir nfsd41: slots are freed with session svcrdma: clean up error paths. svcrdma: Fix dma map direction for rdma read targets
2009-05-12epoll: fix size check in epoll_create()Davide Libenzi
Fix a size check WRT the manual pages. This was inadvertently broken by commit 9fe5ad9c8cef9ad5873d8ee55d1cf00d9b607df0 ("flag parameters add-on: remove epoll_create size param"). Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: <Hiroyuki.Mach@gmail.com> Cc: rohit verma <rohit.170309@gmail.com> Cc: Ulrich Drepper <drepper@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-11nfsd: silence lockdep warningJ. Bruce Fields
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-05-11dup2: Fix return value with oldfd == newfd and invalid fdJeff Mahoney
The return value of dup2 when oldfd == newfd and the fd isn't valid is not getting properly sign extended. We end up with 4294967287 instead of -EBADF. I've reproduced this on SLE11 (2.6.27.21), openSUSE Factory (2.6.29-rc5), and Ubuntu 9.04 (2.6.28). This patch uses a signed int for the error value so it is properly extended. Commit 6c5d0512a091480c9f981162227fdb1c9d70e555 introduced this regression. Reported-by: Jiri Dluhos <jdluhos@novell.com> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-10Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (22 commits) Fix the race between capifs remount and node creation Fix races around the access to ->s_options switch ufs directories to ufs_sync_file() Switch open_exec() and sys_uselib() to do_open_filp() Make open_exec() and sys_uselib() use may_open(), instead of duplicating its parts Reduce path_lookup() abuses Make checkpatch.pl shut up on fs/inode.c NULL noise in fs/super.c:kill_bdev_super() romfs: cleanup romfs_fs.h ROMFS: romfs_dev_read() error ignored fs: dcache fix LRU ordering ocfs2: Use nd_set_link(). Fix deadlock in ipathfs ->get_sb() Fix a leak in failure exit in 9p ->get_sb() Convert obvious places to deactivate_locked_super() New helper: deactivate_locked_super() reiserfs: remove privroot hiding in lookup reiserfs: dont associate security.* with xattr files reiserfs: fixup xattr_root caching Always lookup priv_root on reiserfs mount and keep it ...
2009-05-09Fix races around the access to ->s_optionsAl Viro
Put generic_show_options read access to s_options under rcu_read_lock, split save_mount_options() into "we are setting it the first time" (uses in foo_fill_super()) and "we are relacing and freeing the old one", synchronize_rcu() before kfree() in the latter. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09switch ufs directories to ufs_sync_file()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09Switch open_exec() and sys_uselib() to do_open_filp()Al Viro
... and make path_lookup_open() static Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09Make open_exec() and sys_uselib() use may_open(), instead of duplicating its ↵Al Viro
parts Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09Reduce path_lookup() abusesAl Viro
... use kern_path() where possible [folded a fix from rdd] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09Make checkpatch.pl shut up on fs/inode.cManish Katiyar
Code Quality According To Mingo(tm) has been vastly improved, no code has been damaged^Wchanged^Wdamaged. [commit message rewritten -- AV] Signed-off-by: Manish Katiyar <mkatiyar@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09NULL noise in fs/super.c:kill_bdev_super()H Hartley Sweeten
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: Subrata Modak <subrata@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09ROMFS: romfs_dev_read() error ignoredRoel Kluin
romfs_dev_read() may return -EIO, but ret is unsigned, so the errorpath isn't taken. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09fs: dcache fix LRU orderingnpiggin@suse.de
Fix ordering of LRU when moving referenced dentries to the head of the list (they should go to the head of the list in the same order as they were found from the tail, rather than reverse order). Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09ocfs2: Use nd_set_link().Joel Becker
ocfs2 was hand-calling vfs_follow_link(), but there's no point to that. Let's use page_follow_link_light() and nd_set_link(). Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09Fix a leak in failure exit in 9p ->get_sb()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09Convert obvious places to deactivate_locked_super()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09New helper: deactivate_locked_super()Al Viro
Does equivalent of up_write(&s->s_umount); deactivate_super(s); However, it does not does not unlock it until it's all over. As the result, it's safe to use to dispose of new superblock on ->get_sb() failure exits - nobody will see the sucker until it's all over. Equivalent using up_write/deactivate_super is safe for that purpose if superblock is either safe to use or has NULL ->s_root when we unlock. Normally filesystems take the required precautions, but a) we do have bugs in that area in some of them. b) up_write/deactivate_super sequence is extremely common, so the helper makes sense anyway. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09reiserfs: remove privroot hiding in lookupJeff Mahoney
With Al Viro's patch to move privroot lookup to fs mount, there's no need to have special code to hide the privroot in reiserfs_lookup. I've also cleaned up the privroot hiding in reiserfs_readdir_dentry and removed the last user of reiserfs_xattrs(). Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09reiserfs: dont associate security.* with xattr filesJeff Mahoney
The security.* xattrs are ignored for xattr files, so don't create them. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09reiserfs: fixup xattr_root cachingJeff Mahoney
The xattr_root caching was broken from my previous patch set. It wouldn't cause corruption, but could cause decreased performance due to allocating a larger chunk of the journal (~ 27 blocks) than it would actually use. This patch loads the xattr root dentry at xattr initialization and creates it on-demand. Since we're using the cached dentry, there's no point in keeping lookup_or_create_dir around, so that's removed. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09Always lookup priv_root on reiserfs mount and keep itAl Viro
... even if it's a negative dentry. That way we can set ->d_op on root before anyone could race with us. Simplify d_compare(), while we are at it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09reiserfs: Expand i_mutex to enclose lookup_one_lenJeff Mahoney
2.6.30-rc3 introduced some sanity checks in the VFS code to avoid NFS bugs by ensuring that lookup_one_len is always called under i_mutex. This patch expands the i_mutex locking to enclose lookup_one_len. This was always required, but not not enforced in the reiserfs code since it does locking around the xattr interactions with the xattr_sem. This is obvious enough, and it survived an overnight 50 thread ACL test. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09vfs: umount_begin BKL pushdownAlessio Igor Bogani
Push BKL down into ->umount_begin() Signed-off-by: Alessio Igor Bogani <abogani@texware.it> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09GFS2: Fix glock ref counting bugSteven Whitehouse
Depending on the ordering of events as we go around the glock shrinker loop, it is possible to drop the ref count of a glock incorrectly. It doesn't happen very often. This patch corrects the got_ref variable, fixing the problem. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-05-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (32 commits) [CIFS] Fix double list addition in cifs posix open code [CIFS] Allow raw ntlmssp code to be enabled with sec=ntlmssp [CIFS] Fix SMB uid in NTLMSSP authenticate request [CIFS] NTLMSSP reenabled after move from connect.c to sess.c [CIFS] Remove sparse warning [CIFS] remove checkpatch warning [CIFS] Fix final user of old string conversion code [CIFS] remove cifs_strfromUCS_le [CIFS] NTLMSSP support moving into new file, old dead code removed [CIFS] Fix endian conversion of vcnum field [CIFS] Remove trailing whitespace [CIFS] Remove sparse endian warnings [CIFS] Add remaining ntlmssp flags and standardize field names [CIFS] Fix build warning cifs: fix length handling in cifs_get_name_from_search_buf [CIFS] Remove unneeded QuerySymlink call and fix mapping for unmapped status [CIFS] rename cifs_strndup to cifs_strndup_from_ucs Added loop check when mounting DFS tree. Enable dfs submounts to handle remote referrals. [CIFS] Remove older session setup implementation ...
2009-05-08[CIFS] Fix double list addition in cifs posix open codeSteve French
Remove adding open file entry twice to lists in the file Do not fill file info twice in case of posix opens and creates Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-05-06fiemap: fix problem with setting FIEMAP_EXTENT_LASTJosef Bacik
Fix a problem where the generic block based fiemap stuff would not properly set FIEMAP_EXTENT_LAST on the last extent. I've reworked things to keep track if we go past the EOF, and mark the last extent properly. The problem was reported by and tested by Eric Sandeen. Tested-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Josef Bacik <jbacik@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: <xfs-masters@oss.sgi.com> Cc: <linux-btrfs@vger.kernel.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <Joel.Becker@oracle.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-06inotify: use GFP_NOFS in kernel_event() to work around a lockdep false-positiveWu Fengguang
There is what we believe to be a false positive reported by lockdep. inotify_inode_queue_event() => take inotify_mutex => kernel_event() => kmalloc() => SLOB => alloc_pages_node() => page reclaim => slab reclaim => dcache reclaim => inotify_inode_is_dead => take inotify_mutex => deadlock The plan is to fix this via lockdep annotation, but that is proving to be quite involved. The patch flips the allocation over to GFP_NFS to shut the warning up, for the 2.6.30 release. Hopefully we will fix this for real in 2.6.31. I'll queue a patch in -mm to switch it back to GFP_KERNEL so we don't forget. ================================= [ INFO: inconsistent lock state ] 2.6.30-rc2-next-20090417 #203 --------------------------------- inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage. kswapd0/380 [HC0[0]:SC0[0]:HE1:SE1] takes: (&inode->inotify_mutex){+.+.?.}, at: [<ffffffff8112f1b5>] inotify_inode_is_dead+0x35/0xb0 {RECLAIM_FS-ON-W} state was registered at: [<ffffffff81079188>] mark_held_locks+0x68/0x90 [<ffffffff810792a5>] lockdep_trace_alloc+0xf5/0x100 [<ffffffff810f5261>] __kmalloc_node+0x31/0x1e0 [<ffffffff81130652>] kernel_event+0xe2/0x190 [<ffffffff81130826>] inotify_dev_queue_event+0x126/0x230 [<ffffffff8112f096>] inotify_inode_queue_event+0xc6/0x110 [<ffffffff8110444d>] vfs_create+0xcd/0x140 [<ffffffff8110825d>] do_filp_open+0x88d/0xa20 [<ffffffff810f6b68>] do_sys_open+0x98/0x140 [<ffffffff810f6c50>] sys_open+0x20/0x30 [<ffffffff8100c272>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff irq event stamp: 690455 hardirqs last enabled at (690455): [<ffffffff81564fe4>] _spin_unlock_irqrestore+0x44/0x80 hardirqs last disabled at (690454): [<ffffffff81565372>] _spin_lock_irqsave+0x32/0xa0 softirqs last enabled at (690178): [<ffffffff81052282>] __do_softirq+0x202/0x220 softirqs last disabled at (690157): [<ffffffff8100d50c>] call_softirq+0x1c/0x50 other info that might help us debug this: 2 locks held by kswapd0/380: #0: (shrinker_rwsem){++++..}, at: [<ffffffff810d0bd7>] shrink_slab+0x37/0x180 #1: (&type->s_umount_key#17){++++..}, at: [<ffffffff8110cfbf>] shrink_dcache_memory+0x11f/0x1e0 stack backtrace: Pid: 380, comm: kswapd0 Not tainted 2.6.30-rc2-next-20090417 #203 Call Trace: [<ffffffff810789ef>] print_usage_bug+0x19f/0x200 [<ffffffff81018bff>] ? save_stack_trace+0x2f/0x50 [<ffffffff81078f0b>] mark_lock+0x4bb/0x6d0 [<ffffffff810799e0>] ? check_usage_forwards+0x0/0xc0 [<ffffffff8107b142>] __lock_acquire+0xc62/0x1ae0 [<ffffffff810f478c>] ? slob_free+0x10c/0x370 [<ffffffff8107c0a1>] lock_acquire+0xe1/0x120 [<ffffffff8112f1b5>] ? inotify_inode_is_dead+0x35/0xb0 [<ffffffff81562d43>] mutex_lock_nested+0x63/0x420 [<ffffffff8112f1b5>] ? inotify_inode_is_dead+0x35/0xb0 [<ffffffff8112f1b5>] ? inotify_inode_is_dead+0x35/0xb0 [<ffffffff81012fe9>] ? sched_clock+0x9/0x10 [<ffffffff81077165>] ? lock_release_holdtime+0x35/0x1c0 [<ffffffff8112f1b5>] inotify_inode_is_dead+0x35/0xb0 [<ffffffff8110c9dc>] dentry_iput+0xbc/0xe0 [<ffffffff8110cb23>] d_kill+0x33/0x60 [<ffffffff8110ce23>] __shrink_dcache_sb+0x2d3/0x350 [<ffffffff8110cffa>] shrink_dcache_memory+0x15a/0x1e0 [<ffffffff810d0cc5>] shrink_slab+0x125/0x180 [<ffffffff810d1540>] kswapd+0x560/0x7a0 [<ffffffff810ce160>] ? isolate_pages_global+0x0/0x2c0 [<ffffffff81065a30>] ? autoremove_wake_function+0x0/0x40 [<ffffffff8107953d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff810d0fe0>] ? kswapd+0x0/0x7a0 [<ffffffff8106555b>] kthread+0x5b/0xa0 [<ffffffff8100d40a>] child_rip+0xa/0x20 [<ffffffff8100cdd0>] ? restore_args+0x0/0x30 [<ffffffff81065500>] ? kthread+0x0/0xa0 [<ffffffff8100d400>] ? child_rip+0x0/0x20 [eparis@redhat.com: fix audit too] Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matt Mackall <mpm@selenic.com> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Eric Paris <eparis@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-06lockd: fix list corruption on lockd restartJ. Bruce Fields
If lockd is signalled soon enough after restart then locks_start_grace() will try to re-add an entry to a list and trigger a lock corruption warning. Thanks to Wang Chen for the problem report and diagnosis. WARNING: at lib/list_debug.c:26 __list_add+0x27/0x5c() ... list_add corruption. next->prev should be prev (ef8fe958), but was ef8ff128. (next=ef8ff128). ... Pid: 23062, comm: lockd Tainted: G W 2.6.30-rc2 #3 Call Trace: [<c042d5b5>] warn_slowpath+0x71/0xa0 [<c0422a96>] ? update_curr+0x11d/0x125 [<c044b12d>] ? trace_hardirqs_on_caller+0x18/0x150 [<c044b270>] ? trace_hardirqs_on+0xb/0xd [<c051c61a>] ? _raw_spin_lock+0x53/0xfa [<c051c89f>] __list_add+0x27/0x5c [<ef8f6daa>] locks_start_grace+0x22/0x30 [lockd] [<ef8f34da>] set_grace_period+0x39/0x53 [lockd] [<c06b8921>] ? lock_kernel+0x1c/0x28 [<ef8f3558>] lockd+0x64/0x164 [lockd] [<c044b12d>] ? trace_hardirqs_on_caller+0x18/0x150 [<c04227b0>] ? complete+0x34/0x3e [<ef8f34f4>] ? lockd+0x0/0x164 [lockd] [<ef8f34f4>] ? lockd+0x0/0x164 [lockd] [<c043dd42>] kthread+0x45/0x6b [<c043dcfd>] ? kthread+0x0/0x6b [<c0403c23>] kernel_thread_helper+0x7/0x10 Reported-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: stable@kernel.org
2009-05-06nfsd4: check for negative dentry before use in nfsv4 readdirJ. Bruce Fields
After 2f9092e1020246168b1309b35e085ecd7ff9ff72 "Fix i_mutex vs. readdir handling in nfsd" (and 14f7dd63 "Copy XFS readdir hack into nfsd code"), an entry may be removed between the first mutex_unlock and the second mutex_lock. In this case, lookup_one_len() will return a negative dentry. Check for this case to avoid a NULL dereference. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Reviewed-by: J. R. Okajima <hooanon05@yahoo.co.jp> Cc: stable@kernel.org
2009-05-06[CIFS] Allow raw ntlmssp code to be enabled with sec=ntlmsspSteve French
On mount, "sec=ntlmssp" can now be specified to allow "rawntlmssp" security to be enabled during CIFS session establishment/authentication (ntlmssp used to require specifying krb5 which was counterintuitive). Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-05-06[CIFS] Fix SMB uid in NTLMSSP authenticate requestSteve French
We were not setting the SMB uid in NTLMSSP authenticate request which could lead to INVALID_PARAMETER error on 2nd session setup. Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-05-04proc: avoid information leaks to non-privileged processesJake Edge
By using the same test as is used for /proc/pid/maps and /proc/pid/smaps, only allow processes that can ptrace() a given process to see information that might be used to bypass address space layout randomization (ASLR). These include eip, esp, wchan, and start_stack in /proc/pid/stat as well as the non-symbolic output from /proc/pid/wchan. ASLR can be bypassed by sampling eip as shown by the proof-of-concept code at http://code.google.com/p/fuzzyaslr/ As part of a presentation (http://www.cr0.org/paper/to-jt-linux-alsr-leak.pdf) esp and wchan were also noted as possibly usable information leaks as well. The start_stack address also leaks potentially useful information. Cc: Stable Team <stable@kernel.org> Signed-off-by: Jake Edge <jake@lwn.net> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-04[CIFS] NTLMSSP reenabled after move from connect.c to sess.cSteve French
The NTLMSSP code was removed from fs/cifs/connect.c and merged (75% smaller, cleaner) into fs/cifs/sess.c As with the old code it requires that cifs be built with CONFIG_CIFS_EXPERIMENTAL, the /proc/fs/cifs/Experimental flag must be set to 2, and mount must turn on extended security (e.g. with sec=krb5). Although NTLMSSP encapsulated in SPNEGO is not enabled yet, "raw" ntlmssp is common and useful in some cases since it offers more complete security negotiation, and is the default way of negotiating security for many Windows systems. SPNEGO encapsulated NTLMSSP will be able to reuse the same code. Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-05-03nfsd41: slots are freed with sessionAndy Adamson
The session and slots are allocated all in one piece. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-05-02NFS: Close page_mkwrite() racesTrond Myklebust
Follow up to Nick Piggin's patches to ensure that nfs_vm_page_mkwrite returns with the page lock held, and sets the VM_FAULT_LOCKED flag. See http://bugzilla.kernel.org/show_bug.cgi?id=12913 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-02Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds
* 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: fix getbmap vs mmap deadlock xfs: a couple getbmap cleanups xfs: add more checks to superblock validation xfs_file_last_byte() needs to acquire ilock
2009-05-02Merge branch 'upstream-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/configfs * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/configfs: configfs: Fix Trivial Warning in fs/configfs/symlink.c
2009-05-02Merge branch 'upstream-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: ocfs2: Change repository in MAINTAINERS. ocfs2: Fix a missing credit when deleting from indexed directories. ocfs2/trivial: Remove unused variable in ocfs2_rename. ocfs2: Add missing iput() during error handling in ocfs2_dentry_attach_lock() ocfs2: Fix some printk() warnings. ocfs2: Fix 2 warning during ocfs2 make. ocfs2: Reserve 1 more cluster in expanding_inline_dir for indexed dir.
2009-05-02ptrace: s/parent/real_parent/ in binfmt_elf_fdpic.cOleg Nesterov
->real_parent is the parent. ->parent may be the tracer. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Cc: Greg Ungerer <gerg@snapgear.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-02mm: fix Committed_AS underflow on large NR_CPUS environmentKOSAKI Motohiro
The Committed_AS field can underflow in certain situations: > # while true; do cat /proc/meminfo | grep _AS; sleep 1; done | uniq -c > 1 Committed_AS: 18446744073709323392 kB > 11 Committed_AS: 18446744073709455488 kB > 6 Committed_AS: 35136 kB > 5 Committed_AS: 18446744073709454400 kB > 7 Committed_AS: 35904 kB > 3 Committed_AS: 18446744073709453248 kB > 2 Committed_AS: 34752 kB > 9 Committed_AS: 18446744073709453248 kB > 8 Committed_AS: 34752 kB > 3 Committed_AS: 18446744073709320960 kB > 7 Committed_AS: 18446744073709454080 kB > 3 Committed_AS: 18446744073709320960 kB > 5 Committed_AS: 18446744073709454080 kB > 6 Committed_AS: 18446744073709320960 kB Because NR_CPUS can be greater than 1000 and meminfo_proc_show() does not check for underflow. But NR_CPUS proportional isn't good calculation. In general, possibility of lock contention is proportional to the number of online cpus, not theorical maximum cpus (NR_CPUS). The current kernel has generic percpu-counter stuff. using it is right way. it makes code simplify and percpu_counter_read_positive() don't make underflow issue. Reported-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Eric B Munson <ebmunson@us.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: <stable@kernel.org> [All kernel versions] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-02alpha: binfmt_aout fixIvan Kokshaysky
This fixes the problem introduced by commit 3bfacef412 (get rid of special-casing the /sbin/loader on alpha): osf/1 ecoff binary segfaults when binfmt_aout built as module. That happens because aout binary handler gets on the top of the binfmt list due to late registration, and kernel attempts to execute the binary without preparatory work that must be done by binfmt_loader. Fixed by changing the registration order of the default binfmt handlers using list_add_tail() and introducing insert_binfmt() function which places new handler on the top of the binfmt list. This might be generally useful for installing arch-specific frontends for default handlers or just for overriding them. Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Richard Henderson <rth@twiddle.net Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-02pagemap: require aligned-length, non-null reads of /proc/pid/pagemapVitaly Mayatskikh
The intention of commit aae8679b0ebcaa92f99c1c3cb0cd651594a43915 ("pagemap: fix bug in add_to_pagemap, require aligned-length reads of /proc/pid/pagemap") was to force reads of /proc/pid/pagemap to be a multiple of 8 bytes, but now it allows to read 0 bytes, which actually puts some data to user's buffer. According to POSIX, if count is zero, read() should return zero and has no other results. Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com> Cc: Thomas Tuttle <ttuttle@google.com> Acked-by: Matt Mackall <mpm@selenic.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-02mm: close page_mkwrite racesNick Piggin
Change page_mkwrite to allow implementations to return with the page locked, and also change it's callers (in page fault paths) to hold the lock until the page is marked dirty. This allows the filesystem to have full control of page dirtying events coming from the VM. Rather than simply hold the page locked over the page_mkwrite call, we call page_mkwrite with the page unlocked and allow callers to return with it locked, so filesystems can avoid LOR conditions with page lock. The problem with the current scheme is this: a filesystem that wants to associate some metadata with a page as long as the page is dirty, will perform this manipulation in its ->page_mkwrite. It currently then must return with the page unlocked and may not hold any other locks (according to existing page_mkwrite convention). In this window, the VM could write out the page, clearing page-dirty. The filesystem has no good way to detect that a dirty pte is about to be attached, so it will happily write out the page, at which point, the filesystem may manipulate the metadata to reflect that the page is no longer dirty. It is not always possible to perform the required metadata manipulation in ->set_page_dirty, because that function cannot block or fail. The filesystem may need to allocate some data structure, for example. And the VM cannot mark the pte dirty before page_mkwrite, because page_mkwrite is allowed to fail, so we must not allow any window where the page could be written to if page_mkwrite does fail. This solution of holding the page locked over the 3 critical operations (page_mkwrite, setting the pte dirty, and finally setting the page dirty) closes out races nicely, preventing page cleaning for writeout being initiated in that window. This provides the filesystem with a strong synchronisation against the VM here. - Sage needs this race closed for ceph filesystem. - Trond for NFS (http://bugzilla.kernel.org/show_bug.cgi?id=12913). - I need it for fsblock. - I suspect other filesystems may need it too (eg. btrfs). - I have converted buffer.c to the new locking. Even simple block allocation under dirty pages might be susceptible to i_size changing under partial page at the end of file (we also have a buffer.c-side problem here, but it cannot be fixed properly without this patch). - Other filesystems (eg. NFS, maybe btrfs) will need to change their page_mkwrite functions themselves. [ This also moves page_mkwrite another step closer to fault, which should eventually allow page_mkwrite to be moved into ->fault, and thus avoiding a filesystem calldown and page lock/unlock cycle in __do_fault. ] [akpm@linux-foundation.org: fix derefs of NULL ->mapping] Cc: Sage Weil <sage@newdream.net> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-02autofs4: fix incorrect return in autofs4_mount_busy()Ian Kent
Fix an obvious incorrect return status in autofs4_mount_busy(). Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-02[CIFS] Remove sparse warningSteve French
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-05-02[CIFS] remove checkpatch warningSteve French
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-05-02[CIFS] Fix final user of old string conversion codeSteve French
Signed-off-by: Steve French <sfrench@us.ibm.com>