aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2009-05-02pagemap: require aligned-length, non-null reads of /proc/pid/pagemapVitaly Mayatskikh
The intention of commit aae8679b0ebcaa92f99c1c3cb0cd651594a43915 ("pagemap: fix bug in add_to_pagemap, require aligned-length reads of /proc/pid/pagemap") was to force reads of /proc/pid/pagemap to be a multiple of 8 bytes, but now it allows to read 0 bytes, which actually puts some data to user's buffer. According to POSIX, if count is zero, read() should return zero and has no other results. Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com> Cc: Thomas Tuttle <ttuttle@google.com> Acked-by: Matt Mackall <mpm@selenic.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-02mm: close page_mkwrite racesNick Piggin
Change page_mkwrite to allow implementations to return with the page locked, and also change it's callers (in page fault paths) to hold the lock until the page is marked dirty. This allows the filesystem to have full control of page dirtying events coming from the VM. Rather than simply hold the page locked over the page_mkwrite call, we call page_mkwrite with the page unlocked and allow callers to return with it locked, so filesystems can avoid LOR conditions with page lock. The problem with the current scheme is this: a filesystem that wants to associate some metadata with a page as long as the page is dirty, will perform this manipulation in its ->page_mkwrite. It currently then must return with the page unlocked and may not hold any other locks (according to existing page_mkwrite convention). In this window, the VM could write out the page, clearing page-dirty. The filesystem has no good way to detect that a dirty pte is about to be attached, so it will happily write out the page, at which point, the filesystem may manipulate the metadata to reflect that the page is no longer dirty. It is not always possible to perform the required metadata manipulation in ->set_page_dirty, because that function cannot block or fail. The filesystem may need to allocate some data structure, for example. And the VM cannot mark the pte dirty before page_mkwrite, because page_mkwrite is allowed to fail, so we must not allow any window where the page could be written to if page_mkwrite does fail. This solution of holding the page locked over the 3 critical operations (page_mkwrite, setting the pte dirty, and finally setting the page dirty) closes out races nicely, preventing page cleaning for writeout being initiated in that window. This provides the filesystem with a strong synchronisation against the VM here. - Sage needs this race closed for ceph filesystem. - Trond for NFS (http://bugzilla.kernel.org/show_bug.cgi?id=12913). - I need it for fsblock. - I suspect other filesystems may need it too (eg. btrfs). - I have converted buffer.c to the new locking. Even simple block allocation under dirty pages might be susceptible to i_size changing under partial page at the end of file (we also have a buffer.c-side problem here, but it cannot be fixed properly without this patch). - Other filesystems (eg. NFS, maybe btrfs) will need to change their page_mkwrite functions themselves. [ This also moves page_mkwrite another step closer to fault, which should eventually allow page_mkwrite to be moved into ->fault, and thus avoiding a filesystem calldown and page lock/unlock cycle in __do_fault. ] [akpm@linux-foundation.org: fix derefs of NULL ->mapping] Cc: Sage Weil <sage@newdream.net> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-02autofs4: fix incorrect return in autofs4_mount_busy()Ian Kent
Fix an obvious incorrect return status in autofs4_mount_busy(). Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-02[CIFS] Remove sparse warningSteve French
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-05-02[CIFS] remove checkpatch warningSteve French
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-05-02[CIFS] Fix final user of old string conversion codeSteve French
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-05-02[CIFS] remove cifs_strfromUCS_leJeff Layton
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-05-02[CIFS] NTLMSSP support moving into new file, old dead code removedSteve French
Remove dead NTLMSSP support from connect.c prior to addition of the new code to replace it. Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-05-01[CIFS] Fix endian conversion of vcnum fieldSteve French
When multiply mounting from the same client to the same server, with different userids, we create a vcnum which should be unique if possible (this is not the same as the smb uid, which is the handle to the security context). We were not endian converting additional (beyond the first which is zero) vcnum properly. CC: Stable <stable@kernel.org> Acked-by: Shirish Pargaonkar <shirishp@us.ibm.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-05-01[CIFS] Remove trailing whitespaceSteve French
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-05-01[CIFS] Remove sparse endian warningsSteve French
Removes two sparse CHECK_ENDIAN warnings from Jeffs earlier patch, and removes the dead readlink code (after noting where in findfirst we will need to add something like that in the future to handle the newly discovered unexpected error on FindFirst of NTFS symlinks. Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-05-01[CIFS] Add remaining ntlmssp flags and standardize field namesSteve French
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-05-01[CIFS] Fix build warningSteve French
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-05-01cifs: fix length handling in cifs_get_name_from_search_bufJeff Layton
The earlier patch to move this code to use the new unicode helpers assumed that the filename strings would be null terminated. That's not always the case. Instead of passing "max_len" to the string converter, pass "min(len, max_len)", which makes it do the right thing while still keeping the parser confined to the response. Also fix up the prototypes of this function and the callers so that max_len is unsigned (like len is). Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-04-30[CIFS] Remove unneeded QuerySymlink call and fix mapping for unmapped statusSteve French
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-04-30ocfs2: Fix a missing credit when deleting from indexed directories.Joel Becker
The ocfs2 directory index updates two blocks when we remove an entry - the dx root and the dx leaf. OCFS2_DELETE_INODE_CREDITS was only accounting for the dx leaf. This shows up when ocfs2_delete_inode() runs out of credits in jbd2_journal_dirty_metadata() at "J_ASSERT_JH(jh, handle->h_buffer_credits > 0);". The test that caught this was running dirop_file_racer from the ocfs2-test suite with a 250-character filename PREFIX. Run on a 512B blocksize, it forces the orphan dir index to grow large enough to trigger. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-04-30[CIFS] rename cifs_strndup to cifs_strndup_from_ucsSteve French
In most cases, cifs_strndup is converting from Unicode (UCS2 / UTF-32) to the configured local code page for the Linux mount (usually UTF8), so Jeff suggested that to make it more clear that cifs_strndup is doing a conversion not just memory allocation and copy, rename the function to including "from_ucs" (ie Unicode) Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-04-30Added loop check when mounting DFS tree.Igor Mammedov
Added loop check when mounting DFS tree. mount will fail with ELOOP if referral walks exceed MAX_NESTED_LINK count. Signed-off-by: Igor Mammedov <niallain@gmail.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-04-30Enable dfs submounts to handle remote referrals.Igor Mammedov
Having remote dfs root support in cifs_mount, we can afford to pass into it UNC that is remote. Signed-off-by: Igor Mammedov <niallain@gmail.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-04-30[CIFS] Remove older session setup implementationSteve French
Two years ago, when the session setup code in cifs was rewritten and moved to fs/cifs/sess.c, we were asked to keep the old code for a release or so (which could be reenabled at runtime) since it was such a large change and because the asn (SPNEGO) and NTLMSSP code was not rewritten and needed to be. This was useful to avoid regressions, but is long overdue to be removed. Now that the Kerberos (asn/spnego) code is working in fs/cifs/sess.c, and the NTLMSSP code moved (NTLMSSP blob setup be rewritten with the next patch in this series) quite a bit of dead code from fs/cifs/connect.c now can be removed. This old code should have been removed last year, but the earlier krb5 patches did not move/remove the NTLMSSP code which we had asked to be done first. Since no one else volunteered, I am doing it now. It is extremely important that we continue to examine the documentation for this area, to make sure our code continues to be uptodate with changes since Windows 2003. Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-04-30cifs: change cifs_get_name_from_search_buf to use new unicode helperJeff Layton
...and remove cifs_convertUCSpath. There are no more callers. Also add a #define for the buffer used in the readdir path so that we don't have so many magic numbers floating around. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-04-30cifs: change CIFSSMBUnixQuerySymLink to use new helpersJeff Layton
Change CIFSSMBUnixQuerySymLink to use the new unicode helper functions. Also change the calling conventions so that the allocation of the target name buffer is done in CIFSSMBUnixQuerySymLink rather than by the caller. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-04-30cifs: fix session setup unicode string saving to use new unicode helpersJeff Layton
...and change decode_unicode_ssetup to be a void function. It never returns an actual error anyway. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-04-30cifs: convert CIFSTCon to use new unicode helper functionsJeff Layton
Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-04-30cifs: rename cifs_strlcpy_to_host and make it use new functionsJeff Layton
Rename cifs_strlcpy_to_host to cifs_strndup since that better describes what this function really does. Then, convert it to use the new string conversion and measurement functions that work in units of bytes rather than wide chars. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-04-30cifs: add new function to get unicode string length in bytesJeff Layton
Working in units of words means we do a lot of unnecessary conversion back and forth. Standardize on bytes instead since that's more useful for allocating buffers and such. Also, remove hostlen_fromUCS since the new function has a similar purpose. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-04-30cifs: add replacement for cifs_strtoUCS_le called cifs_from_ucs2Jeff Layton
Add a replacement function for cifs_strtoUCS_le. cifs_from_ucs2 takes args for the source and destination length so that we can ensure that the function is confined within the intended buffers. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-04-30cifs: move #defines for mapchars into cifs_unicode.hJeff Layton
Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-04-30Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6Steve French
2009-04-30xfs: fix getbmap vs mmap deadlockChristoph Hellwig
xfs_getbmap (or rather the formatters called by it) copy out the getbmap structures under the ilock, which can deadlock against mmap. This has been reported via bugzilla a while ago (#717) and has recently also shown up via lockdep. So allocate a temporary buffer to format the kernel getbmap structures into and then copy them out after dropping the locks. A little problem with this is that we limit the number of extents we can copy out by the maximum allocation size, but I see no real way around that. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-04-30xfs: a couple getbmap cleanupsChristoph Hellwig
- reshuffle various conditionals for data vs attr fork to make the code more readable - do fine-grainded goto-based error handling - exit early from conditionals instead of keeping a long else branch around - allow kmem_alloc to fail Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-04-30xfs: add more checks to superblock validationOlaf Weber
There had been reports where xfs filesystem was randomly corrupted with fsfuzzer, and xfs failed to handle it gracefully. This patch fixes couple of reported problem by providing additional checks in the superblock validation routine. Signed-off-by: Olaf Weber <olaf@sgi.com> Reviewed-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-04-30xfs_file_last_byte() needs to acquire ilockLachlan McIlroy
We had some systems crash with this stack: [<a00000010000cb20>] ia64_leave_kernel+0x0/0x280 [<a00000021291ca00>] xfs_bmbt_get_startoff+0x0/0x20 [xfs] [<a0000002129080b0>] xfs_bmap_last_offset+0x210/0x280 [xfs] [<a00000021295b010>] xfs_file_last_byte+0x70/0x1a0 [xfs] [<a00000021295b200>] xfs_itruncate_start+0xc0/0x1a0 [xfs] [<a0000002129935f0>] xfs_inactive_free_eofblocks+0x290/0x460 [xfs] [<a000000212998fb0>] xfs_release+0x1b0/0x240 [xfs] [<a0000002129ad930>] xfs_file_release+0x70/0xa0 [xfs] [<a000000100162ea0>] __fput+0x1a0/0x420 [<a000000100163160>] fput+0x40/0x60 The problem here is that xfs_file_last_byte() does not acquire the inode lock and can therefore race with another thread that is modifying the extext list. While xfs_bmap_last_offset() is trying to lookup what was the last extent some extents were merged and the extent list shrunk so the index we lookup is now beyond the end of the extent list and potentially in a freed buffer. Signed-off-by: Lachlan McIlroy <lmcilroy@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-04-29ocfs2/trivial: Remove unused variable in ocfs2_rename.Tao Ma
With indexed dir enabled, now we use ocfs2_dir_lookup_result to wrap all the bh used for dir. So remove the 2 unused variables. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-04-28fuse: destroy bdi on errorMiklos Szeredi
Destroy bdi on error in fuse_fill_super(). This was an omission from commit 26c3679101dbccc054dcf370143941844ba70531 "fuse: destroy bdi on umount", which moved the bdi_destroy() call from fuse_conn_put() to fuse_put_super(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org
2009-04-27eCryptfs: Fix min function comparison warningTyler Hicks
This warning shows up on 64 bit builds: fs/ecryptfs/inode.c:693: warning: comparison of distinct pointer types lacks a cast Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2009-04-27ecryptfs: fix printk format warningRandy Dunlap
fs/ecryptfs/inode.c:670: warning: format '%d' expects type 'int', but argument 3 has type 'size_t' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Cc: Dustin Kirkland <kirkland@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2009-04-28bio: fix memcpy corruption in bio_copy_user_iov()FUJITA Tomonori
st driver uses blk_rq_map_user() in order to just build a request out of page frames. In this case, map_data->offset is a non zero value and iov[0].iov_base is NULL. We need to increase nr_pages for that. Cc: stable@kernel.org Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstableLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: look for acls during btrfs_read_locked_inode Btrfs: fix acl caching Btrfs: Fix a bunch of printk() warnings. Btrfs: Fix a trivial warning using max() of u64 vs ULL. Btrfs: remove unused btrfs_bit_radix slab Btrfs: ratelimit IO error printks Btrfs: remove #if 0 code Btrfs: When shrinking, only update disk size on success Btrfs: fix deadlocks and stalls on dead root removal Btrfs: fix fallocate deadlock on inode extent lock Btrfs: kill btrfs_cache_create Btrfs: don't export symbols Btrfs: simplify makefile Btrfs: try to keep a healthy ratio of metadata vs data block groups
2009-04-27Btrfs: look for acls during btrfs_read_locked_inodeChris Mason
This changes btrfs_read_locked_inode() to peek ahead in the btree for acl items. If it is certain a given inode has no acls, it will set the in memory acl fields to null to avoid acl lookups completely. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-04-27Btrfs: fix acl cachingChris Mason
Linus noticed the btrfs code to cache acls wasn't properly caching a NULL acl when the inode didn't have any acls. This meant the common case of no acls resulted in expensive btree searches every time the kernel checked permissions (which is quite often). This is a modified version of Linus' original patch: Properly set initial acl fields to BTRFS_ACL_NOT_CACHED in the inode. This forces an acl lookup when permission checks are done. Fix btrfs_get_acl to avoid lookups and locking when the inode acls fields are set to null. Fix btrfs_get_acl to use the right return value from __btrfs_getxattr when deciding to cache a NULL acl. It was storing a NULL acl when __btrfs_getxattr return -ENOENT, but __btrfs_getxattr was actually returning -ENODATA for this case. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-04-27Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-quota-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-quota-2.6: ext2: missing unlock in ext2_quota_write() quota: remove obsolete comments in fs/quota/Makefile
2009-04-27ext2: missing unlock in ext2_quota_write()Dan Carpenter
The inode->i_mutex should be unlocked. Found by smatch (http://repo.or.cz/w/smatch.git). Compile tested. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2009-04-27quota: remove obsolete comments in fs/quota/MakefileChristoph Hellwig
Get rid of useless comments and the equally useless obj-y initialization. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2009-04-27Btrfs: Fix a bunch of printk() warnings.Joel Becker
Just happened to notice a bunch of %llu vs u64 warnings. Here's a patch to cast them all. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-04-27Btrfs: Fix a trivial warning using max() of u64 vs ULL.Joel Becker
A small warning popped up on ia64 because inode-map.c was comparing a u64 object id with the ULL FIRST_FREE_OBJECTID. My first thought was that all the OBJECTID constants should contain the u64 cast because btrfs code deals entirely in u64s. But then I saw how large that was, and figured I'd just fix the max() call. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-04-27Btrfs: remove unused btrfs_bit_radix slabChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-04-27Btrfs: ratelimit IO error printksChris Mason
Btrfs has printks for various IO errors, including bad checksums and mismatches between what we expect the block headers to contain and what we actually find on the disk. Longer term we need a real reporting mechanism for this, but for now printk is going to have to do. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-04-27Btrfs: remove #if 0 codeChris Mason
Btrfs had some old code sitting around under #if 0, this drops it. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-04-27Btrfs: When shrinking, only update disk size on successChris Ball
Previously, we updated a device's size prior to attempting a shrink operation. This patch moves the device resizing logic to only happen if the shrink completes successfully. In the process, it introduces a new field to btrfs_device -- disk_total_bytes -- to track the on-disk size. Signed-off-by: Chris Ball <cjb@laptop.org> Signed-off-by: Chris Mason <chris.mason@oracle.com>