aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2007-05-08exec: fix remove_arg_zeroNick Piggin
Petr Tesarik discovered a problem in remove_arg_zero(). He writes: When a script is loaded, load_script() replaces argv[0] with the name of the interpreter and the filename passed to the exec syscall. However, there is no guarantee that the length of the interpreter name plus the length of the filename is greater than the length of the original argv[0]. If the difference happens to cross a page boundary, setup_arg_pages() will call put_dirty_page() [aka install_arg_page()] with an address outside the VMA. Therefore, remove_arg_zero() must free all pages which would be unused after the argument is removed. So, rewrite the remove_arg_zero function without gotos, with a few comments, and with the commonly used explicit index/offset. This fixes the problem and makes it easier to understand as well. [a.p.zijlstra@chello.nl: add comment] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Petr Tesarik <ptesarik@suse.cz> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08reiserfs: correct misspelled "REISERFS_PROC_INFO" to "CONFIG_REISERFS_PROC_INFO"Robert P. J. Day
Correct the misspelling of the preprocessor check of a Kconfig option to refer to CONFIG_REISERFS_PROC_INFO and not just the incorrect REISERFS_PROC_INFO. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08reiserfs: shrink superblock if no xattrsAlexey Dobriyan
This makes in-core superblock fit into one cacheline here. Before: struct dentry * xattr_root; /* 124 4 */ /* --- cacheline 1 boundary (128 bytes) --- */ struct rw_semaphore xattr_dir_sem; /* 128 12 */ int j_errno; /* 140 4 */ }; /* size: 144, cachelines: 2 */ /* sum members: 142, holes: 1, sum holes: 2 */ /* last cacheline: 16 bytes */ After: int j_errno; /* 124 4 */ /* --- cacheline 1 boundary (128 bytes) --- */ }; /* size: 128, cachelines: 1 */ /* sum members: 126, holes: 1, sum holes: 2 */ Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08reiserfs: possible null pointer dereference during resizeDmitriy Monakhov
sb_read may return NULL, let's explicitly check it. If so free new bitmap blocks array, after this we may safely exit as it done above during bitmap allocation. Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08freevxfs: possible null pointer dereference fixDmitriy Monakhov
sb_read may return NULL, so let's explicitly check it. Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08is_power_of_2 in fs/block_dev.cVignesh Babu BM
Replace (n & (n-1)) in the context of power of 2 checks with is_power_of_2 Signed-off-by: vignesh babu <vignesh.babu@wipro.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08is_power_of_2 in fs/hfsVignesh Babu BM
Replace (n & (n-1)) in the context of power of 2 checks with is_power_of_2 Signed-off-by: vignesh babu <vignesh.babu@wipro.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08is_power_of_2 in fatVignesh Babu BM
Replacing (n & (n-1)) in the context of power of 2 checks with is_power_of_2 Signed-off-by: vignesh babu <vignesh.babu@wipro.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08devpts: add fsnotify create eventFlorin Malita
Currently, devpts doesn't generate an fsnotify event upon pts creation because the regular vfs paths aren't involved. Deallocation, on the other hand, correctly generates a nameremove event thanks to the d_delete() invocation in devpts_pty_kill(). This patch adds the missing fsnotify_create() trigger in devpts_pty_new(). Signed-off-by: Florin Malita <fmalita@gmail.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08use use SEEK_MAX to validate user lseek argumentsChris Snook
Add SEEK_MAX and use it to validate lseek arguments from userspace. Signed-off-by: Chris Snook <csnook@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08use symbolic constants in generic lseek codeChris Snook
Convert magic numbers to SEEK_* values from fs.h Signed-off-by: Chris Snook <csnook@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08mm: shrink parent dentries when shrinking slabAndrew Morton
Teach the dentry slab shrinker to aggressively shrink parent dentries when shrinking the dentry cache. This is done to attempt to improve the situation where the dentry slab cache gets a lot of internal fragmentation due to pages containing directory dentries. It is expected that this change will cause some of those dentries to be reaped earlier, and with less scanning. Needs careful testing. Cc: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08fix quadratic behavior of shrink_dcache_parent()Miklos Szeredi
The time shrink_dcache_parent() takes, grows quadratically with the depth of the tree under 'parent'. This starts to get noticable at about 10,000. These kinds of depths don't occur normally, and filesystems which invoke shrink_dcache_parent() via d_invalidate() seem to have other depth dependent timings, so it's not even easy to expose this problem. However with FUSE it's easy to create a deep tree and d_invalidate() will also get called. This can make a syscall hang for a very long time. This is the original discovery of the problem by Russ Cox: http://article.gmane.org/gmane.comp.file-systems.fuse.devel/3826 The following patch fixes the quadratic behavior, by optionally allowing prune_dcache() to prune ancestors of a dentry in one go, instead of doing it one at a time. Common code in dput() and prune_one_dentry() is extracted into a new helper function d_kill(). shrink_dcache_parent() as well as shrink_dcache_sb() are converted to use the ancestry-pruner option. Only for shrink_dcache_memory() is this behavior not desirable, so it keeps using the old algorithm. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Maneesh Soni <maneesh@in.ibm.com> Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Neil Brown <neilb@suse.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08reduce size of task_struct on 64-bit machinesWilliam Cohen
This past week I was playing around with that pahole tool (http://oops.ghostprotocols.net:81/acme/dwarves/) and looking at the size of various struct in the kernel. I was surprised by the size of the task_struct on x86_64, approaching 4K. I looked through the fields in task_struct and found that a number of them were declared as "unsigned long" rather than "unsigned int" despite them appearing okay as 32-bit sized fields. On x86_64 "unsigned long" ends up being 8 bytes in size and forces 8 byte alignment. Is there a reason there a reason they are "unsigned long"? The patch below drops the size of the struct from 3808 bytes (60 64-byte cachelines) to 3760 bytes (59 64-byte cachelines). A couple other fields in the task struct take a signficant amount of space: struct thread_struct thread; 688 struct held_lock held_locks[30]; 1680 CONFIG_LOCKDEP is turned on in the .config [akpm@linux-foundation.org: fix printk warnings] Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08ext2/3/4: fix file date underflow on ext2 3 filesystems on 64 bit systemsMarkus Rechberger
Taken from http://bugzilla.kernel.org/show_bug.cgi?id=5079 signed long ranges from -2.147.483.648 to 2.147.483.647 on x86 32bit 10000011110110100100111110111101 .. -2,082,844,739 10000011110110100100111110111101 .. 2,212,122,557 <- this currently gets stored on the disk but when converting it to a 64bit signed long value it loses its sign and becomes positive. Cc: Andreas Dilger <adilger@dilger.ca> Cc: <linux-ext4@vger.kernel.org> Andreas says: This patch is now treating timestamps with the high bit set as negative times (before Jan 1, 1970). This means we lose 1/2 of the possible range of timestamps (lopping off 68 years before unix timestamp overflow - now only 30 years away :-) to handle the extremely rare case of setting timestamps into the distant past. If we are only interested in fixing the underflow case, we could just limit the values to 0 instead of storing negative values. At worst this will skew the timestamp by a few hours for timezones in the far east (files would still show Jan 1, 1970 in "ls -l" output). That said, it seems 32-bit systems (mine at least) allow files to be set into the past (01/01/1907 works fine) so it seems this patch is bringing the x86_64 behaviour into sync with other kernels. On the plus side, we have a patch that is ready to add nanosecond timestamps to ext3 and as an added bonus adds 2 high bits to the on-disk timestamp so this extends the maximum date to 2242. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Allow access to /proc/$PID/fd after setuid()Alexey Dobriyan
/proc/$PID/fd has r-x------ permissions, so if process does setuid(), it will not be able to access /proc/*/fd/. This breaks fstatat() emulation in glibc. open("foo", O_RDONLY|O_DIRECTORY) = 4 setuid32(65534) = 0 stat64("/proc/self/fd/4/bar", 0xbfafb298) = -1 EACCES (Permission denied) Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: James Morris <jmorris@namei.org> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Acked-By: Kirill Korotaev <dev@openvz.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08block_write_full_page(): report ENOSPCAndrew Morton
block_write_full_page() forgot to propagate ENPSOC into the address_space. Cc: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Factor outstanding I/O error handlingGuillaume Chazarain
Cleanup: setting an outstanding error on a mapping was open coded too many times. Factor it out in mapping_set_error(). Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08uml: hostfs style fixesJeff Dike
hostfs needed some style goodness. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08uml: make hostfs_setattr() support operations on unlinked open filesAlberto Bertogli
This patch allows hostfs_setattr() to work on unlinked open files by calling set_attr() (the userspace part) with the inode's fd. Without this, applications that depend on doing attribute changes to unlinked open files will fail. It works by using the fd versions instead of the path ones (for example fchmod() instead of chmod(), fchown() instead of chown()) when an fd is available. Signed-off-by: Alberto Bertogli <albertito@gmail.com> Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08mm: move common segment checks to separate helper functionDmitriy Monakhov
[akpm@linux-foundation.org: cleanup] Signed-off-by: Monakhov Dmitriy <dmonakhov@openvz.org> Cc: Christoph Hellwig <hch@lst.de> Acked-by: Anton Altaparmakov <aia21@cam.ac.uk> Acked-by: David Chinner <dgc@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08[PATCH] splice: always call into page_cache_readahead()Jens Axboe
Don't try to guess what the read-ahead logic will do, allow it to make its own decisions. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-05-08[PATCH] splice(): fix interaction with readaheadFengguang Wu
Eric Dumazet, thank you for disclosing this bug. Readahead logic somehow fails to populate the page range with data. It can be because 1) the readahead routine is not always called in the following lines of fs/splice.c: if (!loff || nr_pages > 1) page_cache_readahead(mapping, &in->f_ra, in, index, nr_pages); 2) even called, page_cache_readahead() wont guarantee the pages are there. It wont submit readahead I/O for pages already in the radix tree, or when (ra_pages == 0), or after 256 cache hits. In your case, it should be because of the retried reads, which lead to excessive cache hits, and disables readahead at some time. And that _one_ failure of readahead blocks the whole read process. The application receives EAGAIN and retries the read, but __generic_file_splice_read() refuse to make progress: - in the previous invocation, it has allocated a blank page and inserted it into the radix tree, but never has the chance to start I/O for it: the test of SPLICE_F_NONBLOCK goes before that. - in the retried invocation, the readahead code will neither get out of the cache hit mode, nor will it submit I/O for an already existing page. Cc: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-05-08[XFS] Add lockdep support for XFSLachlan McIlroy
SGI-PV: 963965 SGI-Modid: xfs-linux-melb:xfs-kern:28485a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-05-08[XFS] Fix race in xfs_write() b/w dmapi callout and direct I/O checks.Lachlan McIlroy
In xfs_write() the iolock is dropped and reacquired in XFS_SEND_DATA() which means that the file could change from not-cached to cached and we need to redo the direct I/O checks. We should also redo the direct I/O checks when the file size changes regardless if O_APPEND is set or not. SGI-PV: 963483 SGI-Modid: xfs-linux-melb:xfs-kern:28440a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-05-08[XFS] Get rid of redundant "required" in msg.Utako Kusaka
SGI-PV: 963466 SGI-Modid: xfs-linux-melb:xfs-kern:28416a Signed-off-by: Utako Kusaka <utako@tnes.nec.co.jp> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2007-05-08[XFS] Export via a function xfs_buftarg_list for use by kdb/xfsidbg.Tim Shimmin
SGI-PV: 963465 SGI-Modid: xfs-linux-melb:xfs-kern:28414a Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2007-05-08[XFS] Remove unused ilen variable and references.Tim Shimmin
SGI-PV: 907752 SGI-Modid: xfs-linux-melb:xfs-kern:28344a Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
2007-05-08[XFS] Fix to prevent the notorious 'NULL files' problem after a crash.Lachlan McIlroy
The problem that has been addressed is that of synchronising updates of the file size with writes that extend a file. Without the fix the update of a file's size, as a result of a write beyond eof, is independent of when the cached data is flushed to disk. Often the file size update would be written to the filesystem log before the data is flushed to disk. When a system crashes between these two events and the filesystem log is replayed on mount the file's size will be set but since the contents never made it to disk the file is full of holes. If some of the cached data was flushed to disk then it may just be a section of the file at the end that has holes. There are existing fixes to help alleviate this problem, particularly in the case where a file has been truncated, that force cached data to be flushed to disk when the file is closed. If the system crashes while the file(s) are still open then this flushing will never occur. The fix that we have implemented is to introduce a second file size, called the in-memory file size, that represents the current file size as viewed by the user. The existing file size, called the on-disk file size, is the one that get's written to the filesystem log and we only update it when it is safe to do so. When we write to a file beyond eof we only update the in- memory file size in the write operation. Later when the I/O operation, that flushes the cached data to disk completes, an I/O completion routine will update the on-disk file size. The on-disk file size will be updated to the maximum offset of the I/O or to the value of the in-memory file size if the I/O includes eof. SGI-PV: 958522 SGI-Modid: xfs-linux-melb:xfs-kern:28322a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-05-08[XFS] Fix race condition in xfs_write().Lachlan McIlroy
This change addresses a race in xfs_write() where, for direct I/O, the flags need_i_mutex and need_flush are setup before the iolock is acquired. The logic used to setup the flags may change between setting the flags and acquiring the iolock resulting in these flags having incorrect values. For example, if a file is not currently cached then need_i_mutex is set to zero and then if the file is cached before the iolock is acquired we will fail to do the flushinval before the direct write. The flush (and also the call to xfs_zero_eof()) need to be done with the iolock held exclusive so we need to acquire the iolock before checking for cached data (or if the write begins after eof) to prevent this state from changing. For direct I/O I've chosen to always acquire the iolock in shared mode initially and if there is a need to promote it then drop it and reacquire it. There's also some other tidy-ups including removing the O_APPEND offset adjustment since that work is done in generic_write_checks() (and we don't use offset as an input parameter anywhere). SGI-PV: 962170 SGI-Modid: xfs-linux-melb:xfs-kern:28319a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-05-08[XFS] Fix uquota and oquota enforcement problems.Kouta Ooizumi
When uquota and oquota (gquota/pquota) are enabled for accounting both are enforced if ether has enforcement active. Conditions: - Both XFS_UQUOTA_ACCT and XFS_GQUOTA_ACCT are enabled. - Either XFS_UQUOTA_ENFD or XFS_OQUOTA_ENFD is enabled. - The usage without enforce is reached at the soft limit. Problems: 1. "repquota" shows all grace time even if no enforcement. 2. we cannot make a file over a hard limits even if no enforcement. SGI-PV: 962291 SGI-Modid: xfs-linux-melb:xfs-kern:28272a Signed-off-by: Kouta Ooizumi <k-ooizumi@tnes.nec.co.jp> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-05-08[XFS] propogate return codes from flush routinesLachlan McIlroy
This patch handles error return values in fs_flush_pages and fs_flushinval_pages. It changes the prototype of fs_flushinval_pages so we can propogate the errors and handle them at higher layers. I also modified xfs_itruncate_start so that it could propogate the error further. SGI-PV: 961990 SGI-Modid: xfs-linux-melb:xfs-kern:28231a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Stewart Smith <stewart@flamingspork.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-05-08[XFS] Fix quotaon syscall failures for group enforcement requests.Donald Douwsma
xfs_qm_scall_quotaon was incorrectly failing requests to enable group quota enforcement. Fixes logic error in OQUOTA handling. SGI-PV: 961964 SGI-Modid: xfs-linux-melb:xfs-kern:28227a Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-05-08[XFS] Invalidate quotacheck when mounting without a quota type.Donald Douwsma
When quotas are mounted or remounted without a particular quota type the quota accounting for that type becomes invalid. Previously we were ignoring this leading to accounting errors. SGI-PV: 961964 SGI-Modid: xfs-linux-melb:xfs-kern:28225a Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Utako Kusaka <utako@tnes.nec.co.jp> Signed-off-by: Vlad Apostolov <vapo@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-05-08[XFS] reducing the number of random number functions.Joe Perches
Patch provided by Joe Perches SGI-PV: 961696 SGI-Modid: xfs-linux-melb:xfs-kern:28209a Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-05-08[XFS] remove more misc. unused argsEric Sandeen
Patch provided by Eric Sandeen. SGI-PV: 961695 SGI-Modid: xfs-linux-melb:xfs-kern:28205a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-05-08[XFS] the "aendp" arg to xfs_dir2_data_freescan is always NULL, remove it.Eric Sandeen
Patch provided by Eric Sandeen. SGI-PV: 961694 SGI-Modid: xfs-linux-melb:xfs-kern:28204a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-05-08[XFS] The last argument "lsn" of xfs_trans_commit() is always called withEric Sandeen
NULL. Patch provided by Eric Sandeen. SGI-PV: 961693 SGI-Modid: xfs-linux-melb:xfs-kern:28199a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-05-08[JFFS2] Simplify and clean up jffs2_add_tn_to_tree() some more.David Woodhouse
Fixing at least a couple more bugs in the process. Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-05-07Merge branch 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
* 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux: gfs2: nfs lock support for gfs2 lockd: add code to handle deferred lock requests lockd: always preallocate block in nlmsvc_lock() lockd: handle test_lock deferrals lockd: pass cookie in nlmsvc_testlock lockd: handle fl_grant callbacks lockd: save lock state on deferral locks: add fl_grant callback for asynchronous lock return nfsd4: Convert NFSv4 to new lock interface locks: add lock cancel command locks: allow {vfs,posix}_lock_file to return conflicting lock locks: factor out generic/filesystem switch from setlock code locks: factor out generic/filesystem switch from test_lock locks: give posix_test_lock same interface as ->lock locks: make ->lock release private data before returning in GETLK case locks: create posix-to-flock helper functions locks: trivial removal of unnecessary parentheses
2007-05-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmwLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (34 commits) [GFS2] Uncomment sprintf_symbol calling code [DLM] lowcomms style [GFS2] printk warning fixes [GFS2] Patch to fix mmap of stuffed files [GFS2] use lib/parser for parsing mount options [DLM] Lowcomms nodeid range & initialisation fixes [DLM] Fix dlm_lowcoms_stop hang [DLM] fix mode munging [GFS2] lockdump improvements [GFS2] Patch to detect corrupt number of dir entries in leaf and/or inode blocks [GFS2] bz 236008: Kernel gpf doing cat /debugfs/gfs2/xxx (lock dump) [DLM] fs/dlm/ast.c should #include "ast.h" [DLM] Consolidate transport protocols [DLM] Remove redundant assignment [GFS2] Fix bz 234168 (ignoring rgrp flags) [DLM] change lkid format [DLM] interface for purge (2/2) [DLM] add orphan purging code (1/2) [DLM] split create_message function [GFS2] Set drop_count to 0 (off) by default ...
2007-05-07blackfin architectureBryan Wu
This adds support for the Analog Devices Blackfin processor architecture, and currently supports the BF533, BF532, BF531, BF537, BF536, BF534, and BF561 (Dual Core) devices, with a variety of development platforms including those avaliable from Analog Devices (BF533-EZKit, BF533-STAMP, BF537-STAMP, BF561-EZKIT), and Bluetechnix! Tinyboards. The Blackfin architecture was jointly developed by Intel and Analog Devices Inc. (ADI) as the Micro Signal Architecture (MSA) core and introduced it in December of 2000. Since then ADI has put this core into its Blackfin processor family of devices. The Blackfin core has the advantages of a clean, orthogonal,RISC-like microprocessor instruction set. It combines a dual-MAC (Multiply/Accumulate), state-of-the-art signal processing engine and single-instruction, multiple-data (SIMD) multimedia capabilities into a single instruction-set architecture. The Blackfin architecture, including the instruction set, is described by the ADSP-BF53x/BF56x Blackfin Processor Programming Reference http://blackfin.uclinux.org/gf/download/frsrelease/29/2549/Blackfin_PRM.pdf The Blackfin processor is already supported by major releases of gcc, and there are binary and source rpms/tarballs for many architectures at: http://blackfin.uclinux.org/gf/project/toolchain/frs There is complete documentation, including "getting started" guides available at: http://docs.blackfin.uclinux.org/ which provides links to the sources and patches you will need in order to set up a cross-compiling environment for bfin-linux-uclibc This patch, as well as the other patches (toolchain, distribution, uClibc) are actively supported by Analog Devices Inc, at: http://blackfin.uclinux.org/ We have tested this on LTP, and our test plan (including pass/fails) can be found at: http://docs.blackfin.uclinux.org/doku.php?id=testing_the_linux_kernel [m.kozlowski@tuxland.pl: balance parenthesis in blackfin header files] Signed-off-by: Bryan Wu <bryan.wu@analog.com> Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Signed-off-by: Aubrey Li <aubrey.li@analog.com> Signed-off-by: Jie Zhang <jie.zhang@analog.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07hugetlbfs: add NULL check in hugetlb_zero_setup()Akinobu Mita
If hugetlbfs module_init() fails, hugetlbfs_vfsmount is not initialized and shmget() with SHM_HUGETLB flag will cause NULL pointer dereference. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: William Irwin <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07slab allocators: Remove SLAB_DEBUG_INITIAL flagChristoph Lameter
I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by SLAB. I think its purpose was to have a callback after an object has been freed to verify that the state is the constructor state again? The callback is performed before each freeing of an object. I would think that it is much easier to check the object state manually before the free. That also places the check near the code object manipulation of the object. Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was compiled with SLAB debugging on. If there would be code in a constructor handling SLAB_DEBUG_INITIAL then it would have to be conditional on SLAB_DEBUG otherwise it would just be dead code. But there is no such code in the kernel. I think SLUB_DEBUG_INITIAL is too problematic to make real use of, difficult to understand and there are easier ways to accomplish the same effect (i.e. add debug code before kfree). There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be clear in fs inode caches. Remove the pointless checks (they would even be pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors. This is the last slab flag that SLUB did not support. Remove the check for unimplemented flags from SLUB. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07get_unmapped_area handles MAP_FIXED in hugetlbfsBenjamin Herrenschmidt
Generic hugetlb_get_unmapped_area() now handles MAP_FIXED by just calling prepare_hugepage_range() Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: William Irwin <bill.irwin@oracle.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Andi Kleen <ak@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Matthew Wilcox <willy@debian.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Adam Litke <agl@us.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07KMEM_CACHE(): simplify slab cache creationChristoph Lameter
This patch provides a new macro KMEM_CACHE(<struct>, <flags>) to simplify slab creation. KMEM_CACHE creates a slab with the name of the struct, with the size of the struct and with the alignment of the struct. Additional slab flags may be specified if necessary. Example struct test_slab { int a,b,c; struct list_head; } __cacheline_aligned_in_smp; test_slab_cache = KMEM_CACHE(test_slab, SLAB_PANIC) will create a new slab named "test_slab" of the size sizeof(struct test_slab) and aligned to the alignment of test slab. If it fails then we panic. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07mm: optimize acorn partition truncatePeter Zijlstra
invalidate_bdev() is superfluous when truncate_inode_pages() is also called. do call invalidate_bh_lrus() though, to avoid stale pointers. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07mm: optimize kill_bdev()Peter Zijlstra
Remove duplicate work in kill_bdev(). It currently invalidates and then truncates the bdev's mapping. invalidate_mapping_pages() will opportunistically remove pages from the mapping. And truncate_inode_pages() will forcefully remove all pages. The only thing truncate doesn't do is flush the bh lrus. So do that explicitly. This avoids (very unlikely) but possible invalid lookup results if the same bdev is quickly re-issued. It also will prevent extreme kernel latencies which are observed when blockdevs which have a large amount of pagecache are unmounted, by avoiding invalidate_mapping_pages() on that path. invalidate_mapping_pages() has no cond_resched (it can be called under spinlock), whereas truncate_inode_pages() has one. [akpm@linux-foundation.org: restore nrpages==0 optimisation] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07mm: remove destroy_dirty_buffers from invalidate_bdev()Peter Zijlstra
Remove the destroy_dirty_buffers argument from invalidate_bdev(), it hasn't been used in 6 years (so akpm says). find * -name \*.[ch] | xargs grep -l invalidate_bdev | while read file; do quilt add $file; sed -ie 's/invalidate_bdev(\([^,]*\),[^)]*)/invalidate_bdev(\1)/g' $file; done Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07Make page->private usable in compound pagesChristoph Lameter
If we add a new flag so that we can distinguish between the first page and the tail pages then we can avoid to use page->private in the first page. page->private == page for the first page, so there is no real information in there. Freeing up page->private makes the use of compound pages more transparent. They become more usable like real pages. Right now we have to be careful f.e. if we are going beyond PAGE_SIZE allocations in the slab on i386 because we can then no longer use the private field. This is one of the issues that cause us not to support debugging for page size slabs in SLAB. Having page->private available for SLUB would allow more meta information in the page struct. I can probably avoid the 16 bit ints that I have in there right now. Also if page->private is available then a compound page may be equipped with buffer heads. This may free up the way for filesystems to support larger blocks than page size. We add PageTail as an alias of PageReclaim. Compound pages cannot currently be reclaimed. Because of the alias one needs to check PageCompound first. The RFC for the this approach was discussed at http://marc.info/?t=117574302800001&r=1&w=2 [nacc@us.ibm.com: fix hugetlbfs] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>