path: root/fs
Age  Commit message  (Author)
2006-04-11  [PATCH] splice: unlikely() optimizations  (Jens Axboe)
Also corrects a few comments. Patch mainly from Ingo, changes by me. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11  [PATCH] splice: speedups and optimizations  (Jens Axboe)
- Kill the local variables that cache ->nrbufs, they just take up space.
- Only set do_wakeup for a real pipe. This is a big win for direct splicing.
- Kill i_mutex lock around ->f_pos update, regular io paths don't do this either.
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11  [PATCH] pipe.c/fifo.c code cleanups  (Ingo Molnar)
more code cleanups after the macro conversion:
- standardize on 'struct pipe_inode_info *pipe' variable names
- introduce 'pipe' temporaries to reduce mass inode->i_pipe dereferencing
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11  [PATCH] get rid of the PIPE_*() macros  (Ingo Molnar)
get rid of the PIPE_*() macros. Scripted transformation. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11  [PATCH] splice: speedup __generic_file_splice_read  (Jens Axboe)
Using find_get_page() is a lot faster than find_or_create_page(). This gets splice a lot closer to sendfile() for fd -> socket transfers. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11  [PATCH] splice: add direct fd <-> fd splicing support  (Jens Axboe)
It's more efficient for sendfile() emulation. Basically we cache an internal private pipe and just use that as the intermediate area for pages. Direct splicing is not available from sys_splice(), it is only meant to be used for sendfile() emulation. Additional patch from Ingo Molnar to avoid the PIPE_BUFFERS loop at exit for the normal fast path. Signed-off-by: Jens Axboe <axboe@suse.de>
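The commit above keeps the intermediate pipe inside the kernel. As a rough userspace analogue only (not the kernel implementation), the same two-stage movement can be sketched with splice(2) and an ordinary pipe. `copy_via_pipe` and `copy_demo` are hypothetical names, and the sketch assumes a Linux system whose filesystems support splicing to and from regular files.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy up to len bytes from in_fd to out_fd by staging
 * them in a private pipe. Returns bytes copied, or -1 on error.
 */
static ssize_t copy_via_pipe(int in_fd, int out_fd, size_t len)
{
    int p[2];
    ssize_t total = 0;

    if (pipe(p) < 0)
        return -1;

    while (len > 0) {
        /* Stage 1: input file -> pipe (at most one pipe's worth per call). */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, len, SPLICE_F_MOVE);
        if (n < 0) {
            total = -1;
            break;
        }
        if (n == 0)
            break;                      /* end of input */
        len -= (size_t)n;

        /* Stage 2: pipe -> output file, until the staged bytes are gone. */
        while (n > 0) {
            ssize_t m = splice(p[0], NULL, out_fd, NULL, (size_t)n, SPLICE_F_MOVE);
            if (m <= 0) {
                total = -1;
                goto out;
            }
            n -= m;
            total += m;
        }
    }
out:
    close(p[0]);
    close(p[1]);
    return total;
}

/* Round-trip a short string through two scratch files; returns 1 on a match. */
static int copy_demo(void)
{
    static const char msg[] = "hello splice";
    const ssize_t want = (ssize_t)(sizeof msg - 1);
    char buf[sizeof msg] = { 0 };
    FILE *in = tmpfile();
    FILE *out = tmpfile();
    int ok = 0;

    assert(in != NULL && out != NULL);
    if (write(fileno(in), msg, sizeof msg - 1) == want &&
        lseek(fileno(in), 0, SEEK_SET) == 0 &&
        copy_via_pipe(fileno(in), fileno(out), sizeof msg - 1) == want &&
        pread(fileno(out), buf, sizeof msg - 1, 0) == want)
        ok = (memcmp(buf, msg, sizeof msg - 1) == 0);

    fclose(in);
    fclose(out);
    return ok;
}
```

Calling `copy_demo()` from a `main()` exercises both stages. The userspace version pays for two system calls per chunk, which is presumably part of why the commit caches a private pipe in the kernel instead.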
2006-04-11  [XFS] Fix a problem in aligning inode allocations to stripe unit boundaries.  (Nathan Scott)
SGI-PV: 951862 SGI-Modid: xfs-linux-melb:xfs-kern:25726a Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-04-11  [XFS] Fix utime(2) in the case that no times parameter was passed in.  (Nathan Scott)
SGI-PV: 949858 SGI-Modid: xfs-linux-melb:xfs-kern:25717a Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-04-11  [XFS] Fix an inode use-after-free during an unpin.  (David Chinner)
When reclaiming inodes that have been unlinked, we may need to execute transactions during reclaim. By the time the transaction has hit the disk, the linux inode and xfs vnode may already have been freed so we can't reference them safely. Use the known xfs inode state to determine if it is safe to reference the vnode and linux inode during the unpin operation. SGI-PV: 946321 SGI-Modid: xfs-linux-melb:xfs-kern:25687a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-04-11  [XFS] Fix inode reclaim scalability regression.  (David Chinner)
When a filesystem has millions of inodes cached and has sparse cluster population, removing inodes from the cluster hash consumes excessive amounts of CPU time. Reduce the CPU cost by making removal O(1) via use of a doubly linked list for the hash chains. SGI-PV: 951551 SGI-Modid: xfs-linux-melb:xfs-kern:25683a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Nathan Scott <nathans@sgi.com>
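Why a back-pointer makes chain removal O(1) can be illustrated with a small standalone sketch. It is not the XFS data structure: `chain_node`, `chain_head` and the helper names are hypothetical, and the layout (a pointer to the previous link rather than to the previous node) is just one common way to build such a chain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chain node: forward link plus a back-pointer to the link that
 * points at this node (either the head's 'first' or a neighbour's 'next'). */
struct chain_node {
    struct chain_node *next;
    struct chain_node **pprev;
};

struct chain_head {
    struct chain_node *first;
};

/* Insert at the front of the chain: O(1). */
static void chain_add(struct chain_head *h, struct chain_node *n)
{
    assert(h != NULL && n != NULL);
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Unlink a node without walking the chain: O(1). */
static void chain_del(struct chain_node *n)
{
    assert(n != NULL && n->pprev != NULL);
    *n->pprev = n->next;            /* predecessor now skips n */
    if (n->next)
        n->next->pprev = n->pprev;  /* successor points back past n */
    n->next = NULL;
    n->pprev = NULL;
}

/* Walk the chain and count entries (only needed for checking). */
static size_t chain_len(const struct chain_head *h)
{
    size_t len = 0;
    const struct chain_node *n;

    for (n = h->first; n != NULL; n = n->next)
        len++;
    return len;
}
```

With a singly linked chain, removing a node means scanning from the head to find its predecessor, which is the per-removal cost the commit message describes as excessive on sparsely populated clusters.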
2006-04-11  [XFS] Fix a writepage regression where we accidentally stopped honouring nonblock mode with the new IO path code (since 2.6.16).  (Nathan Scott)
SGI-PV: 951662 SGI-Modid: xfs-linux-melb:xfs-kern:25676a Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-04-11  [XFS] Fix superblock validation regression for the zero imaxpct case.  (Nathan Scott)
Thanks to kjamieson for noticing. SGI-PV: 951661 SGI-Modid: xfs-linux-melb:xfs-kern:25675a Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-04-10  Merge branch 'upstream-linus' of git://oss.oracle.com/home/sourcebo/git/ocfs2  (Linus Torvalds)
* 'upstream-linus' of git://oss.oracle.com/home/sourcebo/git/ocfs2:
  [PATCH] CONFIGFS_FS must depend on SYSFS
  [PATCH] Bogus NULL pointer check in fs/configfs/dir.c
  ocfs2: Better I/O error handling in heartbeat
  ocfs2: test and set teardown flag early in user_dlm_destroy_lock()
  ocfs2: Handle the DLM_CANCELGRANT case in user_unlock_ast()
  ocfs2: catch an invalid ast case in dlmfs
  ocfs2: remove an overly aggressive BUG() in dlmfs
  ocfs2: multi node truncate fix
2006-04-10  [PATCH] de_thread: Don't confuse users do_each_thread.  (Eric W. Biederman)
Oleg Nesterov spotted two interesting bugs with the current de_thread code. The simplest is a long-standing double decrement of __get_cpu_var(process_counts) in __unhash_process, caused by two processes exiting when only one was created. The other is that since we no longer detach from the thread_group list it is possible for do_each_thread when run under the tasklist_lock to see the same task_struct twice: once on the task list as a thread_group_leader, and once on the thread list of another thread. The double appearance in do_each_thread can cause a double increment of mm_core_waiters in zap_threads, resulting in problems later on in coredump_wait. To remedy those two problems this patch takes the simple approach of changing the old thread group leader into a child thread. The only routine in release_task that cares is __unhash_process, and it can be trivially seen that we handle cleaning up a thread group leader properly. Since de_thread doesn't change the pid of the exiting leader process and instead shares it with the new leader process, I change thread_group_leader to recognize group leadership based on the group_leader field and not based on pids. This should also be slightly cheaper than the existing thread_group_leader macro. I performed a quick audit and I couldn't see any user of thread_group_leader that cared about the difference. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
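The difference between the two leadership tests can be shown with a reduced model. This is only an illustration under stated assumptions: `struct task` is a hypothetical three-field stand-in for task_struct, and the two helpers paraphrase the pid-based and pointer-based checks described in the message rather than quoting the kernel macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for task_struct. */
struct task {
    int pid;
    int tgid;
    struct task *group_leader;
};

/* Pid-based test: a task leads its group if its pid equals the group id. */
static bool leader_by_pid(const struct task *t)
{
    assert(t != NULL);
    return t->pid == t->tgid;
}

/* Pointer-based test: a task leads its group if it is its own group_leader. */
static bool leader_by_pointer(const struct task *t)
{
    assert(t != NULL);
    return t->group_leader == t;
}
```

When the exiting old leader and the new leader share a pid, the pid comparison reports both as leaders, while the pointer comparison singles out the new one. The pointer test also reads one field instead of two, which may be what the message means by "slightly cheaper".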
2006-04-10  [PATCH] CONFIGFS_FS must depend on SYSFS  (Adrian Bunk)
This patch fixes a compile error with CONFIG_SYSFS=n. Configfs is creating, as a matter of policy, the /sys/kernel/config mountpoint. This means it requires CONFIG_SYSFS. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-10  [PATCH] Bogus NULL pointer check in fs/configfs/dir.c  (Eric Sesterhenn)
We check the "group" pointer after we dereference it. This check is bogus, as it cannot be NULL coming in. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-10  [PATCH] splice: add optional input and output offsets  (Ingo Molnar)
add optional input and output offsets to sys_splice(), for seekable file descriptors:

    asmlinkage long sys_splice(int fd_in, loff_t __user *off_in,
                               int fd_out, loff_t __user *off_out,
                               size_t len, unsigned int flags);

semantics are straightforward: f_pos will be updated with the offset provided by user-space, before the splice transfer is about to begin. Providing a NULL offset pointer means the existing f_pos will be used (and updated in situ). Providing an offset for a pipe results in -ESPIPE. Providing an invalid offset pointer results in -EFAULT.
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jens Axboe <axboe@suse.de>
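From userspace, the offset arguments can be exercised through the glibc `splice()` wrapper. The sketch below is an assumption-laden illustration, not part of the patch: the helper names are hypothetical, and it relies on a Linux system where a `tmpfile()` scratch file can be spliced from. Note that the current splice(2) manual page describes the file offset of `fd_in` as left unchanged when `off_in` is non-NULL, so later kernels appear to differ from the f_pos wording in this commit.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Splice len bytes, starting at file offset off, from fd into a pipe. */
static ssize_t splice_from_offset(int fd, loff_t off, int pipe_wr, size_t len)
{
    return splice(fd, &off, pipe_wr, NULL, len, 0);
}

/* Write "0123456789" to a scratch file, splice 3 bytes from offset 4 into a
 * pipe, and read them into out. Returns the number of bytes read, or -1. */
static ssize_t offset_demo(char out[4])
{
    FILE *f = tmpfile();
    int p[2];
    ssize_t n = -1;
    int rc;

    assert(f != NULL);
    rc = pipe(p);
    assert(rc == 0);

    if (write(fileno(f), "0123456789", 10) == 10 &&
        splice_from_offset(fileno(f), 4, p[1], 3) == 3)
        n = read(p[0], out, 3);

    close(p[0]);
    close(p[1]);
    fclose(f);
    return n;
}

/* Pass an offset for the pipe side and report the errno of the failed call
 * (0 if the call unexpectedly succeeds). */
static int offset_on_pipe_errno(void)
{
    FILE *f = tmpfile();
    int p[2];
    loff_t off = 0;
    int rc, err = 0;

    assert(f != NULL);
    rc = pipe(p);
    assert(rc == 0);
    rc = (int)write(fileno(f), "x", 1);
    assert(rc == 1);
    lseek(fileno(f), 0, SEEK_SET);

    if (splice(fileno(f), NULL, p[1], &off, 1, 0) < 0)
        err = errno;

    close(p[0]);
    close(p[1]);
    fclose(f);
    return err;
}
```

`offset_demo()` should deliver the bytes "456", and `offset_on_pipe_errno()` should report ESPIPE, matching the error listed in the commit message for an offset supplied with a pipe.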
2006-04-10  [PATCH] introduce a "kernel-internal pipe object" abstraction  (Ingo Molnar)
separate out the 'internal pipe object' abstraction, and make it usable to splice. This cleans up and fixes several aspects of the internal splice APIs and the pipe code:
- pipes: the allocation and freeing of pipe_inode_info is now more symmetric and more streamlined with existing kernel practices.
- splice: small micro-optimization: less pointer dereferencing in splice methods
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Update XFS for the ->splice_read/->splice_write changes.
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10  [PATCH] splice: be smarter about calling do_page_cache_readahead()  (Jens Axboe)
We don't want to call into the read-ahead logic unless we are at the start of a page, _or_ we have multiple pages to read. Signed-off-by: Jens Axboe <axboe@suse.de>
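The rule stated above reduces to a small predicate. The sketch below is a guess at its shape, not the code from the patch: `want_readahead` is a hypothetical name, and the 4096-byte page size is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL   /* assumed page size for this sketch */

/*
 * Decide whether a read of len bytes at file position pos should trigger
 * read-ahead: only if it starts on a page boundary or spans several pages.
 */
static bool want_readahead(unsigned long long pos, size_t len)
{
    unsigned long off_in_page;
    size_t nr_pages;

    assert(len > 0);
    off_in_page = (unsigned long)(pos & (SKETCH_PAGE_SIZE - 1));
    nr_pages = (off_in_page + len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;

    return off_in_page == 0 || nr_pages > 1;
}
```

A small read in the middle of a page touches one page that an earlier request has most likely already brought in, so skipping the read-ahead call there avoids work without changing what gets read.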
2006-04-10  [PATCH] splice: optimize the splice buffer mapping  (Jens Axboe)
We don't really need to lock down the pages, just make sure they are uptodate. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10  [PATCH] splice: cleanup __generic_file_splice_read()  (Jens Axboe)
The whole shadow/pages logic got overly complex, and this simpler approach is actually faster in testing. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10  [PATCH] splice: only call wake_up_interruptible() when we really have to  (Jens Axboe)
__wake_up_common() is pretty heavy in the kernel profiles, this brings it down to a more acceptable level. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10  [PATCH] splice: potential !page dereference  (Dave Jones)
We can get to out: with a NULL page, which we probably don't want to be calling page_cache_release() on. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10  [PATCH] splice: mark the io page as accessed  (Jens Axboe)
We should do that, since we do the LRU manipulation ourselves now. Suggested by Nick Piggin. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-07  ocfs2: Better I/O error handling in heartbeat  (Mark Fasheh)
Propagate errors received in o2hb_bio_end_io() back to the heartbeat thread so it can skip re-arming the timer. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-07  ocfs2: test and set teardown flag early in user_dlm_destroy_lock()  (Mark Fasheh)
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-07  ocfs2: Handle the DLM_CANCELGRANT case in user_unlock_ast()  (Mark Fasheh)
Remove the code which attempted to catch it via dlmunlock() return status - this never happens there. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-07  ocfs2: catch an invalid ast case in dlmfs  (Mark Fasheh)
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-07  ocfs2: remove an overly aggressive BUG() in dlmfs  (Mark Fasheh)
Don't BUG() user_dlm_unblock_lock() on the absence of the USER_LOCK_BLOCKED flag - this turns out to be a valid case. Make some of the related BUG() statements print more useful information. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-07  ocfs2: multi node truncate fix  (Mark Fasheh)
Fix ocfs2_truncate_file() so that it forces a truncate_inode_pages() on all interested nodes in all cases of a truncate(), not just allocation change. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-02  Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block  (Linus Torvalds)
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] splice: fix page stealing LRU handling.
  [PATCH] splice: page stealing needs to wait_on_page_writeback()
  [PATCH] splice: export generic_splice_sendpage
  [PATCH] splice: add a SPLICE_F_MORE flag
  [PATCH] splice: add comments documenting more of the code
  [PATCH] splice: improve writeback and clean up page stealing
  [PATCH] splice: fix shadow[] filling logic
2006-04-02  [PATCH] splice: fix page stealing LRU handling.  (Jens Axboe)
Originally from Nick Piggin, just adapted to the newer branch. You can't check PageLRU without holding zone->lru_lock. The page release code can get away with it only because the page refcount is 0 at that point. Also, you can't reliably remove pages from the LRU unless the refcount is 0. Ever. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02  [PATCH] splice: page stealing needs to wait_on_page_writeback()  (Jens Axboe)
Thanks to Andrew for the good explanation of why this is so. akpm writes: If a page is under writeback and we remove it from pagecache, it's still going to get written to disk. But the VFS no longer knows about that page, nor that this page is about to modify disk blocks. So there might be scenarios in which those blocks-which-are-about-to-be-written-to get reused for something else. When writeback completes, it'll scribble on those blocks. This won't happen in ext2/ext3-style filesystems in normal mode because the page has buffers and try_to_release_page() will fail. But ext2 in nobh mode doesn't attach buffers at all - it just sticks the page in a BIO, finds some new blocks, points the BIO at those blocks and lets it rip. While that write IO's in flight, someone could truncate the file. Truncate won't block on the writeout because the page isn't in pagecache any more. So truncate will then free the blocks from the file under the page's feet. Then something else can reallocate those blocks. Then write data to them. Now, the original write completes, corrupting the filesystem. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02  [PATCH] splice: export generic_splice_sendpage  (Jens Axboe)
Forgot that one, thanks Jeff. Also move the other EXPORT_SYMBOL to right below the functions. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02  [PATCH] splice: add a SPLICE_F_MORE flag  (Jens Axboe)
This lets userspace indicate whether more data will be coming in a subsequent splice call. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02  [PATCH] splice: add comments documenting more of the code  (Jens Axboe)
Hopefully this will make Andrew a little more happy. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02  [PATCH] splice: improve writeback and clean up page stealing  (Jens Axboe)
By cleaning up the writeback logic (killing write_one_page() and the manual set_page_dirty()), we can get rid of ->stolen inside the pipe_buffer and just keep it local in pipe_to_file(). This also adds dirty page balancing logic and O_SYNC handling. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02  [PATCH] splice: fix shadow[] filling logic  (Jens Axboe)
Clear the entire range, and don't increment pidx or we keep filling the same position again and again. Thanks to KAMEZAWA Hiroyuki. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02  Merge git://oss.sgi.com:8090/oss/git/xfs-2.6  (Linus Torvalds)
* git://oss.sgi.com:8090/oss/git/xfs-2.6:
  [XFS] Provide XFS support for the splice syscall.
  [XFS] Reenable write barriers by default.
  [XFS] Make project quota enforcement return an error code consistent with
  [XFS] Implement the silent parameter to fill_super, previously ignored.
  [XFS] Cleanup comment to remove reference to obsoleted function
2006-04-02  [PATCH] sysfs: zero terminate sysfs write buffers  (Greg Kroah-Hartman)
No one should be writing a PAGE_SIZE worth of data to a normal sysfs file, so properly terminate the buffer. Thanks to Al Viro for pointing out my stupidity here. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
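One way to read the fix is: never store more than PAGE_SIZE - 1 bytes, so there is always room for the terminator. The userspace sketch below follows that reading and is not the sysfs code itself; `fill_write_buffer_sketch` and the 4096-byte `SKETCH_PAGE_SIZE` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096   /* assumed page size for this sketch */

/*
 * Copy a write of count bytes into a page-sized buffer, clamping it so the
 * buffer can always be terminated. Returns the number of bytes stored.
 */
static size_t fill_write_buffer_sketch(char page[SKETCH_PAGE_SIZE],
                                       const char *src, size_t count)
{
    assert(page != NULL && src != NULL);

    if (count >= SKETCH_PAGE_SIZE)
        count = SKETCH_PAGE_SIZE - 1;   /* leave room for the terminator */

    memcpy(page, src, count);
    page[count] = '\0';                 /* string parsers can rely on this */
    return count;
}
```

Without the clamp, a write of exactly one page would fill the buffer completely and the terminating store would land one byte past its end.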
2006-04-02  Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial  (Linus Torvalds)
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (48 commits)
  Documentation: fix minor kernel-doc warnings
  BUG_ON() Conversion in drivers/net/
  BUG_ON() Conversion in drivers/s390/net/lcs.c
  BUG_ON() Conversion in mm/slab.c
  BUG_ON() Conversion in mm/highmem.c
  BUG_ON() Conversion in kernel/signal.c
  BUG_ON() Conversion in kernel/signal.c
  BUG_ON() Conversion in kernel/ptrace.c
  BUG_ON() Conversion in ipc/shm.c
  BUG_ON() Conversion in fs/freevxfs/
  BUG_ON() Conversion in fs/udf/
  BUG_ON() Conversion in fs/sysv/
  BUG_ON() Conversion in fs/inode.c
  BUG_ON() Conversion in fs/fcntl.c
  BUG_ON() Conversion in fs/dquot.c
  BUG_ON() Conversion in md/raid10.c
  BUG_ON() Conversion in md/raid6main.c
  BUG_ON() Conversion in md/raid5.c
  Fix minor documentation typo
  BFP->BPF in Documentation/networking/tuntap.txt
  ...
2006-04-02  splice: add SPLICE_F_NONBLOCK flag  (Linus Torvalds)
It doesn't make the splice itself necessarily nonblocking (because the actual file descriptors that are spliced from/to may block unless they have the O_NONBLOCK flag set), but it makes the splice pipe operations nonblocking. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
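The pipe-side behaviour can be observed from userspace by splicing out of an empty pipe. This sketch is an illustration under assumptions: `nonblock_empty_pipe_errno` is a hypothetical name, and it uses pipe-to-pipe splicing, which current Linux kernels support but which may not have been available in the kernel this commit targets.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to splice from an empty pipe (whose write end is still open) with
 * SPLICE_F_NONBLOCK. Returns the errno of the failed call, or 0 if the call
 * unexpectedly succeeds.
 */
static int nonblock_empty_pipe_errno(void)
{
    int src[2], dst[2];
    int rc, err = 0;
    ssize_t n;

    rc = pipe(src);
    assert(rc == 0);
    rc = pipe(dst);
    assert(rc == 0);

    /* Without the flag this call would sleep until src gets data. */
    n = splice(src[0], NULL, dst[1], NULL, 4096, SPLICE_F_NONBLOCK);
    if (n < 0)
        err = errno;

    close(src[0]);
    close(src[1]);
    close(dst[0]);
    close(dst[1]);
    return err;
}
```

The splice(2) manual page lists EAGAIN for a call that would block while SPLICE_F_NONBLOCK is set, which is the result this helper is expected to report. As the message notes, the flag covers only the pipe operations; a non-pipe descriptor can still block unless it was opened with O_NONBLOCK.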
2006-04-02  Documentation: fix minor kernel-doc warnings  (Martin Waitz)
This patch updates the comments to match the actual code. Signed-off-by: Martin Waitz <tali@admingilde.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02  BUG_ON() Conversion in fs/freevxfs/  (Eric Sesterhenn)
this changes if() BUG(); constructs to BUG_ON() which is cleaner, contains unlikely() and can be better optimized away. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
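The shape of this conversion can be shown with userspace stand-ins. The macros below are simplified assumptions rather than the kernel definitions: `BUG()` is mapped to `abort()`, `unlikely()` uses the GCC/Clang `__builtin_expect` builtin, and the `checked_div_*` functions are hypothetical examples.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel macros (assumptions). */
#define unlikely(x)   __builtin_expect(!!(x), 0)
#define BUG()         abort()
#define BUG_ON(cond)  do { if (unlikely(cond)) BUG(); } while (0)

/* Before: open-coded test, no branch hint. */
static int checked_div_old(int a, int b)
{
    if (b == 0)
        BUG();
    return a / b;
}

/* After: one line, and the failure path is marked unlikely. */
static int checked_div_new(int a, int b)
{
    BUG_ON(b == 0);
    return a / b;
}

/* Both spellings behave the same on the non-failing path. */
static void check_equivalence(void)
{
    assert(checked_div_old(6, 3) == checked_div_new(6, 3));
}
```

Because the whole check lives inside one macro, a build that wants no checks can redefine `BUG_ON` in a single place, which is presumably what "optimized away" refers to.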
2006-04-02  BUG_ON() Conversion in fs/udf/  (Eric Sesterhenn)
this changes if() BUG(); constructs to BUG_ON() which is cleaner, contains unlikely() and can be better optimized away. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02  BUG_ON() Conversion in fs/sysv/  (Eric Sesterhenn)
this changes if() BUG(); constructs to BUG_ON() which is cleaner, contains unlikely() and can be better optimized away. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02  BUG_ON() Conversion in fs/inode.c  (Eric Sesterhenn)
this changes if() BUG(); constructs to BUG_ON() which is cleaner and can be better optimized away. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02  BUG_ON() Conversion in fs/fcntl.c  (Eric Sesterhenn)
this changes if() BUG(); constructs to BUG_ON() which is cleaner and can be better optimized away. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02  BUG_ON() Conversion in fs/dquot.c  (Eric Sesterhenn)
this changes if() BUG(); constructs to BUG_ON() which is cleaner and can be better optimized away. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02  Merge with git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git  (Adrian Bunk)