aboutsummaryrefslogtreecommitdiff
path: root/include/linux/fs.h
AgeCommit message (Collapse)Author
2007-05-02remove "struct subsystem" as it is no longer neededGreg Kroah-Hartman
We need to work on cleaning up the relationship between kobjects, ksets and ktypes. The removal of 'struct subsystem' is the first step of this, especially as it is not really needed at all. Thanks to Kay for fixing the bugs in this patch. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-04-26[PATCH] Turn do_sync_file_range() into do_sync_mapping_range()Mark Fasheh
do_sync_file_range() accepts a file * from which it takes an address_space to sync. Abstract out the bulk of the function into do_sync_mapping_range() which takes the address_space directly. This way callers who want to sync an address_space directly can take advantage of the functionality provided. do_sync_file_range() is preserved as a small wrapper around do_sync_mapping_range(). Ocfs2 in particular would like to use this to initiate a sync of a specific inode range during truncate, where a file * may not be available. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-02-12[PATCH] Mark struct super_operations constJosef 'Jeff' Sipek
This patch is inspired by Arjan's "Patch series to mark struct file_operations and struct inode_operations const". Compile tested with gcc & sparse. Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12[PATCH] mark struct inode_operations const 3Arjan van de Ven
Many struct inode_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12[PATCH] move remove_dquot_ref to dqout.cChristoph Hellwig
Remove_dquot_ref can move to dqout.c instead of beeing in inode.c under #ifdef CONFIG_QUOTA. Also clean the resulting code up a tiny little bit by testing sb->dq_op earlier - it's constant over a filesystems lifetime. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] remove invalidate_inode_pages()Andrew Morton
Convert all calls to invalidate_inode_pages() into open-coded calls to invalidate_mapping_pages(). Leave the invalidate_inode_pages() wrapper in place for now, marked as deprecated. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] Export invalidate_mapping_pages() to modulesAnton Altaparmakov
It makes no sense to me to export invalidate_inode_pages() and not invalidate_mapping_pages() and I actually need invalidate_mapping_pages() because of its range specification ability... akpm: also remove the export of invalidate_inode_pages() by making it an inlined wrapper. Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] avoid one conditional branch in touch_atime()Eric Dumazet
I added IS_NOATIME(inode) macro definition in include/linux/fs.h, true if the inode superblock is marked readonly or noatime. This new macro is then used in touch_atime() instead of separatly testing MS_RDONLY and MS_NOATIME Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-11[PATCH] Revert bd_mount_mutex back to a semaphoreDavid Chinner
Revert bd_mount_mutex back to a semaphore so that xfs_freeze -f /mnt/newtest; xfs_freeze -u /mnt/newtest works safely and doesn't produce lockdep warnings. (XFS unlocks the semaphore from a different task, by design. The mutex code warns about this) Signed-off-by: Dave Chinner <dgc@sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-11[PATCH] NFS: Fix race in nfs_release_page()Trond Myklebust
NFS: Fix race in nfs_release_page() invalidate_inode_pages2() may find the dirty bit has been set on a page owing to the fact that the page may still be mapped after it was locked. Only after the call to unmap_mapping_range() are we sure that the page can no longer be dirtied. In order to fix this, NFS has hooked the releasepage() method and tries to write the page out between the call to unmap_mapping_range() and the call to remove_mapping(). This, however leads to deadlocks in the page reclaim code, where the page may be locked without holding a reference to the inode or dentry. Fix is to add a new address_space_operation, launder_page(), which will attempt to write out a dirty page without releasing the page lock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Also, the bare SetPageDirty() can skew all sort of accounting leading to other nasties. [akpm@osdl.org: cleanup] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13[PATCH] relative atimeValerie Henson
Add "relatime" (relative atime) support. Relative atime only updates the atime if the previous atime is older than the mtime or ctime. Like noatime, but useful for applications like mutt that need to know when a file has been read since it was last modified. A corresponding patch against mount(8) is available at http://userweb.kernel.org/~akpm/mount-relative-atime.txt Signed-off-by: Valerie Henson <val_henson@linux.intel.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Karel Zak <kzak@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-08[PATCH] VFS: change struct file to use struct pathJosef "Jeff" Sipek
This patch changes struct file to use struct path instead of having independent pointers to struct dentry and struct vfsmount, and converts all users of f_{dentry,vfsmnt} in fs/ to use f_path.{dentry,mnt}. Additionally, it adds two #define's to make the transition easier for users of the f_dentry and f_vfsmnt. Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-08[PATCH] remove the old bd_mutex lockdep annotationPeter Zijlstra
Remove the old complex and crufty bd_mutex annotation. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jason Baron <jbaron@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07[PATCH] Save some bytes in struct inodeArnaldo Carvalho de Melo
[acme@newtoy net-2.6.20]$ pahole --cacheline 64 fs/inode.o inode /* /pub/scm/linux/kernel/git/acme/net-2.6.20/include/linux/dcache.h:86 */ struct inode { struct hlist_node i_hash; /* 0 8 */ struct list_head i_list; /* 8 8 */ struct list_head i_sb_list; /* 16 8 */ struct list_head i_dentry; /* 24 8 */ long unsigned int i_ino; /* 32 4 */ atomic_t i_count; /* 36 4 */ umode_t i_mode; /* 40 2 */ /* XXX 2 bytes hole, try to pack */ unsigned int i_nlink; /* 44 4 */ uid_t i_uid; /* 48 4 */ gid_t i_gid; /* 52 4 */ dev_t i_rdev; /* 56 4 */ loff_t i_size; /* 60 8 */ struct timespec i_atime; /* 68 8 */ struct timespec i_mtime; /* 76 8 */ struct timespec i_ctime; /* 84 8 */ unsigned int i_blkbits; /* 92 4 */ long unsigned int i_version; /* 96 4 */ blkcnt_t i_blocks; /* 100 4 */ short unsigned int i_bytes; /* 104 2 */ /* XXX 2 bytes hole, try to pack */ spinlock_t i_lock; /* 108 40 */ struct mutex i_mutex; /* 148 76 */ struct rw_semaphore i_alloc_sem; /* 224 64 */ struct inode_operations * i_op; /* 288 4 */ const struct file_operations * i_fop; /* 292 4 */ struct super_block * i_sb; /* 296 4 */ struct file_lock * i_flock; /* 300 4 */ struct address_space * i_mapping; /* 304 4 */ struct address_space i_data; /* 308 188 */ struct list_head i_devices; /* 496 8 */ union ; /* 504 4 */ int i_cindex; /* 508 4 */ __u32 i_generation; /* 512 4 */ /* ---------- cacheline 8 boundary ---------- */ long unsigned int i_dnotify_mask; /* 516 4 */ struct dnotify_struct * i_dnotify; /* 520 4 */ struct list_head inotify_watches; /* 524 8 */ struct mutex inotify_mutex; /* 532 76 */ long unsigned int i_state; /* 608 4 */ long unsigned int dirtied_when; /* 612 4 */ unsigned int i_flags; /* 616 4 */ atomic_t i_writecount; /* 620 4 */ void * i_security; /* 624 4 */ void * i_private; /* 628 4 */ }; /* size: 632, sum members: 628, holes: 2, sum holes: 4 */ [acme@newtoy net-2.6.20]$ So just moving i_mode to after i_bytes we save 4 bytes by nuking both holes: [acme@newtoy net-2.6.20]$ codiff -V /tmp/inode.o.before fs/inode.o /pub/scm/linux/kernel/git/acme/net-2.6.20/fs/inode.c: struct inode | -4 i_mode; from: umode_t /* 40(0) 2(0) */ to: umode_t /* 102(0) 2(0) */ 1 struct changed [acme@newtoy net-2.6.20]$ I've prunned all the other offset changes, only this one is of interest here. So now we have: [acme@newtoy net-2.6.20]$ pahole --cacheline 64 ../OUTPUT/qemu/net-2.6.20/fs/inode.o inode /* /pub/scm/linux/kernel/git/acme/net-2.6.20/include/linux/dcache.h:86 */ struct inode { struct hlist_node i_hash; /* 0 8 */ struct list_head i_list; /* 8 8 */ struct list_head i_sb_list; /* 16 8 */ struct list_head i_dentry; /* 24 8 */ long unsigned int i_ino; /* 32 4 */ atomic_t i_count; /* 36 4 */ unsigned int i_nlink; /* 40 4 */ uid_t i_uid; /* 44 4 */ gid_t i_gid; /* 48 4 */ dev_t i_rdev; /* 52 4 */ loff_t i_size; /* 56 8 */ /* ---------- cacheline 1 boundary ---------- */ struct timespec i_atime; /* 64 8 */ struct timespec i_mtime; /* 72 8 */ struct timespec i_ctime; /* 80 8 */ unsigned int i_blkbits; /* 88 4 */ long unsigned int i_version; /* 92 4 */ blkcnt_t i_blocks; /* 96 4 */ short unsigned int i_bytes; /* 100 2 */ umode_t i_mode; /* 102 2 */ spinlock_t i_lock; /* 104 40 */ struct mutex i_mutex; /* 144 76 */ struct rw_semaphore i_alloc_sem; /* 220 64 */ struct inode_operations * i_op; /* 284 4 */ const struct file_operations * i_fop; /* 288 4 */ struct super_block * i_sb; /* 292 4 */ struct file_lock * i_flock; /* 296 4 */ struct address_space * i_mapping; /* 300 4 */ struct address_space i_data; /* 304 188 */ struct list_head i_devices; /* 492 8 */ union ; /* 500 4 */ int i_cindex; /* 504 4 */ __u32 i_generation; /* 508 4 */ /* ---------- cacheline 8 boundary ---------- */ long unsigned int i_dnotify_mask; /* 512 4 */ struct dnotify_struct * i_dnotify; /* 516 4 */ struct list_head inotify_watches; /* 520 8 */ struct mutex inotify_mutex; /* 528 76 */ long unsigned int i_state; /* 604 4 */ long unsigned int dirtied_when; /* 608 4 */ unsigned int i_flags; /* 612 4 */ atomic_t i_writecount; /* 616 4 */ void * i_security; /* 620 4 */ void * i_private; /* 624 4 */ }; /* size: 628 */ [acme@newtoy net-2.6.20]$ Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07[PATCH] fs: reorder some 'struct inode' fields to speedup i_size manipulationsEric Dumazet
On 32bits SMP platforms, 64bits i_size is protected by a seqcount (i_size_seqcount). When i_size is read or written, i_size_seqcount is read/written as well, so it make sense to group these two fields together in the same cache line. This patch moves i_size_seqcount next to i_size, and also moves i_version to let offsetof(struct inode, i_size) being 0x40 instead of 0x3c (for 32bits platforms). For 64 bits platforms, i_size_seqcount doesnt exist, and the move of a 'long i_version' should not introduce a new hole because of padding. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07[PATCH] constify inode accessorsJan Engelhardt
Change the signature of i_size_read(), IMINOR() and IMAJOR() because they, or the functions they call, will never modify the argument. Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07[PATCH] slab: remove SLAB_KERNELChristoph Lameter
SLAB_KERNEL is an alias of GFP_KERNEL. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07[PATCH] Move names_cachep to linux/fs.hChristoph Lameter
The names_cachep is used for getname() and putname(). So lets put it into fs.h near those two definitions. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-04[PATCH] severing fs.h, radix-tree.h -> sched.hAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-10-19[PATCH] Add lockless helpers for remove_suid()Jens Axboe
Right now users have to grab i_mutex before calling remove_suid(), in the unlikely event that a call to ->setattr() may be needed. Split up the function in two parts: - One to check if we need to remove suid - One to actually remove it The first we can call lockless. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2006-10-19[PATCH] Introduce generic_file_splice_write_nolock()Mark Fasheh
This allows file systems to manage their own i_mutex locking while still re-using the generic_file_splice_write() logic. OCFS2 in particular wants this so that it can order cluster locks within i_mutex. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2006-10-19[PATCH] Take i_mutex in splice_from_pipe()Mark Fasheh
The splice_actor may be calling ->prepare_write() and ->commit_write(). We want i_mutex on the inode being written to before calling those so that we don't race i_size changes. The double locking behavior is done elsewhere in splice.c, and if we eventually want _nolock variants of generic_file_splice_write(), fs modules might have to replicate the nasty locking code. We introduce inode_double_lock() and inode_double_unlock() to consolidate the locking rules into one set of functions. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2006-10-17[PATCH] document i_size_write locking rulesMiklos Szeredi
Unless someone reads the documentation for write_seqcount_{begin,end} it is not obvious, that i_size_write() needs locking. Especially, that lack of such locking can result in a system hang. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6: (292 commits) [GFS2] Fix endian bug for de_type [GFS2] Initialize SELinux extended attributes at inode creation time. [GFS2] Move logging code into log.c (mostly) [GFS2] Mark nlink cleared so VFS sees it happen [GFS2] Two redundant casts removed [GFS2] Remove uneeded endian conversion [GFS2] Remove duplicate sb reading code [GFS2] Mark metadata reads for blktrace [GFS2] Remove iflags.h, use FS_ [GFS2] Fix code style/indent in ops_file.c [GFS2] streamline-generic_file_-interfaces-and-filemap gfs fix [GFS2] Remove readv/writev methods and use aio_read/aio_write instead (gfs bits) [GFS2] inode-diet: Eliminate i_blksize from the inode structure [GFS2] inode_diet: Replace inode.u.generic_ip with inode.i_private (gfs) [GFS2] Fix typo in last patch [GFS2] Fix direct i/o logic in filemap.c [GFS2] Fix bug in Makefiles for lock modules [GFS2] Remove (extra) fs_subsys declaration [GFS2/DLM] Fix trailing whitespace [GFS2] Tidy up meta_io code ...
2006-10-03[PATCH] dm: export blkdev_driver_ioctlAlasdair G Kergon
Export blkdev_driver_ioctl for device-mapper. If we get as far as the device-mapper ioctl handler, we know the ioctl is not a standard block layer BLK* one, so we don't need to check for them a second time and can call blkdev_driver_ioctl() directly. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03[PATCH] VFS: Make filldir_t and struct kstat deal in 64-bit inode numbersDavid Howells
These patches make the kernel pass 64-bit inode numbers internally when communicating to userspace, even on a 32-bit system. They are required because some filesystems have intrinsic 64-bit inode numbers: NFS3+ and XFS for example. The 64-bit inode numbers are then propagated to userspace automatically where the arch supports it. Problems have been seen with userspace (eg: ld.so) using the 64-bit inode number returned by stat64() or getdents64() to differentiate files, and failing because the 64-bit inode number space was compressed to 32-bits, and so overlaps occur. This patch: Make filldir_t take a 64-bit inode number and struct kstat carry a 64-bit inode number so that 64-bit inode numbers can be passed back to userspace. The stat functions then returns the full 64-bit inode number where available and where possible. If it is not possible to represent the inode number supplied by the filesystem in the field provided by userspace, then error EOVERFLOW will be issued. Similarly, the getdents/readdir functions now pass the full 64-bit inode number to userspace where possible, returning EOVERFLOW instead when a directory entry is encountered that can't be properly represented. Note that this means that some inodes will not be stat'able on a 32-bit system with old libraries where they were before - but it does mean that there will be no ambiguity over what a 32-bit inode number refers to. Note similarly that directory scans may be cut short with an error on a 32-bit system with old libraries where the scan would work before for the same reasons. It is judged unlikely that this situation will occur because modern glibc uses 64-bit capable versions of stat and getdents class functions exclusively, and that older systems are unlikely to encounter unrepresentable inode numbers anyway. [akpm: alpha build fix] Signed-off-by: David Howells <dhowells@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[GFS2] Remove iflags.h, use FS_Steven Whitehouse
Update GFS2 in the light of David Howells' patch: [PATCH] BLOCK: Move common FS-specific ioctls to linux/fs.h [try #6] 36695673b012096228ebdc1b39a6a5850daa474e which calls the filesystem independant flags FS_..._FL. As a result we no longer need the flags.h file and the conversion routine is moved into the GFS2 source code. Userland programs which used to include iflags.h should now include fs.h and use the new flag names. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-02[PATCH] file: modify struct fown_struct to use a struct pidEric W. Biederman
File handles can be requested to send sigio and sigurg to processes. By tracking the destination processes using struct pid instead of pid_t we make the interface safe from all potential pid wrap around problems. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] Some cleanup in the pipe codeAndi Kleen
Split the big and hard to read do_pipe function into smaller pieces. This creates new create_write_pipe/free_write_pipe/create_read_pipe functions. These functions are made global so that they can be used by other parts of the kernel. The resulting code is more generic and easier to read and has cleaner error handling and less gotos. [akpm@osdl.org: cleanup] Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] r/o bind mounts: monitor zeroing of i_nlinkDave Hansen
Some filesystems, instead of simply decrementing i_nlink, simply zero it during an unlink operation. We need to catch these in addition to the decrement operations. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] r/o bind mount prepwork: inc_nlink() helperDave Hansen
This is mostly included for parity with dec_nlink(), where we will have some more hooks. This one should stay pretty darn straightforward for now. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] r/o bind mounts: unlink: monitor i_nlinkDave Hansen
When a filesystem decrements i_nlink to zero, it means that a write must be performed in order to drop the inode from the filesystem. We're shortly going to have keep filesystems from being remounted r/o between the time that this i_nlink decrement and that write occurs. So, add a little helper function to do the decrements. We'll tie into it in a bit to note when i_nlink hits zero. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] Add vector AIO supportBadari Pulavarty
This work is initially done by Zach Brown to add support for vectored aio. These are the core changes for AIO to support IOCB_CMD_PREADV/IOCB_CMD_PWRITEV. [akpm@osdl.org: huge build fix] Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: Benjamin LaHaise <bcrl@kvack.org> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] Streamline generic_file_* interfaces and filemap cleanupsBadari Pulavarty
This patch cleans up generic_file_*_read/write() interfaces. Christoph Hellwig gave me the idea for this clean ups. In a nutshell, all filesystems should set .aio_read/.aio_write methods and use do_sync_read/ do_sync_write() as their .read/.write methods. This allows us to cleanup all variants of generic_file_* routines. Final available interfaces: generic_file_aio_read() - read handler generic_file_aio_write() - write handler generic_file_aio_write_nolock() - no lock write handler __generic_file_aio_write_nolock() - internal worker routine Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] Remove readv/writev methods and use aio_read/aio_write insteadBadari Pulavarty
This patch removes readv() and writev() methods and replaces them with aio_read()/aio_write() methods. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] Vectorize aio_read/aio_write fileop methodsBadari Pulavarty
This patch vectorizes aio_read() and aio_write() methods to prepare for collapsing all aio & vectored operations into one interface - which is aio_read()/aio_write(). Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Michael Holzheu <HOLZHEU@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-30[PATCH] BLOCK: Make it possible to disable the block layer [try #6]David Howells
Make it possible to disable the block layer. Not all embedded devices require it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require the block layer to be present. This patch does the following: (*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev support. (*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls an item that uses the block layer. This includes: (*) Block I/O tracing. (*) Disk partition code. (*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS. (*) The SCSI layer. As far as I can tell, even SCSI chardevs use the block layer to do scheduling. Some drivers that use SCSI facilities - such as USB storage - end up disabled indirectly from this. (*) Various block-based device drivers, such as IDE and the old CDROM drivers. (*) MTD blockdev handling and FTL. (*) JFFS - which uses set_bdev_super(), something it could avoid doing by taking a leaf out of JFFS2's book. (*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is, however, still used in places, and so is still available. (*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and parts of linux/fs.h. (*) Makes a number of files in fs/ contingent on CONFIG_BLOCK. (*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK. (*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK is not enabled. (*) fs/no-block.c is created to hold out-of-line stubs and things that are required when CONFIG_BLOCK is not set: (*) Default blockdev file operations (to give error ENODEV on opening). (*) Makes some /proc changes: (*) /proc/devices does not list any blockdevs. (*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK. (*) Makes some compat ioctl handling contingent on CONFIG_BLOCK. (*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if given command other than Q_SYNC or if a special device is specified. (*) In init/do_mounts.c, no reference is made to the blockdev routines if CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2. (*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return error ENOSYS by way of cond_syscall if so). (*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if CONFIG_BLOCK is not set, since they can't then happen. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30[PATCH] BLOCK: Move common FS-specific ioctls to linux/fs.h [try #6]David Howells
Move common FS-specific ioctls from linux/ext2_fs.h to linux/fs.h as FS_IOC_* and FS_IOC32_* and have the users of them use those as a base. Also move the GETFLAGS/SETFLAGS flags to linux/fs.h as FS_*_FL macros, and then have the other users use them as a base. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30[PATCH] BLOCK: Move functions out of buffer code [try #6]David Howells
Move some functions out of the buffering code that aren't strictly buffering specific. This is a precursor to being able to disable the block layer. (*) Moved some stuff out of fs/buffer.c: (*) The file sync and general sync stuff moved to fs/sync.c. (*) The superblock sync stuff moved to fs/super.c. (*) do_invalidatepage() moved to mm/truncate.c. (*) try_to_release_page() moved to mm/filemap.c. (*) Moved some related declarations between header files: (*) declarations for do_invalidatepage() and try_to_release_page() moved to linux/mm.h. (*) __set_page_dirty_buffers() moved to linux/buffer_head.h. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30[PATCH] Allow file systems to differentiate between data and meta readsJens Axboe
We can use this information for making more intelligent priority decisions, and it will also be useful for blktrace. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-09-30[PATCH] Kill various deprecated/unused block layer defines/functionsJens Axboe
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-09-29[PATCH] fs.h: ifdef security fieldsAlexey Dobriyan
[assuming BSD security levels are deleted] The only user of i_security, f_security, s_security fields is SELinux, however, quite a few security modules are trying to get into kernel. So, wrap them under CONFIG_SECURITY. Adding config option for each security field is likely an overkill. Following Stephen Smalley's suggestion, i_security initialization is moved to security_inode_alloc() to not clutter core code with ifdefs and make alloc_inode() codepath tiny little bit smaller and faster. The user of (highly greppable) struct fown_struct::security field is still to be found. I've checked every "fown_struct" and every "f_owner" occurence. Additionally it's removal doesn't break i386 allmodconfig build. struct inode, struct file, struct super_block, struct fown_struct become smaller. P.S. Combined with two reiserfs inode shrinking patches sent to linux-fsdevel, I can finally suck 12 reiserfs inodes into one page. /proc/slabinfo -ext2_inode_cache 388 10 +ext2_inode_cache 384 10 -inode_cache 280 14 +inode_cache 276 14 -proc_inode_cache 296 13 +proc_inode_cache 292 13 -reiser_inode_cache 336 11 +reiser_inode_cache 332 12 <= -shmem_inode_cache 372 10 +shmem_inode_cache 368 10 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29[PATCH] ifdef ->quota_read, ->quota_writeAlexey Dobriyan
All suppliers of ->quota_read, ->quota_write (I've found ext2, ext3, UFS, reiserfs) already have them properly ifdeffed. All callers of ->quota_read, ->quota_write are under CONFIG_QUOTA umbrella, so... Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] inode-diet: Eliminate i_blksize from the inode structureTheodore Ts'o
This eliminates the i_blksize field from struct inode. Filesystems that want to provide a per-inode st_blksize can do so by providing their own getattr routine instead of using the generic_fillattr() function. Note that some filesystems were providing pretty much random (and incorrect) values for i_blksize. [bunk@stusta.de: cleanup] [akpm@osdl.org: generic_fillattr() fix] Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] inode-diet: Move i_cdev into a unionTheodore Ts'o
Move the i_cdev pointer in struct inode into a union. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] inode-diet: Move i_bdev into a unionTheodore Ts'o
Move the i_bdev pointer in struct inode into a union. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] inode-diet: Move i_pipe into a unionTheodore Ts'o
Move the i_pipe pointer into a union that will be shared with i_bdev and i_cdev. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] inode_diet: Replace inode.u.generic_ip with inode.i_privateTheodore Ts'o
The following patches reduce the size of the VFS inode structure by 28 bytes on a UP x86. (It would be more on an x86_64 system). This is a 10% reduction in the inode size on a UP kernel that is configured in a production mode (i.e., with no spinlock or other debugging functions enabled; if you want to save memory taken up by in-core inodes, the first thing you should do is disable the debugging options; they are responsible for a huge amount of bloat in the VFS inode structure). This patch: The filesystem or device-specific pointer in the inode is inside a union, which is pretty pointless given that all 30+ users of this field have been using the void pointer. Get rid of the union and rename it to i_private, with a comment to explain who is allowed to use the void pointer. This is just a cleanup, but it allows us to reuse the union 'u' for something something where the union will actually be used. [judith@osdl.org: powerpc build fix] Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Judith Lebzelter <judith@osdl.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-24[PATCH] Allow file systems to manually d_move() inside of ->rename()Mark Fasheh
Some file systems want to manually d_move() the dentries involved in a rename. We can do this by making use of the FS_ODD_RENAME flag if we just have nfs_rename() unconditionally do the d_move(). While there, we rename the flag to be more descriptive. OCFS2 uses this to protect that part of the rename operation with a cluster lock. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-08-27[PATCH] lockdep: annotate reiserfsIngo Molnar
reiserfs seems to have another locking level layer for the i_mutex due to the xattrs-are-a-directory thing. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>