aboutsummaryrefslogtreecommitdiff
path: root/fs/ocfs2
AgeCommit message (Collapse)Author
2007-10-12Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (75 commits) PM: merge device power-management source files sysfs: add copyrights kobject: update the copyrights kset: add some kerneldoc to help describe what these strange things are Driver core: rename ktype_edd and ktype_efivar Driver core: rename ktype_driver Driver core: rename ktype_device Driver core: rename ktype_class driver core: remove subsystem_init() sysfs: move sysfs file poll implementation to sysfs_open_dirent sysfs: implement sysfs_open_dirent sysfs: move sysfs_dirent->s_children into sysfs_dirent->s_dir sysfs: make sysfs_root a regular directory dirent sysfs: open code sysfs_attach_dentry() sysfs: make s_elem an anonymous union sysfs: make bin attr open get active reference of parent too sysfs: kill unnecessary NULL pointer check in sysfs_release() sysfs: kill unnecessary sysfs_get() in open paths sysfs: reposition sysfs_dirent->s_mode. sysfs: kill sysfs_update_file() ...
2007-10-12Drivers: clean up direct setting of the name of a ksetGreg Kroah-Hartman
A kset should not have its name set directly, so dynamically set the name at runtime. This is needed to remove the static array in the kobject structure which will be changed in a future patch. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12ocfs2: Optionally return filldir errorsMark Fasheh
Modify ocfs2_dir_foreach_blk() to optionally return any error from the filldir callback. This way ocfs2_dirforeach() can terminate early, as opposed to always passing through the entire directory. This fixes a bug introduced during a previous code refactor where ocfs2_empty_dir() would loop infinitely. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-10-12ocfs2: Write support for directories with inline dataMark Fasheh
Create all new directories with OCFS2_INLINE_DATA_FL and the inline data bytes formatted as an empty directory. Inode size field reflects the actual amount of inline data available, which makes searching for dirent space very similar to the regular directory search. Inline-data directories are automatically pushed out to extents on any insert request which is too large for the available space. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
2007-10-12ocfs2: Read support for directories with inline dataMark Fasheh
This splits out extent based directory read support and implements inline-data versions of those functions. All knowledge of inline-data versus extent based directories is internalized. For lookups the code uses ocfs2_find_entry_id(), full dir iterations make use of ocfs2_dir_foreach_blk_id(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
2007-10-12ocfs2: Write support for inline dataMark Fasheh
This fixes up write, truncate, mmap, and RESVSP/UNRESVP to understand inline inode data. For the most part, the changes to the core write code can be relied on to do the heavy lifting. Any code calling ocfs2_write_begin (including shared writeable mmap) can count on it doing the right thing with respect to growing inline data to an extent tree. Size reducing truncates, including UNRESVP can simply zero that portion of the inode block being removed. Size increasing truncatesm, including RESVP have to be a little bit smarter and grow the inode to an extent tree if necessary. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
2007-10-12ocfs2: Read support for inline dataMark Fasheh
This hooks up ocfs2_readpage() to populate a page with data from an inode block. Direct IO reads from inline data are modified to fall back to buffered I/O. Appropriate checks are also placed in the extent map code to avoid reading an extent list when inline data might be stored. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
2007-10-12ocfs2: Structure updates for inline dataMark Fasheh
Add the disk, network and memory structures needed to support data in inode. Struct ocfs2_inline_data is defined and embedded in ocfs2_dinode for storing inline data. A new inode field, i_dyn_features, is added to facilitate tracking of dynamic inode state. Since it will be used often, we want to mirror it on ocfs2_inode_info, and transfer it via the meta data lvb. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
2007-10-12ocfs2: Cleanup dirent size checkMark Fasheh
The check to see if a new dirent would fit in an old one is pretty ugly, and it's done at least twice. Clean things up by putting this in it's own easier-to-read function. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
2007-10-12ocfs2: Rename cleanupsMark Fasheh
ocfs2_rename() does direct manipulation of the dirent it's gotten back from a directory search. Wrap this manipulation inside of a function so that we can transparently change directory update behavior in the future. As an added bonus, this gets rid of an ugly macro. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
2007-10-12ocfs2: Provide convenience function for ino lookupMark Fasheh
A couple paths which needed to just match a parent dir + name pair to an inode number were a bit messy because they had to deal with ocfs2_find_files_on_disk() which returns a larger number of values. Provide a convenience function, ocfs2_lookup_ino_from_name() which internalizes all the extra accounting. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
2007-10-12ocfs2: Implement ocfs2_empty_dir() as a caller of ocfs2_dir_foreach()Mark Fasheh
We can preserve the behavior of ocfs2_empty_dir(), while getting rid of the open coded directory walk by just providing a smart filldir callback. This also automatically gets to use the dir readahead code, though in this case any advantage is minor at best. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
2007-10-12ocfs2: Remove open coded readdir()Mark Fasheh
ocfs2_queue_orphans() has an open coded readdir loop which can easily just use a directory accessor function. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
2007-10-12ocfs2: Pass raw u64 to filldirMark Fasheh
filldir_t can take this, so don't turn de->inode into a 32 bit value. Right now this doesn't make a difference since no ocfs2 inodes overflow that, but it could be a nasty surprise later on if some kernel code is calling ocfs2_dir_foreach_blk() and expecting real inode numbers back... Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
2007-10-12ocfs2: Abstract out core dir listing functionalityMark Fasheh
Put this in it's own function so that the functionality can be overridden. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
2007-10-12ocfs2: Move directory manipulation code into dir.cMark Fasheh
The code for adding, removing, deleting directory entries was splattered all over namei.c. I'd rather have this all centralized, so that it's easier to make changes for inline dir data, and eventually indexed directories. None of the code in any of the functions was changed. I only removed the static keyword from some prototypes so that they could be exported. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
2007-10-12ocfs2: Small refactor of truncate zeroing codeMark Fasheh
We'll want to reuse most of this when pushing inline data back out to an extent. Keeping this part as a seperate patch helps to keep the upcoming changes for write support uncluttered. The core portion of ocfs2_zero_cluster_pages() responsible for making sure a page is mapped and properly dirtied is abstracted out into it's own function, ocfs2_map_and_dirty_page(). Actual functionality doesn't change, though zeroing becomes optional. We also turn part of ocfs2_free_write_ctxt() into a common function for unlocking and freeing a page array. This operation is very common (and uniform) for Ocfs2 cluster sizes greater than page size, so it makes sense to keep the code in one place. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
2007-10-12ocfs2: move nonsparse hole-filling into ocfs2_write_begin()Mark Fasheh
By doing this, we can remove any higher level logic which has to have knowledge of btree functionality - any callers of ocfs2_write_begin() can now expect it to do anything necessary to prepare the inode for new data. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
2007-10-12ocfs2: Sync ocfs2_fs.h with ocfs2-toolsMark Fasheh
ocfs2-tools added some on-disk fields and flags which are used by tunefs.ocfs2. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-10-12[PATCH] fs/ocfs2/: removed unneeded initial value and function's return valueDenis Cheng
Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-10-12ocfs2: Implement show_options()Sunil Mushran
Implement sops->show_options() so as to allow /proc/mounts to show the mount options. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-10-12ocfs2: Clear slot map when umounting a local volumeMark Fasheh
This is technically harmless (recovery will clean it out later), but leaves a bogus entry in the slot_map which really shouldn't be there. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-10-12ocfs2: Remove unused structure fieldMark Fasheh
c_used_tail_recs in struct ocfs2_merge_ctxt is only ever set, so we can remove it. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-10-12ocfs2: remove unused variableTao Mao
delete_tail_recs in ocfs2_try_to_merge_extent() was only ever set, remove it. Signed-off-by: Tao Mao <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-10-12ocfs2: remove mostly unused field from insert structureTao Mao
ocfs2_insert_type->ins_free_records was only used in one place, and was set incorrectly in most places. We can free up some memory and lose some code by removing this. * Small warning fixup contributed by Andrew Mortom <akpm@linux-foundation.org> Signed-off-by: Tao Mao <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-10-12Fix up more bio falloutAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-10Drop 'size' argument from bio_endio and bi_end_ioNeilBrown
As bi_end_io is only called once when the reqeust is complete, the 'size' argument is now redundant. Remove it. Now there is no need for bio_endio to subtract the size completed from bi_size. So don't do that either. While we are at it, change bi_end_io to return void. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-03ocfs2: Unlock mutex in local alloc failure caseSunil Mushran
The fs was not unlocking the local alloc inode mutex in the code path in which it failed to find a window of free bits in the global bitmap. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-09-20ocfs2: Pack vote message and response structuresSunil Mushran
The ocfs2_vote_msg and ocfs2_response_msg structs needed to be packed to ensure similar sizeofs in 32-bit and 64-bit arches. Without this, we had inadvertantly broken 32/64 bit cross mounts. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-09-20ocfs2: Don't double set write parametersMark Fasheh
The target page offsets were being incorrectly set a second time in ocfs2_prepare_page_for_write(), which was causing problems on a 16k page size kernel. Additionally, ocfs2_write_failure() was incorrectly using those parameters instead of the parameters for the individual page being cleaned up. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-09-20ocfs2: Fix pos/len passed to ocfs2_write_clusterMark Fasheh
This was broken for file systems whose cluster size is greater than page size. Pos needs to be incremented as we loop through the descriptors, and len needs to be capped to the size of a single cluster. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-09-20ocfs2: Allow smaller allocations during large writesMark Fasheh
The ocfs2 write code loops through a page much like the block code, except that ocfs2 allocation units can be any size, including larger than page size. Typically it's equal to or larger than page size - most kernels run 4k pages, the minimum ocfs2 allocation (cluster) size. Some changes introduced during 2.6.23 changed the way writes to pages are handled, and inadvertantly broke support for > 4k page size. Instead of just writing one cluster at a time, we now handle the whole page in one pass. This means that multiple (small) seperate allocations might happen in the same pass. The allocation code howver typically optimizes by getting the maximum which was reserved. This triggered a BUG_ON in the extend code where it'd ask for a single bit (for one part of a > 4k page) and get back more than it asked for. Fix this by providing a variant of the high level allocation function which allows the caller to specify a maximum. The traditional function remains and just calls the new one with a maximum determined from the initial reservation. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-09-11ocfs2: Fix calculation of i_blocks during truncateMark Fasheh
We were setting i_blocks too early - before truncating any allocation. Correct things to set i_blocks after the allocation change. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-09-11[PATCH] ocfs2: Fix a wrong cluster calculation.tao.ma@oracle.com
In ocfs2_alloc_write_write_ctxt, the written clusters length is calculated by the byte length only. This may cause some problems if we start to write at some position in the end of one cluster and last to a second cluster while the "len" is smaller than a cluster size. In that case, we have to write 2 clusters actually. So we have to take the start position into consideration also. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-09-11[PATCH] ocfs2: fix mount option parsingTiger Yang
For some mount option types, ocfs2_parse_options() will try to access sb->s_fs_info to get at the ocfs2 private superblock. Unfortunately, that hasn't been allocated yet and will cause a kernel crash. Fix this by storing options in a struct which can then get pushed into the ocfs2_super once it's been allocated later. If we need more options which store to the ocfs2_super in the future, we can just fields to this struct. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09ocfs2: set non-default s_time_gran during mountMark Fasheh
We need to manually set this to '1' during mount, otherwise inode_setattr() will chop off the nanosecond portion of our timestamps. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09ocfs2: Retry sendpage() if it returns EAGAINSunil Mushran
Instead of treating EAGAIN, returned from sendpage(), as an error, this patch retries the operation. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09ocfs2: Fix rename/extend raceSunil Mushran
If one process is extending a file while another is renaming it, there exists a window when rename could flush the old inode's stale i_size to disk. This patch recognizes the fact that rename is only updating the old inode's ctime, so it ensures only that value is flushed to disk. Signed-off-by: Sunil Mushran <sunil.musran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09[2.6 patch] ocfs2_insert_extent(): remove dead codeAdrian Bunk
This patch removes some now dead code. Spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09ocfs2: Fix max offset calculationsMark Fasheh
ocfs2_max_file_offset() was over-estimating the largest file size for several cases. This wasn't really a problem before, but now that we support sparse files, it needs to be more accurate. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09ocfs2: check ia_size limits in setattrMark Fasheh
We have to manually check the requested truncate size as the check in vmtruncate() comes too late for Ocfs2. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09ocfs2: Fix some casting errors related to file writesMark Fasheh
ocfs2_align_clusters_to_page_index() needs to cast the clusters shift to pgoff_t and ocfs2_file_buffered_write() needs loff_t when calculating destination start for memcpy. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09ocfs2: use s_maxbytes directly in ocfs2_change_file_space()Mark Fasheh
There's no need to recalculate things via ocfs2_max_file_offset() as we've already done that to fill s_maxbytes, so use that instead. We can also un-export ocfs2_max_file_offset() then. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09ocfs2: Restrict inode changes in ocfs2_update_inode_atime()Mark Fasheh
ocfs2_update_inode_atime() calls ocfs2_mark_inode_dirty() to push changes from the struct inode into the ocfs2 disk inode. The problem is, ocfs2_mark_inode_dirty() might change other fields, depending on what happened to the struct inode. Since we don't always have locking to serialize changes to other fields (like i_size, etc), just fix things up to only touch the atime field. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-07-24ocfs2: bad kunmap_atomic()Jens Axboe
kunmap_atomic() takes the virtual address, not the mapped page as argument. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-20fix some conversion overflowsNick Piggin
Fix page index to offset conversion overflows in buffer layer, ecryptfs, and ocfs2. It would be nice to convert the whole tree to page_offset, but for now just fix the bugs. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-20mm: Remove slab destructors from kmem_cache_create().Paul Mundt
Slab destructors were no longer supported after Christoph's c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-07-19Merge branch 'upstream-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: ->fallocate() support
2007-07-19mm: fault feedback #1Nick Piggin
Change ->fault prototype. We now return an int, which contains VM_FAULT_xxx code in the low byte, and FAULT_RET_xxx code in the next byte. FAULT_RET_ code tells the VM whether a page was found, whether it has been locked, and potentially other things. This is not quite the way he wanted it yet, but that's changed in the next patch (which requires changes to arch code). This means we no longer set VM_CAN_INVALIDATE in the vma in order to say that a page is locked which requires filemap_nopage to go away (because we can no longer remain backward compatible without that flag), but we were going to do that anyway. struct fault_data is renamed to struct vm_fault as Linus asked. address is now a void __user * that we should firmly encourage drivers not to use without really good reason. The page is now returned via a page pointer in the vm_fault struct. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19mm: merge populate and nopage into fault (fixes nonlinear)Nick Piggin
Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes the virtual address -> file offset differently from linear mappings. ->populate is a layering violation because the filesystem/pagecache code should need to know anything about the virtual memory mapping. The hitch here is that the ->nopage handler didn't pass down enough information (ie. pgoff). But it is more logical to pass pgoff rather than have the ->nopage function calculate it itself anyway (because that's a similar layering violation). Having the populate handler install the pte itself is likewise a nasty thing to be doing. This patch introduces a new fault handler that replaces ->nopage and ->populate and (later) ->nopfn. Most of the old mechanism is still in place so there is a lot of duplication and nice cleanups that can be removed if everyone switches over. The rationale for doing this in the first place is that nonlinear mappings are subject to the pagefault vs invalidate/truncate race too, and it seemed stupid to duplicate the synchronisation logic rather than just consolidate the two. After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in pagecache. Seems like a fringe functionality anyway. NOPAGE_REFAULT is removed. This should be implemented with ->fault, and no users have hit mainline yet. [akpm@linux-foundation.org: cleanup] [randy.dunlap@oracle.com: doc. fixes for readahead] [akpm@linux-foundation.org: build fix] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>