path: root/fs/ocfs2/alloc.c
Age | Commit message | Author
2009-09-22 | ocfs2: Make transaction extend more efficient. | Tao Ma
In ocfs2_extend_rotate_transaction, op_credits is the original credits in the handle and we only want to extend the credits for the rotation, but the old solution always doubles them. That is harmless for minor operations, but for actions like reflink we may rotate the tree many times, which makes the credits increase dramatically. So this patch tries to increase the credits only by the desired amount. Signed-off-by: Tao Ma <tao.ma@oracle.com>
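The difference between the two policies can be shown with a toy model. This is only a sketch of one reading of the message above, not the kernel code: `extend_old`, `extend_new` and `credits_after` are hypothetical names, `have` stands for the credits currently in the handle, and `rotate_credits` for what a single rotation needs.

```c
#include <assert.h>

/* Old policy as described above: the handle is extended by the full
 * target (current credits + rotation credits), so it roughly doubles. */
int extend_old(int have, int rotate_credits)
{
    int target = have + rotate_credits;

    assert(have >= 0 && rotate_credits >= 0);
    return have + target;
}

/* New policy as described above: extend only by the shortfall between
 * the target and what the handle already holds. */
int extend_new(int have, int rotate_credits)
{
    int target = have + rotate_credits;

    assert(have >= 0 && rotate_credits >= 0);
    return have + (target - have);
}

/* Apply one policy for a number of consecutive rotations. */
int credits_after(int (*extend)(int, int), int have,
                  int rotate_credits, int rotations)
{
    int i;

    for (i = 0; i < rotations; i++)
        have = extend(have, rotate_credits);
    return have;
}
```

In this model, starting from 10 credits with 3 credits per rotation, five rotations leave the handle at 413 credits under the old policy and 25 under the new one.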
2009-09-22 | ocfs2: CoW refcount tree improvement. | Tao Ma
During CoW, if the old extent record is refcounted, we allocate some new clusters and do CoW. We can improve on this. If the old extent has refcount=1, it is now used only by this file, so we don't need to allocate new clusters; removing the refcounted flag is enough. We also have to remove it from the refcount tree without deleting the extent itself. Signed-off-by: Tao Ma <tao.ma@oracle.com>
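The decision described above can be sketched as a small function. The enum and function names are hypothetical, chosen for illustration only, and the real code path does more work around each case.

```c
#include <assert.h>

/* Hypothetical outcome names, not kernel identifiers. */
enum cow_action {
    COW_NOT_NEEDED,        /* extent is not refcounted at all */
    COW_CLEAR_FLAG,        /* refcount == 1: sole owner, drop the flag */
    COW_ALLOCATE_AND_COPY  /* refcount > 1: shared, allocate new clusters */
};

/* Pick what a write to this extent has to do before it may proceed. */
enum cow_action choose_cow_action(int refcounted, unsigned int refcount)
{
    if (!refcounted)
        return COW_NOT_NEEDED;

    /* A refcounted extent always has at least one user. */
    assert(refcount >= 1);

    return refcount == 1 ? COW_CLEAR_FLAG : COW_ALLOCATE_AND_COPY;
}
```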
2009-09-22 | ocfs2: Add CoW support. | Tao Ma
This patch adds CoW support for a refcounted record. The whole process is:
1. Calculate how many clusters we need to CoW and where we start. Extents that are not completely encompassed by the write will be broken on 1MB boundaries.
2. Do CoW for the clusters with the help of the page cache.
3. Change the b-tree structure with the newly allocated clusters.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
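Step 1 can be sketched as follows. This is a simplified illustration under stated assumptions, not the patch's actual calculation: the write is assumed to lie inside a single extent, ranges are half-open and counted in clusters, and `struct range` and `cow_range` are hypothetical names.

```c
#include <assert.h>

/* Half-open range [start, end), in clusters. */
struct range {
    unsigned int start;
    unsigned int end;
};

/* Widen a write to 1MB boundaries, clamped to the extent it falls in. */
struct range cow_range(struct range extent, struct range write,
                       unsigned int cluster_size)
{
    unsigned int contig = (1024u * 1024u) / cluster_size; /* clusters per 1MB */
    struct range r;

    assert(cluster_size > 0 && contig > 0);
    assert(write.start < write.end);
    assert(write.start >= extent.start && write.end <= extent.end);

    /* Round the write outwards to the surrounding 1MB boundaries. */
    r.start = write.start - (write.start % contig);
    r.end = ((write.end + contig - 1) / contig) * contig;

    /* Never go beyond the extent itself. */
    if (r.start < extent.start)
        r.start = extent.start;
    if (r.end > extent.end)
        r.end = extent.end;

    return r;
}
```

With 4K clusters, a write to clusters [300, 310) inside an extent covering [100, 1000) would widen to [256, 512) in this model, while a write covering the whole extent returns the extent unchanged.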
2009-09-22 | ocfs2: Decrement refcount when truncating refcounted extents. | Tao Ma
Add 'Decrement refcount for delete' into the normal truncate process. So for a refcounted extent record, decrement its refcount record instead of freeing the clusters. Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 | ocfs2: Add functions for extents refcounted. | Tao Ma
Add function ocfs2_mark_extent_refcounted which can mark an extent refcounted. Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 | ocfs2: Add support of decrementing refcount for delete. | Tao Ma
Given a physical cpos and length, decrement the refcount in the tree. If the refcount for any portion of the extent goes to zero, that portion is queued for freeing. Signed-off-by: Tao Ma <tao.ma@oracle.com>
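The behaviour described above can be modelled per cluster. This is a toy sketch only: the real refcount tree stores ranges in records rather than one counter per cluster, and `decrement_refcount` with its two arrays is a hypothetical interface.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model: refcounts[i] is the refcount of physical cluster i and
 * to_free[i] is set once cluster i has been queued for freeing.
 * Returns how many clusters in [cpos, cpos + len) dropped to zero.
 */
size_t decrement_refcount(unsigned int *refcounts, unsigned char *to_free,
                          size_t cpos, size_t len)
{
    size_t i, queued = 0;

    for (i = cpos; i < cpos + len; i++) {
        /* Decrementing an unreferenced cluster would be a bug. */
        assert(refcounts[i] > 0);

        if (--refcounts[i] == 0) {
            to_free[i] = 1; /* only this portion is queued */
            queued++;
        }
    }
    return queued;
}
```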
2009-09-22 | ocfs2: move tree path functions to alloc.h. | Tao Ma
Now fs/ocfs2/alloc.c has more than 7000 lines. It contains our basic b-tree operations. Although we have already made our b-tree operations generic, the basic structure ocfs2_path, which is used to iterate over one b-tree branch, is still static and can only be used in alloc.c. As the refcount tree needs these functions and I don't want to add any more b-tree-unrelated code to alloc.c, export them. Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 | ocfs2: Add refcount b-tree as a new extent tree. | Tao Ma
Add refcount b-tree as a new extent tree so that it can use the b-tree to store and manipulate ocfs2_refcount_rec. Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 | ocfs2: Abstract extent split process. | Tao Ma
ocfs2_mark_extent_written actually does the following things:
1. check the parameters.
2. initialize the left_path and split_rec.
3. call __ocfs2_mark_extent_written, which will:
   1) check the flags of unwritten
   2) do the real split work.
The whole process is packed together rather tightly. So this patch abstracts 2 different functions so that future b-tree operations can work with them:
1. __ocfs2_split_extent will accept path and split_rec and do the real split work.
2. ocfs2_change_extent_flag will accept a new flag and initialize path and split_rec.
So now ocfs2_mark_extent_written will do:
1. check the parameters.
2. call ocfs2_change_extent_flag, which will:
   1) initialize the left_path and split_rec.
   2) check whether the new flags conflict with the old one.
   3) call __ocfs2_split_extent to do the split.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 | ocfs2: Wrap ocfs2_extent_contig in ocfs2_extent_tree. | Tao Ma
Add a new operation eo_ocfs2_extent_contig in the extent tree's operations vector. We want this so that refcount trees can always return CONTIG_NONE and prevent extent merging. Signed-off-by: Tao Ma <tao.ma@oracle.com>
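A minimal sketch of the idea, using stand-in types rather than the kernel's: an operations vector carries an optional contiguity hook, and a tree type that must never merge records installs one that always answers CONTIG_NONE. All `toy_` names, the fallback to a default check when the hook is NULL, and the simplified adjacency test are assumptions made for illustration; only CONTIG_NONE is named in the message above.

```c
#include <assert.h>
#include <stddef.h>

enum contig_type { CONTIG_NONE = 0, CONTIG_LEFT, CONTIG_RIGHT };

struct toy_rec { unsigned int cpos; unsigned int clusters; };

struct toy_tree;

struct toy_tree_ops {
    /* Optional override; NULL means "use the default adjacency check". */
    enum contig_type (*extent_contig)(const struct toy_tree *et,
                                      const struct toy_rec *ext,
                                      const struct toy_rec *insert);
};

struct toy_tree { const struct toy_tree_ops *ops; };

/* Default: two records can merge when one ends where the other starts. */
enum contig_type default_contig(const struct toy_rec *ext,
                                const struct toy_rec *insert)
{
    if (ext->cpos + ext->clusters == insert->cpos)
        return CONTIG_RIGHT;
    if (insert->cpos + insert->clusters == ext->cpos)
        return CONTIG_LEFT;
    return CONTIG_NONE;
}

/* Refcount-style override: never contiguous, so nothing is ever merged. */
enum contig_type never_contig(const struct toy_tree *et,
                              const struct toy_rec *ext,
                              const struct toy_rec *insert)
{
    (void)et; (void)ext; (void)insert;
    return CONTIG_NONE;
}

/* Generic entry point: ask the tree type first, else use the default. */
enum contig_type tree_extent_contig(const struct toy_tree *et,
                                    const struct toy_rec *ext,
                                    const struct toy_rec *insert)
{
    assert(et != NULL && ext != NULL && insert != NULL);
    if (et->ops != NULL && et->ops->extent_contig != NULL)
        return et->ops->extent_contig(et, ext, insert);
    return default_contig(ext, insert);
}
```

The generic insert code then only ever calls `tree_extent_contig()`, so the merge policy lives with the tree type instead of being hard-coded in the b-tree code.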
2009-09-04 | ocfs2: Pass ocfs2_caching_info into ocfs_init_*_extent_tree(). | Joel Becker
With this commit, extent tree operations are divorced from inodes and rely on ocfs2_caching_info. Phew! Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: __ocfs2_mark_extent_written() doesn't need struct inode. | Joel Becker
We only allow unwritten extents on data, so the toplevel ocfs2_mark_extent_written() can use an inode all it wants. But the subfunction isn't even using the inode argument. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: Teach ocfs2_replace_extent_rec() to use an extent_tree. | Joel Becker
Don't use a struct inode anymore. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_split_and_insert() no longer needs struct inode. | Joel Becker
It already has an extent_tree. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_remove_extent() no longer needs struct inode. | Joel Becker
One more generic btree function that is isolated from struct inode. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_add_clusters_in_btree() no longer needs struct inode. | Joel Becker
One more function that doesn't need a struct inode to pass to its children. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_insert_extent() no longer needs struct inode. | Joel Becker
One more function down, no inode in the entire insert-extent chain. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: Make extent map insertion an extent_tree_operation. | Joel Becker
ocfs2_insert_extent() wants to insert a record into the extent map if it's an inode data extent. But since many btrees can call that function, let's make it an op on ocfs2_extent_tree. Other tree types can leave it empty. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_figure_insert_type() no longer needs struct inode. | Joel Becker
It's not using it, so remove it from the parameter list. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: Remove inode from ocfs2_figure_extent_contig(). | Joel Becker
It already has an ocfs2_extent_tree and doesn't need the inode. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: Swap inode for extent_tree in ocfs2_figure_merge_contig_type(). | Joel Becker
We don't want struct inode in generic btree operations. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_extent_contig() only requires the superblock. | Joel Becker
Don't pass the inode in. We don't want it around for generic btree operations. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_do_insert_extent() and ocfs2_insert_path() no longer need an inode. | Joel Becker
They aren't using it, so remove it from their parameter lists. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: Give ocfs2_split_record() an extent_tree instead of an inode. | Joel Becker
Another on the way to generic btree functions. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_insert_at_leaf() doesn't need struct inode. | Joel Becker
Give it an ocfs2_extent_tree and it is happy. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: Make truncating the extent map an extent_tree_operation. | Joel Becker
ocfs2_remove_extent() wants to truncate the extent map if it's truncating an inode data extent. But since many btrees can call that function, let's make it an op on ocfs2_extent_tree. Other tree types can leave it empty. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_truncate_rec() doesn't need struct inode. | Joel Becker
It's not using it anymore. Remove it from the parameter list. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_grow_branch() and ocfs2_append_rec_to_path() lose struct inode. | Joel Becker
ocfs2_grow_branch() is not really using it other than to pass it to the subfunctions ocfs2_shift_tree_depth(), ocfs2_find_branch_target(), and ocfs2_add_branch(). The first two weren't using it either, so they drop the argument. ocfs2_add_branch() only passed it to ocfs2_adjust_rightmost_branch(), which drops the inode argument and uses the ocfs2_extent_tree as well. ocfs2_append_rec_to_path() can take an ocfs2_extent_tree instead of the inode. The function ocfs2_adjust_rightmost_records() goes along for the ride. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_try_to_merge_extent() doesn't need struct inode. | Joel Becker
It's not using it, so remove it from the parameter list. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_merge_rec_left/right() no longer need struct inode. | Joel Becker
Drop it from the parameters - they already have ocfs2_extent_list. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_rotate_tree_left() no longer needs struct inode. | Joel Becker
It already gets ocfs2_extent_tree, so we can just use that. This chains to the same modification for ocfs2_remove_rightmost_path() and ocfs2_rotate_rightmost_leaf_left(). Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: __ocfs2_rotate_tree_left() doesn't need struct inode. | Joel Becker
It already has struct ocfs2_extent_tree, which has the caching info. So we don't need to pass it struct inode. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_rotate_subtree_left() doesn't need struct inode. | Joel Becker
It already has struct ocfs2_extent_tree, which has the caching info. So we don't need to pass it struct inode. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_update_edge_lengths() doesn't need struct inode. | Joel Becker
Pass in the extent tree, which is all we need. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_rotate_tree_right() doesn't need struct inode. | Joel Becker
We don't need struct inode in ocfs2_rotate_tree_right() anymore. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: Drop struct inode from ocfs2_extent_tree_operations. | Joel Becker
We can get to the inode from the caching information. Other parent types don't need it. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: Pass ocfs2_extent_tree to ocfs2_get_subtree_root() | Joel Becker
Get rid of the inode argument. Use extent_tree instead. This means a few more functions have to pass an extent_tree around. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: Get inode out of ocfs2_rotate_subtree_root_right(). | Joel Becker
Pass the ocfs2_extent_list down through ocfs2_rotate_tree_right() and get rid of struct inode in ocfs2_rotate_subtree_root_right(). Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_complete_edge_insert() doesn't need struct inode at all. | Joel Becker
Completely unused argument. Get rid of it. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: Pass ocfs2_extent_tree to ocfs2_unlink_path() | Joel Becker
ocfs2_unlink_path() doesn't need struct inode, so let's pass it struct ocfs2_extent_tree. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_create_new_meta_bhs() doesn't need struct inode. | Joel Becker
Pass struct ocfs2_extent_tree into ocfs2_create_new_meta_bhs(). It no longer needs struct inode or ocfs2_super. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: ocfs2_find_path() only needs the caching info | Joel Becker
ocfs2_find_path() and ocfs2_find_leaf() walk our btrees, reading extent blocks. They need struct ocfs2_caching_info for that, but not struct inode. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: Pass ocfs2_caching_info to ocfs2_read_extent_block(). | Joel Becker
Extent blocks belong to btrees on more than just inodes, so we want to pass the ocfs2_caching_info structure directly to ocfs2_read_extent_block(). A number of places in alloc.c can now drop struct inode from their argument list. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: Store the ocfs2_caching_info on ocfs2_extent_tree. | Joel Becker
What do we cache? Metadata blocks. What are most of our non-inode metadata blocks? Extent blocks for our btrees. struct ocfs2_extent_tree is the main structure for managing those. So let's store the associated ocfs2_caching_info there. This means that ocfs2_et_root_journal_access() doesn't need struct inode anymore, and any place that has an et can refer to et->et_ci instead of INODE_CACHE(inode). Signed-off-by: Joel Becker <joel.becker@oracle.com>
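The effect of keeping the cache pointer on the tree can be sketched with stand-in types. Everything prefixed `toy_` is hypothetical and only `et_ci` mirrors a name from the message above; the point is that generic code reaches the cache through the tree and never needs to know what kind of object owns it.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for illustration; these are not the kernel's definitions. */
struct toy_caching_info { unsigned long owner_blkno; };

struct toy_inode { struct toy_caching_info cache; };
struct toy_other_owner { struct toy_caching_info cache; };

struct toy_extent_tree {
    struct toy_caching_info *et_ci; /* metadata cache of the tree's owner */
};

/* Each owner type wires its own cache into the tree once, at init time. */
void toy_init_inode_tree(struct toy_extent_tree *et, struct toy_inode *inode)
{
    et->et_ci = &inode->cache;
}

void toy_init_other_tree(struct toy_extent_tree *et,
                         struct toy_other_owner *owner)
{
    et->et_ci = &owner->cache;
}

/* Generic b-tree code: only the tree is passed in, no struct inode. */
unsigned long toy_tree_owner(const struct toy_extent_tree *et)
{
    assert(et != NULL && et->et_ci != NULL);
    return et->et_ci->owner_blkno;
}
```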
2009-09-04 | ocfs2: Pass struct ocfs2_caching_info to the journal functions. | Joel Becker
The next step in divorcing metadata I/O management from struct inode is to pass struct ocfs2_caching_info to the journal functions. Thus the journal locks a metadata cache with the cache io_lock function. It also can compare ci_last_trans and ci_created_trans directly. This is a large patch because of all the places we change ocfs2_journal_access..(handle, inode, ...) to ocfs2_journal_access..(handle, INODE_CACHE(inode), ...). Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 | ocfs2: Take the inode out of the metadata read/write paths. | Joel Becker
We are really passing the inode into the ocfs2_read/write_blocks() functions to get at the metadata cache. This commit passes the cache directly into the metadata block functions, divorcing them from the inode. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-08-17 | ocfs2: release the buffer head in ocfs2_do_truncate. | Tao Ma
In ocfs2_do_truncate, we forget to release last_eb_bh, which causes a memory leak. So call brelse at the end. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-07-23 | ocfs2: Use ocfs2_rec_clusters in ocfs2_adjust_adjacent_records. | Tao Ma
In ocfs2_adjust_adjacent_records, we adjust adjacent records according to the extent_list in the lower level. But the lower level tree can be either a leaf or a branch. If we only use ocfs2_is_empty_extent, we will run into problems when the lower tree is a branch (tree_depth > 1). So use !ocfs2_rec_clusters instead. Also, only a leaf record can have holes, so add a BUG_ON for the non-leaf branch. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-07-21 | ocfs2: Add extra credits and access the modified bh in update_edge_lengths. | Tao Ma
In the normal left tree rotation process, we never touch the tree branch above subtree_index, and ocfs2_extend_rotate_transaction doesn't reserve credits for it either. But when we want to delete the rightmost extent block, we have to update the rightmost records for the entire rightmost branch (see ocfs2_update_edge_lengths), so we have to allocate extra credits for them. What's more, we also have to access them. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-06-15 | ocfs2: Adjust rightmost path in ocfs2_add_branch. | Tao Ma
In ocfs2_add_branch, we use the rightmost rec of the leaf extent block to generate the e_cpos for the newly added branch. In most cases this is OK, but if the parent extent block's rightmost rec covers more clusters than the leaf does, it will cause a kernel panic when we insert some clusters in it. The message is something like:

    (7445,1):ocfs2_insert_at_leaf:3775 ERROR: bug expression: le16_to_cpu(el->l_next_free_rec) >= le16_to_cpu(el->l_count)
    (7445,1):ocfs2_insert_at_leaf:3775 ERROR: inode 66053, depth 0, count 28, next free 28, rec.cpos 270, rec.clusters 1, insert.cpos 275, insert.clusters 1
    [<fa7ad565>] ? ocfs2_do_insert_extent+0xb58/0xda0 [ocfs2]
    [<fa7b08f2>] ? ocfs2_insert_extent+0x5bd/0x6ba [ocfs2]
    [<fa7b1b8b>] ? ocfs2_add_clusters_in_btree+0x37f/0x564 [ocfs2]
    ...

The panic can be easily reproduced by the following small test case (with bs=512, cs=4K; all error handling is removed so that it is clear enough to read).

    #include <fcntl.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        int fd, i;
        char buf[5] = "test";

        /* O_CREAT requires a mode argument */
        fd = open(argv[1], O_RDWR|O_CREAT, 0644);

        /* 30 sparse writes, one every 10 clusters */
        for (i = 0; i < 30; i++) {
            lseek(fd, 40960 * i, SEEK_SET);
            write(fd, buf, 5);
        }

        /* cut the file back to 280 clusters */
        ftruncate(fd, 1146880);

        /* write at cluster 275, inside the gap */
        lseek(fd, 1126400, SEEK_SET);
        write(fd, buf, 5);

        close(fd);

        return 0;
    }

The reason for the panic is that the 30 writes and the ftruncate make the file's extent list look like:

    Tree Depth: 1   Count: 19   Next Free Rec: 1
    ## Offset        Clusters       Block#
    0  0             280            86183

    SubAlloc Bit: 7   SubAlloc Slot: 0
    Blknum: 86183   Next Leaf: 0
    CRC32: 00000000   ECC: 0000
    Tree Depth: 0   Count: 28   Next Free Rec: 28
    ## Offset        Clusters       Block#          Flags
    0  0             1              143368          0x0
    1  10            1              143376          0x0
    ...
    26 260           1              143576          0x0
    27 270           1              143584          0x0

Now another write at 1126400 (cluster 275), which lands in the gap between 271 and 280, will trigger ocfs2_add_branch, but the result after the function looks like:

    Tree Depth: 1   Count: 19   Next Free Rec: 2
    ## Offset        Clusters       Block#
    0  0             280            86183
    1  271           0              143592

So the extent records intersect, which makes the following operations bug out.

This patch just tries to remove the gap before we add the new branch, so that the root (branch) rightmost rec covers the same right position as the leaf's rightmost rec. So in the above case, before adding the branch the tree will be changed to:

    Tree Depth: 1   Count: 19   Next Free Rec: 1
    ## Offset        Clusters       Block#
    0  0             271            86183

    SubAlloc Bit: 7   SubAlloc Slot: 0
    Blknum: 86183   Next Leaf: 0
    CRC32: 00000000   ECC: 0000
    Tree Depth: 0   Count: 28   Next Free Rec: 28
    ## Offset        Clusters       Block#          Flags
    0  0             1              143368          0x0
    1  10            1              143376          0x0
    ...
    26 260           1              143576          0x0
    27 270           1              143584          0x0

And after the branch is added, the tree looks like:

    Tree Depth: 1   Count: 19   Next Free Rec: 2
    ## Offset        Clusters       Block#
    0  0             271            86183
    1  271           0              143592

Signed-off-by: Tao Ma <tao.ma@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
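The numbers in this report can be checked with a short sketch. It only models the arithmetic and the gap removal on two records; `toy_rec`, `bytes_to_cluster`, `rec_end` and `trim_parent_to_leaf` are hypothetical helpers, not functions from alloc.c.

```c
#include <assert.h>

/* Stand-in for an extent record: start cluster and length in clusters. */
struct toy_rec { unsigned int cpos; unsigned int clusters; };

/* Convert a byte offset into a cluster offset. */
unsigned int bytes_to_cluster(unsigned long bytes, unsigned int cluster_size)
{
    assert(cluster_size > 0);
    return (unsigned int)(bytes / cluster_size);
}

/* First cluster past the range a record covers. */
unsigned int rec_end(const struct toy_rec *rec)
{
    return rec->cpos + rec->clusters;
}

/* Shrink the parent's rightmost record so that it ends exactly where
 * the leaf's rightmost record ends, removing the gap between them. */
void trim_parent_to_leaf(struct toy_rec *parent,
                         const struct toy_rec *leaf_last)
{
    unsigned int leaf_end = rec_end(leaf_last);

    assert(leaf_end >= parent->cpos && leaf_end <= rec_end(parent));
    parent->clusters = leaf_end - parent->cpos;
}
```

With cs=4K, the writes every 40960 bytes land 10 clusters apart, the ftruncate to 1146880 bytes cuts the file at cluster 280, and the final write at 1126400 hits cluster 275. Trimming the parent record (0, 280) against the leaf's last record (270, 1) gives (0, 271), so a new branch starting at 271 no longer overlaps it.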