path: root/fs/ocfs2
Age | Commit message | Author
2009-01-05 | ocfs2: Rename ocfs2_cp_xattr_cluster() to ocfs2_mv_xattr_buckets(). | Joel Becker
ocfs2_cp_xattr_cluster() takes the last cluster of an xattr extent, copies its buckets to the front of a new extent, and then shrinks the bucket count of the original extent. So it's really moving the data, not copying it. While we're here, the function doesn't need a buffer_head for the old extent, just the block number. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
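The difference between copying and moving that motivates the rename can be sketched in userspace C. This is only an illustration: `struct extent`, `BUCKET_SIZE`, `MAX_BUCKETS`, and `mv_buckets()` are hypothetical names, not the ocfs2 structures, and the journal access the real function performs is left out.

```c
#include <stddef.h>
#include <string.h>

#define BUCKET_SIZE 8    /* hypothetical, tiny for illustration */
#define MAX_BUCKETS 16   /* hypothetical capacity of one extent */

struct extent {
    size_t num_buckets;                              /* header: buckets in use */
    unsigned char buckets[MAX_BUCKETS][BUCKET_SIZE]; /* bucket payloads */
};

/* Move the last `count` buckets of `src` to the front of the empty `dst`. */
static int mv_buckets(struct extent *src, struct extent *dst, size_t count)
{
    if (count > src->num_buckets || dst->num_buckets != 0)
        return -1;

    size_t first = src->num_buckets - count;

    /* Copy the tail of the source to the front of the target. */
    memcpy(dst->buckets, src->buckets[first], count * BUCKET_SIZE);
    dst->num_buckets = count;

    /* Shrinking the source count is what makes this a move, not a copy. */
    src->num_buckets -= count;
    return 0;
}
```

After a successful call the source extent no longer accounts for the moved buckets, which is the behavior the new name describes.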
2009-01-05 | ocfs2: Use ocfs2_cp_xattr_bucket() in ocfs2_mv_xattr_bucket_cross_cluster(). | Joel Becker
The buffer copy loop of ocfs2_mv_xattr_bucket_cross_cluster() actually looks a lot like ocfs2_cp_xattr_bucket(). Let's just use that instead. We also use bucket operations to update the buckets at the start of each extent. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Explain t_is_new in ocfs2_cp_xattr_cluster(). | Joel Becker
I was unsure of the JOURNAL_ACCESS parameters in ocfs2_cp_xattr_cluster(). They're based on the function argument 't_is_new', but I couldn't quite figure out how t_is_new mapped to allocation. ocfs2_cp_xattr_cluster() actually overwrites the target, regardless of t_is_new. Well, I just figured it out. So I'm adding a big fat comment for those who come after me. ocfs2_divide_xattr_cluster() has the same behavior. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Dirty the entire first bucket in ocfs2_cp_xattr_cluster(). | Joel Becker
ocfs2_cp_xattr_cluster() takes the last bucket of a full extent and copies it over to a new extent. It then updates the headers of both extents to reflect the new state. It is passed the first bh of the first bucket in order to update that first extent's bucket count. It reads and dirties the first bh of the new extent for the same reason. However, future code wants to always dirty the entire bucket when it is changed. So it is changed to read the entire bucket it is updating for both extents. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Dirty the entire first bucket in ocfs2_extend_xattr_bucket() | Joel Becker
ocfs2_extend_xattr_bucket() takes an extent of buckets and shifts some of them down to make room for a new xattr. It is passed the first bh of the first bucket, because that is where we store the number of buckets in the extent. However, future code wants to always dirty the entire bucket when it is changed. So let's pass the entire bucket into this function, skip any block reads (we have them), and add the access/dirty logic. We also can skip passing in the target bucket bh - we only need its block number. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Narrow the transaction for deleting xattrs from a bucket. | Tao Ma
We move the transaction into the loop because in ocfs2_remove_extent we double the credits in ocfs2_extend_rotate_transaction. With a large number of loop iterations, a single transaction would soon waste a lot of journal space. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Dirty the entire bucket in ocfs2_bucket_value_truncate() | Joel Becker
ocfs2_bucket_value_truncate() currently takes the first bh of the bucket, and magically plays around with the value bh - even though the bucket structure in the calling function already has it. In addition, future code wants to always dirty the entire bucket when it is changed. So let's pass the entire bucket into this function, skip any block reads (we have them), and add the access/dirty logic. ocfs2_xattr_update_value_size() is no longer necessary, as it only did one thing other than journal access/dirty. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2/quota: sparse fixes for quota | Tao Ma
Fix two minor things in quota, both found by sparse: 1. an endian bug in ocfs2_local_quota_add_chunk; 2. make olq_alloc_dquot static. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: fix indentation in ocfs2_dquot_drop_slow | Tao Ma
Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Fix build warnings (64-bit types vs long long) | Jan Kara
fs/ocfs2/quota_local.c: In function 'olq_set_dquot': fs/ocfs2/quota_local.c:844: warning: format '%lld' expects type 'long long int', but argument 7 has type '__le64' fs/ocfs2/quota_local.c:844: warning: format '%lld' expects type 'long long int', but argument 8 has type '__le64' fs/ocfs2/quota_local.c:844: warning: format '%lld' expects type 'long long int', but argument 7 has type '__le64' fs/ocfs2/quota_local.c:844: warning: format '%lld' expects type 'long long int', but argument 8 has type '__le64' fs/ocfs2/quota_local.c:844: warning: format '%lld' expects type 'long long int', but argument 7 has type '__le64' fs/ocfs2/quota_local.c:844: warning: format '%lld' expects type 'long long int', but argument 8 has type '__le64' fs/ocfs2/quota_global.c: In function '__ocfs2_sync_dquot': fs/ocfs2/quota_global.c:457: warning: format '%lld' expects type 'long long int', but argument 8 has type 's64' fs/ocfs2/quota_global.c:457: warning: format '%lld' expects type 'long long int', but argument 10 has type 's64' fs/ocfs2/quota_global.c:457: warning: format '%lld' expects type 'long long int', but argument 8 has type 's64' fs/ocfs2/quota_global.c:457: warning: format '%lld' expects type 'long long int', but argument 10 has type 's64' fs/ocfs2/quota_global.c:457: warning: format '%lld' expects type 'long long int', but argument 8 has type 's64' fs/ocfs2/quota_global.c:457: warning: format '%lld' expects type 'long long int', but argument 10 has type 's64' Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
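One common way to silence warnings of this kind is to convert the on-disk value to CPU byte order and then cast it explicitly to `long long` before handing it to a `%lld` format. The userspace sketch below illustrates that pattern; it does not claim to reproduce the actual patch. `le64_decode()` stands in for the kernel's `le64_to_cpu()`, and `format_usage()` is a hypothetical helper.

```c
#include <stdint.h>
#include <stdio.h>

/* Decode a little-endian 64-bit on-disk value, byte by byte. */
static int64_t le64_decode(const unsigned char b[8])
{
    uint64_t v = 0;

    for (int i = 7; i >= 0; i--)
        v = (v << 8) | b[i];
    return (int64_t)v;
}

/* Format a decoded on-disk counter into `out`. */
static int format_usage(char *out, size_t len, const unsigned char disk[8])
{
    /*
     * int64_t may be plain 'long' on 64-bit targets, while %lld
     * requires 'long long', so the cast makes the types agree.
     */
    return snprintf(out, len, "%lld", (long long)le64_decode(disk));
}
```

The cast costs nothing at run time; it only makes the argument type match the format specifier on every architecture.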
2009-01-05 | ocfs2: Make ocfs2_get_quota_block() consistent with ocfs2_read_quota_block() | Jan Kara
Make function return error status and not buffer pointer so that it's consistent with ocfs2_read_quota_block(). Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Fix oops when extending quota files | Jan Kara
We have to mark buffer as uptodate before calling ocfs2_journal_access() and ocfs2_set_buffer_uptodate() does not do this for us. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Fix ocfs2_read_quota_block() error handling. | Joel Becker
ocfs2_bread() has become ocfs2_read_virt_blocks(), with a prototype to match ocfs2_read_blocks(). The quota code, converting from ocfs2_bread(), wraps the call to ocfs2_read_virt_blocks() in ocfs2_read_quota_block(). Unfortunately, the prototype of ocfs2_read_quota_block() matches the old prototype of ocfs2_bread(). The problem is that ocfs2_bread() returned the buffer head, and callers assumed that a NULL pointer was indicative of error. It wasn't. This is why ocfs2_bread() took an int*err argument as well. The new prototype of ocfs2_read_virt_blocks() avoids this error handling confusion. Let's change ocfs2_read_quota_block() to match. Signed-off-by: Joel Becker <joel.becker@oracle.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
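The prototype style this commit moves to, where the return value is the status and the buffer comes back through an output pointer, can be sketched as follows. `struct buf`, `block_store`, and `read_block()` are hypothetical stand-ins for illustration, not the ocfs2 functions.

```c
#include <errno.h>
#include <stddef.h>

struct buf {
    unsigned long blkno;
};

/* A toy four-block "device". */
static struct buf block_store[4] = { {0}, {1}, {2}, {3} };

/*
 * The return value carries only the status; the buffer is handed back
 * through *out. Callers never have to guess whether NULL means failure.
 */
static int read_block(unsigned long blkno, struct buf **out)
{
    *out = NULL;
    if (blkno >= 4)
        return -EIO;

    *out = &block_store[blkno];
    return 0;
}
```

With this shape a caller checks one thing, the integer status, and only then touches the buffer.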
2009-01-05 | ocfs2: Add missing initialization | Jan Kara
Add missing variable initialization to ocfs2_dquot_drop_slow(). Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Use BH_JBDPrivateStart instead of BH_Unshadow | Mark Fasheh
This is safer. We no longer have to worry about tracking changes to jbd_state_bits. Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Enable quota accounting on mount, disable on umount | Jan Kara
Enable quota usage tracking on mount and disable it on umount. Also add support for quota on and quota off quotactls and usrquota and grpquota mount options. Add quota features among supported ones. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Implement quota recovery | Jan Kara
Implement functions for recovery after a crash. The functions just read the local quota file and sync its information to the global quota file. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Periodic quota syncing | Mark Fasheh
This patch creates a work queue for periodic syncing of locally cached quota information to the global quota files. We constantly queue a delayed work item, to get the periodic behavior. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Acked-by: Jan Kara <jack@suse.cz>
2009-01-05 | ocfs2: Add quota calls for allocation and freeing of inodes and space | Jan Kara
Add quota calls for allocation and freeing of inodes and space, also update estimates on number of needed credits for a transaction. Move out inode allocation from ocfs2_mknod_locked() because vfs_dq_init() must be called outside of a transaction. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Implementation of local and global quota file handling | Jan Kara
For each quota type, each node has a local quota file. In this file it stores the changes users have made to disk usage via this node. Once in a while this information is synced to the global file (and thus with other nodes) so that limit enforcement works at least approximately. Global quota files contain all the information about usage and limits. They are mostly handled by the generic VFS code (which implements a trie of structures inside a quota file). We only have to provide functions to convert structures from the on-disk format to the in-memory one. We also have to provide wrappers for various quota functions that start transactions and acquire the necessary cluster locks before the actual IO is started. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
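The local/global split described here, and why enforcement is only approximate, can be illustrated with a minimal userspace sketch. All names (`struct local_dquot`, `struct global_dquot`, `local_charge()`, `sync_to_global()`) are hypothetical; the real code also handles on-disk formats, transactions, and cluster locks, none of which are modeled.

```c
#include <stdint.h>

/* Cluster-wide view: total usage and the limit to enforce. */
struct global_dquot {
    int64_t curspace;
    int64_t hardlimit;
};

/* Per-node view: the change made via this node since the last sync. */
struct local_dquot {
    int64_t spacemod;
};

/* Record a usage change locally; no cluster-wide work is needed. */
static void local_charge(struct local_dquot *l, int64_t bytes)
{
    l->spacemod += bytes;
}

/*
 * Fold the local delta into the global record and reset it.
 * Returns 1 if the global usage is now over the hard limit.
 */
static int sync_to_global(struct local_dquot *l, struct global_dquot *g)
{
    g->curspace += l->spacemod;
    l->spacemod = 0;
    return g->hardlimit > 0 && g->curspace > g->hardlimit;
}
```

If two nodes each charge 60 bytes against a 100-byte limit before either syncs, neither sees a problem locally, and the overshoot only becomes visible once the second node syncs. That window is the price of not taking a cluster lock on every allocation.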
2009-01-05 | ocfs2: Mark system files as not subject to quota accounting | Jan Kara
Mark system files as not subject to quota accounting. This prevents possible recursions into quota code and thus deadlocks. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Assign feature bits and system inodes to quota feature and quota files | Jan Kara
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Support nested transactions | Jan Kara
OCFS2 can easily support nested transactions. We just have to take care not to spoil statistics or acquire the semaphore unnecessarily. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
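One way to read this commit message is that a nested start should reuse the handle that is already running and skip both the semaphore and the statistics update. The sketch below models that idea in userspace C under that assumption; `struct journal`, `struct handle`, `start_trans()`, and `commit_trans()` are hypothetical names, and the semaphore is reduced to a counter.

```c
#include <stddef.h>

struct handle {
    int refs;                 /* nesting depth of this handle */
};

struct journal {
    struct handle *current;   /* running handle, if any */
    int sem_holders;          /* stand-in for the transaction semaphore */
    int trans_started;        /* statistics counter */
};

static struct handle *start_trans(struct journal *j, struct handle *storage)
{
    if (j->current) {
        /* Nested start: reuse the handle, leave semaphore and stats alone. */
        j->current->refs++;
        return j->current;
    }

    j->sem_holders++;
    j->trans_started++;
    storage->refs = 1;
    j->current = storage;
    return storage;
}

static void commit_trans(struct journal *j, struct handle *h)
{
    if (--h->refs > 0)
        return;               /* an outer user still holds the handle */

    j->current = NULL;
    j->sem_holders--;
}
```

Only the outermost start and commit touch the shared state, so nesting neither inflates the count of started transactions nor takes the semaphore twice.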
2009-01-05 | ocfs2/xattr: Restore not_found in xis | Tao Ma
During an xattr set, when we move an xattr that was stored in the inode to an outside bucket, we have to delete it, and the deletion uses the old value of xis->not_found. xis->not_found is removed by ocfs2_calc_xattr_set_need, though, so we must restore it. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2/xattr: Fix a bug in xattr allocation estimation | Tao Ma
When we extend one xattr's value to a large size, the old value size might be smaller than the size of a value root. In those cases, we still need to guess the metadata allocation. Reported-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Remove JBD compatibility layer | Mark Fasheh
JBD2 is fully backwards compatible with JBD and it's been tested enough with Ocfs2 that we can clean this code up now. Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Convert ocfs2_read_dir_block() to ocfs2_read_virt_blocks() | Joel Becker
Now that we've centralized the ocfs2_read_virt_blocks() code, let's use it in ocfs2_read_dir_block(). Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Wrap virtual block reads in ocfs2_read_virt_blocks() | Joel Becker
The ocfs2_read_dir_block() function really maps an inode's virtual blocks to physical ones before calling ocfs2_read_blocks(). Let's extract that to common code, because other places might want to do that. Other than the block number being virtual, ocfs2_read_virt_blocks() takes the same arguments as ocfs2_read_blocks(). It converts those virtual block numbers to physical before calling ocfs2_read_blocks() directly. If the blocks asked for are discontiguous, this can mean multiple calls to ocfs2_read_blocks(), but this is mostly hidden from the caller. Like ocfs2_read_blocks(), the caller can pass in an existing buffer_head. This is usually done to pick up some readahead I/O. ocfs2_read_virt_blocks() checks the buffer_head's block number against the extent map - it must match. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
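The virtual-to-physical mapping, and why a discontiguous range turns into several physical reads, can be sketched in userspace C. This assumes a simple array of extent records as the extent map; `struct extent_rec`, `virt_to_phys()`, and this `read_virt_blocks()` are hypothetical illustrations, not the ocfs2 implementation.

```c
#include <stddef.h>

/* One mapping: `len` virtual blocks from v_start live at p_start. */
struct extent_rec {
    unsigned long v_start, p_start, len;
};

/* Look up `v`; report its physical block and how many blocks stay contiguous. */
static int virt_to_phys(const struct extent_rec *map, size_t n,
                        unsigned long v,
                        unsigned long *p, unsigned long *contig)
{
    for (size_t i = 0; i < n; i++) {
        if (v >= map[i].v_start && v < map[i].v_start + map[i].len) {
            unsigned long off = v - map[i].v_start;

            *p = map[i].p_start + off;
            *contig = map[i].len - off;
            return 0;
        }
    }
    return -1; /* unmapped virtual block */
}

/*
 * "Read" nr virtual blocks starting at v, storing the physical block
 * numbers in phys_out. Returns the number of physical read calls that
 * were needed, or -1 if part of the range is unmapped.
 */
static int read_virt_blocks(const struct extent_rec *map, size_t n,
                            unsigned long v, unsigned long nr,
                            unsigned long *phys_out)
{
    int calls = 0;

    while (nr > 0) {
        unsigned long p, contig;

        if (virt_to_phys(map, n, v, &p, &contig) < 0)
            return -1;
        if (contig > nr)
            contig = nr;

        /* This inner loop stands in for one contiguous physical read. */
        for (unsigned long i = 0; i < contig; i++)
            *phys_out++ = p + i;

        v += contig;
        nr -= contig;
        calls++;
    }
    return calls;
}
```

A request that crosses an extent boundary needs one physical read per contiguous run, but the caller still sees a single call.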
2009-01-05 | ocfs2: Validate metadata only when it's read from disk. | Joel Becker
Add an optional validation hook to ocfs2_read_blocks(). Now the validation function is only called when a block was actually read off of disk. It is not called when the buffer was in cache. We add a buffer state bit BH_NeedsValidate to flag these buffers. It must always be one higher than the last JBD2 buffer state bit. The dinode, dirblock, extent_block, and xattr_block validators are lifted to this scheme directly. The group_descriptor validator needs to be split into two pieces. The first part only needs the gd buffer and is passed to ocfs2_read_block(). The second part requires the dinode as well, and is called every time. It's only 3 compares, so it's tiny. This also allows us to clean up the non-fatal gd check used by resize.c. It now has no magic argument. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
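The "validate only on a real disk read" idea can be modeled with a flag that is set when I/O happens and cleared once the hook has passed. The following is a userspace sketch with hypothetical names (`struct buf`, `BUF_NEEDS_VALIDATE`, `read_block()`, `check_magic()`, and the magic value); it does not model the actual buffer_head state bits.

```c
#include <stddef.h>

enum {
    BUF_UPTODATE       = 1 << 0,  /* contents are cached and usable */
    BUF_NEEDS_VALIDATE = 1 << 1   /* fresh from disk, not yet checked */
};

struct buf {
    unsigned flags;
    unsigned magic;
};

typedef int (*validate_fn)(const struct buf *);

static int validate_calls;        /* counts hook invocations */

/* Example hook: accept only blocks carrying the expected magic. */
static int check_magic(const struct buf *b)
{
    validate_calls++;
    return b->magic == 0x4F43 ? 0 : -1;
}

static int read_block(struct buf *b, unsigned disk_magic, validate_fn validate)
{
    if (!(b->flags & BUF_UPTODATE)) {
        b->magic = disk_magic;    /* stand-in for the actual disk I/O */
        b->flags |= BUF_UPTODATE | BUF_NEEDS_VALIDATE;
    }

    if ((b->flags & BUF_NEEDS_VALIDATE) && validate) {
        int rc = validate(b);

        if (rc) {
            b->flags = 0;         /* do not keep a bad block cached */
            return rc;
        }
        b->flags &= ~BUF_NEEDS_VALIDATE;
    }
    return 0;
}
```

Reading the same buffer twice runs the hook once: the second call finds the block cached with the flag already cleared.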
2009-01-05 | ocfs2: Wrap xattr block reads in a dedicated function | Joel Becker
We weren't consistently checking xattr blocks after we read them. Most places checked the signature, but none checked xb_blkno or xb_fs_signature. Create a toplevel ocfs2_read_xattr_block() that does the read and the validation. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Wrap dirblock reads in a dedicated function. | Joel Becker
We have ocfs2_bread() as a vestige of the original ext-based dir code. It's only used by directories, though. Turn it into ocfs2_read_dir_block(), with a prototype matching the other metadata read functions. It's set up to validate dirblocks when the time comes. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Wrap extent block reads in a dedicated function. | Joel Becker
We weren't consistently checking extent blocks after we read them. Most places checked the signature, but none checked h_blkno or h_fs_signature. Create a toplevel ocfs2_read_extent_block() that does the read and the validation. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Morph the haphazard OCFS2_IS_VALID_GROUP_DESC() checks. | Joel Becker
Random places in the code would check a group descriptor bh to see if it was valid. The previous commit unified descriptor block reads, validating all block reads in the same place. Thus, these checks are no longer necessary. Rather than eliminate them, however, we change them to BUG_ON() checks. This ensures the assumptions remain true. All of the code paths to these checks have been audited to ensure they come from a validated descriptor read. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Wrap group descriptor reads in a dedicated function. | Joel Becker
We have a clean call for validating group descriptors, but every place that wants one always does a read_block()+validate() call pair. Create a toplevel ocfs2_read_group_descriptor() that does the right thing. This allows us to leverage the single call point later for fancier handling. We also add validation of gd->bg_generation against the superblock and gd->bg_blkno against the block we thought we read. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Consolidate validation of group descriptors. | Joel Becker
Currently the validation of group descriptors is directly duplicated so that one version can error the filesystem and the other (resize) can just report the problem. Consolidate to one function that takes a boolean. Wrap that function with the old call for the old users. This is in preparation for lifting the read+validate step into a single function. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
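The consolidation pattern, one checker that takes a boolean plus a thin wrapper preserving the old call, might look roughly like this in userspace C. Every name here (`struct group_desc`, `GD_SIG`, `validate_gd_full()`, `validate_gd()`, `fs_errors`) is hypothetical, and the two checks shown are only examples of what such a validator could test.

```c
#include <stdio.h>

#define GD_SIG 0x47523031u   /* hypothetical signature value */

struct group_desc {
    unsigned sig;
    unsigned bits;           /* total bits in the group */
    unsigned free_bits;      /* free bits; must not exceed `bits` */
};

static int fs_errors;        /* how often the filesystem was flagged */

/*
 * Single validator. With `resize` set, problems are only reported;
 * otherwise they are treated as filesystem errors.
 */
static int validate_gd_full(const struct group_desc *gd, int resize)
{
    const char *why = NULL;

    if (gd->sig != GD_SIG)
        why = "bad signature";
    else if (gd->free_bits > gd->bits)
        why = "more free bits than total bits";

    if (!why)
        return 0;

    if (resize)
        fprintf(stderr, "group descriptor: %s\n", why);
    else
        fs_errors++;         /* stand-in for erroring the filesystem */
    return -1;
}

/* Old entry point, kept as a wrapper for the existing callers. */
static int validate_gd(const struct group_desc *gd)
{
    return validate_gd_full(gd, 0);
}
```

Both callers now share the same checks and differ only in how a failure is surfaced.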
2009-01-05 | ocfs2: Morph the haphazard OCFS2_IS_VALID_DINODE() checks. | Joel Becker
Random places in the code would check a dinode bh to see if it was valid. Not only did they do different levels of validation, they handled errors in different ways. The previous commit unified inode block reads, validating all block reads in the same place. Thus, these haphazard checks are no longer necessary. Rather than eliminate them, however, we change them to BUG_ON() checks. This ensures the assumptions remain true. All of the code paths to these checks have been audited to ensure they come from a validated inode read. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: Wrap inode block reads in a dedicated function. | Joel Becker
The ocfs2 code currently reads inodes off disk with a simple ocfs2_read_block() call. Each place that does this has a different set of sanity checks it performs. Some check only the signature. A couple validate the block number (the block read vs di->i_blkno). A couple others check for VALID_FL. Only one place validates i_fs_generation. A couple check nothing. Even when an error is found, they don't all do the same thing. We wrap inode reading into ocfs2_read_inode_block(). This will validate all the above fields, going readonly if they are invalid (they never should be). ocfs2_read_inode_block_full() is provided for the places that want to pass read_block flags. Every caller is passing a struct inode with a valid ip_blkno, so we don't need a separate blkno argument either. We will remove the validation checks from the rest of the code in a later commit, as they are no longer necessary. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
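The set of checks listed in this message (signature, block number, generation, valid flag) can be gathered into one validator, as in the userspace sketch below. The struct layout, the `VALID_FL` value, and the `"INODE01"` signature string are assumptions made for illustration, and `validate_inode()` is a hypothetical name; the real function also takes the filesystem read-only on failure, which is not modeled.

```c
#include <stdint.h>
#include <string.h>

#define VALID_FL 0x1u        /* assumed flag bit for a valid inode */

struct dinode {
    char     sig[8];         /* assumed signature, NUL-padded */
    uint64_t blkno;          /* block this inode claims to live in */
    uint32_t fs_generation;  /* generation of the owning filesystem */
    uint32_t flags;
};

/* Returns 0 if every field checks out, a negative code otherwise. */
static int validate_inode(const struct dinode *di, uint64_t read_blkno,
                          uint32_t sb_generation)
{
    if (memcmp(di->sig, "INODE01", 8) != 0)
        return -1;           /* not an inode block at all */
    if (di->blkno != read_blkno)
        return -2;           /* block is not where it says it is */
    if (!(di->flags & VALID_FL))
        return -3;           /* inode is not marked valid */
    if (di->fs_generation != sb_generation)
        return -4;           /* belongs to another filesystem instance */
    return 0;
}
```

Putting all four checks behind one call means every reader applies the same level of validation and reacts to failure the same way.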
2009-01-05 | ocfs2: add mount option and Kconfig option for acl | Tiger Yang
This patch adds the Kconfig option "CONFIG_OCFS2_FS_POSIX_ACL" and the mount option "acl" to enable ACLs in Ocfs2. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: add ocfs2_init_acl in mknod | Tiger Yang
We need to get the parent directory's ACLs and let the new child inherit them. To do this, we add additional calculations for data/metadata allocation. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: add ocfs2_acl_chmod | Tiger Yang
This function is used to update acl xattrs during file mode changes. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: add ocfs2_check_acl | Tiger Yang
This function is used to enhance permission checking with POSIX ACLs. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: add POSIX ACL API | Tiger Yang
This patch adds POSIX ACL (access control list) APIs to ocfs2. We convert a struct posix_acl into multiple ocfs2_acl_entry records and treat them as an extended attribute entry. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
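Converting an in-memory ACL into a flat run of fixed-size little-endian entries, suitable for storing as an xattr value, can be sketched as follows. The 8-byte layout used here (16-bit tag, 16-bit permissions, 32-bit id) is an assumption for illustration, and `struct acl_entry`, `acl_to_xattr()`, and the `put_le*()` helpers are hypothetical names rather than the ocfs2 API.

```c
#include <stddef.h>
#include <stdint.h>

#define DISK_ENTRY_SIZE 8    /* assumed: le16 tag, le16 perm, le32 id */

/* In-memory entry, in CPU byte order. */
struct acl_entry {
    uint16_t tag;
    uint16_t perm;
    uint32_t id;
};

static void put_le16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)(v >> 8);
}

static void put_le32(unsigned char *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (unsigned char)((v >> (8 * i)) & 0xff);
}

/*
 * Serialize `count` entries into `out`.
 * Returns the number of bytes written, or 0 if `out` is too small.
 */
static size_t acl_to_xattr(const struct acl_entry *acl, size_t count,
                           unsigned char *out, size_t out_len)
{
    if (count > out_len / DISK_ENTRY_SIZE)
        return 0;

    for (size_t i = 0; i < count; i++) {
        put_le16(out, acl[i].tag);
        put_le16(out + 2, acl[i].perm);
        put_le32(out + 4, acl[i].id);
        out += DISK_ENTRY_SIZE;
    }
    return count * DISK_ENTRY_SIZE;
}
```

Writing each field byte by byte keeps the stored value identical regardless of the host's endianness, which matters for a filesystem shared between nodes.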
2009-01-05 | ocfs2: add ocfs2_xattr_get_nolock | Tiger Yang
This function does the work of ocfs2_xattr_get under an open lock. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: add ocfs2_init_security during file create | Tiger Yang
Security attributes must be set when creating a new inode. We do this in three steps.
- First, get the security xattr's name and value via security_operation.
- Then calculate and reserve the metadata and clusters needed by this security xattr before starting the transaction.
- Finally, set it before add_entry.
Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: add security xattr API | Tiger Yang
This patch adds security xattr set/get/list APIs to support security attributes in Ocfs2. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: add ocfs2_xattr_set_handle | Tiger Yang
This function is used to set xattrs in an already started transaction. It is only called during inode creation, for the initial security/acl xattrs of the new inode. These xattrs can be put into the ibody or an extent block, so xattr buckets are not used in this case. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: move new inode allocation out of the transaction | Tiger Yang
Move out inode allocation from ocfs2_mknod_locked() because vfs_dq_init() must be called outside of a transaction. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2: turn __ocfs2_remove_inode_range() into ocfs2_remove_btree_range() | Mark Fasheh
This patch genericizes the high level handling of extent removal. ocfs2_remove_btree_range() is nearly identical to __ocfs2_remove_inode_range(), except that extent tree operations have been used where necessary. We update ocfs2_remove_inode_range() to use the generic helper. Now extent tree based structures have an easy way to truncate ranges. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <joel.becker@oracle.com>
2009-01-05 | ocfs2/xattr: Merge xattr set transaction. | Tao Ma
In the current ocfs2/xattr code, an xattr set is divided into many steps and many transactions are used, so the set process doesn't behave like a real transaction. This patch tries to merge all of those transactions into one. Another benefit is that acl can use it easily now. I don't merge the transaction for deleting xattrs when we remove an inode. The reason is that if we have a large number of xattrs and every xattr has a large value (large enough for outside storage), the whole transaction would be huge, and it looks like jbd can't handle it (I ran into a jbd complaint once). The old inode removal is also divided into many steps, so I'd like to leave it as it is. Note: In xattr set, I try to avoid ocfs2_extend_trans, since if the credits aren't enough for the extension it will commit all the dirty blocks and create a new transaction, which may lead to inconsistency in metadata. All remaining ocfs2_extend_trans calls are safe now. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05 | ocfs2/xattr: Reserve meta/data at the beginning of ocfs2_xattr_set. | Tao Ma
In ocfs2 xattr set, we reserve metadata and clusters wherever they are needed. This is time-consuming and inefficient, so this patch reserves metadata and clusters at the beginning of ocfs2_xattr_set. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>