aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2009-06-17nfs41: free slotAndy Adamson
Free a slot in the slot table. Mark the slot as free in the bitmap-based allocation table by clearing a bit corresponding to the slotid. Update lowest_free_slotid if freed slotid is lower than that. Update highest_used_slotid. In the case the freed slotid equals the highest_used_slotid, scan downwards for the next highest used slotid using the optimized fls* functions. Finally, wake up thread waiting on slot_tbl_waitq for a free slot to become available. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: free slot use slotid] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: use find_first_zero_bit for nfs4_find_slot] While at it, obliterate lowest_free_slotid and fix-up related comments. As per review comment 21/85. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: use __clear_bit for nfs4_free_slot] While at it, fix-up function comment. Part of review comment 22/85. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: use find_last_bit in nfs4_free_slot to determine highest used slot.] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: rpc_sleep_on slot_tbl_waitq must be called under slot_tbl_lock] Otherwise there's a race (we've hit) with nfs4_free_slot where nfs41_setup_sequence sees a full slot table, unlocks slot_tbl_lock, nfs4_free_slots happen concurrently and call rpc_wake_up_next where there's nobody to wake up yet, context goes back to nfs41_setup_sequence which goes to sleep when the slot table is actually empty now and there's no-one to wake it up anymore. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: setup_sequence methodAndy Adamson
Allocate a slot in the session slot table and set the sequence op arguments. Called at the rpc prepare stage. Add a status to nfs41_sequence_res, initialize it to one so that we catch rpc level failures which do not go through decode_sequence which sets the new status field. Note that upon an rpc level failure, we don't know if the server processed the sequence operation or not. Proceed as if the server did process the sequence operation. Signed-off-by: Rahul Iyer <iyer@netapp.com> [nfs41: sequence args use slotid] [nfs41: find slot return slotid] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: remove SEQ4_STATUS_USE_TK_STATUS] As per 11-14-08 review [move extern declaration from nfs41: sequence setup/done support] [removed sa_session definition, changed sa_cache_this into a u8 to reduce footprint] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: rpc_sleep_on slot_tbl_waitq must be called under slot_tbl_lock] Otherwise there's a race (we've hit) with nfs4_free_slot where nfs41_setup_sequence sees a full slot table, unlocks slot_tbl_lock, nfs4_free_slots happen concurrently and call rpc_wake_up_next where there's nobody to wake up yet, context goes back to nfs41_setup_sequence which goes to sleep when the slot table is actually empty now and there's no-one to wake it up anymore. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: find slotBenny Halevy
Find a free slot using bitmap-based allocation. Use the optimized ffz function to find a zero bit in the bitmap that indicates a free slot, starting the search from the 'lowest_free_slotid' position. If found, mark the slot as used in the bitmap, get the slot's slotid and seqid, and update max_slotid to be used by the SEQUENCE operation. Also, update lowest_free_slotid for next search. If no free slot was found the caller has to wait for a free slot (outside the scope of this function) Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: find slot return slotid] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [use find_first_zero_bit for nfs4_find_slot as per review comment 21/85.] [use NFS4_MAX_SLOT_TABLE rather than NFS4_NO_SLOT] [nfs41: rpc_sleep_on slot_tbl_waitq must be called under slot_tbl_lock] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: nfs4_setup_sequenceAndy Adamson
Perform the nfs4_setup_sequence in the rpc_call_prepare state. If a session slot is not available, we will rpc_sleep_on the slot wait queue leaving the tk_action as rpc_call_prepare. Once we have a session slot, hang on to it even through rpc_restart_calls. Ensure the nfs41_sequence_res sr_slot pointer is NULL before rpc_run_task is called as nfs41_setup_sequence will only find a new slot if it is NULL. A future patch will call free slot after any rpc_restart_calls, and handle the rpc restart that result from a sequence operation error. Signed-off-by: Rahul Iyer <iyer@netapp.com> [nfs41: sequence res use slotid] Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: simplify nfs4_call_sync] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: nfs4_call_sync] [nfs41: check for session not minorversion] [nfs41: remove rpc_message from nfs41_call_sync_args] [moved NFS4_MAX_SLOT_TABLE logic into nfs41_setup_sequence] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: nfs41_call_sync_data use nfs_client not nfs_server] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: expose nfs4_call_sync_session for lease renewal] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: remove unnecessary return check] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: xdr {encode,decode}_sequenceAndy Adamson
Implement stubs for encode and decode sequence, defined as no-ops when CONFIG_NFS_V4_1 is not defined. Add the nfsv41 encode and decode sizes. Add encode_sequence to all nfs4_enc_* routines and decode_sequence to all nfs4_dec_* routines as required by v41. [was nfs41: minorversion support for xdr] [added nfs_client argument to encode_sequence so not to use sequence_args to pass sa_session] Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: pass *session in seq_args and seq_res] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: encode minorversion in compound headerBenny Halevy
Signed-off-by: Andy Adamdon <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: pass *session in seq_args and seq_res] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17NFS: use dynamically computed compound_hdr.replen for xdr_inline_pages offsetBenny Halevy
As Trond suggested, rather than passing a constant to xdr_inline_pages, keep a running count of the expected reply bytes. In preparation for nfs41, where additional op sequence are expteced when talking to nfs41 servers. [NFS: cb_compoundhdr.replen is in words not bytes] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: get fs_locations replen before encoding the GETATTR] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: get getacl replen before encoding the GETATTR] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17NFS: update hdr->replen for every encode opBenny Halevy
Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17NFS: define and initialize compound_hdr.replenBenny Halevy
replen holds the running count of expected reply bytes. repl will then be used by encoding routines for xdr_inline_pages offset after which data bytes are to be received directly into the xdr buffer pages. NOTE: According to the nfsv4 and v4.1 RFCs, the replied tag SHOULD be the same is the one sent, but this is not required as a MUST for the server to do so. The server may screw us if it replies a tag of a different length in the compound result. [NFS: cb_compoundhdr.replen is in words not bytes] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17NFS: use decode_change_info_maxsz for xdr maxsz calculationsBenny Halevy
Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: set up seq_res.sr_slotidAndy Adamson
Initialize nfs4_sequence_res sr_slotid to NFS4_MAX_SLOT_TABLE. [was nfs41: sequence res use slotid] Signed-off-by: Andy Adamson <andros@netapp.com> [pulled definition of struct nfs4_sequence_res.sr_slotid to here] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: nfs41: pass *session in seq_args and seq_resBenny Halevy
To be used for getting the rpc's minorversion and for nfs41 xdr {en,de}coding of the sequence operation. Reset the seq session ptrs for minorversion=0 rpc calls. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: introduce nfs4_call_syncAndy Adamson
Use nfs4_call_sync rather than rpc_call_sync to provide for a nfs41 sessions-enabled interface for sessions manipulation. The nfs41 rpc logic uses the rpc_call_prepare method to recover and create the session, as well as selecting a free slot id and the rpc_call_done to free the slot and update slot table related metadata. In the coming patches we'll add rpc prepare and done routines for setting up the sequence op and processing the sequence result. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: nfs4_call_sync] As per 11-14-08 review. Squash into "nfs41: introduce nfs4_call_sync" and "nfs41: nfs4_setup_sequence" Define two functions one for v4 and one for v41 add a pointer to struct nfs4_client to the correct one. Signed-off-by: Andy Adamson <andros@netapp.com> [added BUG() in _nfs4_call_sync_session if !CONFIG_NFS_V4_1] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: check for session not minorversion] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [group minorversion specific stuff together] Signed-off-by: Alexandros Batsakis <Alexandros.Batsakis@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Andy Adamson <andros@netapp.com> [nfs41: fixup nfs4_clear_client_minor_version] [introduce nfs4_init_client_minor_version() in this patch] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [cleaned-up patch: got rid of nfs_call_sync_t, dprintks, cosmetics, extra server defs] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: use nfs4_fs_locations_resBenny Halevy
In preparation for nfs41 sequence processing. Signed-off-by: Andy Admason <andros@netapp.com> [find nfs4_fs_locations_res] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: use nfs4_setaclresBenny Halevy
In preparation for nfs41 sequence processing. Signed-off-by: Andy Admason <andros@netapp.com> [define nfs_setaclres] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17NFS: get rid of unused xdr decode_setattr(, res) argumentBenny Halevy
Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: use nfs4_getaclresBenny Halevy
In preparation for nfs41 sequence processing. Signed-off-by: Andy Admason <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: embed resp_len in nfs_getaclres] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: use nfs4_pathconf_resBenny Halevy
In preparation for nfs41 sequence processing. Signed-off-by: Andy Admason <andros@netapp.com> [define nfs4_pathconf_res] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: use nfs4_fsinfo_resBenny Halevy
In preparation for nfs41 sequence processing. Signed-off-by: Andy Admason <andros@netapp.com> [define nfs4_fsinfo_res] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: use nfs4_statfs_resBenny Halevy
In preparation for nfs41 sequence processing. Signed-off-by: Andy Admason <andros@netapp.com> [define nfs4_statfs_res] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: use nfs4_readlink_resBenny Halevy
In preparation for nfs41 sequence processing. Signed-off-by: Andy Admason <andros@netapp.com> [define nfs4_readlink_res] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: use nfs4_server_caps_argBenny Halevy
In preparation for nfs41 sequence processing. Signed-off-by: Andy Admason <andros@netapp.com> [define nfs4_server_caps_arg] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: sessions client infrastructureAndy Adamson
NFSv4.1 Sessions basic data types, initialization, and destruction. The session is always associated with a struct nfs_client that holds the exchange_id results. Signed-off-by: Rahul Iyer <iyer@netapp.com> Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [remove extraneous rpc_clnt pointer, use the struct nfs_client cl_rpcclient. remove the rpc_clnt parameter from nfs4 nfs4_init_session] Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Use the presence of a session to determine behaviour instead of the minorversion number.] Signed-off-by: Andy Adamson <andros@netapp.com> [constified nfs4_has_session's struct nfs_client parameter] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Rename nfs4_put_session() to nfs4_destroy_session() and call it from nfs4_free_client() not nfs4_free_server(). Also get rid of nfs4_get_session() and the ref_count in nfs4_session struct as keeping track of nfs_client should be sufficient] Signed-off-by: Alexandros Batsakis <Alexandros.Batsakis@netapp.com> [nfs41: pass rsize and wsize into nfs4_init_session] Signed-off-by: Andy Adamson <andros@netapp.com> [separated out removal of rpc_clnt parameter from nfs4_init_session ot a patch of its own] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Pass the nfs_client pointer into nfs4_alloc_session] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: don't assign to session->clp->cl_session in nfs4_destroy_session] [nfs41: fixup nfs4_clear_client_minor_version] [introduce nfs4_clear_client_minor_version() in this patch] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Refactor nfs4_init_session] Moved session allocation into nfs4_init_client_minor_version, called from nfs4_init_client. Leave rwise and wsize initialization in nfs4_init_session, called from nfs4_init_server. Reverted moving of nfs_fsid definition to nfs_fs_sb.h Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: Move NFS4_MAX_SLOT_TABLE define from under CONFIG_NFS_V4_1] [Fix comile error when CONFIG_NFS_V4_1 is not set.] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [moved nfs4_init_slot_table definition to "create_session operation"] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: alloc session with GFP_KERNEL] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: translate NFS4ERR_MINOR_VERS_MISMATCH to EPROTONOSUPPORTBenny Halevy
To be returned to the mount command when trying to mount a v4 server using minorversion 1. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: Use mount minorversion optionBenny Halevy
Use the mount minorversion option to initialize the nfs_client cl_minorversion and match it in nfs_match_client() when looking up a nfs_client. [nfs41: remove ifdefs around nfs_client_initdata.minorversion] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: nfs_client.cl_minorversionBenny Halevy
This field is set to the nfsv4 minor version for this mount. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Note: This patch sets the referral to the same minorversion as the current mount. Revisit in future patch. Signed-off-by: Andy Adamson <andros@netapp.com> [removed cl_minorversion assignment in nfs_set_client] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [always define nfs_client.cl_minorversion] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: add mount command option minorversionMike Sager
mount -t nfs4 -o minorversion=[0|1] specifies whether to use 4.0 or 4.1. By default, the minorversion is set to 0. Signed-off-by: Mike Sager <sager@netapp.com> [set default minorversion to 0 as per Trond and SteveD's request] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: Add Kconfig symbols for NFSv4.1Ricardo Labiaga
Added CONFIG_NFS_V4_1 and made it depend upon CONFIG_NFS_V4 and EXPERIMENTAL. Indicate that CONFIG_NFS_V4_1 is for NFS developers at the moment At the moment we're expecting folks trying out nfs41 to actively participate in the development process by helping us debug issues and ideally send patches to fix problems. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-09jbd: fix race in buffer processing in commit codeJan Kara
In commit code, we scan buffers attached to a transaction. During this scan, we sometimes have to drop j_list_lock and then we recheck whether the journal buffer head didn't get freed by journal_try_to_free_buffers(). But checking for buffer_jbd(bh) isn't enough because a new journal head could get attached to our buffer head. So add a check whether the journal head remained the same and whether it's still at the same transaction and list. This is a nasty bug and can cause problems like memory corruption (use after free) or trigger various assertions in JBD code (observed). Signed-off-by: Jan Kara <jack@suse.cz> Cc: <stable@kernel.org> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-09autofs4: remove hashed check in validate_wait()Ian Kent
The recent ->lookup() deadlock correction required the directory inode mutex to be dropped while waiting for expire completion. We were concerned about side effects from this change and one has been identified. I saw several error messages. They cause autofs to become quite confused and don't really point to the actual problem. Things like: handle_packet_missing_direct:1376: can't find map entry for (43,1827932) which is usually totally fatal (although in this case it wouldn't be except that I treat is as such because it normally is). do_mount_direct: direct trigger not valid or already mounted /test/nested/g3c/s1/ss1 which is recoverable, however if this problem is at play it can cause autofs to become quite confused as to the dependencies in the mount tree because mount triggers end up mounted multiple times. It's hard to accurately check for this over mounting case and automount shouldn't need to if the kernel module is doing its job. There was one other message, similar in consequence of this last one but I can't locate a log example just now. When checking if a mount has already completed prior to adding a new mount request to the wait queue we check if the dentry is hashed and, if so, if it is a mount point. But, if a mount successfully completed while we slept on the wait queue mutex the dentry must exist for the mount to have completed so the test is not really needed. Mounts can also be done on top of a global root dentry, so for the above case, where a mount request completes and the wait queue entry has already been removed, the hashed test returning false can cause an incorrect callback to the daemon. Also, d_mountpoint() is not sufficient to check if a mount has completed for the multi-mount case when we don't have a real mount at the base of the tree. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-06integrity: fix IMA inode leakHugh Dickins
CONFIG_IMA=y inode activity leaks iint_cache and radix_tree_node objects until the system runs out of memory. Nowhere is calling ima_inode_free() a.k.a. ima_iint_delete(). Fix that by calling it from destroy_inode(). Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-06ext3/4 with synchronous writes gets wedged by PostfixAl Viro
OK, that's probably the easiest way to do that, as much as I don't like it... Since iget() et.al. will not accept I_FREEING (will wait to go away and restart), and since we'd better have serialization between new/free on fs data structures anyway, we can afford simply skipping I_FREEING et.al. in insert_inode_locked(). We do that from new_inode, so it won't race with free_inode in any interesting ways and it won't race with iget (of any origin; nfsd or in case of fs corruption a lookup) since both still will wait for I_LOCK. Reviewed-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Jan Kara <jack@suse.cz> Tested-by: David Watson <dbwatson@ukfsn.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-06Fix nobh_truncate_page() to not pass stack garbage to get_block()Theodore Ts'o
The nobh_truncate_page() function is used by ext2, exofs, and jfs. Of these three, only ext2 and jfs's get_block() function pays attention to bh->b_size --- which is normally always the filesystem blocksize except when the get_block() function is called by either mpage_readpage(), mpage_readpages(), or the direct I/O routines in fs/direct_io.c. Unfortunately, nobh_truncate_page() does not initialize map_bh before calling the filesystem-supplied get_block() function. So ext2 and jfs will try to calculate the number of blocks to map by taking stack garbage and shifting it left by inode->i_blkbits. This should be *mostly* harmless (except the filesystem will do some unnneeded work) unless the stack garbage is less than filesystem's blocksize, in which case maxblocks will be zero, and the attempt to find out whether or not the filesystem has a hole at a given logical block will fail, and the page cache entry might not get zero'ed out. Also if the stack garbage in in map_bh->state happens to have the BH_Mapped bit set, there could be an attempt to call readpage() on a non-existent page, which could cause nobh_truncate_page() to return an error when it should not. Fix this by initializing map_bh->state and map_bh->size. Fortunately, it's probably fairly unlikely that ext2 and jfs users mount with nobh these days. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstableLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: Fix oops and use after free during space balancing Btrfs: set device->total_disk_bytes when adding new device
2009-06-04Btrfs: Fix oops and use after free during space balancingChris Mason
The btrfs allocator uses list_for_each to walk the available block groups when searching for free blocks. It starts off with a hint to help find the best block group for a given allocation. The hint is resolved into a block group, but we don't properly check to make sure the block group we find isn't in the middle of being freed due to filesystem shrinking or balancing. If it is being freed, the list pointers in it are bogus and can't be trusted. But, the code happily goes along and uses them in the list_for_each loop, leading to all kinds of fun. The fix used here is to check to make sure the block group we find really is on the list before we use it. list_del_init is used when removing it from the list, so we can do a proper check. The allocation clustering code has a similar bug where it will trust the block group in the current free space cluster. If our allocation flags have changed (going from single spindle dup to raid1 for example) because the drives in the FS have changed, we're not allowed to use the old block group any more. The fix used here is to check the current cluster against the current allocation flags. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-06-04Btrfs: set device->total_disk_bytes when adding new deviceYan Zheng
It was not being properly initialized, and so the size saved to disk was not correct. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-06-02Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds
* 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: prevent deadlock in xfs_qm_shake() xfs: fix overflow in xfs_growfs_data_private xfs: fix double unlock in xfs_swap_extents()
2009-06-01xfs: prevent deadlock in xfs_qm_shake()Felix Blyakher
It's possible to recurse into filesystem from the memory allocation, which deadlocks in xfs_qm_shake(). Add check for __GFP_FS, and bail out if it is not set. Signed-off-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Hedi Berriche <hedi@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-06-01xfs: fix overflow in xfs_growfs_data_privateEric Sandeen
In the case where growing a filesystem would leave the last AG too small, the fixup code has an overflow in the calculation of the new size with one fewer ag, because "nagcount" is a 32 bit number. If the new filesystem has > 2^32 blocks in it this causes a problem resulting in an EINVAL return from growfs: # xfs_io -f -c "truncate 19998630180864" fsfile # mkfs.xfs -f -bsize=4096 -dagsize=76288719b,size=3905982455b fsfile # mount -o loop fsfile /mnt # xfs_growfs /mnt meta-data=/dev/loop0 isize=256 agcount=52, agsize=76288719 blks = sectsz=512 attr=2 data = bsize=4096 blocks=3905982455, imaxpct=5 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 log =internal bsize=4096 blocks=32768, version=2 = sectsz=512 sunit=0 blks, lazy-count=0 realtime =none extsz=4096 blocks=0, rtextents=0 xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Invalid argument Reported-by: richard.ems@cape-horn-eng.com Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-06-01xfs: fix double unlock in xfs_swap_extents()Felix Blyakher
Regreesion from commit ef8f7fc, which rearranged the code in xfs_swap_extents() leading to double unlock of xfs inode ilock. That resulted in xfs_fsr deadlocking itself on platforms, which don't handle double unlock of rw_semaphore nicely. It caused the count go negative, which represents the write holder, without really having one. ia64 is one of the platforms where deadlock was easily reproduced and the fix was tested. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-05-30Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2: nilfs2: fix bh leak in nilfs_cpfile_delete_checkpoints function
2009-05-30nilfs2: fix bh leak in nilfs_cpfile_delete_checkpoints functionRyusuke Konishi
The nilfs_cpfile_delete_checkpoints() wrongly skips brelse() for the header block of checkpoint file in case of errors. This fixes the leak bug. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-05-29Merge git://git.infradead.org/~dwmw2/mtd-2.6.30Linus Torvalds
* git://git.infradead.org/~dwmw2/mtd-2.6.30: jffs2: Fix corruption when flash erase/write failure mtd: MXC NAND driver fixes (v5)
2009-05-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: Driver Core: do not oops when driver_unregister() is called for unregistered drivers sysfs: file.c: use create_singlethread_workqueue()
2009-05-29Merge branch 'for-2.6.30' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
* 'for-2.6.30' of git://linux-nfs.org/~bfields/linux: svcrdma: dma unmap the correct length for the RPCRDMA header page. nfsd: Revert "svcrpc: take advantage of tcp autotuning" nfsd: fix hung up of nfs client while sync write data to nfs server
2009-05-29flat: fix data sections alignmentOskar Schirmer
The flat loader uses an architecture's flat_stack_align() to align the stack but assumes word-alignment is enough for the data sections. However, on the Xtensa S6000 we have registers up to 128bit width which can be used from userspace and therefor need userspace stack and data-section alignment of at least this size. This patch drops flat_stack_align() and uses the same alignment that is required for slab caches, ARCH_SLAB_MINALIGN, or wordsize if it's not defined by the architecture. It also fixes m32r which was obviously kaput, aligning an uninitialized stack entry instead of the stack pointer. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Oskar Schirmer <os@emlix.com> Cc: David Howells <dhowells@redhat.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Bryan Wu <cooloney@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Paul Mundt <lethal@linux-sh.org> Cc: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Johannes Weiner <jw@emlix.com> Acked-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-29procfs: make errno values consistent when open pident vs exit(2) race occursKOSAKI Motohiro
proc_pident_instantiate() has following call flow. proc_pident_lookup() proc_pident_instantiate() proc_pid_make_inode() And, proc_pident_lookup() has following error handling. const struct pid_entry *p, *last; error = ERR_PTR(-ENOENT); if (!task) goto out_no_task; Then, proc_pident_instantiate should return ENOENT too when racing against exit(2) occur. EINAL has two bad reason. - it implies caller is wrong. bad the race isn't caller's mistake. - man 2 open don't explain EINVAL. user often don't handle it. Note: Other proc_pid_make_inode() caller already use ENOENT properly. Acked-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-29jffs2: Fix corruption when flash erase/write failureJoakim Tjernlund
Erase errors such as: "Newly-erased block contained word 0xa4ef223e at offset 0x0296a014" and failure to write the clean marker, moves the offending erase block to erasing list before calling jffs2_erase_failed(). This is bad as jffs2_erase_failed() will also move the block to the bad_list, but is now moving the wrong block, causing FS corruption. Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-05-28sysfs: file.c: use create_singlethread_workqueue()Andrew Morton
We don't need a kernel thread per CPU for this application. Acked-by: Alex Chiang <achiang@hp.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-27nfsd: fix hung up of nfs client while sync write data to nfs serverWei Yongjun
Commit 'Short write in nfsd becomes a full write to the client' (31dec2538e45e9fff2007ea1f4c6bae9f78db724) broken the sync write. With the following commands to reproduce: $ mount -t nfs -o sync 192.168.0.21:/nfsroot /mnt $ cd /mnt $ echo aaaa > temp.txt Then nfs client is hung up. In SYNC mode the server alaways return the write count 0 to the client. This is because the value of host_err in nfsd_vfs_write() will be overwrite in SYNC mode by 'host_err=nfsd_sync(file);', and then we return host_err(which is now 0) as write count. This patch fixed the problem. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>