aboutsummaryrefslogtreecommitdiff
path: root/fs/nfs
AgeCommit message (Collapse)Author
2009-08-14nfs: nfs4xdr: get rid of COPYMEMBenny Halevy
Just directly call memcpy. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-14nfs: nfs4xdr: introduce decode_sessionid helperBenny Halevy
Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-14nfs: nfs4xdr: introduce decode_verifier helperBenny Halevy
Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Trond: Fixed up an 'uninitialised variable' issue in decode_readdir] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-14nfs: nfs4xdr: introduce decode_opaque_fixed and decode_stateid helpersBenny Halevy
Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-14nfs: nfs4xdr: introduce print_overflow_msgBenny Halevy
Part fo the nfs4xdr cleanup. READ_BUF will go away. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-14nfs: nfs4xdr: get rid of READTIMEBenny Halevy
It has no users. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-14nfs: nfs4xdr: get rid of READ64Benny Halevy
s/READ64\(\*(.*)\)/p = xdr_decode_hyper(p, \1)/ s/READ64\((.*)\)/p = xdr_decode_hyper(p, &\1)/ Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-14nfs: nfs4xdr: get rid of READ32Benny Halevy
s/READ32\((.*)\)/\1 = be32_to_cpup(p++)/ Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-14nfs: nfs4xdr: merge xdr_encode_int+xdr_encode_opaque_fixed into ↵Benny Halevy
xdr_encode_opaque use encode_string where appropriate. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-14nfs: nfs4xdr: optimize low level encodingBenny Halevy
do not increment encoding ptr if not needed. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-14nfs: nfs4xdr: change RESERVE_SPACE macro into a static helperBenny Halevy
In order to open code and expose the result pointer assignment. Alternatively, we can open code the call to xdr_reserve_space and do the BUG_ON an the error case at the call site. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-14nfs: nfs4xdr: encode_compound_hdr does not have to round up reserved bytesBenny Halevy
This is already done by xdr_reserve_space and since encode_compound_hdr is adding a byte count to "12" which is already word aligned, the xdr level rounding will work just as well. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-14nfs: nfs4xdr: optimize RESERVE_SPACE in encode_create_session and ↵Benny Halevy
encode_sequence Coalesce multilpe constant RESERVE_SPACEs into one Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-14nfs: nfs4xdr: get rid of WRITEMEMBenny Halevy
s/WRITEMEM(/p = xdr_encode_opaque_fixed(p, / Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-14nfs: nfs4xdr: get rid of WRITE64Benny Halevy
s/WRITE64/p = xdr_encode_hyper(p, / Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-14nfs: nfs4xdr: get rid of WRITE32Benny Halevy
s/WRITE32/*p++ = cpu_to_be32/ Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-12NFS: Fix an O_DIRECT Oops...Trond Myklebust
We can't call nfs_readdata_release()/nfs_writedata_release() without first initialising and referencing args.context. Doing so inside nfs_direct_read_schedule_segment()/nfs_direct_write_schedule_segment() causes an Oops. We should rather be calling nfs_readdata_free()/nfs_writedata_free() in those cases. Looking at the O_DIRECT code, the "struct nfs_direct_req" is already referencing the nfs_open_context for us. Since the readdata and writedata structures carry a reference to that, we can simplify things by getting rid of the extra nfs_open_context references, so that we can replace all instances of nfs_readdata_release()/nfs_writedata_release(). Reported-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-10Merge branch 'sunrpc_cache-for-2.6.32' into nfs-for-2.6.32Trond Myklebust
2009-08-10Merge branch 'patches_cel-for-2.6.32' into nfs-for-2.6.32Trond Myklebust
2009-08-10nfs: remove superfluous BUG_ON()sBartlomiej Zolnierkiewicz
Subject: [PATCH] nfs: remove superfluous BUG_ON()s Remove duplicated BUG_ON()s from nfs[4]_create_server() (we make the same checks earlier in both functions). This takes care of the following entries from Dan's list: fs/nfs/client.c +1078 nfs_create_server(47) warning: variable derefenced before check 'server->nfs_client' fs/nfs/client.c +1079 nfs_create_server(48) warning: variable derefenced before check 'server->nfs_client->rpc_ops' fs/nfs/client.c +1363 nfs4_create_server(43) warning: variable derefenced before check 'server->nfs_client' fs/nfs/client.c +1364 nfs4_create_server(44) warning: variable derefenced before check 'server->nfs_ Reported-by: Dan Carpenter <error27@gmail.com> Cc: corbet@lwn.net Cc: eteo@redhat.com Cc: Julia Lawall <julia@diku.dk> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-10NFS: read-modify-write page updatingPeter Staubach
Hi. I have a proposal for possibly resolving this issue. I believe that this situation occurs due to the way that the Linux NFS client handles writes which modify partial pages. The Linux NFS client handles partial page modifications by allocating a page from the page cache, copying the data from the user level into the page, and then keeping track of the offset and length of the modified portions of the page. The page is not marked as up to date because there are portions of the page which do not contain valid file contents. When a read call comes in for a portion of the page, the contents of the page must be read in the from the server. However, since the page may already contain some modified data, that modified data must be written to the server before the file contents can be read back in the from server. And, since the writing and reading can not be done atomically, the data must be written and committed to stable storage on the server for safety purposes. This means either a FILE_SYNC WRITE or a UNSTABLE WRITE followed by a COMMIT. This has been discussed at length previously. This algorithm could be described as modify-write-read. It is most efficient when the application only updates pages and does not read them. My proposed solution is to add a heuristic to decide whether to do this modify-write-read algorithm or switch to a read- modify-write algorithm when initially allocating the page in the write system call path. The heuristic uses the modes that the file was opened with, the offset in the page to read from, and the size of the region to read. If the file was opened for reading in addition to writing and the page would not be filled completely with data from the user level, then read in the old contents of the page and mark it as Uptodate before copying in the new data. If the page would be completely filled with data from the user level, then there would be no reason to read in the old contents because they would just be copied over. This would optimize for applications which randomly access and update portions of files. The linkage editor for the C compiler is an example of such a thing. I tested the attached patch by using rpmbuild to build the current Fedora rawhide kernel. The kernel without the patch generated about 269,500 WRITE requests. The modified kernel containing the patch generated about 261,000 WRITE requests. Thus, about 8,500 fewer WRITE requests were generated. I suspect that many of these additional WRITE requests were probably FILE_SYNC requests to WRITE a single page, but I didn't test this theory. The difference between this patch and the previous one was to remove the unneeded PageDirty() test. I then retested to ensure that the resulting system continued to behave as desired. Thanx... ps Signed-off-by: Peter Staubach <staubach@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-10NFS: Add a ->migratepage() aop for NFSTrond Myklebust
Make NFS a bit more friendly to NUMA and memory hot removal... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09SUNRPC: Replace rpc_client->cl_dentry and cl_mnt, with a cl_pathTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09SUNRPC: Constify rpc_pipe_ops...Trond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09NFS: Replace nfs_set_port() with rpc_set_port()Chuck Lever
Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09NFS: Replace nfs_parse_ip_address() with rpc_pton()Chuck Lever
Clean up: Use the common routine now provided in sunrpc.ko for parsing mount addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09SUNRPC: Provide functions for managing universal addressesChuck Lever
Introduce a set of functions in the kernel's RPC implementation for converting between a socket address and either a standard presentation address string or an RPC universal address. The universal address functions will be used to encode and decode RPCB_FOO and NFSv4 SETCLIENTID arguments. The other functions are part of a previous promise to deliver shared functions that can be used by upper-layer protocols to display and manipulate IP addresses. The kernel's current address printf formatters were designed specifically for kernel to user-space APIs that require a particular string format for socket addresses, thus are somewhat limited for the purposes of sunrpc.ko. The formatter for IPv6 addresses, %pI6, does not support short-handing or scope IDs. Also, these printf formatters are unique per address family, so a separate formatter string is required for printing AF_INET and AF_INET6 addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09NFS: Use the authentication flavor list returned by mountdChuck Lever
Commit a14017db added support in the kernel's NFS mount client to decode the authentication flavor list returned by mountd. The NFS client can now use this list to determine whether the authentication flavor requested by the user is actually supported by the server. Note we don't actually negotiate the security flavor if none was specified by the user. Instead, we try to use AUTH_SYS, and fail if the server does not support it. This prevents us from negotiating an inappropriate security flavor (some servers list AUTH_NULL first). If the server does not support AUTH_SYS, the user must provide an appropriate security flavor by specifying the "sec=" mount option. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09NFS: Fix auth flavor len accountingChuck Lever
Previous logic in the NFS mount parsing code path assumed auth_flavor_len was set to zero for simple authentication flavors (like AUTH_UNIX), and 1 for compound flavors (like AUTH_GSS). At some earlier point (maybe even before the option parsers were merged?) specific checks for auth_flavor_len being zero were removed from the functions that validate the mount option that sets the mount point's authentication flavor. Since we are populating an array for authentication flavors, the auth_flavor_len should always be set to the number of flavors. Let's eliminate some cleverness here, and prepare for new logic that needs to know the number of flavors in the auth_flavors[] array. (auth_flavors[] is an array because at some point we want to allow a list of acceptable authentication flavors to be specified via the sec= mount option. For now it remains a single element array). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09NFS: Add ability to send MOUNTPROC_UMNT to the kernel's mountd clientChuck Lever
After certain failure modes of an NFS mount, an NFS client should send a MOUNTPROC_UMNT request to remove the just-added mount entry from the server's mount table. While no-one should rely on the accuracy of the server's mount table, sending a UMNT is simply being a good internet neighbor. Since NFS mount processing is handled in the kernel now, we will need a function in the kernel's mountd client that can post a MOUNTRPC_UMNT request, in order to handle these failure modes. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09NFS: Fix up new minorversion= optionChuck Lever
The new minorversion= mount option (commit 3fd5be9e) was merged at the same time as the recent sloppy parser fixes (commit a5a16bae), so minorversion= still uses the old value parsing logic. If the minorversion= option specifies a bogus value, it should fail with "bad value" not "bad option." Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09NFSv4: Clean up the nfs.callback_tcpport optionTrond Myklebust
Tighten up the validity checking in param_set_port: check for NULL pointers. Ensure that the option shows up on 'modinfo' output. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09NFSv4: Don't do idmapper upcalls for asynchronous RPC callsTrond Myklebust
We don't want to cause rpciod to hang... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09NFSv4: Add 'server capability' flags for NFSv4 recommended attributesTrond Myklebust
If the NFSv4 server doesn't support a POSIX attribute, the generic NFS code needs to know that, so that it don't keep trying to poll for it. However, by the same count, if the NFSv4 server does support that attribute, then we should ensure that the inode metadata is appropriately labelled as being untrusted. For instance, if we don't know the correct value of the file's uid, we should certainly not be caching ACLs or ACCESS results. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09NFSv4: Don't loop forever on state recovery failure...Trond Myklebust
If the server is broken, then retrying forever won't fix it. We should just give up after a while, and return an error to the user. We set the number of retries to 10 for now... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09nfs: Keep index within mnt_errtbl[]Roel Kluin
Ensure that index i remains within array mnt_errtbl[] and mnt3_errtbl[]. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-07-21NFSv4: Fix a problem whereby a buggy server can oops the kernelTrond Myklebust
We just had a case in which a buggy server occasionally returns the wrong attributes during an OPEN call. While the client does catch this sort of condition in nfs4_open_done(), and causes the nfs4_atomic_open() to return -EISDIR, the logic in nfs_atomic_lookup() is broken, since it causes a fallback to an ordinary lookup instead of just returning the error. When the buggy server then returns a regular file for the fallback lookup, the VFS allows the open, and bad things start to happen, since the open file doesn't have any associated NFSv4 state. The fix is firstly to return the EISDIR/ENOTDIR errors immediately, and secondly to ensure that we are always careful when dereferencing the nfs_open_context state pointer. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-07-21NFSv4: Fix an NFSv4 mount regressionTrond Myklebust
Commit 008f55d0e019943323c20a03493a2ba5672a4cc8 (nfs41: recover lease in _nfs4_lookup_root) forces the state manager to always run on mount. This is a bug in the case of NFSv4.0, which doesn't require us to send a setclientid until we want to grab file state. In any case, this is completely the wrong place to be doing state management. Moving that code into nfs4_init_session... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-07-21NFSv4: Fix an Oops in nfs4_free_lock_stateTrond Myklebust
The oops http://www.kerneloops.org/raw.php?rawid=537858&msgid= appears to be due to the nfs4_lock_state->ls_state field being uninitialised. This happens if the call to nfs4_free_lock_state() is triggered at the end of nfs4_get_lock_state(). The fix is to move the initialisation of ls_state into the allocator. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-07-12headers: smp_lock.h reduxAlexey Dobriyan
* Remove smp_lock.h from files which don't need it (including some headers!) * Add smp_lock.h to files which do need it * Make smp_lock.h include conditional in hardirq.h It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT This will make hardirq.h inclusion cheaper for every PREEMPT=n config (which includes allmodconfig/allyesconfig, BTW) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-11Revert "fuse: Fix build error" as unnecessaryLinus Torvalds
This reverts commit 097041e576ee3a50d92dd643ee8ca65bf6a62e21. Trond had a better fix, which is the parent of this one ("Fix compile error due to congestion_wait() changes") Requested-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-10fuse: Fix build errorLarry Finger
When building v2.6.31-rc2-344-g69ca06c, the following build errors are found due to missing includes: CC [M] fs/fuse/dev.o fs/fuse/dev.c: In function ‘request_end’: fs/fuse/dev.c:289: error: ‘BLK_RW_SYNC’ undeclared (first use in this function) ... fs/nfs/write.c: In function ‘nfs_set_page_writeback’: fs/nfs/write.c:207: error: ‘BLK_RW_ASYNC’ undeclared (first use in this function) Signed-off-by: Larry Finger@lwfinger.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-10Fix congestion_wait() sync/async vs read/write confusionJens Axboe
Commit 1faa16d22877f4839bd433547d770c676d1d964c accidentally broke the bdi congestion wait queue logic, causing us to wait on congestion for WRITE (== 1) when we really wanted BLK_RW_ASYNC (== 0) instead. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-07-08headers: mnt_namespace.h reduxAlexey Dobriyan
Fix various silly problems wrt mnt_namespace.h: - exit_mnt_ns() isn't used, remove it - done that, sched.h and nsproxy.h inclusions aren't needed - mount.h inclusion was need for vfsmount_lock, but no longer - remove mnt_namespace.h inclusion from files which don't use anything from mnt_namespace.h Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-22NFS: Correct the NFS mount path when following a referralTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-22NFS: Fix nfs_path() to always return a '/' at the beginning of the pathTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-22NFSv4: Replace nfs4_path_walk() with VFS path lookup in a private namespaceTrond Myklebust
As noted in the previous patch, the NFSv4 client mount code currently has several limitations. If the mount path contains symlinks, or referrals, or even if it just contains a '..', then the client code in nfs4_path_walk() will fail with an error. This patch replaces the nfs4_path_walk()-based lookup with a helper function that sets up a private namespace to represent the namespace on the server, then uses the ordinary VFS and NFS path lookup code to walk down the mount path in that namespace. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-22Merge branch 'for-2.6.31' of git://fieldses.org/git/linux-nfsdLinus Torvalds
* 'for-2.6.31' of git://fieldses.org/git/linux-nfsd: (60 commits) SUNRPC: Fix the TCP server's send buffer accounting nfsd41: Backchannel: minorversion support for the back channel nfsd41: Backchannel: cleanup nfs4.0 callback encode routines nfsd41: Remove ip address collision detection case nfsd: optimise the starting of zero threads when none are running. nfsd: don't take nfsd_mutex twice when setting number of threads. nfsd41: sanity check client drc maxreqs nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct NFS: kill off complicated macro 'PROC' sunrpc: potential memory leak in function rdma_read_xdr nfsd: minor nfsd_vfs_write cleanup nfsd: Pull write-gathering code out of nfsd_vfs_write nfsd: track last inode only in use_wgather case sunrpc: align cache_clean work's timer nfsd: Use write gathering only with NFSv2 NFSv4: kill off complicated macro 'PROC' NFSv4: do exact check about attribute specified knfsd: remove unreported filehandle stats counters knfsd: fix reply cache memory corruption knfsd: reply cache cleanups ...
2009-06-20nfs41: Move initialization of nfs4_opendata seq_res to nfs4_init_opendata_resBenny Halevy
nfs4_open_recover_helper clears opendata->o_res before calling nfs4_init_opendata_res, thus causing NFSv4.0 OPEN operations to be sent rather than nfsv4.1. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-18Merge branch 'devel-for-2.6.31' into for-2.6.31Trond Myklebust
Conflicts: fs/nfs/client.c fs/nfs/super.c