author: Jeff Layton <jlayton@redhat.com> 2008-02-06 11:34:10 -0500
committer: J. Bruce Fields <bfields@citi.umich.edu> 2008-02-10 18:09:36 -0500
commit: 031fd3aa20fcf6d1862ea7814ee8b2caf36c0d78
tree: b60252860730b3f9b4578db1dc891497e0a00722
parent: 551e4fb2465b87de9d4aa1669b27d624435443bb

NLM: set RPC_CLNT_CREATE_NOPING for NLM RPC clients

It's currently possible for an unresponsive NLM client to completely lock
up a server's lockd. The scenario is something like this:

1) client1 (or a process on the server) takes a lock on a file
2) client2 tries to take a blocking lock on the same file and
   awaits the callback
3) client2 goes unresponsive (plug pulled, network partition, etc)
4) client1 releases the lock

...at that point the server's lockd will try to queue up a GRANT_MSG
callback for client2, but first it requeues the block with a timeout of
30s. nlm_async_call will attempt to bind the RPC client to client2 and
will call rpc_ping. rpc_ping entails a sync RPC call and if client2 is
unresponsive it will take around 60s for that to time out. Once it times
out, it's already time to retry the block and the whole process repeats.

Once in this situation, nlmsvc_retry_blocked will never return until
the host starts responding again. lockd won't service new calls.

Fix this by skipping the RPC ping on NLM RPC clients. This makes
nlm_async_call return quickly when called.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

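The timing argument in the message can be checked with a small userspace model. This is only a sketch under stated assumptions: the 30s requeue and roughly 60s ping timeout come from the message, while the function names (callback_cost, idle_seconds), the single-threaded loop, and the zero cost assigned to a callback without a ping are hypothetical simplifications, not lockd's actual code.

```c
#include <assert.h>

/* Timings quoted in the commit message; everything else is a model. */
#define BLOCK_RETRY_SECS  30  /* block is requeued with a 30s timeout */
#define PING_TIMEOUT_SECS 60  /* approx. time for the sync ping to give up */

/*
 * Hypothetical cost, in seconds, of one callback attempt to the
 * unresponsive client. Without the ping the async call is modelled
 * as returning immediately.
 */
static int callback_cost(int ping_enabled)
{
    return ping_enabled ? PING_TIMEOUT_SECS : 0;
}

/*
 * Simulate `horizon` seconds of a single-threaded daemon and return how
 * many of those seconds it is free to service other requests.
 */
static int idle_seconds(int ping_enabled, int horizon)
{
    int now = 0;        /* simulated clock */
    int next_retry = 0; /* time at which the block is next due */
    int idle = 0;

    assert(horizon >= 0);

    while (now < horizon) {
        if (now >= next_retry) {
            /* Requeue the block first, then attempt the callback. */
            next_retry = now + BLOCK_RETRY_SECS;
            now += callback_cost(ping_enabled);
        } else {
            /* Nothing is due: the daemon can handle other calls. */
            int until = next_retry < horizon ? next_retry : horizon;
            idle += until - now;
            now = until;
        }
    }
    return idle;
}
```

In this model, each callback attempt with the ping enabled costs more than the retry interval, so the block is always due again by the time the attempt returns and idle_seconds(1, 600) is 0. With the ping disabled, idle_seconds(0, 600) is 600.
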
-rw-r--r-- fs/lockd/host.c 1
1 files changed, 1 insertions, 0 deletions
diff --git a/fs/lockd/host.c b/fs/lockd/host.c
index ca6b16fc310..00063ee0b55 100644
--- a/fs/lockd/host.c
+++ b/fs/lockd/host.c
@@ -244,6 +244,7 @@ nlm_bind_host(struct nlm_host *host)
 		.version	= host->h_version,
 		.authflavor	= RPC_AUTH_UNIX,
 		.flags		= (RPC_CLNT_CREATE_HARDRTRY |
+				   RPC_CLNT_CREATE_NOPING |
 				   RPC_CLNT_CREATE_AUTOBIND),
 	};
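
For readers unfamiliar with flag masks, the following userspace sketch shows how an extra bit OR'ed into a flags field can switch off a step such as the initial ping. The MODEL_* values, struct model_create_args, and model_should_ping are hypothetical names invented for this illustration. They are not the kernel's definitions, and the real check in the RPC client creation path may be structured differently.

```c
#include <stdbool.h>

/* Hypothetical bit values for this model only; not the kernel's. */
#define MODEL_CREATE_HARDRTRY (1UL << 0)
#define MODEL_CREATE_AUTOBIND (1UL << 1)
#define MODEL_CREATE_NOPING   (1UL << 2)

/* Stand-in for the creation arguments; only the flags field is modelled. */
struct model_create_args {
    unsigned long flags;
};

/* The ping step runs only when the NOPING bit is absent from the mask. */
static bool model_should_ping(const struct model_create_args *args)
{
    return (args->flags & MODEL_CREATE_NOPING) == 0;
}
```

With only the HARDRTRY and AUTOBIND bits set, model_should_ping returns true. Adding the NOPING bit to the same mask makes it return false while leaving the other two bits untouched.
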