aboutsummaryrefslogtreecommitdiff
path: root/fs/dlm/lowcomms.c
AgeCommit message (Collapse)Author
2008-01-29dlm: close otherconsPatrick Caulfeld
This patch addresses a problem introduced with the last round of lowcomms patches where the 'othercon' connections do not get freed when the DLM shuts down. This results in the error message "slab error in kmem_cache_destroy(): cache `dlm_conn': Can't free all objects" and the DLM cannot be restarted without a system reboot. See bz#428119 Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com> Signed-off-by: David Teigland <teigland@redhat.com>
2008-01-29dlm: bind connections from known local address when using TCPLon Hohberger
A common problem occurs when multiple IP addresses within the same subnet are assigned to the same NIC. If we make a connection attempt to another address on the same subnet as one of those addresses, the connection attempt will not necessarily be routed from the address we want. In the case of the DLM, the other nodes will quickly drop the connection attempt, causing problems. This patch makes the DLM bind to the local address it acquired from the cluster manager when using TCP prior to making a connection, obviating the need for administrators to "fix" their systems or use clever routing tricks. Signed-off-by: Lon Hohberger <lhh@redhat.com> Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2007-11-07[DLM] lowcomms: Do not muck with sysctl_rmem_max.David S. Miller
Use SO_RCVBUFFORCE instead. Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[DLM] Make dlm_sendd cond_resched morePatrick Caulfield
Under high recovery loads dlm_sendd can monopolise the CPU and cause soft lockups. This one extra and one moved cond_resched() make it yield a little more during such times keeping work moving. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10[DLM] Fix lowcomms socket closingPatrick Caulfield
This patch fixes the slight mess made in lowcomms closing by previous patches and fixes all sorts of DLM hangs. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14[DLM] More othercon fixesPatrick Caulfield
The last patch to clean out 'othercon' structures only fixed half the problem. The attached addresses the other situations too, and fixes bz#238490 Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14[DLM] zero unused parts of sockaddr_storagePatrick Caulfield
When we build a sockaddr_storage for an IP address, clear the unused parts as they could be used for node comparisons. I have seen this occasionally make sctp connections fail. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14[DLM] Clear othercon pointers when a connection is closedPatrick Caulfield
This patch clears the othercon pointer and frees the memory when a connnection is closed. This could cause a small memory leak when nodes leave the cluster. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-20mm: Remove slab destructors from kmem_cache_create().Paul Mundt
Slab destructors were no longer supported after Christoph's c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-07-09[GFS2] git-gfs2-nmw-build-fixakpm@linux-foundation.org
Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[DLM] Telnet to port 21064 can stop all lockspacesPatrick Caulfield
This patch fixes Red Hat bz#245892 Opening a tcp connection from a cluster member to another cluster member targeting the dlm port it is enough to stop every dlm operation in the cluster. This means that GFS and rgmanager will hang. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[DLM] fix socket shutdownPatrick Caulfield
This patch clears the user_data of active sockets as part of cleanup. This prevents any late-arriving data from trying to add jobs to the work queue while we are tidying up. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-Off-By: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01[DLM] lowcomms styleDavid Teigland
Replace some printk with log_print, and fix some simple cases of lines over 80. Also, return -ENOTCONN if lowcomms_start fails due to no local IP address being available. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01[DLM] Lowcomms nodeid range & initialisation fixesPatrick Caulfield
Fix a few range & initialization bugs in lowcomms. - max_nodeid is really the highest nodeid encountered, so all loops must include it in their iterations. - clean dlm_local_count & connection_idr so we can do a clean restart. - Remove a spurious BUG_ON Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01[DLM] Fix dlm_lowcoms_stop hangJosef Bacik
When you attempt to release a lockspace in DLM, it will hang trying to down a semaphore that has already been downed. The attached patch fixes the problem. Signed-off-by: Josef Bacik <jwhiter@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Patrick Caulfield <pcaulfie@redhat.com>
2007-05-01[DLM] Consolidate transport protocolsPatrick Caulfield
This patch consolidates the TCP & SCTP protocols for the DLM into a single file and makes it switchable at run-time (well, at least before the DLM actually starts up!) For RHEL5 this patch requires Neil Horman's patch that expands the in-kernel socket API but that has already been twice ACKed so it should be OK. The patch adds a new lowcomms.c file that replaces the existing lowcomms-sctp.c & lowcomms-tcp.c files. Signed-off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[DLM] Add support for tcp communicationsPatrick Caulfield
The following patch adds a TCP based communications layer to the DLM which is compile time selectable. The existing SCTP layer gives the advantage of allowing multihoming, whereas the TCP layer has been heavily tested in previous versions of the DLM and is known to be robust and therefore can be used as a baseline for performance testing. Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-20[DLM] fix iovec length in recvmsgPatrick Caulfield
I didn't spot that the msg_iovlen was set to 2 if there were two elements in the iovec but left at zero if not :( I think this might be why bob was still seeing trouble. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-12[DLM] fix iovec length in recvmsgPatrick Caulfield
The DLM always passes the iovec length as 1, this is wrong when the circular buffer wraps round. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-09[PATCH] dlm gfp_t annotationsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-11[DLM] move kmap to after spin_unlockDavid Teigland
Doing the kmap() while holding the spinlock was causing recursive spinlock problems. It seems the kmap was scheduling, although there was no warning as I'd expect. Patrick, do we need locking around the kmap? Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-19[DLM] init rwsem earlierDavid Teigland
The nodeinfo_lock rwsem needs to be initialized when the module is loaded instead of when the dlm is first used. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-25[GFS2] Change name due to local_nodeid being a macroSteven Whitehouse
Change names of local_nodeid to dlm_local_nodeid to prevent a namespace collision. Changed other local variable to match. Cc: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-28[DLM] PATCH 2/3 dlm: lowcomms closeDavid Teigland
When a node is removed from a lockspace configuration, close our connection to it, clearing any remaining messages for it. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-18[DLM] The core of the DLM for GFS2/CLVMDavid Teigland
This is the core of the distributed lock manager which is required to use GFS2 as a cluster filesystem. It is also used by CLVM and can be used as a standalone lock manager independantly of either of these two projects. It implements VAX-style locking modes. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steve Whitehouse <swhiteho@redhat.com>