From e8224e4b804b4fd26723191c1891101a5959bb8a Mon Sep 17 00:00:00 2001 From: Yossi Etigin Date: Tue, 16 Sep 2008 11:57:45 -0700 Subject: IPoIB: Fix deadlock on RTNL between bcast join comp and ipoib_stop() Taking rtnl_lock in ipoib_mcast_join_complete() causes a deadlock with ipoib_stop(). We avoid it by scheduling the piece of code that takes the lock on ipoib_workqueue instead of executing it directly. This works because we only flush the ipoib_workqueue with the RTNL not held. The deadlock happens because ipoib_stop() calls ipoib_ib_dev_down() which calls ipoib_mcast_dev_flush(), which calls ipoib_mcast_free(), which calls ipoib_mcast_leave(). The latter calls ib_sa_free_multicast(), and this waits until the multicast completion handler finishes. This handler is ipoib_mcast_join_complete(), which waits for the rtnl_lock(), which was already taken by ipoib_stop(). This bug was introduced in commit a77a57a1 ("IPoIB: Fix deadlock on RTNL in ipoib_stop()"). Signed-off-by: Yossi Etigin Signed-off-by: Roland Dreier --- drivers/infiniband/ulp/ipoib/ipoib.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'drivers/infiniband/ulp/ipoib/ipoib.h') diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index b0ffc9abe8c..05eb41b8ab6 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -293,6 +293,7 @@ struct ipoib_dev_priv { struct delayed_work pkey_poll_task; struct delayed_work mcast_task; + struct work_struct carrier_on_task; struct work_struct flush_light; struct work_struct flush_normal; struct work_struct flush_heavy; @@ -464,6 +465,7 @@ int ipoib_dev_init(struct net_device *dev, struct ib_device *ca, int port); void ipoib_dev_cleanup(struct net_device *dev); void ipoib_mcast_join_task(struct work_struct *work); +void ipoib_mcast_carrier_on_task(struct work_struct *work); void ipoib_mcast_send(struct net_device *dev, void *mgid, struct sk_buff *skb); void ipoib_mcast_restart_task(struct work_struct *work); -- cgit v1.2.3