aboutsummaryrefslogtreecommitdiff
path: root/drivers/infiniband/ulp/ipoib/ipoib_main.c
AgeCommit message (Collapse)Author
2008-07-14IPoIB: Copy small received SKBs in connected modeEli Cohen
The connected mode implementation in the IPoIB driver has a large overhead in the way SKBs are handled in the receive flow. It usually allocates an SKB with as big as was used in the currently received SKB and moves unused fragments from the old SKB to the new one. This involves a loop on all the remaining fragments and incurs overhead on the CPU. This patch, for small SKBs, allocates an SKB just large enough to contain the received data and copies to it the data from the received SKB. The newly allocated SKB is passed to the stack and the old SKB is reposted. When running netperf, UDP small messages, without this pach I get: UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 14.4.3.178 (14.4.3.178) port 0 AF_INET Socket Message Elapsed Messages Size Size Time Okay Errors Throughput bytes bytes secs # # 10^6bits/sec 114688 128 10.00 5142034 0 526.31 114688 10.00 1130489 115.71 With this patch I get both send and receive at ~315 mbps. The reason that send performance actually slows down is as follows: When using this patch, the overhead of the CPU for handling RX packets is dramatically reduced. As a result, we do not experience RNR NAK messages from the receiver which cause the connection to be closed and reopened again; when the patch is not used, the receiver cannot handle the packets fast enough so there is less time to post new buffers and hence the mentioned RNR NACKs. So what happens is that the application *thinks* it posted a certain number of packets for transmission but these packets are flushed and do not really get transmitted. Since the connection gets opened and closed many times, each time netperf gets the CPU time that otherwise would have been given to IPoIB to actually transmit the packets. This can be verified when looking at the port counters -- the output of ifconfig and the oputput of netperf (this is for the case without the patch): tx packets ========== port counter: 1,543,996 ifconfig: 1,581,426 netperf: 5,142,034 rx packets ========== netperf 1,1304,089 Signed-off-by: Eli Cohen <eli@mellanox.co.il>
2008-07-14RDMA: Remove subversion $Id tagsRoland Dreier
They don't get updated by git and so they're worse than useless. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-29IPoIB: Use separate CQ for UD send completionsEli Cohen
Use a dedicated CQ for UD send completions. Also, do not arm the UD send CQ, which reduces the number of interrupts generated. This patch farther reduces overhead by not calling poll CQ for every posted send WR -- it does polls only when there 16 or more outstanding work requests. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-23IPoIB: Handle 4K IB MTU for UD (datagram) modeShirley Ma
This patch enables IPoIB to use 4K UD messages (when the underlying device and fabrics support a 4K MTU) by using two scatter buffers when PAGE_SIZE is less than or equal to thhe HCA IB MTU size. The first buffer is for IPoIB header + GRH header, and the second buffer is the IPoIB payload, which is 4K-4. Signed-off-by: Shirley Ma <xma@us.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IPoIB: Add basic ethtool supportEli Cohen
Just add the infrastructure so we can add functionality later. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IPoIB: Add LSO supportEli Cohen
For HCAs that support TCP segmentation offload (IB_DEVICE_UD_TSO), set NETIF_F_TSO and use HW LSO to offload TCP segmentation. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB: Use shorter list_splice_init() for brevityRobert P. J. Day
Convert list_splice() + INIT_LIST_HEAD() to the equivalent list_splice_init() Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IPoIB: Use checksum offload support if availableEli Cohen
For HCAs that support checksum offload (ie that set IB_DEVICE_UD_IP_CSUM in the device capabilities flags), have IPoIB set NETIF_F_IP_CSUM and use the HCA to generate and verify IP checksums. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-03-12IPoIB: Allocate priv->tx_ring with vmalloc()Roland Dreier
Commit 7143740d ("IPoIB: Add send gather support") made struct ipoib_tx_buf significantly larger, since the mapping member changed from a single u64 to an array with MAX_SKB_FRAGS + 1 entries. This means that allocating tx_rings with kzalloc() may fail because there is not enough contiguous memory for the new, much bigger size. Fix this regression by allocating the rings with vmalloc() instead. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-08IPoIB: Add high DMA feature flagEli Cohen
All current InfiniBand devices can handle all DMA addresses, and it's hard to imagine anyone would be silly enough to build a new device that couldn't. Therefore, enable the NETIF_F_HIGHDMA feature for IPoIB. This has no effect for no, but is needed when we enable gather/scatter support and checksum stateless offloads. Signed-off-by: Eli Cohen <eli@mellnaox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-04IPoIB: Remove a misleading debug printOr Gerlitz
Commit 732a2170 ("IB/ipoib: Bound the net device to the ipoib_neigh structue") left a misleading debug print (n->dev would be a bond device only if boding is used). Clean it up. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-04IPoIB: Handle bonding failover race for connected neighbours tooOr Gerlitz
Move up the code that checks for a situation where the remote GID stored in the ipoib_neigh is different than the one present in the neighbour (handle gratuitous ARP) or that a bonding fail over has happened but the neighbour still has a pointer to an ipoib_neigh created by a different device than the current slave. This will cause the driver to apply the check also for connected mode neighbours. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-01-25IPoIB: Remove redundant check of netif_queue_stopped() in xmit handlerKrishna Kumar
qdisc_run() now tests for queue_stopped() before calling __qdisc_run(), and the same check is done in every iteration of __qdisc_run(), so another check is not required in the driver xmit. This means that ipoib_start_xmit() no longer needs to test netif_queue_stopped(); the test was added to fix earlier kernels, where the networking stack did not guarantee that the xmit method of an LLTX driver would not be called after the queue was stopped, but current kernels do provide this guarantee. To validate, I put a debug in the TX_BUSY path which never hit with 64 threads running overnight exercising this code a few 100 million times. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-01-25IPoIB/CM: Enable SRQ support on HCAs that support fewer than 16 SG entriesPradeep Satyanarayana
Some HCAs (such as ehca2) support SRQ, but only support fewer than 16 SG entries for SRQs. Currently IPoIB/CM implicitly assumes all HCAs will support 16 SG entries for SRQs (to handle a 64K MTU with 4K pages). This patch removes that restriction by limiting the maximum MTU in connected mode to what the maximum number of SRQ SG entries allows. This patch addresses <https://bugs.openfabrics.org/show_bug.cgi?id=728> Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-01-25IPoIB/cm: Add connected mode support for devices without SRQsPradeep Satyanarayana
Some IB adapters (notably IBM's eHCA) do not implement SRQs (shared receive queues). The current IPoIB connected mode support only works on devices that support SRQs. Fix this by adding support for using the receive queue of each connected mode receive QP. The disadvantage of this compared to using an SRQ is that it means a full queue of receives must be posted for each remote connected mode peer, which means that total memory usage is potentially much higher than when using SRQs. To manage this, add a new module parameter "max_nonsrq_conn_qp" that limits the number of connections allowed per interface. The rest of the changes are fairly straightforward: we use a table of struct ipoib_cm_rx to hold all the active connections, and put the table index of the connection in the high bits of receive WR IDs. This is needed because we cannot rely on the struct ib_wc.qp field for non-SRQ receive completions. Most of the rest of the changes just test whether or not an SRQ is available, and post receives or find received packets in the right place depending on the answer. Cleaning up dead connections actually becomes simpler, because we do not have to do the "last WQE reached" dance that is required to destroy QPs attached to an SRQ. We just move the QP to the error state and wait for all pending receives to be flushed. Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com> [ Completely rewritten and split up, based on Pradeep's work. Several bugs fixed and no doubt several bugs introduced. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-01-25IPoIB: Trivial formatting cleanupsRoland Dreier
Fix whitespace blunders, convert "foo* bar" to "foo *bar", etc. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-11-27IPoIB: Fix oops if xmit is called when priv->broadcast is NULLJack Morgenstein
If a port goes down, ipoib_ib_dev_down() is invoked -- which flushes the mcasts (clearing priv->broadcast) and clearing the path record cache. If ipoib_start_xmit() is then invoked (before the broadcast group is rejoined), a kernel oops results from attempting to access priv->broadcast, which is still unset. Returning NULL from path_rec_create() if priv->broadcast is NULL is a harmless way of bypassing the problem -- the offending packet is simply discarded "without prejudice." Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-23Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: mlx4_core: Increase command timeout for INIT_HCA to 10 seconds IPoIB/cm: Use common CQ for CM send completions IB/uverbs: Fix checking of userspace object ownership IB/mlx4: Sanity check userspace send queue sizes IPoIB: Rewrite "if (!likely(...))" as "if (unlikely(!(...)))" IB/ehca: Enable large page MRs by default IB/ehca: Change meaning of hca_cap_mr_pgsize IB/ehca: Fix ehca_encode_hwpage_size() and alloc_fmr() IB/ehca: Fix masking error in {,re}reg_phys_mr() IB/ehca: Supply QP token for SRQ base QPs IPoIB: Use round_jiffies() for ah_reap_task RDMA/cma: Fix deadlock destroying listen requests RDMA/cma: Add locking around QP accesses IB/mthca: Avoid alignment traps when writing doorbells mlx4_core: Kill mlx4_write64_raw()
2007-10-19IPoIB/cm: Use common CQ for CM send completionsMichael S. Tsirkin
Use the same CQ for CM send completions as for all other IPoIB completions. This means all completions are processed via the same NAPI polling routine. This should help reduce the number of interrupts for bi-directional traffic (such as TCP) and fixes "driver is hogging interrupts" errors reported for IPoIB send side, e.g. <https://bugs.openfabrics.org/show_bug.cgi?id=508> To do this, keep a per-interface counter of outstanding send WRs, and stop the interface when this counter reaches the send queue size to avoid CQ overruns. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-15IB/ipoib: Verify address handle validity on sendMoni Shoua
When the bonding device senses a carrier loss of its active slave it replaces that slave with a new one. In between the times when the carrier of an IPoIB device goes down and ipoib_neigh is destroyed, it is possible that the bonding driver will send a packet on a new slave that uses an old ipoib_neigh. This patch detects and prevents this from happenning. Signed-off-by: Moni Shoua <monis at voltaire.com> Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com> Acked-by: Roland Dreier <rdreier@cisco.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-15IB/ipoib: Bound the net device to the ipoib_neigh structueMoni Shoua
IPoIB uses a two layer neighboring scheme, such that for each struct neighbour whose device is an ipoib one, there is a struct ipoib_neigh buddy which is created on demand at the tx flow by an ipoib_neigh_alloc(skb->dst->neighbour) call. When using the bonding driver, neighbours are created by the net stack on behalf of the bonding (master) device. On the tx flow the bonding code gets an skb such that skb->dev points to the master device, it changes this skb to point on the slave device and calls the slave hard_start_xmit function. Under this scheme, ipoib_neigh_destructor assumption that for each struct neighbour it gets, n->dev is an ipoib device and hence netdev_priv(n->dev) can be casted to struct ipoib_dev_priv is buggy. To fix it, this patch adds a dev field to struct ipoib_neigh which is used instead of the struct neighbour dev one, when n->dev->flags has the IFF_MASTER bit set. Signed-off-by: Moni Shoua <monis at voltaire.com> Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com> Acked-by: Roland Dreier <rdreier@cisco.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-11Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (87 commits) mlx4_core: Fix section mismatches IPoIB: Allow setting policy to ignore multicast groups IB/mthca: Mark error paths as unlikely() in post_srq_recv functions IB/ipath: Minor fix to ordering of freeing and zeroing of tid pages. IB/ipath: Remove redundant link state checks IB/ipath: Fix IB_EVENT_PORT_ERR event IB/ipath: Better handling of unexpected GPIO interrupts IB/ipath: Maintain active time on all chips IB/ipath: Fix QHT7040 serial number check IB/ipath: Indicate a couple of chip bugs to userspace IB/ipath: iba6110 rev4 no longer needs recv header overrun workaround IB/ipath: Use counters in ipath_poll and cleanup interrupts in ipath_close IB/ipath: Remove duplicate copy of LMC IB/ipath: Add ability to set the LMC via the sysfs debugging interface IB/ipath: Optimize completion queue entry insertion and polling IB/ipath: Implement IB_EVENT_QP_LAST_WQE_REACHED IB/ipath: Generate flush CQE when QP is in error state IB/ipath: Remove redundant code IB/ipath: Future proof eeprom checksum code (contents reading) IB/ipath: UC RDMA WRITE with IMMEDIATE doesn't send the immediate ...
2007-10-10IPoIB: Fix unused variable warningRoland Dreier
The conversion to use netdevice internal stats left an unused variable in ipoib_neigh_free(), since there's no longer any reason to get netdev_priv() in order to increment dropped packets. Delete the unused priv variable. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10[IPoIB]: Convert to netdevice internal statsRoland Dreier
Use the stats member of struct netdevice in IPoIB, so we can save memory by deleting the stats member of struct ipoib_dev_priv, and save code by deleting ipoib_get_stats(). Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[NET]: Move hardware header operations out of netdevice.Stephen Hemminger
Since hardware header operations are part of the protocol class not the device instance, make them into a separate object and save memory. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[NET]: Nuke SET_MODULE_OWNER macro.Ralf Baechle
It's been a useless no-op for long enough in 2.6 so I figured it's time to remove it. The number of people that could object because they're maintaining unified 2.4 and 2.6 drivers is probably rather small. [ Handled drivers added by netdev tree and some missed IRDA cases... -DaveM ] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[NET]: Make NAPI polling independent of struct net_device objects.Stephen Hemminger
Several devices have multiple independant RX queues per net device, and some have a single interrupt doorbell for several queues. In either case, it's easier to support layouts like that if the structure representing the poll is independant from the net device itself. The signature of the ->poll() call back goes from: int foo_poll(struct net_device *dev, int *budget) to int foo_poll(struct napi_struct *napi, int budget) The caller is returned the number of RX packets processed (or the number of "NAPI credits" consumed if you want to get abstract). The callee no longer messes around bumping dev->quota, *budget, etc. because that is all handled in the caller upon return. The napi_struct is to be embedded in the device driver private data structures. Furthermore, it is the driver's responsibility to disable all NAPI instances in it's ->stop() device close handler. Since the napi_struct is privatized into the driver's private data structures, only the driver knows how to get at all of the napi_struct instances it may have per-device. With lots of help and suggestions from Rusty Russell, Roland Dreier, Michael Chan, Jeff Garzik, and Jamal Hadi Salim. Bug fixes from Thomas Graf, Roland Dreier, Peter Zijlstra, Joseph Fannin, Scott Wood, Hans J. Koch, and Michael Chan. [ Ported to current tree and all drivers converted. Integrated Stephen's follow-on kerneldoc additions, and restored poll_list handling to the old style to fix mutual exclusion issues. -DaveM ] Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10IPoIB: Allow setting policy to ignore multicast groupsOr Gerlitz
The kernel IB stack allows (through the RDMA CM) userspace applications to join and use multicast groups from the IPoIB MGID range. This allows multicast traffic to be handled directly from userspace QPs, without going through the kernel stack, which gives better performance for some applications. However, to fully interoperate with IP multicast, such userspace applications need to participate in IGMP reports and queries, or else routers may not forward the multicast traffic to the system where the application is running. The simplest way to do this is to share the kernel IGMP implementation by using the IP_ADD_MEMBERSHIP option to join multicast groups that are being handled directly in userspace. However, in such cases, the actual multicast traffic should not also be handled by the IPoIB interface, because that would burn resources handling multicast packets that will just be discarded in the kernel. To handle this, this patch adds lookup on the database used for IB multicast group reference counting when IPoIB is joining multicast groups, and if a multicast group is already handled by user space, then the IPoIB kernel driver ignores the group. This is controlled by a per-interface policy flag. When the flag is set, IPoIB will not join and attach its QP to a multicast group which already has an entry in the database; when the flag is cleared, IPoIB will behave as before this change. For each IPoIB interface, the /sys/class/net/$intf/umcast attribute controls the policy flag. The default value is off/0. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IPoIB: Specify Traffic Class with path record queries for QoS supportSean Hefty
To support QoS within and between subnets, modify IPoIB to request specific Traffic Class values with path record queries, using the value associated with the IPoIB broadcast group. Signed-off-by: Sean Hefty <sean.hefty@intel.com> [ See some comments I made on this at v1 and v2 of the posts <http://lists.openfabrics.org/pipermail/general/2007-August/039275.html> <http://lists.openfabrics.org/pipermail/general/2007-September/040312.html> ] Reviewed-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IPoIB: Fix error path memory leakEli Cohen
Clean up properly if ib_query_pkey() or ib_query_gid() fail. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19IPoIB: Handle P_Key table reorderingYosef Etigin
SM reconfiguration or failover possibly causes a shuffling of the values in the P_Key table. Right now, IPoIB only queries for the P_Key index once when it creates the device QP, and hence there are problems if the index of a P_Key value changes. Fix this by using the PKEY_CHANGE event to trigger a recheck of the P_Key index. Signed-off-by: Yosef Etigin <yosefe@voltaire.com> Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-06IPoIB: Convert to NAPIRoland Dreier
Convert the IP-over-InfiniBand network device driver over to using NAPI to handle completions for the main CQ. This covers all receives as well as datagram mode sends; send completions for connected mode connections are still handled from interrupt context. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-24IB/ipoib: Use ib_init_ah_from_path to initialize ah_attrSean Hefty
To support destinations that are not on the local IB subnet, IPoIB should include the GRH information when constructing an address handle. Using the existing ib_init_ah_from_path() call will do this for us. Signed-off-by: Sean Hefty <sean.hefty@intel.com>
2007-03-25[NET]: Fix neighbour destructor handling.Alexey Kuznetsov
->neigh_destructor() is killed (not used), replaced with ->neigh_cleanup(), which is called when neighbor entry goes to dead state. At this point everything is still valid: neigh->dev, neigh->parms etc. The device should guarantee that dead neighbor entries (neigh->dead != 0) do not get private part initialized, otherwise nobody will cleanup it. I think this is enough for ipoib which is the only user of this thing. Initialization private part of neighbor entries happens in ipib start_xmit routine, which is not reached when device is down. But it would be better to add explicit test for neigh->dead in any case. Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-22IPoIB: Fix use-after-free in path_rec_completion()Michael S. Tsirkin
The connected mode code added the possibility that an neigh struct gets freed in the list_for_each_entry() loop in path_rec_completion(), which causes a use-after-free. Fix this by changing to the _safe variant of the list walking macro. This was spotted by the Coverity checker (CID 1567). Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-26IPoIB: Correct debugging output when path record lookup failsRoland Dreier
If path_rec_completion() is passed a non-NULL path record pointer along with an unsuccessful status value, the tracing code incorrectly prints the (invalid) DLID from the path record rather than the more interesting status code. The actual logic of the function correctly uses the path record only if the status indicates a successful lookup. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-10IPoIB: Connected mode experimental supportMichael S. Tsirkin
The following patch adds experimental support for IPoIB connected mode, as defined by the draft from the IETF ipoib working group. The idea is to increase performance by increasing the MTU from the maximum of 2K (theoretically 4K) supported by IPoIB on top of UD. With this code, I'm able to get 800MByte/sec or more with netperf without options on a Mellanox 4x back-to-back DDR system. Some notes on code: 1. SRQ is used for scalability to large cluster sizes 2. Only RC connections are used (UC does not support SRQ now) 3. Retry count is set to 0 since spec draft warns against retries 4. Each connection is used for data transfers in only 1 direction, so each connection is either active(TX) or passive (RX). 2 sides that want to communicate create 2 connections. 5. Each active (TX) connection has a separate CQ for send completions - this keeps the code simple without CQ resize and other tricks 6. To detect stale passive side connections (where the remote side is down), we keep an LRU list of passive connections (updated once per second per connection) and destroy a connection after it has been unused for several seconds. The LRU rule makes it possible to avoid scanning connections that have recently been active. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-07Network: convert network devices to use struct device instead of class_deviceGreg Kroah-Hartman
This lets the network core have the ability to handle suspend/resume issues, if it wants to. Thanks to Frederik Deweerdt <frederik.deweerdt@gmail.com> for the arm driver fixes. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-12-12IPoIB: Make sure struct ipoib_neigh.queue is always initializedRoland Dreier
Move the initialization of ipoib_neigh's skb_queue into ipoib_neigh_alloc(), since commit 2745b5b7 ("IPoIB: Fix skb leak when freeing neighbour") will make iterate over the skb_queue to free any packets left over when freeing the ipoib_neigh structure. This fixes a crash when freeing ipoib_neigh structures allocated in ipoib_mcast_send(), which otherwise don't have their skb_queue initialized. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-12-05Merge branch 'master' of ↵David Howells
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: drivers/infiniband/core/iwcm.c drivers/net/chelsio/cxgb2.c drivers/net/wireless/bcm43xx/bcm43xx_main.c drivers/net/wireless/prism54/islpci_eth.c drivers/usb/core/hub.h drivers/usb/input/hid-core.c net/core/netpoll.c Fix up merge failures with Linus's head and fix new compilation failures. Signed-Off-By: David Howells <dhowells@redhat.com>
2006-11-29IPoIB: Fix skb leak when freeing neighbourMichael S. Tsirkin
ipoib_neigh_free() is sometimes called while neighbour is still alive, so it might still have queued skbs. Fix skb leak in this case. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-11-22WorkStruct: make allyesconfigDavid Howells
Fix up for make allyesconfig. Signed-Off-By: David Howells <dhowells@redhat.com>
2006-11-16IPoIB: Clear high octet in QP numberMichael S. Tsirkin
IPoIB assumes that high (reserved) octet in the hardware address is 0, and copies it into the QPN. This violates RFC 4391 (which requires that the high 8 bits are ignored on receive), and will result in an invalid QPN being used when interoperating with IPoIB connected mode. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IPoIB: Add some likely/unlikely annotations in hot pathEli Cohen
Signed-off-by: Eli Cohen <eli@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IPoIB: Remove unused include of vmalloc.hDotan Barak
IPoIB doesn't use anything from <linux/vmalloc.h>, so don't include it. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/sa: Require SA registrationMichael S. Tsirkin
Require users to register with SA module, to prevent the sa_query module text from going away while an SA query callback is still running. Update all in-tree users for the new interface. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22RDMA: iWARP Core Changes.Tom Tucker
Modifications to the existing rdma header files, core files, drivers, and ulp files to support iWARP, including: - Hook iWARP CM into the build system and use it in rdma_cm. - Convert enum ib_node_type to enum rdma_node_type, which includes the possibility of RDMA_NODE_RNIC, and update everything for this. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/ipoib: Fix flush/start xmit race (from code review)Michael S. Tsirkin
Prevent flush task from freeing the ipoib_neigh pointer, while ipoib_start_xmit() is accessing the ipoib_neigh through the pointer it has loaded from the skb's hardware address. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-07-24IB/ipoib: Fix packet loss after hardware address updateMichael S. Tsirkin
The neighbour ha field may get updated without destroying the neighbour. In this case, the ha field gets out of sync with the address handle stored in ipoib_neigh->ah, with the result that the ah field would point to an incorrect path, resulting in all packets being lost. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17IPoIB: Fix kernel unaligned access on ia64Jack Morgenstein
Fix misaligned access faults on ia64: never cast a misaligned neighbour->ha + 4 pointer to union ib_gid type; pass a void * pointer instead. The memcpy was being optimized to use full word accesses because the compiler thought that union ib_gid is always aligned. The cast in IPOIB_GID_ARG is safe, since it is fixed to access each byte separately. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>