aboutsummaryrefslogtreecommitdiff
path: root/drivers/infiniband/ulp
AgeCommit message (Collapse)Author
2007-04-27Merge branch 'for-linus' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: (49 commits) IB: Set class_dev->dev in core for nice device symlink IB/ehca: Implement modify_port IB/umad: Clarify documentation of transaction ID IPoIB/cm: spin_lock_irqsave() -> spin_lock_irq() replacements IB/mad: Change SMI to use enums rather than magic return codes IB/umad: Implement GRH handling for sent/received MADs IB/ipoib: Use ib_init_ah_from_path to initialize ah_attr IB/sa: Set src_path_bits correctly in ib_init_ah_from_path() IB/ucm: Simplify ib_ucm_event() RDMA/ucma: Simplify ucma_get_event() IB/mthca: Simplify CQ cleaning in mthca_free_qp() IB/mthca: Fix mthca_write_mtt() on HCAs with hidden memory IB/mthca: Update HCA firmware revisions IB/ipath: Fix WC format drift between user and kernel space IB/ipath: Check that a UD work request's address handle is valid IB/ipath: Remove duplicate stuff from ipath_verbs.h IB/ipath: Check reserved memory keys IB/ipath: Fix unit selection when all CPU affinity bits set IB/ipath: Don't allow QPs 0 and 1 to be opened multiple times IB/ipath: Disable IB link earlier in shutdown sequence ...
2007-04-25[SK_BUFF]: Introduce skb_reset_mac_header(skb)Arnaldo Carvalho de Melo
For the common, open coded 'skb->mac.raw = skb->data' operation, so that we can later turn skb->mac.raw into a offset, reducing the size of struct sk_buff in 64bit land while possibly keeping it as a pointer on 32bit. This one touches just the most simple case, next will handle the slightly more "complex" cases. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-24IPoIB/cm: spin_lock_irqsave() -> spin_lock_irq() replacementsRoland Dreier
There are quite a few places in ipoib_cm.c where we know IRQs are enabled because we do something that sleeps in the same function, so we can convert several occurrences of spin_lock_irqsave() to a plain spin_lock_irq(). This cleans up the source a little and makes the code smaller too: add/remove: 0/0 grow/shrink: 1/5 up/down: 3/-51 (-48) function old new delta ipoib_cm_tx_reap 403 406 +3 ipoib_cm_stale_task 146 145 -1 ipoib_cm_dev_stop 173 172 -1 ipoib_cm_tx_handler 964 956 -8 ipoib_cm_rx_handler 956 937 -19 ipoib_cm_skb_reap 212 190 -22 Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-24IB/ipoib: Use ib_init_ah_from_path to initialize ah_attrSean Hefty
To support destinations that are not on the local IB subnet, IPoIB should include the GRH information when constructing an address handle. Using the existing ib_init_ah_from_path() call will do this for us. Signed-off-by: Sean Hefty <sean.hefty@intel.com>
2007-04-18IPoIB: Remove pointless opcode field from debugging outputRoland Dreier
There's no point in printing the opcode field in the completion handling debugging output, since the type of completion is already printed at the beginning of the line. In fact the opcode field is not even defined for completions with a status other than success. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-10IPoIB/cm: Fix DMA direction typoMichael S. Tsirkin
Receive buffers need to be mapped with DMA_FROM_DEVICE. Incorrectly mapping with DMA_TO_DEVICE causes a hard lock on ppc64 machines with an IOMMU. This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=431> Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-05IB/iser: Don't defer connection failure notification to workqueueErez Zilber
When a connection is terminated asynchronously from the iSCSI layer's perspective, iSER needs to notify the iSCSI layer that the connection has failed. This is done using a workqueue (switched to from the iSER tasklet context). Meanwhile, the connection object (that holds the work struct) is released. If the workqueue function wasn't called yet, it will be called later with a NULL pointer, which will crash the kernel. The context switch (tasklet to workqueue) is not required, and everything can be done from the iSER tasklet. This eliminates the NULL work struct bug (and simplifies the code). Signed-off-by: Erez Zilber <erezz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-28Merge branch 'for-linus' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IB/iser: Handle aborting a command after it is sent IB/mthca: Fix thinko in init_mr_table() RDMA/cxgb3: Fix resource leak in cxio_hal_init_ctrl_qp()
2007-03-26IB/iser: Handle aborting a command after it is sentErez Zilber
The SCSI midlayer may abort a command that was already sent. If the initiator is still trying to send the command (or data-out PDUs for that command), the QP may time out after the midlayer times out. Therefore, when aborting the command, iSER may still have references for the command's buffers. When sending these PDUs, the sends will complete with an error and their resources will be released then. Signed-off-by: Erez Zilber <erezz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-25[NET]: Fix neighbour destructor handling.Alexey Kuznetsov
->neigh_destructor() is killed (not used), replaced with ->neigh_cleanup(), which is called when neighbor entry goes to dead state. At this point everything is still valid: neigh->dev, neigh->parms etc. The device should guarantee that dead neighbor entries (neigh->dead != 0) do not get private part initialized, otherwise nobody will cleanup it. I think this is enough for ipoib which is the only user of this thing. Initialization private part of neighbor entries happens in ipib start_xmit routine, which is not reached when device is down. But it would be better to add explicit test for neigh->dead in any case. Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-22IB/ipoib: Fix thinko in packet length checksMichael S. Tsirkin
The packet length checks in ipoib are broken: we add 4 bytes (IPoIB encapsulation header) when sending a packet, not 20 bytes (hardware address length) to each packet. Therefore, if connected mode is enabled so that the interface MTU is larger than the multicast MTU, IPoIB may end up trying to send too-long multicast packets. For example, multicast is broken if a message of size 2048 bytes is sent on an interface with UD MTU 2048, because 2048 is bigger than the real limit of 2044 but the code tests against the wrong limit of 2060. This patch fixes <https://bugs.openfabrics.org/show_bug.cgi?id=418>, submitted by Scott Weitzenkamp <sweitzen@cisco.com>. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-22IPoIB: Fix use-after-free in path_rec_completion()Michael S. Tsirkin
The connected mode code added the possibility that an neigh struct gets freed in the list_for_each_entry() loop in path_rec_completion(), which causes a use-after-free. Fix this by changing to the _safe variant of the list walking macro. This was spotted by the Coverity checker (CID 1567). Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-22IPoIB: Fix race in detaching from mcast group before attachingSean Hefty
There's a race between ipoib_mcast_leave() and ipoib_mcast_join_finish() where we can try to detach from a multicast group before we've attached to it. Fix this by reordering the code in ipoib_mcast_leave to free the multicast group first, which waits for the multicast callback thread (which calls ipoib_mcast_join_finish()) to complete before detaching from the group. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-22IPoIB/cm: Fix reaping of stale connectionsMichael S. Tsirkin
The sense of the time_after_eq() test in ipoib_cm_stale_task() is reversed so that only non-stale connections are reaped. Fix this by changing to time_before_eq(). Noticed by Pradeep Satyanarayana <pradeep@us.ibm.com>. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-08IPoIB: Turn on interface's carrier after broadcast group is joinedShirley Ma
Do netif_carrier_on() right after the IPv4 broadcast multicast group is joined, rather than waiting for all of the initial set of multicast group joins to finish. This allows at least IPv4 traffic to limp along on broken fabrics where not all multicast groups can be joined. Signed-off-by: Shirley Ma <xma@us.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-27IPoIB: Only handle async events for one portRoland Dreier
An asynchronous event carries the port number that the event occurred on, so there's no reason for an IPoIB interface to process an event associated with a different local HCA port. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-26IPoIB: Correct debugging output when path record lookup failsRoland Dreier
If path_rec_completion() is passed a non-NULL path record pointer along with an unsuccessful status value, the tracing code incorrectly prints the (invalid) DLID from the path record rather than the more interesting status code. The actual logic of the function correctly uses the path record only if the status indicates a successful lookup. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-21IPoIB: Remove unused local_rate trackingRoland Dreier
Now that low-level drivers handle the conversion from an absolute rate to a relative rate, there's no need for the IPoIB driver to keep track of the local port's data rate. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-20IPoIB/cm: Improve small message bandwidthMichael S. Tsirkin
Avoid the overhead of freeing/reallocating and mapping/unmapping for DMA pages that have not been written to by hardware. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-16IB/sa: Track multicast join/leave requestsSean Hefty
The IB SA tracks multicast join/leave requests on a per port basis and does not do any reference counting: if two users of the same port join the same group, and one leaves that group, then the SA will remove the port from the group even though there is one user who wants to stay a member left. Therefore, in order to support multiple users of the same multicast group from the same port, we need to perform reference counting locally. To do this, add an multicast submodule to ib_sa to perform reference counting of multicast join/leave operations. Modify ib_ipoib (the only in-kernel user of multicast) to use the new interface. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-16IPoIB: CM error handling thinko fixMichael S. Tsirkin
ipoib_cm_alloc_rx_skb() might be called from IRQ context, so it must use dev_kfree_skb_any(), not kfree_skb(). Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-16IPoIB: Only allow root to change between datagram and connected modeRoland Dreier
Change the permissions of the "mode" sysfs attribute to be S_IWUSR instead of S_IWUGO. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-13Merge branch 'for-linus' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IB/mthca: Always fill MTTs from CPU IB/mthca: Merge MR and FMR space on 64-bit systems IB/mthca: Fix access to MTT and MPT tables on non-cache-coherent CPUs IB/mthca: Give reserved MTTs a separate cache line IB/mthca: Fix reserved MTTs calculation on mem-free HCAs RDMA/cxgb3: Add driver for Chelsio T3 RNIC IB: Remove redundant "_wq" from workqueue names RDMA/cma: Increment port number after close to avoid re-use IB/ehca: Fix memleak on module unloading IB/mthca: Work around gcc bug on sparc64 IPoIB: Connected mode experimental support IB/core: Use ARRAY_SIZE macro for mandatory_table IB/mthca: Use correct structure size in call to memset()
2007-02-12[PATCH] mark struct file_operations const 3Arjan van de Ven
Many struct file_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-10IPoIB: Connected mode experimental supportMichael S. Tsirkin
The following patch adds experimental support for IPoIB connected mode, as defined by the draft from the IETF ipoib working group. The idea is to increase performance by increasing the MTU from the maximum of 2K (theoretically 4K) supported by IPoIB on top of UD. With this code, I'm able to get 800MByte/sec or more with netperf without options on a Mellanox 4x back-to-back DDR system. Some notes on code: 1. SRQ is used for scalability to large cluster sizes 2. Only RC connections are used (UC does not support SRQ now) 3. Retry count is set to 0 since spec draft warns against retries 4. Each connection is used for data transfers in only 1 direction, so each connection is either active(TX) or passive (RX). 2 sides that want to communicate create 2 connections. 5. Each active (TX) connection has a separate CQ for send completions - this keeps the code simple without CQ resize and other tricks 6. To detect stale passive side connections (where the remote side is down), we keep an LRU list of passive connections (updated once per second per connection) and destroy a connection after it has been unused for several seconds. The LRU rule makes it possible to avoid scanning connections that have recently been active. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-09[PATCH] iscsi endianness annotationsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-07Network: convert network devices to use struct device instead of class_deviceGreg Kroah-Hartman
This lets the network core have the ability to handle suspend/resume issues, if it wants to. Thanks to Frederik Deweerdt <frederik.deweerdt@gmail.com> for the arm driver fixes. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-04IB/srp: Don't wait for response when QP is in error state.Ishai Rabinovitz
When there is a call to send_tsk_mgmt SRP posts a send and waits for 5 seconds to get a response. When the QP is in the error state it is obvious that there will be no response so it is quite useless to wait. In fact, the timeout causes SRP to wait a long time to reconnect when a QP error occurs. (Each abort and each reset_device calls send_tsk_mgmt, which waits for the timeout). The following patch solves this problem by identifying the failure and returning an immediate error code. Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-01-22IB/srp: Check match_strdup() returnIshai Rabinovitz
Checks if the kmalloc in match_strdup() was successful, and bail out on looking at the token if it failed. Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-01-07IB/iser: Return error code when PDUs may not be sentErez Zilber
iSER limits the number of outstanding PDUs to send. When this threshold is reached, it should return an error code (-ENOBUFS) instead of setting the suspend_tx bit (which should be used only by libiscsi). Signed-off-by: Erez Zilber <erezz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-12-15IB/srp: Fix FMR mapping for 32-bit kernels and addresses above 4GRoland Dreier
struct srp_device.fmr_page_mask was unsigned long, which means that the top part of addresses above 4G was being chopped off on 32-bit architectures. Of course nothing good happens when data from SRP targets is DMAed to the wrong place. Fix this by changing fmr_page_mask to u64, to match the addresses actually used by IB devices. Thanks to Brian Cain <Brian.Cain@ge.com> and David McMillen <davem@systemfabricworks.com> for help diagnosing the bug and testing the fix. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-12-12IPoIB: Make sure struct ipoib_neigh.queue is always initializedRoland Dreier
Move the initialization of ipoib_neigh's skb_queue into ipoib_neigh_alloc(), since commit 2745b5b7 ("IPoIB: Fix skb leak when freeing neighbour") will make iterate over the skb_queue to free any packets left over when freeing the ipoib_neigh structure. This fixes a crash when freeing ipoib_neigh structures allocated in ipoib_mcast_send(), which otherwise don't have their skb_queue initialized. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-12-12IB/iser: Use the new verbs DMA mapping functionsRalph Campbell
Convert iSER to use the new verbs DMA mapping functions for kernel verbs consumers. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-12-12IB/srp: Use new verbs IB DMA mapping functionsRalph Campbell
Convert SRP to use the new verbs DMA mapping functions for kernel verbs consumers. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-12-12IPoIB: Use the new verbs DMA mapping functionsRalph Campbell
Convert IPoIB to use the new DMA mapping functions for kernel verbs consumers. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-12-12IB/iser: Remove unused "write-only" variablesRoland Dreier
Remove variables that are set but then never looked at in the iSER initiator. These cleanups came from David Binderman's list of "set but never used" warnings from icc. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-12-08[PATCH] LOG2: Implement a general integer log2 facility in the kernelDavid Howells
This facility provides three entry points: ilog2() Log base 2 of unsigned long ilog2_u32() Log base 2 of u32 ilog2_u64() Log base 2 of u64 These facilities can either be used inside functions on dynamic data: int do_something(long q) { ...; y = ilog2(x) ...; } Or can be used to statically initialise global variables with constant values: unsigned n = ilog2(27); When performing static initialisation, the compiler will report "error: initializer element is not constant" if asked to take a log of zero or of something not reducible to a constant. They treat negative numbers as unsigned. When not dealing with a constant, they fall back to using fls() which permits them to use arch-specific log calculation instructions - such as BSR on x86/x86_64 or SCAN on FRV - if available. [akpm@osdl.org: MMC fix] Signed-off-by: David Howells <dhowells@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: David Howells <dhowells@redhat.com> Cc: Wojtek Kaniewski <wojtekka@toxygen.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-05Merge branch 'master' of ↵David Howells
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: drivers/ata/libata-scsi.c include/linux/libata.h Futher merge of Linus's head and compilation fixups. Signed-Off-By: David Howells <dhowells@redhat.com>
2006-12-05Merge branch 'master' of ↵David Howells
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: drivers/infiniband/core/iwcm.c drivers/net/chelsio/cxgb2.c drivers/net/wireless/bcm43xx/bcm43xx_main.c drivers/net/wireless/prism54/islpci_eth.c drivers/usb/core/hub.h drivers/usb/input/hid-core.c net/core/netpoll.c Fix up merge failures with Linus's head and fix new compilation failures. Signed-Off-By: David Howells <dhowells@redhat.com>
2006-12-04[PATCH] severing skbuff.h -> highmem.hAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-11-29IPoIB: Fix skb leak when freeing neighbourMichael S. Tsirkin
ipoib_neigh_free() is sometimes called while neighbour is still alive, so it might still have queued skbs. Fix skb leak in this case. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-11-29IB/srp: Fix memory leak on reconnectVu Pham
SRP reallocates the IU buffers for tx_ring and rx_ring without freeing the old buffers when it reconnects to a target. Fix this by keeping the old IU buffers around. Signed-off-by: Vu Pham <vu@mellanox.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-11-29IB: Convert kmem_cache_t -> struct kmem_cacheRoland Dreier
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-11-29IB/srp: Increase supported CDB sizeArne Redlich
Set the Scsi_Host's max_cmd_len from 12 (default) to 16 for SRP. Otherwise scsi_dispatch_cmd() won't pass down certain commands such as READ CAPACITY 16, required for supporting disks > 2TB. Signed-off-by: Arne Redlich <arne.redlich@xiranet.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-11-22WorkStruct: make allyesconfigDavid Howells
Fix up for make allyesconfig. Signed-Off-By: David Howells <dhowells@redhat.com>
2006-11-16IPoIB: Clear high octet in QP numberMichael S. Tsirkin
IPoIB assumes that high (reserved) octet in the hardware address is 0, and copies it into the QPN. This violates RFC 4391 (which requires that the high 8 bits are ignored on receive), and will result in an invalid QPN being used when interoperating with IPoIB connected mode. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-10-30IB/iser: Start connection after enabling iSERErez Zilber
When a connection is started (a new connection or a recovered one), iSER should prepare its resources for full-featured mode and only then notify the iSCSI layer that it is ready to start queueing commands. Signed-off-by: Erez Zilber <erezz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-10-10IPoIB: Check for DMA mapping error for TX packetsRoland Dreier
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-10-10IB/srp: Enable multiple connections to the same targetIshai Rabinovitz
Enable multiple concurrent connections to the same SRP target: 1) Use port GUID instead of node GUID in the initiator port identifier. This allows connections to be made from multiple HCA ports at the same time. 2) Let the user specify the identifier extention when adding the device. This allows userspace to make multiple connections even from the same port, if it wants too. Without this, only one connection can be made from any given HCA, even if it has multiple ports, because we don't use multi-channel mode, so targets will only allow one connection from a given initiator port ID. Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-10-10IB/srp: Remove redundant memset()Ishai Rabinovitz
scsi_host_alloc() already allocates with kzalloc(), so the struct Scsi_Host is zeroed out, including the private data portion. Remove the redundant memset that zeros this out again in the SRP initiator. Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>