aboutsummaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)Author
2006-03-30IB/mad: RMPP support for additional classesHal Rosenstock
Add RMPP support for additional management classes that support it. Also, validate RMPP is consistent with management class specified. Signed-off-by: Hal Rosenstock <halr@voltaire.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-30IB/mad: include GID/class when matching receivesJack Morgenstein
Received responses are currently matched against sent requests based on TID only. According to the spec, responses should match based on the combination of TID, management class, and requester LID/GID. Without the additional qualification, an agent that is responding to two requests, both of which have the same TID, can match RMPP ACKs with the incorrect transaction. This problem can occur on the SM node when responding to SA queries. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-29IB/mthca: Fix section mismatch problemsRoland Dreier
Quite a few cleanup functions in mthca were marked as __devexit. However, they could also be called from error paths during initialization, so they cannot be marked that way. Just delete all of the incorrect annotations. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-29IPoIB: Fix oops with raw socketsRoland Dreier
ipoib_hard_header() needs to handle the case that daddr is NULL. This can happen when packets are injected via a raw socket, and IPoIB shouldn't oops in this case. Reported by Anton Blanchard <anton@samba.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-29IB/mthca: Fix check of size in SRQ creationJack Morgenstein
The previous patch for Tavor broke MemFree logic. The driver should perform limit check only for Tavor. For MemFree, the check is incorrect, since ds (WQE stride) is always a power-of-2 (although the max_desc_size may not be). In Tavor, however, WQE stride and desc_size are the same, and are not necessarily power-of-2. The check was really for the WQE stride (and it Tavor, we use max_desc_size for the stride). Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-29IB/srp: Fix unmapping of fake scatterlistRoland Dreier
The recently merged patch to create a fake scatterlist for non-SG SCSI commands had a bug: the driver ended up doing dma_unmap_sg() on a scatterlist scmnd->request_buffer rather than the fake scatter list it created. Fix this so that the driver unmaps the same thing it maps. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IPoIB: P_Key change event handlingLeonid Arsh
This patch causes the network interface to respond to P_Key change events correctly. As a result, you'll see a child interface in the "RUNNING" state (netif_carrier_on()) only when the corresponding P_Key is configured by the SM. When SM removes a P_Key, the "RUNNING" state will be disabled for the corresponding network interface. To implement this, I added IB_EVENT_PKEY_CHANGE event handling. To prevent flushing the device before the device is open by the "delay open" mechanism, I added an additional device flag called IPOIB_FLAG_INITIALIZED. This also prevents the child network interface from trying to join to multicast groups until the PKEY is configured. We used to get error messages like: ib0.f2f2: couldn't attach QP to multicast group ff12:401b:f2f2:0:0:0:ffff:ffff in this case. To fix this, I just check IPOIB_FLAG_OPER_UP flag in ipoib_set_mcast_list(). Signed-off-by: Leonid Arsh <leonida@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IB/mthca: Fix modify QP error pathRoland Dreier
If the call to mthca_MODIFY_QP() failed, then mthca_modify_qp() would still do some things it shouldn't, such as store away attributes for special QPs. Fix this, and simplify the code, by simply jumping to the exit path if mthca_MODIFY_QP() fails. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IPoIB: Fix network interface "RUNNING" statusLeonid Arsh
With the current IPoIB driver, the status of network interfaces stays "RUNNING" even if the link goes down (for example because a cable is unplugged). Fix this by flushing the IPoIB interface when the link goes down. Signed-off-by: Leonid Arsh <leonida@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IB/mthca: Fix indentationRoland Dreier
Fix some whitespace damage (indenting with spaces) that snuck in. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IB/mthca: Fix uninitialized variable in mthca_alloc_qp()Jack Morgenstein
mthca_alloc_sqp() by mthca_set_qp_size() need to set qp->transport before calling mthca_set_qp_size(), since the value is used there. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IB/mthca: Check SRQ limit in modify SRQ operationJack Morgenstein
When setting the shared receive queue (SRQ) watermark in a modify SRQ operation, make sure that the supplied value is not larger than the full size of the SRQ. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IB/mthca: Check that SRQ WQE size does not exceed device's max valueJack Morgenstein
Guarantee the calculated work queue entry size does not exceed the max allowable WQE size when creating an SRQ. This is a problem with Arbel in Tavor-compatibility mode because the current WQE size computation method rounds up to next power of 2. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IB/mthca: Check that sgid_index and path_mtu are valid in modify_qpDotan Barak
Add a check that the modify QP parameters sgid_index and path_mtu are valid, since they might come from userspace. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IB/srp: Use a fake scatterlist for non-SG SCSI commandsRoland Dreier
Since the SCSI midlayer is moving towards entirely getting rid of commands with use_sg == 0, we should treat this case as an exception. Therefore, change the IB SRP initiator to create a fake scatterlist for these commands with sg_init_one(). This simplifies the flow of DMA mapping and unmapping, since SRP can just use dma_map_sg() and dma_unmap_sg() unconditionally, rather than having to choose between the dma_{map,unmap}_sg() and dma_{map,unmap}_single() variants. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IPoIB: Pass correct pointer when flushing child interfacesLeonid Arsh
ipoib_ib_dev_flush() should get passed cpriv->dev, not &cpriv->dev. Signed-off-by: Leonid Arsh <leonida@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-21Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (235 commits) [NETFILTER]: Add H.323 conntrack/NAT helper [TG3]: Don't mark tg3_test_registers() as returning const. [IPV6]: Cleanups for net/ipv6/addrconf.c (kzalloc, early exit) v2 [IPV6]: Nearly complete kzalloc cleanup for net/ipv6 [IPV6]: Cleanup of net/ipv6/reassambly.c [BRIDGE]: Remove duplicate const from is_link_local() argument type. [DECNET]: net/decnet/dn_route.c: fix inconsequent NULL checking [TG3]: make drivers/net/tg3.c:tg3_request_irq() static [BRIDGE]: use LLC to send STP [LLC]: llc_mac_hdr_init const arguments [BRIDGE]: allow show/store of group multicast address [BRIDGE]: use llc for receiving STP packets [BRIDGE]: stp timer to jiffies cleanup [BRIDGE]: forwarding remove unneeded preempt and bh diasables [BRIDGE]: netfilter inline cleanup [BRIDGE]: netfilter VLAN macro cleanup [BRIDGE]: netfilter dont use __constant_htons [BRIDGE]: netfilter whitespace [BRIDGE]: optimize frame pass up [BRIDGE]: use kzalloc ...
2006-03-20[INFINIBAND] ipoib: Remove leftover use of neigh_ops->destructorArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20[NET]: Move destructor from neigh->ops to neigh_paramsMichael S. Tsirkin
struct neigh_ops currently has a destructor field, which no in-kernel drivers outside of infiniband use. The infiniband/ulp/ipoib in-tree driver stashes some info in the neighbour structure (the results of the second-stage lookup from ARP results to real link-level path), and it uses neigh->ops->destructor to get a callback so it can clean up this extra info when a neighbour is freed. We've run into problems with this: since the destructor is in an ops field that is shared between neighbours that may belong to different net devices, there's no way to set/clear it safely. The following patch moves this field to neigh_parms where it can be safely set, together with its twin neigh_setup. Two additional patches in the patch series update ipoib to use this new interface. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20IB/mthca: Query SRQ srq_limit fixesEli Cohen
Fix endianness handling of srq_limit: it is big-endian in the context structure, so we need to swab it before returning it. Also add support for srq_limit query for Tavor (non-MemFree) HCAs. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Get rid of useless test of queue lengthRoland Dreier
In neigh_add_path(), the queue of delayed packets can never be full, because the queue is always freshly created and cannot be found by any other code path. In fact, the test of the queue length is worse than useless: if somehow the test ever triggered and path_rec_start() also failed, then dev_kfree_skb_any() will be called twice on the same skb. Fix this by deleting the useless test. Pointed out by Michael S. Tsirkin <mst@mellanox.co.il>. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Correct reported SRQ size in MemFree case.Dotan Barak
MemFree devices need to reserve one shared receive queue (SRQ) work request for internal use, so the capacity returned from the create_srq and query_srq methods should be srq->max - 1. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mad: Fix oopsable race on device removalMichael S. Tsirkin
Fix an oopsable race debugged by Eli Cohen <eli@mellanox.co.il>: After removing the port from port_list, ib_mad_port_close flushes port_priv->wq before destroying the special QPs. This means that a completion event could arrive, and queue a new work in this work queue after flush. This patch also removes an unnecessary flush_workqueue(): destroy_workqueue() already includes a flush. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/srp: Coverity fix to srp_parse_options()Roland Dreier
Fix leak found by Coverity: in the SRP_OPT_DGID case, srp_parse_options() didn't free the result of match_strdup(). Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Coverity fix to mthca_init_eq_table()Roland Dreier
Fix bug found by coverity: the loop body never executed, because it was doing for (i = 0; i < MTHCA_EQ_CMD; ++i), but MTHCA_EQ_CMD is 0. The correct loop bound is MTHCA_NUM_EQ, to loop over all EQs. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Coverity fixes to sysfs.cRoland Dreier
Fix two bugs found by coverity: - Memory leak in error path of alloc_group_attrs() - Fencepost error in state_show(): the test should be < ARRAY_SIZE(), not <= ARRAY_SIZE(). Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Move ipoib_ib_dev_flush() to ipoib workqueueJack Morgenstein
Move ipoib_ib_dev_flush() to ipoib's workqueue. This keeps it ordered with respect to other work scheduled by the ipoib driver. This fixes problems with races, for example: - ipoib_ib_dev_flush() has started running because of an IB event - user does ifconfig ib0 down - ipoib_mcast_stop_thread() gets called twice and waits for the same completion twice Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Fix build now that neighbour destructor is in neigh_paramsRoland Dreier
Fix the IPoIB build (which is broken in net-2.6.17 because of my screw-up, which left out this chunk in ipoib_multicast.c). The neighbour destructor is now in neigh_params, so we don't need to clear it in the ops structure. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Use correct alt_pkey_index in modify QPAmi Perlmutter
The old code incorrectly used the primary P_Key index as the alternate index too. Signed-off-by: Ami Perlmutter <amip@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/umad: Add support for large RMPP transfersJack Morgenstein
Add support for sending and receiving large RMPP transfers. The old code supports transfers only as large as a single contiguous kernel memory allocation. This patch uses linked list of memory buffers when sending and receiving data to avoid needing contiguous pages for larger transfers. Receive side: copy the arriving MADs in chunks instead of coalescing to one large buffer in kernel space. Send side: split a multipacket MAD buffer to a list of segments, (multipacket_list) and send these using a gather list of size 2. Also, save pointer to last sent segment, and retrieve requested segments by walking list starting at last sent segment. Finally, save pointer to last-acked segment. When retrying, retrieve segments for resending relative to this pointer. When updating last ack, start at this pointer. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/srp: Add SCSI host attributes to show target portRoland Dreier
Add SCSI host attributes in sysfs that show the ID extension, IOC GUID, service ID, P_Key and destination GID for each target port that the SRP initiator connects to. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/cm: Check cm_id state before handling a REPSean Hefty
Move checking the state of a cm_id before modifying it when handling a REP. This fixes a bug seen under MPI scale-up testing, where a NULL timewait_info pointer is dereferenced if a request times out before a REP is received. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Update firmware versionsRoland Dreier
Update known firmware versions in driver's table to the latest releases. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Optimize large messages on Sinai HCAsEli Cohen
Sinai (one-port PCI Express) HCAs get improved throughput for messages bigger than 80 KB in DDR mode if memory keys are formatted in a specific way. The enhancement only works if the memory key table is smaller than 2^24 entries. For larger tables, the enhancement is off and a warning is printed (to avoid silent performance loss). Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Michael Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Fix query QP return of sq_sig_allDotan Barak
The old code didn't convert from the kernel's enum correctly. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Fix modify QP checking of "current QP state" attributeDotan Barak
According to the IB spec version 1.2, section 11.2.4.2, the current table has a couple of mistakes where it allows the current QP state (IB_QP_CUR_STATE) attribute. For the transitions: RTS -> RTS: IB_QP_CUR_STATE should be allowed for all transports SQD -> SQD: IB_QP_CUR_STATE should never be allowed Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Fix multicast race between canceling and completingMichael S. Tsirkin
ipoib_mcast_stop_thread currently tests mcast->query and if it is NULL, does not perform wait_for_completion on the mcast and frees the mcast object directly. However, since both operations are done without locking, it is possible that ipoib_mcast_join_complete is in progress on this mcast object and has set mcast->query to NULL already. Solve this by: - taking priv->lock before we change mcast->query in ipoib_mcast_join_complete, and keeping it until we no longer need the mcast object - taking priv->lock around mcast->query test in ipoib_mcast_stop_thread Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Clean up if posting receives failsEli Cohen
If posting receives in ipoib_ib_dev_open() fails, call ipoib_ib_dev_stop() to move the device's QP back to the RESET state so that we can try again later. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Use an enum for HCA page sizeIshai Rabinovitz
Use a named enum for the HCA's internal page size, rather than having magic values of 4096 and shifts by 12 all over the code. Also, fix one minor bug in EQ handling: only one HCA page is mapped to the HCA during initialization, but a full kernel page is unmapped during cleanup. This might cause problems when PAGE_SIZE != 4096. Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Check alternate P_Key index when setting alternate pathDotan Barak
Check that the alternate P_Key index is in range when setting the alternate path for a QP. Also make a cosmetic touch up to the debug message printed when the main P_Key index is out of range. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Add support for send work request fence flagDotan Barak
Add support for IB_SEND_FENCE flag in post_send methods. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Close race in setting mcast->ahEli Cohen
ipoib_mcast_send() tests mcast->ah twice. If this value is changed between these two points, we leak an skb. However, ipoib_mcast_join_finish() sets mcast->ah with no locking, so it could race against ipoib_mcast_send(). As a solution, take priv->lock around assignment to mcast->ah thus making sure ipoib_mcast_send() (which also takes priv->lock) is not in flight. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Implement query_ah methodJack Morgenstein
Implement query_ah (except for AVs which are in HCA memory). This is needed to implement RMPP duplicate session detection on sending side (extraction of DGID/DLID and GRH flag from address handle). Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Write FW commands through doorbell pageEli Cohen
This patch is checks whether the HCA supports posting FW commands through a doorbell page (user access region 0, or "UAR0"). If this is supported, the driver maps UAR0 and uses it for FW commands. This can be controlled by the value of a writable module parameter fw_cmd_doorbell. When the parameter is 0, the commands are posted through HCR using the old method; otherwise if HCA is capable commands go through UAR0. This use of UAR0 to post commands eliminates the need for polling the "go" bit prior to posting a new command. Since reading from a PCI device is much more expensive then issuing a posted write, it is expected that issuing FW commands this way will provide better CPU utilization. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Return actual capacity from create SRQ operationDotan Barak
Pass actual capacity of created SRQ back to userspace, so that userspace can report accurate capacities. This requires an ABI bump, to change struct ib_uverbs_create_srq_resp. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Return actual capacity from create_srqDotan Barak
Have mthca's create_srq method return the actual capacity of the SRQ that gets created. Also update comments in <rdma/ib_verbs.h> to clarify that this is what is expected from ib_create_srq(). Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: clarify to_ipoib_neigh()Michael S. Tsirkin
Cosmetic change: make alignment explicit in to_ipoib_neigh. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Bump driver version and release dateRoland Dreier
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Support for query QP and SRQEli Cohen
Implement the query_qp and query_srq methods in mthca. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Support for query SRQ from userspaceDotan Barak
Add support to uverbs to handle querying userspace SRQs (shared receive queues), including adding an ABI for marshalling requests and responses. The kernel midlayer already has the underlying ib_query_srq() function. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>