aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/mlx4
AgeCommit message (Collapse)Author
2008-07-23IB/mlx4: Add support for memory management extensions and local DMA L_KeyRoland Dreier
Add support for the following operations to mlx4 when device firmware supports them: - Send with invalidate and local invalidate send queue work requests; - Allocate/free fast register MRs; - Allocate/free fast register MR page lists; - Fast register MR send queue work requests; - Local DMA L_Key. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-22mlx4_core: Keep free count for MTT buddy allocatorRoland Dreier
MTT entries are allocated with a buddy allocator, which just keeps bitmaps for each level of the buddy table. However, all free space starts out at the highest order, and small allocations start scanning from the lowest order. When the lowest order tables have no free space, this can lead to scanning potentially millions of bits before finding a free entry at a higher order. We can avoid this by just keeping a count of how many free entries each order has, and skipping the bitmap scan when an order is completely empty. This provides a nice performance boost for a negligible increase in memory usage. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-22mlx4_code: Add missing FW status return codeJack Morgenstein
Add ICM_ERROR firmware status code. In mapping to errnos, -ENFILE seems closest. This is in preparation for providing more detailed log info using mlx4_err() in low-level driver when a non-zero status is returned. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-22mlx4_core: Add module parameter to enable QoS supportJack Morgenstein
Add a module parameter "enable_qos" to mlx4_core. If this param is set, enable support for QoS in the INIT_HCA command. By default, the parameter is set to 0 (disabled). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14mlx4_core: Use MOD_STAT_CFG command to get minimal page sizeVladimir Sokolovsky
There was a bug in some versions of the mlx4 driver in mlx4_alloc_fmr(), which hardcoded the minimum acceptable page_shift to be 12. However, new ConnectX firmware can support a minimum page_shift of 9 (log_pg_sz of 9 returned by QUERY_DEV_LIM) -- so with old drivers, ib_fmr_alloc() would fail for ULPs using the device minimum when creating FMRs. To preserve firmware compatibility with released mlx4 drivers, the firmware will continue to return 12 as before for log_page_sz in QUERY_DEV_CAP for these drivers. However, to enable new drivers to take advantage of the available smaller page size, the mlx4 driver now first sets the log_pg_sz to the device minimum by setting a log_page_sz value to 0 via the MOD_STAT_CFG command and then reading the real minimum via QUERY_DEV_CAP. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14IB/mlx4: Add support for blocking multicast loopback packetsRon Livne
Add support for handling the IB_QP_CREATE_MULTICAST_BLOCK_LOOPBACK flag by using the per-multicast group loopback blocking feature of mlx4 hardware. Signed-off-by: Ron Livne <ronli@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-05-05mlx4_core: Support creation of FMRs with pages smaller than 4KOren Duer
Don't hard code a test against a minimum page shift of 12, since the device may support smaller pages. Test against the actual smallest page size from the device capabilities. Signed-off-by: Oren Duer <oren@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-29mlx4_core: Avoid recycling old FMR R_Keys too soonOlaf Kirch
When a FMR is unmapped, mlx4 resets the map count to 0, and clears the upper part of the R_Key which is used as the sequence counter. This poses a problem for RDS, which uses ib_fmr_unmap as a fence operation. RDS assumes that after issuing an unmap, the old R_Keys will be invalid for a "reasonable" period of time. For instance, Oracle processes uses shared memory buffers allocated from a pool of buffers. When a process dies, we want to reclaim these buffers -- but we must make sure there are no pending RDMA operations to/from those buffers. The only way to achieve that is by using unmap and sync the TPT. However, when the sequence count is reset on unmap, there is a high likelihood that a new mapping will be given the same R_Key that was issued a few milliseconds ago. To prevent this, don't reset the sequence count when unmapping a FMR. Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-29mlx4_core: Add a way to set the "collapsed" CQ flagYevgeny Petrilin
Extend the mlx4_cq_resize() API with a way to set the "collapsed" flag for the CQ being created. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-25mlx4_core: Add helper to move QP to ready-to-sendYevgeny Petrilin
Avoid duplicating code in ethernet and FC modules. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-25mlx4_core: Add HW queues allocation helpersYevgeny Petrilin
Wrap doorbell, buffer and MTT allocation in helper functions for ethernet and FC modules to use. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-23mlx4_core: CQ resizing should pass a 0 opcode modifier to MODIFY_CQVladimir Sokolovsky
The call to mlx4_MODIFY_CQ() had a typo so that mlx4_cq_resize() was actually asking the FW to modify a CQ's interrupt moderation rather than asking it to resize a CQ. Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-23mlx4_core: Move kernel doorbell management into coreYevgeny Petrilin
In addition to mlx4_ib, there will be ethernet and FC consumers of mlx4_core, so move the code for managing kernel doorbells into the core module to avoid having to duplicate this multiple times. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/mlx4: Fix incorrect commentEli Cohen
mlx4 hardware does not support external DDR memory. Moreover, UAR area (BAR 2) can change depending on FW version. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/mlx4: Fix race when detaching a QP from a multicast groupEli Cohen
When detaching the last QP from an MCG entry, we need to make sure that at any time, there will be no entry with zero number of QPs which is linked to the list of the MCGs of the corresponding hash index. So don't write back the MCG entry if we are removing the last QP; just unlink the entry. Also, remove an unnecessary MCG read when attaching a QP requires allocation of a new entry in the AMGM. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/mlx4: Add support for resizing CQsVladimir Sokolovsky
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/mlx4: Add support for modifying CQ moderation parametersEli Cohen
Signed-off-by: Eli Cohen <eli@mellnaox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16mlx4_core: Increase max number of QPs to 128KJack Morgenstein
With the advent large clusters which utilize multicore hosts, 64K QPs is not enough. We should increase the default maximum for QPs to 128K. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/mlx4: Add IPoIB LSO supportEli Cohen
Add TSO support to the mlx4_ib driver. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/mlx4: Add IPoIB checksum offload supportEli Cohen
ConnectX devices support checksum generation and verification of TCP and UDP packets for UD IPoIB messages. This patch checks if the HCA supports this and sets the IB_DEVICE_UD_IP_CSUM capability flag if it does. It implements support for handling the IB_SEND_IP_CSUM send flag and setting the csum_ok field in receive work completions. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Ali Ayub <ali@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16mlx4_core: Fix confusion between mlx4_event and mlx4_dev_event enumsRoland Dreier
The struct mlx4_interface.event() method was supposed to get an enum mlx4_dev_event, but the driver code was actually passing in the hardware enum mlx4_event values. Fix up the callers of mlx4_dispatch_event() so that they pass in the right type of value, and fix up the event method in mlx4_ib so that it can handle the enum mlx4_dev_event values. This eliminates the need for the subtype parameter to the event method, so remove it. This also fixes the sparse warning drivers/net/mlx4/intf.c:127:48: warning: mixing different enum types drivers/net/mlx4/intf.c:127:48: int enum mlx4_event versus drivers/net/mlx4/intf.c:127:48: int enum mlx4_dev_event Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16mlx4_core: Move opening brace of function onto a new lineRoland Dreier
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-14mlx4_core: Move table_find from fmr_alloc to fmr_enableJack Morgenstein
mlx4_table_find (for FMR MPTs) requires that ICM memory already be mapped. Before this fix, FMR allocation depended on ICM memory already being mapped for the MPT entry. If all currently mapped entries are taken, the find operation fails (even if the MPT ICM table still had more entries, which were just not mapped yet). This fix moves the mpt find operation to fmr_enable, to guarantee that any required ICM memory mapping has already occurred. Found by Oren Duer of Mellanox. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-11mlx4_core: Fix build break (missing include)Olof Johansson
Commit 313abe55 ("mlx4_core: For 64-bit systems, vmap() kernel queue buffers") caused this to pop up on powerpc allyesconfig, looks like a missing include file: drivers/net/mlx4/alloc.c: In function 'mlx4_buf_alloc': drivers/net/mlx4/alloc.c:162: error: implicit declaration of function 'vmap' drivers/net/mlx4/alloc.c:162: error: 'VM_MAP' undeclared (first use in this function) drivers/net/mlx4/alloc.c:162: error: (Each undeclared identifier is reported only once drivers/net/mlx4/alloc.c:162: error: for each function it appears in.) drivers/net/mlx4/alloc.c:162: warning: assignment makes pointer from integer without a cast drivers/net/mlx4/alloc.c: In function 'mlx4_buf_free': drivers/net/mlx4/alloc.c:187: error: implicit declaration of function 'vunmap' Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-06mlx4_core: Clean up struct mlx4_bufRoland Dreier
Now that struct mlx4_buf.u is a struct instead of a union because of the vmap() changes, there's no point in having a struct at all. So move .direct and .page_list directly into struct mlx4_buf and get rid of a bunch of unnecessary ".u"s. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-06mlx4_core: For 64-bit systems, vmap() kernel queue buffersJack Morgenstein
Since kernel virtual memory is not a problem on 64-bit systems, there is no reason to use our own 2-layer page mapping scheme for large kernel queue buffers on such systems. Instead, map the page list to a single virtually contiguous buffer with vmap(), so that can we access buffer memory via direct indexing. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-04IB: Avoid marking __devinitdata as constRoland Dreier
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-04mlx4_core: Don't read reserved fields in mlx4_QUERY_ADAPTER()Jack Morgenstein
The firmware QUERY_ADAPTER command does not return vendor_id, device_id, and revision_id; eliminate these fields from the query. Initialize the rev_id field of the mlx4 device via init_node_data (MAD IFC query), as is done in the query_device verb implementation. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-04mlx4_core: Fix more section mismatchesRoland Dreier
Commit 3d73c288 ("mlx4_core: Fix section mismatches") fixed some of the section mismatches introduced when error recovery was added, but there were still more cases of errory recovery code calling into __devinit code from regular .text. Fix this by getting rid of the now-incorrect __devinit annotations. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-01-25mlx4_core: Fix max_eqs masking in QUERY_DEV_CAPJack Morgenstein
log_max_eqs is a 4-bit field, not a 3-bit field in the response to the QUERY_DEV_CAP FW command, so we should mask with 0xf instead of 0x7 when reading it. Found by Yossi Leybovitch of Mellanox. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-11-20mlx4_core: Fix state check in mlx4_qp_modify()Jack Morgenstein
When checking the states passed in, mlx4_qp_modify() accidentally checks cur_state twice rather than checking cur_state and new_state. Fix this to make sure that both values are in-bounds. Since these values may be passed in from userspace, this bug results in userspace being able to trigger an oops. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Cc: stable <stable@kernel.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-11-14mlx4_core: Fix thinko in QP destroy (incorrect bitmap_free)Jack Morgenstein
Fix thinko in commit eaf559bf ("mlx4_core: Don't free special QPs in QP number bitmap"). The old commit had the logic exactly backwards and ended up freeing *only* special QPs, which not only left the original bug in place but also introduced the problem that the QP number bitmap would get full after a while. Found by Dotan Barak of Mellanox. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-11-13mlx4_core: Fix possible bad free in mlx4_buf_free()Ali Ayoub
When mlx4_buf_free() is called from the error path of mlx4_buf_alloc(), it may be passed a buffer structure that does not have all pages filled in. Add a check for NULL to mlx4_buf_free() so we avoid passing NULL to dma_free_coherent() (which will crash). Signed-off-by: Ali Ayoub <ali@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-24SG: Change sg_set_page() to take length and offset argumentJens Axboe
Most drivers need to set length and offset as well, so may as well fold those three lines into one. Add sg_assign_page() for those two locations that only needed to set the page, where the offset/length is set outside of the function context. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-23Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: mlx4_core: Increase command timeout for INIT_HCA to 10 seconds IPoIB/cm: Use common CQ for CM send completions IB/uverbs: Fix checking of userspace object ownership IB/mlx4: Sanity check userspace send queue sizes IPoIB: Rewrite "if (!likely(...))" as "if (unlikely(!(...)))" IB/ehca: Enable large page MRs by default IB/ehca: Change meaning of hca_cap_mr_pgsize IB/ehca: Fix ehca_encode_hwpage_size() and alloc_fmr() IB/ehca: Fix masking error in {,re}reg_phys_mr() IB/ehca: Supply QP token for SRQ base QPs IPoIB: Use round_jiffies() for ah_reap_task RDMA/cma: Fix deadlock destroying listen requests RDMA/cma: Add locking around QP accesses IB/mthca: Avoid alignment traps when writing doorbells mlx4_core: Kill mlx4_write64_raw()
2007-10-22[SG] Update drivers to use sg helpersJens Axboe
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-21mlx4_core: Increase command timeout for INIT_HCA to 10 secondsJack Morgenstein
The current INIT_HCA firmware command timeout is sufficient for the default number of resources (QPs, CQs, etc) being allocated, but if the HCA profile is modified to increase the amount of resources, then a spurious timeout is detected and HCA initialization fails. Increase the timeout for the INIT_HCA command to 10 seconds, which also brings it into line with all the other command timeouts. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-13mlx4_core: Fix infinite loop on device initializationRoland Dreier
Commit 3d73c288 ("mlx4_core: Fix section mismatches") introduced a stupid bug in device init: when some of mlx4_init_one() was split off into __mlx4_init_one(), the call from the main mlx4_init_one() function was back to mlx4_init_one() rather than to __mlx4_init_one(), which leads to an obvious infinite loop if the function is every called. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-10mlx4_core: Fix section mismatchesRoland Dreier
Commit ee49bd93 ("mlx4_core: Reset device when internal error is detected") introduced some section mismatch problems when CONFIG_HOTPLUG=n, because the error recovery code tears down and reinitializes the device after everything is loaded, which ends up calling into lots of code marked __devinit and __devexit from regular .text. Fix this by getting rid of these now-incorrect section markers. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09mlx4_core: Use mmiowb() to avoid firmware commands getting jumbled upRoland Dreier
Firmware commands are sent to the HCA by writing multiple words to a command register block. Access to this block of registers is serialized with a mutex. However, on large SGI systems writes to the register block may be reordered within the system interconnect and reach the HCA in a different order than they were issued (even with the mutex). Fix this by adding an mmiowb() before dropping the mutex. This bug was observed with real workloads with the similar FW command code in the mthca driver, and adding the mmiowb() as in commit 66547550 ("IB/mthca: Use mmiowb() to avoid firmware commands getting jumbled up") was confirmed to fix the problems, so we should add the same fix to mlx4. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09mlx4_core: Increase max number of QPs per multicast group to 56Jack Morgenstein
Increase the number of QPs allowed per multicast group from 8 to 56. This allows for one QP per core on 16-core systems, which are now quite common, and allows some space for future growth. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IB/mlx4: Implement FMRsJack Morgenstein
Implement FMRs for mlx4. This is an adaptation of code from mthca. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09mlx4_core: Write MTTs from CPU instead with of WRITE_MTT FW commandJack Morgenstein
Write MTT entries directly to ICM from the driver (eliminating use of WRITE_MTT command). This reduces the number of FW commands needed to register an MR by at least a factor of 2 and speeds up memory registration significantly. This code will also be used to implement FMRs. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09mlx4_core: Fix meaning of dev->caps.reserved_mttsRoland Dreier
Everything that uses caps.reserved_mtts expects it to be a count of MTT segments, not MTT entries. So convert the value that the FW gives us to a count of segments. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09mlx4_core: Reserve the correct number of MTT segmentsRoland Dreier
Taking ilog2(dev->caps.reserved_mtts) to find out the order to pass to the MTT buddy allocator will do the wrong thing if reserved_mtts is ever not a power of 2. Be safe and use fls(dev->caps.reserved_mtts - 1). Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09mlx4_core: Support ICM tables in coherent memoryJack Morgenstein
Enable having ICM tables in coherent memory, and use coherent memory for the dMPT table. This will allow writing MPT entries for MRs both via the SW2HW_MPT command and also directly by the driver for FMR remapping without needing to flush or worry about cacheline boundaries. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IB/mlx4: Display misc device information under /sys/class/infiniband/Jack Morgenstein
display the following device information under /sys/class/infiniband/mlx4_X: board_id, fw_ver, hw_rev, hca_type. This patch makes this information available to userspace utilities such as ibstat and ibv_devinfo. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09mlx4_core: Change capability decoding: SRC->XRCRoland Dreier
The SRC ("scalable RC") transport has been renamed to XRC ("extended RC"), to avoid having an abbreviation that is so easily confused with an abbreviation for "source." Update the HCA capability decoding output to use the new name. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IB/mlx4: Fix up SRQ limit_watermark endiannessRoland Dreier
mlx4_srq_query() returns a big-endian 16-bit value through an int *, which screws up sparse checking. Fix this so that a CPU-endian value is returned. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09mlx4_core: Enable MSI-X by defaultMichael S. Tsirkin
Recover from MSI-X errors by automatically falling back on regular interrupt, instead of asking the user to do this manually. This makes it possible to enable MSI-X by default, and will make it possible to get rid of the msi_x module option in the future. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>