aboutsummaryrefslogtreecommitdiff
path: root/drivers/infiniband/core/uverbs_cmd.c
AgeCommit message (Collapse)Author
2008-04-16IB/core: Add support for "send with invalidate" work requestsRoland Dreier
Add a new IB_WR_SEND_WITH_INV send opcode that can be used to mark a "send with invalidate" work request as defined in the iWARP verbs and the InfiniBand base memory management extensions. Also put "imm_data" and a new "invalidate_rkey" member in a new "ex" union in struct ib_send_wr. The invalidate_rkey member can be used to pass in an R_Key/STag to be invalidated. Add this new union to struct ib_uverbs_send_wr. Add code to copy the invalidate_rkey field in ib_uverbs_post_send(). Fix up low-level drivers to deal with the change to struct ib_send_wr, and just remove the imm_data initialization from net/sunrpc/xprtrdma/, since that code never does any send with immediate operations. Also, move the existing IB_DEVICE_SEND_W_INV flag to a new bit, since the iWARP drivers currently in the tree set the bit. The amso1100 driver at least will silently fail to honor the IB_SEND_INVALIDATE bit if passed in as part of userspace send requests (since it does not implement kernel bypass work request queueing). Remove the flag from all existing drivers that set it until we know which ones are OK. The values chosen for the new flag is not consecutive to avoid clashing with flags defined in the XRC patches, which are not merged yet but which are already in use and are likely to be merged soon. This resurrects a patch sent long ago by Mikkel Hagen <mhagen@iol.unh.edu>. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/core: Add creation flags to struct ib_qp_init_attrEli Cohen
Add a create_flags member to struct ib_qp_init_attr that will allow a kernel verbs consumer to create a pass special flags when creating a QP. Add a flag value for telling low-level drivers that a QP will be used for IPoIB UD LSO. The create_flags member will also be useful for XRC and ehca low-latency QP support. Since no create_flags handling is implemented yet, add code to all low-level drivers to return -EINVAL if create_flags is non-zero. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-19IB/uverbs: Fix checking of userspace object ownershipRoland Dreier
Commit 9ead190b ("IB/uverbs: Don't serialize with ib_uverbs_idr_mutex") rewrote how userspace objects are looked up in the uverbs module's idrs, and introduced a severe bug in the process: there is no checking that an operation is being performed by the right process any more. Fix this by adding the missing check of uobj->context in __idr_get_uobj(). Apparently everyone is being very careful to only touch their own objects, because this bug was introduced in June 2006 in 2.6.18, and has gone undetected until now. Cc: stable <stable@kernel.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-08IB/uverbs: Export ib_umem_get()/ib_umem_release() to modulesRoland Dreier
Export ib_umem_get()/ib_umem_release() and put low-level drivers in control of when to call ib_umem_get() to pin and DMA map userspace, rather than always calling it in ib_uverbs_reg_mr() before calling the low-level driver's reg_user_mr method. Also move these functions to be in the ib_core module instead of ib_uverbs, so that driver modules using them do not depend on ib_uverbs. This has a number of advantages: - It is better design from the standpoint of making generic code a library that can be used or overridden by device-specific code as the details of specific devices dictate. - Drivers that do not need to pin userspace memory regions do not need to take the performance hit of calling ib_mem_get(). For example, although I have not tried to implement it in this patch, the ipath driver should be able to avoid pinning memory and just use copy_{to,from}_user() to access userspace memory regions. - Buffers that need special mapping treatment can be identified by the low-level driver. For example, it may be possible to solve some Altix-specific memory ordering issues with mthca CQs in userspace by mapping CQ buffers with extra flags. - Drivers that need to pin and DMA map userspace memory for things other than memory regions can use ib_umem_get() directly, instead of hacks using extra parameters to their reg_phys_mr method. For example, the mlx4 driver that is pending being merged needs to pin and DMA map QP and CQ buffers, but it does not need to create a memory key for these buffers. So the cleanest solution is for mlx4 to call ib_umem_get() in the create_qp and create_cq methods. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-06IB: Add CQ comp_vector supportMichael S. Tsirkin
Add a num_comp_vectors member to struct ib_device and extend ib_create_cq() to pass in a comp_vector parameter -- this parallels the userspace libibverbs API. Update all hardware drivers to set num_comp_vectors to 1 and have all ULPs pass 0 for the comp_vector value. Pass the value of num_comp_vectors to userspace rather than hard-coding a value of 1. We want multiple CQ event vector support (via MSI-X or similar for adapters that can generate multiple interrupts), but it's not clear how many vectors we want, or how we want to deal with policy issues such as how to decide which vector to use or how to set up interrupt affinity. This patch is useful for experimenting, since no core changes will be necessary when updating a driver to support multiple vectors, and we know that we want to make at least these changes anyway. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-22IB/uverbs: Return correct error for invalid PD in register MRRoland Dreier
If no matching PD is found in ib_uverbs_reg_mr(), then the function jumps to err_release without setting the return value ret. This means that ret will hold the return value of the call to ib_umem_get() a few lines earlier; if the function reaches the point where it looks for the PD, we know that ib_umem_get() must have returned 0, so ib_uverbs_reg_mr() ends up return 0 for a bad PD ID. Fix this by setting ret to -EINVAL before jumping to the exit path when no PD is found. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-04IB: Return qp pointer as part of ib_wcMichael S. Tsirkin
struct ib_wc currently only includes the local QP number: this matches the IB spec, but seems mostly useless. The following patch replaces this with the pointer to qp itself, and updates all low level drivers and all users. This has the following advantages: - Ability to get a per-qp context through wc->qp->qp_context - Existing drivers already have the qp pointer ready in poll cq, so this change actually saves a tiny bit (extra memory read) on data path (for ehca it would actually be expensive to find the QP pointer when polling a CQ, but ehca does not support SRQ so we can leave wc->qp as NULL for ehca) - Users that need the QP number can still get it through wc->qp->qp_num Use case: In IPoIB connected mode code, I have a common CQ shared by multiple QPs. To track connection usage, I need a way to get at some per-QP context upon the completion, and I would like to avoid allocating context object per work request just to stick a QP pointer into it. With this code, I can just use wc->qp->qp_context. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-10-30IB/uverbs: Return sq_draining value in query_qp responseJack Morgenstein
Return the sq_draining value back to user space for query_qp instead of the en_sqd_async notify value, which is valid only for modify_qp. For query_qp, the draining status should returned. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB: Whitespace fixesRoland Dreier
Remove some trailing whitespace that has snuck in despite the best efforts of whitespace=error-all. Also fix a few other whitespace bogosities. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/uverbs: Pass userspace data to modify_srq and modify_qp methodsRalph Campbell
Pass a struct ib_udata to the low-level driver's ->modify_srq() and ->modify_qp() methods, so that it can get to the device-specific data passed in by the userspace driver. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/uverbs: Allow resize CQ operation to return driver-specific dataRalph Campbell
Add a ib_uverbs_resize_cq_resp.driver_data field so that low-level drivers can return data from a resize CQ operation to userspace. Have ib_uverbs_resize_cq() only copy the cqe field, to avoid having to bump the userspace ABI. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/uverbs: Fix lockdep warning when QP is created with 2 CQsRoland Dreier
Lockdep warns when userspace creates a QP that uses different CQs for send completions and receive completions, because both CQs are locked and their mutexes belong to the same lock class. However, we know that the mutexes are distinct and the nesting is safe (there is no possibility of AB-BA deadlock because the mutexes are locked with down_read()), so annotate the situation with SINGLE_DEPTH_NESTING to get rid of the lockdep warning. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/uverbs: Use idr_read_cq() where appropriateRoland Dreier
There were two functions that open-coded idr_read_cq() in terms of idr_read_uobj() rather than using the helper. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-07-23IB/uverbs: Fix lockdep warningsRoland Dreier
Lockdep warns because uverbs is trying to take uobj->mutex when it already holds that lock. This is because there are really multiple types of uobjs even though all of their locks are initialized in common code. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-07-23IB/uverbs: Fix unlocking in error pathsMichael S. Tsirkin
ib_uverbs_create_ah() and ib_uverbs_create_srq() did not release the PD's read lock in their error paths, which lead to deadlock when destroying the PD. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-30IB/uverbs: Set correct user handle for user SRQsRoland Dreier
Store away the user handle passed in from userspace when creating an SRQ, so that the kernel can return the correct handle when an SRQ asynchronous event occurs. (A 0 was incorrectly stored as the user handle as part of the changes in 9ead190b, "IB/uverbs: Don't serialize with ib_uverbs_idr_mutex") Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-22IB/uverbs: Don't free wr list when it's known to be emptyKrishna Kumar
In ib_uverbs_post_send(), move the "out:" label after the loop that frees the list of work requests, since the only place that jumps there is before any work requests could possibly be added to the list. This removes a compile warning: "is_ud might be used uninitialized in this function". Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17IB/uverbs: Don't serialize with ib_uverbs_idr_mutexRoland Dreier
Currently, all userspace verbs operations that call into the kernel are serialized by ib_uverbs_idr_mutex. This can be a scalability issue for some workloads, especially for devices driven by the ipath driver, which needs to call into the kernel even for datapath operations. Fix this by adding reference counts to the userspace objects, and then converting ib_uverbs_idr_mutex into a spinlock that only protects the idrs long enough to take a reference on the object being looked up. Because remove operations may fail, we have to do a slightly funky two-step deletion, which is described in the comments at the top of uverbs_cmd.c. This also still leaves ib_uverbs_idr_lock as a single lock that is possibly subject to contention. However, the lock hold time will only be a single idr operation, so multiple threads should still be able to make progress, even if ib_uverbs_idr_lock is being ping-ponged. Surprisingly, these changes even shrink the object code: add/remove: 23/5 grow/shrink: 4/21 up/down: 633/-693 (-60) Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17IB/uverbs: Factor out common idr codeRoland Dreier
Factor out common code for adding a userspace object to an idr into a function idr_add_uobj(). This shrinks both the source and object code: add/remove: 1/0 grow/shrink: 0/6 up/down: 57/-220 (-163) function old new delta idr_add_uobj - 57 +57 ib_uverbs_create_ah 543 512 -31 ib_uverbs_create_srq 662 630 -32 ib_uverbs_reg_mr 737 699 -38 ib_uverbs_create_cq 639 600 -39 ib_uverbs_alloc_pd 485 446 -39 ib_uverbs_create_qp 1020 979 -41 Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17IB/uverbs: Don't decrement usecnt on error pathsRoland Dreier
In error paths when destroying an object, uverbs should not decrement associated objects' usecnt, since ib_dereg_mr(), ib_destroy_qp(), etc. already do that. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17IB/uverbs: Release lock on error pathGanapathi CH
If ibdev->alloc_ucontext() fails then ib_uverbs_get_context() does not unlock file->mutex before returning error. Signed-off by: Ganapathi CH <cganapathi@novell.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Use correct alt_pkey_index in modify QPAmi Perlmutter
The old code incorrectly used the primary P_Key index as the alternate index too. Signed-off-by: Ami Perlmutter <amip@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Fix query QP return of sq_sig_allDotan Barak
The old code didn't convert from the kernel's enum correctly. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Return actual capacity from create SRQ operationDotan Barak
Pass actual capacity of created SRQ back to userspace, so that userspace can report accurate capacities. This requires an ABI bump, to change struct ib_uverbs_create_srq_resp. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Support for query SRQ from userspaceDotan Barak
Add support to uverbs to handle querying userspace SRQs (shared receive queues), including adding an ABI for marshalling requests and responses. The kernel midlayer already has the underlying ib_query_srq() function. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Support for query QP from userspaceDotan Barak
Add support to uverbs to handle querying userspace QPs (queue pairs), including adding an ABI for marshalling requests and responses. The kernel midlayer already has the underlying ib_query_qp() function. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Whitespace cleanupsRoland Dreier
Remove trailing whitespace and fix indentation that with spaces instead of tabs. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Add userspace support for resizing CQsRoland Dreier
Add support to uverbs to handle resizing userspace CQs (completion queues), including adding an ABI for marshalling requests and responses. The kernel midlayer already has ib_resize_cq(). Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-13IB: convert from semaphores to mutexesIngo Molnar
semaphore to mutex conversion by Ingo and Arjan's script. Signed-off-by: Ingo Molnar <mingo@elte.hu> [ Sanity-checked on real IB hardware ] Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-10IB: Add node_guid to struct ib_deviceSean Hefty
Add a node_guid field to struct ib_device. It is the responsibility of the low-level driver to initialize this field before registering a device with the midlayer. Convert everyone to looking at this field instead of calling ib_query_device() when all they want is the node GUID, and remove the node_guid field from struct ib_device_attr. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-06IB/uverbs: Release event file reference on ib_uverbs_create_cq() errorJack Morgenstein
ib_uverbs_create_cq() should release the completion channel event file if an error occurs after it looks it up. Also, if userspace asks for a completion channel and we don't find it, an error should be returned instead of silently creating a CQ without a completion channel. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-06IB/uverbs: set ah_flags when creating address handleRalph Campbell
AH attribute's ah_flags need to be set according to the is_global flag passed in from userspace. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-06IB/uverbs: Fix reference counting on error pathsJack Morgenstein
If an operation fails after incrementing an object's reference count, then it should decrement the reference count on the error path. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-29IB/uverbs: track multicast group membership for userspace QPsJack Morgenstein
uverbs needs to track which multicast groups is each qp attached to, in order to properly detach when cleanup is performed on device file close. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-10[IB] uverbs: have kernel return QP capabilitiesJack Morgenstein
Move the computation of QP capabilities (max scatter/gather entries, max inline data, etc) into the kernel, and have the uverbs module return the values as part of the create QP response. This keeps precise knowledge of device limits in the low-level kernel driver. This requires an ABI bump, so while we're making changes, get rid of the max_sge parameter for the modify SRQ command -- it's not used and shouldn't be there. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-31[IB] uverbs: Avoid NULL pointer deref on CQ async eventRoland Dreier
Userspace CQs that have no completion event channel attached end up with their cq_context set to NULL. However, asynchronous events like "CQ overrun" can still occur on such CQs, so add a uverbs_file member to struct ib_ucq_object that we can follow to deliver these events. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-28[IB] uverbs: Fix device lifetime problemsRoland Dreier
Move ib_uverbs module to using cdev_alloc() and class_device_create() so that we can handle device lifetime properly. Now we can make sure we keep all of our data structures around until the last way to reach them is gone. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17[IB] uverbs: Implement more commandsRoland Dreier
Add kernel support for userspace calling poll CQ, request CQ notification, post send, post receive, post SRQ receive, create AH and destroy AH commands. These commands allow us to support userspace verbs for devices that can't perform these operations directly from userspace (eg the PathScale HCA). Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17[IB] uverbs: reject invalid memory registration permission flagsRoland Dreier
Reject userspace memory registrations with invalid permission flags: "local write" is required if "remote write" or "remote atomic" is also requested. Pointed out by Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17[IB] uverbs: Fix up resource creation error pathsRoland Dreier
By waiting to add resources to our lists until after the last operation that can fail, we don't have to remove them from their lists in the error path. Also, we should hold the idr mutex until we know whether resource creation has succeed or failed, to avoid someone finding a resource in our table before we're ready. Loosely based on work by Robert Walsh <rjwalsh@pathscale.com>. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17[IB] uverbs: ABI-breaking fixes for userspace verbsRoland Dreier
Introduce new userspace verbs ABI version 3. This eliminates some unneeded commands, and adds support for user-created completion channels. This cleans up problems with file leaks on error paths, and also makes sure that file descriptors are always installed into the correct process. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-09-26[IB] uverbs: Close some exploitable racesRoland Dreier
Al Viro pointed out that the current IB userspace verbs interface allows userspace to cause mischief by closing file descriptors before we're ready, or issuing the same command twice at the same time. This patch closes those races, and fixes other obvious problems such as a module reference leak. Some other interface bogosities will require an ABI change to fix properly, so I'm deferring those fixes until 2.6.15. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-09-09Make sure that userspace does not retrieve stale asynchronous orRoland Dreier
completion events after destroying a CQ, QP or SRQ. We do this by sweeping the event lists before returning from a destroy calls, and then return the number of events already reported before the destroy call. This allows userspace wait until it has processed all events for an object returned from the kernel before it frees its context for the object. The ABI of the destroy CQ, destroy QP and destroy SRQ commands has to change to return the event count, so bump the ABI version from 1 to 2. The userspace libibverbs library has already been updated to handle both the old and new ABI versions. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-08-26[PATCH] IB: userspace SRQ supportRoland Dreier
Add SRQ support to userspace verbs module. This adds several commands and associated structures, but it's OK to do this without bumping the ABI version because the commands are added at the end of the list so they don't change the existing numbering. There are two cases to worry about: 1. New kernel, old userspace. This is OK because old userspace simply won't try to use the new SRQ commands. None of the old commands are changed. 2. Old kernel, new userspace. This works perfectly as long as userspace doesn't try to use SRQ commands. If userspace tries to use SRQ commands, it will get EINVAL, which is perfectly reasonable: the kernel doesn't support SRQs, so we couldn't do any better. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-07-07[PATCH] IB uverbs: core implementationRoland Dreier
Add the core of the InfiniBand userspace verbs implementation, including creating character device nodes, dispatching requests from userspace, and passing event notifications back up to userspace. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>