Amir Vadai [Thu, 10 Jun 2010 08:32:50 +0000 (11:32 +0300)]
sdp: Fix bad handling of small rcvbuf size in zcopy
- Do not send RdmaRdCompl when there are no buffers
- Same for SendSM
- post at least 3 buffers in RX to have the minimal number of credits
- make purge_tx_ring ignore WR used by RDMA
- fixed a typo, to reschedule tx_cq_poll timer according to tx queue and
not to rx queue (!)
- Allow credit updates when less than half RX Q is filled
Amir Vadai [Wed, 9 Jun 2010 09:40:23 +0000 (12:40 +0300)]
sdp: cleanup skb allocations
- Bad sizing of inline data on send sockets hurt performance.
- All sent data is placed on the skb itself (unless accumulated by nagle)
- Do not count sdp header twice when allocating skb
added some likely/unlikely
Eldad Zinger [Sun, 30 May 2010 11:03:43 +0000 (14:03 +0300)]
sdp: device removal rewritten for a stability improvement.
main changes:
1. device_removal_lock is better used.
2. sdp_dev is set to NULL to prevent new sockets from being created on the
removed device.
3. a new timeout is used when a reference count was taken for the CMA
to return, but the CMA won't be invoked because rdma_id was destroyed.
Eldad Zinger [Tue, 4 May 2010 08:43:29 +0000 (11:43 +0300)]
sdp: BUG2031 - sdp_cma_handler() won't be invoked after last ref count removed
rdma_id is destroyed in cases where sdp_cma_handler() is no longer required.
This disables asynchronous event handling after the last reference count is removed.
The kernel always crashed in the following test case:
user program: socket+bind+listen+accept. socket accepted.
shell: rmmod mlx4_ib
user program: close <<< CRASH
The fix closes any socket whose ib_device is the device being removed.
Amir Vadai [Thu, 15 Apr 2010 08:57:11 +0000 (11:57 +0300)]
sdp: Don't try to allocate FMR larger than RLIMIT_MEMLOCK
During ZCopy, if the process does not have the CAP_IPC_LOCK capability and
the current max number of locked pages is smaller than the buffer size, split
the send into small fragments.
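
The splitting rule above can be sketched in userspace C. This is only an illustration under stated assumptions: the helper names are hypothetical and do not come from the SDP sources, and the limit is expressed in bytes here, whereas the commit compares against the maximum number of locked pages.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many bytes may be pinned for one zero-copy send.
 * has_ipc_lock stands in for a CAP_IPC_LOCK check; memlock_limit stands in
 * for the RLIMIT_MEMLOCK value, expressed in bytes for simplicity. */
static size_t zcopy_chunk_len(size_t remaining, size_t memlock_limit,
                              int has_ipc_lock)
{
    assert(memlock_limit > 0);

    /* Privileged callers, or buffers that already fit, go out in one piece. */
    if (has_ipc_lock || remaining <= memlock_limit)
        return remaining;

    /* Otherwise never pin more than the limit allows. */
    return memlock_limit;
}

/* Hypothetical helper: number of fragments a send of len bytes is split into. */
static size_t zcopy_fragment_count(size_t len, size_t memlock_limit,
                                   int has_ipc_lock)
{
    size_t fragments = 0;

    while (len > 0) {
        len -= zcopy_chunk_len(len, memlock_limit, has_ipc_lock);
        fragments++;
    }
    return fragments;
}
```

With a 64K limit, a 256K send from an unprivileged process would go out as four fragments, while a process holding CAP_IPC_LOCK would send it in one.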
Amir Vadai [Thu, 15 Apr 2010 08:52:19 +0000 (11:52 +0300)]
sdp: Don't count sdp header twice when calculating size_goal
sizeof(struct sdp_bsdh) is included inside skb->len. Ignore it
when calculating maximum payload of the skb.
This mistake caused every BCopy send of 32K (and its multiples) to
be split, which hurt performance.
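
A worked example of the effect, as a rough sketch. The 16-byte header length and the 32K size goal are assumptions chosen for illustration, and skbs_needed() is a hypothetical helper, not a function from the SDP sources.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only. */
enum {
    HDR_LEN   = 16,     /* placeholder for sizeof(struct sdp_bsdh) */
    SIZE_GOAL = 32768   /* placeholder for the per-skb payload goal */
};

/* Hypothetical helper: number of skbs needed to carry send_len bytes when
 * each skb has room for max_payload bytes of payload. */
static size_t skbs_needed(size_t send_len, size_t max_payload)
{
    assert(max_payload > 0);
    return (send_len + max_payload - 1) / max_payload;
}
```

If the header is subtracted a second time, each skb has room for only SIZE_GOAL - HDR_LEN bytes, so skbs_needed(32768, SIZE_GOAL - HDR_LEN) is 2: one nearly full skb plus a 16-byte tail. With the header counted once, skbs_needed(32768, SIZE_GOAL) is 1.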
sdp: added differentiation between bind failures of sdp.
When bind()ing in mode 'BOTH', bind(sdp_sock) might fail if:
1. the IP and port are already bound.
2. the IP is not part of the IB network.
The previous implementation returned errno=EADDRINUSE either way.
Only the first case should fail the bind(); the second is legitimate
because the TCP socket will handle the connection.
This fix corresponds to a fix in libsdp.
Amir Vadai [Thu, 25 Feb 2010 09:43:03 +0000 (11:43 +0200)]
sdp: SendSM wasn't sent sometimes after getting SrcAvailCancel
* skb was freed if rx_sa was aborted, preventing SendSM
from being sent.
* rx_sa->used was not updated in case of SrcAvailCancel,
so RdmaRdCompl was not sent.
This also caused the next read to fail because the offset
wasn't updated.
Amir Vadai [Wed, 24 Feb 2010 08:59:31 +0000 (10:59 +0200)]
sdp: Fix bug in crossing SrcAvail
* Handle RdmaRdCompl in interrupt, before being split into two Q's.
This way the handling is sequential, and no race can occur
between RdmaRdCompl and SrcAvailCancel
* Fixed an error when checking that RdmaRdCompl is not for
old SrcAvail
If sdp_add_device() fails, there is no client data stored in the IB device,
leading to a kernel crash when a connection is being established. Fix this
by rejecting connections when the device is not initialized.
Also, fix a bad goto target in an error case early in sdp_init_qp().
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Amir Vadai [Tue, 16 Feb 2010 15:36:00 +0000 (17:36 +0200)]
sdp: Fix bugs in huge paged HW's
* Protect some constants that are based on PAGE_SIZE:
- FMR size
- xmit_goal
* renamed SDP_HEAD_SIZE => SDP_SKB_HEAD_SIZE
* removed unneeded special IA64 code due to changes here
Amir Vadai [Sun, 24 Jan 2010 15:12:34 +0000 (17:12 +0200)]
sdp: must use ib_sg_dma_*, not sg_dma_* for mapping
This fixes OFED bug 1895, although some warnings are still generated
when running qperf sdp_bw with large sizes (using zcopy) on the
truescale adapters.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Amir Vadai [Tue, 24 Nov 2009 07:32:39 +0000 (09:32 +0200)]
sdp: fixed BUG1796 - running out of memory on rx
The rcv queue could grow endlessly because the minimal number of RX buffers
in the QP was set to SDP_MIN_TX_CREDITS + 1, so the sender always had
credits available.
Jack Morgenstein [Sun, 30 Aug 2009 14:16:13 +0000 (17:16 +0300)]
sdp: incorrect SDP_FMR_SIZE on 32-bit machines
On 32-bit machines, sizeof (u64 *) is 4 bytes (size of a ***pointer***).
However, the max SDP FMR pool size should be PAGE_SIZE / sizeof(an mtt entry) --
and mtt entries are u64's (or __be64's).
This resulted in SDP requesting twice as many entries per pool on 32-bit machines
as could fit on a single page -- with the result that the fmr pool allocation failed
at driver startup.
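
The arithmetic can be checked with a small sketch. A 4096-byte page is assumed, uint32_t stands in for the size of a pointer on a 32-bit machine, and entries_per_page() is a hypothetical helper rather than the actual SDP_FMR_SIZE definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: how many entries of entry_size bytes fit in one page. */
static size_t entries_per_page(size_t page_size, size_t entry_size)
{
    assert(entry_size > 0);
    return page_size / entry_size;
}
```

Dividing by the pointer size on a 32-bit machine, entries_per_page(4096, sizeof(uint32_t)), gives 1024 entries, while dividing by the size of an mtt entry, entries_per_page(4096, sizeof(uint64_t)), gives 512. The first figure is twice what actually fits on the page.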
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Jack Morgenstein [Sun, 30 Aug 2009 14:24:07 +0000 (17:24 +0300)]
sdp: check if sdp device is actually present in sdp_remove_one
If sdp fails to initialize at driver startup for any reason,
the device is still registered with the ib_core, but there will be
no client data (i.e., ib_set_client_data() will not be called, and all
kernel resources are de-allocated).
On removal, ib_get_client_data() will return NULL in this case -- and this
must be tested for -- or we will get a kernel Oops for a NULL pointer
dereference.
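
A minimal userspace sketch of the guard described above. The mock_* types and functions are stand-ins invented for illustration; they only imitate the shape of ib_get_client_data() and sdp_remove_one() and are not the kernel API or the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Mock stand-ins for the per-device SDP state and the IB device. */
struct mock_sdp_device {
    int pools_allocated;
};

struct mock_ib_device {
    void *client_data;  /* NULL when SDP failed to initialize on this device */
};

/* Imitates a client-data lookup: returns whatever was stored, possibly NULL. */
static void *mock_get_client_data(struct mock_ib_device *dev)
{
    return dev->client_data;
}

/* Imitates the removal callback. Returns 1 if teardown ran, 0 if there was
 * nothing to tear down. */
static int mock_remove_one(struct mock_ib_device *dev)
{
    struct mock_sdp_device *sdp_dev;

    assert(dev != NULL);

    sdp_dev = mock_get_client_data(dev);
    if (!sdp_dev)
        return 0;   /* initialization never completed: nothing to free */

    sdp_dev->pools_allocated = 0;   /* release per-device resources */
    dev->client_data = NULL;
    return 1;
}
```

Without the early return, the teardown path would dereference a NULL sdp_dev for any device on which initialization had failed.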
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>