Amir Vadai [Thu, 25 Feb 2010 09:43:03 +0000 (11:43 +0200)]
sdp: SendSM wasn't sent sometimes after getting SrcAvailCancel
* skb was freed if rx_sa is aborted - preventing SendSM
to be sent.
* Didn't update rx_sa->used in case of SrcAvailCancel
and therefore not sending RdmaRdCompl.
This also caused the next read to fail because offset
wasn't updated
Amir Vadai [Wed, 24 Feb 2010 08:59:31 +0000 (10:59 +0200)]
sdp: Fix bug in crossing SrcAvail
* Handle RdmaRdCompl in interrupt, before splitted to two Q's
This way the handling is sequencial, and no race could occure
between RdmaRdCompl and SrcAvailCancel
* Fixed an error when checking that RdmaRdCompl is not for
old SrcAvail
If sdp_add_device() fails, there is no client data stored in the IB device,
leading to a kernel crash when a connection is being established. Fix this
by rejecting connections when the device is not initialized.
Also, fix a bad goto target in an error case early in sdp_init_qp().
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Amir Vadai [Tue, 16 Feb 2010 15:36:00 +0000 (17:36 +0200)]
sdp: Fix bugs in huge paged HW's
* Protect some constants that are based on PAGE_SIZE:
- FMR size
- xmit_goal
* renamed SDP_HEAD_SIZE => SDP_SKB_HEAD_SIZE
* removed unneeded special IA64 code due to changes here
Amir Vadai [Sun, 24 Jan 2010 15:12:34 +0000 (17:12 +0200)]
sdp: must use ib_sg_dma_*, not sg_dma_* for mapping
This fixes OFED bug 1895, althoug some warnings are still generated,
when running qperf sdp_bw with large sizes (using zcopy), on the
truescale adapters.
Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Amir Vadai [Tue, 24 Nov 2009 07:32:39 +0000 (09:32 +0200)]
sdp: fixed BUG1796 - running out of memory on rx
rcv queue could grow endlessly because minimal RX buffers in QP
was set to SDP_MIN_TX_CREDITS + 1 - so there always were credits
available for the sender.
Jack Morgenstein [Sun, 30 Aug 2009 14:16:13 +0000 (17:16 +0300)]
sdp: incorrect SDP_FMR_SIZE on 32-bit machines
On 32-bit machines, sizeof (u64 *) is 4 bytes (size of a ***pointer***).
However, the max SDP FMR pool size should be PAGE_SIZE / sizeof(an mtt entry) --
and mtt entries are u64's (or __be64's).
This resulted in SDP requesting twice as many entries per pool on 32-bit machines
as could fit on a single page -- with the result that the fmr pool allocation failed
at driver startup.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Jack Morgenstein [Sun, 30 Aug 2009 14:24:07 +0000 (17:24 +0300)]
sdp: check if sdp device is actually present in sdp_remove_one
If sdp fails to initialize at driver startup for any reason,
the device is still registered with the ib_core, but there will be
no client data (i.e., ib_set_client_data() will not be called, and all
kernel resources are de-allocated).
On removal, ib_get_client_data() will return NULL in this case -- and this
must be tested for -- or we will get a kernel Oops for a NULL pointer
dereference.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Amir Vadai [Tue, 28 Apr 2009 06:40:10 +0000 (09:40 +0300)]
sdp: BUG1311 Netpipe fails with a IB_WC_LOC_LEN_ERR.
This problem is seen when the receive buffer or the receive buffer fragments
are smaller than the senders buffer. If the sender is using 64KB pages
and supports a sk fragment of 64KB it may send fragments that are a full 64KB
in length causing the receiver to generate an IB_WC_LOC_LEN_ERR. This patch
makes two changes:
If the kernel does not support a full 64KB fragment it will reject resize
requests over 32K. (On older kernels a fragment size is defined as a U16)
If a kernel supports a 64KB fragment then it allows a full 64KB receive
fragment to be used.
Signed-off-by: David Wilder <dwilder@us.ibm.com> Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Nicolas Morey-Chaisemartin [Wed, 29 Apr 2009 14:23:04 +0000 (16:23 +0200)]
sdp: change orphan_count and sockets_allocated from atomic_t to percpu_counter
Fixed SDP to work on 2.6.29+
As percpu_counter are huge they can be allocated on the stack without causing sdp module to crash.
Both variable are now dynamically allocated at module init.
Signed-off-by: Nicolas Morey-Chaisemartin <nicolas.morey-chaisemartin@ext.bull.net> Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Amir Vadai [Mon, 17 Nov 2008 08:11:27 +0000 (10:11 +0200)]
SDP: BUG1391 - bugs in the zero-copy send code
* fix sdp_bz_setup() code to handle the case of kernel data segment correctly (kernel sockets)
* make sdp_bz_setup() pass ENOMEM, EFAULT or other errors to sendmsg().
* Fix: the deallocation of bz descriptor in sendmsg() is not handled properly -- it is allocated many times, but freed once
* Fix: sdp_bzcopy_get() code does not raise reference count for all pages in the bz descriptor (only the "partial" pages will get the count raised).
However, the send completion code will call put_page() on all entries, leading to a crash for page-aligned transfers.
Signed-off-by: Constantine Gavrilov <constantine.gavrilov@gmail.com> Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Amir Vadai [Thu, 20 Nov 2008 10:56:39 +0000 (12:56 +0200)]
SDP: BUG1348 - sockets are left in CLOSE state with ref count > 0
Removed unnecessary sock_hold() when a CM_REJECT arrives before TCP_ESTABLISHED state.
This happend in the server side when after getting CM_REQ and answering with CM_REP a CM_REJ
arrived.
The sock_hold that was removed assumed that there will be a timewait state - but according to
the spec, the state changes back to LISTEN without TIMEWAIT.