Amir Vadai [Thu, 25 Feb 2010 09:43:03 +0000 (11:43 +0200)]
sdp: SendSM wasn't sent sometimes after getting SrcAvailCancel
* skb was freed if rx_sa was aborted, preventing SendSM
from being sent.
* rx_sa->used wasn't updated in case of SrcAvailCancel,
so RdmaRdCompl wasn't sent.
This also caused the next read to fail because the offset
wasn't updated.
Amir Vadai [Wed, 24 Feb 2010 08:59:31 +0000 (10:59 +0200)]
sdp: Fix bug in crossing SrcAvail
* Handle RdmaRdCompl in interrupt, before the split into two Q's.
This way the handling is sequential, and no race can occur
between RdmaRdCompl and SrcAvailCancel
* Fixed an error when checking that RdmaRdCompl is not for
old SrcAvail
If sdp_add_device() fails, there is no client data stored in the IB device,
leading to a kernel crash when a connection is being established. Fix this
by rejecting connections when the device is not initialized.
Also, fix a bad goto target in an error case early in sdp_init_qp().
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Amir Vadai [Tue, 16 Feb 2010 15:36:00 +0000 (17:36 +0200)]
sdp: Fix bugs in huge paged HW's
* Protect some constants that are based on PAGE_SIZE:
- FMR size
- xmit_goal
* renamed SDP_HEAD_SIZE => SDP_SKB_HEAD_SIZE
* removed unneeded special IA64 code due to changes here
Amir Vadai [Sun, 24 Jan 2010 15:12:34 +0000 (17:12 +0200)]
sdp: must use ib_sg_dma_*, not sg_dma_* for mapping
This fixes OFED bug 1895, although some warnings are still generated
when running qperf sdp_bw with large sizes (using zcopy) on the
truescale adapters.
Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Amir Vadai [Tue, 24 Nov 2009 07:32:39 +0000 (09:32 +0200)]
sdp: fixed BUG1796 - running out of memory on rx
The rcv queue could grow endlessly because the minimum number of RX buffers
in the QP was set to SDP_MIN_TX_CREDITS + 1, so there were always credits
available for the sender.
Jack Morgenstein [Sun, 30 Aug 2009 14:16:13 +0000 (17:16 +0300)]
sdp: incorrect SDP_FMR_SIZE on 32-bit machines
On 32-bit machines, sizeof (u64 *) is 4 bytes (size of a ***pointer***).
However, the max SDP FMR pool size should be PAGE_SIZE / sizeof(an mtt entry) --
and mtt entries are u64's (or __be64's).
This resulted in SDP requesting twice as many entries per pool on 32-bit machines
as could fit on a single page -- with the result that the fmr pool allocation failed
at driver startup.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
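The SDP_FMR_SIZE problem described above can be illustrated with a minimal userspace sketch. The helper names fmr_entries_buggy() and fmr_entries_fixed() are hypothetical (the driver itself uses a macro), and uint64_t stands in for the kernel's u64. The sketch only shows why dividing by the size of a pointer gives a different entry count than dividing by the size of an MTT entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; the real driver uses the
 * SDP_FMR_SIZE macro, whose exact definition is not shown here. */

/* Buggy form: divides by the size of a pointer, which is 4 bytes on a
 * 32-bit machine and 8 bytes on a 64-bit machine. */
static size_t fmr_entries_buggy(size_t page_size)
{
    return page_size / sizeof(uint64_t *);
}

/* Fixed form: divides by the size of one MTT entry, which is a 64-bit
 * value (8 bytes) regardless of the machine's pointer width. */
static size_t fmr_entries_fixed(size_t page_size)
{
    return page_size / sizeof(uint64_t);
}
```

On a machine with 4-byte pointers and 4096-byte pages, the buggy form would yield 1024 entries while the fixed form yields 512, which matches the "twice as many entries" described in the commit message.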
Jack Morgenstein [Sun, 30 Aug 2009 14:24:07 +0000 (17:24 +0300)]
sdp: check if sdp device is actually present in sdp_remove_one
If sdp fails to initialize at driver startup for any reason,
the device is still registered with the ib_core, but there will be
no client data (i.e., ib_set_client_data() will not be called, and all
kernel resources are de-allocated).
On removal, ib_get_client_data() will return NULL in this case -- and this
must be tested for -- or we will get a kernel Oops for a NULL pointer
dereference.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
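The NULL check described above might look roughly like the following userspace sketch. All types and functions here (mock_ib_device, mock_sdp_device, mock_get_client_data(), mock_sdp_remove_one()) are hypothetical stand-ins, not the kernel's ib_core API or the actual sdp_remove_one() body. The only point is the early return when no client data was ever stored.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the IB device and the per-device SDP state. */
struct mock_ib_device {
    void *client_data;   /* NULL if SDP initialization failed */
};

struct mock_sdp_device {
    int resources;       /* placeholder for the real per-device resources */
};

/* Stand-in for looking up the client data stored on the device. */
static void *mock_get_client_data(struct mock_ib_device *dev)
{
    return dev->client_data;
}

/* Returns 1 if per-device state was torn down, 0 if there was none. */
static int mock_sdp_remove_one(struct mock_ib_device *dev)
{
    struct mock_sdp_device *sdp_dev = mock_get_client_data(dev);

    /* SDP never initialized on this device: nothing to free, and
     * dereferencing sdp_dev here would be a NULL pointer dereference. */
    if (!sdp_dev)
        return 0;

    free(sdp_dev);
    dev->client_data = NULL;
    return 1;
}
```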
Amir Vadai [Tue, 28 Apr 2009 06:40:10 +0000 (09:40 +0300)]
sdp: BUG1311 Netpipe fails with a IB_WC_LOC_LEN_ERR.
This problem is seen when the receive buffer or the receive buffer fragments
are smaller than the sender's buffer. If the sender is using 64KB pages
and supports a sk fragment of 64KB it may send fragments that are a full 64KB
in length causing the receiver to generate an IB_WC_LOC_LEN_ERR. This patch
makes two changes:
* If the kernel does not support a full 64KB fragment it will reject resize
requests over 32K. (On older kernels a fragment size is defined as a U16)
* If a kernel supports a 64KB fragment then it allows a full 64KB receive
fragment to be used.
Signed-off-by: David Wilder <dwilder@us.ibm.com> Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Nicolas Morey-Chaisemartin [Wed, 29 Apr 2009 14:23:04 +0000 (16:23 +0200)]
sdp: change orphan_count and sockets_allocated from atomic_t to percpu_counter
Fixed SDP to work on 2.6.29+
As percpu_counter structures are huge, they cannot be allocated on the stack without causing the sdp module to crash.
Both variables are now dynamically allocated at module init.
Signed-off-by: Nicolas Morey-Chaisemartin <nicolas.morey-chaisemartin@ext.bull.net> Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Amir Vadai [Mon, 17 Nov 2008 08:11:27 +0000 (10:11 +0200)]
SDP: BUG1391 - bugs in the zero-copy send code
* fix sdp_bz_setup() code to handle the case of kernel data segment correctly (kernel sockets)
* make sdp_bz_setup() pass ENOMEM, EFAULT or other errors to sendmsg().
* Fix: the deallocation of bz descriptor in sendmsg() is not handled properly -- it is allocated many times, but freed once
* Fix: sdp_bzcopy_get() code does not raise reference count for all pages in the bz descriptor (only the "partial" pages will get the count raised).
However, the send completion code will call put_page() on all entries, leading to a crash for page-aligned transfers.
Signed-off-by: Constantine Gavrilov <constantine.gavrilov@gmail.com> Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Amir Vadai [Thu, 20 Nov 2008 10:56:39 +0000 (12:56 +0200)]
SDP: BUG1348 - sockets are left in CLOSE state with ref count > 0
Removed unnecessary sock_hold() when a CM_REJECT arrives before TCP_ESTABLISHED state.
This happened on the server side when a CM_REJ arrived after getting CM_REQ and
answering with CM_REP.
The sock_hold that was removed assumed that there would be a timewait state, but according to
the spec, the state changes back to LISTEN without TIMEWAIT.
Amir Vadai [Tue, 18 Nov 2008 13:41:15 +0000 (15:41 +0200)]
SDP: BUG1343 - Polygraph test crashes machine
No socket reference was taken before starting the DREQ timeout.
Cancelling delayed work only removes the work if it is still in the timer stage,
before it has entered the workqueue.
Because of that, sdp_dreq_timeout_work could still be in the workqueue
after the socket was destroyed, and when the socket was reused
and the inlined work structure was reset, things went wrong.
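The reference-counting idea behind this fix can be sketched in userspace as follows. This is an assumption-laden model, not the driver's code: mock_sock and the mock_* functions are hypothetical, and a boolean flag stands in for the delayed work being queued. It shows one way to pair a reference with pending work so the socket cannot go away while the work may still run.

```c
#include <stdbool.h>

/* Hypothetical model of a socket with a reference count and one
 * delayed work item; not the kernel's struct sock. */
struct mock_sock {
    int  refcnt;
    bool timeout_pending;   /* work is queued but has not run yet */
};

static void mock_sock_hold(struct mock_sock *sk) { sk->refcnt++; }
static void mock_sock_put(struct mock_sock *sk)  { sk->refcnt--; }

/* Take a reference on behalf of the work before queuing it. */
static void mock_start_dreq_timeout(struct mock_sock *sk)
{
    if (sk->timeout_pending)
        return;
    mock_sock_hold(sk);
    sk->timeout_pending = true;
}

/* Cancelling succeeds only while the work is still pending; only then
 * is the work's reference dropped by the canceller. */
static bool mock_cancel_dreq_timeout(struct mock_sock *sk)
{
    if (!sk->timeout_pending)
        return false;
    sk->timeout_pending = false;
    mock_sock_put(sk);
    return true;
}

/* The work handler drops the reference it was given when it finishes. */
static void mock_dreq_timeout_work(struct mock_sock *sk)
{
    sk->timeout_pending = false;
    /* ... timeout handling would go here ... */
    mock_sock_put(sk);
}
```

In this model the reference is released exactly once, either by a successful cancel or by the handler itself, so the count returns to its starting value on both paths.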
Amir Vadai [Sun, 26 Oct 2008 09:43:43 +0000 (11:43 +0200)]
sdp: Limit skb frag size to 64K-1
When 64K pages are in use, the skb_frag size can become larger
than the skb_frag can address. An skb_frag's max size is 64K-1.
This patch defines SDP_MAX_PAYLOAD as 64K - SDP_HEADER_SIZE.
The patch changes sdp_post_recv() and sdp_sendmsg() to use the smaller of
PAGE_SIZE or SDP_MAX_PAYLOAD as its segment size.
This fixes the bug here:
https://bugs.openfabrics.org/show_bug.cgi?id=1300
Signed-off-by: David Wilder <dwilder@us.ibm.com> Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
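The segment-size rule described above can be sketched as below. The function names and the SKETCH_64K constant are hypothetical, and the header size is passed as a parameter because the actual value of SDP_HEADER_SIZE is not given in the commit message.

```c
#include <stddef.h>

#define SKETCH_64K (64u * 1024u)

/* Mirrors "SDP_MAX_PAYLOAD = 64K - SDP_HEADER_SIZE" from the commit
 * message; header_size is supplied by the caller (assumed value). */
static size_t sketch_max_payload(size_t header_size)
{
    return SKETCH_64K - header_size;
}

/* Segment size is the smaller of the page size and the max payload. */
static size_t sketch_segment_size(size_t page_size, size_t header_size)
{
    size_t max_payload = sketch_max_payload(header_size);

    return page_size < max_payload ? page_size : max_payload;
}
```

With 4K pages the page size is the limit, so behavior is unchanged. With 64K pages the payload limit applies instead, keeping each fragment below 64K.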