www.infradead.org Git - users/jedix/linux-maple.git/log
users/jedix/linux-maple.git
8 years agoRDS: IB: skip rx/tx work when destroying connection
Wengang Wang [Thu, 11 Aug 2016 21:16:45 +0000 (14:16 -0700)]
RDS: IB: skip rx/tx work when destroying connection

Orabug: 24395789
quickref: 24314773

There is a race between the RDS connection destruction path
(rds_ib_conn_shutdown) and the IRQ path (rds_ib_cq_comp_handler_recv). The IRQ
path can schedule the tasklet (i_rtasklet) again (to receive data) between the
removal of the tasklet from the list and the destruction of the connection in
the destruction path. When the tasklet runs, it then accesses stale (destroyed)
data. In one observed case it was accessing ic->i_rcq, which is set to NULL by
the destruction path.

Fix:
We add a flag to the rds_ib_connection structure which, when set, indicates
that the connection is being destroyed. The flag is set after we reap the
receive CQ (i_rcq) and before we start to destroy the CQ in
rds_ib_conn_shutdown(). We also flush any rds_ib_rx running in the rds_aux_wq
worker thread before starting the destroy, so that no existing run of rds_ib_rx
(in the tasklet path or the worker thread path) accesses the destroyed receive
CQ. Newly queued jobs (tasklet or worker) exit on seeing the flag set, before
accessing the (possibly destroyed) receive CQ. The flag is cleared on new
connection completions to allow access to the re-created receive CQ. This patch
also takes care of rds_ib_cq_comp_handler_send (the IRQ handler for send).
Finally, we do a last reap after destroying the QP to handle the flush errors
and release resources.
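
The flag protocol described above can be sketched as a small userland model.
This is only an illustration under stated assumptions: the model_* names and
the struct layout are hypothetical and are not the fields or functions the
patch adds, and the flush of in-flight rx work is reduced to a comment.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the receive CQ; not the real struct ib_cq. */
struct model_cq {
    int pending;                /* completions waiting to be reaped */
};

/* Hypothetical model of the per-connection state described in the text. */
struct model_conn {
    atomic_bool destroying;     /* set while the connection is torn down */
    struct model_cq *rcq;       /* receive CQ; NULL once destroyed */
};

/* Models the rx job: returns true if it polled the CQ, false if it bailed. */
static bool model_rx(struct model_conn *ic)
{
    if (atomic_load(&ic->destroying))
        return false;           /* shutdown in progress: do not touch rcq */
    ic->rcq->pending = 0;       /* "reap" the completions */
    return true;
}

/* Models the shutdown path: set the flag first, then destroy the CQ. */
static void model_shutdown(struct model_conn *ic)
{
    atomic_store(&ic->destroying, true);
    /* the real code would flush in-flight rx work here before destroying */
    ic->rcq = NULL;
}

/* Models a new connection completion: new CQ in place, then clear the flag. */
static void model_connect_complete(struct model_conn *ic, struct model_cq *cq)
{
    ic->rcq = cq;
    atomic_store(&ic->destroying, false);
}
```

A job queued after model_shutdown() returns false without dereferencing rcq,
which is the property the commit relies on.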

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Reviewed-by: Rama Nichanamatlu <rama.nichanamatlu@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
8 years agoRDS: TCP: rds_tcp_accept_one() should transition socket from RESETTING to UP
Sowmini Varadhan [Tue, 28 Jun 2016 19:17:32 +0000 (12:17 -0700)]
RDS: TCP: rds_tcp_accept_one() should transition socket from RESETTING to UP

Orabug: 23542064

Backport of upstream commit 3bb549ae4c51 ("RDS: TCP:
rds_tcp_accept_one() should transition socket from RESETTING to UP")

The state of the rds_connection after rds_tcp_reset_callbacks() would
be RDS_CONN_RESETTING and this is the value that should be passed by
rds_tcp_accept_one() to rds_connect_path_complete() to transition the
socket to RDS_CONN_UP.

Fixes: b5c21c0947c1 ("RDS: TCP: fix race windows in send-path
quiescence by rds_tcp_accept_one()")
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoRDS: TCP: fix race windows in send-path quiescence by rds_tcp_accept_one()
Sowmini Varadhan [Tue, 7 Jun 2016 18:48:43 +0000 (11:48 -0700)]
RDS: TCP: fix race windows in send-path quiescence by rds_tcp_accept_one()

Orabug: 23542064

Backport of upstream commit 9c79440e2c5e ("RDS: TCP: fix race windows
in send-path quiescence by rds_tcp_accept_one()")

The send path needs to be quiesced before resetting callbacks from
rds_tcp_accept_one(), and commit eb192840266f ("RDS:TCP: Synchronize
rds_tcp_accept_one with rds_send_xmit when resetting t_sock") achieves
this using the c_state and RDS_IN_XMIT bit following the pattern
used by rds_conn_shutdown(). However this leaves the possibility
of a race window as shown in the sequence below
        take t_conn_lock in rds_tcp_conn_connect
        send outgoing syn to peer
        drop t_conn_lock in rds_tcp_conn_connect
        incoming from peer triggers rds_tcp_accept_one, conn is
          marked CONNECTING
        wait for RDS_IN_XMIT to quiesce any rds_send_xmit threads
        call rds_tcp_reset_callbacks
        [.. race-window where incoming syn-ack can cause the conn
          to be marked UP from rds_tcp_state_change ..]
        lock_sock called from rds_tcp_reset_callbacks, and we set
          t_sock to null
As soon as the conn is marked UP in the race-window above, rds_send_xmit()
threads will proceed to rds_tcp_xmit and may encounter a null-pointer
deref on the t_sock.

Given that rds_tcp_state_change() is invoked in softirq context, whereas
rds_tcp_reset_callbacks() is in workq context, and testing for RDS_IN_XMIT
after lock_sock could result in a deadlock with tcp_sendmsg, this
commit fixes the race by using a new c_state, RDS_TCP_RESETTING, which
will prevent a transition to RDS_CONN_UP from rds_tcp_state_change().
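
As a rough illustration of why an extra state closes the window, the sketch
below models the connection state as an atomic variable that only changes via
compare-and-exchange. The enum values and model_* helpers are hypothetical
and exist only for this example; they are not the kernel's definitions.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Local model of the connection states; values are arbitrary. */
enum model_state {
    MODEL_DOWN,
    MODEL_CONNECTING,
    MODEL_RESETTING,
    MODEL_UP
};

struct model_conn {
    _Atomic int c_state;
};

/* Move from 'old' to 'new_state' only if the state is currently 'old'. */
static bool model_transition(struct model_conn *c, int old, int new_state)
{
    return atomic_compare_exchange_strong(&c->c_state, &old, new_state);
}

/* Models the softirq state-change callback: only CONNECTING may become UP. */
static bool model_state_change_established(struct model_conn *c)
{
    return model_transition(c, MODEL_CONNECTING, MODEL_UP);
}
```

Once the accept path has moved the connection to MODEL_RESETTING, a late
syn-ack handled by model_state_change_established() fails its
compare-and-exchange, so the connection cannot be marked UP until the accept
path itself completes the transition from RESETTING.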

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoRDS: TCP: Retransmit half-sent datagrams when switching sockets in rds_tcp_reset_call...
Sowmini Varadhan [Tue, 7 Jun 2016 17:37:28 +0000 (10:37 -0700)]
RDS: TCP: Retransmit half-sent datagrams when switching sockets in rds_tcp_reset_callbacks

Orabug: 23542064

Backport of upstream commit 0b6f760cff04 ("RDS: TCP: Retransmit half-sent
datagrams when switching sockets in rds_tcp_reset_callbacks")

When we switch a connection's sockets in rds_tcp_reset_callbacks,
any partially sent datagram must be retransmitted on the new
socket so that the receiver can correctly reassemble the RDS
datagram. Use rds_send_reset() which is designed for this purpose.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoRDS: TCP: Add/use rds_tcp_reset_callbacks to reset tcp socket safely
Sowmini Varadhan [Tue, 7 Jun 2016 14:52:23 +0000 (07:52 -0700)]
RDS: TCP: Add/use rds_tcp_reset_callbacks to reset tcp socket safely

Orabug: 23542064

Backport of upstream commit 335b48d980f6 ("RDS: TCP: Add/use
rds_tcp_reset_callbacks to reset tcp socket safely")

When rds_tcp_accept_one() has to replace the existing tcp socket
with a newer tcp socket (duelling-syn resolution), it must lock_sock()
to suppress the rds_tcp_data_recv() path while callbacks are being
changed. Also, existing RDS datagram reassembly state must be reset,
so that the next datagram on the new socket does not have corrupted
state. Similarly, when resetting the newly accepted socket, appropriate
locks and synchronization are needed.

This commit ensures correct synchronization by invoking
kernel_sock_shutdown to reset a newly accepted sock, and by taking
appropriate lock_sock()s (for old and new sockets) when resetting
existing callbacks.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoRDS: TCP: Avoid rds connection churn from rogue SYNs
Sowmini Varadhan [Mon, 6 Jun 2016 21:47:18 +0000 (14:47 -0700)]
RDS: TCP: Avoid rds connection churn from rogue SYNs

Orabug: 23542064

Backport of upstream commit c948bb5c2cc4 ("RDS: TCP: Avoid rds connection
churn from rogue SYNs")

When a rogue SYN is received after the connection arbitration
algorithm has converged, the incoming SYN should not needlessly
quiesce the transmit path, and it should not result in needless
TCP connection resets due to re-execution of the connection
arbitration logic.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoRDS: TCP: rds_tcp_accept_worker() must exit gracefully when terminating rds-tcp
Sowmini Varadhan [Mon, 6 Jun 2016 20:22:43 +0000 (13:22 -0700)]
RDS: TCP: rds_tcp_accept_worker() must exit gracefully when terminating rds-tcp

Orabug: 23542064

Backport of upstream commit 37e14f4fe299 ("RDS: TCP: rds_tcp_accept_worker()
must exit gracefully when terminating rds-tcp")

There are two instances where we want to terminate RDS-TCP: when
exiting the netns or during module unload. In either case, the
termination sequence is to stop the listen socket, mark the
rtn->rds_tcp_listen_sock as null, and flush any accept workqs.
Thus any workqs that get flushed at this point will encounter a
null rds_tcp_listen_sock, and must exit gracefully to allow
the RDS-TCP termination to complete successfully.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoRDS: TCP: Remove kfreed tcp conn from list
Sowmini Varadhan [Mon, 6 Jun 2016 19:03:18 +0000 (12:03 -0700)]
RDS: TCP: Remove kfreed tcp conn from list

Orabug: 23542064

This is a backport of the upstream commit 8200a59f24ae
("rds: Remove kfreed tcp conn from list")

All the rds_tcp_connection objects are stored in a list, but when
one is freed it should be removed from that list.

Original author: Pavel Emelyanov <xemul@parallels.com>

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoRDS: IB: Add MOS note details to link local(HAIP) address print
Santosh Shilimkar [Wed, 10 Aug 2016 19:30:36 +0000 (12:30 -0700)]
RDS: IB: Add MOS note details to link local(HAIP) address print

Update the log to include MOS note details and also make the
banner more prominent. This makes it consistent with the application,
which flags the similar error with MOS note details.

Orabug: 23027670

Acked-by: Mukesh Kacker <mukesh.kacker@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
8 years agoib/mlx4: Initialize multiple Mellanox HCAs in parallel
Qing Huang [Wed, 10 Aug 2016 17:14:25 +0000 (10:14 -0700)]
ib/mlx4: Initialize multiple Mellanox HCAs in parallel

This is a rework of UEK2 commit a8962313e121 ("OFED: Load multiple ...").
The goal of this patch is to reduce the total amount of system boot/kernel
startup time when there are multiple Mellanox HCAs present in the system.
Typically each HCA/PF requires 6~7s to initialize, plus extra time for
a certain number of VFs created by each PF. By default, multiple HCAs have
to be probed one by one in a serialized fashion.

The new scheme creates a work request for the current pci probe/mlx4 init
task and then returns -EPROBE_DEFER immediately to the probe caller, while
a system worker thread executes the work request in the background.
The main pci probe thread does not have to wait for the current probe
task to finish. The background init task's progress and return code
are saved by the system worker thread and processed from the deferred
queue.

Orabug: 20995222

Signed-off-by: Qing Huang <qing.huang@oracle.com>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
8 years agoRevert "IB/mlx4: Generate alias GUID for slaves"
Yuval Shaia [Wed, 10 Aug 2016 17:16:33 +0000 (10:16 -0700)]
Revert "IB/mlx4: Generate alias GUID for slaves"

Now the alias GUID management is moved to userland so
we no longer need this broken API.

Orabug: 24355806

Reviewed-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
8 years agoIB/mlx4: Do not generate random node_guid for VFs
Yuval Shaia [Thu, 21 Jul 2016 12:41:14 +0000 (05:41 -0700)]
IB/mlx4: Do not generate random node_guid for VFs

Exadata fast node detection and fail-over mechanism(s) rely on the fact
that the node GUID in the guest is the same as in dom0.

Orabug: 22145330

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Reviewed-by: Qing Huang <qing.huang@oracle.com>
8 years ago{IB/{core,ipoib},net/{mlx4,rds}}: Mark unload_allowed as __initdata variable
Yuval Shaia [Mon, 30 May 2016 10:23:28 +0000 (03:23 -0700)]
{IB/{core,ipoib},net/{mlx4,rds}}: Mark unload_allowed as __initdata variable

Replacing __read_mostly directive with __initdata since this variable is
used only during module initialization. Module parameter permissions are
changed accordingly.

Orabug: 23501273

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Reviewed-By: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
8 years agoEPSC_API_VERSION(2,8) - New EPSC_QUERY_ON_CHIP_TEMP
Lars Paul Huse [Fri, 22 Jul 2016 21:21:30 +0000 (23:21 +0200)]
EPSC_API_VERSION(2,8) - New EPSC_QUERY_ON_CHIP_TEMP

Also added a new EPSA_GET_EXPORTED_SYMBOL_MAP
which returns a list of exported EPSA runtime symbols.

Orabugs: 24317746, 23168922

Change-Id: I97c600950fefe6649ec0a8b6539f7225d78aa9c4
Reviewed-by: Knut Omang <knut.omang@oracle.com>
8 years agosif: pqp: Be less aggressive in invoking cond_resched()
Knut Omang [Fri, 22 Jul 2016 12:30:24 +0000 (14:30 +0200)]
sif: pqp: Be less aggressive in invoking cond_resched()

This commit attempts to avoid unnecessary or potentially dangerous
calls to cond_resched() in the privileged QP completion polling code.

Privileged QP requests typically take a few microseconds to
complete. Since we usually need the result of the operation
to be able to continue, user code usually calls poll_cq_waitfor()
to busy wait for the completion of the request. This commit
adds two measures to make this logic better:

1) Avoid rescheduling while interrupts have been turned off:
The driver was using the in_interrupt() test to avoid calling
cond_resched() from interrupt context, and leaving the rest of
the decision making of whether or not to reschedule to cond_resched().
Testing indicates that this could lead to deadlock prone calls to
schedule(), as cond_resched() would actually allow rescheduling if interrupts
have been disabled. Switch this logic to use irqs_disabled() instead, which will
cover both the interrupt case and the cases where interrupts have been disabled
by the caller.

2) Busywait for the completion for a few cycles before even trying to reschedule
or cpu_relax. Measurements indicate that 10 tries are enough to cover
a large fraction of cases on a lightly loaded system.
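
The two measures can be sketched together in a userland model. This is a
hedged illustration, not the driver's code: model_cq, model_poll_once() and
model_wait_for() are hypothetical, the may_sleep argument stands in for
!irqs_disabled(), and sched_yield() stands in for cond_resched().

```c
#include <sched.h>
#include <stdbool.h>

#define MODEL_SPIN_TRIES 10     /* the "10 tries" figure from the text */

/* Hypothetical completion source: reports done on poll number 'ready_after'. */
struct model_cq {
    int polls;
    int ready_after;
};

static bool model_poll_once(struct model_cq *cq)
{
    return ++cq->polls >= cq->ready_after;
}

/*
 * Busy-wait for a completion and return how many times the CPU was yielded.
 * 'may_sleep' is false when interrupts are disabled (or in interrupt context).
 */
static int model_wait_for(struct model_cq *cq, bool may_sleep)
{
    int tries = 0;
    int yields = 0;

    while (!model_poll_once(cq)) {
        if (++tries < MODEL_SPIN_TRIES)
            continue;           /* spin a few times before anything else */
        if (may_sleep) {
            sched_yield();      /* userland stand-in for cond_resched() */
            yields++;
        }
        /* with interrupts off, keep polling and never reschedule */
    }
    return yields;
}
```

A request that completes within the first few polls never reaches the yield,
and a caller that cannot sleep never yields regardless of how long it waits.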

Orabug: 23733539

Change-Id: Ief35e1828d4dde9b692640f259c3df80ccdb553b
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Francisco Trivino-Garcia <francisco.trivino@oracle.com>
8 years agosif: xrc: Add handling for xrc_domain_violation & invalid_xrceth events
Vinay Shaw [Sat, 16 Jul 2016 06:32:54 +0000 (08:32 +0200)]
sif: xrc: Add handling for xrc_domain_violation & invalid_xrceth events

The XRC spec defines the xrc_domain_violation and invalid_xrceth events as
affiliated asynchronous errors on the XRC TGTQP, which leave the QP in ERROR.
Map these to ofed's IB_EVENT_QP_FATAL type and the qp event handler.

Orabug: 24318556

Signed-off-by: Vinay Shaw <vinay.shaw@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
8 years agosif: dfs: Minor change to print CQ tied to XRCSRQ (rq_hw).
Vinay Shaw [Tue, 19 Jul 2016 13:37:08 +0000 (15:37 +0200)]
sif: dfs: Minor change to print CQ tied to XRCSRQ (rq_hw).

Orabug: 24318845

Signed-off-by: Vinay Shaw <vinay.shaw@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
8 years agosif: During driver load, hold back events instead of ignoring them
Knut Omang [Wed, 20 Jul 2016 12:03:13 +0000 (14:03 +0200)]
sif: During driver load, hold back events instead of ignoring them

The current semantics when a queued event is received before the driver
is done loading is to ignore it with a log warning. This was not sufficient
to implement the flush_retry_qp setup, which relies on lid change events.

Unfortunately the solution to make an exception for lid events for the
flush_retry_qp is not valid because it defeats the purpose of the
check in the first place by allowing such an event to be handled before
the data structure needed to handle it is initialized.

This commit introduces a new kernel completion that the driver completes
when the whole driver load is finished. The first EPSC event queued on the
sif work queue will now block on this completion.

This covers all the remaining cases not handled by commit
"eq: Avoid enabling interrupts on TSU EQs until the initialization is complete"
and solves the original problem that introduced the need for a fix.

Orabug: 24296729

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
8 years agosif: Let sif_remove implement the shutdown entry point
Knut Omang [Wed, 20 Jul 2016 06:15:24 +0000 (08:15 +0200)]
sif: Let sif_remove implement the shutdown entry point

Due to bugs in the FLR handling on PSIF 2.1, we need to make sure
that the driver gets to do a full unload whenever possible.
Simply let the shutdown entry point be sif_remove similar
to the remove entry point.

Orabug: 24322970

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
8 years agosif: pqp: Fix potential null pointer exception under high load
Knut Omang [Tue, 19 Jul 2016 07:23:05 +0000 (09:23 +0200)]
sif: pqp: Fix potential null pointer exception under high load

If a high number of invalidate requests are posted
without requesting completions, the PQP may become so
full that it cannot accept another posted request.

To handle this scenario, an additional attempt to send a
synchronous invalidate request was added. Unfortunately
that request ended up being posted with synchronous semantics
but without a handle to handle the completion.

This commit fixes this case by dynamically allocating/freeing
a handle in such situations.

Orabug: 24316139

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
8 years agosif: fmr: call sif_post_flush_tlb with ptw flush and in SR/IOV cases
Knut Omang [Tue, 19 Jul 2016 05:54:45 +0000 (07:54 +0200)]
sif: fmr: call sif_post_flush_tlb with ptw flush and in SR/IOV cases

Flush the ptw cache as part of the bulk tlb flush.

With the introduction of more dynamic page tables,
the code to distinguish between the two types of cases,
simple page table entries and entries with interior nodes
was simplified because we no longer always know if PTW entries
have been consumed. Unfortunately the code was unified to the
wrong case, which does not flush the ptw cache. This can leave
stale entries in the cache and could theoretically make
sif look up pages that no longer exist or that have been reused
for other purposes.

Also now that the EPSC handles the tlb flushing, it is just fine
to call that code even in virtualized setups.
Remove the tests for whether VFs exist or whether we are running in a VF.

Orabug: 24315529

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
8 years agosif: eq: Avoid enabling interrupts on TSU EQs until the initialization is complete
Knut Omang [Mon, 18 Jul 2016 11:30:01 +0000 (13:30 +0200)]
sif: eq: Avoid enabling interrupts on TSU EQs until the initialization is complete

During driver load we might have some rare conditions where external events
occur before the driver is ready to accept them. The hardware workarounds
to handle issues with QP flushing are particularly sensitive to this.

Delay enabling of the IRQs that can generate interrupts for all
event queues except the EPS event queue(s) until everything is set up and ready.

Note that this commit will also implicitly cause interrupts for EQs 1-3 for each EPSA
not to be enabled. This is no big deal as they are currently not used anyway.

Orabug: 24296729

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
8 years agosif: base: change default queue size according to ED scale_profile=1
Hakon Bugge [Thu, 14 Jul 2016 11:34:18 +0000 (13:34 +0200)]
sif: base: change default queue size according to ED scale_profile=1

Make queue sizes of PSIF equal to those of cx3 when using ED's
scale_profile=1.

Signed-off-by: Hakon Bugge <Haakon.Bugge@oracle.com>
Orabug: 23141108
Reviewed-by: Knut Omang <knut.omang@oracle.com>
8 years agosif: sif_eq: fix missing qp->refcnt decrement for COMM_EST events
Francisco Triviño [Wed, 13 Jul 2016 15:23:40 +0000 (17:23 +0200)]
sif: sif_eq: fix missing qp->refcnt decrement for COMM_EST events

qp->refcnt is increased by 1 when event_status_communication_established
is dispatched but later it is not decremented when handling the event
work for IB_EVENT_COMM_EST for UD & RAW QP types.

This commit decrements qp->refcnt for those cases too and fixes another
potential bug by moving the sif_log line up before the qp->refcnt
decrement.

Orabug: 24288467

Signed-off-by: Francisco Triviño <francisco.trivino@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
8 years agoEPSC_API_VERSION(2,6) - Adding retrieval of SMP and vlink connect modes
Harald Høeg [Fri, 15 Jul 2016 08:03:26 +0000 (10:03 +0200)]
EPSC_API_VERSION(2,6) - Adding retrieval of SMP and vlink connect modes

Orabug: 23634562

Change-Id: Ic3eea7a7297c9ff97e72cb25dda4ba44fdfa2937
Signed-off-by: Harald Høeg <harald.hoeg@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
8 years agosif: eq: increase cq_eq_max to 46
Hakon Bugge [Fri, 8 Jul 2016 10:39:19 +0000 (12:39 +0200)]
sif: eq: increase cq_eq_max to 46

PSIF supports 48 msi-x interrupts. We associate one msi-x per event
queue (EQ). Further, PSIF needs one eq for epsc and one for async
events from the hardware. That leaves 46 for completion notification
events or completion vectors.

This commit also reduces the number of completion notification event
queues to the lesser of the number of cpus present and the default.
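
The arithmetic above can be written out as a short sketch. The macro and
function names are hypothetical, and passing the maximum as the default is
only an assumption made for the usage example; the text does not state what
the driver's default is.

```c
#define MODEL_MSIX_VECTORS 48u  /* MSI-X interrupts supported, per the text */
#define MODEL_RESERVED_EQS 2u   /* one EQ for EPSC, one for async events */

/* Upper bound on completion event queues: 48 - 2 = 46. */
static unsigned int model_cq_eq_max(void)
{
    return MODEL_MSIX_VECTORS - MODEL_RESERVED_EQS;
}

/* Number of completion EQs actually set up: the lesser of CPUs and default. */
static unsigned int model_num_cq_eqs(unsigned int cpus, unsigned int dflt)
{
    return cpus < dflt ? cpus : dflt;
}
```

For example, an 8-CPU system with a default of 46 would end up with 8
completion event queues, while a 64-CPU system would be capped at 46.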

Note, this requires fw 1.0.0.1 or newer...

Orabug: 23705843

Change-Id: Iea9101bf09203dff86403453a7e0690cb31b3756
Signed-off-by: Hakon Bugge <Haakon.Bugge@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
8 years agosif: sif_r3: implemented WA#4074 stats counters
Triviño [Thu, 30 Jun 2016 14:30:07 +0000 (16:30 +0200)]
sif: sif_r3: implemented WA#4074 stats counters

This commit adds both wa4074 and wa4059 statistics
to help identify potential issues when the
workarounds are applied.

The wa4074 stats implementation is based on:

a) pre_wa4074_cnt == post_wa4074_cnt. This means the
w/a is triggered from the modify_qp_hw.
b) pre_wa4074_cnt != post_wa4074_cnt. post_wa4074 is
triggered from other scenarios too.
c) post_wa4074_err_cnt != 0. It means that post_wa4074
fails.
d) wrs_csum_corr_wa4074_cnt indicates the number of
WRs that were csum corrupted.
e) rcv_snd_gen_wa4074_cnt shows the number of recv
and send cqe's that were generated.

The wa4059 stats indicate the number of keep-alive
events that have been sent.

This commit also improves the wa3714 stats implementation
by using atomic64 counters and enumeration values,
and makes other minor changes such as cleanups and
typo fixes in comments.

Orabug: 23760170

Signed-off-by: Triviño <francisco.trivino@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
8 years agosif: Remove software emulation of > 16 SGEs
Hans Westgaard Ry [Wed, 6 Jul 2016 10:44:35 +0000 (12:44 +0200)]
sif: Remove software emulation of > 16 SGEs

Orabug: 24310514

Change-Id: I1886d138b0ff103b074c45da475b3052dd1fd9b1
Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
8 years agosif: rq: Do not clear the rq_sw until the completion of flush_rq
Wei Lin Guay [Tue, 5 Jul 2016 11:51:37 +0000 (13:51 +0200)]
sif: rq: Do not clear the rq_sw until the completion of flush_rq

Orabug: 23754857

The rq can be invalidated from reset_qp or flush_rq. Nevertheless,
the rq_sw data structure has been reset after rq is invalidated in
reset_qp regardless of the completion of the flush_rq. Thus, move
the rq synchronization to reset_qp, and place the synchronization
between reset_qp and flush_rq. After invalidating and
resetting the rq, no flush rq is required as both head and tail
have been reset to 0.

This commit creates another atomic_t variable for the synchronization
between reset rq and flush_rq.

Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
8 years agoIBCM: dereference timewait_info only when needed
Santosh Shilimkar [Tue, 19 Jul 2016 02:35:26 +0000 (19:35 -0700)]
IBCM: dereference timewait_info only when needed

timewait_info is available in valid CM states and may
not even be allocated in invalid states.

Let's dereference it only when needed, in
those valid states.

Orabug: 24326732

Reviewed-by: Hakon Bugge <Haakon.Bugge@oracle.com>
Tested-by: Efrain Galaviz <efrain.galaviz@oracle.com>
Tested-by: Hong Liu <hong.x.liu@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
8 years agoIB: Add RNR timer workaround for PSIF
Santosh Shilimkar [Sat, 18 Jun 2016 20:06:29 +0000 (13:06 -0700)]
IB: Add RNR timer workaround for PSIF

The RNR NAK Retry timer on Titan and Sonoma 1&2 IB subsystems runs 500
times faster than desired. This means that retries are started a lot
sooner than they should.

The software workaround is a bit involved and intrusive because it needs
to work in mixed HCA environments. It uses CM protocol to detect the
involvement of the offending IB requestor and then enables the
workaround in the peer responder. To keep the workaround flag
persistent, ib_qp verbs need to carry the flag which impacts
IB core kABI which is wrapped under __GENKSYMS__.

The workaround matches the desired RNR NAK Retry timer value when the
encodings 1 to 14 (decimal) are supplied. For encodings larger than 14
and for zero, the workaround will set the largest possible RNR NAK
Timer value for the offending requestor, which is 1.31 ms.
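
One way such a compensation could work is sketched below. The table holds the
standard 5-bit RNR NAK timer encodings in units of 10 us (encoding 0 is the
largest value, 655.36 ms). The selection rule, picking the smallest timer
value that is at least 500 times the wanted delay, is an assumption made for
this example and is not taken from the patch; the model_* names are
hypothetical.

```c
#include <stdint.h>

#define MODEL_SPEEDUP 500u      /* the timer runs 500 times too fast */

/* RNR NAK timer encodings 0..31 in units of 10 us; encoding 0 = 655.36 ms. */
static const uint32_t model_rnr_10us[32] = {
    65536, 1, 2, 3, 4, 6, 8, 12, 16, 24, 32, 48, 64, 96, 128, 192,
    256, 384, 512, 768, 1024, 1536, 2048, 3072, 4096, 6144, 8192,
    12288, 16384, 24576, 32768, 49152
};

/* Encoding to program so that the too-fast timer approximates 'wanted'. */
static unsigned int model_rnr_workaround(unsigned int wanted)
{
    uint32_t target;
    uint32_t best = 0;
    unsigned int best_enc = 0;  /* fallback: encoding 0, the largest value */
    unsigned int enc;

    if (wanted == 0 || wanted > 14)
        return 0;               /* cannot be matched: use the largest value */

    target = model_rnr_10us[wanted] * MODEL_SPEEDUP;
    for (enc = 0; enc < 32; enc++) {
        uint32_t v = model_rnr_10us[enc];

        if (v >= target && (best == 0 || v < best)) {
            best = v;
            best_enc = enc;
        }
    }
    return best_enc;
}

/* Delay in microseconds that the too-fast timer actually produces. */
static uint32_t model_effective_us(unsigned int enc)
{
    return model_rnr_10us[enc] * 10u / MODEL_SPEEDUP;
}
```

Under this rule the largest achievable delay is 655.36 ms / 500, which is
about 1.31 ms and agrees with the figure quoted above.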

Thanks to Trivino, Haakon for updates and wide range of testing for
kernel as well as userland with mixed HCA configurations.

Orabug: 23633926

Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: David Brean <david.brean@oracle.com>
Tested-by: Francisco Triviño García <francisco.trivino@oracle.com>
Signed-off-by: Francisco Triviño García <francisco.trivino@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
8 years agoIB/core: Add encode/decode FDR/EDR rates
Hans Westgaard Ry [Mon, 11 Jul 2016 10:15:16 +0000 (12:15 +0200)]
IB/core: Add encode/decode FDR/EDR rates

The cases for FDR/EDR signalling speeds were missing from
ib_rate_to_mult and mult_to_ib_rate, giving wrong return values
when drivers convert a static rate to/from an inter-packet delay.
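
The kind of gap being fixed can be illustrated with a pair of conversion
functions over a local enum. Everything here is a stand-in: the model_rate
enum is not the kernel's enum ib_rate, and the multipliers for the 14 and
25 Gb/s entries (the rate divided by 2.5 Gb/s, rounded) are assumptions for
this sketch rather than values taken from the patch.

```c
/* Local stand-ins for a few link rates; not the kernel's enum ib_rate. */
enum model_rate {
    MODEL_RATE_2_5_GBPS,
    MODEL_RATE_5_GBPS,
    MODEL_RATE_10_GBPS,
    MODEL_RATE_14_GBPS,         /* FDR-class rate */
    MODEL_RATE_25_GBPS          /* EDR-class rate */
};

/* Rate to multiple of the 2.5 Gb/s base rate; -1 if the rate is unknown. */
static int model_rate_to_mult(enum model_rate rate)
{
    switch (rate) {
    case MODEL_RATE_2_5_GBPS: return 1;
    case MODEL_RATE_5_GBPS:   return 2;
    case MODEL_RATE_10_GBPS:  return 4;
    case MODEL_RATE_14_GBPS:  return 6;   /* 14 / 2.5 = 5.6, rounded */
    case MODEL_RATE_25_GBPS:  return 10;
    }
    return -1;
}

/* Inverse mapping; -1 if no rate has this multiplier. */
static int model_mult_to_rate(int mult)
{
    switch (mult) {
    case 1:  return MODEL_RATE_2_5_GBPS;
    case 2:  return MODEL_RATE_5_GBPS;
    case 4:  return MODEL_RATE_10_GBPS;
    case 6:  return MODEL_RATE_14_GBPS;
    case 10: return MODEL_RATE_25_GBPS;
    default: return -1;
    }
}
```

Without the last two cases in each switch, a driver asking about an FDR or
EDR rate would fall through to the error return, which is the class of wrong
return value the commit describes.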

Orabug: 23084916

Change-Id: Ib1d6e84eeea1addb830c415faf92f9f430c4ba32
Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
9 years agoxsigo: SKB Frag cleanup
Pradeep Gopanapalli [Tue, 12 Jul 2016 20:17:29 +0000 (13:17 -0700)]
xsigo: SKB Frag cleanup

Orabug: 23514725

Fixed pre-allocating transmit scatter gather lists by using
max_sge variable instead of MAX_SKB_FRAGS

some changes to prints

Signed-off-by: Pradeep Gopanapalli <pradeep.gopanapalli@oracle.com>
Reviewed-by: sajid zia <szia@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
9 years agoxsigo: Tx_tail goes outof bound
Pradeep Gopanapalli [Tue, 12 Jul 2016 20:14:34 +0000 (13:14 -0700)]
xsigo: Tx_tail goes outof bound

Orabug: 23514725

Fixed a rare condition where tx_tail value goes out of bound, by properly
locking poll_tx

Signed-off-by: Pradeep Gopanapalli <pradeep.gopanapalli@oracle.com>
Reviewed-by: sajid zia <szia@oracle.com>
Reviewed-by: Haakon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
9 years agoxsigo: Fixed Path locking issues
Pradeep Gopanapalli [Tue, 12 Jul 2016 20:10:56 +0000 (13:10 -0700)]
xsigo: Fixed Path locking issues

Orabug: 23514725

Changed xve_put_path to allow the condition where the caller
holds the private lock, priv->lock.
Removed the path_free function and put all of its functionality
in xve_put_path.
Scatter-gather is no longer used when the MTU is less than the admin MTU
(rather than the multicast MTU), as the admin MTU is the driving factor
for the vnic.

Signed-off-by: Pradeep Gopanapalli <pradeep.gopanapalli@oracle.com>
Reviewed-by: sajid zia <szia@oracle.com>
Reviewed-by: Haakon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Asmund Ostvold <asmund.ostvold@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
9 years agonet/rds: Skip packet filtering if interface does not support ACL
Yuval Shaia [Thu, 9 Jun 2016 18:41:32 +0000 (11:41 -0700)]
net/rds: Skip packet filtering if interface does not support ACL

NULL value returned from ib_cm_dpp_acl_lookup for a given DPP means that
this DPP is not under ACL protection.
In this case we skip packet filtering.

Orabug: 23541567

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoRDS: Fix the rds_conn_destroy panic due to pending messages
Bang Nguyen [Sat, 16 Apr 2016 18:47:05 +0000 (11:47 -0700)]
RDS: Fix the rds_conn_destroy panic due to pending messages

In corner cases, there could be pending messages on a connection which
needs to be destroyed. Make sure those messages are purged before
the connection is torn down.

Orabug: 23222944

Signed-off-by: Bang Nguyen <bang.nguyen@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoRDS: add handshaking for ACL violation detection at passive
Ajaykumar Hotchandani [Thu, 14 Apr 2016 21:20:08 +0000 (14:20 -0700)]
RDS: add handshaking for ACL violation detection at passive

Offending connections with ACL violations should be cleaned up as
early as possible. When the active side detects an ACL violation and
sends a reject, it fills in the private_data field. The passive side
checks private_data whenever it receives a reject, and in case of an
ACL violation it destroys the connection.

Orabug: 23222944

Signed-off-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoRDS: IB: enforce IP anti-spoofing based on ACLs
Santosh Shilimkar [Tue, 15 Mar 2016 12:32:09 +0000 (05:32 -0700)]
RDS: IB: enforce IP anti-spoofing based on ACLs

A connection is established only if the IP requesting the connection
is legitimate and part of the ACL group. Invalid connection request(s)
are rejected and destroyed.

Ajay moved the connection destruction to the point where the ACL check fails
while initiating the connection, to avoid unnecessary packet transfer on
the wire.

Orabug: 23222944

Signed-off-by: Bang Nguyen <bang.nguyen@oracle.com>
Signed-off-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoRDS: Add acl fields to the rds_connection
Santosh Shilimkar [Wed, 3 Feb 2016 16:13:25 +0000 (08:13 -0800)]
RDS: Add acl fields to the rds_connection

ACLs can be enabled on connections. To track them per connection,
let's add a couple of fields.

Orabug: 23222944

Signed-off-by: Bang Nguyen <bang.nguyen@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoRDS: IB: invoke connection destruction in worker
Ajaykumar Hotchandani [Thu, 14 Apr 2016 20:58:46 +0000 (13:58 -0700)]
RDS: IB: invoke connection destruction in worker

This is to avoid a deadlock on the c_cm_lock mutex.
In the Infiniband event handling path, whenever connection destruction is
required, we should invoke a worker in order to avoid deadlocking on the mutex.

Orabug: 23222944

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
9 years agoRDS: Add reset all conns for a source address to CONN_RESET
Santosh Shilimkar [Wed, 23 Mar 2016 04:51:49 +0000 (21:51 -0700)]
RDS: Add reset all conns for a source address to CONN_RESET

RDS_CONN_RESET SO gets enhanced to support resetting all
connections associated with a local address.

$rds-stress -r <SRC_IP> -s 0 --reset

Orabug: 23222944

Reported-by: Bang Nguyen <bang.nguyen@oracle.com>
Acked-by: Bang Nguyen <bang.nguyen@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoIB/mlx4: Generate alias GUID for slaves
Yuval Shaia [Thu, 26 May 2016 16:25:19 +0000 (09:25 -0700)]
IB/mlx4: Generate alias GUID for slaves

Generate alias GUID by changing the fourth byte to be the GUID index in the
port GUID table.
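
As a rough illustration of the byte substitution described above, here is a minimal sketch. The helper name and the choice to count the "fourth byte" from the first byte in memory order (offset 3) are assumptions of this sketch, not taken from the mlx4 driver.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not the mlx4 code): copy an 8-byte port GUID and
 * overwrite its fourth byte with the index of the GUID in the port GUID
 * table. Counting the fourth byte from offset 0 in memory order is an
 * assumption of this sketch.
 */
void sketch_make_alias_guid(const uint8_t port_guid[8], uint8_t guid_index,
                            uint8_t alias_guid[8])
{
    memcpy(alias_guid, port_guid, 8);
    alias_guid[3] = guid_index;   /* fourth byte, offset 3 */
}
```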

This is a port of work done in UEK2, for Oracle purposes only.

Orabug: 23222944

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
9 years agoIB/ipoib: ioctl interface to manage ACL tables
Yuval Shaia [Mon, 4 Apr 2016 13:40:56 +0000 (16:40 +0300)]
IB/ipoib: ioctl interface to manage ACL tables

Expose an ioctl that lets the application layer manage ACL content.

Orabug: 23222944

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
merge into IOCTL code

9 years agoIB/ipoib: sysfs interface to manage ACL tables
Yuval Shaia [Mon, 4 Apr 2016 13:29:12 +0000 (16:29 +0300)]
IB/ipoib: sysfs interface to manage ACL tables

Expose sysfs interface for ACL to be used for debug.

Orabug: 23222944

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoIB/{cm,ipoib}: Filter traffic using ACL
Yuval Shaia [Mon, 4 Apr 2016 12:39:49 +0000 (15:39 +0300)]
IB/{cm,ipoib}: Filter traffic using ACL

Implement two packet filtering points: one in the ib_ipoib driver when
processing ARP packets and a second in ib_cm when processing connection
requests.

Orabug: 23222944

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoIB/{cm,ipoib}: Manage ACL tables
Yuval Shaia [Mon, 4 Apr 2016 10:56:50 +0000 (13:56 +0300)]
IB/{cm,ipoib}: Manage ACL tables

Add support for ACL tables for ib_ipoib and ib_cm drivers.
The ib_cm driver exposes functions to register and unregister tables and to
manage table content.
In the ib_ipoib driver, add an ACL object for each network device.

Orabug: 23222944

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agooffload ib subnet manager port and node get info query handling.
Rama Nichanamatlu [Tue, 21 Jun 2016 07:54:37 +0000 (00:54 -0700)]
offload ib subnet manager port and node get info query handling.

This change offloads the handling of IB subnet manager port and node
get-info queries to the HCA firmware. Responses to these port and node
get-info queries are time bound, and responses from the host-based SMA
software handler can be delayed by busy CPUs (RT workloads, interrupt
handlers, etc). Delayed responses can lead to the SM taking the node out of
the fabric, which is not desirable. The port/node INFO query offload lets
these specific SM queries be handled by the HCA firmware in a timely manner,
irrespective of the CPUs being busy at that moment in time.

Orabug: 23750258

Signed-off-by: Rama Nichanamatlu <rama.nichanamatlu@oracle.com>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoIB/ipoib: Adjust queue sizes
Ajaykumar Hotchandani [Thu, 26 May 2016 00:04:49 +0000 (17:04 -0700)]
IB/ipoib: Adjust queue sizes

Current UEK4 uses 128 as default send queue size, and 256 as default
receive queue size.
UEK2 uses 2048 for send and receive queue size as default.

This patch adjusts queue sizes to avoid potential reports regarding
performance bottlenecks on UEK4.

Orabug: 23302017

Signed-off-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
9 years agoIB/ipoib: Change send workqueue size for CM mode
Ajaykumar Hotchandani [Wed, 18 May 2016 01:54:42 +0000 (18:54 -0700)]
IB/ipoib: Change send workqueue size for CM mode

The idea here is that one misbehaving connection should not become a
single point of failure.

priv->tx_outstanding is shared by all QPs, and when it reaches
sendq_size, the network interface queue is stopped.

In connected mode, for every connection, the TX QP size is sendq_size.
So if one QP starts misbehaving and we don't receive send
completions in time, priv->tx_outstanding can reach the limit
at which the network interface queue has to be stopped.
This can bring down the entire cluster, because even ping will not go
forward from that point onwards.

With this patch, when creating CM QP for send operations, we limit size:
+int ipoib_cm_sendq_size __read_mostly = ipoib_sendq_size / 8;

Based on Yuval's suggestion, added a module parameter to dictate how many
bad connections we want to allow (the 8 above is configurable).

If the outstanding completions for a particular connection reach
ipoib_cm_sendq_size, we halt sending data on that connection
until we receive at least one completion.

In summary, this will require multiple QPs to misbehave (instead of 1)
in order to bring down the entire cluster.

As clarification, this patch does not try to recover or change the behavior
of a connection which may have gone bad; it only reduces the impact of a bad
connection.
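
The per-connection cap can be pictured with the following minimal sketch. The structure and function names are hypothetical and only model the accounting described above; they are not the IPoIB driver's code.

```c
#include <stdbool.h>

/* Hypothetical per-connection accounting, for illustration only. */
struct sketch_cm_tx {
    unsigned int outstanding;  /* sends posted without a completion yet */
    unsigned int sendq_size;   /* per-connection cap */
};

/* Per-connection cap derived from the shared send queue size. */
unsigned int sketch_cm_sendq_size(unsigned int sendq_size, unsigned int divisor)
{
    if (divisor == 0)
        divisor = 1;           /* guard against division by zero */
    return sendq_size / divisor;
}

/* Returns false when the connection is halted because its cap is reached. */
bool sketch_cm_try_post(struct sketch_cm_tx *tx)
{
    if (tx->outstanding >= tx->sendq_size)
        return false;
    tx->outstanding++;
    return true;
}

/* A single completion is enough to let the connection send again. */
void sketch_cm_complete(struct sketch_cm_tx *tx)
{
    if (tx->outstanding > 0)
        tx->outstanding--;
}
```

With a shared send queue of 128 entries and a divisor of 8, one stuck connection can hold at most 16 outstanding sends, so several connections would have to misbehave before the shared limit is reached.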

Orabug: 23254764

Signed-off-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
9 years agomlx4_core: use higher log_rdmarc_per_qp when scale_profile is set
Mukesh Kacker [Thu, 30 Jun 2016 19:16:46 +0000 (12:16 -0700)]
mlx4_core: use higher log_rdmarc_per_qp when scale_profile is set

Another parameter log_rdmarc_per_qp is scaled up to higher
value (128) when scale_profile is set based on new requirement.

The commit
58f318ea1272 "net/mlx4_core: Modify default value of
              log_rdmarc_per_qp to be consistent with HW capability"
was modifying the Mellanox defaults to accomplish the same but
this change uses scale_profile to be consistent with all the
other changes from Mellanox defaults done for HCA parameters.

This also (indirectly) fixes a code merge issue
with the following commits, where a change to the default value
of log_rdmarc_per_qp got inadvertently reverted because
two independent changes that interacted were done
in the same merge window (albeit this fix does it with a
slightly different implementation):

58f318ea1272 "net/mlx4_core: Modify default value of
              log_rdmarc_per_qp to be consistent with HW capability"
3480399bdf6d "mlx4_core: scale_profile should work without params
              set to 0"

Orabug: 23725942

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoRDS: IB: change rds_ib_active_bonding_excl_ips to only RFC3927 space
Todd Vierling [Tue, 28 Jun 2016 19:56:13 +0000 (15:56 -0400)]
RDS: IB: change rds_ib_active_bonding_excl_ips to only RFC3927 space

Currently rds_ib_active_bonding_excl_ips excludes both
169.254.0.0/16 and 172.10.0.0/16 address ranges from use with RDS
bonding. This parameter was meant to default to the range used by
link-local addresses (169.254.0.0/16, RFC3927) as those do not play
nicely with InfiniBand.

"172.10/16" was probably a mistaken typing of "172.16/12", which is
one of the private use -- but not link-local -- ranges defined by
RFC1918. 172.10.0.0/16 is in active use on the global Internet
(part of the block 172.0.0.0/12 as of this writing); it doesn't
belong here.

Change the parameter default to only "169.254/16" per the original
change's intent.
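
The prefix arithmetic behind this reasoning can be checked with a small sketch. The helper and macro names are hypothetical and this is not the RDS parameter parser; addresses are in host byte order and the prefix length is assumed to be at most 32.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build a host-order IPv4 address from its four octets. */
#define SKETCH_IPV4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* True if addr lies inside prefix/len (len must be in 0..32). */
bool sketch_ipv4_in_prefix(uint32_t addr, uint32_t prefix, unsigned int len)
{
    uint32_t mask = (len == 0) ? 0 : (~(uint32_t)0 << (32 - len));

    return (addr & mask) == (prefix & mask);
}
```

Under these assumptions, 169.254.1.1 falls in 169.254.0.0/16, while 172.10.0.1 falls in 172.0.0.0/12 and not in the RFC1918 range 172.16.0.0/12.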

Orabug: 23712042

Signed-off-by: Todd Vierling <todd.vierling@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoRDS: avoid large pages for sg allocation for TCP transport
Santosh Shilimkar [Sat, 25 Jun 2016 21:56:18 +0000 (14:56 -0700)]
RDS: avoid large pages for sg allocation for TCP transport

To reduce SGEs, commit 23f90cc ("RDS: fix the sg allocation based
on actual message size") used the buddy allocator to allocate large
pages based on message size.

This change, though, seems to create an issue for the TCP transport, most
likely triggering a memory leak somewhere in the RDS TCP driver path.
The same core code with large pages seems to work just fine with the
IB transport.

This patch avoids the hugepage allocation for RDS TCP sockets.

Orabug: 23635336

Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoIB/ipoib v2: Add readout of statistics using ethtool
Hans Westgaard Ry [Thu, 30 Jun 2016 11:41:36 +0000 (13:41 +0200)]
IB/ipoib v2: Add readout of statistics using ethtool

IPoIB collects statistics of traffic including number of packets
sent/received, number of bytes transferred, and certain errors. This
patch makes these statistics available to be queried by ethtool.

Orabug: 23105464
Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Tested-by: Yuval Shaia <yuval.shaia@oracle.com>
Change-Id: I654587da2fd1628e0977346c87b4a3a2f08a4bdc

9 years agoIB/core: Add encode/decode IB_RATE_25_GBPS
Hans Westgaard Ry [Wed, 29 Jun 2016 10:22:16 +0000 (12:22 +0200)]
IB/core: Add encode/decode IB_RATE_25_GBPS

The case for IB_RATE_25_GBPS, EDR signalling speed, was missing in
ib_rate_to_mult and mult_to_ib_rate giving wrong return values
when drivers are converting static rate to/from inter-packet-delay.
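
The multiplier is the rate expressed in units of 2.5 Gb/s, so 25 Gb/s corresponds to 10. The sketch below shows the kind of paired switch statements involved; the enum and function names are hypothetical, only a few rates are listed, and the fallback values are choices of this sketch rather than the ib_core ones.

```c
/* Hypothetical rate identifiers; these are not the kernel's enum values. */
enum sketch_rate {
    SKETCH_RATE_UNKNOWN,
    SKETCH_RATE_2_5_GBPS,
    SKETCH_RATE_5_GBPS,
    SKETCH_RATE_10_GBPS,
    SKETCH_RATE_25_GBPS
};

/* Rate to multiple of the 2.5 Gb/s base rate; -1 if the rate is not known. */
int sketch_rate_to_mult(enum sketch_rate rate)
{
    switch (rate) {
    case SKETCH_RATE_2_5_GBPS: return 1;
    case SKETCH_RATE_5_GBPS:   return 2;
    case SKETCH_RATE_10_GBPS:  return 4;
    case SKETCH_RATE_25_GBPS:  return 10;  /* the kind of case that was missing */
    default:                   return -1;
    }
}

/* Inverse mapping; a missing case here silently yields the fallback. */
enum sketch_rate sketch_mult_to_rate(int mult)
{
    switch (mult) {
    case 1:  return SKETCH_RATE_2_5_GBPS;
    case 2:  return SKETCH_RATE_5_GBPS;
    case 4:  return SKETCH_RATE_10_GBPS;
    case 10: return SKETCH_RATE_25_GBPS;
    default: return SKETCH_RATE_UNKNOWN;
    }
}
```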

Orabug: 23084916

Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
9 years agosif: Support for EPSC_API_VERSION(2,5)
Knut Omang [Sun, 3 Jul 2016 12:47:10 +0000 (14:47 +0200)]
sif: Support for EPSC_API_VERSION(2,5)

Up-to-date header files for protocol up to and including EPSC API v.2.5:
 * a new query, EPSC_QUERY_HW_REVISION to query HW revision through mailbox
   (from v2.5)
 * Enable Jumbo frame query support (from v2.4)
 * add DEGRADE_CAUSE_FLAG_MCAST_LACK_OF_CREDIT
   adding new cause for degraded mode (from v2.3)
 * adding external portinfo query:
   Adding a query for some portinfo attributes on the external port.
   Only a draft until PSIFFW implementation is done.
   (from v2.2)
 * API for mailbox for BER (BER = Bit Error Rate) support
   (from v2.1)

Signed-off-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: Be more memory conservative for kdump and xen pv
Knut Omang [Fri, 1 Jul 2016 09:16:03 +0000 (11:16 +0200)]
sif: Be more memory conservative for kdump and xen pv

- Enable using the Xen PV memory usage settings for kdump as well
- Tune these settings down by a factor of 2 to alleviate
  Orabug: 23523713

Orabug: 23729807

Signed-off-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: rq: Use a workqueue to handle sif_flush_rq
Wei Lin Guay [Fri, 1 Jul 2016 08:28:46 +0000 (10:28 +0200)]
sif: rq: Use a workqueue to handle sif_flush_rq

Orabug: 23491094

In sif_flush_rq, one of the required steps is to acquire
the qp mutex for the qp state transition. Thus, this commit
moves sif_flush_rq into a separate single-threaded
workqueue to ensure that sif_flush_rq is safe to call
from any context, including interrupt context.

Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: rq: Added synchronization between sif_flush_rq and sif_post_recv
Wei Lin Guay [Fri, 24 Jun 2016 13:19:33 +0000 (15:19 +0200)]
sif: rq: Added synchronization between sif_flush_rq and sif_post_recv

Orabug: 23491094

The sif_flush_rq retrieves the rq_sw->last_seq without acquiring
the rq lock. Thus, add the lock in sif_flush_rq to ensure that
the FLUSH-IN-ERR completion (rq_sw->last_seq) is only generated
after post_recv (rq_sw->last_seq) has completed.

Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: qp: added persistent_state in sif_qp struct
Wei Lin Guay [Thu, 30 Jun 2016 08:42:26 +0000 (10:42 +0200)]
sif: qp: added persistent_state in sif_qp struct

Orabug: 23491094

The QP state needs to be read in certain contexts which may not sleep.
However, the state is not guaranteed there, as the mutex cannot be used. Thus,
this commit adds a new atomic persistent_state to determine the QP
state in non-sleeping context.
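
The general pattern can be sketched in portable C11 as follows. This is only an analogy using <stdatomic.h>; the type and function names are hypothetical, and the driver itself would use kernel atomics rather than this code.

```c
#include <stdatomic.h>

/* Hypothetical QP states for this sketch. */
enum sketch_qp_state { SKETCH_QPS_RESET, SKETCH_QPS_RTS, SKETCH_QPS_ERR };

struct sketch_qp {
    atomic_int persistent_state;  /* last completed (non-intermediate) state */
};

/* Called by the (sleepable) transition path once a transition has completed. */
void sketch_qp_set_state(struct sketch_qp *qp, enum sketch_qp_state state)
{
    atomic_store(&qp->persistent_state, (int)state);
}

/* Safe to call from contexts that must not sleep: no lock is taken. */
enum sketch_qp_state sketch_qp_get_state(struct sketch_qp *qp)
{
    return (enum sketch_qp_state)atomic_load(&qp->persistent_state);
}
```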

This commit also removes the unused flush_sq_done_wa4074 variable and adds
a mutex for sif_query_qp due to WA #3714 and WA #662. In SIF, there is an
intermediate QP state in RTS->ERR and RTS->RESET. Thus, without the
mutex, sif_query_qp might get the intermediate QP state.

Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: qp: Increase inline data for TSO QPs to accomodate larger L3/L4-headers
Hans Westgaard Ry [Thu, 30 Jun 2016 09:25:34 +0000 (11:25 +0200)]
sif: qp: Increase inline data for TSO QPs to accomodate larger L3/L4-headers

Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
Tested-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: WA#3714: Set flush_retry_qp transport timer to infinite
Triviño [Thu, 30 Jun 2016 08:49:47 +0000 (10:49 +0200)]
sif: WA#3714: Set flush_retry_qp transport timer to infinite

The flush_retry_qp is configured with a minimum timeout of 6 (value
262.144 usec). This, in combination with bug#4146 (duplicate send requests
not Acked if target RQ is empty), seems to be the reason why the
driver is running into some timeouts after applying WA#3714 (waiting
for the completion of the zero post send).

This commit sets the flush_retry_qp transport timer to infinite (0 value).
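
The 262.144 usec figure follows from the usual InfiniBand timeout encoding of 4.096 usec * 2^value, with 0 meaning no timeout. A minimal sketch of that arithmetic, with a hypothetical function name and the encoded value assumed to be in the range 0..31:

```c
#include <stdint.h>

/*
 * Timeout in nanoseconds for an encoded timeout value:
 * 4.096 usec * 2^value. A value of 0 means "infinite" and is
 * reported as 0 by this sketch. value is assumed to be <= 31.
 */
uint64_t sketch_timeout_ns(unsigned int value)
{
    if (value == 0)
        return 0;
    return UINT64_C(4096) << value;
}
```

For the value 6 this gives 4096 ns * 64 = 262144 ns, which is 262.144 usec.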

Signed-off-by: Triviño <francisco.trivino@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: Add a feature mask to allow internal vlink state to follow ext.links
Knut Omang [Tue, 28 Jun 2016 07:24:45 +0000 (09:24 +0200)]
sif: Add a feature mask to allow internal vlink state to follow ext.links

SIF implements an internal IB switch for each port - all HCA vPorts are
connected to this switch, which again has a single external port
associated with the actual state of the physical port.

Some of the current management software assumes that a failed external port
of an HCA can be observed by looking at the local port, which is not
the case with SIF, where the local virtual port will not go down
if the external link goes down.

Firmware implements a mode to logically "wire" the vPort to the
corresponding physical port to mimic the legacy behaviour.

This mode can be enabled by OR'ing in 0x10000 in the module parameter
feature_mask. This is a temporary fix until management software can handle
this topology better.
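
As a small illustration of OR'ing the bit into the mask, consider the sketch below. The 0x10000 value comes from the text above; the macro and function names are hypothetical and not the driver's identifiers.

```c
#include <stdbool.h>

/* Bit value from the commit text; the macro name is hypothetical. */
#define SKETCH_FEATURE_VLINK_FOLLOW 0x10000UL

/* Return the mask with the vlink-follow bit set, other bits untouched. */
unsigned long sketch_enable_vlink_follow(unsigned long feature_mask)
{
    return feature_mask | SKETCH_FEATURE_VLINK_FOLLOW;
}

/* True if the vlink-follow bit is set in the mask. */
bool sketch_vlink_follow_enabled(unsigned long feature_mask)
{
    return (feature_mask & SKETCH_FEATURE_VLINK_FOLLOW) != 0;
}
```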

Orabug: 23509653

Signed-off-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: Redefine IB_QP_CREATE_ flags
Hans Westgaard Ry [Tue, 28 Jun 2016 08:33:00 +0000 (10:33 +0200)]
sif: Redefine IB_QP_CREATE_ flags

Redefine sif-specific IB_QP_CREATE_ flags to avoid conflict with flags defined in ib_verbs.h.
The flags are moved to the range defined by IB_QP_CREATE_RESERVED_START and IB_QP_CREATE_RESERVED_END.

Note that we define more flags than fit the range and that some are defined below _RESERVED_START.

Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
9 years agosif: SQ: Adding synchronization between wa4074 and post_send
Wei Lin Guay [Thu, 23 Jun 2016 19:11:42 +0000 (21:11 +0200)]
sif: SQ: Adding synchronization between wa4074 and post_send

Orabug: 23607042

In pre_process_wa4074, the SQ lock must be held before corrupting
the checksum in the SQ entry. Also, invert the checksum
value rather than setting it to 0.

Another case where the SQ lock was not acquired is before generating
the completion. There, the SQ lock is only held to access sq_sw->last_seq,
to avoid generating a completion before the post send has completed. If this
happens, the completion might be generated using the old
wc_id.

Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: BZ4074: clean up the workaround
Wei Lin Guay [Thu, 23 Jun 2016 18:34:35 +0000 (20:34 +0200)]
sif: BZ4074: clean up the workaround

Orabug: 23607042

This patch cleans up workaround 4074 after reviewing
the code while working on Orabug: 23607042.

1) This patch replaces the epsc_query_qp with reading
   the QPS from memory directly.
   As the QP is already in RESET state, accessing
   the QPS info from EPSC might potentially return
   unexpected data.
2) Remove the unused function sq_flush_wa4074.
3) For readability, use the correct PSIF enum to
   mask PSIF specific WC code.

Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: BZ 4150: Flush retry reset at 1 when QP is modified to ERROR
Wei Lin Guay [Thu, 23 Jun 2016 19:39:43 +0000 (21:39 +0200)]
sif: BZ 4150: Flush retry reset at 1 when QP is modified to ERROR

Orabug: 23607042

The workaround is to disallow the polling of CQ before
the QP is modified to ERROR. By doing so, the CQ will be updated to
the correct sq_seq during post_wa4074.

Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: Automatically generate module version from new define TITAN_RELEASE
Knut Omang [Thu, 16 Jun 2016 07:20:51 +0000 (09:20 +0200)]
sif: Automatically generate module version from new define TITAN_RELEASE

Change-Id: Ie9e262f12f53c0cdcd27ba9f7fa387be0ef4d884
Signed-off-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: Enable debugging via trace_printk again
Knut Omang [Fri, 1 Jul 2016 08:13:59 +0000 (10:13 +0200)]
sif: Enable debugging via trace_printk again

See Orabug: 23510486

Change-Id: I6353820356c9cf9286a1ba72ce883da507736c5f
Signed-off-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: Remove ib_query_mr - it has been removed upstream
Knut Omang [Tue, 14 Jun 2016 11:28:28 +0000 (13:28 +0200)]
sif: Remove ib_query_mr - it has been removed upstream

Signed-off-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: Use kernel function printk_ratelimit() instead of home brew
Knut Omang [Wed, 8 Jun 2016 12:10:27 +0000 (14:10 +0200)]
sif: Use kernel function printk_ratelimit() instead of home brew

Removed sif_log_cq, sif_log_cq and the perf_sampling_threshold
kernel module which was added for debugging purposes.
Also adjust down a few log levels of some messages.

Signed-off-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: sif_qp: implement additional flush_retry_qp for port 2 (WA#3714)
Triviño [Wed, 8 Jun 2016 08:57:57 +0000 (10:57 +0200)]
sif: sif_qp: implement additional flush_retry_qp for port 2 (WA#3714)

Current implementation of WA#3714 uses an RC flush_retry_qp associated to
port 1, and configured with port_lid from port 1. Under the scenario where
port 1 is not used (port 1 is not connected, or in INIT state), but port
2 is up and running, the WA will use an invalid flush_retry_qp (with
port_lid = 0).

This commit improves the WA#3714 implementation by creating an additional
flush_retry_qp that is associated with port 2. The proper flush_retry_qp
is selected depending on the target QP port on which the WA will be applied.

Signed-off-by: Triviño <francisco.trivino@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: Build for kernel v.4.5.6
Knut Omang [Tue, 14 Jun 2016 14:02:26 +0000 (16:02 +0200)]
sif: Build for kernel v.4.5.6

Signed-off-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: rq: Added synchronization during freeing rq
Wei Lin Guay [Tue, 14 Jun 2016 18:40:14 +0000 (20:40 +0200)]
sif: rq: Added synchronization during freeing rq

Added synchronization between free_rq and flush_rq to
ensure rq can only be freed up after flush_rq has
completed.

Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: cq: Do not invalidate the CQ until completion of events
Wei Lin Guay [Tue, 14 Jun 2016 12:33:07 +0000 (14:33 +0200)]
sif: cq: Do not invalidate the CQ until completion of events

A CQ completion event might be processed after the CQ
has been invalidated. Also add a check in
sif_req_notify_cq so that the CQ is not rearmed if it
has been invalidated.

This fixes a scenario reported in Orabug: #23491094.

Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: BZ 4138: Fix a NULL pointer dereference in RDS during tear-down
Wei Lin Guay [Tue, 31 May 2016 05:51:32 +0000 (07:51 +0200)]
sif: BZ 4138: Fix a NULL pointer dereference in RDS during tear-down

The issue is in the rqflush where the hardware view cannot be
trusted and SIF driver needs to rely on the software view. In
this case, software must wait for 1s to ensure that all the
completions are back. If the software counter is different
than the hardware view, the software counter will be used.

This issue is observed in Orabug: 23490618.

Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: LSO, test/adjust attr in create_qp,test stencil-size in send
Hans Westgaard Ry [Mon, 13 Jun 2016 11:17:58 +0000 (13:17 +0200)]
sif: LSO, test/adjust attr in create_qp,test stencil-size in send

We need to test that we can add an sge for the LSO stencil
subtracting 1 from max number of supported sge.  The
attr.max_sge is incremented to get correct allocation of sge
with the extra entry for the LSO stencil.

We assume that the size of LSO headers/stencils is <= 64,
adjusting max_inlinesize accordingly. The actual size is tested when
work-requests are posted.

Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: qp: Remove function name in debug printout to avoid confusion
Hakon Bugge [Mon, 13 Jun 2016 13:56:33 +0000 (15:56 +0200)]
sif: qp: Remove function name in debug printout to avoid confusion

Signed-off-by: Hakon Bugge <Haakon.Bugge@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: sif_qp: remove flush_sq_done_wa4074 condition from WA#3714
Triviño [Fri, 10 Jun 2016 14:47:43 +0000 (16:47 +0200)]
sif: sif_qp: remove flush_sq_done_wa4074 condition from WA#3714

Commit ab5d21b added flush_sq_done_wa4074 condition to the WA#3714 check.
As a result, WA#3714 is never called. This commit removes the condition.

Signed-off-by: Triviño <francisco.trivino@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: Add debugfs for workaround usage statistics
Triviño [Fri, 10 Jun 2016 14:44:20 +0000 (16:44 +0200)]
sif: Add debugfs for workaround usage statistics

This commit also adds:

* WA#3714 usage counters and dump info
* Rename 3713 (bug ticket) to 3714 (WA ticket)

Signed-off-by: Triviño <francisco.trivino@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: ARMv8 (aarch64) portability changes.
Gerd Rausch [Mon, 13 Jun 2016 17:31:07 +0000 (19:31 +0200)]
sif: ARMv8 (aarch64) portability changes.

Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: Fixed typo
Hans Westgaard Ry [Thu, 26 May 2016 12:11:55 +0000 (14:11 +0200)]
sif: Fixed typo

Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: query: Make headroom for TSO stencil used by IPoIB datagram mode
Hans Westgaard Ry [Fri, 10 Jun 2016 09:55:30 +0000 (11:55 +0200)]
sif: query: Make headroom for TSO stencil used by IPoIB datagram mode

When query_device is called from IPoIB in datagram-mode it will
return max_sge = (SIF_HW_MAX_SEND_SGE-1) as opposed to SIF_HW_MAX_SEND_SGE
for other cases.

Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: using FW release version in device attibutes
Andre Wuttke [Tue, 7 Jun 2016 14:24:08 +0000 (16:24 +0200)]
sif: using FW release version in device attibutes

Signed-off-by: Andre Wuttke <andre.wuttke@oracle.com>
Reviewed-by: Åsmund Østvold <asmund.ostvold@oracle.com>
Pre-check: Åsmund Østvold <asmund.ostvold@oracle.com>

9 years agosif: Make driver more silent at startup
Knut Omang [Tue, 31 May 2016 05:59:31 +0000 (07:59 +0200)]
sif: Make driver more silent at startup

- Set debug_mask to 0x1 for upstream
- Move most initialization messages to INIT(0x2) level
- Return number of VFs enabled instead of 0 from sif_vf_enable
  This also eliminates a warning from the kernel framework

Signed-off-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: sif_r3: fix sif_r3_recreate_flush_qp soft lockup.
Triviño [Tue, 7 Jun 2016 14:24:20 +0000 (16:24 +0200)]
sif: sif_r3: fix sif_r3_recreate_flush_qp soft lockup.

This commit fixes Orabug: #23540257. It prevents the situation where
the flush_retry_qp is used before the sdev->flush_lock has been
initialized. This occurs when IB_EVENT_LID_CHANGE event is received
before the flush_retry_qp is created by the driver.

Signed-off-by: Triviño <francisco.trivino@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: ah: Fixed incorrect ipd setting
Hakon Bugge [Sun, 5 Jun 2016 16:02:42 +0000 (18:02 +0200)]
sif: ah: Fixed incorrect ipd setting

Use cached copies of active speed and width in order to fulfill
ib_core locking rules. That is, create_ah() cannot sleep.

Signed-off-by: Hakon Bugge <Haakon.Bugge@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: qp/ah: Added XRC QPs & IPD(AH) to debugfs output
Vinay Shaw [Thu, 2 Jun 2016 22:32:15 +0000 (00:32 +0200)]
sif: qp/ah: Added XRC QPs & IPD(AH) to debugfs output

Signed-off-by: Vinay Shaw <vinay.shaw@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: epsc: Fix keepalive timeouts
Knut Omang [Thu, 2 Jun 2016 08:39:14 +0000 (10:39 +0200)]
sif: epsc: Fix keepalive timeouts

* If a keepalive is posted, do not reset the general timeout interval.
  This effectively caused EPSC requests not ever to time out.
* Remove a superfluous timeout reset in sif_eps_poll_cqe
  The timeout was already set correctly during post.
* Also avoid sending keepalives if the sender has given up.

Signed-off-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: Compile with kernel 4.4.10
Knut Omang [Wed, 1 Jun 2016 06:58:03 +0000 (08:58 +0200)]
sif: Compile with kernel 4.4.10

* Remove obsolete (dummy) unimplemented fast_reg call impl
  This has changed significantly in 4.4.x and was not
  implemented for older kernels anyway.

* Add ifdefs for new wr struct layout -
  no longer uses union, instead we have to upcast
  to the right type to find the qp/request
  type specific fields

* undefine the mtrr code as the mtrr_del function seems not to be
  made available anymore. The functionality is not used anyway atm.

* Some regressions wrt checkpatch

Signed-off-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: DNE QPs were created even with limited mode
Knut Omang [Thu, 2 Jun 2016 08:38:07 +0000 (10:38 +0200)]
sif: DNE QPs were created even with limited mode

This could potentially lead to situations where it is not
possible to upgrade firmware.

Also make all QP creation fail in limited mode, otherwise
someone might create one and try to run traffic on it.
In particular any use of PQPs will lead to kernel null pointer
exceptions as they have not been initialized.

Signed-off-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: eq: Avoid sending COMM_EST event to ULPs (UD, RAW & GSI QPs)
Vinay Shaw [Tue, 31 May 2016 16:35:05 +0000 (18:35 +0200)]
sif: eq: Avoid sending COMM_EST event to ULPs (UD, RAW & GSI QPs)

From IB spec, o11-5.1.1:
For UD and Raw service types, generation of the Communication
Established Affiliated Asynchronous Event is allowed, but is
strongly discouraged.

Signed-off-by: Vinay Shaw <vinay.shaw@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: XRC: XRC support and PSIF 2.1 limitation #3521
Vinay Shaw [Thu, 12 May 2016 05:52:01 +0000 (07:52 +0200)]
sif: XRC: XRC support and PSIF 2.1 limitation #3521

This commit addresses the issue of XRC support (Orabug: 23044600).

Changes include
 XRCTGT QP not to allocate SQ
 Introduced get_sq/rq function to check XRC cases
    (XRC INI/TGT QP has no RQ, XRC TGT QP has no SQ & RQ).
 Overload "ib_qp_attr" attributes for modify XRC QP (RTS state)
    requirement of PSIF
 Rearranged/moved all QP helper functions to be in sif_qp.c/.h files

Note about user space support for XRC:
Since a XRCSRQ can be targeted by multiple XRCTGTQPs with same
XRC domain, simply getting a QP# in completion doesn't help.
MLX-hw overloads the "src_qp" with XRCSRQ# for completions.

For now, we limit the XRC association (not related to kernel context)
    one user-context <--> one XRCTGTQP/XSRQ

Signed-off-by: Vinay Shaw <vinay.shaw@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: cq: tear-down sequence in cleaning up the SendCQ
Wei Lin Guay [Mon, 30 May 2016 07:54:34 +0000 (09:54 +0200)]
sif: cq: tear-down sequence in cleaning up the SendCQ

This commit ensures that the sif_fixup_cqes for a sendCQ
can only be executed after post_process_wa4074. As the CQE
in a sendCQ cannot be trusted,  walk_and_update CQ must
be performed first.

In a scenario where the post_process_wa4074 and sif_fixup_cqes
are performed concurrently, the post_process_wa4074 is given
priority where no polling of the SendCQ is allowed in
sif_fixup_cqes. Then, post_process_wa4074 will generate
the remaining FLUSH-IN ERR for a Send queue.

Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
9 years agosif: Fix regressions in supporting fw from release 0.1.0.4 and earlier
Knut Omang [Mon, 30 May 2016 12:12:39 +0000 (14:12 +0200)]
sif: Fix regressions in supporting fw from release 0.1.0.4 and earlier

Orabug: 23497496

This commit fixes two separate regressions in handling the old fw:

1) Teardown of the dne_qp happens only with older FWs because newer
   firmware implements the dne_qp handling in fw, so the
   driver does not invoke the teardown code. This teardown code
   uses generic calls that have been implicitly amended
   by the WA for Bug #4074, which also assumes that all QPs
   subject to that call have a valid send queue. The DNE QPs don't,
   and this causes a null pointer exception, which is triggered
   both during driver unload and as a side effect of lid changes.
2) EPSC support for SL to TSL mapping was introduced in EPSC API v.0.56
   but was broken. This causes the driver to set wrong values,
   which leads to modify_qp errors. The fix is just to avoid putting
   the map to use unless the epsc version is >= 0.57.
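
The version gate in item 2 amounts to a major/minor comparison. A minimal sketch, with a hypothetical structure and function name rather than the driver's own version macros:

```c
#include <stdbool.h>

/* Hypothetical version pair for this sketch. */
struct sketch_version {
    unsigned int major;
    unsigned int minor;
};

/* True if v is at least major.minor. */
bool sketch_version_at_least(struct sketch_version v,
                             unsigned int major, unsigned int minor)
{
    if (v.major != major)
        return v.major > major;
    return v.minor >= minor;
}
```

Under this comparison, firmware reporting v.0.56 would not use the SL to TSL map, while v.0.57 and anything later would.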

Signed-off-by: Knut Omang <knut.omang@oracle.com>
9 years ago{IBCM/IPoIB/MLX4/RDS}: Temporary backout Exasecure change
Santosh Shilimkar [Sun, 26 Jun 2016 20:46:33 +0000 (13:46 -0700)]
{IBCM/IPoIB/MLX4/RDS}: Temporary backout Exasecure change

ExaSecure changeset seems to impact Exadata data integrity
checker. We back out all the Exasecure changes for now till
the issue gets addressed.

Orabug: 23634771

Revert "IB/mlx4: Generate alias GUID for slaves"
Revert "RDS: Fix the rds_conn_destroy panic due to pending messages"
Revert "RDS: add handshaking for ACL violation detection at passive"
Revert "RDS: IB: enforce IP anti-spoofing for UUID context"
Revert "RDS: IB: invoke connection destruction in worker"
Revert "RDS: message filtering based on UUID"
Revert "RDS: Add UUID socket option"
Revert "RDS: Add reset all conns for a source address to CONN_RESET"
Revert "IB/ipoib: ioctl interface to manage ACL tables"
Revert "IB/ipoib: sysfs interface to manage ACL tables"
Revert "IB/{cm,ipoib}: Filter traffic using ACL"
Revert "IB/{cm,ipoib}: Manage ACL tables"

Tested-by: Rene Kundersma <rene.kundersma@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoRDS/IB: Fix crash in SRQ initialization
Ajaykumar Hotchandani [Thu, 16 Jun 2016 19:15:21 +0000 (12:15 -0700)]
RDS/IB: Fix crash in SRQ initialization

SRQ initialization causes a crash when the IC connection is not available.

Orabug: 23523586

This is a regression fix for commit 0f0f08915.
More work is required to get SRQ working with variable fragment size.
For now, we fix the crash in SRQ initialization.

This also adds a warning when SRQ is enabled.
The SRQ feature is experimental and disabled by default.
When a user enables it, we should give a warning.

Signed-off-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
Tested-by: jenny x.xu <jenny.x.xu@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoRDS: Remove the link-local restriction as a stop gap measure
Santosh Shilimkar [Sat, 18 Jun 2016 17:48:07 +0000 (10:48 -0700)]
RDS: Remove the link-local restriction as a stop gap measure

A fresh CRS install seems to depend on the RDS IB link-local
connection going through. Setting the cluster_interconnect
parameter to a non-link-local address doesn't cover the fresh
install use cases.

So as a stopgap measure, we just warn the user but let the connection
through until we come up with a solution to re-introduce the change.
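
The warn-but-allow behaviour can be sketched as below. This is an illustration under assumptions, not the RDS code: the function names are hypothetical, addresses are taken as IPv4 in host byte order, and link-local means the standard 169.254.0.0/16 range.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* True if addr (IPv4, host byte order) lies in 169.254.0.0/16. */
static bool ipv4_is_link_local(uint32_t addr)
{
	return (addr & 0xffff0000u) == 0xa9fe0000u;
}

/*
 * Hypothetical policy hook: a link-local endpoint only triggers a
 * warning; the connection is accepted in every case.
 */
static bool accept_conn(uint32_t laddr, uint32_t faddr)
{
	if (ipv4_is_link_local(laddr) || ipv4_is_link_local(faddr))
		fprintf(stderr, "warning: connection uses a link-local address\n");
	return true;
}
```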

Orabug: 2360905

Tested-by: Maria Yip <maria.yip@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoRDS: IB: restore the vector spreading for the CQs
Santosh Shilimkar [Fri, 10 Jun 2016 04:05:54 +0000 (21:05 -0700)]
RDS: IB: restore the vector spreading for the CQs

Since the IB_CQ_LEAST_LOADED vector support is not there on newer
kernels (post OFED 1.5), we had #if 0 code for it which got removed
as part of 'commit 3f1db626594e ("RDS: IB: drop discontinued IB
CQ_VECTOR support")'. On UEK2, the drivers had an implementation
of this IB verb. UEK4, which is based on a newer kernel, obviously
doesn't support it.

RDS had an alternate fallback scheme which can be used in the absence
of the dropped verb. On UEK2, we didn't use it, but the UEK4 RDS code was
silently using it until the code got removed. The patch restores
that code with a bit more clarity on what it is actually doing.
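
The commit message does not spell out the fallback scheme. One plausible shape for spreading CQs across completion vectors without driver support is a per-device load table, sketched below. All names are hypothetical and the least-loaded policy is an assumption, not something this commit confirms.

```c
#include <stddef.h>

/* Hypothetical per-device bookkeeping: CQs currently assigned per vector. */
struct vector_table {
	int *load;	/* load[i] = number of CQs bound to vector i */
	size_t nvec;	/* number of completion vectors */
};

/*
 * Pick the least-loaded vector (lowest index wins ties) and charge it.
 * Returns -1 if the device exposes no vectors.
 */
static int vector_get(struct vector_table *t)
{
	size_t i, best = 0;

	if (t->nvec == 0)
		return -1;
	for (i = 1; i < t->nvec; i++)
		if (t->load[i] < t->load[best])
			best = i;
	t->load[best]++;
	return (int)best;
}

/* Release a vector previously returned by vector_get(). */
static void vector_put(struct vector_table *t, int v)
{
	if (v >= 0 && (size_t)v < t->nvec && t->load[v] > 0)
		t->load[v]--;
}
```

A caller would take a vector when creating a CQ and release it when the CQ is destroyed, so the table tracks live CQs only.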

Orabug: 23550561

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years agoIB/mlx4: Generate alias GUID for slaves
Yuval Shaia [Thu, 26 May 2016 16:25:19 +0000 (09:25 -0700)]
IB/mlx4: Generate alias GUID for slaves

Generate the alias GUID by changing the fourth byte to be the GUID index in the
port GUID table.

This is a port of work done in UEK2 for Oracle purposes only.
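
The byte substitution can be sketched as below. This is a minimal illustration, not the mlx4 code: make_alias_guid is a hypothetical name, and it assumes the GUID is held as 8 bytes in wire order with "fourth byte" meaning offset 3.

```c
#include <stdint.h>
#include <string.h>

#define GUID_LEN 8

/*
 * Copy port_guid into alias and overwrite the fourth byte with the
 * slave's index in the port GUID table.
 * Assumption: "fourth byte" is offset 3 of the wire-order GUID.
 */
static void make_alias_guid(const uint8_t port_guid[GUID_LEN],
			    uint8_t guid_index,
			    uint8_t alias[GUID_LEN])
{
	memcpy(alias, port_guid, GUID_LEN);
	alias[3] = guid_index;
}
```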

Orabug: 23292164

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>