www.infradead.org Git - users/jedix/linux-maple.git/log

mlx4_ib: Fix endianness in blueflame post_send.

qp object field doorbell_qpn was initialized using swab()
at qp creation.
swab() unconditionally swaps dword endianness. Thus, on
little-endian platforms the endianness of doorbell_qpn was
big endian; on big-endian platforms, doorbell_qpn is little-endian.

In post send blueflame, doorbell_qpn was taken as is (i.e., the
driver assumed that it was in big-endian format). This was OK
for little-endian hosts, but incorrect for big-endian hosts.

The fix is to use cpu_to_be32 when initializing doorbell_qpn (thus
guaranteeing that doorbell_qpn is in big-endian format on all
host types). This also requires modifying non-bf sends to
use __raw_writel (which does not do any endianness swapping)
instead of writel (which does endianness swapping on big-endian hosts).
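
The difference between an unconditional swap and a host-to-big-endian
conversion can be sketched in userspace C. This is only an illustration:
swab32_sketch(), cpu_to_be32_sketch() and mem_byte() are hypothetical
names, and htonl() stands in for the kernel's cpu_to_be32().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>  /* htonl(): userspace stand-in for cpu_to_be32() */

/* Unconditional byte swap, in the spirit of the kernel's swab32(). */
static uint32_t swab32_sketch(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/* Host-to-big-endian: swaps on little-endian hosts, no-op on big-endian. */
static uint32_t cpu_to_be32_sketch(uint32_t v)
{
    return htonl(v);
}

/* Return byte i of v exactly as it is laid out in memory. */
static uint8_t mem_byte(uint32_t v, int i)
{
    uint8_t b[4];

    assert(i >= 0 && i < 4);
    memcpy(b, &v, sizeof(b));
    return b[i];
}
```

With the conversion, the most significant byte always lands first in
memory, whatever the host type; the unconditional swap only gives that
layout on little-endian hosts.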

The fix was developed by Shamir Rabinovitch of Oracle.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

net/mlx4: Switching between sending commands via polling and events may result in hung tasks

When switching between those methods of sending commands, it's
possible that a task will keep waiting for the polling semaphore,
but may never be able to acquire it.
This is due to mlx4_cmd_use_events, which "down"s the
semaphore back to 0.

Reproducing it involves sending commands while switching
between mlx4_cmd_use_polling and mlx4_cmd_use_events.

Signed-off-by: Matan Barak <matanb@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/mlx4: Put non zero value in max_ah

We put INT_MAX since this is the max value an int can hold.
Though hardware capability is unlimited, this is practically
a large enough number so we can use it.

Signed-off-by: Eli Cohen <eli@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/core: Add debugging prints to ib_uverbs_write

These debug prints should help anyone attempting to understand
why -EINVAL was returned for a command.

Signed-off-by: Haggai Eran <haggaie@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/core: add debugging prints to explain -EINVAL in ib_uverbs_reg_mr

Understanding why -EINVAL is returned from uverbs is difficult
as there are multiple code paths that can cause the value to be
returned. This patch adds some explanations as pr_debug prints.

Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

fix warning about bitwise or between u32 and size_t

(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/mlx4: Don't update QP1 for native functions

For native functions, there's no reason to update
the smac_index, as QP1 is a GSI QP.

Signed-off-by: Matan Barak <matanb@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/ipoib: Check gso size prior to ib_send

We found that in some cases the kernel sends an skb in which the
gso field was damaged and the size in that field was bigger
than the physical mtu. When the HW gets such a size, it flushes
the qp to the error state and all traffic on that interface is
disabled.

In order to avoid such a case, I added a check of that field
prior to the ib_send.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_vnic: fix may be used uninitialized compilation warnings

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_vnic: fix potential data corruption in sprintf

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: Fix resource tracker memory leak after Reset Flow

In case of non-responsive device mlx4_ACCESS_MEM fails and the
driver can't read qp_detach mailbox, which includes all the
rule information.

Since the driver doesn't get the rules' attributes from the
qp_detach mailbox, the master fails to detach its rules from
the resource tracker during the driver unload sequence when the
device is in internal_error state.

Calling rem_slave_qp will remove those rules and the qps they are
attached to unconditionally.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/mlx4: Check port_num before using it in mlx4_ib_port_link_layer

In mlx4_ib_port_link_layer func port_num is used as table
index without checking its validity.
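
A minimal sketch of this kind of bounds check, assuming 1-based port
numbers and a per-port type table. All names here are hypothetical and
are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

enum link_layer_sketch { LL_UNSPECIFIED = 0, LL_INFINIBAND = 1, LL_ETHERNET = 2 };

/* Example table: slot 0 is unused because ports are numbered from 1. */
static const enum link_layer_sketch demo_port_type[] = {
    LL_UNSPECIFIED, LL_INFINIBAND, LL_ETHERNET
};

/* Validate port_num before it is used as a table index. */
static enum link_layer_sketch
port_link_layer_sketch(const enum link_layer_sketch *port_type,
                       int num_ports, int port_num)
{
    assert(num_ports >= 0);
    if (port_type == NULL || port_num < 1 || port_num > num_ports)
        return LL_UNSPECIFIED;  /* out of range: do not touch the table */
    return port_type[port_num];
}
```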

Signed-off-by: Moshe Lazer <moshel@mellanox.com>
(Ported from OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/mlx4: Fix wrong calculation of link layer

ah->port_num was used to find the link layer before a value had been set to it.

Signed-off-by: Moni Shoua <monis@mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/mlx4: Copy SL from correct place in address path

According to PRM, sl in address path is at offset 28 for
InfiniBand and 29 for Ethernet.

Signed-off-by: Shani Michaeli <shanim@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: Check return status of rdma_resolve_ip

Fix usage of rdma_resolve_ip() when return status was
not checked for success.

Signed-off-by: Shani Michaeli <shanim@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4: Clean IRQ affinity hint when freeing it

This is done to avoid the kernel's warning when free_irq is
called while the affinity hint of the IRQ is still set.

WARNING: at kernel/irq/manage.c:1002 __free_irq+0x22d/0x250()
Call Trace:
[<ffffffff81071e27>] ? warn_slowpath_common+0x87/0xc0
[<ffffffff81071e7a>] ? warn_slowpath_null+0x1a/0x20
[<ffffffff810e83fd>] ? __free_irq+0x22d/0x250
[<ffffffff810e848e>] ? free_irq+0x4e/0xb0
[<ffffffffa03d083d>] ? mlx4_release_eq+0x9d/0xc0 [mlx4_core]

Signed-off-by: Ido Shamay <idos@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/core: Fix QP attr mask when resolving smac

When rdma_accept() is called, rdma_cm modifies the QP to RTR.
During this stage the source mac needs to be resolved and put
in the QP attr before modifying the QP. This patch also adds
the flag IB_QP_SMAC, which was missing, to the attr_mask.

Signed-off-by: Moni Shoua <monis@mellanox.co.il>

mlx4_vnic: fix typo in log messages

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_vnic: print vnic keep alive info in mlx4_vnic_info

Print last keep alive timestamp and GW keep alive period.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/mlx4: default gid should respect dev_id

The default gid should match the true ipv6 link
local address which respects the dev_id.

Signed-off-by: Matan Barak <matanb@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: Change the name of the num_mtt in mlx4_profile to be num_mtt_segs.

The old name is misleading. The variable is the number of mtt
segments and not the number of mtts so it was changed to match
the actual meaning.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/mlx4: Print error messages when GID table update failed

When trying to add a GID to a full GID table or when trying
to delete a GID which is not in the GID table an error will
be printed to the kernel log.

Signed-off-by: Moni Shoua <monis@mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/mlx4: Remove unnecessary warning message

Signed-off-by: Moni Shoua <monis@mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib_core: Check that caches exist before accessing them

Check that the gid/pkey cache exists before trying to access
them (ib_find_cached_xxx and ib_cache_update).

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

rdma_cm/cma: Cache broadcast domain record.

Currently, rdma_cm waits for the IPoIB driver to complete
its join to the broadcast domain record; after IPoIB gets its
multicast, rdma_cm tries to obtain its own multicast. After an
IB_CLIENT reregister event, IPoIB may not succeed in its first
effort to reregister its multicast groups. In this case, the
backoff mechanism is applied, and IPoIB retries after a
backoff which starts at 2 seconds and can increase up to
16 seconds.
Since rdma_cm waits for the IPoIB multicast join to succeed,
it too will be delayed at least 2 seconds.

The fix is to detach rdma_cm's multicast operation from IPoIB's
broadcast record re-join. When rdma_cm executes a new join
request, it now tries (via the cma) to take parameters from a
cached broadcast record. If the join fails using the cached
values, the cma deletes the cached record and tries to get a new
one.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ipoib: added an error message when trying to change mtu to 2K-4K

Max mtu defined by IB is 4K, but mcast_mtu is limited to 2K,
so any request to change mtu to a value between 2K-4K didn't
change the mtu, but also didn't show an error message.
An error value (-EINVAL) is now returned and an ipoib_warn
is issued in such cases.
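
The new behaviour can be sketched as a pure check. The function name
and the numeric values used with it are illustrative assumptions, not
the driver's code.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: accept new_mtu only if it does not exceed the
 * multicast MTU; otherwise report -EINVAL instead of silently ignoring
 * the request.
 */
static int check_new_mtu_sketch(int new_mtu, int mcast_mtu)
{
    assert(mcast_mtu > 0);
    if (new_mtu > mcast_mtu)
        return -EINVAL;
    return 0;
}
```

For example, with a multicast MTU of 2044, a request for 4092 would now
be rejected with -EINVAL, while a request for 2044 would succeed.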

Signed-off-by: Noa Osherovich <noaos@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib_core: Do not transition MC groups to error on SM_CHANGE event

Do not transition multicast groups to error on an SM_CHANGE
event. These events are not connected with mcast groups.
(When the SM wishes to have multicast groups reregistered,
it issues the CLIENT_REREG event).

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ipoib: Do not flush mcast groups on SM_CHANGE event

SM_CHANGE events have nothing to do with reregistering
multicast groups. Therefore, do not flush/rereg mcast
groups when receiving an SM_CHANGE event.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

rdma_cm: add debug functions and module parameter

Added a debug function and a debug module parameter.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

rdma_cm: garbage-collection thread for rdma_destroy_id()

Add a garbage-collection thread for rdma_destroy_id,
so as not to paralyze the ib_cm thread with wait_for_completion.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Extracted from following commit in Mellanox OFED-2.4
812c3972cf93e0d04f568ba353e87a1de7c9e006
(rdma_cm: race condition bug fixes)

Signed-off-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_vnic: always remove child macs in vnic_parent_update remove request

Child macs are not removed in host admin vnics once the connection
with the BX is lost. This caused a loss of connectivity for child vnics
when the connection with the BX was restored, since the BX is not
aware of the old child macs.

The solution is to always remove child macs when vnic_parent_update is
called with a remove request.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_vnic: set default moderation values in vnic_alloc_netdev

vnic_set_default_moder was called from _vnic_open, which reset
all current moderation values to the defaults every time
the user opened/closed the vnic interface.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4: Handle memory region deregistration failure

Memory region deregistration can fail when memory windows
are bound to it. We handle such failures by propagating them
to the user, or by printing a serious warning.

Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib_core: More fixes to ib_sa_add_one error flow

commit 0e7377eed fixed a resource leak of mad agents in
the ib_sa_add_one error flow. However, the fix allowed
ib_unregister_mad_agent to be called in a case where the
ib_register_mad_agent request failed (resulting in an
illegal pointer in the agent field). This caused a kernel
Oops in the error flow.

Fix this by calling ib_unregister_mad_agent only for cases where
ib_register_mad_agent succeeded.

In addition, separate the ib_register_event_handler() call error
flow from the loop error flow. If the call to
ib_register_event_handler fails, the client data must be reset
to NULL, (in case at some point ib_register_event_handler() is
modified so that it may return a non-zero (error) value).

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/ipoib: Set mode only when needed.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: Use div_u64 to avoid unresolved symbol on 32-bit OSes

E.g.: On Xenserver6.1/2:
WARNING: "__udivdi3" [<path>/drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko] undefined!

Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib_core: Safely unregister mad agent when necessary.

When the allocation of the receive buffer fails the driver
needs to unregister the mad agent. The function
ib_unregister_mad_agent doesn't check if the pointer of the
mad agent is valid and doesn't contain an error and causes a
Kernel Panic. Therefore, we need to check if the pointer of
the mad agent is valid by calling PTR_ERR and only then
unregister the agent.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_vnic: use netif_set_real_num_tx_queues to dynamically change tx queue size

When network admin vnics are created the network device is
already registered and it is not allowed to change
dev->real_num_tx_queues directly.

This fixes a bug where the unload of mlx4_vnic hangs with
the following error message: waiting for eth442 to become
free. Usage count = 16

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: Extend num_mtt in dev caps to avoid overflow.

Some legitimate combinations of log_num_mtt and log_mtts_per_seg
cause overflow in the calculation of the num_mtt when initializing
the HCA which causes Kernel panic. Changed the variable to be 'u64'
instead of 'int' to avoid the overflow and made the needed changes
to support the new type.
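
A small sketch of the overflow, assuming the count is computed as
1 << (log_num_mtt + log_mtts_per_seg); the exact expression in the
driver is an assumption here, and the parameter values are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Compute the MTT count in 64 bits so that large exponents cannot wrap. */
static uint64_t num_mtt_sketch(unsigned log_num_mtt, unsigned log_mtts_per_seg)
{
    assert(log_num_mtt + log_mtts_per_seg < 64);
    return (uint64_t)1 << (log_num_mtt + log_mtts_per_seg);
}

/* Would the value still fit in a 32-bit signed int? */
static int fits_in_int_sketch(uint64_t v)
{
    return v <= (uint64_t)INT32_MAX;
}
```

With exponents summing to 33, the result is 2^33, which no 32-bit int
can represent; the 64-bit type keeps the value intact.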

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: fix FMR unmapping to allow remapping afterward

The FMR common use flow (as implemented in fmr_pool) is:
- Allocate FMR (ib_alloc_fmr)
- Use the FMR to remap DMA memory until remaps limit
   exceeded (ib_map_phys_fmr)
- Unmap the FMR (ib_unmap_fmr)
- Use the FMR to remap DMA memory until remaps limit
   exceeded (ib_map_phys_fmr)
- ...

The current implementation of mlx4_fmr_unmap does not follow
this use flow since it uses the HW2SW MPT command.
The HW2SW MPT command notifies the FW that the MPT entry is
not used by HW anymore. The FW may act according to this information,
therefore it is not safe for the driver to manipulate the MPT
directly afterwards. The patch fixes this by manipulating the MPT
directly to unmap the memory instead of using the HW2SW MPT command.

Signed-off-by: Moshe Lazer <moshel@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib/ipoib: unlock dev_start_xmit() on ipoib_cm_rep_handler()

Signed-off-by: Tal Alon <talal@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib_core: fixed resource leak in case of error

Fixed an off-by-one bug: we need to decrement the port number only after
we have released the resources of the current port.

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib/ipoib: fix illegal locking on ipoib_cm_rep_handler

Signed-off-by: Tal Alon <talal@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib/ipoib: ipoib_cm_rep_handler lock skb queue while dequeue before xmit

Signed-off-by: Tal Alon <talal@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: resolves kernel panic when connectx_port_config fails to set ports

When changing the ports configuration (e.g. from ib,ib to eth,eth),
the device is disconnected from the interfaces and catas error lists,
then the ports config is changed and the device is reconnected.
If changing the ports config fails, the device is left disconnected.
If we try to configure the ports again, the driver retries
disconnecting the device from its lists and crashes in the list_del
function. To avoid this, list_del is replaced by list_del_init
(to allow re-deleting the device).
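
Why list_del_init tolerates a second deletion can be seen in a
userspace sketch of a circular doubly linked list. The *_sketch names
are hypothetical and only mimic the kernel's list helpers.

```c
#include <assert.h>
#include <stddef.h>

struct list_node { struct list_node *next, *prev; };

/* An initialised node points at itself. */
static void list_init_sketch(struct list_node *n)
{
    assert(n != NULL);
    n->next = n;
    n->prev = n;
}

/* Insert n right after head. */
static void list_add_sketch(struct list_node *n, struct list_node *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink n, then re-point it at itself so deleting it again is a no-op. */
static void list_del_init_sketch(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init_sketch(n);
}

static int list_empty_sketch(const struct list_node *head)
{
    return head->next == head;
}

/* Disconnect the same device twice; returns 1 if the list stays consistent. */
static int double_disconnect_demo(void)
{
    struct list_node head, dev;

    list_init_sketch(&head);
    list_add_sketch(&dev, &head);
    list_del_init_sketch(&dev);  /* first disconnect */
    list_del_init_sketch(&dev);  /* retry after a failed reconfiguration */
    return list_empty_sketch(&head) && dev.next == &dev && dev.prev == &dev;
}
```

A plain delete leaves the removed node's pointers unusable, so a second
delete dereferences them; re-initialising the node avoids that.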

Signed-off-by: Moshe Lazer <moshel@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: Avoid setting ports for auto when only one port type is supported

When only one port type is supported driver should reject requests
to change mode to auto sense.

Signed-off-by: Moshe Lazer <moshel@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: sysfs, fix usage of log_num_mtt module parameter

When log_num_mtt was auto-calculated based on RAM size, it wrongly
also included log_mtts_per_seg.

It's wrong in 2 ways:
First, log_mtts_per_seg should be added by the application itself;
there is no reason to have the total in log_num_mtt itself.
Second, in case an extra NIC exists, it may get an invalid
value, as it depends on a larger value.
Specifically, it may cause an overflow and later lead to a
kernel panic via mlx4_buddy_init.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: fix ib_uverbs_get_context flow

Fix flow to prevent kernel panic in case of a failure in copy_to_user.

INIT_IB_EVENT_HANDLER must be called to initialize the event handler
list before releasing filp as part of fput.
Otherwise a kernel panic occurs in ib_unregister_event_handler
when calling list_del.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: Fix Coverity issues.

Signed-off-by: Itai Garbi <igarbi@mellanox.com>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/mlx4: Fix Coverity issues

Signed-off-by: Itai Garbi <igarbi@mellanox.com>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/core: Fix Coverity issues for rdma_cm

Signed-off-by: Itai Garbi <igarbi@mellanox.com>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Derived from(subset of!) Mellanox OFED-2.4 patch
84c6d50a470b4b61ca6b2ed1c718b121870daa37
(IB/core: Fix Coverity issues)

Signed-off-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

Release Date is updated to __DATE__ instead of a static string

Signed-off-by: Alex Markuze <markuze@mellanox.com>
(Ported from Mellanox OFED 2.4)
Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: use msi_x module param to limit num of MSI-X irqs

The msi_x module param usage is:
0 - don't use MSI-X
1 - use MSI-X (driver decides the number of MSI-X irqs)
>1 - limit number of MSI-X irqs to msi_x
In case of SRIOV, msi_x>1 is treated as msi_x==1
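
The parameter semantics above can be sketched as a small decision
function. The function name is hypothetical, and capping the result at
the driver's own choice is an assumption rather than something the
commit states.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of MSI-X vectors to request.
 *   msi_x          - module parameter value
 *   driver_default - count the driver would pick on its own
 *   sriov          - non-zero when SRIOV is enabled
 */
static int msix_vectors_sketch(int msi_x, int driver_default, int sriov)
{
    assert(msi_x >= 0 && driver_default > 0);
    if (msi_x == 0)
        return 0;               /* do not use MSI-X */
    if (msi_x == 1 || sriov)
        return driver_default;  /* driver decides; SRIOV ignores the limit */
    /* msi_x > 1: limit to msi_x (assumed never to exceed the default) */
    return msi_x < driver_default ? msi_x : driver_default;
}
```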

Signed-off-by: Moshe Lazer <moshel@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

Setting ring size to default when module param set incorrectly

Signed-off-by: Alex Markuze <markuze@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib/core: change error prints in cm module to debug prints.

commit acd10b49 added prints to the cm module.
These, however, should really be debug prints, to be activated
when it is necessary to track down some cm problem.

To activate the debug mechanism, you need to do the following:
1. mount the debug fs (do this once)
   mount -t debugfs none /sys/kernel/debug/

2. activate debug output for ib_cm:
   echo -n "module ib_cm +p" > /sys/kernel/debug/dynamic_debug/control

3. To de-activate debug output when you are done, do the following
   echo -n "module ib_cm -p" > /sys/kernel/debug/dynamic_debug/control

You will see the debug output in dmesg.

This change was suggested by Moni Shoua (monis@mellanox.com)

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: Add more info to mlx4_cmd_post failure error messages

To assist in debugging and support, add additional information
to the output generated when posting a FW command fails. In addition,
add in_param, in_modifier, and op_modifier values to the output
when commands are successfully posted but time out.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: disable mlx4_QP_ATTACH calls from guests if master is doing flow steering.

Old upstream kernel guests do not detect if device-enabled flow
steering is activated by the master. If DMFS is activated,
the master should return error to guests which try to use
the B0-steering flow calls (mlx4_QP_ATTACH).

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: change resource quotas to enable supporting upstream-kernel guests

The resource-quota code passed non-power-of-2 quotas to guests.
In the upstream kernel (bugs), resource quotas for MPTs and QPs
are assumed to be powers-of-2. In MPT case, mlx4_init_mr_table
checks for num_mpts being a power-of-2 before checking if it
is running as a slave.

In the QP case, procedure mlx4_qp_alloc() assumes that
(num_qps - 1) is a power-of-2 when calling radix_tree_insert()
and radix_tree_delete().

In the MPT case, mlx4_init_mr_table() failed on the guest,
causing abort of the guest driver bringup.

In the QP case, although create-qp succeeded on the
hypervisor, the radix_tree_insert() call failed, resulting
in failure to create QPs with certain qp numbers.

The fix, for both cases, is to round-up the quota to the
next power-of-2 for guests for MPTs and QPs. This does no
harm, as these two resources were not really meant to be
limited by an upper quota. The guaranteed resources for QPs
and MPTs per VF/PF are not affected by this change.
The only effect is that no guest will ever be able to
actually reach its max-quota for QPs and MPTs.
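
Rounding a quota up to the next power of 2 can be sketched as follows.
The helper names are hypothetical; the kernel has its own helper for
this, and this is only a portable userspace equivalent.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of 2 (v itself if it already is one). */
static uint32_t roundup_pow2_sketch(uint32_t v)
{
    assert(v != 0 && v <= 0x80000000u);
    v--;
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    return v + 1;
}

/* A power of 2 has exactly one bit set. */
static int is_pow2_sketch(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}
```

A quota of 1000 would be reported to the guest as 1024, which passes a
power-of-2 check on the guest side.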

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: device revision support

The device revision field returned by the NodeInfo MAD
is incorrect on ConnectX3 devices.

This patch is driver side handling to complete a FW fix
added at 2.11.1172. INIT_HCA - bit at offset 0x0C.12 is
set to 1 so that FW will report correct device revision.

Older FW versions won't be affected by turning on that bit, so
no capability bit is needed.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: print more info when command times out

To assist in diagnosing command timeouts, print the
go-bit status and toggle-bit status in the warning
output.

In addition, print an indication of whether the pci_bus is offline.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: move out label to the right place

Add new steering entry for good flow only.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/mlx4: deprecate "failed to alloc bf reg" message from err to debug

This message is not an error, and qp creation continues
normally -- only without use of blueflame.

For VFs attached to VMs, KVM disables write combining, so
performance is better without BF in this case. The host driver
therefore intentionally disables BF use for guests. (see upstream
kernel commit b91cb3ebcd).

The message notifying that BF is not available is therefore
demoted from err to debug.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: Do not allow mlx4_bitmap_init to reserve more slots than available

Caused a kernel crash when log_num_mac was too big

Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib/ipoib: Fix deadlock between rmmod and set_mode

set_mode, called from sysfs, takes the sysfs lock and tries to
take rtnl_lock; rmmod takes rtnl_lock and then tries to take
the sysfs lock, which causes a deadlock
(lock order a->b versus b->a).

The problem starts when ipoib_set_mode releases its rtnl_lock and
tries to take it again after that.

set_mode:

[<ffffffff8104f2bd>] ? check_preempt_curr+0x6d/0x90
[<ffffffff814fee8e>] __mutex_lock_slowpath+0x13e/0x180
[<ffffffff81448655>] ? __rtnl_unlock+0x15/0x20
[<ffffffff814fed2b>] mutex_lock+0x2b/0x50
[<ffffffff81448675>] rtnl_lock+0x15/0x20
[<ffffffffa02ad807>] ipoib_set_mode+0x97/0x160 [ib_ipoib]
[<ffffffffa02b5f5b>] set_mode+0x3b/0x80 [ib_ipoib]
[<ffffffff8134b840>] dev_attr_store+0x20/0x30
[<ffffffff811f0fe5>] sysfs_write_file+0xe5/0x170
[<ffffffff8117b068>] vfs_write+0xb8/0x1a0
[<ffffffff8117ba81>] sys_write+0x51/0x90
[<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
INFO: task rmmod:8057 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

rmmod:
[<ffffffff81279ffc>] ? put_dec+0x10c/0x110
[<ffffffff8127a2ee>] ? number+0x2ee/0x320
[<ffffffff814fe6a5>] schedule_timeout+0x215/0x2e0
[<ffffffff8127cc04>] ? vsnprintf+0x484/0x5f0
[<ffffffff8127b550>] ? string+0x40/0x100
[<ffffffff814fe323>] wait_for_common+0x123/0x180
[<ffffffff81060250>] ? default_wake_function+0x0/0x20
[<ffffffff8119661e>] ? ifind_fast+0x5e/0xb0
[<ffffffff814fe43d>] wait_for_completion+0x1d/0x20
[<ffffffff811f2e68>] sysfs_addrm_finish+0x228/0x270
[<ffffffff811f2fb3>] sysfs_remove_dir+0xa3/0xf0
[<ffffffff81273f66>] kobject_del+0x16/0x40
[<ffffffff8134cd14>] device_del+0x184/0x1e0
[<ffffffff8144e59b>] netdev_unregister_kobject+0xab/0xc0
[<ffffffff8143c05e>] rollback_registered+0xae/0x130
[<ffffffff8143c102>] unregister_netdevice+0x22/0x70
[<ffffffff8143c16e>] unregister_netdev+0x1e/0x30
[<ffffffffa02a91b0>] ipoib_remove_one+0xe0/0x120 [ib_ipoib]
[<ffffffffa01ed95f>] ib_unregister_device+0x4f/0x100 [ib_core]
[<ffffffffa021f5e1>] mlx4_ib_remove+0x41/0x180 [mlx4_ib]
[<ffffffffa01ab771>] mlx4_remove_device+0x71/0x90 [mlx4_core]
[<ffffffffa01ab863>] mlx4_unregister_interface+0x43/0x80 [mlx4_core]
[<ffffffffa02324f3>] __exit_compat+0x15/0x4e [mlx4_ib]
[<ffffffff810addd4>] sys_delete_module+0x194/0x260
[<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0
[<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib/ipoib: get out whenever a port fails to load.

Whenever add_one fails to load one of the ports, call remove_one
after that; otherwise it ends with a panic (many resources are
shared by all the ports, and some actions will be taken over both of
them without considering that one of them failed to load).

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib_ipoib: Fixing issue with delayed work running after child is killed.

This patch addresses a common issue in the ipoib driver.
It has the following form:

        Flow 1:
              1)   if (!test_bit(IPOIB_STOP_NEIGH_GC, &priv->flags))
              2)           queue_delayed_work(ipoib_workqueue, &priv->neigh_reap_task,
                                              arp_tbl.gc_interval);
        Flow 2:
              3)   set_bit(IPOIB_STOP_NEIGH_GC, &priv->flags);
              4)   cancel_delayed_work(&priv->neigh_reap_task);

The Linux kernel (unlike the ESX kernel) is preemptible, which
means that the following sequence of actions is feasible:
        1 flow 1
        3 flow 2
        4 flow 2
        2 flow 1
The effect of this sequence is that line #4 has no de facto effect.
The work will be rescheduled and will run once again, and only then
will it see that the bit value has changed.
In the IPoIB driver this is benign because after each
"set and cancel" sequence, flush is called, which waits until the
second run ends. This is not the case with the neigh_reap_task,
which does cancel without flush.
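
The interleaving can be replayed deterministically in a single-threaded
model. This is only a sketch: the struct and the step functions are
hypothetical stand-ins for the flag test, the queueing, the flag set
and the cancel, not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>

struct reap_state_sketch {
    bool stop_flag;    /* models IPOIB_STOP_NEIGH_GC */
    bool work_queued;  /* models the pending delayed work */
};

static bool step1_check(const struct reap_state_sketch *s) { return !s->stop_flag; }
static void step2_queue(struct reap_state_sketch *s)       { s->work_queued = true; }
static void step3_set_stop(struct reap_state_sketch *s)    { s->stop_flag = true; }
static void step4_cancel(struct reap_state_sketch *s)      { s->work_queued = false; }

/* Order 1, 3, 4, 2: flow 1 is preempted between its check and its queueing. */
static bool racy_interleaving_sketch(void)
{
    struct reap_state_sketch s = { false, false };
    bool may_queue = step1_check(&s);  /* step 1, flow 1 */

    step3_set_stop(&s);                /* step 3, flow 2 */
    step4_cancel(&s);                  /* step 4, flow 2: nothing to cancel yet */
    if (may_queue)
        step2_queue(&s);               /* step 2, flow 1 */
    assert(s.stop_flag);               /* stop was requested ... */
    return s.work_queued;              /* ... yet work may still be pending */
}

/* Order 1, 2, 3, 4: the cancel really removes the queued work. */
static bool serial_order_sketch(void)
{
    struct reap_state_sketch s = { false, false };

    if (step1_check(&s))
        step2_queue(&s);
    step3_set_stop(&s);
    step4_cancel(&s);
    return s.work_queued;
}
```

In the racy order the work is still queued after the cancel, which is
why a cancel that is not followed by a flush is not enough.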

Taken from 1.5.3: b7abd8b0b071f0422b0c1018804033ee1b542eca
by Alex Markuze

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: set device to use extended counters

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-on: http://r-webdev02.lab.mtl.com:8080/479
Tested-by: Redmine Issue
Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
Mini-Regression: Yevgeny Petrilin <yevgenyp@mellanox.com>
Reviewed-by: Vladimir Sokolovsky <vlad@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib/ipoib: debug prints instead of warn in tx_wc function

For the IB_WC_RNR_RETRY_EXC_ERR status, print a debug message;
otherwise print an error message. IB_WC_RNR_RETRY_EXC_ERR is part
of the CM life cycle.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib/ipoib: add detailed error message on dev_queue_xmit

Whenever a packet is requeued via __skb_dequeue, include the return
code of the dev_queue_xmit function in the error message.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib/ipoib: Fix removing call for update_pmtu from spin-lock context.

    The function ipoib_cm_send can call ipoib_cm_skb_too_long
    under spin_lock_irq. ipoib_cm_skb_too_long in turn calls
    update_pmtu, which also tries to take a spin lock, and this
    can cause a deadlock.
    To solve this, update_pmtu is moved to a workqueue, where it
    does not run under the spin lock.

V2: Adjusted for kernel 3.7-rc4

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ipoib: fixed NULL dereferencing in case of error flow

In case of failure, result will be equal to zero, which may
lead to NULL dereferencing and having the following kernel panic:

BUG: unable to handle kernel paging request at 00000000000010e8
IP: [<ffffffff8127b814>] __list_add+0x34/0xa0
PGD 116536067 PUD 11bc42067 PMD 0
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:07.0/infiniband/mlx4_0/node_desc
CPU 1
Modules linked in: ib_ipoib(+)(U) rdma_ucm(U) ib_ucm(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_cm(U) ib_uverbs(U) ib_umad(U) mlx4_ib(U) ib_sa(U) ib_mad(U) ib_core(U) mlx4_en(U) mlx4_core(U) netconsole configfs nfs fscache nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc ipv6 knem(U) microcode virtio_balloon memtrack(U) virtio_net snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ib_ipoib]

Pid: 2387, comm: insmod Not tainted 2.6.32-220.el6.x86_64 #1 Red Hat KVM
RIP: 0010:[<ffffffff8127b814>]  [<ffffffff8127b814>] __list_add+0x34/0xa0
RSP: 0018:ffff88011b409de8  EFLAGS: 00010246
RAX: 0000000000000004 RBX: 00000000000010e8 RCX: ffff88010868e080
RDX: ffff8801192d9e00 RSI: ffff8801192d9e00 RDI: 00000000000010e8
RBP: ffff88011b409e08 R08: ffff8801192d9e00 R09: 0a64656c69616620
R10: 0000000000000002 R11: 0000000000000000 R12: ffff8801192d9e00
R13: ffff8801192d9e00 R14: ffff88010868e6e0 R15: ffff8801192d9e00
FS:  00007f498bf4f700(0000) GS:ffff880028300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000010e8 CR3: 000000011896f000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process insmod (pid: 2387, threadinfo ffff88011b408000, task ffff88010a4ccb40)
Stack:
0000000000000001 ffff880118990000 0000000000000002 0000000000000000
<0> ffff88011b409eb8 ffffffffa0410daa ffffffffa041cf20 00000000000005e4
<0> ffff8801000000d0 ffffffffa04201c0 ffff88011fc00040 ffff880118990008
Call Trace:
[<ffffffffa0410daa>] ipoib_add_one+0x1ea/0x350 [ib_ipoib]
[<ffffffffa03894bd>] ib_register_client+0x7d/0xa0 [ib_core]
[<ffffffffa0425200>] ipoib_init_module+0x200/0x296 [ib_ipoib]
[<ffffffffa0425000>] ? ipoib_init_module+0x0/0x296 [ib_ipoib]
[<ffffffff8100204c>] do_one_initcall+0x3c/0x1d0
[<ffffffff810af641>] sys_init_module+0xe1/0x250
[<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Code: 89 5d e8 4c 89 65 f0 48 89 fb 4c 89 6d f8 4c 8b 42 08 49 89 f5 49 89 d4 49 39 f0 75 27 4d 8b 45 00 4d 39 c4 75 40 49 89 5c 24 08 <4c> 89 23 4c 89 6b 08 4c 8b 65 f0 49 89 5d 00 48 8b 5d e8 4c 8b
RIP  [<ffffffff8127b814>] __list_add+0x34/0xa0
RSP <ffff88011b409de8>
CR2: 00000000000010e8
---[ end trace 2c7c92f924933cec ]---
Kernel panic - not syncing: Fatal exception
Pid: 2387, comm: insmod Tainted: G      D    ----------------   2.6.32-220.el6.x86_64 #1
Call Trace:
[<ffffffff814ec341>] ? panic+0x78/0x143
[<ffffffff814f04d4>] ? oops_end+0xe4/0x100
[<ffffffff8104230b>] ? no_context+0xfb/0x260
[<ffffffff81042595>] ? __bad_area_nosemaphore+0x125/0x1e0
[<ffffffff81272b1c>] ? put_dec+0x10c/0x110
[<ffffffff810426be>] ? bad_area+0x4e/0x60
[<ffffffff81042dc3>] ? __do_page_fault+0x3c3/0x480
[<ffffffffa0044e59>] ? memtrack_free+0x119/0x270 [memtrack]
[<ffffffff81275306>] ? vsnprintf+0x2b6/0x5f0
[<ffffffff8109694f>] ? up+0x2f/0x50
[<ffffffffa0044e59>] ? memtrack_free+0x119/0x270 [memtrack]
[<ffffffff814f248e>] ? do_page_fault+0x3e/0xa0
[<ffffffff814ef845>] ? page_fault+0x25/0x30
[<ffffffff8127b814>] ? __list_add+0x34/0xa0
[<ffffffffa0410daa>] ? ipoib_add_one+0x1ea/0x350 [ib_ipoib]
[<ffffffffa03894bd>] ? ib_register_client+0x7d/0xa0 [ib_core]
[<ffffffffa0425200>] ? ipoib_init_module+0x200/0x296 [ib_ipoib]
[<ffffffffa0425000>] ? ipoib_init_module+0x0/0x296 [ib_ipoib]
[<ffffffff8100204c>] ? do_one_initcall+0x3c/0x1d0
[<ffffffff810af641>] ? sys_init_module+0xe1/0x250
[<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Erez Shitrit <erezsh@mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: Update minimum size for log_num_qp to 18

The minimum number of QPs must exceed the number of
reserved QPs. The reserved QPs can be divided into two categories:
reserved_from_bot and reserved_from_top.

reserved_from_bot is normally less than 2K QPs and can't
exceed 2^16 QPs.

reserved_from_top is the sum of reserved QPs for REGION_ETH_ADDR,
REGION_FC_ADDR and REGION_FC_EXCH.

The number of reserved QPs for REGION_FC_EXCH is 2^16; for
REGION_FC_ADDR and REGION_ETH_ADDR it can't exceed 2^15 each.

Therefore:
reserved_from_top <= 2^16 + 2*(2^15) = 2^17
Reserved QPs = reserved_from_bot + reserved_from_top
<= 2^16 + 2^17 < 2^18

To make it simpler I set the minimum number of QPs to 2^18, even
though 2^17 is acceptable when the log_num_mac module parameter
is set to 6 (or lower).
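
The bound above can be checked with a few lines of C. This is a sketch
only: max_reserved_qps() and min_log_size_above() are hypothetical helper
names, and the constants are the worst-case limits quoted in this message,
not values read from a device.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case number of reserved QPs, using the limits quoted above. */
static uint32_t max_reserved_qps(void)
{
	const uint32_t from_bot = 1u << 16; /* reserved_from_bot limit */
	const uint32_t fc_exch  = 1u << 16; /* REGION_FC_EXCH */
	const uint32_t fc_addr  = 1u << 15; /* REGION_FC_ADDR limit */
	const uint32_t eth_addr = 1u << 15; /* REGION_ETH_ADDR limit */

	return from_bot + fc_exch + fc_addr + eth_addr;
}

/* Smallest log2 size whose power of two strictly exceeds count. */
static unsigned int min_log_size_above(uint32_t count)
{
	unsigned int log = 0;

	assert(count < (1u << 31)); /* keeps the shift below in range */
	while ((1u << log) <= count)
		log++;
	return log;
}
```

With these limits the worst case is 3 * 2^16 = 196608 reserved QPs, and the
smallest power of two above that is 2^18, matching the new minimum for
log_num_qp.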

Signed-off-by: Moshe Lazer <moshel@mellanox.co.il>
Reviewed-by: Eli Cohen <eli@mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core, mlx4_ib: Have enough room in steering range for pkey interfaces

The default QP bitmap size had to be enlarged to have room for such
a large block of QP numbers.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

net/mlx4: return bad error status to caller function in case of error

Not doing so may cause a kernel oops in flows that come later.

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib/core: Remove annoying message.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_ib: fix memory leak if QP creation failed

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib/core: add prints to the cm module.

Add prints for control flows and on errors/unexpected events.

V2: replaced printk(KERN_ERR ...) with pr_err()

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4/IB: add a message print when the logical link goes up/down

This message will help us in debugging issues.

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4/ib: clean memory for EQs in case of error flow

This prevents a memory leak.

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Shlomo Pongratz <shlomop@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

net/mlx4_core: set used number of MTTs when using auto-detection

Issue 31158.

If the driver sets the number of MTTs automatically (default value
or auto-detection based on the memory size of the machine), it must
also set the used value of the MTTs, so that the user can see it.

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

net/mlx4_core: the number of MTTs should consider log_mtts_per_seg

Regardless of how the number of MTTs was determined (calculated
from system memory or taken from the default value), the number of
MTTs per segment must be taken into account.

This will allow the user to specify how many MTTs will be used
(with log_num_mtt) and how many MTTs will be used per segment
(using log_mtts_per_seg).

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

net/mlx4_core: limit to 4TB of memory registration

log_num_mtt is limited to 30, which corresponds to a 4TB mapping.
The limit comes from the bitmap operations, which work with int.
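
The 4TB figure can be reproduced with the sketch below. It assumes that
each MTT entry maps one 4KB page (page_shift of 12); the commit message
does not state the page size, so treat that as an assumption. Both function
names are hypothetical. The second helper only illustrates that 2^30 still
fits in an int while 2^31 does not; it is not the driver's actual check.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Bytes that 2^log_num_mtt MTT entries can map, assuming one entry
 * maps one page of 2^page_shift bytes. */
static uint64_t mtt_mapped_bytes(unsigned int log_num_mtt,
				 unsigned int page_shift)
{
	assert(log_num_mtt + page_shift < 64);
	return (uint64_t)1 << (log_num_mtt + page_shift);
}

/* True if 2^log_num_mtt can be represented in a (signed) int. */
static bool mtt_count_fits_in_int(unsigned int log_num_mtt)
{
	assert(log_num_mtt < 64);
	return ((uint64_t)1 << log_num_mtt) <= (uint64_t)INT_MAX;
}
```

Under the 4KB assumption, mtt_mapped_bytes(30, 12) is 2^42 bytes, that is
4TB, and on platforms with a 32-bit int log_num_mtt = 31 no longer fits.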

V2: removed the change in file icm.c, as the fix was already
upstream.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

net/mlx4_core: num mtt issues

consider num mtts per segment
use vmalloc only when kmalloc fails

V2: adjusted for kernel 3.7-rc4
use vmalloc only when kmalloc fails -- already in kernel:
see commit 89dd86db7 --mlx4_core: Allow large mlx4_buddy bitmaps

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_vnic: Kconfig and Makefile changes

Also fixed a potential uninitialized pointer problem in the vnic
driver when the MLX4_VNIC_DEBUG option is selected.

Signed-off-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
Signed-off-by: Qing Huang <qing.huang@oracle.com>

mlx4_vnic: add mlx4_vnic

Add mlx4_vnic code

Also squash the following porting commits so that the integrated
commit compiles (without squashing they won't compile):

mlx4_vnic: adapt vnic to ofed2 mlx4 implementation
mlx4_vnic: align with OFED2 upstream 3.7 kernel
mlx4_vnic: Fix reference path to hw/mlx4 header files
mlx4_vnic: remove mlx4_vnic_helper module
mlx4_vnic: use ib_modify_cq() in upstream kernel
        We modify code to use ib_modify_cq() in upstream kernel
        (and not use a modified Mellanox version)
mlx4_vnic: removed reference to mlx4_ib_qp->rules_list in vnic_qp.c
        Remove field introduced with Mellanox OFED 2.4 flow
        steering patches which are not in upstream kernel.
mlx4_vnic: used an older version of mlx4_qp_reserve_range()
        Use mlx4_qp_reserve_range() aligned with version
        in Linux 3.18 (We can use the new API when it is
        available upstream)
mlx4_vnic: port to Linux 3.18*
        mlx4_vnic code is based on the original port
        of mlx4_vnic in UEK3. Make changes to compile
        on UEK4 (based on Linux 3.18). Use upstream APIs
        -not Mellanox specific ones - where they are in
        conflict and other changes to make it compile
        on Linux 3.18

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
Signed-off-by: Qing Huang <qing.huang@oracle.com>
(Ported from UEK3 and Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_ib: add blue flame support for kernel consumers

Using blue flame can improve latency by allowing the HW to access
the WQE more efficiently. A consumer who wants to use blue flame
has to create the QP with inline support. When posting a send WR,
the consumer has to set IB_SEND_INLINE in the send flags. This
approach is similar to the one taken in userspace; that is, in order
to use blue flame you must use inline. However, if the send WR is
too large for blue flame, it will only use inline.

A kernel consumer that creates a QP with inline support will be
allocated a UAR and a blue flame register. All QP doorbells are
written to the UAR, and blue flame posts go to the blue flame register.
We make use of all available registers in a blue flame page.
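
The rule described above (blue flame requires inline, and an oversized WR
falls back to inline only) can be sketched as a small decision function.
Everything here is hypothetical and for illustration: enum post_path,
choose_post_path() and the bf_reg_bytes parameter are not taken from the
driver, and the real size limit depends on the blue flame register size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for how a send WR ends up being posted. */
enum post_path {
	POST_REGULAR,     /* no inline: ordinary doorbell */
	POST_INLINE_ONLY, /* inline data, ordinary doorbell */
	POST_BLUE_FLAME   /* inline data written through blue flame */
};

/* qp_inline:    the QP was created with inline support
 * wr_inline:    the send WR has IB_SEND_INLINE set
 * wqe_bytes:    size of the WQE built for this WR
 * bf_reg_bytes: assumed capacity of one blue flame register */
static enum post_path choose_post_path(bool qp_inline, bool wr_inline,
				       size_t wqe_bytes, size_t bf_reg_bytes)
{
	assert(bf_reg_bytes > 0); /* a zero-sized register is a setup bug */

	if (!qp_inline || !wr_inline)
		return POST_REGULAR;
	if (wqe_bytes > bf_reg_bytes)
		return POST_INLINE_ONLY; /* too large for blue flame */
	return POST_BLUE_FLAME;
}
```

For example, with an assumed 256-byte register, a 64-byte inline WQE takes
the blue flame path and a 512-byte one uses inline only.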

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

net/mlx4_core: add sanity check when creating bitmap structure

If a user tries to allocate bitmap structure with invalid values, this may
cause a kernel panic.

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

net/mlx4_core: unmap clear register in case of error flow

Unmap the interrupt clear register in error flows.

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

ib_core: fix NULL pointer dereference

If the initial fill of the Pkey/GID cache failed and some other
ULP/module then tries to query a Pkey/GID, a NULL pointer is
dereferenced, which leads to the following kernel panic:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
IP: [<ffffffff814ef21f>] _spin_lock_irqsave+0x1f/0x40
PGD 37b49067 PUD becb3067 PMD 0
Oops: 0002 [#1] SMP
last sysfs file: /sys/module/mlx4_core/initstate
CPU 1
Modules linked in: mlx4_ib(+)(U) ib_sa(U) ib_mad(U) ib_core(U) mlx4_en(U) mlx4_core(U) memtrack(U) netconsole configfs nfs fscache nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc ipv6 knem(U) microcode virtio_balloon virtio_net snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: memtrack]

Pid: 20715, comm: modprobe Not tainted 2.6.32-220.el6.x86_64 #1 Red Hat KVM
RIP: 0010:[<ffffffff814ef21f>]  [<ffffffff814ef21f>] _spin_lock_irqsave+0x1f/0x40
RSP: 0018:ffff8800bc331d08  EFLAGS: 00010002
RAX: 0000000000010000 RBX: ffff8800bec00088 RCX: 0000000000000000
RDX: 0000000000000202 RSI: 0000000000000046 RDI: 0000000000000058
RBP: ffff8800bc331d08 R08: 0000000000000000 R09: ffff88011bde8050
R10: ffff8800bc3317d8 R11: 0000000000000002 R12: ffff8800b9520000
R13: ffff8800bec00000 R14: 0000000000000000 R15: ffff8800bec02c00
FS:  00007f2f07fcd700(0000) GS:ffff880028300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000058 CR3: 00000000bb708000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process modprobe (pid: 20715, threadinfo ffff8800bc330000, task ffff8801185e0b40)
Stack:
ffff8800bc331d28 ffffffffa0244ee5 ffff8800bec00000 ffff8800b9520000
<0> ffff8800bc331d68 ffffffffa0246f96 0000000000000370 ffffffffa0359560
<0> ffffffffa024fde0 ffff8800b9520000 ffff8800bec00000 ffff8800bcf36080
Call Trace:
[<ffffffffa0244ee5>] ib_unregister_event_handler+0x25/0x50 [ib_core]
[<ffffffffa0246f96>] ib_cache_cleanup_one+0x26/0x1b0 [ib_core]
[<ffffffffa02450ce>] ib_unregister_device+0x4e/0x1a0 [ib_core]
[<ffffffffa035fee6>] ? mlx4_ib_mad_init+0xa6/0x170 [mlx4_ib]
[<ffffffffa0366721>] mlx4_ib_add+0x621/0xb80 [mlx4_ib]
[<ffffffffa0456e73>] mlx4_add_device+0x73/0x1d0 [mlx4_core]
[<ffffffffa04570db>] mlx4_register_interface+0x7b/0x100 [mlx4_core]
[<ffffffffa013112a>] mlx4_ib_init+0x12a/0x188 [mlx4_ib]
[<ffffffffa0131000>] ? mlx4_ib_init+0x0/0x188 [mlx4_ib]
[<ffffffff8100204c>] do_one_initcall+0x3c/0x1d0
[<ffffffff810af641>] sys_init_module+0xe1/0x250
[<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Code: c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 9c 58 0f 1f 44 00 00 48 89 c2 fa 66 0f 1f 44 00 00 b8 00 00 01 00 <f0> 0f c1 07 0f b7 c8 c1 e8 10 39 c1 74 0e f3 90 0f 1f 44 00 00
RIP  [<ffffffff814ef21f>] _spin_lock_irqsave+0x1f/0x40
RSP <ffff8800bc331d08>
CR2: 0000000000000058
---[ end trace 4cc9b3e738027b7c ]---
Kernel panic - not syncing: Fatal exception
Pid: 20715, comm: modprobe Tainted: G      D    ----------------   2.6.32-220.el6.x86_64 #1
Call Trace:
[<ffffffff814ec341>] ? panic+0x78/0x143
[<ffffffff814f04d4>] ? oops_end+0xe4/0x100
[<ffffffff8104230b>] ? no_context+0xfb/0x260
[<ffffffffa001a62d>] ? start_xmit+0x5d/0x1d0 [virtio_net]
[<ffffffff81042595>] ? __bad_area_nosemaphore+0x125/0x1e0
[<ffffffff810426be>] ? bad_area+0x4e/0x60
[<ffffffff81042dc3>] ? __do_page_fault+0x3c3/0x480
[<ffffffffa03d737d>] ? write_msg+0xfd/0x110 [netconsole]
[<ffffffff81069d15>] ? __call_console_drivers+0x75/0x90
[<ffffffff8109694f>] ? up+0x2f/0x50
[<ffffffff81069d7a>] ? _call_console_drivers+0x4a/0x80
[<ffffffff814f248e>] ? do_page_fault+0x3e/0xa0
[<ffffffff814ef845>] ? page_fault+0x25/0x30
[<ffffffff814ef21f>] ? _spin_lock_irqsave+0x1f/0x40
[<ffffffffa0244ee5>] ? ib_unregister_event_handler+0x25/0x50 [ib_core]
[<ffffffffa0246f96>] ? ib_cache_cleanup_one+0x26/0x1b0 [ib_core]
[<ffffffffa02450ce>] ? ib_unregister_device+0x4e/0x1a0 [ib_core]
[<ffffffffa035fee6>] ? mlx4_ib_mad_init+0xa6/0x170 [mlx4_ib]
[<ffffffffa0366721>] ? mlx4_ib_add+0x621/0xb80 [mlx4_ib]
[<ffffffffa0456e73>] ? mlx4_add_device+0x73/0x1d0 [mlx4_core]
[<ffffffffa04570db>] ? mlx4_register_interface+0x7b/0x100 [mlx4_core]
[<ffffffffa013112a>] ? mlx4_ib_init+0x12a/0x188 [mlx4_ib]
[<ffffffffa0131000>] ? mlx4_ib_init+0x0/0x188 [mlx4_ib]
[<ffffffff8100204c>] ? do_one_initcall+0x3c/0x1d0
[<ffffffff810af641>] ? sys_init_module+0xe1/0x250
[<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Eli Cohen <eli@mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_ib: contig support for control objects

Reviewer: Shachar Raindel <raindel@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: fix wrong comment about the reason of subtract one from the max_cqes

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/core - Don't modify outgoing DR SMP if first part is LID routed

The code in handle_outgoing_dr_smp() checks to see if the directed
route SMP has an initial LID routed part and correctly does not
modify the hop pointer but it then proceeds to process the packet
as if there was no initial LID routed part. Instead, if there
is an initial LID routed part, the packet should just be sent on
to the destination and not processed further since it can't be
destined for the local SM/SMA.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

net/mlx4: adjust initial value of vl_cap in mlx4_SET_PORT

Adjust the initial value of vl_cap in mlx4_SET_PORT so that only
CX3 devices start with 8 VLs. This is an attempt to keep errors
such as the following out of the system log:

mlx4_core 0000:06:00.0: command 0xc failed: fw status = 0x40
mlx4_core 0000:06:00.0: vhcr command:0xc slave:0 failed with error:0, status -12

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: Error message on mtt allocation failure

Add error message if mlx4_mtt_init fails to allocate mtts.

Signed-off-by: Marcel Apfelbaum <marcela@dev.mellanox.co.il>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/core: Control number of retries for SA to leave an MCG

Add a multicast leave maximum retry setting in:
/sys/module/ib_sa/parameters/mcast_leave_retries.
Add a debug print when the maximum retry count is reached.

Signed-off-by: Nir Muchtar <nirm@voltaire.com>
Reviewed-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.com>
Fixup by adding modifications to changed file from following
commit from Mellanox OFED-2.4 to account for kernel API change
so there are no compiler warnings:
19b33d488938e1667bcc16562a2405d632428db0
(core, srp, mlx4_en: Adjustments to patch set which rebased to kernel 3.7)

Signed-off-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4: reducing wait during SW reset for 500 msecs

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_ib: Do not enable blueflame sends if write combining is not available

V2: adjusted for Or Gerlitz' 64-byte CQE patch

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

IB/core: Fix create_qp issue relates to qp group type

qpg_type field of ib_qp_init_attr wasn't initialized properly.
Now it's set to IB_QPG_NONE (=0) as part of attr initialization.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_core: log_num_mtt handling

Fixed a bug when calculating based on system RAM size
limit of 25 was fixed - was set to 28

V2: adjusted for rebase to kernel 3.7-rc4.
In this kernel, part of the patch appears as commit 89dd86d

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>

mlx4_ib: Fix the SQ size of an RC QP to support masked atomic operation

When calculating the required size of an RC QP send queue,
leave enough space for a masked atomic operation (which requires
more space than a "regular" atomic operation).

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.com>
(Ported from Mellanox OFED 2.4)

Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>