www.infradead.org Git - users/jedix/linux-maple.git/log

SPARC64: vcc: delay device removal until close()

    If a vcc device file is open while it's removed (due to a
    domain being unbound), delay the removal of the associated vcc
    device structure until the final close() call is made on
    the device. This preventsthe  device file cdev minor number from
    being reused which can result in ugly filesystem warnings to
    the console.

Orabug: 24594547

Signed-off-by: Aaron Young <aaron.young@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-By: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>

sparc64: fix vio handshake issue

When rebooting multiple LDoms together with bind/unbind a separate
LDom, kernel panic with vio handshake error. The panic is caused by
vio trying to allocate a buffer which is not freed properly. The
ldc_unbind should unconfigure and stop the ldc queue before freeing
the irq. If the irq is freed before stopping the queue, interrupts
can continue to happen after the irq is freed which may cause
issue.

Orabug: 26259622

Signed-off-by: Thomas Tai <thomas.tai@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>

sparc64: Use cpu_poke to resume idle cpu

Use cpu_poke hypervisor call to resume idle cpu if supported.

Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Orabug: 25575672
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>

sparc64: Add a new hypercall CPU_POKE

This adds a new hypercall CPU_POKE for quickly waking up an idle CPU.
CPU POKE should only be sent to valid non-local CPUs.

Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Orabug: 25575672
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>

sparc64: fix out of order spin_lock_irqsave and spin_unlock_restore

After enabling spinlocks debug option, kernel prints out call trace
suggesting that the function ldom_req_sp_token executes a might_sleep
function while IRQs is disabled. IRQs is disabled because the
spin_lock_irqsave and spin_unlock_irqrestore are out of order.

Normal correct sequence is:

spin_lock_irqsave(ds_dev_list_lock, data_flags)

        LOCK_DS_DEV()/* spin_lock_irqsave(ds_lock, ds_flags) */
        .
        .
        UNLOCK_DS_DEV()/* spin_unlock_irqrestore(ds_lock, ds_flags) */

spin_unlock_irqrestore(ds_dev_list_lock, data_flags)

Out or order sequence:

spin_lock_irqsave(ds_dev_list_lock, data_flags)

        LOCK_DS_DEV()/* spin_lock_irqsave(ds_lock, ds_flags) */

spin_unlock_irqrestore(ds_dev_list_lock, data_flags)

        UNLOCK_DS_DEV()/* spin_unlock_irqrestore(ds_lock, ds_flags) */

The last UNLOCK_DS_DEV() ends up restoring IRQs to disabled state,
because the previous LOCK_DS_DEV() is in irqs_disabled().
To fix the issue, follows the order of irqsave()/irqrestore().

Orabug: 26265190
Orabug: 25421812

Signed-off-by: Thomas Tai <thomas.tai@oracle.com>
Reviewed-by: Aaron Young <aaron.young@oracle.com>

ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT

snd_timer_user_tselect() reallocates the queue buffer dynamically, but
it forgot to reset its indices.  Since the read may happen
concurrently with ioctl and snd_timer_user_tselect() allocates the
buffer via kmalloc(), this may lead to the leak of uninitialized
kernel-space data, as spotted via KMSAN:

  BUG: KMSAN: use of unitialized memory in snd_timer_user_read+0x6c4/0xa10
  CPU: 0 PID: 1037 Comm: probe Not tainted 4.11.0-rc5+ #2739
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Call Trace:
   __dump_stack lib/dump_stack.c:16
   dump_stack+0x143/0x1b0 lib/dump_stack.c:52
   kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:1007
   kmsan_check_memory+0xc2/0x140 mm/kmsan/kmsan.c:1086
   copy_to_user ./arch/x86/include/asm/uaccess.h:725
   snd_timer_user_read+0x6c4/0xa10 sound/core/timer.c:2004
   do_loop_readv_writev fs/read_write.c:716
   __do_readv_writev+0x94c/0x1380 fs/read_write.c:864
   do_readv_writev fs/read_write.c:894
   vfs_readv fs/read_write.c:908
   do_readv+0x52a/0x5d0 fs/read_write.c:934
   SYSC_readv+0xb6/0xd0 fs/read_write.c:1021
   SyS_readv+0x87/0xb0 fs/read_write.c:1018

This patch adds the missing reset of queue indices.  Together with the
previous fix for the ioctl/read race, we cover the whole problem.

Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit ba3021b2c79b2fa9114f92790a99deb27a65b728)

Orabug: 26267070
CVE-2017-1000380

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

ALSA: timer: Fix race between read and ioctl

The read from ALSA timer device, the function snd_timer_user_tread(),
may access to an uninitialized struct snd_timer_user fields when the
read is concurrently performed while the ioctl like
snd_timer_user_tselect() is invoked. We have already fixed the races
among ioctls via a mutex, but we seem to have forgotten the race
between read vs ioctl.

This patch simply applies (more exactly extends the already applied
range of) tu->ioctl_lock in snd_timer_user_tread() for closing the
race window.

Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit d11662f4f798b50d8c8743f433842c3e40fe3378)

Orabug: 26267070
CVE-2017-1000380

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

net/rds: Replace printk in TX path with stat variable

Printing in any form is not recommended in data path.
Fix it by replacing printk with a new statistics counter.

Orabug: 26402653

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Reviewed-by: Avinash Repaka <avinash.repaka@oracle.com>

Revert "mm: meminit: only set page reserved in the memblock region"

This reverts commit 35d7de6449327c64ed82c3e2b8c071a7664e0b19.

Orabug: 26446232
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

Revert "mm: meminit: move page initialization into a separate function"

This reverts commit aeb76e45d8809efc0c273a5aa6edf481256893c6.

Orabug: 26446232
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

NVMe: Retain QUEUE_FLAG_SG_GAPS flag for bio vector alignment.

The nvme queue flag QUEUE_FLAG_SG_GAPS checks for the bio vector
alignment against the page size. In upstream, the QUEUE_FLAG_SG_GAPS
flag is replaced by blk_queue_virt_boundary() and pulling in the
respective patches caused instability in the driver and hence
QUEUE_FLAG_SG_GAPS flag is retained for vector alignment.

Orabug: 26402433

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: Don't post statistics to malicious VFs

Once firmware indicates that a given VF is malicious and until
that VF passes an FLR all bets are off - PF can't know anything
is happening to the VF [since VF can't communicate anything to its PF].
But PF is currently still periodically asking device to collect
statistics for the VF which might in turn fill logs by IOMMU blocking
memory access done by the VF's PCI function [in the case VF has unmapped
its buffers].

Orabug: 26440216

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3523882229b903e967de05665b871dab87c5df0f)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: Allow vfs to disable txvlan offload

VF clients are configured as enforced, meaning firmware is validating
the correctness of their ethertype/vid during transmission.
Once txvlan is disabled, VF would start getting SKBs for transmission
here vlan is on the payload - but it'll pass the packet's ethertype
instead of the vid, leading to firmware declaring it as malicious.

Orabug: 26440216

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 92f85f05caa51d844af6ea14ffbc7a786446a644)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: fix pf2vf bulletin DMA mapping leak

When freeing VF's DMA mappings, an already NULLed pointer was checked
again due to an apparent copy&paste error. Consequently, the pf2vf
bulletin DMA mapping was not freed.

Orabug: 26440216

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 996652c7050c70008e4434af108be6f15f20fbd0)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: Fix Multi-Cos

Apparently multi-cos isn't working for bnx2x quite some time -
driver implements ndo_select_queue() to allow queue-selection
for FCoE, but the regular L2 flow would cause it to modulo the
fallback's result by the number of queues.
The fallback would return a queue matching the needed tc
[via __skb_tx_hash()], but since the modulo is by the number of TSS
queues where number of TCs is not accounted, transmission would always
be done by a queue configured into using TC0.

Orabug: 26440216

Fixes: ada7c19e6d27 ("bnx2x: use XPS if possible for bnx2x_select_queue instead of pure hash")
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3968d38917eb9bd0cd391265f6c9c538d9b33ffa)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: add missing configuration of VF VLAN filters

Configuring VLANs from the VF side had no effect, because the PF ignored
filters of type VFPF_VLAN_FILTER in the VF-PF message.

Add the missing filter type to configure.

Orabug: 26440216

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e39513259450a8312cb98a9a3b16bb924310dbcc)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: fix incorrect filter count in an error message

filters->count is the number of filters we were supposed to configure.
There is no reason to increase it by +1 when printing the count in an error
message.

Orabug: 26440216

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 74bcbeb7d77ec92e4262fc340cb436ef7d98ba01)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: do not rollback VF MAC/VLAN filters we did not configure

On failure to configure a VF MAC/VLAN filter we should not attempt to
rollback filters that we failed to configure with -EEXIST.

Orabug: 26440216

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 78d5505432436516456c12abbe705ec8dee7ee2b)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: fix detection of VLAN filtering feature for VF

VFs are currently missing the VLAN filtering feature, because we were
checking the PF's acquire response before actually performing the acquire.

Fix it by setting the feature flag later when we have the PF response.

Orabug: 26440216

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 83bd9eb8fc69cdd5135ed6e1f066adc8841800fd)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: fix possible overrun of VFPF multicast addresses array

It is too late to check for the limit of the number of VF multicast
addresses after they have already been copied to the req->multicast[]
array, possibly overflowing it.

Do the check before copying.

Also fix the error path to not skip unlocking vf2pf_mutex.

Orabug: 26440216

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 22118d861cec5da6ed525aaf12a3de9bfeffc58f)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: lower verbosity of VF stats debug messages

When BNX2X_MSG_IOV is enabled, the driver produces too many VF statistics
messages. Lower the verbosity of the VF stats messages similarly as in
commit 76ca70fabbdaa3 ("bnx2x: [Debug] change verbosity of some prints").

Orabug: 26440216

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 850268d320f0c7c5eb7ad0a62ef21859fa331ded)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: prevent crash when accessing PTP with interface down

It is possible to crash the kernel by accessing a PTP device while its
associated bnx2x interface is down. Before the interface is brought up,
the timecounter is not initialized, so accessing it results in NULL
dereference.

Fix it by checking if the interface is up.

Use -ENETDOWN as the error code when the interface is down.
-EFAULT in bnx2x_ptp_adjfreq() did not seem right.

Tested using phc_ctl get/set/adj/freq commands.

Orabug: 26440216

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 466e8bf10ac104d96e1ea813e8126e11cb72ea20)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

Merge remote-tracking branch 'linux-uek/topic/uek-4.1/dtrace' into uek-4.1-next

Conflicts:
arch/sparc/kernel/ttable_64.S

dtrace: FBT module support and SPARCs return probes

This fix adds two features to FBT provider:
1) support for modules
2) support for return probes on SPARC

The module support of x86 was almost ready as it does not rely on trampolines and
uses hashtables of tracepoints. This works well if we know amount of probes in
advance so can reserve correct amount of memory during module load time. Unfortunately
that is not possible on SPARC and we need to allocate a trampoline dynamically.

Major part of this code is about removing all static assumptions about FBT from kernel
code and moving the responsibility to dtrace modules. Trampolines for SPARC are now
allocated dynamically (including kernel's pseudo module). This applies to SDT trampolines
too.

Second change adds scan for return probes on SPARC with small heuristics to quickly
skip over cases that are not interesting for DTrace. At the same time this patch
allocates new SPARC Trap for FBT.

Support for .init section is not available on any platform. The .init section is freed
after a module is fully loaded and it is not possible to remove its probes without
further chagnes in DTrace framework (modules). This is deffered for later work.

Orabug: 25960276
Orabug: 26384199

Signed-off-by: Tomas Jedlicka <tomas.jedlicka@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
Reviewed-by: Rob Gardner <rob.gardner@oracle.com>

mm: fix use-after-free if memory allocation failed in vma_adjust()

There's one case when vma_adjust() expands the vma, overlapping with
*two* next vma.  See case 6 of mprotect, described in the comment to
vma_merge().

To handle this (and only this) situation we iterate twice over main part
of the function.  See "goto again".

Vegard reported[1] that he sees out-of-bounds access complain from
KASAN, if anon_vma_clone() on the *second* iteration fails.

This happens because we free 'next' vma by the end of first iteration
and don't have a way to undo this if anon_vma_clone() fails on the
second iteration.

The solution is to do all required allocations upfront, before we touch
vmas.

The allocation on the second iteration is only required if first two
vmas don't have anon_vma, but third does.  So we need, in total, one
anon_vma_clone() call.

It's easy to adjust 'exporter' to the third vma for such case.

[1] http://lkml.kernel.org/r/1469514843-23778-1-git-send-email-vegard.nossum@oracle.com

Link: http://lkml.kernel.org/r/1469625255-126641-1-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Orabug: 26441514

(cherry picked from commit 734537c9cb725fc8005ee7a25c48f1ad10fce5df)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

uek-rpm nano: Signature verification support in kexec_file_load

The following configuration options to support
signature verification in the kexec_file_load
syscall are enabled:
CONFIG_KEXEC_VERIFY_SIG=y
CONFIG_KEXEC_BZIMAGE_VERIFY_SIG=y
CONFIG_PKCS7_MESSAGE_PARSER=y
CONFIG_SIGNED_PE_FILE_VERIFICATION=y

Orabug: 26386345
Signed-off-by: alexey.petrenko@oracle.com

lpfc update for uek4 11.4.0.2

Orabug: 26439257

Signed-off-by: rkennedy <dick.kennedy@avagotech.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Driver responds LS_RJT to Beacon Off

[backport of 6c29ac6512240659455456b36bfe7d6f3b839312]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

Beacon OFF from switch is rejected by driver.

Driver fails Beacon OFF if frequency is set to 0. As per fc-ls spec,
status, capability, frequency and duration fields are only applicable
for Beacon ON.

Remove frequency and type checks. Reject Beacon ON if duration is non
zero.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Fix crash after firmware flash when

[backport of e9e6003bdcfec67c719c8e335fa5bff65c707297]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

OS crashes after the completion of firmware download.

Failure in posting SCSI SGL buffers because number of SGL buffers is
less than total count. Some of the pending IOs are not completed by
driver. SGL buffers for these IOs are not added back to the list.
Pending IOs are not completed because lpfc_wq_list list is initialized
before completion of pending IOs.

Postpone lpfc_wq_list reinitialization by moving
lpfc_sli4_queue_destroy() after lpfc_hba_down_post().

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Vport creation is failing with "Link

[backport of f0b2223a666e93c84fd42a0c5dad8a17c08aa869]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

Vport creation fails for SLI-3 adapters.

Mailbox submission fails because mailbox interrupt is disabled. Mailbox
interrupt is disabled during port reset.

Do reset only for physical port.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Null pointer dereference when

[backport of 8ca7edf129441181280bc5c932e70d338d30f47f]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

Kernel panic when log_verbose is set to 0xffffffff

phba->pport is dereferenced before it is initialized

Fix: Do not dereference phba->pport if it is NULL

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Fix return value of board_mode store

[backport of b18f2a8765fc072701452f710ac164d243ee4579]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

On hbacmd reset failure, observing wrong string "nline" in kernel log.

On failure, non negative value (1) is returned from sysfs store
routine. It is interpreted as count by kernel and store routine is
called again with the remaining characters as input.

Fix: Return negative error code (-EIO) in case of failure.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: Fix Port going offline after

[backport of f4af07d4df21c5d02f66a0174b9669187a40bf1e]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

Observing lpfc port down after issuing hbacmd reset command

Failure in posting SGL buffers. If there is only one SGL buffer and rrq
is valid for its XRI, we are rightly returning NULL but not adding the
buffer back to the SGL list. So, number of buffers become less than
total count and repost fails during reset.

Add SGL buffer back to list before returning NULL.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: fix spelling mistake "entrys"

[backport of 627efc5892d73cdfc0c9841a50714677855775a9]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

Trivial fix to spelling mistake in debugfs message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: Add MDS Diagnostic support.

[backport of e01b7cdb676e06310c4dba367ba153f33effc464]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

Added code to support Cisco MDS loopback diagnostic. The diagnostics run
various loopbacks including one which loops-back frame through the
driver.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: Fix used-RPI accounting problem.

[backport of a1d3e3605028f2b4b0ff811b91df73454d9b8792]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

With 255 vports created a link trasition can casue a crash.

When going through discovery after a link bounce the driver is using
rpis before the cmd FCOE_POST_HDR_TEMPLATES completes. By doing that the
next rpi bumps the rpi range out of the boundary.

The fix it to increment the next_rpi only when the
FCOE_POST_HDR_TEMPLATE succeeds.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: Fix panic on BFS configuration

[backport of 596b05edb75f251f64373255f2a338ea7a9b5197]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

To select the appropriate shost template, the driver is issuing a
mailbox command to retrieve the wwn. Turns out the sending of the
command precedes the reset of the function. On SLI-4 adapters, this is
inconsequential as the mailbox command location is specified by dma via
the BMBX register. However, on SLI-3 adapters, the location of the
mailbox command submission area changes. When the function is first
powered on or reset, the cmd is submitted via PCI bar memory. Later the
driver changes the function config to use host memory and DMA. The
request to start a mailbox command is the same, a simple doorbell write,
regardless of submission area. So.. if there has not been a boot driver
run against the adapter, the mailbox command works as defaults are
ok. But, if the boot driver has configured the card and, and if no
platform pci function/slot reset occurs as the os starts, the mailbox
command will fail. The SLI-3 device will use the stale boot driver dma
location. This can cause PCI eeh errors.

Fix is to reset the sli-3 function before sending the mailbox command,
thus synchronizing the function/driver on mailbox location.

Note: The fix uses routines that are typically invoked later in the call
flow to reset the sli-3 device. The issue in using those routines is
that the normal (non-fix) flow does additional initialization, namely
the allocation of the pport structure. So, rather than significantly
reworking the initialization flow so that the pport is alloc'd first,
pointer checks are added to work around it. Checks are limited to the
routines invoked by a sli-3 adapter (s3 routines) as this fix/early call
is only invoked on a sli3 adapter. Nothing changes post the
fix. Subsequent initialization, and another adapter reset, still occur -
both on sli-3 and sli-4 adapters.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Fixes: 96418b5e2c88 ("scsi: lpfc: Fix eh_deadline setting for sli3 adapters.")
Cc: stable@vger.kernel.org # v4.11+
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Fix Express lane queue creation.

[backport of 24754c5e3651ef7521cd2f8d2f7ac2c38b7bb072]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

The older sli4 adapters only supported the 64 byte WQE entry size.
The new adapter (fw) support both 64 and 128 byte WQE entry sizies.
The Express lane WQ was not being created with the 128 byte WQE sizes
when it was supported.

Not having the right WQE size created for the express lane work queue
caused the the firmware to overwrite the lun indentifier in the FCP header.

This patch correctly creates the express lane work queue with the
supported size.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Fix driver usage of 128B WQEs when WQ_CREATE is

[backport of 8c434f8b07d99c4f0728ed144108839a367c72c0]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

There are two versions of a structure for queue creation and setup that the
driver shares with FW. The driver was only treating as version 0.

Verify WQ_CREATE with 128B WQEs in V0 and V1.

Code review of another bug showed the driver passing
128B WQEs and 8 pages in WQ CREATE and V0.
Code inspection/instrumentation showed that the driver
uses V0 in WQ_CREATE and if the caller passes queue->entry_size
128B, the driver sets the hdr_version to V1 so all is good.
When I tested the V1 WQ_CREATE, the mailbox failed causing
the driver to unload.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Add Fabric assigned WWN support.

[backport of a2ed2d89b451f7ca26b049ea1664f4397c5df87b]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

Adding support for Fabric assigned WWPN and WWNN.

Firmware sends first FLOGI to fabric with vendor version changes.
On link up driver gets updated service parameter with FAWWN assigned port
name. Driver sends 2nd FLOGI with updated fawwpn and modifies the
vport->fc_portname in driver.

Note:
Soft wwpn will not be allowed when fawwpn is enabled.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Fix crash after issuing lip reset

[backport of 43f40add69672723389a32adab1d24715aaedff0]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

When RPI is not available, driver sends WQE with invalid RPI value and
rejected by HBA.
lpfc 0000:82:00.3: 1:3154 BLS ABORT RSP failed, data: x3/xa0320008
and
lpfc :2753 PLOGI failure DID:FFFFFA Status:x3/xa0240008

In this case, driver accesses rpi_ids array out of bounds.

Fix:
Check return value of lpfc_sli4_alloc_rpi(). Do not allocate
lpfc_nodelist entry if RPI is not available.

When RPI is not available, we will get discovery timeouts and
command drops for some of the vports as seen below.

lpfc :0273 Unexpected discovery timeout, vport State x0
lpfc :0230 Unexpected timeout, hba link state x5
lpfc :0111 Dropping received ELS cmd Data: x0 xc90c55 x0

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Remove NULL ptr check before kfree.

[backport of 600b57091effbdb680c067332c92de2a21c5d79d]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

The check for NULL ptr is not necessary, kfree will check it.

Removing NULL ptr check.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
This patch was applied manually.

Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Fix spelling in comments.

[backport of 92c9d8da38785584acae3660112acadc3baf3051]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

Comment should have said Repost.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: Fix PT2PT PRLI reject

[backport of 114e80db15039e248eb4e458559cef57737930a8]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

lpfc cannot establish connection with targets that send PRLI in P2P
configurations.

If lpfc rejects a PRLI that is sent from a target the target will not
resend and will reject the PRLI send from the initiator.

[mkp: applied by hand]

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: correct rdp diag portnames

[backport of 91b7cd038317fa47e32f42096c1ff28de4337af0]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

NVME merge reverted diag port names to the physical port.
They should be the vport.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: Fix eh_deadline setting for sli3 adapters.

[backport of 96418b5e2c8867da3279d877f5d1ffabfe460c3d]
From: James Smart <jsmart2021@gmail.com>

Orabug: 26439257

A previous change unilaterally removed the hba reset entry point
from the sli3 host template. This was done to allow tape devices
being used for back up from being removed. Why was this done ?
When there was non-responding device on the fabric, the error
escalation policy would escalate to the reset handler. When the
reset handler was called, it would reset the adapter, dropping
link, thus logging out and terminating all i/o's - on any target.
If there was a tape device on the same adapter that wasn't in
error, it would kill the tape i/o's, effectively killing the
tape device state. With the reset point removed, the adapter
reset avoided the fabric logout, allowing the other devices to
continue to operate unaffected. A hack - yes. Hint: we really
need a transport I_T nexus reset callback added to the eh process
(in between the SCSI target reset and hba reset points), so a
fc logout could occur to the one bad target only and stop the error
escalation process.

This patch commonizes the approach so it can be used for sli3 and sli4
adapters, but mandates the admin, via module parameter, specifically
identify which adapters the resets are to be removed for. Additionally,
bus_reset, which sends Target Reset TMFs to all targets, is also removed
from the template as it too has the same effect as the adapter reset.

This patch had to be modified from the original because of the NVME changes that are in the upstream driver.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: Fix crash during Hardware error recovery on SLI3 adapters

Orabug: 26439257

if REG_VPI fails, the driver was incorrectly issuing INIT_VFI
(a SLI4 command) on a SLI3 adapter.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 5d181531bc6169e19a02a27d202cf0e982db9d0e)
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: fix missing spin_unlock on sql_list_lock

Orabug: 26439257

In the case where sglq is null, the current code just returns without
unlocking the spinlock sql_list_lock. Fix this by breaking out of the
while loop and the exit path will then unlock and return NULL as was
the original intention.

Detected by CoverityScan, CID#1411635 ("Missing unlock")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

drm/mgag200: Fix to always set HiPri for G200e4 V2

- Changed the HiPri value for G200e4 to always be 0.
- Added Bandwith limitation to block resolution above 1920x1200x60Hz

Signed-off-by: Mathieu Larouche <mathieu.larouche@matrox.com>
Acked-by: Dave Airlie <airlied@redhat.com>
[seanpaul removed some trailing whitespace from the patch]
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/ec0f8568d7ec41904dfe593c5deccf3f062d7bd8.1497450944.git.mathieu.larouche@matrox.com
(cherry picked from commit 0cbb738108927916a659b5b0b96e386fcd7cc6e1)

Orabug: 26427049

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

char: lp: fix possible integer overflow in lp_setup()

The lp_setup() code doesn't apply any bounds checking when passing
"lp=none", and only in this case, resulting in an overflow of the
parport_nr[] array. All versions in Git history are affected.

Reported-By: Roee Hay <roee.hay@hcl.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3e21f4af170bebf47c187c1ff8bf155583c9f3b1)

Orabug: 26174869
CVE-2017-1000363

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

NFSv4: Fix callback server shutdown

We want to use kthread_stop() in order to ensure the threads are
shut down before we tear down the nfs_callback_info in nfs_callback_down.

Tested-and-reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
Reported-by: Kinglong Mee <kinglongmee@gmail.com>
Fixes: bb6aeba736ba9 ("NFSv4.x: Switch to using svc_set_num_threads()...")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
(cherry picked from commit ed6473ddc704a2005b9900ca08e236ebb2d8540a)

Orabug: 26143102
CVE-2017-9059

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
fs/nfs/callback.c

SUNRPC: Refactor svc_set_num_threads()

Refactor to separate out the functions of starting and stopping threads
so that they can be used in other helpers.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-and-reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
(cherry picked from commit 9e0d87680d689f1758185851c3da6eafb16e71e1)

Orabug: 26143102
CVE-2017-9059

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
net/sunrpc/svc.c

ipv6/dccp: do not inherit ipv6_mc_list from parent

Like commit 657831ffc38e ("dccp/tcp: do not inherit mc_list from parent")
we should clear ipv6_mc_list etc. for IPv6 sockets too.

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 83eaddab4378db256d00d295bda6ca997cd13a52)

Orabug: 26142901
CVE-2017-9077

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
net/dccp/ipv6.c

Merge remote-tracking branch 'linux-uek/topic/uek-4.1/dtrace' into qu6-push

btrfs: introduce device delete by devid

This introduces new ioctl BTRFS_IOC_RM_DEV_V2, which uses enhanced struct
btrfs_ioctl_vol_args_v2 to carry devid as an user argument.

The patch won't delete the old ioctl interface and so kernel remains
backward compatible with user land progs.

Test case/script:
echo "0 $(blockdev --getsz /dev/sdf) linear /dev/sdf 0" | dmsetup create bad_disk
mkfs.btrfs -f -d raid1 -m raid1 /dev/sdd /dev/sde /dev/mapper/bad_disk
mount /dev/sdd /btrfs
dmsetup suspend bad_disk
echo "0 $(blockdev --getsz /dev/sdf) error /dev/sdf 0" | dmsetup load bad_disk
dmsetup resume bad_disk
echo "bad disk failed. now deleting/replacing"
btrfs dev del 3 /btrfs
echo $?
btrfs fi show /btrfs
umount /btrfs
btrfs-show-super /dev/sdd | egrep num_device
dmsetup remove bad_disk
wipefs -a /dev/sdf

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reported-by: Martin <m_btrfs@ml1.co.uk>
[ adjust messages, s/disk/device/ ]
Signed-off-by: David Sterba <dsterba@suse.com>
Orabug: 26287586

(Cherry picked from commit 6b526ed70cf189660d009ea6f17af77a9cca0f38)

Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

btrfs: enhance btrfs_find_device_by_user_input() to check device path

The operation of device replace and device delete follows same steps upto
some depth with in btrfs kernel, however they don't share codes. This
enhancement will help replace and delete to share codes.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Orabug: 26287586

(Cherry picked from commit b3d1b1532ff9620ff5dba891a96f3e912005eb10)

Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

btrfs: make use of btrfs_find_device_by_user_input()

btrfs_rm_device() has a section of the code which can be replaced
btrfs_find_device_by_user_input()

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Orabug: 26287586

(Cherry picked from commit 24fc572fe456c02ff4136c07861a3edd4b8de683)

Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

btrfs: create helper btrfs_find_device_by_user_input()

The patch renames btrfs_dev_replace_find_srcdev() to
btrfs_find_device_by_user_input() and moves it to volumes.c, so that
delete device can use it.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Orabug: 26287586

(Cherry picked from commit 24e0474b59538cdb9d2b7318ec7c7ae9f6faf85d)

Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

btrfs: clean up and optimize __check_raid_min_device()

__check_raid_min_device() which was pealed from btrfs_rm_device()
maintianed its original code to show the block move. This patch cleans up
__check_raid_min_device().

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Orabug: 26287586

(Cherry picked from commit bd45ffbcb1f082e3c5a0bd56b1d7310d8b707ffb,
drop the second argument of the btrfs_dev_replace_lock/unlock to
match with the current kernel api)

Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

btrfs: create helper function __check_raid_min_devices()

move a section of btrfs_rm_device() code to check for min number of the
devices into the function __check_raid_min_devices()

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Orabug: 26287586

(Cherry picked from commit f1fa7f264250f2cb60119aca3fd114c3f47070c2,
drop the second argument of the btrfs_dev_replace_lock/unlock to
match with the current kernel api)

Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

dtrace: Add support for manual triggered cyclics

In some scenarios it is better if a client of cyclic susbstem can reprogram cyclic on
his own. This is not possible with current implementation.

This chage adds cyclic_reprogram() that can be used to schedule cyclic from inside and
outside of its handler. A manually triggered cyclic is distinguished from other types
by having its interval set to -1.

Orabug: 26384803

Signed-off-by: Tomas Jedlicka <tomas.jedlicka@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>

dtrace: LOW level cyclics should use workqueues

The HIGH level cyclics are meant to be run from interrupt handler. This works on
Linux because the hrtimer is scheduled as tasklet. The LOW level cyclics must be
interruptible and should not be scheduled as tasklests.

DTrace is currently relying on being able to call dtrace_sync() from within a cyclic
handler. On Linux it is not safe to try send IPIs from within interrupt/bottom half
handlers.

This fix changes LOW level cyclics to use workqueues. At the moment we are using
shared system workqueue but it may be required to allocate our owns if this causes
big latency in our timer routines.

Orabug: 26384779

Signed-off-by: Tomas Jedlicka <tomas.jedlicka@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
Acked-by: Chuck Anderson <chuck.anderson@oracle.com>

SPARC64: Correct ATU IOTSB binding flow

Any PCIe device attempting to use ATU for 64-bit DMA must be successfully
bound to IOTSB otherwise DMA access from device will be rejected by
PCIe root complex.

In the current code, ATU initialization and PCIe device binding to IOTSB
is done during system boot when PCI Bus Module (PBM) is initialized.
(PBM driver traverse through the PCI tree and binds each PCI device to
IOTSB).

In case if user/system add SRIOV VFs later on (either after system is up
or when PF driver is loaded), the new VF devices never get bounded to
IOTSB. So when attempt is made from VF devices to access DMA region, ATU
HW will flag it as invalid access; generates CTE_Invalid errors and VF
driver/device fails to perform DMA.

This patch fixes the aforementioned issue by delaying PCIe device
binding to IOTSB only when driver requests DMA setting from kernel.
Doing it this way makes sure every PCIe device requesting 64bit DMA mask
(and therefore attempting to use ATU) will get bind to IOTSB and can do
successful DMA operations.

Orabug: 25177689
Orabug: 25450353

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: Govinda Tatti <Govinda.Tatti@Oracle.COM>
Reviewed-by: HÃ¥kon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

SPARC64: Introduce IOMMU BYPASS method

This change adds IOMMU BYPASS HV API that will allow PCIe devices to
bypass IOMMU during DMA map/unmap.

Orabug: 25573557

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

i40e: Revert i40e temporary workaround

ATU and Hypervisor IOMMU v2 APIs enable 64bit DMA addressing so
i40e temporary workaround no longer needed.

This patch revert uek4 commit f59e8082fb57c0009
("i40e: Temporary workaround for DMA map issue").

Orabug: 23239179

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Enable 64-bit DMA

ATU 64bit addressing allows PCIe devices with 64bit DMA capabilities
to use ATU for 64bit DMA.

Orabug: 23239179

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Enable sun4v dma ops to use IOMMU v2 APIs

Add Hypervisor IOMMU v2 APIs pci_iotsb_map(), pci_iotsb_demap() and
enable sun4v dma ops to use IOMMU v2 API for all PCIe devices with
64bit DMA mask.

Orabug: 23239179

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Bind PCIe devices to use IOMMU v2 service

In order to use Hypervisor (HV) IOMMU v2 API for map/demap, each PCIe
device has to be bound to IOTSB using HV API pci_iotsb_bind().

Orabug: 23239179

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Initialize iommu_map_table and iommu_pool

Like legacy IOMMU, use common iommu_map_table and iommu_pool for ATU.
This change initializes iommu_map_table and iommu_pool for ATU.

Orabug: 23239179

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Add ATU (new IOMMU) support

ATU (Address Translation Unit) is a new IOMMU in SPARC supported with
Hypervisor IOMMU v2 APIs.

Current SPARC IOMMU supports only 32bit address ranges and one TSB
per PCIe root complex that has a 2GB per root complex DVMA space
limit. The limit has become a scalability bottleneck nowadays that
a typical 10G/40G NIC can consume 300MB-500MB DVMA space per
instance. When DVMA resource is exhausted, devices will not be usable
since the driver can't allocate DVMA.

ATU removes bottleneck by allowing guest os to create IOTSB of size
32G (or more) with 64bit address ranges available in ATU HW. 32G is
more than enough DVMA space to be shared by all PCIe devices under
root complex contrast to 2G space provided by legacy IOMMU.

ATU allows PCIe devices to use 64bit DMA addressing. Devices
which choose to use 32bit DMA mask will continue to work with the
existing legacy IOMMU.

Orabug: 23239179

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Make FORCE_MAX_ZONEORDER to 13 for ATU

This change allows ATU (new IOMMU) in SPARC systems to request
large (32M) contiguous memory during boot for creating IOTSB backing
store.

Orabug: 23239179

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

Revert "sparc64: bypass iommu to use 64bit address space"

This reverts commit 34edb40e5e13cba76ad13646d3a1823d877851c6.

Conflicts:
arch/sparc/kernel/pci_sun4v.c

Orabug: 21149316

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

i40e: fix annoying message

The driver was printing a message about not being able
to assign VMDq because of a lack of MSI-X vectors.

This was because a line was missing that initialized a variable,
simply a merge error.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 26409501
(cherry picked from commit e9e53662d8130dd950885e37dc1d97008e1283f9)
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

watchdog: Move hardlockup detector to separate file

Separate hardlockup code from watchdog.c and move it to watchdog_hld.c.
It is mostly straight forward. Remove everything inside
CONFIG_HARDLOCKUP_DETECTORS. This code will go to file watchdog_hld.c.
Also update the makefile accordigly.

Orabug: 24796651

Reviewed-by: Karl Volz <karl.volz@oracle.com>
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

watchdog: Move shared definitions to nmi.h

Move shared macros and definitions to nmi.h so that watchdog.c,
new file watchdog_hld.c or any other architecture specific handler
can use those definitions.

Orabug: 24796651

Reviewed-by: Karl Volz <karl.volz@oracle.com>
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Suppress kmalloc (DAX driver) warning due to allocation failure

dax_alloc is used by libdax to allocate 4MB chunks of physically
contiguous memory using kmalloc. It is normal for dax_alloc to fail
when kmalloc runs out of memory and this should not be treated as a warning.
This failure should be caught in libdax and attempted to allocate memory
from another source (Eg: mmap hugepage).

Orabug: 26224254

Signed-off-by: Sanath Kumar <sanath.s.kumar@oracle.com>
Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

i40evf: Use le32_to_cpu before evaluating HW desc fields.

i40e hardware descriptor fields are in little-endian format. Driver
must use le32_to_cpu while evaluating these fields otherwise on
big-endian arch we end up evaluating incorrect values, cause errors
like:

i40evf 0001:04:02.0: Expected response 0 from PF, received 285212672
i40evf 0001:04:02.1: Expected response 0 from PF, received 285212672

Orabug: 25577233

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: revert pause instruction patch for atomic backoff and cpu_relax()

This patch reverts the commit e9b9eb59ffcdee09ec96b040f85c919618f4043e
("sparc64: Use pause instruction when available").

This all started with our TPCC results on UEK4. During T7 testing, the TPCC results
were much lower compared to UEK2. The atomic calls like atomic_add and atomic_sub
were showing top on perf results. Karl found out that this was caused by
the upstream commit e9b9eb59ffcdee09ec96b040f85c919618f4043e
(sparc64: Use pause instruction when available). After reverting this commit on UEK4,
the TPCC numbers were back to UEK2 level. However, things changed after Atish's
scheduler fixes on UEK4. The TPCC numbers improved and the upstream commit
(sparc64: Use pause instruction when available) did not seem make any difference.
So, Karl's "revert pause instruction" patch was removed from UEK4.

Now again with T8 testing, we are seeing the same old behaviour. The atomic calls
like atomic_add and atomic_sub are showing top on perf results. After trying with
Karl's patch(revert pause instruction patch for atomic backoff) the TPCC numbers
improved(about %25 better than T7) and atomic calls are not showing on top in perf.

So, we are adding this patch back again. This is a temporary fix. Long term solution
is still in the discussion. The original patch is from Karl.
http://ca-git.us.oracle.com/?p=linux-uek-sparc.git;a=commit;h=f214eebf2223d23a2b1499be5b54719bdd7651e3

All the credit should go to Karl. Rebased it on latest sparc tree.

Orabug: 26306832

Signed-off-by: Karl Volz <karl.volz@Oracle.com>
Reviewed-by: Atish Patra <atish.patra@oracle.com>
Signed-off-by: Henry Willard <henry.willard@oracle.com>
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Reviewed-by: Karl Volz <karl.volz@Oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

be2net: Update the driver version to 11.4.0.0

Orabug: 26403655

Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

be2net: Fix UE detection logic for BE3

On certain platforms BE3 chips may indicate spurious UEs (unrecoverable
error). Because of the UE detection logic was disabled in the driver
for BE3 chips. Because of this, even in cases of a real UE,
a failover will not occur. This patch re-enables UE detection on BE3
and if a UE is detected, reads the POST register. If the POST register,
reports either a FAT_LOG_STATE or a ARMFW_UE, then it means that a valid
UE occurred in the chip.

Orabug: 26403655

Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

be2net: Fix offload features for Q-in-Q packets

At least some of the be2net cards do not seem to be capabled
of performing checksum offload computions on Q-in-Q packets.
In these case, the recevied checksum on the remote is invalid
and TCP syn packets are dropped.

This patch adds a call to check disbled acceleration features
on Q-in-Q tagged traffic.

Orabug: 26403655

CC: Sathya Perla <sathya.perla@broadcom.com>
CC: Ajit Khaparde <ajit.khaparde@broadcom.com>
CC: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
CC: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

benet: Use time_before_eq for time comparison

Use time_before_eq for time comparison more safe and dealing
with timer wrapping to be future-proof.

Orabug: 26403655

Signed-off-by: Karim Eshapa <karim.eshapa@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

be2net: Fix endian issue in logical link config command

Use cpu_to_le32() for link_config variable in set_logical_link_config
command as this variable is of type u32.

Orabug: 26403655

Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

be2net: fix initial MAC setting

Recent commit 34393529163a ("be2net: fix MAC addr setting on privileged
BE3 VFs") allows privileged BE3 VFs to set its MAC address during
initialization. Although the initial MAC for such VFs is already
programmed by parent PF the subsequent setting performed by VF is OK,
but in certain cases (after fresh boot) this command in VF can fail.

The MAC should be initialized only when:
1) no MAC is programmed (always except BE3 VFs during first init)
2) programmed MAC is different from requested (e.g. MAC is set when
   interface is down). In this case the initial MAC programmed by PF
   needs to be deleted.

The adapter->dev_mac contains MAC address currently programmed in HW so
it should be zeroed when the MAC is deleted from HW and should not be
filled when MAC is set when interface is down in be_mac_addr_set() as
no programming is performed in this case.

Example of failure without the fix (immediately after fresh boot):

be2net 0000:01:00.0 eth0: Link is Up

...
be2net 0000:01:04.0: Emulex OneConnect(be3): VF  port 0

be2net 0000:01:04.0: opcode 59-1 failed:status 1-76
RTNETLINK answers: Input/output error

iommu: Removing device 0000:01:04.0 from group 33
...

iommu: Removing device 0000:01:04.0 from group 33
...

be2net 0000:01:04.0 eth8: Link is Up

Initialization is now OK.

v2 - Corrected the comment and condition check suggested by Suresh & Harsha

Orabug: 26403655

Fixes: 34393529163a ("be2net: fix MAC addr setting on privileged BE3 VFs")
Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ivan Vecera <cera@cera.cz>
Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

drivers: net: generalize napi_complete_done()

napi_complete_done() allows to opt-in for gro_flush_timeout,
added back in linux-3.19, commit 3b47d30396ba
("net: gro: add a per device gro flush timer")

This allows for more efficient GRO aggregation without
sacrifying latencies.

Orabug: 26403655

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

be2net: fix MAC addr setting on privileged BE3 VFs

During interface opening MAC address stored in netdev->dev_addr is
programmed in the HW with exception of BE3 VFs where the initial
MAC is programmed by parent PF. This is OK when MAC address is not
changed when an interfaces is down. In this case the requested MAC is
stored to netdev->dev_addr and later is stored into HW during opening.
But this is not done for all BE3 VFs so the NIC HW does not know
anything about this change and all traffic is filtered.

This is the case of bonding if fail_over_mac == 0 where the MACs of
the slaves are changed while they are down.

The be2net behavior is too restrictive because if a BE3 VF has
the FILTMGMT privilege then it is able to modify its MAC without
any restriction.

To solve the described problem the driver should take care about these
privileged BE3 VFs so the MAC is programmed during opening. And by
contrast unpriviled BE3 VFs should not be allowed to change its MAC
in any case.

Orabug: 26403655

Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

be2net: fix unicast list filling

The adapter->pmac_id[0] item is used for primary MAC address but
this is not true for adapter->uc_list[0] as is assumed in
be_set_uc_list(). There are N UC addresses copied first from net_device
to adapter->uc_list[1..N] and then N UC addresses from
adapter->uc_list[0..N-1] are sent to HW. So the last UC address is never
stored into HW and address 00:00:00:00;00:00 (from uc_list[0]) is used
instead.

Orabug: 26403655

Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Fixes: b717241 be2net: replace polling with sleeping in the FW completion path
Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

be2net: fix accesses to unicast list

Commit 988d44b "be2net: Avoid redundant addition of mac address in HW"
introduced be_dev_mac_add & be_uc_mac_add helpers that incorrectly
access adapter->uc_list as an array of bytes instead of an array of
be_eth_addr. Consequently NIC is not filled with valid data so unicast
filtering is broken.

Orabug: 26403655

Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Fixes: 988d44b be2net: Avoid redundant addition of mac address in HW
Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

be2net: fix non static symbol warnings

Fixes the following sparse warnings:

drivers/net/ethernet/emulex/benet/be_main.c:47:25: warning:
symbol 'be_err_recovery_workq' was not declared. Should it be static?
drivers/net/ethernet/emulex/benet/be_main.c:63:25: warning:
symbol 'be_wq' was not declared. Should it be static?

Orabug: 26403655

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

be2net: Avoid redundant addition of mac address in HW

If a mac address is added to the uc_list and later the same mac address
is added via ndo_set_mac_address() or vice versa, the driver does not
detect this condition and tries to add it again. This results in a mac
address collision error when the FW rejects it.

Fix this by checking if the given mac address is present in uc_list while
setting the device mac address and vice versa. Similarly skip deletion if
the address is still in use in the other form.

Orabug: 26403655

Signed-off-by: Suresh Reddy <Suresh.Reddy@broadcom.com>
Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

be2net: Support UE recovery in BEx/Skyhawk adapters

This patch supports recovery from UEs caused due to Transient Parity
Errors (TPE), in BE2, BE3 and Skyhawk adapters. This change avoids
system reboot when such errors occur. The driver recovers from these
errors such that the adapter resumes full operational status as prior
to the UE.

Following is the list of changes in the driver to support this:

o The driver registers its UE recoverable capability with ARM FW at init
time. This also allows the driver to know if the feature is supported in
the FW.

o As the UE recovery requires precise time bound processing, the driver
creates its own error recovery work queue with a single worker thread (per
module, shared across functions).

o Each function runs an error detection task at an interval of 1 second as
required by the FW. The error detection logic already exists for BEx/SH,
but it now runs in the context of a separate worker thread.

o When an error is detected by the task, if it is recoverable, the PF0
driver instance initiates a soft reset, while other PF driver instances
wait for the reset to complete and the chip to become ready. Once
the chip is ready, all driver instances including PF0, resume to
reinitialize the respective functions.

o The PF0 driver checks for some recovery criteria, to determine if the
recovery can be initiated. If the criteria is not met, the PF0 driver does
not initiate a soft reset, it retains the existing behavior to stop
further processing and requires a reboot to get the chip to operational
state again.

o To allow each function to share the workq, while also making progress in
its recovery process, a per-function recovery state machine is used.
The per-function tasks avoid blocking operations like msleep() while in
this state machine (until reinit state) and instead reschedule for the
required delay.

o With these changes, the existing error recovery code for Lancer also
runs in the context of the new worker thread.

Orabug: 26403655

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

be2net: replace polling with sleeping in the FW completion path

The ndo_set_rx_mode() and ndo_add/del_vxlan_port() calls may be called with
BHs disabled. The driver currently issues the required cmds to the FW in
these contexts and polls on completions from the FW, while BHs remain
disabled. This can cause either packet loss or packet reception to be
delayed on that CPU.

This patch defers processing of the above cmds to a separate workqueue.
With this change, FW cmds are now issued only in process context.
Now that the FW cmds are issued only in process context, they can sleep
waiting for a completion instead of polling. All the spin_lock_bh(mcc_lock)
calls are now replaced with mutex calls.

Also a new rx_filter_lock is now needed to protect the RX filtering fields
like vids[] between be_vlan_add/rem_vid() and __be_set_rx_mode() contexts.

Orabug: 26403655

Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

be2net: support asymmetric rx/tx queue counts

be2net so far supported creation of RX/TX queues only in pairs.
On configs where rx and tx queue counts are different, creation of only
the lesser number of queues has been supported.

This patch now allows a combination of RX/TX-only channels along with
combined channels. N TX-queues and M RX-queues can be created with the
following cmds:
ethtool -L ethX combined N rx M-N (when N < M)
ethtool -L ethX combined M tx N-M (when M < N)

Setting both RX-only and TX-only channels is still not supported.
It is mandatory to create atleast one combined channel.

Orabug: 26403655

Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

net/rds: Add mutex exclusion for vector_load

Several TOS connections may invoke ibdev_get_unused_vector()
concurrently. Hence, mutual exclusion is required to protect
rds_ibdev->vector_load.

Changes from v1:
- Reworded commit message and commentary based on Avinash's feedback
- Added r-b from Wei Lin and Avinash

Orabug: 26406492

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Avinash Repaka <avinash.repaka@oracle.com>

net: properly release sk_frag.page

I mistakenly added the code to release sk->sk_frag in
sk_common_release() instead of sk_destruct()

TCP sockets using sk->sk_allocation == GFP_ATOMIC do no call
sk_common_release() at close time, thus leaking one (order-3) page.

iSCSI is using such sockets.

Fixes: 5640f7685831 ("net: use a per task frag allocator")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Orabug: 26409538

(cherry picked from commit 22a0e18eac7a9e986fec76c60fa4a2926d1291e2)

Signed-off-by: Ghazale Hosseinabadi <ghazale.hosseinabadi@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

i40e: Correct the macros for setting the DMA attributes

The I40E_DMA_ATTRS() macro was not setting the WEAK_ORDERING as
intended, this change corrects the issue.

Orabug: 26379170
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>

bnxt_en: Fix netpoll handling.

Orabug: 26402533, 26325599, 26366387

To handle netpoll properly, the driver must only handle TX packets
during NAPI. Handling RX events cause warnings and errors in
netpoll mode. The ndo_poll_controller() method should call
napi_schedule() directly so that a NAPI weight of zero will be used
during netpoll mode.

The bnxt_en driver supports 2 ring modes: combined, and separate rx/tx.
In separate rx/tx mode, the ndo_poll_controller() method will only
process the tx rings. In combined mode, the rx and tx completion
entries are mixed in the completion ring and we need to drop the rx
entries and recycle the rx buffers.

Add a function bnxt_force_rx_discard() to handle this in netpoll mode
when we see rx entries in combined ring mode.

Reported-by: Calvin Owens <calvinowens@fb.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 2270bc5da34979454e6f2eb133d800b635156174)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

bnxt_en: Add missing logic to handle TPA end error conditions.

Orabug: 26402533, 26325599, 26366387

When we get a TPA_END completion to handle a completed LRO packet, it
is possible that hardware would indicate errors. The current code is
not checking for the error condition. Define the proper error bits and
the macro to check for this error and abort properly.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 69c149e2e39e8d66437c9034bb4926ef2c1f7c23)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

bnxt_en: Fix xmit_more with BQL.

Orabug: 26402533, 26325599, 26366387

We need to write the doorbell if BQL has stopped the queue and
skb->xmit_more is set. Otherwise it is possible for the tx queue to
rot and cause tx timeout.

Fixes: 4d172f21cefe ("bnxt_en: Implement xmit_more.")
Suggested-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit ffe406457753a7ca2061ecc8c4d3971623066911)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

bnxt_en: Pass in sh parameter to bnxt_set_dflt_rings().

Orabug: 26402533, 26325599, 26366387

In the existing code, the local variable sh is hardcoded to true to
calculate default rings for shared ring configuration. It is better
to have the caller determine the value of sh.

Reported-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 702c221ca64060b81af4461553be19cba275da8b)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>