Aaron Young [Tue, 11 Jul 2017 16:56:40 +0000 (09:56 -0700)]
SPARC64: vcc: delay device removal until close()
If a vcc device file is open while it's removed (due to a
domain being unbound), delay the removal of the associated vcc
device structure until the final close() call is made on
the device. This preventsthe device file cdev minor number from
being reused which can result in ugly filesystem warnings to
the console.
Signed-off-by: Aaron Young <aaron.young@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com> Reviewed-By: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Thomas Tai [Mon, 10 Jul 2017 16:55:55 +0000 (10:55 -0600)]
sparc64: fix vio handshake issue
When rebooting multiple LDoms together with bind/unbind a separate
LDom, kernel panic with vio handshake error. The panic is caused by
vio trying to allocate a buffer which is not freed properly. The
ldc_unbind should unconfigure and stop the ldc queue before freeing
the irq. If the irq is freed before stopping the queue, interrupts
can continue to happen after the irq is freed which may cause
issue.
Signed-off-by: Thomas Tai <thomas.tai@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Use cpu_poke hypervisor call to resume idle cpu if supported.
Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Orabug: 25575672 Signed-off-by: Allen Pais <allen.pais@oracle.com> Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Vijay Kumar [Fri, 5 May 2017 19:35:05 +0000 (15:35 -0400)]
sparc64: Add a new hypercall CPU_POKE
This adds a new hypercall CPU_POKE for quickly waking up an idle CPU.
CPU POKE should only be sent to valid non-local CPUs.
Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Orabug: 25575672 Signed-off-by: Allen Pais <allen.pais@oracle.com> Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Thomas Tai [Tue, 27 Jun 2017 15:21:44 +0000 (09:21 -0600)]
sparc64: fix out of order spin_lock_irqsave and spin_unlock_restore
After enabling spinlocks debug option, kernel prints out call trace
suggesting that the function ldom_req_sp_token executes a might_sleep
function while IRQs is disabled. IRQs is disabled because the
spin_lock_irqsave and spin_unlock_irqrestore are out of order.
The last UNLOCK_DS_DEV() ends up restoring IRQs to disabled state,
because the previous LOCK_DS_DEV() is in irqs_disabled().
To fix the issue, follows the order of irqsave()/irqrestore().
Takashi Iwai [Fri, 2 Jun 2017 15:26:56 +0000 (17:26 +0200)]
ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT
snd_timer_user_tselect() reallocates the queue buffer dynamically, but
it forgot to reset its indices. Since the read may happen
concurrently with ioctl and snd_timer_user_tselect() allocates the
buffer via kmalloc(), this may lead to the leak of uninitialized
kernel-space data, as spotted via KMSAN:
BUG: KMSAN: use of unitialized memory in snd_timer_user_read+0x6c4/0xa10
CPU: 0 PID: 1037 Comm: probe Not tainted 4.11.0-rc5+ #2739
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16
dump_stack+0x143/0x1b0 lib/dump_stack.c:52
kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:1007
kmsan_check_memory+0xc2/0x140 mm/kmsan/kmsan.c:1086
copy_to_user ./arch/x86/include/asm/uaccess.h:725
snd_timer_user_read+0x6c4/0xa10 sound/core/timer.c:2004
do_loop_readv_writev fs/read_write.c:716
__do_readv_writev+0x94c/0x1380 fs/read_write.c:864
do_readv_writev fs/read_write.c:894
vfs_readv fs/read_write.c:908
do_readv+0x52a/0x5d0 fs/read_write.c:934
SYSC_readv+0xb6/0xd0 fs/read_write.c:1021
SyS_readv+0x87/0xb0 fs/read_write.c:1018
This patch adds the missing reset of queue indices. Together with the
previous fix for the ioctl/read race, we cover the whole problem.
Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit ba3021b2c79b2fa9114f92790a99deb27a65b728)
Takashi Iwai [Fri, 2 Jun 2017 13:03:38 +0000 (15:03 +0200)]
ALSA: timer: Fix race between read and ioctl
The read from ALSA timer device, the function snd_timer_user_tread(),
may access to an uninitialized struct snd_timer_user fields when the
read is concurrently performed while the ioctl like
snd_timer_user_tselect() is invoked. We have already fixed the races
among ioctls via a mutex, but we seem to have forgotten the race
between read vs ioctl.
This patch simply applies (more exactly extends the already applied
range of) tu->ioctl_lock in snd_timer_user_tread() for closing the
race window.
Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit d11662f4f798b50d8c8743f433842c3e40fe3378)
NVMe: Retain QUEUE_FLAG_SG_GAPS flag for bio vector alignment.
The nvme queue flag QUEUE_FLAG_SG_GAPS checks for the bio vector
alignment against the page size. In upstream, the QUEUE_FLAG_SG_GAPS
flag is replaced by blk_queue_virt_boundary() and pulling in the
respective patches caused instability in the driver and hence
QUEUE_FLAG_SG_GAPS flag is retained for vector alignment.
Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Mintz, Yuval [Fri, 9 Jun 2017 14:17:02 +0000 (17:17 +0300)]
bnx2x: Don't post statistics to malicious VFs
Once firmware indicates that a given VF is malicious and until
that VF passes an FLR all bets are off - PF can't know anything
is happening to the VF [since VF can't communicate anything to its PF].
But PF is currently still periodically asking device to collect
statistics for the VF which might in turn fill logs by IOMMU blocking
memory access done by the VF's PCI function [in the case VF has unmapped
its buffers].
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3523882229b903e967de05665b871dab87c5df0f) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Mintz, Yuval [Fri, 9 Jun 2017 14:17:01 +0000 (17:17 +0300)]
bnx2x: Allow vfs to disable txvlan offload
VF clients are configured as enforced, meaning firmware is validating
the correctness of their ethertype/vid during transmission.
Once txvlan is disabled, VF would start getting SKBs for transmission
here vlan is on the payload - but it'll pass the packet's ethertype
instead of the vid, leading to firmware declaring it as malicious.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 92f85f05caa51d844af6ea14ffbc7a786446a644) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Michal Schmidt [Tue, 6 Jun 2017 14:30:31 +0000 (16:30 +0200)]
bnx2x: fix pf2vf bulletin DMA mapping leak
When freeing VF's DMA mappings, an already NULLed pointer was checked
again due to an apparent copy&paste error. Consequently, the pf2vf
bulletin DMA mapping was not freed.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 996652c7050c70008e4434af108be6f15f20fbd0) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Mintz, Yuval [Thu, 1 Jun 2017 12:57:56 +0000 (15:57 +0300)]
bnx2x: Fix Multi-Cos
Apparently multi-cos isn't working for bnx2x quite some time -
driver implements ndo_select_queue() to allow queue-selection
for FCoE, but the regular L2 flow would cause it to modulo the
fallback's result by the number of queues.
The fallback would return a queue matching the needed tc
[via __skb_tx_hash()], but since the modulo is by the number of TSS
queues where number of TCs is not accounted, transmission would always
be done by a queue configured into using TC0.
Fixes: ada7c19e6d27 ("bnx2x: use XPS if possible for bnx2x_select_queue instead of pure hash") Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3968d38917eb9bd0cd391265f6c9c538d9b33ffa) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e39513259450a8312cb98a9a3b16bb924310dbcc) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Michal Schmidt [Fri, 3 Mar 2017 16:08:33 +0000 (17:08 +0100)]
bnx2x: fix incorrect filter count in an error message
filters->count is the number of filters we were supposed to configure.
There is no reason to increase it by +1 when printing the count in an error
message.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 74bcbeb7d77ec92e4262fc340cb436ef7d98ba01) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 78d5505432436516456c12abbe705ec8dee7ee2b) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 83bd9eb8fc69cdd5135ed6e1f066adc8841800fd) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Michal Schmidt [Fri, 3 Mar 2017 16:08:30 +0000 (17:08 +0100)]
bnx2x: fix possible overrun of VFPF multicast addresses array
It is too late to check for the limit of the number of VF multicast
addresses after they have already been copied to the req->multicast[]
array, possibly overflowing it.
Do the check before copying.
Also fix the error path to not skip unlocking vf2pf_mutex.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 22118d861cec5da6ed525aaf12a3de9bfeffc58f) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Michal Schmidt [Fri, 3 Mar 2017 16:08:29 +0000 (17:08 +0100)]
bnx2x: lower verbosity of VF stats debug messages
When BNX2X_MSG_IOV is enabled, the driver produces too many VF statistics
messages. Lower the verbosity of the VF stats messages similarly as in
commit 76ca70fabbdaa3 ("bnx2x: [Debug] change verbosity of some prints").
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 850268d320f0c7c5eb7ad0a62ef21859fa331ded) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Michal Schmidt [Fri, 3 Mar 2017 16:08:28 +0000 (17:08 +0100)]
bnx2x: prevent crash when accessing PTP with interface down
It is possible to crash the kernel by accessing a PTP device while its
associated bnx2x interface is down. Before the interface is brought up,
the timecounter is not initialized, so accessing it results in NULL
dereference.
Fix it by checking if the interface is up.
Use -ENETDOWN as the error code when the interface is down.
-EFAULT in bnx2x_ptp_adjfreq() did not seem right.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 466e8bf10ac104d96e1ea813e8126e11cb72ea20) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Tomas Jedlicka [Fri, 7 Apr 2017 20:51:53 +0000 (16:51 -0400)]
dtrace: FBT module support and SPARCs return probes
This fix adds two features to FBT provider:
1) support for modules
2) support for return probes on SPARC
The module support of x86 was almost ready as it does not rely on trampolines and
uses hashtables of tracepoints. This works well if we know amount of probes in
advance so can reserve correct amount of memory during module load time. Unfortunately
that is not possible on SPARC and we need to allocate a trampoline dynamically.
Major part of this code is about removing all static assumptions about FBT from kernel
code and moving the responsibility to dtrace modules. Trampolines for SPARC are now
allocated dynamically (including kernel's pseudo module). This applies to SDT trampolines
too.
Second change adds scan for return probes on SPARC with small heuristics to quickly
skip over cases that are not interesting for DTrace. At the same time this patch
allocates new SPARC Trap for FBT.
Support for .init section is not available on any platform. The .init section is freed
after a module is fully loaded and it is not possible to remove its probes without
further chagnes in DTrace framework (modules). This is deffered for later work.
Signed-off-by: Tomas Jedlicka <tomas.jedlicka@oracle.com> Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com> Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
mm: fix use-after-free if memory allocation failed in vma_adjust()
There's one case when vma_adjust() expands the vma, overlapping with
*two* next vma. See case 6 of mprotect, described in the comment to
vma_merge().
To handle this (and only this) situation we iterate twice over main part
of the function. See "goto again".
Vegard reported[1] that he sees out-of-bounds access complain from
KASAN, if anon_vma_clone() on the *second* iteration fails.
This happens because we free 'next' vma by the end of first iteration
and don't have a way to undo this if anon_vma_clone() fails on the
second iteration.
The solution is to do all required allocations upfront, before we touch
vmas.
The allocation on the second iteration is only required if first two
vmas don't have anon_vma, but third does. So we need, in total, one
anon_vma_clone() call.
It's easy to adjust 'exporter' to the third vma for such case.
uek-rpm nano: Signature verification support in kexec_file_load
The following configuration options to support
signature verification in the kexec_file_load
syscall are enabled:
CONFIG_KEXEC_VERIFY_SIG=y
CONFIG_KEXEC_BZIMAGE_VERIFY_SIG=y
CONFIG_PKCS7_MESSAGE_PARSER=y
CONFIG_SIGNED_PE_FILE_VERIFICATION=y
Driver fails Beacon OFF if frequency is set to 0. As per fc-ls spec,
status, capability, frequency and duration fields are only applicable
for Beacon ON.
Remove frequency and type checks. Reject Beacon ON if duration is non
zero.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
OS crashes after the completion of firmware download.
Failure in posting SCSI SGL buffers because number of SGL buffers is
less than total count. Some of the pending IOs are not completed by
driver. SGL buffers for these IOs are not added back to the list.
Pending IOs are not completed because lpfc_wq_list list is initialized
before completion of pending IOs.
Postpone lpfc_wq_list reinitialization by moving
lpfc_sli4_queue_destroy() after lpfc_hba_down_post().
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Mailbox submission fails because mailbox interrupt is disabled. Mailbox
interrupt is disabled during port reset.
Do reset only for physical port.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Kernel panic when log_verbose is set to 0xffffffff
phba->pport is dereferenced before it is initialized
Fix: Do not dereference phba->pport if it is NULL
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
On hbacmd reset failure, observing wrong string "nline" in kernel log.
On failure, non negative value (1) is returned from sysfs store
routine. It is interpreted as count by kernel and store routine is
called again with the remaining characters as input.
Fix: Return negative error code (-EIO) in case of failure.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Observing lpfc port down after issuing hbacmd reset command
Failure in posting SGL buffers. If there is only one SGL buffer and rrq
is valid for its XRI, we are rightly returning NULL but not adding the
buffer back to the SGL list. So, number of buffers become less than
total count and repost fails during reset.
Add SGL buffer back to list before returning NULL.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Trivial fix to spelling mistake in debugfs message
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Added code to support Cisco MDS loopback diagnostic. The diagnostics run
various loopbacks including one which loops-back frame through the
driver.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
With 255 vports created a link trasition can casue a crash.
When going through discovery after a link bounce the driver is using
rpis before the cmd FCOE_POST_HDR_TEMPLATES completes. By doing that the
next rpi bumps the rpi range out of the boundary.
The fix it to increment the next_rpi only when the
FCOE_POST_HDR_TEMPLATE succeeds.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
To select the appropriate shost template, the driver is issuing a
mailbox command to retrieve the wwn. Turns out the sending of the
command precedes the reset of the function. On SLI-4 adapters, this is
inconsequential as the mailbox command location is specified by dma via
the BMBX register. However, on SLI-3 adapters, the location of the
mailbox command submission area changes. When the function is first
powered on or reset, the cmd is submitted via PCI bar memory. Later the
driver changes the function config to use host memory and DMA. The
request to start a mailbox command is the same, a simple doorbell write,
regardless of submission area. So.. if there has not been a boot driver
run against the adapter, the mailbox command works as defaults are
ok. But, if the boot driver has configured the card and, and if no
platform pci function/slot reset occurs as the os starts, the mailbox
command will fail. The SLI-3 device will use the stale boot driver dma
location. This can cause PCI eeh errors.
Fix is to reset the sli-3 function before sending the mailbox command,
thus synchronizing the function/driver on mailbox location.
Note: The fix uses routines that are typically invoked later in the call
flow to reset the sli-3 device. The issue in using those routines is
that the normal (non-fix) flow does additional initialization, namely
the allocation of the pport structure. So, rather than significantly
reworking the initialization flow so that the pport is alloc'd first,
pointer checks are added to work around it. Checks are limited to the
routines invoked by a sli-3 adapter (s3 routines) as this fix/early call
is only invoked on a sli3 adapter. Nothing changes post the
fix. Subsequent initialization, and another adapter reset, still occur -
both on sli-3 and sli-4 adapters.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Fixes: 96418b5e2c88 ("scsi: lpfc: Fix eh_deadline setting for sli3 adapters.") Cc: stable@vger.kernel.org # v4.11+ Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
The older sli4 adapters only supported the 64 byte WQE entry size.
The new adapter (fw) support both 64 and 128 byte WQE entry sizies.
The Express lane WQ was not being created with the 128 byte WQE sizes
when it was supported.
Not having the right WQE size created for the express lane work queue
caused the the firmware to overwrite the lun indentifier in the FCP header.
This patch correctly creates the express lane work queue with the
supported size.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Brian Maly <brian.maly@oracle.com>
There are two versions of a structure for queue creation and setup that the
driver shares with FW. The driver was only treating as version 0.
Verify WQ_CREATE with 128B WQEs in V0 and V1.
Code review of another bug showed the driver passing
128B WQEs and 8 pages in WQ CREATE and V0.
Code inspection/instrumentation showed that the driver
uses V0 in WQ_CREATE and if the caller passes queue->entry_size
128B, the driver sets the hdr_version to V1 so all is good.
When I tested the V1 WQ_CREATE, the mailbox failed causing
the driver to unload.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Firmware sends first FLOGI to fabric with vendor version changes.
On link up driver gets updated service parameter with FAWWN assigned port
name. Driver sends 2nd FLOGI with updated fawwpn and modifies the
vport->fc_portname in driver.
Note:
Soft wwpn will not be allowed when fawwpn is enabled.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Brian Maly <brian.maly@oracle.com>
When RPI is not available, driver sends WQE with invalid RPI value and
rejected by HBA.
lpfc 0000:82:00.3: 1:3154 BLS ABORT RSP failed, data: x3/xa0320008
and
lpfc :2753 PLOGI failure DID:FFFFFA Status:x3/xa0240008
In this case, driver accesses rpi_ids array out of bounds.
Fix:
Check return value of lpfc_sli4_alloc_rpi(). Do not allocate
lpfc_nodelist entry if RPI is not available.
When RPI is not available, we will get discovery timeouts and
command drops for some of the vports as seen below.
lpfc :0273 Unexpected discovery timeout, vport State x0
lpfc :0230 Unexpected timeout, hba link state x5
lpfc :0111 Dropping received ELS cmd Data: x0 xc90c55 x0
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Brian Maly <brian.maly@oracle.com>
The check for NULL ptr is not necessary, kfree will check it.
Removing NULL ptr check.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
This patch was applied manually.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de Signed-off-by: Brian Maly <brian.maly@oracle.com>
lpfc cannot establish connection with targets that send PRLI in P2P
configurations.
If lpfc rejects a PRLI that is sent from a target the target will not
resend and will reject the PRLI send from the initiator.
[mkp: applied by hand]
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
NVME merge reverted diag port names to the physical port.
They should be the vport.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
A previous change unilaterally removed the hba reset entry point
from the sli3 host template. This was done to allow tape devices
being used for back up from being removed. Why was this done ?
When there was non-responding device on the fabric, the error
escalation policy would escalate to the reset handler. When the
reset handler was called, it would reset the adapter, dropping
link, thus logging out and terminating all i/o's - on any target.
If there was a tape device on the same adapter that wasn't in
error, it would kill the tape i/o's, effectively killing the
tape device state. With the reset point removed, the adapter
reset avoided the fabric logout, allowing the other devices to
continue to operate unaffected. A hack - yes. Hint: we really
need a transport I_T nexus reset callback added to the eh process
(in between the SCSI target reset and hba reset points), so a
fc logout could occur to the one bad target only and stop the error
escalation process.
This patch commonizes the approach so it can be used for sli3 and sli4
adapters, but mandates the admin, via module parameter, specifically
identify which adapters the resets are to be removed for. Additionally,
bus_reset, which sends Target Reset TMFs to all targets, is also removed
from the template as it too has the same effect as the adapter reset.
This patch had to be modified from the original because of the NVME changes that are in the upstream driver.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
if REG_VPI fails, the driver was incorrectly issuing INIT_VFI
(a SLI4 command) on a SLI3 adapter.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 5d181531bc6169e19a02a27d202cf0e982db9d0e) Signed-off-by: Brian Maly <brian.maly@oracle.com>
In the case where sglq is null, the current code just returns without
unlocking the spinlock sql_list_lock. Fix this by breaking out of the
while loop and the exit path will then unlock and return NULL as was
the original intention.
Detected by CoverityScan, CID#1411635 ("Missing unlock")
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Willy Tarreau [Tue, 16 May 2017 17:18:55 +0000 (19:18 +0200)]
char: lp: fix possible integer overflow in lp_setup()
The lp_setup() code doesn't apply any bounds checking when passing
"lp=none", and only in this case, resulting in an overflow of the
parport_nr[] array. All versions in Git history are affected.
Reported-By: Roee Hay <roee.hay@hcl.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: stable@vger.kernel.org Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3e21f4af170bebf47c187c1ff8bf155583c9f3b1)
We want to use kthread_stop() in order to ensure the threads are
shut down before we tear down the nfs_callback_info in nfs_callback_down.
Tested-and-reviewed-by: Kinglong Mee <kinglongmee@gmail.com> Reported-by: Kinglong Mee <kinglongmee@gmail.com> Fixes: bb6aeba736ba9 ("NFSv4.x: Switch to using svc_set_num_threads()...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
(cherry picked from commit ed6473ddc704a2005b9900ca08e236ebb2d8540a)
Refactor to separate out the functions of starting and stopping threads
so that they can be used in other helpers.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-and-reviewed-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
(cherry picked from commit 9e0d87680d689f1758185851c3da6eafb16e71e1)
WANG Cong [Tue, 9 May 2017 23:59:54 +0000 (16:59 -0700)]
ipv6/dccp: do not inherit ipv6_mc_list from parent
Like commit 657831ffc38e ("dccp/tcp: do not inherit mc_list from parent")
we should clear ipv6_mc_list etc. for IPv6 sockets too.
Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 83eaddab4378db256d00d295bda6ca997cd13a52)
Anand Jain [Sat, 13 Feb 2016 02:01:37 +0000 (10:01 +0800)]
btrfs: enhance btrfs_find_device_by_user_input() to check device path
The operation of device replace and device delete follows same steps upto
some depth with in btrfs kernel, however they don't share codes. This
enhancement will help replace and delete to share codes.
Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
Orabug: 26287586
Anand Jain [Sat, 13 Feb 2016 02:01:34 +0000 (10:01 +0800)]
btrfs: clean up and optimize __check_raid_min_device()
__check_raid_min_device() which was pealed from btrfs_rm_device()
maintianed its original code to show the block move. This patch cleans up
__check_raid_min_device().
Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
Orabug: 26287586
(Cherry picked from commit bd45ffbcb1f082e3c5a0bd56b1d7310d8b707ffb,
drop the second argument of the btrfs_dev_replace_lock/unlock to
match with the current kernel api)
Signed-off-by: Shan Hai <shan.hai@oracle.com> Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Anand Jain [Sat, 13 Feb 2016 02:01:33 +0000 (10:01 +0800)]
btrfs: create helper function __check_raid_min_devices()
move a section of btrfs_rm_device() code to check for min number of the
devices into the function __check_raid_min_devices()
Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
Orabug: 26287586
(Cherry picked from commit f1fa7f264250f2cb60119aca3fd114c3f47070c2,
drop the second argument of the btrfs_dev_replace_lock/unlock to
match with the current kernel api)
Signed-off-by: Shan Hai <shan.hai@oracle.com> Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Tomas Jedlicka [Sun, 2 Jul 2017 16:17:01 +0000 (12:17 -0400)]
dtrace: Add support for manual triggered cyclics
In some scenarios it is better if a client of cyclic susbstem can reprogram cyclic on
his own. This is not possible with current implementation.
This chage adds cyclic_reprogram() that can be used to schedule cyclic from inside and
outside of its handler. A manually triggered cyclic is distinguished from other types
by having its interval set to -1.
Tomas Jedlicka [Fri, 30 Jun 2017 13:17:06 +0000 (09:17 -0400)]
dtrace: LOW level cyclics should use workqueues
The HIGH level cyclics are meant to be run from interrupt handler. This works on
Linux because the hrtimer is scheduled as tasklet. The LOW level cyclics must be
interruptible and should not be scheduled as tasklests.
DTrace is currently relying on being able to call dtrace_sync() from within a cyclic
handler. On Linux it is not safe to try send IPIs from within interrupt/bottom half
handlers.
This fix changes LOW level cyclics to use workqueues. At the moment we are using
shared system workqueue but it may be required to allocate our owns if this causes
big latency in our timer routines.
Signed-off-by: Tomas Jedlicka <tomas.jedlicka@oracle.com> Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com> Acked-by: Chuck Anderson <chuck.anderson@oracle.com>
Tushar Dave [Thu, 2 Mar 2017 01:29:57 +0000 (17:29 -0800)]
SPARC64: Correct ATU IOTSB binding flow
Any PCIe device attempting to use ATU for 64-bit DMA must be successfully
bound to IOTSB otherwise DMA access from device will be rejected by
PCIe root complex.
In the current code, ATU initialization and PCIe device binding to IOTSB
is done during system boot when PCI Bus Module (PBM) is initialized.
(PBM driver traverse through the PCI tree and binds each PCI device to
IOTSB).
In case if user/system add SRIOV VFs later on (either after system is up
or when PF driver is loaded), the new VF devices never get bounded to
IOTSB. So when attempt is made from VF devices to access DMA region, ATU
HW will flag it as invalid access; generates CTE_Invalid errors and VF
driver/device fails to perform DMA.
This patch fixes the aforementioned issue by delaying PCIe device
binding to IOTSB only when driver requests DMA setting from kernel.
Doing it this way makes sure every PCIe device requesting 64bit DMA mask
(and therefore attempting to use ATU) will get bind to IOTSB and can do
successful DMA operations.
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Reviewed-by: Govinda Tatti <Govinda.Tatti@Oracle.COM> Reviewed-by: HÃ¥kon Bugge <haakon.bugge@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com> Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com> Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com> Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com> Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com> Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com> Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Allen Pais [Fri, 7 Jul 2017 07:54:25 +0000 (13:24 +0530)]
sparc64: Add ATU (new IOMMU) support
ATU (Address Translation Unit) is a new IOMMU in SPARC supported with
Hypervisor IOMMU v2 APIs.
Current SPARC IOMMU supports only 32bit address ranges and one TSB
per PCIe root complex that has a 2GB per root complex DVMA space
limit. The limit has become a scalability bottleneck nowadays that
a typical 10G/40G NIC can consume 300MB-500MB DVMA space per
instance. When DVMA resource is exhausted, devices will not be usable
since the driver can't allocate DVMA.
ATU removes bottleneck by allowing guest os to create IOTSB of size
32G (or more) with 64bit address ranges available in ATU HW. 32G is
more than enough DVMA space to be shared by all PCIe devices under
root complex contrast to 2G space provided by legacy IOMMU.
ATU allows PCIe devices to use 64bit DMA addressing. Devices
which choose to use 32bit DMA mask will continue to work with the
existing legacy IOMMU.
Reviewed-by: chris hyser <chris.hyser@oracle.com> Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com> Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Jesse Brandeburg [Sat, 3 Oct 2015 00:57:21 +0000 (17:57 -0700)]
i40e: fix annoying message
The driver was printing a message about not being able
to assign VMDq because of a lack of MSI-X vectors.
This was because a line was missing that initialized a variable,
simply a merge error.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 26409501
(cherry picked from commit e9e53662d8130dd950885e37dc1d97008e1283f9) Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Allen Pais [Mon, 10 Jul 2017 07:31:23 +0000 (13:01 +0530)]
watchdog: Move hardlockup detector to separate file
Separate hardlockup code from watchdog.c and move it to watchdog_hld.c.
It is mostly straight forward. Remove everything inside
CONFIG_HARDLOCKUP_DETECTORS. This code will go to file watchdog_hld.c.
Also update the makefile accordigly.
Reviewed-by: Karl Volz <karl.volz@oracle.com> Signed-off-by: Babu Moger <babu.moger@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Allen Pais [Mon, 10 Jul 2017 07:23:57 +0000 (12:53 +0530)]
watchdog: Move shared definitions to nmi.h
Move shared macros and definitions to nmi.h so that watchdog.c,
new file watchdog_hld.c or any other architecture specific handler
can use those definitions.
Reviewed-by: Karl Volz <karl.volz@oracle.com> Signed-off-by: Babu Moger <babu.moger@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Sanath Kumar [Thu, 29 Jun 2017 17:13:28 +0000 (12:13 -0500)]
sparc64: Suppress kmalloc (DAX driver) warning due to allocation failure
dax_alloc is used by libdax to allocate 4MB chunks of physically
contiguous memory using kmalloc. It is normal for dax_alloc to fail
when kmalloc runs out of memory and this should not be treated as a warning.
This failure should be caught in libdax and attempted to allocate memory
from another source (Eg: mmap hugepage).
Signed-off-by: Sanath Kumar <sanath.s.kumar@oracle.com> Reviewed-by: Rob Gardner <rob.gardner@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Tushar Dave [Mon, 1 May 2017 22:11:07 +0000 (16:11 -0600)]
i40evf: Use le32_to_cpu before evaluating HW desc fields.
i40e hardware descriptor fields are in little-endian format. Driver
must use le32_to_cpu while evaluating these fields otherwise on
big-endian arch we end up evaluating incorrect values, cause errors
like:
i40evf 0001:04:02.0: Expected response 0 from PF, received 285212672
i40evf 0001:04:02.1: Expected response 0 from PF, received 285212672
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
This all started with our TPCC results on UEK4. During T7 testing, the TPCC results
were much lower compared to UEK2. The atomic calls like atomic_add and atomic_sub
were showing top on perf results. Karl found out that this was caused by
the upstream commit e9b9eb59ffcdee09ec96b040f85c919618f4043e
(sparc64: Use pause instruction when available). After reverting this commit on UEK4,
the TPCC numbers were back to UEK2 level. However, things changed after Atish's
scheduler fixes on UEK4. The TPCC numbers improved and the upstream commit
(sparc64: Use pause instruction when available) did not seem make any difference.
So, Karl's "revert pause instruction" patch was removed from UEK4.
Now again with T8 testing, we are seeing the same old behaviour. The atomic calls
like atomic_add and atomic_sub are showing top on perf results. After trying with
Karl's patch(revert pause instruction patch for atomic backoff) the TPCC numbers
improved(about %25 better than T7) and atomic calls are not showing on top in perf.
So, we are adding this patch back again. This is a temporary fix. Long term solution
is still in the discussion. The original patch is from Karl.
http://ca-git.us.oracle.com/?p=linux-uek-sparc.git;a=commit;h=f214eebf2223d23a2b1499be5b54719bdd7651e3
All the credit should go to Karl. Rebased it on latest sparc tree.
Signed-off-by: Karl Volz <karl.volz@Oracle.com> Reviewed-by: Atish Patra <atish.patra@oracle.com> Signed-off-by: Henry Willard <henry.willard@oracle.com> Signed-off-by: Babu Moger <babu.moger@oracle.com> Reviewed-by: Karl Volz <karl.volz@Oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com> Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Suresh Reddy [Thu, 25 May 2017 02:24:38 +0000 (22:24 -0400)]
be2net: Fix UE detection logic for BE3
On certain platforms BE3 chips may indicate spurious UEs (unrecoverable
error). Because of the UE detection logic was disabled in the driver
for BE3 chips. Because of this, even in cases of a real UE,
a failover will not occur. This patch re-enables UE detection on BE3
and if a UE is detected, reads the POST register. If the POST register,
reports either a FAT_LOG_STATE or a ARMFW_UE, then it means that a valid
UE occurred in the chip.
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com> Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Vlad Yasevich [Tue, 23 May 2017 17:38:42 +0000 (13:38 -0400)]
be2net: Fix offload features for Q-in-Q packets
At least some of the be2net cards do not seem to be capabled
of performing checksum offload computions on Q-in-Q packets.
In these case, the recevied checksum on the remote is invalid
and TCP syn packets are dropped.
This patch adds a call to check disbled acceleration features
on Q-in-Q tagged traffic.
Signed-off-by: Karim Eshapa <karim.eshapa@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com> Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com> Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Ivan Vecera [Tue, 31 Jan 2017 19:01:31 +0000 (20:01 +0100)]
be2net: fix initial MAC setting
Recent commit 34393529163a ("be2net: fix MAC addr setting on privileged
BE3 VFs") allows privileged BE3 VFs to set its MAC address during
initialization. Although the initial MAC for such VFs is already
programmed by parent PF the subsequent setting performed by VF is OK,
but in certain cases (after fresh boot) this command in VF can fail.
The MAC should be initialized only when:
1) no MAC is programmed (always except BE3 VFs during first init)
2) programmed MAC is different from requested (e.g. MAC is set when
interface is down). In this case the initial MAC programmed by PF
needs to be deleted.
The adapter->dev_mac contains MAC address currently programmed in HW so
it should be zeroed when the MAC is deleted from HW and should not be
filled when MAC is set when interface is down in be_mac_addr_set() as
no programming is performed in this case.
Example of failure without the fix (immediately after fresh boot):
be2net 0000:01:00.0 eth0: Link is Up
...
be2net 0000:01:04.0: Emulex OneConnect(be3): VF port 0
Eric Dumazet [Mon, 30 Jan 2017 16:22:01 +0000 (08:22 -0800)]
drivers: net: generalize napi_complete_done()
napi_complete_done() allows to opt-in for gro_flush_timeout,
added back in linux-3.19, commit 3b47d30396ba
("net: gro: add a per device gro flush timer")
This allows for more efficient GRO aggregation without
sacrifying latencies.
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com> Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Ivan Vecera [Fri, 13 Jan 2017 21:38:29 +0000 (22:38 +0100)]
be2net: fix MAC addr setting on privileged BE3 VFs
During interface opening MAC address stored in netdev->dev_addr is
programmed in the HW with exception of BE3 VFs where the initial
MAC is programmed by parent PF. This is OK when MAC address is not
changed when an interfaces is down. In this case the requested MAC is
stored to netdev->dev_addr and later is stored into HW during opening.
But this is not done for all BE3 VFs so the NIC HW does not know
anything about this change and all traffic is filtered.
This is the case of bonding if fail_over_mac == 0 where the MACs of
the slaves are changed while they are down.
The be2net behavior is too restrictive because if a BE3 VF has
the FILTMGMT privilege then it is able to modify its MAC without
any restriction.
To solve the described problem the driver should take care about these
privileged BE3 VFs so the MAC is programmed during opening. And by
contrast unpriviled BE3 VFs should not be allowed to change its MAC
in any case.
Ivan Vecera [Fri, 6 Jan 2017 20:59:30 +0000 (21:59 +0100)]
be2net: fix unicast list filling
The adapter->pmac_id[0] item is used for primary MAC address but
this is not true for adapter->uc_list[0] as is assumed in
be_set_uc_list(). There are N UC addresses copied first from net_device
to adapter->uc_list[1..N] and then N UC addresses from
adapter->uc_list[0..N-1] are sent to HW. So the last UC address is never
stored into HW and address 00:00:00:00;00:00 (from uc_list[0]) is used
instead.
Ivan Vecera [Fri, 6 Jan 2017 19:30:02 +0000 (20:30 +0100)]
be2net: fix accesses to unicast list
Commit 988d44b "be2net: Avoid redundant addition of mac address in HW"
introduced be_dev_mac_add & be_uc_mac_add helpers that incorrectly
access adapter->uc_list as an array of bytes instead of an array of
be_eth_addr. Consequently NIC is not filled with valid data so unicast
filtering is broken.
drivers/net/ethernet/emulex/benet/be_main.c:47:25: warning:
symbol 'be_err_recovery_workq' was not declared. Should it be static?
drivers/net/ethernet/emulex/benet/be_main.c:63:25: warning:
symbol 'be_wq' was not declared. Should it be static?
be2net: Avoid redundant addition of mac address in HW
If a mac address is added to the uc_list and later the same mac address
is added via ndo_set_mac_address() or vice versa, the driver does not
detect this condition and tries to add it again. This results in a mac
address collision error when the FW rejects it.
Fix this by checking if the given mac address is present in uc_list while
setting the device mac address and vice versa. Similarly skip deletion if
the address is still in use in the other form.
be2net: Support UE recovery in BEx/Skyhawk adapters
This patch supports recovery from UEs caused due to Transient Parity
Errors (TPE), in BE2, BE3 and Skyhawk adapters. This change avoids
system reboot when such errors occur. The driver recovers from these
errors such that the adapter resumes full operational status as prior
to the UE.
Following is the list of changes in the driver to support this:
o The driver registers its UE recoverable capability with ARM FW at init
time. This also allows the driver to know if the feature is supported in
the FW.
o As the UE recovery requires precise time bound processing, the driver
creates its own error recovery work queue with a single worker thread (per
module, shared across functions).
o Each function runs an error detection task at an interval of 1 second as
required by the FW. The error detection logic already exists for BEx/SH,
but it now runs in the context of a separate worker thread.
o When an error is detected by the task, if it is recoverable, the PF0
driver instance initiates a soft reset, while other PF driver instances
wait for the reset to complete and the chip to become ready. Once
the chip is ready, all driver instances including PF0, resume to
reinitialize the respective functions.
o The PF0 driver checks for some recovery criteria, to determine if the
recovery can be initiated. If the criteria is not met, the PF0 driver does
not initiate a soft reset, it retains the existing behavior to stop
further processing and requires a reboot to get the chip to operational
state again.
o To allow each function to share the workq, while also making progress in
its recovery process, a per-function recovery state machine is used.
The per-function tasks avoid blocking operations like msleep() while in
this state machine (until reinit state) and instead reschedule for the
required delay.
o With these changes, the existing error recovery code for Lancer also
runs in the context of the new worker thread.
be2net: replace polling with sleeping in the FW completion path
The ndo_set_rx_mode() and ndo_add/del_vxlan_port() calls may be called with
BHs disabled. The driver currently issues the required cmds to the FW in
these contexts and polls on completions from the FW, while BHs remain
disabled. This can cause either packet loss or packet reception to be
delayed on that CPU.
This patch defers processing of the above cmds to a separate workqueue.
With this change, FW cmds are now issued only in process context.
Now that the FW cmds are issued only in process context, they can sleep
waiting for a completion instead of polling. All the spin_lock_bh(mcc_lock)
calls are now replaced with mutex calls.
Also a new rx_filter_lock is now needed to protect the RX filtering fields
like vids[] between be_vlan_add/rem_vid() and __be_set_rx_mode() contexts.
Sathya Perla [Wed, 22 Jun 2016 12:54:54 +0000 (08:54 -0400)]
be2net: support asymmetric rx/tx queue counts
be2net so far supported creation of RX/TX queues only in pairs.
On configs where rx and tx queue counts are different, creation of only
the lesser number of queues has been supported.
This patch now allows a combination of RX/TX-only channels along with
combined channels. N TX-queues and M RX-queues can be created with the
following cmds:
ethtool -L ethX combined N rx M-N (when N < M)
ethtool -L ethX combined M tx N-M (when M < N)
Setting both RX-only and TX-only channels is still not supported.
It is mandatory to create atleast one combined channel.
Eric Dumazet [Wed, 15 Mar 2017 20:21:28 +0000 (13:21 -0700)]
net: properly release sk_frag.page
I mistakenly added the code to release sk->sk_frag in
sk_common_release() instead of sk_destruct()
TCP sockets using sk->sk_allocation == GFP_ATOMIC do no call
sk_common_release() at close time, thus leaking one (order-3) page.
iSCSI is using such sockets.
Fixes: 5640f7685831 ("net: use a per task frag allocator") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Orabug: 26409538
To handle netpoll properly, the driver must only handle TX packets
during NAPI. Handling RX events cause warnings and errors in
netpoll mode. The ndo_poll_controller() method should call
napi_schedule() directly so that a NAPI weight of zero will be used
during netpoll mode.
The bnxt_en driver supports 2 ring modes: combined, and separate rx/tx.
In separate rx/tx mode, the ndo_poll_controller() method will only
process the tx rings. In combined mode, the rx and tx completion
entries are mixed in the completion ring and we need to drop the rx
entries and recycle the rx buffers.
Add a function bnxt_force_rx_discard() to handle this in netpoll mode
when we see rx entries in combined ring mode.
Reported-by: Calvin Owens <calvinowens@fb.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 2270bc5da34979454e6f2eb133d800b635156174) Signed-off-by: Brian Maly <brian.maly@oracle.com> Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
When we get a TPA_END completion to handle a completed LRO packet, it
is possible that hardware would indicate errors. The current code is
not checking for the error condition. Define the proper error bits and
the macro to check for this error and abort properly.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 69c149e2e39e8d66437c9034bb4926ef2c1f7c23) Signed-off-by: Brian Maly <brian.maly@oracle.com> Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
We need to write the doorbell if BQL has stopped the queue and
skb->xmit_more is set. Otherwise it is possible for the tx queue to
rot and cause tx timeout.
Fixes: 4d172f21cefe ("bnxt_en: Implement xmit_more.") Suggested-by: Yuval Mintz <yuval.mintz@cavium.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit ffe406457753a7ca2061ecc8c4d3971623066911) Signed-off-by: Brian Maly <brian.maly@oracle.com> Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
In the existing code, the local variable sh is hardcoded to true to
calculate default rings for shared ring configuration. It is better
to have the caller determine the value of sh.
Reported-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 702c221ca64060b81af4461553be19cba275da8b) Signed-off-by: Brian Maly <brian.maly@oracle.com> Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>