www.infradead.org Git - users/jedix/linux-maple.git/log

Merge branch 'topic/uek-4.1/drivers' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/drivers:
PCI: hv: Microsoft changes in support of RHEL and UEK4

PCI: hv: Microsoft changes in support of RHEL and UEK4

This patch layers changes made by Microsoft in a GitHub repo to support
RHEL kernel versions. It eliminates the IRQ Domain dependencies of the
initial commit into mainline. This has a few modifications for OL7/UEK4.

Orabug: 25507635
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>

Merge branch topic/uek-4.1/dtrace of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/dtrace:
  dtrace: get rid of dtrace_gethrtime
  dtrace: drop spurious debugging left in by accident
  dtrace: continuing the FBT implementation and fixes
  dtrace: ensure DTrace can use get_user_pages safely
  dtrace: enable paranoid mode and IST shift for xen_int3
  dtrace: ensure we skip the entire SDT probe point
  dtrace: add ip SDT provider

Merge branch 'topic/uek-4.1/drivers' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/drivers: (262 commits)
  scsi: qla2xxx: Fix apparent cut-n-paste error.
  scsi: qla2xxx: Fix Target mode handling with Multiqueue changes.
  scsi: qla2xxx: Add Block Multi Queue functionality.
  scsi: qla2xxx: Add multiple queue pair functionality.
  qla2xxx: Add irq affinity notification
  scsi: qla2xxx: Fix scsi scan hang triggered if adapter fails during init
  be2net: get rid of custom busy poll code
  be2net: fix initial MAC setting
  be2net: fix MAC addr setting on privileged BE3 VFs
  be2net: don't delete MAC on close on unprivileged BE3 VFs
  be2net: fix status check in be_cmd_pmac_add()
  be2net: Increase skb headroom size to 256 bytes
  be2net: Add DEVSEC privilege to SET_HSW_CONFIG command.
  be2net: do not call napi_hash_del()
  be2net: Enable VF link state setting for BE3
  be2net: Fix TX stats for TSO packets
  be2net: Update Copyright string in be_hw.h
  be2net: NCSI FW section should be properly updated with ethtool for BE3
  be2net: Provide an alternate way to read pf_num for BEx chips
  be2net: mark symbols static where possible
  ...

Conflicts:
drivers/net/ethernet/intel/i40e/i40e_main.c
drivers/scsi/be2iscsi/be_main.c

Merge branch topic/uek-4.1/rpm-build of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/rpm-build:
uek-rpm: sync up spec with linux-firmware version

Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/upstream-cherry-picks:
  perf/core: Fix concurrent sys_perf_event_open() vs. 'move_group' race
  ext2: convert to mbcache2
  ext4: convert to mbcache2
  mbcache2: reimplement mbcache

Merge branch topic/uek-4.1/rpm-build of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/rpm-build:
uek-config: enable CONFIG_MOUSE_PS2_VMMOUSE
uek-rpm: enable CONFIG_KSPLICE.

Merge branch topic/uek-4.1/uek-carry of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/uek-carry:
timers: Use proper base migration in add_timer_on()

Merge branch 'topic/uek-4.1/drivers' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/drivers: (289 commits)
  Input: vmmouse - remove port reservation
  Input: vmmouse - fix absolute device registration
  bnxt_en: use eth_hw_addr_random()
  bnxt_en: fix pci cleanup in bnxt_init_one() failure path
  bnxt_en: Fix NULL pointer dereference in a failure path during open.
  bnxt_en: Reject driver probe against all bridge devices
  bnxt_en: Added PCI IDs for BCM57452 and BCM57454 ASICs
  bnxt_en: Fix bnxt_setup_tc() error message.
  bnxt_en: Print FEC settings as part of the linkup dmesg.
  bnxt_en: Do not setup PHY unless driving a single PF.
  bnxt_en: Add hardware NTUPLE filter for encapsulated packets.
  bnxt_en: Allow NETIF_F_NTUPLE to be enabled on VFs.
  bnxt_en: Fix ethtool -l pre-set max combined channel.
  bnxt_en: Retry failed NVM_INSTALL_UPDATE with defragmentation flag.
  bnxt_en: Update to firmware interface spec 1.7.0.
  bnxt_en: Refactor tx completion path.
  bnxt_en: Add a set of TX rings to support XDP.
  bnxt_en: Add tx ring mapping logic.
  bnxt_en: Centralize logic to reserve rings.
  bnxt_en: Use event bit map in RX path.
  ...

Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/upstream-cherry-picks:
  Btrfs: fix crash on fsync when using overlayfs v4
  vfio/pci: Hide broken INTx support from user
  crypto: cryptd - Assign statesize properly
  crypto: ghash-clmulni - Fix load failure
  USB: digi_acceleport: do sanity checking for the number of ports
  ksplice: add sysctls for determining Ksplice features.
  signal: protect SIGNAL_UNKILLABLE from unintentional clearing.

Conflicts:
kernel/Makefile
kernel/sysctl.c

Merge branch topic/uek-4.1/rpm-build of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/rpm-build:
config: enable simple framebuffer driver for OL6
fm10k: Add driver to the kernel config for UEK4

Merge branch topic/uek-4.1/sparc of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/sparc: (32 commits)
  sparc: fix kernel panic caused by vio handshake
  sparc64: Add sensible read values for /proc/<pid>/sparc_adi
  sparc64: Add ability to set the mcde state for a process
  sparc64: Add proc files specific to ADI
  sparc64: add mcd_on_by_default
  Revert "sparc: fix intermittent LDom hang waiting for vdc_port_up"
  sparc64: Add support for ADI (Application Data Integrity)
  sparc64: Add support for ADI register fields, ASIs and traps
  mm: Add functions to support extra actions on swap in/out
  signals, sparc: Add signal codes for ADI violations
  sparc64: shut down to OBP correctly
  sparc64: fix for user probes in high memory
  sparc64: Use online cpus instead of present cpus during hotplug.
  sparc64: Update cpumaps correctly during hotplug.
  sparc: fix intermittent LDom hang waiting for vdc_port_up
  arch/sparc: Add a dedicated clear_page and clear_user_page for M7
  sparc64: perf: Enable dynamic tracepoints when using perf probe
  SPARC64: UEK4 LDOMS DOMAIN SERVICES UPDATE 7
  arch/sparc: Fix indexing msi_msiqid_table and msi_irq_table
  arch/sparc: Clear msi_msiqid_table during teardown
  ...

Merge branch 'topic/uek-4.1/drivers' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/drivers: (200 commits)
  scsi: megaraid-sas: request irqs later
  scsi: megaraid_sas: add in missing white spaces in error messages text
  scsi: megaraid_sas: fix macro MEGASAS_IS_LOGICAL to avoid regression
  scsi: megaraid_sas: driver version upgrade
  scsi: megaraid_sas: Do not set MPI2_TYPE_CUDA for JBOD FP path for FW which does not support JBOD sequence map
  scsi: megaraid_sas: Send SYNCHRONIZE_CACHE for VD to firmware
  scsi: megaraid_sas: Do not fire DCMDs during PCI shutdown/detach
  scsi: megaraid_sas: Send correct PhysArm to FW for R1 VD downgrade
  scsi: megaraid_sas: For SRIOV enabled firmware, ensure VF driver waits for 30secs before reset
  scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices
  scsi: megaraid_sas: clean function declarations in megaraid_sas_base.c up
  scsi: megaraid_sas: add in missing white space in error message text
  scsi: megaraid_sas: Fix the search of first memory bar
  scsi: megaraid_sas: Use memdup_user() rather than duplicating its implementation
  megaraid_sas: Fix probing cards without io port
  megaraid_sas: Do not fire MR_DCMD_PD_LIST_QUERY to controllers which do not support it
  megaraid_sas: Downgrade two success messages to info
  megaraid_sas: driver version upgrade
  megaraid_sas: task management code optimizations
  megaraid_sas: call ISR function to clean up pending replies in OCR path
  ...

Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/upstream-cherry-picks: (280 commits)
  dm btree: fix bufio buffer leaks in dm_btree_del() error path
  ipv4: keep skb->dst around in presence of IP options
  ip6_gre: fix ip6gre_err() invalid reads
  watchdog: hpwdt: changed maintainer information
  watchdog: hpwdt: add support for iLO5
  watchdog: hpwdt: remove email address from doc
  watchdog: hpwdt: Adjust documentation to match latest kernel module parameters.
  hpwdt: use nmi_panic() when kernel panics in NMI handler
  panic: change nmi_panic from macro to function
  watchdog/hpwdt: Fix build on certain configs
  watchdog/hpwdt: Create stack frame in asminline_call()
  x86/asm: Add C versions of frame pointer macros
  x86/asm: Clean up frame pointer macros
  watchdog: hpwdt: HP rebranding
  panic, x86: Allow CPUs to save registers even if looping in NMI context
  watchdog: hpwdt: Add support for WDIOC_SETOPTIONS
  kvm: fix page struct leak in handle_vmon
  bnx2: use READ_ONCE() instead of barrier()
  bnx2: Wait for in-flight DMA to complete at probe stage
  bnx2: fix locking when netconsole is used
  ...

Merge branch topic/uek-4.1/stable-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/stable-cherry-picks:
btrfs: trimming some start_transaction() code away
dm flakey: fix reads to be issued if drop_writes configured

sparc: fix kernel panic caused by vio handshake

During an hours-long reboot test, the primary prints out multiple TX trigger
errors followed by a VIO handshake panic. The TX trigger error happens
because the primary ldmvsw detects that the ldc channel is down. In this
situation, the ldc operation is aborted and the tx and rx queues are then
flushed. The problem is that the rx queue may contain an LDC_EVENT_RESET
sent by the guest. It causes the primary to think that the ldc channel
is not in reset state. When the guest comes up again, the handshake is
out of sequence and thus causes a handshake panic.

The TX trigger error would not have happened if the LDC_EVENT_RESET was
received before the TX checked the ldc link state. This is the reason
why the panic happens intermittently.

This patch checks for the connection reset and changes the ldc state to
reset. The reset logic is taken from existing vnet_event_napi() ldc_ctrl:
code path.

Orabug: 23476613
Orabug: 25064864

Signed-off-by: Thomas Tai <thomas.tai@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>

sparc64: Add sensible read values for /proc/<pid>/sparc_adi

This patch makes the value read from /proc/<pid>/sparc_adi consistent
across platforms that support ADI and ones that do not. When ADI is
not available for a process, either because the process is an anonymous
process on an ADI-capable platform or because the process is running on a
non-ADI platform, a read from /proc/<pid>/sparc_adi always returns a
value of -1. This patch also updates the documentation file with
the values for the sparc_adi proc file.

Orabug: 25173120

Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Add ability to set the mcde state for a process

turn off version checking (PSTATE.mcde) to avoid tripping over ADI
versions in flux. This has been partially remedied by using non-faulting
loads.

However, there is still a need to turn off PSTATE.mcde in memory dump
functions. This is to determine if an address is readable. If the
address is unreadable, the dump shows the memory contents as "********"
instead of a 4-byte hex value.

Orabug: 25130002

Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Add proc files specific to ADI

This patch adds /proc/sys/kernel/mcd_on_by_default and
/proc/<pid>/sparc_adi files. These files allow userspace access to
change ADI parameters.

Orabug: 22713162

Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: add mcd_on_by_default

Add the global variable mcd_on_by_default and support for the kernel boot arg
"mcd_on_by_default" which causes mcd_on_by_default = 1 if the kernel is
adi_capable().

Based on the code in commit:
sparc64: Enable Application Data Integrity for m7 and newer processors
Required by commit:
sparc64: Add proc files specific to ADI

Orabug: 22713162
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>

Revert "sparc: fix intermittent LDom hang waiting for vdc_port_up"

This reverts commit 94ac2958dd26064af74f49a966e3b7e3bd4dccfe.

Orabug: 25409637

sparc64: Add support for ADI (Application Data Integrity)

ADI is a new feature supported on SPARC M7 and newer processors to allow
hardware to catch rogue accesses to memory. ADI is supported for data
fetches only and not instruction fetches. An app can enable ADI on its
data pages, set version tags on them and use versioned addresses to
access the data pages. Upper bits of the address contain the version
tag. On M7 processors, upper four bits (bits 63-60) contain the version
tag. If a rogue app attempts to access ADI enabled data pages, its
access is blocked and processor generates an exception. Please see
Documentation/sparc/adi.txt for further details.
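
The address layout described above can be sketched in a few lines of
Python. This is only an illustrative model of "version tag in bits 63-60",
not the kernel or hardware implementation; the function names and the
simple tag comparison are assumptions made for the example.

```python
# Illustrative model only: tag position taken from the text (bits 63-60 on M7).
ADI_TAG_SHIFT = 60
ADI_TAG_MASK = 0xF                      # four tag bits
ADDR_MASK = (1 << ADI_TAG_SHIFT) - 1    # remaining low 60 bits


def make_versioned_address(addr, tag):
    """Return addr with the 4-bit version tag placed in bits 63-60."""
    if not 0 <= tag <= ADI_TAG_MASK:
        raise ValueError("tag must fit in four bits")
    return (addr & ADDR_MASK) | (tag << ADI_TAG_SHIFT)


def split_versioned_address(vaddr):
    """Return (tag, untagged address) for a 64-bit versioned address."""
    return (vaddr >> ADI_TAG_SHIFT) & ADI_TAG_MASK, vaddr & ADDR_MASK


def access_allowed(vaddr, memory_tag):
    """Toy check (hypothetical): pass only if the address tag matches."""
    tag, _ = split_versioned_address(vaddr)
    return tag == memory_tag
```

In this model, an address tagged 0xA only matches memory that was also
tagged 0xA; any other tag corresponds to the blocked "rogue" access.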

This patch extends mprotect to enable ADI (TSTATE.mcde), enable/disable
MCD (Memory Corruption Detection) on selected memory ranges, enable
TTE.mcd in PTEs, return ADI parameters to userspace and save/restore ADI
version tags on page swap out/in or migration. It also adds handlers for
traps related to MCD. ADI is not enabled by default for any task. A task
must explicitly enable ADI on a memory range and set version tag for ADI
to be effective for the task.

This initial implementation supports saving and restoring one tag per
page. A page must use the same version tag across the entire page for the
tag to survive swap and migration. Swap support infrastructure in this
patch allows for this capability to be expanded to store/restore more
than one tag per page in future.

This is a backport of patch sent upstream and brings UEK code closer to
upstream patch v6.

Orabug: 22713162

Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>

sparc64: Add support for ADI register fields, ASIs and traps

SPARC M7 processor adds new control register fields, ASIs and a new
trap to support the ADI (Application Data Integrity) feature. This
patch adds definitions for these register fields, ASIs and a handler
for the new precise memory corruption detected trap.

This is a backport of patch sent upstream and brings UEK code in sync
with upstream patch v6.

Orabug: 22713162

Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>

mm: Add functions to support extra actions on swap in/out

If a processor supports special metadata for a page, for example ADI
version tags on SPARC M7, this metadata must be saved when the page is
swapped out. The same metadata must be restored when the page is swapped
back in. This patch adds two new architecture specific functions -
arch_do_swap_page() to be called when a page is swapped in,
arch_unmap_one() to be called when a page is being unmapped for swap
out.

This is a backport of patch sent upstream and brings UEK code in sync
with upstream patch v6.

Orabug: 22713162

Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>

signals, sparc: Add signal codes for ADI violations

SPARC M7 processor introduces a new feature - Application Data
Integrity (ADI). ADI allows MMU to catch rogue accesses to memory.
When a rogue access occurs, MMU blocks the access and raises an
exception. In response to the exception, kernel sends the offending
task a SIGSEGV with si_code that indicates the nature of exception.
This patch adds three new signal codes specific to ADI feature:

1. ADI is not enabled for the address and task attempted to access
   memory using ADI
2. Task attempted to access memory using wrong ADI tag and caused
   a deferred exception.
3. Task attempted to access memory using wrong ADI tag and caused
   a precise exception.

This is a backport of patch sent upstream and brings UEK code closer to
upstream patch v6.

Orabug: 22713162

Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>

sparc64: shut down to OBP correctly

Orabug: 23467092

The command "shutdown -h -H now" should shut the system down to the
OBP, however the machine was being powered off in the LDOM case.

In the LDOM case, the "reboot-command" variable must be set to
the string "noop" and then ldom_reboot() must be called.
This will make the OBP ignore the setting of "auto-boot?" after it
completes the reset. This causes the system to stop at the ok prompt.

Signed-off-by: Larry Bassel <larry.bassel@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: fix for user probes in high memory

Orabug: 25428066

When returning from the user probe code into the userspace process, PC & NPC
are truncated to 32 bits.

Because shared libraries get loaded very high in the virtual address
space of the process, placing a user probe inside a shared library makes the
kernel return into the process at the wrong address, causing it to segfault
most of the time.

This patch prevents truncating PC and NPC.
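
The effect of the truncation can be illustrated with a small Python
sketch. The probe address below is hypothetical, and the helper is only a
model of the arithmetic, not the kernel's return path.

```python
MASK32 = 0xFFFFFFFF


def resume_point(pc, truncate):
    """Return the (PC, NPC) pair used when returning to userspace."""
    npc = pc + 4  # in straight-line code NPC is the next 4-byte instruction
    if truncate:
        # Models the bug: only the low 32 bits of each value survive.
        return pc & MASK32, npc & MASK32
    return pc, npc


# Hypothetical probe address inside a shared library mapped high in memory.
probe_pc = 0xFFFF800100234568

bad_pc, bad_npc = resume_point(probe_pc, truncate=True)
good_pc, good_npc = resume_point(probe_pc, truncate=False)
```

With truncation the process would resume at 0x00234568, which is not
where the probed instruction lives; without it, PC and NPC are preserved.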

Signed-off-by: Eric Saint Etienne <eric.saint.etienne@oracle.com>
Reviewed-by: David Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Use online cpus instead of present cpus during hotplug.

As per the hotplug documentation, online cpu maps should be
updated if cpu hotplug happens via sysfs. Thus, all other
cpu maps should be updated based on the online cpus instead
of present cpus. The following example illustrates the issue
if cpu maps are updated based on present cpus.

Before the fix on a T7-2:

[root@ca-sparc64 hackbench]# cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
0-7
[root@ca-sparc64 hackbench]# echo 0 > /sys/devices/system/cpu/cpu0/online
[root@ca-sparc64 hackbench]# echo 0 > /sys/devices/system/cpu/cpu2/online
[root@ca-sparc64 hackbench]# cat /sys/devices/system/cpu/cpu1/topology/thread_siblings_list
1,3-7
[root@ca-sparc64 hackbench]# cat /sys/devices/system/cpu/cpu1/topology/core_siblings_list
1,3-255

[root@ca-sparc64 hackbench]# echo 1 > /sys/devices/system/cpu/cpu2/online
[root@ca-sparc64 hackbench]# cat /sys/devices/system/cpu/cpu1/topology/core_siblings_list
0-255
[root@ca-sparc64 hackbench]# cat /sys/devices/system/cpu/cpu1/topology/thread_siblings_list
0-7
This is wrong because cpu0 is still offline.

After the fix:
[root@ca-sparc64 hackbench]# cat /sys/devices/system/cpu/cpu1/topology/core_siblings_list
1-255
[root@ca-sparc64 hackbench]# cat /sys/devices/system/cpu/cpu1/topology/thread_siblings_list
1-7
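
The expected values can be reproduced with a small Python model that
derives sibling lists from the online cpus only. The helper names are
hypothetical and the 8-threads-per-core, 256-cpu layout is taken from the
output above; this is a sketch of the idea, not the kernel code.

```python
def format_cpu_list(cpus):
    """Format cpu ids the way the sysfs *_list files do, e.g. '1,3-7'."""
    ids = sorted(cpus)
    parts = []
    i = 0
    while i < len(ids):
        j = i
        # Extend the run while the next id is consecutive.
        while j + 1 < len(ids) and ids[j + 1] == ids[j] + 1:
            j += 1
        parts.append(str(ids[i]) if i == j else f"{ids[i]}-{ids[j]}")
        i = j + 1
    return ",".join(parts)


def siblings(members, online):
    """Siblings are the group members that are currently online."""
    return format_cpu_list(set(members) & set(online))


all_cpus = set(range(256))
core0_threads = range(8)          # threads sharing a core with cpu1

online = all_cpus - {0, 2}        # after offlining cpu0 and cpu2
step1 = siblings(core0_threads, online)

online = online | {2}             # cpu2 back online, cpu0 still offline
step2 = siblings(core0_threads, online)
```

Because the lists are intersected with the online set, cpu0 does not
reappear when cpu2 is brought back online.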

Orabug: 25472256

Signed-off-by: Atish Patra <atish.patra@oracle.com>
Reviewed-by: Chris Hyser <chris.hyser@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Update cpumaps correctly during hotplug.

Currently, numa_cpu_mask is not updated when cpus are
hotplugged, resulting in an incorrect number of cpus reported
by lscpu/numactl. Moreover, cpu_core_sib_cache_map is
also not cleared when a cpu goes offline.

Update both masks correctly whenever a cpu goes online/
offline.

Orabug: 25144324

Signed-off-by: Atish Patra <atish.patra@oracle.com>
Reviewed-by: Allen Pais <allen.pais@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc: fix intermittent LDom hang waiting for vdc_port_up

When an LDom boots, sunvdc probes the disk using the LDC channel.
If the channel was previously configured, we need to wait for
the channel state to change from UP to RESETTING so that the
seqid is properly reset in the primary. Otherwise the primary
will expect that the ldc packet contains a seqid other than 0.

Also disable the ldc hypervisor interrupt before calling vio_port_up,
because interrupts can happen once ldc_bind is called. Disabling the
interrupt ensures everything is configured before getting an interrupt
request.

Orabug: 25409637

Signed-off-by: Thomas Tai <thomas.tai@oracle.com>
Reviewed-By: Liam Merwick <Liam.Merwick@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

arch/sparc: Add a dedicated clear_page and clear_user_page for M7

Add a dedicated clear_page and clear_user_page for M7.
This avoids multiple checks that are not really required and
eliminates about 30 instructions for each call.
We have seen about a 3 to 4 percent latency reduction in some cases.

Orabug: 25456049

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: perf: Enable dynamic tracepoints when using perf probe

This commit enables the use of dynamic tracepoints (kprobes) when
using the perf probe command.

Orabug: 24925615

Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: Eric Saint Etienne <eric.saint.etienne@oracle.com>
Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

SPARC64: UEK4 LDOMS DOMAIN SERVICES UPDATE 7

This update fixes the following issues for LDom domain services on UEK4:

1. Kernel watchdog panic when unbinding guest domains. This panic was
due to the ds driver accessing a freed data structure out of ds_remove().

2. "no service registered for UNREG_REQ handle" error messages on the console
when ldmd is restarted.

Signed-off-by: Aaron Young <Aaron.Young@oracle.com>
Reviewed-By: Bijan Mottahedeh <Bijan.Mottahedeh@oracle.com>
Reviewed-By: Liam Merwick <Liam.Merwick@oracle.com>
Orabug: 25408406, 25366664
Signed-off-by: Allen Pais <allen.pais@oracle.com>

arch/sparc: Fix indexing msi_msiqid_table and msi_irq_table

Orabug: 25391918

A couple of indexing fixes:
1. Fix indexing of pbm->msi_msiqid_table. It is initialized
   based on pbm->msi_first (not pbm->msiq_first as previously done).
   Here is how it is initialized (see sparc64_setup_msi_irq):
   pbm->msi_msiqid_table[msi - pbm->msi_first] = msiqid;

2. In set_related_affinity, we don't need to subtract msi_first, as
   the loop is indexed from 0 to the size of the table.
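
The two indexing rules can be modelled in Python as follows. The class
and method names are hypothetical stand-ins for the PBM structure and the
kernel functions named above; only the index arithmetic is the point.

```python
class Pbm:
    """Toy stand-in for the PBM state that holds the MSI tables."""

    def __init__(self, msi_first, msi_num):
        self.msi_first = msi_first
        self.msi_msiqid_table = [None] * msi_num

    def setup_msi(self, msi, msiqid):
        # Same rule as the initialization quoted above: offset by msi_first.
        self.msi_msiqid_table[msi - self.msi_first] = msiqid

    def lookup_msiqid(self, msi):
        # A lookup by MSI number must use the same offset as setup.
        return self.msi_msiqid_table[msi - self.msi_first]

    def slots_for_queue(self, msiqid):
        # A loop over the table already runs from 0 to the table size,
        # so the loop index is used directly with no msi_first subtraction.
        return [i for i, q in enumerate(self.msi_msiqid_table) if q == msiqid]
```

Mixing the two conventions (or offsetting by an unrelated base) reads or
writes the wrong slot, which is the class of bug the fixes address.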

(cherry picked from uek2 commit 57d31847c9f2011314de8ea98c06616f91c5dbb8)

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Tested-by: Dmitry Klochkov <dmitry.klochkov@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

arch/sparc: Clear msi_msiqid_table during teardown

Orabug: 25391918

teardown_msi_irq needs to clear msi_msiqid_table in PBM.

(cherry picked from uek2 commit 77264d74588ae4c59682c561707471a4accfed2a)

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Tested-by: Dmitry Klochkov <dmitry.klochkov@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Skip flushing TLBs if there are no mm_users

Orabug: 25379970

Saves time when smp_flush_tlb_page/smp_flush_tlb_pending
is called during do_exit(...). Without this patch, killing
processes had a performance bottleneck in these functions
due to unnecessary xcalls made to flush TLBs.

Reviewed-by: Nitin Gupta <nitin.m.gupta@oracle.com>
Signed-off-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: Henry Willard <henry.willard@oracle.com>
Signed-off-by: Sanath Kumar <sanath.s.kumar@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: This fixes the numa_node attributes displayed in sysfs.

Orabug: 22748961

Signed-off-by: Chris Hyser <chris.hyser@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Zero pages on allocation for mondo and error queues.

Error queues use a non-zero first word to detect if the queues are full.
Using pages that have not been zeroed may result in false positive
overflow events. These queues are set up once during boot so zeroing
all mondo and error queue pages is safe.

Note that this does not always occur because the page allocation for
these queues is so early in the boot cycle that higher number CPUs get
fresh pages. It is only when traps are serviced with lower number CPUs
who were given already used pages that this issue is exposed.
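
A toy Python model of the false positive, assuming (hypothetically) a
queue made of fixed-size entries where a non-zero first word marks an
entry as occupied. The entry size and fill pattern are invented for the
example.

```python
ENTRY_WORDS = 8  # hypothetical entry size, in 64-bit words


def alloc_queue_page(words, zero, stale_fill=0xDEADBEEF):
    """Model a page allocation; a non-zeroed page keeps old contents."""
    return [0 if zero else stale_fill] * words


def entry_in_use(page, slot):
    """An entry counts as occupied when its first word is non-zero."""
    return page[slot * ENTRY_WORDS] != 0


stale = alloc_queue_page(64, zero=False)
clean = alloc_queue_page(64, zero=True)
```

On the stale page every slot already looks occupied, so the queue appears
full before any real entry has been written; the zeroed page does not
have this problem.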

Orabug: 23054018

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Don't panic on user mode non-resumable errors

Send a SIGBUS to the offending process on all userspace non-resumable
traps. This prevents userspace applications from creating a kernel
panic. The siginfo will return the code BUS_ADRERR and a valid address
if possible.

Orabug: 23054018

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: affine strand irq stacks

Like the subject says, let us NUMA affine the per strand softirq and
hardirq stacks.

This has been boot tested on T7-4 and T4-1.

Ported to UEK4

Orabug: 23050718

Signed-off-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: Chris Hyser <chris.hyser@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Handle extremely large kernel TLB range flushes more gracefully.

When the vmalloc area gets fragmented, and because the firmware
mapping area sits between where modules live and the vmalloc area, we
can sometimes receive requests for enormous kernel TLB range flushes.

When this happens the cpu just spins flushing billions of pages and
this triggers the NMI watchdog and other problems.

We took care of this on the TSB side by doing a linear scan of the
table once we pass a certain threshold.

Do something similar for the TLB flush, however we are limited by
the TLB flush facilities provided by the different chip variants.

First of all we use a (mostly arbitrary) cut-off of 256K, which is
about 32 pages. This can be tuned in the future.
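
The "about 32 pages" figure follows from the cut-off if an 8K base page
size is assumed (that page size is an assumption here, as is the exact
comparison used in the helper; neither is stated in this message).

```python
PAGE_SIZE = 8 * 1024            # assumption: 8K base pages
HUGE_RANGE_CUTOFF = 256 * 1024  # cut-off quoted in the text

CUTOFF_PAGES = HUGE_RANGE_CUTOFF // PAGE_SIZE


def use_huge_range_path(start, end):
    """Hypothetical check: take the huge-range path above the cut-off."""
    return (end - start) > HUGE_RANGE_CUTOFF
```

Under these assumptions a 64K flush stays on the per-page path, while a
flush spanning a gigabyte takes the huge-range path.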

The huge range code path for each chip works as follows:

1) On spitfire we flush all non-locked TLB entries using diagnostic
   accesses.

2) On cheetah we use the "flush all" TLB flush.

3) On sun4v/hypervisor we do a TLB context flush on context 0, which
   unlike previous chips does not remove "permanent" or locked
   entries.

We could probably do something better on spitfire, such as limiting
the flush to kernel TLB entries or even doing range comparisons.
However that probably isn't worth it since those chips are old and
the TLB only had 64 entries.

Orabug: 25499527

Reported-by: James Clarke <jrtc27@jrtc27.com>
Tested-by: James Clarke <jrtc27@jrtc27.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a74ad5e660a9ee1d071665e7e8ad822784a2dc7f)
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Fix illegal relative branches in hypervisor patched TLB cross-call code.

Just like the non-cross-call TLB flush handlers, the cross-call ones need
to avoid doing PC-relative branches outside of their code blocks.

Orabug: 25499527

Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a236441bb69723032db94128761a469030c3fe6d)
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Fix instruction count in comment for __hypervisor_flush_tlb_pending.

Noticed by James Clarke.

Orabug: 25499527

Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 830cda3f9855ff092b0e9610346d110846fc497c)
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Handle extremely large kernel TSB range flushes sanely.

If the number of pages we are flushing is more than twice the number
of entries in the TSB, just scan the TSB table for matches rather
than probing each and every page in the range.
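
A minimal Python sketch of that decision, using a direct-mapped list as a
stand-in for the TSB. The data layout, the index function and the return
strings are assumptions made for illustration, not the kernel's TSB
format.

```python
def flush_tsb_range(tsb, start, end, page_size):
    """Invalidate TSB entries for [start, end); return the path taken.

    tsb is a list whose slots hold a page-aligned virtual address or None.
    """
    nentries = len(tsb)
    npages = (end - start) // page_size

    if npages > 2 * nentries:
        # Huge range: one linear pass over the table is cheaper than
        # probing every page in the range.
        for i, va in enumerate(tsb):
            if va is not None and start <= va < end:
                tsb[i] = None
        return "scan"

    # Small range: probe the slot each page would map to.
    for va in range(start, end, page_size):
        idx = (va // page_size) % nentries
        if tsb[idx] == va:
            tsb[idx] = None
    return "probe"
```

The scan path does work proportional to the table size rather than to the
size of the range, which is what keeps enormous flush requests bounded.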

Based upon a patch and report by James Clarke.

Orabug: 25499527

Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 849c498766060a16aad5b0e0d03206726e7d2fa4)
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Fix illegal relative branches in hypervisor patched TLB code.

When we copy code over to patch another piece of code, we can only use
PC-relative branches that target code within that piece of code.

Such PC-relative branches cannot be made to external symbols because
the patch moves the location of the code and thus modifies the
relative address of external symbols.

Use an absolute jmpl to fix this problem.
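
The problem can be shown with simple address arithmetic. The addresses
below are hypothetical; the sketch only models how a fixed displacement
behaves when a block of code is copied to a new location.

```python
def displacement(branch_addr, target_addr):
    """Displacement encoded in a PC-relative branch."""
    return target_addr - branch_addr


def resolved_target(branch_addr, disp):
    """Where a PC-relative branch lands when executed at branch_addr."""
    return branch_addr + disp


# Hypothetical layout: a code block assembled at OLD_BASE, copied to NEW_BASE.
OLD_BASE, NEW_BASE = 0x1000, 0x3000
BRANCH_OFF = 0x08                 # offset of the branch inside the block
EXTERNAL_SYM = 0x5000             # symbol outside the block
INTERNAL_OFF = 0x20               # label inside the block

ext_disp = displacement(OLD_BASE + BRANCH_OFF, EXTERNAL_SYM)
int_disp = displacement(OLD_BASE + BRANCH_OFF, OLD_BASE + INTERNAL_OFF)

# After the copy the displacements are unchanged but the branch has moved.
ext_after = resolved_target(NEW_BASE + BRANCH_OFF, ext_disp)
int_after = resolved_target(NEW_BASE + BRANCH_OFF, int_disp)
```

The internal branch still lands on its label inside the copied block,
while the external one misses its symbol by exactly the distance the
block was moved. An absolute jump does not depend on where the code runs.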

Orabug: 25499527

Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b429ae4d5b565a71dfffd759dfcd4f6c093ced94)
Signed-off-by: Allen Pais <allen.pais@oracle.com>

SPARC64: UEK4 LDOMS DOMAIN SERVICES UPDATE 6

This update fixes the following issues for LDom domain services on UEK4:

1. Error messages displayed on the console when guest domains are stopped
   such as:

ldc_print: id=0x11 flags=0x7 state=CONNECTED cstate=0x0 hsstate=0x10
        rx_h=0x2b40 rx_t=0x2b40 rx_n=512
        tx_h=0x4440 tx_t=0x4440 tx_n=512
        rcv_nxt=635 snd_nxt=723
ds-3: ds_disconnect_service_client: failed to send UNREG_REQ for handle
700000001 (1)

2. CPU DR related problems including 'length too big' errors and hangs. With
   these new fixes, >256 vcpus can be successfully added/removed from a guest
   domain. As part of this fix, a new scheme for reusing event data memory
   buffers was implemented.

Signed-off-by: Aaron Young <Aaron.Young@oracle.com>
Reviewed-By: Liam Merwick <Liam.Merwick@oracle.com>
Orabug: 23171935, 24848179
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc: Optimized memset, memcpy, copy_to_user, copy_from_user for M7

New algorithm that takes advantage of the M7 block init store
ASI, ie, overlapping pipelines and miss buffer filling.
Full details in code comments.

Ported from following UEK2 commits.
http://ca-git.us.oracle.com/?p=linux-uek-2.6.39-sparc.git;a=commit;h=c58ef937e442830c362d1ab20a35a1c61b409827
http://ca-git.us.oracle.com/?p=linux-uek-2.6.39-sparc.git;a=commit;h=322d6f95ade517f4e180545f23fa731b2d748b33
http://ca-git.us.oracle.com/?p=linux-uek-2.6.39-sparc.git;a=commit;h=bc0b4ae6b87fbb28bd816320d22ae6c6a2393865

Orabug: 25120741

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

uek-rpm: sync up spec with linux-firmware version

Orabug: 25685665

Linux-firmware was updated to version 20170224-51.git432444c5.0.1.
Sync up ol6/ol7 spec.
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: qla2xxx: Fix apparent cut-n-paste error.

Orabug: 25477809

Commit 093df73771ba ("scsi: qla2xxx: Fix Target mode handling with
Multiqueue changes.") introduces two bodies of code that look similar
but with s/req/rsp/ in the second instance. But in one case, it looks
like this conversion was missed.

Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Acked-by: Quinn Tran <Quinn.Tran@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: qla2xxx: Fix Target mode handling with Multiqueue changes.

Orabug: 25477809

- Fix race condition between dpc_thread accessing Multiqueue resources
and qla2x00_remove_one thread trying to free resource.
- Fix out of order free for Multiqueue resources. Also, Multiqueue
interrupts need a workqueue. Interrupts need to be stopped before
the wq can be destroyed.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: qla2xxx: Add Block Multi Queue functionality.

Orabug: 25477809

Tell the SCSI layer how many hardware queues we have based on the number
of max queue pairs created. The number of max queue pairs created will
depend on the MSI-X vector count.

This feature can be turned on via CONFIG_SCSI_MQ_DEFAULT or passing
scsi_mod.use_blk_mq=Y as a parameter to the kernel.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Sawan Chandak <sawan.chandak@cavium.com>
Signed-off-by: Michael Hernandez <michael.hernandez@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: qla2xxx: Add multiple queue pair functionality.

Orabug: 25477809

Replaced existing multiple queue functionality with framework
that allows for the creation of pairs of request and response queues,
either at start of day or dynamically.

Queue pair creation depends on the module parameter "ql2xmqsupport",
which needs to be enabled to create queue pairs.

Signed-off-by: Sawan Chandak <sawan.chandak@cavium.com>
Signed-off-by: Michael Hernandez <michael.hernandez@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

qla2xxx: Add irq affinity notification

Orabug: 25477809

Register to receive notification when an irq setting change
occurs.

Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: qla2xxx: Fix scsi scan hang triggered if adapter fails during init

Orabug: 25477809

A system can get hung task timeouts if a qlogic board fails during
initialization (if the board breaks again or fails the init). The hang
involves the scsi scan.

In a nutshell, since commit beb9e315e6e0 ("qla2xxx: Prevent removal and
board_disable race"):

...it is possible to have freed ha (base_vha->hw) early by a call to
qla2x00_remove_one when pdev->enable_cnt equals zero:

       if (!atomic_read(&pdev->enable_cnt)) {
               scsi_host_put(base_vha->host);
               kfree(ha);
               pci_set_drvdata(pdev, NULL);
               return;
       }

Almost always, the scsi_host_put above frees the vha structure
(attached to the end of the Scsi_Host we're putting) since it's the last
put, and life is good.  However, if we are entering this routine because
the adapter has broken sometime during initialization AND a scsi scan is
already in progress (and has done its own scsi_host_get), vha will not
be freed. What's worse, the scsi scan will access the freed ha structure
through qla2xxx_scan_finished:

        if (time > vha->hw->loop_reset_delay * HZ)
                return 1;

The scsi scan keeps checking to see if a scan is complete by calling
qla2xxx_scan_finished. There is a timeout value that limits the length
of time a scan can take (hw->loop_reset_delay, usually set to 5
seconds), but this definition is in the data structure (hw) that can get
freed early.

This can yield unpredictable results, the worst of which is that the
scsi scan can hang indefinitely. This happens when the freed structure
gets reused and loop_reset_delay gets overwritten with garbage, which
the scan obliviously uses as its timeout value.

The fix for this is simple: at the top of qla2xxx_scan_finished, check
for the UNLOADING bit in the vha structure (_vha is not freed at this
point).  If UNLOADING is set, we exit the scan for this adapter
immediately. After this last reference to the ha structure, we'll exit
the scan for this adapter, and continue on.
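
The ordering of the two checks can be illustrated with a small user-space
model. This is only a sketch under stated assumptions: struct vha_model,
struct hw_model, HZ_MODEL and scan_finished_model() are hypothetical
stand-ins and not the driver's real definitions. The point it shows is that
the unloading flag is tested before hw is ever dereferenced.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct hw_model {
	unsigned long loop_reset_delay;	/* timeout in seconds */
};

struct vha_model {
	bool unloading;			/* models the UNLOADING bit */
	struct hw_model *hw;		/* may already be freed when unloading */
};

#define HZ_MODEL 1000UL			/* assumed ticks per second */

/* Returns 1 when the scan should stop, 0 when it should keep polling. */
static int scan_finished_model(const struct vha_model *vha, unsigned long time)
{
	/* Test the flag first so hw is never touched during unload. */
	if (vha->unloading)
		return 1;

	if (time > vha->hw->loop_reset_delay * HZ_MODEL)
		return 1;

	return 0;
}
```

In this model an adapter that is being unloaded ends its scan on the next
poll, whatever the (possibly stale) timeout field contains.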

This problem is hard to hit, but I have run into it doing negative
testing many times now (with a test specifically designed to bring it
out), so I can verify that this fix works. My testing has been against a
RHEL7 driver variant, but the bug and patch are equally relevant to
the upstream driver.

Fixes: beb9e315e6e0 ("qla2xxx: Prevent removal and board_disable race")
Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

dtrace: get rid of dtrace_gethrtime

Remove the need for dtrace_gethrtime() and dtrace_getwalltime() because
the current implementations are not deadlock safe.

Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
Acked-by: Nick Alcock <nick.alcock@oracle.com>

dtrace: drop spurious debugging left in by accident

We should not be emitting a KERN_INFO log message whenever an is-enabled
probe is discovered.

Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Acked-by: Kris Van Hees <kris.van.hees@oracle.com>
Orabug: 25143173

dtrace: comtinuing the FBT implementation and fixes

This commit continues the implementation of Function Boundary Tracing
(FBT) and fixes various problems with the original implementation and
other things in DTrace that it caused to break.  It is done as a single
commit due to the intertwined nature of the code it touches.

1. We were only handling unaligned memory access traps as part of the
   NOFAULT access protection.  This commit adds handling of data and
   instruction access traps.

2. When an OOPS takes place, we now add output about whether we are
   in DTrace probe context and what the last probe was that was being
   processed (if any).  That last data item isn't guaranteed to always
   have a valid value.  But it is helpful.

3. New ustack stack walker implementation (moved from module to kernel
   for consistency and because we need access to low level structures
   like the page tables) for both x86 and sparc.  The new code avoids
   any locking or sleeping.  The new user stack walker is accessed as
   a sub-function of dtrace_stacktrace(), selected using the flags
   field of stacktrace_state_t.

4. We added a new field to the dtrace_psinfo_t structure (ustack) to
   hold the bottom address of the stack.  This is needed in the stack
   walker (specifically for x86) to know when we have reached the end
   of the stack.  It is initialized from copy_process (in DTrace
   specific code) when stack_start is passed as parameter to clone.
   It is also set from dtrace_psinfo_alloc() (which is generally called
   from performing an exec), and there it gets its value from the
   mm->start_stack value.

5. The FBT black lists have been updated with functions that may be
   invoked during probe processing.  In addition, for x86_64 we added
   explicit filter out of functions that start with insn_* or inat_*
   because they are used for instruction analysis during probe
   processing.

6. On sparc64, per-cpu data gets accessed by means of a global register
   that holds the base address for this memory area.  Some assembler
   code clobbers that register in some cases, so it is not safe to
   depend on this in probe context.  Instead, we explicitly access
   the data based on the smp_processor_id().

7. We added a new CPU DTrace flag (CPU_DTRACE_PROBE_CTX) to flag that
   we are processing in DTrace probe context.  It is primarily used
   to detect attempts of re-entry into dtrace_probe().
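
The re-entry detection described in item 7 can be sketched as a plain flag
test. This is an illustrative user-space model only: MODEL_PROBE_CTX,
cpu_flags_model and probe_model() are hypothetical names, and the real code
keeps the flag per CPU rather than in a single global.

```c
#define MODEL_PROBE_CTX 0x1u	/* hypothetical stand-in for CPU_DTRACE_PROBE_CTX */

static unsigned int cpu_flags_model;	/* one CPU's flag word in this sketch */
static int reentry_count;		/* how many re-entry attempts were refused */

static void probe_model(int depth);

/* Probe processing that may, by accident, hit another probe point. */
static void probe_body(int depth)
{
	if (depth > 0)
		probe_model(depth - 1);
}

static void probe_model(int depth)
{
	/* Already in probe context: refuse to re-enter. */
	if (cpu_flags_model & MODEL_PROBE_CTX) {
		reentry_count++;
		return;
	}

	cpu_flags_model |= MODEL_PROBE_CTX;
	probe_body(depth);
	cpu_flags_model &= ~MODEL_PROBE_CTX;
}
```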

Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
Acked-by: Nick Alcock <nick.alcock@oracle.com>
Orabug: 21220305
Orabug: 24829326

dtrace: ensure DTrace can use get_user_pages safely

The processing of the DTrace-specific FOLL_IMMED flag was not robust
enough. We could still get into a situation where cond_resched() was
called (which is bad) or where the VMA area would get extended (which
is also bad). The only code that passes this flag is DTrace support
code, and when the flag is not passed, the execution flow is not at all
affected.

Orabug: 25640153
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com>
Reviewed-by: Nick Alcock <nick.alcock@oracle.com>
Reviewed-by: Tomas Jedlicka <tomas.jedlicka@oracle.com>

dtrace: enable paranoid mode and IST shift for xen_int3

The Xen PVM path into an INT3 trap was not using paranoid=1 mode nor was
it using an IST shift as is done for HW INT3 traps. This interferes with
the instruction emulation code check based on the handler return value.

Orabug: 25580519
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

perf/core: Fix concurrent sys_perf_event_open() vs. 'move_group' race

Di Shen reported a race between two concurrent sys_perf_event_open()
calls where both try and move the same pre-existing software group
into a hardware context.

The problem is exactly that described in commit:

f63a8daa5812 ("perf: Fix event->ctx locking")

... where, while we wait for a ctx->mutex acquisition, the event->ctx
relation can have changed under us.

That very same commit failed to recognise sys_perf_event_context() as an
external access vector to the events and thereby didn't apply the
established locking rules correctly.

So while one sys_perf_event_open() call is stuck waiting on
mutex_lock_double(), the other (which owns said locks) moves the group
about. So by the time the former sys_perf_event_open() acquires the
locks, the context we've acquired is stale (and possibly dead).

Apply the established locking rules as per perf_event_ctx_lock_nested()
to the mutex_lock_double() for the 'move_group' case. This obviously means
we need to validate state after we acquire the locks.
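
The "lock, then validate" rule can be shown with a single-threaded model.
It is a sketch, not the kernel code: ctx_model, event_model and the
while_waiting hook are hypothetical, and the hook merely simulates another
thread moving the group while the caller waits for the lock.

```c
#include <stddef.h>

struct ctx_model { int locked; };
struct event_model { struct ctx_model *ctx; };

static struct ctx_model sw_ctx, hw_ctx;

/* Hypothetical hook: runs once while we "wait" for a lock. */
static void (*while_waiting)(struct event_model *ev);

/* Simulated concurrent move of the group into the hardware context. */
static void move_group_model(struct event_model *ev)
{
	ev->ctx = &hw_ctx;
}

static void ctx_lock(struct ctx_model *ctx, struct event_model *ev)
{
	if (while_waiting) {
		void (*hook)(struct event_model *) = while_waiting;

		while_waiting = NULL;	/* fire only once */
		hook(ev);
	}
	ctx->locked = 1;
}

static void ctx_unlock(struct ctx_model *ctx)
{
	ctx->locked = 0;
}

/* Lock the event's context, retrying if it changed while we waited. */
static struct ctx_model *lock_event_ctx(struct event_model *ev)
{
	for (;;) {
		struct ctx_model *ctx = ev->ctx;	/* snapshot before waiting */

		ctx_lock(ctx, ev);
		if (ev->ctx == ctx)
			return ctx;	/* still current: locked and valid */
		ctx_unlock(ctx);	/* stale: drop it and try again */
	}
}
```

Without the comparison after ctx_lock(), the caller in this model would
return holding the lock of a context the event no longer belongs to.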

Reported-by: Di Shen (Keen Lab)
Tested-by: John Dias <joaodias@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Min Chong <mchong@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: f63a8daa5812 ("perf: Fix event->ctx locking")
Link: http://lkml.kernel.org/r/20170106131444.GZ3174@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 321027c1fe77f892f4ea07846aeae08cefbbb290)

Duplicate perf events are handled by setting appropriate return value and redirecting
the flow to 'err_locked' goto label followed by 'err_context' label. In UEK4, 'err_locked'
goto label is not available. Hence, the operations under this label are performed before
redirecting the flow to 'err_context' label.

Orabug: 25564210
CVE-2017-6001

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

ext2: convert to mbcache2

Orabug: 24521483
CVE: CVE-2015-8952

The conversion is generally straightforward. We convert filesystem from
a global cache to per-fs one. Similarly to ext4 the tricky part is that
xattr block corresponding to found mbcache entry can get freed before we
get buffer lock for that block. So we have to check whether the entry is
still valid after getting the buffer lock.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit be0726d33cb8f411945884664924bed3cb8c70ee)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

ext4: convert to mbcache2

Orabug: 24521483
CVE: CVE-2015-8952

The conversion is generally straightforward. The only tricky part is
that xattr block corresponding to found mbcache entry can get freed
before we get buffer lock for that block. So we have to check whether
the entry is still valid after getting buffer lock.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit 82939d7999dfc1f1998c4b1c12e2f19edbdff272)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

mbcache2: reimplement mbcache

Orabug: 24521483
CVE: CVE-2015-8952

Original mbcache was designed to have more features than what ext?
filesystems ended up using. It supported entry being in more hashes, it
had a home-grown rwlocking of each entry, and one cache could cache
entries from multiple filesystems. This genericity also resulted in more
complex locking, larger cache entries, and generally more code
complexity.

This is a reimplementation of the mbcache functionality to exactly fit the
purpose ext? filesystems use it for. Cache entries are now considerably
smaller (7 instead of 13 longs), the code is considerably smaller as
well (414 vs 913 lines of code), and IMO also simpler. The new code is
also much more lightweight.

I have measured the speed using an artificial xattr-bench benchmark, which
spawns P processes, each process sets xattr for F different files, and
the value of xattr is randomly chosen from a pool of V values. Averages
of runtimes for 5 runs for various combinations of parameters are below.
The first value in each cell is old mbcache, the second value is the new
mbcache.

V=10
F\P   1              2              4              8              16             32             64
10    0.158,0.157    0.208,0.196    0.500,0.277    0.798,0.400    3.258,0.584    13.807,1.047   61.339,2.803
100   0.172,0.167    0.279,0.222    0.520,0.275    0.825,0.341    2.981,0.505    12.022,1.202   44.641,2.943
1000  0.185,0.174    0.297,0.239    0.445,0.283    0.767,0.340    2.329,0.480    6.342,1.198    16.440,3.888

V=100
F\P   1              2              4              8              16             32             64
10    0.162,0.153    0.200,0.186    0.362,0.257    0.671,0.496    1.433,0.943    3.801,1.345    7.938,2.501
100   0.153,0.160    0.221,0.199    0.404,0.264    0.945,0.379    1.556,0.485    3.761,1.156    7.901,2.484
1000  0.215,0.191    0.303,0.246    0.471,0.288    0.960,0.347    1.647,0.479    3.916,1.176    8.058,3.160

V=1000
F\P   1              2              4              8              16             32             64
10    0.151,0.129    0.210,0.163    0.326,0.245    0.685,0.521    1.284,0.859    3.087,2.251    6.451,4.801
100   0.154,0.153    0.211,0.191    0.276,0.282    0.687,0.506    1.202,0.877    3.259,1.954    8.738,2.887
1000  0.145,0.179    0.202,0.222    0.449,0.319    0.899,0.333    1.577,0.524    4.221,1.240    9.782,3.579

V=10000
F\P   1              2              4              8              16             32             64
10    0.161,0.154    0.198,0.190    0.296,0.256    0.662,0.480    1.192,0.818    2.989,2.200    6.362,4.746
100   0.176,0.174    0.236,0.203    0.326,0.255    0.696,0.511    1.183,0.855    4.205,3.444    19.510,17.760
1000  0.199,0.183    0.240,0.227    1.159,1.014    2.286,2.154    6.023,6.039    ---,10.933     ---,36.620

V=100000
F\P   1              2              4              8              16             32             64
10    0.171,0.162    0.204,0.198    0.285,0.230    0.692,0.500    1.225,0.881    2.990,2.243    6.379,4.771
100   0.151,0.171    0.220,0.210    0.295,0.255    0.720,0.518    1.226,0.844    3.423,2.831    19.234,17.544
1000  0.192,0.189    0.249,0.225    1.162,1.043    2.257,2.093    5.853,4.997    ---,10.399     ---,32.198

We see that the new code is faster in pretty much all the cases and
starting from 4 processes there are significant gains with the new code
resulting in up to 20-times shorter runtimes. Also for large numbers of
cached entries all values for the old code could not be measured as the
kernel started hitting softlockups and died before the test completed.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit f9a61eb4e2471c56a63cd804c7474128138c38ac)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: get rid of custom busy poll code

Orabug: 25570957

Compared to custom busy_poll, the generic NAPI one is better, since
it allows the use of GRO, and it removes a lot of code and extra locked
operations in the fast path.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit fb6113e688e0bf5bc3081d6ff02e8ad77fed3c7a)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/emulex/benet/be_main.c

Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: fix initial MAC setting

Orabug: 25570957

Recent commit 34393529163a ("be2net: fix MAC addr setting on privileged
BE3 VFs") allows privileged BE3 VFs to set their MAC address during
initialization. Although the initial MAC for such VFs is already
programmed by the parent PF, the subsequent setting performed by the VF
is OK, but in certain cases (after fresh boot) this command in the VF
can fail.

The MAC should be initialized only when:
1) no MAC is programmed (always except BE3 VFs during first init)
2) programmed MAC is different from requested (e.g. MAC is set when
   interface is down). In this case the initial MAC programmed by PF
   needs to be deleted.

The adapter->dev_mac contains MAC address currently programmed in HW so
it should be zeroed when the MAC is deleted from HW and should not be
filled when MAC is set when interface is down in be_mac_addr_set() as
no programming is performed in this case.

Example of failure without the fix (immediately after fresh boot):

be2net 0000:01:00.0 eth0: Link is Up

...
be2net 0000:01:04.0: Emulex OneConnect(be3): VF  port 0

be2net 0000:01:04.0: opcode 59-1 failed:status 1-76
RTNETLINK answers: Input/output error

iommu: Removing device 0000:01:04.0 from group 33
...

iommu: Removing device 0000:01:04.0 from group 33
...

be2net 0000:01:04.0 eth8: Link is Up

Initialization is now OK.

v2 - Corrected the comment and condition check suggested by Suresh & Harsha

Fixes: 34393529163a ("be2net: fix MAC addr setting on privileged BE3 VFs")
Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ivan Vecera <cera@cera.cz>
Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 4993b39ab04b083ff6ee1147e7e7f120feb6bf7f)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/emulex/benet/be_main.c

Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: fix MAC addr setting on privileged BE3 VFs

Orabug: 25570957

During interface opening MAC address stored in netdev->dev_addr is
programmed in the HW with exception of BE3 VFs where the initial
MAC is programmed by parent PF. This is OK when MAC address is not
changed while an interface is down. In this case the requested MAC is
stored to netdev->dev_addr and later is stored into HW during opening.
But this is not done for all BE3 VFs so the NIC HW does not know
anything about this change and all traffic is filtered.

This is the case of bonding if fail_over_mac == 0 where the MACs of
the slaves are changed while they are down.

The be2net behavior is too restrictive because if a BE3 VF has
the FILTMGMT privilege then it is able to modify its MAC without
any restriction.

To solve the described problem the driver should take care of these
privileged BE3 VFs so the MAC is programmed during opening. And by
contrast unprivileged BE3 VFs should not be allowed to change their MAC
in any case.

Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 34393529163af7163ef8459808e3cf2af7db7f16)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/emulex/benet/be_main.c

Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: don't delete MAC on close on unprivileged BE3 VFs

Orabug: 25570957

BE3 VFs without FILTMGMT privilege are not allowed to modify their MAC,
VLAN table and UC/MC lists. So don't try to delete MAC on such VFs.

Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6d928ae590c8d58cfd5cca997d54394de139cbb7)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/emulex/benet/be_main.c

Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: fix status check in be_cmd_pmac_add()

Orabug: 25570957

Return value from be_mcc_notify_wait() contains a base completion status
together with an additional status. The base_status() macro needs to be
used to access the base status.
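
Why comparing the raw return value goes wrong can be seen in a small model.
The bit layout here is an assumption made for illustration (base status in
the low byte, additional status in the next byte), and base_status_model()
and addl_status_model() are hypothetical helpers, not the driver's macros.

```c
/*
 * Assumed layout for this sketch only:
 *   bits 0-7  = base completion status
 *   bits 8-15 = additional status
 */
static int base_status_model(int status)
{
	return status > 0 ? (status & 0xFF) : 0;
}

static int addl_status_model(int status)
{
	return status > 0 ? ((status >> 8) & 0xFF) : 0;
}
```

With base status 1 and additional status 76 packed together, the combined
value is not equal to 1, so a direct comparison against a base status code
misses the match while base_status_model() recovers it.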

Fixes: e3a7ae2 be2net: Changing MAC Address of a VF was broken
Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit fe68d8bfe59c561664aa87d827aa4b320eb08895)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: Increase skb headroom size to 256 bytes

Orabug: 25570957

The driver currently allocates 128 bytes of skb headroom.
This was found to be insufficient with some configurations
like Geneve tunnels, which resulted in skb head reallocations.

Increase the headroom to 256 bytes to fix this.

Signed-off-by: Kalesh A P <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 76b15923b777aa2660029629179550124c1fc40e)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: Add DEVSEC privilege to SET_HSW_CONFIG command.

Orabug: 25570957

OPCODE_COMMON_GET_FN_PRIVILEGES is returning only DEVSEC
privilege (Unrestricted Administrative Privilege) for Lancer NIC functions.
So, driver is failing SET_HSW_CONFIG command, as DEVSEC privilege was not
set in the privilege bitmap. This patch fixes the problem by setting DEVSEC
privilege in SET_HSW_CONFIG’s privilege bitmap.

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit d14584d91976c42c7178164665c4959495740939)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: do not call napi_hash_del()

Orabug: 25570957

Calling napi_hash_del() before netif_napi_del() is dangerous
if a synchronize_rcu() is not enforced before NAPI struct freeing.

Let's leave this detail to core networking stack and feel
more comfortable.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit ea339343d64a14594d882ccb52e8619d42defe5e)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: Enable VF link state setting for BE3

Orabug: 25570957

The VF link state setting feature now works on BE3 chips too from
FW ver 11.1.192.0 onwards.

Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit dc6e8511ff7141141578bac559565c55a1e14ad8)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: Fix TX stats for TSO packets

Orabug: 25570957

TX stats update does not take into account headers which get duplicated
when the TSO packet is split into segments by HW. Fix this for both
tunneled (vxlan) and non-tunneled TSO packets.
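
The accounting can be sketched as follows. This is an illustration under
stated assumptions, not the driver's code: tso_wire_bytes() is a
hypothetical helper, and hdr_len stands for whatever headers the HW
replicates per segment (for a tunneled packet that would presumably cover
the outer and inner headers).

```c
/*
 * Bytes put on the wire for one TSO packet that the HW splits into
 * `segs` segments. The packet length already counts the headers once;
 * every further segment carries another copy.
 */
static unsigned long tso_wire_bytes(unsigned long skb_len,
				    unsigned int hdr_len,
				    unsigned int segs)
{
	unsigned int dup = segs ? segs - 1 : 0;	/* extra header copies */

	return skb_len + (unsigned long)dup * hdr_len;
}
```

For example, a 2974-byte packet with 54 bytes of headers split into two
segments accounts for 2974 + 54 = 3028 bytes.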

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f3d6ad84807254954fc69bdebb6123e5a2883baf)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: Update Copyright string in be_hw.h

Orabug: 25570957

This patch updates the year and company name in the copyright string
in be_hw.h.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 77b696cba961bb6e88aeba36253849443f9a4186)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: NCSI FW section should be properly updated with ethtool for BE3

Orabug: 25570957

The driver has a check to ensure that NCSI FW section is updated only
if the current FW version in the card supports it. This FW version check
is done using memcmp() which obviously fails in some cases. Fix this by
breaking up the version string into integer version components and
comparing them.
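
A byte-wise comparison orders "9.0.0.0" after "11.1.192.0" because '9'
sorts after '1'. A minimal sketch of a component-wise comparison is shown
below; fw_ver_cmp() and parse_ver() are hypothetical helpers, the version
strings are examples only, and a four-component dotted format is assumed.

```c
#include <stdio.h>

/* Parse "a.b.c.d" into four integers; missing fields are left as 0. */
static void parse_ver(const char *s, unsigned int v[4])
{
	v[0] = v[1] = v[2] = v[3] = 0;
	if (sscanf(s, "%u.%u.%u.%u", &v[0], &v[1], &v[2], &v[3]) < 1)
		v[0] = 0;	/* nothing parsed: treat as 0.0.0.0 */
}

/* Returns <0, 0 or >0 when version a is older than, equal to or newer than b. */
static int fw_ver_cmp(const char *a, const char *b)
{
	unsigned int va[4], vb[4];
	int i;

	parse_ver(a, va);
	parse_ver(b, vb);

	for (i = 0; i < 4; i++) {
		if (va[i] != vb[i])
			return va[i] < vb[i] ? -1 : 1;
	}
	return 0;
}
```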

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f5ef017e1195d0a8c69a82bf95fea9c776b93ff0)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: Provide an alternate way to read pf_num for BEx chips

Orabug: 25570957

The driver gets the pf_num for Skyhawk and Lancer using
GET_FUNC_CONFIG FW command. But since that command is not
supported in BEx, we need to get it from some other command.
Otherwise TPE recovery would fail since all NIC PFs would
end up with a func num of 0. There's a pci function number
field in the response of GET_CNTL_ATTRIBUTES command that
can be read to get the same info for BEx adapters.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6ee080bb09889dc0195a9c659288d17999237fb6)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: mark symbols static where possible

Orabug: 25570957

We get 4 warnings when building kernel with W=1:
drivers/net/ethernet/emulex/benet/be_main.c:4368:6: warning: no previous prototype for 'be_calculate_pf_pool_rss_tables' [-Wmissing-prototypes]
drivers/net/ethernet/emulex/benet/be_cmds.c:4385:5: warning: no previous prototype for 'be_get_nic_pf_num_list' [-Wmissing-prototypes]
drivers/net/ethernet/emulex/benet/be_cmds.c:4537:6: warning: no previous prototype for 'be_reset_nic_desc' [-Wmissing-prototypes]
drivers/net/ethernet/emulex/benet/be_cmds.c:4910:5: warning: no previous prototype for '__be_cmd_set_logical_link_config' [-Wmissing-prototypes]

In fact, these functions are only used in the file in which they are
declared and don't need a declaration, but can be made static.
So this patch marks these functions with 'static'.

Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit d766e7e6b68d681d46d74e228ad0ba133e730e36)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: Update the driver version to 11.1.0.0

Orabug: 25570957

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 368f2f137f5401f37d2acb42c4ca4e5867570495)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: Fix mac address collision in some configurations

Orabug: 25570957

If the device mac address is updated using ndo_set_mac_address(),
while the same mac address is already programmed, the driver does not
detect this condition if its netdev->dev_addr has been changed. The
driver tries to add the same mac address resulting in mac address
collision error. This has been observed in bonding mode-5 configuration.

To fix this, store the mac address configured in HW in the adapter
structure. Use this to compare against the new address being updated
to avoid collision.

Signed-off-by: Suresh Reddy <Suresh.Reddy@broadcom.com>
Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c27ebf58517536c0006813007680b24db17def47)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/emulex/benet/be.h
drivers/net/ethernet/emulex/benet/be_main.c

Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: Add privilege level check for OPCODE_COMMON_GET_EXT_FAT_CAPABILITIES SLI cmd.

Orabug: 25570957

Driver issues OPCODE_COMMON_GET_EXT_FAT_CAPABILITIES cmd during init which
when issued by VFs results in the logging of a cmd failure message since
they don't have the required privilege for this cmd. Fix by checking
privilege before issuing the cmd.

Also fixed typo in CAPABILITIES.

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 62259ac4b36e348077635e673f253cc139dd6032)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: Issue COMMON_RESET_FUNCTION cmd during driver unload

Orabug: 25570957

As per SLI guideline, drivers need to issue COMMON_RESET_FUNCTION SLI
cmd during driver unload to clean up any non-persistent state
information.
Issue this cmd only if VFs are not assigned to VMs as it is possible
for PF driver to unload while its VF remains functional and assigned
to a VM.

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f72099e057c0b3ea3cfd16301cff9202c4db8ef4)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: Avoid unnecessary firmware updates of multicast list

Orabug: 25570957

Each time the ndo_set_rx_mode() routine is called, the driver programs the
multicast list in the adapter without checking if there are any changes to
the list. This leads to a flood of RX_FILTER cmds when a number of vlan
interfaces are configured over the device, as ndo_set_rx_mode() gets
called for each vlan interface. To avoid this, we now use __dev_mc_sync()
and __dev_uc_sync() API, but only to detect if there is a change in the
mc/uc lists. Now that we use this API, the code has to be redesigned to
issue these API calls for each invocation of the be_set_rx_mode() call.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 92fbb1df83ec17f62a46b23507ebb3f06ca10cd3)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: do not remove vids from driver table if be_vid_config() fails.

Orabug: 25570957

The driver currently removes a new vid from the adapter->vids[] array if
be_vid_config() returns an error, which occurs when there is an error in
HW/FW. This is wrong. After the HW/FW error is recovered from, we need the
complete vids[] array to re-program the vlan list.
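
A minimal userspace sketch of the behaviour described above, not the driver
code itself. The struct, its fields and the fake_* helpers are hypothetical
stand-ins; only the idea (keep the vid in the table when programming fails,
so the full list can be replayed after recovery) comes from this commit text.

```c
#include <stdbool.h>

#define MAX_VID 4096

/* Hypothetical model of the adapter state relevant to VLAN filtering. */
struct fake_vlan_adapter {
	bool vids[MAX_VID];	/* driver's table of requested VLAN ids */
	bool hw_ok;		/* false while HW/FW is in an error state */
	unsigned int programmed;	/* VLANs accepted by the last successful config */
};

/* Stand-in for be_vid_config(): programs the complete list or fails. */
static int fake_vid_config(struct fake_vlan_adapter *a)
{
	unsigned int i, n = 0;

	if (!a->hw_ok)
		return -1;
	for (i = 0; i < MAX_VID; i++)
		if (a->vids[i])
			n++;
	a->programmed = n;
	return 0;
}

/* Adds a vid; on failure the vid deliberately stays in the table. */
static int fake_add_vid(struct fake_vlan_adapter *a, unsigned int vid)
{
	if (vid >= MAX_VID)
		return -1;
	a->vids[vid] = true;
	return fake_vid_config(a);
}
```

With this shape, a later call to fake_vid_config() after recovery still sees
every vid that was requested while the hardware was in error.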

Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 0aff1fbfe72e47412e3213648e972c339af30e4e)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: clear vlan-promisc setting before programming the vlan list

Orabug: 25570957

The Lancer FW has a bug due to which, in some cases, the vlan-promisc setting
is cleared even though the vlan-list programming (via the VLAN_CONFIG cmd) did
not succeed. The driver has no way of knowing if the vlan-promisc
mode was cleared or not when this cmd fails. To work around this issue,
this patch first explicitly clears the vlan-promisc mode via RX_FILTER
cmd and then tries to program the vlan list.

Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 841f60fcc4e7986a4ef3f83a289ab47076872e42)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: perform temperature query in adapter regardless of its interface state

Orabug: 25570957

The be2net driver performs fw temperature queries in the be_worker() routine,
which is executed each second for each be_adapter. There is a frequency
threshold to avoid a fw query happening on every call to be_worker();
instead, currently a fw query occurs once in 64 runs of the procedure.

Nevertheless, this fw temperature query is invoked only for adapters whose
interface is up, so we can see I/O errors on reads of hwmon counters from
userspace (from tools like lm-sensors) when an adapter function's
interface is down.

This patch moves the fw query code to be invoked even if interface is down.
No functional changes were introduced.
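
A rough userspace sketch of the reordering described above. The struct, field
names and fake_worker() are hypothetical; only the "once in 64 runs" interval
and the idea of querying before the interface-state check come from this
commit text.

```c
#include <stdbool.h>

#define TEMP_QUERY_INTERVAL 64	/* from the commit text: once in 64 runs */

/* Hypothetical model of the per-adapter worker state. */
struct fake_adapter {
	bool if_up;			/* interface state */
	unsigned int work_counter;	/* incremented on every worker run */
	unsigned int temp_queries;	/* number of fw queries issued */
};

/* The periodic query is made before the interface-state check, so it
 * also runs while the interface is down. */
static void fake_worker(struct fake_adapter *a)
{
	if (a->work_counter % TEMP_QUERY_INTERVAL == 0)
		a->temp_queries++;	/* stands in for the fw temperature query */

	a->work_counter++;

	if (!a->if_up)
		return;		/* remaining work needs the interface up */

	/* ... per-interface housekeeping would go here ... */
}
```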

Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit d3480615cf00c6f615cdb61a9d03386574b93342)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: signedness bug in be_msix_enable()

Orabug: 25570957

"num_vec" needs to be signed for the error handling to work.
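
A small userspace illustration of this class of bug, assuming a helper that
returns a count on success and a negative errno on failure.
fake_enable_vectors() and both wrappers are hypothetical and are not the
driver's code.

```c
#include <errno.h>

/* Hypothetical stand-in for a call that returns the number of vectors
 * granted, or a negative errno on failure. */
static int fake_enable_vectors(int want)
{
	return want > 8 ? -ENOSPC : want;
}

/* Buggy: the result is stored in an unsigned variable, so a negative
 * error code becomes a large positive value and "< 0" is never true. */
static int enable_unsigned(int want)
{
	unsigned int num_vec = fake_enable_vectors(want);

	if (num_vec < 0)	/* always false for an unsigned type */
		return -1;
	return 0;		/* failure is silently treated as success */
}

/* Fixed: a signed variable preserves the negative error code. */
static int enable_signed(int want)
{
	int num_vec = fake_enable_vectors(want);

	if (num_vec < 0)
		return num_vec;
	return 0;
}
```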

Fixes: e261768e9e39 ('be2net: support asymmetric rx/tx queue counts')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6fde0e63eccbaf21fa278b240b8129fec14b864b)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: update be2net maintainers list

Orabug: 25570957

This patch removes Padmanabh's name from the maintainers list as he's no
longer with the company. It also adds the driver name on the headline to
make it easy to look up the maintainers list by the driver name.

Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit d2ee76fa0a999e1c174b219e91c748f65f42ac16)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
MAINTAINERS

Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: Change copyright markings in source files

Orabug: 25570957

This patch updates year and company name in the copyright markings in the
be2net source files.

Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 7dfbe7d799ffd5cafd02c79434f3bf93bbe4fe52)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: Fix broadcast echoes from EVB in BE3

Orabug: 25570957

On SR-IOV profiles, when the user connects a Linux Bridge or OVS to a BE3
vport, they suffer the "broadcast/multicast echo" problem. BE3 EVB echoes
broadcast and multicast packets back to PF's vport confusing the
Linux bridge. BE3 relies on the src-mac addr being programmed on the
interface to avoid sending back an echo of a broadcast or multicast packet
on a vPort. When a Linux bridge is connected to a BE3, the mac-addr of the
VM behind the bridge doesn't get configured on the vPort and so echo
cancellation doesn't work.
This patch works around this problem by disabling the EVB initially
and re-enabling it *only* when SR-IOV is enabled by the user. For the
driver fix to work, the BE3 FW version must be >= 11.1.84.0.

Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 884476be065e23bb8e5abda3aad9ba04c17341c3)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

be2net: fix definition of be_max_eqs()

Orabug: 25570957

The EQs available on a function are shared between NIC and RoCE.
The be_max_eqs() macro was so far being used to refer to the max number of
EQs available for NIC. This has caused some confusion in the code. To fix
this confusion this patch introduces a new macro called be_max_nic_eqs()
to refer to the max number of EQs available for NIC only and renames
be_max_eqs() to be_max_func_eqs().

Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit ce7faf0a071b6d05f55d8b7b36fd796cab527427)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: be2iscsi: Use GFP_ATOMIC under spin lock

Orabug: 25655127

A spin lock is taken here so we should use GFP_ATOMIC.

Fixes: 987132167f4b ("scsi: be2iscsi: Fix for crash in beiscsi_eh_device_reset")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: be2iscsi: Update driver version

Orabug: 25655127

Version 11.2.1.0

Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: be2iscsi: Add warning message for unsupported adapter

Orabug: 25655127

Add a warning message to indicate obsolete/unsupported
BE2 Adapter Family devices

Signed-off-by: Ketan Mukadam <ketan.mukadam@avagotech.com>
Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: be2iscsi: Reinit SGL handle, CID tables after TPE

Orabug: 25655127

After TPE recovery, CID table needs to be repopulated as per CIDs in
WRBQ creation responses.

SGL handles table needs to be recreated for posting and its indices need
to be reset.

This is achieved by calling beiscsi_cleanup_port when disabling and
beiscsi_init_port in enabling port.

Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: be2iscsi: Add checks to validate CID alloc/free

Orabug: 25655127

Set CID slot to 0xffff to indicate empty.
Check if connection already exists in conn_table before binding.
Check if endpoint already NULL before putting back CID.
Break ep->conn link in free_ep to ignore completions after freeing.
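
The first three checks listed above can be sketched in userspace roughly as
follows. This is an illustration under assumptions, not the be2iscsi code:
the pool layout, NUM_CIDS and all function names are hypothetical; only the
0xffff "empty" marker and the already-bound / already-NULL checks come from
this commit text.

```c
#include <stddef.h>
#include <stdint.h>

#define CID_EMPTY 0xffff	/* sentinel from the commit text */
#define NUM_CIDS  4		/* hypothetical table size */

/* Hypothetical CID pool: a ring of free CIDs plus a per-CID endpoint. */
struct fake_cid_pool {
	uint16_t slot[NUM_CIDS];	/* free CIDs; CID_EMPTY marks a handed-out slot */
	void *ep[NUM_CIDS];		/* endpoint per CID, NULL when unbound */
	unsigned int alloc_idx, free_idx, avail;
};

static void pool_init(struct fake_cid_pool *p)
{
	for (unsigned int i = 0; i < NUM_CIDS; i++) {
		p->slot[i] = (uint16_t)i;
		p->ep[i] = NULL;
	}
	p->alloc_idx = p->free_idx = 0;
	p->avail = NUM_CIDS;
}

/* Returns a CID, or CID_EMPTY if none can be handed out. */
static uint16_t cid_alloc(struct fake_cid_pool *p, void *ep)
{
	uint16_t cid;

	if (!p->avail)
		return CID_EMPTY;
	cid = p->slot[p->alloc_idx];
	if (cid == CID_EMPTY || p->ep[cid] != NULL)
		return CID_EMPTY;	/* slot already empty or CID still bound */
	p->slot[p->alloc_idx] = CID_EMPTY;
	p->alloc_idx = (p->alloc_idx + 1) % NUM_CIDS;
	p->avail--;
	p->ep[cid] = ep;
	return cid;
}

/* Returns 0 on success, -1 on a bogus CID or a double free. */
static int cid_free(struct fake_cid_pool *p, uint16_t cid)
{
	if (cid >= NUM_CIDS || p->ep[cid] == NULL)
		return -1;		/* endpoint already NULL: nothing to put back */
	if (p->slot[p->free_idx] != CID_EMPTY)
		return -1;		/* target slot unexpectedly occupied */
	p->ep[cid] = NULL;		/* break the link before returning the CID */
	p->slot[p->free_idx] = cid;
	p->free_idx = (p->free_idx + 1) % NUM_CIDS;
	p->avail++;
	return 0;
}
```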

Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: be2iscsi: Remove wq_name from beiscsi_hba

Orabug: 25655127

wq_name is used only to set WQ name when it's being allocated.
Remove it from beiscsi_hba structure and define locally.

Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: be2iscsi: Remove unused struct members

Orabug: 25655127

Fix errors reported in static analysis.

Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: be2iscsi: Remove redundant receive buffers posting

Orabug: 25655127

This duplicate code got added during manual merging.

Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: be2iscsi: Fix iSCSI cmd cleanup IOCTL

Orabug: 25655127

Prepare the IOCTL with appropriate sizes of buffers of V0 and V1.
Set missing chute number in V1 IOCTL.

Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>

scsi: be2iscsi: Add checks to validate completions

Orabug: 25655127

Added check in beiscsi_process_cq for pio_handle.
pio_handle is cleared in beiscsi_put_wrb_handle.
This catches any case where task gets cleaned up just before completion.

Use back_lock before accessing pio_handle.

Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>