Chuck Anderson [Thu, 9 Mar 2017 07:36:20 +0000 (23:36 -0800)]
Merge branch 'topic/uek-4.1/drivers' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
* topic/uek-4.1/drivers: (289 commits)
Input: vmmouse - remove port reservation
Input: vmmouse - fix absolute device registration
bnxt_en: use eth_hw_addr_random()
bnxt_en: fix pci cleanup in bnxt_init_one() failure path
bnxt_en: Fix NULL pointer dereference in a failure path during open.
bnxt_en: Reject driver probe against all bridge devices
bnxt_en: Added PCI IDs for BCM57452 and BCM57454 ASICs
bnxt_en: Fix bnxt_setup_tc() error message.
bnxt_en: Print FEC settings as part of the linkup dmesg.
bnxt_en: Do not setup PHY unless driving a single PF.
bnxt_en: Add hardware NTUPLE filter for encapsulated packets.
bnxt_en: Allow NETIF_F_NTUPLE to be enabled on VFs.
bnxt_en: Fix ethtool -l pre-set max combined channel.
bnxt_en: Retry failed NVM_INSTALL_UPDATE with defragmentation flag.
bnxt_en: Update to firmware interface spec 1.7.0.
bnxt_en: Refactor tx completion path.
bnxt_en: Add a set of TX rings to support XDP.
bnxt_en: Add tx ring mapping logic.
bnxt_en: Centralize logic to reserve rings.
bnxt_en: Use event bit map in RX path.
...
Chuck Anderson [Thu, 9 Mar 2017 07:34:05 +0000 (23:34 -0800)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
* topic/uek-4.1/upstream-cherry-picks:
Btrfs: fix crash on fsync when using overlayfs v4
vfio/pci: Hide broken INTx support from user
crypto: cryptd - Assign statesize properly
crypto: ghash-clmulni - Fix load failure
USB: digi_acceleport: do sanity checking for the number of ports
ksplice: add sysctls for determining Ksplice features.
signal: protect SIGNAL_UNKILLABLE from unintentional clearing.
Chuck Anderson [Thu, 9 Mar 2017 04:24:57 +0000 (20:24 -0800)]
Merge branch topic/uek-4.1/sparc of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
* topic/uek-4.1/sparc: (32 commits)
sparc: fix kernel panic caused by vio handshake
sparc64: Add sensible read values for /proc/<pid>/sparc_adi
sparc64: Add ability to set the mcde state for a process
sparc64: Add proc files specific to ADI
sparc64: add mcd_on_by_default
Revert "sparc: fix intermittent LDom hang waiting for vdc_port_up"
sparc64: Add support for ADI (Application Data Integrity)
sparc64: Add support for ADI register fields, ASIs and traps
mm: Add functions to support extra actions on swap in/out
signals, sparc: Add signal codes for ADI violations
sparc64: shut down to OBP correctly
sparc64: fix for user probes in high memory
sparc64: Use online cpus instead of present cpus during hotplug.
sparc64: Update cpumaps correctly during hotplug.
sparc: fix intermittent LDom hang waiting for vdc_port_up
arch/sparc: Add a dedicated clear_page and clear_user_page for M7
sparc64: perf: Enable dynamic tracepoints when using perf probe
SPARC64: UEK4 LDOMS DOMAIN SERVICES UPDATE 7
arch/sparc: Fix indexing msi_msiqid_table and msi_irq_table
arch/sparc: Clear msi_msiqid_table during teardown
...
Chuck Anderson [Thu, 9 Mar 2017 04:16:08 +0000 (20:16 -0800)]
Merge branch 'topic/uek-4.1/drivers' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
* topic/uek-4.1/drivers: (200 commits)
scsi: megaraid-sas: request irqs later
scsi: megaraid_sas: add in missing white spaces in error messages text
scsi: megaraid_sas: fix macro MEGASAS_IS_LOGICAL to avoid regression
scsi: megaraid_sas: driver version upgrade
scsi: megaraid_sas: Do not set MPI2_TYPE_CUDA for JBOD FP path for FW which does not support JBOD sequence map
scsi: megaraid_sas: Send SYNCHRONIZE_CACHE for VD to firmware
scsi: megaraid_sas: Do not fire DCMDs during PCI shutdown/detach
scsi: megaraid_sas: Send correct PhysArm to FW for R1 VD downgrade
scsi: megaraid_sas: For SRIOV enabled firmware, ensure VF driver waits for 30secs before reset
scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices
scsi: megaraid_sas: clean function declarations in megaraid_sas_base.c up
scsi: megaraid_sas: add in missing white space in error message text
scsi: megaraid_sas: Fix the search of first memory bar
scsi: megaraid_sas: Use memdup_user() rather than duplicating its implementation
megaraid_sas: Fix probing cards without io port
megaraid_sas: Do not fire MR_DCMD_PD_LIST_QUERY to controllers which do not support it
megaraid_sas: Downgrade two success messages to info
megaraid_sas: driver version upgrade
megaraid_sas: task management code optimizations
megaraid_sas: call ISR function to clean up pending replies in OCR path
...
Chuck Anderson [Thu, 9 Mar 2017 04:00:03 +0000 (20:00 -0800)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
* topic/uek-4.1/upstream-cherry-picks: (280 commits)
dm btree: fix bufio buffer leaks in dm_btree_del() error path
ipv4: keep skb->dst around in presence of IP options
ip6_gre: fix ip6gre_err() invalid reads
watchdog: hpwdt: changed maintainer information
watchdog: hpwdt: add support for iLO5
watchdog: hpwdt: remove email address from doc
watchdog: hpwdt: Adjust documentation to match latest kernel module parameters.
hpwdt: use nmi_panic() when kernel panics in NMI handler
panic: change nmi_panic from macro to function
watchdog/hpwdt: Fix build on certain configs
watchdog/hpwdt: Create stack frame in asminline_call()
x86/asm: Add C versions of frame pointer macros
x86/asm: Clean up frame pointer macros
watchdog: hpwdt: HP rebranding
panic, x86: Allow CPUs to save registers even if looping in NMI context
watchdog: hpwdt: Add support for WDIOC_SETOPTIONS
kvm: fix page struct leak in handle_vmon
bnx2: use READ_ONCE() instead of barrier()
bnx2: Wait for in-flight DMA to complete at probe stage
bnx2: fix locking when netconsole is used
...
Thomas Tai [Mon, 13 Feb 2017 14:50:05 +0000 (06:50 -0800)]
sparc: fix kernel panic caused by vio handshake
During an hours-long reboot test, the primary prints out multiple TX trigger
errors followed by a VIO handshake panic. The TX trigger error happens
because the primary ldmvsw detects that the ldc channel is down. In this
situation, the ldc operation is aborted and the tx and rx queues are then
flushed. The problem is that the rx queue may contain an LDC_EVENT_RESET
sent by the guest. It causes the primary to think that the ldc channel
is not in the reset state. When the guest comes up again, the handshake is
out of sequence and thus causes a handshake panic.
The TX trigger error would not have happened if the LDC_EVENT_RESET was
received before the TX checked the ldc link state. This is the reason
why the panic happens intermittently.
This patch checks for the connection reset and changes the ldc state to
reset. The reset logic is taken from the existing vnet_event_napi() ldc_ctrl:
code path.
Khalid Aziz [Fri, 2 Dec 2016 19:45:37 +0000 (12:45 -0700)]
sparc64: Add sensible read values for /proc/<pid>/sparc_adi
This patch makes the value read from /proc/<pid>/sparc_adi consistent
across platforms that support ADI and ones that do not. When ADI is
not available for a process, either because it is an anonymous
process on an ADI-capable platform or because it is running on a
non-ADI platform, a read from /proc/<pid>/sparc_adi always returns a
value of -1. This patch also updates the documentation file with
the values for the sparc_adi proc file.
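For illustration, a minimal userspace sketch that reads this proc file and treats -1 as "ADI not available", assuming the file holds a single integer as described above:

#include <stdio.h>

int main(int argc, char **argv)
{
        /* Read /proc/<pid>/sparc_adi; -1 means ADI is not available. */
        const char *pid = (argc > 1) ? argv[1] : "self";
        char path[64];
        long val;
        FILE *f;

        snprintf(path, sizeof(path), "/proc/%s/sparc_adi", pid);
        f = fopen(path, "r");
        if (!f || fscanf(f, "%ld", &val) != 1) {
                perror(path);
                return 1;
        }
        fclose(f);
        printf("%s = %ld%s\n", path, val,
               val == -1 ? " (ADI not available)" : "");
        return 0;
}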
Eric Snowberg [Thu, 17 Nov 2016 21:27:36 +0000 (13:27 -0800)]
sparc64: Add ability to set the mcde state for a process
Version checking (PSTATE.mcde) needs to be turned off to avoid tripping over
ADI versions in flux. This has been partially remedied by using non-faulting
loads.
However, there is still a need to turn off PSTATE.mcde in memory dump
functions. This is to determine if an address is readable. If the
address is unreadable, the dump shows the memory contents as "********"
instead of a 4-byte hex value.
Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Chuck Anderson [Sat, 18 Feb 2017 06:15:35 +0000 (22:15 -0800)]
sparc64: add mcd_on_by_default
Add the global variable mcd_on_by_default and support for the kernel boot arg
"mcd_on_by_default", which sets mcd_on_by_default = 1 if the kernel is
adi_capable().
Based on the code in commit:
sparc64: Enable Application Data Integrity for m7 and newer processors
Required by commit:
sparc64: Add proc files specific to ADI
Orabug: 22713162 Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Khalid Aziz [Wed, 15 Feb 2017 19:57:59 +0000 (12:57 -0700)]
sparc64: Add support for ADI (Application Data Integrity)
ADI is a new feature supported on SPARC M7 and newer processors to allow
hardware to catch rogue accesses to memory. ADI is supported for data
fetches only and not instruction fetches. An app can enable ADI on its
data pages, set version tags on them and use versioned addresses to
access the data pages. Upper bits of the address contain the version
tag. On M7 processors, upper four bits (bits 63-60) contain the version
tag. If a rogue app attempts to access ADI-enabled data pages, its
access is blocked and the processor generates an exception. Please see
Documentation/sparc/adi.txt for further details.
This patch extends mprotect to enable ADI (TSTATE.mcde), enable/disable
MCD (Memory Corruption Detection) on selected memory ranges, enable
TTE.mcd in PTEs, return ADI parameters to userspace and save/restore ADI
version tags on page swap out/in or migration. It also adds handlers for
traps related to MCD. ADI is not enabled by default for any task. A task
must explicitly enable ADI on a memory range and set a version tag for ADI
to be effective for the task.
This initial implementation supports saving and restoring one tag per
page. A page must use the same version tag across the entire page for the
tag to survive swap and migration. The swap support infrastructure in this
patch allows this capability to be expanded to store/restore more
than one tag per page in the future.
This is a backport of patch sent upstream and brings UEK code closer to
upstream patch v6.
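The versioned-address layout described above (tag in bits 63-60 on M7) can be illustrated in plain C. This is a sketch of the addressing scheme only; PROT_ADI and the MCD ASI stores that actually set tags in memory are not shown, and the helper names are illustrative:

#include <stdint.h>
#include <stdio.h>

/* Place a 4-bit ADI version tag in bits 63-60 of a virtual address. */
static inline uint64_t adi_tag_addr(uint64_t addr, unsigned int tag)
{
        return (addr & ~(0xfULL << 60)) | ((uint64_t)(tag & 0xf) << 60);
}

/* Recover the version tag from a versioned address. */
static inline unsigned int adi_addr_tag(uint64_t addr)
{
        return (unsigned int)(addr >> 60) & 0xf;
}

int main(void)
{
        uint64_t va = 0x0000700012345000ULL;      /* hypothetical mapping */
        uint64_t tagged = adi_tag_addr(va, 0xa);

        printf("va=0x%016llx tagged=0x%016llx tag=%u\n",
               (unsigned long long)va, (unsigned long long)tagged,
               adi_addr_tag(tagged));
        return 0;
}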
Khalid Aziz [Wed, 18 Jan 2017 17:59:26 +0000 (10:59 -0700)]
sparc64: Add support for ADI register fields, ASIs and traps
The SPARC M7 processor adds new control register fields, ASIs and a new
trap to support the ADI (Application Data Integrity) feature. This
patch adds definitions for these register fields, ASIs and a handler
for the new precise memory corruption detected trap.
This is a backport of patch sent upstream and brings UEK code in sync
with upstream patch v6.
Khalid Aziz [Wed, 18 Jan 2017 17:36:21 +0000 (10:36 -0700)]
mm: Add functions to support extra actions on swap in/out
If a processor supports special metadata for a page, for example ADI
version tags on SPARC M7, this metadata must be saved when the page is
swapped out. The same metadata must be restored when the page is swapped
back in. This patch adds two new architecture specific functions -
arch_do_swap_page() to be called when a page is swapped in,
arch_unmap_one() to be called when a page is being unmapped for swap
out.
This is a backport of patch sent upstream and brings UEK code in sync
with upstream patch v6.
Khalid Aziz [Thu, 5 Jan 2017 18:46:54 +0000 (11:46 -0700)]
signals, sparc: Add signal codes for ADI violations
The SPARC M7 processor introduces a new feature - Application Data
Integrity (ADI). ADI allows the MMU to catch rogue accesses to memory.
When a rogue access occurs, the MMU blocks the access and raises an
exception. In response to the exception, the kernel sends the offending
task a SIGSEGV with an si_code that indicates the nature of the exception.
This patch adds three new signal codes specific to the ADI feature:
1. ADI is not enabled for the address and the task attempted to access
memory using ADI.
2. The task attempted to access memory using the wrong ADI tag and caused
a deferred exception.
3. The task attempted to access memory using the wrong ADI tag and caused
a precise exception.
This is a backport of patch sent upstream and brings UEK code closer to
upstream patch v6.
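A task using ADI would typically install a SIGSEGV handler and inspect si_code to tell these three cases apart. The sketch below assumes the new codes are exposed under the upstream names SEGV_ACCADI, SEGV_ADIDERR and SEGV_ADIPERR; the checks are guarded so it still compiles where those are not defined:

#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Classify an ADI-related SIGSEGV by si_code; the constant names are
 * assumptions taken from the upstream patches and guarded accordingly. */
static void segv_handler(int sig, siginfo_t *si, void *ctx)
{
        const char *why = "other SIGSEGV";

        (void)sig; (void)ctx;
#ifdef SEGV_ACCADI
        if (si->si_code == SEGV_ACCADI)
                why = "ADI not enabled for this address";
#endif
#ifdef SEGV_ADIDERR
        if (si->si_code == SEGV_ADIDERR)
                why = "wrong ADI tag (deferred exception)";
#endif
#ifdef SEGV_ADIPERR
        if (si->si_code == SEGV_ADIPERR)
                why = "wrong ADI tag (precise exception)";
#endif
        fprintf(stderr, "SIGSEGV at %p: %s (si_code=%d)\n",
                si->si_addr, why, si->si_code);
        _exit(1);
}

int main(void)
{
        struct sigaction sa;

        memset(&sa, 0, sizeof(sa));
        sa.sa_sigaction = segv_handler;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, NULL);
        /* ... enable ADI on a range, set version tags, access memory ... */
        return 0;
}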
The command "shutdown -h -H now" should shut the system down to the
OBP, however the machine was being powered off in the LDOM case.
In the LDOM case, the "reboot-command" variable must be set to
the string "noop" and then ldom_reboot() must be called.
This will make the OBP ignore the setting of "auto-boot?" after it
completes the reset. This causes the system to stop at the ok prompt.
Signed-off-by: Larry Bassel <larry.bassel@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
When returning from the user probe code into the userspace process, PC and NPC
are truncated to 32 bits.
Because shared libraries are loaded very high in the virtual address space of
the process, placing a user probe inside a shared library makes the kernel
return into the process at the wrong address, causing it to segfault most of
the time.
This patch prevents PC and NPC from being truncated.
Signed-off-by: Eric Saint Etienne <eric.saint.etienne@oracle.com> Reviewed-by: David Aldridge <david.j.aldridge@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Atish Patra [Mon, 23 Jan 2017 21:40:35 +0000 (14:40 -0700)]
sparc64: Use online cpus instead of present cpus during hotplug.
As per the hotplug documentation, online cpu maps should be
updated if cpu hotplug happens via sysfs. Thus, all other
cpu maps should be updated based on the online cpus instead
of present cpus. The following example illustrates the issue
if cpu maps are updated based on present cpus.
[root@ca-sparc64 hackbench]# echo 1 > /sys/devices/system/cpu/cpu2/online
[root@ca-sparc64 hackbench]#
cat /sys/devices/system/cpu/cpu1/topology/core_siblings_list
0-255
[root@ca-sparc64 hackbench]#
cat /sys/devices/system/cpu/cpu1/topology/thread_siblings_list
0-7
This is wrong because cpu0 is still offline.
After the fix:
[root@ca-sparc64 hackbench]#
cat /sys/devices/system/cpu/cpu1/topology/core_siblings_list
1-255
[root@ca-sparc64 hackbench]#
cat /sys/devices/system/cpu/cpu1/topology/thread_siblings_list
1-7
Signed-off-by: Atish Patra <atish.patra@oracle.com> Reviewed-by: Chris Hyser <chris.hyser@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Atish Patra [Mon, 23 Jan 2017 21:39:24 +0000 (14:39 -0700)]
sparc64: Update cpumaps correctly during hotplug.
Currently, numa_cpu_mask is not updated when cpus are hotplugged,
resulting in an incorrect number of cpus reported by lscpu/numactl.
Moreover, cpu_core_sib_cache_map is also not cleared when a cpu goes
offline.
Update both masks correctly whenever a cpu goes online/offline.
Thomas Tai [Tue, 17 Jan 2017 19:43:32 +0000 (11:43 -0800)]
sparc: fix intermittent LDom hang waiting for vdc_port_up
When an LDom boots, sunvdc probes the disk using the LDC channel.
If the channel was previously configured, we need to wait for
the channel state to change from UP to RESETTING so that the
seqid is properly reset in the primary. Otherwise the primary
will expect that the ldc packet contains a seqid other than 0.
Also disable the ldc hypervisor interrupt before calling vio_port_up,
because interrupts can happen once ldc_bind is called. Disabling the
interrupt ensures everything is configured before an interrupt request
arrives.
Signed-off-by: Thomas Tai <thomas.tai@oracle.com> Reviewed-By: Liam Merwick <Liam.Merwick@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Babu Moger [Wed, 18 Jan 2017 01:21:44 +0000 (17:21 -0800)]
arch/sparc: Add a dedicated clear_page and clear_user_page for M7
Add a dedicated clear_page and clear_user_page for M7.
This avoids multiple checks which are really not required and
eliminates about 30 instructions for each call.
A 3 to 4 percent latency reduction has been seen in some cases.
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Signed-off-by: Eric Saint Etienne <eric.saint.etienne@oracle.com> Reviewed-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
A couple of indexing fixes:
1. Fix indexing of pbm->msi_msiqid_table. It is initialized
based on pbm->msi_first (not pbm->msiq_first as previously done).
Here is how it is initialized (see sparc64_setup_msi_irq):
pbm->msi_msiqid_table[msi - pbm->msi_first] = msiqid;
2. In set_related_affinity, we don't need to subtract msi_first as
the loop is indexed from 0 to the size of the table.
This saves time when smp_flush_tlb_page/smp_flush_tlb_pending
is called during do_exit(...). Without this patch, killing
processes had a performance bottleneck in these functions
due to unnecessary xcalls made to flush TLBs.
Reviewed-by: Nitin Gupta <nitin.m.gupta@oracle.com> Signed-off-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: Henry Willard <henry.willard@oracle.com> Signed-off-by: Sanath Kumar <sanath.s.kumar@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Liam R. Howlett [Thu, 5 Jan 2017 20:58:41 +0000 (15:58 -0500)]
sparc64: Zero pages on allocation for mondo and error queues.
Error queues use a non-zero first word to detect if the queues are full.
Using pages that have not been zeroed may result in false positive
overflow events. These queues are set up once during boot so zeroing
all mondo and error queue pages is safe.
Note that this does not always occur because the page allocation for
these queues is so early in the boot cycle that higher-numbered CPUs get
fresh pages. It is only when traps are serviced on lower-numbered CPUs
that were given already-used pages that this issue is exposed.
Liam R. Howlett [Thu, 22 Dec 2016 02:57:42 +0000 (21:57 -0500)]
sparc64: Don't panic on user mode non-resumable errors
Send a SIGBUS to the offending process on all userspace non-resumable
traps. This prevents userspace applications from creating a kernel
panic. The siginfo will return the code BUS_ADRERR and a valid address
if possible.
David S. Miller [Thu, 27 Oct 2016 16:04:54 +0000 (09:04 -0700)]
sparc64: Handle extremely large kernel TLB range flushes more gracefully.
When the vmalloc area gets fragmented, and because the firmware
mapping area sits between where modules live and the vmalloc area, we
can sometimes receive requests for enormous kernel TLB range flushes.
When this happens the cpu just spins flushing billions of pages and
this triggers the NMI watchdog and other problems.
We took care of this on the TSB side by doing a linear scan of the
table once we pass a certain threshold.
Do something similar for the TLB flush, however we are limited by
the TLB flush facilities provided by the different chip variants.
First of all, we use a (mostly arbitrary) cut-off of 256K, which is
about 32 pages. This can be tuned in the future.
The huge range code path for each chip works as follows:
1) On spitfire we flush all non-locked TLB entries using diagnostic
accesses.
2) On cheetah we use the "flush all" TLB flush.
3) On sun4v/hypervisor we do a TLB context flush on context 0, which
unlike previous chips does not remove "permanent" or locked
entries.
We could probably do something better on spitfire, such as limiting
the flush to kernel TLB entries or even doing range comparisons.
However that probably isn't worth it since those chips are old and
the TLB only had 64 entries.
Reported-by: James Clarke <jrtc27@jrtc27.com> Tested-by: James Clarke <jrtc27@jrtc27.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a74ad5e660a9ee1d071665e7e8ad822784a2dc7f) Signed-off-by: Allen Pais <allen.pais@oracle.com>
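The decision described above reduces to a size threshold: ranges up to roughly 256K are flushed page by page, anything larger takes the per-chip "flush everything" path. The following is a userspace illustration of that cutoff only, with stub routines standing in for the chip-specific flush operations:

#include <stdio.h>

#define FLUSH_ALL_THRESHOLD (256UL * 1024)  /* the "mostly arbitrary" 256K cutoff */

/* Stub for the per-page flush path. */
static void flush_one_page(unsigned long va)
{
        printf("flush TLB entry for 0x%lx\n", va);
}

/* Stub for the chip-specific "flush everything" path. */
static void flush_all_kernel_tlb(void)
{
        printf("flush all (non-locked) kernel TLB entries\n");
}

static void flush_kernel_range(unsigned long start, unsigned long end,
                               unsigned long page_size)
{
        if (end - start > FLUSH_ALL_THRESHOLD) {
                flush_all_kernel_tlb();
                return;
        }
        for (unsigned long va = start; va < end; va += page_size)
                flush_one_page(va);
}

int main(void)
{
        flush_kernel_range(0x100000, 0x110000, 0x2000);     /* small: per page */
        flush_kernel_range(0x100000, 0x40100000, 0x2000);   /* huge: flush all */
        return 0;
}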
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a236441bb69723032db94128761a469030c3fe6d) Signed-off-by: Allen Pais <allen.pais@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 830cda3f9855ff092b0e9610346d110846fc497c) Signed-off-by: Allen Pais <allen.pais@oracle.com>
David S. Miller [Wed, 26 Oct 2016 02:43:17 +0000 (19:43 -0700)]
sparc64: Handle extremely large kernel TSB range flushes sanely.
If the number of pages we are flushing is more than twice the number
of entries in the TSB, just scan the TSB table for matches rather
than probing each and every page in the range.
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 849c498766060a16aad5b0e0d03206726e7d2fa4) Signed-off-by: Allen Pais <allen.pais@oracle.com>
David S. Miller [Tue, 25 Oct 2016 23:23:26 +0000 (16:23 -0700)]
sparc64: Fix illegal relative branches in hypervisor patched TLB code.
When we copy code over to patch another piece of code, we can only use
PC-relative branches that target code within that piece of code.
Such PC-relative branches cannot be made to external symbols because
the patch moves the location of the code and thus modifies the
relative address of external symbols.
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b429ae4d5b565a71dfffd759dfcd4f6c093ced94) Signed-off-by: Allen Pais <allen.pais@oracle.com>
2. CPU DR-related problems, including 'length too big' errors and hangs. With
these new fixes, more than 256 vcpus can be successfully added to or removed
from a guest domain. As part of this fix, a new scheme for reusing event data memory
buffers was implemented.
Signed-off-by: Aaron Young <Aaron.Young@oracle.com> Reviewed-By: Liam Merwick <Liam.Merwick@oracle.com>
Orabug: 23171935, 24848179 Signed-off-by: Allen Pais <allen.pais@oracle.com>
The VMWare EFI BIOS will expose port 0x5658 as an ACPI resource. This
causes the port to be reserved by the ACPI module as the system comes up,
making it unavailable to be reserved again by other drivers, thus
preserving this VMWare port for special use in a VMWare guest.
This port is designed to be shared among multiple VMWare services, such as
the VMMOUSE. Because of this, VMMOUSE should not try to reserve this port
on its own.
The VMWare non-EFI BIOS does not do this to preserve compatibility with
existing/legacy VMs. It is known that there is a small chance a VM may be
configured such that these ports get reserved by other non-VMWare devices,
and if this ever happens, the result is undefined.
Signed-off-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: <stable@vger.kernel.org> # 4.1- Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
(cherry picked from commit 60842ef8128e7bf58c024814cd0dc14319232b6c) Signed-off-by: Brian Maly <brian.maly@oracle.com>
We should set the device's capabilities first and then register it;
otherwise, various handlers already present in the kernel will not be
able to connect to the device.
Reported-by: Lauri Kasanen <cand@gmx.com> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
(cherry picked from commit d4f1b06d685d11ebdaccf11c0db1cb3c78736862) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Use eth_hw_addr_random() to set a random MAC address in order to make
sure bp->dev->addr_assign_type will be properly set to NET_ADDR_RANDOM.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 1faaa78f36cb2915ae89138ba5846f87ade85dcb) Signed-off-by: Brian Maly <brian.maly@oracle.com>
In the bnxt_init_one() failure path, bar1 and bar2 are not
being unmapped. This commit fixes this issue. Reorganize the
code so that bnxt_init_one()'s failure path and bnxt_remove_one()
can call the same function to do the PCI cleanup.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 17086399c113d933e1202697f85b8f0f82fcb8ce) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt.c
If bnxt_hwrm_ring_free() is called during a failure path in bnxt_open(),
it is possible that the completion rings have not been allocated yet.
In that case, the completion doorbell has not been initialized, and
calling bnxt_disable_int() will crash. Fix it by checking that the
completion ring has been initialized before writing to the completion
ring doorbell.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit daf1f1e7841138cb0e48d52c8573a5f064d8f495) Signed-off-by: Brian Maly <brian.maly@oracle.com>
There are additional SoC devices that use the same device ID for
bridge and NIC devices. The bnxt driver should reject probe against
all bridge devices since it's meant to be used with only endpoint
devices.
Signed-off-by: Ray Jui <ray.jui@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 4e00338a61998de3502d0428c4f71ffc69772316) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Deepak Khungar <deepak.khungar@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 32b40798c1b40343641f04cdfd09652af70ea0e9) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Add proper punctuation to make the message clearer.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b451c8b69e70de299aa6061e1fa6afbb4d7c1f9e) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Print FEC (Forward Error Correction) autoneg and encoding settings during
link up.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e70c752f88ed23e6a0f081fa408282c2450c8ce9) Signed-off-by: Brian Maly <brian.maly@oracle.com>
If it is a VF or an NPAR function, the firmware call to setup the PHY
will fail. Adding this check will prevent unnecessary firmware calls
to setup the PHY unless calling from the PF. This will also eliminate
many unnecessary warning messages when the call from a VF or NPAR fails.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 33dac24abbd5a77eefca18fb7ebbd01a3cf1b343) Signed-off-by: Brian Maly <brian.maly@oracle.com>
If skb_flow_dissect_flow_keys() returns with the encapsulation flag
set, pass the information to the firmware to setup the NTUPLE filter
accordingly.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 61aad724ec0a685bc83b02b059a3ca0ad3bde6b0) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt.c
Commit ae10ae740ad2 ("bnxt_en: Add new hardware RFS mode.") has added
code to allow NTUPLE to be enabled on VFs. So we now remove the
BNXT_VF() check in rfs_capable() to allow NTUPLE on VFs.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 964fd4801d40ead69a447482c0dd0cd4be495e47) Signed-off-by: Brian Maly <brian.maly@oracle.com>
With commit d1e7925e6d80 ("bnxt_en: Centralize logic to reserve rings."),
ring allocation for combined rings has become stricter. A combined
ring must now have an rx-tx ring pair. The pre-set max. for combined
rings should now be min(rx, tx).
Fixes: d1e7925e6d80 ("bnxt_en: Centralize logic to reserve rings.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a79a5276aa2f844bd368c1d3d5a625e1fbefd989) Signed-off-by: Brian Maly <brian.maly@oracle.com>
If the HWRM_NVM_INSTALL_UPDATE command fails with the error code
NVM_INSTALL_UPDATE_CMD_ERR_CODE_FRAG_ERR, retry the command with
a new flag to allow defragmentation. Since we are checking the
response for error code, we also need to take the mutex until
we finish reading the response.
Signed-off-by: Kshitij Soni <kshitij.soni@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit cb4d1d6261453677feb54e7a09c23fc7648dd6bc) Signed-off-by: Brian Maly <brian.maly@oracle.com>
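The retry described above is a straightforward issue/check/reissue pattern: run the command, and if the firmware reports the fragmentation error, reissue it once with the defragmentation flag set. The sketch below shows only that pattern; the constant and function names are placeholders rather than the real HWRM definitions:

#include <stdio.h>

#define ERR_FRAG           1    /* placeholder for the fragmentation error code */
#define FLAG_ALLOW_DEFRAG  0x1  /* placeholder for the defragmentation flag */

/* Stub firmware call: first attempt reports that defragmentation is needed. */
static int fw_nvm_install(unsigned int flags, int *err_code)
{
        if (!(flags & FLAG_ALLOW_DEFRAG)) {
                *err_code = ERR_FRAG;
                return -1;
        }
        *err_code = 0;
        return 0;
}

static int nvm_install_update(void)
{
        int err_code = 0;
        int rc = fw_nvm_install(0, &err_code);

        /* Retry once with the defragmentation flag on a fragmentation error. */
        if (rc && err_code == ERR_FRAG)
                rc = fw_nvm_install(FLAG_ALLOW_DEFRAG, &err_code);
        return rc;
}

int main(void)
{
        printf("install rc=%d\n", nvm_install_update());
        return 0;
}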
The new spec has NVRAM defragmentation support which will be used in
the next patch to improve ethtool flash operation.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit bac9a7e0f5d6da82478d5e0a2a236158f42d5757) Signed-off-by: Brian Maly <brian.maly@oracle.com>
XDP_TX requires a different function to handle completion. Add a
function pointer to handle tx completion logic. Regular TX rings
will be assigned the current bnxt_tx_int() for the ->tx_int()
function pointer.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit fa3e93e86cc3d1809fba67cb138883ed4bb74a5f) Signed-off-by: Brian Maly <brian.maly@oracle.com>
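The refactor amounts to dispatching completion handling through a per-ring function pointer, so that an XDP TX ring can later install its own handler while regular rings keep the existing one. A minimal sketch of that shape; apart from the ->tx_int() member named in the commit text, the type and handler names are hypothetical:

#include <stdio.h>

struct tx_ring {
        const char *name;
        /* Per-ring completion handler, as in the ->tx_int() described above. */
        void (*tx_int)(struct tx_ring *txr, int nr_pkts);
};

static void regular_tx_int(struct tx_ring *txr, int nr_pkts)
{
        printf("%s: regular completion of %d packets\n", txr->name, nr_pkts);
}

static void xdp_tx_int(struct tx_ring *txr, int nr_pkts)
{
        printf("%s: XDP completion of %d packets\n", txr->name, nr_pkts);
}

int main(void)
{
        struct tx_ring rings[2] = {
                { .name = "tx0",     .tx_int = regular_tx_int },
                { .name = "tx0-xdp", .tx_int = xdp_tx_int },
        };

        for (int i = 0; i < 2; i++)
                rings[i].tx_int(&rings[i], 8);   /* dispatch per ring type */
        return 0;
}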
Add logic for an extra set of TX rings for XDP. If enabled, this
set of TX rings equals the number of RX rings and shares the same
IRQ as the RX ring set. A new field bp->tx_nr_rings_xdp is added
to keep track of these TX XDP rings. Adjust all other relevant functions
to handle bp->tx_nr_rings_xdp.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5f4492493e75dafc5cbb96eabe0f146c2ffb1e3d) Signed-off-by: Brian Maly <brian.maly@oracle.com>
To support XDP_TX, we need to add a set of dedicated TX rings, each
associated with the NAPI of an RX ring. To assign XDP rings and regular
rings in a flexible way, we add a bp->tx_ring_map[] array to do the
remapping. The netdev txq index is stored in the new field txq_index
so that we can retrieve the netdev txq when handling TX completions.
In this patch, before we introduce XDP_TX, the mapping is 1:1.
v2: Fixed a bug in bnxt_tx_int().
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a960dec98861b009b4227d2ae3b94a142c83eb96) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Currently, bnxt_setup_tc() and bnxt_set_channels() have similar and
duplicated code to check and reserve rx and tx rings. Add a new
function bnxt_reserve_rings() to centralize the logic. This will
make it easier to add XDP_TX support which requires allocating a
new set of TX rings.
Also, the tx ring checking logic in bnxt_setup_msix() can be removed.
The rings have been reserved beforehand.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit d1e7925e6d80ce5f9ef6deb8f3cec7526f5c443c) Signed-off-by: Brian Maly <brian.maly@oracle.com>
In the current code, we have separate rx_event and agg_event parameters
to keep track of rx and aggregation events. Combine these events into
a u8 event mask with different bits defined for different events. This
way, it is easier to expand the logic to include XDP tx events.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 4e5dbbda4c40a239e2ed4bbc98f2aa320e4dcca2) Signed-off-by: Brian Maly <brian.maly@oracle.com>
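Replacing separate rx_event/agg_event flags with a single bit mask looks like the sketch below. The commit only states that the mask is a u8 with one bit per event type, so the bit names here are placeholders:

#include <stdint.h>
#include <stdio.h>

#define RX_EVENT   0x01u   /* rx work seen */
#define AGG_EVENT  0x02u   /* aggregation work seen */
#define TX_EVENT   0x04u   /* room for XDP tx events later */

int main(void)
{
        uint8_t event = 0;

        event |= RX_EVENT;
        event |= AGG_EVENT;

        if (event & (RX_EVENT | AGG_EVENT))
                printf("handle rx/agg work (mask=0x%02x)\n", event);
        if (event & TX_EVENT)
                printf("handle XDP tx work\n");
        return 0;
}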
This mode is to support XDP. In this mode, each rx ring is configured
with page sized buffers for linear placement of each packet. MTU will be
restricted to what the page sized buffers can support.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c61fb99cae51958a9096d8540c8c05e74cfa7e59) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt.c
Convert the global constants BNXT_RX_OFFSET and BNXT_RX_DMA_OFFSET to
device parameters. This will make it easier to support XDP with
headroom support which requires different RX buffer offsets.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b3dba77cf0acb6e44b368979026df975658332bc) Signed-off-by: Brian Maly <brian.maly@oracle.com>
When the driver is running in XDP mode, rx buffers are DMA mapped as
DMA_BIDIRECTIONAL. Add a field so the code will map/unmap rx buffers
according to this field.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 745fc05c9db1f17da076861c7f57507e13f28a3a) Signed-off-by: Brian Maly <brian.maly@oracle.com>
To support XDP_TX, we need the RX buffer's DMA address to transmit the
packet. Convert the DMA address field to a permanent field in
bnxt_sw_rx_bd.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 11cd119d31a71b37c2362fc621f225e2aa12aea1) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Minor refactoring of bnxt_rx_skb() so that it can easily be replaced by
a new function that handles packets in a single page. Also, use a
function pointer bp->rx_skb_func() to switch to a new function when
we add the new mode in the next patch.
Add a new field data_ptr that points to the packet data in the
bnxt_sw_rx_bd structure. The original data field is changed to void
pointer so that it can either hold the kmalloc'ed data or a page
pointer.
The last parameter of bnxt_rx_skb(), which was the length parameter, is
changed to include the payload offset of the packet in the upper 16 bits.
The offset is needed to support the rx page mode and is not used in
this existing function.
v3: Added a new data_ptr parameter to bp->rx_skb_func(). The caller
has the option to modify the starting address of the packet. This
will be needed when XDP with headroom support is added.
v2: Changed the name of the last parameter to offset_and_len to make the
code more clear.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6bb19474391d17954fee9a9997ecca25b35dfd46) Signed-off-by: Brian Maly <brian.maly@oracle.com>
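Packing the payload offset into the upper 16 bits of the former length parameter is plain bit arithmetic; a self-contained sketch with illustrative helper names:

#include <stdint.h>
#include <stdio.h>

/* Pack a 16-bit payload offset into the upper half of offset_and_len. */
static inline uint32_t pack_offset_and_len(uint16_t offset, uint16_t len)
{
        return ((uint32_t)offset << 16) | len;
}

static inline uint16_t unpack_offset(uint32_t offset_and_len)
{
        return (uint16_t)(offset_and_len >> 16);
}

static inline uint16_t unpack_len(uint32_t offset_and_len)
{
        return (uint16_t)(offset_and_len & 0xffff);
}

int main(void)
{
        uint32_t v = pack_offset_and_len(14, 1500);

        printf("offset=%u len=%u\n", unpack_offset(v), unpack_len(v));
        return 0;
}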
bnxt_get_port_module_status() calls bnxt_update_link() which expects
RTNL to be held. In bnxt_sp_task() that does not hold RTNL, we need to
call it with a prior call to bnxt_rtnl_lock_sp() and the call needs to
be moved to the end of bnxt_sp_task().
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 90c694bb71819fb5bd3501ac397307d7e41ddeca) Signed-off-by: Brian Maly <brian.maly@oracle.com>
bnxt_update_link() is called from multiple code paths. Most callers,
such as open, ethtool, already hold RTNL. Only the caller bnxt_sp_task()
does not. So it is a bug to take RTNL inside bnxt_update_link().
Fix it by removing the RTNL inside bnxt_update_link(). The function
now expects the caller to always hold RTNL.
In bnxt_sp_task(), call bnxt_rtnl_lock_sp() before calling
bnxt_update_link(). We also need to move the call to the end of
bnxt_sp_task() since it will be clearing the BNXT_STATE_IN_SP_TASK bit.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 0eaa24b971ae251ae9d3be23f77662a655532063) Signed-off-by: Brian Maly <brian.maly@oracle.com>
In bnxt_sp_task(), we set a bit BNXT_STATE_IN_SP_TASK so that bnxt_close()
will synchronize and wait for bnxt_sp_task() to finish. Some functions
in bnxt_sp_task() require us to clear BNXT_STATE_IN_SP_TASK and then
acquire rtnl_lock() to prevent race conditions.
There are some bugs related to this logic. This patch refactors the code
to have common bnxt_rtnl_lock_sp() and bnxt_rtnl_unlock_sp() to handle
the RTNL and the clearing/setting of the bit. Multiple functions will
need the same logic. We also need to move bnxt_reset() to the end of
bnxt_sp_task(). Functions that clear BNXT_STATE_IN_SP_TASK must be the
last functions to be called in bnxt_sp_task(). The common scheme will
handle the condition properly.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a551ee94ea723b4af9b827c7460f108bc13425ee) Signed-off-by: Brian Maly <brian.maly@oracle.com>
In the TPA GRO code path, initialize the tcp_opt_len variable to 0 so
that it will be correct for packets without TCP timestamps. The bug
caused the SKB fields to be incorrectly set up for packets without
TCP timestamps, leading to these packets being rejected by the stack.
Reported-by: Andy Gospodarek <andrew.gospodarek@broadocm.com> Acked-by: Andy Gospodarek <andrew.gospodarek@broadocm.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 719ca8111402aa6157bd83a3c966d184db0d8956) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Add the ulp_sriov_cfg callbacks when the number of VFs is changing. This
allows the RDMA driver to provision RDMA resources for the VFs.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 2f5938467bd7f34e59a1d6d3809f5970f62e194b) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Add LED blinking code to support ethtool -p on the PF.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5ad2cbeed74bd1e89ac4ba14288158ec7eb167da) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f183886c0d798ca3cf0a51e8cab3c1902fbd1e8b) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Commit bdbd1eb59c56 ("bnxt_en: Handle no aggregation ring gracefully.")
introduced the BNXT_FLAG_NO_AGG_RINGS flag. For consistency,
bnxt_set_tpa_flags() should also clear TPA flags when there are no
aggregation rings.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 341138c3e6afa8e77f9f3e773d72b37022dbcee8) Signed-off-by: Brian Maly <brian.maly@oracle.com>
CC [M] drivers/net/ethernet/broadcom/bnxt/bnxt.o
drivers/net/ethernet/broadcom/bnxt/bnxt.c:4947:21: warning: ‘bnxt_get_max_func_rss_ctxs’ defined but not used [-Wunused-function]
static unsigned int bnxt_get_max_func_rss_ctxs(struct bnxt *bp)
^
CC [M] drivers/net/ethernet/broadcom/bnxt/bnxt.o
drivers/net/ethernet/broadcom/bnxt/bnxt.c:4956:21: warning: ‘bnxt_get_max_func_vnics’ defined but not used [-Wunused-function]
static unsigned int bnxt_get_max_func_vnics(struct bnxt *bp)
^
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b742995445fbac874f5fe19ce2afc76c7a6ac2cf) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3f0d80b6d228f11117244a16d2e17ea684b540f5) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The current code assumes that we will always have at least 2 rx rings, 1
of which will be used as an aggregation ring for TPA and jumbo page placements.
However, it is possible, especially on a VF, that there is only 1 rx
ring available. In this scenario, the current code will fail to initialize.
To handle it, we need to properly set up only 1 ring without aggregation.
Set a new flag BNXT_FLAG_NO_AGG_RINGS for this condition and add logic to
set up the chip to place RX data linearly into a single buffer per packet.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit bdbd1eb59c565c56a74d21076e2ae8706de00ecd) Signed-off-by: Brian Maly <brian.maly@oracle.com>
With the added support for the bnxt_re RDMA driver, both drivers can be
allocating completion rings in any order. The firmware does not know
which completion ring should be receiving async events. Add an
extra step to tell firmware the completion ring number for receiving
async events after bnxt_en allocates the completion rings.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 486b5c22ea1d35e00e90dd79a32a9ee530b18915) Signed-off-by: Brian Maly <brian.maly@oracle.com>
In order to properly support TX rate limiting in SRIOV VF functions or
NPAR functions, firmware needs better control over tx ring allocations.
The new scheme requires the driver to reserve the number of tx rings
and to query to see if the requested number of tx rings is reserved.
The driver will use the new scheme when the firmware interface spec is
1.6.1 or newer.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 391be5c2736456f032fe0265031ecfe17aee84a0) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Accept ipv6 flows in .ndo_rx_flow_steer() and support ETHTOOL_GRXCLSRULE
ipv6 flows.
Signed-off-by: Michael Chan <michael.chan@broadocm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit dda0e7465f040ed814d4a5c98c6bf042e59cba69) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt.c
drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
Assign additional vnics to VFs whenever possible so that NTUPLE can be
supported on the VFs.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 8427af811a2fcbbf0c71a4b1f904f2442abdcf39) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The existing hardware RFS mode uses one hardware RSS context block
per ring just to calculate the RSS hash. This is very wasteful and
prevents VF functions from using it. The new hardware mode shares
the same hardware RSS context for RSS placement and RFS steering.
This allows VFs to enable RFS.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit ae10ae740ad2befd92b6f5b2ab39220bce6e5da2) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Add function bnxt_rfs_supported() that determines if the chip supports
RFS. Refactor the existing function bnxt_rfs_capable() that determines
if run-time conditions support RFS.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 8079e8f107bf02e1e5ece89239dd2fb475a4735f) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The new vnic RSS capability will enhance NTUPLE support, to be added
in subsequent patches.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 8fdefd63c203d9b2955d679704f4ed92bf40752c) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Call tcp_gro_complete() in the common code path instead of the chip-
specific method. The newer 5731x method is missing the call.
Signed-off-by: Michael Chan <michael.chan@broadcmo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5910906ca9ee32943f67db24917f78a9ad1087db) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The advertising field is closely related to the auto_link_speeds field.
The former is the user setting while the latter is the firmware setting.
Both should be u16. We should use the advertising field in
bnxt_get_link_ksettings because the auto_link_speeds field may not
be updated with the latest from the firmware yet.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 68515a186cf8a8f97956eaea5829277752399f58) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The IRQ is disabled by writing to the completion ring doorbell. This
should be done before the hardware completion ring is freed for correctness.
The current code disables IRQs after all the completion rings are freed.
Fix it by calling bnxt_disable_int_sync() before freeing the completion
rings. Rearrange the code to avoid forward declaration.
Signed-off-by: Michael Chan <michael.chan@broadocm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 9d8bc09766f1a229b2d204c713a1cfc6c7fa1bb1) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Use native NAPI polling instead. The next patch will complete the work
by switching to napi_complete_done().
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b356a2e729cec145a648d22ba5686357c009da25) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt.c
I got a warning about broken code on ARM64 with 64K pages:
drivers/net/vmxnet3/vmxnet3_drv.c: In function 'vmxnet3_rq_init':
drivers/net/vmxnet3/vmxnet3_drv.c:1679:29: error: large integer implicitly truncated to unsigned type [-Werror=overflow]
rq->buf_info[0][i].len = PAGE_SIZE;
'len' here is a 16-bit integer, so this clearly won't work. I don't think
this driver is used much on anything other than x86, so there is no need
to fix this properly and we can work around it with a Kconfig dependency
to forbid known-broken configurations. qemu in theory supports it on
other architectures too, but presumably only for compatibility with x86
guests that also run on vmware.
CONFIG_PAGE_SIZE_64KB is used on hexagon, mips, sh and tile; the other
symbols are architecture-specific names for the same thing.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit fbdf0e28d061708cf18ba0f8e0db5360dc9a15b9) Signed-off-by: Brian Maly <brian.maly@oracle.com>
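The overflow behind the warning is easy to reproduce in isolation: a 64K page size does not fit in a 16-bit len field, so the assignment silently truncates. A standalone demonstration:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
        uint32_t page_size_4k  = 4096;
        uint32_t page_size_64k = 65536;
        uint16_t len;

        len = (uint16_t)page_size_4k;
        printf("PAGE_SIZE=%u -> len=%u\n", page_size_4k, len);

        len = (uint16_t)page_size_64k;   /* truncates to 0 */
        printf("PAGE_SIZE=%u -> len=%u\n", page_size_64k, len);
        return 0;
}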
vmxnet3_set_mc() checks new_table_pa returned by dma_map_single()
with dma_mapping_error(), but even there it assumes zero is an invalid pa
(it assumes dma_mapping_error(...,0) returns true if new_table is NULL).
The patch adds an explicit variable to track status of new_table_pa.
Found by Linux Driver Verification project (linuxtesting.org).
v2: use "bool" and "true"/"false" for boolean variables. Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit fb5c6cfaec126d9a96b9dd471d4711bf4c737a6f) Signed-off-by: Brian Maly <brian.maly@oracle.com>
vmxnet3_reset_work() expects tx queues to be stopped (via
vmxnet3_quiesce_dev -> netif_tx_disable). However, this races with the
netif_wake_queue() call in netif_tx_timeout() such that the driver's
start_xmit routine may be called unexpectedly, triggering one of the BUG_ON
in vmxnet3_map_pkt with a stack trace like this:
Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 277964e19e1416ca31301e113edb2580c81a8b66) Signed-off-by: Brian Maly <brian.maly@oracle.com>
drivers/net/vmxnet3/vmxnet3_drv.c:1645:1: warning:
symbol 'vmxnet3_rq_destroy_all_rxdataring' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit bb40aca7cf153e3e2941140d3850a4b6c4205ccb) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Commit 3c8b3efc061a ("vmxnet3: allow variable length transmit data ring
buffer") changed the size of the buffers in the tx data ring from a
fixed size of 128 bytes to a variable size.
However, while copying data to the data ring, vmxnet3_copy_hdr continues
to carry the old code that assumes a fixed buffer size of 128. This patch
fixes it by adding the correct offset based on the actual data ring buffer
size.
Signed-off-by: Guolin Yang <gyang@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit ff2e7d5d51469e98196f7933c83b781e96517e7c) Signed-off-by: Brian Maly <brian.maly@oracle.com>
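The fix boils down to computing each buffer's address with the configured buffer size instead of the old hard-coded 128 bytes. A small sketch of that computation; the names are placeholders for the driver's actual fields:

#include <stdint.h>
#include <stdio.h>

#define OLD_FIXED_BUF_SIZE 128u

/* Address of buffer 'idx' in the tx data ring for a given buffer size. */
static uintptr_t data_ring_buf(uintptr_t ring_base, unsigned int idx,
                               unsigned int buf_size)
{
        return ring_base + (uintptr_t)idx * buf_size;   /* not idx * 128 */
}

int main(void)
{
        uintptr_t base = 0x1000;        /* pretend data-ring base address */
        unsigned int desc_size = 256;   /* size reported by the emulation */

        printf("old (fixed 128): 0x%lx\n",
               (unsigned long)data_ring_buf(base, 3, OLD_FIXED_BUF_SIZE));
        printf("new (variable):  0x%lx\n",
               (unsigned long)data_ring_buf(base, 3, desc_size));
        return 0;
}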
With all vmxnet3 version 3 changes incorporated in the vmxnet3 driver,
the driver can configure emulation to run at vmxnet3 version 3, provided
the emulation advertises support for version 3.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6af9d787459e3ea32da446fe0aa76cff65fd2f8c) Signed-off-by: Brian Maly <brian.maly@oracle.com>
In vmxnet3 version 3, the emulation added support for the vmxnet3 driver
to communicate information about the memory regions the driver will use
for rx/tx buffers. The driver can also indicate which rx/tx queue the
memory region is applicable for. If this information is communicated
to the emulation, the emulation will always keep these memory regions
mapped, thereby avoiding the mapping/unmapping overhead for every packet.
Currently, the Linux vmxnet3 driver does not leverage this capability. The
feasibility of using this approach for the Linux vmxnet3 driver will be
investigated independently and if possible, will be part of a different
patch. This patch only exposes the emulation capability to the driver
(vmxnet3_defs.h is identical between the driver and the emulation).
Signed-off-by: Guolin Yang <gyang@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 474432229f9482f0f4a2732f2e130dc48247f1d7) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The emulation supports a variety of coalescing modes viz. disabled
(no coalescing), adaptive, static (number of packets to batch before
raising an interrupt), rate based (number of interrupts per second).
This patch implements get_coalesce and set_coalesce methods to allow
querying and configuring different coalescing modes.
Signed-off-by: Keyong Sun <sunk@vmware.com> Signed-off-by: Manoj Tammali <tammalim@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 4edef40ef5f8d09a0b1ded4d1d9b0e988cd98e97) Signed-off-by: Brian Maly <brian.maly@oracle.com>
vmxnet3 driver preallocates buffers for receiving packets and posts the
buffers to the emulation. In order to deliver a received packet to the
guest, the emulation must map buffer(s) and copy the packet into it.
To avoid this memory mapping overhead, this patch introduces the receive
data ring - a set of small sized buffers that are always mapped by
the emulation. If a packet fits into the receive data ring buffer, the
emulation delivers the packet via the receive data ring (which must be
copied by the guest driver), or else the usual receive path is used.
Receive Data Ring buffer length is configurable via ethtool -G ethX rx-mini
Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 50a5ce3e7116a70edb7a1d1d209e3bc537752427) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The vmxnet3 driver supports a transmit data ring, viz. a set of fixed-size
buffers used by the driver to copy packet headers. Small packets that
fit these buffers are copied into these buffers entirely.
Currently this buffer size is fixed at 128 bytes. This patch extends
transmit data ring implementation to allow variable length transmit
data ring buffers. The length of the buffer is read from the emulation
during initialization.
Signed-off-by: Sriram Rangarajan <rangarajans@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3c8b3efc061a745d888869dc3462ac4f7dd582d9) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Shared memory is used to exchange information between the vmxnet3 driver
and the emulation. In order to request emulation to perform a task, the
driver first populates specific fields in this shared memory and then
issues the corresponding command by writing to the command register (CMD). The
layout of the shared memory was defined by vmxnet3 version 1 and cannot
be extended for every new command without breaking backward compatibility.
To address this problem, in vmxnet3 version 3, the emulation repurposed
a reserved field in the shared memory to represent command information
instead. For new commands, the driver first populates the command
information field in the shared memory and then issues the command. The
emulation interprets the data written to the command information depending
on the type of the command. This patch exposes this capability to the driver.
Signed-off-by: Guolin Yang <gyang@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f35c7480f81b70f9c3030d96a3807e8faba34cf7) Signed-off-by: Brian Maly <brian.maly@oracle.com>
vmxnet3 is currently at version 2, but some command definitions from
previous vmxnet3 versions are missing. Add those definitions before
moving to version 3.
Also, introduce utility macros for vmxnet3 version comparison and update
the Copyright and Maintained-by information.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 190af10f0b5a07140ec4ce8e6ef04b7cb238dde1) Signed-off-by: Brian Maly <brian.maly@oracle.com>