Chuck Anderson [Thu, 9 Mar 2017 07:36:20 +0000 (23:36 -0800)]
Merge branch 'topic/uek-4.1/drivers' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
* topic/uek-4.1/drivers: (289 commits)
Input: vmmouse - remove port reservation
Input: vmmouse - fix absolute device registration
bnxt_en: use eth_hw_addr_random()
bnxt_en: fix pci cleanup in bnxt_init_one() failure path
bnxt_en: Fix NULL pointer dereference in a failure path during open.
bnxt_en: Reject driver probe against all bridge devices
bnxt_en: Added PCI IDs for BCM57452 and BCM57454 ASICs
bnxt_en: Fix bnxt_setup_tc() error message.
bnxt_en: Print FEC settings as part of the linkup dmesg.
bnxt_en: Do not setup PHY unless driving a single PF.
bnxt_en: Add hardware NTUPLE filter for encapsulated packets.
bnxt_en: Allow NETIF_F_NTUPLE to be enabled on VFs.
bnxt_en: Fix ethtool -l pre-set max combined channel.
bnxt_en: Retry failed NVM_INSTALL_UPDATE with defragmentation flag.
bnxt_en: Update to firmware interface spec 1.7.0.
bnxt_en: Refactor tx completion path.
bnxt_en: Add a set of TX rings to support XDP.
bnxt_en: Add tx ring mapping logic.
bnxt_en: Centralize logic to reserve rings.
bnxt_en: Use event bit map in RX path.
...
Chuck Anderson [Thu, 9 Mar 2017 07:34:05 +0000 (23:34 -0800)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
* topic/uek-4.1/upstream-cherry-picks:
Btrfs: fix crash on fsync when using overlayfs v4
vfio/pci: Hide broken INTx support from user
crypto: cryptd - Assign statesize properly
crypto: ghash-clmulni - Fix load failure
USB: digi_acceleport: do sanity checking for the number of ports
ksplice: add sysctls for determining Ksplice features.
signal: protect SIGNAL_UNKILLABLE from unintentional clearing.
Chuck Anderson [Thu, 9 Mar 2017 04:24:57 +0000 (20:24 -0800)]
Merge branch topic/uek-4.1/sparc of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
* topic/uek-4.1/sparc: (32 commits)
sparc: fix kernel panic caused by vio handshake
sparc64: Add sensible read values for /proc/<pid>/sparc_adi
sparc64: Add ability to set the mcde state for a process
sparc64: Add proc files specific to ADI
sparc64: add mcd_on_by_default
Revert "sparc: fix intermittent LDom hang waiting for vdc_port_up"
sparc64: Add support for ADI (Application Data Integrity)
sparc64: Add support for ADI register fields, ASIs and traps
mm: Add functions to support extra actions on swap in/out
signals, sparc: Add signal codes for ADI violations
sparc64: shut down to OBP correctly
sparc64: fix for user probes in high memory
sparc64: Use online cpus instead of present cpus during hotplug.
sparc64: Update cpumaps correctly during hotplug.
sparc: fix intermittent LDom hang waiting for vdc_port_up
arch/sparc: Add a dedicated clear_page and clear_user_page for M7
sparc64: perf: Enable dynamic tracepoints when using perf probe
SPARC64: UEK4 LDOMS DOMAIN SERVICES UPDATE 7
arch/sparc: Fix indexing msi_msiqid_table and msi_irq_table
arch/sparc: Clear msi_msiqid_table during teardown
...
Chuck Anderson [Thu, 9 Mar 2017 04:16:08 +0000 (20:16 -0800)]
Merge branch 'topic/uek-4.1/drivers' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
* topic/uek-4.1/drivers: (200 commits)
scsi: megaraid-sas: request irqs later
scsi: megaraid_sas: add in missing white spaces in error messages text
scsi: megaraid_sas: fix macro MEGASAS_IS_LOGICAL to avoid regression
scsi: megaraid_sas: driver version upgrade
scsi: megaraid_sas: Do not set MPI2_TYPE_CUDA for JBOD FP path for FW which does not support JBOD sequence map
scsi: megaraid_sas: Send SYNCHRONIZE_CACHE for VD to firmware
scsi: megaraid_sas: Do not fire DCMDs during PCI shutdown/detach
scsi: megaraid_sas: Send correct PhysArm to FW for R1 VD downgrade
scsi: megaraid_sas: For SRIOV enabled firmware, ensure VF driver waits for 30secs before reset
scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices
scsi: megaraid_sas: clean function declarations in megaraid_sas_base.c up
scsi: megaraid_sas: add in missing white space in error message text
scsi: megaraid_sas: Fix the search of first memory bar
scsi: megaraid_sas: Use memdup_user() rather than duplicating its implementation
megaraid_sas: Fix probing cards without io port
megaraid_sas: Do not fire MR_DCMD_PD_LIST_QUERY to controllers which do not support it
megaraid_sas: Downgrade two success messages to info
megaraid_sas: driver version upgrade
megaraid_sas: task management code optimizations
megaraid_sas: call ISR function to clean up pending replies in OCR path
...
Chuck Anderson [Thu, 9 Mar 2017 04:00:03 +0000 (20:00 -0800)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
* topic/uek-4.1/upstream-cherry-picks: (280 commits)
dm btree: fix bufio buffer leaks in dm_btree_del() error path
ipv4: keep skb->dst around in presence of IP options
ip6_gre: fix ip6gre_err() invalid reads
watchdog: hpwdt: changed maintainer information
watchdog: hpwdt: add support for iLO5
watchdog: hpwdt: remove email address from doc
watchdog: hpwdt: Adjust documentation to match latest kernel module parameters.
hpwdt: use nmi_panic() when kernel panics in NMI handler
panic: change nmi_panic from macro to function
watchdog/hpwdt: Fix build on certain configs
watchdog/hpwdt: Create stack frame in asminline_call()
x86/asm: Add C versions of frame pointer macros
x86/asm: Clean up frame pointer macros
watchdog: hpwdt: HP rebranding
panic, x86: Allow CPUs to save registers even if looping in NMI context
watchdog: hpwdt: Add support for WDIOC_SETOPTIONS
kvm: fix page struct leak in handle_vmon
bnx2: use READ_ONCE() instead of barrier()
bnx2: Wait for in-flight DMA to complete at probe stage
bnx2: fix locking when netconsole is used
...
Thomas Tai [Mon, 13 Feb 2017 14:50:05 +0000 (06:50 -0800)]
sparc: fix kernel panic caused by vio handshake
During an hours-long reboot test, the primary prints out multiple TX trigger
errors followed by a VIO handshake panic. The TX trigger error happens
because the primary ldmvsw detects that the ldc channel is down. In this
situation, the ldc operation is aborted and the tx and rx queues are then
flushed. The problem is that the rx queue may contain an LDC_EVENT_RESET
sent by the guest. It causes the primary to think that the ldc channel
is not in the reset state. When the guest comes up again, the handshake is
out of sequence and thus causes a handshake panic.
The TX trigger error would not have happened if the LDC_EVENT_RESET was
received before the TX checked the ldc link state. This is the reason
why the panic happens intermittently.
This patch checks for the connection reset and changes the ldc state to
reset. The reset logic is taken from the existing vnet_event_napi() ldc_ctrl:
code path.
Khalid Aziz [Fri, 2 Dec 2016 19:45:37 +0000 (12:45 -0700)]
sparc64: Add sensible read values for /proc/<pid>/sparc_adi
This patch makes the value read from /proc/<pid>/sparc_adi consistent
across platforms that support ADI and ones that do not. When ADI is
not available for a process, either because it is an anonymous
process on an ADI-capable platform or because it is running on a
non-ADI platform, a read from /proc/<pid>/sparc_adi always returns a
value of -1. This patch also updates the documentation file with
the values for the sparc_adi proc file.
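For illustration, a minimal userspace sketch that reads this proc file and treats -1 as "ADI not available", assuming the file holds a single integer as described above:

#include <stdio.h>

int main(int argc, char **argv)
{
        /* Read /proc/<pid>/sparc_adi; -1 means ADI is not available. */
        const char *pid = (argc > 1) ? argv[1] : "self";
        char path[64];
        long val;
        FILE *f;

        snprintf(path, sizeof(path), "/proc/%s/sparc_adi", pid);
        f = fopen(path, "r");
        if (!f || fscanf(f, "%ld", &val) != 1) {
                perror(path);
                return 1;
        }
        fclose(f);
        printf("%s = %ld%s\n", path, val,
               val == -1 ? " (ADI not available)" : "");
        return 0;
}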
Eric Snowberg [Thu, 17 Nov 2016 21:27:36 +0000 (13:27 -0800)]
sparc64: Add ability to set the mcde state for a process
Version checking (PSTATE.mcde) needs to be turned off to avoid tripping over
ADI versions in flux. This has been partially remedied by using non-faulting
loads.
However, there is still a need to turn off PSTATE.mcde in memory dump
functions. This is to determine if an address is readable. If the
address is unreadable, the dump shows the memory contents as "********"
instead of a 4-byte hex value.
Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Chuck Anderson [Sat, 18 Feb 2017 06:15:35 +0000 (22:15 -0800)]
sparc64: add mcd_on_by_default
Add the global variable mcd_on_by_default and support for the kernel boot arg
"mcd_on_by_default", which sets mcd_on_by_default = 1 if the kernel is
adi_capable().
Based on the code in commit:
sparc64: Enable Application Data Integrity for m7 and newer processors
Required by commit:
sparc64: Add proc files specific to ADI
Orabug: 22713162 Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Khalid Aziz [Wed, 15 Feb 2017 19:57:59 +0000 (12:57 -0700)]
sparc64: Add support for ADI (Application Data Integrity)
ADI is a new feature supported on SPARC M7 and newer processors to allow
hardware to catch rogue accesses to memory. ADI is supported for data
fetches only and not instruction fetches. An app can enable ADI on its
data pages, set version tags on them and use versioned addresses to
access the data pages. Upper bits of the address contain the version
tag. On M7 processors, upper four bits (bits 63-60) contain the version
tag. If a rogue app attempts to access ADI-enabled data pages, its
access is blocked and the processor generates an exception. Please see
Documentation/sparc/adi.txt for further details.
This patch extends mprotect to enable ADI (TSTATE.mcde), enable/disable
MCD (Memory Corruption Detection) on selected memory ranges, enable
TTE.mcd in PTEs, return ADI parameters to userspace and save/restore ADI
version tags on page swap out/in or migration. It also adds handlers for
traps related to MCD. ADI is not enabled by default for any task. A task
must explicitly enable ADI on a memory range and set a version tag for ADI
to be effective for the task.
This initial implementation supports saving and restoring one tag per
page. A page must use the same version tag across the entire page for the
tag to survive swap and migration. The swap support infrastructure in this
patch allows this capability to be expanded to store/restore more
than one tag per page in the future.
This is a backport of patch sent upstream and brings UEK code closer to
upstream patch v6.
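The versioned-address layout described above (tag in bits 63-60 on M7) can be illustrated in plain C. This is a sketch of the addressing scheme only; PROT_ADI and the MCD ASI stores that actually set tags in memory are not shown, and the helper names are illustrative:

#include <stdint.h>
#include <stdio.h>

/* Place a 4-bit ADI version tag in bits 63-60 of a virtual address. */
static inline uint64_t adi_tag_addr(uint64_t addr, unsigned int tag)
{
        return (addr & ~(0xfULL << 60)) | ((uint64_t)(tag & 0xf) << 60);
}

/* Recover the version tag from a versioned address. */
static inline unsigned int adi_addr_tag(uint64_t addr)
{
        return (unsigned int)(addr >> 60) & 0xf;
}

int main(void)
{
        uint64_t va = 0x0000700012345000ULL;      /* hypothetical mapping */
        uint64_t tagged = adi_tag_addr(va, 0xa);

        printf("va=0x%016llx tagged=0x%016llx tag=%u\n",
               (unsigned long long)va, (unsigned long long)tagged,
               adi_addr_tag(tagged));
        return 0;
}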
Khalid Aziz [Wed, 18 Jan 2017 17:59:26 +0000 (10:59 -0700)]
sparc64: Add support for ADI register fields, ASIs and traps
The SPARC M7 processor adds new control register fields, ASIs and a new
trap to support the ADI (Application Data Integrity) feature. This
patch adds definitions for these register fields, ASIs and a handler
for the new precise memory corruption detected trap.
This is a backport of patch sent upstream and brings UEK code in sync
with upstream patch v6.
Khalid Aziz [Wed, 18 Jan 2017 17:36:21 +0000 (10:36 -0700)]
mm: Add functions to support extra actions on swap in/out
If a processor supports special metadata for a page, for example ADI
version tags on SPARC M7, this metadata must be saved when the page is
swapped out. The same metadata must be restored when the page is swapped
back in. This patch adds two new architecture specific functions -
arch_do_swap_page() to be called when a page is swapped in,
arch_unmap_one() to be called when a page is being unmapped for swap
out.
This is a backport of patch sent upstream and brings UEK code in sync
with upstream patch v6.
Khalid Aziz [Thu, 5 Jan 2017 18:46:54 +0000 (11:46 -0700)]
signals, sparc: Add signal codes for ADI violations
The SPARC M7 processor introduces a new feature - Application Data
Integrity (ADI). ADI allows the MMU to catch rogue accesses to memory.
When a rogue access occurs, the MMU blocks the access and raises an
exception. In response to the exception, the kernel sends the offending
task a SIGSEGV with an si_code that indicates the nature of the exception.
This patch adds three new signal codes specific to the ADI feature:
1. ADI is not enabled for the address and the task attempted to access
memory using ADI.
2. The task attempted to access memory using the wrong ADI tag and caused
a deferred exception.
3. The task attempted to access memory using the wrong ADI tag and caused
a precise exception.
This is a backport of patch sent upstream and brings UEK code closer to
upstream patch v6.
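A task using ADI would typically install a SIGSEGV handler and inspect si_code to tell these three cases apart. The sketch below assumes the new codes are exposed under the upstream names SEGV_ACCADI, SEGV_ADIDERR and SEGV_ADIPERR; the checks are guarded so it still compiles where those are not defined:

#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Classify an ADI-related SIGSEGV by si_code; the constant names are
 * assumptions taken from the upstream patches and guarded accordingly. */
static void segv_handler(int sig, siginfo_t *si, void *ctx)
{
        const char *why = "other SIGSEGV";

        (void)sig; (void)ctx;
#ifdef SEGV_ACCADI
        if (si->si_code == SEGV_ACCADI)
                why = "ADI not enabled for this address";
#endif
#ifdef SEGV_ADIDERR
        if (si->si_code == SEGV_ADIDERR)
                why = "wrong ADI tag (deferred exception)";
#endif
#ifdef SEGV_ADIPERR
        if (si->si_code == SEGV_ADIPERR)
                why = "wrong ADI tag (precise exception)";
#endif
        fprintf(stderr, "SIGSEGV at %p: %s (si_code=%d)\n",
                si->si_addr, why, si->si_code);
        _exit(1);
}

int main(void)
{
        struct sigaction sa;

        memset(&sa, 0, sizeof(sa));
        sa.sa_sigaction = segv_handler;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, NULL);
        /* ... enable ADI on a range, set version tags, access memory ... */
        return 0;
}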
The command "shutdown -h -H now" should shut the system down to the
OBP, however the machine was being powered off in the LDOM case.
In the LDOM case, the "reboot-command" variable must be set to
the string "noop" and then ldom_reboot() must be called.
This will make the OBP ignore the setting of "auto-boot?" after it
completes the reset. This causes the system to stop at the ok prompt.
Signed-off-by: Larry Bassel <larry.bassel@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
When returning from the user probe code into the userspace process, PC and NPC
are truncated to 32 bits.
Because shared libraries are loaded very high in the virtual address space of
the process, placing a user probe inside a shared library makes the kernel
return into the process at the wrong address, causing it to segfault most of
the time.
This patch prevents PC and NPC from being truncated.
Signed-off-by: Eric Saint Etienne <eric.saint.etienne@oracle.com> Reviewed-by: David Aldridge <david.j.aldridge@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Atish Patra [Mon, 23 Jan 2017 21:40:35 +0000 (14:40 -0700)]
sparc64: Use online cpus instead of present cpus during hotplug.
As per the hotplug documentation, online cpu maps should be
updated if cpu hotplug happens via sysfs. Thus, all other
cpu maps should be updated based on the online cpus instead
of present cpus. The following example illustrates the issue
if cpu maps are updated based on present cpus.
[root@ca-sparc64 hackbench]# echo 1 > /sys/devices/system/cpu/cpu2/online
[root@ca-sparc64 hackbench]#
cat /sys/devices/system/cpu/cpu1/topology/core_siblings_list
0-255
[root@ca-sparc64 hackbench]#
cat /sys/devices/system/cpu/cpu1/topology/thread_siblings_list
0-7
This is wrong because cpu0 is still offline.
After the fix:
[root@ca-sparc64 hackbench]#
cat /sys/devices/system/cpu/cpu1/topology/core_siblings_list
1-255
[root@ca-sparc64 hackbench]#
cat /sys/devices/system/cpu/cpu1/topology/thread_siblings_list
1-7
Signed-off-by: Atish Patra <atish.patra@oracle.com> Reviewed-by: Chris Hyser <chris.hyser@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Atish Patra [Mon, 23 Jan 2017 21:39:24 +0000 (14:39 -0700)]
sparc64: Update cpumaps correctly during hotplug.
Currently, numa_cpu_mask is not updated when cpus are hotplugged,
resulting in an incorrect number of cpus reported by lscpu/numactl.
Moreover, cpu_core_sib_cache_map is also not cleared when a cpu goes
offline.
Update both masks correctly whenever a cpu goes online/offline.
Thomas Tai [Tue, 17 Jan 2017 19:43:32 +0000 (11:43 -0800)]
sparc: fix intermittent LDom hang waiting for vdc_port_up
When an LDom boots, sunvdc probes the disk using the LDC channel.
If the channel was previously configured, we need to wait for
the channel state to change from UP to RESETTING so that the
seqid is properly reset in the primary. Otherwise the primary
will expect that the ldc packet contains a seqid other than 0.
Also disable the ldc hypervisor interrupt before calling vio_port_up,
because interrupts can happen once ldc_bind is called. Disabling the
interrupt ensures everything is configured before an interrupt request
arrives.
Signed-off-by: Thomas Tai <thomas.tai@oracle.com> Reviewed-By: Liam Merwick <Liam.Merwick@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Babu Moger [Wed, 18 Jan 2017 01:21:44 +0000 (17:21 -0800)]
arch/sparc: Add a dedicated clear_page and clear_user_page for M7
Add a dedicated clear_page and clear_user_page for M7.
This avoids multiple checks which are really not required and
eliminates about 30 instructions for each call.
A 3 to 4 percent latency reduction has been seen in some cases.
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Signed-off-by: Eric Saint Etienne <eric.saint.etienne@oracle.com> Reviewed-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
A couple of indexing fixes:
1. Fix indexing of pbm->msi_msiqid_table. It is initialized
based on pbm->msi_first (not pbm->msiq_first as previously done).
Here is how it is initialized (see sparc64_setup_msi_irq):
pbm->msi_msiqid_table[msi - pbm->msi_first] = msiqid;
2. In set_related_affinity, we don't need to subtract msi_first as
the loop is indexed from 0 to the size of the table.
This saves time when smp_flush_tlb_page/smp_flush_tlb_pending
is called during do_exit(...). Without this patch, killing
processes had a performance bottleneck in these functions
due to unnecessary xcalls made to flush TLBs.
Reviewed-by: Nitin Gupta <nitin.m.gupta@oracle.com> Signed-off-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: Henry Willard <henry.willard@oracle.com> Signed-off-by: Sanath Kumar <sanath.s.kumar@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com>
Liam R. Howlett [Thu, 5 Jan 2017 20:58:41 +0000 (15:58 -0500)]
sparc64: Zero pages on allocation for mondo and error queues.
Error queues use a non-zero first word to detect if the queues are full.
Using pages that have not been zeroed may result in false positive
overflow events. These queues are set up once during boot so zeroing
all mondo and error queue pages is safe.
Note that this does not always occur because the page allocation for
these queues is so early in the boot cycle that higher-numbered CPUs get
fresh pages. It is only when traps are serviced on lower-numbered CPUs
that were given already-used pages that this issue is exposed.
Liam R. Howlett [Thu, 22 Dec 2016 02:57:42 +0000 (21:57 -0500)]
sparc64: Don't panic on user mode non-resumable errors
Send a SIGBUS to the offending process on all userspace non-resumable
traps. This prevents userspace applications from creating a kernel
panic. The siginfo will return the code BUS_ADRERR and a valid address
if possible.
David S. Miller [Thu, 27 Oct 2016 16:04:54 +0000 (09:04 -0700)]
sparc64: Handle extremely large kernel TLB range flushes more gracefully.
When the vmalloc area gets fragmented, and because the firmware
mapping area sits between where modules live and the vmalloc area, we
can sometimes receive requests for enormous kernel TLB range flushes.
When this happens the cpu just spins flushing billions of pages and
this triggers the NMI watchdog and other problems.
We took care of this on the TSB side by doing a linear scan of the
table once we pass a certain threshold.
Do something similar for the TLB flush, however we are limited by
the TLB flush facilities provided by the different chip variants.
First of all, we use a (mostly arbitrary) cut-off of 256K, which is
about 32 pages. This can be tuned in the future.
The huge range code path for each chip works as follows:
1) On spitfire we flush all non-locked TLB entries using diagnostic
accesses.
2) On cheetah we use the "flush all" TLB flush.
3) On sun4v/hypervisor we do a TLB context flush on context 0, which
unlike previous chips does not remove "permanent" or locked
entries.
We could probably do something better on spitfire, such as limiting
the flush to kernel TLB entries or even doing range comparisons.
However that probably isn't worth it since those chips are old and
the TLB only had 64 entries.
Reported-by: James Clarke <jrtc27@jrtc27.com> Tested-by: James Clarke <jrtc27@jrtc27.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a74ad5e660a9ee1d071665e7e8ad822784a2dc7f) Signed-off-by: Allen Pais <allen.pais@oracle.com>
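The decision described above reduces to a size threshold: ranges up to roughly 256K are flushed page by page, anything larger takes the per-chip "flush everything" path. The following is a userspace illustration of that cutoff only, with stub routines standing in for the chip-specific flush operations:

#include <stdio.h>

#define FLUSH_ALL_THRESHOLD (256UL * 1024)  /* the "mostly arbitrary" 256K cutoff */

/* Stub for the per-page flush path. */
static void flush_one_page(unsigned long va)
{
        printf("flush TLB entry for 0x%lx\n", va);
}

/* Stub for the chip-specific "flush everything" path. */
static void flush_all_kernel_tlb(void)
{
        printf("flush all (non-locked) kernel TLB entries\n");
}

static void flush_kernel_range(unsigned long start, unsigned long end,
                               unsigned long page_size)
{
        if (end - start > FLUSH_ALL_THRESHOLD) {
                flush_all_kernel_tlb();
                return;
        }
        for (unsigned long va = start; va < end; va += page_size)
                flush_one_page(va);
}

int main(void)
{
        flush_kernel_range(0x100000, 0x110000, 0x2000);     /* small: per page */
        flush_kernel_range(0x100000, 0x40100000, 0x2000);   /* huge: flush all */
        return 0;
}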
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a236441bb69723032db94128761a469030c3fe6d) Signed-off-by: Allen Pais <allen.pais@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 830cda3f9855ff092b0e9610346d110846fc497c) Signed-off-by: Allen Pais <allen.pais@oracle.com>
David S. Miller [Wed, 26 Oct 2016 02:43:17 +0000 (19:43 -0700)]
sparc64: Handle extremely large kernel TSB range flushes sanely.
If the number of pages we are flushing is more than twice the number
of entries in the TSB, just scan the TSB table for matches rather
than probing each and every page in the range.
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 849c498766060a16aad5b0e0d03206726e7d2fa4) Signed-off-by: Allen Pais <allen.pais@oracle.com>
David S. Miller [Tue, 25 Oct 2016 23:23:26 +0000 (16:23 -0700)]
sparc64: Fix illegal relative branches in hypervisor patched TLB code.
When we copy code over to patch another piece of code, we can only use
PC-relative branches that target code within that piece of code.
Such PC-relative branches cannot be made to external symbols because
the patch moves the location of the code and thus modifies the
relative address of external symbols.
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b429ae4d5b565a71dfffd759dfcd4f6c093ced94) Signed-off-by: Allen Pais <allen.pais@oracle.com>
2. CPU DR-related problems, including 'length too big' errors and hangs. With
these new fixes, more than 256 vcpus can be successfully added to or removed
from a guest domain. As part of this fix, a new scheme for reusing event data memory
buffers was implemented.
Signed-off-by: Aaron Young <Aaron.Young@oracle.com> Reviewed-By: Liam Merwick <Liam.Merwick@oracle.com>
Orabug: 23171935, 24848179 Signed-off-by: Allen Pais <allen.pais@oracle.com>
The VMWare EFI BIOS will expose port 0x5658 as an ACPI resource. This
causes the port to be reserved by the ACPI module as the system comes up,
making it unavailable to be reserved again by other drivers, thus
preserving this VMWare port for special use in a VMWare guest.
This port is designed to be shared among multiple VMWare services, such as
the VMMOUSE. Because of this, VMMOUSE should not try to reserve this port
on its own.
The VMWare non-EFI BIOS does not do this to preserve compatibility with
existing/legacy VMs. It is known that there is a small chance a VM may be
configured such that these ports get reserved by other non-VMWare devices,
and if this ever happens, the result is undefined.
Signed-off-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: <stable@vger.kernel.org> # 4.1- Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
(cherry picked from commit 60842ef8128e7bf58c024814cd0dc14319232b6c) Signed-off-by: Brian Maly <brian.maly@oracle.com>
We should set the device's capabilities first and then register it;
otherwise, various handlers already present in the kernel will not be
able to connect to the device.
Reported-by: Lauri Kasanen <cand@gmx.com> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
(cherry picked from commit d4f1b06d685d11ebdaccf11c0db1cb3c78736862) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Use eth_hw_addr_random() to set a random MAC address in order to make
sure bp->dev->addr_assign_type will be properly set to NET_ADDR_RANDOM.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 1faaa78f36cb2915ae89138ba5846f87ade85dcb) Signed-off-by: Brian Maly <brian.maly@oracle.com>
In the bnxt_init_one() failure path, bar1 and bar2 are not
being unmapped. This commit fixes this issue. Reorganize the
code so that bnxt_init_one()'s failure path and bnxt_remove_one()
can call the same function to do the PCI cleanup.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 17086399c113d933e1202697f85b8f0f82fcb8ce) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt.c
If bnxt_hwrm_ring_free() is called during a failure path in bnxt_open(),
it is possible that the completion rings have not been allocated yet.
In that case, the completion doorbell has not been initialized, and
calling bnxt_disable_int() will crash. Fix it by checking that the
completion ring has been initialized before writing to the completion
ring doorbell.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit daf1f1e7841138cb0e48d52c8573a5f064d8f495) Signed-off-by: Brian Maly <brian.maly@oracle.com>
There are additional SoC devices that use the same device ID for
bridge and NIC devices. The bnxt driver should reject probe against
all bridge devices since it's meant to be used with only endpoint
devices.
Signed-off-by: Ray Jui <ray.jui@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 4e00338a61998de3502d0428c4f71ffc69772316) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Deepak Khungar <deepak.khungar@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 32b40798c1b40343641f04cdfd09652af70ea0e9) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Add proper punctuation to make the message clearer.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b451c8b69e70de299aa6061e1fa6afbb4d7c1f9e) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Print FEC (Forward Error Correction) autoneg and encoding settings during
link up.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e70c752f88ed23e6a0f081fa408282c2450c8ce9) Signed-off-by: Brian Maly <brian.maly@oracle.com>
If it is a VF or an NPAR function, the firmware call to setup the PHY
will fail. Adding this check will prevent unnecessary firmware calls
to setup the PHY unless calling from the PF. This will also eliminate
many unnecessary warning messages when the call from a VF or NPAR fails.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 33dac24abbd5a77eefca18fb7ebbd01a3cf1b343) Signed-off-by: Brian Maly <brian.maly@oracle.com>
If skb_flow_dissect_flow_keys() returns with the encapsulation flag
set, pass the information to the firmware to setup the NTUPLE filter
accordingly.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 61aad724ec0a685bc83b02b059a3ca0ad3bde6b0) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt.c
Commit ae10ae740ad2 ("bnxt_en: Add new hardware RFS mode.") has added
code to allow NTUPLE to be enabled on VFs. So we now remove the
BNXT_VF() check in rfs_capable() to allow NTUPLE on VFs.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 964fd4801d40ead69a447482c0dd0cd4be495e47) Signed-off-by: Brian Maly <brian.maly@oracle.com>
With commit d1e7925e6d80 ("bnxt_en: Centralize logic to reserve rings."),
ring allocation for combined rings has become stricter. A combined
ring must now have an rx-tx ring pair. The pre-set max. for combined
rings should now be min(rx, tx).
Fixes: d1e7925e6d80 ("bnxt_en: Centralize logic to reserve rings.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a79a5276aa2f844bd368c1d3d5a625e1fbefd989) Signed-off-by: Brian Maly <brian.maly@oracle.com>
If the HWRM_NVM_INSTALL_UPDATE command fails with the error code
NVM_INSTALL_UPDATE_CMD_ERR_CODE_FRAG_ERR, retry the command with
a new flag to allow defragmentation. Since we are checking the
response for error code, we also need to take the mutex until
we finish reading the response.
Signed-off-by: Kshitij Soni <kshitij.soni@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit cb4d1d6261453677feb54e7a09c23fc7648dd6bc) Signed-off-by: Brian Maly <brian.maly@oracle.com>
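The retry described above is a straightforward issue/check/reissue pattern: run the command, and if the firmware reports the fragmentation error, reissue it once with the defragmentation flag set. The sketch below shows only that pattern; the constant and function names are placeholders rather than the real HWRM definitions:

#include <stdio.h>

#define ERR_FRAG           1    /* placeholder for the fragmentation error code */
#define FLAG_ALLOW_DEFRAG  0x1  /* placeholder for the defragmentation flag */

/* Stub firmware call: first attempt reports that defragmentation is needed. */
static int fw_nvm_install(unsigned int flags, int *err_code)
{
        if (!(flags & FLAG_ALLOW_DEFRAG)) {
                *err_code = ERR_FRAG;
                return -1;
        }
        *err_code = 0;
        return 0;
}

static int nvm_install_update(void)
{
        int err_code = 0;
        int rc = fw_nvm_install(0, &err_code);

        /* Retry once with the defragmentation flag on a fragmentation error. */
        if (rc && err_code == ERR_FRAG)
                rc = fw_nvm_install(FLAG_ALLOW_DEFRAG, &err_code);
        return rc;
}

int main(void)
{
        printf("install rc=%d\n", nvm_install_update());
        return 0;
}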
The new spec has NVRAM defragmentation support which will be used in
the next patch to improve ethtool flash operation.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit bac9a7e0f5d6da82478d5e0a2a236158f42d5757) Signed-off-by: Brian Maly <brian.maly@oracle.com>
XDP_TX requires a different function to handle completion. Add a
function pointer to handle tx completion logic. Regular TX rings
will be assigned the current bnxt_tx_int() for the ->tx_int()
function pointer.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit fa3e93e86cc3d1809fba67cb138883ed4bb74a5f) Signed-off-by: Brian Maly <brian.maly@oracle.com>
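The refactor amounts to dispatching completion handling through a per-ring function pointer, so that an XDP TX ring can later install its own handler while regular rings keep the existing one. A minimal sketch of that shape; apart from the ->tx_int() member named in the commit text, the type and handler names are hypothetical:

#include <stdio.h>

struct tx_ring {
        const char *name;
        /* Per-ring completion handler, as in the ->tx_int() described above. */
        void (*tx_int)(struct tx_ring *txr, int nr_pkts);
};

static void regular_tx_int(struct tx_ring *txr, int nr_pkts)
{
        printf("%s: regular completion of %d packets\n", txr->name, nr_pkts);
}

static void xdp_tx_int(struct tx_ring *txr, int nr_pkts)
{
        printf("%s: XDP completion of %d packets\n", txr->name, nr_pkts);
}

int main(void)
{
        struct tx_ring rings[2] = {
                { .name = "tx0",     .tx_int = regular_tx_int },
                { .name = "tx0-xdp", .tx_int = xdp_tx_int },
        };

        for (int i = 0; i < 2; i++)
                rings[i].tx_int(&rings[i], 8);   /* dispatch per ring type */
        return 0;
}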
Add logic for an extra set of TX rings for XDP. If enabled, this
set of TX rings equals the number of RX rings and shares the same
IRQ as the RX ring set. A new field bp->tx_nr_rings_xdp is added
to keep track of these TX XDP rings. Adjust all other relevant functions
to handle bp->tx_nr_rings_xdp.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5f4492493e75dafc5cbb96eabe0f146c2ffb1e3d) Signed-off-by: Brian Maly <brian.maly@oracle.com>
To support XDP_TX, we need to add a set of dedicated TX rings, each
associated with the NAPI of an RX ring. To assign XDP rings and regular
rings in a flexible way, we add a bp->tx_ring_map[] array to do the
remapping. The netdev txq index is stored in the new field txq_index
so that we can retrieve the netdev txq when handling TX completions.
In this patch, before we introduce XDP_TX, the mapping is 1:1.
v2: Fixed a bug in bnxt_tx_int().
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a960dec98861b009b4227d2ae3b94a142c83eb96) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Currently, bnxt_setup_tc() and bnxt_set_channels() have similar and
duplicated code to check and reserve rx and tx rings. Add a new
function bnxt_reserve_rings() to centralize the logic. This will
make it easier to add XDP_TX support which requires allocating a
new set of TX rings.
Also, the tx ring checking logic in bnxt_setup_msix() can be removed.
The rings have been reserved beforehand.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit d1e7925e6d80ce5f9ef6deb8f3cec7526f5c443c) Signed-off-by: Brian Maly <brian.maly@oracle.com>
In the current code, we have separate rx_event and agg_event parameters
to keep track of rx and aggregation events. Combine these events into
a u8 event mask with different bits defined for different events. This
way, it is easier to expand the logic to include XDP tx events.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 4e5dbbda4c40a239e2ed4bbc98f2aa320e4dcca2) Signed-off-by: Brian Maly <brian.maly@oracle.com>
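Replacing separate rx_event/agg_event flags with a single bit mask looks like the sketch below. The commit only states that the mask is a u8 with one bit per event type, so the bit names here are placeholders:

#include <stdint.h>
#include <stdio.h>

#define RX_EVENT   0x01u   /* rx work seen */
#define AGG_EVENT  0x02u   /* aggregation work seen */
#define TX_EVENT   0x04u   /* room for XDP tx events later */

int main(void)
{
        uint8_t event = 0;

        event |= RX_EVENT;
        event |= AGG_EVENT;

        if (event & (RX_EVENT | AGG_EVENT))
                printf("handle rx/agg work (mask=0x%02x)\n", event);
        if (event & TX_EVENT)
                printf("handle XDP tx work\n");
        return 0;
}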
This mode is to support XDP. In this mode, each rx ring is configured
with page sized buffers for linear placement of each packet. MTU will be
restricted to what the page sized buffers can support.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c61fb99cae51958a9096d8540c8c05e74cfa7e59) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt.c
Convert the global constants BNXT_RX_OFFSET and BNXT_RX_DMA_OFFSET to
device parameters. This will make it easier to support XDP with
headroom support which requires different RX buffer offsets.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b3dba77cf0acb6e44b368979026df975658332bc) Signed-off-by: Brian Maly <brian.maly@oracle.com>
When the driver is running in XDP mode, rx buffers are DMA mapped as
DMA_BIDIRECTIONAL. Add a field so the code will map/unmap rx buffers
according to this field.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 745fc05c9db1f17da076861c7f57507e13f28a3a) Signed-off-by: Brian Maly <brian.maly@oracle.com>
To support XDP_TX, we need the RX buffer's DMA address to transmit the
packet. Convert the DMA address field to a permanent field in
bnxt_sw_rx_bd.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 11cd119d31a71b37c2362fc621f225e2aa12aea1) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Minor refactoring of bnxt_rx_skb() so that it can easily be replaced by
a new function that handles packets in a single page. Also, use a
function pointer bp->rx_skb_func() to switch to a new function when
we add the new mode in the next patch.
Add a new field data_ptr that points to the packet data in the
bnxt_sw_rx_bd structure. The original data field is changed to void
pointer so that it can either hold the kmalloc'ed data or a page
pointer.
The last parameter of bnxt_rx_skb(), which was the length parameter, is
changed to include the payload offset of the packet in the upper 16 bits.
The offset is needed to support the rx page mode and is not used in
this existing function.
v3: Added a new data_ptr parameter to bp->rx_skb_func(). The caller
has the option to modify the starting address of the packet. This
will be needed when XDP with headroom support is added.
v2: Changed the name of the last parameter to offset_and_len to make the
code more clear.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6bb19474391d17954fee9a9997ecca25b35dfd46) Signed-off-by: Brian Maly <brian.maly@oracle.com>
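Packing the payload offset into the upper 16 bits of the former length parameter is plain bit arithmetic; a self-contained sketch with illustrative helper names:

#include <stdint.h>
#include <stdio.h>

/* Pack a 16-bit payload offset into the upper half of offset_and_len. */
static inline uint32_t pack_offset_and_len(uint16_t offset, uint16_t len)
{
        return ((uint32_t)offset << 16) | len;
}

static inline uint16_t unpack_offset(uint32_t offset_and_len)
{
        return (uint16_t)(offset_and_len >> 16);
}

static inline uint16_t unpack_len(uint32_t offset_and_len)
{
        return (uint16_t)(offset_and_len & 0xffff);
}

int main(void)
{
        uint32_t v = pack_offset_and_len(14, 1500);

        printf("offset=%u len=%u\n", unpack_offset(v), unpack_len(v));
        return 0;
}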
bnxt_get_port_module_status() calls bnxt_update_link() which expects
RTNL to be held. In bnxt_sp_task() that does not hold RTNL, we need to
call it with a prior call to bnxt_rtnl_lock_sp() and the call needs to
be moved to the end of bnxt_sp_task().
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 90c694bb71819fb5bd3501ac397307d7e41ddeca) Signed-off-by: Brian Maly <brian.maly@oracle.com>
bnxt_update_link() is called from multiple code paths. Most callers,
such as open, ethtool, already hold RTNL. Only the caller bnxt_sp_task()
does not. So it is a bug to take RTNL inside bnxt_update_link().
Fix it by removing the RTNL inside bnxt_update_link(). The function
now expects the caller to always hold RTNL.
In bnxt_sp_task(), call bnxt_rtnl_lock_sp() before calling
bnxt_update_link(). We also need to move the call to the end of
bnxt_sp_task() since it will be clearing the BNXT_STATE_IN_SP_TASK bit.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 0eaa24b971ae251ae9d3be23f77662a655532063) Signed-off-by: Brian Maly <brian.maly@oracle.com>
In bnxt_sp_task(), we set a bit BNXT_STATE_IN_SP_TASK so that bnxt_close()
will synchronize and wait for bnxt_sp_task() to finish. Some functions
in bnxt_sp_task() require us to clear BNXT_STATE_IN_SP_TASK and then
acquire rtnl_lock() to prevent race conditions.
There are some bugs related to this logic. This patch refactors the code
to have common bnxt_rtnl_lock_sp() and bnxt_rtnl_unlock_sp() to handle
the RTNL and the clearing/setting of the bit. Multiple functions will
need the same logic. We also need to move bnxt_reset() to the end of
bnxt_sp_task(). Functions that clear BNXT_STATE_IN_SP_TASK must be the
last functions to be called in bnxt_sp_task(). The common scheme will
handle the condition properly.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a551ee94ea723b4af9b827c7460f108bc13425ee) Signed-off-by: Brian Maly <brian.maly@oracle.com>
In the TPA GRO code path, initialize the tcp_opt_len variable to 0 so
that it will be correct for packets without TCP timestamps. The bug
caused the SKB fields to be incorrectly set up for packets without
TCP timestamps, leading to these packets being rejected by the stack.
Reported-by: Andy Gospodarek <andrew.gospodarek@broadocm.com> Acked-by: Andy Gospodarek <andrew.gospodarek@broadocm.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 719ca8111402aa6157bd83a3c966d184db0d8956) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Add the ulp_sriov_cfg callbacks when the number of VFs is changing. This
allows the RDMA driver to provision RDMA resources for the VFs.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 2f5938467bd7f34e59a1d6d3809f5970f62e194b) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Add LED blinking code to support ethtool -p on the PF.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5ad2cbeed74bd1e89ac4ba14288158ec7eb167da) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f183886c0d798ca3cf0a51e8cab3c1902fbd1e8b) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Commit bdbd1eb59c56 ("bnxt_en: Handle no aggregation ring gracefully.")
introduced the BNXT_FLAG_NO_AGG_RINGS flag. For consistency,
bnxt_set_tpa_flags() should also clear TPA flags when there are no
aggregation rings.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 341138c3e6afa8e77f9f3e773d72b37022dbcee8) Signed-off-by: Brian Maly <brian.maly@oracle.com>
CC [M] drivers/net/ethernet/broadcom/bnxt/bnxt.o
drivers/net/ethernet/broadcom/bnxt/bnxt.c:4947:21: warning: ‘bnxt_get_max_func_rss_ctxs’ defined but not used [-Wunused-function]
static unsigned int bnxt_get_max_func_rss_ctxs(struct bnxt *bp)
^
CC [M] drivers/net/ethernet/broadcom/bnxt/bnxt.o
drivers/net/ethernet/broadcom/bnxt/bnxt.c:4956:21: warning: ‘bnxt_get_max_func_vnics’ defined but not used [-Wunused-function]
static unsigned int bnxt_get_max_func_vnics(struct bnxt *bp)
^
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b742995445fbac874f5fe19ce2afc76c7a6ac2cf) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3f0d80b6d228f11117244a16d2e17ea684b540f5) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The current code assumes that we will always have at least 2 rx rings, 1
of which will be used as an aggregation ring for TPA and jumbo page placements.
However, it is possible, especially on a VF, that there is only 1 rx
ring available. In this scenario, the current code will fail to initialize.
To handle it, we need to properly set up only 1 ring without aggregation.
Set a new flag BNXT_FLAG_NO_AGG_RINGS for this condition and add logic to
set up the chip to place RX data linearly into a single buffer per packet.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit bdbd1eb59c565c56a74d21076e2ae8706de00ecd) Signed-off-by: Brian Maly <brian.maly@oracle.com>
With the added support for the bnxt_re RDMA driver, both drivers can be
allocating completion rings in any order. The firmware does not know
which completion ring should be receiving async events. Add an
extra step to tell firmware the completion ring number for receiving
async events after bnxt_en allocates the completion rings.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 486b5c22ea1d35e00e90dd79a32a9ee530b18915) Signed-off-by: Brian Maly <brian.maly@oracle.com>
In order to properly support TX rate limiting in SRIOV VF functions or
NPAR functions, firmware needs better control over tx ring allocations.
The new scheme requires the driver to reserve the number of tx rings
and to query to see if the requested number of tx rings is reserved.
The driver will use the new scheme when the firmware interface spec is
1.6.1 or newer.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 391be5c2736456f032fe0265031ecfe17aee84a0) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Accept ipv6 flows in .ndo_rx_flow_steer() and support ETHTOOL_GRXCLSRULE
ipv6 flows.
Signed-off-by: Michael Chan <michael.chan@broadocm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit dda0e7465f040ed814d4a5c98c6bf042e59cba69) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt.c
drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
Assign additional vnics to VFs whenever possible so that NTUPLE can be
supported on the VFs.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 8427af811a2fcbbf0c71a4b1f904f2442abdcf39) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The existing hardware RFS mode uses one hardware RSS context block
per ring just to calculate the RSS hash. This is very wasteful and
prevents VF functions from using it. The new hardware mode shares
the same hardware RSS context for RSS placement and RFS steering.
This allows VFs to enable RFS.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit ae10ae740ad2befd92b6f5b2ab39220bce6e5da2) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Add function bnxt_rfs_supported() that determines if the chip supports
RFS. Refactor the existing function bnxt_rfs_capable() that determines
if run-time conditions support RFS.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 8079e8f107bf02e1e5ece89239dd2fb475a4735f) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The new vnic RSS capability will enhance NTUPLE support, to be added
in subsequent patches.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 8fdefd63c203d9b2955d679704f4ed92bf40752c) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Call tcp_gro_complete() in the common code path instead of the chip-
specific method. The newer 5731x method is missing the call.
Signed-off-by: Michael Chan <michael.chan@broadcmo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5910906ca9ee32943f67db24917f78a9ad1087db) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The advertising field is closely related to the auto_link_speeds field.
The former is the user setting while the latter is the firmware setting.
Both should be u16. We should use the advertising field in
bnxt_get_link_ksettings because the auto_link_speeds field may not
be updated with the latest from the firmware yet.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 68515a186cf8a8f97956eaea5829277752399f58) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The IRQ is disabled by writing to the completion ring doorbell. This
should be done before the hardware completion ring is freed for correctness.
The current code disables IRQs after all the completion rings are freed.
Fix it by calling bnxt_disable_int_sync() before freeing the completion
rings. Rearrange the code to avoid forward declaration.
Signed-off-by: Michael Chan <michael.chan@broadocm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 9d8bc09766f1a229b2d204c713a1cfc6c7fa1bb1) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Use native NAPI polling instead. The next patch will complete the work
by switching to napi_complete_done().
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b356a2e729cec145a648d22ba5686357c009da25) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt.c
I got a warning about broken code on ARM64 with 64K pages:
drivers/net/vmxnet3/vmxnet3_drv.c: In function 'vmxnet3_rq_init':
drivers/net/vmxnet3/vmxnet3_drv.c:1679:29: error: large integer implicitly truncated to unsigned type [-Werror=overflow]
rq->buf_info[0][i].len = PAGE_SIZE;
'len' here is a 16-bit integer, so this clearly won't work. I don't think
this driver is used much on anything other than x86, so there is no need
to fix this properly and we can work around it with a Kconfig dependency
to forbid known-broken configurations. qemu in theory supports it on
other architectures too, but presumably only for compatibility with x86
guests that also run on vmware.
CONFIG_PAGE_SIZE_64KB is used on hexagon, mips, sh and tile; the other
symbols are architecture-specific names for the same thing.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit fbdf0e28d061708cf18ba0f8e0db5360dc9a15b9) Signed-off-by: Brian Maly <brian.maly@oracle.com>
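The overflow behind the warning is easy to reproduce in isolation: a 64K page size does not fit in a 16-bit len field, so the assignment silently truncates. A standalone demonstration:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
        uint32_t page_size_4k  = 4096;
        uint32_t page_size_64k = 65536;
        uint16_t len;

        len = (uint16_t)page_size_4k;
        printf("PAGE_SIZE=%u -> len=%u\n", page_size_4k, len);

        len = (uint16_t)page_size_64k;   /* truncates to 0 */
        printf("PAGE_SIZE=%u -> len=%u\n", page_size_64k, len);
        return 0;
}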
vmxnet3_set_mc() checks new_table_pa returned by dma_map_single()
with dma_mapping_error(), but even there it assumes zero is an invalid pa
(it assumes dma_mapping_error(...,0) returns true if new_table is NULL).
The patch adds an explicit variable to track status of new_table_pa.
Found by Linux Driver Verification project (linuxtesting.org).
v2: use "bool" and "true"/"false" for boolean variables. Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit fb5c6cfaec126d9a96b9dd471d4711bf4c737a6f) Signed-off-by: Brian Maly <brian.maly@oracle.com>
vmxnet3_reset_work() expects tx queues to be stopped (via
vmxnet3_quiesce_dev -> netif_tx_disable). However, this races with the
netif_wake_queue() call in netif_tx_timeout() such that the driver's
start_xmit routine may be called unexpectedly, triggering one of the BUG_ON
in vmxnet3_map_pkt with a stack trace like this:
Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 277964e19e1416ca31301e113edb2580c81a8b66) Signed-off-by: Brian Maly <brian.maly@oracle.com>
drivers/net/vmxnet3/vmxnet3_drv.c:1645:1: warning:
symbol 'vmxnet3_rq_destroy_all_rxdataring' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit bb40aca7cf153e3e2941140d3850a4b6c4205ccb) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Commit 3c8b3efc061a ("vmxnet3: allow variable length transmit data ring
buffer") changed the size of the buffers in the tx data ring from a
fixed size of 128 bytes to a variable size.
However, while copying data to the data ring, vmxnet3_copy_hdr continues
to carry the old code that assumes a fixed buffer size of 128. This patch
fixes it by adding the correct offset based on the actual data ring buffer
size.
Signed-off-by: Guolin Yang <gyang@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit ff2e7d5d51469e98196f7933c83b781e96517e7c) Signed-off-by: Brian Maly <brian.maly@oracle.com>
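The fix boils down to computing each buffer's address with the configured buffer size instead of the old hard-coded 128 bytes. A small sketch of that computation; the names are placeholders for the driver's actual fields:

#include <stdint.h>
#include <stdio.h>

#define OLD_FIXED_BUF_SIZE 128u

/* Address of buffer 'idx' in the tx data ring for a given buffer size. */
static uintptr_t data_ring_buf(uintptr_t ring_base, unsigned int idx,
                               unsigned int buf_size)
{
        return ring_base + (uintptr_t)idx * buf_size;   /* not idx * 128 */
}

int main(void)
{
        uintptr_t base = 0x1000;        /* pretend data-ring base address */
        unsigned int desc_size = 256;   /* size reported by the emulation */

        printf("old (fixed 128): 0x%lx\n",
               (unsigned long)data_ring_buf(base, 3, OLD_FIXED_BUF_SIZE));
        printf("new (variable):  0x%lx\n",
               (unsigned long)data_ring_buf(base, 3, desc_size));
        return 0;
}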
With all vmxnet3 version 3 changes incorporated in the vmxnet3 driver,
the driver can configure emulation to run at vmxnet3 version 3, provided
the emulation advertises support for version 3.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6af9d787459e3ea32da446fe0aa76cff65fd2f8c) Signed-off-by: Brian Maly <brian.maly@oracle.com>
In vmxnet3 version 3, the emulation added support for the vmxnet3 driver
to communicate information about the memory regions the driver will use
for rx/tx buffers. The driver can also indicate which rx/tx queue the
memory region is applicable for. If this information is communicated
to the emulation, the emulation will always keep these memory regions
mapped, thereby avoiding the mapping/unmapping overhead for every packet.
Currently, the Linux vmxnet3 driver does not leverage this capability. The
feasibility of using this approach for the Linux vmxnet3 driver will be
investigated independently and if possible, will be part of a different
patch. This patch only exposes the emulation capability to the driver
(vmxnet3_defs.h is identical between the driver and the emulation).
Signed-off-by: Guolin Yang <gyang@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 474432229f9482f0f4a2732f2e130dc48247f1d7) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The emulation supports a variety of coalescing modes viz. disabled
(no coalescing), adaptive, static (number of packets to batch before
raising an interrupt), rate based (number of interrupts per second).
This patch implements get_coalesce and set_coalesce methods to allow
querying and configuring different coalescing modes.
Signed-off-by: Keyong Sun <sunk@vmware.com> Signed-off-by: Manoj Tammali <tammalim@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 4edef40ef5f8d09a0b1ded4d1d9b0e988cd98e97) Signed-off-by: Brian Maly <brian.maly@oracle.com>
vmxnet3 driver preallocates buffers for receiving packets and posts the
buffers to the emulation. In order to deliver a received packet to the
guest, the emulation must map buffer(s) and copy the packet into it.
To avoid this memory mapping overhead, this patch introduces the receive
data ring - a set of small sized buffers that are always mapped by
the emulation. If a packet fits into the receive data ring buffer, the
emulation delivers the packet via the receive data ring (which must be
copied by the guest driver), or else the usual receive path is used.
Receive Data Ring buffer length is configurable via ethtool -G ethX rx-mini
Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 50a5ce3e7116a70edb7a1d1d209e3bc537752427) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The vmxnet3 driver supports a transmit data ring, viz. a set of fixed-size
buffers used by the driver to copy packet headers. Small packets that
fit these buffers are copied into these buffers entirely.
Currently this buffer size is fixed at 128 bytes. This patch extends
transmit data ring implementation to allow variable length transmit
data ring buffers. The length of the buffer is read from the emulation
during initialization.
Signed-off-by: Sriram Rangarajan <rangarajans@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3c8b3efc061a745d888869dc3462ac4f7dd582d9) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Shared memory is used to exchange information between the vmxnet3 driver
and the emulation. In order to request emulation to perform a task, the
driver first populates specific fields in this shared memory and then
issues the corresponding command by writing to the command register (CMD). The
layout of the shared memory was defined by vmxnet3 version 1 and cannot
be extended for every new command without breaking backward compatibility.
To address this problem, in vmxnet3 version 3, the emulation repurposed
a reserved field in the shared memory to represent command information
instead. For new commands, the driver first populates the command
information field in the shared memory and then issues the command. The
emulation interprets the data written to the command information depending
on the type of the command. This patch exposes this capability to the driver.
Signed-off-by: Guolin Yang <gyang@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f35c7480f81b70f9c3030d96a3807e8faba34cf7) Signed-off-by: Brian Maly <brian.maly@oracle.com>
vmxnet3 is currently at version 2, but some command definitions from
previous vmxnet3 versions are missing. Add those definitions before
moving to version 3.
Also, introduce utility macros for vmxnet3 version comparison and update
the Copyright and Maintained-by information.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 190af10f0b5a07140ec4ce8e6ef04b7cb238dde1) Signed-off-by: Brian Maly <brian.maly@oracle.com>