PTT entries are per-hwfn; if some erroneous flow tries
to use a PTT belonging to a different hwfn, warn the user, as this
can break every register-accessing flow later and is very hard
to root-cause.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
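A minimal sketch of the kind of ownership check described above. The
structures and the helper name are hypothetical stand-ins, not the driver's
actual types; the real driver may identify ownership differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct hwfn_sketch {
	int id;
};

struct ptt_sketch {
	const struct hwfn_sketch *owner; /* hwfn this PTT entry was taken from */
};

/* Returns true when the PTT entry belongs to the given hwfn. A caller
 * would warn the user when this returns false.
 */
static bool ptt_belongs_to(const struct hwfn_sketch *hwfn,
			   const struct ptt_sketch *ptt)
{
	assert(hwfn != NULL && ptt != NULL);
	return ptt->owner == hwfn;
}
```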
When qedr is enabled, qed would try dividing the msi-x vectors between
L2 and RoCE, starting with L2 and providing it with sufficient vectors
for its queues.
The problem is that qed would also do that for storage partitions, and as
those don't need queues it would lead qed to award those partitions 0
msi-x vectors, causing them to believe they're using INTa and
preventing them from operating.
Fixes: 51ff17251c9c ("qed: Add support for RoCE hw init") Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
There seems to be a missing break in the OOO_LB_TC case: pq_id
is assigned and then re-assigned in the fall-through default
case, which looks suspect.
Detected by CoverityScan, CID#1424402 ("Missing break in switch")
Fixes: b5a9ee7cf3be1 ("qed: Revise QM cofiguration") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
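For illustration, a small sketch of the fall-through hazard. The enum values
and PQ ids are invented for the example; only the role of the break is the
point.

```c
#include <assert.h>

/* Hypothetical traffic-class selector; names are illustrative only. */
enum tc_sketch { TC_PURE_LB, TC_OOO_LB, TC_OTHER };

static int pick_pq(enum tc_sketch tc)
{
	int pq_id;

	switch (tc) {
	case TC_PURE_LB:
		pq_id = 10;
		break;
	case TC_OOO_LB:
		pq_id = 20;
		break; /* without this break, the default case overwrites pq_id */
	default:
		pq_id = 30;
		break;
	}

	return pq_id;
}
```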
We should be returning -ENOMEM if qed_mcp_cmd_add_elem() fails. The
current code returns success.
Fixes: 4ed1eea82a21 ("qed: Revise MFW command locking") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
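A sketch of the error-path pattern, assuming a hypothetical add_elem() in
place of qed_mcp_cmd_add_elem(); the 'fail' argument only exists to force
the allocation failure in the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct cmd_elem_sketch {
	int seq;
};

/* Hypothetical stand-in for the element allocator; 'fail' forces NULL. */
static struct cmd_elem_sketch *add_elem(int seq, int fail)
{
	struct cmd_elem_sketch *e = fail ? NULL : malloc(sizeof(*e));

	if (e)
		e->seq = seq;
	return e;
}

static int send_cmd(int seq, int fail)
{
	struct cmd_elem_sketch *e;
	int rc = 0;

	e = add_elem(seq, fail);
	if (!e) {
		rc = -ENOMEM; /* set the error before leaving, not success */
		goto out;
	}

	assert(e->seq == seq);
	free(e);
out:
	return rc;
}
```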
As RoCE doesn't need to use the SRC, allocating ILT memory
on behalf of RoCE is wasting available ILT lines.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
As of today there's no protocol supported that requires
support from the TM hardware block and enables SRIOV,
but we should still correct the calculation to reflect
the lines required for such future VFs instead of changing
the PF's own lines.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
When configuring the HW timers block we should set the number of CIDs
up until the last CID that requires timers, instead of only those CIDs
whose protocol needs timers support.
Today, the protocols that require HW timers' support have their CIDs
before any other protocol, but that would change in future [when we
add iWARP support].
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
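The calculation can be sketched roughly as below, assuming the protocols'
CIDs are laid out back to back in array order. The structure and function
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-protocol CID description, in CID-layout order. */
struct proto_cids_sketch {
	uint32_t count;
	bool needs_timers;
};

/* Number of CIDs the timers block has to cover: everything up to and
 * including the last protocol that needs timers, even if protocols
 * without timers sit in between.
 */
static uint32_t timer_cid_span(const struct proto_cids_sketch *p, size_t n)
{
	uint32_t offset = 0, span = 0;

	assert(p != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		offset += p[i].count;
		if (p[i].needs_timers && p[i].count)
			span = offset;
	}

	return span;
}
```

Summing only the CIDs of timer-using protocols gives the same answer as long
as those protocols come first, which is the layout the text says will change.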
Refactor and clean up the queue manager initialization logic.
Also, this adds support for RoCE low latency queues, which later
would be used for improving RoCE latency in high throughput scenarios.
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Until now, qed used some port-defined value as BDQ index for both iSCSI
and FCoE.
As management firmware now treats BDQ as a resource and tells each PF
its BDQ-range, start using a value from that range instead.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Management firmware is used as an arbiter between the various PFs
in matters of resources, but some of the resources that need to
be divided are dependent on the non-management firmware used,
so management firmware first needs to be told how many resources
there are before trying to divide them.
As part of the initialization sequence, the driver first informs
the management firmware of the available resources under
a dedicated resource lock, and afterwards requests the various
resources, which might be based on the previously set values.
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Global locking can't properly be used to synchronize between different
PFs in all scenarios, as those instances might reside in different
logical partitions [e.g., when a PF is assigned via PDA to some VM].
The management firmware provides a generic infrastructure for
device locks. For each 'resource', it's guaranteed it could be acquired
by at most a single PF at any given time [or by management firmware].
This patch adds the necessary logic in qed for utilizing said
infrastructure, implementing lock/unlock internal APIs.
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
During HW initialization, driver would set various registers to their
needed values - but it assumes all registers start at their reset-value,
so there's no need to re-configure a register's default value.
This assumption might be incorrect, e.g., in case of a preboot driver
running and initializing the device prior to our driver.
To overcome this, we now ask management firmware to initiate a PF-flr
early during the initialization sequence. That would return everything
in the PF's scope back to default and prevent previous configurations
from still being applied.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Management firmware is used as an arbiter between the various PFs
in regard to loading - it causes the various PFs to load/unload
sequentially and informs each of its appropriate role in the init.
But the existing flow is too weak to handle some scenarios where
PFs aren't properly cleaned prior to loading.
The significant scenarios falling under these criteria:
a. Preboot drivers in some environment can't properly unload.
b. Unexpected driver replacement [kdump, PDA].
Modern management firmware supports a more intricate loading flow,
where the driver has the ability to overcome previous limitations.
This moves qed into using this newer scheme.
Notice new scheme is backward compatible, so new drivers would
still be able to load properly on top of older management firmwares
and vice versa.
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
We'll soon need additional information, so start by changing
the infrastructure to receive the initializing variables
via a parameter struct.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Management firmware is used as arbiter between different PFs
which are loading/unloading, but in order to use the synchronization
it offers the contending configurations need to be applied either
between their LOAD_REQ <-> LOAD_DONE or UNLOAD_REQ <-> UNLOAD_DONE
management firmware commands.
Existing HW stop flow utilizes 2 different functions: qed_hw_stop() and
qed_hw_reset(), which don't abide by this requirement; most of the closure
is done outside the scope of the unload request.
This patch removes qed_hw_reset() and places the relevant stop
functionality underneath the management firmware protection.
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Align the driver feature distribution with the flow utilized
by the management firmware - first reserve L2 queues for
VFs and use all the remaining for the PF.
The current distribution might lead to PFs with an enormous
amount of queues, but at the same time leave us with insufficient
resources for starting all VFs.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
When RoCE is enabled on a given L2 interface, the interrupt lines
are divided equally between L2 and RoCE.
But in case the number of lines needed for RoCE is limited by the number
of available CNQs, we can utilize the additional lines for L2.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
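A rough sketch of the split described above. The function and parameter
names are hypothetical, and the real driver logic has more inputs; this only
shows RoCE being capped by the CNQ count with the remainder going to L2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split 'total' interrupt lines between L2 and RoCE. RoCE gets half,
 * capped by the number of available CNQs; L2 gets everything else.
 */
static void split_vectors(uint32_t total, uint32_t cnq_max,
			  uint32_t *l2, uint32_t *roce)
{
	uint32_t half = total / 2;

	assert(l2 != NULL && roce != NULL);
	*roce = half < cnq_max ? half : cnq_max;
	*l2 = total - *roce;
}
```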
The management firmware is running on a Big Endian processor,
and when running on an LE platform the HW is configured to swap access
to memory shared between management firmware and driver at
32-bit granularity.
As a result, for matters of simplicity most of the APIs between
driver and management firmware are based on 32-bit variables.
MAC settings are one exception, as driver needs to fill a byte
array when indicating to management firmware that primary MAC
has changed.
Due to the swap, the driver must make sure that the MAC that was
provided in byte order is translated into native order;
otherwise, after the swap, the management firmware would read
it swapped.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
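As an illustration of the translation, the sketch below packs a 6-byte MAC
into two native-order 32-bit words, most significant byte first. The helper
name is hypothetical and the exact layout expected by the management
firmware is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack a MAC given in byte order into two native-order 32-bit words,
 * so that a 32-bit swap on access still presents the bytes in order.
 */
static void mac_to_native_words(const uint8_t mac[6], uint32_t out[2])
{
	assert(mac != NULL && out != NULL);
	out[0] = (uint32_t)mac[0] << 24 | (uint32_t)mac[1] << 16 |
		 (uint32_t)mac[2] << 8 | (uint32_t)mac[3];
	out[1] = (uint32_t)mac[4] << 24 | (uint32_t)mac[5] << 16;
}
```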
The driver interaction with management firmware involves a union
of all the data-members relating to the commands the driver prepares.
Current interface assumes the caller always passes such a union,
but that's cumbersome as well as risky [risking a stack corruption
in case the caller accidentally passes a smaller member instead of the union].
Change implementation so that caller could pass a pointer to any
of the members instead of the union.
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
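One way to picture the new calling convention is a pointer plus an explicit
size, as sketched below. All type and field names are hypothetical; the
point is that only the caller's own bytes are copied, so passing a small
member can no longer cause a read past it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical small request and the union of all request layouts. */
struct small_req_sketch {
	uint32_t a, b;
};

union mb_data_sketch {
	uint32_t raw[8];
	struct small_req_sketch small;
};

/* Caller-side parameters: any member, together with its real size. */
struct mb_params_sketch {
	const void *data_src;
	size_t data_size;
};

/* Copy exactly data_size bytes into a zeroed, union-sized mailbox.
 * Returns -1 if the caller's data cannot fit.
 */
static int fill_mailbox(union mb_data_sketch *mb,
			const struct mb_params_sketch *p)
{
	assert(mb != NULL && p != NULL);
	if (p->data_size > sizeof(*mb))
		return -1;

	memset(mb, 0, sizeof(*mb));
	if (p->data_src && p->data_size)
		memcpy(mb, p->data_src, p->data_size);
	return 0;
}
```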
Interaction of driver -> management firmware is based
on a one-pending mailbox [per interface], and various
mailbox commands need to be synchronized.
Current scheme is messy and difficult to extend, as it deals
differently with various commands and makes assumptions
about the required behavior for load/unload
requests.
Replace the current scheme with a completion-list-based approach;
each flow would try sending the command when possible,
allowing one flow to process another flow's completion and
relieve the mailbox before sending its own command.
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
The link information exists only on the leading hwfn,
but some of its derivatives [e.g., min/max rate] need to
be configured for each hwfn.
When re-basing the VF link view, use the leading hwfn
information as basis for all existing hwfns to allow
said configurations to stick.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Malicious VF existence should be interesting enough for the
hyperuser. Change the PF indication that one of its child VFs
became malicious to appear by default.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
The PF<->VF interface allows for the VF to request
multiple queues closure via a single message, but this has
never been used by any official driver.
We now deprecate this option, forcing each queue close
to arrive via a different command; this would be required
for future TLVs that are going to extend the queue TLVs with
additional information on a per-queue basis.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
PF needs to validate the status of VF queues before asking firmware
to configure anything for them, but that validation is done in various
different forms - sometimes inadequate.
Add auxiliary functions that can be used for testing the queue
state and convert the various flows to use those instead of the
existing checks; also, add missing validations where needed.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
When starting the VF's vport, the PF would first configure
the status blocks of the VF and then reset them.
That would cause some of the configured information to be lost -
specifically it would mean that all the VF's queues would use
the Rx coalescing state-machine of the status block.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
When PF responds to the VF requests it also cleans the HW-channel
indication in firmware to allow further VF messages to arrive,
but the order currently applied is wrong:
the PF copies by DMAE the response the VF is polling on for
completion, and only afterwards sets the HW-channel to ready state.
This creates a race condition where the VF would be able to send
an additional message to the PF before the channel would get ready
again, causing the firmware to consider the VF as malicious.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
When a VF is considered malicious, driver handling of the VF
FLR flow would clean said indication - but not if the FLR is
part of an sriov-disable flow.
That leads to further issues, as PF wouldn't re-enable the
previously malicious VF when sriov is re-enabled.
No reason for that - simply clean malicious indications in
the sriov-disable flow as well.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
VFs currently log errors when communicating
with their PFs at a verbosity level too low to
be shown by default. As timeouts and failed commands
are crucial for VF operability, make them appear by
default.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
This adds the necessary infrastructure changes for initializing
and working with the new series of QL41xxx adapters.
It also adds 2 new PCI device-IDs to qede:
- 0x8070 for QL41xxx PFs
- 0x8090 for VFs spawning from QL41xxx PFs
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Missing in the initial submission, qed fails to propagate qedi's
request to enable OOO to firmware.
Fixes: fc831825f99e ("qed: Add support for hardware offloaded iSCSI") Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
The number of entries in the database needs to be set; otherwise the
logic would quickly overrun the array.
Fixes: 1d6cff4fca43 ("qed: Add iSCSI out of order packet handling") Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Before iterating over the LL2 Rx ring, the ring's
spinlock is taken via spin_lock_irqsave().
The actual processing of the packet [including handling
by the protocol driver] is done without said lock,
so qed releases the spinlock and re-claims it afterwards.
The problem is that the final spin_unlock_irqrestore() at the end
of the iteration uses the original flags saved from the
initial irqsave() instead of the flags from the most recent
irqsave(). So it's possible that the interrupt status would
be incorrect at the end of the processing.
Fixes: 0a7fb11c23c0 ("qed: Add Light L2 support"); CC: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Fixes: fc831825f99e ("qed: Add support for hardware offloaded iSCSI") Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
When receiving an Rx LL2 packet, qed fails to unmap the previous buffer.
Fixes: 0a7fb11c23c0 ("qed: Add Light L2 support"); Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Current logic would allow the creation of a chain with U32_MAX + 1
elements, when the actual maximum supported by the driver infrastructure
is U32_MAX.
Fixes: a91eb52abb50 ("qed: Revisit chain implementation") Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
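A sketch of the boundary condition, using hypothetical names. The total is
computed in 64 bits so that a chain of exactly U32_MAX + 1 elements is
rejected instead of silently wrapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the chain's total element count fits in 32 bits,
 * i.e. is at most UINT32_MAX.
 */
static bool chain_size_ok(uint32_t page_cnt, uint32_t elems_per_page)
{
	uint64_t total = (uint64_t)page_cnt * elems_per_page;

	return total <= UINT32_MAX;
}
```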
The Doorbell HW block can be configured at a granularity
of 16 x CIDs, so we need to make sure that the actual number
of CIDs configured would be a multiple of 16.
Today, when RoCE is enabled and the number is unaligned,
doorbelling the higher CIDs fails to reach the firmware and
eventually times out.
Fixes: dbb799c39717 ("qed: Initialize hardware for new protocols") Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
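The alignment itself is a one-liner. The sketch below rounds up, which is an
assumption; the text only states that the configured number must be a
multiple of 16. The helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CID_GRANULARITY_SKETCH 16u

/* Round a CID count up to the next multiple of 16 (a power of two).
 * Counts within 15 of UINT32_MAX would wrap and are not handled here.
 */
static uint32_t align_cid_count(uint32_t n)
{
	return (n + CID_GRANULARITY_SKETCH - 1u) & ~(CID_GRANULARITY_SKETCH - 1u);
}
```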
This patch advances the qed* drivers to the newer firmware.
This solves several firmware bugs, mostly related to [but not limited to]
various init/deinit issues in various offloaded protocols.
It also introduces a major 4-Cached SGE change in firmware, which can be
seen in the storage drivers' changes.
In addition, this firmware is required for supporting the new QL41xxx
series of adapters; while this patch doesn't add the actual support,
the firmware contains the necessary initialization & firmware logic to
operate such adapters [actual support would be added later on].
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Manish Rangankar <Manish.Rangankar@cavium.com> Signed-off-by: Chad Dupuis <Chad.Dupuis@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
In qed_ll2_start_ooo() the ll2_info variable is uninitialized and then
passed to qed_ll2_acquire_connection() where it is copied into a new
memory space.
This shouldn't cause any issue as long as none of the copied memory is
ever read.
But the potential for a bug being introduced by reading this memory
is real.
Detected by CoverityScan, CID#1399632 ("Uninitialized scalar variable")
Signed-off-by: Robert Foss <robert.foss@collabora.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
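The usual remedy is to zero the structure before filling the fields that
matter, sketched below with a hypothetical structure; the field names are
illustrative and not the real ll2_info layout.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical connection description. */
struct ll2_info_sketch {
	int conn_type;
	unsigned int mtu;
	int rx_drop_flag;
};

static struct ll2_info_sketch make_ooo_info(unsigned int mtu)
{
	struct ll2_info_sketch info;

	/* Zero everything first so that a later copy of the whole
	 * structure never carries indeterminate stack contents.
	 */
	memset(&info, 0, sizeof(info));
	info.mtu = mtu;

	assert(info.conn_type == 0);
	return info;
}
```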
Commit 653d2ffd6405 ("qed*: Fix link indication race") introduced another
race - one of the inner functions called from the link-change flow is
explicitly using the slowpath context dedicated PTT instead of gaining
that PTT from the caller. Since this flow can now be called from
a different context as well, we're at risk of the PTT breaking.
Fixes: 653d2ffd6405 ("qed*: Fix link indication race") Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
A PF synchronizes all IOV activity relating to its VFs
by using a single workqueue handling the work.
The workqueue would read a bitmask of pending VF events
and act upon each in turn.
The problem is that the indication of a VF message [which sets
the 'vf event' bit for that VF] arrives and is set in
the slowpath attention context, which isn't synchronized with
the processing of the events.
When multiple VFs are present, it's possible that the PF would
lose the indication of one of the VFs' pending events, leading
that VF to later time out.
Instead of adding locks/barriers, simply move from a bitmask
into a per-VF indication inside that VF entry in the PF database.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
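The data-structure change can be sketched as below. The names are
hypothetical, and the sketch is single-threaded: it shows only that each VF
owns its own flag, so marking one VF never rewrites state shared with
another, and it does not model the memory-ordering side of the real driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-VF entry in the PF database. */
struct vf_sketch {
	bool pending_msg; /* replaces this VF's bit in a shared bitmask */
};

/* Called from the context that learns about a VF message. */
static void mark_vf_message(struct vf_sketch *vfs, size_t idx)
{
	assert(vfs != NULL);
	vfs[idx].pending_msg = true;
}

/* Called from the workqueue: handle every VF that has a pending
 * message and return how many were handled.
 */
static size_t process_pending(struct vf_sketch *vfs, size_t n)
{
	size_t handled = 0;

	for (size_t i = 0; i < n; i++) {
		if (!vfs[i].pending_msg)
			continue;
		vfs[i].pending_msg = false; /* clear before handling */
		handled++;
	}

	return handled;
}
```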
This patch adds the driver support for:
- Registering the ptp clock functionality with the OS.
- Timestamping the Rx/Tx PTP packets.
- Ethtool callbacks related to PTP.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
When driver receives a recognized encapsulated packet it needs
to set the skb->encapsulation field as well.
Signed-off-by: Manish Chopra <Manish.Chopra@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
During Rx flow driver allocates a replacement buffer each time
it consumes an Rx buffer. Failing to do so, it would consume the
currently processed buffer and re-post it on the ring.
As a result, the Rx ring is always completely full [from driver POV].
We now allow the Rx ring to shorten by doing the re-allocations
at the end of the NAPI run. The only limitation is that we still want to
make sure each time we reallocate that we'd still have sufficient
elements in the Rx ring to guarantee that FW would be able to post
additional data and trigger an interrupt.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
This takes the various filtering logic of the driver and
moves them into their own dedicated file - qede_filter.c.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
This adds a new file qede_fp.c and relocates the datapath-related
logic into it [from qede_main.c].
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
NVMe enables MSIx interrupts during the nvme device probe
and also during nvme reset. While enabling MSIx interrupts
using do_setup_msix_irqs(), the IRQ is assigned twice: first
in irq_alloc_hwirqs() and again shortly afterwards
in setup_msi_irq(). During the first invocation
from irq_alloc_hwirqs(), it sets the cfg->vector and
cfg->domain of the specific IRQ. During the subsequent
invocation from setup_msi_irq(), if the cfg->domain
intersects with the target cpumask (tmp_mask) then
the move_in_progress flag is set. This flag is never cleared
unless it is set via proc file system. As this flag is not
cleared, the subsequent smp affinity set via procfs fails.
Upstream introduced an IRQ domain hierarchy in which the IRQ is
assigned only once. However, pulling in the IRQ domain hierarchy
from upstream brings with it a lot of changes (85 commits).
Hence, this patch instead assigns the IRQ only once.
Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Keith Busch [Wed, 1 Mar 2017 19:22:12 +0000 (14:22 -0500)]
nvme: Complete all stuck requests
If the nvme driver is shutting down its controller, the driver will not
start the queues up again, preventing blk-mq's hot CPU notifier from
making forward progress.
To fix that, this patch starts a request_queue freeze when the driver
resets a controller so no new requests may enter. The driver will wait
for frozen after IO queues are restarted to ensure the queue reference
can be reinitialized when nvme requests to unfreeze the queues.
If the driver is doing a safe shutdown, the driver will wait for the
controller to successfully complete all inflight requests so that we
don't unnecessarily fail them. Once the controller has been disabled,
the queues will be restarted to force remaining entered requests to end
in failure so that blk-mq's hot cpu notifier may progress.
Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 302ad8cc09339ea261eef58a8d5f4a116a8ffda5)
nvme: Don't suspend admin queue that wasn't created
This fixes a regression in my previous commit c21377f8366c ("nvme:
Suspend all queues before deletion"), which provoked an Oops in the
removal path when removing a device that became IO incapable very early
at probe (i.e. after a failed EEH recovery).
Turns out, if the error occurred very early at the probe path, before
even configuring the admin queue, we might try to suspend the
uninitialized admin queue, accessing bad memory.
Fixes: c21377f8366c ("nvme: Suspend all queues before deletion") Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Reviewed-by: Jay Freyensee <james_p_freyensee@linux.intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 82469c59d222f839ded5cd282172258e026f9112)
Keith Busch [Wed, 12 Oct 2016 15:22:16 +0000 (09:22 -0600)]
nvme: Delete created IO queues on reset
The driver was decrementing the online_queues prior to attempting to
delete those IO queues, so the driver ended up not requesting the
controller delete any. This patch saves the online_queues prior to
suspending them, and adds that parameter for deleting io queues.
Fixes: c21377f8 ("nvme: Suspend all queues before deletion") Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 7065906096273b39b90a512a7170a6697ed94b23)
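The ordering issue can be sketched as below. The structure and helpers are
hypothetical; the sketch assumes that suspending a queue decrements the
online count and that queue 0 is the admin queue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device state. */
struct dev_sketch {
	int online_queues;   /* admin queue + IO queues currently online */
	int delete_requests; /* delete commands issued to the controller */
};

static void suspend_queue(struct dev_sketch *d)
{
	if (d->online_queues > 0)
		d->online_queues--;
}

static void disable_dev(struct dev_sketch *d)
{
	int io_queues;

	assert(d != NULL);
	/* Save the IO queue count before suspending; afterwards the
	 * counter no longer says how many queues need deleting.
	 */
	io_queues = d->online_queues - 1;

	while (d->online_queues > 1)
		suspend_queue(d);

	for (int i = 0; i < io_queues; i++)
		d->delete_requests++;
}
```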
Gabriel Krisman Bertazi [Thu, 11 Aug 2016 15:35:57 +0000 (09:35 -0600)]
nvme: Suspend all queues before deletion
When nvme_delete_queue fails in the first pass of the
nvme_disable_io_queues() loop, we return early, failing to suspend all
of the IO queues. Later, on the nvme_pci_disable path, this causes us
to disable MSI without actually having freed all the IRQs, which
triggers the BUG_ON in free_msi_irqs().
This patch refactors nvme_disable_io_queues to suspend all queues before
submitting any delete queue commands. This way, we ensure that we
have at least returned every IRQ before continuing with the removal
path.
Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Cc: Brian King <brking@linux.vnet.ibm.com> Cc: Keith Busch <keith.busch@intel.com> Cc: linux-nvme@lists.infradead.org Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit c21377f8366c95440d533edbe47d070f662c62ef)
Keith Busch [Fri, 10 Feb 2017 23:15:52 +0000 (18:15 -0500)]
nvme/pci: No special case for queue busy on IO
This driver previously required we have a special check for IO submitted
to nvme IO queues that are temporarily suspended. That is no longer
necessary since blk-mq provides a quiesce, so any IO that actually gets
submitted to such a queue must be ended since the queue isn't going to
start back up.
This is fixing a condition where we have fewer IO queues after a
controller reset. This may happen if the number of CPUs has changed,
or a controller firmware update changed the queue count, for example.
While it may be possible to complete the IO on a different queue, the
block layer does not provide a way to resubmit a request on a different
hardware context once the request has entered the queue. We don't want
these requests to be stuck indefinitely either, so ending them in error
is our only option at the moment.
Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 9ef3932e250f8e2e11ffbc0c1f28b3ba5dc40cd6)
David S. Miller [Mon, 5 Jun 2017 01:41:10 +0000 (21:41 -0400)]
ipv6: Fix leak in ipv6_gso_segment().
If ip6_find_1stfragopt() fails and we return an error we have to free
up 'segs' because nobody else is going to.
Fixes: 2423496af35d ("ipv6: Prevent overrun when parsing v6 header options") Reported-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e3e86b5119f81e5e2499bea7ea1ebe8ac6aab789)
Ben Hutchings [Wed, 31 May 2017 12:15:41 +0000 (13:15 +0100)]
ipv6: xfrm: Handle errors reported by xfrm6_find_1stfragopt()
xfrm6_find_1stfragopt() may now return an error code and we must
not treat it as a length.
Fixes: 2423496af35d ("ipv6: Prevent overrun when parsing v6 header options") Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6e80ac5cc992ab6256c3dae87f7e57db15e1a58c)
David S. Miller [Thu, 18 May 2017 02:54:11 +0000 (22:54 -0400)]
ipv6: Check ip6_find_1stfragopt() return value properly.
Do not use unsigned variables to see if it returns a negative
error or not.
Fixes: 2423496af35d ("ipv6: Prevent overrun when parsing v6 header options") Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 7dd7eb9513bd02184d45f000ab69d78cb1fa1531)
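The signedness pitfall can be shown with a small sketch; find_offset() is a
hypothetical stand-in for a helper that returns either an offset or a
negative errno.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper: an offset on success, negative errno on failure. */
static int find_offset(int fail)
{
	return fail ? -EINVAL : 40;
}

static int use_offset(int fail, unsigned int *out)
{
	int err;

	assert(out != NULL);
	/* Keep the result in a signed variable. Stored in an unsigned
	 * one, -EINVAL becomes a huge positive value and "< 0" is never
	 * true.
	 */
	err = find_offset(fail);
	if (err < 0)
		return err;

	*out = (unsigned int)err;
	return 0;
}
```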
Craig Gallek [Tue, 16 May 2017 18:36:23 +0000 (14:36 -0400)]
ipv6: Prevent overrun when parsing v6 header options
The KASAN warning reported below was discovered with a syzkaller
program. The reproducer is basically:
int s = socket(AF_INET6, SOCK_RAW, NEXTHDR_HOP);
send(s, &one_byte_of_data, 1, MSG_MORE);
send(s, &more_than_mtu_bytes_data, 2000, 0);
The socket() call sets the nexthdr field of the v6 header to
NEXTHDR_HOP, the first send call primes the payload with a non zero
byte of data, and the second send call triggers the fragmentation path.
The fragmentation code tries to parse the header options in order
to figure out where to insert the fragment option. Since nexthdr points
to an invalid option, the calculation of the size of the network header
can be made to be much larger than the linear section of the skb and data
is read outside of it.
This fix makes ip6_find_1stfragopt() return an error if it detects
running out-of-bounds.
Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 2423496af35d94a87156b063ea5cedffc10a70a1)
Acked-by: Jonathan Helman <jonathan.helman@oracle.com> Acked-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Sanath Kumar <sanath.s.kumar@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com> Reviewed-by: Jane Chu <jane.chu@oracle.com>
Michael Chan [Tue, 11 Jul 2017 17:05:36 +0000 (13:05 -0400)]
bnxt_en: Fix SRIOV on big-endian architecture.
The PF driver sets up a list of firmware commands from the VF driver that
needs to be forwarded to the PF for approval. This list is a 256-bit
bitmap. The code that sets up the bitmap falls apart on big-endian
architecture. __set_bit() does not work because it operates on long types
whereas the firmware interface is defined in u32 types, causing bits in
the wrong 32-bit word to be set.
Fix it by setting the proper bits on an array of u32.
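The difference between the two bit layouts can be sketched in portable C. The names set_fwd_bit() and be64_long_word_index() are hypothetical and this is not the driver's code; the second helper only computes which 32-bit word a long-based bit operation would touch on a 64-bit big-endian machine.

```c
#include <assert.h>
#include <stdint.h>

#define FWD_BITMAP_BITS		256
#define FWD_BITMAP_WORDS	(FWD_BITMAP_BITS / 32)

/* Set a bit in a bitmap defined as an array of 32-bit words, which is
 * how the firmware interface is described above. */
static void set_fwd_bit(uint32_t *bitmap, unsigned int bit)
{
	assert(bit < FWD_BITMAP_BITS);
	bitmap[bit / 32] |= (uint32_t)1 << (bit % 32);
}

/* Index of the 32-bit word that a 64-bit, long-based bit operation
 * would modify on a big-endian machine: the high half of each long is
 * stored first, so the two halves are swapped relative to the u32 view. */
static unsigned int be64_long_word_index(unsigned int bit)
{
	unsigned int long_idx = bit / 64;
	unsigned int half = (bit % 64) / 32;	/* 0 = low half, 1 = high half */

	return long_idx * 2 + (1 - half);
}
```

For bit 0 the u32 view expects word 0, while the long-based view on 64-bit big-endian lands in word 1, which matches the "wrong 32-bit word" symptom in the description.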
Fixes: de68f5de5651 ("bnxt_en: Fix bitmap declaration to work on 32-bit arches.") Reported-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Orabug: 26000471
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Tejun Heo [Wed, 24 May 2017 16:03:48 +0000 (12:03 -0400)]
cpuset: consider dying css as offline
In most cases, a cgroup controller doesn't care about the lifetimes of
cgroups. For the controller, a css becomes online when ->css_online()
is called on it and offline when ->css_offline() is called.
However, cpuset is special in that the user interface it exposes cares
whether certain cgroups exist or not. Combined with the RCU delay
between cgroup removal and css offlining, this can lead to user
visible behavior oddities where operations which should succeed after
cgroup removals fail for some time period. The effects of cgroup
removals are delayed when seen from userland.
This patch adds css_is_dying() which tests whether offline is pending
and updates is_cpuset_online() so that the function returns false also
while offline is pending. This gets rid of the userland visible
delays.
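The changed online test can be modelled with a small sketch. The struct and function names below are simplified, hypothetical stand-ins for the kernel's css state and helpers; only the boolean logic is taken from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical model of the state of a css. */
struct css_model {
	bool online;	/* ->css_online() has run, ->css_offline() has not */
	bool dying;	/* cgroup was removed, offlining is still pending */
};

/* Models css_is_dying(): true while offline is pending. */
static bool css_is_dying_model(const struct css_model *css)
{
	assert(css != NULL);
	return css->dying;
}

/* Models the updated is_cpuset_online(): a dying css counts as offline. */
static bool is_cpuset_online_model(const struct css_model *css)
{
	return css->online && !css_is_dying_model(css);
}
```

A css that is still technically online but already dying is reported as offline, so userland no longer observes the RCU delay.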
Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Link: http://lkml.kernel.org/r/327ca1f5-7957-fbb9-9e5f-9ba149d40ba2@oracle.com Cc: stable@vger.kernel.org Signed-off-by: Tejun Heo <tj@kernel.org>
Orabug: 26415290
Signed-off-by: Thomas Tai <thomas.tai@oracle.com> Signed-off-by: Tom Hromatka <tom.hromatka@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com> Reviewed-by: Tom Hromatka <tom.hromatka@oracle.com> Reviewed-by: Dhaval Giani <dhaval.giani@oracle.com>
Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com> Reviewed-by: chris hyser <chris.hyser@oracle.com>
Eric Snowberg [Fri, 12 May 2017 20:31:42 +0000 (13:31 -0700)]
proc: sparc64 ADI version tag debugging interface
To facilitate user space ADI debugging there needs to be a way for a
debugger to get/set ADI version tags in a target process. This is
accomplished with a new /proc/<pid>/adi/tags interface. This new interface
maps linearly to the address space of the target process at a ratio
of 1:adi_blksz. A read (or write) of offset K in the file returns
(or modifies) the ADI version tag stored in the cacheline containing
address K * adi_blksz, encoded as 1 version per byte.
Pseudocode example:
unsigned char vers[2];
long long addr = 0x20000;
fd = open("/proc/pid/adi/tags", O_RDONLY);
addr /= adi_blksz();
rv = pread64(fd, &vers, 2, addr);
/*
* vers[0] gets version from address 0x20000,
* vers[1] gets version from address 0x20000 + adi_blksz()
*/
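The 1:adi_blksz mapping used in the pseudocode can be written out as two small helpers. This is a sketch only: adi_tag_offset() and adi_tag_addr() are hypothetical names, and the block size is passed in as a parameter because the real value comes from the platform.

```c
#include <assert.h>
#include <stdint.h>

/* File offset in the tags file for the block that contains addr. */
static uint64_t adi_tag_offset(uint64_t addr, uint64_t blksz)
{
	assert(blksz != 0);
	return addr / blksz;
}

/* First address covered by the tag stored at the given file offset. */
static uint64_t adi_tag_addr(uint64_t offset, uint64_t blksz)
{
	assert(blksz != 0);
	return offset * blksz;
}
```

Assuming a 64-byte block size purely for illustration, address 0x20000 maps to file offset 0x800, and the next tag byte at offset 0x801 covers address 0x20040.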
Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com> Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Eric Snowberg [Thu, 12 Jan 2017 23:05:23 +0000 (15:05 -0800)]
proc: Move directory functions into internal.h
Move directory macros and define proc_pident_lookup,
proc_pident_readdir and struct pid_entry within
fs/proc/internal.h. These were originally statically defined within
fs/proc/base.c and couldn't be used elsewhere.
Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com> Reviewed-by: Chris Hyser <chris.hyser@oracle.com> Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
Atish Patra [Fri, 23 Jun 2017 19:32:57 +0000 (13:32 -0600)]
sched: Move the loadavg code to a more obvious location
A previous commit f33dfff75d968 ("sched/fair: Rewrite runnable load
and utilization average tracking") created a regression in global
load average in uptime. Active Load average computation function
should be invoked periodically to update the delta for each runqueue.
Use the following upstream commit 3289bdb42 to fix this instead of a
quick fix.
Original upstream commit message:
I could not find the loadavg code.. turns out it was hidden in a file
called proc.c. It further got mingled up with the cruft per rq load
indexes (which we really want to get rid of).
Move the per rq load indexes into the fair.c load-balance code (that's
the only thing that uses them) and rename proc.c to loadavg.c so we
can find it again.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Thomas Gleixner <tglx@linutronix.de>
[ Did minor cleanups to the code. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 3289bdb429884c0279bf9ab72dff7b934f19dfc6)
sparc64: Treat ERESTARTSYS as an acceptable error (DAX driver)
get_user_pages fails if the current process calling it has a SIGKILL
posted on it. Since this is an acceptable failure catch this error and
print a debug message instead of an error message.
Signed-off-by: Sanath Kumar <sanath.s.kumar@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com> Reviewed-by: Jonathan Helman <jonathan.helman@oracle.com>
Aaron Young [Tue, 11 Jul 2017 16:56:40 +0000 (09:56 -0700)]
SPARC64: vcc: delay device removal until close()
If a vcc device file is open while it's removed (due to a
domain being unbound), delay the removal of the associated vcc
device structure until the final close() call is made on
the device. This prevents the device file cdev minor number from
being reused which can result in ugly filesystem warnings to
the console.
Signed-off-by: Aaron Young <aaron.young@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com> Reviewed-By: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Thomas Tai [Mon, 10 Jul 2017 16:55:55 +0000 (10:55 -0600)]
sparc64: fix vio handshake issue
When rebooting multiple LDoms together while binding/unbinding a separate
LDom, the kernel panics with a vio handshake error. The panic is caused by
vio trying to allocate a buffer which was not freed properly. The
ldc_unbind should unconfigure and stop the ldc queue before freeing
the irq. If the irq is freed before stopping the queue, interrupts
can continue to happen after the irq is freed, which may cause
issues.
Signed-off-by: Thomas Tai <thomas.tai@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Use cpu_poke hypervisor call to resume idle cpu if supported.
Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Orabug: 25575672 Signed-off-by: Allen Pais <allen.pais@oracle.com> Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Vijay Kumar [Fri, 5 May 2017 19:35:05 +0000 (15:35 -0400)]
sparc64: Add a new hypercall CPU_POKE
This adds a new hypercall CPU_POKE for quickly waking up an idle CPU.
CPU POKE should only be sent to valid non-local CPUs.
Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Orabug: 25575672 Signed-off-by: Allen Pais <allen.pais@oracle.com> Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Thomas Tai [Tue, 27 Jun 2017 15:21:44 +0000 (09:21 -0600)]
sparc64: fix out of order spin_lock_irqsave and spin_unlock_restore
After enabling the spinlock debug option, the kernel prints a call trace
suggesting that the function ldom_req_sp_token executes a might_sleep
function while IRQs are disabled. IRQs are disabled because the
spin_lock_irqsave and spin_unlock_irqrestore calls are out of order.
The last UNLOCK_DS_DEV() ends up restoring IRQs to the disabled state,
because the previous LOCK_DS_DEV() was taken with IRQs already disabled.
To fix the issue, follow the order of irqsave()/irqrestore().
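The effect of restoring saved interrupt state in the wrong order can be reproduced with a small userspace simulation. Nothing here touches real interrupts: irqs_enabled is a plain variable, and sim_irqsave() and sim_irqrestore() are hypothetical models of the save/restore pair.

```c
#include <assert.h>
#include <stdbool.h>

static bool irqs_enabled = true;	/* simulated CPU interrupt flag */

/* Save the current interrupt state, then disable interrupts. */
static bool sim_irqsave(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Restore a previously saved interrupt state. */
static void sim_irqrestore(bool flags)
{
	irqs_enabled = flags;
}

/* Correct nesting: state is restored in reverse order of saving. */
static bool nested_in_order(void)
{
	bool outer, inner;

	irqs_enabled = true;
	outer = sim_irqsave();
	inner = sim_irqsave();
	assert(!irqs_enabled);
	sim_irqrestore(inner);
	sim_irqrestore(outer);
	return irqs_enabled;
}

/* Out of order: the last restore uses the state saved while interrupts
 * were already disabled, so they stay disabled afterwards. */
static bool nested_out_of_order(void)
{
	bool outer, inner;

	irqs_enabled = true;
	outer = sim_irqsave();
	inner = sim_irqsave();
	assert(!irqs_enabled);
	sim_irqrestore(outer);
	sim_irqrestore(inner);
	return irqs_enabled;
}
```

In the out-of-order case the function returns with the simulated flag still cleared, which corresponds to the might_sleep warning seen with IRQs disabled.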
Takashi Iwai [Fri, 2 Jun 2017 15:26:56 +0000 (17:26 +0200)]
ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT
snd_timer_user_tselect() reallocates the queue buffer dynamically, but
it forgot to reset its indices. Since the read may happen
concurrently with ioctl and snd_timer_user_tselect() allocates the
buffer via kmalloc(), this may lead to the leak of uninitialized
kernel-space data, as spotted via KMSAN:
BUG: KMSAN: use of unitialized memory in snd_timer_user_read+0x6c4/0xa10
CPU: 0 PID: 1037 Comm: probe Not tainted 4.11.0-rc5+ #2739
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16
dump_stack+0x143/0x1b0 lib/dump_stack.c:52
kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:1007
kmsan_check_memory+0xc2/0x140 mm/kmsan/kmsan.c:1086
copy_to_user ./arch/x86/include/asm/uaccess.h:725
snd_timer_user_read+0x6c4/0xa10 sound/core/timer.c:2004
do_loop_readv_writev fs/read_write.c:716
__do_readv_writev+0x94c/0x1380 fs/read_write.c:864
do_readv_writev fs/read_write.c:894
vfs_readv fs/read_write.c:908
do_readv+0x52a/0x5d0 fs/read_write.c:934
SYSC_readv+0xb6/0xd0 fs/read_write.c:1021
SyS_readv+0x87/0xb0 fs/read_write.c:1018
This patch adds the missing reset of queue indices. Together with the
previous fix for the ioctl/read race, we cover the whole problem.
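The missing reset can be sketched with a simplified queue in userspace C. struct evq and evq_resize() are hypothetical; the real driver uses its own structure and kmalloc(), but the point is the same: a freshly allocated buffer holds no valid entries, so the indices must go back to zero.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified event queue: head/tail index into buf, used counts entries. */
struct evq {
	int *buf;
	int size;
	int head;
	int tail;
	int used;
};

/* Replace the queue buffer with a new one of new_size entries.
 * Returns 0 on success, -1 on failure (queue left unchanged). */
static int evq_resize(struct evq *q, int new_size)
{
	int *nbuf;

	assert(q != NULL);
	if (new_size <= 0)
		return -1;
	nbuf = malloc((size_t)new_size * sizeof(*nbuf));
	if (!nbuf)
		return -1;
	free(q->buf);
	q->buf = nbuf;
	q->size = new_size;
	/* The new buffer is uninitialized: without this reset a reader
	 * would treat stale indices as pointing at valid entries. */
	q->head = q->tail = q->used = 0;
	return 0;
}
```

With used reset to zero, a concurrent reader finds nothing to copy out of the new buffer until real events have been queued.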
Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit ba3021b2c79b2fa9114f92790a99deb27a65b728)
Takashi Iwai [Fri, 2 Jun 2017 13:03:38 +0000 (15:03 +0200)]
ALSA: timer: Fix race between read and ioctl
The read from the ALSA timer device, the function snd_timer_user_tread(),
may access uninitialized struct snd_timer_user fields when the
read is performed concurrently with an ioctl like
snd_timer_user_tselect() is invoked. We have already fixed the races
among ioctls via a mutex, but we seem to have forgotten the race
between read vs ioctl.
This patch simply applies (more exactly extends the already applied
range of) tu->ioctl_lock in snd_timer_user_tread() for closing the
race window.
Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit d11662f4f798b50d8c8743f433842c3e40fe3378)
NVMe: Retain QUEUE_FLAG_SG_GAPS flag for bio vector alignment.
The nvme queue flag QUEUE_FLAG_SG_GAPS checks for the bio vector
alignment against the page size. In upstream, the QUEUE_FLAG_SG_GAPS
flag is replaced by blk_queue_virt_boundary() and pulling in the
respective patches caused instability in the driver and hence
QUEUE_FLAG_SG_GAPS flag is retained for vector alignment.
Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Mintz, Yuval [Fri, 9 Jun 2017 14:17:02 +0000 (17:17 +0300)]
bnx2x: Don't post statistics to malicious VFs
Once firmware indicates that a given VF is malicious and until
that VF passes an FLR all bets are off - PF can't know anything
is happening to the VF [since VF can't communicate anything to its PF].
But PF is currently still periodically asking device to collect
statistics for the VF which might in turn fill logs by IOMMU blocking
memory access done by the VF's PCI function [in the case VF has unmapped
its buffers].
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3523882229b903e967de05665b871dab87c5df0f) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Mintz, Yuval [Fri, 9 Jun 2017 14:17:01 +0000 (17:17 +0300)]
bnx2x: Allow vfs to disable txvlan offload
VF clients are configured as enforced, meaning firmware is validating
the correctness of their ethertype/vid during transmission.
Once txvlan is disabled, VF would start getting SKBs for transmission
where vlan is in the payload - but it'll pass the packet's ethertype
instead of the vid, leading to firmware declaring it as malicious.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 92f85f05caa51d844af6ea14ffbc7a786446a644) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Michal Schmidt [Tue, 6 Jun 2017 14:30:31 +0000 (16:30 +0200)]
bnx2x: fix pf2vf bulletin DMA mapping leak
When freeing VF's DMA mappings, an already NULLed pointer was checked
again due to an apparent copy&paste error. Consequently, the pf2vf
bulletin DMA mapping was not freed.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 996652c7050c70008e4434af108be6f15f20fbd0) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Mintz, Yuval [Thu, 1 Jun 2017 12:57:56 +0000 (15:57 +0300)]
bnx2x: Fix Multi-Cos
Apparently multi-cos hasn't been working for bnx2x for quite some time -
driver implements ndo_select_queue() to allow queue-selection
for FCoE, but the regular L2 flow would cause it to modulo the
fallback's result by the number of queues.
The fallback would return a queue matching the needed tc
[via __skb_tx_hash()], but since the modulo is by the number of TSS
queues where number of TCs is not accounted, transmission would always
be done by a queue configured into using TC0.
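The arithmetic behind the problem can be shown with a short sketch. It assumes, for illustration only, that Tx queues are laid out as tc * num_eth_queues + rss_index; the function names are hypothetical and the "fixed" variant is one plausible way to account for the number of TCs, not necessarily the exact driver change.

```c
#include <assert.h>

/* Modulo by the number of TSS queues only: the result always falls in
 * the first num_eth_queues indices, i.e. the TC0 range. */
static unsigned int select_queue_buggy(unsigned int fallback,
				       unsigned int num_eth_queues)
{
	assert(num_eth_queues > 0);
	return fallback % num_eth_queues;
}

/* Modulo by queues times classes: the TC chosen by the fallback survives. */
static unsigned int select_queue_fixed(unsigned int fallback,
				       unsigned int num_eth_queues,
				       unsigned int max_cos)
{
	assert(num_eth_queues > 0 && max_cos > 0);
	return fallback % (num_eth_queues * max_cos);
}

/* Traffic class a queue index belongs to under the assumed layout. */
static unsigned int queue_to_tc(unsigned int queue, unsigned int num_eth_queues)
{
	assert(num_eth_queues > 0);
	return queue / num_eth_queues;
}
```

With 4 queues and 3 classes, a fallback result of 9 (TC2) is reduced to queue 1 (TC0) by the first variant and kept as queue 9 (TC2) by the second.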
Fixes: ada7c19e6d27 ("bnx2x: use XPS if possible for bnx2x_select_queue instead of pure hash") Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3968d38917eb9bd0cd391265f6c9c538d9b33ffa) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e39513259450a8312cb98a9a3b16bb924310dbcc) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Michal Schmidt [Fri, 3 Mar 2017 16:08:33 +0000 (17:08 +0100)]
bnx2x: fix incorrect filter count in an error message
filters->count is the number of filters we were supposed to configure.
There is no reason to increase it by +1 when printing the count in an error
message.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 74bcbeb7d77ec92e4262fc340cb436ef7d98ba01) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 78d5505432436516456c12abbe705ec8dee7ee2b) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 83bd9eb8fc69cdd5135ed6e1f066adc8841800fd) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Michal Schmidt [Fri, 3 Mar 2017 16:08:30 +0000 (17:08 +0100)]
bnx2x: fix possible overrun of VFPF multicast addresses array
It is too late to check for the limit of the number of VF multicast
addresses after they have already been copied to the req->multicast[]
array, possibly overflowing it.
Do the check before copying.
Also fix the error path to not skip unlocking vf2pf_mutex.
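Checking the limit before copying can be sketched as follows. The structure, the limit MC_MAX and the function fill_mc_req() are hypothetical and much simpler than the driver's request handling; the sketch only shows the order of the check and the copy.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MC_MAX		32	/* hypothetical capacity of the request array */
#define ETH_ALEN	6	/* length of an Ethernet address */

struct mc_req {
	uint8_t multicast[MC_MAX][ETH_ALEN];
	unsigned int n;
};

/* Copy count addresses (count * ETH_ALEN bytes at addrs) into the
 * request. The limit is checked first, so an oversized list is
 * rejected before anything is written to the array. */
static int fill_mc_req(struct mc_req *req, const uint8_t *addrs,
		       unsigned int count)
{
	assert(req != NULL && addrs != NULL);
	if (count > MC_MAX)
		return -EINVAL;
	memcpy(req->multicast, addrs, (size_t)count * ETH_ALEN);
	req->n = count;
	return 0;
}
```

An oversized list leaves the request untouched and returns an error, instead of overflowing the array and only then noticing the problem.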
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 22118d861cec5da6ed525aaf12a3de9bfeffc58f) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Michal Schmidt [Fri, 3 Mar 2017 16:08:29 +0000 (17:08 +0100)]
bnx2x: lower verbosity of VF stats debug messages
When BNX2X_MSG_IOV is enabled, the driver produces too many VF statistics
messages. Lower the verbosity of the VF stats messages similarly as in
commit 76ca70fabbdaa3 ("bnx2x: [Debug] change verbosity of some prints").
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 850268d320f0c7c5eb7ad0a62ef21859fa331ded) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Michal Schmidt [Fri, 3 Mar 2017 16:08:28 +0000 (17:08 +0100)]
bnx2x: prevent crash when accessing PTP with interface down
It is possible to crash the kernel by accessing a PTP device while its
associated bnx2x interface is down. Before the interface is brought up,
the timecounter is not initialized, so accessing it results in NULL
dereference.
Fix it by checking if the interface is up.
Use -ENETDOWN as the error code when the interface is down.
-EFAULT in bnx2x_ptp_adjfreq() did not seem right.
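The guard described above can be modelled in a few lines. struct nic_model and ptp_gettime_model() are hypothetical; the sketch only shows an accessor that refuses to touch time state that has not been initialized and reports -ENETDOWN instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an interface with PTP time state. */
struct nic_model {
	bool is_up;		/* interface has been brought up */
	uint64_t timecounter;	/* only meaningful while is_up is true */
};

/* Read the time only when the interface is up; otherwise fail with
 * -ENETDOWN rather than using uninitialized state. */
static int ptp_gettime_model(const struct nic_model *nic, uint64_t *ns)
{
	assert(nic != NULL && ns != NULL);
	if (!nic->is_up)
		return -ENETDOWN;
	*ns = nic->timecounter;
	return 0;
}
```

The output argument is left untouched on failure, so a caller that checks the return code never sees stale data.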
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 466e8bf10ac104d96e1ea813e8126e11cb72ea20) Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Tomas Jedlicka [Fri, 7 Apr 2017 20:51:53 +0000 (16:51 -0400)]
dtrace: FBT module support and SPARCs return probes
This fix adds two features to FBT provider:
1) support for modules
2) support for return probes on SPARC
The module support on x86 was almost ready, as it does not rely on trampolines and
uses hashtables of tracepoints. This works well if we know the number of probes in
advance, so we can reserve the correct amount of memory at module load time. Unfortunately
that is not possible on SPARC and we need to allocate a trampoline dynamically.
Major part of this code is about removing all static assumptions about FBT from kernel
code and moving the responsibility to dtrace modules. Trampolines for SPARC are now
allocated dynamically (including kernel's pseudo module). This applies to SDT trampolines
too.
Second change adds scan for return probes on SPARC with small heuristics to quickly
skip over cases that are not interesting for DTrace. At the same time this patch
allocates new SPARC Trap for FBT.
Support for the .init section is not available on any platform. The .init section is freed
after a module is fully loaded and it is not possible to remove its probes without
further changes in the DTrace framework (modules). This is deferred for later work.
Signed-off-by: Tomas Jedlicka <tomas.jedlicka@oracle.com> Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com> Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
mm: fix use-after-free if memory allocation failed in vma_adjust()
There's one case when vma_adjust() expands the vma, overlapping with
*two* next vma. See case 6 of mprotect, described in the comment to
vma_merge().
To handle this (and only this) situation we iterate twice over main part
of the function. See "goto again".
Vegard reported[1] that he sees an out-of-bounds access complaint from
KASAN, if anon_vma_clone() on the *second* iteration fails.
This happens because we free 'next' vma by the end of first iteration
and don't have a way to undo this if anon_vma_clone() fails on the
second iteration.
The solution is to do all required allocations upfront, before we touch
vmas.
The allocation on the second iteration is only required if first two
vmas don't have anon_vma, but third does. So we need, in total, one
anon_vma_clone() call.
It's easy to adjust 'exporter' to the third vma for such case.
uek-rpm nano: Signature verification support in kexec_file_load
The following configuration options to support
signature verification in the kexec_file_load
syscall are enabled:
CONFIG_KEXEC_VERIFY_SIG=y
CONFIG_KEXEC_BZIMAGE_VERIFY_SIG=y
CONFIG_PKCS7_MESSAGE_PARSER=y
CONFIG_SIGNED_PE_FILE_VERIFICATION=y
Driver fails Beacon OFF if frequency is set to 0. As per fc-ls spec,
status, capability, frequency and duration fields are only applicable
for Beacon ON.
Remove frequency and type checks. Reject Beacon ON if duration is non
zero.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
OS crashes after the completion of firmware download.
Failure in posting SCSI SGL buffers because number of SGL buffers is
less than total count. Some of the pending IOs are not completed by
driver. SGL buffers for these IOs are not added back to the list.
Pending IOs are not completed because lpfc_wq_list list is initialized
before completion of pending IOs.
Postpone lpfc_wq_list reinitialization by moving
lpfc_sli4_queue_destroy() after lpfc_hba_down_post().
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Mailbox submission fails because mailbox interrupt is disabled. Mailbox
interrupt is disabled during port reset.
Do reset only for physical port.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Kernel panic when log_verbose is set to 0xffffffff
phba->pport is dereferenced before it is initialized
Fix: Do not dereference phba->pport if it is NULL
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
On hbacmd reset failure, observing wrong string "nline" in kernel log.
On failure, a non-negative value (1) is returned from the sysfs store
routine. It is interpreted as count by kernel and store routine is
called again with the remaining characters as input.
Fix: Return negative error code (-EIO) in case of failure.
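Why a return value of 1 produces the string "nline" can be shown with a userspace model of a write loop. The names failing_store() and write_loop() are hypothetical and the model is deliberately simple: a positive return is taken as the number of bytes consumed and the rest is resubmitted, while a negative return stops the loop.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MAX_CALLS 8

static char seen[MAX_CALLS][16];	/* input passed to each store call */
static int calls;			/* number of store calls so far */

/* Store routine that always fails and reports failure as fail_ret. */
static long failing_store(const char *buf, size_t count, long fail_ret)
{
	if (calls < MAX_CALLS)
		snprintf(seen[calls], sizeof(seen[calls]), "%.*s",
			 (int)count, buf);
	calls++;
	return fail_ret;
}

/* Writer that resubmits unconsumed bytes after a short positive return
 * and gives up on zero or a negative error code. */
static long write_loop(const char *buf, long fail_ret)
{
	size_t len, done = 0;

	assert(buf != NULL);
	len = strlen(buf);
	calls = 0;
	while (done < len) {
		long ret = failing_store(buf + done, len - done, fail_ret);

		if (ret <= 0)
			return ret;
		done += (size_t)ret;
	}
	return (long)done;
}
```

Writing "online" with a failure value of 1 makes the second call see "nline", while a failure value of -EIO ends the write after a single call.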
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>