Vlad Buslov [Sun, 4 Aug 2019 10:52:31 +0000 (13:52 +0300)]
net/mlx5e: Only access fully initialized flows in neigh update
To remove dependency on rtnl lock and prevent neigh update code from
accessing uninitialized flows when executing concurrently with tc, extend
mlx5e_tc_flow with 'init_done' completion. Modify helper
mlx5e_take_all_encap_flows() to wait for flow completion after obtaining
reference to it. Modify mlx5e_tc_encap_flows_del() and
mlx5e_tc_encap_flows_add() to skip flows that don't have OFFLOADED flag
set, which can happen if concurrent flow initialization failed.
This commit finishes neigh update refactoring for concurrent execution
started in previous change in this series.
Vlad Buslov [Sun, 4 Aug 2019 09:25:57 +0000 (12:25 +0300)]
net/mlx5e: Refactor neigh update for concurrent execution
In order to remove dependency on rtnl lock and allow neigh update workqueue
task to execute concurrently with tc, refactor mlx5e_rep_neigh_update() for
concurrent execution:
- Lock encap table when accessing encap entry to prevent concurrent
changes. To do this properly, the initial encap state check is moved from
mlx5e_rep_neigh_update() into mlx5e_rep_update_flows() to be performed
under encap_tbl_lock protection.
- Wait for encap to be fully initialized before accessing it by means of
'res_ready' completion.
- Add mlx5e_take_all_encap_flows() helper which is used to construct a
temporary list of flows and efi indexes that is used to access current
encap data in flow which can be attached to multiple encaps
simultaneously. Release the flows from temporary list after
encap_tbl_lock critical section. This is necessary because
mlx5e_flow_put() can't be called while holding encap_tbl_lock.
- Modify mlx5e_tc_encap_flows_add() and mlx5e_tc_encap_flows_del() to work
with user-provided list of flows built by mlx5e_take_all_encap_flows(),
instead of traversing encap flow list directly.
This is first step in complex neigh update refactoring, which is finished
by following commit in this series.
Vlad Buslov [Sat, 3 Aug 2019 18:43:06 +0000 (21:43 +0300)]
net/mlx5e: Refactor neigh used value update for concurrent execution
In order to remove dependency on rtnl lock and allow neigh used value
update workqueue task to execute concurrently with tc, refactor
mlx5e_tc_update_neigh_used_value() for concurrent execution:
- Lock encap table when accessing encap entry to prevent concurrent
changes.
- Save offloaded encap flows to temporary list and release them after encap
entry is updated. Add mlx5e_put_encap_flow_list() helper which is
intended to be shared with neigh update code in following patch in this
series. This is necessary because mlx5e_flow_put() can't be called while
holding encap_tbl_lock.
Vlad Buslov [Mon, 11 Jun 2018 11:16:14 +0000 (14:16 +0300)]
net/mlx5e: Protect neigh hash encap list with spinlock and rcu
Rcu-ify mlx5e_neigh_hash_entry->encap_list by changing operations on encap
list to their rcu counterparts and extending encap structure with rcu_head
to free the encap instances after rcu grace period. Use rcu read lock when
traversing encap list. Implement helper mlx5e_get_next_valid_encap()
function that is used by mlx5e_tc_update_neigh_used_value() to safely
iterate over valid entries of nhe->encap_list.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
To remove dependency on rtnl lock, always take neigh update encap lock when
modifying neigh update hash table and list. Originally, this lock was only
used to synchronize with netevent handler function, which is called from bh
context and cannot use rtnl lock for synchronization. Take lock in encap
entry attach function to prevent concurrent modifications of neigh update
hash table and list.
Taking the encap lock when creating new nhe introduces a problem that we
need to allocate new entry with sleeping GFP_KERNEL flag while holding a
spinlock. However, since previous patch in this series has already
converted lookup in netevent handler function to user rcu read lock instead
of encap lock, we can safely convert the lock type to mutex.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Vlad Buslov [Fri, 8 Jun 2018 08:49:28 +0000 (11:49 +0300)]
net/mlx5e: Extend neigh hash entry with rcu
To remove dependency on rtnl lock and to allow unlocked iteration over list
of neigh hash entries, extend nhe with rcu. Change operations on neigh list
to their rcu counterparts and free neigh hash entry with rcu timeout.
Introduce mlx5e_get_next_nhe() helper that is used to iterate over rcu
neigh list with reference to nhe taken.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Vlad Buslov [Fri, 8 Jun 2018 08:31:28 +0000 (11:31 +0300)]
net/mlx5e: Always take reference to neigh entry
Neigh entry has reference counter, however it is only used when scheduling
neigh update event. In all other cases reference to neigh entry is not
taken while working with it. Neigh code relies on synchronization provided
by rtnl lock and uses encap list size as implicit reference counter.
To remove dependency on rtnl lock, always take reference to neigh entry
while using it. Remove neigh entry from hash table and delete it only when
reference counter reaches zero. This can result spurious neigh update
events, when there is an event on entry that has zero encaps attached.
However, such events are rare and properly handled by neigh update handler.
Extend encap entry with reference to neigh hash entry in order to be able
to directly release it when encap is detached, instead of lookup nhe by key
through hash table. Extend nhe with reference to device priv structure to
guarantee correctness when nhe is used with stack devices, bond setup, in
which case it is non-trivial to determine correct device when releasing the
nhe.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Vlad Buslov [Wed, 13 Jun 2018 11:49:46 +0000 (14:49 +0300)]
net/mlx5e: Extract code that queues neigh update work into function
As a preparation for following refactoring that removes rtnl lock
dependency from neigh hash entry handlers, extract code that enqueues neigh
update work into standalone function. This commit doesn't change
functionality.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
YueHaibing [Wed, 21 Aug 2019 13:54:06 +0000 (21:54 +0800)]
net: stmmac: dwmac-meson: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
YueHaibing [Wed, 21 Aug 2019 13:51:30 +0000 (21:51 +0800)]
net: stmmac: dwmac-meson8b: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
YueHaibing [Wed, 21 Aug 2019 13:46:13 +0000 (21:46 +0800)]
net: systemport: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
YueHaibing [Wed, 21 Aug 2019 13:41:31 +0000 (21:41 +0800)]
net: bcmgenet: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
YueHaibing [Wed, 21 Aug 2019 12:48:50 +0000 (20:48 +0800)]
net: ethernet: ti: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
YueHaibing [Wed, 21 Aug 2019 12:32:03 +0000 (20:32 +0800)]
amd-xgbe: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 21 Aug 2019 20:02:32 +0000 (13:02 -0700)]
Merge tag 'mac80211-next-for-davem-2019-08-21' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg:
====================
Here are a few groups of changes:
* EDMG channel support (60 GHz, just a single patch)
* initial 6/7 GHz band support (Arend)
* association timestamp recording (Ben)
* rate control improvements for better performance with
the mt76 driver (Felix)
* various fixes for previous HE support changes (John)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Aug 2019 07:19:37 +0000 (10:19 +0300)]
selftests: mlxsw: Add a test case for devlink-trap
Test generic devlink-trap functionality over mlxsw. These tests are not
specific to a single trap, but do not check the devlink-trap common
infrastructure either.
Currently, the only test case is device deletion (by reloading the
driver) while packets are being trapped.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Aug 2019 07:19:32 +0000 (10:19 +0300)]
mlxsw: reg: Add new trap actions
Subsequent patches will add discard traps support in mlxsw. The driver
cannot configure such traps with a normal trap action, but needs to use
exception trap action, which also increments an error counter.
On the other hand, when these traps are initialized or set to drop
action, they should use the default drop action set by the firmware.
This guarantees that when the feature is disabled we get the exact same
behavior as before the feature was introduced.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Aug 2019 07:19:31 +0000 (10:19 +0300)]
mlxsw: core: Add API to set trap action
Up until now the action of a trap was never changed during its lifetime.
This is going to change by subsequent patches that will allow devlink to
control the action of certain traps.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Fietkau [Tue, 20 Aug 2019 09:54:49 +0000 (11:54 +0200)]
mac80211: minstrel_ht: improve rate probing for devices with static fallback
On some devices that only support static rate fallback tables sending rate
control probing packets can be really expensive.
Probing lower rates can already hurt throughput quite a bit. What hurts even
more is the fact that on mt76x0/mt76x2, single probing packets can only be
forced by directing packets at a different internal hardware queue, which
causes some heavy reordering and extra latency.
The reordering issue is mainly problematic while pushing lots of packets to
a particular station. If there is little activity, the overhead of probing is
neglegible.
The static fallback behavior is designed to pretty much only handle rate
control algorithms that use only a very limited set of rates on which the
algorithm switches up/down based on packet error rate.
In order to better support that kind of hardware, this patch implements a
different approach to rate probing where it switches to a slightly higher rate,
waits for tx status feedback, then updates the stats and switches back to
the new max throughput rate. This only triggers above a packet rate of 100
per stats interval (~50ms).
For that kind of probing, the code has to reduce the set of probing rates
a lot more compared to single packet probing, so it uses only one packet
per MCS group which is either slightly faster, or as close as possible to
the max throughput rate.
This allows switching between similar rates with different numbers of
streams. The algorithm assumes that the hardware will work its way lower
within an MCS group in case of retransmissions, so that lower rates don't
have to be probed by the high packets per second rate probing code.
To further reduce the search space, it also does not probe rates with lower
channel bandwidth than the max throughput rate.
At the moment, these changes will only affect mt76x0/mt76x2.
On hardware with static fallback tables (e.g. mt76x2), rate probing attempts
can be very expensive.
On such devices, avoid sampling rates slower than the per-group max throughput
rate, based on the assumption that the fallback table will take care of probing
lower rates within that group if the higher rates fail.
To further reduce unnecessary probing attempts, skip duplicate attempts on
rates slower than the max throughput rate.
802.11ay specification defines Enhanced Directional Multi-Gigabit
(EDMG) STA and AP which allow channel bonding of 2 channels and more.
Introduce new NL attributes that are needed for enabling and
configuring EDMG support.
Two new attributes are used by kernel to publish driver's EDMG
capabilities to the userspace:
NL80211_BAND_ATTR_EDMG_CHANNELS - bitmap field that indicates the 2.16
GHz channel(s) that are supported by the driver.
When this attribute is not set it means driver does not support EDMG.
NL80211_BAND_ATTR_EDMG_BW_CONFIG - represent the channel bandwidth
configurations supported by the driver.
Additional two new attributes are used by the userspace for connect
command and for AP configuration:
NL80211_ATTR_WIPHY_EDMG_CHANNELS
NL80211_ATTR_WIPHY_EDMG_BW_CONFIG
New rate info flag - RATE_INFO_FLAGS_EDMG, can be reported from driver
and used for bitrate calculation that will take into account EDMG
according to the 802.11ay specification.
Arend van Spriel [Fri, 2 Aug 2019 11:31:02 +0000 (13:31 +0200)]
cfg80211: add 6GHz in code handling array with NUM_NL80211_BANDS entries
In nl80211.c there is a policy for all bands in NUM_NL80211_BANDS and
in trace.h there is a callback trace for multicast rates which is per
band in NUM_NL80211_BANDS. Both need to be extended for the new
NL80211_BAND_6GHZ.
Arend van Spriel [Fri, 2 Aug 2019 11:31:00 +0000 (13:31 +0200)]
cfg80211: util: add 6GHz channel to freq conversion and vice versa
Extend the functions ieee80211_channel_to_frequency() and
ieee80211_frequency_to_channel() to support 6GHz band according
specification in 802.11ax D4.1 27.3.22.2.
Arend van Spriel [Fri, 2 Aug 2019 11:30:58 +0000 (13:30 +0200)]
nl80211: add 6GHz band definition to enum nl80211_band
In the 802.11ax specification a new band is introduced, which
is also proposed by FCC for unlicensed use. This band is referred
to as 6GHz spanning frequency range from 5925 to 7125 MHz.
John Crispin [Wed, 7 Aug 2019 07:59:49 +0000 (09:59 +0200)]
mac80211: add missing length field increment when generating Radiotap header
The code generating the Tx Radiotap header when using tx_status_ext was
missing a field increment after setting the VHT bandwidth.
Fixes: 3d07ffcaf320 ("mac80211: add struct ieee80211_tx_status support to ieee80211_add_tx_radiotap_header") Signed-off-by: John Crispin <john@phrozen.org> Link: https://lore.kernel.org/r/20190807075949.32414-4-john@phrozen.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
John Crispin [Wed, 7 Aug 2019 07:59:48 +0000 (09:59 +0200)]
mac80211: 80Mhz was not reported properly when using tx_status_ext
When reporting 80MHz, we need to set 4 and not 2 inside the corresponding
field inside the Tx Radiotap header.
Fixes: 3d07ffcaf320 ("mac80211: add struct ieee80211_tx_status support to ieee80211_add_tx_radiotap_header") Signed-off-by: John Crispin <john@phrozen.org> Link: https://lore.kernel.org/r/20190807075949.32414-3-john@phrozen.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
John Crispin [Wed, 7 Aug 2019 07:59:47 +0000 (09:59 +0200)]
mac80211: fix bad guard when reporting legacy rates
When reporting legacy rates inside the TX Radiotap header we need to split
the check between "uses tx_statua_ext" and "is legacy rate". Not doing so
would make the code drop into the !tx_status_ext path.
Fixes: 3d07ffcaf320 ("mac80211: add struct ieee80211_tx_status support to ieee80211_add_tx_radiotap_header") Signed-off-by: John Crispin <john@phrozen.org> Link: https://lore.kernel.org/r/20190807075949.32414-2-john@phrozen.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
John Crispin [Wed, 7 Aug 2019 07:59:46 +0000 (09:59 +0200)]
mac80211: fix TX legacy rate reporting when tx_status_ext is used
The RX Radiotap header length was not calculated properly when reporting
legacy rates using tx_status_ext.
Fixes: 3d07ffcaf320 ("mac80211: add struct ieee80211_tx_status support to ieee80211_add_tx_radiotap_header") Signed-off-by: John Crispin <john@phrozen.org> Link: https://lore.kernel.org/r/20190807075949.32414-1-john@phrozen.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
David S. Miller [Wed, 21 Aug 2019 00:02:44 +0000 (17:02 -0700)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2019-08-20
This series contains updates to ice driver only.
Brett fixes the detection of a hung transmit ring by checking the
software based tail (next_to_use) to determine if there is pending work.
Updates the driver to assume that using more than one receive queue per
receive ring container is a rare case, so use unlikely() in the case
were we actually need to divide our budget for multiple queues. Fixed
an issue where the write back on ITR bit was not being set when
interrupts are disabled, which was causing only write backs when polling
only when a cache line is filled. Cleans up unnecessary wait times
during VF bring up and reset paths. Increased the mailbox size for
receive queues that are used to communicate with VFs to accommodate the
large number of VFs that the driver can support.
Akeem restructures the initialization flows for VFs, including how VFs
are configured and resources allocated to improve flows so that when we
clean up resources, we do not try to free resources that were never
allocated. Organizes code to ensure that VF specific code is located in
the SR-IOV specific file.
Paul fixes an issue when setting the pause parameter which was
incorrectly blocking users from changing receive or transmit pause
settings. Ensure register access for MSIX vector index is only done in
the PF space and not absolute device space.
Usha fixes a potential kernel hang in the DCB rebuild path when in CEE
mode, where the ETS recommended DCB configuration is not being set or
set correctly.
Mitch updates the driver to process all receive descriptors, regardless
of the size of the associated data.
Tony fixes and issue during the reset/rebuild path of a PF VSI where we
were assuming that the PF VSI was always to be enabled, which can
attempt to bring up a PF VSI on a downed interface which can lead to
various crashes.
Pawel fixes up variable definitions to match the type of data being
stored.
v2: Dropped patch 1 of the series to add ethtool support to query/add
channels on a VSI, while we re-qork the functionality to match the
ethtool expected behavior to report combined (Tx and Rx) numbers.
v3: Updated patch 4 to use kzalloc() and kfree() instead devm_kzalloc()
and devm_kfree().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
ice: improve print for VF's when adding/deleting MAC filters
When we fail to add/delete MAC filters in the VF, the print doesn't
distinguish between the two. Fix that by printing whether or not we
failed to add/delete the MAC filter respectively.
Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
These queue variables are being assigned values that are type u16.
Change the local variables to match these types. Since these
represent queue counts, they should never be negative.
Signed-off-by: Pawel Kaminski <pawel.kaminski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ice: Move VF resources definition to SR-IOV specific file
In order to use some of the VF resources definition in the SR-IOV specific
virtchnl header file, this patch moves applicable code to
ice_virtchnl_pf.h file accordingly... and they should have been defined in
the destination file originally.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ice: Increase size of Mailbox receive queue for many VFs
Currently we use the ICE_MBXQ_LEN for both the Mailbox send and receive
queues that are used to communicate with VFs. This is fine for the send
queue because the PF driver will lock the queue for every single send,
but for the Mailbox receive queue every VF is posting to its Mailbox
send queue and the hardware is then handing the message to the PF on its
Mailbox receive queue. This becomes a problem with many VFs because it
seems to overburden the Mailbox receive queue on the PF. Fix this by
increasing the Mailbox receive queue for the PF to 512 entries.
The number 512 was determined based on the number of VFs supported by
the device. We can have a total of 256 VFs so in the worst case this
allows the VFs to put 2 messages in the PFs Mailbox receive queue at the
same time.
Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently there are a couple places where the VF is waiting too long when
checking the status of registers. This is causing the AVF driver to
spin for longer than necessary in the __IAVF_STARTUP state. Sometimes
it causes the AVF to go into the __IAVF_COMM_FAILED, which may retrigger
the __IAVF_STARTUP state. Try to reduce the chance of this happening by
removing unnecessary wait times in VF bringup/resets.
Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Paul Greenwalt [Thu, 25 Jul 2019 08:55:36 +0000 (01:55 -0700)]
ice: update GLINT_DYN_CTL and GLINT_VECT2FUNC register access
Register access for GLINT_DYN_CTL and GLINT_VECT2FUNC should be within
the PF space and not the absolute device space.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tony Nguyen [Thu, 25 Jul 2019 08:55:35 +0000 (01:55 -0700)]
ice: Do not always bring up PF VSI in ice_ena_vsi()
During rebuild ice_ena_vsi() is called to recover the VSI state.
This function assumes the PF VSI is always to be enabled, however,
it's possible that during reset/rebuild the interface can be
brought down. If this occurs, we can attempt to bring up the PF
VSI on a downed interface which can lead to various crashes. If
the interface is not running, do not bring up the associated VSI.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Thu, 25 Jul 2019 08:55:34 +0000 (01:55 -0700)]
ice: allow empty Rx descriptors
In some circumstances, the hardware will hand us a receive descriptor
which has no data attached, but is otherwise valid. The receive code was
improperly ignoring these descriptors, which result in an infinite loop.
To fix this, change the receive code to process all descriptors,
regardless of the size of the associated data. Add checks to the
memory-handling functions to allow for zero size.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes the set local MIB AQ call failures in the DCB rebuild path
by setting the defaults for the ETS recommended DCB configuration. Also,
willing bits for the DCB configuration needs to be set correctly. Resets
works fine in IEEE mode as the ETS recommended DCB configuration is
populated but not in CEE mode.
Without this patch, PFR causes the kernel hang in CEE mode.
Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ice: Set WB_ON_ITR when we don't re-enable interrupts
Currently when busy polling is enabled we aren't setting/enabling
WB_ON_ITR in the driver. This doesn't break the driver, but it does
cause issues. If we don't enable WB_ON_ITR mode we will still get
write-backs from hardware during polling when a cache line has been
filled, but if a cache line is not filled we will not get the
write-back because WB_ON_ITR is not set. Fix this by enabling
WB_ON_ITR in the driver when interrupts are disabled.
Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
this is a pull request for net-next/master consisting of 18 patches.
The first patch is by Geert Uytterhoeven, it removes the unused platform
data support from the rcar_can driver.
A patch by Nishka Dasgupta marks the structure peak_pciec_i2c_bit_ops in
the peak_pci driver as constant.
A patch by me removes the custom DMA support from the hi311x driver.
The next 4 patches target the tcan4x5x driver and are also by me, they
first clean up the driver a bit, and then add missing error handling and
fix a bug in the length calculation in the regmap callbacks.
The next 2 patches are by me for the m_can_platform driver, they also
remove unneeded casts and add missing error handling.
The remaining 9 patches all target the mcp251x driver. The first 5 are
clean up patches by me, the next relaxes the timing in the
mcp251x_hw_reset() function. Alexander Shiyan's patch improves the name
which is used while registering the interrupt handler. Phil Elwell's
patch improves the mcp251x_open() function to use the DT-supplied
interrupt flags instead of hard coding them. The final patch is again by
me, it removes the custom DMA support from the hi311x driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Greenwalt [Thu, 25 Jul 2019 08:55:31 +0000 (01:55 -0700)]
ice: fix set pause param autoneg check
When ETHTOOL_GLINKSETTINGS is defined get pause param pause->autoneg
reports SW configured setting, however when not defined get pause param
pause->autoneg reports the link status. Set pause param needs to compare
pause->autoneg with the same source as get pause param to block the user
from changing autoneg with the set pause param option, or the user
may be incorrectly blocked from changing Rx|Tx pause settings.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Tue, 20 Aug 2019 20:51:46 +0000 (13:51 -0700)]
Merge branch 's390-net-next'
Julian Wiedmann says:
====================
s390/net: updates 2019-08-20
please apply the following patches to net-next. This series brings a mix
of cleanups and small improvements for various parts of qeth's control
path. Also, a minor cleanup for ctcm and lcs.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 20 Aug 2019 14:46:43 +0000 (16:46 +0200)]
s390/lcs: don't use intparm for channel IO
lcs passes an intparm when calling ccw_device_*(), even though lcs_irq()
later makes no use of this.
To reduce the confusion, consistently pass 0 as intparm instead.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 20 Aug 2019 14:46:42 +0000 (16:46 +0200)]
s390/ctcm: don't use intparm for channel IO
ctcm passes an intparm when calling ccw_device_*(), even though
ctcm_irq_handler() later makes no use of this.
To reduce the confusion, consistently pass 0 as intparm instead.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 20 Aug 2019 14:46:41 +0000 (16:46 +0200)]
s390/qeth: streamline control code for promisc mode
We have logic to determine the desired promisc mode in _each_ code path.
Change things around so that there is a clean split between
(a) high-level code that selects the new mode, and (b) implementations
of the various mechanisms to program this mode.
This also keeps qeth_promisc_to_bridge() from polluting the debug logs
on each RX modeset.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 20 Aug 2019 14:46:40 +0000 (16:46 +0200)]
s390/qeth: get vnicc sub-cmd type from reply data
When processing the reply for a vnicc cmd, there's no need to remember
which specific sub-cmd type we initially sent. The reply itself contains
all the needed information.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 20 Aug 2019 14:46:39 +0000 (16:46 +0200)]
s390/qeth: merge qeth_reply struct into qeth_cmd_buffer
Except for card->read_cmd, every cmd we issue now passes through
qeth_send_control_data() and allocates a qeth_reply struct. The way we
use this struct requires additional refcounting, and pointer tracking.
Clean up things by moving most of qeth_reply's content into the main
cmd struct. This keeps things in one place, saves us the additional
refcounting and simplifies the overall code flow.
A nice little benefit is that we can now match incoming replies against
the pending requests themselves, without caching the requests' seqnos.
The qeth_reply struct stays around for a little bit longer in a shrunk
form, to avoid touching every single callback.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 20 Aug 2019 14:46:38 +0000 (16:46 +0200)]
s390/qeth: keep cmd alive after IO completion
Current code releases the cmd struct after its initial IO has completed.
Any reply processing is done independently, using a separate qeth_reply
struct.
In preparation for merging the cmd and reply structs together, take an
additional reference on the cmd object so that it stays around all the
way until qeth_send_control_data() returns.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 20 Aug 2019 14:46:37 +0000 (16:46 +0200)]
s390/qeth: use correct length field in SNMP cmd callback
qeth_snmp_command_cb() is the only cmd callback that pulls the reply's
data length from a low-level transport header field. This requires
additional complexity (ie. reply->offset) to make the header accessible
to what is supposed to be a pure IPA cmd callback.
Adapter cmds have a length field in their sub-cmd header, get the data
length from there instead.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 20 Aug 2019 14:46:36 +0000 (16:46 +0200)]
s390/qeth: propagate length of processed cmd IO data to callback
When an cmd IO completes in qeth_irq(), calculate how much data was
processed by the device and pass this value to the cmd's callback.
This allows cmds that retrieve data from the device to check whether
sufficient data was received, so we do that in qeth_read_conf_data_cb().
Suggested-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gavi Teitz [Sun, 11 Aug 2019 06:21:20 +0000 (09:21 +0300)]
net/mlx5: Fix the order of fc_stats cleanup
Previously, mlx5_cleanup_fc_stats() would cleanup the flow counter
pool beofre releasing all the counters to it, which would result in
flow counter bulks not getting freed. Resolve this by changing the
order in which elements of fc_stats are cleaned up, so that the flow
counter pool is cleaned up after all the counters are released.
Also move cleanup actions for freeing the bulk query memory and
destroying the idr to the end of mlx5_cleanup_fc_stats().
Vlad Buslov [Mon, 12 Aug 2019 12:38:38 +0000 (15:38 +0300)]
net/mlx5e: Fix deallocation of non-fully init encap entries
Recent rtnl lock dependency refactoring changed encap entry attach code to
insert encap entry to hash table before it was fully initialized in order
to allow concurrent tc users to wait on completion for encap entry to
finish initialization. That change required all the users of encap entry to
obtain reference to it first and for caller that creates encap to put
reference to it on error, instead of freeing the entry memory directly.
However, releasing reference to such encap entry that wasn't fully
initialized causes NULL pointer dereference in
mlx5e_rep_encap_entry_detach() which expects e->out_dev to be set and encap
to be attached to nhe:
To fix the issue, set e->compl_result to positive value after encap was
initialized successfully. Check e->compl_result value in
mlx5e_encap_dealloc() and only detach and dealloc encap if the value is
positive.
Aya Levin [Wed, 26 Jun 2019 20:21:40 +0000 (23:21 +0300)]
net/mlx5e: Report and recover from CQE with error on RQ
Add support for report and recovery from error on completion on RQ by
setting the queue back to ready state. Handle only errors with a
syndrome indicating the RQ might enter error state and could be
recovered.
Aya Levin [Tue, 25 Jun 2019 18:42:27 +0000 (21:42 +0300)]
net/mlx5e: Report and recover from rx timeout
Add support for report and recovery from rx timeout. On driver open we
post NOP work request on the rx channels to trigger napi in order to
fillup the rx rings. In case napi wasn't scheduled due to a lost
interrupt, perform EQ recovery.
Aya Levin [Tue, 25 Jun 2019 14:44:28 +0000 (17:44 +0300)]
net/mlx5e: Report and recover from CQE error on ICOSQ
Add support for report and recovery from error on completion on ICOSQ.
Deactivate RQ and flush, then deactivate ICOSQ. Set the queue back to
ready state (firmware) and reset the ICOSQ and the RQ (software
resources). Finally, activate the ICOSQ and the RQ.
Aya Levin [Tue, 2 Jul 2019 12:47:29 +0000 (15:47 +0300)]
net/mlx5e: Split open/close ICOSQ into stages
Align ICOSQ open/close behaviour with RQ and SQ. Split open flow into
open and activate where open handles creation and activate enables the
queue. Do a symmetric thing in close flow: split into close and
deactivate.
Aya Levin [Tue, 25 Jun 2019 13:26:46 +0000 (16:26 +0300)]
net/mlx5e: Add support to rx reporter diagnose
Add rx reporter, which supports diagnose call-back. Diagnostics output
include: information common to all RQs: RQ type, RQ size, RQ stride
size, CQ size and CQ stride size. In addition advertise information per
RQ and its related icosq and attached CQ.
Aya Levin [Thu, 11 Jul 2019 14:17:36 +0000 (17:17 +0300)]
net/mlx5e: Add helper functions for reporter's basics
Introduce helper functions for create and destroy reporters and update
channels. In the following patch, rx reporter is added and it will use
these helpers too.
Aya Levin [Sun, 30 Jun 2019 08:34:15 +0000 (11:34 +0300)]
net/mlx5e: Extend tx reporter diagnostics output
Enhance tx reporter's diagnostics output to include: information common
to all SQs: SQ size, SQ stride size.
In addition add channel ix, tc, txq ix, cc and pc.
Aya Levin [Mon, 24 Jun 2019 18:41:21 +0000 (21:41 +0300)]
net/mlx5e: Extend tx diagnose function
The following patches in the set enhance the diagnostics info of tx
reporter. Therefore, it is better to pass a pointer to the SQ for
further data extraction.
Aya Levin [Mon, 1 Jul 2019 12:08:13 +0000 (15:08 +0300)]
net/mlx5e: Generalize tx reporter's functionality
Prepare for code sharing with rx reporter, which is added in the
following patches in the set. Introduce a generic error_ctx for
agnostic recovery despatch.
Aya Levin [Mon, 1 Jul 2019 12:51:51 +0000 (15:51 +0300)]
net/mlx5e: Change naming convention for reporter's functions
Change from mlx5e_tx_reporter_* to mlx5e_reporter_tx_*. In the following
patches in the set rx reporter is added, the new naming convention is
more uniformed.
====================
net: dsa: enable and disable all ports
The DSA stack currently calls the .port_enable and .port_disable switch
callbacks for slave ports only. However, it is useful to call them for all
port types. For example this allows some drivers to delay the optimization
of power consumption after the switch is setup. This can also help reducing
the setup code of drivers a bit.
The first DSA core patches enable and disable all ports of a switch, regardless
their type. The last mv88e6xxx patches remove redundant code from the driver
setup and the said callbacks, now that they handle SERDES power for all ports.
Changes in v2: do not guard .port_disable for broadcom switches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 19 Aug 2019 20:00:52 +0000 (16:00 -0400)]
net: dsa: mv88e6xxx: enable SERDES after setup
SERDES is powered on for CPU and DSA ports and powered down for unused
ports at setup time. But now that DSA calls mv88e6xxx_port_enable
and mv88e6xxx_port_disable for all ports, the SERDES power can now
be handled after setup inconditionally for all ports.
Using the port enable and disable callbacks also have the benefit to
handle the SERDES IRQ for non user ports as well.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 19 Aug 2019 20:00:51 +0000 (16:00 -0400)]
net: dsa: mv88e6xxx: do not change STP state on port disabling
When disabling a port, that is not for the driver to decide what to
do with the STP state. This is already handled by the DSA layer.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 19 Aug 2019 20:00:50 +0000 (16:00 -0400)]
net: dsa: enable and disable all ports
Call the .port_enable and .port_disable functions for all ports,
not only the user ports, so that drivers may optimize the power
consumption of all ports after a successful setup.
Unused ports are now disabled on setup. CPU and DSA ports are now
enabled on setup and disabled on teardown. User ports were already
enabled at slave creation and disabled at slave destruction.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>