]> www.infradead.org Git - users/griffoul/linux.git/log
users/griffoul/linux.git
7 years agoselftests/bpf: check for spurious extacks from the driver
Jakub Kicinski [Thu, 25 Jan 2018 22:00:52 +0000 (14:00 -0800)]
selftests/bpf: check for spurious extacks from the driver

Drivers should not report errors when offload is not forced.
Check stdout and stderr for familiar messages when with no
skip flags and with skip_hw.  Check for add, replace, and
destroy.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: use tc_cls_can_offload_and_chain0()
Jakub Kicinski [Thu, 25 Jan 2018 22:00:51 +0000 (14:00 -0800)]
mlxsw: use tc_cls_can_offload_and_chain0()

Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoi40e: use tc_cls_can_offload_and_chain0()
Jakub Kicinski [Thu, 25 Jan 2018 22:00:50 +0000 (14:00 -0800)]
i40e: use tc_cls_can_offload_and_chain0()

Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoixgbe: use tc_cls_can_offload_and_chain0()
Jakub Kicinski [Thu, 25 Jan 2018 22:00:49 +0000 (14:00 -0800)]
ixgbe: use tc_cls_can_offload_and_chain0()

Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobnxt: use tc_cls_can_offload_and_chain0()
Jakub Kicinski [Thu, 25 Jan 2018 22:00:48 +0000 (14:00 -0800)]
bnxt: use tc_cls_can_offload_and_chain0()

Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlx5: use tc_cls_can_offload_and_chain0()
Jakub Kicinski [Thu, 25 Jan 2018 22:00:47 +0000 (14:00 -0800)]
mlx5: use tc_cls_can_offload_and_chain0()

Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agocxgb4: use tc_cls_can_offload_and_chain0()
Jakub Kicinski [Thu, 25 Jan 2018 22:00:46 +0000 (14:00 -0800)]
cxgb4: use tc_cls_can_offload_and_chain0()

Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonfp: use tc_cls_can_offload_and_chain0()
Jakub Kicinski [Thu, 25 Jan 2018 22:00:45 +0000 (14:00 -0800)]
nfp: use tc_cls_can_offload_and_chain0()

Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonetdevsim: use tc_cls_can_offload_and_chain0()
Jakub Kicinski [Thu, 25 Jan 2018 22:00:44 +0000 (14:00 -0800)]
netdevsim: use tc_cls_can_offload_and_chain0()

Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agopkt_cls: add new tc cls helper to check offload flag and chain index
Jakub Kicinski [Thu, 25 Jan 2018 22:00:43 +0000 (14:00 -0800)]
pkt_cls: add new tc cls helper to check offload flag and chain index

Very few (mlxsw) upstream drivers seem to allow offload of chains
other than 0.  Save driver developers typing and add a helper for
checking both if ethtool's TC offload flag is on and if chain is 0.
This helper will set the extack appropriately in both error cases.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoqed: code indent should use tabs where possible
Rohit Visavalia [Thu, 25 Jan 2018 10:26:14 +0000 (15:56 +0530)]
qed: code indent should use tabs where possible

Issue found by checkpatch.

Signed-off-by: Rohit Visavalia <rohit.visavalia@softnautics.com>
Acked-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobe2net: networking block comments don't use an empty /* line
Rohit Visavalia [Thu, 25 Jan 2018 12:58:24 +0000 (18:28 +0530)]
be2net: networking block comments don't use an empty /* line

Resolved Warning: networking block comments don't use an empty /* line,
use /* Comment...
Issue found by checkpatch.

Signed-off-by: Rohit Visavalia <rohit.visavalia@softnautics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetoot...
David S. Miller [Thu, 25 Jan 2018 21:32:28 +0000 (16:32 -0500)]
Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next

Johan Hedberg says:

====================
pull request: bluetooth-next 2018-01-25

Here's one last bluetooth-next pull request for the 4.16 kernel:

 - Improved support for Intel controllers
 - New set_parity method to serdev (agreed with maintainers to be taken
   through bluetooth-next)
 - Fix error path in hci_bcm (missing call to serdev close)
 - New ID for BCM4343A0 UART controller

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agocxgb4: fix possible deadlock
Ganesh Goudar [Thu, 25 Jan 2018 07:59:43 +0000 (13:29 +0530)]
cxgb4: fix possible deadlock

t4_wr_mbox_meat_timeout() can be called from both softirq
context and process context, hence protect the mbox with
spin_lock_bh() instead of simple spin_lock()

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/ipv6: Do not allow route add with a device that is down
David Ahern [Thu, 25 Jan 2018 03:45:29 +0000 (19:45 -0800)]
net/ipv6: Do not allow route add with a device that is down

IPv6 allows routes to be installed when the device is not up (admin up).
Worse, it does not mark it as LINKDOWN. IPv4 does not allow it and really
there is no reason for IPv6 to allow it, so check the flags and deny if
device is admin down.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'net-smc-more-socket-closing-improvements'
David S. Miller [Thu, 25 Jan 2018 21:10:43 +0000 (16:10 -0500)]
Merge branch 'net-smc-more-socket-closing-improvements'

Ursula Braun says:

====================
net/smc: more socket closing improvements

these patches improve the smc behavior for abnormal socket closing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/smc: check for healthy link group resp. connections
Ursula Braun [Thu, 25 Jan 2018 10:15:36 +0000 (11:15 +0100)]
net/smc: check for healthy link group resp. connections

If a problem for at least one connection of a link group is detected,
the whole link group and all its connections are terminated.
This patch adds a check for healthy link group when trying to reserve
a work request, and checks for healthy connections before starting
a tx worker.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/smc: wake up wr_reg_wait when terminating a link group
Ursula Braun [Thu, 25 Jan 2018 10:15:35 +0000 (11:15 +0100)]
net/smc: wake up wr_reg_wait when terminating a link group

If a new connection with a new rmb is added to a link group, its
memory region is registered. If a link group is terminated, a pending
registration requires a wake up.

And consolidate setting of tx_flag peer_conn_abort in smc_lgr_terminate().

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/smc: do not reuse a linkgroup with setup problems
Ursula Braun [Thu, 25 Jan 2018 10:15:34 +0000 (11:15 +0100)]
net/smc: do not reuse a linkgroup with setup problems

Once a linkgroup is created successfully, it stays alive for a
certain time to service more connections potentially created.
If one of the initialization steps for a new linkgroup fails,
the linkgroup should not be reused by other connections following.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/smc: terminate link group for ib_post_send problems
Ursula Braun [Thu, 25 Jan 2018 10:15:33 +0000 (11:15 +0100)]
net/smc: terminate link group for ib_post_send problems

If ib_post_send() fails, terminate all connections of this
link group.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/smc: handle state SMC_PEERFINCLOSEWAIT correctly
Ursula Braun [Thu, 25 Jan 2018 10:15:32 +0000 (11:15 +0100)]
net/smc: handle state SMC_PEERFINCLOSEWAIT correctly

A state transition from closing state SMC_PEERFINCLOSEWAIT to closing
state SMC_APPFINCLOSEWAIT is not allowed. Once a closing indication
from the peer has been received, the socket reaches state SMC_CLOSED.

And receiving a peer_conn_abort just changes the state of the socket
into one of the states SMC_PROCESSABORT or SMC_CLOSED;
sending a peer_conn_abort occurs in smc_close_active() for state
SMC_PROCESSABORT only.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/smc: cancel tx worker in case of socket aborts
Ursula Braun [Thu, 25 Jan 2018 10:15:31 +0000 (11:15 +0100)]
net/smc: cancel tx worker in case of socket aborts

If an SMC socket is aborted, the tx worker should be cancelled.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'sfc-support-PTP-on-8000-and-X2000-series-NICs'
David S. Miller [Thu, 25 Jan 2018 21:05:15 +0000 (16:05 -0500)]
Merge branch 'sfc-support-PTP-on-8000-and-X2000-series-NICs'

Edward Cree says:

====================
sfc: support PTP on 8000 and X2000 series NICs

Starting from the 8000-series (Medford 1), SFC NICs can timestamp TX packets
 sent through an ordinary DMA queue, rather than a special control-plane
 operation as in the 7000-series.  Patches 2-8 implement support for this.
The X2000-series (Medford 2) changes the format of timestamps, from seconds+
 (2^27)ths to seconds + quarter nanoseconds, as well as changing the shift
 of the frequency adjustment for increased precision.  Patches 9-12
 implement support for these changes.
Patch #1 is an unrelated fix for NAPI budget handling, needed in order for
 TX completion changes in the later patches to apply cleanly.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosfc: support Medford2 frequency adjustment format
Laurence Evans [Thu, 25 Jan 2018 17:28:04 +0000 (17:28 +0000)]
sfc: support Medford2 frequency adjustment format

Support increased precision frequency adjustment format (FP44) used
 by Medford2 adapters.

Signed-off-by: Laurence Evans <levans@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosfc: support second + quarter ns time format for receive datapath
Edward Cree [Thu, 25 Jan 2018 17:27:40 +0000 (17:27 +0000)]
sfc: support second + quarter ns time format for receive datapath

The time_format that we stash in the PTP data structure is never
 referenced, so we can remove it.  Instead, store the information needed
 to interpret sync event timestamps.
Also rolls in a couple of other related minor PTP fixes.

Based on patches by Bert Kenward <bkenward@solarflare.com> and Laurence
 Evans <levans@solarflare.com>.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosfc: support separate PTP and general timestamping
Laurence Evans [Thu, 25 Jan 2018 17:27:22 +0000 (17:27 +0000)]
sfc: support separate PTP and general timestamping

Support MC_CMD_PTP_OUT_GET_TIMESTAMP_CORRECTIONS_V2.  Extract general
 timestamp corrections in addition to PTP corrections.  Apply receive
 timestamp corrections for general datapath receive timestamping, and
 correspondingly for transmit.

Signed-off-by: Laurence Evans <levans@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosfc: simplify RX datapath timestamping
Laurence Evans [Thu, 25 Jan 2018 17:27:02 +0000 (17:27 +0000)]
sfc: simplify RX datapath timestamping

Use timestamp conversion function with correction to avoid duplicate
 correction handling.

Signed-off-by: Laurence Evans <levans@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosfc: only advertise TX timestamping if we have the license for it
Martin Habets [Thu, 25 Jan 2018 17:26:31 +0000 (17:26 +0000)]
sfc: only advertise TX timestamping if we have the license for it

We check the license for TX hardware timestamping capability.
The PTP probe will have enabled PTP sync events from the adapter.  If
 later, at TX queue init, it turns out we do not have the license, we
 don't need the sync events either.

Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosfc: on 8000 series use TX queues for TX timestamps
Edward Cree [Thu, 25 Jan 2018 17:26:06 +0000 (17:26 +0000)]
sfc: on 8000 series use TX queues for TX timestamps

For this we create and use one or more new TX queues on the PTP channel,
 and enable sync events for it.
Based on a patch by Martin Habets <mhabets@solarflare.com>.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosfc: MAC TX timestamp handling on the 8000 series
Martin Habets [Thu, 25 Jan 2018 17:25:50 +0000 (17:25 +0000)]
sfc: MAC TX timestamp handling on the 8000 series

TX timestamps on 8000 series are supplied from the MAC. This timestamp is
 only 48 bits long. The high order bits from the last time sync event are
 used for the top 16 bits.

Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosfc: only enable TX timestamping if the adapter is licensed for it
Martin Habets [Thu, 25 Jan 2018 17:25:33 +0000 (17:25 +0000)]
sfc: only enable TX timestamping if the adapter is licensed for it

If we try to enable the feature and do not have the license for it, the
 MCPU will refuse and fail our TX queue init.

Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosfc: use main datapath for HW timestamps if available
Martin Habets [Thu, 25 Jan 2018 17:25:15 +0000 (17:25 +0000)]
sfc: use main datapath for HW timestamps if available

We can now transmit SKBs in 2 ways:
1. Via the MC (for the 7XXX series and earlier), using
   efx_ptp_xmit_skb_mc().
2. Via the TX queues on the dedicated PTP channel (8XXX series and later),
   using efx_ptp_xmit_skb_queue().
The PTP worker thread uses the method set up at probe time. It never
 checked the return code from the old efx_ptp_xmit_skb(), so it now
 returns void.
We increment the TX dropped counter of the device if the transmit fails.

As a result of the probe per channel the remove gets called multiple times.
 Clean up efx->ptp_data properly to avoid the 2nd call blowing up.

Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosfc: add function to determine which TX timestamping method to use
Martin Habets [Thu, 25 Jan 2018 17:24:56 +0000 (17:24 +0000)]
sfc: add function to determine which TX timestamping method to use

Use MC capability MC_CMD_GET_CAPABILITIES_V2_OUT_TX_MAC_TIMESTAMPING to
 detect whether the NIC supports timestamping packets sent out the main
 datapath.

Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosfc: handle TX timestamps in the normal data path
Martin Habets [Thu, 25 Jan 2018 17:24:43 +0000 (17:24 +0000)]
sfc: handle TX timestamps in the normal data path

Before this work, TX timestamping is done by sending each SKB to the MC.
On the 8000 series (Medford1) we have high speed timestamping via the
 MAC, which means we can use normal TX queues for this without a
 significant drop in bandwidth.  On the X2000 series (Medford2) support
 for transmitting via the MC is removed, so the new way must be used.

This patch enables timestamping on a TX queue, if requested.
It also enhances TX event handling to process the extra completion events,
 and puts the time in the SKB.

Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosfc: remove tx and MCDI handling from NAPI budget consideration
Bert Kenward [Thu, 25 Jan 2018 17:24:20 +0000 (17:24 +0000)]
sfc: remove tx and MCDI handling from NAPI budget consideration

The NAPI budget is only for RX processing work, not other work such as
 TX or MCDI completion handling.

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: Move net:netns_ids destruction out of rtnl_lock() and document locking scheme
Kirill Tkhai [Fri, 19 Jan 2018 16:14:53 +0000 (19:14 +0300)]
net: Move net:netns_ids destruction out of rtnl_lock() and document locking scheme

Currently, we unhash a dying net from netns_ids lists
under rtnl_lock(). It's a leftover from the time when
net::netns_ids was introduced. There was no net::nsid_lock,
and rtnl_lock() was mostly need to order modification
of alive nets nsid idr, i.e. for:
for_each_net(tmp) {
...
id = __peernet2id(tmp, net);
idr_remove(&tmp->netns_ids, id);
...
}

Since we have net::nsid_lock, the modifications are
protected by this local lock, and now we may introduce
better scheme of netns_ids destruction.

Let's look at the functions peernet2id_alloc() and
get_net_ns_by_id(). Previous commits taught these
functions to work well with dying net acquired from
rtnl unlocked lists. And they are the only functions
which can hash a net to netns_ids or obtain from there.
And as easy to check, other netns_ids operating functions
works with id, not with net pointers. So, we do not
need rtnl_lock to synchronize cleanup_net() with all them.

The another property, which is used in the patch,
is that net is unhashed from net_namespace_list
in the only place and by the only process. So,
we avoid excess rcu_read_lock() or rtnl_lock(),
when we'are iterating over the list in unhash_nsid().

All the above makes possible to keep rtnl_lock() locked
only for net->list deletion, and completely avoid it
for netns_ids unhashing and destruction. As these two
doings may take long time (e.g., memory allocation
to send skb), the patch should positively act on
the scalability and signify decrease the time, which
rtnl_lock() is held in cleanup_net().

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoBluetooth: btintel: Create common function for firmware download
Tedd Ho-Jeong An [Wed, 24 Jan 2018 17:19:21 +0000 (09:19 -0800)]
Bluetooth: btintel: Create common function for firmware download

The firmware download flow for RAM SKU is same for both USB and UART
and this patch creates a common function for both driver.

Signed-off-by: Tedd Ho-Jeong An <tedd.an@linux.intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
7 years agoMerge branch 'rebased-net-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git...
David S. Miller [Thu, 25 Jan 2018 04:48:11 +0000 (23:48 -0500)]
Merge branch 'rebased-net-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Thu, 25 Jan 2018 04:44:15 +0000 (23:44 -0500)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Thu, 25 Jan 2018 01:24:30 +0000 (17:24 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Avoid negative netdev refcount in error flow of xfrm state add, from
    Aviad Yehezkel.

 2) Fix tcpdump decoding of IPSEC decap'd frames by filling in the
    ethernet header protocol field in xfrm{4,6}_mode_tunnel_input().
    From Yossi Kuperman.

 3) Fix a syzbot triggered skb_under_panic in pppoe having to do with
    failing to allocate an appropriate amount of headroom. From
    Guillaume Nault.

 4) Fix memory leak in vmxnet3 driver, from Neil Horman.

 5) Cure out-of-bounds packet memory access in em_nbyte EMATCH module,
    from Wolfgang Bumiller.

 6) Restrict what kinds of sockets can be bound to the KCM multiplexer
    and also disallow when another layer has attached to the socket and
    made use of sk_user_data. From Tom Herbert.

 7) Fix use before init of IOTLB in vhost code, from Jason Wang.

 8) Correct STACR register write bit definition in IBM emac driver, from
    Ivan Mikhaylov.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net/ibm/emac: wrong bit is used for STA control register write
  net/ibm/emac: add 8192 rx/tx fifo size
  vhost: do not try to access device IOTLB when not initialized
  vhost: use mutex_lock_nested() in vhost_dev_lock_vqs()
  i40e: flower: check if TC offload is enabled on a netdev
  qed: Free reserved MR tid
  qed: Remove reserveration of dpi for kernel
  kcm: Check if sk_user_data already set in kcm_attach
  kcm: Only allow TCP sockets to be attached to a KCM mux
  net: sched: fix TCF_LAYER_LINK case in tcf_get_base_ptr
  net: sched: em_nbyte: don't add the data offset twice
  mlxsw: spectrum_router: Don't log an error on missing neighbor
  vmxnet3: repair memory leak
  ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL
  pppoe: take ->needed_headroom of lower device into account on xmit
  xfrm: fix boolean assignment in xfrm_get_type_offload
  xfrm: Fix eth_hdr(skb)->h_proto to reflect inner IP version
  xfrm: fix error flow in case of add state fails
  xfrm: Add SA to hardware at the end of xfrm_state_construct()

7 years agokill kernel_sock_ioctl()
Al Viro [Sat, 1 Jul 2017 22:46:30 +0000 (18:46 -0400)]
kill kernel_sock_ioctl()

no users since 2014

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7 years agodev_ioctl(): move copyin/copyout to callers
Al Viro [Thu, 5 Oct 2017 16:59:44 +0000 (12:59 -0400)]
dev_ioctl(): move copyin/copyout to callers

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7 years agoipconfig: use dev_set_mtu()
Al Viro [Mon, 2 Oct 2017 00:27:01 +0000 (20:27 -0400)]
ipconfig: use dev_set_mtu()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7 years agolift handling of SIOCIW... out of dev_ioctl()
Al Viro [Mon, 2 Oct 2017 00:13:08 +0000 (20:13 -0400)]
lift handling of SIOCIW... out of dev_ioctl()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7 years agokill dev_ifname32()
Al Viro [Mon, 2 Oct 2017 01:12:09 +0000 (21:12 -0400)]
kill dev_ifname32()

same story...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7 years agokill bond_ioctl()
Al Viro [Sat, 30 Sep 2017 23:32:17 +0000 (19:32 -0400)]
kill bond_ioctl()

Same story as with dev_ifsioc(), except that the last cases with non-trivial
conversions had been taken out in 2013...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7 years agokill dev_ifsioc()
Al Viro [Sat, 30 Sep 2017 23:31:15 +0000 (19:31 -0400)]
kill dev_ifsioc()

Once upon a time net/socket.c:dev_ifsioc() used to handle SIOCSHWTSTAMP and
SIOCSIFMAP.  These have different native and compat layout, so the format
conversion had been needed.  In 2009 these two cases had been taken out,
turning the rest into a convoluted way to calling sock_do_ioctl().  We copy
compat structure into native one, call sock_do_ioctl() on that and copy
the result back for the in/out ioctls.  No layout transformation anywhere,
so we might as well just call sock_do_ioctl() and skip all the headache with
copying.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7 years agoip_rt_ioctl(): take copyin to caller
Al Viro [Sat, 1 Jul 2017 12:03:10 +0000 (08:03 -0400)]
ip_rt_ioctl(): take copyin to caller

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7 years agodevinet_ioctl(): take copyin/copyout to caller
Al Viro [Sat, 1 Jul 2017 11:53:12 +0000 (07:53 -0400)]
devinet_ioctl(): take copyin/copyout to caller

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7 years agonet: separate SIOCGIFCONF handling from dev_ioctl()
Al Viro [Mon, 26 Jun 2017 17:19:16 +0000 (13:19 -0400)]
net: separate SIOCGIFCONF handling from dev_ioctl()

Only two of dev_ioctl() callers may pass SIOCGIFCONF to it.
Separating that codepath from the rest of dev_ioctl() allows both
to simplify dev_ioctl() itself (all other cases work with struct ifreq *)
*and* seriously simplify the compat side of that beast: all it takes
is passing to inet_gifconf() an extra argument - the size of individual
records (sizeof(struct ifreq) or sizeof(struct compat_ifreq)).  With
dev_ifconf() called directly from sock_do_ioctl()/compat_dev_ifconf()
that's easy to arrange.

As the result, compat side of SIOCGIFCONF doesn't need any
allocations, copy_in_user() back and forth, etc.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Linus Torvalds [Wed, 24 Jan 2018 23:49:02 +0000 (15:49 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc

Pull sparc bugfix from David Miller:
 "Sparc Makefile typo fix"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc64: fix typo in CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64

7 years agonet/ibm/emac: wrong bit is used for STA control register write
Ivan Mikhaylov [Wed, 24 Jan 2018 12:53:25 +0000 (15:53 +0300)]
net/ibm/emac: wrong bit is used for STA control register write

STA control register has areas of mode and opcodes for opeations. 18 bit is
using for mode selection, where 0 is old MIO/MDIO access method and 1 is
indirect access mode. 19-20 bits are using for setting up read/write
operation(STA opcodes). In current state 'read' is set into old MIO/MDIO mode
with 19 bit and write operation is set into 18 bit which is mode selection,
not a write operation. To correlate write with read we set it into 20 bit.
All those bit operations are MSB 0 based.

Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/ibm/emac: add 8192 rx/tx fifo size
Ivan Mikhaylov [Wed, 24 Jan 2018 12:53:24 +0000 (15:53 +0300)]
net/ibm/emac: add 8192 rx/tx fifo size

emac4syn chips has availability to use 8192 rx/tx fifo buffer sizes,
in current state if we set it up in dts 8192 as example, we will get
only 2048 which may impact on network speed.

Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Wed, 24 Jan 2018 23:02:17 +0000 (18:02 -0500)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2018-01-24

This series contains updates to fm10k only.

Alex fixes MACVLAN offload for fm10k, where we were not seeing unicast
packets being received because we did not correctly configure the
default VLAN ID for the port and defaulting to 0.

Jake cleans up unnecessary parenthesis in a couple of "if" statements.
Fixed the driver to stop adding VLAN 0 into the VLAN table, since it
would cause the VLAN table to be inconsistent between the PF and VF.
Also fixed an issue where we were assuming that VLAN 1 is enabled when
the default VLAN ID is not set, so resolve by not requesting any filters
for the default_vid if it has not yet been assigned.

Ngai fixes an issue which was generating a dmesg regarding unbale to
kill a particular VLAN ID for the device.  This is due to
ndo_vlan_rx_kill_vid() exits with an error and the handler for this ndo
is fm10k_update_vid() which exits prematurely under PF VLAN management.
So to resolve, we must check the VLAN update action type before exiting
fm10k_update_vid(), and act appropriately based on the action type.
Also corrected code comment typos.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agofm10k: clarify action when updating the VLAN table
Ngai-Mint Kwan [Wed, 24 Jan 2018 22:23:27 +0000 (14:23 -0800)]
fm10k: clarify action when updating the VLAN table

Clarify the comment for when entering promiscuous mode that we update
the VLAN table. Add a comment distinguishing the case where we're
exiting promiscuous mode and need to clear the entire VLAN table.

Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@gmail.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agofm10k: correct typo in fm10k_pf.c
Ngai-Mint Kwan [Wed, 24 Jan 2018 22:22:18 +0000 (14:22 -0800)]
fm10k: correct typo in fm10k_pf.c

Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agofm10k: don't assume VLAN 1 is enabled
Jacob Keller [Wed, 24 Jan 2018 22:20:29 +0000 (14:20 -0800)]
fm10k: don't assume VLAN 1 is enabled

Since commit 856dfd69e84f ("fm10k: Fix multicast mode synch issues",
2016-03-03) we've incorrectly assumed that VLAN 1 is enabled when the
default VID is not set.

This occurs because we check the default_vid and if it's zero, start
several loops over the active_vlans bitmask at 1, instead of checking to
ensure that that bit is active.

This happened because of commit d9ff3ee8efe9 ("fm10k: Add support for
VLAN 0 w/o default VLAN", 2014-08-07) which mistakenly assumed that we
should send requests for MAC and VLAN filters with VLAN 0 when the
default_vid isn't set.

However, the switch generally considers this an invalid configuration,
so the only time we'd have a default_vid of 0 is when the switch is
down.

Instead, lets just not request any filters for the default_vid if it's
not yet been assigned.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agofm10k: stop adding VLAN 0 to the VLAN table
Jacob Keller [Wed, 24 Jan 2018 22:19:30 +0000 (14:19 -0800)]
fm10k: stop adding VLAN 0 to the VLAN table

Currently, when the driver loads, it sends a request to add VLAN 0 to the
VLAN table. For the PF, this is honored, and VLAN 0 is indeed set. For
the VF, this request is silently converted into a request for the
default VLAN as defined by either the switch vid or the PF vid.

This results in the odd behavior that the VLAN table doesn't appear
consistent between the PF and the VF.

Furthermore, setting a MAC filter with VLAN 0 is generally considered an
invalid configuration by the switch, and since commit 856dfd69e84f
("fm10k: Fix multicast mode synch issues", 2016-03-03) we've had code
which prevents us from ever sending such a request.

Since there's not really a good reason to keep VLAN 0 in the VLAN table,
stop requesting it in fm10k_restore_rx_state().

This might seem to indicate that we would no longer properly configure
the MAC and VLAN tables for the default vid. However, due to the way
that fm10k_find_next_vlan() behaves, it will always return the
default_vid as enabled.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agofm10k: fix "failed to kill vid" message for VF
Ngai-Mint Kwan [Wed, 24 Jan 2018 22:18:22 +0000 (14:18 -0800)]
fm10k: fix "failed to kill vid" message for VF

When a VF is under PF VLAN assignment:

ip link set <pf> vf <#> vlan <vid>

This will remove all previous entries in the VLAN table including those
generated by VLAN interfaces created on the VF. The issue arises when
the VF is under PF VLAN assignment and one or more of these VLAN
interfaces of the VF are deleted. When deleting these VLAN interfaces,
the following message will be generated in "dmesg":

failed to kill vid 0081/<vid> for device <vf>

This is due to the fact that "ndo_vlan_rx_kill_vid" exits with an error.
The handler for this ndo is "fm10k_update_vid". Any calls to this
function while under PF VLAN management will exit prematurely and, thus,
it will generate the failure message.

Additionally, since "fm10k_update_vid" exits prematurely, none of the
VLAN update is performed. So, even though the actual VLAN interfaces of
the VF will be deleted, the active_vlans bitmask is not cleared. When
the VF is no longer under PF VLAN assignment, the driver mistakenly
restores the previous entries of the VLAN table based on an
unsynchronized list of active VLANs.

The solution to this issue involves checking the VLAN update action type
before exiting "fm10k_update_vid". If the VLAN update action type is to
"add", this action will not be permitted while the VF is under PF VLAN
assignment and the VLAN update is abandoned like before.

However, if the VLAN update action type is to "kill", then we need to
also clear the active_vlans bitmask. However, we don't need to actually
queue any messages to the PF, because the MAC and VLAN tables have
already been cleared, and the PF would silently ignore these requests
anyways.

Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agofm10k: cleanup unnecessary parenthesis in fm10k_iov.c
Jacob Keller [Wed, 24 Jan 2018 22:17:15 +0000 (14:17 -0800)]
fm10k: cleanup unnecessary parenthesis in fm10k_iov.c

This fixes a few warnings found by checkpatch.pl --strict

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agocxgb4: make symbol pedits static
Wei Yongjun [Wed, 24 Jan 2018 02:14:33 +0000 (02:14 +0000)]
cxgb4: make symbol pedits static

Fixes the following sparse warning:

drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c:46:27: warning:
 symbol 'pedits' was not declared. Should it be static?

Fixes: 27ece1f357b7 ("cxgb4: add tc flower support for ETH-DMAC rewrite")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agovhost: do not try to access device IOTLB when not initialized
Jason Wang [Tue, 23 Jan 2018 09:27:26 +0000 (17:27 +0800)]
vhost: do not try to access device IOTLB when not initialized

The code will try to access dev->iotlb when processing
VHOST_IOTLB_INVALIDATE even if it was not initialized which may lead
to NULL pointer dereference. Fixes this by check dev->iotlb before.

Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agovhost: use mutex_lock_nested() in vhost_dev_lock_vqs()
Jason Wang [Tue, 23 Jan 2018 09:27:25 +0000 (17:27 +0800)]
vhost: use mutex_lock_nested() in vhost_dev_lock_vqs()

We used to call mutex_lock() in vhost_dev_lock_vqs() which tries to
hold mutexes of all virtqueues. This may confuse lockdep to report a
possible deadlock because of trying to hold locks belong to same
class. Switch to use mutex_lock_nested() to avoid false positive.

Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API")
Reported-by: syzbot+dbb7c1161485e61b0241@syzkaller.appspotmail.com
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: erspan: fix use-after-free
William Tu [Wed, 24 Jan 2018 01:01:29 +0000 (17:01 -0800)]
net: erspan: fix use-after-free

When building the erspan header for either v1 or v2, the eth_hdr()
does not point to the right inner packet's eth_hdr,
causing kasan report use-after-free and slab-out-of-bouds read.

The patch fixes the following syzkaller issues:
[1] BUG: KASAN: slab-out-of-bounds in erspan_xmit+0x22d4/0x2430 net/ipv4/ip_gre.c:735
[2] BUG: KASAN: slab-out-of-bounds in erspan_build_header+0x3bf/0x3d0 net/ipv4/ip_gre.c:698
[3] BUG: KASAN: use-after-free in erspan_xmit+0x22d4/0x2430 net/ipv4/ip_gre.c:735
[4] BUG: KASAN: use-after-free in erspan_build_header+0x3bf/0x3d0 net/ipv4/ip_gre.c:698

[2] CPU: 0 PID: 3654 Comm: syzkaller377964 Not tainted 4.15.0-rc9+ #185
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:53
 print_address_description+0x73/0x250 mm/kasan/report.c:252
 kasan_report_error mm/kasan/report.c:351 [inline]
 kasan_report+0x25b/0x340 mm/kasan/report.c:409
 __asan_report_load_n_noabort+0xf/0x20 mm/kasan/report.c:440
 erspan_build_header+0x3bf/0x3d0 net/ipv4/ip_gre.c:698
 erspan_xmit+0x3b8/0x13b0 net/ipv4/ip_gre.c:740
 __netdev_start_xmit include/linux/netdevice.h:4042 [inline]
 netdev_start_xmit include/linux/netdevice.h:4051 [inline]
 packet_direct_xmit+0x315/0x6b0 net/packet/af_packet.c:266
 packet_snd net/packet/af_packet.c:2943 [inline]
 packet_sendmsg+0x3aed/0x60b0 net/packet/af_packet.c:2968
 sock_sendmsg_nosec net/socket.c:638 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:648
 SYSC_sendto+0x361/0x5c0 net/socket.c:1729
 SyS_sendto+0x40/0x50 net/socket.c:1697
 do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
 do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
 entry_SYSENTER_compat+0x54/0x63 arch/x86/entry/entry_64_compat.S:129
RIP: 0023:0xf7fcfc79
RSP: 002b:00000000ffc6976c EFLAGS: 00000286 ORIG_RAX: 0000000000000171
RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 0000000020011000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020008000
RBP: 000000000000001c R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

Fixes: f551c91de262 ("net: erspan: introduce erspan v2 for ip_gre")
Fixes: 84e54fe0a5ea ("gre: introduce native tunnel support for ERSPAN")
Reported-by: syzbot+9723f2d288e49b492cf0@syzkaller.appspotmail.com
Reported-by: syzbot+f0ddeb2b032a8e1d9098@syzkaller.appspotmail.com
Reported-by: syzbot+f14b3703cd8d7670203f@syzkaller.appspotmail.com
Reported-by: syzbot+eefa384efad8d7997f20@syzkaller.appspotmail.com
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoi40e: flower: check if TC offload is enabled on a netdev
Jakub Kicinski [Tue, 23 Jan 2018 08:08:40 +0000 (00:08 -0800)]
i40e: flower: check if TC offload is enabled on a netdev

Since TC block changes drivers are required to check if
the TC hw offload flag is set on the interface themselves.

Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower")
Fixes: 44ae12a768b7 ("net: sched: move the can_offload check from binding phase to rule insertion phase")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Amritha Nambiar <amritha.nambiar@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosparc64: fix typo in CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64
Corentin Labbe [Tue, 23 Jan 2018 14:33:14 +0000 (14:33 +0000)]
sparc64: fix typo in CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64

This patch fixes the typo CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64

Fixes: 81658ad0d923 ("sparc64: Add CAMELLIA driver making use of the new camellia opcodes.")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'qed-rdma-bug-fixes'
David S. Miller [Wed, 24 Jan 2018 21:44:21 +0000 (16:44 -0500)]
Merge branch 'qed-rdma-bug-fixes'

Michal Kalderon says:

====================
qed: rdma bug fixes

This patch contains two small bug fixes related to RDMA.
Both related to resource reservations.
====================

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoqed: Free reserved MR tid
Michal Kalderon [Tue, 23 Jan 2018 09:33:47 +0000 (11:33 +0200)]
qed: Free reserved MR tid

A tid was allocated for reserved MR during initialization but
not freed. This lead to an annoying output message during
rdma unload flow.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoqed: Remove reserveration of dpi for kernel
Michal Kalderon [Tue, 23 Jan 2018 09:33:46 +0000 (11:33 +0200)]
qed: Remove reserveration of dpi for kernel

Double reservation for kernel dedicated dpi was performed.
Once in the core module and once in qedr.
Remove the reservation from core.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agofm10k: Fix configuration for macvlan offload
Alexander Duyck [Wed, 24 Jan 2018 21:39:44 +0000 (13:39 -0800)]
fm10k: Fix configuration for macvlan offload

The fm10k driver didn't work correctly when macvlan offload was enabled.
Specifically what would occur is that we would see no unicast packets being
received. This was traced down to us not correctly configuring the default
VLAN ID for the port and defaulting to 0.

To correct this we either use the default ID provided by the switch or
simply use 1. With that we are able to pass and receive traffic without any
issues.

In addition we were not repopulating the filter table following a reset. To
correct that I have added a bit of code to fm10k_restore_rx_state that will
repopulate the Rx filter configuration for the macvlan interfaces.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agonet: qcom/emac: extend DMA mask to 46bits
Wang Dongsheng [Tue, 23 Jan 2018 04:25:06 +0000 (20:25 -0800)]
net: qcom/emac: extend DMA mask to 46bits

Bit TPD3[31] is used as a timestamp bit if PTP is enabled, but
it's used as an address bit if PTP is disabled.  Since PTP isn't
supported by the driver, we can extend the DMA address to 46 bits.

Signed-off-by: Wang Dongsheng <dongsheng.wang@hxt-semitech.com>
Acked-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoip_tunnel: Use mark in skb by default
Thomas Winter [Tue, 23 Jan 2018 03:46:24 +0000 (16:46 +1300)]
ip_tunnel: Use mark in skb by default

This allows marks set by connmark in iptables
to be used for route lookups.

Signed-off-by: Thomas Winter <thomas.winter@alliedtelesis.co.nz>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: stmmac: do not use a bitwise AND operator with a bool operand
Niklas Cassel [Mon, 22 Jan 2018 15:59:50 +0000 (16:59 +0100)]
net: stmmac: do not use a bitwise AND operator with a bool operand

Doing a bitwise AND between a bool and an int is generally not a good idea.
The bool will be promoted to an int with value 0 or 1,
the int is generally regarded as true with a non-zero value,
thus ANDing them has the potential to yield an undesired result.

This commit fixes the following smatch warnings:

drivers/net/ethernet/stmicro/stmmac/enh_desc.c:344 enh_desc_prepare_tx_desc() warn: maybe use && instead of &
drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c:337 dwmac4_rd_prepare_tx_desc() warn: maybe use && instead of &
drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c:380 dwmac4_rd_prepare_tso_tx_desc() warn: maybe use && instead of &

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Wed, 24 Jan 2018 21:09:09 +0000 (16:09 -0500)]
Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2018-01-24

This series contains updates to igb and e1000e only.

Corinna Vinschen implements the ability to set the VF MAC to
00:00:00:00:00:00 via RTM_SETLINK on the PF, to prevent receiving
"invlaid argument" when libvirt attempts to restore the MAC address back
to its original state of 00:00:00:00:00:00.

Zhang Shengju adds a new function igb_get_max_rss_queues() to get the
maxium number of RSS queues and to reduce code duplication.

Matt fixes an issue with e1000e where when setting HTHREST, we should
only be setting bit 8.  Also added a dev_info() message to alert when
C-states have been disabled, to help in debugging.

Jesus adds code comments to clarify the idlescope configuration
constraints.

Lyude Paul fixes a kernel crash on hotplug of igb, by checking to ensure
that we are in the process of dismantling the netdev on hotplug events.

Markus Elfring removes a duplicate message for a memory allocation
failure.

Daniel Hua fixes an issue where transmit timestamps will stop being put
into the socket when link is repeatedly up/down due to TSYNCTXCTL's TXTT
bit (Transmit timestamp valid bit) not clearing in the timeout logic of
ptp_tx_work(), which in turn causes no new timestamp to be loaded to the
TXSTMP register.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'net-sched-propagate-extack-to-cls-offloads-on-destroy-and-only-with...
David S. Miller [Wed, 24 Jan 2018 21:01:11 +0000 (16:01 -0500)]
Merge branch 'net-sched-propagate-extack-to-cls-offloads-on-destroy-and-only-with-skip_sw'

Jakub Kicinski says:

====================
net: sched: propagate extack to cls offloads on destroy and only with skip_sw

This series some of Jiri's comments and the fact that today drivers
may produce extack even if there is no skip_sw flag (meaning the
driver failure is not really a problem), and warning messages will
only confuse the users.

First patch propagates extack to destroy as requested by Jiri, extack
is then propagated to the driver callback for each classifier.  I chose
not to provide the extack on error paths.  As a rule of thumb it seems
best to keep the extack of the condition which caused the error.  E.g.

     err = this_will_fail(arg, extack);
     if (err) {
        undo_things(arg, NULL /* don't pass extack */);
return err;
     }

Note that NL_SET_ERR_MSG() will ignore the message if extack is NULL.
I was pondering whether we should make NL_SET_ERR_MSG() refuse to
overwrite the msg, but there seem to be cases in the tree where extack
is set like this:

     err = this_will_fail(arg, extack);
     if (err) {
        undo_things(arg, NULL /* don't pass extack */);
NL_SET_ERR_MSG(extack, "extack is set after undo call :/");
return err;
     }

I think not passing extack to undo calls is reasonable.

v2:
 - rename the temporary tc_cls_common_offload_init().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: sched: remove tc_cls_common_offload_init_deprecated()
Jakub Kicinski [Wed, 24 Jan 2018 20:54:24 +0000 (12:54 -0800)]
net: sched: remove tc_cls_common_offload_init_deprecated()

All users are now converted to tc_cls_common_offload_init().

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agocls_u32: propagate extack to delete callback
Jakub Kicinski [Wed, 24 Jan 2018 20:54:23 +0000 (12:54 -0800)]
cls_u32: propagate extack to delete callback

Propagate extack on removal of offloaded filter.  Don't pass
extack from error paths.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agocls_u32: pass offload flags to tc_cls_common_offload_init()
Jakub Kicinski [Wed, 24 Jan 2018 20:54:22 +0000 (12:54 -0800)]
cls_u32: pass offload flags to tc_cls_common_offload_init()

Pass offload flags to the new implementation of
tc_cls_common_offload_init().  Extack will now only
be set if user requested skip_sw.  hnodes need to
hold onto the flags now to be able to reuse them
on filter removal.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agocls_flower: propagate extack to delete callback
Jakub Kicinski [Wed, 24 Jan 2018 20:54:21 +0000 (12:54 -0800)]
cls_flower: propagate extack to delete callback

Propagate extack on removal of offloaded filter.  Don't pass
extack from error paths.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agocls_flower: pass offload flags to tc_cls_common_offload_init()
Jakub Kicinski [Wed, 24 Jan 2018 20:54:20 +0000 (12:54 -0800)]
cls_flower: pass offload flags to tc_cls_common_offload_init()

Pass offload flags to the new implementation of
tc_cls_common_offload_init().  Extack will now only
be set if user requested skip_sw.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agocls_matchall: propagate extack to delete callback
Jakub Kicinski [Wed, 24 Jan 2018 20:54:19 +0000 (12:54 -0800)]
cls_matchall: propagate extack to delete callback

Propagate extack on removal of offloaded filter.  Don't pass
extack from error paths.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agocls_matchall: pass offload flags to tc_cls_common_offload_init()
Jakub Kicinski [Wed, 24 Jan 2018 20:54:18 +0000 (12:54 -0800)]
cls_matchall: pass offload flags to tc_cls_common_offload_init()

Pass offload flags to the new implementation of
tc_cls_common_offload_init().  Extack will now only
be set if user requested skip_sw.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agocls_bpf: propagate extack to offload delete callback
Jakub Kicinski [Wed, 24 Jan 2018 20:54:17 +0000 (12:54 -0800)]
cls_bpf: propagate extack to offload delete callback

Propagate extack on removal of offloaded filter.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agocls_bpf: pass offload flags to tc_cls_common_offload_init()
Jakub Kicinski [Wed, 24 Jan 2018 20:54:16 +0000 (12:54 -0800)]
cls_bpf: pass offload flags to tc_cls_common_offload_init()

Pass offload flags to the new implementation of
tc_cls_common_offload_init().  Extack will now only
be set if user requested skip_sw.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agocls_bpf: remove gen_flags from bpf_offload
Jakub Kicinski [Wed, 24 Jan 2018 20:54:15 +0000 (12:54 -0800)]
cls_bpf: remove gen_flags from bpf_offload

cls_bpf now guarantees that only device-bound programs are
allowed with skip_sw.  The drivers no longer pay attention to
flags on filter load, therefore the bpf_offload member can be
removed.  If flags are needed again they should probably be
added to struct tc_cls_common_offload instead.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: sched: prepare for reimplementation of tc_cls_common_offload_init()
Jakub Kicinski [Wed, 24 Jan 2018 20:54:14 +0000 (12:54 -0800)]
net: sched: prepare for reimplementation of tc_cls_common_offload_init()

Rename the tc_cls_common_offload_init() helper function to
tc_cls_common_offload_init_deprecated() and add a new implementation
which also takes flags argument.  We will only set extack if flags
indicate that offload is forced (skip_sw) otherwise driver errors
should be ignored, as they don't influence the overall filter
installation.

Note that we need the tc_skip_hw() helper for new version, therefore
it is added later in the file.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: sched: propagate extack to cls->destroy callbacks
Jakub Kicinski [Wed, 24 Jan 2018 20:54:13 +0000 (12:54 -0800)]
net: sched: propagate extack to cls->destroy callbacks

Propagate extack to cls->destroy callbacks when called from
non-error paths.  On error paths pass NULL to avoid overwriting
the failure message.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'kcm-fix-two-syzcaller-issues'
David S. Miller [Wed, 24 Jan 2018 20:54:31 +0000 (15:54 -0500)]
Merge branch 'kcm-fix-two-syzcaller-issues'

Tom Herbert says:

====================
kcm: fix two syzcaller issues

In this patch set:

- Don't allow attaching non-TCP or listener sockets to a KCM mux.
- In kcm_attach Check if sk_user_data is already set. This is
  under lock to avoid race conditions. More work is need to make
  all of the users of sk_user_data to use the same locking.

- v2
  Remove unncessary check for not PF_KCM in kcm_attach (suggested by
  Guillaume Nault)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agokcm: Check if sk_user_data already set in kcm_attach
Tom Herbert [Wed, 24 Jan 2018 20:35:41 +0000 (12:35 -0800)]
kcm: Check if sk_user_data already set in kcm_attach

This is needed to prevent sk_user_data being overwritten.
The check is done under the callback lock. This should prevent
a socket from being attached twice to a KCM mux. It also prevents
a socket from being attached for other use cases of sk_user_data
as long as the other cases set sk_user_data under the lock.
Followup work is needed to unify all the use cases of sk_user_data
to use the same locking.

Reported-by: syzbot+114b15f2be420a8886c3@syzkaller.appspotmail.com
Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Signed-off-by: Tom Herbert <tom@quantonium.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agokcm: Only allow TCP sockets to be attached to a KCM mux
Tom Herbert [Wed, 24 Jan 2018 20:35:40 +0000 (12:35 -0800)]
kcm: Only allow TCP sockets to be attached to a KCM mux

TCP sockets for IPv4 and IPv6 that are not listeners or in closed
stated are allowed to be attached to a KCM mux.

Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Reported-by: syzbot+8865eaff7f9acd593945@syzkaller.appspotmail.com
Signed-off-by: Tom Herbert <tom@quantonium.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMAINTAINERS: update email address for James Morris
James Morris [Wed, 24 Jan 2018 20:53:57 +0000 (07:53 +1100)]
MAINTAINERS: update email address for James Morris

Update my email address.

Signed-off-by: James Morris <jmorris@namei.org>
7 years agoigb: Clear TXSTMP when ptp_tx_work() is timeout
Daniel Hua [Tue, 2 Jan 2018 00:33:18 +0000 (08:33 +0800)]
igb: Clear TXSTMP when ptp_tx_work() is timeout

Problem description:
After ethernet cable connect and disconnect for several iterations on a
device with i210, tx timestamp will stop being put into the socket.

Steps to reproduce:
1. Setup a device with i210 and wire it to a 802.1AS capable switch (
Extreme Networks Summit x440 is used in our case)
2. Have the gptp daemon running on the device and make sure it is synced
with the switch
3. Have the switch disable and enable the port, wait for the device gets
resynced with the switch
4. Iterates step 3 until the device is not albe to get resynced
5. Review the log in dmesg and you will see warning message "igb : clearing
Tx timestamp hang"

Root cause:
If ptp_tx_work() gets scheduled just before the port gets disabled, a LINK
DOWN event will be processed before ptp_tx_work(), which may cause timeout
in ptp_tx_work(). In the timeout logic, the TSYNCTXCTL's TXTT bit (Transmit
timestamp valid bit) is not cleared, causing no new timestamp loaded to
TXSTMP register. Consequently therefore, no new interrupt is triggerred by
TSICR.TXTS bit and no more Tx timestamp send to the socket.

Signed-off-by: Daniel Hua <daniel.hua@ni.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoigb: Delete an error message for a failed memory allocation in igb_enable_sriov()
Markus Elfring [Mon, 1 Jan 2018 19:56:55 +0000 (20:56 +0100)]
igb: Delete an error message for a failed memory allocation in igb_enable_sriov()

Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoigb: Free IRQs when device is hotplugged
Lyude Paul [Tue, 12 Dec 2017 19:31:30 +0000 (14:31 -0500)]
igb: Free IRQs when device is hotplugged

Recently I got a Caldigit TS3 Thunderbolt 3 dock, and noticed that upon
hotplugging my kernel would immediately crash due to igb:

[  680.825801] kernel BUG at drivers/pci/msi.c:352!
[  680.828388] invalid opcode: 0000 [#1] SMP
[  680.829194] Modules linked in: igb(O) thunderbolt i2c_algo_bit joydev vfat fat btusb btrtl btbcm btintel bluetooth ecdh_generic hp_wmi sparse_keymap rfkill wmi_bmof iTCO_wdt intel_rapl x86_pkg_temp_thermal coretemp crc32_pclmul snd_pcm rtsx_pci_ms mei_me snd_timer memstick snd pcspkr mei soundcore i2c_i801 tpm_tis psmouse shpchp wmi tpm_tis_core tpm video hp_wireless acpi_pad rtsx_pci_sdmmc mmc_core crc32c_intel serio_raw rtsx_pci mfd_core xhci_pci xhci_hcd i2c_hid i2c_core [last unloaded: igb]
[  680.831085] CPU: 1 PID: 78 Comm: kworker/u16:1 Tainted: G           O     4.15.0-rc3Lyude-Test+ #6
[  680.831596] Hardware name: HP HP ZBook Studio G4/826B, BIOS P71 Ver. 01.03 06/09/2017
[  680.832168] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[  680.832687] RIP: 0010:free_msi_irqs+0x180/0x1b0
[  680.833271] RSP: 0018:ffffc9000030fbf0 EFLAGS: 00010286
[  680.833761] RAX: ffff8803405f9c00 RBX: ffff88033e3d2e40 RCX: 000000000000002c
[  680.834278] RDX: 0000000000000000 RSI: 00000000000000ac RDI: ffff880340be2178
[  680.834832] RBP: 0000000000000000 R08: ffff880340be1ff0 R09: ffff8803405f9c00
[  680.835342] R10: 0000000000000000 R11: 0000000000000040 R12: ffff88033d63a298
[  680.835822] R13: ffff88033d63a000 R14: 0000000000000060 R15: ffff880341959000
[  680.836332] FS:  0000000000000000(0000) GS:ffff88034f440000(0000) knlGS:0000000000000000
[  680.836817] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  680.837360] CR2: 000055e64044afdf CR3: 0000000001c09002 CR4: 00000000003606e0
[  680.837954] Call Trace:
[  680.838853]  pci_disable_msix+0xce/0xf0
[  680.839616]  igb_reset_interrupt_capability+0x5d/0x60 [igb]
[  680.840278]  igb_remove+0x9d/0x110 [igb]
[  680.840764]  pci_device_remove+0x36/0xb0
[  680.841279]  device_release_driver_internal+0x157/0x220
[  680.841739]  pci_stop_bus_device+0x7d/0xa0
[  680.842255]  pci_stop_bus_device+0x2b/0xa0
[  680.842722]  pci_stop_bus_device+0x3d/0xa0
[  680.843189]  pci_stop_and_remove_bus_device+0xe/0x20
[  680.843627]  trim_stale_devices+0xf3/0x140
[  680.844086]  trim_stale_devices+0x94/0x140
[  680.844532]  trim_stale_devices+0xa6/0x140
[  680.845031]  ? get_slot_status+0x90/0xc0
[  680.845536]  acpiphp_check_bridge.part.5+0xfe/0x140
[  680.846021]  acpiphp_hotplug_notify+0x175/0x200
[  680.846581]  ? free_bridge+0x100/0x100
[  680.847113]  acpi_device_hotplug+0x8a/0x490
[  680.847535]  acpi_hotplug_work_fn+0x1a/0x30
[  680.848076]  process_one_work+0x182/0x3a0
[  680.848543]  worker_thread+0x2e/0x380
[  680.848963]  ? process_one_work+0x3a0/0x3a0
[  680.849373]  kthread+0x111/0x130
[  680.849776]  ? kthread_create_worker_on_cpu+0x50/0x50
[  680.850188]  ret_from_fork+0x1f/0x30
[  680.850601] Code: 43 14 85 c0 0f 84 d5 fe ff ff 31 ed eb 0f 83 c5 01 39 6b 14 0f 86 c5 fe ff ff 8b 7b 10 01 ef e8 b7 e4 d2 ff 48 83 78 70 00 74 e3 <0f> 0b 49 8d b5 a0 00 00 00 e8 62 6f d3 ff e9 c7 fe ff ff 48 8b
[  680.851497] RIP: free_msi_irqs+0x180/0x1b0 RSP: ffffc9000030fbf0

As it turns out, normally the freeing of IRQs that would fix this is called
inside of the scope of __igb_close(). However, since the device is
already gone by the point we try to unregister the netdevice from the
driver due to a hotplug we end up seeing that the netif isn't present
and thus, forget to free any of the device IRQs.

So: make sure that if we're in the process of dismantling the netdev, we
always allow __igb_close() to be called so that IRQs may be freed
normally. Additionally, only allow igb_close() to be called from
__igb_close() if it hasn't already been called for the given adapter.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 9474933caf21 ("igb: close/suspend race in netif_device_detach")
Cc: Todd Fujinaka <todd.fujinaka@intel.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: stable@vger.kernel.org
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoe1000e: Alert the user that C-states will be disabled by enabling jumbo frames
Matt Turner [Tue, 14 Nov 2017 23:51:33 +0000 (15:51 -0800)]
e1000e: Alert the user that C-states will be disabled by enabling jumbo frames

I personally spent a long time trying to decypher why my CPU would not
reach deeper C-states. Let's just tell the next user what's going on.

Signed-off-by: Matt Turner <matt.turner@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoigb: Clarify idleslope config constraints
Jesus Sanchez-Palencia [Fri, 10 Nov 2017 22:21:50 +0000 (14:21 -0800)]
igb: Clarify idleslope config constraints

By design, the idleslope increments are restricted to 16.384kbps steps.
Add a comment to igb_main.c making that explicit and add one example
that illustrates the impact of that.

Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoe1000e: Set HTHRESH when PTHRESH is used
Matt Turner [Tue, 7 Nov 2017 22:13:30 +0000 (14:13 -0800)]
e1000e: Set HTHRESH when PTHRESH is used

According to section 12.0.3.4.13 "Receive Descriptor Control - RXDCTL"
of the IntelĀ® 82579 Gigabit Ethernet PHY Datasheet v2.1:

    "HTHRESH should be given a non zero value when ever PTHRESH is
     used."

In RXDCTL(0), PTHRESH lives at bits 5:0, and HTHREST lives at bits 13:8.
Set only bit 8 of HTHREST as is done in e1000_flush_rx_ring(). Found by
inspection.

Signed-off-by: Matt Turner <matt.turner@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoigb: add function to get maximum RSS queues
Zhang Shengju [Tue, 19 Sep 2017 13:40:54 +0000 (21:40 +0800)]
igb: add function to get maximum RSS queues

This patch adds a new function igb_get_max_rss_queues() to get maximum
RSS queues, this will reduce duplicate code and facilitate future
maintenance.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoigb: Allow to remove administratively set MAC on VFs
Corinna Vinschen [Mon, 10 Apr 2017 08:58:14 +0000 (10:58 +0200)]
igb: Allow to remove administratively set MAC on VFs

Before libvirt modifies the MAC address and vlan tag for an SRIOV VF
for use by a virtual machine (either using vfio device assignment or
macvtap passthru mode), it saves the current MAC address and vlan tag
so that it can reset them to their original value when the guest is
done.  Libvirt can't leave the VF MAC set to the value used by the
now-defunct guest since it may be started again later using a
different VF, but it certainly shouldn't just pick any random value,
either. So it saves the state of everything prior to using the VF, and
resets it to that.

The igb driver initializes the MAC addresses of all VFs to
00:00:00:00:00:00, and reports that when asked (via an RTM_GETLINK
netlink message, also visible in the list of VFs in the output of "ip
link show"). But when libvirt attempts to restore the MAC address back
to 00:00:00:00:00:00 (using an RTM_SETLINK netlink message) the kernel
responds with "Invalid argument".

Forbidding a reset back to the original value leaves the VF MAC at the
value set for the now-defunct virtual machine. Especially on a system
with NetworkManager enabled, this has very bad consequences, since
NetworkManager forces all interfacess to be IFF_UP all the time - if
the same virtual machine is restarted using a different VF (or even on
a different host), there will be multiple interfaces watching for
traffic with the same MAC address.

To allow libvirt to revert to the original state, we need a way to
remove the administrative set MAC on a VF, to allow normal host
operation again, and to reset/overwrite the VF MAC via VF netdev.

This patch implements the outlined scenario by allowing to set the
VF MAC to 00:00:00:00:00:00 via RTM_SETLINK on the PF.
igb_ndo_set_vf_mac resets the IGB_VF_FLAG_PF_SET_MAC flag to 0,
so it's possible to reset the VF MAC back to the original value via
the VF netdev.

Note: Recent patches to libvirt allow for a workaround if the NIC
isn't capable of resetting the administrative MAC back to all 0, but
in theory the NIC should allow resetting the MAC in the first place.

Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Tested-by: Aaron Brown <arron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoMerge branch 'pktgen-Behavior-flags-fixes'
David S. Miller [Wed, 24 Jan 2018 20:03:37 +0000 (15:03 -0500)]
Merge branch 'pktgen-Behavior-flags-fixes'

Dmitry Safonov says:

====================
pktgen: Behavior flags fixes

v2:
  o fixed a nitpick from David Miller

There are a bunch of fixes/cleanups/Documentations.
Diffstat says for itself, regardless added docs and missed flag
parameters.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>