]> www.infradead.org Git - users/hch/dma-mapping.git/log
users/hch/dma-mapping.git
9 years agol2tp: Correctly return -EBADF from pppol2tp_getname.
phil.turnbull@oracle.com [Tue, 26 Jul 2016 19:14:35 +0000 (15:14 -0400)]
l2tp: Correctly return -EBADF from pppol2tp_getname.

If 'tunnel' is NULL we should return -EBADF but the 'end_put_sess' path
unconditionally sets 'error' back to zero. Rework the error path so it
more closely matches pppol2tp_sendmsg.

Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx5_core/health: Remove deprecated create_singlethread_workqueue
Bhaktipriya Shridhar [Tue, 26 Jul 2016 17:08:24 +0000 (22:38 +0530)]
net/mlx5_core/health: Remove deprecated create_singlethread_workqueue

The workqueue health->wq was used as per device private health thread.
This was done to perform delayed work.

The workqueue has a single workitem(&health->work) and
hence doesn't require ordering. It is involved in handling the health of
the device and is not being used on a memory reclaim path.
Hence, the singlethreaded workqueue has been replaced with the use of
system_wq.

Work item has been flushed in mlx5_health_cleanup() to ensure that
there are no pending tasks while disconnecting the driver.

Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: ipmr/ip6mr: update lastuse on entry change
Nikolay Aleksandrov [Tue, 26 Jul 2016 16:54:52 +0000 (18:54 +0200)]
net: ipmr/ip6mr: update lastuse on entry change

Currently lastuse is updated on entry creation and cache hit, but it should
also be updated on entry change. Since both on add and update the ttl array
is updated we can simply update the lastuse in ipmr_update_thresholds.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
CC: Donald Sharp <sharpd@cumulusnetworks.com>
CC: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomacsec: ensure rx_sa is set when validation is disabled
Beniamino Galvani [Tue, 26 Jul 2016 10:24:53 +0000 (12:24 +0200)]
macsec: ensure rx_sa is set when validation is disabled

macsec_decrypt() is not called when validation is disabled and so
macsec_skb_cb(skb)->rx_sa is not set; but it is used later in
macsec_post_decrypt(), ensure that it's always initialized.

Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver")
Signed-off-by: Beniamino Galvani <bgalvani@redhat.com>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'tipc-netlink-monitor-updates'
David S. Miller [Tue, 26 Jul 2016 21:26:43 +0000 (14:26 -0700)]
Merge branch 'tipc-netlink-monitor-updates'

Parthasarathy Bhuvaragan says:

====================
tipc: netlink updates for neighbour monitor

This series contains the updates to configure and read the attributes for
neighbour monitor.

v2: rebase on top of net-next
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotipc: dump monitor attributes
Parthasarathy Bhuvaragan [Tue, 26 Jul 2016 06:47:22 +0000 (08:47 +0200)]
tipc: dump monitor attributes

In this commit, we dump the monitor attributes when queried.
The link monitor attributes are separated into two kinds:
1. general attributes per bearer
2. specific attributes per node/peer
This style resembles the socket attributes and the nametable
publications per socket.

Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotipc: add a function to get the bearer name
Parthasarathy Bhuvaragan [Tue, 26 Jul 2016 06:47:21 +0000 (08:47 +0200)]
tipc: add a function to get the bearer name

Introduce a new function to get the bearer name from
its id. This is used in subsequent commit.

Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotipc: get monitor threshold for the cluster
Parthasarathy Bhuvaragan [Tue, 26 Jul 2016 06:47:20 +0000 (08:47 +0200)]
tipc: get monitor threshold for the cluster

In this commit, we add support to fetch the configured
cluster monitoring threshold.

Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotipc: make cluster size threshold for monitoring configurable
Parthasarathy Bhuvaragan [Tue, 26 Jul 2016 06:47:19 +0000 (08:47 +0200)]
tipc: make cluster size threshold for monitoring configurable

In this commit, we introduce support to configure the minimum
threshold to activate the new link monitoring algorithm.

Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotipc: introduce constants for tipc address validation
Parthasarathy Bhuvaragan [Tue, 26 Jul 2016 06:47:18 +0000 (08:47 +0200)]
tipc: introduce constants for tipc address validation

In this commit, we introduce defines for tipc address size,
offset and mask specification for Zone.Cluster.Node.
There is no functional change in this commit.

Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: neigh: disallow transition to NUD_STALE if lladdr is unchanged in neigh_update()
He Chunhui [Tue, 26 Jul 2016 06:16:52 +0000 (06:16 +0000)]
net: neigh: disallow transition to NUD_STALE if lladdr is unchanged in neigh_update()

NUD_STALE is used when the caller(e.g. arp_process()) can't guarantee
neighbour reachability. If the entry was NUD_VALID and lladdr is unchanged,
the entry state should not be changed.

Currently the code puts an extra "NUD_CONNECTED" condition. So if old state
was NUD_DELAY or NUD_PROBE (they are NUD_VALID but not NUD_CONNECTED), the
state can be changed to NUD_STALE.

This may cause problem. Because NUD_STALE lladdr doesn't guarantee
reachability, when we send traffic, the state will be changed to
NUD_DELAY. In normal case, if we get no confirmation (by dst_confirm()),
we will change the state to NUD_PROBE and send probe traffic. But now the
state may be reset to NUD_STALE again(e.g. by broadcast ARP packets),
so the probe traffic will not be sent. This situation may happen again and
again, and packets will be sent to an non-reachable lladdr forever.

The fix is to remove the "NUD_CONNECTED" condition. After that the
"NEIGH_UPDATE_F_WEAK_OVERRIDE" condition (used by IPv6) in that branch will
be redundant, so remove it.

This change may increase probe traffic, but it's essential since NUD_STALE
lladdr is unreliable. To ensure correctness, we prefer to resolve lladdr,
when we can't get confirmation, even while remote packets try to set
NUD_STALE state.

Signed-off-by: Chunhui He <hchunhui@mail.ustc.edu.cn>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'xgene-fix-mod-crash-and-1g-hotplug'
David S. Miller [Tue, 26 Jul 2016 04:51:44 +0000 (21:51 -0700)]
Merge branch 'xgene-fix-mod-crash-and-1g-hotplug'

Iyappan Subramanian says:

====================
drivers: net: xgene: Fix module crash and 1G hot-plug

This patchset addresses the following issues,

1. Fixes the kernel crash when the driver loaded as an kernel module
- by fixing hardware cleanups and rearrange kernel API calls

2. Hot-plug issue on the SGMII 1G interface
- by adding a driver for MDIO management

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Tested-by: Toan Le <toanle@apm.com>
---
v7: Address review comments from v6
- fixed kbuild warnings
- unmapped DMA memory on xgene_enet_delete_bufpool()
- delete descriptor rings and buffer pools on cle_init() failure
- fixed error deconstruction path on probe

v6: Address review comments from v5
- changed to use devm_ioremap_resource
- changed to return PTR_ERR(clk) on failure
- cleaned up and removed indirections
- exported mdio read/write and phy_register functions
- changed mii_bus is to indicate interface instance
- changed to call the exported mdio read/write and phy_register functions

v5: Address review comments from v4
- Fixed clock reset sequence by adding delay
- Fixed clock count by adding clk_unprepare_disable() in port shutdown

v4: Address review comments from v3
- Reorganized into smaller patches
- Added wrapper functions for sgmii_control_reset and sgmii_tbi_control_reset
- Removed clk_get warning info
- mdio: Changed the order of 'if' statements and removed the 'else' statement
- mdio: Removed the mdio_read(write) indirection wrapper functions
- ethtool: Fixed SGMII 1G get_settings and set_settings
- Documentation: dtb: Added MDIO node information
- MAINTAINERS: Added MDIO driver and documentation path

v3: Address review comments from v2
- Add comment about hardware clock reset sequence on xgene_mdio_reset

v2: Address review comments from v1
- Fixed patch 1 compilation error
- Fixed mdio@1f610000 xge0clk reference
- Squashed dtb patches
- Added PORT_OFFSET macro

v1:
- Initial version
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMAINTAINERS: xgene: Add driver and documentation path
Iyappan Subramanian [Tue, 26 Jul 2016 00:12:47 +0000 (17:12 -0700)]
MAINTAINERS: xgene: Add driver and documentation path

Added path to the MDIO driver and Documentation file.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoDocumentation: dtb: xgene: Add MDIO node
Iyappan Subramanian [Tue, 26 Jul 2016 00:12:46 +0000 (17:12 -0700)]
Documentation: dtb: xgene: Add MDIO node

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Tested-by: Toan Le <toanle@apm.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agodtb: xgene: Add MDIO node
Iyappan Subramanian [Tue, 26 Jul 2016 00:12:45 +0000 (17:12 -0700)]
dtb: xgene: Add MDIO node

Added mdio node for mdio driver.  Also added phy-handle
reference to the ethernet nodes.

Removed unused clock node from storm sgenet1.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Tested-by: Toan Le <toanle@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agodrivers: net: xgene: ethtool: Use phy_ethtool_gset and sset
Iyappan Subramanian [Tue, 26 Jul 2016 00:12:44 +0000 (17:12 -0700)]
drivers: net: xgene: ethtool: Use phy_ethtool_gset and sset

Changed SGMII 1G get_settings to use phy_ethtool_gset.
Changed SGMII 1G set_settings to use phy_ethtool_sset.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Tested-by: Toan Le <toanle@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agodrivers: net: xgene: Use exported functions
Iyappan Subramanian [Tue, 26 Jul 2016 00:12:43 +0000 (17:12 -0700)]
drivers: net: xgene: Use exported functions

This patch reuses the mdio read/write and phy_register functions
and removed the local definitions.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Tested-by: Toan Le <toanle@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agodrivers: net: xgene: Enable MDIO driver
Iyappan Subramanian [Tue, 26 Jul 2016 00:12:42 +0000 (17:12 -0700)]
drivers: net: xgene: Enable MDIO driver

This patch enables MDIO driver by,

- Selecting MDIO_XGENE
- Changed open and close to use phy_start and phy_stop
- Changed to use mac_ops->tx(rx)_enable and tx(rx)_disable

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Tested-by: Toan Le <toanle@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agodrivers: net: xgene: Add backward compatibility
Iyappan Subramanian [Tue, 26 Jul 2016 00:12:41 +0000 (17:12 -0700)]
drivers: net: xgene: Add backward compatibility

This patch adds xgene_enet_check_phy_hanlde() function that checks whether
MDIO driver is probed successfully and sets pdata->mdio_driver to true.
If MDIO driver is not probed, ethernet driver falls back to backward
compatibility mode.

Since enum xgene_enet_cmd is used by MDIO driver, removing this from
ethernet driver.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Tested-by: Toan Le <toanle@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agodrivers: net: phy: xgene: Add MDIO driver
Iyappan Subramanian [Tue, 26 Jul 2016 00:12:40 +0000 (17:12 -0700)]
drivers: net: phy: xgene: Add MDIO driver

Currently, SGMII based 1G rely on the hardware registers for link state
and sometimes it's not reliable.  To get most accurate link state, this
interface has to use the MDIO bus to poll the PHY.

In X-Gene SoC, MDIO bus is shared across RGMII and SGMII based 1G
interfaces, so adding this driver to manage MDIO bus.  This driver
registers the mdio bus and registers the PHYs connected to it.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Tested-by: Toan Le <toanle@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agodrivers: net: xgene: Fix module unload crash - clkrst sequence
Iyappan Subramanian [Tue, 26 Jul 2016 00:12:39 +0000 (17:12 -0700)]
drivers: net: xgene: Fix module unload crash - clkrst sequence

This patch fixes clock reset sequence.

- Added clock reset sequence for ACPI
- Added delay in clock reset sequence to make sure pulse is generated
- Added clk_unprepare_disable() in port shutdown to make sure
  clock increment/decrement counts are matching
- Removed MII_MGMT_CONFIG programming, since it is not required
- Fixed programming XGENET_CONFIG_REG to enable SGMII mode

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Tested-by: Toan Le <toanle@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agodrivers: net: xgene: Fix module unload crash - change sw sequence
Iyappan Subramanian [Tue, 26 Jul 2016 00:12:38 +0000 (17:12 -0700)]
drivers: net: xgene: Fix module unload crash - change sw sequence

When the driver is configured as kernel module and when it gets
unloaded and reloaded, kernel crash was observed.  This patch
addresses the software cleanup by doing the following,

- Moved register_netdev call after hardware is ready
- Since ndev is not ready, added set_irq_name to set irq name
- Since ndev is not ready, changed mdio_bus->parent to pdev->dev
- Replaced netif_start(stop)_queue by netif_tx_start(stop)_queues
- Removed napi_del call since it's called by free_netdev
- Added dev_close call, within remove
- Added shutdown callback
- Changed to use dmam_ APIs

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Tested-by: Toan Le <toanle@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agodrivers: net: xgene: Fix module unload crash - hw resource cleanup
Iyappan Subramanian [Tue, 26 Jul 2016 00:12:37 +0000 (17:12 -0700)]
drivers: net: xgene: Fix module unload crash - hw resource cleanup

When the driver is configured as kernel module and when it gets
unloaded and reloaded, kernel crash was observed.  This patch
address the hardware resource cleanups by doing the following,

- Added mac_ops->clear() to do prefetch buffer clean up
- Fixed delete freepool buffers logic
- Reordered mac_enable and mac_disable
- Added Tx completion ring free
- Moved down delete_desc_rings after ring cleanup

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Tested-by: Toan Le <toanle@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agodrivers: net: xgene: Separate set_speed from mac_init
Iyappan Subramanian [Tue, 26 Jul 2016 00:12:36 +0000 (17:12 -0700)]
drivers: net: xgene: Separate set_speed from mac_init

Since mac_init is too heavy to be called when the link changes,
moved the speed_set configuration to a new function and added
mac_ops->set_speed function pointer.  This function will be
called from adjust_link callback.

Added cases for 10/100 support for SGMII based 1G interface.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Tested-by: Toan Le <toanle@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'refactor-tc_action-structs'
David S. Miller [Tue, 26 Jul 2016 04:49:20 +0000 (21:49 -0700)]
Merge branch 'refactor-tc_action-structs'

Cong Wang says:

====================
net_sched: refactor tc action structures

These two patches factor out the struct tcf_common.

v2: fix a compile warning
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet_sched: get rid of struct tcf_common
WANG Cong [Mon, 25 Jul 2016 23:09:42 +0000 (16:09 -0700)]
net_sched: get rid of struct tcf_common

After the previous patch, struct tc_action should be enough
to represent the generic tc action, tcf_common is not necessary
any more. This patch gets rid of it to make tc action code
more readable.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet_sched: move tc_action into tcf_common
WANG Cong [Mon, 25 Jul 2016 23:09:41 +0000 (16:09 -0700)]
net_sched: move tc_action into tcf_common

struct tc_action is confusing, currently we use it for two purposes:
1) Pass in arguments and carry out results from helper functions
2) A generic representation for tc actions

The first one is error-prone, since we need to make sure we don't
miss anything. This patch aims to get rid of this use, by moving
tc_action into tcf_common, so that they are allocated together
in hashtable and can be cast'ed easily.

And together with the following patch, we could really make
tc_action a generic representation for all tc actions and each
type of action can inherit from it.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoipvlan: Scrub skb before crossing the namespace boundry
Mahesh Bandewar [Mon, 25 Jul 2016 21:38:16 +0000 (14:38 -0700)]
ipvlan: Scrub skb before crossing the namespace boundry

The earlier patch c3aaa06d5a63 (ipvlan: scrub skb before routing
in L3 mode.) did this but only for TX path in L3 mode. This
patch extends it for both the modes for TX/RX path.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'bnxt_en-improve-ntuple-and-new-IDs'
David S. Miller [Tue, 26 Jul 2016 04:43:31 +0000 (21:43 -0700)]
Merge branch 'bnxt_en-improve-ntuple-and-new-IDs'

Michael Chan says:

====================
bnxt_en: Improve ntuple filters and add new IDs.

Improve ntuple filters and add some new PCI device IDs.  Please review
for net-next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobnxt_en: Add new NPAR and dual media device IDs.
Michael Chan [Mon, 25 Jul 2016 16:33:37 +0000 (12:33 -0400)]
bnxt_en: Add new NPAR and dual media device IDs.

Add 5741X/5731X NPAR device IDs and dual media SFP/10GBase-T device IDs.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobnxt_en: Log a message, if enabling NTUPLE filtering fails.
Vasundhara Volam [Mon, 25 Jul 2016 16:33:36 +0000 (12:33 -0400)]
bnxt_en: Log a message, if enabling NTUPLE filtering fails.

If there are not enough resources to enable ntuple filtering,
log a warning message.

v2: Use single message and add missing newline.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobnxt_en: Improve ntuple filters by checking destination MAC address.
Michael Chan [Mon, 25 Jul 2016 16:33:35 +0000 (12:33 -0400)]
bnxt_en: Improve ntuple filters by checking destination MAC address.

Include the destination MAC address in the ntuple filter structure.  The
current code assumes that the destination MAC address is always the MAC
address of the NIC.  This may not be true if there are macvlans, for
example.  Add destination MAC address checking and configure the filter
correctly using the correct index for the destination MAC address.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoqed: Fix setting/clearing bit in completion bitmap
Manish Chopra [Mon, 25 Jul 2016 16:07:46 +0000 (19:07 +0300)]
qed: Fix setting/clearing bit in completion bitmap

Slowpath completion handling is incorrectly changing
SPQ_RING_SIZE bits instead of a single one.

Fixes: 76a9a3642a0b ("qed: fix handling of concurrent ramrods")
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoudp: use sk_filter_trim_cap for udp{,6}_queue_rcv_skb
Daniel Borkmann [Mon, 25 Jul 2016 16:06:12 +0000 (18:06 +0200)]
udp: use sk_filter_trim_cap for udp{,6}_queue_rcv_skb

After a612769774a3 ("udp: prevent bugcheck if filter truncates packet
too much"), there followed various other fixes for similar cases such
as f4979fcea7fd ("rose: limit sk_filter trim to payload").

Latter introduced a new helper sk_filter_trim_cap(), where we can pass
the trim limit directly to the socket filter handling. Make use of it
here as well with sizeof(struct udphdr) as lower cap limit and drop the
extra skb->len test in UDP's input path.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agocaif-hsi: Remove deprecated create_singlethread_workqueue
Bhaktipriya Shridhar [Mon, 25 Jul 2016 13:10:57 +0000 (18:40 +0530)]
caif-hsi: Remove deprecated create_singlethread_workqueue

alloc_workqueue replaces deprecated create_singlethread_workqueue().

A dedicated workqueue has been used since the workitems are being used
on a packet tx/rx path. Hence, WQ_MEM_RECLAIM has been set to guarantee
forward progress under memory pressure.

An ordered workqueue has been used since workitems &cfhsi->wake_up_work
and &cfhsi->wake_down_work cannot be run concurrently.

Calls to flush_workqueue() before destroy_workqueue() have been dropped
since destroy_workqueue() itself calls drain_workqueue() which flushes
repeatedly till the workqueue becomes empty.

Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'bpf-probe-write-user'
David S. Miller [Tue, 26 Jul 2016 01:07:48 +0000 (18:07 -0700)]
Merge branch 'bpf-probe-write-user'

Sargun Dhillon says:

====================
bpf: add bpf_probe_write_user helper & example

This patch series contains two patches that add support for a probe_write
helper to BPF programs. This allows them to manipulate user memory during
the course of tracing. The second patch in the series has an example that
uses it, in one the intended ways to divert execution.

Thanks to Alexei Starovoitov, and Daniel Borkmann for being patient, review, and
helping me get familiar with the code base. I've made changes based on their
recommendations.

This helper should be considered for experimental usage and debugging, so we
print a warning to dmesg when it is along with the command and pid when someone
tries to install a proglet that uses it. A follow-up patchset will contain a
mechanism to verify the safety of the probe beyond what was done by hand.
----
v1->v2: restrict writing to user space, as opposed to globally v2->v3: Fixed
        formatting issues v3->v4: Rename copy_to_user -> bpf_probe_write
        Simplify checking of whether or not it's safe to write
        Add warnings to dmesg
v4->v5: Raise warning level
        Cleanup location of warning code
        Make test fail when helper is broken
v5->v6: General formatting cleanup
        Rename bpf_probe_write -> bpf_probe_write_user
v6->v7: More formatting cleanup.
        Clarifying a few comments
Clarified log message
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agosamples/bpf: Add test/example of using bpf_probe_write_user bpf helper
Sargun Dhillon [Mon, 25 Jul 2016 12:55:02 +0000 (05:55 -0700)]
samples/bpf: Add test/example of using bpf_probe_write_user bpf helper

This example shows using a kprobe to act as a dnat mechanism to divert
traffic for arbitrary endpoints. It rewrite the arguments to a syscall
while they're still in userspace, and before the syscall has a chance
to copy the argument into kernel space.

Although this is an example, it also acts as a test because the mapped
address is 255.255.255.255:555 -> real address, and that's not a legal
address to connect to. If the helper is broken, the example will fail
on the intermediate steps, as well as the final step to verify the
rewrite of userspace memory succeeded.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobpf: Add bpf_probe_write_user BPF helper to be called in tracers
Sargun Dhillon [Mon, 25 Jul 2016 12:54:46 +0000 (05:54 -0700)]
bpf: Add bpf_probe_write_user BPF helper to be called in tracers

This allows user memory to be written to during the course of a kprobe.
It shouldn't be used to implement any kind of security mechanism
because of TOC-TOU attacks, but rather to debug, divert, and
manipulate execution of semi-cooperative processes.

Although it uses probe_kernel_write, we limit the address space
the probe can write into by checking the space with access_ok.
We do this as opposed to calling copy_to_user directly, in order
to avoid sleeping. In addition we ensure the threads's current fs
/ segment is USER_DS and the thread isn't exiting nor a kernel thread.

Given this feature is meant for experiments, and it has a risk of
crashing the system, and running programs, we print a warning on
when a proglet that attempts to use this helper is installed,
along with the pid and process name.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4_core: Check device state before unregistering it
Alex Vesker [Mon, 25 Jul 2016 12:42:13 +0000 (15:42 +0300)]
net/mlx4_core: Check device state before unregistering it

Verify that the device state is registered before un-registering it.
This check is required to prevent an OOPS on flows that do
re-registration of the device and its previous state was
unregistered.

Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomlxsw: spectrum: Fix compilation error when CLS_ACT isn't set
Ido Schimmel [Mon, 25 Jul 2016 10:12:33 +0000 (13:12 +0300)]
mlxsw: spectrum: Fix compilation error when CLS_ACT isn't set

When CONFIG_NET_CLS_ACT isn't set 'struct tcf_exts' has no member named
'actions' and we therefore must not access it. Otherwise compilation
fails.

Fix this by introducing a new macro similar to tc_no_actions(), which
always returns 'false' if CONFIG_NET_CLS_ACT isn't set.

Fixes: 763b4b70afcd ("mlxsw: spectrum: Add support in matchall mirror TC offloading")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: davinci_cpdma: remove excessive dump of register values to kernel log
Uwe Kleine-König [Mon, 25 Jul 2016 09:54:45 +0000 (11:54 +0200)]
net: davinci_cpdma: remove excessive dump of register values to kernel log

Such a big dump of register values is hardly useful on a production
system.

Another downside of the now removed functions is that calling
emac_dump_regs resulted in at least 87 calls to dev_info while holding a
spinlock and having irqs off which is a big source of latency.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agogtp: #define #define _GTP_H_ and not #define _GTP_H
Colin Ian King [Sun, 24 Jul 2016 18:24:09 +0000 (19:24 +0100)]
gtp: #define #define _GTP_H_ and not #define _GTP_H

Fix clang build warning:

./include/net/gtp.h:1:9: warning: '_GTP_H_' is used as a header
guard here, followed by #define of a different macro [-Wheader-guard]

fix by defining _GTP_H_ and not _GTP_H

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'mlx5-minimum-inline-header-mode'
David S. Miller [Tue, 26 Jul 2016 00:53:41 +0000 (17:53 -0700)]
Merge branch 'mlx5-minimum-inline-header-mode'

Saeed Mahameed says:

====================
Mellanox 100G mlx5 minimum inline header mode

This small series from Hadar adds the support for minimum inline header mode query
in mlx5e NIC driver.

Today on TX the driver copies to the HW descriptor only up to L2 header which is the default
required mode and sufficient for today's needs.

The header in the HW descriptor is used for HW loopback steering decision, without it packets
will go directly to the wire with no questions asked.

For TX loopback steering according to L2/L3/L4 headers, ConnectX-4 requires to copy the
corresponding headers into the send queue(SQ) WQE HW descriptor so it can decide whether to loop it back
or to forward to wire.

For legacy E-Switch mode only L2 headers copy is required.
For advanced steering (E-Switch offloads) more header layers may be required to be copied,
the required mode will be advertised by FW to each VF and PF according to the corresponding
E-Switch configuration.

Changes V2:
 - Allocate query_nic_vport_context_out on the stack
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx5e: Query minimum required header copy during xmit
Hadar Hen Zion [Sun, 24 Jul 2016 13:12:40 +0000 (16:12 +0300)]
net/mlx5e: Query minimum required header copy during xmit

Add support for query the minimum inline mode from the Firmware.
It is required for correct TX steering according to L3/L4 packet
headers.

Each send queue (SQ) has inline mode that defines the minimal required
headers that needs to be copied into the SQ WQE.
The driver asks the Firmware for the wqe_inline_mode device capability
value.  In case the device capability defined as "vport context" the
driver must check the reported min inline mode from the vport context
before creating its SQs.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx5e: Check the minimum inline header mode before xmit
Hadar Hen Zion [Sun, 24 Jul 2016 13:12:39 +0000 (16:12 +0300)]
net/mlx5e: Check the minimum inline header mode before xmit

Each send queue (SQ) has inline mode that defines the minimal required
inline headers in the SQ WQE.
Before sending each packet check that the minimum required headers
on the WQE are copied.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/sctp: terminate rhashtable walk correctly
Vegard Nossum [Sat, 23 Jul 2016 07:42:35 +0000 (09:42 +0200)]
net/sctp: terminate rhashtable walk correctly

I was seeing a lot of these:

    BUG: sleeping function called from invalid context at mm/slab.h:388
    in_atomic(): 0, irqs_disabled(): 0, pid: 14971, name: trinity-c2
    Preemption disabled at:[<ffffffff819bcd46>] rhashtable_walk_start+0x46/0x150

     [<ffffffff81149abb>] preempt_count_add+0x1fb/0x280
     [<ffffffff83295722>] _raw_spin_lock+0x12/0x40
     [<ffffffff811aac87>] console_unlock+0x2f7/0x930
     [<ffffffff811ab5bb>] vprintk_emit+0x2fb/0x520
     [<ffffffff811aba6a>] vprintk_default+0x1a/0x20
     [<ffffffff812c171a>] printk+0x94/0xb0
     [<ffffffff811d6ed0>] print_stack_trace+0xe0/0x170
     [<ffffffff8115835e>] ___might_sleep+0x3be/0x460
     [<ffffffff81158490>] __might_sleep+0x90/0x1a0
     [<ffffffff8139b823>] kmem_cache_alloc+0x153/0x1e0
     [<ffffffff819bca1e>] rhashtable_walk_init+0xfe/0x2d0
     [<ffffffff82ec64de>] sctp_transport_walk_start+0x1e/0x60
     [<ffffffff82edd8ad>] sctp_transport_seq_start+0x4d/0x150
     [<ffffffff8143a82b>] seq_read+0x27b/0x1180
     [<ffffffff814f97fc>] proc_reg_read+0xbc/0x180
     [<ffffffff813d471b>] __vfs_read+0xdb/0x610
     [<ffffffff813d4d3a>] vfs_read+0xea/0x2d0
     [<ffffffff813d615b>] SyS_pread64+0x11b/0x150
     [<ffffffff8100334c>] do_syscall_64+0x19c/0x410
     [<ffffffff832960a5>] return_from_SYSCALL_64+0x0/0x6a
     [<ffffffffffffffff>] 0xffffffffffffffff

Apparently we always need to call rhashtable_walk_stop(), even when
rhashtable_walk_start() fails:

 * rhashtable_walk_start - Start a hash table walk
 * @iter:       Hash table iterator
 *
 * Start a hash table walk.  Note that we take the RCU lock in all
 * cases including when we return an error.  So you must always call
 * rhashtable_walk_stop to clean up.

otherwise we never call rcu_read_unlock() and we get the splat above.

Fixes: 53fa1036 ("sctp: fix some rhashtable functions using in sctp proc/diag")
See-also: 53fa1036 ("sctp: fix some rhashtable functions using in sctp proc/diag")
See-also: f2dba9c6 ("rhashtable: Introduce rhashtable_walk_*")
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: stable@vger.kernel.org
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Mon, 25 Jul 2016 18:26:48 +0000 (11:26 -0700)]
Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
10GbE Intel Wired LAN Driver Updates 2016-07-22

This series contains updates to ixgbe and ixgbevf only.

Emil fixes the NACK check in ixgbevf_set_uc_addr_vf() for instances where
the index is not equal to zero.  Fixes an issue where mac->ops.setup_fc
can be NULL for backplanes which can cause the driver to crash on load.

Don fixes the second parameter of the LED functions, which is the index to
the LED we are interested in affecting.  Fixed variable to store register
reads to unsigned integer.  Adds support for the new x553 hardware into
ixgbevf.  Fixed a missing rtnl lock around ixgbevf_reinit_locked().
Fixed an issue where in ixgbevf_reset_subtask() was not verifying that
the port has been removed.  Cleans up the initial crosstalk fix, since
the SFP that indicates the presence of a SFP+ module changes between
hardware types.

Babu Moger fixes typo in freeing IRQ, since the array subscript increments
after the execution of the statement.

Wei Yongjun adds the missing destroy_workqueue() before returning from
ixgbe_init_module() in the error handling case.

Tony adds range checking for setting the MTU from the VF, where the PF can
return a NACK but this was not passed on to the VF, so propagate the
results from the PF to the VF so errors can be reported.  Consolidates
mailbox read and write functions, since the recent changes to
ixgbevf_write_msg_read_ack(), other functions are performing the same
operations done here.

Colin Ian King removes a redundant check on ret_val, since ret_val has
not changed since the previous check.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/irda: fix NULL pointer dereference on memory allocation failure
Vegard Nossum [Sat, 23 Jul 2016 05:43:50 +0000 (07:43 +0200)]
net/irda: fix NULL pointer dereference on memory allocation failure

I ran into this:

    kasan: CONFIG_KASAN_INLINE enabled
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] PREEMPT SMP KASAN
    CPU: 2 PID: 2012 Comm: trinity-c3 Not tainted 4.7.0-rc7+ #19
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
    task: ffff8800b745f2c0 ti: ffff880111740000 task.ti: ffff880111740000
    RIP: 0010:[<ffffffff82bbf066>]  [<ffffffff82bbf066>] irttp_connect_request+0x36/0x710
    RSP: 0018:ffff880111747bb8  EFLAGS: 00010286
    RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000069dd8358
    RDX: 0000000000000009 RSI: 0000000000000027 RDI: 0000000000000048
    RBP: ffff880111747c00 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000069dd8358 R11: 1ffffffff0759723 R12: 0000000000000000
    R13: ffff88011a7e4780 R14: 0000000000000027 R15: 0000000000000000
    FS:  00007fc738404700(0000) GS:ffff88011af00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007fc737fdfb10 CR3: 0000000118087000 CR4: 00000000000006e0
    Stack:
     0000000000000200 ffff880111747bd8 ffffffff810ee611 ffff880119f1f220
     ffff880119f1f4f8 ffff880119f1f4f0 ffff88011a7e4780 ffff880119f1f232
     ffff880119f1f220 ffff880111747d58 ffffffff82bca542 0000000000000000
    Call Trace:
     [<ffffffff82bca542>] irda_connect+0x562/0x1190
     [<ffffffff825ae582>] SYSC_connect+0x202/0x2a0
     [<ffffffff825b4489>] SyS_connect+0x9/0x10
     [<ffffffff8100334c>] do_syscall_64+0x19c/0x410
     [<ffffffff83295ca5>] entry_SYSCALL64_slow_path+0x25/0x25
    Code: 41 89 ca 48 89 e5 41 57 41 56 41 55 41 54 41 89 d7 53 48 89 fb 48 83 c7 48 48 89 fa 41 89 f6 48 c1 ea 03 48 83 ec 20 4c 8b 65 10 <0f> b6 04 02 84 c0 74 08 84 c0 0f 8e 4c 04 00 00 80 7b 48 00 74
    RIP  [<ffffffff82bbf066>] irttp_connect_request+0x36/0x710
     RSP <ffff880111747bb8>
    ---[ end trace 4cda2588bc055b30 ]---

The problem is that irda_open_tsap() can fail and leave self->tsap = NULL,
and then irttp_connect_request() almost immediately dereferences it.

Cc: stable@vger.kernel.org
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agosctp: also point GSO head_skb to the sk when it's available
Marcelo Ricardo Leitner [Sat, 23 Jul 2016 03:33:44 +0000 (00:33 -0300)]
sctp: also point GSO head_skb to the sk when it's available

The head skb for GSO packets won't travel through the inner depths of
SCTP stack as it doesn't contain any chunks on it. That means skb->sk
doesn't get set and then when sctp_recvmsg() calls
sctp_inet6_skb_msgname() on the head_skb it panics, as this last needs
to check flags at the socket (sp->v4mapped).

The fix is to initialize skb->sk for th head skb once we are able to do
it. That is, when the first chunk is processed.

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agosctp: fix BH handling on socket backlog
Marcelo Ricardo Leitner [Sat, 23 Jul 2016 03:32:48 +0000 (00:32 -0300)]
sctp: fix BH handling on socket backlog

Now that the backlog processing is called with BH enabled, we have to
disable BH before taking the socket lock via bh_lock_sock() otherwise
it may dead lock:

sctp_backlog_rcv()
                bh_lock_sock(sk);

                if (sock_owned_by_user(sk)) {
                        if (sk_add_backlog(sk, skb, sk->sk_rcvbuf))
                                sctp_chunk_free(chunk);
                        else
                                backloged = 1;
                } else
                        sctp_inq_push(inqueue, chunk);

                bh_unlock_sock(sk);

while sctp_inq_push() was disabling/enabling BH, but enabling BH
triggers pending softirq, which then may try to re-lock the socket in
sctp_rcv().

[  219.187215]  <IRQ>
[  219.187217]  [<ffffffff817ca3e0>] _raw_spin_lock+0x20/0x30
[  219.187223]  [<ffffffffa041888c>] sctp_rcv+0x48c/0xba0 [sctp]
[  219.187225]  [<ffffffff816e7db2>] ? nf_iterate+0x62/0x80
[  219.187226]  [<ffffffff816f1b14>] ip_local_deliver_finish+0x94/0x1e0
[  219.187228]  [<ffffffff816f1e1f>] ip_local_deliver+0x6f/0xf0
[  219.187229]  [<ffffffff816f1a80>] ? ip_rcv_finish+0x3b0/0x3b0
[  219.187230]  [<ffffffff816f17a8>] ip_rcv_finish+0xd8/0x3b0
[  219.187232]  [<ffffffff816f2122>] ip_rcv+0x282/0x3a0
[  219.187233]  [<ffffffff810d8bb6>] ? update_curr+0x66/0x180
[  219.187235]  [<ffffffff816abac4>] __netif_receive_skb_core+0x524/0xa90
[  219.187236]  [<ffffffff810d8e00>] ? update_cfs_shares+0x30/0xf0
[  219.187237]  [<ffffffff810d557c>] ? __enqueue_entity+0x6c/0x70
[  219.187239]  [<ffffffff810dc454>] ? enqueue_entity+0x204/0xdf0
[  219.187240]  [<ffffffff816ac048>] __netif_receive_skb+0x18/0x60
[  219.187242]  [<ffffffff816ad1ce>] process_backlog+0x9e/0x140
[  219.187243]  [<ffffffff816ac8ec>] net_rx_action+0x22c/0x370
[  219.187245]  [<ffffffff817cd352>] __do_softirq+0x112/0x2e7
[  219.187247]  [<ffffffff817cc3bc>] do_softirq_own_stack+0x1c/0x30
[  219.187247]  <EOI>
[  219.187248]  [<ffffffff810aa1c8>] do_softirq.part.14+0x38/0x40
[  219.187249]  [<ffffffff810aa24d>] __local_bh_enable_ip+0x7d/0x80
[  219.187254]  [<ffffffffa0408428>] sctp_inq_push+0x68/0x80 [sctp]
[  219.187258]  [<ffffffffa04190f1>] sctp_backlog_rcv+0x151/0x1c0 [sctp]
[  219.187260]  [<ffffffff81692b07>] __release_sock+0x87/0xf0
[  219.187261]  [<ffffffff81692ba0>] release_sock+0x30/0xa0
[  219.187265]  [<ffffffffa040e46d>] sctp_accept+0x17d/0x210 [sctp]
[  219.187266]  [<ffffffff810e7510>] ? prepare_to_wait_event+0xf0/0xf0
[  219.187268]  [<ffffffff8172d52c>] inet_accept+0x3c/0x130
[  219.187269]  [<ffffffff8168d7a3>] SYSC_accept4+0x103/0x210
[  219.187271]  [<ffffffff817ca2ba>] ? _raw_spin_unlock_bh+0x1a/0x20
[  219.187272]  [<ffffffff81692bfc>] ? release_sock+0x8c/0xa0
[  219.187276]  [<ffffffffa0413e22>] ? sctp_inet_listen+0x62/0x1b0 [sctp]
[  219.187277]  [<ffffffff8168f2d0>] SyS_accept+0x10/0x20

Fixes: 860fbbc343bf ("sctp: prepare for socket backlog behavior change")
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agohv_netvsc: Fix VF register on bonding devices
Haiyang Zhang [Sat, 23 Jul 2016 01:14:50 +0000 (18:14 -0700)]
hv_netvsc: Fix VF register on bonding devices

Added a condition to avoid bonding devices with same MAC registering
as VF.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agokcm: remove redundant -ve error check and return path
Colin Ian King [Fri, 22 Jul 2016 18:04:12 +0000 (19:04 +0100)]
kcm: remove redundant -ve error check and return path

The check for a -ve error is redundant, remove it and just
immediately return the return value from the call to
seq_open_net.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: ipv6: Always leave anycast and multicast groups on link down
Mike Manning [Fri, 22 Jul 2016 17:32:11 +0000 (18:32 +0100)]
net: ipv6: Always leave anycast and multicast groups on link down

Default kernel behavior is to delete IPv6 addresses on link
down, which entails deletion of the multicast and the
subnet-router anycast addresses. These deletions do not
happen with sysctl setting to keep global IPv6 addresses on
link down, so every link down/up causes an increment of the
anycast and multicast refcounts. These bogus refcounts may
stop these addrs from being removed on subsequent calls to
delete them. The solution is to leave the groups for the
multicast and subnet anycast on link down for the callflow
when global IPv6 addresses are kept.

Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional")
Signed-off-by: Mike Manning <mmanning@brocade.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge tag 'wireless-drivers-next-for-davem-2016-07-22' of git://git.kernel.org/pub...
David S. Miller [Mon, 25 Jul 2016 18:09:19 +0000 (11:09 -0700)]
Merge tag 'wireless-drivers-next-for-davem-2016-07-22' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
pull-request: wireless-drivers-next 2016-07-22

I'm sick so I have to keep this short, but here's the last pull request
to net-next. This time there's a trivial conflict with mtd tree:

http://lkml.kernel.org/g/20160720123133.44dab209@canb.auug.org.au

We concluded with Brian (CCed) that it's best that we ask Linus to fix
this. The patches have been in linux-next for a couple of days. This
time I haven't done any merge tests so I don't know if there are any
other conflicts etc.

Please let me know if there are any problems.

wireless-drivers-next patches for 4.8

Major changes:

wl18xx

* add initial mesh support

bcma

* serial flash support on non-MIPS SoCs

ath10k

* enable support for QCA9888
* disable wake_tx_queue() mac80211 op for older devices to workaround
  throughput regression

ath9k

* implement temperature compensation support for AR9003+
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agolibcxgb: remove unused including <linux/version.h>
Wei Yongjun [Fri, 22 Jul 2016 14:05:32 +0000 (14:05 +0000)]
libcxgb: remove unused including <linux/version.h>

Remove including <linux/version.h> that don't need it.

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agosctp: use inet_recvmsg to support sctp RFS well
Xin Long [Fri, 22 Jul 2016 13:25:42 +0000 (21:25 +0800)]
sctp: use inet_recvmsg to support sctp RFS well

Commit 486bdee0134c ("sctp: add support for RPS and RFS")
saves skb->hash into sk->sk_rxhash so that the inet_* can
record it to flow table.

But sctp uses sock_common_recvmsg as .recvmsg instead
of inet_recvmsg, sock_common_recvmsg doesn't invoke
sock_rps_record_flow to record the flow. It may cause
that the receiver has no chances to record the flow if
it doesn't send msg or poll the socket.

So this patch fixes it by using inet_recvmsg as .recvmsg
in sctp.

Fixes: 486bdee0134c ("sctp: add support for RPS and RFS")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'macsec-icv-fixes'
David S. Miller [Mon, 25 Jul 2016 17:55:40 +0000 (10:55 -0700)]
Merge branch 'macsec-icv-fixes'

Davide Caratti says:

====================
macsec: fix configurable ICV length

This series provides a fix for macsec configurable ICV length. The
maximum length of ICV element has been made compliant to IEEE 802.1AE,
and error reporting in case of cipher suite configuration failure has been
improved. Finally, a test has been added to netlink verify() callback in
order to avoid creation of macsec interfaces having user-provided ICV length
values that are not supported by the cipher suite.
====================

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomacsec: validate ICV length on link creation
Davide Caratti [Fri, 22 Jul 2016 13:07:58 +0000 (15:07 +0200)]
macsec: validate ICV length on link creation

Test the cipher suite initialization in case ICV length has a value
different than its default. If this test fails, creation of a new macsec
link will also fail. This avoids situations where further security
associations can't be added due to failures of crypto_aead_setauthsize(),
caused by unsupported user-provided values of the ICV length.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomacsec: fix error codes when a SA is created
Davide Caratti [Fri, 22 Jul 2016 13:07:57 +0000 (15:07 +0200)]
macsec: fix error codes when a SA is created

preserve the return value of AEAD functions that are called when a SA is
created, to avoid inappropriate display of "RTNETLINK answers: Cannot
allocate memory" message.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomacsec: limit ICV length to 16 octets
Davide Caratti [Fri, 22 Jul 2016 13:07:56 +0000 (15:07 +0200)]
macsec: limit ICV length to 16 octets

IEEE 802.1AE-2006 standard recommends that the ICV element in a MACsec
frame should not exceed 16 octets: add MACSEC_STD_ICV_LEN in uapi
definitions accordingly, and avoid accepting configurations where the ICV
length exceeds the standard value. Leave definition of MACSEC_MAX_ICV_LEN
unchanged for backwards compatibility with userspace programs.

Fixes: dece8d2b78d1 ("uapi: add MACsec bits")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobridge: Fix incorrect re-injection of LLDP packets
Ido Schimmel [Fri, 22 Jul 2016 11:56:20 +0000 (14:56 +0300)]
bridge: Fix incorrect re-injection of LLDP packets

Commit 8626c56c8279 ("bridge: fix potential use-after-free when hook
returns QUEUE or STOLEN verdict") caused LLDP packets arriving through a
bridge port to be re-injected to the Rx path with skb->dev set to the
bridge device, but this breaks the lldpad daemon.

The lldpad daemon opens a packet socket with protocol set to ETH_P_LLDP
for any valid device on the system, which doesn't not include soft
devices such as bridge and VLAN.

Since packet sockets (ptype_base) are processed in the Rx path after the
Rx handler, LLDP packets with skb->dev set to the bridge device never
reach the lldpad daemon.

Fix this by making the bridge's Rx handler re-inject LLDP packets with
RX_HANDLER_PASS, which effectively restores the behaviour prior to the
mentioned commit.

This means netfilter will never receive LLDP packets coming through a
bridge port, as I don't see a way in which we can have okfn() consume
the packet without breaking existing behaviour. I've already carried out
a similar fix for STP packets in commit 56fae404fb2c ("bridge: Fix
incorrect re-injection of STP packets").

Fixes: 8626c56c8279 ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agosctp: support ipv6 nonlocal bind
Xin Long [Fri, 22 Jul 2016 09:38:51 +0000 (17:38 +0800)]
sctp: support ipv6 nonlocal bind

This patch makes sctp support ipv6 nonlocal bind by adding
sp->inet.freebind and net->ipv6.sysctl.ip_nonlocal_bind
check in sctp_v6_available as what sctp did to support
ipv4 nonlocal bind (commit cdac4e077489).

Reported-by: Shijoe George <spanjikk@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Mon, 25 Jul 2016 17:43:07 +0000 (10:43 -0700)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Conflicts:
drivers/net/ethernet/intel/i40e/i40e_main.c

Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2016-07-22

This series contains updates to i40e and i40evf.

Heinrich Schuchardt found a possible null pointer being dereferenced in
i40e_debug_aq(), fixed the issue by doing the variable assignment after
we are sure the pointer is not null.

Avinash fixed an issue when link was down, we were not showing the
correct advertised link modes.

Mitch cleans up a useless initializer since the variable is assigned
right away.  Refactors the receive filter handling to properly track
filter adds and deletes so the driver will not lose filters during a
reset and up/down cycles.  Also added a tracking mechanism so that the
driver knows when to enter and leave promiscuous mode.

Catherine removes a device id which is not needed (or used).  Moves
a mutex lock since we need to lock the client list around the
i40e_client_release() call to prevent the release from interrupting
the client instances while they are being added.

Joshua adds Hyper-V specific VF device ids.

Amitoj Kaur Chawla cleans up a redundant memset() call before a memcpy().

Stefan Assmann adds the missing link advertise for some x710 NICs.

Tushar Dave fixes and issue found on SPARC, where a PF reset clears MAC
filters and if a platform-specific MAC address is used, the driver has
to explicitly write default MAC address to MAC filters otherwise all
incoming traffic destined to the default MAC address will be dropped
after reset.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobpf, events: fix offset in skb copy handler
Daniel Borkmann [Thu, 21 Jul 2016 23:19:42 +0000 (01:19 +0200)]
bpf, events: fix offset in skb copy handler

This patch fixes the __output_custom() routine we currently use with
bpf_skb_copy(). I missed that when len is larger than the size of the
current handle, we can issue multiple invocations of copy_func, and
__output_custom() advances destination but also source buffer by the
written amount of bytes. When we have __output_custom(), this is actually
wrong since in that case the source buffer points to a non-linear object,
in our case an skb, which the copy_func helper is supposed to walk.
Therefore, since this is non-linear we thus need to pass the offset into
the helper, so that copy_func can use it for extracting the data from
the source object.

Therefore, adjust the callback signatures properly and pass offset
into the skb_header_pointer() invoked from bpf_skb_copy() callback. The
__DEFINE_OUTPUT_COPY_BODY() is adjusted to accommodate for two things:
i) to pass in whether we should advance source buffer or not; this is
a compile-time constant condition, ii) to pass in the offset for
__output_custom(), which we do with help of __VA_ARGS__, so everything
can stay inlined as is currently. Both changes allow for adapting the
__output_* fast-path helpers w/o extra overhead.

Fixes: 555c8a8623a3 ("bpf: avoid stack copy and use skb ctx for event output")
Fixes: 7e3f977edd0b ("perf, events: add non-linear data support for raw records")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/ncsi: avoid maybe-uninitialized warning
Arnd Bergmann [Thu, 21 Jul 2016 19:28:34 +0000 (21:28 +0200)]
net/ncsi: avoid maybe-uninitialized warning

gcc-4.9 and higher warn about the newly added NSCI code:

net/ncsi/ncsi-manage.c: In function 'ncsi_process_next_channel':
net/ncsi/ncsi-manage.c:1003:2: error: 'old_state' may be used uninitialized in this function [-Werror=maybe-uninitialized]

The warning is a false positive and therefore harmless, but it would be good to
avoid it anyway. I have determined that the barrier in the spin_unlock_irqsave()
is what confuses gcc to the point that it cannot track whether the variable
was unused or not.

This rearranges the code in a way that makes it obvious to gcc that old_state
is always initialized at the time of use, functionally this should not
change anything.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'libcxgb'
David S. Miller [Mon, 25 Jul 2016 17:31:09 +0000 (10:31 -0700)]
Merge branch 'libcxgb'

Varun Prakash says:

====================
common library for Chelsio drivers.

 This patch series adds common library module(libcxgb.ko)
 for Chelsio drivers to remove duplicate code.

 This series moves common iSCSI DDP Page Pod manager
 code from cxgb4.ko to libcxgb.ko, earlier this code
 was used by only cxgbit.ko now it is used by
 three Chelsio iSCSI drivers cxgb3i, cxgb4i, cxgbit.

 In future this module will have common connection
 management and hardware specific code that can
 be shared by multiple Chelsio drivers(cxgb4,
 csiostor, iw_cxgb4, cxgb4i, cxgbit).

 Please review.

 Thanks

-v3
- removed unused module init and exit functions.

-v2
- updated CONFIG_CHELSIO_LIB to an invisible option
- changed libcxgb.ko module license from GPL to Dual BSD/GPL
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agocxgb3i, cxgb4i: fix symbol not declared sparse warning
Varun Prakash [Thu, 21 Jul 2016 17:27:19 +0000 (22:57 +0530)]
cxgb3i, cxgb4i: fix symbol not declared sparse warning

Fix following sparse warnings
warning: symbol 'cxgb3i_ofld_init' was not declared. Should it be static?
warning: symbol 'cxgb4i_cplhandlers' was not declared. Should it be static?
warning: symbol 'cxgb4i_ofld_init' was not declared. Should it be static?

Signed-off-by: Varun Prakash <varun@chelsio.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agolibcxgb: export ppm release and tagmask set api
Varun Prakash [Thu, 21 Jul 2016 17:27:18 +0000 (22:57 +0530)]
libcxgb: export ppm release and tagmask set api

Export cxgbi_ppm_release() to release
ppod manager and cxgbi_tagmask_set() to
set tag mask, they are used by cxgb3i, cxgb4i
and cxgbit.

Signed-off-by: Varun Prakash <varun@chelsio.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agocxgb3i: add iSCSI DDP support
Varun Prakash [Thu, 21 Jul 2016 17:27:17 +0000 (22:57 +0530)]
cxgb3i: add iSCSI DDP support

Add iSCSI DDP support in cxgb3i driver
using common iSCSI DDP Page Pod Manager.

Signed-off-by: Varun Prakash <varun@chelsio.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agocxgb4i,libcxgbi: add iSCSI DDP support
Varun Prakash [Thu, 21 Jul 2016 17:27:16 +0000 (22:57 +0530)]
cxgb4i,libcxgbi: add iSCSI DDP support

Add iSCSI DDP support in cxgb4i driver
using common iSCSI DDP Page Pod Manager.

Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agocxgb3i,cxgb4i,libcxgbi: remove iSCSI DDP support
Varun Prakash [Thu, 21 Jul 2016 17:27:15 +0000 (22:57 +0530)]
cxgb3i,cxgb4i,libcxgbi: remove iSCSI DDP support

Remove old ddp code from cxgb3i,cxgb4i,libcxgbi.

Next two commits adds DDP support using
common iSCSI DDP Page Pod Manager.

Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agolibcxgb: add library module for Chelsio drivers
Varun Prakash [Thu, 21 Jul 2016 17:27:14 +0000 (22:57 +0530)]
libcxgb: add library module for Chelsio drivers

Add common library module(libcxgb.ko) for
Chelsio drivers to remove duplicate code.

Code for iSCSI DDP Page Pod Manager is moved
from cxgb4.ko to libcxgb.ko. Earlier only cxgbit.ko
was using this code, now cxgb3i and cxgb4i will
also use common Page Pod manager code.

In future this module will have common connection
management and hardware specific code that can be
shared by multiple Chelsio drivers.

Signed-off-by: Varun Prakash <varun@chelsio.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: bridge: br_set_ageing_time takes a clock_t
Vivien Didelot [Thu, 21 Jul 2016 16:42:19 +0000 (12:42 -0400)]
net: bridge: br_set_ageing_time takes a clock_t

Change the ageing_time type in br_set_ageing_time() from u32 to what it
is expected to be, i.e. a clock_t.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: bridge: fix br_stp_enable_bridge comment
Vivien Didelot [Thu, 21 Jul 2016 16:42:10 +0000 (12:42 -0400)]
net: bridge: fix br_stp_enable_bridge comment

br_stp_enable_bridge() does take the br->lock spinlock. Fix its wrongly
pasted comment and use the same as br_stp_disable_bridge().

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agocxgb4/cxgb4vf: Add link mode mask API to cxgb4 and cxgb4vf
Ganesh Goudar [Thu, 21 Jul 2016 14:49:18 +0000 (20:19 +0530)]
cxgb4/cxgb4vf: Add link mode mask API to cxgb4 and cxgb4vf

Based on original work by Casey Leedom <leedom@chelsio.com>

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/bonding: Enforce active-backup policy for IPoIB bonds
Mark Bloch [Thu, 21 Jul 2016 08:52:55 +0000 (11:52 +0300)]
net/bonding: Enforce active-backup policy for IPoIB bonds

When using an IPoIB bond currently only active-backup mode is a valid
use case and this commit strengthens it.

Since commit 2ab82852a270 ("net/bonding: Enable bonding to enslave
netdevices not supporting set_mac_address()") was introduced till
4.7-rc1, IPoIB didn't support the set_mac_address ndo, and hence the
fail over mac policy always applied to IPoIB bonds.

With the introduction of commit 492a7e67ff83 ("IB/IPoIB: Allow setting
the device address"), that doesn't hold and practically IPoIB bonds are
broken as of that. To fix it, lets go to fail over mac if the device
doesn't support the ndo OR this is IPoIB device.

As a by-product, this commit also prevents a stack corruption which
occurred when trying to copy 20 bytes (IPoIB) device address
to a sockaddr struct that has only 16 bytes of storage.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'mlxsw-port-mirroring'
David S. Miller [Mon, 25 Jul 2016 06:12:00 +0000 (23:12 -0700)]
Merge branch 'mlxsw-port-mirroring'

Jiri Pirko says:

====================
mlxsw: implement port mirroring offload

This patchset introduces tc matchall classifier and its offload
to Spectrum hardware. In combination with mirred action, defined port mirroring
setup is offloaded by mlxsw/spectrum driver.

The commands used for creating mirror ports:

tc qdisc  add dev eth25 handle ffff: ingress
tc filter add dev eth25 parent ffff:            \
        matchall skip_sw                        \
        action mirred egress mirror             \
        dev eth27

tc qdisc add dev eth25 handle 1: root prio
tc filter add dev eth25 parent 1:               \
        matchall skip_sw                        \
        action mirred egress mirror             \
        dev eth27

These patches contain:
 - Resource query implementation
 - Hardware port mirorring support for spectrum.
 - Definition of the matchall traffic classifier.
 - General support for hw-offloading for that classifier.
 - Specific spectrum implementaion for matchall offloading.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomlxsw: spectrum: Add support in matchall mirror TC offloading
Yotam Gigi [Thu, 21 Jul 2016 10:03:17 +0000 (12:03 +0200)]
mlxsw: spectrum: Add support in matchall mirror TC offloading

This patch offloads port mirroring directives to hw using the matchall TC
with action mirror. It includes both the implementation of the
ndo_setup_tc function for the spectrum driver and the spectrum hardware
offload configuration code.

The hardware offload code is basically two new functions which are capable
of adding and removing a new mirror ports pair. It is done using the MPAT,
MPAR and SBIB registers:
 - A new Switch-Port Analyzer (SPAN) entry is added using MPAT to the 'to'
   port.
 - The 'to' port is bound to the SPAN entry using MPAR register.
 - In case of egress SPAN, the 'to' port gets a new internal shared
   buffer using SBIB register.

In addition, a new database was added to the mlxsw_sp struct to store all
the SPAN entries and their bound ports list. The number of supported SPAN
entries is determined by resource query.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/sched: act_mirred: Add helper inlines to access tcf_mirred info.
Yotam Gigi [Thu, 21 Jul 2016 10:03:16 +0000 (12:03 +0200)]
net/sched: act_mirred: Add helper inlines to access tcf_mirred info.

The helper function is_tcf_mirred_mirror helps finding whether an action
struct is of type mirred and is configured to be of type mirror.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomlxsw: reg: Add the Monitoring Port Analyzer register
Yotam Gigi [Thu, 21 Jul 2016 10:03:15 +0000 (12:03 +0200)]
mlxsw: reg: Add the Monitoring Port Analyzer register

The MPAR register is used to bind ports to a SPAN entry (which was
created using MPAT register) and thus mirror their traffic (ingress /
egress) to a different port.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomlxsw: reg: Add Monitoring Port Analyzer Table register
Yotam Gigi [Thu, 21 Jul 2016 10:03:14 +0000 (12:03 +0200)]
mlxsw: reg: Add Monitoring Port Analyzer Table register

The MPAT register is used to query and configure the Switch Port Analyzer
(SPAN) table. This register is used to configure a port as a mirror output
port, while after that a mirrored input port can be bound using MPAR
register.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomlxsw: reg: Add Shared Buffer Internal Buffer register
Yotam Gigi [Thu, 21 Jul 2016 10:03:13 +0000 (12:03 +0200)]
mlxsw: reg: Add Shared Buffer Internal Buffer register

The SBIB register configures per port buffer for internal use. This
register is used to configure an egress mirror buffer on the egress port
which does the mirroring.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/sched: Add match-all classifier hw offloading.
Yotam Gigi [Thu, 21 Jul 2016 10:03:12 +0000 (12:03 +0200)]
net/sched: Add match-all classifier hw offloading.

Following the work that have been done on offloading classifiers like u32
and flower, now the match-all classifier hw offloading is possible. if
the interface supports tc offloading.

To control the offloading, two tc flags have been introduced: skip_sw and
skip_hw. Typical usage:

tc filter add dev eth25 parent ffff:  \
matchall skip_sw \
action mirred egress mirror \
dev eth27

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/sched: introduce Match-all classifier
Jiri Pirko [Thu, 21 Jul 2016 10:03:11 +0000 (12:03 +0200)]
net/sched: introduce Match-all classifier

The matchall classifier matches every packet and allows the user to apply
actions on it. This filter is very useful in usecases where every packet
should be matched, for example, packet mirroring (SPAN) can be setup very
easily using that filter.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomlxsw: pci: Add max span resources to resources query
Nogah Frankel [Thu, 21 Jul 2016 10:03:10 +0000 (12:03 +0200)]
mlxsw: pci: Add max span resources to resources query

Add max span resources to resources query.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomlxsw: pci: Add resources query implementation.
Nogah Frankel [Thu, 21 Jul 2016 10:03:09 +0000 (12:03 +0200)]
mlxsw: pci: Add resources query implementation.

Add resources query implementation. If exists, query the HW for its
builtin resources instead of having them as consts in the code.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agocdc_ether: Improve ZTE MF823/831/910 handling
Kristian Evensen [Thu, 21 Jul 2016 09:10:06 +0000 (11:10 +0200)]
cdc_ether: Improve ZTE MF823/831/910 handling

The firmware in several ZTE devices (at least the MF823/831/910
modems/mifis) use OS fingerprinting to determine which type of device to
export. In addition, these devices export a REST API which can be used to
control the type of device. So far, on Linux, the devices have been seen as
RNDIS or CDC Ether.

When CDC Ether is used, devices of the same type are, as with RNDIS,
exported with the same, bogus random MAC address. In addition, the devices
(at least on all firmware revisions I have found) use the bogus MAC when
sending traffic routed from external networks. And as a final feature, the
devices sometimes export the link state incorrectly. There are also
references online to several other ZTE devices displaying this behavior,
with several different PIDs and MAC addresses.

This patch tries to improve the handling of ZTE devices by doing the
following:

* Create a new driver_info-struct that is used by ZTE devices that do not
have an explicit entry in the product table. This struct is the same as the
default cdc_ether driver info, but a new bind- and an rx_fixup-function
have been added.

* In the new bind function, we check if we have read a random MAC from the
device. If we have, then we generate a new random MAC address. This will
ensure that all devices get a unique MAC.

* The rx_fixup-function replaces the destination MAC address in the skb
with that of the device. I have not seen a revision of these devices that
behaves correctly (i.e., sets the right destination MAC), so I chose not to
do any comparison with for example the known, bogus addresses.

* The MF823/MF832/MF910 sometimes export cdc carrier on twice on connect
(the correct behavior is off then on). Work around this by manually setting
carrier to off if an on-notification is received and the NOCARRIER-bit is
not set.

This change will affect all devices, but it should take care of similar
mistakes made by other manufacturers. I tried to think of/look/test for
problems/regressions that could be introduced by this behavior, but could
not find any. However, my familiarity with this code path is not that
great, so there could be something I have overlooked.

I have tested this patch with multiple revisions of all three devices, and
they behave as expected. In other words, they all got a valid, random MAC,
the correct operational state and I can receive/sent traffic without
problems. I also tested with some other cdc_ether devices I have and did
not find any problems/regressions caused by the two general changes.

v3->v4:
* Forgot to remove unused variables, sorry about that (thanks David
Miller).

v2->v3:
* I had forgot to remove the random MAC generation from usbnet_cdc_bind()
(thanks Oliver).
* Rework logic in the ZTE bind-function a bit.

v1->v2:
* Only generate random MAC for ZTE devices (thanks Oliver Neukum).
* Set random MAC and do RX fixup for all ZTE devices that do not have a
product-entry, as the bogus MAC have been seen on devices with several
different PIDs/MAC addresses. In other words, it seems to be the default
behavior of ZTE CDC Ether devices (thanks Lars Melin).

Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
David S. Miller [Mon, 25 Jul 2016 05:02:36 +0000 (22:02 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for net-next,
they are:

1) Count pre-established connections as active in "least connection"
   schedulers such that pre-established connections to avoid overloading
   backend servers on peak demands, from Michal Kubecek via Simon Horman.

2) Address a race condition when resizing the conntrack table by caching
   the bucket size when fulling iterating over the hashtable in these
   three possible scenarios: 1) dump via /proc/net/nf_conntrack,
   2) unlinking userspace helper and 3) unlinking custom conntrack timeout.
   From Liping Zhang.

3) Revisit early_drop() path to perform lockless traversal on conntrack
   eviction under stress, use del_timer() as synchronization point to
   avoid two CPUs evicting the same entry, from Florian Westphal.

4) Move NAT hlist_head to nf_conn object, this simplifies the existing
   NAT extension and it doesn't increase size since recent patches to
   align nf_conn, from Florian.

5) Use rhashtable for the by-source NAT hashtable, also from Florian.

6) Don't allow --physdev-is-out from OUTPUT chain, just like
   --physdev-out is not either, from Hangbin Liu.

7) Automagically set on nf_conntrack counters if the user tries to
   match ct bytes/packets from nftables, from Liping Zhang.

8) Remove possible_net_t fields in nf_tables set objects since we just
   simply pass the net pointer to the backend set type implementations.

9) Fix possible off-by-one in h323, from Toby DiPasquale.

10) early_drop() may be called from ctnetlink patch, so we must hold
    rcu read size lock from them too, this amends Florian's patch #3
    coming in this batch, from Liping Zhang.

11) Use binary search to validate jump offset in x_tables, this
    addresses the O(n!) validation that was introduced recently
    resolve security issues with unpriviledge namespaces, from Florian.

12) Fix reference leak to connlabel in error path of nft_ct, from Zhang.

13) Three updates for nft_log: Fix log prefix leak in error path. Bail
    out on loglevel larger than debug in nft_log and set on the new
    NF_LOG_F_COPY_LEN flag when snaplen is specified. Again from Zhang.

14) Allow to filter rule dumps in nf_tables based on table and chain
    names.

15) Simplify connlabel to always use 128 bits to store labels and
    get rid of unused function in xt_connlabel, from Florian.

16) Replace set_expect_timeout() by mod_timer() from the h323 conntrack
    helper, by Gao Feng.

17) Put back x_tables module reference in nft_compat on error, from
    Liping Zhang.

18) Add a reference count to the x_tables extensions cache in
    nft_compat, so we can remove them when unused and avoid a crash
    if the extensions are rmmod, again from Zhang.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Sat, 23 Jul 2016 23:31:37 +0000 (19:31 -0400)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Just several instances of overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonetfilter: nft_compat: fix crash when related match/target module is removed
Liping Zhang [Sat, 23 Jul 2016 08:00:32 +0000 (16:00 +0800)]
netfilter: nft_compat: fix crash when related match/target module is removed

We "cache" the loaded match/target modules and reuse them, but when the
modules are removed, we still point to them. Then we may end up with
invalid memory references when using iptables-compat to add rules later.

Input the following commands will reproduce the kernel crash:
  # iptables-compat -A INPUT -j LOG
  # iptables-compat -D INPUT -j LOG
  # rmmod xt_LOG
  # iptables-compat -A INPUT -j LOG
  BUG: unable to handle kernel paging request at ffffffffa05a9010
  IP: [<ffffffff813f783e>] strcmp+0xe/0x30
  Call Trace:
  [<ffffffffa05acc43>] nft_target_select_ops+0x83/0x1f0 [nft_compat]
  [<ffffffffa058a177>] nf_tables_expr_parse+0x147/0x1f0 [nf_tables]
  [<ffffffffa058e541>] nf_tables_newrule+0x301/0x810 [nf_tables]
  [<ffffffff8141ca00>] ? nla_parse+0x20/0x100
  [<ffffffffa057fa8f>] nfnetlink_rcv+0x33f/0x53d [nfnetlink]
  [<ffffffffa057f94b>] ? nfnetlink_rcv+0x1fb/0x53d [nfnetlink]
  [<ffffffff817116b8>] netlink_unicast+0x178/0x220
  [<ffffffff81711a5b>] netlink_sendmsg+0x2fb/0x3a0
  [<ffffffff816b7fc8>] sock_sendmsg+0x38/0x50
  [<ffffffff816b8a7e>] ___sys_sendmsg+0x28e/0x2a0
  [<ffffffff816bcb7e>] ? release_sock+0x1e/0xb0
  [<ffffffff81804ac5>] ? _raw_spin_unlock_bh+0x35/0x40
  [<ffffffff816bcbe2>] ? release_sock+0x82/0xb0
  [<ffffffff816b93d4>] __sys_sendmsg+0x54/0x90
  [<ffffffff816b9422>] SyS_sendmsg+0x12/0x20
  [<ffffffff81805172>] entry_SYSCALL_64_fastpath+0x1a/0xa9

So when nobody use the related match/target module, there's no need to
"cache" it. And nft_[match|target]_release are useless anymore, remove
them.

Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
9 years agonetfilter: nft_compat: put back match/target module if init fail
Liping Zhang [Sat, 23 Jul 2016 08:00:31 +0000 (16:00 +0800)]
netfilter: nft_compat: put back match/target module if init fail

If the user specify the invalid NFTA_MATCH_INFO/NFTA_TARGET_INFO attr
or memory alloc fail, we should call module_put to the related match
or target. Otherwise, we cannot remove the module even nobody use it.

Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
9 years agonetfilter: h323: Use mod_timer instead of set_expect_timeout
Gao Feng [Fri, 22 Jul 2016 04:59:15 +0000 (12:59 +0800)]
netfilter: h323: Use mod_timer instead of set_expect_timeout

Simplify the code without any side effect. The set_expect_timeout is
used to modify the timer expired time.  It tries to delete timer, and
add it again.  So we could use mod_timer directly.

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
9 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Sat, 23 Jul 2016 06:44:31 +0000 (15:44 +0900)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Fix memory leak in nftables, from Liping Zhang.

 2) Need to check result of vlan_insert_tag() in batman-adv otherwise we
    risk NULL skb derefs, from Sven Eckelmann.

 3) Check for dev_alloc_skb() failures in cfg80211, from Gregory
    Greenman.

 4) Handle properly when we have ppp_unregister_channel() happening in
    parallel with ppp_connect_channel(), from WANG Cong.

 5) Fix DCCP deadlock, from Eric Dumazet.

 6) Bail out properly in UDP if sk_filter() truncates the packet to be
    smaller than even the space that the protocol headers need.  From
    Michal Kubecek.

 7) Similarly for rose, dccp, and sctp, from Willem de Bruijn.

 8) Make TCP challenge ACKs less predictable, from Eric Dumazet.

 9) Fix infinite loop in bgmac_dma_tx_add() from Florian Fainelli.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
  packet: propagate sock_cmsg_send() error
  net/mlx5e: Fix del vxlan port command buffer memset
  packet: fix second argument of sock_tx_timestamp()
  net: switchdev: change ageing_time type to clock_t
  Update maintainer for EHEA driver.
  net/mlx4_en: Add resilience in low memory systems
  net/mlx4_en: Move filters cleanup to a proper location
  sctp: load transport header after sk_filter
  net/sched/sch_htb: clamp xstats tokens to fit into 32-bit int
  net: cavium: liquidio: Avoid dma_unmap_single on uninitialized ndata
  net: nb8800: Fix SKB leak in nb8800_receive()
  et131x: Fix logical vs bitwise check in et131x_tx_timeout()
  vlan: use a valid default mtu value for vlan over macsec
  net: bgmac: Fix infinite loop in bgmac_dma_tx_add()
  mlxsw: spectrum: Prevent invalid ingress buffer mapping
  mlxsw: spectrum: Prevent overwrite of DCB capability fields
  mlxsw: spectrum: Don't emit errors when PFC is disabled
  mlxsw: spectrum: Indicate support for autonegotiation
  mlxsw: spectrum: Force link training according to admin state
  r8152: add MODULE_VERSION
  ...

9 years agoMerge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszer...
Linus Torvalds [Sat, 23 Jul 2016 05:25:02 +0000 (14:25 +0900)]
Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs fixes from Miklos Szeredi:
 "This contains a fix for a potential crash/corruption issue and another
  where the suid/sgid bits weren't cleared on write"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: verify upper dentry in ovl_remove_and_whiteout()
  ovl: Copy up underlying inode's ->i_mode to overlay inode
  ovl: handle ATTR_KILL*

9 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sat, 23 Jul 2016 03:54:20 +0000 (12:54 +0900)]
Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
 "Five fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  pps: do not crash when failed to register
  tools/vm/slabinfo: fix an unintentional printf
  testing/radix-tree: fix a macro expansion bug
  radix-tree: fix radix_tree_iter_retry() for tagged iterators.
  mm: memcontrol: fix cgroup creation failure after many small jobs

9 years agoMerge tag 'drm-fixes-for-v4.7-rc8-intel-kbl' of git://people.freedesktop.org/~airlied...
Linus Torvalds [Sat, 23 Jul 2016 03:51:52 +0000 (12:51 +0900)]
Merge tag 'drm-fixes-for-v4.7-rc8-intel-kbl' of git://people.freedesktop.org/~airlied/linux

Pull intel kabylake drm fixes from Dave Airlie:
 "As mentioned Intel has gathered all the Kabylake fixes from -next,
  which we've enabled in 4.7 for the first time, these are pretty much
  limited in scope to only affects kabylake, which is hw that isn't
  shipping yet.  So I'm mostly okay with it going in now.

  If we don't land this, it might be a good idea to disable kabylake
  support in 4.7 before we ship"

* tag 'drm-fixes-for-v4.7-rc8-intel-kbl' of git://people.freedesktop.org/~airlied/linux: (28 commits)
  drm/i915/kbl: Introduce the first official DMC for Kabylake.
  drm/i915: Introduce Kabypoint PCH for Kabylake H/DT.
  drm/i915/gen9: implement WaConextSwitchWithConcurrentTLBInvalidate
  drm/i915/gen9: Add WaFbcHighMemBwCorruptionAvoidance
  drm/i195/fbc: Add WaFbcNukeOnHostModify
  drm/i915/gen9: Add WaFbcWakeMemOn
  drm/i915/gen9: Add WaFbcTurnOffFbcWatermark
  drm/i915/kbl: Add WaClearSlmSpaceAtContextSwitch
  drm/i915/gen9: Add WaEnableChickenDCPR
  drm/i915/kbl: Add WaDisableSbeCacheDispatchPortSharing
  drm/i915/kbl: Add WaDisableGafsUnitClkGating
  drm/i915/kbl: Add WaForGAMHang
  drm/i915: Add WaInsertDummyPushConstP for bxt and kbl
  drm/i915/kbl: Add WaDisableDynamicCreditSharing
  drm/i915/kbl: Add WaDisableGamClockGating
  drm/i915/gen9: Enable must set chicken bits in config0 reg
  drm/i915/kbl: Add WaDisableLSQCROPERFforOCL
  drm/i915/kbl: Add WaDisableSDEUnitClockGating
  drm/i915/kbl: Add WaDisableFenceDestinationToSLM for A0
  drm/i915/kbl: Add WaEnableGapsTsvCreditFix
  ...

9 years agoMerge tag 'drm-fixes-for-v4.7-rc8-intel' of git://people.freedesktop.org/~airlied...
Linus Torvalds [Sat, 23 Jul 2016 03:46:42 +0000 (12:46 +0900)]
Merge tag 'drm-fixes-for-v4.7-rc8-intel' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "Two i915 regression fixes.

  Intel have submitted some Kabylake fixes I'll send separately, since
  this is the first kernel with kabylake support and they don't go much
  outside that area I think they should be fine"

* tag 'drm-fixes-for-v4.7-rc8-intel' of git://people.freedesktop.org/~airlied/linux:
  drm/i915: add missing condition for committing planes on crtc
  drm/i915: Treat eDP as always connected, again

9 years agoMerge tag 'm68k-for-v4.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert...
Linus Torvalds [Sat, 23 Jul 2016 03:39:08 +0000 (12:39 +0900)]
Merge tag 'm68k-for-v4.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k

Pull m68k upddates from Geert Uytterhoeven:
 - assorted spelling fixes
 - defconfig updates

* tag 'm68k-for-v4.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k/defconfig: Update defconfigs for v4.7-rc2
  m68k: Assorted spelling fixes

9 years agoMerge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Linus Torvalds [Sat, 23 Jul 2016 03:32:50 +0000 (12:32 +0900)]
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "A handful of fixes before final release:

  Marvell Armada:
   - One to fix a typo in the devicetree specifying memory ranges for
     the crypto engine
   - Two to deal with marking PCI and device-memory as strongly ordered
     to avoid hardware deadlocks, in particular when enabling above
     crypto driver.
   - Compile fix for PM

  Allwinner:
   - DT clock fixes to deal with u-boot-enabled framebuffer (simplefb).
   - Make R8 (C.H.I.P. SoC) inherit system compatibility from A13 to
     make clocks register proper.

  Tegra:
   - Fix SD card voltage setting on the Tegra3 Beaver dev board

  Misc:
   - Two maintainers updates for STM32 and STi platforms"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: tegra: beaver: Allow SD card voltage to be changed
  MAINTAINERS: update STi maintainer list
  MAINTAINERS: update STM32 maintainers list
  ARM: mvebu: compile pm code conditionally
  ARM: dts: sun7i: Fix pll3x2 and pll7x2 not having a parent clock
  ARM: dts: sunxi: Add pll3 to simplefb nodes clocks lists
  ARM: dts: armada-38x: fix MBUS_ID for crypto SRAM on Armada 385 Linksys
  ARM: mvebu: map PCI I/O regions strongly ordered
  ARM: mvebu: fix HW I/O coherency related deadlocks
  ARM: sunxi/dt: make the CHIP inherit from allwinner,sun5i-a13

9 years agoMerge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Sat, 23 Jul 2016 03:20:55 +0000 (12:20 +0900)]
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
 "This fixes a sporadic build failure in the qat driver as well as a
  memory corruption bug in rsa-pkcs1pad"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: rsa-pkcs1pad - fix rsa-pkcs1pad request struct
  crypto: qat - make qat_asym_algs.o depend on asn1 headers