]> www.infradead.org Git - users/hch/block.git/log
users/hch/block.git
6 years agolib/dim: Fix -Wunused-const-variable warnings
Leon Romanovsky [Tue, 23 Jul 2019 07:22:48 +0000 (10:22 +0300)]
lib/dim: Fix -Wunused-const-variable warnings

DIM causes to the following warnings during kernel compilation
which indicates that tx_profile and rx_profile are supposed to
be declared in *.c and not in *.h files.

In file included from ./include/rdma/ib_verbs.h:64,
                 from ./include/linux/mlx5/device.h:37,
                 from ./include/linux/mlx5/driver.h:51,
                 from ./include/linux/mlx5/vport.h:36,
                 from drivers/infiniband/hw/mlx5/ib_virt.c:34:
./include/linux/dim.h:326:1: warning: _tx_profile_ defined but not used [-Wunused-const-variable=]
  326 | tx_profile[DIM_CQ_PERIOD_NUM_MODES][NET_DIM_PARAMS_NUM_PROFILES] = {
      | ^~~~~~~~~~
./include/linux/dim.h:320:1: warning: _rx_profile_ defined but not used [-Wunused-const-variable=]
  320 | rx_profile[DIM_CQ_PERIOD_NUM_MODES][NET_DIM_PARAMS_NUM_PROFILES] = {
      | ^~~~~~~~~~

Fixes: 4f75da3666c0 ("linux/dim: Move implementation to .c files")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agolinux/dim: Fix overflow in dim calculation
Yamin Friedman [Tue, 23 Jul 2019 07:22:47 +0000 (10:22 +0300)]
linux/dim: Fix overflow in dim calculation

While using net_dim, a dim_sample was used without ever initializing the
comps value. Added use of DIV_ROUND_DOWN_ULL() to prevent potential
overflow, it should not be a problem to save the final result in an int
because after the division by epms the value should not be larger than a
few thousand.

[ 1040.127124] UBSAN: Undefined behaviour in lib/dim/dim.c:78:23
[ 1040.130118] signed integer overflow:
[ 1040.131643] 134718714 * 100 cannot be represented in type 'int'

Fixes: 398c2b05bbee ("linux/dim: Add completions count to dim_sample")
Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoife: error out when nla attributes are empty
Cong Wang [Tue, 23 Jul 2019 04:43:00 +0000 (21:43 -0700)]
ife: error out when nla attributes are empty

act_ife at least requires TCA_IFE_PARMS, so we have to bail out
when there is no attribute passed in.

Reported-by: syzbot+fbb5b288c9cb6a2eeac4@syzkaller.appspotmail.com
Fixes: ef6980b6becb ("introduce IFE action")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonetrom: hold sock when setting skb->destructor
Cong Wang [Tue, 23 Jul 2019 03:41:22 +0000 (20:41 -0700)]
netrom: hold sock when setting skb->destructor

sock_efree() releases the sock refcnt, if we don't hold this refcnt
when setting skb->destructor to it, the refcnt would not be balanced.
This leads to several bug reports from syzbot.

I have checked other users of sock_efree(), all of them hold the
sock refcnt.

Fixes: c8c8218ec5af ("netrom: fix a memory leak in nr_rx_frame()")
Reported-and-tested-by: <syzbot+622bdabb128acc33427d@syzkaller.appspotmail.com>
Reported-and-tested-by: <syzbot+6eaef7158b19e3fec3a0@syzkaller.appspotmail.com>
Reported-and-tested-by: <syzbot+9399c158fcc09b21d0d2@syzkaller.appspotmail.com>
Reported-and-tested-by: <syzbot+a34e5f3d0300163f0c87@syzkaller.appspotmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoovs: datapath: hide clang frame-overflow warnings
Arnd Bergmann [Mon, 22 Jul 2019 15:00:01 +0000 (17:00 +0200)]
ovs: datapath: hide clang frame-overflow warnings

Some functions in the datapath code are factored out so that each
one has a stack frame smaller than 1024 bytes with gcc. However,
when compiling with clang, the functions are inlined more aggressively
and combined again so we get

net/openvswitch/datapath.c:1124:12: error: stack frame size of 1528 bytes in function 'ovs_flow_cmd_set' [-Werror,-Wframe-larger-than=]

Marking both get_flow_actions() and ovs_nla_init_match_and_action()
as 'noinline_for_stack' gives us the same behavior that we see with
gcc, and no warning. Note that this does not mean we actually use
less stack, as the functions call each other, and we still get
three copies of the large 'struct sw_flow_key' type on the stack.

The comment tells us that this was previously considered safe,
presumably since the netlink parsing functions are called with
a known backchain that does not also use a lot of stack space.

Fixes: 9cc9a5cb176c ("datapath: Avoid using stack larger than 1024.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/tls: add myself as a co-maintainer
Jakub Kicinski [Wed, 24 Jul 2019 18:02:48 +0000 (11:02 -0700)]
net/tls: add myself as a co-maintainer

I've been spending quite a bit of time fixing and
preventing bit rot in the core TLS code. TLS seems
to only be growing in importance, I'd like to help
ensuring the quality of our implementation.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: mscc: initialize stats array
Andreas Schwab [Wed, 24 Jul 2019 15:32:57 +0000 (17:32 +0200)]
net: phy: mscc: initialize stats array

The memory allocated for the stats array may contain arbitrary data.

Fixes: e4f9ba642f0b ("net: phy: mscc: add support for VSC8514 PHY.")
Fixes: 00d70d8e0e78 ("net: phy: mscc: add support for VSC8574 PHY")
Fixes: a5afc1678044 ("net: phy: mscc: add support for VSC8584 PHY")
Fixes: f76178dc5218 ("net: phy: mscc: add ethtool statistics counters")
Signed-off-by: Andreas Schwab <schwab@suse.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phylink: don't start and stop SGMII PHYs in SFP modules twice
Arseny Solokha [Wed, 24 Jul 2019 13:31:39 +0000 (20:31 +0700)]
net: phylink: don't start and stop SGMII PHYs in SFP modules twice

SFP modules connected using the SGMII interface have their own PHYs which
are handled by the struct phylink's phydev field. On the other hand, for
the modules connected using 1000Base-X interface that field is not set.

Since commit ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network
devices and sfp cages") phylink_start() ends up setting the phydev field
using the sfp-bus infrastructure, which eventually calls phy_start() on it,
and then calling phy_start() again on the same phydev from phylink_start()
itself. Similar call sequence holds for phylink_stop(), only in the reverse
order. This results in WARNs during network interface bringup and shutdown
when a copper SFP module is connected, as phy_start() and phy_stop() are
called twice in a row for the same phy_device:

  % ip link set up dev eth0
  ------------[ cut here ]------------
  called from state UP
  WARNING: CPU: 1 PID: 155 at drivers/net/phy/phy.c:895 phy_start+0x74/0xc0
  Modules linked in:
  CPU: 1 PID: 155 Comm: backend Not tainted 5.2.0+ #1
  NIP:  c0227bf0 LR: c0227bf0 CTR: c004d224
  REGS: df547720 TRAP: 0700   Not tainted  (5.2.0+)
  MSR:  00029000 <CE,EE,ME>  CR: 24002822  XER: 00000000

  GPR00: c0227bf0 df5477d8 df5d7080 00000014 df9d2370 df9d5ac4 1f4eb000 00000001
  GPR08: c061fe58 00000000 00000000 df5477d8 0000003c 100c8768 00000000 00000000
  GPR16: df486a00 c046f1c8 c046eea0 00000000 c046e904 c0239604 db68449c 00000000
  GPR24: e9083204 00000000 00000001 db684460 e9083404 00000000 db6dce00 db6dcc00
  NIP [c0227bf0] phy_start+0x74/0xc0
  LR [c0227bf0] phy_start+0x74/0xc0
  Call Trace:
  [df5477d8] [c0227bf0] phy_start+0x74/0xc0 (unreliable)
  [df5477e8] [c023cad0] startup_gfar+0x398/0x3f4
  [df547828] [c023cf08] gfar_enet_open+0x364/0x374
  [df547898] [c029d870] __dev_open+0xe4/0x140
  [df5478c8] [c029db70] __dev_change_flags+0xf0/0x188
  [df5478f8] [c029dc28] dev_change_flags+0x20/0x54
  [df547918] [c02ae304] do_setlink+0x310/0x818
  [df547a08] [c02b1eb8] __rtnl_newlink+0x384/0x6b0
  [df547c28] [c02b222c] rtnl_newlink+0x48/0x68
  [df547c48] [c02ad7c8] rtnetlink_rcv_msg+0x240/0x27c
  [df547c98] [c02cc068] netlink_rcv_skb+0x8c/0xf0
  [df547cd8] [c02cba3c] netlink_unicast+0x114/0x19c
  [df547d08] [c02cbd74] netlink_sendmsg+0x2b0/0x2c0
  [df547d58] [c027b668] sock_sendmsg_nosec+0x20/0x40
  [df547d68] [c027d080] ___sys_sendmsg+0x17c/0x1dc
  [df547e98] [c027df7c] __sys_sendmsg+0x68/0x84
  [df547ef8] [c027e430] sys_socketcall+0x1a0/0x204
  [df547f38] [c000d1d8] ret_from_syscall+0x0/0x38
  --- interrupt: c01 at 0xfd4e030
      LR = 0xfd4e010
  Instruction dump:
  813f0188 38800000 2b890005 419d0014 3d40c046 5529103a 394aa208 7c8a482e
  3c60c046 3863a1b8 4cc63182 4be009a1 <0fe0000048000030 3c60c046 3863a1d0
  ---[ end trace d4c095aeaf6ea998 ]---

and

  % ip link set down dev eth0
  ------------[ cut here ]------------
  called from state HALTED
  WARNING: CPU: 1 PID: 184 at drivers/net/phy/phy.c:858 phy_stop+0x3c/0x88

  <...>

  Call Trace:
  [df581788] [c0228450] phy_stop+0x3c/0x88 (unreliable)
  [df581798] [c022d548] sfp_sm_phy_detach+0x1c/0x44
  [df5817a8] [c022e8cc] sfp_sm_event+0x4b0/0x87c
  [df581848] [c022f04c] sfp_upstream_stop+0x34/0x44
  [df581858] [c0225608] phylink_stop+0x7c/0xe4
  [df581868] [c023c57c] stop_gfar+0x7c/0x94
  [df581888] [c023c5b8] gfar_close+0x24/0x94
  [df5818a8] [c0298688] __dev_close_many+0xdc/0xf8
  [df5818c8] [c029db58] __dev_change_flags+0xd8/0x188
  [df5818f8] [c029dc28] dev_change_flags+0x20/0x54
  [df581918] [c02ae304] do_setlink+0x310/0x818
  [df581a08] [c02b1eb8] __rtnl_newlink+0x384/0x6b0
  [df581c28] [c02b222c] rtnl_newlink+0x48/0x68
  [df581c48] [c02ad7c8] rtnetlink_rcv_msg+0x240/0x27c
  [df581c98] [c02cc068] netlink_rcv_skb+0x8c/0xf0
  [df581cd8] [c02cba3c] netlink_unicast+0x114/0x19c
  [df581d08] [c02cbd74] netlink_sendmsg+0x2b0/0x2c0
  [df581d58] [c027b668] sock_sendmsg_nosec+0x20/0x40
  [df581d68] [c027d080] ___sys_sendmsg+0x17c/0x1dc
  [df581e98] [c027df7c] __sys_sendmsg+0x68/0x84
  [df581ef8] [c027e430] sys_socketcall+0x1a0/0x204
  [df581f38] [c000d1d8] ret_from_syscall+0x0/0x38

  <...>

  ---[ end trace d4c095aeaf6ea999 ]---

SFP modules with the 1000Base-X interface are not affected.

Place explicit calls to phy_start() and phy_stop() before enabling or after
disabling an attached SFP module, where phydev is not yet set (or is
already unset), so they will be made only from the inside of sfp-bus, if
needed.

Fixes: 217962615662 ("net: phy: warn if phy_start is called from invalid state")
Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge tag 'linux-can-fixes-for-5.3-20190724' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Wed, 24 Jul 2019 21:14:50 +0000 (14:14 -0700)]
Merge tag 'linux-can-fixes-for-5.3-20190724' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2019-07-24

this is a pull reqeust of 7 patches for net/master.

The first patch is by Rasmus Villemoes add a missing netif_carrier_off() to
register_candev() so that generic netdev trigger based LEDs are initially off.

Nikita Yushchenko's patch for the rcar_canfd driver fixes a possible IRQ storm
on high load.

The patch by Weitao Hou for the mcp251x driver add missing error checking to
the work queue allocation.

Both Wen Yang's and Joakim Zhang's patch for the flexcan driver fix a problem
with the stop-mode.

Stephane Grosjean contributes a patch for the peak_usb driver to fix a
potential double kfree_skb().

The last patch is by YueHaibing and fixes the error path in can-gw's
cgw_module_init() function.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoip6_gre: reload ipv6h in prepare_ip6gre_xmit_ipv6
Haishuang Yan [Wed, 24 Jul 2019 12:00:42 +0000 (20:00 +0800)]
ip6_gre: reload ipv6h in prepare_ip6gre_xmit_ipv6

Since ip6_tnl_parse_tlv_enc_lim() can call pskb_may_pull()
which may change skb->data, so we need to re-load ipv6h at
the right place.

Fixes: 898b29798e36 ("ip6_gre: Refactor ip6gre xmit codes")
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/ipv4: cleanup error condition testing
Pavel Machek [Wed, 24 Jul 2019 20:56:37 +0000 (13:56 -0700)]
net/ipv4: cleanup error condition testing

Cleanup testing for error condition.

Signed-off-by: Pavel Machek <pavel@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocan: gw: Fix error path of cgw_module_init
YueHaibing [Sat, 18 May 2019 09:35:43 +0000 (17:35 +0800)]
can: gw: Fix error path of cgw_module_init

This patch add error path for cgw_module_init to avoid possible crash if
some error occurs.

Fixes: c1aabdf379bc ("can-gw: add netlink based CAN routing")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: peak_usb: fix potential double kfree_skb()
Stephane Grosjean [Fri, 5 Jul 2019 13:32:16 +0000 (15:32 +0200)]
can: peak_usb: fix potential double kfree_skb()

When closing the CAN device while tx skbs are inflight, echo skb could
be released twice. By calling close_candev() before unlinking all
pending tx urbs, then the internal echo_skb[] array is fully and
correctly cleared before the USB write callback and, therefore,
can_get_echo_skb() are called, for each aborted URB.

Fixes: bb4785551f64 ("can: usb: PEAK-System Technik USB adapters driver core")
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: flexcan: fix stop mode acknowledgment
Joakim Zhang [Tue, 2 Jul 2019 01:45:41 +0000 (01:45 +0000)]
can: flexcan: fix stop mode acknowledgment

To enter stop mode, the CPU should manually assert a global Stop Mode
request and check the acknowledgment asserted by FlexCAN. The CPU must
only consider the FlexCAN in stop mode when both request and
acknowledgment conditions are satisfied.

Fixes: de3578c198c6 ("can: flexcan: add self wakeup support")
Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v5.0
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: flexcan: fix an use-after-free in flexcan_setup_stop_mode()
Wen Yang [Sat, 6 Jul 2019 03:37:20 +0000 (11:37 +0800)]
can: flexcan: fix an use-after-free in flexcan_setup_stop_mode()

The gpr_np variable is still being used in dev_dbg() after the
of_node_put() call, which may result in use-after-free.

Fixes: de3578c198c6 ("can: flexcan: add self wakeup support")
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Cc: linux-stable <stable@vger.kernel.org> # >= v5.0
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: mcp251x: add error check when wq alloc failed
Weitao Hou [Tue, 25 Jun 2019 12:50:48 +0000 (20:50 +0800)]
can: mcp251x: add error check when wq alloc failed

add error check when workqueue alloc failed, and remove redundant code
to make it clear.

Fixes: e0000163e30e ("can: Driver for the Microchip MCP251x SPI CAN controllers")
Signed-off-by: Weitao Hou <houweitaoo@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Tested-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: rcar_canfd: fix possible IRQ storm on high load
Nikita Yushchenko [Wed, 26 Jun 2019 13:08:48 +0000 (16:08 +0300)]
can: rcar_canfd: fix possible IRQ storm on high load

We have observed rcar_canfd driver entering IRQ storm under high load,
with following scenario:
- rcar_canfd_global_interrupt() in entered due to Rx available,
- napi_schedule_prep() is called, and sets NAPIF_STATE_SCHED in state
- Rx fifo interrupts are masked,
- rcar_canfd_global_interrupt() is entered again, this time due to
  error interrupt (e.g. due to overflow),
- since scheduled napi poller has not yet executed, condition for calling
  napi_schedule_prep() from rcar_canfd_global_interrupt() remains true,
  thus napi_schedule_prep() gets called and sets NAPIF_STATE_MISSED flag
  in state,
- later, napi poller function rcar_canfd_rx_poll() gets executed, and
  calls napi_complete_done(),
- due to NAPIF_STATE_MISSED flag in state, this call does not clear
  NAPIF_STATE_SCHED flag from state,
- on return from napi_complete_done(), rcar_canfd_rx_poll() unmasks Rx
  interrutps,
- Rx interrupt happens, rcar_canfd_global_interrupt() gets called
  and calls napi_schedule_prep(),
- since NAPIF_STATE_SCHED is set in state at this time, this call
  returns false,
- due to that false return, rcar_canfd_global_interrupt() returns
  without masking Rx interrupt
- and this results into IRQ storm: unmasked Rx interrupt happens again
  and again is misprocessed in the same way.

This patch fixes that scenario by unmasking Rx interrupts only when
napi_complete_done() returns true, which means it has cleared
NAPIF_STATE_SCHED in state.

Fixes: dd3bd23eb438 ("can: rcar_canfd: Add Renesas R-Car CAN FD driver")
Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: dev: call netif_carrier_off() in register_candev()
Rasmus Villemoes [Mon, 24 Jun 2019 08:34:13 +0000 (08:34 +0000)]
can: dev: call netif_carrier_off() in register_candev()

CONFIG_CAN_LEDS is deprecated. When trying to use the generic netdev
trigger as suggested, there's a small inconsistency with the link
property: The LED is on initially, stays on when the device is brought
up, and then turns off (as expected) when the device is brought down.

Make sure the LED always reflects the state of the CAN device.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agonet: thunderx: Use fwnode_get_mac_address()
Andy Shevchenko [Tue, 23 Jul 2019 20:03:43 +0000 (23:03 +0300)]
net: thunderx: Use fwnode_get_mac_address()

Replace the custom implementation with fwnode_get_mac_address,
which works on both DT and ACPI platforms.

While here, replace memcpy() by ether_addr_copy().

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosky2: Disable MSI on ASUS P6T
Takashi Iwai [Tue, 23 Jul 2019 15:15:25 +0000 (17:15 +0200)]
sky2: Disable MSI on ASUS P6T

The onboard sky2 NIC on ASUS P6T WS PRO doesn't work after PM resume
due to the infamous IRQ problem.  Disabling MSI works around it, so
let's add it to the blacklist.

Unfortunately the BIOS on the machine doesn't fill the standard
DMI_SYS_* entry, so we pick up DMI_BOARD_* entries instead.

BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1142496
Reported-and-tested-by: Marcus Seyfarth <m.seyfarth@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dsa: sja1105: sja1105_main: Add of_node_put()
Nishka Dasgupta [Tue, 23 Jul 2019 10:44:48 +0000 (16:14 +0530)]
net: dsa: sja1105: sja1105_main: Add of_node_put()

Each iteration of for_each_child_of_node puts the previous node, but in
the case of a return from the middle of the loop, there is no put, thus
causing a memory leak. Hence add an of_node_put before the return.
Issue found with Coccinelle.

Signed-off-by: Nishka Dasgupta <nishkadg.linux@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dsa: mv88e6xxx: chip: Add of_node_put() before return
Nishka Dasgupta [Tue, 23 Jul 2019 10:43:07 +0000 (16:13 +0530)]
net: dsa: mv88e6xxx: chip: Add of_node_put() before return

Each iteration of for_each_available_child_of_node puts the previous
node, but in the case of a return from the middle of the loop, there is
no put, thus causing a memory leak. Hence add an of_node_put before the
return.
Issue found with Coccinelle.

Signed-off-by: Nishka Dasgupta <nishkadg.linux@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'selftests-forwarding-GRE-multipath-fixes'
David S. Miller [Tue, 23 Jul 2019 20:06:49 +0000 (13:06 -0700)]
Merge branch 'selftests-forwarding-GRE-multipath-fixes'

Ido Schimmel says:

====================
selftests: forwarding: GRE multipath fixes

Patch #1 ensures IPv4 forwarding is enabled during the test.

Patch #2 fixes the flower filters used to measure the distribution of
the traffic between the two nexthops, so that the test will pass
regardless if traffic is offloaded or not.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: forwarding: gre_multipath: Fix flower filters
Ido Schimmel [Tue, 23 Jul 2019 08:19:26 +0000 (11:19 +0300)]
selftests: forwarding: gre_multipath: Fix flower filters

The TC filters used in the test do not work with veth devices because the
outer Ethertype is 802.1Q and not IPv4. The test passes with mlxsw
netdevs since the hardware always looks at "The first Ethertype that
does not point to either: VLAN, CNTAG or configurable Ethertype".

Fix this by matching on the VLAN ID instead, but on the ingress side.
The reason why this is not performed at egress is explained in the
commit cited below.

Fixes: 541ad323db3a ("selftests: forwarding: gre_multipath: Update next-hop statistics match criteria")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Tested-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: forwarding: gre_multipath: Enable IPv4 forwarding
Ido Schimmel [Tue, 23 Jul 2019 08:19:25 +0000 (11:19 +0300)]
selftests: forwarding: gre_multipath: Enable IPv4 forwarding

The test did not enable IPv4 forwarding during its setup phase, which
causes the test to fail on machines where IPv4 forwarding is disabled.

Fixes: 54818c4c4b93 ("selftests: forwarding: Test multipath tunneling")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Tested-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoRevert "net: hns: fix LED configuration for marvell phy"
David S. Miller [Tue, 23 Jul 2019 03:44:48 +0000 (20:44 -0700)]
Revert "net: hns: fix LED configuration for marvell phy"

This reverts commit f4e5f775db5a4631300dccd0de5eafb50a77c131.

Andrew Lunn says this should be handled another way.

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: stmmac: Do not cut down 1G modes
Jose Abreu [Mon, 22 Jul 2019 14:07:21 +0000 (16:07 +0200)]
net: stmmac: Do not cut down 1G modes

Some glue logic drivers support 1G without having GMAC/GMAC4/XGMAC.

Let's allow this speed by default.

Reported-by: Ondrej Jirman <megi@xff.cz>
Tested-by: Ondrej Jirman <megi@xff.cz>
Fixes: 5b0d7d7da64b ("net: stmmac: Add the missing speeds that XGMAC supports")
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'stmmac-fixes'
David S. Miller [Tue, 23 Jul 2019 01:23:32 +0000 (18:23 -0700)]
Merge branch 'stmmac-fixes'

Jose Abreu says:

====================
net: stmmac: Two fixes

Two fixes targeting -net.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: stmmac: Use kcalloc() instead of kmalloc_array()
Jose Abreu [Mon, 22 Jul 2019 08:39:31 +0000 (10:39 +0200)]
net: stmmac: Use kcalloc() instead of kmalloc_array()

We need the memory to be zeroed upon allocation so use kcalloc()
instead.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: stmmac: RX Descriptors need to be clean before setting buffers
Jose Abreu [Mon, 22 Jul 2019 08:39:30 +0000 (10:39 +0200)]
net: stmmac: RX Descriptors need to be clean before setting buffers

RX Descriptors are being cleaned after setting the buffers which may
lead to buffer addresses being wiped out.

Fix this by clearing earlier the RX Descriptors.

Fixes: 2af6106ae949 ("net: stmmac: Introducing support for Page Pool")
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hns: fix LED configuration for marvell phy
Yonglong Liu [Mon, 22 Jul 2019 05:59:12 +0000 (13:59 +0800)]
net: hns: fix LED configuration for marvell phy

Since commit(net: phy: marvell: change default m88e1510 LED configuration),
the active LED of Hip07 devices is always off, because Hip07 just
use 2 LEDs.
This patch adds a phy_register_fixup_for_uid() for m88e1510 to
correct the LED configuration.

Fixes: 077772468ec1 ("net: phy: marvell: change default m88e1510 LED configuration")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: linyunsheng <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvpp2: Don't check for 3 consecutive Idle frames for 10G links
Maxime Chevallier [Fri, 19 Jul 2019 14:38:48 +0000 (16:38 +0200)]
net: mvpp2: Don't check for 3 consecutive Idle frames for 10G links

PPv2's XLGMAC can wait for 3 idle frames before triggering a link up
event. This can cause the link to be stuck low when there's traffic on
the interface, so disable this feature.

Fixes: 4bb043262878 ("net: mvpp2: phylink support")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agobonding: Force slave speed check after link state recovery for 802.3ad
Thomas Falcon [Tue, 16 Jul 2019 22:25:10 +0000 (17:25 -0500)]
bonding: Force slave speed check after link state recovery for 802.3ad

The following scenario was encountered during testing of logical
partition mobility on pseries partitions with bonded ibmvnic
adapters in LACP mode.

1. Driver receives a signal that the device has been
   swapped, and it needs to reset to initialize the new
   device.

2. Driver reports loss of carrier and begins initialization.

3. Bonding driver receives NETDEV_CHANGE notifier and checks
   the slave's current speed and duplex settings. Because these
   are unknown at the time, the bond sets its link state to
   BOND_LINK_FAIL and handles the speed update, clearing
   AD_PORT_LACP_ENABLE.

4. Driver finishes recovery and reports that the carrier is on.

5. Bond receives a new notification and checks the speed again.
   The speeds are valid but miimon has not altered the link
   state yet.  AD_PORT_LACP_ENABLE remains off.

Because the slave's link state is still BOND_LINK_FAIL,
no further port checks are made when it recovers. Though
the slave devices are operational and have valid speed
and duplex settings, the bond will not send LACPDU's. The
simplest fix I can see is to force another speed check
in bond_miimon_commit. This way the bond will update
AD_PORT_LACP_ENABLE if needed when transitioning from
BOND_LINK_FAIL to BOND_LINK_UP.

CC: Jarod Wilson <jarod@redhat.com>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 22 Jul 2019 16:30:34 +0000 (09:30 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull preemption Kconfig fix from Thomas Gleixner:
 "The PREEMPT_RT stub config renamed PREEMPT to PREEMPT_LL and defined
  PREEMPT outside of the menu and made it selectable by both PREEMPT_LL
  and PREEMPT_RT.

  Stupid me missed that 114 defconfigs select CONFIG_PREEMPT which
  obviously can't work anymore. oldconfig builds are affected as well,
  but it's more obvious as the user gets asked. [old]defconfig silently
  fixes it up and selects PREEMPT_NONE.

  Unbreak it by undoing the rename and adding a intermediate config
  symbol which is selected by both PREEMPT and PREEMPT_RT. That requires
  to chase down a few #ifdefs, but it's better than tweaking 114
  defconfigs and annoying users"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/rt, Kconfig: Unbreak def/oldconfig with CONFIG_PREEMPT=y

6 years agoMerge tag 'for-linus-20190722' of git://git.kernel.org/pub/scm/linux/kernel/git/braun...
Linus Torvalds [Mon, 22 Jul 2019 16:14:19 +0000 (09:14 -0700)]
Merge tag 'for-linus-20190722' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull pidfd polling fix from Christian Brauner:
 "A fix for pidfd polling. It ensures that the task's exit state is
  visible to all waiters"

* tag 'for-linus-20190722' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  pidfd: fix a poll race when setting exit_state

6 years agoMerge tag 'for-5.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Mon, 22 Jul 2019 16:08:38 +0000 (09:08 -0700)]
Merge tag 'for-5.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - fixes for leaks caused by recently merged patches

 - one build fix

 - a fix to prevent mixing of incompatible features

* tag 'for-5.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: don't leak extent_map in btrfs_get_io_geometry()
  btrfs: free checksum hash on in close_ctree
  btrfs: Fix build error while LIBCRC32C is module
  btrfs: inode: Don't compress if NODATASUM or NODATACOW set

6 years agosched/rt, Kconfig: Unbreak def/oldconfig with CONFIG_PREEMPT=y
Thomas Gleixner [Mon, 22 Jul 2019 15:59:19 +0000 (17:59 +0200)]
sched/rt, Kconfig: Unbreak def/oldconfig with CONFIG_PREEMPT=y

The merge of the CONFIG_PREEMPT_RT stub renamed CONFIG_PREEMPT to
CONFIG_PREEMPT_LL which causes all defconfigs which have CONFIG_PREEMPT=y
set to fall back to CONFIG_PREEMPT_NONE because CONFIG_PREEMPT depends on
the preemption mode choice wich defaults to NONE. This also affects
oldconfig builds.

So rather than changing 114 defconfig files and being an annoyance to
users, revert the rename and select a new config symbol PREEMPTION. That
keeps everything working smoothly and the revelant ifdef's are going to be
fixed up step by step.

Reported-by: Mark Rutland <mark.rutland@arm.com>
Fixes: a50a3f4b6a31 ("sched/rt, Kconfig: Introduce CONFIG_PREEMPT_RT")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
6 years agoMerge tag 'media/v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Mon, 22 Jul 2019 16:01:47 +0000 (09:01 -0700)]
Merge tag 'media/v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:
 "For two regressions in media core:

   - v4l2-subdev: fix regression in check_pad()

   - videodev2.h: change V4L2_PIX_FMT_BGRA444 define: fourcc was already
     in use"

* tag 'media/v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: videodev2.h: change V4L2_PIX_FMT_BGRA444 define: fourcc was already in use
  media: v4l2-subdev: fix regression in check_pad()

6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Mon, 22 Jul 2019 15:49:22 +0000 (08:49 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Several netfilter fixes including a nfnetlink deadlock fix from
    Florian Westphal and fix for dropping VRF packets from Miaohe Lin.

 2) Flow offload fixes from Pablo Neira Ayuso including a fix to restore
    proper block sharing.

 3) Fix r8169 PHY init from Thomas Voegtle.

 4) Fix memory leak in mac80211, from Lorenzo Bianconi.

 5) Missing NULL check on object allocation in cxgb4, from Navid
    Emamdoost.

 6) Fix scaling of RX power in sfp phy driver, from Andrew Lunn.

 7) Check that there is actually an ip header to access in skb->data in
    VRF, from Peter Kosyh.

 8) Remove spurious rcu unlock in hv_netvsc, from Haiyang Zhang.

 9) One more tweak the the TCP fragmentation memory limit changes, to be
    less harmful to applications setting small SO_SNDBUF values. From
    Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (40 commits)
  tcp: be more careful in tcp_fragment()
  hv_netvsc: Fix extra rcu_read_unlock in netvsc_recv_callback()
  vrf: make sure skb->data contains ip header to make routing
  connector: remove redundant input callback from cn_dev
  qed: Prefer pcie_capability_read_word()
  igc: Prefer pcie_capability_read_word()
  cxgb4: Prefer pcie_capability_read_word()
  be2net: Synchronize be_update_queues with dev_watchdog
  bnx2x: Prevent load reordering in tx completion processing
  net: phy: sfp: hwmon: Fix scaling of RX power
  net: sched: verify that q!=NULL before setting q->flags
  chelsio: Fix a typo in a function name
  allocate_flower_entry: should check for null deref
  net: hns3: typo in the name of a constant
  kbuild: add net/netfilter/nf_tables_offload.h to header-test blacklist.
  tipc: Fix a typo
  mac80211: don't warn about CW params when not using them
  mac80211: fix possible memory leak in ieee80211_assign_beacon
  nl80211: fix NL80211_HE_MAX_CAPABILITY_LEN
  nl80211: fix VENDOR_CMD_RAW_DATA
  ...

6 years agopidfd: fix a poll race when setting exit_state
Suren Baghdasaryan [Wed, 17 Jul 2019 17:21:00 +0000 (13:21 -0400)]
pidfd: fix a poll race when setting exit_state

There is a race between reading task->exit_state in pidfd_poll and
writing it after do_notify_parent calls do_notify_pidfd. Expected
sequence of events is:

CPU 0                            CPU 1
------------------------------------------------
exit_notify
  do_notify_parent
    do_notify_pidfd
  tsk->exit_state = EXIT_DEAD
                                  pidfd_poll
                                     if (tsk->exit_state)

However nothing prevents the following sequence:

CPU 0                            CPU 1
------------------------------------------------
exit_notify
  do_notify_parent
    do_notify_pidfd
                                   pidfd_poll
                                      if (tsk->exit_state)
  tsk->exit_state = EXIT_DEAD

This causes a polling task to wait forever, since poll blocks because
exit_state is 0 and the waiting task is not notified again. A stress
test continuously doing pidfd poll and process exits uncovered this bug.

To fix it, we make sure that the task's exit_state is always set before
calling do_notify_pidfd.

Fixes: b53b0b9d9a6 ("pidfd: add polling support")
Cc: kernel-team@android.com
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Link: https://lore.kernel.org/r/20190717172100.261204-1-joel@joelfernandes.org
[christian@brauner.io: adapt commit message and drop unneeded changes from wait_task_zombie]
Signed-off-by: Christian Brauner <christian@brauner.io>
6 years agotcp: be more careful in tcp_fragment()
Eric Dumazet [Fri, 19 Jul 2019 18:52:33 +0000 (11:52 -0700)]
tcp: be more careful in tcp_fragment()

Some applications set tiny SO_SNDBUF values and expect
TCP to just work. Recent patches to address CVE-2019-11478
broke them in case of losses, since retransmits might
be prevented.

We should allow these flows to make progress.

This patch allows the first and last skb in retransmit queue
to be split even if memory limits are hit.

It also adds the some room due to the fact that tcp_sendmsg()
and tcp_sendpage() might overshoot sk_wmem_queued by about one full
TSO skb (64KB size). Note this allowance was already present
in stable backports for kernels < 4.15

Note for < 4.15 backports :
 tcp_rtx_queue_tail() will probably look like :

static inline struct sk_buff *tcp_rtx_queue_tail(const struct sock *sk)
{
struct sk_buff *skb = tcp_send_head(sk);

return skb ? tcp_write_queue_prev(sk, skb) : tcp_write_queue_tail(sk);
}

Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrew Prout <aprout@ll.mit.edu>
Tested-by: Andrew Prout <aprout@ll.mit.edu>
Tested-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Tested-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Christoph Paasch <cpaasch@apple.com>
Cc: Jonathan Looney <jtl@netflix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agohv_netvsc: Fix extra rcu_read_unlock in netvsc_recv_callback()
Haiyang Zhang [Fri, 19 Jul 2019 17:33:51 +0000 (17:33 +0000)]
hv_netvsc: Fix extra rcu_read_unlock in netvsc_recv_callback()

There is an extra rcu_read_unlock left in netvsc_recv_callback(),
after a previous patch that removes RCU from this function.
This patch removes the extra RCU unlock.

Fixes: 345ac08990b8 ("hv_netvsc: pass netvsc_device to receive callback")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoLinus 5.3-rc1 v5.3-rc1
Linus Torvalds [Sun, 21 Jul 2019 21:05:38 +0000 (14:05 -0700)]
Linus 5.3-rc1

6 years agovrf: make sure skb->data contains ip header to make routing
Peter Kosyh [Fri, 19 Jul 2019 08:11:47 +0000 (11:11 +0300)]
vrf: make sure skb->data contains ip header to make routing

vrf_process_v4_outbound() and vrf_process_v6_outbound() do routing
using ip/ipv6 addresses, but don't make sure the header is available
in skb->data[] (skb_headlen() is less then header size).

Case:

1) igb driver from intel.
2) Packet size is greater then 255.
3) MPLS forwards to VRF device.

So, patch adds pskb_may_pull() calls in vrf_process_v4/v6_outbound()
functions.

Signed-off-by: Peter Kosyh <p.kosyh@gmail.com>
Reviewed-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoconnector: remove redundant input callback from cn_dev
Vasily Averin [Thu, 18 Jul 2019 04:26:46 +0000 (07:26 +0300)]
connector: remove redundant input callback from cn_dev

A small cleanup: this callback is never used.
Originally fixed by Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
for OpenVZ7 bug OVZ-6877

cc: stanislav.kinsburskiy@gmail.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqed: Prefer pcie_capability_read_word()
Frederick Lawler [Thu, 18 Jul 2019 02:07:42 +0000 (21:07 -0500)]
qed: Prefer pcie_capability_read_word()

Commit 8c0d3a02c130 ("PCI: Add accessors for PCI Express Capability")
added accessors for the PCI Express Capability so that drivers didn't
need to be aware of differences between v1 and v2 of the PCI
Express Capability.

Replace pci_read_config_word() and pci_write_config_word() calls with
pcie_capability_read_word() and pcie_capability_write_word().

Signed-off-by: Frederick Lawler <fred@fredlawl.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoigc: Prefer pcie_capability_read_word()
Frederick Lawler [Thu, 18 Jul 2019 02:07:39 +0000 (21:07 -0500)]
igc: Prefer pcie_capability_read_word()

Commit 8c0d3a02c130 ("PCI: Add accessors for PCI Express Capability")
added accessors for the PCI Express Capability so that drivers didn't
need to be aware of differences between v1 and v2 of the PCI
Express Capability.

Replace pci_read_config_word() and pci_write_config_word() calls with
pcie_capability_read_word() and pcie_capability_write_word().

Signed-off-by: Frederick Lawler <fred@fredlawl.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocxgb4: Prefer pcie_capability_read_word()
Frederick Lawler [Thu, 18 Jul 2019 02:07:36 +0000 (21:07 -0500)]
cxgb4: Prefer pcie_capability_read_word()

Commit 8c0d3a02c130 ("PCI: Add accessors for PCI Express Capability")
added accessors for the PCI Express Capability so that drivers didn't
need to be aware of differences between v1 and v2 of the PCI
Express Capability.

Replace pci_read_config_word() and pci_write_config_word() calls with
pcie_capability_read_word() and pcie_capability_write_word().

Signed-off-by: Frederick Lawler <fred@fredlawl.com>
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agobe2net: Synchronize be_update_queues with dev_watchdog
Benjamin Poirier [Thu, 18 Jul 2019 01:42:18 +0000 (10:42 +0900)]
be2net: Synchronize be_update_queues with dev_watchdog

As pointed out by Firo Yang, a netdev tx timeout may trigger just before an
ethtool set_channels operation is started. be_tx_timeout(), which dumps
some queue structures, is not written to run concurrently with
be_update_queues(), which frees/allocates those queues structures. Add some
synchronization between the two.

Message-id: <CH2PR18MB31898E033896F9760D36BFF288C90@CH2PR18MB3189.namprd18.prod.outlook.com>
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agobnx2x: Prevent load reordering in tx completion processing
Brian King [Mon, 15 Jul 2019 21:41:50 +0000 (16:41 -0500)]
bnx2x: Prevent load reordering in tx completion processing

This patch fixes an issue seen on Power systems with bnx2x which results
in the skb is NULL WARN_ON in bnx2x_free_tx_pkt firing due to the skb
pointer getting loaded in bnx2x_free_tx_pkt prior to the hw_cons
load in bnx2x_tx_int. Adding a read memory barrier resolves the issue.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: sfp: hwmon: Fix scaling of RX power
Andrew Lunn [Sun, 21 Jul 2019 16:50:08 +0000 (18:50 +0200)]
net: phy: sfp: hwmon: Fix scaling of RX power

The RX power read from the SFP uses units of 0.1uW. This must be
scaled to units of uW for HWMON. This requires a divide by 10, not the
current 100.

With this change in place, sensors(1) and ethtool -m agree:

sff2-isa-0000
Adapter: ISA adapter
in0:          +3.23 V
temp1:        +33.1 C
power1:      270.00 uW
power2:      200.00 uW
curr1:        +0.01 A

        Laser output power                        : 0.2743 mW / -5.62 dBm
        Receiver signal average optical power     : 0.2014 mW / -6.96 dBm

Reported-by: chris.healy@zii.aero
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Fixes: 1323061a018a ("net: phy: sfp: Add HWMON support for module sensors")
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: sched: verify that q!=NULL before setting q->flags
Vlad Buslov [Sun, 21 Jul 2019 14:44:12 +0000 (17:44 +0300)]
net: sched: verify that q!=NULL before setting q->flags

In function int tc_new_tfilter() q pointer can be NULL when adding filter
on a shared block. With recent change that resets TCQ_F_CAN_BYPASS after
filter creation, following NULL pointer dereference happens in case parent
block is shared:

[  212.925060] BUG: kernel NULL pointer dereference, address: 0000000000000010
[  212.925445] #PF: supervisor write access in kernel mode
[  212.925709] #PF: error_code(0x0002) - not-present page
[  212.925965] PGD 8000000827923067 P4D 8000000827923067 PUD 827924067 PMD 0
[  212.926302] Oops: 0002 [#1] SMP KASAN PTI
[  212.926539] CPU: 18 PID: 2617 Comm: tc Tainted: G    B             5.2.0+ #512
[  212.926938] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[  212.927364] RIP: 0010:tc_new_tfilter+0x698/0xd40
[  212.927633] Code: 74 0d 48 85 c0 74 08 48 89 ef e8 03 aa 62 00 48 8b 84 24 a0 00 00 00 48 8d 78 10 48 89 44 24 18 e8 4d 0c 6b ff 48 8b 44 24 18 <83> 60 10 f
b 48 85 ed 0f 85 3d fe ff ff e9 4f fe ff ff e8 81 26 f8
[  212.928607] RSP: 0018:ffff88884fd5f5d8 EFLAGS: 00010296
[  212.928905] RAX: 0000000000000000 RBX: 0000000000000000 RCX: dffffc0000000000
[  212.929201] RDX: 0000000000000007 RSI: 0000000000000004 RDI: 0000000000000297
[  212.929402] RBP: ffff88886bedd600 R08: ffffffffb91d4b51 R09: fffffbfff7616e4d
[  212.929609] R10: fffffbfff7616e4c R11: ffffffffbb0b7263 R12: ffff88886bc61040
[  212.929803] R13: ffff88884fd5f950 R14: ffffc900039c5000 R15: ffff88835e927680
[  212.929999] FS:  00007fe7c50b6480(0000) GS:ffff88886f980000(0000) knlGS:0000000000000000
[  212.930235] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  212.930394] CR2: 0000000000000010 CR3: 000000085bd04002 CR4: 00000000001606e0
[  212.930588] Call Trace:
[  212.930682]  ? tc_del_tfilter+0xa40/0xa40
[  212.930811]  ? __lock_acquire+0x5b5/0x2460
[  212.930948]  ? find_held_lock+0x85/0xa0
[  212.931081]  ? tc_del_tfilter+0xa40/0xa40
[  212.931201]  rtnetlink_rcv_msg+0x4ab/0x5f0
[  212.931332]  ? rtnl_dellink+0x490/0x490
[  212.931454]  ? lockdep_hardirqs_on+0x260/0x260
[  212.931589]  ? netlink_deliver_tap+0xab/0x5a0
[  212.931717]  ? match_held_lock+0x1b/0x240
[  212.931844]  netlink_rcv_skb+0xd0/0x200
[  212.931958]  ? rtnl_dellink+0x490/0x490
[  212.932079]  ? netlink_ack+0x440/0x440
[  212.932205]  ? netlink_deliver_tap+0x161/0x5a0
[  212.932335]  ? lock_downgrade+0x360/0x360
[  212.932457]  ? lock_acquire+0xe5/0x210
[  212.932579]  netlink_unicast+0x296/0x350
[  212.932705]  ? netlink_attachskb+0x390/0x390
[  212.932834]  ? _copy_from_iter_full+0xe0/0x3a0
[  212.932976]  netlink_sendmsg+0x394/0x600
[  212.937998]  ? netlink_unicast+0x350/0x350
[  212.943033]  ? move_addr_to_kernel.part.0+0x90/0x90
[  212.948115]  ? netlink_unicast+0x350/0x350
[  212.953185]  sock_sendmsg+0x96/0xa0
[  212.958099]  ___sys_sendmsg+0x482/0x520
[  212.962881]  ? match_held_lock+0x1b/0x240
[  212.967618]  ? copy_msghdr_from_user+0x250/0x250
[  212.972337]  ? lock_downgrade+0x360/0x360
[  212.976973]  ? rwlock_bug.part.0+0x60/0x60
[  212.981548]  ? __mod_node_page_state+0x1f/0xa0
[  212.986060]  ? match_held_lock+0x1b/0x240
[  212.990567]  ? find_held_lock+0x85/0xa0
[  212.994989]  ? do_user_addr_fault+0x349/0x5b0
[  212.999387]  ? lock_downgrade+0x360/0x360
[  213.003713]  ? find_held_lock+0x85/0xa0
[  213.007972]  ? __fget_light+0xa1/0xf0
[  213.012143]  ? sockfd_lookup_light+0x91/0xb0
[  213.016165]  __sys_sendmsg+0xba/0x130
[  213.020040]  ? __sys_sendmsg_sock+0xb0/0xb0
[  213.023870]  ? handle_mm_fault+0x337/0x470
[  213.027592]  ? page_fault+0x8/0x30
[  213.031316]  ? lockdep_hardirqs_off+0xbe/0x100
[  213.034999]  ? mark_held_locks+0x24/0x90
[  213.038671]  ? do_syscall_64+0x1e/0xe0
[  213.042297]  do_syscall_64+0x74/0xe0
[  213.045828]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  213.049354] RIP: 0033:0x7fe7c527c7b8
[  213.052792] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f
0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 54
[  213.060269] RSP: 002b:00007ffc3f7908a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[  213.064144] RAX: ffffffffffffffda RBX: 000000005d34716f RCX: 00007fe7c527c7b8
[  213.068094] RDX: 0000000000000000 RSI: 00007ffc3f790910 RDI: 0000000000000003
[  213.072109] RBP: 0000000000000000 R08: 0000000000000001 R09: 00007fe7c5340cc0
[  213.076113] R10: 0000000000404ec2 R11: 0000000000000246 R12: 0000000000000080
[  213.080146] R13: 0000000000480640 R14: 0000000000000080 R15: 0000000000000000
[  213.084147] Modules linked in: act_gact cls_flower sch_ingress nfsv3 nfs_acl nfs lockd grace fscache bridge stp llc sunrpc intel_rapl_msr intel_rapl_common
\e[<1;69;32Msb_edac rdma_ucm rdma_cm x86_pkg_temp_thermal iw_cm intel_powerclamp ib_cm coretemp kvm_intel kvm irqbypass mlx5_ib ib_uverbs ib_core crct10dif_pclmul crc32_pc
lmul crc32c_intel ghash_clmulni_intel mlx5_core intel_cstate intel_uncore iTCO_wdt igb iTCO_vendor_support mlxfw mei_me ptp ses intel_rapl_perf mei pcspkr ipmi
_ssif i2c_i801 joydev enclosure pps_core lpc_ich ioatdma wmi dca ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad ast i2c_algo_bit drm_vram_helpe
r ttm drm_kms_helper drm mpt3sas raid_class scsi_transport_sas
[  213.112326] CR2: 0000000000000010
[  213.117429] ---[ end trace adb58eb0a4ee6283 ]---

Verify that q pointer is not NULL before setting the 'flags' field.

Fixes: 3f05e6886a59 ("net_sched: unset TCQ_F_CAN_BYPASS when adding filters")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agochelsio: Fix a typo in a function name
Christophe JAILLET [Sun, 21 Jul 2019 13:16:05 +0000 (15:16 +0200)]
chelsio: Fix a typo in a function name

It is likely that 'my3216_poll()' should be 'my3126_poll()'. (1 and 2
switched in 3126.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoallocate_flower_entry: should check for null deref
Navid Emamdoost [Sun, 21 Jul 2019 06:37:31 +0000 (01:37 -0500)]
allocate_flower_entry: should check for null deref

allocate_flower_entry does not check for allocation success, but tries
to deref the result. I only moved the spin_lock under null check, because
 the caller is checking allocation's status at line 652.

Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hns3: typo in the name of a constant
Christophe JAILLET [Sun, 21 Jul 2019 13:08:31 +0000 (15:08 +0200)]
net: hns3: typo in the name of a constant

All constant in 'enum HCLGE_MBX_OPCODE' start with HCLGE, except
'HLCGE_MBX_PUSH_VLAN_INFO' (C and L switched)

s/HLC/HCL/

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agokbuild: add net/netfilter/nf_tables_offload.h to header-test blacklist.
Jeremy Sowden [Sun, 21 Jul 2019 11:31:05 +0000 (12:31 +0100)]
kbuild: add net/netfilter/nf_tables_offload.h to header-test blacklist.

net/netfilter/nf_tables_offload.h includes net/netfilter/nf_tables.h
which is itself on the blacklist.

Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotipc: Fix a typo
Christophe JAILLET [Sun, 21 Jul 2019 10:38:11 +0000 (12:38 +0200)]
tipc: Fix a typo

s/tipc_toprsv_listener_data_ready/tipc_topsrv_listener_data_ready/
(r and s switched in topsrv)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge tag 'mac80211-for-davem-2019-07-20' of git://git.kernel.org/pub/scm/linux/kerne...
David S. Miller [Sun, 21 Jul 2019 18:39:05 +0000 (11:39 -0700)]
Merge tag 'mac80211-for-davem-2019-07-20' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
We have a handful of fixes:
 * ignore bad CW parameters if we aren't using them,
   instead of warning
 * fix operation (and then build) with the new netlink vendor
   command policy requirement
 * fix a memory leak in an error path when setting beacons
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge tag 'devicetree-fixes-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 21 Jul 2019 17:28:39 +0000 (10:28 -0700)]
Merge tag 'devicetree-fixes-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull Devicetree fixes from Rob Herring:
 "Fix several warnings/errors in validation of binding schemas"

* tag 'devicetree-fixes-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: pinctrl: stm32: Fix missing 'clocks' property in examples
  dt-bindings: iio: ad7124: Fix dtc warnings in example
  dt-bindings: iio: avia-hx711: Fix avdd-supply typo in example
  dt-bindings: pinctrl: aspeed: Fix AST2500 example errors
  dt-bindings: pinctrl: aspeed: Fix 'compatible' schema errors
  dt-bindings: riscv: Limit cpus schema to only check RiscV 'cpu' nodes
  dt-bindings: Ensure child nodes are of type 'object'

6 years agoMerge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Sun, 21 Jul 2019 17:09:43 +0000 (10:09 -0700)]
Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs documentation typo fix from Al Viro.

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  typo fix: it's d_make_root, not d_make_inode...

6 years agoMerge tag '5.3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sun, 21 Jul 2019 17:01:17 +0000 (10:01 -0700)]
Merge tag '5.3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Two fixes for stable, one that had dependency on earlier patch in this
  merge window and can now go in, and a perf improvement in SMB3 open"

* tag '5.3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update internal module number
  cifs: flush before set-info if we have writeable handles
  smb3: optimize open to not send query file internal info
  cifs: copy_file_range needs to strip setuid bits and update timestamps
  CIFS: fix deadlock in cached root handling

6 years agoiommu/amd: fix a crash in iova_magazine_free_pfns
Qian Cai [Thu, 11 Jul 2019 16:17:45 +0000 (12:17 -0400)]
iommu/amd: fix a crash in iova_magazine_free_pfns

The commit b3aa14f02254 ("iommu: remove the mapping_error dma_map_ops
method") incorrectly changed the checking from dma_ops_alloc_iova() in
map_sg() causes a crash under memory pressure as dma_ops_alloc_iova()
never return DMA_MAPPING_ERROR on failure but 0, so the error handling
is all wrong.

   kernel BUG at drivers/iommu/iova.c:801!
    Workqueue: kblockd blk_mq_run_work_fn
    RIP: 0010:iova_magazine_free_pfns+0x7d/0xc0
    Call Trace:
     free_cpu_cached_iovas+0xbd/0x150
     alloc_iova_fast+0x8c/0xba
     dma_ops_alloc_iova.isra.6+0x65/0xa0
     map_sg+0x8c/0x2a0
     scsi_dma_map+0xc6/0x160
     pqi_aio_submit_io+0x1f6/0x440 [smartpqi]
     pqi_scsi_queue_command+0x90c/0xdd0 [smartpqi]
     scsi_queue_rq+0x79c/0x1200
     blk_mq_dispatch_rq_list+0x4dc/0xb70
     blk_mq_sched_dispatch_requests+0x249/0x310
     __blk_mq_run_hw_queue+0x128/0x200
     blk_mq_run_work_fn+0x27/0x30
     process_one_work+0x522/0xa10
     worker_thread+0x63/0x5b0
     kthread+0x1d2/0x1f0
     ret_from_fork+0x22/0x40

Fixes: b3aa14f02254 ("iommu: remove the mapping_error dma_map_ops method")
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agohexagon: switch to generic version of pte allocation
Mike Rapoport [Tue, 30 Apr 2019 14:27:50 +0000 (17:27 +0300)]
hexagon: switch to generic version of pte allocation

The hexagon implementation pte_alloc_one(), pte_alloc_one_kernel(),
pte_free_kernel() and pte_free() is identical to the generic except of
lack of __GFP_ACCOUNT for the user PTEs allocation.

Switch hexagon to use generic version of these functions.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoMerge tag 'ntb-5.3' of git://github.com/jonmason/ntb
Linus Torvalds [Sun, 21 Jul 2019 16:46:59 +0000 (09:46 -0700)]
Merge tag 'ntb-5.3' of git://github.com/jonmason/ntb

Pull NTB updates from Jon Mason:
 "New feature to add support for NTB virtual MSI interrupts, the ability
  to test and use this feature in the NTB transport layer.

  Also, bug fixes for the AMD and Switchtec drivers, as well as some
  general patches"

* tag 'ntb-5.3' of git://github.com/jonmason/ntb: (22 commits)
  NTB: Describe the ntb_msi_test client in the documentation.
  NTB: Add MSI interrupt support to ntb_transport
  NTB: Add ntb_msi_test support to ntb_test
  NTB: Introduce NTB MSI Test Client
  NTB: Introduce MSI library
  NTB: Rename ntb.c to support multiple source files in the module
  NTB: Introduce functions to calculate multi-port resource index
  NTB: Introduce helper functions to calculate logical port number
  PCI/switchtec: Add module parameter to request more interrupts
  PCI/MSI: Support allocating virtual MSI interrupts
  ntb_hw_switchtec: Fix setup MW with failure bug
  ntb_hw_switchtec: Skip unnecessary re-setup of shared memory window for crosslink case
  ntb_hw_switchtec: Remove redundant steps of switchtec_ntb_reinit_peer() function
  NTB: correct ntb_dev_ops and ntb_dev comment typos
  NTB: amd: Silence shift wrapping warning in amd_ntb_db_vector_mask()
  ntb_hw_switchtec: potential shift wrapping bug in switchtec_ntb_init_sndev()
  NTB: ntb_transport: Ensure qp->tx_mw_dma_addr is initaliazed
  NTB: ntb_hw_amd: set peer limit register
  NTB: ntb_perf: Clear stale values in doorbell and command SPAD register
  NTB: ntb_perf: Disable NTB link after clearing peer XLAT registers
  ...

6 years agotypo fix: it's d_make_root, not d_make_inode...
Al Viro [Sun, 21 Jul 2019 03:17:30 +0000 (23:17 -0400)]
typo fix: it's d_make_root, not d_make_inode...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
6 years agodt-bindings: pinctrl: stm32: Fix missing 'clocks' property in examples
Rob Herring [Tue, 16 Jul 2019 21:34:40 +0000 (15:34 -0600)]
dt-bindings: pinctrl: stm32: Fix missing 'clocks' property in examples

Now that examples are validated against the DT schema, an error with
required 'clocks' property missing is exposed:

Documentation/devicetree/bindings/pinctrl/st,stm32-pinctrl.example.dt.yaml: \
pinctrl@40020000: gpio@0: 'clocks' is a required property
Documentation/devicetree/bindings/pinctrl/st,stm32-pinctrl.example.dt.yaml: \
pinctrl@50020000: gpio@1000: 'clocks' is a required property
Documentation/devicetree/bindings/pinctrl/st,stm32-pinctrl.example.dt.yaml: \
pinctrl@50020000: gpio@2000: 'clocks' is a required property

Add the missing 'clocks' properties to the examples to fix the errors.

Fixes: 2c9239c125f0 ("dt-bindings: pinctrl: Convert stm32 pinctrl bindings to json-schema")
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: linux-gpio@vger.kernel.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Acked-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Rob Herring <robh@kernel.org>
6 years agodt-bindings: iio: ad7124: Fix dtc warnings in example
Rob Herring [Tue, 16 Jul 2019 20:21:56 +0000 (14:21 -0600)]
dt-bindings: iio: ad7124: Fix dtc warnings in example

With the conversion to DT schema, the examples are now compiled with
dtc. The ad7124 binding example has the following warning:

Documentation/devicetree/bindings/iio/adc/adi,ad7124.example.dts:19.11-21: \
Warning (reg_format): /example-0/adc@0:reg: property has invalid length (4 bytes) (#address-cells == 1, #size-cells == 1)

There's a default #size-cells and #address-cells values of 1 for
examples. For examples needing different values such as this one on a
SPI bus, they need to provide a SPI bus parent node.

Fixes: 26ae15e62d3c ("Convert AD7124 bindings documentation to YAML format.")
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: linux-iio@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
6 years agodt-bindings: iio: avia-hx711: Fix avdd-supply typo in example
Rob Herring [Tue, 16 Jul 2019 20:13:29 +0000 (14:13 -0600)]
dt-bindings: iio: avia-hx711: Fix avdd-supply typo in example

Now that examples are validated against the DT schema, a typo in
avia-hx711 example generates a warning:

Documentation/devicetree/bindings/iio/adc/avia-hx711.example.dt.yaml: weight: 'avdd-supply' is a required property

Fix the typo.

Fixes: 5150ec3fe125 ("avia-hx711.yaml: transform DT binding to YAML")
Cc: Andreas Klinger <ak@it-klinger.de>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: linux-iio@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
6 years agodt-bindings: pinctrl: aspeed: Fix AST2500 example errors
Rob Herring [Mon, 15 Jul 2019 22:48:41 +0000 (16:48 -0600)]
dt-bindings: pinctrl: aspeed: Fix AST2500 example errors

The schema examples are now validated against the schema itself. The
AST2500 pinctrl schema has a couple of errors:

Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.example.dt.yaml: \
example-0: $nodename:0: 'example-0' does not match '^(bus|soc|axi|ahb|apb)(@[0-9a-f]+)?$'
Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.example.dt.yaml: \
pinctrl: aspeed,external-nodes: [[1, 2]] is too short

Fixes: 0a617de16730 ("dt-bindings: pinctrl: aspeed: Convert AST2500 bindings to json-schema")
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Joel Stanley <joel@jms.id.au>
Cc: linux-aspeed@lists.ozlabs.org
Cc: linux-gpio@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Rob Herring <robh@kernel.org>
6 years agodt-bindings: pinctrl: aspeed: Fix 'compatible' schema errors
Rob Herring [Mon, 15 Jul 2019 22:37:25 +0000 (16:37 -0600)]
dt-bindings: pinctrl: aspeed: Fix 'compatible' schema errors

The Aspeed pinctl schema have errors in the 'compatible' schema:

Documentation/devicetree/bindings/pinctrl/aspeed,ast2400-pinctrl.yaml: \
properties:compatible:enum: ['aspeed', 'ast2400-pinctrl', 'aspeed', 'g4-pinctrl'] has non-unique elements
Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.yaml: \
properties:compatible:enum: ['aspeed', 'ast2500-pinctrl', 'aspeed', 'g5-pinctrl'] has non-unique elements

Flow style sequences have to be quoted if the vales contain ','. Fix
this by using the more common one line per entry formatting.

Fixes: 0a617de16730 ("dt-bindings: pinctrl: aspeed: Convert AST2500 bindings to json-schema")
Fixes: 07457937bb5c ("dt-bindings: pinctrl: aspeed: Convert AST2400 bindings to json-schema")
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Joel Stanley <joel@jms.id.au>
Cc: linux-aspeed@lists.ozlabs.org
Cc: linux-gpio@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Rob Herring <robh@kernel.org>
6 years agodt-bindings: riscv: Limit cpus schema to only check RiscV 'cpu' nodes
Rob Herring [Wed, 26 Jun 2019 23:57:59 +0000 (17:57 -0600)]
dt-bindings: riscv: Limit cpus schema to only check RiscV 'cpu' nodes

Matching on the 'cpus' node was a bad choice because the schema is
incorrectly applied to non-RiscV cpus nodes. As we now have a common cpus
schema which checks the general structure, it is also redundant to do so
in the Risc-V CPU schema.

The downside is one could conceivably mix different architecture's cpu
nodes or have typos in the compatible string. The latter problem pretty
much exists for every schema.

Acked-by: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Rob Herring <robh@kernel.org>
6 years agodt-bindings: Ensure child nodes are of type 'object'
Rob Herring [Wed, 3 Jul 2019 20:17:06 +0000 (14:17 -0600)]
dt-bindings: Ensure child nodes are of type 'object'

Properties which are child node definitions need to have an explict
type. Otherwise, a matching (DT) property can silently match when an
error is desired. Fix this up tree-wide. Once this is fixed, the
meta-schema will enforce this on any child node definitions.

Cc: Chen-Yu Tsai <wens@csie.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Brian Norris <computersforpeace@gmail.com>
Cc: Marek Vasut <marek.vasut@gmail.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: linux-mtd@lists.infradead.org
Cc: linux-gpio@vger.kernel.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Cc: linux-spi@vger.kernel.org
Acked-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Maxime Ripard <maxime.ripard@bootlin.com>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Rob Herring <robh@kernel.org>
6 years agomac80211: don't warn about CW params when not using them
Brian Norris [Thu, 18 Jul 2019 01:57:12 +0000 (18:57 -0700)]
mac80211: don't warn about CW params when not using them

ieee80211_set_wmm_default() normally sets up the initial CW min/max for
each queue, except that it skips doing this if the driver doesn't
support ->conf_tx. We still end up calling drv_conf_tx() in some cases
(e.g., ieee80211_reconfig()), which also still won't do anything
useful...except it complains here about the invalid CW parameters.

Let's just skip the WARN if we weren't going to do anything useful with
the parameters.

Signed-off-by: Brian Norris <briannorris@chromium.org>
Link: https://lore.kernel.org/r/20190718015712.197499-1-briannorris@chromium.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 years agomac80211: fix possible memory leak in ieee80211_assign_beacon
Lorenzo Bianconi [Tue, 2 Jul 2019 22:29:47 +0000 (00:29 +0200)]
mac80211: fix possible memory leak in ieee80211_assign_beacon

Free new beacon_data in ieee80211_assign_beacon whenever
ieee80211_assign_beacon fails

Fixes: 8860020e0be1 ("cfg80211: restructure AP/GO mode API")
Fixes: bc847970f432 ("mac80211: support FTM responder configuration/statistic")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/770285772543c9fca33777bb4ad4760239e56256.1562105631.git.lorenzo@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 years agonl80211: fix NL80211_HE_MAX_CAPABILITY_LEN
John Crispin [Thu, 27 Jun 2019 09:58:32 +0000 (11:58 +0200)]
nl80211: fix NL80211_HE_MAX_CAPABILITY_LEN

NL80211_HE_MAX_CAPABILITY_LEN has changed between D2.0 and D4.0. It is now
MAC (6) + PHY (11) + MCS (12) + PPE (25) = 54.

Signed-off-by: John Crispin <john@phrozen.org>
Link: https://lore.kernel.org/r/20190627095832.19445-1-john@phrozen.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 years agonl80211: fix VENDOR_CMD_RAW_DATA
Johannes Berg [Tue, 25 Jun 2019 08:04:51 +0000 (10:04 +0200)]
nl80211: fix VENDOR_CMD_RAW_DATA

Since ERR_PTR() is an inline, not a macro, just open-code it
here so it's usable as an initializer, fixing the build in
brcmfmac.

Reported-by: Arend Van Spriel <arend.vanspriel@broadcom.com>
Fixes: 901bb9891855 ("nl80211: require and validate vendor command policy")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 years agowireless: fix nl80211 vendor commands
Johannes Berg [Tue, 25 Jun 2019 08:04:51 +0000 (10:04 +0200)]
wireless: fix nl80211 vendor commands

In my previous commit to validate a policy I neglected to
actually add one to the few drivers using vendor commands,
fix that now.

Reported-by: Tony Lindgren <tony@atomide.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Fixes: 901bb9891855 ("nl80211: require and validate vendor command policy")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Sat, 20 Jul 2019 19:22:30 +0000 (12:22 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull more input updates from Dmitry Torokhov:

 - Apple SPI keyboard and trackpad driver for newer Macs

 - ALPS driver will ignore trackpoint-only devices to give the
   trackpoint driver a chance to handle them properly

 - another Lenovo is switched over to SMbus from PS/2

 - assorted driver fixups.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: alps - fix a mismatch between a condition check and its comment
  Input: psmouse - fix build error of multiple definition
  Input: applespi - remove set but not used variables 'sts'
  Input: add Apple SPI keyboard and trackpad driver
  Input: alps - don't handle ALPS cs19 trackpoint-only device
  Input: hyperv-keyboard - remove dependencies on PAGE_SIZE for ring buffer
  Input: adp5589 - initialize GPIO controller parent device
  Input: iforce - remove empty multiline comments
  Input: synaptics - fix misuse of strlcpy
  Input: auo-pixcir-ts - switch to using  devm_add_action_or_reset()
  Input: gtco - bounds check collection indent level
  Input: mtk-pmic-keys - add of_node_put() before return
  Input: sun4i-lradc-keys - add of_node_put() before return
  Input: synaptics - whitelist Lenovo T580 SMBus intertouch

6 years agor8169: fix RTL8168g PHY init
Thomas Voegtle [Sat, 20 Jul 2019 17:01:22 +0000 (19:01 +0200)]
r8169: fix RTL8168g PHY init

This fixes a copy&paste error in the original patch. Setting the wrong
register resulted in massive packet loss on some systems.

Fixes: a2928d28643e ("r8169: use paged versions of phylib MDIO access functions")
Tested-by: Thomas Voegtle <tv@lio96.de>
Signed-off-by: Thomas Voegtle <tv@lio96.de>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge tag 'dma-mapping-5.3-1' of git://git.infradead.org/users/hch/dma-mapping
Linus Torvalds [Sat, 20 Jul 2019 19:09:52 +0000 (12:09 -0700)]
Merge tag 'dma-mapping-5.3-1' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fixes from Christoph Hellwig:
 "Fix various regressions:

   - force unencrypted dma-coherent buffers if encryption bit can't fit
     into the dma coherent mask (Tom Lendacky)

   - avoid limiting request size if swiotlb is not used (me)

   - fix swiotlb handling in dma_direct_sync_sg_for_cpu/device (Fugang
     Duan)"

* tag 'dma-mapping-5.3-1' of git://git.infradead.org/users/hch/dma-mapping:
  dma-direct: correct the physical addr in dma_direct_sync_sg_for_cpu/device
  dma-direct: only limit the mapping size if swiotlb could be used
  dma-mapping: add a dma_addressing_limited helper
  dma-direct: Force unencrypted DMA under SME for certain DMA masks

6 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 20 Jul 2019 18:24:49 +0000 (11:24 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A set of x86 specific fixes and updates:

   - The CR2 corruption fixes which store CR2 early in the entry code
     and hand the stored address to the fault handlers.

   - Revert a forgotten leftover of the dropped FSGSBASE series.

   - Plug a memory leak in the boot code.

   - Make the Hyper-V assist functionality robust by zeroing the shadow
     page.

   - Remove a useless check for dead processes with LDT

   - Update paravirt and VMware maintainers entries.

   - A few cleanup patches addressing various compiler warnings"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/entry/64: Prevent clobbering of saved CR2 value
  x86/hyper-v: Zero out the VP ASSIST PAGE on allocation
  x86, boot: Remove multiple copy of static function sanitize_boot_params()
  x86/boot/compressed/64: Remove unused variable
  x86/boot/efi: Remove unused variables
  x86/mm, tracing: Fix CR2 corruption
  x86/entry/64: Update comments and sanity tests for create_gap
  x86/entry/64: Simplify idtentry a little
  x86/entry/32: Simplify common_exception
  x86/paravirt: Make read_cr2() CALLEE_SAVE
  MAINTAINERS: Update PARAVIRT_OPS_INTERFACE and VMWARE_HYPERVISOR_INTERFACE
  x86/process: Delete useless check for dead process with LDT
  x86: math-emu: Hide clang warnings for 16-bit overflow
  x86/e820: Use proper booleans instead of 0/1
  x86/apic: Silence -Wtype-limits compiler warnings
  x86/mm: Free sme_early_buffer after init
  x86/boot: Fix memory leak in default_get_smp_config()
  Revert "x86/ptrace: Prevent ptrace from clearing the FS/GS selector" and fix the test

6 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 20 Jul 2019 18:06:12 +0000 (11:06 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf tooling updates from Thomas Gleixner:
 "A set of perf improvements and fixes:

  perf db-export:
   - Improvements in how COMM details are exported to databases for post
     processing and use in the sql-viewer.py UI.

   - Export switch events to the database.

  BPF:
   - Bump rlimit(MEMLOCK) for 'perf test bpf' and 'perf trace', just
     like selftests/bpf/bpf_rlimit.h do, which makes errors due to
     exhaustion of this limit, which are kinda cryptic (EPERM sometimes)
     less frequent.

  perf version:
   - Fix segfault due to missing OPT_END(), noticed on PowerPC.

  perf vendor events:
   - Add JSON files for IBM s/390 machine type 8561.

  perf cs-etm (ARM):
   - Fix two cases of error returns not bing done properly: Invalid
     ERR_PTR() use and loss of propagation error codes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits)
  perf version: Fix segfault due to missing OPT_END()
  perf vendor events s390: Add JSON files for machine type 8561
  perf cs-etm: Return errcode in cs_etm__process_auxtrace_info()
  perf cs-etm: Remove errnoeous ERR_PTR() usage in cs_etm__process_auxtrace_info
  perf scripts python: export-to-postgresql.py: Export switch events
  perf scripts python: export-to-sqlite.py: Export switch events
  perf db-export: Export switch events
  perf db-export: Factor out db_export__threads()
  perf script: Add scripting operation process_switch()
  perf scripts python: exported-sql-viewer.py: Use new 'has_calls' column
  perf scripts python: exported-sql-viewer.py: Remove redundant semi-colons
  perf scripts python: export-to-postgresql.py: Add has_calls column to comms table
  perf scripts python: export-to-sqlite.py: Add has_calls column to comms table
  perf db-export: Also export thread's current comm
  perf db-export: Factor out db_export__comm()
  perf scripts python: export-to-postgresql.py: Export comm details
  perf scripts python: export-to-sqlite.py: Export comm details
  perf db-export: Export comm details
  perf db-export: Fix a white space issue in db_export__sample()
  perf db-export: Move export__comm_thread into db_export__sample()
  ...

6 years agoMerge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 20 Jul 2019 17:45:15 +0000 (10:45 -0700)]
Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core fixes from Thomas Gleixner:

 - A collection of objtool fixes which address recent fallout partially
   exposed by newer toolchains, clang, BPF and general code changes.

 - Force USER_DS for user stack traces

[ Note: the "objtool fixes" are not all to objtool itself, but for
  kernel code that triggers objtool warnings.

  Things like missing function size annotations, or code that confuses
  the unwinder etc.   - Linus]

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  objtool: Support conditional retpolines
  objtool: Convert insn type to enum
  objtool: Fix seg fault on bad switch table entry
  objtool: Support repeated uses of the same C jump table
  objtool: Refactor jump table code
  objtool: Refactor sibling call detection logic
  objtool: Do frame pointer check before dead end check
  objtool: Change dead_end_function() to return boolean
  objtool: Warn on zero-length functions
  objtool: Refactor function alias logic
  objtool: Track original function across branches
  objtool: Add mcsafe_handle_tail() to the uaccess safe list
  bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()
  x86/uaccess: Remove redundant CLACs in getuser/putuser error paths
  x86/uaccess: Don't leak AC flag into fentry from mcsafe_handle_tail()
  x86/uaccess: Remove ELF function annotation from copy_user_handle_tail()
  x86/head/64: Annotate start_cpu0() as non-callable
  x86/entry: Fix thunk function ELF sizes
  x86/kvm: Don't call kvm_spurious_fault() from .fixup
  x86/kvm: Replace vmx_vmenter()'s call to kvm_spurious_fault() with UD2
  ...

6 years agoMerge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 20 Jul 2019 17:43:03 +0000 (10:43 -0700)]
Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull smp fix from Thomas Gleixner:
 "Add warnings to the smp function calls so callers from wrong contexts
  get detected"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  smp: Warn on function calls from softirq context

6 years agoMerge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 20 Jul 2019 17:33:44 +0000 (10:33 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull CONFIG_PREEMPT_RT stub config from Thomas Gleixner:
 "The real-time preemption patch set exists for almost 15 years now and
  while the vast majority of infrastructure and enhancements have found
  their way into the mainline kernel, the final integration of RT is
  still missing.

  Over the course of the last few years, we have worked on reducing the
  intrusivenness of the RT patches by refactoring kernel infrastructure
  to be more real-time friendly. Almost all of these changes were
  benefitial to the mainline kernel on their own, so there was no
  objection to integrate them.

  Though except for the still ongoing printk refactoring, the remaining
  changes which are required to make RT a first class mainline citizen
  are not longer arguable as immediately beneficial for the mainline
  kernel. Most of them are either reordering code flows or adding RT
  specific functionality.

  But this now has hit a wall and turned into a classic hen and egg
  problem:

     Maintainers are rightfully wary vs. these changes as they make only
     sense if the final integration of RT into the mainline kernel takes
     place.

  Adding CONFIG_PREEMPT_RT aims to solve this as a clear sign that RT
  will be fully integrated into the mainline kernel. The final
  integration of the missing bits and pieces will be of course done with
  the same careful approach as we have used in the past.

  While I'm aware that you are not entirely enthusiastic about that, I
  think that RT should receive the same treatment as any other widely
  used out of tree functionality, which we have accepted into mainline
  over the years.

  RT has become the de-facto standard real-time enhancement and is
  shipped by enterprise, embedded and community distros. It's in use
  throughout a wide range of industries: telecommunications, industrial
  automation, professional audio, medical devices, data acquisition,
  automotive - just to name a few major use cases.

  RT development is backed by a Linuxfoundation project which is
  supported by major stakeholders of this technology. The funding will
  continue over the actual inclusion into mainline to make sure that the
  functionality is neither introducing regressions, regressing itself,
  nor becomes subject to bitrot. There is also a lifely user community
  around RT as well, so contrary to the grim situation 5 years ago, it's
  a healthy project.

  As RT is still a good vehicle to exercise rarely used code paths and
  to detect hard to trigger issues, you could at least view it as a QA
  tool if nothing else"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/rt, Kconfig: Introduce CONFIG_PREEMPT_RT

6 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sat, 20 Jul 2019 17:20:27 +0000 (10:20 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more KVM updates from Paolo Bonzini:
 "Mostly bugfixes, but also:

   - s390 support for KVM selftests

   - LAPIC timer offloading to housekeeping CPUs

   - Extend an s390 optimization for overcommitted hosts to all
     architectures

   - Debugging cleanups and improvements"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits)
  KVM: x86: Add fixed counters to PMU filter
  KVM: nVMX: do not use dangling shadow VMCS after guest reset
  KVM: VMX: dump VMCS on failed entry
  KVM: x86/vPMU: refine kvm_pmu err msg when event creation failed
  KVM: s390: Use kvm_vcpu_wake_up in kvm_s390_vcpu_wakeup
  KVM: Boost vCPUs that are delivering interrupts
  KVM: selftests: Remove superfluous define from vmx.c
  KVM: SVM: Fix detection of AMD Errata 1096
  KVM: LAPIC: Inject timer interrupt via posted interrupt
  KVM: LAPIC: Make lapic timer unpinned
  KVM: x86/vPMU: reset pmc->counter to 0 for pmu fixed_counters
  KVM: nVMX: Ignore segment base for VMX memory operand when segment not FS or GS
  kvm: x86: ioapic and apic debug macros cleanup
  kvm: x86: some tsc debug cleanup
  kvm: vmx: fix coccinelle warnings
  x86: kvm: avoid constant-conversion warning
  x86: kvm: avoid -Wsometimes-uninitized warning
  KVM: x86: expose AVX512_BF16 feature to guest
  KVM: selftests: enable pgste option for the linker on s390
  KVM: selftests: Move kvm_create_max_vcpus test to generic code
  ...

6 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 20 Jul 2019 17:04:58 +0000 (10:04 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is the final round of mostly small fixes in our initial submit.

  It's mostly minor fixes and driver updates. The only change of note is
  adding a virt_boundary_mask to the SCSI host and host template to
  parametrise this for NVMe devices instead of having them do a call in
  slave_alloc. It's a fairly straightforward conversion except in the
  two NVMe handling drivers that didn't set it who now have a virtual
  infinity parameter added"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits)
  scsi: megaraid_sas: set an unlimited max_segment_size
  scsi: mpt3sas: set an unlimited max_segment_size for SAS 3.0 HBAs
  scsi: IB/srp: set virt_boundary_mask in the scsi host
  scsi: IB/iser: set virt_boundary_mask in the scsi host
  scsi: storvsc: set virt_boundary_mask in the scsi host template
  scsi: ufshcd: set max_segment_size in the scsi host template
  scsi: core: take the DMA max mapping size into account
  scsi: core: add a host / host template field for the virt boundary
  scsi: core: Fix race on creating sense cache
  scsi: sd_zbc: Fix compilation warning
  scsi: libfc: fix null pointer dereference on a null lport
  scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized
  scsi: zfcp: fix request object use-after-free in send path causing wrong traces
  scsi: zfcp: fix request object use-after-free in send path causing seqno errors
  scsi: megaraid_sas: Update driver version to 07.710.50.00
  scsi: megaraid_sas: Add module parameter for FW Async event logging
  scsi: megaraid_sas: Enable msix_load_balance for Invader and later controllers
  scsi: megaraid_sas: Fix calculation of target ID
  scsi: lpfc: reduce stack size with CONFIG_GCC_PLUGIN_STRUCTLEAK_VERBOSE
  scsi: devinfo: BLIST_TRY_VPD_PAGES for SanDisk Cruzer Blade
  ...

6 years agoMerge tag 'kbuild-v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy...
Linus Torvalds [Sat, 20 Jul 2019 16:34:55 +0000 (09:34 -0700)]
Merge tag 'kbuild-v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull more Kbuild updates from Masahiro Yamada:

 - match the directory structure of the linux-libc-dev package to that
   of Debian-based distributions

 - fix incorrect include/config/auto.conf generation when Kconfig
   creates it along with the .config file

 - remove misleading $(AS) from documents

 - clean up precious tag files by distclean instead of mrproper

 - add a new coccinelle patch for devm_platform_ioremap_resource
   migration

 - refactor module-related scripts to read modules.order instead of
   $(MODVERDIR)/*.mod files to get the list of created modules

 - remove MODVERDIR

 - update list of header compile-test

 - add -fcf-protection=none flag to avoid conflict with the retpoline
   flags when CONFIG_RETPOLINE=y

 - misc cleanups

* tag 'kbuild-v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (25 commits)
  kbuild: add -fcf-protection=none when using retpoline flags
  kbuild: update compile-test header list for v5.3-rc1
  kbuild: split out *.mod out of {single,multi}-used-m rules
  kbuild: remove 'prepare1' target
  kbuild: remove the first line of *.mod files
  kbuild: create *.mod with full directory path and remove MODVERDIR
  kbuild: export_report: read modules.order instead of .tmp_versions/*.mod
  kbuild: modpost: read modules.order instead of $(MODVERDIR)/*.mod
  kbuild: modsign: read modules.order instead of $(MODVERDIR)/*.mod
  kbuild: modinst: read modules.order instead of $(MODVERDIR)/*.mod
  scsi: remove pointless $(MODVERDIR)/$(obj)/53c700.ver
  kbuild: remove duplication from modules.order in sub-directories
  kbuild: get rid of kernel/ prefix from in-tree modules.{order,builtin}
  kbuild: do not create empty modules.order in the prepare stage
  coccinelle: api: add devm_platform_ioremap_resource script
  kbuild: compile-test headers listed in header-test-m as well
  kbuild: remove unused hostcc-option
  kbuild: remove tag files by distclean instead of mrproper
  kbuild: add --hash-style= and --build-id unconditionally
  kbuild: get rid of misleading $(AS) from documents
  ...

6 years agoMerge branch 'work.dcache2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Sat, 20 Jul 2019 16:15:51 +0000 (09:15 -0700)]
Merge branch 'work.dcache2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull dcache and mountpoint updates from Al Viro:
 "Saner handling of refcounts to mountpoints.

  Transfer the counting reference from struct mount ->mnt_mountpoint
  over to struct mountpoint ->m_dentry. That allows us to get rid of the
  convoluted games with ordering of mount shutdowns.

  The cost is in teaching shrink_dcache_{parent,for_umount} to cope with
  mixed-filesystem shrink lists, which we'll also need for the Slab
  Movable Objects patchset"

* 'work.dcache2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  switch the remnants of releasing the mountpoint away from fs_pin
  get rid of detach_mnt()
  make struct mountpoint bear the dentry reference to mountpoint, not struct mount
  Teach shrink_dcache_parent() to cope with mixed-filesystem shrink lists
  fs/namespace.c: shift put_mountpoint() to callers of unhash_mnt()
  __detach_mounts(): lookup_mountpoint() can't return ERR_PTR() anymore
  nfs: dget_parent() never returns NULL
  ceph: don't open-code the check for dead lockref

6 years agox86/entry/64: Prevent clobbering of saved CR2 value
Thomas Gleixner [Sat, 20 Jul 2019 08:56:41 +0000 (10:56 +0200)]
x86/entry/64: Prevent clobbering of saved CR2 value

The recent fix for CR2 corruption introduced a new way to reliably corrupt
the saved CR2 value.

CR2 is saved early in the entry code in RDX, which is the third argument to
the fault handling functions. But it missed that between saving and
invoking the fault handler enter_from_user_mode() can be called. RDX is a
caller saved register so the invoked function can freely clobber it with
the obvious consequences.

The TRACE_IRQS_OFF call is safe as it calls through the thunk which
preserves RDX, but TRACE_IRQS_OFF_DEBUG is not because it also calls into
C-code outside of the thunk.

Store CR2 in R12 instead which is a callee saved register and move R12 to
RDX just before calling the fault handler.

Fixes: a0d14b8909de ("x86/mm, tracing: Fix CR2 corruption")
Reported-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1907201020540.1782@nanos.tec.linutronix.de
6 years agosmp: Warn on function calls from softirq context
Peter Zijlstra [Thu, 18 Jul 2019 09:20:09 +0000 (11:20 +0200)]
smp: Warn on function calls from softirq context

It's clearly documented that smp function calls cannot be invoked from
softirq handling context. Unfortunately nothing enforces that or emits a
warning.

A single function call can be invoked from softirq context only via
smp_call_function_single_async().

The only legit context is task context, so add a warning to that effect.

Reported-by: luferry <luferry@163.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190718160601.GP3402@hirez.programming.kicks-ass.net
6 years agoKVM: x86: Add fixed counters to PMU filter
Eric Hankland [Thu, 18 Jul 2019 18:38:18 +0000 (11:38 -0700)]
KVM: x86: Add fixed counters to PMU filter

Updates KVM_CAP_PMU_EVENT_FILTER so it can also whitelist or blacklist
fixed counters.

Signed-off-by: Eric Hankland <ehankland@google.com>
[No need to check padding fields for zero. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
6 years agoKVM: nVMX: do not use dangling shadow VMCS after guest reset
Paolo Bonzini [Fri, 19 Jul 2019 16:41:10 +0000 (18:41 +0200)]
KVM: nVMX: do not use dangling shadow VMCS after guest reset

If a KVM guest is reset while running a nested guest, free_nested will
disable the shadow VMCS execution control in the vmcs01.  However,
on the next KVM_RUN vmx_vcpu_run would nevertheless try to sync
the VMCS12 to the shadow VMCS which has since been freed.

This causes a vmptrld of a NULL pointer on my machime, but Jan reports
the host to hang altogether.  Let's see how much this trivial patch fixes.

Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Liran Alon <liran.alon@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
6 years agoKVM: VMX: dump VMCS on failed entry
Paolo Bonzini [Fri, 19 Jul 2019 16:15:08 +0000 (18:15 +0200)]
KVM: VMX: dump VMCS on failed entry

This is useful for debugging, and is ratelimited nowadays.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
6 years agoKVM: x86/vPMU: refine kvm_pmu err msg when event creation failed
Like Xu [Thu, 18 Jul 2019 05:35:14 +0000 (13:35 +0800)]
KVM: x86/vPMU: refine kvm_pmu err msg when event creation failed

If a perf_event creation fails due to any reason of the host perf
subsystem, it has no chance to log the corresponding event for guest
which may cause abnormal sampling data in guest result. In debug mode,
this message helps to understand the state of vPMC and we may not
limit the number of occurrences but not in a spamming style.

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
6 years agoKVM: s390: Use kvm_vcpu_wake_up in kvm_s390_vcpu_wakeup
Wanpeng Li [Thu, 18 Jul 2019 11:39:07 +0000 (19:39 +0800)]
KVM: s390: Use kvm_vcpu_wake_up in kvm_s390_vcpu_wakeup

Use kvm_vcpu_wake_up() in kvm_s390_vcpu_wakeup().

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
6 years agoKVM: Boost vCPUs that are delivering interrupts
Wanpeng Li [Thu, 18 Jul 2019 11:39:06 +0000 (19:39 +0800)]
KVM: Boost vCPUs that are delivering interrupts

Inspired by commit 9cac38dd5d (KVM/s390: Set preempted flag during
vcpu wakeup and interrupt delivery), we want to also boost not just
lock holders but also vCPUs that are delivering interrupts. Most
smp_call_function_many calls are synchronous, so the IPI target vCPUs
are also good yield candidates.  This patch introduces vcpu->ready to
boost vCPUs during wakeup and interrupt delivery time; unlike s390 we do
not reuse vcpu->preempted so that voluntarily preempted vCPUs are taken
into account by kvm_vcpu_on_spin, but vmx_vcpu_pi_put is not affected
(VT-d PI handles voluntary preemption separately, in pi_pre_block).

Testing on 80 HT 2 socket Xeon Skylake server, with 80 vCPUs VM 80GB RAM:
ebizzy -M

            vanilla     boosting    improved
1VM          21443       23520         9%
2VM           2800        8000       180%
3VM           1800        3100        72%

Testing on my Haswell desktop 8 HT, with 8 vCPUs VM 8GB RAM, two VMs,
one running ebizzy -M, the other running 'stress --cpu 2':

w/ boosting + w/o pv sched yield(vanilla)

            vanilla     boosting   improved
              1570         4000      155%

w/ boosting + w/ pv sched yield(vanilla)

            vanilla     boosting   improved
              1844         5157      179%

w/o boosting, perf top in VM:

 72.33%  [kernel]       [k] smp_call_function_many
  4.22%  [kernel]       [k] call_function_i
  3.71%  [kernel]       [k] async_page_fault

w/ boosting, perf top in VM:

 38.43%  [kernel]       [k] smp_call_function_many
  6.31%  [kernel]       [k] async_page_fault
  6.13%  libc-2.23.so   [.] __memcpy_avx_unaligned
  4.88%  [kernel]       [k] call_function_interrupt

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
6 years agoKVM: selftests: Remove superfluous define from vmx.c
Thomas Huth [Thu, 18 Jul 2019 11:55:27 +0000 (13:55 +0200)]
KVM: selftests: Remove superfluous define from vmx.c

The code in vmx.c does not use "program_invocation_name", so there
is no need to "#define _GNU_SOURCE" here.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
6 years agoKVM: SVM: Fix detection of AMD Errata 1096
Liran Alon [Tue, 16 Jul 2019 23:56:58 +0000 (02:56 +0300)]
KVM: SVM: Fix detection of AMD Errata 1096

When CPU raise #NPF on guest data access and guest CR4.SMAP=1, it is
possible that CPU microcode implementing DecodeAssist will fail
to read bytes of instruction which caused #NPF. This is AMD errata
1096 and it happens because CPU microcode reading instruction bytes
incorrectly attempts to read code as implicit supervisor-mode data
accesses (that is, just like it would read e.g. a TSS), which are
susceptible to SMAP faults. The microcode reads CS:RIP and if it is
a user-mode address according to the page tables, the processor
gives up and returns no instruction bytes.  In this case,
GuestIntrBytes field of the VMCB on a VMEXIT will incorrectly
return 0 instead of the correct guest instruction bytes.

Current KVM code attemps to detect and workaround this errata, but it
has multiple issues:

1) It mistakenly checks if guest CR4.SMAP=0 instead of guest CR4.SMAP=1,
which is required for encountering a SMAP fault.

2) It assumes SMAP faults can only occur when guest CPL==3.
However, in case guest CR4.SMEP=0, the guest can execute an instruction
which reside in a user-accessible page with CPL<3 priviledge. If this
instruction raise a #NPF on it's data access, then CPU DecodeAssist
microcode will still encounter a SMAP violation.  Even though no sane
OS will do so (as it's an obvious priviledge escalation vulnerability),
we still need to handle this semanticly correct in KVM side.

Note that (2) *is* a useful optimization, because CR4.SMAP=1 is an easy
triggerable condition and guests usually enable SMAP together with SMEP.
If the vCPU has CR4.SMEP=1, the errata could indeed be encountered onlt
at guest CPL==3; otherwise, the CPU would raise a SMEP fault to guest
instead of #NPF.  We keep this condition to avoid false positives in
the detection of the errata.

In addition, to avoid future confusion and improve code readbility,
include details of the errata in code and not just in commit message.

Fixes: 05d5a4863525 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)")
Cc: Singh Brijesh <brijesh.singh@amd.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
6 years agoKVM: LAPIC: Inject timer interrupt via posted interrupt
Wanpeng Li [Sat, 6 Jul 2019 01:26:51 +0000 (09:26 +0800)]
KVM: LAPIC: Inject timer interrupt via posted interrupt

Dedicated instances are currently disturbed by unnecessary jitter due
to the emulated lapic timers firing on the same pCPUs where the
vCPUs reside.  There is no hardware virtual timer on Intel for guest
like ARM, so both programming timer in guest and the emulated timer fires
incur vmexits.  This patch tries to avoid vmexit when the emulated timer
fires, at least in dedicated instance scenario when nohz_full is enabled.

In that case, the emulated timers can be offload to the nearest busy
housekeeping cpus since APICv has been found for several years in server
processors. The guest timer interrupt can then be injected via posted interrupts,
which are delivered by the housekeeping cpu once the emulated timer fires.

The host should tuned so that vCPUs are placed on isolated physical
processors, and with several pCPUs surplus for busy housekeeping.
If disabled mwait/hlt/pause vmexits keep the vCPUs in non-root mode,
~3% redis performance benefit can be observed on Skylake server, and the
number of external interrupt vmexits drops substantially.  Without patch

            VM-EXIT  Samples  Samples%  Time%   Min Time  Max Time   Avg time
EXTERNAL_INTERRUPT    42916    49.43%   39.30%   0.47us   106.09us   0.71us ( +-   1.09% )

While with patch:

            VM-EXIT  Samples  Samples%  Time%   Min Time  Max Time         Avg time
EXTERNAL_INTERRUPT    6871     9.29%     2.96%   0.44us    57.88us   0.72us ( +-   4.02% )

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>