]> www.infradead.org Git - users/dwmw2/linux.git/log
users/dwmw2/linux.git
5 years agotipc: fix use-after-free in tipc_bcast_get_mode
Hoang Huu Le [Thu, 27 Aug 2020 02:56:51 +0000 (09:56 +0700)]
tipc: fix use-after-free in tipc_bcast_get_mode

Syzbot has reported those issues as:

==================================================================
BUG: KASAN: use-after-free in tipc_bcast_get_mode+0x3ab/0x400 net/tipc/bcast.c:759
Read of size 1 at addr ffff88805e6b3571 by task kworker/0:6/3850

CPU: 0 PID: 3850 Comm: kworker/0:6 Not tainted 5.8.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events tipc_net_finalize_work

Thread 1's call trace:
[...]
  kfree+0x103/0x2c0 mm/slab.c:3757 <- bcbase releasing
  tipc_bcast_stop+0x1b0/0x2f0 net/tipc/bcast.c:721
  tipc_exit_net+0x24/0x270 net/tipc/core.c:112
[...]

Thread 2's call trace:
[...]
  tipc_bcast_get_mode+0x3ab/0x400 net/tipc/bcast.c:759 <- bcbase
has already been freed by Thread 1

  tipc_node_broadcast+0x9e/0xcc0 net/tipc/node.c:1744
  tipc_nametbl_publish+0x60b/0x970 net/tipc/name_table.c:752
  tipc_net_finalize net/tipc/net.c:141 [inline]
  tipc_net_finalize+0x1fa/0x310 net/tipc/net.c:131
  tipc_net_finalize_work+0x55/0x80 net/tipc/net.c:150
[...]

==================================================================
BUG: KASAN: use-after-free in tipc_named_reinit+0xef/0x290 net/tipc/name_distr.c:344
Read of size 8 at addr ffff888052ab2000 by task kworker/0:13/30628
CPU: 0 PID: 30628 Comm: kworker/0:13 Not tainted 5.8.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events tipc_net_finalize_work
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1f0/0x31e lib/dump_stack.c:118
 print_address_description+0x66/0x5a0 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report+0x132/0x1d0 mm/kasan/report.c:530
 tipc_named_reinit+0xef/0x290 net/tipc/name_distr.c:344
 tipc_net_finalize+0x85/0xe0 net/tipc/net.c:138
 tipc_net_finalize_work+0x50/0x70 net/tipc/net.c:150
 process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
 worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
 kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
[...]
Freed by task 14058:
 save_stack mm/kasan/common.c:48 [inline]
 set_track mm/kasan/common.c:56 [inline]
 kasan_set_free_info mm/kasan/common.c:316 [inline]
 __kasan_slab_free+0x114/0x170 mm/kasan/common.c:455
 __cache_free mm/slab.c:3426 [inline]
 kfree+0x10a/0x220 mm/slab.c:3757
 tipc_exit_net+0x29/0x50 net/tipc/core.c:113
 ops_exit_list net/core/net_namespace.c:186 [inline]
 cleanup_net+0x708/0xba0 net/core/net_namespace.c:603
 process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
 worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
 kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293

Fix it by calling flush_scheduled_work() to make sure the
tipc_net_finalize_work() stopped before releasing bcbase object.

Reported-by: syzbot+6ea1f7a8df64596ef4d7@syzkaller.appspotmail.com
Reported-by: syzbot+e9cc557752ab126c1b99@syzkaller.appspotmail.com
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Huu Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'Move-MDIO-drivers-into-their-own-directory'
David S. Miller [Thu, 27 Aug 2020 13:55:51 +0000 (06:55 -0700)]
Merge branch 'Move-MDIO-drivers-into-their-own-directory'

Andrew Lunn says:

====================
Move MDIO drivers into their own directory

The phy subdirectory is getting cluttered. It has both PHY drivers and
MDIO drivers, plus a stray switch driver. Soon more PCS drivers are
likely to appear.

Move MDIO and PCS drivers into new directories. This requires fixing
up the xgene driver which uses a relative include path.

v2:
Move the subdirs to drivers/net, rather than drivers/net/phy.

v3:
Add subdirectories under include/linux for mdio and pcs

v4:
there->their
include path fix
No new kconfig prompts
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: Sort Kconfig and Makefile
Andrew Lunn [Thu, 27 Aug 2020 02:00:32 +0000 (04:00 +0200)]
net: phy: Sort Kconfig and Makefile

Sort the Kconfig based on the text shown in make menuconfig and sort
the Makefile by CONFIG symbol.

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mdio: Move MDIO drivers into a new subdirectory
Andrew Lunn [Thu, 27 Aug 2020 02:00:31 +0000 (04:00 +0200)]
net: mdio: Move MDIO drivers into a new subdirectory

Move all the MDIO drivers and multiplexers into drivers/net/mdio.  The
mdio core is however left in the phy directory, due to mutual
dependencies between the MDIO core and the PHY core.

Take this opportunity to sort the Kconfig based on the menuconfig
strings, and move the multiplexers to the end with a separating
comment.

v2:
Fix typo in commit message

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: xgene: Move shared header file into include/linux
Andrew Lunn [Thu, 27 Aug 2020 02:00:30 +0000 (04:00 +0200)]
net: xgene: Move shared header file into include/linux

This header file is currently included into the ethernet driver via a
relative path into the PHY subsystem. This is bad practice, and causes
issues for the upcoming move of the MDIO driver. Move the header file
into include/linux to clean this up.

v2:
Move header to include/linux/mdio

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/phy/mdio-i2c: Move header file to include/linux/mdio
Andrew Lunn [Thu, 27 Aug 2020 02:00:29 +0000 (04:00 +0200)]
net/phy/mdio-i2c: Move header file to include/linux/mdio

In preparation for moving all MDIO drivers into drivers/net/mdio, move
the mdio-i2c header file into include/linux/mdio so it can be used by
both the MDIO driver and the SFP code which instantiates I2C MDIO
busses.

v2:
Add include/linux/mdio

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: pcs: Move XPCS into new PCS subdirectory
Andrew Lunn [Thu, 27 Aug 2020 02:00:28 +0000 (04:00 +0200)]
net: pcs: Move XPCS into new PCS subdirectory

Create drivers/net/pcs and move the Synopsys DesignWare XPCS into the
new directory. Move the header file into a subdirectory
include/linux/pcs

Start a naming convention of all PCS files use the prefix pcs-, and
rename the XPCS files to fit.

v2:
Add include/linux/pcs

v4:
Fix include path in stmmac.
Remove PCS_DEVICES to avoid new prompts

Cc: Jose Abreu <Jose.Abreu@synopsys.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'drivers-net-constify-static-ops-variables'
David S. Miller [Wed, 26 Aug 2020 23:21:17 +0000 (16:21 -0700)]
Merge branch 'drivers-net-constify-static-ops-variables'

Rikard Falkeborn says:

====================
drivers/net: constify static ops-variables

This series constifies a number of static ops variables, to allow the
compiler to put them in read-only memory. Compile-tested only.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ath11k: constify ath11k_thermal_ops
Rikard Falkeborn [Wed, 26 Aug 2020 22:56:08 +0000 (00:56 +0200)]
net: ath11k: constify ath11k_thermal_ops

The only usage of ath11k_thermal_ops is to pass its address to
thermal_cooling_device_register() which takes a const pointer. Make it
const to allow the compiler to put it in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: mscc: macsec: constify vsc8584_macsec_ops
Rikard Falkeborn [Wed, 26 Aug 2020 22:56:07 +0000 (00:56 +0200)]
net: phy: mscc: macsec: constify vsc8584_macsec_ops

The only usage of vsc8584_macsec_ops is to assign its address to the
macsec_ops field in the phydev struct, which is a const pointer. Make it
const to allow the compiler to put it in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: at803x: constify static regulator_ops
Rikard Falkeborn [Wed, 26 Aug 2020 22:56:06 +0000 (00:56 +0200)]
net: phy: at803x: constify static regulator_ops

The only usage of vddio_regulator_ops and vddh_regulator_ops is to
assign their address to the ops field in the regulator_desc struct,
which is a const pointer. Make them const to allow the compiler to
put them in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: renesas: sh_eth: constify bb_ops
Rikard Falkeborn [Wed, 26 Aug 2020 22:56:05 +0000 (00:56 +0200)]
net: renesas: sh_eth: constify bb_ops

The only usage of bb_ops is to assign its address to the ops field in
the mdiobb_ctrl struct, which is a const pointer. Make it const to allow
the compiler to put it in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ethernet: ravb: constify bb_ops
Rikard Falkeborn [Wed, 26 Aug 2020 22:56:04 +0000 (00:56 +0200)]
net: ethernet: ravb: constify bb_ops

The only usage of bb_ops is to assign its address to the ops field in
the mdiobb_ctrl struct, which is a const pointer. Make it const to allow
the compiler to put it in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ethernet: qualcomm: constify qca_serdev_ops
Rikard Falkeborn [Wed, 26 Aug 2020 22:56:03 +0000 (00:56 +0200)]
net: ethernet: qualcomm: constify qca_serdev_ops

The only usage of qca_serdev_ops is to pass its address to
serdev_device_set_client_ops() which takes a const pointer. Make it
const to allow the compiler to put it in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ipa: remove duplicate include
Wang Hai [Wed, 19 Aug 2020 02:46:45 +0000 (10:46 +0800)]
net: ipa: remove duplicate include

Remove linux/notifier.h which is included more than once

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'refactoring-of-ibmvnic-code'
David S. Miller [Wed, 26 Aug 2020 23:11:39 +0000 (16:11 -0700)]
Merge branch 'refactoring-of-ibmvnic-code'

Lijun Pan says:

====================
refactoring of ibmvnic code

This patch series refactor reset_init and init functions,
and make some other cosmetic changes to make the code
easier to read and debug. v2 removes __func__ and v1's 1/5.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoibmvnic: merge ibmvnic_reset_init and ibmvnic_init
Lijun Pan [Wed, 19 Aug 2020 22:52:26 +0000 (17:52 -0500)]
ibmvnic: merge ibmvnic_reset_init and ibmvnic_init

These two functions share the majority of the code, hence merge
them together. In the meanwhile, add a reset pass-in parameter
to differentiate them. Thus, the code is easier to read and to tell
the difference between reset_init and regular init.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoibmvnic: remove never executed if statement
Lijun Pan [Wed, 19 Aug 2020 22:52:25 +0000 (17:52 -0500)]
ibmvnic: remove never executed if statement

At the beginning of the function, from_passive_init is set false by
"adapter->from_passive_init = false;",
hence the if statement will never run.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoibmvnic: improve ibmvnic_init and ibmvnic_reset_init
Lijun Pan [Wed, 19 Aug 2020 22:52:24 +0000 (17:52 -0500)]
ibmvnic: improve ibmvnic_init and ibmvnic_reset_init

When H_SEND_CRQ command returns with H_CLOSED, it means the
server's CRQ is not ready yet. Instead of resetting immediately,
we wait for the server to launch passive init.
ibmvnic_init() and ibmvnic_reset_init() should also return the
error code from ibmvnic_send_crq_init() call.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoibmvnic: compare adapter->init_done_rc with more readable ibmvnic_rc_codes
Lijun Pan [Wed, 19 Aug 2020 22:52:23 +0000 (17:52 -0500)]
ibmvnic: compare adapter->init_done_rc with more readable ibmvnic_rc_codes

Instead of comparing (adapter->init_done_rc == 1), let it
be (adapter->init_done_rc == PARTIALSUCCESS).

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'ipv4-nexthop-Various-improvements'
David S. Miller [Wed, 26 Aug 2020 23:00:51 +0000 (16:00 -0700)]
Merge branch 'ipv4-nexthop-Various-improvements'

Ido Schimmel says:

====================
ipv4: nexthop: Various improvements

This patch set contains various improvements that I made to the nexthop
object code while studying it towards my upcoming changes.

While patches #4 and #6 fix bugs, they are not regressions (never
worked). They also do not occur to me as critical issues, which is why I
am targeting them at net-next.

Tested with fib_nexthops.sh:

Tests passed: 134
Tests failed:   0
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: fib_nexthops: Test IPv6 route with group after replacing IPv4 nexthops
Ido Schimmel [Wed, 26 Aug 2020 16:48:57 +0000 (19:48 +0300)]
selftests: fib_nexthops: Test IPv6 route with group after replacing IPv4 nexthops

Test that an IPv6 route can not use a nexthop group with mixed IPv4 and
IPv6 nexthops, but can use it after replacing the IPv4 nexthops with
IPv6 nexthops.

Output without previous patch:

# ./fib_nexthops.sh -t ipv6_fcnal_runtime

IPv6 functional runtime
-----------------------
TEST: Route add                                                     [ OK ]
TEST: Route delete                                                  [ OK ]
TEST: Ping with nexthop                                             [ OK ]
TEST: Ping - multipath                                              [ OK ]
TEST: Ping - blackhole                                              [ OK ]
TEST: Ping - blackhole replaced with gateway                        [ OK ]
TEST: Ping - gateway replaced by blackhole                          [ OK ]
TEST: Ping - group with blackhole                                   [ OK ]
TEST: Ping - group blackhole replaced with gateways                 [ OK ]
TEST: IPv6 route with device only nexthop                           [ OK ]
TEST: IPv6 multipath route with nexthop mix - dev only + gw         [ OK ]
TEST: IPv6 route can not have a v4 gateway                          [ OK ]
TEST: Nexthop replace - v6 route, v4 nexthop                        [ OK ]
TEST: Nexthop replace of group entry - v6 route, v4 nexthop         [ OK ]
TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
TEST: IPv6 route using a group after removing v4 gateways           [ OK ]
TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
TEST: IPv6 route using a group after replacing v4 gateways          [FAIL]
TEST: Nexthop with default route and rpfilter                       [ OK ]
TEST: Nexthop with multipath default route and rpfilter             [ OK ]

Tests passed:  21
Tests failed:   1

Output with previous patch:

# ./fib_nexthops.sh -t ipv6_fcnal_runtime

IPv6 functional runtime
-----------------------
TEST: Route add                                                     [ OK ]
TEST: Route delete                                                  [ OK ]
TEST: Ping with nexthop                                             [ OK ]
TEST: Ping - multipath                                              [ OK ]
TEST: Ping - blackhole                                              [ OK ]
TEST: Ping - blackhole replaced with gateway                        [ OK ]
TEST: Ping - gateway replaced by blackhole                          [ OK ]
TEST: Ping - group with blackhole                                   [ OK ]
TEST: Ping - group blackhole replaced with gateways                 [ OK ]
TEST: IPv6 route with device only nexthop                           [ OK ]
TEST: IPv6 multipath route with nexthop mix - dev only + gw         [ OK ]
TEST: IPv6 route can not have a v4 gateway                          [ OK ]
TEST: Nexthop replace - v6 route, v4 nexthop                        [ OK ]
TEST: Nexthop replace of group entry - v6 route, v4 nexthop         [ OK ]
TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
TEST: IPv6 route using a group after removing v4 gateways           [ OK ]
TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
TEST: IPv6 route using a group after replacing v4 gateways          [ OK ]
TEST: Nexthop with default route and rpfilter                       [ OK ]
TEST: Nexthop with multipath default route and rpfilter             [ OK ]

Tests passed:  22
Tests failed:   0

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv4: nexthop: Correctly update nexthop group when replacing a nexthop
Ido Schimmel [Wed, 26 Aug 2020 16:48:56 +0000 (19:48 +0300)]
ipv4: nexthop: Correctly update nexthop group when replacing a nexthop

Each nexthop group contains an indication if it has IPv4 nexthops
('has_v4'). Its purpose is to prevent IPv6 routes from using groups with
IPv4 nexthops.

However, the indication is not updated when a nexthop is replaced. This
results in the kernel wrongly rejecting IPv6 routes from pointing to
groups that only contain IPv6 nexthops. Example:

# ip nexthop replace id 1 via 192.0.2.2 dev dummy10
# ip nexthop replace id 10 group 1
# ip nexthop replace id 1 via 2001:db8:1::2 dev dummy10
# ip route replace 2001:db8:10::/64 nhid 10
Error: IPv6 routes can not use an IPv4 nexthop.

Solve this by iterating over all the nexthop groups that the replaced
nexthop is a member of and potentially update their IPv4 indication
according to the new set of member nexthops.

Avoid wasting cycles by only performing the update in case an IPv4
nexthop is replaced by an IPv6 nexthop.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: fib_nexthops: Test IPv6 route with group after removing IPv4 nexthops
Ido Schimmel [Wed, 26 Aug 2020 16:48:55 +0000 (19:48 +0300)]
selftests: fib_nexthops: Test IPv6 route with group after removing IPv4 nexthops

Test that an IPv6 route can not use a nexthop group with mixed IPv4 and
IPv6 nexthops, but can use it after deleting the IPv4 nexthops.

Output without previous patch:

# ./fib_nexthops.sh -t ipv6_fcnal_runtime

IPv6 functional runtime
-----------------------
TEST: Route add                                                     [ OK ]
TEST: Route delete                                                  [ OK ]
TEST: Ping with nexthop                                             [ OK ]
TEST: Ping - multipath                                              [ OK ]
TEST: Ping - blackhole                                              [ OK ]
TEST: Ping - blackhole replaced with gateway                        [ OK ]
TEST: Ping - gateway replaced by blackhole                          [ OK ]
TEST: Ping - group with blackhole                                   [ OK ]
TEST: Ping - group blackhole replaced with gateways                 [ OK ]
TEST: IPv6 route with device only nexthop                           [ OK ]
TEST: IPv6 multipath route with nexthop mix - dev only + gw         [ OK ]
TEST: IPv6 route can not have a v4 gateway                          [ OK ]
TEST: Nexthop replace - v6 route, v4 nexthop                        [ OK ]
TEST: Nexthop replace of group entry - v6 route, v4 nexthop         [ OK ]
TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
TEST: IPv6 route using a group after deleting v4 gateways           [FAIL]
TEST: Nexthop with default route and rpfilter                       [ OK ]
TEST: Nexthop with multipath default route and rpfilter             [ OK ]

Tests passed:  18
Tests failed:   1

Output with previous patch:

bash-5.0# ./fib_nexthops.sh -t ipv6_fcnal_runtime

IPv6 functional runtime
-----------------------
TEST: Route add                                                     [ OK ]
TEST: Route delete                                                  [ OK ]
TEST: Ping with nexthop                                             [ OK ]
TEST: Ping - multipath                                              [ OK ]
TEST: Ping - blackhole                                              [ OK ]
TEST: Ping - blackhole replaced with gateway                        [ OK ]
TEST: Ping - gateway replaced by blackhole                          [ OK ]
TEST: Ping - group with blackhole                                   [ OK ]
TEST: Ping - group blackhole replaced with gateways                 [ OK ]
TEST: IPv6 route with device only nexthop                           [ OK ]
TEST: IPv6 multipath route with nexthop mix - dev only + gw         [ OK ]
TEST: IPv6 route can not have a v4 gateway                          [ OK ]
TEST: Nexthop replace - v6 route, v4 nexthop                        [ OK ]
TEST: Nexthop replace of group entry - v6 route, v4 nexthop         [ OK ]
TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
TEST: IPv6 route using a group after deleting v4 gateways           [ OK ]
TEST: Nexthop with default route and rpfilter                       [ OK ]
TEST: Nexthop with multipath default route and rpfilter             [ OK ]

Tests passed:  19
Tests failed:   0

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv4: nexthop: Correctly update nexthop group when removing a nexthop
Ido Schimmel [Wed, 26 Aug 2020 16:48:54 +0000 (19:48 +0300)]
ipv4: nexthop: Correctly update nexthop group when removing a nexthop

Each nexthop group contains an indication if it has IPv4 nexthops
('has_v4'). Its purpose is to prevent IPv6 routes from using groups with
IPv4 nexthops.

However, the indication is not updated when a nexthop is removed. This
results in the kernel wrongly rejecting IPv6 routes from pointing to
groups that only contain IPv6 nexthops. Example:

# ip nexthop replace id 1 via 192.0.2.2 dev dummy10
# ip nexthop replace id 2 via 2001:db8:1::2 dev dummy10
# ip nexthop replace id 10 group 1/2
# ip nexthop del id 1
# ip route replace 2001:db8:10::/64 nhid 10
Error: IPv6 routes can not use an IPv4 nexthop.

Solve this by updating the indication according to the new set of
member nexthops.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv4: nexthop: Remove unnecessary rtnl_dereference()
Ido Schimmel [Wed, 26 Aug 2020 16:48:53 +0000 (19:48 +0300)]
ipv4: nexthop: Remove unnecessary rtnl_dereference()

The pointer is not RCU protected, so remove the unnecessary
rtnl_dereference(). This suppresses the following warning:

net/ipv4/nexthop.c:1101:24: error: incompatible types in comparison expression (different address spaces):
net/ipv4/nexthop.c:1101:24:    struct rb_node [noderef] __rcu *
net/ipv4/nexthop.c:1101:24:    struct rb_node *

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv4: nexthop: Use nla_put_be32() for NHA_GATEWAY
Ido Schimmel [Wed, 26 Aug 2020 16:48:52 +0000 (19:48 +0300)]
ipv4: nexthop: Use nla_put_be32() for NHA_GATEWAY

The code correctly uses nla_get_be32() to get the payload of the
attribute, but incorrectly uses nla_put_u32() to add the attribute to
the payload. This results in the following warning:

net/ipv4/nexthop.c:279:59: warning: incorrect type in argument 3 (different base types)
net/ipv4/nexthop.c:279:59:    expected unsigned int [usertype] value
net/ipv4/nexthop.c:279:59:    got restricted __be32 [usertype] ipv4

Suppress the warning by using nla_put_be32().

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv4: nexthop: Reduce allocation size of 'struct nh_group'
Ido Schimmel [Wed, 26 Aug 2020 16:48:51 +0000 (19:48 +0300)]
ipv4: nexthop: Reduce allocation size of 'struct nh_group'

The struct looks as follows:

struct nh_group {
struct nh_group *spare; /* spare group for removals */
u16 num_nh;
bool mpath;
bool fdb_nh;
bool has_v4;
struct nh_grp_entry nh_entries[];
};

But its offset within 'struct nexthop' is also taken into account to
determine the allocation size.

Instead, use struct_size() to allocate only the required number of
bytes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net_prefetch-API'
David S. Miller [Wed, 26 Aug 2020 22:55:54 +0000 (15:55 -0700)]
Merge branch 'net_prefetch-API'

Tariq Toukan says:

====================
net_prefetch API

This patchset adds a common net API for L1 cacheline size-aware prefetch.

Patch 1 introduces the common API in net and aligns the drivers to use it.
Patches 2 and 3 add usage in mlx4 and mlx5 Eth drivers.

Series generated against net-next commit:
079f921e9f4d Merge tag 'batadv-next-for-davem-20200824' of git://git.open-mesh.org/linux-merge
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/mlx4_en: RX, Add a prefetch command for small L1_CACHE_BYTES
Tariq Toukan [Wed, 26 Aug 2020 12:54:18 +0000 (15:54 +0300)]
net/mlx4_en: RX, Add a prefetch command for small L1_CACHE_BYTES

A single cacheline might not contain the packet header for
small L1_CACHE_BYTES values.
Use net_prefetch() as it issues an additional prefetch
in this case.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES
Tariq Toukan [Wed, 26 Aug 2020 12:54:17 +0000 (15:54 +0300)]
net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES

A single cacheline might not contain the packet header for
small L1_CACHE_BYTES values.
Use net_prefetch() as it issues an additional prefetch
in this case.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: Take common prefetch code structure into a function
Tariq Toukan [Wed, 26 Aug 2020 12:54:16 +0000 (15:54 +0300)]
net: Take common prefetch code structure into a function

Many device drivers use the same prefetch code structure to
deal with small L1 cacheline size.
Take this code into a function and call it from the drivers.

Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'Add-Ethernet-support-for-Intel-Keem-Bay-SoC'
David S. Miller [Wed, 26 Aug 2020 22:52:30 +0000 (15:52 -0700)]
Merge branch 'Add-Ethernet-support-for-Intel-Keem-Bay-SoC'

Vineetha G. Jaya Kumaran says:

====================
Add Ethernet support for Intel Keem Bay SoC

This patch set enables support for Ethernet on the Intel Keem Bay SoC.
The first patch contains the required Device Tree bindings documentation,
while the second patch adds the Intel platform glue layer for the stmmac
device driver.

This driver was tested on the Keem Bay evaluation module board.
Changes since v2:
-Add a select in DT documentation to avoid matching with all nodes containing 'snps,dwmac'
-Rebased to 5.9-rc1

Changes since v1:
-Removed clocks maxItems property from DT bindings documentation
-Removed phy compatible strings from DT bindings documentation
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: Add dwmac-intel-plat for GBE driver
Rusaimi Amira Ruslan [Wed, 26 Aug 2020 04:33:42 +0000 (12:33 +0800)]
net: stmmac: Add dwmac-intel-plat for GBE driver

Add dwmac-intel-plat to enable the stmmac driver in Intel Keem Bay.
Also add fix_mac_speed and tx_clk in order to change link speeds.
This is required as mac_speed_o is not connected in the
Intel Keem Bay SoC.

Signed-off-by: Rusaimi Amira Ruslan <rusaimi.amira.rusaimi@intel.com>
Signed-off-by: Vineetha G. Jaya Kumaran <vineetha.g.jaya.kumaran@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodt-bindings: net: Add bindings for Intel Keem Bay
Vineetha G. Jaya Kumaran [Wed, 26 Aug 2020 04:33:41 +0000 (12:33 +0800)]
dt-bindings: net: Add bindings for Intel Keem Bay

Add Device Tree bindings documentation for the ethernet controller
on Intel Keem Bay.

Signed-off-by: Vineetha G. Jaya Kumaran <vineetha.g.jaya.kumaran@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoinet: remove inet_sk_copy_descendant()
Eric Dumazet [Wed, 26 Aug 2020 13:50:16 +0000 (06:50 -0700)]
inet: remove inet_sk_copy_descendant()

This is no longer used, SCTP now uses a private helper.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoptp: ptp_ines: Remove redundant null check
Xu Wang [Wed, 26 Aug 2020 03:12:51 +0000 (03:12 +0000)]
ptp: ptp_ines: Remove redundant null check

Because kfree_skb already checked NULL skb parameter,
so the additional check is unnecessary, just remove it.

Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agosunrpc: Avoid comma separated statements
Joe Perches [Tue, 25 Aug 2020 04:56:25 +0000 (21:56 -0700)]
sunrpc: Avoid comma separated statements

Use semicolons and braces.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: fib6: Avoid comma separated statements
Joe Perches [Tue, 25 Aug 2020 04:56:24 +0000 (21:56 -0700)]
ipv6: fib6: Avoid comma separated statements

Use semicolons and braces.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agowan: sbni: Avoid comma separated statements
Joe Perches [Tue, 25 Aug 2020 04:56:15 +0000 (21:56 -0700)]
wan: sbni: Avoid comma separated statements

Use semicolons and braces.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agofs_enet: Avoid comma separated statements
Joe Perches [Tue, 25 Aug 2020 04:56:14 +0000 (21:56 -0700)]
fs_enet: Avoid comma separated statements

Use semicolons and braces.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago8390: Avoid comma separated statements
Joe Perches [Tue, 25 Aug 2020 04:56:13 +0000 (21:56 -0700)]
8390: Avoid comma separated statements

Use semicolons and braces.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: clean up codestyle for net/ipv4
Miaohe Lin [Tue, 25 Aug 2020 12:32:11 +0000 (08:32 -0400)]
net: clean up codestyle for net/ipv4

This is a pure codestyle cleanup patch. Also add a blank line after
declarations as warned by checkpatch.pl.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: Remove duplicated midx check against 0
Miaohe Lin [Tue, 25 Aug 2020 12:10:37 +0000 (08:10 -0400)]
net: Remove duplicated midx check against 0

Check midx against 0 is always equal to check midx against sk_bound_dev_if
when sk_bound_dev_if is known not equal to 0 in these case.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: Avoid unnecessary inet_addr_type() call when addr is INADDR_ANY
Miaohe Lin [Tue, 25 Aug 2020 11:40:48 +0000 (07:40 -0400)]
net: Avoid unnecessary inet_addr_type() call when addr is INADDR_ANY

We can avoid unnecessary inet_addr_type() call by check addr against
INADDR_ANY first.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: Set ping saddr after we successfully get the ping port
Miaohe Lin [Tue, 25 Aug 2020 11:33:22 +0000 (07:33 -0400)]
net: Set ping saddr after we successfully get the ping port

We can defer set ping saddr until we successfully get the ping port. So we
can avoid clear saddr when failed. Since ping_clear_saddr() is not used
anymore now, remove it.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb4: add error handlers to LE intr_handler
Raju Rangoju [Tue, 25 Aug 2020 03:55:46 +0000 (09:25 +0530)]
cxgb4: add error handlers to LE intr_handler

cxgb4 does not look for HASHTBLMEMCRCERR and CMDTIDERR
bits in LE_DB_INT_CAUSE register, but these are enabled
in LE_DB_INT_ENABLE. So, add error handlers to LE
interrupt handler to emit a warning or alert message
for hash table mem crc and cmd tid errors

Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonetlink: remove duplicated nla_need_padding_for_64bit() check
Miaohe Lin [Tue, 25 Aug 2020 03:25:17 +0000 (23:25 -0400)]
netlink: remove duplicated nla_need_padding_for_64bit() check

The need for padding 64bit is implicitly checked by nla_align_64bit(), so
remove this explicit one.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: gain ipv4 mtu when mtu is not locked
Miaohe Lin [Tue, 25 Aug 2020 03:20:28 +0000 (23:20 -0400)]
net: gain ipv4 mtu when mtu is not locked

When mtu is locked, we should not obtain ipv4 mtu as we return immediately
in this case and leave acquired ipv4 mtu unused.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'batadv-next-for-davem-20200824' of git://git.open-mesh.org/linux-merge
David S. Miller [Tue, 25 Aug 2020 01:18:22 +0000 (18:18 -0700)]
Merge tag 'batadv-next-for-davem-20200824' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This cleanup patchset includes the following patches:

 - bump version strings, by Simon Wunderlich

 - Drop unused function batadv_hardif_remove_interfaces(),
   by Sven Eckelmann

 - delete duplicated words, by Randy Dunlap

 - Drop (even more) repeated words in comments, by Sven Eckelmann

 - Migrate to linux/prandom.h, by Sven Eckelmann
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'Add-PTP-support-for-Octeontx2'
David S. Miller [Tue, 25 Aug 2020 01:15:45 +0000 (18:15 -0700)]
Merge branch 'Add-PTP-support-for-Octeontx2'

Subbaraya Sundeep says:

====================
Add PTP support for Octeontx2

This patchset adds PTP support for Octeontx2 platform.
PTP is an independent coprocessor block from which
CGX block fetches timestamp and prepends it to the
packet before sending to NIX block. Patches are as
follows:

Patch 1: Patch to enable/disable packet timstamping
         in CGX upon mailbox request. It also adjusts
         packet parser (NPC) for the 8 bytes timestamp
         appearing before the packet.

Patch 2: Patch adding PTP pci driver which configures
         the PTP block and hooks up to RVU AF driver.
         It also exposes a mailbox call to adjust PTP
         hardware clock.

Patch 3: Patch adding PTP clock driver for PF netdev.

v8:
 Added missing header file reported by kernel test robot
 in patch 2
v7:
 As per Jesse Brandeburg comments:
 Simplified functions in patch 1
 Replaced magic numbers with macros
 Added Copyrights
 Added code comments wherever required
 Modified commit description of patch 2
v6:
 Resent after net-next is open
v5:
 As suggested by David separated the fix (adding rtnl lock/unlock)
 and submitted to net.
 https://www.spinics.net/lists/netdev/msg669617.html
v4:
 Added rtnl_lock/unlock in otx2_reset to protect against
 network stack ndo_open and close calls
 Added NULL check after ptp_clock_register in otx2_ptp.c
v3:
 Fixed sparse error in otx2_txrx.c
 Removed static inlines in otx2_txrx.c
v2:
 Fixed kernel build robot reported error by
 adding timecounter.h to otx2_common.h
====================

Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoocteontx2-pf: Add support for PTP clock
Aleksey Makarov [Mon, 24 Aug 2020 15:50:02 +0000 (21:20 +0530)]
octeontx2-pf: Add support for PTP clock

This patch adds PTP clock and uses it in Octeontx2
network device. PTP clock uses mailbox calls to
access the hardware counter on the RVU side.

Co-developed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Aleksey Makarov <amakarov@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoocteontx2-af: Add support for Marvell PTP coprocessor
Aleksey Makarov [Mon, 24 Aug 2020 15:50:01 +0000 (21:20 +0530)]
octeontx2-af: Add support for Marvell PTP coprocessor

Precision Timestamping block found on Octeontx2
platform is an independent coprocessor and has
internal PTP hardware clock. Once configured PTP
runs independently and when a packet arrives
CGX hardware block gets the current timestamp
from PTP block and forwards the packet to NIX
by prepending timestamp to the packet.
This patch adds the pci driver for PTP block.
The driver gets registered by AF driver and does
initial configuration and exposes a mailbox function to
read and adjust PTP hardware clock. The mailbox function
is called by AF consumers like netdev drivers or
userspace drivers. Since PTP being a single block
in platform this driver helps in accessing PTP
block by any AF consumer.

Co-developed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Aleksey Makarov <amakarov@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoocteontx2-af: Support to enable/disable HW timestamping
Zyta Szpak [Mon, 24 Aug 2020 15:50:00 +0000 (21:20 +0530)]
octeontx2-af: Support to enable/disable HW timestamping

Four new mbox messages ids and handler are added in order to
enable or disable timestamping procedure on tx and rx side.
Additionally when PTP is enabled, the packet parser must skip
over 8 bytes and start analyzing packet data there. To make NPC
profiles work seemlesly PTR_ADVANCE of IKPU is set so that
parsing can be done as before when all data pointers
are shifted by 8 bytes automatically.

Co-developed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Zyta Szpak <zyta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: Use helper macro RT_TOS() in __icmp_send()
Miaohe Lin [Mon, 24 Aug 2020 11:44:37 +0000 (07:44 -0400)]
net: Use helper macro RT_TOS() in __icmp_send()

Use helper macro RT_TOS() to get tos in __icmp_send().

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: Avoid access icmp_err_convert when icmp code is ICMP_FRAG_NEEDED
Miaohe Lin [Mon, 24 Aug 2020 11:15:04 +0000 (07:15 -0400)]
net: Avoid access icmp_err_convert when icmp code is ICMP_FRAG_NEEDED

There is no need to fetch errno and fatal info from icmp_err_convert when
icmp code is ICMP_FRAG_NEEDED.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'qed-introduce-devlink-health-support'
David S. Miller [Tue, 25 Aug 2020 01:01:33 +0000 (18:01 -0700)]
Merge branch 'qed-introduce-devlink-health-support'

Igor Russkikh says:

====================
qed: introduce devlink health support

This is a followup implementation after series

https://patchwork.ozlabs.org/project/netdev/cover/20200514095727.1361-1-irusskikh@marvell.com/

This is an implementation of devlink health infrastructure.

With this we are now able to report HW errors to devlink, and it'll take
its own actions depending on user configuration to capture and store the
dump at the bad moment, and to request the driver to recover the device.

So far we do not differentiate global device failures or specific PCI
function failures. This means that some errors specific to one physical
function will affect an entire device. This is not yet fully designed
and verified, will followup in future.

Solution was verified with artificial HW errors generated, existing
tools for dump analysis could be used.

v7: comments from Jesse and Jakub
 - p2: extra edev check
 - p9: removed extra indents
v6: patch 4: changing serial to board.serial and fw to fw.app
v5: improved patch 4 description
v4:
 - commit message and other fixes after Jiri's comments
 - removed one patch (will send to net)
v3: fix uninit var usage in patch 11
v2: fix #include issue from kbuild test robot.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqede: make driver reliable on unload after failures
Igor Russkikh [Sun, 23 Aug 2020 11:19:34 +0000 (14:19 +0300)]
qede: make driver reliable on unload after failures

In case recovery was not successful, netdev still should be
present. But we should clear cdev if something bad happens
on recovery.

We also check cdev for null on dev close. That could be a case
if recovery was not successful.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: align adjacent indent
Igor Russkikh [Sun, 23 Aug 2020 11:19:33 +0000 (14:19 +0300)]
qed: align adjacent indent

Remove extra indent on some of adjacent declarations.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: implement devlink dump
Igor Russkikh [Sun, 23 Aug 2020 11:19:32 +0000 (14:19 +0300)]
qed: implement devlink dump

Gather and push out full device dump to devlink.
Device dump is the same as with `ethtool -d`, but now its generated
exactly at the moment bad thing happens.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed*: make use of devlink recovery infrastructure
Igor Russkikh [Sun, 23 Aug 2020 11:19:31 +0000 (14:19 +0300)]
qed*: make use of devlink recovery infrastructure

Remove forcible recovery trigger and put it as a normal devlink
callback.

This allows user to enable/disable it via

    devlink health set pci/0000:03:00.0 reporter fw_fatal auto_recover false

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: use devlink logic to report errors
Igor Russkikh [Sun, 23 Aug 2020 11:19:30 +0000 (14:19 +0300)]
qed: use devlink logic to report errors

Use devlink_health_report to push error indications.
We implement this in qede via callback function to make it possible
to reuse the same for other drivers sitting on top of qed in future.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: health reporter init deinit seq
Igor Russkikh [Sun, 23 Aug 2020 11:19:29 +0000 (14:19 +0300)]
qed: health reporter init deinit seq

Here we declare health reporter ops (empty for now)
and register these in qed probe and remove callbacks.

This way we get devlink attached to all kind of qed* PCI
device entities: networking or storage offload entity.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: implement devlink info request
Igor Russkikh [Sun, 23 Aug 2020 11:19:28 +0000 (14:19 +0300)]
qed: implement devlink info request

Here we return existing fw & mfw versions, we also fetch device's
serial number:

~$ sudo ~/iproute2/devlink/devlink  dev info
pci/0000:01:00.1:
  driver qed
  board.serial_number REE1915E44552
  versions:
      running:
        fw.app 8.42.2.0
      stored:
        fw.mgmt 8.52.10.0

MFW and FW are different firmwares on device.
Management is a firmware responsible for link configuration and
various control plane features. Its permanent and resides in NVM.

Running FW (or fastpath FW) is an embedded microprogram implementing
all the packet processing, offloads, etc. This FW is being loaded
on each start by the driver from FW binary blob.

The base device specific structure (qed_dev_info) was not directly
available to the base driver before. Thus, here we create and store
a private copy of this structure in qed_dev root object to
access the data.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: fix kconfig help entries
Igor Russkikh [Sun, 23 Aug 2020 11:19:27 +0000 (14:19 +0300)]
qed: fix kconfig help entries

This patch replaces stubs in kconfig help entries with an actual description.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed/qede: make devlink survive recovery
Igor Russkikh [Sun, 23 Aug 2020 11:19:26 +0000 (14:19 +0300)]
qed/qede: make devlink survive recovery

Devlink instance lifecycle was linked to qed_dev object,
that caused devlink to be recreated on each recovery.

Changing it by making higher level driver (qede) responsible for its
life. This way devlink now survives recoveries.

qede now stores devlink structure pointer as a part of its device
object, devlink private data contains a linkage structure,
qed_devlink.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: move out devlink logic into a new file
Igor Russkikh [Sun, 23 Aug 2020 11:19:25 +0000 (14:19 +0300)]
qed: move out devlink logic into a new file

We are extending devlink infrastructure, thus move the existing
stuff into a new file qed_devlink.c

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agochelsio: switch from 'pci_' to 'dma_' API
Christophe JAILLET [Sun, 23 Aug 2020 08:36:48 +0000 (10:36 +0200)]
chelsio: switch from 'pci_' to 'dma_' API

The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'free_rx_resources()' and
'alloc_tx_resources()' (sge.c) GFP_KERNEL can be used because it is
already used in these functions.

Moreover, they can only be called from a .ndo_open function. So it is
guarded by the 'rtnl_lock()', which is a mutex.

While at it, a pr_err message in 'init_one()' has been updated accordingly
(s/consistent/coherent).

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'mlxsw-Misc-updates'
David S. Miller [Tue, 25 Aug 2020 00:36:11 +0000 (17:36 -0700)]
Merge branch 'mlxsw-Misc-updates'

Ido Schimmel says:

====================
mlxsw: Misc updates

This patch set includes various updates for mlxsw.

Patches #1-#4 adjust the default burst size of packet trap policers to
conform to Spectrum-{2,3} requirements. The corresponding selftest is
also adjusted so that it could reliably pass on these platforms.

Patch #5 adjusts a selftest so that it could pass with both old and new
versions of mausezahn.

Patch #6 significantly reduces the runtime of tc-police scale test by
changing the preference and masks of the used tc filters.

Patch #7 prevents the driver from trying to set invalid ethtool link
modes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: spectrum_ethtool: Remove internal speeds from PTYS register
Danielle Ratson [Sun, 23 Aug 2020 08:06:28 +0000 (11:06 +0300)]
mlxsw: spectrum_ethtool: Remove internal speeds from PTYS register

The PTYS register is used to report and configure the port type and
speed. Currently, internal bits in the register are used the same way
other bits are used.

Using the internal bits can cause bad parameter firmware errors. For
example, trying to write to internal bit 25 returns:

EMAD reg access failed (tid=53e2bffa00004310,reg_id=5004(ptys),type=write,status=7(bad parameter))

Remove the internal bits from the PTYS register, so that it is no longer
possible to pass them to firmware.

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: mlxsw: Reduce runtime of tc-police scale test
Ido Schimmel [Sun, 23 Aug 2020 08:06:27 +0000 (11:06 +0300)]
selftests: mlxsw: Reduce runtime of tc-police scale test

Currently, the test takes about 626 seconds to complete because of an
inefficient use of the device's TCAM. Reduce the runtime to 202 seconds
by inserting all the flower filters with the same preference and mask,
but with a different key.

In particular, this reduces the deletion of the qdisc (which triggers
the deletion of all the filters) from 66 seconds to 0.2 seconds. This
prevents various netlink requests from user space applications (e.g.,
systemd-networkd) from timing-out because RTNL is not held for too long
anymore.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: forwarding: Fix mausezahn delay parameter in mirror_test()
Danielle Ratson [Sun, 23 Aug 2020 08:06:26 +0000 (11:06 +0300)]
selftests: forwarding: Fix mausezahn delay parameter in mirror_test()

Currently, mausezahn delay parameter in mirror_test() is specified with
'ms' units.

mausezahn versions before 0.6.5 interpret 'ms' as seconds and therefore
the tests that use mirror_test() take a very long time to complete.

Resolve this by specifying 'msec' units.

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: mlxsw: Increase burst size for burst test
Ido Schimmel [Sun, 23 Aug 2020 08:06:25 +0000 (11:06 +0300)]
selftests: mlxsw: Increase burst size for burst test

The current combination of rate and burst size does not adhere to
Spectrum-{2,3} limitation which states that the minimum burst size
should be 40% of the rate.

Increase the burst size in order to honor above mentioned limitation and
avoid intermittent failures of this test case on Spectrum-{2,3}.

Remove the first sub-test case as the variation in number of received
packets is simply too large to reliably test it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: mlxsw: Increase burst size for rate test
Ido Schimmel [Sun, 23 Aug 2020 08:06:24 +0000 (11:06 +0300)]
selftests: mlxsw: Increase burst size for rate test

The current combination of rate and burst size does not adhere to
Spectrum-{2,3} limitation which states that the minimum burst size
should be 40% of the rate.

Increase the burst size in order to honor above mentioned limitation and
avoid intermittent failures of this test case on Spectrum-{2,3}.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: mlxsw: Decrease required rate accuracy
Ido Schimmel [Sun, 23 Aug 2020 08:06:23 +0000 (11:06 +0300)]
selftests: mlxsw: Decrease required rate accuracy

On Spectrum-{2,3} the required accuracy is +/-10%.

Align the test to this requirement so that it can reliably pass on these
platforms.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: spectrum_trap: Adjust default policer burst size for Spectrum-{2, 3}
Ido Schimmel [Sun, 23 Aug 2020 08:06:22 +0000 (11:06 +0300)]
mlxsw: spectrum_trap: Adjust default policer burst size for Spectrum-{2, 3}

On the Spectrum-{2,3} ASICs the minimum burst size of the packet trap
policers needs to be 40% of the configured rate. Otherwise, intermittent
drops are observed even when the incoming packet rate is slightly lower
than the configured policer rate.

Adjust the burst size of the registered packet trap policers so that
they do not violate above mentioned limitation.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: atheros: switch from 'pci_' to 'dma_' API
Christophe JAILLET [Sun, 23 Aug 2020 08:03:53 +0000 (10:03 +0200)]
net: atheros: switch from 'pci_' to 'dma_' API

The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'atl1e_setup_ring_resources()' (atl1e_main.c),
'atl1_setup_ring_resources()' (atl1.c) and 'atl2_setup_ring_resources()'
(atl2.c) GFP_KERNEL can be used because it can be called from a .ndo_open.

'atl1_setup_ring_resources()' (atl1.c) can also be called from a
'.set_ringparam' (see struct ethtool_ops) where sleep is also allowed.

Both cases are protected by 'rtnl_lock()' which is a mutex. So these
function can sleep.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agostarfire: switch from 'pci_' to 'dma_' API
Christophe JAILLET [Sun, 23 Aug 2020 06:26:41 +0000 (08:26 +0200)]
starfire: switch from 'pci_' to 'dma_' API

The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'netdev_open()', GFP_ATOMIC must be used
because it can be called from a .ndo_tx_timeout function.
So this function can be called with the 'netif_tx_lock' acquired.
The call chain is:
  --> tx_timeout                 (.ndo_tx_timeout function)
    --> netdev_open

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotyphoon: switch from 'pci_' to 'dma_' API
Christophe JAILLET [Sun, 23 Aug 2020 06:11:50 +0000 (08:11 +0200)]
typhoon: switch from 'pci_' to 'dma_' API

The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'typhoon_init_one()' GFP_KERNEL can be used
because it is a probe function and no lock is acquired.

When memory is allocated in 'typhoon_download_firmware()', GFP_ATOMIC
must be used because it can be called from a .ndo_tx_timeout function.
So this function can be called with the 'netif_tx_lock' acquired.
The call chain is:
  --> typhoon_tx_timeout                 (.ndo_tx_timeout function)
    --> typhoon_start_runtime
      --> typhoon_download_firmware

While at is, update some comments accordingly.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: David Dillow <dave@thedillows.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dccp: delete repeated words
Randy Dunlap [Sun, 23 Aug 2020 01:07:13 +0000 (18:07 -0700)]
net: dccp: delete repeated words

Drop duplicated words in /net/dccp/.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Cc: dccp@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: netlink: delete repeated words
Randy Dunlap [Sat, 22 Aug 2020 23:40:15 +0000 (16:40 -0700)]
net: netlink: delete repeated words

Drop duplicated words in net/netlink/.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ipv4: delete repeated words
Randy Dunlap [Sat, 22 Aug 2020 23:31:41 +0000 (16:31 -0700)]
net: ipv4: delete repeated words

Drop duplicate words in comments in net/ipv4/.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-sctp-delete-duplicated-words-plus-other-fixes'
David S. Miller [Mon, 24 Aug 2020 23:21:44 +0000 (16:21 -0700)]
Merge branch 'net-sctp-delete-duplicated-words-plus-other-fixes'

Randy Dunlap says:

====================
net: sctp: delete duplicated words + other fixes

Drop or fix repeated words in net/sctp/.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sctp: ulpqueue.c: delete duplicated word
Randy Dunlap [Sat, 22 Aug 2020 23:16:01 +0000 (16:16 -0700)]
net: sctp: ulpqueue.c: delete duplicated word

Drop the repeated word "an".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: linux-sctp@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sctp: sm_make_chunk.c: delete duplicated words + fix typo
Randy Dunlap [Sat, 22 Aug 2020 23:16:00 +0000 (16:16 -0700)]
net: sctp: sm_make_chunk.c: delete duplicated words + fix typo

Drop the repeated words "for", "that", and "a".
Change "his" to "this".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: linux-sctp@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sctp: protocol.c: delete duplicated words + punctuation
Randy Dunlap [Sat, 22 Aug 2020 23:15:59 +0000 (16:15 -0700)]
net: sctp: protocol.c: delete duplicated words + punctuation

Drop the repeated words "of" and "that".
Add some punctuation for readability.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: linux-sctp@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sctp: chunk.c: delete duplicated word
Randy Dunlap [Sat, 22 Aug 2020 23:15:58 +0000 (16:15 -0700)]
net: sctp: chunk.c: delete duplicated word

Drop the repeated word "the".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: linux-sctp@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sctp: bind_addr.c: delete duplicated word
Randy Dunlap [Sat, 22 Aug 2020 23:15:57 +0000 (16:15 -0700)]
net: sctp: bind_addr.c: delete duplicated word

Drop the repeated word "of".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: linux-sctp@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sctp: auth.c: delete duplicated words
Randy Dunlap [Sat, 22 Aug 2020 23:15:56 +0000 (16:15 -0700)]
net: sctp: auth.c: delete duplicated words

Drop the repeated word "the" and "now".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: linux-sctp@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sctp: associola.c: delete duplicated words
Randy Dunlap [Sat, 22 Aug 2020 23:15:55 +0000 (16:15 -0700)]
net: sctp: associola.c: delete duplicated words

Drop the repeated word "the" in two places.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: linux-sctp@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoio_uring: ignore POLLIN for recvmsg on MSG_ERRQUEUE
Luke Hsiao [Sat, 22 Aug 2020 04:41:05 +0000 (21:41 -0700)]
io_uring: ignore POLLIN for recvmsg on MSG_ERRQUEUE

Currently, io_uring's recvmsg subscribes to both POLLERR and POLLIN. In
the context of TCP tx zero-copy, this is inefficient since we are only
reading the error queue and not using recvmsg to read POLLIN responses.

This patch was tested by using a simple sending program to call recvmsg
using io_uring with MSG_ERRQUEUE set and verifying with printks that the
POLLIN is correctly unset when the msg flags are MSG_ERRQUEUE.

Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Luke Hsiao <lukehsiao@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoio_uring: allow tcp ancillary data for __sys_recvmsg_sock()
Luke Hsiao [Sat, 22 Aug 2020 04:41:04 +0000 (21:41 -0700)]
io_uring: allow tcp ancillary data for __sys_recvmsg_sock()

For TCP tx zero-copy, the kernel notifies the process of completions by
queuing completion notifications on the socket error queue. This patch
allows reading these notifications via recvmsg to support TCP tx
zero-copy.

Ancillary data was originally disallowed due to privilege escalation
via io_uring's offloading of sendmsg() onto a kernel thread with kernel
credentials (https://crbug.com/project-zero/1975). So, we must ensure
that the socket type is one where the ancillary data types that are
delivered on recvmsg are plain data (no file descriptors or values that
are translated based on the identity of the calling process).

This was tested by using io_uring to call recvmsg on the MSG_ERRQUEUE
with tx zero-copy enabled. Before this patch, we received -EINVALID from
this specific code path. After this patch, we could read tcp tx
zero-copy completion notifications from the MSG_ERRQUEUE.

Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Arjun Roy <arjunroy@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jann Horn <jannh@google.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Luke Hsiao <lukehsiao@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'devlink-fixes-for-port-and-reporter-field-access'
David S. Miller [Mon, 24 Aug 2020 23:02:47 +0000 (16:02 -0700)]
Merge branch 'devlink-fixes-for-port-and-reporter-field-access'

Parav Pandit says:

====================
devlink fixes for port and reporter field access

These series contains two small fixes of devlink.

Patch-1 initializes port reporter fields early enough to
avoid access before initialized error.
Patch-2 protects port list lock during traversal.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodevlink: Protect devlink port list traversal
Parav Pandit [Fri, 21 Aug 2020 19:12:21 +0000 (22:12 +0300)]
devlink: Protect devlink port list traversal

Cited patch in fixes tag misses to protect port list traversal
while traversing per port reporter list.

Protect it using devlink instance lock.

Fixes: f4f541660121 ("devlink: Implement devlink health reporters on per-port basis")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodevlink: Fix per port reporter fields initialization
Parav Pandit [Fri, 21 Aug 2020 19:12:20 +0000 (22:12 +0300)]
devlink: Fix per port reporter fields initialization

Cited patch in fixes tag initializes reporters_list and reporters_lock
of a devlink port after devlink port is added to the list. Once port
is added to the list, devlink_nl_cmd_health_reporter_get_dumpit()
can access the uninitialized mutex and reporters list head.
Fix it by initializing port reporters field before adding port to the
list.

Fixes: f4f541660121 ("devlink: Implement devlink health reporters on per-port basis")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoibmvnic: Fix use-after-free of VNIC login response buffer
Thomas Falcon [Fri, 21 Aug 2020 18:39:01 +0000 (13:39 -0500)]
ibmvnic: Fix use-after-free of VNIC login response buffer

The login response buffer is freed after it is received
and parsed, but other functions in the driver still attempt
to read it, such as when the device is opened, causing the
Oops below. Store relevant information in the driver's
private data structures and use those instead.

BUG: Kernel NULL pointer dereference on read at 0x00000010
Faulting instruction address: 0xc00800000050a900
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables ibmvnic ibmveth crc32c_vpmsum autofs4
CPU: 7 PID: 759 Comm: NetworkManager Not tainted 5.9.0-rc1-00124-gd0a84e1f38d9 #14
NIP:  c00800000050a900 LR: c00800000050a8f0 CTR: 00000000005b1904
REGS: c0000001ed746d20 TRAP: 0300   Not tainted  (5.9.0-rc1-00124-gd0a84e1f38d9)
MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24428484  XER: 00000001
CFAR: c0000000000101b0 DAR: 0000000000000010 DSISR: 40000000 IRQMASK: 0
GPR00: c00800000050a8f0 c0000001ed746fb0 c008000000518e00 0000000000000000
GPR04: 00000000000000c0 0000000000000080 0003c366c60c4501 0000000000000352
GPR08: 000000000001f400 0000000000000010 0000000000000000 0000000000000000
GPR12: 0001cf0000000019 c00000001ec97680 00000001003dfd40 0000010008dbb22c
GPR16: 0000000000000000 0000000000000000 0000000000000000 c000000000edb6c8
GPR20: c000000004e73e00 c000000004fd2448 c000000004e6d700 c000000004fd2448
GPR24: c000000004fd2400 c000000004a0cd20 c0000001ed961860 c0080000005029d8
GPR28: 0000000000000000 0000000000000003 c000000004a0c000 0000000000000000
NIP [c00800000050a900] init_resources+0x338/0xa00 [ibmvnic]
LR [c00800000050a8f0] init_resources+0x328/0xa00 [ibmvnic]
Call Trace:
[c0000001ed746fb0] [c00800000050a8f0] init_resources+0x328/0xa00 [ibmvnic] (unreliable)
[c0000001ed747090] [c00800000050b024] ibmvnic_open+0x5c/0x100 [ibmvnic]
[c0000001ed747110] [c000000000bdcc0c] __dev_open+0x17c/0x250
[c0000001ed7471b0] [c000000000bdd1ec] __dev_change_flags+0x1dc/0x270
[c0000001ed747260] [c000000000bdd2bc] dev_change_flags+0x3c/0x90
[c0000001ed7472a0] [c000000000bf24b8] do_setlink+0x3b8/0x1280
[c0000001ed747450] [c000000000bf8cc8] __rtnl_newlink+0x5a8/0x980
[c0000001ed7478b0] [c000000000bf9110] rtnl_newlink+0x70/0xb0
[c0000001ed7478f0] [c000000000bf07c4] rtnetlink_rcv_msg+0x364/0x460
[c0000001ed747990] [c000000000c68b94] netlink_rcv_skb+0x84/0x1a0
[c0000001ed747a00] [c000000000bef758] rtnetlink_rcv+0x28/0x40
[c0000001ed747a20] [c000000000c68188] netlink_unicast+0x218/0x310
[c0000001ed747a80] [c000000000c6848c] netlink_sendmsg+0x20c/0x4e0
[c0000001ed747b20] [c000000000b9dc88] ____sys_sendmsg+0x158/0x360
[c0000001ed747bb0] [c000000000ba1c88] ___sys_sendmsg+0x98/0xf0
[c0000001ed747d10] [c000000000ba1db8] __sys_sendmsg+0x78/0x100
[c0000001ed747dc0] [c000000000033820] system_call_exception+0x160/0x280
[c0000001ed747e20] [c00000000000d740] system_call_common+0xf0/0x27c
Instruction dump:
3be00000 38810068 b1410076 3941006a 93e10072 fbea0000 b1210068 4bff9915
eb9e0ca0 eabe0900 393c0010 3ab50048 <7fa04c2c7fba07b4 7b431764 7b4917a0
---[ end trace fbc5949a28e103bd ]---

Fixes: f3ae59c0c015 ("ibmvnic: store RX and TX subCRQ handle array in ibmvnic_adapter struct")
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipvlan: advertise link netns via netlink
Taehee Yoo [Fri, 21 Aug 2020 17:47:32 +0000 (17:47 +0000)]
ipvlan: advertise link netns via netlink

Assign rtnl_link_ops->get_link_net() callback so that IFLA_LINK_NETNSID is
added to rtnetlink messages.

Test commands:
    ip netns add nst
    ip link add dummy0 type dummy
    ip link add ipvlan0 link dummy0 type ipvlan
    ip link set ipvlan0 netns nst
    ip netns exec nst ip link show ipvlan0

Result:
    ---Before---
    6: ipvlan0@if5: <BROADCAST,MULTICAST> ...
        link/ether 82:3a:78:ab:60:50 brd ff:ff:ff:ff:ff:ff

    ---After---
    12: ipvlan0@if11: <BROADCAST,MULTICAST> ...
        link/ether 42:b1:ad:57:4e:27 brd ff:ff:ff:ff:ff:ff link-netnsid 0
                                                           ~~~~~~~~~~~~~~

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
David S. Miller [Sun, 23 Aug 2020 18:48:27 +0000 (11:48 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

5 years agoMerge tag 'powerpc-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 23 Aug 2020 18:37:23 +0000 (11:37 -0700)]
Merge tag 'powerpc-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Add perf support for emitting extended registers for power10.

 - A fix for CPU hotplug on pseries, where on large/loaded systems we
   may not wait long enough for the CPU to be offlined, leading to
   crashes.

 - Addition of a raw cputable entry for Power10, which is not required
   to boot, but is required to make our PMU setup work correctly in
   guests.

 - Three fixes for the recent changes on 32-bit Book3S to move modules
   into their own segment for strict RWX.

 - A fix for a recent change in our powernv PCI code that could lead to
   crashes.

 - A change to our perf interrupt accounting to avoid soft lockups when
   using some events, found by syzkaller.

 - A change in the way we handle power loss events from the hypervisor
   on pseries. We no longer immediately shut down if we're told we're
   running on a UPS.

 - A few other minor fixes.

Thanks to Alexey Kardashevskiy, Andreas Schwab, Aneesh Kumar K.V, Anju T
Sudhakar, Athira Rajeev, Christophe Leroy, Frederic Barrat, Greg Kurz,
Kajol Jain, Madhavan Srinivasan, Michael Neuling, Michael Roth,
Nageswara R Sastry, Oliver O'Halloran, Thiago Jung Bauermann,
Vaidyanathan Srinivasan, Vasant Hegde.

* tag 'powerpc-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/perf/hv-24x7: Move cpumask file to top folder of hv-24x7 driver
  powerpc/32s: Fix module loading failure when VMALLOC_END is over 0xf0000000
  powerpc/pseries: Do not initiate shutdown when system is running on UPS
  powerpc/perf: Fix soft lockups due to missed interrupt accounting
  powerpc/powernv/pci: Fix possible crash when releasing DMA resources
  powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death
  powerpc/32s: Fix is_module_segment() when MODULES_VADDR is defined
  powerpc/kasan: Fix KASAN_SHADOW_START on BOOK3S_32
  powerpc/fixmap: Fix the size of the early debug area
  powerpc/pkeys: Fix build error with PPC_MEM_KEYS disabled
  powerpc/kernel: Cleanup machine check function declarations
  powerpc: Add POWER10 raw mode cputable entry
  powerpc/perf: Add extended regs support for power10 platform
  powerpc/perf: Add support for outputting extended regs in perf intr_regs
  powerpc: Fix P10 PVR revision in /proc/cpuinfo for SMT4 cores

5 years agoMerge tag 'x86-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 23 Aug 2020 18:21:16 +0000 (11:21 -0700)]
Merge tag 'x86-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Thomas Gleixner:
 "A single fix for x86 which removes the RDPID usage from the paranoid
  entry path and unconditionally uses LSL to retrieve the CPU number.

  RDPID depends on MSR_TSX_AUX. KVM has an optmization to avoid
  expensive MRS read/writes on VMENTER/EXIT. It caches the MSR values
  and restores them either when leaving the run loop, on preemption or
  when going out to user space. MSR_TSX_AUX is part of that lazy MSR
  set, so after writing the guest value and before the lazy restore any
  exception using the paranoid entry will read the guest value and use
  it as CPU number to retrieve the GSBASE value for the current CPU when
  FSGSBASE is enabled. As RDPID is only used in that particular entry
  path, there is no reason to burden VMENTER/EXIT with two extra MSR
  writes. Remove the RDPID optimization, which is not even backed by
  numbers from the paranoid entry path instead"

* tag 'x86-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/entry/64: Do not use RDPID in paranoid entry to accomodate KVM