www.infradead.org Git - users/willy/linux.git/log

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Conflicting commits, all resolutions pretty trivial:

drivers/bus/mhi/pci_generic.c
  5c2c85315948 ("bus: mhi: pci-generic: configurable network interface MRU")
  56f6f4c4eb2a ("bus: mhi: pci_generic: Apply no-op for wake using sideband wake boolean")

drivers/nfc/s3fwrn5/firmware.c
  a0302ff5906a ("nfc: s3fwrn5: remove unnecessary label")
  46573e3ab08f ("nfc: s3fwrn5: fix undefined parameter values in dev_err()")
  801e541c79bb ("nfc: s3fwrn5: fix undefined parameter values in dev_err()")

MAINTAINERS
  7d901a1e878a ("net: phy: add Maxlinear GPY115/21x/24x driver")
  8a7b46fa7902 ("MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'net-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.14-rc4, including fixes from bpf, can, WiFi
  (mac80211) and netfilter trees.

  Current release - regressions:

   - mac80211: fix starting aggregation sessions on mesh interfaces

  Current release - new code bugs:

   - sctp: send pmtu probe only if packet loss in Search Complete state

   - bnxt_en: add missing periodic PHC overflow check

   - devlink: fix phys_port_name of virtual port and merge error

   - hns3: change the method of obtaining default ptp cycle

   - can: mcba_usb_start(): add missing urb->transfer_dma initialization

  Previous releases - regressions:

   - set true network header for ECN decapsulation

   - mlx5e: RX, avoid possible data corruption w/ relaxed ordering and
     LRO

   - phy: re-add check for PHY_BRCM_DIS_TXCRXC_NOENRGY on the BCM54811
     PHY

   - sctp: fix return value check in __sctp_rcv_asconf_lookup

  Previous releases - always broken:

   - bpf:
       - more spectre corner case fixes, introduce a BPF nospec
         instruction for mitigating Spectre v4
       - fix OOB read when printing XDP link fdinfo
       - sockmap: fix cleanup related races

   - mac80211: fix enabling 4-address mode on a sta vif after assoc

   - can:
       - raw: raw_setsockopt(): fix raw_rcv panic for sock UAF
       - j1939: j1939_session_deactivate(): clarify lifetime of session
         object, avoid UAF
       - fix number of identical memory leaks in USB drivers

   - tipc:
       - do not blindly write skb_shinfo frags when doing decryption
       - fix sleeping in tipc accept routine"

* tag 'net-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits)
  gve: Update MAINTAINERS list
  can: esd_usb2: fix memory leak
  can: ems_usb: fix memory leak
  can: usb_8dev: fix memory leak
  can: mcba_usb_start(): add missing urb->transfer_dma initialization
  can: hi311x: fix a signedness bug in hi3110_cmd()
  MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver
  bpf: Fix leakage due to insufficient speculative store bypass mitigation
  bpf: Introduce BPF nospec instruction for mitigating Spectre v4
  sis900: Fix missing pci_disable_device() in probe and remove
  net: let flow have same hash in two directions
  nfc: nfcsim: fix use after free during module unload
  tulip: windbond-840: Fix missing pci_disable_device() in probe and remove
  sctp: fix return value check in __sctp_rcv_asconf_lookup
  nfc: s3fwrn5: fix undefined parameter values in dev_err()
  net/mlx5: Fix mlx5_vport_tbl_attr chain from u16 to u32
  net/mlx5e: Fix nullptr in mlx5e_hairpin_get_mdev()
  net/mlx5: Unload device upon firmware fatal error
  net/mlx5e: Fix page allocation failure for ptp-RQ over SF
  net/mlx5e: Fix page allocation failure for trap-RQ over SF
  ...

Merge tag 'acpi-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
"These revert a recent IRQ resources handling modification that turned
  out to be problematic, fix suspend-to-idle handling on AMD platforms
  to take upcoming systems into account properly and fix the retrieval
  of the DPTF attributes of the PCH FIVR.

  Specifics:

   - Revert recent change of the ACPI IRQ resources handling that
     attempted to improve the ACPI IRQ override selection logic, but
     introduced serious regressions on some systems (Hui Wang).

   - Fix up quirks for AMD platforms in the suspend-to-idle support code
     so as to take upcoming systems using uPEP HID AMDI007 into account
     as appropriate (Mario Limonciello).

   - Fix the code retrieving DPTF attributes of the PCH FIVR so that it
     agrees on the return data type with the ACPI control method
     evaluated for this purpose (Srinivas Pandruvada)"

* tag 'acpi-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: DPTF: Fix reading of attributes
  Revert "ACPI: resources: Add checks for ACPI IRQ override"
  ACPI: PM: Add support for upcoming AMD uPEP HID AMDI007

pipe: make pipe writes always wake up readers

Since commit 1b6b26ae7053 ("pipe: fix and clarify pipe write wakeup
logic") we have sanitized the pipe write logic, and would only try to
wake up readers if they needed it.

In particular, if the pipe already had data in it before the write,
there was no point in trying to wake up a reader, since any existing
readers must have been aware of the pre-existing data already.  Doing
extraneous wakeups will only cause potential thundering herd problems.

However, it turns out that some Android libraries have misused the EPOLL
interface, and expected "edge triggered" be to "any new write will
trigger it".  Even if there was no edge in sight.

Quoting Sandeep Patil:
"The commit 1b6b26ae7053 ('pipe: fix and clarify pipe write wakeup
  logic') changed pipe write logic to wakeup readers only if the pipe
  was empty at the time of write. However, there are libraries that
  relied upon the older behavior for notification scheme similar to
  what's described in [1]

  One such library 'realm-core'[2] is used by numerous Android
  applications. The library uses a similar notification mechanism as GNU
  Make but it never drains the pipe until it is full. When Android moved
  to v5.10 kernel, all applications using this library stopped working.

  The library has since been fixed[3] but it will be a while before all
  applications incorporate the updated library"

Our regression rule for the kernel is that if applications break from
new behavior, it's a regression, even if it was because the application
did something patently wrong.  Also note the original report [4] by
Michal Kerrisk about a test for this epoll behavior - but at that point
we didn't know of any actual broken use case.

So add the extraneous wakeup, to approximate the old behavior.

[ I say "approximate", because the exact old behavior was to do a wakeup
  not for each write(), but for each pipe buffer chunk that was filled
  in. The behavior introduced by this change is not that - this is just
  "every write will cause a wakeup, whether necessary or not", which
  seems to be sufficient for the broken library use. ]

It's worth noting that this adds the extraneous wakeup only for the
write side, while the read side still considers the "edge" to be purely
about reading enough from the pipe to allow further writes.

See commit f467a6a66419 ("pipe: fix and clarify pipe read wakeup logic")
for the pipe read case, which remains that "only wake up if the pipe was
full, and we read something from it".

Link: https://lore.kernel.org/lkml/CAHk-=wjeG0q1vgzu4iJhW5juPkTsjTYmiqiMUYAebWW+0bam6w@mail.gmail.com/
Link: https://github.com/realm/realm-core
Link: https://github.com/realm/realm-core/issues/4666
Link: https://lore.kernel.org/lkml/CAKgNAkjMBGeAwF=2MKK758BhxvW58wYTgYKB2V-gY1PwXxrH+Q@mail.gmail.com/
Link: https://lore.kernel.org/lkml/20210729222635.2937453-1-sspatil@android.com/
Reported-by: Sandeep Patil <sspatil@android.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'clean-devlink-net-namespace-operations'

Leon Romanovsky says:

====================
Clean devlink net namespace operations

This short series continues my work on devlink core code to make devlink
reload less prone to errors and harden it from API abuse.

Despite first patch being a clear fix, I would ask you to apply it to
net-next anyway, because the fixed patch is anyway old and it will
help us to eliminate merge conflicts that will arise for following
patches or even for the second one.
====================

Link: https://lore.kernel.org/r/cover.1627578998.git.leonro@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

devlink: Allocate devlink directly in requested net namespace

There is no need in extra call indirection and check from impossible
flow where someone tries to set namespace without prior call
to devlink_alloc().

Instead of this extra logic and additional EXPORT_SYMBOL, use specialized
devlink allocation function that receives net namespace as an argument.

Such specialized API allows clear view when devlink initialized in wrong
net namespace and/or kernel users don't try to change devlink namespace
under the hood.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

devlink: Break parameter notification sequence to be before/after unload/load driver

The change of namespaces during devlink reload calls to driver unload
before it accesses devlink parameters. The commands below causes to
use-after-free bug when trying to get flow steering mode.

* ip netns add n1
* devlink dev reload pci/0000:00:09.0 netns n1

==================================================================
BUG: KASAN: use-after-free in mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core]
Read of size 4 at addr ffff888009d04308 by task devlink/275

CPU: 6 PID: 275 Comm: devlink Not tainted 5.12.0-rc2+ #2853
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
  dump_stack+0x93/0xc2
  print_address_description.constprop.0+0x18/0x140
  ? mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core]
  ? mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core]
  kasan_report.cold+0x7c/0xd8
  ? mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core]
  mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core]
  devlink_nl_param_fill+0x1c8/0xe80
  ? __free_pages_ok+0x37a/0x8a0
  ? devlink_flash_update_timeout_notify+0xd0/0xd0
  ? lock_acquire+0x1a9/0x6d0
  ? fs_reclaim_acquire+0xb7/0x160
  ? lock_is_held_type+0x98/0x110
  ? 0xffffffff81000000
  ? lock_release+0x1f9/0x6c0
  ? fs_reclaim_release+0xa1/0xf0
  ? lock_downgrade+0x6d0/0x6d0
  ? lock_is_held_type+0x98/0x110
  ? lock_is_held_type+0x98/0x110
  ? memset+0x20/0x40
  ? __build_skb_around+0x1f8/0x2b0
  devlink_param_notify+0x6d/0x180
  devlink_reload+0x1c3/0x520
  ? devlink_remote_reload_actions_performed+0x30/0x30
  ? mutex_trylock+0x24b/0x2d0
  ? devlink_nl_cmd_reload+0x62b/0x1070
  devlink_nl_cmd_reload+0x66d/0x1070
  ? devlink_reload+0x520/0x520
  ? devlink_get_from_attrs+0x1bc/0x260
  ? devlink_nl_pre_doit+0x64/0x4d0
  genl_family_rcv_msg_doit+0x1e9/0x2f0
  ? mutex_lock_io_nested+0x1130/0x1130
  ? genl_family_rcv_msg_attrs_parse.constprop.0+0x240/0x240
  ? security_capable+0x51/0x90
  genl_rcv_msg+0x27f/0x4a0
  ? genl_get_cmd+0x3c0/0x3c0
  ? lock_acquire+0x1a9/0x6d0
  ? devlink_reload+0x520/0x520
  ? lock_release+0x6c0/0x6c0
  netlink_rcv_skb+0x11d/0x340
  ? genl_get_cmd+0x3c0/0x3c0
  ? netlink_ack+0x9f0/0x9f0
  ? lock_release+0x1f9/0x6c0
  genl_rcv+0x24/0x40
  netlink_unicast+0x433/0x700
  ? netlink_attachskb+0x730/0x730
  ? _copy_from_iter_full+0x178/0x650
  ? __alloc_skb+0x113/0x2b0
  netlink_sendmsg+0x6f1/0xbd0
  ? netlink_unicast+0x700/0x700
  ? lock_is_held_type+0x98/0x110
  ? netlink_unicast+0x700/0x700
  sock_sendmsg+0xb0/0xe0
  __sys_sendto+0x193/0x240
  ? __x64_sys_getpeername+0xb0/0xb0
  ? do_sys_openat2+0x10b/0x370
  ? __up_read+0x1a1/0x7b0
  ? do_user_addr_fault+0x219/0xdc0
  ? __x64_sys_openat+0x120/0x1d0
  ? __x64_sys_open+0x1a0/0x1a0
  __x64_sys_sendto+0xdd/0x1b0
  ? syscall_enter_from_user_mode+0x1d/0x50
  do_syscall_64+0x2d/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fc69d0af14a
Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 c3 0f 1f 44 00 00 55 48 83 ec 30 44 89 4c
RSP: 002b:00007ffc1d8292f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fc69d0af14a
RDX: 0000000000000038 RSI: 0000555f57c56440 RDI: 0000000000000003
RBP: 0000555f57c56410 R08: 00007fc69d17b200 R09: 000000000000000c
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

Allocated by task 146:
  kasan_save_stack+0x1b/0x40
  __kasan_kmalloc+0x99/0xc0
  mlx5_init_fs+0xf0/0x1c50 [mlx5_core]
  mlx5_load+0xd2/0x180 [mlx5_core]
  mlx5_init_one+0x2f6/0x450 [mlx5_core]
  probe_one+0x47d/0x6e0 [mlx5_core]
  pci_device_probe+0x2a0/0x4a0
  really_probe+0x20a/0xc90
  driver_probe_device+0xd8/0x380
  device_driver_attach+0x1df/0x250
  __driver_attach+0xff/0x240
  bus_for_each_dev+0x11e/0x1a0
  bus_add_driver+0x309/0x570
  driver_register+0x1ee/0x380
  0xffffffffa06b8062
  do_one_initcall+0xd5/0x410
  do_init_module+0x1c8/0x760
  load_module+0x6d8b/0x9650
  __do_sys_finit_module+0x118/0x1b0
  do_syscall_64+0x2d/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Freed by task 275:
  kasan_save_stack+0x1b/0x40
  kasan_set_track+0x1c/0x30
  kasan_set_free_info+0x20/0x30
  __kasan_slab_free+0x102/0x140
  slab_free_freelist_hook+0x74/0x1b0
  kfree+0xd7/0x2a0
  mlx5_unload+0x16/0xb0 [mlx5_core]
  mlx5_unload_one+0xae/0x120 [mlx5_core]
  mlx5_devlink_reload_down+0x1bc/0x380 [mlx5_core]
  devlink_reload+0x141/0x520
  devlink_nl_cmd_reload+0x66d/0x1070
  genl_family_rcv_msg_doit+0x1e9/0x2f0
  genl_rcv_msg+0x27f/0x4a0
  netlink_rcv_skb+0x11d/0x340
  genl_rcv+0x24/0x40
  netlink_unicast+0x433/0x700
  netlink_sendmsg+0x6f1/0xbd0
  sock_sendmsg+0xb0/0xe0
  __sys_sendto+0x193/0x240
  __x64_sys_sendto+0xdd/0x1b0
  do_syscall_64+0x2d/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xae

The buggy address belongs to the object at ffff888009d04300
  which belongs to the cache kmalloc-128 of size 128
The buggy address is located 8 bytes inside of
  128-byte region [ffff888009d04300, ffff888009d04380)
The buggy address belongs to the page:
page:0000000086a64ecc refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888009d04000 pfn:0x9d04
head:0000000086a64ecc order:1 compound_mapcount:0
flags: 0x4000000000010200(slab|head)
raw: 4000000000010200 ffffea0000203980 0000000200000002 ffff8880050428c0
raw: ffff888009d04000 000000008020001d 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  ffff888009d04200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff888009d04280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888009d04300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                       ^
  ffff888009d04380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffff888009d04400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

The right solution to devlink reload is to notify about deletion of
parameters, unload driver, change net namespaces, load driver and notify
about addition of parameters.

Fixes: 070c63f20f6c ("net: devlink: allow to change namespaces during reload")
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

sk_buff: avoid potentially clearing 'slow_gro' field

If skb_dst_set_noref() is invoked with a NULL dst, the 'slow_gro'
field is cleared, too. That could lead to wrong behavior if
the skb later enters the GRO stage.

Fix the potential issue replacing preserving a non-zero value of
the 'slow_gro' field.

Additionally, fix a comment typo.

Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Reported-by: Jakub Kicinski <kuba@kernel.org>
Fixes: 8a886b142bd0 ("sk_buff: track dst status in slow_gro")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/aa42529252dc8bb02bd42e8629427040d1058537.1627662501.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branches 'acpi-resources' and 'acpi-dptf'

* acpi-resources:
Revert "ACPI: resources: Add checks for ACPI IRQ override"

* acpi-dptf:
ACPI: DPTF: Fix reading of attributes

Merge tag 'block-5.14-2021-07-30' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

- gendisk freeing fix (Christoph)

- blk-iocost wake ordering fix (Tejun)

- tag allocation error handling fix (John)

- loop locking fix. While this isn't the prettiest fix in the world,
   nobody has any good alternatives for 5.14. Something to likely
   revisit for 5.15. (Tetsuo)

* tag 'block-5.14-2021-07-30' of git://git.kernel.dk/linux-block:
  block: delay freeing the gendisk
  blk-iocost: fix operation ordering in iocg_wake_fn()
  blk-mq-sched: Fix blk_mq_sched_alloc_tags() error handling
  loop: reintroduce global lock for safe loop_validate_file() traversal

Merge tag 'io_uring-5.14-2021-07-30' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:

- A fix for block backed reissue (me)

- Reissue context hardening (me)

- Async link locking fix (Pavel)

* tag 'io_uring-5.14-2021-07-30' of git://git.kernel.dk/linux-block:
  io_uring: fix poll requests leaking second poll entries
  io_uring: don't block level reissue off completion path
  io_uring: always reissue from task_work context
  io_uring: fix race in unified task_work running
  io_uring: fix io_prep_async_link locking

Merge tag 'libata-5.14-2021-07-30' of git://git.kernel.dk/linux-block

Pull libata fixlets from Jens Axboe:

- A fix for PIO highmem (Christoph)

- Kill HAVE_IDE as it's now unused (Lukas)

* tag 'libata-5.14-2021-07-30' of git://git.kernel.dk/linux-block:
arch: Kconfig: clean up obsolete use of HAVE_IDE
libata: fix ata_pio_sector for CONFIG_HIGHMEM

Merge tag 'for-5.14-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

- fix -Warray-bounds warning, to help external patchset to make it
   default treewide

- fix writeable device accounting (syzbot report)

- fix fsync and log replay after a rename and inode eviction

- fix potentially lost error code when submitting multiple bios for
   compressed range

* tag 'for-5.14-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: calculate number of eb pages properly in csum_tree_block
  btrfs: fix rw device counting in __btrfs_free_extra_devids
  btrfs: fix lost inode on log replay after mix of fsync, rename and inode eviction
  btrfs: mark compressed range uptodate only if all bio succeed

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:

- resume timing fix for intel-ish driver (Ye Xiang)

- fix for using incorrect MMIO register in amd_sfh driver (Dylan
   MacKenzie)

- Cintiq 24HDT / 27QHDT regression fix and touch processing fix for
   Wacom driver (Jason Gerecke)

- device removal bugfix for ft260 driver (Michael Zaidman)

- other small assorted fixes

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: ft260: fix device removal due to USB disconnect
  HID: wacom: Skip processing of touches with negative slot values
  HID: wacom: Re-enable touch by default for Cintiq 24HDT / 27QHDT
  HID: Kconfig: Fix spelling mistake "Uninterruptable" -> "Uninterruptible"
  HID: apple: Add support for Keychron K1 wireless keyboard
  HID: fix typo in Kconfig
  HID: ft260: fix format type warning in ft260_word_show()
  HID: amd_sfh: Use correct MMIO register for DMA address
  HID: asus: Remove check for same LED brightness on set
  HID: intel-ish-hid: use async resume function

Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
"7 patches.

  Subsystems affected by this patch series: lib, ocfs2, and mm (slub,
  migration, and memcg)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()
  slub: fix unreclaimable slab stat for bulk free
  mm/migrate: fix NR_ISOLATED corruption on 64-bit
  mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code
  ocfs2: issue zeroout to EOF blocks
  ocfs2: fix zero out valid data
  lib/test_string.c: move string selftest in the Runtime Testing menu

Merge tag 'linux-can-fixes-for-5.14-20210730' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2021-07-30

The first patch is by me and adds Yasushi SHOJI as a reviewer for the
Microchip CAN BUS Analyzer Tool driver.

Dan Carpenter's patch fixes a signedness bug in the hi311x driver.

Pavel Skripkin provides 4 patches, the first targets the mcba_usb
driver by adding the missing urb->transfer_dma initialization, which
was broken in a previous commit. The last 3 patches fix a memory leak
in the usb_8dev, ems_usb and esd_usb2 driver.

* tag 'linux-can-fixes-for-5.14-20210730' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: esd_usb2: fix memory leak
  can: ems_usb: fix memory leak
  can: usb_8dev: fix memory leak
  can: mcba_usb_start(): add missing urb->transfer_dma initialization
  can: hi311x: fix a signedness bug in hi3110_cmd()
  MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver
====================

Link: https://lore.kernel.org/r/20210730070526.1699867-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()

When I use kfree_rcu() to free a large memory allocated by kmalloc_node(),
the following dump occurs.

  BUG: kernel NULL pointer dereference, address: 0000000000000020
  [...]
  Oops: 0000 [#1] SMP
  [...]
  Workqueue: events kfree_rcu_work
  RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
  RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
  RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
  [...]
  Call Trace:
    kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
    kfree_bulk include/linux/slab.h:413 [inline]
    kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
    process_one_work+0x207/0x530 kernel/workqueue.c:2276
    worker_thread+0x320/0x610 kernel/workqueue.c:2422
    kthread+0x13d/0x160 kernel/kthread.c:313
    ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

When kmalloc_node() a large memory, page is allocated, not slab, so when
freeing memory via kfree_rcu(), this large memory should not be used by
memcg_slab_free_hook(), because memcg_slab_free_hook() is is used for
slab.

Using page_objcgs_check() instead of page_objcgs() in
memcg_slab_free_hook() to fix this bug.

Link: https://lkml.kernel.org/r/20210728145655.274476-1-wanghai38@huawei.com
Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

slub: fix unreclaimable slab stat for bulk free

SLUB uses page allocator for higher order allocations and update
unreclaimable slab stat for such allocations.  At the moment, the bulk
free for SLUB does not share code with normal free code path for these
type of allocations and have missed the stat update.  So, fix the stat
update by common code.  The user visible impact of the bug is the
potential of inconsistent unreclaimable slab stat visible through
meminfo and vmstat.

Link: https://lkml.kernel.org/r/20210728155354.3440560-1-shakeelb@google.com
Fixes: 6a486c0ad4dc ("mm, sl[ou]b: improve memory accounting")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/migrate: fix NR_ISOLATED corruption on 64-bit

Similar to commit 2da9f6305f30 ("mm/vmscan: fix NR_ISOLATED_FILE
corruption on 64-bit") avoid using unsigned int for nr_pages. With
unsigned int type the large unsigned int converts to a large positive
signed long.

Symptoms include CMA allocations hanging forever due to
alloc_contig_range->...->isolate_migratepages_block waiting forever in
"while (unlikely(too_many_isolated(pgdat)))".

Link: https://lkml.kernel.org/r/20210728042531.359409-1-aneesh.kumar@linux.ibm.com
Fixes: c5fc5c3ae0c8 ("mm: migrate: account THP NUMA migration counters correctly")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code

Dan Carpenter reports:

    The patch 2d146aa3aa84: "mm: memcontrol: switch to rstat" from Apr
    29, 2021, leads to the following static checker warning:

    kernel/cgroup/rstat.c:200 cgroup_rstat_flush()
    warn: sleeping in atomic context

    mm/memcontrol.c
      3572  static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
      3573  {
      3574          unsigned long val;
      3575
      3576          if (mem_cgroup_is_root(memcg)) {
      3577                  cgroup_rstat_flush(memcg->css.cgroup);
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    This is from static analysis and potentially a false positive.  The
    problem is that mem_cgroup_usage() is called from __mem_cgroup_threshold()
    which holds an rcu_read_lock().  And the cgroup_rstat_flush() function
    can sleep.

      3578                  val = memcg_page_state(memcg, NR_FILE_PAGES) +
      3579                          memcg_page_state(memcg, NR_ANON_MAPPED);
      3580                  if (swap)
      3581                          val += memcg_page_state(memcg, MEMCG_SWAP);
      3582          } else {
      3583                  if (!swap)
      3584                          val = page_counter_read(&memcg->memory);
      3585                  else
      3586                          val = page_counter_read(&memcg->memsw);
      3587          }
      3588          return val;
      3589  }

__mem_cgroup_threshold() indeed holds the rcu lock.  In addition, the
thresholding code is invoked during stat changes, and those contexts
have irqs disabled as well.  If the lock breaking occurs inside the
flush function, it will result in a sleep from an atomic context.

Use the irqsafe flushing variant in mem_cgroup_usage() to fix this.

Link: https://lkml.kernel.org/r/20210726150019.251820-1-hannes@cmpxchg.org
Fixes: 2d146aa3aa84 ("mm: memcontrol: switch to rstat")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Chris Down <chris@chrisdown.name>
Reviewed-by: Rik van Riel <riel@surriel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2: issue zeroout to EOF blocks

For punch holes in EOF blocks, fallocate used buffer write to zero the
EOF blocks in last cluster.  But since ->writepage will ignore EOF
pages, those zeros will not be flushed.

This "looks" ok as commit 6bba4471f0cc ("ocfs2: fix data corruption by
fallocate") will zero the EOF blocks when extend the file size, but it
isn't.  The problem happened on those EOF pages, before writeback, those
pages had DIRTY flag set and all buffer_head in them also had DIRTY flag
set, when writeback run by write_cache_pages(), DIRTY flag on the page
was cleared, but DIRTY flag on the buffer_head not.

When next write happened to those EOF pages, since buffer_head already
had DIRTY flag set, it would not mark page DIRTY again.  That made
writeback ignore them forever.  That will cause data corruption.  Even
directio write can't work because it will fail when trying to drop pages
caches before direct io, as it found the buffer_head for those pages
still had DIRTY flag set, then it will fall back to buffer io mode.

To make a summary of the issue, as writeback ingores EOF pages, once any
EOF page is generated, any write to it will only go to the page cache,
it will never be flushed to disk even file size extends and that page is
not EOF page any more.  The fix is to avoid zero EOF blocks with buffer
write.

The following code snippet from qemu-img could trigger the corruption.

  656   open("6b3711ae-3306-4bdd-823c-cf1c0060a095.conv.2", O_RDWR|O_DIRECT|O_CLOEXEC) = 11
  ...
  660   fallocate(11, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2275868672, 327680 <unfinished ...>
  660   fallocate(11, 0, 2275868672, 327680) = 0
  658   pwrite64(11, "

Link: https://lkml.kernel.org/r/20210722054923.24389-2-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2: fix zero out valid data

If append-dio feature is enabled, direct-io write and fallocate could
run in parallel to extend file size, fallocate used "orig_isize" to
record i_size before taking "ip_alloc_sem", when
ocfs2_zeroout_partial_cluster() zeroout EOF blocks, i_size maybe already
extended by ocfs2_dio_end_io_write(), that will cause valid data zeroed
out.

Link: https://lkml.kernel.org/r/20210722054923.24389-1-junxiao.bi@oracle.com
Fixes: 6bba4471f0cc ("ocfs2: fix data corruption by fallocate")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lib/test_string.c: move string selftest in the Runtime Testing menu

STRING_SELFTEST is presented in the "Library routines" menu.  Move it in
Kernel hacking > Kernel Testing and Coverage > Runtime Testing together
with other similar tests found in lib/

--- Runtime Testing
<*>   Test functions located in the hexdump module at runtime
<*>   Test string functions (NEW)
<*>   Test functions located in the string_helpers module at runtime
<*>   Test strscpy*() family of functions at runtime
<*>   Test kstrto*() family of functions at runtime
<*>   Test printf() family of functions at runtime
<*>   Test scanf() family of functions at runtime

Link: https://lkml.kernel.org/r/20210719185158.190371-1-mcroce@linux.microsoft.com
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Cc: Peter Rosin <peda@axentia.se>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gve: Update MAINTAINERS list

The team maintaining the gve driver has undergone some changes,
this updates the MAINTAINERS file accordingly.

Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: Jon Olson <jonolson@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Link: https://lore.kernel.org/r/20210729155258.442650-1-csully@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: netlink: Remove unused function

lockdep_genl_is_held() and its caller arm not used now, just remove them.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Link: https://lore.kernel.org/r/20210729074854.8968-1-yajun.deng@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'nfc-constify-pointed-data-missed-part'

Krzysztof Kozlowski says:

====================
nfc: constify pointed data - missed part

This was previously sent [1] but got lost. It was a prerequisite to part two of NFC const [2].

Changes since v2:
1. Drop patch previously 7/8 which cases new warnings "warning: Using
plain integer as NULL pointer".

Changes since v1:
1. Add patch 1/8 fixing up nfcmrvl_spi_parse_dt()

[1] https://lore.kernel.org/lkml/20210726145224.146006-1-krzysztof.kozlowski@canonical.com/
[2] https://lore.kernel.org/linux-nfc/20210729104022.47761-1-krzysztof.kozlowski@canonical.com/T/#m199fbdde180fa005a10addf28479fcbdc6263eab
====================

Link: https://lore.kernel.org/r/20210730144202.255890-1-krzysztof.kozlowski@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nfc: hci: cleanup unneeded spaces

No need for multiple spaces in variable declaration (the code does not
use them in other places). No functional change.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nfc: nci: constify several pointers to u8, sk_buff and other structs

Several functions receive pointers to u8, sk_buff or other structs but
do not modify the contents so make them const.  This allows doing the
same for local variables and in total makes the code a little bit safer.

This makes const also data passed as "unsigned long opt" argument to
nci_request() function.  Usual flow for such functions is:
1. Receive "u8 *" and store it (the pointer) in a structure
   allocated on stack (e.g. struct nci_set_config_param),
2. Call nci_request() or __nci_request() passing a callback function an
   the pointer to the structure via an "unsigned long opt",
3. nci_request() calls the callback which dereferences "unsigned long
   opt" in a read-only way.

This converts all above paths to use proper pointer to const data, so
entire flow is safer.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nfc: constify local pointer variables

Few pointers to struct nfc_target and struct nfc_se can be made const.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nfc: constify several pointers to u8, char and sk_buff

Several functions receive pointers to u8, char or sk_buff but do not
modify the contents so make them const. This allows doing the same for
local variables and in total makes the code a little bit safer.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nfc: hci: annotate nfc_llc_init() as __init

The nfc_llc_init() is used only in other __init annotated context.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nfc: annotate af_nfc_exit() as __exit

The af_nfc_exit() is used only in other __exit annotated context
(nfc_exit()).

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nfc: mrvl: correct nfcmrvl_spi_parse_dt() device_node argument

The device_node in nfcmrvl_spi_parse_dt() cannot be const as it is
passed to OF functions which modify it.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

arch: Kconfig: clean up obsolete use of HAVE_IDE

The arch-specific Kconfig files use HAVE_IDE to indicate if IDE is
supported.

As IDE support and the HAVE_IDE config vanishes with commit b7fb14d3ac63
("ide: remove the legacy ide driver"), there is no need to mention
HAVE_IDE in all those arch-specific Kconfig files.

The issue was identified with ./scripts/checkkconfigsymbols.py.

Fixes: b7fb14d3ac63 ("ide: remove the legacy ide driver")
Suggested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20210728182115.4401-1-lukas.bulwahn@gmail.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

net: convert fib_treeref from int to refcount_t

refcount_t type should be used instead of int when fib_treeref is used as
a reference counter,and avoid use-after-free risks.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210729071350.28919-1-yajun.deng@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

can: esd_usb2: fix memory leak

In esd_usb2_setup_rx_urbs() MAX_RX_URBS coherent buffers are allocated
and there is nothing, that frees them:

1) In callback function the urb is resubmitted and that's all
2) In disconnect function urbs are simply killed, but URB_FREE_BUFFER
is not set (see esd_usb2_setup_rx_urbs) and this flag cannot be used
with coherent buffers.

So, all allocated buffers should be freed with usb_free_coherent()
explicitly.

Side note: This code looks like a copy-paste of other can drivers. The
same patch was applied to mcba_usb driver and it works nice with real
hardware. There is no change in functionality, only clean-up code for
coherent buffers.

Fixes: 96d8e90382dc ("can: Add driver for esd CAN-USB/2 device")
Link: https://lore.kernel.org/r/b31b096926dcb35998ad0271aac4b51770ca7cc8.1627404470.git.paskripkin@gmail.com
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: ems_usb: fix memory leak

In ems_usb_start() MAX_RX_URBS coherent buffers are allocated and
there is nothing, that frees them:

1) In callback function the urb is resubmitted and that's all
2) In disconnect function urbs are simply killed, but URB_FREE_BUFFER
is not set (see ems_usb_start) and this flag cannot be used with
coherent buffers.

So, all allocated buffers should be freed with usb_free_coherent()
explicitly.

Side note: This code looks like a copy-paste of other can drivers. The
same patch was applied to mcba_usb driver and it works nice with real
hardware. There is no change in functionality, only clean-up code for
coherent buffers.

Fixes: 702171adeed3 ("ems_usb: Added support for EMS CPC-USB/ARM7 CAN/USB interface")
Link: https://lore.kernel.org/r/59aa9fbc9a8cbf9af2bbd2f61a659c480b415800.1627404470.git.paskripkin@gmail.com
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: usb_8dev: fix memory leak

In usb_8dev_start() MAX_RX_URBS coherent buffers are allocated and
there is nothing, that frees them:

1) In callback function the urb is resubmitted and that's all
2) In disconnect function urbs are simply killed, but URB_FREE_BUFFER
is not set (see usb_8dev_start) and this flag cannot be used with
coherent buffers.

So, all allocated buffers should be freed with usb_free_coherent()
explicitly.

Side note: This code looks like a copy-paste of other can drivers. The
same patch was applied to mcba_usb driver and it works nice with real
hardware. There is no change in functionality, only clean-up code for
coherent buffers.

Fixes: 0024d8ad1639 ("can: usb_8dev: Add support for USB2CAN interface from 8 devices")
Link: https://lore.kernel.org/r/d39b458cd425a1cf7f512f340224e6e9563b07bd.1627404470.git.paskripkin@gmail.com
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: mcba_usb_start(): add missing urb->transfer_dma initialization

Yasushi reported, that his Microchip CAN Analyzer stopped working
since commit 91c02557174b ("can: mcba_usb: fix memory leak in
mcba_usb"). The problem was in missing urb->transfer_dma
initialization.

In my previous patch to this driver I refactored mcba_usb_start() code
to avoid leaking usb coherent buffers. To archive it, I passed local
stack variable to usb_alloc_coherent() and then saved it to private
array to correctly free all coherent buffers on ->close() call. But I
forgot to initialize urb->transfer_dma with variable passed to
usb_alloc_coherent().

All of this was causing device to not work, since dma addr 0 is not
valid and following log can be found on bug report page, which points
exactly to problem described above.

| DMAR: [DMA Write] Request device [00:14.0] PASID ffffffff fault addr 0 [fault reason 05] PTE Write access is not set

Fixes: 91c02557174b ("can: mcba_usb: fix memory leak in mcba_usb")
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=990850
Link: https://lore.kernel.org/r/20210725103630.23864-1-paskripkin@gmail.com
Cc: linux-stable <stable@vger.kernel.org>
Reported-by: Yasushi SHOJI <yasushi.shoji@gmail.com>
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Tested-by: Yasushi SHOJI <yashi@spacecubics.com>
[mkl: fixed typos in commit message - thanks Yasushi SHOJI]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: hi311x: fix a signedness bug in hi3110_cmd()

The hi3110_cmd() is supposed to return zero on success and negative
error codes on failure, but it was accidentally declared as a u8 when
it needs to be an int type.

Fixes: 57e83fb9b746 ("can: hi311x: Add Holt HI-311x CAN driver")
Link: https://lore.kernel.org/r/20210729141246.GA1267@kili
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver

This patch adds Yasushi SHOJI as a reviewer for the Microchip CAN BUS
Analyzer Tool driver.

Link: https://lore.kernel.org/r/20210726111619.1023991-1-mkl@pengutronix.de
Acked-by: Yasushi SHOJI <yashi@spacecubics.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

Merge tag 'drm-fixes-2021-07-30' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
"Regular drm fixes pull, seems about the right size, lots of small
  fixes across the board, mostly amdgpu, but msm and i915 are in there
  along with panel and ttm.

  amdgpu:
   - Fix resource leak in an error path
   - Avoid stack contents exposure in error path
   - pmops check fix for S0ix vs S3
   - DCN 2.1 display fixes
   - DCN 2.0 display fix
   - Backlight control fix for laptops with HDR panels
   - Maintainers updates

  i915:
   - Fix vbt port mask
   - Fix around reading the right DSC disable fuse in display_ver 10
   - Split display version 9 and 10 in intel_setup_outputs

  msm:
   - iommu fault display fix
   - misc dp compliance fixes
   - dpu reg sizing fix

  panel:
   - Fix bpc for ytc700tlag_05_201c

  ttm:
   - debugfs init fixes"

* tag 'drm-fixes-2021-07-30' of git://anongit.freedesktop.org/drm/drm:
  maintainers: add bugs and chat URLs for amdgpu
  drm/amdgpu/display: only enable aux backlight control for OLED panels
  drm/amd/display: ensure dentist display clock update finished in DCN20
  drm/amd/display: Add missing DCN21 IP parameter
  drm/amd/display: Guard DST_Y_PREFETCH register overflow in DCN21
  drm/amdgpu: Check pmops for desired suspend state
  drm/msm/dp: Initialize dp->aux->drm_dev before registration
  drm/msm/dp: signal audio plugged change at dp_pm_resume
  drm/msm/dp: Initialize the INTF_CONFIG register
  drm/msm/dp: use dp_ctrl_off_link_stream during PHY compliance test run
  drm/msm: Fix display fault handling
  drm/msm/dpu: Fix sm8250_mdp register length
  drm/amdgpu: Avoid printing of stack contents on firmware load error
  drm/amdgpu: Fix resource leak on probe error path
  drm/i915/display: split DISPLAY_VER 9 and 10 in intel_setup_outputs()
  drm/i915: fix not reading DSC disable fuse in GLK
  drm/i915/bios: Fix ports mask
  drm/panel: panel-simple: Fix proper bpc for ytc700tlag_05_201c
  drm/ttm: Initialize debugfs from ttm_global_init()

Merge tag 'fallthrough-fixes-clang-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux

Pull fallthrough fixes from Gustavo Silva:
"Fix some fall-through warnings when building with Clang and
  '-Wimplicit-fallthrough' on ARM"

* tag 'fallthrough-fixes-clang-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
  scsi: fas216: Fix fall-through warning for Clang
  scsi: acornscsi: Fix fall-through warning for clang
  ARM: riscpc: Fix fall-through warning for Clang

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha

Pull alpha updates from Matt Turner:
"They're mostly small janitorial fixes but there's also more important
  ones:

   - drop the alpha-specific x86 binary loader (David Hildenbrand)

   - regression fix for at least Marvel platforms (Mike Rapoport)

   - fix for a scary-looking typo (Zheng Yongjun)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha:
  alpha: register early reserved memory in memblock
  alpha: fix spelling mistakes
  alpha: Remove space between * and parameter name
  alpha: fp_emul: avoid init/cleanup_module names
  alpha: Add syscall_get_return_value()
  binfmt: remove support for em86 (alpha only)
  alpha: fix typos in a comment
  alpha: defconfig: add necessary configs for boot testing
  alpha: Send stop IPI to send to online CPUs
  alpha: convert comma to semicolon
  alpha: remove undef inline in compiler.h
  alpha: Kconfig: Replace HTTP links with HTTPS ones
  alpha: __udiv_qrnnd should be exported

bcm63xx_enet: delete a redundant assignment

In the function bcm_enetsw_probe(), 'ret' will be assigned by
bcm_enet_change_mtu(), so 'ret = 0' make no sense.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: don't set skb->offload_fwd_mark when not offloading the bridge

DSA has gained the recent ability to deal gracefully with upper
interfaces it cannot offload, such as the bridge, bonding or team
drivers. When such uppers exist, the ports are still in standalone mode
as far as the hardware is concerned.

But when we deliver packets to the software bridge in order for that to
do the forwarding, there is an unpleasant surprise in that the bridge
will refuse to forward them. This is because we unconditionally set
skb->offload_fwd_mark = true, meaning that the bridge thinks the frames
were already forwarded in hardware by us.

Since dp->bridge_dev is populated only when there is hardware offload
for it, but not in the software fallback case, let's introduce a new
helper that can be called from the tagger data path which sets the
skb->offload_fwd_mark accordingly to zero when there is no hardware
offload for bridging. This lets the bridge forward packets back to other
interfaces of our switch, if needed.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipvlan: Add handling of NETDEV_UP events

When an ipvlan device is created on a bond device, the link state
of the ipvlan device may be abnormal. This is because bonding device
allows to add physical network card device in the down state and so
NETDEV_CHANGE event will not be notified to other listeners, so ipvlan
has no chance to update its link status.

The following steps can cause such problems:
1) bond0 is down
2) ip link add link bond0 name ipvlan type ipvlan mode l2
3) echo +enp2s7 >/sys/class/net/bond0/bonding/slaves
4) ip link set bond0 up

After these steps, use ip link command, we found ipvlan has NO-CARRIER:
ipvlan@bond0: <NO-CARRIER, BROADCAST,MULTICAST,UP,M-DOWN> mtu ...>

We can deal with this problem like VLAN: Add handling of NETDEV_UP
events. If we receive NETDEV_UP event, we will update the link status
of the ipvlan.

Signed-off-by: Di Zhu <zhudi21@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sched: store the last executed chain also for clsact egress

currently, only 'ingress' and 'clsact ingress' qdiscs store the tc 'chain
id' in the skb extension. However, userspace programs (like ovs) are able
to setup egress rules, and datapath gets confused in case it doesn't find
the 'chain id' for a packet that's "recirculated" by tc.
Change tcf_classify() to have the same semantic as tcf_classify_ingress()
so that a single function can be called in ingress / egress, using the tc
ingress / egress block respectively.

Suggested-by: Alaa Hleilel <alaa@nvidia.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'dpaa2-switch-add-mirroring-support'

Ioana Ciornei says:

====================
dpaa2-switch: add mirroring support

This patch set adds per port and per VLAN mirroring in dpaa2-switch.

The first 4 patches are just cosmetic changes. We renamed the
dpaa2_switch_acl_tbl structure into dpaa2_switch_filter_block so that we
can reuse it for filters that do not use the ACL table and reorganized
the addition of trap, redirect and drop filters into a separate
function. All this just to make for a more streamlined addition of the
support for mirroring.

The next 4 patches are actually adding the advertised support. Mirroring
rules can be added in shared blocks, the driver will replicate the same
configuration on all the switch ports part of the same block.

The last patch documents the feature, presents its behavior and
limitations and gives a couple of examples.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

docs: networking: dpaa2: document mirroring support on the switch

Document the mirroring capabilities of the dpaa2-switch driver,
any restrictions that are imposed and some example commands.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: offload shared block mirror filters when binding to a port

When mirroring rules are added in shared filter blocks, the same
mirroring rule has to be configured on all the switch ports that are
part of the same block.

In case a switch port joins a shared block after mirroring filters have
been already added to it, then all the mirror rules should be offloaded
to the port. The reverse, removal of mirroring rules, has to be done at
block unbind.

For this purpose, the dpaa2_switch_block_offload_mirror() and
dpaa2_switch_block_unoffload_mirror() functions are added and called
upon binding and unbinding a switch port to/from a block.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: add VLAN based mirroring

Using the infrastructure added in the previous patch, extend tc-flower
support with FLOW_ACTION_MIRRED based on VLAN.

Tested with:

tc qdisc add dev eth8 ingress_block 1 clsact
tc filter add block 1 ingress protocol 802.1q flower skip_sw \
vlan_id 100 action mirred egress mirror dev eth6
tc filter del block 1 ingress pref 49152

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: add support for port mirroring

Add support for per port mirroring for the DPAA2 switch. We support
only single mirror port, therefore we allow mirroring rules only as long
as the destination port is always the same.

Unlike all the actions (drop, redirect, trap) already supported by the
dpaa2-switch driver, adding mirroring filters in shared blocks is not
achieved by a singular ACL entry added in a table shared by the ports.
This is why, when a new mirror filter is added in a block we have to got
through all the switch ports sharing it and configure the filter
individually on all.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: add API for setting up mirroring

Add the necessary MC API for setting up and configuring the mirroring
feature on the DPSW DPAA2 object.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: reorganize dpaa2_switch_cls_matchall_replace

Extract the necessary steps to offload a filter by using the ACL table
in a separate function - dpaa2_switch_cls_matchall_replace_acl().

This is intended to help with the code readability when the mirroring
support is added.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: reorganize dpaa2_switch_cls_flower_replace

Extract the necessary steps to offload a filter by using the ACL table
in a separate function - dpaa2_switch_cls_flower_replace_acl().
This is intended to help with the code readability when the mirroring
support is added.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: rename dpaa2_switch_acl_tbl into filter_block

Until now, shared filter blocks were implemented only by ACL tables
shared between ports. Going forward, when the mirroring support will be
added, this will not be true anymore.

Rename the dpaa2_switch_acl_tbl into dpaa2_switch_filter_block so that
we make it clear that the structure is used not only for filters that
use the ACL table but will be used for all the filters that are added in
a block.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: rename dpaa2_switch_tc_parse_action to specify the ACL

Until now, the dpaa2_switch_tc_parse_action() function was used for all
the supported tc actions since all of them were implemented by adding
ACL table entries. In the next commits, the dpaa2-switch driver will
gain mirroring support which is not using the same HW feature.

Make sure that we specify the ACL in the function name so that we make
it clear that it's only used for specific actions.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

scsi: fas216: Fix fall-through warning for Clang

Fix the following fallthrough warning (on ARM):

drivers/scsi/arm/fas216.c:1379:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
           default:
           ^
   drivers/scsi/arm/fas216.c:1379:2: note: insert 'break;' to avoid fall-through
           default:
           ^
           break;

Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/202107260355.bF00i5bi-lkp@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>

scsi: acornscsi: Fix fall-through warning for clang

Fix the following fallthrough warning (on ARM):

drivers/scsi/arm/acornscsi.c:2651:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
           case res_success:
           ^
   drivers/scsi/arm/acornscsi.c:2651:2: note: insert '__attribute__((fallthrough));' to silence this warning
           case res_success:
           ^
           __attribute__((fallthrough));
   drivers/scsi/arm/acornscsi.c:2651:2: note: insert 'break;' to avoid fall-through
           case res_success:
           ^
           break;
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/202107260355.bF00i5bi-lkp@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>

ARM: riscpc: Fix fall-through warning for Clang

Fix the following fallthrough warning:

arch/arm/mach-rpc/riscpc.c:52:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
           default:
           ^
arch/arm/mach-rpc/riscpc.c:52:2: note: insert 'break;' to avoid fall-through
           default:
           ^
           break;

Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/202107260355.bF00i5bi-lkp@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
"ARM:

   - Fix MTE shared page detection

   - Enable selftest's use of PMU registers when asked to

  s390:

   - restore 5.13 debugfs names

  x86:

   - fix sizes for vcpu-id indexed arrays

   - fixes for AMD virtualized LAPIC (AVIC)

   - other small bugfixes

  Generic:

   - access tracking performance test

   - dirty_log_perf_test command line parsing fix

   - Fix selftest use of obsolete pthread_yield() in favour of
     sched_yield()

   - use cpu_relax when halt polling

   - fixed missing KVM_CLEAR_DIRTY_LOG compat ioctl"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: add missing compat KVM_CLEAR_DIRTY_LOG
  KVM: use cpu_relax when halt polling
  KVM: SVM: use vmcb01 in svm_refresh_apicv_exec_ctrl
  KVM: SVM: tweak warning about enabled AVIC on nested entry
  KVM: SVM: svm_set_vintr don't warn if AVIC is active but is about to be deactivated
  KVM: s390: restore old debugfs names
  KVM: SVM: delay svm_vcpu_init_msrpm after svm->vmcb is initialized
  KVM: selftests: Introduce access_tracking_perf_test
  KVM: selftests: Fix missing break in dirty_log_perf_test arg parsing
  x86/kvm: fix vcpu-id indexed array sizes
  KVM: x86: Check the right feature bit for MSR_KVM_ASYNC_PF_ACK access
  docs: virt: kvm: api.rst: replace some characters
  KVM: Documentation: Fix KVM_CAP_ENFORCE_PV_FEATURE_CPUID name
  KVM: nSVM: Swap the parameter order for svm_copy_vmrun_state()/svm_copy_vmloadsave_state()
  KVM: nSVM: Rename nested_svm_vmloadsave() to svm_copy_vmloadsave_state()
  KVM: arm64: selftests: get-reg-list: actually enable pmu regs in pmu sublist
  KVM: selftests: change pthread_yield to sched_yield
  KVM: arm64: Fix detection of shared VMAs on guest fault

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu

Pull m68knommu fix from Greg Ungerer:
"A single compile time fix"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k/coldfire: change pll var. to clk_pll

qede: Remove the qede module version

Removing the qede module version which is not needed and not allowed
with inbox drivers.

Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Remove the qed module version

Removing the qed module version which is not needed and not allowed
with inbox drivers.

Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'sja110-vlan-fixes'

Vladimir Oltean says:

====================
NXP SJA1105 VLAN regressions

These are 3 patches to fix issues seen with some more varied testing
done after the changes in the "Traffic termination for sja1105 ports
under VLAN-aware bridge" series were made:
https://patchwork.kernel.org/project/netdevbpf/cover/20210726165536.1338471-1-vladimir.oltean@nxp.com/

Issue 1: traffic no longer works on a port after leaving a VLAN-aware bridge
Issue 2: untagged traffic not dropped if pvid is absent from a VLAN-aware port
Issue 3: PTP and STP broken on ports under a VLAN-aware bridge
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: tag_sja1105: fix control packets on SJA1110 being received on an imprecise port

On RX, a control packet with SJA1110 will have:
- an in-band control extension (DSA tag) composed of a header and an
  optional trailer (if it is a timestamp frame). We can (and do) deduce
  the source port and switch id from this.
- a VLAN header, which can either be the tag_8021q RX VLAN (pvid) or the
  bridge VLAN. The sja1105_vlan_rcv() function attempts to deduce the
  source port and switch id a second time from this.

The basic idea is that even though we don't need the source port
information from the tag_8021q header if it's a control packet, we do
need to strip that header before we pass it on to the network stack.

The problem is that we call sja1105_vlan_rcv for ports under VLAN-aware
bridges, and that function tells us it couldn't identify a tag_8021q
header, so we need to perform imprecise RX by VID. Well, we don't,
because we already know the source port and switch ID.

This patch drops the return value from sja1105_vlan_rcv and we just look
at the source_port and switch_id values from sja1105_rcv and sja1110_rcv
which were initialized to -1. If they are still -1 it means we need to
perform imprecise RX.

Fixes: 884be12f8566 ("net: dsa: sja1105: add support for imprecise RX")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: make sure untagged packets are dropped on ingress ports with no pvid

Surprisingly, this configuration:

ip link add br0 type bridge vlan_filtering 1
ip link set swp2 master br0
bridge vlan del dev swp2 vid 1

still has the sja1105 switch sending untagged packets to the CPU (and
failing to decode them, since dsa_find_designated_bridge_port_by_vid
searches by VID 1 and rightfully finds no bridge VLAN 1 on a port).

Dumping the switch configuration, the VLANs are managed properly:
- the pvid of swp2 is 1 in the MAC Configuration Table, but
- only the CPU port is in the port membership of VLANID 1 in the VLAN
  Lookup Table

When the ingress packets are tagged with VID 1, they are properly
dropped. But when they are untagged, they are able to reach the CPU
port. Also, when the pvid in the MAC Configuration Table is changed to
e.g. 55 (an unused VLAN), the untagged packets are also dropped.

So it looks like:
- the switch bypasses ingress VLAN membership checks for untagged traffic
- the reason why the untagged traffic is dropped when I make the pvid 55
  is due to the lack of valid destination ports in VLAN 55, rather than
  an ingress membership violation
- the ingress VLAN membership cheks are only done for VLAN-tagged traffic

Interesting. It looks like there is an explicit bit to drop untagged
traffic, so we should probably be using that to preserve user expectations.

Note that only VLAN-aware ports should drop untagged packets due to no
pvid - when VLAN-unaware, the software bridge doesn't do this even if
there is no pvid on any bridge port and on the bridge itself. So the new
sja1105_drop_untagged() function cannot simply be called with "false"
from sja1105_bridge_vlan_add() and with "true" from sja1105_bridge_vlan_del.
Instead, we need to also consider the VLAN awareness state. That means
we need to hook the "drop untagged" setting in all the same places where
the "commit pvid" logic is, and it needs to factor in all the state when
flipping the "drop untagged" bit: is our current pvid in the VLAN Lookup
Table, and is the current port in that VLAN's port membership list?
VLAN-unaware ports will never drop untagged frames because these checks
always succeed by construction, and the tag_8021q VLANs cannot be changed
by the user.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: reset the port pvid when leaving a VLAN-aware bridge

Now that we no longer have the ultra-central sja1105_build_vlan_table(),
we need to be more careful about checking all corner cases manually.

For example, when a port leaves a VLAN-aware bridge, it becomes
standalone so its pvid should become a tag_8021q RX VLAN again. However,
sja1105_commit_pvid() only gets called from sja1105_bridge_vlan_add()
and from sja1105_vlan_filtering(), and no VLAN awareness change takes
place (VLAN filtering is a global setting for sja1105, so the switch
remains VLAN-aware overall).

This means that we need to put another sja1105_commit_pvid() call in
sja1105_bridge_member().

Fixes: 6dfd23d35e75 ("net: dsa: sja1105: delete vlan delta save/restore logic")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mctp'

Jeremy Kerr says:

====================
Add Management Component Transport Protocol support

This series adds core MCTP support to the kernel. From the Kconfig
description:

  Management Component Transport Protocol (MCTP) is an in-system
  protocol for communicating between management controllers and
  their managed devices (peripherals, host processors, etc.). The
  protocol is defined by DMTF specification DSP0236.

  This option enables core MCTP support. For communicating with other
  devices, you'll want to enable a driver for a specific hardware
  channel.

This implementation allows a sockets-based API for sending and receiving
MCTP messages via sendmsg/recvmsg on SOCK_DGRAM sockets. Kernel stack
control is all via netlink, using existing RTM_* messages. The userspace
ABI change is fairly small; just the necessary AF_/ETH_P_/ARPHDR_
constants, a new sockaddr, and a new netlink attribute.

For MAINTAINERS, I've just included netdev@ as the list entry. I'm happy
to alter this based on preferences here - an alternative would be the
OpenBMC list (the main user of the MCTP interface), or we can create a
new list entirely.

We have a couple of interface drivers almost ready to go at the moment,
but those can wait until the core code has some review.

This is v4 of the series; v1 and v2 were both RFC.

selinux folks: CCing 01/15 due to the new PF_MCTP protocol family.

linux-doc folks: CCing 15/15 for the new MCTP overview document.

Review, comments, questions etc. are most welcome.

Cheers,

Jeremy

v2:
- change to match spec terminology: controller -> component
- require specific capabilities for bind() & sendmsg()
- add address and tag defintions to uapi
- add selinux AF_MCTP table definitions
- remove strict cflags; warnings are present in common headers

v3:
- require caps for MCTP bind() & send()
- comment typo fixes
- switch to an array for local EIDs
- fix addrinfo dump iteration & error path
- add RTM_DELADDR
- remove GENMASK() and BIT() from uapi

v4:
- drop tun patch; that can be submitted separately
- keep nipa happy: add maintainer CCs, including doc and selinux
- net-next rebase
- Include AF_MCTP in af_family_slock_keys and pf_family_names
- Introduce MODULE_ definitions earlier
- upstream change: set_link_af no longer called with RTNL held
- add kdoc for net_device.mctp_ptr
- don't inline mctp_rt_match_eid
- require rtm_type == RTN_UNICAST in route management handlers
- remove unused RTAX policy table
- fix mctp_sock->keys rcu annotations
- fix spurious rcu_read_unlock in route input
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

mctp: Add MCTP overview document

This change adds a brief document about the sockets API provided for
sending and receiving MCTP messages from userspace.

This is roughly based on the OpenBMC design document, at:

https://github.com/openbmc/docs/blob/master/designs/mctp/mctp-kernel.md

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

mctp: Allow per-netns default networks

Currently we have a compile-time default network
(MCTP_INITIAL_DEFAULT_NET). This change introduces a default_net field
on the net namespace, allowing future configuration for new interfaces.

Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

mctp: Add dest neighbour lladdr to route output

Now that we have a neighbour implementation, hook it up to the output
path to set the dest hardware address for outgoing packets.

Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

mctp: Implement message fragmentation & reassembly

This change implements MCTP fragmentation (based on route & device MTU),
and corresponding reassembly.

The MCTP specification only allows for fragmentation on the originating
message endpoint, and reassembly on the destination endpoint -
intermediate nodes do not need to reassemble/refragment. Consequently,
we only fragment in the local transmit path, and reassemble
locally-bound packets. Messages are required to be in-order, so we
simply cancel reassembly on out-of-order or missing packets.

In the fragmentation path, we just break up the message into MTU-sized
fragments; the skb structure is a simple copy for now, which we can later
improve with a shared data implementation.

For reassembly, we keep track of incoming message fragments using the
existing tag infrastructure, allocating a key on the (src,dest,tag)
tuple, and reassembles matching fragments into a skb->frag_list.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

mctp: Populate socket implementation

Start filling-out the socket syscalls: bind, sendmsg & recvmsg.

This requires an input route implementation, so we add to
mctp_route_input, allowing lookups on binds & message tags. This just
handles single-packet messages at present, we will add fragmentation in
a future change.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

mctp: Add neighbour netlink interface

This change adds the netlink interfaces for manipulating the MCTP
neighbour table.

Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

mctp: Add neighbour implementation

Add an initial neighbour table implementation, to be used in the route
output path.

Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

mctp: Add netlink route management

This change adds RTM_GETROUTE, RTM_NEWROUTE & RTM_DELROUTE handlers,
allowing management of the MCTP route table.

Includes changes from Jeremy Kerr <jk@codeconstruct.com.au>.

Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

mctp: Add initial routing framework

Add a simple routing table, and a couple of route output handlers, and
the mctp packet_type & handler.

Includes changes from Matt Johnston <matt@codeconstruct.com.au>.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

mctp: Add device handling and netlink interface

This change adds the infrastructure for managing MCTP netdevices; we add
a pointer to the AF_MCTP-specific data to struct netdevice, and hook up
the rtnetlink operations for adding and removing addresses.

Includes changes from Matt Johnston <matt@codeconstruct.com.au>.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

mctp: Add initial driver infrastructure

Add an empty drivers/net/mctp/, for future interface drivers.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

mctp: Add sockaddr_mctp to uapi

This change introduces the user-visible MCTP header, containing the
protocol-specific addressing definitions.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

mctp: Add base packet definitions

Simple packet header format as defined by DMTF DSP0236.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

mctp: Add base socket/protocol definitions

Add an empty socket implementation, plus initialisation/destruction
handlers.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

mctp: Add MCTP base

Add basic Kconfig, an initial (empty) af_mctp source object, and
{AF,PF}_MCTP definitions, and the required definitions for a new
protocol type.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'nfc-const'

Krzysztof Kozlowski says:

====================
nfc: constify, continued (part 2)

On top of:
nfc: constify pointed data
https://lore.kernel.org/lkml/20210726145224.146006-1-krzysztof.kozlowski@canonical.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

nfc: mrvl: constify static nfcmrvl_if_ops

File-scope struct nfcmrvl_if_ops is not modified so can be made const.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfc: mrvl: constify several pointers

Several functions do not modify pointed data so arguments and local
variables can be const for correctness and safety.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfc: microread: constify several pointers

Several functions do not modify pointed data so arguments and local
variables can be const for correctness and safety.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfc: fdp: constify several pointers

Several functions do not modify pointed data so arguments and local
variables can be const for correctness and safety. This allows also
making file-scope nci_core_get_config_otp_ram_version array const.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfc: fdp: use unsigned int as loop iterator

Loop iterators are simple integers, no point to optimize the size and
use u8. It only raises the question whether the variable is used in
some other context.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfc: fdp: drop unneeded cast for printing firmware size in dev_dbg()

Size of firmware is a type of size_t, so print it directly instead of
casting to int.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfc: nfcsim: constify drvdata (struct nfcsim)

nfcsim_abort_cmd() does not modify struct nfcsim, so local variable
can be a pointer to const.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfc: virtual_ncidev: constify pointer to nfc_dev

virtual_ncidev_ioctl() does not modify struct nfc_dev, so local variable
can be a pointer to const.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfc: trf7970a: constify several pointers

Several functions do not modify pointed data so arguments and local
variables can be const for correctness and safety.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfc: port100: constify several pointers

Several functions do not modify pointed data so arguments and local
variables can be const for correctness and safety.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfc: mei_phy: constify buffer passed to mei_nfc_send()

The buffer passed to mei_nfc_send() can be const for correctness and
safety.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfc: constify passed nfc_dev

The struct nfc_dev is not modified by nfc_get_drvdata() and
nfc_device_name() so it can be made a const.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'skb-gro-optimize'

Paolo Abeni says:

====================
sk_buff: optimize GRO for the common case

This is a trimmed down revision of "sk_buff: optimize layout for GRO",
specifically dropping the changes to the sk_buff layout[1].

This series tries to accomplish 2 goals:
- optimize the GRO stage for the most common scenario, avoiding a bunch
  of conditional and some more code
- let owned skbs entering the GRO engine, allowing backpressure in the
  veth GRO forward path.

A new sk_buff flag (!!!) is introduced and maintained for GRO's sake.
Such field uses an existing hole, so there is no change to the sk_buff
size.

[1] two main reasons:
- move skb->inner_ field requires some extra care, as some in kernel
  users access and the fields regardless of skb->encapsulation.
- extending secmark size clash with ct and nft uAPIs

address the all above is possible, I think, but for sure not in a single
series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

veth: use skb_prepare_for_gro()

Leveraging the previous patch we can now avoid orphaning the
skb in the veth gro path, allowing correct backpressure.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>