]> www.infradead.org Git - users/dwmw2/linux.git/log
users/dwmw2/linux.git
4 years agoARM: dts: s5pv210: Enable audio on Aries boards
Jonathan Bakker [Wed, 2 Sep 2020 00:38:58 +0000 (17:38 -0700)]
ARM: dts: s5pv210: Enable audio on Aries boards

[ Upstream commit cd972fe90008adf49de0790250c1275480ac5cdc ]

Both the Galaxy S and the Fascinate4G have a WM8994 codec, but they
differ slightly in their jack detection and micbias configuration.

Signed-off-by: Jonathan Bakker <xc-racer2@live.ca>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agomemory: emif: Remove bogus debugfs error handling
Dan Carpenter [Wed, 26 Aug 2020 11:37:59 +0000 (14:37 +0300)]
memory: emif: Remove bogus debugfs error handling

[ Upstream commit fd22781648080cc400772b3c68aa6b059d2d5420 ]

Callers are generally not supposed to check the return values from
debugfs functions.  Debugfs functions never return NULL so this error
handling will never trigger.  (Historically debugfs functions used to
return a mix of NULL and error pointers but it was eventually deemed too
complicated for something which wasn't intended to be used in normal
situations).

Delete all the error handling.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Link: https://lore.kernel.org/r/20200826113759.GF393664@mwanda
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoARM: dts: omap4: Fix sgx clock rate for 4430
Tony Lindgren [Tue, 10 Mar 2020 21:02:48 +0000 (14:02 -0700)]
ARM: dts: omap4: Fix sgx clock rate for 4430

[ Upstream commit 19d3e9a0bdd57b90175f30390edeb06851f5f9f3 ]

We currently have a different clock rate for droid4 compared to the
stock v3.0.8 based Android Linux kernel:

# cat /sys/kernel/debug/clk/dpll_*_m7x2_ck/clk_rate
266666667
307200000
# cat /sys/kernel/debug/clk/l3_gfx_cm:clk:0000:0/clk_rate
307200000

Let's fix this by configuring sgx to use 153.6 MHz instead of 307.2 MHz.
Looks like also at least duover needs this change to avoid hangs, so
let's apply it for all 4430.

This helps a bit with thermal issues that seem to be related to memory
corruption when using sgx. It seems that other driver related issues
still remain though.

Cc: Arthur Demchenkov <spinal.by@gmail.com>
Cc: Merlijn Wajer <merlijn@wizzup.org>
Cc: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoarm64: dts: renesas: ulcb: add full-pwr-cycle-in-suspend into eMMC nodes
Yoshihiro Shimoda [Fri, 17 Jul 2020 12:33:21 +0000 (21:33 +0900)]
arm64: dts: renesas: ulcb: add full-pwr-cycle-in-suspend into eMMC nodes

[ Upstream commit 992d7a8b88c83c05664b649fc54501ce58e19132 ]

Add full-pwr-cycle-in-suspend property to do a graceful shutdown of
the eMMC device in system suspend.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://lore.kernel.org/r/1594989201-24228-1-git-send-email-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agocifs: handle -EINTR in cifs_setattr
Ronnie Sahlberg [Thu, 8 Oct 2020 23:32:56 +0000 (09:32 +1000)]
cifs: handle -EINTR in cifs_setattr

[ Upstream commit c6cc4c5a72505a0ecefc9b413f16bec512f38078 ]

RHBZ: 1848178

Some calls that set attributes, like utimensat(), are not supposed to return
-EINTR and thus do not have handlers for this in glibc which causes us
to leak -EINTR to the applications which are also unprepared to handle it.

For example tar will break if utimensat() return -EINTR and abort unpacking
the archive. Other applications may break too.

To handle this we add checks, and retry, for -EINTR in cifs_setattr()

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoHandle STATUS_IO_TIMEOUT gracefully
Rohith Surabattula [Fri, 18 Sep 2020 05:37:28 +0000 (05:37 +0000)]
Handle STATUS_IO_TIMEOUT gracefully

[ Upstream commit 8e670f77c4a55013db6d23b962f9bf6673a5e7b6 ]

Currently STATUS_IO_TIMEOUT is not treated as retriable error.
It is currently mapped to ETIMEDOUT and returned to userspace
for most system calls. STATUS_IO_TIMEOUT is returned by server
in case of unavailability or throttling errors.

This patch will map the STATUS_IO_TIMEOUT to EAGAIN, so that it
can be retried. Also, added a check to drop the connection to
not overload the server in case of ongoing unavailability.

Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agogfs2: add validation checks for size of superblock
Anant Thazhemadam [Wed, 14 Oct 2020 16:31:09 +0000 (22:01 +0530)]
gfs2: add validation checks for size of superblock

[ Upstream commit 0ddc5154b24c96f20e94d653b0a814438de6032b ]

In gfs2_check_sb(), no validation checks are performed with regards to
the size of the superblock.
syzkaller detected a slab-out-of-bounds bug that was primarily caused
because the block size for a superblock was set to zero.
A valid size for a superblock is a power of 2 between 512 and PAGE_SIZE.
Performing validation checks and ensuring that the size of the superblock
is valid fixes this bug.

Reported-by: syzbot+af90d47a37376844e731@syzkaller.appspotmail.com
Tested-by: syzbot+af90d47a37376844e731@syzkaller.appspotmail.com
Suggested-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com>
[Minor code reordering.]
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agogfs2: use-after-free in sysfs deregistration
Jamie Iles [Mon, 12 Oct 2020 13:13:09 +0000 (14:13 +0100)]
gfs2: use-after-free in sysfs deregistration

[ Upstream commit c2a04b02c060c4858762edce4674d5cba3e5a96f ]

syzkaller found the following splat with CONFIG_DEBUG_KOBJECT_RELEASE=y:

  Read of size 1 at addr ffff000028e896b8 by task kworker/1:2/228

  CPU: 1 PID: 228 Comm: kworker/1:2 Tainted: G S                5.9.0-rc8+ #101
  Hardware name: linux,dummy-virt (DT)
  Workqueue: events kobject_delayed_cleanup
  Call trace:
   dump_backtrace+0x0/0x4d8
   show_stack+0x34/0x48
   dump_stack+0x174/0x1f8
   print_address_description.constprop.0+0x5c/0x550
   kasan_report+0x13c/0x1c0
   __asan_report_load1_noabort+0x34/0x60
   memcmp+0xd0/0xd8
   gfs2_uevent+0xc4/0x188
   kobject_uevent_env+0x54c/0x1240
   kobject_uevent+0x2c/0x40
   __kobject_del+0x190/0x1d8
   kobject_delayed_cleanup+0x2bc/0x3b8
   process_one_work+0x96c/0x18c0
   worker_thread+0x3f0/0xc30
   kthread+0x390/0x498
   ret_from_fork+0x10/0x18

  Allocated by task 1110:
   kasan_save_stack+0x28/0x58
   __kasan_kmalloc.isra.0+0xc8/0xe8
   kasan_kmalloc+0x10/0x20
   kmem_cache_alloc_trace+0x1d8/0x2f0
   alloc_super+0x64/0x8c0
   sget_fc+0x110/0x620
   get_tree_bdev+0x190/0x648
   gfs2_get_tree+0x50/0x228
   vfs_get_tree+0x84/0x2e8
   path_mount+0x1134/0x1da8
   do_mount+0x124/0x138
   __arm64_sys_mount+0x164/0x238
   el0_svc_common.constprop.0+0x15c/0x598
   do_el0_svc+0x60/0x150
   el0_svc+0x34/0xb0
   el0_sync_handler+0xc8/0x5b4
   el0_sync+0x15c/0x180

  Freed by task 228:
   kasan_save_stack+0x28/0x58
   kasan_set_track+0x28/0x40
   kasan_set_free_info+0x24/0x48
   __kasan_slab_free+0x118/0x190
   kasan_slab_free+0x14/0x20
   slab_free_freelist_hook+0x6c/0x210
   kfree+0x13c/0x460

Use the same pattern as f2fs + ext4 where the kobject destruction must
complete before allowing the FS itself to be freed.  This means that we
need an explicit free_sbd in the callers.

Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Jamie Iles <jamie@nuviainc.com>
[Also go to fail_free when init_names fails.]
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agogfs2: Fix NULL pointer dereference in gfs2_rgrp_dump
Andrew Price [Wed, 7 Oct 2020 11:30:58 +0000 (12:30 +0100)]
gfs2: Fix NULL pointer dereference in gfs2_rgrp_dump

[ Upstream commit 0e539ca1bbbe85a86549c97a30a765ada4a09df9 ]

When an rindex entry is found to be corrupt, compute_bitstructs() calls
gfs2_consist_rgrpd() which calls gfs2_rgrp_dump() like this:

    gfs2_rgrp_dump(NULL, rgd->rd_gl, fs_id_buf);

gfs2_rgrp_dump then dereferences the gl without checking it and we get

    BUG: KASAN: null-ptr-deref in gfs2_rgrp_dump+0x28/0x280

because there's no rgrp glock involved while reading the rindex on mount.

Fix this by changing gfs2_rgrp_dump to take an rgrp argument.

Reported-by: syzbot+43fa87986bdd31df9de6@syzkaller.appspotmail.com
Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agogfs2: call truncate_inode_pages_final for address space glocks
Bob Peterson [Wed, 16 Sep 2020 16:06:23 +0000 (11:06 -0500)]
gfs2: call truncate_inode_pages_final for address space glocks

[ Upstream commit ee1e2c773e4f4ce2213f9d77cc703b669ca6fa3f ]

Before this patch, we were not calling truncate_inode_pages_final for the
address space for glocks, which left the possibility of a leak. We now
take care of the problem instead of complaining, and we do it during
glock tear-down..

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoscsi: core: Clean up allocation and freeing of sgtables
Christoph Hellwig [Mon, 5 Oct 2020 08:41:28 +0000 (10:41 +0200)]
scsi: core: Clean up allocation and freeing of sgtables

[ Upstream commit 7007e9dd56767a95de0947b3f7599bcc2f21687f ]

Rename scsi_init_io() to scsi_alloc_sgtables(), and ensure callers call
scsi_free_sgtables() to cleanup failures close to scsi_init_io() instead of
leaking it down the generic I/O submission path.

Link: https://lore.kernel.org/r/20201005084130.143273-9-hch@lst.de
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoKVM: PPC: Book3S HV: Do not allocate HPT for a nested guest
Fabiano Rosas [Fri, 11 Sep 2020 04:16:07 +0000 (01:16 -0300)]
KVM: PPC: Book3S HV: Do not allocate HPT for a nested guest

[ Upstream commit 05e6295dc7de859c9d56334805485c4d20bebf25 ]

The current nested KVM code does not support HPT guests. This is
informed/enforced in some ways:

- Hosts < P9 will not be able to enable the nested HV feature;

- The nested hypervisor MMU capabilities will not contain
  KVM_CAP_PPC_MMU_HASH_V3;

- QEMU reflects the MMU capabilities in the
  'ibm,arch-vec-5-platform-support' device-tree property;

- The nested guest, at 'prom_parse_mmu_model' ignores the
  'disable_radix' kernel command line option if HPT is not supported;

- The KVM_PPC_CONFIGURE_V3_MMU ioctl will fail if trying to use HPT.

There is, however, still a way to start a HPT guest by using
max-compat-cpu=power8 at the QEMU machine options. This leads to the
guest being set to use hash after QEMU calls the KVM_PPC_ALLOCATE_HTAB
ioctl.

With the guest set to hash, the nested hypervisor goes through the
entry path that has no knowledge of nesting (kvmppc_run_vcpu) and
crashes when it tries to execute an hypervisor-privileged (mtspr
HDEC) instruction at __kvmppc_vcore_entry:

root@L1:~ $ qemu-system-ppc64 -machine pseries,max-cpu-compat=power8 ...

<snip>
[  538.543303] CPU: 83 PID: 25185 Comm: CPU 0/KVM Not tainted 5.9.0-rc4 #1
[  538.543355] NIP:  c00800000753f388 LR: c00800000753f368 CTR: c0000000001e5ec0
[  538.543417] REGS: c0000013e91e33b0 TRAP: 0700   Not tainted  (5.9.0-rc4)
[  538.543470] MSR:  8000000002843033 <SF,VEC,VSX,FP,ME,IR,DR,RI,LE>  CR: 22422882  XER: 20040000
[  538.543546] CFAR: c00800000753f4b0 IRQMASK: 3
               GPR00: c0080000075397a0 c0000013e91e3640 c00800000755e600 0000000080000000
               GPR04: 0000000000000000 c0000013eab19800 c000001394de0000 00000043a054db72
               GPR08: 00000000003b1652 0000000000000000 0000000000000000 c0080000075502e0
               GPR12: c0000000001e5ec0 c0000007ffa74200 c0000013eab19800 0000000000000008
               GPR16: 0000000000000000 c00000139676c6c0 c000000001d23948 c0000013e91e38b8
               GPR20: 0000000000000053 0000000000000000 0000000000000001 0000000000000000
               GPR24: 0000000000000001 0000000000000001 0000000000000000 0000000000000001
               GPR28: 0000000000000001 0000000000000053 c0000013eab19800 0000000000000001
[  538.544067] NIP [c00800000753f388] __kvmppc_vcore_entry+0x90/0x104 [kvm_hv]
[  538.544121] LR [c00800000753f368] __kvmppc_vcore_entry+0x70/0x104 [kvm_hv]
[  538.544173] Call Trace:
[  538.544196] [c0000013e91e3640] [c0000013e91e3680] 0xc0000013e91e3680 (unreliable)
[  538.544260] [c0000013e91e3820] [c0080000075397a0] kvmppc_run_core+0xbc8/0x19d0 [kvm_hv]
[  538.544325] [c0000013e91e39e0] [c00800000753d99c] kvmppc_vcpu_run_hv+0x404/0xc00 [kvm_hv]
[  538.544394] [c0000013e91e3ad0] [c0080000072da4fc] kvmppc_vcpu_run+0x34/0x48 [kvm]
[  538.544472] [c0000013e91e3af0] [c0080000072d61b8] kvm_arch_vcpu_ioctl_run+0x310/0x420 [kvm]
[  538.544539] [c0000013e91e3b80] [c0080000072c7450] kvm_vcpu_ioctl+0x298/0x778 [kvm]
[  538.544605] [c0000013e91e3ce0] [c0000000004b8c2c] sys_ioctl+0x1dc/0xc90
[  538.544662] [c0000013e91e3dc0] [c00000000002f9a4] system_call_exception+0xe4/0x1c0
[  538.544726] [c0000013e91e3e20] [c00000000000d140] system_call_common+0xf0/0x27c
[  538.544787] Instruction dump:
[  538.544821] f86d1098 60000000 60000000 48000099 e8ad0fe8 e8c500a0 e9264140 75290002
[  538.544886] 7d1602a6 7cec42a6 40820008 7d0807b4 <7d164ba67d083a14 f90d10a0 480104fd
[  538.544953] ---[ end trace 74423e2b948c2e0c ]---

This patch makes the KVM_PPC_ALLOCATE_HTAB ioctl fail when running in
the nested hypervisor, causing QEMU to abort.

Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoext4: Detect already used quota file early
Jan Kara [Thu, 15 Oct 2020 11:03:30 +0000 (13:03 +0200)]
ext4: Detect already used quota file early

[ Upstream commit e0770e91424f694b461141cbc99adf6b23006b60 ]

When we try to use file already used as a quota file again (for the same
or different quota type), strange things can happen. At the very least
lockdep annotations may be wrong but also inode flags may be wrongly set
/ reset. When the file is used for two quota types at once we can even
corrupt the file and likely crash the kernel. Catch all these cases by
checking whether passed file is already used as quota file and bail
early in that case.

This fixes occasional generic/219 failure due to lockdep complaint.

Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Reported-by: Ritesh Harjani <riteshh@linux.ibm.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20201015110330.28716-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agojbd2: avoid transaction reuse after reformatting
changfengnan [Mon, 12 Oct 2020 16:49:00 +0000 (18:49 +0200)]
jbd2: avoid transaction reuse after reformatting

[ Upstream commit fc750a3b44bdccb9fb96d6abbc48a9b8e480ce7b ]

When ext4 is formatted with lazy_journal_init=1 and transactions from
the previous filesystem are still on disk, it is possible that they are
considered during a recovery after a crash. Because the checksum seed
has changed, the CRC check will fail, and the journal recovery fails
with checksum error although the journal is otherwise perfectly valid.
Fix the problem by checking commit block time stamps to determine
whether the data in the journal block is just stale or whether it is
indeed corrupt.

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Fengnan Chang <changfengnan@hikvision.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20201012164900.20197-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agodrivers: watchdog: rdc321x_wdt: Fix race condition bugs
Madhuparna Bhowmik [Fri, 7 Aug 2020 11:29:02 +0000 (16:59 +0530)]
drivers: watchdog: rdc321x_wdt: Fix race condition bugs

[ Upstream commit 4b2e7f99cdd314263c9d172bc17193b8b6bba463 ]

In rdc321x_wdt_probe(), rdc321x_wdt_device.queue is initialized
after misc_register(), hence if ioctl is called before its
initialization which can call rdc321x_wdt_start() function,
it will see an uninitialized value of rdc321x_wdt_device.queue,
hence initialize it before misc_register().
Also, rdc321x_wdt_device.default_ticks is accessed in reset()
function called from write callback, thus initialize it before
misc_register().

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20200807112902.28764-1-madhuparnabhowmik10@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoceph: encode inodes' parent/d_name in cap reconnect message
Yan, Zheng [Tue, 11 Aug 2020 07:23:03 +0000 (15:23 +0800)]
ceph: encode inodes' parent/d_name in cap reconnect message

[ Upstream commit a33f6432b3a63a4909dbbb0967f7c9df8ff2de91 ]

Since nautilus, MDS tracks dirfrags whose child inodes have caps in open
file table. When MDS recovers, it prefetches all of these dirfrags. This
avoids using backtrace to load inodes. But dirfrags prefetch may load
lots of useless inodes into cache, and make MDS run out of memory.

Recent MDS adds an option that disables dirfrags prefetch. When dirfrags
prefetch is disabled. Recovering MDS only prefetches corresponding dir
inodes. Including inodes' parent/d_name in cap reconnect message can
help MDS to load inodes into its cache.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agonet: 9p: initialize sun_server.sun_path to have addr's value only when addr is valid
Anant Thazhemadam [Mon, 12 Oct 2020 04:24:04 +0000 (09:54 +0530)]
net: 9p: initialize sun_server.sun_path to have addr's value only when addr is valid

[ Upstream commit 7ca1db21ef8e0e6725b4d25deed1ca196f7efb28 ]

In p9_fd_create_unix, checking is performed to see if the addr (passed
as an argument) is NULL or not.
However, no check is performed to see if addr is a valid address, i.e.,
it doesn't entirely consist of only 0's.
The initialization of sun_server.sun_path to be equal to this faulty
addr value leads to an uninitialized variable, as detected by KMSAN.
Checking for this (faulty addr) and returning a negative error number
appropriately, resolves this issue.

Link: http://lkml.kernel.org/r/20201012042404.2508-1-anant.thazhemadam@gmail.com
Reported-by: syzbot+75d51fe5bf4ebe988518@syzkaller.appspotmail.com
Tested-by: syzbot+75d51fe5bf4ebe988518@syzkaller.appspotmail.com
Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agonfsd4: remove check_conflicting_opens warning
J. Bruce Fields [Fri, 25 Sep 2020 14:09:58 +0000 (10:09 -0400)]
nfsd4: remove check_conflicting_opens warning

[ Upstream commit 50747dd5e47bde3b7d7f839c84d0d3b554090497 ]

There are actually rare races where this is possible (e.g. if a new open
intervenes between the read of i_writecount and the fi_fds).

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agonfsd: rename delegation related tracepoints to make them less confusing
Hou Tao [Fri, 28 Aug 2020 07:02:55 +0000 (15:02 +0800)]
nfsd: rename delegation related tracepoints to make them less confusing

[ Upstream commit 3caf91757ced158e6c4a44d8b105bd7b3e1767d8 ]

Now when a read delegation is given, two delegation related traces
will be printed:

    nfsd_deleg_open: client 5f45b854:e6058001 stateid 00000030:00000001
    nfsd_deleg_none: client 5f45b854:e6058001 stateid 0000002f:00000001

Although the intention is to let developers know two stateid are
returned, the traces are confusing about whether or not a read delegation
is handled out. So renaming trace_nfsd_deleg_none() to trace_nfsd_open()
and trace_nfsd_deleg_open() to trace_nfsd_deleg_read() to make
the intension clearer.

The patched traces will be:

    nfsd_deleg_read: client 5f48a967:b55b21cd stateid 00000003:00000001
    nfsd_open: client 5f48a967:b55b21cd stateid 00000002:00000001

Suggested-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoclk: ti: clockdomain: fix static checker warning
Tero Kristo [Mon, 7 Sep 2020 08:25:59 +0000 (11:25 +0300)]
clk: ti: clockdomain: fix static checker warning

[ Upstream commit b7a7943fe291b983b104bcbd2f16e8e896f56590 ]

Fix a memory leak induced by not calling clk_put after doing of_clk_get.

Reported-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Link: https://lore.kernel.org/r/20200907082600.454-3-t-kristo@ti.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoPCI/ACPI: Add Ampere Altra SOC MCFG quirk
Tuan Phan [Thu, 6 Aug 2020 21:57:34 +0000 (14:57 -0700)]
PCI/ACPI: Add Ampere Altra SOC MCFG quirk

[ Upstream commit 877c1a5f79c6984bbe3f2924234c08e2f4f1acd5 ]

Ampere Altra SOC supports only 32-bit ECAM reads.  Add an MCFG quirk for
the platform.

Link: https://lore.kernel.org/r/1596751055-12316-1-git-send-email-tuanphan@os.amperecomputing.com
Signed-off-by: Tuan Phan <tuanphan@os.amperecomputing.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agorpmsg: glink: Use complete_all for open states
Chris Lew [Wed, 24 Jun 2020 16:45:18 +0000 (22:15 +0530)]
rpmsg: glink: Use complete_all for open states

[ Upstream commit 4fcdaf6e28d11e2f3820d54dd23cd12a47ddd44e ]

The open_req and open_ack completion variables are the state variables
to represet a remote channel as open. Use complete_all so there are no
races with waiters and using completion_done.

Signed-off-by: Chris Lew <clew@codeaurora.org>
Signed-off-by: Arun Kumar Neelakantam <aneela@codeaurora.org>
Signed-off-by: Deepak Kumar Singh <deesin@codeaurora.org>
Link: https://lore.kernel.org/r/1593017121-7953-2-git-send-email-deesin@codeaurora.org
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agobnxt_en: Log unknown link speed appropriately.
Michael Chan [Mon, 12 Oct 2020 09:10:51 +0000 (05:10 -0400)]
bnxt_en: Log unknown link speed appropriately.

[ Upstream commit 8eddb3e7ce124dd6375d3664f1aae13873318b0f ]

If the VF virtual link is set to always enabled, the speed may be
unknown when the physical link is down.  The driver currently logs
the link speed as 4294967295 Mbps which is SPEED_UNKNOWN.  Modify
the link up log message as "speed unknown" which makes more sense.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/1602493854-29283-7-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agof2fs: fix to set SBI_NEED_FSCK flag for inconsistent inode
Chao Yu [Fri, 9 Oct 2020 02:40:48 +0000 (10:40 +0800)]
f2fs: fix to set SBI_NEED_FSCK flag for inconsistent inode

[ Upstream commit d662fad143c0470ad7f46ea7b02da539f613d7d7 ]

If compressed inode has inconsistent fields on i_compress_algorithm,
i_compr_blocks and i_log_cluster_size, we missed to set SBI_NEED_FSCK
to notice fsck to repair the inode, fix it.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agomd/bitmap: md_bitmap_get_counter returns wrong blocks
Zhao Heming [Mon, 5 Oct 2020 16:00:24 +0000 (00:00 +0800)]
md/bitmap: md_bitmap_get_counter returns wrong blocks

[ Upstream commit d837f7277f56e70d82b3a4a037d744854e62f387 ]

md_bitmap_get_counter() has code:

```
    if (bitmap->bp[page].hijacked ||
        bitmap->bp[page].map == NULL)
        csize = ((sector_t)1) << (bitmap->chunkshift +
                      PAGE_COUNTER_SHIFT - 1);
```

The minus 1 is wrong, this branch should report 2048 bits of space.
With "-1" action, this only report 1024 bit of space.

This bug code returns wrong blocks, but it doesn't inflence bitmap logic:
1. Most callers focus this function return value (the counter of offset),
   not the parameter blocks.
2. The bug is only triggered when hijacked is true or map is NULL.
   the hijacked true condition is very rare.
   the "map == null" only true when array is creating or resizing.
3. Even the caller gets wrong blocks, current code makes caller just to
   call md_bitmap_get_counter() one more time.

Signed-off-by: Zhao Heming <heming.zhao@suse.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agobtrfs: fix replace of seed device
Anand Jain [Fri, 4 Sep 2020 17:34:22 +0000 (01:34 +0800)]
btrfs: fix replace of seed device

[ Upstream commit c6a5d954950c5031444173ad2195efc163afcac9 ]

If you replace a seed device in a sprouted fs, it appears to have
successfully replaced the seed device, but if you look closely, it
didn't.  Here is an example.

  $ mkfs.btrfs /dev/sda
  $ btrfstune -S1 /dev/sda
  $ mount /dev/sda /btrfs
  $ btrfs device add /dev/sdb /btrfs
  $ umount /btrfs
  $ btrfs device scan --forget
  $ mount -o device=/dev/sda /dev/sdb /btrfs
  $ btrfs replace start -f /dev/sda /dev/sdc /btrfs
  $ echo $?
  0

  BTRFS info (device sdb): dev_replace from /dev/sda (devid 1) to /dev/sdc started
  BTRFS info (device sdb): dev_replace from /dev/sda (devid 1) to /dev/sdc finished

  $ btrfs fi show
  Label: none  uuid: ab2c88b7-be81-4a7e-9849-c3666e7f9f4f
  Total devices 2 FS bytes used 256.00KiB
  devid    1 size 3.00GiB used 520.00MiB path /dev/sdc
  devid    2 size 3.00GiB used 896.00MiB path /dev/sdb

  Label: none  uuid: 10bd3202-0415-43af-96a8-d5409f310a7e
  Total devices 1 FS bytes used 128.00KiB
  devid    1 size 3.00GiB used 536.00MiB path /dev/sda

So as per the replace start command and kernel log replace was successful.
Now let's try to clean mount.

  $ umount /btrfs
  $ btrfs device scan --forget

  $ mount -o device=/dev/sdc /dev/sdb /btrfs
  mount: /btrfs: wrong fs type, bad option, bad superblock on /dev/sdb, missing codepage or helper program, or other error.

  [  636.157517] BTRFS error (device sdc): failed to read chunk tree: -2
  [  636.180177] BTRFS error (device sdc): open_ctree failed

That's because per dev items it is still looking for the original seed
device.

 $ btrfs inspect-internal dump-tree -d /dev/sdb

item 0 key (DEV_ITEMS DEV_ITEM 1) itemoff 16185 itemsize 98
devid 1 total_bytes 3221225472 bytes_used 545259520
io_align 4096 io_width 4096 sector_size 4096 type 0
generation 6 start_offset 0 dev_group 0
seek_speed 0 bandwidth 0
uuid 59368f50-9af2-4b17-91da-8a783cc418d4  <--- seed uuid
fsid 10bd3202-0415-43af-96a8-d5409f310a7e  <--- seed fsid
item 1 key (DEV_ITEMS DEV_ITEM 2) itemoff 16087 itemsize 98
devid 2 total_bytes 3221225472 bytes_used 939524096
io_align 4096 io_width 4096 sector_size 4096 type 0
generation 0 start_offset 0 dev_group 0
seek_speed 0 bandwidth 0
uuid 56a0a6bc-4630-4998-8daf-3c3030c4256a  <- sprout uuid
fsid ab2c88b7-be81-4a7e-9849-c3666e7f9f4f <- sprout fsid

But the replaced target has the following uuid+fsid in its superblock
which doesn't match with the expected uuid+fsid in its devitem.

  $ btrfs in dump-super /dev/sdc | egrep '^generation|dev_item.uuid|dev_item.fsid|devid'
  generation 20
  dev_item.uuid 59368f50-9af2-4b17-91da-8a783cc418d4
  dev_item.fsid ab2c88b7-be81-4a7e-9849-c3666e7f9f4f [match]
  dev_item.devid 1

So if you provide the original seed device the mount shall be
successful.  Which so long happening in the test case btrfs/163.

  $ btrfs device scan --forget
  $ mount -o device=/dev/sda /dev/sdb /btrfs

Fix in this patch:
If a seed is not sprouted then there is no replacement of it, because of
its read-only filesystem with a read-only device. Similarly, in the case
of a sprouted filesystem, the seed device is still read only. So, mark
it as you can't replace a seed device, you can only add a new device and
then delete the seed device. If replace is attempted then returns
-EINVAL.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoblock: Consider only dispatched requests for inflight statistic
Gabriel Krisman Bertazi [Tue, 6 Oct 2020 19:41:25 +0000 (15:41 -0400)]
block: Consider only dispatched requests for inflight statistic

[ Upstream commit a926c7afffcc0f2e35e6acbccb16921bacf34617 ]

According to Documentation/block/stat.rst, inflight should not include
I/O requests that are in the queue but not yet dispatched to the device,
but blk-mq identifies as inflight any request that has a tag allocated,
which, for queues without elevator, happens at request allocation time
and before it is queued in the ctx (default case in blk_mq_submit_bio).

In addition, current behavior is different for queues with elevator from
queues without it, since for the former the driver tag is allocated at
dispatch time.  A more precise approach would be to only consider
requests with state MQ_RQ_IN_FLIGHT.

This effectively reverts commit 6131837b1de6 ("blk-mq: count allocated
but not started requests in iostats inflight") to consolidate blk-mq
behavior with itself (elevator case) and with original documentation,
but it differs from the behavior used by the legacy path.

This version differs from v1 by using blk_mq_rq_state to access the
state attribute.  Avoid using blk_mq_request_started, which was
suggested, since we don't want to include MQ_RQ_COMPLETE.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Cc: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoARC: [dts] fix the errors detected by dtbs_check
Zhen Lei [Thu, 24 Sep 2020 07:17:54 +0000 (15:17 +0800)]
ARC: [dts] fix the errors detected by dtbs_check

[ Upstream commit 05b1be68c4d6d76970025e6139bfd735c2256ee5 ]

xxx/arc/boot/dts/axs101.dt.yaml: dw-apb-ictl@e0012000: $nodename:0: \
'dw-apb-ictl@e0012000' does not match '^interrupt-controller(@[0-9a-f,]+)*$'
 From schema: xxx/interrupt-controller/snps,dw-apb-ictl.yaml

The node name of the interrupt controller must start with
"interrupt-controller" instead of "dw-apb-ictl".

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agodrm/amd/display: Avoid set zero in the requested clk
Rodrigo Siqueira [Wed, 30 Sep 2020 17:57:54 +0000 (13:57 -0400)]
drm/amd/display: Avoid set zero in the requested clk

[ Upstream commit 2f8be0e516803cc3fd87c1671247896571a5a8fb ]

[Why]
Sometimes CRTCs can be disabled due to display unplugging or temporarily
transition in the userspace; in these circumstances, DCE tries to set
the minimum clock threshold. When we have this situation, the function
bw_calcs is invoked with number_of_displays set to zero, making DCE set
dispclk_khz and sclk_khz to zero. For these reasons, we have seen some
ATOM bios errors that look like:

[drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than
5secs aborting
[drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck
executing EA8A (len 761, WS 0, PS 0) @ 0xEABA

[How]
This error happens due to an attempt to optimize the bandwidth using the
sclk, and the dispclk clock set to zero. Technically we handle this in
the function dce112_set_clock, but we are not considering the case that
this value is set to zero. This commit fixes this issue by ensuring that
we never set a minimum value below the minimum clock threshold.

Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agodrm/amd/display: HDMI remote sink need mode validation for Linux
Fangzhi Zuo [Mon, 21 Sep 2020 21:52:43 +0000 (17:52 -0400)]
drm/amd/display: HDMI remote sink need mode validation for Linux

[ Upstream commit 95d620adb48f7728e67d82f56f756e8d451cf8d2 ]

[Why]
Currently mode validation is bypassed if remote sink exists. That
leads to mode set issue when a BW bottle neck exists in the link path,
e.g., a DP-to-HDMI converter that only supports HDMI 1.4.

Any invalid mode passed to Linux user space will cause the modeset
failure due to limitation of Linux user space implementation.

[How]
Mode validation is skipped only if in edid override. For real remote
sink, clock limit check should be done for HDMI remote sink.

Have HDMI related remote sink going through mode validation to
elimiate modes which pixel clock exceeds BW limitation.

Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agopower: supply: test_power: add missing newlines when printing parameters by sysfs
Xiongfeng Wang [Fri, 4 Sep 2020 06:09:58 +0000 (14:09 +0800)]
power: supply: test_power: add missing newlines when printing parameters by sysfs

[ Upstream commit c07fa6c1631333f02750cf59f22b615d768b4d8f ]

When I cat some module parameters by sysfs, it displays as follows.
It's better to add a newline for easy reading.

root@syzkaller:~# cd /sys/module/test_power/parameters/
root@syzkaller:/sys/module/test_power/parameters# cat ac_online
onroot@syzkaller:/sys/module/test_power/parameters# cat battery_present
trueroot@syzkaller:/sys/module/test_power/parameters# cat battery_health
goodroot@syzkaller:/sys/module/test_power/parameters# cat battery_status
dischargingroot@syzkaller:/sys/module/test_power/parameters# cat battery_technology
LIONroot@syzkaller:/sys/module/test_power/parameters# cat usb_online
onroot@syzkaller:/sys/module/test_power/parameters#

Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3
Jonathan Cameron [Wed, 30 Sep 2020 14:05:45 +0000 (22:05 +0800)]
ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3

[ Upstream commit 2c5b9bde95c96942f2873cea6ef383c02800e4a8 ]

In ACPI 6.3, the Memory Proximity Domain Attributes Structure
changed substantially.  One of those changes was that the flag
for "Memory Proximity Domain field is valid" was deprecated.

This was because the field "Proximity Domain for the Memory"
became a required field and hence having a validity flag makes
no sense.

So the correct logic is to always assume the field is there.
Current code assumes it never is.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agobus/fsl_mc: Do not rely on caller to provide non NULL mc_io
Diana Craciun [Tue, 29 Sep 2020 08:54:38 +0000 (11:54 +0300)]
bus/fsl_mc: Do not rely on caller to provide non NULL mc_io

[ Upstream commit 5026cf605143e764e1785bbf9158559d17f8d260 ]

Before destroying the mc_io, check first that it was
allocated.

Reviewed-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Acked-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Signed-off-by: Diana Craciun <diana.craciun@oss.nxp.com>
Link: https://lore.kernel.org/r/20200929085441.17448-11-diana.craciun@oss.nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agobus: mhi: core: Abort suspends due to outgoing pending packets
Bhaumik Bhatt [Tue, 29 Sep 2020 17:52:02 +0000 (23:22 +0530)]
bus: mhi: core: Abort suspends due to outgoing pending packets

[ Upstream commit 515847c557dd33167be86cb429fc0674a331bc88 ]

Add the missing check to abort suspends if a client driver has pending
outgoing packets to send to the device. This allows better utilization
of the MHI bus wherein clients on the host are not left waiting for
longer suspend or resume cycles to finish for data transfers.

Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20200929175218.8178-4-manivannan.sadhasivam@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agousb: dwc3: core: do not queue work if dr_mode is not USB_DR_MODE_OTG
Li Jun [Sun, 26 Jul 2020 03:17:39 +0000 (11:17 +0800)]
usb: dwc3: core: do not queue work if dr_mode is not USB_DR_MODE_OTG

[ Upstream commit dc336b19e82d0454ea60270cd18fbb4749e162f6 ]

Do not try to queue a drd work if dr_mode is not USB_DR_MODE_OTG
because the work is not inited, this may be triggered by user try
to change mode file of debugfs on a single role port, which will
cause below kernel dump:
[   60.115529] ------------[ cut here ]------------
[   60.120166] WARNING: CPU: 1 PID: 627 at kernel/workqueue.c:1473
__queue_work+0x46c/0x520
[   60.128254] Modules linked in:
[   60.131313] CPU: 1 PID: 627 Comm: sh Not tainted
5.7.0-rc4-00022-g914a586-dirty #135
[   60.139054] Hardware name: NXP i.MX8MQ EVK (DT)
[   60.143585] pstate: a0000085 (NzCv daIf -PAN -UAO)
[   60.148376] pc : __queue_work+0x46c/0x520
[   60.152385] lr : __queue_work+0x314/0x520
[   60.156393] sp : ffff8000124ebc40
[   60.159705] x29: ffff8000124ebc40 x28: ffff800011808018
[   60.165018] x27: ffff800011819ef8 x26: ffff800011d39980
[   60.170331] x25: ffff800011808018 x24: 0000000000000100
[   60.175643] x23: 0000000000000013 x22: 0000000000000001
[   60.180955] x21: ffff0000b7c08e00 x20: ffff0000b6c31080
[   60.186267] x19: ffff0000bb99bc00 x18: 0000000000000000
[   60.191579] x17: 0000000000000000 x16: 0000000000000000
[   60.196891] x15: 0000000000000000 x14: 0000000000000000
[   60.202202] x13: 0000000000000000 x12: 0000000000000000
[   60.207515] x11: 0000000000000000 x10: 0000000000000040
[   60.212827] x9 : ffff800011d55460 x8 : ffff800011d55458
[   60.218138] x7 : ffff0000b7800028 x6 : 0000000000000000
[   60.223450] x5 : ffff0000b7800000 x4 : 0000000000000000
[   60.228762] x3 : ffff0000bb997cc0 x2 : 0000000000000001
[   60.234074] x1 : 0000000000000000 x0 : ffff0000b6c31088
[   60.239386] Call trace:
[   60.241834]  __queue_work+0x46c/0x520
[   60.245496]  queue_work_on+0x6c/0x90
[   60.249075]  dwc3_set_mode+0x48/0x58
[   60.252651]  dwc3_mode_write+0xf8/0x150
[   60.256489]  full_proxy_write+0x5c/0xa8
[   60.260327]  __vfs_write+0x18/0x40
[   60.263729]  vfs_write+0xdc/0x1c8
[   60.267045]  ksys_write+0x68/0xf0
[   60.270360]  __arm64_sys_write+0x18/0x20
[   60.274286]  el0_svc_common.constprop.0+0x68/0x160
[   60.279077]  do_el0_svc+0x20/0x80
[   60.282394]  el0_sync_handler+0x10c/0x178
[   60.286403]  el0_sync+0x140/0x180
[   60.289716] ---[ end trace 70b155582e2b7988 ]---

Signed-off-by: Li Jun <jun.li@nxp.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agodrivers/net/wan/hdlc_fr: Correctly handle special skb->protocol values
Xie He [Mon, 28 Sep 2020 12:56:43 +0000 (05:56 -0700)]
drivers/net/wan/hdlc_fr: Correctly handle special skb->protocol values

[ Upstream commit 8306266c1d51aac9aa7aa907fe99032a58c6382c ]

The fr_hard_header function is used to prepend the header to skbs before
transmission. It is used in 3 situations:
1) When a control packet is generated internally in this driver;
2) When a user sends an skb on an Ethernet-emulating PVC device;
3) When a user sends an skb on a normal PVC device.

These 3 situations need to be handled differently by fr_hard_header.
Different headers should be prepended to the skb in different situations.

Currently fr_hard_header distinguishes these 3 situations using
skb->protocol. For situation 1 and 2, a special skb->protocol value
will be assigned before calling fr_hard_header, so that it can recognize
these 2 situations. All skb->protocol values other than these special ones
are treated by fr_hard_header as situation 3.

However, it is possible that in situation 3, the user sends an skb with
one of the special skb->protocol values. In this case, fr_hard_header
would incorrectly treat it as situation 1 or 2.

This patch tries to solve this issue by using skb->dev instead of
skb->protocol to distinguish between these 3 situations. For situation
1, skb->dev would be NULL; for situation 2, skb->dev->type would be
ARPHRD_ETHER; and for situation 3, skb->dev->type would be ARPHRD_DLCI.

This way fr_hard_header would be able to distinguish these 3 situations
correctly regardless what skb->protocol value the user tries to use in
situation 3.

Cc: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoath11k: change to disable softirqs for ath11k_regd_update to solve deadlock
Wen Gong [Tue, 29 Sep 2020 17:15:34 +0000 (20:15 +0300)]
ath11k: change to disable softirqs for ath11k_regd_update to solve deadlock

[ Upstream commit df648808c6b9989555e247530d8ca0ad0094b361 ]

After base_lock which occupy by ath11k_regd_update, the softirq run for
WMI_REG_CHAN_LIST_CC_EVENTID maybe arrived and it also need to accuire
the spin lock, then deadlock happend, change to disable softirqis to solve it.

[  235.576990] ================================
[  235.576991] WARNING: inconsistent lock state
[  235.576993] 5.9.0-rc5-wt-ath+ #196 Not tainted
[  235.576994] --------------------------------
[  235.576995] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[  235.576997] kworker/u16:1/98 [HC0[0]:SC0[0]:HE1:SE1] takes:
[  235.576998] ffff9655f75cad98 (&ab->base_lock){+.?.}-{2:2}, at: ath11k_regd_update+0x28/0x1d0 [ath11k]
[  235.577009] {IN-SOFTIRQ-W} state was registered at:
[  235.577013]   __lock_acquire+0x219/0x6e0
[  235.577015]   lock_acquire+0xb6/0x270
[  235.577018]   _raw_spin_lock+0x2c/0x70
[  235.577023]   ath11k_reg_chan_list_event.isra.0+0x10d/0x1e0 [ath11k]
[  235.577028]   ath11k_wmi_tlv_op_rx+0x3c3/0x560 [ath11k]
[  235.577033]   ath11k_htc_rx_completion_handler+0x207/0x370 [ath11k]
[  235.577039]   ath11k_ce_recv_process_cb+0x15e/0x1e0 [ath11k]
[  235.577041]   ath11k_pci_ce_tasklet+0x10/0x30 [ath11k_pci]
[  235.577043]   tasklet_action_common.constprop.0+0xd4/0xf0
[  235.577045]   __do_softirq+0xc9/0x482
[  235.577046]   asm_call_on_stack+0x12/0x20
[  235.577048]   do_softirq_own_stack+0x49/0x60
[  235.577049]   irq_exit_rcu+0x9a/0xd0
[  235.577050]   common_interrupt+0xa1/0x190
[  235.577052]   asm_common_interrupt+0x1e/0x40
[  235.577053]   cpu_idle_poll.isra.0+0x2e/0x60
[  235.577055]   do_idle+0x5f/0xe0
[  235.577056]   cpu_startup_entry+0x14/0x20
[  235.577058]   start_kernel+0x443/0x464
[  235.577060]   secondary_startup_64+0xa4/0xb0
[  235.577061] irq event stamp: 432035
[  235.577063] hardirqs last  enabled at (432035): [<ffffffff968d12b4>] _raw_spin_unlock_irqrestore+0x34/0x40
[  235.577064] hardirqs last disabled at (432034): [<ffffffff968d10d3>] _raw_spin_lock_irqsave+0x63/0x80
[  235.577066] softirqs last  enabled at (431998): [<ffffffff967115c1>] inet6_fill_ifla6_attrs+0x3f1/0x430
[  235.577067] softirqs last disabled at (431996): [<ffffffff9671159f>] inet6_fill_ifla6_attrs+0x3cf/0x430
[  235.577068]
[  235.577068] other info that might help us debug this:
[  235.577069]  Possible unsafe locking scenario:
[  235.577069]
[  235.577070]        CPU0
[  235.577070]        ----
[  235.577071]   lock(&ab->base_lock);
[  235.577072]   <Interrupt>
[  235.577073]     lock(&ab->base_lock);
[  235.577074]
[  235.577074]  *** DEADLOCK ***
[  235.577074]
[  235.577075] 3 locks held by kworker/u16:1/98:
[  235.577076]  #0: ffff9655f75b1d48 ((wq_completion)ath11k_qmi_driver_event){+.+.}-{0:0}, at: process_one_work+0x1d3/0x5d0
[  235.577079]  #1: ffffa33cc02f3e70 ((work_completion)(&ab->qmi.event_work)){+.+.}-{0:0}, at: process_one_work+0x1d3/0x5d0
[  235.577081]  #2: ffff9655f75cad50 (&ab->core_lock){+.+.}-{3:3}, at: ath11k_core_qmi_firmware_ready.part.0+0x4e/0x160 [ath11k]
[  235.577087]
[  235.577087] stack backtrace:
[  235.577088] CPU: 3 PID: 98 Comm: kworker/u16:1 Not tainted 5.9.0-rc5-wt-ath+ #196
[  235.577089] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0049.2018.0801.1601 08/01/2018
[  235.577095] Workqueue: ath11k_qmi_driver_event ath11k_qmi_driver_event_work [ath11k]
[  235.577096] Call Trace:
[  235.577100]  dump_stack+0x77/0xa0
[  235.577102]  mark_lock_irq.cold+0x15/0x3c
[  235.577104]  mark_lock+0x1d7/0x540
[  235.577105]  mark_usage+0xc7/0x140
[  235.577107]  __lock_acquire+0x219/0x6e0
[  235.577108]  ? sched_clock_cpu+0xc/0xb0
[  235.577110]  lock_acquire+0xb6/0x270
[  235.577116]  ? ath11k_regd_update+0x28/0x1d0 [ath11k]
[  235.577118]  ? atomic_notifier_chain_register+0x2d/0x40
[  235.577120]  _raw_spin_lock+0x2c/0x70
[  235.577125]  ? ath11k_regd_update+0x28/0x1d0 [ath11k]
[  235.577130]  ath11k_regd_update+0x28/0x1d0 [ath11k]
[  235.577136]  __ath11k_mac_register+0x3fb/0x480 [ath11k]
[  235.577141]  ath11k_mac_register+0x119/0x180 [ath11k]
[  235.577146]  ath11k_core_pdev_create+0x17/0xe0 [ath11k]
[  235.577150]  ath11k_core_qmi_firmware_ready.part.0+0x65/0x160 [ath11k]
[  235.577155]  ath11k_qmi_driver_event_work+0x1c5/0x230 [ath11k]
[  235.577158]  process_one_work+0x265/0x5d0
[  235.577160]  worker_thread+0x49/0x300
[  235.577161]  ? process_one_work+0x5d0/0x5d0
[  235.577163]  kthread+0x135/0x150
[  235.577164]  ? kthread_create_worker_on_cpu+0x60/0x60
[  235.577166]  ret_from_fork+0x22/0x30

Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1

Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1601399736-3210-7-git-send-email-kvalo@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoath11k: fix warning caused by lockdep_assert_held
Carl Huang [Wed, 30 Sep 2020 10:51:12 +0000 (13:51 +0300)]
ath11k: fix warning caused by lockdep_assert_held

[ Upstream commit 2f588660e34a982377109872757f1b99d7748d21 ]

Fix warning caused by lockdep_assert_held when CONFIG_LOCKDEP is enabled.

[  271.940647] WARNING: CPU: 6 PID: 0 at drivers/net/wireless/ath/ath11k/hal.c:818 ath11k_hal_srng_access_begin+0x31/0x40 [ath11k]
[  271.940655] Modules linked in: qrtr_mhi qrtr ns ath11k_pci mhi ath11k qmi_helpers nvme nvme_core
[  271.940675] CPU: 6 PID: 0 Comm: swapper/6 Kdump: loaded Tainted: G        W         5.9.0-rc5-kalle-bringup-wt-ath+ #4
[  271.940682] Hardware name: Dell Inc. Inspiron 7590/08717F, BIOS 1.3.0 07/22/2019
[  271.940698] RIP: 0010:ath11k_hal_srng_access_begin+0x31/0x40 [ath11k]
[  271.940708] Code: 48 89 f3 85 c0 75 11 48 8b 83 a8 00 00 00 8b 00 89 83 b0 00 00 00 5b c3 48 8d 7e 58 be ff ff ff ff e8 53 24 ec fa 85 c0 75 dd <0f> 0b eb d9 90 66 2e 0f 1f 84 00 00 00 00 00 55 53 48 89 f3 8b 35
[  271.940718] RSP: 0018:ffffbdf0c0230df8 EFLAGS: 00010246
[  271.940727] RAX: 0000000000000000 RBX: ffffa12b34e67680 RCX: ffffa12b57a0d800
[  271.940735] RDX: 0000000000000000 RSI: 00000000ffffffff RDI: ffffa12b34e676d8
[  271.940742] RBP: ffffa12b34e60000 R08: 0000000000000001 R09: 0000000000000001
[  271.940753] R10: 0000000000000001 R11: 0000000000000046 R12: 0000000000000000
[  271.940763] R13: ffffa12b34e60000 R14: ffffa12b34e60000 R15: 0000000000000000
[  271.940774] FS:  0000000000000000(0000) GS:ffffa12b5a400000(0000) knlGS:0000000000000000
[  271.940788] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  271.940798] CR2: 00007f8bef282008 CR3: 00000001f4224004 CR4: 00000000003706e0
[  271.940805] Call Trace:
[  271.940813]  <IRQ>
[  271.940835]  ath11k_dp_tx_completion_handler+0x9e/0x950 [ath11k]
[  271.940847]  ? lock_acquire+0xba/0x3b0
[  271.940876]  ath11k_dp_service_srng+0x5a/0x2e0 [ath11k]
[  271.940893]  ath11k_pci_ext_grp_napi_poll+0x1e/0x80 [ath11k_pci]
[  271.940908]  net_rx_action+0x283/0x4f0
[  271.940931]  __do_softirq+0xcb/0x499
[  271.940950]  asm_call_on_stack+0x12/0x20
[  271.940963]  </IRQ>
[  271.940979]  do_softirq_own_stack+0x4d/0x60
[  271.940991]  irq_exit_rcu+0xb0/0xc0
[  271.941001]  common_interrupt+0xce/0x190
[  271.941014]  asm_common_interrupt+0x1e/0x40
[  271.941026] RIP: 0010:cpuidle_enter_state+0x115/0x500

Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1

Signed-off-by: Carl Huang <cjhuang@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1601463073-12106-5-git-send-email-kvalo@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoath11k: Use GFP_ATOMIC instead of GFP_KERNEL in ath11k_dp_htt_get_ppdu_desc
Wen Gong [Tue, 29 Sep 2020 17:15:35 +0000 (20:15 +0300)]
ath11k: Use GFP_ATOMIC instead of GFP_KERNEL in ath11k_dp_htt_get_ppdu_desc

[ Upstream commit 6a8be1baa9116a038cb4f6158cc10134387ca0d0 ]

With SLUB DEBUG CONFIG below crash is seen as kmem_cache_alloc
is being called in non-atomic context.

To fix this issue, use GFP_ATOMIC instead of GFP_KERNEL kzalloc.

[  357.217088] BUG: sleeping function called from invalid context at mm/slab.h:498
[  357.217091] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/0
[  357.217092] INFO: lockdep is turned off.
[  357.217095] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.9.0-rc5-wt-ath+ #196
[  357.217096] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0049.2018.0801.1601 08/01/2018
[  357.217097] Call Trace:
[  357.217098]  <IRQ>
[  357.217107]  ? ath11k_dp_htt_get_ppdu_desc+0xa9/0x170 [ath11k]
[  357.217110]  dump_stack+0x77/0xa0
[  357.217113]  ___might_sleep.cold+0xa6/0xb6
[  357.217116]  kmem_cache_alloc_trace+0x1f2/0x270
[  357.217122]  ath11k_dp_htt_get_ppdu_desc+0xa9/0x170 [ath11k]
[  357.217129]  ath11k_htt_pull_ppdu_stats.isra.0+0x96/0x270 [ath11k]
[  357.217135]  ath11k_dp_htt_htc_t2h_msg_handler+0xe7/0x1d0 [ath11k]
[  357.217137]  ? trace_hardirqs_on+0x1c/0x100
[  357.217143]  ath11k_htc_rx_completion_handler+0x207/0x370 [ath11k]
[  357.217149]  ath11k_ce_recv_process_cb+0x15e/0x1e0 [ath11k]
[  357.217151]  ? handle_irq_event+0x70/0xa8
[  357.217154]  ath11k_pci_ce_tasklet+0x10/0x30 [ath11k_pci]
[  357.217157]  tasklet_action_common.constprop.0+0xd4/0xf0
[  357.217160]  __do_softirq+0xc9/0x482
[  357.217162]  asm_call_on_stack+0x12/0x20
[  357.217163]  </IRQ>
[  357.217166]  do_softirq_own_stack+0x49/0x60
[  357.217167]  irq_exit_rcu+0x9a/0xd0
[  357.217169]  common_interrupt+0xa1/0x190
[  357.217171]  asm_common_interrupt+0x1e/0x40
[  357.217173] RIP: 0010:cpu_idle_poll.isra.0+0x2e/0x60
[  357.217175] Code: 8b 35 26 27 74 69 e8 11 c8 3d ff e8 bc fa 42 ff e8 e7 9f 4a ff fb 65 48 8b 1c 25 80 90 01 00 48 8b 03 a8 08 74 0b eb 1c f3 90 <48> 8b 03 a8 08 75 13 8b 0
[  357.217177] RSP: 0018:ffffffff97403ee0 EFLAGS: 00000202
[  357.217178] RAX: 0000000000000001 RBX: ffffffff9742b8c0 RCX: 0000000000b890ca
[  357.217180] RDX: 0000000000b890ca RSI: 0000000000000001 RDI: ffffffff968d0c49
[  357.217181] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
[  357.217182] R10: ffffffff9742b8c0 R11: 0000000000000046 R12: 0000000000000000
[  357.217183] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000066fdf520
[  357.217186]  ? cpu_idle_poll.isra.0+0x19/0x60
[  357.217189]  do_idle+0x5f/0xe0
[  357.217191]  cpu_startup_entry+0x14/0x20
[  357.217193]  start_kernel+0x443/0x464
[  357.217196]  secondary_startup_64+0xa4/0xb0

Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1

Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1601399736-3210-8-git-send-email-kvalo@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agobrcmfmac: Fix warning message after dongle setup failed
Wright Feng [Mon, 28 Sep 2020 05:49:22 +0000 (00:49 -0500)]
brcmfmac: Fix warning message after dongle setup failed

[ Upstream commit 6aa5a83a7ed8036c1388a811eb8bdfa77b21f19c ]

Brcmfmac showed warning message in fweh.c when checking the size of event
queue which is not initialized. Therefore, we only cancel the worker and
reset event handler only when it is initialized.

[  145.505899] brcmfmac 0000:02:00.0: brcmf_pcie_setup: Dongle setup
[  145.929970] ------------[ cut here ]------------
[  145.929994] WARNING: CPU: 0 PID: 288 at drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c:312
brcmf_fweh_detach+0xbc/0xd0 [brcmfmac]
...
[  145.930029] Call Trace:
[  145.930036]  brcmf_detach+0x77/0x100 [brcmfmac]
[  145.930043]  brcmf_pcie_remove+0x79/0x130 [brcmfmac]
[  145.930046]  pci_device_remove+0x39/0xc0
[  145.930048]  device_release_driver_internal+0x141/0x200
[  145.930049]  device_release_driver+0x12/0x20
[  145.930054]  brcmf_pcie_setup+0x101/0x3c0 [brcmfmac]
[  145.930060]  brcmf_fw_request_done+0x11d/0x1f0 [brcmfmac]
[  145.930062]  ? lock_timer_base+0x7d/0xa0
[  145.930063]  ? internal_add_timer+0x1f/0xa0
[  145.930064]  ? add_timer+0x11a/0x1d0
[  145.930066]  ? __kmalloc_track_caller+0x18c/0x230
[  145.930068]  ? kstrdup_const+0x23/0x30
[  145.930069]  ? add_dr+0x46/0x80
[  145.930070]  ? devres_add+0x3f/0x50
[  145.930072]  ? usermodehelper_read_unlock+0x15/0x20
[  145.930073]  ? _request_firmware+0x288/0xa20
[  145.930075]  request_firmware_work_func+0x36/0x60
[  145.930077]  process_one_work+0x144/0x360
[  145.930078]  worker_thread+0x4d/0x3c0
[  145.930079]  kthread+0x112/0x150
[  145.930080]  ? rescuer_thread+0x340/0x340
[  145.930081]  ? kthread_park+0x60/0x60
[  145.930083]  ret_from_fork+0x25/0x30

Signed-off-by: Wright Feng <wright.feng@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200928054922.44580-3-wright.feng@cypress.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoocteontx2-af: fix LD CUSTOM LTYPE aliasing
Stanislaw Kardach [Tue, 29 Sep 2020 09:28:14 +0000 (11:28 +0200)]
octeontx2-af: fix LD CUSTOM LTYPE aliasing

[ Upstream commit 450f0b978870c384dd81d1176088536555f3170e ]

Since LD contains LTYPE definitions tweaked toward efficient
NIX_AF_RX_FLOW_KEY_ALG(0..31)_FIELD(0..4) usage, the original location
of NPC_LT_LD_CUSTOM0/1 was aliased with MPLS_IN_* definitions.
Moving custom frame to value 6 and 7 removes the aliasing at the cost of
custom frames being also considered when TCP/UDP RSS algo is configured.

However since the goal of CUSTOM frames is to classify them to a
separate set of RQs, this cost is acceptable.

Signed-off-by: Stanislaw Kardach <skardach@marvell.com>
Acked-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoACPI: Add out of bounds and numa_off protections to pxm_to_node()
Jonathan Cameron [Tue, 18 Aug 2020 14:24:25 +0000 (22:24 +0800)]
ACPI: Add out of bounds and numa_off protections to pxm_to_node()

[ Upstream commit 8a3decac087aa897df5af04358c2089e52e70ac4 ]

The function should check the validity of the pxm value before using
it to index the pxm_to_node_map[] array.

Whilst hardening this code may be good in general, the main intent
here is to enable following patches that use this function to replace
acpi_map_pxm_to_node() for non SRAT usecases which should return
NO_NUMA_NODE for PXM entries not matching with those in SRAT.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoxfs: avoid LR buffer overrun due to crafted h_len
Gao Xiang [Tue, 22 Sep 2020 16:41:06 +0000 (09:41 -0700)]
xfs: avoid LR buffer overrun due to crafted h_len

[ Upstream commit f692d09e9c8fd0f5557c2e87f796a16dd95222b8 ]

Currently, crafted h_len has been blocked for the log
header of the tail block in commit a70f9fe52daa ("xfs:
detect and handle invalid iclog size set by mkfs").

However, each log record could still have crafted h_len
and cause log record buffer overrun. So let's check
h_len vs buffer size for each log record as well.

Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoxfs: don't free rt blocks when we're doing a REMAP bunmapi call
Darrick J. Wong [Mon, 21 Sep 2020 16:15:08 +0000 (09:15 -0700)]
xfs: don't free rt blocks when we're doing a REMAP bunmapi call

[ Upstream commit 8df0fa39bdd86ca81a8d706a6ed9d33cc65ca625 ]

When callers pass XFS_BMAPI_REMAP into xfs_bunmapi, they want the extent
to be unmapped from the given file fork without the extent being freed.
We do this for non-rt files, but we forgot to do this for realtime
files.  So far this isn't a big deal since nobody makes a bunmapi call
to a rt file with the REMAP flag set, but don't leave a logic bomb.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agohabanalabs: remove security from ARB_MST_QUIET register
farah kassabri [Tue, 4 Aug 2020 06:41:32 +0000 (09:41 +0300)]
habanalabs: remove security from ARB_MST_QUIET register

[ Upstream commit acd330c141b4c49f468f00719ebc944656061eac ]

Allow user application to write to this register in order
to be able to configure the quiet period of the QMAN between grants.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agocan: flexcan: disable clocks during stop mode
Joakim Zhang [Tue, 10 Dec 2019 09:00:13 +0000 (09:00 +0000)]
can: flexcan: disable clocks during stop mode

[ Upstream commit 02f71c6605e1f8259c07f16178330db766189a74 ]

Disable clocks while CAN core is in stop mode.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Tested-by: Sean Nyekjaer <sean@geanix.com>
Link: https://lore.kernel.org/r/20191210085721.9853-2-qiangqing.zhang@nxp.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoarm64/mm: return cpu_all_mask when node is NUMA_NO_NODE
Zhengyuan Liu [Mon, 21 Sep 2020 02:39:36 +0000 (10:39 +0800)]
arm64/mm: return cpu_all_mask when node is NUMA_NO_NODE

[ Upstream commit a194c5f2d2b3a05428805146afcabe5140b5d378 ]

The @node passed to cpumask_of_node() can be NUMA_NO_NODE, in that
case it will trigger the following WARN_ON(node >= nr_node_ids) due to
mismatched data types of @node and @nr_node_ids. Actually we should
return cpu_all_mask just like most other architectures do if passed
NUMA_NO_NODE.

Also add a similar check to the inline cpumask_of_node() in numa.h.

Signed-off-by: Zhengyuan Liu <liuzhengyuan@tj.kylinos.cn>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Link: https://lore.kernel.org/r/20200921023936.21846-1-liuzhengyuan@tj.kylinos.cn
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agocpuidle: tegra: Correctly handle result of arm_cpuidle_simple_enter()
Dmitry Osipenko [Thu, 9 Jul 2020 17:35:32 +0000 (20:35 +0300)]
cpuidle: tegra: Correctly handle result of arm_cpuidle_simple_enter()

[ Upstream commit 1170433e6611402b869c583fa1fbfd85106ff066 ]

The enter() callback of CPUIDLE drivers returns index of the entered idle
state on success or a negative value on failure. The negative value could
any negative value, i.e. it doesn't necessarily needs to be a error code.
That's because CPUIDLE core only cares about the fact of failure and not
about the reason of the enter() failure.

Like every other enter() callback, the arm_cpuidle_simple_enter() returns
the entered idle-index on success. Unlike some of other drivers, it never
fails. It happened that TEGRA_C1=index=err=0 in the code of cpuidle-tegra
driver, and thus, there is no problem for the cpuidle-tegra driver created
by the typo in the code which assumes that the arm_cpuidle_simple_enter()
returns a error code.

The arm_cpuidle_simple_enter() also may return a -ENODEV error if CPU_IDLE
is disabled in a kernel's config, but all CPUIDLE drivers are disabled if
CPU_IDLE is disabled, including the cpuidle-tegra driver. So we can't ever
see the error code from arm_cpuidle_simple_enter() today.

Of course the code may get some changes in the future and then the
typo may transform into a real bug, so let's correct the typo! The
tegra_cpuidle_state_enter() is now changed to make it return the entered
idle-index on success and negative error code on fail, which puts it on
par with the arm_cpuidle_simple_enter(), making code consistent in regards
to the error handling.

This patch fixes a minor typo in the code, it doesn't fix any bugs.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoSUNRPC: Mitigate cond_resched() in xprt_transmit()
Chuck Lever [Wed, 8 Jul 2020 20:09:53 +0000 (16:09 -0400)]
SUNRPC: Mitigate cond_resched() in xprt_transmit()

[ Upstream commit 6f9f17287e78e5049931af2037b15b26d134a32a ]

The original purpose of this expensive call is to prevent a long
queue of requests from blocking other work.

The cond_resched() call is unnecessary after just a single send
operation.

For longer queues, instead of invoking the kernel scheduler, simply
release the transport send lock and return to the RPC scheduler.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agousb: xhci: omit duplicate actions when suspending a runtime suspended host.
Peter Chen [Fri, 18 Sep 2020 13:17:49 +0000 (16:17 +0300)]
usb: xhci: omit duplicate actions when suspending a runtime suspended host.

[ Upstream commit 18a367e8947d72dd91b6fc401e88a2952c6363f7 ]

If the xhci-plat.c is the platform driver, after the runtime pm is
enabled, the xhci_suspend is called if nothing is connected on
the port. When the system goes to suspend, it will call xhci_suspend again
if USB wakeup is enabled.

Since the runtime suspend wakeup setting is not always the same as
system suspend wakeup setting, eg, at runtime suspend we always need
wakeup if the controller is in low power mode; but at system suspend,
we may not need wakeup. So, we move the judgement after changing
wakeup setting.

[commit message rewording -Mathias]

Reviewed-by: Jun Li <jun.li@nxp.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20200918131752.16488-8-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agomac80211: add missing queue/hash initialization to 802.3 xmit
Felix Fietkau [Tue, 8 Sep 2020 12:36:49 +0000 (14:36 +0200)]
mac80211: add missing queue/hash initialization to 802.3 xmit

[ Upstream commit 5f8d69eaab1915df97f4f2aca89ea16abdd092d5 ]

Fixes AQL for encap-offloaded tx

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20200908123702.88454-2-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agodrm/amdgpu: No sysfs, not an error condition
Luben Tuikov [Wed, 16 Sep 2020 17:03:50 +0000 (13:03 -0400)]
drm/amdgpu: No sysfs, not an error condition

[ Upstream commit 5aea5327ea2ddf544cbeff096f45fc2319b0714e ]

Not being able to create amdgpu sysfs attributes
is not a fatal error warranting not to continue
to try to bring up the display. Thus, if we get
an error trying to create amdgpu sysfs attrs,
report it and continue on to try to bring up
a display.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Slava Abramov <slava.abramov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agocoresight: Make sysfs functional on topologies with per core sink
Linu Cherian [Wed, 16 Sep 2020 19:17:35 +0000 (13:17 -0600)]
coresight: Make sysfs functional on topologies with per core sink

[ Upstream commit 6d578258b955fc8888e1bbd9a8fefe7b10065a84 ]

Coresight driver assumes sink is common across all the ETMs,
and tries to build a path between ETM and the first enabled
sink found using bus based search. This breaks sysFS usage
on implementations that has multiple per core sinks in
enabled state.

To fix this, coresight_get_enabled_sink API is updated to
do a connection based search starting from the given source,
instead of bus based search.
With sink selection using sysfs depecrated for perf interface,
provision for reset is removed as well in this API.

Signed-off-by: Linu Cherian <lcherian@marvell.com>
[Fixed indentation problem and removed obsolete comment]
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200916191737.4001561-15-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agouio: free uio id after uio file node is freed
Lang Dai [Mon, 14 Sep 2020 03:26:41 +0000 (11:26 +0800)]
uio: free uio id after uio file node is freed

[ Upstream commit 8fd0e2a6df262539eaa28b0a2364cca10d1dc662 ]

uio_register_device() do two things.
1) get an uio id from a global pool, e.g. the id is <A>
2) create file nodes like /sys/class/uio/uio<A>

uio_unregister_device() do two things.
1) free the uio id <A> and return it to the global pool
2) free the file node /sys/class/uio/uio<A>

There is a situation is that one worker is calling uio_unregister_device(),
and another worker is calling uio_register_device().
If the two workers are X and Y, they go as below sequence,
1) X free the uio id <AAA>
2) Y get an uio id <AAA>
3) Y create file node /sys/class/uio/uio<AAA>
4) X free the file note /sys/class/uio/uio<AAA>
Then it will failed at the 3rd step and cause the phenomenon we saw as it
is creating a duplicated file node.

Failure reports as follows:
sysfs: cannot create duplicate filename '/class/uio/uio10'
Call Trace:
   sysfs_do_create_link_sd.isra.2+0x9e/0xb0
   sysfs_create_link+0x25/0x40
   device_add+0x2c4/0x640
   __uio_register_device+0x1c5/0x576 [uio]
   adf_uio_init_bundle_dev+0x231/0x280 [intel_qat]
   adf_uio_register+0x1c0/0x340 [intel_qat]
   adf_dev_start+0x202/0x370 [intel_qat]
   adf_dev_start_async+0x40/0xa0 [intel_qat]
   process_one_work+0x14d/0x410
   worker_thread+0x4b/0x460
   kthread+0x105/0x140
 ? process_one_work+0x410/0x410
 ? kthread_bind+0x40/0x40
 ret_from_fork+0x1f/0x40
 Code: 85 c0 48 89 c3 74 12 b9 00 10 00 00 48 89 c2 31 f6 4c 89 ef
 e8 ec c4 ff ff 4c 89 e2 48 89 de 48 c7 c7 e8 b4 ee b4 e8 6a d4 d7
 ff <0f> 0b 48 89 df e8 20 fa f3 ff 5b 41 5c 41 5d 5d c3 66 0f 1f 84
---[ end trace a7531c1ed5269e84 ]---
 c6xxvf b002:00:00.0: Failed to register UIO devices
 c6xxvf b002:00:00.0: Failed to register UIO devices

Signed-off-by: Lang Dai <lang.dai@intel.com>
Link: https://lore.kernel.org/r/1600054002-17722-1-git-send-email-lang.dai@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoUSB: adutux: fix debugging
Oliver Neukum [Thu, 17 Sep 2020 11:26:00 +0000 (13:26 +0200)]
USB: adutux: fix debugging

[ Upstream commit c56150c1bc8da5524831b1dac2eec3c67b89f587 ]

Handling for removal of the controller was missing at one place.
Add it.

Signed-off-by: Oliver Neukum <oneukum@suse.com>
Link: https://lore.kernel.org/r/20200917112600.26508-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agocpufreq: sti-cpufreq: add stih418 support
Alain Volmat [Mon, 31 Aug 2020 06:10:11 +0000 (08:10 +0200)]
cpufreq: sti-cpufreq: add stih418 support

[ Upstream commit 01a163c52039e9426c7d3d3ab16ca261ad622597 ]

The STiH418 can be controlled the same way as STiH407 &
STiH410 regarding cpufreq.

Signed-off-by: Alain Volmat <avolmat@me.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoriscv: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO
Zong Li [Mon, 31 Aug 2020 07:33:49 +0000 (15:33 +0800)]
riscv: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO

[ Upstream commit b5fca7c55f9fbab5ad732c3bce00f31af6ba5cfa ]

AT_VECTOR_SIZE_ARCH should be defined with the maximum number of
NEW_AUX_ENT entries that ARCH_DLINFO can contain, but it wasn't defined
for RISC-V at all even though ARCH_DLINFO will contain one NEW_AUX_ENT
for the VDSO address.

Signed-off-by: Zong Li <zong.li@sifive.com>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agodrm/amd/display: Check clock table return
Rodrigo Siqueira [Thu, 20 Aug 2020 18:26:07 +0000 (14:26 -0400)]
drm/amd/display: Check clock table return

[ Upstream commit 4b4f21ff7f5d11bb77e169b306dcbc5b216f5db5 ]

During the load processes for Renoir, our display code needs to retrieve
the SMU clock and voltage table, however, this operation can fail which
means that we have to check this scenario. Currently, we are not
handling this case properly and as a result, we have seen the following
dmesg log during the boot:

RIP: 0010:rn_clk_mgr_construct+0x129/0x3d0 [amdgpu]
...
Call Trace:
 dc_clk_mgr_create+0x16a/0x1b0 [amdgpu]
 dc_create+0x231/0x760 [amdgpu]

This commit fixes this issue by checking the return status retrieved
from the clock table before try to populate any bandwidth.

Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agosamples/bpf: Fix possible deadlock in xdpsock
Magnus Karlsson [Thu, 10 Sep 2020 08:31:05 +0000 (10:31 +0200)]
samples/bpf: Fix possible deadlock in xdpsock

[ Upstream commit 5a2a0dd88f0f267ac5953acd81050ae43a82201f ]

Fix a possible deadlock in the l2fwd application in xdpsock that can
occur when there is no space in the Tx ring. There are two ways to get
the kernel to consume entries in the Tx ring: calling sendto() to make
it send packets and freeing entries from the completion ring, as the
kernel will not send a packet if there is no space for it to add a
completion entry in the completion ring. The Tx loop in l2fwd only
used to call sendto(). This patches adds cleaning the completion ring
in that loop.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1599726666-8431-3-git-send-email-magnus.karlsson@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoselinux: access policycaps with READ_ONCE/WRITE_ONCE
Stephen Smalley [Thu, 10 Sep 2020 14:28:05 +0000 (10:28 -0400)]
selinux: access policycaps with READ_ONCE/WRITE_ONCE

[ Upstream commit e8ba53d0023a76ba0f50e6ee3e6288c5442f9d33 ]

Use READ_ONCE/WRITE_ONCE for all accesses to the
selinux_state.policycaps booleans to prevent compiler
mischief.

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoselftests/bpf: Define string const as global for test_sysctl_prog.c
Yonghong Song [Thu, 10 Sep 2020 20:27:18 +0000 (13:27 -0700)]
selftests/bpf: Define string const as global for test_sysctl_prog.c

[ Upstream commit 6e057fc15a2da4ee03eb1fa6889cf687e690106e ]

When tweaking llvm optimizations, I found that selftest build failed
with the following error:
  libbpf: elf: skipping unrecognized data section(6) .rodata.str1.1
  libbpf: prog 'sysctl_tcp_mem': bad map relo against '.L__const.is_tcp_mem.tcp_mem_name'
          in section '.rodata.str1.1'
  Error: failed to open BPF object file: Relocation failed
  make: *** [/work/net-next/tools/testing/selftests/bpf/test_sysctl_prog.skel.h] Error 255
  make: *** Deleting file `/work/net-next/tools/testing/selftests/bpf/test_sysctl_prog.skel.h'

The local string constant "tcp_mem_name" is put into '.rodata.str1.1' section
which libbpf cannot handle. Using untweaked upstream llvm, "tcp_mem_name"
is completely inlined after loop unrolling.

Commit 7fb5eefd7639 ("selftests/bpf: Fix test_sysctl_loop{1, 2}
failure due to clang change") solved a similar problem by defining
the string const as a global. Let us do the same here
for test_sysctl_prog.c so it can weather future potential llvm changes.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200910202718.956042-1-yhs@fb.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agonfc: s3fwrn5: Add missing CRYPTO_HASH dependency
Krzysztof Kozlowski [Thu, 10 Sep 2020 16:12:16 +0000 (18:12 +0200)]
nfc: s3fwrn5: Add missing CRYPTO_HASH dependency

[ Upstream commit 4aa62c62d4c41d71b2bda5ed01b78961829ee93c ]

The driver uses crypto hash functions so it needs to select CRYPTO_HASH.
This fixes build errors:

  arc-linux-ld: drivers/nfc/s3fwrn5/firmware.o: in function `s3fwrn5_fw_download':
  firmware.c:(.text+0x152): undefined reference to `crypto_alloc_shash'

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agomedia: uvcvideo: Fix dereference of out-of-bound list iterator
Daniel W. S. Almeida [Fri, 7 Aug 2020 08:35:30 +0000 (10:35 +0200)]
media: uvcvideo: Fix dereference of out-of-bound list iterator

[ Upstream commit f875bcc375c738bf2f599ff2e1c5b918dbd07c45 ]

Fixes the following coccinelle report:

drivers/media/usb/uvc/uvc_ctrl.c:1860:5-11:
ERROR: invalid reference to the index variable of the iterator on line 1854

by adding a boolean variable to check if the loop has found the

Found using - Coccinelle (http://coccinelle.lip6.fr)

[Replace cursor variable with bool found]

Signed-off-by: Daniel W. S. Almeida <dwlsalmeida@gmail.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agodrm: panfrost: fix common struct sg_table related issues
Marek Szyprowski [Tue, 28 Apr 2020 11:09:35 +0000 (13:09 +0200)]
drm: panfrost: fix common struct sg_table related issues

[ Upstream commit 34a4e66faf8b22c8409cbd46839ba5e488b1e6a9 ]

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agodrm: lima: fix common struct sg_table related issues
Marek Szyprowski [Tue, 28 Apr 2020 11:09:11 +0000 (13:09 +0200)]
drm: lima: fix common struct sg_table related issues

[ Upstream commit c3d9c17f486d5c54940487dc31a54ebfdeeb371a ]

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoxen: gntdev: fix common struct sg_table related issues
Marek Szyprowski [Fri, 8 May 2020 14:07:13 +0000 (16:07 +0200)]
xen: gntdev: fix common struct sg_table related issues

[ Upstream commit d1749eb1ab85e04e58c29e58900e3abebbdd6e82 ]

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agodrm: exynos: fix common struct sg_table related issues
Marek Szyprowski [Tue, 28 Apr 2020 11:08:41 +0000 (13:08 +0200)]
drm: exynos: fix common struct sg_table related issues

[ Upstream commit 84404614167b829f7b58189cd24b6c0c74897171 ]

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Acked-by : Inki Dae <inki.dae@samsung.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agobpf: Permit map_ptr arithmetic with opcode add and offset 0
Yonghong Song [Tue, 8 Sep 2020 17:57:02 +0000 (10:57 -0700)]
bpf: Permit map_ptr arithmetic with opcode add and offset 0

[ Upstream commit 7c6967326267bd5c0dded0a99541357d70dd11ac ]

Commit 41c48f3a98231 ("bpf: Support access
to bpf map fields") added support to access map fields
with CORE support. For example,

            struct bpf_map {
                    __u32 max_entries;
            } __attribute__((preserve_access_index));

            struct bpf_array {
                    struct bpf_map map;
                    __u32 elem_size;
            } __attribute__((preserve_access_index));

            struct {
                    __uint(type, BPF_MAP_TYPE_ARRAY);
                    __uint(max_entries, 4);
                    __type(key, __u32);
                    __type(value, __u32);
            } m_array SEC(".maps");

            SEC("cgroup_skb/egress")
            int cg_skb(void *ctx)
            {
                    struct bpf_array *array = (struct bpf_array *)&m_array;

                    /* .. array->map.max_entries .. */
            }

In kernel, bpf_htab has similar structure,

    struct bpf_htab {
    struct bpf_map map;
                    ...
            }

In the above cg_skb(), to access array->map.max_entries, with CORE, the clang will
generate two builtin's.
            base = &m_array;
            /* access array.map */
            map_addr = __builtin_preserve_struct_access_info(base, 0, 0);
            /* access array.map.max_entries */
            max_entries_addr = __builtin_preserve_struct_access_info(map_addr, 0, 0);
    max_entries = *max_entries_addr;

In the current llvm, if two builtin's are in the same function or
in the same function after inlining, the compiler is smart enough to chain
them together and generates like below:
            base = &m_array;
            max_entries = *(base + reloc_offset); /* reloc_offset = 0 in this case */
and we are fine.

But if we force no inlining for one of functions in test_map_ptr() selftest, e.g.,
check_default(), the above two __builtin_preserve_* will be in two different
functions. In this case, we will have code like:
   func check_hash():
            reloc_offset_map = 0;
            base = &m_array;
            map_base = base + reloc_offset_map;
            check_default(map_base, ...)
   func check_default(map_base, ...):
            max_entries = *(map_base + reloc_offset_max_entries);

In kernel, map_ptr (CONST_PTR_TO_MAP) does not allow any arithmetic.
The above "map_base = base + reloc_offset_map" will trigger a verifier failure.
  ; VERIFY(check_default(&hash->map, map));
  0: (18) r7 = 0xffffb4fe8018a004
  2: (b4) w1 = 110
  3: (63) *(u32 *)(r7 +0) = r1
   R1_w=invP110 R7_w=map_value(id=0,off=4,ks=4,vs=8,imm=0) R10=fp0
  ; VERIFY_TYPE(BPF_MAP_TYPE_HASH, check_hash);
  4: (18) r1 = 0xffffb4fe8018a000
  6: (b4) w2 = 1
  7: (63) *(u32 *)(r1 +0) = r2
   R1_w=map_value(id=0,off=0,ks=4,vs=8,imm=0) R2_w=invP1 R7_w=map_value(id=0,off=4,ks=4,vs=8,imm=0) R10=fp0
  8: (b7) r2 = 0
  9: (18) r8 = 0xffff90bcb500c000
  11: (18) r1 = 0xffff90bcb500c000
  13: (0f) r1 += r2
  R1 pointer arithmetic on map_ptr prohibited

To fix the issue, let us permit map_ptr + 0 arithmetic which will
result in exactly the same map_ptr.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200908175702.2463625-1-yhs@fb.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agokgdb: Make "kgdbcon" work properly with "kgdb_earlycon"
Douglas Anderson [Tue, 30 Jun 2020 22:14:38 +0000 (15:14 -0700)]
kgdb: Make "kgdbcon" work properly with "kgdb_earlycon"

[ Upstream commit b18b099e04f450cdc77bec72acefcde7042bd1f3 ]

On my system the kernel processes the "kgdb_earlycon" parameter before
the "kgdbcon" parameter.  When we setup "kgdb_earlycon" we'll end up
in kgdb_register_callbacks() and "kgdb_use_con" won't have been set
yet so we'll never get around to starting "kgdbcon".  Let's remedy
this by detecting that the IO module was already registered when
setting "kgdb_use_con" and registering the console then.

As part of this, to avoid pre-declaring things, move the handling of
the "kgdbcon" further down in the file.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200630151422.1.I4aa062751ff5e281f5116655c976dff545c09a46@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoselftests/powerpc: Make using_hash_mmu() work on Cell & PowerMac
Michael Ellerman [Wed, 19 Aug 2020 01:57:19 +0000 (11:57 +1000)]
selftests/powerpc: Make using_hash_mmu() work on Cell & PowerMac

[ Upstream commit 34c103342be3f9397e656da7c5cc86e97b91f514 ]

These platforms don't show the MMU in /proc/cpuinfo, but they always
use hash, so teach using_hash_mmu() that.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200819015727.1977134-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoia64: kprobes: Use generic kretprobe trampoline handler
Masami Hiramatsu [Sat, 29 Aug 2020 13:01:09 +0000 (22:01 +0900)]
ia64: kprobes: Use generic kretprobe trampoline handler

[ Upstream commit e792ff804f49720ce003b3e4c618b5d996256a18 ]

Use the generic kretprobe trampoline handler. Don't use
framepointer verification.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/159870606883.1229682.12331813108378725668.stgit@devnote2
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoprintk: reduce LOG_BUF_SHIFT range for H8300
John Ogness [Wed, 12 Aug 2020 07:31:22 +0000 (09:37 +0206)]
printk: reduce LOG_BUF_SHIFT range for H8300

[ Upstream commit 550c10d28d21bd82a8bb48debbb27e6ed53262f6 ]

The .bss section for the h8300 is relatively small. A value of
CONFIG_LOG_BUF_SHIFT that is larger than 19 will create a static
printk ringbuffer that is too large. Limit the range appropriately
for the H8300.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200812073122.25412-1-john.ogness@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoarm64: topology: Stop using MPIDR for topology information
Valentin Schneider [Sat, 29 Aug 2020 13:00:16 +0000 (14:00 +0100)]
arm64: topology: Stop using MPIDR for topology information

[ Upstream commit 3102bc0e6ac752cc5df896acb557d779af4d82a1 ]

In the absence of ACPI or DT topology data, we fallback to haphazardly
decoding *something* out of MPIDR. Sadly, the contents of that register are
mostly unusable due to the implementation leniancy and things like Aff0
having to be capped to 15 (despite being encoded on 8 bits).

Consider a simple system with a single package of 32 cores, all under the
same LLC. We ought to be shoving them in the same core_sibling mask, but
MPIDR is going to look like:

  | CPU  | 0 | ... | 15 | 16 | ... | 31 |
  |------+---+-----+----+----+-----+----+
  | Aff0 | 0 | ... | 15 |  0 | ... | 15 |
  | Aff1 | 0 | ... |  0 |  1 | ... |  1 |
  | Aff2 | 0 | ... |  0 |  0 | ... |  0 |

Which will eventually yield

  core_sibling(0-15)  == 0-15
  core_sibling(16-31) == 16-31

NUMA woes
=========

If we try to play games with this and set up NUMA boundaries within those
groups of 16 cores via e.g. QEMU:

  # Node0: 0-9; Node1: 10-19
  $ qemu-system-aarch64 <blah> \
    -smp 20 -numa node,cpus=0-9,nodeid=0 -numa node,cpus=10-19,nodeid=1

The scheduler's MC domain (all CPUs with same LLC) is going to be built via

  arch_topology.c::cpu_coregroup_mask()

In there we try to figure out a sensible mask out of the topology
information we have. In short, here we'll pick the smallest of NUMA or
core sibling mask.

  node_mask(CPU9)    == 0-9
  core_sibling(CPU9) == 0-15

MC mask for CPU9 will thus be 0-9, not a problem.

  node_mask(CPU10)    == 10-19
  core_sibling(CPU10) == 0-15

MC mask for CPU10 will thus be 10-19, not a problem.

  node_mask(CPU16)    == 10-19
  core_sibling(CPU16) == 16-19

MC mask for CPU16 will thus be 16-19... Uh oh. CPUs 16-19 are in two
different unique MC spans, and the scheduler has no idea what to make of
that. That triggers the WARN_ON() added by commit

  ccf74128d66c ("sched/topology: Assert non-NUMA topology masks don't (partially) overlap")

Fixing MPIDR-derived topology
=============================

We could try to come up with some cleverer scheme to figure out which of
the available masks to pick, but really if one of those masks resulted from
MPIDR then it should be discarded because it's bound to be bogus.

I was hoping to give MPIDR a chance for SMT, to figure out which threads are
in the same core using Aff1-3 as core ID, but Sudeep and Robin pointed out
to me that there are systems out there where *all* cores have non-zero
values in their higher affinity fields (e.g. RK3288 has "5" in all of its
cores' MPIDR.Aff1), which would expose a bogus core ID to userspace.

Stop using MPIDR for topology information. When no other source of topology
information is available, mark each CPU as its own core and its NUMA node
as its LLC domain.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20200829130016.26106-1-valentin.schneider@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agobrcmfmac: increase F2 watermark for BCM4329
Dmitry Osipenko [Sun, 30 Aug 2020 19:14:37 +0000 (22:14 +0300)]
brcmfmac: increase F2 watermark for BCM4329

[ Upstream commit 317da69d10b0247c4042354eb90c75b81620ce9d ]

This patch fixes SDHCI CRC errors during of RX throughput testing on
BCM4329 chip if SDIO BUS is clocked above 25MHz. In particular the
checksum problem is observed on NVIDIA Tegra20 SoCs. The good watermark
value is borrowed from downstream BCMDHD driver and it's matching to the
value that is already used for the BCM4339 chip, hence let's re-use it
for BCM4329.

Reviewed-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200830191439.10017-2-digetx@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agodrm/bridge/synopsys: dsi: add support for non-continuous HS clock
Antonio Borneo [Wed, 1 Jul 2020 19:42:34 +0000 (21:42 +0200)]
drm/bridge/synopsys: dsi: add support for non-continuous HS clock

[ Upstream commit c6d94e37bdbb6dfe7e581e937a915ab58399b8a5 ]

Current code enables the HS clock when video mode is started or to
send out a HS command, and disables the HS clock to send out a LP
command. This is not what DSI spec specify.

Enable HS clock either in command and in video mode.
Set automatic HS clock management for panels and devices that
support non-continuous HS clock.

Signed-off-by: Antonio Borneo <antonio.borneo@st.com>
Tested-by: Philippe Cornu <philippe.cornu@st.com>
Reviewed-by: Philippe Cornu <philippe.cornu@st.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200701194234.18123-1-yannick.fertre@st.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agommc: via-sdmmc: Fix data race bug
Madhuparna Bhowmik [Sat, 22 Aug 2020 06:15:28 +0000 (11:45 +0530)]
mmc: via-sdmmc: Fix data race bug

[ Upstream commit 87d7ad089b318b4f319bf57f1daa64eb6d1d10ad ]

via_save_pcictrlreg() should be called with host->lock held
as it writes to pm_pcictrl_reg, otherwise there can be a race
condition between via_sd_suspend() and via_sdc_card_detect().
The same pattern is used in the function via_reset_pcictrl()
as well, where via_save_pcictrlreg() is called with host->lock
held.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com>
Link: https://lore.kernel.org/r/20200822061528.7035-1-madhuparnabhowmik10@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agomedia: imx274: fix frame interval handling
Hans Verkuil [Fri, 3 Jul 2020 09:20:32 +0000 (11:20 +0200)]
media: imx274: fix frame interval handling

[ Upstream commit 49b20d981d723fae5a93843c617af2b2c23611ec ]

1) the numerator and/or denominator might be 0, in that case
   fall back to the default frame interval. This is per the spec
   and this caused a v4l2-compliance failure.

2) the updated frame interval wasn't returned in the s_frame_interval
   subdev op.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reviewed-by: Luca Ceresoli <luca@lucaceresoli.net>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agodrm/vkms: avoid warning in vkms_get_vblank_timestamp
Sidong Yang [Fri, 28 Aug 2020 12:45:53 +0000 (12:45 +0000)]
drm/vkms: avoid warning in vkms_get_vblank_timestamp

[ Upstream commit 05ca530268a9d0ab3547e7b288635e35990a77c4 ]

This patch avoid the warning in vkms_get_vblank_timestamp when vblanks
aren't enabled. When running igt test kms_cursor_crc just after vkms
module, the warning raised like below. Initial value of vblank time is
zero and hrtimer.node.expires is also zero if vblank aren't enabled
before. vkms module isn't real hardware but just virtual hardware
module. so vkms can't generate a resonable timestamp when hrtimer is
off. it's best to grab the current time.

[106444.464503] [IGT] kms_cursor_crc: starting subtest pipe-A-cursor-size-change
[106444.471475] WARNING: CPU: 0 PID: 10109 at
vkms_get_vblank_timestamp+0x42/0x50 [vkms]
[106444.471511] CPU: 0 PID: 10109 Comm: kms_cursor_crc Tainted: G        W  OE
5.9.0-rc1+ #6
[106444.471514] RIP: 0010:vkms_get_vblank_timestamp+0x42/0x50 [vkms]
[106444.471528] Call Trace:
[106444.471551]  drm_get_last_vbltimestamp+0xb9/0xd0 [drm]
[106444.471566]  drm_reset_vblank_timestamp+0x63/0xe0 [drm]
[106444.471579]  drm_crtc_vblank_on+0x85/0x150 [drm]
[106444.471582]  vkms_crtc_atomic_enable+0xe/0x10 [vkms]
[106444.471592]  drm_atomic_helper_commit_modeset_enables+0x1db/0x230
[drm_kms_helper]
[106444.471594]  vkms_atomic_commit_tail+0x38/0xc0 [vkms]
[106444.471601]  commit_tail+0x97/0x130 [drm_kms_helper]
[106444.471608]  drm_atomic_helper_commit+0x117/0x140 [drm_kms_helper]
[106444.471622]  drm_atomic_commit+0x4a/0x50 [drm]
[106444.471629]  drm_atomic_helper_set_config+0x63/0xb0 [drm_kms_helper]
[106444.471642]  drm_mode_setcrtc+0x1d9/0x7b0 [drm]
[106444.471654]  ? drm_mode_getcrtc+0x1a0/0x1a0 [drm]
[106444.471666]  drm_ioctl_kernel+0xb6/0x100 [drm]
[106444.471677]  drm_ioctl+0x3ad/0x470 [drm]
[106444.471688]  ? drm_mode_getcrtc+0x1a0/0x1a0 [drm]
[106444.471692]  ? tomoyo_file_ioctl+0x19/0x20
[106444.471694]  __x64_sys_ioctl+0x96/0xd0
[106444.471697]  do_syscall_64+0x37/0x80
[106444.471699]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
Cc: Melissa Wen <melissa.srw@gmail.com>
Signed-off-by: Sidong Yang <realwakka@gmail.com>
Reviewed-by: Melissa Wen <melissa.srw@gmail.com>
Signed-off-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200828124553.2178-1-realwakka@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agomedia: tw5864: check status of tw5864_frameinterval_get
Tom Rix [Mon, 10 Aug 2020 19:25:18 +0000 (21:25 +0200)]
media: tw5864: check status of tw5864_frameinterval_get

[ Upstream commit 780d815dcc9b34d93ae69385a8465c38d423ff0f ]

clang static analysis reports this problem

tw5864-video.c:773:32: warning: The left expression of the compound
  assignment is an uninitialized value.
  The computed value will also be garbage
        fintv->stepwise.max.numerator *= std_max_fps;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^

stepwise.max is set with frameinterval, which comes from

ret = tw5864_frameinterval_get(input, &frameinterval);
fintv->stepwise.step = frameinterval;
fintv->stepwise.min = frameinterval;
fintv->stepwise.max = frameinterval;
fintv->stepwise.max.numerator *= std_max_fps;

When tw5864_frameinterval_get() fails, frameinterval is not
set. So check the status and fix another similar problem.

Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agousb: typec: tcpm: During PR_SWAP, source caps should be sent only after tSwapSourceStart
Badhri Jagan Sridharan [Mon, 17 Aug 2020 18:38:27 +0000 (11:38 -0700)]
usb: typec: tcpm: During PR_SWAP, source caps should be sent only after tSwapSourceStart

[ Upstream commit 6bbe2a90a0bb4af8dd99c3565e907fe9b5e7fd88 ]

The patch addresses the compliance test failures while running
TD.PD.CP.E3, TD.PD.CP.E4, TD.PD.CP.E5 of the "Deterministic PD
Compliance MOI" test plan published in https://www.usb.org/usbc.
For a product to be Type-C compliant, it's expected that these tests
are run on usb.org certified Type-C compliance tester as mentioned in
https://www.usb.org/usbc.

The purpose of the tests TD.PD.CP.E3, TD.PD.CP.E4, TD.PD.CP.E5 is to
verify the PR_SWAP response of the device. While doing so, the test
asserts that Source Capabilities message is NOT received from the test
device within tSwapSourceStart min (20 ms) from the time the last bit
of GoodCRC corresponding to the RS_RDY message sent by the UUT was
sent. If it does then the test fails.

This is in line with the requirements from the USB Power Delivery
Specification Revision 3.0, Version 1.2:
"6.6.8.1 SwapSourceStartTimer
The SwapSourceStartTimer Shall be used by the new Source, after a
Power Role Swap or Fast Role Swap, to ensure that it does not send
Source_Capabilities Message before the new Sink is ready to receive
the
Source_Capabilities Message. The new Source Shall Not send the
Source_Capabilities Message earlier than tSwapSourceStart after the
last bit of the EOP of GoodCRC Message sent in response to the PS_RDY
Message sent by the new Source indicating that its power supply is
ready."

The patch makes sure that TCPM does not send the Source_Capabilities
Message within tSwapSourceStart(20ms) by transitioning into
SRC_STARTUP only after  tSwapSourceStart(20ms).

Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20200817183828.1895015-1-badhri@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agomedia: platform: Improve queue set up flow for bug fixing
Xia Jiang [Fri, 14 Aug 2020 07:11:35 +0000 (09:11 +0200)]
media: platform: Improve queue set up flow for bug fixing

[ Upstream commit 5095a6413a0cf896ab468009b6142cb0fe617e66 ]

Add checking created buffer size follow in mtk_jpeg_queue_setup().

Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Xia Jiang <xia.jiang@mediatek.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agostaging: wfx: fix potential use before init
Jérôme Pouiller [Tue, 25 Aug 2020 08:58:24 +0000 (10:58 +0200)]
staging: wfx: fix potential use before init

[ Upstream commit ce3653a8d3db096aa163fc80239d8ec1305c81fa ]

The trace below can appear:

    [83613.832200] INFO: trying to register non-static key.
    [83613.837248] the code is fine but needs lockdep annotation.
    [83613.842808] turning off the locking correctness validator.
    [83613.848375] CPU: 3 PID: 141 Comm: kworker/3:2H Tainted: G           O      5.6.13-silabs15 #2
    [83613.857019] Hardware name: BCM2835
    [83613.860605] Workqueue: events_highpri bh_work [wfx]
    [83613.865552] Backtrace:
    [83613.868041] [<c010f2cc>] (dump_backtrace) from [<c010f7b8>] (show_stack+0x20/0x24)
    [83613.881463] [<c010f798>] (show_stack) from [<c0d82138>] (dump_stack+0xe8/0x114)
    [83613.888882] [<c0d82050>] (dump_stack) from [<c01a02ec>] (register_lock_class+0x748/0x768)
    [83613.905035] [<c019fba4>] (register_lock_class) from [<c019da04>] (__lock_acquire+0x88/0x13dc)
    [83613.924192] [<c019d97c>] (__lock_acquire) from [<c019f6a4>] (lock_acquire+0xe8/0x274)
    [83613.942644] [<c019f5bc>] (lock_acquire) from [<c0daa5dc>] (_raw_spin_lock_irqsave+0x58/0x6c)
    [83613.961714] [<c0daa584>] (_raw_spin_lock_irqsave) from [<c0ab3248>] (skb_dequeue+0x24/0x78)
    [83613.974967] [<c0ab3224>] (skb_dequeue) from [<bf330db0>] (wfx_tx_queues_get+0x96c/0x1294 [wfx])
    [83613.989728] [<bf330444>] (wfx_tx_queues_get [wfx]) from [<bf320454>] (bh_work+0x454/0x26d8 [wfx])
    [83614.009337] [<bf320000>] (bh_work [wfx]) from [<c014c920>] (process_one_work+0x23c/0x7ec)
    [83614.028141] [<c014c6e4>] (process_one_work) from [<c014cf1c>] (worker_thread+0x4c/0x55c)
    [83614.046861] [<c014ced0>] (worker_thread) from [<c0154c04>] (kthread+0x138/0x168)
    [83614.064876] [<c0154acc>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
    [83614.072200] Exception stack(0xecad3fb0 to 0xecad3ff8)
    [83614.077323] 3fa0:                                     00000000 00000000 00000000 00000000
    [83614.085620] 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    [83614.093914] 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000

Indeed, the code of wfx_add_interface() shows that the interface is
enabled to early. So, the spinlock associated with some skb_queue may
not yet initialized when wfx_tx_queues_get() is called.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200825085828.399505-8-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agomisc: fastrpc: fix common struct sg_table related issues
Marek Szyprowski [Wed, 26 Aug 2020 06:33:12 +0000 (08:33 +0200)]
misc: fastrpc: fix common struct sg_table related issues

[ Upstream commit 7cd7edb89437457ec36ffdbb970cc314d00c4aba ]

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20200826063316.23486-29-m.szyprowski@samsung.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoASoC: AMD: Clean kernel log from deferred probe error messages
Akshu Agrawal [Wed, 26 Aug 2020 18:54:20 +0000 (00:24 +0530)]
ASoC: AMD: Clean kernel log from deferred probe error messages

[ Upstream commit f7660445c8e7fda91e8b944128554249d886b1d4 ]

While the driver waits for DAIs to be probed and retries probing,
have the error messages at debug level instead of error.

Signed-off-by: Akshu Agrawal <akshu.agrawal@amd.com>
Link: https://lore.kernel.org/r/20200826185454.5545-1-akshu.agrawal@amd.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agomedia: videodev2.h: RGB BT2020 and HSV are always full range
Hans Verkuil [Thu, 20 Aug 2020 10:47:16 +0000 (12:47 +0200)]
media: videodev2.h: RGB BT2020 and HSV are always full range

[ Upstream commit b305dfe2e93434b12d438434461b709641f62af4 ]

The default RGB quantization range for BT.2020 is full range (just as for
all the other RGB pixel encodings), not limited range.

Update the V4L2_MAP_QUANTIZATION_DEFAULT macro and documentation
accordingly.

Also mention that HSV is always full range and cannot be limited range.

When RGB BT2020 was introduced in V4L2 it was not clear whether it should
be limited or full range, but full range is the right (and consistent)
choice.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agodrm/bridge_connector: Set default status connected for eDP connectors
Enric Balletbo i Serra [Wed, 26 Aug 2020 08:15:22 +0000 (10:15 +0200)]
drm/bridge_connector: Set default status connected for eDP connectors

[ Upstream commit c5589b39549d1875bb506da473bf4580c959db8c ]

In an eDP application, HPD is not required and on most bridge chips
useless. If HPD is not used, we need to set initial status as connected,
otherwise the connector created by the drm_bridge_connector API remains
in an unknown state.

Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Reviewed-by: Bilal Wasim <bwasim.lkml@gmail.com>
Tested-by: Bilal Wasim <bwasim.lkml@gmail.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200826081526.674866-2-enric.balletbo@collabora.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoselftests/x86/fsgsbase: Reap a forgotten child
Andy Lutomirski [Wed, 26 Aug 2020 17:00:45 +0000 (10:00 -0700)]
selftests/x86/fsgsbase: Reap a forgotten child

[ Upstream commit ab2dd173330a3f07142e68cd65682205036cd00f ]

The ptrace() test forgot to reap its child.  Reap it.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/e7700a503f30e79ab35a63103938a19893dbeff2.1598461151.git.luto@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoASoC: SOF: fix a runtime pm issue in SOF when HDMI codec doesn't work
Rander Wang [Tue, 25 Aug 2020 23:50:35 +0000 (16:50 -0700)]
ASoC: SOF: fix a runtime pm issue in SOF when HDMI codec doesn't work

[ Upstream commit 6c63c954e1c52f1262f986f36d95f557c6f8fa94 ]

When hda_codec_probe() doesn't initialize audio component, we disable
the codec and keep going. However,the resources are not released. The
child_count of SOF device is increased in snd_hdac_ext_bus_device_init
but is not decrease in error case, so SOF can't get suspended.

snd_hdac_ext_bus_device_exit will be invoked in HDA framework if it
gets a error. Now copy this behavior to release resources and decrease
SOF device child_count to release SOF device.

Signed-off-by: Rander Wang <rander.wang@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Link: https://lore.kernel.org/r/20200825235040.1586478-3-ranjani.sridharan@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agodrm/brige/megachips: Add checking if ge_b850v3_lvds_init() is working correctly
Nadezda Lutovinova [Wed, 19 Aug 2020 14:37:56 +0000 (17:37 +0300)]
drm/brige/megachips: Add checking if ge_b850v3_lvds_init() is working correctly

[ Upstream commit f688a345f0d7a6df4dd2aeca8e4f3c05e123a0ee ]

If ge_b850v3_lvds_init() does not allocate memory for ge_b850v3_lvds_ptr,
then a null pointer dereference is accessed.

The patch adds checking of the return value of ge_b850v3_lvds_init().

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Nadezda Lutovinova <lutovinova@ispras.ru>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819143756.30626-1-lutovinova@ispras.ru
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agodrm/scheduler: Scheduler priority fixes (v2)
Luben Tuikov [Tue, 11 Aug 2020 23:59:58 +0000 (19:59 -0400)]
drm/scheduler: Scheduler priority fixes (v2)

[ Upstream commit e2d732fdb7a9e421720a644580cd6a9400f97f60 ]

Remove DRM_SCHED_PRIORITY_LOW, as it was used
in only one place.

Rename and separate by a line
DRM_SCHED_PRIORITY_MAX to DRM_SCHED_PRIORITY_COUNT
as it represents a (total) count of said
priorities and it is used as such in loops
throughout the code. (0-based indexing is the
the count number.)

Remove redundant word HIGH in priority names,
and rename *KERNEL* to *HIGH*, as it really
means that, high.

v2: Add back KERNEL and remove SW and HW,
    in lieu of a single HIGH between NORMAL and KERNEL.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoath10k: fix VHT NSS calculation when STBC is enabled
Sathishkumar Muruganandam [Fri, 14 Aug 2020 08:16:11 +0000 (13:46 +0530)]
ath10k: fix VHT NSS calculation when STBC is enabled

[ Upstream commit 99f41b8e43b8b4b31262adb8ac3e69088fff1289 ]

When STBC is enabled, NSTS_SU value need to be accounted for VHT NSS
calculation for SU case.

Without this fix, 1SS + STBC enabled case was reported wrongly as 2SS
in radiotap header on monitor mode capture.

Tested-on: QCA9984 10.4-3.10-00047

Signed-off-by: Sathishkumar Muruganandam <murugana@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1597392971-3897-1-git-send-email-murugana@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoath10k: start recovery process when payload length exceeds max htc length for sdio
Wen Gong [Fri, 14 Aug 2020 15:17:08 +0000 (18:17 +0300)]
ath10k: start recovery process when payload length exceeds max htc length for sdio

[ Upstream commit 2fd3c8f34d08af0a6236085f9961866ad92ef9ec ]

When simulate random transfer fail for sdio write and read, it happened
"payload length exceeds max htc length" and recovery later sometimes.

Test steps:
1. Add config and update kernel:
CONFIG_FAIL_MMC_REQUEST=y
CONFIG_FAULT_INJECTION=y
CONFIG_FAULT_INJECTION_DEBUG_FS=y

2. Run simulate fail:
cd /sys/kernel/debug/mmc1/fail_mmc_request
echo 10 > probability
echo 10 > times # repeat until hitting issues

3. It happened payload length exceeds max htc length.
[  199.935506] ath10k_sdio mmc1:0001:1: payload length 57005 exceeds max htc length: 4088
....
[  264.990191] ath10k_sdio mmc1:0001:1: payload length 57005 exceeds max htc length: 4088

4. after some time, such as 60 seconds, it start recovery which triggered
by wmi command timeout for periodic scan.
[  269.229232] ieee80211 phy0: Hardware restart was requested
[  269.734693] ath10k_sdio mmc1:0001:1: device successfully recovered

The simulate fail of sdio is not a real sdio transter fail, it only
set an error status in mmc_should_fail_request after the transfer end,
actually the transfer is success, then sdio_io_rw_ext_helper will
return error status and stop transfer the left data. For example,
the really RX len is 286 bytes, then it will split to 2 blocks in
sdio_io_rw_ext_helper, one is 256 bytes, left is 30 bytes, if the
first 256 bytes get an error status by mmc_should_fail_request,then
the left 30 bytes will not read in this RX operation. Then when the
next RX arrive, the left 30 bytes will be considered as the header
of the read, the top 4 bytes of the 30 bytes will be considered as
lookaheads, but actually the 4 bytes is not the lookaheads, so the len
from this lookaheads is not correct, it exceeds max htc length 4088
sometimes. When happened exceeds, the buffer chain is not matched between
firmware and ath10k, then it need to start recovery ASAP. Recently then
recovery will be started by wmi command timeout, but it will be long time
later, for example, it is 60+ seconds later from the periodic scan, if
it does not have periodic scan, it will be longer.

Start recovery when it happened "payload length exceeds max htc length"
will be reasonable.

This patch only effect sdio chips.

Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00029.

Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200108031957.22308-3-wgong@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agovideo: fbdev: pvr2fb: initialize variables
Tom Rix [Mon, 20 Jul 2020 19:18:45 +0000 (12:18 -0700)]
video: fbdev: pvr2fb: initialize variables

[ Upstream commit 8e1ba47c60bcd325fdd097cd76054639155e5d2e ]

clang static analysis reports this repesentative error

pvr2fb.c:1049:2: warning: 1st function call argument
  is an uninitialized value [core.CallAndMessage]
        if (*cable_arg)
        ^~~~~~~~~~~~~~~

Problem is that cable_arg depends on the input loop to
set the cable_arg[0].  If it does not, then some random
value from the stack is used.

A similar problem exists for output_arg.

So initialize cable_arg and output_arg.

Signed-off-by: Tom Rix <trix@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200720191845.20115-1-trix@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agodrm/amdgpu: restore ras flags when user resets eeprom(v2)
Guchun Chen [Wed, 22 Jul 2020 02:37:01 +0000 (10:37 +0800)]
drm/amdgpu: restore ras flags when user resets eeprom(v2)

[ Upstream commit bf0b91b78f002faa1be1902a75eeb0797f9fbcf3 ]

RAS flags needs to be cleaned as well when user requires
one clean eeprom.

v2: RAS flags shall be restored after eeprom reset succeeds.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agodrm/ast: Separate DRM driver from PCI code
Thomas Zimmermann [Thu, 30 Jul 2020 13:51:59 +0000 (15:51 +0200)]
drm/ast: Separate DRM driver from PCI code

[ Upstream commit d50ace1e72f05708cc5dbc89b9bbb9873f150092 ]

Putting the DRM driver to the top of the file and the PCI code to the
bottom makes ast_drv.c more readable. While at it, the patch prefixes
file-scope variables with ast_.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200730135206.30239-3-tzimmermann@suse.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agox86/kaslr: Initialize mem_limit to the real maximum address
Arvind Sankar [Mon, 27 Jul 2020 23:07:57 +0000 (19:07 -0400)]
x86/kaslr: Initialize mem_limit to the real maximum address

[ Upstream commit 451286940d95778e83fa7f97006316d995b4c4a8 ]

On 64-bit, the kernel must be placed below MAXMEM (64TiB with 4-level
paging or 4PiB with 5-level paging). This is currently not enforced by
KASLR, which thus implicitly relies on physical memory being limited to
less than 64TiB.

On 32-bit, the limit is KERNEL_IMAGE_SIZE (512MiB). This is enforced by
special checks in __process_mem_region().

Initialize mem_limit to the maximum (depending on architecture), instead
of ULLONG_MAX, and make sure the command-line arguments can only
decrease it. This makes the enforcement explicit on 64-bit, and
eliminates the 32-bit specific checks to keep the kernel below 512M.

Check upfront to make sure the minimum address is below the limit before
doing any work.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20200727230801.3468620-5-nivedita@alum.mit.edu
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoath10k: fix retry packets update in station dump
Venkateswara Naralasetty [Mon, 15 Jun 2020 17:29:01 +0000 (20:29 +0300)]
ath10k: fix retry packets update in station dump

[ Upstream commit 67b927f9820847d30e97510b2f00cd142b9559b6 ]

When tx status enabled, retry count is updated from tx completion status.
which is not working as expected due to firmware limitation where
firmware can not provide per MSDU rate statistics from tx completion
status. Due to this tx retry count is always 0 in station dump.

Fix this issue by updating the retry packet count from per peer
statistics. This patch will not break on SDIO devices since, this retry
count is already updating from peer statistics for SDIO devices.

Tested-on: QCA9984 PCI 10.4-3.6-00104
Tested-on: QCA9882 PCI 10.2.4-1.0-00047

Signed-off-by: Venkateswara Naralasetty <vnaralas@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1591856446-26977-1-git-send-email-vnaralas@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoio_uring: don't set COMP_LOCKED if won't put
Pavel Begunkov [Tue, 13 Oct 2020 08:43:56 +0000 (09:43 +0100)]
io_uring: don't set COMP_LOCKED if won't put

[ Upstream commit 368c5481ae7c6a9719c40984faea35480d9f4872 ]

__io_kill_linked_timeout() sets REQ_F_COMP_LOCKED for a linked timeout
even if it can't cancel it, e.g. it's already running. It not only races
with io_link_timeout_fn() for ->flags field, but also leaves the flag
set and so io_link_timeout_fn() may find it and decide that it holds the
lock. Hopefully, the second problem is potential.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoxfs: fix realtime bitmap/summary file truncation when growing rt volume
Darrick J. Wong [Wed, 7 Oct 2020 20:55:16 +0000 (13:55 -0700)]
xfs: fix realtime bitmap/summary file truncation when growing rt volume

[ Upstream commit f4c32e87de7d66074d5612567c5eac7325024428 ]

The realtime bitmap and summary files are regular files that are hidden
away from the directory tree.  Since they're regular files, inode
inactivation will try to purge what it thinks are speculative
preallocations beyond the incore size of the file.  Unfortunately,
xfs_growfs_rt forgets to update the incore size when it resizes the
inodes, with the result that inactivating the rt inodes at unmount time
will cause their contents to be truncated.

Fix this by updating the incore size when we change the ondisk size as
part of updating the superblock.  Note that we don't do this when we're
allocating blocks to the rt inodes because we actually want those blocks
to get purged if the growfs fails.

This fixes corruption complaints from the online rtsummary checker when
running xfs/233.  Since that test requires rmap, one can also trigger
this by growing an rt volume, cycling the mount, and creating rt files.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoxfs: change the order in which child and parent defer ops are finished
Darrick J. Wong [Sat, 26 Sep 2020 00:39:51 +0000 (17:39 -0700)]
xfs: change the order in which child and parent defer ops are finished

[ Upstream commit 27dada070d59c28a441f1907d2cec891b17dcb26 ]

The defer ops code has been finishing items in the wrong order -- if a
top level defer op creates items A and B, and finishing item A creates
more defer ops A1 and A2, we'll put the new items on the end of the
chain and process them in the order A B A1 A2.  This is kind of weird,
since it's convenient for programmers to be able to think of A and B as
an ordered sequence where all the sub-tasks for A must finish before we
move on to B, e.g. A A1 A2 D.

Right now, our log intent items are not so complex that this matters,
but this will become important for the atomic extent swapping patchset.
In order to maintain correct reference counting of extents, we have to
unmap and remap extents in that order, and we want to complete that work
before moving on to the next range that the user wants to swap.  This
patch fixes defer ops to satsify that requirement.

The primary symptom of the incorrect order was noticed in an early
performance analysis of the atomic extent swap code.  An astonishingly
large number of deferred work items accumulated when userspace requested
an atomic update of two very fragmented files.  The cause of this was
traced to the same ordering bug in the inner loop of
xfs_defer_finish_noroll.

If the ->finish_item method of a deferred operation queues new deferred
operations, those new deferred ops are appended to the tail of the
pending work list.  To illustrate, say that a caller creates a
transaction t0 with four deferred operations D0-D3.  The first thing
defer ops does is roll the transaction to t1, leaving us with:

t1: D0(t0), D1(t0), D2(t0), D3(t0)

Let's say that finishing each of D0-D3 will create two new deferred ops.
After finish D0 and roll, we'll have the following chain:

t2: D1(t0), D2(t0), D3(t0), d4(t1), d5(t1)

d4 and d5 were logged to t1.  Notice that while we're about to start
work on D1, we haven't actually completed all the work implied by D0
being finished.  So far we've been careful (or lucky) to structure the
dfops callers such that D1 doesn't depend on d4 or d5 being finished,
but this is a potential logic bomb.

There's a second problem lurking.  Let's see what happens as we finish
D1-D3:

t3: D2(t0), D3(t0), d4(t1), d5(t1), d6(t2), d7(t2)
t4: D3(t0), d4(t1), d5(t1), d6(t2), d7(t2), d8(t3), d9(t3)
t5: d4(t1), d5(t1), d6(t2), d7(t2), d8(t3), d9(t3), d10(t4), d11(t4)

Let's say that d4-d11 are simple work items that don't queue any other
operations, which means that we can complete each d4 and roll to t6:

t6: d5(t1), d6(t2), d7(t2), d8(t3), d9(t3), d10(t4), d11(t4)
t7: d6(t2), d7(t2), d8(t3), d9(t3), d10(t4), d11(t4)
...
t11: d10(t4), d11(t4)
t12: d11(t4)
<done>

When we try to roll to transaction #12, we're holding defer op d11,
which we logged way back in t4.  This means that the tail of the log is
pinned at t4.  If the log is very small or there are a lot of other
threads updating metadata, this means that we might have wrapped the log
and cannot get roll to t11 because there isn't enough space left before
we'd run into t4.

Let's shift back to the original failure.  I mentioned before that I
discovered this flaw while developing the atomic file update code.  In
that scenario, we have a defer op (D0) that finds a range of file blocks
to remap, creates a handful of new defer ops to do that, and then asks
to be continued with however much work remains.

So, D0 is the original swapext deferred op.  The first thing defer ops
does is rolls to t1:

t1: D0(t0)

We try to finish D0, logging d1 and d2 in the process, but can't get all
the work done.  We log a done item and a new intent item for the work
that D0 still has to do, and roll to t2:

t2: D0'(t1), d1(t1), d2(t1)

We roll and try to finish D0', but still can't get all the work done, so
we log a done item and a new intent item for it, requeue D0 a second
time, and roll to t3:

t3: D0''(t2), d1(t1), d2(t1), d3(t2), d4(t2)

If it takes 48 more rolls to complete D0, then we'll finally dispense
with D0 in t50:

t50: D<fifty primes>(t49), d1(t1), ..., d102(t50)

We then try to roll again to get a chain like this:

t51: d1(t1), d2(t1), ..., d101(t50), d102(t50)
...
t152: d102(t50)
<done>

Notice that in rolling to transaction #51, we're holding on to a log
intent item for d1 that was logged in transaction #1.  This means that
the tail of the log is pinned at t1.  If the log is very small or there
are a lot of other threads updating metadata, this means that we might
have wrapped the log and cannot roll to t51 because there isn't enough
space left before we'd run into t1.  This is of course problem #2 again.

But notice the third problem with this scenario: we have 102 defer ops
tied to this transaction!  Each of these items are backed by pinned
kernel memory, which means that we risk OOM if the chains get too long.

Yikes.  Problem #1 is a subtle logic bomb that could hit someone in the
future; problem #2 applies (rarely) to the current upstream, and problem
#3 applies to work under development.

This is not how incremental deferred operations were supposed to work.
The dfops design of logging in the same transaction an intent-done item
and a new intent item for the work remaining was to make it so that we
only have to juggle enough deferred work items to finish that one small
piece of work.  Deferred log item recovery will find that first
unfinished work item and restart it, no matter how many other intent
items might follow it in the log.  Therefore, it's ok to put the new
intents at the start of the dfops chain.

For the first example, the chains look like this:

t2: d4(t1), d5(t1), D1(t0), D2(t0), D3(t0)
t3: d5(t1), D1(t0), D2(t0), D3(t0)
...
t9: d9(t7), D3(t0)
t10: D3(t0)
t11: d10(t10), d11(t10)
t12: d11(t10)

For the second example, the chains look like this:

t1: D0(t0)
t2: d1(t1), d2(t1), D0'(t1)
t3: d2(t1), D0'(t1)
t4: D0'(t1)
t5: d1(t4), d2(t4), D0''(t4)
...
t148: D0<50 primes>(t147)
t149: d101(t148), d102(t148)
t150: d102(t148)
<done>

This actually sucks more for pinning the log tail (we try to roll to t10
while holding an intent item that was logged in t1) but we've solved
problem #1.  We've also reduced the maximum chain length from:

    sum(all the new items) + nr_original_items

to:

    max(new items that each original item creates) + nr_original_items

This solves problem #3 by sharply reducing the number of defer ops that
can be attached to a transaction at any given time.  The change makes
the problem of log tail pinning worse, but is improvement we need to
solve problem #2.  Actually solving #2, however, is left to the next
patch.

Note that a subsequent analysis of some hard-to-trigger reflink and COW
livelocks on extremely fragmented filesystems (or systems running a lot
of IO threads) showed the same symptoms -- uncomfortably large numbers
of incore deferred work items and occasional stalls in the transaction
grant code while waiting for log reservations.  I think this patch and
the next one will also solve these problems.

As originally written, the code used list_splice_tail_init instead of
list_splice_init, so change that, and leave a short comment explaining
our actions.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>