card->owner is a required property and since commit 81033c6b584b ("ALSA:
core: Warn on empty module") a warning is issued if it is empty. Fix lack
of it. This fixes following warning observed on RaspberryPi 3B board
with ARM 32bit kernel and multi_v7_defconfig:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 210 at sound/core/init.c:207 snd_card_new+0x378/0x398 [snd]
Modules linked in: vc4(+) snd_soc_core ac97_bus snd_pcm_dmaengine bluetooth snd_pcm snd_timer crc32_arm_ce raspberrypi_hwmon snd soundcore ecdh_generic ecc bcm2835_thermal phy_generic
CPU: 1 PID: 210 Comm: systemd-udevd Not tainted 5.8.0-rc1-00027-g81033c6b584b #1087
Hardware name: BCM2835
[<c03113c0>] (unwind_backtrace) from [<c030bcb4>] (show_stack+0x10/0x14)
[<c030bcb4>] (show_stack) from [<c071cef8>] (dump_stack+0xd4/0xe8)
[<c071cef8>] (dump_stack) from [<c0345bfc>] (__warn+0xdc/0xf4)
[<c0345bfc>] (__warn) from [<c0345cc4>] (warn_slowpath_fmt+0xb0/0xb8)
[<c0345cc4>] (warn_slowpath_fmt) from [<bf02ff74>] (snd_card_new+0x378/0x398 [snd])
[<bf02ff74>] (snd_card_new [snd]) from [<bf11f0b4>] (snd_soc_bind_card+0x280/0x99c [snd_soc_core])
[<bf11f0b4>] (snd_soc_bind_card [snd_soc_core]) from [<bf12f000>] (devm_snd_soc_register_card+0x34/0x6c [snd_soc_core])
[<bf12f000>] (devm_snd_soc_register_card [snd_soc_core]) from [<bf165654>] (vc4_hdmi_bind+0x43c/0x5f4 [vc4])
[<bf165654>] (vc4_hdmi_bind [vc4]) from [<c09d660c>] (component_bind_all+0xec/0x24c)
[<c09d660c>] (component_bind_all) from [<bf15c44c>] (vc4_drm_bind+0xd4/0x174 [vc4])
[<bf15c44c>] (vc4_drm_bind [vc4]) from [<c09d6ac0>] (try_to_bring_up_master+0x160/0x1b0)
[<c09d6ac0>] (try_to_bring_up_master) from [<c09d6f38>] (component_master_add_with_match+0xd0/0x104)
[<c09d6f38>] (component_master_add_with_match) from [<bf15c588>] (vc4_platform_drm_probe+0x9c/0xbc [vc4])
[<bf15c588>] (vc4_platform_drm_probe [vc4]) from [<c09df740>] (platform_drv_probe+0x6c/0xa4)
[<c09df740>] (platform_drv_probe) from [<c09dd6f0>] (really_probe+0x210/0x350)
[<c09dd6f0>] (really_probe) from [<c09dd940>] (driver_probe_device+0x5c/0xb4)
[<c09dd940>] (driver_probe_device) from [<c09ddb38>] (device_driver_attach+0x58/0x60)
[<c09ddb38>] (device_driver_attach) from [<c09ddbc0>] (__driver_attach+0x80/0xbc)
[<c09ddbc0>] (__driver_attach) from [<c09db820>] (bus_for_each_dev+0x68/0xb4)
[<c09db820>] (bus_for_each_dev) from [<c09dc9f8>] (bus_add_driver+0x130/0x1e8)
[<c09dc9f8>] (bus_add_driver) from [<c09de648>] (driver_register+0x78/0x110)
[<c09de648>] (driver_register) from [<c0302038>] (do_one_initcall+0x50/0x220)
[<c0302038>] (do_one_initcall) from [<c03db544>] (do_init_module+0x60/0x210)
[<c03db544>] (do_init_module) from [<c03da4f8>] (load_module+0x1e34/0x2338)
[<c03da4f8>] (load_module) from [<c03dac00>] (sys_finit_module+0xac/0xbc)
[<c03dac00>] (sys_finit_module) from [<c03000c0>] (ret_fast_syscall+0x0/0x54)
Exception stack(0xeded9fa8 to 0xeded9ff0)
...
---[ end trace 6414689569c2bc08 ]---
Fixes: bb7d78568814 ("drm/vc4: Add HDMI audio support") Suggested-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20200701073949.28941-1-m.szyprowski@samsung.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Memory state around the buggy address: ffff8880251a8b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880251a8b80: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8880251a8c00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff8880251a8c80: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff8880251a8d00: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
Fixes: 409c3b0c5f03 ("mac802154: tx: move stats tx increment") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Alexander Aring <alex.aring@gmail.com> Cc: Stefan Schmidt <stefan@datenfreihafen.org> Cc: linux-wpan@vger.kernel.org Link: https://lore.kernel.org/r/20200908104025.4009085-1-edumazet@google.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The unicast packet rerouting code makes several assumptions. For
instance it assumes that there is always exactly one destination in the
TT. This breaks for multicast frames in a unicast packets in several ways:
For one thing if there is actually no TT entry and the destination node
was selected due to the multicast tvlv flags it announced. Then an
intermediate node will wrongly drop the packet.
For another thing if there is a TT entry but the TTVN of this entry is
newer than the originally addressed destination node: Then the
intermediate node will wrongly redirect the packet, leading to
duplicated multicast packets at a multicast listener and missing
packets at other multicast listeners or multicast routers.
Fixing this by not applying the unicast packet rerouting to batman-adv
unicast packets with a multicast payload. We are not able to detect a
roaming multicast listener at the moment and will just continue to send
the multicast frame to both the new and old destination for a while in
case of such a roaming multicast listener.
It seems that due to a copy & paste error the void pointer
in batadv_choose_backbone_gw() is cast to the wrong type.
Fixing this by using "struct batadv_bla_backbone_gw" instead of "struct
batadv_bla_claim" which better matches the caller's side.
For now it seems that we were lucky because the two structs both have
their orig/vid and addr/vid in the beginning. However I stumbled over
this issue when I was trying to add some debug variables in front of
"orig" in batadv_backbone_gw, which caused hash lookups to fail.
Fixes: 07568d0369f9 ("batman-adv: don't rely on positions in struct for hashing") Signed-off-by: Linus Lüssing <ll@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Following commit e18696786548 ("mwifiex: Prevent memory corruption
handling keys") the mwifiex driver fails to authenticate with certain
networks, specifically networks with 256 bit keys, and repeatedly asks
for the password. The kernel log repeats the following lines (id and
bssid redacted):
mwifiex_pcie 0000:01:00.0: info: trying to associate to '<id>' bssid <bssid>
mwifiex_pcie 0000:01:00.0: info: associated to bssid <bssid> successfully
mwifiex_pcie 0000:01:00.0: crypto keys added
mwifiex_pcie 0000:01:00.0: info: successfully disconnected from <bssid>: reason code 3
Tracking down this problem lead to the overflow check introduced by the
aforementioned commit into mwifiex_ret_802_11_key_material_v2(). This
check fails on networks with 256 bit keys due to the current storage
size for AES keys in struct mwifiex_aes_param being only 128 bit.
To fix this issue, increase the storage size for AES keys to 256 bit.
Fixes: e18696786548 ("mwifiex: Prevent memory corruption handling keys") Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Reported-by: Kaloyan Nikolov <konik98@gmail.com> Tested-by: Kaloyan Nikolov <konik98@gmail.com> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Brian Norris <briannorris@chromium.org> Tested-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20200825153829.38043-1-luzmaximilian@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
When a function is annotated with STACK_FRAME_NON_STANDARD, objtool
doesn't validate its code paths. It also skips sibling call detection
within the function.
But sibling call detection is actually needed for the case where the
ignored function doesn't have any return instructions. Otherwise
objtool naively marks the function as implicit static noreturn, which
affects the reachability of its callers, resulting in "unreachable
instruction" warnings.
Fix it by just enabling sibling call detection for ignored functions.
The 'insn->ignore' check in add_jump_destinations() is no longer needed
after
e6da9567959e ("objtool: Don't use ignore flag for fake jumps").
Some ACPI i2c-devices _STA method (which is used to detect if the device
is present) use autodetection code which probes which device is present
over i2c. This requires the I2C ACPI OpRegion handler to be registered
before we enumerate i2c-clients under the i2c-adapter.
This fixes the i2c touchpad on the Lenovo ThinkBook 14-IIL and
ThinkBook 15 IIL not getting an i2c-client instantiated and thus not
working.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1842039 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the resume stage of GPU recovery, start_cpsch will call pm_init
which set pm->allocated as false, cause the next pm_release_ib has
no chance to release ib memory.
Add pm_release_ib in stop_cpsch which will be called in the suspend
stage of GPU recovery.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Add __init to reserve_memory_end, reserve_oldmem and remove_oldmem.
Sometimes these functions are not inlined, and then the build
complains about section mismatch.
Without this we get lockdep failures. They're spurious failures as SMP isn't
up when ftrace_init_nop() is called. As far as I can tell the easiest fix is
to just take the lock, which also seems like the safest fix.
The MPMAN Converter9 2-in-1 almost fully works with out default settings.
The only problem is that it has only 1 speaker so any sounds only playing
on the right channel get lost.
Add a quirk for this model using the default settings + MONO_SPEAKER.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20200901080623.4987-1-hdegoede@redhat.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When the wm8958_mic_detect, wm8994_mic_detect functions get called from
the machine driver, e.g. from the card's late_probe() callback, the CODEC
device may be PM runtime suspended and any regmap writes have no effect.
Add PM runtime calls to these functions to ensure the device registers
are updated as expected.
This suppresses an error during boot
"wm8994-codec: ASoC: error at snd_soc_component_update_bits on wm8994-codec"
caused by the regmap access error due to the cache_only flag being set.
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Acked-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20200827173357.31891-2-s.nawrocki@samsung.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The WM8994_MICBIAS register is not available in the WM1811 CODEC so skip
initialization of that register for that device.
This suppresses an error during boot:
"wm8994-codec: ASoC: error at snd_soc_component_update_bits on wm8994-codec"
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Acked-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20200827173357.31891-1-s.nawrocki@samsung.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 3b4b19721ec652 ("nvme: fix possible deadlock when I/O is
blocked") reverted multipath head disk revalidation due to deadlocks
caused by holding the bd_mutex during revalidate.
Updating the multipath disk blockdev size is still required though for
userspace to be able to observe any resizing while the device is
mounted. Directly update the bdev inode size to avoid unnecessarily
holding the bdev->bd_mutex.
Fixes: 3b4b19721ec652 ("nvme: fix possible deadlock when I/O is
blocked")
Signed-off-by: Anthony Iliopoulos <ailiop@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Compile the kernel for arm 32 platform, the build warning found.
To fix that, should use div_u64() for divisions.
| net/openvswitch/meter.c:396: undefined reference to `__udivdi3'
[add more commit msg, change reported tag, and use div_u64 instead
of do_div by Tonghao]
Fixes: e57358873bb5d6ca ("net: openvswitch: use u64 for meter bucket") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Tested-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
A big uncore event group is split into multiple small groups which only
include the uncore events from the same PMU. This has been supported in
the commit 3cdc5c2cb924a ("perf parse-events: Handle uncore event
aliases in small groups properly").
If the event's PMU name starts to repeat, it must be a new event.
That can be used to distinguish the leader from other members.
But now it only compares the pointer of pmu_name
(leader->pmu_name == evsel->pmu_name).
If we use "perf stat -M LLC_MISSES.PCIE_WRITE -a" on cascadelakex,
the event list is:
For the event "unc_iio_data_req_of_cpu.mem_write.part1" with
"uncore_iio_4", it should be the event from PMU "uncore_iio_4".
It's not a new leader for this PMU.
But if we use "(leader->pmu_name == evsel->pmu_name)", the check
would be failed and the event is stored to leaders[] as a new
PMU leader.
So this patch uses strcmp to compare the PMU name between events.
Fixes: d4953f7ef1a2 ("perf parse-events: Fix 3 use after frees found with clang ASAN") Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200430003618.17002-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When CONFIG_MTD_UBI_FASTMAP is enabled, fm_anchor will be assigned
a free PEB during ubi_wl_init() or ubi_update_fastmap(). However
if fastmap is not used or disabled on the MTD device, ubi_wl_entry
related with the PEB will not be freed during detach.
So Fix it by freeing the unused fastmap anchor during detach.
[BUG]
When running tests like generic/013 on test device with btrfs quota
enabled, it can normally lead to data leak, detected at unmount time:
BTRFS warning (device dm-3): qgroup 0/5 has unreleased space, type 0 rsv 4096
------------[ cut here ]------------
WARNING: CPU: 11 PID: 16386 at fs/btrfs/disk-io.c:4142 close_ctree+0x1dc/0x323 [btrfs]
RIP: 0010:close_ctree+0x1dc/0x323 [btrfs]
Call Trace:
btrfs_put_super+0x15/0x17 [btrfs]
generic_shutdown_super+0x72/0x110
kill_anon_super+0x18/0x30
btrfs_kill_super+0x17/0x30 [btrfs]
deactivate_locked_super+0x3b/0xa0
deactivate_super+0x40/0x50
cleanup_mnt+0x135/0x190
__cleanup_mnt+0x12/0x20
task_work_run+0x64/0xb0
__prepare_exit_to_usermode+0x1bc/0x1c0
__syscall_return_slowpath+0x47/0x230
do_syscall_64+0x64/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
---[ end trace caf08beafeca2392 ]---
BTRFS error (device dm-3): qgroup reserved space leaked
[CAUSE]
In the offending case, the offending operations are:
2/6: writev f2X[269 1 0 0 0 0] [1006997,67,288] 0
2/7: truncate f2X[269 1 0 0 48 1026293] 18388 0
The following sequence of events could happen after the writev():
CPU1 (writeback) | CPU2 (truncate)
-----------------------------------------------------------------
btrfs_writepages() |
|- extent_write_cache_pages() |
|- Got page for 1003520 |
| 1003520 is Dirty, no writeback |
| So (!clear_page_dirty_for_io()) |
| gets called for it |
|- Now page 1003520 is Clean. |
| | btrfs_setattr()
| | |- btrfs_setsize()
| | |- truncate_setsize()
| | New i_size is 18388
|- __extent_writepage() |
| |- page_offset() > i_size |
|- btrfs_invalidatepage() |
|- Page is clean, so no qgroup |
callback executed
This means, the qgroup reserved data space is not properly released in
btrfs_invalidatepage() as the page is Clean.
[FIX]
Instead of checking the dirty bit of a page, call
btrfs_qgroup_free_data() unconditionally in btrfs_invalidatepage().
As qgroup rsv are completely bound to the QGROUP_RESERVED bit of
io_tree, not bound to page status, thus we won't cause double freeing
anyway.
Fixes: 0b34c261e235 ("btrfs: qgroup: Prevent qgroup->reserved from going subzero") CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The vfio_pci_release call will free and clear the error and request
eventfd ctx while these ctx could be in use at the same time in the
function like vfio_pci_request, and it's expected to protect them under
the vdev->igate mutex, which is missing in vfio_pci_release.
This issue is introduced since commit 1518ac272e78 ("vfio/pci: fix memory
leaks of eventfd ctx"),and since commit 5c5866c593bb ("vfio/pci: Clear
error and request eventfd ctx after releasing"), it's very easily to
trigger the kernel panic like this:
Below is the log sequence which leads to memory leak. Here we get the
RPORT_EV_READY and RPORT_EV_STOP back to back, which lead to overwrite the
event RPORT_EV_READY by event RPORT_EV_STOP. Because of this, kref_count
gets incremented by 1.
kernel: host0: rport fffce5: Received PLOGI request
kernel: host0: rport fffce5: Received PLOGI in INIT state
kernel: host0: rport fffce5: Port is Ready
kernel: host0: rport fffce5: Received PRLI request while in state Ready
kernel: host0: rport fffce5: PRLI rspp type 8 active 1 passive 0
kernel: host0: rport fffce5: Received LOGO request while in state Ready
kernel: host0: rport fffce5: Delete port
kernel: host0: rport fffce5: Received PLOGI request
kernel: host0: rport fffce5: Received PLOGI in state Delete - send busy
kernel: host0: rport fffce5: work event 3
kernel: host0: rport fffce5: lld callback ev 3
kernel: host0: rport fffce5: work delete
Link: https://lore.kernel.org/r/20200626094959.32151-1-jhasan@marvell.com Reviewed-by: Girish Basrur <gbasrur@marvell.com> Reviewed-by: Saurav Kashyap <skashyap@marvell.com> Reviewed-by: Shyam Sundar <ssundar@marvell.com> Signed-off-by: Javed Hasan <jhasan@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Below is the log sequence which leads to memory leak. Here we get the
nested "Received PLOGI request" for same port and this request leads to
call the fc_rport_create() twice for the same rport.
kernel: host1: rport fffce5: Received PLOGI request
kernel: host1: rport fffce5: Received PLOGI in INIT state
kernel: host1: rport fffce5: Port is Ready
kernel: host1: rport fffce5: Received PRLI request while in state Ready
kernel: host1: rport fffce5: PRLI rspp type 8 active 1 passive 0
kernel: host1: rport fffce5: Received LOGO request while in state Ready
kernel: host1: rport fffce5: Delete port
kernel: host1: rport fffce5: Received PLOGI request
kernel: host1: rport fffce5: Received PLOGI in state Delete - send busy
Link: https://lore.kernel.org/r/20200622101212.3922-2-jhasan@marvell.com Reviewed-by: Girish Basrur <gbasrur@marvell.com> Reviewed-by: Saurav Kashyap <skashyap@marvell.com> Reviewed-by: Shyam Sundar <ssundar@marvell.com> Signed-off-by: Javed Hasan <jhasan@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Revert fab7772bfbcf ("nvme-multipath: revalidate nvme_ns_head gendisk
in nvme_validate_ns")
When adding a new namespace to the head disk (via nvme_mpath_set_live)
we will see partition scan which triggers I/O on the mpath device node.
This process will usually be triggered from the scan_work which holds
the scan_lock. If I/O blocks (if we got ana change currently have only
available paths but none are accessible) this can deadlock on the head
disk bd_mutex as both partition scan I/O takes it, and head disk revalidation
takes it to check for resize (also triggered from scan_work on a different
path). See trace [1].
The mpath disk revalidation was originally added to detect online disk
size change, but this is no longer needed since commit cb224c3af4df
("nvme: Convert to use set_capacity_revalidate_and_notify") which already
updates resize info without unnecessarily revalidating the disk (the
mpath disk doesn't even implement .revalidate_disk fop).
Since commit 3fea5a499d57 ("mm: memcontrol: convert page cache to a new
mem_cgroup_charge() API") not cancel the page charge, the pages maybe
double add to pagecache:
thread1 | thread2
cifs_readpages
readpages_get_pages
add_to_page_cache_locked(head,index=n)=0
| readpages_get_pages
| add_to_page_cache_locked(head,index=n+1)=0
add_to_page_cache_locked(head, index=n+1)=-EEXIST
then, will next loop with list head page's
index=n+1 and the page->mapping not NULL
readpages_get_pages
add_to_page_cache_locked(head, index=n+1)
commit_charge
VM_BUG_ON_PAGE
So, we should not do the next loop when any page add to page cache
failed.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Prevent the compiler from uninlining and creating traceable/probable
functions as this is invoked _after_ context tracking switched to
CONTEXT_USER and rcu idle.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200505134340.902709267@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>
Looks like some drivers define MTD names with a colon in it, thus
making mtdpart= parsing impossible. Let's fix the parser to gracefully
handle that case: the last ':' in a partition definition sequence is
considered instead of the first one.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Ron Minnich <rminnich@google.com> Tested-by: Ron Minnich <rminnich@google.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fields of md(mport_dev) are set after cdev_device_add(). However, the
file operation callbacks can be called after cdev_device_add() and
therefore accesses to fields of md in the callbacks can race with the rest
of the mport_cdev_add() function.
One such example is INIT_LIST_HEAD(&md->portwrites) in mport_cdev_add(),
the list is initialised after cdev_device_add(). This can race with
list_add_tail(&pw_filter->md_node,&md->portwrites) in
rio_mport_add_pw_filter() which is called by unlocked_ioctl.
To avoid such data races use cdev_device_add() after initializing md.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Alexandre Bounine <alex.bou9@gmail.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Mike Marshall <hubcap@omnibond.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Allison Randal <allison@lohutok.net> Cc: Pavel Andrianov <andrianov@ispras.ru> Link: http://lkml.kernel.org/r/20200426112950.1803-1-madhuparnabhowmik10@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
"prev_offset" is a static variable in swapin_nr_pages() that can be
accessed concurrently with only mmap_sem held in read mode as noticed by
KCSAN,
BUG: KCSAN: data-race in swap_cluster_readahead / swap_cluster_readahead
write to 0xffffffff92763830 of 8 bytes by task 14795 on cpu 17:
swap_cluster_readahead+0x2a6/0x5e0
swapin_readahead+0x92/0x8dc
do_swap_page+0x49b/0xf20
__handle_mm_fault+0xcfb/0xd70
handle_mm_fault+0xfc/0x2f0
do_page_fault+0x263/0x715
page_fault+0x34/0x40
1 lock held by (dnf)/14795:
#0: ffff897bd2e98858 (&mm->mmap_sem#2){++++}-{3:3}, at: do_page_fault+0x143/0x715
do_user_addr_fault at arch/x86/mm/fault.c:1405
(inlined by) do_page_fault at arch/x86/mm/fault.c:1535
irq event stamp: 83493
count_memcg_event_mm+0x1a6/0x270
count_memcg_event_mm+0x119/0x270
__do_softirq+0x365/0x589
irq_exit+0xa2/0xc0
read to 0xffffffff92763830 of 8 bytes by task 1 on cpu 22:
swap_cluster_readahead+0xfd/0x5e0
swapin_readahead+0x92/0x8dc
do_swap_page+0x49b/0xf20
__handle_mm_fault+0xcfb/0xd70
handle_mm_fault+0xfc/0x2f0
do_page_fault+0x263/0x715
page_fault+0x34/0x40
1 lock held by systemd/1:
#0: ffff897c38f14858 (&mm->mmap_sem#2){++++}-{3:3}, at: do_page_fault+0x143/0x715
irq event stamp: 43530289
count_memcg_event_mm+0x1a6/0x270
count_memcg_event_mm+0x119/0x270
__do_softirq+0x365/0x589
irq_exit+0xa2/0xc0
Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Marco Elver <elver@google.com> Cc: Hugh Dickins <hughd@google.com> Link: http://lkml.kernel.org/r/20200402213748.2237-1-cai@lca.pw Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
pm_runtime_get_sync() increments the runtime PM usage counter even
when it returns an error code. Thus a pairing decrement is needed on
the error handling path to keep the counter balanced.
Also, call pm_runtime_disable() when pm_runtime_get_sync() returns
an error code.
Link: https://lore.kernel.org/r/20200521024709.2368-1-dinghao.liu@zju.edu.cn Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
pm_runtime_get_sync() increments the runtime PM usage counter even
when it returns an error code. Thus a pairing decrement is needed on
the error handling path to keep the counter balanced.
pm_runtime_get_sync() increments the runtime PM usage counter even
the call returns an error code. Thus a pairing decrement is needed
on the error handling path to keep the counter balanced.
There are two error handling paths in this functon. When
wlcore_tx_work_locked() returns an error code, we should
decrease the runtime PM usage counter the same way as the
error handling path beginning from pm_runtime_get_sync().
pm_runtime_get_sync() increments the runtime PM usage counter even
the call returns an error code. Thus a pairing decrement is needed
on the error handling path to keep the counter balanced.
In the absence of any modules, no "modules" map is created, but there
are other executable pages to map, due to eBPF JIT, kprobe or ftrace.
Map them by recognizing that the first "module" symbol is not
necessarily from a module, and adjust the map accordingly.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: x86@kernel.org Link: http://lore.kernel.org/lkml/20200512121922.8997-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Need to free "str" before return when asprintf() failed to avoid memory
leak.
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hongbo Yao <yaohongbo@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Li Bin <huawei.libin@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20200521133218.30150-4-liwei391@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Joakim reported wrong duration_time value for interval bigger
than 4000 [1].
The problem is in the interval value we pass to update_stats
function, which is typed as 'unsigned int' and overflows when
we get over 2^32 (happens between intervals 4000 and 5000).
Make the architecture test directory agree with the code comment.
Committer notes:
This was split from a larger patch.
The code was assuming the developer always worked from tools/perf/, so make sure we
do the test -d having $toolsdir/perf/arch/$arch, to match the intent expressed in the comment,
just above that loop.
Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexios Zavras <alexios.zavras@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Igor Lubashev <ilubashe@akamai.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Li <liwei391@huawei.com> Link: http://lore.kernel.org/lkml/20200306071110.130202-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Finished a qemu-kvm (-device vfio-pci,host=0001:01:00.0) triggers a few
memory leaks after a while because vfio_pci_set_ctx_trigger_single()
calls eventfd_ctx_fdget() without the matching eventfd_ctx_put() later.
Fix it by calling eventfd_ctx_put() for those memory in
vfio_pci_release() before vfio_device_release().
Deleting a subvolume on a full filesystem leads to ENOSPC followed by a
forced read-only. This is not a transaction abort and the filesystem is
otherwise ok, so the error should be just propagated to the callers.
This is caused by unnecessary call to btrfs_handle_fs_error for all
errors, except EAGAIN. This does not make sense as the standard
transaction abort mechanism is in btrfs_drop_snapshot so all relevant
failures are handled.
Originally in commit cb1b69f4508a ("Btrfs: forced readonly when
btrfs_drop_snapshot() fails") there was no return value at all, so the
btrfs_std_error made some sense but once the error handling and
propagation has been implemented we don't need it anymore.
Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
If the SS PHY is in P3, there is no pipe_clk, HW may use suspend_clk
for function, as suspend_clk is slow so EP command need more time to
complete, e.g, imx8M suspend_clk is 32K, set ep configuration will
take about 380us per below trace time stamp(44.286278 - 44.285897
= 0.000381):
This was originally found on Hisilicon Kirin Soc that need more time
for the device controller to clear the CmdAct of DEPCMD.
Signed-off-by: Yu Chen <chenyu56@huawei.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Li Jun <jun.li@nxp.com> Signed-off-by: Felipe Balbi <balbi@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
If uboot passes a blank string to console_setup then it results in
a trashed memory. Ultimately, the kernel crashes during freeing up
the memory.
This fix checks if there is a blank parameter being
passed to console_setup from uboot. In case it detects that
the console parameter is blank then it doesn't setup the serial
device and it gracefully exits.
Link: https://lore.kernel.org/r/20200522065306.83-1-shreyas.joshi@biamp.com Signed-off-by: Shreyas Joshi <shreyas.joshi@biamp.com> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
[pmladek@suse.com: Better format the commit message and code, remove unnecessary brackets.] Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
pm_runtime_get_sync() increments the runtime PM usage counter even
the call returns an error code. Thus a pairing decrement is needed
on the error handling path to keep the counter balanced.
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
pm_runtime_get_sync() increments the runtime PM usage counter even
the call returns an error code. Thus a pairing decrement is needed
on the error handling path to keep the counter balanced.
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
pm_runtime_get_sync() increments the runtime PM usage counter even
the call returns an error code. Thus a pairing decrement is needed
on the error handling path to keep the counter balanced.
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
We are seeing a deadlock in e1000 down when NAPI is being disabled. Looking
over the kernel function trace of the system it appears that the interface
is being closed and then a reset is hitting which deadlocks the interface
as the NAPI interface is already disabled.
To prevent this from happening I am disabling the reset task when
__E1000_DOWN is already set. In addition code has been added so that we set
the __E1000_DOWN while holding the __E1000_RESET flag in e1000_close in
order to guarantee that the reset task will not run after we have started
the close call.
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Tested-by: Maxim Zhukov <mussitantesmortem@gmail.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fix to return negative error code -ENOMEM from create_afu error handling
case instead of 0, as done elsewhere in this function.
Link: https://lore.kernel.org/r/20200428141855.88704-1-weiyongjun1@huawei.com Acked-by: Matthew R. Ochs <mrochs@linux.ibm.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The comparison of hcd->irq to less than zero for an error check will
never be true because hcd->irq is an unsigned int. Fix this by
assigning the int retval to the return of platform_get_irq and checking
this for the -ve error condition and assigning hcd->irq to retval.
Addresses-Coverity: ("Unsigned compared against 0") Fixes: c856b4b0fdb5 ("USB: EHCI: ehci-mv: fix error handling in mv_ehci_probe()") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20200515165453.104028-1-colin.king@canonical.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
page_count() is unstable. Unless there has been an RCU grace period
between when the page was removed from the page cache and now, a
speculative reference may exist from the page cache.
System Reset and Machine Check interrupts that are not recoverable due
to being nested or interrupting when RI=0 currently panic. This is not
necessary, and can often just kill the current context and recover.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Link: https://lore.kernel.org/r/20200508043408.886394-16-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
The unsol event handling code has a loop retrieving the read/write
indices and the arrays without locking while the append to the array
may happen concurrently. This may lead to some inconsistency.
Although there hasn't been any proof of this bad results, it's still
safer to protect the racy accesses.
This patch adds the spinlock protection around the unsol handling loop
for addressing it. Here we take bus->reg_lock as the writer side
snd_hdac_bus_queue_event() is also protected by that lock.
Some variants of the samsung tty driver can pick which clock
to use for their baud rate generation. In the DT conversion,
a default clock was selected to be used if a specific one wasn't
assigned and then a comparison of which clock rate worked better
was done. Unfortunately, the comparison was implemented in such
a way that only the default clock was ever actually compared.
Fix this by iterating through all possible clocks, except when a
specific clock has already been picked via clk_sel (which is
only possible via board files).
Upon receipt of a service subscription request from user via a topology
connection, one 'sub' object will be allocated in kernel, so it will be
able to send an event of the service if any to the user correspondingly
then. Also, in case of any failure, the connection will be shutdown and
all the pertaining 'sub' objects will be freed.
However, there is a race condition as follows resulting in memory leak:
That is, the 'receive-work' may get the last subscription request while
the 'send-work' is shutting down the connection due to peer close.
We had a 'lock' on the connection, so the two actions cannot be carried
out simultaneously. If the last subscription is allocated e.g. 'sub-n',
before the 'send-work' closes the connection, there will be no issue at
all, the 'sub' objects will be freed. In contrast the last subscription
will become orphan since the connection was closed, and we released all
references.
This commit fixes the issue by simply adding one test if the connection
remains in 'connected' state right after we obtain the connection lock,
then a subscription object can be created as usual, otherwise we ignore
it.
Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Reported-by: Thang Ngo <thang.h.ngo@dektech.com.au> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
If the function platform_get_irq() failed, the negative value
returned will not be detected here. So fix error handling in
mv_ehci_probe(). And when get irq failed, the function
platform_get_irq() logs an error message, so remove redundant
message here.
After sending Inquiry Cancel command to the controller, it is possible
that Inquiry Complete event comes before Inquiry Cancel command complete
event. In this case the Inquiry Cancel command will have status of
Command Disallowed since there is no Inquiry session to be cancelled.
This case should not be treated as error, otherwise we can reach an
inconsistent state.
The USB phy takes some time to reset, so make sure we give it to it. The
delay length was taken from the 4x12 phy driver.
This manifested in issues with the DWC2 driver since commit fe369e1826b3
("usb: dwc2: Make dwc2_readl/writel functions endianness-agnostic.")
where the endianness check would read the DWC ID as 0 due to the phy still
resetting, resulting in the wrong endian mode being chosen.
According to the datasheet available at (1), the bottom four
bits are always zero and the actual voltage is 1.25x this value
in mV. Since the kernel API specifies that voltages should be in
uV, it should report 1250x the shifted value.
Realloc of size zero is a free not an error, avoid this causing a double
free. Caught by clang's address sanitizer:
==2634==ERROR: AddressSanitizer: attempting double-free on 0x6020000015f0 in thread T0:
#0 0x5649659297fd in free llvm/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:123:3
#1 0x5649659e9251 in __zfree tools/lib/zalloc.c:13:2
#2 0x564965c0f92c in mem2node__exit tools/perf/util/mem2node.c:114:2
#3 0x564965a08b4c in perf_c2c__report tools/perf/builtin-c2c.c:2867:2
#4 0x564965a0616a in cmd_c2c tools/perf/builtin-c2c.c:2989:10
#5 0x564965944348 in run_builtin tools/perf/perf.c:312:11
#6 0x564965943235 in handle_internal_command tools/perf/perf.c:364:8
#7 0x5649659440c4 in run_argv tools/perf/perf.c:408:2
#8 0x564965942e41 in main tools/perf/perf.c:538:3
0x6020000015f0 is located 0 bytes inside of 1-byte region [0x6020000015f0,0x6020000015f1)
freed by thread T0 here:
#0 0x564965929da3 in realloc third_party/llvm/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x564965c0f55e in mem2node__init tools/perf/util/mem2node.c:97:16
#2 0x564965a08956 in perf_c2c__report tools/perf/builtin-c2c.c:2803:8
#3 0x564965a0616a in cmd_c2c tools/perf/builtin-c2c.c:2989:10
#4 0x564965944348 in run_builtin tools/perf/perf.c:312:11
#5 0x564965943235 in handle_internal_command tools/perf/perf.c:364:8
#6 0x5649659440c4 in run_argv tools/perf/perf.c:408:2
#7 0x564965942e41 in main tools/perf/perf.c:538:3
previously allocated by thread T0 here:
#0 0x564965929c42 in calloc third_party/llvm/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
#1 0x5649659e9220 in zalloc tools/lib/zalloc.c:8:9
#2 0x564965c0f32d in mem2node__init tools/perf/util/mem2node.c:61:12
#3 0x564965a08956 in perf_c2c__report tools/perf/builtin-c2c.c:2803:8
#4 0x564965a0616a in cmd_c2c tools/perf/builtin-c2c.c:2989:10
#5 0x564965944348 in run_builtin tools/perf/perf.c:312:11
#6 0x564965943235 in handle_internal_command tools/perf/perf.c:364:8
#7 0x5649659440c4 in run_argv tools/perf/perf.c:408:2
#8 0x564965942e41 in main tools/perf/perf.c:538:3
v2: add a WARN_ON_ONCE when the free condition arises.
Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: clang-built-linux@googlegroups.com Link: http://lore.kernel.org/lkml/20200320182347.87675-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Cc: Gengming Liu <l.dmxcsnsbh@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The required supplies in bindings were actually not matching
implementation making the bindings incorrect and misleading. The Linux
kernel driver requires all supplies to be present. Also for wlf,wm8994
uses just DBVDD-supply instead of DBVDDn-supply (n: <1,3>).
Reported-by: Jonathan Bakker <xc-racer2@live.ca> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Link: https://lore.kernel.org/r/20200501133534.6706-1-krzk@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
We don't need to be quite as strict about mismatched AArch32 support,
which is good because the friendly hardware folks have been busy
mismatching this to their hearts' content.
* We don't care about EL2 or EL3 (there are silly comments concerning
the latter, so remove those)
* EL1 support is gated by the ARM64_HAS_32BIT_EL1 capability and handled
gracefully when a mismatch occurs
* EL0 support is gated by the ARM64_HAS_32BIT_EL0 capability and handled
gracefully when a mismatch occurs
Relax the AArch32 checks to FTR_NONSTRICT.
Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20200421142922.18950-8-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When setting the meter rate to 4+Gbps, there is an
overflow, the meters don't work as expected.
Cc: Pravin B Shelar <pshelar@ovn.org> Cc: Andy Zhou <azhou@ovn.org> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
If we're going to fail out the vgic_add_lpi(), let's make sure the
allocated vgic_irq memory is also freed. Though it seems that both
cases are unlikely to fail.
After registering character device the file operation callbacks can be
called. The open callback registers interrupt handler.
Therefore interrupt handler can execute in parallel with rest of the init
function. To avoid such data race initialize telclk_interrupt variable
and struct alarm_events before registering character device.
Found by Linux Driver Verification project (linuxtesting.org).
While trying to "dd" to the block device for a USB stick, I
encountered a hung task warning (blocked for > 120 seconds). I
managed to come up with an easy way to reproduce this on my system
(where /dev/sdb is the block device for my USB stick) with:
while true; do dd if=/dev/zero of=/dev/sdb bs=4M; done
With my reproduction here are the relevant bits from the hung task
detector:
INFO: task udevd:294 blocked for more than 122 seconds.
...
udevd D 0 294 1 0x00400008
Call trace:
...
mutex_lock_nested+0x40/0x50
__blkdev_get+0x7c/0x3d4
blkdev_get+0x118/0x138
blkdev_open+0x94/0xa8
do_dentry_open+0x268/0x3a0
vfs_open+0x34/0x40
path_openat+0x39c/0xdf4
do_filp_open+0x90/0x10c
do_sys_open+0x150/0x3c8
...
...
Showing all locks held in the system:
...
1 lock held by dd/2798:
#0: ffffff814ac1a3b8 (&bdev->bd_mutex){+.+.}, at: __blkdev_put+0x50/0x204
...
dd D 0 2798 2764 0x00400208
Call trace:
...
schedule+0x8c/0xbc
io_schedule+0x1c/0x40
wait_on_page_bit_common+0x238/0x338
__lock_page+0x5c/0x68
write_cache_pages+0x194/0x500
generic_writepages+0x64/0xa4
blkdev_writepages+0x24/0x30
do_writepages+0x48/0xa8
__filemap_fdatawrite_range+0xac/0xd8
filemap_write_and_wait+0x30/0x84
__blkdev_put+0x88/0x204
blkdev_put+0xc4/0xe4
blkdev_close+0x28/0x38
__fput+0xe0/0x238
____fput+0x1c/0x28
task_work_run+0xb0/0xe4
do_notify_resume+0xfc0/0x14bc
work_pending+0x8/0x14
The problem appears related to the fact that my USB disk is terribly
slow and that I have a lot of RAM in my system to cache things.
Specifically my writes seem to be happening at ~15 MB/s and I've got
~4 GB of RAM in my system that can be used for buffering. To write 4
GB of buffer to disk thus takes ~4000 MB / ~15 MB/s = ~267 seconds.
The 267 second number is a problem because in __blkdev_put() we call
sync_blockdev() while holding the bd_mutex. Any other callers who
want the bd_mutex will be blocked for the whole time.
The problem is made worse because I believe blkdev_put() specifically
tells other tasks (namely udev) to go try to access the device at right
around the same time we're going to hold the mutex for a long time.
Putting some traces around this (after disabling the hung task detector),
I could confirm:
dd: 437.608600: __blkdev_put() right before sync_blockdev() for sdb
udevd: 437.623901: blkdev_open() right before blkdev_get() for sdb
dd: 661.468451: __blkdev_put() right after sync_blockdev() for sdb
udevd: 663.820426: blkdev_open() right after blkdev_get() for sdb
A simple fix for this is to realize that sync_blockdev() works fine if
you're not holding the mutex. Also, it's not the end of the world if
you sync a little early (though it can have performance impacts).
Thus we can make a guess that we're going to need to do the sync and
then do it without holding the mutex. We still do one last sync with
the mutex but it should be much, much faster.
With this, my hung task warnings for my test case are gone.
Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Guenter Roeck <groeck@chromium.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fixes a NULL pointer dereference, caused by the PIT firing an interrupt
before the interrupt table has been initialized.
SET_PIT2 can race with the creation of the IRQchip. In particular,
if SET_PIT2 is called with a low PIT timer period (after the creation of
the IOAPIC, but before the instantiation of the irq routes), the PIT can
fire an interrupt at an uninitialized table.
Signed-off-by: Steve Rutherford <srutherford@google.com> Signed-off-by: Jon Cargille <jcargill@google.com> Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20200416191152.259434-1-jcargill@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
On some platforms, the log is corrupted while console is being
registered. It is observed that when set_termios is called, there
are still some bytes in the FIFO to be transmitted.
So, wait for tx_empty inside cdns_uart_console_setup before calling
set_termios.
On passing requirement to vm_unmapped_area, arch_get_unmapped_area and
arch_get_unmapped_area_topdown did not set align_offset. Internally on
both unmapped_area and unmapped_area_topdown, if info->align_mask is 0,
then info->align_offset was meaningless.
But commit df529cabb7a2 ("mm: mmap: add trace point of
vm_unmapped_area") always prints info->align_offset even though it is
uninitialized.
Fix this uninitialized value issue by setting it to 0 explicitly.
In case rdma accept fails at nvmet_rdma_queue_connect(), release work is
scheduled. Later on, a new RDMA CM event may arrive since we didn't
destroy the cm-id and call nvmet_rdma_queue_connect_fail(), which
schedule another release work. This will cause calling
nvmet_rdma_free_queue twice. To fix this we implicitly destroy the cm_id
with non-zero ret code, which guarantees that new rdma_cm events will
not arrive afterwards. Also add a qp pointer to nvmet_rdma_queue
structure, so we can use it when the cm_id pointer is NULL or was
destroyed.
Signed-off-by: Israel Rukshin <israelr@mellanox.com> Suggested-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
pgdat->kswapd_classzone_idx could be accessed concurrently in
wakeup_kswapd(). Plain writes and reads without any lock protection
result in data races. Fix them by adding a pair of READ|WRITE_ONCE() as
well as saving a branch (compilers might well optimize the original code
in an unintentional way anyway). While at it, also take care of
pgdat->kswapd_order and non-kswapd threads in allow_direct_reclaim(). The
data races were reported by KCSAN,
BUG: KCSAN: data-race in wakeup_kswapd / wakeup_kswapd
write to 0xffff9f427ffff2dc of 4 bytes by task 7454 on cpu 13:
wakeup_kswapd+0xf1/0x400
wakeup_kswapd at mm/vmscan.c:3967
wake_all_kswapds+0x59/0xc0
wake_all_kswapds at mm/page_alloc.c:4241
__alloc_pages_slowpath+0xdcc/0x1290
__alloc_pages_slowpath at mm/page_alloc.c:4512
__alloc_pages_nodemask+0x3bb/0x450
alloc_pages_vma+0x8a/0x2c0
do_anonymous_page+0x16e/0x6f0
__handle_mm_fault+0xcd5/0xd40
handle_mm_fault+0xfc/0x2f0
do_page_fault+0x263/0x6f9
page_fault+0x34/0x40
1 lock held by mtest01/7454:
#0: ffff9f425afe8808 (&mm->mmap_sem#2){++++}, at:
do_page_fault+0x143/0x6f9
do_user_addr_fault at arch/x86/mm/fault.c:1405
(inlined by) do_page_fault at arch/x86/mm/fault.c:1539
irq event stamp: 6944085
count_memcg_event_mm+0x1a6/0x270
count_memcg_event_mm+0x119/0x270
__do_softirq+0x34c/0x57c
irq_exit+0xa2/0xc0
read to 0xffff9f427ffff2dc of 4 bytes by task 7472 on cpu 38:
wakeup_kswapd+0xc8/0x400
wake_all_kswapds+0x59/0xc0
__alloc_pages_slowpath+0xdcc/0x1290
__alloc_pages_nodemask+0x3bb/0x450
alloc_pages_vma+0x8a/0x2c0
do_anonymous_page+0x16e/0x6f0
__handle_mm_fault+0xcd5/0xd40
handle_mm_fault+0xfc/0x2f0
do_page_fault+0x263/0x6f9
page_fault+0x34/0x40
1 lock held by mtest01/7472:
#0: ffff9f425a9ac148 (&mm->mmap_sem#2){++++}, at:
do_page_fault+0x143/0x6f9
irq event stamp: 6793561
count_memcg_event_mm+0x1a6/0x270
count_memcg_event_mm+0x119/0x270
__do_softirq+0x34c/0x57c
irq_exit+0xa2/0xc0
BUG: KCSAN: data-race in kswapd / wakeup_kswapd
write to 0xffff90973ffff2dc of 4 bytes by task 820 on cpu 6:
kswapd+0x27c/0x8d0
kthread+0x1e0/0x200
ret_from_fork+0x27/0x50
read to 0xffff90973ffff2dc of 4 bytes by task 6299 on cpu 0:
wakeup_kswapd+0xf3/0x450
wake_all_kswapds+0x59/0xc0
__alloc_pages_slowpath+0xdcc/0x1290
__alloc_pages_nodemask+0x3bb/0x450
alloc_pages_vma+0x8a/0x2c0
do_anonymous_page+0x170/0x700
__handle_mm_fault+0xc9f/0xd00
handle_mm_fault+0xfc/0x2f0
do_page_fault+0x263/0x6f9
page_fault+0x34/0x40
Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Marco Elver <elver@google.com> Cc: Matthew Wilcox <willy@infradead.org> Link: http://lkml.kernel.org/r/1582749472-5171-1-git-send-email-cai@lca.pw Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Mount failure issue happens under the scenario: Application forked dozens
of threads to mount the same number of cramfs images separately in docker,
but several mounts failed with high probability. Mount failed due to the
checking result of the page(read from the superblock of loop dev) is not
uptodate after wait_on_page_locked(page) returned in function cramfs_read:
wait_on_page_locked(page);
if (!PageUptodate(page)) {
...
}
The reason of the checking result of the page not uptodate: systemd-udevd
read the loopX dev before mount, because the status of loopX is Lo_unbound
at this time, so loop_make_request directly trigger the calling of io_end
handler end_buffer_async_read, which called SetPageError(page). So It
caused the page can't be set to uptodate in function
end_buffer_async_read:
Then mount operation is performed, it used the same page which is just
accessed by systemd-udevd above, Because this page is not uptodate, it
will launch a actual read via submit_bh, then wait on this page by calling
wait_on_page_locked(page). When the I/O of the page done, io_end handler
end_buffer_async_read is called, because no one cleared the page
error(during the whole read path of mount), which is caused by
systemd-udevd reading, so this page is still in "PageError" status, which
can't be set to uptodate in function end_buffer_async_read, then caused
mount failure.
But sometimes mount succeed even through systemd-udeved read loopX dev
just before, The reason is systemd-udevd launched other loopX read just
between step 3.1 and 3.2, the steps as below:
1, loopX dev default status is Lo_unbound;
2, systemd-udved read loopX dev (page is set to PageError);
3, mount operation
1) set loopX status to Lo_bound;
==>systemd-udevd read loopX dev<==
2) read loopX dev(page has no error)
3) mount succeed
As the loopX dev status is set to Lo_bound after step 3.1, so the other
loopX dev read by systemd-udevd will go through the whole I/O stack, part
of the call trace as below:
here, mapping->a_ops->readpage() is blkdev_readpage. In latest kernel,
some function name changed, the call trace as below:
blkdev_read_iter
generic_file_read_iter
generic_file_buffered_read:
/*
* A previous I/O error may have been due to temporary
* failures, eg. mutipath errors.
* Pg_error will be set again if readpage fails.
*/
ClearPageError(page);
/* Start the actual read. The read will unlock the page*/
error=mapping->a_ops->readpage(flip, page);
We can see ClearPageError(page) is called before the actual read,
then the read in step 3.2 succeed.
This patch is to add the calling of ClearPageError just before the actual
read of read path of cramfs mount. Without the patch, the call trace as
below when performing cramfs mount:
do_mount
cramfs_read
cramfs_blkdev_read
read_cache_page
do_read_cache_page:
filler(data, page);
or
mapping->a_ops->readpage(data, page);
With the patch, the call trace as below when performing mount:
do_mount
cramfs_read
cramfs_blkdev_read
read_cache_page:
do_read_cache_page:
ClearPageError(page); <== new add
filler(data, page);
or
mapping->a_ops->readpage(data, page);
With the patch, mount operation trigger the calling of
ClearPageError(page) before the actual read, the page has no error if no
additional page error happen when I/O done.
Signed-off-by: Xianting Tian <xianting_tian@126.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Jan Kara <jack@suse.cz> Cc: <yubin@h3c.com> Link: http://lkml.kernel.org/r/1583318844-22971-1-git-send-email-xianting_tian@126.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
mm/kmemleak.c:1955:28: warning: array comparison always evaluates to a constant [-Wtautological-compare]
if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata)
^
mm/kmemleak.c:1955:60: warning: array comparison always evaluates to a constant [-Wtautological-compare]
if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata)
These are not true arrays, they are linker defined symbols, which are just
addresses. Using the address of operator silences the warning and does
not change the resulting assembly with either clang/ld.lld or gcc/ld
(tested with diff + objdump -Dr).
When a subrequest is being detached from the subgroup, we want to
ensure that it is not holding the group lock, or in the process
of waiting for the group lock.
Without this commit, a PCIe hotplug port can stop generating interrupts on
hotplug events, so device adds and removals will not be seen:
The pciehp interrupt handler pciehp_isr() reads the Slot Status register
and then writes back to it to clear the bits that caused the interrupt. If
a different interrupt event bit gets set between the read and the write,
pciehp_isr() returns without having cleared all of the interrupt event
bits. If this happens when the MSI isn't masked (which by default it isn't
in handle_edge_irq(), and which it will never be when MSI per-vector
masking is not supported), we won't get any more hotplug interrupts from
that device.
That is expected behavior, according to the PCIe Base Spec r5.0, section
6.7.3.4, "Software Notification of Hot-Plug Events".
Because the Presence Detect Changed and Data Link Layer State Changed event
bits can both get set at nearly the same time when a device is added or
removed, this is more likely to happen than it might seem. The issue was
found (and can be reproduced rather easily) by connecting and disconnecting
an NVMe storage device on at least one system model where the NVMe devices
were being connected to an AMD PCIe port (PCI device 0x1022/0x1483).
Fix the issue by modifying pciehp_isr() to loop back and re-read the Slot
Status register immediately after writing to it, until it sees that all of
the event status bits have been cleared.
[lukas: drop loop count limitation, write "events" instead of "status",
don't loop back in INTx and poll modes, tweak code comment & commit msg] Link: https://lore.kernel.org/r/78b4ced5072bfe6e369d20e8b47c279b8c7af12e.1582121613.git.lukas@wunner.de Tested-by: Stuart Hayes <stuart.w.hayes@gmail.com> Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
The Miditech MIDIFACE 16x16 (USB ID 1290:1749) has more than one extra
endpoint descriptor.
The first extra descriptor is: 0x06 0x30 0x00 0x00 0x00 0x00
As the code in snd_usbmidi_get_ms_info() looks only at the
first extra descriptor to find USB_DT_CS_ENDPOINT the device
as such is recognized but there is neither input nor output
configured.
The patch iterates through the extra descriptors to find the
proper one. With this patch the device is correctly configured.
In “ubifs_check_node”, when the value of "node_len" is abnormal,
the code will goto label of "out_len" for execution. Then, in the
following "ubifs_dump_node", if inode type is "UBIFS_DATA_NODE",
in "print_hex_dump", an out-of-bounds access may occur due to the
wrong "ch->len".
Therefore, when the value of "node_len" is abnormal, data length
should to be adjusted to a reasonable safe range. At this time,
structured data is not credible, so dump the corrupted data directly
for analysis.
Signed-off-by: Liu Song <liu.song11@zte.com.cn> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>
On some EFI systems, the video BIOS is provided by the EFI firmware. The
boot stub code stores the physical address of the ROM image in pdev->rom.
Currently we attempt to access this pointer using phys_to_virt(), which
doesn't work with CONFIG_HIGHMEM.
On these systems, attempting to load the radeon module on a x86_32 kernel
can result in the following:
Fix the issue by updating all drivers which can access a platform provided
ROM. Instead of calling the helper function pci_platform_rom() which uses
phys_to_virt(), call ioremap() directly on the pdev->rom.
radeon_read_platform_bios() previously directly accessed an __iomem
pointer. Avoid this by calling memcpy_fromio() instead of kmemdup().
pci_platform_rom() now has no remaining callers, so remove it.
Link: https://lore.kernel.org/r/20200319021623.5426-1-mikel@mikelr.com Signed-off-by: Mikel Rychliski <mikel@mikelr.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Introduce a call to xprt_rdma_free_addresses() similar to the way
that the TCP backchannel releases a transport's peer address
strings.
Fixes: 5d252f90a800 ("svcrdma: Add class for RDMA backwards direction transport") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
'maxlen' is the total size of the destination buffer. There is only one
caller and this value is 256.
When we compute the size already used and what we would like to add in
the buffer, the trailling NULL character is not taken into account.
However, this trailling character will be added by the 'strcat' once we
have checked that we have enough place.
So, there is a off-by-one issue and 1 byte of the stack could be
erroneously overwridden.
Take into account the trailling NULL, when checking if there is enough
place in the destination buffer.
While at it, also replace a 'sprintf' by a safer 'snprintf', check for
output truncation and avoid a superfluous 'strlen'.
Fixes: dc9a16e49dbba ("svc: Add /proc/sys/sunrpc/transport files") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
[ cel: very minor fix to documenting comment Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Correct race condition where ioaccel is re-enabled before the raid_map is
updated. For RAID_1, RAID_1ADM, and RAID 5/6 there is a BUG_ON called which
is bad.
- Change event thread to disable ioaccel only. Send all requests down the
RAID path instead.
- Have rescan thread handle offload_enable.
- Since there is only one rescan allowed at a time, turning
offload_enabled on/off should not be racy. Each handler queues up a
rescan if one is already in progress.
- For timing diagram, offload_enabled is initially off due to a change
(transformation: splitmirror/remirror), ...
T0 - I/O completion with UA 0x3f 0x0e sets rescan flag.
T1 - rescan worker thread starts a rescan.
T2 - event comes in
T3 - event thread starts and issues "Acknowledge" message
...
T6 - rescan thread has bypassed code to reload new raid map.
...
T7 - event thread runs and sets offload_to_be_enabled
...
T9 - rescan thread turns on offload_enabled.
T10- request comes in and goes down ioaccel path.
T11- BUG_ON.
- After the patch is applied, ioaccel_enabled can only be re-enabled in
the re-scan thread.
Link: https://lore.kernel.org/r/158472877894.14200.7077843399036368335.stgit@brunhilda Reviewed-by: Scott Teel <scott.teel@microsemi.com> Reviewed-by: Matt Perricone <matt.perricone@microsemi.com> Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The RXE driver doesn't set sys_image_guid and user space applications see
zeros. This causes to pyverbs tests to fail with the following traceback,
because the IBTA spec requires to have valid sys_image_guid.
Traceback (most recent call last):
File "./tests/test_device.py", line 51, in test_query_device
self.verify_device_attr(attr)
File "./tests/test_device.py", line 74, in verify_device_attr
assert attr.sys_image_guid != 0
In order to fix it, set sys_image_guid to be equal to node_guid.
Calling nvme_sysfs_delete() when the controller is in the middle of
creation may cause several bugs. If the controller is in NEW state we
remove delete_controller file and don't delete the controller. The user
will not be able to use nvme disconnect command on that controller again,
although the controller may be active. Other bugs may happen if the
controller is in the middle of create_ctrl callback and
nvme_do_delete_ctrl() starts. For example, freeing I/O tagset at
nvme_do_delete_ctrl() before it was allocated at create_ctrl callback.
To fix all those races don't allow the user to delete the controller
before it was fully created.
Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The nvme multipath error handling defaults to controller reset if the
error is unknown. There are, however, no existing nvme status codes that
indicate a reset should be used, and resetting causes unnecessary
disruption to the rest of IO.
Change nvme's error handling to first check if failover should happen.
If not, let the normal error handling take over rather than reset the
controller.
Based-on-a-patch-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: John Meneghini <johnm@netapp.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
If '-o' was used more than 64 times in a single invocation of gpio-hammer,
this could lead to an overflow of the 'lines' array. This commit fixes
this by avoiding the overflow and giving a proper diagnostic back to the
user