Matthew Wilcox (Oracle) [Wed, 9 Feb 2022 16:12:30 +0000 (11:12 -0500)]
nilfs: Convert nilfs_set_page_dirty() to nilfs_dirty_folio()
The comment about the page always being locked is wrong, so copy
the locking protection from __set_page_dirty_buffers(). That
means moving the call to nilfs_set_file_dirty() down the
function so as to not acquire a new dependency between the
mapping->private_lock and the ns_inode_lock. That might be a
harmless dependency to add, but it's not necessary.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Matthew Wilcox (Oracle) [Wed, 9 Feb 2022 14:38:09 +0000 (09:38 -0500)]
f2fs: Convert f2fs_set_data_page_dirty to f2fs_dirty_data_folio
Removes several calls to __set_page_dirty_nobuffers(). Also turn the
PageSwapCache() case into a BUG() as there's no way for a swapcache page
to make it to a filesystem that doesn't use SWP_FS_OPS.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Matthew Wilcox (Oracle) [Tue, 8 Feb 2022 21:31:44 +0000 (16:31 -0500)]
fs: Add aops->dirty_folio
This replaces ->set_page_dirty(). It returns a bool instead of an int
and takes the address_space as a parameter instead of expecting the
implementations to retrieve the address_space from the page. This is
particularly important for filesystems which use FS_OPS for swap.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Matthew Wilcox (Oracle) [Mon, 7 Feb 2022 20:47:29 +0000 (15:47 -0500)]
nfs: Convert from launder_page to launder_folio
We don't need to use page_file_mapping() here because launder_folio
is never called for swap cache pages. We also don't need to
cast an loff_t in order to print it.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Matthew Wilcox (Oracle) [Mon, 7 Feb 2022 20:23:41 +0000 (15:23 -0500)]
fs: Add aops->launder_folio
Since the only difference between ->launder_page and ->launder_folio
is the type of the pointer, these can safely use a union without
affecting bisectability.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Matthew Wilcox (Oracle) [Mon, 7 Feb 2022 19:07:49 +0000 (14:07 -0500)]
nfs: Convert from invalidatepage to invalidate_folio
Print the folio index instead of the pointer, since this is more
useful. We also don't need to use page_file_mapping() as we do not
invalidate swapcache pages. Since this is the only caller of
nfs_wb_page_cancel(), convert it to nfs_wb_folio_cancel().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Matthew Wilcox (Oracle) [Mon, 7 Feb 2022 17:01:34 +0000 (12:01 -0500)]
f2fs: Convert invalidatepage to invalidate_folio
This is a minimal change which just accepts the new arguments and passes
the single struct page to the functions which do the work. There is
very little progress here toards making f2fs support large folios.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Matthew Wilcox (Oracle) [Mon, 7 Feb 2022 15:58:35 +0000 (10:58 -0500)]
btrfs: Convert from invalidatepage to invalidate_folio
A lot of the underlying infrastructure in btrfs needs to be switched
over to folios, but this at least documents that invalidatepage can't
be passed a tail page.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Matthew Wilcox (Oracle) [Mon, 7 Feb 2022 15:36:00 +0000 (10:36 -0500)]
afs: Convert invalidatepage to invalidate_folio
We know the page is in the page cache, not the swap cache. If we ever
support folios larger than 2GB, afs_invalidate_dirty() will need to be
fixed, but that's a larger project.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Matthew Wilcox (Oracle) [Sun, 6 Feb 2022 12:19:11 +0000 (07:19 -0500)]
fs: Convert is_partially_uptodate to folios
Since the uptodate property is maintained on a per-folio basis, the
is_partially_uptodate method should also take a folio. Fix the types
at the same time so it's clear that it returns true/false and takes
the count in bytes, not blocks.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Matthew Wilcox (Oracle) [Mon, 17 Jan 2022 18:54:03 +0000 (13:54 -0500)]
buffer: Add folio_buffers()
While there is no intent to use large folios in filesystems using buffer
heads, converting the filesystems to use single-page folios is still worth
doing to remove legacy infrastructure and hidden calls to compound_head().
These helper functions are needed for that conversion to take place.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Matthew Wilcox (Oracle) [Wed, 9 Feb 2022 17:58:22 +0000 (12:58 -0500)]
scsicam: Fix use of page cache
Filesystems do not necessarily set PageError; instead they will leave
PageUptodate clear on errors. We should also kmap() the page before
accessing it in case the page is allocated from HIGHMEM.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Matthew Wilcox (Oracle) [Fri, 4 Feb 2022 23:07:27 +0000 (18:07 -0500)]
fs: read_mapping_page() should take a struct file argument
While read_cache_page() takes a void *, because you can pass a
pointer to anything as the first argument of filler_t, if we
are calling read_mapping_page(), it will be passed as the first
argument of ->readpage, so we know this must be a struct file
pointer, and we should let the compiler enforce that for us.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Linus Torvalds [Sun, 13 Feb 2022 19:58:11 +0000 (11:58 -0800)]
Merge tag 'kbuild-fixes-v5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Fix the truncated path issue for HAVE_GCC_PLUGINS test in Kconfig
- Move -Wunsligned-access to W=1 builds to avoid sprinkling warnings
for the latest Clang
- Fix missing fclose() in Kconfig
- Fix Kconfig to touch dep headers correctly when KCONFIG_AUTOCONFIG is
overridden.
* tag 'kbuild-fixes-v5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: fix failing to generate auto.conf
kconfig: fix missing fclose() on error paths
Makefile.extrawarn: Move -Wunaligned-access to W=1
kconfig: let 'shell' return enough output for deep path names
Linus Torvalds [Sun, 13 Feb 2022 18:06:40 +0000 (10:06 -0800)]
Merge tag 'irq-urgent-2022-02-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"Interrupt chip driver fixes:
- Don't install an hotplug notifier for GICV3-ITS on systems which do
not need it to prevent a warning in the notifier about inconsistent
state
- Add the missing device tree matching for the T-HEAD PLIC variant so
the related SoC is properly supported"
* tag 'irq-urgent-2022-02-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/sifive-plic: Add missing thead,c900-plic match string
dt-bindings: update riscv plic compatible string
irqchip/gic-v3-its: Skip HP notifier when no ITS is registered
Linus Torvalds [Sun, 13 Feb 2022 17:43:34 +0000 (09:43 -0800)]
Merge tag 'objtool_urgent_for_v5.17_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fix from Borislav Petkov:
"Fix a case where objtool would mistakenly warn about instructions
being unreachable"
* tag 'objtool_urgent_for_v5.17_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/bug: Merge annotate_reachable() into _BUG_FLAGS() asm
Linus Torvalds [Sun, 13 Feb 2022 17:22:52 +0000 (09:22 -0800)]
Merge tag 'x86_urgent_for_v5.17_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Borislav Petkov:
"Prevent softlockups when tearing down large SGX enclaves"
* tag 'x86_urgent_for_v5.17_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sgx: Silence softlockup detection when releasing large enclaves
Linus Torvalds [Sun, 13 Feb 2022 17:16:45 +0000 (09:16 -0800)]
Merge tag '5.17-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Three small smb3 reconnect fixes and an error log clarification"
* tag '5.17-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: mark sessions for reconnection in helper function
cifs: call helper functions for marking channels for reconnect
cifs: call cifs_reconnect when a connection is marked
[smb3] improve error message when mount options conflict with posix
Linus Torvalds [Sat, 12 Feb 2022 18:29:02 +0000 (10:29 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two minor fixes in the lpfc driver. One changing the classification of
trace messages and the other fixing a build issue when NVME_FC is
disabled"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: lpfc: Reduce log messages seen after firmware download
scsi: lpfc: Remove NVMe support if kernel has NVME_FC disabled
Linus Torvalds [Sat, 12 Feb 2022 18:01:55 +0000 (10:01 -0800)]
Merge tag 'tty-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are four small tty/serial fixes for 5.17-rc4. They are:
- 8250_pericom change revert to fix a reported regression
- two speculation fixes for vt_ioctl
- n_tty regression fix for polling
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
vt_ioctl: add array_index_nospec to VT_ACTIVATE
vt_ioctl: fix array_index_nospec in vt_setactivate
serial: 8250_pericom: Revert "Re-enable higher baud rates"
n_tty: wake up poll(POLLRDNORM) on receiving data
Linus Torvalds [Sat, 12 Feb 2022 17:56:18 +0000 (09:56 -0800)]
Merge tag 'usb-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB driver fixes for 5.17-rc4 that resolve some
reported issues and add new device ids:
- usb-serial new device ids
- ulpi cleanup fixes
- f_fs use-after-free fix
- dwc3 driver fixes
- ax88179_178a usb network driver fix
- usb gadget fixes
There is a revert at the end of this series to resolve a build problem
that 0-day found yesterday. Most of these have been in linux-next,
except for the last few, and all have now passed 0-day tests"
* tag 'usb-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
Revert "usb: dwc2: drd: fix soft connect when gadget is unconfigured"
usb: dwc2: drd: fix soft connect when gadget is unconfigured
usb: gadget: rndis: check size of RNDIS_MSG_SET command
USB: gadget: validate interface OS descriptor requests
usb: core: Unregister device on component_add() failure
net: usb: ax88179_178a: Fix out-of-bounds accesses in RX fixup
usb: dwc3: gadget: Prevent core from processing stale TRBs
USB: serial: cp210x: add CPI Bulk Coin Recycler id
USB: serial: cp210x: add NCR Retail IO box id
USB: serial: ftdi_sio: add support for Brainboxes US-159/235/320
usb: gadget: f_uac2: Define specific wTerminalType
usb: gadget: udc: renesas_usb3: Fix host to USB_ROLE_NONE transition
usb: raw-gadget: fix handling of dual-direction-capable endpoints
usb: usb251xb: add boost-up property support
usb: ulpi: Call of_node_put correctly
usb: ulpi: Move of_node_put to ulpi_dev_release
USB: serial: option: add ZTE MF286D modem
USB: serial: ch341: add support for GW Instek USB2.0-Serial devices
usb: f_fs: Fix use-after-free for epfile
usb: dwc3: xilinx: fix uninitialized return value
Linus Torvalds [Sat, 12 Feb 2022 17:12:44 +0000 (09:12 -0800)]
Merge tag 's390-5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Vasily Gorbik:
"Maintainers and reviewers changes:
- Add Alexander Gordeev as maintainer for s390.
- Christian Borntraeger will focus on s390 KVM maintainership and
stays as s390 reviewer.
Fixes:
- Fix clang build of modules loader KUnit test.
- Fix kernel panic in CIO code on FCES path-event when no driver is
attached to a device or the driver does not provide the path_event
function"
* tag 's390-5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/cio: verify the driver availability for path_event call
s390/module: fix building test_modules_helpers.o with clang
MAINTAINERS: downgrade myself to Reviewer for s390
MAINTAINERS: add Alexander Gordeev as maintainer for s390
Linus Torvalds [Sat, 12 Feb 2022 17:08:57 +0000 (09:08 -0800)]
Merge tag 'for-linus-5.17a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- Two small cleanups
- Another fix for addressing the EFI framebuffer above 4GB when running
as Xen dom0
- A patch to let Xen guests use reserved bits in MSI- and IO-APIC-
registers for extended APIC-IDs the same way KVM guests are doing it
already
* tag 'for-linus-5.17a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/pci: Make use of the helper macro LIST_HEAD()
xen/x2apic: Fix inconsistent indenting
xen/x86: detect support for extended destination ID
xen/x86: obtain full video frame buffer address for Dom0 also under EFI
Linus Torvalds [Sat, 12 Feb 2022 17:04:05 +0000 (09:04 -0800)]
Merge tag 'seccomp-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull seccomp fixes from Kees Cook:
"This fixes a corner case of fatal SIGSYS being ignored since v5.15.
Along with the signal fix is a change to seccomp so that seeing
another syscall after a fatal filter result will cause seccomp to kill
the process harder.
Summary:
- Force HANDLER_EXIT even for SIGNAL_UNKILLABLE
- Make seccomp self-destruct after fatal filter results
- Update seccomp samples for easier behavioral demonstration"
* tag 'seccomp-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
samples/seccomp: Adjust sample to also provide kill option
seccomp: Invalidate seccomp mode to catch death failures
signal: HANDLER_EXIT should clear SIGNAL_UNKILLABLE
Linus Torvalds [Sat, 12 Feb 2022 16:57:37 +0000 (08:57 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"5 patches.
Subsystems affected by this patch series: binfmt, procfs, and mm
(vmscan, memcg, and kfence)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
kfence: make test case compatible with run time set sample interval
mm: memcg: synchronize objcg lists with a dedicated spinlock
mm: vmscan: remove deadlock due to throttling failing to make progress
fs/proc: task_mmu.c: don't read mapcount for migration entry
fs/binfmt_elf: fix PT_LOAD p_align values for loaders
Jing Leng [Fri, 11 Feb 2022 09:27:36 +0000 (17:27 +0800)]
kconfig: fix failing to generate auto.conf
When the KCONFIG_AUTOCONFIG is specified (e.g. export \
KCONFIG_AUTOCONFIG=output/config/auto.conf), the directory of
include/config/ will not be created, so kconfig can't create deps
files in it and auto.conf can't be generated.
Peng Liu [Sat, 12 Feb 2022 00:32:35 +0000 (16:32 -0800)]
kfence: make test case compatible with run time set sample interval
The parameter kfence_sample_interval can be set via boot parameter and
late shell command, which is convenient for automated tests and KFENCE
parameter optimization. However, KFENCE test case just uses
compile-time CONFIG_KFENCE_SAMPLE_INTERVAL, which will make KFENCE test
case not run as users desired. Export kfence_sample_interval, so that
KFENCE test case can use run-time-set sample interval.
Link: https://lkml.kernel.org/r/20220207034432.185532-1-liupeng256@huawei.com Signed-off-by: Peng Liu <liupeng256@huawei.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Christian Knig <christian.koenig@amd.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roman Gushchin [Sat, 12 Feb 2022 00:32:32 +0000 (16:32 -0800)]
mm: memcg: synchronize objcg lists with a dedicated spinlock
Alexander reported a circular lock dependency revealed by the mmap1 ltp
test:
LOCKDEP_CIRCULAR (suite: ltp, case: mtest06 (mmap1))
WARNING: possible circular locking dependency detected
5.17.0-20220113.rc0.git0.f2211f194038.300.fc35.s390x+debug #1 Not tainted
------------------------------------------------------
mmap1/202299 is trying to acquire lock: 00000001892c0188 (css_set_lock){..-.}-{2:2}, at: obj_cgroup_release+0x4a/0xe0
but task is already holding lock: 00000000ca3b3818 (&sighand->siglock){-.-.}-{2:2}, at: force_sig_info_to_task+0x38/0x180
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&sighand->siglock){-.-.}-{2:2}:
__lock_acquire+0x604/0xbd8
lock_acquire.part.0+0xe2/0x238
lock_acquire+0xb0/0x200
_raw_spin_lock_irqsave+0x6a/0xd8
__lock_task_sighand+0x90/0x190
cgroup_freeze_task+0x2e/0x90
cgroup_migrate_execute+0x11c/0x608
cgroup_update_dfl_csses+0x246/0x270
cgroup_subtree_control_write+0x238/0x518
kernfs_fop_write_iter+0x13e/0x1e0
new_sync_write+0x100/0x190
vfs_write+0x22c/0x2d8
ksys_write+0x6c/0xf8
__do_syscall+0x1da/0x208
system_call+0x82/0xb0
-> #0 (css_set_lock){..-.}-{2:2}:
check_prev_add+0xe0/0xed8
validate_chain+0x736/0xb20
__lock_acquire+0x604/0xbd8
lock_acquire.part.0+0xe2/0x238
lock_acquire+0xb0/0x200
_raw_spin_lock_irqsave+0x6a/0xd8
obj_cgroup_release+0x4a/0xe0
percpu_ref_put_many.constprop.0+0x150/0x168
drain_obj_stock+0x94/0xe8
refill_obj_stock+0x94/0x278
obj_cgroup_charge+0x164/0x1d8
kmem_cache_alloc+0xac/0x528
__sigqueue_alloc+0x150/0x308
__send_signal+0x260/0x550
send_signal+0x7e/0x348
force_sig_info_to_task+0x104/0x180
force_sig_fault+0x48/0x58
__do_pgm_check+0x120/0x1f0
pgm_check_handler+0x11e/0x180
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&sighand->siglock);
lock(css_set_lock);
lock(&sighand->siglock);
lock(css_set_lock);
*** DEADLOCK ***
2 locks held by mmap1/202299:
#0: 00000000ca3b3818 (&sighand->siglock){-.-.}-{2:2}, at: force_sig_info_to_task+0x38/0x180
#1: 00000001892ad560 (rcu_read_lock){....}-{1:2}, at: percpu_ref_put_many.constprop.0+0x0/0x168
stack backtrace:
CPU: 15 PID: 202299 Comm: mmap1 Not tainted 5.17.0-20220113.rc0.git0.f2211f194038.300.fc35.s390x+debug #1
Hardware name: IBM 3906 M04 704 (LPAR)
Call Trace:
dump_stack_lvl+0x76/0x98
check_noncircular+0x136/0x158
check_prev_add+0xe0/0xed8
validate_chain+0x736/0xb20
__lock_acquire+0x604/0xbd8
lock_acquire.part.0+0xe2/0x238
lock_acquire+0xb0/0x200
_raw_spin_lock_irqsave+0x6a/0xd8
obj_cgroup_release+0x4a/0xe0
percpu_ref_put_many.constprop.0+0x150/0x168
drain_obj_stock+0x94/0xe8
refill_obj_stock+0x94/0x278
obj_cgroup_charge+0x164/0x1d8
kmem_cache_alloc+0xac/0x528
__sigqueue_alloc+0x150/0x308
__send_signal+0x260/0x550
send_signal+0x7e/0x348
force_sig_info_to_task+0x104/0x180
force_sig_fault+0x48/0x58
__do_pgm_check+0x120/0x1f0
pgm_check_handler+0x11e/0x180
INFO: lockdep is turned off.
In this example a slab allocation from __send_signal() caused a
refilling and draining of a percpu objcg stock, resulted in a releasing
of another non-related objcg. Objcg release path requires taking the
css_set_lock, which is used to synchronize objcg lists.
This can create a circular dependency with the sighandler lock, which is
taken with the locked css_set_lock by the freezer code (to freeze a
task).
In general it seems that using css_set_lock to synchronize objcg lists
makes any slab allocations and deallocation with the locked css_set_lock
and any intervened locks risky.
To fix the problem and make the code more robust let's stop using
css_set_lock to synchronize objcg lists and use a new dedicated spinlock
instead.
Link: https://lkml.kernel.org/r/Yfm1IHmoGdyUR81T@carbon.dhcp.thefacebook.com Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API") Signed-off-by: Roman Gushchin <guro@fb.com> Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com> Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com> Reviewed-by: Waiman Long <longman@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Jeremy Linton <jeremy.linton@arm.com> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This failure is specific to CONFIG_PREEMPT=n builds. The root of the
problem is that kcompact0 is not rescheduling on a CPU while a task that
has isolated a large number of the pages from the LRU is waiting on
kcompact0 to reschedule so the pages can be released. While
shrink_inactive_list() only loops once around too_many_isolated, reclaim
can continue without rescheduling if sc->skipped_deactivate == 1 which
could happen if there was no file LRU and the inactive anon list was not
low.
Link: https://lkml.kernel.org/r/20220203100326.GD3301@suse.de Fixes: d818fca1cac3 ("mm/vmscan: throttle reclaim and compaction when too may pages are isolated") Signed-off-by: Mel Gorman <mgorman@suse.de> Debugged-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Rik van Riel <riel@surriel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The reproducer was trying to read /proc/$PID/smaps when calling
MADV_FREE at the mean time. MADV_FREE may split THPs if it is called
for partial THP. It may trigger the below race:
CPU A CPU B
----- -----
smaps walk: MADV_FREE:
page_mapcount()
PageCompound()
split_huge_page()
page = compound_head(page)
PageDoubleMap(page)
When calling PageDoubleMap() this page is not a tail page of THP anymore
so the BUG is triggered.
This could be fixed by elevated refcount of the page before calling
mapcount, but that would prevent it from counting migration entries, and
it seems overkilling because the race just could happen when PMD is
split so all PTE entries of tail pages are actually migration entries,
and smaps_account() does treat migration entries as mapcount == 1 as
Kirill pointed out.
Add a new parameter for smaps_account() to tell this entry is migration
entry then skip calling page_mapcount(). Don't skip getting mapcount
for device private entries since they do track references with mapcount.
Pagemap also has the similar issue although it was not reported. Fixed
it as well.
[shy828301@gmail.com: v4] Link: https://lkml.kernel.org/r/20220203182641.824731-1-shy828301@gmail.com
[nathan@kernel.org: avoid unused variable warning in pagemap_pmd_range()] Link: https://lkml.kernel.org/r/20220207171049.1102239-1-nathan@kernel.org Link: https://lkml.kernel.org/r/20220120202805.3369-1-shy828301@gmail.com Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()") Signed-off-by: Yang Shi <shy828301@gmail.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reported-by: syzbot+1f52b3a18d5633fa7f82@syzkaller.appspotmail.com Acked-by: David Hildenbrand <david@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Jann Horn <jannh@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Sat, 12 Feb 2022 00:32:22 +0000 (16:32 -0800)]
fs/binfmt_elf: fix PT_LOAD p_align values for loaders
Rui Salvaterra reported that Aisleroit solitaire crashes with "Wrong
__data_start/_end pair" assertion from libgc after update to v5.17-rc1.
Bisection pointed to commit 9630f0d60fec ("fs/binfmt_elf: use PT_LOAD
p_align values for static PIE") that fixed handling of static PIEs, but
made the condition that guards load_bias calculation to exclude loader
binaries.
Restoring the check for presence of interpreter fixes the problem.
Link: https://lkml.kernel.org/r/20220202121433.3697146-1-rppt@kernel.org Fixes: 9630f0d60fec ("fs/binfmt_elf: use PT_LOAD p_align values for static PIE") Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Reported-by: Rui Salvaterra <rsalvaterra@gmail.com> Tested-by: Rui Salvaterra <rsalvaterra@gmail.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Eric Biederman <ebiederm@xmission.com> Cc: "H.J. Lu" <hjl.tools@gmail.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 11 Feb 2022 21:40:03 +0000 (13:40 -0800)]
Merge tag 'soc-fixes-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"This is a fairly large set of bugfixes, most of which had been sent a
while ago but only now made it into the soc tree:
Maintainer file updates:
- Claudiu Beznea now co-maintains the at91 soc family, replacing
Ludovic Desroches.
- Michael Walle maintains the sl28cpld drivers
- Alain Volmat and Raphael Gallais-Pou take over some drivers for ST
platforms
- Alim Akhtar is an additional reviewer for Samsung platforms
Code fixes:
- Op-tee had a problem with object lifetime that needs a slightly
complex fix, as well as another bug with error handling.
- Several minor issues for the OMAP platform, including a regression
with the timer
- A Kconfig change to fix a build-time issue on Intel SoCFPGA
Device tree fixes:
- The Amlogic Meson platform fixes a boot regression on am1-odroid, a
spurious interrupt, and a problem with reserved memory regions
- In the i.MX platform, several bug fixes are needed to make devices
work correctly: SD card detection, alarmtimer, and sound card on
some board. One patch for the GPU got in there by accident and gets
reverted again.
- TI K3 needs a fix for J721S2 serial port numbers
- ux500 needs a fix to mount the SD card as root on the Skomer phone"
* tag 'soc-fixes-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (46 commits)
Revert "arm64: dts: imx8mn-venice-gw7902: disable gpu"
arm64: Remove ARCH_VULCAN
MAINTAINERS: add myself as a maintainer for the sl28cpld
MAINTAINERS: add IRC to ARM sub-architectures and Devicetree
MAINTAINERS: arm: samsung: add Git tree and IRC
ARM: dts: Fix boot regression on Skomer
ARM: dts: spear320: Drop unused and undocumented 'irq-over-gpio' property
soc: aspeed: lpc-ctrl: Block error printing on probe defer cases
docs/ABI: testing: aspeed-uart-routing: Escape asterisk
MAINTAINERS: update drm/stm drm/sti and cec/sti maintainers
MAINTAINERS: Update Benjamin Gaignard maintainer status
ARM: socfpga: fix missing RESET_CONTROLLER
arm64: dts: meson-sm1-odroid: fix boot loop after reboot
arm64: dts: meson-g12: drop BL32 region from SEI510/SEI610
arm64: dts: meson-g12: add ATF BL32 reserved-memory region
arm64: dts: meson-gx: add ATF BL32 reserved-memory region
arm64: dts: meson-sm1-bananapi-m5: fix wrong GPIO domain for GPIOE_2
arm64: dts: meson-sm1-odroid: use correct enable-gpio pin for tf-io regulator
arm64: dts: meson-g12b-odroid-n2: fix typo 'dio2133'
optee: use driver internal tee_context for some rpc
...
Linus Torvalds [Fri, 11 Feb 2022 20:55:17 +0000 (12:55 -0800)]
Merge tag 'pci-v5.17-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci fix from Bjorn Helgaas:
"Revert a commit that reduced the number of IRQs used but resulted in
interrupt storms (Bjorn Helgaas)"
* tag 'pci-v5.17-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
Revert "PCI/portdrv: Do not setup up IRQs if there are no users"
0e8ae5a6ff59 ("PCI/portdrv: Do not setup up IRQs if there are no users")
reduced usage of IRQs when we don't think we need them. But Joey, Sergiu,
and David reported choppy GUI rendering, systems that became unresponsive
every few seconds, incorrect values reported by cpufreq, and high IRQ 16
CPU usage.
Joey bisected the issues to 0e8ae5a6ff59, so revert it until we figure out
a better solution.
Linus Torvalds [Fri, 11 Feb 2022 20:02:09 +0000 (12:02 -0800)]
Merge tag 'riscv-for-linus-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- A fix to avoid undefined behavior when stack backtracing, which
manifests in GCC as incorrect stack addresses
- A few fixes for the XIP kernels
- A fix to tracking NUMA state on CPU hotplug
- Support for the recently relesaed binutils-2.38, which changed the
default ISA version to one without CSRs or fence.i in 'I' extension
* tag 'riscv-for-linus-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: fix build with binutils 2.38
riscv: cpu-hotplug: clear cpu from numa map when teardown
riscv: extable: fix err reg writing in dedicated uaccess handler
riscv/mm: Add XIP_FIXUP for riscv_pfn_base
riscv/mm: Add XIP_FIXUP for phys_ram_base
riscv: Fix XIP_FIXUP_FLASH_OFFSET
riscv: eliminate unreliable __builtin_frame_address(1)
Linus Torvalds [Fri, 11 Feb 2022 19:55:26 +0000 (11:55 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Enable Cortex-A510 erratum 2051678 by default as we do with other
errata.
- arm64 IORT: Check the node revision for PMCG resources to cope with
old firmware based on a broken revision of the spec that had no way
to describe the second register page (when an implementation is using
the recommended RELOC_CTRS feature).
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
ACPI/IORT: Check node revision for PMCG resources
arm64: Enable Cortex-A510 erratum 2051678 by default
Linus Torvalds [Fri, 11 Feb 2022 19:48:13 +0000 (11:48 -0800)]
Merge tag 'acpi-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These revert two commits that turned out to be problematic and fix two
issues related to wakeup from suspend-to-idle on x86.
Specifics:
- Revert a recent change that attempted to avoid issues with
conflicting address ranges during PCI initialization, because it
turned out to introduce a regression (Hans de Goede).
- Revert a change that limited EC GPE wakeups from suspend-to-idle to
systems based on Intel hardware, because it turned out that systems
based on hardware from other vendors depended on that functionality
too (Mario Limonciello).
- Fix two issues related to the handling of wakeup interrupts and
wakeup events signaled through the EC GPE during suspend-to-idle on
x86 (Rafael Wysocki)"
* tag 'acpi-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
x86/PCI: revert "Ignore E820 reservations for bridge windows on newer systems"
PM: s2idle: ACPI: Fix wakeup interrupts handling
ACPI: PM: s2idle: Cancel wakeup before dispatching EC GPE
ACPI: PM: Revert "Only mark EC GPE for wakeup on Intel systems"
Linus Torvalds [Fri, 11 Feb 2022 19:36:32 +0000 (11:36 -0800)]
Merge tag 'gfs2-v5.16-rc3-fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 fixes from Andreas Gruenbacher:
- Revert debug commit that causes unexpected data corruption
- Fix muti-block reservation regression
* tag 'gfs2-v5.16-rc3-fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: Fix gfs2_release for non-writers regression
Revert "gfs2: check context in gfs2_glock_put"
Linus Torvalds [Fri, 11 Feb 2022 19:26:07 +0000 (11:26 -0800)]
Merge tag 'block-5.17-2022-02-11' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- NVMe pull request
- nvme-tcp: fix bogus request completion when failing to send AER
(Sagi Grimberg)
- add the missing nvme_complete_req tracepoint for batched
completion (Bean Huo)
- Revert of the loop async autoclear issue that has continued to plague
us this release. A few patchsets exists to improve this, but they are
too invasive to be considered at this point (Tetsuo)
* tag 'block-5.17-2022-02-11' of git://git.kernel.dk/linux-block:
loop: revert "make autoclear operation asynchronous"
nvme-tcp: fix bogus request completion when failing to send AER
nvme: add nvme_complete_req tracepoint for batched completion
Linus Torvalds [Fri, 11 Feb 2022 19:18:42 +0000 (11:18 -0800)]
Merge tag 'io_uring-5.17-2022-02-11' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
- Fix a false-positive warning from an older gcc (Alviro)
- Allow oom killer invocations from io_uring_setup (Shakeel)
* tag 'io_uring-5.17-2022-02-11' of git://git.kernel.dk/linux-block:
mm: io_uring: allow oom-killer from io_uring_setup
io_uring: Clean up a false-positive warning from GCC 9.3.0
Linus Torvalds [Fri, 11 Feb 2022 19:05:49 +0000 (11:05 -0800)]
Merge tag 'gpio-fixes-for-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- use sleeping variants of GPIO accessors where needed
in gpio-aggregator
- never return kernel's internal error codes to user-space
in gpiolib core
- use the correct register for reading output values in
gpio-sifive
- fix line hogging in gpio-sim
* tag 'gpio-fixes-for-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: sim: fix hogs with custom chip labels
gpio: sifive: use the correct register to read output values
gpiolib: Never return internal error codes to user space
gpio: aggregator: Fix calling into sleeping GPIO controllers
Linus Torvalds [Fri, 11 Feb 2022 18:42:31 +0000 (10:42 -0800)]
Merge tag 'ata-5.17-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ata fixes from Damien Le Moal:
"A couple of additional fixes for 5.17-rc4:
- Fix compilation warnings in the sata_fsl driver (powerpc) (me)
- Disable TRIM commands on M88V29 devices as these commands are
failing despite the device reporting it supports TRIM (Zoltan)"
* tag 'ata-5.17-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
ata: libata-core: Disable TRIM on M88V29
ata: sata_fsl: fix sscanf() and sysfs_emit() format strings
* tag 'drm-fixes-2022-02-11' of git://anongit.freedesktop.org/drm/drm: (25 commits)
drm/amdgpu/display: change pipe policy for DCN 2.0
drm/amd/pm: fix hwmon node of power1_label create issue
drm/amd/display: keep eDP Vdd on when eDP stream is already enabled
drm/amd/display: fix yellow carp wm clamping
drm/amd/display: Cap pflip irqs per max otg number
drm/amdgpu: add utcl2_harvest to gc 10.3.1
display/amd: decrease message verbosity about watermarks table failure
drm/rockchip: vop: Correct RK3399 VOP register fields
drm/rockchip: dw_hdmi: Do not leave clock enabled in error case
MAINTAINERS: Add entry for fbdev core
fbcon: Avoid 'cap' set but not used warning
drm/privacy-screen: Fix sphinx warning
drm/i915: Workaround broken BIOS DBUF configuration on TGL/RKL
drm/i915: Populate pipe dbuf slices more accurately during readout
drm/i915: Allow !join_mbus cases for adlp+ dbuf configuration
drm/i915: Fix header test for !CONFIG_X86
drm/i915/ttm: Return some errors instead of trying memcpy move
drm/i915: Disable DRRS on IVB/HSW port != A
drm/i915: Fix oops due to missing stack depot
drm/vc4: crtc: Fix redundant variable assignment
...
Bob Peterson [Tue, 18 Jan 2022 14:30:18 +0000 (09:30 -0500)]
gfs2: Fix gfs2_release for non-writers regression
When a file is opened for writing, the vfs code (do_dentry_open)
calls get_write_access for the inode, thus incrementing the inode's write
count. That writer normally then creates a multi-block reservation for
the inode (i_res) that can be re-used by other writers, which speeds up
writes for applications that stupidly loop on open/write/close.
When the writes are all done, the multi-block reservation should be
deleted when the file is closed by the last "writer."
Commit 0ec9b9ea4f83 broke that concept when it moved the call to
gfs2_rs_delete before the check for FMODE_WRITE. Non-writers have no
business removing the multi-block reservations of writers. In fact, if
someone opens and closes the file for RO while a writer has a
multi-block reservation, the RO closer will delete the reservation
midway through the write, and this results in:
kernel BUG at fs/gfs2/rgrp.c:677! (or thereabouts) which is:
BUG_ON(rs->rs_requested); from function gfs2_rs_deltree.
This patch moves the check back inside the check for FMODE_WRITE.
Fixes: 0ec9b9ea4f83 ("gfs2: Check for active reservation in gfs2_release") Cc: stable@vger.kernel.org # v5.12+ Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Andreas Gruenbacher [Thu, 3 Feb 2022 13:06:56 +0000 (14:06 +0100)]
Revert "gfs2: check context in gfs2_glock_put"
It turns out that the might_sleep() call that commit 660a6126f8c3 adds
is triggering occasional data corruption in testing. We're not sure
about the root cause yet, but since this commit was added as a debugging
aid only, revert it for now.