www.infradead.org Git - users/jedix/linux-maple.git/log
3 years ago  fs/binfmt_elf: Use maple tree iterators for fill_files_note()
Liam R. Howlett [Mon, 4 Jan 2021 19:44:16 +0000 (14:44 -0500)]
fs/binfmt_elf: Use maple tree iterators for fill_files_note()

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  drivers/tee/optee: Use maple tree iterators for __check_mem_type()
Liam R. Howlett [Mon, 4 Jan 2021 19:43:36 +0000 (14:43 -0500)]
drivers/tee/optee: Use maple tree iterators for __check_mem_type()

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  drivers/misc/cxl: Use maple tree iterators for cxl_prefault_vma()
Liam R. Howlett [Mon, 4 Jan 2021 19:31:50 +0000 (14:31 -0500)]
drivers/misc/cxl: Use maple tree iterators for cxl_prefault_vma()

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  arch/xtensa: Use maple tree iterators for unmapped area
Liam R. Howlett [Mon, 4 Jan 2021 19:30:59 +0000 (14:30 -0500)]
arch/xtensa: Use maple tree iterators for unmapped area

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  arch/x86: Use maple tree iterators for vdso/vma
Liam R. Howlett [Mon, 4 Jan 2021 19:30:25 +0000 (14:30 -0500)]
arch/x86: Use maple tree iterators for vdso/vma

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  arch/s390: Use maple tree iterators instead of linked list.
Liam R. Howlett [Mon, 4 Jan 2021 19:29:19 +0000 (14:29 -0500)]
arch/s390: Use maple tree iterators instead of linked list.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  arch/powerpc: Remove mmap linked list from mm/book3s64/subpage_prot
Liam R. Howlett [Mon, 4 Jan 2021 19:26:34 +0000 (14:26 -0500)]
arch/powerpc: Remove mmap linked list from mm/book3s64/subpage_prot

Start using the maple tree

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  arch/powerpc: Remove mmap linked list from mm/book3s32/tlb
Liam R. Howlett [Mon, 4 Jan 2021 19:25:54 +0000 (14:25 -0500)]
arch/powerpc: Remove mmap linked list from mm/book3s32/tlb

Start using the maple tree

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  arch/parisc: Remove mmap linked list from kernel/cache
Liam R. Howlett [Mon, 4 Jan 2021 19:25:19 +0000 (14:25 -0500)]
arch/parisc: Remove mmap linked list from kernel/cache

Start using the maple tree

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  arch/arm64: Remove mmap linked list from vdso.
Liam R. Howlett [Mon, 4 Jan 2021 19:24:40 +0000 (14:24 -0500)]
arch/arm64: Remove mmap linked list from vdso.

Start using the maple tree

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  mm: Introduce vma_next() and vma_prev()
Liam R. Howlett [Mon, 4 Jan 2021 20:13:14 +0000 (15:13 -0500)]
mm: Introduce vma_next() and vma_prev()

Rename internal vma_next() to _vma_next().

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  mm/mmap: Change do_brk_munmap() to use do_mas_align_munmap()
Liam R. Howlett [Tue, 1 Dec 2020 02:30:04 +0000 (21:30 -0500)]
mm/mmap: Change do_brk_munmap() to use do_mas_align_munmap()

do_brk_munmap() has already aligned the address and has a maple tree
state to be used.  Use the new do_mas_align_munmap() to avoid
unnecessary alignment and error checks.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  mm/mmap: Reorganize munmap to use maple states
Liam R. Howlett [Thu, 19 Nov 2020 17:57:23 +0000 (12:57 -0500)]
mm/mmap: Reorganize munmap to use maple states

Remove __do_munmap() in favour of do_munmap(), do_mas_munmap(), and
do_mas_align_munmap().

do_munmap() is a wrapper to create a maple state for any callers that
have not been converted to the maple tree.

do_mas_munmap() takes a maple state to munmap a range.  This is just a
small function which checks for error conditions and aligns the end of
the range.

do_mas_align_munmap() munmaps an already-aligned range.  It starts with
the first VMA in the range, then finds the last VMA in the range.  Both
the start and the end are split if necessary.  The VMAs are then
unlocked and removed from the linked list at the same time, followed by
a single tree operation that overwrites the area with NULL.  Finally,
the detached list is unmapped and freed.

By reorganizing the munmap calls as outlined, it is now possible to
avoid the extra work of re-aligning ranges for pre-aligned callers which
are known to be safe, and to avoid extra VMA lookups or tree walks for
modifications.

detach_vmas_to_be_unmapped() is no longer used, so drop this code.
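
As a rough illustration of the layering described above (a simplified
sketch, not the code from this patch; the maple state setup and the
final do_mas_munmap() arguments are assumptions):

    int do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
                  struct list_head *uf)
    {
            /* Wrapper: build a maple state for callers not yet converted. */
            MA_STATE(mas, &mm->mm_mt, start, start);

            return do_mas_munmap(&mas, mm, start, len, uf, false);
    }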

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  mm/mmap: Convert count_vma_pages_range() to use ma_state
Liam R. Howlett [Thu, 27 May 2021 15:23:50 +0000 (11:23 -0400)]
mm/mmap: Convert count_vma_pages_range() to use ma_state

mmap_region() uses a maple state to do all the work so convert the
static inline count_vma_pages_range() to use the same structure.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  mm/mmap: Move mmap_region() below do_munmap()
Liam R. Howlett [Wed, 27 Jan 2021 15:25:13 +0000 (10:25 -0500)]
mm/mmap: Move mmap_region() below do_munmap()

Relocation of code for the next commit.  There are no functional changes
here.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  mm: Remove vmacache
Liam R. Howlett [Mon, 16 Nov 2020 19:50:20 +0000 (14:50 -0500)]
mm: Remove vmacache

By using the maple tree and the maple tree state, the vmacache is no
longer beneficial and is complicating the VMA code.  Remove the vmacache
to reduce the work in keeping it up to date and code complexity.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  mm/mmap: Use advanced maple tree API for mmap_region()
Liam R. Howlett [Tue, 10 Nov 2020 18:37:40 +0000 (13:37 -0500)]
mm/mmap: Use advanced maple tree API for mmap_region()

Changing mmap_region() to use the maple tree state and the advanced
maple tree interface allows for a lot less tree walking.

This change removes the last caller of munmap_vma_range(), so drop this
unused function.

Add vma_expand() to expand a VMA when possible by doing the necessary
hugepage check, uprobe_munmap of the file, dcache flush, the
modifications themselves, and then undoing the detaches, etc.

Add a vma_mas_link() helper to add a VMA to both the linked list and the
maple tree, until the linked list is removed.
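
A minimal sketch of what such a transitional helper could look like
(illustrative only; the real vma_mas_link() may differ, and the
mas_store_gfp()/__vma_link_list() usage here is an assumption):

    static void vma_mas_link(struct mm_struct *mm, struct vm_area_struct *vma,
                             struct ma_state *mas, struct vm_area_struct *prev)
    {
            mas_store_gfp(mas, vma, GFP_KERNEL);    /* maple tree         */
            __vma_link_list(mm, vma, prev);         /* legacy linked list */
            mm->map_count++;
    }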

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  mm: Use maple tree operations for find_vma_intersection() and find_vma()
Liam R. Howlett [Mon, 28 Sep 2020 19:50:19 +0000 (15:50 -0400)]
mm: Use maple tree operations for find_vma_intersection() and find_vma()

Move find_vma_intersection() to mmap.c and change implementation to
maple tree.

When searching for a vma within a range, it is easier to use the maple
tree interface.  This means the find_vma() call becomes a special case
of find_vma_intersection().

Exported for kvm module.
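
The relationship can be pictured as follows (an illustrative sketch, not
the exact code from this patch):

    /* find_vma(mm, addr) is the "no upper bound" case of the intersection
     * search: the first VMA with vm_end > addr. */
    static inline struct vm_area_struct *find_vma(struct mm_struct *mm,
                                                  unsigned long addr)
    {
            return find_vma_intersection(mm, addr, ULONG_MAX);
    }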

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  mm/mmap: Change do_brk_flags() to expand existing VMA and add
Liam R. Howlett [Mon, 21 Sep 2020 14:47:34 +0000 (10:47 -0400)]
mm/mmap: Change do_brk_flags() to expand existing VMA and add
do_brk_munmap()

Avoid allocating a new VMA when a vma modification can occur instead.
When a brk() can expand or contract a VMA, the single store operation
will only modify one index of the maple tree instead of causing a node
to split or coalesce.  This avoids unnecessary allocations/frees of
maple tree nodes and VMAs.

Use the advanced API for the maple tree to avoid unnecessary walks of
the tree.
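
A sketch of the single-store idea (illustrative only; 'vma', 'newbrk'
and 'mas' are assumed locals, and the mas_set_range()/mas_store_gfp()
usage is an assumption rather than the patch itself):

    /* Expanding the existing brk VMA is one range overwrite in the tree
     * rather than an insert of a new VMA. */
    vma->vm_end = newbrk;
    mas_set_range(&mas, vma->vm_start, vma->vm_end - 1);
    mas_store_gfp(&mas, vma, GFP_KERNEL);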

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  mm/khugepaged: Optimize collapse_pte_mapped_thp() by using vma_lookup()
Liam R. Howlett [Thu, 8 Apr 2021 20:21:58 +0000 (16:21 -0400)]
mm/khugepaged: Optimize collapse_pte_mapped_thp() by using vma_lookup()

vma_lookup() will walk the vma tree once and not continue to look for
the next vma.  Since the exact vma is checked below, this is a more
efficient way of searching.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  mm: Optimize find_exact_vma() to use vma_lookup()
Liam R. Howlett [Thu, 8 Apr 2021 20:15:00 +0000 (16:15 -0400)]
mm: Optimize find_exact_vma() to use vma_lookup()

Use vma_lookup() to walk the tree to the start value requested.  If
the vma at the start does not match, then the answer is NULL and there
is no need to look at the next vma the way that find_vma() would.
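
A sketch of the resulting helper shape (illustrative; the in-tree
version may differ slightly):

    static inline struct vm_area_struct *
    find_exact_vma(struct mm_struct *mm, unsigned long vm_start,
                   unsigned long vm_end)
    {
            struct vm_area_struct *vma = vma_lookup(mm, vm_start);

            /* vma_lookup() already guarantees the address falls inside the
             * returned vma, so only the exact boundaries need checking. */
            if (vma && (vma->vm_start != vm_start || vma->vm_end != vm_end))
                    vma = NULL;

            return vma;
    }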

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  xen/privcmd: Optimized privcmd_ioctl_mmap() by using vma_lookup()
Liam R. Howlett [Thu, 8 Apr 2021 20:06:36 +0000 (16:06 -0400)]
xen/privcmd: Optimized privcmd_ioctl_mmap() by using vma_lookup()

vma_lookup() walks the VMA tree for a specific value, whereas find_vma()
will continue searching the tree after walking to a specific value.  It
is more efficient to only walk to the requested value, as this case
requires the address to equal the vm_start.
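
The difference can be illustrated as follows (a sketch; 'addr' stands in
for the address taken from the ioctl request and is not the variable
name used in the driver):

    vma = find_vma(mm, addr);              /* may return a later VMA      */
    if (!vma || vma->vm_start > addr)      /* so an extra check is needed */
            return -EINVAL;

    vma = vma_lookup(mm, addr);            /* NULL unless addr is inside  */
    if (!vma)
            return -EINVAL;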

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  mm: Remove rb tree.
Liam R. Howlett [Fri, 24 Jul 2020 19:06:25 +0000 (15:06 -0400)]
mm: Remove rb tree.

Remove the RB tree and start using the maple tree for vm_area_struct
tracking.

Drop validate_mm() calls in expand_upwards() and expand_downwards() as
the lock is not held.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  kernel/fork: Use maple tree for dup_mmap() during forking
Liam R. Howlett [Fri, 24 Jul 2020 17:32:30 +0000 (13:32 -0400)]
kernel/fork: Use maple tree for dup_mmap() during forking

The maple tree was already tracking VMAs in this function by an earlier
commit, but the rbtree iterator was being used to iterate the list.
Change the iterator to use a maple tree native iterator, rcu locking,
and switch to the maple tree advanced API to avoid multiple walks of the
tree during insert operations.

anon_vma_fork() may enter the slow path, and the resulting schedule()
call can cause rcu issues.  Drop the rcu lock around it and reacquire
the lock afterwards.  There is no harm in this approach as the mmap_sem
is taken for write/read and held across the schedule() call, so the VMAs
will not change.
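
A simplified sketch of that pattern (not the actual dup_mmap(); 'tmp'
stands in for the new VMA copy whose setup is elided here):

    MA_STATE(mas, &oldmm->mm_mt, 0, 0);
    struct vm_area_struct *mpnt;
    int retval = 0;

    rcu_read_lock();
    mas_for_each(&mas, mpnt, ULONG_MAX) {
            /* 'tmp' is the new copy of 'mpnt' (setup elided). */
            rcu_read_unlock();              /* anon_vma_fork() may sleep */
            retval = anon_vma_fork(tmp, mpnt);
            rcu_read_lock();
            if (retval)
                    break;
    }
    rcu_read_unlock();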

Note that the bulk allocation of nodes is also happening here for
performance reasons.  The node calculations are done internally to the
tree, use the VMA count, and assume the worst-case node requirements.
The VM_DONTCOPY flag does not allow for the most efficient copy method
of the tree, so a bulk loading algorithm is used.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  mm/mmap: Use maple tree for unmapped_area{_topdown}
Liam R. Howlett [Fri, 24 Jul 2020 16:01:47 +0000 (12:01 -0400)]
mm/mmap: Use maple tree for unmapped_area{_topdown}

The maple tree code was added to find the unmapped area in a previous
commit and was checked against what the rbtree returned, but the actual
result was never used.  Start using the maple tree implementation and
remove the rbtree code.  Note that the advanced maple tree interface is
used, so the rcu locking needs to be handled here or at a higher level.

Add kernel documentation comment for these functions.
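
A sketch of the gap search this enables (illustrative only; 'info' is
the usual struct vm_unmapped_area_info, and the mas_empty_area() call
and limit handling are assumptions about the maple tree API rather than
the code in this patch):

    MA_STATE(mas, &mm->mm_mt, 0, 0);
    int ret;

    rcu_read_lock();
    ret = mas_empty_area(&mas, info->low_limit,
                         info->high_limit - 1, info->length);
    rcu_read_unlock();
    if (ret)
            return -ENOMEM;

    return mas.index;       /* start of a gap large enough for the request */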

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  mm/mmap: Use the maple tree for find_vma_prev() instead of the rbtree
Liam R. Howlett [Fri, 24 Jul 2020 15:48:08 +0000 (11:48 -0400)]
mm/mmap: Use the maple tree for find_vma_prev() instead of the rbtree

Use the maple tree's advanced API and a maple state to walk the tree for
the entry at the address or the next vma, then use the maple state to
walk back one entry to find the previous entry.  Note that the advanced
maple tree interface does not handle the rcu locking.

Add kernel documentation comments for this API.
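
The walk described above looks roughly like this (a sketch, not the
patch; locking is left to the caller as noted):

    MA_STATE(mas, &mm->mm_mt, addr, addr);
    struct vm_area_struct *vma;

    vma = mas_find(&mas, ULONG_MAX);    /* entry at addr, or the next vma */
    *pprev = mas_prev(&mas, 0);         /* step back one entry            */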

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  mm/mmap: Use the maple tree in find_vma() instead of the rbtree.
Liam R. Howlett [Fri, 24 Jul 2020 15:38:39 +0000 (11:38 -0400)]
mm/mmap: Use the maple tree in find_vma() instead of the rbtree.

Using the maple tree interface mt_find() will handle the RCU locking and
will start searching at the address up to the limit, ULONG_MAX in this
case.

Add kernel documentation to this API.
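
In sketch form (close to the description above; the exact signature is
an assumption):

    struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
    {
            unsigned long index = addr;

            /* mt_find() handles the RCU locking internally. */
            return mt_find(&mm->mm_mt, &index, ULONG_MAX);
    }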

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  mm: Start tracking VMAs with maple tree
Liam R. Howlett [Fri, 24 Jul 2020 17:04:30 +0000 (13:04 -0400)]
mm: Start tracking VMAs with maple tree

Start tracking the VMAs with the new maple tree structure in parallel
with the rb_tree.  Add debug and trace events for maple tree operations
and duplicate the rb_tree that is created on forks into the maple tree.

In this commit, the maple tree is added to the mm_struct (including the
mm_init struct), support is added in the required mm/mmap functions,
tracking is added in kernel/fork for process forking, and the tree is
used to find the unmapped_area and checked against what the rbtree
finds.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  Maple Tree: Add new data structure
Liam R. Howlett [Fri, 24 Jul 2020 17:02:52 +0000 (13:02 -0400)]
Maple Tree: Add new data structure

The maple tree is an RCU-safe, range-based B-tree designed to use modern
processor caches efficiently.  There are a number of places in the
kernel where a non-overlapping range-based tree would be beneficial,
especially one with a simple interface.  The first user covered in this
patch set is the vm_area_struct, where three data structures are
replaced by the maple tree: the augmented rbtree, the vma cache, and the
linked list of VMAs in the mm_struct.  The long-term goal is to reduce
or remove the mmap_sem contention.

The tree has a branching factor of 10 for non-leaf nodes and 16 for leaf
nodes.  With the increased branching factor, it is significantly shorter than
the rbtree so it has fewer cache misses.  The removal of the linked list
between subsequent entries also reduces the cache misses and the need to pull
in the previous and next VMA during many tree alterations.
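
For orientation, basic use of the non-overlapping range API looks
roughly like this (a sketch of the interface, not code from this patch;
'ptr' stands for any pointer being tracked, and exact function names may
differ in this version of the series):

    DEFINE_MTREE(mt);
    void *entry;

    /* Store one pointer for the whole range [0x1000, 0x1fff]. */
    mtree_store_range(&mt, 0x1000, 0x1fff, ptr, GFP_KERNEL);

    entry = mtree_load(&mt, 0x1800);        /* returns ptr       */
    mtree_erase(&mt, 0x1000);               /* removes the range */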

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
3 years ago  radix tree test suite: Add support for slab bulk APIs
Liam R. Howlett [Tue, 25 Aug 2020 19:51:54 +0000 (15:51 -0400)]
radix tree test suite: Add support for slab bulk APIs

Add support for kmem_cache_free_bulk() and kmem_cache_alloc_bulk() to
the radix tree test suite.
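
The bulk API shape being emulated looks like this (usage illustration
only; 'cachep' is an assumed kmem_cache):

    void *objs[16];
    int allocated;

    allocated = kmem_cache_alloc_bulk(cachep, GFP_KERNEL,
                                      ARRAY_SIZE(objs), objs);
    if (allocated)
            kmem_cache_free_bulk(cachep, allocated, objs);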

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  radix tree test suite: Add allocation counts and size to kmem_cache
Liam R. Howlett [Tue, 1 Jun 2021 14:35:43 +0000 (10:35 -0400)]
radix tree test suite: Add allocation counts and size to kmem_cache

Add functions to get the number of allocations and the total allocations
from a kmem_cache.  Also add a function to get the allocated size and a
way to zero the total allocations.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  radix tree test suite: Add kmem_cache_set_non_kernel()
Liam R. Howlett [Thu, 27 Feb 2020 20:13:20 +0000 (15:13 -0500)]
radix tree test suite: Add kmem_cache_set_non_kernel()

kmem_cache_set_non_kernel() is a mechanism to allow a certain number of
kmem_cache_alloc requests to succeed even when GFP_KERNEL is not set in
the flags.  This functionality allows for testing different paths
through the code.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
3 years ago  radix tree test suite: Add pr_err define
Liam R. Howlett [Tue, 1 Jun 2021 14:25:22 +0000 (10:25 -0400)]
radix tree test suite: Add pr_err define

Define pr_err() as printk() for the test suite.
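
A minimal sketch of such a mapping (the actual test-suite define may
differ):

    #define pr_err(fmt, ...)        printk(fmt, ##__VA_ARGS__)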

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
3 years ago  Add linux-next specific files for 20210902
Stephen Rothwell [Thu, 2 Sep 2021 06:17:17 +0000 (16:17 +1000)]
Add linux-next specific files for 20210902

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  Merge branch 'akpm/master'
Stephen Rothwell [Thu, 2 Sep 2021 05:32:52 +0000 (15:32 +1000)]
Merge branch 'akpm/master'

3 years ago  arch: remove compat_alloc_user_space
Arnd Bergmann [Tue, 24 Aug 2021 00:00:20 +0000 (10:00 +1000)]
arch: remove compat_alloc_user_space

All users of compat_alloc_user_space() and copy_in_user() have been
removed from the kernel; only a few functions in sparc remain that can
be changed to call arch_copy_in_user() instead.

Link: https://lkml.kernel.org/r/20210727144859.4150043-7-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  compat: remove some compat entry points
Arnd Bergmann [Tue, 24 Aug 2021 00:00:19 +0000 (10:00 +1000)]
compat: remove some compat entry points

These are all handled correctly when calling the native system call entry
point, so remove the special cases.

Link: https://lkml.kernel.org/r/20210727144859.4150043-6-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  fixup! mm: simplify compat numa syscalls
Arnd Bergmann [Tue, 24 Aug 2021 00:00:19 +0000 (10:00 +1000)]
fixup! mm: simplify compat numa syscalls

When compat user space asks for more data than the kernel has in its
nodemask, get_mempolicy() now either leaks kernel stack data to user space
or, if either VMAP_STACK or KASAN is enabled, causes a crash like:

Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 0000038003e7c000 TEID: 0000038003e7c803
Fault in home space mode while using kernel ASCE.
AS:00000001fb388007 R3:000000008021c007 S:0000000082142000 P:0000000000000400
Oops: 0011 ilc:3 [#1] SMP
CPU: 0 PID: 1017495 Comm: get_mempolicy Tainted: G           OE     5.14.0-20210730.rc3.git0.4ccc9e2db7ac.300.fc34.s390x+next #1
Hardware name: IBM 2827 H66 708 (LPAR)
Krnl PSW : 0704e00180000000 00000001f9f11000 (compat_put_bitmap+0x48/0xd0)
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000810000 0000000000000000 000000007d9df1c0 0000038003e7c008
           0000000000000004 000000007d9df1c4 0000038003e7be40 0000000000010000
           0000000000008000 0000000000000000 0000000000000390 00000000000001c8
           000000020d6ea000 000002aa00401a48 00000001fa0a85fa 0000038003e7bd50
Krnl Code: 00000001f9f10ff4: a7bb0001            aghi    %r11,1
           00000001f9f10ff8: 41303008            la      %r3,8(%r3)
          #00000001f9f10ffc: 41502004            la      %r5,4(%r2)
          >00000001f9f11000: e3103ff8ff04        lg      %r1,-8(%r3)
           00000001f9f11006: 5010f0a4            st      %r1,164(%r15)
           00000001f9f1100a: a50e0081            llilh   %r0,129
           00000001f9f1100e: c8402000f0a4        mvcos   0(%r2),164(%r15),%r4
           00000001f9f11014: 1799                xr      %r9,%r9
Call Trace:
 [<00000001f9f11000>] compat_put_bitmap+0x48/0xd0
 [<00000001fa0a85fa>] kernel_get_mempolicy+0x102/0x178
 [<00000001fa0a86b0>] __s390_sys_get_mempolicy+0x40/0x50
 [<00000001fa92be30>] __do_syscall+0x1c0/0x1e8
 [<00000001fa939148>] system_call+0x78/0xa0
Last Breaking-Event-Address:
 [<0000038003e7bc00>] 0x38003e7bc00
Kernel panic - not syncing: Fatal exception: panic_on_oops

Fix it by copying the correct size in compat mode again.

Link: https://lkml.kernel.org/r/20210730143417.3700653-1-arnd@kernel.org
Link: https://lore.kernel.org/lkml/YQPLG20V3dmOfq3a@osiris/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  mm: simplify compat numa syscalls
Arnd Bergmann [Tue, 24 Aug 2021 00:00:19 +0000 (10:00 +1000)]
mm: simplify compat numa syscalls

The compat implementations for mbind, get_mempolicy, set_mempolicy and
migrate_pages are just there to handle the subtly different layout of
bitmaps on 32-bit hosts.

The compat implementation however lacks some of the checks that are
present in the native one, in particular for checking that the extra bits
are all zero when user space has a larger mask size than the kernel.
Worse, those extra bits do not get cleared when copying in or out of the
kernel, which can lead to incorrect data as well.

Unify the implementation to handle the compat bitmap layout directly in
the get_nodes() and copy_nodes_to_user() helpers.  Splitting out the
get_bitmap() helper from get_nodes() also helps readability of the native
case.
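
A simplified sketch of a get_bitmap()-style helper in the spirit of the
description (not the exact code from this patch):

    static int get_bitmap(unsigned long *mask,
                          const unsigned long __user *nmask,
                          unsigned long maxnode)
    {
            unsigned long nlongs = BITS_TO_LONGS(maxnode);
            int ret;

            if (in_compat_syscall())
                    ret = compat_get_bitmap(mask,
                                    (const compat_ulong_t __user *)nmask,
                                    maxnode);
            else
                    ret = copy_from_user(mask, nmask,
                                         nlongs * sizeof(unsigned long));
            return ret ? -EFAULT : 0;
    }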

On x86, two additional problems are addressed by this: compat tasks can
pass a bitmap at the end of a mapping, causing a fault when reading across
the page boundary for a 64-bit word.  x32 tasks might also run into
problems with get_mempolicy corrupting data when an odd number of 32-bit
words gets passed.

On parisc the migrate_pages() system call apparently had the wrong calling
convention, as big-endian architectures expect the words inside of a
bitmap to be swapped.  This is not a problem though since parisc has no
NUMA support.

Link: https://lkml.kernel.org/r/20210727144859.4150043-5-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  mm: simplify compat_sys_move_pages
Arnd Bergmann [Tue, 24 Aug 2021 00:00:19 +0000 (10:00 +1000)]
mm: simplify compat_sys_move_pages

The compat move_pages() implementation uses compat_alloc_user_space() for
converting the pointer array.  Moving the compat handling into the
function itself is a bit simpler and lets us avoid the
compat_alloc_user_space() call.
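
The in-function compat handling can be sketched like this (illustrative;
close in spirit to the description above, but not necessarily the helper
added by the patch):

    static int get_compat_pages_array(const void __user *chunk_pages[],
                                      const void __user * __user *pages,
                                      unsigned long chunk_nr)
    {
            compat_uptr_t p;
            int i;

            for (i = 0; i < chunk_nr; i++) {
                    if (get_user(p, (compat_uptr_t __user *)pages + i))
                            return -EFAULT;
                    chunk_pages[i] = compat_ptr(p);
            }
            return 0;
    }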

Link: https://lkml.kernel.org/r/20210727144859.4150043-4-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  kexec: avoid compat_alloc_user_space
Arnd Bergmann [Tue, 24 Aug 2021 00:00:19 +0000 (10:00 +1000)]
kexec: avoid compat_alloc_user_space

kimage_alloc_init() expects a __user pointer, so compat_sys_kexec_load()
uses compat_alloc_user_space() to convert the layout and put it back onto
the user space caller stack.

Moving the user space access into the syscall handler directly actually
makes the code simpler, as the conversion for compat mode can now be done
on kernel memory.
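
The approach can be sketched as follows (simplified; the structure field
names follow the uapi headers, but the loop itself is only an
illustration of the idea, with 'segments' and 'nr_segments' being the
syscall arguments):

    struct kexec_segment ksegs[KEXEC_SEGMENT_MAX];
    struct compat_kexec_segment cseg;
    unsigned long i;

    for (i = 0; i < nr_segments; i++) {
            if (copy_from_user(&cseg, &segments[i], sizeof(cseg)))
                    return -EFAULT;
            ksegs[i].buf   = compat_ptr(cseg.buf);
            ksegs[i].bufsz = cseg.bufsz;
            ksegs[i].mem   = cseg.mem;
            ksegs[i].memsz = cseg.memsz;
    }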

Link: https://lkml.kernel.org/r/20210727144859.4150043-3-arnd@kernel.org
Link: https://lore.kernel.org/lkml/YPbtsU4GX6PL7%2F42@infradead.org/
Link: https://lore.kernel.org/lkml/m1y2cbzmnw.fsf@fess.ebiederm.org/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Co-developed-by: Eric Biederman <ebiederm@xmission.com>
Co-developed-by: Christoph Hellwig <hch@infradead.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  kexec: move locking into do_kexec_load
Arnd Bergmann [Tue, 24 Aug 2021 00:00:18 +0000 (10:00 +1000)]
kexec: move locking into do_kexec_load

Patch series "compat: remove compat_alloc_user_space", v5.

Going through compat_alloc_user_space() to convert indirect system call
arguments tends to add complexity compared to handling the native and
compat logic in the same code.

This patch (of 6):

The locking is the same between the native and compat version of
sys_kexec_load(), so it can be done in the common implementation to reduce
duplication.

Link: https://lkml.kernel.org/r/20210727144859.4150043-1-arnd@kernel.org
Link: https://lkml.kernel.org/r/20210727144859.4150043-2-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Co-developed-by: Eric Biederman <ebiederm@xmission.com>
Co-developed-by: Christoph Hellwig <hch@infradead.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  scripts: check_extable: fix typo in user error message
Randy Dunlap [Tue, 24 Aug 2021 00:00:18 +0000 (10:00 +1000)]
scripts: check_extable: fix typo in user error message

Fix typo ("and" should be "an") in an error message.

Link: https://lkml.kernel.org/r/20210727002943.29774-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  mm/vmalloc: add __alloc_size attributes for better bounds checking
Kees Cook [Tue, 24 Aug 2021 00:00:18 +0000 (10:00 +1000)]
mm/vmalloc: add __alloc_size attributes for better bounds checking

As already done in GrapheneOS, add the __alloc_size attribute for
appropriate vmalloc allocator interfaces, to provide additional hinting
for better bounds checking, assisting CONFIG_FORTIFY_SOURCE and other
compiler optimizations.

Link: https://lkml.kernel.org/r/20210818214021.2476230-8-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Co-developed-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  percpu: add __alloc_size attributes for better bounds checking
Kees Cook [Tue, 24 Aug 2021 00:00:18 +0000 (10:00 +1000)]
percpu: add __alloc_size attributes for better bounds checking

As already done in GrapheneOS, add the __alloc_size attribute for
appropriate percpu allocator interfaces, to provide additional hinting for
better bounds checking, assisting CONFIG_FORTIFY_SOURCE and other compiler
optimizations.

Link: https://lkml.kernel.org/r/20210818214021.2476230-7-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Co-developed-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  mm/page_alloc: add __alloc_size attributes for better bounds checking
Kees Cook [Tue, 24 Aug 2021 00:00:18 +0000 (10:00 +1000)]
mm/page_alloc: add __alloc_size attributes for better bounds checking

As already done in GrapheneOS, add the __alloc_size attribute for
appropriate page allocator interfaces, to provide additional hinting for
better bounds checking, assisting CONFIG_FORTIFY_SOURCE and other compiler
optimizations.

Link: https://lkml.kernel.org/r/20210818214021.2476230-6-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Co-developed-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  slab: add __alloc_size attributes for better bounds checking
Kees Cook [Tue, 24 Aug 2021 00:00:17 +0000 (10:00 +1000)]
slab: add __alloc_size attributes for better bounds checking

As already done in GrapheneOS, add the __alloc_size attribute for regular
kmalloc interfaces, to provide additional hinting for better bounds
checking, assisting CONFIG_FORTIFY_SOURCE and other compiler
optimizations.

Link: https://lkml.kernel.org/r/20210818214021.2476230-5-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Co-developed-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  slab: clean up function declarations
Kees Cook [Tue, 24 Aug 2021 00:00:17 +0000 (10:00 +1000)]
slab: clean up function declarations

In order to have more readable and regular declarations, move __must_check
to the line above the main function declaration and add all the missing
parameter names.

Link: https://lkml.kernel.org/r/20210818214021.2476230-4-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Suggested-by: Joe Perches <joe@perches.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  checkpatch: add __alloc_size() to known $Attribute
Kees Cook [Tue, 24 Aug 2021 00:00:17 +0000 (10:00 +1000)]
checkpatch: add __alloc_size() to known $Attribute

Make sure checkpatch.pl doesn't get confused about finding the
__alloc_size attribute on functions.

Link: https://lkml.kernel.org/r/20210818214021.2476230-3-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Suggested-by: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  Compiler Attributes: Add __alloc_size for better bounds checking fix
Kees Cook [Tue, 31 Aug 2021 22:23:16 +0000 (15:23 -0700)]
Compiler Attributes: Add __alloc_size for better bounds checking fix

Adjust the warning logic to deal with pre-9.1 gcc behaviors.

Link: https://lkml.kernel.org/r/20210827151327.2729736-1-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  Compiler Attributes: add __alloc_size() for better bounds checking
Kees Cook [Tue, 24 Aug 2021 00:00:17 +0000 (10:00 +1000)]
Compiler Attributes: add __alloc_size() for better bounds checking

Patch series "Add __alloc_size() for better bounds checking", v2.

GCC and Clang both use the "alloc_size" attribute to assist with bounds
checking around the use of allocation functions.  Add the attribute,
adjust the Makefile to silence needless warnings, and add the hints to the
allocators where possible.  These changes have been in use for a while now
in GrapheneOS.

This patch (of 2):

GCC and Clang can use the "alloc_size" attribute to better inform the
results of __builtin_object_size() (for compile-time constant values).
Clang can additionally use alloc_size to inform the results of
__builtin_dynamic_object_size() (for run-time values).
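
As a small illustration of what the hint provides (not code from this
patch; 'my_alloc' is a made-up function):

    void *my_alloc(size_t n) __alloc_size(1);

    char *p = my_alloc(16);
    /* With the attribute, __builtin_object_size(p, 1) can evaluate to 16,
     * so CONFIG_FORTIFY_SOURCE-style checks can flag writes past 16 bytes. */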

Because GCC sees the frequent use of struct_size() as an allocator size
argument, and notices it can return SIZE_MAX (the overflow indication),
it complains that these call sites may overflow (since SIZE_MAX is
greater than the default -Walloc-size-larger-than=PTRDIFF_MAX).  This
isn't helpful since we already know a SIZE_MAX will be caught at
run-time (this was an intentional design).  Instead, just disable this
check as it is both a false positive and redundant.  (Clang does not
have this warning option.)

Link: https://lkml.kernel.org/r/20210818214021.2476230-1-keescook@chromium.org
Link: https://lkml.kernel.org/r/20210818214021.2476230-2-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  mm: unexport {,un}lock_page_memcg
Christoph Hellwig [Tue, 24 Aug 2021 00:00:16 +0000 (10:00 +1000)]
mm: unexport {,un}lock_page_memcg

These are only used in built-in core mm code.

Link: https://lkml.kernel.org/r/20210820095815.445392-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  mm: unexport folio_memcg_{,un}lock
Christoph Hellwig [Tue, 24 Aug 2021 00:00:16 +0000 (10:00 +1000)]
mm: unexport folio_memcg_{,un}lock

Patch series "unexport memcg locking helpers".

Neither the old page-based nor the new folio-based memcg locking helpers
are used in modular code at all, so drop the exports.

This patch (of 2):

folio_memcg_{,un}lock are only used in built-in core mm code.

Link: https://lkml.kernel.org/r/20210820095815.445392-1-hch@lst.de
Link: https://lkml.kernel.org/r/20210820095815.445392-2-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  mm: migrate: change to use bool type for 'page_was_mapped'
Baolin Wang [Tue, 24 Aug 2021 00:00:16 +0000 (10:00 +1000)]
mm: migrate: change to use bool type for 'page_was_mapped'

Change the 'page_was_mapped' variable to use the bool type, making it
more readable.

Link: https://lkml.kernel.org/r/ce1279df18d2c163998c403e0b5ec6d3f6f90f7a.1629447552.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  mm: migrate: fix the incorrect function name in comments
Baolin Wang [Tue, 24 Aug 2021 00:00:16 +0000 (10:00 +1000)]
mm: migrate: fix the incorrect function name in comments

Since commit a98a2f0c8ce1 ("mm/rmap: split migration into its own
function"), the establishment of migration ptes has been split into a
separate try_to_migrate() function, so update the related comments.

Link: https://lkml.kernel.org/r/5b824bad6183259c916ae6cf42f81d14c6118b06.1629447552.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  mm: migrate: introduce a local variable to get the number of pages
Baolin Wang [Tue, 24 Aug 2021 00:00:16 +0000 (10:00 +1000)]
mm: migrate: introduce a local variable to get the number of pages

Use thp_nr_pages() instead of compound_nr() to get the number of pages
for a THP page, and introduce a local variable 'nr_pages' to avoid
getting the number of pages repeatedly.

Link: https://lkml.kernel.org/r/a8e331ac04392ee230c79186330fb05e86a2aa77.1629447552.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  mm: migrate: simplify the file-backed pages validation when migrating its mapping
Baolin Wang [Tue, 24 Aug 2021 00:00:15 +0000 (10:00 +1000)]
mm: migrate: simplify the file-backed pages validation when migrating its mapping

Patch series "Some cleanup for page migration", v3.

This patchset does some cleanups and improvements for page migration.

This patch (of 4):

There is no need to validate the file-backed page's refcount before
trying to freeze the page's expected refcount; instead we can rely on
folio_ref_freeze() to validate whether the page has the expected
refcount before migrating its mapping.

Moreover we are always under the page lock when migrating the page
mapping, which means nowhere else can remove it from the page cache, so we
can remove the xas_load() validation under the i_pages lock.
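
The simplification boils down to this kind of check (a sketch, not the
full migrate code; 'folio', 'expected_count' and 'xas' name the obvious
local state of the surrounding function):

    /* folio_ref_freeze() only succeeds if the refcount matches, so no
     * separate refcount validation is needed beforehand. */
    if (!folio_ref_freeze(folio, expected_count)) {
            xas_unlock_irq(&xas);
            return -EAGAIN;
    }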

Link: https://lkml.kernel.org/r/cover.1629447552.git.baolin.wang@linux.alibaba.com
Link: https://lkml.kernel.org/r/df4c129fd8e86a95dbc55f4663d77441cc0d3bd1.1629447552.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  mm: move kvmalloc-related functions to slab.h
Matthew Wilcox (Oracle) [Tue, 24 Aug 2021 00:00:15 +0000 (10:00 +1000)]
mm: move kvmalloc-related functions to slab.h

Not all files in the kernel should include mm.h.  Migrating callers from
kmalloc to kvmalloc is easier if the kvmalloc functions are in slab.h.

[akpm@linux-foundation.org: move the new kvrealloc() also]
Link: https://lkml.kernel.org/r/20210622215757.3525604-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  mm/workingset: correct kernel-doc notations
Randy Dunlap [Tue, 24 Aug 2021 00:00:15 +0000 (10:00 +1000)]
mm/workingset: correct kernel-doc notations

Use the documented kernel-doc format to prevent kernel-doc warnings.

mm/workingset.c:256: warning: No description found for return value of 'workingset_eviction'
mm/workingset.c:285: warning: Function parameter or member 'folio' not described in 'workingset_refault'
mm/workingset.c:285: warning: Excess function parameter 'page' description in 'workingset_refault'
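
For reference, the documented kernel-doc layout being applied looks like
this (a generic illustration, not the actual comment text):

    /**
     * function_name - Short one-line description.
     * @folio: Description of the parameter.
     *
     * Longer description if needed.
     *
     * Return: Description of the return value.
     */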

Link: https://lkml.kernel.org/r/20210808203153.10678-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
3 years ago  Merge branch 'akpm-current/current'
Stephen Rothwell [Thu, 2 Sep 2021 05:16:45 +0000 (15:16 +1000)]
Merge branch 'akpm-current/current'

3 years ago  Merge remote-tracking branch 'folio/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 04:27:19 +0000 (14:27 +1000)]
Merge remote-tracking branch 'folio/for-next'

# Conflicts:
# mm/filemap.c
# mm/rmap.c
# mm/util.c

3 years ago  Merge remote-tracking branch 'cxl/next'
Stephen Rothwell [Thu, 2 Sep 2021 04:25:36 +0000 (14:25 +1000)]
Merge remote-tracking branch 'cxl/next'

3 years ago  Merge remote-tracking branch 'rust/rust-next'
Stephen Rothwell [Thu, 2 Sep 2021 04:10:14 +0000 (14:10 +1000)]
Merge remote-tracking branch 'rust/rust-next'

3 years ago  Merge remote-tracking branch 'memblock/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 04:08:04 +0000 (14:08 +1000)]
Merge remote-tracking branch 'memblock/for-next'

3 years ago  Merge remote-tracking branch 'kunit-next/kunit'
Stephen Rothwell [Thu, 2 Sep 2021 04:05:54 +0000 (14:05 +1000)]
Merge remote-tracking branch 'kunit-next/kunit'

3 years ago  Merge remote-tracking branch 'kgdb/kgdb/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 04:03:42 +0000 (14:03 +1000)]
Merge remote-tracking branch 'kgdb/kgdb/for-next'

3 years ago  Merge remote-tracking branch 'auxdisplay/auxdisplay'
Stephen Rothwell [Thu, 2 Sep 2021 04:02:04 +0000 (14:02 +1000)]
Merge remote-tracking branch 'auxdisplay/auxdisplay'

3 years ago  Merge remote-tracking branch 'hyperv/hyperv-next'
Stephen Rothwell [Thu, 2 Sep 2021 03:59:47 +0000 (13:59 +1000)]
Merge remote-tracking branch 'hyperv/hyperv-next'

3 years ago  Merge remote-tracking branch 'nvmem/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 03:59:46 +0000 (13:59 +1000)]
Merge remote-tracking branch 'nvmem/for-next'

3 years ago  Merge remote-tracking branch 'slimbus/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 03:59:44 +0000 (13:59 +1000)]
Merge remote-tracking branch 'slimbus/for-next'

3 years ago  Merge remote-tracking branch 'gnss/gnss-next'
Stephen Rothwell [Thu, 2 Sep 2021 03:58:08 +0000 (13:58 +1000)]
Merge remote-tracking branch 'gnss/gnss-next'

3 years ago  Merge remote-tracking branch 'kspp/for-next/kspp'
Stephen Rothwell [Thu, 2 Sep 2021 03:42:47 +0000 (13:42 +1000)]
Merge remote-tracking branch 'kspp/for-next/kspp'

# Conflicts:
# Makefile

3 years ago  Merge remote-tracking branch 'ntb/ntb-next'
Stephen Rothwell [Thu, 2 Sep 2021 03:41:08 +0000 (13:41 +1000)]
Merge remote-tracking branch 'ntb/ntb-next'

3 years ago  Merge remote-tracking branch 'at24/at24/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 03:39:32 +0000 (13:39 +1000)]
Merge remote-tracking branch 'at24/at24/for-next'

3 years ago  Merge remote-tracking branch 'nvdimm/libnvdimm-for-next'
Stephen Rothwell [Thu, 2 Sep 2021 03:36:46 +0000 (13:36 +1000)]
Merge remote-tracking branch 'nvdimm/libnvdimm-for-next'

3 years ago  Merge remote-tracking branch 'rtc/rtc-next'
Stephen Rothwell [Thu, 2 Sep 2021 03:34:37 +0000 (13:34 +1000)]
Merge remote-tracking branch 'rtc/rtc-next'

3 years ago  Merge remote-tracking branch 'coresight/next'
Stephen Rothwell [Thu, 2 Sep 2021 03:34:31 +0000 (13:34 +1000)]
Merge remote-tracking branch 'coresight/next'

3 years ago  Merge remote-tracking branch 'livepatching/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 03:32:23 +0000 (13:32 +1000)]
Merge remote-tracking branch 'livepatching/for-next'

3 years ago  Merge remote-tracking branch 'kselftest/next'
Stephen Rothwell [Thu, 2 Sep 2021 03:30:47 +0000 (13:30 +1000)]
Merge remote-tracking branch 'kselftest/next'

3 years ago  Merge remote-tracking branch 'userns/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 03:16:56 +0000 (13:16 +1000)]
Merge remote-tracking branch 'userns/for-next'

3 years ago  Merge remote-tracking branch 'pwm/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 03:15:18 +0000 (13:15 +1000)]
Merge remote-tracking branch 'pwm/for-next'

3 years ago  Merge remote-tracking branch 'pinctrl/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 03:13:04 +0000 (13:13 +1000)]
Merge remote-tracking branch 'pinctrl/for-next'

3 years ago  Merge remote-tracking branch 'gpio-brgl/gpio/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 03:05:22 +0000 (13:05 +1000)]
Merge remote-tracking branch 'gpio-brgl/gpio/for-next'

3 years ago  Merge remote-tracking branch 'rpmsg/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 03:03:11 +0000 (13:03 +1000)]
Merge remote-tracking branch 'rpmsg/for-next'

3 years ago  Merge remote-tracking branch 'vhost/linux-next'
Stephen Rothwell [Thu, 2 Sep 2021 03:00:58 +0000 (13:00 +1000)]
Merge remote-tracking branch 'vhost/linux-next'

# Conflicts:
# drivers/virtio/virtio.c
# include/uapi/linux/virtio_ids.h

3 years ago  Merge remote-tracking branch 'scsi/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 02:50:11 +0000 (12:50 +1000)]
Merge remote-tracking branch 'scsi/for-next'

# Conflicts:
# drivers/scsi/st.c

3 years ago  Merge remote-tracking branch 'cgroup/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 02:50:09 +0000 (12:50 +1000)]
Merge remote-tracking branch 'cgroup/for-next'

3 years ago  Merge remote-tracking branch 'dmaengine/next'
Stephen Rothwell [Thu, 2 Sep 2021 02:47:42 +0000 (12:47 +1000)]
Merge remote-tracking branch 'dmaengine/next'

3 years ago  Merge remote-tracking branch 'vfio/next'
Stephen Rothwell [Thu, 2 Sep 2021 02:38:01 +0000 (12:38 +1000)]
Merge remote-tracking branch 'vfio/next'

3 years ago  Merge remote-tracking branch 'extcon/extcon-next'
Stephen Rothwell [Thu, 2 Sep 2021 02:36:22 +0000 (12:36 +1000)]
Merge remote-tracking branch 'extcon/extcon-next'

3 years ago  Merge remote-tracking branch 'usb/usb-next'
Stephen Rothwell [Thu, 2 Sep 2021 02:34:46 +0000 (12:34 +1000)]
Merge remote-tracking branch 'usb/usb-next'

3 years ago  Merge remote-tracking branch 'ipmi/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 02:33:10 +0000 (12:33 +1000)]
Merge remote-tracking branch 'ipmi/for-next'

3 years ago  Merge remote-tracking branch 'chrome-platform/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 02:31:03 +0000 (12:31 +1000)]
Merge remote-tracking branch 'chrome-platform/for-next'

3 years ago  Merge remote-tracking branch 'drivers-x86/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 02:26:37 +0000 (12:26 +1000)]
Merge remote-tracking branch 'drivers-x86/for-next'

3 years ago  Merge remote-tracking branch 'percpu/for-next'
Stephen Rothwell [Thu, 2 Sep 2021 02:26:37 +0000 (12:26 +1000)]
Merge remote-tracking branch 'percpu/for-next'

3 years ago  Merge remote-tracking branch 'xen-tip/linux-next'
Stephen Rothwell [Thu, 2 Sep 2021 02:24:30 +0000 (12:24 +1000)]
Merge remote-tracking branch 'xen-tip/linux-next'

3 years ago  Merge remote-tracking branch 'kvms390/next'
Stephen Rothwell [Thu, 2 Sep 2021 02:09:06 +0000 (12:09 +1000)]
Merge remote-tracking branch 'kvms390/next'

3 years ago  Merge remote-tracking branch 'kvm-arm/next'
Stephen Rothwell [Thu, 2 Sep 2021 02:09:03 +0000 (12:09 +1000)]
Merge remote-tracking branch 'kvm-arm/next'

3 years ago  Merge remote-tracking branch 'kvm/next'
Stephen Rothwell [Thu, 2 Sep 2021 02:09:02 +0000 (12:09 +1000)]
Merge remote-tracking branch 'kvm/next'

3 years ago  Merge remote-tracking branch 'rcu/rcu/next'
Stephen Rothwell [Thu, 2 Sep 2021 01:53:40 +0000 (11:53 +1000)]
Merge remote-tracking branch 'rcu/rcu/next'

# Conflicts:
# kernel/time/tick-internal.h