]> www.infradead.org Git - users/jedix/linux-maple.git/log
users/jedix/linux-maple.git
2 years agomaple_tree: Remove BUG_ON() from mas_topiary_range()
Liam R. Howlett [Thu, 1 Sep 2022 01:10:02 +0000 (21:10 -0400)]
maple_tree: Remove BUG_ON() from mas_topiary_range()

In the even of trying to remove data from a leaf node by use of
mas_topiary_range(), simply log the issue and return.  This will be
quickly followed by a crash but logging and returning will increase the
probability of meaningful data.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: Remove BUG_ON() from mas_set_height()
Liam R. Howlett [Thu, 1 Sep 2022 01:06:49 +0000 (21:06 -0400)]
maple_tree: Remove BUG_ON() from mas_set_height()

Use MT_WARN_ON() instead of MT_BUG_ON() as to avoid crashing the kernel.
In the unlikely even of a tree height of > 31, try to increase the
probability of the error being logged.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: Avoid BUG_ON() when setting a leaf node as a parent
Liam R. Howlett [Thu, 1 Sep 2022 01:03:02 +0000 (21:03 -0400)]
maple_tree: Avoid BUG_ON() when setting a leaf node as a parent

Don't crash the kernel in the very unlikely case that the node passed as
the parent is actually a leaf.  Instead, use MT_WARN_ON() to increase
the probability of the debug information being recorded.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: Convert debug code to use MT_WARN_ON()
Liam R. Howlett [Wed, 31 Aug 2022 19:55:15 +0000 (15:55 -0400)]
maple_tree: Convert debug code to use MT_WARN_ON()

Using MT_WARN_ON() allows for the removal of if statements before
logging.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: Change RCU checks to WARN_ON() instead of BUG_ON()
Liam R. Howlett [Thu, 1 Sep 2022 21:33:43 +0000 (17:33 -0400)]
maple_tree: Change RCU checks to WARN_ON() instead of BUG_ON()

If RCU is enabled and the tree isn't locked, just warn the user and
avoid crashing the kernel.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: Convert BUG_ON() to MT_BUG_ON()
Liam R. Howlett [Wed, 31 Aug 2022 18:55:26 +0000 (14:55 -0400)]
maple_tree: Convert BUG_ON() to MT_BUG_ON()

Use MT_BUG_ON() to get more information when running with
MAPLE_TREE_DEBUG enabled.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: Add MT_WARN_ON()
Liam R. Howlett [Wed, 31 Aug 2022 19:01:11 +0000 (15:01 -0400)]
maple_tree: Add MT_WARN_ON()

Add a debug macro to dump the tree and warn the user.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: remove extra smp_wmb() from mas_dead_leaves()
Liam R. Howlett [Thu, 24 Nov 2022 15:55:31 +0000 (10:55 -0500)]
maple_tree: remove extra smp_wmb() from mas_dead_leaves()

The call to mte_set_dead_node() before the smp_wmb() already calls
smp_wmb() so this is not needed.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: Fix freeing of nodes in rcu mode
Liam R. Howlett [Thu, 24 Nov 2022 15:51:37 +0000 (10:51 -0500)]
maple_tree: Fix freeing of nodes in rcu mode

The walk to destroy the nodes was not always setting the node type and
would result in a destroy method potentially using the values as nodes.
Avoid this by setting the correct node types.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: Protect mas_prev_nentry() from dead nodes
Liam R. Howlett [Thu, 24 Nov 2022 15:50:21 +0000 (10:50 -0500)]
maple_tree: Protect mas_prev_nentry() from dead nodes

Don't use the pivot array before checking if the node is dead.  This is
for safe use in rcu mode.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: Detect dead nodes in ma_data_end()
Liam R. Howlett [Thu, 24 Nov 2022 15:49:27 +0000 (10:49 -0500)]
maple_tree: Detect dead nodes in ma_data_end()

If there is a dead node, then don't use any of the data and just return
0.  The caller will need to validate the node is not dead before using
the returned data anyways.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: Detect dead nodes in mas_start()
Liam R. Howlett [Thu, 24 Nov 2022 15:48:23 +0000 (10:48 -0500)]
maple_tree: Detect dead nodes in mas_start()

When initially starting a search, the root node may already be in the
process of being replaced in RCU mode.  Detect and restart the walk if
this is the case.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: Protect ma_pivots() from using dead nodes
Liam R. Howlett [Thu, 24 Nov 2022 15:46:45 +0000 (10:46 -0500)]
maple_tree: Protect ma_pivots() from using dead nodes

Ensure the node type being used isn't going to return an invalid
pointer.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: Reduce loops in mt_validate()
Liam R. Howlett [Tue, 15 Nov 2022 16:19:29 +0000 (11:19 -0500)]
maple_tree: Reduce loops in mt_validate()

mt_validate() is taking too long when kasan and memory poisoning is
enabled which is delaying the rcu free operation which is causing issues
running LTP test msgstress03.  Change the validation to only loop over a
node once as apposed to many times for each test.  Also only walk the
tree once by keeping track of if the last leaf had a null at a higher
level.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: Add format option to mt_dump()
Liam R. Howlett [Tue, 15 Nov 2022 16:08:36 +0000 (11:08 -0500)]
maple_tree: Add format option to mt_dump()

Allow different formatting strings to be used when dumping the tree.
Currently supports hex and decimal.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: Clean up mas_parent_enum()
Liam R. Howlett [Tue, 15 Nov 2022 14:29:33 +0000 (09:29 -0500)]
maple_tree: Clean up mas_parent_enum()

mas_parent_enum() is a simple wrapper for mte_parent_enum() which is
only called from that wrapper.  Remove the wrapper and inline
mte_parent_enum() into mas_parent_enum().

At the same time, clean up the bit masking of the root pointer since it
cannot be set by the time the bit masking occurs.  Change the check on
the root bit to a WARN_ON(), and fix the verification code to not
trigger the WARN_ON() before checking if the node is root.

Reported-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: should get pivots boundary by type
Liam R. Howlett [Sat, 12 Nov 2022 23:43:08 +0000 (23:43 +0000)]
maple_tree: should get pivots boundary by type

We should get pivots boundary by type.

Fixes: 54a611b60590 (Maple Tree: add new data structure)
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agovma_merge: Set vma iterator to correct position.
Liam R. Howlett [Wed, 16 Nov 2022 17:29:34 +0000 (12:29 -0500)]
vma_merge: Set vma iterator to correct position.

When merging the previous value, set the vma iterator to the previous
slot.  Don't use the vma iterator to get the next/prev so that it is in
the correct position for a write.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm/mmap: Remove __vma_adjust()
Liam R. Howlett [Thu, 11 Aug 2022 20:19:43 +0000 (16:19 -0400)]
mm/mmap: Remove __vma_adjust()

Inline the work of __vma_adjust() into vma_merge().  This reduces code
size and has the added benefits of the comments for the cases being
located with the code.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm/mmap: Convert do_brk_flags() to use vma_prepare() and vma_complete()
Liam R. Howlett [Thu, 11 Aug 2022 15:54:23 +0000 (11:54 -0400)]
mm/mmap: Convert do_brk_flags() to use vma_prepare() and vma_complete()

Use the abstracted vma locking for do_brk_flags()

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm/mmap: Introduce dup_vma_anon() helper
Liam R. Howlett [Wed, 10 Aug 2022 22:11:48 +0000 (18:11 -0400)]
mm/mmap: Introduce dup_vma_anon() helper

Create a helper for duplicating the anon vma when adjusting the vma.
This simplifies the logic of __vma_adjust().

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm/mmap: Don't use __vma_adjust() in shift_arg_pages()
Liam R. Howlett [Tue, 15 Nov 2022 19:57:13 +0000 (14:57 -0500)]
mm/mmap: Don't use __vma_adjust() in shift_arg_pages()

Introduce shrink_vma() which uses the vma_prepare() and vma_complete()
functions to reduce the vma coverage.

Convert shift_arg_pages() to use expand_vma() and the new shrink_vma()
function.  Remove support from __vma_adjust() to reduce a vma size since
shift_arg_pages() is the only user that shrinks a VMA in this way.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm: Don't use __vma_adjust() in __split_vma()
Liam R. Howlett [Tue, 15 Nov 2022 19:47:59 +0000 (14:47 -0500)]
mm: Don't use __vma_adjust() in __split_vma()

Use the abstracted locking and maple tree operations.  Since
__split_vma() is the only user of the __vma_adjust() function to use the
insert argument, drop that argument.  Remove the NULL passed through
from fs/exec's shift_arg_pages() at the same time.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm/mmap: Introduce init_vma_prep() and init_multi_vma_prep()
Liam R. Howlett [Tue, 15 Nov 2022 18:26:52 +0000 (13:26 -0500)]
mm/mmap: Introduce init_vma_prep() and init_multi_vma_prep()

Add init_vma_prep() and init_multi_vma_prep() to set up the struct
vma_prepare.  This is to abstract the locking when adjusting the VMAs.

Also change __vma_adjust() variable remove_next int in favour of a
pointer to the VMA to remove.  Rename next_next to remove2 since this
better reflects its use.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm/mmap: Use vma_prepare() and vma_complete() in vma_expand()
Liam R. Howlett [Fri, 5 Aug 2022 20:15:12 +0000 (16:15 -0400)]
mm/mmap: Use vma_prepare() and vma_complete() in vma_expand()

Use the new locking functions for vma_expand().  This reduces code
duplication.

At the same time change VM_BUG_ON() to VM_WARN_ON()

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm/mmap: Refactor locking out of __vma_adjust()
Liam R. Howlett [Fri, 5 Aug 2022 20:07:25 +0000 (16:07 -0400)]
mm/mmap: Refactor locking out of __vma_adjust()

Move the locking into vma_prepare() and vma_complete() for use elsewhere

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm/mmap: move anon_vma setting in __vma_adjust()
Liam R. Howlett [Fri, 5 Aug 2022 20:04:16 +0000 (16:04 -0400)]
mm/mmap: move anon_vma setting in __vma_adjust()

Move the anon_vma setting & warn_no up the function.  This is done to
clear up the locking later.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm: Change munmap splitting order and move_vma()
Liam R. Howlett [Tue, 15 Nov 2022 17:00:59 +0000 (12:00 -0500)]
mm: Change munmap splitting order and move_vma()

Splitting can be more efficient when the order is not of concern.
Change do_vmi_align_munmap() to reduce walking of the tree during split
operations.

move_vma() must also be altered to remove the dependency of keeping the
original VMA as the active part of the split.  Transition to using vma
iterator to look up the prev and/or next vma after munmap.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agommap: Clean up mmap_region() unrolling
Liam R. Howlett [Tue, 15 Nov 2022 15:25:26 +0000 (10:25 -0500)]
mmap: Clean up mmap_region() unrolling

Move logic of unrolling to the error path as apposed to duplicating it
within the function body.  This reduces the potential of missing an
update to one path when making changes.

Cc: Li Zetao <lizetao1@huawei.com>
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm: Add vma iterator to vma_adjust() arguments
Liam R. Howlett [Mon, 14 Nov 2022 19:54:13 +0000 (14:54 -0500)]
mm: Add vma iterator to vma_adjust() arguments

Change the vma_adjust() function definition to accept the vma iterator
and pass it through to __vma_adjust().

Update fs/exec to use the new vma_adjust() function parameters.

Revert the __split_vma() calls back from __vma_adjust() to vma_adjust()
and pass through the vma iterator.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm: Pass vma iterator through to __vma_adjust()
Liam R. Howlett [Mon, 14 Nov 2022 16:10:05 +0000 (11:10 -0500)]
mm: Pass vma iterator through to __vma_adjust()

Pass the iterator through to be used in __vma_adjust().  The state of
the iterator needs to be correct for the operation that will occur so
make the adjustments.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm: Remove unnecessary write to vma iterator in __vma_adjust()
Liam R. Howlett [Mon, 14 Nov 2022 16:08:16 +0000 (11:08 -0500)]
mm: Remove unnecessary write to vma iterator in __vma_adjust()

If the vma start address is going to change due to an insert, then it is
safe to not write the vma to the tree.  The write of the insert vma will
alter the tree as necessary.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomadvise: Use split_vma() instead of __split_vma()
Liam R. Howlett [Mon, 14 Nov 2022 16:02:00 +0000 (11:02 -0500)]
madvise: Use split_vma() instead of __split_vma()

The split_vma() wrapper is specifically for this use case, so use it.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm: Pass through vma iterator to __vma_adjust()
Liam R. Howlett [Fri, 11 Nov 2022 20:41:07 +0000 (15:41 -0500)]
mm: Pass through vma iterator to __vma_adjust()

Pass the vma iterator through to __vma_adjust() so the state can be
updated.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agommap: Convert __vma_adjust() to use vma iterator
Liam R. Howlett [Thu, 10 Nov 2022 19:32:51 +0000 (14:32 -0500)]
mmap: Convert __vma_adjust() to use vma iterator

Use the vma iterator internally for __vma_adjust().  Avoid using the
maple tree interface directly for type safety.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm/damon: Stop using vma_mas_store() for maple tree store
Liam R. Howlett [Tue, 13 Dec 2022 20:24:45 +0000 (15:24 -0500)]
mm/damon: Stop using vma_mas_store() for maple tree store

Prepare for the removal of the vma_mas_store() function by open coding
the maple tree store in this test code.  Set the range of the maple
state and call the store function directly.

Cc: SeongJae Park <sj@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm: Switch vma_merge(), split_vma(), and __split_vma to vma iterator
Liam R. Howlett [Thu, 10 Nov 2022 19:12:07 +0000 (14:12 -0500)]
mm: Switch vma_merge(), split_vma(), and __split_vma to vma iterator

Drop the vmi_* functions and transition all users to use the vma
iterator directly.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm/mremap: Use vmi version of vma_merge()
Liam R. Howlett [Thu, 10 Nov 2022 18:47:58 +0000 (13:47 -0500)]
mm/mremap: Use vmi version of vma_merge()

Use the vma iterator so that the iterator can be invalidated or updated
to avoid each caller doing so.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agommap: Use vmi version of vma_merge()
Liam R. Howlett [Thu, 10 Nov 2022 18:32:58 +0000 (13:32 -0500)]
mmap: Use vmi version of vma_merge()

Use the vma iterator so that the iterator can be invalidated or updated
to avoid each caller doing so.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agommap: Pass through vmi iterator to __split_vma()
Liam R. Howlett [Thu, 10 Nov 2022 18:02:37 +0000 (13:02 -0500)]
mmap: Pass through vmi iterator to __split_vma()

Use the vma iterator so that the iterator can be invalidated or updated
to avoid each caller doing so.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomadvise: Use vmi iterator for __split_vma() and vma_merge()
Liam R. Howlett [Thu, 10 Nov 2022 18:02:10 +0000 (13:02 -0500)]
madvise: Use vmi iterator for __split_vma() and vma_merge()

Use the vma iterator so that the iterator can be invalidated or updated
to avoid each caller doing so.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agosched: Convert to vma iterator
Liam R. Howlett [Wed, 9 Nov 2022 19:31:34 +0000 (14:31 -0500)]
sched: Convert to vma iterator

Use the vma iterator so that the iterator can be invalidated or updated
to avoid each caller doing so.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agotask_mmu: Convert to vma iterator
Liam R. Howlett [Wed, 9 Nov 2022 19:31:21 +0000 (14:31 -0500)]
task_mmu: Convert to vma iterator

Use the vma iterator so that the iterator can be invalidated or updated
to avoid each caller doing so.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomempolicy: Convert to vma iterator
Liam R. Howlett [Wed, 9 Nov 2022 19:30:58 +0000 (14:30 -0500)]
mempolicy: Convert to vma iterator

Use the vma iterator so that the iterator can be invalidated or updated
to avoid each caller doing so.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agocoredump: Convert to vma iterator
Liam R. Howlett [Wed, 9 Nov 2022 19:29:59 +0000 (14:29 -0500)]
coredump: Convert to vma iterator

Use the vma iterator so that the iterator can be invalidated or updated
to avoid each caller doing so.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomlock: Convert mlock to vma iterator
Liam R. Howlett [Wed, 9 Nov 2022 19:29:33 +0000 (14:29 -0500)]
mlock: Convert mlock to vma iterator

Use the vma iterator so that the iterator can be invalidated or updated
to avoid each caller doing so.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm: Change mprotect_fixup to vma iterator
Liam R. Howlett [Wed, 9 Nov 2022 19:28:34 +0000 (14:28 -0500)]
mm: Change mprotect_fixup to vma iterator

Use the vma iterator so that the iterator can be invalidated or updated
to avoid each caller doing so.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agouserfaultfd: Use vma iterator
Liam R. Howlett [Wed, 9 Nov 2022 18:16:05 +0000 (13:16 -0500)]
userfaultfd: Use vma iterator

Use the vma iterator so that the iterator can be invalidated or updated
to avoid each caller doing so.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agoipc/shm: Use the vma iterator for munmap calls
Liam R. Howlett [Mon, 14 Nov 2022 19:53:13 +0000 (14:53 -0500)]
ipc/shm: Use the vma iterator for munmap calls

Pass through the vma iterator to do_vmi_munmap() to handle the iterator
state internally

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm: Add temporary vma iterator versions of vma_merge(), split_vma(), and
Liam R. Howlett [Wed, 9 Nov 2022 17:56:02 +0000 (12:56 -0500)]
mm: Add temporary vma iterator versions of vma_merge(), split_vma(), and
__split_vma()

These wrappers are short-lived in this patch set so that each user can
be converted on its own.  In the end, these functions are renamed in one
commit.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agommap: Convert vma_expand() to use vma iterator
Liam R. Howlett [Wed, 9 Nov 2022 16:30:19 +0000 (11:30 -0500)]
mmap: Convert vma_expand() to use vma iterator

Use the vma iterator instead of the maple state for type safety and for
consistency through the mm code.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agommap: Change do_mas_munmap and do_mas_aligned_munmap() to use vma iterator
Liam R. Howlett [Wed, 9 Nov 2022 16:27:11 +0000 (11:27 -0500)]
mmap: Change do_mas_munmap and do_mas_aligned_munmap() to use vma iterator

Start passing the vma iterator through the mm code.  This will allow for
reuse of the state and cleaner invalidation if necessary.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm/mmap: Remove preallocation from do_mas_align_munmap()
Liam R. Howlett [Wed, 9 Nov 2022 15:29:14 +0000 (10:29 -0500)]
mm/mmap: Remove preallocation from do_mas_align_munmap()

In preparation of passing the vma state through split, the
pre-allocation that occurs before the split has to be moved to after.
Since the preallocation would then live right next to the store, just
call store instead of preallocating.  This effectively restores the
potential error path of splitting and not munmap'ing which pre-dates the
maple tree.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agommap: Convert vma_link() vma iterator
Liam R. Howlett [Wed, 9 Nov 2022 15:22:00 +0000 (10:22 -0500)]
mmap: Convert vma_link() vma iterator

Avoid using the maple tree interface directly.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agokernel/fork: Convert forking to using the vmi iterator
Liam R. Howlett [Tue, 8 Nov 2022 17:15:20 +0000 (12:15 -0500)]
kernel/fork: Convert forking to using the vmi iterator

Avoid using the maple tree interface directly.  This gains type safety.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm/mmap: convert brk to use vma iterator
Liam R. Howlett [Wed, 9 Nov 2022 15:16:57 +0000 (10:16 -0500)]
mm/mmap: convert brk to use vma iterator

Use the vma iterator API for the brk() system call.  This will provide
type safety at compile time.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomm: Expand vma iterator interface.
Liam R. Howlett [Tue, 8 Nov 2022 17:14:32 +0000 (12:14 -0500)]
mm: Expand vma iterator interface.

Add wrappers for the maple tree to the vma iterator.  This will provide
type safety at compile time.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agotest_maple_tree: Test modifications while iterating
Liam R. Howlett [Thu, 17 Nov 2022 19:34:11 +0000 (14:34 -0500)]
test_maple_tree: Test modifications while iterating

Add a testcase to ensure the iterator detects bad states on modifications
and does what the user expects

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: Reduce user error potential
Liam R. Howlett [Thu, 17 Nov 2022 19:32:36 +0000 (14:32 -0500)]
maple_tree: Reduce user error potential

When iterating, a user may operate on the tree and cause the maple state
to be altered and left in an unintuitive state.  Detect this scenario
and correct it by setting to the limit and invalidating the state.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: Fix potential rcu issue
Liam R. Howlett [Thu, 17 Nov 2022 19:31:38 +0000 (14:31 -0500)]
maple_tree: Fix potential rcu issue

Ensure the node isn't dead after reading the node end.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agomaple_tree: Add mas_init() function
Liam R. Howlett [Fri, 11 Nov 2022 20:39:16 +0000 (15:39 -0500)]
maple_tree: Add mas_init() function

Add a function that will zero out the maple state struct and set some
basic defaults.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 years agoLinux 6.1
Linus Torvalds [Sun, 11 Dec 2022 22:15:18 +0000 (14:15 -0800)]
Linux 6.1

2 years agoMerge tag 'iommu-fix-v6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro...
Linus Torvalds [Sun, 11 Dec 2022 17:49:39 +0000 (09:49 -0800)]
Merge tag 'iommu-fix-v6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu fix from Joerg Roedel:

 - Fix device mask to catch all affected devices in the recently added
   quirk for QAT devices in the Intel VT-d driver.

* tag 'iommu-fix-v6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Fix buggy QAT device mask

2 years agoMerge tag 'mm-hotfixes-stable-2022-12-10-1' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Sun, 11 Dec 2022 01:10:52 +0000 (17:10 -0800)]
Merge tag 'mm-hotfixes-stable-2022-12-10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "Nine hotfixes.

  Six for MM, three for other areas. Four of these patches address
  post-6.0 issues"

* tag 'mm-hotfixes-stable-2022-12-10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  memcg: fix possible use-after-free in memcg_write_event_control()
  MAINTAINERS: update Muchun Song's email
  mm/gup: fix gup_pud_range() for dax
  mmap: fix do_brk_flags() modifying obviously incorrect VMAs
  mm/swap: fix SWP_PFN_BITS with CONFIG_PHYS_ADDR_T_64BIT on 32bit
  tmpfs: fix data loss from failed fallocate
  kselftests: cgroup: update kmem test precision tolerance
  mm: do not BUG_ON missing brk mapping, because userspace can unmap it
  mailmap: update Matti Vaittinen's email address

2 years agoMerge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Linus Torvalds [Sat, 10 Dec 2022 18:14:52 +0000 (10:14 -0800)]
Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fix from Russell King:
 "One further ARM fix for 6.1 from Wang Kefeng, fixing up the handling
  for kfence faults"

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9278/1: kfence: only handle translation faults

2 years agomemcg: fix possible use-after-free in memcg_write_event_control()
Tejun Heo [Thu, 8 Dec 2022 02:53:15 +0000 (16:53 -1000)]
memcg: fix possible use-after-free in memcg_write_event_control()

memcg_write_event_control() accesses the dentry->d_name of the specified
control fd to route the write call.  As a cgroup interface file can't be
renamed, it's safe to access d_name as long as the specified file is a
regular cgroup file.  Also, as these cgroup interface files can't be
removed before the directory, it's safe to access the parent too.

Prior to 347c4a874710 ("memcg: remove cgroup_event->cft"), there was a
call to __file_cft() which verified that the specified file is a regular
cgroupfs file before further accesses.  The cftype pointer returned from
__file_cft() was no longer necessary and the commit inadvertently dropped
the file type check with it allowing any file to slip through.  With the
invarients broken, the d_name and parent accesses can now race against
renames and removals of arbitrary files and cause use-after-free's.

Fix the bug by resurrecting the file type check in __file_cft().  Now that
cgroupfs is implemented through kernfs, checking the file operations needs
to go through a layer of indirection.  Instead, let's check the superblock
and dentry type.

Link: https://lkml.kernel.org/r/Y5FRm/cfcKPGzWwl@slm.duckdns.org
Fixes: 347c4a874710 ("memcg: remove cgroup_event->cft")
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jann Horn <jannh@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: <stable@vger.kernel.org> [3.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agoMAINTAINERS: update Muchun Song's email
Muchun Song [Thu, 8 Dec 2022 11:55:48 +0000 (19:55 +0800)]
MAINTAINERS: update Muchun Song's email

I'm moving to the @linux.dev account.  Map my old addresses and update it
to my new address.

Link: https://lkml.kernel.org/r/20221208115548.85244-1-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/gup: fix gup_pud_range() for dax
John Starks [Wed, 7 Dec 2022 06:00:53 +0000 (22:00 -0800)]
mm/gup: fix gup_pud_range() for dax

For dax pud, pud_huge() returns true on x86. So the function works as long
as hugetlb is configured. However, dax doesn't depend on hugetlb.
Commit 414fd080d125 ("mm/gup: fix gup_pmd_range() for dax") fixed
devmap-backed huge PMDs, but missed devmap-backed huge PUDs. Fix this as
well.

This fixes the below kernel panic:

general protection fault, probably for non-canonical address 0x69e7c000cc478: 0000 [#1] SMP
< snip >
Call Trace:
<TASK>
get_user_pages_fast+0x1f/0x40
iov_iter_get_pages+0xc6/0x3b0
? mempool_alloc+0x5d/0x170
bio_iov_iter_get_pages+0x82/0x4e0
? bvec_alloc+0x91/0xc0
? bio_alloc_bioset+0x19a/0x2a0
blkdev_direct_IO+0x282/0x480
? __io_complete_rw_common+0xc0/0xc0
? filemap_range_has_page+0x82/0xc0
generic_file_direct_write+0x9d/0x1a0
? inode_update_time+0x24/0x30
__generic_file_write_iter+0xbd/0x1e0
blkdev_write_iter+0xb4/0x150
? io_import_iovec+0x8d/0x340
io_write+0xf9/0x300
io_issue_sqe+0x3c3/0x1d30
? sysvec_reschedule_ipi+0x6c/0x80
__io_queue_sqe+0x33/0x240
? fget+0x76/0xa0
io_submit_sqes+0xe6a/0x18d0
? __fget_light+0xd1/0x100
__x64_sys_io_uring_enter+0x199/0x880
? __context_tracking_enter+0x1f/0x70
? irqentry_exit_to_user_mode+0x24/0x30
? irqentry_exit+0x1d/0x30
? __context_tracking_exit+0xe/0x70
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7fc97c11a7be
< snip >
</TASK>
---[ end trace 48b2e0e67debcaeb ]---
RIP: 0010:internal_get_user_pages_fast+0x340/0x990
< snip >
Kernel panic - not syncing: Fatal exception
Kernel Offset: disabled

Link: https://lkml.kernel.org/r/1670392853-28252-1-git-send-email-ssengar@linux.microsoft.com
Fixes: 414fd080d125 ("mm/gup: fix gup_pmd_range() for dax")
Signed-off-by: John Starks <jostarks@microsoft.com>
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agommap: fix do_brk_flags() modifying obviously incorrect VMAs
Liam Howlett [Mon, 5 Dec 2022 19:23:17 +0000 (19:23 +0000)]
mmap: fix do_brk_flags() modifying obviously incorrect VMAs

Add more sanity checks to the VMA that do_brk_flags() will expand.  Ensure
the VMA matches basic merge requirements within the function before
calling can_vma_merge_after().

Drop the duplicate checks from vm_brk_flags() since they will be enforced
later.

The old code would expand file VMAs on brk(), which is functionally
wrong and also dangerous in terms of locking because the brk() path
isn't designed for file VMAs and therefore doesn't lock the file
mapping.  Checking can_vma_merge_after() ensures that new anonymous
VMAs can't be merged into file VMAs.

See https://lore.kernel.org/linux-mm/CAG48ez1tJZTOjS_FjRZhvtDA-STFmdw8PEizPDwMGFd_ui0Nrw@mail.gmail.com/

Link: https://lkml.kernel.org/r/20221205192304.1957418-1-Liam.Howlett@oracle.com
Fixes: 2e7ce7d354f2 ("mm/mmap: change do_brk_flags() to expand existing VMA and add do_brk_munmap()")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Suggested-by: Jann Horn <jannh@google.com>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/swap: fix SWP_PFN_BITS with CONFIG_PHYS_ADDR_T_64BIT on 32bit
David Hildenbrand [Mon, 5 Dec 2022 15:08:57 +0000 (16:08 +0100)]
mm/swap: fix SWP_PFN_BITS with CONFIG_PHYS_ADDR_T_64BIT on 32bit

We use "unsigned long" to store a PFN in the kernel and phys_addr_t to
store a physical address.

On a 64bit system, both are 64bit wide.  However, on a 32bit system, the
latter might be 64bit wide.  This is, for example, the case on x86 with
PAE: phys_addr_t and PTEs are 64bit wide, while "unsigned long" only spans
32bit.

The current definition of SWP_PFN_BITS without MAX_PHYSMEM_BITS misses
that case, and assumes that the maximum PFN is limited by an 32bit
phys_addr_t.  This implies, that SWP_PFN_BITS will currently only be able
to cover 4 GiB - 1 on any 32bit system with 4k page size, which is wrong.

Let's rely on the number of bits in phys_addr_t instead, but make sure to
not exceed the maximum swap offset, to not make the BUILD_BUG_ON() in
is_pfn_swap_entry() unhappy.  Note that swp_entry_t is effectively an
unsigned long and the maximum swap offset shares that value with the swap
type.

For example, on an 8 GiB x86 PAE system with a kernel config based on
Debian 11.5 (-> CONFIG_FLATMEM=y, CONFIG_X86_PAE=y), we will currently
fail removing migration entries (remove_migration_ptes()), because
mm/page_vma_mapped.c:check_pte() will fail to identify a PFN match as
swp_offset_pfn() wrongly masks off PFN bits.  For example,
split_huge_page_to_list()->...->remap_page() will leave migration entries
in place and continue to unlock the page.

Later, when we stumble over these migration entries (e.g., via
/proc/self/pagemap), pfn_swap_entry_to_page() will BUG_ON() because these
migration entries shouldn't exist anymore and the page was unlocked.

[   33.067591] kernel BUG at include/linux/swapops.h:497!
[   33.067597] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[   33.067602] CPU: 3 PID: 742 Comm: cow Tainted: G            E      6.1.0-rc8+ #16
[   33.067605] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
[   33.067606] EIP: pagemap_pmd_range+0x644/0x650
[   33.067612] Code: 00 00 00 00 66 90 89 ce b9 00 f0 ff ff e9 ff fb ff ff 89 d8 31 db e8 48 c6 52 00 e9 23 fb ff ff e8 61 83 56 00 e9 b6 fe ff ff <0f> 0b bf 00 f0 ff ff e9 38 fa ff ff 3e 8d 74 26 00 55 89 e5 57 31
[   33.067615] EAX: ee394000 EBX: 00000002 ECX: ee394000 EDX: 00000000
[   33.067617] ESI: c1b0ded4 EDI: 00024a00 EBP: c1b0ddb4 ESP: c1b0dd68
[   33.067619] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010246
[   33.067624] CR0: 80050033 CR2: b7a00000 CR3: 01bbbd20 CR4: 00350ef0
[   33.067625] Call Trace:
[   33.067628]  ? madvise_free_pte_range+0x720/0x720
[   33.067632]  ? smaps_pte_range+0x4b0/0x4b0
[   33.067634]  walk_pgd_range+0x325/0x720
[   33.067637]  ? mt_find+0x1d6/0x3a0
[   33.067641]  ? mt_find+0x1d6/0x3a0
[   33.067643]  __walk_page_range+0x164/0x170
[   33.067646]  walk_page_range+0xf9/0x170
[   33.067648]  ? __kmem_cache_alloc_node+0x2a8/0x340
[   33.067653]  pagemap_read+0x124/0x280
[   33.067658]  ? default_llseek+0x101/0x160
[   33.067662]  ? smaps_account+0x1d0/0x1d0
[   33.067664]  vfs_read+0x90/0x290
[   33.067667]  ? do_madvise.part.0+0x24b/0x390
[   33.067669]  ? debug_smp_processor_id+0x12/0x20
[   33.067673]  ksys_pread64+0x58/0x90
[   33.067675]  __ia32_sys_ia32_pread64+0x1b/0x20
[   33.067680]  __do_fast_syscall_32+0x4c/0xc0
[   33.067683]  do_fast_syscall_32+0x29/0x60
[   33.067686]  do_SYSENTER_32+0x15/0x20
[   33.067689]  entry_SYSENTER_32+0x98/0xf1

Decrease the indentation level of SWP_PFN_BITS and SWP_PFN_MASK to keep it
readable and consistent.

[david@redhat.com: rely on sizeof(phys_addr_t) and min_t() instead]
Link: https://lkml.kernel.org/r/20221206105737.69478-1-david@redhat.com
[david@redhat.com: use "int" for comparison, as we're only comparing numbers < 64]
Link: https://lkml.kernel.org/r/1f157500-2676-7cef-a84e-9224ed64e540@redhat.com
Link: https://lkml.kernel.org/r/20221205150857.167583-1-david@redhat.com
Fixes: 0d206b5d2e0d ("mm/swap: add swp_offset_pfn() to fetch PFN from swap entry")
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agotmpfs: fix data loss from failed fallocate
Hugh Dickins [Mon, 5 Dec 2022 00:51:50 +0000 (16:51 -0800)]
tmpfs: fix data loss from failed fallocate

Fix tmpfs data loss when the fallocate system call is interrupted by a
signal, or fails for some other reason.  The partial folio handling in
shmem_undo_range() forgot to consider this unfalloc case, and was liable
to erase or truncate out data which had already been committed earlier.

It turns out that none of the partial folio handling there is appropriate
for the unfalloc case, which just wants to proceed to removal of whole
folios: which find_get_entries() provides, even when partially covered.

Original patch by Rui Wang.

Link: https://lore.kernel.org/linux-mm/33b85d82.7764.1842e9ab207.Coremail.chenguoqic@163.com/
Link: https://lkml.kernel.org/r/a5dac112-cf4b-7af-a33-f386e347fd38@google.com
Fixes: b9a8a4195c7d ("truncate,shmem: Handle truncates that split large folios")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Guoqi Chen <chenguoqic@163.com>
Link: https://lore.kernel.org/all/20221101032248.819360-1-kernel@hev.cc/
Cc: Rui Wang <kernel@hev.cc>
Cc: Huacai Chen <chenhuacai@loongson.cn>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: <stable@vger.kernel.org> [5.17+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokselftests: cgroup: update kmem test precision tolerance
Michal Hocko [Fri, 2 Dec 2022 08:50:26 +0000 (09:50 +0100)]
kselftests: cgroup: update kmem test precision tolerance

1813e51eece0 ("memcg: increase MEMCG_CHARGE_BATCH to 64") has changed
the batch size while this test case has been left behind. This has led
to a test failure reported by test bot:
not ok 2 selftests: cgroup: test_kmem # exit=1

Update the tolerance for the pcp charges to reflect the
MEMCG_CHARGE_BATCH change to fix this.

[akpm@linux-foundation.org: update comments, per Roman]
Link: https://lkml.kernel.org/r/Y4m8Unt6FhWKC6IH@dhcp22.suse.cz
Fixes: 1813e51eece0a ("memcg: increase MEMCG_CHARGE_BATCH to 64")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: kernel test robot <yujie.liu@intel.com>
Link: https://lore.kernel.org/oe-lkp/202212010958.c1053bd3-yujie.liu@intel.com
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Tested-by: Yujie Liu <yujie.liu@intel.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Michal Koutný" <mkoutny@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm: do not BUG_ON missing brk mapping, because userspace can unmap it
Jason A. Donenfeld [Fri, 2 Dec 2022 16:27:24 +0000 (17:27 +0100)]
mm: do not BUG_ON missing brk mapping, because userspace can unmap it

The following program will trigger the BUG_ON that this patch removes,
because the user can munmap() mm->brk:

  #include <sys/syscall.h>
  #include <sys/mman.h>
  #include <assert.h>
  #include <unistd.h>

  static void *brk_now(void)
  {
    return (void *)syscall(SYS_brk, 0);
  }

  static void brk_set(void *b)
  {
    assert(syscall(SYS_brk, b) != -1);
  }

  int main(int argc, char *argv[])
  {
    void *b = brk_now();
    brk_set(b + 4096);
    assert(munmap(b - 4096, 4096 * 2) == 0);
    brk_set(b);
    return 0;
  }

Compile that with musl, since glibc actually uses brk(), and then
execute it, and it'll hit this splat:

  kernel BUG at mm/mmap.c:229!
  invalid opcode: 0000 [#1] PREEMPT SMP
  CPU: 12 PID: 1379 Comm: a.out Tainted: G S   U             6.1.0-rc7+ #419
  RIP: 0010:__do_sys_brk+0x2fc/0x340
  Code: 00 00 4c 89 ef e8 04 d3 fe ff eb 9a be 01 00 00 00 4c 89 ff e8 35 e0 fe ff e9 6e ff ff ff 4d 89 a7 20>
  RSP: 0018:ffff888140bc7eb0 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: 00000000007e7000 RCX: ffff8881020fe000
  RDX: ffff8881020fe001 RSI: ffff8881955c9b00 RDI: ffff8881955c9b08
  RBP: 0000000000000000 R08: ffff8881955c9b00 R09: 00007ffc77844000
  R10: 0000000000000000 R11: 0000000000000001 R12: 00000000007e8000
  R13: 00000000007e8000 R14: 00000000007e7000 R15: ffff8881020fe000
  FS:  0000000000604298(0000) GS:ffff88901f700000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000603fe0 CR3: 000000015ba9a005 CR4: 0000000000770ee0
  PKRU: 55555554
  Call Trace:
   <TASK>
   do_syscall_64+0x2b/0x50
   entry_SYSCALL_64_after_hwframe+0x46/0xb0
  RIP: 0033:0x400678
  Code: 10 4c 8d 41 08 4c 89 44 24 10 4c 8b 01 8b 4c 24 08 83 f9 2f 77 0a 4c 8d 4c 24 20 4c 01 c9 eb 05 48 8b>
  RSP: 002b:00007ffc77863890 EFLAGS: 00000212 ORIG_RAX: 000000000000000c
  RAX: ffffffffffffffda RBX: 000000000040031b RCX: 0000000000400678
  RDX: 00000000004006a1 RSI: 00000000007e6000 RDI: 00000000007e7000
  RBP: 00007ffc77863900 R08: 0000000000000000 R09: 00000000007e6000
  R10: 00007ffc77863930 R11: 0000000000000212 R12: 00007ffc77863978
  R13: 00007ffc77863988 R14: 0000000000000000 R15: 0000000000000000
   </TASK>

Instead, just return the old brk value if the original mapping has been
removed.

[akpm@linux-foundation.org: fix changelog, per Liam]
Link: https://lkml.kernel.org/r/20221202162724.2009-1-Jason@zx2c4.com
Fixes: 2e7ce7d354f2 ("mm/mmap: change do_brk_flags() to expand existing VMA and add do_brk_munmap()")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomailmap: update Matti Vaittinen's email address
Matti Vaittinen [Thu, 1 Dec 2022 16:45:40 +0000 (18:45 +0200)]
mailmap: update Matti Vaittinen's email address

The email backend used by ROHM keeps labeling patches as spam.  This can
result in missing the patches.

Switch my mail address from a company mail to a personal one.

Link: https://lkml.kernel.org/r/8f4498b66fedcbded37b3b87e0c516e659f8f583.1669912977.git.mazziesaccount@gmail.com
Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com>
Suggested-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Cc: Anup Patel <anup@brainfault.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Atish Patra <atishp@atishpatra.org>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Ben Widawsky <bwidawsk@kernel.org>
Cc: Bjorn Andersson <andersson@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Kirill Tkhai <tkhai@ya.ru>
Cc: Qais Yousef <qyousef@layalina.io>
Cc: Vasily Averin <vasily.averin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agoMerge tag 'media/v6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Fri, 9 Dec 2022 18:45:51 +0000 (10:45 -0800)]
Merge tag 'media/v6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fix from Mauro Carvalho Chehab:
 "A v4l-core fix related to validating DV timings related to video
  blanking values"

* tag 'media/v6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: v4l2-dv-timings.c: fix too strict blanking sanity checks

2 years agoMerge tag 'soc-fixes-6.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Fri, 9 Dec 2022 18:32:40 +0000 (10:32 -0800)]
Merge tag 'soc-fixes-6.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fix from Arnd Bergmann:
 "One more last minute revert for a boot regression that was found on
  the popular colibri-imx7"

* tag 'soc-fixes-6.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  Revert "ARM: dts: imx7: Fix NAND controller size-cells"

2 years agoMerge tag 'drm-fixes-2022-12-09' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 9 Dec 2022 00:58:31 +0000 (16:58 -0800)]
Merge tag 'drm-fixes-2022-12-09' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Last set of fixes for final, scattered bunch of fixes, two amdgpu, one
  vmwgfx, and some misc others.

  amdgpu:
   - S0ix fix
   - DCN 3.2 array out of bounds fix

  shmem:
   - Fixes to shmem-helper error paths

  bridge:
   - Fix polarity bug in bridge/ti-sn65dsi86

  dw-hdmi:
   - Prefer 8-bit RGB fallback before any YUV mode in dw-hdmi, since
     some panels lie about YUV support

  vmwgfx:
   - Stop using screen objects when SEV is active"

* tag 'drm-fixes-2022-12-09' of git://anongit.freedesktop.org/drm/drm:
  drm/amd/display: fix array index out of bound error in DCN32 DML
  drm/amdgpu/sdma_v4_0: turn off SDMA ring buffer in the s2idle suspend
  drm/vmwgfx: Don't use screen objects when SEV is active
  drm/shmem-helper: Avoid vm_open error paths
  drm/shmem-helper: Remove errant put in error path
  drm: bridge: dw_hdmi: fix preference of RGB modes over YUV420
  drm/bridge: ti-sn65dsi86: Fix output polarity setting bug
  drm/vmwgfx: Fix race issue calling pin_user_pages

2 years agoMerge tag 'drm-misc-fixes-2022-12-08' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Fri, 9 Dec 2022 00:11:05 +0000 (10:11 +1000)]
Merge tag 'drm-misc-fixes-2022-12-08' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

drm-misc-fixes for v6.1 final?:
- Fix polarity bug in bridge/ti-sn65dsi86.
- Prefer 8-bit RGB fallback before any YUV mode in dw-hdmi, since some
  panels lie about YUV support.
- Fixes to shmem-helper error paths.
- Small vmwgfx to stop using screen objects when SEV is active.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8110f02d-d155-926e-8674-c88b806c3a3a@linux.intel.com
2 years agoMerge tag 'amd-drm-fixes-6.1-2022-12-07' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Fri, 9 Dec 2022 00:09:57 +0000 (10:09 +1000)]
Merge tag 'amd-drm-fixes-6.1-2022-12-07' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.1-2022-12-07:

amdgpu:
- S0ix fix
- DCN 3.2 array out of bounds fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221207222751.9558-1-alexander.deucher@amd.com
2 years agoMerge tag 'block-6.1-2022-12-08' of git://git.kernel.dk/linux
Linus Torvalds [Thu, 8 Dec 2022 23:53:39 +0000 (15:53 -0800)]
Merge tag 'block-6.1-2022-12-08' of git://git.kernel.dk/linux

Pull block fix from Jens Axboe:
 "A small fix for initializing the NVMe quirks before initializing the
  subsystem"

* tag 'block-6.1-2022-12-08' of git://git.kernel.dk/linux:
  nvme initialize core quirks before calling nvme_init_subsystem

2 years agoMerge tag 'io_uring-6.1-2022-12-08' of git://git.kernel.dk/linux
Linus Torvalds [Thu, 8 Dec 2022 23:44:09 +0000 (15:44 -0800)]
Merge tag 'io_uring-6.1-2022-12-08' of git://git.kernel.dk/linux

Pull io_uring fix from Jens Axboe:
 "A single small fix for an issue related to ordering between
  cancelation and current->io_uring teardown"

* tag 'io_uring-6.1-2022-12-08' of git://git.kernel.dk/linux:
  io_uring: Fix a null-ptr-deref in io_tctx_exit_cb()

2 years agoMerge tag 'net-6.1-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 8 Dec 2022 23:32:13 +0000 (15:32 -0800)]
Merge tag 'net-6.1-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth, can and netfilter.

  Current release - new code bugs:

   - bonding: ipv6: correct address used in Neighbour Advertisement
     parsing (src vs dst typo)

   - fec: properly scope IRQ coalesce setup during link up to supported
     chips only

  Previous releases - regressions:

   - Bluetooth fixes for fake CSR clones (knockoffs):
       - re-add ERR_DATA_REPORTING quirk
       - fix crash when device is replugged

   - Bluetooth:
       - silence a user-triggerable dmesg error message
       - L2CAP: fix u8 overflow, oob access
       - correct vendor codec definition
       - fix support for Read Local Supported Codecs V2

   - ti: am65-cpsw: fix RGMII configuration at SPEED_10

   - mana: fix race on per-CQ variable NAPI work_done

  Previous releases - always broken:

   - af_unix: diag: fetch user_ns from in_skb in unix_diag_get_exact(),
     avoid null-deref

   - af_can: fix NULL pointer dereference in can_rcv_filter

   - can: slcan: fix UAF with a freed work

   - can: can327: flush TX_work on ldisc .close()

   - macsec: add missing attribute validation for offload

   - ipv6: avoid use-after-free in ip6_fragment()

   - nft_set_pipapo: actually validate intervals in fields after the
     first one

   - mvneta: prevent oob access in mvneta_config_rss()

   - ipv4: fix incorrect route flushing when table ID 0 is used, or when
     source address is deleted

   - phy: mxl-gpy: add workaround for IRQ bug on GPY215B and GPY215C"

* tag 'net-6.1-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits)
  net: dsa: sja1105: avoid out of bounds access in sja1105_init_l2_policing()
  s390/qeth: fix use-after-free in hsci
  macsec: add missing attribute validation for offload
  net: mvneta: Fix an out of bounds check
  net: thunderbolt: fix memory leak in tbnet_open()
  ipv6: avoid use-after-free in ip6_fragment()
  net: plip: don't call kfree_skb/dev_kfree_skb() under spin_lock_irq()
  net: phy: mxl-gpy: add MDINT workaround
  net: dsa: mv88e6xxx: accept phy-mode = "internal" for internal PHY ports
  xen/netback: don't call kfree_skb() under spin_lock_irqsave()
  dpaa2-switch: Fix memory leak in dpaa2_switch_acl_entry_add() and dpaa2_switch_acl_entry_remove()
  ethernet: aeroflex: fix potential skb leak in greth_init_rings()
  tipc: call tipc_lxc_xmit without holding node_read_lock
  can: esd_usb: Allow REC and TEC to return to zero
  can: can327: flush TX_work on ldisc .close()
  can: slcan: fix freed work crash
  can: af_can: fix NULL pointer dereference in can_rcv_filter
  net: dsa: sja1105: fix memory leak in sja1105_setup_devlink_regions()
  ipv4: Fix incorrect route flushing when table ID 0 is used
  ipv4: Fix incorrect route flushing when source address is deleted
  ...

2 years agoMerge tag 'for-linus-2022120801' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 8 Dec 2022 20:37:42 +0000 (12:37 -0800)]
Merge tag 'for-linus-2022120801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:
 "A regression fix for handling Logitech HID++ devices and memory
  corruption fixes:

   - regression fix (revert) for catch-all handling of Logitech HID++
     Bluetooth devices; there are devices that turn out not to work with
     this, and the root cause is yet to be properly understood. So we
     are dropping it for now, and it will be revisited for 6.2 or 6.3
     (Benjamin Tissoires)

   - memory corruption fix in HID core (ZhangPeng)

   - memory corruption fix in hid-lg4ff (Anastasia Belova)

   - Kconfig fix for I2C_HID (Benjamin Tissoires)

   - a few device-id specific quirks that piggy-back on top of the
     important fixes above"

* tag 'for-linus-2022120801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  Revert "HID: logitech-hidpp: Enable HID++ for all the Logitech Bluetooth devices"
  Revert "HID: logitech-hidpp: Remove special-casing of Bluetooth devices"
  HID: usbhid: Add ALWAYS_POLL quirk for some mice
  HID: core: fix shift-out-of-bounds in hid_report_raw_event
  HID: uclogic: Add HID_QUIRK_HIDINPUT_FORCE quirk
  HID: fix I2C_HID not selected when I2C_HID_OF_ELAN is
  HID: hid-lg4ff: Add check for empty lbuf
  HID: ite: Enable QUIRK_TOUCHPAD_ON_OFF_REPORT on Acer Aspire Switch V 10
  HID: uclogic: Fix frame templates for big endian architectures

2 years agoMerge tag 'soc-fixes-6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Thu, 8 Dec 2022 19:22:27 +0000 (11:22 -0800)]
Merge tag 'soc-fixes-6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fix from Arnd Bergmann:
 "One last build fix came in, addressing a link failure when building
  without CONFIG_OUTER_CACHE"

* tag 'soc-fixes-6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: at91: fix build for SAMA5D3 w/o L2 cache

2 years agoRevert "HID: logitech-hidpp: Enable HID++ for all the Logitech Bluetooth devices"
Benjamin Tissoires [Wed, 7 Dec 2022 14:24:33 +0000 (15:24 +0100)]
Revert "HID: logitech-hidpp: Enable HID++ for all the Logitech Bluetooth devices"

This reverts commit 532223c8ac57605a10e46dc0ab23dcf01c9acb43.

As reported in [0], hid-logitech-hidpp now binds on all bluetooth mice,
but there are corner cases where hid-logitech-hidpp just gives up on
the mouse. This leads the end user with a dead mouse.

Given that we are at -rc8, we are definitively too late to find a proper
fix. We already identified 2 issues less than 24 hours after the bug
report. One in that ->match() was never designed to be used anywhere else
than in hid-generic, and the other that hid-logitech-hidpp has corner
cases where it gives up on devices it is not supposed to.

So we have no choice but postpone this patch to the next kernel release.

[0] https://lore.kernel.org/linux-input/CAJZ5v0g-_o4AqMgNwihCb0jrwrcJZfRrX=jv8aH54WNKO7QB8A@mail.gmail.com/

Reported-by: Rafael J . Wysocki <rjw@rjwysocki.net>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2 years agoRevert "HID: logitech-hidpp: Remove special-casing of Bluetooth devices"
Benjamin Tissoires [Wed, 7 Dec 2022 14:24:32 +0000 (15:24 +0100)]
Revert "HID: logitech-hidpp: Remove special-casing of Bluetooth devices"

This reverts commit 8544c812e43ab7bdf40458411b83987b8cba924d.

We need to revert commit 532223c8ac57 ("HID: logitech-hidpp: Enable HID++
for all the Logitech Bluetooth devices") because that commit might make
hid-logitech-hidpp bind on mice that are not well enough supported by
hid-logitech-hidpp, and the end result is that the probe of those mice
is now returning -ENODEV, leaving the end user with a dead mouse.

Given that commit 8544c812e43a ("HID: logitech-hidpp: Remove special-casing
of Bluetooth devices") is a direct dependency of 532223c8ac57, revert it
too.

Note that this also adapt according to commit 908d325e1665 ("HID:
logitech-hidpp: Detect hi-res scrolling support") to re-add support of
the devices that were removed from that commit too.

I have locally an MX Master and I tested this device with that revert,
ensuring we still have high-res scrolling.

Reported-by: Rafael J . Wysocki <rjw@rjwysocki.net>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2 years agoMerge tag 'loongarch-fixes-6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 8 Dec 2022 19:16:15 +0000 (11:16 -0800)]
Merge tag 'loongarch-fixes-6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
 "Export smp_send_reschedule() for modules use, fix a huge page entry
  update issue, and add documents for booting description"

* tag 'loongarch-fixes-6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  docs/zh_CN: Add LoongArch booting description's translation
  docs/LoongArch: Add booting description
  LoongArch: mm: Fix huge page entry update for virtual machine
  LoongArch: Export symbol for function smp_send_reschedule()

2 years agoMerge tag 'for-linus-xsa-6.1-rc9b-tag' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 8 Dec 2022 19:11:06 +0000 (11:11 -0800)]
Merge tag 'for-linus-xsa-6.1-rc9b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fix from Juergen Gross:
 "A single fix for the recent security issue XSA-423"

* tag 'for-linus-xsa-6.1-rc9b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/netback: fix build warning

2 years agoMerge tag 'gpio-fixes-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 8 Dec 2022 19:00:42 +0000 (11:00 -0800)]
Merge tag 'gpio-fixes-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix a memory leak in gpiolib core

 - fix reference leaks in gpio-amd8111 and gpio-rockchip

* tag 'gpio-fixes-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio/rockchip: fix refcount leak in rockchip_gpiolib_register()
  gpio: amd8111: Fix PCI device reference count leak
  gpiolib: fix memory leak in gpiochip_setup_dev()

2 years agoMerge tag 'ata-6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal...
Linus Torvalds [Thu, 8 Dec 2022 18:46:52 +0000 (10:46 -0800)]
Merge tag 'ata-6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata

Pull ATA fix from Damien Le Moal:

 - Avoid a NULL pointer dereference in the libahci platform code that
   can happen on initialization when a device tree does not specify
   names for the adapter clocks (from Anders)

* tag 'ata-6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata: libahci_platform: ahci_platform_find_clk: oops, NULL pointer

2 years agomemcg: Fix possible use-after-free in memcg_write_event_control()
Tejun Heo [Thu, 8 Dec 2022 02:53:15 +0000 (16:53 -1000)]
memcg: Fix possible use-after-free in memcg_write_event_control()

memcg_write_event_control() accesses the dentry->d_name of the specified
control fd to route the write call.  As a cgroup interface file can't be
renamed, it's safe to access d_name as long as the specified file is a
regular cgroup file.  Also, as these cgroup interface files can't be
removed before the directory, it's safe to access the parent too.

Prior to 347c4a874710 ("memcg: remove cgroup_event->cft"), there was a
call to __file_cft() which verified that the specified file is a regular
cgroupfs file before further accesses.  The cftype pointer returned from
__file_cft() was no longer necessary and the commit inadvertently
dropped the file type check with it allowing any file to slip through.
With the invarients broken, the d_name and parent accesses can now race
against renames and removals of arbitrary files and cause
use-after-free's.

Fix the bug by resurrecting the file type check in __file_cft().  Now
that cgroupfs is implemented through kernfs, checking the file
operations needs to go through a layer of indirection.  Instead, let's
check the superblock and dentry type.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 347c4a874710 ("memcg: remove cgroup_event->cft")
Cc: stable@kernel.org # v3.14+
Reported-by: Jann Horn <jannh@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agonet: dsa: sja1105: avoid out of bounds access in sja1105_init_l2_policing()
Radu Nicolae Pirea (OSS) [Wed, 7 Dec 2022 13:23:47 +0000 (15:23 +0200)]
net: dsa: sja1105: avoid out of bounds access in sja1105_init_l2_policing()

The SJA1105 family has 45 L2 policing table entries
(SJA1105_MAX_L2_POLICING_COUNT) and SJA1110 has 110
(SJA1110_MAX_L2_POLICING_COUNT). Keeping the table structure but
accounting for the difference in port count (5 in SJA1105 vs 10 in
SJA1110) does not fully explain the difference. Rather, the SJA1110 also
has L2 ingress policers for multicast traffic. If a packet is classified
as multicast, it will be processed by the policer index 99 + SRCPORT.

The sja1105_init_l2_policing() function initializes all L2 policers such
that they don't interfere with normal packet reception by default. To have
a common code between SJA1105 and SJA1110, the index of the multicast
policer for the port is calculated because it's an index that is out of
bounds for SJA1105 but in bounds for SJA1110, and a bounds check is
performed.

The code fails to do the proper thing when determining what to do with the
multicast policer of port 0 on SJA1105 (ds->num_ports = 5). The "mcast"
index will be equal to 45, which is also equal to
table->ops->max_entry_count (SJA1105_MAX_L2_POLICING_COUNT). So it passes
through the check. But at the same time, SJA1105 doesn't have multicast
policers. So the code programs the SHARINDX field of an out-of-bounds
element in the L2 Policing table of the static config.

The comparison between index 45 and 45 entries should have determined the
code to not access this policer index on SJA1105, since its memory wasn't
even allocated.

With enough bad luck, the out-of-bounds write could even overwrite other
valid kernel data, but in this case, the issue was detected using KASAN.

Kernel log:

sja1105 spi5.0: Probed switch chip: SJA1105Q
==================================================================
BUG: KASAN: slab-out-of-bounds in sja1105_setup+0x1cbc/0x2340
Write of size 8 at addr ffffff880bd57708 by task kworker/u8:0/8
...
Workqueue: events_unbound deferred_probe_work_func
Call trace:
...
sja1105_setup+0x1cbc/0x2340
dsa_register_switch+0x1284/0x18d0
sja1105_probe+0x748/0x840
...
Allocated by task 8:
...
sja1105_setup+0x1bcc/0x2340
dsa_register_switch+0x1284/0x18d0
sja1105_probe+0x748/0x840
...

Fixes: 38fbe91f2287 ("net: dsa: sja1105: configure the multicast policers, if present")
CC: stable@vger.kernel.org # 5.15+
Signed-off-by: Radu Nicolae Pirea (OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20221207132347.38698-1-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agos390/qeth: fix use-after-free in hsci
Alexandra Winter [Wed, 7 Dec 2022 10:53:04 +0000 (11:53 +0100)]
s390/qeth: fix use-after-free in hsci

KASAN found that addr was dereferenced after br2dev_event_work was freed.

==================================================================
BUG: KASAN: use-after-free in qeth_l2_br2dev_worker+0x5ba/0x6b0
Read of size 1 at addr 00000000fdcea440 by task kworker/u760:4/540
CPU: 17 PID: 540 Comm: kworker/u760:4 Tainted: G            E      6.1.0-20221128.rc7.git1.5aa3bed4ce83.300.fc36.s390x+kasan #1
Hardware name: IBM 8561 T01 703 (LPAR)
Workqueue: 0.0.8000_event qeth_l2_br2dev_worker
Call Trace:
 [<000000016944d4ce>] dump_stack_lvl+0xc6/0xf8
 [<000000016942cd9c>] print_address_description.constprop.0+0x34/0x2a0
 [<000000016942d118>] print_report+0x110/0x1f8
 [<0000000167a7bd04>] kasan_report+0xfc/0x128
 [<000000016938d79a>] qeth_l2_br2dev_worker+0x5ba/0x6b0
 [<00000001673edd1e>] process_one_work+0x76e/0x1128
 [<00000001673ee85c>] worker_thread+0x184/0x1098
 [<000000016740718a>] kthread+0x26a/0x310
 [<00000001672c606a>] __ret_from_fork+0x8a/0xe8
 [<00000001694711da>] ret_from_fork+0xa/0x40
Allocated by task 108338:
 kasan_save_stack+0x40/0x68
 kasan_set_track+0x36/0x48
 __kasan_kmalloc+0xa0/0xc0
 qeth_l2_switchdev_event+0x25a/0x738
 atomic_notifier_call_chain+0x9c/0xf8
 br_switchdev_fdb_notify+0xf4/0x110
 fdb_notify+0x122/0x180
 fdb_add_entry.constprop.0.isra.0+0x312/0x558
 br_fdb_add+0x59e/0x858
 rtnl_fdb_add+0x58a/0x928
 rtnetlink_rcv_msg+0x5f8/0x8d8
 netlink_rcv_skb+0x1f2/0x408
 netlink_unicast+0x570/0x790
 netlink_sendmsg+0x752/0xbe0
 sock_sendmsg+0xca/0x110
 ____sys_sendmsg+0x510/0x6a8
 ___sys_sendmsg+0x12a/0x180
 __sys_sendmsg+0xe6/0x168
 __do_sys_socketcall+0x3c8/0x468
 do_syscall+0x22c/0x328
 __do_syscall+0x94/0xf0
 system_call+0x82/0xb0
Freed by task 540:
 kasan_save_stack+0x40/0x68
 kasan_set_track+0x36/0x48
 kasan_save_free_info+0x4c/0x68
 ____kasan_slab_free+0x14e/0x1a8
 __kasan_slab_free+0x24/0x30
 __kmem_cache_free+0x168/0x338
 qeth_l2_br2dev_worker+0x154/0x6b0
 process_one_work+0x76e/0x1128
 worker_thread+0x184/0x1098
 kthread+0x26a/0x310
 __ret_from_fork+0x8a/0xe8
 ret_from_fork+0xa/0x40
Last potentially related work creation:
 kasan_save_stack+0x40/0x68
 __kasan_record_aux_stack+0xbe/0xd0
 insert_work+0x56/0x2e8
 __queue_work+0x4ce/0xd10
 queue_work_on+0xf4/0x100
 qeth_l2_switchdev_event+0x520/0x738
 atomic_notifier_call_chain+0x9c/0xf8
 br_switchdev_fdb_notify+0xf4/0x110
 fdb_notify+0x122/0x180
 fdb_add_entry.constprop.0.isra.0+0x312/0x558
 br_fdb_add+0x59e/0x858
 rtnl_fdb_add+0x58a/0x928
 rtnetlink_rcv_msg+0x5f8/0x8d8
 netlink_rcv_skb+0x1f2/0x408
 netlink_unicast+0x570/0x790
 netlink_sendmsg+0x752/0xbe0
 sock_sendmsg+0xca/0x110
 ____sys_sendmsg+0x510/0x6a8
 ___sys_sendmsg+0x12a/0x180
 __sys_sendmsg+0xe6/0x168
 __do_sys_socketcall+0x3c8/0x468
 do_syscall+0x22c/0x328
 __do_syscall+0x94/0xf0
 system_call+0x82/0xb0
Second to last potentially related work creation:
 kasan_save_stack+0x40/0x68
 __kasan_record_aux_stack+0xbe/0xd0
 kvfree_call_rcu+0xb2/0x760
 kernfs_unlink_open_file+0x348/0x430
 kernfs_fop_release+0xc2/0x320
 __fput+0x1ae/0x768
 task_work_run+0x1bc/0x298
 exit_to_user_mode_prepare+0x1a0/0x1a8
 __do_syscall+0x94/0xf0
 system_call+0x82/0xb0
The buggy address belongs to the object at 00000000fdcea400
 which belongs to the cache kmalloc-96 of size 96
The buggy address is located 64 bytes inside of
 96-byte region [00000000fdcea40000000000fdcea460)
The buggy address belongs to the physical page:
page:000000005a9c26e8 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xfdcea
flags: 0x3ffff00000000200(slab|node=0|zone=1|lastcpupid=0x1ffff)
raw: 3ffff00000000200 0000000000000000 0000000100000122 000000008008cc00
raw: 0000000000000000 0020004100000000 ffffffff00000001 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
 00000000fdcea300: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
 00000000fdcea380: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
>00000000fdcea400: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
                                           ^
 00000000fdcea480: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
 00000000fdcea500: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
==================================================================

Fixes: f7936b7b2663 ("s390/qeth: Update MACs of LEARNING_SYNC device")
Reported-by: Thorsten Winkler <twinkler@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Thorsten Winkler <twinkler@linux.ibm.com>
Link: https://lore.kernel.org/r/20221207105304.20494-1-wintera@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomacsec: add missing attribute validation for offload
Emeel Hakim [Wed, 7 Dec 2022 10:16:18 +0000 (12:16 +0200)]
macsec: add missing attribute validation for offload

Add missing attribute validation for IFLA_MACSEC_OFFLOAD
to the netlink policy.

Fixes: 791bb3fcafce ("net: macsec: add support for specifying offload upon link creation")
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/20221207101618.989-1-ehakim@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: mvneta: Fix an out of bounds check
Dan Carpenter [Wed, 7 Dec 2022 07:06:31 +0000 (10:06 +0300)]
net: mvneta: Fix an out of bounds check

In an earlier commit, I added a bounds check to prevent an out of bounds
read and a WARN().  On further discussion and consideration that check
was probably too aggressive.  Instead of returning -EINVAL, a better fix
would be to just prevent the out of bounds read but continue the process.

Background: The value of "pp->rxq_def" is a number between 0-7 by default,
or even higher depending on the value of "rxq_number", which is a module
parameter. If the value is more than the number of available CPUs then
it will trigger the WARN() in cpu_max_bits_warn().

Fixes: e8b4fc13900b ("net: mvneta: Prevent out of bounds read in mvneta_config_rss()")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/Y5A7d1E5ccwHTYPf@kadam
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: thunderbolt: fix memory leak in tbnet_open()
Zhengchao Shao [Wed, 7 Dec 2022 01:50:01 +0000 (09:50 +0800)]
net: thunderbolt: fix memory leak in tbnet_open()

When tb_ring_alloc_rx() failed in tbnet_open(), ida that allocated in
tb_xdomain_alloc_out_hopid() is not released. Add
tb_xdomain_release_out_hopid() to the error path to release ida.

Fixes: 180b0689425c ("thunderbolt: Allow multiple DMA tunnels over a single XDomain connection")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20221207015001.1755826-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoRevert "ARM: dts: imx7: Fix NAND controller size-cells"
Francesco Dolcini [Mon, 5 Dec 2022 15:23:27 +0000 (16:23 +0100)]
Revert "ARM: dts: imx7: Fix NAND controller size-cells"

This reverts commit 753395ea1e45c724150070b5785900b6a44bd5fb.

It introduced a boot regression on colibri-imx7, and potentially any
other i.MX7 boards with MTD partition list generated into the fdt by
U-Boot.

While the commit we are reverting here is not obviously wrong, it fixes
only a dt binding checker warning that is non-functional, while it
introduces a boot regression and there is no obvious fix ready.

Fixes: 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Marek Vasut <marex@denx.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/Y4dgBTGNWpM6SQXI@francesco-nb.int.toradex.com/
Link: https://lore.kernel.org/all/20221205144917.6514168a@xps-13/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2 years agodocs/zh_CN: Add LoongArch booting description's translation
Yanteng Si [Thu, 8 Dec 2022 06:59:15 +0000 (14:59 +0800)]
docs/zh_CN: Add LoongArch booting description's translation

Translate ../loongarch/booting.rst into Chinese.

Suggested-by: Xiaotian Wu <wuxiaotian@loongson.cn>
Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agodocs/LoongArch: Add booting description
Yanteng Si [Thu, 8 Dec 2022 06:59:15 +0000 (14:59 +0800)]
docs/LoongArch: Add booting description

1, Describe the information passed from BootLoader to kernel.
2, Describe the meaning and values of the kernel image header field.

Suggested-by: Xiaotian Wu <wuxiaotian@loongson.cn>
Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agoLoongArch: mm: Fix huge page entry update for virtual machine
Huacai Chen [Thu, 8 Dec 2022 06:59:15 +0000 (14:59 +0800)]
LoongArch: mm: Fix huge page entry update for virtual machine

In virtual machine (guest mode), the tlbwr instruction can not write the
last entry of MTLB, so we need to make it non-present by invtlb and then
write it by tlbfill. This also simplify the whole logic.

Signed-off-by: Rui Wang <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>