www.infradead.org Git - users/jedix/linux-maple.git/log
6 weeks ago rename mt_wr_split_data() to split_data_by_state()
Liam R. Howlett [Tue, 6 May 2025 16:31:45 +0000 (12:31 -0400)]
rename mt_wr_split_data() to split_data_by_state()

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago move left_store to split_data struct
Liam R. Howlett [Tue, 6 May 2025 16:23:21 +0000 (12:23 -0400)]
move left_store to split_data struct

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago mas_wr_rebalance reorg
Liam R. Howlett [Tue, 6 May 2025 16:11:17 +0000 (12:11 -0400)]
mas_wr_rebalance reorg

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago mas_wr_rebalance_reduce() rename
Liam R. Howlett [Tue, 6 May 2025 15:51:17 +0000 (11:51 -0400)]
mas_wr_rebalance_reduce() rename

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago mas_wr_rebalance_two() comment
Liam R. Howlett [Tue, 6 May 2025 15:50:39 +0000 (11:50 -0400)]
mas_wr_rebalance_two() comment

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago rename mas_wr_rebalance_calc()
Liam R. Howlett [Tue, 6 May 2025 15:50:13 +0000 (11:50 -0400)]
rename mas_wr_rebalance_calc()

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago move mas_wr_rebalance_calc()
Liam R. Howlett [Tue, 6 May 2025 15:48:41 +0000 (11:48 -0400)]
move mas_wr_rebalance_calc()

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago more cleanup
Liam R. Howlett [Tue, 6 May 2025 15:47:46 +0000 (11:47 -0400)]
more cleanup

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago mas_wr_rebalance_two rename
Liam R. Howlett [Tue, 6 May 2025 15:35:58 +0000 (11:35 -0400)]
mas_wr_rebalance_two rename

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago ditch debug and some cleanup
Liam R. Howlett [Tue, 6 May 2025 15:34:24 +0000 (11:34 -0400)]
ditch debug and some cleanup

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago off by one
Liam R. Howlett [Tue, 6 May 2025 15:13:56 +0000 (11:13 -0400)]
off by one

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago still issues
Liam R. Howlett [Tue, 6 May 2025 14:45:41 +0000 (10:45 -0400)]
still issues

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago enable all testing again
Liam R. Howlett [Fri, 2 May 2025 20:15:48 +0000 (16:15 -0400)]
enable all testing again

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago rebalance_reduce use split_state_setup
Liam R. Howlett [Fri, 2 May 2025 20:15:33 +0000 (16:15 -0400)]
rebalance_reduce use split_state_setup

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago split_state_setup() working better
Liam R. Howlett [Fri, 2 May 2025 20:01:06 +0000 (16:01 -0400)]
split_state_setup() working better

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago fix src offset
Liam R. Howlett [Fri, 2 May 2025 19:30:02 +0000 (15:30 -0400)]
fix src offset

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago ditch mni_insert_part, dst_max_off, unfinished, and use src->offset
Liam R. Howlett [Fri, 2 May 2025 19:26:06 +0000 (15:26 -0400)]
ditch mni_insert_part, dst_max_off, unfinished, and use src->offset

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago add state_setup() function
Liam R. Howlett [Fri, 2 May 2025 15:18:20 +0000 (11:18 -0400)]
add state_setup() function

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago rebalance left to right functioning well
Liam R. Howlett [Thu, 1 May 2025 18:26:50 +0000 (14:26 -0400)]
rebalance left to right functioning well

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
6 weeks ago mas_wr_rebalance: rebalancing, not reducing
Liam R. Howlett [Wed, 30 Apr 2025 20:12:08 +0000 (16:12 -0400)]
mas_wr_rebalance: rebalancing, not reducing

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
7 weeks ago rebalance wip
Liam R. Howlett [Fri, 25 Apr 2025 15:50:07 +0000 (11:50 -0400)]
rebalance wip

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago code cleanup of try_rebalance
Liam R. Howlett [Wed, 16 Apr 2025 15:54:58 +0000 (11:54 -0400)]
code cleanup of try_rebalance

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago Introduce helpers for can_rebalance directions
Liam R. Howlett [Wed, 16 Apr 2025 15:39:06 +0000 (11:39 -0400)]
Introduce helpers for can_rebalance directions

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago outline
Liam R. Howlett [Wed, 16 Apr 2025 14:53:45 +0000 (10:53 -0400)]
outline

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago compiles, ship it.
Liam R. Howlett [Wed, 16 Apr 2025 01:02:35 +0000 (21:02 -0400)]
compiles, ship it.

The copy and pasting of code does not work out without an overall plan.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago broken for now
Liam R. Howlett [Mon, 14 Apr 2025 19:39:23 +0000 (15:39 -0400)]
broken for now

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago rename alloc to is_alloc
Liam R. Howlett [Thu, 10 Apr 2025 19:12:52 +0000 (15:12 -0400)]
rename alloc to is_alloc

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago ma_part -> part and comments
Liam R. Howlett [Thu, 10 Apr 2025 18:43:55 +0000 (14:43 -0400)]
ma_part -> part and comments

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago finished conversion
Liam R. Howlett [Thu, 10 Apr 2025 18:34:08 +0000 (14:34 -0400)]
finished conversion

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago try_rebalance converted
Liam R. Howlett [Thu, 10 Apr 2025 18:19:39 +0000 (14:19 -0400)]
try_rebalance converted

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago converged converted
Liam R. Howlett [Thu, 10 Apr 2025 18:13:16 +0000 (14:13 -0400)]
converged converted

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago use split_data struct
Liam R. Howlett [Thu, 10 Apr 2025 15:58:47 +0000 (11:58 -0400)]
use split_data struct

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago wip, cleanup
Liam R. Howlett [Tue, 8 Apr 2025 17:50:28 +0000 (13:50 -0400)]
wip, cleanup

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago mas_wr_split cleanup and some comments
Liam R. Howlett [Tue, 8 Apr 2025 15:07:39 +0000 (11:07 -0400)]
mas_wr_split cleanup and some comments

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago Revert "remove dead code"
Liam R. Howlett [Tue, 8 Apr 2025 02:02:54 +0000 (22:02 -0400)]
Revert "remove dead code"

This reverts commit b3062ee59ae87e44f81fd309b79c8fbf5660f2db.

2 months ago more printk and dead code
Liam R. Howlett [Tue, 8 Apr 2025 02:00:03 +0000 (22:00 -0400)]
more printk and dead code

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago remove dead code
Liam R. Howlett [Tue, 8 Apr 2025 01:53:20 +0000 (21:53 -0400)]
remove dead code

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago remove printk
Liam R. Howlett [Tue, 8 Apr 2025 01:53:01 +0000 (21:53 -0400)]
remove printk

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago working new code
Liam R. Howlett [Tue, 8 Apr 2025 01:49:42 +0000 (21:49 -0400)]
working new code

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
2 months ago debug
Liam R. Howlett [Wed, 2 Apr 2025 13:36:49 +0000 (09:36 -0400)]
debug

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
3 months ago mas_wr_try_rebalance: Drop dead code
Liam R. Howlett [Fri, 28 Feb 2025 22:28:38 +0000 (17:28 -0500)]
mas_wr_try_rebalance: Drop dead code

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
3 months ago mas_split: Drop dead code
Liam R. Howlett [Fri, 28 Feb 2025 22:27:39 +0000 (17:27 -0500)]
mas_split: Drop dead code

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
3 months ago maple_tree: use mt_wr_split_data() in split and try_rebalance
Liam R. Howlett [Fri, 28 Feb 2025 22:27:01 +0000 (17:27 -0500)]
maple_tree: use mt_wr_split_data() in split and try_rebalance

Signed-off-by: Liam R. Howlett <howlett@gmail.com>
3 months ago fix rewind for ma_part
Liam R. Howlett [Thu, 27 Feb 2025 18:57:47 +0000 (13:57 -0500)]
fix rewind for ma_part

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
3 months ago mas_wr_split using mt_wr_split_data
Liam R. Howlett [Thu, 27 Feb 2025 03:02:30 +0000 (22:02 -0500)]
mas_wr_split using mt_wr_split_data

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
3 months ago mt_wr_split_data()
Liam R. Howlett [Thu, 27 Feb 2025 02:20:19 +0000 (21:20 -0500)]
mt_wr_split_data()

Signed-off-by: Liam R. Howlett <howlett@gmail.com>
3 months ago progress, I guess: Split and rebalance testing
Liam R. Howlett [Wed, 19 Feb 2025 02:45:42 +0000 (21:45 -0500)]
progress, I guess: Split and rebalance testing

Signed-off-by: Liam R. Howlett <howlett@gmail.com>
4 months ago working on split rebalance now
Liam R. Howlett [Fri, 14 Feb 2025 20:42:56 +0000 (15:42 -0500)]
working on split rebalance now

Signed-off-by: Liam R. Howlett <howlett@gmail.com>
4 months ago wip, moving to struct splitting
Liam R. Howlett [Tue, 11 Feb 2025 17:18:30 +0000 (12:18 -0500)]
wip, moving to struct splitting

Signed-off-by: Liam R. Howlett <howlett@gmail.com>
4 months ago rebalance wip
Liam R. Howlett [Tue, 4 Feb 2025 18:05:35 +0000 (13:05 -0500)]
rebalance wip

Signed-off-by: Liam R. Howlett <howlett@gmail.com>
4 months ago mas_wr_rebalance() - it compiles (new_split_2025)
Liam R. Howlett [Mon, 3 Feb 2025 21:04:12 +0000 (16:04 -0500)]
mas_wr_rebalance() - it compiles

Signed-off-by: Liam R. Howlett <howlett@gmail.com>
4 months ago mas_destroy_rebalance() whitespace fix
Liam R. Howlett [Thu, 30 Jan 2025 18:42:29 +0000 (13:42 -0500)]
mas_destroy_rebalance() whitespace fix

Signed-off-by: Liam R. Howlett <howlett@gmail.com>
4 months ago mas_wr_rebalance_nodes: fix line length
Liam R. Howlett [Mon, 27 Jan 2025 21:10:36 +0000 (16:10 -0500)]
mas_wr_rebalance_nodes: fix line length

Signed-off-by: Liam R. Howlett <howlett@gmail.com>
4 months ago testing/radix-tree/maple: Increase readers and reduce delay for faster machines
Liam R. Howlett [Mon, 27 Jan 2025 20:59:53 +0000 (15:59 -0500)]
testing/radix-tree/maple: Increase readers and reduce delay for faster machines

Signed-off-by: Liam R. Howlett <howlett@gmail.com>
4 months ago mas_wr_rebalance for insufficient data
Liam R. Howlett [Mon, 27 Jan 2025 20:56:40 +0000 (15:56 -0500)]
mas_wr_rebalance for insufficient data

Signed-off-by: Liam R. Howlett <howlett@gmail.com>
4 months ago drop mas_wr_rebalance to be re-added clean
Liam R. Howlett [Mon, 27 Jan 2025 20:56:11 +0000 (15:56 -0500)]
drop mas_wr_rebalance to be re-added clean

Signed-off-by: Liam R. Howlett <howlett@gmail.com>
4 months ago drop meaningless return from mas_wr_split
Liam R. Howlett [Thu, 5 Dec 2024 20:49:17 +0000 (15:49 -0500)]
drop meaningless return from mas_wr_split

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
4 months ago maple_tree: Combine mas_parent_gap() into mas_update_gap()
Liam R. Howlett [Sat, 30 Nov 2024 04:54:58 +0000 (23:54 -0500)]
maple_tree: Combine mas_parent_gap() into mas_update_gap()

mas_parent_gap() is used in one location, and much of what it needs
already exists in the calling function.  Inlining the function and
dropping the duplication simplifies the code and reduces the
instruction count.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
4 months ago maple_tree: Inline ma_max_gap() in mas_update_gap()
Liam R. Howlett [Sat, 30 Nov 2024 04:30:52 +0000 (23:30 -0500)]
maple_tree: Inline ma_max_gap() in mas_update_gap()

ma_max_gap is called from a single location and can benefit from the
setup in mas_update_gap, so inline it.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
4 months ago maple_tree: Fix ma_dead_node() comment
Liam R. Howlett [Wed, 30 Oct 2024 15:28:08 +0000 (11:28 -0400)]
maple_tree: Fix ma_dead_node() comment

Update the argument description to be accurate.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
4 months ago inline three small functions
Liam R. Howlett [Sat, 30 Nov 2024 04:02:41 +0000 (23:02 -0500)]
inline three small functions

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
4 months ago mas_wr_split() Avoid mas_wr_new_end() call
Liam R. Howlett [Sat, 30 Nov 2024 04:01:13 +0000 (23:01 -0500)]
mas_wr_split() Avoid mas_wr_new_end() call

ma_part already has its size computed from examining the same data, so
use ma_part.size instead.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
4 months ago tools/testing/radix-tree: Add maple tree fuzzer
Liam R. Howlett [Wed, 23 Aug 2023 16:13:52 +0000 (12:13 -0400)]
tools/testing/radix-tree: Add maple tree fuzzer

This is the introduction of the maple tree fuzzer into the radix-tree
test suite, based heavily on the work of Vasily Gorbik sent in March
2022.

The tester uses the LLVM fuzzer to automate testing of the maple tree by
randomly inserting, storing, deleting, and resetting the tree.  Testing
has been expanded to test both allocation and basic trees.

After building the fuzz-maple target with clang, just run the resulting
executable.  The LLVM libFuzzer supports minimizing the steps to
reproduce a crash from a crash file.

Using V=1 on the minimized crash will result in a testcase that can be
added to the lib/test_maple_tree.c test suite.  Using V=2 can help
figure out what is happening to cause the crash.

Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
4 months ago maple_tree: Use local variable
Liam R. Howlett [Sat, 30 Nov 2024 01:34:34 +0000 (20:34 -0500)]
maple_tree: Use local variable

The maple state variable has already been set up, so use it.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
4 months ago inlining
Liam R. Howlett [Sat, 30 Nov 2024 01:33:38 +0000 (20:33 -0500)]
inlining

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
4 months ago always inlined and such
Liam R. Howlett [Fri, 29 Nov 2024 23:26:03 +0000 (18:26 -0500)]
always inlined and such

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
4 months ago maple_tree: Add skipping slot support
Liam R. Howlett [Fri, 29 Nov 2024 22:15:53 +0000 (17:15 -0500)]
maple_tree: Add skipping slot support

Allow maple node state to jump entire slots when necessary.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
4 months ago optimize a bunch of junk
Liam R. Howlett [Fri, 29 Nov 2024 18:25:18 +0000 (13:25 -0500)]
optimize a bunch of junk

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
4 months ago rebalance starts now
Liam R. Howlett [Fri, 29 Nov 2024 15:31:43 +0000 (10:31 -0500)]
rebalance starts now

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
4 months ago maple_tree: Optimise mas_wr_store_entry() and mas_prealloc_calc()
Liam R. Howlett [Fri, 29 Nov 2024 16:28:33 +0000 (11:28 -0500)]
maple_tree: Optimise mas_wr_store_entry() and mas_prealloc_calc()

Rearrange the switch statements so that the more likely code paths are
checked first.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
4 months ago maple_tree: Create new mas_wr_split()
Liam R. Howlett [Mon, 4 Nov 2024 15:32:03 +0000 (10:32 -0500)]
maple_tree: Create new mas_wr_split()

Stop using the large struct big_node and use logic with two allocated
nodes.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
4 months ago foo
Andrew Morton [Sun, 26 Jan 2025 23:05:03 +0000 (15:05 -0800)]
foo

4 months ago mailmap: add an entry for Hamza Mahfooz
Hamza Mahfooz [Mon, 20 Jan 2025 20:56:59 +0000 (15:56 -0500)]
mailmap: add an entry for Hamza Mahfooz

Map my previous work email to my current one.

Link: https://lkml.kernel.org/r/20250120205659.139027-1-hamzamahfooz@linux.microsoft.com
Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Hans verkuil <hverkuil@xs4all.nl>
Cc: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago mm: gup: fix infinite loop within __get_longterm_locked
Zhaoyang Huang [Tue, 21 Jan 2025 02:01:59 +0000 (10:01 +0800)]
mm: gup: fix infinite loop within __get_longterm_locked

We can run into an infinite loop in __get_longterm_locked() when
collect_longterm_unpinnable_folios() finds only folios that are isolated
from the LRU or were never added to the LRU.  This can happen when all
folios to be pinned are never added to the LRU, for example when
vm_ops->fault allocated pages using cma_alloc() and never added them to
the LRU.

We incorrectly update the "collected" variable even if nothing was
collected.  Fix it by incrementing "collected" only when we isolated a
folio and added it to the list of folios to migrate.
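
As a hedged sketch of the corrected accounting pattern (names are
illustrative, not the exact mm/gup.c code):

  /*
   * Hypothetical sketch: count a folio as "collected" only when it was
   * actually isolated and queued for migration.
   */
  static unsigned long collect_unpinnable_sketch(struct list_head *movable,
                                                 struct folio **folios,
                                                 unsigned long nr_folios)
  {
          unsigned long i, collected = 0;

          for (i = 0; i < nr_folios; i++) {
                  struct folio *folio = folios[i];

                  /* The buggy version incremented "collected" here. */
                  if (!folio_isolate_lru(folio))
                          continue;

                  list_add_tail(&folio->lru, movable);
                  collected++;    /* fixed: successful isolations only */
          }

          return collected;       /* zero lets the caller stop retrying */
  }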

Link: https://lkml.kernel.org/r/20250121020159.3636477-1-zhaoyang.huang@unisoc.com
Fixes: 67e139b02d99 ("mm/gup.c: refactor check_and_migrate_movable_pages()")
Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Cc: Aijun Sun <aijun.sun@unisoc.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago ocfs2: fix incorrect CPU endianness conversion causing mount failure
Heming Zhao [Tue, 21 Jan 2025 11:22:03 +0000 (19:22 +0800)]
ocfs2: fix incorrect CPU endianness conversion causing mount failure

Commit 23aab037106d ("ocfs2: fix UBSAN warning in ocfs2_verify_volume()")
introduced a regression bug.  The blksz_bits value is already converted to
CPU endian in the previous code; therefore, the code shouldn't use
le32_to_cpu() anymore.
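
As a hedged illustration of the bug pattern (simplified, with made-up
variable names and bounds rather than the literal ocfs2_verify_volume()
code), converting an already CPU-endian value a second time byte-swaps
it on big-endian hosts:

  __le32 on_disk = raw_super_field;          /* little-endian on disk */
  u32 blksz_bits = le32_to_cpu(on_disk);     /* converted once: CPU order */

  /* Buggy: converting again byte-swaps on big-endian hosts. */
  if (le32_to_cpu(blksz_bits) < 9)           /* bound is illustrative */
          return -EINVAL;

  /* Fixed: the value is already in CPU byte order. */
  if (blksz_bits < 9)
          return -EINVAL;

On little-endian machines le32_to_cpu() is a no-op, which is why the
double conversion only shows up as a mount failure on big-endian
systems.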

Link: https://lkml.kernel.org/r/20250121112204.12834-1-heming.zhao@suse.com
Fixes: 23aab037106d ("ocfs2: fix UBSAN warning in ocfs2_verify_volume()")
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago mm: page_isolation: avoid calling folio_hstate() without hugetlb_lock
Liu Shixin [Wed, 22 Jan 2025 06:11:51 +0000 (14:11 +0800)]
mm: page_isolation: avoid calling folio_hstate() without hugetlb_lock

I found a NULL pointer dereference as follows:

 BUG: kernel NULL pointer dereference, address: 0000000000000028
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: Oops: 0000 [#1] SMP PTI
 CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty #20
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.
 RIP: 0010:has_unmovable_pages+0x184/0x360
 ...
 Call Trace:
  <TASK>
  set_migratetype_isolate+0xd1/0x180
  start_isolate_page_range+0xd2/0x170
  alloc_contig_range_noprof+0x101/0x660
  alloc_contig_pages_noprof+0x238/0x290
  alloc_gigantic_folio.isra.0+0xb6/0x1f0
  only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60
  alloc_pool_huge_folio+0x80/0xf0
  set_max_huge_pages+0x211/0x490
  __nr_hugepages_store_common+0x5f/0xe0
  nr_hugepages_store+0x77/0x80
  kernfs_fop_write_iter+0x118/0x200
  vfs_write+0x23c/0x3f0
  ksys_write+0x62/0xe0
  do_syscall_64+0x5b/0x170
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

As has_unmovable_pages() calls folio_hstate() without hugetlb_lock, there
is a race with freeing the HugeTLB page between PageHuge() and
folio_hstate().  There is no need to take hugetlb_lock here, as the
HugeTLB page can be freed from many places.  So it is enough to unfold
folio_hstate() and add a NULL check before calling
hugepage_migration_supported().
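
A hedged sketch of the unfolded check, following the description above
(the exact code and label may differ):

  struct hstate *h;

  /*
   * folio_hstate() is essentially size_to_hstate(folio_size(folio)),
   * but without hugetlb_lock the folio can be freed concurrently and
   * the lookup can return NULL.
   */
  h = size_to_hstate(folio_size(folio));
  if (h && !hugepage_migration_supported(h))
          goto unmovable;         /* label is illustrative */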

Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com
Fixes: 464c7ffbcb16 ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.")
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nanyong Sun <sunnanyong@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago MAINTAINERS: mailmap: update Yosry Ahmed's email address
Yosry Ahmed [Thu, 23 Jan 2025 23:13:44 +0000 (23:13 +0000)]
MAINTAINERS: mailmap: update Yosry Ahmed's email address

Moving to a linux.dev email address.

Link: https://lkml.kernel.org/r/20250123231344.817358-1-yosry.ahmed@linux.dev
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago mailmap, docs: update email to carlos.bilbao@kernel.org
Carlos Bilbao [Sat, 11 Jan 2025 16:11:06 +0000 (10:11 -0600)]
mailmap, docs: update email to carlos.bilbao@kernel.org

Update .mailmap to reflect my new (and final) primary email address,
carlos.bilbao@kernel.org.  This ensures consistent attribution in Git
history.  Also update my contact information in file
Documentation/translations/sp_SP/index.rst to help contributors reach out
for Spanish translations.

Link: https://lkml.kernel.org/r/20250111161110.862131-1-carlos.bilbao@kernel.org
Signed-off-by: Carlos Bilbao <carlos.bilbao@kernel.org>
Cc: Avadhut Naik <avadhut.naik@amd.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago scripts/gdb: fix aarch64 userspace detection in get_current_task
Jan Kiszka [Fri, 10 Jan 2025 10:36:33 +0000 (11:36 +0100)]
scripts/gdb: fix aarch64 userspace detection in get_current_task

At least recent gdb releases (seen with 14.2) return SP_EL0 as a signed
long, which makes the right shift always return 0.

Link: https://lkml.kernel.org/r/dcd2fabc-9131-4b48-8419-6444e2d67454@siemens.com
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago mm-vmscan-accumulate-nr_demoted-for-accurate-demotion-statistics-v2
Li Zhijian [Sat, 11 Jan 2025 01:52:53 +0000 (09:52 +0800)]
mm-vmscan-accumulate-nr_demoted-for-accurate-demotion-statistics-v2

Introduce a local nr_demoted to fix nr_reclaimed double counting.

Link: https://lkml.kernel.org/r/20250111015253.425693-1-lizhijian@fujitsu.com
Fixes: f77f0c751478 ("mm,memcg: provide per-cgroup counters for NUMA balancing operations")
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Cc: Kaiyang Zhao <kaiyang2@cs.cmu.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago mm/vmscan: accumulate nr_demoted for accurate demotion statistics
Li Zhijian [Fri, 10 Jan 2025 12:21:32 +0000 (20:21 +0800)]
mm/vmscan: accumulate nr_demoted for accurate demotion statistics

In shrink_folio_list(), demote_folio_list() can be called two times.
Currently stat->nr_demoted only stores the result of the last call (the
later nr_demoted is always zero, so the earlier nr_demoted gets lost);
as a result, the number of demoted pages is not accurate.

Accumulate the nr_demoted count across the multiple calls to
demote_folio_list(), ensuring accurate reporting of demotion statistics.
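
A hedged before/after sketch of the accounting (signatures simplified):

  /* Before: the second call overwrites the first call's count. */
  stat->nr_demoted = demote_folio_list(&demote_folios, pgdat);

  /* After: accumulate across calls; the local copy (see the v2 note
   * in the entry above) keeps nr_reclaimed from being double counted. */
  nr_demoted = demote_folio_list(&demote_folios, pgdat);
  stat->nr_demoted += nr_demoted;
  nr_reclaimed += nr_demoted;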

Link: https://lkml.kernel.org/r/20250110122133.423481-1-lizhijian@fujitsu.com
Fixes: f77f0c751478 ("mm,memcg: provide per-cgroup counters for NUMA balancing operations")
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Acked-by: Kaiyang Zhao <kaiyang2@cs.cmu.edu>
Tested-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Donet Tom <donettom@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago mm/hugetlb_vmemmap: fix memory loads ordering
Yu Zhao [Wed, 8 Jan 2025 07:48:21 +0000 (00:48 -0700)]
mm/hugetlb_vmemmap: fix memory loads ordering

Using x86_64 as an example, for a 32KB struct page[] area describing a 2MB
hugeTLB, HVO reduces the area to 4KB by the following steps:

1. Split the (r/w vmemmap) PMD mapping the area into 512 (r/w) PTEs;
2. For the 8 PTEs mapping the area, remap PTE 1-7 to the page mapped
   by PTE 0, and at the same time change the permission from r/w to
   r/o;
3. Free the pages PTE 1-7 used to map, hence the reduction from 32KB
   to 4KB.

However, the following race can happen due to improper ordering of
memory loads:
  CPU 1 (HVO)                     CPU 2 (speculative PFN walker)

  page_ref_freeze()
  synchronize_rcu()
                                  rcu_read_lock()
                                  page_is_fake_head() is false
  vmemmap_remap_pte()
  XXX: struct page[] becomes r/o

  page_ref_unfreeze()
                                  page_ref_count() is not zero

                                  atomic_add_unless(&page->_refcount)
                                  XXX: try to modify r/o struct page[]

Specifically, page_is_fake_head() must be ordered after page_ref_count()
on CPU 2 so that it can only return true for this case, to avoid the later
attempt to modify r/o struct page[].

This patch adds the missing memory barrier so that the tests on
page_is_fake_head() and page_ref_count() are done in the proper order.
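
One plausible shape of the walker-side ordering, as a hedged sketch
(the actual patch may place the barrier differently):

  /* Speculative PFN walker (CPU 2), under rcu_read_lock(): */
  if (page_ref_count(page) == 0)
          return NULL;            /* frozen by HVO, back off */

  /*
   * Order the fake-head test after the refcount load above, pairing
   * with the HVO side, so a r/o struct page is never modified.
   */
  smp_rmb();

  if (page_is_fake_head(page))
          return NULL;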

Link: https://lkml.kernel.org/r/20250108074822.722696-1-yuzhao@google.com
Fixes: bd225530a4c7 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers")
Signed-off-by: Yu Zhao <yuzhao@google.com>
Reported-by: Will Deacon <will@kernel.org>
Closes: https://lore.kernel.org/20241128142028.GA3506@willie-the-truck/
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago mm/vmscan: fix hard LOCKUP in function isolate_lru_folios
liuye [Tue, 19 Nov 2024 06:08:42 +0000 (14:08 +0800)]
mm/vmscan: fix hard LOCKUP in function isolate_lru_folios

This fixes the following hard lockup in isolate_lru_folios() during memory
reclaim.  If the LRU mostly contains ineligible folios, the scan may
trigger the watchdog.

watchdog: Watchdog detected hard LOCKUP on cpu 173
RIP: 0010:native_queued_spin_lock_slowpath+0x255/0x2a0
Call Trace:
_raw_spin_lock_irqsave+0x31/0x40
folio_lruvec_lock_irqsave+0x5f/0x90
folio_batch_move_lru+0x91/0x150
lru_add_drain_per_cpu+0x1c/0x40
process_one_work+0x17d/0x350
worker_thread+0x27b/0x3a0
kthread+0xe8/0x120
ret_from_fork+0x34/0x50
ret_from_fork_asm+0x1b/0x30

lruvec->lru_lock owner:

PID: 2865     TASK: ffff888139214d40  CPU: 40   COMMAND: "kswapd0"
 #0 [fffffe0000945e60] crash_nmi_callback at ffffffffa567a555
 #1 [fffffe0000945e68] nmi_handle at ffffffffa563b171
 #2 [fffffe0000945eb0] default_do_nmi at ffffffffa6575920
 #3 [fffffe0000945ed0] exc_nmi at ffffffffa6575af4
 #4 [fffffe0000945ef0] end_repeat_nmi at ffffffffa6601dde
    [exception RIP: isolate_lru_folios+403]
    RIP: ffffffffa597df53  RSP: ffffc90006fb7c28  RFLAGS: 00000002
    RAX: 0000000000000001  RBX: ffffc90006fb7c60  RCX: ffffea04a2196f88
    RDX: ffffc90006fb7c60  RSI: ffffc90006fb7c60  RDI: ffffea04a2197048
    RBP: ffff88812cbd3010   R8: ffffea04a2197008   R9: 0000000000000001
    R10: 0000000000000000  R11: 0000000000000001  R12: ffffea04a2197008
    R13: ffffea04a2197048  R14: ffffc90006fb7de8  R15: 0000000003e3e937
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    <NMI exception stack>
 #5 [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
 #6 [ffffc90006fb7cf8] shrink_active_list at ffffffffa597f788
 #7 [ffffc90006fb7da8] balance_pgdat at ffffffffa5986db0
 #8 [ffffc90006fb7ec0] kswapd at ffffffffa5987354
 #9 [ffffc90006fb7ef8] kthread at ffffffffa5748238
crash>

Scenario:
User processes request a large amount of memory and keep the pages
active.  Then a module continuously requests memory from the ZONE_DMA32
area.  Memory reclaim is triggered because the ZONE_DMA32 watermark is
reached.  However, the pages in the LRU (active_anon) list are mostly
from the ZONE_NORMAL area.

Reproduce:
Terminal 1: Construct a workload that continuously increases active (anon) pages.
mkdir /tmp/memory
mount -t tmpfs -o size=1024000M tmpfs /tmp/memory
dd if=/dev/zero of=/tmp/memory/block bs=4M
tail /tmp/memory/block

Terminal 2:
vmstat -a 1
The active column keeps increasing.
procs ---memory--- ---swap-- ---io---- -system-- ---cpu--- ...
 r  b   swpd   free  inact active   si   so    bi    bo
 1  0   0 1445623076 45898836 83646008    0    0     0
 1  0   0 1445623076 43450228 86094616    0    0     0
 1  0   0 1445623076 41003480 88541364    0    0     0
 1  0   0 1445623076 38557088 90987756    0    0     0
 1  0   0 1445623076 36109688 93435156    0    0     0
 1  0   0 1445619552 33663256 95881632    0    0     0
 1  0   0 1445619804 31217140 98327792    0    0     0
 1  0   0 1445619804 28769988 100774944    0    0     0
 1  0   0 1445619804 26322348 103222584    0    0     0
 1  0   0 1445619804 23875592 105669340    0    0     0

cat /proc/meminfo | head
Active(anon) keeps increasing.
MemTotal:       1579941036 kB
MemFree:        1445618500 kB
MemAvailable:   1453013224 kB
Buffers:            6516 kB
Cached:         128653956 kB
SwapCached:            0 kB
Active:         118110812 kB
Inactive:       11436620 kB
Active(anon):   115345744 kB
Inactive(anon):   945292 kB

When Active(anon) reaches 115345744 kB, loading the module triggers the
ZONE_DMA32 watermark.

perf record -e vmscan:mm_vmscan_lru_isolate -aR
perf script
isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=2
nr_skipped=2 nr_taken=0 lru=active_anon
isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=0
nr_skipped=0 nr_taken=0 lru=active_anon
isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=28835844
nr_skipped=28835844 nr_taken=0 lru=active_anon
isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=28835844
nr_skipped=28835844 nr_taken=0 lru=active_anon
isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=29
nr_skipped=29 nr_taken=0 lru=active_anon
isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=0
nr_skipped=0 nr_taken=0 lru=active_anon

Note nr_scanned=28835844: 28835844 * 4kB = 115343376 kB, which is
approximately equal to the 115345744 kB of Active(anon).

If Active(anon) is increased to 1000G before the module triggers the
ZONE_DMA32 watermark, a hard lockup will occur.

On my device nr_scanned = 0000000003e3e937 at the time of the hard lockup.
Converted to a memory size: 0x0000000003e3e937 * 4KB = 261072092 KB.

   [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
    ffffc90006fb7c30: 0000000000000020 0000000000000000
    ffffc90006fb7c40: ffffc90006fb7d40 ffff88812cbd3000
    ffffc90006fb7c50: ffffc90006fb7d30 0000000106fb7de8
    ffffc90006fb7c60: ffffea04a2197008 ffffea0006ed4a48
    ffffc90006fb7c70: 0000000000000000 0000000000000000
    ffffc90006fb7c80: 0000000000000000 0000000000000000
    ffffc90006fb7c90: 0000000000000000 0000000000000000
    ffffc90006fb7ca0: 0000000000000000 0000000003e3e937
    ffffc90006fb7cb0: 0000000000000000 0000000000000000
    ffffc90006fb7cc0: 8d7c0b56b7874b00 ffff88812cbd3000

About the Fixes tag:
Why did it take eight years for this to be discovered?

The problem requires the following conditions to occur:
1. The device memory should be large enough.
2. Pages in the LRU(active_anon) list are mostly from the ZONE_NORMAL area.
3. The memory in ZONE_DMA32 needs to reach the watermark.

If the memory is not large enough, or if the usage of ZONE_DMA32 memory
is kept reasonable, this problem is difficult to detect.

Notes:
The problem is most likely to occur with ZONE_DMA32 and ZONE_NORMAL,
but other suitable scenarios may also trigger it.

Link: https://lkml.kernel.org/r/20241119060842.274072-1-liuye@kylinos.cn
Fixes: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
Signed-off-by: liuye <liuye@kylinos.cn>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Yang Shi <yang@os.amperecomputing.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago mm/compaction: fix UBSAN shift-out-of-bounds warning
Liu Shixin [Thu, 23 Jan 2025 02:10:29 +0000 (10:10 +0800)]
mm/compaction: fix UBSAN shift-out-of-bounds warning

syzkaller reported a UBSAN shift-out-of-bounds warning of (1UL << order)
in isolate_freepages_block().  The bogus compound_order can be any value
because it is a union with flags.  Add back the MAX_PAGE_ORDER check to
fix the warning.
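
A hedged sketch of the restored bound (simplified from the description
above; the surrounding strict-mode handling in
isolate_freepages_block() is omitted):

  const unsigned int order = compound_order(page);

  /*
   * "order" can be garbage for a page that is not actually compound
   * (the field is a union with flags), so bound it before shifting.
   */
  if (order <= MAX_PAGE_ORDER) {
          blockpfn += (1UL << order) - 1;
          page += (1UL << order) - 1;
  }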

Link: https://lkml.kernel.org/r/20250123021029.2826736-1-liushixin2@huawei.com
Fixes: 3da0272a4c7d ("mm/compaction: correctly return failure with bogus compound_order in strict mode")
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Kemeng Shi <shikemeng@huaweicloud.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Nanyong Sun <sunnanyong@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago s390/mm: add missing ctor/dtor on page table upgrade
Alexander Gordeev [Thu, 23 Jan 2025 16:03:49 +0000 (17:03 +0100)]
s390/mm: add missing ctor/dtor on page table upgrade

Commit 78966b550289 ("s390: pgtable: add statistics for PUD and P4D level
page table") misses the call to pagetable_p4d_ctor() against a newly
allocated P4D table in crst_table_upgrade().

Commit 68c601de75d8 ("mm: introduce ctor/dtor at PGD level") misses the
call to pagetable_pgd_ctor() against a newly allocated PGD, and the call
to pagetable_dtor() against a newly allocated P4D that is about to be
freed on the crst_table_upgrade() PGD upgrade failure path.

The missed constructors and destructor break (at least) the page table
accounting when a process memory space is upgraded.
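
A hedged sketch of where the missing calls belong in
crst_table_upgrade() (structure heavily simplified; the real function
handles both levels, error checking, and locking):

  p4d = crst_table_alloc(mm);
  pagetable_p4d_ctor(virt_to_ptdesc(p4d));        /* was missing */

  pgd = crst_table_alloc(mm);
  if (!pgd) {
          /* PGD upgrade failed: tear down the P4D constructed above. */
          pagetable_dtor(virt_to_ptdesc(p4d));    /* was missing */
          crst_table_free(mm, p4d);
          return -ENOMEM;
  }
  pagetable_pgd_ctor(virt_to_ptdesc(pgd));        /* was missing */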

Link: https://lkml.kernel.org/r/20250123160349.200154-1-agordeev@linux.ibm.com
Fixes: 78966b550289 ("s390: pgtable: add statistics for PUD and P4D level page table")
Fixes: 68c601de75d8 ("mm: introduce ctor/dtor at PGD level")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reported-by: Heiko Carstens <hca@linux.ibm.com>
Closes: https://lore.kernel.org/all/20250122074954.8685-A-hca@linux.ibm.com/
Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Acked-by: Qi Zheng <zhengqi.arch@bytedance.com>
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago kasan: sw_tags: use str_on_off() helper in kasan_init_sw_tags()
Thorsten Blum [Thu, 16 Jan 2025 06:24:04 +0000 (07:24 +0100)]
kasan: sw_tags: use str_on_off() helper in kasan_init_sw_tags()

Remove hard-coded strings by using the str_on_off() helper function.
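
The pattern, sketched (the message string and variable are
illustrative, not the exact kasan output):

  #include <linux/string_choices.h>

  /* Before: open-coded ternary. */
  pr_info("KernelAddressSanitizer initialized (sw-tags, stacktrace=%s)\n",
          stacktrace_enabled ? "on" : "off");

  /* After: one canonical pair of strings via the helper. */
  pr_info("KernelAddressSanitizer initialized (sw-tags, stacktrace=%s)\n",
          str_on_off(stacktrace_enabled));

The str_high_low() and str_write_read() conversions further down this
log follow the same pattern.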

Link: https://lkml.kernel.org/r/20250116062403.2496-2-thorsten.blum@linux.dev
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Suggested-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago tools: add VM_WARN_ON_VMG definition
Suren Baghdasaryan [Thu, 16 Jan 2025 18:15:38 +0000 (10:15 -0800)]
tools: add VM_WARN_ON_VMG definition

vma tests compilation yields the following error:

vma.c:732:9: error: implicit declaration of function ‘VM_WARN_ON_VMG’

Fix it by adding the missing VM_WARN_ON_VMG() definition.

Link: https://lkml.kernel.org/r/20250116181538.759469-1-surenb@google.com
Fixes: e3a7ae85f87c ("mm/debug: prefer VM_WARN_ON_VMG() to report VMG debug warnings")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago mm/damon/core: use str_high_low() helper in damos_wmark_wait_us()
Thorsten Blum [Thu, 16 Jan 2025 20:42:16 +0000 (21:42 +0100)]
mm/damon/core: use str_high_low() helper in damos_wmark_wait_us()

Remove hard-coded strings by using the str_high_low() helper function.

Link: https://lkml.kernel.org/r/20250116204216.106999-2-thorsten.blum@linux.dev
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago seqlock: add missing parameter documentation for raw_seqcount_try_begin()
Suren Baghdasaryan [Thu, 16 Jan 2025 18:27:30 +0000 (10:27 -0800)]
seqlock: add missing parameter documentation for raw_seqcount_try_begin()

Add missing documentation for raw_seqcount_try_begin() start parameter.

Link: https://lkml.kernel.org/r/20250116182730.801497-1-surenb@google.com
Fixes: dba4761a3e40 ("seqlock: add raw_seqcount_try_begin")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/all/20250116170522.23e884d5@canb.auug.org.au/
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago mm/page-writeback: consolidate wb_thresh bumping logic into __wb_calc_thresh
Jim Zhao [Thu, 21 Nov 2024 10:05:39 +0000 (18:05 +0800)]
mm/page-writeback: consolidate wb_thresh bumping logic into __wb_calc_thresh

Address the feedback from 39ac99852fca ("mm/page-writeback: raise
wb_thresh to prevent write blocking with strictlimit").  The wb_thresh
bumping logic is scattered across wb_position_ratio, __wb_calc_thresh, and
wb_update_dirty_ratelimit.  For consistency, consolidate all wb_thresh
bumping logic into __wb_calc_thresh.

Link: https://lkml.kernel.org/r/20241121100539.605818-1-jimzhao.ai@gmail.com
Signed-off-by: Jim Zhao <jimzhao.ai@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago mm/page_alloc: remove the incorrect and misleading comment
Yuntao Wang [Wed, 15 Jan 2025 04:16:34 +0000 (12:16 +0800)]
mm/page_alloc: remove the incorrect and misleading comment

The comment removed in this patch originally belonged to the
build_zonelists_in_zone_order() function, which was introduced by commit
f0c0b2b808f2 ("change zonelist order: zonelist order selection logic").

Later, commit c9bff3eebc09 ("mm, page_alloc: rip out ZONELIST_ORDER_ZONE")
removed build_zonelists_in_zone_order() but left its comment behind.

Subsequently, commit 9d3be21bf9c0 ("mm, page_alloc: simplify zonelist
initialization") moved the node_order variable into build_zonelists(),
making the comment originally belonged to build_zonelists_in_zone_order()
appear as if it were part of build_zonelists().

Remove this misleading comment.

Link: https://lkml.kernel.org/r/20250115041634.63387-1-yuntao.wang@linux.dev
Signed-off-by: Yuntao Wang <yuntao.wang@linux.dev>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago zram: remove zcomp_stream_put() from write_incompressible_page()
Sergey Senozhatsky [Wed, 15 Jan 2025 07:19:16 +0000 (16:19 +0900)]
zram: remove zcomp_stream_put() from write_incompressible_page()

We cannot and should not put per-CPU compression stream in
write_incompressible_page() because that function never gets any
per-CPU streams in the first place.  It's zram_write_page() that
puts the stream before it calls write_incompressible_page().
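
A hedged outline of the ownership rule (names and signatures are
approximate, not the literal drivers/block/zram code):

  /* zram_write_page() owns the per-CPU stream: */
  zstrm = zcomp_stream_get(zram->comps[ZRAM_PRIMARY_COMP]);
  ret = zcomp_compress(zram->comps[ZRAM_PRIMARY_COMP], zstrm,
                       src, &comp_len);
  if (comp_len >= huge_class_size) {
          zcomp_stream_put(zstrm);        /* released here, by the owner */
          return write_incompressible_page(zram, page, index);
  }

  /*
   * write_incompressible_page() never called zcomp_stream_get(), so it
   * must not call zcomp_stream_put() either.
   */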

Link: https://lkml.kernel.org/r/20250115072003.380567-1-senozhatsky@chromium.org
Fixes: 485d11509d6d ("zram: factor out ZRAM_HUGE write")
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago mm: separate move/undo parts from migrate_pages_batch()
Byungchul Park [Thu, 8 Aug 2024 06:53:58 +0000 (15:53 +0900)]
mm: separate move/undo parts from migrate_pages_batch()

Functionally, no change.  This is a preparation for luf mechanism that
requires to use separated folio lists for its own handling during
migration.  Refactored migrate_pages_batch() so as to separate move/undo
parts from migrate_pages_batch().

Link: https://lkml.kernel.org/r/20250115103403.11882-1-byungchul@sk.com
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Shivank Garg <shivankg@amd.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago mm/kfence: use str_write_read() helper in get_access_type()
Thorsten Blum [Wed, 15 Jan 2025 15:55:12 +0000 (16:55 +0100)]
mm/kfence: use str_write_read() helper in get_access_type()

Remove hard-coded strings by using the str_write_read() helper function.

Link: https://lkml.kernel.org/r/20250115155511.954535-2-thorsten.blum@linux.dev
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Suggested-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago selftests/mm/mkdirty: fix memory leak in test_uffdio_copy()
liuye [Tue, 14 Jan 2025 02:38:38 +0000 (10:38 +0800)]
selftests/mm/mkdirty: fix memory leak in test_uffdio_copy()

Release memory before the exception branch returns, to prevent a memory leak.

Checking tools/testing/selftests/mm/mkdirty.c ...
tools/testing/selftests/mm/mkdirty.c:283:3: error: Memory leak: src [memleak]
  return;
  ^
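
A hedged sketch of the fix shape (the failing call and variable names
are placeholders, not the literal mkdirty.c code):

  char *src = malloc(pagesize);

  if (!src)
          return;

  /* Error path that previously leaked src: */
  if (setup_failed) {
          ksft_test_result_fail("setup failed\n");
          free(src);      /* the fix: release before the early return */
          return;
  }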

Link: https://lkml.kernel.org/r/20250114023838.48589-1-liuye@kylinos.cn
Signed-off-by: liuye <liuye@kylinos.cn>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago kasan: hw_tags: Use str_on_off() helper in kasan_init_hw_tags()
Thorsten Blum [Tue, 14 Jan 2025 15:09:35 +0000 (16:09 +0100)]
kasan: hw_tags: Use str_on_off() helper in kasan_init_hw_tags()

Remove hard-coded strings by using the str_on_off() helper function.

Link: https://lkml.kernel.org/r/20250114150935.780869-2-thorsten.blum@linux.dev
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Suggested-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago selftests/mm: virtual_address_range: avoid reading from VM_IO mappings
Thomas Weißschuh [Tue, 14 Jan 2025 16:06:48 +0000 (17:06 +0100)]
selftests/mm: virtual_address_range: avoid reading from VM_IO mappings

The virtual_address_range selftest reads from the start of each mapping
listed in /proc/self/maps.  However, not all mappings are valid for
arbitrary access.

For example the vvar data used for virtual clocks on x86 [vvar_vclock] can
only be accessed if 1) the kernel configuration enables virtual clocks and
2) the hypervisor provided the data for it.  Only the VDSO itself has the
necessary information to know this.  Since commit e93d2521b27f ("x86/vdso:
Split virtual clock pages into dedicated mapping") the virtual clock data
was split out into its own mapping, leading to EFAULT from read() during
the validation.

Check for the VM_IO flag as a proxy.  It is present for the VVAR
mappings, and MMIO ranges can be dangerous to access arbitrarily.
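
A hedged sketch of such a filter; VM_IO appears as the two-letter code
"io" in the VmFlags line of /proc/self/smaps (the helper name and line
handling here are assumptions):

  #include <stdbool.h>
  #include <string.h>

  /* vmflags is the VmFlags line, e.g. "VmFlags: rd mr mw me io " */
  static bool mapping_is_io(const char *vmflags)
  {
          /* every flag code is two letters delimited by spaces */
          return strstr(vmflags, " io") != NULL;
  }

  /* In the scan loop, skip such mappings instead of read()ing them: */
  if (mapping_is_io(vmflags))
          continue;       /* reading may fault (vvar clocks, MMIO) */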

Link: https://lkml.kernel.org/r/20250114-virtual_address_range-tests-v4-4-6fd7269934a5@linutronix.de
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202412271148.2656e485-lkp@intel.com
Fixes: e93d2521b27f ("x86/vdso: Split virtual clock pages into dedicated mapping")
Fixes: 010409649885 ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Suggested-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/lkml/e97c2a5d-c815-4936-a767-ac42a3220a90@redhat.com/
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago selftests/mm: vm_util: split up /proc/self/smaps parsing
Thomas Weißschuh [Tue, 14 Jan 2025 16:06:47 +0000 (17:06 +0100)]
selftests/mm: vm_util: split up /proc/self/smaps parsing

Upcoming changes want to reuse the /proc/self/smaps parsing logic to parse
the VmFlags field.

As that works differently from the currently parsed HugePage counters,
split up the logic so common functionality can be shared.

While reworking this code, also use the correct sscanf placeholder for the
"uint64_t thp" variable.

Link: https://lkml.kernel.org/r/20250114-virtual_address_range-tests-v4-3-6fd7269934a5@linutronix.de
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: kernel test robot <oliver.sang@intel.com>
Cc: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago selftests/mm: virtual_address_range: unmap chunks after validation
Thomas Weißschuh [Tue, 14 Jan 2025 16:06:46 +0000 (17:06 +0100)]
selftests/mm: virtual_address_range: unmap chunks after validation

For each accessed chunk a PTE is created.  More than 1GiB of PTEs is used
in this way.  Remove each PTE after validating a chunk to reduce peak
memory usage.

It is important to only unmap memory that was previously mmap()ed, as
unmapping other mappings, like the stack, heap, or executable mappings,
will crash the process.

The mappings read from /proc/self/maps and the return values from mmap()
don't allow a simple correlation due to merging and no guaranteed order.
To correlate the pointers and mappings use prctl(PR_SET_VMA_ANON_NAME).
While it introduces a test dependency, other alternatives would introduce
runtime or development overhead.
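
A hedged sketch of the naming trick (the chunk size and name are
illustrative; PR_SET_VMA_ANON_NAME requires CONFIG_ANON_VMA_NAME, which
is the test dependency mentioned above):

  #include <linux/prctl.h>
  #include <sys/mman.h>
  #include <sys/prctl.h>

  size_t len = 1UL << 30;         /* one chunk, illustrative size */
  char *ptr = mmap(NULL, len, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

  /* The mapping now shows up in /proc/self/maps as
   * "[anon:virtual_address_range]", so it can be matched exactly. */
  prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME,
        (unsigned long)ptr, len, "virtual_address_range");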

Link: https://lkml.kernel.org/r/20250114-virtual_address_range-tests-v4-2-6fd7269934a5@linutronix.de
Fixes: 010409649885 ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: kernel test robot <oliver.sang@intel.com>
Cc: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months ago selftests/mm: virtual_address_range: mmap() without PROT_WRITE
Thomas Weißschuh [Tue, 14 Jan 2025 16:06:45 +0000 (17:06 +0100)]
selftests/mm: virtual_address_range: mmap() without PROT_WRITE

Patch series "selftests/mm: virtual_address_range: Reduce memory", v4.

The selftest started failing since commit e93d2521b27f ("x86/vdso: Split
virtual clock pages into dedicated mapping") was merged.  While debugging
I stumbled upon some memory usage optimizations.

With these changes, the test now runs on a VM with only 60MiB of memory.

This patch (of 4):

When mapping a chunk larger than the available physical memory with
PROT_WRITE while overcommit is disabled, the mapping will fail.  This
prevents the test from running on systems with less than ~1GiB of memory
and triggers an inscrutable test failure.  As the mappings are never
written to anyway, the flag can be removed.
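
The resulting call, sketched (the constant name is illustrative): a
read-only private anonymous mapping is not charged against the commit
limit (only writable private mappings are), so it succeeds even with
overcommit disabled.

  #include <sys/mman.h>

  #define MAP_CHUNK_SIZE (1UL << 34)      /* illustrative chunk size */

  /* The test only probes addresses; it never writes the memory. */
  char *ptr = mmap(NULL, MAP_CHUNK_SIZE, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);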

Link: https://lkml.kernel.org/r/20250114-virtual_address_range-tests-v4-0-6fd7269934a5@linutronix.de
Link: https://lkml.kernel.org/r/20250114-virtual_address_range-tests-v4-1-6fd7269934a5@linutronix.de
Fixes: 4e5ce33ceb32 ("selftests/vm: add a test for virtual address range mapping")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Dev Jain <dev.jain@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Cc: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>