mm: readahead: improve mmap_miss heuristic for concurrent faults
author Roman Gushchin <roman.gushchin@linux.dev>
Fri, 15 Aug 2025 18:32:24 +0000 (11:32 -0700)
committer Andrew Morton <akpm@linux-foundation.org>
Fri, 12 Sep 2025 00:25:03 +0000 (17:25 -0700)
If two or more threads of an application fault on the same folio, the
mmap_miss counter can be decreased multiple times.  This breaks the
mmap_miss heuristic and keeps readahead enabled even under extreme
levels of memory pressure.

This happens often when the file folios backing a multi-threaded
application are repeatedly evicted and re-faulted.

Fix it by skipping decreasing mmap_miss if the folio is locked.

This change was evaluated on several hundred thousand hosts in Google's
production fleet over a couple of weeks.  The number of containers stuck
in a vicious reclaim cycle for long periods dropped several fold
(~10-20x), and the overall fleet-wide CPU time spent in direct memory
reclaim was meaningfully reduced.  No regressions were observed.

Link: https://lkml.kernel.org/r/20250815183224.62007-1-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/filemap.c

index d1fb0b12bff27b9862143f01fabe58e7b7729aca..1a388b11cfa93de2bb40a1ac04bf64f47ef750a3 100644 (file)
@@ -3323,9 +3323,17 @@ static struct file *do_async_mmap_readahead(struct vm_fault *vmf,
        if (vmf->vma->vm_flags & VM_RAND_READ || !ra->ra_pages)
                return fpin;
 
-       mmap_miss = READ_ONCE(ra->mmap_miss);
-       if (mmap_miss)
-               WRITE_ONCE(ra->mmap_miss, --mmap_miss);
+       /*
+        * If the folio is locked, we're likely racing against another fault.
+        * Don't touch the mmap_miss counter to avoid decreasing it multiple
+        * times for a single folio and break the balance with mmap_miss
+        * increase in do_sync_mmap_readahead().
+        */
+       if (likely(!folio_test_locked(folio))) {
+               mmap_miss = READ_ONCE(ra->mmap_miss);
+               if (mmap_miss)
+                       WRITE_ONCE(ra->mmap_miss, --mmap_miss);
+       }
 
        if (folio_test_readahead(folio)) {
                fpin = maybe_unlock_mmap_for_io(vmf, fpin);