Yin Fengwei [Tue, 7 Feb 2023 17:45:21 +0000 (12:45 -0500)]
filemap: Batch PTE mappings
Call set_pte_range() once per contiguous range of the folio instead
of once per page. This batches the updates to mm counters and the
rmap.
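The shape of the change, as a hedged sketch rather than the literal
diff (can_map_page() is a hypothetical stand-in for the real per-page
checks; set_pte_range() and the other names are from this series):
    unsigned int count = 0;

    while (nr_pages--) {
        if (can_map_page(page + count)) {   /* hypothetical check */
            count++;                        /* extend the current run */
            continue;
        }
        if (count) {
            /* one rmap + mm-counter update for the whole run */
            set_pte_range(vmf, folio, page, count, addr);
            page += count;
            addr += count * PAGE_SIZE;
            count = 0;
        }
        page++;                             /* skip the unmappable page */
        addr += PAGE_SIZE;
    }
    if (count)
        set_pte_range(vmf, folio, page, count, addr);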
With a will-it-scale.page_fault3-like app (modified to do read fault
testing instead of file write fault testing; an attempt to upstream it
to will-it-scale is at [1]), this got a 15% performance gain on a
48C/96T Cascade Lake test box with 96 processes running against xfs.
Perf data collected before/after the change:
18.73%--page_add_file_rmap
|
--11.60%--__mod_lruvec_page_state
|
|--7.40%--__mod_memcg_lruvec_state
| |
| --5.58%--cgroup_rstat_updated
|
--2.53%--__mod_lruvec_state
|
--1.48%--__mod_node_page_state
Yin Fengwei [Tue, 7 Feb 2023 17:33:25 +0000 (12:33 -0500)]
mm: Convert do_set_pte() to set_pte_range()
set_pte_range() sets up page table entries for a specific range of
pages. It takes advantage of the batched rmap update for large folios
and now takes care of calling update_mmu_cache_range() itself.
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
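For reference, the new helper's contract is roughly the following (a
hedged sketch based on this description, not a quote of the final
header):
    /*
     * Install PTEs for @nr consecutive pages of @folio, starting at
     * @page, mapped at userspace address @addr.  Performs the batched
     * rmap update and calls update_mmu_cache_range() itself, which
     * do_set_pte() left to its callers.
     */
    void set_pte_range(struct vm_fault *vmf, struct folio *folio,
                       struct page *page, unsigned int nr, unsigned long addr);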
Yin Fengwei [Mon, 6 Feb 2023 14:06:37 +0000 (22:06 +0800)]
rmap: add folio_add_file_rmap_range()
folio_add_file_rmap_range() adds pte mappings to a specific range of a
file folio. Compared to page_add_file_rmap(), it batches the
__lruvec_stat updates for large folios.
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
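In signature form, the idea is roughly this (hedged sketch, not a
quote of the final header):
    /*
     * Add PTE mappings for @nr pages of @folio starting at @page.
     * The __lruvec_stat update is issued once with the batch size
     * instead of once per page.
     */
    void folio_add_file_rmap_range(struct folio *folio, struct page *page,
                                   unsigned int nr, struct vm_area_struct *vma,
                                   bool compound);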
Yin Fengwei [Mon, 6 Feb 2023 14:06:36 +0000 (22:06 +0800)]
filemap: Add filemap_map_folio_range()
filemap_map_folio_range() maps a partial or full folio. Compared to
the original filemap_map_pages(), it updates the refcount once per
folio instead of once per page, which gives a minor performance
improvement for large folios.
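The refcount batching amounts to replacing a per-page atomic with one
ranged update, roughly (hedged sketch):
    /* before: one atomic operation per page mapped */
    for (i = 0; i < count; i++)
        folio_ref_inc(folio);

    /* after: a single atomic operation for the whole run */
    folio_ref_add(folio, count);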
With a will-it-scale.page_fault3-like app (modified to do read fault
testing instead of file write fault testing; an attempt to upstream it
to will-it-scale is at [1]), this got a 2% performance gain on a
48C/96T Cascade Lake test box with 96 processes running against xfs.
Matthew Wilcox (Oracle) [Tue, 28 Feb 2023 19:10:58 +0000 (14:10 -0500)]
mm: Rationalise flush_icache_pages() and flush_icache_page()
Move the default (no-op) implementation of flush_icache_pages()
to <linux/cacheflush.h> from <asm-generic/cacheflush.h>.
Hoist the flush_icache_page() wrapper out of each architecture and
into <linux/cacheflush.h>.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
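After this commit, <linux/cacheflush.h> plausibly carries both the
no-op default and the single-page wrapper, along these lines (hedged
sketch, not a quote of the final header):
    #ifndef flush_icache_pages
    static inline void flush_icache_pages(struct vm_area_struct *vma,
                                          struct page *page, unsigned int nr)
    {
    }
    #endif

    #define flush_icache_page(vma, page) flush_icache_pages(vma, page, 1)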
Matthew Wilcox (Oracle) [Mon, 13 Feb 2023 19:12:22 +0000 (14:12 -0500)]
um: Implement the new page table range API
Add PFN_PTE_SHIFT and update_mmu_cache_range().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: linux-um@lists.infradead.org
Matthew Wilcox (Oracle) [Mon, 13 Feb 2023 19:12:22 +0000 (14:12 -0500)]
sparc64: Implement the new page table range API
Add set_ptes(), update_mmu_cache_range(), flush_dcache_folio() and
flush_icache_pages(). Convert the PG_dcache_dirty flag from being
per-page to per-folio.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Matthew Wilcox (Oracle) [Mon, 13 Feb 2023 19:12:22 +0000 (14:12 -0500)]
sh: Implement the new page table range API
Add PFN_PTE_SHIFT, update_mmu_cache_range(), flush_dcache_folio() and
flush_icache_pages(). Change the PG_dcache_clean flag from being
per-page to per-folio. Flush the entire folio containing the pages in
flush_icache_pages() for ease of implementation.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: linux-sh@vger.kernel.org
Matthew Wilcox (Oracle) [Mon, 13 Feb 2023 19:12:22 +0000 (14:12 -0500)]
parisc: Implement the new page table range API
Add set_ptes(), update_mmu_cache_range(), flush_dcache_folio()
and flush_icache_pages(). Change the PG_arch_1 (aka PG_dcache_dirty) flag
from being per-page to per-folio.
Matthew Wilcox (Oracle) [Mon, 13 Feb 2023 19:12:22 +0000 (14:12 -0500)]
openrisc: Implement the new page table range API
Add PFN_PTE_SHIFT, update_mmu_cache_range() and flush_dcache_folio().
Change the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page
to per-folio.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Stafford Horne <shorne@gmail.com>
Cc: linux-openrisc@vger.kernel.org
Matthew Wilcox (Oracle) [Mon, 13 Feb 2023 19:12:22 +0000 (14:12 -0500)]
nios2: Implement the new page table range API
Add set_ptes(), update_mmu_cache_range(), flush_icache_pages() and
flush_dcache_folio(). Change the PG_arch_1 (aka PG_dcache_dirty) flag
from being per-page to per-folio.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Matthew Wilcox (Oracle) [Mon, 13 Feb 2023 19:12:22 +0000 (14:12 -0500)]
mips: Implement the new page table range API
Rename _PFN_SHIFT to PFN_PTE_SHIFT. Convert a few places
to call set_pte() instead of set_pte_at(). Add set_ptes(),
update_mmu_cache_range(), flush_icache_pages() and flush_dcache_folio().
Change the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page
to per-folio.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: linux-mips@vger.kernel.org
Matthew Wilcox (Oracle) [Mon, 13 Feb 2023 19:12:22 +0000 (14:12 -0500)]
microblaze: Implement the new page table range API
Rename PFN_SHIFT_OFFSET to PFN_PTE_SHIFT. Change the calling
convention for set_pte() to be the same as other architectures. Add
update_mmu_cache_range(), flush_icache_pages() and flush_dcache_folio().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Michal Simek <monstr@monstr.eu>
Matthew Wilcox (Oracle) [Mon, 13 Feb 2023 19:12:22 +0000 (14:12 -0500)]
loongarch: Implement the new page table range API
Add update_mmu_cache_range() and change _PFN_SHIFT to PFN_PTE_SHIFT.
It would probably be more efficient to implement update_mmu_cache_range()
by flushing the entire folio instead of calling __update_tlb() N times,
but I'll leave that for someone who understands the architecture better.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: loongarch@lists.linux.dev
Matthew Wilcox (Oracle) [Mon, 13 Feb 2023 19:12:22 +0000 (14:12 -0500)]
ia64: Implement the new page table range API
Add PFN_PTE_SHIFT, update_mmu_cache_range() and flush_dcache_folio().
Change the PG_arch_1 (aka PG_dcache_clean) flag from being per-page to
per-folio, which makes arch_dma_mark_clean() and mark_clean() a little
more exciting.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: linux-ia64@vger.kernel.org
Matthew Wilcox (Oracle) [Fri, 10 Feb 2023 20:35:06 +0000 (15:35 -0500)]
arm: Implement the new page table range API
Add set_ptes(), update_mmu_cache_range(), flush_dcache_folio() and
flush_icache_pages(). Change the PG_dcache_clean flag from being per-page
to per-folio, which makes __dma_page_dev_to_cpu() a bit more exciting.
Also add flush_cache_pages(), even though this isn't used by generic code
(yet?).
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Matthew Wilcox (Oracle) [Fri, 10 Feb 2023 20:35:06 +0000 (15:35 -0500)]
arc: Implement the new page table range API
Add PFN_PTE_SHIFT, update_mmu_cache_range(), flush_dcache_folio()
and flush_icache_pages().
Change the PG_dc_clean flag from being per-page to per-folio (which
means it cannot always be set as we don't know that all pages in this
folio were cleaned). Enhance the internal flush routines to take the
number of pages to flush.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: linux-snps-arc@lists.infradead.org
Matthew Wilcox (Oracle) [Wed, 8 Feb 2023 19:07:59 +0000 (14:07 -0500)]
alpha: Implement the new page table range API
Add PFN_PTE_SHIFT, update_mmu_cache_range() and flush_icache_pages().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: linux-alpha@vger.kernel.org
Matthew Wilcox (Oracle) [Fri, 3 Mar 2023 16:52:17 +0000 (11:52 -0500)]
mm: Add default definition of set_ptes()
Most architectures can just define set_pte() and PFN_PTE_SHIFT to
use this definition. It's also a handy spot to document the guarantees
provided by the MM.
Suggested-by: Mike Rapoport (IBM) <rppt@kernel.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
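A minimal sketch of that default, assuming only set_pte() and
PFN_PTE_SHIFT from the architecture (details such as page-table
checks are omitted here):
    #ifndef set_ptes
    static inline void set_ptes(struct mm_struct *mm, unsigned long addr,
                pte_t *ptep, pte_t pte, unsigned int nr)
    {
        for (;;) {
            set_pte(ptep, pte);
            if (--nr == 0)
                break;
            ptep++;
            /* advance the PFN embedded in the PTE by one page */
            pte = __pte(pte_val(pte) + (1UL << PFN_PTE_SHIFT));
        }
    }
    #endif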
Matthew Wilcox (Oracle) [Fri, 10 Feb 2023 20:30:38 +0000 (15:30 -0500)]
mm: Add folio_flush_mapping()
This is the folio equivalent of page_mapping_file(), but renamed to
make it clear that it's very different from page_file_mapping().
Theoretically, there's nothing flush-only about it, but there are no
other users today, and I doubt there will be; it's almost always more
useful to know the swapfile's mapping or the swapcache's mapping.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
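A plausible shape for the helper, as a hedged sketch: return NULL for
swapcache folios (where the swapfile's or swapcache's mapping is the
useful one), otherwise the folio's mapping.
    static inline struct address_space *folio_flush_mapping(struct folio *folio)
    {
        if (unlikely(folio_test_swapcache(folio)))
            return NULL;
        return folio_mapping(folio);
    }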
Matthew Wilcox (Oracle) [Wed, 8 Feb 2023 19:17:52 +0000 (14:17 -0500)]
mm: Add generic flush_icache_pages() and documentation
flush_icache_page() is deprecated but not yet removed, so add
a range version of it. Change the documentation to refer to
update_mmu_cache_range() instead of update_mmu_cache().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
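Since flush_icache_page() still exists at this point in the series,
the generic range version can simply iterate it; a hedged sketch:
    #ifndef flush_icache_pages
    static inline void flush_icache_pages(struct vm_area_struct *vma,
                                          struct page *page, unsigned int nr)
    {
        unsigned int i;

        for (i = 0; i < nr; i++)
            flush_icache_page(vma, page + i);
    }
    #endif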
Determine if a value lies within a range more efficiently (subtraction +
comparison vs two comparisons and an AND). It also has useful (under
some circumstances) behaviour if the range exceeds the maximum value of
the type. Convert all the conflicting definitions of in_range() within
the kernel; some can use the generic definition while others need their
own definition.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
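The trick is standard unsigned-wraparound arithmetic; a minimal sketch
(in_range_sketch() is an illustrative name, not the kernel macro):
    static inline bool in_range_sketch(unsigned long val,
                                       unsigned long start, unsigned long len)
    {
        /*
         * If val < start, the unsigned subtraction wraps to a huge
         * value that is never < len, so one subtract + compare
         * replaces (val >= start && val < start + len).  The wrap is
         * also what gives the useful behaviour when start + len would
         * overflow the type.
         */
        return (val - start) < len;
    }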