Alex Sierra [Tue, 31 May 2022 20:00:40 +0000 (15:00 -0500)]
tools: add hmm gup tests for device coherent type
The intention is to test hmm device coherent type under different get user
pages paths. Also, test gup with FOLL_LONGTERM flag set in device
coherent pages. These pages should get migrated back to system memory.
Link: https://lkml.kernel.org/r/20220531200041.24904-13-alex.sierra@amd.com Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Hildenbrand <david@redhat.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alex Sierra [Tue, 31 May 2022 20:00:39 +0000 (15:00 -0500)]
tools: update test_hmm script to support SP config
Add two more parameters to set spm_addr_dev0 & spm_addr_dev1 addresses.
These two parameters configure the start SP addresses for each device in
test_hmm driver. Consequently, this configures zone device type as
coherent.
Link: https://lkml.kernel.org/r/20220531200041.24904-12-alex.sierra@amd.com Signed-off-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alex Sierra [Tue, 31 May 2022 20:00:38 +0000 (15:00 -0500)]
tools: update hmm-test to support device coherent type
Test cases such as migrate_fault and migrate_multiple, were modified to
explicit migrate from device to sys memory without the need of page
faults, when using device coherent type.
Snapshot test case updated to read memory device type first and based on
that, get the proper returned results migrate_ping_pong test case added to
test explicit migration from device to sys memory for both private and
coherent zone types.
Helpers to migrate from device to sys memory and vicerversa were also
added.
Link: https://lkml.kernel.org/r/20220531200041.24904-11-alex.sierra@amd.com Signed-off-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alex Sierra [Tue, 31 May 2022 20:00:37 +0000 (15:00 -0500)]
lib: add support for device coherent type in test_hmm
Device Coherent type uses device memory that is coherently accesible by
the CPU. This could be shown as SP (special purpose) memory range at the
BIOS-e820 memory enumeration. If no SP memory is supported in system,
this could be faked by setting CONFIG_EFI_FAKE_MEMMAP.
Currently, test_hmm only supports two different SP ranges of at least
256MB size. This could be specified in the kernel parameter variable
efi_fake_mem. Ex. Two SP ranges of 1GB starting at 0x100000000 &
0x140000000 physical address. Ex.
efi_fake_mem=1G@0x100000000:0x40000,1G@0x140000000:0x40000
Private and coherent device mirror instances can be created in the same
probed. This is done by passing the module parameters spm_addr_dev0 &
spm_addr_dev1. In this case, it will create four instances of
device_mirror. The first two correspond to private device type, the last
two to coherent type. Then, they can be easily accessed from user space
through /dev/hmm_mirror<num_device>. Usually num_device 0 and 1 are for
private, and 2 and 3 for coherent types. If no module parameters are
passed, two instances of private type device_mirror will be created only.
Link: https://lkml.kernel.org/r/20220531200041.24904-10-alex.sierra@amd.com Signed-off-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alistair Poppple <apopple@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Hildenbrand <david@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alex Sierra [Tue, 31 May 2022 20:00:36 +0000 (15:00 -0500)]
lib: test_hmm add module param for zone device type
In order to configure device coherent in test_hmm, two module parameters
should be passed, which correspond to the SP start address of each device
(2) spm_addr_dev0 & spm_addr_dev1. If no parameters are passed, private
device type is configured.
Link: https://lkml.kernel.org/r/20220531200041.24904-9-alex.sierra@amd.com Signed-off-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alistair Poppple <apopple@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alex Sierra [Tue, 31 May 2022 20:00:35 +0000 (15:00 -0500)]
lib: test_hmm add ioctl to get zone device type
Add new ioctl cmd to query zone device type. This will be used once the
test_hmm adds zone device coherent type.
Link: https://lkml.kernel.org/r/20220531200041.24904-8-alex.sierra@amd.com Signed-off-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alistair Poppple <apopple@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alex Sierra [Tue, 31 May 2022 20:00:34 +0000 (15:00 -0500)]
drm/amdkfd: add SPM support for SVM
When CPU is connected throug XGMI, it has coherent access to VRAM
resource. In this case that resource is taken from a table in the device
gmc aperture base. This resource is used along with the device type,
which could be DEVICE_PRIVATE or DEVICE_COHERENT to create the device page
map region.
Also, MIGRATE_VMA_SELECT_DEVICE_COHERENT flag is selected for coherent
type case during migration to device.
Link: https://lkml.kernel.org/r/20220531200041.24904-7-alex.sierra@amd.com Signed-off-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alistair Popple [Tue, 31 May 2022 20:00:33 +0000 (15:00 -0500)]
mm/gup: migrate device coherent pages when pinning instead of failing
Currently any attempts to pin a device coherent page will fail. This is
because device coherent pages need to be managed by a device driver, and
pinning them would prevent a driver from migrating them off the device.
However this is no reason to fail pinning of these pages. These are
coherent and accessible from the CPU so can be migrated just like pinning
ZONE_MOVABLE pages. So instead of failing all attempts to pin them first
try migrating them out of ZONE_DEVICE.
[hch@lst.de: rebased to the split device memory checks, moved migrate_device_page to migrate_device.c] Link: https://lkml.kernel.org/r/20220531200041.24904-6-alex.sierra@amd.com Signed-off-by: Alistair Popple <apopple@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alistair Popple [Tue, 31 May 2022 20:00:32 +0000 (15:00 -0500)]
mm: remove the vma check in migrate_vma_setup()
migrate_vma_setup() checks that a valid vma is passed so that the page
tables can be walked to find the pfns associated with a given address
range. However in some cases the pfns are already known, such as when
migrating device coherent pages during pin_user_pages() meaning a valid
vma isn't required.
Link: https://lkml.kernel.org/r/20220531200041.24904-5-alex.sierra@amd.com Signed-off-by: Alistair Popple <apopple@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alex Sierra [Tue, 31 May 2022 20:00:31 +0000 (15:00 -0500)]
mm: add device coherent vma selection for memory migration
This case is used to migrate pages from device memory, back to system
memory. Device coherent type memory is cache coherent from device and CPU
point of view.
Link: https://lkml.kernel.org/r/20220531200041.24904-4-alex.sierra@amd.com Signed-off-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alistair Poppple <apopple@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alex Sierra [Tue, 31 May 2022 20:00:30 +0000 (15:00 -0500)]
mm: handling Non-LRU pages returned by vm_normal_pages
With DEVICE_COHERENT, we'll soon have vm_normal_pages() return
device-managed anonymous pages that are not LRU pages. Although they
behave like normal pages for purposes of mapping in CPU page, and for COW.
They do not support LRU lists, NUMA migration or THP.
We also introduced a FOLL_LRU flag that adds the same behaviour to
follow_page and related APIs, to allow callers to specify that they expect
to put pages on an LRU list.
Link: https://lkml.kernel.org/r/20220531200041.24904-3-alex.sierra@amd.com Signed-off-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Hildenbrand <david@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alex Sierra [Tue, 31 May 2022 20:00:29 +0000 (15:00 -0500)]
mm: add zone device coherent type memory support
Patch series "Add MEMORY_DEVICE_COHERENT for coherent device memory mapping", v5.
This patch series introduces MEMORY_DEVICE_COHERENT, a type of memory
owned by a device that can be mapped into CPU page tables like
MEMORY_DEVICE_GENERIC and can also be migrated like MEMORY_DEVICE_PRIVATE.
This patch series is mostly self-contained except for a few places where
it needs to update other subsystems to handle the new memory type.
System stability and performance are not affected according to our ongoing
testing, including xfstests.
How it works: The system BIOS advertises the GPU device memory (aka VRAM)
as SPM (special purpose memory) in the UEFI system address map.
The amdgpu driver registers the memory with devmap as
MEMORY_DEVICE_COHERENT using devm_memremap_pages. The initial user for
this hardware page migration capability is the Frontier supercomputer
project. This functionality is not AMD-specific. We expect other GPU
vendors to find this functionality useful, and possibly other hardware
types in the future.
Our test nodes in the lab are similar to the Frontier configuration, with
.5 TB of system memory plus 256 GB of device memory split across 4 GPUs,
all in a single coherent address space. Page migration is expected to
improve application efficiency significantly. We will report empirical
results as they become available.
Coherent device type pages at gup are now migrated back to system memory
if they are being pinned long-term (FOLL_LONGTERM). The reason is, that
long-term pinning would interfere with the device memory manager owning
the device-coherent pages (e.g. evictions in TTM). These series
incorporate Alistair Popple patches to do this migration from
pin_user_pages() calls. hmm_gup_test has been added to hmm-test to test
different get user pages calls.
This series includes handling of device-managed anonymous pages returned
by vm_normal_pages. Although they behave like normal pages for purposes
of mapping in CPU page tables and for COW, they do not support LRU lists,
NUMA migration or THP.
We also introduce a FOLL_LRU flag that adds the same behaviour to
follow_page and related APIs, to allow callers to specify that they expect
to put pages on an LRU list.
This patch (od 13):
Device memory that is cache coherent from device and CPU point of view.
This is used on platforms that have an advanced system bus (like CAPI or
CXL). Any page of a process can be migrated to such memory. However, no
one should be allowed to pin such memory so that it can always be evicted.
[hch@lst.de: rebased ontop of the refcount changes, removed is_dev_private_or_coherent_page] Link: https://lkml.kernel.org/r/20220531200041.24904-1-alex.sierra@amd.com Link: https://lkml.kernel.org/r/20220531200041.24904-2-alex.sierra@amd.com Signed-off-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Miaohe Lin [Mon, 30 May 2022 11:30:16 +0000 (19:30 +0800)]
mm/migration: fix potential pte_unmap on an not mapped pte
__migration_entry_wait and migration_entry_wait_on_locked assume pte is
always mapped from caller. But this is not the case when it's called from
migration_entry_wait_huge and follow_huge_pmd. Add a hugetlbfs variant
that calls hugetlb_migration_entry_wait(ptep == NULL) to fix this issue.
Link: https://lkml.kernel.org/r/20220530113016.16663-5-linmiaohe@huawei.com Fixes: 30dad30922cc ("mm: migration: add migrate_entry_wait_huge()") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Suggested-by: David Hildenbrand <david@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Howells <dhowells@redhat.com> Cc: Huang Ying <ying.huang@intel.com> Cc: kernel test robot <lkp@intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Miaohe Lin [Mon, 30 May 2022 11:30:15 +0000 (19:30 +0800)]
mm/migration: return errno when isolate_huge_page failed
We might fail to isolate huge page due to e.g. the page is under
migration which cleared HPageMigratable. We should return errno in this
case rather than always return 1 which could confuse the user, i.e. the
caller might think all of the memory is migrated while the hugetlb page is
left behind. We make the prototype of isolate_huge_page consistent with
isolate_lru_page as suggested by Huang Ying and rename isolate_huge_page
to isolate_hugetlb as suggested by Muchun to improve the readability.
Link: https://lkml.kernel.org/r/20220530113016.16663-4-linmiaohe@huawei.com Fixes: e8db67eb0ded ("mm: migrate: move_pages() supports thp migration") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Suggested-by: Huang Ying <ying.huang@intel.com> Reported-by: kernel test robot <lkp@intel.com> (build error) Cc: Alistair Popple <apopple@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Miaohe Lin [Mon, 30 May 2022 11:30:14 +0000 (19:30 +0800)]
mm/migration: remove unneeded lock page and PageMovable check
When non-lru movable page was freed from under us, __ClearPageMovable must
have been done. So we can remove unneeded lock page and PageMovable check
here. Also free_pages_prepare() will clear PG_isolated for us, so we can
further remove ClearPageIsolated as suggested by David.
Link: https://lkml.kernel.org/r/20220530113016.16663-3-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Howells <dhowells@redhat.com> Cc: Huang Ying <ying.huang@intel.com> Cc: kernel test robot <lkp@intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Liam R. Howlett [Tue, 21 Jun 2022 20:47:14 +0000 (20:47 +0000)]
mm/mmap.c: pass in mapping to __vma_link_file()
__vma_link_file() resolves the mapping from the file, if there is one.
Pass through the mapping and check the vm_file externally since most
places already have the required information and check of vm_file.
Liam R. Howlett [Tue, 21 Jun 2022 20:47:14 +0000 (20:47 +0000)]
mm: remove the vma linked list
Replace any vm_next use with vma_find().
Update free_pgtables(), unmap_vmas(), and zap_page_range() to use the
maple tree.
Use the new free_pgtables() and unmap_vmas() in do_mas_align_munmap(). At
the same time, alter the loop to be more compact.
Now that free_pgtables() and unmap_vmas() take a maple tree as an
argument, rearrange do_mas_align_munmap() to use the new tree to hold the
vmas to remove.
Remove __vma_link_list() and __vma_unlink_list() as they are exclusively
used to update the linked list.
Drop linked list update from __insert_vm_struct().
Rework validation of tree as it was depending on the linked list.
Liam R. Howlett [Tue, 21 Jun 2022 20:47:11 +0000 (20:47 +0000)]
mm/mempolicy: use vma iterator & maple state instead of vma linked list
Reworked the way mbind_range() finds the first VMA to reuse the maple
state and limit the number of tree walks needed.
Note, this drops the VM_BUG_ON(!vma) call, which would catch a start
address higher than the last VMA. The code was written in a way that
allowed no VMA updates to occur and still return success. There should be
no functional change to this scenario with the new code.
Matthew Wilcox (Oracle) [Tue, 21 Jun 2022 20:47:08 +0000 (20:47 +0000)]
sched: use maple tree iterator to walk VMAs
The linked list is slower than walking the VMAs using the maple tree. We
can't use the VMA iterator here because it doesn't support moving to an
earlier position.
Liam R. Howlett [Tue, 21 Jun 2022 20:47:07 +0000 (20:47 +0000)]
ipc/shm: use VMA iterator instead of linked list
The VMA iterator is faster than the linked llist, and it can be walked
even when VMAs are being removed from the address space, so there's no
need to keep track of 'next'.
Matthew Wilcox (Oracle) [Tue, 21 Jun 2022 20:47:06 +0000 (20:47 +0000)]
coredump: remove vma linked list walk
Use the Maple Tree iterator instead. This is too complicated for the VMA
iterator to handle, so let's open-code it for now. If this turns out to
be a common pattern, we can migrate it to common code.
Matthew Wilcox (Oracle) [Tue, 21 Jun 2022 20:47:05 +0000 (20:47 +0000)]
cxl: remove vma linked list walk
Use the VMA iterator instead. This requires a little restructuring of the
surrounding code to hoist the mm to the caller. That turns
cxl_prefault_one() into a trivial function, so call cxl_fault_segment()
directly.
Matthew Wilcox (Oracle) [Tue, 21 Jun 2022 20:47:05 +0000 (20:47 +0000)]
xtensa: remove vma linked list walks
Use the VMA iterator instead. Since VMA can no longer be NULL in the
loop, then deal with out-of-memory outside the loop. This means a
slightly longer run time in the failure case (-ENOMEM) - it will run to
the end of the VMAs before erroring instead of in the middle of the loop.
Liam R. Howlett [Tue, 21 Jun 2022 20:47:02 +0000 (20:47 +0000)]
mm/mmap: change do_brk_munmap() to use do_mas_align_munmap()
do_brk_munmap() has already aligned the address and has a maple tree state
to be used. Use the new do_mas_align_munmap() to avoid unnecessary
alignment and error checks.
Liam R. Howlett [Tue, 21 Jun 2022 20:47:01 +0000 (20:47 +0000)]
mm/mmap: reorganize munmap to use maple states
Remove __do_munmap() in favour of do_munmap(), do_mas_munmap(), and
do_mas_align_munmap().
do_munmap() is a wrapper to create a maple state for any callers that have
not been converted to the maple tree.
do_mas_munmap() takes a maple state to mumap a range. This is just a
small function which checks for error conditions and aligns the end of the
range.
do_mas_align_munmap() uses the aligned range to mumap a range.
do_mas_align_munmap() starts with the first VMA in the range, then finds
the last VMA in the range. Both start and end are split if necessary.
Then the VMAs are removed from the linked list and the mm mlock count is
updated at the same time. Followed by a single tree operation of
overwriting the area in with a NULL. Finally, the detached list is
unmapped and freed.
By reorganizing the munmap calls as outlined, it is now possible to avoid
extra work of aligning pre-aligned callers which are known to be safe,
avoid extra VMA lookups or tree walks for modifications.
detach_vmas_to_be_unmapped() is no longer used, so drop this code.
vm_brk_flags() can just call the do_mas_munmap() as it checks for
intersecting VMAs directly.
Liam R. Howlett [Tue, 21 Jun 2022 20:46:59 +0000 (20:46 +0000)]
mm: remove vmacache
By using the maple tree and the maple tree state, the vmacache is no
longer beneficial and is complicating the VMA code. Remove the vmacache
to reduce the work in keeping it up to date and code complexity.
Liam R. Howlett [Tue, 21 Jun 2022 20:46:59 +0000 (20:46 +0000)]
mm/mmap: use advanced maple tree API for mmap_region()
Changing mmap_region() to use the maple tree state and the advanced maple
tree interface allows for a lot less tree walking.
This change removes the last caller of munmap_vma_range(), so drop this
unused function.
Add vma_expand() to expand a VMA if possible by doing the necessary
hugepage check, uprobe_munmap of files, dcache flush, modifications then
undoing the detaches, etc.
Liam R. Howlett [Tue, 21 Jun 2022 20:46:58 +0000 (20:46 +0000)]
mm/mmap: change do_brk_flags() to expand existing VMA and add do_brk_munmap()
Avoid allocating a new VMA when it a vma modification can occur. When a
brk() can expand or contract a VMA, then the single store operation will
only modify one index of the maple tree instead of causing a node to split
or coalesce. This avoids unnecessary allocations/frees of maple tree
nodes and VMAs.
Move some limit & flag verifications out of the do_brk_flags() function to
use only relevant checks in the code path of bkr() and vm_brk_flags().
Set the vma to check if it can expand in vm_brk_flags() if extra criteria
are met.
Drop userfaultfd from do_brk_flags() path and only use it in
vm_brk_flags() path since that is the only place a munmap will happen.
Add a wraper for munmap for the brk case called do_brk_munmap().
Liam R. Howlett [Tue, 21 Jun 2022 20:46:58 +0000 (20:46 +0000)]
mm/khugepaged: optimize collapse_pte_mapped_thp() by using vma_lookup()
vma_lookup() will walk the vma tree once and not continue to look for the
next vma. Since the exact vma is checked below, this is a more optimal
way of searching.
Liam R. Howlett [Tue, 21 Jun 2022 20:46:57 +0000 (20:46 +0000)]
mm: optimize find_exact_vma() to use vma_lookup()
Use vma_lookup() to walk the tree to the start value requested. If the
vma at the start does not match, then the answer is NULL and there is no
need to look at the next vma the way that find_vma() would.
Liam R. Howlett [Tue, 21 Jun 2022 20:46:57 +0000 (20:46 +0000)]
xen: use vma_lookup() in privcmd_ioctl_mmap()
vma_lookup() walks the VMA tree for a specific value, find_vma() will
search the tree after walking to a specific value. It is more efficient
to only walk to the requested value since privcmd_ioctl_mmap() will exit
the loop if vm_start != msg->va.
Liam R. Howlett [Tue, 21 Jun 2022 20:46:57 +0000 (20:46 +0000)]
mmap: change zeroing of maple tree in __vma_adjust()
Only write to the maple tree if we are not inserting or the insert isn't
going to overwrite the area to clear. This avoids spanning writes and
node coealescing when unnecessary.
The change requires a custom search for the linked list addition to find
the correct VMA for the prev link.
Liam R. Howlett [Tue, 21 Jun 2022 20:46:56 +0000 (20:46 +0000)]
damon: convert __damon_va_three_regions to use the VMA iterator
This rather specialised walk can use the VMA iterator. If this proves to
be too slow, we can write a custom routine to find the two largest gaps,
but it will be somewhat complicated, so let's see if we need it first.
Update the kunit test case to use the maple tree. This also fixes an
issue with the kunit testcase not adding the last VMA to the list.
Link: https://lkml.kernel.org/r/20220504011215.661968-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20220621204632.3370049-16-Liam.Howlett@oracle.com Fixes: 17ccae8bb5c9 (mm/damon: add kunit tests) Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: SeongJae Park <sj@kernel.org> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Howells <dhowells@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Liam R. Howlett [Tue, 21 Jun 2022 20:46:55 +0000 (20:46 +0000)]
kernel/fork: use maple tree for dup_mmap() during forking
The maple tree was already tracking VMAs in this function by an earlier
commit, but the rbtree iterator was being used to iterate the list.
Change the iterator to use a maple tree native iterator and switch to the
maple tree advanced API to avoid multiple walks of the tree during insert
operations. Unexport the now-unused vma_store() function.
For performance reasons we bulk allocate the maple tree nodes. The node
calculations are done internally to the tree and use the VMA count and
assume the worst-case node requirements. The VM_DONT_COPY flag does not
allow for the most efficient copy method of the tree and so a bulk loading
algorithm is used.
Liam R. Howlett [Tue, 21 Jun 2022 20:46:55 +0000 (20:46 +0000)]
mm/mmap: use maple tree for unmapped_area{_topdown}
The maple tree code was added to find the unmapped area in a previous
commit and was checked against what the rbtree returned, but the actual
result was never used. Start using the maple tree implementation and
remove the rbtree code.
Add kernel documentation comment for these functions.
Liam R. Howlett [Tue, 21 Jun 2022 20:46:55 +0000 (20:46 +0000)]
mm/mmap: use the maple tree for find_vma_prev() instead of the rbtree
Use the maple tree's advanced API and a maple state to walk the tree for
the entry at the address of the next vma, then use the maple state to walk
back one entry to find the previous entry.
Add kernel documentation comments for this API.
Link: https://lkml.kernel.org/r/20220504010716.661115-14-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20220621204632.3370049-13-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Howells <dhowells@redhat.com> Cc: SeongJae Park <sj@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Matthew Wilcox (Oracle) [Tue, 21 Jun 2022 20:46:54 +0000 (20:46 +0000)]
mm: add VMA iterator
This thin layer of abstraction over the maple tree state is for iterating
over VMAs. You can go forwards, go backwards or ask where the iterator
is. Rename the existing vma_next() to __vma_next() -- it will be removed
by the end of this series.
Link: https://lkml.kernel.org/r/20220504010716.661115-11-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20220621204632.3370049-10-Liam.Howlett@oracle.com Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Howells <dhowells@redhat.com> Cc: SeongJae Park <sj@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Liam R. Howlett [Tue, 21 Jun 2022 20:46:53 +0000 (20:46 +0000)]
mm: start tracking VMAs with maple tree
Start tracking the VMAs with the new maple tree structure in parallel with
the rb_tree. Add debug and trace events for maple tree operations and
duplicate the rb_tree that is created on forks into the maple tree.
The maple tree is added to the mm_struct including the mm_init struct,
added support in required mm/mmap functions, added tracking in kernel/fork
for process forking, and used to find the unmapped_area and checked
against what the rbtree finds.
This also moves the mmap_lock() in exit_mmap() since the oom reaper call
does walk the VMAs. Otherwise lockdep will be unhappy if oom happens.
When splitting a vma fails due to allocations of the maple tree nodes,
the error path in __split_vma() calls new->vm_ops->close(new). The page
accounting for hugetlb is actually in the close() operation, so it
accounts for the removal of 1/2 of the VMA which was not adjusted. This
results in a negative exit value. To avoid the negative charge, set
vm_start = vm_end and vm_pgoff = 0.
There is also a potential accounting issue in special mappings from
insert_vm_struct() failing to allocate, so reverse the charge there in
the failure scenario.
Liam R. Howlett [Tue, 21 Jun 2022 20:46:48 +0000 (20:46 +0000)]
radix tree test suite: add allocation counts and size to kmem_cache
Add functions to get the number of allocations, and total allocations from
a kmem_cache. Also add a function to get the allocated size and a way to
zero the total allocations.
Liam R. Howlett [Tue, 21 Jun 2022 20:46:48 +0000 (20:46 +0000)]
radix tree test suite: add kmem_cache_set_non_kernel()
kmem_cache_set_non_kernel() is a mechanism to allow a certain number of
kmem_cache_alloc requests to succeed even when GFP_KERNEL is not set in
the flags. This functionality allows for testing different paths though
the code.
Liam R. Howlett [Tue, 21 Jun 2022 20:46:47 +0000 (20:46 +0000)]
Maple Tree: add new data structure
Patch series "Introducing the Maple Tree".
The maple tree is an RCU-safe range based B-tree designed to use modern
processor cache efficiently. There are a number of places in the kernel
that a non-overlapping range-based tree would be beneficial, especially
one with a simple interface. If you use an rbtree with other data
structures to improve performance or an interval tree to track
non-overlapping ranges, then this is for you.
The tree has a branching factor of 10 for non-leaf nodes and 16 for leaf
nodes. With the increased branching factor, it is significantly shorter
than the rbtree so it has fewer cache misses. The removal of the linked
list between subsequent entries also reduces the cache misses and the need
to pull in the previous and next VMA during many tree alterations.
The first user that is covered in this patch set is the vm_area_struct,
where three data structures are replaced by the maple tree: the augmented
rbtree, the vma cache, and the linked list of VMAs in the mm_struct. The
long term goal is to reduce or remove the mmap_lock contention.
The plan is to get to the point where we use the maple tree in RCU mode.
Readers will not block for writers. A single write operation will be
allowed at a time. A reader re-walks if stale data is encountered. VMAs
would be RCU enabled and this mode would be entered once multiple tasks
are using the mm_struct.
Davidlor said
: Yes I like the maple tree, and at this stage I don't think we can ask for
: more from this series wrt the MM - albeit there seems to still be some
: folks reporting breakage. Fundamentally I see Liam's work to (re)move
: complexity out of the MM (not to say that the actual maple tree is not
: complex) by consolidating the three complimentary data structures very
: much worth it considering performance does not take a hit. This was very
: much a turn off with the range locking approach, which worst case scenario
: incurred in prohibitive overhead. Also as Liam and Matthew have
: mentioned, RCU opens up a lot of nice performance opportunities, and in
: addition academia[1] has shown outstanding scalability of address spaces
: with the foundation of replacing the locked rbtree with RCU aware trees.
A similar work has been discovered in the academic press
Sheer coincidence. We designed our tree with the intention of solving the
hardest problem first. Upon settling on a b-tree variant and a rough
outline, we researched ranged based b-trees and RCU b-trees and did find
that article. So it was nice to find reassurances that we were on the
right path, but our design choice of using ranges made that paper unusable
for us.
This patch (of 69):
The maple tree is an RCU-safe range based B-tree designed to use modern
processor cache efficiently. There are a number of places in the kernel
that a non-overlapping range-based tree would be beneficial, especially
one with a simple interface. If you use an rbtree with other data
structures to improve performance or an interval tree to track
non-overlapping ranges, then this is for you.
The tree has a branching factor of 10 for non-leaf nodes and 16 for leaf
nodes. With the increased branching factor, it is significantly shorter
than the rbtree so it has fewer cache misses. The removal of the linked
list between subsequent entries also reduces the cache misses and the need
to pull in the previous and next VMA during many tree alterations.
The first user that is covered in this patch set is the vm_area_struct,
where three data structures are replaced by the maple tree: the augmented
rbtree, the vma cache, and the linked list of VMAs in the mm_struct. The
long term goal is to reduce or remove the mmap_lock contention.
The plan is to get to the point where we use the maple tree in RCU mode.
Readers will not block for writers. A single write operation will be
allowed at a time. A reader re-walks if stale data is encountered. VMAs
would be RCU enabled and this mode would be entered once multiple tasks
are using the mm_struct.
drivers/android/binder_alloc_selftest.c: In function 'binder_selftest_alloc':
drivers/android/binder_alloc_selftest.c:290:43: error: 'struct binder_alloc' has no member named 'vma'
290 | if (!binder_selftest_run || !alloc->vma)
Cc: Christian Brauner (Microsoft) <brauner@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hridya Valsaraju <hridya@google.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Martijn Coenen <maco@android.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Todd Kjos <tkjos@android.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Liam R. Howlett [Tue, 21 Jun 2022 01:09:09 +0000 (21:09 -0400)]
android: binder: stop saving a pointer to the VMA
Do not record a pointer to a VMA outside of the mmap_lock for later use.
This is unsafe and there are a number of failure paths *after* the
recorded VMA pointer may be freed during setup. There is no callback to
the driver to clear the saved pointer from generic mm code. Furthermore,
the VMA pointer may become stale if any number of VMA operations end up
freeing the VMA so saving it was fragile to being with.
Instead, change the binder_alloc struct to record the start address of the
VMA and use vma_lookup() to get the vma when needed. Add lockdep
mmap_lock checks on updates to the vma pointer to ensure the lock is held
and depend on that lock for synchronization of readers and writers - which
was already the case anyways, so the smp_wmb()/smp_rmb() was not
necessary.
Link: https://lkml.kernel.org/r/20220621140212.vpkio64idahetbyf@revolver Fixes: da1b9564e85b ("android: binder: fix the race mmap and alloc_new_buf_locked") Reported-by: syzbot+58b51ac2b04e388ab7b0@syzkaller.appspotmail.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Christian Brauner (Microsoft) <brauner@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hridya Valsaraju <hridya@google.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Martijn Coenen <maco@android.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Todd Kjos <tkjos@android.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
NeilBrown [Sun, 26 Jun 2022 22:40:31 +0000 (15:40 -0700)]
mm: discard __GFP_ATOMIC
__GFP_ATOMIC serves little purpose. Its main effect is to set
ALLOC_HARDER which adds a few little boosts to increase the chance of an
allocation succeeding, one of which is to lower the water-mark at which it
will succeed.
It is *always* paired with __GFP_HIGH which sets ALLOC_HIGH which also
adjusts this watermark. It is probable that other users of __GFP_HIGH
should benefit from the other little bonuses that __GFP_ATOMIC gets.
__GFP_ATOMIC also gives a warning if used with __GFP_DIRECT_RECLAIM.
There is little point to this. We already get a might_sleep() warning if
__GFP_DIRECT_RECLAIM is set.
__GFP_ATOMIC allows the "watermark_boost" to be side-stepped. It is
probable that testing ALLOC_HARDER is a better fit here.
__GFP_ATOMIC is used by tegra-smmu.c to check if the allocation might
sleep. This should test __GFP_DIRECT_RECLAIM instead.
This patch:
- removes __GFP_ATOMIC
- causes __GFP_HIGH to set ALLOC_HARDER unless __GFP_NOMEMALLOC is set
(as well as ALLOC_HIGH).
- makes other adjustments as suggested by the above.
The net result is not change to GFP_ATOMIC allocations. Other
allocations that use __GFP_HIGH will benefit from a few different extra
privileges. This affects:
xen, dm, md, ntfs3
the vermillion frame buffer
hibernation
ksm
swap
all of which likely produce more benefit than cost if these selected
allocation are more likely to succeed quickly.
Link: https://lkml.kernel.org/r/163712397076.13692.4727608274002939094@noble.neil.brown.name Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Yang Shi [Fri, 13 May 2022 19:17:05 +0000 (12:17 -0700)]
mm/page_vma_mapped.c: check possible huge PMD map with transhuge_vma_suitable()
IIUC page_vma_mapped_walk() checks if the vma is possibly huge PMD mapped
with transparent_hugepage_active() and "pvmw->nr_pages >= HPAGE_PMD_NR".
Actually pvmw->nr_pages is returned by compound_nr() or folio_nr_pages(),
so the page should be THP as long as "pvmw->nr_pages >= HPAGE_PMD_NR".
And it is guaranteed THP is allocated for valid VMA in the first place.
But it may be not PMD mapped if the VMA is file VMA and it is not properly
aligned. The transhuge_vma_suitable() is used to do such check, so
replace transparent_hugepage_active() to it, which is too heavy and
overkilling.
Link: https://lkml.kernel.org/r/20220513191705.457775-1-shy828301@gmail.com Signed-off-by: Yang Shi <shy828301@gmail.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Yang Shi [Sun, 26 Jun 2022 22:40:30 +0000 (15:40 -0700)]
mm: rmap: use the correct parameter name for DEFINE_PAGE_VMA_WALK
The parameter used by DEFINE_PAGE_VMA_WALK is _page not page, fix the
parameter name. It didn't cause any build error, it is probably because
the only caller is write_protect_page() from ksm.c, which pass in page.
Link: https://lkml.kernel.org/r/20220512174551.81279-1-shy828301@gmail.com Fixes: 2aff7a4755be ("mm: Convert page_vma_mapped_walk to work on PFNs") Signed-off-by: Yang Shi <shy828301@gmail.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This commit introduced a regression that can cause mount hung. The
changes in __ocfs2_find_empty_slot causes that any node with none-zero
node number can grab the slot that was already taken by node 0, so node 1
will access the same journal with node 0, when it try to grab journal
cluster lock, it will hung because it was already acquired by node 0.
It's very easy to reproduce this, in one cluster, mount node 0 first, then
node 1, you will see the following call trace from node 1.
To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
introduced the feature to mount ocfs2 locally even it is cluster based,
that is a very dangerous, it can easily cause serious data corruption,
there is no way to stop other nodes mounting the fs and corrupting it.
Setup ha or other cluster-aware stack is just the cost that we have to
take for avoiding corruption, otherwise we have to do it in kernel.
Link: https://lkml.kernel.org/r/20220603222801.42488-1-junxiao.bi@oracle.com Fixes: 912f655d78c5("ocfs2: mount shared volume without ha stack") Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <heming.zhao@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Gautam Menghani [Sun, 26 Jun 2022 17:03:55 +0000 (22:33 +0530)]
mm/kasan: fix null pointer dereference warning in qlink_to_cache()
virt_to_slab() declared in slab.h can return NULL if the address does not
belong to a slab. This case is not handled in qlink_to_cache() in
quarantine.c, which can cause a NULL pointer dereference in
"virt_to_slab(qlink)->slab_cache". This issue was discovered by fanalyzer
(my gcc version: 12.1.1 20220507)
Gowans, James [Thu, 23 Jun 2022 05:24:03 +0000 (05:24 +0000)]
mm: split huge PUD on wp_huge_pud fallback
Currently the implementation will split the PUD when a fallback is taken
inside the create_huge_pud function. This isn't where it should be done:
the splitting should be done in wp_huge_pud, just like it's done for PMDs.
Reason being that if a callback is taken during create, there is no PUD
yet so nothing to split, whereas if a fallback is taken when encountering
a write protection fault there is something to split.
It looks like this was the original intention with the commit where the
splitting was introduced, but somehow it got moved to the wrong place
between v1 and v2 of the patch series. Rebase mistake perhaps.
Link: https://lkml.kernel.org/r/6f48d622eb8bce1ae5dd75327b0b73894a2ec407.camel@amazon.com Fixes: 327e9fd48972 ("mm: Split huge pages on write-notify or COW") Signed-off-by: James Gowans <jgowans@amazon.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Christian König <christian.koenig@amd.com> Cc: Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ryusuke Konishi [Thu, 23 Jun 2022 08:54:01 +0000 (17:54 +0900)]
nilfs2: fix incorrect masking of permission flags for symlinks
The permission flags of newly created symlinks are wrongly dropped on
nilfs2 with the current umask value even though symlinks should have 777
(rwxrwxrwx) permissions:
David Hildenbrand [Thu, 23 Jun 2022 20:53:32 +0000 (22:53 +0200)]
mm/rmap: fix dereferencing invalid subpage pointer in try_to_migrate_one()
The subpage we calculate is an invalid pointer for device private pages,
because device private pages are mapped via non-present device private
entries, not ordinary present PTEs.
Let's just not compute broken pointers and fixup later. Move the proper
assignment of the correct subpage to the beginning of the function and
assert that we really only have a single page in our folio.
This currently results in a BUG when tying to compute anon_exclusive,
because:
YueHaibing [Fri, 24 Jun 2022 08:52:36 +0000 (16:52 +0800)]
riscv/mm: fix build error while PAGE_TABLE_CHECK enabled without MMU
mm/page_table_check.c: In function `__page_table_check_pte_clear':
mm/page_table_check.c:148:6: error: implicit declaration of function `pte_user_accessible_page'; did you mean `user_access_save'? [-Werror=implicit-function-declaration]
if (pte_user_accessible_page(pte)) {
^~~~~~~~~~~~~~~~~~~~~~~~
user_access_save
ARCH_SUPPORTS_PAGE_TABLE_CHECK should only enabled with MMU.
Link: https://lkml.kernel.org/r/20220624085236.18544-1-yuehaibing@huawei.com Fixes: 3fee229a8eb9 ("riscv/mm: enable ARCH_SUPPORTS_PAGE_TABLE_CHECK") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Tong Tiangen <tongtiangen@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Bagas Sanjaya [Wed, 22 Jun 2022 08:45:46 +0000 (15:45 +0700)]
Documentation: highmem: use literal block for code example in highmem.h comment
When building htmldocs on Linus's tree, there are inline emphasis warnings
on include/linux/highmem.h:
Documentation/vm/highmem:166: ./include/linux/highmem.h:154: WARNING: Inline emphasis start-string without end-string.
Documentation/vm/highmem:166: ./include/linux/highmem.h:157: WARNING: Inline emphasis start-string without end-string.
These warnings above are due to comments in code example at the mentioned
lines above are enclosed by double dash (--), which confuses Sphinx as
inline markup delimiters instead.
Fix these warnings by indenting the code example with literal block
indentation and making the comments C comments.
Link: https://lkml.kernel.org/r/20220622084546.17745-1-bagasdotme@gmail.com Fixes: 85a85e7601263f ("Documentation/vm: move "Using kmap-atomic" to highmem.h") Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: "Fabio M. De Francesco" <fmdefrancesco@gmail.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>