From: Mike Kravetz
Date: Thu, 8 Nov 2018 00:10:28 +0000 (-0800)
Subject: hugetlbfs: use truncate mutex to prevent pmd sharing race
X-Git-Tag: v4.1.12-124.31.3~371
X-Git-Url: https://www.infradead.org/git/?a=commitdiff_plain;h=c4b649c0d657bb58343bddc7ae1474426b63fbdc;p=users%2Fjedix%2Flinux-maple.git

hugetlbfs: use truncate mutex to prevent pmd sharing race

The synchronization mechanism for hugetlbfs pagefaults/truncation and
pmd sharing ideally needs to be modified to use i_mmap_rwsem. See:
http://lkml.kernel.org/r/20181024045053.1467-1-mike.kravetz@oracle.com

In UEK, we have introduced a hugetlbfs truncate mutex in an inode
extension. By taking this mutex earlier in hugetlb_fault (before
calling huge_pte_alloc), we eliminate the most common cause of
problems where ptep can be altered by a call to huge_pmd_unshare.

Orabug: 28896255

Signed-off-by: Mike Kravetz
Reviewed-by: Larry Bassel
Signed-off-by: Brian Maly
---

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1096f53b5f9c9..c972acfe94616 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3792,24 +3792,33 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 				VM_FAULT_SET_HINDEX(hstate_index(h));
 	}
 
+	/*
+	 * Use truncate mutex to serialize truncation and page faults. This
+	 * prevents ANY faults from happening on the file during truncation.
+	 *
+	 * Acquire truncate mutex BEFORE calling huge_pte_alloc. This
+	 * protects us from calls to huge_pmd_unshare that may invalidate
+	 * ptep. Something besides trunc_rwsem really should be used for this
+	 * synchronization. This is a less than ideal solution, but protects
+	 * us in UEK kernels.
+	 */
+	mapping = vma->vm_file->f_mapping;
+	hinode_info = HUGETLBFS_I(mapping->host);
+	down_read(&hinode_info->trunc_rwsem);
 	ptep = huge_pte_alloc(mm, address, huge_page_size(h));
-	if (!ptep)
+	if (!ptep) {
+		up_read(&hinode_info->trunc_rwsem);
 		return VM_FAULT_OOM;
-
-	mapping = vma->vm_file->f_mapping;
-	idx = vma_hugecache_offset(h, vma, address);
+	}
 
 	/*
-	 * Use truncate mutex to serialize truncation and page faults. This
-	 * prevents ANY faults from happening on the file during truncation.
 	 * The fault mutex serializes hugepage allocation and instantiation
 	 * on the same page. This prevents spurious allocation failures if
 	 * two CPUs race to instantiate the same page in the page cache.
 	 *
-	 * Acquire truncate mutex BEFORE fault mutex.
+	 * Acquire fault mutex AFTER truncate mutex.
 	 */
-	hinode_info = HUGETLBFS_I(mapping->host);
-	down_read(&hinode_info->trunc_rwsem);
+	idx = vma_hugecache_offset(h, vma, address);
 	hash = hugetlb_fault_mutex_hash(h, mm, vma, mapping, idx, address);
 	mutex_lock(&hugetlb_fault_mutex_table[hash]);
 