From: Mike Kravetz <mike.kravetz@oracle.com>
Date: Thu, 8 Nov 2018 00:10:28 +0000 (-0800)
Subject: hugetlbfs: use truncate mutex to prevent pmd sharing race
X-Git-Tag: v4.1.12-124.31.3~371
X-Git-Url: https://www.infradead.org/git/?a=commitdiff_plain;h=c4b649c0d657bb58343bddc7ae1474426b63fbdc;p=users%2Fjedix%2Flinux-maple.git

hugetlbfs: use truncate mutex to prevent pmd sharing race

The synchronization mechanism for hugetlbfs page faults/truncation and
pmd sharing should ideally be changed to use i_mmap_rwsem.  See:
http://lkml.kernel.org/r/20181024045053.1467-1-mike.kravetz@oracle.com

In UEK, we have introduced a hugetlbfs truncate mutex (trunc_rwsem) in an
inode extension.  By taking this mutex earlier in hugetlb_fault (before
calling huge_pte_alloc), we eliminate the most common cause of problems,
in which ptep is invalidated by a call to huge_pmd_unshare.

Orabug: 28896255

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Larry Bassel <larry.bassel@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
---

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1096f53b5f9c9..c972acfe94616 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3792,24 +3792,33 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 				VM_FAULT_SET_HINDEX(hstate_index(h));
 	}
 
+	/*
+	 * Use truncate mutex to serialize truncation and page faults.  This
+	 * prevents ANY faults from happening on the file during truncation.
+	 *
+	 * Acquire truncate mutex BEFORE calling huge_pte_alloc.  This
+	 * protects us from calls to huge_pmd_unshare that may invalidate
+	 * ptep.  Something besides trunc_rwsem really should be used for this
+	 * synchronization.  This is a less than ideal solution, but protects
+	 * us in UEK kernels.
+	 */
+	mapping = vma->vm_file->f_mapping;
+	hinode_info = HUGETLBFS_I(mapping->host);
+	down_read(&hinode_info->trunc_rwsem);
 	ptep = huge_pte_alloc(mm, address, huge_page_size(h));
-	if (!ptep)
+	if (!ptep) {
+		up_read(&hinode_info->trunc_rwsem);
 		return VM_FAULT_OOM;
-
-	mapping = vma->vm_file->f_mapping;
-	idx = vma_hugecache_offset(h, vma, address);
+	}
 
 	/*
-	 * Use truncate mutex to serialize truncation and page faults.  This
-	 * prevents ANY faults from happening on the file during truncation.
 	 * The fault mutex serializes hugepage allocation and instantiation
 	 * on the same page.  This prevents spurious allocation failures if
 	 * two CPUs race to instantiate the same page in the page cache.
 	 *
-	 * Acquire truncate mutex BEFORE fault mutex.
+	 * Acquire fault mutex AFTER truncate mutex.
 	 */
-	hinode_info = HUGETLBFS_I(mapping->host);
-	down_read(&hinode_info->trunc_rwsem);
+	idx = vma_hugecache_offset(h, vma, address);
 	hash = hugetlb_fault_mutex_hash(h, mm, vma, mapping, idx, address);
 	mutex_lock(&hugetlb_fault_mutex_table[hash]);