From: Suren Baghdasaryan
Date: Tue, 29 Nov 2022 06:13:25 +0000 (+0000)
Subject: WORKAROUND: mas_walk() in RCU mode returns value 0xE
X-Git-Url: https://www.infradead.org/git/?a=commitdiff_plain;h=9f4ad814ada8c49fc3d68bb4e2bda92c9b7ac9ac;p=users%2Fjedix%2Flinux-maple.git

WORKAROUND: mas_walk() in RCU mode returns value 0xE

Mel Gorman reported an issue with mas_walk() occasionally returning 0xE,
causing a NULL pointer dereference:

[ 3611.890629][T12506] BUG: kernel NULL pointer dereference, address: 000000000000009e
[ 3611.898293][T12506] #PF: supervisor read access in kernel mode
[ 3611.904127][T12506] #PF: error_code(0x0000) - not-present page
[ 3611.909961][T12506] PGD 5022e39067 P4D 5022e39067 PUD 502344b067 PMD 0
[ 3611.916571][T12506] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 3611.921629][T12506] CPU: 35 PID: 12506 Comm: pft Tainted: G E 6.1.0-rc6-mm-pervmalock-v2r3 #1 5eb359771d3fe61fee3afacf8357ad1393b861b8
[ 3611.935095][T12506] Hardware name: Dell Inc. PowerEdge R640/0H28RR, BIOS 2.8.2 08/27/2020
[ 3611.943266][T12506] RIP: 0010:lock_vma_under_rcu+0x70/0x140 # faddr2line matches mm/memory.c:5268

The code causing the crash is the vma_is_anonymous() call, which accesses
vma->vm_ops. With vm_ops being at offset 0x90 in vm_area_struct, the vma
returned by mas_walk() must be 0xE (0x9e-0x90).

Work around this issue for testing until we figure out why this is
happening.

Signed-off-by: Suren Baghdasaryan
---

diff --git a/mm/memory.c b/mm/memory.c
index 1f77d337677e..9738c3fda771 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5283,6 +5283,18 @@ retry:
 	if (!vma)
 		goto inval;
 
+	/* A known issue to be investigated */
+	if (WARN_ON(vma <= (struct vm_area_struct *)0xFF)) {
+		trace_printk("RECEIVED VMA 0x%lx for addr 0x%lx\n",
+			(unsigned long)vma, address);
+		printk("GOT AN INVALID VMA 0x%lx for address 0x%lx\n",
+			(unsigned long)vma, address);
+#ifdef CONFIG_DEBUG_MAPLE_TREE
+		mt_dump(&mm->mm_mt);
+#endif
+		goto inval;
+	}
+
 	/* Only anonymous vmas are supported for now */
 	if (!vma_is_anonymous(vma))
 		goto inval;
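
The offset arithmetic in the commit message (fault address 0x9e, vm_ops at offset 0x90, hence a vma value of 0xE) can be checked in userspace. The following is only a sketch: `struct mock_vma` and the three helper functions are hypothetical names introduced for illustration, and the mock models nothing of the real vm_area_struct except the 0x90 offset quoted above.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* uintptr_t */

/*
 * Hypothetical stand-in for struct vm_area_struct. Only the position of
 * vm_ops (0x90, as stated in the commit message) is modelled; the padding
 * replaces every field that precedes it in the real structure.
 */
struct mock_vma {
	char pad[0x90];
	const void *vm_ops;
};

static_assert(offsetof(struct mock_vma, vm_ops) == 0x90,
	      "mock layout must place vm_ops at offset 0x90");

/* Address that is read when code evaluates vma->vm_ops. */
static uintptr_t vm_ops_access_addr(uintptr_t vma)
{
	return vma + offsetof(struct mock_vma, vm_ops);
}

/* Recover the vma value from the faulting address shown in the oops. */
static uintptr_t vma_from_fault_addr(uintptr_t fault_addr)
{
	return fault_addr - offsetof(struct mock_vma, vm_ops);
}

/*
 * Same threshold as the workaround: a non-NULL value at or below 0xFF is
 * treated as invalid. NULL is excluded here because the patched code
 * handles it in the preceding "if (!vma)" check.
 */
static int vma_value_is_bogus(uintptr_t vma)
{
	return vma != 0 && vma <= 0xFF;
}
```

Calling `vma_from_fault_addr(0x9e)` reproduces the 0xE value from the commit message, and `vma_value_is_bogus()` accepts that value, so the 0xFF threshold in the patch covers the reported case.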