This commit addresses two perf problems:
1) Perf does not produce accurate call graphs for the user space
processes being analyzed. When the perf counter interrupt (irq15)
interrupts a privileged context, the kernel's address space
identifier (ASI) is still in effect, and it is used to access
the user stack.
2) Spurious segfaults and bus errors occur in random processes,
including perf itself. This is caused by the same situation as
above: if the perf counter interrupt arrives in the middle of
handling a fault (e.g., a TLB miss or page fault), the running
thread's fault address and/or fault code (in thread_info) can be
inadvertently modified by a nested fault. The original fault then
becomes unresolvable, and the process is killed with a signal.
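This inadvertent modification happens because perf interrupt
processing can (and will) incur a fault itself while walking
the user stack, and this can overwrite the fault information
for the handler that was interrupted.
Fix both problems in perf_callchain_user(): switch %asi to
ASI_AIUS for the duration of the user stack walk, and save the
thread's fault address and fault code on entry so they can be
restored on exit.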
Orabug: 22350940
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
(cherry picked from commit 21c8eb7e6a89f6be2a90ab8044ff64ceea8c2b36)
void
perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
{
+ u64 saved_fault_address = current_thread_info()->fault_address;
+ u8 saved_fault_code = get_thread_fault_code();
+ u8 saved_asi;
+
perf_callchain_store(entry, regs->tpc);
if (!current->mm)
return;
+ /* Make sure we are set up to access user memory */
+ __asm__ __volatile__ ("rd %%asi, %0\n" : "=r" (saved_asi));
+ if (saved_asi != ASI_AIUS)
+ __asm__ __volatile__ (
+ "wr %%g0, %0, %%asi\n" : : "i" (ASI_AIUS));
+
flushw_user();
pagefault_disable();
perf_callchain_user_64(entry, regs);
pagefault_enable();
+
+ if (saved_asi != ASI_AIUS)
+ __asm__ __volatile__ (
+ "wr %%g0, %0, %%asi\n" : : "r" (saved_asi));
+
+ set_thread_fault_code(saved_fault_code);
+ current_thread_info()->fault_address = saved_fault_address;
}
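
The save/restore half of the change above can be illustrated outside the kernel. The following is a minimal user-space sketch, not kernel code: `struct fault_state`, `thread_fault`, `nested_stack_walk()`, `sample_unprotected()`, `sample_protected()`, and `fault_state_survives()` are all hypothetical names invented for this illustration, standing in for the thread_info fault fields and the user stack walk. It assumes only that the nested fault overwrites the same per-thread fields the interrupted handler depends on, as the commit message describes.

```c
#include <assert.h>   /* only needed for the checks in the usage note */
#include <stdint.h>

/* Hypothetical stand-in for the fault fields kept in thread_info. */
struct fault_state {
	uint64_t fault_address;
	uint8_t  fault_code;
};

/* Hypothetical per-thread state shared by all fault handling. */
static struct fault_state thread_fault;

/*
 * Simulates the user stack walk taking its own (nested) fault,
 * which overwrites the per-thread fault fields.
 */
static void nested_stack_walk(void)
{
	thread_fault.fault_address = 0xdeadbeefULL;
	thread_fault.fault_code = 0x7f;
}

/* Behavior before the fix: the nested fault's values leak out. */
static void sample_unprotected(void)
{
	nested_stack_walk();
}

/* Behavior after the fix: save on entry, restore on exit. */
static void sample_protected(void)
{
	struct fault_state saved = thread_fault;

	nested_stack_walk();
	thread_fault = saved;
}

/*
 * Sets up an in-progress fault, runs the given sampling routine as
 * if it had interrupted the handler, and returns 1 when the
 * handler's fault state is intact afterwards, 0 otherwise.
 */
static int fault_state_survives(void (*sample)(void))
{
	thread_fault.fault_address = 0x1000;
	thread_fault.fault_code = 2;

	sample();

	return thread_fault.fault_address == 0x1000 &&
	       thread_fault.fault_code == 2;
}
```

With these definitions, `fault_state_survives(sample_protected)` returns 1 and `fault_state_survives(sample_unprotected)` returns 0, which mirrors why the interrupted handler could no longer resolve its fault before the fix. The sketch does not model the %asi switch, which has no user-space equivalent.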