When waking from a CPU idle instruction (e.g., nap or stop), the sync
for ordering the KVM secondary thread state can be avoided if there
wakeup is coming from a kernel context rather than KVM context.
This improves performance for ping-pong benchmark with the stop0 idle
state by 0.46% for 2 threads in the same core, and 1.02% for different
cores.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
        mr      r3,r12
 
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+       lbz     r0,HSTATE_HWTHREAD_STATE(r13)
+       cmpwi   r0,KVM_HWTHREAD_IN_KERNEL
+       beq     1f
        li      r0,KVM_HWTHREAD_IN_KERNEL
        stb     r0,HSTATE_HWTHREAD_STATE(r13)
        /* Order setting hwthread_state vs. testing hwthread_req */