This code times K6_BUG_LOOP (1,000,000) indirect calls, so the
added overhead of reading the elapsed cycle count as a 64-bit
number should be insignificant.  Drop the optimization of using
a 32-bit count.
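
For illustration only (not part of this patch), here is a minimal
userspace sketch of the difference between the two ways of counting.
The helper names elapsed_low32(), elapsed_full64() and
check_elapsed_helpers() are hypothetical. The sketch assumes the old
rdtscl() path kept only the low 32 bits of each TSC sample, while
native_read_tsc() returns the full 64-bit value:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: elapsed cycles computed from only the low
 * 32 bits of each TSC sample (the assumed rdtscl() behaviour). */
static uint32_t elapsed_low32(uint64_t start, uint64_t end)
{
	return (uint32_t)end - (uint32_t)start;
}

/* Hypothetical helper: elapsed cycles from full 64-bit samples. */
static uint64_t elapsed_full64(uint64_t start, uint64_t end)
{
	return end - start;
}

/* Self-check: both agree for short intervals, even across a wrap of
 * the low word, but the 32-bit count loses intervals >= 2^32. */
static void check_elapsed_helpers(void)
{
	/* Short interval straddling a low-word wrap: both give 0x20. */
	assert(elapsed_low32(0xFFFFFFF0ULL, 0x100000010ULL) == 0x20);
	assert(elapsed_full64(0xFFFFFFF0ULL, 0x100000010ULL) == 0x20);

	/* Long interval: the 32-bit result is truncated modulo 2^32. */
	assert(elapsed_full64(100, 5000000100ULL) == 5000000000ULL);
	assert(elapsed_low32(100, 5000000100ULL) != 5000000000ULL);
}
```

The threshold tested in this code, 20*K6_BUG_LOOP, fits comfortably in
32 bits, so the practical difference only shows up if the loop takes
2^32 cycles or more; the 64-bit count removes that corner case.
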
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/d58f339a9c0dd8352b50d2f7a216f67ec2844f20.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
                const int K6_BUG_LOOP = 1000000;
                int n;
                void (*f_vide)(void);
-               unsigned long d, d2;
+               u64 d, d2;
 
                printk(KERN_INFO "AMD K6 stepping B detected - ");
 
 
                n = K6_BUG_LOOP;
                f_vide = vide;
-               rdtscl(d);
+               d = native_read_tsc();
                while (n--)
                        f_vide();
-               rdtscl(d2);
+               d2 = native_read_tsc();
                d = d2-d;
 
                if (d > 20*K6_BUG_LOOP)