From 7539b174aef405d9d57db48c58390ba360c91312 Mon Sep 17 00:00:00 2001 From: Marcelo Tosatti Date: Fri, 4 Jan 2019 15:54:14 -0200 Subject: [PATCH] x86: kvmguest: use TSC clocksource if invariant TSC is exposed The invariant TSC bit has the following meaning: "The time stamp counter in newer processors may support an enhancement, referred to as invariant TSC. Processor's support for invariant TSC is indicated by CPUID.80000007H:EDX[8]. The invariant TSC will run at a constant rate in all ACPI P-, C-. and T-states. This is the architectural behavior moving forward. On processors with invariant TSC support, the OS may use the TSC for wall clock timer services (instead of ACPI or HPET timers). TSC reads are much more efficient and do not incur the overhead associated with a ring transition or access to a platform resource." IOW, TSC does not change frequency. In such case, and with TSC scaling hardware available to handle migration, it is possible to use the TSC clocksource directly, whose system calls are faster. Reduce the rating of kvmclock clocksource to allow TSC clocksource to be the default if invariant TSC is exposed. Signed-off-by: Marcelo Tosatti v2: Use feature bits and tsc_unstable() check (Sean Christopherson) Signed-off-by: Paolo Bonzini --- arch/x86/kernel/kvmclock.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c index d908a37bf3f3a..904494b924c13 100644 --- a/arch/x86/kernel/kvmclock.c +++ b/arch/x86/kernel/kvmclock.c @@ -351,6 +351,20 @@ void __init kvmclock_init(void) machine_ops.crash_shutdown = kvm_crash_shutdown; #endif kvm_get_preset_lpj(); + + /* + * X86_FEATURE_NONSTOP_TSC is TSC runs at constant rate + * with P/T states and does not stop in deep C-states. + * + * Invariant TSC exposed by host means kvmclock is not necessary: + * can use TSC as clocksource. + * + */ + if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) && + boot_cpu_has(X86_FEATURE_NONSTOP_TSC) && + !check_tsc_unstable()) + kvm_clock.rating = 299; + clocksource_register_hz(&kvm_clock, NSEC_PER_SEC); pv_info.name = "KVM"; } -- 2.51.0