In a surprising turn of events, it transpires that CPU capabilities
configured as ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE are never set as the
result of late-onlining. Therefore our handling of erratum 
1418040 does
not get activated if it is not required by any of the boot CPUs, even
though we allow late-onlining of an affected CPU.
In order to get things working again, replace the cpus_have_const_cap()
invocation with an explicit check for the current CPU using
this_cpu_has_cap().
Cc: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Cc: Stephen Boyd <swboyd@chromium.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201106114952.10032-1-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
 
 /*
  * CPU feature detected at boot time based on feature of one or more CPUs.
  * All possible conflicts for a late CPU are ignored.
+ * NOTE: this means that a late CPU with the feature will *not* cause the
+ * capability to be advertised by cpus_have_*cap()!
  */
 #define ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE            \
        (ARM64_CPUCAP_SCOPE_LOCAL_CPU           |       \
 
        bool prev32, next32;
        u64 val;
 
-       if (!(IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) &&
-             cpus_have_const_cap(ARM64_WORKAROUND_1418040)))
+       if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))
                return;
 
        prev32 = is_compat_thread(task_thread_info(prev));
        next32 = is_compat_thread(task_thread_info(next));
 
-       if (prev32 == next32)
+       if (prev32 == next32 || !this_cpu_has_cap(ARM64_WORKAROUND_1418040))
                return;
 
        val = read_sysreg(cntkctl_el1);