BorisO reports that misc_register() fails often on xen. The current code
unregisters the CPU hotplug notifier in that case. If then a CPU is
offlined and onlined back again, we end up with a second timer running
on that CPU, leading to soft lockups and system hangs.
So let's leave the hotcpu notifier always registered - even if
mce_device_create failed for some cores and never unreg it so that we
can deal with the timer handling accordingly.
Reported-and-Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1403274493-1371-1-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Borislav Petkov <bp@suse.de>
        for_each_online_cpu(i) {
                err = mce_device_create(i);
                if (err) {
+                       /*
+                        * Register notifier anyway (and do not unreg it) so
+                        * that we don't leave undeleted timers, see notifier
+                        * callback above.
+                        */
+                       __register_hotcpu_notifier(&mce_cpu_notifier);
                        cpu_notifier_register_done();
                        goto err_device_create;
                }
 err_register:
        unregister_syscore_ops(&mce_syscore_ops);
 
-       cpu_notifier_register_begin();
-       __unregister_hotcpu_notifier(&mce_cpu_notifier);
-       cpu_notifier_register_done();
-
 err_device_create:
        /*
         * We didn't keep track of which devices were created above, but