We need to wait until the percpu_ref is released before exit. Otherwise,
we sometimes lose the race and trigger this new warning that was added
in v4.9 (commit 
a67823c1ed10 "percpu-refcount: init ->confirm_switch
member properly"):
 WARNING: CPU: 0 PID: 3629 at lib/percpu-refcount.c:107 percpu_ref_exit+0x51/0x60
 [..]
 Call Trace:
  [<
ffffffff814bf093>] dump_stack+0x85/0xc2
  [<
ffffffff810b15db>] __warn+0xcb/0xf0
  [<
ffffffff810b170d>] warn_slowpath_null+0x1d/0x20
  [<
ffffffff814d70c1>] percpu_ref_exit+0x51/0x60
  [<
ffffffffa005706a>] dax_pmem_percpu_exit+0x1a/0x50 [dax_pmem]
  [<
ffffffff81615f1f>] devm_action_release+0xf/0x20
Cc: <stable@vger.kernel.org>
Fixes: ab68f2622136 ("/dev/dax, pmem: direct access to persistent memory")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
 
        dev_dbg(dax_pmem->dev, "%s\n", __func__);
        percpu_ref_exit(ref);
-       wait_for_completion(&dax_pmem->cmp);
 }
 
 static void dax_pmem_percpu_kill(void *data)
 
        dev_dbg(dax_pmem->dev, "%s\n", __func__);
        percpu_ref_kill(ref);
+       wait_for_completion(&dax_pmem->cmp);
 }
 
 static int dax_pmem_probe(struct device *dev)