Control dependencies do not guarantee load order across the condition,
allowing a CPU to predict and speculate memory reads.
Commit 
324b494c2862 inlined verifying a new completion with its
handling. At least one architecture was observed to access the contents
out of order, resulting in the driver using stale data for the
completion.
Add a dma read barrier before reading the completion queue entry and
after the condition its contents depend on to ensure the read order is
determinsitic.
Reported-by: John Garry <john.garry@huawei.com>
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Tested-by: John Garry <john.garry@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
 
 
        while (nvme_cqe_pending(nvmeq)) {
                found++;
+               /*
+                * load-load control dependency between phase and the rest of
+                * the cqe requires a full read memory barrier
+                */
+               dma_rmb();
                nvme_handle_cqe(nvmeq, nvmeq->cq_head);
                nvme_update_cq_head(nvmeq);
        }