Currently each dm_io's reference counter is grabbed before calling
__map_bio(), this way isn't efficient since we can move this grabbing
to initialization time inside alloc_io().
Meantime it becomes typical async io reference counter model: one is
for submission side, the other is for completion side, and the io won't
be completed until both sides are done.
Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>