md/raid1: Fix data corruption for degraded array with slow disk
author Yu Kuai <yukuai3@huawei.com>
Sat, 3 Aug 2024 09:11:37 +0000 (17:11 +0800)
committer Song Liu <song@kernel.org>
Thu, 15 Aug 2024 20:38:17 +0000 (13:38 -0700)
commit c916ca35308d3187c9928664f9be249b22a3a701
tree b6555d198877e184e09b9d9b90ddaad51fc77639
parent 7db4042336580dfd75cb5faa82c12cd51098c90b

read_balance() avoids reading from slow disks as much as possible.
However, if valid data exists only on slow disks and a new normal disk
is still in recovery, unrecovered data can be read:

raid1_read_request
 read_balance
  raid1_should_read_first
  -> return false
  choose_best_rdev
  -> normal disk is not recovered, return -1
  choose_bb_rdev
  -> missing the checking of recovery, return the normal disk
 -> read unrecovered data

The root cause is that the recovery check is missing in
choose_bb_rdev(). Add the check to fix the problem.

Also fix the same problem in choose_slow_rdev().
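
The missing check can be illustrated with a small userspace model. This
is only a sketch under stated assumptions, not the kernel code: struct
model_rdev, model_rdev_in_recovery() and model_choose_fallback() are
hypothetical names, and the real selection functions handle further
details that this model omits. The point it shows is that a disk still in
recovery must be skipped when the requested range extends past the part
that has already been recovered.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for per-device state (not the kernel struct). */
struct model_rdev {
	bool faulty;              /* device has failed */
	bool in_sync;             /* device is fully recovered */
	uint64_t recovery_offset; /* sectors below this offset are recovered */
	bool slow;                /* device should be avoided for reads */
};

/* True if the read [sector, sector + sectors) touches unrecovered data. */
static bool model_rdev_in_recovery(const struct model_rdev *rdev,
				   uint64_t sector, uint64_t sectors)
{
	return !rdev->in_sync && rdev->recovery_offset < sector + sectors;
}

/*
 * Fallback selection: return the index of the first usable non-slow
 * disk, or -1 if there is none. Without the recovery check below, a
 * disk that is still being rebuilt would be returned and the caller
 * would read unrecovered data.
 */
static int model_choose_fallback(const struct model_rdev *disks, size_t n,
				 uint64_t sector, uint64_t sectors)
{
	for (size_t i = 0; i < n; i++) {
		const struct model_rdev *rdev = &disks[i];

		if (rdev->faulty || rdev->slow)
			continue;
		/* The check this model is about: skip disks still in recovery. */
		if (model_rdev_in_recovery(rdev, sector, sectors))
			continue;
		return (int)i;
	}
	return -1;
}
```

In this model, a read that lies entirely below recovery_offset may still
be served by the recovering disk, while any read that reaches past it
returns -1 so that the caller can fall back to a slow disk holding valid
data.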

Cc: stable@vger.kernel.org
Fixes: 9f3ced792203 ("md/raid1: factor out choose_bb_rdev() from read_balance()")
Fixes: dfa8ecd167c1 ("md/raid1: factor out choose_slow_rdev() from read_balance()")
Reported-and-tested-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Closes: https://lore.kernel.org/all/9952f532-2554-44bf-b906-4880b2e88e3a@o2.pl/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20240803091137.3197008-1-yukuai1@huaweicloud.com
Signed-off-by: Song Liu <song@kernel.org>
drivers/md/raid1.c