]> www.infradead.org Git - users/dwmw2/linux.git/commitdiff
drm/amdkfd: Reset GPU on queue preemption failure
authorHarish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Tue, 26 Mar 2024 19:32:46 +0000 (15:32 -0400)
committerAlex Deucher <alexander.deucher@amd.com>
Wed, 10 Apr 2024 03:09:31 +0000 (23:09 -0400)
Currently, with F32 HWS GPU reset is only when unmap queue fails.

However, if compute queue doesn't repond to preemption request in time
unmap will return without any error. In this case, only preemption error
is logged and Reset is not triggered. Call GPU reset in this case also.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c

index f4d395e38683db7c85f3a7f5fc922e93b1222f88..0b655555e1678643fb84fa8b3e1640cd35a9a74e 100644 (file)
@@ -2001,6 +2001,7 @@ static int unmap_queues_cpsch(struct device_queue_manager *dqm,
                dev_err(dev, "HIQ MQD's queue_doorbell_id0 is not 0, Queue preemption time out\n");
                while (halt_if_hws_hang)
                        schedule();
+               kfd_hws_hang(dqm);
                return -ETIME;
        }