]> www.infradead.org Git - users/hch/configfs.git/commit
drm/amdkfd: support per-queue reset on gfx9
authorJonathan Kim <Jonathan.Kim@amd.com>
Tue, 25 Jun 2024 15:22:50 +0000 (11:22 -0400)
committerAlex Deucher <alexander.deucher@amd.com>
Tue, 6 Aug 2024 14:43:18 +0000 (10:43 -0400)
commitee0a469cf9175aeb6131c0476c4a4a8eb5997dfa
tree54532ea2f59cccefc7d0c99fcf5efe4776c75361
parente89d2fec4cde967445e16e02e406481bac380cc4
drm/amdkfd: support per-queue reset on gfx9

Support per-queue reset for GFX9.  The recommendation is for the driver
to target reset the HW queue via a SPI MMIO register write.

Since this requires pipe and HW queue info and MEC FW is limited to
doorbell reports of hung queues after an unmap failure, scan the HW
queue slots defined by SET_RESOURCES first to identify the user queue
candidates to reset.

Only signal reset events to processes that have had a queue reset.

If queue reset fails, fall back to GPU reset.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
16 files changed:
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
drivers/gpu/drm/amd/amdkfd/kfd_events.c
drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
drivers/gpu/drm/amd/amdkfd/kfd_priv.h
drivers/gpu/drm/amd/amdkfd/kfd_process.c
drivers/gpu/drm/amd/include/kgd_kfd_interface.h