www.infradead.org Git - linux.git/log

Lijo Lazar [Fri, 16 Aug 2024 09:04:17 +0000 (14:34 +0530)]

drm/amd/pm: Add support for new P2S table revision

Add p2s table support for a new revision of SMUv13.0.6.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Likun Gao [Thu, 22 Aug 2024 03:44:12 +0000 (11:44 +0800)]

drm/amdgpu: support for gc_info table v1.3

Add gc_info table v1.3 for IP discovery.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Ma Ke [Wed, 21 Aug 2024 04:27:24 +0000 (12:27 +0800)]

drm/amd/display: avoid using null object of framebuffer

Instead of using state->fb->obj[0] directly, get object from framebuffer
by calling drm_gem_fb_get_obj() and return error code when object is
null to avoid using null object of framebuffer.
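The pattern described above can be sketched in plain C. This is only an illustration under stated assumptions: `struct gem_object`, `struct framebuffer`, `fb_get_obj()` and `plane_prepare()` are hypothetical stand-ins, not the kernel types or the code touched by this commit, and the `-EINVAL` return value is an assumed choice of error code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DRM types; not the kernel definitions. */
struct gem_object { int id; };
struct framebuffer { struct gem_object *obj[4]; };

/* Stand-in for an accessor like drm_gem_fb_get_obj(): may return NULL. */
static struct gem_object *fb_get_obj(const struct framebuffer *fb,
				     unsigned int plane)
{
	if (!fb || plane >= 4)
		return NULL;
	return fb->obj[plane];
}

/* Returns 0 on success, -EINVAL when the framebuffer has no object. */
static int plane_prepare(const struct framebuffer *fb, int *out_id)
{
	struct gem_object *obj = fb_get_obj(fb, 0);

	/* Check the result instead of dereferencing fb->obj[0] directly. */
	if (!obj)
		return -EINVAL;

	*out_id = obj->id;
	return 0;
}
```

The caller sees an error code for an empty framebuffer rather than a null pointer dereference.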

Fixes: 5d945cbcd4b1 ("drm/amd/display: Create a file dedicated to planes")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Yang Wang [Wed, 21 Aug 2024 06:42:41 +0000 (14:42 +0800)]

drm/amdgpu: add list empty check to avoid null pointer issue

Add list empty check to avoid null pointer issues in some corner cases.
- list_for_each_entry_safe()

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Jinjie Ruan [Wed, 21 Aug 2024 06:40:39 +0000 (14:40 +0800)]

drm/amd/display: Make dcn401_dsc_funcs static

The sparse tool complains as follows:

drivers/gpu/drm/amd/amdgpu/../display/dc/dsc/dcn401/dcn401_dsc.c:30:24: warning:
symbol 'dcn401_dsc_funcs' was not declared. Should it be static?

This symbol is not used outside of dcn401_dsc.c, so mark it static.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Jinjie Ruan [Wed, 21 Aug 2024 06:40:38 +0000 (14:40 +0800)]

drm/amd/display: Make dcn35_hubp_funcs static

The sparse tool complains as follows:

drivers/gpu/drm/amd/amdgpu/../display/dc/hubp/dcn35/dcn35_hubp.c:191:19: warning:
symbol 'dcn35_hubp_funcs' was not declared. Should it be static?

This symbol is not used outside of dcn35_hubp.c, so mark it static.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Jinjie Ruan [Wed, 21 Aug 2024 06:40:37 +0000 (14:40 +0800)]

drm/amd/display: Make core_dcn4_ip_caps_base static

The sparse tool complains as follows:

drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4.c:12:28: warning:
symbol 'core_dcn4_ip_caps_base' was not declared. Should it be static?

This symbol is not used outside of dml2_core_dcn4.c, so mark it static.

It is also not meant to be modified, so mark it const.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Jinjie Ruan [Wed, 21 Aug 2024 06:40:36 +0000 (14:40 +0800)]

drm/amd/display: Make core_dcn4_g6_temp_read_blackout_table static

The sparse tool complains as follows:

drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c:6853:56: warning:
symbol 'core_dcn4_g6_temp_read_blackout_table' was not declared. Should it be static?

This symbol is not used outside of dml2_core_dcn4_calcs.c, so mark it static.

It is also not meant to be modified, so mark it const.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Tue, 20 Aug 2024 17:11:22 +0000 (13:11 -0400)]

drm/amdgpu/gfx12: set UNORD_DISPATCH in compute MQDs

UNORD_DISPATCH needs to be set to 1 on GC 10.x and newer to avoid a
potential deadlock. On GC 9.x and older, it needs to be set to 0.
A wrong setting can lead to hangs in some mixed graphics and
compute workloads.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3575
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Hawking Zhang [Mon, 19 Aug 2024 14:59:19 +0000 (22:59 +0800)]

drm/amdgpu: Retire query_utcl2_poison_status callback

The driver now uses the interrupt source id to identify
utcl2 poison events, so the polling interface is no longer needed.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Rahul Jain [Tue, 13 Aug 2024 08:11:11 +0000 (13:41 +0530)]

drm/amdgpu: Take IOMMU remapping into account for p2p checks

When trying to enable p2p, amdgpu_device_is_peer_accessible()
checks whether address_mask overlaps aper_base. On this platform
it does, so the function returns 0 and p2p is disabled.

With an IOMMU in place, the BAR addresses should be remapped so the
device can access them. Hence, check whether peer_adev is remapping DMA.
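A minimal sketch of the addressability part of such a check, assuming a simplified model: `peer_can_address()` and its parameters are hypothetical names for illustration, not the driver's actual function, and the remaining conditions of the real peer check are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the addressability test.
 * dma_mask:          address bits the peer device can drive
 * aper_base/size:    BAR aperture of the device being accessed
 * iommu_remaps_dma:  true when the peer's DMA is translated by an IOMMU
 */
static bool peer_can_address(uint64_t dma_mask, uint64_t aper_base,
			     uint64_t aper_size, bool iommu_remaps_dma)
{
	uint64_t address_mask = ~dma_mask;
	uint64_t aper_limit = aper_base + aper_size - 1;

	/* With remapping, the IOMMU gives the peer an address it can reach. */
	if (iommu_remaps_dma)
		return true;

	/* Without remapping, the whole aperture must fit under the DMA mask. */
	return !((aper_base & address_mask) || (aper_limit & address_mask));
}
```

With a 44-bit DMA mask, an aperture at `1ULL << 46` fails the raw address test but is accepted once remapping is taken into account.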

v5: (Felix, Alex)
- fixing comment as per Alex feedback
- refactor code as per Felix

v4: (Alex)
- fix the comment and description

v3:
- remove iommu_remap variable

v2: (Alex)
- Fix as per review comments
- add new function amdgpu_device_check_iommu_remap to check if iommu
remap

Signed-off-by: Rahul Jain <Rahul.Jain@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Kenneth Feng [Tue, 20 Aug 2024 00:57:15 +0000 (08:57 +0800)]

drm/amd/pm: update message interface for smu v14.0.2/3

Update the message interface for smu v14.0.2/3.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Hawking Zhang [Mon, 19 Aug 2024 14:23:11 +0000 (22:23 +0800)]

drm/amdkfd: Drop poison handling from gfx v10

Not supported.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Hawking Zhang [Tue, 20 Aug 2024 05:56:32 +0000 (13:56 +0800)]

drm/amdkfd: Check int source id for utcl2 poison event

Traditional utcl2 fault_status polling does not
work in an SRIOV environment. Reads of the fault
status register from the guest side are dropped
by hardware.

The driver should instead check the utcl2 interrupt
source id to identify a utcl2 poison event. It is
set to 1 when poisoned data interrupts are
signaled.

v2: drop the unused local variable (Tao)

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Tue, 20 Aug 2024 14:35:49 +0000 (10:35 -0400)]

drm/amd/gfx11: move the gfx mutex into the caller

Otherwise we can fail to drop the software mutex when
we fail to take the hardware mutex.
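The locking shape this describes can be sketched as follows. Everything here is a hypothetical model (`struct gfx_dev`, `sw_lock()`, `hw_sem_request()`, `do_guarded_work()`), with a counter standing in for the software mutex. It only illustrates why holding the software lock in the caller keeps lock and unlock paired on the failure path.

```c
#include <stdbool.h>

/* Hypothetical stand-ins: a software lock and a hardware semaphore. */
struct gfx_dev {
	int sw_lock_depth;	/* 1 while the software mutex is held */
	bool hw_sem_available;	/* whether the hardware semaphore can be taken */
	bool hw_sem_held;
};

static void sw_lock(struct gfx_dev *d)   { d->sw_lock_depth++; }
static void sw_unlock(struct gfx_dev *d) { d->sw_lock_depth--; }

/* Only touches the hardware semaphore; it does not own the software lock. */
static bool hw_sem_request(struct gfx_dev *d, bool acquire)
{
	if (acquire) {
		if (!d->hw_sem_available)
			return false;
		d->hw_sem_held = true;
	} else {
		d->hw_sem_held = false;
	}
	return true;
}

/* The caller owns the software lock, so every exit path releases it. */
static int do_guarded_work(struct gfx_dev *d)
{
	int ret = 0;

	sw_lock(d);
	if (!hw_sem_request(d, true)) {
		ret = -1;
		goto out;
	}
	/* ... register programming would happen here ... */
	hw_sem_request(d, false);
out:
	sw_unlock(d);
	return ret;
}
```

If the helper took the software lock itself and returned early on a hardware failure, the lock would stay held; with the lock in the caller, the single `out:` label covers both paths.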

Fixes: 76acba7b7f12 ("drm/amdgpu/gfx11: add a mutex for the gfx semaphore")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Tim Huang [Wed, 7 Aug 2024 09:15:12 +0000 (17:15 +0800)]

drm/amd/pm: ensure the fw_info is not null before using it

This resolves the dereference null return value warning
reported by Coverity.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Victor Zhao [Mon, 19 Aug 2024 03:16:13 +0000 (11:16 +0800)]

drm/amd/amdgpu: allow use kiq to do hdp flush under sriov

When the CPU is used to update page tables under SRIOV runtime, MMIO
access is blocked, so KIQ has to be used to flush HDP.

Change WREG32_NO_KIQ to WREG32 to allow KIQ.

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Mon, 19 Aug 2024 15:14:29 +0000 (11:14 -0400)]

drm/amdgpu: fix eGPU hotplug regression

The driver needs to wait for the on board firmware
to finish its initialization before probing the card.
Commit 959056982a9b ("drm/amdgpu: Fix discovery initialization failure during pci rescan")
switched from using msleep() to using usleep_range() which
seems to have caused init failures on some navi1x boards. Switch
back to msleep().

Fixes: 959056982a9b ("drm/amdgpu: Fix discovery initialization failure during pci rescan")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3559
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3500
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Ma Jun <Jun.Ma2@amd.com>

commit | commitdiff | tree

Martin Leung [Mon, 12 Aug 2024 04:55:56 +0000 (00:55 -0400)]

drm/amd/display: Promote DC to 3.2.297

- Various DML 2.1 fixes
- Fix module unload
- Fix construct_phy with MXM connector
- Support UHBR10 link rate on eDP
- Revert updated DCCG wrappers

Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Martin Leung <Martin.Leung@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Austin Zheng [Thu, 15 Aug 2024 22:45:23 +0000 (18:45 -0400)]

drm/amd/display: DML2.1 Reintegration for Various Fixes

[Why and How]
DML2.1 reintegration for several fixes and updates to the DML
code.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Tim Huang [Thu, 15 Aug 2024 22:45:22 +0000 (18:45 -0400)]

drm/amd/display: fix double free issue during amdgpu module unload

Flexible endpoints use DIGs from available inflexible endpoints,
so only the encoders of inflexible links need to be freed.
Otherwise, a double free issue may occur when unloading the
amdgpu module.

[  279.190523] RIP: 0010:__slab_free+0x152/0x2f0
[  279.190577] Call Trace:
[  279.190580]  <TASK>
[  279.190582]  ? show_regs+0x69/0x80
[  279.190590]  ? die+0x3b/0x90
[  279.190595]  ? do_trap+0xc8/0xe0
[  279.190601]  ? do_error_trap+0x73/0xa0
[  279.190605]  ? __slab_free+0x152/0x2f0
[  279.190609]  ? exc_invalid_op+0x56/0x70
[  279.190616]  ? __slab_free+0x152/0x2f0
[  279.190642]  ? asm_exc_invalid_op+0x1f/0x30
[  279.190648]  ? dcn10_link_encoder_destroy+0x19/0x30 [amdgpu]
[  279.191096]  ? __slab_free+0x152/0x2f0
[  279.191102]  ? dcn10_link_encoder_destroy+0x19/0x30 [amdgpu]
[  279.191469]  kfree+0x260/0x2b0
[  279.191474]  dcn10_link_encoder_destroy+0x19/0x30 [amdgpu]
[  279.191821]  link_destroy+0xd7/0x130 [amdgpu]
[  279.192248]  dc_destruct+0x90/0x270 [amdgpu]
[  279.192666]  dc_destroy+0x19/0x40 [amdgpu]
[  279.193020]  amdgpu_dm_fini+0x16e/0x200 [amdgpu]
[  279.193432]  dm_hw_fini+0x26/0x40 [amdgpu]
[  279.193795]  amdgpu_device_fini_hw+0x24c/0x400 [amdgpu]
[  279.194108]  amdgpu_driver_unload_kms+0x4f/0x70 [amdgpu]
[  279.194436]  amdgpu_pci_remove+0x40/0x80 [amdgpu]
[  279.194632]  pci_device_remove+0x3a/0xa0
[  279.194638]  device_remove+0x40/0x70
[  279.194642]  device_release_driver_internal+0x1ad/0x210
[  279.194647]  driver_detach+0x4e/0xa0
[  279.194650]  bus_remove_driver+0x6f/0xf0
[  279.194653]  driver_unregister+0x33/0x60
[  279.194657]  pci_unregister_driver+0x44/0x90
[  279.194662]  amdgpu_exit+0x19/0x1f0 [amdgpu]
[  279.194939]  __do_sys_delete_module.isra.0+0x198/0x2f0
[  279.194946]  __x64_sys_delete_module+0x16/0x20
[  279.194950]  do_syscall_64+0x58/0x120
[  279.194954]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  279.194980]  </TASK>

Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Nicholas Susanto [Thu, 15 Aug 2024 22:45:21 +0000 (18:45 -0400)]

drm/amd/display: DCN35 set min dispclk to 50MHz

[Why]

The current minimum dispclk causes hard hangs when resuming after
display off in extended/duplicate modes.

[How]

Set the min dispclk to 50MHz for DCN35.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Nicholas Susanto <Nicholas.Susanto@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Ilya Bakoulin [Thu, 15 Aug 2024 22:45:20 +0000 (18:45 -0400)]

drm/amd/display: Fix construct_phy with MXM connector

[Why/How]
The call to construct_phy will fail in cases where connector type is
MXM, and the dc_link won't be properly created/initialized.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Ilya Bakoulin <Ilya.Bakoulin@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sung Joon Kim [Thu, 15 Aug 2024 22:45:19 +0000 (18:45 -0400)]

drm/amd/display: Support UHBR10 link rate on eDP

[why]
Supporting UHBR10 link rate on eDP leverages
the existing DP2.0 code but needs some small
adjustments in the code.

[how]
Acknowledge the given DPCD caps for UHBR10
link rate support and allow DP2.0 programming
sequence and link training for eDP.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Sung Joon Kim <Sungjoon.Kim@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Nevenko Stupar [Thu, 15 Aug 2024 22:45:18 +0000 (18:45 -0400)]

drm/amd/display: Hardware cursor changes color when switched to software cursor

[Why & How]
DCN4 Cursor has separate degamma block and should always
do Cursor degamma for Cursor color modes.

Reviewed-by: Chris Park <chris.park@amd.com>
Signed-off-by: Nevenko Stupar <Nevenko.Stupar@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Michael Strauss [Thu, 15 Aug 2024 22:45:17 +0000 (18:45 -0400)]

drm/amd/display: Allow UHBR Interop With eDP Supported Link Rates Table

[WHY]
eDP 2.0 is introducing support for UHBR link rates, however current eDP ILR
link optimization does not account for UHBR capabilities.
Either UHBR capabilities will be provided via the same 128b/132b rate DPCD caps
that are currently used on DP2.1, or Table 4-13 may be updated to include UHBR
rates.

[HOW]
Add extra Supported Link Rates table translations for UHBR10/13.5/20.
Update eDP link setting optimization search to be aware of 128b/132b DPCD
rate caps in order to unblock UHBR on panels with Supported Link Rates table.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Nicholas Susanto [Thu, 15 Aug 2024 22:45:16 +0000 (18:45 -0400)]

drm/amd/display: Remove redundant check in DCN35 hwseq

Removing redundant condition.

Reviewed-by: Hansen Dsouza <Hansen.Dsouza@amd.com>
Signed-off-by: Nicholas Susanto <Nicholas.Susanto@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Aurabindo Pillai [Thu, 15 Aug 2024 22:45:15 +0000 (18:45 -0400)]

drm/amd/display: remove an extraneous call for checking dchub clock

When removing the amdgpu module and reinserting it, a call trace is
triggered:

[  334.230602] RIP: 0010:hubbub2_get_dchub_ref_freq+0xbb/0xe0 [amdgpu]
[  334.230807] Code: 25 28 00 00 00 75 3c 48 8d 65 f0 5b 41 5c 5d 31 c0 31 d2 31 c9 31 f6 31 ff 45 31 c0 45 31 c9 45 31 d2 45 31 db e9 55 a1 ca de <0f> 0b eb c6 0f 0b eb c2 d1 eb 8d 83 c0 63 ff ff 3d 20 4e 00 00 76
[  334.230809] RSP: 0018:ffffbc8b823fb540 EFLAGS: 00010246
[  334.230811] RAX: 0000000000001000 RBX: 00000000000186a0 RCX: 0000000000000000
[  334.230812] RDX: ffffbc8b823fb544 RSI: 0000000000000000 RDI: 0000000000000000
[  334.230813] RBP: ffffbc8b823fb560 R08: 0000000000000000 R09: 0000000000000000
[  334.230814] R10: 0000000000000000 R11: 000000000000000f R12: ffff9e644f1f2bb0
[  334.230815] R13: ffff9e6451361300 R14: 0000000000000000 R15: ffff9e6452c00000
[  334.230816] FS:  00007af7c8519000(0000) GS:ffff9e737dd00000(0000) knlGS:0000000000000000
[  334.230817] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  334.230818] CR2: 0000703576b9cbd0 CR3: 00000001095a2000 CR4: 0000000000750ee0
[  334.230819] PKRU: 55555554
[  334.230820] Call Trace:
[  334.230822]  <TASK>
[  334.230824]  ? show_regs+0x6d/0x80
[  334.230828]  ? __warn+0x89/0x160
[  334.230832]  ? hubbub2_get_dchub_ref_freq+0xbb/0xe0 [amdgpu]
[  334.231024]  ? report_bug+0x17e/0x1b0
[  334.231028]  ? handle_bug+0x46/0x90
[  334.231030]  ? exc_invalid_op+0x18/0x80
[  334.231032]  ? asm_exc_invalid_op+0x1b/0x20
[  334.231036]  ? hubbub2_get_dchub_ref_freq+0xbb/0xe0 [amdgpu]
[  334.231217]  dc_create_resource_pool+0xfd/0x320 [amdgpu]
[  334.231408]  dc_create+0x256/0x700 [amdgpu]
[  334.231588]  ? srso_alias_return_thunk+0x5/0x7f
[  334.231590]  ? dmi_matches+0xa0/0x230
[  334.231594]  amdgpu_dm_init+0x28c/0x25f0 [amdgpu]
[  334.231791]  ? prb_read_valid+0x1c/0x30
[  334.231795]  ? __irq_work_queue_local+0x43/0xf0
[  334.231798]  ? srso_alias_return_thunk+0x5/0x7f
[  334.231800]  ? irq_work_queue+0x2f/0x70
[  334.231802]  ? srso_alias_return_thunk+0x5/0x7f
[  334.231803]  ? __wake_up_klogd.part.0+0x40/0x70
[  334.231805]  ? srso_alias_return_thunk+0x5/0x7f
[  334.231807]  ? vprintk_emit+0xd9/0x210
[  334.231809]  ? set_dev_info+0x130/0x1c0
[  334.231812]  ? srso_alias_return_thunk+0x5/0x7f
[  334.231813]  ? dev_printk_emit+0xa1/0xe0
[  334.231819]  dm_hw_init+0x14/0x30 [amdgpu]
[  334.231993]  amdgpu_device_init+0x23c7/0x2fc0 [amdgpu]
[  334.232134]  ? pci_read_config_word+0x25/0x50
[  334.232139]  amdgpu_driver_load_kms+0x1a/0xd0 [amdgpu]
[  334.232284]  amdgpu_pci_probe+0x1f9/0x620 [amdgpu]

On DCN401, get_dchub_ref_freq() hook is called before init_hw() hook.
Hence, it is expected to trigger an assert. Remove the extraneous call
to get_dchub_ref_freq() to suppress the call trace.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Michael Strauss [Thu, 15 Aug 2024 22:45:14 +0000 (18:45 -0400)]

drm/amd/display: Update HPO I/O When Handling Link Retrain Automation Request

[WHY]
Previous multi-display HPO fix moved where HPO I/O enable/disable is performed.
The codepath now taken to enable/disable HPO I/O is not used for compliance
test automation, meaning that if a compliance box being driven at a DP1 rate
requests retrain at UHBR, HPO I/O will remain off if it was previously off.

[HOW]
Explicitly update HPO I/O after allocating encoders for test request.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Hansen Dsouza [Thu, 15 Aug 2024 22:45:13 +0000 (18:45 -0400)]

Revert "drm/amd/display: Update to using new dccg callbacks"

[Why]
Revert updated DCCG wrappers due to regression

[How]
This reverts commit 680458d41aa46a009909482f58358205b5c4b438.

Reviewed-by: Chris Park <chris.park@amd.com>
Signed-off-by: Hansen Dsouza <Hansen.Dsouza@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Candice Li [Thu, 15 Aug 2024 03:37:28 +0000 (11:37 +0800)]

drm/amdgpu: Validate TA binary size

Add TA binary size validation to avoid OOB write.
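A minimal sketch of this kind of validation, under stated assumptions: `ta_load()` and `TA_SHARED_BUF_SIZE` are hypothetical names and sizes for illustration and do not reflect the driver's actual buffers or limits.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TA_SHARED_BUF_SIZE 64	/* hypothetical destination buffer size */

/* Copies a TA binary into a fixed buffer, rejecting oversized input. */
static int ta_load(uint8_t dst[TA_SHARED_BUF_SIZE], const uint8_t *bin,
		   size_t bin_size)
{
	/* Validate before copying so memcpy() can never write past dst. */
	if (!bin || bin_size == 0 || bin_size > TA_SHARED_BUF_SIZE)
		return -EINVAL;

	memcpy(dst, bin, bin_size);
	return 0;
}
```

The size check runs before any byte is copied, so a rejected binary leaves the destination untouched.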

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Mukul Joshi [Mon, 12 Aug 2024 15:11:28 +0000 (11:11 -0400)]

drm/amdkfd: Update BadOpcode Interrupt handling with MES

Based on the recommendation of MEC FW, update BadOpcode interrupt
handling when using the MES scheduler: unmap all queues, remove the
queue that got the interrupt from scheduling, and remap the rest of the
queues. This is done to prevent the case where unmapping
of the bad queue can fail, thereby causing a GPU reset.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Acked-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Mukul Joshi [Mon, 3 Jun 2024 15:57:50 +0000 (11:57 -0400)]

drm/amdkfd: Update queue unmap after VM fault with MES

MEC FW expects MES to unmap all queues when a VM fault is observed
on a queue, and to resume them once the affected process is terminated.
Use the MES Suspend and Resume APIs to achieve this.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Mukul Joshi [Mon, 3 Jun 2024 15:48:23 +0000 (11:48 -0400)]

drm/amdgpu: Implement MES Suspend and Resume APIs for GFX11

Add implementation for MES Suspend and Resume APIs to unmap/map
all queues for GFX11. Support for GFX12 will be added when the
corresponding firmware support is in place.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Amber Lin [Mon, 29 Apr 2024 20:40:44 +0000 (16:40 -0400)]

drm/amdkfd: Enable processes isolation on gfx9

When amdgpu enables enforce_isolation, KFD enables single-process mode in
HWS and sets the exec_cleaner_shader bit in MAP_PROCESS.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Srinivasan Shanmugam [Tue, 14 May 2024 18:25:20 +0000 (23:55 +0530)]

drm/amdgpu/gfx_v9_4_3: Apply Isolation Enforcement to GFX & Compute rings

This commit applies isolation enforcement to the GFX and Compute rings
in the gfx_v9_4_3 module.

The commit sets `amdgpu_gfx_enforce_isolation_ring_begin_use` and
`amdgpu_gfx_enforce_isolation_ring_end_use` as the functions to be
called when a ring begins and ends its use, respectively.

`amdgpu_gfx_enforce_isolation_ring_begin_use` is called when a ring
begins its use. This function cancels any scheduled
`enforce_isolation_work` and, if necessary, signals the Kernel Fusion
Driver (KFD) to stop the runqueue.

`amdgpu_gfx_enforce_isolation_ring_end_use` is called when a ring ends
its use. This function schedules `enforce_isolation_work` to be run
after a delay.

These functions are part of the Enforce Isolation Handler, which
enforces shader isolation on AMD GPUs to prevent data leakage between
different processes.

The commit also includes a check for the type of the ring. If the type
of the ring is `AMDGPU_RING_TYPE_COMPUTE`, the `xcp_id` of the
`enforce_isolation` structure in the `gfx` structure of the
`amdgpu_device` is set to the `xcp_id` of the ring. This ensures that
the correct `xcp_id` is used when enforcing isolation on compute rings.
The `xcp_id` is an identifier for an XCP partition, and different rings
can be associated with different XCP partitions.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>

commit | commitdiff | tree

Srinivasan Shanmugam [Thu, 18 Jul 2024 12:52:35 +0000 (18:22 +0530)]

drm/amdgpu/gfx9: Apply Isolation Enforcement to GFX & Compute rings

This commit applies isolation enforcement to the GFX and Compute rings
in the gfx_v9_0 module.

The commit sets `amdgpu_gfx_enforce_isolation_ring_begin_use` and
`amdgpu_gfx_enforce_isolation_ring_end_use` as the functions to be
called when a ring begins and ends its use, respectively.

`amdgpu_gfx_enforce_isolation_ring_begin_use` is called when a ring
begins its use. This function cancels any scheduled
`enforce_isolation_work` and, if necessary, signals the Kernel Fusion
Driver (KFD) to stop the runqueue.

`amdgpu_gfx_enforce_isolation_ring_end_use` is called when a ring ends
its use. This function schedules `enforce_isolation_work` to be run
after a delay.

These functions are part of the Enforce Isolation Handler, which
enforces shader isolation on AMD GPUs to prevent data leakage between
different processes.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>

commit | commitdiff | tree

Srinivasan Shanmugam [Thu, 6 Jun 2024 07:58:02 +0000 (13:28 +0530)]

drm/amdgpu: Implement Enforce Isolation Handler for KGD/KFD serialization

This commit introduces the Enforce Isolation Handler designed to enforce
shader isolation on AMD GPUs, which helps to prevent data leakage
between different processes.

The handler counts the number of emitted fences for each GFX and compute
ring. If there are any fences, it schedules the `enforce_isolation_work`
to be run after a delay of `GFX_SLICE_PERIOD`. If there are no fences,
it signals the Kernel Fusion Driver (KFD) to resume the runqueue.

The function is synchronized using the `enforce_isolation_mutex`.

This commit also introduces a reference count mechanism
(kfd_sch_req_count) to keep track of the number of requests to enable
the KFD scheduler. When a request to enable the KFD scheduler is made,
the reference count is decremented. When the reference count reaches
zero, a delayed work is scheduled to enforce isolation after a delay of
GFX_SLICE_PERIOD.

When a request to disable the KFD scheduler is made, the function first
checks if the reference count is zero. If it is, it cancels the delayed
work for enforcing isolation and checks if the KFD scheduler is active.
If the KFD scheduler is active, it sends a request to stop the KFD
scheduler and sets the KFD scheduler state to inactive. Then, it
increments the reference count.

The function is synchronized using the kfd_sch_mutex to ensure that the
KFD scheduler state and reference count are updated atomically.
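The reference counting described above can be modeled roughly as follows. This is a simplified sketch, not the driver code: `struct kfd_sch_state`, `kfd_sch_ctrl()` and `kfd_sch_delayed_work()` are hypothetical names, a boolean stands in for the delayed work item, and locking and fence counting are omitted.

```c
#include <stdbool.h>

struct kfd_sch_state {
	int req_count;		/* outstanding "disable" requests */
	bool sched_active;	/* whether KFD scheduling is running */
	bool resume_pending;	/* stands in for the delayed work item */
};

static void kfd_sch_ctrl(struct kfd_sch_state *s, bool enable)
{
	if (enable) {
		/* An enable request drops one reference taken by a disable. */
		if (s->req_count > 0)
			s->req_count--;
		if (s->req_count == 0 && !s->sched_active)
			s->resume_pending = true;	/* schedule delayed work */
	} else {
		if (s->req_count == 0) {
			s->resume_pending = false;	/* cancel delayed work */
			if (s->sched_active)
				s->sched_active = false;	/* ask KFD to stop */
		}
		s->req_count++;
	}
}

/* Runs when the delay expires without a new disable request. */
static void kfd_sch_delayed_work(struct kfd_sch_state *s)
{
	if (s->resume_pending && s->req_count == 0) {
		s->sched_active = true;
		s->resume_pending = false;
	}
}
```

In this model, only the first disable request stops the scheduler and only the last matching enable request arms the delayed resume, so nested requests from several rings do not toggle the scheduler repeatedly.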

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Amber Lin [Mon, 29 Jul 2024 18:22:30 +0000 (14:22 -0400)]

drm/amdkfd: APIs to stop/start KFD scheduling

Provide amdgpu_amdkfd_stop_sched() for amdgpu to stop KFD scheduling
compute work on HIQ. amdgpu_amdkfd_start_sched() resumes the scheduling.
When amdgpu_amdkfd_stop_sched is called, KFD will unmap queues from
runlist. If users send ioctls to KFD to create queues, they'll be added
but those queues won't be mapped to runlist (so not scheduled) until
amdgpu_amdkfd_start_sched is called.
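The stop/start behaviour can be pictured with a toy model. The names below (`struct kfd_sched_model`, `model_stop_sched()`, `model_create_queue()`, `model_start_sched()`) are hypothetical and only count queues; they are not the KFD interfaces.

```c
#include <stdbool.h>

struct kfd_sched_model {
	bool stopped;		/* true between stop and start */
	int total_queues;	/* queues that exist */
	int mapped_queues;	/* queues currently on the runlist */
};

static void model_stop_sched(struct kfd_sched_model *m)
{
	m->stopped = true;
	m->mapped_queues = 0;	/* unmap everything from the runlist */
}

static void model_create_queue(struct kfd_sched_model *m)
{
	/* The queue is always added, but mapped only while scheduling runs. */
	m->total_queues++;
	if (!m->stopped)
		m->mapped_queues = m->total_queues;
}

static void model_start_sched(struct kfd_sched_model *m)
{
	m->stopped = false;
	m->mapped_queues = m->total_queues;	/* map all queues again */
}
```

A queue created while scheduling is stopped is counted in `total_queues` right away and appears in `mapped_queues` only after the start call.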

v2: fix build (Alex)

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Srinivasan Shanmugam [Mon, 29 Jul 2024 16:48:45 +0000 (22:18 +0530)]

drm/amdgpu/gfx9: Add cleaner shader support for GFX9.4.4 hardware

This commit extends the cleaner shader feature to support GFX9.4.4
hardware.

The cleaner shader feature is used to clear or initialize certain GPU
resources, such as Local Data Share (LDS), Vector General Purpose
Registers (VGPRs), and Scalar General Purpose Registers (SGPRs). This
operation needs to be performed in isolation, with no other tasks
running on the GPU at the same time.

Previously, the cleaner shader feature was implemented for GFX9.4.3
hardware. This commit adds support for GFX9.4.4 hardware by allowing the
cleaner shader to be used with this hardware version.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Srinivasan Shanmugam [Mon, 29 Jul 2024 16:44:41 +0000 (22:14 +0530)]

drm/amdgpu/gfx9: Add cleaner shader for GFX9.4.3

This commit adds the cleaner shader microcode for GFX9.4.3 GPUs. The
cleaner shader is a piece of GPU code that is used to clear or
initialize certain GPU resources, such as Local Data Share (LDS), Vector
General Purpose Registers (VGPRs), and Scalar General Purpose Registers
(SGPRs).

Clearing these resources is important for ensuring data isolation
between different workloads running on the GPU. Without the cleaner
shader, residual data from a previous workload could potentially be
accessed by a subsequent workload, leading to data leaks and incorrect
computation results.

The cleaner shader microcode is represented as an array of 32-bit words
(`gfx_9_4_3_cleaner_shader_hex`). This array is the binary
representation of the cleaner shader code, which is written in a
low-level GPU instruction set.

When the cleaner shader feature is enabled, the AMDGPU driver loads this
array into a specific location in the GPU memory. The GPU then reads
this memory location to fetch and execute the cleaner shader
instructions.

The cleaner shader is executed automatically by the GPU at the end of
each workload, before the next workload starts. This ensures that all
GPU resources are in a clean state before the start of each workload.

This addition is part of the cleaner shader feature implementation. The
cleaner shader feature helps improve GPU performance and resource
utilization by cleaning up GPU resources after they are used. It also
enhances security and reliability by preventing data leaks between
workloads.

v2: fix copyright date (Alex)

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Srinivasan Shanmugam [Mon, 29 Jul 2024 16:42:02 +0000 (22:12 +0530)]

drm/amdgpu/gfx9: Implement cleaner shader support for GFX9.4.3 hardware

The patch modifies the gfx_v9_4_3_kiq_set_resources function to write
the cleaner shader's memory controller address to the ring buffer. It
also adds a new function, gfx_v9_4_3_ring_emit_cleaner_shader, which
emits the PACKET3_RUN_CLEANER_SHADER packet to the ring buffer.

This patch adds support for the PACKET3_RUN_CLEANER_SHADER packet in the
gfx_v9_4_3 module. This packet is used to emit the cleaner shader, which
is used to clear GPU memory before it's reused, helping to prevent data
leakage between different processes.

Finally, the patch updates the ring function structures to include the
new gfx_v9_4_3_ring_emit_cleaner_shader function. This allows the
cleaner shader to be emitted as part of the ring's operations.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Srinivasan Shanmugam [Mon, 29 Jul 2024 16:26:57 +0000 (21:56 +0530)]

drm/amdgpu/gfx9: Implement cleaner shader support for GFX9 hardware

The patch modifies the gfx_v9_0_kiq_set_resources function to write
the cleaner shader's memory controller address to the ring buffer. It
also adds a new function, gfx_v9_0_ring_emit_cleaner_shader, which
emits the PACKET3_RUN_CLEANER_SHADER packet to the ring buffer.

This patch adds support for the PACKET3_RUN_CLEANER_SHADER packet in the
gfx_v9_0 module. This packet is used to emit the cleaner shader, which
is used to clear GPU memory before it's reused, helping to prevent data
leakage between different processes.

Finally, the patch updates the ring function structures to include the
new gfx_v9_0_ring_emit_cleaner_shader function. This allows the
cleaner shader to be emitted as part of the ring's operations.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Srinivasan Shanmugam [Sun, 7 Jul 2024 03:24:04 +0000 (08:54 +0530)]

drm/amdgpu: Add PACKET3_RUN_CLEANER_SHADER for cleaner shader execution

This commit adds the PACKET3_RUN_CLEANER_SHADER definition. This packet
is a command packet used to instruct the GPU to execute the cleaner
shader.

The cleaner shader is a piece of GPU code that is used to clear or
initialize certain GPU resources, such as Local Data Share (LDS), Vector
General Purpose Registers (VGPRs), and Scalar General Purpose Registers
(SGPRs). Clearing these resources is important for ensuring data
isolation between different workloads running on the GPU.

The PACKET3_RUN_CLEANER_SHADER packet is used to trigger the execution
of the cleaner shader on the GPU. The packet consists of a header
followed by a RESERVED field, which is programmed to zero. When the GPU
receives this packet, it fetches and executes the cleaner shader
instructions from the location specified in the packet.
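
As an illustration of the packet layout described above (a header dword
followed by one RESERVED dword set to zero), here is a minimal,
self-contained sketch. The header encoding mirrors the usual PM4 type-3
layout; the opcode value 0xD2, the `mock_ring` type, and the helper names
are assumptions made for this example and are not taken from this log.

```c
#include <stddef.h>
#include <stdint.h>

/* PM4 type-3 header: bits 31:30 = packet type (3),
 * bits 29:16 = count (payload dwords minus one), bits 15:8 = opcode. */
#define PACKET_TYPE3 3u
#define PACKET3(op, n) ((PACKET_TYPE3 << 30) |                 \
                        (((uint32_t)(op) & 0xFF) << 8) |       \
                        (((uint32_t)(n) & 0x3FFF) << 16))

/* Assumed opcode value, for illustration only. */
#define PACKET3_RUN_CLEANER_SHADER 0xD2

/* Hypothetical stand-in for a command ring buffer. */
struct mock_ring {
    uint32_t buf[16];
    size_t wptr;
};

static void mock_ring_write(struct mock_ring *ring, uint32_t v)
{
    ring->buf[ring->wptr++ % 16] = v;
}

/* Emit the two-dword packet: header, then the RESERVED field as zero. */
static void emit_cleaner_shader(struct mock_ring *ring)
{
    mock_ring_write(ring, PACKET3(PACKET3_RUN_CLEANER_SHADER, 0));
    mock_ring_write(ring, 0); /* RESERVED */
}
```

A count of 0 in the header means exactly one payload dword follows, which
is the RESERVED field.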

The cleaner shader feature helps enhance security and reliability by
preventing data leaks between workloads.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Srinivasan Shanmugam [Mon, 27 May 2024 02:08:21 +0000 (07:38 +0530)]

drm/amdgpu: Add sysfs interface for running cleaner shader

This patch adds a new sysfs interface for running the cleaner shader on
AMD GPUs. The cleaner shader is used to clear GPU memory before it's
reused, which can help prevent data leakage between different processes.

The new sysfs file is write-only and is named `run_cleaner_shader`.
Write the number of the partition to this file to trigger the cleaner shader
on that partition. There is only one partition on GPUs which do not
support partitioning.
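
A minimal userspace sketch of how the write-only file might be driven is
shown here. The helper name `trigger_cleaner_shader` is hypothetical, and
the sysfs path in the comment is an assumption about where the attribute
appears on a given system; only the file name `run_cleaner_shader` and the
"write the partition number" semantics come from the description above.

```c
#include <stdio.h>

/* Write the partition number as decimal text to the given attribute file.
 * Returns 0 on success and -1 on any failure.
 *
 * Assumed example path (may differ per system):
 *   /sys/class/drm/card0/device/run_cleaner_shader
 */
static int trigger_cleaner_shader(const char *attr_path, unsigned int partition)
{
    FILE *f = fopen(attr_path, "w");
    int ok;

    if (!f)
        return -1;

    ok = fprintf(f, "%u\n", partition) > 0;
    /* Buffered write errors surface at close time, so check fclose() too. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

On a GPU without partitioning support, passing partition 0 would be the
natural choice, since there is only one partition.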

Changes made in this patch:

- Added `amdgpu_set_run_cleaner_shader` function to handle writes to the
  `run_cleaner_shader` sysfs file.
- Added `run_cleaner_shader` to the list of device attributes in
  `amdgpu_device_attrs`.
- Updated `default_attr_update` to handle `run_cleaner_shader`.
- Added `AMDGPU_DEVICE_ATTR_WO` macro to create write-only device
  attributes.

v2: fix error handling (Alex)

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>

commit | commitdiff | tree

Srinivasan Shanmugam [Mon, 27 May 2024 02:00:47 +0000 (07:30 +0530)]

drm/amdgpu: Add enforce_isolation sysfs attribute

This commit adds a new sysfs attribute 'enforce_isolation' to control
the 'enforce_isolation' setting per GPU. The attribute can be read and
written, and accepts values 0 (disabled) and 1 (enabled).

When 'enforce_isolation' is enabled, reserved VMIDs are allocated for
each ring. When it's disabled, the reserved VMIDs are freed.

The set function locks a mutex before changing the 'enforce_isolation'
flag and the VMIDs, and unlocks it afterwards. This ensures that these
operations are atomic and prevents race conditions and other concurrency
issues.
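
The locking pattern described above can be modelled in userspace roughly
as follows. This is only a sketch under stated assumptions: `mock_dev`,
`NUM_RINGS`, and the boolean `vmid_reserved` array are hypothetical
stand-ins for the driver's device structure and VMID manager, and a
pthread mutex stands in for the kernel mutex.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

#define NUM_RINGS 4 /* hypothetical ring count for the model */

/* Hypothetical model of the per-device state touched by the attribute. */
struct mock_dev {
    pthread_mutex_t lock;
    bool enforce_isolation;
    bool vmid_reserved[NUM_RINGS];
};

/* Accept only 0 or 1; change the flag and the reserved VMIDs under one lock
 * so that no reader can observe the flag and the VMIDs out of sync. */
static int set_enforce_isolation(struct mock_dev *dev, long val)
{
    if (val != 0 && val != 1)
        return -EINVAL;

    pthread_mutex_lock(&dev->lock);
    dev->enforce_isolation = (val == 1);
    for (int i = 0; i < NUM_RINGS; i++)
        dev->vmid_reserved[i] = dev->enforce_isolation; /* reserve or free */
    pthread_mutex_unlock(&dev->lock);

    return 0;
}
```

Validating the value before taking the lock keeps the critical section
limited to the state change itself.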

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Srinivasan Shanmugam [Wed, 20 Mar 2024 01:12:38 +0000 (06:42 +0530)]

drm/amdgpu: Enforce isolation as part of the job

This patch adds a new parameter 'enforce_isolation' to the amdgpu_job
structure. This parameter is used to determine whether shader isolation
should be enforced for a job. The enforce_isolation parameter is then
stored in the amdgpu_job structure and used when flushing the VM.

The enforce_isolation field of the amdgpu_job structure is set directly
after the job is allocated.

This change allows more fine-grained control over shader isolation,
making it possible to enforce isolation on a per-job basis rather than
globally. This can be useful in scenarios where only certain jobs
require isolation.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>

commit | commitdiff | tree

Victor Skvortsov [Fri, 2 Aug 2024 18:22:26 +0000 (14:22 -0400)]

drm/amdgpu: abort KIQ waits when there is a pending reset

Stop waiting for the KIQ to return when there is a reset pending.
It's quite likely that the KIQ will never respond.

Signed-off-by: Koenig Christian <Christian.Koenig@amd.com>
Suggested-by: Lazar Lijo <Lijo.Lazar@amd.com>
Tested-by: Victor Skvortsov <victor.skvortsov@amd.com>
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Srinivasan Shanmugam [Mon, 29 Jul 2024 16:05:26 +0000 (21:35 +0530)]

drm/amdgpu: Make enforce_isolation setting per GPU

This commit makes the enforce_isolation setting per GPU and per
partition by adding an enforce_isolation array to the adev structure.
The array is initialized from the global enforce_isolation module
parameter during device initialization.

In amdgpu_ids.c, the adev->enforce_isolation value for the current GPU
and partition is used to determine whether to enforce isolation between
graphics and compute processes on that GPU and partition.

This allows the enforce_isolation setting to be controlled individually
for each GPU and each partition, which is useful in a system with
multiple GPUs and partitions where different isolation settings might be
desired for different GPUs and partitions.

v2: fix loop in amdgpu_vmid_mgr_init() (Alex)

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>

commit | commitdiff | tree

Alex Deucher [Tue, 12 Mar 2024 18:22:26 +0000 (14:22 -0400)]

drm/amdgpu: Emit cleaner shader at end of IB submission

This commit introduces the emission of a cleaner shader at the end of
the IB submission process. This is achieved by adding a new function
pointer, `emit_cleaner_shader`, to the `amdgpu_ring_funcs` structure. If
the `emit_cleaner_shader` function is set in the ring functions, it is
called during the VM flush process.

The cleaner shader is only emitted if the `enable_cleaner_shader` flag
is set in the `amdgpu_device` structure. This allows the cleaner shader
emission to be controlled on a per-device basis.
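
The optional-callback pattern described above can be sketched as follows.
All `mock_*` types and the helper `maybe_emit_cleaner_shader` are
hypothetical simplifications; only the member names `emit_cleaner_shader`
and `enable_cleaner_shader` come from the description.

```c
#include <stdbool.h>
#include <stddef.h>

struct mock_ring;

/* Per-ring function table; the callback is optional and may be NULL. */
struct mock_ring_funcs {
    void (*emit_cleaner_shader)(struct mock_ring *ring);
};

/* Per-device switch for the feature. */
struct mock_device {
    bool enable_cleaner_shader;
};

struct mock_ring {
    const struct mock_ring_funcs *funcs;
    struct mock_device *adev;
    unsigned int cleaner_emits; /* counts emissions, for the example only */
};

/* Example callback: just records that an emission happened. */
static void count_emit(struct mock_ring *ring)
{
    ring->cleaner_emits++;
}

/* Emit only when the device enables the feature and the ring provides
 * the callback. Returns true if the callback was invoked. */
static bool maybe_emit_cleaner_shader(struct mock_ring *ring)
{
    if (!ring->adev->enable_cleaner_shader)
        return false;
    if (!ring->funcs->emit_cleaner_shader)
        return false;

    ring->funcs->emit_cleaner_shader(ring);
    return true;
}
```

Leaving the pointer NULL lets ring types without cleaner shader support
opt out without any per-hardware checks at the call site.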

By emitting a cleaner shader at the end of the IB submission, we can
ensure that the VM state is properly cleaned up after each submission.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>

commit | commitdiff | tree

Srinivasan Shanmugam [Thu, 6 Jun 2024 07:42:40 +0000 (13:12 +0530)]

drm/amdgpu: Add infrastructure for Cleaner Shader feature

The cleaner shader is used by the CP firmware to clean LDS and GPRs
between processes on the CUs.

This adds an internal API for GFX IP code to allocate and initialize the
cleaner shader.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>

commit | commitdiff | tree

Alex Deucher [Wed, 14 Aug 2024 23:06:36 +0000 (19:06 -0400)]

drm/amdgpu: handle enforce isolation on non-0 gfxhub

Some chips have more than one gfxhub so check if we
are a gfxhub rather than just gfxhub 0.

Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Wed, 14 Aug 2024 14:28:24 +0000 (10:28 -0400)]

drm/amdgpu/sdma5.2: limit wptr workaround to sdma 5.2.1

The workaround seems to cause stability issues on other
SDMA 5.2.x IPs.

Fixes: a03ebf116303 ("drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3556
Acked-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Mon, 5 Aug 2024 11:57:58 +0000 (17:27 +0530)]

drm/amdgpu: add vcn ip dump support for vcn_v2_6

Add support for logging the registers in devcoredump
buffer for vcn_v2_6.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Mon, 5 Aug 2024 11:56:31 +0000 (17:26 +0530)]

drm/amdgpu: add print support for vcn_v2_5 ip dump

Add support for logging the registers in devcoredump
buffer for vcn_v2_5.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Mon, 5 Aug 2024 11:53:55 +0000 (17:23 +0530)]

drm/amdgpu: add vcn_v2_5 ip dump support

Add support of vcn ip dump in the devcoredump
for vcn_v2_5.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Mon, 5 Aug 2024 11:49:42 +0000 (17:19 +0530)]

drm/amdgpu: add print support for vcn_v2_0 ip dump

Add support for logging the registers in devcoredump
buffer for vcn_v2_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Mon, 5 Aug 2024 11:48:09 +0000 (17:18 +0530)]

drm/amdgpu: add vcn_v2_0 ip dump support

Add support of vcn ip dump in the devcoredump
for vcn_v2_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Mon, 5 Aug 2024 11:38:22 +0000 (17:08 +0530)]

drm/amdgpu: add print support for vcn_v1_0 ip dump

Add support for logging the registers in devcoredump
buffer for vcn_v1_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Mon, 5 Aug 2024 11:36:24 +0000 (17:06 +0530)]

drm/amdgpu: add vcn_v1_0 ip dump support

Add support of vcn ip dump in the devcoredump
for vcn_v1_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Mon, 5 Aug 2024 07:28:46 +0000 (12:58 +0530)]

drm/amdgpu: add print support for vcn_v4_0_5 ip dump

Add support for logging the registers in devcoredump
buffer for vcn_v4_0_5.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Mon, 5 Aug 2024 07:13:07 +0000 (12:43 +0530)]

drm/amdgpu: add print support for vcn_v4_0 ip dump

Add support for logging the registers in devcoredump
buffer for vcn_v4_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Mon, 5 Aug 2024 07:22:27 +0000 (12:52 +0530)]

drm/amdgpu: add print support for vcn_v4_0_3 ip dump

Add support for logging the registers in devcoredump
buffer for vcn_v4_0_3.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Mon, 5 Aug 2024 07:27:09 +0000 (12:57 +0530)]

drm/amdgpu: add vcn_v4_0_5 ip dump support

Add support of vcn ip dump in the devcoredump
for vcn_v4_0_5.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Mon, 5 Aug 2024 07:01:21 +0000 (12:31 +0530)]

drm/amdgpu: add vcn_v4_0 ip dump support

Add support of vcn ip dump in the devcoredump
for vcn_v4_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Mon, 5 Aug 2024 07:19:20 +0000 (12:49 +0530)]

drm/amdgpu: add vcn_v4_0_3 ip dump support

Add support of vcn ip dump in the devcoredump
for vcn_v4_0_3.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Thu, 1 Aug 2024 13:49:27 +0000 (19:19 +0530)]

drm/amdgpu: add print support for vcn_v5_0 ip dump

Add support for logging the registers in devcoredump
buffer for vcn_v5_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Mon, 3 Jun 2024 17:48:40 +0000 (13:48 -0400)]

drm/amdgpu/mes12: add API for user queue reset

Add API for resetting user queues.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Mon, 3 Jun 2024 17:48:07 +0000 (13:48 -0400)]

drm/amdgpu/mes11: add API for user queue reset

Add API for resetting user queues.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Mon, 3 Jun 2024 17:35:05 +0000 (13:35 -0400)]

drm/amdgpu/mes: add API for user queue reset

Add API for resetting user queues.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Fri, 12 Jul 2024 20:39:30 +0000 (16:39 -0400)]

drm/amdgpu/gfx11: export gfx_v11_0_request_gfx_index_mutex()

It will be used by the queue reset code.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Fri, 12 Jul 2024 20:37:33 +0000 (16:37 -0400)]

drm/amdgpu/gfx11: add a mutex for the gfx semaphore

This will be used in more places in the future so
add a mutex.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Fri, 12 Jul 2024 19:36:19 +0000 (15:36 -0400)]

drm/amdgpu/gfx11: enter safe mode before touching CP_INT_CNTL

Need to enter safe mode before touching GC MMIO.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Thu, 18 Jul 2024 19:59:20 +0000 (15:59 -0400)]

drm/amdgpu/gfx7: add ring reset callback for gfx

Add ring reset callback for gfx.

v2: fix operator precedence (kernel test robot)

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Thu, 18 Jul 2024 19:50:23 +0000 (15:50 -0400)]

drm/amdgpu/gfx8: add ring reset callback for gfx

Add ring reset callback for gfx.

v2: fix operator precedence (kernel test robot)

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Thu, 1 Aug 2024 13:47:11 +0000 (19:17 +0530)]

drm/amdgpu: add vcn_v5_0 ip dump support

Add support of vcn ip dump in the devcoredump
for vcn_v5_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Wed, 24 Jul 2024 11:18:28 +0000 (16:48 +0530)]

drm/amdgpu: add print support for vcn_v3_0 ip dump

Add support for logging the registers in devcoredump
buffer for vcn_v3_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Wed, 24 Jul 2024 11:05:41 +0000 (16:35 +0530)]

drm/amdgpu: add vcn_v3_0 ip dump support

Add support of vcn ip dump in the devcoredump
for vcn_v3_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Tue, 23 Jul 2024 07:38:55 +0000 (13:08 +0530)]

drm/amdgpu: add vcn ip dump ptr in vcn global struct

Add a pointer to the vcn ip dump in the vcn global structure
so that it is accessible to all vcn versions via the global adev.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Zhang Zekun [Mon, 12 Aug 2024 12:24:15 +0000 (20:24 +0800)]

drm/amd: Remove unused declarations

amdgpu_gart_table_vram_pin() and amdgpu_gart_table_vram_unpin() were
removed in commit 575e55ee4fbc ("drm/amdgpu: recover gart table
at resume"), but their declarations were left in the header files.

Similarly, amdgpu_dm_display_resume() was removed in
commit a80aa93de1a0 ("drm/amd/display: Unify dm resume sequence into a
single call"). Remove these unused declarations.

Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Nikita Zhandarovich [Tue, 6 Aug 2024 17:19:04 +0000 (10:19 -0700)]

drm/radeon/evergreen_cs: fix int overflow errors in cs track offsets

Several cs track offsets (such as 'track->db_s_read_offset')
either are initialized with or plainly take big enough values that,
once shifted 8 bits left, may be hit with integer overflow if the
resulting values end up going over u32 limit.

Same goes for a few instances of 'surf.layer_size * mslice'
multiplications that are added to 'offset' variable - they may
potentially overflow as well and need to be validated properly.

While some debug prints in this code section take possible overflow
issues into account, simply casting to (unsigned long) may be
erroneous in its own way, as depending on CPU architecture one is
liable to get different results.

Fix said problems by:
- casting 'offset' to the fixed-width u64 type instead of the
ambiguous unsigned long.
- casting one of the operands to u64 in the cases vulnerable to
integer overflow.
- adjusting format specifiers in debug prints to properly
represent 'offset' values.
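
The overflow pattern and the cast-based fix listed above can be shown with
a small standalone sketch. The function names are hypothetical and use
stdint types in place of the kernel's u32/u64; the values in the note
after the code are chosen only to make the wraparound visible.

```c
#include <stdint.h>

/* Shift done in 32 bits: the top 8 bits of the input are lost. */
static uint32_t offset_bytes_u32(uint32_t offset_256b)
{
    return offset_256b << 8;
}

/* Widen one operand first so the shift is done in 64 bits. */
static uint64_t offset_bytes_u64(uint32_t offset_256b)
{
    return (uint64_t)offset_256b << 8;
}

/* Product computed in 32 bits: wraps modulo 2^32. */
static uint32_t layers_bytes_u32(uint32_t layer_size, uint32_t mslice)
{
    return layer_size * mslice;
}

/* Casting one operand to 64 bits makes the whole product 64-bit. */
static uint64_t layers_bytes_u64(uint32_t layer_size, uint32_t mslice)
{
    return (uint64_t)layer_size * mslice;
}
```

For example, an offset of 0x01000000 shifted left by 8 wraps to 0 in 32
bits but yields 0x100000000 once the operand is widened, and the same
holds for 0x10000 * 0x10000.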

Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.

Fixes: 285484e2d55e ("drm/radeon: add support for evergreen/ni tiling informations v11")
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Yang Wang [Tue, 13 Aug 2024 05:51:48 +0000 (13:51 +0800)]

drm/amdgpu: fixing rlc firmware loading failure issue

Skip rlc firmware validation to ignore firmware header size mismatch issues.
This restores the workaround added in
commit 849e133c973c ("drm/amdgpu: Fix the null pointer when load rlc firmware")

Fixes: 3af2c80ae2f5 ("drm/amdgpu: refine gfx10 firmware loading")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3551
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Sunil Khatri [Tue, 13 Aug 2024 17:04:26 +0000 (22:34 +0530)]

drm/amdgpu: remove ME0 registers from mi300 dump

Remove the ME0 registers from the MI300 gfx_9_4_3 IP dump.
MI300 does not have a gfx ME, so those registers
are always empty and can be dropped.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Wed, 24 Jul 2024 22:20:57 +0000 (18:20 -0400)]

drm/amdgpu/gfx9: use rlc safe mode for soft recovery

Protect the MMIO access with safe mode.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Wed, 24 Jul 2024 22:20:44 +0000 (18:20 -0400)]

drm/amdgpu/gfx9.4.3: use rlc safe mode for soft recovery

Protect the MMIO access with safe mode.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Wed, 24 Jul 2024 22:04:44 +0000 (18:04 -0400)]

drm/amdgpu/gfx9.4.3: use proper rlc safe mode helpers

Rather than open coding it for the queue reset.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Wed, 24 Jul 2024 21:59:47 +0000 (17:59 -0400)]

drm/amdgpu/gfx9: use proper rlc safe mode helpers

Rather than open coding it for the queue reset.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Wed, 17 Jul 2024 23:02:50 +0000 (19:02 -0400)]

drm/amdgpu/gfx9: add ring reset callback for gfx

Add ring reset callback for gfx.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Thu, 18 Jul 2024 14:20:56 +0000 (10:20 -0400)]

drm/amdgpu/gfx9: per queue reset only on bare metal

It's not supported under SR-IOV at the moment.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Jiadong Zhu [Thu, 4 Jul 2024 06:51:58 +0000 (14:51 +0800)]

drm/amdgpu/gfx9.4.3: implement reset_hw_queue for gfx9.4.3

Use MMIO to reset the queue. Enter safe mode
before writing MMIO registers.

v2: set register instance offset according to xcc id.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Jiadong Zhu [Thu, 4 Jul 2024 04:24:31 +0000 (12:24 +0800)]

drm/amdgpu/gfx9: implement reset_hw_queue for gfx9

Use MMIO to reset the queue. Enter safe mode
when writing registers.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Jiadong Zhu [Thu, 4 Jul 2024 04:12:42 +0000 (12:12 +0800)]

drm/amdgpu/gfx: add a new kiq_pm4_funcs callback for reset_hw_queue

Add reset_hw_queue in kiq_pm4_funcs callbacks.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Jiadong Zhu [Fri, 28 Jun 2024 03:48:22 +0000 (11:48 +0800)]

drm/amdgpu/gfx_9.4.3: wait for reset done before remap

There is a race condition in which the CP firmware modifies
the MQD during the reset sequence after the driver has updated
it for remapping. Wait until CP_HQD_ACTIVE becomes
false, then remap the queue.
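
The wait described above amounts to a bounded poll on the queue's active
state before remapping. A minimal sketch, assuming a caller-supplied
status callback in place of a real CP_HQD_ACTIVE register read (all names
here are hypothetical, and the actual driver logic may differ):

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical status callback standing in for a register read. */
typedef bool (*hqd_active_fn)(void *ctx);

/* Poll until the queue reports inactive, giving up after max_polls reads.
 * Returns 0 once inactive, or -ETIMEDOUT if it stayed active. */
static int wait_hqd_inactive(hqd_active_fn is_active, void *ctx,
                             unsigned int max_polls)
{
    for (unsigned int i = 0; i < max_polls; i++) {
        if (!is_active(ctx))
            return 0;
    }
    return -ETIMEDOUT;
}

/* Example callback: reports active for *ctx more reads, then inactive. */
static bool fake_hqd_active(void *ctx)
{
    unsigned int *remaining = ctx;

    if (*remaining == 0)
        return false;
    (*remaining)--;
    return true;
}
```

Only after the wait returns 0 would the caller go on to remap the queue;
on timeout it would report the reset as failed instead.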

v2: fix KIQ locking (Alex)
v3: fix KIQ locking harder

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Jiadong Zhu [Fri, 14 Jun 2024 05:05:32 +0000 (13:05 +0800)]

drm/amdgpu/gfx9.4.3: remap queue after reset successfully

The KIQ unmap_queues command only dequeues the queue.
We have to map the queue back with a clean MQD.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Mon, 3 Jun 2024 21:24:03 +0000 (17:24 -0400)]

drm/amdgpu/gfx9.4.3: add ring reset callback

Add ring reset callback for compute.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Jiadong Zhu [Tue, 2 Jul 2024 01:03:49 +0000 (09:03 +0800)]

drm/amdgpu/gfx9: wait for reset done before remap

There is a race condition in which the CP firmware modifies
the MQD during the reset sequence after the driver has updated
it for remapping. Wait until CP_HQD_ACTIVE becomes
false, then remap the queue.

v2: fix KIQ locking (Alex)
v3: fix KIQ locking harder

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Jiadong Zhu [Tue, 11 Jun 2024 10:06:44 +0000 (18:06 +0800)]

drm/amdgpu/gfx9: remap queue after reset successfully

The KIQ unmap_queues command only dequeues the queue.
We have to map the queue back with a clean MQD.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Mon, 3 Jun 2024 21:23:14 +0000 (17:23 -0400)]

drm/amdgpu/gfx9: add ring reset callback

Add ring reset callback for compute.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Prike Liang [Wed, 12 Jun 2024 07:49:38 +0000 (15:49 +0800)]

drm/amdgpu: increase the reset counter for the queue reset

Update the reset counter for amdgpu_cs_query_reset_state().

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Alex Deucher [Mon, 3 Jun 2024 18:38:20 +0000 (14:38 -0400)]

drm/amdgpu: add per ring reset support (v5)

If a specific job is hung, try and reset just the
ring associated with the job.

v2: move to amdgpu_job.c
v3: fix drm_sched_stop() handling when ring reset fails
v4: drop unnecessary amdgpu_fence_driver_clear_job_fences() and
drm_sched_increase_karma()
v5: rework sched_stop handling

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
