]> www.infradead.org Git - nvme.git/log
nvme.git
9 months agodrm/amdgpu: add missed harvest check for VCN IP v4/v5
Tim Huang [Tue, 23 Jul 2024 08:54:34 +0000 (16:54 +0800)]
drm/amdgpu: add missed harvest check for VCN IP v4/v5

To prevent below probe failure, add a check for models with VCN
IP v4.0.6 where VCN1 may be harvested.

v2:
Apply the same check to VCN IP v4.0 and v5.0.

[   54.070117] RIP: 0010:vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu]
[   54.071055] Code: 80 fb ff 8d 82 00 80 fe ff 81 fe 00 06 00 00 0f 43
c2 49 69 d5 38 0d 00 00 48 8d 71 04 c1 e8 02 4c 01 f2 48 89 b2 50 f6 02
00 <89> 01 48 8b 82 50 f6 02 00 48 8d 48 04 48 89 8a 50 f6 02 00 c7 00
[   54.072408] RSP: 0018:ffffb17985f736f8 EFLAGS: 00010286
[   54.072793] RAX: 00000000000000d6 RBX: ffff99a82f680000 RCX:
0000000000000000
[   54.073315] RDX: ffff99a82f680000 RSI: 0000000000000004 RDI:
ffff99a82f680000
[   54.073835] RBP: ffffb17985f73730 R08: 0000000000000001 R09:
0000000000000000
[   54.074353] R10: 0000000000000008 R11: ffffb17983c05000 R12:
0000000000000000
[   54.074879] R13: 0000000000000000 R14: ffff99a82f680000 R15:
0000000000000001
[   54.075400] FS:  00007f8d9c79a000(0000) GS:ffff99ab2f140000(0000)
knlGS:0000000000000000
[   54.075988] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   54.076408] CR2: 0000000000000000 CR3: 0000000140c3a000 CR4:
0000000000750ef0
[   54.076927] PKRU: 55555554
[   54.077132] Call Trace:
[   54.077319]  <TASK>
[   54.077484]  ? show_regs+0x69/0x80
[   54.077747]  ? __die+0x28/0x70
[   54.077979]  ? page_fault_oops+0x180/0x4b0
[   54.078286]  ? do_user_addr_fault+0x2d2/0x680
[   54.078610]  ? exc_page_fault+0x84/0x190
[   54.078910]  ? asm_exc_page_fault+0x2b/0x30
[   54.079224]  ? vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu]
[   54.079941]  ? vcn_v4_0_5_start_dpg_mode+0xe6/0x36b0 [amdgpu]
[   54.080617]  vcn_v4_0_5_set_powergating_state+0x82/0x19b0 [amdgpu]
[   54.081316]  amdgpu_device_ip_set_powergating_state+0x64/0xc0
[amdgpu]
[   54.082057]  amdgpu_vcn_ring_begin_use+0x6f/0x1d0 [amdgpu]
[   54.082727]  amdgpu_ring_alloc+0x44/0x70 [amdgpu]
[   54.083351]  amdgpu_vcn_dec_sw_ring_test_ring+0x40/0x110 [amdgpu]
[   54.084054]  amdgpu_ring_test_helper+0x22/0x90 [amdgpu]
[   54.084698]  vcn_v4_0_5_hw_init+0x87/0xc0 [amdgpu]
[   54.085307]  amdgpu_device_init+0x1f96/0x2780 [amdgpu]
[   54.085951]  amdgpu_driver_load_kms+0x1e/0xc0 [amdgpu]
[   54.086591]  amdgpu_pci_probe+0x19f/0x550 [amdgpu]
[   54.087215]  local_pci_probe+0x48/0xa0
[   54.087509]  pci_device_probe+0xc9/0x250
[   54.087812]  really_probe+0x1a4/0x3f0
[   54.088101]  __driver_probe_device+0x7d/0x170
[   54.088443]  driver_probe_device+0x24/0xa0
[   54.088765]  __driver_attach+0xdd/0x1d0
[   54.089068]  ? __pfx___driver_attach+0x10/0x10
[   54.089417]  bus_for_each_dev+0x8e/0xe0
[   54.089718]  driver_attach+0x22/0x30
[   54.090000]  bus_add_driver+0x120/0x220
[   54.090303]  driver_register+0x62/0x120
[   54.090606]  ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
[   54.091255]  __pci_register_driver+0x62/0x70
[   54.091593]  amdgpu_init+0x67/0xff0 [amdgpu]
[   54.092190]  do_one_initcall+0x5f/0x330
[   54.092495]  do_init_module+0x68/0x240
[   54.092794]  load_module+0x201c/0x2110
[   54.093093]  init_module_from_file+0x97/0xd0
[   54.093428]  ? init_module_from_file+0x97/0xd0
[   54.093777]  idempotent_init_module+0x11c/0x2a0
[   54.094134]  __x64_sys_finit_module+0x64/0xc0
[   54.094476]  do_syscall_64+0x58/0x120
[   54.094767]  entry_SYSCALL_64_after_hwframe+0x6e/0x76

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
(cherry picked from commit 0b071245ddd98539d4f7493bdd188417fcf2d629)

9 months agodrm/amdgpu: Fix eeprom max record count
Stanley.Yang [Thu, 18 Jul 2024 02:58:04 +0000 (10:58 +0800)]
drm/amdgpu: Fix eeprom max record count

The eeprom table is empty before initializing,
set eeprom table version first before initializing.

Changed from V1:
Reuse amdgpu_ras_set_eeprom_table_version function

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 015b8a2fdf39a4c288ff24e7b715b8d9198e56dc)

9 months agodrm/amdgpu: fix ras UE error injection failure issue
YiPeng Chai [Fri, 19 Jul 2024 12:43:04 +0000 (20:43 +0800)]
drm/amdgpu: fix ras UE error injection failure issue

The ras command shared memory is allocated from
VRAM and the response status of the command
buffer will not be zero due to gpu being in
fatal error state after ras UE error injection.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8284951a6e79c6806c675e5f68a4cd425dd56bc4)

9 months agodrm/amd/display: Remove ASSERT if significance is zero in math_ceil2
Rodrigo Siqueira [Thu, 4 Jul 2024 17:54:34 +0000 (11:54 -0600)]
drm/amd/display: Remove ASSERT if significance is zero in math_ceil2

In the DML math_ceil2 function, there is one ASSERT if the significance
is equal to zero. However, significance might be equal to zero
sometimes, and this is not an issue for a ceil function, but the current
ASSERT will trigger warnings in those cases. This commit removes the
ASSERT if the significance is equal to zero to avoid unnecessary noise.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 332315885d3ccc6d8fe99700f3c2e4c24aa65ab7)

9 months agodrm/amd/display: Check for NULL pointer
Sung Joon Kim [Mon, 8 Jul 2024 23:29:49 +0000 (19:29 -0400)]
drm/amd/display: Check for NULL pointer

[why & how]
Need to make sure plane_state is initialized
before accessing its members.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Xi (Alex) Liu <xi.liu@amd.com>
Signed-off-by: Sung Joon Kim <sungjoon.kim@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 295d91cbc700651782a60572f83c24861607b648)

9 months agodrm/amdgpu/vcn: Use offsets local to VCN/JPEG in VF
Jane Jian [Mon, 15 Jul 2024 10:48:31 +0000 (18:48 +0800)]
drm/amdgpu/vcn: Use offsets local to VCN/JPEG in VF

For VCN/JPEG 4.0.3, use only the local addressing scheme.

- Mask bit higher than AID0 range

v2
remain the case for mmhub use master XCC

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit caaf576292f8ccef5cdc0ac16e77b87dbf6e17ab)

9 months agodrm/amdgpu: Add empty HDP flush function to VCN v4.0.3
Lijo Lazar [Mon, 11 Dec 2023 05:48:42 +0000 (11:18 +0530)]
drm/amdgpu: Add empty HDP flush function to VCN v4.0.3

VCN 4.0.3 does not HDP flush with RRMT enabled. Instead, mmsch
will do the HDP flush.

This change is necessary for VCN v4.0.3, no need for backward compatibility

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 49cfaebe48e97500a68d5322a8194736b0a2c3cf)

9 months agodrm/amdgpu: Add empty HDP flush function to JPEG v4.0.3
Lijo Lazar [Mon, 11 Dec 2023 05:15:38 +0000 (10:45 +0530)]
drm/amdgpu: Add empty HDP flush function to JPEG v4.0.3

JPEG v4.0.3 doesn't support HDP flush when RRMT is enabled. Instead,
mmsch fw will do the flush.

This change is necessary for JPEG v4.0.3, no need for backward compatibility

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 585e3fdb36f59c5cfed0ae06c852dc1df22b1d60)

9 months agodrm/amd/amdgpu: Fix uninitialized variable warnings
Ma Ke [Thu, 18 Jul 2024 14:17:35 +0000 (22:17 +0800)]
drm/amd/amdgpu: Fix uninitialized variable warnings

Return 0 to avoid returning an uninitialized variable r.

Cc: stable@vger.kernel.org
Fixes: 230dd6bb6117 ("drm/amd/amdgpu: implement mode2 reset on smu_v13_0_10")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6472de66c0aa18d50a4b5ca85f8272e88a737676)

9 months agodrm/amdgpu: Fix atomics on GFX12
David Belanger [Mon, 10 Jun 2024 20:38:55 +0000 (16:38 -0400)]
drm/amdgpu: Fix atomics on GFX12

If PCIe supports atomics, configure register to prevent DF from
breaking atomics in separate load/store operations.

Signed-off-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 666f14cab21b17ccc1bdfe1e82458aa429b3b7e0)

9 months agodrm/amdgpu/sdma5.2: Update wptr registers as well as doorbell
Alex Deucher [Tue, 9 Jul 2024 21:54:11 +0000 (17:54 -0400)]
drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell

We seem to have a case where SDMA will sometimes miss a doorbell
if GFX is entering the powergating state when the doorbell comes in.
To workaround this, we can update the wptr via MMIO, however,
this is only safe because we disallow gfxoff in begin_ring() for
SDMA 5.2 and then allow it again in end_ring().

Enable this workaround while we are root causing the issue with
the HW team.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/3440
Tested-by: Friedrich Vock <friedrich.vock@gmx.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
(cherry picked from commit f2ac52634963fc38e4935e11077b6f7854e5d700)

9 months agoMerge tag 'amd-drm-fixes-6.11-2024-07-18' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Mon, 22 Jul 2024 03:03:42 +0000 (13:03 +1000)]
Merge tag 'amd-drm-fixes-6.11-2024-07-18' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-fixes-6.11-2024-07-18:

amdgpu:
- Bump driver version for GFX12 DCC
- DC documention warning fixes
- VCN unified queue power fix
- SMU fix
- RAS fix
- Display corruption fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240718215258.79356-1-alexander.deucher@amd.com
9 months agoMerge tag 'drm-misc-next-fixes-2024-07-19' of https://gitlab.freedesktop.org/drm...
Dave Airlie [Mon, 22 Jul 2024 02:52:55 +0000 (12:52 +1000)]
Merge tag 'drm-misc-next-fixes-2024-07-19' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

Two fixes for v3d to fix an array indexing on newer V3D revisions.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240719-emerald-newt-of-skill-89b54a@houat
9 months agoMerge tag 'drm-xe-next-fixes-2024-07-18' of https://gitlab.freedesktop.org/drm/xe...
Dave Airlie [Mon, 22 Jul 2024 01:51:48 +0000 (11:51 +1000)]
Merge tag 'drm-xe-next-fixes-2024-07-18' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

- Xe_exec ioctl minor fix on sync entry cleanup upon error (Ashutosh)
- SRIOV: limit VF LMEM provisioning (Michal)
- Wedge mode fixes (Brost)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Zpk6CI0FDoTJwkSb@intel.com
9 months agoMerge tag 'drm-intel-next-fixes-2024-07-18' of https://gitlab.freedesktop.org/drm...
Dave Airlie [Mon, 22 Jul 2024 00:51:47 +0000 (10:51 +1000)]
Merge tag 'drm-intel-next-fixes-2024-07-18' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next

- Reset intel_dp->link_trained before retraining the link [dp] (Imre Deak)
- Don't switch the LTTPR mode on an active link [dp] (Imre Deak)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZpjgtowjpUZoHvrl@linux
9 months agodrm/xe: Don't suspend device upon wedge
Matthew Brost [Tue, 16 Jul 2024 06:39:02 +0000 (23:39 -0700)]
drm/xe: Don't suspend device upon wedge

When wedging a device we shouldn't be suspending device as state for
debug will be lost.

Also this appears to not work as the below stack trace pops upon trying
to resume a wedged device:

[  304.245044] INFO: task cat:12115 blocked for more than 151 seconds.
[  304.251333]       Tainted: G        W          6.10.0-rc7-xe+ #3518
[  304.257617] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  304.265459] task:cat             state:D stack:13384 pid:12115 tgid:12115 ppid:3986   flags:0x00000006
[  304.265465] Call Trace:
[  304.265467]  <TASK>
[  304.265469]  __schedule+0x3c4/0xdf0
[  304.265478]  schedule+0x3c/0x140
[  304.265481]  rpm_resume+0x1cc/0x740
[  304.265484]  ? __pfx_autoremove_wake_function+0x10/0x10
[  304.265489]  __pm_runtime_resume+0x49/0x80
[  304.265494]  guc_info+0x6b/0xb0 [xe]
[  304.265538]  ? __pfx___drm_printfn_seq_file+0x10/0x10
[  304.265541]  ? __pfx___drm_puts_seq_file+0x10/0x10
[  304.265545]  seq_read_iter+0x111/0x4c0
[  304.265551]  seq_read+0xfc/0x140
[  304.265556]  full_proxy_read+0x58/0x80
[  304.265560]  vfs_read+0xa7/0x360
[  304.265563]  ? find_held_lock+0x2b/0x80
[  304.265568]  ksys_read+0x64/0xe0
[  304.265571]  do_syscall_64+0x68/0x140
[  304.265575]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  304.265578] RIP: 0033:0x7f4254d14992
[  304.265580] RSP: 002b:00007ffc558666f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[  304.265583] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f4254d14992
[  304.265584] RDX: 0000000000020000 RSI: 00007f4254ebb000 RDI: 0000000000000003
[  304.265586] RBP: 00007f4254ebb000 R08: 00007f4254eba010 R09: 00007f4254eba010
[  304.265587] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000022000
[  304.265588] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
[  304.265593]  </TASK>
[  304.265594]
               Showing all locks held in the system:
[  304.265598] 1 lock held by khungtaskd/57:
[  304.265599]  #0: ffffffff8273b860 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x36/0x1c0
[  304.265607] 3 locks held by kworker/6:1/90:
[  304.265610] 1 lock held by in:imklog/547:
[  304.265611]  #0: ffff88810498cd88 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x76/0xc0
[  304.265620] 1 lock held by dmesg/1310:

v2: Drop local 'err' variable (Jonathan)

Fixes: 8ed9aaae39f3 ("drm/xe: Force wedged state and block GT reset upon any GPU hang")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240716063902.1390130-2-matthew.brost@intel.com
(cherry picked from commit 452bca0edbd0764ca0284239d5438b3edd305ab3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
9 months agodrm/xe: Wedge the entire device
Matthew Brost [Tue, 16 Jul 2024 06:39:01 +0000 (23:39 -0700)]
drm/xe: Wedge the entire device

Wedge the entire device, not just GT which may have triggered the wedge.
To implement this, cleanup the layering so xe_device_declare_wedged()
calls into the lower layers (GT) to ensure entire device is wedged.

While we are here, also signal any pending GT TLB invalidations upon
wedging device.

Lastly, short circuit reset wait if device is wedged.

v2:
 - Short circuit reset wait if device is wedged (Local testing)

Fixes: 8ed9aaae39f3 ("drm/xe: Force wedged state and block GT reset upon any GPU hang")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240716063902.1390130-1-matthew.brost@intel.com
(cherry picked from commit 7dbe8af13c189f5937e87e9fb924d5bbc49e6f71)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
9 months agodrm/xe/pf: Limit fair VF LMEM provisioning
Michal Wajdeczko [Thu, 11 Jul 2024 19:23:19 +0000 (21:23 +0200)]
drm/xe/pf: Limit fair VF LMEM provisioning

Due to the current design of the BO and VRAM manager, any object
with XE_BO_FLAG_PINNED flag, which the PF driver uses during VF
LMEM provisionining, is created with the TTM_PL_FLAG_CONTIGUOUS
flag, which may cause VRAM fragmentation that prevents subsequent
allocations of larger objects, like fair VF LMEM provisioning.

To avoid such failures, round down fair VF LMEM provisioning size
to next power of two size, to compensate what xe_ttm_vram_mgr is
doing to achieve contiguous allocations.

Fixes: ac6598aed1b3 ("drm/xe/pf: Add support to configure SR-IOV VFs")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711192320.1198-2-michal.wajdeczko@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 4c3fe5eae46b92e2fd961b19f7779608352e5368)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
9 months agodrm/xe/exec: Fix minor bug related to xe_sync_entry_cleanup
Ashutosh Dixit [Thu, 11 Jul 2024 21:12:03 +0000 (14:12 -0700)]
drm/xe/exec: Fix minor bug related to xe_sync_entry_cleanup

Increment num_syncs after xe_sync_entry_parse() is successful to ensure
the xe_sync_entry_cleanup() logic under "err_syncs" label works correctly.

v2: Use the same pattern as that in xe_vm.c (Matt Brost)

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711211203.3728180-1-ashutosh.dixit@intel.com
(cherry picked from commit 43a6faa6d9b5e9139758200a79fe9c8f4aaa0c8d)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
9 months agoMerge tag 'amd-drm-next-6.11-2024-07-12' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Wed, 17 Jul 2024 23:19:46 +0000 (09:19 +1000)]
Merge tag 'amd-drm-next-6.11-2024-07-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.11-2024-07-12:

amdgpu:
- RAS fixes
- SMU fixes
- GC 12 updates
- SR-IOV fixes
- IH 7 updates
- DCC fixes
- GC 11.5 fixes
- DP MST fixes
- GFX 9.4.4 fixes
- SMU 14 updates
- Documentation updates
- MAINTAINERS updates
- PSR SU fix
- Misc small fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240712171637.2581787-1-alexander.deucher@amd.com
9 months agodrm/amd/display: fix corruption with high refresh rates on DCN 3.0
Alex Deucher [Tue, 16 Jul 2024 16:49:25 +0000 (12:49 -0400)]
drm/amd/display: fix corruption with high refresh rates on DCN 3.0

This reverts commit bc87d666c05a13e6d4ae1ddce41fc43d2567b9a2 and the
register changes from commit 6d4279cb99ac4f51d10409501d29969f687ac8dc.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3412
Cc: mikhail.v.gavrilov@gmail.com
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.10.x
9 months agoDocumentation/amdgpu: Fix duplicate declaration
Rodrigo Siqueira [Mon, 15 Jul 2024 22:57:47 +0000 (16:57 -0600)]
Documentation/amdgpu: Fix duplicate declaration

Address the below kernel doc warning:

Documentation/gpu/amdgpu/display/display-manager:134:
drivers/gpu/drm/amd/display/dc/inc/hw/mpc.h:3: WARNING: Duplicate C
declaration, also defined at gpu/amdgpu/display/dcn-blocks:101.
Declaration is '.. c:struct:: mpcc_blnd_cfg'.
Documentation/gpu/amdgpu/display/display-manager:146:
drivers/gpu/drm/amd/display/dc/inc/hw/mpc.h:3: WARNING: Duplicate C
declaration, also defined at gpu/amdgpu/display/dcn-blocks:3.
Declaration is '.. c:enum:: mpcc_alpha_blend_mode'.

To address the above warnings, this commit uses the 'no-identifiers'
option in the dcn-blocks to avoid duplication with the previous use of
this function doc in the display-manager file. Finally, replaces the
deprecated ':function:' in favor of ':identifiers:'.

Cc: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agoDocumentation/gpu: Remove undocumented files from dcn-blockshubbub.h
Rodrigo Siqueira [Mon, 15 Jul 2024 22:29:41 +0000 (16:29 -0600)]
Documentation/gpu: Remove undocumented files from dcn-blockshubbub.h

The dchubbub.h and hubp.h do not have any meaningful documentation; for
this reason, this commit removes those files from the dcn-blocks
documentation.

Cc: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amd/display: Add simple struct doc to remove doc build warning
Rodrigo Siqueira [Mon, 15 Jul 2024 22:16:12 +0000 (16:16 -0600)]
drm/amd/display: Add simple struct doc to remove doc build warning

This commit is a part of a series that addresses the following build
warning for opp:

./drivers/gpu/drm/amd/display/dc/inc/hw/opp.h:1: warning: no structured
comments found
./drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h:1: warning: no structured
comments found

This commit fixes this issue by adding a simple kernel-doc to a struct
in the opp.h and the dpp.h files.

Cc: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agoDocumentation/gpu: Adjust DCN documentation paths
Rodrigo Siqueira [Mon, 15 Jul 2024 21:58:40 +0000 (15:58 -0600)]
Documentation/gpu: Adjust DCN documentation paths

When building the kernel-doc, it has the following complaints:

Documentation/gpu/amdgpu/display/dcn-blocks:23:
drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h:3: WARNING: Duplicate C
declaration, also defined at gpu/amdgpu/display/dcn-blocks:3.

Declaration is '.. c:struct:: surface_flip_registers'.

Documentation/gpu/amdgpu/display/dcn-blocks:35:
drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h:3: WARNING: Duplicate C
declaration, also defined at gpu/amdgpu/display/dcn-blocks:3.
Declaration is '.. c:struct:: surface_flip_registers'.

This error happened due to a copy-and-paste where the same file path was
duplicated multiple times to a different set of blocks. This commit
addresses this issue by using the correct file path.

Cc: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agoDocumentation/gpu: Remove ':export:' option from DCN documentation
Rodrigo Siqueira [Mon, 15 Jul 2024 21:10:32 +0000 (15:10 -0600)]
Documentation/gpu: Remove ':export:' option from DCN documentation

This commit reduces, but does not fix, all the occurrences and some of
the documentation warnings related to the 'no structured comments.' This
was caused by the wrong use of the ':export:' option in the DCN
kernel-doc, so this commit drops the usage of those options.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amd/display: Move DIO documentation to the right place
Rodrigo Siqueira [Mon, 15 Jul 2024 20:57:49 +0000 (14:57 -0600)]
drm/amd/display: Move DIO documentation to the right place

When building the kernel-doc, it complains with the below warning:

./drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dio.h:1: warning: no structured comments found
./drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dio.h:1: warning: no structured comments found

This warning was caused by the wrong use of the ':export:' and the lack
of function documentation in the file pointed under the ':internal:'.
This commit addresses those issues by relocating the overview
documentation to the correct C file, removing the ':export:' options,
and adding two simple kernel-doc to ensure that ':internal:' does not
have any warning.

Cc: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/dri-devel/20240715085918.68f5ecc9@canb.auug.org.au/
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amd/swsmu: enable Pstates profile levels for SMU v14.0.4
Li Ma [Wed, 10 Jul 2024 09:29:59 +0000 (17:29 +0800)]
drm/amd/swsmu: enable Pstates profile levels for SMU v14.0.4

Enables following UMD stable Pstates profile levels
of power_dpm_force_performance_level for SMU v14.0.4.

    - profile_peak
    - profile_min_mclk
    - profile_min_sclk
    - profile_standard

Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amd/pm: early return if disabling DPMS for GFX IP v11.5.2
Tim Huang [Fri, 12 Jul 2024 03:05:07 +0000 (11:05 +0800)]
drm/amd/pm: early return if disabling DPMS for GFX IP v11.5.2

This was intended to add support for GFX IP v11.5.2, but it needs
to be applied to all GFX11 and subsequent APUs. Therefore the code
should be revised to accommodate this.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: add mutex to protect ras shared memory
YiPeng Chai [Fri, 12 Apr 2024 05:46:03 +0000 (13:46 +0800)]
drm/amdgpu: add mutex to protect ras shared memory

Add mutex to protect ras shared memory.

v2:
  Add TA_RAS_COMMAND__TRIGGER_ERROR command call
  status check.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amd/display: Add function banner for idle_workqueue
Roman Li [Mon, 15 Jul 2024 20:45:46 +0000 (16:45 -0400)]
drm/amd/display: Add function banner for idle_workqueue

[Why]
htmldocs warning:
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h: warning:
Function parameter or struct member 'idle_workqueue' not described in
'amdgpu_display_manager'.

[How]
Add comment section for idle_workqueue with param description.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/dri-devel/20240715090211.736a9b4d@canb.auug.org.au/
Signed-off-by: Roman Li <Roman.Li@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amd/display: Add doc entry for program_3dlut_size
Alex Hung [Mon, 15 Jul 2024 19:53:17 +0000 (13:53 -0600)]
drm/amd/display: Add doc entry for program_3dlut_size

Fixes the warning:

Function parameter or struct member 'program_3dlut_size' not described in
'mpc_funcs'

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/dri-devel/20240715090445.7e9387ec@canb.auug.org.au/
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu/vcn: not pause dpg for unified queue
Boyuan Zhang [Wed, 10 Jul 2024 20:17:12 +0000 (16:17 -0400)]
drm/amdgpu/vcn: not pause dpg for unified queue

For unified queue, DPG pause for encoding is done inside VCN firmware,
so there is no need to pause dpg based on ring type in kernel.

For VCN3 and below, pausing DPG for encoding in kernel is still needed.

v2: add more comments
v3: update commit message

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu/vcn: identify unified queue in sw init
Boyuan Zhang [Thu, 11 Jul 2024 20:19:54 +0000 (16:19 -0400)]
drm/amdgpu/vcn: identify unified queue in sw init

Determine whether VCN using unified queue in sw_init, instead of calling
functions later on.

v2: fix coding style

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/i915/dp: Don't switch the LTTPR mode on an active link
Imre Deak [Mon, 8 Jul 2024 19:00:25 +0000 (22:00 +0300)]
drm/i915/dp: Don't switch the LTTPR mode on an active link

Switching to transparent mode leads to a loss of link synchronization,
so prevent doing this on an active link. This happened at least on an
Intel N100 system / DELL UD22 dock, the LTTPR residing either on the
host or the dock. To fix the issue, keep the current mode on an active
link, adjusting the LTTPR count accordingly (resetting it to 0 in
transparent mode).

v2: Adjust code comment during link training about reiniting the LTTPRs.
   (Ville)

Fixes: 7b2a4ab8b0ef ("drm/i915: Switch to LTTPR transparent mode link training")
Reported-and-tested-by: Gareth Yu <gareth.yu@intel.com>
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10902
Cc: <stable@vger.kernel.org> # v5.15+
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708190029.271247-3-imre.deak@intel.com
(cherry picked from commit 211ad49cf8ccfdc798a719b4d1e000d0a8a9e588)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
9 months agodrm/i915/dp: Reset intel_dp->link_trained before retraining the link
Imre Deak [Mon, 8 Jul 2024 19:00:24 +0000 (22:00 +0300)]
drm/i915/dp: Reset intel_dp->link_trained before retraining the link

Regularly retraining a link during an atomic commit happens with the
given pipe/link already disabled and hence intel_dp->link_trained being
false. Ensure this also for retraining a DP SST link via direct calls to
the link training functions (vs. an actual commit as for DP MST). So far
nothing depended on this, however the next patch will depend on
link_trained==false for changing the LTTPR mode to non-transparent.

Cc: <stable@vger.kernel.org> # v5.15+
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708190029.271247-2-imre.deak@intel.com
(cherry picked from commit a4d5ce61765c08ab364aa4b327f6739b646e6cfa)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
9 months agodrm/amd/display: fix doc entry for bb_from_dmub
Aurabindo Pillai [Mon, 15 Jul 2024 19:02:20 +0000 (15:02 -0400)]
drm/amd/display: fix doc entry for bb_from_dmub

Fixes the warning:

Function parameter or struct member 'bb_from_dmub' not described in 'amdgpu_display_manager'

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amd: Bump KMS_DRIVER_MINOR version
Aurabindo Pillai [Wed, 10 Jul 2024 20:20:47 +0000 (20:20 +0000)]
drm/amd: Bump KMS_DRIVER_MINOR version

Increase the KMS minor version to indicate GFX12 DCC support since this
contains a major change in how DCC is managed across IPs like GFX, DCN
etc. This will be used mainly by userspace like Mesa to figure out
DCC support on GFX12 hardware.

v2: fix version number (Alex)

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/v3d: Fix Indirect Dispatch configuration for V3D 7.1.6 and later
Maíra Canal [Sun, 14 Jul 2024 14:49:12 +0000 (11:49 -0300)]
drm/v3d: Fix Indirect Dispatch configuration for V3D 7.1.6 and later

`args->cfg[4]` is configured in Indirect Dispatch using the number of
batches. Currently, for all V3D tech versions, `args->cfg[4]` equals the
number of batches subtracted by 1. But, for V3D 7.1.6 and later, we must not
subtract 1 from the number of batches.

Implement the fix by checking the V3D tech version and revision.

Fixes several `dEQP-VK.synchronization*` CTS tests related to Indirect Dispatch.

Fixes: 18b8413b25b7 ("drm/v3d: Create a CPU job extension for a indirect CSD job")
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240714145243.1223131-2-mcanal@igalia.com
9 months agodrm/v3d: Add V3D tech revision to the device information
Maíra Canal [Sun, 14 Jul 2024 14:49:11 +0000 (11:49 -0300)]
drm/v3d: Add V3D tech revision to the device information

The V3D tech revision can be a useful information when configuring
jobs. Therefore, expose it in the `struct v3d_dev` with the V3D tech
version.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240714145243.1223131-1-mcanal@igalia.com
9 months agodrm/amdgpu/mes12: add missing opcode string
Alex Deucher [Mon, 8 Jul 2024 21:40:14 +0000 (17:40 -0400)]
drm/amdgpu/mes12: add missing opcode string

Fixes the indexing of the string array.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu/mes11: update opcode strings
Alex Deucher [Mon, 8 Jul 2024 21:39:15 +0000 (17:39 -0400)]
drm/amdgpu/mes11: update opcode strings

Add new packet.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agoRevert "drm/amd/display: Reset freesync config before update new state"
Leo Li [Thu, 11 Jul 2024 14:31:09 +0000 (10:31 -0400)]
Revert "drm/amd/display: Reset freesync config before update new state"

This change caused PSR SU panels to not read from their remote fb,
preventing us from entering self-refresh. It is a regression.

This reverts commit eb6dfbb7a9c67c7d9bcdb9f9b9131270e2144e3d.

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit dc1000bf463d1d89f66d6b5369cf76603f32c4d3)

9 months agodrm/omap: Restrict compile testing to PAGE_SIZE less than 64KB
Nathan Chancellor [Thu, 20 Jun 2024 15:48:16 +0000 (08:48 -0700)]
drm/omap: Restrict compile testing to PAGE_SIZE less than 64KB

Prior to commit dc6fcaaba5a5 ("drm/omap: Allow build with
COMPILE_TEST=y"), it was only possible to build the omapdrm driver with
a 4KB page size. After that change, when the PAGE_SIZE is 64KB or
larger, clang points out that the driver has some assumptions around the
page size implicitly by passing PAGE_SIZE to a parameter with a type of
u16:

  drivers/gpu/drm/omapdrm/omap_gem.c:758:7: error: implicit conversion from 'unsigned long' to 'u16' (aka 'unsigned short') changes value from 65536 to 0 [-Werror,-Wconstant-conversion]
    757 |                 block = tiler_reserve_2d(fmt, omap_obj->width, omap_obj->height,
        |                         ~~~~~~~~~~~~~~~~
    758 |                                          PAGE_SIZE);
        |                                          ^~~~~~~~~
  arch/powerpc/include/asm/page.h:25:34: note: expanded from macro 'PAGE_SIZE'
     25 | #define PAGE_SIZE               (ASM_CONST(1) << PAGE_SHIFT)
        |                                  ~~~~~~~~~~~~~^~~~~~~~~~~~~
  drivers/gpu/drm/omapdrm/omap_gem.c:1504:44: error: implicit conversion from 'unsigned long' to 'u16' (aka 'unsigned short') changes value from 65536 to 0 [-Werror,-Wconstant-conversion]
   1504 |                         block = tiler_reserve_2d(fmts[i], w, h, PAGE_SIZE);
        |                                 ~~~~~~~~~~~~~~~~                ^~~~~~~~~
  arch/powerpc/include/asm/page.h:25:34: note: expanded from macro 'PAGE_SIZE'
     25 | #define PAGE_SIZE               (ASM_CONST(1) << PAGE_SHIFT)
        |                                  ~~~~~~~~~~~~~^~~~~~~~~~~~~
  2 errors generated.

As there is a lot of use of a u16 type throughout this driver and it
will only ever be run on hardware that has a 4KB page size, just
restrict compile testing to when the page size is less than 64KB (as no
other issues have been discussed and it keeps compile testing relatively
more available).

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240620-omapdrm-restrict-compile-test-to-sub-64kb-page-size-v1-1-5e56de71ffca@kernel.org
9 months agoMerge tag 'drm-xe-next-fixes-2024-07-11' of https://gitlab.freedesktop.org/drm/xe...
Dave Airlie [Fri, 12 Jul 2024 02:52:15 +0000 (12:52 +1000)]
Merge tag 'drm-xe-next-fixes-2024-07-11' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

UAPI Changes:
- Rename xe perf layer as xe observation layer (Ashutosh)

Driver Changes:
- Drop trace_xe_hw_fence_free (Brost)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Zo_3ustogPDVKZwu@intel.com
9 months agoMerge tag 'drm-misc-next-fixes-2024-07-11' of https://gitlab.freedesktop.org/drm...
Dave Airlie [Fri, 12 Jul 2024 00:42:16 +0000 (10:42 +1000)]
Merge tag 'drm-misc-next-fixes-2024-07-11' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

A fix for fbdev on big endian systems, a condition fix for a sharp panel
at removal, and a fix for qxl to prevent unpinned buffer access under
certain conditions.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711-benign-rich-mouflon-2eeafe@houat
9 months agodrm/xe: Drop trace_xe_hw_fence_free
Matthew Brost [Mon, 8 Jul 2024 21:10:08 +0000 (14:10 -0700)]
drm/xe: Drop trace_xe_hw_fence_free

fence->ctx may be stale memory when trace_xe_hw_fence_free is called
resuling UAF bug when deriving the device name. This tracepoint is not
all that useful, so just drop it.

Fixes: 501c4255c409 ("drm/xe/trace: Print device_id in xe_trace events")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708211008.956384-1-matthew.brost@intel.com
(cherry picked from commit caaf1f44a6a27bae33eee189842c4d8fc21c3b02)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
9 months agodrm/xe/uapi: Rename xe perf layer as xe observation layer
Ashutosh Dixit [Wed, 3 Jul 2024 16:48:01 +0000 (09:48 -0700)]
drm/xe/uapi: Rename xe perf layer as xe observation layer

In Xe, the perf layer allows capture of HW counter streams. These HW
counters are generally performance related but don't have to be necessarily
so. Also, the name "perf" is a carryover from i915 and is not preferred.

Here we propose the name "observation" for this common layer which allows
capture of different types of these counter streams.

v2: Rename observability layer to observation layer (Lucas/Rodrigo)
v3: Rename sysctl file to "observation_paranoid" (Jose)

Fixes: 52c2e956dceb ("drm/xe/perf/uapi: "Perf" layer to support multiple perf counter stream types")
Fixes: fe8929bdf835 ("drm/xe/perf/uapi: Add perf_stream_paranoid sysctl")
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240703164801.2561423-1-ashutosh.dixit@intel.com
(cherry picked from commit 8169b2097d88d99d7e4a72e20e4b549efe9eb8d7)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
9 months agodrm/amdgpu: remove exp hw support check for gfx12
Alex Deucher [Mon, 29 Apr 2024 22:11:11 +0000 (18:11 -0400)]
drm/amdgpu: remove exp hw support check for gfx12

Enable it by default.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: timely save bad pages to eeprom after gpu ras reset is completed
YiPeng Chai [Tue, 2 Jul 2024 09:53:02 +0000 (17:53 +0800)]
drm/amdgpu: timely save bad pages to eeprom after gpu ras reset is completed

The problem case is as follows:
1. GPU A triggers a gpu ras reset, and GPU A drives
   GPU B to also perform a gpu ras reset.
2. After gpu B ras reset started, gpu B queried a DE
   data. Since the DE data was queried in the ras reset
   thread instead of the page retirement thread, bad
   page retirement work would not be triggered. Then
   even if all gpu resets are completed, the bad pages
   will be cached in RAM until GPU B's bad page retirement
   work is triggered again and then saved to eeprom.

This patch can save the bad pages to eeprom in time after gpu
ras reset is completed.

v2:
  1. Add the above description to code comments.
  2. Reuse existing function.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: flush all cached ras bad pages to eeprom
YiPeng Chai [Tue, 2 Jul 2024 10:16:52 +0000 (18:16 +0800)]
drm/amdgpu: flush all cached ras bad pages to eeprom

Before uninstalling gpu driver, flush all cached ras
bad pages to eeprom.

v2:
  Put the same code into a function and reuse the function.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: select compute ME engines dynamically
Sunil Khatri [Tue, 9 Jul 2024 05:58:22 +0000 (11:28 +0530)]
drm/amdgpu: select compute ME engines dynamically

GFX ME right now is one but this could change in
future SOC's. Use no of ME for GFX as start point
for ME for compute for GFX12.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amd/display: Allow display DCC for DCN401
Aurabindo Pillai [Wed, 3 Jul 2024 16:34:58 +0000 (16:34 +0000)]
drm/amd/display: Allow display DCC for DCN401

To enable mesa to use display dcc, DM should expose them in the
supported modifiers. Add the best (most efficient) modifiers first.

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: select compute ME engines dynamically
Sunil Khatri [Tue, 9 Jul 2024 05:54:39 +0000 (11:24 +0530)]
drm/amdgpu: select compute ME engines dynamically

GFX ME right now is one but this could change in
future SOC's. Use no of ME for GFX as start point
for ME for compute for GFX11.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu/job: Replace DRM_INFO/ERROR logging
Alex Deucher [Mon, 8 Jul 2024 19:02:40 +0000 (15:02 -0400)]
drm/amdgpu/job: Replace DRM_INFO/ERROR logging

Use the dev_info/err variants so we get per device logging
in multi-GPU cases.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: select compute ME engines dynamically
Sunil Khatri [Tue, 9 Jul 2024 05:29:36 +0000 (10:59 +0530)]
drm/amdgpu: select compute ME engines dynamically

GFX ME right now is one but this could change in
future SOC's. Use no of ME for GFX as start point
for ME for compute for GFX10.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amd/pm: Ignore initial value in smu response register
Danijel Slivka [Fri, 5 Jul 2024 12:15:32 +0000 (14:15 +0200)]
drm/amd/pm: Ignore initial value in smu response register

Why:
If the reg mmMP1_SMN_C2PMSG_90 is being written to during amdgpu driver
load or driver unload, subsequent amdgpu driver load will fail at
smu_hw_init. The default of mmMP1_SMN_C2PMSG_90 register at a clean
environment is 0x1 and if value differs from expected, amdgpu driver
load will fail.

How to fix:
Ignore the initial value in smu response register before the first smu
message is sent,if smc in SMU_FW_INIT state, just proceed further to
send the message. If register holds an unexpected value after smu message
was sent set, smc_state to SMU_FW_HANG state and no further smu messages
will be sent.

v2:
Set SMU_FW_INIT state at the start of smu hw_init/resume.
Check smc_fw_state before sending smu message if in hang state skip
sending message.
Set SMU_FW_HANG only in case unexpected value is detected

Signed-off-by: Danijel Slivka <danijel.slivka@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: Initialize VF partition mode
Lijo Lazar [Wed, 3 Jul 2024 06:22:47 +0000 (11:52 +0530)]
drm/amdgpu: Initialize VF partition mode

For SOCs with GFX v9.4.3, a VF may have multiple compute partitions.
Fetch the partition information during init and initialize partition
nodes. There is no support to switch partition mode in VF mode, hence
disable the same.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amd/amdgpu: fix SDMA IRQ client ID <-> req mapping.
Gavin Wan [Mon, 8 Jul 2024 17:07:04 +0000 (17:07 +0000)]
drm/amd/amdgpu: fix SDMA IRQ client ID <-> req mapping.

sdma has 2 instances in SRIOV cpx mode. Odd numbered VFs have
sdma0/sdma1 instances. Even numbered vfs have sdma2/sdma3. For
Even numbered vfs, the sdma2 & sdma3 (irq srouce id
CLIENTID_SDMA2 and CLIENTID_SDMA3) should map to irq seq 0 & 1.

Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agoMAINTAINERS: fix Xinhui's name
Alex Deucher [Wed, 3 Jul 2024 19:34:16 +0000 (15:34 -0400)]
MAINTAINERS: fix Xinhui's name

Switch to fist last for consistency.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Xinhui Pan <Xinhui.Pan@amd.com>
9 months agoMAINTAINERS: update powerplay and swsmu
Alex Deucher [Wed, 3 Jul 2024 19:32:16 +0000 (15:32 -0400)]
MAINTAINERS: update powerplay and swsmu

Evan is no longer maintaining powerplay and swsmu.
Add Kenneth Feng as his replacement.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Kenneth Feng <kenneth.feng@amd.com>
9 months agoMerge tag 'drm-intel-next-2024-06-28' of https://gitlab.freedesktop.org/drm/i915...
Daniel Vetter [Wed, 10 Jul 2024 08:36:46 +0000 (10:36 +0200)]
Merge tag 'drm-intel-next-2024-06-28' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next

drm/i915 feature pull #2 for v6.11:

Features and functionality:
- More eDP Panel Replay enabling (Jouni)
- Add async flip and flip done tracepoints (Ville)

Refactoring and cleanups:
- Clean up BDW+ pipe interrupt register definitions (Ville)
- Prep work for DSB based plane programming (Ville)
- Relocate encoder suspend/shutdown helpers (Imre)
- Polish plane surface alignment handling (Ville)

Fixes:
- Enable more fault interrupts on TGL+/MTL+ (Ville)
- Fix CMRR 32-bit build (Mitul)
- Fix PSR Selective Update Region Scan Line Capture Indication (Jouni)
- Fix cursor fb unpinning (Maarten, Ville)
- Fix Cx0 PHY PLL state verification in TBT mode (Imre)
- Fix unnecessary MG DP programming on MTL+ Type-C (Imre)

DRM changes:
- Rename drm_plane_check_pixel_format() to drm_plane_has_format() and export
  (Ville)
- Add drm_vblank_work_flush_all() (Maarten)

Xe driver changes:
- Call encoder .suspend_complete() hook also on Xe (Imre)

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/875xttazx2.fsf@intel.com
9 months agodrm/qxl: Pin buffer objects for internal mappings
Thomas Zimmermann [Mon, 8 Jul 2024 14:21:37 +0000 (16:21 +0200)]
drm/qxl: Pin buffer objects for internal mappings

Add qxl_bo_pin_and_vmap() that pins and vmaps a buffer object in one
step. Update callers of the regular qxl_bo_vmap(). Fixes a bug where
qxl accesses an unpinned buffer object while it is being moved; such
as with the monitor-description BO. An typical error is shown below.

[    4.303586] [drm:drm_atomic_helper_commit_planes] *ERROR* head 1 wrong: 65376256x16777216+0+0
[    4.586883] [drm:drm_atomic_helper_commit_planes] *ERROR* head 1 wrong: 65376256x16777216+0+0
[    4.904036] [drm:drm_atomic_helper_commit_planes] *ERROR* head 1 wrong: 65335296x16777216+0+0
[    5.374347] [drm:qxl_release_from_id_locked] *ERROR* failed to find id in release_idr

Commit b33651a5c98d ("drm/qxl: Do not pin buffer objects for vmap")
removed the implicit pin operation from qxl's vmap code. This is the
correct behavior for GEM and PRIME interfaces, but the pin is still
needed for qxl internal operation.

Also add a corresponding function qxl_bo_vunmap_and_unpin() and remove
the old qxl_bo_vmap() helpers.

Future directions: BOs should not be pinned or vmapped unnecessarily.
The pin-and-vmap operation should be removed from the driver and a
temporary mapping should be established with a vmap_local-like helper.
See the client helper drm_client_buffer_vmap_local() for semantics.

v2:
- unreserve BO on errors in qxl_bo_pin_and_vmap() (Dmitry)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: b33651a5c98d ("drm/qxl: Do not pin buffer objects for vmap")
Reported-by: David Kaplan <david.kaplan@amd.com>
Closes: https://lore.kernel.org/dri-devel/ab0fb17d-0f96-4ee6-8b21-65d02bb02655@suse.de/
Tested-by: David Kaplan <david.kaplan@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Zack Rusin <zack.rusin@broadcom.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: virtualization@lists.linux.dev
Cc: spice-devel@lists.freedesktop.org
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708142208.194361-1-tzimmermann@suse.de
9 months agodrm/panel: sharp-lq101r1sx01: Fixed reversed "if" in remove
Douglas Anderson [Mon, 8 Jul 2024 17:52:21 +0000 (10:52 -0700)]
drm/panel: sharp-lq101r1sx01: Fixed reversed "if" in remove

Commit d7d473d8464e ("drm/panel: sharp-lq101r1sx01: Don't call disable
at shutdown/remove") had a subtle bug. We should be calling
sharp_panel_del() when the "sharp" variable is non-NULL, not when it's
NULL. Fix.

Fixes: d7d473d8464e ("drm/panel: sharp-lq101r1sx01: Don't call disable at shutdown/remove")
Cc: Thierry Reding <treding@nvidia.com>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202406261525.SkhtM3ZV-lkp@intel.com/
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708105221.1.I576751c661c7edb6b804dda405d10e2e71153e32@changeid
9 months agodrm/amdgpu: set CP_HQD_PQ_DOORBELL_CONTROL.DOORBELL_MODE to 1
Zhigang Luo [Tue, 25 Jun 2024 17:53:56 +0000 (13:53 -0400)]
drm/amdgpu: set CP_HQD_PQ_DOORBELL_CONTROL.DOORBELL_MODE to 1

to avoid reading wrong WPTR from doorbell in sriov vf, set
CP_HQD_PQ_DOORBELL_CONTROL.DOORBELL_MODE to 1 to read WPTR from MQD.

Signed-off-by: Zhigang Luo <Zhigang.Luo@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agoDocumentation/amdgpu: Clarify MI200 and MI300 entries
Kent Russell [Thu, 4 Jul 2024 14:51:44 +0000 (10:51 -0400)]
Documentation/amdgpu: Clarify MI200 and MI300 entries

Add "Series" to MI200 and MI300 to clarify that they represent the
series of cards, and to more closely match the product information
materials. This also matches other entries in this list

Also correct a typo in the MI300 codename (Vangaram->Vanjaram)

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: add ras event state device attribute support
Yang Wang [Wed, 3 Jul 2024 02:23:13 +0000 (10:23 +0800)]
drm/amdgpu: add ras event state device attribute support

add amdgpu ras 'event_state' sysfs device attribute support

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amd/swsmu: enable more Pstates profile levels for SMU v14.0.0 and v14.0.1
Li Ma [Mon, 1 Jul 2024 07:56:12 +0000 (15:56 +0800)]
drm/amd/swsmu: enable more Pstates profile levels for SMU v14.0.0 and v14.0.1

V1:  This patch enables following UMD stable Pstates profile
levels for power_dpm_force_performance_level interface.

- profile_peak
- profile_min_mclk
- profile_min_sclk
- profile_standard

V2: Fix conflict with commit "drm/amd/pm: smu v14.0.4 reuse smu v14.0.0 dpmtable "

V3: Add VCLK1 and DCLK1 support for SMU V14.0.1
And avoid to set VCLK1 and DCLK1 for SMU v14.0.0

Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: add ras POSION_CONSUMPTION event id support
Yang Wang [Fri, 28 Jun 2024 08:24:39 +0000 (16:24 +0800)]
drm/amdgpu: add ras POSION_CONSUMPTION event id support

add amdgpu ras POSION_CONSUMPTION event id support.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdkfd: Use mode1 reset for GFX v9.4.4
Stanley.Yang [Wed, 3 Jul 2024 07:34:35 +0000 (15:34 +0800)]
drm/amdkfd: Use mode1 reset for GFX v9.4.4

GFX v9.4.4 uses mode1 reset to handle poison consumption.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: add ras POSION_CREATION event id support
Yang Wang [Thu, 27 Jun 2024 03:43:09 +0000 (11:43 +0800)]
drm/amdgpu: add ras POSION_CREATION event id support

add amdgpu ras POSION_CREATION event id support.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: refine amdgpu ras event id core code
Yang Wang [Tue, 25 Jun 2024 06:23:42 +0000 (14:23 +0800)]
drm/amdgpu: refine amdgpu ras event id core code

v1:
- use unified event id to manage ras events
- add a new function amdgpu_ras_query_error_status_with_event() to accept
  event type as parameter.

v2:
add a warn log to show the location of function failure
when calling amdgpu_ras_mark_event(). (Tao Zhou)

v3:
change RAS_EVENT_TYPE_ISR to RAS_EVENT_TYPE_FATAL.

v4:
rename amdgpu_ras_get_recovery_event() to
amdgpu_ras_get_fatal_error_event().

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amd/display: Solve mst monitors blank out problem after resume
Wayne Lin [Thu, 23 May 2024 04:18:07 +0000 (12:18 +0800)]
drm/amd/display: Solve mst monitors blank out problem after resume

[Why]
In dm resume, we firstly restore dc state and do the mst resume for topology
probing thereafter. If we change dpcd DP_MSTM_CTRL value after LT in mst reume,
it will cause light up problem on the hub.

[How]
Revert commit 202dc359adda ("drm/amd/display: Defer handling mst up request in resume").
And adjust the reason to trigger dc_link_detect by DETECT_REASON_RESUMEFROMS3S4.

Cc: stable@vger.kernel.org
Fixes: 202dc359adda ("drm/amd/display: Defer handling mst up request in resume")
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Reviewed-by: Fangzhi Zuo <jerry.zuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: reject gang submit on reserved VMIDs
Christian König [Fri, 19 Jan 2024 13:57:29 +0000 (14:57 +0100)]
drm/amdgpu: reject gang submit on reserved VMIDs

A gang submit won't work if the VMID is reserved and we can't flush out
VM changes from multiple engines at the same time.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: enable dpg for vcn and jpeg on GC 11_5_2
Saleemkhan Jamadar [Thu, 4 Jul 2024 09:49:35 +0000 (15:19 +0530)]
drm/amdgpu: enable dpg for vcn and jpeg on GC 11_5_2

DPG mode is enabled for vcn and jpeg on VCN v4_0_5

Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: remove redundant semicolons in RAS_EVENT_LOG
Yang Wang [Thu, 4 Jul 2024 05:48:19 +0000 (13:48 +0800)]
drm/amdgpu: remove redundant semicolons in RAS_EVENT_LOG

remove redundant semicolons in RAS_EVENT_LOG to avoid
code format check warning.

Fixes: b712d7c20133 ("drm/amdgpu: fix compiler 'side-effect' check issue for RAS_EVENT_LOG()")
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: restore dcc bo tilling configs while moving
Frank Min [Thu, 30 May 2024 07:01:59 +0000 (15:01 +0800)]
drm/amdgpu: restore dcc bo tilling configs while moving

While moving buffer which has dcc tiling config, it is needed to restore
its original dcc tiling.

1. extend copy flag to cover tiling bits
2. add logic to restore original dcc tiling config

Signed-off-by: Frank Min <Frank.Min@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: add gfx queue support for gfx12 ipdump
Sunil Khatri [Thu, 27 Jun 2024 12:43:11 +0000 (18:13 +0530)]
drm/amdgpu: add gfx queue support for gfx12 ipdump

Add support of all the CP GFX queues for gfx12 ipdump
to be used by devcoredump.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: add cp queue registers for gfx12 ipdump
Sunil Khatri [Thu, 27 Jun 2024 13:09:55 +0000 (18:39 +0530)]
drm/amdgpu: add cp queue registers for gfx12 ipdump

Add gfx12 support of CP queue registers for all queues
to be used by devcoredump.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: enable redirection of irq's for IH v7.0
Sunil Khatri [Wed, 3 Jul 2024 17:34:11 +0000 (23:04 +0530)]
drm/amdgpu: enable redirection of irq's for IH v7.0

Enable redirection of irq for pagefaults for specific
clients to avoid overflow without dropping interrupts.

So here we redirect the interrupts to another IH ring
i.e ring1 where only these interrupts are processed.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm:amdgpu: enable IH ring1 for IH v7.0
Sunil Khatri [Wed, 3 Jul 2024 17:30:46 +0000 (23:00 +0530)]
drm:amdgpu: enable IH ring1 for IH v7.0

We need IH ring1 for handling the pagefault
interrupts which over flow in default
ring for specific usecases.

Enable ring1 allows software to redirect
high interrupts to ring1 from default IH
ring.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: Set no_hw_access when VF request full GPU fails
Yifan Zha [Thu, 27 Jun 2024 07:06:23 +0000 (15:06 +0800)]
drm/amdgpu: Set no_hw_access when VF request full GPU fails

[Why]
If VF request full GPU access and the request failed,
the VF driver can get stuck accessing registers for an extended period during
the unload of KMS.

[How]
Set no_hw_access flag when VF request for full GPU access fails
This prevents further hardware access attempts, avoiding the prolonged
stuck state.

Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: add print support for gfx12 ipdump
Sunil Khatri [Thu, 27 Jun 2024 06:55:46 +0000 (12:25 +0530)]
drm/amdgpu: add print support for gfx12 ipdump

Add support of gfx12 ipdump print so devcoredump
could trigger it to dump the captured registers
in devcoredump.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: add gfx12 register support in ipdump
Sunil Khatri [Thu, 27 Jun 2024 06:52:06 +0000 (12:22 +0530)]
drm/amdgpu: add gfx12 register support in ipdump

Add general registers of gfx12 in ipdump for
devcoredump support.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: update gfxhub client id for gfx12
Frank Min [Thu, 20 Jun 2024 05:57:55 +0000 (13:57 +0800)]
drm/amdgpu: update gfxhub client id for gfx12

update gfxhub client id for gfx12

Signed-off-by: Frank Min <Frank.Min@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amd/pm: avoid to load smu firmware for APUs
Tim Huang [Thu, 13 Jun 2024 02:34:13 +0000 (10:34 +0800)]
drm/amd/pm: avoid to load smu firmware for APUs

Certain call paths still load the SMU firmware for APUs,
which needs to be skipped.

Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/amdgpu: sysfs node disable query error count during gpu reset
YiPeng Chai [Mon, 1 Jul 2024 06:43:17 +0000 (14:43 +0800)]
drm/amdgpu: sysfs node disable query error count during gpu reset

Sysfs node disable query error count during gpu reset.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
9 months agodrm/fbdev-dma: Fix framebuffer mode for big endian devices
Thomas Huth [Tue, 2 Jul 2024 12:17:37 +0000 (14:17 +0200)]
drm/fbdev-dma: Fix framebuffer mode for big endian devices

The drm_mode_legacy_fb_format() function only generates formats suitable
for little endian devices. switch to drm_driver_legacy_fb_format() here
instead to take the device endianness into consideration, too.

Suggested-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 6ae2ff23aa43 ("drm/client: Convert drm_client_buffer_addfb() to drm_mode_addfb2()")
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: <stable@vger.kernel.org> # v6.7+
Link: https://patchwork.freedesktop.org/patch/msgid/20240702121737.522878-1-thuth@redhat.com
9 months agoMerge tag 'drm-msm-next-2024-07-04' of https://gitlab.freedesktop.org/drm/msm into...
Daniel Vetter [Fri, 5 Jul 2024 10:45:40 +0000 (12:45 +0200)]
Merge tag 'drm-msm-next-2024-07-04' of https://gitlab.freedesktop.org/drm/msm into drm-next

Updates for v6.11

Core:
- SM7150 support

DPU:
- SM7150 support
- Fix DSC support for DSI panels in video mode
- Fixed TE vsync source support for DSI command-mode panels
- Fix for devices without UBWC in the display controller (ie.
  QCM2290)

DSI:
- Remove unused register-writing wrappers
- Fix DSC support for panels in video mode
- Add support for parsing TE vsync source
- Add support for MSM8937 (28nm DSI PHY)

MDP5:
- Add support for MSM8937
- Fix configuration for MSM8953

GPU:
- Split giant device table into per-gen "hw catalog" similar to
  what is done on the display side of the driver
- Fix a702 UBWC mode
- Fix unused variably warnings
- GPU memory traces
- Add param for userspace to know if raytracing is supported
- Memory barrier cleanup and GBIF unhalt fix
- X185 support (aka gpu in X1 laptop chips)
- a505 support
- fixes

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvZQpYEHpSCgXGJ2kaHJDK6QFAFfTsfiWm4b2zZOnjXGw@mail.gmail.com
9 months agoMerge tag 'drm-misc-next-2024-07-04' of https://gitlab.freedesktop.org/drm/misc/kerne...
Daniel Vetter [Fri, 5 Jul 2024 10:37:21 +0000 (12:37 +0200)]
Merge tag 'drm-misc-next-2024-07-04' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for $kernel-version:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
  - dp/mst: Fix daisy-chaining at resume
  - dsc: Add helper to dump the DSC configuration
  - tests: Add tests for the new monochrome TV mode variant

Driver Changes:
  - ast: Refactor the mode setting code
  - panfrost: Fix devfreq job reporting
  - stm: Add LDVS support, DSI PHY updates
  - panels:
    - New panel: AUO G104STN01, K&d kd101ne3-40ti,

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240704-curvy-outstanding-lizard-bcea78@houat
9 months agoMerge tag 'drm-intel-gt-next-2024-07-04' of https://gitlab.freedesktop.org/drm/i915...
Daniel Vetter [Fri, 5 Jul 2024 10:14:58 +0000 (12:14 +0200)]
Merge tag 'drm-intel-gt-next-2024-07-04' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next

Driver Changes:

Fixes/improvements/new stuff:

- Downgrade stolen lmem setup warning [gem] (Jonathan Cavitt)
- Evaluate GuC priority within locks [gt/uc] (Andi Shyti)
- Fix potential UAF by revoke of fence registers [gt] (Janusz Krzysztofik)
- Return NULL instead of '0' [gem] (Andi Shyti)
- Use the correct format specifier for resource_size_t [gem] (Andi Shyti)
- Suppress oom warning in favour of ENOMEM to userspace [gem] (Nirmoy Das)

Miscellaneous:

- Evaluate forcewake usage within locks [gt] (Andi Shyti)
- Fix typo in comment [gt/uc] (Andi Shyti)

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Tvrtko Ursulin <tursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZoZP6mUSergfzFMh@linux
9 months agoMerge tag 'amd-drm-next-6.11-2024-07-03' of https://gitlab.freedesktop.org/agd5f...
Daniel Vetter [Fri, 5 Jul 2024 10:02:11 +0000 (12:02 +0200)]
Merge tag 'amd-drm-next-6.11-2024-07-03' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.11-2024-07-03:

amdgpu:
- Use vmalloc for dc_state
- Replay fixes
- Freesync fixes
- DCN 4.0.1 fixes
- DML fixes
- DCC updates
- Misc code cleanups and bug fixes
- 8K display fixes
- DCN 3.5 fixes
- Restructure DIO code
- DML1 fixes
- DML2 fixes
- GFX11 fix
- GFX12 updates
- GFX12 modifiers fixes
- RAS fixes
- IP dump fixes
- Add some updated IP version checks
_ Silence UBSAN warning

radeon:
- GPUVM fix

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240703211314.2041893-1-alexander.deucher@amd.com
9 months agoMerge tag 'amd-drm-next-6.11-2024-06-28' of https://gitlab.freedesktop.org/agd5f...
Daniel Vetter [Fri, 5 Jul 2024 09:39:22 +0000 (11:39 +0200)]
Merge tag 'amd-drm-next-6.11-2024-06-28' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.11-2024-06-28:

amdgpu:
- JPEG 5.x fixes
- More FW loading cleanups
- Misc code cleanups
- GC 12.x fixes
- ASPM fix
- DCN 4.0.1 updates
- SR-IOV fixes
- HDCP fix
- USB4 fixes
- Silence UBSAN warnings
- MES submission fixes
- Update documentation for new products
- DCC updates
- Initial ISP 4.x plumbing
- RAS fixes
- Misc small fixes

amdkfd:
- Fix missing unlock in error path for adding queues

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240628213135.427214-1-alexander.deucher@amd.com
9 months agoMerge tag 'mediatek-drm-next-6.11' of https://git.kernel.org/pub/scm/linux/kernel...
Daniel Vetter [Fri, 5 Jul 2024 09:36:24 +0000 (11:36 +0200)]
Merge tag 'mediatek-drm-next-6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-next

Mediatek DRM Next for Linux 6.11

1. Convert to platform remove callback returning void
2. Drop chain_mode_fixup call in mode_valid()
3. Fixes the errors of MediaTek display driver found by IGT.
4. Add display support for the MT8365-EVK board
5. Fix bit depth overwritten for mtk_ovl_set bit_depth()
6. Remove less-than-zero comparison of an unsigned value
7. Call drm_atomic_helper_shutdown() at shutdown time
8. Log errors in probe with dev_err_probe()
9. Fix possible_crtcs calculation
10. Fix spurious kfree()

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240628134632.28672-1-chunkuang.hu@kernel.org
9 months agoMerge tag 'drm-etnaviv-next-2024-06-28' of https://git.pengutronix.de/git/lst/linux...
Daniel Vetter [Fri, 5 Jul 2024 09:29:59 +0000 (11:29 +0200)]
Merge tag 'drm-etnaviv-next-2024-06-28' of https://git.pengutronix.de/git/lst/linux into drm-next

- fix i.MX8MP NPU clock gating
- workaround FE register cdc issues on some cores
- fix DMA sync handling for cached buffers
- fix job timeout handling
- keep TS enabled on MMUv2 cores for improved performance

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Lucas Stach <l.stach@pengutronix.de>
Link: https://patchwork.freedesktop.org/patch/msgid/e8b91e2f18e6eaa722569dd21f559009064b1730.camel@pengutronix.de
9 months agoMerge tag 'exynos-drm-next-for-v6.11' of git://git.kernel.org/pub/scm/linux/kernel...
Daniel Vetter [Fri, 5 Jul 2024 09:21:34 +0000 (11:21 +0200)]
Merge tag 'exynos-drm-next-for-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next

Some cleanups to Exynos Virtual Display driver
- Use drm_edid_duplicate() instead of kmemdup().
- Replace existing EDID handling with struct drm_edid functions
  for improved management.
- Keep an allocated raw_edid or NULL and handle fake_edid_info in get_modes().

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Inki Dae <inki.dae@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240703075912.37106-1-inki.dae@samsung.com
9 months agoMerge v6.10-rc6 into drm-next
Daniel Vetter [Fri, 5 Jul 2024 08:35:14 +0000 (10:35 +0200)]
Merge v6.10-rc6 into drm-next

The exynos-next pull is based on a newer -rc than drm-next. hence
backmerge first to make sure the unrelated conflicts we accumulated
don't end up randomly in the exynos merge pull, but are separated out.

Conflicts are all benign: Adjacent changes in amdgpu and fbdev-dma
code, and cherry-pick conflict in xe.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
9 months agoMerge tag 'drm-xe-next-2024-07-02' of https://gitlab.freedesktop.org/drm/xe/kernel...
Daniel Vetter [Fri, 5 Jul 2024 07:12:19 +0000 (09:12 +0200)]
Merge tag 'drm-xe-next-2024-07-02' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

Driver Changes:
- Fix in migration code (Auld)
- Simplification in HWMon related code (Karthik)
- Fix in forcewake logic (Nirmoy)
- Fix engine utilization information (umesh)
- Clean up on MOCS related code (Roper)
- Fix on multicast register (Roper)
- Fix TLB invalidation timeout (Nirmoy)
- More SRIOV preparation (Michal)
- Fix out-of-bounds array access (Lucas)
- Fixes around some mutex utilization (Ashutosh, Vinay)
- Expand LNL workaround to BMG (Vinay)

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZoROvquFrTFhk3Pb@intel.com
9 months agoMerge drm-misc-next-2024-07-04 into drm-misc-next-fixes
Maxime Ripard [Thu, 4 Jul 2024 13:19:33 +0000 (15:19 +0200)]
Merge drm-misc-next-2024-07-04 into drm-misc-next-fixes

Let's start the drm-misc-next-fixes cycle.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
9 months agodrm/panthor: Record devfreq busy as soon as a job is started
Steven Price [Wed, 3 Jul 2024 15:56:46 +0000 (16:56 +0100)]
drm/panthor: Record devfreq busy as soon as a job is started

If a queue is already assigned to the hardware, then a newly submitted
job can start straight away without waiting for the tick. However in
this case the devfreq infrastructure isn't notified that the GPU is
busy. By the time the tick happens the job might well have finished and
no time will be accounted for the GPU being busy.

Fix this by recording the GPU as busy directly in queue_run_job() in the
case where there is a CSG assigned and therefore we just ring the
doorbell.

Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
Signed-off-by: Steven Price <steven.price@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240703155646.80928-1-steven.price@arm.com