www.infradead.org Git - users/willy/linux.git/log

Merge tag 'amd-drm-fixes-5.13-2021-05-05' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-fixes-5.13-2021-05-05:

amdgpu:
- MPO hang workaround
- Fix for concurrent VM flushes on vega/navi
- dcefclk is not adjustable on navi1x and newer
- MST HPD debugfs fix
- Suspend/resume fixes
- Register VGA clients late in case driver fails to load
- Fix GEM leak in user framebuffer create
- Add support for polaris12 with 32 bit memory interface
- Fix duplicate cursor issue when using overlay
- Fix corruption with tiled surfaces on VCN3
- Add BO size and stride check to fix BO size verification

radeon:
- Fix off-by-one in power state parsing
- Fix possible memory leak in power state parsing

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210506033929.3875-1-alexander.deucher@amd.com

Merge tag 'drm-misc-next-fixes-2021-05-06' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

Two patches, one to fix a null pointer dereference in msm, and one to
fix an unused-variable warning in fbdev when PROCFS is disabled.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210506122723.oqadel7oacazywij@gilmour

Merge tag 'drm-intel-next-fixes-2021-04-30' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

drm/i915 GVT fixes for v5.13-rc1:
- Fix a possible division by zero in vgpu display rate calculation

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87wnsk16sa.fsf@intel.com

drm/amdgpu: Use device specific BO size & stride check.

The builtin size check isn't really the right thing for AMD
modifiers due to a couple of reasons:

1) In the format structs we don't set any of the tilesize / blocks
etc. to avoid having format arrays per modifier/GPU
2) The pitch on the main plane is pixel_pitch * bytes_per_pixel even
for tiled ...
3) The pitch for the DCC planes is really the pixel pitch of the main
surface that would be covered by it ...

Note that we only handle GFX9+ case but we do this after converting
the implicit modifier to an explicit modifier, so on GFX9+ all
framebuffers should be checked here.

There is a TODO about DCC alignment, but it isn't worse than before
and I'd need to dig a bunch into the specifics. Getting this out in
a reasonable timeframe to make sure it gets the appropriate testing
seemed more important.

Finally as I've found that debugging addfb2 failures is a pita I was
generous adding explicit error messages to every failure case.

Fixes: f258907fdd83 ("drm/amdgpu: Verify bo size can fit framebuffer size on init.")
Tested-by: Simon Ser <contact@emersion.fr>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Init GFX10_ADDR_CONFIG for VCN v3 in DPG mode.

Otherwise tiling modes that require the values from this field
(in particular _*_X) would be corrupted upon video decode.

Copied from the VCN v2 code.

Fixes: 99541f392b4d ("drm/amdgpu: add mc resume DPG mode for VCN3.0")
Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amd/pm: initialize variable

Static analysis reports this problem:

amdgpu_pm.c:478:16: warning: The right operand of '<' is a garbage value
  for (i = 0; i < data.nums; i++) {
                ^ ~~~~~~~~~

In some cases data is not set.  Initialize to 0 and flag not setting
data as an error with the existing check.
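
The pattern described above can be sketched in standalone C. This is only an
illustration, not the amdgpu_pm.c code: the struct, the query helper and the
clock values are hypothetical.

```c
#define MAX_CLOCKS 8

struct clock_table {
	unsigned int nums;
	unsigned int clocks[MAX_CLOCKS];
};

/* Hypothetical query: fills the table only when "supported" is non-zero,
 * mimicking a backend that may leave the caller's struct untouched. */
static void query_clocks(struct clock_table *t, int supported)
{
	if (!supported)
		return;
	t->nums = 2;
	t->clocks[0] = 300;
	t->clocks[1] = 600;
}

/* Returns the highest reported clock, or -1 when nothing was reported. */
int highest_clock(int supported)
{
	struct clock_table data = { 0 };	/* zero-init: nums is never garbage */
	unsigned int i, best = 0;

	query_clocks(&data, supported);
	if (data.nums == 0)			/* unset data is flagged as an error */
		return -1;

	for (i = 0; i < data.nums; i++)
		if (data.clocks[i] > best)
			best = data.clocks[i];
	return (int)best;
}
```

Without the initializer, the loop bound would be read from uninitialized
stack memory whenever the query leaves the struct untouched.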

Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/radeon: Avoid power table parsing memory leaks

Avoid leaving a hanging pre-allocated clock_info if last mode is
invalid, and avoid heap corruption if no valid modes are found.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211537
Fixes: 6991b8f2a319 ("drm/radeon/kms: fix segfault in pm rework")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/radeon: Fix off-by-one power_state index heap overwrite

An out of bounds write happens when setting the default power state.
KASAN sees this as:

[drm] radeon: 512M of GTT memory ready.
[drm] GART: num cpu pages 131072, num gpu pages 131072
==================================================================
BUG: KASAN: slab-out-of-bounds in
radeon_atombios_parse_power_table_1_3+0x1837/0x1998 [radeon]
Write of size 4 at addr ffff88810178d858 by task systemd-udevd/157

CPU: 0 PID: 157 Comm: systemd-udevd Not tainted 5.12.0-E620 #50
Hardware name: eMachines        eMachines E620  /Nile       , BIOS V1.03 09/30/2008
Call Trace:
dump_stack+0xa5/0xe6
print_address_description.constprop.0+0x18/0x239
kasan_report+0x170/0x1a8
radeon_atombios_parse_power_table_1_3+0x1837/0x1998 [radeon]
radeon_atombios_get_power_modes+0x144/0x1888 [radeon]
radeon_pm_init+0x1019/0x1904 [radeon]
rs690_init+0x76e/0x84a [radeon]
radeon_device_init+0x1c1a/0x21e5 [radeon]
radeon_driver_load_kms+0xf5/0x30b [radeon]
drm_dev_register+0x255/0x4a0 [drm]
radeon_pci_probe+0x246/0x2f6 [radeon]
pci_device_probe+0x1aa/0x294
really_probe+0x30e/0x850
driver_probe_device+0xe6/0x135
device_driver_attach+0xc1/0xf8
__driver_attach+0x13f/0x146
bus_for_each_dev+0xfa/0x146
bus_add_driver+0x2b3/0x447
driver_register+0x242/0x2c1
do_one_initcall+0x149/0x2fd
do_init_module+0x1ae/0x573
load_module+0x4dee/0x5cca
__do_sys_finit_module+0xf1/0x140
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae

Without KASAN, this will manifest later when the kernel attempts to
allocate memory that was stomped, since it collides with the inline slab
freelist pointer:

invalid opcode: 0000 [#1] SMP NOPTI
CPU: 0 PID: 781 Comm: openrc-run.sh Tainted: G        W 5.10.12-gentoo-E620 #2
Hardware name: eMachines        eMachines E620  /Nile , BIOS V1.03       09/30/2008
RIP: 0010:kfree+0x115/0x230
Code: 89 c5 e8 75 ea ff ff 48 8b 00 0f ba e0 09 72 63 e8 1f f4 ff ff 41 89 c4 48 8b 45 00 0f ba e0 10 72 0a 48 8b 45 08 a8 01 75 02 <0f> 0b 44 89 e1 48 c7 c2 00 f0 ff ff be 06 00 00 00 48 d3 e2 48 c7
RSP: 0018:ffffb42f40267e10 EFLAGS: 00010246
RAX: ffffd61280ee8d88 RBX: 0000000000000004 RCX: 000000008010000d
RDX: 4000000000000000 RSI: ffffffffba1360b0 RDI: ffffd61280ee8d80
RBP: ffffd61280ee8d80 R08: ffffffffb91bebdf R09: 0000000000000000
R10: ffff8fe2c1047ac8 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000100
FS:  00007fe80eff6b68(0000) GS:ffff8fe339c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe80eec7bc0 CR3: 0000000038012000 CR4: 00000000000006f0
Call Trace:
__free_fdtable+0x16/0x1f
put_files_struct+0x81/0x9b
do_exit+0x433/0x94d
do_group_exit+0xa6/0xa6
__x64_sys_exit_group+0xf/0xf
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fe80ef64bea
Code: Unable to access opcode bytes at RIP 0x7fe80ef64bc0.
RSP: 002b:00007ffdb1c47528 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fe80ef64bea
RDX: 00007fe80ef64f60 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 00007fe80ee2c620 R11: 0000000000000246 R12: 00007fe80eff41e0
R13: 00000000ffffffff R14: 0000000000000024 R15: 00007fe80edf9cd0
Modules linked in: radeon(+) ath5k(+) snd_hda_codec_realtek ...

Use a valid power_state index when initializing the "flags" and "misc"
and "misc2" fields.
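
As a minimal illustration of this class of bug (not the radeon code itself;
the struct and function names are hypothetical): once "count" entries of an
array have been filled, the last valid index is count - 1, and a write to
index count lands one element past the filled region.

```c
#include <stddef.h>

struct power_state {
	unsigned int flags;
	unsigned int misc;
	unsigned int misc2;
};

/* Clear the fields of the last filled entry. "count" is the number of
 * entries already filled, so the last valid index is count - 1.
 * Returns 0 on success, -1 if there is no entry to update. */
int reset_last_state(struct power_state *states, size_t count)
{
	if (states == NULL || count == 0)
		return -1;

	states[count - 1].flags = 0;
	states[count - 1].misc = 0;
	states[count - 1].misc2 = 0;
	return 0;
}
```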

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211537
Reported-by: Erhard F. <erhard_f@mailbox.org>
Fixes: a48b9b4edb8b ("drm/radeon/kms/pm: add asic specific callbacks for getting power state (v2)")
Fixes: 79daedc94281 ("drm/radeon/kms: minor pm cleanups")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix two cursor duplication when using overlay

Our driver supports overlay planes, and as expected, some userspace
compositors take advantage of this feature. If userspace does not
enable the cursor, it can use multiple planes as it pleases.
Nevertheless, we start to have constraints when userspace tries to
enable the hardware cursor with various planes. Basically, we cannot draw
the cursor at the same size and position on two separate pipes since it
uses extra bandwidth and DML only runs with one cursor.

For those reasons, when we enable hardware cursor and multiple planes,
our driver should accept variations like the ones described below:

  +-------------+   +--------------+
  | +---------+ |   |              |
  | |Primary  | |   | Primary      |
  | |         | |   | Overlay      |
  | +---------+ |   |              |
  |Overlay      |   |              |
  +-------------+   +--------------+

In this scenario, we can have the desktop UI in the overlay and some
other framebuffer attached to the primary plane (e.g., video). However,
userspace needs to obey some rules and avoid scenarios like the ones
described below (when enabling hw cursor):

                                      +--------+
                                      |Overlay |
+-------------+    +-----+-------+ +-|        |--+
| +--------+  | +--------+       | | +--------+  |
| |Overlay |  | |Overlay |       | |             |
| |        |  | |        |       | |             |
| +--------+  | +--------+       | |             |
| Primary     |    | Primary     | | Primary     |
+-------------+    +-------------+ +-------------+

+-------------+   +-------------+
|     +--------+  |  Primary    |
|     |Overlay |  |             |
|     |        |  |             |
|     +--------+  | +--------+  |
| Primary     |   | |Overlay |  |
+-------------+   +-|        |--+
                     +--------+

If userspace submits one of the rejected scenarios above, our driver needs
to reject the commit; otherwise, we can have unexpected behavior. Since
we don't have proper driver validation for these cases, we can see
problems such as a duplicate cursor in applications that use multiple
planes. This commit fixes the cursor issue and others by adding adequate
verification for multiple planes.
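
One way to read the diagrams is that the overlay must fully cover the
primary plane while the hardware cursor is enabled. Under that assumption,
the geometric test can be sketched as follows; the struct and function
names are hypothetical and this is not the driver's validation code.

```c
#include <stdbool.h>

struct rect {
	int x, y, w, h;
};

/* True when "outer" fully covers "inner" (edges may coincide). */
bool rect_covers(const struct rect *outer, const struct rect *inner)
{
	return inner->x >= outer->x &&
	       inner->y >= outer->y &&
	       inner->x + inner->w <= outer->x + outer->w &&
	       inner->y + inner->h <= outer->y + outer->h;
}
```

A commit would be accepted when rect_covers(&overlay, &primary) holds and
rejected otherwise, which matches both the accepted and the rejected
layouts drawn above.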

Change since V1 (Harry and Sean):
- Remove cursor verification from the equation.

Cc: Louis Li <Ching-shih.Li@amd.com>
Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Hersen Wu <hersenxs.wu@amd.com>
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add new MC firmware for Polaris12 32bit ASIC

Polaris12 32bit ASIC needs a special MC firmware.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

fbmem: Mark proc_fb_seq_ops as __maybe_unused

With CONFIG_PROC_FS=n and -Werror, 0-day reports:

drivers/video/fbdev/core/fbmem.c:736:36: error:
'proc_fb_seq_ops' defined but not used

Mark it as __maybe_unused.
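
A standalone sketch of the same idea, assuming GCC or Clang. HAVE_PROC is a
hypothetical stand-in for CONFIG_PROC_FS, and demo_ops stands in for
proc_fb_seq_ops; neither name comes from fbmem.c.

```c
/* In the kernel, __maybe_unused expands to this attribute; it is
 * redefined here only so the sketch builds outside the kernel tree. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((__unused__))
#endif

/* Hypothetical stand-in for CONFIG_PROC_FS; left undefined to mimic =n. */
/* #define HAVE_PROC */

/* Only referenced when HAVE_PROC is defined, so it needs the annotation
 * to stay warning-free in the other configuration. */
static const int demo_ops __maybe_unused = 42;

int demo_value(void)
{
#ifdef HAVE_PROC
	return demo_ops;
#else
	return 0;
#endif
}
```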

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210504142910.2084722-1-linux@roeck-us.net

drm/msm/dpu: Delete bonkers code

dpu_crtc_atomic_flush() was directly poking its attached planes in a
code path that ended up in dpu_plane_atomic_update(), even if the plane
was not involved in the current atomic update.  While a bit dubious,
this worked before because plane->state would always point to something
valid.  But now using drm_atomic_get_new_plane_state() we could get a
NULL state pointer instead, leading to:

   [   20.873273] Call trace:
   [   20.875740]  dpu_plane_atomic_update+0x5c/0xed0
   [   20.880311]  dpu_plane_restore+0x40/0x88
   [   20.884266]  dpu_crtc_atomic_flush+0xf4/0x208
   [   20.888660]  drm_atomic_helper_commit_planes+0x150/0x238
   [   20.894014]  msm_atomic_commit_tail+0x1d4/0x7a0
   [   20.898579]  commit_tail+0xa4/0x168
   [   20.902102]  drm_atomic_helper_commit+0x164/0x178
   [   20.906841]  drm_atomic_commit+0x54/0x60
   [   20.910798]  drm_atomic_connector_commit_dpms+0x10c/0x118
   [   20.916236]  drm_mode_obj_set_property_ioctl+0x1e4/0x440
   [   20.921588]  drm_connector_property_set_ioctl+0x60/0x88
   [   20.926852]  drm_ioctl_kernel+0xd0/0x120
   [   20.930807]  drm_ioctl+0x21c/0x478
   [   20.934235]  __arm64_sys_ioctl+0xa8/0xe0
   [   20.938193]  invoke_syscall+0x64/0x130
   [   20.941977]  el0_svc_common.constprop.3+0x5c/0xe0
   [   20.946716]  do_el0_svc+0x80/0xa0
   [   20.950058]  el0_svc+0x20/0x30
   [   20.953145]  el0_sync_handler+0x88/0xb0
   [   20.957014]  el0_sync+0x13c/0x140

The reason for the codepath seems dubious; the atomic suspend/resume
helpers should handle the power-collapse case.  If not, the CRTC's
atomic_check() should be adding the planes to the atomic update.

Reported-by: Stephen Boyd <swboyd@chromium.org>
Reported-by: John Stultz <john.stultz@linaro.org>
Fixes: 37418bf14c13 ("drm: Use state helper instead of the plane state pointer")
Tested-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210430171744.1721408-1-robdclark@gmail.com

Merge tag 'drm-intel-next-fixes-2021-04-27' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

drm/i915 fixes for v5.13-rc1:
- Several fixes to GLK handling in recent display refactoring (Ville)
- Rare watchdog timer race fix (Tvrtko)
- Cppcheck redundant condition fix (José)
- Overlay error code propagation fix (Dan Carpenter)
- Documentation fix (Maarten)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/871raw5d3g.fsf@intel.com

Merge tag 'amd-drm-next-5.13-2021-04-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-5.13-2021-04-23:

amdgpu:
- Fixes for Aldebaran
- Display LTTPR fixes
- eDP fixes
- Fixes for Vangogh
- RAS fixes
- ASPM support
- Renoir SMU fixes
- Modifier fixes
- Misc code cleanups
- Freesync fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210423223920.3786-1-alexander.deucher@amd.com

Merge tag 'drm-misc-next-fixes-2021-04-29' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

Two patches in drm-misc-next-fixes this week, one to fix the error
handling in TTM when a BO can't be swapped out and one to prevent a
wrong dereference in efifb.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210429090308.k3fuqvenf6vupfmg@gilmour

Merge tag 'drm/tegra/for-5.13-rc1' of ssh://git.freedesktop.org/git/tegra/linux into drm-next

drm/tegra: Changes for v5.13-rc1

The changes this time around contain a couple of fixes for host1x along
with some improvements for Tegra DRM. Most notably the Tegra DRM driver
now supports the hardware cursor on Tegra186 and later, more correctly
reflects the capabilities of the display pipelines on various Tegra SoC
generations and knows how to deal with the dGPU sector layout by using
framebuffer modifiers.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210401164430.3349105-1-thierry.reding@gmail.com

Merge tag 'gvt-next-fixes-2021-04-29' of https://github.com/intel/gvt-linux into drm-intel-next-fixes

gvt-next-fixes-2021-04-29

- Fix possible divide error in vgpu display rate calculation (Colin)

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210429085142.GT1551@zhen-hp.sh.intel.com

drm/i915/gvt: Prevent divided by zero when calculating refresh rate

To get refresh rate as vblank timer period and keep the precision, the
calculation of rate is multiplied by 1000. However, the old logic was using:
rate = pixel clock / (h * v / 1000). When the h/v total is invalid, like
all 0, h * v / 1000 will be rounded to 0, which leads to a divide-by-zero
fault.

Zero H/V values are already checked above. Instead of dividing after
dividing, refine the calculation to divide after multiplying:
"pixel clock * 1000 / (h * v)". The guest driver should guarantee the
correctness of the timing regs' values.
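
The difference between the two formulas can be seen in a small standalone
sketch. The function names are hypothetical and not taken from the GVT
code, and 64-bit arithmetic is assumed so the multiplication cannot
overflow for realistic pixel clocks.

```c
#include <stdint.h>

/* Old order: divide, then divide. h * v / 1000 truncates to 0 whenever
 * h * v < 1000, so that case has to be caught to avoid a fault.
 * Returns 0 when the divisor would be 0. */
uint64_t rate_divide_first(uint64_t pixel_clock, uint64_t h, uint64_t v)
{
	uint64_t div = h * v / 1000;

	if (div == 0)
		return 0;
	return pixel_clock / div;
}

/* New order: multiply, then divide. Only h * v == 0 gives a zero divisor. */
uint64_t rate_multiply_first(uint64_t pixel_clock, uint64_t h, uint64_t v)
{
	if (h == 0 || v == 0)
		return 0;
	return pixel_clock * 1000 / (h * v);
}
```

For a 148.5 MHz pixel clock with totals of 2200 x 1125, both functions
return 60000 (60 Hz scaled by 1000). With totals of 20 x 10 the old
divisor truncates to 0, while the new form still yields a finite result.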

Fixes: 6a4500c7b83f ("drm/i915/gvt: Get accurate vGPU virtual display refresh rate from vreg")
Reported-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20210416083355.159305-1-colin.xu@intel.com
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>

amdgpu: fix GEM obj leak in amdgpu_display_user_framebuffer_create

This error code-path is missing a drm_gem_object_put call. Other
error code-paths are fine.
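
The shape of such a leak can be sketched with a toy reference count. All
names here are hypothetical and only mimic the lookup/put pairing; this is
not the amdgpu or DRM GEM API.

```c
#include <errno.h>

struct gem_obj {
	int refcount;
	int imported;
};

/* Hypothetical lookup: takes a reference on behalf of the caller. */
static struct gem_obj *obj_lookup(struct gem_obj *obj)
{
	obj->refcount++;
	return obj;
}

/* Hypothetical put: drops a reference. */
static void obj_put(struct gem_obj *obj)
{
	obj->refcount--;
}

/* On success the framebuffer keeps the reference taken by the lookup.
 * On every error path that reference must be dropped again. */
int framebuffer_create(struct gem_obj *handle, struct gem_obj **out)
{
	struct gem_obj *obj = obj_lookup(handle);

	if (obj->imported) {
		obj_put(obj);	/* the kind of call that was missing */
		return -EINVAL;
	}

	*out = obj;
	return 0;
}
```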

Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 1769152ac64b ("drm/amdgpu: Fail fb creation from imported dma-bufs. (v2)")
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdgpu: Register VGA clients after init can no longer fail

When an amdgpu device fails to init, a later VGA arbitration request for
another VGA device causes a kernel splat:
kernel: amdgpu 0000:08:00.0: amdgpu: amdgpu_device_ip_init failed
kernel: amdgpu 0000:08:00.0: amdgpu: Fatal error during GPU init
kernel: amdgpu: probe of 0000:08:00.0 failed with error -110
...
kernel: amdgpu 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
kernel: BUG: kernel NULL pointer dereference, address: 0000000000000018
kernel: #PF: supervisor read access in kernel mode
kernel: #PF: error_code(0x0000) - not-present page
kernel: PGD 0 P4D 0
kernel: Oops: 0000 [#1] SMP NOPTI
kernel: CPU: 6 PID: 1080 Comm: Xorg Tainted: G        W         5.12.0-rc8+ #12
kernel: Hardware name: HP HP EliteDesk 805 G6/872B, BIOS S09 Ver. 02.02.00 12/30/2020
kernel: RIP: 0010:amdgpu_device_vga_set_decode+0x13/0x30 [amdgpu]
kernel: Code: 06 31 c0 c3 b8 ea ff ff ff 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 55 48 8b 87 90 06 00 00 48 89 e5 53 89 f3 <48> 8b 40 18 40 0f b6 f6 e8 40 58 39 fd 80 fb 01 5b 5d 19 c0 83 e0
kernel: RSP: 0018:ffffae3c0246bd68 EFLAGS: 00010002
kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
kernel: RDX: ffff8dd1af5a8560 RSI: 0000000000000000 RDI: ffff8dce8c160000
kernel: RBP: ffffae3c0246bd70 R08: ffff8dd1af5985c0 R09: ffffae3c0246ba38
kernel: R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000246
kernel: R13: 0000000000000000 R14: 0000000000000003 R15: ffff8dce81490000
kernel: FS:  00007f9303d8fa40(0000) GS:ffff8dd1af580000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 0000000000000018 CR3: 0000000103cfa000 CR4: 0000000000350ee0
kernel: Call Trace:
kernel:  vga_arbiter_notify_clients.part.0+0x4a/0x80
kernel:  vga_get+0x17f/0x1c0
kernel:  vga_arb_write+0x121/0x6a0
kernel:  ? apparmor_file_permission+0x1c/0x20
kernel:  ? security_file_permission+0x30/0x180
kernel:  vfs_write+0xca/0x280
kernel:  ksys_write+0x67/0xe0
kernel:  __x64_sys_write+0x1a/0x20
kernel:  do_syscall_64+0x38/0x90
kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
kernel: RIP: 0033:0x7f93041e02f7
kernel: Code: 75 05 48 83 c4 58 c3 e8 f7 33 ff ff 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
kernel: RSP: 002b:00007fff60e49b28 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
kernel: RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007f93041e02f7
kernel: RDX: 000000000000000b RSI: 00007fff60e49b40 RDI: 000000000000000f
kernel: RBP: 00007fff60e49b40 R08: 00000000ffffffff R09: 00007fff60e499d0
kernel: R10: 00007f93049350b5 R11: 0000000000000246 R12: 000056111d45e808
kernel: R13: 0000000000000000 R14: 000056111d45e7f8 R15: 000056111d46c980
kernel: Modules linked in: nls_iso8859_1 snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_seq input_leds snd_seq_device snd_timer snd soundcore joydev kvm_amd serio_raw k10temp mac_hid hp_wmi ccp kvm sparse_keymap wmi_bmof ucsi_acpi efi_pstore typec_ucsi rapl typec video wmi sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx libcrc32c xor raid6_pq raid1 raid0 multipath linear dm_mirror dm_region_hash dm_log hid_generic usbhid hid amdgpu drm_ttm_helper ttm iommu_v2 gpu_sched i2c_algo_bit drm_kms_helper syscopyarea sysfillrect crct10dif_pclmul sysimgblt crc32_pclmul fb_sys_fops ghash_clmulni_intel cec rc_core aesni_intel crypto_simd psmouse cryptd r8169 i2c_piix4 drm ahci xhci_pci realtek libahci xhci_pci_renesas gpio_amdpt gpio_generic
kernel: CR2: 0000000000000018
kernel: ---[ end trace 76d04313d4214c51 ]---

Commit 4192f7b57689 ("drm/amdgpu: unmap register bar on device init
failure") makes amdgpu_driver_unload_kms() skip amdgpu_device_fini(),
so the VGA clients remain registered. When
vga_arbiter_notify_clients() then iterates over the registered clients,
it causes a NULL pointer dereference.

Since there's no reason to register VGA clients that early, solve
the issue by registering them after all the goto cleanups.

v2:
- Remove redundant vga_switcheroo cleanup in failed: label.

Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Handling of amdgpu_device_resume return value for graceful teardown

The runtime resume PM op disregards the return value from
amdgpu_device_resume(), masking errors for failed resumes at the PM
layer.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Pavan Kumar Ramayanam <pavan.ramayanam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix r initial values

Under SR-IOV, suspend of IP block <dce_virtual> failed because the
return value was not initialized.

v2: return 0 directly to match the semantics of the original code before
this was broken out into a separate helper function, instead of setting
an initial value

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amd/display: fix wrong statement in mst hpd debugfs

[why]
The previous statement would always evaluate to true,
making it meaningless.
[how]
Just check if a connector is MST by checking if its port exists.

Fixes: 41efcd3879b1df ("drm/amd/display: Add MST capability to trigger_hotplug interface")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Wayne Lin <waynelin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

amdgpu/pm: set pp_dpm_dcefclk to readonly on NAVI10 and newer gpus

v2 : change condition to apply to all chips after NAVI10

Writing to dcefclk causes the gpu to become unresponsive, and requires a reboot.
Patch prevents user from successfully writing to file pp_dpm_dcefclk on parts
NAVI10 and newer, and gives better user feedback that this operation is not allowed.

Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

amdgpu/pm: Prevent force of DCEFCLK on NAVI10 and SIENNA_CICHLID

Writing to dcefclk causes the gpu to become unresponsive, and requires a reboot.
Patch ignores a .force_clk_levels(SMU_DCEFCLK) call and issues an
info message.

Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix concurrent VM flushes on Vega/Navi v2

Starting with Vega, the hardware supports concurrent flushes
of VMIDs, which can be used to implement per-process VMID
allocation.

But concurrent flushes are mutually exclusive with back-to-back
VMID allocations; fix this to avoid a VMID being used in
two ways at the same time.

v2: don't set ring to NULL

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amd/display: Reject non-zero src_y and src_x for video planes

[Why]
This hasn't been well tested and leads to complete system hangs on DCN1
based systems, possibly others.

The system hang can be reproduced by gesturing the video on the YouTube
Android app on ChromeOS into full screen.

[How]
Reject atomic commits with non-zero drm_plane_state.src_x or src_y values.
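
A minimal sketch of such a check, under stated assumptions: the struct
below is a hypothetical subset of the plane state, and is_video is a
hypothetical flag standing in for the driver's format test. In DRM, src_x
and src_y are 16.16 fixed-point values, so any non-zero bit means a
non-zero offset.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical subset of the plane state used for this illustration. */
struct plane_src {
	uint32_t src_x;		/* 16.16 fixed point */
	uint32_t src_y;		/* 16.16 fixed point */
	int is_video;		/* hypothetical: non-zero for video formats */
};

/* Returns 0 if the source offset is acceptable, -EINVAL otherwise. */
int check_plane_src(const struct plane_src *s)
{
	if (s->is_video && (s->src_x != 0 || s->src_y != 0))
		return -EINVAL;
	return 0;
}
```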

v2:
- Add code comment describing the reason we're rejecting non-zero
src_x and src_y
- Drop gerrit Change-Id
- Add stable CC
- Based on amd-staging-drm-next

v3: removed trailing whitespace

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Cc: stable@vger.kernel.org
Cc: nicholas.kazlauskas@amd.com
Cc: amd-gfx@lists.freedesktop.org
Cc: alexander.deucher@amd.com
Cc: Roman.Li@amd.com
Cc: hersenxs.wu@amd.com
Cc: danny.wang@amd.com
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

efifb: Check efifb_pci_dev before using it

On some platforms like Hyper-V and RPi4 with UEFI firmware, efifb is not
a PCI device.

So make sure efifb_pci_dev is found before using it.

Fixes: a6c0fd3d5a8b ("efifb: Ensure graphics device for efifb stays at PCI D0")
BugLink: https://bugs.launchpad.net/bugs/1922403
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210413170508.968148-1-kai.heng.feng@canonical.com

drm/i915: Fix docbook descriptions for i915_gem_shrinker

Fixes the following htmldocs warning:
drivers/gpu/drm/i915/gem/i915_gem_shrinker.c:102: warning: Function parameter or member 'ww' not described in 'i915_gem_shrink'

Fixes: cf41a8f1dc1e ("drm/i915: Finally remove obj->mm.lock.")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210421120938.546076-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit 772f7bb75dffd4ec90eaf411f9e09dc2429f5c81)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: fix an error code in intel_overlay_do_put_image()

This code should propagate the error from intel_overlay_pin_fb()
but currently it returns success.

Fixes: 1b321026e213 ("drm/i915: Pass ww ctx to intel_pin_to_display_plane")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YHaFcEzcnh/hk1/Q@mwanda
(cherry picked from commit 103b8cbac28ea2965f24ca52e6a92d76d3851b3c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/display/psr: Fix cppcheck warnings

Fix redundant condition, caught in cppcheck by kernel test robot.

Reported-by: kernel test robot <lkp@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Fixes: b64d6c51380b ("drm/i915/display: Support PSR Multiple Instances")
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Harish Chegondi <harish.chegondi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210409231738.238682-1-jose.souza@intel.com
(cherry picked from commit 1884b579c0cfbb52a92462184406558ac633cafb)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: Disable LTTPR detection on GLK once again

The glk display version change is causing us to again attempt
LTTPR detection on glk. We must not do that since glk doesn't
have a long enough AUX timeout. Restore the correct logic to
skip the detection.

Cc: Matt Roper <matthew.d.roper@intel.com>
Fixes: 2b5a4562edd0 ("drm/i915/display: Simplify GLK display version tests")
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210412054607.18133-4-ville.syrjala@linux.intel.com
(cherry picked from commit 543d592a73d3948737d0ae4880a407c6da28662f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: Restore lost glk ccs w/a

We lost a CCS related w/a on glk when the display version
became 10 instead of 9. Restore the correct check.

Cc: Matt Roper <matthew.d.roper@intel.com>
Fixes: 2b5a4562edd0 ("drm/i915/display: Simplify GLK display version tests")
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210412054607.18133-3-ville.syrjala@linux.intel.com
(cherry picked from commit 0fe6637d9852a33c2873e59ae7e5225f92ac4cc2)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: Restore lost glk FBC 16bpp w/a

We lost the FBC 16bpp 512byte stride requirement on glk when
we switched from display version 9 to 10. Restore the w/a to
avoid enabling FBC with a bad stride and thus display garbage.

Cc: Matt Roper <matthew.d.roper@intel.com>
Fixes: 2b5a4562edd0 ("drm/i915/display: Simplify GLK display version tests")
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210412054607.18133-2-ville.syrjala@linux.intel.com
(cherry picked from commit 87b8c3bc8d27270c9abd5e895ea9066e918ef89b)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: Take request reference before arming the watchdog timer

The reference needs to be taken before arming the timer. Luckily, given
the default timer period of 20s, the chance of hitting the race is
extremely low.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 9b4d0598ee94 ("drm/i915: Request watchdog infrastructure")
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210326105759.2387104-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit f7c379779161d364eb30338529490eac7dc377b7)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/ttm: fix error handling if no BO can be swapped out v4

In case that all pre-allocated BOs are busy, just continue to populate
BOs since likely half of system memory in total is still free.

v4 (chk): fix code moved to VMWGFX as well

Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210422115757.3946-1-christian.koenig@amd.com

Merge tag 'drm-misc-next-fixes-2021-04-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

A few fixes for the next merge window, with some build fixes for anx7625
and lt8912b bridges, incorrect error handling for lt8912b and TTM, and
one fix for TTM page limit accounting.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210422163329.dvbuwre3akwdmzjt@gilmour

drm/ttm: Don't count pages in SG BOs against pages_limit

Pages in SG BOs were not allocated by TTM. So don't count them against
TTM's pages limit.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210414064804.29356-9-Felix.Kuehling@amd.com

drm/ttm: fix return value check

The function returns the number of swapped pages here. Only abort when we get
a negative error code.
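
A minimal user-space sketch of the check described above. The names
fake_swapout() and reclaim_pages() are hypothetical stand-ins, not TTM
functions; the only assumption taken from this commit is that the callee
returns a page count on success and a negative error code on failure.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for a swap-out helper: returns the number of
 * pages swapped (>= 0), or a negative errno on failure.
 */
static long fake_swapout(long pages)
{
	if (pages < 0)
		return -EBUSY;
	return pages;
}

/* Keep swapping until 'target' pages are reclaimed or an error occurs. */
long reclaim_pages(const long *batches, int n, long target)
{
	long total = 0;

	for (int i = 0; i < n && total < target; i++) {
		long ret = fake_swapout(batches[i]);

		if (ret < 0)
			return ret;	/* real error: abort */
		total += ret;		/* ret >= 0 is progress, not failure */
	}
	return total;
}
```

A check written as "if (ret)" would abort after the first batch that
swapped any pages; "if (ret < 0)" lets the loop continue.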

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210409130113.1459-1-christian.koenig@amd.com

Merge tag 'gvt-next-fixes-2021-04-21' of https://github.com/intel/gvt-linux into drm-intel-next-fixes

gvt-next-fixes-2021-04-21

- Remove one unused function warning (Jiapeng)
- Fix intel_gvt_init_device() return type (Dan)
- Remove one duplicated register accessible check (Zhenyu)

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210421094522.GU1551@zhen-hp.sh.intel.com

drm/i915/gvt: Remove duplicated register accessible check

It looks like this duplication was missed during a code rebase. Remove the extra check.

Cc: Yan Zhao <yan.y.zhao@intel.com>
Fixes: 02dd2b12a685 ("drm/i915/gvt: unify lri cmd handler and mmio handlers")
Reviewed-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20210414084813.3763353-2-zhenyuw@linux.intel.com

drm/amdgpu/gmc9: remove dummy read workaround for newer chips

Aldebaran has a hw fix so no longer requires the workaround.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add mem sync flag for IB allocated by SA

The buffer of an SA bo is used in many cases, so it is better
to invalidate the cache of an indirect buffer allocated by SA before
committing the IB.

Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Fix SDMA RAS error reporting on Aldebaran

Fix the following issues with SDMA RAS error reporting:
1. Read the EDC_COUNTER2 register also to fetch error counts
for all sub-blocks in SDMA.
2. SDMA RAS on Aldebaran supports single-bit uncorrectable errors
only. So, report error count in UE count instead of CE count.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-By: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Reset RAS error count and status regs

Reset the RAS error count and error status registers after
reading to prevent over reporting error counts on Aldebaran.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-By: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "drm/amdgpu: workaround the TMR MC address issue (v2)"

This reverts commit 2f055097daef498da57552f422f49de50a1573e6.
2f055097daef498da57552f422f49de50a1573e6 was a driver workaround
when PSP firmware was not ready. Now the PSP fw is ready so we
revert this driver workaround.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: 3.2.132

Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: [FW Promotion] Release 0.0.62

Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: add helper for enabling mst stream features

[Why]
Some MST devices use a different method to enable MST-specific
stream features.

[How]
Add dm_helpers_mst_enable_stream_features. This can be
modified later when we are ready to implement those features.

Signed-off-by: Bing Guo <bing.guo@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Report Proper Quantization Range in AVI Infoframe

[Why?]
When a monitor does not set both QS and QY bits, DC does not
set Q0, Q1, QY0 and QY1 bits in AVI infoframe. Setting RGB bits
should be separate from setting YCC bits.

[How?]
Separate logic for setting RGB and YCC quantization range bits
in the AVI infoframe.

Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Reviewed-by: Chris Park <Chris.Park@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix call to pass bpp in 16ths of a bit

[Why & How?]
Call to dc_dsc_compute_bandwidth_range should have min and max bpp
in 16ths of a bit. Multiply min and max bpp from policy.
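
As an illustration only, the unit conversion implied above can be
sketched as follows. The helper and struct names are hypothetical and
are not the driver's; the sketch assumes the policy stores whole bits
per pixel and the callee expects units of 1/16 bpp.

```c
#include <stdint.h>

/* Hypothetical helper: whole bits per pixel -> units of 1/16 bpp. */
uint32_t bpp_to_x16(uint32_t bpp)
{
	return bpp * 16;
}

/* Hypothetical range type, not the driver's structure. */
struct bpp_range_x16 {
	uint32_t min_bpp_x16;
	uint32_t max_bpp_x16;
};

/* Convert a policy's min/max bpp before handing them to the callee. */
struct bpp_range_x16 policy_to_x16(uint32_t min_bpp, uint32_t max_bpp)
{
	struct bpp_range_x16 r = {
		.min_bpp_x16 = bpp_to_x16(min_bpp),
		.max_bpp_x16 = bpp_to_x16(max_bpp),
	};

	return r;
}
```

For example, a policy range of 8 to 16 bpp becomes 128 to 256 in
16ths of a bit.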

Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Reviewed-by: Eryk Brol <Eryk.Brol@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fixed typo in function name.

[How & Why]
Changed "prsent" to "present".

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Reviewed-by: Chris Park <Chris.Park@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Always poll for rxstatus in authenticate

[Why]
Requirement from the spec - we shouldn't be potentially exiting out
early based on encryption status.

[How]
Drop the calls from HDCP1 and HDCP2 execution that exit out early
based on link encryption status.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Add link rate optimization logs for ILR

[Why&How]
Add logs to verify ILR optimization behaviour on boot

Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Unconditionally clear training pattern set after lt

[WHY]
While Link Training is being performed,
and the LTTPRs are in Non-LTTPR or LTTPR Transparent mode,
any DPCD registers besides those used for Link Training are not to be
accessed.

The spec defines the link training registers as DP_TRAINING_PATTERN_SET
(102h) to DP_TRAINING_LANE3_SET (106h), and DP_LANE0_1_STATUS (202h)
to DP_ADJUST_REQUEST_LANE2_3 (207h).

[HOW]
Move the current write to DPCD Address DP_LINK_TRAINING_PATTERN_SET out
of its conditional block.

Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix FreeSync when RGB MPO in use

[WHY]
We should skip programming manual trigger on non-primary planes when MPO is
enabled.

[HOW]
Implement an explicit mechanism for skipping manual trigger programming
for planes that shouldn't cause the frame to end.

Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: treat memory as a single-channel for asymmetric memory v2

The previous change was reverted because it caused a hang.
Redo the change without the defect.

[Why]
1. The driver uses umachannelnumber to calculate watermarks for stutter.
In an asymmetric memory config, the actual bandwidth is less than
dual-channel. The bandwidth should be treated the same as single-channel.
2. We found that single-rank DIMMs need additional delay time for stutter.

[How]
Get information from each DIMM. Treat the memory config as single-channel
for asymmetric memory when calculating bandwidth.
Add additional delay time for single-rank DIMMs.

Fixes: b8720ed0b87d32 ("drm/amd/display: System black screen hangs on driver load")
Signed-off-by: Hugo Hu <hugo.hu@amd.com>
Reviewed-by: Sung Lee <Sung.Lee@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: removed unused function dc_link_reallocate_mst_payload.

[Why]
Found that dc_link_reallocate_mst_payload is no longer used
in any use case.

[How]
Removed the dc_link_reallocate_mst_payload function definition
and declaration.

Signed-off-by: Robin Singh <robin.singh@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: disable seamless boot for external DP

[Why]
Primary feature use case is with eDP panels.

[How]
Fail seamless boot validation if display is not an eDP panel.

Signed-off-by: Anthony Wang <anthony1.wang@amd.com>
Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: add handling for hdcp2 rx id list validation

[why]
The current implementation of hdcp2 rx id list validation has no
handler/checker for an invalid message status, e.g. an HMAC failure
where the V parameter calculated by PSP does not match V prime from Rx.

[how]
return a generic FAILURE for any message status not SUCCESS or
REVOKED.
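
A minimal sketch of that mapping, for illustration. All enum and
function names below are hypothetical; the commit does not show the
real PSP or HDCP status codes, so this only demonstrates the
"anything else is a generic failure" default case.

```c
/* Hypothetical message status codes (not the real PSP/HDCP enums). */
enum rx_id_list_msg_status {
	MSG_STATUS_SUCCESS,
	MSG_STATUS_REVOKED,
	MSG_STATUS_HMAC_MISMATCH,
	MSG_STATUS_OTHER_ERROR,
};

/* Hypothetical validation results. */
enum validation_result {
	VALIDATION_SUCCESS,
	VALIDATION_REVOKED,
	VALIDATION_FAILURE,
};

enum validation_result map_rx_id_list_status(enum rx_id_list_msg_status s)
{
	switch (s) {
	case MSG_STATUS_SUCCESS:
		return VALIDATION_SUCCESS;
	case MSG_STATUS_REVOKED:
		return VALIDATION_REVOKED;
	default:
		/* Any status not handled above is a generic failure. */
		return VALIDATION_FAILURE;
	}
}
```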

Signed-off-by: Dingchen (David) Zhang <dingchen.zhang@amd.com>
Reviewed-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: update hdcp display using correct CP type.

[why]
Currently we always update the hdcp display using TYPE0, but there
are cases where the connector's CP type property is TYPE1 instead.

[how]
Use the connector's drm CP type property as the input argument.

Signed-off-by: Dingchen (David) Zhang <dingchen.zhang@amd.com>
Reviewed-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Add DSC check to seamless boot validation

[Why & How]
We want to immediately fail seamless boot validation if DSC is active,
as VBIOS currently does not support DSC timings. Add a check for
the relevant flag in dc_validate_seamless_boot_timing.

Signed-off-by: Anthony Wang <anthony1.wang@amd.com>
Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: fixed divide by zero kernel crash during dsc enablement

[why]
During DSC enablement, a divide-by-zero condition triggered a
kernel crash.

[how]
An IGT test that enables DSC was crashing while restoring the
default DSC status, because the h_total value had become 0.
Add a check before the division: if h_total is zero, skip the
division gracefully and set the default value.

kernel panic log:

[  128.758827] divide error: 0000 [#1] PREEMPT SMP NOPTI
[  128.762714] CPU: 5 PID: 4562 Comm: amd_dp_dsc Tainted: G        W         5.4.19-android-x86_64 #1
[  128.769728] Hardware name: ADVANCED MICRO DEVICES, INC. Mauna/Mauna, BIOS WMN0B13N Nov 11 2020
[  128.777695] RIP: 0010:hubp2_vready_at_or_After_vsync+0x37/0x7a [amdgpu]
[  128.785707] Code: 80 02 00 00 48 89 f3 48 8b 7f 08 b ......
[  128.805696] RSP: 0018:ffffad8f82d43628 EFLAGS: 00010246
......
[  128.857707] CR2: 00007106d8465000 CR3: 0000000426530000 CR4: 0000000000140ee0
[  128.865695] Call Trace:
[  128.869712] hubp3_setup+0x1f/0x7f [amdgpu]
[  128.873705] dcn20_update_dchubp_dpp+0xc8/0x54a [amdgpu]
[  128.877706] dcn20_program_front_end_for_ctx+0x31d/0x463 [amdgpu]
[  128.885706] dc_commit_state+0x3d2/0x658 [amdgpu]
[  128.889707] amdgpu_dm_atomic_commit_tail+0x4b3/0x1e7c [amdgpu]
[  128.897699] ? dm_read_reg_func+0x41/0xb5 [amdgpu]
[  128.901707] ? dm_read_reg_func+0x41/0xb5 [amdgpu]
[  128.905706] ? __is_insn_slot_addr+0x43/0x48
[  128.909706] ? fill_plane_buffer_attributes+0x29e/0x3dc [amdgpu]
[  128.917705] ? dm_plane_helper_prepare_fb+0x255/0x284 [amdgpu]
[  128.921700] ? usleep_range+0x7c/0x7c
[  128.925705] ? preempt_count_sub+0xf/0x18
[  128.929706] ? _raw_spin_unlock_irq+0x13/0x24
[  128.933732] ? __wait_for_common+0x11e/0x18f
[  128.937705] ? _raw_spin_unlock_irq+0x13/0x24
[  128.941706] ? __wait_for_common+0x11e/0x18f
[  128.945705] commit_tail+0x8b/0xd2 [drm_kms_helper]
[  128.949707] drm_atomic_helper_commit+0xd8/0xf5 [drm_kms_helper]
[  128.957706] amdgpu_dm_atomic_commit+0x337/0x360 [amdgpu]
[  128.961705] ? drm_atomic_check_only+0x543/0x68d [drm]
[  128.969705] ? drm_atomic_set_property+0x760/0x7af [drm]
[  128.973704] ? drm_mode_atomic_ioctl+0x6f3/0x85a [drm]
[  128.977705] drm_mode_atomic_ioctl+0x6f3/0x85a [drm]
[  128.985705] ? drm_atomic_set_property+0x7af/0x7af [drm]
[  128.989706] drm_ioctl_kernel+0x82/0xda [drm]
[  128.993706] drm_ioctl+0x225/0x319 [drm]
[  128.997707] ? drm_atomic_set_property+0x7af/0x7af [drm]
[  129.001706] ? preempt_count_sub+0xf/0x18
[  129.005713] amdgpu_drm_ioctl+0x4b/0x76 [amdgpu]
[  129.009705] vfs_ioctl+0x1d/0x2a
[  129.013705] do_vfs_ioctl+0x419/0x43d
[  129.017707] ksys_ioctl+0x52/0x71
[  129.021707] __x64_sys_ioctl+0x16/0x19
[  129.025706] do_syscall_64+0x78/0x85
[  129.029705] entry_SYSCALL_64_after_hwframe+0x44/0xa9
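
For illustration, the kind of guard described in [how] can be sketched
in user-space C as below. lines_from_pixels() and its parameters are
hypothetical and are not the code in hubp2_vready_at_or_After_vsync();
the sketch only shows the check-before-divide pattern with a fallback.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a pixel count to a line count.
 * If h_total is zero, return the caller's default instead of dividing.
 */
uint32_t lines_from_pixels(uint32_t pixels, uint32_t h_total,
			   uint32_t default_lines)
{
	if (h_total == 0)
		return default_lines;

	return pixels / h_total;
}
```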

Signed-off-by: Robin Singh <robin.singh@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Reviewed-by: Robin Singh <Robin.Singh@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix GCR_GENERAL_CNTL offset for dimgrey_cavefish

dimgrey_cavefish has a gc_10_3 ip similar to sienna_cichlid's,
so follow its register offset settings.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: resolve erroneous gfx_v9_4_2 prints

resolve bug on aldebaran where gfx error counts will
print on driver load when there are no errors present

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix a error injection failed issue

Because "sscanf(str, "retire_page")" always returns 0, an application
that uses the raw data for error injection always wrongly falls into
"op == 3". Change to use strstr instead.
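
A small user-space sketch of why the sscanf() test cannot work. The
function names are hypothetical. sscanf() returns the number of
conversions assigned; a format with no conversion specifier can never
return a positive count, so for non-empty input it returns 0 whether or
not the literal text matched.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Buggy style: the format has no conversions, so 0 means nothing. */
int match_with_sscanf(const char *str)
{
	return sscanf(str, "retire_page");
}

/* Fixed style: look for the command keyword explicitly. */
bool match_with_strstr(const char *str)
{
	return strstr(str, "retire_page") != NULL;
}
```

Both a matching and a non-matching command give 0 from the sscanf()
variant, while the strstr() variant tells them apart.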

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: only harvest gcea/mmea error status in aldebaran

In aldebaran, the driver only needs to harvest SDP
RdRspStatus, WrRspStatus and the first parity error
on RdRsp data. Check the error type before harvesting
error information.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: only harvest gcea/mmea error status in arcturus

SDP RdRspStatus/WrRspStatus or the first parity error on
RdRsp data can cause a fatal system error in arcturus.
The GPU will freeze in such a case.

The driver needs to harvest this error information before
resetting the GPU. Check the error type to avoid harvesting
normal gcea/mmea information.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: enable tmz on renoir asics

The tmz functions are verified on renoir chips as well. So enable it by
default.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Tested-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: correct default gfx wdt timeout setting

When gfx wdt was configured to fatal_disable, the
timeout period should be configured to 0x0 (timeout
disabled)

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: add edc error interrupt handle for poison propogate mode

In poison propagate mode, when the driver receives the edc error interrupt
from SQ, it should kill the process (identified by pasid) that is using the
poison data, and then trigger a GPU reset.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/bridge: lt8912b: fix incorrect handling of of_* return values

A static analysis shows several issues in the driver code at
probing time.

DT parsing errors were badly handled and could lead to bugs:
- Bad error detection;
- Bad release of resources

Fixes: 30e2ae943c26 ("drm/bridge: Introduce LT8912B DSI to HDMI bridge")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Adrien Grassein <adrien.grassein@gmail.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210415183639.1487-1-rdunlap@infradead.org
Signed-off-by: Robert Foss <robert.foss@linaro.org>

drm: bridge: fix LONTIUM use of mipi_dsi_() functions

The Lontium DRM bridge drivers use mipi_dsi_() function interfaces so
they need to select DRM_MIPI_DSI to prevent build errors.

ERROR: modpost: "mipi_dsi_attach" [drivers/gpu/drm/bridge/lontium-lt9611uxc.ko] undefined!
ERROR: modpost: "mipi_dsi_device_register_full" [drivers/gpu/drm/bridge/lontium-lt9611uxc.ko] undefined!
ERROR: modpost: "of_find_mipi_dsi_host_by_node" [drivers/gpu/drm/bridge/lontium-lt9611uxc.ko] undefined!
ERROR: modpost: "mipi_dsi_device_unregister" [drivers/gpu/drm/bridge/lontium-lt9611uxc.ko] undefined!
ERROR: modpost: "mipi_dsi_detach" [drivers/gpu/drm/bridge/lontium-lt9611uxc.ko] undefined!
ERROR: modpost: "mipi_dsi_attach" [drivers/gpu/drm/bridge/lontium-lt9611.ko] undefined!
ERROR: modpost: "mipi_dsi_device_register_full" [drivers/gpu/drm/bridge/lontium-lt9611.ko] undefined!
ERROR: modpost: "of_find_mipi_dsi_host_by_node" [drivers/gpu/drm/bridge/lontium-lt9611.ko] undefined!
ERROR: modpost: "mipi_dsi_device_unregister" [drivers/gpu/drm/bridge/lontium-lt9611.ko] undefined!
ERROR: modpost: "mipi_dsi_detach" [drivers/gpu/drm/bridge/lontium-lt9611.ko] undefined!
WARNING: modpost: suppressed 5 unresolved symbol warnings because there were too many)

Fixes: 23278bf54afe ("drm/bridge: Introduce LT9611 DSI to HDMI bridge")
Fixes: 0cbbd5b1a012 ("drm: bridge: add support for lontium LT9611UXC bridge")
Fixes: 30e2ae943c26 ("drm/bridge: Introduce LT8912B DSI to HDMI bridge")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Adren Grassein <adrien.grassein@gmail.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Adrien Grassein <adrien.grassein@gmail.com>
Cc: Andrzej Hajda <a.hajda@samsung.com>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Robert Foss <robert.foss@linaro.org>
Cc: dri-devel@lists.freedesktop.org
Cc: stable@vger.kernel.org
Signed-off-by: Robert Foss <robert.foss@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210415183639.1487-1-rdunlap@infradead.org

drm: bridge: fix ANX7625 use of mipi_dsi_() functions

The Analogix DRM ANX7625 bridge driver uses mipi_dsi_() function
interfaces so it should select DRM_MIPI_DSI to prevent build errors.

ERROR: modpost: "mipi_dsi_attach" [drivers/gpu/drm/bridge/analogix/anx7625.ko] undefined!
ERROR: modpost: "mipi_dsi_device_register_full" [drivers/gpu/drm/bridge/analogix/anx7625.ko] undefined!
ERROR: modpost: "of_find_mipi_dsi_host_by_node" [drivers/gpu/drm/bridge/analogix/anx7625.ko] undefined!
ERROR: modpost: "mipi_dsi_device_unregister" [drivers/gpu/drm/bridge/analogix/anx7625.ko] undefined!
ERROR: modpost: "mipi_dsi_detach" [drivers/gpu/drm/bridge/analogix/anx7625.ko] undefined!

Fixes: 8bdfc5dae4e3 ("drm/bridge: anx7625: Add anx7625 MIPI DSI/DPI to DP")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Robert Foss <robert.foss@linaro.org>
Cc: Xin Ji <xji@analogixsemi.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: dri-devel@lists.freedesktop.org
Cc: Andrzej Hajda <a.hajda@samsung.com>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Robert Foss <robert.foss@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Robert Foss <robert.foss@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210415183619.1431-1-rdunlap@infradead.org

drm/radeon/si: Fix inconsistent indenting

Kernel test robot throws below warning ->

smatch warnings:
drivers/gpu/drm/radeon/si.c:4514 si_vm_packet3_cp_dma_check() warn:
inconsistent indenting

Fixed the inconsistent indenting.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: fix error code in smu_set_power_limit()

We should return -EINVAL instead of success if the "limit" is too high.
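
A minimal sketch of the intended behaviour, not the driver's code.
set_power_limit() and its parameters are hypothetical; the sketch only
shows an out-of-range request returning -EINVAL instead of 0.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical setter: reject a limit above max_limit with -EINVAL,
 * otherwise store it and return 0.
 */
int set_power_limit(uint32_t limit, uint32_t max_limit, uint32_t *current)
{
	if (limit > max_limit)
		return -EINVAL;	/* previously this path could report success */

	*current = limit;
	return 0;
}
```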

Fixes: e098bc9612c2 ("drm/amd/pm: optimize the power related source code layout")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix an error code in init_pmu_entry_by_type_and_add()

If the kmemdup() fails then this should return a negative error code
but it currently returns success

Fixes: b4a7db71ea06 ("drm/amdgpu: add per device user friendly xgmi events for vega20")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/radeon/cik: remove set but not used variables

The values of pipe_id and queue_id are not used under certain
circumstances, so just delete them.

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

amd/display: allow non-linear multi-planar formats

Accept non-linear buffers which use a multi-planar format, as long
as they don't use DCC.

Tested on GFX9 with NV12.

Signed-off-by: Simon Ser <contact@emersion.fr>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/dm: Fix NULL pointer crash during DP MST hotplug

This patch checks the return value of the function
dc_link_add_remote_sink before using it. This was causing
a crash during consecutive hotplugs of DP MST displays.

Cc: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "Revert "drm/amdgpu: Ensure that the modifier requested is supported by plane.""

This reverts commit 55fa622fe635bfc3f2587d784f6facc30f8fdf12.

The regression caused by the original patch has been
resolved, so reintroduce the change.

Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Update modifier list for gfx10_3

[Why]
Current list supports modifiers that have DCC_MAX_COMPRESSED_BLOCK
set to AMD_FMT_MOD_DCC_BLOCK_128B, while AMD_FMT_MOD_DCC_BLOCK_64B
is used instead by userspace.

[How]
Replace AMD_FMT_MOD_DCC_BLOCK_128B with AMD_FMT_MOD_DCC_BLOCK_64B
for modifiers with DCC supported.

Fixes: faa37f54ce0462 ("drm/amd/display: Expose modifiers")
Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: revise two names of sensor values for vangogh

This patch revises two names of sensor values for vangogh.
A new smu metrics table is supported by new pmfw
(from version 4.63.36.00). It has two parts: one part is
the current smu metrics table data and the other part is the
average smu metrics table data. The hwmon will read the current gfxclk
and mclk from the current smu metrics table data.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: remove the "set" function of pp_dpm_mclk for vangogh

This patch removes the "set" function of pp_dpm_mclk for vangogh.
For vangogh, mclk is bound to fclk; they lock each other
to the same performance level. But according to the smu message from pmfw,
only fclk can be set manually, so remove the unnecessary
"set" code for mclk.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Copy MEC FW version to MEC2 if we skipped loading MEC2

If we skipped loading MEC2 firmware separately
from MEC, then MEC2 will be running the same
firmware image. Copy the MEC version and feature
numbers into MEC2 version and feature numbers.
This is needed for things like GWS support, where
we rely on knowing what version of firmware is
running on MEC2. Leaving these MEC2 entries blank
breaks our ability to version-check enables and
workarounds.
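
The copy described above can be sketched roughly as follows. The
struct and function are hypothetical simplifications, not the amdgpu
data structures; they only illustrate mirroring the MEC numbers into
the MEC2 fields when MEC2 was not loaded separately.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified firmware version record. */
struct fw_versions {
	uint32_t mec_fw_version;
	uint32_t mec_feature_version;
	uint32_t mec2_fw_version;
	uint32_t mec2_feature_version;
};

/* If MEC2 runs the MEC image, report the MEC numbers for MEC2 too. */
void fill_mec2_versions(struct fw_versions *fw, bool mec2_loaded_separately)
{
	if (mec2_loaded_separately)
		return;

	fw->mec2_fw_version = fw->mec_fw_version;
	fw->mec2_feature_version = fw->mec_feature_version;
}
```

With the MEC2 fields filled in, a later version check against MEC2
compares real numbers rather than zeros.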

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: add the callback to get the bootup values for renoir

This patch is to add the callback to get the bootup values for renoir.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd: update the atomfirmware header for smu12

This patch is to update the atomfirmware header for smu12.

v2: remove some unnecessary members

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Remove legacy code not acquiring VMs

ROCm user mode has acquired VMs from DRM file descriptors for as long
as it supported the upstream KFD. Legacy code to support older versions
of ROCm is not needed any more.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Use iterator methods exposed by amdgpu_res_cursor.h in building SG_TABLE's for a VRAM BO

Extend current implementation of SG_TABLE construction method to
allow exportation of sub-buffers of a VRAM BO. This capability will
enable logical partitioning of a VRAM BO into multiple non-overlapping
sub-buffers. One example of this use case is to partition a VRAM BO
into two sub-buffers, one for SRC and another for DST.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add double-sscanf but invert

Add back the double-sscanf so that both decimal
and hexadecimal values could be read in, but this
time invert the scan so that hexadecimal format
with a leading 0x is tried first, and if that
fails, then try decimal format.

Also use a logical-AND instead of nesting double
if-conditional.
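
A user-space sketch of the inverted double-sscanf, for illustration.
The function names and the simplified format strings are assumptions
and do not reproduce the driver's parser. The decimal-first variant is
included only to show why the order matters: "%llu" reads the leading
"0" of "0x10" and reports success with the value 0.

```c
#include <errno.h>
#include <stdio.h>

/* Try hexadecimal with a leading 0x first, then fall back to decimal. */
int parse_address(const char *str, unsigned long long *address)
{
	if (sscanf(str, "0x%llx", address) != 1 &&
	    sscanf(str, "%llu", address) != 1)
		return -EINVAL;

	return 0;
}

/* Decimal first, for contrast: "0x10" is misread as 0. */
int parse_address_decimal_first(const char *str, unsigned long long *address)
{
	if (sscanf(str, "%llu", address) != 1 &&
	    sscanf(str, "0x%llx", address) != 1)
		return -EINVAL;

	return 0;
}
```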

See commit "drm/amdgpu: Fix a bug for input with double sscanf"

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/amdgpu: add ASPM support on polaris

add ASPM support on polaris

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/amdgpu: enable ASPM on vega

Enable ASPM on vega to save power
without hurting performance.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/amdgpu: enable ASPM on navi1x

Enable ASPM on navi1x to reduce system power consumption
without hurting performance.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/sriov no need to config GECC for sriov

No need to configure the GECC feature here for sriov.
Leave the configuration to the host driver.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Show updated clocks on aldebaran

When GFXCLK range is updated in manual/determinism mode, show the
updated min/max clock range.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Fix kernel-doc for the RAS sysfs interface

Improve the kernel-doc for the RAS sysfs
interface. Fix the grammar, fix the context.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add bad_page_cnt_threshold to debugfs

Add bad_page_cnt_threshold to debugfs, an optional
file system used for debugging, for reporting
purposes only--it usually matches the size of
EEPROM but may be different depending on the
"bad_page_threshold" kernel module option.

The "bad_page_cnt_threshold" is computed
dynamically when the amdgpu module is run. It
depends on three things: the VRAM size; the size
of the EEPROM (or the size allocated to the RAS
table therein); and the "bad_page_threshold"
module parameter. Further parameters and logic
depend on it, and as such it is helpful to see
the computed value in debugfs.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Fix a bug in checking the result of reserve page

Fix if (ret) --> if (!ret), a bug, for
"retire_page", which caused the kernel to call
the method again with *pos == end of file, and
that bounced back with an error. On the first
run, we advanced *pos, but returned 0 back to the
fs layer, also a bug.

Fix the logic of the check of the result of
amdgpu_reserve_page_direct(): it is 0 on success,
and non-zero on error, not the other way
around.
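
A minimal userspace sketch of the corrected check is shown here. It
is not the driver's code: reserve_page() is a hypothetical stand-in
for a call that returns 0 on success and a negative errno on error,
and retire_page_write() is a hypothetical write handler that reports
the number of bytes consumed.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in: 0 on success, negative errno on error. */
static int reserve_page(unsigned long long address)
{
	return address == 0 ? -EINVAL : 0;
}

/* Hypothetical write handler: return the byte count on success. */
static ssize_t retire_page_write(unsigned long long address, size_t count)
{
	int ret = reserve_page(address);

	/* Success is 0, so the test must be !ret, not ret. */
	if (!ret)
		return (ssize_t)count;

	return ret;
}
```

Returning the full count on success tells the caller that the whole
write was consumed, so it has no reason to call the handler again.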

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Fix a bug for input with double sscanf

Remove double-sscanf to scan for %llu and 0x%llx,
as that is not going to work!

The %llu will consume the "0" in "0x" of your
input, and the hex value you think you're entering
will always be 0. That is, a valid hex value can
never be consumed.

On the other hand, just entering a hex number
without leading 0x will either be scanned as a
string and not match, for instance FAB123, or
the leading decimal portion is scanned as the
%llu, for instance 123FAB will be scanned as 123,
which is not correct.

Thus remove the first %llu scan and leave only the
%llx scan, removing the leading 0x since %llx can
scan either.

Addresses are usually hex values, so this
suffices.
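
The scanning behaviour described above can be reproduced with the
standard C library's sscanf(), as sketched below. The helper names
scan_decimal() and scan_hex() are made up for this illustration, and
the kernel's own sscanf() is assumed to treat these inputs the same
way, as the text above describes.

```c
#include <stdio.h>

/* Hypothetical helper: scan with %llu, return sscanf()'s result. */
static int scan_decimal(const char *s, unsigned long long *v)
{
	return sscanf(s, "%llu", v);
}

/* Hypothetical helper: scan with %llx, return sscanf()'s result. */
static int scan_hex(const char *s, unsigned long long *v)
{
	return sscanf(s, "%llx", v);
}
```

Here scan_decimal("0x1F", &v) succeeds but stores 0, because %llu
stops at the 'x'. scan_decimal("123FAB", &v) stores 123, and
scan_decimal("FAB123", &v) matches nothing. scan_hex() reads both
"0x1F" and "1F" as 31.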

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Xinhui Pan <xinhui.pan@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add graphics cache rinse packet for sdma

Add emit mem sync callback for sdma_v5_2

In the amdgpu sync object test, three threads create jobs
to send GFX IB and SDMA IB in sequence. After the first
GFX thread has joined, the third thread sometimes reuses
the same physical page to store the SDMA IB. There is
a risk that SDMA will read the GFX IB left in that physical
page, so it is better to flush the cache before committing the SDMA IB.

Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>