]> www.infradead.org Git - users/hch/block.git/log
users/hch/block.git
22 months agodrm/xe/bo: handle PL_TT -> PL_TT
Matthew Auld [Thu, 15 Jun 2023 17:20:52 +0000 (18:20 +0100)]
drm/xe/bo: handle PL_TT -> PL_TT

When moving between PL_VRAM <-> PL_SYSTEM we have to have use PL_TT in
the middle as a temporary resource for the actual copy. In some GL
workloads it can be seen that once the resource has been moved to the
PL_TT we might have to bail out of the ttm_bo_validate(), before
finishing the final hop. If this happens the resource is left as
TTM_PL_FLAG_TEMPORARY, and when the ttm_bo_validate() is restarted the
current placement is always seen as incompatible, requiring us to
complete the move.  However if the BO allows PL_TT as a possible
placement we can end up attempting a PL_TT -> PL_TT move (like when
running out of VRAM) which leads to explosions in xe_bo_move(), like
triggering the XE_BUG_ON(!tile).

Going from TTM_PL_FLAG_TEMPORARY with PL_TT -> PL_VRAM should already
work as-is, so it looks like we only need to worry about PL_TT -> PL_TT
and it looks like we can just treat it as a dummy move, since no real
move is needed.

Reported-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: VM LRU bulk move
Matthew Brost [Wed, 7 Jun 2023 15:45:20 +0000 (08:45 -0700)]
drm/xe: VM LRU bulk move

Use the TTM LRU bulk move for BOs tied to a VM. Update the bulk moves
LRU position on every exec.

v2: Bulk move for compute VMs, use WARN rather than BUG

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Only try to lock external BOs in VM bind
Matthew Brost [Tue, 28 Mar 2023 01:34:49 +0000 (18:34 -0700)]
drm/xe: Only try to lock external BOs in VM bind

We only need to try to lock a BO if it's external as non-external BOs
share the dma-resv with the already locked VM. Trying to lock
non-external BOs caused an issue (list corruption) in an uncoming patch
which adds bulk LRU move. Since this code isn't needed, remove it.

v2: New commit message, s/mattthew/matthew/

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Ensure LR engines are not persistent
Matthew Brost [Thu, 13 Apr 2023 01:48:41 +0000 (18:48 -0700)]
drm/xe: Ensure LR engines are not persistent

With our ref counting scheme long running (LR) engines only close
properly if not persistent, ensure that LR engines are non-persistent.

v2: spell out LR

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Long running job update
Matthew Brost [Mon, 22 May 2023 01:24:20 +0000 (18:24 -0700)]
drm/xe: Long running job update

For long running (LR) jobs with the DRM scheduler we must return NULL in
run_job which results in signaling the job's finished fence immediately.
This prevents LR jobs from creating infinite dma-fences.

Signaling job's finished fence immediately breaks flow controlling ring
with the DRM scheduler. To work around this, the ring is flow controlled
and written in the exec IOCTL. Signaling job's finished fence
immediately also breaks the TDR which is used in reset / cleanup entity
paths so write a new path for LR entities.

v2: Better commit, white space, remove rmb(), better comment next to
emit_job()
v3 (Thomas): Change LR reference counting, fix working in commit

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: NULL binding implementation
Matthew Brost [Thu, 15 Jun 2023 18:22:36 +0000 (11:22 -0700)]
drm/xe: NULL binding implementation

Add uAPI and implementation for NULL bindings. A NULL binding is defined
as writes dropped and read zero. A single bit in the uAPI has been added
which results in a single bit in the PTEs being set.

NULL bindings are intendedd to be used to implement VK sparse bindings,
in particular residencyNonResidentStrict property.

v2: Fix BUG_ON shown in VK testing, fix check patch warning, fix
xe_pt_scan_64K, update __gen8_pte_encode to understand NULL bindings,
remove else if vma_addr

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Suggested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/Xe: Use EOPNOTSUPP instead of ENOTSUPP
Janga Rahul Kumar [Tue, 13 Jun 2023 09:37:40 +0000 (15:07 +0530)]
drm/Xe: Use EOPNOTSUPP instead of ENOTSUPP

ENOTSUPP is not a standard Unix error should use
EOPNOTSUPP instead.

v2: Update commit description (Aravind)

Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Janga Rahul Kumar <janga.rahul.kumar@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: limit GGTT size to GUC_GGTT_TOP
Daniele Ceraolo Spurio [Wed, 14 Jun 2023 17:47:54 +0000 (10:47 -0700)]
drm/xe: limit GGTT size to GUC_GGTT_TOP

The GuC can't access addresses above GUC_GGTT_TOP, so any GuC-accessible
objects can't be mapped above that offset. Instead of checking each
object to see if GuC may access it or not before mapping it, we just
limit the GGTT size to GUC_GGTT_TOP. This wastes a bit of address space
(about ~18 MBs, which is in addition to what already removed at the bottom
of the GGTT), but it is a good tradeoff to keep the code simple.

The in-code comment has also been updated to explain the limitation.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://lore.kernel.org/r/20230615002521.2587250-1-daniele.ceraolospurio@intel.com/
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/mtl: Add some initial MTL workarounds
Matt Roper [Thu, 8 Jun 2023 18:12:17 +0000 (11:12 -0700)]
drm/xe/mtl: Add some initial MTL workarounds

This adds a handful of workarounds that apply to production steppings of
MTL:
 - Wa_14018575942
 - Wa_22016670082
 - Wa_14017856879
 - Wa_18019271663

Wa_22016670082 is currently only applied to the primary GT at the
moment, but may need to be extended to the media GT in the future if a
pending update to the workaround database gets finalized.

OOB workarounds will need to be implemented separately in future patches
for Wa_14016712196, Wa_16018063123, and Wa_18013179988.

Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Link: https://lore.kernel.org/r/20230608181217.2385932-1-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Fix check for platform without geometry pipeline
Michał Winiarski [Tue, 23 May 2023 13:50:20 +0000 (15:50 +0200)]
drm/xe: Fix check for platform without geometry pipeline

It's not possible for the condition checking if we're running on
platform without geometry pipeline to ever be true, since
gt->fuse_topo.g_dss_mask is an array.

It also breaks the build:
../drivers/gpu/drm/xe/xe_rtp.c:183:50: error: address of array 'gt->fuse_topo.g_dss_mask' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230523135020.345596-2-michal@hardline.pl
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Fix uninitialized variables
Michał Winiarski [Tue, 23 May 2023 13:50:19 +0000 (15:50 +0200)]
drm/xe: Fix uninitialized variables

Using uninitialized variables leads to undefined behavior.

Moreover, it causes the compiler to complain with:
../drivers/gpu/drm/xe/xe_vm.c:3265:40: error: variable 'vma' is uninitialized when used here [-Werror,-Wuninitialized]
../drivers/gpu/drm/xe/xe_rtp.c:118:36: error: variable 'i' is uninitialized when used here [-Werror,-Wuninitialized]
../drivers/gpu/drm/xe/xe_mocs.c:449:3: error: variable 'flags' is uninitialized when used here [-Werror,-Wuninitialized]

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230523135020.345596-1-michal@hardline.pl
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Fix GT looping for standalone media
Riana Tauro [Tue, 13 Jun 2023 09:42:32 +0000 (15:12 +0530)]
drm/xe: Fix GT looping for standalone media

gt_count is only being incremented when initializing the primary GT;
since the media GT sets the ID directly, gt_count is not incremented
again, resulting in an incorrect count on MTL.  Use autoincrement while
assigning the media GTs ID to ensure gt_count is correct on MTL and
other future platforms with standalone media.

Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Link: https://lore.kernel.org/r/20230613094232.3703549-1-riana.tauro@intel.com
[mattrope: Tweaked commit message to focus on gt_count importance]
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Donot apply forcewake while reading actual frequency
Badal Nilawar [Fri, 9 Jun 2023 02:49:54 +0000 (08:19 +0530)]
drm/xe: Donot apply forcewake while reading actual frequency

RPSTAT1 is an sgunit register and thus doesn't need forcewake.
MTL_MIRROR_TARGET_WP1 is within an "always on" power domain and thus
doesn't require any forcewake to ensure the register is powered
up and usable. When GT is RC6 the actual frequency reported will be 0.

v2:
 - Add bspec index (Anshuman)
 - %s/GEN12_RPSTAT1/GT_PERF_STATUS as per bspec
v3: Update Fixes tag

Bspec: 51837, 67651
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230609024954.987039-1-badal.nilawar@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/guc: Normalize error messages with %#x
Lucas De Marchi [Sun, 11 Jun 2023 22:24:45 +0000 (15:24 -0700)]
drm/xe/guc: Normalize error messages with %#x

One of the messages was printed without 0x prefix, so it was not clear
if it was decimal or hex: make sure to add the prefix by using %#x.
While at it, normalize the other messages in the same function to follow
the same pattern.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20230611222447.2837573-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/guc: Fix typo s/enabled/enable/
Lucas De Marchi [Sun, 11 Jun 2023 22:24:44 +0000 (15:24 -0700)]
drm/xe/guc: Fix typo s/enabled/enable/

Fix the log message when it fails to enable CT.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20230611222447.2837573-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Rename pte/pde encoding functions
Lucas De Marchi [Sun, 11 Jun 2023 22:24:43 +0000 (15:24 -0700)]
drm/xe: Rename pte/pde encoding functions

Remove the leftover TODO by renameing the functions to use xe prefix.
Since the static __gen8_pte_encode() already has a double score,
just remove the prefix.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20230611222447.2837573-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Move XE_PTE_FLAG_READ_ONLY to xe_vm_types.h
Matthew Brost [Wed, 7 Jun 2023 18:51:36 +0000 (11:51 -0700)]
drm/xe: Move XE_PTE_FLAG_READ_ONLY to xe_vm_types.h

XE_PTE_FLAG_READ_ONLY is specific to struct xe_vma, move it from xe_bo.h
to xe_vm_types.h to reflect that.

Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: s/XE_PTE_READ_ONLY/XE_PTE_FLAG_READ_ONLY
Matthew Brost [Wed, 7 Jun 2023 18:43:52 +0000 (11:43 -0700)]
drm/xe: s/XE_PTE_READ_ONLY/XE_PTE_FLAG_READ_ONLY

This define is for internal PTE flags rather than fields in the hardware
PTEs, rename as such. This will help in an upcoming patch to avoid
further confusion.

Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Use Xe ordered workqueue for rebind worker
Matthew Brost [Fri, 9 Jun 2023 18:19:30 +0000 (11:19 -0700)]
drm/xe: Use Xe ordered workqueue for rebind worker

A mix of the system unbound wq and Xe ordered wq was used for the
rebind, only use the Xe ordered wq. This will ensure only 1 rebind is
occuring at a time providing a somewhat clunky work around for short
comings in TTM wrt to memory contention. Once the TTM memory contention
is resolved we should be able to use a dedicated non-ordered workqueue.

Also add helper to queue rebind worker to avoid using wrong workqueue
going forward.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Handle unmapped userptr in analyze VM
Matthew Brost [Fri, 9 Jun 2023 18:09:37 +0000 (11:09 -0700)]
drm/xe: Handle unmapped userptr in analyze VM

A corner exists where a userptr may have no mapping when analyze VM is
called, handle this case.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Emit a render cache flush after each rcs/ccs batch
Thomas Hellström [Fri, 2 Jun 2023 12:44:23 +0000 (14:44 +0200)]
drm/xe: Emit a render cache flush after each rcs/ccs batch

We need to flush render caches before fence signalling, where we might
release the memory for reuse. We can't rely on userspace doing this,
so flush render caches after the batch, but before user fence- and
dma_fence signalling.

Copy the cache flush from i915, but omit PIPE_CONTROL_FLUSH_L3, since it
should be implied by the other flushes. Also omit
PIPE_CONTROL_TLB_INVALIDATE since there should be no apparent need to
invalidate TLB after batch completion.

v2:
- Update Makefile for OOB WA.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Tested-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com> #1
Reported-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/291
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/291
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Invalidate TLB also on bind if in scratch page mode
Thomas Hellström [Mon, 5 Jun 2023 13:04:54 +0000 (15:04 +0200)]
drm/xe: Invalidate TLB also on bind if in scratch page mode

For scratch table mode we need to cover the case where a scratch PTE might
have been pre-fetched and cached and used instead of that of the newly
bound vma.
For compute vms, invalidate TLB globally using GuC before signalling
bind complete. For !long-running vms, invalidate TLB at batch start.

Also document how TLB invalidation works.

v2:
- Fix a pointer to the comment about TLB invalidation (Jose Souza).
- Add a bool to the vm whether we want to invalidate TLB at batch start.
- Invalidate TLB also on BCS- and video engines at batch start where
  needed.
- Use BIT() macro instead of explicit shift.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Tested-by: José Roberto de Souza <jose.souza@intel.com> #v1
Reported-by: José Roberto de Souza <jose.souza@intel.com> #v1
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/291
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/291
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/reg_sr: Apply limit to register whitelisting
Gustavo Sousa [Fri, 9 Jun 2023 14:38:15 +0000 (11:38 -0300)]
drm/xe/reg_sr: Apply limit to register whitelisting

If RING_MAX_NONPRIV_SLOTS denotes the maximum number of whitelisting
slots, then it makes sense to refuse going above it.

v2:
  - Use xe_gt_err() instead of drm_err() for more detailed info in the
    error message. (Matt)

Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230609143815.302540-3-gustavo.sousa@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/reg_sr: Use a single parameter for xe_reg_sr_apply_whitelist()
Gustavo Sousa [Fri, 9 Jun 2023 14:38:14 +0000 (11:38 -0300)]
drm/xe/reg_sr: Use a single parameter for xe_reg_sr_apply_whitelist()

All other parameters can be extracted from a single struct xe_hw_engine
reference. This removes redundancy and simplifies the code.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230609143815.302540-2-gustavo.sousa@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/guc: Read HXG fields from DW1 of G2H response
Matthew Brost [Wed, 18 Jan 2023 02:34:34 +0000 (18:34 -0800)]
drm/xe/guc: Read HXG fields from DW1 of G2H response

The HXG fields are DW1 not DW0, fix this.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Fix some formatting issues in uAPI
Francois Dugast [Thu, 8 Jun 2023 07:59:14 +0000 (09:59 +0200)]
drm/xe: Fix some formatting issues in uAPI

Fix spacing, alignment, and repeated words in the documentation.

Reported-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Group engine related structs
Francois Dugast [Wed, 31 May 2023 15:23:35 +0000 (15:23 +0000)]
drm/xe: Group engine related structs

Move the definition of drm_xe_engine_class_instance to group it with
other engine related structs and to follow the ioctls order.

Reported-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Use SPDX-License-Identifier instead of license text
Francois Dugast [Wed, 31 May 2023 15:23:34 +0000 (15:23 +0000)]
drm/xe: Use SPDX-License-Identifier instead of license text

Replace the license text with its SPDX-License-Identifier for
quick identification of the license and consistency with the
rest of the driver.

Reported-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/wa: Extend scope of Wa_14015795083
Matt Roper [Fri, 2 Jun 2023 23:10:54 +0000 (16:10 -0700)]
drm/xe/wa: Extend scope of Wa_14015795083

Wa_14015795083 was already implemented for DG2 and PVC, but the
workaround database has been updated to extend it to more platforms.  It
should now apply to all platforms with graphics versions 12.00 - 12.60,
as well as A-step of Xe_LPG (12.70 / 12.71).

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://lore.kernel.org/r/20230602231054.1306865-1-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: REBAR resize should be best effort
Michael J. Ruhl [Mon, 5 Jun 2023 16:08:56 +0000 (12:08 -0400)]
drm/xe: REBAR resize should be best effort

The resizing of the PCI BAR is a best effort feature.  If it is
not available, it should not fail the driver probe.

Rework the resize to not exit on failure.

Fixes: 7f075300a318 ("drm/xe: Simplify rebar sizing")
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Don't hardcode GuC's MOCS index in register header
Matt Roper [Fri, 2 Jun 2023 23:52:10 +0000 (16:52 -0700)]
drm/xe: Don't hardcode GuC's MOCS index in register header

Although PVC is currently the only platform that needs us to program a
GuC register with the index of an uncached MOCS entry, it's likely other
platforms will need this in the future.  Rather than hardcoding PVC's
index into the register header, we should just pull the appropriate
index from gt->mocs.uc_index to future-proof the code.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20230602235210.1314028-3-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Initialize MOCS earlier
Matt Roper [Fri, 2 Jun 2023 23:52:09 +0000 (16:52 -0700)]
drm/xe: Initialize MOCS earlier

xe_mocs_init_early doesn't touch the hardware, it just sets up internal
software state.  There's no need to perform this step in the "forcewake
held" region.  Moving the init earlier will also make the uc_index
values available earlier which will be important for an upcoming GuC
init patch.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20230602235210.1314028-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Reformat xe_guc_regs.h
Matt Roper [Fri, 2 Jun 2023 23:52:08 +0000 (16:52 -0700)]
drm/xe: Reformat xe_guc_regs.h

Reformat the GuC register header according to the same rules used by
other register headers:
 - Register definitions are ordered by offset
 - Value of #define's start on column 49
 - Lowercase used for hex values

No functional change.

This header has some things that aren't directly related to register
definitions (e.g., number of doorbells, doorbell info structure, GuC
interrupt vector layout, etc.  These items have been moved to the bottom
of the header.

Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20230602235210.1314028-1-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Replace deprecated DRM_ERROR()
Gustavo Sousa [Thu, 1 Jun 2023 19:44:19 +0000 (16:44 -0300)]
drm/xe: Replace deprecated DRM_ERROR()

DRM_ERROR() has been deprecated in favor of pr_err(). However, we should
prefer to use xe_gt_err() or drm_err() whenever possible so we get gt-
or device-specific output with the error message.

v2:
  - Prefer drm_err() over pr_err(). (Matt, Jani)
v3:
  - Prefer xe_gt_err() over drm_err() when possible. (Matt)
v4:
  - Use the already available dev variable instead of xe->drm as
    parameter to drm_err(). (Matt)

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Haridhar Kalvala <haridhar.kalvala@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230601194419.1179609-1-gustavo.sousa@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Add kerneldoc description of multi-tile devices
Matt Roper [Thu, 1 Jun 2023 21:52:44 +0000 (14:52 -0700)]
drm/xe: Add kerneldoc description of multi-tile devices

v2:
 - Fix doubled word.  (Lucas)

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-32-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Reinstate media GT support
Matt Roper [Thu, 1 Jun 2023 21:52:43 +0000 (14:52 -0700)]
drm/xe: Reinstate media GT support

Now that tiles and GTs are handled separately and other prerequisite
changes are in place, we're ready to re-enable the media GT.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-31-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Update query uapi to support standalone media
Matt Roper [Thu, 1 Jun 2023 21:52:42 +0000 (14:52 -0700)]
drm/xe: Update query uapi to support standalone media

Now that a higher GT count can result from either multiple tiles (with
one GT each) or an extra media GT within the root tile, we need to
update the query code slightly to stop looking at tile_count.

FIXME: As noted previously, we need to decide on a formal direction for
exposing tiles and/or GTs to userspace.

v2:
 - Drop num_gt() function in favor of stored xe->info.gt_count.  (Brian)
v3:
 - Keep XE_QUERY_GT_TYPE_REMOTE around for now.  Userspace probably
   doesn't actually need this, and we may remove it in the future, but
   for now let's avoid changing uapi.  (Brian)

Cc: Brian Welty <brian.welty@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-30-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Allow GT looping and lookup on standalone media
Matt Roper [Thu, 1 Jun 2023 21:52:41 +0000 (14:52 -0700)]
drm/xe: Allow GT looping and lookup on standalone media

Allow xe_device_get_gt() and for_each_gt() to operate as expected on
platforms with standalone media.

FIXME: We need to figure out a consistent ID scheme for GTs.  This patch
keeps the pre-existing behavior of 0/1 being the GT IDs for both PVC
(multi-tile) and MTL (multi-GT), but depending on the direction we
decide to go with uapi, we may change this in the future (e.g., to
return 0/1 on PVC and 0/2 on MTL).  Or if we decide we only need to
expose tiles to userspace and not GTs, we may not even need ID numbers
for the GTs anymore.

v2:
 - Restructure a bit to make the assertions more clear.
 - Clarify in commit message that the goal here is to preserve existing
   behavior; UAPI-visible changes may be introduced in the future once
   we settle on what we really want.
v3:
 - Store total GT count in xe_device for ease of lookup.  (Brian)
 - s/(id__++)/(id__)++/  (Gustavo)

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Cc: Brian Welty <brian.welty@intel.com>
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-29-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/tlb: Obtain forcewake when doing GGTT TLB invalidations
Matt Roper [Thu, 1 Jun 2023 21:52:40 +0000 (14:52 -0700)]
drm/xe/tlb: Obtain forcewake when doing GGTT TLB invalidations

Updates to the GGTT can happen when there are no in-flight jobs keeping
the hardware awake.  If the GT is powered down when invalidation is
requested, we will not be able to communicate with the GuC (or MMIO) and
the invalidation request will go missing.  Explicitly grab GT forcewake
to ensure the GT and GuC are powered up during the TLB invalidation.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-28-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Invalidate TLB on all affected GTs during GGTT updates
Matt Roper [Thu, 1 Jun 2023 21:52:39 +0000 (14:52 -0700)]
drm/xe: Invalidate TLB on all affected GTs during GGTT updates

The GGTT is part of the tile and is shared by the primary and media GTs
on platforms with a standalone media architecture.  However each of
these GTs has its own TLBs caching the page table lookups, and each
needs to be invalidated separately.

Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-27-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Replace xe_gt_irq_postinstall with xe_irq_enable_hwe
Matt Roper [Thu, 1 Jun 2023 21:52:38 +0000 (14:52 -0700)]
drm/xe: Replace xe_gt_irq_postinstall with xe_irq_enable_hwe

The majority of xe_gt_irq_postinstall() is really focused on the
hardware engine interrupts; other GT-related interrupts such as the GuC
are enabled/disabled independently.  Renaming the function and making it
truly GT-specific will make it more clear what the intended focus is.

Disabling/masking of other interrupts (such as GuC interrupts) is
unnecessary since that has already happened during the irq_reset stage,
and doing so will become harmful once the media GT is re-enabled since
calls to xe_gt_irq_postinstall during media GT initialization would
incorrectly disable the primary GT's GuC interrupts.

Also, since this function is called from gt_fw_domain_init(), it's not
necessary to also call it earlier during xe_irq_postinstall; just
xe_irq_resume to handle runtime resume should be sufficient.

v2:
 - Drop unnecessary !gt check.  (Lucas)
 - Reword some comments about enable/unmask for clarity.  (Lucas)

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-26-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/irq: Untangle postinstall functions
Matt Roper [Thu, 1 Jun 2023 21:52:37 +0000 (14:52 -0700)]
drm/xe/irq: Untangle postinstall functions

The xe_irq_postinstall() never actually gets called after installing the
interrupt handler.  This oversight seems to get papered over due to the
fact that the (misnamed) xe_gt_irq_postinstall does more than it really
should and gets called in the middle of the GT initialization.  The
callstack for postinstall is also a bit muddled with top-level device
interrupt enablement happening within platform-specific functions called
from the per-tile xe_gt_irq_postinstall() function.

Clean this all up by adding the missing call to xe_irq_postinstall()
after installing the interrupt handler and pull top-level irq enablement
up to xe_irq_postinstall where we'd expect it to be.

The xe_gt_irq_postinstall() function is still a bit misnamed here; an
upcoming patch will refocus its purpose and rename it.

v2:
 - Squash in patch to actually call xe_irq_postinstall() after
   installing the interrupt handler.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-25-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/irq: Ensure primary GuC won't clobber media GuC's interrupt mask
Matt Roper [Thu, 1 Jun 2023 21:52:36 +0000 (14:52 -0700)]
drm/xe/irq: Ensure primary GuC won't clobber media GuC's interrupt mask

Although primary and media GuC share a single interrupt enable bit, they
each have distinct bits in the mask register.  Although we always enable
interrupts for the primary GuC before the media GuC today (and never
disable either of them), this might not always be the case in the
future, so use a RMW when updating the mask register to ensure the other
GuC's mask doesn't get clobbered.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-24-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/irq: Move ASLE backlight interrupt logic
Matt Roper [Thu, 1 Jun 2023 21:52:35 +0000 (14:52 -0700)]
drm/xe/irq: Move ASLE backlight interrupt logic

Our only use of GUnit interrupts is to handle ASLE backlight operations
that are reported as GUnit GSE interrupts.  Move the enable/disable of
these interrupts to a more sensible place, in the same area where we
expect display interrupt code to be added by future patches.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-23-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Interrupts are delivered per-tile, not per-GT
Matt Roper [Thu, 1 Jun 2023 21:52:34 +0000 (14:52 -0700)]
drm/xe: Interrupts are delivered per-tile, not per-GT

IRQ delivery and handling needs to be handled on a per-tile basis.  Note
that this is true even for the "GT interrupts" relating to engines and
GuCs --- the interrupts relating to both GTs get raised through a single
set of registers in the tile's sgunit range.

On true multi-tile platforms, interrupts on remote tiles are internally
forwarded to the root tile; the first thing the top-level interrupt
handler should do is consult the root tile's instance of
DG1_MSTR_TILE_INTR to determine which tile(s) had interrupts.  This
register is also responsible for enabling/disabling top-level reporting
of any interrupts to the OS.  Although this register technically exists
on all tiles, it should only be used on the root tile.

The (mis)use of struct xe_gt as a target for MMIO operations in the
driver makes the code somewhat confusing since we wind up needing a GT
pointer to handle programming that's unrelated to the GT.  To mitigate
this confusion, all of the xe_gt structures used solely as an MMIO
target in interrupt code are renamed to 'mmio' so that it's clear that
the structure being passed does not necessarily relate to any specific
GT (primary or media) that we might be dealing with interrupts for.
Reworking the driver's MMIO handling to not be dependent on xe_gt is
planned as a future patch series.

Note that GT initialization code currently calls xe_gt_irq_postinstall()
in an attempt to enable the HWE interrupts for the GT being initialized.
Unfortunately xe_gt_irq_postinstall() doesn't really match its name and
does a bunch of other stuff unrelated to the GT interrupts (such as
enabling the top-level device interrupts).  That will be addressed in
future patches.

v2:
 - Clarify commit message with explanation of why DG1_MSTR_TILE_INTR is
   only used on the root tile, even though it's an sgunit register that
   is technically present in each tile's MMIO space.  (Aravind)
 - Also clarify that the xe_gt used as a target for MMIO operations may
   or may not relate to the GT we're dealing with for interrupts.
   (Lucas)

Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-22-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Add media GT to tile
Matt Roper [Thu, 1 Jun 2023 21:52:32 +0000 (14:52 -0700)]
drm/xe: Add media GT to tile

This media_gt pointer isn't actually allocated yet.  Future patches will
start hooking it up at appropriate places in the code, and then creation
of the media GT will be added once those infrastructure changes are in
place.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-20-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Allocate GT dynamically
Matt Roper [Thu, 1 Jun 2023 21:52:31 +0000 (14:52 -0700)]
drm/xe: Allocate GT dynamically

In preparation for re-adding media GT support, switch the primary GT
within the tile to a dynamic allocation.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-19-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Drop extra_gts[] declarations and XE_GT_TYPE_REMOTE
Matt Roper [Thu, 1 Jun 2023 21:52:30 +0000 (14:52 -0700)]
drm/xe: Drop extra_gts[] declarations and XE_GT_TYPE_REMOTE

Now that tiles and GTs are handled separately, extra_gts[] doesn't
really provide any useful information that we can't just infer directly.
The primary GT of the root tile and of the remote tiles behave the same
way and don't need independent handling.

When we re-add support for media GTs in a future patch, the presence of
media can be determined from MEDIA_VER() (i.e., >= 13) and media's GSI
offset handling is expected to remain constant for all forseeable future
platforms, so it won't need to be provided in a definition structure
either.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-18-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Drop vram_id
Matt Roper [Thu, 1 Jun 2023 21:52:29 +0000 (14:52 -0700)]
drm/xe: Drop vram_id

The VRAM ID is always the tile ID; there's no need to track it
separately within a GT.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-17-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Clarify 'gt' retrieval for primary tile
Matt Roper [Thu, 1 Jun 2023 21:52:28 +0000 (14:52 -0700)]
drm/xe: Clarify 'gt' retrieval for primary tile

There are a bunch of places in the driver where we need to perform
non-GT MMIO against the platform's primary tile (display code, top-level
interrupt enable/disable, driver initialization, etc.).  Rename
'to_gt()' to 'xe_primary_mmio_gt()' to clarify that we're trying to get
a primary MMIO handle for these top-level operations.

In the future we need to move away from xe_gt as the target for MMIO
operations (most of which are completely unrelated to GT).

v2:
 - s/xe_primary_mmio_gt/xe_root_mmio_gt/ for more consistency with how
   we refer to tile 0.  (Lucas)
v3:
 - Tweak comment on xe_root_mmio_gt().  (Lucas)

Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-16-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Move migration from GT to tile
Matt Roper [Thu, 1 Jun 2023 21:52:27 +0000 (14:52 -0700)]
drm/xe: Move migration from GT to tile

Migration primarily focuses on the memory associated with a tile, so it
makes more sense to track this at the tile level (especially since the
driver was already skipping migration operations on media GTs).

Note that the blitter engine used to perform the migration always lives
in the tile's primary GT today.  In theory that could change if media
GTs ever start including blitter engines in the future, but we can
extend the design if/when that happens in the future.

v2:
 - Fix kunit test build
 - Kerneldoc parameter name update
v3:
 - Removed leftover prototype for removed function.  (Gustavo)
 - Remove unrelated / unwanted error handling change.  (Gustavo)

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-15-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Memory allocations are tile-based, not GT-based
Matt Roper [Thu, 1 Jun 2023 21:52:25 +0000 (14:52 -0700)]
drm/xe: Memory allocations are tile-based, not GT-based

Since memory and address spaces are a tile concept rather than a GT
concept, we need to plumb tile-based handling through lots of
memory-related code.

Note that one remaining shortcoming here that will need to be addressed
before media GT support can be re-enabled is that although the address
space is shared between a tile's GTs, each GT caches the PTEs
independently in their own TLB and thus TLB invalidation should be
handled at the GT level.

v2:
 - Fix kunit test build.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-13-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Move VRAM from GT to tile
Matt Roper [Thu, 1 Jun 2023 21:52:23 +0000 (14:52 -0700)]
drm/xe: Move VRAM from GT to tile

On platforms with VRAM, the VRAM is associated with the tile, not the
GT.

v2:
 - Unsquash the GGTT handling back into its own patch.
 - Fix kunit test build
v3:
 - Tweak the "FIXME" comment to clarify that this function will be
   completely gone by the end of the series.  (Lucas)
v4:
 - Move a few changes that were supposed to be part of the GGTT patch
   back to that commit.  (Gustavo)
v5:
 - Kerneldoc parameter name fix.

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-11-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Move GGTT from GT to tile
Matt Roper [Thu, 1 Jun 2023 21:52:21 +0000 (14:52 -0700)]
drm/xe: Move GGTT from GT to tile

The GGTT exists at the tile level.  When a tile contains multiple GTs,
they share the same GGTT.

v2:
 - Include some changes that were mis-squashed into the VRAM patch.
   (Gustavo)

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-9-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Move register MMIO into xe_tile
Matt Roper [Thu, 1 Jun 2023 21:52:19 +0000 (14:52 -0700)]
drm/xe: Move register MMIO into xe_tile

Each tile has its own register region in the BAR, containing instances
of all registers for the platform.  In contrast, the multiple GTs within
a tile share the same MMIO space; there's just a small subset of
registers (the GSI registers) which have multiple copies at different
offsets (0x0 for primary GT, 0x380000 for media GT).  Move the register
MMIO region size/pointers to the tile structure, leaving just the GSI
offset information in the GT structure.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-7-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Add for_each_tile iterator
Matt Roper [Thu, 1 Jun 2023 21:52:18 +0000 (14:52 -0700)]
drm/xe: Add for_each_tile iterator

As we start splitting tile handling out from GT handling, we'll need to
be able to iterate over tiles separately from GTs.  This iterator will
be used in upcoming patches.

v2:
 - s/(id__++)/(id__)++/  (Gustavo)

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-6-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Add backpointer from gt to tile
Matt Roper [Thu, 1 Jun 2023 21:52:16 +0000 (14:52 -0700)]
drm/xe: Add backpointer from gt to tile

Rather than a backpointer to the xe_device, a GT should have a
backpointer to its tile (which can then be used to lookup the device if
necessary).

The gt_to_xe() helper macro (which moves from xe_gt.h to xe_gt_types.h)
can and should still be used to jump directly from an xe_gt to
xe_device.

v2:
 - Fix kunit test build
 - Move a couple changes to the previous patch. (Lucas)

Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-4-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Introduce xe_tile
Matt Roper [Thu, 1 Jun 2023 21:52:15 +0000 (14:52 -0700)]
drm/xe: Introduce xe_tile

Create a new xe_tile structure to begin separating the concept of "tile"
from "GT."  A tile is effectively a complete GPU, and a GT is just one
part of that.  On platforms like MTL, there's only a single full GPU
(tile) which has its IP blocks provided by two GTs.  In contrast, a
"multi-tile" platform like PVC is basically multiple complete GPUs
packed behind a single PCI device.

For now, just create xe_tile as a simple wrapper around xe_gt.  The
items in xe_gt that are truly tied to the tile rather than the GT will
be moved in future patches.  Support for multiple GTs per tile (i.e.,
the MTL standalone media case) will also be re-introduced in a future
patch.

v2:
 - Fix kunit test build
 - Move hunk from next patch to use local tile variable rather than
   direct xe->tiles[id] accesses.  (Lucas)
 - Mention compute in kerneldoc.  (Rodrigo)

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-3-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/mtl: Disable media GT
Matt Roper [Thu, 1 Jun 2023 21:52:14 +0000 (14:52 -0700)]
drm/xe/mtl: Disable media GT

Xe incorrectly conflates the concept of 'tile' and 'GT.'  Since MTL's
media support is not yet functioning properly, let's just disable it
completely for now while we fix the fundamental driver design.  Support
for media GTs on platforms like MTL will be re-added later.

v2:
 - Drop some unrelated code cleanup that didn't belong in this patch.
   (Lucas)
v3:
 - Drop unnecessary xe_gt.h include.  (Gustavo)

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/vm: fix double list add
Matthew Auld [Thu, 1 Jun 2023 12:35:05 +0000 (13:35 +0100)]
drm/xe/vm: fix double list add

It looks like the driver only wants to track one vma for each external
object per vm. However it looks like bo_has_vm_references_locked() will
ignore any vma that is marked as vma->destroyed (not actually destroyed
yet). If we then mark our externally tracked vma as destroyed and then
create a new vma for the same object and vm, we can have two externally
tracked vma for the same object and vm. When the destroy actually
happens it tries to move the external tracking to a different vma, but
in this case it is already being tracked, leading to double list add
errors. It should be safe to simply drop the destroyed check in
bo_has_vm_references(), since the actual destroy will switch the
external tracking to the next available vma.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/290
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Replace PVC check by engine type check
José Roberto de Souza [Tue, 23 May 2023 20:14:45 +0000 (13:14 -0700)]
drm/xe: Replace PVC check by engine type check

__emit_job_gen12_render_compute() masks some PIPE_CONTROL bits that
do not exist in platforms without render engine.
So here replacing the PVC check by something more generic that will
support any future platforms without render engine.

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Rename GPU offset helper to reflect true usage
Michael J. Ruhl [Thu, 25 May 2023 19:43:26 +0000 (15:43 -0400)]
drm/xe: Rename GPU offset helper to reflect true usage

The _io_offset helper function is returning an offset into the GPU
address space.  Using the CPU address offset (io_) is not correct.

Rename to reflect usage.
Update to use GPU offset information.
Update PT dma_offset to use the helper

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Size GT device memory correctly
Michael J. Ruhl [Thu, 25 May 2023 19:43:25 +0000 (15:43 -0400)]
drm/xe: Size GT device memory correctly

The current method of sizing GT device memory is not quite right.

Update the algorithm to use the relevant HW information and offsets
to set up the sizing correctly.

Update the stolen memory sizing to reflect the changes, and to be
GT specific.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Simplify rebar sizing
Michael J. Ruhl [Thu, 25 May 2023 19:43:24 +0000 (15:43 -0400)]
drm/xe: Simplify rebar sizing

"Right sizing" the PCI BAR is not necessary.  If rebar is needed
size to the maximum available.

Preserve the force_vram_bar_size sizing.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Rework size helper to be a little more correct
Michael J. Ruhl [Thu, 25 May 2023 19:43:23 +0000 (15:43 -0400)]
drm/xe: Rework size helper to be a little more correct

The _total_vram_size helper is device based and is not complete.

Teach the helper to be tile aware and add the ability to size
DG1 correctly.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Prevent evicting for page tables
Maarten Lankhorst [Thu, 25 May 2023 19:14:29 +0000 (21:14 +0200)]
drm/xe: Prevent evicting for page tables

When creating page tables from xe_exec_ioctl, we may end up freeing
memory we just validated. To be certain this does not happen, do not
allow the current reservation to be evicted from the ioctl.

Callchain:
[  109.008522]  xe_bo_move_notify+0x5c/0xf0 [xe]
[  109.008548]  xe_bo_move+0x90/0x510 [xe]
[  109.008573]  ttm_bo_handle_move_mem+0xb7/0x170 [ttm]
[  109.008581]  ttm_bo_swapout+0x15e/0x360 [ttm]
[  109.008586]  ttm_device_swapout+0xc2/0x110 [ttm]
[  109.008592]  ttm_global_swapout+0x47/0xc0 [ttm]
[  109.008598]  ttm_tt_populate+0x7a/0x130 [ttm]
[  109.008603]  ttm_bo_handle_move_mem+0x160/0x170 [ttm]
[  109.008609]  ttm_bo_validate+0xe5/0x1d0 [ttm]
[  109.008614]  ttm_bo_init_reserved+0xac/0x190 [ttm]
[  109.008620]  __xe_bo_create_locked+0x153/0x260 [xe]
[  109.008645]  xe_bo_create_locked_range+0x77/0x360 [xe]
[  109.008671]  xe_bo_create_pin_map_at+0x33/0x1f0 [xe]
[  109.008695]  xe_bo_create_pin_map+0x11/0x20 [xe]
[  109.008721]  xe_pt_create+0x69/0xf0 [xe]
[  109.008749]  xe_pt_stage_bind_entry+0x208/0x430 [xe]
[  109.008776]  xe_pt_walk_range+0xe9/0x2a0 [xe]
[  109.008802]  xe_pt_walk_range+0x223/0x2a0 [xe]
[  109.008828]  xe_pt_walk_range+0x223/0x2a0 [xe]
[  109.008853]  __xe_pt_bind_vma+0x28d/0xbd0 [xe]
[  109.008878]  xe_vm_bind_vma+0xc7/0x2f0 [xe]
[  109.008904]  xe_vm_rebind+0x72/0x160 [xe]
[  109.008930]  xe_exec_ioctl+0x22b/0xa70 [xe]
[  109.008955]  drm_ioctl_kernel+0xb9/0x150 [drm]
[  109.008972]  drm_ioctl+0x210/0x430 [drm]
[  109.008988]  __x64_sys_ioctl+0x85/0xb0
[  109.008990]  do_syscall_64+0x38/0x90
[  109.008991]  entry_SYSCALL_64_after_hwframe+0x72/0xdc

Original warning:
[ 5613.149126] WARNING: CPU: 3 PID: 45883 at drivers/gpu/drm/xe/xe_vm.c:504 xe_vm_unlock_dma_resv+0x43/0x50 [xe]
...
[ 5613.226398] RIP: 0010:xe_vm_unlock_dma_resv+0x43/0x50 [xe]
[ 5613.316098] Call Trace:
[ 5613.318595]  <TASK>
[ 5613.320743]  xe_exec_ioctl+0x383/0x8a0 [xe]
[ 5613.325278]  ? __is_insn_slot_addr+0x8e/0x110
[ 5613.329719]  ? __is_insn_slot_addr+0x8e/0x110
[ 5613.334116]  ? kernel_text_address+0x75/0xf0
[ 5613.338429]  ? __pfx_stack_trace_consume_entry+0x10/0x10
[ 5613.343778]  ? __kernel_text_address+0x9/0x40
[ 5613.348181]  ? unwind_get_return_address+0x1a/0x30
[ 5613.353013]  ? __pfx_stack_trace_consume_entry+0x10/0x10
[ 5613.358362]  ? arch_stack_walk+0x99/0xf0
[ 5613.362329]  ? rcu_read_lock_sched_held+0xb/0x70
[ 5613.366996]  ? lock_acquire+0x287/0x2f0
[ 5613.370873]  ? rcu_read_lock_sched_held+0xb/0x70
[ 5613.375530]  ? rcu_read_lock_sched_held+0xb/0x70
[ 5613.380181]  ? lock_release+0x225/0x2e0
[ 5613.384059]  ? __pfx_xe_exec_ioctl+0x10/0x10 [xe]
[ 5613.389092]  drm_ioctl_kernel+0xc0/0x170
[ 5613.393068]  drm_ioctl+0x1b7/0x490
[ 5613.396519]  ? __pfx_xe_exec_ioctl+0x10/0x10 [xe]
[ 5613.401547]  ? lock_release+0x225/0x2e0
[ 5613.405432]  __x64_sys_ioctl+0x8a/0xb0
[ 5613.409232]  do_syscall_64+0x37/0x90

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/239
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: keep pulling mem_access_get further back
Matthew Auld [Wed, 24 May 2023 17:56:54 +0000 (18:56 +0100)]
drm/xe: keep pulling mem_access_get further back

Lockdep is unhappy about ggtt->lock -> runtime_pm, where it seems
to think this can somehow get inverted. The ggtt->lock looks like a
potentially sensitive driver lock, so likely a sensible move to never
call the runtime_pm routines while holding it. Actually it looks like
d3cold wants to grab this, so perhaps this can indeed deadlock.

v2:
 - Don't forget about xe_gt_tlb_invalidation_vma(), which now needs
   explicit access_get.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: don't allocate under ct->lock
Matthew Auld [Wed, 24 May 2023 17:56:53 +0000 (18:56 +0100)]
drm/xe: don't allocate under ct->lock

Seems to be a sensitive lock, where ct->lock looks to be primed with
fs_reclaim, so holding that and then allocating memory will cause
lockdep to complain. We need to change the ordering wrt to grabbing the
ct->lock and potentially grabbing the runtime_pm, since some of the
runtime_pm routines can allocate memory (or at least that's what lockdep
seems to suggest).

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/migrate: retain CCS aux state for vram -> vram
Matthew Auld [Thu, 25 May 2023 11:45:43 +0000 (12:45 +0100)]
drm/xe/migrate: retain CCS aux state for vram -> vram

There is no mention that migrate_copy() will skip copying the CCS aux
state for all types of vram -> vram transfers.  Currently we don't need
such a facility but might be surprising if we ever do.

v2: (Lucas):
  - s/lmem/vram/ in the commit message
  - Tidy up the code a bit; use one emit_copy_ccs()
v3:
  - Reword the commit message

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/bo: further limit where CCS pages are needed
Matthew Auld [Thu, 25 May 2023 11:45:42 +0000 (12:45 +0100)]
drm/xe/bo: further limit where CCS pages are needed

No need to allocate extra pages for this if we know flat-ccs AUX state
is not even possible, like for normal system memory objects.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Support copying of data between system memory bos
Thomas Hellström [Wed, 24 May 2023 16:52:29 +0000 (16:52 +0000)]
drm/xe: Support copying of data between system memory bos

Modify the xe_migrate_copy() function somewhat to explicitly allow
copying of data between two buffer objects including system memory
buffer objects. Update the migrate test accordingly.

v2:
- Check that buffer object sizes match when copying (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Fix the migrate selftest for integrated GPUs
Thomas Hellström [Wed, 24 May 2023 16:51:42 +0000 (16:51 +0000)]
drm/xe: Fix the migrate selftest for integrated GPUs

The TTM resource cursor was set up incorrectly.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/guc: Port Wa_14014475959 to xe_wa and fix it
Lucas De Marchi [Fri, 26 May 2023 16:43:58 +0000 (09:43 -0700)]
drm/xe/guc: Port Wa_14014475959 to xe_wa and fix it

Port Wa_14014475959 to xe_wa fixing its condition. The workaround should
only be applied on the primary GT, not on media. So just checking by
MTL platform is not enough: checking GT is of the right type is also
needed.

Since the GRAPHICS_STEP() does checks the GT type, we could leave the
first check as a platform one: it'd would be easier to understand and
not go out of sync with the graphics_ip_map[] in
drivers/gpu/drm/xe/xe_pci.c. However it also means that new platforms
using the same IP wouldn't match. Prefer using the IP version.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-22-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/rtp: Also check gt type
Lucas De Marchi [Fri, 26 May 2023 16:43:57 +0000 (09:43 -0700)]
drm/xe/rtp: Also check gt type

When running rules on MTL and beyond that have media as a standalone GT,
the rule should only match if the gt passed as parameter match the
version/range/stepping that the rule is checking. This allows
workarounds affecting only the media GT to be applied only on that GT
and vice-versa.

For platforms before MTL, the GT will not be of media type, even if it
includes media engines. Make sure to cover that case by checking if the
platforma has standalone media.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-21-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/guc: Port Wa_1509372804 to xe_wa
Lucas De Marchi [Fri, 26 May 2023 16:43:56 +0000 (09:43 -0700)]
drm/xe/guc: Port Wa_1509372804 to xe_wa

Port Wa_1509372804 to xe_wa so it's reported as active.

v2: Match workaround database, starting from A0 stepping (Matt Roper)

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-20-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/guc: Port Wa_16015675438/Wa_18020744125 to xe_wa
Lucas De Marchi [Fri, 26 May 2023 16:43:55 +0000 (09:43 -0700)]
drm/xe/guc: Port Wa_16015675438/Wa_18020744125 to xe_wa

Wa_16015675438 and Wa_18020744125 apply to DG2 using the same action and
conditions. Add both to the oob rules so they are both reported as
active. Note that previously they were not checking by platform or IP
version, hence making them not future-proof.  Those workarounds should
only be active in PVC and DG2, besides the check for "no render engine".

v2: From current WA database, Wa_16015675438 applies to all DG2
    subplatforms except G11. Migrate condition to use subplatform and
    remove G11 from the match (Matt Roper)

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-19-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/guc: Port Wa_22012727170/Wa_22012727685 to xe_wa
Lucas De Marchi [Fri, 26 May 2023 16:43:54 +0000 (09:43 -0700)]
drm/xe/guc: Port Wa_22012727170/Wa_22012727685 to xe_wa

Wa_22012727170 and Wa_22012727685 apply to DG2 using the same action and
conditions. Add both to the oob rules so they are both reported as
active.

Do not Wa_22012727170 to PVC and MTL since only early A* steppings are
affected.

v2: Remove DG2_G10 from Wa_22012727685 to match current WA database
    (Matt Roper)
v3: GRAPHICS_STEP(A0, FOREVER) can be left alone for DG2 as this means
    all steppings

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-18-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/guc: Port Wa_16011777198 to xe_wa
Lucas De Marchi [Fri, 26 May 2023 16:43:53 +0000 (09:43 -0700)]
drm/xe/guc: Port Wa_16011777198 to xe_wa

Port Wa_16011777198 to xe_wa so it's reported as active.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-17-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/guc: Port Wa_14012197797/Wa_22011391025 to xe_wa
Lucas De Marchi [Fri, 26 May 2023 16:43:52 +0000 (09:43 -0700)]
drm/xe/guc: Port Wa_14012197797/Wa_22011391025 to xe_wa

Wa_14012197797 and Wa_22011391025 apply to DG2 using the same action.
They apply to slightly different conditions. Add both to the oob rules
so they are both reported as active.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-16-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/guc: Port Wa_16011759253 to xe_wa
Lucas De Marchi [Fri, 26 May 2023 16:43:51 +0000 (09:43 -0700)]
drm/xe/guc: Port Wa_16011759253 to xe_wa

Port Wa_16011759253 to oob. Wa_22011383443, that has the same action,
doesn't need to be ported as it targets early PVC steppings.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-15-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/guc: Port Wa_22012773006 to xe_wa
Lucas De Marchi [Fri, 26 May 2023 16:43:50 +0000 (09:43 -0700)]
drm/xe/guc: Port Wa_22012773006 to xe_wa

Let xe_guc.c start using XE_WA() for workarounds, starting from a simple
one: Wa_22012773006. It's also changed to start with graphics version
12, since that is the first supported by xe.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-14-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Add support for OOB workarounds
Lucas De Marchi [Fri, 26 May 2023 16:43:49 +0000 (09:43 -0700)]
drm/xe: Add support for OOB workarounds

There are WAs that, due to their nature, cannot be applied from a
central place like xe_wa.c. Those are peppered around the rest of the
code, as needed. Now they have a new name:  "out-of-band workarounds".

These workarounds have their names and rules still grouped in xe_wa.c,
inside the xe_wa_oob array, which is generated at compile time by
xe_wa_oob.rules and the hostprog xe_gen_wa_oob. The code generation
guarantees that the header xe_wa_oob.h contains the IDs for the
workarounds that match the index in the table. This way the runtime
checks that are spread throughout the code are simple tests against the
bitmap saved during initialization.

v2: Fix prev_name tracking not working when it's empty, i.e. when there
    is more than 1 continuation rule.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-13-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Include build directory
Lucas De Marchi [Fri, 26 May 2023 16:43:48 +0000 (09:43 -0700)]
drm/xe: Include build directory

When doing out-of-tree builds with O= or KBUILD_OUTPUT=, it's important
to also add the directory where the target is saved. Otherwise any file
generated by the build system may not be available for other targets
depending on it.

The $(obj) is added automatically when building the entire kernel,
but it's not added when M=drivers/gpu/drm/xe is added.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-12-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/rtp: Add support for entries with no action
Lucas De Marchi [Fri, 26 May 2023 16:43:47 +0000 (09:43 -0700)]
drm/xe/rtp: Add support for entries with no action

Add a separate struct to hold entries in a table that has no action
associated with each of them. The goal is that the caller in future can
set a per-context callback, or just use the active entry marking
feature.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-11-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/rtp: Add check for media stepping
Lucas De Marchi [Fri, 26 May 2023 16:43:46 +0000 (09:43 -0700)]
drm/xe/rtp: Add check for media stepping

Start differentiating the media and graphics stepping as it will be
important for MTL. Note that RTP is still not prepared to handle the
different types of GT, i.e. checking for graphics version/range/stepping
on a media gt or vice versa still matches regardless of the gt being
passed as parameter. Changing it to accommodate MTL is left for a future
patch.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-10-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/rtp: Rename STEP to GRAPHICS_STEP
Lucas De Marchi [Fri, 26 May 2023 16:43:45 +0000 (09:43 -0700)]
drm/xe/rtp: Rename STEP to GRAPHICS_STEP

Rename the RTP match in order to prepare the code base to check for the
media version. Up until MTL, the graphics vs media distinction wrt to
stepping was not ver relevant as they were the same GT. However, with
MTL this is no longer true.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-9-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/debugfs: Dump active workarounds
Lucas De Marchi [Fri, 26 May 2023 16:43:44 +0000 (09:43 -0700)]
drm/xe/debugfs: Dump active workarounds

Add a "workarounds" node in debugfs that can dump all the active
workarounds using the information recorded by rtp infra when those
workarounds were processed.

v2: move workarounds to be reported per-GT

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-8-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/wa: Track gt/engine/lrc active workarounds
Lucas De Marchi [Fri, 26 May 2023 16:43:43 +0000 (09:43 -0700)]
drm/xe/wa: Track gt/engine/lrc active workarounds

Allocate the data to track workarounds on each gt of the device,
and pass that to RTP so the active workarounds are tracked.

Even if the workarounds available until now are mostly device
or platform centric, with the different IP versions for media and
graphics starting with MTL, it's possible that some workarounds
need to be applied only on select GTs. Also, given the workaround
database is per IP block, for tracking purposes there is no need to
differentiate the workarounds per engine class. Hence the bitmask
to track active workarounds can be tracked per GT.

v2: Move the tracking from per-device to per-GT basis (Matt Roper)

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-7-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/rtp: Allow to track active workarounds
Lucas De Marchi [Fri, 26 May 2023 16:43:42 +0000 (09:43 -0700)]
drm/xe/rtp: Allow to track active workarounds

Add the metadata in struct xe_rtp_process_ctx, to be set by
xe_rtp_process_ctx_enable_active_tracking(), so rtp knows how to mark
the active entries while processing the table. This can be used by the
WA infra to record what are the active workarounds.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-6-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/rtp: Add "_sr" to entry/function names
Lucas De Marchi [Fri, 26 May 2023 16:43:41 +0000 (09:43 -0700)]
drm/xe/rtp: Add "_sr" to entry/function names

The xe_rtp_process() function and xe_rtp_entry depend on the
save-restore struct. In future it will be desired to process rtp rules,
regardless of adding them to a save-restore. Rename the struct and
function so the intent is clear and the name is freed for future uses.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-5-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/rtp: Replace XE_WARN_ON
Lucas De Marchi [Fri, 26 May 2023 16:43:40 +0000 (09:43 -0700)]
drm/xe/rtp: Replace XE_WARN_ON

Now that rule_matches() always has an xe pointer, replace the XE_WARN_ON
with the more appropriate drm_warn().

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-4-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/rtp: Split rtp process initialization
Lucas De Marchi [Fri, 26 May 2023 16:43:39 +0000 (09:43 -0700)]
drm/xe/rtp: Split rtp process initialization

The selection between hwe and gt is exposed to the outside of rtp, by
the xe_rtp_process() function. However it doesn't make seense from the
caller point of view to pass a hwe and a gt as argument since the gt
should always be the one containing the hwe.

This clarifies the interface by separating the context creation into an
initializer. The initializer then passes the correct value and there
should never be a case with hwe and gt set: when hwe is passed, the gt
is the one containing it. Internally the functions continue receiving
the argument separately.

v2: Leave the device-only context to a separate patch if they are indeed
    needed later

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Fix Wa_22011802037 annotation
Lucas De Marchi [Fri, 26 May 2023 16:43:38 +0000 (09:43 -0700)]
drm/xe: Fix Wa_22011802037 annotation

It was missing one digit, so not showing up as a proper WA number. Add
the missing number and annotate it with a FIXME as there are more to be
implemented to consider this WA done: ensure CS is stop before doing a
reset, wait for pending.

Also, this WA applies to platforms up to graphics version 1270 (with the
exception of MTL A*, that are not supported in xe). Fix platform check.

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/284
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe/pvc: Don't try to invalidate AuxCCS TLB
Matt Roper [Wed, 24 May 2023 19:26:35 +0000 (12:26 -0700)]
drm/xe/pvc: Don't try to invalidate AuxCCS TLB

Generally !has_flatccs implies that a platform has AuxCCS compression
and thus needs to invalidate the AuxCCS TLB.  However PVC is a special
case because it has no compression of either type (FlatCCS or AuxCCS)
so we should avoid writing to non-existent AuxCCS registers.

Reviewed-by: Haridhar Kalvala <haridhar.kalvala@intel.com>
Link: https://lore.kernel.org/r/20230524192635.673293-1-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Validate uAPI padding and reserved fields
Christopher Snowhill [Thu, 25 May 2023 01:56:07 +0000 (18:56 -0700)]
drm/xe: Validate uAPI padding and reserved fields

Padding and reserved fields are declared such that they must be
zeroed, so verify that they're all zero in the respective ioctl
functions.

Derived from original patch by mlankhorst.

v2:
Removed extensions checks where there were none originally. (José)
Moved extraneous parentheses to the correct places. (Lucas)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Christopher Snowhill <kode54@gmail.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Add explicit padding to uAPI definition
Christopher Snowhill [Thu, 25 May 2023 01:56:06 +0000 (18:56 -0700)]
drm/xe: Add explicit padding to uAPI definition

Pad the uAPI definition so that it would align identically between
64-bit and 32-bit uarch, so consumers using this header will work
correctly from 32-bit compat userspace on a 64-bit kernel. Do it
in a minimally invasive way, so that 64-bit userspace will still
work with the previous header, and so that no fields suddenly
change sizes.

Originally inspired by mlankhorst.

Signed-off-by: Christopher Snowhill <kode54@gmail.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Apply upper limit to sg element size
Niranjana Vishwanathapura [Tue, 16 May 2023 03:26:53 +0000 (03:26 +0000)]
drm/xe: Apply upper limit to sg element size

The iommu_dma_map_sg() function ensures iova allocation doesn't
cross dma segment boundary. It does so by padding some sg elements.
This can cause overflow, ending up with sg->length being set to 0.
Avoid this by halving the maximum segment size (rounded down to
PAGE_SIZE).

Specify maximum segment size for sg elements by using
sg_alloc_table_from_pages_segment() to allocate sg_table.

v2: Use correct max segment size in dma_set_max_seg_size() call

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Bruce Chang <yu.bruce.chang@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Add stepping support for GMD_ID platforms
Matt Roper [Wed, 24 May 2023 18:59:52 +0000 (11:59 -0700)]
drm/xe: Add stepping support for GMD_ID platforms

For platforms with GMD_ID registers, the IP stepping should be
determined from the 'revid' field of those registers rather than from
the PCI revid.

The hardware teams have indicated that they plan to keep the revid =>
stepping mapping consistent across all GMD_ID platforms, with major
steppings (A0, B0, C0, etc.) having revids that are multiples of 4, and
minor steppings (A1, A2, A3, etc.) taking the intermediate values.  For
now we'll trust that hardware follows through on this plan; if they have
to change direction in the future (e.g., they wind up needing something
like an "A4" that doesn't fit this scheme), we can add a GMD_ID-based
lookup table when the time comes.

v2:
 - Set xe->info.platform before finding stepping; the pre-GMD_ID code
   relies on this value to pick a lookup table.
v3:
 - Also set xe->info.subplatform before picking the stepping for
   pre-GMD_ID lookup.

Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://lore.kernel.org/r/20230524185952.666158-1-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Fail xe_device_create() if wq allocation fails
Gustavo Sousa [Thu, 18 May 2023 21:56:51 +0000 (18:56 -0300)]
drm/xe: Fail xe_device_create() if wq allocation fails

Let's make sure we give the driver a valid workqueue.

While at it, also make sure to call destroy_workqueue() only if the
workqueue is a valid one. That is necessary because xe_device_destroy()
is indirectly called as part of the cleanup process of a failed
xe_device_create().

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20230518215651.502159-3-gustavo.sousa@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
22 months agodrm/xe: Call drmm_add_action_or_reset() early in xe_device_create()
Gustavo Sousa [Thu, 18 May 2023 21:56:50 +0000 (18:56 -0300)]
drm/xe: Call drmm_add_action_or_reset() early in xe_device_create()

Otherwise no cleanup is actually done if we branch to err_put.

This works for now: currently we do know that, once inside
xe_device_destroy(), ttm_device_init() was successful so we can safely
call ttm_device_fini(); and, for xe->ordered_wq, there is an upcoming
commit to check its value before calling destroy_workqueue().

However, we might need change this in the future if we have more
initializers called that can fail in a way that we can not know which
one was it once inside xe_device_destroy().

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20230518215651.502159-2-gustavo.sousa@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>