]> www.infradead.org Git - users/hch/block.git/log
users/hch/block.git
18 months agodrm/xe/guc: Reuse code while debugging GuC params
Michal Wajdeczko [Thu, 4 Apr 2024 15:50:46 +0000 (17:50 +0200)]
drm/xe/guc: Reuse code while debugging GuC params

There is no need to duplicate code to print GuC parameters.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240404155046.627-2-michal.wajdeczko@intel.com
18 months agodrm/xe/guc: Prefer GT oriented logs for GuC messages
Michal Wajdeczko [Thu, 4 Apr 2024 15:50:45 +0000 (17:50 +0200)]
drm/xe/guc: Prefer GT oriented logs for GuC messages

A platform can have more than one GuC, so we should use GT-oriented
logs to correctly identify the source of the message.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240404155046.627-1-michal.wajdeczko@intel.com
18 months agodrm/xe/xe_hw_engine_class_sysfs: use sysfs_emit() for attr's _show()
Bommu Krishnaiah [Sat, 9 Dec 2023 23:59:49 +0000 (05:29 +0530)]
drm/xe/xe_hw_engine_class_sysfs: use sysfs_emit() for attr's _show()

sprintf() is deprecated for sysfs, use preferred sysfs_emit() instead.

v2: used sysfs_emit instand of sprintf

Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231209235949.54524-3-krishnaiah.bommu@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
18 months agodrm/xe: prefer snprintf over sprintf
Bommu Krishnaiah [Sat, 9 Dec 2023 23:59:48 +0000 (05:29 +0530)]
drm/xe: prefer snprintf over sprintf

since the sprintf() function lacks built-in protection against buffer
overflows using the snprintf() function.

v2: Removed hard coded values and used sizeof()

Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231209235949.54524-2-krishnaiah.bommu@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
18 months agodrm/xe: Protect devcoredump access after unbind
Rodrigo Vivi [Wed, 3 Apr 2024 19:50:44 +0000 (15:50 -0400)]
drm/xe: Protect devcoredump access after unbind

While we don't have the full flow protection when devcoredump
is accessed after device unbind. Let's at least for now
protect against null dereference:

[  422.766508] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[  423.119584] RIP: 0010:xe_vm_snapshot_free+0x30/0x180 [xe]

While at it, I also fixed a non-standard code-declaration block
on the similar function of xe_guc_submit.

v2: - Use IS_ERR_OR_NULL (Nirmoy)
    - Expand to other functions

Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240403195044.239766-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
18 months agodrm/xe/xe_migrate: Cast to output precision before multiplying operands
Himal Prasad Ghimiray [Mon, 1 Apr 2024 17:53:00 +0000 (23:23 +0530)]
drm/xe/xe_migrate: Cast to output precision before multiplying operands

Addressing potential overflow in result of  multiplication of two lower
precision (u32) operands before widening it to higher precision
(u64).

-v2
Fix commit message and description. (Rodrigo)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240401175300.3823653-1-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
18 months agodrm/xe: Use ordered wq for preempt fence waiting
Matthew Brost [Mon, 1 Apr 2024 22:19:11 +0000 (15:19 -0700)]
drm/xe: Use ordered wq for preempt fence waiting

Preempt fences can sleep waiting for an exec queue suspend operation to
complete. If the system_unbound_wq is used for waiting and the number of
waiters exceeds max_active this will result in other users of the
system_unbound_wq getting starved. Use a device private work queue for
preempt fences to avoid starvation of the system_unbound_wq.

Even though suspend operations can complete out-of-order, all suspend
operations within a VM need to complete before the preempt rebind worker
can start. With that, use a device private ordered wq for preempt fence
waiting.

v2:
 - Add comment about cleanup on failure (Matt R)
 - Update commit message (Lucas)

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240401221913.139672-2-matthew.brost@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
18 months agodrm/xe/xe2: Add workaround 18033852989
Himal Prasad Ghimiray [Mon, 1 Apr 2024 16:38:06 +0000 (22:08 +0530)]
drm/xe/xe2: Add workaround 18033852989

This workaround applies to RCS engine's context, hence added as
LRC workaround.

v2
- Fix commit description as lrc workaround instead of engine.(Lucas)

v3
- COMMON_SLICE_CHICKEN1 is a masked register, add XE_REG_OPTION_MASKED
flag. (Matt)

BSPEC: 55899

Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240401163806.3821128-1-himal.prasad.ghimiray@intel.com
18 months agodrm/xe: Normalize bo flags macros
Lucas De Marchi [Fri, 22 Mar 2024 14:27:02 +0000 (07:27 -0700)]
drm/xe: Normalize bo flags macros

The flags stored in the BO grew over time without following
much a naming pattern. First of all, get rid of the _BIT suffix that was
banned from everywhere else due to the guideline in
drivers/gpu/drm/i915/i915_reg.h that xe kind of follows:

Define bits using ``REG_BIT(N)``. Do **not** add ``_BIT`` suffix to the name.

Here the flags aren't for a register, but it's good practice to keep it
consistent.

Second divergence on names is the use or not of "CREATE". This is
because most of the flags are passed to xe_bo_create*() family of
functions, changing its behavior. However, since the flags are also
stored in the bo itself and checked elsewhere in the code, it seems
better to just omit the CREATE part.

With those 2 guidelines, all the flags are given the form
XE_BO_FLAG_<FLAG_NAME> with the following commands:

git grep -le "XE_BO_" -- drivers/gpu/drm/xe | xargs sed -i \
-e "s/XE_BO_\([_A-Z0-9]*\)_BIT/XE_BO_\1/g" \
-e 's/XE_BO_CREATE_/XE_BO_FLAG_/g'
git grep -le "XE_BO_" -- drivers/gpu/drm/xe | xargs sed -i -r \
-e 's/XE_BO_(DEFER_BACKING|SCANOUT|FIXED_PLACEMENT|PAGETABLE|NEEDS_CPU_ACCESS|NEEDS_UC|INTERNAL_TEST|INTERNAL_64K|GGTT_INVALIDATE)/XE_BO_FLAG_\1/g'

And then the defines in drivers/gpu/drm/xe/xe_bo.h are adjusted to
follow the coding style.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240322142702.186529-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
18 months agodrm/xe: Stop passing user flag to xe_bo_create_user()
Lucas De Marchi [Fri, 22 Mar 2024 14:27:01 +0000 (07:27 -0700)]
drm/xe: Stop passing user flag to xe_bo_create_user()

It's quite redundant to pass XE_BO_CREATE_USER_BIT to
xe_bo_create_user() since the only difference of that function is to
force that flag. Stop passing the flag in the few cases that were
explicitly doing so.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240322142702.186529-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
18 months agodrm/xe/xe_devcoredump: Check NULL before assignments
Himal Prasad Ghimiray [Thu, 28 Mar 2024 12:37:39 +0000 (18:07 +0530)]
drm/xe/xe_devcoredump: Check NULL before assignments

Assign 'xe_devcoredump_snapshot *' and 'xe_device *' only if
'coredump' is not NULL.

v2
- Fix commit messages.

v3
- Define variables before code.(Ashutosh/Jose)

v4
- Drop return check for coredump_to_xe. (Jose/Rodrigo)

v5
- Modify misleading commit message. (Matt)

Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328123739.3633428-1-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
18 months agodrm/xe/hwmon: Add infra to support card power and energy attributes
Karthik Poosa [Thu, 28 Mar 2024 17:54:35 +0000 (23:24 +0530)]
drm/xe/hwmon: Add infra to support card power and energy attributes

Add infra to support card power and energy attributes through channel 0.
Package attributes will be now exposed through channel 1 rather than
channel 0 as shown below.

Channel 0 i.e power1/energy1_xxx used for card and
channel 1 i.e power2/energy2_xxx used for package power,energy attributes.

power1/curr1_crit and in0_input are moved to channel 1, i.e.
power2/curr2_crit and in1_input as these are available for package only.

This would be needed for future platforms where they might be
separate registers for package and card power and energy.

Each discrete GPU supported by Xe driver, would have a directory in
/sys/class/hwmon/ with multiple channels under it.
Each channel would have attributes for power, energy etc.

Ex: /sys/class/hwmon/hwmon2/power1_max
                           /power1_label
                           /energy1_input
                           /energy1_label

Attributes will have a label to get more description of it.
Labelling is as below.
power1_label/energy1_label - "card",
power2_label/energy2_label - "pkg".

v2: Fix checkpatch errors.

v3:
 - Update intel-xe-hwmon documentation. (Riana, Badal)
 - Rename hwmon card channel enum from CHANNEL_PLATFORM
   to CHANNEL_CARD. (Riana)

v4:
 - Remove unrelated changes from patch. (Anshuman)
 - Fix typo in commit msg.

v5:
 - Update commit message and intel-xe-hwmon documentation with "Xe"
   instead of xe when using it as a name. (Rodrigo)

Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328175435.3870957-1-karthik.poosa@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
18 months agodrm/xe: Refactor GT debugfs
Michal Wajdeczko [Thu, 28 Mar 2024 16:28:08 +0000 (17:28 +0100)]
drm/xe: Refactor GT debugfs

We are abusing struct drm_info_list.data by storing there pointer
to the xe_gt, while it shouldn't be used for any device specific
data.  Use recently introduced xe_gt_debugfs_simple_show() that
hides all details how to obtain the xe_gt pointer.  This will also
remove the need for making copies of the struct drm_info_list
to get GT specific definitions.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20240214115756.1525-4-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328162808.451-4-michal.wajdeczko@intel.com
18 months agodrm/xe: Define helper for GT specific debugfs files
Michal Wajdeczko [Thu, 28 Mar 2024 16:28:07 +0000 (17:28 +0100)]
drm/xe: Define helper for GT specific debugfs files

Many of our debugfs files are GT specific and require a pointer to
struct xe_gt to correctly show its content.  Our initial approach
to use drm_info_list.data field to pass pointer not only requires
extra steps (like copying template per each GT) but also abuses
the rule that this data field should not be device specific.

Introduce helper function that will use xe_gt pointer stored at
parent directory level and use .data only to pass actual print
function that would expects xe_gt pointer as a parameter.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20240214115756.1525-3-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328162808.451-3-michal.wajdeczko@intel.com
18 months agodrm/xe: Store pointer to struct xe_gt in gt/ debugfs directory
Michal Wajdeczko [Thu, 28 Mar 2024 16:28:06 +0000 (17:28 +0100)]
drm/xe: Store pointer to struct xe_gt in gt/ debugfs directory

Attributes added under 'gt/' directories may wish to use that
in case they can't obtain it from elsewhere.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20240214115756.1525-2-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328162808.451-2-michal.wajdeczko@intel.com
18 months agodrm/xe/uapi: Define topology types as indexes rather than masks
Francois Dugast [Thu, 28 Mar 2024 14:02:43 +0000 (14:02 +0000)]
drm/xe/uapi: Define topology types as indexes rather than masks

The topology type is an index (not a mask) so define the values
like other indexes instead of using powers of 2. This is also
to make clear that the next type can use value 3. This commit
does not change the existing values so it does not break
compatibility.

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Link: https://lore.kernel.org/intel-xe/20240327232317.GI718896@mdroper-desk1.amr.corp.intel.com/
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328140243.7-1-francois.dugast@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
18 months agodrm/xe/gsc: Implement WA 14018094691
Daniele Ceraolo Spurio [Tue, 26 Mar 2024 22:44:56 +0000 (15:44 -0700)]
drm/xe/gsc: Implement WA 14018094691

The WA states that we need to keep the primary GT powered up during GSC
load to allow the GSC FW to access its registers. We also need to make
sure that one of the registers is locked before starting the load.

v2: fix location of register def (Matt)

Bspec: 55928
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240326224456.518548-1-daniele.ceraolospurio@intel.com
18 months agodrm/xe/pf: Add minimal support for VF_STATE_NOTIFY events
Michal Wajdeczko [Tue, 26 Mar 2024 19:15:18 +0000 (20:15 +0100)]
drm/xe/pf: Add minimal support for VF_STATE_NOTIFY events

GuC will use VF_STATE_NOTIFY events to notify the PF about changes
of the VF state, in particular when a VF FLR was requested.  Add
very minimal support for such events to avoid reporting errors due
to unexpected G2H. We will improve handling of these messages later.

While around also add few basic functions to control the VF state
(pause, resume, stop) as we will also exercise them soon.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240326191518.363-3-michal.wajdeczko@intel.com
18 months agodrm/xe/guc: Add VF_STATE_NOTIFY and VF_CONTROL to ABI
Michal Wajdeczko [Tue, 26 Mar 2024 19:15:17 +0000 (20:15 +0100)]
drm/xe/guc: Add VF_STATE_NOTIFY and VF_CONTROL to ABI

In upcoming patches the PF driver will add support to handle the
GUC2PF_VF_STATE_NOTIFY events and to send PF2GUC_VF_CONTROL request
messages.  Add necessary definitions to our GuC firmware ABI header.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240326191518.363-2-michal.wajdeczko@intel.com
18 months agodrm/xe/vf: Add proper detection of the SR-IOV VF mode
Michal Wajdeczko [Wed, 27 Mar 2024 18:27:40 +0000 (19:27 +0100)]
drm/xe/vf: Add proper detection of the SR-IOV VF mode

SR-IOV VF mode detection is based on testing VF capability bit on
the register that is accessible from both the PF and enabled VFs.

Bspec: 49904, 53227
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240327182740.407-4-michal.wajdeczko@intel.com
18 months agodrm/xe: Move SR-IOV probe to xe_device_probe_early()
Michal Wajdeczko [Wed, 27 Mar 2024 18:27:39 +0000 (19:27 +0100)]
drm/xe: Move SR-IOV probe to xe_device_probe_early()

SR-IOV mode detection requires access to the MMIO register and
this can be done now in xe_device_probe_early().

We can also drop explicit has_sriov parameter as this flag is now
already available from xe->info.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240327182740.407-3-michal.wajdeczko@intel.com
18 months agodrm/xe: Separate pure MMIO init from VRAM checkout
Michal Wajdeczko [Wed, 27 Mar 2024 18:27:38 +0000 (19:27 +0100)]
drm/xe: Separate pure MMIO init from VRAM checkout

We can setup root tile registers mapping at the same time as we
do early mapping of the entire MMIO BAR and keep mandatory VRAM
checkout as a separate step. This will allow us to perform SR-IOV
VF mode detection between those two steps using regular MMIO regs
access functions.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240327182740.407-2-michal.wajdeczko@intel.com
18 months agodrm/xe: Move vma rebinding to the drm_exec locking loop
Thomas Hellström [Wed, 27 Mar 2024 09:11:36 +0000 (10:11 +0100)]
drm/xe: Move vma rebinding to the drm_exec locking loop

Rebinding might allocate page-table bos, causing evictions.
To support blocking locking during these evictions,
perform the rebinding in the drm_exec locking loop.

Also Reserve fence slots where actually needed rather than trying to
predict how many fence slots will be needed over a complete
wound-wait transaction.

v2:
- Remove a leftover call to xe_vm_rebind() (Matt Brost)
- Add a helper function xe_vm_validate_rebind() (Matt Brost)
v3:
- Add comments and squash with previous patch (Matt Brost)

Fixes: 24f947d58fe5 ("drm/xe: Use DRM GPUVM helpers for external- and evicted objects")
Fixes: 29f424eb8702 ("drm/xe/exec: move fence reservation")
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-5-thomas.hellstrom@linux.intel.com
18 months agodrm/xe: Make TLB invalidation fences unordered
Thomas Hellström [Wed, 27 Mar 2024 09:11:35 +0000 (10:11 +0100)]
drm/xe: Make TLB invalidation fences unordered

They can actually complete out-of-order, so allocate a unique
fence context for each fence.

Fixes: 5387e865d90e ("drm/xe: Add TLB invalidation fence after rebinds issued from execs")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-4-thomas.hellstrom@linux.intel.com
18 months agodrm/xe: Rework rebinding
Thomas Hellström [Wed, 27 Mar 2024 09:11:34 +0000 (10:11 +0100)]
drm/xe: Rework rebinding

Instead of handling the vm's rebind fence separately,
which is error prone if they are not strictly ordered,
attach rebind fences as kernel fences to the vm's resv.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-3-thomas.hellstrom@linux.intel.com
18 months agodrm/xe: Use ring ops TLB invalidation for rebinds
Thomas Hellström [Wed, 27 Mar 2024 09:11:33 +0000 (10:11 +0100)]
drm/xe: Use ring ops TLB invalidation for rebinds

For each rebind we insert a GuC TLB invalidation and add a
corresponding unordered TLB invalidation fence. This might
add a huge number of TLB invalidation fences to wait for so
rather than doing that, defer the TLB invalidation to the
next ring ops for each affected exec queue. Since the TLB
is invalidated on exec_queue switch, we need to invalidate
once for each affected exec_queue.

v2:
- Simplify if-statements around the tlb_flush_seqno.
  (Matthew Brost)
- Add some comments and asserts.

Fixes: 5387e865d90e ("drm/xe: Add TLB invalidation fence after rebinds issued from execs")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-2-thomas.hellstrom@linux.intel.com
18 months agodrm/xe/guc: Use GuC ID Manager in submission code
Michal Wajdeczko [Wed, 13 Mar 2024 22:11:12 +0000 (23:11 +0100)]
drm/xe/guc: Use GuC ID Manager in submission code

We are ready to replace private guc_ids management code with
separate GuC ID Manager that can be shared with upcoming SR-IOV
PF provisioning code.

Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313221112.1089-5-michal.wajdeczko@intel.com
18 months agodrm/xe/kunit: Add basic tests for GuC context ID Manager
Michal Wajdeczko [Wed, 13 Mar 2024 22:11:11 +0000 (23:11 +0100)]
drm/xe/kunit: Add basic tests for GuC context ID Manager

Before we switch-over submission code to use new GuC context ID
Manager, lets add some kunit tests to make sure that ID manager
works as expected.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313221112.1089-4-michal.wajdeczko@intel.com
18 months agodrm/xe/guc: Introduce GuC context ID Manager
Michal Wajdeczko [Wed, 13 Mar 2024 22:11:10 +0000 (23:11 +0100)]
drm/xe/guc: Introduce GuC context ID Manager

While we are already managing GuC IDs directly in GuC submission
code, using bitmap() for MLRC and ida() for SLRC, this code can't
be easily extended to meet additional requirements for SR-IOV use
cases, like limited number of IDs available on VFs, or ID range
reservation for provisioning VFs by the PF.

Add a separate component for managing GuC IDs, that will replace
existing ID management. Start with bitmap() based implementation
that could be optimized later based on perf data.

Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313221112.1089-3-michal.wajdeczko@intel.com
18 months agodrm/xe/guc: Move GUC_ID_MAX definition to GuC ABI header
Michal Wajdeczko [Wed, 13 Mar 2024 22:11:09 +0000 (23:11 +0100)]
drm/xe/guc: Move GUC_ID_MAX definition to GuC ABI header

This macro represents GuC firmware capability and shall be defined
in the firmware ABI header. Move it to xe_guc_fwif.h file.

Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313221112.1089-2-michal.wajdeczko@intel.com
18 months agodrm/xe/guc: Fix include guard for SR-IOV ABI
Michal Wajdeczko [Tue, 13 Feb 2024 21:49:08 +0000 (22:49 +0100)]
drm/xe/guc: Fix include guard for SR-IOV ABI

Use include guard macro name that follows naming used by the other
GuC ABI files.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213214908.1481-1-michal.wajdeczko@intel.com
18 months agodrm/xe: Move HW GGTT definitions to dedicated file
Michal Wajdeczko [Tue, 26 Mar 2024 13:10:42 +0000 (14:10 +0100)]
drm/xe: Move HW GGTT definitions to dedicated file

It's better to keep all hardware GGTT definitions separated from
the driver code. It also helps to avoid duplicated definitions.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240326131042.319-1-michal.wajdeczko@intel.com
18 months agodrm/xe: Create a helper function to init job's user fence
Nirmoy Das [Thu, 21 Mar 2024 16:11:42 +0000 (17:11 +0100)]
drm/xe: Create a helper function to init job's user fence

Refactor xe_sync_entry_signal so it doesn't have to
modify xe_sched_job struct instead create a new helper function
to set user fence values for a job.

v2: Move the sync type check to xe_sched_job_init_user_fence(Lucas)

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240321161142.4954-1-nirmoy.das@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
18 months agodrm/xe/guc: Remove explicit shutdown of SLPC
Vinay Belgaumkar [Mon, 25 Mar 2024 23:56:02 +0000 (16:56 -0700)]
drm/xe/guc: Remove explicit shutdown of SLPC

SLPC shutdown is called in reset and suspend paths. In the reset
path, it is possible that the H2G call gets lost as GuC is in the
process of being reset. There is no value in stopping SLPC when
it will happen anyways.

In the suspend path, we disable communication with GuC, so there
is no need to explicitly shutdown SLPC.

v2: Rebase

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240325235602.1155486-1-vinay.belgaumkar@intel.com
18 months agodrm/xe: Add new PCI IDs to DG2 platform
Ravi Kumar Vodapalli [Tue, 26 Mar 2024 10:38:25 +0000 (16:08 +0530)]
drm/xe: Add new PCI IDs to DG2 platform

New PCI IDs are added in Bspec for DG2 platform, add them in driver

Bspec: 44477
Signed-off-by: Ravi Kumar Vodapalli <ravi.kumar.vodapalli@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240326103825.3832879-1-ravi.kumar.vodapalli@intel.com
18 months agodrm/xe: Use FIELD_PREP for lrc descriptor
Niranjana Vishwanathapura [Fri, 22 Mar 2024 19:14:55 +0000 (12:14 -0700)]
drm/xe: Use FIELD_PREP for lrc descriptor

Use FIELD_PREP for setting lrc descriptor fields instead
of shifting values to fields.

v2: Use ULL macro variants
v3: Do not use FIELD_PREP for 1-bit values

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240322191455.7613-1-niranjana.vishwanathapura@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
19 months agodrm/xe: Remove redundant functions to get xe
Lucas De Marchi [Thu, 21 Mar 2024 21:38:18 +0000 (14:38 -0700)]
drm/xe: Remove redundant functions to get xe

xe_device.h implements these helpers, just use them.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240321213818.72311-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
19 months agodrm/xe: Fix END redefinition
Lucas De Marchi [Fri, 22 Mar 2024 14:48:43 +0000 (07:48 -0700)]
drm/xe: Fix END redefinition

mips declares an END macro in its headers so it can't be used without
namespace in a driver like xe.

Instead of coming up with a longer name, just remove the macro and
replace its use with 0 since it's still clear what that means:
set_offsets() was already using that implicitly when checking the data
variable.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: http://kisskb.ellerman.id.au/kisskb/buildresult/15143996/
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240322145037.196548-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
19 months agodrm/xe/guc: Check error code when initializing the CT mutex
Daniele Ceraolo Spurio [Thu, 21 Mar 2024 19:55:12 +0000 (12:55 -0700)]
drm/xe/guc: Check error code when initializing the CT mutex

The initialization via drmm_mutex_init can fail, so we need to check the
return code and escalate the failure.

The mutex initialization has been moved after all the other init steps
that can't fail, so we're always guaranteed to have those done and don't
have to check in the cleanup code.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240321195512.274210-1-daniele.ceraolospurio@intel.com
19 months agodrm/xe/guc: Add some failure checks
Vinay Belgaumkar [Thu, 21 Mar 2024 19:12:19 +0000 (12:12 -0700)]
drm/xe/guc: Add some failure checks

Return failures from pc_adjust_freq_bounds.

Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240321191219.243583-1-vinay.belgaumkar@intel.com
19 months agodrm/xe: Nuke EXEC_QUEUE_FLAG_PERSISTENT
José Roberto de Souza [Thu, 7 Mar 2024 13:52:29 +0000 (05:52 -0800)]
drm/xe: Nuke EXEC_QUEUE_FLAG_PERSISTENT

This is a left over of commit f1a9abc0cf31 ("drm/xe/uapi: Remove support for persistent exec_queues").

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240307135229.41973-3-jose.souza@intel.com
19 months agodrm/xe/devcoredump: Print errno if VM snapshot was not captured
José Roberto de Souza [Thu, 7 Mar 2024 13:52:28 +0000 (05:52 -0800)]
drm/xe/devcoredump: Print errno if VM snapshot was not captured

My testing machine has only 8GB of RAM and while running piglit tests
I can reach the OOM cache in xe_vm_snapshot_capture() snap allocaiton
sometimes.

So to differentiate the OOM from race between capture and UMDs
unbinbind VMs here I'm adding a '[0].error: -12' to devcoredump.

v2:
- fix returned errno values

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240307135229.41973-2-jose.souza@intel.com
19 months agodrm/xe: Make devcoredump VM error state print consistent
José Roberto de Souza [Thu, 7 Mar 2024 13:52:27 +0000 (05:52 -0800)]
drm/xe: Make devcoredump VM error state print consistent

This makes VM error consistent with [x].length and [x].data.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240307135229.41973-1-jose.souza@intel.com
19 months agodrm/xe: remove unused struct xe_device members
Jani Nikula [Thu, 21 Mar 2024 16:15:48 +0000 (18:15 +0200)]
drm/xe: remove unused struct xe_device members

modeset_restore_state has been unused since commit 6af0ffc0db93
("drm/i915/display: move restore state and ctx under display
sub-struct").

member global_obj_list has been unused since commit e2925e19c006
("drm/i915/display: move global_obj_list under display sub-struct").

hti_state has been unused since commit 62749912540b ("drm/i915/display:
move hti under display sub-struct").

snps_phy_failed_calibration has been unused since commit 3a7e2d58f800
("drm/i915: move snps_phy_failed_calibration to display sub-struct under
snps").

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240321161548.3509672-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
19 months agodrm/xe/query: fix gt_id bounds check
Matthew Auld [Thu, 21 Mar 2024 11:06:30 +0000 (11:06 +0000)]
drm/xe/query: fix gt_id bounds check

The user provided gt_id should always be less than the
XE_MAX_GT_PER_TILE.

Fixes: 7793d00d1bf5 ("drm/xe: Correlate engine and cpu timestamps with better accuracy")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Acked-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240321110629.334701-2-matthew.auld@intel.com
19 months agodrm/xe: Add debug messages for MMU notifier and VMA invalidate
Matthew Brost [Wed, 20 Mar 2024 19:42:32 +0000 (12:42 -0700)]
drm/xe: Add debug messages for MMU notifier and VMA invalidate

Extra debug is useful when working on VM issues.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240320194232.1910688-1-matthew.brost@intel.com
19 months agodrm/xe: Use USEC_PER_MSEC rather than the hard coding
Himal Prasad Ghimiray [Wed, 20 Mar 2024 08:33:25 +0000 (14:03 +0530)]
drm/xe: Use USEC_PER_MSEC rather than the hard coding

Use USEC_PER_MSEC rather than the hard coded value of 1000.

Static analyzer Reported "casting either timeout_ms or
1000U to type u64" to avoid overflow-before-widen.
Using USEC_PER_MSEC seems better and will help with static analyzer
report cleanup.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240320083325.3258720-1-himal.prasad.ghimiray@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
19 months agodrm/xe/bb: assert width in xe_bb_create_migration_job()
Matthew Auld [Wed, 20 Mar 2024 11:27:32 +0000 (11:27 +0000)]
drm/xe/bb: assert width in xe_bb_create_migration_job()

The q->width should always be exactly one here for migration queue/vm.
The width will anyway be overridden later since we need to emit two
jumps for special migration jobs. Enforce that here to ensure caller is
not doing something strange. While here also convert to the helper to
determine if the queue is migration based.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240320112730.219854-4-matthew.auld@intel.com
19 months agodrm/xe/bb: assert width in xe_bb_create_job()
Matthew Auld [Wed, 20 Mar 2024 11:27:31 +0000 (11:27 +0000)]
drm/xe/bb: assert width in xe_bb_create_job()

The queue width will determine the number of batch buffer emitted into
the ring. In the case of xe_bb_create_job() we pass exactly one batch
address, therefore add an assert for the width to make sure we don't go
out of bounds. While here also convert to the helper to determine if the
queue is migration based.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240320112730.219854-3-matthew.auld@intel.com
19 months agodrm/xe/uc: Use u64 for offsets for which we use upper_32_bits()
Daniele Ceraolo Spurio [Tue, 19 Mar 2024 19:51:01 +0000 (12:51 -0700)]
drm/xe/uc: Use u64 for offsets for which we use upper_32_bits()

The GGTT is currently a 32 bit address space, but the HW and GuC
support 48b addresses in GGTT-related operations, both to keep the
interface/HW paths common between PPGTT and GGTT and to allow for
future increase of the GGTT size.
This leaves us having to program a 64b field with a 32b offset, which
currently we're in some cases doing this by using an upper_32_bits()
call on a 32b variable, which doesn't make any sense. To do this cleanly
we have 2 options:

1 - Set the upper 32 bits directly to zero.
2 - Use 64b variables for the offset and keep programming the whole thing,
    so we're ready if we ever have bigger offsets.

This patch goes with option #2 and switches the related variables to u64.

v2: don't change the log ctl flag variable (John)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240319195101.2784480-1-daniele.ceraolospurio@intel.com
19 months agodrm/xe: Always check force_wake_get return code
Daniele Ceraolo Spurio [Mon, 18 Mar 2024 15:49:24 +0000 (08:49 -0700)]
drm/xe: Always check force_wake_get return code

A force_wake_get failure means that the HW might not be awake for the
access we're doing; this can lead to an immediate error or it can be a
more subtle problem (e.g. a register read might return an incorrect
value that is still valid, leading the driver to make a wrong choice
instead of flagging an error).
We avoid an error from the force_wake function because callers might
handle or tolerate the error, but this only works if all callers
are checking the error code. The majority already do, but a few are not.
These are mainly falling into 3 categories, which are each handled
differently:

1) error capture: in this case we want to continue the capture, but we
   log an info message in dmesg to notify the user that the capture
   might have incorrect data.

2) ioctl: in this case we return a -EIO error to userspace

3) unabortable actions: these are scenarios where we can't simply abort
   and retry and so it's better to just try it anyway because there is a
   chance the HW is awake even with the failure. In this case we throw a
   warning so we know there was a forcewake problem if something fails
   down the line.

v2: use gt_WARN_ON where appropriate

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318154924.3453513-1-daniele.ceraolospurio@intel.com
19 months agodrm/xe/xelpg: Add Wa_14020495402
Radhakrishna Sripada [Mon, 18 Mar 2024 21:01:20 +0000 (14:01 -0700)]
drm/xe/xelpg: Add Wa_14020495402

Disable clockgating for TDL SVHS fub.

v2: Extend the Wa to 1274(MattR)

Bspec: 46045
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318210120.564692-1-radhakrishna.sripada@intel.com
19 months agodrm/xe/gt: Remove continue statement which has no effect
Tejas Upadhyay [Mon, 18 Mar 2024 11:40:57 +0000 (17:10 +0530)]
drm/xe/gt: Remove continue statement which has no effect

Remove continue statement which does not have real effect
as no actions are to be taken post continue.

Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318114057.3831274-1-tejas.upadhyay@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
19 months agodrm/xe/display: fix type of intel_uncore_read*() functions
Luca Coelho [Thu, 14 Mar 2024 06:52:21 +0000 (08:52 +0200)]
drm/xe/display: fix type of intel_uncore_read*() functions

Some of the backported intel_uncore_read*() functions used the wrong
types.  Change the function declarations accordingly.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314065221.1181158-1-luciano.coelho@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
19 months agodrm/xe: Move xe_ggtt_invalidate out from ggtt->lock
Maarten Lankhorst [Wed, 6 Mar 2024 05:20:02 +0000 (21:20 -0800)]
drm/xe: Move xe_ggtt_invalidate out from ggtt->lock

Considering the caller of the GGTT functions should keep the
backing storage alive before the function completes, it's not
necessary to invalidate with the GGTT lock held. This just adds
latency for every user of the GGTT.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306052002.311196-5-matthew.brost@intel.com
19 months agodrm/xe: Add XE_BO_GGTT_INVALIDATE flag
Matthew Brost [Wed, 6 Mar 2024 05:20:01 +0000 (21:20 -0800)]
drm/xe: Add XE_BO_GGTT_INVALIDATE flag

Add XE_BO_GGTT_INVALIDATE flag which indicates the GGTT should be
invalidated when a BO is added / removed from the GGTT. This is
typically set when a BO is used by the GuC as the GuC has GGTT TLBs.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
[mlankhorst: Small fix to only inherit GGTT_INVALIDATE from src bo]
[mlankhorst: Remove _BIT from name]
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306052002.311196-4-matthew.brost@intel.com
19 months agodrm/xe: Drop ggtt invalidate from display code
Matthew Brost [Wed, 6 Mar 2024 05:20:00 +0000 (21:20 -0800)]
drm/xe: Drop ggtt invalidate from display code

Only buffers mapped in the GGTT used by the GuC require an invalidation.
Display buffers do not require an invalidation. Delete the invalidatio
from display code and make invalidation a static function in xe_ggtt.c.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306052002.311196-3-matthew.brost@intel.com
19 months agodrm/xe: Add a NULL check in xe_ttm_stolen_mgr_init
Nirmoy Das [Tue, 19 Mar 2024 13:09:25 +0000 (14:09 +0100)]
drm/xe: Add a NULL check in xe_ttm_stolen_mgr_init

Add an explicit check to ensure that the mgr is not NULL.

Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240319130925.22399-1-nirmoy.das@intel.com
19 months agodrm/xe: Use correct function pointer type
Niranjana Vishwanathapura [Tue, 19 Mar 2024 17:49:19 +0000 (10:49 -0700)]
drm/xe: Use correct function pointer type

Use xe_exec_queue_user_extension_fn type for
exec_queue_user_extension_funcs.`

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240319174919.1847-1-niranjana.vishwanathapura@intel.com
19 months agodrm/xe: Streamline exec queue freeing path
Niranjana Vishwanathapura [Tue, 19 Mar 2024 17:59:46 +0000 (10:59 -0700)]
drm/xe: Streamline exec queue freeing path

Ensure exec queue freeing happens at one place, that is in
__xe_exec_queue_free(). It releases q->vm reference also. Set
q->vm before handling extensions as they can potentially reference it.

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240319175947.15890-1-niranjana.vishwanathapura@intel.com
19 months agodrm/xe: Separate out sched/deregister_done handling
Niranjana Vishwanathapura [Tue, 19 Mar 2024 18:41:53 +0000 (11:41 -0700)]
drm/xe: Separate out sched/deregister_done handling

Abstract out the core part of sched_done and deregister_done handlers
to separate functions to decouple them from any protocol error handling
part and make them more readable.

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240319184153.16667-1-niranjana.vishwanathapura@intel.com
19 months agodrm/xe/guc: Don't support older GuC 70.x releases
Daniele Ceraolo Spurio [Mon, 4 Mar 2024 16:26:16 +0000 (08:26 -0800)]
drm/xe/guc: Don't support older GuC 70.x releases

Supporting older GuC versions comes with baggage, both on the coding
side (due to interfaces only being available from a certain version
onwards) and on the testing side (due to having to make sure the driver
works as expected with older GuCs).
Since all of our Xe platform are still under force probe, we haven't
committed to support any specific GuC version and we therefore don't
need to support the older once, which means that we can force a bottom
limit to what GuC we accept. This allows us to remove any conditional
statements based on older GuC versions and also to approach newer
additions knowing that we'll never attempt to load something older
than our minimum requirement.

As an initial value, the minimum expected version is set to 70.19,
which is the version currently in the firmware table, but the
expectation is that this will be bumbed every time we update the
table, until we remove the force probe.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240304162616.824884-1-daniele.ceraolospurio@intel.com
19 months agodrm/xe: Add dbg messages on the suspend resume functions.
Rodrigo Vivi [Mon, 18 Mar 2024 18:01:41 +0000 (14:01 -0400)]
drm/xe: Add dbg messages on the suspend resume functions.

In case of the suspend/resume flow getting locked up we
can get reports with some useful hints on where it might
get locked and if that has failed.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318180141.267458-2-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
19 months agodrm/xe: Convert gt suspend/resume messages to debug
Rodrigo Vivi [Mon, 18 Mar 2024 18:01:40 +0000 (14:01 -0400)]
drm/xe: Convert gt suspend/resume messages to debug

Let's be quieter on production configuration and let's also
print the entry point of the gt suspend when debug messages
are enabled.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318180141.267458-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
19 months agodrm/xe/vm : Remove duplicate assignment of XE_VM_FLAG_LR_MODE flag.
Himal Prasad Ghimiray [Thu, 7 Mar 2024 06:52:13 +0000 (12:22 +0530)]
drm/xe/vm : Remove duplicate assignment of XE_VM_FLAG_LR_MODE flag.

vm->flags are already assigned with passed flags. Remove the redundant
assignment.

Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240307065213.1968688-1-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
19 months agodrm/xe/display: Mark dpt and related vma as uncached
Juha-Pekka Heikkila [Mon, 18 Mar 2024 20:18:50 +0000 (22:18 +0200)]
drm/xe/display: Mark dpt and related vma as uncached

Mark dpt and related vma as uncached to avoid pipe faults on some devices.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318201850.127785-1-juhapekka.heikkila@gmail.com
19 months agodrm/xe/display: mark DPT with XE_BO_PAGETABLE
Matthew Auld [Thu, 14 Mar 2024 16:49:06 +0000 (16:49 +0000)]
drm/xe/display: mark DPT with XE_BO_PAGETABLE

Otherwise in the case where we use normal system memory, the CPU access
will always be cached, like when filling the DPT PTEs, which is likely
not what we want since HW access could be incoherent on platforms like
LNL. Marking as XE_BO_PAGETABLE will force wc/uc underneath on such
platforms.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314164905.239449-2-matthew.auld@intel.com
19 months agodrm/xe: Remove usage of unsafe strcpy
Nirmoy Das [Mon, 18 Mar 2024 09:10:55 +0000 (10:10 +0100)]
drm/xe: Remove usage of unsafe strcpy

Remove usage of unsafe strcpy with a helper function
to convert engine class to string.

Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318091055.638-1-nirmoy.das@intel.com
19 months agodrm/xe: Drop bogus vma NULL check
Nirmoy Das [Mon, 18 Mar 2024 09:35:47 +0000 (10:35 +0100)]
drm/xe: Drop bogus vma NULL check

The vma pointer can't be NULL here.

Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318093547.16326-1-nirmoy.das@intel.com
19 months agodrm/xe/device: fix XE_MAX_TILES_PER_DEVICE check
Matthew Auld [Mon, 18 Mar 2024 18:05:35 +0000 (18:05 +0000)]
drm/xe/device: fix XE_MAX_TILES_PER_DEVICE check

Here XE_MAX_TILES_PER_DEVICE is the gt array size, therefore the gt
index should always be less than.

v2 (Lucas):
  - Add fixes tag.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318180532.57522-6-matthew.auld@intel.com
19 months agodrm/xe/device: fix XE_MAX_GT_PER_TILE check
Matthew Auld [Mon, 18 Mar 2024 18:05:34 +0000 (18:05 +0000)]
drm/xe/device: fix XE_MAX_GT_PER_TILE check

Here XE_MAX_GT_PER_TILE is the total, therefore the gt index should
always be less than.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318180532.57522-5-matthew.auld@intel.com
19 months agodrm/xe/queue: fix engine_class bounds check
Matthew Auld [Mon, 18 Mar 2024 18:05:33 +0000 (18:05 +0000)]
drm/xe/queue: fix engine_class bounds check

The engine_class is the index into the user_to_xe_engine_class,
therefore it needs to be less than.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318180532.57522-4-matthew.auld@intel.com
19 months agodrm/xe: Fix potential integer overflow in page size calculation
Nirmoy Das [Mon, 18 Mar 2024 16:43:41 +0000 (17:43 +0100)]
drm/xe: Fix potential integer overflow in page size calculation

Explicitly cast tbo->page_alignment to u64 before bit-shifting to
prevent overflow when assigning to min_page_size.

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318164342.3094-1-nirmoy.das@intel.com
19 months agodrm/xe/vm: fix xe_assert()
Matthew Auld [Mon, 18 Mar 2024 10:36:17 +0000 (10:36 +0000)]
drm/xe/vm: fix xe_assert()

The region can be used an index into the region_to_mem_type, so we
should be asserting that it is less than the ARRAY_SIZE here.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318103616.26240-2-matthew.auld@intel.com
19 months agodrm/xe/client: drop bogus bo NULL check
Matthew Auld [Mon, 18 Mar 2024 09:34:33 +0000 (09:34 +0000)]
drm/xe/client: drop bogus bo NULL check

If we fished it out the list then it can't be null; the list entry is
embedded in the bo.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318093431.21075-4-matthew.auld@intel.com
19 months agodrm/xe/client: remove bogus rcu list usage
Matthew Auld [Mon, 18 Mar 2024 09:34:32 +0000 (09:34 +0000)]
drm/xe/client: remove bogus rcu list usage

We use plain spinlock to protect readers and writers, so there is no
actual RCU here. Rather use the more appropriate non-rcu list based API.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318093431.21075-3-matthew.auld@intel.com
19 months agodrm/xe/pf: Always select Multi-Level LMTT for platforms 12.60+
Michal Wajdeczko [Wed, 13 Mar 2024 10:41:32 +0000 (11:41 +0100)]
drm/xe/pf: Always select Multi-Level LMTT for platforms 12.60+

The Multi-Level LMTT variant is not specific only to the PVC.
Change logic to select it for all new platforms beyond 12.60.

Bspec: 52404, 67468
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313104132.1045-4-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
19 months agodrm/xe/pf: Request 64K aligned allocations for LMTT PD
Michal Wajdeczko [Wed, 13 Mar 2024 10:41:31 +0000 (11:41 +0100)]
drm/xe/pf: Request 64K aligned allocations for LMTT PD

The LMTT Page Directory, as well as the directory entries, must be
aligned on a 64KB boundary in VRAM. Use explicit alignment flag to
match hardware requirement.

Bspec: 52404, 67468
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313104132.1045-3-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
19 months agodrm/xe: Allow VRAM BO allocations aligned to 64K
Michal Wajdeczko [Wed, 13 Mar 2024 10:41:30 +0000 (11:41 +0100)]
drm/xe: Allow VRAM BO allocations aligned to 64K

While today we are getting VRAM allocations aligned to 64K as the
XE_VRAM_FLAGS_NEED64K flag could be set, we shouldn't only rely on
that flag and we should also allow caller to specify required 64K
alignment explicitly.  Define new XE_BO_NEEDS_64K flag for that.

Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313104132.1045-2-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
19 months agodrm/xe: Make xe_mmio_read|write() functions non-inline
Michal Wajdeczko [Thu, 14 Mar 2024 17:31:30 +0000 (18:31 +0100)]
drm/xe: Make xe_mmio_read|write() functions non-inline

Shortly we will updating xe_mmio_read|write() functions with SR-IOV
specific features making those functions less suitable for inline.
Convert now those functions into regular ones, lowering driver
footprint, according to scripts/bloat-o-meter, by 6%

add/remove: 18/18 grow/shrink: 31/603 up/down: 2719/-79663 (-76944)
Function                                     old     new   delta
Total: Before=1276633, After=1199689, chg -6.03%
add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0)
Data                                         old     new   delta
Total: Before=48990, After=48990, chg +0.00%
add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0)
RO Data                                      old     new   delta
Total: Before=115680, After=115680, chg +0.00%

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314173130.1177-7-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
19 months agodrm/xe: Mark VF accessible interrupt registers
Michal Wajdeczko [Thu, 14 Mar 2024 17:31:29 +0000 (18:31 +0100)]
drm/xe: Mark VF accessible interrupt registers

Interrupt registers 1900xx are VF accessible but only until version
12.50 as on newer platforms VFs are using memory-based interrupts.

To avoid complexity, we mark those registers with XE_REG_OPTION_VF
unconditionally, as IRQ handling on newer VFs is different anyway.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314173130.1177-6-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
19 months agodrm/xe: Mark VF accessible global registers
Michal Wajdeczko [Thu, 14 Mar 2024 17:31:28 +0000 (18:31 +0100)]
drm/xe: Mark VF accessible global registers

Only selected registers are available for Virtual Functions.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314173130.1177-5-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
19 months agodrm/xe: Mark VF accessible GuC registers
Michal Wajdeczko [Thu, 14 Mar 2024 17:31:27 +0000 (18:31 +0100)]
drm/xe: Mark VF accessible GuC registers

Only selected registers are available for Virtual Functions.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314173130.1177-4-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
19 months agodrm/xe: Define XE_REG_OPTION_VF
Michal Wajdeczko [Thu, 14 Mar 2024 17:31:26 +0000 (18:31 +0100)]
drm/xe: Define XE_REG_OPTION_VF

We will tag registers that SR-IOV Virtual Functions can access.
This will help us catch any invalid usage and/or provide custom
replacement if available.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314173130.1177-3-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
19 months agodrm/xe: Assert size of the struct xe_reg
Michal Wajdeczko [Thu, 14 Mar 2024 17:31:25 +0000 (18:31 +0100)]
drm/xe: Assert size of the struct xe_reg

We want to keep the struct xe_reg as small as possible.
Make sure we don't accidentally change its size.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314173130.1177-2-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
19 months agodrm/xe: Add helper macro to loop each DSS
Zhanjun Dong [Thu, 14 Mar 2024 21:07:35 +0000 (14:07 -0700)]
drm/xe: Add helper macro to loop each DSS

Add helper macro to loop each DSS. This is a precursor patch to allow
for easier iteration through MCR registers and other per-DSS uses.

Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314210735.258553-2-zhanjun.dong@intel.com
19 months agodrm/xe/mocs: Clarify which GT is being operated on
Matt Roper [Thu, 14 Mar 2024 19:58:27 +0000 (12:58 -0700)]
drm/xe/mocs: Clarify which GT is being operated on

Switch the MOCS-related debug messages to use a GT-specific logging
function and add ID/type output to the beginning of the MOCS kunit test
to assist with debug when problems arise.

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314195825.3226856-4-matthew.d.roper@intel.com
19 months agodrm/xe/mocs: Determine MCR separately for primary/media GT in kunit test
Matt Roper [Thu, 14 Mar 2024 19:58:26 +0000 (12:58 -0700)]
drm/xe/mocs: Determine MCR separately for primary/media GT in kunit test

Although MOCS registers became multicast in graphics version 12.50 on
the primary GT, this transition did not happen until version 20 on the
media GT.  Considering each GT independently is mostly important for
MTL/ARL where the Xe_LPM+ IP has non-MCR MOCS registers, even though
Xe_LPG IP has MCR registers.

Bspec: 67789, 71186
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314195825.3226856-3-matthew.d.roper@intel.com
19 months agodrm/xe/gsc: Handle GSCCS ER interrupt
Daniele Ceraolo Spurio [Mon, 4 Mar 2024 14:56:34 +0000 (06:56 -0800)]
drm/xe/gsc: Handle GSCCS ER interrupt

Starting on Xe2, the GSCCS engine reset is a 2-step process. When the
driver or the GuC hits the GDRST register, the CS is immediately reset
and a success is reported, but the GSC shim continues its reset in the
background. While the shim reset is ongoing, the CS is able to accept
new context submission, but any commands that require the shim will
be stalled until the reset is completed. This means that we can keep
submitting to the GSCCS as long as we make sure that the preemption
timeout is big enough to cover any delay introduced by the reset; since
the GSC preempt timeout is not tunable at runtime, we only need to check
that the value set in kconfig is big enough (and increase it if it
isn't).
When the shim reset completes, a specific CS interrupt is triggered,
in response to which we need to check the GSCI_TIMER_STATUS register
to see if the reset was successful or not.
Note that the GSCI_TIMER_STATUS register is not power save/restored,
so it gets reset on MC6 entry. However, a reset failure stops MC6,
so in that scenario we're always guaranteed to find the correct value.

Since we can't check the register within interrupt context, the
existing GSC worker has been updated to handle it.
The expected action to take on ER failure is to trigger a driver FLR,
but we still don't support that, so for now we just print an error. A
comment has been added to the code to keep track of the FLR requirement.

v2: Add a check for the initial timeout value (Alan)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240304145634.820684-1-daniele.ceraolospurio@intel.com
19 months agodrm/xe/guc_submit: use jiffies for job timeout
Matthew Auld [Thu, 14 Mar 2024 12:15:55 +0000 (12:15 +0000)]
drm/xe/guc_submit: use jiffies for job timeout

drm_sched_init() expects jiffies for the timeout, but here we are
passing the timeout in ms. Convert to jiffies instead.

Fixes: eef55700f302 ("drm/xe: Add sysfs for default engine scheduler properties")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314121554.223229-2-matthew.auld@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
19 months agodrm/xe: Remove unused xe_bo->props struct
Nirmoy Das [Mon, 11 Mar 2024 15:11:59 +0000 (16:11 +0100)]
drm/xe: Remove unused xe_bo->props struct

Property struct is not being used so remove it and related dead code.

Fixes: ddfa2d6a846a ("drm/xe/uapi: Kill VM_MADVISE IOCTL")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: intel-xe@lists.freedesktop.org
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240311151159.10036-1-nirmoy.das@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
19 months agodrm/xe: Skip VMAs pin when requesting signal to the last XE_EXEC
José Roberto de Souza [Wed, 13 Mar 2024 17:13:18 +0000 (10:13 -0700)]
drm/xe: Skip VMAs pin when requesting signal to the last XE_EXEC

Doing a XE_EXEC with num_batch_buffer == 0 makes signals passed as
argument to be signaled when the last real XE_EXEC is completed.
But to do that it was first pinning all VMAs in drm_gpuvm_exec_lock(),
this patch remove this pinning as it is not required.

This change also help Mesa implementing memory over-commiting recovery
as it needs to unbind not needed VMAs when the whole VM can't fit
in GPU memory but it can only do the unbiding when the last XE_EXEC
is completed.
So with this change Mesa can get the signal it want without getting
out-of-memory errors.

Fixes: eb9702ad2986 ("drm/xe: Allow num_batch_buffer / num_binds == 0 in IOCTLs")
Cc: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Co-developed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313171318.121066-1-jose.souza@intel.com
19 months agodrm/xe: Use xe_assert in xe_device_assert_mem_access
Matthew Brost [Wed, 13 Mar 2024 18:44:30 +0000 (11:44 -0700)]
drm/xe: Use xe_assert in xe_device_assert_mem_access

The implementation of xe_device_assert_mem_access has a non-zero cost.
Use xe_assert rather than XE_WARN_ON so it will compile out in non-debug
kernel builds (Kconfig CONFIG_DRM_XE_DEBUG=n).

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313184430.999397-1-matthew.brost@intel.com
19 months agodrm/xe/xe_exec : In xe_exec_ioctl remove deadcode
Himal Prasad Ghimiray [Wed, 13 Mar 2024 15:05:45 +0000 (20:35 +0530)]
drm/xe/xe_exec : In xe_exec_ioctl remove deadcode

At label err_unlock_list the condition write_label will never be true.
Remove the deadcode line for write_label true.

Reported by static analyzer.

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313150545.2830408-3-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
19 months agodrm/xe: Return if kobj creation is failed
Himal Prasad Ghimiray [Wed, 13 Mar 2024 15:05:44 +0000 (20:35 +0530)]
drm/xe: Return if kobj creation is failed

Return after warning regarding kobj creation failure.

Fixes: 4ae3aeab32d7 ("drm/xe: Add vram frequency sysfs attributes")
Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313150545.2830408-2-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
19 months agodrm/xe/xe_tracer: Align fence output format in ftrace log
Shuicheng Lin [Wed, 13 Mar 2024 02:50:52 +0000 (02:50 +0000)]
drm/xe/xe_tracer: Align fence output format in ftrace log

The fence print in xe_gt_tlb_invalidation_fence and xe_hw_fence
is with "%p", change fence print in xe_sched_job to "%p" also.

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313025052.1410833-1-shuicheng.lin@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
19 months agodrm/xe: Group live kunit tests
Lucas De Marchi [Tue, 12 Mar 2024 14:51:57 +0000 (07:51 -0700)]
drm/xe: Group live kunit tests

As was done for the normal kunit tests, group the live tests into a
single module, xe_live_test.ko.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240312145158.2295351-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
19 months agodrm/xe: Invalidate userptr VMA on page pin fault
Matthew Brost [Tue, 12 Mar 2024 18:39:07 +0000 (11:39 -0700)]
drm/xe: Invalidate userptr VMA on page pin fault

Rather than return an error to the user or ban the VM when userptr VMA
page pin fails with -EFAULT, invalidate VMA mappings. This supports the
UMD use case of freeing userptr while still having bindings.

Now that non-faulting VMs can invalidate VMAs, drop the usm prefix for
the tile_invalidated member.

v2:
 - Fix build error (CI)
v3:
 - Don't invalidate VMA if in fault mode, rather kill VM (Thomas)
 - Update commit message with tile_invalidated name chagne (Thomas)
 - Wait VM bookkeep slots with VM resv lock (Thomas)
v4:
 - Move list_del_init(&userptr.repin_link) after error check (Thomas)
 - Assert not in fault mode (Matthew)

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240312183907.933835-1-matthew.brost@intel.com
19 months agodrm/xe/uapi: Add IP version and stepping to GT list query
Matt Roper [Tue, 12 Mar 2024 21:12:25 +0000 (14:12 -0700)]
drm/xe/uapi: Add IP version and stepping to GT list query

For modern platforms (MTL and later), both kernel and userspace drivers
are expected to apply GT programming and workarounds based on the IP
version and stepping self-reported by the GT hardware via the GMD_ID
registers.  Since userspace drivers can't access these registers
directly, pass along the version and stepping information via the GT
list query.  Note that the new query fields will remain 0's when running
on pre-GMD_ID platforms.  Userspace is expected to continue using PCI
devid / revid on those older platforms.

Although the hardware also has a GMD_ID register for display
version/stepping, that value is intentionally *not* included anywhere in
the Xe uapi.  Display userspace should be using platform-agnostic APIs
and auto-detecting platform capabilities rather than matching specific
IP versions.

v2:
 - s/revid/rev/  (Lucas)
 - Fix kerneldoc copy/paste mistakes

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240312211229.2871288-4-matthew.d.roper@intel.com
19 months agodrm/xe/hdcp: Fix condition for hdcp gsc cs requirement
Suraj Kandpal [Fri, 8 Mar 2024 15:49:40 +0000 (21:19 +0530)]
drm/xe/hdcp: Fix condition for hdcp gsc cs requirement

Add condition for check of hdcp gsc cs requirement rather than
assuming gsc cs to always be required when xe is loaded. It is not
required for display version < 14

--v2
-Use display version in commit message [Lucas]

Fixes: 152f2df954d8 ("drm/xe/hdcp: Enable HDCP for XE")
Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240308154939.1940960-2-suraj.kandpal@intel.com