Matt Roper [Fri, 24 May 2024 23:04:45 +0000 (16:04 -0700)]
drm/xe: Don't refer to general LRC initialization as a "wa"
During engine LRC initialization a number of registers need to be
programmed as general setup. This programming is not a "workaround" so
naming the RTP table as "lrc_was" is misleading; switch to the name
"lrc_setup" to more accurately describe what the table is actually for.
Michal Wajdeczko [Tue, 21 May 2024 14:22:55 +0000 (16:22 +0200)]
drm/xe: Store platform name in xe_device.info
We already maintain the platform name as part of the device
descriptor, but in xe_device.info we only store the platform enum,
which is not the best for use in user-facing messages.
Andrzej Hajda [Wed, 22 May 2024 07:27:27 +0000 (09:27 +0200)]
drm/xe: flush gtt before signalling user fence on all engines
Tests show that user fence signalling requires a kind of write
barrier, otherwise not all writes performed by the workload will be
visible to userspace. This is already done for render and compute;
we need it also for the rest: video, gsc, copy.
v2: added gsc and copy engines, added fixes and r-b tags
drm/xe: Do not access xe file when updating exec queue run_ticks
The current code is running into a use-after-free case where the xe
file is closed before the exec queue run_ticks can be updated. This is
occurring in the xe_file_close path. To fix that, do not access the xe
file when updating the exec queue run_ticks. Instead, store the exec
queue run_ticks locally in the exec queue object and accumulate it when
the user dumps the drm client stats (see the sketch below). We know
that the xe file is valid when the user is dumping the run_ticks for
the drm client, so this effectively removes the dependency on the xe
file object in xe_exec_queue_update_run_ticks().
v2:
- Fix the accumulation of q->run_ticks delta into xe file run_ticks
- s/runtime/run_ticks/ (Rodrigo)
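For illustration, a minimal sketch of the dump-time accumulation,
assuming illustrative names for the fields and the xarray holding the
queues:

    /* Sketch: the xe file (xef) is only dereferenced here, where it is
     * guaranteed to be alive because the user is dumping its stats. */
    static void accumulate_run_ticks(struct xe_file *xef)
    {
        struct xe_exec_queue *q;
        unsigned long i;

        xa_for_each(&xef->exec_queue.xa, i, q) {
            xef->run_ticks[q->class] += q->run_ticks - q->old_run_ticks;
            q->old_run_ticks = q->run_ticks;
        }
    }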
drm/xe: Use run_ticks instead of runtime for client stats
Note that runtime is also used in the pm context, so it is confusing to
use the same name to denote the run time of the drm client. Use a more
appropriate name for the client utilization.
While at it, drop the incorrect multi-lrc comment in the helper
description.
Thomas Hellström [Mon, 27 May 2024 13:59:12 +0000 (15:59 +0200)]
drm/xe: Move job creation out of the struct xe_migrate::job_mutex
In order to be able to run gpu jobs from reclaim context,
move job creation (where allocation takes place) out of the
struct xe_migrate::job_mutex, and prime that mutex as reclaim
tainted.
Jobs that may need to run from reclaim context include
CCS metadata extraction at shrinking time.
Thomas Hellström [Mon, 27 May 2024 13:59:10 +0000 (15:59 +0200)]
drm/xe: Don't initialize fences at xe_sched_job_create()
Pre-allocate but don't initialize fences at xe_sched_job_create(),
and initialize / arm them instead at xe_sched_job_arm(). This
makes it possible to move xe_sched_job_create() with its memory
allocation out of any lock that is required for fence
initialization, and that may not allow memory allocation under it.
Replaces the struct dma_fence_array for parallel jobs with a
struct dma_fence_chain, since the former doesn't allow
a split-up between allocation and initialization (see the sketch below).
v2:
- Rebase.
- Don't always use the first lrc when initializing parallel
lrc fences.
- Use dma_fence_chain_contained() to access the lrc fences.
v4:
- Add an assert that job->lrc_seqno == fence->seqno.
(Matthew Brost)
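For illustration, the split that struct dma_fence_chain makes possible,
with placement of the calls assumed from the commit text:

    /* xe_sched_job_create(): allocation only, may block, no fence locks */
    chain = dma_fence_chain_alloc();

    /* xe_sched_job_arm(): init/arm under the submission lock, no allocation */
    dma_fence_chain_init(chain, prev_fence, lrc_fence, seqno);

    /* readers unwrap the lrc fence with dma_fence_chain_contained() */
    fence = dma_fence_chain_contained(&chain->base);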
Thomas Hellström [Mon, 27 May 2024 13:59:09 +0000 (15:59 +0200)]
drm/xe: Split lrc seqno fence creation up
Since sometimes a lock is required to initialize a seqno fence,
and it might be desirable not to hold that lock while performing
memory allocations, split the lrc seqno fence creation up into an
allocation phase and an initialization phase.
Since lrc seqno fences under the hood are hw_fences, do the same
for these and remove the xe_hw_fence_create() function since it
is not used anymore.
Matthew Brost [Mon, 27 May 2024 13:59:08 +0000 (15:59 +0200)]
drm/xe: Decouple job seqno and lrc seqno
Tightly coupling these seqnos presents problems if alternative fences for
jobs are used. Decouple them for correctness.
v2:
- Slightly reword commit message (Thomas)
- Make sure the lrc fence ops are used in comparison (Thomas)
- Assume seqno is unsigned rather than signed in format string (Thomas)
Michal Wajdeczko [Mon, 27 May 2024 11:20:15 +0000 (13:20 +0200)]
drm/xe/vf: Use only assigned GGTT region
Each VF is assigned a limited range of the GGTT address space.
To ensure that the VF driver does not use GGTT allocations outside
of the assigned region, explicitly reserve GGTT space below and
above this region when initializing GGTT.
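For illustration, a minimal sketch of such a reservation using drm_mm;
the helper name and config fields are assumptions:

    /* Sketch: reserve ("balloon") a GGTT range the VF must not use. */
    static int balloon_range(struct xe_ggtt *ggtt, struct drm_mm_node *node,
                             u64 start, u64 end)
    {
        if (start >= end)
            return 0;    /* nothing to reserve on this side */

        node->start = start;
        node->size = end - start;
        return drm_mm_reserve_node(&ggtt->mm, node);
    }

    /* below and above the assigned [ggtt_base, ggtt_base + ggtt_size) */
    balloon_range(ggtt, &below, 0, config->ggtt_base);
    balloon_range(ggtt, &above, config->ggtt_base + config->ggtt_size,
                  GUC_GGTT_TOP);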
Michal Wajdeczko [Fri, 24 May 2024 11:37:13 +0000 (13:37 +0200)]
drm/xe/vf: Read VF configuration prior to GGTT initialization
Each VF is assigned only a limited range of the GGTT address
space. Make sure that the VF driver reads its own GGTT configuration
before starting any GGTT initialization.
Michal Wajdeczko [Thu, 23 May 2024 19:22:40 +0000 (21:22 +0200)]
drm/xe/vf: Treat GMDID as another runtime register
While the GMDID registers are not part of the runtime register list
shared by the PF driver, we may still return cached values from our
VF specific read32() helper function.
Michal Wajdeczko [Thu, 23 May 2024 19:22:39 +0000 (21:22 +0200)]
drm/xe/vf: Cache value of the GMDID register
Read and cache the value of the GMDID register as part of the config
query that the VF driver is doing over MMIO.
While the VF driver likely already obtained the value of the GMDID
register once during the early driver probe, we couldn't cache it
then as the GT structures were not ready yet.
Cache it now, in case the driver needs it later, when GuC MMIO
communication (required to query GMDID from the GuC) may no longer be
desired as it will be replaced by CTB communication.
While at it, assert that we will query GMDID only when applicable.
Michal Wajdeczko [Thu, 23 May 2024 22:30:42 +0000 (00:30 +0200)]
drm/xe/vf: Provide early access to GMDID register
VFs do not have direct access to the GMDID register and must obtain
its value from the GuC. Since we need the GMDID value very early in the
driver probe flow, before we even start the full setup of GT and GuC
data structures, we must do some early initialization ourselves.
Additionally, since we also need GMDID for the media GT, which isn't
created yet, temporarily tweak the root GT type into MEDIA to allow
communication with the correct GuC, as only it can provide the value
of the media GMDID register.
Michal Wajdeczko [Thu, 23 May 2024 19:22:36 +0000 (21:22 +0200)]
drm/xe/guc: Add GLOBAL_CFG_GMD_ID KLV definition
VF drivers can't access GMD_ID register over MMIO.
The value of the GMD_ID register must be queried from GuC.
It is available as GLOBAL_CFG_GMD_ID KLV.
Michal Wajdeczko [Thu, 23 May 2024 19:22:35 +0000 (21:22 +0200)]
drm/xe/vf: Use register values obtained from the PF
As part of its initialization, the VF driver has already
obtained a list of the runtime (fuse) register values from the
PF driver. When the VF driver attempts to read a register that is
inaccessible to the VF, use the value from this list instead.
John Harrison [Sat, 18 May 2024 04:36:59 +0000 (21:36 -0700)]
drm/xe/guc: Port over the slow GuC loading support from i915
GuC loading can take longer than it is supposed to for various
reasons. So add code to cope with that and to report it when it
happens. There are also many different reasons why GuC loading can
fail; add code to check for those and to report issues in a
meaningful manner rather than just hitting a timeout and
saying 'fail: status = %x'.
Also, remove the 'FIXME' comment about an i915 bug that has never been
applicable to Xe!
v2: Actually report the requested and granted frequencies rather than
showing granted twice (review feedback from Badal).
v3: Locally code all the timeout and end condition handling because a
helper function is not allowed (review feedback from Lucas/Rodrigo).
v4: Add more documentation comments and rename a define to add units
(review feedback from Lucas).
v5: Fix copy/paste error in xe_mmio_wait32_not (review feedback from
Lucas) and rebase (no more return value from guc_wait_ucode).
John Harrison [Sat, 18 May 2024 04:36:58 +0000 (21:36 -0700)]
drm/xe: Make read_perf_limit_reasons globally accessible
Other driver code beyond the sysfs interface wants to know about
throttling. So make the query function globally accessible.
v2: Revert include order change (review feedback from Lucas)
v3: Remove '_sysfs' from throttle file names and keep limit query in
the same file rather than moving elsewhere (review feedback from
Rodrigo).
v4: Correct #include while renaming header file (review feedback
from Lucas).
José Roberto de Souza [Wed, 22 May 2024 20:34:31 +0000 (13:34 -0700)]
drm/xe: Nuke simple error capture
This error capture prints HW state into dmesg when a gpu hang happens.
It was useful when we did not have devcoredump; now it is an incomplete
version of devcoredump with the potential to flood dmesg.
Rodrigo Vivi [Wed, 22 May 2024 17:01:05 +0000 (13:01 -0400)]
drm/xe: Enable D3Cold on 'low' VRAM utilization
Now that we have eliminated all the mem_access get/put calls, with
their locking issues, from the inner calls of migration, we can
allow D3Cold.
Enable it when VRAM utilization is lower than 300 MB. At
higher utilization we only allow D3hot so we don't increase
the runtime resume latency too much due to the memory restoration.
Rodrigo Vivi [Wed, 22 May 2024 17:01:04 +0000 (13:01 -0400)]
drm/xe: Stop checking for power_lost on D3Cold
GuC reset status is not reliable for this purpose: once in a while
we end up in a D3Cold situation where power_reset is false, and
without the proper memory restoration the GuC reload and display
will fail to come back from D3Cold.
So, let's do a full restoration of everything whenever there is a risk
of losing power, without further optimizations.
v2: also remove the gut_in_reset function (Anshuman)
Rodrigo Vivi [Wed, 22 May 2024 17:01:03 +0000 (13:01 -0400)]
drm/xe: Prepare display for D3Cold
Prepare power-well and DC handling for a full power
loss during D3Cold, then sanitize them upon D3->D0.
Otherwise we get a bunch of state mismatches.
Ideally we could leave DC9 enabled and wouldn't need
to move DC9->DC0 on every runtime resume, however,
the disable_DC is part of the power-well checks and
intrinsic to the dc_off power well. In the future that
can be disentangled so we can have even bigger power savings.
But for now, let's focus on getting a D3Cold, which saves
much more power by itself.
v2: create new functions to avoid full-suspend-resume path,
which would result in a deadlock between xe_gem_fault and the
modeset-ioctl.
v3: Only avoid the full modeset to avoid the race, for a more
robust suspend-resume.
Rodrigo Vivi [Wed, 22 May 2024 17:01:02 +0000 (13:01 -0400)]
drm/xe: Relax runtime pm protection around VM
In the regular use case scenario, user space will create a
VM and keep it alive for the entire duration of its workload.
For the regular desktop cases, this means that the VM
is alive even in idle scenarios where the display goes off. This
is unacceptable since it would entirely block runtime PM
indefinitely, blocking deeper Package-C states. This would be
a wasteful drain of power.
Limit the VM protection solely to long-running workloads that
are not protected by the scheduler references.
By design, run_job for long-running workloads returns NULL and
the scheduler drops all references to it, hence protecting
the VM for this case is necessary.
v2: Update commit message to use more imperative language and to
reflect why the VM protection is really needed.
Also add a comment in the code to keep the reason visible.
v3: Remove vma_access case and the mentions to mmap. Mmap cases
are already protected by the gem page fault.
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Tested-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240522170105.327472-4-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Rodrigo Vivi [Wed, 22 May 2024 17:01:01 +0000 (13:01 -0400)]
drm/xe: Relax runtime pm protection during execution
Limit the protection only during moments of actual job execution,
and introduce protection for guc submit fini, which is currently
unprotected due to the absence of exec_queue life protection.
In the regular use case scenario, user space will create an
exec queue and keep it alive to reuse it until it is done
with that kind of workload.
For the regular desktop cases, this means that the exec_queue
is alive even in idle scenarios where the display goes off. This
is unacceptable since it would entirely block runtime PM
indefinitely, blocking deeper Package-C states. This would be
a wasteful drain of power.
Rodrigo Vivi [Wed, 22 May 2024 17:00:59 +0000 (13:00 -0400)]
drm/xe: Fix xe_pm_runtime_get_if_active return
Current callers of this function already coerce the result
to a boolean and use it in an if. That is a problem because
the function may return negative error codes on failure,
without increasing the reference counter, and a negative value
is truthy in such a check.
In this scenario we could end up with an extra 'put' call, ending
in unbalanced scenarios.
Let's fix it, while aligning with the current xe_pm_get_if_in_use
style.
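For illustration, the shape of the fix, assuming the single-argument
form of pm_runtime_get_if_active():

    /* Returning bool means a negative errno can never look like success. */
    bool xe_pm_runtime_get_if_active(struct xe_device *xe)
    {
        return pm_runtime_get_if_active(xe->drm.dev) > 0;
    }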
Niranjana Vishwanathapura [Tue, 21 May 2024 20:17:11 +0000 (13:17 -0700)]
drm/xe: Properly handle alloc_guc_id() failure
Release the submission_state lock if alloc_guc_id() fails.
v2: Add Fixes tag and CC stable kernel
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: <stable@vger.kernel.org> # v6.8+
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240521201711.4934-1-niranjana.vishwanathapura@intel.com
Michal Wajdeczko [Tue, 21 May 2024 11:48:57 +0000 (13:48 +0200)]
drm/xe/uc: Don't emit false error if running in execlist mode
When running in execlist mode (using force_execlist=1 modparam)
we incorrectly select the error path in xe_uc_init(), leading to
an unwanted error message.
Matthew Auld [Wed, 22 May 2024 10:22:01 +0000 (11:22 +0100)]
drm/xe/display: move device_remove over to drmm
i915 display calls this when releasing the drm_device; match that
in xe by using drmm. intel_display_device_remove() frees purely
software state for the drm_device.
Matthew Auld [Wed, 22 May 2024 10:21:58 +0000 (11:21 +0100)]
drm/xe: reset mmio mappings with devm
Set our various mmio mappings to NULL. This should make it easier to
catch something rogue trying to mess with mmio after device removal. For
example, we might unmap everything and then start hitting some mmio
address which has already been unmapped by us and then remapped by
something else, causing all kinds of carnage.
Matthew Auld [Wed, 22 May 2024 10:21:57 +0000 (11:21 +0100)]
drm/xe/mmio: move mmio_fini over to devm
It is not valid to touch mmio once the device is removed, so make sure
we unmap on removal and not just when the driver instance goes away.
Also set the mmio pointers to NULL to hopefully catch such issues more
easily.
Matthew Auld [Wed, 22 May 2024 10:21:54 +0000 (11:21 +0100)]
drm/xe/coredump: move over to devm
Here we are using drmm to ensure we release the coredump when unloading
the module, however the coredump is very much tied to the struct device
underneath. We can see this when we hotunplug the device, for which we
have already got a coredump attached. In such a case the coredump still
remains and adding another is not possible. However we still register
the release action via xe_driver_devcoredump_fini(), so in effect we
get two or more releases for one dump. The other consideration is that
the coredump state is embedded in the xe_driver instance, so
technically, once the drmm release action fires, we might free the
coredump state from a different driver instance, assuming we have two
release actions and they can race. Rather, use devm here to remove the
coredump when the device is released.
Matthew Auld [Wed, 22 May 2024 10:21:49 +0000 (11:21 +0100)]
drm/xe/guc_pc: move pc_fini to devm
Here we are touching the HW/GuC and presumably this should happen when
the device is removed. Currently if you hotunplug the device this is
skipped if there is still an open driver instance.
Matthew Auld [Wed, 22 May 2024 10:21:47 +0000 (11:21 +0100)]
drm/xe/guc: move guc_fini over to devm
Make sure to actually call this when the device is removed. Currently we
only trigger it when the driver instance goes away, but that doesn't
work too well with hotunplug, since device can be removed and re-probed
with a new driver instance, where the guc_fini() is called too late.
Move the fini over to devm to ensure this is called when device is
removed.
Matthew Auld [Wed, 22 May 2024 10:21:46 +0000 (11:21 +0100)]
drm/xe/ggtt: use drm_dev_enter to mark device section
The device can be hotunplugged before we start destroying gem objects.
In such a case don't touch the GGTT entries, trigger any invalidations,
or mess around with rpm. This should already be taken care of when
removing the device; we just need to deal with the software state,
like removing the mm node (see the sketch below).
v2: (Andrzej)
- Avoid some duplication by tracking the bound status and checking
that instead.
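For illustration, the shape of the change; drm_dev_enter()/drm_dev_exit()
are the DRM device-section APIs, the helper name is an assumption:

    static void ggtt_node_remove(struct xe_ggtt *ggtt, struct drm_mm_node *node)
    {
        struct xe_device *xe = tile_to_xe(ggtt->tile);
        int idx;

        if (drm_dev_enter(&xe->drm, &idx)) {
            /* device still present: clear PTEs, invalidate, use rpm */
            drm_dev_exit(idx);
        }

        /* software state is cleaned up either way */
        drm_mm_remove_node(node);
    }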
References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1717
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240522102143.128069-21-matthew.auld@intel.com
Matthew Auld [Wed, 22 May 2024 10:21:45 +0000 (11:21 +0100)]
drm/xe: convert sysfs over to devm
Hotunplugging the device seems to result in stuff like:
  kobject_add_internal failed for tile0 with -EEXIST, don't try to
  register things with the same name in the same directory.
We only remove the sysfs as part of drmm, however that is tied to the
lifetime of the driver instance and not the device underneath. Attempt
to fix by using devm for all of the remaining sysfs stuff related to the
device.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1667
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1432
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240522102143.128069-20-matthew.auld@intel.com
Matthew Auld [Wed, 22 May 2024 10:21:44 +0000 (11:21 +0100)]
drm/xe/pci: remove broken driver_release
This is quite broken since we are nuking the pdev link to the private
driver struct, but note here that driver_release is called when the
drm_device is released (poor man's drmm), which can be long after the
device has been removed. So here what we are actually doing is nuking
the pdev link for what is potentially bound to a different drm_device.
If that happens before our pci remove callback is triggered (for the new
drm_device) we silently exit and skip some important cleanup steps,
resulting in hilarity.
There should be no reason to implement driver_release, when we already
have nicer stuff like drmm, so just remove it completely. The actual pdev
link is already nuked when removing the device.
Michal Wajdeczko [Tue, 21 May 2024 09:25:18 +0000 (11:25 +0200)]
drm/xe/vf: Custom GuC initialization if VF
The GuC firmware is loaded and initialized by the PF driver. Make
sure VF drivers only perform permitted operations. For submission
initialization, use the number of GuC context IDs from the self config.
Michal Wajdeczko [Tue, 21 May 2024 09:25:17 +0000 (11:25 +0200)]
drm/xe/guc: Allow to initialize submission with limited set of IDs
While PF and native drivers may initialize the submission code to use
all available GuC context IDs, the VF driver may only use a limited
number of IDs. Update the init function to accept the number of context
IDs available for use.
Michal Wajdeczko [Mon, 20 May 2024 18:18:13 +0000 (20:18 +0200)]
drm/xe: Don't rely on indirect includes from xe_mmio.h
These compilation units use udelay() or some GT oriented printk
functions without explicitly including the proper header files,
relying on #includes from xe_mmio.h instead. Fix that.
Nirmoy Das [Tue, 21 May 2024 10:36:23 +0000 (12:36 +0200)]
drm/xe: Add warn when level can not be zero.
At xe_pt_zap_ptes_entry() and xe_pt_stage_unbind_entry(), the level cannot
be 0. Therefore, add an independent check for the level. Since the level
cannot be zero at this point, there is no need to check for `is_compact`,
so remove that instead.
The L3 bank mask is already generated and stored internally with
the rest of the GT topology. In user space, the compute runtime
now needs this information to be added to the device properties,
therefore the topology mask query is extended to provide a new
mask which represents the L3 banks enabled on the GT.
The changes in the compute runtime are ready and approved, see
link below.
v2: Rewrite commit message and add a link to the compute
runtime PR (Francois Dugast)
Lucas De Marchi [Fri, 17 May 2024 20:43:10 +0000 (13:43 -0700)]
drm/xe/client: Print runtime to fdinfo
Print the accumulated runtime for client when printing fdinfo.
Each time a query is done it does 2 things:
1) loop through all the exec queues for the current client and
accumulate the runtime, per engine class. CTX_TIMESTAMP is used for
that, being read from the context image.
2) Read a "GPU timestamp" that can be used for considering "how much GPU
time has passed" and that has the same unit/refclock as the one
recording the runtime. RING_TIMESTAMP is used for that via MMIO.
Since for all current platforms RING_TIMESTAMP follows the same
refclock, just read it once, using the first available engine.
This is exported to userspace as 2 numbers in fdinfo:
  drm-cycles-<class>: <RUNTIME>
  drm-total-cycles-<class>: <TIMESTAMP>
Since drm-cycles-<class> always starts at 0, it's also possible to know
if an engine was ever used by a client.
It's expected that userspace will read any 2 samples every few seconds.
Given the update frequency of the counters involved and that
CTX_TIMESTAMP is 32-bits, the counter for each exec_queue can wrap
around (assuming 100% utilization) after ~200s. The wraparound is not
perceived by userspace since it's just accumulated for all the
exec_queues in a 64-bit counter, but the measurement will not be
accurate if the samples are too far apart.
This could be mitigated by adding a workqueue to accumulate the counters
every so often, but it's additional complexity for something that is
done already by userspace every few seconds in tools like gputop (from
igt), htop, nvtop, etc, with none of them really defaulting to 1 sample
per minute or more.
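For illustration, a userspace-side sketch of turning two samples of
these two counters into a utilization percentage:

    #include <stdint.h>

    /* busy% of one engine class between two fdinfo samples */
    static double busy_pct(uint64_t cycles0, uint64_t cycles1,
                           uint64_t total0, uint64_t total1)
    {
        uint64_t delta_total = total1 - total0;

        if (!delta_total)
            return 0.0;    /* samples taken too close together */

        return 100.0 * (double)(cycles1 - cycles0) / (double)delta_total;
    }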
Lucas De Marchi [Fri, 17 May 2024 20:43:08 +0000 (13:43 -0700)]
drm/xe: Cache data about user-visible engines
gt->info.engine_mask used to indicate the available engines, but that
is not always true anymore: some engines are reserved to kernel and some
may be exposed as a single engine (e.g. with ccs_mode).
Runtime changes only happen when no clients exist, so it's safe to cache
the list of engines in the gt and update that when it's needed. This
will help implementing per client engine utilization so this (mostly
constant) information doesn't need to be re-calculated on every query.
drm/xe: Add helper to accumulate exec queue runtime
Add a helper to accumulate the per-client runtime of all its
exec queues, as sketched below. This is called every time a sched job
is finished.
v2:
- Use guc_exec_queue_free_job() and execlist_job_free() to accumulate
runtime when job is finished since xe_sched_job_completed() is not a
notification that job finished.
- Stop trying to update runtime from xe_exec_queue_fini() - that is
redundant and may happen after xef is closed, leading to a
use-after-free
- Do not special case the first timestamp read: the default LRC sets
CTX_TIMESTAMP to zero, so even the first sample should be a valid
one.
- Handle the parallel submission case by multiplying the runtime by
width.
v3: Update comments
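For illustration, a minimal sketch of the helper; the exact names are
assumptions:

    void xe_exec_queue_update_run_ticks(struct xe_exec_queue *q)
    {
        u32 old_ts, new_ts;

        /* CTX_TIMESTAMP accumulates run time in the context image; the
         * helper returns the fresh value and the previous sample. */
        new_ts = xe_lrc_update_timestamp(&q->lrc[0], &old_ts);

        /* a parallel job runs on q->width engines at once */
        q->run_ticks += (new_ts - old_ts) * q->width;
    }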
Lucas De Marchi [Fri, 17 May 2024 20:43:06 +0000 (13:43 -0700)]
drm/xe: Add helper to capture engine timestamp
Just like CTX_TIMESTAMP is used to calculate runtime, add a helper to
get the timestamp for the engine so it can be used to calculate the
"engine time" with the same unit as the runtime is recorded.
Lucas De Marchi [Fri, 17 May 2024 20:43:04 +0000 (13:43 -0700)]
drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion
XE_ENGINE_CLASS_OTHER was missing from the str conversion. Add it and
remove the default handling so it's protected by -Wswitch.
Currently the only user is xe_hw_engine_class_sysfs_init(), which
already skips XE_ENGINE_CLASS_OTHER, so there's no change in behavior.
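For illustration, the shape of the conversion helper; with no default
case, -Wswitch flags any engine class added later without a string (the
strings are assumptions):

    const char *xe_hw_engine_class_to_str(enum xe_engine_class class)
    {
        switch (class) {
        case XE_ENGINE_CLASS_RENDER:
            return "rcs";
        case XE_ENGINE_CLASS_COPY:
            return "bcs";
        case XE_ENGINE_CLASS_VIDEO_DECODE:
            return "vcs";
        case XE_ENGINE_CLASS_VIDEO_ENHANCE:
            return "vecs";
        case XE_ENGINE_CLASS_COMPUTE:
            return "ccs";
        case XE_ENGINE_CLASS_OTHER:
            return "other";
        case XE_ENGINE_CLASS_MAX:
            break;
        }
        return NULL;
    }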
José Roberto de Souza [Fri, 10 May 2024 15:01:08 +0000 (08:01 -0700)]
drm/xe: Replace RING_START_UDW by u64 RING_START
Other u64 registers are printed in a single line, so RING_START
needs to follow that too.
As there is no upstream decoder tool parsing RING_START this will
not break any decoder application.
Michal Wajdeczko [Thu, 16 May 2024 11:05:46 +0000 (13:05 +0200)]
drm/xe/vf: Expose SR-IOV VF attributes to GT debugfs
For debug purposes we might want to view actual VF configuration
(including GGTT range, LMEM size, number of GuC contexts IDs or
doorbells) and the negotiated ABI versions (with GuC and PF).
Michal Wajdeczko [Thu, 16 May 2024 11:05:44 +0000 (13:05 +0200)]
drm/xe/vf: Add support for VF to query its configuration
The VF driver doesn't know which GuC firmware was loaded by the PF
driver and must perform a GuC ABI version handshake prior to sending
any other H2G actions to the GuC to submit workloads.
The VF driver also doesn't have access to the fuse registers and
must rely on the runtime info, which includes values of the fuse
registers, that the PF driver is exposing to the VFs.
Add functions to cover that functionality. We will use these
functions in upcoming patches.
Michal Wajdeczko [Thu, 16 May 2024 11:05:43 +0000 (13:05 +0200)]
drm/xe/guc: Add VF2GUC_QUERY_SINGLE_KLV to ABI
In upcoming patches we will add support to the VF driver to
read its configuration from the GuC using special H2G actions.
Add necessary definitions to our GuC firmware ABI header.
Michal Wajdeczko [Thu, 16 May 2024 11:05:42 +0000 (13:05 +0200)]
drm/xe/guc: Add VF2GUC_VF_RESET to ABI
The version negotiation between the VF driver and the GuC firmware
must start with explicit soft reset of the GuC state initiated by
the VF driver. Add VF2GUC action definitions to the ABI header.
Michal Wajdeczko [Thu, 16 May 2024 11:05:41 +0000 (13:05 +0200)]
drm/xe/guc: Add VF2GUC_MATCH_VERSION to ABI
In upcoming patches we will add a version negotiation between
the VF driver and the GuC firmware. Add necessary definitions
to our GuC firmware ABI header.
Michal Wajdeczko [Tue, 14 May 2024 19:00:14 +0000 (21:00 +0200)]
drm/xe/pf: Track adverse events notifications from GuC
When the thresholds used to monitor VF activity are configured,
the GuC may send GUC2PF_ADVERSE_EVENT messages informing the
PF driver about exceeded thresholds. Start handling such messages.
Michal Wajdeczko [Tue, 14 May 2024 19:00:13 +0000 (21:00 +0200)]
drm/xe/guc: Add GUC2PF_ADVERSE_EVENT to ABI
When the thresholds used to monitor VF activity are configured,
the GuC may send GUC2PF_ADVERSE_EVENT messages informing the
PF driver about exceeded thresholds. Add necessary definitions
to our GuC firmware ABI header.
Michal Wajdeczko [Tue, 14 May 2024 19:00:12 +0000 (21:00 +0200)]
drm/xe/pf: Allow configuration of VF thresholds over debugfs
The initial values of all thresholds used by the GuC to monitor VF
activity are zero (disabled) and we need to explicitly configure
them for each VF. Expose additional attributes over debugfs.
Definitions of all attributes are generated, so we will not need
to make any changes if new thresholds are added to the set.
Michal Wajdeczko [Tue, 14 May 2024 19:00:11 +0000 (21:00 +0200)]
drm/xe/pf: Introduce functions to configure VF thresholds
The GuC firmware monitors VF's activity and notifies the PF driver
once any configured threshold related to such activity is exceeded.
Add functions to allow configuration of these thresholds per VF.
Michal Wajdeczko [Tue, 14 May 2024 19:00:09 +0000 (21:00 +0200)]
drm/xe/guc: Introduce GuC KLV thresholds set
The GuC firmware monitors VF's activity and notifies the PF driver
once any configured threshold related to such activity is exceeded.
The available thresholds are defined in the GuC ABI as part of the
GuC VF Configuration KLVs. Threshold configurations performed by
the PF driver and notifications sent by the GuC rely on the KLV keys,
which are not zero-based and might not guarantee continuity.
To simplify the driver code and eliminate the need to repeat very
similar code for each threshold, introduce a threshold set macro
that generates the required code based on a unique threshold tag.
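For illustration, the kind of set macro this enables; the tags and key
values below are made up:

    #define MAKE_XE_GUC_KLV_THRESHOLDS_SET(define)  \
        define(CAT_ERR, 0x8d01)                     \
        define(ENGINE_RESET, 0x8d02)                \
        define(PAGE_FAULT, 0x8d03)

    /* derive a zero-based, continuous index from the sparse KLV keys */
    #define __entry(TAG, KEY) XE_GUC_KLV_THRESHOLD_INDEX_##TAG,
    enum xe_guc_klv_threshold_index {
        MAKE_XE_GUC_KLV_THRESHOLDS_SET(__entry)
        XE_GUC_KLV_NUM_THRESHOLDS
    };
    #undef __entry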
Michal Wajdeczko [Tue, 14 May 2024 19:00:08 +0000 (21:00 +0200)]
drm/xe/guc: Add more KLV helper macros
In upcoming patches we will want to generate some of the KLV keys
from other macros. Add MAKE_GUC_KLV_{KEY|LEN} macros for that and
make sure they correctly expand the provided TAG parameter. Also
fix PREP_GUC_KLV_TAG to work correctly within other macros.
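For illustration, why the extra expansion level matters; names
simplified:

    /* pastes immediately: a TAG that is itself a macro would not expand */
    #define __MAKE_GUC_KLV_KEY(TAG)  GUC_KLV_##TAG##_KEY
    /* the indirection forces TAG to be macro-expanded before pasting */
    #define MAKE_GUC_KLV_KEY(TAG)    __MAKE_GUC_KLV_KEY(TAG)

    #define MY_TAG VF_CFG_THRESHOLD_CAT_ERR
    /* MAKE_GUC_KLV_KEY(MY_TAG) -> GUC_KLV_VF_CFG_THRESHOLD_CAT_ERR_KEY */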
The PCI subsystem already exposes the "sriov_numvfs" attribute
that users can use to enable or disable SR-IOV VFs. Add a custom
implementation of the .sriov_configure callback defined by the
pci_driver to perform additional steps, including fair provisioning
of VFs with resources, as required by our platforms.
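For illustration, the general shape of such a callback; the
provisioning helper is an assumption, the return convention is the PCI
core's:

    static int xe_pci_sriov_configure(struct pci_dev *pdev, int num_vfs)
    {
        struct xe_device *xe = pci_get_drvdata(pdev);
        int err;

        if (num_vfs) {
            err = pf_provision_vfs(xe, num_vfs);    /* assumed helper */
            if (err < 0)
                return err;
            err = pci_enable_sriov(pdev, num_vfs);
            if (err < 0)
                return err;
            return num_vfs;    /* number of VFs now enabled */
        }

        pci_disable_sriov(pdev);
        return 0;
    }

    static struct pci_driver xe_pci_driver = {
        /* ... */
        .sriov_configure = xe_pci_sriov_configure,
    };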
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Cc: Badal Nilawar <badal.nilawar@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> #v2
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240506184121.2615-1-michal.wajdeczko@intel.com
Michal Wajdeczko [Tue, 7 May 2024 16:57:57 +0000 (18:57 +0200)]
drm/xe/pf: Don't advertise support to enable VFs if not ready
Even if we have not enabled SR-IOV support using the platform
specific has_sriov flag, the hardware may still report SR-IOV
capability and the PCI layer may wrongly advertise driver support
to enable VFs. Explicitly reset the number of supported VFs to
zero to avoid confusion.
Applications may read the /sys/bus/pci/devices/.../sriov_totalvfs
attribute prior to enabling VFs via sriov_numvfs, to check whether
such an operation is possible.
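For illustration, roughly what the probe-time clamp looks like:

    /* Sketch: hide the SR-IOV capability the driver cannot support yet. */
    if (!xe->info.has_sriov)
        pci_sriov_set_totalvfs(pdev, 0);    /* sriov_totalvfs now reads 0 */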
Jonathan Cavitt [Fri, 10 May 2024 19:45:40 +0000 (12:45 -0700)]
drm/xe/xe_guc_submit: Declare reset if banned or killed or wedged
Add an additional condition to the reset_status guc_exec_queue_op that
returns true if the exec queue has been banned, killed, or wedged. The
reset_status op is only used for exiting any xe_wait_user_fence_ioctl
that waits on an exec queue without timing out, so doing this will exit
the ioctl early in cases where the exec queue can no longer function,
such as after a GuC stop during a reset.
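For illustration, the shape of the extended op; the helper names are
assumptions:

    static bool guc_exec_queue_reset_status(struct xe_exec_queue *q)
    {
        /* also report "reset" once the queue can no longer run jobs */
        return exec_queue_reset(q) ||
               exec_queue_killed_or_banned_or_wedged(q);
    }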
Jonathan Cavitt [Fri, 10 May 2024 19:45:39 +0000 (12:45 -0700)]
drm/xe/xe_guc_submit: Allow lr exec queues to be banned
LR queues currently don't get banned during a GT/GuC reset because they
lack a job. Though they don't have a job to detect the reset status of,
it's still possible to tell when they should be banned by looking at the
LRC: if the LRC head and tail don't match, then the exec queue should be
banned and cleaned up.
This also requires swapping the usage of xe_sched_tdr_queue_imm with
xe_guc_exec_queue_trigger_cleanup, as the former is specific to non-lr
exec queues.
Reorder the xe_sched_tdr_queue_imm and set_exec_queue_banned calls in
guc_exec_queue_stop. This prevents a possible race condition between
the two events in which it's possible for xe_sched_tdr_queue_imm to
wake the ufence waiter before the exec queue is banned, causing the
ufence waiter to miss the banned state.
Michal Wajdeczko [Fri, 10 May 2024 20:38:09 +0000 (22:38 +0200)]
drm/xe/uc: Reorder post hwconfig uC initialization step
We want to move the GuC submission initialization to the post
hwconfig step, but now this step is done too late as migration
initialization uses an exec_queue that would crash due to an unset
exec_queue_ops. We can easily fix that with a small function reorder.
Matthew Brost [Mon, 15 Apr 2024 19:04:53 +0000 (12:04 -0700)]
drm/xe: Only use reserved BCS instances for usm migrate exec queue
The GuC context scheduling queue is 2 entries deep, thus it is possible
for a migration job to be stuck behind a fault if migration exec queue
shares engines with user jobs. This can deadlock as the migrate exec
queue is required to service page faults. Avoid deadlock by only using
reserved BCS instances for usm migrate exec queue.
Fixes: a043fbab7af5 ("drm/xe/pvc: Use fast copy engines as migrate engine on PVC")
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240415190453.696553-2-matthew.brost@intel.com
Reviewed-by: Brian Welty <brian.welty@intel.com>
drm/xe: Change pcode timeout to 50msec while polling again
Polling is initially attempted with timeout_base_ms enabled for
preemption, and if it exceeds this timeframe, another attempt is made
without preemption, allowing an additional 50 ms before timing out.
v2:
- Rebase
v3:
- Move warnings to separate patch (Lucas)
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Fixes: 7dc9b92dcfef ("drm/xe: Remove i915_utils dependency from xe_pcode.")
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240508152216.3263109-2-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
xe_force_wake_init_gt() is a software-only initialization and doesn't
need to be called from xe_device_probe(). Move it to initialize
together with the gt.