Matthew Brost [Thu, 14 Oct 2021 17:19:43 +0000 (10:19 -0700)]
drm/i915/guc: Take engine PM when a context is pinned with GuC submission
Taking a PM reference to prevent intel_gt_wait_for_idle from short
circuiting while any user context has scheduling enabled. Returning GT
idle when it is not can cause all sorts of issues throughout the stack.
v2:
(Daniel Vetter)
- Add might_lock annotations to pin / unpin function
v3:
(CI)
- Drop intel_engine_pm_might_put from unpin path as an async put is
used
v4:
(John Harrison)
- Make intel_engine_pm_might_get/put work with GuC virtual engines
- Update commit message
v5:
- Update commit message again
Matthew Brost [Thu, 14 Oct 2021 17:19:42 +0000 (10:19 -0700)]
drm/i915/guc: Take GT PM ref when deregistering context
Taking a PM reference to prevent intel_gt_wait_for_idle from short
circuiting while a deregister context H2G is in flight. To do this must
issue the deregister H2G from a worker as context can be destroyed from
an atomic context and taking GT PM ref blows up. Previously we took a
runtime PM from this atomic context which worked but will stop working
once runtime pm autosuspend in enabled.
So this patch is two fold, stop intel_gt_wait_for_idle from short
circuting and fix runtime pm autosuspend.
v2:
(John Harrison)
- Split structure changes out in different patch
(Tvrtko)
- Don't drop lock in deregister_destroyed_contexts
v3:
(John Harrison)
- Flush destroyed contexts before destroying context reg pool
Matthew Brost [Thu, 14 Oct 2021 17:19:41 +0000 (10:19 -0700)]
drm/i915/guc: Move GuC guc_id allocation under submission state sub-struct
Move guc_id allocation under submission state sub-struct as a future
patch will reuse the spin lock as a global submission state lock. Moving
this into sub-struct makes ownership of fields / lock clear.
v2:
(Docs)
- Add comment for submission_state sub-structure
v3:
(John Harrison)
- Fixup a few comments
v4:
(John Harrison)
- Fix typo
Andi Shyti [Tue, 12 Oct 2021 22:17:38 +0000 (00:17 +0200)]
drm/i915/gt: move remaining debugfs interfaces into gt
The following interfaces:
i915_wedged
i915_forcewake_user
are dependent on gt values. Put them inside gt/ and drop the
"i915_" prefix name. This would be the new structure:
dri/0/gt
|
+-- forcewake_user
|
\-- reset
For backwards compatibility with existing igt (and the slight
semantic difference between operating on the i915 abi entry
points and the deep gt info):
dri/0
|
+-- i915_wedged
|
\-- i915_forcewake_user
remain at the top level.
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211012221738.16029-1-andi@etezian.org
5.15-rc1 crashes with blank screen when booting up on two ThinkPads
using i915. Bisections converge convincingly, but arrive at different
and surprising "culprits", none of them the actual culprit.
netconsole (with init_netconsole() hacked to call i915_init() when
logging has started, instead of by module_init()) tells the story:
kernel BUG at drivers/gpu/drm/i915/i915_sw_fence.c:245!
with RSI: ffffffff814d408b pointing to sw_fence_dummy_notify().
I've been building with CONFIG_CC_OPTIMIZE_FOR_SIZE=y, and that
function needs to be 4-byte aligned.
v2:
(Jani Nikula)
- Change BUG_ON to WARN_ON
v3:
(Jani / Tvrtko)
- Short circuit __i915_sw_fence_init on WARN_ON
v4:
(Lucas)
- Break WARN_ON changes out in a different patch
Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation") Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210922015039.26411-1-matthew.brost@intel.com
Matt Roper [Fri, 1 Oct 2021 00:58:16 +0000 (17:58 -0700)]
drm/i915: Stop using I915_TILING_* in client blit selftest
The I915_TILING_* definitions in the uapi header are intended solely for
tiling modes that are visible to the old de-tiling fence ioctls. Since
modern hardware does not support de-tiling fences, we should not add new
definitions for new tiling types going forward. However we do want the
client blit selftest to eventually cover other new tiling modes (such as
Tile4), so switch it to using its own enum of tiling modes.
Lucas De Marchi [Thu, 7 Oct 2021 23:32:10 +0000 (16:32 -0700)]
drm/i915/gt: include tsc.h where used
We are currently using tsc_khz as a fallback so add the right include.
For other architectures we may need to add a different fallback, but
this is not being used by dgfx so we may as well just paper it over.
Lucas De Marchi [Tue, 5 Oct 2021 17:17:28 +0000 (10:17 -0700)]
drm/i915: remove IS_ACTIVE
When trying to bring IS_ACTIVE to linux/kconfig.h I thought it wouldn't
provide much value just encapsulating it in a boolean context. So I also
added the support for handling undefined macros as the IS_ENABLED()
counterpart. However the feedback received from Masahiro Yamada was that
it is too ugly, not providing much value. And just wrapping in a boolean
context is too dumb - we could simply open code it.
As detailed in commit babaab2f4738 ("drm/i915: Encapsulate kconfig
constant values inside boolean predicates"), the IS_ACTIVE macro was
added to workaround a compilation warning. However after checking again
our current uses of IS_ACTIVE it turned out there is only
1 case in which it triggers a warning in clang (due
-Wconstant-logical-operand) and 2 in smatch. All the others
can simply use the shorter version, without wrapping it in any macro.
So here I'm dialing all the way back to simply removing the macro. That
single case hit by clang can be changed to make the constant come first,
so it doesn't think it's mask:
- if (context && CONFIG_DRM_I915_FENCE_TIMEOUT)
+ if (CONFIG_DRM_I915_FENCE_TIMEOUT && context)
As talked with Dan Carpenter, that logic will be added in smatch as
well, so it will also stop warning about it.
In short this makes i915 work for hybrid setups (DRI_PRIME=1 with Mesa)
when rendering is done on Intel dgfx and scanout/composition on Intel
igfx.
Before this patch the driver was not quite ready for that setup, mainly
because it was able to emit a semaphore wait between the two GPUs, which
results in deadlocks because semaphore target location in HWSP is neither
shared between the two, nor mapped in both GGTT spaces.
To fix it the patch adds an additional check to a couple of relevant code
paths in order to prevent using semaphores for inter-engine
synchronisation when relevant objects are not in the same GGTT space.
v2:
* Avoid adding rq->i915. (Chris)
v3:
* Use GGTT which describes the limit more precisely.
Matthew Brost [Fri, 1 Oct 2021 15:58:25 +0000 (08:58 -0700)]
drm/i915: Fix bug in user proto-context creation that leaked contexts
Set number of engines before attempting to create contexts so the
function free_engines can clean up properly. Also check return of
alloc_engines for NULL.
v2:
(Tvrtko)
- Send as stand alone patch
(John Harrison)
- Check for alloc_engines returning NULL
v3:
(Checkpatch / Tvrtko)
- Remove braces around single line if statement
Cc: Jason Ekstrand <jason@jlekstrand.net> Fixes: d4433c7600f7 ("drm/i915/gem: Use the proto-context to handle create parameters (v5)") Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211001155825.6762-1-matthew.brost@intel.com
i915 enables a wider set of warnings with '-Wall -Wextra' then disables
several with cc-disable-warning. If an unknown flag gets added to
KBUILD_CFLAGS when building with clang, all subsequent calls to
cc-{disable-warning,option} will fail, meaning that all of these
warnings do not get disabled [1].
A separate series will address the root cause of the issue by not adding
these flags when building with clang [2]; however, the symptom of these
extra warnings appearing can be addressed separately by just removing
the calls to cc-disable-warning, which makes the build ever so slightly
faster because the compiler does not need to be called as much before
building.
The following warnings are supported by GCC 4.9 and clang 10.0.1, which
are the minimum supported versions of these compilers so the call to
cc-disable-warning is not necessary. Masahiro cleaned this up for the
reset of the kernel in commit 4c8dd95a723d ("kbuild: add some extra
warning flags unconditionally").
-Wunused-but-set-variable was implemented in clang 13.0.0 and
-Wframe-address was implemented in clang 12.0.0 so the
cc-disable-warning calls are kept for these two warnings.
Lastly, -Winitializer-overrides is clang's version of -Woverride-init,
which is disabled for the specific files that are problematic. clang
added a compatibility alias in clang 8.0.0 so -Winitializer-overrides
can be removed.
Add support to enable/disable PLANE_SURF Decryption Request bit.
It requires only to enable plane decryption support when following
condition met.
1. PXP session is enabled.
2. Buffer object is protected.
v2:
- Used gen fb obj user_flags instead gem_object_metadata. [Krishna]
v3:
- intel_pxp_gem_object_status() API changes.
v4: use intel_pxp_is_active (Daniele)
v5: rebase and use the new protected object status checker (Daniele)
v6: used plane state for plane_decryption to handle async flip
as suggested by Ville.
v7: check pxp session while plane decrypt state computation. [Ville]
removed pointless code. [Ville]
v8 (Daniele): update PXP check
v9: move decrypt check after icl_check_nv12_planes() when overlays
have fb set (Juston)
v10 (Daniele): update PXP check again to match rework in earlier
patches and don't consider protection valid if the object has not
been used in an execbuf beforehand.
Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com> Cc: Huang Sean Z <sean.z.huang@intel.com> Cc: Gaurav Kumar <kumar.gaurav@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Juston Li <juston.li@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> #v9 Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210924191452.1539378-14-alan.previn.teres.alexis@intel.com
Huang, Sean Z [Fri, 24 Sep 2021 19:14:47 +0000 (12:14 -0700)]
drm/i915/pxp: Enable PXP power management
During the power event S3+ sleep/resume, hardware will lose all the
encryption keys for every hardware session, even though the
session state might still be marked as alive after resume. Therefore,
we should consider the session as dead on suspend and invalidate all the
objects. The session will be automatically restarted on the first
protected submission on resume.
v2: runtime suspend also invalidates the keys
v3: fix return codes, simplify rpm ops (Chris), use the new worker func
v4: invalidate the objects on suspend, don't re-create the arb sesson on
resume (delayed to first submission).
v5: move irq changes back to irq patch (Rodrigo)
v6: drop invalidation in runtime suspend (Rodrigo)
Signed-off-by: Huang, Sean Z <sean.z.huang@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210924191452.1539378-13-alan.previn.teres.alexis@intel.com
Now that we can handle destruction and re-creation of the arb session,
we can postpone the start of the session to the first submission that
requires it, to avoid keeping it running with no user.
v10: increase timeout when waiting in intel_pxp_start as firmware
session startup is slower right after boot.
v13: increase the same timeout by 50 milisec because previous timeout
was not enough to cover two lower level 100 milisec timeouts
in the session termination + creation steps.
drm/i915/pxp: interfaces for using protected objects
This api allow user mode to create protected buffers and to mark
contexts as making use of such objects. Only when using contexts
marked in such a way is the execution guaranteed to work as expected.
Contexts can only be marked as using protected content at creation time
(i.e. the parameter is immutable) and they must be both bannable and not
recoverable. Given that the protected session gets invalidated on
suspend, contexts created this way hold a runtime pm wakeref until
they're either destroyed or invalidated.
All protected objects and contexts will be considered invalid when the
PXP session is destroyed and all new submissions using them will be
rejected. All intel contexts within the invalidated gem contexts will be
marked banned. Userspace can detect that an invalidation has occurred via
the RESET_STATS ioctl, where we report it the same way as a ban due to a
hang.
v5: squash patches, rebase on proto_ctx, update kerneldoc
v6: rebase on obj create_ext changes
v7: Use session counter to check if an object it valid, hold wakeref in
context, don't add a new flag to RESET_STATS (Daniel)
v8: don't increase guilty count for contexts banned during pxp
invalidation (Rodrigo)
v9: better comments, avoid wakeref put race between pxp_inval and
context_close, add usage examples (Rodrigo)
v10: modify internal set/get-protected-context functions to not
return -ENODEV when setting PXP param to false or getting param
when running on pxp-unsupported hw or getting param when i915
was built with CONFIG_PXP off
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210924191452.1539378-11-alan.previn.teres.alexis@intel.com
Huang, Sean Z [Fri, 24 Sep 2021 19:14:44 +0000 (12:14 -0700)]
drm/i915/pxp: Implement PXP irq handler
The HW will generate a teardown interrupt when session termination is
required, which requires i915 to submit a terminating batch. Once the HW
is done with the termination it will generate another interrupt, at
which point it is safe to re-create the session.
Since the termination and re-creation flow is something we want to
trigger from the driver as well, use a common work function that can be
called both from the irq handler and from the driver set-up flows, which
has the addded benefit of allowing us to skip any extra locks because
the work itself serializes the operations.
v2: use struct completion instead of bool (Chris)
v3: drop locks, clean up functions and improve comments (Chris),
move to common work function.
v4: improve comments, simplify wait logic (Rodrigo)
v5: unconditionally set interrupts, rename state_attacked var (Rodrigo)
v10: remove inclusion of intel_gt_types.h from intel_pxp.h (Jani)
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: Huang, Sean Z <sean.z.huang@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210924191452.1539378-10-alan.previn.teres.alexis@intel.com
Huang, Sean Z [Fri, 24 Sep 2021 19:14:43 +0000 (12:14 -0700)]
drm/i915/pxp: Implement arb session teardown
Teardown is triggered when the display topology changes and no
long meets the secure playback requirement, and hardware trashes
all the encryption keys for display. Additionally, we want to emit a
teardown operation to make sure we're clean on boot and resume
v2: emit in the ring, use high prio request (Chris)
v3: better defines, stalling flush, cleaned up and renamed submission
funcs (Chris)
v12: fix uninitialized variable bug
Huang, Sean Z [Fri, 24 Sep 2021 19:14:42 +0000 (12:14 -0700)]
drm/i915/pxp: Create the arbitrary session after boot
Create the arbitrary session, with the fixed session id 0xf, after
system boot, for the case that application allocates the protected
buffer without establishing any protection session. Because the
hardware requires at least one alive session for protected buffer
creation. This arbitrary session will need to be re-created after
teardown or power event because hardware encryption key won't be
valid after such cases.
The session ID is exposed as part of the uapi so it can be used as part
of userspace commands.
v2: use gt->uncore->rpm (Chris)
v3: s/arb_is_in_play/arb_is_valid (Chris), move set-up to the new
init_hw function
v4: move interface defs to separate header, set arb_is valid to false
on fini (Rodrigo)
v5: handle async component binding
Signed-off-by: Huang, Sean Z <sean.z.huang@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210924191452.1539378-8-alan.previn.teres.alexis@intel.com
The setting is required by hardware to allow us doing further protection
operation such as sending commands to GPU or TEE. The register needs to
be re-programmed on resume, so for simplicitly we bundle the programming
with the component binding, which is automatically called on resume.
Further HW set-up operations will be added in the same location in
follow-up patches, so get ready for them by using a couple of
init/fini_hw wrappers instead of calling the KCR funcs directly.
v3: move programming to component binding function, rework commit msg
drm/i915/pxp: allocate a vcs context for pxp usage
The context is required to send the session termination commands to the
VCS, which will be implemented in a follow-up patch. We can also use the
presence of the context as a check of pxp initialization completion.
v2: use perma-pinned context (Chris)
v3: rename pinned_context functions (Chris)
v4: split export of pinned_context functions to a separate patch (Rodrigo)
v10: remove inclusion of intel_gt_types.h from intel_pxp.h (Jani)
v13: fixed for loop pointer dereference (Vinay)
Ahead of the PXP implementation, define the relevant define flag and
kconfig option.
v2: flip kconfig default to N. Some machines have IFWIs that do not
support PXP, so we need it to be an opt-in until we add support to query
the caps from the mei device.
v10: change comments from "Gen12+" to "Gen12 and newer"
This will be used for communication between the i915 driver and the mei
one. Defining it in a stand-alone patch to avoid circualr dependedencies
between the patches modifying the 2 drivers.
Split out from an original patch from Huang, Sean Z
Michal Wajdeczko [Sun, 26 Sep 2021 18:45:45 +0000 (20:45 +0200)]
drm/i915/guc: Move and improve error message for missed CTB reply
If we timeout waiting for a CT reply we print very simple error
message. Improve that and by moving error reporting to the caller
we can use CT_ERROR instead of DRM_ERROR and report just fence
as error code will be reported later anyway.
Michal Wajdeczko [Sun, 26 Sep 2021 18:45:42 +0000 (20:45 +0200)]
drm/i915/guc: Verify result from CTB (de)register action
In commit b839a869dfc9 ("drm/i915/guc: Add support for data
reporting in GuC responses") we missed the hypothetical case
that GuC might return positive non-zero value as success data.
While that would be lucky treated as error case, and at the
end will result in reporting valid -EIO, in the meantime this
value will be passed to ERR_PTR that could be misleading.
Michal Wajdeczko [Sun, 26 Sep 2021 20:10:05 +0000 (22:10 +0200)]
drm/i915: Use fixed offset for PTEs location
We assumed that for all modern GENs the PTEs and register space are
split in the GTTMMADR BAR, but while it is true, we should rather use
fixed offset as it is defined in the specification.
We may end up in i915_ttm_bo_destroy() in an error path before the
object is fully initialized. In that case it's not correct to call
__i915_gem_free_object(), because that function
a) Assumes the gem object refcount is 0, which it isn't.
b) frees the placements which are owned by the caller until the
init_object() region ops returns successfully. Fix this by providing
a lightweight cleanup function __i915_gem_object_fini() which is also
called by __i915_gem_free_object().
While doing this, also make sure we call dma_resv_fini() as part of
ordinary object destruction and not from the RCU callback that frees
the object. This will help track down bugs where the object is incorrectly
locked from an RCU lookup.
Finally, make sure the object isn't put on the region list until it's
either locked or fully initialized in order to block list processing of
partially initialized objects.
v2:
- The TTM object backend memory was freed before the gem pages were
put. Separate this functionality into __i915_gem_object_pages_fini()
and call it from the TTM delete_mem_notify() callback.
v3:
- Include i915_gem_object_free_mmaps() in __i915_gem_object_pages_fini()
to make sure we don't inadvertedly introduce a race.
Janusz Krzysztofik [Fri, 24 Sep 2021 16:38:25 +0000 (18:38 +0200)]
drm/i915: Flush buffer pools on driver remove
We currently do an explicit flush of the buffer pools within the call path
of drm_driver.release(); this removes all buffers, regardless of their age,
freeing the buffers' associated resources (objects, address space areas).
However there is other code that runs within the drm_driver.release() call
chain that expects objects and their associated address space areas have
already been flushed.
Since buffer pools auto-flush old buffers once per second in a worker
thread, there's a small window where if we remove the driver while there
are still objects in buffers with an age of less than one second, the
assumptions of the other release code may be violated.
By moving the flush to driver remove (which executes earlier via the
pci_driver.remove() flow) we're ensuring that all buffers are flushed and
their associated objects freed before some other code in
pci_driver.remove() flushes those objects so they are released before
_any_ code in drm_driver.release() that check completness of those
flushes executes.
v2: Reword commit description as suggested by Matt.
In commit 4e5c8a99e1cb ("drm/i915: Drop i915_request.lock requirement
for intel_rps_boost()"), we decoupled the rps worker from the pm so
that we could avoid the synchronization penalty which makes the
assertion liable to run too early. Which makes warning invalid hence
removed.
it looks THP + shmem_writeback was an unexpected combination, and ends up
hitting some BUG_ON, but it also looks like that is now fixed.
While the IGTs did eventually hit this(although not during pre-merge it
seems), it's likely worthwhile adding some explicit coverage for this
scenario in the shrink_thp selftest.
Matthew Auld [Tue, 21 Sep 2021 13:42:02 +0000 (14:42 +0100)]
drm/i915/request: fix early tracepoints
Currently we blow up in trace_dma_fence_init, when calling into
get_driver_name or get_timeline_name, since both the engine and context
might be NULL(or contain some garbage address) in the case of newly
allocated slab objects via the request ctor. Note that we also use
SLAB_TYPESAFE_BY_RCU here, which allows requests to be immediately
freed, but delay freeing the underlying page by an RCU grace period.
With this scheme requests can be re-allocated, at the same time as they
are also being read by some lockless RCU lookup mechanism.
In the ctor case, which is only called for new slab objects(i.e allocate
new page and call the ctor for each object) it's safe to reset the
context/engine prior to calling into dma_fence_init, since we can be
certain that no one is doing an RCU lookup which might depend on peeking
at the engine/context, like in active_engine(), since the object can't
yet be externally visible.
In the recycled case(which might also be externally visible) the request
refcount always transitions from 0->1 after we set the context/engine
etc, which should ensure it's valid to dereference the engine for
example, when doing an RCU list-walk, so long as we can also increment
the refcount first. If the refcount is already zero, then the request is
considered complete/released. If it's non-zero, then the request might
be in the process of being re-allocated, or potentially still in flight,
however after successfully incrementing the refcount, it's possible to
carefully inspect the request state, to determine if the request is
still what we were looking for. Note that all externally visible
requests returned to the cache must have zero refcount.
One possible fix then is to move dma_fence_init out from the request
ctor. Originally this was how it was done, but it was moved in:
drm/i915: Do not share hwsp across contexts any more, v8.
intel_timeline_get_seqno() could also be cleaned up slightly by dropping
the request argument.
Moving dma_fence_init back out of the ctor, should ensure we have enough
of the request initialised in case of trace_dma_fence_init.
Functionally this should be the same, and is effectively what we were
already open coding before, except now we also assign the fence->lock
and fence->ops, but since these are invariant for recycled
requests(which might be externally visible), and will therefore already
hold the same value, it shouldn't matter.
An alternative fix, since we don't yet have a fully initialised request
when in the ctor, is just setting the context/engine as NULL, but this
does require adding some extra handling in get_driver_name etc.
v2(Daniel):
- Try to make the commit message less confusing
Thomas Hellström [Wed, 22 Sep 2021 06:25:25 +0000 (08:25 +0200)]
drm/i915: Reduce the number of objects subject to memcpy recover
We really only need memcpy restore for objects that affect the
operability of the migrate context. That is, primarily the page-table
objects of the migrate VM.
Add an object flag, I915_BO_ALLOC_PM_EARLY for objects that need early
restores using memcpy and a way to assign LMEM page-table object flags
to be used by the vms.
Restore objects without this flag with the gpu blitter and only objects
carrying the flag using TTM memcpy.
Initially mark the migrate, gt, gtt and vgpu vms to use this flag, and
defer for a later audit which vms actually need it. Most importantly, user-
allocated vms with pinned page-table objects can be restored using the
blitter.
Performance-wise memcpy restore is probably as fast as gpu restore if not
faster, but using gpu restore will help tackling future restrictions in
mappable LMEM size.
v4:
- Don't mark the aliasing ppgtt page table flags for early resume, but
rather the ggtt page table flags as intended. (Matthew Auld)
- The check for user buffer objects during early resume is pointless, since
they are never marked I915_BO_ALLOC_PM_EARLY. (Matthew Auld)
v5:
- Mark GuC LMEM objects with I915_BO_ALLOC_PM_EARLY to have them restored
before we fire up the migrate context.
Thomas Hellström [Wed, 22 Sep 2021 06:25:24 +0000 (08:25 +0200)]
drm/i915: Don't back up pinned LMEM context images and rings during suspend
Pinned context images are now reset during resume. Don't back them up,
and assuming that rings can be assumed empty at suspend, don't back them
up either.
Introduce a new object flag, I915_BO_ALLOC_PM_VOLATILE meaning that an
object is allowed to lose its content on suspend.
Thomas Hellström [Wed, 22 Sep 2021 06:25:23 +0000 (08:25 +0200)]
drm/i915/gt: Register the migrate contexts with their engines
Pinned contexts, like the migrate contexts need reset after resume
since their context image may have been lost. Also the GuC needs to
register pinned contexts.
Add a list to struct intel_engine_cs where we add all pinned contexts on
creation, and traverse that list at resume time to reset the pinned
contexts.
This fixes the kms_pipe_crc_basic@suspend-read-crc-pipe-a selftest for now,
but proper LMEM backup / restore is needed for full suspend functionality.
However, note that even with full LMEM backup / restore it may be
desirable to keep the reset since backing up the migrate context images
must happen using memcpy() after the migrate context has become inactive,
and for performance- and other reasons we want to avoid memcpy() from
LMEM.
Also traverse the list at guc_init_lrc_mapping() calling
guc_kernel_context_pin() for the pinned contexts, like is already done
for the kernel context.
v2:
- Don't reset the contexts on each __engine_unpark() but rather at
resume time (Chris Wilson).
v3:
- Reset contexts in the engine sanitize callback. (Chris Wilson)
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Brost Matthew <matthew.brost@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-6-thomas.hellstrom@linux.intel.com
Thomas Hellström [Wed, 22 Sep 2021 06:25:22 +0000 (08:25 +0200)]
drm/i915 Implement LMEM backup and restore for suspend / resume
Just evict unpinned objects to system. For pinned LMEM objects,
make a backup system object and blit the contents to that.
Backup is performed in three steps,
1: Opportunistically evict evictable objects using the gpu blitter.
2: After gt idle, evict evictable objects using the gpu blitter. This will
be modified in an upcoming patch to backup pinned objects that are not used
by the blitter itself.
3: Backup remaining pinned objects using memcpy.
Also move uC suspend to after 2) to make sure we have a functional GuC
during 2) if using GuC submission.
v2:
- Major refactor to make sure gem_exec_suspend@hang-SX subtests work, and
suspend / resume works with a slightly modified GuC submission enabling
patch series.
v3:
- Fix a potential use-after-free (Matthew Auld)
- Use i915_gem_object_create_shmem() instead of
i915_gem_object_create_region (Matthew Auld)
- Minor simplifications (Matthew Auld)
- Fix up kerneldoc for i195_ttm_restore_region().
- Final lmem_suspend() call moved to i915_gem_backup_suspend from
i915_gem_suspend_late, since the latter gets called at driver unload
and we don't unnecessarily want to run it at that time.
v4:
- Interface change of ttm- & lmem suspend / resume functions to use
flags rather than bools. (Matthew Auld)
- Completely drop the i915_gem_backup_suspend change (Matthew Auld)
Thomas Hellström [Wed, 22 Sep 2021 06:25:21 +0000 (08:25 +0200)]
drm/i915/gt: Increase suspend timeout
With GuC submission on DG1, the execution of the requests times out
for the gem_exec_suspend igt test case after executing around 800-900
of 1000 submitted requests.
Given the time we allow elsewhere for fences to signal (in the order of
seconds), increase the timeout before we mark the gt wedged and proceed.
Thomas Hellström [Wed, 22 Sep 2021 06:25:20 +0000 (08:25 +0200)]
drm/i915/gem: Implement a function to process all gem objects of a region
An upcoming common pattern is to traverse the region object list and
perform certain actions on all objects in a region. It's a little tricky
to get the list locking right, in particular since a gem object may
change region unless it's pinned or the object lock is held.
Define a function that does this for us and that takes an argument that
defines the action to be performed on each object.
v3:
- Improve structure documentation a bit (Matthew Auld)
Thomas Hellström [Wed, 22 Sep 2021 06:25:19 +0000 (08:25 +0200)]
drm/i915/ttm: Implement a function to copy the contents of two TTM-based objects
When backing up or restoring contents of pinned objects at suspend /
resume time we need to allocate a new object as the backup. Add a function
to facilitate copies between the two. Some data needs to be copied before
the migration context is ready for operation, so make sure we can
disable accelerated copies.
v2:
- Fix a missing return value check (Matthew Auld)
drm/i915/guc, docs: Fix pdfdocs build error by removing nested grid
Nested grids in grid-table cells are not specified as proper ReST
constructs.
Commit 572f2a5cd974 ("drm/i915/guc: Update firmware to v62.0.0")
added a couple of kerneldoc tables of the form:
For "make htmldocs", they happen to work as one might expect,
but they are incompatible with "make latexdocs" and "make pdfdocs",
and cause the generated gpu.tex file to become incomplete and
unbuildable by xelatex.
Restore the compatibility by removing those nested grids in the tables.
Matt Roper [Thu, 23 Sep 2021 00:30:29 +0000 (17:30 -0700)]
drm/i915/uncore: fwtable read handlers are now used on all forcewake platforms
With the recent refactor of the uncore mmio handling, all
forcewake-based platforms (i.e., graphics version 6 and beyond) now use
the 'fwtable' read handlers. Let's pull the assignment out of the
per-platform if/else ladder to make this more obvious.
drm/i915/debugfs: Do not report currently active engine when describing objects
It is not very useful to have code which tries to report a rapidly
transient state which will not report anything majority of the time,
especially since it is currently only used from
<debugfs>/i915_gem_framebuffers.
We thought the DG2 table of shadowed registers would be the same as the
gen12/xehp table, but it turns out that there are a few minor
differences that require us to define a new DG2-specific table:
* One register is removed (0xC4D4)
* One register is added (0xC4E0)
Matt Roper [Fri, 10 Sep 2021 20:10:29 +0000 (13:10 -0700)]
drm/i915/uncore: Drop gen11 mmio read handlers
Consolidate down to just a single 'fwtable' implementation. For reads
we don't need to worry about shadow tables.
While consolidating the functions, gen11/gen12 pick up a
NEEDS_FORCE_WAKE() check that they didn't have before, allowing them to
bypass a lot of forcewake/shadow checking for non-GT registers (e.g.,
display).
Matt Roper [Fri, 10 Sep 2021 20:10:28 +0000 (13:10 -0700)]
drm/i915/uncore: Drop gen11/gen12 mmio write handlers
Now that the reference to the shadow table is stored within the uncore,
we don't need to generate separate fwtable, gen11_fwtable, and
gen12_fwtable variants of the register write functions; a single
'fwtable' implementation will work for all of those platforms now.
While consolidating the functions, gen11/gen12 pick up a
NEEDS_FORCE_WAKE() check that they didn't have before, allowing them to
bypass a lot of forcewake/shadow checking for non-GT registers (e.g.,
display). However since these later platforms also introduce media
engines at higher MMIO offsets, the definition of NEEDS_FORCE_WAKE() is
extended to also consider register offsets above GEN11_BSD_RING_BASE.
v2:
- Restore NEEDS_FORCE_WAKE(), but extend it for compatibility with the
gen11+ platforms by also passing offsets above GEN11_BSD_RING_BASE.
(Chris, Tvrtko)
Matt Roper [Fri, 10 Sep 2021 20:10:27 +0000 (13:10 -0700)]
drm/i915/uncore: Replace gen8 write functions with general fwtable
Now that we have both a standard forcewake table (albeit a single-entry
table) and the shadow table stored in the uncore, we can drop the
gen8-specific write handlers in favor of the general fwtable version.
Matt Roper [Fri, 10 Sep 2021 20:10:26 +0000 (13:10 -0700)]
drm/i915/uncore: Associate shadow table with uncore
Store a reference to a platform's shadow table inside the uncore, the
same as we do with the forcewake table. This will allow us to use a
single set of functions that operate on the shadow table reference
rather than generating lots of nearly-identical functions via macros
that differ only in terms of the table that they reference.
Matt Roper [Fri, 10 Sep 2021 20:10:25 +0000 (13:10 -0700)]
drm/i915/uncore: Convert gen6/gen7 read operations to fwtable
On gen6-gen8 (except vlv/chv) we don't use a forcewake lookup table; we
simply check whether the register offset is < 0x40000, and return
FORCEWAKE_RENDER if it is. To prepare for upcoming refactoring, let's
define a single-entry forcewake table from [0x0, 0x3ffff] and switch
these platforms over to use the fwtable reader functions.
v2:
- Drop __gen6_reg_read_fw_domains which is no longer used. (Tvrtko)
Matt Roper [Fri, 17 Sep 2021 16:12:03 +0000 (09:12 -0700)]
drm/i915: Check SFC fusing before recording/dumping SFC_DONE
On Xe_HP and beyond the SFC unit may be fused off, even if the
corresponding media engines are present. Check the SFC-specific fusing
before trying to dump the SFC_DONE instances.
Matt Roper [Fri, 17 Sep 2021 16:12:02 +0000 (09:12 -0700)]
drm/i915/xehp: Check new fuse bits for SFC availability
Xe_HP adds some new bits to the FUSE1 register to let us know whether a
given SFC unit is present. We should take this into account while
initializing SFC availability to our VCS and VECS engines.
While we're at it, update the FUSE1 register definition to use
REG_GENMASK / REG_FIELD_GET notation.
Note that, the bspec confusingly names the fuse bits "disable" despite
the register reflecting the *enable* status of the SFC units. The
original architecture documents which the bspec is based on do properly
name this field "SFC_ENABLE."
drm/i915/guc: put all guc objects in lmem when available
The firmware binary has to be loaded from lmem and the recommendation is
to put all other objects in there as well. Note that we don't fall back
to system memory if the allocation in lmem fails because all objects are
allocated during driver load and if we have issues with lmem at that point
something is seriously wrong with the system, so no point in trying to
handle it.
Cc: Matthew Auld <matthew.auld@intel.com> Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210916162819.27848-3-matthew.brost@intel.com
Support for multiple GT's within a single i915 device will be arriving
soon. Since each GT may have its own fusing and require different
workarounds, we need to make the GT workaround functions and multicast
steering setup per-gt.
Lucas De Marchi [Sat, 18 Sep 2021 02:57:54 +0000 (19:57 -0700)]
drm/i915: deduplicate frequency dump on debugfs
Although commit 9dd4b065446a ("drm/i915/gt: Move pm debug files into a
gt aware debugfs") says it was moving debug files to gt/, the
i915_frequency_info file was left behind and its implementation copied
into drivers/gpu/drm/i915/gt/debugfs_gt_pm.c. Over time we had several
patches having to change both places to keep them in sync (and some
patches failing to do so). The initial idea was to remove
i915_frequency_info, but there are user space tools using it. From a
quick code search there are other scripts and test tools besides igt, so
it's not simply updating igt to get rid of the older file.
Here we export a function using drm_printer as parameter and make
both show() implementations to call this same function. Aside from a few
variable name differences, for i915_frequency_info this brings a few
lines that were not previously printed: RP UP EI, RP UP THRESHOLD, RP
DOWN THRESHOLD and RP DOWN EI. These came in as part of
commit 9c878557b1eb ("drm/i915/gt: Use the RPM config register to
determine clk frequencies"), which didn't change both places.
Lucas De Marchi [Sat, 18 Sep 2021 02:57:53 +0000 (19:57 -0700)]
drm/i915: rename debugfs_gt_pm files
We shouldn't be using debugfs_ namespace for this functionality. Rename
debugfs_gt_pm.[ch] to intel_gt_pm_debugfs.[ch] and then make
functions, defines and structs follow suit.
Lucas De Marchi [Sat, 18 Sep 2021 02:57:52 +0000 (19:57 -0700)]
drm/i915: rename debugfs_engines files
We shouldn't be using debugfs_ namespace for this functionality. Rename
debugfs_engines.[ch] to intel_gt_engines_debugfs.[ch] and then make
functions, defines and structs follow suit.
Lucas De Marchi [Sat, 18 Sep 2021 02:57:51 +0000 (19:57 -0700)]
drm/i915: rename debugfs_gt files
We shouldn't be using debugfs_ namespace for this functionality. Rename
debugfs_gt.[ch] to intel_gt_debugfs.[ch] and then make functions,
defines and structs follow suit.
While at it and since we are renaming the header, sort the includes
alphabetically.
Fixing eviction to nest into ww_class_acquire is a high priority, but
it requires a rework of the entire driver, which can only be done one
step at a time.
As an intermediate solution, add an acquire context to
ww_mutex_trylock, which allows us to do proper nesting annotations on
the trylocks, making the above lockdep splat disappear.
This is also useful in regulator_lock_nested, which may avoid dropping
regulator_nesting_mutex in the uncontended path, so use it there.
TTM may be another user for this, where we could lock a buffer in a
fastpath with list locks held, without dropping all locks we hold.
Maarten Lankhorst [Mon, 30 Aug 2021 12:09:48 +0000 (14:09 +0200)]
drm/i915: Move __i915_gem_free_object to ttm_bo_destroy
When we implement delayed destroy, we may have a second
call to the delete_mem_notify() handler, while free_object()
only should be called once.
Move it to bo->destroy(), to ensure it's only called once.
This fixes some weird memory corruption issues with delayed
destroy when async eviction is used.
Janusz Krzysztofik [Fri, 3 Sep 2021 14:28:37 +0000 (16:28 +0200)]
drm/i915: Mark GPU wedging on driver unregister unrecoverable
GPU wedged flag now set on driver unregister to prevent from further
using the GPU can be then cleared unintentionally when calling
__intel_gt_unset_wedged() still before the flag is finally marked
unrecoverable. We need to have it marked unrecoverable earlier.
Implement that by replacing a call to intel_gt_set_wedged() in
intel_gt_driver_unregister() with intel_gt_set_wedged_on_fini().
With the above in place, intel_gt_set_wedged_on_fini() is now called
twice on driver remove, second time from __intel_gt_disable(). This
seems harmless, while dropping intel_gt_set_wedged_on_fini() from
__intel_gt_disable() proved to break some driver probe error unwind
paths as well as mock selftest exit path.
drm/i915: Add mmap lock around vma_lookup() in the mman selftest.
Add mmap_read_lock/unlock around vma_lookup(). The core code requires
this for lookups. Since we only check if the return value is NULL,
we can immediately unlock.
Lucas De Marchi [Sat, 4 Sep 2021 00:35:43 +0000 (17:35 -0700)]
drm/i915/xehpsdv: Define MOCS table for XeHP SDV
Like DG1, XeHP SDV doesn't have LLC/eDRAM control values due to being a
dgfx card. XeHP SDV adds 2 more bits: L3_GLBGO to "push the Go point to
memory for L3 destined transaction" and L3_LKP to "enable Lookup for
uncacheable accesses".
Bspec: 45101 Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210904003544.2422282-2-matthew.d.roper@intel.com
Nathan Chancellor [Tue, 24 Aug 2021 22:54:27 +0000 (15:54 -0700)]
drm/i915: Enable -Wsometimes-uninitialized
This warning helps catch uninitialized variables. It should have been
enabled at the same time as commit b2423184ac33 ("drm/i915: Enable
-Wuninitialized") but I did not realize they were disabled separately.
Enable it now that i915 is clean so that it stays that way.
Nathan Chancellor [Tue, 24 Aug 2021 22:54:26 +0000 (15:54 -0700)]
drm/i915/selftests: Always initialize err in igt_dmabuf_import_same_driver_lmem()
Clang warns:
drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c:127:13: warning:
variable 'err' is used uninitialized whenever 'if' condition is false
[-Wsometimes-uninitialized]
} else if (PTR_ERR(import) != -EOPNOTSUPP) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c:138:9: note:
uninitialized use occurs here
return err;
^~~
drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c:127:9: note: remove
the 'if' if its condition is always true
} else if (PTR_ERR(import) != -EOPNOTSUPP) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c:95:9: note:
initialize the variable 'err' to silence this warning
int err;
^
= 0
The test is expected to pass if i915_gem_prime_import() returns
-EOPNOTSUPP so initialize err to zero in this case.
Fixes: cdb35d1ed6d2 ("drm/i915/gem: Migrate to system at dma-buf attach time (v7)") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210824225427.2065517-3-nathan@kernel.org
Nathan Chancellor [Tue, 24 Aug 2021 22:54:25 +0000 (15:54 -0700)]
drm/i915/selftests: Do not use import_obj uninitialized
Clang warns a couple of times:
drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c:63:6: warning:
variable 'import_obj' is used uninitialized whenever 'if' condition is
true [-Wsometimes-uninitialized]
if (import != &obj->base) {
^~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c:80:22: note:
uninitialized use occurs here
i915_gem_object_put(import_obj);
^~~~~~~~~~
drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c:63:2: note: remove
the 'if' if its condition is always false
if (import != &obj->base) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c:38:46: note:
initialize the variable 'import_obj' to silence this warning
struct drm_i915_gem_object *obj, *import_obj;
^
= NULL
Shuffle the import_obj initialization above these if statements so that
it is not used uninitialized.
Fixes: d7b2cb380b3a ("drm/i915/gem: Correct the locking and pin pattern for dma-buf (v8)") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210824225427.2065517-2-nathan@kernel.org
Matthew Brost [Thu, 9 Sep 2021 16:47:44 +0000 (09:47 -0700)]
drm/i915/guc: Add GuC kernel doc
Add GuC kernel doc for all structures added thus far for GuC submission
and update the main GuC submission section with the new interface
details.
v2:
- Drop guc_active.lock DOC
v3:
- Fixup a few kernel doc comments (Daniele)
v4 (Daniele):
- Implement doc suggestions from John
- Add kerneldoc for all members of the GuC structure and pull the file
in i915.rst
v5 (Daniele):
- Implement new doc suggestions from John
Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-24-matthew.brost@intel.com
Matthew Brost [Thu, 9 Sep 2021 16:47:43 +0000 (09:47 -0700)]
drm/i915/guc: Drop guc_active move everything into guc_state
Now that we have locking hierarchy of sched_engine->lock ->
ce->guc_state everything from guc_active can be moved into guc_state and
protected the guc_state.lock.
Matthew Brost [Thu, 9 Sep 2021 16:47:41 +0000 (09:47 -0700)]
drm/i915/guc: Move GuC priority fields in context under guc_active
Move GuC management fields in context under guc_active struct as this is
where the lock that protects theses fields lives. Also only set guc_prio
field once during context init.
Matthew Brost [Thu, 9 Sep 2021 16:47:40 +0000 (09:47 -0700)]
drm/i915/guc: Drop pin count check trick between sched_disable and re-pin
Drop pin count check trick between a sched_disable and re-pin, now rely
on the lock and counter of the number of committed requests to determine
if scheduling should be disabled on the context.
Matthew Brost [Thu, 9 Sep 2021 16:47:38 +0000 (09:47 -0700)]
drm/i915/guc: Rework and simplify locking
Rework and simplify the locking with GuC subission. Drop
sched_state_no_lock and move all fields under the guc_state.sched_state
and protect all these fields with guc_state.lock . This requires
changing the locking hierarchy from guc_state.lock -> sched_engine.lock
to sched_engine.lock -> guc_state.lock.
v2:
(Daniele)
- Don't check fields outside of lock during sched disable, check less
fields within lock as some of the outside are no longer needed
Matthew Brost [Thu, 9 Sep 2021 16:47:36 +0000 (09:47 -0700)]
drm/i915/guc: Release submit fence from an irq_work
A subsequent patch will flip the locking hierarchy from
ce->guc_state.lock -> sched_engine->lock to sched_engine->lock ->
ce->guc_state.lock. As such we need to release the submit fence for a
request from an IRQ to break a lock inversion - i.e. the fence must be
release went holding ce->guc_state.lock and the releasing of the can
acquire sched_engine->lock.
v2:
(Daniele)
- Delete request from list before calling irq_work_queue
Matthew Brost [Thu, 9 Sep 2021 16:47:34 +0000 (09:47 -0700)]
drm/i915/guc: Don't touch guc_state.sched_state without a lock
Before we did some clever tricks to not use the a lock when touching
guc_state.sched_state in certain cases. Don't do that, enforce the use
of the lock.
v2:
(kernel test robo )
- Add __maybe_unused to sched_state_is_init()
v3: rebase after the unused code path removal has been moved to an
earlier patch.
Matthew Brost [Thu, 9 Sep 2021 16:47:32 +0000 (09:47 -0700)]
drm/i915/selftests: Add initial GuC selftest for scrubbing lost G2H
While debugging an issue with full GT resets I went down a rabbit hole
thinking the scrubbing of lost G2H wasn't working correctly. This proved
to be incorrect as this was working just fine but this chase inspired me
to write a selftest to prove that this works. This simple selftest
injects errors dropping various G2H and then issues a full GT reset
proving that the scrubbing of these G2H doesn't blow up.
v2:
(Daniel Vetter)
- Use ifdef instead of macros for selftests
v3:
(Checkpatch)
- A space after 'switch' statement
v4:
(Daniele)
- A comment saying GT won't idle if G2H are lost
Matthew Brost [Thu, 9 Sep 2021 16:47:31 +0000 (09:47 -0700)]
drm/i915/guc: Copy whole golden context, set engine state size of subset
When the GuC does a media reset, it copies a golden context state back
into the corrupted context's state. The address of the golden context
and the size of the engine state restore are passed in via the GuC ADS.
The i915 had a bug where it passed in the whole size of the golden
context, not the size of the engine state to restore resulting in a
memory corruption.
Also copy the entire golden context on init rather than just the engine
state that is restored.
v2 (Daniele): use defines to avoid duplicated const variables (John).
Fixes: 481d458caede ("drm/i915/guc: Add golden context to GuC ADS") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-11-matthew.brost@intel.com
Matthew Brost [Thu, 9 Sep 2021 16:47:29 +0000 (09:47 -0700)]
drm/i915/guc: Kick tasklet after queuing a request
Kick tasklet after queuing a request so it submitted in a timely manner.
Fixes: 3a4cdf1982f0 ("drm/i915/guc: Implement GuC context operations for new inteface") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-9-matthew.brost@intel.com
Matthew Brost [Thu, 9 Sep 2021 16:47:28 +0000 (09:47 -0700)]
Revert "drm/i915/gt: Propagate change in error status to children on unhold"
Propagating errors to dependent fences is broken and can lead to errors
from one client ending up in another. In commit 3761baae908a ("Revert
"drm/i915: Propagate errors on awaiting already signaled fences""), we
attempted to get rid of fence error propagation but missed the case
added in commit 8e9f84cf5cac ("drm/i915/gt: Propagate change in error
status to children on unhold"). Revert that one too. This error was
found by an up-and-coming selftest which triggers a reset during
request cancellation and verifies that subsequent requests complete
successfully.