Some testing environments and some heavier tests are slower than the
previous timeout limits allowed for. For example, it can take multiple seconds
for the 'context has been reset' notification handler to reach the
'kill the requests' code in the 'active' version of the 'reset
engines' test. During that time, the selftest gets bored, gives up
waiting and fails the test.

There is also an async thread that the selftest uses to pump work
through the hardware in parallel to the context that is marked for
reset. That thread could also get bored waiting for completions and
kill the test off.

Lastly, the flush at the end of various test sections can also see
timeouts due to the large amount of work backed up. This is also true
of the live_hwsp_read test.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-32-matthew.brost@intel.com
        if (!rq)
                return 0;
 
-       if (i915_request_wait(rq, 0, 5 * HZ) < 0) {
+       if (i915_request_wait(rq, 0, 10 * HZ) < 0) {
                GEM_TRACE("%s timed out waiting for completion of fence %llx:%lld\n",
                          rq->engine->name,
                          rq->fence.context,
 
 
        cond_resched();
 
-       if (intel_gt_wait_for_idle(gt, HZ / 5) == -ETIME) {
+       if (intel_gt_wait_for_idle(gt, HZ) == -ETIME) {
                pr_err("%pS timed out, cancelling all further testing.\n",
                       __builtin_return_address(0));
 
 
 
 #define REDUCED_TIMESLICE      5
 #define REDUCED_PREEMPT                10
-#define WAIT_FOR_RESET_TIME    1000
+#define WAIT_FOR_RESET_TIME    10000
 
 int intel_selftest_modify_policy(struct intel_engine_cs *engine,
                                 struct intel_selftest_saved_policy *saved,