If we fail to recover the HW state upon resume (i.e. our attempt to
clear the wedged bit and reset during i915_gem_sanitize() fails), then
skip the HW restart inside i915_gem_init_hw(). The restart will instead
happen later, once the HW has been successfully unwedged and reset;
attempting to restore a wedged device upon resume is risky as the HW
is in an unknown state.

v2: Suppress the error message when detecting the already wedged HW.

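For illustration only, the intended control flow can be modelled outside the
driver roughly as follows. This is a standalone sketch, not the driver code:
struct fake_gpu, fake_init_hw() and fake_gem_init() are hypothetical stand-ins
for the i915 state, i915_gem_init_hw() and its caller, and the assumption that
the caller treats -EIO as non-fatal is inferred from the second hunk.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real i915 structures. */
struct fake_gpu {
	bool terminally_wedged; /* models i915_terminally_wedged() */
	bool ppgtt_init_fails;  /* simulates an angry GPU during HW restart */
	int hw_restarts;        /* number of HW restart attempts */
	int error_messages;     /* number of DRM_ERROR-style messages */
};

/* Models the first hunk: bail out with -EIO before touching wedged HW. */
static int fake_init_hw(struct fake_gpu *gpu)
{
	if (gpu->terminally_wedged)
		return -EIO;

	gpu->hw_restarts++;
	if (gpu->ppgtt_init_fails)
		return -EIO;

	return 0;
}

/*
 * Models the second hunk: on -EIO, only complain and declare the GPU
 * wedged if it was not already wedged, then carry on with ret = 0.
 */
static int fake_gem_init(struct fake_gpu *gpu)
{
	int ret = fake_init_hw(gpu);

	if (ret == -EIO) {
		if (!gpu->terminally_wedged) {
			gpu->error_messages++;
			gpu->terminally_wedged = true;
		}
		ret = 0;
	}

	return ret;
}
```

In this model an already wedged device is neither restarted nor reported
again, while a failure on a previously healthy device is still reported
once and marks the device as wedged.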
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103240
Testcase: igt/gem_eio/in-flight-suspend
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171015143725.27764-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
 
        init_unused_rings(dev_priv);
 
        BUG_ON(!dev_priv->kernel_context);
+       if (i915_terminally_wedged(&dev_priv->gpu_error)) {
+               ret = -EIO;
+               goto out;
+       }
 
        ret = i915_ppgtt_init_hw(dev_priv);
        if (ret) {
                 * wedged. But we only want to do this where the GPU is angry,
                 * for all other failure, such as an allocation failure, bail.
                 */
-               DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
-               i915_gem_set_wedged(dev_priv);
+               if (!i915_terminally_wedged(&dev_priv->gpu_error)) {
+                       DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
+                       i915_gem_set_wedged(dev_priv);
+               }
                ret = 0;
        }