The optimization is available on Ivy Bridge and later, and is disabled by
default.  Enabling it helps certain workloads such as GLBenchmark TRex test.
No piglit regression.
v2
 - no need to save the register before suspend as init_clock_gating can
   correctly program it after resume
 - split IVB change to another commit
Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
 #define   ECO_GATING_CX_ONLY   (1<<3)
 #define   ECO_FLIP_DONE                (1<<0)
 
+#define CACHE_MODE_0_GEN7      0x7000 /* IVB+ */
+#define   HIZ_RAW_STALL_OPT_DISABLE (1<<2)
 #define CACHE_MODE_1           0x7004 /* IVB+ */
 #define   PIXEL_SUBSPAN_COLLECT_OPT_DISABLE (1<<6)
 
 
        I915_WRITE(GEN7_FF_THREAD_MODE,
                   I915_READ(GEN7_FF_THREAD_MODE) & ~GEN7_FF_VS_REF_CNT_FFME);
 
+       /* enable HiZ Raw Stall Optimization */
+       I915_WRITE(CACHE_MODE_0_GEN7,
+                  _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
+
        /* WaDisable4x2SubspanOptimization:hsw */
        I915_WRITE(CACHE_MODE_1,
                   _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));