Enables blend optimization for floating point RTs
This restores the workaround that was reverted in 
c358514ba8da
("Revert "drm/i915/icl: WaEnableFloatBlendOptimization"").
The revert was due to the register write seemingly not sticking,
but the HW team has confirmed that this is because the
register is WO and that the workaround is indeed required.
Here the wa is added with a mask of 0 since the register is WO.
References: https://hsdes.intel.com/resource/
1408134172
References: https://bugs.freedesktop.org/show_bug.cgi?id=107338
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Talha Nassar <talha.nassar@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1548983324-15344-4-git-send-email-talha.nassar@intel.com
 #define GEN6_RCS_PWR_FSM _MMIO(0x22ac)
 #define GEN9_RCS_FE_FSM2 _MMIO(0x22a4)
 
+#define GEN10_CACHE_MODE_SS                    _MMIO(0xe420)
+#define   FLOAT_BLEND_OPTIMIZATION_ENABLE      (1 << 4)
+
 /* Fuse readout registers for GT */
 #define HSW_PAVP_FUSE1                 _MMIO(0x911C)
 #define   HSW_F1_EU_DIS_SHIFT          16
 
        if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0))
                WA_SET_BIT_MASKED(GEN11_COMMON_SLICE_CHICKEN3,
                                  GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC);
+
+       /* WaEnableFloatBlendOptimization:icl */
+       wa_write_masked_or(wal,
+                          GEN10_CACHE_MODE_SS,
+                          0, /* write-only, so skip validation */
+                          _MASKED_BIT_ENABLE(FLOAT_BLEND_OPTIMIZATION_ENABLE));
 }
 
 void intel_engine_init_ctx_wa(struct intel_engine_cs *engine)