Since we inherited the context image setup from gen8, which needed a
per-bb workaround (for GPGPU), we have been submitting an empty per-bb
buffer on gen9. Now that we can skip adding the buffer to the context
image, remove the dangling per-bb. This slightly improves execution
latency, most notably on an idle engine.

References: https://bugs.freedesktop.org/show_bug.cgi?id=87725
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170921135444.27330-2-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
        return batch;
 }
 
-static u32 *gen9_init_perctx_bb(struct intel_engine_cs *engine, u32 *batch)
-{
-       *batch++ = MI_BATCH_BUFFER_END;
-
-       return batch;
-}
-
 #define CTX_WA_BB_OBJ_SIZE (PAGE_SIZE)
 
 static int lrc_setup_wa_ctx(struct intel_engine_cs *engine)
                return 0;
        case 9:
                wa_bb_fn[0] = gen9_init_indirectctx_bb;
-               wa_bb_fn[1] = gen9_init_perctx_bb;
+               wa_bb_fn[1] = NULL;
                break;
        case 8:
                wa_bb_fn[0] = gen8_init_indirectctx_bb;