From: Riana Tauro
Date: Mon, 7 Apr 2025 05:14:12 +0000 (+0530)
Subject: drm/xe: Add documentation for survivability mode
X-Git-Url: https://www.infradead.org/git/?a=commitdiff_plain;h=77052ab24590cb72598e31de4a7c29f99d51d201;p=users%2Fwilly%2Flinux.git

drm/xe: Add documentation for survivability mode

Add survivability mode document to pcode document as it is enabled
when pcode detects a failure.

v2: fix kernel-doc (Lucas)

Signed-off-by: Riana Tauro
Reviewed-by: Lucas De Marchi
Link: https://lore.kernel.org/r/20250407051414.1651616-3-riana.tauro@intel.com
Signed-off-by: Lucas De Marchi
---

diff --git a/Documentation/gpu/xe/xe_pcode.rst b/Documentation/gpu/xe/xe_pcode.rst
index d2e22cc45061..5937ef3599b0 100644
--- a/Documentation/gpu/xe/xe_pcode.rst
+++ b/Documentation/gpu/xe/xe_pcode.rst
@@ -12,3 +12,10 @@ Internal API
 
 .. kernel-doc:: drivers/gpu/drm/xe/xe_pcode.c
    :internal:
+
+==================
+Boot Survivability
+==================
+
+.. kernel-doc:: drivers/gpu/drm/xe/xe_survivability_mode.c
+   :doc: Xe Boot Survivability
diff --git a/drivers/gpu/drm/xe/xe_survivability_mode.c b/drivers/gpu/drm/xe/xe_survivability_mode.c
index cb813b337fd3..399c06890b0b 100644
--- a/drivers/gpu/drm/xe/xe_survivability_mode.c
+++ b/drivers/gpu/drm/xe/xe_survivability_mode.c
@@ -28,20 +28,32 @@
  * This is implemented by loading the driver with bare minimum (no drm card) to allow the firmware
  * to be flashed through mei and collect telemetry. The driver's probe flow is modified
  * such that it enters survivability mode when pcode initialization is incomplete and boot status
- * denotes a failure. The driver then populates the survivability_mode PCI sysfs indicating
- * survivability mode and provides additional information required for debug
+ * denotes a failure.
  *
- * KMD exposes below admin-only readable sysfs in survivability mode
+ * Survivability mode can also be entered manually using the survivability mode attribute available
+ * through configfs which is beneficial in several usecases. It can be used to address scenarios
+ * where pcode does not detect failure or for validation purposes. It can also be used in
+ * In-Field-Repair (IFR) to repair a single card without impacting the other cards in a node.
  *
- * device/survivability_mode: The presence of this file indicates that the card is in survivability
- *                            mode. Also, provides additional information on why the driver entered
- *                            survivability mode.
+ * Use below command enable survivability mode manually::
  *
- *                            Capability Information - Provides boot status
- *                            Postcode Information   - Provides information about the failure
- *                            Overflow Information   - Provides history of previous failures
- *                            Auxiliary Information  - Certain failures may have information in
- *                                                     addition to postcode information
+ *	# echo 1 > /sys/kernel/config/xe/0000:03:00.0/survivability_mode
+ *
+ * Refer :ref:`xe_configfs` for more details on how to use configfs
+ *
+ * Survivability mode is indicated by the below admin-only readable sysfs which provides additional
+ * debug information::
+ *
+ *	/sys/bus/pci/devices//surivability_mode
+ *
+ * Capability Information:
+ *	Provides boot status
+ * Postcode Information:
+ *	Provides information about the failure
+ * Overflow Information
+ *	Provides history of previous failures
+ * Auxiliary Information
+ *	Certain failures may have information in addition to postcode information
  */
 
 static u32 aux_history_offset(u32 reg_value)