# Verifying Installation

To verify whether the expected hardware is working with the i915 driver, check the display hardware connected to your system:

```bash
hwinfo --display
```

On SLES, if `hwinfo` is installed in `/usr/sbin` and not in the default user path, run it using the following command:

```bash
/usr/sbin/hwinfo --display
```

:::{dropdown} Example output for Intel® Data Center GPU Max 1550 (device ID 0x0BD5)

```bash
51: PCI 8c00.0: 0380 Display controller
  [Created at pci.386]
  Unique ID: JefI.QAjErpDk4H4
  Parent ID: juVd.xbjkZcxCQYD
  SysFS ID: /devices/pci0000:89/0000:89:02.0/0000:8a:00.0/0000:8b:01.0/0000:8c:00.0
  SysFS BusID: 0000:8c:00.0
  Hardware Class: graphics card
  Model: "Intel Display controller"
  Vendor: pci 0x8086 "Intel Corporation"
  Device: pci 0x0bd5
  SubVendor: pci 0x8086 "Intel Corporation"
  SubDevice: pci 0x0000
  Revision: 0x2f
  Driver: "i915"
  Driver Modules: "i915"
  Memory Range: 0x23fe7e000000-0x23fe7fffffff (ro,non-prefetchable)
  Memory Range: 0x236000000000-0x237fffffffff (ro,non-prefetchable)
  IRQ: 138 (447 events)
  Module Alias: "pci:v00008086d00000BD5sv00008086sd00000000bc03sc80i00"
  Driver Info #0:
    Driver Status: i915 is active
    Driver Activation Cmd: "modprobe i915"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #26 (PCI bridge)
```

:::

### Diagnosing the installed GPU using the XPU manager

The Intel® XPU Manager (Intel® XPUM) tool helps with system administration, GPU monitoring, diagnostics, and configuration for Intel Data Center GPUs. You can use it in full-featured mode with a RESTful API as well as via the simplified XPU System Management Interface (XPU-SMI) tool. The following examples present commands that can help you get more information about your GPU installation.

:::{dropdown} Getting information about the available GPU

```bash
$ xpu-smi discovery
+-----------+--------------------------------------------------------------------------------------+
| Device ID | Device Information                                                                   |
+-----------+--------------------------------------------------------------------------------------+
| 0         | Device Name: Intel(R) Data Center GPU Flex 170                                       |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | UUID: 00000000-0000-0000-6769-df256e271362                                           |
|           | PCI BDF Address: 0000:4d:00.0                                                        |
|           | DRM Device: /dev/dri/card1                                                           |
|           | Function Type: physical                                                              |
+-----------+--------------------------------------------------------------------------------------+
```

:::

:::{dropdown} Getting information about the available GPU, including installed driver and firmware versions

```bash
$ sudo xpu-smi discovery -d 0
+-----------+--------------------------------------------------------------------------------------+
| Device ID | Device Information                                                                   |
+-----------+--------------------------------------------------------------------------------------+
| 0         | Device Type: GPU                                                                     |
|           | Device Name: Intel(R) Data Center GPU Flex 170                                       |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | UUID: 00000000-0000-0000-6769-df256e271362                                           |
|           | Serial Number: LQAC13401787                                                          |
|           | Core Clock Rate: 2050 MHz                                                            |
|           | Stepping: C0                                                                         |
|           |                                                                                      |
|           | Driver Version: I915_23.4.15_PSB_230307.15                                           |
|           | Kernel Version: 5.15.0-47-generic                                                    |
|           | GFX Firmware Name: GFX                                                               |
|           | GFX Firmware Version: DG02_1.3267                                                    |
|           | GFX Firmware Status: normal                                                          |
|           | GFX Data Firmware Name: GFX_DATA                                                     |
|           | GFX Data Firmware Version: 0x46b                                                     |
|           | GFX PSC Firmware Name: GFX_PSCBIN                                                    |
|           | GFX PSC Firmware Version:                                                            |
|           | AMC Firmware Name: AMC                                                               |
|           | AMC Firmware Version:                                                                |
|           |                                                                                      |
|           | PCI BDF Address: 0000:4d:00.0                                                        |
|           | PCI Slot: J37 - Riser 1, Slot 1                                                      |
|           | PCIe Generation: 4                                                                   |
|           | PCIe Max Link Width: 16                                                              |
|           | OAM Socket ID:                                                                       |
|           |                                                                                      |
|           | Memory Physical Size: 14248.00 MiB                                                   |
|           | Max Mem Alloc Size: 4095.99 MiB                                                      |
|           | ECC State: enabled                                                                   |
|           | Number of Memory Channels: 2                                                         |
|           | Memory Bus Width: 128                                                                |
|           | Max Hardware Contexts: 65536                                                         |
|           | Max Command Queue Priority: 0                                                        |
|           |                                                                                      |
|           | Number of EUs: 512                                                                   |
|           | Number of Tiles: 1                                                                   |
|           | Number of Slices: 1                                                                  |
|           | Number of Sub Slices per Slice: 32                                                   |
|           | Number of Threads per EU: 8                                                          |
|           | Physical EU SIMD Width: 8                                                            |
|           | Number of Media Engines: 2                                                           |
|           | Number of Media Enhancement Engines: 2                                               |
|           |                                                                                      |
|           | Number of Xe Link ports:                                                             |
|           | Max Tx/Rx Speed per Xe Link port:                                                    |
|           | Number of Lanes per Xe Link port:                                                    |
+-----------+--------------------------------------------------------------------------------------+
```

:::

:::{dropdown} Enabling GPU telemetry

```bash
$ sudo xpu-smi stats -d 0
+-----------------------------+--------------------------------------------------------------------+
| Device ID                   | 0                                                                  |
+-----------------------------+--------------------------------------------------------------------+
| GPU Utilization (%)         | 0                                                                  |
| EU Array Active (%)         |                                                                    |
| EU Array Stall (%)          |                                                                    |
| EU Array Idle (%)           |                                                                    |
|                             |                                                                    |
| Compute Engine Util (%)     | 0; Engine 0: 0, Engine 1: 0, Engine 2: 0, Engine 3: 0              |
| Render Engine Util (%)      | 0; Engine 0: 0                                                     |
| Media Engine Util (%)       | 0                                                                  |
| Decoder Engine Util (%)     | Engine 0: 0, Engine 1: 0                                           |
| Encoder Engine Util (%)     | Engine 0: 0, Engine 1: 0                                           |
| Copy Engine Util (%)        | 0; Engine 0: 0                                                     |
| Media EM Engine Util (%)    | Engine 0: 0, Engine 1: 0                                           |
| 3D Engine Util (%)          |                                                                    |
+-----------------------------+--------------------------------------------------------------------+
| Reset                       |                                                                    |
| Programming Errors          |                                                                    |
| Driver Errors               |                                                                    |
| Cache Errors Correctable    |                                                                    |
| Cache Errors Uncorrectable  |                                                                    |
| Mem Errors Correctable      |                                                                    |
| Mem Errors Uncorrectable    |                                                                    |
+-----------------------------+--------------------------------------------------------------------+
| GPU Power (W)               | 44                                                                 |
| GPU Frequency (MHz)         | 2050                                                               |
| GPU Core Temperature (C)    | 40                                                                 |
| GPU Memory Temperature (C)  |                                                                    |
| GPU Memory Read (kB/s)      | 1346                                                               |
| GPU Memory Write (kB/s)     | 286                                                                |
| GPU Memory Bandwidth (%)    | 0                                                                  |
| GPU Memory Used (MiB)       | 26                                                                 |
| Xe Link Throughput (kB/s)   |                                                                    |
+-----------------------------+--------------------------------------------------------------------+
```

:::

For more information on Intel® XPUM, see [Intel® XPUM overview](https://github.com/intel/xpumanager/blob/master/README.md) or [XPU System Management Interface user guide](https://github.com/intel/xpumanager/blob/master/doc/smi_user_guide.md).

### Smoke testing the compute stack

Use the following command to smoke test the compute stack:

```bash
clinfo | head -n 5
```

Running the same command without `head` prints a multi-page summary of the GPGPU compute capabilities.

:::{dropdown} Example output

```bash
Number of platforms                               1
  Platform Name                                   Intel(R) OpenCL HD Graphics
  Platform Vendor                                 Intel(R) Corporation
  Platform Version                                OpenCL 3.0
  Platform Profile                                FULL_PROFILE
```

:::

### Smoke testing the media stack

Use the following command to smoke test the media stack for the Data Center GPU Flex series:

```bash
vainfo
```

Intel® Data Center GPU Max Series does not include codec capabilities, so the expected output lists only a minimal set of entry points. Intel® Data Center GPU Flex Series and client GPUs provide hardware codecs, so the `vainfo` output is expected to list many entry points. See the following examples for both GPU series.
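
To tell the two cases apart without reading the full listing, one option is to count the decode and encode entry points. The following is only a minimal sketch: `count_va_entrypoints` is a hypothetical helper name, not part of `vainfo`, and it assumes that decode entry points are reported as `VAEntrypointVLD` and that encode entry points start with `VAEntrypointEnc`, as in the example outputs in this section.

```bash
# Hypothetical helper (not shipped with vainfo): count decode and encode
# entry points in `vainfo` output read from standard input.
count_va_entrypoints() {
  awk '
    /VAEntrypointVLD/ { dec++ }   # assumed marker for decode entry points
    /VAEntrypointEnc/ { enc++ }   # matches EncSliceLP, EncPicture, and similar
    END { printf "decode=%d encode=%d\n", dec, enc }
  '
}

# Example usage on a system with the media stack installed:
#   vainfo 2>/dev/null | count_va_entrypoints
```

With the Max Series example output, which lists only `VAEntrypointVideoProc` and `VAEntrypointStats`, both counts are zero. Non-zero counts suggest that hardware codecs are exposed.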

:::{dropdown} Example output

Intel® Data Center GPU Max Series:

```bash
vainfo: VA-API version: 1.18 (libva 2.17.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 23.1.4 (12e141d)
vainfo: Supported profile and entrypoints
      VAProfileNone                   : VAEntrypointVideoProc
      VAProfileNone                   : VAEntrypointStats
```

Intel® Data Center GPU Flex Series:

```bash
vainfo: VA-API version: 1.18 (libva 2.17.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 23.1.4 (12e141d)
vainfo: Supported profile and entrypoints
      VAProfileNone                   : VAEntrypointVideoProc
      VAProfileNone                   : VAEntrypointStats
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSliceLP
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSliceLP
      VAProfileJPEGBaseline           : VAEntrypointVLD
      VAProfileJPEGBaseline           : VAEntrypointEncPicture
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointEncSliceLP
      VAProfileHEVCMain10             : VAEntrypointVLD
      VAProfileHEVCMain10             : VAEntrypointEncSliceLP
      VAProfileVP9Profile0            : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointEncSliceLP
      VAProfileVP9Profile1            : VAEntrypointVLD
      VAProfileVP9Profile1            : VAEntrypointEncSliceLP
      VAProfileVP9Profile2            : VAEntrypointVLD
      VAProfileVP9Profile2            : VAEntrypointEncSliceLP
      VAProfileVP9Profile3            : VAEntrypointVLD
      VAProfileVP9Profile3            : VAEntrypointEncSliceLP
      VAProfileHEVCMain12             : VAEntrypointVLD
      VAProfileHEVCMain422_10         : VAEntrypointVLD
      VAProfileHEVCMain422_12         : VAEntrypointVLD
      VAProfileHEVCMain444            : VAEntrypointVLD
      VAProfileHEVCMain444            : VAEntrypointEncSliceLP
      VAProfileHEVCMain444_10         : VAEntrypointVLD
      VAProfileHEVCMain444_10         : VAEntrypointEncSliceLP
      VAProfileHEVCMain444_12         : VAEntrypointVLD
      VAProfileHEVCSccMain            : VAEntrypointVLD
      VAProfileHEVCSccMain            : VAEntrypointEncSliceLP
      VAProfileHEVCSccMain10          : VAEntrypointVLD
      VAProfileHEVCSccMain10          : VAEntrypointEncSliceLP
      VAProfileHEVCSccMain444         : VAEntrypointVLD
      VAProfileHEVCSccMain444         : VAEntrypointEncSliceLP
      VAProfileAV1Profile0            : VAEntrypointVLD
      VAProfileAV1Profile0            : VAEntrypointEncSliceLP
      VAProfileHEVCSccMain444_10      : VAEntrypointVLD
      VAProfileHEVCSccMain444_10      : VAEntrypointEncSliceLP
```

:::

### Verifying the usage of the Vendor-Specific Extended Capability (VSEC) module

To access the full range of Intel® Data Center GPU Max Series telemetry features, you need to use the intel_vsec module instead of intel_pmt. The intel_vsec module supports Max telemetry features, while intel_pmt focuses on CPU telemetry.

To check whether the VSEC change is needed, review the output of the `xpu-smi discovery -d 0` command. If the serial number is missing or reported as unknown, the VSEC module may not be bound to the device. In that case, follow this procedure to check and change the kernel driver module in use.

1. Use the following command to check whether the intel_vsec module is loaded and associated with a PCI device.

   ```bash
   for d in 8086:09A7 8086:4F93 8086:4F95; do sudo lspci -k -d "$d"; done
   ```

   The correct output should look like the following example:

   ```
   05:00.0 Memory controller: Intel Corporation Device 09A7
           Kernel driver in use: intel-vsec
           Kernel modules: intel_vsec
   ```

   If intel_pmt is used as the kernel driver instead of intel-vsec, proceed to the next steps to change the kernel driver.

2. Install the *driverctl* tool:

   ::::{tab-set}
   :::{tab-item} RHEL

   ```bash
   sudo dnf install driverctl
   ```

   :::
   :::{tab-item} SLES

   A driverctl package is not available for SUSE Linux Enterprise Server 15. Instead, install it from the driverctl repository.

   ```bash
   git clone https://gitlab.com/driverctl/driverctl.git
   cd driverctl
   sudo make install
   ```

   :::
   :::{tab-item} Ubuntu

   ```bash
   sudo apt install driverctl
   ```

   :::
   ::::

3. Check which device the intel-pmt module is linked to.

   ```bash
   sudo driverctl list-devices | grep -iE "pmt"
   ```

   The expected output is `0000:8e:00.1 intel-pmt`, but the device address on your system may differ from 0000:8e:00.1.

4. Override the default driver binding, using the device address retrieved on your system.

   ```bash
   sudo driverctl set-override 0000:8e:00.1 "intel_vsec"
   ```

## Verifying Integrated Firmware Image (IFWI)

Use the Intel® XPUM tool to flash IFWI onto a Flex or Max GPU.

1. Check the GFX firmware version for each GPU.

   ```bash
   sudo xpu-smi discovery -d 0
   sudo xpu-smi discovery -d 1
   ```

2. Check the latest firmware version for your hardware on your Intel or OEM portal and compare it with the version currently installed on your device. If the latest firmware version is newer than the one on your device, install the new firmware.

   ```bash
   sudo xpu-smi updatefw -d 0 -t GFX -f /home/intel/ATS_M75_128_B0_PVT_ES_017_gfx_fwupdate_SOC2.bin -y
   sudo xpu-smi updatefw -d 0 -t GFX_PSCBIN -f /home/test/PVC_Tuscany_oam_cbb_otf_53G_220803.pscbin
   sudo xpu-smi updatefw -d 0 -t GFX -f /home/test/PVC.Fwupdate_Prod_2023.WW26.3_Tuscany_Pcie.bin
   ```

3. Review the available firmware update options.

   ```bash
   sudo xpu-smi updatefw
   Update GPU firmware

   Usage: xpu-smi updatefw [Options]
     xpu-smi updatefw -d [deviceId] -t GFX -f [imageFilePath]
     xpu-smi updatefw -d [pciBdfAddress] -t GFX -f [imageFilePath]

   Options:
     -h,--help                   Print this help message and exit
     -j,--json                   Print result in JSON format
     -d,--device                 The device ID or PCI BDF address
     -t,--type                   The firmware name. Valid options: GFX, GFX_DATA, GFX_CODE_DATA, GFX_PSCBIN, AMC. AMC firmware update just works on Intel M50CYP server (BMC firmware version is 2.82 or newer) and Supermicro SYS-620C-TN12R server (BMC firmware version is 11.01 or newer).
     -f,--file                   The firmware image file path on this server
     -u,--username               Username used to authenticate for host redfish access
     -p,--password               Password used to authenticate for host redfish access
     -y,--assumeyes              Assume that the answer to any question which would be asked is yes
     --force                     Force GFX firmware update. This parameter only works for GFX firmware.
   ```
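
The version comparison in step 2 relies on reading individual fields from the `xpu-smi discovery -d <id>` table. As a minimal sketch of how that could be scripted, the hypothetical helper below (`xpu_field` is an illustrative name, not an xpu-smi feature) extracts one field value from the table text on standard input. It assumes the two-column `| Device ID | Device Information |` layout shown in the example output earlier on this page.

```bash
# Hypothetical helper (not part of xpu-smi): print the value of a single field
# from `xpu-smi discovery -d <id>` table output read from standard input.
xpu_field() {
  local name="$1"
  awk -F'|' -v name="$name" '
    {
      cell = $3                                  # "Device Information" column
      gsub(/^[ \t]+|[ \t]+$/, "", cell)          # trim surrounding whitespace
      prefix = name ":"
      if (index(cell, prefix) == 1) {            # cell starts with "<name>:"
        value = substr(cell, length(prefix) + 1)
        gsub(/^[ \t]+/, "", value)
        print value                              # empty if the field has no value
        exit
      }
    }'
}

# Example usage:
#   sudo xpu-smi discovery -d 0 | xpu_field "GFX Firmware Version"
#   sudo xpu-smi discovery -d 0 | xpu_field "Serial Number"
```

Matching on the start of the cell keeps `GFX Firmware Version` from also matching `GFX Data Firmware Version`. The same helper can be used to check the serial number mentioned in the VSEC section.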