Shared RAM ECC RAS Test
Overview
The RD Fremont platform supports Shared RAM that is shared between the AP, MCP, SCP and RSS. The Shared RAM is protected with SECDED (Single Error Correct, Double Error Detect) ECC. The platform defines ECC RAS registers to log any ECC error that occurs during a Shared RAM access from a master (AP, SCP, MCP or RSS). Four sets of ECC RAS registers are defined for each master, and an error is logged in the set that matches the master's PAS (Physical Address Space). The Shared RAM ECC RAS register sets are listed below:
AP Secure RAM ECC RAS registers
AP Non-Secure RAM ECC RAS registers
AP Realm RAM ECC RAS registers
AP Root RAM ECC RAS registers
SCP Secure RAM ECC RAS registers
SCP Non-Secure RAM ECC RAS registers
SCP Realm RAM ECC RAS registers
SCP Root RAM ECC RAS registers
MCP Secure RAM ECC RAS registers
MCP Non-Secure RAM ECC RAS registers
MCP Realm RAM ECC RAS registers
MCP Root RAM ECC RAS registers
For instance, any error that occurs during an SRAM access from the AP while the AP is executing in root PAS is logged into the "AP Root RAM ECC RAS registers". This document demonstrates error logging for a 1-bit CE that occurs during an SRAM access from the AP when executing in root PAS.
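The master-to-register-set mapping described above can be sketched as a small lookup table. This is an illustrative sketch only: the enum names and the helpers `ras_set_index` and `ras_set_name` are hypothetical and are not taken from the platform firmware. Only the twelve register set names come from the list above.

```c
#include <stddef.h>

/* Masters and PAS values, in the same order as the register list above. */
enum sram_master { MASTER_AP, MASTER_SCP, MASTER_MCP, MASTER_COUNT };
enum sram_pas { PAS_SECURE, PAS_NON_SECURE, PAS_REALM, PAS_ROOT, PAS_COUNT };

/* Four register sets per master, one per PAS. */
static const char *const ras_set_names[MASTER_COUNT * PAS_COUNT] = {
    "AP Secure RAM ECC RAS registers",
    "AP Non-Secure RAM ECC RAS registers",
    "AP Realm RAM ECC RAS registers",
    "AP Root RAM ECC RAS registers",
    "SCP Secure RAM ECC RAS registers",
    "SCP Non-Secure RAM ECC RAS registers",
    "SCP Realm RAM ECC RAS registers",
    "SCP Root RAM ECC RAS registers",
    "MCP Secure RAM ECC RAS registers",
    "MCP Non-Secure RAM ECC RAS registers",
    "MCP Realm RAM ECC RAS registers",
    "MCP Root RAM ECC RAS registers",
};

/* Hypothetical helper: flat table index for (master, PAS), or -1 if invalid. */
static int ras_set_index(enum sram_master master, enum sram_pas pas)
{
    if ((unsigned int)master >= MASTER_COUNT || (unsigned int)pas >= PAS_COUNT)
        return -1;
    return (int)master * PAS_COUNT + (int)pas;
}

/* Hypothetical helper: name of the register set that logs the error. */
static const char *ras_set_name(enum sram_master master, enum sram_pas pas)
{
    int idx = ras_set_index(master, pas);
    return (idx < 0) ? NULL : ras_set_names[idx];
}
```

With this layout, `ras_set_name(MASTER_AP, PAS_ROOT)` selects the "AP Root RAM ECC RAS registers" entry, which is the case exercised by this test.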
Note
This test is only supported on the RD-Fremont-Cfg1 platform. The test is limited to error logging at EL3 and does not involve the Host OS, as explained in the section "Firmware First Error Handling" of the RAS document.
1-bit CE error injection on Shared RAM
Each ECC RAS register set implements the SRAMECC_ERRMISC1 register, which provides a way to inject a Corrected Error (CE) or an Uncorrected Error (UE) into the Shared RAM. The error injection only takes effect if the register programming is followed by a read access to the Shared RAM. If the injection is successful, the error records pertaining to the master and the respective access are populated with error information, and an error interrupt is delivered to the master.
The detailed software sequence to inject a 1-bit CE into the Shared RAM from the AP executing in root PAS is as follows:
- Add memory map for the Shared RAM ECC RAS registers memory space.
- Add memory map for the Shared memory space.
- Program the SRAMECC_ERRMISC1 register to inject CE.

  mmio_write_32((AP_RT_RAM_ECC_RAS_BASE + SRAM_ERR_MISC1_OFFSET),
                SRAM_INJECT_ERROR_CE);
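As noted earlier, the injection only takes effect when the register write is followed by a read access to the Shared RAM. A minimal sketch of that write-then-read ordering is shown here. The values given to `SRAM_ERR_MISC1_OFFSET` and `SRAM_INJECT_ERROR_CE` are placeholders, not the real platform encodings, the two `mmio_*` functions are local stand-ins for the firmware's MMIO accessors, and `sram_inject_ce` is a hypothetical helper name.

```c
#include <stdint.h>

/*
 * PLACEHOLDER values for illustration only. The real register offset and
 * injection encoding are defined by the platform and are not given here.
 */
#define SRAM_ERR_MISC1_OFFSET  0x0u
#define SRAM_INJECT_ERROR_CE   0x1u

/* Local stand-ins for the firmware MMIO accessors. */
static inline void mmio_write_32(uintptr_t addr, uint32_t value)
{
    *(volatile uint32_t *)addr = value;
}

static inline uint32_t mmio_read_32(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;
}

/*
 * Hypothetical helper: arm CE injection in the given ECC RAS register set,
 * then read one word of Shared RAM so that the injection takes effect.
 * Both regions are assumed to be mapped already.
 */
static uint32_t sram_inject_ce(uintptr_t ras_base, uintptr_t sram_addr)
{
    /* Step 1: program SRAMECC_ERRMISC1 to request a corrected error. */
    mmio_write_32(ras_base + SRAM_ERR_MISC1_OFFSET, SRAM_INJECT_ERROR_CE);

    /* Step 2: the read access to Shared RAM that triggers the error. */
    return mmio_read_32(sram_addr);
}
```

In firmware, `ras_base` would be `AP_RT_RAM_ECC_RAS_BASE` and `sram_addr` an address inside the mapped Shared RAM. Because a CE is corrected by the hardware, the read is expected to return valid data while the error is logged in the RAS record.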
Download the platform software
Skip this section if the required sources have been downloaded.
To obtain the required sources for the platform, follow the steps listed on the Setup Workspace page. Ensure that the platform software is downloaded before proceeding with the steps listed below. Also note the host machine requirements listed on that page, which are essential to build and execute the platform software stack.
Procedure to perform 1-bit CE injection and handling on Shared RAM
Boot up to Busybox
Refer to the Busybox Boot or Buildroot Boot page to build the reference design platform software stack and boot into busybox on the Neoverse RD FVP.
Shared RAM error handling test
Run the commands below to inject a 1-bit CE into the Shared RAM. This test uses the EINJ ACPI table to perform error injection. Shared RAM is not a standard error type defined in the EINJ ACPI table, so the vendor-defined error type is used. Bit 31 of the error_type field indicates a vendor-defined error type. Use the error_type value 0x8002_0000 to represent Shared RAM errors.
mount -t debugfs none /sys/kernel/debug    # Needed for buildroot
echo 0x80020000 > /sys/kernel/debug/apei/einj/error_type
echo 1 > /sys/kernel/debug/apei/einj/oem-einj/sel-firmware-first
echo 1 > /sys/kernel/debug/apei/einj/oem-einj/sel-component
echo 1 > /sys/kernel/debug/apei/einj/oem-einj/sel-error-type
echo 1 > /sys/kernel/debug/apei/einj/error_inject
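The error_type value written above can be read as the vendor-defined flag (bit 31) combined with a platform-specific value. The sketch below only illustrates that bit arithmetic. The macro and function names are hypothetical, and the meaning of the remaining bits (0x0002_0000) is platform-specific and is not decoded here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 31 of the EINJ error_type field marks a vendor-defined error type. */
#define EINJ_VENDOR_DEFINED_BIT  (UINT32_C(1) << 31)

/* Value used in this test to represent Shared RAM errors. */
#define SHARED_RAM_ERROR_TYPE    UINT32_C(0x80020000)

/* Hypothetical helper: true if the error type is vendor-defined. */
static bool einj_is_vendor_type(uint32_t error_type)
{
    return (error_type & EINJ_VENDOR_DEFINED_BIT) != 0;
}

/* Hypothetical helper: the error type with the vendor-defined flag cleared. */
static uint32_t einj_vendor_payload(uint32_t error_type)
{
    return error_type & ~EINJ_VENDOR_DEFINED_BIT;
}
```

For `SHARED_RAM_ERROR_TYPE`, the flag is set and the remaining value is 0x0002_0000.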
Shared RAM error handling happens in firmware-first mode. The EL3 firmware receives the fault handling interrupt (FHI) for the detected corrected error and logs the error on the secure console.
INFO: SGI: Base element RAM interrupt [85] handler
INFO: ErrStatus = 0x86000000
INFO: ErrAddr = 0x19100
Copyright (c) 2024, Arm Limited. All rights reserved.