Quick one today: you completed maintenance on a component (a memory DIMM, as in the example below) but keep receiving failure messages?
Well, try clearing the error messages once the maintenance is complete, then check whether the threshold is reached again. If it is, we may really need to replace the component.
How to do it? Easy:
ssh root@grepora01-ilom

-> show /SYS/MB/P0/D3
Expected:
[...]
fault_state = Faulted
[...]

-> set /SYS/MB/P0/D3 clear_fault_action=true
Are you sure you want to clear /SYS/MB/P0/D3 (y/n)? y

-> show /SYS/MB/P0/D3
[Expected]
/SYS/MB/P0/D3
  Targets:
    PRSNT
    SERVICE
  Properties:
    type = DIMM
    ipmi_name = MB/P0/D3
    fru_name = 16384MB DDR4 SDRAM DIMM
    fru_manufacturer = Samsung
    fru_part_number = %
    fru_rev_level = 01
    fru_serial_number = %
    fault_state = OK
    clear_fault_action = (none)
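If you want to check the result from a script instead of reading it by eye, here is a minimal sketch in Python. It only parses text that you have already captured from the `show` command. The helper names `parse_ilom_properties` and `fault_cleared` are made up for this example and are not part of ILOM, and the sample text is copied from the expected output above.

```python
import re


def parse_ilom_properties(show_output: str) -> dict:
    """Collect `name = value` pairs from captured ILOM `show` output.

    Hypothetical helper: it assumes one property per line, as in the
    output shown in this post.
    """
    props = {}
    for line in show_output.splitlines():
        match = re.match(r"^\s*(\w+)\s*=\s*(.*?)\s*$", line)
        if match:
            props[match.group(1)] = match.group(2)
    return props


def fault_cleared(show_output: str) -> bool:
    """Return True when the captured output reports fault_state = OK."""
    return parse_ilom_properties(show_output).get("fault_state") == "OK"


# Sample text copied from the expected output in this post.
sample = """
/SYS/MB/P0/D3
  Targets:
    PRSNT
    SERVICE
  Properties:
    type = DIMM
    ipmi_name = MB/P0/D3
    fru_name = 16384MB DDR4 SDRAM DIMM
    fru_manufacturer = Samsung
    fault_state = OK
    clear_fault_action = (none)
"""

print(fault_cleared(sample))  # prints True
```

How you capture the output (an SSH session log, copy and paste, and so on) is up to you; the sketch deliberately does not connect to the ILOM itself.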