Monitoring/Memory

From Wikitech

Memory correctable errors -EDAC-

There seem to be a few different "classes" or "states" of EDAC errors:

  1. high count of correctable errors (CEs) and recent kernel messages
  2. low count of CEs and no recent kernel messages
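
To decide which class a host falls into, the CE counters can be read from sysfs. The following is a minimal sketch, assuming the host exposes the standard EDAC sysfs tree under /sys/devices/system/edac/mc; the function name edac_ce_total is hypothetical and not part of any existing tooling. What counts as a "high" or "low" total is left to the operator.

```shell
# Hypothetical helper: sum the correctable-error (CE) counts of all
# memory controllers. The optional first argument overrides the sysfs
# root (assumed default: /sys/devices/system/edac/mc).
edac_ce_total() {
    root="${1:-/sys/devices/system/edac/mc}"
    total=0
    for f in "$root"/mc*/ce_count; do
        # Skip the unexpanded glob when no memory controller is present.
        [ -r "$f" ] || continue
        count=$(cat "$f")
        total=$((total + count))
    done
    echo "$total"
}
```

Running edac_ce_total with no argument prints the current total for the host. Recent kernel messages can be checked separately with something like: sudo dmesg | grep -i edac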

For class #1, the course of action is to file tasks for the service owners to diagnose the memory.

For class #2, reset the EDAC counters to probe for recurrences, i.e.:

sudo rmmod sb_edac edac_core ; sudo modprobe -a sb_edac edac_core
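
Since reloading the modules clears the counters, it may help to record the per-controller values before the reset and to check them again some time afterwards. This is a sketch under the same assumption about the sysfs layout (/sys/devices/system/edac/mc); edac_ce_snapshot is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: print one "<controller> <ce_count>" line per
# memory controller, e.g. "mc0 12". The optional first argument
# overrides the sysfs root (assumed default: /sys/devices/system/edac/mc).
edac_ce_snapshot() {
    root="${1:-/sys/devices/system/edac/mc}"
    for d in "$root"/mc*; do
        # Skip entries without a readable counter (or an unmatched glob).
        [ -r "$d/ce_count" ] || continue
        echo "$(basename "$d") $(cat "$d/ce_count")"
    done
}
```

The output can be redirected to a file before the reset and compared with a later run; non-zero counts after the reset suggest the errors are recurring.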

This post was taken from a comment on T183177