Speaker
Naoyuki Fujita
(JAXA)
Description
In recent years, large and/or unknown storage failures have occurred on JAXA's supercomputer.
At HUF2015 we reported silent data corruption on the JAXA supercomputer file system.
This year we experienced two failures of the supercomputer file system, one HPSS cache disk failure, and a one-pair HPSS tape read error. Are these storage systems reliable? Should we change our system design and/or operation policy?