16–20 Oct 2017
KEK Building K03
Asia/Tokyo timezone

JAXA Site Presentation -Reliability of Data Management-

19 Oct 2017, 14:00
30m
Seminar Hall (KEK Building K03)

Speaker

Naoyuki Fujita (JAXA)

Description

In recent years, large and/or unknown storage failures have occurred on JAXA's supercomputer. At HUF2015 we reported silent data corruption on the JAXA supercomputer file system. This year we experienced two failures of the supercomputer file system, one HPSS cache disk failure, and a read error on one pair of HPSS tapes. Are these storage systems reliable? Should we change our system design and/or operation policy?