The Research Computing Center hosts and maintains a number of storage systems. RCC users have access to persistent high-capacity storage, which can be shared among a research group or kept private to an individual user, and to high-performance storage for data that needs to be temporarily staged and accessed quickly.
Persistent and High-Capacity Storage
Persistent storage is accessible from all compute nodes on Midway1 and Midway2, and from outside the RCC compute environment through mechanisms such as mounting directories as network drives on your personal computer or accessing data through a Globus Online endpoint (at the time of this writing, Globus Online is supported on Midway1). RCC takes snapshots of all home directories (users' private storage space) at regular intervals so that lost or corrupted data can easily be recovered.
Visit the data sharing services page to learn about how RCC can help researchers customize data access, and visit the data management page to learn how RCC provides support for writing data management plans for grants and other sources of funding.
Each RCC user has a home directory for storing small, frequently used items such as source code, binaries, and scripts. By default, a home directory is only accessible by its owner and is suitable for storing files which do not need to be shared with others. The data in the home directory is accessible from Midway1 and Midway2 as well as remotely via different protocols. Please see our guide to data transfer for more details.
Principal investigators may request a project space for their research group. These directories are generally used for longer-term storage of data and files that are shared by members of a research group or project. They are accessible from all RCC compute systems (Midway1 and Midway2) as well as remotely.
High-Performance Scratch Space
Scratch space is hosted on RCC’s high-performance storage system and is intended for staging data that is required or generated by computational processes running on the cluster. Unlike home and project directories, scratch spaces are not shared between Midway1 and Midway2. Scratch space is neither snapshotted nor backed up and may be periodically purged. It is therefore the user's responsibility to ensure that any important data in scratch space is replicated to a persistent storage location such as a project or home directory.
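The replication step described above can be sketched in Python. This is a minimal illustration, not an RCC-provided tool: the function name `replicate` and the placeholder paths in the comment are hypothetical, and real scratch and project locations should be taken from RCC's own documentation.

```python
import shutil
from pathlib import Path


def replicate(src: Path, dst: Path) -> Path:
    """Copy the directory tree at `src` into `dst`, preserving file metadata."""
    if not src.is_dir():
        raise NotADirectoryError(f"source is not a directory: {src}")
    # dirs_exist_ok=True (Python 3.8+) allows repeated runs to refresh an
    # existing copy instead of failing because `dst` already exists.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst


# Hypothetical usage; both paths are placeholders, not real RCC paths:
# replicate(Path("/path/to/scratch/run01"), Path("/path/to/project/run01"))
```

Because the copy overwrites files that already exist at the destination, the function can be rerun at the end of each job. It does not delete files from the destination that have since been removed from scratch.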
Backup and Data Recovery
RCC maintains GPFS filesystem snapshots for quick and easy data recovery. In the event of catastrophic storage failure, archival tape backups can be used to recover data from persistent storage locations on Midway.
Automated snapshots of the home directories are available in case of accidental file deletion or other problems. Currently, snapshots are available for these time periods:
- 4 hourly snapshots
- 7 daily snapshots
- 4 weekly snapshots
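To make the retention schedule above concrete, the following Python sketch estimates which snapshot tier could still hold a version of a file from a given time in the past. The retention counts come from the list above. The assumption that snapshots are spaced exactly one hour, one day, and one week apart is ours, as is the function name `finest_tier`, so treat the result as a rough guide rather than a statement of RCC's actual schedule.

```python
from datetime import timedelta
from typing import Optional

# (tier name, assumed spacing between snapshots, number of snapshots kept)
# Counts are from the list above; the exact spacing is an assumption.
RETENTION = [
    ("hourly", timedelta(hours=1), 4),
    ("daily", timedelta(days=1), 7),
    ("weekly", timedelta(weeks=1), 4),
]


def finest_tier(age: timedelta) -> Optional[str]:
    """Return the finest-grained tier whose window reaches back `age`.

    Returns None when `age` exceeds every snapshot window.
    """
    for name, interval, count in RETENTION:
        if age <= interval * count:
            return name
    return None
```

Under these assumptions, a change made three hours ago falls within the hourly snapshots, one made two days ago within the daily snapshots, and anything older than roughly four weeks is outside the snapshot windows altogether.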
Backups are performed nightly to a tape system located in a different data center from the main storage system. These backups are meant to safeguard against events such as hardware failure or disasters that could result in the complete loss of RCC’s primary data center. During periods of high activity, the nightly tape backup may take longer than 24 hours to complete, so the tape backup may occasionally be a few days out of date. Users should rely on the snapshots described above to recover files, as tape backup is intended for disaster recovery only.