File Storage

Overview

Researchers almost always need to store large amounts of data.  This guide discusses the storage options available to researchers.

Two Types of Storage for Researchers

Research often requires storing large amounts of data, usually with very fast throughput.  Research IT provides two types of storage:

Large Scale Storage (LSS) 

The Large Scale Storage system is intended for storing large quantities of research data.  The cost is currently $40 per terabyte per year.  Research groups, colleges, or departments buy the amount of storage they need.  See this guide to LSS for additional details.  Although users can access LSS to transfer data to and from the cluster, LSS data is not accessible from the Nova compute nodes.  Hence, LSS cannot be used by compute jobs on the cluster.

High Performance Computing Storage

Several storage options are available to users on the Nova cluster, as described below.

Home Directories

Every user on the cluster has a home directory with the path /home/<username>, where <username> is the ISU NetID.  Linux requires every user to have a home directory for storing certain configuration files.  However, home directories have a maximum size of 10GB, which is not large enough to store research data.  Users should make every effort to store their data under the Work storage or ptmp locations.

Reducing Home Directory Storage on Nova:  Some applications keep data in a user's home directory by default, which can inadvertently fill it up.  See this guide for instructions on moving certain directories to your /work or /ptmp directory.  A common approach is sketched below.
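
As a minimal sketch, the usual approach is to move the offending directory to work storage and leave a symbolic link in its place.  The group and NetID placeholders below follow the document's convention and must be substituted with your own values:

    # Check overall home directory usage against the 10GB limit
    du -sh ~

    # Relocate an application cache (example: ~/.cache) to work storage
    # and leave a symlink behind so the application still finds it
    mv ~/.cache /work/<group>/<NetID>/cache
    ln -s /work/<group>/<NetID>/cache ~/.cache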

HPC Work Storage

Work storage is high-performance storage that is purchased by individual research groups.  Each group is assigned a data directory with the path /work/<group>, where <group> is the group name.  This storage is accessible over the high-speed network of the Nova HPC cluster and provides the best overall combination of performance, capacity, and data retention.  The group PIs determine who can access the data.  This is the standard storage option used by most HPC groups.

Purchasing Work Storage:  The cost of work storage is $130 per terabyte with snapshots, or $75 per terabyte without snapshots.  Snapshots allow files that were accidentally removed or changed to be recovered more easily.  The form for purchasing work storage is available here.

Finding Your Work Directory:  Each group has a directory shared by all members of the group (and only that group).  These directories are links under the directory /work.  Issue "ls /work/" to see all group directories.  Unless you are in the LAS group, the "cdw" command will cd to /work/<your_group_working_directory>/<NetID>.  ("cdw" will also create the directory <NetID> in /work/<your_group_working_directory> if it does not exist.)

To find your group, issue the "groups | grep nova" command.  Group names normally have the form its-hpc-nova-<NetID>, where <NetID> is the NetID of the PI for the group.  LAS users can cd to /work/LAS/<PI_name>-lab.  A typical first session is shown below.
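
Putting these commands together, a first session might look like the following:

    # Identify your Nova group(s); names normally look like its-hpc-nova-<NetID>
    groups | grep nova

    # List all group work directories
    ls /work/

    # cd to /work/<your_group_working_directory>/<NetID>,
    # creating the <NetID> directory if it does not already exist
    cdw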

Group working directories are available on all nodes, and any user in the group can create files there using the group's quota.  The group quota for this space is based on the group's shares and is displayed at login.  The usage and quota can also be found in the file /work/<your_group_working_directory>/group_storage_usage, which is updated once every hour.
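
For example, to view the hourly-updated usage report (substitute your own group directory):

    # Print the group's current usage and quota (updated hourly)
    cat /work/<your_group_working_directory>/group_storage_usage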

Work Directory Snapshots:  The group working directories are backed up daily using ZFS snapshots.  A snapshot contains the previous version of a file, or, in the case of a deleted file, its last version.  Snapshots make incremental backups possible, which is necessary because each server holds several million files.  Snapshots also help with filesystem consistency, since the filesystem admin can revert a file to its previous version.  The snapshots are copied to backup storage and deleted from the primary storage after 5 days.  Snapshots on the primary storage count against the group's allocation, which means that removing files frees up space only after 5 days.

ZFS on the storage servers uses automatic compression, so users can store more data in the same amount of physical space.  The compression ratio depends on the data.  Automatic compression not only saves space but also improves performance.  However, any file system performs worse when it is close to full.  For this reason, the group's PI will receive an email when the group's data usage exceeds 70% of its quota.  It is good practice to keep the file system less than 70% full.

HPC ptmp

The /ptmp file system is available to all HPC users at no cost.  This storage is intended as temporary high-performance storage used by computational jobs.  Users can create their own directory under /ptmp as needed.  Directories created under /ptmp are only intended to be kept for short periods of 6 to 8 weeks.  Note that one should use the -m flag when extracting tar files, so that the extracted files have the date they were extracted and not the date at which the author of the tar file created them (see the example below).
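
As a minimal sketch (the archive name is hypothetical):

    # Create a personal directory under /ptmp and work there
    mkdir -p /ptmp/$USER
    cd /ptmp/$USER

    # -m resets file modification times to the extraction time, so the
    # archive's original timestamps don't carry over to the extracted files
    tar -xmf ~/mydata.tar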

HPC classtmp

All students in a class registered to use the cluster are given a file storage directory with the path /work/classtmp/<username>, where <username> is the ISU NetID.  

Temporary Local Storage ($TMPDIR)

Computational jobs can take advantage of relatively fast but temporary storage available on compute nodes.  The amount available is typically between 1 and 1.5 terabytes.  The actual path of the storage is assigned to the variable $TMPDIR by the Slurm job scheduler on the cluster.    Note that the directory at $TMPDIR will disappear at the conclusion of your job.  Any data which is not copied out of $TMPDIR cannot be recovered after your job has finished.
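
A minimal sketch of a Slurm batch script that stages data through $TMPDIR; the resource requests, paths, and program name are all illustrative and must be replaced with your own:

    #!/bin/bash
    #SBATCH --time=01:00:00
    #SBATCH --nodes=1

    # Stage input into fast node-local storage (paths are illustrative)
    cp /work/<group>/<NetID>/input.dat $TMPDIR/
    cd $TMPDIR

    # Run the computation against the local copy (program name is hypothetical)
    ./my_program input.dat > output.dat

    # Copy results back before the job ends; $TMPDIR is removed afterwards
    cp output.dat /work/<group>/<NetID>/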

File Transfers

When transferring files to or from Nova storage, please use the Data Transfer Node, novadtn.its.iastate.edu.  All of the storage options are accessible there, and using it helps keep the head node from becoming overburdened.
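
For example, a file can be copied from your local machine to work storage through the DTN with standard tools such as scp or rsync (paths are illustrative):

    # Copy a local file to your work directory via the Data Transfer Node
    scp mydata.tar <NetID>@novadtn.its.iastate.edu:/work/<group>/<NetID>/

    # Or use rsync, which can resume interrupted transfers
    rsync -av mydata.tar <NetID>@novadtn.its.iastate.edu:/work/<group>/<NetID>/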