Transferring Files

Methods for Transferring Files

There are several methods available for transferring files to or from the cluster:

scp - A widely available command line utility for copying files securely over SSH. WinSCP is a Windows version that also includes a graphical interface.

rsync - Rsync is a very versatile file copying utility that is often used for synchronizing files between two locations.  It is a command line tool with syntax very similar to scp but with additional features that make it easier to keep files synchronized.

Globus - Globus is a file transfer service that is highly recommended when transferring large data sets between Nova and other registered Globus endpoints. 

Rclone - This is a utility that can be used to copy data to several cloud-based storage solutions, including Google Drive, Apple Drive, Box, and Dropbox.

Important Things to Keep in Mind:

  • You should always use your work folder instead of your home directory for storing any significant amount of data on the cluster. (Your work folder is usually a folder under the /work directory with the same name as your primary group. To find out the name of your primary group, run the my_primary_group command.)
  • When transferring data to Nova, be sure to connect to the data transfer server called novadtn.its.iastate.edu instead of the head node (nova.its.iastate.edu).

SCP

scp is a command line tool for transferring small to moderate amounts of data.  It is widely available from a command line shell on Linux and macOS, and on Windows within PowerShell or WSL.  In the following example, the user rockyb copies the file file1.txt from the local computer to their Nova work folder /work/ajohnson/rockyb:

      $  scp file1.txt  rockyb@novadtn.its.iastate.edu:/work/ajohnson/rockyb

This can also work for transferring entire directories by adding the -r flag (for recursive).  To copy a directory called testfiles (and everything within it) to your work folder you can do something like this:

      $  scp -r /home/rockyb/testfiles rockyb@novadtn.its.iastate.edu:/work/ajohnson/rockyb
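
The examples above copy data to the cluster. Copying in the other direction only swaps the source and destination arguments. The sketch below assumes the same user (rockyb), work folder, and file names as the examples above. The leading echo prints each command instead of running it, so remove it once the command looks right.

```shell
# Assumed values taken from the examples above; substitute your own NetID and group.
remote="rockyb@novadtn.its.iastate.edu"
workdir="/work/ajohnson/rockyb"

# Copy one file from the Nova work folder into the current local directory.
# "echo" only prints the command; delete it to perform the transfer.
echo scp "${remote}:${workdir}/file1.txt" .

# Copy the whole testfiles directory back, recursively.
echo scp -r "${remote}:${workdir}/testfiles" .
```

The final "." is the destination and means the current directory on the local computer.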

To find out your primary group on Nova, run the  my_primary_group command.

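It is also possible to run ftp from Nova to reach an outside FTP server. This can be faster than scp for large files because the data is not encrypted. (The clusters themselves do not accept incoming ftp connections.)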

For older Microsoft Windows machines, ssh is available from the ITS downloads for Windows, under the link for Secure Shell.

Rsync

Rsync is an extremely versatile tool for copying files.  It works much like scp but differs in one key aspect: if a file in the source path has already been copied to the destination and has not changed since, rsync will not copy it again. This makes it ideal for keeping two file repositories in sync.  In the following example, rsync is used to synchronize the directory /home/rockyb/development to the /work/ajohnson/rockyb/development directory located on novadtn.its.iastate.edu:

     $  rsync -av /home/rockyb/development novadtn.its.iastate.edu:/work/ajohnson/rockyb
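
Two details of rsync are worth a short sketch: the -n (--dry-run) option, which previews a transfer, and the effect of a trailing slash on the source path. The paths below are assumptions that reuse the work folder from the scp examples. Each command is prefixed with echo so that it is only printed; remove the echo to run it.

```shell
# Assumed values; substitute your own paths and group.
src="/home/rockyb/development"
dest="novadtn.its.iastate.edu:/work/ajohnson/rockyb"

# Preview: -n (--dry-run) lists what would be transferred without copying anything.
echo rsync -avn "$src" "$dest"

# No trailing slash on the source: the directory itself is copied,
# giving /work/ajohnson/rockyb/development on the cluster.
echo rsync -av "$src" "$dest"

# Trailing slash on the source: only the contents of development are copied,
# so the destination has to name the development directory explicitly.
echo rsync -av "$src/" "$dest/development"
```

The last two commands should produce the same layout on the cluster. Running the dry-run form first is a cheap way to confirm that before any data moves.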

Globus Online (recommended)

See the separate Globus instructions.

Rclone

To back up files from the HPC cluster to cloud storage (Google Drive, Box, etc.), use Rclone. Remember to use the data transfer node for all file transfers.
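
A typical Rclone session might look like the sketch below. The remote name mydrive and both paths are hypothetical; a remote must first be created with the interactive rclone config command. As a precaution the commands are prefixed with echo, which prints them rather than running them.

```shell
# Hypothetical names: "mydrive" is a remote created earlier with "rclone config";
# the source and destination paths are placeholders.
remote_name="mydrive"
src="/work/ajohnson/rockyb/results"
dest="${remote_name}:nova-backup/results"

# List the top-level directories on the remote to confirm the configuration works.
echo rclone lsd "${remote_name}:"

# Preview the copy; --dry-run reports what would be transferred without copying.
echo rclone copy --dry-run "$src" "$dest"

# Perform the copy and show progress (remove "echo" to run it).
echo rclone copy --progress "$src" "$dest"
```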

FileZilla

FileZilla can be used to transfer files in sftp mode. None of the clusters support standard ftp, as that would send your username and password in the clear, which is both a bad idea and against ISU policy. Remember to use a data transfer node instead of the login node.

Tips to set up FileZilla:

  • In FileZilla, open the “File” menu and choose “Site Manager”.
  • In the Site Manager window, click “New Site” and enter the Host (e.g. novadtn.its.iastate.edu). Set Protocol to SFTP, set Logon Type to Interactive, and type your NetID in the User field.
  • Open the "Transfer Settings" tab, check "Limit number of simultaneous connections", and set the "maximum number of connections" to "1". If you do not do this, you will be prompted for your verification code and password many times.
  • Click Connect. A small window will open showing a message from the cluster that prompts for a verification code. In the Password field, enter the 6-digit number generated by the GA app running on your mobile device.
  • Next, you will be prompted for your password.