Storage
There are three types of storage directly connected to the cluster. All directories are mounted on both submit nodes and all compute nodes. When you log in, each directory is assigned a variable that you may use in your scripts.
NOT BACKED UP: No data, anywhere, is backed up. If you need backups, we recommend using the Research File Storage.
To see any of the information below, run the command crctool.
Name | Quota | Variable | Purpose |
---|---|---|---|
Home | 100 GB | $HOME | Personal storage assigned to every user |
Work | Based on group allocation | $WORK | Storage shared within a research group for collaboration. Store raw data sets here. |
Scratch | 85 TB total shared between all users | $SCRATCH | Temporary storage used to process raw data sets. Subject to 60 day purge with notice. |
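Because $SCRATCH is subject to a 60-day purge, it can be useful to list files that may be at risk before the notice arrives. The sketch below assumes the purge is keyed on modification time, which the table above does not specify; `purge_candidates` is a hypothetical helper name, not a cluster tool.

```shell
# List files under a directory that were last modified more than N days ago
# (default 60). Assumption: the purge looks at modification time.
purge_candidates() {
    local dir="$1" days="${2:-60}"
    find "$dir" -type f -mtime +"$days" -print
}

# Example (only meaningful on the cluster, where $SCRATCH is set):
# purge_candidates "$SCRATCH"
```

Anything this prints that you still need could be copied to $WORK.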
You can use $HOME, $WORK, and $SCRATCH in your submit scripts to make it easier to get around the file systems.
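As a minimal sketch of how these variables might be used in a script, the function below stages raw data from group storage to scratch, does some work there, and copies the result back. The directory names `dataset` and `results`, the function name `stage_and_run`, and the file-counting "computation" are all placeholders. Scheduler directives are left out because they are not covered here.

```shell
# Stage input from group storage to scratch, work there, then copy results back.
# "dataset" and "results" are hypothetical directory names.
stage_and_run() {
    local work_dir="$1" scratch_dir="$2"
    local run_dir="$scratch_dir/run_$$"      # per-process directory on scratch

    mkdir -p "$run_dir" "$work_dir/results"
    cp -r "$work_dir/dataset" "$run_dir/"    # raw data lives in the work area

    # Placeholder for the real computation: count the staged input files.
    find "$run_dir/dataset" -type f | wc -l > "$run_dir/file_count.txt"

    cp "$run_dir/file_count.txt" "$work_dir/results/"   # keep results off scratch
    rm -rf "$run_dir"                                   # remove temporary data
}

# On the cluster this would be called as:
# stage_and_run "$WORK" "$SCRATCH"
```

Passing the directories as arguments rather than reading $WORK and $SCRATCH inside the function makes the script easy to try out on a test directory first.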
All owner groups receive 1 TB of $WORK storage for free. Additional $WORK space can be purchased for $100 per TB per year. Invoices for storage are sent one year after the storage's start date.
Transfer Data
Files on the Research File Storage (ResFS) and the KU Community Cluster may be accessed through the Data Transfer Node (dtn.ku.edu).
Name | Purpose | KU Anywhere Required? | Notes |
---|---|---|---|
SCP / SFTP | Transfer small data sets to and from the cluster. | Yes | If the source or destination is off campus, you must use KU Anywhere. The connection must stay open for the duration of the transfer. |
Globus | Transfer large data sets between storage and the world. Share data sets with anyone for download or upload. | No | Uses a web application. Can transfer data outside of KU easily. The destination must use Globus software. |
Path to Access Files
Storage | Path |
---|---|
KU Community Cluster | /panfs/pfs.local |
ResFS | /resfs/GROUPS |
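For SCP, the host name and the paths above combine into a remote target of the form `user@host:path`. The sketch below builds that string; `dtn_target` is a hypothetical helper, and `username` and `groupname` in the usage lines are placeholders for your own account and group.

```shell
# Build an scp-style remote target on the Data Transfer Node.
DTN_HOST="dtn.ku.edu"

dtn_target() {
    local user="$1" remote_path="$2"
    printf '%s@%s:%s\n' "$user" "$DTN_HOST" "$remote_path"
}

# Usage, run from your own machine (placeholders: username, groupname):
# scp results.tar.gz "$(dtn_target username /panfs/pfs.local/work/groupname/username/)"
# scp "$(dtn_target username /panfs/pfs.local/work/groupname/username/results.tar.gz)" .
```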
Quota
Each type of storage has an enforced quota. To determine how much of your quota you are using on each of these volumes, log in to the cluster and run crctool.
This will produce output similar to the following:
    ------------------------------- Storage Variables ------------------------------
    | Variable   Path                                                              |
    | $HOME      /home/username                                                    |
    | $WORK      /panfs/pfs.local/work/groupname/username                          |
    | $SCRATCH   /panfs/pfs.local/scratch/groupname/username                       |
    --------------------------------------------------------------------------------
    --------------------------------- Disk Quotas ----------------------------------
    | Disk       Usage (GB)     Limit    %Used   File Usage     Limit    %Used     |
    | $HOME           33.51    100.00    33.51        99054    100000    99.05     |
    | $WORK         6436.80  13969.84    46.08       296488         0        0     |
    | $SCRATCH     39533.15  55879.35    70.75            1         0        0     |
    --------------------------------------------------------------------------------
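The quota rows can also be checked from a script. The sketch below reads text laid out like the sample above (a `|`, the variable name, then disk usage, limit, %Used, file usage, limit, %Used) and reports any volume whose disk or file %Used is at or above a threshold. It assumes the real crctool output matches this sample closely enough. `quota_warnings` is a hypothetical helper, and its default of 85 percent is an arbitrary choice.

```shell
# Read crctool-style output on stdin and print volumes at or above a threshold.
# Assumption: quota rows look like "| $HOME 33.51 100.00 33.51 99054 100000 99.05 |".
quota_warnings() {
    local threshold="${1:-85}"
    awk -v t="$threshold" '
        # Keep only rows whose name starts with "$" and whose next field is numeric;
        # this skips the Storage Variables rows, which hold paths instead.
        $2 ~ /^\$/ && $3 ~ /^[0-9.]+$/ {
            if ($5 + 0 >= t) printf "%s disk usage at %s%%\n", $2, $5
            if ($8 + 0 >= t) printf "%s file count at %s%%\n", $2, $8
        }'
}

# On the cluster: crctool | quota_warnings 85
```

For the sample output above, only $HOME is reported, because its file count is at 99.05 percent of the limit.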
Violation
Users will receive an email from the Panasas File System when a quota for $HOME, $WORK, or $SCRATCH has been exceeded (Hard Quota) or is about to be exceeded (Soft Quota). These emails all begin with the information below:
    PanActive Manager Warning: User Quota Violation Soft (bytes)
    Date: Wed Feb 07 00:00:16 CST 2018
    System Name: <name>
    System IP: <ip range>
    Version <version>
    Customer ID: <custid>
Soft Quota: A warning email is sent when your usage of the specified resource is about to exceed the size or file-count quota. The example below is for a user whose $HOME has crossed the 85 GB soft threshold.
    User Quota Violation Soft (bytes):
    Limit reached on volume /home for Unix User: (Id: uid:<uid>)
    Limit = 85.00 GB.
    The above message applies to the following component:
    Volume: /home
This example is for a Soft Quota violation of the file-count limit in $HOME:
    User Quota Violation Soft (files):
    Limit reached on volume /home for Unix User: (Id: uid:<uid>)
    Limit = 85.00 K.
    The above message applies to the following component:
    Volume: /home
Hard Quota: The maximum allotted space has been reached for that volume. This could be for any of the locations above, including the system-wide location of $SCRATCH. No further writes are allowed, and you must remove files before creating any new ones.
    User Quota Violation Hard (bytes):
    Limit reached on volume /home for Unix User: (Id: uid:<uid>)
    Limit = 100.00 GB.
    No further writes allowed for this Unix User in this volume unless it has at most 95.00 GB of data.
    The above message applies to the following component:
    Volume: /home
When you have removed enough files to drop back below the Hard or Soft Quota limit, an Event CLEARED email will be sent to you. The top of that email will look like this:
Event CLEARED: PanActive Manager Warning: User Quota Violation Hard (bytes)
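When deciding what to remove, it helps to know which files are largest and how many files a directory holds, since both size and file count are limited. The helpers below are a sketch using standard `find`, `du`, `sort`, and `wc`; the names `largest_files` and `count_files` are hypothetical.

```shell
# Print the N largest files (size in KB, then path) under a directory.
largest_files() {
    local dir="$1" count="${2:-10}"
    find "$dir" -type f -exec du -k {} + | sort -rn | head -n "$count"
}

# Count regular files under a directory, for comparison with the file limit.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# On the cluster:
# largest_files "$HOME" 20
# count_files "$HOME"
```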
Recovering your files
One of the features of our cluster filesystem is snapshots. A snapshot is a daily capture of the files in a given directory. All snapshots are user accessible, but only for volumes owned by a group the user is part of. Snapshots are read-only; if you accidentally delete a file, you can retrieve it from a snapshot for up to seven days afterward.
Snapshots are stored in a .snapshot directory at the root of your group's work directory and of /home. This directory is hidden and won't be displayed in listings (ls) of its parent directory. Snapshots are captured for $HOME and $WORK directories but not for $SCRATCH.
For example, say you're working in your work directory (i.e. /panfs/pfs.local/work/groupname/username) and you accidentally delete a file named oops.txt. To restore that file from a previous snapshot, navigate to the .snapshot directory for your group's work directory, where you will find directories containing snapshots from the past seven days. Each of these directories contains a file structure similar to that of /panfs/pfs.local/work/groupname and holds the files as they were when that snapshot was taken. You can navigate into those directories and copy the file(s) you accidentally deleted back to your work directory.
    cd /panfs/pfs.local/work/groupname/.snapshot
    ls
    cd date-of-snapshot.automatic
    cd username
    cp oops.txt /panfs/pfs.local/work/groupname/username
If one particular file was heavily modified, the snapshot may not recover the most recent change, but it will have the files that were in those directories when the snapshot was taken for that day.
Snapshots of home directories can also be found in
/home/.snapshot/date-of-snapshot.automatic/username
Because of the way that directory is set up, you cannot ls inside the date-of-snapshot.automatic directory. Instead, you must go directly to your own home directory, as shown above.
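The lookup can be scripted so that it tests the full path directly and never lists the inside of a snapshot directory. This sketch returns the copy of a file from the last snapshot directory that contains it. It assumes that snapshot directory names sort chronologically and that the .snapshot directory itself can be listed; neither is stated above. `find_in_snapshots` is a hypothetical helper name.

```shell
# Print the path of the newest snapshot copy of a file; return non-zero if none.
# Assumption: snapshot directory names sort in chronological order.
find_in_snapshots() {
    local snapshot_root="$1" relative_path="$2" snap found=""
    for snap in "$snapshot_root"/*/; do
        # Test the full path directly instead of listing the snapshot directory.
        if [ -e "${snap}${relative_path}" ]; then
            found="${snap}${relative_path}"   # later names overwrite earlier ones
        fi
    done
    [ -n "$found" ] && printf '%s\n' "$found"
}

# On the cluster (placeholders: groupname, username, oops.txt):
# copy=$(find_in_snapshots /panfs/pfs.local/work/groupname/.snapshot username/oops.txt) \
#     && cp -p "$copy" /panfs/pfs.local/work/groupname/username/
# copy=$(find_in_snapshots /home/.snapshot username/oops.txt) && cp -p "$copy" "$HOME/"
```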
Snapshots are on a rolling seven-day purge, so if you accidentally delete a file, you will need to restore it within seven days or it will be gone forever.