GHPCC Storage
To keep our cluster storage optimized, we follow the storage policies below. All PaiLab cluster users have access to the following locations:
HOME DIRECTORY (/home/[username]/)
50G of storage reserved for your personal use, not accessible to any other user
the only directory that is backed up
store all SCRIPTS/FINALIZED FILES here so that analyses can be re-created if necessary
PROJECT FOLDER (/project/umw_athma_pai/)
1.5T of storage accessible to all group members
when saving something here, please set the permissions so that all group members can read, write, and execute (rwx)
Store the following:
genome information (/project/umw_athma_pai/genomes/)
common-use genome fasta files, mapping indexes, gtfs, etc., as needed
organized by: /project/umw_athma_pai/genomes/[species]/[genome build]/
raw data (/project/umw_athma_pai/raw/)
common-use raw data files, primarily fastq files for Illumina data and both signal and fastq files for MinION data
includes: (1) data generated by our lab & (2) data downloaded from SRA/GEO
stay tuned for an organization system to track and search all data stored here
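As an illustration of the group-permission request above, here is a minimal Python sketch that adds group rwx to a path while leaving its other permission bits alone. The function name grant_group_rwx is hypothetical, and setting the execute bit on regular files is an assumption taken from the literal "rwx" wording; adjust it if only directories should be executable.

```python
import os
import stat

# Group read/write/execute bits (the "rwx" for group members).
GROUP_RWX = stat.S_IRGRP | stat.S_IWGRP | stat.S_IXGRP


def grant_group_rwx(path):
    """Add group rwx to `path` without changing its other permission bits.

    Hypothetical helper; returns the resulting permission bits.
    """
    current = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, current | GROUP_RWX)
    return stat.S_IMODE(os.stat(path).st_mode)
```

Calling grant_group_rwx on a file you just saved under /project/umw_athma_pai/ would be the typical use; you can only change permissions on files you own.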
NEARLINE FOLDER (/nl/umw_athma_pai/)
currently 2T of storage accessible to all group members, with options to increase storage as needed
when saving something here, please save to a user-specific folder or a common folder that's named to clearly indicate its purpose
Store the following:
files that are being actively worked on, including PROCESSED FILES / ANALYSES, etc.
GENERAL RULES/POLICIES:
Every user has access to as much storage space as their projects require, but please be reasonable and conscientious about usage
avoid saving multiple versions of the same file
gzip files when possible
regularly clean up temporary files, including test files, intermediate files, error/output files, etc.
An automated script will send regular updates on cluster usage, including information about:
project/nl usage per folder/user
instances of the following files:
unzipped fastq files --> all fastq files should be gzipped
sam files --> all sam files should be converted to bam files
[suggest other checks here]
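To check your own folders before the usage report arrives, something like the following Python sketch could be used. It is not the lab's automated script: the function names, and the assumption that uncompressed fastq files end in .fastq or .fq, are illustrative only. It lists unzipped fastq files and sam files under a directory and can gzip a file in place; converting sam to bam needs an external tool such as samtools and is not attempted here.

```python
import gzip
import os
import shutil

# Extensions treated as uncompressed fastq (an assumption; adjust to local naming).
FASTQ_EXTENSIONS = (".fastq", ".fq")


def find_flagged_files(root):
    """Walk `root` and return (unzipped_fastq, sam_files) as sorted path lists."""
    unzipped_fastq, sam_files = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            lower = name.lower()
            if lower.endswith(FASTQ_EXTENSIONS):
                unzipped_fastq.append(path)
            elif lower.endswith(".sam"):
                sam_files.append(path)
    return sorted(unzipped_fastq), sorted(sam_files)


def gzip_in_place(path):
    """Compress `path` to `path + '.gz'`, then remove the original file."""
    gz_path = path + ".gz"
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(path)
    return gz_path
```

For example, find_flagged_files("/nl/umw_athma_pai/[username]") would return the two lists for your nearline folder (replace [username] with your own folder name), and each entry of the first list could then be passed to gzip_in_place.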