> R basics

Basics

Step 0: To use R on the FASRC cluster, load the appropriate version available via our module system. See the modules list for available versions.

You should first have taken our Introduction to the FASRC training and be familiar with running jobs on the cluster.

Start an interactive job

An interactive job is the best way to get a test environment while you are still developing your scripts.

salloc -p test --mem 1000 -t 30

Running R Batch Jobs on the cluster

To submit R jobs to the cluster via SLURM, the R command in your SLURM batch file should be in the format:

R CMD BATCH --quiet --no-restore --no-save scriptfile outputfile

where

--quiet         silences the startup messages so that they won’t appear in your output
--no-restore    does not restore the R workspace at startup
--no-save       does not save your R batch environment at exit
scriptfile      is your R script
outputfile      is where all output will be sent

If you wish to pass along command line arguments in your SLURM batch script, you need to use the format:

R CMD BATCH --no-save --no-restore '--args a=1 b=c(2,5,6)' test.R test.out

and include the following lines in your R script:

## First read in the arguments listed at the command line
args <- commandArgs(TRUE)
## args is now a character vector with one element per argument
## First check to see if arguments are passed.
## Then cycle through each element and evaluate the expressions.
if (length(args) == 0) {
    print("No arguments supplied.")
    ## supply default values
    a = 1
    b = c(1,1,1)
} else {
    for (i in seq_along(args)) {
        eval(parse(text = args[[i]]))
    }
}
print(a)
print(b)

Your output file test.out should have the following lines in it:

> print(a)
[1] 1
> print(b)
[1] 2 5 6

More examples and detail can be found at this helpful Stack Overflow webpage and the R doc pages.

You can also use the Rscript command. For the difference between R CMD BATCH and Rscript, consult the O’Reilly book R Cookbook, available at O’Reilly Books Online for Harvard (valid Harvard ID required).

R Batch Script

To run an R script as a SLURM batch job, create a file named R.batch from the following template. Adjust the runtime, memory, and other settings to suit your job.

#!/bin/bash
#SBATCH -c 1                # Number of cores
#SBATCH -t 0-00:10          # Runtime in D-HH:MM, minimum of 10 minutes
#SBATCH -p shared           # Partition to submit to
#SBATCH --mem=1000          # Memory pool for all cores (see also --mem-per-cpu)
#SBATCH -o myRjob_%j.out    # File to which STDOUT will be written, %j inserts jobid
#SBATCH -e myRjob_%j.err    # File to which STDERR will be written, %j inserts jobid

module load R               # Load R module
R CMD BATCH --quiet --no-restore --no-save scriptfile outputfile

Submit the script you created using: sbatch R.batch

Running Large #s of R Batch Jobs

If you need to submit a large number of files (e.g. varying the parameters for jobs submitted), please see our documentation on Submitting Large Numbers of Files to the Cluster.
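One common way to vary a parameter across many jobs is a SLURM job array. The following is a minimal sketch, not necessarily the approach the linked documentation recommends. It assumes the test.R script from the command line argument example earlier on this page, an array range of 1-10 chosen purely for illustration, and a hypothetical helper named r_args_for_task.

```shell
#!/bin/bash
#SBATCH -c 1                  # Number of cores
#SBATCH -t 0-00:10            # Runtime in D-HH:MM
#SBATCH -p shared             # Partition to submit to
#SBATCH --mem=1000            # Memory pool for all cores
#SBATCH --array=1-10          # Assumed range: ten array tasks with IDs 1..10
#SBATCH -o myRjob_%A_%a.out   # %A inserts the array job ID, %a the array task ID
#SBATCH -e myRjob_%A_%a.err

# Hypothetical helper: build the single '--args ...' string for one array task.
# Each task passes its own ID to R as the value of 'a'.
r_args_for_task() {
    printf -- '--args a=%s' "$1"
}

# SLURM sets SLURM_ARRAY_TASK_ID inside an array job; fall back to 1 elsewhere
task_id="${SLURM_ARRAY_TASK_ID:-1}"
echo "Task ${task_id} passes '$(r_args_for_task "$task_id")' to test.R"

# Only load the module and run R when this is really an array task
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    module load R
    R CMD BATCH --quiet --no-restore --no-save \
        "$(r_args_for_task "$task_id")" test.R "test_${task_id}.out"
fi
```

Submitted with sbatch, this would start one task per array ID, each writing its own test_<ID>.out file. Run outside SLURM, the script only prints the argument string for task 1, which makes it easy to check before submitting.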

> R packages

Step 0: To use R on the cluster, load the appropriate version available via our module system. See the modules list for available versions.

When you load R from the Lmod system, hundreds of common packages are already installed. The list is available here. However, if you need to install new packages locally, the process is fairly straightforward.
See also: R – Basics

Prerequisite: Create a directory for your R package installs

Before attempting to install your own R packages, you will first need to create a directory for your local R package installs to live in. You’ll only need to do this once for each version of R you use. This is the path you will then point the R_LIBS_USER variable to.
mkdir -pv ~/apps/R_version
We highly recommend that you “tag” your package folder with the specific version of R you use to install the packages, so that you do not later forget and accidentally use them with a different version of R.
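To keep the folder tag in step with the module you load, the directory name can be derived from the module name. This is a sketch only: the helper r_lib_dir_for_module is hypothetical, and it assumes module names of the form R/<version>-<suffix>, such as the R/3.5.1-fasrc01 module used on this page.

```shell
#!/bin/bash
# Hypothetical helper: map an R module name to a version-tagged library path.
# Assumes names shaped like "R/3.5.1-fasrc01" (version, then "-suffix").
r_lib_dir_for_module() {
    local module_name="$1"
    local version="${module_name#R/}"   # strip the leading "R/"
    version="${version%%-*}"            # strip "-fasrc01" and anything after it
    printf '%s/apps/R_%s\n' "$HOME" "$version"
}

# Print the directory that matches the module used in this page's examples.
# To create it, pass the result to mkdir:
#   mkdir -pv "$(r_lib_dir_for_module R/3.5.1-fasrc01)"
r_lib_dir_for_module R/3.5.1-fasrc01
```

The printed path ends in apps/R_3.5.1, which matches the directory naming used in the install example on this page.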

Prerequisite: Understanding the R_LIBS_USER environment variable

The R_LIBS_USER environment variable is used by R to determine where packages you install should be located when the install.packages() function is called and when you later use them. It is set using:
export R_LIBS_USER=$HOME/apps/R_version:$R_LIBS_USER
Note: You can also add this to your .bashrc if you wish, but we recommend calling this directly after loading the module in your scripts or when running R interactively. This ensures that your local library is the first one checked by R for installs and libraries.
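If R_LIBS_USER is not set beforehand, the export line above leaves a trailing colon in the value. The sketch below shows one way to prepend a directory without that stray separator, using the shell's ${VAR:+...} expansion. The function name prepend_r_libs is hypothetical, and R_version is the same placeholder used above.

```shell
#!/bin/bash
# Hypothetical helper: put a directory at the front of R_LIBS_USER.
# ${R_LIBS_USER:+:$R_LIBS_USER} expands to ":<old value>" only when the
# variable is already set and non-empty, so no trailing ':' is left behind.
prepend_r_libs() {
    local dir="$1"
    export R_LIBS_USER="${dir}${R_LIBS_USER:+:${R_LIBS_USER}}"
}

# Put the version-tagged library first, then show the resulting search path
prepend_r_libs "$HOME/apps/R_version"
echo "R_LIBS_USER=$R_LIBS_USER"
```

As with the plain export, call this after loading the R module so that your local library is listed first.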

Installing Packages and Running R

To install packages, you will need to load an R module, set your R_LIBS_USER variable, and run R. We recommend choosing a specific R module rather than simply using module load R. Look up available R modules here: https://portal.rc.fas.harvard.edu/apps/modules/R. Example:

module load R/3.5.1-fasrc01

export R_LIBS_USER=$HOME/apps/R_3.5.1:$R_LIBS_USER

Now when you use R’s install.packages() function, the package will be installed in the specified directory.
Examples:

install.packages("ape") (You will be asked to pick a mirror site to download from)
install.packages("ape", repos="http://cran.r-project.org") (You can also specify a mirror)
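The same install can be run without opening the R shell by passing the expression to Rscript with its -e option, which is convenient inside a batch script. This is a sketch under assumptions: the helper install_expr is hypothetical, and the default mirror is simply the one used in the example above.

```shell
#!/bin/bash
# Hypothetical helper: build the R expression that installs one package.
# The second argument (mirror URL) is optional.
install_expr() {
    local pkg="$1"
    local repo="${2:-http://cran.r-project.org}"
    printf 'install.packages("%s", repos="%s")' "$pkg" "$repo"
}

# Show the command rather than running it. On the cluster, after loading an
# R module and setting R_LIBS_USER, run it for real with:
#   Rscript -e "$(install_expr ape)"
echo "Rscript -e '$(install_expr ape)'"
```

Specifying the mirror explicitly matters here, because a non-interactive session cannot prompt you to pick one.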

Example

In this example, we submit an interactive job, load the R module, set the library path for our R packages, start the R shell, and finally install an R package.

[user@rclogin ~]$ salloc -p test -t 60 -n1 --mem 4000
[user@computenode ~]$ module load R/3.5.1-fasrc01
[user@computenode ~]$ mkdir -pv ~/apps/R_3.5.1
mkdir: created directory ‘/n/home00/user/apps’
mkdir: created directory ‘/n/home00/user/apps/R_3.5.1’
[user@computenode ~]$ export R_LIBS_USER=$HOME/apps/R_3.5.1:$R_LIBS_USER
[user@computenode ~]$ R --quiet
> install.packages('ape',repos="http://cran.r-project.org")
Installing package into ‘/n/home00/user/apps/R_3.5.1’
trying URL 'http://cran.r-project.org/src/contrib/ape_5.2.tar.gz'
Content type 'application/x-gzip' length 790069 bytes (771 KB)
==================================================
downloaded 771 KB

* installing *source* package ‘ape’ ...
... omitted output ...
** testing if installed package can be loaded
* DONE (ape)

Installing sp, rgdal, rgeos, and sf

For the packages sp, rgdal, rgeos, and sf, refer to our documentation on FASRC Github.

Packages in R

Available packages

List of available packages in the R-project repository.

Installed packages

To see the installed packages in the R shell:

> installed.packages()
...
            Version    Priority
ADGofTest   "0.3"      NA
AnDE        "1.0"      NA
BB          "2014.1-1" NA
Brobdingnag "1.2-4"    NA
CpGassoc    "2.11"     NA
DBI         "0.2-7"    NA
DEoptimR    "1.0-1"    NA
Defaults    "1.1-1"    NA
FNN         "1.1"      NA
Formula     "1.1-1"    NA
... omitted output ...
spatial     "7.3-8"    "recommended"
splines     "3.1.0"    "base"
stats       "3.1.0"    "base"
stats4      "3.1.0"    "base"
survival    "2.37-7"   "recommended"
tcltk       "3.1.0"    "base"
tools       "3.1.0"    "base"
utils       "3.1.0"    "base"

R parallel packages

In the FASRC Github documentation, you can find a brief explanation of parallel R packages and a few examples.

Hard-to-install packages

Some R packages have many dependencies and/or require additional software to be installed on the cluster (e.g. protobuf, geojsonio). Properly configuring these installs with R can become problematic. To work around that, we documented how to install R packages within a Singularity container.

Interacting with the module system from R

Inside your R session you can interact with the module system by using the module function provided by the script in /n/helmod/apps/lmod/7.7.32/init/R.
For example:

[user@boslogin02 ~]$ srun --pty -t 60 --mem 2000 -p test /bin/bash
[user@holy7c19316 scratch]$ module load R/3.5.1-fasrc01
[user@holy7c19316 scratch]$ R --quiet
> source("/n/helmod/apps/lmod/7.7.32/init/R")
> module('load','bcftools')
... omitted lot of output ...
> system('bcftools --version')
bcftools 1.5
Using htslib 1.5
