Cluster Description | Login and Data Transfer | HW Configuration | Storage and Backup | Recharge Model | Scheduler Configuration | Low Priority QoS Jobs | Job Script Examples


Cluster Description:

LAWRENCIUM is the platform for the LBNL Condo Cluster Computing (LC3) program, which provides a sustainable way to meet the midrange computing requirements of Berkeley Lab. LAWRENCIUM is part of the LBNL Supercluster and shares the common Supercluster infrastructure, including the system management software, software module farm, scheduler, storage, and backend network infrastructure.

Login and Data Transfer:

LAWRENCIUM uses One Time Passwords (OTP) for login authentication on all of the services listed below; a short usage example follows the list. Please also refer to the Data Transfer page for additional information.
  • Login server: lrc-login.lbl.gov
  • Data transfer server: lrc-xfer.lbl.gov
  • Globus Online endpoint: lbnl#lrc
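
For example, assuming a hypothetical username "jdoe", a typical login session and file transfer from a local machine would look like the following (the OTP is entered at the password prompt; file names are placeholders):

    # Log in to the cluster with your OTP
    ssh jdoe@lrc-login.lbl.gov

    # Copy a local input file to your home directory through the data transfer server
    scp input.dat jdoe@lrc-xfer.lbl.gov:/global/home/users/jdoe/

    # Copy results from your scratch space back to the current local directory
    scp jdoe@lrc-xfer.lbl.gov:/global/scratch/jdoe/results.tar.gz .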

Hardware Configuration:

LAWRENCIUM is composed of multiple generations of hardware, so it is physically separated into several partitions to facilitate management and to meet the requirements of hosting Condo projects. The following table lists the hardware configuration of each partition.

Partition | Nodes | Node List | CPU | Cores | Memory | Infiniband | Accelerator
lr2 | 198 | n0[000-141].lr2, n0[146-153].lr2, n0[161-180].lr2, n0[182-208].lr2 | INTEL XEON X5650 | 12 | 24GB | QDR | -
lr2 | | n0181.lr2 | INTEL XEON X5650 | 12 | 96GB | QDR | -
lr3 | 300 | n0[000-163].lr3 | INTEL XEON E5-2670 | 16 | 64GB | FDR | -
lr3 | | n0[164-203].lr3, n0[213-308].lr3 | INTEL XEON E5-2670 v2 | 20 | 64GB | FDR | -
lr4 | 108 | n0[000-095].lr4, n0[099-110].lr4 | INTEL XEON E5-2670 v3 | 24 | 64GB | FDR | -
lr_amd | 4 | n0[157-160].lr2 | AMD OPTERON 6276 | 32 | 64GB | QDR | -
lr_bigmem | 5 | n0[154-155].lr2 | AMD OPTERON 6174 | 48 | 256GB | QDR | -
lr_bigmem | | n0156.lr2 | AMD OPTERON 6180 SE | 48 | 512GB | QDR | -
lr_bigmem | | n0210.lr3 | INTEL XEON E5-4620 | 32 | 1024GB | QDR | -
lr_bigmem | | n0211.lr3 | INTEL XEON E7-4860 v2 | 48 | 1024GB | QDR | -
lr_manycore | 9 | n0[204-207].lr3 | INTEL XEON E5-2603 | 8 | 64GB | FDR | INTEL XEON PHI 7120
lr_manycore | | n0[208-209].lr3 | INTEL XEON E5-2603 | 8 | 64GB | FDR | NVIDIA KEPLER K20
lr_manycore | | n0[096-098].lr4 | INTEL XEON E5-2623 v3 | 8 | 64GB | FDR | 4X NVIDIA KEPLER K80
lr_manycore | | n0[136-138].lr4 | INTEL XEON E5-2623 v3 | 8 | 64GB | FDR | 4X NVIDIA GTX 1080TI
mako | 272 | n0[000-271].mako0 | INTEL XEON E5530 | 8 | 24GB | QDR | -
mako_manycore | 4 | n0[272-275].mako0 | INTEL XEON X5650 | 12 | 24GB | QDR | 2X NVIDIA TESLA C2050

Storage and Backup:

LAWRENCIUM cluster users are entitled to access the following storage systems, so please get familiar with them.

Name | Location | Quota | Backup | Allocation | Description
HOME | /global/home/users/$USER | 12GB | Yes | Per User | HOME directory for permanent data storage
GROUP-SW | /global/home/groups-sw/$GROUP | 200GB | Yes | Per Group | GROUP directory for software and data sharing with backup
GROUP | /global/home/groups/$GROUP | 400GB | No | Per Group | GROUP directory for data sharing without backup
SCRATCH | /global/scratch/$USER | none | No | Per User | SCRATCH directory on the Lustre high performance parallel file system
CLUSTERFS | /clusterfs/axl/$USER | none | No | Per User | Private storage for AXL condo
CLUSTERFS | /clusterfs/cumulus/$USER | none | No | Per User | Private storage for CUMULUS condo
CLUSTERFS | /clusterfs/esd/$USER | none | No | Per User | Private storage for ESD condos
CLUSTERFS | /clusterfs/geoseq/$USER | none | No | Per User | Private storage for CO2SEQ condo
CLUSTERFS | /clusterfs/nokomis/$USER | none | No | Per User | Private storage for NOKOMIS condo

NOTE: The HOME, GROUP, GROUP-SW and CLUSTERFS directories are located on a highly reliable, enterprise-level BlueArc storage device. This appliance also provides storage for many other mission-critical file systems and is not designed for high-performance applications, so running large I/O-dependent jobs against these file systems can greatly degrade the performance of all the file systems hosted on the device and affect hundreds of users; this behavior is therefore explicitly prohibited. HPCS reserves the right to kill such jobs without notification once they are discovered. Jobs with significant I/O requirements should use the SCRATCH file system, which is designed specifically for that purpose.
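
As a minimal sketch of the recommended pattern (the application and file names below are hypothetical), an I/O-intensive job would stage its work in SCRATCH and copy only small final results back to HOME:

    # Stage I/O-heavy work on the Lustre SCRATCH file system, not on HOME/GROUP
    WORKDIR=/global/scratch/$USER/$SLURM_JOB_ID    # per-job scratch directory
    mkdir -p $WORKDIR
    cp $HOME/inputs/config.in $WORKDIR/            # stage input from HOME
    cd $WORKDIR
    ./my_io_heavy_app config.in                    # heavy reads/writes happen on SCRATCH
    cp results.out $HOME/project/                  # copy only the small final results back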

Recharge Model:

LAWRENCIUM is a Lab-funded platform for the LC3 program. LBNL has made a significant investment in developing this platform to meet the midrange computing requirements of Berkeley Lab. Its primary purpose is to provide a sustainable way to host all of the condo projects while also meeting the computing requirements of other users. To achieve this goal, condo users are allowed to run within their condo contributions for free, while normal users of the LAWRENCIUM cluster are subject to the LBNL recharge rate. Condo users who need to run outside of their condo contributions are subject to the same recharge rate as normal users. For this purpose, condo users receive either one or two projects/accounts when their accounts are created on LAWRENCIUM, per the instructions we receive from the PI of the condo project; they need to provide the correct project when running jobs inside or outside of their condo contributions, as explained in detail in the Scheduler Configuration section below. The current recharge model has been in effect since January 2011, with a standard recharge rate of $0.01 per Service Unit (1 cent per service unit, SU). Due to differences in hardware architecture, we discount the effective recharge rate for older generations of hardware, and rates may go down further when newer generations of hardware enter production; please refer to the following table for the current recharge rate of each partition.

Partition | Nodes | Node List | SU to Core CPU Hour Ratio | Effective Recharge Rate
lr2 | 198 | n0[000-141].lr2, n0[146-153].lr2, n0[161-208].lr2 | 0.50 | $0.0050 per Core CPU Hour
lr3 | 300 | n0[000-203].lr3, n0[213-308].lr3 | 0.75 | $0.0075 per Core CPU Hour
lr4 | 108 | n0[000-095].lr4, n0[099-110].lr4 | 1.00 | $0.0100 per Core CPU Hour
lr_amd | 4 | n0[157-160].lr2 | 0.50 | $0.0050 per Core CPU Hour
lr_bigmem | 5 | n0[154-156].lr2, n0[210-211].lr3 | 0.75 | $0.0075 per Core CPU Hour
lr_manycore | 9 | n0[204-209].lr3, n0[096-098].lr4, n0[136-138].lr4 | 1.00 | $0.0100 per Core CPU Hour
mako | 272 | n0[000-271].mako0 | 0.50 | $0.0050 per Core CPU Hour
mako_manycore | 4 | n0[272-275].mako0 | 0.50 | $0.0050 per Core CPU Hour

NOTE: The usage calculation is based on the resources allocated to the job rather than the job's actual usage. For example, if a job asked for one lr2 node with a one-CPU requirement (the typical serial job case) and ran for 24 hours, then, since lr2 nodes are allocated exclusively to the job (please refer to the following Scheduler Configuration section for more detail), the charge incurred by this job would be: $0.0050/(core*hour) * 1 node * 12 cores/node * 24 hours = $1.44, instead of: $0.0050/(core*hour) * 1 core * 24 hours = $0.12.

NOTE: For Many-Core nodes, recharges are calculated based on the host Core CPU Hour usage not the integrated core usage.
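
The worked example above can be reproduced with a few lines of shell arithmetic; the rate, core count, and wallclock time below are the lr2 values from the tables above:

    # Charge = effective rate ($/core-hour) x nodes x cores per node x wallclock hours
    # lr2 example: 1 node x 12 cores x 24 hours at $0.0050 per Core CPU Hour
    awk 'BEGIN { rate=0.0050; nodes=1; cores=12; hours=24;
                 printf "charge = $%.2f\n", rate*nodes*cores*hours }'
    # prints: charge = $1.44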

Scheduler Configuration: 

The LAWRENCIUM cluster uses SLURM as the scheduler to manage jobs. The following scheduler configuration is in place, and it is highly recommended that all users get familiar with it before using the cluster.
  • Normal users of LAWRENCIUM resources must supply the proper project account, e.g., "--account=ac_abc". One of the QoSs "br_serial", "lr_normal", or "mako_normal" is also required, depending on the partition the job is submitted to, e.g., "--qos=lr_normal".
  • If a debug job is desired, the "lr_debug" or "mako_debug" QoS should be specified, e.g., "--qos=lr_debug", so that the scheduler can adjust the job priority accordingly.
  • Condo users should use the proper condo QoS, e.g., "--qos=condo_xyz", as well as the proper recharge account, e.g., "--account=lr_xyz".
  • The partition name is always required, in all cases, e.g., "--partition=lr2". (See the example command after this list.)
  • A standard fair-share policy with a decay half-life of 14 days (2 weeks) is enforced.
  • If a node feature is not provided, the job will be dispatched to nodes in a predefined order; for "lr2" the order is lr2 before lr2_m96, and for "lr3" the order is lr3_c16 before lr3_c20.
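
As a minimal illustration of the options above (the script name myjob.sh is hypothetical), a normal user charging to the "ac_abc" account would submit to the "lr2" partition as follows; complete job scripts are shown in the Job Script Examples section below:

    # Normal user: partition, QoS, and recharge account are all required
    sbatch --partition=lr2 --qos=lr_normal --account=ac_abc myjob.sh

    # A condo user running within the condo contribution would instead use, e.g.:
    # sbatch --partition=lr3 --qos=condo_xyz --account=lr_xyz myjob.sh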
Partition: lr2 (198 nodes; Shared: Exclusive)
Node List (Node Features): n0[000-141].lr2, n0[146-153].lr2, n0[161-180].lr2, n0[182-208].lr2 (lr2); n0181.lr2 (lr2, lr2_m96)
QoS | QoS Limit | Account
lr_normal | 64 nodes max per job, 72:00:00 wallclock limit | ac_*
lr_debug | 4 nodes max per job, 4 nodes in total, 00:30:00 wallclock limit | ac_*
condo_co2seq | 64 nodes max per group | lr_co2seq
condo_cumulus | 28 nodes max per group | lr_cumulus
condo_matgen | 8 nodes max per group | lr_matgen

Partition: lr3 (300 nodes; Shared: Exclusive)
Node List (Node Features): n0[000-163].lr3 (lr3, lr3_c16); n0[164-203].lr3, n0[213-308].lr3 (lr3, lr3_c20)
QoS | QoS Limit | Account
lr_normal | 64 nodes max per job, 72:00:00 wallclock limit | ac_*
lr_debug | 4 nodes max per job, 4 nodes in total, 00:30:00 wallclock limit | ac_*
condo_axl | 36 nodes max per group, 30 nodes max per user | lr_axl
condo_esd1 | 16 nodes max per group | lr_esd1
condo_esd2 | 20 nodes max per group | lr_esd2
condo_nanotheory | 4 nodes max per group | lr_nanotheory
condo_nokomis | 40 nodes max per group | lr_nokomis

Partition: lr4 (108 nodes; Shared: Exclusive)
Node List (Node Features): n0[000-095].lr4, n0[099-110].lr4 (lr4)
QoS | QoS Limit | Account
lr_normal | 64 nodes max per job, 72:00:00 wallclock limit | ac_*
lr_debug | 4 nodes max per job, 4 nodes in total, 00:30:00 wallclock limit | ac_*
condo_minnehaha | 36 nodes max per group | lr_minnehaha
condo_matminer | 4 nodes max per group | lr_matminer

Partition: lr_amd (4 nodes; Shared: Exclusive)
Node List (Node Features): n0[157-160].lr2 (lr_interlagos)
QoS | QoS Limit | Account
lr_normal | 64 nodes max per job, 72:00:00 wallclock limit | ac_*

Partition: lr_bigmem (5 nodes; Shared: Exclusive)
Node List (Node Features): n0[154-155].lr2 (lr_amd, lr_m256); n0156.lr2 (lr_amd, lr_m512); n0[210-211].lr3 (lr_intel, lr_m1024)
QoS | QoS Limit | Account
lr_normal | 64 nodes max per job, 72:00:00 wallclock limit | ac_*

Partition: lr_manycore (9 nodes; Shared: Exclusive)
Node List (Node Features): n0[204-207].lr3 (lr_phi); n0[208-209].lr3 (lr_kepler, lr_k20); n0[096-098].lr4 (lr_kepler, lr_k80); n0[136-138].lr4 (lr_pascal, lr_1080ti)
QoS | QoS Limit | Account
lr_normal | 64 nodes max per job, 72:00:00 wallclock limit | ac_*

Partition: mako (272 nodes; Shared: Exclusive)
Node List (Node Features): n0[000-271].mako0 (mako)
QoS | QoS Limit | Account
mako_normal | 64 nodes max per job, 72:00:00 wallclock limit | ac_*
mako_debug | 4 nodes max per job, 4 nodes in total, 00:30:00 wallclock limit | ac_*
condo_ganita | 24 nodes max per group | lr_ganita

Partition: mako_manycore (4 nodes; Shared: Exclusive)
Node List (Node Features): n0[272-275].mako0 (mako_fermi)
QoS | QoS Limit | Account
mako_normal | 64 nodes max per job, 72:00:00 wallclock limit | ac_*
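
The partition, node feature, and QoS information above can also be queried directly from SLURM; the commands below are standard SLURM utilities and are shown only as a sketch:

    # List partitions with node counts, memory sizes, and node features
    sinfo -o "%P %D %m %f"

    # Show the accounts and QoSs your user is allowed to submit with
    sacctmgr show associations user=$USER format=account,qos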

Low Priority QoS Jobs:

LAWRENCIUM users are entitled to use the extra resources that are available across all LAWRENCIUM partitions. This is done through the low priority QoSs "lr_lowprio" and "mako_lowprio"; your account is automatically subscribed to these QoSs when it is created, so you do not need to request them explicitly. These low priority QoSs have no limit on the number of nodes or on wallclock time, but they do come with a lower priority. Jobs run under these QoSs do NOT incur any of the usage charges described in the "Recharge Model" section above, which means you can scavenge free compute cycles when they are available. However, these QoSs never get a priority as high as the general QoSs, such as "lr_normal" and "lr_debug", "mako_normal" and "mako_debug", or any of the condo QoSs, and they are subject to preemption when the other QoSs become busy. This has two implications:
  1. When the system is busy, any job submitted with these QoSs will remain pending and yield to other jobs with higher priority.
  2. When the system is busy and higher priority jobs are pending, the scheduler will preempt jobs running under these low priority QoSs. At submission time you can choose whether a preempted job should simply be killed or automatically requeued after it is killed (please see the example below for details). Please note that, since preemption can happen at any time, it is very beneficial for your job to be capable of checkpointing/restarting by itself if you choose to requeue it; otherwise, you may need to verify data integrity manually before running the job again.
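
A quick way to check whether a low priority job was preempted (and, if requested, requeued) is to query the SLURM accounting records; the job ID below is hypothetical:

    # Show the state history of job 12345 (e.g., PREEMPTED, REQUEUED, COMPLETED)
    sacct -j 12345 --format=JobID,Partition,QOS,State,Elapsed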
Job Script Examples:

  1. A normal user from the "ac_abc" account would like to run a job with 64 MPI processes for 24 hours in the "lr2" partition, with email notifications.
    #!/bin/bash
    # Job name:
    #SBATCH --job-name=test
    #
    # Partition:
    #SBATCH --partition=lr2
    #
    # QoS:
    #SBATCH --qos=lr_normal
    #
    # Account:
    #SBATCH --account=ac_abc
    #
    # Processors:
    #SBATCH --ntasks=64
    #
    # Wall clock limit:
    #SBATCH --time=24:00:00
    #
    # Mail type:
    #SBATCH --mail-type=all
    #
    # Mail user:
    #SBATCH --mail-user=joe.doe@lbl.gov
    
    ## Run command
    module load openmpi
    mpirun ./a.out
    
  2. A condo user from the "lr_xyz" group would like to run a job with 20 MPI processes and a memory requirement of 6 GB per process for 24 hours in the "lr3" partition.
    #!/bin/bash
    # Job name:
    #SBATCH --job-name=test
    #
    # Partition:
    #SBATCH --partition=lr3
    #
    # QoS:
    #SBATCH --qos=condo_xyz
    #
    # Account:
    #SBATCH --account=lr_xyz
    #
    # Processors:
    #SBATCH --ntasks=20
    #
    # Memory requirement:
    #SBATCH --mem-per-cpu=6G
    #
    # Wall clock limit:
    #SBATCH --time=24:00:00
    
    ## Run command
    module load openmpi
    mpirun ./a.out
    
  3. A user from the "ac_opq" group (normal or condo) would like to run a low priority job with 20 MPI processes for 24 hours in the "lr3" partition, and does not want the job to be requeued if it is preempted.
    #!/bin/bash
    # Job name:
    #SBATCH --job-name=test
    #
    # Partition:
    #SBATCH --partition=lr3
    #
    # QoS:
    #SBATCH --qos=lr_lowprio
    #
    # Account:
    #SBATCH --account=ac_opq
    #
    # Requeue:
    ###SBATCH --requeue ### only needed if requeue is desired ###
    #
    # Processors:
    #SBATCH --ntasks=20
    #
    # Wall clock limit:
    #SBATCH --time=24:00:00

    ## Run command
    module load openmpi
    mpirun ./a.out

Software Configuration:

LAWRENCIUM uses Environment Modules to manage the cluster-wide software installation.
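
Typical module commands are shown below; "openmpi" is used as an example package since it appears in the job scripts above:

    module avail             # list the software packages available in the module farm
    module load openmpi      # add a package to the current environment
    module list              # show currently loaded modules
    module unload openmpi    # remove a package from the environment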

Cluster Status:

Please visit here for the live status of the LAWRENCIUM cluster.

Additional Information:

Please use Service Now or send email to hpcshelp@lbl.gov for any inquiries or service requests.