Tools and examples
http://code.google.com/p/hpcinchemistrytutorial
Global Arrays (GA)
http://www.emsl.pnl.gov/docs/global/
PeIGS (Parallel Eigen Solver)
http://www.emsl.pnl.gov/docs/global/peigs.shtml
ScaLapack (Netlib)
http://www.netlib.org/scalapack/scalapack_home.html
Previous benchmarks of parallel eigensolver performance
http://www.cse.scitech.ac.uk/arc/diags.shtml
Performance tools and APIs (ALCF)
https://wiki.alcf.anl.gov/index.php/Profiling
Equivalence table PEIGS vs LAPACK
MPICH2 (for vanilla armci)
GA-SCALAPACK
Subroutine GA_pdsyevx(g_a, g_b, eval, nb8)
PDSYEVX allows the calculation of a subset of eigenvalues/eigenvectors. All other methods calculate the full set.
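To make "a subset of eigenvalues" concrete, here is a minimal serial sketch of bisection with Sturm counts on a symmetric tridiagonal matrix, which is the kind of technique that lets a solver locate the k-th eigenvalue without computing the others. The function names `count_below` and `kth_eigenvalue` are hypothetical and are not part of GA or ScaLAPACK; the sketch assumes the matrix has already been reduced to tridiagonal form and does not reproduce the parallel implementation.

```python
def count_below(d, e, x):
    """Sturm count: number of eigenvalues smaller than x for the symmetric
    tridiagonal matrix with diagonal d and off-diagonal e."""
    count = 0
    q = 1.0
    for i in range(len(d)):
        off = e[i - 1] ** 2 if i > 0 else 0.0
        q = d[i] - x - off / q          # pivot of the LDL^T factorization of T - x*I
        if q == 0.0:
            q = 1e-300                  # avoid dividing by an exact zero pivot
        if q < 0.0:
            count += 1                  # each negative pivot is one eigenvalue below x
    return count


def kth_eigenvalue(d, e, k, iterations=100):
    """k-th smallest eigenvalue (k = 0 is the smallest), found by bisection."""
    n = len(d)
    # Gershgorin discs give an interval that contains every eigenvalue.
    radius = [abs(e[i - 1]) if i > 0 else 0.0 for i in range(n)]
    for i in range(n - 1):
        radius[i] += abs(e[i])
    lo = min(di - r for di, r in zip(d, radius))
    hi = max(di + r for di, r in zip(d, radius))
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if count_below(d, e, mid) > k:
            hi = mid                    # at least k+1 eigenvalues lie below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Only the smallest eigenvalue of a 3x3 test matrix is requested here.
smallest = kth_eigenvalue([2.0, 2.0, 2.0], [-1.0, -1.0], 0)
```

Each requested eigenvalue costs only a sequence of cheap counts, which is why computing a few eigenvalues can be much cheaper than computing the full set.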
Subroutine GA_pdsygv(g_a, g_s, g_b, eval)
PDSYGV is the eigensolver for the generalized real symmetric (Hermitian) eigenvalue problem.
Subroutine GA_pdsyevd(g_a, g_b, eval, nb8)
PDSYEVD uses the parallel divide-and-conquer algorithm [3]. The problem is first partitioned into sub-problems; these are solved independently, and their results are then merged using Cuppen's rank-one update.
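The merge step can be illustrated with a small serial sketch. After the sub-problems are solved, the remaining task is to find the eigenvalues of a diagonal matrix plus a rank-one term, diag(d) + rho * z z^T, which are the roots of the secular equation 1 + rho * sum(z_i^2 / (d_i - lambda)) = 0. The function name `secular_roots` is hypothetical, and the sketch assumes rho > 0, strictly increasing d and nonzero z_i (no deflation); it uses plain bisection rather than the root finder of any real library.

```python
def secular_roots(d, z, rho, iterations=200):
    """Eigenvalues of diag(d) + rho * z z^T (rho > 0, d strictly increasing,
    every z[i] nonzero), one root per interval between consecutive poles."""

    def f(lam):
        return 1.0 + rho * sum(zi * zi / (di - lam) for di, zi in zip(d, z))

    # The roots interlace the poles d[i]; the last root lies below this bound.
    upper = d[-1] + rho * sum(zi * zi for zi in z)
    bounds = list(d) + [upper]
    roots = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            if mid == lo or mid == hi:
                break                   # interval cannot be split any further
            if f(mid) < 0.0:
                lo = mid                # f increases between poles: root is to the right
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots


# diag(1, 3) + [1, 1][1, 1]^T is the matrix [[2, 1], [1, 4]].
roots = secular_roots([1.0, 3.0], [1.0, 1.0], 1.0)
```

Because each root is confined to its own interval, the roots can be computed independently of one another, which is what makes this step attractive for parallel execution.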
Subroutine GA_pdsyevr(g_a, g_b, eval, nb8)
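PDSYEVR uses the ScaLAPACK block algorithms for the real symmetric problem: the matrix is reduced to tridiagonal form and the eigenproblem is then solved with the MRRR algorithm (the QR algorithm is the one used by the PDSYEV driver).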
File: global/src/scalapack.F
GA-PEIGS
Subroutine GA_diag_std(g_a, g_v, eval)
Solves the standard symmetric eigenvalue problem, returning all eigenvectors and eigenvalues in ascending order. It calls PDSPEV (PeIGS 3.0) [5], which implements an optimised inverse iteration algorithm with orthogonalization performed in parallel [4]. The Block Factored Jacobi method (BFG) [6] is an iterative procedure that repeatedly applies orthogonal Jacobi rotations which, depending on their ordering, bring the matrix more or less quickly to diagonal form. No initial reduction of the matrix to tridiagonal form is required.
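The idea of repeated Jacobi rotations can be shown with a short serial sketch. This is the plain cyclic Jacobi method on a dense matrix, not the block-factored parallel variant of [6] and not PeIGS code; the function name `jacobi_eigenvalues` and the row-by-row sweep order are assumptions made for illustration.

```python
import math


def jacobi_eigenvalues(matrix, tol=1e-12, max_sweeps=50):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations.
    Works on the full matrix; no tridiagonal reduction is performed."""
    n = len(matrix)
    a = [row[:] for row in matrix]      # work on a copy
    for _ in range(max_sweeps):
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break                       # matrix is numerically diagonal
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-300:
                    continue
                # Angle chosen so that the rotated entry a[p][q] becomes zero.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):      # A <- A J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):      # A <- J^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


eigenvalues = jacobi_eigenvalues([[2.0, -1.0, 0.0],
                                  [-1.0, 2.0, -1.0],
                                  [0.0, -1.0, 2.0]])
```

The order in which the (p, q) pairs are visited is the "order of the rotations" mentioned above; different orderings change how fast the off-diagonal part shrinks, and independent pairs are what a parallel variant can rotate concurrently.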
Subroutine GA_diag(g_a, g_s, g_v, eval)
Solves the generalized eigenvalue problem, returning all eigenvectors and eigenvalues in ascending order. It calls PDSPGV.
Subroutine ga_diag_reuse(reuse,g_a, g_s, g_v, eval)
Solves the generalized eigenvalue problem, returning all eigenvectors and eigenvalues in ascending order, and reuses the factorized g_s across calls.
reuse:
0: first call (g_s is factorized)
>0: subsequent calls (the stored factorization is reused)
<0: only deletes the factorized g_s
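The three values of reuse describe a small caching protocol, sketched below in serial form. The class `ReuseCache`, its method `call`, and the use of a Cholesky factorization are assumptions made for illustration; the notes only say that the factorized g_s is kept between calls, and this is not the code in ga_diag.F.

```python
import math


def cholesky(s):
    """Lower-triangular L with S = L L^T for a symmetric positive definite S."""
    n = len(s)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = s[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(acc) if i == j else acc / l[j][j]
    return l


class ReuseCache:
    """Hypothetical stand-in for the state kept between ga_diag_reuse calls."""

    def __init__(self):
        self.factor = None
        self.factorizations = 0         # counts how often S was factorized

    def call(self, reuse, s=None):
        if reuse < 0:                   # <0: only delete the stored factor
            self.factor = None
            return None
        if reuse == 0:                  # 0: first call, factorize and store
            self.factor = cholesky(s)
            self.factorizations += 1
        elif self.factor is None:       # >0 without a stored factor is an error here
            raise RuntimeError("reuse > 0 requested before a call with reuse = 0")
        return self.factor              # >0: the stored factor is returned as-is


cache = ReuseCache()
first = cache.call(0, [[4.0, 2.0], [2.0, 3.0]])   # factorizes S
second = cache.call(1)                             # reuses the stored factor
cache.call(-1)                                     # frees it
```

This pattern pays off when the same g_s is used for many different g_a, as the expensive factorization is then done once.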
File: global/src/ga_diag.F
NWCHEM (PEIGS or SCALAPACK)
NWCHEM can be used either with PEIGS or with SCALAPACK. The PEIGS subroutines are activated by default.
To use NWCHEM with SCALAPACK instead, define the following preprocessor symbols when building:
PARALLEL_DIAG
SCALAPACK
Configuring Global Arrays in BGP with ScaLapack
../ga-5-0/configure --prefix=/gpfs/home/avazquez/soft/nwchem/nwchem-6.0/src/tools/install \
--host=powerpc-bgp-linux \
--with-tcgmsg \
--with-mpi \
--enable-peigs \
--disable-mpi-tests \
--with-blas="-L/soft/apps/ESSL-4.4.1-0/lib -L/soft/apps/ibmcmp-apr2011/xlf/bg/11.1/bglib -L/soft/apps/ibmcmp-apr2011/xlsmp/bg/1.7/bglib -lesslbg -lxlf90_r -lxlsmp -lpthread" \
--with-dcmf \
--with-scalapack="/home/avazquez/soft/algebra/scalapack/scalapack-1.8.0/scalapack_.a \
/home/avazquez/soft/algebra/scalapack/blacs-mpi-1.1/LIB/blacsF77init_MPI-BGP-0.a \
/home/avazquez/soft/algebra/scalapack/blacs-mpi-1.1/LIB/blacs_MPI-BGP-0.a \
/home/avazquez/soft/algebra/lapack-essl/lapack-3.3.0/lapack_BGPESSL.a \
-L/soft/apps/ibmcmp/xlsmp/bg/1.7/bglib/ -lxlomp_ser" \
CC=mpicc F77=mpixlf90_r MPICC=mpicc \
--enable-underscoring
BLACS BGP
COMMLIB = MPI
PLAT = BGP
BLACSdir = $(BTOPdir)/LIB
BLACSDBGLVL = 0
BLACSFINIT = $(BLACSdir)/blacsF77init_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL).a
BLACSCINIT = $(BLACSdir)/blacsCinit_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL).a
BLACSLIB = $(BLACSdir)/blacs_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL).a
SMPLIB = "\
-L/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/lib \
-L/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/lib -Wl,-rpath -Wl,/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/lib -lmpich.cnk -lopa \
-L/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/sys/lib -Wl,-rpath,/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/sys/lib -ldcmf.cnk -ldc
MPILIB = "\
-L/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/lib \
-L/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/lib -Wl,-rpath -Wl,/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/lib -lmpich.cnk -lopa \
-L/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/sys/lib -Wl,-rpath,/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/sys/lib -ldcmf.cnk -ldcmfcoll.cnk -lpthread \
-L/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/runtime/SPI -Wl,-rpath,/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/runtime/SPI -lSPI.cna -lrt"
MPIINCdir =/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/include
INSTdir = $(BTOPdir)/INSTALL/EXE
SYSINC = -I$(MPIINCdir)
INTFACE = -DAdd_
SENDIS =
BUFF =
TRANSCOMM = -DUseMpich
WHATMPI = -DUseF77Mpi
F77 = tbgxlf_r
CC = powerpc-bgp-linux-gcc
NOOPT = -w $(FPIC)
F77FLAGS = -g -O2 -qnoipa -qfloat=rsqrt:fltint -qstrict -qxlf77=leadzero -qthreaded -qnosave -qalign=4k -qarch=450d -qtune=450 -qextname
CCFLAGS = -qarch=450 -qtune=450 -g -O3 -qstrict -g -O2
SRCFLAG =
F77LOADER = $(F77)
CCLOADER = $(F77)
SYSLIBS = -L/soft/apps/ibmcmp/xlf/bg/11.1/bglib -lxlf90_r -lxlfmath -lxlopt -lxl -L/soft/apps/ibmcmp/xlsmp/bg/1.7/bglib/ -lxlomp_ser
F77LOADFLAGS =
CCLOADFLAGS =
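The BLACS settings above determine the archive names that the ScaLAPACK and Global Arrays builds must point to. The following sketch mimics how make expands BLACSFINIT, BLACSCINIT and BLACSLIB from COMMLIB, PLAT and BLACSDBGLVL; the function `blacs_library_names` is a hypothetical helper for checking paths and is not part of BLACS.

```python
def blacs_library_names(blacs_dir, commlib="MPI", plat="BGP", dbglvl=0):
    """Expand the BLACS archive names the same way the make variables do."""
    suffix = f"{commlib}-{plat}-{dbglvl}.a"
    return {
        "BLACSFINIT": f"{blacs_dir}/blacsF77init_{suffix}",
        "BLACSCINIT": f"{blacs_dir}/blacsCinit_{suffix}",
        "BLACSLIB": f"{blacs_dir}/blacs_{suffix}",
    }


# /myBLACS/LIB is the placeholder directory; substitute the real $(BTOPdir)/LIB.
names = blacs_library_names("/myBLACS/LIB")
for variable, path in names.items():
    print(f"{variable} = {path}")
```

Changing PLAT or BLACSDBGLVL changes every archive name, so the paths passed to --with-scalapack and to the ScaLAPACK make include file have to be updated together.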
ScaLapack in BGP
Dependencies:
Lapack/ESSL http://www.netlib.org/lapack/essl/
BLACS
ESSL
home = /mydir
PLAT = BGP
#
BLACSDBGLVL = 0
BLACSdir = /myBLACS
#
# MPI setup; tailor to your system if using MPIBLACS
#
USEMPI = -DUsingMpiBlacs
SMPLIB = \
-L/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/lib \
-L/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/lib -Wl,-rpath -Wl,/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/lib -lmpich.cnk -lopa \
-L/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/sys/lib -Wl,-rpath,/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/sys/lib -ldcmf.cnk -ldcmfcoll.cnk -lpthread \
-L/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/runtime/SPI -Wl,-rpath,/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/runtime/SPI -lSPI.cna -lrt
BLACSFINIT = /myBLACS/LIB/blacsF77init_MPI-BGP-0.a
BLACSCINIT = /myBLACS/LIB/blacsCinit_MPI-BGP-0.a
BLACSLIB = /myBLACS/LIB/blacs_MPI-BGP-0.a
#
F77 = tbgxlf_r
CC = powerpc-bgp-linux-gcc
NOOPT = -w $(FPIC)
F77FLAGS = -g -O2 -qnoipa -qfloat=rsqrt:fltint -qstrict -qxlf77=leadzero -qthreaded -qnosave -qalign=4k -qarch=450d -qtune=450 -qextname
CCFLAGS = -qarch=450 -qtune=450 -g -O3 -qstrict -g -O2
SRCFLAG =
F77LOADER = $(F77)
CCLOADER = $(F77)
SYSLIBS = -L/soft/apps/ibmcmp/xlf/bg/11.1/bglib -lxlf90_r -lxlfmath -lxlopt -lxl -L/soft/apps/ibmcmp/xlsmp/bg/1.7/bglib/ -lxlomp_ser
F77LOADFLAGS =
CCLOADFLAGS =
CDEFS = -DAdd_ -DNO_IEEE $(USEMPI)
#
#
SCALAPACKLIB = $(home)/scalapack_$(MPI).a
BLASLIB = -L/soft/apps/ESSL-4.4.1-0/lib -L/soft/apps/ibmcmp-apr2011/xlf/bg/11.1/bglib -L/soft/apps/ibmcmp-apr2011/xlsmp/bg/1.7/bglib -lesslbg -lxlf90_r -lxlsmp -lpthread
LAPACKLIB = /mylib/lapack_BGPESSL.a
PBLIBS = $(SCALAPACKLIB) $(FBLACSLIB) $(BLASLIB) $(CBLACSLIB) $(BLASLIB) $(SMPLIB) $(LAPACKLIB)
PRLIBS = $(SCALAPACKLIB) $(CBLACSLIB) $(SMPLIB) $(BLASLIB) $(SYSLIBS)
RLIBS = $(SCALAPACKLIB) $(FBLACSLIB) $(CBLACSLIB) $(BLASLIB) $(SMPLIB)
LIBS = $(PBLIBS)
Lapack/ESSL http://www.netlib.org/lapack/essl/
#
# See the INSTALL/ directory for more examples.
#
SHELL = /bin/sh
#
# The machine (platform) identifier to append to the library names
#
PLAT = _BGPESSL
#
FORTRAN = tbgxlf_r -g -O2 -qnoipa -qfloat=rsqrt:fltint -qstrict -qxlf77=leadzero -qthreaded -qnosave -qalign=4k -qarch=450d -qtune=450 -qextname
DRVOPTS = $(OPTS)
LOADER = tbgxlf_r -g -O2
LOADOPTS =
NOOPT = -g -O0
TIMER = INT_CPU_TIME
#
ARCH = ar
ARCHFLAGS= cr
RANLIB = ranlib
#
#
BLASLIB = ../../blas$(PLAT).a
#
# Location of the extended-precision BLAS (XBLAS) Fortran library
# used for building and testing extended-precision routines. The
# relevant routines will be compiled and XBLAS will be linked only if
# USEXBLAS is defined.
#
# USEXBLAS = Yes
XBLASLIB =
# XBLASLIB = -lxblas
#
# Names of generated libraries.
#
LAPACKLIB = lapack$(PLAT).a
TMGLIB = tmglib$(PLAT).a
EIGSRCLIB = eigsrc$(PLAT).a
LINSRCLIB = linsrc$(PLAT).a