NWChem: Test for CCSD(T) & CCSD[T] calculation with GeForce GTX 1050 Ti

This post tests a coupled cluster geometry optimization of a water molecule using CCSD(T) and CCSD[T] with the cc-pVDZ basis set. In this test, an NVIDIA GeForce GTX 1050 Ti is used as an accelerator to push the quantum chemical calculation beyond what the CPU alone can deliver.

GeForce GTX 1050 Ti graphics card

Ref: https://www.tokopedia.com/

Specification of the graphics card

Graphics Engine: NVIDIA GeForce GTX 1050 Ti

Bus Standard: PCI Express 3.0

OpenGL: OpenGL 4.5

Video Memory: GDDR5 4GB

Engine Clock: GPU Boost Clock 1392 MHz, GPU Base Clock 1290 MHz

CUDA Cores: 768

Memory Clock: 7008 MHz (DDR2)

Memory Interface: 128-bit

CPU info

I used an AMD Ryzen Threadripper 1950X. For details and performance, please visit this post.

NWChem compilation details

Here is the script I used to compile NWChem with OpenMPI v. 2.0.2 and CUDA v. 9.1 on CentOS 7.

https://github.com/rangsimanketkaew/NWChem/blob/master/script/CentOS-OpenMPI-CUDA.sh

Note: make sure that the architecture you specify in CUDA_FLAGS is correct and supported by nvcc.

Run nvcc --help and read its help page for more details. The following is the portion of the compile script I used. Here I set the -arch argument to sm_50.

export TCE_CUDA=Y

export CUDA_LIBS="-L/usr/local/cuda-9.1/lib64/ -lcudart"

export CUDA_FLAGS="-arch sm_50 "

export CUDA_INCLUDE="-I. -I/usr/local/cuda-9.1/include/"
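The -arch value is tied to the compute capability of the card. As a minimal sketch, the helper below turns a compute capability string into the matching nvcc architecture name. The function name cc_to_arch is hypothetical and not part of NWChem or CUDA, and the value 6.1 is an assumption based on NVIDIA's published compute capability for the GTX 1050 Ti. This post was built with sm_50, and whether sm_61 makes any difference for the TCE kernels was not tested here.

```shell
# Hypothetical helper (not part of NWChem or CUDA): convert a compute
# capability such as "6.1" into an nvcc architecture name such as "sm_61".
cc_to_arch() {
    # Drop the dot from the capability and prefix the result with "sm_".
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Assumption: the GTX 1050 Ti has compute capability 6.1.
arch="$(cc_to_arch 6.1)"
export CUDA_FLAGS="-arch ${arch}"
echo "$CUDA_FLAGS"
```

Before rebuilding with a different value, check in nvcc --help that the installed CUDA version lists that architecture.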

I am not an expert. When I get confused or have a difficult question, I always consult the manual or ask the program developers!

Manual of the Tensor Contraction Engine (TCE) module: https://github.com/nwchemgit/nwchem/wiki/TCE. The GPU information is at the bottom of the page.

Is your calculation using the GPU?

On Linux, it is easy to check the status of your GPU. For NVIDIA cards, just type nvidia-smi and it will show basic information like that below.

Status of the GTX 1050 Ti while it was running an NWChem calculation

The output shows

The version of the NVIDIA driver is 387.26.
There is one GPU card on the machine.
The GPU fan speed is 20%.
The temperature is 43 degrees Celsius.
The process name shows the name of the software using the GPU.
The memory usage of each running process.
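The same fields can also be requested in CSV form, which is easier to read from a script than the default table. The sketch below uses the --query-gpu and --query-compute-apps options of nvidia-smi. The function names are hypothetical, and matching on the string "nwchem" assumes that the NWChem binary is actually named that way on your machine.

```shell
# Print one CSV line per GPU: name, driver version, fan speed, temperature, memory used.
gpu_status() {
    nvidia-smi --query-gpu=name,driver_version,fan.speed,temperature.gpu,memory.used \
               --format=csv,noheader
}

# Print one CSV line per compute process currently using the GPU.
gpu_processes() {
    nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv,noheader
}

# Hypothetical check: read process lines from stdin and succeed if one of them
# mentions "nwchem" (assumed binary name). Reading stdin lets it be tried without a GPU.
nwchem_on_gpu() {
    grep -qi 'nwchem'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    gpu_status || true
    if gpu_processes | nwchem_on_gpu; then
        echo "NWChem is running on the GPU"
    fi
else
    echo "nvidia-smi not found"
fi
```

Running this while the calculation is in its CCSD(T) step should list the NWChem process, just as the process table of plain nvidia-smi does.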

Input file and molecule details

#@@  Sample NWChem input for Coupled Cluster (CC) calculation

#@@  using TCE module in NWChem 6.8 and enabling CUDA.

start tce_ccsd_t_h2o

echo

memory total 8 GB

geometry units bohr

O     0.00000000     0.00000000     0.22138519

H     0.00000000    -1.43013023    -0.88554075

H     0.00000000     1.43013023    -0.88554075

end

basis spherical

H library cc-pVDZ

O library cc-pVDZ

end

scf

  thresh 1.0e-10

  tol2e 1.0e-10

  singlet

  rhf

end

tce

  ccsd(t)

  io ga

  cuda 1

end

task tce energy

task tce optimize

Confirm that CUDA is enabled and used

The following is part of the output file for a CCSD(T) calculation accelerated by CUDA. The important line, which confirms that your calculation is using CUDA, is "Using CUDA CCSD(T) code". It appears in the energy summary printed after the CCSD iterations have finished.

...

 CCSD(T)

 Using CUDA CCSD(T) code

Using   1 device per node

 CCSD[T]  correction energy / hartree =        -0.007842454657787

 CCSD[T] correlation energy / hartree =        -0.385821575619310

 CCSD[T] total energy / hartree       =      -145.328578427813056

 CCSD(T)  correction energy / hartree =        -0.007637197317536

 CCSD(T) correlation energy / hartree =        -0.385616318279060

 CCSD(T) total energy / hartree       =      -145.328373170472815

 Cpu & wall time / sec            0.3            1.8

...
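For long output files it is quicker to search for that line than to scroll. The following is a minimal sketch: used_cuda and ccsd_t_energy are hypothetical helper names, and the sample file only reproduces two lines of the output shown above so that the helpers can be tried without running NWChem.

```shell
# Hypothetical helper: succeed if the NWChem output reports the CUDA CCSD(T) code path.
used_cuda() {
    grep -qF 'Using CUDA CCSD(T) code' "$1"
}

# Hypothetical helper: print the value from the last "CCSD(T) total energy" line.
# A geometry optimization prints this line once per step, so the last one is taken.
ccsd_t_energy() {
    grep -F 'CCSD(T) total energy / hartree' "$1" | tail -n 1 | awk '{print $NF}'
}

# Small sample file built from two lines of the output above.
out="$(mktemp)"
cat > "$out" <<'EOF'
 Using CUDA CCSD(T) code
 CCSD(T) total energy / hartree       =      -145.328373170472815
EOF

if used_cuda "$out"; then
    echo "CUDA CCSD(T) code was used"
fi
ccsd_t_energy "$out"
rm -f "$out"
```

On a real run, pass the path of your own output file to both helpers instead of the sample file.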

Test Results

I have not had free time to update and finish this post. Stay tuned for updates.

(Last updated: January 10th, 2019)

Rangsiman Ketkaew