NWChem: Test for CCSD(T) & CCSD[T] calculation with GeForce GTX 1050 Ti


This post tests the coupled-cluster methods CCSD(T) and CCSD[T] with the cc-pVDZ basis set on a geometry optimization of the water molecule. In this test, an NVIDIA GeForce GTX 1050 Ti is used as an accelerator to speed up the quantum chemical calculation compared with running on the CPU only.

GeForce GTX 1050 Ti graphics card

Ref: https://www.tokopedia.com/

Specification of the graphics card

Graphics Engine  : NVIDIA GeForce GTX 1050 Ti
Bus Standard     : PCI Express 3.0
OpenGL           : OpenGL 4.5
Video Memory     : GDDR5 4GB
Engine Clock     : GPU Boost Clock 1392 MHz, GPU Base Clock 1290 MHz
CUDA Cores       : 768
Memory Clock     : 7008 MHz (DDR2)
Memory Interface : 128-bit


CPU info

I used an AMD Ryzen Threadripper 1950X. For details and performance, please visit this post.


NWChem compilation details

Here is the script I used to compile NWChem with OpenMPI v. 2.0.2 and CUDA v. 9.1 on CentOS 7.

https://github.com/rangsimanketkaew/NWChem/blob/master/script/CentOS-OpenMPI-CUDA.sh

Note: make sure that the architecture you specify in CUDA_FLAGS is correct and supported by nvcc.

Run nvcc --help and read its help page for more details. The following is the portion of the compile script I used. Here I set the -arch argument to sm_50.

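# Build the TCE module with CUDA support
export TCE_CUDA=Y
# CUDA runtime library location (adjust to your CUDA install path)
export CUDA_LIBS="-L/usr/local/cuda-9.1/lib64/ -lcudart"
# Target GPU architecture passed to nvcc
export CUDA_FLAGS="-arch sm_50"
# CUDA header location
export CUDA_INCLUDE="-I. -I/usr/local/cuda-9.1/include/"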
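As a rough sketch of how the -arch value relates to the card, the helper below turns a compute capability such as 6.1 into the matching nvcc flag. The function name arch_flag is made up for this illustration and is not part of NWChem or CUDA. The GTX 1050 Ti is a Pascal card with compute capability 6.1, so its native target would be sm_61, while the script above uses sm_50. As far as I understand, code built for sm_50 can still run on the newer card because nvcc also embeds PTX that the driver compiles at run time, but check the nvcc documentation before relying on that.

```shell
#!/bin/sh
# Hypothetical helper (not part of NWChem or CUDA): convert a compute
# capability like "6.1" into the matching nvcc -arch flag.
arch_flag() {
    cc="$1"
    case "$cc" in
        [0-9].[0-9]) ;;                              # accept only the form X.Y
        *) echo "usage: arch_flag X.Y" >&2; return 1 ;;
    esac
    major="${cc%.*}"   # text before the dot
    minor="${cc#*.}"   # text after the dot
    echo "-arch sm_${major}${minor}"
}

# Example for a card with compute capability 6.1 (left commented out on purpose):
# export CUDA_FLAGS="$(arch_flag 6.1)"
```

Calling arch_flag 5.0 reproduces the flag used in the script above; whichever value you choose, it still has to be one that your installed nvcc accepts.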


I am not an expert. When I get confused or have a difficult question, I always consult the manual or ask the program developers!


Is your calculation using the GPU?

On Linux, it is easy to check the status of your GPU. For an NVIDIA card, just type nvidia-smi and it will show basic information like that below.

Status of the GTX 1050 Ti while it was running an NWChem calculation

The output shows

  • The NVIDIA driver version is 387.26.
  • There is one GPU card in the machine.
  • The GPU fan speed is 20%.
  • The temperature is 43 degrees Celsius.
  • The process name shows which software is using the GPU.
  • The memory usage of each running process is listed.
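The same fields can also be pulled out in a script-friendly form. The sketch below assumes that the nvidia-smi shipped with your driver supports the --query-gpu and --query-compute-apps options and the field names used here (nvidia-smi --help-query-gpu lists what is available). The function name gpu_status is only illustrative.

```shell
#!/bin/sh
# Illustrative wrapper around nvidia-smi; gpu_status is a made-up name.
gpu_status() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; is the NVIDIA driver installed?" >&2
        return 1
    fi
    # Per-GPU summary: card name, driver version, fan speed, temperature, memory in use
    nvidia-smi --query-gpu=name,driver_version,fan.speed,temperature.gpu,memory.used \
               --format=csv
    # Processes currently running compute work on the GPU
    nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv
}
```

Running gpu_status while a job is active should list nwchem among the compute processes if the calculation really is using the card.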

Input file and molecule details

#@@  Sample NWChem input for Coupled Cluster (CC) calculation
#@@  using TCE module in NWChem 6.8 and enabling CUDA.
start tce_ccsd_t_h2o
echo
memory total 8 GB
geometry units bohr
O     0.00000000     0.00000000     0.22138519
H     0.00000000    -1.43013023    -0.88554075
H     0.00000000     1.43013023    -0.88554075
end

basis spherical
H library cc-pVDZ
O library cc-pVDZ
end

scf
  thresh 1.0e-10
  tol2e 1.0e-10
  singlet
  rhf
end

tce
  ccsd(t)
  io ga
  cuda 1
end

task tce energy
task tce optimize


Confirm that CUDA is enabled and used

The following is part of the output file for a CCSD(T) calculation accelerated by CUDA. The key line that confirms your calculation is using CUDA is "Using CUDA CCSD(T) code", which appears in the energy summary after the iterations have finished.

...
 CCSD(T)
 Using CUDA CCSD(T) code
Using   1 device per node
 CCSD[T]  correction energy / hartree =        -0.007842454657787
 CCSD[T] correlation energy / hartree =        -0.385821575619310
 CCSD[T] total energy / hartree       =      -145.328578427813056
 CCSD(T)  correction energy / hartree =        -0.007637197317536
 CCSD(T) correlation energy / hartree =        -0.385616318279060
 CCSD(T) total energy / hartree       =      -145.328373170472815
 Cpu & wall time / sec            0.3            1.8
...
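To avoid scrolling through a long output file, the check can be scripted. The sketch below looks for the confirmation line and then prints the total-energy lines; the function name cuda_check and the output file name in the usage note are hypothetical.

```shell
#!/bin/sh
# Illustrative helper; cuda_check is a made-up name.
cuda_check() {
    out="$1"
    # -F matches the text literally, so the parentheses need no escaping
    if grep -qF "Using CUDA CCSD(T) code" "$out"; then
        echo "CUDA (T) code was used"
    else
        echo "CUDA (T) code was NOT used"
        return 1
    fi
    # Print the total energies; -F again so [T] and (T) are not read as regex syntax
    grep -F -e "CCSD[T] total energy" -e "CCSD(T) total energy" "$out"
}
```

For example, cuda_check tce_ccsd_t_h2o.out prints the verdict followed by the energy lines. In a geometry optimization these lines are printed more than once, so the last pair should correspond to the final geometry.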


Test Results

I have not had free time to update and finish this post yet. Stay tuned for updates.

(Last updated: January 10th, 2019)


Rangsiman Ketkaew