This post tests coupled cluster geometry optimization of a water molecule using CCSD(T) and CCSD[T] with the cc-pVDZ basis set. In this test an NVIDIA GTX 1050 Ti is used as an accelerator to speed up the quantum chemistry calculation beyond what the CPU alone can deliver.
GeForce GTX 1050 Ti graphics card
Ref: https://www.tokopedia.com/
Specification of Special Item
Graphics Engine: NVIDIA GeForce GTX 1050 Ti
Bus Standard: PCI Express 3.0
OpenGL: OpenGL 4.5
Video Memory: GDDR5 4GB
Engine Clock: GPU Boost Clock: 1392 MHz, GPU Base Clock: 1290 MHz
CUDA Core: 768
Memory Clock: 7008 MHz ( DDR2 )
Memory Interface: 128-bit
CPU info
I used a Ryzen Threadripper 1950X. For details and performance, please visit this post.
NWChem compilation details
Here is the script I used to compile NWChem with OpenMPI v. 2.0.2 and CUDA v. 9.1 on CentOS 7.
https://github.com/rangsimanketkaew/NWChem/blob/master/script/CentOS-OpenMPI-CUDA.sh
Note: you have to make sure that the architecture you specify in CUDA_FLAGS is correct and supported by nvcc.
Run nvcc --help and read its help page for more details. The following is the portion of the compile script I used. Here I set the -arch argument to sm_50.
export TCE_CUDA=Y
export CUDA_LIBS="-L/usr/local/cuda-9.1/lib64/ -L/usr/local/cuda-9.1/lib64/ -lcudart"
export CUDA_FLAGS="-arch sm_50 "
export CUDA_INCLUDE="-I. -I/usr/local/cuda-9.1/include/"
I am not an expert. When I get confused or have a difficult question, I always consult the manual or ask the program developers!
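As a small illustration of the CUDA_FLAGS setting above, the -arch value can be derived from the card's compute capability instead of typed by hand. This is only a sketch: cc_to_arch is a helper name made up for this example (it is not part of NWChem or CUDA), and the compute capability still has to be looked up yourself. NVIDIA lists the GTX 1050 Ti as compute capability 6.1, whereas the build in this post used sm_50.

```shell
#!/bin/sh
# cc_to_arch is a hypothetical helper, not an NWChem or CUDA tool.
# It turns a compute capability such as "6.1" into an nvcc -arch value.
cc_to_arch() {
    # Drop the dot ("6.1" -> "61") and prefix the result with "sm_".
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Example: build CUDA_FLAGS for a compute capability 6.1 card.
CUDA_FLAGS="-arch $(cc_to_arch 6.1)"
export CUDA_FLAGS
echo "$CUDA_FLAGS"
```

Whichever value you end up with, it is still worth checking against nvcc --help for your CUDA version before compiling.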
Is your calculation using the GPU?
On Linux it is very easy to check the status of your GPU. For NVIDIA cards, just type nvidia-smi and it will show basic information like that below.
Status of the GTX 1050 Ti while it was running an NWChem calculation
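If you would rather check this from a script than read the nvidia-smi table by eye, something like the following may work. It is a rough sketch, not something taken from the NWChem documentation: has_gpu_process is a made-up helper, and it assumes the NWChem executable shows up with "nwchem" somewhere in its process name. The --query-compute-apps and --format options are standard nvidia-smi options.

```shell
#!/bin/sh
# has_gpu_process is a hypothetical helper. It reads rows of
# "pid, process_name, used_memory" on stdin and succeeds if any
# process name contains the given text.
has_gpu_process() {
    awk -F', ' -v pat="$1" 'index($2, pat) { found = 1 } END { exit !found }'
}

# Only query the GPU if nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    if nvidia-smi --query-compute-apps=pid,process_name,used_memory \
                  --format=csv,noheader | has_gpu_process nwchem; then
        echo "nwchem is running on the GPU"
    else
        echo "no nwchem process found on the GPU"
    fi
fi
```

Run it while the calculation is in progress; once the (T) step has finished, the process may no longer appear in the list.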
The output shows
Input file and molecule details
#@@ Sample NWChem input for Coupled Cluster (CC) calculation
#@@ using TCE module in NWChem 6.8 and enabling CUDA.
start tce_ccsd_t_h2o
echo
memory total 8 GB
geometry units bohr
O 0.00000000 0.00000000 0.22138519
H 0.00000000 -1.43013023 -0.88554075
H 0.00000000 1.43013023 -0.88554075
end
basis spherical
H library cc-pVDZ
O library cc-pVDZ
end
scf
  thresh 1.0e-10
  tol2e 1.0e-10
  singlet
  rhf
end
tce
  ccsd(t)
  io ga
  cuda 1
end
task tce energy
task tce optimize
Confirm that CUDA is enabled and used
The following is part of the output file for the CCSD(T) calculation accelerated by CUDA. The key phrase that confirms your calculation is using CUDA is "Using CUDA CCSD(T) code", which appears in the energy summary printed once the iterations have finished.
...
CCSD(T)
Using CUDA CCSD(T) code
Using 1 device per node
CCSD[T] correction energy / hartree = -0.007842454657787
CCSD[T] correlation energy / hartree = -0.385821575619310
CCSD[T] total energy / hartree = -145.328578427813056
CCSD(T) correction energy / hartree = -0.007637197317536
CCSD(T) correlation energy / hartree = -0.385616318279060
CCSD(T) total energy / hartree = -145.328373170472815
Cpu & wall time / sec 0.3 1.8
...
Test Results
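Until the full results are written up, here is a quick sanity check that can be run on an output file like the excerpt above. It is a sketch with made-up helper names (uses_cuda and ccsd_correlation): the first looks for the "Using CUDA CCSD(T) code" line, and the second subtracts the triples correction from the correlation energy. Since [T] and (T) are both corrections added on top of the same CCSD correlation energy, the two labels should give the same number.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting an NWChem TCE output file.

# Succeeds if the file contains the line printed when the CUDA (T) code is used.
uses_cuda() {
    grep -qF 'Using CUDA CCSD(T) code' "$1"
}

# Prints (correlation energy - triples correction) for the given label,
# either "CCSD(T)" or "CCSD[T]". If a label occurs several times
# (for example during an optimization), the last occurrence is used.
ccsd_correlation() {
    awk -v label="$2" '
        $1 == label && $2 == "correction"  { triples = $NF }
        $1 == label && $2 == "correlation" { corr = $NF }
        END { printf "%.10f\n", corr - triples }
    ' "$1"
}

# Demo on the numbers from this post, written to a temporary file.
out=$(mktemp)
cat > "$out" <<'EOF'
 Using CUDA CCSD(T) code
 CCSD[T] correction energy / hartree = -0.007842454657787
 CCSD[T] correlation energy / hartree = -0.385821575619310
 CCSD(T) correction energy / hartree = -0.007637197317536
 CCSD(T) correlation energy / hartree = -0.385616318279060
EOF

uses_cuda "$out" && echo "CUDA (T) code was used"
ccsd_correlation "$out" 'CCSD[T]'
ccsd_correlation "$out" 'CCSD(T)'
rm -f "$out"
```

For the numbers in this post both calls print -0.3779791210, so the [T] and (T) lines are consistent with a single underlying CCSD correlation energy. On a real run, replace the temporary file with the path to your own output file.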
I have not had free time to update and finish this post. Stay tuned for updates.
(Last updated: January 10th, 2019)
Rangsiman Ketkaew