Debugging and Profiling Codes

Debugging with gdb

0. Build the source code with -g

1. Launch gdb with the app

gdb ./exec

gdb --args ./exec param1 param2

2. Create breakpoints before running the app

gdb> b source.cpp:124

or break only when a certain condition holds

gdb> b source.cpp:124 if variable == 4

3. Run the app with command-line arguments

gdb> run param1 param2

4. When the app stops at breakpoints,

- show the call stack (backtrace)

gdb> bt

- show the call stack (backtrace) with n innermost calls

gdb> bt n

- step up the stack:

gdb> up

- step down the stack:

gdb> down

- select a specific frame in the stack, using the frame number shown by backtrace

gdb> f [frame-number]

- show the source code in the vicinity of the breakpoint

gdb> list

- print variable values:

gdb> p var1

- execute the next statement (stepping over function calls):

gdb> n

- continue to run:

gdb> c
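
The interactive steps above can also be scripted. The following is a minimal sketch, assuming the same placeholder names used above (./exec, source.cpp:124, variable, param1, param2) and a hypothetical command file name debug.gdb. It writes the gdb commands to a file and replays them with gdb's -batch and -x options.

```shell
#!/bin/sh
# Write the gdb commands to a command file (debug.gdb is a hypothetical name).
cat > debug.gdb <<'EOF'
break source.cpp:124 if variable == 4
run
bt
info locals
EOF

# Replay the commands non-interactively, but only if gdb and the binary exist.
if command -v gdb >/dev/null 2>&1 && [ -x ./exec ]; then
    gdb -batch -x debug.gdb --args ./exec param1 param2
else
    echo "gdb or ./exec not found; debug.gdb was written but not run"
fi
```

Batch mode is handy for capturing a backtrace on a remote machine or in a job script, where no interactive terminal is available.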

Debug multi-process MPI applications with gdb

Once the application has been built with the -g flag, it can be debugged with gdb:

mpirun -np 4 xterm -e gdb --args binary_name input_parameters

This launches 4 xterm terminals, one per MPI process, each running its own gdb session. Breakpoints can be created in any of the terminals as usual (a breakpoint applies only to the process in that terminal). Then execute

run

in every terminal to start debugging.
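
Typing the commands in every window can be avoided by passing them to gdb with -ex. This is a sketch only, assuming the same placeholder names as above (binary_name, input_parameters), the placeholder breakpoint source.cpp:124 from the gdb section, and that mpirun and xterm are installed.

```shell
#!/bin/sh
# Number of MPI processes (4, as in the example above).
NP=4

if command -v mpirun >/dev/null 2>&1 && command -v xterm >/dev/null 2>&1; then
    # Each -ex command runs in order when the gdb session opens:
    # first set the breakpoint, then start the program.
    # gdb stays open afterwards, so each process can be inspected as usual.
    mpirun -np "$NP" xterm -e gdb -ex 'break source.cpp:124' -ex run \
        --args binary_name input_parameters
else
    echo "mpirun or xterm not found; skipping"
fi
```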

Monitor memory consumption with valgrind massif

valgrind --tool=massif ./application args

This command records the memory (heap) consumption of the application over time, which helps track down memory growth or leaks caused by particular function calls. The output file massif.out.* (the suffix is the process ID) can be visualized with massif-visualizer.
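
When no GUI is available, the profile can also be rendered as text with ms_print, which ships with valgrind. A minimal sketch, assuming the placeholder ./application from above and hypothetical output file names:

```shell
#!/bin/sh
# Fixed output name instead of the default massif.out.<pid> (hypothetical name).
OUT=massif.out.run1

if command -v valgrind >/dev/null 2>&1 && [ -x ./application ]; then
    # Record the heap profile into $OUT.
    valgrind --tool=massif --massif-out-file="$OUT" ./application args
    # Render the profile as a text graph followed by the snapshot details.
    ms_print "$OUT" > massif_report.txt
else
    echo "valgrind or ./application not found; skipping"
fi
```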

Use valgrind to detect memory-related issues

I found the following website that is helpful:

http://www.cprogramming.com/debugging/valgrind.html

For convenience, a few notes extracted from it:

1. Compile the program with -g so that the debug symbols are stored in the binary, say, a.out

2. Invoke valgrind with the binary

valgrind --tool=memcheck --leak-check=yes --track-origins=yes ./a.out arg1 arg2

3. Interpret the most common output messages:

- "blocks are definitely lost in loss record": memory allocated with malloc() or new was never freed; see the reported line for where it was allocated.
- "Invalid write of size" together with "Address XX is 0 bytes after a block of size YY alloc'd": an out-of-bounds write just past the end of an allocated array.
- "Conditional jump or move depends on uninitialised value(s)": an uninitialized variable is being used.
- "Invalid free()" or "Mismatched free() / delete / delete []": a double free, a free of an invalid pointer, or memory released with the wrong deallocator (for example, new[] paired with delete or free()).
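
For long runs it can help to send the valgrind messages to a file and search it for the messages listed above. The sketch below assumes the placeholder ./a.out from step 1; the log file name and the helper function count_memcheck_issues are hypothetical. --log-file is a standard valgrind option.

```shell
#!/bin/sh
# Count the lines of a memcheck log that contain one of the messages above.
count_memcheck_issues() {
    grep -c -E 'definitely lost|Invalid write of size|uninitialised value|Invalid free|Mismatched free' "$1"
}

LOG=memcheck.log   # hypothetical log file name

if command -v valgrind >/dev/null 2>&1 && [ -x ./a.out ]; then
    # --log-file sends valgrind's messages to $LOG instead of stderr.
    valgrind --tool=memcheck --leak-check=yes --track-origins=yes \
             --log-file="$LOG" ./a.out arg1 arg2
    echo "matching lines: $(count_memcheck_issues "$LOG")"
fi
```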

Profile CUDA codes with nvvp

NVVP (NVIDIA Visual Profiler) comes with the CUDA Toolkit. For CUDA Toolkit 11.x, NVVP is compatible with Java 1.8 (for its Eclipse GUI) and crashes with the default Java 11. Install OpenJDK 1.8 with

sudo dnf install java-1.8.0-openjdk.x86_64

then run "sudo alternatives --config java" to switch to java-1.8, and launch nvvp from the command line.

Profile MPI applications using VampirTrace

1. Configure Makefile to use vtCC

- Load the GNU programming environment and the CUDA toolkit

- Load the module vampirtrace

module load vampirtrace

- Modify the Makefile so that CCFLAGS and LINKFLAGS use the compiler wrapper: vtCC -vt:mpi (MPI only) or vtCC -vt:hyb (hybrid MPI plus threads)

- Show include/library paths and linked libs of vtCC -vt:mpi (or vtCC -vt:hyb) with -v

vtCC -vt:mpi -v

- Note the version of the PAPI library: -L/../papi/perf_events/cuda/lib

- Build the app

2. Configure the PBS script

- module load vampirtrace

- Specify vampirtrace variables

export VT_PFORM_GDIR=traces

if [ ! -e "$VT_PFORM_GDIR" ]; then

    mkdir "$VT_PFORM_GDIR"

fi

export VT_FILTER_SPEC=./myfilter.spec

export VT_MAX_FLUSHES=10

export VT_METRICS=PAPI_FP_OPS:PAPI_TOT_INS

export VT_MPITRACE=yes

export VT_BUFFER_SIZE=32M

- The filter file myfilter.spec looks like the following:

final_integrate -- 1000

* -- 2000

meaning that calls to the function final_integrate are recorded no more than 1000 times, and calls to any other function no more than 2000 times.
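
The trace directory and the filter file can be created directly from the PBS script, so the job does not depend on files prepared by hand. A minimal sketch using the names from above (traces, myfilter.spec, final_integrate):

```shell
#!/bin/sh
# Trace output directory; -p creates it only if it is missing.
export VT_PFORM_GDIR=traces
mkdir -p "$VT_PFORM_GDIR"

# Write the filter file: one "<function pattern> -- <call limit>" rule per line.
cat > myfilter.spec <<'EOF'
final_integrate -- 1000
* -- 2000
EOF
export VT_FILTER_SPEC=./myfilter.spec
```

Using mkdir -p replaces the explicit existence test, and the quoted 'EOF' keeps the shell from expanding the * in the filter rules.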

3. Submit the PBS script; the trace files will be written to the folder traces.

4. Copy the folder traces back to a local machine (if of small size) and open the .otf file using VampirClient.

If the traces folder is large, load vampirserver and open the .otf file in the folder traces with it. Then, from the local machine, launch VampirClient and connect to the port given by the server.

Profile applications using CrayPat

The official instructions are given at: http://www.olcf.ornl.gov/?kb_articles=software-jaguar-craypat&menu=software

1. Build application using perftools

module load PrgEnv-gnu

module unload cudatoolkit (<-- if no cuda library is available)

module load perftools

make clean

make

2. Instrument the binary using Automatic Profiling Analysis (APA)

pat_build -O apa my_app

which generates an instrumented binary named my_app+pat

3. Run the instrumented binary

aprun -n 16 my_app+pat < input.txt

4. Generate the profile report from the generated .xf file

pat_report -T -o report.txt my_app+pat.xf

Profiling results are stored in the generated report.txt and .apa files.
