CUDA Tools
Debugging CUDA programs can be frustrating: many threads run simultaneously, and error reporting is imprecise. Here are some links to resources that can help you. Be aware that in cse260 we are using CUDA 9, but many of the links point to newer documentation, so not all features may be available. The Kepler GPU has CUDA compute capability 3.7 (cc3.7).
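Because kernel launches are asynchronous, an error often surfaces only at a later runtime call, which is part of why reporting feels imprecise. One common way to narrow it down is to check every runtime call and to check right after each launch. The sketch below is only an illustration: the macro name CUDA_CHECK, the kernel scale, and the helper run_scale_demo are made up for this example and are not part of the course code.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro (not from the course Makefile or starter code):
// print the file, line, and error string, then exit, if a runtime call fails.
#define CUDA_CHECK(call)                                                   \
    do {                                                                   \
        cudaError_t err_ = (call);                                         \
        if (err_ != cudaSuccess) {                                         \
            fprintf(stderr, "CUDA error at %s:%d: %s\n", __FILE__,         \
                    __LINE__, cudaGetErrorString(err_));                   \
            exit(EXIT_FAILURE);                                            \
        }                                                                  \
    } while (0)

// Multiply every element of x by a.
__global__ void scale(float *x, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

// Runs the kernel and returns the number of wrong elements (0 on success).
int run_scale_demo() {
    const int n = 1024;
    float h[n];
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    float *d = NULL;
    CUDA_CHECK(cudaMalloc(&d, n * sizeof(float)));
    CUDA_CHECK(cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice));

    scale<<<(n + 255) / 256, 256>>>(d, 2.0f, n);
    CUDA_CHECK(cudaGetLastError());       // reports an invalid launch configuration
    CUDA_CHECK(cudaDeviceSynchronize());  // reports errors raised while the kernel ran

    CUDA_CHECK(cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d));

    int bad = 0;
    for (int i = 0; i < n; ++i) {
        if (h[i] != 2.0f) ++bad;
    }
    return bad;
}

int main() {
    int bad = run_scale_demo();
    printf("mismatches: %d\n", bad);
    return bad != 0;
}
```

The cudaDeviceSynchronize() after the launch costs performance, so it is usually worth keeping only in debug builds.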
Debugging
cuda-gdb can help you debug your program. See the cuda-gdb documentation for advice on how to use it effectively. Make sure you compile your program with our Makefile using make debug=1; this sets the compiler flags needed to use cuda-gdb.
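As a rough illustration of a cuda-gdb session, here is a tiny program together with commands you might type. The file name square.cu, the kernel square_kernel, and the helper run_square_demo are invented for this example. The hand-written nvcc line assumes that make debug=1 amounts to passing -g -G, which may not match our Makefile exactly.

```cuda
// Build by hand (assumed to be similar to what make debug=1 does):
//   nvcc -g -G -o square square.cu
//
// Example session:
//   $ cuda-gdb ./square
//   (cuda-gdb) break square_kernel      stop when the kernel starts
//   (cuda-gdb) run
//   (cuda-gdb) info cuda threads        list the threads of the running kernel
//   (cuda-gdb) cuda block 1 thread 3    switch focus to another thread
//   (cuda-gdb) next                     step over the index computation
//   (cuda-gdb) print i                  inspect that thread's index
//   (cuda-gdb) continue
#include <cstdio>
#include <cuda_runtime.h>

__global__ void square_kernel(int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) {
        out[i] = i * i;
    }
}

// Launches the kernel; returns the number of wrong results,
// or -1 if a CUDA runtime call failed.
int run_square_demo() {
    const int n = 16;
    int h[n];
    int *d = NULL;
    if (cudaMalloc(&d, n * sizeof(int)) != cudaSuccess) return -1;

    square_kernel<<<2, 8>>>(d, n);  // 2 blocks of 8 threads each

    cudaError_t err = cudaMemcpy(h, d, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d);
    if (err != cudaSuccess) return -1;

    int bad = 0;
    for (int i = 0; i < n; ++i) {
        if (h[i] != i * i) ++bad;
    }
    return bad;
}

int main() {
    int bad = run_square_demo();
    printf("mismatches: %d\n", bad);
    return bad != 0;
}
```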
These tools will help you see what the CUDA compiler is doing to your code.
cuobjdump - CUDA version of objdump
nvdisasm - disassembler for CUDA binary (cubin) files
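To try these tools, it helps to start from a kernel small enough that the generated code is readable. The sketch below uses a saxpy kernel; the file name saxpy.cu and the helper run_saxpy_demo are made up for this example, and the commands in the comment are typical invocations rather than anything our Makefile provides.

```cuda
// Typical invocations (sm_37 matches the cc3.7 GPU):
//   nvcc -arch=sm_37 -o saxpy saxpy.cu
//   cuobjdump -ptx ./saxpy       PTX embedded in the executable
//   cuobjdump -sass ./saxpy      machine code (SASS) for the GPU
//   nvcc -arch=sm_37 -cubin -o saxpy.cubin saxpy.cu
//   nvdisasm saxpy.cubin         disassemble a standalone cubin
#include <cstdio>
#include <cuda_runtime.h>

// y = a * x + y. In the SASS the multiply and add typically show up
// as a single fused multiply-add instruction.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

// Runs saxpy once; returns the number of wrong results,
// or -1 if a CUDA runtime call failed.
int run_saxpy_demo() {
    const int n = 1024;
    float hx[n], hy[n];
    for (int i = 0; i < n; ++i) { hx[i] = 1.0f; hy[i] = 2.0f; }

    float *dx = NULL, *dy = NULL;
    if (cudaMalloc(&dx, n * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc(&dy, n * sizeof(float)) != cudaSuccess) { cudaFree(dx); return -1; }
    cudaMemcpy(dx, hx, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, n * sizeof(float), cudaMemcpyHostToDevice);

    saxpy<<<(n + 255) / 256, 256>>>(n, 3.0f, dx, dy);

    cudaError_t err = cudaMemcpy(hy, dy, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dx);
    cudaFree(dy);
    if (err != cudaSuccess) return -1;

    int bad = 0;
    for (int i = 0; i < n; ++i) {
        if (hy[i] != 5.0f) ++bad;  // 3 * 1 + 2
    }
    return bad;
}

int main() {
    int bad = run_saxpy_demo();
    printf("mismatches: %d\n", bad);
    return bad != 0;
}
```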
cuda-memcheck has several tools that may help you identify problems in your code. See the NVIDIA cuda-memcheck documentation.
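A frequent problem that cuda-memcheck catches is an out-of-bounds access when the grid has more threads than there are elements. The sketch below shows the guarded (correct) version and notes what to expect if the guard is removed. The kernel add_one and the helper run_add_one_demo are invented for this example.

```cuda
// Typical invocations:
//   cuda-memcheck ./add_one                    default memcheck tool
//   cuda-memcheck --tool racecheck ./add_one   shared-memory data races
//   cuda-memcheck --tool initcheck ./add_one   reads of uninitialized global memory
//   cuda-memcheck --tool synccheck ./add_one   invalid use of synchronization
// Compiling with -lineinfo (or -G) lets the reports include source lines.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void add_one(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // Guard: with n = 1000 and 256 threads per block, 1024 threads are launched.
    // Without this test the last 24 threads would access memory past the end of
    // both arrays. The program may still appear to work, but memcheck reports
    // the invalid accesses.
    if (i < n) {
        out[i] = in[i] + 1.0f;
    }
}

// Returns the number of wrong results, or -1 if a CUDA runtime call failed.
int run_add_one_demo() {
    const int n = 1000;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;  // rounds up to 4 blocks
    float h_in[n], h_out[n];
    for (int i = 0; i < n; ++i) h_in[i] = (float)i;

    float *d_in = NULL, *d_out = NULL;
    if (cudaMalloc(&d_in, n * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) { cudaFree(d_in); return -1; }
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    add_one<<<blocks, threads>>>(d_in, d_out, n);

    cudaError_t err = cudaMemcpy(h_out, d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    int bad = 0;
    for (int i = 0; i < n; ++i) {
        if (h_out[i] != (float)i + 1.0f) ++bad;
    }
    return bad;
}

int main() {
    int bad = run_add_one_demo();
    printf("mismatches: %d\n", bad);
    return bad != 0;
}
```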
Performance
nvprof - profiling tool. The GPU has many built-in performance counters, and nvprof lets you access them. See the nvprof documentation.
nvvp - visual profiler; can be used in conjunction with nvprof. Note that for cc3.7, line-based profiling is not available.
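When a program does a lot of setup, it can help to restrict profiling to the region you care about. One way is the cudaProfilerStart() and cudaProfilerStop() calls from cuda_profiler_api.h, combined with nvprof's --profile-from-start off option. The sketch below is only an illustration: the file name vec_add.cu, the kernel vec_add, and the helper run_vec_add_demo are invented for this example.

```cuda
// Typical invocations:
//   nvprof ./vec_add                                  summary of kernels and memcpys
//   nvprof --profile-from-start off ./vec_add         only the Start/Stop region
//   nvprof --query-metrics                            list metrics this GPU supports
//   nvprof --metrics achieved_occupancy ./vec_add     collect one metric
//   nvprof -o vec_add.nvprof ./vec_add                save a profile to open in nvvp
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <cuda_profiler_api.h>

__global__ void vec_add(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

// Returns the number of wrong results, or -1 if a CUDA runtime call failed.
int run_vec_add_demo() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    std::vector<float> ha(n, 1.0f), hb(n, 2.0f), hc(n, 0.0f);

    float *da = NULL, *db = NULL, *dc = NULL;
    if (cudaMalloc(&da, bytes) != cudaSuccess) return -1;
    if (cudaMalloc(&db, bytes) != cudaSuccess) { cudaFree(da); return -1; }
    if (cudaMalloc(&dc, bytes) != cudaSuccess) { cudaFree(da); cudaFree(db); return -1; }
    cudaMemcpy(da, ha.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb.data(), bytes, cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Warm-up launch, outside the profiled region.
    vec_add<<<blocks, threads>>>(da, db, dc, n);
    cudaDeviceSynchronize();

    cudaProfilerStart();  // profiled region begins here
    vec_add<<<blocks, threads>>>(da, db, dc, n);
    cudaDeviceSynchronize();
    cudaProfilerStop();   // profiled region ends here

    cudaError_t err = cudaMemcpy(hc.data(), dc, bytes, cudaMemcpyDeviceToHost);
    cudaFree(da);
    cudaFree(db);
    cudaFree(dc);
    if (err != cudaSuccess) return -1;

    int bad = 0;
    for (int i = 0; i < n; ++i) {
        if (hc[i] != 3.0f) ++bad;
    }
    return bad;
}

int main() {
    int bad = run_vec_add_demo();
    printf("mismatches: %d\n", bad);
    return bad != 0;
}
```

Without --profile-from-start off, nvprof profiles the whole run and the Start/Stop calls have little visible effect.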