Intel Trace Analyzer and Collector


This came as part of the Intel cluster toolkit on zero. I am still figuring out how to use it, so these are rough notes.

module add itac
mpicc -trace -o foo foo.c
mpicc -tcollect -o foo foo.c # not sure what the difference is
itcpin -- foo 
mpiexec foo

The trace output is voluminous and held in memory, so it rapidly causes the nodes to swap. You can filter it so that you only get traces for certain processes or nodes. So far my attempts to cut it down to a reasonable size while keeping full traces have failed.

[cen1001@zero tracing]$ cat itac.conf 
MEM-FLUSHBLOCKS 256 # flush to disk once you have 256*64K of data, default is 1024
MEM-MAXBLOCKS 1024 # stop and flush if you have more than 1024*64K, default is 4096
FLUSH-PREFIX /scratch/cen1001 # where to flush to
STOPFILE-NAME /home/cen1001/STOP # this doesn't work unless you use the failsafe libs
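
As a rough check on what those block counts mean in memory terms, the arithmetic can be sketched in the shell. This assumes the 64K block size given in the comments above; the variable names are just for illustration, and the limits are presumably per process.

```shell
# Block size in KiB, as stated in the itac.conf comments (assumed, not measured)
block_kib=64

# Values set in itac.conf above
flush_mib=$(( 256 * block_kib / 1024 ))    # MEM-FLUSHBLOCKS 256
max_mib=$(( 1024 * block_kib / 1024 ))     # MEM-MAXBLOCKS 1024

# Defaults quoted in the comments
default_flush_mib=$(( 1024 * block_kib / 1024 ))
default_max_mib=$(( 4096 * block_kib / 1024 ))

echo "flush at ${flush_mib} MiB (default ${default_flush_mib} MiB)"
echo "hard limit ${max_mib} MiB (default ${default_max_mib} MiB)"
```

So these settings cut the buffer to a quarter of the default in both cases.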

echo $VT_CONFIG 
/home/cen1001/src/NMM-MPI/tracing/itac.conf

qsub -v PATH,LD_LIBRARY_PATH,VT_CONFIG run.sh
traceanalyzer mpi-hello.stf


To use the failsafe libraries:

mpicc -trace  -o mpi_timer ../../mpi_timer.c -lVTfs $VT_ADD_LIBS 

Then you can stop your application by touching the stopfile (the STOPFILE-NAME path in itac.conf) if the nodes start to swap heavily.
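
A minimal sketch of that step, assuming the stopfile path is whatever STOPFILE-NAME is set to in itac.conf. The helper name request_stop is made up for these notes and is not part of ITAC.

```shell
# Hypothetical helper: create the stopfile so a run linked against the
# failsafe library notices it and stops tracing.
request_stop() {
    stopfile="$1"
    touch "$stopfile" && echo "created $stopfile"
}
```

With the config shown earlier this would be called as `request_stop /home/cen1001/STOP`. Remember to remove the file before the next run.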