Intel Trace Analyzer and Collector
The Intel Trace Analyzer and Collector (ITAC) came as part of the Intel Cluster Toolkit on zero, but I am still figuring out how to use it, so these are rough notes.

module add itac
mpicc -trace -o foo foo.c # link against the tracing libraries; MPI calls get traced
mpicc -tcollect -o foo foo.c # -tcollect also instruments every user function at compile time, so your own code shows up in the trace as well as the MPI calls
itcpin -- foo # instrument an already-built binary instead of recompiling
mpiexec foo # run as usual; the trace is written to foo.stf when the program reaches MPI_Finalize

The trace data is voluminous and held in memory by default, so it rapidly causes the nodes to swap. You can filter it so you only get traces for certain processes or nodes (see the sketch after the config listing below). My attempts to cut it down to a reasonable size while keeping full traces have been a failure so far.

[cen1001@zero tracing]$ cat itac.conf 
MEM-FLUSHBLOCKS 256 # flush to disk once 256 blocks (64 KB each, so 16 MB) are in use; the default is 1024 blocks
MEM-MAXBLOCKS 1024 # stop and flush if more than 1024 blocks (64 MB) are in use; the default is 4096
FLUSH-PREFIX /scratch/cen1001 # where to flush to
STOPFILE-NAME /home/cen1001/STOP # this doesn't work unless you use the failsafe libs
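
The config file is also where the per-process filtering is supposed to go. I have not got this working yet, so the directives below are a sketch from my reading of the reference guide rather than a tested example; check the exact triplet syntax there before trusting it.

PROCESS 0:N OFF # turn tracing off for every rank...
PROCESS 0 ON # ...then switch it back on for rank 0 only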

echo $VT_CONFIG 
/home/cen1001/src/NMM-MPI/tracing/itac.conf
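
VT_CONFIG is how the collector finds the config file; it just needs to be set and exported in the shell you submit from, for example:

export VT_CONFIG=/home/cen1001/src/NMM-MPI/tracing/itac.conf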

qsub -v PATH,LD_LIBRARY_PATH,VT_CONFIG run.sh # -v passes these environment variables through to the batch job
traceanalyzer mpi-hello.stf # open the resulting trace in the GUI
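
The run.sh in the qsub line above is just a minimal job script, roughly along these lines (a sketch, not the actual file):

#!/bin/bash
# run the traced binary; PATH, LD_LIBRARY_PATH and VT_CONFIG come through from qsub -v
mpiexec foo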


To use the failsafe libraries:

mpicc -trace -o mpi_timer ../../mpi_timer.c -lVTfs $VT_ADD_LIBS # -lVTfs links the failsafe version of the tracing library

Then you can stop your app by touching the stopfile if things start to swap manically.
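
With the STOPFILE-NAME from the config above, that is just:

touch /home/cen1001/STOP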