Pgprof

From Docswiki
Revision as of 10:54, 13 May 2019 by Adk44

Profiling an executable shows where a calculation spends most of its time, so that you can go for the big win and concentrate your recoding effort where it will improve efficiency the most.

To get the most information, you generally need to recompile your code with particular flags set, then run on a test job (not too long, as profiling slows the code down a lot), and analyse the resulting profile output file.

Portland's pgprof, although it is remarkably difficult to type, is useful for pgf90-compiled codes.

A simple protocol is:

  • Compile with either the -pg flag or the "-Mprof=<option>" flag, where <option> is one of lines, func, time or hwcts. func generates routine-level profiling; lines generates routine- and line-level profiling. -pg and -Mprof=time generate instruction-level profiles using time-based sampling of the program counter. -Mprof=hwcts (linux86-64 only) generates an instruction-level profile using event-based sampling of hardware event counters. I like "lines" if the test job is small but representative.
  • Run the resulting executable on a test job.
  • In the directory where you just ran the job, run the command pgprof -exe <path-to-executable> -I <path to source files>. The source path is optional (it is only needed for line-by-line analysis); multiple directories can be given, separated by ':'. Note that the flag is a minus sign followed by a capital i, not a lower-case L.
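
The three steps above can be sketched as a small shell script. This is only an illustration under stated assumptions: the file names (myprog.f90, ./myprog) and the source directories (src, src/lib) are hypothetical, and the script assumes a single-file program that needs no input arguments. If pgf90 or the source file is not present, it just prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the compile / run / analyse cycle described above.
# Hypothetical names throughout: myprog.f90, ./myprog and the source directories.
SRC=myprog.f90
EXE=./myprog
SRC_DIRS="src:src/lib"        # several source directories, separated by ':'

# Step 1: compile with line-level profiling enabled.
COMPILE_CMD="pgf90 -Mprof=lines -o $EXE $SRC"
# Step 2: run the instrumented executable on a short test job.
RUN_CMD="$EXE"
# Step 3: open the profile; -I (capital i) points pgprof at the source files.
PROFILE_CMD="pgprof -exe $EXE -I $SRC_DIRS"

if command -v pgf90 >/dev/null 2>&1 && [ -f "$SRC" ]; then
    # Each step only runs if the previous one succeeded.
    $COMPILE_CMD && $RUN_CMD && $PROFILE_CMD
else
    # No PGI compiler (or no source file) here: just show the commands.
    printf '%s\n' "$COMPILE_CMD" "$RUN_CMD" "$PROFILE_CMD"
fi
```

Run the script from the directory in which the test job should execute, so that pgprof is started where the profile output was written.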

In the pgprof window, once it shows the list of subprogram names with their percentage times/counts, double-click on a name to see the line-by-line profile for that routine (where available).

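Also note that line-by-line profiling may be less reliable if you also compile with optimization, since the compiler can reorder, merge or inline code, so the time is not always attributed to the source line you expect.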