Wavefront: Log Files
coNCePTuaL automatically logs everything needed to reproduce a performance test

The C version of the wavefront program writes its performance measurements as follows:

if (this_task == 0) {
  printf ("Per-hop latency (usecs, mean)\tPer-hop latency (usecs, std. dev.)\n");
  printf ("%.10g\t%.10g\n", mean_secs*1e6, stdev_secs*1e6);
}

Typical output looks like this:

Per-hop latency (usecs, mean)   Per-hop latency (usecs, std. dev.)
4.861095        0.6818816043

The coNCePTuaL version of the wavefront program writes its performance measurements in a seemingly similar manner:

task 0 logs the mean of elapsed_usecs/(xdim+ydim-1) as "Per-hop latency (usecs)"
       and the standard deviation of elapsed_usecs/(xdim+ydim-1) as "Per-hop latency (usecs)"
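
The divisor xdim+ydim-1 converts each elapsed time into a per-hop figure. As a worked example using the 4×5 mesh that appears later in this section, xdim+ydim-1 = 4+5-1 = 8, so every measured elapsed_usecs value is divided by 8 before the mean and standard deviation are computed.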

The preceding statement creates a set of log files, one per task. The data is written in comma-separated value (CSV) format, which is easy to parse and can be loaded directly into a spreadsheet application. For example, task 0 might log data such as the following:

"Per-hop latency (usecs)","Per-hop latency (usecs)"
"(mean)","(std. dev.)"
4.861095,0.6818816043
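
Because the data rows are plain CSV, they can be extracted with very little code. The following C sketch (C to match the earlier excerpt) prints the mean and standard deviation from a single log file. The file name wavefront-0.log is only a placeholder, and the sketch assumes that the data row shown above is the only line in the file that begins with a number; it is meant as an illustration, not as a general coNCePTuaL log parser.

#include <stdio.h>

int main (void)
{
  FILE *logfile = fopen ("wavefront-0.log", "r");  /* Placeholder file name. */
  char line[1024];
  double mean, stdev;

  if (logfile == NULL) {
    perror ("fopen");
    return 1;
  }
  while (fgets (line, sizeof line, logfile) != NULL)
    /* Quoted headers and other commentary fail the numeric conversion
       (assumption: only the data row starts with a number). */
    if (sscanf (line, "%lf,%lf", &mean, &stdev) == 2)
      printf ("mean = %.10g usecs, std. dev. = %.10g usecs\n", mean, stdev);
  fclose (logfile);
  return 0;
}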

So far, there's no significant difference between the C output and the coNCePTuaL output. However, the data listed above is not all that a coNCePTuaL program logs. Every log file also includes a thorough description of the conditions under which the program ran, including the hardware and software environment, the command line and parameters used, and run-time details such as the task-to-host mapping, timer quality, and memory usage.

The idea is that a coNCePTuaL log file should act like a scientist's laboratory notebook in that it strives to present enough information for someone to exactly reproduce an experiment. In contrast, consider the C output presented above. Someone who looks at that data in the future will have no idea what exactly was run, on what type of hardware/software, and with what parameters.

The following log file was produced by running the coNCePTuaL version of the wavefront program. Lines that differ across tasks are prefixed with the applicable task number(s).

coNCePTuaL log file produced by the wavefront program

As an exercise, see if you can find the answers to the following questions in the preceding log file:

  1. How many tasks were used?
  2. What were the mesh dimensions?
  3. How many wavefronts were timed?
  4. How were tasks mapped to hosts?
  5. Which task(s) had unreliable timers?
  6. Was there much memory pressure?

(Answers: (1) 20, (2) 4×5, (3) 1×10⁶ = 1,000,000, (4) tasks 0 and 1 ran on host a72, tasks 2 and 3 ran on host a73, etc., (5) task 11 (host a77), most likely because other activity on the system interfered with the performance test, (6) no; the greatest per-task peak memory usage was only 17.8 MB out of 2.0 GB.)

All coNCePTuaL programs support a --comment command-line option that enables extra commentary to be included in a log file. For example, one might want to log information about some unique piece of hardware or a nonstandard software configuration for future reference; a sample invocation appears after this paragraph. coNCePTuaL can't automatically log all information about every possible computer system, but it does make it easy for a conscientious experimenter to report experimental results thoroughly.
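
As a concrete sketch, an invocation might look like the following; the program name, the comment text, and the omission of any program-specific options are all illustrative, with --comment being the one flag taken from the description above.

./wavefront --comment="Nodes were running a nonstandard network driver"

The supplied text then appears in the log file alongside the automatically recorded information.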

Summary

  1. coNCePTuaL aims to make performance tests reproducible by logging a wealth of information about the experimental setup in addition to the performance measurements.
  2. Log files include information about the hardware and software environment in which the performance test ran.
  3. Users can explicitly include additional commentary in log files if desired.
  4. Each task logs information separately. This enables cross-task comparisons and helps explain anomalous performance.
Scott Pakin, pakin@lanl.gov