Next: , Previous: , Up: Interpreting coNCePTuaL log files   [Contents][Index]


3.5.1 Log-file format

The coNCePTuaL run-time library writes log files in the following (textual) format:

A sample log file is listed below. The log file is presented in its entirety.

###########################################################################
# ===================
# coNCePTuaL log file
# ===================
# coNCePTuaL version: 0.6.4a
# coNCePTuaL backend: c_mpi (C + MPI)
# Executable name: /home/pakin/src/coNCePTuaL/example
# Working directory: /home/pakin/src/coNCePTuaL
# Command line: ./example
# Number of tasks: 2
# Processor (0<=P<tasks): 0
# Host name: a1
# Operating system: Linux version 2.4.21-3.5qsnet (root@a31) (gcc version 2.96 20000731 (Red Hat Linux 7.2 2.96-108.1)) #2 SMP Thu Aug 7 10:51:04 MDT 2003
# CPU vendor: GenuineIntel
# CPU architecture: ia64
# CPU count: 2
# CPU frequency: 1300000000 Hz (1.3 GHz)
# Cycle-counter frequency: 1300000000 Hz (1.3 GHz)
# OS page size: 16384 bytes
# Physical memory: 2047901696 bytes (1.9 GB)
# Elan capability: [1965771c.6213bf5d.3a26411c.5b4c5d58] Version 10002 Type a001 Context 640.640.640 Node 1.2
# coNCePTuaL configuration: ./configure '--prefix=/tmp/ncptl' 'MPICPPFLAGS=-I/usr/local/include' 'CFLAGS=-g -O3 -ansi_alias -ansi' 'MPICC=/usr/lib/mpi/mpi_intel/bin/mpicc' '--enable-maintainer-mode' 'CC=ecc' '--disable-shared'
# Library compiler+linker: /opt/intel-7.1.033/compiler70/ia64/bin/ecc
# Library compiler version: Intel(R) C++ gcc 3.0 mode [7.1]
# Library compiler options: -g -O3 -ansi_alias -ansi
# Library linker options: -lrmscall -lelan -lpopt
# Library compiler mode: LP64
# Dynamic libraries used: /usr/lib/qsnet/elan3/lib/librmscall.so.1 /usr/lib/qsnet/elan3/lib/libelan.so.1 /usr/lib/libpopt.so.0.0.0 /usr/lib/mpi/mpi_intel/lib/libmpi.so.1.0 /lib/libm-2.2.4.so /opt/intel-7.1.033/compiler70/ia64/lib/libcxa.so.4 /lib/libc-2.2.4.so /usr/lib/qsnet/elan3/lib/libelan3.so.1 /usr/lib/qsnet/elan/lib/libelanctrl.so.2 /lib/ld-2.2.4.so
# Microsecond timer type: gettimeofday()
# Average microsecond timer overhead: <1 microsecond
# Microsecond timer increment: 1.00466 +/- 0.123576 microseconds (ideal: 1 +/- 0)
# Minimum sleep time: 1946.6 +/- 31.5872 microseconds (ideal: 1 +/- 0)
# WARNING: Sleeping exhibits poor granularity (not a serious problem).
# WARNING: Sleeping has a large error component (not a serious problem).
# Process CPU timer: getrusage()
# Process CPU-time increment: 976.59 +/- 0.494311 microseconds (ideal: 1 +/- 0)
# WARNING: Process timer exhibits poor granularity (not a serious problem).
# Log-file template: example-%p.log
# Number of minutes after which to kill the job (-1=never): -1
# List of signals which should not be trapped: 14
# Log-file checkpointing interval: 60 seconds (i.e., 1 minute)
# MPI send routine: MPI_Send()
# MPI error checking: off
# Front-end compilation command line: ncptl --backend=c_mpi example.ncptl
# Back-end compilation command line: /usr/lib/mpi/mpi_intel/bin/mpicc -I/tmp/ncptl/include  -I/usr/local/include -g -O3 -ansi_alias -ansi tmppG76Fv.c  -L/tmp/ncptl/lib   -lncptl -lrmscall -lelan -lpopt -o example
# Log creator: Scott Pakin
# Log creation time: Mon Dec 19 12:02:18 2005
#
# Environment variables
# ---------------------
# CVS_RSH: /usr/bin/ssh
# DISPLAY: localhost:19.0
# DYNINSTAPI_RT_LIB: /home/pakin/dyninstAPI-3.0/lib/i386-unknown-linux2.2/libdyninstAPI_RT.so.1
# DYNINST_ROOT: /home/pakin/dyninstAPI-3.0
# EDITOR: /usr/bin/emacs
# GROUP: CCS3
# HOME: /home/pakin
# HOST: a0
# HOSTNAME: a0
# HOSTTYPE: unknown
# KDEDIR: /usr
# LANG: en_US
# LD_LIBRARY_PATH: /opt/intel-7.1.033/compiler70/ia64/lib:/users/pakin/lib:/usr/lib:/usr/ccs/lib:/opt/SUNWspro/lib:/usr/dt/lib:/usr/openwin/lib:/usr/X11R6/lib:/usr/local/gnu/lib:/usr/local/lib:/usr/ucblib:/users/pakin/dyninstAPI-3.0/lib/i386-unknown-linux2.2
# LESSOPEN: |/usr/bin/lesspipe.sh %s
# LOGNAME: pakin
# LPDEST: lwy
# LS_COLORS: no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=01;32:*.exe=01;32:*.com=01;32:*.btm=01;32:*.bat=01;32:*.sh=01;32:*.csh=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tz=01;31:*.rpm=01;31:*.cpio=01;31:*.jpg=01;35:*.gif=01;35:*.bmp=01;35:*.xbm=01;35:*.xpm=01;35:*.png=01;35:*.tif=01;35:
# MACHTYPE: unknown
# MAIL: /var/mail/pakin
# MANPATH: /opt/intel-7.1.033/compiler70/man:/home/pakin/man:/usr/man:/opt/SUNWspro/man:/usr/dt/man:/usr/openwin/man:/usr/X11R6/man:/usr/local/gnu/man:/usr/local/man:/usr/share/man:/usr/lanl/man
# MOZILLA_HOME: /usr/local/netscape/java
# NAME: Scott Pakin
# ORGANIZATION: Los Alamos National Lab
# OSTYPE: linux
# PATH: /opt/intel-7.1.033/compiler70/ia64/bin:.:/home/pakin/bin:/usr/local/bin:/usr/dt/bin:/usr/openwin/bin:/usr/X11R6/bin:/usr/local/gnu/bin:/usr/local/bin:/sbin:/bin:/opt/SUNWspro/bin:/usr/sbin:/usr/bin:/usr/ccs/bin:/usr/ucb:/usr/local/teTeX/bin:.:/usr/lanl/bin
# PRINTER: lwy
# PVM_ROOT: /usr/share/pvm3
# PVM_RSH: /usr/bin/rsh
# PWD: /home/pakin/src/coNCePTuaL
# QTDIR: /usr/lib/qt-2.3.1
# REMOTEHOST: antero.c3.lanl.gov
# RMS_JOBID: 308
# RMS_MACHINE: a
# RMS_NNODES: 2
# RMS_NODEID: 0
# RMS_NPROCS: 2
# RMS_PROCID: 0
# RMS_RANK: 0
# RMS_RESOURCEID: parallel.318
# RMS_STOPONELANINIT: 0
# SHELL: /bin/tcsh
# SHLVL: 2
# SSH_AGENT_PID: 8586
# SSH_ASKPASS: /usr/libexec/openssh/gnome-ssh-askpass
# SSH_AUTH_SOCK: /tmp/ssh-XXfG457q/agent.8562
# SSH_CLIENT: 128.165.20.177 47456 22
# SSH_TTY: /dev/pts/10
# SUPPORTED: en_US:en
# TERM: xterm
# TZ: MST7MDT
# USER: pakin
# VENDOR: unknown
#
# coNCePTuaL source code
# ----------------------
#     FOR 10 REPETITIONS {
#       TASK 0 RESETS ITS COUNTERS THEN
#       TASK 0 SENDS A 0 BYTE MESSAGE TO TASK 1 THEN
#       TASK 1 SENDS A 0 BYTE MESSAGE TO TASK 0 THEN
#       TASK 0 LOGS EACH elapsed_usecs/2 AS "Latency (usecs)"
#     }
#
###########################################################################
"Latency (usecs)"
"(all data)"
192.5
7
5.5
5.5
5
5
5
5.5
5
5.5
###########################################################################
# Program exited normally.
# Log completion time: Mon Dec 19 12:02:18 2005
# Elapsed time: 0 seconds
# Process CPU usage (user+system): 0 seconds
# Number of interrupts received (all CPUs): 22
# Peak memory allocation: 1072987 bytes (1.0 MB)
###########################################################################

As the preceding example indicates, a log file’s comment block can be divided into multiple stanzas:

Two rows of column headers and the measurement data follow the prologue comment block. A set of epilogue comments completes the log file.

The motivation for writing so much information to the log file is to facilitate reproduction of the experiment. The ideal situation is for a third party to be able to look at a coNCePTuaL log file and from that, recreate the original experiment and get identical results.


Some of the comments that may benefit from additional explanation include the following:

Library compiler mode

LP64’ means “Long integers and Pointers contain exactly 64 bits while ordinary integers contain exactly 32 bits”. ‘ILP32’ means “ordinary Integers, Long integers, and Pointers all contain exactly 32 bits”. The library compiler mode will be ‘nonstandard’ for any other combination of datatype sizes.

Average timer overhead

During initialization, the coNCePTuaL run-time library performs some calibration routines. Among these routines is a measurement of the quality of whatever mechanisms the library is using to measure elapsed time. In the sample log file presented above, the mechanism used was ‘inline assembly code’, meaning the run-time library reads the hardware cycle counter without going through a standard library or system call. The complete set of timer mechanisms and selection criteria is presented in Time-related functions, in the documentation for the ncptl_time() function.

The log file then reports “average timer overhead” as the mean time between back-to-back invocations of whichever timer routine is being used. Ideally, the mean should be ‘<1 microsecond but this is not the case on all systems. Large values indicate that a performance penalty is charged every time a coNCePTuaL program reads the timer.

Timer increment

In addition to measuring the overhead of reading the timer, the coNCePTuaL run-time library also measures timer accuracy. The library expects to be able to read the timer with microsecond accuracy. That is, the time reported should not increase by more than a microsecond between successive readings of the timer. To gauge timer accuracy, the library’s initialization routine performs a number of back-to-back invocations of the timer routine and reports the mean and standard deviation of the number of microseconds that elapsed between readings, discarding any deltas of zero microseconds. Ideally, the microsecond timer, when read multiple times in rapid succession, should report nonzero increments of exactly one microsecond with no variation. The log file will contain warning messages if the increment or standard deviation are excessively large, as this may indicate a large margin of error in the measurement data.

Process CPU-time increment
Process CPU usage (user+system)

Log files end with an epilogue section that includes ‘Process CPU usage (user+system)’, which indicates the subset of total wall-clock time for which the program was running (‘user’) or for which the operating system was running on the program’s behalf (‘system’). The log-file prologue reports as ‘Process CPU-time increment’ the resolution of the timer user to report process CPU time. Note that process CPU time is not exported to coNCePTuaL programs; it is therefore much less critical than the wall-clock timer and is reported primarily for informational purposes.

Number of interrupts received (all CPUs)

On certain platforms, coNCePTuaL can tally the number of CPU interrupts that were processed during the run of the program. Because a multiprocessor may migrate tasks among CPUs during their execution, a per-CPU interrupt count may have little merit. Consequently, the number reported represents the sum across all CPUs in the same node (but not across nodes). coNCePTuaL attempts to read interrupt information from the /proc/interrupts file if no alternative mechanism is available. In case having a large number of processes accessing /proc/interrupts poses a problem the --disable-proc-interrupts configuration option prevents programs from accessing that file.

Peak memory allocation

coNCePTuaL programs heap-allocate memory for a variety of purposes: message buffers, event lists (see Generated code), unaggregated performance data (see Computing aggregates), etc. Allocated memory that is no longer needed is returned to the heap. The total amount of allocated memory the program is holding therefore increases and decreases over time. The ‘Peak memory allocation’ comment reports the maximum amount of memory the program held at any given time. If this value nears or exceeds the value reported for ‘Physical memory’, it is possible that paging overheads may be negatively impacting some of the program’s timing measurements. On systems without demand paging, exceeding ‘Physical memory’ is likely to crash the program; a large ‘Peak memory allocation’ on a smaller run can therefore help explain why a larger run is crashing. ‘Peak memory allocation’ includes only memory allocated using the memory-allocation functions described in Memory-allocation functions.


Next: , Previous: , Up: Interpreting coNCePTuaL log files   [Contents][Index]

Scott Pakin, pakin@lanl.gov