An essential component of any benchmarking system is an accurate timer. coNCePTuaL’s ncptl_time() function selects from a variety of timers at configuration time, first favoring lower-overhead cycle-accurate timers, then higher-overhead cycle-accurate timers, and finally non-cycle-accurate timers. ncptl_init() measures the actual timer overhead and resolution, and ncptl_log_write_prologue() writes this information to the log file. Furthermore, the validatetimer program (see Validating the coNCePTuaL timer) can be used to verify that the timer used by ncptl_init() truly does correspond to wall-clock time.
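The overhead and resolution measurement mentioned above can be illustrated with a minimal sketch. It is not coNCePTuaL’s actual implementation: sketch_read_usecs() is a hypothetical stand-in for the library timer, built here on clock_gettime() with CLOCK_MONOTONIC, and sketch_measure_timer() and its result structure are likewise assumed names. The sketch estimates overhead as the mean cost of one timer read and resolution as the smallest nonzero step seen between consecutive reads.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the library timer (an assumption, not the
 * coNCePTuaL API): microseconds read from a monotonic clock. */
static uint64_t sketch_read_usecs(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u;
}

/* Hypothetical result type for this sketch. */
struct sketch_timer_quality {
    double   overhead_usecs;    /* mean cost of one timer read */
    uint64_t resolution_usecs;  /* smallest nonzero step observed (0 if none) */
};

/* Read the timer back to back and summarize what was observed. */
static struct sketch_timer_quality sketch_measure_timer(unsigned reads)
{
    struct sketch_timer_quality q = {0.0, 0};
    uint64_t first, prev;
    unsigned i;

    assert(reads > 0);
    first = prev = sketch_read_usecs();
    for (i = 0; i < reads; i++) {
        uint64_t now = sketch_read_usecs();
        uint64_t step = now - prev;
        if (step > 0 && (q.resolution_usecs == 0 || step < q.resolution_usecs))
            q.resolution_usecs = step;
        prev = now;
    }
    q.overhead_usecs = (double)(prev - first) / (double)reads;
    return q;
}
```

Calling sketch_measure_timer() with a large read count (for example, one million) gives the loop enough time to observe at least one timer step. Because the stand-in reports whole microseconds, the sketch cannot observe a resolution finer than 1 microsecond.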
The coNCePTuaL language provides a few time-related functions. These are supported by the functions described below.
ncptl_time() returns the time in microseconds. The timer ticks even when the program is not currently scheduled. No assumptions can be made about the relation of the value returned to the time of day; ncptl_time() is intended to be used strictly for computing elapsed time. The timer’s resolution and accuracy are logged to the log file by ncptl_log_write_prologue() (more precisely, by the internal log_write_prologue_timer() function, which is called by ncptl_log_write_prologue()).
Note that ncptl_time() always returns a 64-bit unsigned value, regardless of how ncptl_int is declared.
The coNCePTuaL configure script (see configure) searches for a number of high-resolution timers and selects the best timer mechanism from among the ones available. The selection criteria are as follows:
1. ncptl_time() uses gettimeofday() as its timer mechanism.
2. ncptl_time() uses MPI_Wtime() as its timer mechanism.
3. ncptl_time() reads the timer using inline assembly code. If the cycle counter is likely to wrap around during a moderately long benchmark (e.g., because the cycle counter is a 32-bit register), ncptl_time() augments the inline assembly code with calls to gettimeofday() in an attempt to produce accurate timings that don’t suffer from clock wraparound.
4. If the get_cycles() function is available and ./configure can determine the number of clock cycles per second, then ncptl_time() uses get_cycles() to measure execution time.
5. ncptl_time() becomes a call to PAPI’s PAPI_get_real_usec() function.
6. If the clock_gettime() function is available and the CLOCK_SGI_CYCLE macro is defined, ncptl_time() invokes clock_gettime() with the CLOCK_SGI_CYCLE argument. If CLOCK_SGI_CYCLE is not defined but CLOCK_REALTIME is, then ncptl_time() invokes clock_gettime() with the CLOCK_REALTIME argument.
7. Some systems provide a dclock() function for reading the time. ncptl_time() makes use of dclock() if it’s available.
8. Windows provides functions for reading a high-resolution timer (QueryPerformanceCounter()) and for determining the number of ticks per second that the timer measures (QueryPerformanceFrequency()). If those functions are available, ncptl_time() uses them.
9. ncptl_time() uses gettimeofday() to measure execution time.
Furthermore, coNCePTuaL makes use of the High-Precision Event Timers (HPET) device if and only if all of the following conditions hold at run time:
Failing any of those conditions, coNCePTuaL falls back to the timer selected at configuration time. See the HPET specification for more information on HPET.
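As a rough illustration of choosing a timer at build time, the following sketch mirrors only the clock_gettime() and gettimeofday() cases from the list above, using preprocessor checks. The names sketch_timer_name() and sketch_time_usecs() and the SKETCH_ macros are hypothetical, and the real configure script presumably tests for far more than macro definitions.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>
#include <time.h>

/* Prefer CLOCK_SGI_CYCLE, then CLOCK_REALTIME, else use gettimeofday().
 * The SKETCH_* macros are illustrative names, not coNCePTuaL symbols. */
#if defined(CLOCK_SGI_CYCLE)
# define SKETCH_CLOCK_ID   CLOCK_SGI_CYCLE
# define SKETCH_CLOCK_NAME "clock_gettime(CLOCK_SGI_CYCLE)"
#elif defined(CLOCK_REALTIME)
# define SKETCH_CLOCK_ID   CLOCK_REALTIME
# define SKETCH_CLOCK_NAME "clock_gettime(CLOCK_REALTIME)"
#else
# define SKETCH_CLOCK_NAME "gettimeofday()"
#endif

/* Report which mechanism the preprocessor selected. */
static const char *sketch_timer_name(void)
{
    return SKETCH_CLOCK_NAME;
}

/* Return the current timer value in microseconds as a 64-bit unsigned
 * value, falling back to gettimeofday() if clock_gettime() fails. */
static uint64_t sketch_time_usecs(void)
{
    struct timeval tv;
#ifdef SKETCH_CLOCK_ID
    struct timespec ts;
    if (clock_gettime(SKETCH_CLOCK_ID, &ts) == 0)
        return (uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u;
#endif
    gettimeofday(&tv, NULL);
    return (uint64_t)tv.tv_sec * 1000000u + (uint64_t)tv.tv_usec;
}
```

The run-time fallback inside sketch_time_usecs() is a design choice of this sketch: a clock identifier can be defined in a header and still be rejected by the running system.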
ncptl_set_flag_after_usecs() uses the operating system’s interval timer to asynchronously set a variable to ‘1’ after a given number of microseconds. This function is intended to be used to support the ‘FOR time’ construct (see Timed loops). Note that delay, the number of microseconds, is a 64-bit unsigned value, regardless of how ncptl_int is declared.
ncptl_set_flag_after_usecs() is implemented in terms of the setitimer() function and issues a run-time error if the setitimer() function is not available.
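The general technique, an interval timer that sets a flag which a timed loop then polls, can be sketched as follows. This is an illustration under stated assumptions, not the library’s code: the sketch_ names are hypothetical, the flag is a sig_atomic_t rather than whatever type the library uses, and failures are reported by return value instead of a run-time error.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/time.h>

/* Flag that the SIGALRM handler sets (hypothetical name). */
static volatile sig_atomic_t *sketch_flag_ptr = NULL;

static void sketch_on_alarm(int signum)
{
    (void)signum;
    if (sketch_flag_ptr != NULL)
        *sketch_flag_ptr = 1;
}

/* Arrange for *flag to become 1 after delay microseconds.
 * Returns 0 on success and -1 on failure. */
static int sketch_set_flag_after_usecs(volatile sig_atomic_t *flag,
                                       uint64_t delay)
{
    struct sigaction sa;
    struct itimerval itv;

    assert(flag != NULL);
    *flag = 0;
    if (delay == 0) {      /* a zero it_value would disarm the timer */
        *flag = 1;
        return 0;
    }
    sketch_flag_ptr = flag;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = sketch_on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    memset(&itv, 0, sizeof itv);   /* it_interval = 0: fire only once */
    itv.it_value.tv_sec = (time_t)(delay / 1000000u);
    itv.it_value.tv_usec = (suseconds_t)(delay % 1000000u);
    return setitimer(ITIMER_REAL, &itv, NULL);
}

/* A timed loop in the spirit of "FOR time": count iterations until the
 * flag is set.  Returns 0 if the timer could not be armed. */
static uint64_t sketch_count_for_usecs(uint64_t delay)
{
    volatile sig_atomic_t done = 0;
    uint64_t iterations = 0;

    if (sketch_set_flag_after_usecs(&done, delay) != 0)
        return 0;
    while (!done)
        iterations++;
    return iterations;
}
```

Polling a flag keeps the loop body free of timer reads, so the cost of the timer is paid once per timed loop and not once per iteration.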
If spin0block1 is ‘0’, ncptl_udelay() spins for delay microseconds (i.e., using the CPU). If spin0block1 is ‘1’, ncptl_udelay() sleeps for delay microseconds (i.e., relinquishing the CPU). Note that delay is a 64-bit unsigned value, regardless of how ncptl_int is declared. ncptl_udelay() is intended to be used to support the coNCePTuaL language’s SLEEPS and COMPUTES statements (see Delaying execution).
When spin0block1 is ‘0’, ncptl_udelay() uses ncptl_time() to determine when delay microseconds have elapsed. Unless ncptl_time() is known to utilize an extremely low-overhead timer, ncptl_udelay() intersperses calls to ncptl_time() with writes to a dummy variable. When spin0block1 is ‘1’, ncptl_udelay() invokes nanosleep() to introduce delays. ncptl_udelay() issues a run-time error if the nanosleep() function is not available.
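The two delay modes described above can be sketched as follows. This is a minimal illustration, not the library’s implementation: sketch_now_usecs() is a hypothetical stand-in for ncptl_time() built on clock_gettime() with CLOCK_MONOTONIC, sketch_udelay() is an assumed name, and the number of dummy writes between timer reads (64) is an arbitrary choice.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the library timer: microseconds from a
 * monotonic clock. */
static uint64_t sketch_now_usecs(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u;
}

/* Delay for delay microseconds: spin if spin0block1 is 0, sleep if 1. */
static void sketch_udelay(uint64_t delay, int spin0block1)
{
    assert(spin0block1 == 0 || spin0block1 == 1);
    if (spin0block1 == 0) {
        /* Spin: poll the timer, with writes to a dummy variable between
         * reads so that the timer itself is queried less often. */
        volatile unsigned dummy = 0;
        uint64_t start = sketch_now_usecs();
        while (sketch_now_usecs() - start < delay) {
            int i;
            for (i = 0; i < 64; i++)
                dummy = dummy + 1;
        }
    } else {
        /* Block: give up the CPU, resuming the sleep with the remaining
         * time if a signal interrupts it. */
        struct timespec req;
        req.tv_sec = (time_t)(delay / 1000000u);
        req.tv_nsec = (long)(delay % 1000000u) * 1000L;
        while (nanosleep(&req, &req) == -1 && errno == EINTR)
            ;
    }
}
```

For example, sketch_udelay(2000, 0) keeps the CPU busy for about 2 milliseconds, whereas sketch_udelay(2000, 1) yields the CPU for at least that long.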