
6.3.5 Time-related functions

An essential component of any benchmarking system is an accurate timer. coNCePTuaL’s ncptl_time() function selects from a variety of timers at configuration time, first favoring lower-overhead cycle-accurate timers, then higher-overhead cycle-accurate timers, and finally non-cycle-accurate timers. ncptl_init() measures the actual timer overhead and resolution, and ncptl_log_write_prologue() writes this information to the log file. Furthermore, the validatetimer program (see Validating the coNCePTuaL timer) can be used to verify that the timer used by ncptl_init() truly does correspond to wall-clock time.

The coNCePTuaL language provides a few time-related functions. These are supported by the functions described below.

Function: uint64_t ncptl_time (void)

Return the time in microseconds. The timer ticks even when the program is not currently scheduled. No assumptions can be made about the relation of the value returned to the time of day; ncptl_time() is intended to be used strictly for computing elapsed time. The timer’s resolution and accuracy are logged to the log file by ncptl_log_write_prologue() (more precisely, by the internal log_write_prologue_timer() function, which is called by ncptl_log_write_prologue()). Note that ncptl_time() always returns a 64-bit unsigned value, regardless of how ncptl_int is declared.
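Because only differences between readings are meaningful, a typical caller takes one reading before and one after the code being timed and subtracts them as unsigned 64-bit values. The following is a minimal sketch of that pattern, not code from the library. sketch_time_usecs() is a hypothetical stand-in for ncptl_time() built on gettimeofday(), and sketch_elapsed_usecs() and sketch_busy_work() are likewise illustrative names.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical stand-in for ncptl_time(): a microsecond count derived
 * from gettimeofday().  The absolute value is not meant to be
 * interpreted; only differences between two readings are used. */
static uint64_t sketch_time_usecs(void)
{
    struct timeval now;

    gettimeofday(&now, NULL);
    return (uint64_t)now.tv_sec * 1000000u + (uint64_t)now.tv_usec;
}

/* Time a single call to work() and return the elapsed microseconds. */
static uint64_t sketch_elapsed_usecs(void (*work)(void))
{
    uint64_t start = sketch_time_usecs();
    uint64_t stop;

    work();
    stop = sketch_time_usecs();
    return stop - start;   /* unsigned 64-bit difference */
}

/* Hypothetical workload used only to have something to time. */
static void sketch_busy_work(void)
{
    volatile unsigned long sink = 0;
    unsigned long i;

    for (i = 0; i < 1000000ul; i++)
        sink += i;
}
```

With the real library, the two sketch_time_usecs() calls would presumably be replaced by calls to ncptl_time(), keeping the same subtraction.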

The coNCePTuaL configure script (see configure) searches for a number of high-resolution timers and selects the best timer mechanism from among the ones available. The selection criteria are as follows:

  1. If ./configure was passed --with-gettimeofday (see configure) then ncptl_time() uses gettimeofday() as its timer mechanism.
  2. If ./configure was passed --with-mpi-wtime (see configure) then ncptl_time() uses MPI_Wtime() as its timer mechanism.
  3. If ./configure recognizes the CPU architecture, knows how to instruct the C compiler to insert inline assembly code, and can determine the number of clock cycles per second, then ncptl_time() reads the timer using inline assembly code. If the cycle counter is likely to wrap around during a moderately long benchmark (i.e., because the cycle counter is a 32-bit register), ncptl_time() augments the inline assembly code with calls to gettimeofday() in an attempt to produce accurate timings that don’t suffer from clock wraparound.
  4. If the Linux get_cycles() function is available and ./configure can determine the number of clock cycles per second then ncptl_time() uses get_cycles() to measure execution time.
  5. If the PAPI library is available, ncptl_time() becomes a call to PAPI’s PAPI_get_real_usec() function.
  6. If the System V clock_gettime() function is available and the CLOCK_SGI_CYCLE macro is defined, ncptl_time() invokes clock_gettime() with the CLOCK_SGI_CYCLE argument. If CLOCK_SGI_CYCLE is not defined but CLOCK_REALTIME is, then ncptl_time() invokes clock_gettime() with the CLOCK_REALTIME argument.
  7. Intel’s (now obsolete) supercomputers provide a dclock() function for reading the time. ncptl_time() makes use of dclock() if it’s available.
  8. Microsoft Windows provides functions for reading a high-resolution timer (QueryPerformanceCounter()) and for determining the number of ticks per second that the timer measures (QueryPerformanceFrequency()). If those functions are available, ncptl_time() uses them.
  9. As a last resort, ncptl_time() uses gettimeofday() to measure execution time.
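Several of the mechanisms above count clock cycles, which is why the configure script must determine the number of cycles per second: the count has to be converted to microseconds. The sketch below shows one way such a conversion could be written. It is an assumption for illustration, not the library's actual code, and sketch_cycles_to_usecs() is a hypothetical name. Splitting the count into whole seconds and a remainder avoids overflowing 64 bits for large cycle counts.

```c
#include <stdint.h>

/* Convert a cycle count to microseconds, given the clock rate in cycles
 * per second.  cycles_per_sec must be nonzero.  Whole seconds and the
 * leftover cycles are scaled separately so that the multiplication by
 * 1000000 only ever applies to a value smaller than cycles_per_sec. */
static uint64_t sketch_cycles_to_usecs(uint64_t cycles, uint64_t cycles_per_sec)
{
    uint64_t whole_secs = cycles / cycles_per_sec;
    uint64_t leftover = cycles % cycles_per_sec;

    return whole_secs * 1000000u + (leftover * 1000000u) / cycles_per_sec;
}
```

For example, 4,500,000,000 cycles at 3,000,000,000 cycles per second converts to 1,500,000 microseconds. Fractions of a microsecond are truncated.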

Furthermore, coNCePTuaL makes use of the High-Precision Event Timers (HPET) device if and only if all of the following conditions hold at run time:

Failing any of those conditions, coNCePTuaL falls back to the timer selected at configuration time. See the HPET specification for more information on HPET.

Function: void ncptl_set_flag_after_usecs (volatile int *flag, uint64_t delay)

ncptl_set_flag_after_usecs() uses the operating system’s interval timer to asynchronously set a variable to ‘1’ after a given number of microseconds. This function is intended to be used to support the ‘FOR time’ construct (see Timed loops). Note that delay is a 64-bit unsigned value, regardless of how ncptl_int is declared.

ncptl_set_flag_after_usecs() is implemented in terms of the setitimer() function and issues a run-time error if the setitimer() function is not available.
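To illustrate the general technique, here is a minimal sketch of setting a flag asynchronously with setitimer() and a SIGALRM handler. It is not the library's implementation. All names beginning with sketch_ are hypothetical, and the sketch makes several assumptions of its own: it returns an error code instead of issuing a run-time error, it expects the caller to clear the flag beforehand, it supports only one pending flag at a time, and it sets the flag immediately for a zero delay (because a zero it_value would disarm the timer).

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* request sigaction() and setitimer() */
#endif
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/time.h>

/* Flag that the SIGALRM handler will set (one pending flag at a time). */
static volatile int *sketch_pending_flag = NULL;

/* Signal handler: runs asynchronously when the interval timer expires. */
static void sketch_alarm_handler(int signum)
{
    (void)signum;
    if (sketch_pending_flag != NULL)
        *sketch_pending_flag = 1;
}

/* Arrange for *flag to become 1 after delay microseconds.
 * Returns 0 on success and -1 on failure. */
static int sketch_set_flag_after_usecs(volatile int *flag, uint64_t delay)
{
    struct sigaction action;
    struct itimerval timer;

    if (delay == 0) {           /* a zero it_value would disarm the timer */
        *flag = 1;
        return 0;
    }
    sketch_pending_flag = flag;

    memset(&action, 0, sizeof(action));
    action.sa_handler = sketch_alarm_handler;
    sigemptyset(&action.sa_mask);
    if (sigaction(SIGALRM, &action, NULL) != 0)
        return -1;

    memset(&timer, 0, sizeof(timer));   /* it_interval = 0: fire once */
    timer.it_value.tv_sec = (time_t)(delay / 1000000u);
    timer.it_value.tv_usec = (suseconds_t)(delay % 1000000u);
    return setitimer(ITIMER_REAL, &timer, NULL);
}
```

A timed loop can then poll the flag, along the lines of: clear the flag, call the function, and repeat the loop body while the flag is still 0.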

Function: void ncptl_udelay (uint64_t delay, int spin0block1)

If spin0block1 is ‘0’, ncptl_udelay() spins for delay microseconds (i.e., using the CPU). If spin0block1 is ‘1’, ncptl_udelay() sleeps for delay microseconds (i.e., relinquishing the CPU). Note that delay is a 64-bit unsigned value, regardless of how ncptl_int is declared. ncptl_udelay() is intended to be used to support the coNCePTuaL language’s SLEEPS and COMPUTES statements (see Delaying execution).

When spin0block1 is ‘0’, ncptl_udelay() uses ncptl_time() to determine when delay microseconds have elapsed. Unless ncptl_time() is known to utilize an extremely low-overhead timer, ncptl_udelay() intersperses calls to ncptl_time() with writes to a dummy variable. When spin0block1 is ‘1’, ncptl_udelay() invokes nanosleep() to introduce delays. ncptl_udelay() issues a run-time error if the nanosleep() function is not available.
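The two modes can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: sketch_now_usecs() is a hypothetical gettimeofday()-based stand-in for ncptl_time(), the count of 100 dummy writes between timer reads is arbitrary, and the sketch returns -1 on failure where the real function issues a run-time error.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* request nanosleep() */
#endif
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical stand-in for ncptl_time(). */
static uint64_t sketch_now_usecs(void)
{
    struct timeval now;

    gettimeofday(&now, NULL);
    return (uint64_t)now.tv_sec * 1000000u + (uint64_t)now.tv_usec;
}

/* Dummy variable written between timer reads while spinning. */
static volatile int sketch_dummy;

/* Delay for the given number of microseconds: spin if spin0block1 is 0,
 * otherwise sleep.  Returns 0 on success and -1 on failure. */
static int sketch_udelay(uint64_t delay, int spin0block1)
{
    if (spin0block1 == 0) {
        /* Spin: keep the CPU busy until enough time has elapsed. */
        uint64_t start = sketch_now_usecs();

        while (sketch_now_usecs() - start < delay) {
            int i;

            for (i = 0; i < 100; i++)   /* reduce how often the timer is read */
                sketch_dummy = i;
        }
        return 0;
    } else {
        /* Sleep: relinquish the CPU, resuming if a signal interrupts. */
        struct timespec request, remaining;

        request.tv_sec = (time_t)(delay / 1000000u);
        request.tv_nsec = (long)(delay % 1000000u) * 1000L;
        while (nanosleep(&request, &remaining) != 0) {
            if (errno != EINTR)
                return -1;
            request = remaining;
        }
        return 0;
    }
}
```

In either mode the actual delay can exceed the requested one, since the spin loop checks the time only every so often and the operating system may wake a sleeping process late.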


Scott Pakin,