Next: , Previous: , Up: Supplied backends   [Contents][Index]


3.3.5 The c_profile backend

Profiling communication events can help explain surprising performance measurements. A profiling tool that automatically instruments coNCePTuaL programs is simpler than manually adding LOGS or OUTPUTS statements to an existing program.

Like the c_trace backend (see The c_trace backend), the c_profile backend adds code to that produced by other backends. The c_profile backend accepts a single—mandatory—command-line argument, --profile =backend, which designates a target backend to use. The restrictions on backend are that it must produce C code and must be derived from the c_generic backend (see Backend creation). Improper backends cause c_profile to abort abnormally.

If the profiled coNCePTuaL program produces log files, the log files will include profiling information in the epilogue, one line for each event type that was used at least once by the corresponding process. (Event types, lists and briefly describes the various event types.) Each line names an event and presents the total number of microseconds spent processing that event, a tally of the number of times that event type was executed, and the quotient of those two numbers. In addition, the amount of memory used to store the event list is also logged.

For example, the following extract from a log file indicates that the process that wrote it spent a total of 6.7s processing 22,000 ‘RECV’ events (which includes waiting time) for an average of 303.5us per ‘RECV’ event:

# Profile of SEND (microseconds, count, average): 5267469 22001 239.4
# Profile of RECV (microseconds, count, average): 6676521 22000 303.5
# Profile of REPEAT (microseconds, count, average): 11985167 1 11985167.0
# Profile of NEWSTMT (microseconds, count, average): 43 1 43.0
# Profile of CODE (microseconds, count, average): 5516 1 5516.0
# Profile of event memory: 528 bytes (6 events * 88 bytes/event)

Each ‘REPEAT’ event includes the time for all of the events that it repeats.

Although the preceding log-file excerpt indicates that a total of 22001+22000+1+1+1 = 44004 events were executed, the ‘event memory’ line clarifies that the event list contained only 6 unique events and therefore required only 528 bytes of memory.

Profiled programs that do not produce log files write profiling information to the standard error device. Because all processes may share a single standard error device, each line of output is preceded by a processor ID as in the following example:

1 SEND 5527125 22000 251.2
1 RECV 6322699 22001 287.4
1 REPEAT 11894523 1 11894523.0
1 event-memory 352 4 88
0 SEND 5267469 22001 239.4
0 RECV 6676521 22000 303.5
0 REPEAT 11985167 1 11985167.0
0 event-memory 352 4 88

The columns are processor ID, event, total microseconds, tally, and average microseconds except when event is ‘event-memory’ in which case the columns are processor ID, ‘event-memory’, total bytes, number of events, and bytes per event. The intention is for the output to be easily parseable using tools such as awk.


Next: , Previous: , Up: Supplied backends   [Contents][Index]

Scott Pakin, pakin@lanl.gov