ncptl-logextract - Extract various bits of information from a coNCePTuaL log file
ncptl-logextract --usage | --help | --man
ncptl-logextract [ --extract=[data|params|env|source|warnings]] [ --format=format] [format-specific options...] [ --before=string] [ --after=string] [ --force-merge [=number]] [ --procs=string] [ --quiet] [ --verbose] [ --output=filename] [filename...]
Background
coNCePTuaL is a domain-specific programming language designed to facilitate writing networking benchmarks and validation suites. coNCePTuaL programs can log data to a file but in only a single file format. ncptl-logextract extracts this log data and outputs it in a variety of formats for use with other applications.
The coNCePTuaL-generated log files that serve as input to ncptl-logextract are plain ASCII files. Syntactically, they contain a number of newline-separated tables. Each table contains a number of newline-separated rows of comma-separated columns. This is known generically as comma-separated value or CSV format. Each table begins with two rows of header text followed by one or more rows of numbers. Text is written within double quotes. Double-quote characters and backslashes within text are escaped with a backslash. No other escaped characters are recognized. Lines that begin with # are considered comments.
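As a rough illustration of the syntax just described, the following Python sketch splits log text into tables and decodes the backslash escapes. The function name parse_log_tables and the sample text are assumptions made for this example; it is not a description of how ncptl-logextract itself is implemented, and it treats a comment line or a blank line as the end of a table.

```python
import csv
import io

def parse_log_tables(text):
    """Return a list of (header_rows, data_rows) tuples, one per table."""
    blocks, current = [], []
    for line in text.splitlines():
        # Comment lines and blank lines both end the table being read.
        if line.startswith("#") or not line.strip():
            if current:
                blocks.append(current)
                current = []
        else:
            current.append(line)
    if current:
        blocks.append(current)

    tables = []
    for block in blocks:
        # Text is double-quoted; quotes and backslashes are backslash-escaped.
        reader = csv.reader(io.StringIO("\n".join(block)),
                            escapechar="\\", doublequote=False)
        rows = list(reader)
        headers, data = rows[:2], rows[2:]  # two header rows, then numbers
        tables.append((headers, [[float(cell) for cell in row] for row in data]))
    return tables

# Hypothetical log excerpt, invented for this example.
sample = '''# A comment line
"Size","Value"
"(all data)","(mean)"
1,2.5
2,4

"Say \\"hi\\""
"(sum)"
10
'''
tables = parse_log_tables(sample)
print(len(tables), tables[0][1])
```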
Semantically, there are four types of data present in every coNCePTuaL-generated log file:
The first three items appear within comment lines. The measurement data is written in CSV format.
Extracting information from coNCePTuaL log files
It is common to want to extract information (especially measurement data) from log files. For simple formatting operations, a one-line awk or Perl script suffices. However, as the complexity of the formatting increases, the complexity of these scripts increases even more. That’s where ncptl-logextract fits in. ncptl-logextract makes it easy to extract any of the four types of log data described above and format it in a variety of ways. Although the number of options that ncptl-logextract supports may be somewhat daunting, it is well worth learning how to use ncptl-logextract to avoid reinventing the wheel every time a coNCePTuaL log file needs to be processed. ncptl-logextract takes care of all sorts of special cases that crop up when manipulating coNCePTuaL log files.
ncptl-logextract accepts the following command-line options regardless of what data is extracted from the log file and what formatting occurs:
--help
Output the Synopsis section and the Options section then exit the program.
--man
Output a complete Unix man (“manual”) page for ncptl-logextract then exit the program.
--extract=info
Specify what sort of data should be extracted from the log file. Acceptable values for info are listed and described in the Additional Options section and include data, params, env, and source.
--format=format
Specify how the extracted data should be formatted. Valid arguments depend upon the value passed to --extract and include such formats as csv, html, latex, text, and bash. See the Additional Options section for details, explanations, and descriptions of applicability.
--before=string
Output an arbitrary string of text before any other output. string can contain escape characters such as \n for newline, \t for tab, and \\ for backslash.
--after=string
Output an arbitrary string of text after all other output. string can contain escape characters such as \n for newline, \t for tab, and \\ for backslash.
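The escape handling described for --before and --after can be pictured with a short Python sketch. The name expand_escapes is hypothetical, only the three escapes named above are handled, and leaving any other backslash sequence untouched is an assumption of this example rather than documented behavior.

```python
# Only the three escapes named in the text are handled here.
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\"}

def expand_escapes(s):
    """Expand backslash-n, backslash-t, and doubled backslashes in s."""
    out = []
    i = 0
    while i < len(s):
        if s[i] == "\\" and i + 1 < len(s) and s[i + 1] in ESCAPES:
            out.append(ESCAPES[s[i + 1]])
            i += 2  # consume the backslash and the character after it
        else:
            out.append(s[i])
            i += 1
    return "".join(out)

print(repr(expand_escapes(r"<body>\n")))
```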
--force-merge[=number]
Try extra hard to merge multiple log files, even if they seem to have been produced by different programs or in different execution environments. This generally implies padding empty rows and columns with blanks. However, if --force-merge is given a numeric argument, the value of that argument is used instead of blanks to pad empty locations. Note that --force-merge is different from --force-merge=0 because data-merging functions (mean, max, etc.) ignore blanks but consider zeroes.
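The difference between padding with blanks and padding with zeroes can be seen in a small Python sketch. The helper names pad and merge_mean are invented for this example, and None stands in for a blank cell; the real merging logic may differ in its details.

```python
from statistics import mean

def pad(values, length, fill=None):
    """Pad a list to the given length; None represents a blank cell."""
    return list(values) + [fill] * (length - len(values))

def merge_mean(values):
    """Arithmetic mean that ignores blanks but counts zeroes."""
    present = [v for v in values if v is not None]
    return mean(present) if present else None

measured = [4.0, 8.0]                # a cell present in two of three log files
blank_padded = pad(measured, 3)      # analogous to --force-merge
zero_padded = pad(measured, 3, 0)    # analogous to --force-merge=0

print(merge_mean(blank_padded))      # 6.0: the blank is ignored
print(merge_mean(zero_padded))       # 4.0: the zero is averaged in
```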
--procs=string
When given a “merged” log file, unmerge only the data corresponding to the comma-separated processor ranges in string. For example, --procs=0,16-20,25 unmerges the data for processors 0, 16, 17, 18, 19, 20, and 25. By default, ncptl-logextract uses all of the data from a merged log file.
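The range syntax accepted by --procs can be expanded as in the following Python sketch. The function name expand_proc_ranges is hypothetical and the sketch performs no error checking.

```python
def expand_proc_ranges(spec):
    """Expand a string such as "0,16-20,25" into a list of processor numbers."""
    procs = []
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-", 1)
            procs.extend(range(int(first), int(last) + 1))  # ranges are inclusive
        else:
            procs.append(int(part))
    return procs

print(expand_proc_ranges("0,16-20,25"))  # [0, 16, 17, 18, 19, 20, 25]
```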
--quiet
Suppress progress output. Normally, ncptl-logextract outputs status information regarding its operation. The --quiet option instructs ncptl-logextract to output only warning and error messages.
--verbose
Increase progress output. Normally, ncptl-logextract outputs basic status information regarding its operation. The --verbose option instructs ncptl-logextract to output more detailed information. Each time --verbose is specified, the program’s verbosity increases (up to a maximum).
--output=filename
Redirect the output from ncptl-logextract to a file. By default, ncptl-logextract writes to the standard output device.
The above is merely a terse summary of the ncptl-logextract command-line options. The reader is directed to the Additional Options section for descriptions of the numerous ways that ncptl-logextract can format information. Note that --extract and --format are the two most common options as they specify what to extract and how to format it; most of the remaining options in the Additional Options section exist to provide precise control over formatting details.
The ncptl-logextract command-line options follow a hierarchy. At the top level is --extract, which specifies which of the four types of data ncptl-logextract should extract. Next, --format specifies how the extracted data should be formatted. Valid values for --format differ based on the argument to --extract. Finally, there are various format-specific options that fine-tune the formatted output. Each output format accepts a different set of options. Many of the options appear at multiple places within the hierarchy, although usually with different default values.
The following hierarchical list describes all of the valid combinations of --extract, --format, and the various format-specific options:
Extract measurement data
  Output each table in comma-separated-value format
    Do not output column headers
    Specify the text placed at the beginning of each data column [default: “”]
    Specify the text used to separate data columns [default: “,”]
    Specify the text placed at the end of each data column [default: “”]
    Specify the text placed at the beginning of each data row [default: “”]
    Specify the text used to separate data rows [default: “”]
    Specify the text placed at the end of each data row [default: “\\n”]
    Specify the text placed at the beginning of each header column [default: same as colbegin]
    Specify the text used to separate header columns [default: same as colsep]
    Specify the text placed at the end of each header column [default: same as colend]
    Specify the text placed at the beginning of each header row [default: same as rowbegin]
    Specify the text used to separate header rows [default: same as rowsep]
    Specify the text placed at the end of each header row [default: same as rowend]
    Specify the text placed at the beginning of each table [default: “”]
    Specify the text used to separate tables [default: “\\n”]
    Specify the text placed at the end of each table [default: “”]
    Specify the text used to begin quoted text [default: “"”]
    Specify the text used to end quoted text [default: same as quote]
    Output strings in a format readable by Microsoft Excel
    Enumerate the columns that should be included in the output [default: all columns]
    Specify how to merge data from multiple files [default: “mean”]
    Add an extra header row showing the filename the data came from [default: “none”]
  Output each table in tab-separated-value format
    Do not output column headers
    Specify the text placed at the beginning of each data column [default: “”]
    Specify the text used to separate data columns [default: “\\t”]
    Specify the text placed at the end of each data column [default: “”]
    Specify the text placed at the beginning of each data row [default: “”]
    Specify the text used to separate data rows [default: “”]
    Specify the text placed at the end of each data row [default: “\\n”]
    Specify the text placed at the beginning of each header column [default: same as colbegin]
    Specify the text used to separate header columns [default: same as colsep]
    Specify the text placed at the end of each header column [default: same as colend]
    Specify the text placed at the beginning of each header row [default: same as rowbegin]
    Specify the text used to separate header rows [default: same as rowsep]
    Specify the text placed at the end of each header row [default: same as rowend]
    Specify the text placed at the beginning of each table [default: “”]
    Specify the text used to separate tables [default: “\\n”]
    Specify the text placed at the end of each table [default: “”]
    Specify the text used to begin quoted text [default: “"”]
    Specify the text used to end quoted text [default: same as quote]
    Output strings in a format readable by Microsoft Excel
    Enumerate the columns that should be included in the output [default: all columns]
    Specify how to merge data from multiple files [default: “mean”]
    Add an extra header row showing the filename the data came from [default: “none”]
  Output each table in HTML table format
    Do not output column headers
    Specify the text placed at the beginning of each data column [default: “<td>”]
    Specify the text used to separate data columns [default: “ ”]
    Specify the text placed at the end of each data column [default: “</td>”]
    Specify the text placed at the beginning of each data row [default: “<tr>”]
    Specify the text used to separate data rows [default: “”]
    Specify the text placed at the end of each data row [default: “</tr>\\n”]
    Specify the text placed at the beginning of each header column [default: “<th>”]
    Specify the text used to separate header columns [default: same as colsep]
    Specify the text placed at the end of each header column [default: “</th>”]
    Specify the text placed at the beginning of each header row [default: same as rowbegin]
    Specify the text used to separate header rows [default: same as rowsep]
    Specify the text placed at the end of each header row [default: same as rowend]
    Specify the text placed at the beginning of each table [default: “<table>\\n”]
    Specify the text used to separate tables [default: “”]
    Specify the text placed at the end of each table [default: “</table>\\n”]
    Specify the text used to begin quoted text [default: “”]
    Specify the text used to end quoted text [default: same as quote]
    Enumerate the columns that should be included in the output [default: all columns]
    Specify how to merge data from multiple files [default: “mean”]
    Add an extra header row showing the filename the data came from [default: “none”]
  Output each table as a gnuplot data file
    Do not output column headers
    Specify the text placed at the beginning of each data column [default: “”]
    Specify the text used to separate data columns [default: “ ”]
    Specify the text placed at the end of each data column [default: “”]
    Specify the text placed at the beginning of each data row [default: “”]
    Specify the text used to separate data rows [default: “”]
    Specify the text placed at the end of each data row [default: “\\n”]
    Specify the text placed at the beginning of each header column [default: same as colbegin]
    Specify the text used to separate header columns [default: same as colsep]
    Specify the text placed at the end of each header column [default: same as colend]
    Specify the text placed at the beginning of each header row [default: “#”]
    Specify the text used to separate header rows [default: same as rowsep]
    Specify the text placed at the end of each header row [default: same as rowend]
    Specify the text placed at the beginning of each table [default: “”]
    Specify the text used to separate tables [default: “\\n\\n”]
    Specify the text placed at the end of each table [default: “”]
    Specify the text used to begin quoted text [default: “"”]
    Specify the text used to end quoted text [default: same as quote]
    Enumerate the columns that should be included in the output [default: all columns]
    Specify how to merge data from multiple files [default: “mean”]
    Add an extra header row showing the filename the data came from [default: “none”]
  Output each table as an Octave text-format data file
    Do not output column headers
    Specify the text placed at the beginning of each data column [default: “”]
    Specify the text used to separate data columns [default: “”]
    Specify the text placed at the end of each data column [default: “\\n”]
    Specify the text placed at the beginning of each data row [default: “”]
    Specify the text placed at the end of each data row [default: “”]
    Specify the text placed at the beginning of each header column [default: “”]
    Specify the text used to separate header columns [default: “_”]
    Specify the text placed at the end of each header column [default: “”]
    Specify the text placed at the beginning of each header row [default: “#”]
    Specify the text used to separate header rows [default: “”]
    Specify the text placed at the end of each header row [default: “\\n”]
    Specify the text placed at the beginning of each table [default: “”]
    Specify the text used to separate tables [default: “\\n”]
    Specify the text placed at the end of each table [default: “”]
    Specify the text used to begin quoted text [default: “”]
    Specify the text used to end quoted text [default: same as quote]
    Enumerate the columns that should be included in the output [default: all columns]
    Specify how to merge data from multiple files [default: “mean”]
    Add an extra header row showing the filename the data came from [default: “none”]
  Output each table in a completely user-specified format
    Do not output column headers
    Specify the text placed at the beginning of each data column [default: “”]
    Specify the text used to separate data columns [default: “”]
    Specify the text placed at the end of each data column [default: “”]
    Specify the text placed at the beginning of each data row [default: “”]
    Specify the text used to separate data rows [default: “”]
    Specify the text placed at the end of each data row [default: “”]
    Specify the text placed at the beginning of each header column [default: same as colbegin]
    Specify the text used to separate header columns [default: same as colsep]
    Specify the text placed at the end of each header column [default: same as colend]
    Specify the text placed at the beginning of each header row [default: same as rowbegin]
    Specify the text used to separate header rows [default: same as rowsep]
    Specify the text placed at the end of each header row [default: same as rowend]
    Specify the text placed at the beginning of each table [default: “”]
    Specify the text used to separate tables [default: “”]
    Specify the text placed at the end of each table [default: “”]
    Specify the text used to begin quoted text [default: “”]
    Specify the text used to end quoted text [default: same as quote]
    Output strings in a format readable by Microsoft Excel
    Enumerate the columns that should be included in the output [default: all columns]
    Specify how to merge data from multiple files [default: “mean”]
    Add an extra header row showing the filename the data came from [default: “none”]
  Output each table as a LaTeX tabular environment
    Use the dcolumn package to align numbers on the decimal point
    Use the booktabs package for a more professionally typeset look
    Use the longtable package to enable multi-page tables
    Enumerate the columns that should be included in the output [default: all columns]
    Specify how to merge data from multiple files [default: “mean”]
    Add an extra header row showing the filename the data came from [default: “none”]
Extract the program’s run-time parameters and environment variables
  Output the parameters in plain-text format
    Read from a file the list of keys to output
    Ignore any keys whose name matches a regular expression
    Sort the list of parameters alphabetically by key
    Exclude environment variables
    Exclude run-time parameters
    Format environment variable names using the given template [default: “%s (environment variable)”]
    Output the parameters as a 1-, 2-, or 3-column table [default: 1]
    Specify the text used to separate data columns [default: “:”]
    Specify the text that’s output at the start of each data row [default: “”]
    Specify the text that’s output at the end of each data row [default: “\\n”]
  Output a list of the keys only (i.e., no values)
    Read the list of parameters to output from a given file
    Ignore any keys whose name matches a regular expression
    Format environment variable names using the given template [default: “%s (environment variable)”]
    Sort the list of parameters alphabetically by key
    Exclude environment variables
    Exclude run-time parameters
  Output the parameters as a LaTeX tabular environment
    Read from a file the list of keys to output
    Ignore any keys whose name matches a regular expression
    Format environment variable names using the given template [default: “%s (environment variable)”]
    Sort the list of parameters alphabetically by key
    Use the booktabs package for a more professionally typeset look
    Use the tabularx package to enable line wraps within the value column
    Use the longtable package to enable multi-page tables
    Exclude environment variables
    Exclude run-time parameters
Extract the environment in which the program was run
  Use Bourne shell syntax for setting environment variables
    Separate commands with newlines instead of semicolons
    Unset all other environment variables
    Switch to the program’s original working directory
  Use Bourne Again shell syntax for setting environment variables
    Separate commands with newlines instead of semicolons
    Unset all other environment variables
    Switch to the program’s original working directory
  Use Korn shell syntax for setting environment variables
    Separate commands with newlines instead of semicolons
    Unset all other environment variables
    Switch to the program’s original working directory
  Use C shell syntax for setting environment variables
    Separate commands with newlines instead of semicolons
    Unset all other environment variables
    Switch to the program’s original working directory
  Use Z shell syntax for setting environment variables
    Separate commands with newlines instead of semicolons
    Unset all other environment variables
    Switch to the program’s original working directory
  Use tcsh syntax for setting environment variables
    Separate commands with newlines instead of semicolons
    Unset all other environment variables
    Switch to the program’s original working directory
  Use ash syntax for setting environment variables
    Separate commands with newlines instead of semicolons
    Unset all other environment variables
    Switch to the program’s original working directory
Extract coNCePTuaL source code
  Output the source code in plain-text format
    Specify the text placed at the beginning of each line [default: “”]
    Specify the text placed at the end of each line [default: “\\n”]
    Specify the text placed before each keyword [default: “”]
    Specify the text placed after each keyword [default: “”]
    Specify the text placed before each string [default: “”]
    Specify the text placed after each string [default: “”]
    Specify the text placed before each comment [default: “”]
    Specify the text placed after each comment [default: “”]
    Indent each line by a given number of spaces
    Wrap the source code into a paragraph with a given character width
Extract a list of warnings the program issued during initialization
  Output warnings in plain-text format
    Specify text to appear at the beginning of the list [default: “”]
    Specify text to appear at the end of the list [default: “”]
    Specify text to appear before each warning [default: “*”]
    Specify text to appear after each warning [default: “\\n”]
  Output warnings as an HTML list
    Specify text to appear at the beginning of the list [default: “<ul>\\n”]
    Specify text to appear at the end of the list [default: “</ul>\\n”]
    Specify text to appear before each warning [default: “ <li>”]
    Specify text to appear after each warning [default: “</li>\\n”]
  Output warnings as a LaTeX list
    Specify text to appear at the beginning of the list [default: “\begin{itemize}\\n”]
    Specify text to appear at the end of the list [default: “\end{itemize}\\n”]
    Specify text to appear before each warning [default: “ \item”]
    Specify text to appear after each warning [default: “\\n”]
The following represent additional clarification for some of the above:
When --wrap is given without an argument, the width defaults to 72 columns.
The following examples show how the same parameters are formatted with --columns set in turn to each of 1, 2, and 3:
--columns=1:
    coNCePTuaL version: 1.0
    coNCePTuaL backend: c_mpi
    Average timer overhead [gettimeofday()]: <1 microsecond
    Log creation time: Thu Mar 27 19:22:48 2003
    Log completion time: Thu Mar 27 19:22:48 2003
--columns=2:
    coNCePTuaL version:                      1.0
    coNCePTuaL backend:                      c_mpi
    Average timer overhead [gettimeofday()]: <1 microsecond
    Log creation time:                       Thu Mar 27 19:22:48 2003
    Log completion time:                     Thu Mar 27 19:22:48 2003
--columns=3:
    coNCePTuaL version                     : 1.0
    coNCePTuaL backend                     : c_mpi
    Average timer overhead [gettimeofday()]: <1 microsecond
    Log creation time                      : Thu Mar 27 19:22:48 2003
    Log completion time                    : Thu Mar 27 19:22:48 2003
--merge takes one of mean (arithmetic mean), hmean (harmonic mean), min (minimum), max (maximum), median (median), sum (sum), all (all values from each column), or concat (horizontal concatenation of all data), and applies the function to corresponding data values across all of the input files. --merge can also accept a comma-separated list of the above functions, one per data column. This enables a different merge operation to be used for each column. For example, --merge=min,min,mean will take the minimum value across all files of each element in the first and second columns and the arithmetic mean across all files of each element in the third column. If the number of comma-separated values differs from the number of columns and --force-merge is specified, ncptl-logextract will cycle over the given values until all columns are accounted for. The concat merge type applies to all columns and therefore cannot be combined with any other merge type. The difference between --merge=all and --merge=concat is that the former merges three files each with columns A and B as {A, A, A, B, B, B} while the latter merges the same files as {A, B, A, B, A, B}.
Note that --merge is applied after --keep-columns. Hence, if --keep-columns specifies that only three columns be kept, --merge should list exactly three operations (or a single operation that applies to all three columns).
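The per-column merging and the column ordering of all versus concat can be sketched in Python as follows. The names merge_tables and column_order are hypothetical, tables are assumed to be same-shaped lists of numeric rows, and the sketch always cycles over the function list, whereas the text above says ncptl-logextract does so only when --force-merge is specified.

```python
from itertools import cycle, islice
from statistics import harmonic_mean, mean, median

MERGE_FUNCS = {"mean": mean, "hmean": harmonic_mean, "min": min,
               "max": max, "median": median, "sum": sum}

def merge_tables(tables, spec):
    """Merge same-shaped tables cell by cell using one function per column."""
    ncols = len(tables[0][0])
    # Simplification: reuse the listed functions until every column has one.
    names = list(islice(cycle(spec.split(",")), ncols))
    merged = []
    for rows in zip(*tables):  # corresponding rows across all files
        merged.append([MERGE_FUNCS[names[c]]([row[c] for row in rows])
                       for c in range(ncols)])
    return merged

def column_order(labels, nfiles, how):
    """Show the column layout produced by the all and concat merge types."""
    if how == "all":
        return [label for label in labels for _ in range(nfiles)]
    if how == "concat":
        return list(labels) * nfiles
    raise ValueError(how)

file1 = [[1, 10, 2.0]]
file2 = [[3, 5, 4.0]]
print(merge_tables([file1, file2], "min,min,mean"))  # [[1, 5, 3.0]]
print(column_order(["A", "B"], 3, "all"))
print(column_order(["A", "B"], 3, "concat"))
```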
--showfnames takes one of none, all, or first. The default is none, which doesn’t add an extra header row. all repeats the filename in each column of the extra header row. first outputs the filename in only the first column, leaving the remaining columns with an empty string. The following examples show how a sample data table is formatted with --showfnames set in turn to each of none, all, and first:
none (the default):
    "Size","Value"
    1,2
    2,4
    3,6
all (filename repeated in each column of the first row):
    "mydata.log","mydata.log"
    "Size","Value"
    1,2
    2,4
    3,6
first (filename shown only in the first column of the first row):
    "mydata.log",""
    "Size","Value"
    1,2
    2,4
    3,6
ltxtable LaTeX package. See the ltxtable documentation for more information.
If no filenames are given, ncptl-logextract will read from the standard input device. If multiple log files are specified, ncptl-logextract will merge the data values and take all other information from the first file specified. Note, however, that all of the log files must have been produced by the same coNCePTuaL program and that that program must have been run in the same environment. In other words, only the data values may change across log files; everything else must be invariant. See the description of --merge in the Additional Options section for more information about merging data values from multiple log files.
ncptl-logextract treats certain files specially:
ncptl-logextract treats filenames ending in .tgz as if they ended in .tar.gz and filenames ending in .taz as if they ended in .tar.Z.
If the argument provided to any ncptl-logextract option begins with an at sign (“@”), the value is treated as a filename and is replaced by the file’s contents. To specify a non-filename argument that begins with an at sign, merely prepend an additional “@”:
--this=that
The option this is given the value “that”.
--this=@that
The option this is set to the contents of the file called that.
--this=@@that
The option this is given the value “@that”.
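The three cases above can be summarized in a short Python sketch. The function name resolve_option_value and the temporary file are assumptions of this example; the removal of one trailing newline follows the note in the examples section about reading a file using “@”.

```python
import os
import tempfile

def resolve_option_value(value):
    """Apply the at-sign convention to a single option argument."""
    if value.startswith("@@"):
        return value[1:]  # "@@that" stands for the literal text "@that"
    if value.startswith("@"):
        with open(value[1:], encoding="utf-8") as f:
            contents = f.read()
        # A single trailing newline is stripped from the file's contents.
        return contents[:-1] if contents.endswith("\n") else contents
    return value

# Demonstrate the file case with a throwaway file (hypothetical contents).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "that")
    with open(path, "w", encoding="utf-8") as f:
        f.write("from file\n")
    from_file = resolve_option_value("@" + path)

print(resolve_option_value("that"))
print(from_file)
print(resolve_option_value("@@that"))
```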
For the following examples, we assume that results.log is the name of a log file produced by a coNCePTuaL program.
Extract the data in CSV format and write it to results.csv:
ncptl-logextract --extract=data results.log --output=results.csv
Note that --extract=data is the default and therefore optional:
ncptl-logextract results.log --output=results.csv
ncptl-logextract can combine data from multiple log files (using an arithmetic mean by default):
ncptl-logextract results-*.log --output=results.csv
Put the data from all of the log files side-by-side and produce a CSV file that Microsoft Excel can read directly:
ncptl-logextract results-*.log --output=results.csv --merge=all \
    --showfnames=first --excel
Output the data from results.log in tab-separated-value format:
ncptl-logextract --format=tsv results.log
Output the data in space-separated-value format:
ncptl-logextract --colsep=" " results.log
Use gnuplot to draw a PostScript graph of the data:
ncptl-logextract results.log --format=gnuplot \
    --before=@params.gp | gnuplot > results.eps
In the above, the params.gp file might contain gnuplot commands such as the following:
set terminal postscript eps enhanced color "Times-Roman" 30
set output
set logscale xy
set data style linespoints
set pointsize 3
plot "-" title "Latency"
(There should be an extra blank line at the end of the file because ncptl-logextract strips off a trailing newline character whenever it reads a file using “@”.)
Produce a complete HTML file of the data (noting that --format=html produces only tables, not complete documents):
ncptl-logextract --format=html --before='<html>\n<head>\n<title>Data</title>\n</head>\n<body>\n' \
    --after='</body>\n</html>\n' results.log
Output the data as a LaTeX tabular, relying on both the (standard) dcolumn and (non-standard) booktabs packages for more attractive formatting:
ncptl-logextract --format=latex --dcolumn --booktabs \
    --output=results.tex results.log
Output the run-time parameters in the form “key --> value” with all of the arrows aligned:
ncptl-logextract results.log --extract=params --columns=3 --colsep=" --> "
Output the run-time parameters as an HTML description list:
ncptl-logextract results.log --extract=params --before='<dl>' \
    --rowbegin='<dt>' --colsep='</dt><dd>' --rowend='</dd>\n' \
    --after='</dl>\n'
Restore the exact execution environment that was used to produce results.log, including the current working directory (assuming that bash is the current command shell):
eval `ncptl-logextract --extract=env --format=bash \
    --unset --chdir results.log`
Set all of the environment variables that were used to produce results.log, overwriting—but not removing—whatever environment variables are currently set (assuming that tcsh is the current command shell):
eval `ncptl-logextract --extract=env --format=tcsh results.log`
Extract the source code that produced results.log:
ncptl-logextract --extract=source results.log
Do the same, but indent the code by four spaces then re-wrap it into a 60-column paragraph:
ncptl-logextract --extract=source --indent=4 --wrap=60 results.log
Here are a variety of ways to express the same thing:
ncptl-logextract -e source --indent=4 --wrap=60 results.log
ncptl-logextract -e source --indent=4 results.log --wrap=60
cat results.log | ncptl-logextract --wrap=60 --indent=4 -e source
Output the source code wrapped to 72 columns, with no indentation, and formatted within an HTML preformatted-text block:
ncptl-logextract --extract=source --wrap --before="<PRE>\n" \
    --after="</PRE>\n" results.log
List all of the warning messages that occur in results.log:
ncptl-logextract --extract=warnings results.log
ncptl-logmerge(1), ncptl-logunmerge(1), the coNCePTuaL User’s Guide
Scott Pakin, pakin@lanl.gov