The interpret backend normally simulates a flat network in
which communication between any two tasks takes unit latency.
However, the --hierarchy option lets one
specify a hierarchy of latencies: latency l1 within a set
of t1 tasks, latency l1+l2 within a set of
t1*t2 tasks, latency l1+l2+l3 within a set of
t1*t2*t3 tasks, and so forth.
The argument to --hierarchy is a comma-separated list of <task factor:latency delta> pairs. The latency delta component is optional and defaults to ‘1’. If the list ends with ‘...’, the final <task factor:latency delta> pair is repeated indefinitely. All task factor values must be positive integers, and all latency delta values must be nonnegative integers.
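The argument syntax described above can be illustrated with a small parser. This is only a sketch of the stated rules, not the interpret backend's own code: the function name parse_hierarchy and its return shape are hypothetical, and it assumes that whitespace around commas is insignificant and that at least one pair must be given.

```python
def parse_hierarchy(arg):
    """Parse a --hierarchy argument (hypothetical helper, not the backend's code).

    Returns (pairs, repeat_last), where pairs is a list of
    (task_factor, latency_delta) tuples and repeat_last is True when the
    list ends with '...'.
    """
    # Assumption: whitespace around the commas is not significant.
    items = [item.strip() for item in arg.split(",")]

    # A trailing '...' means the final pair is repeated indefinitely.
    repeat_last = items[-1] == "..."
    if repeat_last:
        items.pop()
    if not items:
        # Assumption: a list with no pairs at all is treated as an error.
        raise ValueError("expected at least one <task factor:latency delta> pair")

    pairs = []
    for item in items:
        factor_text, colon, delta_text = item.partition(":")
        factor = int(factor_text)
        # The latency delta is optional and defaults to 1.
        delta = int(delta_text) if colon else 1
        if factor <= 0:
            raise ValueError("task factor must be a positive integer: %r" % item)
        if delta < 0:
            raise ValueError("latency delta must be a nonnegative integer: %r" % item)
        pairs.append((factor, delta))
    return pairs, repeat_last
```

With this sketch, parse_hierarchy("4") and parse_hierarchy("4:1") both yield a single (4, 1) pair with no repetition, which mirrors the default latency delta of ‘1’.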
As an example, --hierarchy=4 (or --hierarchy=4:1) partitions the program’s tasks into sets of four with one unit of additional latency to communicate with another set. Hence, tasks 0, 1, 2, and 3 can communicate with each other in unit time, as can tasks 4, 5, 6, and 7. However, communication between any task in the first set and any task in the second set (e.g., between tasks 3 and 4) takes an additional unit of time for a total latency of two units.
As a more complex example, consider a cluster of symmetric
multiprocessors (SMPs) in which each SMP comprises two processor
sockets with each socket containing a quad-core processor (four
CPUs). Further assume that the SMPs are networked together via a
fat-tree network consisting of 12-port switches. In this example,
let’s say that it takes unit latency to communicate within a
socket, two units of latency to communicate with another socket in
the same SMP, and an additional three units of latency to traverse
each switch. This configuration can be specified to the interpret
backend as --hierarchy="4:1, 2:1, 12:3, 12:6, ..." (or simply
--hierarchy=4,2,12:3,12:6, ...). With this setting, task 0, for
instance, communicates with tasks 1–3 in
1 time unit, with tasks 4–7 in
2 time units, with tasks 8–95 in
5 time units (one switch crossing), with
tasks 96–1151 in 11 time units
(three switch crossings—a level 0
switch, a level 1 switch, and another level 0
switch), with tasks 1152–13,823 in
17 time units (five switch crossings—of switch
levels 0, 1, 2, 1, and 0), and so
forth.
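The latencies in this example can be checked with a short calculation. The following is a minimal sketch of the rule "latency l1+...+lk within a set of t1*...*tk tasks", assuming that sets are contiguous, aligned ranges of task ranks (as in the tasks 0–3 and 4–7 example). The function latency and its parameters are hypothetical names, not part of the interpret backend. The sketch deliberately raises an error rather than guessing when two tasks share no listed set and the list has no trailing ‘...’.

```python
def latency(task_a, task_b, pairs, repeat_last=False):
    """Return the latency between two task ranks (hypothetical helper).

    pairs is a list of (task_factor, latency_delta) tuples; repeat_last
    plays the role of a trailing '...' in the --hierarchy argument.
    """
    if not pairs:
        raise ValueError("at least one (task factor, latency delta) pair is required")

    set_size = 1   # t1*t2*...*tk for the level being examined
    total = 0      # l1+l2+...+lk for the same level

    def grow_and_check(factor, delta):
        # Move one level up the hierarchy and test whether both tasks
        # now fall into the same (contiguous, aligned) set.
        nonlocal set_size, total
        set_size *= factor
        total += delta
        return task_a // set_size == task_b // set_size

    for factor, delta in pairs:
        if grow_and_check(factor, delta):
            return total

    if not repeat_last:
        # Assumption: this sketch does not model tasks outside every listed set.
        raise ValueError("tasks share no listed set")

    factor, delta = pairs[-1]
    if factor == 1:
        raise ValueError("repeating a task factor of 1 never enlarges the set")
    while not grow_and_check(factor, delta):
        pass
    return total


# The SMP cluster from the example: --hierarchy="4:1, 2:1, 12:3, 12:6, ..."
smp_cluster = [(4, 1), (2, 1), (12, 3), (12, 6)]
for peer in (1, 4, 8, 96, 1152):
    print(peer, latency(0, peer, smp_cluster, repeat_last=True))
```

Under these assumptions the loop reports latencies of 1, 2, 5, 11, and 17 time units for peers 1, 4, 8, 96, and 1152, matching the figures quoted above for task 0.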