Task latency hierarchies (coNCePTuaL User’s Guide)

Task latency hierarchies

The interpret backend normally simulates a flat network in which communication between any two tasks takes unit latency. However, the --hierarchy option lets one specify a hierarchy of latencies: latency l1 within a set of t1 tasks, latency l1+l2 within a set of t1*t2 tasks, latency l1+l2+l3 within a set of t1*t2*t3 tasks, and so forth.
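The rule above amounts to a running product of the task counts paired with a running sum of the latencies. The following is a minimal Python sketch of that arithmetic only. It is not code from the interpret backend, and the function name hierarchy_levels and the sample numbers are made up for illustration.

```python
from itertools import accumulate
from operator import mul


def hierarchy_levels(factors, deltas):
    """Return one (set size, latency) pair per level of the hierarchy.

    Set sizes are running products of the task factors (t1, t1*t2, ...);
    latencies are running sums of the latency deltas (l1, l1+l2, ...).
    """
    if len(factors) != len(deltas):
        raise ValueError("need one latency delta per task factor")
    sizes = accumulate(factors, mul)
    latencies = accumulate(deltas)
    return list(zip(sizes, latencies))


# Hypothetical three-level hierarchy: t = (2, 3, 4) and l = (1, 2, 3).
print(hierarchy_levels([2, 3, 4], [1, 2, 3]))  # [(2, 1), (6, 3), (24, 6)]
```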

The argument to --hierarchy is a comma-separated list of <task factor:latency delta> pairs. The latency delta component is optional and defaults to ‘1’. If the list ends with ‘...’, the final <task factor:latency delta> pair is repeated indefinitely. All task factor values must be positive integers, and all latency delta values must be nonnegative integers.
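To make the syntax rules concrete, here is a hedged Python sketch of how such an argument could be parsed and validated. It is not the parser used by coNCePTuaL, and the name parse_hierarchy and its return shape are hypothetical. It also assumes that whitespace around items is ignored and that at least one pair must precede a trailing ‘...’, neither of which is stated above.

```python
def parse_hierarchy(arg):
    """Parse a --hierarchy argument into (pairs, repeat_last).

    pairs is a list of (task_factor, latency_delta) tuples; repeat_last is
    True when the list ends with '...'.
    """
    items = [item.strip() for item in arg.split(",")]
    repeat_last = items[-1] == "..."
    if repeat_last:
        items.pop()
    pairs = []
    for item in items:
        factor_text, sep, delta_text = item.partition(":")
        factor = int(factor_text)               # ValueError if not an integer
        delta = int(delta_text) if sep else 1   # latency delta defaults to 1
        if factor < 1:
            raise ValueError(f"task factor must be positive: {item!r}")
        if delta < 0:
            raise ValueError(f"latency delta must be nonnegative: {item!r}")
        pairs.append((factor, delta))
    if not pairs:
        # Assumption of this sketch: '...' needs a preceding pair to repeat.
        raise ValueError("at least one task factor is required")
    return pairs, repeat_last


print(parse_hierarchy("4"))           # ([(4, 1)], False)
print(parse_hierarchy("4:2, 3, ..."))  # ([(4, 2), (3, 1)], True)
```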

As an example, --hierarchy=4 (or --hierarchy=4:1) partitions the program’s tasks into sets of four with one unit of additional latency to communicate with another set. Hence, tasks 0, 1, 2, and 3 can communicate with each other in unit time, as can tasks 4, 5, 6, and 7. However, communication between any task in the first set and any task in the second set (e.g., between tasks 3 and 4) takes an additional unit of time for a total latency of two units.
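The one-level case can be sketched in a few lines of Python. This is an illustration rather than the backend's implementation. The name single_level_latency is hypothetical, and the sketch assumes that sets are formed from consecutive task numbers, as the task ranges in the example suggest.

```python
def single_level_latency(a, b, set_size=4, delta=1):
    """Latency between distinct tasks a and b under a one-level hierarchy
    such as --hierarchy=4:1: unit latency inside a set, plus delta extra
    units when the two tasks fall in different sets."""
    same_set = a // set_size == b // set_size
    return 1 if same_set else 1 + delta


print(single_level_latency(0, 3))  # 1 (both tasks are in the first set)
print(single_level_latency(3, 4))  # 2 (the tasks are in different sets)
```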

As a more complex example, consider a cluster of symmetric multiprocessors (SMPs) in which each SMP comprises two processor sockets with each socket containing a quad-core processor (four CPUs). Further assume that the SMPs are networked together via a fat-tree network consisting of 12-port switches. In this example, let’s say that it takes unit latency to communicate within a socket, two units of latency to communicate with another socket in the same SMP, and an additional three units of latency to traverse each switch. This configuration can be specified to the interpret backend as --hierarchy="4:1, 2:1, 12:3, 12:6, ..." (or simply --hierarchy=4,2,12:3,12:6,...). With this setting, task 0, for instance, communicates with tasks 1–3 in 1 time unit, with tasks 4–7 in 2 time units, with tasks 8–95 in 5 time units (one switch crossing), with tasks 96–1151 in 11 time units (three switch crossings: a level 0 switch, a level 1 switch, and another level 0 switch), with tasks 1152–13,823 in 17 time units (five switch crossings, at switch levels 0, 1, 2, 1, and 0), and so forth.
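The figures in this example can be checked with a short Python sketch that walks up the hierarchy until both tasks fall in the same set. It is a reconstruction from the description in this section, not code taken from the interpret backend. The function name latency is hypothetical, sets are assumed to consist of consecutive task numbers, and the last pair is always treated as repeating, as if the list ended with ‘...’.

```python
def latency(a, b, pairs):
    """Latency between distinct tasks a and b.

    pairs is a list of (task_factor, latency_delta) tuples; the final pair
    is reused for every further level, as if the list ended with '...'.
    """
    if pairs[-1][0] < 2:
        # Limitation of this sketch: a repeated factor of 1 would never
        # enlarge the set, so the loop below could not terminate.
        raise ValueError("final task factor must be at least 2")
    set_size, total, level = 1, 0, 0
    while True:
        factor, delta = pairs[min(level, len(pairs) - 1)]
        set_size *= factor   # t1 * t2 * ... * tk
        total += delta       # l1 + l2 + ... + lk
        if a // set_size == b // set_size:
            return total     # smallest set that contains both tasks
        level += 1


smp_cluster = [(4, 1), (2, 1), (12, 3), (12, 6)]
for peer in (3, 7, 95, 1151, 13823):
    print(peer, latency(0, peer, smp_cluster))
# 3 1
# 7 2
# 95 5
# 1151 11
# 13823 17
```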

Scott Pakin, pakin@lanl.gov