Next: Mesh/torus functions, Previous: n-ary tree functions, Up: Built-in functions [Contents][Index]
k-nomial trees are an efficient way to implement
collective-communication operations in software. Unlike in an
n-ary tree, the number of children in a
k-nomial tree decreases with increasing task depth
(i.e., no task has more children than the root). The advantage is
that the tasks that start communicating earlier perform more work,
which reduces the total latency of the collective operation. In
contrast, in an n-ary tree, the tasks that start
communicating earlier finish earlier, at the expense of increased
total latency. coNCePTuaL supports k-nomial trees via
the KNOMIAL_PARENT
,
KNOMIAL_CHILDREN
,
and KNOMIAL_CHILD
functions, as described below.
KNOMIAL_PARENT
takes a task number, the tree
fan-out factor (the “k” in “k-ary”), and the
number of tasks in the tree. It returns the task ID of the given
task’s parent. fan_out defaults to ‘2’ and
the number of tasks defaults to num_tasks
(see Predeclared
variables).
KNOMIAL_CHILDREN
takes the same arguments as
KNOMIAL_PARENT
but
returns the number of immediate descendents the given task has.
KNOMIAL_CHILD
takes a task number, a child number
(0 <= i < ‘KNOMIAL_CHILDREN(…)’), the
tree fan-out factor, and the number of tasks in the tree. It
returns the task number corresponding to the given task’s
ith child. As in KNOMIAL_PARENT
and
KNOMIAL_CHILDREN
,
fan_out defaults to ‘2’ and the number of
tasks defaults to num_tasks
(see Predeclared
variables).
The following figure shows how coNCePTuaL numbers tasks in a k-nomial tree with k=2 (a.k.a. a 2-nomial or binomial tree).
The figure is structured with time flowing downwards. That is, for a multicast operation expressed over a 2-nomial tree, task 0 sends a message to task 1 in the first time step. Then, task 0 sends to task 2 while task 1 sends to task 3. In the final step, task 0 sends to task 4, task 1 sends to task 5, task 2 sends to task 6, and task 3 sends to task 7—all concurrently. The following expressions also hold, assuming there are a total of eight tasks in the computation:
k-nomial trees for k>2 are much less common in practice than 2-nomial trees. However, they may perform well when a task has sufficient bandwidth to support multiple, simultaneous, outgoing messages. For example, a trinomial tree (i.e., a k-nomial tree with k=3) should exhibit good performance if there is enough bandwidth to send two messages simultaneously. The following illustration shows how coNCePTuaL constructs a 27-task trinomial tree:
As before, time flows downward (assuming a multicast operation)
and tasks are expected to communicate with their children in order.
The following are some coNCePTuaL k-nomial tree
expressions and their evaluations, assuming
num_tasks
is ‘27’:
Next: Mesh/torus functions, Previous: n-ary tree functions, Up: Built-in functions [Contents][Index]