It can be instructive to compare the performance of a native multicast
operation with that of a multicast implemented over a k-nomial tree, as
a way to gauge how well the underlying communication layer implements
multicasts. The following code measures the tree-based multicast while
varying the tree arity (i.e., k), the number of tasks receiving the
multicast, and the message size, and logs the incoming and outgoing
bandwidth for each combination. It also demonstrates how to use the
KNOMIAL_CHILDREN and KNOMIAL_CHILD functions.
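To make the tree shape concrete, here is a minimal Python sketch of one common k-nomial tree layout, in which each task's parent is obtained by clearing the most significant nonzero base-k digit of its rank. This is an illustration only: the function name `knomial_children_sketch` is hypothetical, it returns a list of child ranks rather than a count, and the numbering is an assumption that may differ from what KNOMIAL_CHILDREN and KNOMIAL_CHILD actually produce.

```python
def knomial_children_sketch(parent, arity, num_tasks):
    """List the children of `parent` in one common k-nomial tree layout.

    Hypothetical helper for illustration; not the coNCePTuaL implementation.
    """
    if arity < 2:
        raise ValueError("arity must be at least 2")
    children = []
    stride = 1
    # Advance to the smallest power of `arity` that exceeds `parent`, so
    # that children only ever set base-k digits above the parent's own.
    while stride <= parent:
        stride *= arity
    # At each higher digit position, the parent has up to arity-1 children.
    while parent + stride < num_tasks:
        for digit in range(1, arity):
            child = parent + digit * stride
            if child < num_tasks:
                children.append(child)
        stride *= arity
    return children


if __name__ == "__main__":
    # Print the 2-nomial (binomial) tree over 8 tasks.
    for rank in range(8):
        print(rank, knomial_children_sketch(rank, 2, 8))
```

In this layout the root of an 8-task binomial tree has three children, and raising the arity makes the tree shallower and wider, which is the trade-off the benchmark explores by sweeping the arity.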
# Test the performance of multicasting over various k-nomial trees
# By Scott Pakin <pakin@lanl.gov>

Require language version "1.5".

# Parse the command line.
minsize is "Min. message size (bytes)" and comes from "--minbytes" or
  "-n" with default 1.
maxsize is "Max. message size (bytes)" and comes from "--maxbytes" or
  "-x" with default 1M.
reps is "Repetitions to perform" and comes from "--reps" or "-r" with
  default 100.
maxarity is "Max. arity of the tree" and comes from "--maxarity" or
  "-a" with default 2.

Assert that "this program requires at least two processors" with
  num_tasks>=2.

# Send messages from task 0 to 1, 2, 3, ... other tasks in a k-nomial tree.
For each arity in {2, ..., maxarity} {
  for each num_targets in {1, ..., num_tasks-1} {
    for each msgsize in {minsize, minsize*2, minsize*4, ..., maxsize} {
      task 0 outputs "Multicasting a " and msgsize and "-byte message to "
                     and num_targets and " target(s) over a " and
                     arity and "-nomial tree ..." then
      for reps repetitions {
        task 0 resets its counters then
        for each src in {0, ..., num_tasks}
          for each dstnum in {0, ..., knomial_children(src, arity, num_targets+1)}
            task src sends a msgsize byte message to
              task knomial_child(src, dstnum, arity) then
        all tasks synchronize then
        task 0 logs the arity as "k-nomial arity" and
                    the num_targets as "# of recipients" and
                    the msgsize as "Message size (bytes)" and
                    the median of (1E6/1M)*(msgsize/elapsed_usecs)
                      as "Incoming bandwidth (MB/s)" and
                    the median of (num_targets*msgsize/elapsed_usecs)*(1E6/1M)
                      as "Outgoing bandwidth (MB/s)"
      } then
      task 0 computes aggregates
    }
  }
}
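The two logged bandwidth expressions can be checked by hand. The following Python sketch reproduces the arithmetic under the assumption that the `1M` constant in the listing denotes 2^20 (1,048,576), so the factor `1E6/1M` converts bytes per microsecond into units of 2^20 bytes per second. The function names are hypothetical and are not part of coNCePTuaL.

```python
MEBIBYTE = 1 << 20  # assumption: the listing's "1M" means 2**20


def incoming_bandwidth_mb_s(msgsize, elapsed_usecs):
    """Per-recipient rate: (1E6/1M) * (msgsize/elapsed_usecs)."""
    return (1e6 / MEBIBYTE) * (msgsize / elapsed_usecs)


def outgoing_bandwidth_mb_s(num_targets, msgsize, elapsed_usecs):
    """Aggregate rate: (num_targets*msgsize/elapsed_usecs) * (1E6/1M)."""
    return (num_targets * msgsize / elapsed_usecs) * (1e6 / MEBIBYTE)


if __name__ == "__main__":
    # A 2**20-byte message delivered to 3 targets in one second.
    print(incoming_bandwidth_mb_s(1 << 20, 1_000_000))
    print(outgoing_bandwidth_mb_s(3, 1 << 20, 1_000_000))
```

The outgoing figure is simply the incoming figure scaled by the number of recipients, since every recipient receives the full message within the same elapsed time.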