Scott Pakin's open-source software

When I write software that I think might be of interest to the public at large (and worth the paperwork hassle for me to release it), I'll put it on this page.

edif2qmasm—Run Verilog or VHDL programs on a D-Wave quantum annealer

Technically, edif2qmasm converts from the EDIF netlist format, which can be output by various synthesis tools, to the QMASM quantum macro assembly language. (See below.) The benefit of edif2qmasm is that it enables programs to be expressed in a (more-or-less) conventional programming language yet run on exotic quantum-annealing hardware. Programs can be run either forward or backward, which sometimes enables simple programs to be used to solve challenging problems.

See the edif2qmasm GitHub site for more information and for the edif2qmasm source code.

QMASM—Quantum macro assembler

QMASM fills a gap in the software ecosystem for D-Wave's quantum annealers by shielding the programmer from having to know system-specific hardware details while still enabling programs to be expressed at a fairly low level of abstraction. It is therefore analogous to a conventional macro assembler and can be used in much the same way: as a target either for programmers who want a great deal of control over the hardware or for compilers that implement higher-level languages.

See the QMASM GitHub site for more information and for the QMASM source code.

MPI-Bash—Parallel shell scripts

MPI-Bash is a modification to GNU Bash that augments Bash's set of built-in commands with various MPI functions for communication across multiple Bash processes running on a cluster or supercomputer. MPI-Bash can also be compiled with support for the Libcircle distributed-queue API, which facilitates load-balanced work distribution across Bash processes. The goal of MPI-Bash is to facilitate parallel processing of large numbers of files, such as may be produced by a parallel application.

See the MPI-Bash GitHub site for more information and for the MPI-Bash source code.

Byfl—Hardware-independent application performance analysis

Byfl provides a form of software performance counters to measure a variety of application performance metrics. With Byfl, code is instrumented at compile time, but data is gathered at execution time. This enables Byfl to fill a niche that lies between hardware-centric approaches such as those taken by PAPI and Pin and source-centric approaches such as those taken by ROSE. Byfl can report the types of instructions performed, memory accessed, and various other code characteristics.

See the Byfl GitHub site for more information and for the Byfl source code.

The Cell Messaging Layer—MPI for the Cell processor's synergistic processing elements

The Cell Messaging Layer is a communication library for clusters of Cell Broadband Engine processors. It implements a subset of the popular MPI message-passing API on the Cell's SPE processors, with one MPI rank per SPE across any number of Cells and the ability for any SPE to communicate directly with any other SPE, even across a network. The Cell Messaging Layer thereby makes programming clusters of Cell processors similar to programming clusters of conventional CPUs. In addition, the Cell Messaging Layer is extremely fast, providing true message-passing semantics at nearly the speed of the underlying data-transfer hardware.

See the Cell Messaging Layer Web page for more information and for the Cell Messaging Layer source code.

JumboMem—Enable programs to page to remote RAM instead of to local disk

JumboMem provides a low-effort solution to the problem of running memory-hungry programs on memory-starved computers. The JumboMem middleware gives programs access to all of the memory in an entire cluster, providing the illusion that all of the memory resides within a single computer. When a program exceeds the memory in one computer, it automatically spills over into the memory of the next computer. Behind the scenes, JumboMem handles all of the network communication required to make this work; the user's program does not need to be modified—not even recompiled—to take advantage of JumboMem. Furthermore, JumboMem does not need administrator privileges to install. Any ordinary user with an account on a workstation cluster has sufficient privileges to install and run JumboMem.

See the JumboMem Web page for more information and for the JumboMem source code.

coNCePTuaL—A network correctness and performance testing language

A frequently reinvented wheel among network researchers is a suite of programs that test a network's performance. A problem with having umpteen versions of performance tests is that results end up being reported in inconsistent ways. coNCePTuaL is a domain-specific programming language (with associated compiler and other tools) designed to facilitate writing communication benchmarks and reporting the results in as scientific a manner as possible. The language is English-like and easy to read, even by a non-expert in the area of high-performance communication.

See the coNCePTuaL Web page for more information and for the coNCePTuaL source code.

whatelse—Report what else is running on a computer

Sometimes a computer seems slow for no particular reason. whatelse helps determine the source of the problem. The program waits quietly for a length of time, then reports what happened on the computer while it was waiting. whatelse reports, among other things, process state changes, network activity, memory behavior, and hardware interrupts that occurred.

whatelse is available in the following format:

whatelse-1.5.tar.gz (11KB tar file)

Note that whatelse is very specific to Linux; it gets virtually all of its information from the Linux /proc filesystem.
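To give a feel for the snapshot-wait-compare approach described above, here is a minimal Python sketch. It is not whatelse's actual code and says nothing about how whatelse is implemented. It assumes the standard Linux /proc/stat layout, looks only at three counters (total interrupts, context switches, and processes created), and uses hypothetical helper names (parse_proc_stat, diff_counters, read_snapshot) chosen purely for illustration.

```python
import time

# Single-valued counters in /proc/stat that this sketch tracks (an
# illustrative choice; whatelse itself reports far more than this).
FIELDS = ("intr", "ctxt", "processes")


def parse_proc_stat(text):
    """Extract the tracked counters from the text of /proc/stat."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # The first number after the keyword is the running total.
        if len(parts) >= 2 and parts[0] in FIELDS:
            counters[parts[0]] = int(parts[1])
    return counters


def diff_counters(before, after):
    """Return only the counters that changed, mapped to their increase."""
    return {
        name: after[name] - before[name]
        for name in before
        if name in after and after[name] != before[name]
    }


def read_snapshot(path="/proc/stat"):
    """Read and parse one snapshot (Linux only)."""
    with open(path) as f:
        return parse_proc_stat(f.read())


if __name__ == "__main__":
    before = read_snapshot()
    time.sleep(5)  # wait quietly, then report what changed meanwhile
    after = read_snapshot()
    for name, delta in sorted(diff_counters(before, after).items()):
        print(f"{name}: +{delta}")
```

Keeping the parsing separate from the file access means the comparison logic can be exercised on canned text, without needing a Linux machine.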

oddmanout—Run a program on a bunch of nodes and report which nodes gave different output

Homogeneity is a desirable attribute for workstation clusters used for high-performance computing. However, it can be difficult to ensure that all nodes in the cluster are exactly the same. Nodes may have hung processes, filesystems that failed to mount, modules that failed to load, etc. On very large clusters, there may even be nodes with different CPU speeds or amounts of memory. oddmanout helps find nodes that are different from the rest. The idea is to run a command (or set of commands) on every node of a cluster, find the most common output across all nodes, and report those nodes whose output is different from that.
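The majority-vote idea can be sketched in a few lines of Python. This is only an illustration of the idea as described above, not oddmanout's actual implementation. It assumes each node's output has already been collected into a dictionary keyed by node name; the function name find_odd_nodes and the sample node names are hypothetical.

```python
from collections import Counter


def find_odd_nodes(outputs):
    """Given {node_name: command_output}, return a pair
    (most_common_output, sorted list of nodes whose output differs)."""
    if not outputs:
        return None, []
    # Count how many nodes produced each distinct output.
    counts = Counter(outputs.values())
    majority, _ = counts.most_common(1)[0]
    # Any node that disagrees with the majority is an "odd man out".
    odd = sorted(node for node, out in outputs.items() if out != majority)
    return majority, odd


if __name__ == "__main__":
    # Made-up sample data: the amount of memory each node reports.
    sample = {
        "node01": "MemTotal: 16 GB",
        "node02": "MemTotal: 16 GB",
        "node03": "MemTotal: 8 GB",
        "node04": "MemTotal: 16 GB",
    }
    majority, odd = find_odd_nodes(sample)
    print("Most common output:", majority)
    print("Nodes that differ:", ", ".join(odd) or "(none)")
```

Comparing raw output strings is the simplest choice. Output that legitimately varies from node to node (hostnames, timestamps) would have to be filtered out before the comparison.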

oddmanout is available in the following formats:

oddmanout-1.1-2.noarch.rpm (38KB RPM file)
oddmanout-1.1-2.src.rpm (18KB source RPM file)
oddmanout-1.1.tar.gz (14KB tar file)

Scott Pakin, pakin@lanl.gov