
Oxbow Toolkit: User Guide

Oxbow Developers



HPC architectures will continue to change over the next decade in response to efforts to improve energy efficiency, reliability, and performance. At this time of significant disruption, it is critically important to understand the requirements of contemporary and future extreme-scale scientific applications, so that we can drive or adopt new architectural and software features that satisfy those requirements, e.g., integrated GPU and CPU, integrated random number generators, transactional memory, fine-grained power management, and MPI collective offload.

Hence, we believe that it is essential to quantitatively measure, project, and prioritize the resource and feature requirements of our anticipated workloads on such extreme-scale systems.

The Oxbow toolkit is a collection of tools to empirically characterize application behaviours along a critical set of dimensions, namely computation, communication, memory capacity, and access patterns.


For instructions on building and installing Oxbow, see the README included with the Oxbow tools source distribution. Following these instructions should result in an installation directory structure that includes a subdirectory for the oxbow tools and (optionally) a subdirectory for third party utilities. The directory names will be determined by the vendor ID of the compiler used during the build process.

For example, using GNU compilers and installing into the prefix /local/opt/oxbow will result in an installed directory structure something like the following:


The etc directory contains a script to set up your environment for building and running applications using the Oxbow tools. In section 3, most of the instructions begin by sourcing this script. For example, using our installation of Oxbow on the Keeneland test system, the user environment for using Oxbow tools with Intel compilers is set up by running:

$ source /nics/a/proj/oxbow/oxbow-tool-intel/etc/

The other subdirectories of oxbow-tool-vendor follow standard conventions.

Using Oxbow Tools

There are currently five tools available in the Oxbow toolkit.

Using Convenience Scripts

The sections below contain specific instructions for building and running each tool. Rather than run the tools directly, you may use convenience scripts. These are installed under:


All of the scripts are named for the tool they invoke, and have similar usage:

$ export PATH=$PATH:/path/to/oxbow/oxbow-tool-vendor/bin/util

These are meant to be common case run scenarios, so the FLAGS here are not specific flags to the tool being run. There is only one flag value of interest, and it applies only to miami-imix, pin-imix, and reused. For these tools, -unmarked is used to tell the script that the binary being run does not contain caliper functions.

These tools all produce output files. Sometimes a large number of files is generated, for example with multithreaded applications or MPI jobs with many ranks. The OUTDIR argument to the convenience script tells the tool where to place all output. This directory will also contain a log of the exact commands issued by the script when invoking the tool, as well as any output and error messages. The log will be located in OUTDIR/do-tool-name.log.

If the application is launched with an mpirun (or aprun or mpiexec) type command, enclose this command and any arguments between two sets of dashes. Follow the MPIRUN command with the actual command and arguments to be executed.
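The argument layout described above can be illustrated with a small shell sketch. This is not the code of the Oxbow convenience scripts; the function name split_args and the variables it sets are hypothetical, and the sketch only shows how an optional -unmarked flag, OUTDIR, the launcher section, and the application command are delimited by the two sets of dashes.

```shell
# Hypothetical sketch, not the actual Oxbow script:
# split "[-unmarked] OUTDIR -- LAUNCHER... -- COMMAND..." into its parts.
# Words are joined with spaces, so arguments containing spaces are not preserved.
split_args() {
  unmarked=no
  if [ "${1:-}" = "-unmarked" ]; then unmarked=yes; shift; fi
  outdir="${1:-}"; shift
  if [ "${1:-}" = "--" ]; then shift; fi        # first set of dashes
  launcher=""
  while [ "$#" -gt 0 ] && [ "$1" != "--" ]; do
    launcher="${launcher:+$launcher }$1"; shift # launcher words (may be none)
  done
  if [ "${1:-}" = "--" ]; then shift; fi        # second set of dashes
  command_line="$*"                             # application and its arguments
}

split_args -unmarked myoutdir -- mpirun -n 4 -- ./myprog arg1 arg2
echo "unmarked=$unmarked outdir=$outdir"
echo "launcher: $launcher"
echo "command:  $command_line"
```

For a serial run, nothing is placed between the two sets of dashes, so the launcher part in this sketch ends up empty.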

Example invocations of convenience scripts

Run an unmodified MPI binary using the pin-imix tool. Put output in myoutdir.

$ -unmarked myoutdir -- mpirun -n 4 -- ./myprog arg1 arg2

Run an unmodified serial binary using the reused tool. Put the output in $HOME/work.

$ -unmarked $HOME/work -- -- echo "hello"

Run a modified serial binary that has caliper functions added for pin-imix. Put the output in $HOME/work.

$ $HOME/work -- -- ./myprog-modified arg1 arg2

Run a binary that has been relinked for mpiP. Put the output in myoutdir.

$ myoutdir -- aprun -B -- ./mympi-modified arg1 arg2

mpiP: MPI Communication profiling

mpiP is a lightweight profiling library for MPI applications. Because it only collects statistical information about MPI functions, mpiP generates considerably less overhead and much less data than tracing tools. All the information captured by mpiP is task-local. It only uses communication during report generation, typically at the end of the experiment, to merge results from all of the tasks into one output file.

For extensive information about configuring and using mpiP, see the mpiP user guide. It can be accessed online at:

A copy of the user guide is also installed in Oxbow under:


To use mpiP to characterize your application's communication patterns, you will need to relink your application against the mpiP libraries and its third party library dependencies. Add the following link flags:

-L${OXBOW_TOOLS_DIR}/lib -lmpiP 
-L${LIBUNWIND_DIR}/lib -lunwind 
-L${BINUTILS_DIR}/lib -lbfd 
-L${BINUTILS_DIR}/lib64 -liberty

The locations of the required libraries can be added to your environment by sourcing the environment setup script installed in oxbow. For example:

$ source /path/to/oxbow/oxbow-tool-vendor/etc/
$ mpicc -g obj1.o obj2.o -o myprog-mpip -L${OXBOW_TOOLS_DIR}/lib -lmpiP \
-L${LIBUNWIND_DIR}/lib -lunwind -L${BINUTILS_DIR}/lib -lbfd -L${BINUTILS_DIR}/lib64 -liberty

Once the application is relinked, launch as normal. The application will output the results of profiling the MPI communication. To configure what output is produced, set the MPIP environment variable. The variable stores flags with similar syntax to command line flags. See the user guide for information on specific flags.
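As a minimal sketch of setting the variable, the following uses two flags described in the mpiP user guide: -t (report threshold for callsites) and -k (callsite stack trace depth). The values 10.0 and 2 are illustrative only; check the user guide for the flags supported by your mpiP version.

```shell
# Illustrative values; the flags are taken from the mpiP user guide.
export MPIP="-t 10.0 -k 2"
echo "MPIP=${MPIP}"

# Then launch the relinked application as usual, for example:
#   mpirun -n 64 ./myprog-mpip arg1 arg2
```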

If you are using the provided convenience script, MPIP will be set in the script unless you set it yourself before running the script. The setting for MPIP in the convenience script will output results for collective communication and point-to-point communication, as well as a collective communication matrix.

$ myoutdir -- mpirun -n 64 -- ./myprog-mpip arg1 arg2

miami-imix: Micro-operation Instruction Mix PIN Tool

The Miami imix tool profiles an application run using the Intel PIN profiling infrastructure. Each instruction is broken down into micro-operations: individual reads, writes, integer, float, and SIMD operations. The tool prints the counts of each micro-operation type to a comma-separated-value file. The rows of the csv indicate which binary module the instructions came from.

To obtain the instruction mix, you can either:

  1. Use the provided convenience script
  2. OR Perform a two step process using the miamicfg tool and the miami-imix tool

Option 1: Use the script

For code that does not have caliper functions around a section of interest:

$ -unmarked myoutdir -- mpirun -n 4 -- ./myprog arg1 arg2

For code that has had caliper functions added:

$ myoutdir -- mpirun -n 4 -- ./myprog arg1 arg2

Option 2: Run both steps of miami-imix manually

Obtaining the instruction mix is a two step process.

  1. Use the miamicfg tool to collect a control flow graph of the application run
  2. Use the miami-imix tool to process the control flow graphs to obtain instruction mix information

Step 1: Control Flow Graph Info

First, you need to obtain the control flow graph information by profiling the application in the following manner.

${OXBOW_TOOLS_DIR}/bin/miamicfg [options] -- <your_application> <your_arguments>

The double dash "--" is important as it separates the instruction mix tool's options from the target application and its parameters.

If this is an MPI application, then place the instruction mix tool in the position where you place your executable name.

mpirun -np 16 ${OXBOW_TOOLS_DIR}/bin/miamicfg [options] -- <your_application> <your_arguments>

No additional options are required for the wrapper. This step creates a .cfg file per process. By default, the output files are named: ExecName-MpiRank-ProcessPid.cfg
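With many ranks it can help to recover the rank and PID from a file name. The following is a minimal sketch that assumes the default ExecName-MpiRank-ProcessPid.cfg naming described above; the helper names cfg_rank and cfg_pid are hypothetical and not part of Oxbow. The name is split from the right so that executable names containing dashes are still handled.

```shell
# Hypothetical helpers, assuming the default "ExecName-MpiRank-ProcessPid.cfg" naming.
cfg_pid() {
  base=${1##*/}        # drop any directory part
  base=${base%.cfg}    # drop the extension
  echo "${base##*-}"   # text after the last dash
}

cfg_rank() {
  base=${1##*/}
  base=${base%.cfg}
  base=${base%-*}      # drop "-ProcessPid"
  echo "${base##*-}"   # text after the remaining last dash
}

cfg_rank ./my-prog-3-12345.cfg
cfg_pid  ./my-prog-3-12345.cfg
```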

Optionally, you can resume and pause data collection dynamically. For this, you must modify the application's source code to insert calls to two user-defined empty functions, one for starting and one for stopping data collection. You can choose any name for these two caliper functions.

Once you have identified suitable functions, pass the following parameters to the instruction mix tool.

-q -start <name_of_start_function> -stop <name_of_stop_function>

Step 2: Instruction Mix

Once you have the control flow graph information from step 1, you should specify one resulting .cfg file to the miami-imix static tool as follows:

$OXBOW_TOOLS_DIR/bin/miami-imix -c <one_cfg_file>

This command outputs two files:

Note: The '.cfg' files contain information mapped to binary addresses. For this reason, they are valid only with the original executable that you used to collect those files. The CFG file contains paths to the executable and all the shared libraries used during the profiling step.

The second step uses those paths to locate the binaries and decode the instructions. You should not delete or move your binaries before running the second step.

pin-imix: Opcode Instruction Mix PIN Tool

The pin-imix tool outputs counts of instructions categorized by opcode. This tool can be run with or without modification to your program.

Running with unmodified binaries

Convenience script use for unmarked code:

$ -unmarked myoutdir -- mpirun -n 64 -- ./myprog arg1 arg2

To run directly (no convenience script) on unmarked code:

$ source /path/to/oxbow/oxbow-tool-vendor/etc/
$ mpirun -n 64 ${PIN_DIR}/intel64/bin/pinbin -follow_execv -t \
  ${OXBOW_TOOLS_DIR}/bin/ -category -i -- myprog arg1 arg2

Adding caliper functions

Caliper functions for the various Oxbow tools are provided in an interface library in the oxbow installation.

To use this caliper function library, modify your C/C++ source code with the following:

#include <oxbow.h>
// unprofiled code section
oxbow_pin_imix_zero();   //reset statistics
oxbow_pin_imix_start();  //start profiling
// profiled code section
oxbow_pin_imix_stop();   //stop profiling
// unprofiled code section

When compiling, add the following include flags to your object compilation:

-I${OXBOW_TOOLS_DIR}/include


Add the following library flags during the link step:

-L${OXBOW_TOOLS_DIR}/lib -loxbow

The OXBOW_TOOLS_DIR variable is set using the script. So, an example compilation after adding caliper functions would be:

$ source /path/to/oxbow/oxbow-tool-vendor/etc/
$ cc -I${OXBOW_TOOLS_DIR}/include -c myprog.c
$ cc -o myprog myprog.o -L${OXBOW_TOOLS_DIR}/lib -loxbow

Running with caliper functions

Convenience script use for marked code:

$ myoutdir -- mpirun -n 64 -- ./myprog arg1 arg2

To run directly (no convenience script) on marked code:

$ source /path/to/oxbow/oxbow-tool-vendor/etc/
$ mpirun -n 64 ${PIN_DIR}/intel64/bin/pinbin -follow_execv -t ${OXBOW_TOOLS_DIR}/bin/ \
         -start_address oxbow_pin_imix_marker_start:repeat \
         -stop_address oxbow_pin_imix_marker_stop:repeat \
         -zero_stats_address oxbow_pin_imix_zero_stats:repeat \
         -emit_stats_address oxbow_pin_imix_emit_stats:repeat \
         -category -i -- myprog arg1 arg2

membw: Memory Bandwidth Measurement

To use the memory bandwidth tool, you need PAPI installed in your environment. The environment variable PAPI_DIR must be set to a suitable value for your environment.

Please include the header file mem_bandwidth_calipers.h in your application.

Then you can use the following three calipers to mark the desired section in your application.

You can call the start and stop calipers multiple times. The results are appended to the output file.

When building your application, you should

Please ensure that you have the suitable PAPI include and link flags for your platform.

The calipers will create one output file per process. The output files are named:


reused: Reuse Distance PIN Tool

You can obtain the reuse distance metrics by running:

$ ${PIN_DIR}/intel64/bin/pinbin -follow_execv -t ${OXBOW_TOOLS_DIR}/bin/reuse_dist_cal \
  -- mpiexec -n nprocs your_app args

The output files are named hist_reuse_dist_cal__xxx.txt, where xxx is the PID of the process.

The tool will collect reuse distance for the whole application by default.

Please include the header file reused_api.h in your application.

Similar to the Memory bandwidth tool, we can insert a few function calls in the application source to mark a portion of the application we are interested in profiling.

When building your application, you should


This research is sponsored by the Office of Advanced Scientific Computing Research in the U.S. Department of Energy. The paper has been authored by Oak Ridge National Laboratory, which is managed by UT-Battelle, LLC under Contract #DE-AC05-00OR22725 to the U.S. Government. Accordingly, the U.S. Government retains a non-exclusive, royalty-free license to publish or reproduce the published form of this contribution, or allow others to do so, for U.S. Government purposes.
