-------------------------------------------------------------------------------
PREREQUISITES

* OpenMMFreeEnergy branch

Get the latest version of the OpenMMFreeEnergy branch of the OpenMM source tree from SimTK:

svn checkout https://simtk.org/svn/openmm/branches/OpenMMFreeEnergy
cd OpenMMFreeEnergy
cmake .
# edit CMakeCache.txt to modify things...
cmake .
make install

* NetCDF 4 with netcdf-4 format support (to allow file sizes > 2 GB)

NetCDF 4 requires HDF5 and its prerequisites (zlib and szip) to be installed first.  You can either build them from source:

wget http://www.hdfgroup.org/ftp/lib-external/zlib/1.2/src/zlib-1.2.3.tar.gz
wget http://www.hdfgroup.org/ftp/lib-external/szip/2.1/src/szip-2.1.tar.gz
wget http://www.hdfgroup.org/ftp/HDF5/current/src/hdf5-1.8.3.tar.gz

or you can use MacPorts with 'sudo port install hdf5-18'.  (Note that the 'hdf5' port is too old to work.)

NetCDF must then be built from source -- the 'netcdf' MacPorts version does not support the new netcdf-4 file format and C++ interface:

wget http://www.unidata.ucar.edu/downloads/netcdf/ftp/netcdf-4.0.1.tar.gz
tar zxf netcdf-4.0.1.tar.gz
cd netcdf-4.0.1
./configure --enable-netcdf-4 --enable-cxx-4 --disable-f77 --disable-f90 --with-hdf5=/opt/local --prefix=/Users/yank/local
make -j8 install

Here NetCDF is installed to ${HOME}/local/ (/Users/yank/local in the example above).

If you install to a different location, update the pathnames in INCLUDE_DIRECTORIES and LINK_DIRECTORIES in CMakeLists.txt to point to your install directory.
-------------------------------------------------------------------------------
BUILD INSTRUCTIONS

Make sure OPENMM_INSTALL_DIR is set to the OpenMM installation directory (default /usr/local/openmm)
and OPENMM_PLUGIN_DIR is set to the corresponding plugin directory (default ${OPENMM_INSTALL_DIR}/lib/plugins).

OPENMM_SOURCE_DIR must be set to your OpenMMFreeEnergy source directory path.  This is only needed for the forcecheck.C sanity check.

Edit CMakeLists.txt to change directories as needed.

# build makefiles
cmake .
# build yank
make

-------------------------------------------------------------------------------
USAGE

# run yank on input files in specified directory
./yank directory

As an example, consider running yank on the 1-methylpyrrole data:

./yank ../example-systems/T4-lysozyme-L99A/amber-gbsa/amber-gbsa/1-methylpyrrole/

The prmtop files must be generated by the LEaP tool from the AmberTools distribution.
The complex must contain the receptor and then the ligand, in that order.
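Because the receptor comes first in the complex, the ligand atoms occupy the trailing block of atom indices. The following is a minimal sketch of that bookkeeping, not code taken from yank; the names AtomRange and ligandRangeInComplex are hypothetical and used only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Half-open range [begin, end) of atom indices (hypothetical helper type).
struct AtomRange {
    std::size_t begin;
    std::size_t end;
};

// Returns the indices of the ligand atoms inside the complex, assuming the
// complex lists every receptor atom first and every ligand atom after that.
AtomRange ligandRangeInComplex(std::size_t nReceptorAtoms,
                               std::size_t nLigandAtoms,
                               std::size_t nComplexAtoms) {
    // The atom counts must be consistent with receptor-then-ligand ordering.
    assert(nComplexAtoms == nReceptorAtoms + nLigandAtoms);
    return {nReceptorAtoms, nReceptorAtoms + nLigandAtoms};
}
```

For example, with a 100-atom receptor and a 12-atom ligand, the ligand occupies complex indices 100 through 111. If the prmtop for the complex were built with the ligand first, this index arithmetic would point at the wrong atoms.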

The receptor.crd file is an AMBER-format coordinate set containing coordinates for just the receptor.

The ligand coordinates can be specified either as an AMBER-format coordinate set (amber.crd) or as one or more conformations in a mol2 file (ligand.mol2).
Providing multiple docked ligand conformations (up to ~ 30) will provide a built-in convergence check.

Eventually, we can support multiple receptor configurations as well.
-------------------------------------------------------------------------------
WHAT IT DOES

Three simulations are conducted for each run: ligand in vacuum (very fast), ligand in GBSA (fast), and ligand in complex (slow).
When each simulation starts, a NetCDF file is created that contains configurations, velocities, and energies.  If execution is halted (provided the halt does not occur while the file is being written), the simulation will resume from the last iteration recorded in this NetCDF file.
While the simulations are running, a separate Python script can be run to examine these files, extract the energies, and produce an estimate of the current hydration and binding free energies, and their statistical errors.
If necessary, the simulation runs can be extended by increasing the number of iterations, though this currently requires recompiling yank.cpp.
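As an illustration of how the three simulations could combine into hydration and binding free energy estimates, here is a minimal sketch. It is not the analysis script itself. It assumes that each simulation yields the free energy of switching the ligand interactions off in that phase, that the three estimates are statistically independent (so their errors add in quadrature), and it leaves out any correction for receptor-ligand restraints. The names Estimate, difference, hydrationFreeEnergy, and bindingFreeEnergy are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// A free energy estimate with its statistical error (hypothetical type).
struct Estimate {
    double value;
    double error;
};

// Difference a - b of two independent estimates; errors add in quadrature.
Estimate difference(const Estimate& a, const Estimate& b) {
    assert(a.error >= 0.0 && b.error >= 0.0);
    return {a.value - b.value,
            std::sqrt(a.error * a.error + b.error * b.error)};
}

// Assumed convention: each argument is the free energy of turning the ligand
// interactions OFF in that phase (vacuum, solvent, or complex).
Estimate hydrationFreeEnergy(const Estimate& vacuum, const Estimate& solvent) {
    return difference(vacuum, solvent);
}

// Restraint and standard-state corrections are deliberately omitted here.
Estimate bindingFreeEnergy(const Estimate& solvent, const Estimate& inComplex) {
    return difference(solvent, inComplex);
}
```

If the simulations instead report the free energy of switching the interactions on, both differences change sign.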

-------------------------------------------------------------------------------
HOW IT WORKS

An OpenMM::System object is created for each "alchemical intermediate" state.
For each phase of the simulation (ligand in vacuum, ligand in solvent, ligand in complex), data collected for a series of alchemical intermediates provides an estimate of one leg of the thermodynamic cycle for estimating the hydration or binding free energies.
In vacuum or solvent, the alchemical intermediates are constructed such that first the ligand charges and GBSA contributions are scaled, followed by the Lennard-Jones parameters.
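The two-stage ordering described above can be sketched as a list of scaling factors, one entry per alchemical intermediate. This is an illustrative sketch only: the linear spacing, the scaling all the way to zero, the step counts, and the names AlchemicalState and makeSchedule are assumptions, not taken from AlchemicalFactor.C.

```cpp
#include <cassert>
#include <vector>

// Scaling factors defining one alchemical intermediate (hypothetical type).
struct AlchemicalState {
    double chargeScale;        // scales ligand charges and GBSA contributions
    double lennardJonesScale;  // scales ligand Lennard-Jones parameters
};

// Builds a schedule that first scales charges/GBSA from 1 to 0 while the
// Lennard-Jones terms stay fully on, then scales Lennard-Jones from 1 to 0.
// Linear spacing is an assumption made for illustration.
std::vector<AlchemicalState> makeSchedule(int nChargeSteps,
                                          int nLennardJonesSteps) {
    assert(nChargeSteps >= 1 && nLennardJonesSteps >= 1);
    std::vector<AlchemicalState> states;
    states.push_back({1.0, 1.0});  // fully interacting ligand
    for (int i = 1; i <= nChargeSteps; ++i) {
        states.push_back({1.0 - static_cast<double>(i) / nChargeSteps, 1.0});
    }
    for (int i = 1; i <= nLennardJonesSteps; ++i) {
        states.push_back(
            {0.0, 1.0 - static_cast<double>(i) / nLennardJonesSteps});
    }
    return states;
}
```

Each entry would correspond to one OpenMM::System object. Calling makeSchedule(2, 2) gives five states, from the fully interacting ligand (1, 1) through (0, 1) to the fully scaled ligand (0, 0).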

-------------------------------------------------------------------------------
SOURCE FILES

yank.cpp - main executable for running free energy calculations
fep.C - class to handle FEP calculation with OpenMM
AlchemicalFactor.C - "factory" function to generate OpenMM System objects with alchemical modifications
restraints.C - crude hacks to build receptor-ligand restraints
forcecheck.C - test routine to compare GPU forces with CPU forces and refuse to run if there is a discrepancy
amber.C - functions to build OpenMM System object from AMBER prmtop file once read into memory by parm.C
rng.C - lightweight random number generator (open source?)
parm.C - AMBER prmtop reader (stolen from NAMD)
common.C - support routines for parm.C (stolen from NAMD)
strlib.C - support routines for parm.C (stolen from NAMD)
utils.h - defines coordinates_t and array2d

UNUSED / DEPRECATED

deepcopy.C - unused hack implementation of a "deep copy" for System objects
mol2io.C - unused file that may eventually read mol2 files
rngtest.C - example file making use of rng.C