YANK

A program for GPU-accelerated alchemical binding free energy calculations.

AUTHORS

John D. Chodera
Kim Branson
Imran Haque
Michael Shirts

Portions of this code copyright (c) 2009-2011 University of California, Berkeley, Vertex Pharmaceuticals, Stanford University, University of Virginia, and the Authors.

PREREQUISITES

Use of this module requires the following:

* Python 2.6 or later
  http://www.python.org
* OpenMM with Python wrappers
  http://simtk.org/home/openmm
* NetCDF (compiled with netcdf4 support) and HDF5 (on which NetCDF 4 depends)
  http://www.unidata.ucar.edu/software/netcdf/
  http://www.hdfgroup.org/HDF5/
* netcdf4-python (a Python interface for netcdf4)
  http://code.google.com/p/netcdf4-python/
* numpy and scipy
  http://www.scipy.org/
* mpi4py, if MPI support is desired
  http://mpi4py.scipy.org/
  (Note that the mpi4py installation must be compiled against the appropriate MPI implementation.)
* OpenEye toolkit and Python wrappers, if mol2 and PDB reading features are used (requires academic or commercial license)
  http://www.eyesopen.com

Note that the Enthought Python Distribution (EPD) provides many of these prerequisites (including Python, NetCDF 4, HDF5, netcdf4-python, numpy, and scipy):

  http://www.enthought.com/products/epd.php

(Note that using EPD with OpenEye requires some care, as OpenEye tools are very selective about which Python and library versions are compatible.)

For example, to use EPD 7.1-2 on OS X with OpenEye's latest toolkit, install OpenEye's toolkit and Python wrappers, then:

  # Change to OpenEye libs directory
  cd /path/to/openeye/python/openeye/libs
  # Create a symlink for your EPD platform to trick OpenEye into thinking it is supported.
  ln -s osx-10.7-g++4.2-x64-python2.7 osx-10.7-g++4.0-x64-python2.7

USAGE

  python yank.py --ligand_prmtop PRMTOP --receptor_prmtop PRMTOP
      { {--ligand_crd CRD | --ligand_mol2 MOL2} {--receptor_crd CRD | --receptor_pdb PDB} | {--complex_crd CRD | --complex_pdb PDB} }
      [-v | --verbose] [-i | --iterations ITERATIONS] [-o | --online] [-m | --mpi]
      [--restraints restraint-type]

EXAMPLES

Serial execution:

  # Specify AMBER prmtop/crd files for ligand and receptor.
  python yank.py --ligand_prmtop ligand.prmtop --receptor_prmtop receptor.prmtop --ligand_crd ligand.crd --receptor_crd receptor.crd --iterations 1000

  # Specify (potentially multi-conformer) mol2 file for ligand and (potentially multi-model) PDB file for receptor.
  python yank.py --ligand_prmtop ligand.prmtop --receptor_prmtop receptor.prmtop --ligand_mol2 ligand.mol2 --receptor_pdb receptor.pdb --iterations 1000

  # Specify (potentially multi-model) PDB file for complex, along with flat-bottom restraints (instead of harmonic).
  python yank.py --ligand_prmtop ligand.prmtop --receptor_prmtop receptor.prmtop --complex_pdb complex.pdb --iterations 1000 --restraints flat-bottom

MPI execution:

  See the example script mvapich2.pbs for an example using MVAPICH2.

NOTES

In prmtop/crd files, receptor atoms must come before ligand atoms.

Atom orderings must be the same in all files (AMBER prmtop/crd, PDB, mol2).

mol2 files must contain only copies of the same molecule in different geometries.

Only implicit solvent calculations are currently supported.

Use the testrun.sh script as an example of serial execution, and the mvapich2.pbs script as an example of MPI execution (can be run with batch or interactive queues).

TESTING

Three levels of testing frameworks are provided:

* DOCTESTS

  Doctests ensure that each of the individual functions that compose YANK runs on valid data without throwing exceptions.
  These are implemented in the __main__ part of each module in YANK (e.g.
  'repex.py'), and are regularly run to ensure that there is no invalid code in YANK.

* MODULE TESTS

  Module tests check that the code contained in the corresponding module (e.g. 'test_repex.py' for 'repex.py') generates the correct results for analytically tractable test cases.
  This code ensures the correctness of individual components of YANK.
  Though it is impossible to test every conceivable input combination, some care is taken to ensure overall correctness of recommended codepaths.

* INTEGRATION TESTS

  Integration tests ensure that YANK as a whole, run on certain test problems, produces reliable free energy differences for well-characterized systems.
  Integration tests are run from the provided 'integration_tests.py' script.

ROADMAP

Support for the following is planned:

* Online analysis and automatic convergence detection/termination [in progress]
* Explicit solvent support with NPT simulations [almost ready; only waiting on analytical dispersion correction additions]
* General Markov chain Monte Carlo (MCMC) move sets in between Hamiltonian exchanges [refactoring of repex.py in progress]
* Expanded ensemble simulations (as an alternative to Hamiltonian exchange)
* Support for relative free energy calculations
* Support for sampling over protein mutations
* Generative factories, to allow searching over combinatorially large chemical spaces (both for ligand substituents and protein mutations)
* Constant-pH and ligand tautomer sampling

LICENSE

All code in this repository is released under the GNU General Public License.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

ACKNOWLEDGMENTS

The authors are extremely grateful to the OpenMM development team for their help in the development of YANK, especially (but not limited to):

* Peter Eastman, Stanford University
* Mark Friedrichs
* Vijay Pande, Stanford University
* Randy Radmer
* Christopher Bruns

The developers are very grateful to the following contributors for suggesting patches, bugfixes, or changes that have improved YANK:

* Kai Wang, University of Virginia
* Christoph Klein, University of Virginia
* Levi Naden, University of Virginia

CITATIONS

Please cite the following papers:

* OpenMM

  [1] Friedrichs MS, Eastman P, Vaidyanathan V, Houston M, LeGrand S, Beberg AL, Ensign DL, Bruns CM, and Pande VS. Accelerating molecular dynamic simulations on graphics processing units. J. Comput. Chem. 30:864, 2009.
  DOI: 10.1002/jcc.21209

  [2] Eastman P and Pande VS. OpenMM: A hardware-independent framework for molecular simulations. Comput. Sci. Eng. 12:34, 2010.
  DOI: 10.1109/MCSE.2010.27

  [3] Eastman P and Pande VS. Efficient nonbonded interactions for molecular dynamics on a graphics processing unit. J. Comput. Chem. 31:1268, 2010.
  DOI: 10.1002/jcc.21413

  [4] Eastman P and Pande VS. Constant constraint matrix approximation: A robust, parallelizable constraint method for molecular simulations. J. Chem. Theor. Comput. 6:434, 2010.
  DOI: 10.1021/ct900463w

* Replica-exchange with Gibbs sampling

  [5] Chodera JD and Shirts MR. Replica exchange and expanded ensemble simulations as Gibbs sampling: Simple improvements for enhanced mixing. J. Chem. Phys., in press.
  arXiv: 1105.5749

* MBAR (if using automated analysis)

  [6] Shirts MR and Chodera JD. Statistically optimal analysis of samples from multiple equilibrium states. J. Chem. Phys. 129:124105, 2008.
  DOI: 10.1063/1.2978177

* YANK

  [7] Chodera JD, Shirts MR, Wang K, Eastman P, Friedrichs M, Pande VS, Branson K, Mobley DL. YANK: A GPU-accelerated platform for alchemical free energy calculations. In preparation.