Optimal use of data in parallel tempering simulationsfor the construction of kinetic models of peptide dynamics Jan-Hendrik Prinz1,1, * John D. Chodera1,2, # Vijay S. Pande,2, ## Jeremy C. Smith,3, $ and Frank Noe'4, || 1IWR, University of Heidelberg, INF 368, 69120 Heidelberg, Germany** 2Department of Chemistry, Stanford University, Stanford, CA 94305 3ORNL, Oak Ridge, TN 12345 4DFG Research Center Matheon, FU Berlin, Arnimallee 6, 14195 Berlin, Germany (Dated: September 4, 2008) - Recently, others have demonstrated how the short physical trajectories generated in parallel tem-pering simulations between exchanges can be used to construct Markov models to describe peptide dynamics at each simulated temperature.- While able to describe the temperature-dependent kinetics, this approach does not make optimal use of all available data, instead parameterizing the model at a given temperature with only data fromthat temperature. - We propose a strategy where, with a simple modification of the parallel tempering protocol, theharvested trajectories can be reweighted, allowing data from all temperatures to contribute to the estimated kinetic model without requiring the assumption of a particular kinetic model (like Arrhenius)beyond that of classical statistical mechanics. - This method not only reduces the statistical uncertainty in the kinetic model, providing estimatesof transition rates even for transitions not observed at the temperature of interest, it allows the kinetic model to be estimated at temperatures other than those simulated from.- To demonstrate the method in a system, we apply the method to the solvated terminally-blocked alanine peptide. I. INTRODUCTION Conformational transitions are critical to the functionof proteins and nucleic acids. These transitions span large ranges of lengthscales, timescales and complexity,and include ligand binding ? ], complex conformational rearrangements between native protein substates ? ? ],and folding ? ? ]. Understanding these processes is challenging, as they often involve various pathways viamany intermediate conformations. A natural approach towards modeling the kinetics ofmolecules is by partitioning the state space into discrete states ? ? ? ]. In particular, biomolecular function oftendepends on the ability to undergo transitions between long-lived, or "metastable" states ? ], which serve as auseful definition of states in a kinetic model ? ? ]. The switching process between conformational sub-states is often described with the memoryless Master equation: dp(t) dt = p(t)K. (1) with p(t) 2 R1*m containing the probability to findthe system in each of its m states at time t. K 2 Rm*m *Electronic address: jan-hendrik.prinz@iwr.uni-heidelberg.de# Corresponding author; Electronic address: jchodera@stanford.edu##Electronic address: pande@stanford.edu $Electronic address: smithjc@ornl.gov|| Electronic address: noe@math.fu-berlin.de** 1 Both authors have equally contributed to this work is a rate matrix with Kij being the transition rate fromstate i to state j and the diagonal elements are given by Kii = - Pj6=i Kij in order to ensure mass conservation.Alternatively, the system dynamics can be described by a discrete-time Markov process using the transition ma-trix, T (o/ ) 2 [0, 1]m*m, whose entries, Tij, provide theprobability of the system to be found in state j at time t + o/ given that it was in state i at time t. T (o/ ) is row-stochastic, i.e.: P j Tij = 1 for all i. The time-discreteanalog to Eq. (1) is: p(ko/ ) = p(0)T k(o/ ). (2) Eq. (1) and (2) provide equivalent results at discretetimes t = ko/ , k 2 N0 and are related by T (o/ ) = exp(o/ K)[1]. Here, the transition matrix formulation will be used. * In order for the dynamics to be markovian at lag-time o/ states need to be defined such, that their in-ternal equilibraiation time is significantly shorter than the lagtime o/ . This is not trivial to ensure forcomplex molecules, however this is not the issue of the current article and is discussed elsewhere.Cite: (Hummer), JCP(John), JCP(Frank, Illia, s.o) * It has been shown, that using appropiate defini-tions of the discrete states the essential dynamics of biomolecules is well-described by such amarkov model. Cite: MMS(John), JCP07(Frank, Illia, ...), JPCB? 08 (Hummer). * The estimation of the transition probabilities ishampered by the sampling problem, due to which 2 transition between meta-stable states are rareevents. It would be desirable to use dynamics of this same system at some higher temperaturewhere these transitions occur more easily / frequently * Parallel tempering simulations and Replica Ex-change Molecular Dynamics(Cite: PT...), in short PT simulations, have become popular means toexplore the state space at low temperatures using a set of coupled simulations at a range of tem-peratures. The PT simulations at sifferent temperature are coupled by exchanges constructedin a way that they converge toward a canonical distribution at each temperature. PT simulationshave been widely used to explore conformations in biomolecules, that are difficult to access via di-rect MD. * Recently, an approach to estimate interstate tran-sition probabilities from the short physical trajectories generated from PT trajectories between ex-changes has been proposed. (Cite: PRE (Hummer)). However, this approach produces a sepa-rate estimate of the transition matrix at each temperature without attempting to combine the dataat all available temperatures. * We propose an approach to obtain an optimal esti-mation of the temperature dependend transition matrix combing data from all PT temperatures.The estimated tranition matrix is statistically optimal in a maxmimum likelihood sense and doesnot require any model assumptions such as the Arrhenius Law. the estimation procedure is based on[2]. A slight modification of the PT protocol is necessary. * As a result of the optimization procedure one ob-tains the estimate of the transition matrix at any temperature desired. Even at temperatures, thathave previously not been sampled from. Additionally an estimate for the error of the tranisionmatrix elements is obtained. * The procedure is illustrated on a solvatedterminally-blocked alanine dipeptide simulated with PT at temperature in the range 273K upto600K. * Outline of this manuscript II. THEORY A. Discrete-state Markovian kinetic models Let si, i 2 {1, . . . , n} be a complete (Sni=1 si = \Omega ) andnon-overlapping ( si " sj = ;, 8i 6= j fifi i, j 2 {1, . . . , n}) partition of state space \Omega . These sicorrespond to the state between which interstaet transition probabiliteswill be computed. Define the characteristic function O/ as O/i(x) = (1 if x 2 si0 else . The transition probability from state i to state j at lag-time o/ and inverse temperature fi is then defined as Tij(o/ ; fi) = Cij(o/ ; fi)P k C ik(o/ ; fi) (3) defining the row-stochastic transition matrix T (o/, fi) 2 Rn*n Cite: (Bill). The state-to-state time-correlation function Cij(o/ ; fi) is given by Cij(o/ ; fi) j hO/i(0) O/j(o/ )ifi (4) and is symmetric -- Cij (o/ ; fi) = Cji(o/ ; fi) -- for systemsobeying detailed balance. B. Transition probabilities by reweighting 1. Dynamical reweighting Suppose we have a set of Nk Hamiltonian trajectories zkn(t), n = 1, . . . , Nk, t 2 [0, T ], where the initial phasespace points zkn(0) are sampled from canonical (NVT)distributions at corresponding inverse temperatures fik, k = 1, . . . , K. By the application of dynamical reweight-ing [2], we can estimate a correlation function CAB(o/ ; fi)using this set of trajectories as CAB(o/ ; fi) j [Z(fi)]-1 Z dz0 e-fiH(z0) A[z(t)] ss KX k=1 NkX n=1 wkn(fi) A[zkn(t)] (5) where A[z(t)] is a functional of the trajectory z(t) whoseexpectation produces the desired correlation function, such as A[z(t)] = A(z(0)) B*(z(o/ )). The normalizedtrajectory weights wkn(fi) are specified in terms of un-normalized weights ~ wkn(fi) ~wkn(fi) = " KX k0=1 Nk0 exp[fk0 - (fi - fik0 ) Ekn]# -1 (6) such that wkn(fi) = ~wkn(fi)/ KP k=1 NkP n=1 ~w kn(fi), with Ekn j H(zkn(0)) denoting the total energy of the trajectory.The dimensionless free energies fi = - ln Zi + c aredetermined by solution of a set of self-consistent equations e-fi = KX k=1 NkX n=1 ~ wkn(fi) (7) which can be done efficiently by a number of means [2]. 3 2. Modified parallel tempering protocol By using a modified parallel tempering protocol, wecan generate a set of Hamiltonian trajectory segments zkn(t) of uniform length T >= o/ whose initial phasespace points zkn(0) are distributed from the canon-ical (NVT) ensemble at corresponding temperatures fi1, . . . , fiK.We start assuming that some process was used to generate initial phase space points zk0(0) from equilibriumat each corresponding inverse temperature fik zk0(0) , [Z(fik)]-1 e-fiH(zk0(0)) (8) This could be done by means of a standard PT proto-col, or by running the modified protocol for a number of iterations starting from a single configuration.Consider iteration n of the algorithm. For each tem-perature index k = 1, . . . , K, we propagate Hamil-ton's equations of motion using a symplectic integrator to generate trajectories of zkn(t) of length T . Finally,we propose exchanges between the final configurations zin(o/ ) and zjn(o/ ) of temperatures fii and fij, acceptingor rejecting the exchange with the Metropolis-like probability depending on the final potential energies of theconfigurations Ui and Uj with Pexch(Ui, fii; Uj, fij) = min {1, exp[-(fii - fij)(Uj - Ui)]} Regardless of whether the exchange is accepted orrejected, we reassign the velocities according to the Maxwell-Boltzmann distribution at the new (old) tem-peratures, and denote the new phase space points as zk(n+1)(0), from which the next iteration can be carriedout. 3. Transition probabilities To estimate the transition probabilities Tij(o/ ; fi), wefirst compute an estimate of the correlation functions Cij (o/ ; fi) ^Cij(o/ ; fi) = KX k=1 NkX n=1 wkn(fi) Aknij (o/ ) (9) where we have defined the per-trajectory observable Akn(o/ ) to make use of overlapping trajectory segments(in the case T > o/ ) and the time-reversibility of Hamiltonian dynamics, thus ensuring ^Cij (o/ ; fi) = ^Cji(o/ ; fi): Aknij (o/ ) j 12(T - o/ ) Z T-o/ 0 dt 0 [O/i(zkn(t0)) O/j(zkn(t0 + o/ )) + O/j(zkn(t0))O/i(zkn(t0 + o/ ))] (10) The row-stochastic transition matrix estimate ^T(o/ ; fi) isthen estimated from Eq. 3 ^Tji(o/ ; fi) = ^Cij (o/ ; fi)P k ^C ik(o/ ; fi) (11) Because ^C(o/ ; fi) is symmetric, ^T(o/ ; fi) will be reversible. C. Bayesian estimate of transition probabilities from asingle temperature * To compare the estimates of the present method * To analyze, how the combiantion of data from alltemperatures reduces the uncertainy in the estimation of the transition matrix we compare witha method to compute errors at a single temperature. The method is described in detail in Cite:JCP08(Frank) and is briefly outlined here. * Let Cij (o/, fi) be the transition counts from all tra-jectories at inverse temperature fi.Then the pos-terior probability of transition matrices given this observation is p(T |C) / p(T )p(C|T ) = p(T ) Y i,j2S T cijij , (12) We use a uniform prior distribution p(T ) j const andthus p(T |C) / p(C|T ) Furthermore we restrict ourselvesto transition matrices, which fulfill detailed balance, i.e. are reversible with respect to the stationary distribution ss: ssiTij = ssjTji. Distribution (12) was sampled using a Markov chainMonte-Carlo procedure described in Cite: JCP08(Frank). III. APPLICATION TO TERMINALLY-BLOCKEDALANINE PEPTIDE * Description of system used in the test: solvatedterminally-blocked alanine peptide, relevant details from the protocol (Copy from MMS (John)) * PT protocol that was used. Integrator, energyconservation - validated, fluctuations, drift below ?, temperatures, burn-in phase / equilibration1ns. to ensure, that all initial configurations are drawn from equilibrium at their respective tem-peratures. 40 temperatures exponentially spaced between 273K and 600K, 500 trajectories each witha length of 20ps saved every 100fs * State definition O/i: 6 states have been manuallydefined in the OE, space of dihedrals. Using thisstate-definition the interstate dynamics have been shown to be approximately Markovian at a lag-tim of o/ = 6ps. [3], 2004a or b(Swope) Picture ->Energy Landscape for Peptide in OE, space. Thesame state definition was used across all temperatures. 4 * Each contiguous trajectory snippet n at tempera-ture k the corresponding symmetrized(reversible)state-to-state time-correlation function Aknij wascomputed using Eq. (10) * Determination of the weights ~wkn(fi), involv-ing estimation of "freeenergies" fk . From this computing transition probabilites ^Tijand re-lated uncertainties(should be contained in the theory/methods section, if too long move parts to ap-pendix) * Comparison of transition probabilities as a func-tion of temperature. (Picture) Description of Plots "Blue lines show the estimates Transition Proba-bilities using the present method, the result. Error bars indicate 2oe confidence intervals. For compar-ison the black lines show the maximum-likelihood transition probabilities computed from single tem-perature data. It is apparent, that the reweighting methods provides useful and bounded estimatedacross the whole temperature range even when the corresponding transition was not observed atthat temperature. The counting method without reweighting is very noisy and not useful in situa-tions at temperature with no transitions counted. * Red lines show Transition Probabilities using re-versible TM Sampling conducted seperately at each individual temperature using the methoddescribed in section II C. Surprisingly, there is still a lot of information about transitions to/fromstates that haven't been sampled (or have been poorly sampled) from the Bayesian Sampling ap-proach at single temperatures, due to the incorporation of the reversibility contraint. Apparentlythe reweighting estimate that uses the combined data from all temperatures has smaller uncertain-ties than the estimators that work on individual temperatures. * For transitions, that are not sampled are certaintemperature ranges, the maximum-likelihood estimate obtained with our reweigthing method isapproximately zero(s. fig ... panel ...) In these cases the uncertainty is underestimated, since itis derived from the low temperatures for poorly sampled transitions (e.g. 3->6), suspect reweight-ing may significantly underestimate uncertainty since it is derived from the local (hessian) and thusfor values close to zero a bad approximation for the standard deviation of this highly unsymmetricdistribution (s. fig ...) * To validate our result. Further simulations wereconducted at T = 302K. 10000 trajectories shootfrom the equilibrium of each state with length 10ps. Resulting in a single transition at a lagtime o/ = 6ps. The results of these simulations are plot-ted in figure ... as the black line. The blue curve is the reweighting estimate using combined datefrom all temperatures of the short PT simulation and the gray histogram the sampled distributionusing only single temperatures from the PT simulation. In many cases the combined temperatureestimate is closer to the reference distribution than the sampled distribution. * Comparison of temperature dependence of im-plied timescales -Description. s fig. ... * Some figure showing how much different tem-peratures contribute to the estimates of particular transitions - Description. s fig. * In Extrapolation figure? Is this reasonable at all ? IV. DISCUSSION * Even at temperatures at which no transitions areobserved, can still estimate transition probabilities. * Allows us to interpolate at temperatures thathaven't been sampled; smoothly describe temperature-dependent behavior (fluctuationsmuch smaller than Bayesian method) * We could even differentiate the estimates of transi-tion probabilities if we wanted because trajectory weights are differentiable * Same is true for non-dynamic properties (expecta-tions, stationary distributions) * Some transitions appear to be thermally activatedprocesses (increasing probability with energy or temperature), others are rather flat * What about entropically-dominated barriers? Inthis case, heating would reduce rate, but because we can use lower temperatures, this can help too. * While temperature range at which other tempera-tures contribute significantly to weighting diminishes as system size increases, the requirement thattemperatures in parallel tempering be spaced to get good exchange means that several replicas arealways guaranteed to contribute * Any observations about metastable state structureor grouping as temperature changes? (PCCA+ as a function of temperature?) * What about the future? Some combination ofBayesian and reweighting (similar to Ron Levy's T-WHAM) may provide the best of both types ofestimators in giving good uncertainties at the expense of introducing some bias from introductionof histograms in energy 5 * Transition path sampling could help collect moredata for poorly sampled transitions - comment on "sweet spot" for energies of these transitions? * Limitations of requiring we use Hamiltonian tra-jectories - could it work for longer trajectories or non-Hamiltonian dynamics? V. ACKNOWLEDGMENTS JDC and VSP gratefully acknowledge support froman NSF grant for Cyberinfrastructure (NSF CHE0535616). [1] N. G. van Kampen, Stochastic processes in physics and chem-istry, 2nd ed. (Elsevier, ADDRESS, 1997). [2] M. R. Shirts and J. D. Chodera, J. Chem. Phys. In press,(2008). [3] J. D. Chodera, W. C. Swope, J. W. Pitera, and K. A. Dill,Multiscale Model. Simul. 5, 1214 (2006). Appendix A: Proof that modified parallel temperingprotocol generates canonical distribution Appendix B: Estimation of uncertainties in transitionprobabilities