Our Hierarchical Nyström Methods help spectral clustering identify metastable aggregates with highly populated microstates rather than being distracted by lowly populated states when constructing Markov State Models.
Markov state models (MSMs) have become a popular approach for investigating the conformational dynamics of proteins and other biomolecules.
MSMs are typically built from numerous molecular dynamics simulations by dividing the sampled configurations into a large number of microstates based on geometric criteria.
The resulting microstate model can then be coarse-grained into a more understandable macrostate model by lumping together rapidly mixing microstates into larger, metastable aggregates.
However, finite sampling often results in the creation of many poorly sampled microstates.
During coarse-graining, these states are mistakenly identified as being kinetically important because transitions to/from them appear to be slow.
In this paper, we propose a formalism based on an algebraic principle for matrix approximation, namely the Nyström method, to deal with such poorly sampled microstates.
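For readers unfamiliar with the Nyström method, the following is a minimal sketch of the generic matrix approximation it refers to, not of the formalism proposed in this paper. It assumes a symmetric positive semidefinite matrix `K` and a hand-picked set of landmark indices; the function name `nystrom_approximation` and the cutoff `rcond=1e-10` are illustrative choices, not taken from the paper.

```python
import numpy as np


def nystrom_approximation(K, landmark_idx, rcond=1e-10):
    """Approximate a symmetric PSD matrix K from a subset of its columns.

    Generic Nystrom form: K ~ C @ pinv(W) @ C.T, where C holds the
    landmark columns of K and W is the landmark-by-landmark block.
    """
    K = np.asarray(K, dtype=float)
    idx = np.asarray(landmark_idx, dtype=int)
    C = K[:, idx]              # n x m block of selected columns
    W = K[np.ix_(idx, idx)]    # m x m block among the landmarks
    # The pseudo-inverse tolerates a rank-deficient W; rcond is an
    # illustrative cutoff for discarding near-zero singular values.
    W_pinv = np.linalg.pinv(W, rcond=rcond)
    return C @ W_pinv @ C.T


# Usage: a rank-3 matrix is recovered from 5 of its 50 columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
K = X @ X.T
K_hat = nystrom_approximation(K, [0, 1, 2, 3, 4])
max_abs_error = float(np.max(np.abs(K - K_hat)))
```

The approximation is exact here only because the example matrix has rank no larger than the number of landmarks; for a general matrix, the error depends on which columns are selected.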
Our scheme builds a hierarchy of microstates from high to low populations and progressively applies spectral clustering on sets of microstates within each level of the hierarchy. It helps spectral clustering identify metastable aggregates with highly populated microstates rather than being distracted by lowly populated states.
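As a rough illustration of ordering microstates from high to low population, the sketch below ranks microstates by their counts and splits the ranking into levels. The equal-sized split, the function name `population_levels`, and the example counts are all assumptions made for illustration; the abstract does not specify how the levels are defined, and the spectral clustering step applied within each level is not shown.

```python
import numpy as np


def population_levels(populations, n_levels):
    """Group microstate indices into levels, most populated first.

    Hypothetical rule: rank microstates by population (descending)
    and split the ranking into n_levels nearly equal-sized groups.
    """
    populations = np.asarray(populations)
    # A stable sort keeps the original order among equally populated states.
    order = np.argsort(-populations, kind="stable")
    return [level for level in np.array_split(order, n_levels)]


# Usage with made-up counts for six microstates.
counts = [5, 100, 1, 50, 20, 2]
levels = population_levels(counts, n_levels=3)
```

In this toy example, the first level holds the two most populated microstates, and the poorly sampled ones fall into the last level, where they would be considered only after the well-sampled states.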