The thioredoxin family of oxidoreductases plays an important role in redox signaling and control of protein function. Not only are thioredoxins linked to a variety of disorders, but their stable structure has also seen application in protein engineering. Both sequence-based and structure-based tools exist for thioredoxin identification, but remote homolog detection remains a challenge. We developed a thioredoxin predictor that integrates sequence with structural information. We combined a sequence-based Hidden Markov Model (HMM) with a molecular dynamics enhanced structure-based recognition method (dynamic FEATURE, DF). This hybrid method (HMMDF) achieves both high precision and high recall (0.90 and 0.95, respectively), compared with HMM alone (0.92 and 0.87, respectively) and DF alone (0.82 and 0.97, respectively). Dynamic FEATURE is sensitive but struggles to distinguish closely related protein families, while HMM captures these evolutionary differences at the cost of sensitivity. Applied to structural genomics targets, our method makes a strong prediction of a novel thioredoxin.
Integration of evolutionary conservation information (HMM) and structural dynamics information (FEATURE + MD simulation) improves recognition of protein functional sites.
The study described in this publication showed improved remote homolog detection using a combination of sequence and structural dynamics information. As a test case, we focused on the Thioredoxin family, a member of the Thioredoxin-like superfamily of oxidoreductases.
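The precision and recall values quoted in the abstract can be condensed into a single number to make the trade-off easier to see. The sketch below uses the F1 score (the harmonic mean of precision and recall). F1 is not reported in the source; it is derived here only from the three precision/recall pairs given above, and the names `reported` and `f1_score` are illustrative.

```python
# Precision and recall as quoted in the abstract. The F1 values computed
# below are derived for illustration and are not figures from the publication.
reported = {
    "HMMDF": (0.90, 0.95),
    "HMM": (0.92, 0.87),
    "DF": (0.82, 0.97),
}


def f1_score(precision, recall):
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1_by_method = {name: f1_score(p, r) for name, (p, r) in reported.items()}

for name, f1 in f1_by_method.items():
    print(f"{name}: F1 = {f1:.3f}")
```

Under this derived metric the hybrid method scores about 0.92, against roughly 0.89 for either method used alone.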
The Thioredoxin HMM and the multi-site Thioredoxin FEATURE model described in the publication are provided. For more information on using them, please refer to the HMMER User Manual and the SimTK FEATURE project, respectively.
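As a rough illustration of working with the provided HMM, the sketch below parses the per-target table that HMMER's `hmmsearch` writes when run with `--tblout`, and keeps hits below an E-value cutoff. The function names and the cutoff of 1e-3 are assumptions made for this example, not settings taken from the publication; consult the HMMER User Manual for the authoritative column layout.

```python
from typing import Dict, List


def parse_tblout(text: str) -> List[Dict[str, object]]:
    """Parse the per-target table written by `hmmsearch --tblout`.

    Comment lines start with '#'. Each data row has 18 whitespace-separated
    columns followed by a free-text description.
    """
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split(maxsplit=18)
        if len(fields) < 18:
            continue  # skip malformed rows rather than guessing at them
        hits.append({
            "target": fields[0],
            "query": fields[2],
            "evalue": float(fields[4]),  # full-sequence E-value
            "score": float(fields[5]),   # full-sequence bit score
        })
    return hits


def filter_hits(hits, max_evalue=1e-3):
    """Keep hits at or below an E-value cutoff (the default is illustrative)."""
    return [h for h in hits if h["evalue"] <= max_evalue]
```

A typical use would be to read the table file into a string, call `parse_tblout` on it, and pass the result to `filter_hits` with a cutoff suited to the search.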
Trajectory data files for implicit solvent simulations of 100+ proteins (Thioredoxins and Non-Thioredoxins) are provided in GROMACS format (.gro and .xtc). A summary of the proteins simulated can be found in Table S1 of the publication. Please refer to the Molecular Dynamics Simulation section of the Experimental Methods for simulation parameters.
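For a quick look at the structure files without installing a trajectory library, a `.gro` file can be read directly because it is fixed-width text: a title line, an atom count, one line per atom, and a final box line. The sketch below reads only positions (in nanometres) and ignores the optional velocity columns. The names `GroAtom` and `parse_gro` are illustrative. Reading the compressed `.xtc` frames requires a dedicated trajectory library and is not attempted here.

```python
from typing import List, NamedTuple, Tuple


class GroAtom(NamedTuple):
    resid: int
    resname: str
    name: str
    index: int
    xyz: Tuple[float, float, float]  # positions in nanometres


def parse_gro(text: str):
    """Parse a single-frame GROMACS .gro file given as a string.

    Returns (title, atoms, box). Velocity columns, if present, are ignored.
    """
    lines = text.splitlines()
    title = lines[0].strip()
    n_atoms = int(lines[1])
    atoms: List[GroAtom] = []
    for line in lines[2:2 + n_atoms]:
        # Fixed-width fields: resid(5) resname(5) atom name(5) atom number(5)
        # followed by x, y, z in 8-character columns.
        atoms.append(GroAtom(
            resid=int(line[0:5]),
            resname=line[5:10].strip(),
            name=line[10:15].strip(),
            index=int(line[15:20]),
            xyz=(float(line[20:28]), float(line[28:36]), float(line[36:44])),
        ))
    box = tuple(float(v) for v in lines[2 + n_atoms].split())
    return title, atoms, box
```

Calling `parse_gro` on the contents of one of the provided `.gro` files returns the title, the atom list, and the box vectors, which is enough to check residue names and atom counts before loading the full trajectory.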