Primary Publication
Karr JR*, Sanghvi JC*, Macklin DN, Gutschow MV, Jacobs JM, Bolival B, Assad-Garcia N, Glass JI, Covert MW. A Whole-Cell Computational Model Predicts Phenotype from Genotype. Cell 150, 389-401 (2012)

Understanding how complex phenotypes arise from individual molecules and their interactions is a primary challenge in biology that computational approaches are poised to tackle. We report a whole-cell computational model of the life cycle of the human pathogen Mycoplasma genitalium that includes all of its molecular components and their interactions. An integrative approach to modeling that combines diverse mathematics enabled the simultaneous inclusion of fundamentally different cellular processes and experimental measurements. Our whole-cell model accounts for all annotated gene functions and was validated against a broad range of data. The model provides insights into many previously unobserved cellular behaviors, including in vivo rates of protein-DNA association and an inverse relationship between the durations of DNA replication initiation and replication. In addition, experimental analysis directed by model predictions identified previously undetected kinetic parameters and biological functions. We conclude that comprehensive whole-cell models can be used to facilitate biological discovery.

Related Publications
Karr JR*, Phillips NC*, Covert MW. WholeCellSimDB: a hybrid relational/HDF database for whole-cell model predictions. Database. (2014)

Mechanistic ‘whole-cell’ models are needed to develop a complete understanding of cell physiology. However, extracting biological insights from whole-cell models requires running and analyzing large numbers of simulations. We developed WholeCellSimDB, a database for organizing whole-cell simulations. WholeCellSimDB was designed to enable researchers to search simulation metadata to identify simulations for further analysis, and quickly slice and aggregate simulation results data. In addition, WholeCellSimDB enables users to share simulations with the broader research community. The database uses a hybrid relational/hierarchical data format architecture to efficiently store and retrieve both simulation setup metadata and results data. WholeCellSimDB provides a graphical Web-based interface to search, browse, plot and export simulations; a JavaScript Object Notation (JSON) Web service to retrieve data for Web-based visualizations; a command-line interface to deposit simulations; and a Python API to retrieve data for advanced analysis. Overall, we believe WholeCellSimDB will help researchers use whole-cell models to advance basic biological science and bioengineering.

Karr JR, Sanghvi JC, Macklin DN, Arora A, Covert MW. WholeCellKB: Pathway/Genome Databases for Comprehensive Whole-Cell Models. Nucleic Acids Research, 41, D787-D792 (2013)

Whole-cell models promise to greatly facilitate the analysis of complex biological behaviors. Whole-cell model development requires comprehensive model organism databases. WholeCellKB is an open-source web-based software program for constructing model organism databases. WholeCellKB provides an extensive and fully customizable data model that fully describes individual species including the structure and function of each gene, protein, reaction and pathway. We used WholeCellKB to create WholeCellKB-MG, a comprehensive database of the Gram-positive bacterium Mycoplasma genitalium using over 900 sources. WholeCellKB-MG is extensively cross-referenced to existing resources including BioCyc, KEGG and UniProt. WholeCellKB-MG is freely accessible through a web-based user interface as well as through a RESTful web service.

Purcell O*, Jain B*, Karr JR, Covert MW, Lu TK. Towards a whole-cell modeling approach for synthetic biology. Chaos, 23, 025112 (2013)

Despite rapid advances over the last decade, synthetic biology lacks the predictive tools needed to enable rational design. Unlike established engineering disciplines, the engineering of synthetic gene circuits still relies heavily on experimental trial-and-error, a time-consuming and inefficient process that slows down the biological design cycle. This reliance on experimental tuning is because current modeling approaches are unable to make reliable predictions about the in vivo behavior of synthetic circuits. A major reason for this lack of predictability is that current models view circuits in isolation, ignoring the vast number of complex cellular processes that impinge on the dynamics of the synthetic circuit and vice versa. To address this problem, we present a modeling approach for the design of synthetic circuits in the context of cellular networks. Using the recently published whole-cell model of Mycoplasma genitalium, we examined the effect of adding genes into the host genome. We also investigated how codon usage correlates with gene expression and find agreement with existing experimental results. Finally, we successfully implemented a synthetic Goodwin oscillator in the whole-cell model. We provide an updated software framework for the whole-cell model that lays the foundation for the integration of whole-cell models with synthetic gene circuit models. This software framework is made freely available to the community to enable future extensions. We envision that this approach will be critical to transforming the field of synthetic biology into a rational and predictive engineering discipline.
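For readers unfamiliar with the Goodwin oscillator mentioned above, the following is a minimal standalone sketch of the textbook three-variable form (a gene product that represses its own synthesis through two intermediate steps). It is not the implementation used inside the whole-cell model, which the abstract does not describe. The function names, the parameter values (a, b, n), and the fixed-step RK4 integrator are all assumptions chosen for illustration.

```python
# Textbook Goodwin oscillator, integrated with a fixed-step RK4 scheme.
# All names and parameter values are illustrative assumptions, not taken
# from the whole-cell model described in the paper.

def goodwin_rhs(state, a=1.0, b=0.1, n=10):
    """Time derivatives for (x, y, z): mRNA, protein, repressor.

    x is produced at a rate repressed by z through a Hill function and
    all three species decay linearly at the same rate b.
    """
    x, y, z = state
    dx = a / (1.0 + z ** n) - b * x
    dy = x - b * y
    dz = y - b * z
    return (dx, dy, dz)


def rk4_step(state, dt, **params):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(s + scale * k for s, k in zip(base, slope))

    k1 = goodwin_rhs(state, **params)
    k2 = goodwin_rhs(shifted(state, k1, dt / 2.0), **params)
    k3 = goodwin_rhs(shifted(state, k2, dt / 2.0), **params)
    k4 = goodwin_rhs(shifted(state, k3, dt), **params)
    return tuple(
        s + dt / 6.0 * (p + 2.0 * q + 2.0 * r + w)
        for s, p, q, r, w in zip(state, k1, k2, k3, k4)
    )


def simulate_goodwin(initial, dt=0.05, steps=8000, **params):
    """Return the trajectory as a list of (x, y, z) tuples, including the start."""
    trajectory = [tuple(initial)]
    for _ in range(steps):
        trajectory.append(rk4_step(trajectory[-1], dt, **params))
    return trajectory


def count_peaks(series):
    """Count strict local maxima in a sequence of numbers."""
    return sum(
        1
        for i in range(1, len(series) - 1)
        if series[i - 1] < series[i] > series[i + 1]
    )


if __name__ == "__main__":
    traj = simulate_goodwin((0.0, 0.0, 0.0))
    repressor = [s[2] for s in traj]
    print("repressor peaks observed:", count_peaks(repressor))
```

With equal degradation rates, this form of the model needs a steep Hill coefficient (n greater than 8) for the steady state to lose stability, which is why n = 10 is assumed here.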

Macklin DN, Ruggero NA, Covert MW. The future of whole-cell modeling. Curr Opin Biotechnol 28C, 111-115 (2014)

Integrated whole-cell modeling is poised to make a dramatic impact on molecular and systems biology, bioengineering, and medicine, once certain obstacles are overcome. From our group's experience building a whole-cell model of Mycoplasma genitalium, we identified several significant challenges to building models of more complex cells. Here we review and discuss these challenges in seven areas: first, experimental interrogation; second, data curation; third, model building and integration; fourth, accelerated computation; fifth, analysis and visualization; sixth, model validation; and seventh, collaboration and community development. Surmounting these challenges will require the cooperation of an interdisciplinary group of researchers to create increasingly sophisticated whole-cell models and make data, models, and simulations more accessible to the wider community.

Kazakiewicz D*, Karr JR*, Langner KM & Plewczynski D. Combined systems and structural modeling repositions antibiotics for Mycoplasma genitalium. Comput Biol Chem, pii: S1476-9271(15)30089-X (2015)

Bacteria are increasingly resistant to existing antibiotics, which target a narrow range of pathways. New methods are needed to identify targets, including repositioning targets among distantly related species. We developed a novel combination of systems and structural modeling and bioinformatics to reposition known antibiotics and targets to new species. We applied this approach to Mycoplasma genitalium, a common cause of urethritis. First, we used quantitative metabolic modeling to identify enzymes whose expression affects the cellular growth rate. Second, we searched the literature for inhibitors of homologs of the most fragile enzymes. Next, we used sequence alignment to confirm that the binding site is shared by M. genitalium, but not by humans. Lastly, we used molecular docking to verify that the reported inhibitors preferentially interact with M. genitalium proteins over their human homologs. Thymidylate kinase was the top predicted target and piperidinylthymines were the top compounds. Further work is needed to experimentally validate piperidinylthymines. In summary, combined systems and structural modeling is a powerful tool for drug repositioning.

Lee R*, Karr JR*, Covert MW. WholeCellViz: Data visualization for whole-cell models. BMC Bioinformatics 14, 253 (2013)

Summary: Whole-cell models promise to accelerate biomedical science and engineering. However, discovering new biology from whole-cell models and other high-throughput technologies requires novel tools for exploring and analyzing complex, high-dimensional data. We developed WholeCellViz, a web-based software program for visually exploring, analyzing, and communicating whole-cell simulations. WholeCellViz provides 14 structured visualizations including animated metabolic and chromosome maps. These visualizations help researchers communicate and analyze model predictions by displaying predictions in their biological context. WholeCellViz also provides a simple interface for completely exploring model predictions using line plots. Furthermore, WholeCellViz enables researchers to compare predictions within and across simulations by allowing users to simultaneously display multiple plots and animations. Availability: WholeCellViz and its source code are freely available.

Karr JR, Williams AH, Zucker JD, Raue A, Steiert B, Timmer J, Kreutz C, DREAM8 Parameter Estimation Challenge Consortium, Wilkinson S, Allgood BA, Bot BM, Hoff BR, Kellen MR, Covert MW, Stolovitzky GA, Meyer P. A Crowdsourced Approach to Parameter Estimation for Whole-Cell Modeling. (In press) (2015)

Whole-cell models that explicitly represent all cellular components at the molecular level have the potential to predict phenotype from genotype. However, even for simple bacteria, whole-cell models will contain thousands of parameters, many of which are poorly characterized or unknown. New algorithms are needed to estimate these parameters and enable researchers to build increasingly comprehensive models. We organized the Dialogue for Reverse Engineering Assessments and Methods (DREAM) 8 Whole-Cell Parameter Estimation Challenge to develop new parameter estimation algorithms for whole-cell models. We asked participants to identify a subset of parameters of a whole-cell model given the model’s structure and in silico “experimental” data. Here we describe the challenge, the best performing methods, and new insights into the identifiability of whole-cell models. We also describe several valuable lessons we learned toward improving future challenges. Going forward, we believe that collaborative efforts supported by inexpensive cloud computing have the potential to solve whole-cell model parameter estimation.