One approach that we use to address these questions is phylogenetic analysis (i.e., the creation and analysis of evolutionary trees). We can make deep evolutionary trees of ancient protein families and then reconstruct what the ancestor at the root of the family looked like. Once we reconstruct the amino acid sequence of the ancestral protein, we can use common computational biology techniques to predict its structure and function.

A second approach uses digital life simulations, which we developed specifically to investigate the kinds of evolutionary processes that would have occurred in very early evolutionary history. These simulations consist of digital organisms in a simulated ecosystem. Each organism has its own evolvable genome that allows it to carry out a process loosely analogous to metabolism in order to grow and divide. All of this is highly abstracted relative to how life actually works. Even so, these simulations allow us to study processes that would have occurred in early evolution but would no longer happen once life took its current form.

A third approach incorporates different types of data about early evolution into a unified framework so that they can be compared and corroborated. The resulting database can be a useful tool for testing hypotheses about early evolution and finding connections between stages of early evolution. We recently used it to find areas where different studies of the genome of the last universal common ancestor, at the root of the tree of life, corroborate one another in terms of metabolism and broader biological functions.

Our research focuses on early evolutionary history, and in particular on how the kinds of biological systems that organisms use today first arose in ancient life. We ask questions such as: How did cellular organization first arise and what did it look like? Where did the genetic system come from and how did its current complexity arise from a simpler one? How did the metabolism that is today mediated by protein enzymes first evolve and what was its relationship to the geochemical environment?

This is a reconstruction of a protein essential to cellular organization as it would have looked prior to the last universal common ancestor, roughly 3.5-4 billion years ago. See the full publication here.

This is a depiction of the core metabolic network of the last universal common ancestor (black lines) overlaid on modern global metabolism. See the full publication here.