Scientists have just released a first draft of the “tree of life” for the roughly 2.3 million named species of animals, plants, fungi, and microbes.
The tree depicts the relationships among living things as they diverged from one another over time, tracing back to the beginning of life on Earth more than 3.5 billion years ago.
Tens of thousands of smaller trees have been published over the years for select branches of the tree of life—some containing upwards of 100,000 species—but this is the first time those results have been combined into a single tree that encompasses all of life. The end result is a digital resource that is available free online for anyone to use or edit, much like a “Wikipedia” for evolutionary trees.
Understanding how the millions of species on Earth are related to one another helps scientists discover new drugs, increase crop and livestock yields, and trace the origins and spread of infectious diseases such as HIV, Ebola, and influenza.
“This is the first real attempt to connect the dots and put it all together,” says principal investigator Karen Cranston of Duke University. “Think of it as Version 1.0.” A paper summarizing the findings appears online in the Proceedings of the National Academy of Sciences.
University of Michigan evolutionary biologist Stephen Smith, a co-first author of the study, heads the group that tackled the nitty-gritty details of piecing together all the existing branches, stems, and twigs of life’s tree into a single diagram.
Rather than build the tree of life from scratch, the researchers pieced it together by compiling thousands of smaller chunks that had already been published online and merging them into a gigantic “supertree” that encompasses all named species.
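To make the idea of merging published trees concrete, here is a toy sketch in Python. It is not the team's software and makes no claim about how their method works. It assumes each small tree is written as nested tuples of species names and that the trees are listed in priority order. A grouping from a later tree is kept only if it nests inside, or is fully separate from, every grouping already kept. The function names and example species are illustrative assumptions.

```python
def clades(tree):
    """Return (leaf_set, internal_clades) for a nested-tuple tree.

    A leaf is a string; an internal node is a tuple of subtrees.
    The last clade in the returned list is the tree's own root.
    """
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)
    return leaves, found


def compatible(a, b):
    """Two clades can sit in one tree if they are disjoint or nested."""
    return a.isdisjoint(b) or a <= b or b <= a


def merge_trees(ranked_trees):
    """Greedily collect mutually compatible clades, highest priority first."""
    accepted = []
    for tree in ranked_trees:
        _, found = clades(tree)
        for clade in found[:-1]:  # skip each tree's root clade (it is trivial)
            if clade not in accepted and all(
                compatible(clade, other) for other in accepted
            ):
                accepted.append(clade)
    return accepted


# Hypothetical input trees, listed from most to least trusted.
tree_a = ((("human", "chimp"), "gorilla"), "orangutan")
tree_b = (("chimp", "gorilla"), "human")   # conflicts with tree_a
tree_c = (("cat", "dog"), "human")         # adds a new, separate group

merged = merge_trees([tree_a, tree_b, tree_c])
for clade in merged:
    print(sorted(clade))
```

In this sketch the chimp-gorilla grouping from the second tree is dropped because it contradicts the higher-priority human-chimp grouping, while the cat-dog grouping is added because it conflicts with nothing. The rule is deliberately conservative for trees that share only some species, which is one reason real supertree methods are far more involved.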
“Many participants on the project contributed hundreds of hours tracking down and cleaning up thousands of trees from the literature, then selecting 484 of them that were used to generate the draft tree of life,” says co-first author Cody Hinchliff, formerly a postdoctoral researcher in Smith’s lab who is now at the University of Idaho.
Combining the 484 trees was a painstaking process that took three years to complete, says Smith, an assistant professor in the department of ecology and evolutionary biology.
The project required Smith and Hinchliff to write tens of thousands of lines of computer code and to create several new software packages.
“To complete this project, we had to code our own solutions. There was nothing out of the box that we could use,” adds Smith.
The aim was to create software tools and algorithms that balanced performance with efficiency when combining large numbers of trees, Hinchliff says.
“Our software, which is called ‘treemachine,’ took a few days to generate the current draft tree of life on a moderately outfitted desktop workstation in Stephen’s office,” he says. “For comparison, other state-of-the-art methods we tried would have taken hundreds of years to finish on that kind of hardware.”
What we still don’t know
The team faced another challenge: The vast majority of evolutionary trees are published as PDFs and other image files that are impossible to enter into a database or merge with other trees.
“There’s a pretty big gap between the sum of what scientists know about how living things are related, and what’s actually available digitally,” Cranston says.
As a result, the relationships depicted in some parts of the tree, such as the branches representing the pea and sunflower families, don’t always agree with expert opinion. Other parts of the tree, particularly insects and microbes, remain elusive.
That’s because even the most popular online archive of raw genetic sequences—from which many evolutionary trees are built—contains DNA data for less than 5 percent of the tens of millions of species estimated to exist on Earth.
“As important as showing what we do know about relationships, this first tree of life is also important in revealing what we don’t know,” says coauthor Douglas Soltis of the University of Florida.
‘This is just the beginning’
To help fill in the gaps, the team is also developing software that will let researchers log on to update and revise the tree as new data come in for the millions of species still being named or discovered.
“This is just the beginning,” Smith says. “While the tree of life is interesting in its own right, our database of thousands of curated trees is an even more useful resource. We hope that this publication will encourage other researchers to contribute their own studies or to enter information from previously published sources.”
“Twenty-five years ago, people said this goal of huge trees was impossible,” Soltis says. “The Open Tree of Life is an important starting point that other investigators can now refine and improve for decades to come.”
The National Science Foundation supported the work. Coauthors contributed from Interrobang Corporation; the University of Florida; the Field Museum of Natural History; George Washington University; the University of Nebraska-Kearney; Clark University; Michigan State University; Smith College; the University of Kansas; the National Evolutionary Synthesis Center; and Texas A&M University.
The current version of the tree—along with the underlying data and source code—is available to browse and download online.
Source: University of Michigan