Researchers have developed a new, less expensive, and faster method to determine the DNA sequence of the male-specific Y chromosome in the gorilla.
Because the technique will allow better access to genetic information of the Y chromosome of any species, it may be useful in studying male infertility disorders and male-specific mutations—and could also help trace paternity and track how males move within and between populations in endangered species, like gorillas.
“Surprisingly, we found that in many ways the gorilla Y chromosome is more similar to the human Y chromosome than either is to the chimpanzee Y chromosome,” says Kateryna Makova, professor of science at Penn State.
“In regions of the chromosome where we can align all three species, the sequence similarity fits with what we know about the evolutionary relationships among the species—humans are more closely related to chimpanzees. However, the chimpanzee Y chromosome appears to have undergone more changes in the number of genes and contains a different amount of repetitive elements compared to the human or gorilla.
“Moreover, a greater proportion of the gorilla Y sequences can be aligned to the human than to the chimpanzee Y chromosome.”
The Y chromosome of mammals is incredibly difficult to sequence for a number of reasons. One reason is that it is present in only one copy and makes up only about one to two percent of the total genetic material found in a cell of a male. To reduce this difficulty, the researchers used an experimental technique called flow-sorting to preferentially select the Y chromosome for sequencing based on the chromosome’s size and genetic content.
“Flow-sorting increased the amount of the Y chromosome in our dataset to about thirty percent,” says Paul Medvedev, assistant professor of computer science and engineering and of biochemistry and molecular biology. “To further enrich our data for the Y chromosome, we developed a computational technique—called RecoverY—to sort the data into Y and non-Y sequences based on how frequently similar sequences appeared in our data.”
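The idea of sorting reads by how often similar sequences appear can be illustrated with a small sketch. This is not the RecoverY code. It assumes that "similar sequences" are measured as short fixed-length substrings (k-mers) and that a read counts as Y-like when the median abundance of its k-mers meets a threshold. The function names, the median rule, and the threshold are all hypothetical choices made for illustration.

```python
from collections import Counter
from statistics import median


def kmers(read, k):
    """Yield every length-k substring of a read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def split_by_kmer_abundance(reads, k, threshold):
    """Split reads into (enriched, other) by median k-mer abundance.

    Hypothetical rule, not the published method: a read is 'enriched'
    when the median dataset-wide count of its k-mers is >= threshold.
    """
    # Count every k-mer across the whole dataset.
    counts = Counter(km for read in reads for km in kmers(read, k))

    enriched, other = [], []
    for read in reads:
        abundances = [counts[km] for km in kmers(read, k)]
        # Reads shorter than k have no k-mers and fall into 'other'.
        if abundances and median(abundances) >= threshold:
            enriched.append(read)
        else:
            other.append(read)
    return enriched, other


# Toy data: one sequence seen five times, another seen once.
reads = ["ACGTACGT"] * 5 + ["TTGGCCAA"]
enriched, other = split_by_kmer_abundance(reads, k=4, threshold=3)
```

In this toy run the five copies of the abundant read land in the enriched group and the single rare read does not. With real data the threshold would have to be chosen from the observed abundance distribution.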
The study appears online in the journal Genome Research. Like all DNA, the Y chromosome is composed of a series of molecules called “bases” that are represented by the letters A, T, C, and G. Current genetic sequencing technologies produce “reads” of sequence that are much shorter than the entire length of the chromosome. These reads need to be placed in order and pieced together into longer and longer chunks by finding places where they overlap. The research team used two different sequencing technologies to help with this assembly of the DNA sequence of the Y chromosome.
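Piecing reads together by overlap can be shown with a toy example. The sketch below merges two reads when the end of one exactly matches the start of the other. It is a simplified illustration only: real assemblers tolerate sequencing errors and handle millions of reads, and the function names and the minimum-overlap default are assumptions made here.

```python
def overlap_length(left, right, min_overlap=3):
    """Return the length of the longest suffix of `left` that equals
    a prefix of `right`, or 0 if it is shorter than `min_overlap`."""
    max_len = min(len(left), len(right))
    # Try the longest possible overlap first, then shrink.
    for n in range(max_len, min_overlap - 1, -1):
        if left[-n:] == right[:n]:
            return n
    return 0


def merge_reads(left, right, min_overlap=3):
    """Join two reads on their overlap; return None if they do not overlap."""
    n = overlap_length(left, right, min_overlap)
    if n == 0:
        return None
    # Keep `left` whole and append only the non-overlapping tail of `right`.
    return left + right[n:]


merged = merge_reads("ACGTTGCA", "TGCAGGTC")  # overlap is "TGCA"
```

Repeating this step across many reads is what grows short reads into longer chunks.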
One sequencing technology used by the researchers produces massive amounts of very short reads—about 150 to 250 bases in length. Using this method, the researchers sequenced enough reads to cover the entire length of the Y chromosome about 450 times.
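Covering a chromosome "about 450 times" is a statement about sequencing depth: the total number of bases sequenced divided by the length of the target. The sketch below works through that arithmetic. The target length used in the example is a placeholder, not the size of the gorilla Y chromosome, and the function names are illustrative.

```python
import math


def coverage(num_reads, read_length, target_length):
    """Average number of times each base of the target is sequenced."""
    return num_reads * read_length / target_length


def reads_needed(depth, read_length, target_length):
    """Smallest number of reads that reaches the requested average depth."""
    return math.ceil(depth * target_length / read_length)


# Placeholder target of 1,000,000 bases (an assumption for illustration).
n = reads_needed(depth=450, read_length=150, target_length=1_000_000)
depth = coverage(n, read_length=150, target_length=1_000_000)
```

Shorter reads mean more reads are needed for the same depth, which is why this technology has to produce them in massive amounts.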
The researchers assembled these short reads into longer chunks, then connected those chunks using the second sequencing technology, which produces longer reads of about seven thousand bases on average.
“By reducing non-Y chromosome reads from our data with flow sorting and the RecoverY technique that we developed, and by using this combination of sequencing technologies, we were able to assemble the gorilla Y chromosome so that more than half of the sequence data was in chunks longer than about 100,000 bases in length,” Medvedev says.
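A summary of the form "more than half of the sequence is in chunks longer than a given length" matches a statistic that genome assemblies commonly report as N50. Whether the figure quoted here was computed exactly this way is an assumption. The sketch below shows the usual calculation on made-up chunk lengths.

```python
def n50(lengths):
    """Return the chunk length L such that chunks of length >= L
    together contain at least half of all assembled bases."""
    total = sum(lengths)
    running = 0
    # Walk from the longest chunk down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


# Made-up chunk lengths, for illustration only.
value = n50([10, 20, 30, 40])
```

A larger value means the assembly is less fragmented, since fewer and longer chunks hold most of the sequence.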
Another reason that determining the genetic sequence of the Y chromosome is so difficult is that it is composed of an unusually high number of repeated sequences: regions where the sequence of As, Ts, Cs, and Gs is identical, or nearly identical, for thousands or millions of bases in a row. Many of these repeats, including some genes, appear as back-to-back series of the same repeated sequence or as long palindromes which, like the word “racecar,” read the same forward and backward. The researchers used an experimental technique, “droplet digital polymerase chain reaction,” to determine the number of copies of the genes that appear in these series.
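Both kinds of repeat can be illustrated on toy sequences. In double-stranded DNA, a palindrome is usually defined as a sequence that matches its own reverse complement, meaning it reads the same on the opposite strand in the opposite direction. The sketch below checks that property and counts back-to-back copies of a repeated unit. The function names are illustrative, and the sequences are far shorter than the repeats found on a real chromosome.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read in reverse."""
    return seq.translate(COMPLEMENT)[::-1]


def is_dna_palindrome(seq):
    """True when a sequence equals its own reverse complement."""
    return seq == reverse_complement(seq)


def tandem_copies(seq, unit):
    """Count back-to-back copies of `unit` at the start of `seq`."""
    if not unit:
        return 0
    count = 0
    while seq.startswith(unit, count * len(unit)):
        count += 1
    return count


palindromic = is_dna_palindrome("GAATTC")
copies = tandem_copies("ACGACGACGTT", "ACG")
```

Because every copy of a repeat looks the same, reads drawn from different copies cannot easily be told apart, which is what makes these regions hard to assemble.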
“Sequencing the Y chromosome is like trying to put together a jigsaw puzzle, without knowing the final picture, from a pile of pieces where only about one out of every hundred is useful, and most of the pieces you do need look identical,” Makova says. “We’ve developed a pipeline for sequencing the Y chromosome that is more efficient than previous methods and reduces a number of the difficulties associated with determining the genetic sequence of the Y chromosome. Our method will open the door for studying the Y chromosome for more labs, more species, and more individuals within those species.”
To demonstrate the utility of the gorilla Y chromosome sequence they generated, the researchers designed genetic markers that can be used to distinguish how closely male gorillas are related to one another and thus to aid conservation genetics efforts aimed at preserving this endangered species.
Other researchers from Penn State and from the University of Cambridge and the San Diego Zoo contributed to the study. The National Science Foundation, the Penn State Clinical and Translational Sciences Institute, the National Institutes of Health, the John and Beverly Stauffer Foundation, the Alice B. Tyler Charitable Trust, and the Leverhulme Trust funded the work.
Source: Penn State