Many claims about sex differences lack scientific rigor

Many scientific claims about differences between males and females aren’t backed by data, research finds.

An analysis of published studies from a range of biological specialties shows that, when researchers report data by sex, critical statistical analyses are often missing and the findings are likely to be reported in misleading ways.

The analysis appears in the journal eLife. Neuroscientists at Emory University examined studies from nine different biological disciplines that involved either human or animal subjects.

“We found that when researchers report that males and females respond differently to a manipulation such as a drug treatment, 70% of the time the researchers have not actually compared those responses statistically at all,” says senior author Donna Maney, professor of neuroscience in Emory University’s psychology department. “In other words, an alarming percentage of claims of sex differences are not backed by sufficient evidence.”

Lots of claims, but how about evidence?

In the articles missing the proper evidence, she adds, sex-specific effects were claimed in nearly 90% of the cases. In contrast, authors who tested statistically for sex-specific effects reported them only 63% of the time.

“Our results suggest that researchers are predisposed to finding sex differences and that sex-specific effects are likely over-reported in the literature,” Maney says.

This particular problem is common and pertains to Maney’s own previous work. “Once I realized how prevalent it is, I went back and checked my own published articles and there it was,” she says. “I myself have claimed a sex difference without comparing males and females statistically.”

Maney stresses that the problem should not be discounted just because it is common. It is becoming increasingly serious, she says, because of mounting pressure from funding agencies and journals to study both sexes, and growing interest within the medical community in developing sex-specific treatments.

Maney is a behavioral neuroendocrinologist interested in how research on sex differences shapes public opinion and policy. Rigorous standards are needed, she says, to ensure that people of all genders have access to care that is appropriate for them.

Better training and oversight are needed to ensure scientific rigor in research on sex differences, write Maney and coauthor Yesenia Garcia-Sifuentes, an Emory PhD candidate in the Graduate Program in Neuroscience: “We call upon funding agencies, journal editors, and our colleagues to raise the bar when it comes to testing for and reporting sex differences.”

Sex differences and bias

Historically, biomedical research has often included just one sex, usually males. In 1993, Congress wrote a policy into law to ensure that women are included in clinical studies funded by the National Institutes of Health whenever feasible, and that the studies are designed so that it is possible to analyze whether the variables being studied affect women differently than other participants.

In 2016, the NIH announced a policy that also requires the consideration of sex as a biological variable when feasible in basic biological studies that it funds, whether that research involves animals or humans.

“If you’re trying to model anything relevant to a general population, you should include both sexes,” Maney explains. “There are a lot of ways that animals can vary, and sex is one of them. Leaving out half of the population makes a study less rigorous.”

As more studies consider sex-based differences, Maney adds, it is important to ensure that the methods underlying their analyses are sound.

From giraffes to humans

For the analysis, Garcia-Sifuentes and Maney looked at 147 studies published in 2019 to investigate what is typically used as evidence of sex differences. The studies ranged across nine different biological disciplines and included everything from field studies on giraffes to immune responses in humans.

The studies that they analyzed all included both males and females and separated the data by sex. Garcia-Sifuentes and Maney found that the sexes were compared, either statistically or by assertion, in 80% of the articles. Of those articles, 70% reported sex differences, and about half of those treated the difference as a major finding.

Some of the studies that reported a sex difference, however, committed a statistical error. For example, if researchers found a statistically significant effect of a treatment on one sex but not the other, they typically concluded that there was a sex difference even if the effect of the treatment was not compared statistically between males and females.

The problem with that approach is that the statistical tests conducted on each sex can’t give “yes” or “no” answers about whether the treatment had an effect.

“Comparing the outcome of two independent tests is like comparing a ‘maybe so’ with an ‘I don’t know’ or ‘too soon to tell,’” Maney explains. “You’re just guessing. To show actual evidence that the response to treatment differed between females and males, you need to show statistically that the effect of treatment depended on sex. That is, to claim a ‘sex-specific’ effect, you must demonstrate that the effect in one sex was statistically different from the effect in the other.”
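To make the distinction concrete, here is a minimal sketch in Python of the comparison Maney describes. It is not the method used in the eLife analysis. The measurements are made up for illustration, the function names are hypothetical, and the p-values rely on a normal approximation that is too optimistic for groups this small; a real analysis would more likely test the sex-by-treatment interaction in an ANOVA or regression model.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def effect_and_se(treated, control):
    """Return (treated mean minus control mean, its standard error)."""
    diff = mean(treated) - mean(control)
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return diff, se


def two_sided_p(z):
    """Two-sided p-value under a normal approximation (an assumption)."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def sex_by_treatment_test(f_treated, f_control, m_treated, m_control):
    """Test each sex separately, then test whether the two effects differ."""
    f_eff, f_se = effect_and_se(f_treated, f_control)
    m_eff, m_se = effect_and_se(m_treated, m_control)
    # The comparison that is often skipped: the difference between the
    # two treatment effects, with its own standard error.
    diff = f_eff - m_eff
    diff_se = sqrt(f_se**2 + m_se**2)
    return {
        "p_female": two_sided_p(f_eff / f_se),
        "p_male": two_sided_p(m_eff / m_se),
        "p_interaction": two_sided_p(diff / diff_se),
    }


# Hypothetical measurements, invented for this example only.
female_control = [10, 12, 11, 13, 9, 11]
female_treated = [13, 15, 12, 14, 16, 14]
male_control = [10, 13, 9, 14, 8, 12]
male_treated = [12, 16, 10, 15, 11, 14]

result = sex_by_treatment_test(
    female_treated, female_control, male_treated, male_control
)
for name, p in result.items():
    print(f"{name}: {p:.3f}")
```

With these invented numbers, the treatment effect clears the conventional 0.05 threshold in females and misses it in males, yet the direct comparison of the two effects does not come close to significance. That is the pattern the analysis flags: one "significant" and one "not significant" result do not, by themselves, show a sex difference.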

On the flip side, the analysis also encountered strategies that could mask sex differences, such as pooling data from males and females without testing for a difference. Maney recommends reporting the size of the difference—that is, the extent to which the sexes don’t overlap—before pooling data. She provides a free online tool that lets researchers visualize their data to assess the size of the difference.
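One common way to express the size of a difference is a standardized effect size such as Cohen's d, which can be converted into an approximate overlap between the two groups. The sketch below is not Maney's online tool. The data and function names are hypothetical, and the overlap formula assumes both groups are normally distributed with equal variance.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def cohens_d(a, b):
    """Standardized mean difference between groups a and b (pooled SD)."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def overlap_from_d(d):
    """Overlapping coefficient of two equal-variance normal distributions.

    Returns a value between 0 (no overlap) and 1 (identical distributions).
    """
    return 2 * NormalDist().cdf(-abs(d) / 2)


# Hypothetical measurements, invented for this example only.
females = [11, 13, 12, 14, 10, 12]
males = [10, 12, 11, 13, 9, 11]

d = cohens_d(females, males)
overlap = overlap_from_d(d)
print(f"Cohen's d: {d:.2f}, estimated overlap: {overlap:.0%}")
```

Reporting a number like this alongside pooled data lets readers judge for themselves whether combining the sexes was reasonable.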

“At this moment in history, the stakes are high,” Maney says. “Misreported findings may affect health care decisions in dangerous ways. Particularly in cases where sex-based differences may be used to determine what treatment someone gets for a particular condition, we need to proceed cautiously. We need to hold ourselves to a very high standard when it comes to scientific rigor.”

Source: Emory University