In 2014, Indian sprinter Dutee Chand was barred from competing in the Commonwealth Games, an international sporting tournament between former territories of the British Empire. Chand hadn’t doped—a rampant problem in the sport—or consumed any illicit substances.
But officials deemed that Chand, who was not favored to win the competition, had a seemingly unfair advantage: her body naturally produced a lot of testosterone.
At least, that’s according to a regulation issued by the International Association of Athletics Federations (IAAF). Suspended in 2015 by the Court of Arbitration for Sport (CAS), the regulation was aimed at females whose bodies produce unusually high levels of androgens, a group of hormones that includes testosterone. Concerned in part that this condition, called hyperandrogenism, gave female athletes an unfair competitive advantage, the IAAF ruled that women with more than 10 nanomoles of testosterone per liter of blood would have to undergo surgery or take hormone suppressants before returning to competition, unless they could prove that their bodies were incapable of using the testosterone.
Men typically have much higher testosterone levels than women, and the IAAF believes that this is part of the reason why, on average, elite men outperform elite women by about 10% in track and field events. Take, for example, Genzebe Dibaba, an Ethiopian distance runner who holds five women’s world records between indoor and outdoor track and field. While Dibaba runs a blazing fast 4:13 mile, all twelve American high school boys at this year’s Brooks P.R. Invitational, the premier high school meet of the summer, bettered Dibaba’s fastest career time. It would seem that we segregate men and women in track and field events with good reason: otherwise, one of the best female milers the world has seen in years would struggle to compete in national meets at the high school level.
The IAAF argues that to uphold the integrity of women’s athletics, testosterone levels require greater scrutiny. The organization has until today to present evidence to the CAS that it hopes will convince the court to reinstate the regulation.
It’s expected that the IAAF will include in its package of evidence a study published earlier this month in the British Journal of Sports Medicine that links testosterone levels to athletic performance in a subset of women’s track and field events. The authors conclude that female athletes with high testosterone levels have a competitive advantage in these events, and they imply that the findings support the IAAF’s regulation. Other scientists argue that the study is limited and inconclusive, and that its claims about the research’s broader implications are unsubstantiated.
Most experts agree that the IAAF’s body of evidence is unlikely to convince the CAS to end the suspension. But the ban’s ethical complications are manifold—and that’s where opinions differ.
Just about everyone agrees that the sport’s governing body should regulate athletes’ use of unnatural substances, like steroids. But to what degree should they control what occurs naturally in people’s bodies?
The History of Sex Testing
Sorting people into one of two categories isn’t easy, because sex is not a neat binary of male and female, says Katrina Karkazis, an anthropologist, bioethicist, and senior researcher at Stanford University’s Center for Biomedical Ethics whose research focuses on intersex traits. Scientists used to think people could have either an X and a Y sex chromosome, and were therefore male, or two X chromosomes, and were therefore female. But as Karkazis explains in her 2012 paper in The American Journal of Bioethics, those aren’t the only two options. Some people are born with two X chromosomes and one Y chromosome. Also, not everyone has the same sex chromosome pairing in every cell: some people have some cells with an XY pairing and other cells with an XX pairing.
There are also at least five other markers of sex, including hormones, internal genitalia, and external genitalia, and the way that these markers—like chromosomes—express themselves is also not black and white. To further complicate things, a person might have one of these markers present as male, another present as female, and a third present as something in between—an intersex trait.
Sports’ governing bodies struggled for decades with what to do with athletes who can’t be sorted into one of these two categories, resulting in numerous horror stories. Jaime Schultz, a professor of kinesiology at Penn State, described in The Conversation “nude parades” in the 1960s in which female athletes were forced to present themselves naked to gynecologists at the 1966 European Championships and the 1967 Commonwealth Games. Later, the IAAF switched to chromosome testing, which relied on a less invasive but still ethically questionable cheek swab.
The IAAF and International Olympic Committee (IOC) ended systematic sex testing in the 1990s. But individual athletes have still been subject to scrutiny by the sport’s governing bodies, most recently Chand, the Indian sprinter, and South African 800-meter runner Caster Semenya. Because of her dominant performance at the 2009 World Championships and her stereotypically masculine build, Semenya, a teenager at the time, was “reportedly subjected to a two-hour examination during which doctors put her legs in stirrups and photographed her genitalia.” After the championships, “intensely intimate details about Semenya’s body became a topic for public debate and scrutiny,” Karkazis wrote in her 2012 paper.
After Chand appealed the 2014 ban, the CAS reinstated her and suspended the hyperandrogenism regulation. The CAS wrote in its decision that “the Hyperandrogenism Regulations are based on an implicit assumption that hyperandrogenic females enjoy a significant performance advantage over their non-hyperandrogenic peers, which outranks the influence of any other single genetic or biological factor.” It gave the IAAF two years to gather evidence supporting this assumption.
The IAAF Evidence
Joanna Harper, a medical physicist and the only transgender person ever to be an advisor to the IOC on matters of gender and sport, says that the IAAF is likely to include a study by Stéphane Bermon, a physician and exercise physiologist, that’s receiving a lot of public attention.
The study divided athletes in each track and field event at the 2011 and 2013 World Championships into three groups based on their fT levels (fT stands for free testosterone, or testosterone that the body can use). The authors then tested for a statistically significant difference between the average times of athletes in the upper third of fT levels and those in the lower third. The study concluded that “female athletes with high fT levels have a significant competitive advantage over those with low fT” in five track and field events, and that this advantage should be considered by those writing legislation.
While it’s published in the peer-reviewed and highly respected BJSM, the study has raised a few eyebrows. Schultz is skeptical of the research’s merit because of the authors’ ties to the IAAF. Bermon, the lead author, is a member of the IAAF’s Medical and Antidoping Commission, and co-author Pierre-Yves Garnier is the Director of the IAAF’s Health and Science Department. Schultz says that “the federation has been clear that it intends to return to court with proof that testosterone is linked to improved athletic performance,” so the IAAF has a vested interest in producing results that favor the regulation. “It’s very clear that the article is written with a purpose of…[defending] that particular regulation,” Karkazis echoes. “They have a dog in this fight.” Neither author was available for comment.
Beyond the conflict of interest, Garnier’s reputation has recently taken a hit: according to Reuters, the IAAF suspended Garnier last summer for allegedly receiving cash payments in a cover-up of Russian doping cases.
Three statisticians who did not contribute to the research—Dorit Hammerling of the National Center for Atmospheric Research, Joe Guinness of North Carolina State University, and Richard Smith of the University of North Carolina—say that the paper does have some statistical merits. They wrote in a group email with NOVA Next that the paper successfully demonstrates a measurable difference in performance between female athletes with high testosterone and those with low testosterone in the five relevant events.
But these findings do come with a few caveats. First, the authors ran tests on 43 events (21 women’s and 22 men’s events), so it’s possible that a few of the five statistically significant results are due to random error. Second, the authors didn’t account for the fact that almost a fifth of the female athletes competed in both World Championships, so a significant number of subjects are counted twice. The authors also only had access to one time and one fT level for each athlete, so they could only analyze a narrow sliver of each athlete’s career. Smith thought a different type of analysis would have been better given the limited amount of available data, but he said “that would require a much higher level of statistical expertise than is shown in the rest of the paper.” Smith also said he would have liked to see the analysis performed excluding known dopers, whose artificially boosted T levels could have skewed the results.
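To see why running 43 separate tests matters, here is a rough back-of-the-envelope sketch in Python. It is not the study's analysis. It assumes a conventional 0.05 significance threshold for each test, assumes the tests are independent of one another, and assumes testosterone has no real effect in any event. None of those assumptions comes from the paper, and the function names are purely illustrative.

```python
# Illustrative sketch only. Assumptions NOT taken from the study:
#   - each test uses a 0.05 significance threshold,
#   - the 43 tests are statistically independent,
#   - there is no true effect in any event (the "null" scenario).
ALPHA = 0.05
NUM_TESTS = 43  # 21 women's events + 22 men's events, per the article


def expected_false_positives(num_tests: int, alpha: float) -> float:
    """Average number of tests that come out 'significant' by chance alone."""
    return num_tests * alpha


def prob_at_least_one_false_positive(num_tests: int, alpha: float) -> float:
    """Chance that at least one test is 'significant' by chance alone."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float) -> float:
    """Stricter per-test threshold under the Bonferroni correction,
    one common (and conservative) way to account for multiple tests."""
    return alpha / num_tests


if __name__ == "__main__":
    print(f"Expected chance findings: {expected_false_positives(NUM_TESTS, ALPHA):.2f}")
    print(f"P(at least one chance finding): {prob_at_least_one_false_positive(NUM_TESTS, ALPHA):.2f}")
    print(f"Bonferroni per-test threshold: {bonferroni_threshold(NUM_TESTS, ALPHA):.4f}")
```

Under these assumptions, roughly two of the 43 tests would be expected to clear the bar by chance, which is consistent with the statisticians' caution that a few of the five significant results could be due to random error.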
While the authors are justified in arguing that they found five significant results, some of their other claims are harder to defend. Smith points out that there’s a difference between a statistically significant result and a practically significant result. Two runners can have different times without one having a significant competitive advantage over the other. And while most in the track and field community would consider the two-second difference found between low-T 800 meter runners and high-T 800 meter runners to be significant, that’s only the authors’ best guess for the time difference—it could be a difference as small as a fraction of a second, or as large as four or five seconds. As Smith says, “the broader implications are unclear.”
This distinction is pivotal for the IAAF. Karkazis says “the CAS didn’t say you needed a statistically significant finding, but a finding of a performance difference of a particular magnitude,” a magnitude that the study did not find. The three statisticians agree that the paper doesn’t “[settle] the issues regarding athletes such as Chand or Semenya whose eligibility to participate in female events has been challenged.”
Finally, while the authors make clear in one part of the paper that what they’ve found is merely a correlation and that they can’t conclude that the higher fT levels are causing the better performances, their implication that this evidence is in support of the regulation makes it sound like they actually have found causation. Karkazis says that these are conclusions that “support the regulation but which the science itself in the study doesn’t support.”
Testosterone and ‘Fair Sport’
Harper acknowledges the limitations of the study but argues that “you’d see even a more robust difference between the low T and the high T athletes” if the authors had more data to work with. Much of the performance difference between men and women can be ascribed to testosterone differences, she says.
But Karkazis begs to differ. “There are plenty of studies that show a much more complicated and equivocal relationship than what policy makers would like to claim,” she says.
Schultz agrees, saying that “so many different variables—internal and external—have to align for top performance.” But Harper argues that it’s misleading “to compare testosterone advantages to other natural advantages,” saying the advantages of testosterone far outstrip those of other biological components. She references as evidence studies conducted in Germany in the 1970s and 1980s showing that increases in exogenous testosterone were “spectacularly effective” at improving performances in elite female athletes.
The IAAF claims that the hyperandrogenism regulation protects the integrity of female athletics and promotes “fair sport.” But Karkazis says that by focusing on testosterone, a single biological component, the IAAF overlooks not just other biological factors, but also inevitable social and economic inequalities in athletics. Athletes in richer countries, for example, have access to better training facilities, coaches, and equipment, and can afford to dedicate more time to training. Professional athletics will always be littered with people who have incredible genetic gifts and socioeconomic benefits. The existence of professional sports itself may even depend on that un-level playing field.
Regardless of where experts align themselves in the debate over testosterone, most seem to agree that the IAAF’s package of evidence will not be enough to convince the CAS to reinstate the hyperandrogenism regulation. Ross Tucker, an exercise physiologist and strong advocate for the regulation, wrote on his blog “The Science of Sport” following the publication of the new Bermon study that “the IAAF evidence does not go far enough, either in terms of the depth or the range.”
“To reduce an athlete’s excellence to her androgen serum level or to disqualify an athlete because of her biology doesn’t strike me as appropriate,” says Schultz. Karkazis, too, argues that women shouldn’t be forced to undergo medical treatment just so they can continue to compete. “The athletes that we’re talking about are no different than any of the other women who are born and lived as women for their entire lives. And I just cannot for the life of me see a reason to treat those women differently.”