Cotton Story: The Crop That Fuelled Empires and Slavery Finally Reveals Where It Started

For decades, scientists knew that upland cotton, the species behind around 90% of the world's natural textile fibre, was domesticated somewhere in the Americas, but could not say exactly where. A new genomic study has now provided the answer, tracing the crop's origins to the coastal scrublands of northwestern Yucatán, where farmers worked thousands of years before the Maya.

Long Story, Cut Short
  • A genomic study has traced the domestication of upland cotton to the coastal scrublands of northwestern Yucatán, resolving a decades-long scientific mystery about the crop's origins.
  • Wild cotton populations in northwestern Yucatán carry nearly twice the genetic diversity of modern cultivars, representing a living reservoir of variation discarded during centuries of selective breeding.
  • The domestication of upland cotton was a gradual process involving thousands of small genetic changes accumulated over millennia, with no single gene responsible for the transformation.
WILD ORIGINS: The coastal scrublands of northwestern Yucatán harbour wild cotton plants whose DNA contains the most complete record of the crop's domestication history anywhere on earth. Mark Stebnicki / Pexels

By any measure of economic history, cotton is one of the most fought-over materials on earth. It dressed the modern world, fuelled the slave trade, triggered wars, and built the fortunes of empires. It is in the shirt on your back, the sheets on your bed, the bandage on a wound, the currency note in your wallet. Upland cotton, the species Gossypium hirsutum, supplies around 90% of the world's natural textile fibre. And until very recently, nobody could say with scientific certainty where it came from.

Scientists knew, roughly, that it originated somewhere in the Americas. They had hypotheses and fragmentary evidence pointing toward Mexico. But the precise location, and the full story of how a scraggly coastal shrub was transformed into the crop that shaped the modern world, remained stubbornly out of reach. The tools available produced blurry answers. And the fieldwork needed to collect truly wild plants across their full native range, while excluding cultivated plants that had escaped, gone feral, and reverted to resembling their ancestors, had never been done at scale.

That distinction matters more than it might seem. Wild cotton and feral cotton can look identical in the field. A cultivated plant that escapes into scrubland and breeds freely across generations will gradually shed the visible markers of domestication, developing smaller bolls and coarser fibre, until it resembles its ancient ancestors closely enough to fool the eye. Only DNA can reliably tell them apart. That is precisely why the origin story of the world's most important natural fibre remained unresolved for so long, and why resolving it required building the most comprehensive genomic picture of wild cotton ever assembled.

The answers are contained in a new genomic study, 'Genomic Diversity and the Domestication History of Cotton (Gossypium hirsutum),' published recently in Proceedings of the National Academy of Sciences. The study was led by Corrinne Grover, a geneticist and evolutionary biologist at Iowa State University, and Jonathan Wendel, a botanist and evolutionary biologist at the same institution, alongside a research team drawn from institutions across the United States, Mexico, Switzerland, and China.

By sequencing the DNA of 299 newly collected wild cotton plants and combining them with previously sequenced samples, so that the dataset spanned the species' entire native range from the Florida Keys to the Caribbean islands to the coasts of Mexico, the team traced the origin of domesticated cotton to a specific stretch of coastline in northwestern Yucatán, Mexico. There, somewhere between 4,000 and 7,000 years ago, Stone Age farmers who predate the Maya began selecting and cultivating a wild, multi-branched shrub that produced small bolls of short, brownish fibre that no modern textile mill would recognise as cotton. Over thousands of years, through patient, cumulative selection, they transformed it into the ancestor of every cotton plant grown commercially on earth today.

The study does not merely identify a place on a map. It reconstructs a process, one of the most consequential agricultural transformations in human history, and in doing so opens a question that reaches well beyond the past. The wild cotton plants still growing along that Yucatán coastline carry nearly twice the genetic diversity of the cultivars grown in fields today. That diversity, accumulated over millennia and discarded by centuries of selective breeding, may turn out to matter enormously to the future of a crop increasingly stressed by climate, pests, and the narrowing of its own genetic inheritance.

A Crop Without a Birthplace

Farmers grow cotton across roughly 35 million hectares worldwide. Traders move it across commodity markets in New York and Zhengzhou. Textile mills in Bangladesh, India, and Vietnam spin it into yarn by the millions of tonnes. Yet the plant behind all of it has become almost invisible. Its biological origins (where it was first domesticated, by whom, and through what process) have remained, until now, genuinely unclear.

The species in question is Gossypium hirsutum, known as upland cotton, which accounts for around 90% of global cotton fibre production. It is not the only domesticated cotton. Gossypium barbadense, the long-staple Pima cotton prized by luxury textile makers, was independently domesticated in South America, near what is now Guayaquil in Ecuador, around 8,000 years ago. Two other species, Gossypium arboreum from the Indian subcontinent and Gossypium herbaceum from sub-Saharan Africa and the Arabian Peninsula, make up most of the rest. But G. hirsutum is overwhelmingly the dominant one. Every standard cotton T-shirt, every hospital bedsheet, every reel of cotton thread in a sewing box almost certainly derives from it. The question of where G. hirsutum was first brought under human cultivation is not an academic footnote. It is the origin story of one of the most economically significant plants in human history.

"Wild cotton plants are woody, multibranched shrubs or small trees, long-lived, with relatively sparse flowering and smaller flowers, fruits and seeds than under cultivation," Wendel told Reuters. "Members of some human groups must have taken an interest in the wild forms," he added. That interest set in motion the process of domestication, from which the modern crop form arose over thousands of years of slow, gradual improvement.

Earlier studies, using older and less precise genetic tools, had pointed broadly toward the northern Yucatán Peninsula as the likely cradle of domestication. But the evidence was thin. Sampling of truly wild populations had been sparse, and the signal from the available data was too weak to pinpoint a specific region with confidence. The problem was compounded by a peculiarity of cotton's long history of cultivation: domesticated plants that escape into the wild can, over generations, shed enough of their cultivated characteristics to look phenotypically wild, resembling wild plants in their physical appearance, without actually being so. Telling a genuinely wild plant from a feral escapee in the field is fiendishly difficult. Only genomic analysis, reading the DNA directly, can do it reliably.

This is what made the new study's approach different. The research team collected 299 plants from across the full native range of wild G. hirsutum: 158 from the northern Yucatán Peninsula and 141 from coastal Florida, supplemented by previously sequenced samples from Caribbean islands including Puerto Rico and Guadeloupe. Each plant was genomically verified as genuinely wild before being included in the analysis. The result was the most complete and geographically expansive dataset of wild cotton genomes ever assembled, a resource that finally made it possible to ask, and answer, where this crop actually began.

What the data revealed was not simply a location. It was a portrait of a domestication process: gradual, cumulative, and far more complex than the sudden revolution that popular accounts of agricultural origins often imply. "Early farmers saw potential in this sprawling plant with hairy seeds as a source for soft materials," Grover notes. Early weavers could spin its fibre by hand into cloth, fish nets, and rope. The transformation left clear genomic traces, all pointing unmistakably to one specific corner of the Yucatán coast.

Upland cotton has been grown, traded, and fought over for millennia, supplying around 90% of the world's natural textile fibre and shaping the economic history of entire civilisations. Vie Studio / Pexels

The Coast That Holds Everything

Drive along the northern coast of Yucatán and you will not immediately recognise what you are looking at. The plants growing in the coastal scrubland are sprawling, multi-branched shrubs, chest-high in places, with small leaves and modest flowers that shade from pale yellow to pink as they age. The bolls they produce are small, and the fibre inside them is short and brownish, bearing no resemblance to the fat white tufts that spill from a commercial cotton field. These are not failed cotton plants. They are the original ones. And their DNA, it turns out, contains the most complete record of cotton's origins anywhere on earth.

The key concept here is genetic diversity, a measure of how much variation exists within a population's DNA. Think of it as biological memory. A population that has remained large, stable, and connected over thousands of years accumulates variation across generations; mutations arise, spread, and persist, building up a reservoir of genetic difference. A population that has been cut off from others, reduced in size, or pushed through a genetic bottleneck loses variation. A bottleneck is a severe reduction in population size that strips away much of the gene pool in a single generation, leaving survivors to carry only a fraction of what was there before. Diversity, once lost, does not come back easily.

By this measure, the wild cotton populations of northwestern Yucatán are extraordinary. Their nucleotide diversity, measured as the degree to which individual plants differ from one another at the level of their DNA sequence, is the highest recorded anywhere in the species' native range. Two plants drawn at random from this population will differ, on average, at nearly twice as many positions in their DNA as any two modern cultivars grown in a commercial cotton field today. Centuries of human selection, narrowing the gene pool in pursuit of whiter, longer, more abundant fibre, have left the crop genetically impoverished relative to its wild ancestors. "We know that domestication often leads to a loss of genetic diversity as early farmers were selecting for valuable traits, and then to further reductions as crop improvement intensified the selection pressure," Grover explains.
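
For readers curious about what that measurement involves, here is a minimal Python sketch of nucleotide diversity: the average number of differences per site across every pair of sequences in a sample. It is an illustration under stated assumptions, not the study's pipeline. The ten-letter sequences are invented toy data, and the function names are hypothetical; real analyses work on aligned genome-scale data.

```python
from itertools import combinations
from typing import Sequence


def pairwise_differences(a: str, b: str) -> int:
    """Count the positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def nucleotide_diversity(seqs: Sequence[str]) -> float:
    """Average per-site difference over all pairs of sequences (often written as pi)."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total = sum(pairwise_differences(a, b) for a, b in pairs)
    return total / (len(pairs) * length)


# Invented toy sequences (assumption: NOT data from the study).
wild = ["ACGTACGTAC", "ACGTTCGTAC", "ACCTACGTAA", "ACGTACGAAC"]
cultivated = ["ACGTACGTAC", "ACGTTCGTAC", "ACGTACGAAC", "ACGTACGTAC"]

pi_wild = nucleotide_diversity(wild)
pi_cultivated = nucleotide_diversity(cultivated)

print(f"wild pi       = {pi_wild:.2f}")
print(f"cultivated pi = {pi_cultivated:.2f}")
```

In this toy sample the wild set scores 0.20 and the cultivated set 0.10. The twofold gap was chosen deliberately to echo the ratio described above; it says nothing about the actual values in the study.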

Every major form of analysis the researchers applied returned to the same address. Population structure analysis, a method for identifying genetically distinct groups within a species, placed the northwestern Yucatán plants closest to domesticated cotton. Phylogenetic analysis, which reconstructs evolutionary family trees from DNA, nested the domesticated gene pool within the northwestern Yucatán population. Plastome analysis told the same story. The plastome is the DNA of the chloroplast, the structure inside plant cells that converts sunlight into energy, inherited independently of the main genome and therefore a separate corroborating line of evidence. Even chemical profiling of wild plants along the Yucatán coast provided independent confirmation: a gradient in the plants' secondary chemistry runs from west to east, with western plants chemically closest to domesticated cotton. Every line of inquiry pointed to the same coast.

The picture that emerges from this convergence of evidence is precise enough to locate on a map. The domestication of upland cotton began in the northwestern corner of the Yucatán Peninsula, in the coastal scrublands that still exist there today. From that origin point, early domesticated forms spread gradually across Central America and the Caribbean, each dispersal accompanied by a loss of genetic diversity. Caribbean island populations, for instance, show only around one-quarter of the diversity found in northwestern Yucatán. The bottleneck of island colonisation is written permanently into their DNA.

Modern cultivars carry an additional complication. Roughly 13 to 14% of their genomes derive not from G. hirsutum at all, but from G. barbadense, the South American cotton species independently domesticated thousands of years earlier. As the two species spread across overlapping territories in the Caribbean, they interbred, intentionally and otherwise, and fragments of G. barbadense DNA were absorbed into the hirsutum gene pool. It is partly this genetic mixing that explains why modern cultivars do not sit neatly inside the wild Yucatán family tree. They are the product of a long, complicated history, one that runs well beyond genomics.

Cotton's spread across the world after the Spanish conquests of the 16th century was built on slavery, the exploitation of Indigenous peoples, and imperial expansion. The invention of the cotton gin at the end of the 18th century accelerated that history, making cotton farming enormously profitable and deepening the demand for enslaved labour across the American South. "Cotton has a complicated history, most notably its association with slavery, exploitation of Indigenous peoples and imperial expansion. But it is also an enduring crop, one that is woven into the lives of people worldwide," Grover observes.

What the study also found, and what carries perhaps the greatest consequence for the future, is that the domestication process itself left no single dramatic fingerprint. There are no major domestication genes, no single genetic switch that, when flipped, turned a wild plant into a crop. Instead, the researchers found thousands of small genetic changes distributed across the genome, each with a modest individual effect, accumulating slowly over millennia, consistent with how other major crops, including wheat and rice, were shaped by human selection. Cotton was coaxed into existence, incrementally, across thousands of years of patient human attention.

The fibres are single cells, each stretched to extraordinary length through millennia of selection, producing the fine, white staple the global textile industry depends on. "The fibers themselves are just single-celled seed hairs, but are among the most exaggerated and remarkable cells in plants," Wendel notes. "Research is showing that the process of domestication, of transforming these short, coarse and brownish fibers into the fine, white and superior textile we know today likely involves many genes operating in a complex symphony," Grover points out.

What the Coast Still Holds

Pull a cotton boll apart today and what you hold is the product of a conversation between farmers and a plant that has lasted as long as 7,000 years, carrying, on average, half the genetic diversity of the wild plants still growing on the Yucatán coast. That coastal scrubland is not a relic. It is a living seed bank, holding variation that thousands of years of human selection discarded in pursuit of whiter, longer, more abundant fibre. Plant breeders, facing a crop increasingly stressed by drought, heat, and pests, may urgently need that variation. Whether that scrubland remains intact long enough to be useful is a question no genome can answer.

Cotton by Numbers
  • Upland cotton (G. hirsutum) accounts for around 90% of global natural textile fibre production, making it the dominant fibre crop on earth.
  • The crop is grown across roughly 35 million hectares worldwide, with China, India, the United States, and Brazil the leading producers.
  • Domestication of upland cotton began somewhere between 4,000 and 7,000 years ago in the coastal scrublands of northwestern Yucatán.
  • Modern cultivars carry, on average, half the genetic diversity of wild cotton plants still growing on the Yucatán coast today.
  • Around 13 to 14% of the genomes of modern cotton cultivars derive from Gossypium barbadense, a separate species domesticated independently in South America.

Study at a Glance
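  • The study sequenced the DNA of 299 newly collected wild plants, 158 from the northern Yucatán Peninsula and 141 from coastal Florida, supplemented by previously sequenced samples from Caribbean islands including Puerto Rico and Guadeloupe, to cover the full native range of Gossypium hirsutum.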
  • Researchers used five independent analytical methods, including population structure analysis, phylogenetic analysis, plastome analysis, and chemical profiling, all of which pointed to northwestern Yucatán.
  • Caribbean island populations showed only around one-quarter of the genetic diversity found in northwestern Yucatán, reflecting the bottleneck of island colonisation.
  • The study found no single domestication gene responsible for cotton's transformation, confirming that the process involved thousands of small genetic changes accumulated over millennia.
  • The research was led by Corrinne Grover and Jonathan Wendel of Iowa State University, with collaborators from institutions across the United States, Mexico, Switzerland, and China.

Subir Ghosh

SUBIR GHOSH is a Kolkata-based independent journalist-writer-researcher who writes about environment, corruption, crony capitalism, conflict, wildlife, and cinema. He is the author of two books and has co-authored two more. He writes, edits, reports and designs. He is also a professionally trained and qualified photographer.

Date posted: 22 May 2026. Last modified: 22 May 2026