Computer scientists in the US have applied a bioinformatics workflow to reconstruct one of the most complete genomes of a top cotton species—African domesticated Gossypium herbaceum cultivar Wagad.
- The results give scientists a more complete picture of how wild cotton was domesticated over time and may help to strengthen and protect the crop for farmers in the US, Africa and elsewhere.
- The findings were recently published in the journal G3: Genes, Genomes, Genetics.
Thiru Ramaraj, assistant professor of computer science in DePaul’s Jarvis College of Computing and Digital Media, is lead author.
- The National Science Foundation funded the research.
The Project: The work is part of a collaboration that includes Jonathan Wendel, distinguished professor in the Department of Ecology, Evolution, and Organismal Biology at Iowa State University; and Joshua Udall, research leader for the Crop Germplasm Research Unit at the US Department of Agriculture (USDA) Agricultural Research Service.
The Context: Cotton is the primary source of natural fibre on Earth, yet only four of 50 known species are suitable for textile production.
- Wagad cotton is a diploid strain grown predominantly in African countries.
- The team’s work began with crunching DNA sequence data: they assembled high-quality long-read DNA sequence data, generated using Pacific Biosciences sequencing technology, into an initial draft of the Wagad genome.
- Next, whole-genome maps from Bionano Genomics were used to order and orient the initial assembly. Finally, Hi-C sequence data from Phase Genomics were used to construct a chromosome-level genome.
- Ramaraj then turned to Azalea Mendoza, a graduate student in computer science who also holds a bachelor’s degree in environmental studies from DePaul.
- Mendoza began by researching the history of cotton to zoom out and understand “the big picture.” No matter where cotton is grown, it’s primarily used for fibre.
- Using comparative genomics, she looked for variations between the Wagad genome and those of its closest relative and an outgroup. Mendoza also delved into annotated genes and noted their functions.
- They found many genes related to fibre content.
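To make the "order and orient" step mentioned above more concrete, here is a minimal toy sketch in Python. It is not the team's pipeline or any vendor's software: the contig names, sequences, placements and gap length are all hypothetical, and real scaffolding tools infer the placements from map or Hi-C data rather than taking them as input.

```python
# Toy illustration only: all names and data below are hypothetical.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def build_scaffold(contigs: dict, placements: list, gap: str = "NNN") -> str:
    """Join contigs in the given order and orientation, separated by gaps.

    placements is a list of (contig_name, strand) pairs, where strand is
    "+" (use the contig as assembled) or "-" (use its reverse complement).
    """
    pieces = []
    for name, strand in placements:
        seq = contigs[name]
        pieces.append(seq if strand == "+" else reverse_complement(seq))
    return gap.join(pieces)


# Hypothetical contigs from an initial assembly, in arbitrary order.
contigs = {"ctg_a": "AAACCC", "ctg_b": "GGGTTA", "ctg_c": "TTGACA"}

# Hypothetical order and orientation, as a genome map might suggest.
placements = [("ctg_b", "+"), ("ctg_a", "-"), ("ctg_c", "+")]

scaffold = build_scaffold(contigs, placements)
print(scaffold)
```

The point of the sketch is only that a map supplies two pieces of information per contig, its position and its strand, and that the gaps between contigs remain unknown sequence until further data fills them.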
WHAT THEY SAID:
The power of this technology is it allows us to create high-quality genomes that supply a level of detail that simply wasn’t possible before. This opens up the possibility for more researchers to sequence many crops that are important to the global economy and to feeding the population.
— Thiru Ramaraj
Assistant Professor of Computer Science, Jarvis College of Computing and Digital Media
DePaul University