June 1, 2021

Complete telomere-to-telomere de novo assembly of the Plasmodium falciparum genome using long-read sequencing

Sequence-based estimation of genetic diversity of Plasmodium falciparum, the most lethal malarial parasite, has proved challenging due to the lack of a complete genomic assembly. The skewed AT-richness (~80.6% (A+T)) of its genome and the lack of technology to assemble highly polymorphic sub-telomeric regions that contain clonally variant, multigene virulence families (e.g., var and rifin) have confounded attempts using short-read NGS technologies. Using single molecule, real-time (SMRT) sequencing, we successfully compiled all 14 nuclear chromosomes of the P. falciparum genome from telomere to telomere in single contigs. Specifically, amplification-free sequencing generated reads of average length 12 kb, with ≥50% of the reads between 15.5 and 50 kb in length. The hierarchical genome assembly process (HGAP) was used to assemble the P. falciparum genome de novo. This assembly accurately resolved centromeres (~90-99% (A+T)) and sub-telomeric regions, and identified large insertions and duplications in the genome that added extra genes to the var and rifin virulence families, along with smaller structural variants such as homopolymer tract expansions. These regions can be used as markers for genetic diversity during comparative genome analyses. Moreover, identifying the polymorphic and repetitive sub-telomeric sequences of parasite populations from endemic areas might inform the link between structural variation and phenotypes such as virulence, drug resistance and disease transmission.
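To make the A+T percentages quoted in this abstract concrete, the following is a minimal sketch of how A+T content might be computed for a whole sequence and in fixed-size windows. It is an illustration only and not the authors' pipeline; the function names `at_fraction` and `windowed_at`, the window size and the toy sequence are all hypothetical.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A/T bases among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    at = sum(seq.count(base) for base in "AT")
    gc = sum(seq.count(base) for base in "GC")
    total = at + gc
    # Sequences with no unambiguous bases (empty, or all N) yield 0.0.
    return at / total if total else 0.0


def windowed_at(seq: str, window: int, step: int):
    """Yield (start, A+T fraction) for each full window along the sequence."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, at_fraction(seq[start:start + window])


# Hypothetical toy sequence (not real P. falciparum data):
# a mixed stretch followed by an AT-only stretch.
toy = "ATGCATGCAT" + "ATATTTAAAT"
profile = list(windowed_at(toy, window=10, step=10))
```

A windowed profile of this kind is one simple way to see where extremely AT-rich stretches, such as the centromeres described above, sit along a chromosome; a real analysis would read sequences from an assembly file and use much larger windows.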

April 21, 2020

Progression of the canonical reference malaria parasite genome from 2002-2019.

Here we describe the ways in which the sequence and annotation of the Plasmodium falciparum reference genome have changed since its publication in 2002. As the malaria species responsible for the most deaths worldwide, the richness of annotation and accuracy of the sequence are important resources for the P. falciparum research community as well as the basis for interpreting the genomes of subsequently sequenced species. At the time of publication in 2002, over 60% of predicted genes had unknown functions. As of March 2019, this number had decreased to 33%. The reduction is due to the inclusion of genes that were subsequently characterised experimentally and genes with significant similarity to others with known functions. In addition, the structural annotation of genes has been significantly refined; 27% of gene structures have been changed since 2002, comprising changes in exon-intron boundaries, addition or deletion of exons and the addition or deletion of genes. The sequence has also undergone significant improvements. In addition to the correction of a large number of single-base and insertion or deletion errors, a major misassembly between the subtelomeres of chromosomes 7 and 8 has been corrected. As the number of sequenced isolates continues to grow rapidly, a single reference genome will not be an adequate basis for interpreting intra-species sequence diversity. We therefore describe in this publication a population reference genome of P. falciparum, called Pfref1. This reference will enable the community to map to regions that are not present in the current assembly. P. falciparum 3D7 will continue to be maintained, with ongoing curation ensuring continual improvements in annotation quality.

July 19, 2019

Population structure of mitochondrial genomes in Saccharomyces cerevisiae.

Rigorous study of mitochondrial functions and cell biology in the budding yeast, Saccharomyces cerevisiae, has advanced our understanding of mitochondrial genetics. This yeast is now a powerful model for population genetics, owing to large genetic diversity and highly structured populations among wild isolates. Comparative mitochondrial genomic analyses between yeast species have revealed broad evolutionary changes in genome organization and architecture. A fine-scale view of recent evolutionary changes within S. cerevisiae has not been possible due to low numbers of complete mitochondrial sequences. To address challenges of sequencing AT-rich and repetitive mitochondrial DNAs (mtDNAs), we sequenced two divergent S. cerevisiae mtDNAs using a single-molecule sequencing platform (PacBio RS). Using de novo assemblies, we generated highly accurate complete mtDNA sequences. These mtDNA sequences were compared with 98 additional mtDNA sequences gathered from various published collections. Phylogenies based on mitochondrial coding sequences and intron profiles revealed that intraspecific diversity in mitochondrial genomes generally recapitulated the population structure of nuclear genomes. Analysis of intergenic sequence indicated a recent expansion of mobile elements in certain populations. Additionally, our analyses revealed that certain populations lacked introns previously believed conserved throughout the species, as well as the presence of introns never before reported in S. cerevisiae. Our results revealed that the extensive variation in S. cerevisiae mtDNAs is often population specific, thus offering a window into the recent evolutionary processes shaping these genomes. In addition, we offer an effective strategy for sequencing these challenging AT-rich mitochondrial genomes for small scale projects.

July 19, 2019

Complete telomere-to-telomere de novo assembly of the Plasmodium falciparum genome through long-read (>11 kb), single molecule, real-time sequencing.

The application of next-generation sequencing to estimate genetic diversity of Plasmodium falciparum, the most lethal malaria parasite, has proved challenging due to the skewed AT-richness [~80.6% (A + T)] of its genome and the lack of technology to assemble highly polymorphic subtelomeric regions that contain clonally variant, multigene virulence families (e.g., var and rifin). To address this, we performed amplification-free, single molecule, real-time sequencing of P. falciparum genomic DNA and generated reads of average length 12 kb, with 50% of the reads between 15.5 and 50 kb in length. Next, using the Hierarchical Genome Assembly Process, we assembled the P. falciparum genome de novo and successfully compiled all 14 nuclear chromosomes telomere-to-telomere. We also accurately resolved centromeres [~90-99% (A + T)] and subtelomeric regions and identified large insertions and duplications that add extra var and rifin genes to the genome, along with smaller structural variants such as homopolymer tract expansions. Overall, we show that amplification-free, long-read sequencing combined with de novo assembly overcomes major challenges inherent to studying the P. falciparum genome. Indeed, this technology may not only identify the polymorphic and repetitive subtelomeric sequences of parasite populations from endemic areas but may also evaluate structural variation linked to virulence, drug resistance and disease transmission. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
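The read-length figures in this abstract (an average length and the share of reads falling within a length range) are simple summary statistics. As a hedged sketch of how such numbers might be derived from a list of read lengths, the helper `read_length_summary` and the toy lengths below are hypothetical and are not taken from the paper or from any sequencing toolkit.

```python
from statistics import mean


def read_length_summary(lengths, low, high):
    """Return (mean length, fraction of reads with low <= length <= high)."""
    if not lengths:
        raise ValueError("no read lengths supplied")
    in_range = sum(1 for n in lengths if low <= n <= high)
    return mean(lengths), in_range / len(lengths)


# Made-up read lengths in bases, for illustration only (not data from the study).
toy_lengths = [4_000, 8_000, 16_000, 20_000]
avg, frac = read_length_summary(toy_lengths, low=15_500, high=50_000)
```

In practice the lengths would come from the sequencing run's read files, and a length-weighted statistic such as N50 is often reported alongside the plain mean.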

July 7, 2019

A novel type pathway-specific regulator and dynamic genome environments of solanapyrone biosynthesis gene cluster in the fungus Ascochyta rabiei.

Secondary metabolite genes are often clustered together and situated in particular genomic regions, like the subtelomere, that can facilitate niche adaptation in fungi. Solanapyrones are toxic secondary metabolites produced by fungi occupying different ecological niches. Full-genome sequencing of the ascomycete Ascochyta rabiei revealed a solanapyrone biosynthesis gene cluster embedded in an AT-rich region proximal to a telomere end and surrounded by Tc1/Mariner-type transposable elements. The highly AT-rich environment of the solanapyrone cluster is likely the product of repeat-induced point mutations. Several secondary metabolism-related genes were found in the flanking regions of the solanapyrone cluster. Although the solanapyrone cluster appears to be resistant to repeat-induced point mutations, a P450 monooxygenase gene adjacent to the cluster has been degraded by such mutations. Among the six solanapyrone cluster genes (sol1 to sol6), sol4 encodes a novel type of Zn(II)2Cys6 zinc cluster transcription factor. Deletion of sol4 resulted in the complete loss of solanapyrone production but did not compromise growth, sporulation, or virulence. Gene expression studies with the sol4 deletion and sol4-overexpressing mutants delimited the boundaries of the solanapyrone gene cluster and revealed that sol4 is likely a specific regulator of solanapyrone biosynthesis and appears to be necessary and sufficient for induction of the solanapyrone cluster genes. Despite the dynamic surrounding genomic regions, the solanapyrone gene cluster has maintained its integrity, suggesting important roles of solanapyrones in fungal biology. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

July 7, 2019

Characterization of tet(Y)-carrying LowGC plasmids exogenously captured from cow manure at a conventional dairy farm.

Manure from dairy farms has been shown to contain diverse tetracycline resistance genes that are transferable to soil. Here, we focus on conjugative plasmids that may spread tetracycline resistance at a conventional dairy farm. We performed exogenous plasmid isolation from cattle feces using chlortetracycline for transconjugant selection. The transconjugants obtained harbored LowGC-type plasmids and tet(Y). A representative plasmid (pFK2-7) was fully sequenced and compared with previously described LowGC plasmids from piggery manure-treated soil and with a GenBank record from Acinetobacter nosocomialis that we also identified as a LowGC plasmid. The pFK2-7 plasmid had the conservative backbone typical of LowGC plasmids, though this region was interrupted by an insert containing the tet(Y)-tet(R) tetracycline resistance genes and the strA-strB streptomycin resistance genes. Although Acinetobacter populations are considered natural hosts of LowGC plasmids, these plasmids were not found in three Acinetobacter isolates from the study farm. The isolates did, however, harbor tet(Y)-tet(R) genes in genetic surroundings identical to those of pFK2-7, suggesting genetic exchange between Acinetobacter and LowGC plasmids. Abundance of LowGC plasmids and tet(Y) was correlated in manure and soil samples from the farm, indicating that LowGC plasmids may be involved in the spread of tet(Y) in the environment. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
