Menu
July 19, 2019

Centromere evolution and CpG methylation during vertebrate speciation.

Centromeres and large-scale structural variants evolve and contribute to genome diversity during vertebrate speciation. Here, we perform de novo long-read genome assembly of three inbred medaka strains that are derived from geographically isolated subpopulations and undergo speciation. Using single-molecule real-time (SMRT) sequencing, we obtain three chromosome-mapped genomes of length ~734, ~678, and ~744Mbp with a resource of twenty-two centromeric regions of length 20-345kbp. Centromeres are positionally conserved among the three strains and even between four pairs of chromosomes that were duplicated by the teleost-specific whole-genome duplication 320-350 million years ago. The centromeres do not all evolve at a similar pace; rather, centromeric monomers in non-acrocentric chromosomes evolve significantly faster than those in acrocentric chromosomes. Using methylation sensitive SMRT reads, we uncover centromeres are mostly hypermethylated but have hypomethylated sub-regions that acquire unique sequence compositions independently. These findings reveal the potential of non-acrocentric centromere evolution to contribute to speciation.


July 19, 2019

Cytogenomic identification and long-read single molecule real-time (SMRT) sequencing of a Bardet-Biedl Syndrome 9 (BBS9) deletion.

Bardet-Biedl syndrome (BBS) is a recessive disorder characterized by heterogeneous clinical manifestations, including truncal obesity, rod-cone dystrophy, renal anomalies, postaxial polydactyly, and variable developmental delays. At least 20 genes have been implicated in BBS, and all are involved in primary cilia function. We report a 1-year-old male child from Guyana with obesity, postaxial polydactyly on his right foot, hypotonia, ophthalmologic abnormalities, and developmental delay, which together indicated a clinical diagnosis of BBS. Clinical chromosomal microarray (CMA) testing and high-throughput BBS gene panel sequencing detected a homozygous 7p14.3 deletion of exons 1-4 of BBS9 that was encompassed by a 17.5?Mb region of homozygosity at chromosome 7p14.2-p21.1. The precise breakpoints of the deletion were delineated to a 72.8?kb region in the proband and carrier parents by third-generation long-read single molecule real-time (SMRT) sequencing (Pacific Biosciences), which suggested non-homologous end joining as a likely mechanism of formation. Long-read SMRT sequencing of the deletion breakpoints also determined that the aberration included the neighboring RP9 gene implicated in retinitis pigmentosa; however, the clinical significance of this was considered uncertain given the paucity of reported cases with unambiguous RP9 mutations. Taken together, our study characterized a BBS9 deletion, and the identification of this shared haplotype in the parents suggests that this pathogenic aberration may be a BBS founder mutation in the Guyanese population. Importantly, this informative case also highlights the utility of long-read SMRT sequencing to map nucleotide breakpoints of clinically relevant structural variants.


July 19, 2019

Firefly genomes illuminate parallel origins of bioluminescence in beetles.

Fireflies and their luminous courtships have inspired centuries of scientific study. Today firefly luciferase is widely used in biotechnology, but the evolutionary origin of bioluminescence within beetles remains unclear. To shed light on this long-standing question, we sequenced the genomes of two firefly species that diverged over 100 million-years-ago: the North American Photinus pyralis and Japanese Aquatica lateralis. To compare bioluminescent origins, we also sequenced the genome of a related click beetle, the Caribbean Ignelater luminosus, with bioluminescent biochemistry near-identical to fireflies, but anatomically unique light organs, suggesting the intriguing hypothesis of parallel gains of bioluminescence. Our analyses support independent gains of bioluminescence in fireflies and click beetles, and provide new insights into the genes, chemical defenses, and symbionts that evolved alongside their luminous lifestyle.© 2018, Fallon et al.


July 19, 2019

Linking secondary metabolites to gene clusters through genome sequencing of six diverse Aspergillus species.

The fungal genus ofAspergillusis highly interesting, containing everything from industrial cell factories, model organisms, and human pathogens. In particular, this group has a prolific production of bioactive secondary metabolites (SMs). In this work, four diverseAspergillusspecies (A. campestris,A. novofumigatus,A. ochraceoroseus, andA. steynii) have been whole-genome PacBio sequenced to provide genetic references in threeAspergillussections.A. taichungensisandA. candidusalso were sequenced for SM elucidation. ThirteenAspergillusgenomes were analyzed with comparative genomics to determine phylogeny and genetic diversity, showing that each presented genome contains 15-27% genes not found in other sequenced Aspergilli. In particular,A. novofumigatuswas compared with the pathogenic speciesA. fumigatusThis suggests thatA. novofumigatuscan produce most of the same allergens, virulence, and pathogenicity factors asA. fumigatus, suggesting thatA. novofumigatuscould be as pathogenic asA. fumigatusFurthermore, SMs were linked to gene clusters based on biological and chemical knowledge and analysis, genome sequences, and predictive algorithms. We thus identify putative SM clusters for aflatoxin, chlorflavonin, and ochrindol inA. ochraceoroseus,A. campestris, andA. steynii, respectively, and novofumigatonin,ent-cycloechinulin, andepi-aszonalenins inA. novofumigatusOur study delivers six fungal genomes, showing the large diversity found in theAspergillusgenus; highlights the potential for discovery of beneficial or harmful SMs; and supports reports ofA. novofumigatuspathogenicity. It also shows how biological, biochemical, and genomic information can be combined to identify genes involved in the biosynthesis of specific SMs.


July 19, 2019

Structure and distribution of centromeric retrotransposons at diploid and allotetraploid Coffea centromeric and pericentromeric regions.

Centromeric regions of plants are generally composed of large array of satellites from a specific lineage ofGypsyLTR-retrotransposons, called Centromeric Retrotransposons. Repeated sequences interact with a specific H3 histone, playing a crucial function on kinetochore formation. To study the structure and composition of centromeric regions in the genusCoffea, we annotated and classified Centromeric Retrotransposons sequences from the allotetraploidC. arabicagenome and its two diploid ancestors:Coffea canephoraandC. eugenioides. Ten distinct CRC (Centromeric Retrotransposons inCoffea) families were found. The sequence mapping and FISH experiments of CRC Reverse Transcriptase domains inC. canephora, C. eugenioides, andC. arabicaclearly indicate a strong and specific targeting mainly onto proximal chromosome regions, which can be associated also with heterochromatin. PacBio genome sequence analyses of putative centromeric regions onC. arabicaandC. canephorachromosomes showed an exceptional density of one family of CRC elements, and the complete absence of satellite arrays, contrasting with usual structure of plant centromeres. Altogether, our data suggest a specific centromere organization inCoffea, contrasting with other plant genomes.


July 19, 2019

Phasevarions of bacterial pathogens: Methylomics sheds new light on old enemies.

A wide variety of bacterial pathogens express phase-variable DNA methyltransferases that control expression of multiple genes via epigenetic mechanisms. These randomly switching regulons – phasevarions – regulate genes involved in pathogenesis, host adaptation, and antibiotic resistance. Individual phase-variable genes can be identified in silico as they contain easily recognized features such as simple sequence repeats (SSRs) or inverted repeats (IRs) that mediate the random switching of expression. Conversely, phasevarion-controlled genes do not contain any easily identifiable features. The study of DNA methyltransferase specificity using Single-Molecule, Real-Time (SMRT) sequencing and methylome analysis has rapidly advanced the analysis of phasevarions by allowing methylomics to be combined with whole-transcriptome/proteome analysis to comprehensively characterize these systems in a number of important bacterial pathogens. Copyright © 2018 Elsevier Ltd. All rights reserved.


July 19, 2019

Resolving the complete genome of Kuenenia stuttgartiensis from a membrane bioreactor enrichment using Single-Molecule Real-Time sequencing.

Anaerobic ammonium-oxidizing (anammox) bacteria are a group of strictly anaerobic chemolithoautotrophic microorganisms. They are capable of oxidizing ammonium to nitrogen gas using nitrite as a terminal electron acceptor, thereby facilitating the release of fixed nitrogen into the atmosphere. The anammox process is thought to exert a profound impact on the global nitrogen cycle and has been harnessed as an environment-friendly method for nitrogen removal from wastewater. In this study, we present the first closed genome sequence of an anammox bacterium, Kuenenia stuttgartiensis MBR1. It was obtained through Single-Molecule Real-Time (SMRT) sequencing of an enrichment culture constituting a mixture of at least two highly similar Kuenenia strains. The genome of the novel MBR1 strain is different from the previously reported Kuenenia KUST reference genome as it contains numerous structural variations and unique genomic regions. We find new proteins, such as a type 3b (sulf)hydrogenase and an additional copy of the hydrazine synthase gene cluster. Moreover, multiple copies of ammonium transporters and proteins regulating nitrogen uptake were identified, suggesting functional differences in metabolism. This assembly, including the genome-wide methylation profile, provides a new foundation for comparative and functional studies aiming to elucidate the biochemical and metabolic processes of these organisms.


July 19, 2019

Herbivorous turtle ants obtain essential nutrients from a conserved nitrogen-recycling gut microbiome.

Nitrogen acquisition is a major challenge for herbivorous animals, and the repeated origins of herbivory across the ants have raised expectations that nutritional symbionts have shaped their diversification. Direct evidence for N provisioning by internally housed symbionts is rare in animals; among the ants, it has been documented for just one lineage. In this study we dissect functional contributions by bacteria from a conserved, multi-partite gut symbiosis in herbivorous Cephalotes ants through in vivo experiments, metagenomics, and in vitro assays. Gut bacteria recycle urea, and likely uric acid, using recycled N to synthesize essential amino acids that are acquired by hosts in substantial quantities. Specialized core symbionts of 17 studied Cephalotes species encode the pathways directing these activities, and several recycle N in vitro. These findings point to a highly efficient N economy, and a nutritional mutualism preserved for millions of years through the derived behaviors and gut anatomy of Cephalotes ants.


July 19, 2019

The genome of Schmidtea mediterranea and the evolution of core cellular mechanisms.

The planarian Schmidtea mediterranea is an important model for stem cell research and regeneration, but adequate genome resources for this species have been lacking. Here we report a highly contiguous genome assembly of S. mediterranea, using long-read sequencing and a de novo assembler (MARVEL) enhanced for low-complexity reads. The S. mediterranea genome is highly polymorphic and repetitive, and harbours a novel class of giant retroelements. Furthermore, the genome assembly lacks a number of highly conserved genes, including critical components of the mitotic spindle assembly checkpoint, but planarians maintain checkpoint function. Our genome assembly provides a key model system resource that will be useful for studying regeneration and the evolutionary plasticity of core cell biological mechanisms.


July 19, 2019

Genomic repeats, misassembly and reannotation: a case study with long-read resequencing of Porphyromonas gingivalis reference strains.

Without knowledge of their genomic sequences, it is impossible to make functional models of the bacteria that make up human and animal microbiota. Unfortunately, the vast majority of publicly available genomes are only working drafts, an incompleteness that causes numerous problems and constitutes a major obstacle to genotypic and phenotypic interpretation. In this work, we began with an example from the class Bacteroidia in the phylum Bacteroidetes, which is preponderant among human orodigestive microbiota. We successfully identify the genetic loci responsible for assembly breaks and misassemblies and demonstrate the importance and usefulness of long-read sequencing and curated reannotation.We showed that the fragmentation in Bacteroidia draft genomes assembled from massively parallel sequencing linearly correlates with genomic repeats of the same or greater size than the reads. We also demonstrated that some of these repeats, especially the long ones, correspond to misassembled loci in three reference Porphyromonas gingivalis genomes marked as circularized (thus complete or finished). We prove that even at modest coverage (30X), long-read resequencing together with PCR contiguity verification (rrn operons and an integrative and conjugative element or ICE) can be used to identify and correct the wrongly combined or assembled regions. Finally, although time-consuming and labor-intensive, consistent manual biocuration of three P. gingivalis strains allowed us to compare and correct the existing genomic annotations, resulting in a more accurate interpretation of the genomic differences among these strains.In this study, we demonstrate the usefulness and importance of long-read sequencing in verifying published genomes (even when complete) and generating assemblies for new bacterial strains/species with high genomic plasticity. We also show that when combined with biological validation processes and diligent biocurated annotation, this strategy helps reduce the propagation of errors in shared databases, thus limiting false conclusions based on incomplete or misleading information.


July 19, 2019

The complete and fully assembled genome sequence of Aeromonas salmonicida subsp. pectinolytica and its comparative analysis with other Aeromonas species: investigation of the mobilome in environmental and pathogenic strains.

Due to the predominant usage of short-read sequencing to date, most bacterial genome sequences reported in the last years remain at the draft level. This precludes certain types of analyses, such as the in-depth analysis of genome plasticity.Here we report the finalized genome sequence of the environmental strain Aeromonas salmonicida subsp. pectinolytica 34mel, for which only a draft genome with 253 contigs is currently available. Successful completion of the transposon-rich genome critically depended on the PacBio long read sequencing technology. Using finalized genome sequences of A. salmonicida subsp. pectinolytica and other Aeromonads, we report the detailed analysis of the transposon composition of these bacterial species. Mobilome evolution is exemplified by a complex transposon, which has shifted from pathogenicity-related to environmental-related gene content in A. salmonicida subsp. pectinolytica 34mel.Obtaining the complete, circular genome of A. salmonicida subsp. pectinolytica allowed us to perform an in-depth analysis of its mobilome. We demonstrate the mobilome-dependent evolution of this strain’s genetic profile from pathogenic to environmental.


July 19, 2019

Extreme sensitivity to ultraviolet light in the fungal pathogen causing white-nose syndrome of bats.

Bat white-nose syndrome (WNS), caused by the fungal pathogen Pseudogymnoascus destructans, has decimated North American hibernating bats since its emergence in 2006. Here, we utilize comparative genomics to examine the evolutionary history of this pathogen in comparison to six closely related nonpathogenic species. P. destructans displays a large reduction in carbohydrate-utilizing enzymes (CAZymes) and in the predicted secretome (~50%), and an increase in lineage-specific genes. The pathogen has lost a key enzyme, UVE1, in the alternate excision repair (AER) pathway, which is known to contribute to repair of DNA lesions induced by ultraviolet (UV) light. Consistent with a nonfunctional AER pathway, P. destructans is extremely sensitive to UV light, as well as the DNA alkylating agent methyl methanesulfonate (MMS). The differential susceptibility of P. destructans to UV light in comparison to other hibernacula-inhabiting fungi represents a potential “Achilles’ heel” of P. destructans that might be exploited for treatment of bats with WNS.


July 19, 2019

Single-molecule sequencing and optical mapping yields an improved genome of woodland strawberry (Fragaria vesca) with chromosome-scale contiguity.

Although draft genomes are available for most agronomically important plant species, the majority are incomplete, highly fragmented, and often riddled with assembly and scaffolding errors. These assembly issues hinder advances in tool development for functional genomics and systems biology.Here we utilized a robust, cost-effective approach to produce high-quality reference genomes. We report a near-complete genome of diploid woodland strawberry (Fragaria vesca) using single-molecule real-time sequencing from Pacific Biosciences (PacBio). This assembly has a contig N50 length of ~7.9 million base pairs (Mb), representing a ~300-fold improvement of the previous version. The vast majority (>99.8%) of the assembly was anchored to 7 pseudomolecules using 2 sets of optical maps from Bionano Genomics. We obtained ~24.96 Mb of sequence not present in the previous version of the F. vesca genome and produced an improved annotation that includes 1496 new genes. Comparative syntenic analyses uncovered numerous, large-scale scaffolding errors present in each chromosome in the previously published version of the F. vesca genome.Our results highlight the need to improve existing short-read based reference genomes. Furthermore, we demonstrate how genome quality impacts commonly used analyses for addressing both fundamental and applied biological questions.© The Authors 2017. Published by Oxford University Press.


July 19, 2019

Expanding an expanded genome: long-read sequencing of Trypanosoma cruzi.

Although the genome of Trypanosoma cruzi, the causative agent of Chagas disease, was first made available in 2005, with additional strains reported later, the intrinsic genome complexity of this parasite (the abundance of repetitive sequences and genes organized in tandem) has traditionally hindered high-quality genome assembly and annotation. This also limits diverse types of analyses that require high degrees of precision. Long reads generated by third-generation sequencing technologies are particularly suitable to address the challenges associated with T. cruzi’s genome since they permit direct determination of the full sequence of large clusters of repetitive sequences without collapsing them. This, in turn, not only allows accurate estimation of gene copy numbers but also circumvents assembly fragmentation. Here, we present the analysis of the genome sequences of two T. cruzi clones: the hybrid TCC (TcVI) and the non-hybrid Dm28c (TcI), determined by PacBio Single Molecular Real-Time (SMRT) technology. The improved assemblies herein obtained permitted us to accurately estimate gene copy numbers, abundance and distribution of repetitive sequences (including satellites and retroelements). We found that the genome of T. cruzi is composed of a ‘core compartment’ and a ‘disruptive compartment’ which exhibit opposite GC content and gene composition. Novel tandem and dispersed repetitive sequences were identified, including some located inside coding sequences. Additionally, homologous chromosomes were separately assembled, allowing us to retrieve haplotypes as separate contigs instead of a unique mosaic sequence. Finally, manual annotation of surface multigene families, mucins and trans-sialidases allows now a better overview of these complex groups of genes.


July 19, 2019

A near-complete haplotype-phased genome of the dikaryotic wheat stripe rust fungus Puccinia striiformis f. sp. tritici reveals high interhaplotype diversity.

A long-standing biological question is how evolution has shaped the genomic architecture of dikaryotic fungi. To answer this, high-quality genomic resources that enable haplotype comparisons are essential. Short-read genome assemblies for dikaryotic fungi are highly fragmented and lack haplotype-specific information due to the high heterozygosity and repeat content of these genomes. Here, we present a diploid-aware assembly of the wheat stripe rust fungus Puccinia striiformis f. sp. tritici based on long reads using the FALCON-Unzip assembler. Transcriptome sequencing data sets were used to infer high-quality gene models and identify virulence genes involved in plant infection referred to as effectors. This represents the most complete Puccinia striiformis f. sp. tritici genome assembly to date (83 Mb, 156 contigs, N50 of 1.5 Mb) and provides phased haplotype information for over 92% of the genome. Comparisons of the phase blocks revealed high interhaplotype diversity of over 6%. More than 25% of all genes lack a clear allelic counterpart. When we investigated genome features that potentially promote the rapid evolution of virulence, we found that candidate effector genes are spatially associated with conserved genes commonly found in basidiomycetes. Yet, candidate effectors that lack an allelic counterpart are more distant from conserved genes than allelic candidate effectors and are less likely to be evolutionarily conserved within the P. striiformis species complex and Pucciniales In summary, this haplotype-phased assembly enabled us to discover novel genome features of a dikaryotic plant-pathogenic fungus previously hidden in collapsed and fragmented genome assemblies.IMPORTANCE Current representations of eukaryotic microbial genomes are haploid, hiding the genomic diversity intrinsic to diploid and polyploid life forms. This hidden diversity contributes to the organism’s evolutionary potential and ability to adapt to stress conditions. Yet, it is challenging to provide haplotype-specific information at a whole-genome level. Here, we take advantage of long-read DNA sequencing technology and a tailored-assembly algorithm to disentangle the two haploid genomes of a dikaryotic pathogenic wheat rust fungus. The two genomes display high levels of nucleotide and structural variations, which lead to allelic variation and the presence of genes lacking allelic counterparts. Nonallelic candidate effector genes, which likely encode important pathogenicity factors, display distinct genome localization patterns and are less likely to be evolutionary conserved than those which are present as allelic pairs. This genomic diversity may promote rapid host adaptation and/or be related to the age of the sequenced isolate since last meiosis. Copyright © 2018 Schwessinger et al.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.