Menu
July 19, 2019

Single-Molecule Real-Time Sequencing combined with optical mapping yields completely finished fungal genome.

Next-generation sequencing (NGS) technologies have increased the scalability, speed, and resolution of genomic sequencing and, thus, have revolutionized genomic studies. However, eukaryotic genome sequencing initiatives typically yield considerably fragmented genome assemblies. Here, we assessed various state-of-the-art sequencing and assembly strategies in order to produce a contiguous and complete eukaryotic genome assembly, focusing on the filamentous fungus Verticillium dahliae. Compared with Illumina-based assemblies of the V. dahliae genome, hybrid assemblies that also include PacBio-generated long reads establish superior contiguity. Intriguingly, provided that sufficient sequence depth is reached, assemblies solely based on PacBio reads outperform hybrid assemblies and even result in fully assembled chromosomes. Furthermore, the addition of optical map data allowed us to produce a gapless and complete V. dahliae genome assembly of the expected eight chromosomes from telomere to telomere. Consequently, we can now study genomic regions that were previously not assembled or poorly assembled, including regions that are populated by repetitive sequences, such as transposons, allowing us to fully appreciate an organism’s biological complexity. Our data show that a combination of PacBio-generated long reads and optical mapping can be used to generate complete and gapless assemblies of fungal genomes.Studying whole-genome sequences has become an important aspect of biological research. The advent of next-generation sequencing (NGS) technologies has nowadays brought genomic science within reach of most research laboratories, including those that study nonmodel organisms. However, most genome sequencing initiatives typically yield (highly) fragmented genome assemblies. Nevertheless, considerable relevant information related to genome structure and evolution is likely hidden in those nonassembled regions. Here, we investigated a diverse set of strategies to obtain gapless genome assemblies, using the genome of a typical ascomycete fungus as the template. Eventually, we were able to show that a combination of PacBio-generated long reads and optical mapping yields a gapless telomere-to-telomere genome assembly, allowing in-depth genome analyses to facilitate functional studies into an organism’s biology. Copyright © 2015 Faino et al.


July 19, 2019

Whole genome?

The reference human genome assembly is remarkable in its completeness and usefulness in research. However, the range of allelic variation in the human population is not well described by a haploid assembly with a profusion of alternative loci. Homozygous regions and the use of multiple sequencing technologies increasingly have roles in strategies for identifying regulatory and trait-associated variation.


July 19, 2019

SMRT sequencing only de novo assembly of the sugar beet (Beta vulgaris) chloroplast genome.

Third generation sequencing methods, like SMRT (Single Molecule, Real-Time) sequencing developed by Pacific Biosciences, offer much longer read length in comparison to Next Generation Sequencing (NGS) methods. Hence, they are well suited for de novo- or re-sequencing projects. Sequences generated for these purposes will not only contain reads originating from the nuclear genome, but also a significant amount of reads originating from the organelles of the target organism. These reads are usually discarded but they can also be used for an assembly of organellar replicons. The long read length supports resolution of repetitive regions and repeats within the organelles genome which might be problematic when just using short read data. Additionally, SMRT sequencing is less influenced by GC rich areas and by long stretches of the same base.We describe a workflow for a de novo assembly of the sugar beet (Beta vulgaris ssp. vulgaris) chloroplast genome sequence only based on data originating from a SMRT sequencing dataset targeted on its nuclear genome. We show that the data obtained from such an experiment are sufficient to create a high quality assembly with a higher reliability than assemblies derived from e.g. Illumina reads only. The chloroplast genome is especially challenging for de novo assembling as it contains two large inverted repeat (IR) regions. We also describe some limitations that still apply even though long reads are used for the assembly.SMRT sequencing reads extracted from a dataset created for nuclear genome (re)sequencing can be used to obtain a high quality de novo assembly of the chloroplast of the sequenced organism. Even with a relatively small overall coverage for the nuclear genome it is possible to collect more than enough reads to generate a high quality assembly that outperforms short read based assemblies. However, even with long reads it is not always possible to clarify the order of elements of a chloroplast genome sequence reliantly which we could demonstrate with Fosmid End Sequences (FES) generated with Sanger technology. Nevertheless, this limitation also applies to short read sequencing data but is reached in this case at a much earlier stage during finishing.


July 19, 2019

Emergence of ebola virus escape variants in infected nonhuman primates treated with the MB-003 antibody cocktail.

MB-003, a plant-derived monoclonal antibody cocktail used effectively in treatment of Ebola virus infection in non-human primates, was unable to protect two of six animals when initiated 1 or 2 days post-infection. We characterized a mechanism of viral escape in one of the animals, after observation of two clusters of genomic mutations that resulted in five nonsynonymous mutations in the monoclonal antibody target sites. These mutations were linked to a reduction in antibody binding and later confirmed to be present in a viral isolate that was not neutralized in vitro. Retrospective evaluation of a second independent study allowed the identification of a similar case. Four SNPs in previously identified positions were found in this second fatality, suggesting that genetic drift could be a potential cause for treatment failure. These findings highlight the importance selecting different target domains for each component of the cocktail to minimize the potential for viral escape. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.


July 19, 2019

Birth of a new gene on the Y chromosome of Drosophila melanogaster.

Contrary to the pattern seen in mammalian sex chromosomes, where most Y-linked genes have X-linked homologs, the Drosophila X and Y chromosomes appear to be unrelated. Most of the Y-linked genes have autosomal paralogs, so autosome-to-Y transposition must be the main source of Drosophila Y-linked genes. Here we show how these genes were acquired. We found a previously unidentified gene (flagrante delicto Y, FDY) that originated from a recent duplication of the autosomal gene vig2 to the Y chromosome of Drosophila melanogaster. Four contiguous genes were duplicated along with vig2, but they became pseudogenes through the accumulation of deletions and transposable element insertions, whereas FDY remained functional, acquired testis-specific expression, and now accounts for ~20% of the vig2-like mRNA in testis. FDY is absent in the closest relatives of D. melanogaster, and DNA sequence divergence indicates that the duplication to the Y chromosome occurred ~2 million years ago. Thus, FDY provides a snapshot of the early stages of the establishment of a Y-linked gene and demonstrates how the Drosophila Y has been accumulating autosomal genes.


July 19, 2019

Mind the gap; seven reasons to close fragmented genome assemblies.

Like other domains of life, research into the biology of filamentous microbes has greatly benefited from the advent of whole-genome sequencing. Next-generation sequencing (NGS) technologies have revolutionized sequencing, making genomic sciences accessible to many academic laboratories including those that study non-model organisms. Thus, hundreds of fungal genomes have been sequenced and are publically available today, although these initiatives have typically yielded considerably fragmented genome assemblies that often lack large contiguous genomic regions. Many important genomic features are contained in intergenic DNA that is often missing in current genome assemblies, and recent studies underscore the significance of non-coding regions and repetitive elements for the life style, adaptability and evolution of many organisms. The study of particular types of genetic elements, such as telomeres, centromeres, repetitive elements, effectors, and clusters of co-regulated genes, but also of phenomena such as structural rearrangements, genome compartmentalization and epigenetics, greatly benefits from having a contiguous and high-quality, preferably even complete and gapless, genome assembly. Here we discuss a number of important reasons to produce gapless, finished, genome assemblies to help answer important biological questions. Copyright © 2015 Elsevier Inc. All rights reserved.


July 19, 2019

An improved genome reference for the African cichlid, Metriaclima zebra.

Problems associated with using draft genome assemblies are well documented and have become more pronounced with the use of short read data for de novo genome assembly. We set out to improve the draft genome assembly of the African cichlid fish, Metriaclima zebra, using a set of Pacific Biosciences SMRT sequencing reads corresponding to 16.5× coverage of the genome. Here we characterize the improvements that these long reads allowed us to make to the state-of-the-art draft genome previously assembled from short read data.Our new assembly closed 68 % of the existing gaps and added 90.6Mbp of new non-gap sequence to the existing draft assembly of M. zebra. Comparison of the new assembly to the sequence of several bacterial artificial chromosome clones confirmed the accuracy of the new assembly. The closure of sequence gaps revealed thousands of new exons, allowing significant improvement in gene models. We corrected one known misassembly, and identified and fixed other likely misassemblies. 63.5 Mbp (70 %) of the new sequence was classified as repetitive and the new sequence allowed for the assembly of many more transposable elements.Our improvements to the M. zebra draft genome suggest that a reasonable investment in long reads could greatly improve many comparable vertebrate draft genome assemblies.


July 19, 2019

Genetic variation and the de novo assembly of human genomes.

The discovery of genetic variation and the assembly of genome sequences are both inextricably linked to advances in DNA-sequencing technology. Short-read massively parallel sequencing has revolutionized our ability to discover genetic variation but is insufficient to generate high-quality genome assemblies or resolve most structural variation. Full resolution of variation is only guaranteed by complete de novo assembly of a genome. Here, we review approaches to genome assembly, the nature of gaps or missing sequences, and biases in the assembly process. We describe the challenges of generating a complete de novo genome assembly using current technologies and the impact that being able to perfectly sequence the genome would have on understanding human disease and evolution. Finally, we summarize recent technological advances that improve both contiguity and accuracy and emphasize the importance of complete de novo assembly as opposed to read mapping as the primary means to understanding the full range of human genetic variation.


July 19, 2019

Stepwise evolution of pandrug-resistance in Klebsiella pneumoniae.

Carbapenem resistant Enterobacteriaceae (CRE) pose an urgent risk to global human health. CRE that are non-susceptible to all commercially available antibiotics threaten to return us to the pre-antibiotic era. Using Single Molecule Real Time (SMRT) sequencing we determined the complete genome of a pandrug-resistant Klebsiella pneumoniae isolate, representing the first complete genome sequence of CRE resistant to all commercially available antibiotics. The precise location of acquired antibiotic resistance elements, including mobile elements carrying genes for the OXA-181 carbapenemase, were defined. Intriguingly, we identified three chromosomal copies of an ISEcp1-blaOXA-181 mobile element, one of which has disrupted the mgrB regulatory gene, accounting for resistance to colistin. Our findings provide the first description of pandrug-resistant CRE at the genomic level, and reveal the critical role of mobile resistance elements in accelerating the emergence of resistance to other last resort antibiotics.


July 19, 2019

Variable genetic architectures produce virtually identical molecules in bacterial symbionts of fungus-growing ants.

Small molecules produced by Actinobacteria have played a prominent role in both drug discovery and organic chemistry. As part of a larger study of the actinobacterial symbionts of fungus-growing ants, we discovered a small family of three previously unreported piperazic acid-containing cyclic depsipeptides, gerumycins A-C. The gerumycins are slightly smaller versions of dentigerumycin, a cyclic depsipeptide that selectively inhibits a common fungal pathogen, Escovopsis. We had previously identified this molecule from a Pseudonocardia associated with Apterostigma dentigerum, and now we report the molecule from an associate of the more highly derived ant Trachymyrmex cornetzi. The three previously unidentified compounds, gerumycins A-C, have essentially identical structures and were produced by two different symbiotic Pseudonocardia spp. from ants in the genus Apterostigma found in both Panama and Costa Rica. To understand the similarities and differences in the biosynthetic pathways that produced these closely related molecules, the genomes of the three producing Pseudonocardia were sequenced and the biosynthetic gene clusters identified. This analysis revealed that dramatically different biosynthetic architectures, including genomic islands, a plasmid, and the use of spatially separated genetic loci, can lead to molecules with virtually identical core structures. A plausible evolutionary model that unifies these disparate architectures is presented.


July 19, 2019

Pangenome analysis of Bifidobacterium longum and site-directed mutagenesis through by-pass of restriction-modification systems.

Bifidobacterial genome analysis has provided insights as to how these gut commensals adapt to and persist in the human GIT, while also revealing genetic diversity among members of a given bifidobacterial (sub)species. Bifidobacteria are notoriously recalcitrant to genetic modification, which prevents exploration of their genomic functions, including those that convey (human) health benefits.PacBio SMRT sequencing was used to determine the whole genome seqeunces of two B. longum subsp. longum strains. The B. longum pan-genome was computed using PGAP v1.2 and the core B. longum phylogenetic tree was constructed using a maximum-likelihood based approach in PhyML v3.0. M.blmNCII was cloned in E. coli and an internal fragment if arfBarfB was cloned into pORI19 for insertion mutagenesis.In this study we present the complete genome sequences of two Bifidobacterium longum subsp. longum strains. Comparative analysis with thirty one publicly available B. longum genomes allowed the definition of the B. longum core and dispensable genomes. This analysis also highlighted differences in particular metabolic abilities between members of the B. longum subspecies infantis, longum and suis. Furthermore, phylogenetic analysis of the B. longum core genome indicated the existence of a novel subspecies. Methylome data, coupled to the analysis of restriction-modification systems, allowed us to substantially increase the genetic accessibility of B. longum subsp. longum NCIMB 8809 to a level that was shown to permit site-directed mutagenesis.Comparative genomic analysis of thirty three B. longum representatives revealed a closed pan-genome for this bifidobacterial species. Phylogenetic analysis of the B. longum core genome also provides evidence for a novel fifth B. longum subspecies. Finally, we improved genetic accessibility for the strain B. longum subsp. longum NCIMB 8809, which allowed the generation of a mutant of this strain.


July 19, 2019

The pineapple genome and the evolution of CAM photosynthesis.

Pineapple (Ananas comosus (L.) Merr.) is the most economically valuable crop possessing crassulacean acid metabolism (CAM), a photosynthetic carbon assimilation pathway with high water-use efficiency, and the second most important tropical fruit. We sequenced the genomes of pineapple varieties F153 and MD2 and a wild pineapple relative, Ananas bracteatus accession CB5. The pineapple genome has one fewer ancient whole-genome duplication event than sequenced grass genomes and a conserved karyotype with seven chromosomes from before the ? duplication event. The pineapple lineage has transitioned from C3 photosynthesis to CAM, with CAM-related genes exhibiting a diel expression pattern in photosynthetic tissues. CAM pathway genes were enriched with cis-regulatory elements associated with the regulation of circadian clock genes, providing the first cis-regulatory link between CAM and circadian clock regulation. Pineapple CAM photosynthesis evolved by the reconfiguration of pathways in C3 plants, through the regulatory neofunctionalization of preexisting genes and not through the acquisition of neofunctionalized genes via whole-genome or tandem gene duplication.


July 19, 2019

Single molecule real-time sequencing of Xanthomonas oryzae genomes reveals a dynamic structure and complex TAL (transcription activator-like) effector gene relationships.

Pathogen-injected, direct transcriptional activators of host genes, TAL (transcription activator-like) effectors play determinative roles in plant diseases caused by Xanthomonas spp. A large domain of nearly identical, 33-35 aa repeats in each protein mediates DNA recognition. This modularity makes TAL effectors customizable and thus important also in biotechnology. However, the repeats render TAL effector (tal) genes nearly impossible to assemble using next-generation, short reads. Here, we demonstrate that long-read, single molecule real-time (SMRT) sequencing solves this problem. Taking an ensemble approach to first generate local, tal gene contigs, we correctly assembled de novo the genomes of two strains of the rice pathogen X. oryzae completed previously using the Sanger method and even identified errors in those references. Sequencing two more strains revealed a dynamic genome structure and a striking plasticity in tal gene content. Our results pave the way for population-level studies to inform resistance breeding, improve biotechnology and probe TAL effector evolution.


July 19, 2019

Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum.

Plant genomes, and eukaryotic genomes in general, are typically repetitive, polyploid and heterozygous, which complicates genome assembly. The short read lengths of early Sanger and current next-generation sequencing platforms hinder assembly through complex repeat regions, and many draft and reference genomes are fragmented, lacking skewed GC and repetitive intergenic sequences, which are gaining importance due to projects like the Encyclopedia of DNA Elements (ENCODE). Here we report the whole-genome sequencing and assembly of the desiccation-tolerant grass Oropetium thomaeum. Using only single-molecule real-time sequencing, which generates long (>16?kilobases) reads with random errors, we assembled 99% (244?megabases) of the Oropetium genome into 625 contigs with an N50 length of 2.4?megabases. Oropetium is an example of a ‘near-complete’ draft genome which includes gapless coverage over gene space as well as intergenic sequences such as centromeres, telomeres, transposable elements and rRNA clusters that are typically unassembled in draft genomes. Oropetium has 28,466 protein-coding genes and 43% repeat sequences, yet with 30% more compact euchromatic regions it is the smallest known grass genome. The Oropetium genome demonstrates the utility of single-molecule real-time sequencing for assembling high-quality plant and other eukaryotic genomes, and serves as a valuable resource for the plant comparative genomics community.


July 19, 2019

Genome sequencing: Long reads for a short plant

The genome of a tiny resurrection plant has been sequenced using PacBio’s long-read single-molecule real-time sequencing technology, aiding the understanding of extreme desiccation tolerance. The genome contiguity is comparable to that of genomes sequenced using far more laborious approaches.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.