September 21, 2019

Retrotransposons are the major contributors to the expansion of the Drosophila ananassae Muller F element.

The discordance between genome size and the complexity of eukaryotes can partly be attributed to differences in repeat density. The Muller F element (~5.2 Mb) is the smallest chromosome in Drosophila melanogaster, but it is substantially larger (>18.7 Mb) in D. ananassae. To identify the major contributors to the expansion of the F element and to assess their impact, we improved the genome sequence and annotated the genes in a 1.4-Mb region of the D. ananassae F element, and a 1.7-Mb region from the D element for comparison. We find that transposons (particularly LTR and LINE retrotransposons) are major contributors to this expansion (78.6%), while Wolbachia sequences integrated into the D. ananassae genome are minor contributors (0.02%). Both D. melanogaster and D. ananassae F-element genes exhibit distinct characteristics compared to D-element genes (e.g., larger coding spans, larger introns, more coding exons, and lower codon bias), but these differences are exaggerated in D. ananassae. Compared to D. melanogaster, the codon bias observed in D. ananassae F-element genes can primarily be attributed to mutational biases instead of selection. The 5′ ends of F-element genes in both species are enriched in dimethylation of lysine 4 on histone 3 (H3K4me2), while the coding spans are enriched in H3K9me2. Despite differences in repeat density and gene characteristics, D. ananassae F-element genes show a similar range of expression levels compared to genes in euchromatic domains. This study improves our understanding of how transposons can affect genome size and how genes can function within highly repetitive domains. Copyright © 2017 Leung et al.


September 21, 2019

The kinetoplastid-infecting Bodo saltans virus (BsV), a window into the most abundant giant viruses in the sea.

Giant viruses are ecologically important players in aquatic ecosystems that have challenged concepts of what constitutes a virus. Herein, we present the giant Bodo saltans virus (BsV), the first characterized representative of the most abundant group of giant viruses in ocean metagenomes, and the first isolate of a klosneuvirus, a subgroup of the Mimiviridae proposed from metagenomic data. BsV infects an ecologically important microzooplankton, the kinetoplastid Bodo saltans. Its 1.39 Mb genome encodes 1227 predicted ORFs, including a complex replication machinery. Yet, much of its translational apparatus has been lost, including all tRNAs. Essential genes are invaded by homing endonuclease-encoding self-splicing introns that may defend against competing viruses. Putative anti-host factors show extensive gene duplication via a genomic accordion indicating an ongoing evolutionary arms race and highlighting the rapid evolution and genomic plasticity that has led to genome gigantism and the enigma that is giant viruses. © 2018, Deeg et al.


September 21, 2019

Potato late blight field resistance from QTL dPI09c is conferred by the NB-LRR gene R8.

Following the often short-lived protection that major nucleotide binding, leucine-rich-repeat (NB-LRR) resistance genes offer against the potato pathogen Phytophthora infestans, field resistance was thought to provide a more durable alternative to prevent late blight disease. We previously identified the QTL dPI09c on potato chromosome 9 as a more durable field resistance source against late blight. Here, the resistance QTL was fine-mapped to a 186 kb region. The interval corresponds to a larger, 389 kb, genomic region in the potato reference genome of Solanum tuberosum Group Phureja doubled monoploid clone DM1-3 (DM) and from which functional NB-LRRs R8, R9a, Rpi-moc1, and Rpi_vnt1 have arisen independently in wild species. dRenSeq analysis of parental clones alongside resistant and susceptible bulks of the segregating population B3C1HP showed full sequence representation of R8. This was independently validated using long-range PCR and screening of a bespoke bacterial artificial chromosome library. The latter enabled a comparative analysis of the sequence variation in this locus in diverse Solanaceae. We reveal for the first time that broad spectrum and durable field resistance against P. infestans is conferred by the NB-LRR gene R8, which is thought to provide narrow spectrum race-specific resistance.


September 21, 2019

Chromulinavorax destructans, a pathogenic TM6 bacterium with an unusual replication strategy targeting protist mitochondrion

Most of the diversity of microbial life is not available in culture, and as such we lack even a fundamental understanding of the biological diversity of several branches on the tree of life. One branch that is highly underrepresented is the candidate phylum TM6, also known as the Dependentiae. Their biology is known only from reduced genomes recovered from metagenomes around the world and two isolates infecting amoebae, all of which suggest that they live highly host-associated lifestyles as parasites or symbionts. Chromulinavorax destructans is an isolate from the TM6/Dependentiae that infects and lyses the abundant heterotrophic flagellate, Spumella elongata. Chromulinavorax destructans is characterized by a high degree of reduction and specialization for infection, so much so that it was discovered in a screen for giant viruses. Its 1.2 Mb genome shows no metabolic potential, and C. destructans instead relies on an extensive transporter system to import nutrients, and even energy in the form of ATP, from the host. Accordingly, it replicates in a viral-like fashion, while extensively reorganizing and expanding the host mitochondrion. 44% of proteins contain signal sequences for secretion, which includes many proteins of unknown function as well as 98 copies of ankyrin-repeat domain proteins, known effectors of host modulation, suggesting the presence of an extensive host-manipulation apparatus.


September 21, 2019

Multi-Locus Variable number of tandem repeat Analysis (MLVA) of Yersinia ruckeri confirms the existence of host-specificity, geographic endemism and anthropogenic dissemination of virulent clones.

A Multi-Locus Variable number of tandem repeat Analysis (MLVA) assay was developed for epizootiological study of the internationally significant fish pathogen Yersinia ruckeri, which causes yersiniosis in salmonids. The assay involves amplification of ten Variable Number of Tandem Repeat (VNTR) loci in two five-plex PCR reactions, followed by capillary electrophoresis. A collection of 484 Y. ruckeri isolates, originating from various biological sources and collected from four continents over seven decades, was analysed. Minimum spanning tree cluster analysis of MLVA profiles separated the studied population into nine major clonal complexes, and a number of minor clusters and singletons. The major clonal complexes could be associated with host species, geographic origin and serotype. A single large clonal complex of serotype O1 isolates dominating the yersiniosis situation in international rainbow trout farming suggests anthropogenic spread of this clone, possibly related to transport of fish. Moreover, sub-clustering within this clonal complex indicates putative transmission routes and multiple biotype shift events. In contrast to the situation in rainbow trout, Y. ruckeri strains associated with disease in Atlantic salmon appear as more or less geographically isolated clonal complexes. A single complex of serotype O1 exclusive to Norway was found to be responsible for almost all major yersiniosis outbreaks in modern Norwegian salmon farming, and site-specific sub-clustering further indicates persistent colonisation of freshwater farms in Norway. Identification of genetically diverse Y. ruckeri isolates from clinically healthy fish and environmental sources also suggests the widespread existence of less virulent or avirulent strains.
Importance: This comprehensive population study substantially improves our understanding of the epizootiological history and nature of an internationally important fish pathogenic bacterium. The MLVA assay developed and presented represents a high-resolution typing tool particularly well suited for Yersinia ruckeri infection tracing, selection of strains for vaccine inclusion, and risk assessment. The ability of the assay to separate isolates into geographically linked and/or possibly host-specific clusters reflects its potential utility for maintenance of national biosecurity. The MLVA is internationally applicable, robust, and provides clear, unambiguous and easily interpreted results. Typing is reasonably inexpensive, with a moderate technological requirement, and may be completed from a harvested colony within a single working day. As the resulting MLVA profiles are readily portable, any Y. ruckeri strain may rapidly be placed in a global epizootiological context. Copyright © 2018 Gulla et al.


September 21, 2019

From the inside out: An epibiotic Bdellovibrio predator with an expanded genomic complement

Bdellovibrio and like organisms are abundant environmental predators of prokaryotes that show a diversity of predation strategies, ranging from intra-periplasmic to epibiotic predation. The novel epibiotic predator Bdellovibrio qaytius was isolated from a eutrophic freshwater pond in British Columbia, where it was a continual part of the microbial community. Bdellovibrio qaytius was found to preferentially prey on the beta-proteobacterium Paraburkholderia fungorum. Despite its epibiotic replication strategy, B. qaytius encodes a complex genomic complement more similar to periplasmic predators as well as several biosynthesis pathways not previously found in epibiotic predators. Bdellovibrio qaytius is representative of a widely distributed basal cluster within the genus Bdellovibrio, suggesting that epibiotic predation might be a common predation type in nature and ancestral to the genus.


September 21, 2019

Direct detection of DNA methylation during single-molecule, real-time sequencing.

We describe the direct detection of DNA methylation, without bisulfite conversion, through single-molecule, real-time (SMRT) sequencing. In SMRT sequencing, DNA polymerases catalyze the incorporation of fluorescently labeled nucleotides into complementary nucleic acid strands. The arrival times and durations of the resulting fluorescence pulses yield information about polymerase kinetics and allow direct detection of modified nucleotides in the DNA template, including N6-methyladenine, 5-methylcytosine and 5-hydroxymethylcytosine. Measurement of polymerase kinetics is an intrinsic part of SMRT sequencing and does not adversely affect determination of primary DNA sequence. The various modifications affect polymerase kinetics differently, allowing discrimination between them. We used these kinetic signatures to identify adenine methylation in genomic samples and found that, in combination with circular consensus sequencing, they can enable single-molecule identification of epigenetic modifications with base-pair resolution. This method is amenable to long read lengths and will likely enable mapping of methylation patterns in even highly repetitive genomic regions.


September 21, 2019

A flexible and efficient template format for circular consensus sequencing and SNP detection.

A novel template design for single-molecule sequencing is introduced, a structure we refer to as a SMRTbell template. This structure consists of a double-stranded portion, containing the insert of interest, and a single-stranded hairpin loop on either end, which provides a site for primer binding. Structurally, this format resembles a linear double-stranded molecule, and yet it is topologically circular. When placed into a single-molecule sequencing reaction, the SMRTbell template format enables a consensus sequence to be obtained from multiple passes on a single molecule. Furthermore, this consensus sequence is obtained from both the sense and antisense strands of the insert region. In this article, we present a universal method for constructing these templates, as well as an application of their use. We demonstrate the generation of high-quality consensus accuracy from single molecules, as well as the use of SMRTbell templates in the identification of rare sequence variants.
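
The idea of building a consensus from repeated passes over both strands of one molecule can be illustrated with a minimal sketch. It is not the published consensus method: it assumes the passes are already aligned and of equal length, that each pass is labelled with a strand, and it uses a simple per-position majority vote. The names `reverse_complement` and `consensus` are hypothetical.

```python
from collections import Counter

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def consensus(passes):
    """Majority-vote consensus over (sequence, strand) passes.

    Assumption: passes are pre-aligned and equal in length; antisense
    passes ('-') are flipped to sense orientation before voting.
    """
    oriented = [
        seq if strand == "+" else reverse_complement(seq)
        for seq, strand in passes
    ]
    length = len(oriented[0])
    if any(len(seq) != length for seq in oriented):
        raise ValueError("this sketch assumes equal-length, aligned passes")
    # Vote column by column; the most frequent base wins.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*oriented)
    )


passes = [
    ("ACGTAC", "+"),   # sense pass, no errors
    ("GTACGT", "-"),   # antisense pass of the same insert
    ("ACGAAC", "+"),   # sense pass with one miscalled base
]
print(consensus(passes))
```

With three passes, a single miscalled base in one pass is outvoted by the other two, which is why accuracy improves as the number of passes grows.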


September 21, 2019

Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.

We present a hierarchical genome-assembly process (HGAP) for high-quality de novo microbial genome assemblies using only a single, long-insert shotgun DNA library in conjunction with Single Molecule, Real-Time (SMRT) DNA sequencing. Our method uses the longest reads as seeds to recruit all other reads for construction of highly accurate preassembled reads through a directed acyclic graph-based consensus procedure, which we follow with assembly using off-the-shelf long-read assemblers. In contrast to hybrid approaches, HGAP does not require highly accurate raw reads for error correction. We demonstrate efficient genome assembly for several microorganisms using as few as three SMRT Cell zero-mode waveguide arrays of sequencing and for BACs using just one SMRT Cell. Long repeat regions can be successfully resolved with this workflow. We also describe a consensus algorithm that incorporates SMRT sequencing primary quality values to produce de novo genome sequence exceeding 99.999% accuracy.


September 21, 2019

The advantages of SMRT sequencing.

Of the current next-generation sequencing technologies, SMRT sequencing is sometimes overlooked. However, attributes such as long reads, modified base detection and high accuracy make SMRT a useful technology and an ideal approach to the complete sequencing of small genomes.


September 21, 2019

Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.

Long-read, single-molecule real-time (SMRT) sequencing is routinely used to finish microbial genomes, but available assembly methods have not scaled well to larger genomes. We introduce the MinHash Alignment Process (MHAP) for overlapping noisy, long reads using probabilistic, locality-sensitive hashing. Integrating MHAP with the Celera Assembler enabled reference-grade de novo assemblies of Saccharomyces cerevisiae, Arabidopsis thaliana, Drosophila melanogaster and a human hydatidiform mole cell line (CHM1) from SMRT sequencing. The resulting assemblies are highly continuous, include fully resolved chromosome arms and close persistent gaps in these reference genomes. Our assembly of D. melanogaster revealed previously unknown heterochromatic and telomeric transition sequences, and we assembled low-complexity sequences from CHM1 that fill gaps in the human GRCh38 reference. Using MHAP and the Celera Assembler, single-molecule sequencing can produce de novo near-complete eukaryotic assemblies that are 99.99% accurate when compared with available reference genomes.
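
A generic MinHash sketch shows how locality-sensitive hashing can estimate read overlap without a full alignment. This is not the MHAP implementation, whose details are not given in the abstract. The k-mer size, the number of hash functions, the use of seeded BLAKE2b hashes, and all function names are assumptions made for illustration.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _hash(kmer, seed):
    """Seeded 64-bit hash of a k-mer (one 'hash function' per seed)."""
    data = f"{seed}:{kmer}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_sketch(seq, k=5, num_hashes=64):
    """Keep the minimum hash value per seed; requires len(seq) >= k."""
    ks = kmers(seq, k)
    return [min(_hash(km, seed) for km in ks) for seed in range(num_hashes)]


def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of matching minima approximates k-mer set Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


read_a = "ACGTTGCAAGGCTTACGATCGATCGGATTACA"
read_b = "TTGCAAGGCTTACGATCGATCGGATTACAGGT"  # overlaps most of read_a
similarity = estimate_jaccard(minhash_sketch(read_a), minhash_sketch(read_b))
print(f"estimated k-mer Jaccard similarity: {similarity:.2f}")
```

Comparing two fixed-length sketches costs the same regardless of read length, which is the property that makes this kind of scheme attractive for all-versus-all overlap detection on large read sets.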


September 21, 2019

A Sequel to Sanger: amplicon sequencing that scales.

Although high-throughput sequencers (HTS) have largely displaced their Sanger counterparts, the short read lengths and high error rates of most platforms constrain their utility for amplicon sequencing. The present study tests the capacity of single molecule, real-time (SMRT) sequencing implemented on the SEQUEL platform to overcome these limitations, employing 658 bp amplicons of the mitochondrial cytochrome c oxidase I gene as a model system. By examining templates from more than 5000 species and 20,000 specimens, the performance of SMRT sequencing was tested with amplicons showing wide variation in GC composition and varied sequence attributes. SMRT and Sanger sequences were very similar, but SMRT sequencing provided more complete coverage, especially for amplicons with homopolymer tracts. Because it can characterize amplicon pools from 10,000 DNA extracts in a single run, the SEQUEL can greatly reduce sequencing costs in comparison to first (Sanger) and second generation platforms (Illumina, Ion). SMRT analysis generates high-fidelity sequences from amplicons with varying GC content and is resilient to homopolymer tracts. Analytical costs are low, substantially less than those for first or second generation sequencers. When implemented on the SEQUEL platform, SMRT analysis enables massive amplicon characterization because each instrument can recover sequences from more than 5 million DNA extracts a year.


July 19, 2019

Genome-wide mapping of methylated adenine residues in pathogenic Escherichia coli using single-molecule real-time sequencing.

Single-molecule real-time (SMRT) DNA sequencing allows the systematic detection of chemical modifications such as methylation but has not previously been applied on a genome-wide scale. We used this approach to detect 49,311 putative 6-methyladenine (m6A) residues and 1,407 putative 5-methylcytosine (m5C) residues in the genome of a pathogenic Escherichia coli strain. We obtained strand-specific information for methylation sites and a quantitative assessment of the frequency of methylation at each modified position. We deduced the sequence motifs recognized by the methyltransferase enzymes present in this strain without prior knowledge of their specificity. Furthermore, we found that deletion of a phage-encoded methyltransferase-endonuclease (restriction-modification; RM) system induced global transcriptional changes and led to gene amplification, suggesting that the role of RM systems extends beyond protecting host genomes from foreign DNA.


July 19, 2019

Comparison of single-molecule sequencing and hybrid approaches for finishing the genome of Clostridium autoethanogenum and analysis of CRISPR systems in industrial relevant Clostridia.

Clostridium autoethanogenum strain JA1-1 (DSM 10061) is an acetogen capable of fermenting CO, CO2 and H2 (e.g. from syngas or waste gases) into biofuel ethanol and commodity chemicals such as 2,3-butanediol. A draft genome sequence consisting of 100 contigs has been published. A closed, high-quality genome sequence for C. autoethanogenum DSM10061 was generated using only the latest single-molecule DNA sequencing technology and without the need for manual finishing. It is assigned to the most complex genome classification based upon genome features such as repeats, prophage, and nine copies of the rRNA gene operons. It has a low G + C content of 31.1%. Illumina, 454, and Illumina/454 hybrid assemblies were generated and then compared to the draft and PacBio assemblies using summary statistics, CGAL, QUAST and REAPR bioinformatics tools and comparative genomic approaches. Assemblies based upon shorter read DNA technologies were confounded by the large number of repeats and their size, which in the case of the rRNA gene operons were ~5 kb. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) systems among biotechnologically relevant Clostridia were classified and related to plasmid content and prophages. Potential associations between plasmid content and CRISPR systems may have implications for historical industrial scale Acetone-Butanol-Ethanol (ABE) fermentation failures and future large scale bacterial fermentations. While C. autoethanogenum contains an active CRISPR system, no such system is present in the closely related Clostridium ljungdahlii DSM 13528. A common prophage inserted into the Arg-tRNA shared between the strains suggests a common ancestor. However, C. ljungdahlii contains several additional putative prophages and it has more than double the amount of prophage DNA compared to C. autoethanogenum.
Other differences include important metabolic genes for central metabolism (such as an additional hydrogenase and the absence of a phosphoenolpyruvate synthase) and substrate utilization pathways (mannose and aromatics utilization) that might explain phenotypic differences between C. autoethanogenum and C. ljungdahlii. Single molecule sequencing will be increasingly used to produce finished microbial genomes. The complete genome will facilitate comparative genomics and functional genomics and support future comparisons between Clostridia and studies that examine the evolution of plasmids, bacteriophage and CRISPR systems.


July 19, 2019

Advantages of Single-Molecule Real-Time Sequencing in high-GC content genomes.

Next-generation sequencing has become the most widely used sequencing technology in genomics research, but it has inherent drawbacks when dealing with high-GC content genomes. Recently, single-molecule real-time sequencing technology (SMRT) was introduced as a third-generation sequencing strategy to compensate for this drawback. Here, we report that the unbiased and longer read length of SMRT sequencing markedly improved genome assembly with high GC content via gap filling and repeat resolution.
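
For readers unfamiliar with the quantity at issue, GC content is the fraction of G and C bases in a sequence. The sketch below computes it overall and in sliding windows. It is illustrative only and is not taken from the study; the function names, the window parameters, and the choice to ignore non-ACGT characters are assumptions.

```python
def gc_fraction(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0  # no unambiguous bases to measure
    return (seq.count("G") + seq.count("C")) / counted


def windowed_gc(seq, window, step):
    """GC fraction for each full window along the sequence."""
    return [
        gc_fraction(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]


example = "AAAAGGGG"
print(gc_fraction(example))
print(windowed_gc(example, window=4, step=4))
```

A windowed profile of this kind is one simple way to locate GC-rich stretches where coverage from a given platform could then be inspected.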

