October 23, 2019

High resolution profiling of coral-associated bacterial communities using full-length 16S rRNA sequence data from PacBio SMRT sequencing system.

Coral reefs are a complex ecosystem consisting of coral animals and a vast array of associated symbionts including the dinoflagellate Symbiodinium, fungi, viruses and bacteria. Several studies have highlighted the importance of coral-associated bacteria and their fundamental roles in the fitness and survival of the host animal. The scleractinian coral Porites lutea is one of the dominant reef-builders in the Indo-West Pacific. Currently, very little is known about the composition and structure of bacterial communities across P. lutea reefs. The purpose of this study is twofold: to demonstrate the advantages of using PacBio circular consensus sequencing technology in microbial community studies and to investigate the diversity and structure of the P. lutea-associated microbiome in the Indo-Pacific. This is the first metagenomic study of marine environmental samples that utilises the PacBio sequencing system to capture full-length 16S rRNA sequences. We observed geographically distinct coral-associated microbial profiles between samples from the Gulf of Thailand and the Andaman Sea. Despite the geographical and environmental impacts on the coral-host interactions, we identified a conserved community of bacteria that was present consistently across diverse reef habitats. Finally, we demonstrated the superior performance of full-length 16S rRNA sequences in resolving taxonomic uncertainty of coral associates at the species level.

September 21, 2019

A Sequel to Sanger: amplicon sequencing that scales.

Although high-throughput sequencers (HTS) have largely displaced their Sanger counterparts, the short read lengths and high error rates of most platforms constrain their utility for amplicon sequencing. The present study tests the capacity of single molecule, real-time (SMRT) sequencing implemented on the SEQUEL platform to overcome these limitations, employing 658 bp amplicons of the mitochondrial cytochrome c oxidase I gene as a model system. By examining templates from more than 5000 species and 20,000 specimens, the performance of SMRT sequencing was tested with amplicons showing wide variation in GC composition and varied sequence attributes. SMRT and Sanger sequences were very similar, but SMRT sequencing provided more complete coverage, especially for amplicons with homopolymer tracts. Because it can characterize amplicon pools from 10,000 DNA extracts in a single run, the SEQUEL can greatly reduce sequencing costs in comparison to first (Sanger) and second generation platforms (Illumina, Ion). SMRT analysis generates high-fidelity sequences from amplicons with varying GC content and is resilient to homopolymer tracts. Analytical costs are low, substantially less than those for first or second generation sequencers. When implemented on the SEQUEL platform, SMRT analysis enables massive amplicon characterization because each instrument can recover sequences from more than 5 million DNA extracts a year.

July 19, 2019

Advantages of Single-Molecule Real-Time Sequencing in high-GC content genomes.

Next-generation sequencing has become the most widely used sequencing technology in genomics research, but it has inherent drawbacks when dealing with high-GC content genomes. Recently, single-molecule real-time (SMRT) sequencing technology was introduced as a third-generation sequencing strategy to compensate for this drawback. Here, we report that the unbiased and longer reads of SMRT sequencing markedly improved the assembly of genomes with high GC content via gap filling and repeat resolution.

July 19, 2019

An evaluation of the PacBio RS platform for sequencing and de novo assembly of a chloroplast genome.

Second generation sequencing has permitted detailed sequence characterisation at the whole genome level of a growing number of non-model organisms, but the data produced have short read-lengths and biased genome coverage leading to fragmented genome assemblies. The PacBio RS long-read sequencing platform offers the promise of increased read length and unbiased genome coverage and thus the potential to produce genome sequence data of a finished quality containing fewer gaps and longer contigs. However, these advantages come at a much greater cost per nucleotide and with a perceived increase in error-rate. In this investigation, we evaluated the performance of the PacBio RS sequencing platform through the sequencing and de novo assembly of the Potentilla micrantha chloroplast genome. Following error-correction, a total of 28,638 PacBio RS reads were recovered with a mean read length of 1,902 bp totalling 54,492,250 nucleotides and representing an average depth of coverage of 320× the chloroplast genome. The dataset covered the entire 154,959 bp of the chloroplast genome in a single contig (100% coverage) compared to seven contigs (90.59% coverage) recovered from an Illumina dataset, and revealed no bias in coverage of GC rich regions. Post-assembly, the data were largely concordant with the Illumina data generated and allowed 187 ambiguities in the Illumina data to be resolved. The additional read length also permitted small differences in the two inverted repeat regions to be assigned unambiguously. This is the first report to our knowledge of a chloroplast genome assembled de novo using PacBio sequence data. The PacBio RS data generated here were assembled into a single large contig spanning the P. micrantha chloroplast genome, with a higher degree of accuracy than an Illumina dataset generated at a much greater depth of coverage, due to longer read lengths and lower GC bias in the data. The results we present suggest PacBio data will be of immense utility for the development of genome sequence assemblies containing fewer unresolved gaps and ambiguities and a significantly smaller number of contigs than could be produced using short-read sequence data alone.
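
The read and coverage figures quoted in this abstract can be related with a quick back-of-the-envelope calculation. The sketch below is illustrative only and is not the authors' method: the function name `mean_depth` is hypothetical, and it assumes depth is simply total bases divided by genome length. Under that assumption the quoted totals give a figure somewhat above the reported 320×. The abstract does not state how the 320× value was derived; one possibility (an assumption here) is that only bases aligned to the chloroplast genome were counted.

```python
# Back-of-the-envelope depth-of-coverage check using the figures quoted in the abstract.
# Assumption: depth = total sequenced bases / genome length (ignores unaligned reads).

def mean_depth(total_bases: int, genome_length: int) -> float:
    """Return the naive mean depth of coverage (hypothetical helper)."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return total_bases / genome_length

# Values taken from the abstract.
READ_COUNT = 28_638          # error-corrected PacBio RS reads
MEAN_READ_LENGTH = 1_902     # bp (rounded mean reported in the abstract)
TOTAL_BASES = 54_492_250     # nucleotides
GENOME_LENGTH = 154_959      # bp, P. micrantha chloroplast genome

# Reads x mean length should roughly reproduce the reported total
# (small differences are expected because the mean is rounded).
approx_total = READ_COUNT * MEAN_READ_LENGTH

naive_depth = mean_depth(TOTAL_BASES, GENOME_LENGTH)

if __name__ == "__main__":
    print(f"reads x mean length : {approx_total:,} bases")
    print(f"reported total      : {TOTAL_BASES:,} bases")
    print(f"naive mean depth    : {naive_depth:.1f}x")
```

The read count multiplied by the rounded mean read length agrees with the reported nucleotide total to within a fraction of a percent, so the gap between the naive depth and the reported 320× most likely reflects how coverage was measured, not an inconsistency in the read statistics.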

July 19, 2019

Returning to more finished genomes

Genomic data have become commonplace in most branches of the biological sciences and have fundamentally altered the way research is conducted. However, the predominance of short-read sequence data from second-generation sequencing technologies has commonly resulted in fragmented and partial genomic data characteristics. In this opinion, I will highlight how long, unbiased reads from single molecule, real-time (SMRT) sequencing now allow for a return to more contiguous and comprehensive views of genomes.

July 19, 2019

Pacific Biosciences sequencing technology for genotyping and variation discovery in human data.

Pacific Biosciences technology offers a fundamentally new data type with the potential to overcome some limitations of current next generation sequencing platforms by providing significantly longer reads, single molecule sequencing, low composition bias and an error profile that is orthogonal to other platforms. With these potential advantages in mind, we here evaluate the utility of the Pacific Biosciences RS platform for human medical amplicon resequencing projects. We evaluated the Pacific Biosciences technology for SNP discovery in medical resequencing projects using the Genome Analysis Toolkit, observing high sensitivity and specificity for calling differences in amplicons containing known true or false SNPs. We assessed data quality: most errors were indels (~14%) with few apparent miscalls (~1%). In this work, we define a custom data processing pipeline for Pacific Biosciences data for human data analysis. Critically, the error properties were largely free of the context-specific effects that affect other sequencing technologies. These data show excellent utility for follow-up validation and extension studies in human data and medical genetics projects, but can be extended to other organisms with a reference genome.

July 19, 2019

Whole genome complete resequencing of Bacillus subtilis natto by combining long reads with high-quality short reads.

De novo microbial genome sequencing reached a turning point with third-generation sequencing (TGS) platforms, and several microbial genomes have been improved by TGS long reads. Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168, and it has a function in the production of the traditional Japanese fermented food “natto.” The B. subtilis natto BEST195 genome was previously sequenced with short reads, but it included some incomplete regions. We resequenced the BEST195 genome using a PacBio RS sequencer and successfully obtained a complete genome sequence from one scaffold without any gaps; we also applied Illumina MiSeq short reads to enhance quality. Comparing the result with the previous BEST195 draft genome and the Marburg 168 genome, we found that incomplete regions in the previous genome sequence were attributable to GC-bias and repetitive sequences, and we also identified some novel genes that are found only in the new genome.

July 19, 2019

Genome analysis of the fruiting body forming myxobacterium Chondromyces crocatus reveals high potential for natural product biosynthesis.

Here we report the first complete genome sequence of the type strain of the myxobacterial genus Chondromyces – Chondromyces crocatus Cm c5. It presents one of the largest prokaryotic genomes featuring a single circular chromosome and no plasmids. Analysis revealed an enlarged set of tRNA genes, along with reduced pressure on preferred codon usage compared to other bacterial genomes. The large coding capacity and the plethora of encoded secondary metabolite biosynthetic gene clusters are in line with the capability of Cm c5 to produce an arsenal of anti-bacterial, anti-fungal and cytotoxic compounds. Known pathways of the ajudazol, chondramide, chondrochloren, crocacin, crocapeptin and thuggacin compound families are complemented by many more natural compound biosynthetic gene clusters in the chromosome. Whole-genome comparison of the fruiting-body forming type-strain (Cm c5 = DSM 14714) to an accustomed laboratory strain which has lost this ability (Cm c5 fr-) revealed genetic changes in three loci. In addition to the low synteny found with the closest sequenced representative of the same family, Sorangium cellulosum, extensive genetic information duplication and broad application of eukaryotic-type signal transduction systems are hallmarks of this 11.3 Mbp prokaryotic genome. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

July 19, 2019

Winding paths to simplicity: genome evolution in facultative insect symbionts.

Symbiosis between organisms is an important driving force in evolution. Among the diverse relationships described, extensive progress has been made in insect-bacteria symbiosis, which improved our understanding of the genome evolution in host-associated bacteria. Particularly, investigations on several obligate mutualists have pushed the limits of what we know about the minimal genomes for sustaining cellular life. To bridge the gap between those obligate symbionts with extremely reduced genomes and their non-host-restricted ancestors, this review focuses on the recent progress in genome characterization of facultative insect symbionts. Notable cases representing various types and stages of host associations, including those from multiple genera in the family Enterobacteriaceae (class Gammaproteobacteria), Wolbachia (Alphaproteobacteria) and Spiroplasma (Mollicutes), are discussed. Although several general patterns of genome reduction associated with the adoption of symbiotic relationships could be identified, extensive variation was found among these facultative symbionts. These findings are incorporated into the established conceptual frameworks to develop a more detailed evolutionary model for the discussion of possible trajectories. In summary, transitions from facultative to obligate symbiosis do not appear to be a universal one-way street; switches between hosts and lifestyles (e.g. commensalism, parasitism or mutualism) occur frequently and could be facilitated by horizontal gene transfer. © FEMS 2016.

July 19, 2019

SMRT genome assembly corrects reference errors, resolving the genetic basis of virulence in Mycobacterium tuberculosis.

The genetic basis of virulence in Mycobacterium tuberculosis has been investigated through genome comparisons of virulent (H37Rv) and attenuated (H37Ra) sister strains. Such analysis, however, relies heavily on the accuracy of the sequences. While the H37Rv reference genome has had several corrections to date, that of H37Ra is unmodified since its original publication. Here, we report the assembly and finishing of the H37Ra genome from single-molecule, real-time (SMRT) sequencing. Our assembly reveals that the number of H37Ra-specific variants is less than half of what the Sanger-based H37Ra reference sequence indicates, undermining and, in some cases, invalidating the conclusions of several studies. PE_PPE family genes, which are intractable to commonly-used sequencing platforms because of their repetitive and GC-rich nature, are overrepresented in the set of genes in which all reported H37Ra-specific variants are contradicted. Further, one of the sequencing errors in H37Ra masks a true variant in common with the clinical strain CDC1551 which, when considered in the context of previous work, corresponds to a sequencing error in the H37Rv reference genome. Our results constrain the set of genomic differences possibly affecting virulence by more than half, which focuses laboratory investigation on pertinent targets and demonstrates the power of SMRT sequencing for producing high-quality reference genomes.

July 19, 2019

Amplification-free, CRISPR-Cas9 targeted enrichment and SMRT Sequencing of repeat-expansion disease causative genomic regions

Targeted sequencing has proven to be an economical means of obtaining sequence information for one or more defined regions of a larger genome. However, most target enrichment methods require amplification. Some genomic regions, such as those with extreme GC content and repetitive sequences, are recalcitrant to faithful amplification. Yet, many human genetic disorders are caused by repeat expansions, including difficult-to-sequence tandem repeats. We have developed a novel, amplification-free enrichment technique that employs the CRISPR-Cas9 system for specific targeting of multiple genomic loci. This method, in conjunction with long reads generated through Single Molecule, Real-Time (SMRT) sequencing and unbiased coverage, enables enrichment and sequencing of complex genomic regions that cannot be investigated with other technologies. Using human genomic DNA samples, we demonstrate successful targeting of causative loci for Huntington's disease (HTT; CAG repeat), Fragile X syndrome (FMR1; CGG repeat), amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (C9orf72; GGGGCC repeat), and spinocerebellar ataxia type 10 (SCA10) (ATXN10; variable ATTCT repeat). The method, amenable to multiplexing across multiple genomic loci, uses an amplification-free approach that facilitates the isolation of hundreds of individual on-target molecules in a single SMRT Cell and accurate sequencing through long repeat stretches, regardless of extreme GC percent or sequence complexity content. Our novel targeted sequencing method opens new doors to genomic analyses independent of PCR amplification that will facilitate the study of repeat expansion disorders.
