June 1, 2021

WGS SMRT Sequencing of patient samples from a fecal microbiota transplant trial

Fecal samples were obtained from human subjects in the first blinded, placebo-controlled trial to evaluate the efficacy and safety of fecal microbiota transplant (FMT) for treatment of recurrent C. difficile infection. Samples included pre- and post-FMT, post-placebo transplant, and the donor control; samples were taken at 2 and 8 weeks post-FMT. Sequencing was done on the PacBio Sequel System, with the goal of obtaining high-quality sequences covering whole genes or gene clusters, which will be used to better understand the relationship between the composition and functional capabilities of intestinal microbiomes and patient health. Methods: Samples were randomly sheared to 2-3 kb fragments, a sufficient length to cover most genes, and SMRTbell libraries were prepared using standard protocols. Libraries were run on the Sequel System, which has a throughput of hundreds of thousands of reads per SMRT Cell, adequate yield to sample the complex microbiomes of post-transplant and donor samples. Results: Here we characterize samples, describe library prep methods and detail Sequel System operation, including run conditions. Descriptive statistics of data output (primary analysis) are presented, along with SMRT Analysis reports on circular consensus sequence (CCS) reads generated using an updated algorithm (CCS2). Final sequencing yields are filtered at various levels of predicted accuracy from 90% to 99.9%. Previous studies done using the PacBio RS II System demonstrated the ability to profile at the species level, and in some cases the strain level, and provided functional insight. Conclusions: These results demonstrate that the Sequel System is well-suited for characterization of complex microbial communities, with the ability for high-throughput generation of extremely accurate single-molecule sequences, each several kilobases in length. The entire process from shearing and library prep through sequencing and CCS analysis can be completed in less than 48 hours.
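
The filtering of CCS yields at several predicted-accuracy levels can be illustrated with a minimal sketch. It is not the SMRT Analysis implementation; the function names and the six accuracy values are hypothetical, chosen only to show how yield shrinks as the threshold rises from 90% to 99.9%.

```python
import math


def phred_from_accuracy(accuracy: float) -> float:
    """Convert a predicted accuracy (0 < accuracy < 1) to a Phred-scaled quality."""
    return -10.0 * math.log10(1.0 - accuracy)


def yield_by_threshold(accuracies, thresholds=(0.90, 0.99, 0.999)):
    """Count reads whose predicted accuracy meets or exceeds each threshold."""
    return {t: sum(1 for a in accuracies if a >= t) for t in thresholds}


# Hypothetical predicted accuracies for six CCS reads (illustrative values only).
reads = [0.85, 0.92, 0.991, 0.9995, 0.9999, 0.97]
counts = yield_by_threshold(reads)

for threshold, n in counts.items():
    print(f"accuracy >= {threshold}: {n} reads (Q{phred_from_accuracy(threshold):.0f})")
```

A predicted accuracy of 99% corresponds to Phred Q20 and 99.9% to Q30, which is why such cutoffs are often quoted interchangeably as Q values.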


April 21, 2020

Identification of Initial Colonizing Bacteria in Dental Plaques from Young Adults Using Full-Length 16S rRNA Gene Sequencing.

Development of dental plaque begins with the adhesion of salivary bacteria to the acquired pellicle covering the tooth surface. In this study, we collected in vivo dental plaque formed on hydroxyapatite disks for 6 h from 74 young adults and identified initial colonizing taxa based on full-length 16S rRNA gene sequences. A long-read, single-molecule sequencer, PacBio Sequel, provided 100,109 high-quality full-length 16S rRNA gene sequence reads from the early plaque microbiota, which were assigned to 90 oral bacterial taxa. The microbiota obtained from every individual mostly comprised the 21 predominant taxa with the maximum relative abundance of over 10% (95.8 ± 6.2%, mean ± SD), which included Streptococcus species as well as nonstreptococcal species. A hierarchical cluster analysis of their relative abundance distribution suggested three major patterns of microbiota compositions: a Streptococcus mitis/Streptococcus sp. HMT-423-dominant profile, a Neisseria sicca/Neisseria flava/Neisseria mucosa-dominant profile, and a complex profile with high diversity. No notable variations in the community structures were associated with the dental caries status, although the total bacterial amounts were larger in the subjects with a high number of caries-experienced teeth (≥8) than in those with no or a low number of caries-experienced teeth. Our results revealed the bacterial taxa primarily involved in early plaque formation on hydroxyapatite disks in young adults. IMPORTANCE Selective attachment of salivary bacteria to the tooth surface is an initial and repetitive phase in dental plaque development. We employed full-length 16S rRNA gene sequence analysis with a high taxonomic resolution using a third-generation sequencer, PacBio Sequel, to determine the bacterial composition during early plaque formation in 74 young adults accurately and in detail. The results revealed 21 bacterial taxa primarily involved in early plaque formation on hydroxyapatite disks in young adults, which include several streptococcal species as well as nonstreptococcal species, such as Neisseria sicca/N. flava/N. mucosa and Rothia dentocariosa. Given that no notable variations in the microbiota composition were associated with the dental caries status, the maturation process, rather than the specific bacterial species that are the initial colonizers, is likely to play an important role in the development of dysbiotic microbiota associated with dental caries. Copyright © 2019 Ihara et al.


April 21, 2020

High-throughput amplicon sequencing of the full-length 16S rRNA gene with single-nucleotide resolution.

Targeted PCR amplification and high-throughput sequencing (amplicon sequencing) of 16S rRNA gene fragments is widely used to profile microbial communities. New long-read sequencing technologies can sequence the entire 16S rRNA gene, but higher error rates have limited their attractiveness when accuracy is important. Here we present a high-throughput amplicon sequencing methodology based on PacBio circular consensus sequencing and the DADA2 sample inference method that measures the full-length 16S rRNA gene with single-nucleotide resolution and a near-zero error rate. In two artificial communities of known composition, our method recovered the full complement of full-length 16S sequence variants from expected community members without residual errors. The measured abundances of intra-genomic sequence variants were in the integral ratios expected from the genuine allelic variants within a genome. The full-length 16S gene sequences recovered by our approach allowed Escherichia coli strains to be correctly classified to the O157:H7 and K12 sub-species clades. In human fecal samples, our method showed strong technical replication and was able to recover the full complement of 16S rRNA alleles in several E. coli strains. There are likely many applications beyond microbial profiling for which high-throughput amplicon sequencing of complete genes with single-nucleotide resolution will be of use. © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.
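
The statement that intra-genomic sequence variants appear in integral ratios can be checked with a small calculation. The sketch below is an illustration only, not part of DADA2: the function name, the tolerance, and the read counts are hypothetical, and it assumes the least abundant variant is present in a single copy per genome.

```python
def integer_ratio_check(counts, tolerance=0.15):
    """Scale variant counts by the smallest count and test closeness to an integer.

    Assumes the least abundant variant corresponds to one gene copy per genome.
    Returns {variant: (nearest_integer, within_tolerance)}.
    """
    smallest = min(counts.values())
    report = {}
    for variant, count in counts.items():
        ratio = count / smallest
        nearest = round(ratio)
        report[variant] = (nearest, abs(ratio - nearest) <= tolerance)
    return report


# Hypothetical read counts for three 16S variants assigned to one genome.
counts = {"variant_1": 400, "variant_2": 198, "variant_3": 101}
report = integer_ratio_check(counts)
print(report)
```

Counts of 400, 198 and 101 scale to roughly 4:2:1, the pattern expected if the genome carries seven 16S copies split across three alleles.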


April 21, 2020

MSC: a metagenomic sequence classification algorithm.

Metagenomics is the study of genetic materials directly sampled from natural habitats. It has the potential to reveal previously hidden diversity of microscopic life, largely due to the existence of highly parallel and low-cost next-generation sequencing technology. Conventional approaches align metagenomic reads onto known reference genomes to identify microbes in the sample. Since such a collection of reference genomes is very large, the approach often needs high-end computing machines with large memory, which are not often available to researchers. Alternative approaches follow an alignment-free methodology where the presence of a microbe is predicted using the information about the unique k-mers present in the microbial genomes. However, such approaches suffer from high false-positive rates because the value of k is traded off against computational resources. In this article, we propose a highly efficient metagenomic sequence classification (MSC) algorithm that is a hybrid of both approaches. Instead of aligning reads to the full genomes, MSC aligns reads onto a set of carefully chosen, shorter and highly discriminating model sequences built from the unique k-mers of each of the reference sequences. Microbiome researchers are generally interested in two objectives of a taxonomic classifier: (i) to detect prevalence, i.e. the taxa present in a sample, and (ii) to estimate their relative abundances. MSC is primarily designed to detect prevalence, and experimental results show that MSC is indeed a more effective and efficient algorithm compared to the other state-of-the-art algorithms in terms of accuracy, memory and runtime. Moreover, MSC outputs an approximate estimate of the abundances. The implementations are freely available for non-commercial purposes. They can be downloaded from https://drive.google.com/open?id=1XirkAamkQ3ltWvI1W1igYQFusp9DHtVl. © The Author(s) 2019. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
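
The notion of k-mers that are unique to one reference can be made concrete with a short sketch. This is not the MSC implementation; the function names and the two toy sequences are hypothetical, and for brevity the sketch ignores reverse complements and does not build model sequences from the unique k-mers.

```python
from collections import defaultdict


def kmers(seq: str, k: int) -> set:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_kmers(references: dict, k: int) -> dict:
    """Map each reference name to the k-mers found in that reference only."""
    owners = defaultdict(set)
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(name)
    unique = {name: set() for name in references}
    for kmer, names in owners.items():
        if len(names) == 1:
            unique[next(iter(names))].add(kmer)
    return unique


# Two toy reference sequences (hypothetical, far shorter than real genomes).
refs = {"genome_a": "ACGTACGGA", "genome_b": "ACGTTTGGA"}
result = unique_kmers(refs, 4)
print(sorted(result["genome_a"]))
```

Here the shared 4-mer ACGT is excluded from both sets. A smaller k leaves fewer k-mers unique to any one genome, while a larger k increases the memory needed to index them, which is the trade-off the abstract refers to.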


April 21, 2020

Assessment of the microbial diversity of Chinese Tianshan tibicos by single molecule, real-time sequencing technology.

Chinese Tianshan tibico grains were collected from the rural area of Tianshan in Xinjiang province, China. Typical tibico grains are known to consist of a polysaccharide matrix that embeds a variety of bacteria and yeasts. These grains are widely used in some rural regions to produce a beneficial sugary beverage that is slightly acidic and contains a low level of alcohol. This work aimed to characterize the microbiota composition of Chinese Tianshan tibicos using single molecule, real-time sequencing technology, which is advantageous in generating long reads. Our results revealed that the microbiota mainly comprised the bacterial species Lactobacillus hilgardii, Lactococcus raffinolactis, Leuconostoc mesenteroides and Zymomonas mobilis, together with a Guehomyces pullulans-dominated fungal community. The data generated in this work help identify beneficial microbes in Chinese Tianshan tibico grains.


April 21, 2020

Improving the sensitivity of long read overlap detection using grouped short k-mer matches.

Single-molecule, real-time (SMRT) sequencing developed by Pacific Biosciences produces longer reads than second-generation sequencing technologies such as Illumina. The increased read length enables PacBio sequencing to close gaps in genome assembly, reveal structural variations, and characterize intra-species variations. It also holds the promise to decipher the community structure in complex microbial communities because long reads help metagenomic assembly. One key step in genome assembly using long reads is to quickly identify reads forming overlaps. Because PacBio data has a higher sequencing error rate and lower coverage than popular short-read sequencing technologies (such as Illumina), efficient detection of true overlaps requires specially designed algorithms. In particular, there is still a need to improve the sensitivity of detecting small overlaps or overlaps with high error rates in both reads. Addressing this need will enable better assembly for metagenomic data produced by third-generation sequencing technologies. In this work, we designed and implemented an overlap detection program named GroupK for third-generation sequencing reads, based on grouped k-mer hits. While using k-mer hits for detecting read overlaps has been adopted by several existing programs, our method uses a group of short k-mer hits satisfying statistically derived distance constraints to increase the sensitivity of small overlap detection. Grouped k-mer hits were originally designed for homology search. We are the first to apply group hits to long-read overlap detection. The experimental results of applying our pipeline to both simulated and real third-generation sequencing data showed that GroupK enables more sensitive overlap detection, especially for datasets of low sequencing coverage. GroupK is best used for detecting small overlaps in third-generation sequencing data. It provides a useful supplementary tool to existing ones for more sensitive and accurate overlap detection. The source code is freely available at https://github.com/Strideradu/GroupK.
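
The idea of grouping short k-mer hits under distance constraints can be sketched as follows. This is not GroupK's algorithm: the function names, the greedy chaining, and the fixed thresholds (`max_gap`, `max_diag_shift`, `min_group`) are assumptions made for illustration, whereas the abstract describes constraints that are statistically derived.

```python
from collections import defaultdict


def kmer_hits(read_a: str, read_b: str, k: int) -> list:
    """Return (i, j) pairs where the k-mer at read_a[i] equals the k-mer at read_b[j]."""
    index = defaultdict(list)
    for j in range(len(read_b) - k + 1):
        index[read_b[j:j + k]].append(j)
    hits = []
    for i in range(len(read_a) - k + 1):
        for j in index.get(read_a[i:i + k], []):
            hits.append((i, j))
    return hits


def grouped_hits(hits, max_gap, max_diag_shift, min_group):
    """Greedily chain hits that are close in read_a and lie on nearby diagonals.

    A hit's diagonal is i - j; small shifts between consecutive hits tolerate
    the insertions and deletions typical of long-read errors.
    """
    groups, current = [], []
    for i, j in sorted(hits):
        if current:
            pi, pj = current[-1]
            near = 0 < i - pi <= max_gap
            same_diagonal = abs((i - j) - (pi - pj)) <= max_diag_shift
            if near and same_diagonal:
                current.append((i, j))
                continue
            if len(current) >= min_group:
                groups.append(current)
        current = [(i, j)]
    if len(current) >= min_group:
        groups.append(current)
    return groups


# Toy reads: the end of read_a overlaps the start of read_b.
read_a = "GGGGACGTCA"
read_b = "ACGTCATTTT"
groups = grouped_hits(kmer_hits(read_a, read_b, 4), max_gap=2, max_diag_shift=1, min_group=3)
print(groups)
```

Requiring several short hits to agree on position, rather than one long exact match, is what lets a candidate overlap survive a high error rate: a single long k-mer is easily broken by one sequencing error, while a group of short ones is not.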


April 21, 2020

Metaepigenomic analysis reveals the unexplored diversity of DNA methylation in an environmental prokaryotic community.

DNA methylation plays important roles in prokaryotes, and its genomic landscapes (prokaryotic epigenomes) have recently begun to be disclosed. However, our knowledge of prokaryotic methylation systems is focused on those of culturable microbes, which are rare in nature. Here, we used single-molecule real-time and circular consensus sequencing techniques to reveal the 'metaepigenomes' of a microbial community in the largest lake in Japan, Lake Biwa. We reconstructed 19 draft genomes from diverse bacterial and archaeal groups, most of which are yet to be cultured. The analysis of DNA chemical modifications in those genomes revealed 22 methylated motifs, nine of which were novel. We identified methyltransferase genes likely responsible for methylation of the novel motifs, and confirmed the catalytic specificities of four of them via transformation experiments using synthetic genes. Our study highlights metaepigenomics as a powerful approach for identification of the vast unexplored variety of prokaryotic DNA methylation systems in nature.


April 21, 2020

Assignment of virus and antimicrobial resistance genes to microbial hosts in a complex microbial community by combined long-read assembly and proximity ligation.

We describe a method that adds long-read sequencing to a mix of technologies used to assemble a highly complex cattle rumen microbial community, and provide a comparison to short read-based methods. Long-read alignments and Hi-C linkage between contigs support the identification of 188 novel virus-host associations and the determination of phage life cycle states in the rumen microbial community. The long-read assembly also identifies 94 antimicrobial resistance genes, compared to only seven alleles in the short-read assembly. We demonstrate novel techniques that work synergistically to improve characterization of biological features in a highly complex rumen microbial community.


April 21, 2020

Long-read based de novo assembly of low-complexity metagenome samples results in finished genomes and reveals insights into strain diversity and an active phage system.

Complete and contiguous genome assemblies greatly improve the quality of subsequent systems-wide functional profiling studies and the ability to gain novel biological insights. While a de novo genome assembly of an isolated bacterial strain is in most cases straightforward, more informative data about co-existing bacteria as well as synergistic and antagonistic effects can be obtained from a direct analysis of microbial communities. However, the complexity of metagenomic samples represents a major challenge. While third-generation sequencing technologies have been suggested to enable finished metagenome-assembled genomes, to our knowledge, the complete genome assembly of all dominant strains in a microbiome sample has not been demonstrated. Natural whey starter cultures (NWCs) are used in cheese production and represent low-complexity microbiomes. Previous studies of Swiss Gruyère and selected Italian hard cheeses, mostly based on amplicon metagenomics, concurred that three species generally predominate: Streptococcus thermophilus, Lactobacillus helveticus and Lactobacillus delbrueckii. Two NWCs from Swiss Gruyère producers were subjected to whole metagenome shotgun sequencing using the Pacific Biosciences Sequel and Illumina MiSeq platforms. In addition, longer Oxford Nanopore Technologies MinION reads had to be generated for one sample to resolve repeat regions. Thereby, we achieved the complete assembly of all dominant bacterial genomes from these low-complexity NWCs, which was corroborated by a 16S rRNA amplicon survey. Moreover, two distinct L. helveticus strains were successfully co-assembled from the same sample. Besides bacterial chromosomes, we could also assemble several bacterial plasmids and phages and a corresponding prophage. Biologically relevant insights were uncovered by linking the plasmids and phages to their respective host genomes using DNA methylation motifs on the plasmids and by matching prokaryotic CRISPR spacers with the corresponding protospacers on the phages. These results could only be achieved by employing long-read sequencing data able to span intragenomic as well as intergenomic repeats. Here, we demonstrate the feasibility of complete de novo genome assembly of all dominant strains from low-complexity NWCs based on whole metagenome shotgun sequencing data. This allowed us to gain novel biological insights and is a fundamental basis for subsequent systems-wide omics analyses, functional profiling and phenotype-to-genotype analysis of specific microbial communities.
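
Matching CRISPR spacers to protospacers on phage genomes can be sketched as a sequence search on both strands. This is a simplified illustration, not the authors' pipeline: it assumes exact matches only, and the function names, host and phage labels, and sequences are all hypothetical. Real analyses typically tolerate mismatches and use an alignment tool.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def link_spacers(spacers_by_host: dict, phages: dict) -> list:
    """Return (host, phage, spacer) for each spacer found exactly on either phage strand."""
    links = []
    for host, spacers in spacers_by_host.items():
        for spacer in spacers:
            rc = reverse_complement(spacer)
            for phage, genome in phages.items():
                if spacer in genome or rc in genome:
                    links.append((host, phage, spacer))
    return links


# Hypothetical spacers extracted from two host CRISPR arrays, and two toy phage genomes.
spacers_by_host = {"host_1": ["ACGTTGCA", "GGGTTTAA"], "host_2": ["TTTTCCCC"]}
phages = {"phage_x": "CCCCACGTTGCACCCC", "phage_y": "AAGGGGAAAATT"}
links = link_spacers(spacers_by_host, phages)
print(links)
```

The second link is found only through the reverse complement, which shows why both strands need to be searched.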

